[flagged] GitHub Turned into an Enterprise Under Microsoft?
10 points by aliostad on Jan 29, 2020 | 9 comments
I was asked to remove a training file from my deep learning language-detection repo (only 64 stars, but still). The repo used deep learning to detect the programming language of a file or snippet. The files and snippets were harvested from public files and snippets on GitHub and Stack Overflow. The repo was taken down even after I removed the file from the git history. More info and screenshots here: https://twitter.com/aliostad/status/1222440190821781506?s=20


The reason you couldn't delete the blob is that someone forked your repository, and GitHub uses git alternates to deduplicate objects across fork networks.

I think you could ask GitHub if you can recreate your repository without the offending blob, and you should be good again.
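
To make the deduplication point concrete, here is a minimal Python sketch. The blob id formula (SHA-1 over a `blob <size>\0` header followed by the content) is how Git names blobs in SHA-1 repositories. Everything else (`SharedObjectStore`, `Repo`, `collect_garbage`) is a hypothetical toy model of an object pool shared by a fork network, not GitHub's actual implementation.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object id Git gives a blob with this content (SHA-1 repos)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class SharedObjectStore:
    """Toy stand-in for an object pool shared by a fork network (assumption)."""

    def __init__(self):
        self.objects = {}  # object id -> content

    def add(self, content: bytes) -> str:
        oid = git_blob_id(content)
        self.objects[oid] = content  # identical content is stored only once
        return oid


class Repo:
    """Toy repository: just a name and the set of blob ids its history reaches."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.reachable = set()


def collect_garbage(store, repos):
    """Drop only the objects that no repository in the network still reaches."""
    live = set()
    for repo in repos:
        live |= repo.reachable
    for oid in list(store.objects):
        if oid not in live:
            del store.objects[oid]


store = SharedObjectStore()
original = Repo("original", store)
fork = Repo("fork", store)

blob = store.add(b"hello world\n")
original.reachable.add(blob)
fork.reachable.add(blob)

# The owner rewrites history so the original no longer references the blob.
original.reachable.discard(blob)
collect_garbage(store, [original, fork])
still_stored = blob in store.objects  # True: the fork keeps the blob alive

# Only once the fork also drops it can the shared pool discard the object.
fork.reachable.discard(blob)
collect_garbage(store, [original, fork])
stored_after_fork_cleanup = blob in store.objects  # False
```

In this model the owner can only control what their own repository references, which matches the point made above.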


This is essentially what I did using the BFG tool. They still took it down.

https://help.github.com/en/github/authenticating-to-github/r...


I'm saying the blob was in a new repository that you had no control over. You couldn't have removed it; you could only make sure it isn't referenced in _your_ repository, which is what you did.


Sure, but I imagine they have already removed those forks too.

Have a look at this list https://github.com/github/dmca/blob/master/2020/01/2020-01-2...


github.com was always a corporate entity and subject to DMCA takedowns.

The notice link so others don't have to hand-type it: https://github.com/github/dmca/blob/master/2020/01/2020-01-2...

I feel that the other Twitter user, BSA, and IBM, as the originators of that takedown notice, are more useful targets of animosity here.


The problem is the lack of communication and courtesy, and the disrespect for the public good. I am happy to remove the mention per his request.


Fair enough, communication could indeed be improved. I'm not a lawyer, but I was under the impression that, while they could surely be more helpful here, once they receive an official DMCA takedown notice they don't really have a choice about whether to take the content down.

Edit: disabling the repository after you responded within the 24 hours seems to be against their own policy at https://help.github.com/en/github/site-policy/dmca-takedown-... - have you tried contacting their support again?


Well, they said I had 24 hours to remove the offending item by following the "remove sensitive data" link, which I did within a few minutes. They still took down the repo. That is the problem, not the sending of the notice.

"We're giving you 24 hours to make the changes identified in the following notice:

https://github.zendesk.com/attachments/token/BqByLyvvRzOAmVy...

If you need to remove specific content from your repository, simply making the repository private or deleting it via a commit won't resolve the alleged infringement. Instead, you must follow these instructions to remove the content from your repository's history, even if you don't think it's sensitive:

https://help.github.com/articles/remove-sensitive-data


Here is the terminal output of what I did to remove the file from the git history:

~/g/aliostad bfg --delete-files 1703 deep-learning-lang-detection.git

Using repo : /Users/alikheyrollahi/github/aliostad/deep-learning-lang-detection.git

Found 72811 objects to protect
Found 2 commit-pointing refs : HEAD, refs/heads/master

Protected commits
-----------------

These are your protected commits, and so their contents will NOT be altered:

 * commit ac12aa68 (protected by 'HEAD') - contains 8 dirty files :
    - data/stackoverflow-snippets/cpp/1703 (3.0 KB)
    - data/stackoverflow-snippets/csharp/1703 (835 B)
    - ...

WARNING: The dirty content above may be removed from other commits, but as
the protected commits still use it, it will STILL exist in your repository.

Details of protected dirty content have been recorded here :

/Users/alikheyrollahi/github/aliostad/deep-learning-lang-detection.git.bfg-report/2020-01-27/22-24-03/protected-dirt/

If you really want this content gone, make a manual commit that removes it,
and then run the BFG on a fresh copy of your repo.

Cleaning
--------

Found 69 commits
Cleaning commits: 100% (69/69)
Cleaning commits completed in 304 ms.

Updating 1 Ref
--------------

    Ref                 Before     After
    ---------------------------------------
    refs/heads/master | ac12aa68 | c51406cc

Updating references: 100% (1/1)
...Ref update completed in 13 ms.

Commit Tree-Dirt History
------------------------

    Earliest                                              Latest
    |                                                          |
    .................................................DDDDDDDDDDm

    D = dirty commits (file tree fixed)
    m = modified commits (commit message or parents changed)
    . = clean commits (no changes to file tree)

                            Before     After
    -------------------------------------------
    First modified commit | a4a1bbac | cb32cfbf
    Last dirty commit     | 45322921 | 6b9e8d5d

Deleted files
-------------

    Filename   Git id
    ---------------------------------------------------
    1703     | 530293d7 (614 B), 98c9b646 (3.0 KB), ...

In total, 47 object ids were changed. Full details are logged here:

/Users/alikheyrollahi/github/aliostad/deep-learning-lang-detection.git.bfg-report/2020-01-27/22-24-03

BFG run is complete! When ready, run: git reflog expire --expire=now --all && git gc --prune=now --aggressive

--
You can rewrite history in Git - don't let Trump do it for real!
Trump's administration has lied consistently, to make people give up on ever
being told the truth. Don't give up: https://www.aclu.org/
--

~/g/aliostad cd deep-learning-lang-detection.git
~/g/a/deep-learning-lang-detection.git git reflog expire --expire=now --all && git gc --prune=now --aggressive
Enumerating objects: 89539, done.
Counting objects: 100% (89539/89539), done.
Delta compression using up to 8 threads
Compressing objects: 100% (89537/89537), done.
Writing objects: 100% (89539/89539), done.
Total 89539 (delta 28336), reused 61123 (delta 0)
~/g/a/deep-learning-lang-detection.git git push
Enter passphrase for key '/Users/alikheyrollahi/.ssh/id_rsa':
Enumerating objects: 89539, done.
Counting objects: 100% (89539/89539), done.
Delta compression using up to 8 threads
Compressing objects: 100% (61201/61201), done.
Writing objects: 100% (89539/89539), 40.83 MiB | 1.01 MiB/s, done.
Total 89539 (delta 28336), reused 89539 (delta 28336)
remote: Resolving deltas: 100% (28336/28336), done.
To github.com:aliostad/deep-learning-lang-detection.git
 + ac12aa680...c51406cc8 master -> master (forced update)
~/g/a/deep-learning-lang-detection.git cd ..
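
The WARNING in the output above says that dirty content in a protected commit (here the one protected by 'HEAD') is left in place, and that removing it takes a manual commit followed by a second BFG run. Below is a toy Python model of just that rule. The helpers `bfg_like_delete` and `contains` are hypothetical, `README.md` is an invented path, and the three-commit history is made up for illustration; this is a sketch of the behaviour the message describes, not BFG's actual code.

```python
def bfg_like_delete(history, filename, protect_head=True):
    """Remove `filename` from every commit's tree except the protected last one.

    `history` is a list of sets of paths, oldest commit first (toy model).
    """
    cleaned = []
    head_index = len(history) - 1
    for index, tree in enumerate(history):
        if protect_head and index == head_index:
            cleaned.append(set(tree))  # protected commit: contents untouched
        else:
            cleaned.append({p for p in tree if p.split("/")[-1] != filename})
    return cleaned


def contains(history, filename):
    """For each commit, report whether any path in its tree has that file name."""
    return [any(p.split("/")[-1] == filename for p in tree) for tree in history]


# Invented history; the two data/ paths are taken from the output above.
history = [
    {"README.md"},
    {"README.md", "data/stackoverflow-snippets/cpp/1703"},
    {
        "README.md",
        "data/stackoverflow-snippets/cpp/1703",
        "data/stackoverflow-snippets/csharp/1703",
    },
]

# First run: older commits are cleaned, the protected HEAD still has the files.
first_pass = bfg_like_delete(history, "1703")
print(contains(first_pass, "1703"))   # [False, False, True]

# Make a manual commit that removes the files, then run again on that history.
with_removal_commit = history + [{"README.md"}]
second_pass = bfg_like_delete(with_removal_commit, "1703")
print(contains(second_pass, "1703"))  # [False, False, False, False]
```

In the model, the file disappears from every commit only when the newest commit no longer contains it before the rewrite, which is the order of steps the BFG message recommends.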



