I run https://github-wiki-see.page. Please read the about page link at the bottom or visit https://github-wiki-see.page for an explanation. I put it up after realizing my GitHub wiki contributions weren't available via Google.
GitHub blocks https://github.com/entropia/tip-toi-reveng/wiki/Languages and many other wikis from being indexed. In the case of the page you linked, GitHub serves the content with "X-Robots-Tag: none". The content of that page currently does not exist in Google at all. You can see the robot blocking header by looking at the Network tab in Chrome while loading the page in incognito mode or equivalent in other browsers.
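For anyone who would rather check the header from a script than from the Network tab, here is a minimal sketch. The helper names (blocks_indexing, fetch_x_robots_tag) and the User-Agent string are my own, not part of any GitHub or Google tooling. It assumes the server answers HEAD requests and sends the same header to a script as to a browser, which may not hold. It also only looks at plain comma-separated directives and ignores user-agent-prefixed forms such as "googlebot: noindex".

```python
from urllib.request import Request, urlopen

# Directives that tell crawlers not to index a page.
# "none" is shorthand for "noindex, nofollow".
BLOCKING_DIRECTIVES = {"none", "noindex"}


def blocks_indexing(header_value):
    """Return True if an X-Robots-Tag value asks crawlers not to index.

    header_value is the raw header string, or None when the header is absent.
    """
    if header_value is None:
        return False
    directives = {part.strip().lower() for part in header_value.split(",")}
    return bool(directives & BLOCKING_DIRECTIVES)


def fetch_x_robots_tag(url):
    """Fetch the X-Robots-Tag header for url (hypothetical helper).

    Uses a HEAD request; returns None if the header is not present.
    """
    request = Request(url, method="HEAD",
                      headers={"User-Agent": "x-robots-tag-check"})
    with urlopen(request) as response:
        return response.headers.get("X-Robots-Tag")
```

Passing the result of fetch_x_robots_tag(url) to blocks_indexing gives a yes/no answer for a single page. The parsing is kept separate from the network call so it can be checked offline.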
As for having no link to GitHub, my service provides a huge button at the top and a direct URL to the original content. Please use those controls at the top to get to the content on GitHub. I do not redirect selectively, so as not to trip cloaking detection, and I do not redirect automatically, which would risk search engines classifying the indexing helper as a redirect.
If you have any other questions or suggestions, please let me know.
GitHub Wiki Search Engine Enablement (GHWSEE) allows non-indexed GitHub Wikis to be indexed by search engines.
This site will be decommissioned to redirect old links once the block is lifted or GitHub produces some other solution to index GitHub Wikis in harmony with their SEO concerns.
I do not see any wrongdoing from github-wiki-see.page here. They don't even make money from it. Quite the contrary, I do think that this is a useful project.
Hah, yeah, I don't make any money from it. I think I'm currently about $300 in the hole from the experiments and queries I ran before I settled on the current ramen architecture.
That might be fixable, if people want to expend the effort. Wiki pages are almost certainly copyrightable, so the owners could send DMCA takedown notices to github-wiki-see.page. If they're not responsive, send the DMCA notices to Google, which should be required to delist them. Unfortunately you have to do it on a URL-by-URL basis, and you can only send notices for pages you actually own copyright for, so it would mean a big coordinated effort to get them brought down.
I just don't understand why Google itself allows this and doesn't rank these sorts of sites lower. They're clearly garbage sites with low utility.
Please read my explanation at http://github-wiki-see.page/ and observe why it exists. I believe it to be a site with extremely high utility.
It has already recently convinced/defrosted GitHub into gradually changing its policy, in place since 2012, of not letting GitHub wiki pages be indexed. For at least 9 years, people were writing content into GitHub wikis without realizing it wasn't indexed at all.
I'm happy to answer any questions or suggestions you have.
I also do not host the content at all. That said, people who move off GitHub Wikis have submitted outdated content requests to Google, and those are honored.
Google puts substantial effort into identifying copycat content. The main way they do that is to see which site had the content first.
Unfortunately, with smaller sites it can be a few days until Google's search bot finds the content, and the copycat sites often have aggressive scrapers, so they appear to have the content first.
From Google's point of view, the copycat is the original, and the original is the copycat.
There are also some kinds of copycat content which users actually prefer. For example, sites which bypass paywalls, sites which quote other sites, sites that display decrapified content from another site, etc.
FWIW, GitHub seems to be letting some wikis be indexed on a test basis, and I am very happy to see they are outranking GHWSEE. That said, going by my guess at the current criteria, there are still many publicly editable wikis on repos with many stars, and publicly un-editable wikis on repos with few stars but useful information, that aren't being indexed.
https://github-wiki-see.page/m/entropia/tip-toi-reveng/wiki/...
higher than the actual GitHub wiki that all of the content was copied from
https://github.com/entropia/tip-toi-reveng/wiki/Languages
BTW, building my own interactive book was a great thing to do over Christmas.