
This was more of a kludge than a serious project, but I'm glad I wasn't the only one who found it useful. I will definitely continue working on it.

>Is it possible to avoid duplicates?

I've thought about this as well but never actually found a solution. Maybe someone here knows of research or an algorithm for uniquely identifying a URL.

Edit: typos



Some pages will link to a canonical URL in the head: https://en.wikipedia.org/wiki/Canonical_link_element
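A rough sketch of how that could be read out of a page, assuming you already have the HTML as a string. The names CanonicalLinkParser and find_canonical_url are made up for illustration, and it only uses Python's standard library:

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalLinkParser(HTMLParser):
    """Hypothetical helper: remembers the first <link rel="canonical"> href."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel can hold several space-separated values, e.g. rel="canonical nofollow".
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonical = attr["href"]


def find_canonical_url(html: str) -> Optional[str]:
    """Return the canonical URL declared in the page, or None if there is none."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    return parser.canonical
```

This returns the href exactly as written, so a relative href would still need to be resolved against the page URL (for example with urllib.parse.urljoin). Pages without the tag return None, so it can only be one signal among several.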


Stripping out the query string probably works for 80% (maybe even more) of sites. Maybe do that by default, with an option to search for the whole URL as a fallback?
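Something like this, as a minimal sketch of the "stripped by default, full URL as fallback" idea. strip_query and lookup_keys are hypothetical names, and dropping the fragment along with the query string is my own assumption:

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Drop the query string (and, as an assumption, the fragment) from a URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def lookup_keys(url: str) -> list[str]:
    """URLs to try in order: the stripped form first, then the full URL as a fallback."""
    stripped = strip_query(url)
    if stripped == url:
        return [url]
    return [stripped, url]
```

The caller would search for each key in turn and stop at the first hit. This will wrongly merge pages on sites where the query string identifies the content (e.g. ?id=123), which is what the fallback to the whole URL is for.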



