I built (and sold) cli.gs, one of the 13 fully compliant services. The system that evolved was, in effect, strict border controls along with frequent police checks:
1. When a new URL shortening request was received, both the requester and the destination were checked. If both passed, the new short URL was returned.
2. When a short URL forwarding request was received (i.e. the bulk of the traffic), the destination was checked again with a configurable probability. If the destination was now deemed malicious, it was disabled on the spot and a warning message was shown. During spam attacks, the checking probability would be set to 100%. (A sketch of this flow follows below.)
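A minimal sketch of those two checks, assuming a hypothetical is_malicious reputation lookup and an in-memory table. None of these names come from the actual cli.gs code:

[code]
import random

RECHECK_PROBABILITY = 0.1   # raised to 1.0 during spam attacks
BANNED_REQUESTERS = set()   # assumed requester blacklist
SHORT_URLS = {}             # short code -> {"destination": str, "disabled": bool}

def is_malicious(url):
    """Stand-in for a real reputation check (blacklists, Safe Browsing, ...)."""
    return "evil" in url    # toy heuristic, for the sketch only

def shorten(requester, destination):
    # Border control: vet both the requester and the destination up front.
    if requester in BANNED_REQUESTERS or is_malicious(destination):
        return None
    code = format(len(SHORT_URLS), "x")
    SHORT_URLS[code] = {"destination": destination, "disabled": False}
    return code

def forward(code):
    # Police check: re-verify a configurable fraction of redirects.
    entry = SHORT_URLS[code]
    if not entry["disabled"] and random.random() < RECHECK_PROBABILITY:
        if is_malicious(entry["destination"]):
            entry["disabled"] = True   # disabled on the spot
    if entry["disabled"]:
        return None                    # show the warning page instead
    return entry["destination"]        # 302 to the real URL
[/code]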
I blogged about this when it launched and started evolving:
http://blog.cli.gs/news/new-anti-spam-and-anti-malware-featu...
http://blog.cli.gs/news/more-anti-spam-and-anti-malware-prot...
Interesting article, but I find the author's use of statistics to be quite bizarre...
[quote]
Approximately 68% of URL shortening services were Stage 1 Compliant.
Approximately 56% of URL shortening services were exclusively Stage 2 Compliant.
[/quote]
From his numbers, it seems he simply meant to omit the word "exclusively", even though it was italicized: if 56% were exclusively Stage 2 compliant, the two groups together would cover 68% + 56% = 124% of services. Also, I'm not sure what prompted the Venn diagram with three sections "A", "B", and "A and B". Most of the resulting regions (such as the overlap of "A" and "A and B" that excludes "B") are empty, for good reason.
Were they supposed to be safe? How can they be classified as "safe" or "unsafe"? It's like calling tar or zip utilities insecure because the archives they produce might contain malware.
http://safe.mn/, for example, checks both the URL and the content. If the content cannot be scanned (too big, server too slow, local URL), visitors are warned that the link was not checked.
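A rough sketch of that three-way outcome (safe / unsafe / unchecked). The helpers url_blacklisted and content_malicious and the size cutoff are assumptions for illustration, not safe.mn's actual implementation:

[code]
import ipaddress
import urllib.parse

MAX_SCAN_BYTES = 5 * 1024 * 1024   # assumed "too big" cutoff

def is_local(url):
    """True for private/loopback hosts the scanner cannot reach."""
    host = urllib.parse.urlparse(url).hostname or ""
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        return host == "localhost"

def url_blacklisted(url):
    return False   # stand-in for a real URL reputation lookup

def content_malicious(body):
    return False   # stand-in for a real content/antivirus scan

def classify(url, body):
    """Classify a destination; body is None when the fetch failed or timed out."""
    if url_blacklisted(url):
        return "unsafe"
    if body is None or len(body) > MAX_SCAN_BYTES or is_local(url):
        return "unchecked"   # visitor is warned the link was not checked
    return "unsafe" if content_malicious(body) else "safe"
[/code]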
Am I missing something, or is the Venn diagram horrible? There are "Stage 1 Compliant" and "Stage 2 Compliant" areas, the overlap of which would logically be "Stage 1 and Stage 2 Compliant". Instead there is a third area for "Stage 1 and Stage 2 Compliant", with the count in the label instead of in the area.
That whole chart is either ridiculous or I am a moron and can't parse it with my brain.
The issue is that URL shorteners, in making your long URLs short, also obscure them, making it harder for you to determine whether a link is a good idea to visit.
In my ideal world, the URL shortener would not block you from visiting the site, but would display an interstitial page warning you of the possible problems if you did - much like Google Safe Browsing.
That is exactly the path I took with http://safe.mn/. If the destination is deemed unsafe, you get a warning explaining what is possibly wrong (malware, virus, adult content, etc.), plus a screenshot (hidden by default), plus a link to continue to the site anyway.
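A minimal sketch of such an interstitial; the markup and parameter names are illustrative, not safe.mn's actual page. The <details> element keeps the screenshot collapsed by default:

[code]
def interstitial(destination, reasons, screenshot_url):
    """Render a warning page instead of redirecting outright."""
    items = "".join(f"<li>{r}</li>" for r in reasons)
    return (
        "<h1>Warning: this link may be unsafe</h1>"
        f"<ul>{items}</ul>"
        # The screenshot stays hidden until the visitor opens it.
        "<details><summary>Show screenshot</summary>"
        f"<img src='{screenshot_url}' alt='screenshot of destination'>"
        "</details>"
        # Warned, not blocked: the visitor can still proceed.
        f"<a href='{destination}'>Continue to the site anyway</a>"
    )

print(interstitial("http://example.com", ["malware", "adult content"],
                   "/shots/example.png"))
[/code]

On the redirect path, the shortener would return this page for flagged destinations and a normal 302 for everything else.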