So you replace the hard to read CAPTCHAs with easy to read ads. Of which presumably there will be a limited number in circulation at any one time. Sounds kinda easy to circumvent...
A long-standing solution to hard-to-read captchas is easy-to-read (but hard-to-process) captchas. E.g. show a picture of an animal or a shape and ask what it is, or even just ask a simple math problem or riddle in writing.
The problem with those is that they need to be constantly updated from a reliable source, or else, once the solution becomes popular, the spammer can brute-force it in linear time (no matter how high N is, there are only N possible patterns).
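To make the linear-time point concrete, here is a minimal Python sketch, assuming a fixed pool of challenges that is never rotated. The names (`PatternCache`, `solve_by_hand`) and the three-item pool are hypothetical, and the "human" step is just a stand-in function.

```python
from typing import Callable, Dict


class PatternCache:
    """Remembers the answer to every challenge it has seen once."""

    def __init__(self, solve_by_hand: Callable[[str], str]) -> None:
        # Hypothetical slow or paid step: a person looks at the challenge.
        self._solve_by_hand = solve_by_hand
        self._answers: Dict[str, str] = {}
        self.manual_solves = 0

    def answer(self, challenge_id: str) -> str:
        # Only pay for a manual solve the first time a pattern shows up.
        if challenge_id not in self._answers:
            self._answers[challenge_id] = self._solve_by_hand(challenge_id)
            self.manual_solves += 1
        return self._answers[challenge_id]


# Toy pool of N = 3 patterns (an assumption for illustration).
pool = {"img-1": "cat", "img-2": "dog", "img-3": "house"}
cache = PatternCache(lambda challenge_id: pool[challenge_id])

# 1000 challenges drawn round-robin from the same pool.
results = [cache.answer(f"img-{i % 3 + 1}") for i in range(1000)]
print(cache.manual_solves, "manual solves for", len(results), "challenges")
```

The manual work is bounded by the size of the pool, not by the number of challenges served, which is why the pool has to keep changing.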
This seems to be an attempt at fixing this. Ads are often relatively short-lived, so by the time the spammer has them brute-forced, they might be out of circulation, and more importantly, there are new ads in. It's an arms race, and this is a way to pay our troops. Also, it's trivial for advertisers to make many different variations (e.g. one for each sales bullet point), so there are many variations in circulation. Since there's often more textual content than the password in the ad, they're not prone to simple OCR, while still being easy for the user to comprehend.
The obvious shortcomings show up if ads are not so short-lived, or if it's easy to identify and break classes of ads (e.g. if it's yellow and has the IE logo in position X, OCR area Y, done). Also, it's a bit of a dealbreaker if I'm forced to open and visit a website to get the password.
This was later, awesomely riffed on by HotCaptcha (http://valleywag.gawker.com/246656/a-face-only-a-bot-could-l...) which pulled HotOrNot data and asked you to select the 3 hot women out of 9. Sadly, the site is down now but I remember trying it and it was remarkably useful and a hell of a lot more fun than word captchas.
Your argument is somewhat flawed on a technical level, as Olegk mentioned, but also from a business marketing standpoint.
Specific ads may be in circulation for a short time, or there may be many running concurrently, but it is the message that the advertiser is trying to get across, the tagline, and it would be of most benefit to the advertiser to get the user to associate their brand with a single concept. Using the example in the article, Subaru may want to be associated with "outback", not "four wheel drive" or "comfy" or "sporty". Businesses who try to target too many things or too wide an audience end up not getting their message across.
I'm not saying that what Solve Media is trying to do isn't a great idea. I think it has lots of potential, but they clearly still have more to work on.
Please be careful. It could be that his thoughts are excellent but his communication is flawed. It could also be that he has a great deal of general expertise in the subject but is mistaken in this specific statement.
Thus, a general statement about him could be false, and it is also aggressively ad hominem. You might want to consider focusing on the statement itself rather than the speaker, such as:
"Your suggestion is entirely wrong."
JM2C of course, and it is possible that I don't know what I'm talking about. I am not a psychologist or a logician.
So if I have a library of tens of thousands of images of animals, shapes, things etc. that are all easily recognisable to English speakers -- e.g. cat, dog, house, drum, road, tree, book, horse -- and ask them to write in what it is, AND I constantly update that library and retire pictures that have been used many times -- what is your dumb script's success rate?
What if I combine three pictures in each challenge - e.g. "cat house triangle"?
My spam script would always answer "cat", so among 8 options (cat, dog, house, drum, road, tree, book, horse), I'd get a 12.5% success rate.
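A quick sketch of that arithmetic in Python, assuming each picture is drawn uniformly and independently from the eight listed options (the thread does not say how pictures are chosen, so that is an assumption, as is the helper name `fixed_guess_success`). It also covers the three-picture variant asked about above.

```python
from fractions import Fraction

OPTIONS = ["cat", "dog", "house", "drum", "road", "tree", "book", "horse"]


def fixed_guess_success(num_options: int, pictures_per_challenge: int = 1) -> Fraction:
    """Chance that one fixed guess per picture is right for every picture,
    assuming uniform, independent draws from num_options possibilities."""
    return Fraction(1, num_options) ** pictures_per_challenge


single = fixed_guess_success(len(OPTIONS))     # always answering "cat"
triple = fixed_guess_success(len(OPTIONS), 3)  # e.g. always "cat cat cat"

print(f"one picture:    {float(single):.2%}")
print(f"three pictures: {float(triple):.4%}")
```

Under those assumptions a single picture gives 1 in 8, and three pictures give 1 in 512. A larger answer vocabulary lowers both numbers further.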
Plus you constantly have to update your image library, which is a huge pain.
Also, recognizing 10,000 images will take me around one day and less than $1000 with Amazon Mechanical Turk, thus giving me a perfect 100% success rate. After that you would have to completely renew your image database.
You aren't getting it. The probability is 1/<number of options you present to the user>. If you show the user 100 images and ask them to select one, a bot will have a 1% probability to find the right one, but the user will tell you to get lost.
If you present 10 images (still a stretch), bots will have 10% success rate just answering randomly.
EDIT: Wait, from what I see you mean that the user will have to write "cat" or "dog" or whatever? That's better, yes. Communication, however, is hard, which is why the GP and I didn't understand what you meant.
not to mention the fact that the bot will get spotted for entering the same phrase more than a few times, get put on a list and get served the squiggly crap
You have no idea how spam works. A bot isn't just one user trying to enter "cat" repeatedly. Botnets send requests from thousands of different IPs. You wouldn't know which ones are real users, and which ones are bots.
You know, most modern captchas are not solved by bots, but by people in third world countries solving them for pennies. The current going rate is about $1/1000 and there are easy to use captcha solving APIs for any platform.
Captcha has long provided an illusion of security, nothing more. Any and all captchas will be broken.
ok, but how many comments are you losing because people don't want to fill out a captcha?
i've used defensio.com for filtering comments on my site with no captcha and rarely ever get a false positive or negative. false negatives are easy to spot, and users can manually override false positives by supplying an email address to get a confirmation link (which gets fed back to defensio as a false positive once clicked).
It's a tradeoff, sure. In my case, I'm perfectly happy to trade a few comments from people who don't want to deal with a captcha for never having to manually deal with spam. YMMV.
That's great, pretty much any home-rolled CAPTCHA will perform great, simply because it's not worthwhile for spammers to design an attack. If that, however, was the default anti-spam mechanism on, say, wordpress installs, it'd be automated against in minutes.
That's one datapoint. The counterpoint is the thousands of forum owners who mistakenly relied on captcha to prevent spam at the expense of other antispam measures, like GeoIP.
I'm not saying captchas are a magic one-step solution to end all spam (hence the mention of Akismet). I'm just disputing the idea that they are "nothing more" than illusory security. Sure, they can be turked, and sure, if someone relies only on captchas they'll probably eventually get screwed. But they do provide some benefit as part of a complete approach against spam, in that they raise the cost of a spam attempt up from zero. Unless you have meaningful levels of traffic, the vast majority of spammers aren't going to bother with you if it costs anything at all to hit you.
Heh, I know people who are working on breaking reCaptcha. Maybe they should get together and learn from each other.
If you're saying that it's possible to prevent manual, outsourced captcha solving, I would have to strongly disagree. It's similar to the futility of DRM as an antipiracy measure. As long as I can see the captcha, I can pay someone in another country to solve it for me.
I will agree that it's rare- but the most sophisticated and high-volume spammers do it, and they're the ones you have to worry about.
In case people start giving me sideways glances I want to make it clear that I haven't done any blackhat stuff in a very long time, but I'm still interested in the blackhat community from a security researcher's perspective.
I'm not saying that it's possible to do it in an automated way. I'm just saying that tools exist to create red flags that would make things easily verifiable by a human. They have ways in which they deal with these things. All I'm saying is that so far, it's been working for them.
You're probably not going to tell me more about these tools, although I would be very curious to learn how they work.
All I have is anecdotal evidence- I know several people who are making their living spamming Craigslist and outsourcing their ReCaptcha solving. Since they can make $XX per post, paying pennies for captcha is a tiny expense.
I don't condone this behavior, but be aware- it happens more than you think.
Anyway, probably best to take the discussion to email if you want more...perspective from the other side.
£1/1000 is still more expensive than if they could be done without human input. And that has to be paid every time a captcha is needed. With this scheme you could presumably pay for humans to solve the ads currently in rotation (ten? hundreds? thousands? not more than that), then just remember the results.
I don't think having people solve captchas for money is all that common. Do you have a source or something? Even if it is common, captchas still stop those spamming strategies that rely on each spam being free for the spammer.
I don't know how prevalent this practice is exactly, but I have it on good authority that sites like decaptcher.com are making a mint selling this service.
yeah but you gotta look at it from a business standpoint
who profits from the outsourcing? why would a website owner pay to outsource the ads to get solved in india, just to get blacklisted and then lose revenue that would have come in anyway for less
Indeed. Without some form of randomisation (which advertisers probably do not want), this would drastically lower the price of solving a captcha, because you need to submit each ad to a human solver network only once and can store the solution.
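A rough back-of-the-envelope in Python, using the "$1/1000" rate quoted upthread. The spam volume and the number of ads in rotation are made-up figures for illustration, and the function names are hypothetical.

```python
PRICE_PER_SOLVE = 1.0 / 1000  # dollars per human solve, the rate quoted in the thread


def cost_solving_every_time(num_attempts: int) -> float:
    """Classic captcha: every attempt needs a fresh human solve."""
    return num_attempts * PRICE_PER_SOLVE


def cost_with_stored_solutions(ads_in_rotation: int) -> float:
    """Ad captcha without randomisation: each ad is solved once and the
    answer is stored, so later attempts cost nothing extra."""
    return ads_in_rotation * PRICE_PER_SOLVE


attempts = 1_000_000  # assumed spam volume
ads = 1_000           # assumed number of ads in rotation

print(f"solve every time: ${cost_solving_every_time(attempts):,.2f}")
print(f"solve once, store: ${cost_with_stored_solutions(ads):,.2f}")
```

With these assumed numbers the cost drops from roughly $1000 to roughly $1, and the gap only grows with volume, since the second figure depends on the ad pool and not on the number of attempts.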