CAPTCHAs: More Effective Than You've Been Led to Believe

Every now and again, I read articles like this one that claim that CAPTCHAs — those “please enter the text from this image” tests meant to verify that a human is filling out a web form — are no longer effective, as spammers have come up with algorithms and countermeasures to defeat them.

Jeff Atwood of the programming blog Coding Horror argues the opposite; he says that they work, and you only have to look to the 'net for proof:

Although there have been a number of CAPTCHA-defeating proof of concepts published, there is no practical evidence that these exploits are actually working in the real world. And if CAPTCHA is so thoroughly defeated, why is it still in use on virtually every major website on the internet? Google, Yahoo, Hotmail, you name it, if the site is even remotely popular, their new account forms are protected by CAPTCHAs.

In the article, he ran a number of experiments in which he took images of text with varying degrees of distortion and fed them to SimpleOCR's demo page. He found that only slight distortion (not enough to fool even a five-year-old) was enough to confound SimpleOCR. He also found that the distortion might not even be necessary: just a little “noise” added to the picture caused SimpleOCR to fail to recognize any of the characters in the text.

He also points to his own experience on his blog, which uses what he calls “Naive CAPTCHA”, in which the CAPTCHA text is the same every time. Even so, it has stopped 99% of his comment spam.
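To make the idea concrete, here is a minimal sketch of what a fixed-text check could look like on the server side. It is not Jeff's actual code; the function name `passes_naive_captcha` and the challenge word are placeholders invented for illustration, since the post doesn't say which word his blog uses.

```python
# Hypothetical fixed challenge word; a placeholder, not the word used on Coding Horror.
EXPECTED_WORD = "example"


def passes_naive_captcha(submitted: str) -> bool:
    """Return True if the visitor typed the one fixed word shown in the image.

    The comparison ignores surrounding whitespace and letter case so that
    small typing differences don't lock out real people.
    """
    return submitted.strip().lower() == EXPECTED_WORD


if __name__ == "__main__":
    print(passes_naive_captcha("  Example "))  # a human who read the image
    print(passes_naive_captcha(""))            # a bot that left the field blank
```

The check is trivial to defeat for anyone who targets the site specifically, which is the point of the anecdote: most comment spam is apparently generic enough that even this stops it.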

He provides a CAPTCHA recipe that he says is “more protection than most websites need.” All it needs to do is combine these elements:

  • high contrast for human readability
  • medium, per-character perturbation
  • random fonts per character
  • low background noise

Here's an example of a CAPTCHA created following this recipe:

[Image: Sample of an effective CAPTCHA from 'Coding Horror'.]
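The four ingredients of the recipe can be sketched in code. The example below is only an illustration under my own assumptions, not the generator Jeff used: it emits an SVG string using nothing but the Python standard library, and the function name `make_captcha_svg`, the font list, and every numeric range (rotation, offsets, number of noise lines) are arbitrary choices made for the sketch.

```python
import random
from xml.sax.saxutils import escape

# Hypothetical font list: generic families, so any SVG renderer can resolve them.
FONTS = ["serif", "sans-serif", "monospace", "cursive"]


def make_captcha_svg(text, rng=None, width=220, height=70):
    """Return an SVG document (as a string) that draws `text` per the recipe."""
    if rng is None:
        rng = random.Random()
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
        # High contrast: black glyphs on a plain white background.
        f'<rect width="{width}" height="{height}" fill="white"/>',
    ]
    # Low background noise: a handful of thin, pale grey lines.
    for _ in range(6):
        x1, x2 = rng.randint(0, width), rng.randint(0, width)
        y1, y2 = rng.randint(0, height), rng.randint(0, height)
        parts.append(
            f'<line x1="{x1}" y1="{y1}" x2="{x2}" y2="{y2}" '
            'stroke="#cccccc" stroke-width="1"/>'
        )
    step = width / (len(text) + 1)
    for i, ch in enumerate(text):
        # Medium per-character perturbation: small shifts, rotation, and size changes.
        x = step * (i + 1) + rng.uniform(-4, 4)
        y = height * 0.65 + rng.uniform(-6, 6)
        angle = rng.uniform(-25, 25)
        size = rng.randint(30, 38)
        # Random font per character.
        font = rng.choice(FONTS)
        parts.append(
            f'<text x="{x:.1f}" y="{y:.1f}" font-family="{font}" '
            f'font-size="{size}" fill="black" text-anchor="middle" '
            f'transform="rotate({angle:.1f} {x:.1f} {y:.1f})">{escape(ch)}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


if __name__ == "__main__":
    # Write a sample to disk; open the file in a browser to view it.
    with open("captcha.svg", "w", encoding="utf-8") as fh:
        fh.write(make_captcha_svg("W3kq"))
```

A real deployment would rasterize the result to a bitmap before serving it, because an SVG exposes the characters as plain text in its markup. The sketch is only meant to show how the four elements combine.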

Jeff also debunks the scenarios in which spammers use “Turing Farms” — either “sweatshops” of low-paid people to respond to CAPTCHA challenges or the much-publicized trick of showing people porn in exchange for answering a CAPTCHA challenge. They're just too expensive to be worth the effort, which is why CAPTCHAs work: they hit spammers where it hurts — in the pocketbook.
