I'm Not a Bot

Posted by Kenneth Ronkowitz on Tuesday, October 30, 2018

This early version of a CAPTCHA uses a nonsense word, "smwm," and obscures it from computer interpretation by making it an image, twisting the letters and adding a slight background color gradient.

CAPTCHA (/ˈkæp.tʃə/) is an acronym for "Completely Automated Public Turing Test To Tell Computers and Humans Apart." It is the general name for a type of challenge–response test used in computing to determine whether or not the user is human.

You have encountered them when logging into sites. The early versions were scrambled words as images. But they have become more complex.

I suspect that the acronym was formed with the idea of capture+gotcha. That is especially true of a newer form known as an image identification captcha, which may be better at fooling robots but is also better at fooling and frustrating me.

For example, you may encounter ones asking you to "select all the images with a fire hydrant" in them. (It could also be automobiles or road signs or...)


The problem with this type is that the images are small and low quality. In the example shown here, I can't tell if there is a fire hydrant hiding in the image. And the captcha will keep giving me new ones if I'm not correct. The result? I give up trying to use the service.

This user identification procedure has received criticism since it was first introduced in 2003. It certainly has accessibility issues for disabled people. But everyday users also balk at having to use it.

We used a simple version on this blog to try to prevent bots from posting spam comments. That didn't work very well and we had to shut down commenting. We'll never know how many legitimate comments were never posted because the captcha stopped the commenter.

Do they work? I don't know their effectiveness score, but there are approaches to defeating CAPTCHAs. The simplest is to use cheap human labor to recognize them. There are many algorithms and types out there now, and some have bugs that have been exploited to allow the attacker to completely bypass the CAPTCHA. Good old AI and machine learning have also allowed people to build automated solvers.

Is there a need for this technology? Yes. Anyone with a blog knows that spam comments are a problem.

The NoCAPTCHA reCAPTCHA

And then there is the "No CAPTCHA reCAPTCHA." In 2013, the updated reCAPTCHA began implementing behavioral analysis of the browser's interactions with the CAPTCHA to predict whether the user was a human or a bot before displaying the captcha, and presenting a "considerably more difficult" captcha in cases where it had reason to think the user might be a bot.

Public Google services started using it the following year. The first issue with its use was that because NoCAPTCHA relies on the use of Google cookies that are at least a few weeks old, reCAPTCHA has become nearly impossible to complete for people who frequently clear their cookies. An improved version introduced in 2017 by Google is called "invisible reCAPTCHA".
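To make the idea of behavioral analysis a little more concrete, here is a rough sketch in Python of how a site might pick a challenge based on a few signals. Everything in it is an assumption made up for illustration: the signal names, the point values and the thresholds are hypothetical, and Google does not publish how reCAPTCHA actually scores visitors. The only detail taken from the description above is that a cookie younger than a few weeks counts against you.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Hypothetical signals a site might observe (not Google's real inputs)."""
    cookie_age_days: float   # age of the visitor's cookie
    pointer_events: int      # mouse or touch movements seen before the click
    seconds_on_page: float   # time between page load and the click


def risk_points(s: Signals) -> int:
    """Add up made-up penalty points; a higher total means more bot-like."""
    points = 0
    if s.cookie_age_days < 14:   # cookie younger than "a few weeks"
        points += 4
    if s.pointer_events == 0:    # no pointer movement at all
        points += 4
    if s.seconds_on_page < 1.0:  # clicked almost instantly
        points += 2
    return points


def choose_challenge(s: Signals) -> str:
    """Map the point total to a challenge level (thresholds are arbitrary)."""
    points = risk_points(s)
    if points < 3:
        return "checkbox only"
    if points < 7:
        return "standard captcha"
    return "harder captcha"


# A long-time visitor who moved the mouse and took a few seconds.
print(choose_challenge(Signals(cookie_age_days=90, pointer_events=25, seconds_on_page=4.0)))
# Someone who just cleared their cookies but otherwise behaves the same way.
print(choose_challenge(Signals(cookie_age_days=0, pointer_events=25, seconds_on_page=4.0)))
```

In this toy version, the second visitor gets a captcha for no reason other than a fresh cookie, which is the same complaint people who clear their cookies have about the real thing.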

We will continue to make ways to block bots and people will continue to make ways to defeat them. A new project, Mailhide, is being developed to protect email addresses on web pages from being harvested by spammers. It converts the address into a shortened form so that a bot can't see the full email address, so "captcha@gmail.com" becomes "cap...@gmail.com". A human would have to click on it and solve a CAPTCHA to see the full email address.
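The masking step itself is easy to picture. Here is a minimal Python sketch of it, assuming the rule is simply "keep the first three characters of the name and hide the rest." The function name mask_email and its visible parameter are my own inventions for illustration, not Mailhide's actual code, and the click-and-solve-a-CAPTCHA part is left out entirely.

```python
def mask_email(address: str, visible: int = 3) -> str:
    """Hide most of the part before the @ sign.

    Hypothetical helper for illustration; not Mailhide's real implementation.
    """
    local, sep, domain = address.partition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    # Show at most `visible` leading characters, and always hide at least one.
    shown = local[:max(0, min(visible, len(local) - 1))]
    return f"{shown}...@{domain}"


print(mask_email("captcha@gmail.com"))  # cap...@gmail.com
```

A page would publish only the masked form, and the full address would stay on the server until a visitor passes the CAPTCHA.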

Can this be defeated by cheap human labor too? Yes. It's like putting a strong lock on your door. Someone can bust it if they are determined to get in, but you hope to discourage others.

Serendipity35
