If you're one of the people who hate picking cars, street signs, and other objects in CAPTCHA image grids, get used to it, because the days of text-based alternatives are numbered.
CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart." As many internet users have discovered, CAPTCHA puzzles are used to separate bots from people.
They don't work all that well, which is why companies like Facebook are constantly cleaning up bad accounts. Ongoing research into machine learning and image recognition techniques is making it even harder to create puzzles that vex software but not people.
Boffins from Lancaster University in the UK, and Northwest University and Peking University in China, have devised an approach for creating text-based CAPTCHA solvers that automatically decipher scrambled depictions of text.
Researchers Guixin Ye, Zhanyong Tang, Dingyi Fang, Zhanxing Zhu, Yansong Feng, Pengfei Xu, Xiaojiang Chen and Zheng Wang describe their CAPTCHA-cracking system in a paper that was presented at the 25th ACM Conference on Computer and Communications Security in October and has now been released to the public.
As the title "Yet Another Text Captcha Solver: A Generative Adversarial Network Based Approach" suggests, the computer scientists used a generative adversarial network (GAN) to teach their CAPTCHA generator, which is used for training their solver.
The GAN technique was first described in 2014.
Coincidentally, that is the year in which researchers from Google and Stanford published a paper titled "The End is Nigh: Generic Solving of Text-based CAPTCHAs." Four years later, the speed bumps that limited generic attacks have been overcome.
Can we break it? Yes, we GAN!
A GAN lends itself to the efficient training of data models. It allowed the researchers to teach their CAPTCHA generation program to quickly create many synthetic text puzzles, which were used to train a base puzzle-solving model. They then refined that model via transfer learning to defeat real text CAPTCHAs using only a small set (about 500, instead of millions) of actual samples.
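For readers who want to see the shape of that two-stage recipe, here is a toy sketch in Python. It is not the researchers' code: a simple random sampler stands in for their GAN-trained CAPTCHA synthesizer, 2-D points stand in for CAPTCHA images, and a logistic regression stands in for their neural network solver. Every name and number in it (`make_samples`, `train`, the shift of 1.0, the sample counts) is a made-up assumption chosen for illustration. It only shows the mechanics: train a base model on plentiful synthetic data, then continue training from those weights on a handful of "real" samples.

```python
import math
import random


def make_samples(rng, n, shift):
    """Draw n labelled 2-D points; `shift` moves both class centres.

    shift=0.0 plays the role of synthetic data, shift=1.0 of real data.
    """
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = (1.0 if label else -1.0) + shift
        point = (rng.gauss(centre, 0.5), rng.gauss(centre, 0.5))
        data.append((point, label))
    return data


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(model, x):
    """Probability that point x belongs to class 1."""
    w, b = model
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


def train(model, data, steps, lr):
    """Full-batch gradient descent on logistic loss, starting from `model`."""
    w, b = list(model[0]), model[1]
    n = len(data)
    for _ in range(steps):
        gw0 = gw1 = gb = 0.0
        for x, y in data:
            err = predict((w, b), x) - y
            gw0 += err * x[0]
            gw1 += err * x[1]
            gb += err
        w[0] -= lr * gw0 / n
        w[1] -= lr * gw1 / n
        b -= lr * gb / n
    return (w, b)


def accuracy(model, data):
    """Fraction of samples whose predicted class matches the label."""
    hits = sum((predict(model, x) >= 0.5) == bool(y) for x, y in data)
    return hits / len(data)


rng = random.Random(0)
synthetic = make_samples(rng, 2000, shift=0.0)  # cheap and plentiful
real_train = make_samples(rng, 50, shift=1.0)   # scarce labelled samples
real_test = make_samples(rng, 1000, shift=1.0)  # held out for evaluation

# Stage 1: train the base model on synthetic data only.
base = train(([0.0, 0.0], 0.0), synthetic, steps=200, lr=0.5)
# Stage 2: fine-tune, starting from the base weights, on the small real set.
tuned = train(base, real_train, steps=200, lr=0.5)

acc_base = accuracy(base, real_test)
acc_tuned = accuracy(tuned, real_test)
print(f"base model on real data:       {acc_base:.3f}")
print(f"fine-tuned model on real data: {acc_tuned:.3f}")
```

The toy problem is far too easy to show why pre-training saves labelled data. It does show that the second stage starts from the first stage's weights instead of from scratch, which is the essence of the transfer learning step described above.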
Numerous attacks on text-based CAPTCHAs have been devised over the years, the researchers say. But the need to train attack mechanisms to deal with specific text-mangling techniques has limited attackers' ability to respond to changes in CAPTCHAs.
Tuning attack heuristics or models requires heavy expert involvement and a labor-intensive, time-consuming process of data gathering and labeling, they explain in the paper.
While some generic attacks have been proposed, they have worked only against relatively simple security features such as noisy backgrounds and a single font.
The researchers claim their approach requires less human involvement and effort to develop a targeted CAPTCHA solver. The attack poses "a particularly serious threat to text-based CAPTCHAs."
The boffins tested their approach on 33 text-based CAPTCHA schemes, 11 of which were used by 32 of the top 50 websites as ranked by Alexa in April of this year. And they were able to crack them in less than 50 milliseconds using a desktop GPU.
Who would still use text-based CAPTCHAs when image-based alternatives are available and Google revised its reCAPTCHA technology in October? Quite a few organizations, it turns out, including Baidu, eBay, Google, Microsoft and Wikipedia.
However, given that object-identification CAPTCHAs also fall to machine-learning attacks, perhaps it is time to look beyond Turing tests. ®