Google's reCAPTCHA v2 is one of the tools used to distinguish humans from automated programs and so protect websites from automated attacks. However, modern AI, including advanced machine learning models, is starting to break these protections effectively. The article "Breaking reCAPTCHA v2" reports that YOLO (You Only Look Once) models can solve 100% of reCAPTCHA v2 challenges. In this article, we will look at how AI outsmarted reCAPTCHA and also recall how the mechanism works, so that the results are easier to follow.

It is worth noting that the study covers only reCAPTCHA v2, i.e. the version that starts with clicking the "I'm not a robot" checkbox.



How does reCAPTCHA v2 work?

[Image: the reCAPTCHA "I'm not a robot" checkbox]
Does this image look familiar? Most of us have come across this simple yet ubiquitous checkbox. But have you ever wondered how the mechanism actually works?
The first things it analyzes are:
  • Time spent on page.
  • Mouse movements (e.g. how you move around the page and click).
  • Interactions with page elements.

The user's behavior history is also checked

The research on the reCAPTCHA v2 mechanism states:

“Our study also finds that reCAPTCHAv2 is heavily based on cookie and browser history data when evaluating whether a user is human or not.”

Does this mean that the mechanism shamelessly looks into our browser history and knows which websites we have visited?

Fortunately not 😉

The mechanism does not directly search the user's browsing history or cookies. Instead, it uses data that Google has collected from the user's previous interactions with other reCAPTCHA-protected sites. This allows it to assess the user's reputation based on their earlier online behavior.

If, at this stage, reCAPTCHA v2 has no doubts that the user is human, it lets them proceed without solving any additional challenges.

Only when the system has doubts does it launch the "image challenge"

reCAPTCHA v2 offers three different types of image challenges, each testing a different aspect of visual reasoning:

[Image: the three types of reCAPTCHA v2 image challenges]
📸 Article "Breaking reCAPTCHA v2"
  1. Challenge Type 1 is a classification task in which the user must determine whether each photo in a static 3×3 grid contains a specified object or not.
  2. Challenge Type 2 is an image segmentation task. The user is given a single static image divided into a 4×4 grid and must select the grid cells that contain the specified object.
  3. Challenge Type 3 is similar to Challenge Type 1 in terms of grid layout, but includes dynamic images that refresh upon user interaction. It also requires image classification.
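
To make the difference between the static (Type 1) and dynamic (Type 3) classification challenges concrete, here is a minimal sketch of the selection logic. It is not the paper's code: `classify` and `refresh` are hypothetical callables standing in for an image classifier and for the page replacing a clicked tile, and the tiles can be any objects.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-ins: a classifier that returns a label for a tile, and a
# callback that returns the new tile shown after the tile at an index is clicked.
Classifier = Callable[[object], str]
Refresher = Callable[[int], object]


def select_static_tiles(tiles: Sequence[object], target: str,
                        classify: Classifier) -> List[int]:
    """Type 1: indices of the tiles in a static grid classified as the target."""
    return [i for i, tile in enumerate(tiles) if classify(tile) == target]


def select_dynamic_tiles(tiles: Sequence[object], target: str,
                         classify: Classifier, refresh: Refresher,
                         max_clicks: int = 30) -> List[int]:
    """Type 3: click matching tiles until no visible tile matches the target."""
    current = list(tiles)
    pending = select_static_tiles(current, target, classify)
    clicked: List[int] = []
    while pending and len(clicked) < max_clicks:
        i = pending.pop(0)
        clicked.append(i)
        current[i] = refresh(i)  # the clicked tile is replaced by a new image
        if classify(current[i]) == target:
            pending.append(i)    # the replacement matches too, so revisit it
    return clicked
```

The only structural difference is the loop: in the dynamic case a clicked tile has to be classified again after it refreshes, and `max_clicks` is an arbitrary safety limit chosen for this sketch.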

Methodology, or YOLO

The study on breaking reCAPTCHAv2 focused on advanced machine learning models, in particular YOLO (You Only Look Once). YOLO is a family of object detection models known for speed and accuracy. Unlike multi-stage detectors, YOLO processes the entire image in a single pass, which makes it well suited to tasks that require rapid identification of objects, such as reCAPTCHA tests.

The study used YOLOv8, one of the newest variants of the model at the time, adapted to the image segmentation and classification tasks of reCAPTCHAv2. This allowed the researchers not only to recognize objects in images accurately, but also to pass the successive stages of the captcha automatically, without human intervention.

YOLOv8 in action

🎥 https://yolov8.com/

How exactly was the AI effectiveness testing for reCAPTCHAv2 conducted?

The research took place in a controlled test environment that simulated real-world web browsing conditions. The setup used Python 3.9 and Selenium WebDriver for Firefox, which made it possible to reproduce user interactions with the captcha, from mouse movements to IP address changes made through a VPN.

The YOLOv8 model was fine-tuned on specially collected and labeled data: 14,000 image/label pairs, which enabled it to recognize key objects such as cars, bikes, and road signs that commonly appear in captcha tasks. The experiments covered several scenarios: image classification on a 3×3 grid, segmentation of images divided into 16 parts, and dynamic tasks where images changed after being clicked.
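
For the 16-part scenario, a model's output has to be translated into grid cells. The sketch below shows one plausible way to do that and is not taken from the paper: it assumes a square image, uses a plain bounding box instead of a segmentation mask for brevity, and the `min_overlap` threshold is an illustrative value.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixels, origin at the top-left corner.
Box = Tuple[float, float, float, float]


def cells_covered(box: Box, image_size: float, grid: int = 4,
                  min_overlap: float = 0.1) -> List[int]:
    """Row-major indices of grid cells that the box covers sufficiently.

    A cell is selected when the box overlaps at least `min_overlap`
    of the cell's area (an assumed, tunable threshold).
    """
    cell = image_size / grid
    x0, y0, x1, y1 = box
    selected: List[int] = []
    for row in range(grid):
        for col in range(grid):
            cx0, cy0 = col * cell, row * cell
            # Width and height of the box/cell intersection (0 if disjoint).
            w = max(0.0, min(x1, cx0 + cell) - max(x0, cx0))
            h = max(0.0, min(y1, cy0 + cell) - max(y0, cy0))
            if (w * h) / (cell * cell) >= min_overlap:
                selected.append(row * grid + col)
    return selected
```

For a 400-pixel image, `cells_covered((50, 50, 250, 150), 400)` selects the first three cells of the top two rows. The threshold keeps a box that barely touches a neighbouring cell from selecting it.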

Each stage of the experiment analyzed not only how effectively the YOLO model solved captchas, but also the impact of factors such as mouse movements, changing IP addresses, and the presence of browsing history and cookies. With this approach, the study aimed not only to demonstrate AI's ability to break the protection, but also to identify weaknesses of reCAPTCHAv2 under various realistic conditions of use.

Research Findings: How Does AI Handle reCAPTCHA v2?

The study found that the YOLOv8 model achieved a 100% success rate in solving reCAPTCHAv2 challenges, a significant leap compared with previous methods. Earlier studies used older algorithms that solved the captcha in 68-71% of cases and often struggled with more complex tests. The breakthrough came with YOLOv8, a model tuned specifically to the requirements of reCAPTCHAv2. But under what circumstances did the system decide to display a challenge in the first place, and what factors influenced its appearance?

The Impact of VPN on Bot Recognition

The use of a VPN proved crucial in bypassing bot detection. Because the IP address changed between runs, each attempt at the captcha was treated as a separate session, so the bot could complete the entire series of tests without raising suspicion, as the paper's conclusion underlines:

“A VPN limits the ability of risk assessment algorithms to monitor and create a profile of the bot over several runs by allocating a different IP address for each run.”

In the top graph (a), without a VPN, the bot initially performs relatively well, but after about the 20th attempt the number of challenges increases rapidly, especially for Types 2 and 3, which indicates that the system begins to treat the bot's repetitive behavior as suspicious. As a result, reCAPTCHAv2 imposes more and more verification tasks, which eventually leads to the bot being blocked completely.

In the lower graph (b), where a VPN is used, the number of challenges is distributed much more evenly, and the bot can get through many more attempts (up to 100) without a sudden increase in difficulty. Because the IP address varies, the system treats each attempt as a separate session, which avoids the growth in the number of challenges. The differences between the graphs indicate that a VPN is a key tool in breaking the captcha, allowing bots to pass the tests without arousing the system's suspicion.

[Figure: number of challenges per attempt, (a) without and (b) with a VPN]

Simulation of natural use of a computer mouse

Mouse movement simulation significantly affects the effectiveness of bots in bypassing reCAPTCHAv2. Research shows that imitating human movements reduces the number of challenges imposed by the system, making the bot less recognizable. Natural, fluid movements best imitate real user behavior, which increases the chances of passing captcha tests.

The results from the graphs are as follows:

  • Without mouse movement (Graph a):
    The bot encounters a large number of challenges, regardless of the type of captcha. The system quickly recognizes the lack of natural movements, which results in increased difficulty of the tasks.

  • Simple, linear movements (Graph b):
    Introducing simple, linear movements reduces the number of challenges, especially for Type 2 and Type 3 captchas, but there are still moments of increased difficulty.

  • Movements along Bézier curves (Graph c):
    Simulating more complex, fluid movements significantly reduces the number of challenges. Bézier curves best mimic the natural movements of the human hand, making it harder for the system to detect a bot and allowing it to pass tests more efficiently.
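
To show what a Bézier-curve movement means geometrically, the sketch below generates the points of a cubic Bézier path between two screen positions. It is an illustration, not the paper's implementation: the choice of two control points, their placement at one third and two thirds of the way, and the `spread` and `steps` parameters are all assumptions.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def cubic_bezier(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Point on the cubic Bézier curve with control points p0..p3 at t in [0, 1]."""
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u * u * t * p1[0] + 3 * u * t * t * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u * u * t * p1[1] + 3 * u * t * t * p2[1] + t**3 * p3[1]
    return (x, y)


def mouse_path(start: Point, end: Point, steps: int = 25, spread: float = 80.0,
               rng: Optional[random.Random] = None) -> List[Point]:
    """Return steps + 1 points along a randomly curved path from start to end."""
    rng = rng or random.Random()

    def control(fraction: float) -> Point:
        # A point part-way along the straight line, shifted by a random offset
        # of at most `spread` pixels on each axis so that the path bends.
        return (start[0] + (end[0] - start[0]) * fraction + rng.uniform(-spread, spread),
                start[1] + (end[1] - start[1]) * fraction + rng.uniform(-spread, spread))

    c1, c2 = control(1 / 3), control(2 / 3)
    return [cubic_bezier(start, c1, c2, end, i / steps) for i in range(steps + 1)]
```

With `spread=0` the control points lie on the straight line and the path degenerates to the linear movement of Graph b, which makes the two strategies easy to compare. The path always begins at `start` and finishes at `end`, whatever the random offsets are.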

The role of cookies and browsing history

Browsing history and cookies also affected how reCAPTCHAv2 behaved. The presence of this data helps the system recognize real users. The results of the study show that giving the bot access to user data significantly improves its efficiency, reducing the number of additional tasks it has to perform.

The graphs show the results as follows:

  • No browsing history or cookies (Chart a):
    When the bot is running without user data, the number of captcha challenges is high and unstable. There is a particularly noticeable increase in difficulty for some types, such as Type 3, where challenges can reach as many as 60 tasks in a single attempt. The lack of cookies and history causes the bot to be treated as new and potentially suspicious, leading to an increase in the number of tests.

  • Presence of browsing history and cookies (Chart b):
    When the bot has access to browser data, the number of challenges drops significantly and tests are more predictable. The reCAPTCHA system treats the bot as less suspicious, which reduces the number of tasks, especially for more demanding captcha types. The stability of the results is higher, which suggests that the presence of user data has a positive effect on recognition and reduces the risk of bot detection.

Comparison of bot and human performance

The researchers also decided to test how their bot would perform in comparison to a real human.

  • Chart (a) – Number of challenges for humans:
    For humans, Type 3 challenges were the most common. They usually occurred fewer than 4 times per session, indicating that the reCAPTCHA system generally considered human behavior to be trustworthy. However, there were exceptions, and humans sometimes encountered Type 2 challenges.

  • Chart (b) – Number of bot challenges:
    The number of challenges that bots were given was similar to that encountered by humans. Interestingly, unlike human users, bots were more likely to receive Type 1 and 2 challenges, while Type 3 challenges were much less common.

What about reCAPTCHA v3?

The article does not directly address methods for bypassing reCAPTCHA v3. The study focuses on reCAPTCHAv2, specifically on bots solving visual challenges with YOLO models. reCAPTCHA v3 works differently from v2: it does not ask users to solve challenges, but evaluates behavior in the background and assigns each user a score. Because of this difference, the methods examined in the article may not be directly applicable to the newer version, which is designed primarily to detect bots without visual challenges.

Summary

The study found that modern AI models such as YOLOv8 are able to defeat reCAPTCHAv2, achieving a 100% success rate in solving image tests. The key finding is that reCAPTCHA v2 is no longer fully effective in distinguishing bots from real users, which undermines its value as a protection tool.

Various factors, such as browsing history, cookies, mouse movements, and VPN usage, significantly affect the number and type of challenges reCAPTCHAv2 poses. Simulating human behavior, such as natural mouse movements, and changing IP addresses allow bots to avoid detection and reduce the number of challenges they face, making them nearly indistinguishable from real users.

The results show that current captcha systems, especially those based on reCAPTCHA v2 (and possibly v3), need to evolve to meet new threats from advanced AI algorithms. This study is an important warning to security developers who should seek new, more resilient methods to protect against automated bot attacks.