Google's reCAPTCHA v2 is one of the tools used to distinguish humans from automated programs and to protect websites from automated attacks. However, modern AI technologies, including advanced machine learning models, are starting to break these protections effectively. The paper "Breaking reCAPTCHA v2" reports that, using YOLO (You Only Look Once) models, it is possible to solve 100% of reCAPTCHA v2 challenges. In this article, we will look at how AI outsmarted reCAPTCHA and also recall how the mechanism works, so that the results are easier to follow.
It is worth noting that the study only covers reCAPTCHA version v2, i.e. the one that works after clicking the "I am not a robot" field.
The research on the reCAPTCHA v2 mechanism states:
“Our study also finds that reCAPTCHAv2 is heavily based on cookie and browser history data when evaluating whether a user is human or not.”
Does this mean that the mechanism shamelessly looks into our browser history and knows which websites we have visited?
Fortunately not 😉
The mechanism does not directly search the user's browsing history or cookies. Instead, it uses data collected by Google from the user's previous interactions with other reCAPTCHA-protected sites. This allows it to assess the user's reputation based on their previous online behavior.
If, at this stage, reCAPTCHA v2 has no doubts that the user is human, it lets the user proceed without solving any additional challenges.
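reCAPTCHA v2 offers three different types of image challenges, each testing a different aspect of visual reasoning: classifying images in a 3×3 grid, segmenting a single image divided into 16 tiles, and dynamic challenges in which images are replaced after being clicked.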
The study on the effectiveness of breaking reCAPTCHAv2 focused on the use of advanced machine learning models, in particular YOLO (You Only Look Once). YOLO is a family of algorithms designed to recognize objects in images, known for its speed and precision. Unlike many other methods, YOLO processes the entire image in a single pass, which makes it well suited to tasks that require rapid identification of objects, such as reCAPTCHA tests.
The study used the YOLOv8 model, one of the newest and most advanced variants of this algorithm, which was specifically tailored to the image segmentation and classification tasks of reCAPTCHAv2. This model allowed the researchers not only to accurately recognize objects in images, but also to automatically go through the subsequent stages of the captcha without human intervention.
🎥 https://yolov8.com/
The research process took place in a controlled test environment that simulated real-world web browsing conditions. Python 3.9 and Selenium WebDriver for Firefox were used, which made it possible to reproduce user interactions with the captcha accurately, from mouse movements to IP changes made through a VPN.
The YOLOv8 algorithm was fine-tuned and trained on specially collected and labeled data. It used 14,000 image/label pairs, which enabled the model to recognize key objects such as cars, bikes, and road signs, which are commonly used in captcha tasks. The experiments tested various scenarios, such as image classification on a 3×3 grid, segmentation of images divided into 16 parts, and dynamic tasks where images changed after being clicked.
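To make the 16-part segmentation scenario more concrete, here is a minimal sketch of how a detected object's position could be translated into tile indices on a 4×4 grid. It is an illustration only, not the researchers' code: the function name `tiles_covered`, the use of a rectangular bounding box instead of a segmentation mask, and the 400-pixel image size are all assumptions made for this example.

```python
from typing import Set, Tuple

# (x_min, y_min, x_max, y_max) in pixels; a hypothetical detector output.
Box = Tuple[float, float, float, float]


def tiles_covered(box: Box, image_size: int, grid: int = 4) -> Set[int]:
    """Return row-major indices of the grid tiles that a bounding box overlaps."""
    tile = image_size / grid
    x_min, y_min, x_max, y_max = box
    # Convert pixel coordinates to tile coordinates, clamped to the grid.
    col_start = max(0, int(x_min // tile))
    row_start = max(0, int(y_min // tile))
    # Subtract a tiny epsilon so a box that ends exactly on a tile border
    # does not spill into the next tile.
    col_end = min(grid - 1, int((x_max - 1e-9) // tile))
    row_end = min(grid - 1, int((y_max - 1e-9) // tile))
    return {
        row * grid + col
        for row in range(row_start, row_end + 1)
        for col in range(col_start, col_end + 1)
    }


# Example: an object spanning the top-left corner of a 400x400 image.
print(sorted(tiles_covered((50, 50, 150, 150), image_size=400)))
```

A real segmentation model returns a pixel mask rather than a rectangle, so a tile would normally be selected only if enough of the mask falls inside it. The rectangle version above over-selects for irregular shapes but shows the basic mapping from image coordinates to grid cells.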
Each stage of the experiment involved analyzing not only the effectiveness of the YOLO model in solving captchas, but also the impact of factors such as mouse movements, changing IP addresses, and the presence of browsing history and cookies. With this approach, the study aimed not only to demonstrate AI's capabilities in breaking security, but also to identify the weaknesses of the reCAPTCHAv2 system under various realistic conditions of use.
The use of VPNs proved crucial in bypassing bot detection mechanisms. The variable IPs meant that each attempt at the captcha was treated as a separate session, allowing the bot to continue without raising suspicion. This allowed the bot to complete the entire series of tests without any problems, as the conclusion underlines:
“A VPN limits the ability of risk assessment algorithms to monitor and create a profile of the bot over several runs by allocating a different IP address for each run.”
In the top graph (a) we can see that initially the bot performs relatively well, but after about the 20th attempt the number of challenges increases rapidly, especially for Types 2 and 3, indicating that the system begins to recognize the bot's repetitive behavior as suspicious. As a result, reCAPTCHAv2 imposes more and more tasks to verify the user, which eventually leads to the bot being blocked completely.
In the lower graph (b), where a VPN is used, the distribution of the number of challenges is much more even, and the bot can go through many more attempts (up to 100) without a sudden increase in difficulty. Because the IP address changes, the system treats each attempt by the bot as a separate session, which prevents the number of challenges from growing. The differences between the graphs clearly indicate that a VPN is a key tool in breaking the security of the captcha, allowing bots to pass the tests without arousing suspicion from the system.
Mouse movement simulation significantly affects the effectiveness of bots in bypassing reCAPTCHAv2. Research shows that imitating human movements reduces the number of challenges imposed by the system, making the bot less recognizable. Natural, fluid movements best imitate real user behavior, which increases the chances of passing captcha tests.
The results from the graphs are as follows:
Without mouse movement (Graph a):
The bot encounters a large number of challenges, regardless of the type of captcha. The system quickly recognizes the lack of natural movements, which results in increased difficulty of the tasks.
Simple, linear movements (Graph b):
The introduction of simple, linear movements reduces the challenge, especially for Type 2 and Type 3 captchas, but there are still moments of increased difficulty.
Movements along Bézier curves (Graph c):
Simulating more complex, fluid movements significantly reduces the challenge. Bézier curves best mimic the natural movements of the human hand, making it harder for the system to detect a bot and allowing it to pass tests more efficiently.
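The difference between the straight-line and the curved movements described above can be sketched with a cubic Bézier curve. The example below is a generic illustration of the math, not the code used in the study: the helper names `cubic_bezier` and `curved_path`, the `spread` parameter, and the way the control points are randomized are assumptions made here.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def cubic_bezier(p0: Point, p1: Point, p2: Point, p3: Point,
                 steps: int = 20) -> List[Point]:
    """Sample steps + 1 points along a cubic Bezier curve from p0 to p3."""
    points = []
    for i in range(steps + 1):
        t = i / steps
        u = 1.0 - t
        # Bernstein weights of a cubic curve; they sum to 1 for every t.
        b0, b1, b2, b3 = u ** 3, 3 * u ** 2 * t, 3 * u * t ** 2, t ** 3
        x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
        y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
        points.append((x, y))
    return points


def curved_path(start: Point, end: Point, spread: float = 80.0,
                steps: int = 20, seed: Optional[int] = None) -> List[Point]:
    """Build a curved path whose control points are randomly shifted (assumed scheme)."""
    rng = random.Random(seed)
    dx, dy = end[0] - start[0], end[1] - start[1]
    # Put the control points one third and two thirds of the way along the
    # straight line, then shift each one by a random offset.
    p1 = (start[0] + dx / 3 + rng.uniform(-spread, spread),
          start[1] + dy / 3 + rng.uniform(-spread, spread))
    p2 = (start[0] + 2 * dx / 3 + rng.uniform(-spread, spread),
          start[1] + 2 * dy / 3 + rng.uniform(-spread, spread))
    return cubic_bezier(start, p1, p2, end, steps)


path = curved_path((0, 0), (300, 120), steps=25, seed=1)
print(len(path), path[0], path[-1])
```

When both control points lie on the segment between the start and end points, the curve collapses into the straight line of the "simple, linear" case; moving them off that line is what produces the smoother arc.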
Browsing history and cookies also affected how reCAPTCHAv2 behaved. The presence of this data makes it easier for the system to recognize a session as belonging to a real user. The results of the study show that a bot running with existing browser data is treated more leniently and has to complete significantly fewer additional tasks.
The graphs show the results as follows:
No browsing history or cookies (Chart a):
When the bot is running without user data, the number of captcha challenges is high and unstable. There is a particularly noticeable increase in difficulty for some types, such as Type 3, where challenges can reach as many as 60 tasks in a single attempt. The lack of cookies and history causes the bot to be treated as new and potentially suspicious, leading to an increase in the number of tests.
Presence of browsing history and cookies (Chart b):
When the bot has access to browser data, the number of challenges drops significantly and tests are more predictable. The reCAPTCHA system treats the bot as less suspicious, which reduces the number of tasks, especially for more demanding captcha types. The stability of the results is higher, which suggests that the presence of user data has a positive effect on recognition and reduces the risk of bot detection.
The researchers also decided to test how their bot would perform in comparison to a real human.
Chart (a) – Number of challenges for people:
For humans, Type 3 challenges were the most common. These challenges usually occurred fewer than 4 times per session, indicating that the reCAPTCHA system generally considered human behavior to be trustworthy. However, there were exceptions, and humans sometimes encountered Type 2 challenges.
Chart (b) – Number of bot challenges:
The number of challenges that bots were given was similar to that encountered by humans. Interestingly, unlike human users, bots were more likely to receive Type 1 and 2 challenges, while Type 3 challenges were much less common.
The article does not directly address methods for bypassing reCAPTCHA v3. The study focuses primarily on analyzing reCAPTCHAv2, specifically on bots solving visual challenges using YOLO models. reCAPTCHA v3 works differently from v2: it does not ask users to solve challenges directly, but evaluates their behavior in the background and assigns each request a score. Because of this difference, the methods examined in the article may not be directly applicable to the newer version, which is primarily designed to detect bots without requiring visual challenges.
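From the site owner's point of view, score-based checking means the application itself decides what to do with each score. The sketch below assumes a score in the range 0.0 to 1.0, where higher values indicate a more trustworthy interaction. The function name `decide_action` and the two thresholds are hypothetical and are not taken from the study or from Google's documentation.

```python
def decide_action(score: float, block_below: float = 0.3,
                  allow_from: float = 0.7) -> str:
    """Map a risk score to an action; both thresholds are illustrative assumptions."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0.0 and 1.0")
    if score >= allow_from:
        return "allow"      # looks like a normal user, no extra friction
    if score < block_below:
        return "block"      # very likely automated traffic
    return "challenge"      # uncertain: ask for an extra verification step


for s in (0.9, 0.5, 0.1):
    print(s, decide_action(s))
```

In practice the thresholds would be tuned per site and per action, for example stricter for login or payment forms than for a newsletter signup.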
The study found that modern AI models such as YOLOv8 are able to defeat reCAPTCHAv2, achieving a 100% success rate in solving image tests. The key finding is that reCAPTCHA is no longer fully effective in distinguishing bots from real users, undermining its value as a protection tool.
Various factors, such as browsing history, cookies, mouse movements, and VPN usage, significantly impact the number and type of challenges reCAPTCHAv2 poses. Simulating human behaviors, such as natural mouse movements or changing IP addresses, allows bots to avoid detection and reduce the number of challenges they face, making them nearly indistinguishable from real users.
The results show that current captcha systems, especially those based on reCAPTCHA v2 (and possibly v3), need to evolve to meet new threats from advanced AI algorithms. This study is an important warning to security developers who should seek new, more resilient methods to protect against automated bot attacks.