AI Model Performance Comparison: Claude Sonnet 4.5 Leads in CAPTCHA Solving Tests
By mdahardy
Summary
The article benchmarks three leading AI models, Claude Sonnet 4.5, Gemini 2.5 Pro, and GPT-5, on their ability to solve Google reCAPTCHA v2 challenges. Claude Sonnet 4.5 achieved the highest success rate at 60%, slightly ahead of Gemini 2.5 Pro at 56%, while GPT-5 trailed well behind, solving only 28% of trials. The study examines how well modern CAPTCHA systems hold up against advanced AI agents.
Key quotes
Claude Sonnet 4.5 performed best with a 60% success rate, slightly outperforming Gemini 2.5 Pro at 56%
GPT-5 performed significantly worse and only managed to solve CAPTCHAs on 28% of trials
Many sites use CAPTCHAs to distinguish humans from automated traffic. How well do these CAPTCHAs hold up against modern AI agents?