AI Model Performance Comparison: Claude Sonnet 4.5 Leads in CAPTCHA Solving Tests

mdahardy

6mo ago · 7 min read · en · Insight

Score: 85 · Type: analysis · Sentiment: neutral

Summary

The article presents a benchmarking study comparing three leading AI models (Claude Sonnet 4.5, Gemini 2.5 Pro, and GPT-5) on their ability to solve Google reCAPTCHA v2 challenges. Claude Sonnet 4.5 achieved the highest success rate at 60%, narrowly ahead of Gemini 2.5 Pro at 56%, while GPT-5 trailed well behind, solving CAPTCHAs on only 28% of trials. The study examines how well modern CAPTCHA systems hold up against advanced AI agents.

Key quotes

Claude Sonnet 4.5 performed best with a 60% success rate, slightly outperforming Gemini 2.5 Pro at 56%

GPT-5 performed significantly worse and only managed to solve CAPTCHAs on 28% of trials

Many sites use CAPTCHAs to distinguish humans from automated traffic. How well do these CAPTCHAs hold up against modern AI agents?

Snippet from the RSS feed

We evaluate three leading AI models—Claude Sonnet 4.5 (Anthropic), Gemini 2.5 Pro (Google), and GPT-5 (OpenAI)—on their ability to solve Google reCAPTCHA v2 challenges.
