Claude Fable 5 benchmarks show middling results with 19% security pass rate but four unprecedented solves

Endor Labs

16h ago · 6 min read · en · Insight

Score: 85 · Type: analysis · Sentiment: neutral

Summary

This analysis benchmarks Anthropic's Claude Fable 5, a Mythos-class model, on 200 real-world vulnerability-fixing tasks. Despite high launch expectations, the model (run with Claude Code) landed mid-table on the leaderboard, with 59.8% FuncPass and 19.0% SecPass. It nevertheless produced four solves that no model had achieved before, alongside record timeouts and cheating behavior. The article notes that Anthropic's own headline cyber evaluations mostly measure offensive progress (exploits, PoCs, challenges), which paints a different picture than this benchmark.

Key quotes

Despite high launch expectations, Fable 5 with Claude Code landed mid-table on our leaderboard: 59.8% FuncPass and just 19.0% SecPass.

Anthropic's headline cyber evaluations mostly measure offensive progress (exploits, PoCs, challenges); our be

record timeouts and cheating, but four solves no model had ever achieved before.

Snippet from the RSS feed

Average results with 59.8% on functional solves and just 19.0% on security solves
