Anthropic Releases Original Performance Take-Home Coding Challenge on GitHub
By myahio
Summary
Anthropic has open-sourced its original performance take-home coding challenge on GitHub. The repository contains a version of the take-home from before Claude Opus 4.5 began outperforming humans given only 2 hours. The original was a 4-hour challenge; after Claude Opus 4 beat most humans at it, Anthropic replaced it with a 2-hour version. The repo is based on that newer version, which has a few more instructions and better debugging tools, but its starter code is reverted to the slowest baseline, so developers can attempt the challenge from a starting point close to the original's.
Key quotes
This repo contains a version of Anthropic's original performance take-home, before Claude Opus 4.5 started doing better than humans given only 2 hours.
The original take-home was a 4-hour one that starts close to the contents of this repo, after Claude Opus 4 beat most humans at that, it was updated to a 2-hour one.
This repo is based on the newer take-home which has a few more instructions and comes with better debugging tools, but has the starter code reverted to the slowest baseline.
