Scaling Karpathy's Autoresearch: Parallel GPU Processing Enables New AI Experimentation Strategies

By hopechong · 2mo ago · 11 min read · en · Insight

Summary

The article describes an experiment in which the authors scaled Andrej Karpathy's autoresearch system: instead of running one experiment at a time, the AI agent was given access to 16 GPUs on a Kubernetes cluster. Over 8 hours it submitted approximately 910 experiments and found that scaling model width mattered more than any single hyperparameter. The agent also taught itself to screen ideas on H100 GPUs and use H200 GPUs for validation. Validation bits per byte (val_bpb) fell from 1.003 to 0.974, a 2.87% improvement over baseline. The key insight is that parallelism changed the agent's search strategy. With one GPU it was limited to greedy hill-climbing (try one thing, check, repeat); with 16 GPUs it ran factorial grids of 10-13 experiments per wave, catching interactions between hyperparameters that sequential execution would miss.
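To make the contrast between the two search strategies concrete, here is a minimal Python sketch on a toy problem. Everything in it is an assumption for illustration: the loss table is invented, the option names (`width`, `lr`) are hypothetical, and `evaluate` stands in for a real training run. It is not the agent's actual code. The toy table is built so that neither change helps on its own but the combination does, which is the kind of interaction a one-at-a-time search cannot see.

```python
from itertools import product

# Invented toy losses (lower is better). Each single change looks worse
# than the baseline, but applying both together is best.
TOY_LOSS = {
    ("narrow", "low_lr"): 1.00,   # baseline
    ("wide", "low_lr"): 1.02,
    ("narrow", "high_lr"): 1.01,
    ("wide", "high_lr"): 0.95,
}
OPTIONS = {"width": ["narrow", "wide"], "lr": ["low_lr", "high_lr"]}


def evaluate(config):
    """Stand-in for one training run; returns the toy loss."""
    return TOY_LOSS[(config["width"], config["lr"])]


def greedy_search(start):
    """One experiment at a time: keep a change only if it improves the loss."""
    best, best_loss, n_runs = dict(start), evaluate(start), 1
    improved = True
    while improved:
        improved = False
        for name, values in OPTIONS.items():
            for value in values:
                if value == best[name]:
                    continue
                candidate = {**best, name: value}
                loss = evaluate(candidate)
                n_runs += 1
                if loss < best_loss:
                    best, best_loss, improved = candidate, loss, True
    return best, best_loss, n_runs


def factorial_wave():
    """One wave: every combination is evaluated, then the best is picked.

    In a parallel setup each config would run on its own GPU at the same time;
    here they are simply evaluated in a loop.
    """
    names = list(OPTIONS)
    configs = [dict(zip(names, combo)) for combo in product(*OPTIONS.values())]
    losses = [evaluate(c) for c in configs]
    i = min(range(len(configs)), key=losses.__getitem__)
    return configs[i], losses[i], len(configs)


if __name__ == "__main__":
    print("greedy:   ", greedy_search({"width": "narrow", "lr": "low_lr"}))
    print("factorial:", factorial_wave())
```

On this toy table the greedy search rejects both single changes and stays at the baseline, while the grid evaluates the combined setting and finds it. The grid costs more runs in total, but when the runs execute in parallel a wave takes roughly the wall-clock time of one experiment.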

Key quotes

Over 8 hours it submitted ~910 experiments, found that scaling model width mattered more than any single hyperparameter
taught itself to use H200s for validation while screening ideas on H100s
drove val_bpb from 1.003 down to 0.974 - a 2.87% improvement over baseline
With one GPU, it's stuck doing greedy hill-climbing - try one thing, check, repeat
With 16 GPUs, it ran factorial grids of 10-13 experiments per wave, catching interactions
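The "screen on H100s, validate on H200s" pattern from the quotes above can be sketched as a two-stage filter. This is only an illustration under stated assumptions: the summary does not say how the agent dispatched jobs or how many candidates it promoted, so `screen`, `validate`, and `top_k` are hypothetical names, and the scores below are invented.

```python
def screen_then_validate(candidates, screen, validate, top_k):
    """Rank all candidates with a screening run, then re-run only the top_k.

    screen and validate are callables returning a loss (lower is better).
    In the setup described above, screen would correspond to a run on the
    H100 pool and validate to a run on the H200 pool (assumption).
    """
    shortlisted = sorted(candidates, key=screen)[:top_k]
    results = {name: validate(name) for name in shortlisted}
    best = min(results, key=results.get)
    return best, results


if __name__ == "__main__":
    # Invented scores: the screening order and the validation order disagree,
    # which is why the shortlist is re-checked before picking a winner.
    screen_scores = {"a": 0.99, "b": 0.98, "c": 1.01, "d": 1.00}
    validate_scores = {"a": 0.975, "b": 0.985, "c": 1.000, "d": 0.990}
    best, results = screen_then_validate(
        list(screen_scores), screen_scores.get, validate_scores.get, top_k=2
    )
    print(best, results)
```

The design choice is to spend the scarcer validation capacity only on ideas that already looked promising in screening, rather than on every candidate.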
Snippet from the RSS feed
Karpathy's autoresearch runs one experiment at a time. We gave it access to our GPU infra and let it run experiments in parallel.
