Zyphra's ZAYA1-8B Matches Frontier AI Models on Benchmarks Using Under 1 Billion Active Parameters, Trained on AMD Hardware

By Mohit Geryani

24d ago · 8 min read · en · Insight

Summary

Zyphra released ZAYA1-8B, a model that matches or competes with frontier AI models such as DeepSeek-R1, Claude Sonnet 4.5, and Gemini 2.5 Pro on math, reasoning, and coding benchmarks while using fewer than 1 billion active parameters (760M). It was trained entirely on AMD hardware, a notable departure from the NVIDIA-dominated AI training ecosystem. The model uses a Markovian RSA inference method that lets performance scale with the compute budget. However, it underperforms on agentic tasks such as tool calling and multi-step instruction following. The article argues the release matters most to researchers working on test-time compute methods and to anyone tracking AMD's potential in AI hardware.

Key quotes (5 pulled)

ZAYA1-8B matches DeepSeek-R1 on math benchmarks.
This one runs on less than 1 billion active parameters.
It was trained entirely on AMD hardware, which almost no serious model can say.
The benchmark numbers at 760M active parameters are not normal and the Markovian RSA boost means performance scales with compute budget rather than hitting a fixed ceiling.
This is the most capable model trained end to end on AMD hardware that anyone has published.
Snippet from the RSS feed
Who should care: If you work with math, science problems, or complex coding tasks and you're looking for something small enough to run locally or cheaply via API, this is worth serious evaluation. The benchmark numbers at 760M active parameters are not nor
