Zyphra's ZAYA1-8B Matches Frontier AI Models on Benchmarks Using Under 1 Billion Active Parameters, Trained on AMD Hardware
By
Mohit Geryani
Master baker tier. Every paragraph earns its place on the tray.
Summary
Zyphra released ZAYA1-8B, a model that matches or competes with frontier AI models like DeepSeek-R1, Claude Sonnet 4.5, and Gemini 2.5 Pro on math, reasoning, and coding benchmarks — while using less than 1 billion active parameters (760M). It was trained entirely on AMD hardware, a notable departure from the NVIDIA-dominated AI training ecosystem. The model uses a Markovian RSA inference method that allows performance to scale with compute budget. However, it underperforms on agentic tasks like tool calling and multi-step instruction following. The article highlights this as significant for researchers focused on test-time compute methods and for those interested in AMD's potential in the AI hardware space.
Key quotes
· 5 pulledZAYA1-8B matches DeepSeek-R1 on math benchmarks.
This one runs on less than 1 billion active parameters.
It was trained entirely on AMD hardware, which almost no serious model can say.
The benchmark numbers at 760M active parameters are not normal and the Markovian RSA boost means performance scales with compute budget rather than hitting a fixed ceiling.
This is the most capable model trained end to end on AMD hardware that anyone has published.
You might also wanna read
Arcee AI Launches Trinity-Large-Thinking: Open-Source AI Model Matching Opus 4.6 Performance at 96% Lower Cost
Arcee AI has launched Trinity-Large-Thinking, an open-source AI model that claims to match the performance of OpenAI's Opus 4.6 while being

Anthropic Releases Claude Opus 4.5 AI Model Amid Cybersecurity Concerns
Anthropic has released Claude Opus 4.5, positioning it as the world's best AI model for coding, agents, and computer use, claiming it surpas
Anthropic Launches Claude Haiku 4.5: Faster, Cheaper AI Model Matching Sonnet 4 Performance
Anthropic launched Claude Haiku 4.5, a small AI model that delivers frontier-level coding performance matching Claude Sonnet 4, but at 2x fa
AMD Releases Instella: Open 3 Billion Parameter Language Models
AMD has released Instella, a high-performance 3 billion parameter language model trained on MI300X hardware. The model weights are available
Zibra AI Launches GPU-Native Data Orchestration Platform for Spatial AI Training
Zibra AI introduces a GPU-native data orchestration platform designed to solve I/O bottlenecks in spatial and physical AI training. The plat
DeepSeek's V4 Model Shows Widening Gap with US Frontier AI Despite Being China's Best
DeepSeek's latest V4 model release was met with a muted reaction, as analysis by the US National Institute for Standards and Technology foun
