Xiaomi MiMo-V2.5-Pro-UltraSpeed achieves 1000+ tokens/s on 1T-parameter model

gainsurier

16d ago · 9 min read · en

technology programming

Summary

MiMo-V2.5-Pro-UltraSpeed, the UltraSpeed mode of Xiaomi MiMo-V2.5-Pro released in collaboration with TileRT, generates over 1000 tokens per second on a 1-trillion-parameter model running on commodity GPUs, which the announcement describes as a first. The article frames reasoning speed as defining the boundaries of intelligence, arguing that a sufficiently fast model stops being a tool you wait on and becomes an extension of your own thinking. It credits extreme model-system codesign as the key enabler of this result.

Source

Hacker News · Xiaomi MiMo-V2.5-Pro-UltraSpeed achieves 1000+ tokens/s on 1T-parameter model · mimo.xiaomi.com

Key quotes · 3 pulled

From the first roaring racer of the combustion age to the sonic boom that shattered the sound barrier, humanity's hunger for speed is written into our very DNA.

The speed of AI reasoning is no different — it defines the boundaries of intelligence itself.

When a model is fast enough, it ceases to be a tool you wait on and becomes an extension of your own thinking: responding in real time, iterating in an instant.

Snippet from the RSS feed

MiMo, in collaboration with TileRT, releases the UltraSpeed mode of Xiaomi MiMo-V2.5-Pro — breaking 1000 tokens/s generation speed on a 1T-parameter model for the first time on commodity GPUs through extreme model-system codesign.
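
To make the "tool you wait on" point concrete, here is a minimal sketch of what a generation rate means in wall-clock terms. Only the 1000 tokens/s figure comes from the announcement; the response lengths and the slower 50 tokens/s comparison rate are hypothetical values chosen for illustration. The sketch also assumes a steady generation rate and ignores prompt processing, time to first token, and network latency.

```python
def per_token_latency_ms(tokens_per_second: float) -> float:
    """Average time spent on each generated token, in milliseconds."""
    return 1000.0 / tokens_per_second


def generation_time_s(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to generate num_tokens at a steady rate."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    headline_rate = 1000.0   # tokens/s, the figure quoted in the announcement
    comparison_rate = 50.0   # tokens/s, hypothetical slower rate for contrast

    # Hypothetical response lengths, not taken from the article.
    for num_tokens in (500, 2000, 8000):
        fast = generation_time_s(num_tokens, headline_rate)
        slow = generation_time_s(num_tokens, comparison_rate)
        print(f"{num_tokens} tokens: {fast:.1f} s at 1000 tokens/s, "
              f"{slow:.1f} s at 50 tokens/s")
```

Under these assumptions, 1000 tokens/s works out to about 1 ms per token, so a 2000-token answer takes roughly 2 seconds of generation time.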

