OpenAI's GPT OSS 120B Model Now Available on Cerebras Inference Cloud

By samspenc · 6mo ago · 3 min read · en · News

Summary

OpenAI's GPT OSS 120B model is now available on the Cerebras Inference Cloud, which is built for high-speed AI inference. The 120-billion-parameter mixture-of-experts model delivers accuracy comparable to OpenAI's o4-mini while running at up to 3,000 tokens per second. It supports a 131K-token context length and is priced at $0.25 per million input tokens and $0.69 per million output tokens. Cerebras positions itself as a platform for fast AI training and inference.
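To make the pricing and speed figures concrete, here is a minimal sketch that estimates the cost and the best-case generation time of a single request. The prices and the 3,000 tokens-per-second figure come from the summary above; the function names and the sample request size are hypothetical, and the speed is an "up to" number, so real latency may be higher.

```python
# Figures quoted in the summary (USD per million tokens, peak tokens/second).
INPUT_USD_PER_M = 0.25
OUTPUT_USD_PER_M = 0.69
PEAK_TOKENS_PER_SEC = 3000  # "up to" figure; actual throughput may be lower


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the quoted per-million-token prices."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


def min_generation_seconds(output_tokens: int) -> float:
    """Lower bound on generation time, assuming the peak quoted speed."""
    return output_tokens / PEAK_TOKENS_PER_SEC


if __name__ == "__main__":
    # Hypothetical request: 100,000 input tokens, 3,000 output tokens.
    cost = estimate_cost_usd(100_000, 3_000)
    seconds = min_generation_seconds(3_000)
    print(f"estimated cost: ${cost:.4f}")
    print(f"best-case generation time: {seconds:.1f} s")
```

At these prices, a request that fills most of the 131K context is dominated by input cost, while the output side mainly determines how long the response takes to generate.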

Key quotes · 4 pulled
The first open weight reasoning model by OpenAI, OSS 120B delivers model accuracy that rivals o4-mini while running at up to 3,000 tokens per second on the Cerebras Inference Cloud.
Reasoning tasks that take up to a minute to complete on GPUs finish in just one second on Cerebras.
OSS 120B is available today with 131K context at $0.25 per M input tokens and $0.69 per M output tokens.
GPT OSS 120B is a 120 billion parameter mixture-of-expert model that delivers near parity performance with OpenAI's popular o4-mini on core reasoning benchmarks.
Snippet from the RSS feed
Cerebras is the go-to platform for fast and effortless AI training. Learn more at cerebras.ai.
