Relace Open-Sources Fast Apply Model Training Methods After Reaching 10k Tokens/Second Performance

By eborgnia

7mo ago · 15 min read

Summary

Relace reflects on the year since releasing its first Fast Apply model and shares what it learned about fine-tuning small, specialized models for code-specific tasks. The company is open-sourcing its learnings on dataset curation, training methods, and inference techniques that led to Relace Apply 3, its best model to date, which runs at 10k+ tokens per second while maintaining state-of-the-art accuracy. The article also discusses performance improvements and how this model series helped the company surpass $1M in annual recurring revenue.

Key quotes

Today, we're open-sourcing what we've learned in training this series of models — dataset curation, training methods, and inference techniques that led to Relace Apply 3, our best model yet
capable of running at 10k+ tokens per second while maintaining state-of-the-art accuracy
A year ago today, we released our first Fast Apply model publicly. Since then, we've learned a lot about how to fine-tune small, specialized models for code-specific tasks
Snippet from the RSS feed
Relace Apply 3, and how we built the series of models that took us past 1M ARR.
