Relace Open-Sources Fast Apply Model Training Methods After Reaching 10k Tokens/Second Performance
By
eborgnia
7mo ago · 15 min read
Summary
Relace reflects on one year since releasing their first Fast Apply model, sharing insights from training specialized models for code-specific tasks. They are open-sourcing their learnings on dataset curation, training methods, and inference techniques that led to Relace Apply 3, their best model capable of running at 10k+ tokens per second while maintaining state-of-the-art accuracy. The article discusses performance improvements and how this model series contributed to surpassing $1M in annual recurring revenue.
Key quotes
Today, we're open-sourcing what we've learned in training this series of models — dataset curation, training methods, and inference techniques that led to Relace Apply 3, our best model yet
capable of running at 10k+ tokens per second while maintaining state-of-the-art accuracy
A year ago today, we released our first Fast Apply model publicly. Since then, we've learned a lot about how to fine-tune small, specialized models for code-specific tasks
Relace Apply 3, and how we built the series of models that took us past 1M ARR.
