Challenges in Scaling Reinforcement Learning
By jxmorris12
11mo ago · 10 min read · News
Summary
The article asks whether reinforcement learning (RL) scales as well as other training objectives, such as next-token prediction, denoising diffusion, and contrastive learning, which have been shown to scale to models with billions of parameters.
Key quotes
Does RL also scale like all the other objectives? Apparently, it does.
Over the past few years, we've seen that next-token prediction scales, denoising diffusion scales, contrastive learning scales, and so on, all the way to the point where we can train models with billions of parameters
