Appears on
Articles (3)
Technical Implementation of DeepSeek LLM Deployment with Expert Parallelism on 96 H100 GPUs
Insight
Fine-Tuned Small LLMs Outperform Larger Models at 5-30x Lower Cost with Data Curation
Insight
Supervised Fine-Tuning as Reinforcement Learning: Introducing Importance-Weighted SFT
Insight

