Alibaba Cloud's Aegaeon System Reduces Nvidia GPU Usage by 82% for AI Models
By hd4 · 7mo ago · 2 min read · News
Summary
Alibaba Cloud has developed a computing pooling system called Aegaeon that reduces Nvidia GPU usage by 82% when serving AI models. The system was tested in Alibaba Cloud's model marketplace for over three months, cutting the number of Nvidia H20 GPUs needed to serve dozens of models with up to 72 billion parameters from 1,192 to just 213. The research was presented at the 31st Symposium on Operating Systems Principles in Seoul.
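The 82% figure can be checked against the reported GPU counts. The short sketch below does that arithmetic; the function name `gpu_reduction` is purely illustrative and does not come from Alibaba's work.

```python
def gpu_reduction(before: int, after: int) -> float:
    """Return the fractional reduction in GPU count (0.0 to 1.0)."""
    return (before - after) / before


# Reported counts of Nvidia H20 GPUs before and after Aegaeon.
reduction = gpu_reduction(1192, 213)
print(f"{reduction:.1%}")  # prints "82.1%"
```

The drop from 1,192 to 213 GPUs removes 979 units, which rounds to the reported 82% cut.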
Key quotes
Alibaba Group Holding has introduced a computing pooling solution that it said led to an 82 per cent cut in the number of Nvidia graphics processing units (GPUs) needed to serve its artificial intelligence models.
The system, called Aegaeon, was beta tested in Alibaba Cloud's model marketplace for more than three months, where it reduced the number of Nvidia H20 GPUs required to serve dozens of models of up to 72 billion parameters from 1,192 to 213.
Aegaeon is t
The new Aegaeon system can serve dozens of large language models using a fraction of the GPUs previously required, potentially reshaping AI workloads.
