Developer Documents Azure LLM Performance Degradation Over Six Months
By sgt3v · 8mo ago · 2 min read · en · Insight
Summary
A developer reports that the LLMs (large language models) they use through Azure have degraded over time. Their test method is systematic: they replay a fixed set of conversations with temperature set to 0, so that responses stay as similar as possible from run to run. Over six months of development, they observed the same model returning progressively less accurate JSON responses, despite using the same prompts and testing methodology.
Key quotes
I have a set of conversations, used with 0 temperature to guarantee I get most similar answers
I can see how the very same model of the LLM gets worse and worse
The JSON responses I receive get less and less accurate over time
I am working on a product that uses Azure in the back-end for LLMs and Audio Models. Just like how I test the code for every release, every time I add or update things on the system prompts for calibration or new features I also test the conversational…
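The testing approach described here (replaying fixed conversations and checking the JSON that comes back) could be sketched roughly as follows. This is a minimal illustration, not the author's actual harness: `score_json_response`, `run_regression`, and the `call_model` callable are hypothetical names, and `call_model` stands in for whatever client code sends a conversation to the deployed model at temperature 0. Note that temperature 0 makes outputs more repeatable but does not strictly guarantee identical responses, so tracking a score over time is more robust than expecting byte-for-byte matches.

```python
import json
from typing import Callable


def score_json_response(raw: str, expected: dict) -> float:
    """Return the fraction of expected top-level fields reproduced exactly."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return 0.0  # unparseable output counts as a complete miss
    if not isinstance(parsed, dict) or not expected:
        return 0.0
    matches = sum(
        1 for key, value in expected.items()
        if key in parsed and parsed[key] == value
    )
    return matches / len(expected)


def run_regression(
    cases: list[tuple[list[dict], dict]],
    call_model: Callable[[list[dict]], str],
) -> float:
    """Replay each fixed conversation and return the mean accuracy score.

    `cases` pairs a conversation (list of chat messages) with the JSON
    object the model is expected to return. `call_model` is a hypothetical
    wrapper around the real API call, assumed to use temperature 0.
    """
    scores = [
        score_json_response(call_model(messages), expected)
        for messages, expected in cases
    ]
    return sum(scores) / len(scores) if scores else 0.0
```

Logging the value returned by `run_regression` alongside the date and the model deployment name for every release would turn the impression that responses "get less and less accurate" into a trend that can be inspected and shared.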
