Research Shows LLMs Have Coherent Utility Functions and Value Systems

By alexcos

7mo ago · 24 min read · en · Insight

Summary

The article discusses a February 2025 research paper from the Center for AI Safety titled 'Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs.' The paper reports that modern large language models (LLMs) have coherent and transitive implicit utility functions and world models. A key finding is that larger and more capable LLMs exhibit more coherent and more transitive preferences (transitive meaning that preferring A over B and B over C implies preferring A over C). The article specifically examines how LLMs trade off lives between different categories, referencing Figure 16, which shows how GPT-4o values lives across different categories.
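
To make the transitivity property concrete, here is a minimal Python sketch that checks a set of pairwise preferences for violations. It is an illustration only, not the paper's method: the function name `transitivity_violations` and the toy preference sets are assumptions introduced here, and real preference elicitation from an LLM would be noisy and probabilistic rather than a clean set of pairs.

```python
from itertools import permutations


def transitivity_violations(prefers):
    """Return every ordered triple (a, b, c) with a > b and b > c but not a > c.

    `prefers` is a set of (winner, loser) pairs; this representation is a
    hypothetical simplification chosen for illustration.
    """
    items = {x for pair in prefers for x in pair}
    return [
        (a, b, c)
        for a, b, c in permutations(sorted(items), 3)
        if (a, b) in prefers and (b, c) in prefers and (a, c) not in prefers
    ]


# Transitive: A > B, B > C, and A > C all hold.
coherent = {("A", "B"), ("B", "C"), ("A", "C")}

# Cyclic: A > B, B > C, but C > A, so no consistent ranking exists.
cyclic = {("A", "B"), ("B", "C"), ("C", "A")}

print(transitivity_violations(coherent))
print(transitivity_violations(cyclic))
```

An empty result means the preferences are consistent with a single ranking; any returned triple marks a place where they are not.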

Key quotes

On February 19th, 2025, the Center for AI Safety published 'Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs'
They showed that modern LLMs have coherent and transitive implicit utility functions and world models
Bigger and more capable LLMs had more coherent and more transitive preferences
Figure 16, which showed how GPT-4o valued lives over different categories
Snippet from the RSS feed
How do LLMs trade off lives between different categories?