OpenAI and DeepMind develop algorithm that learns from human preference comparisons for safer AI

3d ago · 6 min read · en · News

Summary

OpenAI and DeepMind's safety team developed a learning algorithm that infers human preferences by comparing two proposed behaviors, rather than requiring humans to write explicit goal functions. This approach aims to build safer AI systems by using small amounts of human feedback to solve modern reinforcement learning environments, reducing the risk of dangerous behavior caused by poorly specified or oversimplified goals.
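
To make the idea of learning from comparisons concrete, here is a minimal sketch of one common way to turn "A is better than B" labels into a reward estimate. It is an illustration under stated assumptions, not the algorithm from the source: it assumes each behavior is summarized by a small feature vector, that reward is linear in those features, and that preferences follow a logistic (Bradley-Terry style) model. The function name `fit_reward_from_preferences`, the features, and the simulated "human" labels are all hypothetical.

```python
import numpy as np

def fit_reward_from_preferences(feats_a, feats_b, prefers_a, lr=0.5, steps=2000):
    """Fit linear reward weights w from pairwise comparisons.

    Assumed model (not taken from the source):
    P(A preferred over B) = sigmoid(r(A) - r(B)), with r(x) = w . x
    """
    diff = feats_a - feats_b              # only reward differences matter
    w = np.zeros(diff.shape[1])
    for _ in range(steps):
        p_a = 1.0 / (1.0 + np.exp(-(diff @ w)))            # predicted P(A preferred)
        grad = diff.T @ (p_a - prefers_a) / len(prefers_a)  # cross-entropy gradient
        w -= lr * grad                                      # plain gradient descent
    return w

# Simulated comparisons: a hidden preference vector stands in for the human.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])            # hypothetical "what the human wants"
feats_a = rng.normal(size=(200, 2))       # features of behavior A in each pair
feats_b = rng.normal(size=(200, 2))       # features of behavior B in each pair
prefers_a = ((feats_a - feats_b) @ true_w > 0).astype(float)  # 1.0 if A was chosen

w = fit_reward_from_preferences(feats_a, feats_b, prefers_a)
predicted = ((feats_a - feats_b) @ w > 0).astype(float)
agreement = (predicted == prefers_a).mean()   # fraction of comparisons reproduced
```

In this sketch the learned weights only need to rank behaviors the way the labels do, so their scale is not meaningful. A reinforcement learning agent could then be trained against the estimated reward instead of a hand-written goal function, which is the substitution the summary describes.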

Source

OpenAI and DeepMind develop algorithm that learns from human preference comparisons for safer AI (openai.com, via bsky)

Key quotes

One step towards building safe AI systems is to remove the need for humans to write goal functions, since using a simple proxy for a complex goal, or getting the complex goal a bit wrong, can lead to undesirable and even dangerous behavior.
In collaboration with DeepMind's safety team, we've developed an algorithm which can infer what humans want by being told which of two proposed behaviors is better.
We present a learning algorithm that uses small amounts of human feedback to solve modern RL environments.