OpenAI and DeepMind develop algorithm that learns from human preference comparisons for safer AI

3d ago · 6 min read · en · News

Summary

OpenAI, in collaboration with DeepMind's safety team, developed a learning algorithm that infers what humans want from their judgments about which of two proposed behaviors is better, rather than requiring them to write explicit goal functions. The approach uses small amounts of human feedback to solve modern reinforcement learning (RL) environments. It is presented as one step towards safer AI systems, because using a simple proxy for a complex goal, or getting the complex goal slightly wrong, can lead to undesirable and even dangerous behavior.

Source

bsky · OpenAI and DeepMind develop algorithm that learns from human preference comparisons for safer AI · openai.com

Key quotes

One step towards building safe AI systems is to remove the need for humans to write goal functions, since using a simple proxy for a complex goal, or getting the complex goal a bit wrong, can lead to undesirable and even dangerous behavior.

In collaboration with DeepMind's safety team, we've developed an algorithm which can infer what humans want by being told which of two proposed behaviors is better.

We present a learning algorithm that uses small amounts of human feedback to solve modern RL environments.
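
The quotes above do not spell out how a preference between two behaviors becomes a learning signal. A minimal sketch of one common way to do it is a Bradley-Terry style model: each behavior gets a scalar score, the probability that one is preferred is a logistic function of the score difference, and the scores are fitted to the recorded comparisons. Everything here is an assumption made for illustration, not the article's implementation: the linear reward, the two-number feature vectors, the simulated "human" with hidden preference weights, and the names `predicted_preference`, `fit_reward`, and `make_comparisons`.

```python
import math
import random


def predicted_preference(weights, features_a, features_b):
    """Probability that behavior A is preferred over B (Bradley-Terry form)."""
    score_a = sum(w * x for w, x in zip(weights, features_a))
    score_b = sum(w * x for w, x in zip(weights, features_b))
    return 1.0 / (1.0 + math.exp(score_b - score_a))


def fit_reward(comparisons, n_features, lr=0.1, epochs=50):
    """Fit linear reward weights from (features_a, features_b, a_is_better) triples.

    Uses plain stochastic gradient ascent on the log-likelihood of the labels.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for features_a, features_b, a_is_better in comparisons:
            p = predicted_preference(weights, features_a, features_b)
            error = (1.0 if a_is_better else 0.0) - p
            for i in range(n_features):
                weights[i] += lr * error * (features_a[i] - features_b[i])
    return weights


def make_comparisons(n, seed=0):
    """Simulate a human who compares pairs using hidden (hypothetical) weights."""
    rng = random.Random(seed)
    hidden = [2.0, -1.0]  # assumption: stands in for what the human actually wants
    data = []
    for _ in range(n):
        fa = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        fb = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        score_a = sum(h * x for h, x in zip(hidden, fa))
        score_b = sum(h * x for h, x in zip(hidden, fb))
        data.append((fa, fb, score_a > score_b))
    return data


# The learner only ever sees which of two behaviors was judged better.
comparisons = make_comparisons(200)
weights = fit_reward(comparisons, n_features=2)

# Check how often the fitted reward agrees with fresh simulated judgments.
held_out = make_comparisons(100, seed=1)
agreement = sum(
    (predicted_preference(weights, fa, fb) > 0.5) == label
    for fa, fb, label in held_out
) / len(held_out)
print("learned weights:", weights)
print("agreement on held-out comparisons:", agreement)
```

The point of the sketch is that no goal function is written by hand: the weights are recovered only from pairwise judgments. In an RL setting the fitted reward would then be used to train a policy, a step this sketch leaves out.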
