Learning from human preferences

📰 OpenAI News

OpenAI and DeepMind developed an algorithm that infers what humans want from their judgments of which of two proposed behaviors is better

Level: advanced. Published 13 Jun 2017

Action Steps

Collaborate with safety teams to identify complex goals that are hard to write down as explicit goal functions
Develop algorithms that can infer human preferences from comparisons
Test and refine the algorithm with human feedback
Integrate the algorithm into AI systems to improve safety and alignment

Who Needs to Know This

AI researchers and engineers can use this algorithm to build safer AI systems, and product managers can use it to develop AI products that are better aligned with what users want.

Key Insight

💡 Inferring human preferences from comparisons can help build safer AI systems

Full Article

One step towards building safe AI systems is to remove the need for humans to write goal functions, since using a simple proxy for a complex goal, or getting the complex goal a bit wrong, can lead to undesirable and even dangerous behavior. In collaboration with DeepMind’s safety team, we’ve developed an algorithm which can infer what humans want by being told which of two proposed behaviors is better.
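To make the idea of learning from comparisons concrete, here is a minimal sketch. It is not the implementation described in the article. It assumes that each behavior can be summarized by a small feature vector, that reward is linear in those features, and that the probability of a human preferring one behavior over another follows a logistic (Bradley-Terry) model. The function names, the simulated human, and the hidden `true_weights` are all hypothetical and exist only for illustration.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def predicted_reward(weights, features):
    """Linear reward estimate for one behavior (an assumption of this sketch)."""
    return sum(w * f for w, f in zip(weights, features))


def preference_probability(weights, feats_a, feats_b):
    """Modelled probability that behavior A is judged better than behavior B."""
    diff = predicted_reward(weights, feats_a) - predicted_reward(weights, feats_b)
    return sigmoid(diff)


def fit_reward_model(comparisons, dim, lr=0.1, epochs=50):
    """Fit reward weights by stochastic gradient descent on cross-entropy loss.

    comparisons: list of (feats_a, feats_b, label), where label is 1.0 if the
    human preferred A and 0.0 if the human preferred B.
    """
    weights = [0.0] * dim
    for _ in range(epochs):
        for feats_a, feats_b, label in comparisons:
            p = preference_probability(weights, feats_a, feats_b)
            # Gradient of the cross-entropy loss: (p - label) * (feats_a - feats_b)
            for i in range(dim):
                weights[i] -= lr * (p - label) * (feats_a[i] - feats_b[i])
    return weights


def simulate_comparisons(true_weights, n, seed):
    """Stand-in for human feedback: prefers the behavior with higher hidden reward."""
    rng = random.Random(seed)
    dim = len(true_weights)
    data = []
    for _ in range(n):
        a = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        b = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        a_is_better = predicted_reward(true_weights, a) > predicted_reward(true_weights, b)
        data.append((a, b, 1.0 if a_is_better else 0.0))
    return data


def agreement(weights, comparisons):
    """Fraction of comparisons where the learned model agrees with the label."""
    hits = 0
    for a, b, label in comparisons:
        predicted = 1.0 if preference_probability(weights, a, b) > 0.5 else 0.0
        hits += predicted == label
    return hits / len(comparisons)


true_weights = [1.0, -2.0]  # hidden from the learner; used only to simulate the human
training = simulate_comparisons(true_weights, n=200, seed=0)
learned = fit_reward_model(training, dim=2)
print("learned weights:", learned)
print("agreement on new comparisons:",
      agreement(learned, simulate_comparisons(true_weights, n=200, seed=1)))
```

The learner never sees a goal function, only which of two behaviors was judged better. A learned reward of this kind could then stand in for a hand-written goal when training an agent, which is the step this sketch leaves out.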
