Learning from human preferences
📰 OpenAI News
OpenAI and DeepMind's safety team developed an algorithm that infers what humans want by being told which of two proposed behaviors is better
Action Steps
- Collaborate with safety teams to identify complex goals that are hard to capture in a hand-written goal function
- Develop algorithms that can infer human preferences from comparisons
- Test and refine the algorithm with human feedback
- Integrate the algorithm into AI systems to improve safety and alignment
Who Needs to Know This
AI researchers and engineers can use this algorithm to build safer AI systems, and product managers can use it to develop better-aligned AI products.
Key Insight
💡 Inferring human preferences from comparisons can help build safer AI systems
Full Article
One step towards building safe AI systems is to remove the need for humans to write goal functions, since using a simple proxy for a complex goal, or getting the complex goal a bit wrong, can lead to undesirable and even dangerous behavior. In collaboration with DeepMind’s safety team, we’ve developed an algorithm which can infer what humans want by being told which of two proposed behaviors is better.
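To make the idea of learning from pairwise comparisons concrete, here is a minimal sketch in Python. It is not the implementation described in the article: it assumes that each behavior can be summarized as a small feature vector, that the reward is linear in those features, and that preferences follow a Bradley-Terry (logistic) model. The function names, the two-feature setup, and the simulated "human" with a hidden goal are all hypothetical and exist only for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def reward(w, features):
    """Linear reward estimate for one behavior (assumed model, not from the article)."""
    return sum(wi * xi for wi, xi in zip(w, features))


def fit_reward_from_comparisons(comparisons, n_features, lr=0.5, epochs=200):
    """Fit reward weights from (preferred, rejected) feature-vector pairs.

    Assumes a Bradley-Terry model:
        P(preferred beats rejected) = sigmoid(reward(preferred) - reward(rejected))
    and maximizes the log-likelihood with plain stochastic gradient ascent.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in comparisons:
            diff = [p - r for p, r in zip(preferred, rejected)]
            prob = sigmoid(reward(w, diff))
            # Gradient of log(prob) with respect to w is (1 - prob) * diff.
            for i in range(n_features):
                w[i] += lr * (1.0 - prob) * diff[i]
    return w


def simulate_comparisons(n, seed=0):
    """Hypothetical stand-in for a human judge.

    The hidden goal rewards feature 0 and penalizes feature 1. The learner
    never sees this goal, only which of two behaviors was judged better.
    """
    rng = random.Random(seed)

    def hidden_goal(f):
        return f[0] - f[1]

    pairs = []
    for _ in range(n):
        a = [rng.random(), rng.random()]
        b = [rng.random(), rng.random()]
        pairs.append((a, b) if hidden_goal(a) >= hidden_goal(b) else (b, a))
    return pairs


if __name__ == "__main__":
    data = simulate_comparisons(200)
    weights = fit_reward_from_comparisons(data, n_features=2)
    print("learned reward weights:", weights)
```

The learner is never given the goal function itself. It recovers a usable reward estimate from comparison labels alone, which is the property the article highlights. A real system would typically replace the linear model with a learned reward predictor and collect the labels from people rather than from a simulated judge.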
DeepCamp AI