Evolved Policy Gradients

📰 OpenAI News

OpenAI introduces Evolved Policy Gradients (EPG), a metalearning approach that evolves the loss functions of learning agents to enable fast training on novel tasks

Advanced | Published 18 Apr 2018

Action Steps

Understand the concept of metalearning and its applications
Explore the Evolved Policy Gradients approach and its potential benefits
Experiment with EPG on novel tasks to evaluate its effectiveness
Integrate EPG into existing learning agent architectures to improve adaptability

Who Needs to Know This

ML researchers and engineers can use EPG to improve the adaptability of their learning agents, while product managers can draw on it to build more robust AI systems.

Key Insight

💡 EPG can enable learning agents to succeed at basic tasks outside their training regime

Full Article

We’re releasing an experimental metalearning approach called Evolved Policy Gradients (EPG), which evolves the loss function of learning agents to enable fast training on novel tasks. At test time, agents trained with EPG can succeed at basic tasks that were outside their training regime, such as learning to navigate to an object placed on a different side of the room than it was during training.
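
To make the idea of evolving a loss function concrete, here is a minimal toy sketch in Python. It is not OpenAI's released implementation. It assumes a one-parameter agent, a hand-picked two-parameter loss, and a simple evolution-strategies-style outer loop. Every name in it (`inner_loop`, `fitness`, `evolve_loss`, `phi`, `theta`) is hypothetical and chosen for illustration only.

```python
import random


def inner_loop(phi, goal, steps=10, lr=0.1):
    """Train a one-parameter agent with the learned loss; return its final reward."""
    theta = 0.0  # the agent's only parameter, which doubles as its action
    for _ in range(steps):
        error = goal - theta  # toy "experience" the agent observes
        # Assumed learned loss: L = -phi[0] * error * theta + phi[1] * theta**2,
        # differentiated w.r.t. theta with `error` treated as a constant.
        grad = -phi[0] * error + 2.0 * phi[1] * theta
        theta -= lr * grad
    return -(goal - theta) ** 2  # true task reward, never shown to the agent


def fitness(phi, goals):
    """Average final reward of freshly trained agents over a batch of tasks."""
    return sum(inner_loop(phi, g) for g in goals) / len(goals)


def evolve_loss(generations=60, population=20, sigma=0.1, alpha=0.1, seed=0):
    """Outer loop: adjust the loss parameters phi with a basic ES update."""
    rng = random.Random(seed)
    phi = [0.0, 0.0]
    for _ in range(generations):
        # Each generation samples new tasks (goal positions).
        goals = [rng.uniform(-1.0, 1.0) for _ in range(8)]
        noises = [[rng.gauss(0.0, 1.0) for _ in phi] for _ in range(population)]
        scores = [
            fitness([p + sigma * n for p, n in zip(phi, eps)], goals)
            for eps in noises
        ]
        baseline = sum(scores) / len(scores)
        # Score-weighted average of the perturbations estimates the gradient.
        for i in range(len(phi)):
            step = sum((s - baseline) * eps[i] for s, eps in zip(scores, noises))
            phi[i] += alpha * step / (population * sigma)
    return phi


if __name__ == "__main__":
    evolved = evolve_loss()
    test_goals = [-1.0, -0.5, 0.5, 1.0]
    print("evolved loss parameters:", evolved)
    print("reward with initial loss:", fitness([0.0, 0.0], test_goals))
    print("reward with evolved loss:", fitness(evolved, test_goals))
```

In this sketch the agent never sees the true reward during its inner training loop. It only follows the gradient of the learned loss, and the outer loop is the only place where the true reward shapes that loss. The real method works with far richer agents, losses, and tasks than this toy setup.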
