Equivalence between policy gradients and soft Q-learning

📰 OpenAI News
Published 21 Apr 2017