Variance reduction for policy gradient with action-dependent factorized baselines

📰 OpenAI News
Published 20 Mar 2018