Behavior-Consistent Deep Reinforcement Learning
📰 ArXiv cs.AI
arXiv:2605.21214v2 (announce type: cross)
Abstract: Reinforcement learning (RL) often exhibits high variance across training runs, leading to unreliable performance and posing a major challenge to deployment in real-world domains. In this work, we address the challenge of cross-run policy divergence by formalizing the problem of behavior-consistent RL, where the objective is to obtain policies that are both high-performing and distributionally similar across training runs. Our key observation is t