Bayesian Inverse Transition Learning: Learning Dynamics From Near-Optimal Trajectories
📰 ArXiv cs.AI
arXiv:2411.05174v2. Abstract: We consider the problem of estimating the transition dynamics $T^*$ from near-optimal expert trajectories in the context of offline model-based reinforcement learning. We develop a novel constraint-based method, Inverse Transition Learning, that treats the limited coverage of the expert trajectories as a \emph{feature}: we use the fact that the expert is near-optimal to inform our estimate of $T^*$. We integrate our constraints into a Bay
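The abstract's central idea, that an expert's near-optimality restricts which transition models are plausible, can be illustrated with a small sketch. This is not the paper's formulation, which the excerpt does not give. It assumes a tabular MDP with a known reward, and the names `q_values`, `consistent_with_expert`, and the tolerance `eps` are hypothetical. A candidate model `T` is accepted only if every observed expert action is within `eps` of the best action under that model.

```python
import numpy as np

def q_values(T, R, gamma=0.9, iters=500):
    """Optimal Q-values by value iteration. T has shape (S, A, S), R has shape (S, A)."""
    Q = np.zeros(R.shape)
    for _ in range(iters):
        V = Q.max(axis=1)          # greedy state values, shape (S,)
        Q = R + gamma * (T @ V)    # Bellman optimality backup, shape (S, A)
    return Q

def consistent_with_expert(T, R, expert_actions, eps=0.05, gamma=0.9):
    """Assumed constraint: each expert action is eps-optimal under candidate T."""
    Q = q_values(T, R, gamma)
    V = Q.max(axis=1)
    return all(Q[s, a] >= V[s] - eps for s, a in expert_actions.items())

# Toy MDP (illustrative only): state 1 is absorbing and pays reward 1 per step.
R = np.array([[0.0, 0.0],
              [1.0, 1.0]])

# Candidate A: in state 0, action 0 usually reaches the rewarding state.
T_a = np.array([[[0.1, 0.9], [0.9, 0.1]],
                [[0.0, 1.0], [0.0, 1.0]]])

# Candidate B: the two actions in state 0 have swapped effects.
T_b = np.array([[[0.9, 0.1], [0.1, 0.9]],
                [[0.0, 1.0], [0.0, 1.0]]])

# The expert was observed taking action 0 in state 0.
expert_actions = {0: 0}

ok_a = consistent_with_expert(T_a, R, expert_actions)
ok_b = consistent_with_expert(T_b, R, expert_actions)
```

Under these assumptions candidate A passes and candidate B is rejected, even though the expert visited only one state-action pair: a single near-optimal choice already rules out models under which that choice would be clearly suboptimal.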