Hierarchical Behaviour Spaces
📰 ArXiv cs.AI
arXiv:2604.24558v1 Abstract: Recent work in hierarchical reinforcement learning has shown success in scaling to billions of timesteps when learning over a set of predefined option reward functions. We show that, instead of using a single reward function per option, the reward functions can be effectively used to induce a space of behaviours, by letting the controller specify linear combinations over reward functions, allowing a more expressive set of policies to be represented.
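The idea of a controller specifying linear combinations over option reward functions can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `combined_reward`, the three reward values, and the weight vectors are hypothetical and chosen only to show how a weight vector generalises picking a single option.

```python
def combined_reward(weights, option_rewards):
    """Return the weighted sum of per-option rewards for one timestep.

    Hypothetical helper: `weights` stands in for what the high-level
    controller would emit, `option_rewards` for the values of the
    predefined option reward functions at the current transition.
    """
    if len(weights) != len(option_rewards):
        raise ValueError("weights and option_rewards must have the same length")
    return sum(w * r for w, r in zip(weights, option_rewards))


# Assumed values of three predefined option rewards at one timestep.
option_rewards = [1.0, 0.0, -0.5]

# A one-hot weight vector recovers the usual "one reward per option" case.
single = combined_reward([0.0, 1.0, 0.0], option_rewards)

# Any other weight vector selects a behaviour between or beyond the options.
mixed = combined_reward([0.5, 0.25, -1.0], option_rewards)

print(single, mixed)
```

Under these assumptions, the one-hot weights select exactly one predefined reward, while general weights span a continuous space of reward signals for the low-level policy to optimise.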