AgentHER: Hindsight Experience Replay for LLM Agent Trajectory Relabeling
📰 ArXiv cs.AI
arXiv:2603.21357v2 Announce Type: replace Abstract: LLM agents fail on the majority of real-world tasks -- GPT-4o succeeds on fewer than 15% of WebArena navigation tasks and below 55% pass@1 on ToolBench (Zhou et al., 2024; Qin et al., 2024) -- yet every failed trajectory is routinely discarded, wasting the dominant source of collected experience. We introduce AgentHER, a framework that recovers this lost training signal by adapting the Hindsight Experience Replay (HER; Andrychowicz et al., 2017
DeepCamp AI