Where Did It Go Wrong? Capability-Oriented Failure Attribution for Vision-and-Language Navigation Agents
📰 ArXiv cs.AI
arXiv:2604.25161v1 Announce Type: cross
Abstract: Embodied agents in safety-critical applications such as Vision-Language Navigation (VLN) rely on multiple interdependent capabilities (e.g., perception, memory, planning, decision), making failures difficult to localize and attribute. Existing testing methods are largely system-level and provide limited insight into which capability deficiencies cause task failures. We propose a capability-oriented testing approach that enables failure detection