TRAJEVAL: Decomposing Code Agent Trajectories for Fine-Grained Diagnosis

📰 ArXiv cs.AI

TRAJEVAL is a diagnostic framework for decomposing code agent trajectories into interpretable stages for fine-grained diagnosis

advanced Published 27 Mar 2026

Action Steps

Decompose agent trajectories into search, planning, and execution stages
Analyze each stage for errors or deviations from expected behavior
Use TRAJEVAL to identify specific points of failure and improve agent performance

Who Needs to Know This

AI engineers and researchers on a team benefit from TRAJEVAL as it provides visibility into code agent failures, while product managers and software engineers can use it to improve agent performance and reliability

Key Insight

💡 Decomposing agent trajectories into interpretable stages enables fine-grained diagnosis and improvement of code agent performance