Build Agents That Run for Hours (Without Losing the Plot) — Ash Prabaker & Andrew Wilson, Anthropic
Why self-evaluation is a trap and adversarial evaluator agents work better; why context compaction doesn't cure coherence drift but structured handoffs do; how to decompose work into testable sprint contracts; how to grade subjective output with rubrics an LLM can actually apply; and how to read traces as your primary debugging loop. Plus the question nobody asks: which parts of your harness should you delete when the next model drops?
Speaker info:
- Ash Prabaker | https://www.linkedin.com/in/ash-prabaker/
- Andrew Wilson | https://www.linkedin.com/in/anddwilson/