Build Agents That Run for Hours (Without Losing the Plot) — Ash Prabaker & Andrew Wilson, Anthropic

AI Engineer · Intermediate · 🤖 AI Agents & Automation · 6h ago

Skills: Agent Foundations 90% · Tool Use & Function Calling 70%

Why self-evaluation is a trap and adversarial evaluator agents work better; why context compaction doesn't cure coherence drift but structured handoffs do; how to decompose work into testable sprint contracts; how to grade subjective output with rubrics an LLM can actually apply; and how to read traces as your primary debugging loop. Plus the question nobody asks: which parts of your harness should you delete when the next model drops?

Speaker info:
- Ash Prabaker | https://www.linkedin.com/in/ash-prabaker/
- Andrew Wilson | https://www.linkedin.com/in/anddwilson/
