Patronus AI with Anand Kannappan - Weaviate Podcast #122!

Weaviate vector database · Advanced · 🤖 AI Agents & Automation · 1y ago
AI agents are getting more complex and harder to debug. How do you know what's happening when your agent makes 20+ function calls? What if you have a multi-agent system orchestrating several agents? Anand Kannappan, co-founder of Patronus AI, explains how their tool Percival transforms agent debugging and evaluation: it instantly analyzes complex agent traces, pinpoints failures across 60 different failure modes, and automatically suggests prompt fixes to improve performance.

Anand unpacks several of these common failure modes, including the critical challenge of "context explosion," where agents process millions of tokens, domain adaptation for specific use cases, and the complexity of multi-agent orchestration. The paradigm of AI evals is shifting from static evaluation to dynamic oversight! You'll also learn how Percival's memory architecture leverages both episodic and semantic knowledge with Weaviate.

The conversation explores powerful concepts like process vs. outcome rewards and LLM-as-judge approaches, and Anand shares his vision for "agentic supervision," where equally capable AI systems provide oversight for complex agent workflows. Whether you're building AI agents, evaluating LLM systems, or interested in how debugging autonomous systems will evolve, this episode delivers concrete techniques, philosophical insights on evaluation, and a roadmap for how evaluation must transform to keep pace with increasingly autonomous AI systems.

Links:
Percival Launch: https://www.patronus.ai/percival
Docs: https://docs.patronus.ai/docs/percival/
Paper: https://arxiv.org/abs/2505.08638
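To make the LLM-as-judge idea discussed in the episode concrete, here is a minimal sketch of the pattern: a second model grades an agent's answer against a rubric and returns a verdict. All names here are hypothetical, and the judge call is stubbed with a fake model; a real system would call an actual LLM API (and Percival's own approach is more sophisticated than this).

```python
# Minimal LLM-as-judge sketch (hypothetical names; judge model is stubbed).

def judge_response(question: str, answer: str, rubric: str, call_llm) -> dict:
    """Ask a judge model to grade an answer; expects a 'PASS'/'FAIL' reply."""
    prompt = (
        f"Rubric: {rubric}\n"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reply with PASS or FAIL, a colon, and a one-line reason."
    )
    raw = call_llm(prompt)
    verdict, _, reason = raw.partition(":")
    return {"pass": verdict.strip().upper() == "PASS", "reason": reason.strip()}

# Stub standing in for a real judge-model call.
def fake_llm(prompt: str) -> str:
    if "context" in prompt:
        return "PASS: answer is grounded in the retrieved context"
    return "FAIL: answer makes an unsupported claim"

result = judge_response(
    question="What backs the agent's memory?",
    answer="A vector database holding episodic traces",
    rubric="Answer must be grounded in retrieved context",
    call_llm=fake_llm,
)
print(result["pass"])  # True
```

The key design point the episode touches on is that the judge needs to be roughly as capable as the agent it oversees ("agentic supervision"); a weak judge rubber-stamps failures it cannot recognize.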
Watch on YouTube ↗


Chapters (11)

0:00 Welcome Anand!
1:15 Percival!
17:20 Online and Offline Agent Tracing
20:40 Complex Agent Traces
23:05 Quick Insights and Deep Research
24:47 Automated Agent Tuning
31:19 LLM-as-Judge and Scalable Oversight
42:24 Agent Inbox for Evals
45:49 Causal Inference and AI
51:24 Percival and Weaviate
56:04 Exciting Directions for AI