How We Built LangSmith Engine | Interrupt 26
Key Takeaways
LangSmith Engine uses AI agents to automate the manual cycle of reading traces, writing evals, and fixing agent issues.
Original Description
Until now, improving your agent has been a manual process of reading traces, looking for patterns, writing evals, and creating fixes. Now LangSmith Engine can run that cycle for you. It watches your production traces, clusters failures into named issues, diagnoses root causes against your code, and proposes fixes and eval coverage to keep regressions from coming back. You just review and merge improvements.
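As a rough illustration of the "cluster failures into named issues" step described above, here is a minimal sketch. It is not LangSmith Engine's implementation or the LangSmith API: the `Trace` and `Issue` types, the `signature` heuristic, and the sample error strings are all hypothetical, and a real system would likely use something richer than string prefixes to group failures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Trace:
    """Hypothetical, simplified trace record (not the LangSmith schema)."""
    trace_id: str
    error: Optional[str]  # None means the run succeeded


@dataclass
class Issue:
    """A named group of failing traces (hypothetical)."""
    name: str
    trace_ids: List[str] = field(default_factory=list)


def signature(error: str) -> str:
    """Crude grouping key: the text before the first colon, lowercased."""
    return error.split(":", 1)[0].strip().lower()


def cluster_failures(traces: List[Trace]) -> List[Issue]:
    """Group failed traces by signature and rank issues by frequency."""
    groups: Dict[str, Issue] = {}
    for t in traces:
        if t.error is None:
            continue  # successful runs are not part of any issue
        key = signature(t.error)
        groups.setdefault(key, Issue(name=key)).trace_ids.append(t.trace_id)
    # Most frequent issues first; ties broken by name for stable ordering.
    return sorted(groups.values(), key=lambda i: (-len(i.trace_ids), i.name))


if __name__ == "__main__":
    sample = [
        Trace("t1", "ToolTimeout: search took 30s"),
        Trace("t2", None),
        Trace("t3", "ToolTimeout: crm lookup took 45s"),
        Trace("t4", "SchemaError: missing field 'email'"),
    ]
    for issue in cluster_failures(sample):
        print(issue.name, len(issue.trace_ids), issue.trace_ids)
```

Sorting by count gives a simple stand-in for a prioritized issue list; diagnosing root causes and proposing fixes would be separate steps not shown here.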
At LangChain's agent conference Interrupt, Ben Tannyhill and Vivek Trivedy introduced LangSmith Engine and what it unlocks for teams running agents at scale.
00:00 Introduction and context
00:33 LangChain as the Agent Engineering Platform
00:50 Our go-to-market agent and the problems we hit
01:47 Why the current process is broken (customer pain)
02:48 What we set out to build
02:45 LangSmith Engine demo: the prioritized issue inbox
03:14 Engine proposes fixes and opens PRs
03:32 Custom online evaluators
03:46 Dataset examples for offline evals
04:28 Architecture overview: how Engine works end-to-end
05:18 Early customers: Clay, Vanta, Campfire
05:23 The first version: a wind-up toy
06:54 The false positive problem ("Show me the man")
07:53 Architecture deep dive: orchestration and sandboxes
09:49 Why traces are the most valuable input
10:47 Connecting source code for PR generation
11:10 Types of fixes Engine generates
12:02 Learning from customers: the preference problem
12:56 The agent overview: Engine's memory file
13:40 Passing to Viv: evaluating Engine itself
14:04 Why evals are the only answer
14:31 How we bootstrapped evals (dogfooding + synthetic data)
15:24 Building a diverse and rounded eval suite
16:14 How evals inform model selection and prompt decisions
17:41 Beyond evals: trusting user feedback
18:24 The self-improving loop: Engine improving Engine
19:04 Key learnings and closing summary
20:36 Thank you
Extra resources:
• Everything we shipped at Interrupt: https://www.langchain.com/blog/interrupt-2026-