How We Built LangSmith Engine | Interrupt 26

LangChain · Beginner · 🤖 AI Agents & Automation · 1mo ago

Skills: Agent Foundations 80% · Autonomous Workflows 60%

Key Takeaways

LangSmith Engine uses AI agents to automate the manual cycle of reading traces, writing evals, and fixing issues

Original Description

Until now, improving your agent has been a manual process of reading traces, looking for patterns, writing evals, and creating fixes. Now LangSmith Engine can run that cycle for you. It watches your production traces, clusters failures into named issues, diagnoses root causes against your code, and proposes fixes and eval coverage to keep regressions from coming back. You just review and merge improvements.

At LangChain's agent conference Interrupt, Ben Tannyhill and Vivek Trivedy introduced LangSmith Engine and what it unlocks for teams running agents at scale.

Chapters

00:00 Introduction and context
00:33 LangChain as the Agent Engineering Platform
00:50 Our go-to-market agent and the problems we hit
01:47 Why the current process is broken (customer pain)
02:48 What we set out to build
02:45 LangSmith Engine demo: the prioritized issue inbox
03:14 Engine proposes fixes and opens PRs
03:32 Custom online evaluators
03:46 Dataset examples for offline evals
04:28 Architecture overview: how Engine works end-to-end
05:18 Early customers: Clay, Vanta, Campfire
05:23 The first version: a wind-up toy
06:54 The false positive problem ("Show me the man")
07:53 Architecture deep dive: orchestration and sandboxes
09:49 Why traces are the most valuable input
10:47 Connecting source code for PR generation
11:10 Types of fixes Engine generates
12:02 Learning from customers: the preference problem
12:56 The agent overview: Engine's memory file
13:40 Passing to Viv: evaluating Engine itself
14:04 Why evals are the only answer
14:31 How we bootstrapped evals (dogfooding + synthetic data)
15:24 Building a diverse and rounded eval suite
16:14 How evals inform model selection and prompt decisions
17:41 Beyond evals: trusting user feedback
18:24 The self-improving loop: Engine improving Engine
19:04 Key learnings and closing summary
20:36 Thank you

Extra resources:
• Everything we shipped at Interrupt: https://www.langchain.com/blog/interrupt-2026-
