Why We Built LangSmith for Improving Agent Quality
Harrison Chase (CEO of LangChain) sits down with Bagatur (LangSmith Engineer) and Tanushree (Product Manager) for a technical roundtable on taking agents from prototype to production with rigor. They discuss the evolution of the LangSmith platform, dive into the new Insights Agent feature for automatically discovering patterns in production traces, and explore Multi-turn Evaluations for understanding end-to-end user interactions.
00:00 - Introductions + the evolution of LangSmith
02:39 - Introducing Insights Agent
03:49 - Real-world use cases for Insights Agent
04:44 - Customizing insights for your specific use case
05:22 - The algorithm behind Insights Agent
06:30 - The hardest part of getting Insights to work
07:13 - Tips for getting started with Insights
08:47 - Evals vs Insights - what's the difference?
09:36 - What are Threads and why do they matter?
11:59 - Offline vs online evals
12:46 - Multi-turn evals for measuring agent performance in production
13:19 - Thread-level metrics and workflows
14:22 - The hot take: "Are evals dead?"
16:08 - The future of testing
17:08 - Closing thoughts
Read more about our latest LangSmith updates: https://bit.ly/3WrUNDZ
Learn more about Insights Agent: https://docs.langchain.com/langsmith/insights
Learn more about Multi-turn Evals: https://docs.langchain.com/langsmith/online-evaluations#configure-multi-turn-online-evaluators