LangSmith Highlights: Human Annotation

LangChain · Intermediate ·🤖 AI Agents & Automation ·2y ago

Key Takeaways

The video demonstrates how to add human feedback to annotate runs in LangSmith, including tagging a run with feedback and checking out the annotation queue.

Full Transcript

One of the things that we help you do in LangSmith is add human feedback to annotate your runs. We just showed how you can use automatic evaluation to have LLMs grade your runs, or to programmatically auto-evaluate each of your runs, but there's really no substitute for a human adding annotation feedback on runs as well. You might do this for a couple of reasons. Maybe there's some kind of measure that's hard to evaluate automatically, or maybe you've used auto-evals on thousands of runs and you want to have a human pick through a small subset of those runs to make sure that your LLM grader is still doing a good job. I'll show how to do that in this video.

This is a test run. We have some feedback already recorded on each of these runs: things like correctness, helpfulness, and sensitivity, as well as embedding cosine distance. What we're going to do is pick all of the runs that had a correct score, grab them all, and send them to an annotation queue. We're going to add them to this second human review annotation queue, and now all of these runs will be queued up so that we can easily go through and add our own feedback.

We can see here all of the tags this run already has, but maybe we want to have a different kind of feedback, like creativity, which is harder for LLMs to grade, and it can have a score of one to five. I'm just making this up, but this one gets a creativity score of two. Maybe you have a rubric that a human evaluator wants to follow. This one is done. I can now add a creativity score to this run as well; here we'll give it a score of five. This one is done.

You can see how I can quickly add tags to each of these runs and add some additional feedback manually to the ones that are in my queue. So if I'm supporting a flow of making sure that each of the runs has a good response, I can do that pretty seamlessly within the annotation queue. And we're all caught up, meaning I have no more to review. This is really helpful if you're in a support role or you're helping curate datasets, to make sure that you have the appropriate tags and feedback on each of your runs.
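
The spot-check idea from the transcript (have a human re-grade a small subset of auto-evaluated runs to confirm the LLM grader is still doing a good job) can be sketched in plain Python. This is only an illustration under stated assumptions: the `Run` class, `sample_for_review`, and `agreement_rate` are hypothetical names invented here and are not part of the LangSmith SDK, and the binary correct/incorrect score is an assumed simplification.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    """Hypothetical stand-in for a traced run; not a LangSmith SDK type."""
    run_id: str
    llm_score: int                     # automatic grade: 1 = correct, 0 = incorrect
    human_score: Optional[int] = None  # filled in later by a human reviewer


def sample_for_review(runs: list[Run], k: int, seed: int = 0) -> list[Run]:
    """Pick a reproducible random subset of runs for a human to re-grade."""
    rng = random.Random(seed)
    return rng.sample(runs, min(k, len(runs)))


def agreement_rate(reviewed: list[Run]) -> float:
    """Fraction of human-graded runs where the human matches the LLM grader."""
    graded = [r for r in reviewed if r.human_score is not None]
    if not graded:
        raise ValueError("no human-reviewed runs to compare")
    matches = sum(1 for r in graded if r.human_score == r.llm_score)
    return matches / len(graded)
```

A low agreement rate on the sampled subset would suggest that the automatic grader needs another look before its scores are trusted across all of the runs.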

Original Description

See how to:
- Tag a run with feedback
- Check out your annotation queue

Log in or sign up for LangSmith (BETA): https://smith.langchain.com/

This video shows how to use LangSmith to add human feedback to runs: select runs that already carry automatic feedback, send them to an annotation queue, and score them by hand against criteria such as creativity. The same workflow lets a reviewer spot-check an LLM grader and curate datasets with appropriate tags and feedback.

Key Takeaways
  1. Log in to LangSmith
  2. Select a test run
  3. Pick the runs that were scored correct
  4. Send runs to annotation queue
  5. Add human feedback and tags to each run
  6. Review and score runs based on creativity or other custom criteria
💡 Human feedback complements automatic evaluation, especially for qualities that are difficult to grade automatically, such as creativity.
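
The queue steps above can be modeled with a small, self-contained Python sketch. It is a toy model of the workflow shown in the LangSmith UI, not the LangSmith SDK: `Run`, `AnnotationQueue`, `select_correct`, the queue name, and the run IDs are all hypothetical, and treating a correctness score of 1 as "correct" is an assumption. The creativity scores of 2 and 5 and the 1 to 5 range come from the video.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Run:
    """Hypothetical stand-in for a traced run; not a LangSmith SDK type."""
    run_id: str
    feedback: dict = field(default_factory=dict)  # feedback key -> score


def select_correct(runs: list[Run], key: str = "correctness") -> list[Run]:
    """Keep only runs whose automatic feedback marked them correct (assumed: 1)."""
    return [r for r in runs if r.feedback.get(key) == 1]


class AnnotationQueue:
    """Toy first-in, first-out review queue modeled on the workflow in the video."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._pending = deque()

    def add_runs(self, runs: list[Run]) -> None:
        self._pending.extend(runs)

    def next_run(self) -> Optional[Run]:
        """Return the next run to review, or None when the reviewer is caught up."""
        return self._pending[0] if self._pending else None

    def mark_done(self, key: str, score: int, low: int = 1, high: int = 5) -> Run:
        """Record a human score on the current run, then remove it from the queue."""
        if not self._pending:
            raise IndexError("queue is empty")
        if not low <= score <= high:
            raise ValueError(f"{key} score must be between {low} and {high}")
        run = self._pending.popleft()
        run.feedback[key] = score
        return run


runs = [
    Run("run-a", {"correctness": 1}),
    Run("run-b", {"correctness": 0}),
    Run("run-c", {"correctness": 1}),
]
queue = AnnotationQueue("human-review")   # hypothetical queue name
queue.add_runs(select_correct(runs))      # steps 3 and 4
queue.mark_done("creativity", 2)          # steps 5 and 6, first run
queue.mark_done("creativity", 5)          # second run
print(queue.next_run())                   # None: all caught up
```

Validating the score before removing the run means a rejected score leaves the run at the front of the queue, so the reviewer can correct the value and try again.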
