LangGraph Reflection

LangChain · Advanced ·🤖 AI Agents & Automation ·1y ago

Key Takeaways

The video demonstrates LangGraph Reflection, a prebuilt graph that uses a reflection-style architecture to improve an initial agent's output, utilizing tools like LangGraph Reflection, Lang Chain, and Open Evals. It showcases the creation of a main assistant graph, a reflection agent, and the use of an LLM as a judge to evaluate output.

Full Transcript

agents can often mess up they can hallucinate a response in a rag workflow they can not be up Tod dat with the version of code that you are using or they can just plan and do things incorrectly a common technique to overcome these issues is to use some type of reflection so what this means is run some sort of evaluation process on the agent response live while it's doing its job and and and based on that if it fails that eval pass it back to the agent and tell it to correct itself or if it succeeds then you're okay to finish today I'm excited to introduce Lang graph reflection an attempt at providing some higher level scaffolding for getting started with that at its core it's a pretty simple architecture to be honest so there's this core graph this is the main agent that comes in and then there's this reflection agent and so basically what happens is the user input will come in it will go to the main agent from there it will go to this reflection agent and then this reflection agent will either send it back to the graph or it will go to the end there are only two assumptions that we make for this pre-built architecture so one the main agent should take as input a list of messages it can take other things but it should at least take as input a list of messages and then two the reflection agent should return a user message if there are any critiques otherwise it shouldn't return any messages and the reason that this is important is because this conditional Edge will look for the presence of a user message when deciding whether to go back to the main agent or to return to make this more concrete we added two examples one is a coding example where it uses a lint Checker to check whether the code seems correct that it generated the other one uses an llm as a judge to just judge the output let's take a closer look at both of these starting with the llm as a judge first let's take a look at the code we can see that we install three modules L graph reflection which is this higher level architecture Lang chain which we will use for managing the entry point to llms and then open evals which is a package we created to do LM as a judge evaluations first we create the main assistant graph so this is just going to be a really simple single call to an llm so we're you calling into Claude 3.7 Sonet and you know this agent isn't really an agent it's just a call to an LM but this is just for demo purposes and it also shows that you can use this architecture not just for super complicated agents but even for kind of like simple just single invocations to an llm next let's take a look at the critique that we do so here we have this prompt that we're going to ask the llm as a judge to grade the initial response for it's going to grade it on accuracy completeness Clarity helpfulness and safety so this is pretty generic and if you're applying this to your use case you you'll probably want to customize this prompt and make it a little bit more specific and that's totally possible next we're going to create this function that does this llm as a judge bit so here we can see that we use create llm as a judge this is a helper from open eval we're going to use 03 mini as our judge and then the feedback key that we're going to want is just pass we're then going to pass in the outputs as the contents of the last message that we get because remember this is running after the initial agent so any response from the initial agent will be the last message that we get and we're not really going to look at the inputs because we're just judging the outputs here then based on this if the result of score is true um so this is the score is going to be a Boolean if it's true that means it's approved by the judge and we're just going to finish otherwise it's false and we're going to use the comment field to pass back in feedback and so score and common are standard fields that we get back from this llm as a judge evaluator we're going to build this graph and then we're really simple just going to use create reflection graph pass in the first graph pass in the judge graph get back the sap awesome and then we can see here that we have an example query and so let's see what happened I'm going to run this graph right here by just running the this file I can see that I get some nice print statements it'll take a little bit because I asked it for a pretty complex report I can see that I log something out to the terminal and then it finishes so it passed on the first try let's take a look at the lsmith trace to see exactly what's going on under so here we can see that there was two main sub agents here the graph which is the first one and then the reflection agent so clicking into this we can see our first call to an llm here we have our user input here and then we get this nice long AI output now here we have another call to an llm and this is for judging and so here we can see our system prompt that we wrote here and then we can see that we passed in the response from the first graph into here we can see the output is saying that it should pass and then there's this reasoning bit as well and so because the score is true it's not going any further let's take a look at another example this time with code so here we're going to install the same packages but we're also going to install pyite which is a python linter we can then see that we have this helper function analyze with pyite so this will use pyate for static type checking and errors other than that it's pretty similar so we can see that we create a single call to a model this will be our base agent and then we create this reflection bit which will attempt to grade the generated code so the first thing it does is it actually extracts the generated code from the response so because the response is just an AI message we actually need to pluck out the parts that have to do with coding so we're first going to use an llm to do that and so that's what this extraction bit does here then if we did have a call that that extracted the python code we're going to analyze it with pyite we're going to print stuff out just so we can see what's happening and then if there's any errors we're going to pass back in a message to the user I ran power and found this with the explanation that we get from the General Diagnostic section we de tell it to try to fix it and try to regenerate the entire code snippet this is because we're doing extraction on that message in the final in the step before this so we want it to generate the whole thing and then we also say that if you're not sure you can also just like ask a question um and this is because sometimes in code the model might not actually know the most upto-date syntax and so rather than trying to generate something incorrect and keep on trying it should Instead try to ask the user for help let's try this one out now great so here we can see some initial errors that we get so one has to do with an import statement Lang chain coma could not be resolved so I don't actually have this package installed in my environment next I can see that I this next I can see that there's some parameter error no parameter named initial State I can then see another issue so cannot access attribute run for class State graph so there's a few issues here some of which could probably be fixed by installing the Right Packages other of which appear to be a little bit wrong still hopefully this helps motivate this General reflection architecture and provides a few examples for how you can use it I want to call out that it's really important to get a good reflection agent here as well and that's actually something we'll be working on over the next few weeks is to provide some off-the-shelf reflection agents for rag for code and for general purpose things but you should also always know that you can customize this reflection agent to your particular domain and application you can try out L graph Reflection by doing pip install L graph reflection thanks for watching

Original Description

This prebuilt graph is an agent that uses a reflection-style architecture to check and improve an initial agent's output. This reflection agent uses two subagents: - A "main" agent, which is the agent attempting to solve the users task - A "critique" agent, which checks the main agents work and offers any critiques GitHub: https://github.com/langchain-ai/langgraph-reflection
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from LangChain · LangChain · 0 of 60

← Previous Next →
1 Chat With Your Documents Using LangChain + JavaScript
Chat With Your Documents Using LangChain + JavaScript
LangChain
2 LangChain SQL Webinar
LangChain SQL Webinar
LangChain
3 LangChain "OpenAI functions" Webinar
LangChain "OpenAI functions" Webinar
LangChain
4 LangSmith Launch
LangSmith Launch
LangChain
5 LangChain x Pinecone: Supercharging Llama-2 with RAG
LangChain x Pinecone: Supercharging Llama-2 with RAG
LangChain
6 LangChain Expression Language
LangChain Expression Language
LangChain
7 Building LLM applications with LangChain with Lance
Building LLM applications with LangChain with Lance
LangChain
8 Benchmarking Question/Answering Over CSV Data
Benchmarking Question/Answering Over CSV Data
LangChain
9 LangChain "RAG Evaluation" Webinar
LangChain "RAG Evaluation" Webinar
LangChain
10 Fine-tuning in Your Voice Webinar
Fine-tuning in Your Voice Webinar
LangChain
11 Tabular Data Retrieval
Tabular Data Retrieval
LangChain
12 Building an LLM Application with Audio by AssemblyAI
Building an LLM Application with Audio by AssemblyAI
LangChain
13 Superagent Deepdive Webinar
Superagent Deepdive Webinar
LangChain
14 Lessons from Deploying LLMs with LangSmith
Lessons from Deploying LLMs with LangSmith
LangChain
15 Shortwave Assistant Deepdive Webinar
Shortwave Assistant Deepdive Webinar
LangChain
16 Cognitive Architectures for Language Agents
Cognitive Architectures for Language Agents
LangChain
17 Effectively Building with LLMs in the Browser with Jacob
Effectively Building with LLMs in the Browser with Jacob
LangChain
18 Data Privacy for LLMs
Data Privacy for LLMs
LangChain
19 "Theory of Mind" Webinar with Plastic Labs
"Theory of Mind" Webinar with Plastic Labs
LangChain
20 LangChain Templates
LangChain Templates
LangChain
21 Using Natural Language to Query Postgres with Jacob
Using Natural Language to Query Postgres with Jacob
LangChain
22 Building a Research Assistant from Scratch
Building a Research Assistant from Scratch
LangChain
23 Benchmarking RAG over LangChain Docs
Benchmarking RAG over LangChain Docs
LangChain
24 Skeleton-of-Thought: Building a New Template from Scratch
Skeleton-of-Thought: Building a New Template from Scratch
LangChain
25 Benchmarking Methods for Semi-Structured RAG
Benchmarking Methods for Semi-Structured RAG
LangChain
26 LangSmith Highlights: Getting Started
LangSmith Highlights: Getting Started
LangChain
27 LangSmith Highlights: Debugging
LangSmith Highlights: Debugging
LangChain
28 LangSmith Highlights: Datasets
LangSmith Highlights: Datasets
LangChain
29 LangSmith Highlights: Evaluation
LangSmith Highlights: Evaluation
LangChain
30 LangSmith Highlights: Human Annotation
LangSmith Highlights: Human Annotation
LangChain
31 LangSmith Highlights: Monitoring
LangSmith Highlights: Monitoring
LangChain
32 LangSmith Highlights: Hub
LangSmith Highlights: Hub
LangChain
33 SQL Research Assistant
SQL Research Assistant
LangChain
34 Getting Started with Multi-Modal LLMs
Getting Started with Multi-Modal LLMs
LangChain
35 Build a Full Stack RAG App With TypeScript
Build a Full Stack RAG App With TypeScript
LangChain
36 Auto-Prompt Builder (with Hosted LangServe)
Auto-Prompt Builder (with Hosted LangServe)
LangChain
37 LangChain v0.1.0 Launch: Introduction
LangChain v0.1.0 Launch: Introduction
LangChain
38 LangChain v0.1.0 Launch: Observability
LangChain v0.1.0 Launch: Observability
LangChain
39 LangChain v0.1.0 Launch: Integrations
LangChain v0.1.0 Launch: Integrations
LangChain
40 LangChain v0.1.0 Launch: Composability
LangChain v0.1.0 Launch: Composability
LangChain
41 LangChain v0.1.0 Launch: Streaming
LangChain v0.1.0 Launch: Streaming
LangChain
42 LangChain v0.1.0 Launch: Output Parsing
LangChain v0.1.0 Launch: Output Parsing
LangChain
43 LangChain v0.1.0 Launch: Retrieval
LangChain v0.1.0 Launch: Retrieval
LangChain
44 LangChain v0.1.0 Launch: Agents
LangChain v0.1.0 Launch: Agents
LangChain
45 Build and Deploy a RAG app with Pinecone Serverless
Build and Deploy a RAG app with Pinecone Serverless
LangChain
46 Hosted LangServe + LangChain Templates
Hosted LangServe + LangChain Templates
LangChain
47 LangGraph: Intro
LangGraph: Intro
LangChain
48 LangGraph: Agent Executor
LangGraph: Agent Executor
LangChain
49 LangGraph: Chat Agent Executor
LangGraph: Chat Agent Executor
LangChain
50 LangGraph: Human-in-the-Loop
LangGraph: Human-in-the-Loop
LangChain
51 LangGraph: Dynamically Returning a Tool Output Directly
LangGraph: Dynamically Returning a Tool Output Directly
LangChain
52 LangGraph: Respond in a Specific Format
LangGraph: Respond in a Specific Format
LangChain
53 LangGraph: Managing Agent Steps
LangGraph: Managing Agent Steps
LangChain
54 LangGraph: Force-Calling a Tool
LangGraph: Force-Calling a Tool
LangChain
55 LangGraph: Multi-Agent Workflows
LangGraph: Multi-Agent Workflows
LangChain
56 Streaming Events: Introducing a new `stream_events` method
Streaming Events: Introducing a new `stream_events` method
LangChain
57 Building a web RAG chatbot: using LangChain, Exa (prev. Metaphor), LangSmith, and Hosted Langserve
Building a web RAG chatbot: using LangChain, Exa (prev. Metaphor), LangSmith, and Hosted Langserve
LangChain
58 OpenGPTs
OpenGPTs
LangChain
59 Open Source RAG with Nomic's New Embedding Model (and ChromaDB and Ollama)
Open Source RAG with Nomic's New Embedding Model (and ChromaDB and Ollama)
LangChain
60 LangGraph: Persistence
LangGraph: Persistence
LangChain

This video teaches how to create a reflection agent using LangGraph Reflection, which improves an initial agent's output by evaluating it with an LLM as a judge. The reflection agent can be customized for specific domains and applications. By following the steps, viewers can design and implement their own reflection agents.

Key Takeaways
  1. Create the main assistant graph using an LLM
  2. Create a function to use the LLM as a judge to evaluate the output
  3. Pass the output to the LLM as a judge and get feedback
  4. Use the feedback to decide whether to send it back to the main agent or return to the user
  5. Run the graph by running the file
  6. Extract generated code from AI response using LLM
  7. Analyze code with pyite for static type checking and errors
  8. Pass back in feedback to user with error messages
💡 The reflection agent is crucial for good performance, and off-the-shelf reflection agents will be provided for RAG, code, and general purpose, but can also be customized for specific domains and applications.

Related AI Lessons

Up next
Building Great Agent Skills: The Missing Manual
AI Engineer
Watch →