LangGraph Reflection

LangChain · Advanced ·🤖 AI Agents & Automation ·1y ago

Skills: Agent Foundations90%Tool Use & Function Calling80%RAG Evaluation70%

Key Takeaways

The video demonstrates LangGraph Reflection, a prebuilt graph that uses a reflection-style architecture to improve an initial agent's output, utilizing tools like LangGraph Reflection, Lang Chain, and Open Evals. It showcases the creation of a main assistant graph, a reflection agent, and the use of an LLM as a judge to evaluate output.

Full Transcript

agents can often mess up they can hallucinate a response in a rag workflow they can not be up Tod dat with the version of code that you are using or they can just plan and do things incorrectly a common technique to overcome these issues is to use some type of reflection so what this means is run some sort of evaluation process on the agent response live while it's doing its job and and and based on that if it fails that eval pass it back to the agent and tell it to correct itself or if it succeeds then you're okay to finish today I'm excited to introduce Lang graph reflection an attempt at providing some higher level scaffolding for getting started with that at its core it's a pretty simple architecture to be honest so there's this core graph this is the main agent that comes in and then there's this reflection agent and so basically what happens is the user input will come in it will go to the main agent from there it will go to this reflection agent and then this reflection agent will either send it back to the graph or it will go to the end there are only two assumptions that we make for this pre-built architecture so one the main agent should take as input a list of messages it can take other things but it should at least take as input a list of messages and then two the reflection agent should return a user message if there are any critiques otherwise it shouldn't return any messages and the reason that this is important is because this conditional Edge will look for the presence of a user message when deciding whether to go back to the main agent or to return to make this more concrete we added two examples one is a coding example where it uses a lint Checker to check whether the code seems correct that it generated the other one uses an llm as a judge to just judge the output let's take a closer look at both of these starting with the llm as a judge first let's take a look at the code we can see that we install three modules L graph reflection which is this higher level architecture Lang chain which we will use for managing the entry point to llms and then open evals which is a package we created to do LM as a judge evaluations first we create the main assistant graph so this is just going to be a really simple single call to an llm so we're you calling into Claude 3.7 Sonet and you know this agent isn't really an agent it's just a call to an LM but this is just for demo purposes and it also shows that you can use this architecture not just for super complicated agents but even for kind of like simple just single invocations to an llm next let's take a look at the critique that we do so here we have this prompt that we're going to ask the llm as a judge to grade the initial response for it's going to grade it on accuracy completeness Clarity helpfulness and safety so this is pretty generic and if you're applying this to your use case you you'll probably want to customize this prompt and make it a little bit more specific and that's totally possible next we're going to create this function that does this llm as a judge bit so here we can see that we use create llm as a judge this is a helper from open eval we're going to use 03 mini as our judge and then the feedback key that we're going to want is just pass we're then going to pass in the outputs as the contents of the last message that we get because remember this is running after the initial agent so any response from the initial agent will be the last message that we get and we're not really going to look at the inputs because we're just judging the outputs here then based on this if the result of score is true um so this is the score is going to be a Boolean if it's true that means it's approved by the judge and we're just going to finish otherwise it's false and we're going to use the comment field to pass back in feedback and so score and common are standard fields that we get back from this llm as a judge evaluator we're going to build this graph and then we're really simple just going to use create reflection graph pass in the first graph pass in the judge graph get back the sap awesome and then we can see here that we have an example query and so let's see what happened I'm going to run this graph right here by just running the this file I can see that I get some nice print statements it'll take a little bit because I asked it for a pretty complex report I can see that I log something out to the terminal and then it finishes so it passed on the first try let's take a look at the lsmith trace to see exactly what's going on under so here we can see that there was two main sub agents here the graph which is the first one and then the reflection agent so clicking into this we can see our first call to an llm here we have our user input here and then we get this nice long AI output now here we have another call to an llm and this is for judging and so here we can see our system prompt that we wrote here and then we can see that we passed in the response from the first graph into here we can see the output is saying that it should pass and then there's this reasoning bit as well and so because the score is true it's not going any further let's take a look at another example this time with code so here we're going to install the same packages but we're also going to install pyite which is a python linter we can then see that we have this helper function analyze with pyite so this will use pyate for static type checking and errors other than that it's pretty similar so we can see that we create a single call to a model this will be our base agent and then we create this reflection bit which will attempt to grade the generated code so the first thing it does is it actually extracts the generated code from the response so because the response is just an AI message we actually need to pluck out the parts that have to do with coding so we're first going to use an llm to do that and so that's what this extraction bit does here then if we did have a call that that extracted the python code we're going to analyze it with pyite we're going to print stuff out just so we can see what's happening and then if there's any errors we're going to pass back in a message to the user I ran power and found this with the explanation that we get from the General Diagnostic section we de tell it to try to fix it and try to regenerate the entire code snippet this is because we're doing extraction on that message in the final in the step before this so we want it to generate the whole thing and then we also say that if you're not sure you can also just like ask a question um and this is because sometimes in code the model might not actually know the most upto-date syntax and so rather than trying to generate something incorrect and keep on trying it should Instead try to ask the user for help let's try this one out now great so here we can see some initial errors that we get so one has to do with an import statement Lang chain coma could not be resolved so I don't actually have this package installed in my environment next I can see that I this next I can see that there's some parameter error no parameter named initial State I can then see another issue so cannot access attribute run for class State graph so there's a few issues here some of which could probably be fixed by installing the Right Packages other of which appear to be a little bit wrong still hopefully this helps motivate this General reflection architecture and provides a few examples for how you can use it I want to call out that it's really important to get a good reflection agent here as well and that's actually something we'll be working on over the next few weeks is to provide some off-the-shelf reflection agents for rag for code and for general purpose things but you should also always know that you can customize this reflection agent to your particular domain and application you can try out L graph Reflection by doing pip install L graph reflection thanks for watching

Original Description

This prebuilt graph is an agent that uses a reflection-style architecture to check and improve an initial agent's output. This reflection agent uses two subagents: - A "main" agent, which is the agent attempting to solve the users task - A "critique" agent, which checks the main agents work and offers any critiques GitHub: https://github.com/langchain-ai/langgraph-reflection

Watch on YouTube ↗ (saves to browser)

Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from LangChain · LangChain · 0 of 60

← Previous Next →

Chat With Your Documents Using LangChain + JavaScript

Chat With Your Documents Using LangChain + JavaScript

LangChain SQL Webinar

LangChain SQL Webinar

LangChain "OpenAI functions" Webinar

LangChain "OpenAI functions" Webinar

LangSmith Launch

LangSmith Launch

LangChain x Pinecone: Supercharging Llama-2 with RAG

LangChain x Pinecone: Supercharging Llama-2 with RAG

LangChain Expression Language

LangChain Expression Language

Building LLM applications with LangChain with Lance

Building LLM applications with LangChain with Lance

Benchmarking Question/Answering Over CSV Data

Benchmarking Question/Answering Over CSV Data

LangChain "RAG Evaluation" Webinar

LangChain "RAG Evaluation" Webinar

Fine-tuning in Your Voice Webinar

Fine-tuning in Your Voice Webinar

Tabular Data Retrieval

Tabular Data Retrieval

Building an LLM Application with Audio by AssemblyAI

Building an LLM Application with Audio by AssemblyAI

Superagent Deepdive Webinar

Superagent Deepdive Webinar

Lessons from Deploying LLMs with LangSmith

Lessons from Deploying LLMs with LangSmith

Shortwave Assistant Deepdive Webinar

Shortwave Assistant Deepdive Webinar

Cognitive Architectures for Language Agents

Cognitive Architectures for Language Agents

Effectively Building with LLMs in the Browser with Jacob

Effectively Building with LLMs in the Browser with Jacob

Data Privacy for LLMs

Data Privacy for LLMs

"Theory of Mind" Webinar with Plastic Labs

"Theory of Mind" Webinar with Plastic Labs

LangChain Templates

LangChain Templates

Using Natural Language to Query Postgres with Jacob

Using Natural Language to Query Postgres with Jacob

Building a Research Assistant from Scratch

Building a Research Assistant from Scratch

Benchmarking RAG over LangChain Docs

Benchmarking RAG over LangChain Docs

Skeleton-of-Thought: Building a New Template from Scratch

Skeleton-of-Thought: Building a New Template from Scratch

Benchmarking Methods for Semi-Structured RAG

Benchmarking Methods for Semi-Structured RAG

LangSmith Highlights: Getting Started

LangSmith Highlights: Getting Started

LangSmith Highlights: Debugging

LangSmith Highlights: Debugging

LangSmith Highlights: Datasets

LangSmith Highlights: Datasets

LangSmith Highlights: Evaluation

LangSmith Highlights: Evaluation

LangSmith Highlights: Human Annotation

LangSmith Highlights: Human Annotation

LangSmith Highlights: Monitoring

LangSmith Highlights: Monitoring

LangSmith Highlights: Hub

LangSmith Highlights: Hub

SQL Research Assistant

SQL Research Assistant

Getting Started with Multi-Modal LLMs

Getting Started with Multi-Modal LLMs

Build a Full Stack RAG App With TypeScript

Build a Full Stack RAG App With TypeScript

Auto-Prompt Builder (with Hosted LangServe)

Auto-Prompt Builder (with Hosted LangServe)

LangChain v0.1.0 Launch: Introduction

LangChain v0.1.0 Launch: Introduction

LangChain v0.1.0 Launch: Observability

LangChain v0.1.0 Launch: Observability

LangChain v0.1.0 Launch: Integrations

LangChain v0.1.0 Launch: Integrations

LangChain v0.1.0 Launch: Composability

LangChain v0.1.0 Launch: Composability

LangChain v0.1.0 Launch: Streaming

LangChain v0.1.0 Launch: Streaming

LangChain v0.1.0 Launch: Output Parsing

LangChain v0.1.0 Launch: Output Parsing

LangChain v0.1.0 Launch: Retrieval

LangChain v0.1.0 Launch: Retrieval

LangChain v0.1.0 Launch: Agents

LangChain v0.1.0 Launch: Agents

Build and Deploy a RAG app with Pinecone Serverless

Build and Deploy a RAG app with Pinecone Serverless

Hosted LangServe + LangChain Templates

Hosted LangServe + LangChain Templates

LangGraph: Intro

LangGraph: Intro

LangGraph: Agent Executor

LangGraph: Agent Executor

LangGraph: Chat Agent Executor

LangGraph: Chat Agent Executor

LangGraph: Human-in-the-Loop

LangGraph: Human-in-the-Loop

LangGraph: Dynamically Returning a Tool Output Directly

LangGraph: Dynamically Returning a Tool Output Directly

LangGraph: Respond in a Specific Format

LangGraph: Respond in a Specific Format

LangGraph: Managing Agent Steps

LangGraph: Managing Agent Steps

LangGraph: Force-Calling a Tool

LangGraph: Force-Calling a Tool

LangGraph: Multi-Agent Workflows

LangGraph: Multi-Agent Workflows

Streaming Events: Introducing a new `stream_events` method

Streaming Events: Introducing a new `stream_events` method

Building a web RAG chatbot: using LangChain, Exa (prev. Metaphor), LangSmith, and Hosted Langserve

Building a web RAG chatbot: using LangChain, Exa (prev. Metaphor), LangSmith, and Hosted Langserve

Open Source RAG with Nomic's New Embedding Model (and ChromaDB and Ollama)

Open Source RAG with Nomic's New Embedding Model (and ChromaDB and Ollama)

LangGraph: Persistence

LangGraph: Persistence

This video teaches how to create a reflection agent using LangGraph Reflection, which improves an initial agent's output by evaluating it with an LLM as a judge. The reflection agent can be customized for specific domains and applications. By following the steps, viewers can design and implement their own reflection agents.

Key Takeaways

Create the main assistant graph using an LLM
Create a function to use the LLM as a judge to evaluate the output
Pass the output to the LLM as a judge and get feedback
Use the feedback to decide whether to send it back to the main agent or return to the user
Run the graph by running the file
Extract generated code from AI response using LLM
Analyze code with pyite for static type checking and errors
Pass back in feedback to user with error messages

💡 The reflection agent is crucial for good performance, and off-the-shelf reflection agents will be provided for RAG, code, and general purpose, but can also be customized for specific domains and applications.

🔒 Pro feature: Ask AI to explain this lesson →

More on: Agent Foundations

View skill →

Build and Deploy an Agent with Reasoning Engine in Vertex AI

Adding a Phone Gateway to a Virtual Agent

From Zero to Working AI Agent in 60 Seconds

From Zero to Working AI Agent in 60 Seconds

Create An AI Agent With Replit That Automates Your Sales

Create An AI Agent With Replit That Automates Your Sales

Capstone: Autonomous Runway Detection for IoT

Capstone: Autonomous Runway Detection for IoT

AI Agents with Model Context Protocol & Typescript

AI Agents with Model Context Protocol & Typescript

Related AI Lessons

OpenClaw is finally available on Android and iOS

OpenClaw, a free open-source agentic program, is now available on Android and iOS, allowing users to leverage AI on their mobile devices

Amazon Bedrock AgentCore vs Reality

Learn how to build a secure, reliable, and scalable AI agent using Amazon Bedrock AgentCore on the AWS tech stack

Stop Blaming the Model. Your AI Agents Need a Control Plane

Learn why a control plane is crucial for AI agents, going beyond just the core agent loop

Medium · Data Science

What 12 failure classes and 30 Billion tokens spent taught us about trusting AI coding agents

Learn from 12 failure classes of AI coding agents to improve trust and reliability in production environments

Dev.to · keesan.eth

Building Great Agent Skills: The Missing Manual