NEW Multi-Agent CODE explained (by OpenAI)
Skills:
Multimodal LLMs90%Agent Foundations80%Tool Use & Function Calling80%Multi-Agent Systems70%Autonomous Workflows60%
Key Takeaways
The video explains OpenAI's new multi-agent code for conversational systems, demonstrating how to design and implement a multi-agent system for managing user interactions using tools like Python and OpenAI's API. It covers defining routines, configuring agents, and implementing seamless transitions between agents.
Full Transcript
hello Community you ask me hey what is the latest State here to code agents and you are lucky because just four days ago openi showed us here a beautiful new updated example so I go with notation of openi where they Define here the class of an agent and we will have a look but before you have your question so very fast answer your question you said can also a vision language mod constitute an agent but of course you just an access to a stereo camera and you will activate here you have your Active Vision language mod which will identify object in its environment you have a beautiful control cycle and you know almost every language model now is almost a vision language model so no problem if you have the specific tools like here cameras there are cameras and the system is trained you have no problem at all based on the pre-training data that this system will work and will constitute an agent also a vision language action yes of course you just have an active sensor array I don't know and you have infrared ultraviolet lighter radar and you had a set of actuators so you can move your little robot around in three dimension and the vision language action model learn this and the pre-training data here in especially in the fine tuning data how to control this so we are in beautiful robotics Al we have in robotics our agents that work just great depending on the pre-training data and the fine-tuning and then the question that I always love can we build super human eye systems yes of course you just take your vision language mod and you add here access to an infrared sensor or to an UltraViolet sensor and you have a superum agent because I as a human I cannot see the infrared spectrum and I have no ultraviolet sensor in my eyes so here we have it a super humid agent thank you for your question and now on we go what are according to the latest open ey literature the key characteristics of an agent not going to believe it four points you identify it you have a Model A language model that you use G4 Omni mini for example you have the instruction this is more or less our system promt that guides the behavior and you have tools this is a list python function that the agent can call to perform actions or access external functionalities and here we have it our agent the name the model that you're going to use the instruction where you tell what you expect here from an AI system from the agent and the tools that the the agent has access to there is your beautiful the open ey cookbook orchestrating agent multi-agent configuration routine and handoffs and this is what I'm going to show you but here you have of course the link and I leave the link on the description and I leave you also the link to the complete python notebook so you get a complete code to download let's have a look just understand here a basic idea we say now here that agents multi-agent system each agent has a specific role and each agent has access to specific tools so we have model system scalable system conversational system llm system where each agent handles specialized tasks and the conversation between human and a multi-agent system can have seamless transition between the agents because the system will call all the agent specific whatever is needed now you remember in my last videos we had seven agents agent One agent two till agent 7 and they were here in a chain now you might ask is a chain typology really the best thing if we have multi- agent system that interact with a Knowledge Graph the answer is no there are better configuration but this will be the content of my next video anyway what are our elements what you should know if you're new to agents routines simple sequence of steps represent here as instructions to guide here the language model in fulfilling specific tasks I will show you routine in a minute then we have tools those are python function that perform specific actions like act execute a refund or look up a specific item in a database we have function to schema conversion it this is a simple utility to convert your python function into adjacent format into a schema that a language model understands and can execute and then we will handle tool calls and this is simple mod can call specific functions during the conversation and these functions are mapped to python implementation that can execute the actions and return the result back to the conversation three more terms handoffs by open ey agent can transfer conversation to another agent we have multi-agent systems similar how a real life customer support my escalator transfer you to a different department we have agent class this is already showed you here provides a structure to Define their agents their instruction their tools and their model the LM that they interact with and we have a beautiful multi-agent orchestration finally code that I will show you allows multiple agents to interact within a single conversation a routine is so simple it is a system message that gives instruction to the agent so you have in the agent hey you say hey you are a customer support agent for a specific company here has three points you should do ask the user suggest the fix and if the user isn't happy hey offer the user a refund that's it that's a routine agents are like worker so they have specific tasks to do and they use here specific tools you know now that those a python function to complete the tasks for example a refund agent uses here function to look up here specific item in the database of the company and another function to process here a refund to the customer so we have here function look up the item here we have the name of the item and then function execute a refund beautiful so an agent will use this function that we have to Define to complete the task now this code here now allows you to simulate here conversation where the user says something types something and the agent simply respond by following its routine so we have here a function run full turn this is the simplest where we have a system message and some user messages and this simply simulates getting a user input and calling the model gbd4 for example or GPD Omni mini whatever you understand this then we have messages appended and return a message this is the simplest case nothing is happening just the agent asks here to use a question and you got here prepare an answer now sometimes you know you need tools no the I agent needs to use tools to answer here so in a customer service scenario the agent might need to look up the item in the repo or reprocess here refund so what we have to do we have to map now the f function into a schema schema is a specific format that the language model understands and let the model call those tools so we have here simple function to schema we have the name the parameters the type and the properties and now the model knows how to call this function so we see tool schema function to schema from execute refund function to schema from the function look up the item in the database of the company easy and then when the model needs to call now specific tool it simply sends here a request like call the refund function the code looks up here which tool to use and calls it with the necessary parameters because maybe you have the customer name or you have an ID or you have the amount of items or whatever so execute the tool call a simple function but now it gets interesting because if now the situation needs a different agent let's say a sales agent or a refund agent or whatever now the current agent and this is the beauty now with open ey solution here can hand off now conversation to another agent so after placing an order if the user requests now suddenly a refund from another item the sales agent can now pass the conversation to the refund agent of the company and the user can continue to cooperate here with those companies so we have a refund agent we Define this agent we give it a name we give it a clear instruction what is the task of this agent we have here sales agent help the user place an order and then we can simulate very easily ah hand of a simple structure now we can more or less run this complete system so the complete code runs in a loop allowing the users to interact with the agents and it switches between the agents based on the conversation flow and this is all the code there is to it and I hope you are familiar with this code now this was the simplest example nothing was happening we were just getting familiar with the terms we have a look what's going to happen so let's take now a little bit more complex example where we have three agents okay at first we have a triage agent this is the agent here at the Forefront of our company he's fighting here with the customer when a customer calls us hey I'm absolutely not satisfied here so we have here functions our triage agent here has a name and instructions say hey you are a customer service bot for this company task one introduce yourself always be very brief guard the information to direct the customer to the right department but make your question subtle natural and transfer it to the department and what we have on functions we have transfer to the sales agent transfer to issue and repairs and escalate to a human operator if the EI system says no way I can handle this angry human so you see you define your function you have your agents you have your tools now transfer to sales agent here transfer to issue and repair this is here and escalate to the human you be find it here and this is all there is to it now the sales agent agent we have the name with the instruction you are a sales agent sell the user a product always answer in a sentence LS follow this routine with the user ask them about any problem yeah you get the idea and what we have execute an order price should be in US Dollars order summary product name the price confirm the order yes no input order successfully beautiful or order cancelled okay transfer back to the triage if everything worked out fine call this if the user brings up a topic outside of your perview including escalation to human if there's another problem there's always a fullback solution to another agent and then our third agent is issues and repair and you're not going to believe it we have again the same schema agent the name all the instruction all the function all the tools that we can use here they are defined now we know what to do we have a clear defined input output schema great but how do agents now interact now we have a conversation flow then as a I showed you we have tools or function calls we have the handoffs and we have Dynamic transactions so have a look now at the main Loop what is the main Loop doing user input the agent processing the agent updates and keeps the conversation running that's all there is and look this is the code of the main Loop we start here with the triage agent the agent at the front line facing here the customer and user has something to say beautiful and we have now a run full turn with the agents with the messages our agent has a respond from the agent update the current agent if a hand off has occurred if a problem escalated if you hand it off to a human or to another agent and you have all the messages here in a beautiful list now as you see the real interesting function here is of course run full turn I have a look at this it is s to the system as it handles here the execution of a single conversation turn between user and the agent that is here interacting with the user it manages here the interaction by sending the messages to the language model handling all the function calls handling all the tools and processing here the agent handoffs whenever they occur whenever they are necessary I give you the code this is rather simple code have a look at this this is part one this is part two take your time look at it I will explain here what is happening in a conversation flow make it easier so the human user initiates a conversation said hey I need help with a product that I want to purchase the trios agent now handles here the first turn so trios agent is number one it is processing now here a run full turn and it got us the information from the human and makes a possible function call because either it transfer here to issues and repairs or whatever is the context and we have an agent handoff so the function call transfer to issues and repairs Returns the issue and repair agent the agent is updated now we have a new agent not anymore the tri agent acting and then we have here the agent continues now with the action processing continues with the w Loop in the full run turn and you have the communication now beautiful agent has a response ask probing question proposes some fixes or maybe offer a refund you have function calls look up the item or execute a refund and the conversation continues and you see this is a beautiful Loop a beautiful Circle you can have another perspective if you want maybe this is easier for you this is how it works in practice so we have a user input and a message handling situation the user message is appended to the complete conversation history maybe we start with zero but maybe this user already called 10 minutes ago and the message is as easy as a copy of the conversation messages is made to avoid modifying the original list during the processing the agent interaction with the language mod the agent instructions are used as the system messages to set the context I've shown you this already the conversation history is provided now to our llm ensuring that the large language model has full context and the model generates a response which may include content and specific function calls to solve this let's say we have a function calls we have tool use so if the model makes any function call they are handled one by one each function call is executed using the execute tool call which invokes the corresponding python function we Define those function I've shown you this and then an agent handoff if the executed function returns an agent instance a hand of occurs the current agent is updated to the new agent it is returned and a message indicating the transfer is added now to the conversation to the recording and then we have a loop continuation and the the loop continues as long as the model makes function calls and once there is no more function calls the turn ends and returning response however you would like to see it it is a rather easy schema and it's a beautiful new idea implemented by open ey for myc green grasshoppers an explanation another one here of The Run full turn function since this is the central element so we have our agent we have the messages of the conversation here on the list of the messages that were exchanged so far whatever happened you have here a function Lo loic now you initialize the variables you have the main loop I showed you this you convert the tools to a schema you get the models response you have the client chat completion create that you know you check for new function calls you handle those function calls you handle if there's an agent handoff and you have a response of the system if you want to see this in detail no problem I have here detail 1.1 and point 2 now you can read this also on a non 4K screen detail 2 here get the model response check for the function calls and the details three is finally the agent handoff and a return great so this is it more or less this is the new schema 4 days old isn't it beautiful so I ask now strawberry hey what are the benefits and can you give me a summary of here this new multi agent orchestration as shown by open ey and know a little strawberry comes back and says hey he the benefits are modularity we have Dynamic handoffs we have tool integration we have a context preservation and it is scalable new agent and new tools can be added without significant changes to the existing system this is nice and summary by our strawberry is this run full turn function as I showed you this is crucial this is where you should focus your attention it is crucial that it orchestrates you the interaction between the user and all the different agent it manages here the message passing but careful this is not the message passing from the graph neural network that we have over there this is here the trivial case this it's really just the conversation history then we have the model interaction all the function calls all the tools that we use and all the agent handoffs and yeah if you apply this you can build a sophisticated conversational system that is both flexible scalable beautiful so this is a very nice system great I told you I will give you here the complete code so you simply go to openi gab openi cookbook examples orchestrating the agent it's a python notebook beautiful as you see 4 days ago now all the errors are gone now it's beautifully working and you can download here the complete python notebook you have everything even in small details that's really beautiful examples in addition but I hope I've given you here an introduction to help you a little bit with the new notion of routine and handoff here and if you want they also provide here then a sample repo for a swarm that is an implementation without a coordinating agent but I would say we skip this idea because in my next video or one of my next videos we will look here at a professional communication protocols between our agents that is automatically adapted to the specific configuration that we have between our agents so in the easiest case we have a chain of Agents but we will examine that we have different configuration and different spaces where we have different communication protocol that are necessary and yeah but this will be part of one of my next videos so if you want to subscribe hello you are welcome and it would be great to see you in my next video
Original Description
New Multi-Agent orchestration by OpenAI. Code based video with detailed explanations.
This video revolves around the design and implementation of a multi-agent system for managing user interactions in tasks such as customer service, sales, and support.
The core concept involves defining routines, which are structured sequences of instructions for handling specific workflows. Each routine consists of a system message that outlines the steps the agent must follow, such as asking probing questions, proposing solutions, or offering refunds. To enable dynamic actions within the routine, the system integrates tools, which are Python functions used by agents to perform tasks like looking up an item or processing a refund. These tools are translated into JSON schemas so that the language model (e.g., OpenAI GPT-4) can invoke them as part of the conversation.
The conversation management aspect of the system is built around the idea of tool calls, where the model can determine when it needs to execute a tool based on the user's input and the agent's instructions. A function (execute_tool_call) is used to map the model’s tool requests to the appropriate Python function, allowing agents to interact with external systems or databases, simulate refunds, or search for items. Additionally, the system allows for handoffs between agents, where one agent can seamlessly transfer the conversation to another agent more suited to handle a specific request (e.g., from sales to support). This handoff mechanism is implemented using agent classes that can switch context based on the conversation's flow, enabling a flexible and dynamic interaction process.
The discussion focuses on simplifying the intricate logic behind the multi-agent framework, making it more accessible for understanding. The role of agents and routines in managing distinct workflows was explained, along with the introduction of tools to bridge the gap between conversational AI and real-world actions. Additionally, the
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
Playlist
Uploads from Discover AI · Discover AI · 0 of 60
← Previous
Next →
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Step Into the Unknown (by YouChat) - May 2023 be your best year yet
Discover AI
Wishing you all an amazing 2023 filled with Love, Laughter, and Happiness!
Discover AI
Create a Smarter Future!
Discover AI
The Art of Text to Vector Transformation: A Comprehensive Look at AI and NLP Transformers
Discover AI
Feature Vectors: The Key to Unlocking the Power of BERT and SBERT Transformer Models
Discover AI
Domain-Specific AI Models: How to Create Customized BERT and SBERT Models for Your Business
Discover AI
Achieve Unimaginable Levels of Domain Knowledge through SBERT Extreme in 3D (SBERT 48)
Discover AI
Unlocking Scientific Domain Knowledge w/ BPE Tokenizer: An Amazing Journey! (SBERT 49)
Discover AI
SBERT Extreme 3D: Train a BERT Tokenizer on your (scientific) Domain Knowledge (SBERT 50)
Discover AI
Discover Vision Transformer (ViT) Tech in 2023
Discover AI
Pre-Train BERT from scratch: Solution for Company Domain Knowledge Data | PyTorch (SBERT 51)
Discover AI
Flan-T5-XL model on a free COLAB | A free LLM - that explains itself w/ reasoning /write essay | AI
Discover AI
BERT and GPT in Language Models like ChatGPT or BLOOM | EASY Tutorial on Large Language Models LLM
Discover AI
Free Alternative to ChatGPT: Flan-T5-XL GUI (open-source) #shorts
Discover AI
From T5 to T5X: A Game-Changing Evolution with JAX & FLAX
Discover AI
How to start with ChatGPT? | Short Introduction to OpenAI API #shorts
Discover AI
The Future of Conversational AI? Google's PaLM w/ RLHF | LLM ChatGPT Competitor
Discover AI
Microsoft and ChatGPU
Discover AI
From Zero to FLAN-T5 XL Model GUI with Gradio: A Step-by-Step Guide on Free COLAB Notebook PyTorch
Discover AI
Google's 2nd Answer to "BING ChatGPT": Sparrow | after BARD w/ LaMDA | 2nd Gen Conversational AI
Discover AI
TF2: Pre-Train BERT from scratch (a Transformer), fine-tune & run inference on text | KERAS NLP
Discover AI
3D Visualization for BERT: How to Pre-Train with a New Layer & Fine-Tune with Downstream Task Layer
Discover AI
FLAN-T5-XXL on NVIDIA A100 GPU w/ HF Inference Endpoints, let's explore 11b models!
Discover AI
ChatGPT - Can it Lie to you?
Discover AI
ChatGPT Alternative: Perplexity by Perplexity.AI
Discover AI
2023 KerasNLP Tutorial: Explore Latest KERAS Toolbox & NLP Processing Library for BERT - TF2
Discover AI
Self-aware AI: You.com/chat vs Perplexity.ai | Live Demo, LLMs show Future of ChatGPT w/ BING
Discover AI
BLOOM 176B Inference on AWS | Bigger than GPT-3 for more Power!
Discover AI
Fine-tune ChatGPT? Buy Embeddings /OpenAI? What are Embeddings? My own ChatGPT? | Visual Q+A
Discover AI
Unleashing the Power of BLOOM 176B with AWS ml.p4de.24xlarge, DJL & DeepSpeed: The Ultimate Boost!
Discover AI
After ChatGPT: NEW BioGPT by Microsoft | Do YOU trust Microsoft for your Medication?
Discover AI
Improve ChatGPT: Modular, Adaptive, Smart LLM | Inside ChatGPT
Discover AI
Fine-tune ChatGPT w/ in-context learning ICL - Chain of Thought, AMA, reasoning & acting: ReAct
Discover AI
The Intersection of Copyright Law and Human Faces: Exploring Virtual K-Pop with MAVE
Discover AI
New TECH: Vision Transformer 2023 on Image Classification | AI
Discover AI
PyTorch code Vision Transformer: Apply ViT models pre-trained and fine-tuned | AI Tech
Discover AI
New BING ChatGPT: Unlock the Power of Emotions in your Search Engine!
Discover AI
New BING ChatGPT loses its mind
Discover AI
Self-Attention Heads of last Layer of Vision Transformer (ViT) visualized (pre-trained with DINO)
Discover AI
Visualizing the Self-Attention Head of the Last Layer in DINO ViT: A Unique Perspective on Vision AI
Discover AI
Microsoft strongly restricts access to ChatGPT on new BING - WHY?
Discover AI
PyTorch ViT: The Ultimate Guide to Fine-Tuning for Object Identification (COLAB)
Discover AI
New BING Chat AGGRESSIVE
Discover AI
Panoptic Image Segmentation: Mask2Former explained | Identify all objects!
Discover AI
Code Panoptic Image Segmentation w/ Vision Transformer & Mask2Former - A PyTorch tutorial
Discover AI
Dream Job Alert: AI Prompt Engineer - $335K | AI Prompt Design: A Crash Course
Discover AI
Streamlining Similar Image Detection with ViT in PyTorch: A Step-by-Step Guide
Discover AI
Microsoft's CEO in Trouble #shorts
Discover AI
Why wait for KOSMOS-1? Code a VISION - LLM w/ ViT, Flan-T5 LLM and BLIP-2: Multimodal LLMs (MLLM)
Discover AI
OpenAI's ChatGPT can NOW summarize external Sources on the Internet?
Discover AI
ChatGPT polarizes
Discover AI
Hospital /Clinic AI Decision Models: Performance of 12 AI LLM Systems (incl $$) Radiology, Biomed
Discover AI
ChatGPT Prompt Engineering w/ in-context learning (ICL) - 7 Examples | Tutorial
Discover AI
Chat with your Image! BLIP-2 connects Q-Former w/ VISION-LANGUAGE models (ViT & T5 LLM)
Discover AI
ChatGPT: Multidimensional Prompts
Discover AI
ChatGPT: In-context Retrieval-Augmented Learning (IC-RALM) | In-context Learning (ICL) Examples
Discover AI
Code your BLIP-2 APP: VISION Transformer (ViT) + Chat LLM (Flan-T5) = MLLM
Discover AI
Buy Microsoft "Azure OpenAI Service" or buy from OpenAI its API for ChatGPT access & tuning?
Discover AI
Pretraining vs Fine-tuning vs In-context Learning of LLM (GPT-x) EXPLAINED | Ultimate Guide ($)
Discover AI
Reversible Transformer: ReFORMER for GPU Memory Optimization! Reversible Residual Layers?
Discover AI
More on: Multimodal LLMs
View skill →Related Reads
📰
📰
📰
📰
GPU Survivors: Can You Survive a 1T Parameter Inference Run?
Dev.to AI
Plan-and-Solve: make the model plan the steps before it computes any of them
Dev.to AI
Fine-Tuning Vision-Language Models for Production Invoice Extraction
Medium · Machine Learning
Fine-Tuning Vision-Language Models for Production Invoice Extraction
Medium · LLM
🎓
Tutor Explanation
DeepCamp AI