Agentic AI Course Free 2026 | Learn Agentic AI Full Course | Intellipaat

Intellipaat · Beginner · 🧠 Large Language Models · 3mo ago

Skills: Agent Foundations 90% · LLM Foundations 80% · Tool Use & Function Calling 70%

Key Takeaways

This video provides an introduction to Agentic AI, covering topics such as generative AI, large language models, and agent-based systems. It explores the capabilities and limitations of Agentic AI, including its ability to understand goals, break down tasks, and execute them autonomously.

Full Transcript

Hello everyone, and welcome to this Agentic AI full course by Intellipaat. In February 2026, Delhi hosted one of the most talked-about AI events, the AI Tech Summit 2026. Top global and Indian companies showed up at this event, including Google, Microsoft, Nvidia, Amazon, and Meta, along with Indian giants like Infosys and Reliance Jio. Now the summit is over and the announcements have been made. But the real question is, what happens next? Because events like these are not just about launches or networking. They are a signal of where the industry is heading. And if you look closely, there was a clear pattern across the discussions: a shift from AI that responds to AI that actually acts. That's exactly why companies are investing so heavily in these summits. They're not just showcasing tools anymore. They're preparing for the next wave, and that shift is being driven by Agentic AI. Until now, most AI tools have worked like assistants. You ask something, they respond. But Agentic AI changes the game. It can understand a goal, break it into small steps, make decisions, use tools, and actually execute tasks end to end. In simple terms, AI is moving from thinking to doing. And that's exactly what we are going to explore in this video: what Agentic AI really is, how it works, and why it's becoming the next big thing in tech. So, without wasting any time, let's get started. We are going to be discussing Agentic AI. I'm going to be talking about large language models, and I'll finally be able to talk to you guys openly about what we actually use in the industry and what the current technology trend in the market is. We are also going to do a lot of hands-on work. So just be prepared: we are going to be working on a crazy amount of hands-on stuff.
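The loop described above (understand a goal, break it into steps, pick a tool, execute) can be sketched in a few lines of plain Python. This is only an illustration: the `plan` function is a hard-coded stand-in for what a language model would produce, and the tool names and the example goal are hypothetical, not taken from the course.

```python
from typing import Callable

# Hypothetical tools: plain functions the agent is allowed to call.
def add(a: float, b: float) -> float:
    return a + b

def multiply(a: float, b: float) -> float:
    return a * b

TOOLS: dict[str, Callable[[float, float], float]] = {"add": add, "multiply": multiply}

def plan(goal: str) -> list[tuple[str, float]]:
    """Stand-in for the LLM planner: break a goal into (tool, argument) steps.

    A real agent would ask a model for this plan; here it is hard-coded.
    """
    if goal == "price of 3 items plus 5 shipping":
        return [("multiply", 3.0), ("add", 5.0)]
    raise ValueError(f"no plan for goal: {goal!r}")

def run_agent(goal: str, start: float) -> float:
    """Execute each planned step with the chosen tool, feeding results forward."""
    value = start
    for tool_name, arg in plan(goal):
        value = TOOLS[tool_name](value, arg)
    return value

# Unit price of 20, three items, 5 for shipping.
result = run_agent("price of 3 items plus 5 shipping", start=20.0)
print(result)
```

The point of the sketch is the shape of the loop, not the arithmetic: the planner decides which tool runs next, and each step's result becomes the input to the following step.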
I got a question from the audience on which agentic framework is good for beginners: if they want to start building agents, what's the best possible framework to start with? And I was like, you know what? There are already plenty of agentic frameworks available on this planet. You have n8n, which is a no-code, low-code platform. You have Gumloop. You have, I don't know, Glean. You have LangGraph. You have LangChain. You have so many different platforms that people use, and there is no end to it, right? CrewAI is one good example. But I personally believe that for anyone who wants to work on generative AI in the future, the best agentic framework to learn, the one you will actually use because you can create your own personalized, completely custom framework, is LangGraph. There are other frameworks too, but for beginners, I believe LangGraph and LangChain are the best things to start with. Because, number one, you can build any graph, it's highly customizable, and you have the world's biggest library for building AI, which is currently used by a lot of organizations, including n8n. So, what we're going to do is start with our understanding of generative AI. From then onwards, we will do a lot of hands-on exercises, so that you guys feel a little more comfortable building your own agents. You should feel like, okay, I can build my own use cases, I can build my own understanding of how to build my agents, and we can proceed forward from there. Also, we are going to build our own MCP server, meaning I will help you guys create your own async MCP server. It's going to be, of course, an MCP server that you can use as a node to node policy. You can, of course, get an idea from your own MCP servers, and you can build on it. A2A is absolute nonsense.
Agent to agent protocol is just so bad. So, just just focus on MCP. It's It's The fancy new things come up in the market, but they don't survive for long, right? MCP is something that I have used. I have built, and you know, I have utilized it so much that I think it's like pretty amazing, right? So, we'll work on that. So, guys, first of all, let's get started with our understanding on generative AI system, okay? Everybody, first of all, before we even start talking about Agentic frameworks, you got to understand one thing very carefully, and that is, your foundations on generative AI has to be absolutely sorted. When I say sorted, no. See, you got to understand one thing. If I give you a plane to fly, okay? You got to understand when and how long can you fly that plane, meaning, please understand there are always restrictions, and there are always limitation to things, okay? No system in the world is not built without a limitation. You got to understand this first. Please understand generative AI is not a magical pill that can literally solve every problem on this planet. If that was the case, every company would have moved to generative AI, right? It's not that companies don't use generative AI, but it is It's not like everything is done by generative AI, right? The media tells you a nonsense that, oh, you know what, generative AI has taken jobs, generative AI has taken jobs. You got to understand that there is not 100% truth to it. You got to understand that generative AI isn't taking jobs, okay? Jobs that are so monotonous and and something that can be done by the system very well, where you don't need a human being, right? Those things are taken over by your generative AI systems. You can't take over a software developing job by AI system. I know you have industry, you have, you know, multiple different types of tools like Trey, you have, you know, multiple other frameworks which can help you build code, but trust me, I have used all of them. 
They don't generate code in the way you expect them to be. They are absolutely terrible, like Cursor, right? Base 44. Multiple other, you know, platforms which you think are, you know, built for like lovable. You have Replit, you have Cursor, you have all these platforms that can build you code. I have tried them. They are not at all built for security and production grade softwares, okay? And remember, guys, in the future, there is no shame in admitting the fact that, yes, I wrote a program, I wrote a software, but 40% of my software was written with the help of AI. That's okay. Because, the rest 60% is where you had to utilize your immense amount of creativity and brain, okay? And the rest 40% if in AI system helped you, it's okay. Right? So, remember, AI is going to help you in the future, and there is no shame in admitting it, right? I use generated code, generative AI. That makes my life easy, and I'm able to work on many projects at once. So, no worries about that. It's absolutely okay. We'll worry about what's important and what's not important, okay? The number one thing I want you guys to remember from now onwards is if anybody talks to you about generative AI, please understand that it is not a very complicated topic. It's a simple topic. If anyone asks you, what is a generative AI system? My people, I want you to answer it like this. Imagine you have this immense amount of data. This big, huge, large amount of data, okay? And that data you pass inside an input, or you pass inside as an input to a system, okay? This system learns patterns from any unstructured input data, and is able to generate an unstructured output data. What What does this mean, right? You have already studied machine learning. You've already studied deep learning. You already have an idea that any machine learning system in this planet, right? If you give it some data, it will understand the pattern from it. No problem. 
The difference, my people, is in generative AI systems, the way we build generative AI system is we don't just pass it data. We pass it crazy amount of large data. When I say large, I'm talking about billions, sometimes trillions of tokens and values that you pass inside a system, and that system through multiple huge amount of data is able to learn from it. Right? But, what does the generative AI system learn? Literally everything. The contextual relationship, which is your semantic relationship. It understands how to build word order. It understands token probabilities. It understands everything. It understands how to generate code. It understands how to generate an image, generate a video. Basically, anything that is your end goal, it will generate that. Okay? So, generative AI systems, what I wanted you to take from this example is that they will take input in an unstructured format, and they will give you output in an unstructured format. Now, you must be wondering, right? Sir, what do you mean by unstructured? I have seen generative AI fairly structured. No. Here is a difference. See, guys, if you don't know, let me tell you one of the reasons why generative AI industry is generating billions of dollars, believe me or not. is to get output in a structured format. Now, you must be wondering, "Huh? Seriously?" I'm like, "Yes, seriously." You know why? I'll tell you. See, have you ever asked any generative AI system a code? Like suppose ask it like, "I want to make a pizza." Right? And this pizza, how do I cook? And it will give you this nice formatted response. See, the problem is ask the LLM the same question but with a different LLM and observe the response. It will be different. Ask another LLM, it will be another difference this different response. Right? No matter how many LLMs you use, you want the output to be in a structure like this. I want the output to be number one, "Greet me. Hello Mr. Joseph. How are you doing? Today's time is let's say 6:51 a.m. 
20th September Berlin, Germany." And it would Basically, you want your output to be provided in a structure that you want. Meaning, Joseph wants a different structure, Harish wants a different structure, Deepak wants a different structure specific to his or her requirement. Now, your generative AI systems are unstructured because what happens is depending on the question, if you slightly change the question, the answer also will become different. Right? So, there is no structured output towards the end towards your LLM that will be consistent throughout. Now, what you want to do in any generative AI use case, you want to make sure that because my system is a stable system, I want to give out answers that's pretty much simple and stable. So, remember your generative AI systems, you still need to call it a system that takes unstructured input data and generates an unstructured data as an output. Now, please remember everyone your generative AI systems are not going to generate any number for you. They are not going to generate any class label for you. They are not going to generate probabilities for you. What they do or what generative AI output is is they will give you either text, they will give you image, they will give you audio, or they will give you video. Remember, any generative AI system on this planet is built in such a way that it is not going to give you any probability output, it is not going to give you any class label, it is not going to give you some regression number. No, it's either going to take input as a text and give output as a text, input as a text give output as an image, input as a text give output as an audio, input as a text give output as a video. Okay? So, remember these are some of the most important thing. Okay? Your generative AI systems are always going to generate your output either in the form of a text, in an image, in an audio, or a video. There is nothing in between that generative AI systems are going to generate. 
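One common way to get the consistent structure described above is to ask the model for JSON and then validate it in code before using it. The sketch below assumes the model has been prompted to return a JSON object with four fields; the field names, the `Greeting` class, and the sample output string are all hypothetical and only mirror the "Hello Mr. Joseph" example from the talk.

```python
import json
from dataclasses import dataclass

@dataclass
class Greeting:
    name: str
    time: str
    date: str
    location: str

REQUIRED = ("name", "time", "date", "location")

def parse_greeting(raw: str) -> Greeting:
    """Parse model output as JSON and insist on the fields we asked for."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return Greeting(**{key: str(data[key]) for key in REQUIRED})

def render(g: Greeting) -> str:
    """Turn the validated fields into the one layout every user sees."""
    return f"Hello Mr. {g.name}. How are you doing? It is {g.time}, {g.date}, {g.location}."

# Stand-in for text returned by any LLM that was asked for this JSON shape.
raw_output = (
    '{"name": "Joseph", "time": "6:51 a.m.", '
    '"date": "20th September", "location": "Berlin, Germany"}'
)
greeting = parse_greeting(raw_output)
print(render(greeting))
```

Whichever model produced `raw_output`, the rendering step is the same, so the final answer keeps one shape even when the underlying wording differs from model to model.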
If you hear somebody say that, "Hey, you know what? There is this new generative AI tool is generating probabilities." And you'll be like, "No, bro. The fundamentals don't allow." Any generative AI systems will not allow any nonsense to be generated in between. Answer straight will come as a text, as an image, as an audio, as a video. Number one. Simple and sorted. Nothing to worry about. Now, generalized learning systems. If you guys want to understand how your large language models are built, okay? We'll try to get an understanding on how large language models function. Okay? Everybody, see what happens is you must have wondered a lot many times that, "Hey, how are how are these companies able to build these large language models that can, you know, do so many different operations?" I'm sure if you guys have heard of this new large language model by Google called Nano Banana. Have you heard of this? Google recently released this model called Nano Banana, right? It's a state-of-the-art model which is performing at amazing benchmarks, right? But you got to understand one thing. See, I'll I'll tell you what. Every single time a new language model comes out, no? I only want you to observe one thing. What is the input? What is the output? Okay? Do you know Nano Banana takes input in a multimodal form and gives you output in a multimodal form? Meaning, it can accept input as a text, as an image, as an audio, as a video. It will give you output as a text, as an image, audio, video, text, right? Now, the problem is Google does not provide one large language model that can do all four. The reason is they need to make money, right? They need to of course make money. They have spent millions of dollars in making the model. They can't give away everything for free. So, audio will be built differently. Your text will be built differently. Your image will be built differently. If you remember when ChatGPT came into picture, right? 
Do you remember they came up with the model called Sora? Where you type in a prompt and it generates this video-based model. Now, if you don't know or if you have validated it, they also allowed an audio input in the Sora in the beginning. But then that was taken away. Meaning, it just stayed there as one week feature. They said that it doesn't work really really tight, you know, well and they removed it, right? Now, the reason why that happens is because see, these large language model companies understand one thing that if they provide everything at once for free to everyone, it's not going to make money for them. Okay? So, you got to understand that any large language model built has a specific goal in their mind. What goal? Example, if you look at this system in front of you, what do you see? You have this large amount of data. This data can be either text, images, audio, code, your programs, computer programs, labeled or unlabeled data. Okay? What you do, my people, is when you want to build your own large language model, there are two phases to your models. Okay? And the phases are you are going to first pass it to something called pre-training phase. Now, what do you mean by pre-training phase? The baseline model or baseline generative model, right? Here, if you look at it very carefully, here the model is usually trained on a large amount of mostly unstructured data to learn relationships and patterns. Now, everybody, any large language model that you build, right? On this planet, anybody who has built a large language model or worked on a large language model knows this that first what happens is any large language model that you build, you first need that model that baseline model to understand what the patterns are from the data. Okay? Once you understand the pattern from the data, your life is sorted. But actually, the game is much more different. Because what happens, no? In any large language model that you build, no? You need data that is labeled. 
Please understand, labeling is one of the bigger challenges in any large language model. You know why? The problem is that labels are actually created by humans. In reinforcement learning we call it RLHF, reinforcement learning from human feedback. What happens, basically, is that this vast amount of data is converted into multiple choice questions. Let's take an example. Say I give you the statement "Delhi is the capital of India." Suppose this is your text. The way large language models are built, this same text will be converted into an MCQ: "____ is the capital of India. A, Mumbai. B, Delhi. C, Bangalore. D, Chennai." This is one question created from the data. The second question created from the data is "____ is the capital of ____. A, Delhi, India. B, Delhi, Mumbai, India. C, Sydney, New Zealand." You're getting what I'm trying to say. Your text is converted into MCQ format, and in MCQ format the labels are provided by a human. So, in the initial run of any large language model, they start by labeling the data like multiple choice questions, and this labeled data is actually put in by a human being. A human labels it. Then what happens? We train a large language model and we generate MCQs again, this time automatically labeled by the algorithm, by the first version of your large language model. Whichever is wrong, we label it wrong. Whichever is right, we label it right. And again, we pass the large language model into training. We repeat this process again and again while the large language model keeps working through the data. Now, see guys, these MCQs and all of that stuff, you don't need to build this. Trust me. You're not going to be building this.
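The "Delhi is the capital of India" example can be turned into a fill-in-the-blank question mechanically. The sketch below follows the description given in the talk only; it is not a claim about how any particular lab prepares its training data, and the function name and distractor list are hypothetical.

```python
import random

def make_cloze_mcq(sentence: str, answer: str, distractors: list[str], seed: int = 0) -> dict:
    """Blank out `answer` in `sentence` and shuffle it in with the distractors."""
    if answer not in sentence:
        raise ValueError("answer must appear in the sentence")
    options = distractors + [answer]
    random.Random(seed).shuffle(options)  # fixed seed keeps the example repeatable
    return {
        "question": sentence.replace(answer, "____", 1),
        "options": options,
        "label": options.index(answer),  # position of the correct option
    }

mcq = make_cloze_mcq(
    "Delhi is the capital of India.",
    answer="Delhi",
    distractors=["Mumbai", "Bangalore", "Chennai"],
)
print(mcq["question"])
for letter, option in zip("ABCD", mcq["options"]):
    print(f"{letter}. {option}")
print("correct:", "ABCD"[mcq["label"]])
```

The `label` field plays the role of the human-provided answer key in the description above: a later pass can compare a model's chosen option against it and mark the attempt right or wrong.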
You're not going to be the one who's going to be spending billions of dollars. These are done by companies with immense amount of resources. Google can do this. Meta can do this. Other companies can do this. I'm sure your company or any company that you will work for in the future, you're not going to spend money or even go into this process. This is only for your knowledge. Okay? Only for your understanding. Okay? So, you see guys, what happens is you have something called as a baseline model. A model that is learning from the behavior of or the pattern from the data itself. Once the behavior is learned, you pass it to something called as the adaptation phase that the model is usually fine-tuned on data to small specific questions. Meaning, because you want a large language model to give you or generate an image, you take the baseline model, again fine-tune it. Meaning, make it into a way which can only generate a new image. If a user asks, "Make me a strawberry that looks like green strawberry building or, you know, growing on top of a human head." So, if you ask a question like that, you're going to have to perform some kind of fine-tuning in your model that can only give you answers in a specific way. Suppose if you want to build your own ChatGPT, then what you need to do, you need to take your baseline model, which is now trained on the world's data, and convert it into only questioning and answering. So, you ask me a question, I will give you the answer. And then, your LLM can give you question answering, code generation, image generation, and text generation. Okay. So, let me explain it to you like this, everyone. If you have a child growing up around you, okay? Does the child first learn to speak words like mama, papa, dada, mama, something like that? Uncle, right? Once it starts speaking in a very simple word, right? 
Like either it'll say water, either the baby will start saying the word uncle or auntie or anything, whatever the first word comes, then the baby forms the words together and is able to talk to you, right? Like, "Mama, I am hungry. I want food." Right? The baby may not be able to say this correctly, but first it understands words. Initially, baby doesn't speak at all. Right? It becomes very hard. First, the baby is completely saying gibberish, which you don't understand. Then it understands the words like water, food. It combines the words. May not be correct. The words combined are becoming paragraphs. Paragraph becomes knowledge. Knowledge becomes then your child starts to go into school. He learns history, science, mathematics, and then he might get a PhD in the future. So, did your baby learn to do all of that in a day? No. For your child, it took years together to get this knowledge. But what if suddenly I could give your child a brain superpower where your child is able to absorb the knowledge of the world within a day, and suddenly in a day it can become powerful? That is what large language models do. What large language models do is they feed so much data into the model, so much data. I'm talking about the data's capacity that will take human beings literally 2,000 plus years to complete that. Did you know that ChatGPT-4 is built on top of data that would require human beings up to 2,000 years to just study that material. It's that much material, right? Imagine the accuracy and efficiency of that system. See, I know all of you must be wondering that your generative AI systems would be so powerful tool, such a powerful tool. Well, believe it or not, a human being, by the time he is 12 years old or she is 12 years old, aren't human beings already smart? Does it take 2,000 years worth of knowledge for a humans to become smart? No, right? 
We humans consume less energy, take less input, and become creative and generate things much better than any large language model. Right? So, you got to understand that any large language model architecture, even today, are not very effective or efficient algorithms. They take so much energy, they consume so much energy. By the way, did you guys know this? That every single time you generate an output from a generative AI system, it is not at all uh you know, cost-effective. Meaning, it burns so much energy, right? [clears throat] That somewhere around the corner of the earth, there is a data center consuming many, many, many watts of energy to just give you an answer called, "Thank you so much for joining this." Or XYZ, right? So, it's not at all environment-friendly in any way. Right? If you guys don't know, United States of Ameri- See, you got to understand all of this, guys, right? Because all these are very important for your future understanding on on recommending generative AI in the system. The reason is when you become data scientists of the future, you are not going to be like, "Oh, yeah, yeah, yeah, let's use generative AI in everything." You're going to be like, "No, it's going to consume so much energy, we don't have the money to pay for it. Let's just build a simple machine learning system." Right? So, remember, my people, it will generative AI systems are not at all cost-effective, right? Because you guys don't get to see it, there is a huge pile of garbage and huge pile of, let's say, uh immense amount of dump, right? Imagine you have this beautiful-looking city, right? And there is this huge wall, and there is a crazy amount of garbage mountain. Because your city looks pretty, you can't just suddenly say that, "Oh, you know what? Everything is good to go." But the minute you go beyond the wall, you start seeing the reality, right? And the garbage pile is just mounting up. 
What I mean by that is in generative AI systems, you know, you pay money to the company, company gives you an answer, you know, company gives you a large language model, but at the end of the day, it consumes a lot of energy. And where is the energy coming from? From some place. Who are they charging the tax for? The normal human being, right? The government does that. So, you got to understand one thing, that over the period of time, any large language model that you're going to be building or you're going to be working on, please make sure that use case on which has to be on generative AI. Meaning, if you can solve that using some basic algorithm, my recommendation is go ahead with it. Don't build or don't take any generative AI system and and waste your time, right? Go with a simple system that can solve a problem. Not every single time the answer for something is generative AI, right? For example, see here, the vast amount of data I passed as a text, as an image, as an audio, as a code, as an unlabeled or labeled data, had to go through two different training phase to give me either question answering, code generation, image generation, or text generation. And for what? Uh have you seen these ads generated by your generative AI model or the images generated by generative AI model, right? They are not there yet. Like, will you agree with me, right? Give me the state-of-the-art model that you want, right? Whichever do you want to recommend, the image generation, the video generation from these models are first of all not 1 minute, 2 minute, or 3 minute. They are only couple of seconds. Have you tried anybody? Have you tried Sora? Have you tried any other model where if you build any video generation model, they are just maximum, I think I have seen 10 seconds or 20 seconds. It's not even 1 minute. The generation of the videos are not even like 1 minute. They are lesser than that. Because, guys, even after so much architecture, so much money, right? 
Uh we don't have a model that can generate like a longer high-dimensional videos or images still at a level that we want or expect, which I believe will be solved. But the point is that it consumes so much energy that it is not effective at all. So, remember, generative AI is not this magical pill that you guys think you should be using, okay? But generative AI, let's not deny, it's very powerful, right? Very recently, I'll give you a small example, okay? A small example. I work on a platform, okay, in my company, which is able to analyze fraud. Okay? Suppose, if you guys are going to submit a document to my company, okay? A document to my platform. And this document is either photoshopped, this document is either manipulated digitally, or this document has false numbers, right? No matter what your document is, I will catch you, right? I will figure it out what you've done. You've done some changes in the document, and I'll catch it. Now, remember, doc fraud is not a very easy thing. You can always be smart and skip the AI, but AI is smarter. It knows what humans do, right? Or how humans are capable of producing fraudulent documents so that the banks don't lose money, and we flag a fraud activity in the very beginning, saving company millions of dollars, right? My entire platform is generative AI heavy. Meaning, I use lot many generative AI applications to provide fraudulent activities, and right now I am at a 98% peak accuracy. Meaning, out of 100 documents, 98 documents I am able to identify. I'm able to flag. Meaning, my false positives are so low, and I'm able to perform better. My tool is able to perform better that I can catch fraud really, really well, right? So, my point is it's not that generative AI is not at all helpful. It's not at all, you know, going to make your future well. But the most important thing that you got to understand is that you got to learn how to utilize it in which place, okay? That becomes more important. 
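"98 out of 100 documents" and "low false positives" are related but different measurements, and it helps to compute them separately. The sketch below uses made-up counts (100 documents, 20 of them fraudulent) purely to show how the numbers relate; these are not figures from the platform described in the talk.

```python
def metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard confusion-matrix metrics for a fraud / not-fraud classifier."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,          # share of all documents judged correctly
        "recall": tp / (tp + fn),               # share of real fraud that was caught
        "precision": tp / (tp + fp),            # share of flags that were real fraud
        "false_positive_rate": fp / (fp + tn),  # share of genuine documents wrongly flagged
    }

# Hypothetical counts: 20 fraudulent and 80 genuine documents.
m = metrics(tp=19, fp=1, tn=79, fn=1)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these counts the accuracy works out to 98 correct decisions out of 100, while recall and the false positive rate tell you separately how much fraud slipped through and how many honest documents were flagged.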
So, in order to understand that, let's proceed forward with understanding how are we going to take generative AI and how we are going to learn it, okay? How to take fraud in my account, how to trace it. It doesn't work like that. It's a bank-to-bank policy. I mean, you can fool couple of banks very but depends on what bank it is, right? Like if you use my bank then you will not be able to because this tool is currently active, but other banks might not have it. Like smaller banks don't have it. So uh you can maybe I'm not encouraging you to try this, but yeah, it's it's possible that other banks don't have it. Humans will be looking into it and maybe they can trace that, right? How to trace out a person? I have been working on this tool for 1 and 1/2 year. So it would be fairly hard for me to give you an idea just in 5-10 minutes, but what we do is we basically break the documents into multiple different smaller documents, and those smaller documents are basically then gone through a small process call it normalization where we look at the document, we look at the chunk, we check whether if something has been changed, we look at the metadata, we do many other things, and we check whether the doc is fraudulent or not. Okay? I mean, it's complicated than that, but I mean, on a short I'm basically telling you. How do you does your do you Yeah, yeah, yeah, we do a lot of fine-tuning. So basically we fine-tuned our model on a lot of fraudulent documents from paystub salary slips to bank statements, Aadhaar card, PAN card form 1090, form 1099 uh fraudulent documents, and it it is basically uh passed through multiple different checkers or multiple different checks to basically analyze and flag whether something is fraud or could be fraud. It's a beautiful platform. I mean uh I am proud of it. It's it's something that has taken a lot of time, but I'm glad it is kind of making uh or it's helping the organization do the right thing, right? 
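The flow described above (split the document into smaller pieces, normalize them, look at the metadata, run several checks, then flag) can be sketched as a small pipeline. Everything here is hypothetical: the two rules are simple hand-written stand-ins, whereas the platform in the talk is said to rely on fine-tuned models, and the metadata keys (`producer`, `created`, `modified`) are assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    metadata: dict[str, str] = field(default_factory=dict)

def normalize(chunk: Chunk) -> Chunk:
    """Collapse whitespace and lowercase so checks compare like with like."""
    return Chunk(" ".join(chunk.text.split()).lower(), chunk.metadata)

def edited_in_image_tool(chunk: Chunk) -> bool:
    # Hypothetical rule: an image editor named in the metadata is suspicious.
    return "photoshop" in chunk.metadata.get("producer", "").lower()

def modified_before_created(chunk: Chunk) -> bool:
    # Hypothetical rule: ISO dates sort as text, so a plain comparison works.
    created = chunk.metadata.get("created")
    modified = chunk.metadata.get("modified")
    return bool(created and modified and modified < created)

CHECKS = [edited_in_image_tool, modified_before_created]

def flag_document(pages: list[str], metadata: dict[str, str]) -> list[str]:
    """Return the names of the checks that fired on any chunk of the document."""
    fired: list[str] = []
    for chunk in (normalize(Chunk(page, metadata)) for page in pages):
        for check in CHECKS:
            if check(chunk) and check.__name__ not in fired:
                fired.append(check.__name__)
    return fired

flags = flag_document(
    ["Pay  Stub\nNet pay: 5,000"],
    {"producer": "Adobe Photoshop", "created": "2025-01-10", "modified": "2025-01-12"},
)
print(flags)
```

Keeping each check as its own function makes it easy to add or swap checks, for example replacing a hand-written rule with a call to a fine-tuned model, without touching the rest of the pipeline.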
So, from where did you get this data? We never got the data, we purchased the data. We bought the data. The platform was built with an initial investment of about 4 to 4.5 million, and once the money and the data were used, we were able to generate a certain amount of revenue back, and now we cater to many banks other than us. If you have never heard of this, I'm just going to show you the platform also. It's called accelerator.ai, okay? This is for an organization called EXL. It's a big organization. It's also one of the most amazing organizations on this planet. This company is very huge, and if you have heard of EXL, you know it's a big, big organization. They have one of the most advanced tools. It's called accelerator.ai, and one of the integrations of the tool is with EXL. Got it? Okay, lovely. Now everyone, listen to me. Go to something called Google AI Studio. On the right-hand side, you will see something called Nano Banana. Click on it. You are going to find something called Gemini 2.5 Flash. Click on it. Now that you're here on Gemini 2.5 Flash, towards the bottom there is something called "Get API key". Click on "Get API key", and here generate or create an API key for yourself. Here, type anything, whatever you want to type. Write that, add it, choose "create API key in existing project", or create a new one if you don't have an existing project, and click on "Create API key". Just create the API key. Don't give your API key to anybody. Don't do that, otherwise it's like giving away your credit card details. Once it generates, you have the API key towards the bottom. Can you see it here, towards the bottom? Okay. Why am I teaching you guys how to generate your key? Because I did not want to hand you my key and then be like, "Okay, you know what? Let's just work on it."
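Since the key works like a credit card number, a common habit is to keep it out of source code and read it from an environment variable. The sketch below only builds the request and does not send it. The variable name `GEMINI_API_KEY`, the endpoint URL, and the `x-goog-api-key` header are assumptions based on the public Gemini REST interface at the time of writing, so check the current documentation before relying on them.

```python
import json
import os
from typing import Mapping, Optional

# Assumed names for illustration; verify against the current Gemini API docs.
ENV_VAR = "GEMINI_API_KEY"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-2.5-flash:generateContent"
)

def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the key from the environment instead of hard-coding it."""
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"set the {ENV_VAR} environment variable first")
    return key

def build_request(prompt: str, api_key: str) -> tuple[str, dict[str, str], bytes]:
    """Return (url, headers, body) for a single text prompt, without sending it."""
    headers = {"Content-Type": "application/json", "x-goog-api-key": api_key}
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]}).encode("utf-8")
    return ENDPOINT, headers, body

def mask(key: str) -> str:
    """Show only the first characters so the key never lands in logs."""
    return key[:4] + "****"

# Demo with a placeholder key; a real run would call load_api_key() instead.
url, headers, body = build_request("I want to make a pizza. How do I cook it?", "demo-key-not-real")
print(url)
print(mask(headers["x-goog-api-key"]))
```

The returned URL, headers, and body can be handed to any HTTP client. Keeping the key in the environment also means the same script works for every student with their own key, with nothing shared.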
From now onwards, you guys are going to build your own API keys, you guys are going to be building your own generative AI systems. You have no dependency with me, I have no dependency with you. We are all independent people here, we'll be building our own stuff, right? Now, why are we using something called API? For multiple reasons, and the reason is everybody, okay? I want you to know one thing. Your large language model, you can use locally also. Your large language model, you can use the API key also. Okay? What do you mean by that? Did you know that you can build your own or you can use Chat GPT completely locally? Locally meaning you can take some open-source large language model, and you can put it or download it in your computer, right? And you can use it like a Chat GPT system. You can also do one other thing. What you can do? You can basically take some API the way you have built like Google, and you can use that API to ask a question to that API. What is that question? That question could be things like um I want to make pizza, I want to make something you know, XYZ, whatever that you want to do. So everybody, it's important to understand and analyze this, right? Large language models can be used both locally and can be used on cloud, okay? Which one should you go ahead and buy? If I ask you as a company, would you buy a large language model? Would you use a large language model locally in your computer or in your server so that you don't have to pay for the LLM? Or would you rather purchase an API where you can use the LLM? Which one would you go with? Which what approach will you go with? Will you go with the API approach or having a local large language model in your server? Let me tell you one thing. At the end of the day, everything comes down to pricing, correct? No matter what you do in the world, everything comes down to pricing. Will you agree? With pricing, it also comes down to something called safety and compliance, right? 
So tell me this, everyone. At the end of the day people who think that sir, taking a large language model of my own choice, putting it on a server, I will be able to use the large language model for free. No, nothing is free. Who's going to maintain the large language model? Who's going to take care of the GPU cost? Who's going to take care of the infrastructure cost? Are you not going to take care of all of this cost? There is going to be an infrastructure cost associated with it, right? But what about API? In the API, you have to literally spend nothing, but maybe $0.12 for maybe 100,000 tokens. They don't charge anything from you. It is literally cheap, very very cheap. So will API work in your favor? If you want to save money in the beginning, you want to save a lot of money, you don't want to worry about the infra, you don't want to deploy you don't want to take all of that, you can just take API and your life is going to be sorted. That is why 80 to 90% of any generative AI use case in the planet is working on API. Now you must be thinking, "Sir, what about data privacy?" Every single data provider has an SLM agreement with you. You know what is an SLM agreement? SLM agreement is basically that we will not take your data, your data remains your data, we don't have anything to do with your data. They provide you a digital certificate saying that we will not save your data. Only if you click on yes, save our data, we will save, otherwise we will not save the data. Meaning it is safe and secure. If your company is bonkers and Can I tell you something, guys? Believe it or not, if you go to a billion-dollar company, these companies are less scared of losing the data. Do you know what companies are scared of losing the data? It's the startups with the 30 people working in a in the corner of a building. These people are more worried. Oh, what if the company takes my data? 
A billion-dollar organization is not worried, but a startup with a small scale is much more worried that what if they take my data? Now, why am I telling you this? Because I have interacted with many companies on the planet on data security, and I have asked them one thing. Do you trust a third-party company on whether you know, would you give a data? Believe it or not, believe it or not banks were the least companies that trusted any company. Banks did not want to trust either American companies, let alone Chinese companies. Nobody wanted to trust Chinese companies. Nobody on this planet was like, "We will give our data to a Chinese model." No company wanted to give their data to even US-based companies like OpenAI, like Anthropic, right? So basically what is happening is if you are in a regulatory business like banking, finance, or any other industry like that they would prefer even spending more money and having their own large language model deployed on the server so they have absolute 100% control. But if it's any other company, say healthcare or any other market research or any other company, they would pretty much be okay with having a large language model bought with API. Now, why am I teaching you this? The reason I'm teaching you this is because in the future you guys are also going to sit in one of the discussions with your managers where you are also going to discuss whether you need to purchase an API or whether you need to take up infra. Now guys, in my experience, local LLM deployed on a server and API, the cost difference is 32x. What do you mean by 32x? If you're paying 1 rupee here, you spend 32 rupees here. That's the difference in pricing, okay? In whatever projects I have built or I have worked on so far, this is the pricing difference I have received. The reason is GPU is very very expensive, correct? I hope you guys know that in any large language model that you want to build in right now, if if you want to build today, you need GPU. 
Without a GPU, your large language model will not perform. I can tell you this with a guarantee. You do quantization, you do distillation, whatever you want to perform, without a GPU there is no usable performance. I'm telling you this right now. Okay? So, understand one thing: APIs become your savior there. And understand, I am not saying that companies who are scared of losing their data are wrong. No, no, no. There is nothing to feel bad about and nothing to feel wrong about. The reason I'm saying this, my people, is because at the end of the day, pricing can make the company and pricing can break the company. If you are a startup thinking of creating this new platform of your own choice, and you're like, you know what, I will give my customers this and I'll give my customers that, and you have built up this dream in your mind, and after all that hope creation you suddenly get to know, oh, the API is cheaper and the large language model deployment is expensive. Now your entire dream is shattered, right? Now your entire production is going to be halted, because pricing was never something that you considered. So, remember, it's very important from today onwards that whatever agent you're going to build with me, we are not going to act like children who are building toy programs. No, no, no. We are not going to proceed like that. We are going to proceed like every single one of you is a manager or some kind of tech lead, and you're actually monitoring the pricing also. Everybody, observe. When we went to Google AI Studio, right? Please look at what these guys are saying. You will get input for $0.30, output for $2.50.
And what is the knowledge cut-off? The knowledge cut-off is January 2025. My people, this is what you have got to ask, right? This is the information that you guys have got to have. First of all, let's understand what this information means, right? Let me copy this and paste it. Has anybody thought about what this means? The input token and output token thing, and what is the 1 million token context window? Every large language model on this planet has something called a context window. What do you mean by context window, everybody? When you tell your large language model something, what is the maximum you can pass into it? What is the most you can ask at once? Here, that limit is 1 million tokens. Now, you must be thinking, oh, so basically I can write 1 million tokens, meaning token number one, token number two, token number three. But, sir, what is a token? Right? I'll go to the token calculator on OpenAI's platform. Everyone, look at this, okay? If I write the sentence, "I am happy and I feel very ecstatic talking about my country," and I press enter, how many tokens did I consume? There are 13 tokens, and it shows the character count as well. Okay. And how many tokens can Google AI Studio accommodate at once? What is the context window? The context window is 1 million tokens. Oh, so you must be thinking that I can pass 1 million tokens as input to the model and everything will work. Actually, that's not true. Tokens are counted on both input and output, not just input. Okay? Let me give you an idea. Let me tell you how the AI industry works and how they take your money, right? How they take your lunch money. Okay. If you take this sentence, roughly each word is converted into a token, my dear.
Roughly, each word with its space is a token. Are you getting what I'm trying to say? The easiest way to understand tokenization is to break the text into individual words, and each word becomes a token. Are you clear? There are only 12 words in this example. If you counted 12 words, you're absolutely correct. But the token count is 13. Why? The reason is that tokens are not produced strictly word by word. The models use something called byte pair encoding, okay? Sir, what is byte pair encoding? Suppose I say the word ecstatic, okay? It may break ecstatic into two separate pieces, "ec" and "static". We don't know from this screen which word was split into two, but one of them was, and that's why the answer is 13. The token count is not always exactly the word count, but it is usually close to it. Understanding what I'm trying to say? Which word it is breaking, we don't know. Which tokenization scheme they have used, I don't know. But sometimes they take a longer word and break it into two pieces, like "ec" and "static". Getting what I'm trying to say? If you also want to look at examples, see, you have token IDs and you also have tokens, right? And they have given you an idea here: a helpful rule of thumb is that one token generally corresponds to about four characters of common English text. This is not always true, it is just a thumb rule. Thumb rule means it is a general rule. It doesn't apply to every single text, okay? Now someone is saying, sir, you don't have a one-token example to show. So, if you want one token, I will take the word I. See? Even without any space, I is broken down into two. If I say am, that is also broken down into two.
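Here is a toy sketch of why 12 words can turn into 13 tokens. It is not real byte pair encoding (a real BPE tokenizer learns its merges from data) and it is not OpenAI's tokenizer; it is a greedy longest-match split over a small made-up vocabulary, chosen so that "ecstatic" breaks into "ec" and "static". The vocabulary and function names are assumptions for illustration only.

```python
# Hypothetical vocabulary: every word of the example sentence is known
# except "ecstatic", which is only covered by the pieces "ec" and "static".
VOCAB = {
    "i", "am", "happy", "and", "feel", "very",
    "ec", "static", "talking", "about", "my", "country",
}


def split_into_subwords(word, vocab):
    """Greedy longest-match split of one word into known pieces."""
    pieces = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            # Fall back to a single character so the loop always advances.
            if piece in vocab or end - start == 1:
                pieces.append(piece)
                start = end
                break
    return pieces


def tokenize(text, vocab=VOCAB):
    """Lowercase, split on whitespace, then split each word into sub-words."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(split_into_subwords(word, vocab))
    return tokens


sentence = "I am happy and I feel very ecstatic talking about my country"
print(len(sentence.split()), "words ->", len(tokenize(sentence)), "tokens")
```

With this made-up vocabulary the sentence has 12 words but 13 tokens, because one word is split in two. A real tokenizer may split a different word, which is exactly why you cannot tell from the count alone.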
If I say help, it is also broken down into two. So, I and am and help all have a token count of two, but character counts of five, three, and one? You're getting what I'm trying to say? Your character counts were different, but your token count remains the same, two. You have got to understand one thing: this happens because of byte pair encoding, or whatever encoding strategy the LLM is using. Every LLM has its own tokenizer. So, if I say understand, right? Can you see that the count is three? Because the word understand is being broken down. Getting what I'm trying to say? The word understand breaks down into multiple sub-words, and those sub-words are what we count as tokens. Now take the thumb rule that up to four characters of text is, let's say, one token. Understand has ten characters: the first four make one token, up to eight makes two, and the remaining two make a third, so I'm going to say three. But it doesn't actually work like this. This is not the reality. It is not always the case that four characters become one token. If that happened, I think it would become very expensive. There is a way encoding strategies work, okay? So, the point is, in large language models, whatever input you give, suppose it is how to make pizza, okay? What is this? Can I say this is your input prompt, your input tokens to the large language model? The input tokens will be charged at how many dollars, everyone? $0.30. Your input tokens are priced at $0.30, and whatever output you get from your large language model, the pizza recipe, is your output tokens. And what is the output token cost? $2.50. Remember, with almost any company you see on the planet, the output token cost will be higher and the input token cost will be lower. Together, your input and output tokens count against something called the context window.
Because, guys, if the context window is 1 million, remember your input plus output tokens together cannot exceed 1 million. Got it? Sir, what does this input 30 cents mean? This 30 cents doesn't mean that for the one word I it will take 30 cents. No. The card here doesn't spell it out in detail, but the price is quoted per 1 million tokens: 30 cents for 1 million input tokens and $2.50 for 1 million output tokens. It is not going to be like per word they are going to charge 30 cents. No, no, no. They are going to charge very little. So, basically, this model is a cheap model. It's very, very cheap. It's nothing. This is a throwaway price. And when was the model last trained? Up to which time does this model have information? January 2025. Whatever happened in February 2025, does this model know? No. The last knowledge of the model is only January 2025. This, my people, is what we call the knowledge cut-off. And the name of the model is Gemini 2.5 Flash. Understood? Now, everyone, listen to me very carefully. What you have taken in the class today is also Gemini 2.5 Flash, okay? What we are going to be using is either the Pro model or the Flash model. We will use these two models to get our answers and see how the answers are, because every day the token count gets reset for Flash and Pro, meaning if you use some of the count today, tomorrow everything will be fresh and brand new. But if we take any other model, then it doesn't have that, right? So everybody, you all created your API key. Can you see something called chat completion API?
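If you are going to monitor pricing like a tech lead, the arithmetic is worth writing down once. This is a small sketch that assumes the $0.30 and $2.50 figures are prices per 1 million tokens; the constant and function names are made up for illustration, and you should check the current pricing page before relying on the numbers.

```python
# Assumed prices in USD per 1 million tokens (from the figures quoted above).
INPUT_PRICE_PER_MILLION = 0.30
OUTPUT_PRICE_PER_MILLION = 2.50


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD of one request, billed separately on input and output."""
    input_cost = input_tokens * INPUT_PRICE_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_MILLION / 1_000_000
    return input_cost + output_cost


# A request with 100,000 input tokens and 100,000 output tokens.
print(f"${request_cost(100_000, 100_000):.2f}")
```

Notice that the same number of tokens costs far more on the output side than on the input side, which is why limiting the output length is usually the first lever for controlling spend.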
Did I make you guys create an API so that in the future we are going to be using an API but use a local LLM also meaning I don't want to pay some money to open AI. I want to just use a local LLM. I don't want to use any notebook LM or any local inference. I just want to download it and use it. We can also look at that, okay? It's very very simple, okay? And the minute I teach context window then I might feel that I will be like okay I might feel that I have completed this section, right? So first of all everybody let me tell you a little something, okay? First of all what do you mean by context window, okay? With respect to with respect to your input token and your output token in any large language model, okay? See first of all I want you to understand one thing. In any large language model will you be able to generate anything without passing some text? Can I say I will pass some text in a generative AI model or I will not pass anything it will automatically understand from my brain what I want. Can any generative AI model work like that? No, okay. So sir what do you mean by context window? Everybody please observe context window is the maximum is the maximum total number okay? Total number of tokens both input token and output token included that the LLM or your large language model can process, okay? In a single conversation. In a single conversation when you pass something to the large language model how many input and output token can it handle at once? Sir what do you mean by that? Please remember what becomes your input token? Your input token in is anything that is your prompt plus your conversation's history. Your past history and your prompt together can be passed as an input token. What is the output token? Your model's response okay? Your model's response is what we call as an output token. So let's say if the context window or the context window of a model is 128,000, okay? This is the context length. 
Your model's input or the prompt plus the model generated output cannot go beyond or should not go beyond 128,000 tokens. Understanding what I'm trying to say? If it goes beyond 120 128,000 tokens it will break the conversation and only take 128,000 tokens. The rest of the tokens will be ignored. So please remember your input token and your output token together should not go beyond the context length or the context window of the model which they have provided or which is provided to you in a model. You always got to maintain the same thing. If you don't maintain then there is a problem. Understood? That is what we call as a context window. When you ask a chat GPT model I love India. How many context window did you give or what is the token count that you give as an input? Three. You passed it to the LLM and the LLM gave the answer even I love India. Even I love India, okay? It is a beautiful country. 1 2 3 4 5 6 7 8 9. What is the output token generated at the end? Nine. What is 3 + 9? 12. So the total total total token counts that you used is 12. Understood? Not three, not nine but 12. You have to add them together. So your input model you passed three tokens output you generated nine together you have used 12. Understood? You have to add 9 + 3 and you make it 12. Understood? Whatever conversation you have passed in the bot in the history now for example hi it will respond hi. How are you doing? It will respond how are you doing? I have a question. Whatever that user history you have now you can pass that. How we can control that output token? My dear it's very easy. You can set up a limit. If any question passed by the user or any answer by the user you can use something called maximum length and I'm going to I will teach you how to do that. I I I you can give something called max token and max length. Okay? Unlimited tokens you can never get in the world now. Unlimited tokens nobody will give you now. 
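The counting rule above, input plus output together must stay inside the context window, can be sketched in a few lines. The function names are hypothetical, and the 128,000 default is just the example figure used above, not a property of any particular model.

```python
def total_tokens(input_tokens: int, output_tokens: int) -> int:
    """Tokens used by one exchange: input and output are added together."""
    return input_tokens + output_tokens


def fits_in_window(input_tokens: int, output_tokens: int,
                   context_window: int = 128_000) -> bool:
    """True if the exchange stays within the context window."""
    return total_tokens(input_tokens, output_tokens) <= context_window


def output_budget(input_tokens: int, context_window: int = 128_000,
                  max_output_tokens=None) -> int:
    """How many output tokens are still available after the input is counted.

    max_output_tokens plays the role of the max token setting mentioned above:
    a cap you choose yourself, applied on top of the window limit.
    """
    remaining = context_window - input_tokens
    if remaining <= 0:
        raise ValueError("input alone already fills the context window")
    if max_output_tokens is None:
        return remaining
    return min(remaining, max_output_tokens)


print(total_tokens(3, 9))              # the "I love India" example: 3 in, 9 out
print(output_budget(100_000, 128_000)) # room left for the answer
```

The design point is that the output cap is the part you control: the window is fixed by the model, the input is whatever the user and the history add up to, and the cap keeps the answer from eating the rest.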
Like world may 1 million 2 million 30 million tokens nobody will give you unlimited tokens in large language models now, okay? So let's say this is a okay this is your mama, okay? Your mom. Let's say this is my mom. Okay? I tell my mom about how my day went, okay? How my day went. Mama I woke up at 7:00 a.m. and I did this and 8:00 a.m. I did this and okay and 9:00 a.m. I did this and 10:00 a.m. I did this. If I tell her every 1 hour what did I do will she get overloaded with the information? She'll not be able to use she'll not be able to understand anything. Tell me my people at once your mom will be able to hear only a generic advice now like mama overall what did I do today? Maybe you can speak for 2 minutes. Maybe you can speak for 3 minutes. If you speak for 30 minutes she'll not be able to take it. Think of context window like that. In a large language model what happens is when you take an LLM in single question and single answer, okay? Passing a question and getting an answer you cannot go beyond 120,000 tokens. Please tell me this 120,000 is what? Is the token limit at once but if you want to break the question into multiple questions you can do that but at once like how you could not tell your mama at once what did you do the whole day every hour your mom will be able to listen what did you do in 7:00 to 8:00? Later after 5 minutes she will take 8:00 to 9:00 but she can't take at once everything. At once to your large language model you can't pass anything beyond 120,000 token. Meaning your input token could be 30,000 token your output token could be 40,000 token. Are you still within the limit of 120,000 token? Are you still within the limit of this? See 40,000 + 30,000 is 70,000 now. Is it lesser than 120,000 token? Yes. 
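The "tell your mom one hour at a time" idea is just chunking: if the whole story does not fit at once, break it into pieces that do. Here is a rough sketch that treats one word as one token, which is a simplification (real token counts come from the model's tokenizer); the function name is made up.

```python
def chunk_by_token_budget(text: str, max_tokens: int) -> list[str]:
    """Split text into consecutive chunks of at most max_tokens words.

    One word is treated as one token here, purely as an approximation.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


day = "7am woke up 8am ate breakfast 9am went to work"
for piece in chunk_by_token_budget(day, max_tokens=3):
    print(piece)
```

Each chunk can then be sent as its own question, so no single request goes beyond the limit.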
If you ask a second time, the count resets and you again get 120,000 tokens. Meaning, input and output together can't go beyond 120,000 in one go, but you can ask as many times as you want, because daily you have a limit of 1 million, 3 million, 4 million. The total is practically unlimited, but each single question and answer cannot go beyond 120,000. Understood what I'm trying to say, everybody? Did you get the logic? Input tokens plus output tokens together count against my context window. So, first of all, let me tell you this. Google, OpenAI, and many others have models with much larger windows, right? You can go far beyond that level. There is a model from Magic.dev called LTM-2-Mini. Check this model. It has a 100 million token context window. Then you have Meta's Llama 4 Scout. This has a 10 million token limit. You have Google's Gemini 1.5 Pro at 2 million. Gemini 2.5 Pro and 2.5 Flash are at 1 million. So there is not just one, there are many models like that on the planet. 100 million is the context limit. Believe it or not. I said 40, it has 100. The world has grown, my friend. Okay, everyone, we spoke about your context window. We understood how the input tokens and output tokens together make the whole thing happen. It's a little hard to grasp in the beginning, but it's a very easy concept. You don't have to overthink anything. What an input token is, what an output token is, together they count against the context window, and overall you have a token limit. If you exceed that token limit, you have to pay. Mostly, not many people exceed the token limits. It's very rare that a company or a process exceeds a token limit, because it gets refreshed.
It's re-updated every single time. So all right now guys what are we what do we work on? What do we focus on right now, right? If the token limit is reached then is it an option to increase the limit? Yes. If your token limit has reached and if you want to increase it please remember it's a pay as you go system. What is pay as you go system? As you pay or as you use the systems that much money you have to pay. So don't worry about it. Your token limits are not going to be exhausted if you don't keep a limit. Okay? See, mostly what happens now, companies keep a limit because they want to keep a track of the tokens used. Understanding what I'm trying to say? But if it's a production grade software, they remove the token limit and allow the user to ask whatever questions you want. I mean, think about it, guys. See, my application is being used by, I don't know, maximum to maximum, I guess, 800,000 clients. Right? Now, are all the 800,000 clients going to ask like one entire story book worth of content? No, right? Are they going to put so much input token that I'm going to exhaust it so soon? No, right? So, the idea here is I don't consume it that much, right? So, if you have GitHub Copilot and if you want to increase the limit, you have to utilize something called pay-as-you-go. If you set up the pay-as-you-go, you don't have to worry about it. Let's just jump into agentic frameworks directly. Okay, let's just open up Colabs. Okay? And let me open up a new notebook. Let's call this notebook agentic_frameworks and basics. Okay? And underscore underscore basics, right? Let's Let's start it like that. Agentic frameworks and basics, right? Let Let's Let's go like this. So, guys, as you have understood uh the basics or the foundations of generative AI model, I'm just going to kind of go through it bit quicker. Okay? So, first of all, teaching you a very important framework called LangChain. LangChain is this amazing software library. 
And in this library, what I'm going to be doing, any agentic thought process that you are thinking of, right? Any agent that you want to work on, you guys are going to be building on top of LangChain and LangGraph. So, I'm going to be teaching you two things right now. LangGraph and LangChain in this class. And with this, you guys will be able to create agents on, let's say, basic like your basic agents and intermediate to advanced agents, right? Also, um LangChain is one of the libraries that you guys are going to use to provide modularity in your framework. So, if you can see very clearly, right? It includes prompt templates for various use cases, your memory to store LLM interactions. LangChain, my people, believe it or not, I don't know what many people off internet is divided by saying LangChain is not a production grade software. I personally think LangChain is production-ready. It is a powerful, amazing tool. Believe it or not, some of my agents are written based on LangChain and LangGraph. Okay? And they are working wonderful and they are working amazing. There is no problem. In fact, uh you guys will believe it or not, I have used LangChain in my content delivery platform. So, basically, you must have seen my Instagram. It's been some time since I've uploaded the content. My content planner to content script maker to content checker and validator, they are all built on LangGraph, right? I have a platform that is internally built by myself where what I do is all of these actions are performed by the agents themselves. So, basically, anytime I want, I have a content ready to go. I just have to record it and I'm good to go, right? And these are all the contents that are currently ranking in the social media, right? So, the point is you can build anything that you want, which is so cool, and that too without having to worrying about too much of a cost because these are literally very cheap. I don't even pay like $4 a month. 
Not even $4, much less than that, right? So, remember, of the agentic frameworks that you find in the market, the best ones to start with are LangGraph and LangChain. Now, first of all, everyone, what is LangChain and what is LangGraph? See, first of all, LangChain is a software library, like a wrapper, designed to facilitate the creation of any LLM-powered application that you want, okay? If you want to build any agentic AI, if you want to build some kind of integration with the API key which I made you create, we are all going to be working on this, okay? All of these can be integrated using LangChain. Now, what is LangChain in very simple terms, right? See, there are pre-built chains available, optimized for different use cases. And not just your AI agents, right? Have you guys heard of a concept called RAG, which I'm going to teach you in a second, which is retrieval augmented generation? I'm going to teach you RAG, CAG, PAG, SAG, and more, right? These are some of the patterns available in the market. First of all, everybody, any AI application that you can think of, all of these can be built using LangChain. Now, everybody, in the agentic AI that I'm going to be teaching you, I'm going to be helping you to create something called a ReAct agent. Reasoning plus acting. Agents that will reason with themselves and then, based on the reasoning, act, okay? We are not going to create PAL agents because nobody in the industry uses PAL agents, okay? We do not create PAL, we use ReAct. So, I'm going to be covering something that is industry relevant, okay? And we are going to be incorporating it into chains, and we will execute something called actions, okay? LangChain is actively developed, with new features arriving all the time. It's used for fast prototyping. It is amazing. I'm going to tell you what PAL is in a second, don't worry.
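Before touching any framework, it can help to see the reason-then-act loop in plain Python. This sketch does not use LangChain or LangGraph and is not their API. The "model" is a scripted stand-in for a real LLM call, and the `calculator` tool, the `Action: tool[input]` format, and all names are assumptions made for illustration.

```python
import re


def calculator(expression: str) -> str:
    """Tool: evaluate 'a + b', 'a - b' or 'a * b' on two integers."""
    match = re.fullmatch(r"\s*(-?\d+)\s*([+\-*])\s*(-?\d+)\s*", expression)
    if match is None:
        return "error: unsupported expression"
    left, op, right = int(match[1]), match[2], int(match[3])
    return str({"+": left + right, "-": left - right, "*": left * right}[op])


TOOLS = {"calculator": calculator}


def scripted_model(transcript: str) -> str:
    """Stand-in for an LLM: first asks for a tool, then answers from the observation."""
    if "Observation:" not in transcript:
        return "Thought: I should compute this with a tool.\nAction: calculator[2 + 2]"
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I have the result.\nFinal Answer: {observation}"


def run_agent(question: str, model=scripted_model, max_steps: int = 5) -> str:
    """Loop: the model reasons, optionally picks an action, and sees the observation."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = model(transcript)
        transcript += "\n" + reply
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        action = re.search(r"Action: (\w+)\[(.*)\]", reply)
        if action is None:
            break  # the model neither answered nor chose a tool
        tool_name, tool_input = action.groups()
        transcript += f"\nObservation: {TOOLS[tool_name](tool_input)}"
    return "No answer within the step limit"


print(run_agent("What is 2 + 2?"))
```

The step limit matters in practice: every loop iteration is another model call, so it doubles as a cost guard.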
There are many different kinds of agents created on this planet, but most of them are ReAct agents, including your n8n, Gumloop, CrewAI, all of them work on ReAct, okay? The principle is ReAct. Okay? Are LangChain and LangFlow the same? No. LangChain is different, LangFlow is different, okay? They have some differences. So, first of all, everyone, when you use large language models in applications, why do these frameworks matter? Large language models have something called a knowledge cut-off. Will you guys agree with me? Any large language model built on this planet has a knowledge cut-off. What is a knowledge cut-off? Say a large language model has been trained on data up to January 2025. If you ask it who won the World Cup in March 2025, and let's say India won the World Cup, it will not know. So, tell me, everybody, your large language models will struggle with outdated information. Another challenge LLMs face is with complex math problems: they tend to generate text even when they don't know the answer. Have you ever met somebody who lies about something so confidently that you think, okay, the answer must be correct? What is happening is that your LLM is going through a problem called hallucination. Hallucination is basically your LLM thinking it has the right answer when genuinely it is not the right answer, right? So, please understand that one of the biggest challenges in the industry even today, even when you start making your own LLM applications or agentic applications, will be hallucination. The cancer of LLMs is hallucination. If anybody solves hallucination, they are going to be a multi-billion dollar company overnight. Anybody can come out and say, "You know what? If I solve the problem of hallucination, I'm done."
Hallucination is nothing but your LLMs thinking that if you ask the question 2 + 2 is equal to 4, LLM says, "No, no, no. 2 + 2 is equal to 5. I'm very, very sure about it." Meaning, first of all, it gave you a wrong answer, but it gave you a wrong answer in complete utmost confidence saying that, "No, I am telling you 2 + 2 is 5, not 2 + 2 is 4." And if you did not know this, LLMs have this problem. Whether you use Grok, whether you use Claude, whether you use GPT, depending on what kind of questions you are asking and in which condition you are asking that question, it will hallucinate and the chances are it's possible that it will hallucinate, right? So, remember, if you are using any kind of large language model and ask it a complicated question, and this mathematical question is just an example, right? Complex math problems like if you give it a research level complex math problem, the bunchy, what will happen, your LLM is not going to give you an answer in a specific way. It might hallucinate, give you some wrong answer, but it will consider that to be exactly right, right? Meaning, it will give you an answer saying that, "No, no, no, I believe it is correct, right?" So, remember, your large language models have this problem of hallucination. In fact, in fact, believe it or not, yesterday, I was about to ask a question to large language model. The The reason is, I'm going to show you an example of, I think, one of the conversation I had with the LLM, right? I asked the LLM, "10:00 a.m. in India is what time in Berlin at September?" Because I live in Berlin. I live in Germany. Uh you know, here, at 10:00 a.m. in India will be what time? It says it will be 5:30 in Berlin. And I was like, "But wait, isn't it 6:30?" And then the answer is, "Yes, you're absolutely right. I made an error in my calculation. 10:00 a.m. in India would be 6:30." What happened right now? Just yesterday, I asked it to convert the time in India into Berlin time, right? 
There is daylight saving here, right? What happened? The LLM just gave me an answer that, "No, no, no, it will be 5:30." But then when I confirmed, "Hey, are you sure it will be 5:30? Because there's a 3 and 1/2 hour difference," it was like, "Oh, no, no, sorry." And when I asked again, "Are you sure?" it was like, "Yeah, yeah, I'm sure. Let me double check. Yes, it is going to be 6:30 in CEST when it is 10:00 a.m. in India." So, see, even very big, amazing models like Claude, the Sonnet models, sometimes make mistakes in something as simple as converting what time it will be in Berlin when it is a given time in India. That conversion takes a little bit of time, right? This is how hallucinations happen in large language models. And this was just a simple example. Now imagine you're working on a complicated example. Imagine you have a bank statement of a user, okay? Suppose this is the bank statement. And I ask the large language model this question: from this bank statement, where did the customer spend the most money? Now, as per your knowledge, if you look at this particular bank statement, I see somewhere around 7,500 being the maximum, okay? Here. Will my large language model always say 7,567 on deposit? No, it might say 5,984. You're getting what I'm trying to say? The answer is one thing, and the LLM is saying something else. What your LLM is going through is what we call hallucination. Meaning, it is so confident in its answer that we believe it is right, but actually it is wrong. So, see, just using an LLM is not going to solve a problem.
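The India-to-Berlin conversion from the story above is a good case for doing the arithmetic in code rather than asking the model. This sketch uses fixed offsets, IST at UTC+5:30 and CEST at UTC+2, which is only valid while Berlin is on summer time, as it is in September. The function name and the specific date are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

IST = timezone(timedelta(hours=5, minutes=30), "IST")   # India, no daylight saving
CEST = timezone(timedelta(hours=2), "CEST")             # Berlin during summer time


def india_to_berlin_summer(hour: int, minute: int = 0) -> datetime:
    """Convert a clock time in India to Berlin summer time on a September day."""
    india_time = datetime(2025, 9, 15, hour, minute, tzinfo=IST)  # arbitrary date
    return india_time.astimezone(CEST)


print(india_to_berlin_summer(10).strftime("%H:%M %Z"))
```

For dates outside summer time, fixed offsets give the wrong answer. The standard library's `zoneinfo.ZoneInfo("Europe/Berlin")` handles the daylight saving switch automatically, provided time zone data is installed on the machine.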
So, remember, LLMs also face this problem called hallucination. Right? Now, what is retrieval-augmented generation, guys? To explain why we have RAG applications, I'm going to ask you a simple question. If you understand this, everything else will make sense. Suppose you ask an LLM, "Hey, who won the World Cup in February 2025?" but the LLM was trained in January 2025. Will the LLM be able to tell you who won? No. And because the LLM cannot answer, will you go and search the internet? Yes, everybody would say, "Yeah, I would go to the internet." But what if you're building a platform where the internet is not an option, and you have to look inside the company instead? For example, you ask, "Is Robin still the VP of sales in my company?" Will my large language model know who Robin is? In my company, let's say Piramal Finance, who is Robin? Can anybody say? Or you ask, "Hey, is Rudra Sharma still the associate vice president for the fraud department?" It will not know, right? So the LLM has to look inside my company's database, check and validate, and then return the response: "Oh, yeah, looks like Rudra Sharma is still the AVP. He has been with the company since 2022, and he's still there." So tell me, everybody: if I give my large language model the capability of looking both at the internet and inside the company, am I not building one of the best search platforms on the planet, one that can search wherever it needs to and bring the answer back?
My people, to get this kind of application up and running, we have a concept called RAG, or retrieval-augmented generation. What is retrieval-augmented generation? I'm going to show you the architecture of RAG. If everything goes well, we will also build a RAG application in this class, so there is a lot of hands-on work coming, and we will be able to build platforms like this ourselves. Now, how does RAG work? I'm going to show you a small, simple architecture, and I think you will be able to understand it very clearly. RAG architecture, right? Are you all able to see this graph in front of you? Lovely. Everybody, suppose the user asks, "Is Rudra Sharma still an AVP in the company or not?" What you have, my people, is a static database. What do I mean by a static database? Say this is the internal HR database of the company. Who is the person? What is the person's designation? Who does the person report to? All of this data is available in my database. What retrieval-augmented generation does is this: whenever a user asks a question, the question is converted into something called an embedding, a word embedding or a dense embedding. You can ignore embeddings for now. Then comes retrieval: it does a lookup in the database for this question, "Is Rudra Sharma still an AVP?" Whatever information the lookup finds, it hands back to the retrieval step.
The question and the information together, I will pass to the large language model. My large language model will summarize it and make the answer better and neater, and then I will generate the answer: "Yes, Rudra Sharma, a 31-year-old working at Piramal Finance since 2023, is still an AVP in the fraud department." So, basically, all that extra text was generated from a row in my database. I had a column for each of these things, an Excel-style data set, and I was able to convert it into an answer that reads well for the customer or the user. So whatever question I ask, I get the answer back as text, even though my database was not text. This is what your RAG does: it looks inside the company's data, retrieves the information, and gives you the answer. But my people, this looks easy, and it is not easy. I'm telling you this right now. RAG is easy to start, but it is not easy to get right. Why? The most important thing you need to understand is something called guardrails. What do I mean by guardrails? There is a company in India, one of the biggest fintech companies. Do you know what happened? When people started asking that company's bot, "How do I make pizza?", the bot responded with how to make a pizza. Meaning, they did not put in any safety at all. You could ask literally any question, and it would respond with an answer. Should your company's bot be responsible for teaching people how to make pizza? Should your brand be giving that answer? Suppose I'm a fintech company. If a user asks, "How much money do I have in the account?", that is a valid question. But if some user asks how to make a pizza?
Shouldn't my bot's response be, "Please keep your questions professional. I only respond to financial queries"? Isn't that how the bot should respond? Most of the time, when people build RAG, they don't understand that in RAG you also need to implement something called a guardrail. If you don't implement guardrails, you are screwed. Okay? And also, please tell me, my people, should every piece of information be available to the user? Suppose I am not part of the company and I ask, "What is the phone number of Rudra Sharma?" Should the bot just give away my phone number to anybody who wants it? Should it give away my email ID to anybody who wants it? No, right? So you've got to understand one thing, my people. Just because you're giving it access to the data doesn't mean your RAG has the authority to expose every piece of data in the company. You've got to be very careful, and you have to strategize. Do you think your AI will take care of this for you? No. You, as a human being, have to decide the strategy: what data to access, what data to allow, and what to answer. You have to implement those strategies. Otherwise, do you think your RAG application is perfect? No, it is not. And believe it or not, my people, I have seen many companies make this mistake. Not small companies, big companies making this stupid mistake, where somebody decided to build an application in which all the PII data is available, every piece of information is available. Meaning, you've got to design the architecture of your RAG carefully and keep asking whether it aligns with the goal or not. Until here, is everybody clear on the concept of RAG, what retrieval-augmented generation is? First it retrieves, then it augments the question with the retrieved data, and then it generates a new answer. That is what you call RAG, right?
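The retrieve, augment, generate flow and the two safety points above (stay on topic, never hand out private fields) can be sketched in a few lines. This is only a toy under stated assumptions: `HR_DB`, `ALLOWED_TOPICS`, `PRIVATE_FIELDS`, and `fake_llm` are all hypothetical names, the lookup is a plain name match instead of an embedding search, and `fake_llm` is a stand-in that just echoes its prompt where a real model would write a fluent answer.

```python
from typing import Optional

# Hypothetical internal HR table; a real system would query a database.
HR_DB = [
    {"name": "Rudra Sharma", "title": "AVP", "department": "Fraud",
     "since": 2023, "phone": "<private>", "email": "<private>"},
]
PRIVATE_FIELDS = {"phone", "email"}  # assumed policy: never leaves the system
ALLOWED_TOPICS = ("vp", "designation", "department", "report to")  # assumed scope


def is_in_scope(question: str) -> bool:
    """Guardrail: only accept questions that mention an allowed topic."""
    q = question.lower()
    return any(topic in q for topic in ALLOWED_TOPICS)


def retrieve(question: str) -> Optional[dict]:
    """Retrieve: find the employee named in the question, minus private fields."""
    q = question.lower()
    for row in HR_DB:
        if row["name"].lower() in q:
            return {k: v for k, v in row.items() if k not in PRIVATE_FIELDS}
    return None


def augment(question: str, context: dict) -> str:
    """Augment: put the retrieved facts and the question into one prompt."""
    facts = ", ".join(f"{k}={v}" for k, v in context.items())
    return f"Answer using only these facts: {facts}\nQuestion: {question}"


def fake_llm(prompt: str) -> str:
    """Generate: stand-in for a real model call; it only echoes the prompt."""
    return f"[answer generated from]\n{prompt}"


def answer(question: str) -> str:
    if not is_in_scope(question):
        return "I only respond to questions about company staff."
    context = retrieve(question)
    if context is None:
        return "Sorry, I could not find the information."
    return fake_llm(augment(question, context))


print(answer("Is Rudra Sharma still an AVP in the company?"))
print(answer("How do I make a pizza?"))
```

The private fields are dropped inside `retrieve`, before anything reaches the model, so the model cannot leak what it never saw. A keyword allow-list is a very crude guardrail and is used here only to keep the sketch short.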
Everybody, you already know that RAG connects to an external knowledge base to augment the model's existing knowledge. Meaning, RAG connects to external data and gives you an answer from it. My people, in this class I will teach you how to connect your RAG application to the internal database, but also to the internet. Have you all used something called Perplexity? In Perplexity AI you can ask any question, it will go search the internet, and it will give you the response. Now imagine you wanted to build your own Perplexity AI, where it goes to the internet and searches for the answer. If the internet is not able to give the answer, it searches the internal database. If the internal database is also not able to answer, it says, "Sorry, I could not find the information." But usually either the internet or the internal information will have the answer, right? So we will make our LLM consume the internet's knowledge, and we will make it consume internal knowledge as well. We will use Pinecone, we will use FAISS, and several other vector DBs to get these applications done. Okay? So we're going to work on them and try to understand how these applications help us perform better. RAG is done. This slide is just one example of RAG. This next one is more complicated, where I'm basically teaching you how to create your own embeddings. We don't have to learn this right now, so we can skip it and move forward. There is also something called a reranker, but I think we can skip this too. You don't have to worry about it. We can go directly to our agentic framework, which is where our agents come in. That was all the basics I wanted you guys to be prepared on. Okay. So, first of all, what are AI agents? Okay?
First of all, guys, AI agents are software programs that can interact with the environment, collect data, and use that data to perform tasks that you don't have to define step by step. The agent does it all by itself. It will set the goals, right? Humans don't set the goals now. Even AI is able to set its own goals now. But an AI agent independently chooses the best action it needs to perform to achieve the goal. For example, consider a contact center AI agent that wants to resolve a customer query. The agent will automatically ask the customer different questions, look up the information internally in the documents, and respond with a solution. Based on the customer's responses, it determines whether the query can be handled by the bot or should be passed to a human. So, basically, your AI agents are not the same thing as automations. Don't confuse automation with AI agents. Automation is basically repeating the same thing again and again, and the input is always the same kind of input. But with AI agents, you can give it literally anything, and like a human being, it will be able to handle any problem or situation that it is built to handle. Okay? Please remember, AI agents are different from agentic AI. Okay? These two terms are the most confusing ones, but they mean different things. First we will understand agents, then we will move into the agentic part. Okay? Everyone, please understand this and never forget it. This is your interview-ready slide. They will ask you this question, and you need to give the answer exactly like this. Please tell me, what are the key components of any AI agent architecture? Please remember, guys, agents in artificial intelligence may operate in different environments to accomplish unique purposes. However, all functional agents share the same components. What do I mean by components? Number one, architecture. Architecture is the base the agent operates from.
Architecture can be a physical structure, a software program, or a combination. A robotic AI agent, for example, can have sensors and software, while an AI software agent can work purely from a text prompt. So, my people, in very simple words, what is an architecture? What does your agent do? Is it going to interact only software to software, or does it also have to interact with hardware? I'll give you a very simple example. Suppose I ask my AI agent, a WhatsApp bot, "Hey, can you schedule my appointment for 9:30 tomorrow morning?" And my agent says, "Sure. I went to your Google Calendar. I have set up the appointment." Did I go to Google Calendar, open it, click a button, select something, and then set up a meeting with a title saying what it is about? No. The AI agent did this for me. So can I say this is a software-to-software AI agent, where I used WhatsApp to reach my Google Calendar? Correct? Now, everybody, imagine you have a microcontroller or a microprocessor in your house. Okay? If you say, "Can you please turn off the light in my house?" it will say, "Okay." You ask it to perform a specific task, and it performs that task for you. Or you ask it, "Depending on the mood of my house, can you set up the lights?" And it will switch your lights to a specific RGB color so that it sets the mood in the house depending on the time or the area. My point is this: your agentic AI, or any agent that you're going to build in the future, needs to have an architecture. Very simply put, a goal, and a clear intention about whether it works with software or hardware. So, what did you understand, everyone? Every agent that you're going to build is going to have what? What will you start with? For any agent, what are you going to start with?
You're going to start with the architecture. Without the architecture, don't even proceed with your agent. What is it that you want your agent to do? You need to tell it very clearly what it has to perform. Then, what is the agent function? Everyone, the agent function describes how the data the agent collects is translated into actions. Your agents will perform some actions, right? What those actions are, you need to tell the agent very clearly. Perform this. Get this. Do this. Delete this. Add this. Whatever you want your agent to perform, you call that your agent's action. So whenever your agent collects some data, what action does that data turn into, and does that action support the objective of the agent? When designing the agent function, please remember, the developer considers the type of information, the capabilities of the AI, the knowledge base, the feedback mechanism, and the other technologies involved. Okay? Remember that agent functions are not created out of thin air. You've got to think, strategize, and then come up with the right design. You're not just going to say, "Oh, you know what? I'm going to connect this with this. I'm going to build this with this." You've got to strategize everything, and this strategy is important. If you don't do this, you're pretty much going to be screwed. Okay? Then the third thing is what we call the agent program. What do I mean by agent program? Your agent program is the implementation of the agent function. It involves developing, training, and deploying the AI agent on the designated architecture, where the program aligns with the agent's business logic, technical requirements, and performance elements. Meaning, your agent program is the program that's going to execute your agent.
It's going to initialize your agent. Without the agent program, there is no agent. Okay? So, remember, these are the three things I want you to keep in mind every single time you build an agent. It is a very simple theoretical concept, but without it, building an agent becomes complicated, because there is a chance you build an agent and go wrong with it. Okay? Now, after you have built this, the agent also has to do three things: determine the goal, acquire information, and implement tasks. These three, my people, are very simple to understand. I don't think I even need to go into depth. You've got to understand what goal your specific instructions gave the agent, and based on that, whether it has performed or not. Does it need to acquire any information? For example, the agent might say, "No, I cannot do anything unless I get some more information." Unless I get the information, how am I supposed to proceed? So acquiring information is important, and implementing multiple tasks together is also important, because agents can connect with other agents. There can be not one but multiple agents working together, and these agents determine whether the task is complete or not. Now, what different types of agents exist in the market? Please remember that you have simple reflex agents, model-based reflex agents, and goal-based agents. What do I mean by that? A simple reflex agent operates strictly on the basis of predefined rules and its immediate data. What do I mean by immediate data? Whatever data I provide is fixed, and whatever task I have given is also fixed. It will not respond to situations beyond a given event-condition-action rule. Hence, these agents are suitable for simple tasks that don't require any complicated or extensive training.
For example, you can use a simple reflex agent to reset passwords by detecting specific keywords in a user's conversation. So far, what do you understand about simple reflex agents? Understand the definition. What am I trying to do in a simple reflex agent? Am I working with something as simple as an if-then condition? Take automatic doors. If you have seen an automatic door system: if my motion sensor detects movement, then open the door. Can you see the if-then relationship? If a certain trigger fires, then perform some action. I'll give you another very simple example. Have you all received a lot of WhatsApp spam nowadays? Somebody's offering you a job on WhatsApp. Somebody's saying, "Hey, I'm so-and-so. I want to discuss money with you." Even WhatsApp is now starting to fill up with spam, right? Loans, stock trading, XYZ nonsense. If a specific keyword shows up in a WhatsApp message, are you automatically going to mark that message as spam and block the sender? Can you perform this simple agent action? You all have access to WhatsApp through APIs. There are platforms that give you access to the WhatsApp API. One is called WATI. WATI is a third-party tool. What you do is, if somebody sends me an email or a WhatsApp message where the sentence starts with a specific keyword or phrase, then I automatically block you and tag you as spam. Can I say, did I require any external help from an LLM? No.
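The if-then rule described above fits in a few lines. This is a minimal sketch of a simple reflex agent, assuming a hypothetical keyword list and action names; it does not call WhatsApp, WATI, or any real messaging API.

```python
# Hypothetical rule list; a real deployment would maintain its own.
SPAM_KEYWORDS = ("loan", "stock trading", "job offer")


def simple_reflex_agent(message: str) -> str:
    """Condition-action rule: if a spam keyword is present, then block."""
    text = message.lower()
    if any(keyword in text for keyword in SPAM_KEYWORDS):
        return "block_and_mark_spam"
    return "allow"


print(simple_reflex_agent("Get an instant LOAN approved today"))  # block_and_mark_spam
print(simple_reflex_agent("See you at 5"))                        # allow
```

The agent keeps no memory and looks only at the current message, which is what makes it a simple reflex agent.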
My agent was just able to identify whether this text was spammy or not, and based on that, was I able to trigger a block? What was my agent's action? The action was to block, correct? Think of it like authentication, guys. Tell me, everybody, do you all have something called a system firewall on your laptop? Think of it like that. If the IP is whitelisted, okay, no problem. Visit the link, I have no issues. But if the IP does not have a DigiCert certificate or is not whitelisted, don't you think I will just go ahead and say, "Sorry, boss, not allowed"? So, don't you think this is a very simple agent? It's like an automation platform, but here I'm using the power of an LLM. That's the only difference. I'm using the LLM to make certain decisions, but the idea is that it's a very simple agent. So tell me, in the simple reflex agent, are you doing anything complicated? What kind of relationship does it follow? It follows the if-then relationship. If the if-condition is satisfied, then a certain operation is performed. Now, my people, your model-based reflex agent maintains an internal state and updates it based on its actions and perceptions. Meaning, if I explain it in a very simple way, it has something called an internal state. Think of the internal state as internal memory. Internal memory plus current perception, that is, whatever input you have just received: based on these two, it takes an action. Okay? For example, tell me, everyone, in a smart home security system, what is the internal state? Some doors might be open, some doors might be locked.
Can I say some doors are open, some windows are open, and some are locked? Okay. Do I need to track which doors and windows are open and which ones are closed? Can I say this is my internal state? My internal information, my internal memory: how many windows are open and how many are closed. Right? Now, what is the current input, the current perception? Somebody came to the door, and that person's face is not recognized by facial recognition. What will the agent do? Based on not recognizing the person at the door, can it trigger an alarm, or automatically close the windows and doors and lock them down? Are you understanding what I'm trying to say? Your model-based reflex agent is an internal state plus current perception model. I first have internal information about my system, plus a new input to the model. Based on that, I take an action. Now, the beautiful thing is you don't have to define every action explicitly. It will decide by itself which action to take. For example, if the alarm is ready and good to go and the motion sensor fires, trigger the alarm and close the doors. If it is somebody I know, then don't trigger the alarm and don't close the door. Maybe open the front door and let the person in. But maybe before opening the front door, ask somebody, "Should I let this person in?" Maybe on an app, maybe on WhatsApp, maybe anywhere. The point is, the triggers depend on the situation, and that is why you build model-based reflex agents. Please understand, model-based reflex agents are quite simple to build. They are like simple reflex agents, just a little more complicated, and you have to define very clearly what your agent needs to do. The hardest kind of agents to build are the goal-based agents.
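The "internal state plus current perception" idea for the smart home can be sketched as a small class. Everything here is a hypothetical illustration: the entry names, the known-faces set, and the action strings are assumptions, and the hand-written rules stand in for whatever decision logic a real agent would use.

```python
from dataclasses import dataclass, field


@dataclass
class HomeSecurityAgent:
    # Internal state: which doors and windows are currently open.
    open_entries: set = field(
        default_factory=lambda: {"kitchen_window", "back_door"})
    known_faces: frozenset = frozenset({"alice"})

    def act(self, face_at_door: str) -> list:
        """Combine the internal state with the current perception."""
        if face_at_door in self.known_faces:
            # Known visitor: no alarm, but confirm before letting them in.
            return ["ask_owner_before_opening_front_door"]
        # Unknown visitor: close whatever the state says is open, then alarm.
        actions = [f"close:{name}" for name in sorted(self.open_entries)]
        self.open_entries.clear()  # the internal state is updated by the action
        actions.append("trigger_alarm")
        return actions


agent = HomeSecurityAgent()
print(agent.act("stranger"))  # ['close:back_door', 'close:kitchen_window', 'trigger_alarm']
print(agent.act("stranger"))  # ['trigger_alarm']
```

The same perception produces a different action list on the second call because the internal state changed in between. A simple reflex agent, which has no state, could not behave that way.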
Now, my people, this is one of the bigger problems, okay? Why a bigger problem? Because in goal-based agents, the agent will do anything possible to achieve the goal. And when I say anything, I literally mean anything. It does not have to have a fixed plan or a fixed set of actions. If you set them, good. If you don't, that's okay too. The point is, in a goal-based agent, you give it a goal, you give it the current internal state, you give it the available actions, and it will perform the best possible action. Meaning, instead of just giving you the answer you expect, it does something called reasoning plus action. It reasons with itself. It gives itself advice. "No, I don't think you need to do this. Oh, I think this is a better idea. Okay, then go ahead and do this." Goal achieved. Goal-based agents are the most complicated agents being built in the market right now. Most agents that you see in the market right now are also goal-based agents. They connect software to software, software to hardware, or a mix of hardware and software. There are multiple kinds of agentic frameworks that you can use. Out of them, goal-based agents are the most widely used agents in the market right now. I'll give you a very simple example. Have you all been to a platform called n8n, by any chance? Let me go to my instance and show you one simple example. If you look at this workspace that I have created in n8n, I'm going to show you what it does. Let me open up an n8n agent and show you an example of a simple agent. Yeah, this is a simple agent. AI WhatsApp, okay, lovely. I'm going to copy this and paste it. Everybody, observe this.
Okay, everyone. This is what your n8n workflow will look like, okay? If you have worked with n8n, good. If not, it's okay. Don't worry about it. See, this is where your input goes. Are you able to see the arrow? You are going to receive some chat message. Your AI agent is connected to an OpenAI model, a simple memory, and a tool called "send a message in Slack", which sends a message and waits for the response. See what's going to happen. When a message is received, the AI agent first goes to the chat model and asks it, "Hey, I have received this message. What do I do?" The OpenAI model then instructs the AI agent to send a message in Slack. Once that message is sent on Slack, the agent waits for a reply to come back, and then it performs some actions. So, the reason I wanted to show you this is that goal-based agents, or any agents that you build, are built like this. In your mind, at least, you need to build it like this. First, I will do this. It will go here. Then the process will go here. It will save the memory here. Then it will wait until the message gets a response. Then it will go to the LLM again and give the response. Meaning, in your brain, you need to come up with the strategy of exactly what needs to happen. That strategizing becomes very important, right? All the goal-based agents that you see are based on this kind of strategizing. Now, everyone, you cannot build an AI agent without learning something called design patterns. If you don't learn design patterns, there is no building agentic AI. So, for now, I'm going to teach you what the design patterns of agentic AI frameworks are and how you use them. The first one is what we call the reflection pattern. Everyone, it's very simple. Very, very simple.
When a user prompts something, an output text is generated. Then I reflect on whether my output is correct. We call this the maker-checker principle. What do we call it, everyone? The maker-checker principle, the maker and the checker. What do I mean by maker and checker? Think of it like this, guys. Your ChatGPT is generating the answer. The question from the user was, "How do you make the best pizza?" The best vegetarian or vegan pizza, okay? This was your question. Your OpenAI model, your ChatGPT, generated the answer for making the best pizza in the world. ChatGPT created it, so it is the maker of that pizza answer, correct? Your GPT or your 4o mini model is the maker. But once you have generated the answer, are you not going to pass both the question and the answer to another large language model? Let's say Claude, and we let Claude judge. "Claude, can you please judge and tell me, did GPT do a good job?" So tell me, is this not going to check your answer? We call this one the checker. So this is basically the maker-checker principle. One LLM generates the answer, the other LLM judges the answer. You can use the same LLM for both, or you can use a different LLM. But what is happening in reflection? The user prompts, I generate something, I reflect on it, and the reflected text is passed back into the model. I keep repeating this until I get a satisfactory result. Once I get a satisfactory result, what do I do? I give the response to the user. Understanding what I'm trying to say? The biggest problem in the reflection pattern is the infinite loop. How do you break out of this loop? Have you ever thought about it? Isn't it possible to give it a condition that it will never reach? And because it will never reach it, your agent is now stuck in an infinite loop. Yes or no? This happens with your while loop.
This happens with your recursion. This happens with memoization, right? In data structures and algorithms. So what do you do? In your recursion, do you write something called a base case? Right. Remember that, guys, in your reflection pattern. Let me ask you a question. When you hit an API, does it not sometimes return an error, a 500 error, an internal server error? Often that is because too many people are using the API. So do you hit the API once and be done with it, or do you set a maximum number of tries? You first hit the API once, and it returns status code 500. Then you wait for 2 seconds, 3 seconds, 1 minute, and you hit the API again. And then it returns status 200, okay. Meaning, do you not let the API call retry maybe 10 times, 15 times, before finally saying, "Sorry, API not working"? Yes or no? In the same way, can you tell the reflection pattern, "Look, I'm going to give you a maximum of 20 retries. Within 20 retries, you have to try your best to give me the answer"? Can I do it like that, rather than letting it go into an infinite loop, wasting the customer's time and your own, and wasting money? Because, everyone, listen to me carefully: every single time your LLM is called, are you not paying money? Are you not spending that 30 cents for the input and $2.50 for the output? So tell me, do you want the user to get unlimited access to your bank account? If you have that much money, please transfer it to my account. I will at least buy a coffee for myself, right? What am I trying to say? No organization should run a reflection pattern without a break condition. If you don't have a break condition, you are pretty much going to have a problem. So, understand, based on your goal in the reflection pattern, how many iterations can you allow? Okay?
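The maker-checker loop with a hard cap on iterations can be sketched as below. The `maker` and `checker` arguments are hypothetical callables standing in for two LLM calls, and the toy versions at the bottom exist only so the loop can run without any API; the default of five iterations is an assumption, not a recommendation.

```python
from typing import Callable, Tuple


def reflect(prompt: str,
            maker: Callable[[str], str],
            checker: Callable[[str, str], Tuple[bool, str]],
            max_iterations: int = 5) -> Tuple[str, int]:
    """Maker-checker loop; max_iterations is the break condition."""
    current_prompt = prompt
    for iteration in range(1, max_iterations + 1):
        draft = maker(current_prompt)                    # maker generates
        approved, feedback = checker(prompt, draft)      # checker judges
        if approved:
            return draft, iteration
        # Feed the critique back into the next attempt.
        current_prompt = (f"{prompt}\nPrevious draft: {draft}"
                          f"\nFeedback: {feedback}")
    raise RuntimeError(
        f"No satisfactory answer within {max_iterations} iterations")


def toy_maker(prompt: str) -> str:
    # Toy stand-in: gets the answer right only after seeing feedback.
    return "4" if "Feedback:" in prompt else "5"


def toy_checker(question: str, draft: str) -> Tuple[bool, str]:
    # Toy stand-in for the judging model.
    return draft == "4", "Recheck the arithmetic."


answer, used = reflect("What is 2 + 2?", toy_maker, toy_checker)
print(answer, used)  # 4 2
```

Because every iteration costs two model calls, the cap also bounds the spend per user request. When the cap is hit, the function raises instead of looping, and the caller can decide whether to apologize to the user or hand off to a human.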
How can we justify the best output? Let's say you set the maximum to five iterations. If it generated the answer in three, inside those five, you can say the answer is acceptable. But tell me one thing: do you have to perform validation and evaluation of your model? What you need to do, my dear, is test the reflection pattern out even before you ship it. In software engineering, we call this a dry run. I'm sure you know what a dry run is, right? Rather than letting the agent answer customers directly, we take multiple test cases, dry run them on our model, and see if the responses are correct. If the responses are full of hallucinations, we don't give it to the customer. If they are not, we pass it to the customer. If ChatGPT hallucinates, your Claude will fix it. If Claude hallucinates, ChatGPT will fix it. This is why the maker-checker principle is important. You're getting what I'm trying to say? Both hallucinating at the same time is not impossible, but it is very rare. It is unlikely that both of your large language models will hallucinate on the same question at the same time. So, to avoid hallucination, we use maker-checker. Why do you think we have something called the reflection pattern? It reflects on its own answer. And dry run, what is it? A dry run is basically taking test cases, say 50 test cases that you can think of, passing them into your large language model, and manually validating the answers. This is right, this is right, this is right, this one it got wrong. Wherever it went wrong, you write the prompt again, tune it once more, and then finally get the answer. Getting what I'm trying to say?
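A dry run over a list of test cases can be sketched as a small harness. This is only an illustration under stated assumptions: `toy_model` is a hypothetical stand-in for the real model, the test cases are made up for the example, and an exact string match against an expected answer stands in for the manual review described in the class.

```python
from typing import Callable, List, Tuple


def dry_run(model: Callable[[str], str],
            test_cases: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Run every test case and collect (question, expected, actual) failures."""
    failures = []
    for question, expected in test_cases:
        actual = model(question)
        if actual != expected:
            failures.append((question, expected, actual))
    return failures


# Hypothetical canned answers; the second one is wrong on purpose.
CANNED = {"2 + 2": "4", "10:00 IST in CEST": "05:30"}


def toy_model(question: str) -> str:
    return CANNED.get(question, "unknown")


cases = [("2 + 2", "4"), ("10:00 IST in CEST", "06:30")]
for question, expected, actual in dry_run(toy_model, cases):
    print(f"WRONG: {question!r} expected {expected}, got {actual}")
```

Each failure points at a prompt that needs rewriting before the agent goes in front of customers. Free-form LLM answers rarely match a reference string exactly, so in practice the comparison step is where a human reviewer or a checker model would come in.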
A dry run is basically a test drive of your own car, the model that you have prepared, in as many cases as you can, so that you can check whether the model's performance is good or not. Understood, everybody? Did you get the logic up to here, up to the reflection pattern? But if you are using ChatGPT in both places, there is a good chance both will hallucinate the same way. That is why in maker-checker you don't take two LLMs of the same family. You take two different LLMs, so that the checker brings a different perspective. In LLMs, you can monitor multiple other things as well. In my PPT, if you scroll further down, I have also covered fine-tuning, LoRA, QLoRA, quantization, and everything, which of course I'm not going to teach now. But somewhere further down I have also covered how to monitor this with metrics. For example, can you see? Contextual accuracy, compliance assurance, adaptability, harm prevention, risk mitigation. These are some of the guardrail principles. So you can also monitor and measure some of the ethical constraints in your model. Don't worry, I'll teach you how to do that, okay? Now, these topics are very important for all of you to understand what kind of agentic frameworks exist, how you can build your agents, and what the basics of these agents are. It's also critical for all of you to question everything from first principles. When I say first principles, basically, think of any use case in your head. The best way I have found to learn or implement agentic AI is to think of a use case and ask yourself, "What is possible and what is not possible, right?
What level of control can my agents have, and what kind of control can my agents not have?" Meaning, what is possible and what is impossible? Like, "With today's technology, with agentic frameworks, this is not possible and this is possible." The more you think about it, the better idea you get of what's possible and what's not. Okay? See, whenever you work with agents or agentic frameworks, the difference between AI agents and agentic AI is something you have to understand, and there is more to it than it looks. So let me write it down so that all of you can understand. Suppose you ask me, "Sir, can you give us an idea of what you mean by agentic AI?" By the way, I am not the only one. The entire machine learning and AI fraternity is confused about what to call agentic AI and what to call AI agents. There are literally research papers written on this, where people have fought with each other. So what I'm giving you is the understanding I have arrived at after writing maybe three research papers and four white papers on this particular topic. First of all, agentic AI is your AI system, any AI system that you build. Now, when I say any, please remember it has a double star on it. And when I say any, I mean anything. Any AI system that can act in complete autonomy toward one and only one goal, okay?
Whatever goal you have given it, if it wants to act in complete autonomy, that particular thing is going to be your a agentic AI, okay? Your pure agentic AI, okay? Your agentic AI is always goal-driven, okay? It is not process-driven. It is always and always goal-driven. Whatever goal you give it, autonomy means humans need to not involve themselves at all inside this. It means once you write an agentic AI, you are not even going to look at its behavior or modification or anything. It will It wants to work in complete autonomy. It wants to take decision in autonomy. When I say decision, what it wants to do, how it wants to do, everything is its own behavior and modification, okay? It will not want you to involve or anybody else to involve in its decision process. That's what we call complete autonomy, okay? The difference in agentic AI and AI agents, right? Is this, right? Okay? Now, what are AI agents? See, agentic AI is your is your understanding, right? It's like your intelligence part of it. Now, what are AI agents, right? The actual systems or the systems that can help you implement agentic AI, okay? It's the real thing. It's your implementation, right? The actual implementation are what you call as AI agents. I want you to think of it like this. Agentic AI and AI agents are not separate but a part of the same page where agentic AI is more like an idea, more like an understanding, but its implementation closest to the reality is what your AI agents are. Are you understanding what I'm trying to say? Don't look at agentic AI and AI agents as a separate completely two different unknown variables. No, no, no, no, no, no, no, no, no. Not like that. Agentic AI is the initial idea, the hope, the expectation that what I'm going to do, I'm going to do this, I'm going to do this, I'm going to do this, I'm going to do this. But what do you able to achieve in reality is only 80% of what you actually thought of or what you dreamt of. 
Now, that implementation is your AI agents. The principles that you were able to implement in reality are what you call your AI agents. Now, nobody can give 100% autonomy to AI agents. Would you feel comfortable giving any AI agent access to your UPI account? Think of it as plan and execute. The reason we call the planning stage agentic is that in the planning stage, you can think of literally anything and it will work out for you. But by the time you come to building AI agents, do you not face a lot of problems, like governance problems, information security problems, and multiple other problems, that keep your AI agents from being as perfect as your plan? Understand one thing about working with agents or any AI system in real life. Let me be very frank with you and tell you what I have experienced so far, and I have dealt with a lot of budgeting issues because of this. "Agents" is one of the most overused words in the industry. You will literally see every company talk about agents: "Oh yeah, we also work with agents." And in very simple terms, what they are running is a simple program. My point is, you don't need agents every single time to solve a problem. The world has survived without agents for a long time, and it can continue to do so. Build agents only if you know you have a requirement.
So, please understand that this is the reality of the industry. And the reason I'm saying this is because you got to understand that not every single place you need an agent. You need an agent when you have a very specific goal in your mind and you want to achieve that goal in complete autonomy, right? When I say autonomy, you want that agent to perform that task, right? Which can be better than a human being in some cases. And you want that goal to be just done. I just want to complete that goal, right? And one of the other intention you have to understand is that you have to complete the goal in a specific time frame. It's not like I give you 1 year to complete the goal. I know you will complete the goal. I know AI will complete the goal. But it makes sense to complete the goal or to run an operation in a specific time frame. I cannot make my user, my customer, or whatever application you have built wait for 1 year for the agent to complete the task, okay? So, remember, there is of course goal orientation here, but also you have to remember that the process needs to run very, very quickly. I mean, how would you feel if I give you an application that takes 1 hour to load? Would you use it? Would anybody feel comfortable or have we all lost patience in our life? Nobody has the patience to even wait for a minute, right? Do you think you're going to wait for an hour for some operations to complete? No. So, understand one thing. Whenever you are building any agents, okay? When any agents, understand one thing that is it a real-time agent, okay? Or is this an agent that is going to run in the background, okay? Like you sleep in the night and by morning it will do something for you, right? So, what kind of agents are you going to build? That is also important. Do you want your agent to start working instantly and giving you the results? Or is it an agent that runs in the background? You don't care if it takes 1 day, 2 day, 3 day, as long as it gives you an output. 
So, this is also a valid agent building. This is also a valid agent building. On that note, everyone, if you remember, right? What's the difference between task-specific and goal-specific? So, first of all, here is the thing, okay? When you look at task-specific or a goal-specific thing, goal is always your final outcome, my dear, right? Your goal could be anything, right? So, think of it like this. Your goal is to become the prime minister of India, right? A very simple logical flow I'm telling you. You want to become prime minister by hook or by crook because your end goal is to become prime minister. Now, how you reach that did not matter. Now, when you say task-specific, right? Your goal matters to you even if it's task-specific, but there are certain actions you need to take. There are certain routes you need to take. Only under such routes are you going to make your particular things task-specific. Remember, at the end of the day, without goal, you cannot have anything, okay? At the end of the day, anything that you call goal-driven. So, let's say a very simple example, right? You want to make a dinner for the family, right? Let's say you have decided to make dinner for your family, right? You can choose what to cook, how to cook, what ingredients you want to buy, okay? You can adapt if the store that you went to did not have an ingredient for you. Like, say for example, you wanted to cook a dal, but suddenly they don't have a toor dal. They have a moong dal. So, okay, no problem, I'll adapt, right? And what is the success of your goal? Your family had your food and they said the food is good. Now, what is task-specific? Let's say you say I want to boil a water for 10 minutes. I want to add pasta. I want to add gouda cheese. And I want to stir for 3 minutes. And I want to feed my family. Now, in the goal-driven and the task-driven task, what you did is in the goal, your end goal was something else. 
And in your task, you had a very specific set of instructions that you wanted to follow. That is the difference between goal-specific and task-specific. In terms of agentic AI, this is also one of the most important things. Do you have specific tasks that you want your agents to perform? I'll give you an example. In the agents that you build here, do you see them as goal-specific or more task-specific? These are task-specific agents. You know why? Look at this. Once I do something, it needs to follow this process at the end of the day to get you an answer. Are you getting what I'm trying to say, my dear? So no-code or low-code platforms are, of course, goal-driven. Everything is goal-driven, but at the end of the day, the goal is broken down into specific tasks, and those tasks are what run. Understood? Okay. I taught you a concept called the reflection pattern. I told you that once a user asks your agent a question, your agent generates an output. That output goes to another agent: the maker-checker principle, LLM as a judge. That agent reflects and checks whether the answer is correct. It generates another text, called the reflected text, and the loop keeps iterating until either satisfaction is reached or the maximum number of reflections is reached, and then you come out of the loop. Observe carefully. The reflection pattern focuses on improving the agent's capability of evaluating and refining its own output. This is a self-critique loop of generating and reflecting, and it is not limited to a single iteration. The system can repeat the reflection process as many times as you allow, and self-reflection RAG is one of the most popular types of agentic RAG.
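The generate-and-reflect loop recapped above can be sketched as follows. This is a minimal illustration, not a framework API: `reflect_loop`, `toy_maker`, and `toy_checker` are hypothetical names, and the maker and checker are plain functions standing in for two different LLMs.

```python
def reflect_loop(generate, critique, question, max_reflections=5):
    """Maker-checker loop: stop when the checker is satisfied or the cap is hit."""
    draft = generate(question, None)
    for revisions in range(max_reflections):
        ok, feedback = critique(question, draft)
        if ok:
            return draft, revisions, True
        draft = generate(question, feedback)  # revise using the reflected text
    # break condition reached without the checker accepting the draft
    return draft, max_reflections, False


def toy_maker(question, feedback):
    """Hypothetical maker: wrong on the first try, corrected once it gets feedback."""
    return "2 + 2 = 5" if feedback is None else "2 + 2 = 4"


def toy_checker(question, draft):
    """Hypothetical checker: accepts only the correct sum."""
    ok = draft.endswith("= 4")
    return ok, None if ok else "The sum is wrong, recompute it."


answer, revisions, accepted = reflect_loop(toy_maker, toy_checker, "What is 2 + 2?")
print(answer, revisions, accepted)
```

Returning the `accepted` flag lets the caller tell a verified answer apart from one that merely ran out of reflections.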
And guys, you cannot say that there will be no hallucinations. There will be minor hallucinations that can be handled, but remember, hallucinations do exist, and it is very much possible that both the maker and the checker hallucinate. The chances are lower, but I am not saying it is impossible. Understood? Hallucination is greatly reduced, but it is not eliminated. The chances are not zero. Nobody in this world, with today's technology, can say that they have eliminated hallucination. Because, my people, the day I can tell the world that I have completely removed hallucination, I might build a billion-dollar enterprise, right? Hallucination is one of the bigger problems we need to solve. Nobody has solved it. We can reduce it to a certain level, but we have not eliminated it. So you have got to be very, very careful. Hallucinations happen because the LLM thinks that that output is more relevant. Okay. Before transformers, we had the encoder-decoder model. Do you remember? We had LSTM cell one, LSTM cell two, LSTM cell three. Suppose the input to the model is "I love India". You feed in "I", then "love", then "India", and the encoder generates a final context vector, or thought vector, called CV. This context vector goes to the decoder, and the decoder receives its first input, called start of sequence. "I love India" is being translated into Hindi, so the first output is "Mujhe". Then you have another decoder step here. You take the word "Mujhe" and feed it to the decoder once again to generate the next output, "India". And then you have another decoder step here.
And then you take the word "India", pass it into the model here, and get "pasand". And it goes on and on, right? Your hidden vectors are passed along like this. Now, everybody, there is a concept called teacher forcing. Suppose the decoder is translating "I love India" and the output so far should be "Mujhe India", but by mistake my decoder produced "Mujhe yeah". If your decoder output is wrong, won't every other output after it be wrong as well? If your decoder generates a wrong answer at the second time step, every answer after that time step is likely to be wrong, because each one depends on what came before. Everyone, your LLM also works like that. When your large language model makes a prediction, it generates a probability for each candidate token. If one token goes wrong, your LLM heads in a very different direction from the one it should have taken. And what happens because of that? It presents the answer as correct, but actually the answer is wrong. So hallucination happens because of many technical choices that we have built in ourselves: the way we have built our large language models, the way our data is trained, the way two things can be very similar to each other. All of this can push the LLM in a direction it should not go, and unfortunately, due to the mathematics behind it, it ends up hallucinating. In the simple example I taught you, at the time of training your encoder-decoder, we handled this problem with teacher forcing, which feeds the correct previous token to the decoder during training. But at inference time there is no correct token to feed in, and the transformers that we use in LLMs are decoder-only transformers that generate from their own previous outputs. Most large language models that you see today are not encoder-decoder models. They are decoder-only models. Okay?
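To see why one wrong token derails everything after it, here is a toy sketch of step-by-step decoding. It is an illustration only: the lookup table `NEXT` is a made-up stand-in for a real decoder, and `forced_errors` is a hypothetical parameter used to inject a mistake at a chosen time step.

```python
# Toy "decoder": each token is chosen purely from the previous token.
NEXT = {
    "<sos>": "Mujhe",
    "Mujhe": "India",
    "India": "pasand",
    "pasand": "hai",
    "hai": "<eos>",
    "yeah": "right",   # continuation that only appears after a mistake
    "right": "<eos>",
}


def decode(forced_errors=None, max_len=10):
    """Generate tokens one at a time, feeding each output back in as the next input."""
    forced_errors = forced_errors or {}
    tokens, prev = [], "<sos>"
    for step in range(max_len):
        # forced_errors lets us inject a wrong token at a given time step
        token = forced_errors.get(step, NEXT[prev])
        if token == "<eos>":
            break
        tokens.append(token)
        prev = token
    return tokens


correct = decode()
derailed = decode(forced_errors={1: "yeah"})
print(correct)
print(derailed)
```

With the mistake injected at the second time step, every later token follows from "yeah" instead of "India", so the rest of the sentence never recovers.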
Because of the architecture and the math behind it, hallucinations are something that we unfortunately cannot avoid right now. You can reduce them to a certain extent, but you can't bring them to zero. Okay? So, everybody, coming back to the topic. What do you mean by the tool use pattern? Can anybody look at the diagram and tell me what you understand from it? Can you see that you have divided the responsibilities among tools: tool A, tool B, and tool C? Can you see that you have broken the big problem into small problems, and each tool does its own individual task? Have a look at this, everyone. The tool use pattern significantly broadens the LLM's capability by allowing it to interact with external tools and APIs. So what is the main use of the tool use pattern? Why do we implement it? Because we want the LLM to go beyond its own capability and connect with multiple external systems. Now, look at this. Agentic AI systems using this pattern can access databases, search the internet, and execute complex functions with programming languages like Python. It is very useful for augmenting RAG systems with the ability to answer questions based on real-time searches. Let me give you an idea. Let's say you have an internal database within your company, and you have an external internet search. When you ask a question, can I invoke both tool A and tool B at the same time? Whichever brings me an answer, I take it, and I produce a final answer, either from the internet search or from the internal company search. Can I not do this, right?
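Calling two independent tools at the same time can be sketched with Python's standard `concurrent.futures` module. This is a minimal illustration: `internal_search` and `web_search` are hypothetical stand-ins backed by small dictionaries, not real database or search clients.

```python
from concurrent.futures import ThreadPoolExecutor


def internal_search(query):
    """Hypothetical tool A: looks the query up in company data."""
    knowledge = {"leave policy": "24 days of paid leave per year"}
    return knowledge.get(query)


def web_search(query):
    """Hypothetical tool B: stands in for an internet search."""
    pages = {"ai trends": "Agentic workflows are a widely discussed trend"}
    return pages.get(query)


TOOLS = {"internal": internal_search, "web": web_search}


def run_tools(query):
    """Run every tool concurrently and keep only the tools that found something."""
    with ThreadPoolExecutor(max_workers=len(TOOLS)) as pool:
        futures = {name: pool.submit(tool, query) for name, tool in TOOLS.items()}
        results = {name: future.result() for name, future in futures.items()}
    return {name: value for name, value in results.items() if value is not None}


print(run_tools("leave policy"))
print(run_tools("ai trends"))
```

Whatever the tools return would then be handed to the LLM to produce the final answer.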
And when you say internal search in the company, can I not go to the company's internal database, which is a simple SQL database? I can also go to the company's other internal databases. I can also go to the company's internal documentation, like PDF files. Meaning, are you not building an agent where you very clearly define which data sources the agent looks at, how exactly it queries them, and how it returns the answer? So understand one thing, guys. The tool use pattern is one of the most widely used patterns, and you are going to be using it whenever you work with agentic frameworks or build agents in the future. Okay? I think hallucinations can also happen due to lack of context. Think about the prompt: "What do you know about API?" can be answered as "application programming interface". Yes, 100%. One of the reasons hallucination happens is that if you don't state clear goals, why would the model give you clear answers? If your question itself isn't clear, the answers won't be clear either. So it depends on multiple things at once. Okay? Now, everyone, have a look at this. In the tool use pattern, we usually don't keep the tools sequential, where after one operation you have a second operation, and only once those two are done does the third run. Usually the tools are completely independent of each other. But if you want to build something sequential, you can. Mostly, tool use patterns are only used to fetch some answers. Okay?
And the aggregation of the answers happens, Shivam, at the LLM phase, here. Each tool has its own internal LLM, and that tool's aggregation happens at the tool stage. Okay? How does the LLM decide which tool is suitable to call? And the basic question is, what exactly is a tool? If the LLM supports concurrent requests, you can run multiple calls in parallel. Okay? So what is a tool? First of all, everyone, let me give you an idea. Have you ever heard of a tool called Tavily? Tavily is a tool that connects your LLM, your large language model, to the internet. If you want to go to the internet and look something up, have you all heard of Perplexity? The Perplexity API, right? Lovely. Now, everyone, I'm going to ask you a simple question. Tell me if this makes sense to you or not. Can I use Tavily as a tool to connect to the internet? This is going to be my tool number one. A query is asked by the customer, and I do an internet search for it. Lovely. You remember, in Python, we have SQL connector libraries, or the SQL queries that you write, that connect to your internal database. I ask it a question, it returns an answer. Now tell me, what do you mean by tools here? Most basic question. Any tool, any platform that can help you perform cross-functional operations can be called a tool, no? Agree, disagree? I'll give you one more example. Let's say you have an Excel file, an XLSX file. My people, can you tell me what I have taught you in Python that can read Excel, write Excel, and edit Excel? Pandas. Can Pandas not help you read Excel? Let's say that locally you have an Excel file or a CSV file.
Can I not use Pandas to read and edit the file and then, once the result comes back, use an LLM to summarize it and pass it to the customer? So tell me, what do you mean by tools here? Are tools specific to AI only? No. Tools are anything that can help you perform a certain task. A gateway of sorts. A connector of sorts. Anything that can connect one application to another, like APIs, like SDKs, like these specific tools. These are all your tools. There is no lofty definition here. It's a simple concept: anything that lets you connect to something else through a simple interface is a tool. You got the concept? Okay? This is something that you are going to face in practice also. See why this question is important, everybody. Imagine I ask you to build a simple AI agent. Let's say you have written multiple different tools, multiple different workflows for your agent. Let's suppose there are two tools, tool number one and tool number two. I ask a question like, "Who is the president of the company?" Tool one responds that the president of the company is, let's say, Rudra. And tool number two says the president of the company is, let's say, Tony. Now, whom do you believe? Is tool one correct or tool two correct? Understand one thing, guys. This is something that you might encounter in the future. You want a single reply, so whose answer do you go with when you aggregate? First of all, you have to understand one thing. Validating everything, every single time, is not possible. Please think of it from a very practical perspective.
Do you think that for every single response you are going to compare against a ground truth or validate? Do you think you have the time? Your agent is running on millions of queries. Your agent is performing hundreds of tasks a day. Do you think you can always come back and check whether each one was validated? No, right? So what happens is that a company will typically give precedence to internal information over external information. Whatever is available in the company's internal information is given more priority than external information. Meaning, in case two tools return contradicting answers, one tool can take precedence over the other based on the importance of the tools. Internal information is more important to me than external knowledge. What if the company changed its president at midnight yesterday, but the internet does not even know yet that the president has changed? Whom would you prefer, the internal company documents or external resources? So understand one thing, guys. If you build something in the future where two different tools give you two different answers, and each tool has its own LLM to aggregate its answer, you first of all give internal knowledge precedence over external knowledge. Precedence and aggregation are both applied together: each individual tool has its own aggregator, its own LLM, and precedence decides between them. Okay? How about when both are returning info from open sources, depending upon the training? That is something you don't rely on. Please understand that you also need to be able to rely on the information.
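The precedence rule described above fits in a few lines. The sketch below is an illustration under assumptions: the source labels `"internal"` and `"external"` and the `resolve` function are hypothetical names chosen for this example.

```python
# Sources listed from most trusted to least trusted.
PRIORITY = ["internal", "external"]


def resolve(answers):
    """Return the answer from the highest-priority source that produced one."""
    for source in PRIORITY:
        if answers.get(source):
            return source, answers[source]
    return None, None


# Two tools disagree about the president of the company.
print(resolve({"external": "Tony", "internal": "Rudra"}))
# Only the external tool found something, so it is used as a fallback.
print(resolve({"external": "Tony"}))
```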
If both sources are external and they give conflicting answers, then you need to build one more agent, called a judge agent, and that judge agent finally decides which source makes sense. Here, internal means the company site, the company's internal database, internal documents, internal structure. That's what I mean by internal. Okay. So, everybody, we understood the concept of the tool use pattern. What's the next pattern? Everyone, observe carefully. The name of this design pattern is the planning pattern. Now you are entering the zone of complex agentic frameworks. So, first of all, what do you mean by the planning pattern? Guys, a customer or a user prompts something, and I break the prompt into a plan. The plan generates tasks. Each task is executed by an agent. The agent returns a task result. Depending on whether the result is good or bad, you re-plan and generate again. You keep repeating the cycle until you find a satisfactory result and give the response. Now, between the reflection pattern and the planning pattern, what do you find is the main difference? Everybody, have a look at this. In the planning pattern, you enable a large language model to break large, complicated tasks down into smaller, more manageable tasks. My people, do you think many people in life take this approach? When they are given something complex to do, they break it into smaller, manageable tasks and do them one after another, right? Planning equips an agent with the ability to react to a request and strategically structure the steps before execution. Is it also logically making sense?
That what my planning pattern is doing is it is taking a complicated tasks, breaking the task into smaller tasks, and these smaller tasks, I also plan that what will I do first, what will I do next, what will I do next. So, can I say in planning pattern, I create a roadmap of subtasks, determining the most efficient path of completion, prioritize, implement, prioritize, implement, prioritize, implement. Understood, everybody? Now, all of you, this is the most important thing that you need to learn in this class. One of the most important kind of most utilized patterns is something called react pattern. What do you mean by react? Reason and act. First, do reasoning with the prompt. First, perform reasoning, and then, once the reasoning is done, perform acting. The other one is called ReAct, which is reasoning with open ontology. We don't utilize ReAct because ReAct unfortunately goes through a lot of problem. Preferably, people prefer something called react. What do do mean by that? We extend this approach by integrating decision-making and contextual reasoning into the planning process. Sir, what do you mean by integrating decision-making and contextual reasoning? Everyone, if I ask you a question, for example, all of you, if I ask you a question, okay? Guys, I saw a car that could fly, okay? Now, the minute I say the word I could see or I saw a car that could fly, what is going on in your head? Are you not reasoning with your own head? Car, which is something that you drive on the road, cannot fly. Hence, whatever this guy is saying is not true. Did you perform reasoning in your head? You're getting what I'm trying to say? Did you think logical steps one by one by yourself to come up with an answer? You performed reasoning with yourself, right? My people, every single time you build any large language model, okay, sorry, or you use any agentic framework like planning pattern, what you do know, you implement something called reasoning and acting. 
I'm going to show you how to build a very simple react agent. You guys are going to love it. You guys are going to be like, "Okay, wow. This is amazing." But remember, that's the most important thing, okay? Now, one of the most simple understanding is this. What is ontology? See, first of all, the thing is, let me explain something to you, okay? There is a design pattern called ReAct also, okay? My people, what happens is ontology, okay, is the concept of process by process, okay? When I say process by process, basically what happens in ontology is the core concept is very simple, okay? Like, if you get an understanding, I think it becomes very, very simple. See, what happens know, it is a relationship between certain tasks. Think of ontology like that. First, what happened? Second, what happened? Third, what happened? Think of it like relationship and rules, right? Ontology is a map of all these, right? Ontology basically means was I able to map? Think of it like chronology, but with respect to the relationship between each other. For example, for example, if you ask agent, "Can you tell me what are the, let's say, AI trends in the future?" So, what will you do? First, what will the agent do? Number one, it will say, "Okay, I am planning to search for the AI trends in the future." What will be the next step? It will execute, meaning it will go to the internet, search about AI trends. And what will it do? It will observe. It will get an output. It will say, "Got 10 articles about AI trend." Then, what will it do? Now, search for specific trends. Then, it will return the answer, meaning it will continue repeating its own internal ontology, its own internal relationship, and it will give you an answer in a specific way. Understood what I'm trying to say? That relationship mapping is what we call ontology. That relationship chain is what we call ontology, okay? Everyone, have a look at this once again. Break the whole thing. Break the whole thing. 
Understand, plan it, generate the task, and perform, okay? So, first of all, let me explain. In any agent you're going to build, my people, please remember you first need to understand the word plan, execute, and observe. Plan, execute, and observe. Everybody, if your plan, suppose, is if I give you a question, search for AI trend. Is plan or the planning stage that my agent is currently in isn't this directly the question my consumer gave me? Whenever a customer or a user asks the question or a query, what do you do with that? Do we take that question, put it in the planning stage? My planner is going to plan that now. Now, everybody, once we plan, what do you mean by execute? Execute is nothing but taking an action now. To take some kind of an action is yes or no. What do you mean by action? Action can be search on internet. So, web search. What will you do the web search? Whatever your plan is, you're going to take that and you're going to search the internet. You're going to execute the operation. Now, my people, whatever result you get from the web search, isn't that becomes a part of your observation? 10 URL received from the internet. Can I say this is your observation? Now, everyone, wasn't this task number one that my model performed? The iteration number one that my model performed? Agree, disagree? Now, tell me, will the agent not perform the second operation? Copy, paste. Now that I have received, what has the model received, everybody, so far? My model has received something called as the 10 URL from the last resource, correct? Now, everybody, because the user is like, "Find me an AI trend." What will you plan? You will plan something like this. Finding specific trend in the internet. Are you then again going to search on web for a specific trend, right? Now, everyone, let me know one thing. 
Once you have done task number one and task number two and received 10 URLs here and 10 URLs here, are you not going to pass those URLs through an LLM and let the LLM decide what the final answer should be? You have received multiple different links from the internet for "search for AI trend." Here you got an observation, and here also you got an observation. Rerankers are slightly different; their use case is RAG, and they are called cross-encoders. Cross-encoders are slow, and they take a little bit of time to rerank and give you the output. So reranking is not applicable here. This is more like breaking the work down task by task. Whatever task results you have, you finally pass them to your LLM, and one answer is generated by the LLM, okay? "How did the LLM decide which URL to give as output?" It takes all the URLs as input. All of them; it did not select one, two, or three. Please hear me out. Listen to me very carefully. It takes the entire dump of the data, passes it to the LLM, and the LLM decides which part is going to be important. You can think of this like a research project, or like any project you want to build. You have to understand that you are generating the data from some place, that data is dumped into the LLM, and the LLM is able to generate an answer. Understood what I'm trying to say, everyone? Please remember, every agent is going to perform three things: plan, execute, observe. Plan, execute, observe. Are you clear, everyone? Plan, execute, and observe is not just a program; it is a prompt technique plus a program, combined. It is more of a prompt you give to the model to force it to first plan, then execute, and then observe.
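The plan, execute, observe cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fake_web_search` and `fake_llm` are hypothetical stand-ins (no real search API or model is called), and the list of plans is passed in by hand, whereas in the lecture the model produces the plan itself from a prompt.

```python
from typing import Callable, List


def fake_web_search(query: str) -> List[str]:
    """Hypothetical search tool: returns 10 made-up URLs for any query."""
    slug = query.lower().replace(" ", "-")
    return [f"https://example.com/{slug}/{i}" for i in range(1, 11)]


def fake_llm(prompt: str) -> str:
    """Hypothetical LLM: just reports how many URLs it was handed."""
    n_urls = prompt.count("https://")
    return f"Final answer synthesized from {n_urls} URLs."


def run_agent(
    question: str,
    plans: List[str],
    search: Callable[[str], List[str]] = fake_web_search,
    llm: Callable[[str], str] = fake_llm,
) -> str:
    observations: List[str] = []
    for step, plan in enumerate(plans, start=1):
        print(f"[{step}] PLAN:    {plan}")
        results = search(plan)  # EXECUTE: take an action (here, a web search)
        print(f"[{step}] OBSERVE: {len(results)} URLs received")
        observations.extend(results)  # keep every result; nothing is filtered out
    # The whole dump of observations goes to the LLM, which writes one answer.
    prompt = f"Question: {question}\nSources:\n" + "\n".join(observations)
    return llm(prompt)


print(run_agent(
    "Find me an AI trend",
    ["Search for AI trends", "Find specific AI trends"],
))
```

Note that the loop does not rank or select URLs; as in the lecture, all observations are handed to the model, and the model decides what matters.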
So, I don't write a program for planning separately, executing separately, and observing separately. They are done automatically. It's a prompt that you give to the agent so that it breaks the problem into plan, execute, and observe. Now, all of you, in your Google Colab file, please understand one thing, okay? What you're going to do is import a library. What is that library? You are going to import google.generativeai as genai, and then call genai.configure. Now, everybody, do you remember I made you create an API key? Whatever API key you have created, I want you to take it, write api_key is equal to, and paste that API key here. This is my API key. You can use yours. Always prefer using yours, okay? Now, everyone, "Sir, what is this google.generativeai?" This is an SDK built by Google. Because they have so many generative AI models in the market, if you want to use any of those models from Python, you go through this package, google.generativeai. And what am I doing? I'm configuring the API. Next, available_models is equal to genai.list_models(). So, if I call list_models on genai, print the list of available models, and run this, what program have I written? The program I have written in front of you tells you all the models that are available in the Google Gemini package. Everyone, have a look at this. Can you see that? First of all, Google is also supporting something called Gecko. Google also supports Gemini 1.5 Pro.
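A minimal sketch of the configure-and-list step, assuming the `google-generativeai` package is installed and that the key is stored in an environment variable named `GEMINI_API_KEY` (that variable name is an assumption, not something from the lecture). The filtering helper is separated out so it can be tried without a network call.

```python
def generation_model_names(models) -> list:
    """Return the names of models that support text generation.

    `models` is any iterable of objects exposing `.name` and
    `.supported_generation_methods`, which is what genai.list_models() yields.
    """
    return [
        m.name
        for m in models
        if "generateContent" in m.supported_generation_methods
    ]


def list_gemini_models(api_key: str) -> list:
    """Configure the SDK with the given key and list usable model names."""
    import google.generativeai as genai  # pip install google-generativeai

    genai.configure(api_key=api_key)
    return generation_model_names(genai.list_models())


# Usage (needs a real key and network access):
#   import os
#   for name in list_gemini_models(os.environ["GEMINI_API_KEY"]):
#       print(name)   # names come back in the form "models/<model-id>"
```

Copying a name from this list, rather than typing it from memory, avoids the naming mistakes the lecture warns about.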
It also supports Gemini 1.5 latest. It supports Gemini 1.5 Flash, Flash 00, Flash 8B, and Flash 8B 001. Are these all the available models in your Google Gemini package? Can you use any of them? See, there is no end to it. I can scroll and scroll and it keeps going. Why did I teach you to write this? Because, guys, every single time you work on any use case or any API in the future, ask: what models does it support? You have to identify that first. If it doesn't support the model you ask for, which happens very rarely, your API call is not going to work. You also need to know which model name to copy and paste. Can you see here, everyone? What is the name of the model? models/gemini-live-2.5-flash-preview. If you leave out the hyphens, will the LLM work? No. You need to make sure the naming convention is followed exactly. If you make a mistake in the name, it's going to be a problem. So, to avoid all those conflicts, what do you do? You print the list and copy the name from it. Understood? What is the use of temperature? Temperature controls the behavior of your large language model. It is a number between 0 and 1, and sometimes it can go above 1. More temperature means more creativity; less temperature means less creativity and stricter answers. Top P is what you call nucleus sampling. You pick a probability value, let's say 90%, so top P is equal to 0.9. Only the most probable tokens are considered, let's say the first, second, third, and fourth tokens, and those four together add up to 90%. With top P sampling, you want only relevant tokens to be candidates during generation. I'm going to teach you how to build a ReAct agent.
This is the first time we are building a ReAct agent, right now, in front of us. What are we going to observe? You will see reasoning plus action performed automatically by the agent itself. No human intervention is required; you will not step in, but you will see the reasoning and the acting performed by the agent itself. Understood, everyone? So, first of all, what am I going to do? import google.generativeai as genai. I'm going to take genai as an alias, okay? I'm going to import from LangChain. There is a package called langchain_google_genai, and from it I'm going to import a specific class called ChatGoogleGenerativeAI. From langchain.schema, I'm going to import HumanMessage and AIMessage. And finally, I'm going to import re, the regular expression module. Now, I'm going to implement the code one piece at a time, and you have to understand it from the perspective of what we are trying to do and how we are trying to implement it. You're going to understand it line by line, process by process. So, first of all, have I used something called LangChain here? I told you from the very beginning that LangChain and LangGraph are going to be among the important packages that you use. Now, rather than explain theoretically what ChatGoogleGenerativeAI is, and what HumanMessage and AIMessage are, I would rather program it and show you the behavior so that you understand what it is doing, okay? So, first of all, gemini_api_key. I'm going to take my Gemini API key, copy it, come over here towards the bottom, and paste it.
This is my Gemini API key. Once you write down the API key, so basically, see majorly what happens now? Either you're going to be preferring to use Ollama, okay? Either you prefer using something called Hugging Face, okay? Either you work something called LM Studio, okay? Or either you mostly use something called Docker. So, these are the four most utilized How do I say this? Like the most utilized tools, packages that you use? Apart from LangChain and LangGraph, these are the ones that will become a part of your life, part and parcel of your life. So, yeah, I I wouldn't say TensorFlow and Keras are used much, but yeah. Right? So, genai.configure, inside this configure part, my API key is going to be my Gemini API key. Everyone, please tell me what have I done here? Have I taken my API key and have I configured my API key? I hope everybody in this class who has never worked on API, let this be a final reminder to everybody in this class that guys, without having an understanding in APIs, it becomes very complicated for me to teach you very simple basics of how your large language model works and how your APIs work. So, please understand every single time you're going to use any API or you're going to work on API, you have got to implement how these API models are configured, okay? What the configuration means and how are you able to configure these models, okay? So, basically, you have got to understand how the models function, okay? So, first of all, I have configured my API. Uh let me repeat once again. The libraries that I have imported, explaining this library would be much better if I show the program then explain it to you that chat Google generative AI will do what and human message and AI message will do with that. Is that something that I can do? Because if I do this, I am telling you you will understand it better. Okay, let's go ahead and run this. 
The first step in building any agent is to initialize a large language model. To initialize it, can I say llm is equal to ChatGoogleGenerativeAI? Inside this, model is equal to what? Which model do you want? models/gemini-2.0-flash. This is the model that we are going to use, not Pro. The Pro quota gets exhausted very quickly, so we want to use Flash, okay? google_api_key is equal to my Gemini API key. temperature is equal to 0, and max_output_tokens is equal to 512. Everyone, please observe what I have done here. Did I write ChatGoogleGenerativeAI and, inside it, all the details related to the model? So, tell me, why do we use ChatGoogleGenerativeAI? Because this is where you state your LLM specifications. If you want to set everything about the model in one single place, you use ChatGoogleGenerativeAI. Understood? Now, everyone, what model are we implementing? Oh, put a comma here and put a comma here, no? Yeah. What model have I implemented? 2.0 Flash. Okay. Did I also pass the key? Lovely. Did I also pass the temperature? Temperature is equal to 0. Agree, disagree? Everybody, what do you mean by temperature in a large language model? Let me explain. See, temperature is a number, usually between 0 and 1, okay? In some large language models it goes beyond 1 also, but for now, for simplicity, let's say it's a number between 0 and 1. Your large language model generates the answer token by token, building up the entire sentence, okay?
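The initialization described above might look like the following sketch, assuming the `langchain-google-genai` package is installed. The helper names `llm_settings` and `make_llm` are made up for illustration; the keyword arguments (`model`, `google_api_key`, `temperature`, `max_output_tokens`) are the ones named in the lecture.

```python
def llm_settings(api_key: str) -> dict:
    """Collect the model settings from the lecture in one place."""
    return {
        "model": "gemini-2.0-flash",   # Flash, not Pro
        "google_api_key": api_key,
        "temperature": 0,              # strict, repeatable answers for the agent
        "max_output_tokens": 512,      # cap on the length of each response
    }


def make_llm(api_key: str):
    """Build the chat model; the import is deferred so the settings can be
    inspected even where the package is not installed."""
    from langchain_google_genai import ChatGoogleGenerativeAI

    return ChatGoogleGenerativeAI(**llm_settings(api_key))


# Usage (needs a real key and network access):
#   llm = make_llm("YOUR_GEMINI_API_KEY")
#   reply = llm.invoke("Say hello in one sentence.")
#   print(reply.content)
```

Keeping the settings in a plain dictionary makes it easy to see, and later change, the temperature and token cap without touching the rest of the agent.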
When you set the temperature very high, suppose you keep the temperature at 1, you allow the large language model to become creative in its response. Meaning, the generation will become more random. It will become creative but random. But if you keep the temperature at 0, the large language model will become very strict. It will be like, "I don't care about creativity. I will always respond in a very strict manner: whichever token has the highest probability, that is the token I will put." Meaning, the LLM's response becomes very monotonous and boring, okay? For every single question there is only one answer. The answers you generate from the LLM have no variation at all. Every single time you ask the question, the answer comes in the same format. So, to tune the creativity level, you keep it between 0 and 1. If you want your model to give very strict answers, or to behave in a very specific way, keep the temperature at zero. If you want to give it creativity, give it one. Let me do one thing. Let me show you a better example. This content that I have has really good examples with respect to the terminology and also with respect to temperature. If I find a good one, fine; if not, I'll explain it in my own terms, right? So, guys, this is a resource that I happened to come across. It explains certain principles very well, and one of the principles is this. Lovely. Okay. Everybody, remember one thing. Every large language model has five things you need to be very careful about. These terms are how you will be able to manage your large language model in the right way, okay?
Sometimes you don't need a complicated architecture or a complicated framework; you just need the right numbers for all these things. So, number one is what we call the context window. Guys, the context window is nothing but the maximum number of tokens that the model can handle, including both input and output. Did I teach you the idea of the context window? In a large language model, whatever prompt you pass and the answer you get back from that prompt must fit together. Every large language model has a restriction that at once you cannot pass more than its context window: 100,000 tokens, 1 million tokens, 3 million tokens. Prompt plus completion plus history. Everybody, we also have something called maximum tokens. This is a parameter to adjust the number of tokens to be used for a particular request. It is capped at the context window. Everyone, can you see here? I have written something called max_output_tokens. What am I telling the large language model? I am telling it, "Look, I am giving you a prompt. No matter what happens, the output tokens that you generate are capped. I will not accept more than that." Your output cannot go beyond how much? 512 tokens. Are you able to see that I'm telling the large language model that the response it generates should not go beyond the maximum output tokens? Am I controlling that behavior in a large language model? Yes, you can do that. So, apart from this, what other things can we set? We have something called temperature. Everybody, what do you mean by temperature? Observe very carefully, okay? Let's suppose you have a prompt: "I like to eat Nutella and ___." Your input to the model was like this. What is the blank?
Now, everybody, when you leave that blank, isn't your large language model going to tell you, internally, "Nutella and bread"? Bread, as the next word, has a probability of 0.87. Then, "Nutella and cake," let's say. The probability of the word cake is 0.34. Then the next word, "Nutella and stick," let's say, is 0.21. Now, everybody, whichever word has the highest probability, can you see it popping up at 0.87? Agree, disagree? Lovely. Observe this book, observe this page. What do you see on this page? At a cooler temperature, meaning a lower value, the distribution is strongly peaked. What do you mean by strongly peaked? Guys, if you give a lower temperature, your large language model will only choose the word that has the highest probability, the one that is known and very strict. Tell me one thing, guys. Can I not allow variation? "I like to eat Nutella and ___" can also be cake, because cake also makes sense. Stick also makes sense. Just because bread has a high probability doesn't mean I will take bread every single time. See, temperature, first of all, is a number between 0 and 1. Are you clear on that? It can be more than 1, but for simplicity, everybody, please understand that a temperature value in a large language model will be between 0 and 1, okay? Lovely. Very, very good. Now, let's say I ask my large language model a question: "Is Earth flat?" And I ask another question: "Are bees important for humans?" Let's say I asked these two questions. Okay?
Everybody, when the temperature you keep is low or almost zero, what you're telling your large language model is, "Do not get creative, okay? Give me standard replies." So, what will happen? "Is Earth flat? No. As per studies, Earth is round. Earth is a globe." Let's say like this, right? "Are bees important for humans? Yes. Bees are important for humans." Do you see any creativity in the answer? Do you see anything that will make you say it went through multiple different iterations? No, these are direct, straight responses. But what if I say temperature is equal to 1? Copy this and paste it here. Let's say temperature is equal to 1. You are allowing your large language model to get creative. So, when you ask the question, "Is Earth flat?" it says, "Well, good question. Allow me to respond with..." Are you understanding what I'm doing here? Your large language model is now not sticking to very strict responses. It goes beyond that, and it responds with a greeting or with a certain study. It is expanding its answer, giving analogies, giving references. The answers are still professional, but more creative. They are not very strict. So, tell me, everyone, in which condition should you keep the temperature at zero, and in which condition at one? Remember, always keep the temperature closer to zero when you want very structured output. Every single time you want the output to be very structured, keep the temperature at zero. But if you want your answer to be less structured and more unique, you keep the temperature at 1. Understanding what temperature is, everyone? It shapes the request-response cycle of your LLM, right? Please make sure that if you don't want creative, nonsensical answers, you don't give temperature one; give temperature zero.
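The "strongly peaked" behavior at low temperature can be seen with a few lines of arithmetic. In the sketch below, the raw scores (logits) for bread, cake, and stick are made-up numbers, not values from any real model; dividing them by the temperature before the softmax is the standard way temperature is applied. A temperature of exactly 0 cannot be divided by, so it is usually treated as "always take the top token."

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; treat 0 as greedy (argmax) decoding")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for the next word after "I like to eat Nutella and ___"
logits = {"bread": 3.0, "cake": 1.5, "stick": 1.0}

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(list(logits.values()), t)
    print(f"T={t}:", {w: round(p, 3) for w, p in zip(logits, probs)})
```

Running it shows bread taking almost all of the probability at the low temperature, while cake and stick get a real chance of being picked as the temperature rises.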
In the agent, what temperature did I give? Zero. Because I did not want my answers to wander. And also, what does setting the temperature to one mean? It is going to give the same answer in a different way, wasting my tokens, right? So, basically, does it not depend on your goal? If your goal is, "I am okay with a simple, free-form answer," then it is okay. If your goal is to get very structured output, don't keep the temperature higher than that. Am I making myself clear? Did you guys get the logic? Keep your temperature aligned with your goal so that you don't mess it up. Okay, lovely. "Sir, what about the next concept? What is something called top P, top N?" Okay. "But keeping the temperature higher, won't it make the model hallucinate?" Yes, 100%. My people, getting creative means the chances that you might be wrong in your answer can also go up, no? That probability also exists. While going creative, don't you think you are messing with the reasoning part of it? Yes, 100%. You're not wrong. So, don't you need to control that behavior? Don't rely on the default. The default temperature varies by provider and by model, and many APIs default to a value closer to one than to zero. If you want zero, set it explicitly. Got it? Lovely. Now, everyone, you are also going to hear about something called top N and top P. Let me scroll down. So, this PPT basically has ideas about everything, man. I mean, you read this book, I'm telling you, it is amazing. This is something I referred to when writing my paper, also. There is a platform called Guardrails AI, okay? In Guardrails AI, if you go to the hub, can you see you can do every kind of check? For example, profanity. Suppose somebody is chatting with your chatbot, and he's abusing your chatbot.
"Hey, you know, you are X, you are Y, your company's X, your company's Y." You know, right? You cannot keep everybody happy in life. I hope you know this. Let's say you have built the best platform on the planet. There will always be this one person who starts screaming, "I'm not getting it. This is XYZ. You're terrible. You're this. You're that." Nobody appreciates the effort. Companies understand this very well. So, there will always be one customer who comes back and says, "Oh, you know, nonsense. You guys are XYZ." Can you see? What does it say? He says, "You're a ___ and you're a ___ idiot." Right? You want your chatbot to not repeat that when it gives the answer. And how do you want to answer? You want to answer in a very nice way: "Hey, I'm sorry, but we don't acknowledge profanity in our chatbot. Can you please ask me your question? I will try my very best to give you an answer." And suppose your customer keeps abusing you. Can you not let the chatbot pass it to a human being? Because abuse is something the bot should not handle, right? Either the customer is very frustrated or the customer has a lot of free time on his hands. I'm sure you guys know these customers, right? So, when you find somebody like that, you have to have guardrails implemented. Because tell me, how will your large language model handle abusers, or anything else you want to manage, with temperature alone? You have to have something called guardrails. So, Harendra, remember, guardrails are a necessity. No guardrails, no LLM. I'm going to tell you this: in a real-life implementation, if you don't have guardrails, you cannot ship it. Okay?
So, everybody, remember, implementing guardrails is very, very important. Okay? Everyone, Grok is more creative. I don't know if you guys know this or not, but Grok is a little bit more creative, right? If you talk to Grok in a certain way, it will respond back in the same way, like, "Hey, what are you saying? You don't know anything." It returns answers in slang. Tell me, do you ever want to be in a situation where your customer says something to your bot and your bot hits back twice as hard? Like, you used a swear word and then the LLM responds, "Hey, you are this and you are that." Would you want that? No. I'm genuinely telling you, I know this is a fun thing, but you don't want your LLM to respond in an escalatory manner, because it ruins your brand image, right? So, no matter what large language model you use, most of them, like ChatGPT or Gemini, automatically come with internal policies. If you go to ChatGPT and abuse it, will it return a response? ChatGPT will be like, "Sorry, my internal policy says I'm not supposed to respond to that." But what if the policy misses something and there is a response? To control that, we use a guardrail. So, remember, a guardrail is there to control not just the input to the LLM, but also the output from it. Whatever I'm teaching you right now is very practical advice, my people. I have burnt my hands. I have gotten mails from my managers: "Brother, what have you built? Have you not even thought it through?" I have made a lot of mistakes in my life, and they were very embarrassing. So I'm telling you, don't repeat them. Initially, when things were starting up, nobody knew what we were doing, right? So, this is the advice I'm giving you. It is very practical, okay?
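To make the "guard the input and the output" idea concrete, here is a deliberately tiny sketch. It does not use the Guardrails AI library; the word list, the refusal text, and the `guarded_reply` helper are all hypothetical, and a plain blocklist is far cruder than a production profanity check.

```python
import re
from typing import Callable

# Hypothetical blocklist; a real system would use a maintained classifier or service.
BLOCKED_WORDS = {"idiot", "stupid"}

REFUSAL = (
    "I'm sorry, but we don't acknowledge profanity in our chatbot. "
    "Please ask your question and I will do my best to answer."
)
FALLBACK = "I'm sorry, I can't share that response. Let me connect you to a human agent."


def contains_profanity(text: str) -> bool:
    """True if any blocked word appears as a whole word (case-insensitive)."""
    words = re.findall(r"[a-z']+", text.lower())
    return any(word in BLOCKED_WORDS for word in words)


def guarded_reply(user_message: str, llm: Callable[[str], str]) -> str:
    """Check the input before the model sees it and the output before the user does."""
    if contains_profanity(user_message):   # input guardrail
        return REFUSAL
    reply = llm(user_message)
    if contains_profanity(reply):          # output guardrail
        return FALLBACK
    return reply


# Usage with stand-in "models":
polite_llm = lambda msg: "Bees pollinate many of the crops we eat."
rude_llm = lambda msg: "Only an idiot would ask that."

print(guarded_reply("Are bees important?", polite_llm))
print(guarded_reply("You are an idiot.", polite_llm))
print(guarded_reply("Are bees important?", rude_llm))
```

The second check is the one that protects the brand: even if the model's own policies miss something, the escalatory reply never reaches the customer.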
Now, everyone, what do you mean by the concept of top P and top N? Top N is what most APIs call top K. So, first of all, since a large language model's outputs are based on probability, setting a top parameter restricts the selection of the next word to either the top N most probable words, or the top words whose probabilities sum up to P. Now, you must all be wondering, "Brother, what is he talking about? What is Rudra saying?" Don't worry. Allow me to explain. Everybody, focus on my screen for the next 10 minutes. I promise you guys are going to love this, okay? Guys, you have asked a question, okay? Ignore the softmax activation. I want you to ignore the softmax for now; it is what produces a probability for multiple different classes. See, the word or the token is selected using a random weighted strategy, but only from among the top N words. Everybody, listen to me very carefully. My sentence was, "I love banana bread." Okay, how do I say this? Let me take "I love makki roti and ___ with it." Now, everyone, I'm going for a more Indian example. In most of our Punjabi households, makki roti is eaten with sarso ka saag, okay? So, we know for a fact that sarso saag is the most probable word, with a probability of 0.87. Agree? But you can also eat makki roti with dal, right? Dal is also a valid option. I'd say its probability is 0.76. You can also eat makki roti with, I don't know, let's say sabzi, for example. And the probability of sabzi is 0.47. Now, everybody, when you say top_n is equal to two, what you're basically telling your large language model is this. I gave it "I love makki roti and ___," and it came back with multiple different candidates. How many options do I have to fill in the blank? Three.
But how many of them am I choosing from? I am telling the large language model, "Can you please choose the answer from the top-ranking probabilities only? Ignore the third one. Don't even consider it for my output. Only include the top N, where top N is equal to two." Did you get what I'm trying to say? You're telling the large language model to rank the probabilities from highest to lowest and not even include sabzi. Only include dal and sarso saag, and pick between those two. Sarso saag will not always be chosen. Remember, my people, just because its probability is high, it's not mandatory that only sarso saag will be selected. Sometimes, depending on your temperature and other settings, something else like dal will be selected. But are you including sabzi in the selection? No. You're only including dal and sarso saag. Are you understanding what top N does? Here, on the slide, for N is equal to three, one of cake, donut, or banana will be randomly selected, but apple will never be selected. Are you understanding? These are selected at random, weighted by their probabilities, but apple will never be selected because you have chosen only the top three values here. Did you get the idea of top N? Personally speaking, I have never implemented top N in practice. I don't use it and I don't prefer it. It has not given me any better answers so far. I have experimented with it. I don't know what others say, but top N has never helped me, okay? "Sometimes we ask a question and ChatGPT generates two responses and asks which one you like better. Is that basically top N?" No. That is what we call A/B testing. For example, you ask a question and ChatGPT shows this one and this one. Which one do you prefer?
You are basically performing a test where you're letting the user label the data for you. Which one do you want? So, it's more like an A/B testing thing. It's not related to top N. Understanding? It's about how ChatGPT should become better at serving its customers. Got it? Now, everyone, look at this. What do you mean by top P? Everybody, listen to me very carefully. My people, in top P the token is selected using a random weighted strategy, but only from among the top words whose probabilities total P, here 0.33. You said, "I want you to choose from words that add up to a probability of 33%." My people, cake and donut together add up to 33%, right? So, the budget is already filled by the top two tokens. There is no room to include any more, because you said you will take up to 33% of the probability mass. So, your top P is only going to select from the top two tokens, whose probabilities add up to the amount you asked for. Got the logic? So, your top P and top N are basically trying to restrict which tokens are eligible. And yes, the choice among the eligible tokens is always random. You have to understand everything, guys. My people, at the end of any large language model, how are the token probabilities distributed? The output is a probability distribution over tokens, and it is usually far from even: a few tokens carry most of the probability and the rest form a long tail. From that distribution you have to pick the next token, and temperature, top N, and top P control how you pick it. Okay. Guys, coming back to the program, please tell me, did I write my program to initialize my large language model? I have. Did I say which model I am using? Yes. Did I pass the Google API key? Yes. Did I set the temperature? Yes. Did I set the maximum output tokens? Yes.
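The two filters described above can be sketched as follows. The probabilities for cake, donut, banana, and apple are made-up numbers chosen for easy arithmetic (they are not the values on the lecturer's slide), and the function names are illustrative. Both filters keep a subset of tokens, renormalize, and then a weighted random draw picks the next token.

```python
import random


def top_n_filter(probs: dict, n: int) -> dict:
    """Keep only the n most probable tokens (top N, often called top K)."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:n]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}


def top_p_filter(probs: dict, p: float) -> dict:
    """Keep the most probable tokens until their cumulative probability reaches p."""
    kept, cumulative = [], 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}


def sample(probs: dict, rng=random) -> str:
    """Weighted random choice among the surviving tokens."""
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical next-token probabilities
probs = {"cake": 0.50, "donut": 0.25, "banana": 0.15, "apple": 0.10}

print(top_n_filter(probs, 3))   # apple is dropped, the other three are renormalized
print(top_p_filter(probs, 0.7)) # cake + donut already cover 0.7, so only they remain
print(sample(top_n_filter(probs, 3)))  # never "apple"
```

The difference in design is that top N always keeps a fixed number of candidates, while top P keeps however many it takes to cover the requested share of probability, so it adapts when the model is very sure or very unsure.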
Absolutely. Everyone, once I initialize my LLM, what do I do next? I need to write a function called calculate. Now, everybody observe very carefully what I'm going to do, okay? This is where my logic goes. The function takes something called expression, and inside calculate I write return eval(expression). Now, everybody, do you know what eval is? In Python, eval evaluates an expression that you give it as a string. For example, eval of 2 + 3 * 4. What is the answer? Did it evaluate? 2 + 3 * 4 is 14. It gave me the answer. Okay, lovely. Why do you use eval? To evaluate any mathematical expression and give you the answer. For example, can I evaluate something like 4 multiplied by Rudra? Rudra is a name, not a number, so no, right? But if I have numbers, eval will take any mathematical expression and give you the answer. That is why we use eval.
Now, everybody listen to me, okay? I will write a second function called average_pill_weight. Inside this, I'm going to write a dictionary. Everybody observe what I'm doing, okay? pill_weights is equal to a dictionary, and inside it I add aspirin. Tell me if my spelling of aspirin is wrong or right, okay? 500 mg. And what other medicines do people consume? Paracetamol.
Right? Paracetamol is, let's say, 650 mg. And let's say ibuprofen, okay? This is something that I take, 400 mg. Everybody listen to me. Are these medicines and their weights? In this example, aspirin is 500 mg, paracetamol is 650 mg, and ibuprofen is 400 mg. Have I written two functions? One is called calculate and the other is called average_pill_weight.
My people, please listen to me very carefully. Next I write known_actions is equal to a dictionary. Inside it, the key calculate maps to my calculate function, and the key average_pill_weight maps to my average_pill_weight function. My people, for my agent to perform some actions, what actions have I written? Two. One is to calculate something, whatever that something is. The other is to get the average pill weight. Will my agent perform anything else apart from these actions? No. I literally told my agent that these are the actions you are going to perform, and I named them very specifically. So tell me, am I building an agent that can perform any action? No. These are the actions my agent will perform. Understood, everyone? And what are these actions? Whenever I call calculate, it will calculate something. If I call average_pill_weight, it will return the key and the value. That's all I want my agent to do. Is everybody clear until here? Someone says: for calculate, we have not specified an action.
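Put together, the two tools and the known_actions registry might look like the sketch below. The dictionary values come from the walkthrough. The exact signature and return string of average_pill_weight are assumptions, since the video does not spell them out. Note that eval executes arbitrary Python, so this is fine for a classroom notebook but should not receive untrusted input.

```python
def calculate(expression):
    """Evaluate a mathematical expression given as a string.

    Caution: eval runs arbitrary Python code, so only use it on trusted input.
    """
    return eval(expression)

# Example weights in mg, as used in the walkthrough.
pill_weights = {"aspirin": 500, "paracetamol": 650, "ibuprofen": 400}

def average_pill_weight(name):
    """Look up a pill by name (assumed signature) and return its key and value."""
    key = name.strip().lower()
    if key in pill_weights:
        return f"{key}: {pill_weights[key]} mg"
    return f"{key}: unknown pill"

# The only actions the agent is allowed to perform.
known_actions = {
    "calculate": calculate,
    "average_pill_weight": average_pill_weight,
}

print(known_actions["calculate"]("2 + 3 * 4"))
print(known_actions["average_pill_weight"]("Aspirin"))
```

Keeping the tools in a dictionary means the agent can only ever dispatch to a name that appears as a key, which is exactly the restriction described above.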
My calculate does specify an action: whatever expression you pass, it evaluates that mathematical expression and returns the result. Yes or no, everybody? If I pass 3 * 2 to calculate, will it not return 6? eval returns the value of the expression: 3 * 2 = 6.
Now, my people, I'm going to write a class called Agent. def __init__. What do you call __init__, everybody? Is it your constructor? self.messages is equal to an empty list. Everybody, in every agent or chatbot, do you not have a conversation history? The user asked something, the bot responded, then the user asked something else, and the bot responded again. Where will I save that history of messages? In an empty list. So can I say this is to maintain the message history? Whatever my user has asked and whatever my agent has responded, I keep in self.messages. Correct? Agree or disagree?
Next, __call__. Why have I written __call__? __call__ is what you call a dunder method in Python. Why do you use it? Because it overrides what happens when you call an instance of the class like a function. And if you are asking me, "Sir, what is override? What is behavior of the class?" then I am like, "Guys, you need to pick up your object-oriented skills," specifically basic Python. Everyone, I'm going to pass a message into my model. Don't you want to keep the human messages and the AI messages separate? It would be amazing if I could segregate them. self.messages.append.
Whenever a human pass a message, can you make it human message and the content will be equal to message. Are you guys understanding what am I basically doing now? Did I take from my library two single things? One is called human message and the other is called AI message. Why did I do that? Is it helping me segregate what message did the human pass inside the AI? And what response did you or did the AI give you? To keep the template separate, did I order or did I import something called human message and AI message? Are you guys understanding? If I come over here and pass content is equal to message, and what am I going to do? I am going to get some response. Yes or no, everybody? I'm going to get some response, no? Lovely. Now, everyone, I'm going to write down something called pass here. Okay? The reason I'm going to write down something called pass here is because what I did not teach you so far is how to use the API, which is your Google Gemini API. Everybody, is this your Gemini API? Here? This particular API here, right? I'm going to copy this both. And I'm going to paste it here towards the bottom, okay? I'm going to paste it here towards the bottom. Your question to me could be, "Sir, I have a question I want to ask my chatbot." Okay? Like, "Who is the president of India? Who is the prime minister of India?" Whatever XYZ question you want to ask, you have some question. "Sir, I want my bot to respond with with a certain, you know, response like, 'Okay, the prime minister of India is this. The prime minister of XYZ is this.' I want to generate some answer. I want to generate the answer using my LLM." "Sir, how do I do that?" Everyone, please understand one thing. Your Gemini is literally very simple to invoke. And when I say invoke, it is literally so simple to literally get a response. You know how do you get a response? Suppose if you have a question, okay? Let's say your question is this. P R O M P T. 
The prompt is "Who is the prime minister of India?" Here, my people, you want to generate a response from the model, okay? To do that, you first have to declare the model. So you write model is equal to genai.GenerativeModel. And inside this, what is the name of your model, everyone? Is it gemini-2.0-flash? Is this the model name you used for your LLM? Lovely. The response will be nothing but model.generate_content, and inside this you pass your prompt. Then you display the response directly. And if I run this, did I get the answer? The prime minister of India is Narendra Modi, and along with it I got a lot of other fields from the API: finish reason STOP, average log probability, usage metadata, token counts, and everything. "Sir, I don't want my response cluttered with so many different things." Just write response.text. Is the answer now plain text? "The current prime minister of India is Narendra Modi." Let me ask another simple question, okay? "What is the most important personality trait a man should have?" and run this. Let's say I'm asking it a weird question like that. See, it responds: "There is no single most important personality trait for a man or anyone really. What's considered important often depends on individual values." I may not be interested in this particular answer, but did it give me an answer? Right. So tell me, did I ask a large language model, and did I use the Gemini API to get the response? This is how the response is generated.
You will create something called model, you will get something called prompt and you will put something called generate content, the response will be generated. Understood? Simple as that, easy as that. Understood? Okay. Coming to the bottom, everyone. So, can I say in the response part, in the response part of my program, I might write another function in the future that can give me some response. For now, can I say that the response is going to be something called self. execute? Execute is a function that I have not written yet, I will write later. Can I say it like that? This function self.execute I have not written yet, I will write it later, but for now I will say I will get some response. And whenever I get some response, I will in the self.messages append this time whose response will I get? Will I get the AI message response after the message of human is passed? And what will be the content of the message? Whatever response I generate from the user, that response I'm going to get here, right? So, I'm going to copy the response. And whatever response you are giving me, I will save it. And return response. Return response. Because I have written a function called execute, my people now am I going to create something called as execute, this function called execute? Because execute needs to do something. What will the execute do everybody? Response is equal to LLM of self.messages. Whatever messages I am passing inside the model, am I going to take that and pass it as a self.messages to the response? And for that return or for that response, I'm going to return an answer called response.content. Everyone, what is execute doing? It is taking your messages and running it. I'm basically running the response cycle, that's it. So, first of all here, if I take dot content, please tell me, what is the result here? Did it return anything? Dot content, did it return any content? What did I say? Did I say text here? Did it return the text? Okay. 
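The Agent class described so far can be sketched as follows. To keep the example runnable without an API key, HumanMessage, AIMessage, and the llm object below are small stand-ins written for this sketch, not the real LangChain classes or the Gemini model. With a real LangChain chat model, the call inside execute would typically be llm.invoke(self.messages).

```python
from dataclasses import dataclass

@dataclass
class HumanMessage:
    """Stand-in for the imported human message type (assumption)."""
    content: str

@dataclass
class AIMessage:
    """Stand-in for the imported AI message type (assumption)."""
    content: str

class EchoLLM:
    """Hypothetical fake model: replies by echoing the last message."""
    def __call__(self, messages):
        return AIMessage(content=f"echo: {messages[-1].content}")

llm = EchoLLM()

class Agent:
    def __init__(self):
        # Conversation history: human and AI messages, in order.
        self.messages = []

    def __call__(self, message):
        # Lets you write agent("...") as if the instance were a function.
        self.messages.append(HumanMessage(content=message))
        response = self.execute()
        self.messages.append(AIMessage(content=response))
        return response

    def execute(self):
        # Send the whole history to the model and return only the text.
        response = llm(self.messages)
        return response.content

agent = Agent()
reply = agent("Who is the prime minister of India?")
print(reply)
print(agent.messages)
```

Because the full history is passed on every call, the model sees earlier turns as context, which is what the later loop over turns relies on.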
You got to understand that this function right here, the one that you are seeing in front of you, execute, is not complete yet. So, dot content is something that I will tell it to return. You're getting what I'm trying to say. In Python, you know know that you can write down certain methods that you declare later that will follow your behavior. You're getting what I'm trying to say. Are you understanding? This dot method is something that I will override. I will tell dot content will return what. Which is nothing but dot text, but I will come to that later, no? This execute is something that I have written. Understood until here is what I'm trying to say. The reason I keep report repeatedly importing the packages and re-initializing it again is because I want the code to be readable and beautiful. Meaning anybody in the class should not look at my piece of code and is like, but where did he initialize the large language model? So, I'm basically going between writing the whole initialization so that you guys can understand where I have initialized it. You need the complete and absolute control over the messages you are the user pass. So, in order to do that you need to take the messages as a history like that. You can take caching also if you want to or you can take a database here also if you want to. Meaning, rather than having the self.messages, you can also pass a database here and it'll execute, okay? It's executing human message and responding, correct. All the messages inside the LLM, it's going to take and it's going to respond. Where is the messages? Isn't this the messages, the list? You're going to understand, don't worry. Right? Can I execute my agent? Can I run my agent? My agent just ran, correct? Now everyone, action_regular_expression is equal to re.compile. 
Inside this I'm going to write a raw string: r, then the word Action, a colon, \s*, a group (\w+), a colon, \s*, a lazy group (.*?), and finally (?:\n|$), which means a newline or the end of the string. Put together, the pattern reads roughly as r"Action:\s*(\w+):\s*(.*?)(?:\n|$)". Now, you must be wondering, "Sir, what have you done? What in the world have you written here?" Okay. Have you ever heard of a concept called regex? Anybody? Yes, lovely. How does regex work, anybody? Let's open regex101, okay? Lovely. As test text I type: I love India so much @@ 12345. How many different kinds of characters do you see in this text? Three. If I type A to Z as the pattern, is it highlighting anything? No. "Sir, what about the r prefix you used, followed by A to Z?" Again, not highlighting. See, you've got to understand one thing. When you test a regex on regex101, there are multiple flavors to choose from, for example Python. Can you see the Python option here? Now, if I type just a, do you see that it matches a, with the global modifier on? If I type P, does it match? No. If I type 1, the 1 is highlighted. If I type capital A to Z with the brackets placed like this, it does not match anything. And all small a to z written the same way? Again, nothing. The reason is that a regular expression, in Python or anywhere else, has to be written with specific syntax. Only when you use the right special characters, which I'm calling anchors here, will it match. But let me come to that in a second, okay? First of all, everybody, you've got to understand one thing.
Whenever you are dealing with text, you want to identify certain things, correct? Like for example, how many are characters? How many are sentences? How many are uh words? How many are numbers? Correct? Let me ask you a question. When you guys were doing Python, did you not do regex at all? Did you implement regex? Like for example, if I ask you to implement A to Z or A to Z, did you not implement a regex like that? No, okay. No problem. Everyone, I want to monitor every single character which is from the alphabet A all the way till alphabet Z. Is everything highlighted everyone? See? I space space space love space space space India space space space so space space space much space space space at the rate and then 1 2 3 4. Can I say every character is highlighted? Every character got highlighted. Okay. I also want every number between zero to nine get highlighted. Now, is number and character both highlighted? Okay. If I say /s, is my space also highlighted? If I remove everything, I only want to highlight my white spaces. Are my white spaces highlighted now? Just the white spaces. If I say /w, what is highlighted now? Can I say both my alphabet and my characters, everything is highlighted apart from the special character at the rate at the rate? Can I say it like that? So, everyone, can I say that in regex there are multiple different anchors like that. /s is monitor my space, /w is monitor everything, dot will have some meaning, star will have some meaning, question mark will have some meaning, this question mark will have some meaning, this colon will have some meaning. Do you think in regex all these tags or anchors will have their own meaning? So, if I go to regex anchors, regex anchors, okay? And if I click on regex anchors and click on this everyone, how many anchors are you supposed to learn? Can you see everyone? In your regex anchor boundary, carrot, dollar, beginning, end, bringing order, delimiter, multiple different types of anchors. 
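The regex101 experiments above can be reproduced with Python's re module. This is a small sketch using a sample sentence close to the one typed in the video. Strictly speaking, [a-zA-Z], [0-9], \s, and \w are character classes, while ^ and $ are the anchors. The walkthrough uses "anchors" loosely for all of them.

```python
import re

text = "I love India so much @@ 12345"

letters = re.findall(r"[a-zA-Z]", text)   # every letter, upper or lower case
digits = re.findall(r"[0-9]", text)       # every digit
spaces = re.findall(r"\s", text)          # every whitespace character
word_chars = re.findall(r"\w", text)      # letters, digits, and underscore

print(len(letters), len(digits), len(spaces), len(word_chars))
print("@" in word_chars)  # special characters are not matched by \w
```

Note the direction of the slash: the classes are written with a backslash (\s, \w), and the r prefix on the string stops Python from treating that backslash as an escape.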
So, if I search for regex anchors Python and click on this result, how many anchors do you see, everyone? Can you see plus, star, and many other special characters? Okay. Can you guys try to work out which ones I have used in this code? What is this \s? What is this star? What is this \w+? Can you try to figure out what each of them does? See, guys, even if I spent a whole day on it, teaching you all these regex patterns and anchors would not be easy. And without the regex, I will not be able to build the agent. So your understanding of regex is very, very important. The best thing you can do is go to the website called regex101. There you can test literally any regex pattern you want. Okay?
Just a quick info, guys. Intellipaat offers an advanced certification in agentic AI systems and design in collaboration with IIT Madras Pravartak. It will enable you to build autonomous agents, multi-agent systems, and production-grade RAG pipelines using CrewAI, AutoGen, LangGraph, and modern Python. With this course, we have already helped thousands of professionals successfully transition into AI-driven roles. You can check out their testimonials on our Achievers channel, whose link is given in the description below. Without a doubt, this course can take your career to the next level. So visit the course page link given below in the description and take your first step towards building a future in agentic AI.
I gave you guys an understanding of the small agent that we are building. When you want to build an agent, what do you do? We have a certain framework in mind and we follow it. Okay? And that framework was, if I just open it up very quickly and show you in the PPT, right?
That when we build any specific agent, right? That agent, okay? When we started discussing on something called, you know, say a reflection pattern or a tool use pattern or a planning pattern or any kind of pattern, right? When you see different kind of patterns in an agent, right? You see that an agent is given a certain prompt and based on that prompt, the agent decides what task it needs to achieve and then it achieves that task. Okay? I made you guys understand how do you take an API key. Now, of course, it is not recommended to keep the API key open and please do not make the mistake. Uh I'm doing this just in case to show you guys an example, but mostly in production, you are always going to hide the API key. So, I'm going to write the comment. Hide the API key. Please do not keep it open. Okay? Please do not keep it open for everybody else to see. So guys, what I did is I took the Gemini API key. I initiated the LLM. Let me go ahead and very quickly install all other packages too because as you guys are aware in Google Colab uh you kind of have to reinstall everything if you're starting the instance after a long time, right? So, let me just reinstall everything and let's very quickly jump on to checking whether the API key works as well. So, let's first let this install. Let's see what instance do we have. Yeah, CPU seems fine. That's okay. Let's see list of available models. If this runs, my API is working fine. Status code is 200, which is all going to be fine. Lovely. Perfect. Then I have okay response.text. [snorts] Lovely. This is fine. Okay, no problem. So, let's say LangChain LangChain schema. Okay, no deprecation. Everything is fine. Chat Google generative AI, which is fine. Eval of 2 + 3 6, which is fine. Okay, lovely. Okay, good. I was here where I have built an agent. And within this agent, I have created this thing called self.messages. Self.messages is a way for me to just maintain a history or like a small history throughout the message, right? 
Whenever the AI writes a message, I record it. Whenever a human writes a message, I record that too, right? Then I used a method called __call__, which runs when you call the agent like a function. Here I'm keeping human messages and AI messages separate. Okay? I return the response, and execute, which I mentioned, is just the way to run the messages through the model. I told you guys that we usually have something called response.text, and I'm going to show you why we used response.content instead. Okay?
Now, everyone, I wrote a regex pattern called action_re, the action regular expression, where I'm writing some regex anchors. By the way, just letting you know, all these symbols, the question marks, the \s and so on, are what I'm calling regex anchors, okay? I'm going to apply this pattern to something, and I'll tell you exactly how it works in a second. What I'm going to do is take a user query and perform an action. Okay? But before I do that, I want you guys to understand one very important concept, because without it the agentic framework will become slightly complicated. See, guys, in any agent that you build, you have to understand one thing. Whenever a task is given to an agent, the first thing the agent does is something called observation. Observation is the agent's understanding of what the task is, right? Say I want you to extract all the details from the internet about, let's say, the keyword AI trend, okay?
Suppose if I ask AI all AI trend related news, social media post, Instagram, Facebook, everything that mentions AI trend, extract. So, when you give a task like that, the agent is going to first observe, okay, what that task is. And once the agent understands, which is what observation means, it's going to then perform the next step. It's called action, okay? Action is basically agent saying, okay, let me first go to internet, then go to Instagram, then go to Facebook, and whatever task I'm performing, right? Those actions are going to be taken. Remember, the actions you're going to be taken has to be first defined by you. If your actions are not defined, okay? In clear cases, what is it that you allow the agent to perform and don't allow the agent to perform, then actions won't be taken, okay? You also can automate your actions automatically, but remember guys, mostly what happens, we don't allow agent in real life, not that I have seen any company, but you know, do this, where they allow the agent to just do anything, right? They allow the agent a certain amount of how do I say this? A certain amount of freedom, but that freedom is always validated. You don't You don't just keep an agent and says, "Okay, I'm going to allow you to do anything." That doesn't happen, okay? So, you will You usually have refined and defined some actions that the agent is going to perform. And then the agent decides which action it needs to perform. For example, suppose you have action number one, action number two, action number three, action number four, and goes on and on. Many kind of actions you want the agent to perform. You can always let the agent choose which action is the best action for that task. That's okay. But what the actions will be, will that definition has to be given in order for your results to always be proper and very standardized, okay? So, remember, your agent is going to have or perform something called observation. 
And based on the observation, the agent is going to then perform a certain task. So, what we are going to do, we are going to take this understanding and we are going to put it in the code where the agent performs a certain observation. And then after observation, it performs a certain task, okay? That's exactly what we are intending to do. So, everyone, what we are going to do, we are going to write down a function, okay? Now, this function right here is going to perform something. What that something is, let's just write it down, right? Define run_query, okay? And let me expand this a little bit and tell me if this is visible, right? I've zoomed in a little bit. So, So guys, here in the run query part, first, I need to take the prompt from the user. Okay? First, I will take the prompt from the user. Then, I will also take another thing called maximum turns is equal to 10. This maximum turns equals to 10 is my way of telling the program that I'm going to allow agent 10 different operations or 10 different uh you know, opportunities. If within these 10 opportunities, the agent is incapable of performing the task, I exit from the code and I stop the program. Okay? I'm telling the agent that look, I'm going to give you 10 turns and you need to finish the task specified to you. Okay? This is only and only so that the agent has a break limit. As in, if the agent goes inside the loop, it is not able to complete its task, it's keep continuing, continuing, continuing, we say, "Okay, it's it's been 10 turns, you're not able to find anything. I'm asking you to stop everything and come out of it." Okay? Maximum turns is not the action that needs to be taken by the agent. These are the actions to be taken by the agent. Let's not get confused at all. So, what I'm going to do right now is I'm going to take the agent and I'm going to initialize my agent. And how do I initialize my agent? Because I have already written a small agent class here, right? I initialize it. 
I also take a variable called next_prompt, which is equal to prompt. Okay? Now, everybody, I write a small for loop: for turn in range(max_turns). Meaning, for however many turns I'm permitting the agent to go through, I loop that many times. Inside, I write print with an f-string that shows the turn, just to let the user know which turn I'm currently in. I print turn + 1 because, as you guys are aware, in Python indexing always starts at zero, and telling the user "this is turn zero" doesn't make sense to most people. "How can it be turn zero? Shouldn't it start with one?" So we say turn + 1. Then we generate a response: response is equal to agent(next_prompt). We take the agent we initialized and pass next_prompt into it. Remember, the Agent class keeps a history and has an execute method where the response is generated by the LLM. So passing next_prompt gives me my response. Then we print the response with a label like Assistant, then \n for a new line, then the response in braces, and another \n. What this loop achieves is very simple, right? It goes through the loop up to the maximum number of turns, and for whatever turn it is, it prints the turn, generates a response, and prints what my assistant gave back.
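The loop described so far can be sketched as below. The Agent here is a hypothetical stand-in that returns a canned reply so the example runs without a model, and the returned turn count is an addition made only so the behavior is easy to check. The action handling that would normally end the loop early comes later in the walkthrough.

```python
class Agent:
    """Hypothetical stand-in: a real agent would send the history to the LLM."""
    def __init__(self):
        self.messages = []

    def __call__(self, message):
        self.messages.append(message)
        return f"Thought: still working on '{message}'"

def run_query(prompt, max_turns=10):
    agent = Agent()
    next_prompt = prompt
    turns_used = 0
    for turn in range(max_turns):
        # range starts at 0, so show turn + 1 to the user.
        print(f"Turn {turn + 1}")
        response = agent(next_prompt)
        print(f"Assistant:\n{response}\n")
        turns_used = turn + 1
    # Once max_turns is reached the loop simply stops: this is the break limit.
    return turns_used

used = run_query("What is 2 + 3 * 4?", max_turns=3)
print(f"Stopped after {used} turns")
```

The cap matters because an agent that never produces a final answer would otherwise keep calling the model, and paying for it, indefinitely.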
Now, what happens, my people, is that for my agent to perform certain actions, I need to extract those actions from the response. So I come over here and write: actions is equal to action_re.findall(response), where action_re is the action regular expression. Now, guys, this is where it should become slightly clearer. Whenever the assistant generates a response, I apply my action regular expression to that response. What is this regular expression for? It picks out the part of the LLM's response that is formatted in a certain way, so I can check whether the response contains any actions or not. So I write: if actions. If this value exists, I say action, action_input is equal to actions[0]. Now, this is going to be slightly tricky without seeing what the output of the LLM is. You'd be like, "Okay, but why did you take actions[0]?" In very simple terms, my people, the LLM tells you what actions to take, or rather the agent decides what actions it needs to take, and it may list a first action, a second action, a third action. I'm choosing to take the first action the LLM recommends to itself. It's like how we humans think. Let's say you want to travel to a different country. Your mind must be thinking, "Can I take a flight? Can I take a bus? Or can I take, let's say, a cruise?" And then you're like, "No, I think the flight makes more sense. My first decision was correct." So, it's like this.
You think by yourself, decide on your own actions, and say, "Okay, this is what I'm going to take." Okay? What I'm also going to do is strip the value: action_input is equal to action_input.strip(), because the action input I receive is a string taken from the text the model generated, okay? When we run this agent and see the output, this code is going to make more sense. Right now it might look like, "Sir, why have we written it exactly like this?" But don't worry, it will make more sense once you see the output, okay?
So I write another if condition: if action not in known_actions. Meaning, if the LLM by any chance tells the agent to take an action which you have not authorized. The agent can take only two actions: either calculate something, or give me the average pill weight. If the action is anything that you have not provided, you need to raise an error with a message saying, "This is a value error. Unknown action. The action you're recommending does not exist as a valid action." So we raise a ValueError with the message unknown action, followed by the action. Hey agent, you are performing an action which is unknown, and I don't know how to handle it. Okay? And here, in the same if condition, we come down and write something different, right?
Someone asks: how will the agent do unknown actions, sir, as we are only defining these methods? Correct, but you have to understand that as a Python developer, you always have to make sure you write the else condition or the error condition, so that every single time you're able to catch that exception. In Python, right?
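The parsing and validation steps can be sketched together as below. The pattern is a reconstruction of the one dictated earlier, so treat its exact form as an assumption. The sample responses are invented for illustration, and parse_action is a hypothetical helper name that wraps the lines written inside the loop.

```python
import re

# Reconstructed pattern (assumption): "Action: <name>: <input>" up to end of line.
action_re = re.compile(r"Action:\s*(\w+):\s*(.*?)(?:\n|$)")

pill_weights = {"aspirin": 500, "paracetamol": 650, "ibuprofen": 400}

known_actions = {
    "calculate": lambda expression: eval(expression),  # trusted input only
    "average_pill_weight": lambda name: pill_weights.get(name.lower()),
}

def parse_action(response):
    """Return (action, action_input) for the first action found, or None."""
    actions = action_re.findall(response)
    if not actions:
        return None
    action, action_input = actions[0]      # take the first recommended action
    action_input = action_input.strip()    # remove stray whitespace
    if action not in known_actions:
        raise ValueError(f"Unknown action: {action}: {action_input}")
    return action, action_input

# Invented model output, in the format the pattern expects.
sample = "Thought: I need the dose.\nAction: average_pill_weight: aspirin\nPAUSE"
action, action_input = parse_action(sample)
result = known_actions[action](action_input)
print(action, action_input, result)
```

findall returns one tuple per match because the pattern has two capturing groups, which is why actions[0] unpacks cleanly into the action name and its input.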
When you wrote programs, did you ever bother writing something called try/except? You let the code try something, and only in certain cases do you hit an exception, correct? You catch that exception: it could be a keyboard interrupt, a value error, or any other kind of error, because beyond the try block we don't know what might go wrong. In that case, the except block is able to catch it. Here, rather than catching an exception, I'm raising a `ValueError`. `raise` is another keyword in Python, and it throws the error on the spot when that condition is hit. So I'm not using try and except; I'm just using the `raise` keyword, and it raises the error. Meaning, I'm covering that other else condition that might show up in some case, whatever the case could be. What if my LLM is not able to respond properly? Then I raise that exception. To help you understand the code well, let me first do one thing: let me write the next piece of code, the observation. Then I'm going to take you through it line by line and print everything so that you can understand it even better. Because what happens, you know, sometimes, let's say if I'm
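To make the difference between `raise` and try/except concrete, here is a small sketch. The code in the transcript only raises; the catching side shown here, including the helper names `dispatch` and `safe_observation` and the idea of turning the error into an observation string, is an assumption added for illustration.

```python
def dispatch(action: str, known_actions: dict):
    """Look up a tool; raise (throw) an error if it is not authorized."""
    if action not in known_actions:
        # raise creates and throws the exception; nothing is caught here.
        raise ValueError(f"Unknown action: {action}")
    return known_actions[action]


def safe_observation(action: str, action_input: str, known_actions: dict) -> str:
    """Hypothetical caller that catches the error instead of crashing."""
    try:
        tool = dispatch(action, known_actions)
        return f"Observation: {tool(action_input)}"
    except ValueError as err:
        # try/except is where a raised exception actually gets caught.
        return f"Observation: error: {err}"


tools = {"echo": str.upper}  # toy tool, used only for this example
print(safe_observation("echo", "hi", tools))  # Observation: HI
print(safe_observation("fly", "Paris", tools))  # Observation: error: Unknown action: fly
```

Without the try/except in the caller, the `ValueError` would propagate up and stop the program, which is the behavior of the raise-only version described in the transcript.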

Original Description

🔥Book your Free Masterclass: https://forms.gle/g5tExa7e54xpYZW97
🔥Enroll for Agentic AI Course: https://intellipaat.com/agentic-ai-systems-design-course/

In this video, Agentic AI Full Course 2026 Free, you'll explore the complete journey of modern AI, starting from the fundamentals of Generative AI and moving toward advanced concepts like Agentic AI. You'll clearly understand the differences between these two paradigms and how generative AI systems actually work in real-world applications. The session is structured to build your foundation step by step, making even complex AI topics easy to understand.

As the video progresses, you'll dive deeper into essential tools and concepts such as LangChain, AI Agents, and key Agentic AI design patterns. It also covers important technical building blocks like embeddings and Large Language Models (LLMs), helping you understand how intelligent systems process, store, and retrieve information. These concepts are crucial for anyone looking to build or work with advanced AI applications.

Towards the end, the video explores topics like RAG (Retrieval-Augmented Generation) Agents and the Model Context Protocol (MCP), giving you insights into how next-generation AI systems are being designed. Whether you're a student, developer, or professional, this video provides a complete roadmap to understanding and working with the AI technologies that are shaping the future.

📖 Below are the concepts covered in the video on "Agentic AI Full Course":
00:00:00 - Introduction to Agentic AI Course
00:04:25 - What is a Generative AI system?
00:57:17 - What is LangChain
01:17:01 - AI Agents
01:35:19 - Agentic AI Design Patterns
02:27:06 - Agentic frameworks and basics
04:27:53 - Embedding
04:59:22 - Large Language Model
06:47:33 - RAG Agent
07:45:34 - Model Context Protocol

#agenticai #artificialintelligence #ai #machinelearning #da


This video introduces Agentic AI, covering its capabilities, limitations, and applications. It provides a comprehensive overview of the topic, including the use of large language models, agent-based systems, and autonomous workflows.

Key Takeaways

Understand the basics of Agentic AI and generative AI
Learn about large language models and their applications
Explore agent-based systems and autonomous workflows
Use LangGraph and LangChain for Agentic AI
Integrate LLMs with external tools and APIs
Implement guardrails for LLMs

💡 Agentic AI has the potential to revolutionize the way we interact with technology, enabling autonomous agents to perform complex tasks and make decisions without human intervention.
