Build Your Own RAG System: Step-by-Step Python Tutorial (LangChain, CrewAI, OpenAI)

Analytics Vidhya · Intermediate · 🧠 Large Language Models · 1y ago

Key Takeaways

This video tutorial demonstrates how to build a Retrieval-Augmented Generation (RAG) system using LangChain, CrewAI, and OpenAI, covering the workflow from identifying relevant documents to testing the agent's responses before deployment. It walks step by step through using these tools to generate coherent responses to queries.

Full Transcript

Hello everyone. Today we are going to learn to build a RAG-based system. RAG stands for retrieval-augmented generation, and simply put, a RAG system enables us to chat with a set of documents. Retrieval-augmented generation, or RAG, is a hybrid AI approach that combines retrieval with generation. Retrieval is the process of fetching relevant context from documents or a knowledge base based on a query or a prompt. Generation is the process of combining the relevant context fetched by the retrieval step with the general awareness of the LLM to generate a coherent response.

A RAG system can be used for several tasks, such as building a customer query resolution system using reference documents. It can also be used to build an internal knowledge retrieval system for employees to quickly find answers from the company's documents. A RAG system can also be used to build a legal case research system or a healthcare decision support system. Simply put, wherever there are a lot of documents or context involved and we need to extract specific information, that's where RAG can be used.

In this video I will be showing you how to build a RAG system using a simplified version of the learner query resolution system that we have built at Analytics Vidhya. You could use this system as a reference to build various other kinds of RAG applications.

The workflow over here shows the key components in a RAG system. Let's quickly walk through it and understand it. While building RAG systems, I often divide the process into two phases. Phase one involves building a vector database, and phase two involves testing the responses to the queries.

The first step is to identify relevant documents. In our use case, our learners post their queries, and it's very important to respond to these queries as soon as possible so that learners have a great experience. The queries that we get are mostly specific to the course or the lesson that learners are going through. Sometimes these queries can be general as well. So the list of
documents that we initially thought were going to be relevant for answering learner queries were videos of the courses, PPTs of the courses, subtitles of the courses, and the past queries that the learners have posted. On testing, we realized that processing videos directly is going to be expensive without significant benefit to the quality. The PPTs were also not capturing the entire context, as instructors often have a tendency to explain a lot of concepts verbally. The last two resources, the subtitles and the past queries, actually came out to be very effective. In the final version we used both of them. However, for this video, to keep things simple, I'm just going to show you the system which is based on subtitles.

After we identify the documents, the next step is to break the documents into smaller parts called chunks. This is required so that we get just the small section of the document which is relevant to the query posted.

The next step is embedding models. Our computer systems don't really understand text, they only understand numbers, so this bunch of text is converted to some meaningful numbers which capture the gist of the spoken words. This is done through pre-trained embedding models. Some popular embedding models are OpenAI text embeddings and Sentence-BERT, but generally speaking, most of the popular LLMs have their own set of embeddings. So there's an embedding by Llama, there's an embedding model which is used in DeepSeek, and so on.

Once we get the embeddings from chunks, we need to store them efficiently using a vector DB store. There are various vector DB stores which are popular: one of them is Pinecone, another is Weaviate, and another one is Chroma DB. In our scenario we have used Chroma because it's open source and free to use.

Now that we understand phase one, let's start implementing this in code. Let's move to VS Code. We will begin by importing some essential libraries. Let's understand the important ones. The recursive text splitter helps in
chunking the documents. The embeddings that we are using in this case are the OpenAI embeddings. We are using the Chroma DB store, which is already built into LangChain, and we are also importing the essential classes from the CrewAI library to build the agents later on. Let's import the libraries.

Next, I'm going to use my OpenAI API key. You can either use an environment variable or store the key in a file, the way I have done it.

Next, we are going to build a helper function which will use pysrt to process the SRT files that we have in our system. So let's run this function.

Now, the structure in which our data is stored is that there's a folder, and it will have multiple SRT files. So what we are going to do over here is refer to the name of the course, and we are creating a dictionary where we are directing it to the file path where that folder is stored. In that folder we would have various kinds of files: there would be PPTs, and there would be SRT files for various different lessons. So what this bunch of code is going to do is just extract the SRT files and start storing them in a list, and that list is course SRT files. In fact, that's a dictionary rather. So let's run that. Let me show you some SRT files as well. Basically, what we have is the different file paths of the various SRT files that we have in the system. Just to keep the code clean, let me comment it back.

Okay, now we come to the next part of our system. We are going to chunk the documents and set up a vector store as well, after we get the embeddings done. One thing that you would notice is that we are creating a persistent directory to store our vector DB. The reason for this is, let's say if we run this file again, we would not want the same embeddings to be created again. Even though embedding models are quite cheap nowadays, why not save money if you already have embeddings available for a particular
course? So in case embeddings are present for a course, with the persistent directory we can make sure that we don't recreate the embeddings for that part.

Okay, so we set up a chunk size of 1,000. That means for each thousand characters we'll have one chunk. We have also added a chunk overlap over here. A chunk overlap is added so that the context does not end abruptly in a chunk. It would be like this: the text would start from here, and the next chunk is going to start with a little bit of an overlap, so that no context is lost between the different chunks.

We initialize our OpenAI embeddings and then set up the vector store. So far we have just set things up; we have not actually built the vector store. Let me set up the vector store over here. We have named the collection as course material, and this collection name would be used later on to retrieve information from the vector store. We have passed on the initialized OpenAI embeddings over here, along with the persistent directory that we created earlier. Okay, let's run this code as well. We get some warnings related to LangChain deprecation, but nothing really important.

Okay, so this is a very interesting bit of code. Just so that we can estimate, we have also added the part about how much it will cost us to do the embeddings. We have got some estimates on the costing, and we are going to print out the total cost of actually doing the embedding. From the dictionary that we saw earlier, we are going to look at each of the courses, get the SRT files over there, and add them to the collection. Interestingly, what we are also doing is creating a collection with a meta name. We are creating the meta name as the name of the course. This is a very important step because it is going to help us in efficient retrieval: whenever we post a query, it is only going to look at the SRT files of that particular course. So in our
system, when a learner posts a query, they post it for a particular course, and we needn't go through the entire vector database to look for the relevant content. We just need to jump directly to the meta description of that particular course, and that should be quite efficient for us. So this is a key component that you should also be thinking about implementing in your system.

Okay, so what we do over here is use some of the functions that we created earlier to extract text from the SRT file. This was the helper function that we created in the very beginning. After we extract, we use the Document class that we imported earlier and then the text splitter that we created in the earlier code, with the chunk size of 1,000 and an overlap of 200. This is going to divide the data into multiple parts, and we are going to do this in batches. Finally, in the vector store, we are going to add the documents. The vector store is something that we initialized earlier with the OpenAI embeddings and the persistent directory.

Okay, so let's run this code. It's going to take maybe 30 seconds or so. Let's see how long it takes, and let's look at the cost as well. Again, this is a dummy version, or a small version, of the actual solution that we have implemented, so in this code we have just used subtitles of three courses. We see that it has added one course; maybe for the other two courses the chunks were already present. We have a course on LangChain, and it has added chunks for that particular course. If I run it again, it should probably say the course was already added. In fact, let's try it out. But before we do that, let's also look at the cost: nothing significant, very minimal cost, but that's also because we only had subtitles for one course, and at Analytics Vidhya we have several courses on various topics. Okay, so let me just run this code again. So this is what I wanted to show you: it says course already exists. So
this course name is already present over there, and that's why it would not rerun the embedding process. Actually, we can show it as well. While this code was running, I realized that the names of the two other courses that we had in the system were not correct, so I went above and changed the file paths of those folders and then added them to the database again. And again, that shows the beauty of this part: for the Introduction to LangChain course we had already incorporated the subtitles, and it said course already exists. The two other courses were not present earlier, and we now have embeddings for those two courses as well. That brings us to the end of phase one.

Before we jump to phase two, let's understand what phase two brings for us. Phase two is all about querying and getting the response. Once the RAG system receives the query, it goes to the same embedding model as used in phase one, and we get the embedded query. The embedded query is then matched with similar embeddings in the vector DB store to get the relevant SRT contexts. The matched chunks are retrieved and used as relevant context for our current query. In the next step, both the query and the relevant documents pass through the LLM, which then generates a final response.

Now let's jump back to our Python code and implement phase two over there. Okay, so by now we have created our vector database store. The next phase is all about querying the data. We have built a helper function over here called retrieve course materials. What this function is going to do is, first of all, based on the query and the course name, filter down to only the relevant course. It will not look at other courses; it will only look at the relevant course. This brings us back to the meta description that we talked about earlier. Once it narrows down the search to the embeddings for a particular course, it's going to find the most similar embeddings based on the query, and by default this is based on cosine similarity. When we specify k
equal to 3, we are saying give us the top three results only. When we get the results, we combine them into a document and return them. So let's run the system and try it out on a query as well. We have a course called Introduction to Deep Learning using PyTorch, and one probable question that learners would ask in this course is, what is gradient descent? So let's try and run this function and see the output.

Okay, so let's look at the relevant context from subtitles. If you look at the first chunk, it seems quite relevant: gradient descent is an optimization technique used to find the local minimum or optimize the loss function. This is the first chunk. The next chunk talks about linear models and mathematical solutions, and visibly it is less related to gradient descent compared to the first chunk. I'm guessing the third chunk would be even less related to gradient descent. The second chunk mentions a bit about gradient descent as well; it says this is where gradient descent comes into the picture, but the first one was very spot on. Let's look at the third chunk as well: this brings us to the intuition behind gradient descent. So our system is working fairly well. All three documents that it has retrieved on this query, what is gradient descent, seem to be quite relevant. This is the set of relevant documents that we would now want to pass on to the LLM, so that it can take this context as well as the query to finally answer this question on what is gradient descent.

Okay, so let's move on. Now, we could have just used a simple LLM call to do these tasks for us, but we wanted a more professional system, so we are building an agent using the CrewAI library. It has some popular classes which help you build an agent: one of them is Agent, another is Task, and finally there's another one called Crew. As the name suggests, the Agent class helps us build an agent, and CrewAI gives us enough opportunity to add relevant
context. So we are initializing this agent, but look at the context. We have assigned a role to the agent, and in this case, since we are answering queries, it is learning support specialist. Look at the goal: you help learners with their queries with the best possible response. Backstory helps us add other detailed context related to the overall objective of the learning support specialist. I'm not going to read it, but yes, we give context about what we do: we are an edtech company and our focus is courses on machine learning, generative AI, etc. Okay, this is the agent's goal.

Let's look at the task that the agent would do. Again, look at the elaborate description that we have provided over here. I'll come to the description in a while, but let me showcase the part that we have added within curly braces. This is not string formatting; in CrewAI, within the task, or for that matter even within the agent, when we add some context within curly brackets, it acts as an input variable that we can use. One input variable that we have provided is the query, the actual question. We are saying: answer the learner queries to the best of your abilities, try to keep your response concise with less than 100 words, here is the query, and then the variable. Similarly, the relevant content that we retrieved in the first step using that function is getting pasted over here: here is similar content from the course, extracted from subtitles. One technique to build good agents is to provide as elaborate a context as possible to your agent, and that's what we are doing over here.

Apart from that, we have also added past discussions. In this case these are not historical past discussions, but discussions which happened to and fro. Let's say, for example, somebody asks what is gradient descent and we respond to that; the person may have a follow-up query: can you explain it to me in more detail? So there could be a thread of conversation happening over there, and that thread
is what we are passing over here, so that the agent has the complete context of the past discussion. Finally, we give the learner's name over there so that the agent can respond in a more personalized manner. And we are saying that the output should be an accurate response to the query. Let's run this one as well. Query answer agent is not defined; I think I forgot to run the earlier code. Let me run it again and come back to this file.

Finally, in the crew we'll combine the agent and the task, and we are keeping verbose as false so that we don't get interim output. The crew that we have created is called response crew. Let's run it.

Now, what we are going to do is something a little practical. It's not really related to building an agent, but we are importing a CSV file which has a lot of queries, so we are testing out various queries that have been posted. There is some helper code that we are using: if a person has only one reply, then don't look at the thread; if it has more than one reply, then look at the thread, something like that. We are getting the query, so just basic processing. In the current version of the agent we are not processing images, so in case a query has images, we are not giving a response to that query.

The interesting part is this one, where we are first retrieving the context from the function that we created earlier and saving it as context. And this is probably the crux of whatever we have done: the crew that we created, response crew, is taking as input the query which was referred to over here, the relevant content which has been extracted from here, and then finally the thread, in case we have that available. Finally, to test the system out, we are adding some string formatting. So let's run this and get a response to a query which is on index one of our database.

Okay, so this was the question, and it looks like this is a follow-up question
because it starts with thanks for the response: so it means when I input a question or query, the query engine will fire an LLM call to check, and so on. But let's look at the response as well. It's very personalized: Hi Sushma, yes, when you input a question, the query engine generates embeddings and fires LLM calls, and so on. So this is a very quick way to respond to queries, provided that the response is accurate.

Now, usually you would not directly deploy such a system; you would want to test it extensively. I'm just going to show you this bunch of code, but what we did while we were testing the system is we got responses for a large number of our past queries. We printed the responses that the agent gave on some of the past queries and then showed them to our internal query experts. We showed them that this was the query and this is the response generated by the agent: is it acceptable or not? We took their feedback and then incorporated that feedback within the agent as well. So it's an ongoing process: you first create a solution, then you evaluate it in detail, and once you're satisfied with the solution, you deploy it, but you still keep on improving it as well.

Here are some suggestions that you can use to further improve such a system, although they may be a little specific to the problem that we have in hand over here. And there you have it: a fully functional RAG system for effective query resolution. But this is not the end; this RAG system can be improved further. You can explore different methods of chunking to find a better one. You can also improve the retrieval through query enhancement. We can also add image processing capability to answer queries with images in them. We can test different approaches and select relevant documents based on various strategies. We can also include other databases, like past discussions, which we have included in our
system at Analytics Vidhya as well. That's it for this video. Do comment below with other RAG examples or use cases that you would want us to cover, and finally, please like and share for more such content.
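
The subtitle extraction and chunking steps described in the transcript can be illustrated with a small standard-library sketch. This is not the code shown in the video, which uses pysrt and LangChain's recursive text splitter; the function names `srt_to_text` and `chunk_text` and the sample subtitle are hypothetical, and the splitter here is a plain sliding window over characters rather than a separator-aware splitter. Only the chunk size of 1,000 and the overlap of 200 come from the transcript.

```python
import re

# Matches SRT timing lines such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}")


def srt_to_text(srt: str) -> str:
    """Drop cue numbers and timing lines, keep the spoken text.

    Simplification: a subtitle line made only of digits is also dropped.
    """
    kept = []
    for line in srt.splitlines():
        line = line.strip()
        if not line or line.isdigit() or TIMING.match(line):
            continue
        kept.append(line)
    return " ".join(kept)


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Cut text into character windows; each chunk repeats the previous
    chunk's last `overlap` characters so context is not cut off abruptly."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# Hypothetical two-cue subtitle, used only to exercise the helpers.
sample_srt = """1
00:00:01,000 --> 00:00:04,000
Gradient descent is an optimization technique

2
00:00:04,000 --> 00:00:07,500
used to minimize a loss function.
"""

text = srt_to_text(sample_srt)
# Small sizes so the overlap is visible on a short sample.
chunks = chunk_text(text, chunk_size=40, overlap=10)
for chunk in chunks:
    print(repr(chunk))
```

With the transcript's settings, `chunk_text(text)` would start a new chunk every 800 characters, so consecutive chunks share 200 characters.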

Original Description

GitHub Link - https://github.com/ApoorvV/RAG-for-Query-Resolution
Blog Link - https://www.analyticsvidhya.com/blog/2025/03/building-a-rag-based-query-resolution-system-with-langchain-and-crewai/

Learn how to build a fully functional Retrieval-Augmented Generation (RAG) system from scratch using Python in this step-by-step tutorial! Understand the core concepts behind RAG and see how to implement a practical query resolution system, similar to what we use at Analytics Vidhya.

RAG (Retrieval-Augmented Generation) combines the power of information retrieval with the text generation capabilities of Large Language Models (LLMs) to provide context-aware and accurate responses, allowing you to effectively "chat" with your documents.

In this video, you'll learn:

Timestamps:
0:00 Intro
0:08 What is RAG (Retrieval-Augmented Generation)?
0:45 RAG Use Cases
1:36 Simplified RAG Architecture Part 1
4:41 Phase 1: Code
12:24 Simplified RAG Architecture Part 2
13:09 Phase 2: Code
21:50 Ideas for Upgrading the System
22:41 Outro

Whether you're building internal knowledge bases, customer support bots, or research tools, this tutorial provides a solid foundation for developing your own RAG applications.

#RAG #RetrievalAugmentedGeneration #LLM #GenerativeAI #Python #LangChain #CrewAI #OpenAI #VectorDatabase #ChromaDB #AITutorial #AnalyticsVidhya

Like this video and subscribe to Analytics Vidhya for more tutorials on AI, Machine Learning, and Data Science! Let us know in the comments what other RAG use cases you'd like us to cover.

Key Takeaways

Identify relevant documents
Break documents into smaller parts called chunks
Use embedding models to convert text into meaningful numbers
Store embeddings efficiently using a vector DB store
Use Chroma DB as an open-source and free vector store
Initialize OpenAI embeddings
Set up vector store with collection name and persistent directory
Filter retrieval down to the relevant course using the course name stored as metadata
Find the embeddings most similar to the query using cosine similarity
Combine the top results into a single document and return them
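
The last three steps in the list above (filter by course, rank by similarity, return the top results) can be sketched without any vector database. This is an illustration under stated assumptions, not the tutorial's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, the `store` list and the function names are hypothetical, and a real vector store may rank with a different distance metric depending on how it is configured.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(store: list[dict], query: str, course: str, k: int = 3) -> list[str]:
    """Keep only chunks tagged with `course`, then return the k most similar texts."""
    query_vec = embed(query)
    candidates = [chunk for chunk in store if chunk["course"] == course]
    ranked = sorted(
        candidates,
        key=lambda chunk: cosine(query_vec, embed(chunk["text"])),
        reverse=True,
    )
    return [chunk["text"] for chunk in ranked[:k]]


# Hypothetical chunks; the "course" field plays the role of the course-name metadata.
store = [
    {"course": "deep-learning", "text": "gradient descent is an optimization technique"},
    {"course": "deep-learning", "text": "linear models have a closed form solution"},
    {"course": "langchain", "text": "gradient descent is mentioned here in another course"},
]

results = retrieve(store, "what is gradient descent", course="deep-learning", k=3)
context = "\n\n".join(results)  # combined context that would be handed to the LLM
print(context)
```

Because the course filter runs before ranking, the chunk from the other course never reaches the similarity step, even though it mentions gradient descent.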

💡 The key insight from this tutorial is that building a RAG system requires a combination of natural language processing (NLP) and information retrieval techniques, and that using tools like LangChain, CrewAI, and OpenAI can simplify the process.

Chapters (9)

0:00 Intro

0:08 What is RAG (Retrieval-Augmented Generation)?

0:45 RAG Use Cases

1:36 Simplified RAG Architecture Part 1

4:41 Phase 1: Code

12:24 Simplified RAG Architecture Part 2

13:09 Phase 2: Code

21:50 Ideas for Upgrading the System

22:41 Outro
