Early days of RAG and LlamaIndex - Jerry Liu

Alejandro AO · Beginner · 🔍 RAG & Vector Search · 2y ago

Key Takeaways

The video discusses the history and evolution of Retrieval Augmented Generation (RAG) and its application in AI, specifically with LlamaIndex, highlighting the potential of using Large Language Models (LLMs) for data transformation and decision-making.

Full Transcript

All right, so about that history of RAG: modern RAG was actually kind of born in 2022 then, right?

Yeah, I mean, I forgot the exact date of the original retrieval augmented generation paper, which is not by me, obviously; it was by others. It was either in 2021 or 2022, or even 2020. It basically proposed this overall idea: you take in some set of documents, you embed them, that is, put them through an embedding model, and then put them into some storage system that's able to serve the relevant documents through retrieval, right? That's why it's called retrieval augmented generation: you want to do a retrieval pass over some storage system before you actually put the results into the LLM prompt. That idea resurfaced as more and more people started building LLM apps. People started discovering that, hey, this is a cool idea.

My initial version was not doing that. It wasn't using embeddings at the time, and I don't know why. I think it might have just been a design choice: my goal when I first started was not necessarily to make this useful, it was to just do something cool. To me, doing something cool was: what if we just didn't have embeddings? I thought about it briefly, but what I really wanted to do was just have the LLM figure it out completely on its own, right? And I still think that would be quite an interesting concept. Instead of relying on a separate model, just have a language model, similar to a human, completely figure out how to reason, organize things, and then also traverse them via text. Yeah.

And that kind of reflects in the current state of LlamaIndex, right?
Because, I mean, as far as I can see, it's kind of the central part of LlamaIndex that you use language models during the ingestion process as well, not only in the generation process.

Yeah, so the default RAG paradigm really only uses the LLM at the very end. You have ingestion, and ingestion doesn't need LLMs. You just take in some data, parse it, and then chunk it using an algorithm. And of course you use an embedding model to put it into some vector store. The retrieval process doesn't use an LLM either, because at its simplest it's just top-K embedding lookup: you look up stuff by embedding similarity. So in a standard RAG pipeline, the place where LLMs actually come in is at the very end, and the LLM is only responsible for synthesizing an answer from a piece of unstructured text.

To be totally honest, even at the start, when we were just implementing this, I thought it was a little basic and it didn't really make use of LLMs to their full potential, right? Because LLMs are not just for generation and simple reasoning. They can actually help you make decisions. They can give you a greater layer of understanding and decision making. So if you really wanted to make these systems more interesting, you could use LLMs at the beginning, for instance during the data ingestion phase, or during query time. Instead of just using the LLM at the very end for generation, use it for query understanding, use it for processing, like evaluating the quality of your retrieved context. And then, for instance, not only retrieving from a vector store, but actually using a variety of different tools. And so, on the ingestion side, there are places where you can use LLMs.
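
The standard pipeline described above (chunk, embed, store, top-K lookup, then one LLM call at the end) can be sketched roughly as follows. This is a minimal illustration, not LlamaIndex code: `embed` is a toy bag-of-words stand-in for a real embedding model, the "vector store" is a plain in-memory list, and every function name here is hypothetical. No LLM is involved until the final prompt is assembled.

```python
import math
from collections import Counter


def chunk_text(text: str, chunk_size: int = 12) -> list[str]:
    """Split text into chunks of `chunk_size` words (a simple, LLM-free chunker)."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model (assumption): word-count vector."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_store(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Ingestion: embed every chunk and keep (chunk, vector) pairs in memory."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def top_k(query: str, store: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Retrieval: return the k chunks most similar to the query embedding."""
    query_vec = embed(query)
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Generation input: the only step whose output is handed to an LLM."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer the question using only this context:\n{context}\n\nQuestion: {query}"
```

To try it, build a store with `build_store(chunk_text(document))`, call `top_k(question, store)`, and pass the result of `build_prompt` to whichever LLM client you use. Swapping `embed` for a real embedding model and the list for a vector store changes the quality of retrieval, not the shape of the pipeline.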
This overall concept is pretty interesting: the whole point of ingestion is to process data for your LLM app, so that's kind of like ETL for LLMs, right? But you can also use LLMs for ETL, because LLMs have an inherent capability of understanding unstructured data and transforming it. That part, I think, is interesting. For instance, let's say that for each unstructured document you wanted to extract a summary, the table of contents, or a set of topics or tags. Basically, you can figure out a clever way to prompt the LLM, feeding it a bunch of data from the document, to first extract a set of structured annotations or tags. This represents a data transformation, because you're feeding in some unstructured input data and transforming it into structured data. Then you can attach those tags on top of the unstructured data as well.

This is just an example of metadata extraction that's powered by LLMs. It's something that uses LLMs, but is also useful for any sort of downstream application you want to build. If you're trying to build a RAG system over this, having metadata tags is oftentimes very useful. It gives you better retrieval results, better generation quality, and all these types of things. So I think that interplay between LLMs and data transformation is very interesting, because you can use LLMs in the middle, and it also helps for any sort of application you want to build later on.

Yeah, those are some pretty advanced RAG techniques over there. And it kind of goes back a little bit to, I mean, it's a little bit adjacent to your original idea of creating this kind of system, right?
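
The LLM-powered metadata extraction described above can be sketched as a small transformation step: unstructured text goes in, a structured annotation comes out and is attached to the original text. This is a hedged illustration under stated assumptions, not LlamaIndex's API: the prompt wording, `extract_metadata`, `annotate`, `filter_by_tag`, and the `fake_llm` stub (which stands in for a real model call) are all hypothetical.

```python
import json
from typing import Callable

# Hypothetical prompt; a real system would tune this wording.
PROMPT_TEMPLATE = (
    "Read the document below and reply with a JSON object with two keys: "
    '"summary" (one sentence) and "tags" (a list of short topic strings).\n\n'
    "Document:\n{document}"
)


def extract_metadata(document: str, llm: Callable[[str], str]) -> dict:
    """Ask the LLM for structured annotations; fall back to empty values on bad output."""
    raw = llm(PROMPT_TEMPLATE.format(document=document))
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}
    tags = parsed.get("tags", [])
    return {
        "summary": str(parsed.get("summary", "")),
        "tags": [str(tag).lower() for tag in tags] if isinstance(tags, list) else [],
    }


def annotate(document: str, llm: Callable[[str], str]) -> dict:
    """Attach the extracted metadata on top of the original unstructured text."""
    return {"text": document, "metadata": extract_metadata(document, llm)}


def filter_by_tag(nodes: list[dict], tag: str) -> list[dict]:
    """Downstream use: narrow the retrieval candidates to nodes carrying a tag."""
    return [node for node in nodes if tag.lower() in node["metadata"]["tags"]]


def fake_llm(prompt: str) -> str:
    """Stub standing in for a real LLM call (assumption): returns a fixed JSON reply."""
    return '{"summary": "Quarterly revenue grew.", "tags": ["Finance", "Revenue"]}'
```

In practice `fake_llm` would be replaced by a call to an actual model, and the returned nodes would be embedded and stored as usual. The tags can then be used as filters at query time, which is one way metadata can improve retrieval results.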
Yeah, I thought about this at the beginning of the project, but the project was not really close to realizing that vision at the time. But if you think about the overall picture of where I think LLM-powered software will evolve, there's basically a new type of data stack that's emerging, and a new set of operations within that data stack, to power AI software. We want to provide the right tooling to help developers build that data stack. This means helping them figure out how to move data from one place to another specifically for LLMs to use, and that could include LLMs in the middle as well. It also includes the orchestration piece on top of that data: how do you get LLMs to interact with the data through these different types of interfaces?

Awesome.

Hello everyone. Thank you for watching the clip. If you enjoyed the clip, you can click the link in the description or somewhere right here on the screen to watch the full conversation for free. If the podcast is not up yet, you can always subscribe to my Patreon to get early access and support the channel and this podcast. Alternatively, you can subscribe here on YouTube, click the bell icon, and you will be notified when the full episode comes out next week. Thank you very much for watching, and I will see you next time.

Original Description

Jerry Liu discusses the history of RAG (retrieval augmented generation) and the use of LlamaIndex in AI applications. Learn about the evolution of RAG and how LLMs can be used for data transformation and decision-making.

LINKS:
🦙 Check out LlamaIndex: https://docs.llamaindex.ai/en/stable/
❤️ For early access to the podcast, you can either:
- Support on Patreon: https://www.patreon.com/posts/podcast-early-105418000?utm_medium=clipboard_copy&utm_source=copyLink&utm_campaign=postshare_creator&utm_content=join_link
✉️ Get the Newsletter: https://link.alejandro-ao.com/AIIguB

-----------------------------
#aiagents #machinelearning #LLM #LlamaIndex #RAG #OpenAI #aipodcast #podcast


The video discusses the history and evolution of RAG and its application in AI, specifically with LlamaIndex, highlighting the potential of using LLMs for data transformation and decision-making. Viewers can learn about the basics of LLMs, RAG, and how to apply them to real-world problems. The video also covers advanced topics such as using LLMs for metadata extraction and data transformation.

Key Takeaways
  1. Understand the basics of RAG and LLMs
  2. Learn about the evolution of RAG and its application in AI
  3. Apply LLMs to data transformation and decision-making
  4. Use LLMs for metadata extraction and data transformation
  5. Design and implement RAG pipelines
  6. Craft effective prompts for LLMs
💡 The video highlights the potential of using LLMs for data transformation and decision-making, and how LLM-extracted metadata can improve retrieval results and generation quality in a RAG system.
