LlamaIndex: How to Get Structured Data from LLMs

Alejandro AO · Intermediate ·🧠 Large Language Models ·1y ago

Key Takeaways

This video demonstrates how to get structured data from language models using LlamaIndex and Pydantic, allowing for predictable output with specific keys and value formats.

Full Transcript

Hey everyone, how's it going? Today's video is a very quick one about how to get structured output out of your language model in LlamaIndex. The idea behind this is that usually, when you query your language model, you get just text back. But what happens if, instead of text, you want something like a JSON file with very specific keys and value formats? Your average language model is not going to be able to produce that accurately just by prompting it. So the idea here is that we're going to initialize a schema using Pydantic and attach it to our language model using LlamaIndex. This is the language model that we're going to query, and every time we're going to get a response with the schema that we have defined. So let's take a look at how to do that. [Music]

Okay, so the first thing to do, as I mentioned before, is to create the schema of the data that we want to get. In our case we're going to use Pydantic to define our schema. In case you're not familiar with it, Pydantic is essentially a Python library for data validation. It lets you create classes that become the schemas of your data, and you can then validate your data against these schemas. In this case I'm going to ask my language model to create an album. An album contains a name, which is always a string, an artist, which is always a string, and a list of songs. Each song has its own schema, with a title and a length in seconds, and the length in seconds is an integer. For the record, this example comes straight from the LlamaIndex documentation.

But let's actually take a look at this. I'm going to execute this, and right here I'm going to import ChatMessage from LlamaIndex core. As you can see, I had previously initialized my language model; let me show you that first, because I was doing another tutorial before this one. Here I have initialized my language model from OpenAI, and I am using GPT-4o mini. This is the same language model that I'm using right here. However, this is not the instance that I'm going to query directly. I am going to call the as_structured_llm method on it and set this parameter right here, output_cls, to the output schema that I want my language model to return on every single call. In this case I am passing it the Album schema that I created right here, which contains a reference to the Song schema as well. I am assigning the result to a variable called sllm, for structured LLM, and now any time I query this language model I am going to get the response in the exact schema that I specified right here.

So let's see. I'm going to initialize a chat message from this string right here: "Generate an example album for the film Who Framed Roger Rabbit". Let's see what it returns. I am going to call the sllm using the chat method. Now here we have the response, and as you can see, the chat message is from the assistant and the content is not just a string of text; it is actually the object that I want to get. The name of the album is the Who Framed Roger Rabbit soundtrack, the artist is Various Artists, and then we have a list of songs.

So let's take a look at it under the microscope. For the record, I'm using pretty print here; I imported it up above in the notebook. Essentially I'm using pretty print to print these objects more neatly. As you can see, we have the artist right here, Various Artists, the name, Who Framed Roger Rabbit, and the songs, each one of them an instance of Song with a title and a length in seconds. So there you go: as you can see, this follows exactly the same schema that we defined above. That is how to use structured outputs in language models using LlamaIndex. Let me know if you have any questions, and let's continue with this course. [Music]
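
The notebook itself is not reproduced on this page, so the following is only a sketch of the steps the transcript describes. The field name length_seconds, the helper generate_album, and the exact import paths are assumptions based on the spoken description and on the LlamaIndex documentation example the speaker mentions. Calling the helper needs the llama-index and llama-index-llms-openai packages and an OPENAI_API_KEY environment variable.

```python
from typing import List

from pydantic import BaseModel


class Song(BaseModel):
    """One track: a title and its length in whole seconds."""

    title: str
    length_seconds: int  # assumed field name for "length in seconds"


class Album(BaseModel):
    """An album: a name, an artist, and a list of Song objects."""

    name: str
    artist: str
    songs: List[Song]


def generate_album(prompt: str) -> Album:
    """Hypothetical helper: query a structured LLM and return an Album.

    The LlamaIndex imports live inside the function so that the schema
    above can be imported and used without LlamaIndex or an API key.
    """
    from llama_index.core.llms import ChatMessage
    from llama_index.llms.openai import OpenAI

    llm = OpenAI(model="gpt-4o-mini")
    # Wrap the plain LLM so that responses are parsed into the Album schema.
    sllm = llm.as_structured_llm(output_cls=Album)
    response = sllm.chat([ChatMessage.from_str(prompt)])
    # The parsed Pydantic object is exposed on the response's raw attribute.
    return response.raw
```

With credentials in place, generate_album("Generate an example album for the film Who Framed Roger Rabbit") would return an Album instance that can be passed to pprint, as in the video. The schema is kept separate from the LLM call so that it can be reused for validation on its own.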

Original Description

In this video, we’ll dive into how to get structured and predictable output from language models by leveraging LlamaIndex and Pydantic. Structured data is essential for applications where consistent formatting and data validation are key. Here, we’ll show you how to use a Pydantic model to enforce a schema, so that our large language model (LLM) generates responses that fit a specific structure.

Useful links:
👉 Code on this video: https://colab.research.google.com/drive/18rJ-BGN3-JVJtBGjslQ33M5liowbXkc4?usp=sharing
🚀 Become an AI Engineer with my cohort: https://course.alejandro-ao.com

☎️ Consulting for your company: https://link.alejandro-ao.com/consulting-call
❤️ Buy me a coffee... or a beer (thanks): https://link.alejandro-ao.com/l83gNq
💬 Join the Discord Help Server: https://link.alejandro-ao.com/HrFKZn

Connect with me
LinkedIn: https://www.linkedin.com/in/alejandro-ao/
X: https://twitter.com/_alejandroao

Playlist

Uploads from Alejandro AO · Alejandro AO · 43 of 60

1 Linear Regression in R - Full Project for Beginners
2 Configure Webpack 5 in Wordpress (2025) with Typescript and SASS
3 R Programming 101 - Crash Course for beginners
4 Convert HTML template to WordPress Theme (2025) - Full Course
5 Javascript Interactive Map with Leaflet EASY (with Marker Clusters & Popups)
6 Vanilla JS Project: Multi Step form in HTML, CSS & OOP Javascript
7 How to do AJAX in WordPress correctly (2025)
8 React Leaflet Tutorial for Beginners (2025)
9 Linear Regression in Python - Full Project for Beginners
10 Logistic Regression Project: Cancer Prediction with Python
11 Display Equations in ChatGPT
12 Create a Chrome Extension (Manifest V3) for ChatGPT
13 Full-Stack Project | ChatGPT API, React, Node.js, Express
14 Streamlit Python Course: Build a Machine Learning App to Predict Cancer
15 Langchain PDF App (GUI) | Create a ChatGPT For Your PDF in Python
16 LangChain Memory Tutorial | Building a ChatGPT Clone in Python
17 Chat with a CSV | LangChain Agents Tutorial (Beginners)
18 Create a ChatGPT clone using Streamlit and LangChain
19 Chat with Multiple PDFs | LangChain App Tutorial in Python (Free LLMs and Embeddings)
20 Full Python Environment Setup for AI (or other) Apps + Virtual Environments
21 Langchain + Qdrant Cloud | Pinecone FREE Alternative (20GB) | Tutorial
22 LangChain Version 0.1 Explained | New Features & Changes
23 Create a RAG Chain using LangChain 0.1 (New version)
24 Tutorial | Chat with any Website using Python and Langchain (LATEST VERSION)
25 Deploy Your AI Streamlit App for FREE | Step-by-Step (Heroku Alternative)
26 What is Google's Gemini 1.5 Pro | 10 Million Token Window
27 Chat with MySQL Database with Python | LangChain Tutorial
28 Stream LLMs with LangChain + Streamlit | Tutorial
29 Chat with MySQL Database using GPT-4 and Mistral AI | Python GUI App
30 #1 Harrison Chase: LangChain and The Future of LLM Applications | Alejandro AO
31 CrewAI Step-by-Step | Complete Course for Beginners
32 Python: Automating a Marketing Team with AI Agents | Planning and Implementing CrewAI
33 Build a Web App (GUI) for your CrewAI Automation (Easy with Python)
34 Early days of RAG and LlamaIndex - Jerry Liu
35 LlamaParse: Convert PDF (with tables) to Markdown
36 #2 Jerry Liu - What is LlamaIndex, Agents & Advice for AI Engineers
37 CrewAI + Exa: Generate a Newsletter with Research Agents (Part 1)
38 #3 Joe Moura | Multi Agent Systems and CrewAI
39 Python: Create a ReAct Agent from Scratch
40 New Groq Models: Best for Function-Calling Agents
41 Introduction to LlamaIndex with Python (2025)
42 LlamaIndex: How to use LLMs
43 LlamaIndex: How to Get Structured Data from LLMs
44 Multimodal RAG: Chat with PDFs (Images & Tables) [2025]
45 Advanced RAG with LlamaIndex - Metadata Extraction [2025]
46 Learn MCP Servers with Python (EASY)
47 Create MCP Clients in JavaScript - Tutorial
48 Create an MCP Client in Python - FastAPI Tutorial
49 How to Build an MCP Client GUI with Streamlit and FastAPI
50 Vibe Coding For Engineers (make it ACTUALLY work)
51 LlamaExtract Tutorial: Convert PDF & Images into JSON
52 Local MCP Servers for Cursor (Step by step)
53 Anthropic: How to Build Multi Agent Systems
54 Deploy Remote MCP Servers in Python (Step by Step)
55 GPT-5 for Developers: API Changes, Pricing, Model Router & Security
56 Tutorial: Auth for Remote MCP Servers (Step by Step) | OAuth 2.1 with ScaleKit
57 Generate UI Tests with TestSprite MCP Server + TRAE
58 #4 Allan Guo | 19-yo YC Founder - Willow Voice
59 RAG Project: Build an AI Onboarding Chatbot with Streamlit, LangChain, and ChromaDB
60 MCP Security | Malicious MCP Servers (Protect Yourself)

This video teaches how to use LlamaIndex and Pydantic to get structured data from language models, enabling predictable output with specific keys and value formats. By following the steps outlined in the video, viewers can learn how to define a schema for their data and use it to validate the output from their language model.

Key Takeaways
  1. Create a schema for the data using Pydantic
  2. Initialize a language model using LlamaIndex
  3. Define the output schema for the language model
  4. Query the language model using the structured LLM method
  5. Validate the output against the defined schema
💡 Using a schema to define the structure of the output data allows for predictable and validated results from language models
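
As a rough illustration of the last takeaway, the sketch below validates two JSON strings against the album schema using Pydantic alone, with no language model involved. The field name length_seconds and both payloads are invented for illustration; they are not output from the video.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Song(BaseModel):
    title: str
    length_seconds: int  # assumed field name for "length in seconds"


class Album(BaseModel):
    name: str
    artist: str
    songs: List[Song]


# Invented sample payloads standing in for text a model might return.
good_json = (
    '{"name": "Example Album", "artist": "Various Artists", '
    '"songs": [{"title": "Opening Theme", "length_seconds": 180}]}'
)
bad_json = (
    '{"name": "Example Album", "artist": "Various Artists", '
    '"songs": [{"title": "Opening Theme", "length_seconds": "three minutes"}]}'
)

# A payload that matches the schema is parsed into typed objects.
album = Album.model_validate_json(good_json)
print(album.songs[0].title, album.songs[0].length_seconds)

# A payload with a non-integer length is rejected instead of passed along.
try:
    Album.model_validate_json(bad_json)
    rejected = False
except ValidationError as err:
    rejected = True
    print(f"rejected with {len(err.errors())} validation error(s)")
```

This uses the Pydantic v2 method model_validate_json. Failing loudly on a malformed payload is what makes downstream code able to rely on the keys and value types.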
