Local Retrieval Augmented Generation (RAG) from Scratch (step by step tutorial)
In this video, we'll build a Retrieval Augmented Generation (RAG) pipeline from scratch that runs locally.
Frameworks such as LangChain and LlamaIndex can do this, but building from scratch means you'll know all the parts of the puzzle.
Specifically, we'll build NutriChat, a RAG pipeline that lets someone ask questions of a 1200-page Nutrition Textbook PDF.
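To make the idea concrete, here is a rough sketch of the retrieval half of a RAG pipeline: split text into chunks, score each chunk against a query, and paste the best match into a prompt. It is not the code from the video. The function names, the sample text, and the bag-of-words `embed` function are all assumptions made for illustration; a real pipeline would use a proper embedding model and pass the final prompt to a local LLM.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, sentences_per_chunk=2):
    """Split text into sentences, then group them into fixed-size chunks."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=1):
    """Return the top_k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, embed(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, context_chunks):
    """Augment the query with retrieved context before generation."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Made-up sample text standing in for pages extracted from a PDF.
sample_text = (
    "Vitamin C is found in citrus fruits. It supports the immune system. "
    "Protein helps build muscle. Good sources of protein include eggs and beans."
)
query = "Which foods contain protein?"
chunks = split_into_chunks(sample_text)
top_chunks = retrieve(query, chunks)
prompt = build_prompt(query, top_chunks)
print(prompt)
```

The word-count vectors only match exact words, so this sketch misses synonyms and paraphrases. Swapping `embed` for a learned embedding model is what lets retrieval work on meaning rather than spelling.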
Code on GitHub - https://github.com/mrdbourke/simple-local-rag
Whiteboard - https://whimsical.com/simple-local-rag-workflow-39kToR3yNf7E8kY4sS2tjV
Be sure to check out NVIDIA GTC, NVIDIA's GPU Technology Conference running from March 18-21. It's free to attend virtually! That's what I'm doing.
Sign up to GTC24 here: https://nvda.ws/3GUZygQ
Other links:
Download Nutrify (take a photo of food and learn about it) - https://nutrify.app
Learn AI/ML (beginner-friendly course) - https://dbourke.link/ZTMMLcourse
Learn TensorFlow - https://dbourke.link/ZTMTFcourse
Learn PyTorch - https://dbourke.link/ZTMPyTorch
AI/ML courses/books I recommend - https://www.mrdbourke.com/ml-resources/
Read my novel Charlie Walks - https://www.charliewalks.com
Connect elsewhere:
Web - https://dbourke.link/web
Twitter - https://www.twitter.com/mrdbourke
Twitch - https://www.twitch.tv/mrdbourke
ArXiv channel (past streams) - https://dbourke.link/archive-channel
Get email updates on my work - https://dbourke.link/newsletter
Timestamps:
0:00 - Intro/NVIDIA GTC
2:25 - Part 0: Resources and overview
8:33 - Part 1: What is RAG? Why RAG? Why locally?
12:26 - Why RAG?
19:31 - What can RAG be used for?
26:08 - Why run locally?
30:26 - Part 2: What we're going to build
40:40 - Original Retrieval Augmented Generation paper
46:04 - Part 3: Importing and processing a PDF document
48:29 - Code starts! Importing a PDF and making it readable
1:17:09 - Part 4: Preprocessing our text into chunks (text splitting)
1:28:27 - Chunking our sentences together
1:56:38 - Part 5: Embedding creation
1:58:15 - Incredible embeddin