Retrieval Augmented Generation (RAG) Explained: Embedding, Sentence BERT, Vector Database (HNSW)

Umar Jamil · Beginner · RAG & Vector Search · 2y ago

Skills: RAG Basics 90%, Vector Stores 80%, RAG Evaluation 70%

Get your $5 coupon for Gradient: https://gradient.1stcollab.com/umarjamilai

In this video we explore the entire Retrieval Augmented Generation pipeline. I will start by reviewing language models, their training and inference, and then explore the main ingredient of a RAG pipeline: embedding vectors. We will see what embedding vectors are, how they are computed, and how we can compute embedding vectors for sentences. We will also explore what a vector database is, along with the popular HNSW (Hierarchical Navigable Small Worlds) algorithm that vector databases use to find the stored embedding vectors closest to a query.

Download the PDF slides: https://github.com/hkproj/retrieval-augmented-generation-notes
Sentence BERT paper: https://arxiv.org/pdf/1908.10084.pdf

Chapters
00:00 - Introduction
02:22 - Language Models
04:33 - Fine-Tuning
06:04 - Prompt Engineering (Few-Shot)
07:24 - Prompt Engineering (QA)
10:15 - RAG pipeline (introduction)
13:38 - Embedding Vectors
19:41 - Sentence Embedding
23:17 - Sentence BERT
28:10 - RAG pipeline (review)
29:50 - RAG with Gradient
31:38 - Vector Database
33:11 - K-NN (Naive)
35:16 - Hierarchical Navigable Small Worlds (Introduction)
35:54 - Six Degrees of Separation
39:35 - Navigable Small Worlds
43:08 - Skip-List
45:23 - Hierarchical Navigable Small Worlds
47:27 - RAG pipeline (review)
48:22 - Closing




