The Complete Guide to Hybrid Search in RAG (BM25 + Embeddings + Reranker)

Dave Ebbelaar · Beginner · 🔍 RAG & Vector Search · 1h ago
Want to learn real AI Engineering? Go here: https://go.datalumina.com/QpP01LX
Want to start freelancing? Let me help: https://go.datalumina.com/jOYILqO

🔗 GitHub Repository
https://github.com/daveebbelaar/ai-cookbook/tree/main/knowledge/hybrid-retrieval

🛠️ My VS Code / Cursor Setup
https://youtu.be/mpk4Q5feWaw

⏱️ Timestamps
00:00 Hybrid Retrieval Overview
01:00 Meet the Finance QA Data
03:29 Exploring Queries and Corpus
06:09 Mapping Questions to Documents
09:37 Retrieval Pipeline Roadmap
10:05 BM25 Keyword Retrieval
14:44 Tokenizing the Corpus
17:11 Building the BM25 Index
19:25 Querying with BM25
23:56 Why Dense Embeddings Help
25:13 Creating Dense Embeddings
32:11 Dense Search in Python
36:45 Dense Retrieval Compared
37:12 Reciprocal Rank Fusion
40:51 Fusing Search Results
43:56 Adding the Re-Ranker
46:44 Re-Ranking Hybrid Candidates
49:36 Evaluating Retrieval Quality
54:27 Tuning for Your Own Data

📌 Description
In this lecture, I build a production-style hybrid retrieval system from scratch, combining BM25, dense embeddings (OpenAI text-embedding-3-small), reciprocal rank fusion (RRF), and Cohere's re-ranker into a single pipeline. Using the finance QA dataset (FiQA) from the BEIR benchmark, I walk through each stage: loading and inspecting the corpus, building a BM25 index, generating dense embeddings, fusing rankings with RRF, and re-ranking the top candidates. The final section evaluates all four approaches (BM25, dense, hybrid, and hybrid plus re-ranker) with NDCG@10, showing how the full hybrid plus re-ranker stack outperforms each method on its own.

👋🏻 About Me
Hi! I'm Dave, AI Engineer and founder of Datalumina®. On this channel, I share practical tutorials that teach developers how to build production-ready AI systems that actually work in the real world. Beyond these tutorials, I also help people start successful freelancing careers. Check out the links above to learn more!
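
For readers who want a feel for two of the steps named in the description before watching, here is a minimal Python sketch of reciprocal rank fusion and NDCG@10. It is not the code from the video or the linked repository. The function names, the toy document IDs, the relevance labels, and the constant k=60 (a commonly used RRF default) are all assumptions made for illustration, and the NDCG variant shown uses linear gain, which may differ from the evaluation code used in the lecture.

```python
import math
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list, best first.

    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    k=60 is an assumed default, not a value taken from the video.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def ndcg_at_k(ranked_ids, relevance, k=10):
    """NDCG@k with linear gain; relevance maps doc ID -> graded label."""
    dcg = sum(
        relevance.get(doc_id, 0) / math.log2(i + 1)
        for i, doc_id in enumerate(ranked_ids[:k], start=1)
    )
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(i + 1) for i, rel in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical toy rankings standing in for BM25 and dense search results.
bm25_ranking = ["d1", "d2", "d3"]
dense_ranking = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])

# Hypothetical relevance labels for a single query.
relevance = {"d1": 1, "d3": 1}
print(fused)
print(ndcg_at_k(fused, relevance, k=10))
```

Because RRF only uses rank positions, the BM25 and embedding scores never need to be put on a common scale, which is one reason it is a convenient way to combine keyword and dense results.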
