Multi-Vector Search with Amélie Chatelain and Antoine Chaffin - Weaviate Podcast #134!

Weaviate vector database · Beginner · RAG & Vector Search · 1mo ago


Amélie Chatelain and Antoine Chaffin from LightOn are leading the way in the next generation of search, powered by Multi-Vector representations and Late Interaction. The podcast begins with what motivates them to work on Multi-Vector Search, then turns to particular details such as the combination of lexical and semantic search and the pairing of bi-encoder speed with cross-encoder accuracy. They also share insights about training multi-vector models and how these differ from their single-vector predecessors.

The conversation then moves to particular successes of Late Interaction: code, reasoning-intensive, and multimodal retrieval. Agents are great at searching with grep, but they are even better with ColGrep! Reasoning-Intensive Retrieval is a step change in how we think about search systems, enabled by both Late Interaction models and Agentic Search. Multimodal Search, such as matching text with videos, is also seeing large benefits from Multi-Vector representations.

The discussion then dives into the cost of MaxSim and how efficient methods such as MUVERA and PLAID can help. It concludes with a presentation of their recent work on ColBERT-Zero, which pre-trains with Late Interaction instead of Single-Vector Dense Embedding models. LightOn are also the developers of PyLate, a leading open-source library for training these kinds of models.

Chapters

0:00 Welcome to the Weaviate Podcast!
2:17 An Introduction to Multi-Vector Search
8:17 Multi- vs. Single-Vector
11:12 Comparison with Cross Encoders
18:12 ColGrep for Coding Agents
32:51 Reasoning-Intensive Retrieval
44:19 Multimodal Multi-Vector
50:51 The Cost of Multi-Vector
55:43 MUVERA and PLAID
1:08:35 ColBERT-Zero and PyLate
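For readers unfamiliar with the MaxSim operation mentioned in the description, here is a minimal sketch in plain Python. It is not taken from the podcast, PyLate, or Weaviate: the function names (`dot`, `maxsim`, `rank`) and the toy vectors are hypothetical, chosen only to show the idea of late interaction, where each query token vector is matched against its best document token vector and the per-token maxima are summed.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Late-interaction score (illustrative sketch).

    For each query vector, take the highest dot product against any
    document vector, then sum those maxima over the query vectors.
    This needs len(query_vecs) * len(doc_vecs) dot products per document,
    which is the cost that approximate methods try to reduce.
    """
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank(query_vecs: Sequence[Vector], docs: Dict[str, Sequence[Vector]]) -> List[str]:
    """Return document names sorted by descending MaxSim score."""
    return sorted(docs, key=lambda name: maxsim(query_vecs, docs[name]), reverse=True)


# Toy, hand-written token vectors (assumption: a real multi-vector model
# would produce one vector per token; these values are made up).
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "doc_a": [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],  # has an exact match for each query token
    "doc_b": [[0.5, 0.5], [0.5, 0.5]],              # only partial matches
}

if __name__ == "__main__":
    for name in rank(query, docs):
        print(name, maxsim(query, docs[name]))
```

In this toy setup `doc_a` ranks above `doc_b` because each query vector finds an exact counterpart among its token vectors. The nested loop also makes the cost visible: scoring grows with the number of query tokens times the number of document tokens, which is the overhead the episode discusses in connection with MUVERA and PLAID.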

