Semantic Query Engines with Matthew Russo - Weaviate Podcast #131!
Matthew Russo is a Ph.D. student at MIT where he is researching the intersection of AI and Database Systems. AI is transforming Database Systems. Perhaps the biggest impact so far has been natural language to query language translations, or Text-to-SQL. However, another massive innovation is brewing. AI presents new Semantic Operators for our query languages. For example, we are all familiar with the WHERE filter. Now we have AI_WHERE, in which an LLM or another AI model computes the filter value without needing it to be already available in the database!
`SELECT * FROM podcasts AI_WHERE "Text-to-SQL" in topics`
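To make the AI_WHERE idea concrete, here is a minimal Python sketch of a semantic filter. Everything in it is an assumption for illustration: `ai_where`, `keyword_judge`, and the sample rows are hypothetical names, not the API of Palimpzest or any other system, and the "judge" is a plain substring check standing in for the LLM call a real engine would make.

```python
from typing import Callable, Iterable

Row = dict[str, str]


def ai_where(
    rows: Iterable[Row],
    predicate: str,
    judge: Callable[[Row, str], bool],
) -> list[Row]:
    """Keep the rows for which the judge decides the predicate holds."""
    return [row for row in rows if judge(row, predicate)]


def keyword_judge(row: Row, predicate: str) -> bool:
    # Stand-in for a model call (assumption): a real semantic filter would
    # prompt an LLM with the row and the predicate. Here we only check
    # whether the phrase appears in the transcript text.
    return predicate.lower() in row["transcript"].lower()


# Hypothetical rows: note there is no precomputed "topics" column.
podcasts = [
    {"title": "Episode A", "transcript": "We discuss Text-to-SQL and query planning."},
    {"title": "Episode B", "transcript": "A conversation about vector indexes."},
]

matches = ai_where(podcasts, "Text-to-SQL", keyword_judge)
print([row["title"] for row in matches])  # ['Episode A']
```

The point of the sketch is that the filter value is computed at query time from the raw row, rather than read from a column that already exists.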
Semantic Filters are just the tip of the iceberg. The roster of Semantic Operators further includes Semantic Joins, Map, Rank, Classify, Groupby, and Aggregation! And it doesn’t stop there! One of the core ideas of Relational Algebra, and of how it has influenced Database Systems, is query planning: finding the optimal order in which to apply filters. For example, let’s say you have two filters, the car is red and the car is a BMW. Now let’s say the dataset contains only 100 BMWs, but 50,000 red cars!! Applying the BMW filter first leaves at most 100 cars for the red filter to check, whereas applying the red filter first leaves 50,000 for the BMW filter!
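The filter-ordering argument can be checked with a little arithmetic. The sketch below counts how many predicate evaluations each order needs. The 100 and 50,000 match counts come from the example above; the total of 100,000 cars is an assumption, as are the names `TOTAL_CARS`, `MATCHES`, and `predicate_calls`.

```python
TOTAL_CARS = 100_000  # assumed dataset size, not stated in the episode notes
MATCHES = {"is_bmw": 100, "is_red": 50_000}  # rows that pass each filter alone


def predicate_calls(order: tuple[str, str], total: int, matches: dict[str, int]) -> int:
    """Count predicate evaluations for a two-filter pipeline.

    The first filter runs on every row; the second runs only on the rows
    that survived the first.
    """
    first = order[0]
    return total + matches[first]


orders = [("is_bmw", "is_red"), ("is_red", "is_bmw")]
costs = {order: predicate_calls(order, TOTAL_CARS, MATCHES) for order in orders}
best = min(costs, key=costs.get)

for order, cost in costs.items():
    print(order, cost)
print("cheapest order:", best)
```

Under these assumptions, running the BMW filter first costs 100,100 evaluations versus 150,000 for red first. When each evaluation is a model call rather than a column comparison, that gap translates directly into cost and latency.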
So many interesting nuggets in this podcast, loved discussing these things with Matthew, and I hope you find it interesting!
Links:
Palimpzest (GitHub): https://github.com/mitdbg/palimpzest
A Declarative System for Optimizing AI Workloads: https://arxiv.org/abs/2405.14696
Abacus: A Cost-Based Optimizer for Semantic Operator Systems: https://arxiv.org/abs/2505.14661
Semantic Operators: A Declarative Model for Rich, AI-based Data Processing: https://arxiv.org/abs/2407.11418
SemBench: A Benchmark for Semantic Query Processing Engines: https://arxiv.org/abs/2511.01716
Chapters:
0:00 Welcome Matthew!
0:47 Semantic Query Processing Engines
7:07 Semantic Operators
13:57 Relational Algebra and Query Planning
18:19 Internal vs. External Query Engines
23:33 Semantic Joins
33:22 SemBench
42:07 Latency,