Demo: Private RAG with local Mistral 7b LLM and Weaviate on K8s

Samos123 · Intermediate · 🔍 RAG & Vector Search · 1y ago
Live demo showing how to deploy an end-to-end Retrieval-Augmented Generation (RAG) stack, including the LLM and embedding server, on top of any K8s cluster.

- Lingo as the model proxy and autoscaler: https://github.com/substratusai/lingo
- Verba as the RAG application: https://github.com/weaviate/Verba
- Weaviate as the vector DB: https://github.com/weaviate/weaviate
- Mistral-7B-Instruct-v2 as the LLM
- STAPI with MiniLM-L6-v2 as the embedding model: https://github.com/substratusai/stapi

Blog post with copy-pasteable steps: https://www.substratus.ai/blog/lingo-weaviate-private-rag
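The flow the demo wires together can be sketched locally: embed documents, retrieve the nearest one for a query, and prepend it as context before calling the LLM. This is a minimal, self-contained illustration of that retrieve-then-prompt pattern, not the demo's actual code; in the real stack Weaviate stores the vectors, STAPI (MiniLM-L6-v2) produces embeddings, and Lingo proxies the chat request to Mistral-7B-Instruct-v2. Here a toy bag-of-words "embedding" and an in-memory list stand in for those services so the sketch runs anywhere.

```python
# Minimal local sketch of the RAG flow from the demo (assumptions noted below).
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Stand-in for the MiniLM-L6-v2 embedding server (STAPI):
    # bag-of-words counts, NOT a real sentence embedding.
    return Counter(text.lower().replace(".", "").replace("?", "").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "Weaviate is an open-source vector database.",
    "Lingo autoscales model servers on Kubernetes.",
    "Mistral-7B-Instruct is an instruction-tuned LLM.",
]
# Stand-in for a Weaviate collection: (document, vector) pairs in memory.
index = [(d, embed(d)) for d in docs]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Nearest-neighbor search by cosine similarity (what Weaviate does at scale).
    qv = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [d for d, _ in ranked[:k]]

def build_prompt(query: str) -> str:
    # Verba-style step: prepend retrieved context to the user question
    # before sending it to the LLM (via Lingo's proxy in the demo).
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("Which component autoscales model servers?"))
```

In the deployed stack, `retrieve` becomes a Weaviate query and the final prompt goes to Mistral-7B-Instruct-v2 through Lingo; the shape of the flow is the same.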

Related AI Lessons

How to Evaluate RAG Applications
Learn to evaluate RAG applications and avoid confidently wrong results
Medium · LLM
RAG Chunking Is Not About Length — It Is About Preserving Meaning
Learn how RAG chunking preserves meaning in long documents by avoiding fixed-size chunks
Medium · AI
The Future of RAG: Dead, Evolving… or Becoming the Brain of AI?
Learn about the future of RAG, from its current state to emerging trends like Agentic RAG and multimodal AI
Medium · Machine Learning
Smart Routing, Transfer Family Ingestion, and Voice Chat — Permission-Aware RAG v4.2
Learn about the latest features in Permission-Aware RAG v4.2, including Smart Routing, Transfer Family Ingestion, and Voice Chat, and how to apply them in your projects
Dev.to · Yoshiki Fujiwara (藤原 善基) @ AWS Community Builder