Demo: Private RAG with local Mistral 7b LLM and Weaviate on K8s

Samos123 · Intermediate · 🔍 RAG & Vector Search · 1y ago
Live demo showing how to deploy an end-to-end Retrieval-Augmented Generation (RAG) stack, including the LLM and embedding server, on top of any K8s cluster.

- Lingo as the model proxy and autoscaler: https://github.com/substratusai/lingo
- Verba as the RAG application: https://github.com/weaviate/Verba
- Weaviate as the vector DB: https://github.com/weaviate/weaviate
- Mistral-7B-Instruct-v2 as the LLM
- STAPI with MiniLM-L6-v2 as the embedding model: https://github.com/substratusai/stapi

Blog post with copy-pasteable steps: https://www.substratus.ai/blog/lingo-weaviate-private-rag
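The flow the demo wires together can be sketched locally: embed documents, retrieve the nearest one for a query, and prepend it as context before calling the LLM. This is a minimal, self-contained illustration of that retrieve-then-prompt pattern, not the demo's actual code; in the real stack Weaviate stores the vectors, STAPI (MiniLM-L6-v2) produces embeddings, and Lingo proxies the chat request to Mistral-7B-Instruct-v2. Here a toy bag-of-words "embedding" and an in-memory list stand in for those services so the sketch runs anywhere.

```python
# Minimal local sketch of the RAG flow from the demo (assumptions noted below).
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Stand-in for the MiniLM-L6-v2 embedding server (STAPI):
    # bag-of-words counts, NOT a real sentence embedding.
    return Counter(text.lower().replace(".", "").replace("?", "").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "Weaviate is an open-source vector database.",
    "Lingo autoscales model servers on Kubernetes.",
    "Mistral-7B-Instruct is an instruction-tuned LLM.",
]
# Stand-in for a Weaviate collection: (document, vector) pairs in memory.
index = [(d, embed(d)) for d in docs]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Nearest-neighbor search by cosine similarity (what Weaviate does at scale).
    qv = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(qv, pair[1]), reverse=True)
    return [d for d, _ in ranked[:k]]

def build_prompt(query: str) -> str:
    # Verba-style step: prepend retrieved context to the user question
    # before sending it to the LLM (via Lingo's proxy in the demo).
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("Which component autoscales model servers?"))
```

In the deployed stack, `retrieve` becomes a Weaviate query and the final prompt goes to Mistral-7B-Instruct-v2 through Lingo; the shape of the flow is the same.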

Related AI Lessons

How to Evaluate RAG Applications
Learn to evaluate RAG applications and avoid confidently wrong results
Medium · LLM
RAG Chunking Is Not About Length — It Is About Preserving Meaning
Learn how RAG chunking preserves meaning in long documents by avoiding fixed-size chunks
Medium · AI
The Future of RAG: Dead, Evolving… or Becoming the Brain of AI?
Learn about the future of RAG, from its current state to emerging trends like Agentic RAG and multimodal AI
Medium · Machine Learning
Smart Routing, Transfer Family Ingestion, and Voice Chat — Permission-Aware RAG v4.2
Learn about the latest features in Permission-Aware RAG v4.2, including Smart Routing, Transfer Family Ingestion, and Voice Chat, and how to apply them in your projects
Dev.to · Yoshiki Fujiwara (藤原 善基) @ AWS Community Builder