Lightning Talk: Training Embedding Model Resiliently for Multimodal M... Huamin Chen & Haichen Zhang

PyTorch · Advanced · 🧠 Large Language Models · 3w ago
Lightning Talk: Training Embedding Model Resiliently for Multimodal Model Inference Routing - Huamin Chen, Red Hat & Haichen Zhang, AMD

LLM systems increasingly rely on intelligent routing to balance cost, latency, and quality tradeoffs. The vLLM Semantic Router, a vLLM Ecosystem project, provides both semantic-level and performance-level routing intelligence for Mixture-of-Multimodal Models (MoM) architectures, but its effectiveness depends on fast and accurate classifiers.

This talk presents our end-to-end journey training production-grade embedding and classification models on AMD GPUs using native PyTorch, achieving high GPU utilization with distributed training optimizations. We introduce a multilingual text embedding model with a 32K context window and 2D Matryoshka support, along with multimodal embedding models, all trained on AMD GPUs using PyTorch DDP. The talk covers practical training optimizations for AMD ROCm. All training code uses native PyTorch distributed primitives, with additional enhancements to improve training stability and pipeline efficiency.

Attendees will learn how to train efficient classifiers for LLM routing systems and integrate these models into production inference pipelines.
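To make the idea of embedding-based routing with Matryoshka-style embeddings more concrete, here is a minimal sketch in plain Python. It is not the speakers' code and not the vLLM Semantic Router API. It assumes that a Matryoshka-trained embedding stays useful when only its leading dimensions are kept, and it covers only that dimension axis (the "2D" variant also allows truncating model layers, which is not shown). The names `truncate_and_normalize`, `route`, `CENTROIDS`, and `QUERY`, the route labels, and the toy vectors are all hypothetical.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def route(query_vec, centroids, dim):
    """Return the label whose centroid is most similar to the query."""
    q = truncate_and_normalize(query_vec, dim)
    scores = {
        label: cosine(q, truncate_and_normalize(c, dim))
        for label, c in centroids.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data: real vectors would come from a trained embedding model.
CENTROIDS = {
    "code": [0.9, 0.1, 0.0, 0.3],
    "vision": [0.1, 0.9, 0.2, 0.0],
}
QUERY = [0.8, 0.2, 0.1, 0.4]

full = route(QUERY, CENTROIDS, dim=4)   # route using all dimensions
short = route(QUERY, CENTROIDS, dim=2)  # route using only the leading dimensions
```

Shorter prefixes make each comparison cheaper, which is one reason a router might prefer truncated embeddings when latency matters. How much accuracy is lost at a given prefix length depends on the trained model and has to be measured.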
