Building Multimodal Search and RAG
Learn how to build multimodal search and RAG systems. RAG systems enhance an LLM by incorporating proprietary data into the prompt context. Typically, RAG applications use text documents, but what if the desired context includes multimedia such as images, audio, and video? This course covers the technical aspects of implementing RAG with multimodal data.
1. Learn how multimodal models are trained through contrastive learning and implement it on a real dataset.
2. Build any-to-any multimodal search to retrieve relevant context across different data types.
3. Learn how LLMs a…
Watch on Coursera ↗
DeepCamp AI