Multimodal Generative AI: Vision, Speech, and Assistants

Free to audit · Opens on Coursera

Coursera · Beginner · 🧠 Large Language Models · 1mo ago

Skills: Multimodal LLMs (85%), AI Pair Programming (60%)

We are introducing a new course that replaces "Coding with ChatGPT" in the Generative AI specialization. The updated course covers materials, models, and content released in 2024. New additions include material on using AI for image-to-text (vision), text-to-speech, and speech-to-text, as well as the Assistants API. Each of these topics comes with new labs, lessons, and exercises.

INSTALL NEW UNCENSORED FaceGen Ai WebUI LOCALLY in 1 CLICK!

Google Veo 3 Tutorial: How to create AI Videos in Flow, Gemini or Google Vids?

AI Tool Journey

NVIDIA Clara Guardian Virtual Patient Assistant

NVIDIA Developer

Building Multimodal Search and RAG

Midjourney Trick: Consistent Character in Different Images

Ollama Multimodal: EASILY setup Llava locally & Integrate API

Related AI Lessons

Thursday Thoughts: The Models We Can't Run

Explore the limitations of running latest AI models and their implications on the AI community

Big Tech firms are accelerating AI investments and integration, while regulators and companies focus on safety and responsible adoption.

Big Tech firms are investing billions in AI, driving growth and transformation, while prioritizing safety and responsible adoption

35 ChatGPT Prompts for Recruiters (That Actually Work in 2026)

Learn 35 effective ChatGPT prompts for recruiters to streamline their workflow in 2026

Dev.to · ClawGear

Stop Writing Like a Robot: The Prompt That Makes ChatGPT Sound Human

Learn how to craft prompts that make ChatGPT sound human, overcoming lifeless AI writing

Medium · ChatGPT

5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems

Dave Ebbelaar (LLM Eng)