Create Image-to-Speech Apps with Azure AI
Want to build applications that can see and speak? In this video, you’ll explore how Azure AI Vision and Azure AI Speech work together to create multimodal experiences—like describing images aloud for accessibility and automation.
Follow a step-by-step walkthrough inside Azure AI Foundry to create resources, test Vision Studio’s dense captioning, and convert image descriptions into speech using the Speech playground. By the end, you’ll understand how to connect these services to build smarter, more accessible applications.
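The Vision-to-Speech flow described above can be sketched in code. This is a minimal, hedged example assuming the current Python SDKs (`azure-ai-vision-imageanalysis` for dense captioning and `azure-cognitiveservices-speech` for text-to-speech); the endpoint, keys, region, image URL, and voice name are placeholders you would replace with values from your own Azure AI Foundry resource, as shown in the "Finding Keys and Endpoints" chapter.

```python
"""Sketch: caption an image with Azure AI Vision, then read it aloud
with Azure AI Speech. All endpoints/keys below are placeholders.

Assumed packages (install first):
    pip install azure-ai-vision-imageanalysis azure-cognitiveservices-speech
"""

def captions_to_script(captions):
    """Join dense-caption texts into one sentence suitable for narration."""
    return "I can see: " + "; ".join(captions) + "."

def describe_image(endpoint, key, image_url):
    """Return the dense-caption texts Azure AI Vision finds in the image."""
    # Imports are local so the pure helper above works without the SDKs installed.
    from azure.ai.vision.imageanalysis import ImageAnalysisClient
    from azure.ai.vision.imageanalysis.models import VisualFeatures
    from azure.core.credentials import AzureKeyCredential

    client = ImageAnalysisClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    result = client.analyze_from_url(
        image_url, visual_features=[VisualFeatures.DENSE_CAPTIONS]
    )
    return [caption.text for caption in result.dense_captions.list]

def speak(text, key, region):
    """Synthesize text with a neural voice and play it on the default speaker."""
    import azure.cognitiveservices.speech as speechsdk

    config = speechsdk.SpeechConfig(subscription=key, region=region)
    config.speech_synthesis_voice_name = "en-US-JennyNeural"  # assumed voice
    speechsdk.SpeechSynthesizer(speech_config=config).speak_text_async(text).get()

if __name__ == "__main__":
    # Placeholder values -- substitute your resource's endpoint, keys, and region.
    captions = describe_image(
        "https://<your-resource>.cognitiveservices.azure.com/",
        "<vision-key>",
        "https://example.com/photo.jpg",
    )
    speak(captions_to_script(captions), "<speech-key>", "<region>")
```

In a production app you would typically return the synthesized audio as a stream rather than playing it locally, and load the keys from environment variables or a managed identity rather than hard-coding them.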
Chapters (14)
0:00 – Why Multimodal AI Matters
0:29 – What Is Azure AI Vision?
0:40 – Azure AI Service vs Resource vs Studio
1:01 – Creating an Azure AI Foundry Resource
1:21 – Managing Resources in the Azure Portal
1:57 – Setting Up a New Project
2:05 – Navigating Vision Studio
2:44 – Finding Keys and Endpoints
2:49 – Using Dense Captioning in Image Studio
3:19 – Testing AI Speech (Text-to-Speech)
4:00 – Connecting Vision and Speech
4:44 – Portal vs Studio: When to Use Each
5:34 – Monitoring, Security, and Production Use
5:56 – Building Accessible AI Applications

DeepCamp AI