We’re introducing three audio models in the API

OpenAI · Beginner · 🧠 Large Language Models · 1h ago
We’re introducing three audio models in the API that unlock a new class of voice apps for developers. With these models, developers can build voice experiences that feel more natural, respond more intelligently, and take action in real time:
• GPT‑Realtime‑2, our first voice model with GPT‑5‑class reasoning, which can handle harder requests and carry the conversation forward naturally.
• GPT‑Realtime‑Translate, a new live translation model that translates speech from 70+ input languages into 13 output languages while keeping pace with the speaker.
• GPT‑Realtime‑Whisper, a new streaming speech-to-text model that transcribes speech live as the speaker talks.
