NOT USING Granite 4.1 ASR - The Fastest ASR?
In this video, I dive into IBM's newly released Granite Speech 4.1 models and explore what makes them interesting, particularly the three 2B variants and how each one trades off accuracy, richness, and throughput in ways you'll actually care about for real applications.
We look at the base Granite Speech 4.1 2B, which hits an impressive 5.33% WER (word error rate) on the OpenASR leaderboard; the Plus variant, which adds speaker-attributed ASR and word-level timestamps; and the NAR (Non-Autoregressive) version, which flips the architecture entirely to generate sequences all at once for much better GPU throughput. I also walk through multilingual support across English, French, German, Spanish, Portuguese, and Japanese, plus the bidirectional translation capabilities that make this genuinely useful for enterprise edge deployments.
All three models are Apache 2.0 licensed and available on Hugging Face right now.
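For anyone unsure what the WER figure above measures: word error rate is the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference, divided by the number of reference words. The sketch below is a minimal illustration, assuming simple lowercasing and whitespace tokenization; the function name `word_error_rate` is hypothetical, and this is not the leaderboard's scoring script, which applies its own text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: text is normalized only by lowercasing and splitting on
    whitespace. Real benchmarks normalize punctuation, numbers, etc.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Read this way, a WER of 5.33% means roughly five word errors per hundred reference words, averaged over the benchmark's test sets.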
🔗 Links:
Granite Speech 4.1 2B → https://huggingface.co/ibm-granite/granite-speech-4.1-2b
Granite Speech 4.1 2B Plus → https://huggingface.co/ibm-granite/granite-speech-4.1-2b-plus
Granite Speech 4.1 2B NAR → https://huggingface.co/ibm-granite/granite-speech-4.1-2b-nar
IBM Research Blog → https://research.ibm.com/blog/granite-4-1-ai-foundation-models
Twitter: https://x.com/Sam_Witteveen
🕵️ Interested in building LLM Agents? Fill out the form below
Building LLM Agents Form: https://drp.li/dIMes
👨‍💻 GitHub:
https://github.com/samwit/llm-tutorials
⏱️Time Stamps:
00:00 Intro
00:20 IBM Granite Collection
00:27 Granite Docling
00:46 Granite Speech 4.1
01:16 Granite 4.1 Blog
01:38 Granite Speech 4.1 2B
04:02 Granite Speech 4.1 2B Plus
06:15 Granite Speech 4.1 2B NAR
07:30 NLE: Non-autoregressive LLM-based ASR by Transcript Editing Paper
07:45 Architecture
09:45 Code Time
12:00 Granite Speech Model Github
#DellProPrecision #DellProMax