Cohere Transcribe AI Model That Runs on Your Laptop — No API, No GPU

AI Anytime · Advanced · 🧠 Large Language Models · 8h ago
In this video, I built a local speech-to-text transcription app using Cohere's new Transcribe model, the #1 model on the Hugging Face Open ASR Leaderboard. The app runs entirely on your CPU using ONNX Runtime; no GPU or API key is needed.

What we cover:
1. Cohere Transcribe ONNX model (2B parameters, 14 languages)
2. Raw ONNX Runtime inference with encoder-decoder architecture and KV cache
3. Streamlit UI for uploading audio files or downloading from YouTube/URLs

Supports INT8, Q4, and FP16 quantization.

If you found this helpful, please like, comment, and subscribe for more AI tutorials!

GitH…
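For readers unfamiliar with the "encoder-decoder with KV cache" item, here is a minimal sketch of a greedy decoding loop that reuses a cache between steps. It is not the code from the video: `greedy_decode`, `toy_step`, and the token ids are hypothetical names made up for illustration, and the toy step function stands in for what would, in a real app, presumably be a call to an ONNX Runtime decoder session that takes the previous token plus cached keys/values and returns logits plus the updated cache.

```python
from typing import Callable, List, Tuple

# Hypothetical cache type: one (key, value) entry per decoded position.
Cache = List[Tuple[int, int]]
StepFn = Callable[[int, Cache], Tuple[List[float], Cache]]


def greedy_decode(step: StepFn, bos_id: int, eos_id: int,
                  max_new_tokens: int) -> Tuple[List[int], Cache]:
    """Greedy decoding that feeds only the latest token at each step.

    The cache carries the state for all earlier positions, so the
    decoder never has to reprocess the whole prefix.
    """
    tokens: List[int] = []
    cache: Cache = []
    last = bos_id
    for _ in range(max_new_tokens):
        logits, cache = step(last, cache)
        # Greedy choice: index of the largest logit.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        if next_id == eos_id:
            break
        tokens.append(next_id)
        last = next_id
    return tokens, cache


def toy_step(token: int, cache: Cache) -> Tuple[List[float], Cache]:
    """Stand-in for a real decoder call (assumption, not a real model).

    Appends one cache entry per call and predicts a token id equal to
    the new cache length, capped at 4 (used as EOS in this toy setup).
    """
    new_cache = cache + [(token, token)]
    target = min(len(new_cache), 4)
    logits = [1.0 if i == target else 0.0 for i in range(5)]
    return logits, new_cache


if __name__ == "__main__":
    ids, final_cache = greedy_decode(toy_step, bos_id=0, eos_id=4,
                                     max_new_tokens=10)
    print(ids, len(final_cache))
```

The point of the design is that each step costs roughly the same regardless of how many tokens have been produced so far, because the cache grows by one entry per step instead of the prefix being recomputed.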