Cohere Transcribe AI Model That Runs on Your Laptop — No API, No GPU
In this video, I build a local speech-to-text transcription app using Cohere's new Transcribe model, the #1 model on the Hugging Face Open ASR Leaderboard. The app runs entirely on your CPU using ONNX Runtime, with no GPU or API key needed.
What we cover:
1. Cohere Transcribe ONNX model (2B parameters, 14 languages)
2. Raw ONNX Runtime inference with encoder-decoder architecture and KV cache
3. Streamlit UI for uploading audio files or downloading from YouTube/URLs
4. Support for INT8, Q4, and FP16 quantization
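
For a rough idea of what "raw" encoder-decoder inference with a KV cache looks like, here is a minimal Python sketch. It is not the code from the video. The names `make_cpu_session`, `greedy_decode`, and `step` are made up for illustration, and the real model's input and output names depend on how it was exported. The only ONNX Runtime call used is `InferenceSession` with the `CPUExecutionProvider`; the decoding loop takes the decoder step as a plain function so the idea stays visible without a model file.

```python
from typing import Any, Callable, List, Optional, Tuple


def make_cpu_session(model_path: str):
    """Open an ONNX model on the CPU only (no GPU provider is requested)."""
    # Imported lazily so the decoding loop below can run without onnxruntime.
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


# Hypothetical decoder step: (last_token, encoder_states, cache) -> (logits, new_cache).
# In a real app this would wrap session.run(...) on the decoder graph and
# return the updated key/value tensors as the new cache (an assumption about
# the export, not something confirmed by the video).
StepFn = Callable[[int, Any, Optional[Any]], Tuple[List[float], Any]]


def greedy_decode(
    step: StepFn,
    encoder_states: Any,
    bos_id: int,
    eos_id: int,
    max_new_tokens: int = 256,
) -> List[int]:
    """Greedy decoding that feeds only the newest token at each step.

    The cache carries the attention keys/values of all earlier tokens, so the
    decoder does not recompute them on every iteration.
    """
    tokens = [bos_id]
    cache = None  # empty cache before the first step
    for _ in range(max_new_tokens):
        logits, cache = step(tokens[-1], encoder_states, cache)
        # Pick the highest-scoring token id (greedy choice).
        next_id = max(range(len(logits)), key=logits.__getitem__)
        if next_id == eos_id:
            break
        tokens.append(next_id)
    return tokens[1:]  # drop the start-of-sequence token
```

The encoder runs once per audio clip; only the decoder step repeats, which is why caching keys and values matters most on a CPU.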
If you found this helpful, please like, comment, and subscribe for more AI tutorials!
GitH…
DeepCamp AI