Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence

📰 ArXiv cs.AI

arXiv:2604.24954v1 (Announce Type: cross)

Abstract: We introduce Nemotron 3 Nano Omni, the latest model in the Nemotron multimodal series and the first to natively support audio inputs alongside text, images, and video. Nemotron 3 Nano Omni delivers consistent accuracy improvements over its predecessor, Nemotron Nano V2 VL, across all modalities, enabled by advances in architecture, training data and recipes. In particular, Nemotron 3 delivers leading results in real-world document understanding,

Published 29 Apr 2026