Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR

📰 ArXiv cs.AI

Introduces Nwāchā Munā, a Devanagari speech corpus for Nepal Bhasha automatic speech recognition (ASR), together with a proximal transfer benchmark.

Advanced · Published 31 Mar 2026

Action Steps

Curate a speech corpus for a low-resource language like Nepal Bhasha
Manually transcribe the speech data using the Devanagari script
Establish a benchmark for speech recognition using script-preserving acoustic modeling
Investigate proximal cross-lingual transfer from a geographically close language
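
The transcription and benchmarking steps above could be sketched roughly as follows. This is a minimal illustration, not the paper's pipeline: it assumes a character-level output vocabulary built directly from Devanagari transcripts (one way to keep the script intact rather than romanizing) and character error rate (CER) as the metric, which the summary does not confirm. The helper names `normalize`, `build_vocab`, and `cer` are hypothetical, and characters are counted as Unicode code points rather than grapheme clusters.

```python
import unicodedata


def normalize(text: str) -> str:
    """NFC-normalize and collapse whitespace so equivalent spellings compare equal."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def build_vocab(transcripts):
    """Map each Devanagari code point seen in the transcripts to an integer id.

    Id 0 is left unused here on the assumption that a CTC-style model
    would reserve it for the blank symbol.
    """
    chars = sorted({ch for t in transcripts for ch in normalize(t)})
    return {ch: i for i, ch in enumerate(chars, start=1)}


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    vocab = build_vocab(["नेपाल भाषा"])
    print(len(vocab), "symbols in the vocabulary")
    print("CER:", cer("नेपाल", "नेपल"))
```

Normalizing before both vocabulary construction and scoring keeps visually identical transcripts from being counted as errors. Whether code points or grapheme clusters are the better unit for Devanagari is a design choice this sketch leaves open.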

Who Needs to Know This

This research benefits AI engineers, data scientists, and linguists working on speech recognition and natural language processing for low-resource languages, as it provides a new dataset and benchmark for Nepal Bhasha ASR.

Key Insight

💡 Proximal cross-lingual transfer can be effective for low-resource languages like Nepal Bhasha