Efficient Training for Cross-lingual Speech Language Models
📰 ArXiv cs.AI
arXiv:2604.11096v1 Announce Type: cross Abstract: Currently, large language models (LLMs) predominantly focus on the text modality. To enable more natural human-AI interaction, speech LLMs are emerging, but building effective end-to-end speech LLMs remains challenging due to limited data and the difficulty of extending to more languages. In this paper, we introduce the Cross-lingual Speech Language Model (CSLM), an efficient training method for cross-lingual speech LLMs based on discrete speech tokens.
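The abstract only names discrete speech tokens without detailing CSLM itself. As a rough, hypothetical sketch (not the paper's actual method) of how discrete speech tokens are commonly folded into an LLM's token space, acoustic frames can be quantized against a codebook and the resulting unit IDs offset past the text vocabulary; all names and sizes below are illustrative assumptions:

```python
import numpy as np

# Assumed, illustrative sizes -- not taken from the paper.
TEXT_VOCAB_SIZE = 32000   # size of the base text vocabulary
NUM_SPEECH_UNITS = 512    # number of discrete acoustic clusters

def quantize_features(features: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Assign each feature frame to its nearest codebook entry (k-means style)."""
    # Squared Euclidean distances, shape (frames, units).
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

def speech_units_to_lm_tokens(units: np.ndarray) -> np.ndarray:
    """Offset unit IDs past the text vocabulary so text and speech share one token space."""
    return units + TEXT_VOCAB_SIZE

rng = np.random.default_rng(0)
codebook = rng.normal(size=(NUM_SPEECH_UNITS, 39))  # toy 39-dim acoustic features
frames = rng.normal(size=(100, 39))                 # 100 frames of toy features

units = quantize_features(frames, codebook)
tokens = speech_units_to_lm_tokens(units)
```

In this sketch, `tokens` always lands in `[TEXT_VOCAB_SIZE, TEXT_VOCAB_SIZE + NUM_SPEECH_UNITS)`, so the language model's embedding table can simply be extended by `NUM_SPEECH_UNITS` rows to cover the speech units.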