Distilling Conversations: Abstract Compression of Conversational Audio Context for LLM-based ASR
ArXiv cs.AI
arXiv:2603.26246v1 Announce Type: cross

Abstract: Standard LLM-based speech recognition systems typically process utterances in isolation, limiting their ability to leverage conversational context. In this work, we study whether multimodal context from prior turns improves LLM-based ASR and how to represent that context efficiently. We find that, after supervised multi-turn training, conversational context mainly helps with the recognition of contextual entities. However, conditioning on raw con