Real-Time Voicemail Detection in Telephony Audio Using Temporal Speech Activity Features
arXiv cs.AI
arXiv:2604.09675v1 (announce type: cross)
Abstract: Outbound AI calling systems must distinguish voicemail greetings from live human answers in real time to avoid wasted agent interactions and dropped calls. We present a lightweight approach that extracts 15 temporal features from the speech activity pattern of a pre-trained neural voice activity detector (VAD), then classifies with a shallow tree-based ensemble. Across two evaluation sets totaling 764 telephony recordings, the system achieves a c
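To make the idea of "temporal features from the speech activity pattern" concrete, here is a minimal sketch in Python. It assumes the VAD emits one speech probability per fixed-length frame. The six features, the function names, the 0.5 threshold, and the 30 ms frame length are all illustrative assumptions; the abstract does not list the paper's 15 features, and this is not the authors' implementation.

```python
from typing import Dict, List, Tuple


def speech_segments(probs: List[float], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Group consecutive frames with prob >= threshold into (start, end) runs.

    Indices are frame numbers; `end` is exclusive.
    """
    segments = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i                      # a speech run begins
        elif p < threshold and start is not None:
            segments.append((start, i))    # the run ended on the previous frame
            start = None
    if start is not None:                  # audio ended while still in speech
        segments.append((start, len(probs)))
    return segments


def temporal_features(
    probs: List[float],
    frame_sec: float = 0.03,   # assumed frame hop, not taken from the paper
    threshold: float = 0.5,    # assumed decision threshold
) -> Dict[str, float]:
    """Compute a few hypothetical timing features from per-frame VAD output."""
    segs = speech_segments(probs, threshold)
    total = len(probs) * frame_sec
    durations = [(end - start) * frame_sec for start, end in segs]
    # Pause = silence between the end of one segment and the start of the next.
    pauses = [
        (segs[i + 1][0] - segs[i][1]) * frame_sec for i in range(len(segs) - 1)
    ]
    return {
        "speech_ratio": sum(durations) / total if total else 0.0,
        "num_segments": float(len(segs)),
        "first_speech_onset_sec": segs[0][0] * frame_sec if segs else total,
        "first_segment_sec": durations[0] if durations else 0.0,
        "longest_segment_sec": max(durations, default=0.0),
        "mean_pause_sec": sum(pauses) / len(pauses) if pauses else 0.0,
    }


if __name__ == "__main__":
    # Toy input: a short burst of speech, a pause, then a longer stretch.
    toy = [0.1] * 5 + [0.9] * 10 + [0.2] * 8 + [0.95] * 40
    for name, value in temporal_features(toy).items():
        print(f"{name}: {value:.3f}")
```

A feature dictionary like this could be flattened into a fixed-order vector and passed to a shallow tree-based ensemble, as the abstract describes. The intuition, offered here as an assumption, is that a live answer tends to be a brief utterance followed by silence, while a voicemail greeting tends to be a longer, more continuous stretch of speech.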