Neural networks for Text-to-Speech evaluation

📰 ArXiv cs.AI

arXiv:2604.08562v1 Announce Type: cross Abstract: Ensuring that Text-to-Speech (TTS) systems deliver human-perceived quality at scale is a central challenge for modern speech technologies. Human subjective evaluation protocols such as Mean Opinion Score (MOS) and Side-by-Side (SBS) comparisons remain the de facto gold standards, yet they are expensive, slow, and sensitive to pervasive assessor biases. This study addresses these barriers by formulating and implementing a suite of novel neural mo

Published 13 Apr 2026