KSDiff: Keyframe-Augmented Speech-Aware Dual-Path Diffusion for Facial Animation

arXiv cs.AI

arXiv:2509.20128v2 Announce Type: replace-cross Abstract: Audio-driven facial animation has made significant progress in multimedia applications, and diffusion models show strong potential for talking-face synthesis. However, most existing works treat speech features as a monolithic representation and fail to capture their fine-grained roles in driving different facial motions, while also overlooking the importance of modeling keyframes with intense dynamics. To address these limitations, we …

Published 14 Apr 2026