Parallelism and Generation Order in Masked Diffusion Language Models: Limits Today, Potential Tomorrow

📰 ArXiv cs.AI

arXiv:2601.15593v2 Announce Type: replace-cross

Abstract: Masked Diffusion Language Models (MDLMs) promise parallel token generation and arbitrary-order decoding, yet it remains unclear to what extent current models truly realize these capabilities. We characterize MDLM behavior along two dimensions -- parallelism strength and generation order -- using Average Finalization Parallelism (AFP) and Kendall's tau. We evaluate eight mainstream MDLMs (up to 100B parameters) on 58 benchmarks spanning kn…
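The paper's exact metric definitions are not given in this excerpt, but the two quantities named can be sketched. Below is a minimal Python illustration under stated assumptions: Kendall's tau is computed between a model's token finalization order and plain left-to-right order (+1 means strictly left-to-right decoding, -1 means right-to-left), and AFP is read, hypothetically, as the mean number of tokens finalized per decoding step. Both readings are assumptions for illustration, not the paper's definitions.

```python
# Hedged sketch of the two metrics named in the abstract.
# ASSUMPTION: these are plausible readings, not the paper's exact definitions.

def kendall_tau(order):
    """Kendall's tau between an observed finalization order and
    left-to-right order.

    order[i] = decoding step at which token position i was finalized.
    Returns +1 for strictly left-to-right finalization, -1 for
    right-to-left; ties (tokens finalized in the same step) count
    as neither concordant nor discordant.
    """
    n = len(order)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            if order[i] < order[j]:
                concordant += 1      # pair agrees with left-to-right
            elif order[i] > order[j]:
                discordant += 1      # pair disagrees
    pairs = n * (n - 1) // 2
    return (concordant - discordant) / pairs

def average_finalization_parallelism(tokens_per_step):
    """ASSUMED reading of AFP: mean number of tokens finalized per
    decoding step. tokens_per_step[t] = tokens finalized at step t."""
    return sum(tokens_per_step) / len(tokens_per_step)

# Example: a model finalizing 2 tokens per step, mostly left-to-right.
order = [0, 0, 1, 1, 2, 2]          # token i finalized at step order[i]
print(kendall_tau(order))           # high positive: near left-to-right
print(average_finalization_parallelism([2, 2, 2]))  # 2 tokens/step
```

A fully sequential left-to-right decoder would give tau = 1.0 and AFP = 1.0; a model that truly exploits parallelism and arbitrary order would show AFP well above 1 with tau nearer 0.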

Published 14 Apr 2026