Parallelism and Generation Order in Masked Diffusion Language Models: Limits Today, Potential Tomorrow
📰 ArXiv cs.AI
arXiv:2601.15593v2. Abstract: Masked Diffusion Language Models (MDLMs) promise parallel token generation and arbitrary-order decoding, yet it remains unclear to what extent current models truly realize these capabilities. We characterize MDLM behavior along two dimensions, parallelism strength and generation order, using Average Finalization Parallelism (AFP) and Kendall's tau. We evaluate eight mainstream MDLMs (up to 100B parameters) on 58 benchmarks spanning kn
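As a rough illustration of the generation-order measurement named in the abstract, the sketch below computes Kendall's tau between token positions and the decoding step at which each position was finalized. The abstract does not give the paper's exact formulation, so this uses the standard tau-a statistic as an assumption; the `finalize_step`-style lists and the function name `kendall_tau_a` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a between two equal-length numeric sequences.

    Concordant pairs count +1, discordant pairs count -1, and tied
    pairs count 0. The sum is divided by the total number of pairs.
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two items")
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if prod > 0:
            score += 1
        elif prod < 0:
            score -= 1
    return score / (n * (n - 1) / 2)


# Hypothetical data: entry i is the decoding step at which position i
# was finalized (unmasked). These values are made up for illustration.
positions = list(range(6))
left_to_right = [0, 1, 2, 3, 4, 5]   # strictly autoregressive-like order
right_to_left = [5, 4, 3, 2, 1, 0]   # fully reversed order
mixed = [0, 2, 1, 3, 5, 4]           # mostly left-to-right, two local swaps

for name, steps in [("left_to_right", left_to_right),
                    ("right_to_left", right_to_left),
                    ("mixed", mixed)]:
    print(name, kendall_tau_a(positions, steps))
```

Under this convention, a value near 1 indicates a largely left-to-right finalization order and a value near -1 a largely reversed one. Positions finalized in the same step form tied pairs, which tau-a scores as 0, so strongly parallel decoding pulls the statistic toward 0; a tie-corrected variant such as tau-b would treat those pairs differently.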