SimSD: Simple Speculative Decoding in Diffusion Language Models
📰 ArXiv cs.AI
arXiv:2606.02544v1 Announce Type: cross
Abstract: Diffusion large language models (dLLMs) have recently emerged as a promising alternative to autoregressive (AR) LLMs, offering faster inference through parallel or blockwise decoding. However, their masked language modeling formulation remains incompatible with standard token-level speculative decoding, one of the most effective acceleration techniques for AR models. In AR decoding, the causal mask preserves temporally valid token-level contexts,
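For readers unfamiliar with the AR baseline the abstract refers to, here is a minimal sketch of standard token-level speculative decoding with greedy verification. It is not the SimSD method, which this excerpt does not describe. The names `speculative_step`, `generate`, `toy_target`, and `toy_draft` are hypothetical, and the two "models" are toy functions standing in for real networks. In a real AR model the verification loop would be a single forward pass, which is possible because the causal mask gives every drafted position a valid left-to-right context.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy next-token function (assumed interface)


def speculative_step(prefix: List[Token], draft_next: NextToken,
                     target_next: NextToken, k: int) -> List[Token]:
    """Run one draft-and-verify round and return the tokens to append."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposed: List[Token] = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # 2. The target model checks each proposed token against its own choice.
    #    (Looped here for clarity; a real AR model scores all k positions at once.)
    accepted: List[Token] = []
    ctx = list(prefix)
    for tok in proposed:
        expected = target_next(ctx)
        if expected != tok:
            # First mismatch: keep the target's token and discard the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3. Every draft token was accepted, so the target contributes one extra token.
    accepted.append(target_next(ctx))
    return accepted


def generate(prefix: List[Token], draft_next: NextToken,
             target_next: NextToken, k: int, max_new: int) -> List[Token]:
    """Repeat speculative rounds until max_new tokens have been produced."""
    out = list(prefix)
    while len(out) - len(prefix) < max_new:
        out.extend(speculative_step(out, draft_next, target_next, k))
    return out[: len(prefix) + max_new]


def toy_target(ctx: List[Token]) -> Token:
    """Toy 'large' model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    """Toy 'small' model: agrees with the target except right after token 4."""
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10
```

With greedy verification the output matches what the target model would produce on its own; the draft model only changes how many target calls are needed. For example, `generate([0], toy_draft, toy_target, k=3, max_new=8)` reproduces the target's upward count even though the draft model errs after token 4.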