SimSD: Simple Speculative Decoding in Diffusion Language Models

📰 ArXiv cs.AI

arXiv:2606.02544v1 (cross-listed)

Abstract: Diffusion large language models (dLLMs) have recently emerged as a promising alternative to autoregressive (AR) LLMs, offering faster inference through parallel or blockwise decoding. However, their masked language modeling formulation remains incompatible with standard token-level speculative decoding, one of the most effective acceleration techniques for AR models. In AR decoding, the causal mask preserves temporally valid token-level contexts,

Published 2 Jun 2026
