Variational Speculative Decoding: Rethinking Draft Training from Token Likelihood to Sequence Acceptance
📰 ArXiv cs.AI
arXiv:2602.05774v4. Abstract: Speculative decoding accelerates inference for (M)LLMs, yet a training-decoding discrepancy persists: while existing methods optimize single greedy trajectories, decoding involves verifying and ranking multiple sampled draft paths. We propose Variational Speculative Decoding (VSD), formulating draft training as variational inference over latent proposals (draft paths). VSD maximizes the marginal probability of target-model acceptance, yie
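For readers unfamiliar with the verification step the abstract refers to, here is a minimal Python sketch of the standard speculative-sampling acceptance rule: a drafted token x is accepted with probability min(1, p(x)/q(x)), where q is the draft distribution and p is the target distribution. The sketch also estimates, by sampling draft paths, how often a whole path is accepted. This is not VSD's training objective, which the truncated abstract does not spell out. The function names, the toy two-token vocabulary, and the simplification that each position's distribution does not depend on the prefix are all assumptions made for illustration.

```python
import random


def token_accept_prob(p_target, q_draft, token):
    """Standard speculative-sampling rule: accept with probability min(1, p/q).

    Assumes q_draft[token] > 0, which holds for any token sampled from q_draft.
    """
    return min(1.0, p_target[token] / q_draft[token])


def path_accept_prob(p_steps, q_steps, path):
    """Probability that every token of one drafted path is accepted."""
    prob = 1.0
    for p, q, tok in zip(p_steps, q_steps, path):
        prob *= token_accept_prob(p, q, tok)
    return prob


def sample_path(q_steps, rng):
    """Sample one draft path, one token per position, from the draft model.

    Toy assumption: q_steps[i] is a fixed distribution for position i,
    independent of the tokens drafted before it.
    """
    return [rng.choices(range(len(q)), weights=q)[0] for q in q_steps]


def expected_full_acceptance(p_steps, q_steps, num_paths=2000, seed=0):
    """Monte Carlo estimate of the probability that a sampled draft path
    is accepted in full, averaged over paths drawn from the draft model."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_paths):
        path = sample_path(q_steps, rng)
        total += path_accept_prob(p_steps, q_steps, path)
    return total / num_paths
```

In this toy setting the exact per-position acceptance probability is the sum over tokens of min(p(x), q(x)), so a two-step draft with q = [0.9, 0.1] and p = [0.5, 0.5] at both positions is accepted in full with probability 0.6 × 0.6 = 0.36, and the estimate should land close to that value. A draft that matches the target exactly is always accepted.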