DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation

📰 ArXiv cs.AI

arXiv:2506.14202v4

Abstract: End-to-end backpropagation requires storing activations throughout all layers, creating memory bottlenecks that limit model scalability. Existing block-wise training methods offer means to alleviate this problem, but they rely on ad-hoc local objectives and remain largely unexplored beyond classification tasks. We propose $\textit{DiffusionBlocks}$, a principled framework for transforming transformer-based networks into genuinely independ

Published 15 Jun 2026