Rethinking Cross-Layer Information Routing in Diffusion Transformers

📰 ArXiv cs.AI

arXiv:2605.20708v1 Announce Type: cross

Abstract: Diffusion Transformers (DiTs) have become a de facto backbone of modern visual generation, and nearly every major axis of their design -- tokenization, attention, conditioning, objectives, and latent autoencoders -- has been extensively revisited. The residual stream that governs how information accumulates across layers, however, has been directly inherited from the original Transformer. In this paper, we present a systematic empirical analysis

Published 21 May 2026
