$\pi$-Attention: Periodic Sparse Transformers for Efficient Long-Context Modeling

📰 ArXiv cs.AI

arXiv:2511.10696v2 Announce Type: replace-cross

Abstract: Transformers have revolutionized natural language processing, but their quadratic complexity with respect to sequence length remains a fundamental bottleneck for long-range modeling. While sparse attention mechanisms like RingAttention reduce computational costs by restricting attention to local neighborhoods, they suffer from limited receptive fields and a lack of adaptability. We present $\pi$-Attention, a periodic sparse Transformer that […]
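The abstract is cut off before it describes the mechanism, so the exact pattern $\pi$-Attention uses is not recoverable from this page. As a rough illustration of the general idea the abstract gestures at, the sketch below builds a sparse attention mask that combines a local window (RingAttention-style locality) with periodic long-range links. The `window`/`period` scheme, function names, and parameters are assumptions made for illustration, not the paper's method.

```python
import torch

def periodic_sparse_mask(seq_len: int, window: int, period: int) -> torch.Tensor:
    """Boolean (seq_len, seq_len) mask: True where attention is allowed.

    Illustrative guess at a "periodic sparse" pattern: a local window
    (|i - j| <= window) plus periodic long-range links (|i - j| % period == 0).
    Not the paper's actual pattern.
    """
    idx = torch.arange(seq_len)
    dist = (idx[:, None] - idx[None, :]).abs()
    local = dist <= window            # local neighborhood, as in RingAttention
    periodic = (dist % period) == 0   # sparse periodic long-range connections
    return local | periodic

def sparse_attention(q, k, v, mask):
    # q, k, v: (seq_len, d); mask: (seq_len, seq_len) bool.
    # A dense mask is used here only for clarity; an efficient implementation
    # would use block-sparse kernels rather than materializing full scores.
    scores = (q @ k.T) / k.shape[-1] ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

seq_len, d = 16, 8
q = k = v = torch.randn(seq_len, d)
out = sparse_attention(q, k, v, periodic_sparse_mask(seq_len, window=2, period=4))
print(out.shape)  # torch.Size([16, 8])
```

Under this kind of pattern, each query attends to O(window + seq_len/period) keys rather than O(seq_len), which is the reduction from quadratic cost that sparse attention mechanisms target.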

Published 31 Mar 2026