LoopQ: Quantization for Recursive Transformers

arXiv cs.AI

arXiv:2605.16343v1 (announce type: cross)

Abstract: Looped language models (LoopLMs) improve parameter efficiency by recursively reusing Transformer blocks, enabling deeper computation under a fixed model size. However, this reuse makes LoopLMs more fragile under post-training quantization (PTQ). We present the first systematic study of quantization in LoopLMs and identify three challenges: distribution shift across roles, state reuse across loop transitions, and recursive error accumulation. To a…
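To make the recursive error accumulation mentioned in the abstract concrete, here is a minimal toy sketch. It is not the paper's method or model: the residual `tanh` block, the 4-bit symmetric round-to-nearest quantizer, and the names `quantize_symmetric` and `looped_forward` are all assumptions chosen for illustration. The sketch applies one shared weight matrix several times, once in full precision and once after fake quantization, and prints how far the two hidden states have drifted apart after each loop.

```python
import numpy as np

def quantize_symmetric(w, bits=4):
    """Symmetric per-tensor fake quantization with round-to-nearest (illustrative)."""
    qmax = 2 ** (bits - 1) - 1            # 7 for 4 bits, so 15 signed levels
    scale = np.abs(w).max() / qmax
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

def looped_forward(x, w, n_loops):
    """Apply one shared toy block n_loops times; return the state after each loop."""
    states = []
    h = x
    for _ in range(n_loops):
        h = h + np.tanh(h @ w)            # same weights reused on every loop
        states.append(h)
    return states

rng = np.random.default_rng(0)
d = 64                                     # hypothetical hidden size
w = rng.normal(scale=d ** -0.5, size=(d, d))
x = rng.normal(size=(8, d))

w_q = quantize_symmetric(w, bits=4)
ref = looped_forward(x, w, n_loops=4)      # full-precision reference
quant = looped_forward(x, w_q, n_loops=4)  # same loop with quantized weights

# Relative drift between quantized and reference states after each loop.
rel_err = [float(np.linalg.norm(q - r) / np.linalg.norm(r))
           for q, r in zip(quant, ref)]
for t, e in enumerate(rel_err, start=1):
    print(f"loop {t}: relative error {e:.4f}")
```

Because the quantized weights are reused on every pass, the perturbation from the first loop is fed back into the same perturbed block on the next one. How the drift evolves in a real LoopLM depends on the architecture and quantizer, which this toy does not attempt to model.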

Published 19 May 2026