LoopQ: Quantization for Recursive Transformers
📰 ArXiv cs.AI
arXiv:2605.16343v1 (announce type: cross)
Abstract: Looped language models (LoopLMs) improve parameter efficiency by recursively reusing Transformer blocks, enabling deeper computation under a fixed model size. However, this reuse makes LoopLMs more fragile under post-training quantization (PTQ). We present the first systematic study of quantization in LoopLMs and identify three challenges: distribution shift across roles, state reuse across loop transitions, and recursive error accumulation. To a
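The third challenge named in the abstract, recursive error accumulation, can be illustrated with a toy experiment. The sketch below is not the paper's method or its model. It assumes a single shared `tanh` layer standing in for a looped Transformer block, and plain symmetric round-to-nearest weight quantization standing in for PTQ. The names `fake_quantize`, `looped_states`, and `relative_errors`, the weight scale, and the sizes are all hypothetical choices made for illustration.

```python
import numpy as np


def fake_quantize(w, bits=4):
    """Symmetric per-tensor round-to-nearest quantization, returned as floats."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.clip(np.round(w / scale), -qmax, qmax) * scale


def looped_states(w, x, loops):
    """Apply one shared toy block `loops` times and record every hidden state."""
    h, states = x, []
    for _ in range(loops):
        h = np.tanh(h @ w)  # the same weights are reused on every pass
        states.append(h)
    return states


def relative_errors(w, x, loops, bits):
    """Relative L2 gap between full-precision and quantized states, per loop."""
    ref = looped_states(w, x, loops)
    quant = looped_states(fake_quantize(w, bits), x, loops)
    return [float(np.linalg.norm(q - r) / np.linalg.norm(r))
            for q, r in zip(quant, ref)]


rng = np.random.default_rng(0)
dim = 64
# Hypothetical toy weights and inputs; not taken from any real LoopLM.
w = rng.normal(scale=1.5 / np.sqrt(dim), size=(dim, dim))
x = rng.normal(size=(8, dim))

for bits in (8, 4):
    errs = relative_errors(w, x, loops=6, bits=bits)
    print(f"{bits}-bit:", [round(e, 4) for e in errs])
```

Each printed list gives the gap between the quantized and full-precision hidden state after loops 1 through 6. Because the quantized weights are reused on every pass, the perturbation introduced in one loop is fed back as input to the next, which is the mechanism the abstract refers to. How quickly the gap grows in this toy depends on the assumed weight scale and nonlinearity, so the numbers say nothing about real LoopLMs.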