Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training

📰 arXiv cs.AI

arXiv:2509.25758v2

Abstract: The remarkable capabilities of modern large reasoning models are largely unlocked through post-training techniques such as supervised fine-tuning (SFT) and reinforcement learning (RL). However, the architectural mechanisms behind such improvements remain opaque. In this work, we use circuit analysis to demonstrate that post-training for complex reasoning sparks the emergence of novel, functionally specialized attention heads. These heads …
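For readers unfamiliar with circuit analysis, the core intervention is simple to sketch. The toy example below (my illustration, not the paper's code) builds a random multi-head attention layer in NumPy and zero-ablates each head in turn, scoring heads by how much the layer's output shifts; all names, shapes, and the ablation style (zero rather than mean ablation) are assumptions made for illustration.

```python
# Minimal sketch of head-ablation analysis: knock out one attention head at a
# time and measure the change in the layer's output as a proxy for that head's
# functional importance. The "model" here is a single random attention layer.
import numpy as np

rng = np.random.default_rng(0)
T, D, H = 8, 32, 4            # sequence length, model dim, number of heads
d = D // H                    # per-head dimension

x = rng.normal(size=(T, D))   # toy residual-stream input
Wq, Wk, Wv, Wo = (rng.normal(size=(D, D)) / np.sqrt(D) for _ in range(4))

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def attention(x, ablate_head=None):
    """One multi-head self-attention pass; optionally zero-ablate one head."""
    q = (x @ Wq).reshape(T, H, d)
    k = (x @ Wk).reshape(T, H, d)
    v = (x @ Wv).reshape(T, H, d)
    out_heads = np.empty((T, H, d))
    for h in range(H):
        att = softmax(q[:, h] @ k[:, h].T / np.sqrt(d))
        out_heads[:, h] = att @ v[:, h]
    if ablate_head is not None:
        out_heads[:, ablate_head] = 0.0   # remove this head's contribution
    return out_heads.reshape(T, D) @ Wo

baseline = attention(x)
for h in range(H):
    delta = np.linalg.norm(attention(x, ablate_head=h) - baseline)
    print(f"head {h}: output change under ablation = {delta:.3f}")
```

In practice such analyses are run on a trained model's task loss rather than a raw output norm, and comparing head-importance maps before and after SFT/RL is what would reveal heads that emerge during post-training.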

Published 15 Apr 2026
Read full paper →