A Layer-wise Analysis of Supervised Fine-Tuning

📰 ArXiv cs.AI

arXiv:2604.11838v1 Announce Type: cross Abstract: While critical for alignment, Supervised Fine-Tuning (SFT) incurs the risk of catastrophic forgetting, yet the layer-wise emergence of instruction-following capabilities remains elusive. We investigate this mechanism via a comprehensive analysis utilizing information-theoretic, geometric, and optimization metrics across model scales (1B-32B). Our experiments reveal a distinct depth-dependent pattern: middle layers (20%-80%) are stable, whereas …

Published 15 Apr 2026