Gradient-Direction Sensitivity Reveals Linear-Centroid Coupling Hidden by Optimizer Trajectories
📰 ArXiv cs.AI
arXiv:2604.25143v1 Announce Type: cross Abstract: We show that replacing the rolling SVD of AdamW updates with a rolling SVD of loss gradients changes the diagnostic by 1-2 orders of magnitude: the measured perturbative coupling between SED directions and Linear Centroid Hypothesis (LCH) features rises from $ \bar{R}_k \approx 3 $--$9\times$ to $100$--$330\times$ across four single-task modular arithmetic operations, eliminating…
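The contrast the abstract draws — SVD of raw loss gradients versus SVD of the AdamW update actually applied to the weights — can be sketched on a toy problem. The snippet below is a minimal illustration, not the paper's method: the quadratic loss, the AdamW hyperparameters, and the stand-in "feature" direction `f` are all assumptions, and the alignment score is a simple stand-in for the paper's $\bar{R}_k$ diagnostic, whose exact definition is not given here.

```python
import numpy as np

rng = np.random.default_rng(0)
d, steps, window = 16, 64, 32

# Toy quadratic loss L(w) = 0.5 * w^T A w; its gradient is A @ w.
A = np.diag(np.linspace(1.0, 10.0, d))
w = rng.normal(size=d)

# Minimal AdamW step (decoupled weight decay), standard hyperparameters.
m, v = np.zeros(d), np.zeros(d)
b1, b2, lr, eps, wd = 0.9, 0.999, 1e-2, 1e-8, 1e-2

grads, updates = [], []
for t in range(1, steps + 1):
    g = A @ w
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g**2
    mhat = m / (1 - b1**t)
    vhat = v / (1 - b2**t)
    upd = -lr * (mhat / (np.sqrt(vhat) + eps) + wd * w)
    w = w + upd
    grads.append(g.copy())
    updates.append(upd.copy())

# Rolling SVD over the last `window` steps (rows = steps, cols = params).
G = np.stack(grads[-window:])
U = np.stack(updates[-window:])
_, _, VgT = np.linalg.svd(G, full_matrices=False)
_, _, VuT = np.linalg.svd(U, full_matrices=False)

# Alignment of each top right-singular direction with a hypothetical
# feature direction f (a stand-in for an LCH feature in this sketch).
f = np.eye(d)[-1]  # sharpest-curvature axis of this toy quadratic
align_grad = abs(VgT[0] @ f)
align_upd = abs(VuT[0] @ f)
print(f"gradient-SVD alignment: {align_grad:.3f}")
print(f"update-SVD alignment:   {align_upd:.3f}")
```

Adam's per-coordinate normalization rescales directions with large curvature, so the update matrix `U` and the gradient matrix `G` generally have different dominant singular directions; comparing the two alignment scores is the mechanical core of the swap the abstract describes.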