Detached Skip-Links and $R$-Probe: Decoupling Feature Aggregation from Gradient Propagation for MLLM OCR
📰 ArXiv cs.AI
Detached Skip-Links and $R$-Probe decouple feature aggregation from gradient propagation to improve MLLM OCR performance
Action Steps
- Identify gradient-interference problems that arise when fusing features across multiple layers
- Decouple feature aggregation from gradient propagation using Detached Skip-Links
- Apply the $R$-Probe to further mitigate gradient interference
- Evaluate and refine the approach to improve MLLM OCR performance
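The decoupling step above can be illustrated with a tiny scalar autodiff sketch: a detached skip-link adds a shallow feature's *value* into the aggregated output while stopping gradients along that path, so only the deep path receives gradient updates. The `Value` class and `detach` operation below are illustrative stand-ins, not the paper's actual implementation.

```python
class Value:
    """Minimal scalar with reverse-mode autodiff (illustrative only)."""
    def __init__(self, data, parents=(), grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents   # upstream nodes
        self._grads = grads       # local partial derivatives

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def detach(self):
        # Same value, no parents: gradient propagation stops here.
        return Value(self.data)

    def backward(self):
        self.grad = 1.0
        stack = [self]
        while stack:
            node = stack.pop()
            for parent, local in zip(node._parents, node._grads):
                parent.grad += local * node.grad
                stack.append(parent)

# Plain skip-link: gradients flow through both the deep path (x*x)
# and the shallow path (x*5), so they can interfere.
x1 = Value(3.0)
y1 = (x1 * x1) + (x1 * Value(5.0))
y1.backward()
print(x1.grad)  # 2*3 + 5 = 11.0

# Detached skip-link: the shallow feature's value is still aggregated,
# but gradients propagate only through the deep path.
x2 = Value(3.0)
y2 = (x2 * x2) + (x2 * Value(5.0)).detach()
y2.backward()
print(x2.grad)  # 2*3 = 6.0
```

In a framework such as PyTorch, the same effect would come from applying `.detach()` (or a stop-gradient) to the shallow features before summing them into the fused representation.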
Who Needs to Know This
ML researchers and engineers working on MLLMs, who can use this approach to improve OCR performance; software engineers optimizing model training can also apply the same gradient-decoupling principle
Key Insight
💡 Detached Skip-Links and $R$-Probe can mitigate gradient interference and improve model stability
Share This
💡 Decoupling feature aggregation from gradient propagation improves MLLM OCR performance
DeepCamp AI