Efficient Data Selection for Multimodal Models via Incremental Optimization Utility

arXiv cs.AI

arXiv:2605.07488v1 (announce type: new)

Abstract: The scaling of Large Multimodal Models (LMMs) is constrained by the quality-quantity trade-off inherent in synthetic data. Previous approaches, such as LLM-as-a-Judge, have proven effective in addressing this trade-off but suffer from prohibitive computational costs and a lack of interpretability. To bridge this gap, we propose One-Step-Train (OST), a framework that reformulates data selection as an incremental optimization utility ranking problem. [...]

Published 11 May 2026
