Learning from Noisy Preferences: A Semi-Supervised Learning Approach to Direct Preference Optimization
📰 ArXiv cs.AI
arXiv:2604.24952v1. Abstract: Human visual preferences are inherently multi-dimensional, encompassing aesthetics, detail fidelity, and semantic alignment. However, existing datasets provide only single, holistic annotations, resulting in severe label noise: images that excel in some dimensions but are deficient in others are simply marked as winner or loser. We theoretically demonstrate that compressing multi-dimensional preferences into binary labels generates conflicting gr