Learning from Noisy Preferences: A Semi-Supervised Learning Approach to Direct Preference Optimization

📰 ArXiv cs.AI

arXiv:2604.24952v1 (announce type: cross)

Abstract: Human visual preferences are inherently multi-dimensional, encompassing aesthetics, detail fidelity, and semantic alignment. However, existing datasets provide only single, holistic annotations, resulting in severe label noise: images that excel in some dimensions but are deficient in others are simply marked as winner or loser. We theoretically demonstrate that compressing multi-dimensional preferences into binary labels generates conflicting gr

Published 29 Apr 2026