Personalizing LLMs with Binary Feedback: A Preference-Corrected Optimization Framework

📰 ArXiv cs.AI

arXiv:2605.10043v1 (announce type: cross)

Abstract: Large Language Model (LLM) personalization aims to align model behaviors with individual user preferences. Existing methods often focus on isolated user histories, neglecting the essential role of inter-user differences. We propose C-BPO, a framework that personalizes LLMs via preference-calibrated binary signals. By treating target user data as positive feedback and other users' data as an auxiliary set of implicit negative signals, C-BPO captur

Published 12 May 2026
