Step-level Denoising-time Diffusion Alignment with Multiple Objectives
📰 ArXiv cs.AI
arXiv:2604.14379v1 Announce Type: cross Abstract: Reinforcement learning (RL) has emerged as a powerful tool for aligning diffusion models with human preferences, typically by optimizing a single reward function under a KL regularization constraint. In practice, however, human preferences are inherently pluralistic, and aligned models must balance multiple downstream objectives, such as aesthetic quality and text-image consistency. Existing multi-objective approaches either rely on costly multi-