Step-level Denoising-time Diffusion Alignment with Multiple Objectives

📰 ArXiv cs.AI

arXiv:2604.14379v1 (announce type: cross)

Abstract: Reinforcement learning (RL) has emerged as a powerful tool for aligning diffusion models with human preferences, typically by optimizing a single reward function under a KL regularization constraint. In practice, however, human preferences are inherently pluralistic, and aligned models must balance multiple downstream objectives, such as aesthetic quality and text-image consistency. Existing multi-objective approaches either rely on costly multi-

Published 17 Apr 2026