HP-Edit: A Human-Preference Post-Training Framework for Image Editing

📰 ArXiv cs.AI

arXiv:2604.19406v1 Announce Type: cross Abstract: Generative diffusion models have become the leading paradigm for real-world image editing. Although reinforcement learning (RL) methods such as Diffusion-DPO and Flow-GRPO have further improved generation quality, efficiently applying Reinforcement Learning from Human Feedback (RLHF) to diffusion-based editing remains largely unexplored, due to a lack of scalable human-preference datasets an

Published 22 Apr 2026