FashionMV: Product-Level Composed Image Retrieval with Multi-View Fashion Data

📰 ArXiv cs.AI

arXiv:2604.10297v1 (cross-listed). Abstract: Composed Image Retrieval (CIR) retrieves target images using a reference image paired with modification text. Despite rapid advances, all existing methods and datasets operate at the image level (a single reference image plus modification text in, a single target image out), while real e-commerce users reason about products shown from multiple viewpoints. We term this mismatch View Incompleteness and formally define a new Multi-View CIR task t

Published 14 Apr 2026
