CRAFT: Video Diffusion for Bimanual Robot Data Generation

📰 arXiv cs.AI

arXiv:2604.03552v1 (announce type: cross)

Abstract: Bimanual robot learning from demonstrations is fundamentally limited by the cost and narrow visual diversity of real-world data, which constrains policy robustness across viewpoints, object configurations, and embodiments. We present Canny-guided Robot Data Generation using Video Diffusion Transformers (CRAFT), a video diffusion-based framework for scalable bimanual demonstration generation that synthesizes temporally coherent manipulation videos

Published 7 Apr 2026
