Rays as Pixels: Learning A Joint Distribution of Videos and Camera Trajectories

📰 ArXiv cs.AI

arXiv:2604.09429v1 (cross-listed)

Abstract: Recovering camera parameters from images and rendering scenes from novel viewpoints have long been treated as separate tasks in computer vision and graphics. This separation breaks down when image coverage is sparse or poses are ambiguous, since each task needs what the other produces. We propose Rays as Pixels, a Video Diffusion Model (VDM) that learns a joint distribution over videos and camera trajectories. We represent each camera as dense ra

Published 13 Apr 2026