UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling

📰 ArXiv cs.AI

arXiv:2604.19734v1 (announce type: cross)

Abstract: Scaling humanoid foundation models is bottlenecked by the scarcity of robotic data. While massive egocentric human data offers a scalable alternative, bridging the cross-embodiment chasm remains a fundamental challenge due to kinematic mismatches. We introduce UniT (Unified Latent Action Tokenizer via Visual Anchoring), a framework that establishes a unified physical language for human-to-humanoid transfer. Grounded in the philosophy that heterog

Published 22 Apr 2026