RobotArena $\infty$: Scalable Robot Benchmarking via Real-to-Sim Translation
📰 ArXiv cs.AI
arXiv:2510.23571v3
Abstract: The pursuit of robot generalists, agents capable of performing diverse tasks across diverse environments, demands rigorous and scalable evaluation. Yet real-world testing of robot policies remains fundamentally constrained: it is labor-intensive, slow, unsafe at scale, and difficult to reproduce. As policies expand in scope and complexity, these barriers only intensify, since defining "success" in robotics often hinges on nuanced human ju