Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification
📰 ArXiv cs.AI
arXiv:2603.26648v1 Announce Type: cross Abstract: Recent advances in large language models have improved the capabilities of coding agents, yet systematic evaluation of complex, end-to-end website development remains limited. To address this gap, we introduce Vision2Web, a hierarchical benchmark for visual website development, spanning from static UI-to-code generation, interactive multi-page frontend reproduction, to long-horizon full-stack website development. The benchmark is constructed from
DeepCamp AI