Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification

📰 ArXiv cs.AI

arXiv:2603.26648v1 (cross-listed)

Abstract: Recent advances in large language models have improved the capabilities of coding agents, yet systematic evaluation of complex, end-to-end website development remains limited. To address this gap, we introduce Vision2Web, a hierarchical benchmark for visual website development that spans three levels: static UI-to-code generation, interactive multi-page frontend reproduction, and long-horizon full-stack website development. The benchmark is constructed from

Published 30 Mar 2026
