Variation in Verification: Understanding Verification Dynamics in Large Language Models

📰 ArXiv cs.AI

arXiv:2509.17995v2

Abstract: Recent advances have shown that scaling test-time computation enables large language models (LLMs) to solve increasingly complex problems across diverse domains. One effective paradigm for test-time scaling (TTS) involves LLM generators producing multiple solution candidates, with LLM verifiers assessing the correctness of these candidates without reference answers. In this paper, we study generative verifiers, which perform verification

Published 15 Apr 2026
