interwhen: A Generalizable Framework for Steering Reasoning Models with Test-time Verification

📰 ArXiv cs.AI

arXiv:2602.11202v3. Abstract: Reasoning models produce long traces of intermediate decisions and tool calls, making test-time verification important for ensuring correctness. Existing approaches either verify only the final answer, which misses early errors, or rely on branch-and-verify strategies that explore multiple trajectories. We introduce interwhen, a single-trajectory verification framework that steers model behavior by providing feedback on intermediate reaso

Published 14 May 2026
