VeruSAGE: A Study of Agent-Based Verification for Rust Systems

📰 ArXiv cs.AI

arXiv:2512.18436v2

Abstract: Large language models (LLMs) have shown impressive capability to understand and develop code. However, their capability to rigorously reason about and prove code correctness remains in question. This paper offers a comprehensive study of LLMs' capability to develop correctness proofs for system software written in Rust. We curate a new system-verification benchmark suite, VeruSAGE-Bench, which consists of 849 proof tasks extracted from ei

Published 16 Apr 2026
