VeruSAGE: A Study of Agent-Based Verification for Rust Systems
📰 ArXiv cs.AI
arXiv:2512.18436v2 Announce Type: replace-cross Abstract: Large language models (LLMs) have shown impressive capability to understand and develop code. However, their capability to rigorously reason about and prove code correctness remains in question. This paper offers a comprehensive study of LLMs' capability to develop correctness proofs for system software written in Rust. We curate a new system-verification benchmark suite, VeruSAGE-Bench, which consists of 849 proof tasks extracted from ei
DeepCamp AI