What Makes a Good AI Review? Concern-Level Diagnostics for AI Peer Review

📰 ArXiv cs.AI

arXiv:2604.19998v1 Announce Type: new

Abstract: Evaluating AI-generated reviews by verdict agreement is widely recognized as insufficient, yet current alternatives rarely audit which concerns a system identifies, how it prioritizes them, or whether those priorities align with the review rationale that shaped the final assessment. We propose concern alignment, a diagnostic framework that evaluates AI reviews at the concern level rather than only at the verdict level. The framework's core data str

Published 23 Apr 2026
