NovBench: Evaluating Large Language Models on Academic Paper Novelty Assessment

📰 ArXiv cs.AI

arXiv:2604.11543v1 Announce Type: cross

Abstract: Novelty is a core requirement in academic publishing and a central focus of peer review, yet the growing volume of submissions has placed increasing pressure on human reviewers. While large language models (LLMs), including those fine-tuned on peer review data, have shown promise in generating review comments, the absence of a dedicated benchmark has limited systematic evaluation of their ability to assess research novelty. To address this gap, w…

Published 14 Apr 2026