NovBench: Evaluating Large Language Models on Academic Paper Novelty Assessment
📰 ArXiv cs.AI
arXiv:2604.11543v1 Announce Type: cross

Abstract: Novelty is a core requirement in academic publishing and a central focus of peer review, yet the growing volume of submissions has placed increasing pressure on human reviewers. While large language models (LLMs), including those fine-tuned on peer review data, have shown promise in generating review comments, the absence of a dedicated benchmark has limited systematic evaluation of their ability to assess research novelty. To address this gap, w