LLM-ReSum: A Framework for LLM Reflective Summarization through Self-Evaluation
📰 ArXiv cs.AI
arXiv:2604.25665v1 Announce Type: cross

Abstract: Reliable evaluation of large language model (LLM)-generated summaries remains an open challenge, particularly across heterogeneous domains and document lengths. We conduct a comprehensive meta-evaluation of 14 automatic summarization metrics and LLM-based evaluators across seven datasets spanning five domains, covering documents from short news articles to long scientific, governmental, and legal texts (2K–27K words) with over 1,500 human-annotated …