ProEval: Proactive Failure Discovery and Efficient Performance Estimation for Generative AI Evaluation

📰 ArXiv cs.AI

arXiv:2604.23099v1 (announce type: cross-list)

Abstract: Evaluating generative AI models is increasingly resource-intensive due to slow inference, expensive raters, and a rapidly growing landscape of models and benchmarks. We propose ProEval, a proactive evaluation framework that leverages transfer learning to efficiently estimate performance and identify failure cases. ProEval employs pre-trained Gaussian Processes (GPs) as surrogates for the performance score function, mapping model inputs to metrics
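To make the surrogate idea concrete, here is a minimal sketch of a GP that maps input features to a performance score. It is not ProEval's implementation: the RBF kernel, the fixed hyperparameters (standing in for "pre-trained" ones), the constant prior mean, the toy 1-D features, and every function name below are assumptions chosen for illustration. The rule of flagging the candidate with the lowest predicted score is likewise a hypothetical stand-in for failure discovery.

```python
import math

def rbf(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel (an assumed choice of kernel)."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return variance * math.exp(-0.5 * sq / length_scale ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    return y

def solve_upper_t(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(X_obs, y_obs, X_new, noise=1e-2, prior_mean=0.5):
    """Return (mean, variance) of the GP posterior at each point in X_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X_obs)] for i, a in enumerate(X_obs)]
    L = cholesky(K)
    resid = [y - prior_mean for y in y_obs]
    alpha = solve_upper_t(L, solve_lower(L, resid))  # (K + noise*I)^{-1} resid
    out = []
    for x in X_new:
        k_star = [rbf(x, a) for a in X_obs]
        mean = prior_mean + sum(k * a for k, a in zip(k_star, alpha))
        v = solve_lower(L, k_star)
        var = max(rbf(x, x) - sum(t * t for t in v), 0.0)
        out.append((mean, var))
    return out

# Toy data (hypothetical): 1-D input features and scores from a few rated examples.
X_obs = [[0.0], [1.0], [2.0]]
y_obs = [0.9, 0.8, 0.2]
X_pool = [[0.5], [1.5], [2.5]]  # inputs not yet evaluated

post = gp_posterior(X_obs, y_obs, X_pool)
estimated_score = sum(m for m, _ in post) / len(post)  # cheap estimate over the pool
worst = min(range(len(X_pool)), key=lambda i: post[i][0])  # likeliest failure case
print(estimated_score, X_pool[worst], post[worst])
```

The posterior variance is what would let an evaluator decide where spending a real rating is most informative; how ProEval actually makes that choice is not stated in the truncated abstract.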

Published 28 Apr 2026
