Submodular Benchmark Selection
📰 ArXiv cs.AI
arXiv:2605.02209v1. Abstract: Evaluating large language models across many benchmarks is expensive, yet many benchmarks are highly correlated. We formalize the selection of a small, informative subset as submodular maximization under a multivariate Gaussian model. Entropy (the log-determinant of the covariance) and mutual information between selected and remaining benchmarks arise as natural objectives. Both are submodular; entropy selection coincides with pivoted Cholesky and has spectra
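The entropy objective mentioned in the abstract can be illustrated with a minimal sketch. It assumes a known covariance matrix over benchmark scores and greedily picks the benchmark with the largest conditional variance given those already chosen, which is the pivot rule of pivoted Cholesky. The function name `greedy_entropy_selection`, its parameters, and the toy covariance are hypothetical and are not taken from the paper.

```python
import numpy as np


def greedy_entropy_selection(cov, k, tol=1e-12):
    """Greedily pick up to k indices that maximize log det of cov[S, S].

    Hypothetical helper: each step selects the index with the largest
    residual (conditional) variance, i.e. the pivoted-Cholesky pivot.
    """
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    residual_var = np.diag(cov).copy()  # variance left after conditioning
    factors = np.zeros((k, n))          # rows of the partial Cholesky factor
    chosen = np.zeros(n, dtype=bool)
    selected = []
    for step in range(k):
        # Exclude already selected indices from the pivot search.
        masked = np.where(chosen, -np.inf, residual_var)
        pivot = int(np.argmax(masked))
        if masked[pivot] <= tol:
            break  # remaining benchmarks add (numerically) no information
        # New factor row: covariance with the pivot, minus what earlier
        # rows already explain, scaled by the pivot's residual std dev.
        explained = factors[:step, pivot] @ factors[:step]
        row = (cov[pivot] - explained) / np.sqrt(masked[pivot])
        factors[step] = row
        residual_var = residual_var - row ** 2
        chosen[pivot] = True
        selected.append(pivot)
    return selected


# Toy covariance (made up): benchmarks 0 and 1 are nearly redundant.
toy_cov = np.array([
    [1.00, 0.99, 0.00],
    [0.99, 1.00, 0.00],
    [0.00, 0.00, 0.50],
])
picked = greedy_entropy_selection(toy_cov, k=2)
```

On this toy input the second pick skips benchmark 1, whose conditional variance given benchmark 0 is only about 0.02, and takes the uncorrelated benchmark 2 instead. The mutual information objective would need a different gain function and is not covered by this sketch.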