Measuring the Machine: Evaluating Generative AI as Pluralist Sociotechnical Systems

📰 ArXiv cs.AI

arXiv:2604.20545v1. Abstract: In measurement theory, instruments do not simply record reality; they help constitute what is observed. The same holds for generative AI evaluation: benchmarks do not just measure, they shape what models appear to be. Functionalist benchmarks treat models as isolated predictors, while prescriptive approaches assess what systems ought to be. Both obscure the sociotechnical processes through which meaning and values are enacted, risking the reificati

Published 23 Apr 2026