Consensus Sampling for Safer Generative AI

📰 ArXiv cs.AI

arXiv:2511.09493v2. Abstract: Motivated by undetectable risks in generative AI, we study a general robust aggregation problem: how to aggregate several probability distributions to boost safety. We present consensus sampling, a black-box algorithm that, given $k$ distributions, has risk competitive with the average risk of the safest $s$ while abstaining when there is insufficient agreement. This yields an architecture-agnostic approach to generative-model safety when the dis
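The excerpt above is cut off before it states the algorithm, so the following is only a minimal sketch of how an agreement-based sampler with abstention could look, not the paper's confirmed procedure. It assumes finite distributions given as dictionaries, and it assumes a rejection-style acceptance rule (mean of the $s$ smallest probabilities divided by the mean of all $k$ probabilities). The function name `consensus_sample` and the parameter `max_rounds` are hypothetical names introduced for illustration.

```python
import random
from typing import Dict, Hashable, List, Optional

Distribution = Dict[Hashable, float]


def consensus_sample(
    dists: List[Distribution],
    s: int,
    max_rounds: int,
    rng: random.Random,
) -> Optional[Hashable]:
    """Return an outcome the distributions sufficiently agree on, or None.

    ASSUMPTION: the acceptance rule below is an illustrative guess, chosen
    because it favors outcomes that even the least-supportive s models
    assign probability to. It is not taken from the truncated abstract.
    """
    k = len(dists)
    if not 1 <= s <= k:
        raise ValueError("s must satisfy 1 <= s <= k")

    for _ in range(max_rounds):
        # Propose an outcome from one distribution chosen uniformly at random.
        proposer = rng.choice(dists)
        outcomes = list(proposer.keys())
        weights = list(proposer.values())
        y = rng.choices(outcomes, weights=weights)[0]

        # Probability that each distribution assigns to the proposal.
        probs = sorted(d.get(y, 0.0) for d in dists)
        mean_lowest_s = sum(probs[:s]) / s
        mean_all = sum(probs) / k  # > 0, since the proposer supports y

        # The ratio is at most 1 because the s smallest values cannot
        # average more than all k values do.
        if rng.random() < mean_lowest_s / mean_all:
            return y

    # Insufficient agreement within the round budget: abstain.
    return None
```

Under this assumed rule, identical distributions are accepted on the first round, while distributions with disjoint supports and `s=1` always lead to abstention, which matches the abstain-on-disagreement behavior the abstract describes.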

Published 12 May 2026