Adversarial Attacks Leverage Interference Between Features in Superposition

📰 ArXiv cs.AI

arXiv:2510.11709v2 Announce Type: replace-cross Abstract: Why do adversarial examples exist, and why do they transfer between models? Existing explanations appeal to high-dimensional geometry, non-robust patterns in the input, and decision boundary structure, but none provides a representation-level mechanism that explains why specific perturbations succeed and why attacks transfer between models. In this paper, we show that adversarial vulnerability can stem from efficient information encoding

Published 17 Jun 2026
Read full paper → ← Back to Reads