Beyond Single-Agent Alignment: Preventing Context-Fragmented Violations in Multi-Agent Systems
📰 ArXiv cs.AI
arXiv:2604.22879v1 Announce Type: cross Abstract: We identify and formalize a novel security risk: Context-Fragmented Violations (CFVs) - a class of policy breaches where individual agent actions appear locally safe and reasonable, yet collectively violate organizational policies because critical policy facts are siloed in different departments private contexts. Existing prompt-based alignment mechanisms and monolithic interceptors are poorly matched to violations that span contextual islands. W
DeepCamp AI