PRISM Risk Signal Framework: Hierarchy-Based Red Lines for AI Behavioral Risk

📰 ArXiv cs.AI

arXiv:2604.11070v1 Announce Type: new Abstract: Current approaches to AI safety define red lines at the case level: specific prompts, specific outputs, specific harms. This paper argues that red lines can be set more fundamentally -- at the level of value, evidence, and source hierarchies that govern AI reasoning. Using the PRISM (Profile-based Reasoning Integrity Stack Measurement) framework, we define a taxonomy of 27 behavioral risk signals derived from structural anomalies in how AI systems

Published 14 Apr 2026

Read full paper → ← Back to Reads