MidSteer: Optimal Affine Framework for Steering Generative Models
📰 ArXiv cs.AI
arXiv:2605.05220v1 (announce type: cross)
Abstract: Steering intermediate representations has emerged as a powerful strategy for controlling generative models, particularly in post-deployment alignment and safety settings. However, despite its empirical success, it currently lacks a comprehensive theoretical framework. In this paper, we bridge this gap by formalizing the theory of concept steering. First, we establish a link between steering and affine concept erasure, proving that the standard ap