Hacking an LLM's Personality with Representation Engineering
### Papers & Resources
* [Persona Vectors: Monitoring and Controlling Character Traits in Language Models](https://arxiv.org/abs/2507.21509)
+ = Interpretability
+ [Blog post](https://www.anthropic.com/research/persona-vectors)
+ [Code Repo](https://github.com/safety-research/persona_vectors)
+ [Anthropic Thread](https://x.com/AnthropicAI/status/1951317898313466361)
+ [Anthropic Hiring](https://x.com/Jack_W_Lindsey/status/1948138767753326654)
* [A Simple but Tough-to-Beat Baseline for Sentence Embeddings](https://openreview.net/pdf?id=SyK00v5xx)
* [Improving Reasoning Performance …