OpenAI o1 System Card

📰 ArXiv cs.AI

arXiv:2412.16720v2

Abstract: The o1 model series is trained with large-scale reinforcement learning to reason using chain of thought. These advanced reasoning capabilities provide new avenues for improving the safety and robustness of our models. In particular, our models can reason about our safety policies in context when responding to potentially unsafe prompts, through deliberative alignment. This leads to state-of-the-art performance on certain benchmarks for risks su

Published 1 May 2026