From hard refusals to safe-completions: toward output-centric safety training
📰 OpenAI News
OpenAI's GPT-5 introduces safe-completions approach for output-centric safety training
Action Steps
- Understand the limitations of hard refusals in AI safety training
- Explore the concept of output-centric safety training
- Implement the safe-completions approach in AI models such as GPT-5
- Evaluate the effectiveness of safe-completions in handling dual-use prompts
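To make the contrast concrete, here is a toy sketch of the idea behind output-centric scoring: instead of grading whether the model refused, grade the safety and helpfulness of the output itself. All function names and scores below are hypothetical illustrations, not OpenAI's actual training objective.

```python
# Illustrative sketch only: refusal-centric vs. output-centric scoring.
# The judge scores and threshold below are hypothetical assumptions.

def refusal_centric_score(prompt_is_unsafe: bool, refused: bool) -> float:
    """Binary view: an unsafe prompt must be refused, a safe one answered."""
    return 1.0 if refused == prompt_is_unsafe else 0.0

def output_centric_score(safety: float, helpfulness: float) -> float:
    """Score the response itself (both inputs are hypothetical judge
    scores in [0, 1]): unsafe content zeroes the reward; otherwise the
    reward scales with how helpful the answer is."""
    if safety < 0.5:  # response crosses an assumed safety threshold
        return 0.0
    return safety * helpfulness

# On a dual-use prompt (e.g. chemistry with benign and harmful uses),
# a hard refusal is perfectly "safe" but earns nothing for helpfulness,
# while a careful high-level answer can score well on both axes.
hard_refusal = output_centric_score(safety=1.0, helpfulness=0.0)
safe_completion = output_centric_score(safety=0.9, helpfulness=0.8)
```

Under a refusal-centric objective, only the prompt's label matters; under the output-centric one, the model is rewarded for finding a response that is simultaneously safe and useful.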
Who Needs to Know This
AI engineers and researchers benefit from this approach because it improves both the safety and helpfulness of AI responses; product managers can leverage it to build more reliable AI-powered products.
Key Insight
💡 Output-centric safety training can improve both safety and helpfulness in AI responses
Share This
🚀 OpenAI's GPT-5 introduces safe-completions for nuanced safety training!
DeepCamp AI