Model Capability Assessment and Safeguards for Biological Weaponization

📰 ArXiv cs.AI

arXiv:2604.19811v2

Abstract: AI leaders and safety reports increasingly warn that advances in model reasoning may enable biological misuse, including by low-expertise users, while major labs describe safeguards as expanding but still evolving rather than settled. This study benchmarks ChatGPT 5.2 Auto, Gemini 3 Pro Thinking, Claude Opus 4.5, and Meta's Muse Spark Thinking on 73 novice-framed, open-ended benign STEM prompts to measure operational intelligence. On benign quanti

Published 23 Apr 2026
