SciPredict: Can LLMs Predict the Outcomes of Scientific Experiments in Natural Sciences?
📰 ArXiv cs.AI
arXiv:2604.10718v1
Abstract: Accelerating scientific discovery requires identifying which experiments would yield the best outcomes before committing resources to costly physical validation. While existing benchmarks evaluate LLMs on scientific knowledge and reasoning, their ability to predict experimental outcomes, a task where AI could significantly exceed human capabilities, remains largely underexplored. We introduce SciPredict, a benchmark comprising 405 tasks