AInstein: Can LLMs Solve Research Problems From Parametric Memory Alone?

📰 ArXiv cs.AI

arXiv:2510.05432v2

Abstract: Can large language models solve AI research problems using only their parametric knowledge, without fine-tuning, retrieval, or other external aids? We introduce AInstein, a framework for testing whether LLM agents can generate and refine solutions to research problems through iterative critique loops. A blind study with 20 domain experts on held-out ICLR 2026 problems validates our automated metrics, which we then scale to 1,214 ICLR 2025 paper

Published 29 Apr 2026