AInstein: Can LLMs Solve Research Problems From Parametric Memory Alone?
📰 ArXiv cs.AI
arXiv:2510.05432v2
Abstract: Can large language models solve AI research problems using only their parametric knowledge, without fine-tuning, retrieval, or other external aids? We introduce AInstein, a framework for testing whether LLM agents can generate and refine solutions to research problems through iterative critique loops. A blind study with 20 domain experts on held-out ICLR 2026 problems validates our automated metrics, which we then scale to 1,214 ICLR 2025 paper