Do LLMs Build Spatial World Models? Evidence from Grid-World Maze Tasks
📰 ArXiv cs.AI
arXiv:2604.10690v1 Announce Type: new Abstract: Foundation models have shown remarkable performance across diverse tasks, yet their ability to construct internal spatial world models for reasoning and planning remains unclear. We systematically evaluate the spatial understanding of large language models through maze tasks, a controlled testing context requiring multi-step planning and spatial abstraction. Across comprehensive experiments with Gemini-2.5-Flash, GPT-5-mini, Claude-Haiku-4.5, and D
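The abstract is cut off before it describes the evaluation protocol, so the following is purely an illustrative sketch, not the paper's method: a grid-world maze task of the kind described can be given a shortest-path ground truth with breadth-first search, against which a model's proposed route could be checked. The grid encoding ('#' for walls, '.' for open cells) is an assumption for illustration.

```python
from collections import deque

def solve_maze(grid, start, goal):
    """Shortest path in a grid maze via breadth-first search.

    grid: list of equal-length strings, '#' = wall, '.' = open cell.
    start, goal: (row, col) tuples.
    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            # Walk parent links back to start, then reverse.
            path, cell = [], goal
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != '#' and (nr, nc) not in parent):
                parent[(nr, nc)] = (r, c)
                queue.append((nr, nc))
    return None

# Hypothetical 3x4 maze: the optimal route needs 5 moves (6 cells).
maze = [
    "....",
    ".##.",
    "....",
]
path = solve_maze(maze, (0, 0), (2, 3))
```

A multi-step planning benchmark along these lines would then score a model on whether its emitted move sequence is valid (no wall collisions) and optimal (matches the BFS path length).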