Efficient Agentic Reasoning Through Self-Regulated Simulative Planning

📰 ArXiv cs.AI

arXiv:2605.22138v1

Abstract: How should an agent decide when and how to plan? A dominant approach builds agents as reactive policies with adaptive computation (e.g., chain-of-thought), trained end-to-end expecting planning to emerge implicitly. Without control over the presence, structure, or horizon of planning, these systems dramatically increase reasoning length, yielding inefficient token use without reliable accuracy gains. We argue efficient agentic reasoning benefits fr

Published 23 May 2026
