Why Reinforcement Learning Unlocks Reasoning in LLMs (Aha Moments Explained)
In this video, we break down the paper “Emergent Hierarchical Reasoning in LLMs Through Reinforcement Learning” and explain how reinforcement learning enables large language models (LLMs) to reason step by step. We also explore the HICRA algorithm (short for Hierarchy-Aware Credit Assignment), which improves reasoning by focusing on high-level strategic planning tokens. Learn how these insights explain “aha moments” in AI reasoning and why HICRA outperforms GRPO.
Written Review - https://aipapersacademy.com/emergent-hierarchical-reasoning-in-llms/
Paper - https://arxiv.org/abs/2509.03646
Chapters (7)
Introduction
1:57 What Is Hierarchical Reasoning
2:55 How RL Unlocks Reasoning
4:23 Execution Vs Planning Tokens
5:31 Emergent Reasoning In RL
7:32 Introducing HICRA
10:21 HICRA Results
DeepCamp AI