Efficient Reasoning with Hidden Thinking

📰 ArXiv cs.AI

arXiv:2501.19201v2

Abstract: Chain-of-Thought (CoT) reasoning has become a powerful framework for improving complex problem-solving capabilities in Multimodal Large Language Models (MLLMs). However, the verbose nature of textual reasoning introduces significant inefficiencies. In this work, we propose Heima (as hidden llama), an effective CoT compression framework that condenses lengthy CoTs into a small set of abstract thinking tokens, preserving essential reasoning

Published 5 May 2026
