Do LLMs Know When They're Wrong?

Martin Andrews · Beginner · 📄 Research Papers Explained · 6mo ago
We're moving past LLMs that just predict the next word. Discover a new frontier: models that can gauge their own uncertainty to improve reasoning. This video explores two brand-new papers that turn the "Entropix" meme into practical, working code. Current methods like Chain-of-Thought are powerful, but they amount to a model "thinking out loud." What if a model could recognize when it's on a bad path and correct itself? That is the core idea behind using token entropy and log-probabilities (logprobs) as a "confidence" signal. This video is for the AI builder, developer, and enthusiast who wants to look un…
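To make the "confidence signal" idea concrete, here is a minimal sketch of how token entropy is computed from the logprobs an LLM API can return for each generated token. The helper name and the toy distributions are illustrative, not from either paper: a low-entropy (peaked) distribution suggests the model is confident about the next token, while a high-entropy (flat) one suggests it is uncertain and may be on a bad path.

```python
import math

def token_entropy(logprobs):
    """Shannon entropy (in nats) of a next-token distribution,
    given the log-probabilities of the candidate tokens."""
    return -sum(math.exp(lp) * lp for lp in logprobs)

# Toy distributions over 4 candidate tokens (hypothetical, for illustration):
confident = [math.log(p) for p in (0.97, 0.01, 0.01, 0.01)]  # peaked
uncertain = [math.log(0.25)] * 4                             # uniform

print(token_entropy(confident))  # low entropy: model is "sure"
print(token_entropy(uncertain))  # high entropy: model is guessing
```

A reasoning loop could watch this value per token and trigger extra sampling or backtracking whenever it spikes, which is roughly the intuition both papers build on.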
Watch on YouTube ↗

Chapters (4)

Introduction: The Idea of LLM Confidence
0:31 Background: From OpenAI's o1 to the "Entropix" Meme
5:26 Paper 1: ARPO & Agentic Rollout Confidence
7:55 Paper 2: Meta's "Deep Think with Confidence"