Building AlphaGo from scratch – Eric Jang

Dwarkesh Patel · Beginner · ML Fundamentals · 1h ago
Eric Jang walks through how to build AlphaGo from scratch, but with modern AI tools. Sometimes you understand the future better by stepping backward. AlphaGo is still the cleanest worked example of the primitives of intelligence: search, learning from experience, and self-play. You have to go back to 2017 to get insight into how the more general AIs of the future might learn.

Once Eric had explained how AlphaGo works, we had the context for a discussion about how RL works in LLMs and how it could work better. Naive policy gradient RL has to figure out which of the 100k+ tokens in your trajectory actually got you the right answer, while AlphaGo's MCTS suggests a strictly better action every single move, giving you a training target that sidesteps the credit assignment problem. The way humans learn is surely closer to the second.

Eric also kickstarted an Autoresearch loop on his project. It was very interesting to discuss which parts of AI research LLMs can already automate pretty well (implementing and running experiments, optimizing hyperparameters) and which they still struggle with (choosing the right question to investigate next, escaping research dead ends). Informative to all the recent discussion about when we should expect an intelligence explosion, and what it would look like from the inside.

EPISODE LINKS
* Check out the flashcards I wrote to retain the insights: https://flashcards.dwarkesh.com/eric-jang/
* Transcript: https://www.dwarkesh.com/p/eric-jang

SPONSORS
- Cursor's agent SDK let me build a pipeline to generate flashcards for this episode. For each card, I had an agent read the transcript, ingest blackboard screenshots, generate an SVG visual, and run everything through a critic. A durable agent is much better at this kind of work than a chain of LLM calls, and Cursor's SDK made it easy. Check out the cards at https://flashcards.dwarkesh.com and get started with the SDK at https://cursor.com/dwarkesh
- Jane Street
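The credit-assignment contrast discussed in the episode can be sketched in a few lines. This is an illustrative toy (not from the episode, and not a real training loop): in naive policy gradient, one scalar trajectory reward is smeared across every token, while an AlphaGo-style setup turns MCTS visit counts at each move into a per-move policy target.

```python
def reinforce_token_signal(token_logprobs, trajectory_reward):
    """Naive policy gradient: the single trajectory-level reward is the
    learning signal for *every* token, so the learner must untangle
    which tokens deserved the credit (the credit assignment problem)."""
    return [trajectory_reward for _ in token_logprobs]


def mcts_policy_target(visit_counts, temperature=1.0):
    """AlphaGo-style: MCTS visit counts at one move define a distribution
    over actions. Training the policy net toward this distribution gives
    each individual move its own improved target."""
    powered = [n ** (1.0 / temperature) for n in visit_counts]
    total = sum(powered)
    return [p / total for p in powered]


# One reward broadcast over the whole trajectory vs. a per-move target:
per_token = reinforce_token_signal([-0.1, -2.3, -0.5], trajectory_reward=1.0)
per_move = mcts_policy_target([0, 10, 90])
```

Here `per_token` is the same value repeated for every token, whereas `per_move` concentrates probability on the most-visited action at that single move, which is what lets the MCTS target sidestep trajectory-level credit assignment.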
