Superhuman AI Cracked An Impossible Game! | DeepNash, Explained

Underfitted · Beginner · 🧬 Deep Learning · 3y ago

Skills: ML Maths Basics (70%)

Key Takeaways

The video discusses DeepMind's DeepNash, an AI agent that has mastered Stratego, a game more complex than chess and Go. DeepNash gets there by learning a Nash equilibrium policy purely through self-play, without human data, and it learns to bluff along the way.

Full Transcript

[Music]

How old is chess? I'm reading here the history of chess, and the bottom line is that, apparently, it's kind of old. Now, this is not a video about chess, but I wanted to bring it up because, for a long time, chess was a high bar for artificial intelligence. By the mid-1980s, computer chess programs began challenging and occasionally beating grandmasters. It remained unclear whether they could ever defeat the world's best. Remember IBM's Deep Blue beating the world champion, Garry Kasparov? That was in 1997, 12 years after the development of Deep Blue started. And our fascination with chess did not end there. Fast forward 20 years, and in 2017 DeepMind released AlphaZero, but this time the system could play chess, Go, and shogi at a superhuman level. Huge accomplishment. Still is, five years later. But we just blew past that. The story that I want to tell you is not about chess, not about Go. This is much bigger. This is about artificial intelligence mastering the impossible.

"This is Stratego. Not a place, not a time, but a battle of wit and skill and strategy." That was just the beginning of a 1983 commercial about Stratego. Now, my wife was this close to buying the game for my son, but in the end we decided not to do it. But it doesn't matter. Here is how it works: Stratego is a two-player board game where you have 40 pieces that move around, and the goal is to capture your opponent's flag.

Now, two specific characteristics make Stratego way more challenging for artificial intelligence than either chess or Go. The first thing we need to consider is the complexity of the game: the number of valid states of each one of these games. Now, chess is very complex. It has 10 to the power of 123 possible valid states. To put this in context, we estimate there are 10 to the power of 22 grains of sand on Earth and 10 to the power of 25 drops of water in the ocean. That's nothing compared to this number here. The sheer amount of possible states in chess is one of the reasons it took so long for AI to master it.

Go, however, is on a totally different planet: 10 to the power of 360 possible states. Much, much harder than chess. Beating a professional player at Go is a long-standing grand challenge of AI research. OK, we solved chess, we solved Go. It's time for a new challenge. So how about Stratego? Well, 10 to the power of 535 possible states. That's a number beyond anything we could ever imagine. In comparison, chess and Go are both nothing.

Now, this is just one of the reasons that make Stratego more challenging. There is something else: Stratego is an imperfect-information game. "The key thing to understand about why imperfect information makes things difficult is that you have to worry not just about which actions to play, but the probability that you're going to play those actions." In a perfect-information game like chess or Go, you see everything that's happening during the game. There's nothing hidden from you. You can see every piece, every play, everything. We designed AlphaZero to master perfect-information games, but AlphaZero doesn't work with games where players don't have the full picture. And when you think about the real world, we usually have to make decisions with partial information. If we want to get closer to artificial intelligence that can help solve the problems we face every day, we need to go beyond AlphaZero.

Think about poker, for example. You don't see your opponent's cards. They are completely hidden from you. Like Noam mentioned in his conversation with Lex Fridman, there is an additional layer in imperfect-information games: it's not only about the actions you take, but the probability with which you take those actions. AlphaZero did not solve this. In fact, imperfect-information games have been tough for artificial intelligence to crack. Until now.

A few days ago, on December 1st, DeepMind published a new paper in Science talking about their new AI agent, DeepNash. Here is their blog post, not the paper. You can read that one later if you want. "Stratego, the classic board game that's more complex than chess and Go, and craftier than poker, has now been mastered."

If I start talking about every cool thing about DeepNash, we will be here the whole day, so let me focus on a couple of details, starting with the most important idea: DeepNash's goal is to learn a Nash equilibrium policy. I should probably make a separate video about Nash equilibrium, but this is what you need to know: in a two-player zero-sum game like chess, Go, poker, or Stratego, a Nash equilibrium guarantees that DeepNash will do very well even when playing against the best opponents. Now, Stratego is hard. Remember, some of the information is hidden, so DeepNash aims to find that Nash equilibrium. Not perfect, but still good enough to win more than 97 percent of games against the best Stratego bots out there, and 84 percent against top expert human players.

Now, speaking about hidden information, bluffing is a big part of Stratego. Sometimes you want to deceive the other player, maybe lure them into a trap, make them think you're stronger than you really are. It's part of the game. But deceiving your opponent is a mental state that we have. We shouldn't expect it from an artificial intelligence system, right? Well, I'm sure you know where I'm going with this: DeepNash bluffs. If you go and check the paper, you will find links to a bunch of sample games where DeepNash clearly deceives its opponents to take advantage of them. It's incredible. Not only that, but DeepNash can make non-trivial trade-offs where it shows how much it values information, and that's something unexpected.

Finally, there is something I find fascinating: DeepNash learns Stratego from scratch. Have you ever wondered what the meaning of the word "zero" in AlphaZero is? "AlphaGo Zero doesn't use any human data whatsoever. Instead, what it has to do is learn for itself, completely from self-play." Zero means no human knowledge in the loop. DeepNash works the same way. It learns exclusively from playing itself, and this is such a beautiful and powerful idea. "So it starts off extremely naive. It starts off with completely random play, and yet at every step of the learning process it has an opponent that's exactly calibrated to its current level of performance."

So DeepNash is not about collecting more human data or having better data. DeepNash is not about data at all. If you think about it, this is great. DeepNash is not biased by the way we play the game. It's not trying to copy us. Instead, it builds its own strategies, its own playing style, and we can use that. We can find different tactics and unconventional ways to play just by looking at DeepNash, and that is a big part of the value of these systems. We can learn a ton from them.

And by the way, Stratego is just a game, but the ultimate goal here is to apply these algorithms to real-life situations: traffic modeling, smart grids, auction design. There are many problems with similar characteristics. That's why DeepNash is so important. All of a sudden, we have a chance against large-scale imperfect-information problems with a huge state space. Things that were impossible before are now this close.

If you like this type of content, subscribe. Awesome. Hey
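The transcript ties together three ideas: in an imperfect-information game you choose probabilities over actions, a Nash equilibrium is a mix that a strong opponent cannot exploit, and self-play can find such a mix starting from random play. The sketch below illustrates those ideas on a toy game and is not the algorithm DeepNash uses. It swaps in regret matching, a classic textbook method, and applies it to a made-up variant of rock-paper-scissors in which winning with rock pays double. The payoff table, function names, and iteration count are all assumptions chosen for illustration. For this particular table the equilibrium mix works out to 25% rock, 50% paper, 25% scissors, so the averaged strategy should drift toward those values.

```python
# Illustrative only: regret matching in self-play on a toy zero-sum game.
# This is NOT the DeepNash algorithm; the game and all names are hypothetical.

# Payoffs for player 1 (player 2 receives the negative of each entry).
# Rows and columns are rock, paper, scissors; a win with rock pays double.
PAYOFF = [
    [0, -1, 2],
    [1, 0, -1],
    [-2, 1, 0],
]
N = 3


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0.0:
        return [1.0 / N] * N  # nothing learned yet: uniform random play
    return [p / total for p in positive]


def action_values(opponent, player):
    """Expected payoff of each pure action against the opponent's mix."""
    if player == 0:
        return [sum(PAYOFF[i][j] * opponent[j] for j in range(N)) for i in range(N)]
    return [sum(-PAYOFF[i][j] * opponent[i] for i in range(N)) for j in range(N)]


def self_play(iterations):
    """Both players update against each other; return their average strategies."""
    regrets = [[0.0] * N, [0.0] * N]
    strategy_sums = [[0.0] * N, [0.0] * N]
    for _ in range(iterations):
        current = [strategy_from_regrets(r) for r in regrets]
        for p in (0, 1):
            values = action_values(current[1 - p], p)
            expected = sum(s * v for s, v in zip(current[p], values))
            for a in range(N):
                # Regret: how much better action a would have done than our mix.
                regrets[p][a] += values[a] - expected
                strategy_sums[p][a] += current[p][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


def nash_gap(avg):
    """Total amount the two players could gain by switching to a best response.

    The gap is zero exactly when the pair of strategies is a Nash equilibrium.
    """
    best_p1 = max(action_values(avg[1], 0))
    best_p2 = max(action_values(avg[0], 1))
    return best_p1 + best_p2


avg = self_play(50_000)
print("average strategy:", [round(p, 3) for p in avg[0]])
print("Nash gap:", round(nash_gap(avg), 4))
```

The code reports the average strategy over all iterations, not the final one, because in zero-sum games it is the average play of regret-matching learners that is guaranteed to approach an equilibrium. Stratego is far too large for a table like this, which is why DeepNash relies on deep neural networks; the toy only shows what "learning an equilibrium from self-play" means at the smallest possible scale.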

Original Description

An explanation of DeepMind's DeepNash and what it means for us.

🔔 Subscribe for more stories: https://www.youtube.com/@underfitted?sub_confirmation=1

📚 My 3 favorite Machine Learning books:
• Deep Learning With Python, Second Edition: https://amzn.to/3xA3bVI
• Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: https://amzn.to/3BOX3LP
• Machine Learning with PyTorch and Scikit-Learn: https://amzn.to/3f7dAC8

Twitter: https://twitter.com/svpino

Disclaimer: Some of the links included in this description are affiliate links where I'll earn a small commission if you purchase something. There's no cost to you.



DeepNash, an AI agent, has mastered Stratego, a complex game with imperfect information, by learning a Nash equilibrium policy through self-play without human data. Its applications go beyond games to real-life problems.

Key Takeaways

Understand the basics of Stratego and its complexity
Learn about Nash Equilibrium and its application to AI
Study how DeepNash learns through self-play without human data
Analyze the applications of DeepNash beyond games

💡 DeepNash's ability to learn a Nash equilibrium policy through self-play, without human data, makes it a promising approach for complex problems with imperfect information.
