TEACHING BOTS TO PLAY GAMES | 100 Days of Code 9

Daniel Bourke · Beginner ·🤖 AI Agents & Automation ·9y ago

Skills: Agent Foundations 90% · Tool Use & Function Calling 80% · Autonomous Workflows 70%

Key Takeaways

Days 36 to 39 of the '100 Days of Code' series: Daniel Bourke watches a deep Q-learning agent learn to balance a pole on a moving cart, using a cart-pole environment from OpenAI Gym, and discusses how DeepMind used reinforcement learning to train neural networks to play Atari games at a superhuman level.

Full Transcript

What's going on, y'all? Day 36 of the 100 Days of Code series. What do we got here: 16th of June 2017. Hope you're all well.

You know what, I want to tell you a story. When I started this course, Machine Learning (I just finished week seven, by the way, yesterday afternoon), I did not read this. Didn't read the prerequisites. Calculus: never heard of partial derivatives until two months ago. Same with the chain rule. Linear algebra: hadn't done that since high school, seven years ago. Octave or MATLAB: no idea what they were. So, out of my own stupidity and sort of rushing into doing things, I signed up to this course seven weeks ago without reading the prerequisites and the fact that it's advanced. I started programming like two and a half months ago. I'm doing 100 Days of Code; that's what beginners do. Nah, I'm kidding, anyone can do that. But if I'd read that at the start, I probably wouldn't have started this at all, and now that I'm seven weeks through, I'm really glad I didn't read that, because I've learned so much over the past seven weeks.

And so I think that it's relatable with so many different things, like job postings. If someone was to read a job posting and thought they didn't meet the prerequisites, and they don't apply, hey, they could have been a perfect candidate for the job. You don't have to fit all the prerequisites. Same with this course: if I was to read the prerequisites, I probably wouldn't have signed up, and I wonder how many other people have read courses and things like that and not signed up because they hadn't met the prerequisites. Now, of course, I understand it could be a stupid move to sign up to something that you don't fully fit the prerequisites for, but with enough determination and willingness to learn, you can learn it, just like I did. And I'm not special. I don't know, I just try to learn these things. I get excited about learning.

PS, I have a 100 Days of Code series on Medium. I'll link that in the description if you haven't seen it. I
write there every day with a little summary of what I'm doing, much like this video. It's like a dual-wielding 100 Days of Code series, because of the videos and the written series on Medium. But I'll catch you in the next one.

What am I working on today? I'm working on WordPress, trying to get the front end ready for my website. Still early days, still new to me, but it's all a learning curve.

What's up, y'all? Day 37 of the 100 Days of Code series. I just applied for a research assistant's job at the University of Oxford. Application successfully submitted. So what they're doing, well, what it will be: they're trying to put together some research and study, and create a book, on existential risk. What does that mean? So, like, a large risk: let's just say a pandemic, or let's say artificial intelligence taking over the world, or not. And so I'm really interested in that sort of stuff, particularly artificial intelligence, but there's also a lot of other things that really do interest me as well, such as pandemics and genetics and things like that. How easily people can alter genetics these days is mind-blowing. 100 years ago we couldn't do it at all; now high school students can alter the genetics of small little organisms [unclear], and we can get our DNA sequenced in a matter of hours.

Yeah, exciting day. I'm going to do some code later this evening, probably some Python, just a little bit of Anki and a little bit of reading the Python textbook that I'm reading. Otherwise, Saturdays I like to have as an actual off day, so as little technology as possible, let my brain recover, and then have a big week of study ahead.

But I've sort of narrowed down what I'm going to do in the next six months, depending on if I get this position (unlikely, but if I do, I'll be happy, it'd be awesome): I'm going to do an open-source data science master's degree, and I'll keep you posted on that. You can go to datasciencemasters.org, I believe, or just search "open-source data science masters". I'm going to spend at least the
rest of this year doing that, and then hopefully use that knowledge to build something great, or work for another company towards building something great. But I'll check back in later today if I get some good progress done. Otherwise, I'll see you tomorrow. Who knows what I'm doing then? Actually, tomorrow's writing; I do some writing on Sundays.

Another day, another thing to learn. What you can see here is a deep reinforcement learning network, or a deep Q-learning neural network if I've got that right, training this little cart here to keep this pole upright. It's a game. I'm still trying to understand it too, so I'm learning deep learning. It's day 39 of 100 Days of Code, and I'm learning about reinforcement learning. What's happening now is I'm using a framework from OpenAI called the Gym framework, or the Gym environment, which OpenAI have created to sort of use as an environment to train reinforcement learning neural networks on Atari games and basic games like cart-pole. Cart-pole's one of the easier ones, because it's just a moving cart trying to keep the pole upright.

Essentially, what you're trying to do is take a neural network and teach it how to play the game by using things like reinforcement learning: rewards and steps and actions and states. It's just like how we would learn the game. When we play a game, we learn what works and what doesn't work, and that's what we're trying to teach the neural network here. This is a really simple example, but it's really cool. There's an article that I've linked here; it's about how OpenAI or DeepMind or something, one of the big companies in terms of AI (DeepMind, I think), was able to train a neural network to learn how to play different Atari games relatively quickly, and better than human level, well, than a human-level expert at these games. So really cool.

Why start with games? Well, because it's not going to hurt anyone. If we really want to create
sort of artificial general intelligence, it's best to learn on games first rather than deploying these things to the real world. That's what I'm going to be learning for most of today: deep learning on Udacity. Then, if I manage to finish the classes for today, I'll do some reading, but I'll catch up later this afternoon with what I've learned.

As you can see here, it's going through a whole bunch of training iterations: episode 543, 544, reward 109, training loss. The explore P is going down with each iteration, so see: 743, 730, 718. And as you can see, with each new iteration the cart, which is this little black box at the bottom here, gets better at balancing that wooden pole. See, that one was really good. It's getting better and better. Each time the wooden pole sways too far to the left or right, it'll reset itself, or each time the cart moves too far to the left or the right, it'll reset itself. But what it's doing is it's learning, and the longer it can hold the pole upright, the bigger the reward it gets. Right now I think the reward's capped out at 199, but it is still slowly reducing the error here, and every time it gets a little bit smaller. That's really cool.

So you take this approach, apply it to other games and stuff like that, apply it to real-life situations, and it gets better and better. And there you go: all of a sudden a computer can play a game that I would take a long time to master, in about five minutes of training. A beautiful thing is this algorithm can be transferred to different games, different Atari games, because it's based off the same principle, right? The reward, and the exploration and exploitation principle. But I'm still learning more about this. I'll catch you up more in the next clip.

If you haven't checked out Andrej Karpathy: if you're into machine learning and deep learning and computer science, or not, you have to check out Andrej Karpathy. His blog is phenomenal. I just read an article called Deep Reinforcement Learning:
Pong from Pixels (I'll link it in the description). Essentially, he talks about how to train a reinforcement learning network to play Pong. You've probably heard before, in the news and stuff like that, that DeepMind and whatnot have trained neural networks to be able to play Atari games really well, better than human level, and it sounds incredibly impressive, which it is, don't get me wrong. But Andrej's blog sort of breaks it down step by step and teaches you how to do it, so then you start to realize, oh wait, I can actually do this, and it's not as groundbreaking as, say, something like, I don't know, something else, something next level. But I think it's awesome to have this resource available, so I'll link his blog in the description. Highly recommend reading it if you're into machine learning, computer science, or just deep learning, or just amazing tech in general. So shout-out to Andrej Karpathy.

I'm going to learn some stuff on Khan Academy now, actually, because I'm finding that my linear algebra skills are not up to scratch, and this is one of the courses that's recommended on the open-source data science master's. So I'm going to work through this for about the next hour or so, and that will be it for today's day of study. So, catch ya.
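The cart-pole run described in the transcript (the episode resets when the pole sways too far or the cart drifts too far, and longer balancing earns more reward) can be sketched roughly as below. This is a minimal pure-Python approximation, not the Gym environment or the code shown in the video: the physical constants and thresholds are assumed values from the commonly used classic cart-pole formulation, and `step`, `run_episode`, `random_policy` and `lean_policy` are hypothetical names introduced for illustration.

```python
import math
import random

# Assumed constants, modelled on the classic cart-pole formulation.
GRAVITY = 9.8
CART_MASS = 1.0
POLE_MASS = 0.1
POLE_HALF_LENGTH = 0.5
FORCE = 10.0
TIME_STEP = 0.02
MAX_ANGLE = math.radians(12)  # pole has swayed too far left or right
MAX_POSITION = 2.4            # cart has moved too far left or right
MAX_STEPS = 199               # the transcript reports reward topping out at 199


def step(state, action):
    """Advance one tick. action is 0 (push left) or 1 (push right)."""
    x, x_dot, theta, theta_dot = state
    force = FORCE if action == 1 else -FORCE
    total_mass = CART_MASS + POLE_MASS
    pole_mass_length = POLE_MASS * POLE_HALF_LENGTH
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    temp = (force + pole_mass_length * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / total_mass))
    x_acc = temp - pole_mass_length * theta_acc * cos_t / total_mass
    # Simple Euler integration of positions and velocities.
    x += TIME_STEP * x_dot
    x_dot += TIME_STEP * x_acc
    theta += TIME_STEP * theta_dot
    theta_dot += TIME_STEP * theta_acc
    done = abs(x) > MAX_POSITION or abs(theta) > MAX_ANGLE
    # Reward of 1 for every step taken, so longer balancing means more reward.
    return (x, x_dot, theta, theta_dot), 1.0, done


def run_episode(policy, rng):
    """Play one episode from a near-upright start and return the total reward."""
    state = tuple(rng.uniform(-0.05, 0.05) for _ in range(4))
    total_reward = 0.0
    for _ in range(MAX_STEPS):
        state, reward, done = step(state, policy(state, rng))
        total_reward += reward
        if done:
            break
    return total_reward


def random_policy(state, rng):
    """Baseline: push left or right at random."""
    return rng.randrange(2)


def lean_policy(state, rng):
    """Hand-written heuristic: push toward the side the pole is falling."""
    _, _, theta, theta_dot = state
    return 1 if theta + theta_dot > 0 else 0


if __name__ == "__main__":
    rng = random.Random(0)
    for name, policy in (("random", random_policy), ("lean", lean_policy)):
        scores = [run_episode(policy, rng) for _ in range(20)]
        print(name, sum(scores) / len(scores))
```

A learning agent would replace the hand-written `lean_policy` with a policy it improves from the rewards it collects; the heuristic is only there to show how a better policy earns a larger total reward in the same loop.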

Original Description

Over the past few days, I learned how Google's DeepMind created AI bots that were able to learn how to play Atari games at a superhuman level. I even trained my own agent (another word for bot) to play a simple game called cart-pole.

Links mentioned in the video:
Medium 100 days of code - https://medium.com/series/my-100-days-of-code-bf23b507fc77
Coursera Machine Learning Course - https://www.coursera.org/learn/machine-learning
Python textbook - www.learnpythonthehardway.org
AI learning Atari games - https://deepmind.com/research/publications/playing-atari-deep-reinforcement-learning/
Andrej Karpathy blog - http://karpathy.github.io/
Deep Reinforcement learning from Andrej Karpathy - http://karpathy.github.io/2016/05/31/rl/
Open Source Data Science Masters - http://datasciencemasters.org/

Say Hi to me anywhere!
Web: https://www.mrdbourke.com
Writing: https://www.mrdbourke.com/blog/
Quora: https://www.quora.com/profile/Daniel-Bourke-2
Instagram: https://www.instagram.com/mrdbourke/
Facebook: https://www.facebook.com/mrdbourke
Twitter: https://www.twitter.com/mrdbourke

#udacity #100daysofcode

Playlist

Uploads from Daniel Bourke · Daniel Bourke · 19 of 60

Xbox One S Unboxing and Xbox One Comparison

Text/Profanity Checker in Python

Drawing Flowers in Python

Finding The Right Medium - TDBS 18 April 2017

What Is Neuralink??! - TDBS 22 April 2017

Disagree and Commit, Words of Wisdom from Jeff Bezos - TDBS 19 April 2017

A Lesson In Movement | Raw Training Australia

FALLING IS FUN | Functional Friday 4

My first HACKATHON! | 100 Days of Code 1

MORE MACHINE LEARNING | 100 Days of Code 2

TensorBoard and learning from Einstein | 100 Days of Code 3

Job Interview Tips and Open Ocean Swim | 100 Days of Code 4

I Want To Help 100,000 People Workout | AI Powered Personal Trainer

MACHINE LEARNING IN 5 MINUTES

COFFEE, YOGA and AWS | 100 Days of Code 5

MY FIRST STARTUP WEEKEND | 100 Days of Code 6

GENERATING TV SCRIPTS WITH DEEP LEARNING | 100 Days of Code 7

Attention, please

TEACHING BOTS TO PLAY GAMES | 100 Days of Code 9

Udacity Deep Learning Nanodegree Language Translation Project Submission | 100 Days of Code 10

Learning about Generative Adversarial Networks on Udacity | 100 Days of Code 11

Completing Andrew Ng's Machine Learning Course on Coursera | 100 Days of Code 12

Finishing the Treehouse Python Track | 100 Days of Code 13

GENERATING FACES WITH GANs | 100 Days of Code 14

Graduating From the Udacity Deep Learning Nanodegree | 100 Days of Code 15

WHAT I'VE LEARNED FROM TALKING TO PEOPLE

3 Life Principles I Learned From Ray Dalio

PYTHON && POETRY | 100 Days of Code 16

Physique Update and 6 Things I Wish I Knew Before Starting Gym

The 100 Days is Over! | 100 Days of Code 17

How to Burn Over 100 Calories in 4 Minutes

Solving Sudoku with AI | Learning Intelligence 1

Upper Body Calisthenics Workout in the Park

What is an Adversarial Search Agent? | Learning Intelligence 2

My Self-Created Artificial Intelligence Master's Degree | Learning Intelligence 0

Try Going Over It Again | Learning Intelligence 3

Python and Pullups | Learning Intelligence 4

AI Meets Blockchain! | Learning Intelligence 5

How to Pass the Turing Test + I FAILED | Learning Intelligence 6

Biology and Physics meet Computer Science | Learning Intelligence 7

Udacity Artificial Intelligence Nanodegree Project 3 Progress | Learning Intelligence 8

Passing Project 3 of Udacity's Artificial Intelligence Nanodegree | Learning Intelligence 9

Bayes Networks, Hidden Markov Models and How I Wake Up | Learning Intelligence 10

Udacity AI Nanodegree Progress and Bayes' Rule Explained | Learning Intelligence 11

Udacity AI Nanodegree Project 4 Planning and Progress | Learning Intelligence 12

Finishing Term 1 of Udacity's Artificial Intelligence Nanodegree | Learning Intelligence 13

deeplearning.ai Progress! | Learning Intelligence 14

Coursera Deep Learning Specialization Progress | Learning Intelligence 15

Computer Vision Basics + More deeplearning.ai Progress! | Learning Intelligence 16

My Experience at CodeCamp, Intro to Keras and Failing Hard | Learning Intelligence 17

In-Depth Udacity Deep Learning Nanodegree Review

Completing the Deeplearning.ai Specialization on Coursera | Learning Intelligence 18

You're Never Too Young to Start Learning AI - Learning Intelligence Talks with Shaik Asad

Starting Term 2 of the Udacity Artificial Intelligence Nanodegree | Learning Intelligence 19

Submitting the Computer Vision Capstone Project | Udacity AI Nanodegree | Learning Intelligence 20

Leg Day at World Gym Northlakes ft. Ben Jones Fitness

deeplearning.ai Sequence Models Course Progress | Learning Intelligence 21

Graduating from the deeplearning.ai Coursera Specialization | Learning Intelligence 22

Udacity Artificial Intelligence Nanodegree NLP Concentration Progress | Learning Intelligence 23

Learning How to Build What's Next at Google Cloud On Board Brisbane

The reinforcement learning segment shows the agent improving over hundreds of training episodes as its exploration probability falls, and argues that games are a safe place to develop such agents before deploying them in the real world. The video also touches on machine learning coursework, linear algebra revision and an open-source data science master's curriculum.

Key Takeaways

Train a deep Q-learning network to play games such as cart-pole and Atari titles using reinforcement learning and rewards
Use the environments in the OpenAI Gym framework to train reinforcement learning neural networks
Apply the same approach to other games, and potentially to real-life situations
Rely on the reward principle and on balancing exploration and exploitation
Read Andrej Karpathy's "Deep Reinforcement Learning: Pong from Pixels" for a step-by-step walkthrough of training a network to play Pong
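
The reward, exploration and exploitation ideas in the list above can be illustrated with a small tabular Q-learning sketch. It is not the deep Q-network from the video: the corridor environment, the hyperparameters and the exponential decay schedule in `explore_probability` are all assumptions chosen for illustration, and a table of values stands in for the neural network.

```python
import math
import random

N_STATES = 6         # hypothetical corridor, positions 0..5; position 5 is the goal
ACTIONS = (-1, +1)   # action 0 steps left, action 1 steps right
ALPHA = 0.5          # learning rate (assumed)
GAMMA = 0.9          # discount factor (assumed)
EXPLORE_START = 1.0  # start by acting almost entirely at random
EXPLORE_STOP = 0.01  # floor on the exploration probability
DECAY_RATE = 0.01    # how fast exploration decays per step (assumed)


def explore_probability(step_count):
    """Exploration probability that shrinks as training steps accumulate."""
    return EXPLORE_STOP + (EXPLORE_START - EXPLORE_STOP) * math.exp(
        -DECAY_RATE * step_count)


def train(episodes=300, seed=0):
    """Learn a table of action values with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    step_count = 0
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # cap on episode length
            step_count += 1
            if rng.random() < explore_probability(step_count):
                action = rng.randrange(2)  # explore: try a random action
            else:
                # Exploit: take the action with the higher estimated value.
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
            done = next_state == N_STATES - 1
            reward = 1.0 if done else 0.0
            # Q-learning target: reward plus discounted best future value.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    for state, (left, right) in enumerate(train()):
        print(state, round(left, 3), round(right, 3))
```

Early on the agent mostly explores; as `explore_probability` falls it increasingly exploits what it has learned, which mirrors the "explore P" value shrinking in the training log described in the transcript.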

💡 Reinforcement learning lets AI agents learn to play games at a superhuman level, and because the approach rests on the same reward, exploration and exploitation principle, it can be transferred to other games and potentially to real-life situations.
