TEACHING BOTS TO PLAY GAMES | 100 Days of Code 9

Daniel Bourke · Beginner ·🤖 AI Agents & Automation ·9y ago

Key Takeaways

In this installment of his '100 Days of Code' vlog (days 36 to 39), Daniel Bourke trains a Q-learning neural network to balance a pole in the CartPole game using the OpenAI Gym framework, and talks about how DeepMind used deep reinforcement learning to train bots that play Atari games at a superhuman level.

Full Transcript

What's going on, y'all? Day 36 of the 100 Days of Code series. What do we got here, 16th of June 2017. Hope you're all well. You know what, I want to tell you a story. When I started this course, Machine Learning (I just finished week seven, by the way, yesterday afternoon), I did not read this. Didn't read the prerequisites. Calculus: never heard of partial derivatives until two months ago, same with the chain rule. Linear algebra: hadn't done that since high school, seven years ago. Octave or MATLAB: no idea what they were. So, out of my own stupidity and sort of rushing into doing things, I signed up to this course seven weeks ago without reading the prerequisites and the fact that it's advanced. I started programming like two and a half months ago. I'm doing 100 Days of Code, that's what beginners do. Nah, I'm kidding, anyone can do that. But if I'd read that at the start I probably wouldn't have started this at all, and now that I'm seven weeks through I'm really glad I didn't read that, because I've learned so much over the past seven weeks.

I think that it's relatable with so many different things, like job postings. If someone was to read a job posting and thought they didn't meet the prerequisites, and they don't apply, hey, they could have been a perfect candidate for the job. You don't have to fit all the prerequisites. Same with this course. I mean, if I was to read the prerequisites I probably wouldn't have signed up, and I wonder how many other people have read courses and things like that and not signed up because they hadn't met the prerequisites. Now of course I understand it could be a stupid move to sign up to something that you don't fully fit the prerequisites for, but with enough determination and willingness to learn you can learn it, just like I did. And I'm not special. I just, I don't know, I just try to learn these things. I get excited about learning.

PS, I have a 100 Days of Code series on Medium. I'll link that in the description if you haven't seen it. I write there every day with a little summary of what I'm doing, much like this video. It's like a dual-wielding 100 Days of Code series, because there's the videos and the writing series on Medium. But I'll catch you next [unclear]. What am I working on today? I'm working on WordPress, trying to get the front end ready for my website. Still early days, still new to me, but it's all a learning curve. [unclear]

What's up, y'all? Day 37 of the 100 Days of Code series. I just applied for a research assistant's job at the University of Oxford. Application successfully submitted. So what they're doing, well, what it will be: they're trying to put together some research and study [unclear] for existential risk. What does that mean? So like a large risk, let's just say a pandemic, or let's say artificial intelligence taking over the world, or not. I'm really interested in that sort of stuff, particularly artificial intelligence, but there's also a lot of other things that really do interest me as well, such as pandemics and genetics and things like that. How easily people can alter genetics these days is mind-blowing. 100 years ago we couldn't do it at all; now high school students can alter the genetics of small little organisms, even [unclear], and we can get our DNA sequenced in a matter of hours.

Yeah, exciting day. I'm going to do some code later this evening, probably some Python, just a little bit of Anki and a little bit of reading the Python textbook that I'm reading. Otherwise, Saturdays I like to have as a [unclear] day, so as little technology as possible, let my brain recover, and then have a big week of study ahead. But I've sort of narrowed down what I'm going to do in the next six months, depending on if I get this position. Unlikely, but if I do, I'd be happy, it'd be awesome. I'm going to be doing an open source data science master's degree, and I'll keep you posted on that. You can go to datasciencemasters.org, I believe, or just search "open source data science masters". I'm going to spend at least the rest of this year doing that, and then hopefully use that knowledge to build something great, or work for another company towards building something great. But I'll check back in later today if I get any good progress done, otherwise I'll see you tomorrow. Who knows what I'm doing then. Actually, tomorrow's writing; I do some writing on Sundays. But another day, another thing to learn.

What you can see here is a deep learning, transfer learning network, or a Q-learning neural network, if I've got that right, training this little cart here to keep this pole upright. It's a game. I'm still trying to understand it too. So I'm learning deep learning, it's day 39 of 100 Days of Code, and I'm learning about transfer learning. What's happening now is I'm using a framework from OpenAI called the Gym framework, or the Gym environment, which OpenAI have created to sort of use as an environment to train transfer learning neural networks on Atari games and basic games like CartPole. CartPole's one of the easier ones because it's just a moving cart trying to keep the pole upright. Essentially what you're trying to do is take one network and teach it how to play the game by using things like reinforcement learning and stuff like that, and rewards and steps and actions and series and states, just like we would learn the game. When we play a game we learn sort of what works and what doesn't work, and that's what we're trying to teach the neural network here.

In this example, this is a really simple example, but it's really cool actually. There's an article that I've linked here, it's about how OpenAI or DeepMind or something, one of the big companies in terms of AI, DeepMind I think, was able to train a neural network to learn how to play different Atari games relatively quickly and better than the human level, well, a human-level expert at these games. So really cool. Why games? Well, because it's not going to hurt anyone. If we really want to create sort of artificial general intelligence, it's best to learn on games first rather than sort of deploying these things to the real world. That's what I'm going to be learning for most of today: deep learning on Udacity. And then if I manage to finish the classes for today I'll do some reading [unclear], but I'll catch up later this afternoon with what I've learned.

As you can see here, it's going through a whole bunch of training iterations. So episode 543, 544, reward 109, training loss. The explore P is going down with each iteration, so see: 7-4-3, 7-3-0, 7-1-8. And as you can see, with each new iteration the cart, which is this little black box at the bottom here, gets better at balancing that wooden pole. See, that was really good. It's getting better and better. Each time the wooden pole sways too far to the left or right it'll reset itself, or each time the cart moves too far to the left or the right it'll reset itself. But what it's doing is it's learning, and the longer it can hold the pole upright, the bigger the reward it gets. Right now I think the reward's capped out at 199, but it is still slowly reducing the error here, and every time it gets a little bit smaller. That's really cool. So you take this approach, apply it to other games and stuff like that, apply it to real-life situations, and get better and better. And there you go, all of a sudden a computer can play a game that I would take a long time to master in about 5 minutes of training. A beautiful thing is this algorithm can be transferred to different games, different Atari games, because it's based off the same principle, right, the reward and the exploration and exploitation principle. But I'm still learning more about this. I'll catch you up more in the next clip.

If you haven't checked out Andrej Karpathy, well, if you're into machine learning and deep learning and computer science, or not, you have to check out Andrej Karpathy. His blog is phenomenal. I just read an article called "Deep Reinforcement Learning: Pong from Pixels", I'll link it in the description. Essentially he talks about how to train a reinforcement network to play Pong. You've probably heard before in the news and stuff like that, DeepMind and whatnot have trained neural networks to be able to play Atari games really well, better than a human level, and it sounds incredibly impressive, which it is, don't get me wrong, it is. But Andrej's blog sort of breaks it down step by step and sort of teaches you how to do it, so then you start to realize, oh wait, I can actually do this, and it's not sort of as groundbreaking as, say, something like, I don't know, something else, something next level. But I think it's awesome to have this resource available, so I'll link his blog in the description. Highly recommend reading it if you're into machine learning, computer science, or just deep learning, or just amazing tech in general. So shout out to Andrej Karpathy.

I'm going to learn some stuff on Khan Academy now, actually, because I'm finding that my linear algebra skills are not up to scratch, and this is one of the courses that's recommended on the open source data science masters. So I'm going to work through this for about the next hour or so, and that will be it for today's day of study. So, catch you.
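The "explore P" value that shrinks with each iteration in the transcript is an exploration probability: early on the agent mostly tries random actions, and as training progresses it increasingly picks the action its network currently rates best. The video does not show the schedule it uses, so the following is only a minimal sketch assuming an exponential decay; the function names, constants, and the list of Q-values are all hypothetical.

```python
import math
import random


def explore_probability(step, explore_start=1.0, explore_stop=0.01, decay_rate=0.0001):
    """Decay the exploration probability exponentially from explore_start
    toward explore_stop as the number of training steps grows.
    (Assumed schedule; the constants are illustrative.)"""
    return explore_stop + (explore_start - explore_stop) * math.exp(-decay_rate * step)


def choose_action(q_values, explore_p, rng=random):
    """Epsilon-greedy choice: with probability explore_p take a random action
    (exploration), otherwise take the action with the highest estimated
    value (exploitation)."""
    if rng.random() < explore_p:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    # CartPole has two actions: push the cart left (0) or right (1).
    q_values = [0.2, 0.7]  # hypothetical value estimates for one state
    for step in (0, 10_000, 50_000):
        p = explore_probability(step)
        print(step, round(p, 4), choose_action(q_values, p))
```

Keeping a small floor (`explore_stop`) means the agent never stops exploring entirely, which is one common way to balance the exploration and exploitation principle mentioned above.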

Original Description

Over the past few days, I learned how Google's DeepMind created AI bots that were able to learn how to play Atari games at a superhuman level. I even trained my own agent (another word for bot) to play a simple game called cart-pole.

Links mentioned in the video:
Medium 100 days of code - https://medium.com/series/my-100-days-of-code-bf23b507fc77
Coursera Machine Learning Course - https://www.coursera.org/learn/machine-learning
Python textbook - www.learnpythonthehardway.org
AI learning Atari games - https://deepmind.com/research/publications/playing-atari-deep-reinforcement-learning/
Andrej Karpathy blog - http://karpathy.github.io/
Deep Reinforcement learning from Andrej Karpathy - http://karpathy.github.io/2016/05/31/rl/
Open Source Data Science Masters - http://datasciencemasters.org/

Say Hi to me anywhere!
Web: https://www.mrdbourke.com
Writing: https://www.mrdbourke.com/blog/
Quora: https://www.quora.com/profile/Daniel-Bourke-2
Instagram: https://www.instagram.com/mrdbourke/
Facebook: https://www.facebook.com/mrdbourke
Twitter: https://www.twitter.com/mrdbourke

#udacity #100daysofcode


This video follows days 36 to 39 of the series. Along the way, Daniel uses reinforcement learning to train a neural network agent to play CartPole in an OpenAI Gym environment, and notes that the same reward-driven approach can be applied to other games, such as Atari titles, and to real-life situations.

Key Takeaways
  1. Train a Q-learning neural network with reinforcement learning, using rewards, to play a simple game such as CartPole
  2. Use the environments provided by the OpenAI Gym framework to train agents on CartPole and Atari games
  3. Apply the same approach to other games and real-life situations
  4. Rely on the reward principle and on the balance between exploration and exploitation
  5. Read Andrej Karpathy's "Deep Reinforcement Learning: Pong from Pixels" for a step-by-step guide to training a reinforcement learning agent to play Pong
💡 Reinforcement learning lets an agent learn a game from rewards and trial and error, much as a person would. DeepMind's agents learned to play Atari games at a superhuman level this way, and the same principle can be applied to other games and real-life situations.
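
The reward principle in the takeaways above can be illustrated with the Q-learning update rule. The video trains a neural network to estimate these values; the sketch below swaps in a plain lookup table so it stays short and runnable, so treat it as a simplified stand-in rather than the method used in the video. The state labels, `alpha`, and `gamma` values are hypothetical.

```python
def q_update(q_table, state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update and return the new estimate.

    q_table maps each state to a list of value estimates, one per action.
    alpha is the learning rate, gamma the discount on future reward.
    """
    # When the episode has ended (the pole fell), there is no future reward.
    best_next = 0.0 if done else max(q_table[next_state])
    target = reward + gamma * best_next
    # Move the current estimate a fraction alpha of the way toward the target.
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]


if __name__ == "__main__":
    # Two made-up states with two actions each (push left, push right).
    q_table = {"upright": [0.0, 0.0], "tilting": [1.0, 2.0]}
    new_value = q_update(q_table, "upright", 1, reward=1.0,
                         next_state="tilting", done=False, alpha=0.5, gamma=0.9)
    print(new_value)
```

In CartPole the agent receives a reward for every step the pole stays up, so states and actions that keep the pole balanced longer accumulate higher estimates over many episodes.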
