Proximal Policy Optimization Implementation: 9 Atari-specific Details (2/3)

Weights & Biases · Beginner · 🎮 Reinforcement Learning · 4y ago

Skills: RL Foundations 90% · Policy Gradient Methods 80%

Key Takeaways

Implements Proximal Policy Optimization with 9 Atari-specific details

Original Description

Proximal Policy Optimization (PPO) is one of the most popular reinforcement learning algorithms and works across a variety of domains, from robotics control to Atari games to chip design. In this video, we dive deep into 9 Atari-specific implementation details of PPO, building on the PPO implementation from our last video (https://youtu.be/MEt6rrxH8W4).

---

Source code: https://github.com/vwxyzjn/ppo-implementation-details
Related blog post: https://iclr-blog-track.github.io/2022/03/25/ppo-implementation-details/
Background music: Flutes Will Chill (https://artlist.io/song/48722/flutes-will-chill)

---

0:00 Introduction
1:00 Setup
1:55 Environment Preprocessing
2:23 1. NoopResetEnv
3:17 2. MaxAndSkipEnv
3:48 3. EpisodicLifeEnv
4:10 4. FireResetEnv
4:56 5. ClipRewardEnv
5:18 6. Image Transformation
5:49 7. FrameStack
6:29 8. Shared Nature-CNN network
8:02 9. Scale the input to [0, 1]
8:17 Match hyperparameters
8:40 Give it a run
9:04 Stream metrics live
9:13 Retrieve experiments done a year ago
11:24 Videos of agents playing the game
11:45 Summary of changes
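Several of the preprocessing details named in the description (MaxAndSkipEnv, ClipRewardEnv, FrameStack, and scaling the input to [0, 1]) are small transformations that can be illustrated without an emulator. The following is a minimal sketch, not the code from the video or the linked repository. It uses only the Python standard library and toy 2x2 "frames" stored as nested lists, and the helper names (`clip_reward`, `max_over_last_two`, `FrameStacker`, `scale_obs`) are hypothetical. The stack size of 4, the 0 to 255 pixel range, and filling the stack with copies of the first frame on reset are assumptions based on common Atari setups.

```python
from collections import deque


def clip_reward(reward):
    """Bin a raw reward to -1, 0, or +1 by its sign (the idea behind ClipRewardEnv)."""
    return (reward > 0) - (reward < 0)


def max_over_last_two(frames):
    """Pixel-wise max over the final two frames of a skipped sequence
    (the observation part of the MaxAndSkipEnv idea)."""
    prev, last = frames[-2], frames[-1]
    return [
        [max(a, b) for a, b in zip(row_prev, row_last)]
        for row_prev, row_last in zip(prev, last)
    ]


class FrameStacker:
    """Keep the k most recent frames so the agent can infer motion (FrameStack idea)."""

    def __init__(self, k=4):  # k=4 is an assumed default
        self.frames = deque(maxlen=k)

    def reset(self, frame):
        # Assumption: fill the whole stack with copies of the first frame.
        for _ in range(self.frames.maxlen):
            self.frames.append(frame)
        return list(self.frames)

    def step(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return list(self.frames)


def scale_obs(stack):
    """Map integer pixels in [0, 255] to floats in [0, 1]."""
    return [[[px / 255.0 for px in row] for row in frame] for frame in stack]


# Toy usage: four skipped frames collapse into one observation frame.
skipped = [
    [[0, 10], [20, 30]],
    [[5, 0], [25, 0]],
    [[0, 255], [0, 0]],
    [[9, 0], [0, 60]],
]
frame = max_over_last_two(skipped)
stacker = FrameStacker(k=4)
stacker.reset(frame)
obs = scale_obs(stacker.step([[0, 0], [0, 0]]))
```

In this toy run, `obs` holds four frames: three copies of the max-pooled frame followed by the all-zero frame, each scaled to the unit interval. A real pipeline would apply these steps to emulator frames inside environment wrappers rather than to nested lists.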

Chapters (18)

0:00 Introduction

1:00 Setup

1:55 Environment Preprocessing

2:23 1. NoopResetEnv

3:17 2. MaxAndSkipEnv

3:48 3. EpisodicLifeEnv

4:10 4. FireResetEnv

4:56 5. ClipRewardEnv

5:18 6. Image Transformation

5:49 7. FrameStack

6:29 8. Shared Nature-CNN network

8:02 9. Scale the input to [0, 1]

8:17 Match hyperparameters

8:40 Give it a run

9:04 Stream metrics live

9:13 Retrieve experiments done a year ago

11:24 Videos of agents playing the game

11:45 Summary of changes
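Chapter 8 refers to a shared Nature-CNN network. As a rough aid for following that step, the sketch below works out the feature-map sizes for the commonly cited configuration of that architecture: 84x84 input frames and three unpadded convolutions with 32, 64, and 64 filters, kernel sizes 8, 4, and 3, and strides 4, 2, and 1. None of these numbers is stated on this page, so treat them as assumptions to check against the linked source code. The helper names `conv_out` and `flattened_features` are hypothetical.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output width (or height) of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# (out_channels, kernel, stride) per layer; assumed Nature-CNN configuration.
LAYERS = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]


def flattened_features(input_size=84, layers=LAYERS):
    """Number of features after the last convolution is flattened."""
    size = input_size
    channels = 0
    for channels, kernel, stride in layers:
        size = conv_out(size, kernel, stride)
    return channels * size * size


sizes = []
size = 84
for _, kernel, stride in LAYERS:
    size = conv_out(size, kernel, stride)
    sizes.append(size)
# sizes is [20, 9, 7], so the flattened vector has 64 * 7 * 7 = 3136 entries.
```

Under these assumptions the spatial size shrinks from 84 to 20 to 9 to 7, which is why implementations of this network typically feed 3136 features into the first fully connected layer.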
