#TWIMLfest: Office Hours - Reinforcement Learning

The TWIML AI Podcast with Sam Charrington · Advanced · 🤖 AI Agents & Automation · 5y ago

Skills: RL Foundations (90%), Policy Gradient Methods (70%)

In the Office Hours series, we invite experts and practitioners in various topic areas for AMA (ask-me-anything) style sessions to answer community member questions. The intent is to answer technical questions and/or help participants advance their specific projects and interests. This week’s topic will be centered on Reinforcement Learning!

Resources:

Show notebooks in Drive - https://colab.research.google.com/github/psc-g/intro_to_rl/blob/master/Introduction_to_reinforcement_learning.ipynb
Streamlit - https://www.streamlit.io/
#302 - Deep Reinforcement Learning for Logistics at Instadeep w/ Karim Beguir - https://twimlai.com/twiml-talk-302-deep-reinforcement-learning-for-logistics-at-instadeep-with-karim-beguir/
Project Bonsai - https://azure.microsoft.com/en-us/services/project-bonsai/#features
David Silver lectures - https://www.davidsilver.uk/teaching/
Reinforcement Learning: An Introduction by Richard Sutton - https://web.stanford.edu/class/psych209/Readings/SuttonBartoIPRLBook2ndEd.pdf
Theoretical Computational Neuroscience by Peter Dayan - http://www.gatsby.ucl.ac.uk/~lmate/biblio/dayanabbott.pdf
Agence - https://www.agence.ai/
Emma Brunskill - https://cs.stanford.edu/people/ebrun/
Unity Machine Learning Agents - https://unity.com/products/machine-learning-agents
Unity Machine Learning Agents GitHub - https://github.com/Unity-Technologies/ml-agents
Invariant Causal Prediction for Block MDPs - https://arxiv.org/abs/2003.06016
Causal Modeling in Machine Learning by Robert O. Ness - https://github.com/robertness/causalML/blob/master/syllabus_NEU.md
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning - https://arxiv.org/abs/1511.03722
Gridworld Playground - http://gridworld-playground.glitch.me/
Join the TWIML Slack - https://twimlai.com/community
Sim2Real - http://www.andrew.cmu.edu/course/10-703/slides/Lecture_sim2realmaxentRL.pdf
Duckietown - https://www.duckietown.org/
AI Research at JP Morgan Chase with Manuela Veloso - #371



