Training AI Without Writing A Reward Function, with Reward Modelling

Robert Miles AI Safety · Advanced · 📄 Research Papers Explained · 6y ago

How do you get a reinforcement learning agent to do what you want, when you can't actually write a reward function that specifies what that is?

The paper: https://arxiv.org/pdf/1706.03741.pdf

The blogpost: https://openai.com/blog/deep-reinforcement-learning-from-human-preferences/

Thanks to my wonderful patrons: https://www.patreon.com/robertskmiles

James Gladamas Steef Scott Worley Jordan Medina Simon Strandgaard JJ Hepboin Pedro A Ortega Said Polat Chris Canal Jake Ehrlich Kellen lask Francisco Tolmasky Michael Andregg David Reid Robert Daniel Pickard Peter Rolf Chad Jones Richárd Nagyfi Jason Hise Phil Moyer Shevis Johnson Erik de Bruijn Alec Johnson Clemens Arbesser Ludwig Schubert Bryce Daifuku Allen Faure Eric James Qeith Wreid Jonatan R Ingvi Gautsson Michael Greve Julius Brash Tom O'Connor Robin Green Laura Olds Jon Halliday Paul Hobbs Jeroen De Dauw Lupuleasa Ionuț Tim Neilson Eric Scammell Igor Keller Ben Glanton anul kumar sinha Sean Gibat Cooper Lawton Will Glynn Tyler Herrmann Tomas Sayder Ian Munro Jérôme Beaulieu Nathan Fish Taras Bobrovytsky Anne Buit Vaskó Richárd Sebastian Birjoveanu Euclidean Plane Andrew Harcourt DGJono robertvanduursen Dmitri Afanasjev Marcel Ward Andrew Weir Ben Archer Kabs Miłosz Wierzbicki Tendayi Mawushe Jannik Olbrich Anne Kohlbrenner Jussi Männistö Wr4thon Martin Ottosen Archy de Berker Marc Pauly Andy Kobre Brian Gillespie Poker Chen Kees Darko Sperac Truls Paul Moffat Anders Öhrt Marco Tiraboschi Michael Kuhinica Fraser Cain Robin Scharf Seth Brothwell Kasper Schnack Klemen Slavic Patrick Henderson Oct todo22 Melisa Kostrzewski Hendrik Daniel Munter Graham Henry Duncan Orr Bryan Egan Robert Hildebrandt James Fowkes Alan Bandurka Ben H Tatiana Ponomareva Michael Bates Simon Pilkington Dion Gerald Bridger Petr Smital Daniel Kokotajlo Fionn Yuchong Li Diagon Parker Lund Paul Emmerich Russell schoen Andreas Blomqvist Bertalan Bodor David Morgan Jeremy Ben Schultz Zannheim Daniel Eickhardt lyon549 HD Ihor Mukha 14zRobot Iva
