225 lessons

🎮 Reinforcement Learning

RL algorithms, reward modelling, RLHF, policy gradients, Q-learning and multi-agent RL

Preference Alignment & RLHF in LLMs Explained with Huggingface Practical | RLHF, PPO Part-3
Reinforcement Learning
Sunny Savita Advanced 2d ago
Why Rewards Stop Working #studentbehavior #classroommanagement #studentbehavior
Reinforcement Learning
Smart Classroom Management Beginner 6d ago
Tornado Threats Are a Constant. But Funding for a Safe Room Is Still Held Up
Reinforcement Learning
Education Week Intermediate 2w ago
Where RL Breaks- Sparse Rewards #ai #podcast
Reinforcement Learning
The MAD Podcast with Matt Turck Intermediate 2w ago
Continuous Support | Student Experiences | Easy Learning
Reinforcement Learning
The iScale Beginner 2w ago
Preference Alignment & RLHF in LLMs Explained | RLHF, PPO, DPO, ORPO, RL Basics & Practical Part-1
Reinforcement Learning
Sunny Savita Beginner 1mo ago
Introduction to Reinforcement Learning and PPO for robotics | VLA for autonomous driving series
Reinforcement Learning
Vizuara Beginner 1mo ago
How To Use The Law Of Cause & Effect To Control Your Future | Denis Waitley
Reinforcement Learning
Evan Carmichael Beginner 1mo ago
Stanford CME296 Diffusion & Large Vision Models | Spring 2026 | Lecture 6 - Model Training
Reinforcement Learning
Stanford Online Beginner 1mo ago
RL in Production | 8 Week Bootcamp to Master Reinforcement Learning in Production
Reinforcement Learning
Vizuara Beginner 1mo ago
These emails get 73% higher engagement
Reinforcement Learning
Sky Bailey Beginner 2mo ago
Understanding Reinforcement Learning with Prime Intellect and Unsloth | Nemotron Labs
Reinforcement Learning
NVIDIA Developer Advanced 2mo ago
Huggingface TRL vs Unsloth RL: Reinforcement Learning Frameworks. How to fine tuning LLMs - Gemma 4
Reinforcement Learning
Byte Goose AI. Advanced 2mo ago
Supervised vs Unsupervised vs Reinforcement Learning
Reinforcement Learning
Analytics Vidhya Beginner 2mo ago
Post at *THIS* time on Facebook for max engagement
Reinforcement Learning
Buffer Intermediate 2mo ago
Unsloth RL Training. Nvidia NeMO RL using GRPO. Reinforcement Learning from Verifiable Rewards RLVR
Reinforcement Learning
AI Podcast Series. Byte Goose AI. Advanced 3mo ago
The Top 1% Think Like THIS | Here's How To Do It | Denis Waitley
Reinforcement Learning ⚡ AI Lesson
Evan Carmichael Beginner 3mo ago
Toblerone’s Hidden Bear Secret 🐻 #shorts
Reinforcement Learning ⚡ AI Lesson
Jacky Chou from Indexsy Beginner 3mo ago
Why Most People Stay Poor
Reinforcement Learning ⚡ AI Lesson
Dan Lok Intermediate 3mo ago
Smarter AI Gradients: How Agents Learn to Think
Reinforcement Learning ⚡ AI Lesson
Discover AI Beginner 4mo ago
Reinforcement Learning: A (practical) introduction
Reinforcement Learning ⚡ AI Lesson
Shawhin Talebi Beginner 5mo ago
#Meta rolls out new performance program with bigger #rewards for top #employees
Reinforcement Learning
Business Insider Advanced 5mo ago
Goliath's Challenge #shorts #trending #viral #bible #magiclightai #ytshorts
Reinforcement Learning
AI From Scratch Beginner 5mo ago
Why AI Needs Humans
Reinforcement Learning
The Information Beginner 5mo ago
Learn with Me: Train AI Agents for Command-Line Tasks with Synthetic Data and RL | Nemotron Labs
Reinforcement Learning ⚡ AI Lesson
NVIDIA Developer Beginner 5mo ago
X Is Paying $1,000,000 for Writing
Reinforcement Learning
Full Disclosure Intermediate 5mo ago
What is a Primary Reinforcer? (Easiest Explanation)
Reinforcement Learning
Helpful Professor Explains! Beginner 5mo ago
Why LLMs Shouldn’t Follow Instructions (But Do)
Reinforcement Learning
ML Guy Advanced 5mo ago
Intelligent Robots in 2026: Are We There Yet? [Nikita Rudin] - 760
Reinforcement Learning ⚡ AI Lesson
TWIML AI Podcast Beginner 5mo ago
The Hidden Power Behind Loyalty Programs 🤔 #shorts
Reinforcement Learning
Jacky Chou from Indexsy Beginner 5mo ago
This town has three nuclear plants. Now it wants another one
Reinforcement Learning
Vox Intermediate 6mo ago
Training a Unitree G1 to Walk w/ Reinforcement Learning
Reinforcement Learning ⚡ AI Lesson
Sentdex Advanced 6mo ago
#NOVER Explained: How AI Learns to Judge Its Own Reasoning (No Reward Model Needed)
Reinforcement Learning
BazAI Beginner 6mo ago
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 1: Class Intro
Reinforcement Learning ⚡ AI Lesson
Stanford Online Beginner 6mo ago
How to train Multi Agent Collaborative Agents with Reinforcement Learning (CTDE Explained)
Reinforcement Learning ⚡ AI Lesson
Neural Breakdown with AVB Beginner 6mo ago
Why Every Skyrim AI Becomes a Stealth Archer
Reinforcement Learning ⚡ AI Lesson
Siraj Raval Advanced 6mo ago
What is RLHF (Reinforcement Learning from Human Feedback) ? | The Secret Ingredient Behind ChatGPT
Reinforcement Learning ⚡ AI Lesson
VLR Software Training Beginner 7mo ago
How ChatGPT Actually Works: The "Secret Sauce" of AI Alignment & RLHF Explained
Reinforcement Learning
The Latent Space Beginner 7mo ago
What is Reinforcement Learning from Human Feedback (RLHF)
Reinforcement Learning ⚡ AI Lesson
Data Science Made Easy Beginner 7mo ago
Money Lessons From Americans Caring for Aging Parents | Life Lessons | Business Insider
Reinforcement Learning ⚡ AI Lesson
Business Insider Beginner 7mo ago
Introduction for world models for autonomous driving | VLA for autonomous driving series | Session 4
Reinforcement Learning
Vizuara Beginner 2mo ago
Build OpenClaw-RL + VoiceAgents using Claude Code | LLM context engineering series | Lecture 10
Reinforcement Learning
Vizuara Intermediate 2mo ago
The Secret Behind Costco’s $1.50 Hotdog 🕵️‍♂️ #shorts
Reinforcement Learning
Jacky Chou from Indexsy Intermediate 5mo ago
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 3: Policy Gradients
Reinforcement Learning
Stanford Online Intermediate 6mo ago
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 6: Q-Learning
Reinforcement Learning
Stanford Online Beginner 6mo ago
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 17: Advancing Robot Intelligence
Reinforcement Learning ⚡ AI Lesson
Stanford Online Beginner 6mo ago
Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Tutorial Session: Review of Q-Learning
Reinforcement Learning ⚡ AI Lesson
Stanford Online Beginner 6mo ago
The Fastest Way to a Rich Life | Earl Nightingale Insight in 13 Minutes
Reinforcement Learning
Evan Carmichael Beginner 6mo ago