🎮 Reinforcement Learning

RL algorithms, reward modelling, RLHF, policy gradients, Q-learning and multi-agent RL

Reinforcement Learning

You Won't Believe How This Cop Got Away With This... #police #lawyer

Hampton Law Advanced 2w ago

Reinforcement Learning

How to build your own LLM from Scratch | Rakesh Gohel

Rakesh Gohel Advanced 2w ago

Reinforcement Learning

Preference Alignment & RLHF in LLMs Explained with Huggingface Practical | RLHF, PPO Part-3

Sunny Savita Advanced 2w ago

Reinforcement Learning

Reinforcement Learning from Human Feedback (RLHF) - High-Level Intuition

SH AI Academy Advanced 3w ago

Reinforcement Learning

GLP-1s: Overdosing, Side Effects & Long-Term Risks | Dr. Abud Bakri & Dr. Andrew Huberman

Huberman Lab Clips Advanced 1mo ago

Reinforcement Learning

The Types of LLM Fine-Tuning: SFT, RLHF, DPO, and LoRA Explained

SH AI Academy Advanced 1mo ago

Reinforcement Learning

Understanding Reinforcement Learning with Prime Intellect and Unsloth | Nemotron Labs

NVIDIA Developer Advanced 2mo ago

Reinforcement Learning

Huggingface TRL vs Unsloth RL: Reinforcement Learning Frameworks. How to fine tuning LLMs - Gemma 4

Byte Goose AI. Advanced 3mo ago

Reinforcement Learning

S02E04 — The Model Was Getting Rewarded for Mistakes — Reward Model

AI X-Rayed Advanced 3mo ago

Reinforcement Learning

Unsloth RL Training. Nvidia NeMO RL using GRPO. Reinforcement Learning from Verifiable Rewards RLVR

AI Podcast Series. Byte Goose AI. Advanced 3mo ago

Reinforcement Learning

Can You Trust an LLM Judge? An RL Researcher's Take

Deep Learning with Yacine Advanced 4mo ago

Reinforcement Learning

Deep Dive: Teaching Arcee Trinity Mini to Read Medical Research with RLVR and GRPO

Julien Simon Advanced 4mo ago

Reinforcement Learning

#Meta rolls out new performance program with bigger #rewards for top #employees

Business Insider Advanced 5mo ago

Reinforcement Learning

Why LLMs Shouldn’t Follow Instructions (But Do)

ML Guy Advanced 6mo ago

Reinforcement Learning ⚡ AI Lesson

[State of Post-Training] From GPT-4.1 to 5.1: RLVR, Agent & Token Efficiency — Josh McGrath, OpenAI

Latent Space Advanced 6mo ago

Reinforcement Learning

23. What is RLHF? Reinforcement Learning from Human Feedback Explained In Hindi

AI SayI Advanced 6mo ago

Reinforcement Learning ⚡ AI Lesson

Training a Unitree G1 to Walk w/ Reinforcement Learning

Sentdex Advanced 6mo ago

Reinforcement Learning ⚡ AI Lesson

Agent Reinforcement Fine Tuning – Will Hang & Cathy Zhou, OpenAI

AI Engineer Advanced 7mo ago

Reinforcement Learning

Stanford CS224R Deep Reinforcement Learning | Spring 2025 | Lecture 9: RL for LLMs

Stanford Online Advanced 7mo ago

Reinforcement Learning ⚡ AI Lesson

Why Every Skyrim AI Becomes a Stealth Archer

Siraj Raval Advanced 7mo ago

Reinforcement Learning ⚡ AI Lesson

LLM Fine-Tuning Crash Course: Finetune model on PDFs, Instruction FT, Preference Training (DPO/RLHF)

Sunny Savita Advanced 7mo ago · 3:36:14

Reinforcement Learning ⚡ AI Lesson

Keynote: Olmo-Thinking: Training a Fully Open Reasoning Model - Nathan Lambert

PyTorch Advanced 8mo ago

Reinforcement Learning

Learn to align LLMs through post-training in this new course with AMD!

DeepLearningAI Advanced 8mo ago

Reinforcement Learning

Strategy vs Plan: The Difference Every Comms Pro Gets Wrong

Joanna Parsons Advanced 10mo ago

Reinforcement Learning

Unified Agentic RAG - NEW AI for Medical Diagnosis

Discover AI Advanced 10mo ago

Reinforcement Learning

3 Communication Mistakes That Make Leaders DISMISS Your Ideas

Joanna Parsons Advanced 10mo ago

Reinforcement Learning

verl: Flexible and Scalable Reinforcement Learning Library for LLM Reasoning and Tool-Calling

PyTorch Advanced 11mo ago

Reinforcement Learning

Reinforcement Learning Models - Live Review 2

Dr Mehrdad Arashpour Advanced 11mo ago

Reinforcement Learning

The RLVR Revolution — with Nathan Lambert (AI2, Interconnects.ai)

Latent Space Advanced 11mo ago

Reinforcement Learning ⚡ AI Lesson

Reinforcement learning with Unitree G1 humanoid - Dev w/ G1 P.5

Sentdex Advanced 11mo ago

Reinforcement Learning ⚡ AI Lesson

AI Singularity Discovered

Discover AI Advanced 11mo ago

Reinforcement Learning

Learn to post-train LLMs in this free course

DeepLearningAI Advanced 1y ago

Reinforcement Learning

Let’s Talk Tokens: AMA on Reinforcement Fine-Tuning (RFT), GRPO, and AI Rewards

Predibase by Rubrik Advanced 1y ago

Reinforcement Learning

Stella Li Spurious Rewards Rethinking Training Signals in RLVR

Cohere Advanced 1y ago

Reinforcement Learning

'It's Not the Land of 10,000 Things!'

MLOps.community Advanced 1y ago

Reinforcement Learning

Tricks to Fine Tuning // Prithviraj Ammanabrolu // MLOps Podcast #318

MLOps.community Advanced 1y ago

Reinforcement Learning

Why 90% of Machine Learning Is Labeling—and Why That Era Is Over

Dev In the Details Advanced 1y ago

Reinforcement Learning ⚡ AI Lesson

Reward Models | Data Brew | Episode 40

Databricks Advanced 1y ago

Reinforcement Learning

DPO | Direct Preference Optimization (DPO) architecture | LLM Alignment

AILinkDeepTech Advanced 1y ago

Reinforcement Learning

How to Train LLMs to "Think" (o1 & DeepSeek-R1)

Shaw Talebi Advanced 1y ago

Reinforcement Learning ⚡ AI Lesson

Unlocking Enterprise AI: The DeepSeek Innovation Transforming Data Privacy

Lucidate Advanced 1y ago

Reinforcement Learning

RLHF : Reinforcement Learning through human Feedback ,PPO paper.

Tanisha Choudhary Advanced 1y ago

Reinforcement Learning

10 Ways to Communicate Effectively At Work

Joanna Parsons Advanced 11mo ago

Reinforcement Learning

If You're the ONLY Internal Comms Person in Your Company, Watch This

Joanna Parsons Advanced 1y ago

Reinforcement Learning

New AI Framework: Post-Training

Discover AI Advanced 1y ago

Reinforcement Learning ⚡ AI Lesson

NO AI Self-Improvement w/ RL

Discover AI Advanced 1y ago

Reinforcement Learning

Knowledge Graphs w/ AI Agents form CRYSTAL (MIT)

Discover AI Advanced 1y ago

Reinforcement Learning

AI Agents: NEW Inference Reasoning Q-NET (QLASS)

Discover AI Advanced 1y ago

📚 Continue on Coursera External links · Free to audit

📚 External: Coursera ↗

Self-paced

Introduction to Learning

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

Q Learning in Reinforcement Training Basics

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

Marketing Design with Easil

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

Optimizing Diversity on Teams

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

RStudio for Six Sigma - Process Capability

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

Introduction to C++ Programming and Unreal

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

Algorithms, Data Collection, and Starting to Code

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

How to Get Into Software Development

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

Understand and Apply Artificial Intelligence Fundamentals

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

A Complete Reinforcement Learning System (Capstone)

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

Fundamentals of Reinforcement Learning

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

The Science of the Solar System

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

Generative AI Advance Fine-Tuning for LLMs

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

Sustainability through Soccer: Systems-Thinking in Action

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

Interacting with the System and Managing Memory

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

Value-Based Care: Organizational Competencies

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

Advanced Deep RL Algorithms and Applications

Opens on Coursera ↗

📚 External: Coursera ↗

Self-paced

Creating a Team Culture of Continuous Learning

Opens on Coursera ↗