Why LLMs Shouldn’t Follow Instructions (But Do)
A pretrained language model can predict text, but it doesn’t know how to help you.
In this video, we break down how raw LLMs are transformed into instruction-following assistants like ChatGPT. You’ll learn how fine-tuning, human preference data, and reinforcement learning from human feedback (RLHF) reshape a model’s behavior — without changing its architecture.
We cover:
Why next-token prediction alone is not enough
Supervised fine-tuning with instruction–response pairs (see sketch 1 after this list)
How human rankings become a reward model (sketch 2)
What RLHF actually optimizes, and what it doesn’t (sketch 3)
How safety, refusals, and “helpfulness” shape the final model
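Below are three minimal Python sketches of the steps in the list above. First, supervised fine-tuning: one way an instruction–response pair can become a training example. The chat template, field names, and character-level masking are illustrative assumptions, not any particular library’s format.

pair = {
    "instruction": "Explain what a reward model is in one sentence.",
    "response": "A reward model scores candidate outputs so they can be ranked.",
}

def build_sft_example(pair):
    # Characters stand in for tokens to keep the sketch dependency-free.
    prompt = "User: " + pair["instruction"] + "\nAssistant: "
    full_text = prompt + pair["response"]
    # Loss is computed only on the response: prompt positions are masked
    # out, so the model learns to produce answers, not prompts.
    loss_mask = [0] * len(prompt) + [1] * len(pair["response"])
    return full_text, loss_mask

text, mask = build_sft_example(pair)
print(sum(mask), "of", len(mask), "positions contribute to the loss")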
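Second, the reward model: a sketch of the pairwise (Bradley–Terry style) loss commonly used to turn human rankings into a training signal. The scalar scores are made-up numbers standing in for a reward model’s outputs.

import math

def pairwise_loss(r_chosen, r_rejected):
    # -log sigmoid(r_chosen - r_rejected): the loss shrinks as the model
    # scores the human-preferred answer above the rejected one.
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

print(pairwise_loss(2.0, 0.5))  # ranking respected: small loss (~0.20)
print(pairwise_loss(0.5, 2.0))  # ranking violated: large loss (~1.70)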
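Third, the RLHF objective: the policy is typically trained to maximize the reward model’s score minus a KL penalty that keeps it close to the SFT reference model. The log-probabilities and beta below are invented for illustration; the point is that RLHF optimizes a learned reward under a KL leash, not helpfulness itself.

def rlhf_objective(reward, logp_policy, logp_reference, beta=0.1):
    # A common per-sample KL estimate: log pi(y|x) - log pi_ref(y|x).
    kl_estimate = logp_policy - logp_reference
    return reward - beta * kl_estimate

# High reward, but the policy has drifted far from the reference:
print(rlhf_objective(reward=3.0, logp_policy=-5.0, logp_reference=-25.0))   # 1.0
# Lower reward with the policy still close to the reference:
print(rlhf_objective(reward=1.5, logp_policy=-12.0, logp_reference=-13.0))  # 1.4

Note how the second sample wins despite its lower raw reward: the KL term trades reward-model score for staying recognizable as the original model.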
Watch on YouTube ↗
DeepCamp AI