Why LLMs Shouldn’t Follow Instructions (But Do)
A pretrained language model can predict text, but it doesn’t know how to help you.
In this video, we break down how raw LLMs are transformed into instruction-following assistants like ChatGPT. You’ll learn how fine-tuning, human preference data, and reinforcement learning from human feedback (RLHF) reshape a model’s behavior — without changing its architecture.
We cover:
Why next-token prediction alone is not enough
Supervised fine-tuning with instruction–response pairs (see sketch 1 after this list)
How human rankings become a reward model (sketch 2)
What RLHF actually optimizes, and what it doesn’t (sketch 3)
How safety, refusals, and “helpfulness” shape the final model
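Below are three minimal Python sketches of the steps in the list above. First, supervised fine-tuning: one way an instruction–response pair can become a training example. The chat template, field names, and character-level masking are illustrative assumptions, not any particular library’s format.

pair = {
    "instruction": "Explain what a reward model is in one sentence.",
    "response": "A reward model scores candidate outputs so they can be ranked.",
}

def build_sft_example(pair):
    # Characters stand in for tokens to keep the sketch dependency-free.
    prompt = "User: " + pair["instruction"] + "\nAssistant: "
    full_text = prompt + pair["response"]
    # Loss is computed only on the response: prompt positions are masked
    # out, so the model learns to produce answers, not prompts.
    loss_mask = [0] * len(prompt) + [1] * len(pair["response"])
    return full_text, loss_mask

text, mask = build_sft_example(pair)
print(sum(mask), "of", len(mask), "positions contribute to the loss")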
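Second, the reward model: a sketch of the pairwise (Bradley–Terry style) loss commonly used to turn human rankings into a training signal. The scalar scores are made-up numbers standing in for a reward model’s outputs.

import math

def pairwise_loss(r_chosen, r_rejected):
    # -log sigmoid(r_chosen - r_rejected): the loss shrinks as the model
    # scores the human-preferred answer above the rejected one.
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

print(pairwise_loss(2.0, 0.5))  # ranking respected: small loss (~0.20)
print(pairwise_loss(0.5, 2.0))  # ranking violated: large loss (~1.70)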
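Third, the RLHF objective: the policy is typically trained to maximize the reward model’s score minus a KL penalty that keeps it close to the SFT reference model. The log-probabilities and beta below are invented for illustration; the point is that RLHF optimizes a learned reward under a KL leash, not helpfulness itself.

def rlhf_objective(reward, logp_policy, logp_reference, beta=0.1):
    # A common per-sample KL estimate: log pi(y|x) - log pi_ref(y|x).
    kl_estimate = logp_policy - logp_reference
    return reward - beta * kl_estimate

# High reward, but the policy has drifted far from the reference:
print(rlhf_objective(reward=3.0, logp_policy=-5.0, logp_reference=-25.0))   # 1.0
# Lower reward with the policy still close to the reference:
print(rlhf_objective(reward=1.5, logp_policy=-12.0, logp_reference=-13.0))  # 1.4

Note how the second sample wins despite its lower raw reward: the KL term trades reward-model score for staying recognizable as the original model.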
Watch on YouTube ↗
DeepCamp AI