How Tokenization, Inference, & LLMs Actually Work
In this video, I explain how language models generate text, why most of the process is deterministic rather than random, and how you can shape the probability distribution used to select the next token with parameters like temperature and top-p.
I cover temperature in depth and use a spreadsheet to demonstrate how different values change the probabilities.
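As a minimal sketch of the two ideas covered here: temperature divides the logits before the softmax (lower values sharpen the distribution, higher values flatten it), and top-p keeps only the smallest set of top tokens whose cumulative probability reaches p, then renormalizes. The logit values and parameter choices below are illustrative assumptions, not numbers from the video:

```python
import math

def softmax(logits, temperature=1.0):
    # Divide each logit by the temperature before exponentiating:
    # T < 1 sharpens the distribution, T > 1 flattens it toward uniform.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p=0.9):
    # Keep the smallest set of tokens whose cumulative probability
    # reaches p, then renormalize over that set (nucleus sampling).
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

# Hypothetical logits for four candidate next tokens.
logits = [2.0, 1.0, 0.5, -1.0]
for t in (0.5, 1.0, 2.0):
    print(t, [round(x, 3) for x in softmax(logits, t)])
print(top_p_filter(softmax(logits), p=0.9))
```

Running this shows the same effect as the spreadsheet demo: at temperature 0.5 the top token dominates, while at 2.0 the probabilities move closer together, and the top-p filter drops the long tail before sampling.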
Topics:
00:10 Tokens & Why They Matter
03:27 Special Tokens
04:35 The Inference Loop
07:26 Random or Not?
08:11 Deep Dive into Temperature
14:19 Tips for Setting Temperature
16:11 Top P
If you'd like to play with the temperature calculator s…
DeepCamp AI