ChatGPT has Never Seen a SINGLE Word (Despite Reading Most of The Internet). Meet LLM Tokenizers.

Jay Alammar · Beginner · 🧠 Large Language Models · 2y ago

Key Takeaways

Jay Alammar explains how large language models like ChatGPT process text data through tokenizers, which translate text into a format the model can operate on. He examines the inner workings of a language model tokenizer to give a sense of how they work.
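To make the idea of "translating text into a format the model can operate on" concrete, here is a minimal sketch of a toy tokenizer. It is not the tokenizer used in the video or by any real model: the vocabulary `VOCAB`, the `<unk>` fallback token, and the greedy longest-match rule are all assumptions chosen for illustration. Production tokenizers learn much larger vocabularies with algorithms such as byte-pair encoding.

```python
# Hypothetical toy vocabulary; real tokenizers learn tens of thousands of entries.
VOCAB = ["<unk>", " ", "token", "izer", "s", "the", "model", "read", "never", "word"]
TOKEN_TO_ID = {token: i for i, token in enumerate(VOCAB)}


def encode(text):
    """Split text into token IDs using greedy longest-match (a simplification)."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in TOKEN_TO_ID:
                ids.append(TOKEN_TO_ID[piece])
                pos = end
                break
        else:
            # No vocabulary entry matches: emit <unk> and skip one character.
            ids.append(TOKEN_TO_ID["<unk>"])
            pos += 1
    return ids


def decode(ids):
    """Map token IDs back to their text pieces and join them."""
    return "".join(VOCAB[i] for i in ids)


ids = encode("the model never reads words")
print(ids)          # the integers are what a model would receive, not the words
print(decode(ids))  # the text reassembled from the token pieces
```

Note that "reads" and "words" are each split into two pieces ("read" + "s", "word" + "s"), which is the sense in which the model never sees whole words.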

Original Description

Despite processing internet-scale text data, large language models never see words as we do. Yes, they consume text, but a separate piece of software called a tokenizer is what actually takes in the text and translates it into a different format that the language model operates on. In this video, Jay examines a language model tokenizer to give you a sense of how they work.

Follow our upcoming book, Hands-On Large Language Models, for more details about tokenizers and LLMs in general.
Updates on the book: https://jayalammar.substack.com/
My co-author: https://twitter.com/MaartenGr / https://maartengrootendorst.substack.com/
Early access: https://www.oreilly.com/library/view/hands-on-large-language/9781098150952/

Twitter: https://twitter.com/JayAlammar
Blog: https://jalammar.github.io/
Mailing List: https://jayalammar.substack.com/

0:00 Introduction
0:41 We're writing: Hands-On Large Language Models
1:13 Generating text with ChatGPT Cohere Command
2:42 Looking at the generation code
5:03 What is the actual input to a language model?
7:14 What is the actual output of a language model generate?
7:50 The tokenizer's lookup table and embeddings inside a model
9:07 Looking at the model, tokenizer
12:27 Summary

Playlist

Uploads from Jay Alammar · Jay Alammar · 34 of 38

1 Jay's Visual Intro to AI
2 Making Money from AI by Predicting Sales - Jay's Intro to AI Part 2
3 How GPT3 Works - Easily Explained with Animations
4 The Narrated Transformer Language Model
5 My Visualization Tools (my Apple Keynote setup for visualizations and animations)
6 Explainable AI Cheat Sheet - Five Key Categories
7 The Unreasonable Effectiveness of RNNs (Article and Visualization Commentary) [2015 article]
8 Neural Activations & Dataset Examples
9 Up and Down the Ladder of Abstraction [interactive article by Bret Victor, 2011]
10 Probing Classifiers: A Gentle Intro (Explainable AI for Deep Learning)
11 Inspecting Neural Networks with CCA - A Gentle Intro (Explainable AI for Deep Learning)
12 Language Processing with BERT: The 3 Minute Intro (Deep learning for NLP)
13 Behavioral Testing of ML Models (Unit tests for machine learning)
14 Favorite AI/ML Books: Intro to ML with Python (Book Review)
15 Favorite Python Books: Effective Python
16 Favorite Stats Books: Seven Pillars of Statistical Wisdom
17 Understanding Animal Languages - Seeing Voices 2
18 How digital assistants like Siri work #shorts
19 Writing Code in Jupyter Notebooks #shorts
20 Experience Grounds Language: Improving language models beyond the world of text
21 pandas for data science in python #shorts
22 The Illustrated Retrieval Transformer
23 AI Image Generation is MIND BLOWING! #shorts
24 A Generalist Agent (Gato) - DeepMind's single model learns 600 tasks
25 The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning
26 AI Art Explained: How AI Generates Images (Stable Diffusion, Midjourney, and DALLE)
27 What is Generative AI? 4 Important Things to Know (about ChatGPT, MidJourney, Cohere & future AIs)
28 AI is Eating The World - This is Where YOU Can Use it to Compete (AI Product Moats)
29 What is LangChain? Where does it fit with LLMs like ChatGPT and Cohere? #shorts
30 Are language models with more parameters better? #shorts #chatgpt
31 How to manage LLM prompts with tools like LangChain #languagemodels #chatgpt
32 What is Llama Index? how does it help in building LLM applications? #languagemodels #chatgpt
33 prompt chains are important for building large language model applications
34 ChatGPT has Never Seen a SINGLE Word (Despite Reading Most of The Internet). Meet LLM Tokenizers.
35 What makes LLM tokenizers different from each other? GPT4 vs. FlanT5 Vs. Starcoder Vs. BERT and more
36 Building LLM Agents with Tool Use
37 SWE-Bench authors reflect on the state of LLM agents at Neurips 2024
38 Learn how ChatGPT and DeepSeek models work: How Transformer LLMs Work [Free Course]

This video explains how large language models process text data through tokenizers and examines the inner workings of a language model tokenizer. It provides a foundation for understanding how LLMs work and how to improve prompt crafting skills.

Key Takeaways
  1. Understand the role of tokenizers in LLMs
  2. Learn how tokenizers translate text into a format the model can operate on
  3. Examine the lookup table and embeddings inside a model
  4. Understand how to generate text with ChatGPT and Cohere Command
💡 Tokenizers play a crucial role in how LLMs process text data, and understanding how they work can improve prompt crafting skills.
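The third takeaway, the lookup table and the embeddings inside a model, can be sketched in a few lines. This is an illustrative assumption rather than code from the video: `VOCAB_SIZE`, `EMBEDDING_DIM`, and the randomly filled `EMBEDDINGS` table are hypothetical stand-ins. In a trained model the table has one learned vector per entry in the tokenizer's vocabulary, and each token ID simply selects a row.

```python
import random

VOCAB_SIZE = 10    # hypothetical; real vocabularies hold tens of thousands of tokens
EMBEDDING_DIM = 4  # hypothetical; real models use hundreds or thousands of dimensions

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# One row per token ID. Random here; learned during training in a real model.
EMBEDDINGS = [
    [rng.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)]
    for _ in range(VOCAB_SIZE)
]


def embed(token_ids):
    """Look up the embedding vector for each token ID."""
    return [EMBEDDINGS[token_id] for token_id in token_ids]


vectors = embed([5, 1, 6])  # three token IDs produced by some tokenizer
print(len(vectors), len(vectors[0]))  # number of tokens, vector size per token
```

The same token ID always maps to the same row, which is why the tokenizer's vocabulary and the model's embedding table have to agree on what each ID means.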
