ChatGPT has Never Seen a SINGLE Word (Despite Reading Most of The Internet). Meet LLM Tokenizers.

Jay Alammar · Beginner · 🧠 Large Language Models · 2y ago

Skills: LLM Foundations 90% · Prompt Craft 60%

Key Takeaways

Jay Alammar explains how large language models like ChatGPT process text data through tokenizers, which translate text into a format the model can operate on. He examines the inner workings of a language model tokenizer to give a sense of how they work.
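To make the idea of "translating text into a format the model can operate on" concrete, here is a minimal sketch of a tokenizer. It is not the tokenizer shown in the video: the tiny vocabulary and the greedy longest-match rule are assumptions chosen for illustration, whereas real tokenizers learn vocabularies of tens of thousands of entries from data.

```python
# Hypothetical toy vocabulary; index in the list = token ID.
VOCAB = ["<unk>", " ", "token", "izer", "s", "word", "never", "see"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}

# Longest entries first, so "see" is preferred over "s" when both match.
CANDIDATES = sorted(VOCAB[1:], key=len, reverse=True)


def encode(text):
    """Split text into token IDs using greedy longest-match (an assumed rule)."""
    ids = []
    pos = 0
    while pos < len(text):
        match = next((tok for tok in CANDIDATES if text.startswith(tok, pos)), None)
        if match is None:
            # Character not covered by the vocabulary: emit the unknown token.
            ids.append(TOKEN_TO_ID["<unk>"])
            pos += 1
        else:
            ids.append(TOKEN_TO_ID[match])
            pos += len(match)
    return ids


def decode(ids):
    """Map token IDs back to their text pieces and join them."""
    return "".join(VOCAB[i] for i in ids)


ids = encode("tokenizers never see words")
print(ids)          # [2, 3, 4, 1, 6, 1, 7, 1, 5, 4]
print(decode(ids))  # tokenizers never see words
```

The list of integers is what a model would receive in this sketch; the word "tokenizers" never reaches it as a single unit, only as the pieces "token", "izer", and "s".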

Original Description

Despite processing internet-scale text data, large language models never see words as we do. Yes, they consume text, but another piece of software called a tokenizer is what actually takes in the text and translates it into a different format that the language model operates on. In this video, Jay examines a language model tokenizer to give you a sense of how they work.

Follow our upcoming book, Hands-On Large Language Models, for more details about tokenizers and LLMs in general.
Updates on the book coming on https://jayalammar.substack.com/
My co-author: https://twitter.com/MaartenGr / https://maartengrootendorst.substack.com/
Early access on https://www.oreilly.com/library/view/hands-on-large-language/9781098150952/

---
Twitter: https://twitter.com/JayAlammar
Blog: https://jalammar.github.io/
Mailing List: https://jayalammar.substack.com/
---

0:00 Introduction
0:41 We're writing: Hands-On Large Language Models
1:13 Generating text with ChatGPT Cohere Command
2:42 Looking at the generation code
5:03 What is the actual input to a language model?
7:14 What is the actual output of a language model generate?
7:50 The tokenizer's lookup table and embeddings inside a model
9:07 Looking at the model, tokenizer
12:27 Summary

Playlist

Uploads from Jay Alammar · Jay Alammar · 34 of 38

Jay's Visual Intro to AI

Making Money from AI by Predicting Sales - Jay's Intro to AI Part 2

How GPT3 Works - Easily Explained with Animations

The Narrated Transformer Language Model

My Visualization Tools (my Apple Keynote setup for visualizations and animations)

Explainable AI Cheat Sheet - Five Key Categories

The Unreasonable Effectiveness of RNNs (Article and Visualization Commentary) [2015 article]

Neural Activations & Dataset Examples

Up and Down the Ladder of Abstraction [interactive article by Bret Victor, 2011]

Probing Classifiers: A Gentle Intro (Explainable AI for Deep Learning)

Inspecting Neural Networks with CCA - A Gentle Intro (Explainable AI for Deep Learning)

Language Processing with BERT: The 3 Minute Intro (Deep learning for NLP)

Behavioral Testing of ML Models (Unit tests for machine learning)

Favorite AI/ML Books: Intro to ML with Python (Book Review)

Favorite Python Books: Effective Python

Favorite Stats Books: Seven Pillars of Statistical Wisdom

Understanding Animal Languages - Seeing Voices 2

How digital assistants like Siri work #shorts

Writing Code in Jupyter Notebooks #shorts

Experience Grounds Language: Improving language models beyond the world of text

pandas for data science in python #shorts

The Illustrated Retrieval Transformer

AI Image Generation is MIND BLOWING! #shorts

A Generalist Agent (Gato) - DeepMind's single model learns 600 tasks

The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning

AI Art Explained: How AI Generates Images (Stable Diffusion, Midjourney, and DALLE)

What is Generative AI? 4 Important Things to Know (about ChatGPT, MidJourney, Cohere & future AIs)

AI is Eating The World - This is Where YOU Can Use it to Compete (AI Product Moats)

What is LangChain? Where does it fit with LLMs like ChatGPT and Cohere? #shorts

Are language models with more parameters better? #shorts #chatgpt

How to manage LLM prompts with tools like LangChain #languagemodels #chatgpt

What is Llama Index? how does it help in building LLM applications? #languagemodels #chatgpt

prompt chains are important for building large language model applications

ChatGPT has Never Seen a SINGLE Word (Despite Reading Most of The Internet). Meet LLM Tokenizers.

What makes LLM tokenizers different from each other? GPT4 vs. FlanT5 Vs. Starcoder Vs. BERT and more

Building LLM Agents with Tool Use

SWE-Bench authors reflect on the state of LLM agents at Neurips 2024

Learn how ChatGPT and DeepSeek models work: How Transformer LLMs Work [Free Course]

This video explains how large language models process text data through tokenizers and examines the inner workings of a language model tokenizer. It provides a foundation for understanding how LLMs work and how to improve prompt crafting skills.

Key Takeaways

Understand the role of tokenizers in LLMs
Learn how tokenizers translate text into a format the model can operate on
Examine the lookup table and embeddings inside a model
Understand how to generate text with ChatGPT and Cohere Command
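The lookup table and embeddings mentioned above can be sketched as follows. This is an illustrative assumption, not the model examined in the video: the vocabulary size, embedding width, and random values are made up, and a trained model would hold learned vectors instead. The point is only that each token ID selects one row of a table of vectors.

```python
import random

# Hypothetical sizes; real models use vocabularies of tens of thousands of
# tokens and embedding vectors with hundreds or thousands of dimensions.
VOCAB_SIZE = 8
EMBED_DIM = 4

# One row of numbers per token ID. Random here; learned during training in practice.
rng = random.Random(0)
EMBEDDINGS = [
    [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
    for _ in range(VOCAB_SIZE)
]


def embed(token_ids):
    """Look up the embedding row for each token ID."""
    return [EMBEDDINGS[token_id] for token_id in token_ids]


# Assumed token IDs, standing in for a tokenizer's output.
vectors = embed([2, 3, 4])
print(len(vectors), len(vectors[0]))  # 3 4
```

Three token IDs go in and three vectors of width 4 come out; these vectors, not the original characters, are what the model's layers would compute on.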

💡 Tokenizers play a crucial role in how LLMs process text data, and understanding how they work can improve prompt crafting skills.

Chapters (9)

0:00 Introduction

0:41 We're writing: Hands-On Large Language Models

1:13 Generating text with ChatGPT Cohere Command

2:42 Looking at the generation code

5:03 What is the actual input to a language model?

7:14 What is the actual output of a language model generate?

7:50 The tokenizer's lookup table and embeddings inside a model

9:07 Looking at the model, tokenizer

12:27 Summary
