Favorite Stats Books: Seven Pillars of Statistical Wisdom

Jay Alammar · Beginner ·🧠 Large Language Models ·4y ago

Key Takeaways

The video discusses the book 'The Seven Pillars of Statistical Wisdom' by Stephen Stigler, which explores seven foundational statistical ideas that were revolutionary for their time and are heavily used in science, technology, and machine learning. The seven pillars are aggregation, information, likelihood, intercomparison, regression, design of experiments, and residuals.

Full Transcript

Hello everybody, welcome back to a new video. In this video we'll be talking about another one of my favorite books: The Seven Pillars of Statistical Wisdom. Why statistics? I studied a little bit of statistics when I was doing my computer science degree, but the more I went into machine learning and AI, the more I had to deal with a lot of statistical concepts, because statistics is one of the, let's say, two or three foundations of machine learning, next to computer science and mathematics. When you learn about these statistical ideas in your journey to learn machine learning, a book like this is very interesting because it pulls out these threads of statistical ideas and puts them into historical context.

It's a very accessible book. We'll go into exactly the seven ideas, but what the book does is say: these are seven major ideas that are foundational to statistics as we know it today. Built on top of these seven are a lot of other statistical ideas, and machine learning and AI also sit on top of this structure of statistics. I love this book because it's very accessible; it's easy to pick up and read. It doesn't explain the ideas in mathematical ways. It's a really smooth way of storytelling: the origins of these ideas, how they developed, the people around them, and the kinds of problems they were trying to solve when they came up with these, let's say, revolutionary methods.

I'm somebody who maybe didn't have the best time with the statistics textbooks back in school, because they went right into, you know, when you toss a coin 100 times or a thousand times, what happens? That's a very important and rigorous understanding of statistics, but this is a very human look at the history of those ideas, extremely important ideas that you can maybe take for granted now if you don't see them in the proper historical light that this book puts them in. So let's get into The Seven Pillars of Statistical Wisdom and see what those seven are.

So this is The Seven Pillars of Statistical Wisdom by Stephen Stigler. It's a very small book and it's very easy to get through. There are a bunch of visuals in there, and a really good storytelling style throughout the book. But then, what are the seven? The seven ideas, or the seven pillars, that a lot of modern statistics is built on top of are:

  1. Aggregation: from tables and means to least squares
  2. Information: its measure and rate of change
  3. Likelihood: calibration on a probability scale
  4. Intercomparison: within-sample variation as a standard
  5. Regression: multivariate analysis, Bayesian inference, and causal inference
  6. Design: experimental planning and the role of randomization
  7. Residual: scientific logic, model comparison, and diagnostic display

Now for me, I would say the ones I most enjoyed were aggregation and information, and I really want to spend a bunch more time on likelihood, but regression was also extremely important. Aggregation is basically the idea that you can sometimes gain more information by throwing away information that you have. Take an average: if you have 100 measurements, you can average them to get only one number, and that number tells you something that maybe the hundred don't tell you. The intro chapter explains these in a very good way. Aggregation is the combination of observations, so you can gain information by throwing information away, and that is sort of revolutionary.

The second pillar is information, so information measurement. That's the idea that if you have 20 measurements of, let's say, a phenomenon, and then you go out and take 20 more measurements, you're not doubling the information that you have. The first 20 actually gave you more information than the second 20, and that's the square root of the number n of observations: accuracy grows with the square root of n rather than with n itself.

Likelihood is the calibration of inference with the use of probability. Intercomparison, the fourth pillar, is the idea that you can gain some insight by comparing a data set to itself. The fifth is regression, and the idea here is to get from Galton's ideas about regression to the mean, and how that explains some of the questions raised by Darwin's theory of evolution, to concepts like regression in prediction, so the work in fact introduces modern multivariate analysis. Design of experiments is the sixth pillar, and then the seventh is residuals.

To paraphrase these seven ideas, the author gives this list: the value of targeted reduction or compression of data, so that's aggregation; the diminishing value of an increased amount of data, so your first 20 maybe has more information than your second 20; how to put a probability measuring stick to what we do, that is likelihood; how to use internal variation in the data to help in that; how asking questions from different perspectives can lead to revealingly different answers, that would be regression; the essential role of the planning of observations, so how you design experiments and the importance of being careful in that; and how all these ideas can be used in exploring and comparing competing explanations in science.

So this has been a quick look at The Seven Pillars of Statistical Wisdom: a very readable intro to very interesting and important statistical ideas. Highly recommended. I've been going over it and I plan to spend even more time with it. I'm curious to see what you think about it, so let me know in the comments below. Thank you for watching.
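
As a small illustration of the information pillar from the transcript (the first 20 measurements being worth more than the second 20), here is a minimal Python sketch. It is not from the video or the book. It assumes independent measurements with a common standard deviation `sigma`, and uses the standard error of the mean, `sigma / sqrt(n)`, as the measure of accuracy. The function name `standard_error` and the value `sigma = 1.0` are illustrative assumptions.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean of n independent measurements,
    each with standard deviation sigma (hypothetical helper)."""
    return sigma / math.sqrt(n)


sigma = 1.0  # assumed spread of a single measurement (illustrative value)

se_20 = standard_error(sigma, 20)  # first 20 measurements
se_40 = standard_error(sigma, 40)  # after taking 20 more
se_80 = standard_error(sigma, 80)  # four times the original data

print(f"n=20: {se_20:.4f}")
print(f"n=40: {se_40:.4f}")
print(f"n=80: {se_80:.4f}")

# Doubling the data shrinks the error only by a factor of sqrt(2),
# and halving the error takes four times as many measurements.
print(f"se_20 / se_40 = {se_20 / se_40:.4f}")
print(f"se_20 / se_80 = {se_20 / se_80:.4f}")
```

Under these assumptions, going from 20 to 40 measurements improves accuracy by a factor of about 1.41 rather than 2, which is one way to read the claim that the second 20 measurements add less than the first 20.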

Original Description

The Seven Pillars of Statistical Wisdom is a wonderful small book about seven foundational statistical ideas that were revolutionary for their time. These seven are heavily used in science, technology, and machine learning. Jay goes over the seven ideas and what makes this an accessible and enjoyable book.

Contents:
Introduction (0:00)
Looking Inside: Seven Pillars (2:39)
Looking Inside: Paraphrasing The Seven (6:00)
Closing (6:54)
------
Twitter: https://twitter.com/JayAlammar
Blog: https://jalammar.github.io/
Mailing List: https://jayalammar.substack.com/
------
More videos by Jay:
Language Processing with BERT: The 3 Minute Intro (Deep learning for NLP) https://youtu.be/ioGry-89gqE
Seeing Voices: 1 - Intro to Spectrograms https://www.youtube.com/watch?v=37zCgCdV468
The Narrated Transformer Language Model https://youtu.be/-QH8fRhqFHM
Jay's Visual Intro to AI https://www.youtube.com/watch?v=mSTCz...
How GPT-3 Works - Easily Explained with Animations https://www.youtube.com/watch?v=MQnJZ...


The video introduces the book 'The Seven Pillars of Statistical Wisdom' and explores the seven foundational statistical ideas that are essential for machine learning and data analysis. The book provides a historical context and explains the concepts in an accessible way.

Key Takeaways
  1. Read the book 'The Seven Pillars of Statistical Wisdom'
  2. Understand the seven pillars of statistical wisdom
  3. Apply the statistical concepts to machine learning and data analysis
  4. Explore the historical context of statistical ideas
  5. Learn about aggregation, information, likelihood, intercomparison, regression, design of experiments, and residuals
💡 The seven pillars of statistical wisdom provide a foundation for understanding statistical concepts and applying them to machine learning and data analysis.
