StatQuickie: Thresholds for Significance
Key Takeaways
The video discusses the concept of statistical significance, specifically the threshold of 0.05, and its application in research papers. It highlights that the choice of threshold depends on the context and purpose of the study, and that a one-size-fits-all approach is not appropriate. The speaker, Josh Starmer, provides examples and exceptions to the general rule, emphasizing the importance of considering effect size and the nature of the claim being made.
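As a rough illustration of what a 0.05 threshold implies, here is a minimal Python sketch that is not from the video. It simulates experiments in which there is no real effect and counts how often the p-value still falls below 0.05. The sample size, the number of simulated experiments, the known standard deviation of 1, and the helper name `two_sided_z_p_value` are all assumptions chosen for illustration.

```python
import math
import random

ALPHA = 0.05          # the conventional threshold discussed in the video
N_PER_GROUP = 30      # hypothetical sample size per group
N_EXPERIMENTS = 5000  # hypothetical number of simulated experiments


def two_sided_z_p_value(group_a, group_b, sigma=1.0):
    """Two-sided p-value for a difference in means of two equal-sized
    groups, assuming the standard deviation sigma is known (a z-test)."""
    n = len(group_a)
    diff = sum(group_a) / n - sum(group_b) / n
    z = diff / (sigma * math.sqrt(2.0 / n))
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


rng = random.Random(42)  # fixed seed so the run is repeatable
false_positives = 0
for _ in range(N_EXPERIMENTS):
    # Both groups come from the same distribution, so there is no real effect.
    a = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
    b = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
    if two_sided_z_p_value(a, b) < ALPHA:
        false_positives += 1

false_positive_rate = false_positives / N_EXPERIMENTS
print(f"Fraction of null experiments with p < {ALPHA}: {false_positive_rate:.3f}")
```

Under these assumptions the printed fraction should land close to 0.05. This is the sense in which the threshold trades off being wrong some of the time against being able to report results at all.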
Full Transcript
Hello, and welcome to the very first StatQuickie. StatQuickies are little short videos, not a full-on StatQuest; each one is only going to take a few minutes, and in it we address a question that someone's asked me in the comments, or sent me in an email, or asked when I just ran into them in the hallway and they said, "Hey, I've got a question." So that's what StatQuickies are for.
For this very first one, we're going to talk about what a good threshold for significance is. People often ask me: is 0.05 the best threshold for significance? Should I use it for everything, all the time, no matter what? Or are there exceptions? How do I deal with this?
Okay, so the first thing I'm going to talk about is that the original threshold of 0.05 was randomly selected. Well, not necessarily randomly, but it's relatively arbitrary. There's no biological or natural reason that 0.05 is the optimal threshold for significance. It's the one that we use mostly in science, in publications and whatnot. We've used it for a long time, and for the most part it does an okay job. Lately there have been some high-profile articles about people doing p-hacking and kind of getting away with murder in terms of statistics, but for the most part 0.05 has been sort of a good way to hedge the bets: five percent of the time you're wrong, but 95 percent of the time you're right, and that seems to be a cost-effective threshold for most of, at least, biological science.
Okay, so now let's talk about what good thresholds are, other than 0.05. If you've got data that you're going to publish, you've done your analysis and you're going to publish it, what's a good threshold for that? Well, almost always 0.05 is a great threshold for that, because that's what the editors are expecting and that's what the reviewers are expecting. If you can go below that, that's awesome. But there are exceptions to that. Now, before you run off, let's listen to all those exceptions, because sometimes it needs to be smaller, and we're going to get to that in just a second.
Okay, so here's another threshold for significance we're going to talk about: the one for when you're not necessarily going to publish the data. Say you've downloaded a dataset from the internet, you've got a hypothesis that seems reasonable, you've tested that hypothesis, and the p-value is not below 0.05; it's a little bit higher. Is that a bad thing? Well, it depends on what you're going to do. If you're going to publish immediately, yes, you want it to be lower than 0.05. But if instead you're just using this limited dataset that you got off the internet for free, and you're just trying to test some basic hypothesis that you're later going to confirm or validate using alternative methods, well, in that case, who cares what the p-value is? As long as it's kind of small, that's good enough. You don't need a massively tiny p-value when you're just exploring some dataset that you found and generating new hypotheses that you're later going to test and validate using additional data and additional methods. So if that's the case, don't feel like you have to stick to 0.05.
Okay, now here's another thing you need to think about. Say you've done your... looks like I just got a text message. Next time I do this I'm going to turn off alerts, I guess, or something like that. Okay, so what's the next thing we need to talk about? We need to talk about the effect size. Say you did an R-squared (I've got a whole StatQuest on R-squared, so check that out, because I go into this in detail), and your R-squared value isn't very good. You've got an R-squared of 10 percent, which means that 10 percent of the variation in the data is explained by whatever it is you're interested in. That's not very much. So even if you have a tiny, tiny, tiny p-value, it doesn't actually matter. In certain circumstances maybe that's good enough, but generally speaking you want a good R-squared, you want a good correlation, you want whatever you're studying to explain the data. So if you have a tiny p-value and a tiny effect size, and not much explanation of what's going on, then who cares how small your p-value is? I've seen these things in publications where the p-value is 0.000000 and the R-squared is horrible, and I say, who cares? So you want to have a decent effect size. A small p-value is not enough.
Okay, one last thing: extraordinary claims need extraordinary data. Say you've done a great experiment proving that there are extraterrestrials flying around New York City in UFOs. That's an extraordinary claim. You've done your thing and you've got a p-value less than 0.05; it's 0.047. It's less than that threshold of significance. Is that good enough? Absolutely not. Extraordinary claims need extraordinary data. If you're going to go out and publish a thing saying extraterrestrials are flying around New York City in UFOs, your p-value had better be so crazy small it's unbelievable, and then people say, "Well, that p-value is so small I can't even believe it; therefore I have to believe that there are extraterrestrials flying around New York City in UFOs." Okay, that's it.
So, to summarize: if you're publishing your p-values, you want a small p-value. A p-value threshold of 0.05 is okay, unless the thing you're studying only explains a little bit and your R-squared is kind of crummy; you want a good R-squared. So 0.05 is fine if you've got a good correlation. If you're just doing statistical stuff to explore something that you're going to validate later on, who really cares what your threshold is? You're just trying to find something of interest, so don't worry about 0.05. And the other thing is that extraordinary claims require extraordinary data. If you're going to say that there are extraterrestrials flying around New York City in UFOs, you'd better have a darn small p-value. All right, tune in next time for our next StatQuickie.
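The effect-size point in the transcript can be sketched in a few lines of Python. This is an illustrative simulation, not something shown in the video: it generates data whose true R-squared is about 10 percent, then shows that with enough samples the p-value for the correlation is still extremely small. The sample size, the slope of 1/3, and the helper names `pearson_r` and `approx_p_value` are assumptions, and the p-value uses a normal approximation to the t distribution, which is only reasonable here because the sample is large.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Approximate two-sided p-value for testing correlation = 0.

    Uses the usual t statistic for a correlation and a normal
    approximation to the t distribution (an assumption that suits large n).
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000                # hypothetical sample size

# y = x/3 + noise: the signal variance is 1/9 and the noise variance is 1,
# so the true R-squared is (1/9) / (1/9 + 1) = 0.10.
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [x / 3.0 + rng.gauss(0.0, 1.0) for x in xs]

r = pearson_r(xs, ys)
r_squared = r * r
p_value = approx_p_value(r, n)

print(f"R-squared: {r_squared:.3f}")
print(f"Approximate p-value: {p_value:.3g}")
```

In this sketch the p-value comes out far below 0.05 even though roughly 90 percent of the variation in `ys` is left unexplained, which is why a small p-value on its own says little about how much the variable of interest actually matters.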
Original Description
People often ask me what a good threshold is for statistical significance. The answer is always, "It depends!"
For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/
If you'd like to support StatQuest, please consider...
Patreon: https://www.patreon.com/statquest
...or...
YouTube Membership: https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw/join
...buying one of my books, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
https://statquest.org/statquest-store/
...or just donating to StatQuest!
https://www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
https://twitter.com/joshuastarmer