3 Main Types of Missing Data | Do THIS Before Handling Missing Values!

AI For Beginners · Beginner · 📰 AI News & Updates · 1y ago

Key Takeaways

Understanding the 3 main types of missing data in datasets

Full Transcript

Hi, missing values can be a real headache. I always feel anxious when I see them, because you never know the actual value of a missing data point. Proper handling of missing data requires adequate knowledge of statistics and careful attention to detail. There are tons of methods for handling missing data, so how do you choose the right one? Remember, before looking into the methods for handling missing data, you need to understand why the values disappeared. In this video, we will go through the different types of missing data and provide examples of how to distinguish them.

Missing data comes in three main types: missing completely at random, missing at random, and missing not at random.

Missing completely at random is the best scenario. It occurs when the omitted observations are randomly missing, meaning the missingness is entirely unrelated to the data, whether observed or missing. For example, in a medical study, suppose the height measurements of some patients are not observed because the measuring device occasionally malfunctioned at random. In this case, the missing values have nothing to do with patient characteristics like age, weight, or health status, or with the height variable itself. So, why is this type the most preferred? Because the missingness is totally random. Even if you decide to remove those observations, the remaining data is still a random, unbiased subset of the entire sample.

However, let's imagine another medical scenario where we have partially missing doctor visiting frequency data, and the missingness is influenced by the wealth or income of the patient. Poor patients can't afford to visit doctors frequently, so there are often missing values for low-income patients. This case is called missing at random. In other words, the missingness is related to observed data, that is, to another variable, rather than to the missing value itself. We have a higher probability of missing doctor visiting frequency data for poor patients.
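To make the difference between the first two types concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration and does not come from the video: the synthetic patients, the rule that visit counts rise with income, the 30% random loss rate, and the 60% versus 5% loss rates below and above an income of 40,000. It simply compares the average of the values that survive with the true average.

```python
import random
import statistics

random.seed(0)

# Hypothetical synthetic patients: income is observed for everyone, and
# yearly doctor visits rise with income in this toy setup (an assumption).
n = 10_000
incomes = [random.uniform(10_000, 100_000) for _ in range(n)]
visits = [2 + income / 20_000 + random.gauss(0, 1) for income in incomes]

true_mean = statistics.mean(visits)

# MCAR: every visit count has the same 30% chance of being lost,
# regardless of income or of the visit count itself.
mcar_kept = [v for v in visits if random.random() >= 0.3]


def mar_missing_prob(income):
    """MAR: the chance of a missing visit count depends only on observed income."""
    return 0.6 if income < 40_000 else 0.05


mar_kept = [
    v for v, income in zip(visits, incomes)
    if random.random() >= mar_missing_prob(income)
]

mcar_mean = statistics.mean(mcar_kept)
mar_mean = statistics.mean(mar_kept)

print(f"true mean of visits:        {true_mean:.2f}")
print(f"mean after MCAR + deletion: {mcar_mean:.2f}")
print(f"mean after MAR + deletion:  {mar_mean:.2f}")
```

In this toy setup, deleting rows under MCAR leaves the average close to the true one, while deleting rows under MAR pushes the average up, because low-income patients (who visit less often here) drop out more frequently. The income column is still fully observed, which is what makes the gap explainable.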
Now, having the income data, we may be able to predict the frequency and fill in the missing values.

The last type is missing not at random, when the reason for the missingness is directly related to the value of the missing data itself. This creates a problem because you can't fully predict or explain the missingness using other variables. For example, some wealthy people may not want to disclose their income, so it simply isn't recorded. In this case, the missing income is related to the income variable itself. This may lead to underestimating the average income if those with higher incomes are the ones not reporting. If you just remove the missing points, the remaining data may no longer be a random sample, since you may have accidentally removed most of the wealthy people from the dataset. Thus, you need to be super careful when dealing with data that is missing not at random.

We will cover the methods for handling missing values for all three types in future videos, so I recommend following us for more. If you want to learn more about artificial intelligence, subscribe to our channel to stay up to date with new videos. Press the like button and let's discuss AI in the comments section.
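The underestimation described for missing not at random can be sketched with a small Python simulation. The log-normal income distribution, the 90th-percentile cutoff, and the 80% versus 5% non-reporting rates are all hypothetical choices made for illustration, not figures from the video.

```python
import random
import statistics

random.seed(1)

# Hypothetical right-skewed incomes (the log-normal parameters are assumptions).
n = 10_000
incomes = [random.lognormvariate(10.5, 0.6) for _ in range(n)]

# Cutoff for "high income": the value at the 90th percentile of the sample.
threshold = sorted(incomes)[int(0.9 * n)]


def is_reported(income):
    """MNAR: the chance that income goes unrecorded depends on the income itself."""
    p_missing = 0.8 if income > threshold else 0.05
    return random.random() >= p_missing


reported = [income for income in incomes if is_reported(income)]

true_mean = statistics.mean(incomes)
reported_mean = statistics.mean(reported)

print(f"true mean income:     {true_mean:,.0f}")
print(f"reported mean income: {reported_mean:,.0f}")
print(f"share of rows kept:   {len(reported) / n:.1%}")
```

Here the average of the reported incomes falls below the true average, because the highest earners are the ones most likely to be missing. Unlike the missing at random case, no other column in this sketch explains who dropped out, which is why simply deleting the rows is risky.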

Original Description

#ai #ml #datascience #data #machinelearning #artificialintelligence

🔥 This video covers the three main types of missing values: missing completely at random, missing at random, and missing not at random. Before moving to the missing value handling step, you need to understand where the missing values are in the dataset and why they disappeared. You can proceed to handling them once you understand the statistical effect of the missing data points on your analysis. What if you mistakenly delete important information, which can lead to an underestimation? We bring valuable examples to clearly explain the main differences among these three categories.

Remember, missing completely at random occurs when the missingness is completely random and is not related to the observed data. Missing at random refers to missing values whose missingness is related to the observed data. Missing not at random is the most problematic. In that case, the reasons are tied to the characteristics of the missing data itself, making it difficult or impossible to directly infer or predict what those missing values might be based solely on the observed data.

🔍 Key points covered: 0:00 - Introduction. 0:28 - In this video... 0:34 - Types of missing data. 0:42 - Missing Completely at Random. 1:26 - Missing at Random. 2:03 - Missing Not at Random. 2:47 - Subscribe to us!

🔔 Don't forget to like, subscribe, and hit the bell icon to stay updated with our latest videos!

🤖 Note that we use synthetic generations, such as AI-generated images and voices, to enhance the appeal and engagement of our content.

🌐 If you have any questions or topics you want us to cover, leave a comment below. Also, share your thoughts about the content: how do you think we can make it better? Thanks for watching!

Playlist

Uploads from AI For Beginners · AI For Beginners · 19 of 32

1 Artificial Intelligence Explained In Simple Words | What Is AI? | Explained On A Real World Example!
2 AI vs. ML vs. DL vs. DS - Difference Explained | On Real World Examples | AI For Beginners
3 Types Of Machine Learning Algorithms | Explained On Real World Examples | ML For Beginners
4 Best AI Music Generator | Music Generation Tool for FREE | MusicGen developed by Meta AI
5 The Ultimate Guide To Supervised Learning | Explained On Binary Classification Example | Part 1
6 The Ultimate Guide To Supervised Learning | Classification And Regression | Part 2
7 Linear Regression Explained | A Beginner's Guide To Regression | The Basics You Need to Know!
8 Assumptions Of Linear Regression | What To Do If The Assumptions Do Not Hold? | Part 1
9 Checking The Assumptions Of Linear Regression | Statistical And Visual Methods | Part 2
10 The Purpose of Train-Test Split in Machine Learning | How to Correctly Split Data?
11 The Role of Validation Sets in Model Training | Train-Test-Validation Splits | Clearly explained!
12 Overfitting and Underfitting | Bias and Variance Tradeoff in Machine Learning | Clearly Explained!
13 Gradient Descent Explained | How Do ML and DL Models Learn? | Simple Explanation!
14 Main Types of Gradient Descent | Batch, Stochastic and Mini-Batch Explained! | Which One to Choose?
15 The Role of Loss Functions | Most Common Loss Functions in Machine Learning | Explained!
16 How to Evaluate Your ML Models Effectively? | Evaluation Metrics in Machine Learning!
17 8 Best Tips For Cleaning Your Data | Data Cleaning | Machine Learning, Data Preparation.
18 Numerical vs. Categorical Data | Represent Your Dataset Correctly!
19 3 Main Types of Missing Data | Do THIS Before Handling Missing Values!
20 7 PROVEN Strategies To Become An AI Engineer (2025 Updated)
21 Easiest Guide to K-Fold Cross Validation | Explained in 2 Minutes!
22 Normalization and Standardization | Why to Scale the Features? | ML Basics
23 The Ultimate Guide to Hyperparameter Tuning | Grid Search vs. Randomized Search
24 How is Artificial Intelligence different from Traditional Programming?
25 All Machine Learning Models Clearly Explained!
26 6 Mistakes to Avoid When Learning Machine Learning in 2025
27 Best Practices for Effective Data Visualization In Machine Learning!
28 Central Limit Theorem Intuition Explained Like You're 5!
29 Which Door Would You Choose? | Monty Hall Problem Explained!
30 All Machine Learning Concepts Explained in 18 Minutes!
31 What’s the Probability That Two Randomly Drawn Chords in a Circle Intersect?
32 Causation vs Correlation | The Most Confused Concept in Data Science

Chapters (7)

0:00 Introduction.
0:28 In this video...
0:34 Types of missing data.
0:42 Missing Completely at Random.
1:26 Missing at Random.
2:03 Missing Not at Random.
2:47 Subscribe to us!