Causal Inference | Answering causal questions

Shaw Talebi · Beginner · 📄 Research Papers Explained · 4y ago

Skills: Research Methods 90% · Reading ML Papers 80% · Paper Reproduction 70%

Key Takeaways

The video discusses causal inference, which aims to answer questions involving cause and effect. It covers concepts such as causal models, the do-operator, and the rules of do-calculus, and works through an example using the Microsoft DoWhy library in Python.

Full Transcript

Hey folks, welcome back. This is the second video in a three-part series on causality. In this video I'll be talking about causal inference, which aims at answering questions involving cause and effect. I'll start by giving an introduction to causal inference and sketching some big ideas, and then I'll finish with a concrete example with code using the Microsoft DoWhy library in Python. So with that, let's get into the video.

Okay, so here we're talking about causal inference, which aims at answering questions about cause and effect. Given a causal model (here we have a directed acyclic graph, which I talked about in the previous video), how can we estimate causal effects? For example, how can we estimate the effect of X on Y? Some examples of questions that fall under the umbrella of causal inference are: Did the treatment directly help those who took it? Was it the marketing campaign that led to increased sales this month, or the holiday? How big of an effect would increasing wages have on productivity? These are very practical and significant questions that may not be so readily answered using traditional means, and I'll try to highlight what causal inference is good at through what I call the three gifts of causal inference.

The first gift is the do-operator. The do-operator simply simulates a physical intervention, and we're all familiar with interventions in the real world. This is like when your friend's candy habit gets completely out of control and you just have to sit him or her down and say, "This has got to stop." This is what the do-operator does, but for a causal model. In other words, it is a mathematical representation of an intervention. Suppose we have this model on the left: Z causes X, which causes Y. What an intervention in X looks like in this mathematical representation is that we delete all the incoming edges into X and manually set X to some predetermined value, say x naught.

A significant contribution from Judea Pearl and colleagues is the rules of do-calculus. What these rules provide is a way to translate probabilities that include the do-operator into probabilities that do not include the do-operator. The power of this is that often we can't perform interventions in the real world. This could be because it's physically impossible, or unethical, or whatever reason. For example, intervening in someone's height by making them taller to measure the response in basketball ability is not physically feasible, and intervening in smoking by forcing someone to smoke a pack of cigarettes every day to measure the response in the risk of lung disease is unethical. In other words, often in the real world we have no way to collect data about the interventional probability distribution. That is, we don't have access to data about probabilities that include the do-operator. In these situations, the rules of do-calculus may provide a way to re-express, to rewrite, probabilities that we are interested in but can't measure directly.

The second gift of causal inference is clarifying this notion of confounding. Confounding, at least for me, was something that did not have a clear definition until I read Judea Pearl's book, The Book of Why. In his book, Pearl defines confounding as anything that makes the interventional distribution different from the observational distribution. In other words, anything that makes the probability of Y given an intervention in X different from the probability of Y given an observation of X. This is easy to see in the three-variable case. Here we have an example of a causal model which shows the relationship between age, education, and wealth. In this example, age is the confounder, and this can be understood as age being a common cause of education and wealth, which is an idea that's been around for a while. As Pearl discusses in his book, many people took this kind of common-cause definition as a definition for confounding. But by defining confounding in terms of the interventional versus observational distribution, it becomes much easier to generalize this notion to much more than just three variables.

Okay, so what does this mean practically? If we know age is a confounder, this can help inform our analysis of data that we might collect on these three variables. Suppose we have this data here of age, education, and income, and we want to assess the impact of education on income. If we don't take into consideration age being a confounder, the naive thing to do would be to just partition the data into two subgroups, where one group has just a high school education and the other group has a college education, and just compare their difference in income. But since age is a confounder, this wouldn't give you the best result. Knowing what the confounders of your problem are allows you to perform this analysis a different way. In this specific case, since age is a confounder, we shouldn't compare data between age groups; we should compare data within age groups. That's what I'm showing here. You can imagine this single dataset being split off into four separate datasets, where the blue dataset is people in their 20s, the yellow dataset is people in their 30s, people in their 40s are in red, and people in their 50s are in green. Then we repeat the analysis I was talking about before, where we compare the incomes of people with just a high school education versus a college education.

So you may ask: why do we care about this do-operator? Why do we need to talk about interventional probabilities versus observational probabilities, and so on? Ultimately, what these tools provide is a way to estimate causal effects. A causal effect is a way to quantify the causal impact that one variable has on another, and this is a core part of causal inference. This is what we were naturally doing in the previous slide when we were trying to assess the impact of education on income. What we were really doing is quantifying the causal effect that education had on people's incomes. But this is obviously applicable to other situations, when we ask questions like: Would productivity be increased if we increase wages? How would sales change if we increase the marketing budget? With these questions and several more, we're talking about causal effects. What is the causal impact of wages on productivity? What is the causal effect of marketing spend on sales?

Looking at the same example as before, we have a causal model including age, education, and wealth. We know from the previous slide that age is a confounder because it creates a discrepancy between the interventional and observational probability distributions. We can consider education to be a treatment and wealth to be a response to that treatment. Then suppose, with this causal model in hand, we collect some data very similar to what we were talking about in the previous slide. Now we're set up to do causal inference. We're set up to ask, and attempt to answer, a question involving cause and effect. A question might be "Is grad school worth it?", which might be something someone watching this video is thinking about, or something someone is reminiscing upon, wishing they had known about causal inference before deciding to go to grad school, when they're already waist deep into it. Either way, one way to frame this question of "Is grad school worth it?" could be: what is the treatment effect of education on wealth? I'm not saying this is the best way to do it, but this is a way we can do it.

I'll use this opportunity to run through a concrete example with code in Python. The example code is at the GitHub link at the bottom here; I also put the link in the description. Basically, here we're going to estimate the treatment effect of education on income. First we download some libraries and load some data. This is real census data from the UCI (University of California, Irvine) Machine Learning Repository. I don't know the specific name, but here is the link. To do the causal effect calculation, I use the DoWhy library, which is a Microsoft library for doing causal inference.

The next step is that we have to define our causal model. Again, the starting point of all causal inference is a causal model, so we need to start with our DAG, which is the same as we saw in an earlier slide, just that education now has a new name, "has graduate degree", and income has a different name, "greater than 50k". These are both Boolean variables, which means they're true-or-false variables: either someone has a graduate degree or they don't, and either they make more than $50,000 a year or they don't. Age is just an integer. Next we need an estimand, which is basically a recipe for estimating our causal effect. You can do this in one line using the DoWhy library. Then, finally, we can estimate the causal effect. Here we're using a T-learner, which is a type of meta-learner. I can link a paper talking about meta-learners in the description. I won't jump into all the details; I'll just jump to the result, which is that the average causal effect is 0.2. One way to interpret this is that having a graduate degree increases your chances of making more than $50,000 a year by 20%.

However, we had a lot of samples in this dataset, and we've just reduced all those samples to a single number, the average, which may not always be the most representative number. So it's always good to plot the distribution. When we plot the distribution (here we have the causal effect on the x-axis, and the y-axis is the count, the number of records or people that had that individual causal effect), we see that the distribution is not Gaussian. If the distribution is not Gaussian, that means the average is not a very representative number for that distribution. In other words, even though a lot of people had a 0.2 treatment effect, there were also a significant number of people that had no treatment effect. So it seems we're no closer to answering the question of "Is grad school worth it?" However, one thing one could do is dive into these different cohorts: look at the samples that had no causal effect from a graduate degree, and then look at the people that had a significant causal effect. Then you can start to answer questions like what kinds of people benefit from a graduate degree and what kinds of people don't, and maybe that can help you answer this question.

So again, the code's on the GitHub. Feel free to take it, run with it, do whatever you want, extend the analysis further, post your own YouTube video about it. I'll be really interested to see if anyone actually takes a look and tries to answer this question of "Is grad school worth it?", but I guess it's a little too late for me at this point.

So that was the second video in the three-part series on causality. We talked about causal inference, which aims at answering questions involving causality. However, the starting point of all causal inference is a causal model, which may not be so easy to have in hand. That's where the topic of the next video can be helpful, which is causal discovery, and that aims at obtaining causal structure from data alone. If you enjoyed this video, consider liking, subscribing, sharing, and commenting your thoughts. I'm always happy and interested in reading the comments. Check out the blog if you want to get some more details on causal inference, and check out the GitHub to get the example code talked about in this video. Thanks for watching.
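The within-age-group comparison described in the transcript can be sketched in a few lines of plain Python. This is a minimal illustration and not the code from the video's GitHub repository: the records are invented toy data, and the names `records`, `naive_difference`, and `stratified_difference` are hypothetical. It assumes age group is the only confounder and that every age group contains both education levels.

```python
from collections import defaultdict
from statistics import mean

# Toy records: (age_group, has_college, income_in_thousands).
# Invented for illustration; older people here are both more educated and richer.
records = [
    ("20s", False, 30), ("20s", False, 32), ("20s", False, 34), ("20s", True, 40),
    ("50s", False, 60), ("50s", True, 68), ("50s", True, 70), ("50s", True, 72),
]

def naive_difference(rows):
    """Mean income of the college group minus mean income of the non-college group."""
    treated = [income for _, college, income in rows if college]
    control = [income for _, college, income in rows if not college]
    return mean(treated) - mean(control)

def stratified_difference(rows):
    """Compare within each age group, then average weighted by group size."""
    by_age = defaultdict(list)
    for row in rows:
        by_age[row[0]].append(row)
    total = len(rows)
    return sum(len(group) / total * naive_difference(group)
               for group in by_age.values())

naive = naive_difference(records)
adjusted = stratified_difference(records)
print(f"naive difference:        {naive:.1f}")
print(f"age-adjusted difference: {adjusted:.1f}")
```

On this toy data the naive difference is 23.5, while the age-adjusted difference is 9.0. The gap arises because most of the college-educated records sit in the older, higher-income group, which is the kind of distortion a confounder introduces.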

Original Description

🤝 Work with me: https://aibuilder.academy/yt/PFBI-ZfV5rs
🚀 Ship AI apps in weeks, not months: https://aibuilder.academy/courses/yt/PFBI-ZfV5rs

The second video in a 3-part series on causality. In this video I discuss key ideas from causal inference, which aims at answering questions about cause and effect. I finish with a concrete example with code of doing causal inference in Python.

Series Playlist: https://www.youtube.com/playlist?list=PLz-ep5RbHosVVTz9HEzpI4d6xpWsc8rOa
📰 Read more: https://medium.com/towards-data-science/causal-inference-962ae97cefda?sk=d68d5191fdb00d3fee47aaa43ed48f3d
💻 Example code: https://github.com/ShawhinT/YouTube-Blog/tree/main/causality/causal_inference

Resources:
- The Book of Why by Judea Pearl: https://www.amazon.com/Book-Why-Science-Cause-Effect/dp/046509760X
- Do-calculus: https://arxiv.org/abs/1210.4852
- Metalearner paper: https://www.pnas.org/content/116/10/4156

Introduction - 0:00
Causal Inference - 0:28
3 Gifts of Causal Inference - 1:13
Gift 1: Do-operator - 1:20
Gift 2: Confounding (deconfounded) - 3:22
Gift 3: Causal Effects - 5:51
Example: Treatment Effect of Grad School on Income - 8:05
Closing remarks - 11:12

This video teaches the basics of causal inference, including key concepts like confounding, causal models, and causal effects, and provides a practical Python example that uses the DoWhy library with a T-learner.

Key Takeaways

Compare data within age groups instead of between age groups
Quantify the causal effect that education has on people's incomes
Estimate causal effects using causal inference
Download libraries and load data
Define causal model with DAG
Estimate causal effect using the DoWhy library
Plot distribution of causal effects
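The estimation and distribution steps listed above can be illustrated with a hand-rolled T-learner. This sketch does not use DoWhy or the census data from the video: the data is invented, the base learner is a simple least-squares line written here for the purpose, and the names `fit_line` and `t_learner` are hypothetical. The idea it shows is the general T-learner recipe: fit one outcome model on the treated group, another on the untreated group, and take the difference of their predictions for every person. It assumes age is the only confounder.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns a prediction function."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    a = y_bar - b * x_bar
    return lambda x: a + b * x

def t_learner(ages, treated, outcomes):
    """Fit separate models for treated and untreated units, then return
    the per-person effect estimate mu1(age) - mu0(age)."""
    mu1 = fit_line([a for a, t in zip(ages, treated) if t],
                   [y for y, t in zip(outcomes, treated) if t])
    mu0 = fit_line([a for a, t in zip(ages, treated) if not t],
                   [y for y, t in zip(outcomes, treated) if not t])
    return [mu1(a) - mu0(a) for a in ages]

# Toy data (invented): income in thousands, by age and graduate-degree status.
ages = [20, 30, 40, 50, 20, 30, 40, 50]
treated = [False, False, False, False, True, True, True, True]
incomes = [30, 40, 50, 60, 30, 45, 60, 75]

effects = t_learner(ages, treated, incomes)
average_effect = mean(effects)
print("individual effects:", [round(e, 1) for e in effects])
print("average effect:", round(average_effect, 1))
```

Here the individual effects range from 0 at age 20 to 15 at age 50, with an average of 7.5. As in the video, the average alone hides the fact that some people show no estimated effect at all, which is why it helps to inspect the whole distribution.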

💡 Causal inference aims to estimate causal effects, quantifying the impact of one variable on another, and is a crucial part of understanding cause-and-effect relationships in data.
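For the three-variable model from the video (age as a common cause of education and wealth), a standard result known as the backdoor adjustment formula shows how an interventional probability can be rewritten using observational quantities only. The symbols below are labels chosen here, not notation from the video: Z is age, X is education (the treatment), and Y is wealth (the response). The formula assumes Z is the only confounder and is fully observed.

```latex
% Backdoor adjustment: interventional distribution from observational terms
P\bigl(y \mid do(x)\bigr) = \sum_{z} P(y \mid x, z)\, P(z)

% Average causal effect for a binary treatment X and binary outcome Y
\mathrm{ACE} = P\bigl(Y = 1 \mid do(X = 1)\bigr) - P\bigl(Y = 1 \mid do(X = 0)\bigr)
```

The sum over z is the formal version of comparing within age groups and then averaging across them, weighted by how common each age group is.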
