Gibbs Sampling : Data Science Concepts

ritvikmath · Beginner · 📰 AI News & Updates · 5y ago

Skills: ML Maths Basics (70%)

Key Takeaways

Introduces Gibbs sampling for multivariate distributions using a two-dimensional normal distribution example

Full Transcript

Hey everyone, welcome back. Today we'll be talking about another MCMC method called Gibbs sampling. I think this video will be pretty short; I just have a couple of things to say on Gibbs sampling.

First off, why would you use Gibbs sampling? It only really makes sense when you're sampling from a multivariate distribution. In most of our past videos, just to keep the examples simple, we've been sampling from single-dimensional distributions where there's only one variable. Gibbs sampling is useful when the distribution you're trying to sample from has two or more dimensions. We're going to work with the easiest such case today, a two-dimensional distribution. To keep things concrete, our goal is to sample from the two-dimensional normal (Gaussian) distribution with mean (0, 0) and this pretty simple covariance matrix. I'll say right off the bat that there are known ways to sample from this distribution that are not Gibbs sampling, but we're going to keep things simple and use Gibbs sampling today, just to show you how it actually works in practice.

If we were able to sample from this distribution, we would get some kind of plot like this. There's a high density around the mean, which is (0, 0) for x and y, and the distribution is tilted like this because of the one-halves in the covariance matrix. We can also show that the correlation between the x and y variables is one half, so you get a distribution that looks like that.

So that's the first case for when you use Gibbs sampling: you want to sample from a multivariate distribution. Now, what is the second condition for knowing you should use Gibbs sampling? This is the most important one. Sampling from the joint distribution p(x, y), which here would be the joint PDF of the two-dimensional normal distribution, is difficult. You may have the equation for it, you might not, but either way, getting a pair of x and y simultaneously from that joint distribution is difficult. What is easy is sampling from the conditional distributions. By conditional distributions I mean the distribution of x given a fixed value of y, and the distribution of y given a fixed value of x. As you ramp up the number of dimensions in your distribution, to three, four, ten dimensions, we're assuming all of these conditional densities (the density of the first variable given the others, the density of the second variable given the others, and so on) are relatively easy to sample from. Each one is a single-variable distribution: that one variable, holding all the other variables fixed. So that is the first thing to get in your mind: we use Gibbs sampling for multivariate distributions exactly when sampling from the joint distribution is tricky or impossible, but we can easily sample from all the conditionals.

Now that raises the question: what are the conditional distributions, x given y and y given x, for this particular example? I won't derive it for you here, but we can show that if you're sampling x given some fixed value of y, the conditional is a normal distribution whose mean is rho, the correlation between x and y, times that fixed value of y, and whose variance is 1 minus rho squared. For us, since rho is equal to one half, as we said before, this simplifies to a normal distribution with mean y/2 and variance 3/4. In simpler terms: if you have a fixed value of y and you want to sample x, you can sample from the single-variable normal distribution with mean y/2 and variance 3/4. And since this whole problem is symmetric, the conditional distribution of y given x looks exactly the same, just swapping the roles of x and y.

So Gibbs sampling proceeds as follows; it's an extremely simple algorithm. We start by initializing some (x0, y0). That can be anywhere on the x-y plane, preferably somewhere close to the center of the distribution, but it could really be anywhere; it's just a matter of how fast it's going to converge. This is our first sample. For our next sample, we first change x: we keep the y variable fixed for now and sample the new value of x, x1, from the conditional distribution of x given the existing value of y, which is y0. Then we sample a new value for the y variable, y1, given a fixed value of the x variable, namely the x1 that we just sampled in step two. So basically, we get a new x by sampling given the existing value of y, then we get a new y by sampling given that new value of x, and then we just rinse and repeat for as many samples as you would like.

It's really nice because we can see this visually, at least for the 2D case, in this chart here. Let's say this is your first sample, (x0, y0). We said that we're going to sample a new value for x but keep the current value of y fixed; that's equivalent to just moving somewhere in the x direction, so this is our next point. To get the point after that, we swap: we keep the value of x fixed and sample a new value for y. Then we swap again, sampling a new value for x while keeping y fixed, and we continue on and on like that as many times as we want.

I won't prove that Gibbs sampling works, but if you want to prove it, it's actually even easier than proving that Metropolis-Hastings works; you can just use the detailed balance condition again. What you'll find is that if you take enough of these samples, it's going to be just like sampling from this multivariate distribution here. That is, you're going to get a lot of samples around here and fewer samples around the tails of the distribution.

So that's Gibbs sampling in a nutshell. You can extend this to as many variables as your distribution has. It's just that you don't have two steps here; instead, you sample the first variable given fixed values for the others, then you sample the next variable given fixed values for the others, and you just keep going. Gibbs sampling is pretty simple, and there are a lot of variants of it. Sometimes people do this sampling in order, sometimes people do the sampling randomly, and sometimes people even sample blocks of variables given blocks of other variables. So there are a lot of directions you can go with this, but the general guiding principle of Gibbs sampling is that the conditional distributions are easy to sample from for the problem at hand, but the joint distribution is not.

The last thing I'll say in this video is just some pitfalls, some places where Gibbs sampling doesn't work out the way you expect. The first one is this very contrived case here, where you have just zero and one in the x direction and zero and one in the y direction. There's a one-half probability at (0, 0), a one-half probability at (1, 1), and zero probability at the other two points. You can probably already see the issue. Let's say I start off at (0, 0). Because of the way Gibbs sampling works, I can only go in the x direction or in the y direction, because of this principle of alternating between x and y. If I'm going in the x direction, I can't go here, because there's no probability there, so I have to stay where I am. If I go in the y direction, it's the same exact issue: I can't go here, so I'm staying here. So I can never actually sample (1, 1), because I can't get there by moving along one axis at a time. That's one of the shortcomings of Gibbs sampling.

Another one is this phenomenon called probability spikes. That is totally a term I just made up, so please don't write it in any official report. What I mean is that you have a distribution where there is a spike in probability. For example, consider this 2D distribution. This little green dot here is where I'm saying there's a very high probability density, and everywhere else in this distribution I've marked L's, which means there's a very low density there. Let's think about the issues we get using Gibbs sampling here. Let's say we're currently in a low region. Again, we can only sample in the x direction or the y direction, which means we're probably going to land in a low region again. That's the first part of the problem: if we're in a low region, because we can only move in the x or y direction at any one time, we're going to stay in these low-probability regions for a long time. Conversely, if we are in the high-density bubble, think about moving in the x direction: you're probably going to stay in the high-density bubble, because there are no other high-density areas in the x direction. The same goes for the y direction. So although Gibbs sampling will work theoretically, it's going to take infeasibly long to converge to the actual distribution, because you're going to stay in lows and you're going to stay in highs. So this is one of the shortcomings too.

Anyways, that was just Gibbs sampling in a nutshell. If you have any questions, please leave them in the comments below. Please subscribe for more videos just like this, and I will see you next time.
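
The contrived two-point pitfall from the transcript can be reproduced in a few lines. The following is a minimal Python sketch, not code from the video: the `JOINT` table, the function names, and the seed are assumptions made for illustration. It builds each conditional directly from the joint table and shows that a chain started at (0, 0) never reaches (1, 1).

```python
import random

# Joint pmf from the contrived example (illustrative table, not from the video's code):
# all probability mass sits on (0, 0) and (1, 1).
JOINT = {(0, 0): 0.5, (1, 1): 0.5, (0, 1): 0.0, (1, 0): 0.0}


def sample_x_given_y(y, rng):
    """Draw x from p(x | y); the weights are the joint values along the row y."""
    weights = [JOINT[(x, y)] for x in (0, 1)]
    return rng.choices((0, 1), weights=weights)[0]


def sample_y_given_x(x, rng):
    """Draw y from p(y | x); the weights are the joint values along the column x."""
    weights = [JOINT[(x, y)] for y in (0, 1)]
    return rng.choices((0, 1), weights=weights)[0]


def run_chain(start, n_sweeps, seed=0):
    """Run Gibbs sweeps (update x, then y) and return the list of visited states."""
    rng = random.Random(seed)
    x, y = start
    visited = []
    for _ in range(n_sweeps):
        x = sample_x_given_y(y, rng)  # move along the x direction only
        y = sample_y_given_x(x, rng)  # move along the y direction only
        visited.append((x, y))
    return visited


# Every state visited from (0, 0) is (0, 0) itself: the chain cannot cross over.
print(set(run_chain((0, 0), 1000)))
```

Starting the chain at (1, 1) instead gets stuck there in the same way, so no single run sees both halves of the distribution.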

Original Description

Another MCMC Method. Gibbs sampling is great for multivariate distributions where conditional densities are *easy* to sample from.

To emphasize a point in the video:
- First sample is (x0, y0)
- Next sample is (x1, y1)
- Next sample is (x2, y2)
- ...

That is, we update *all* variables once to get a new sample.

Intro MCMC Video : https://www.youtube.com/watch?v=yApmR-c_hKU
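
The sample-counting convention above can be sketched in Python for the video's two-dimensional normal example. This is an illustrative sketch under stated assumptions, not the author's code: it assumes unit variances (consistent with the conditional with mean y/2 and variance 3/4 quoted in the video), and the function names, seed, and sample count are hypothetical choices. Each loop iteration updates x and then y, and only then records one new sample.

```python
import math
import random

RHO = 0.5  # correlation from the video's example; unit variances are assumed


def gibbs_bivariate_normal(n_samples, x0=0.0, y0=0.0, rho=RHO, seed=0):
    """Gibbs sampler for a zero-mean bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho ** 2)  # conditional variance is 1 - rho^2
    x, y = x0, y0
    samples = [(x, y)]  # (x0, y0) counts as the first sample
    for _ in range(n_samples - 1):
        x = rng.gauss(rho * y, cond_sd)  # new x given the current y
        y = rng.gauss(rho * x, cond_sd)  # new y given the x just drawn
        samples.append((x, y))  # one sample per full update of all variables
    return samples


def sample_correlation(samples):
    """Pearson correlation of the x and y coordinates of the samples."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    syy = sum((y - mean_y) ** 2 for _, y in samples)
    return sxy / math.sqrt(sxx * syy)


samples = gibbs_bivariate_normal(20000)
print(len(samples), round(sample_correlation(samples), 2))
```

The sample correlation should come out close to the target value of one half, though the exact figure depends on the seed and the number of sweeps.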
