Gibbs Sampling : Data Science Concepts

ritvikmath · Beginner ·📰 AI News & Updates ·5y ago

Key Takeaways

Introduces Gibbs sampling for multivariate distributions using a two-dimensional normal distribution example

Full Transcript

[Music]

Hey everyone, welcome back. Today we'll be talking about another MCMC method called Gibbs sampling. I think this video will be pretty short; I just have a couple of things to say on Gibbs sampling.

First off, why would you use Gibbs sampling? It only really makes sense when you're sampling from a multivariate distribution. In most of our past videos, just to keep the examples simple, we've been sampling from single-dimensional distributions where there's only one variable. Gibbs sampling is useful when the distribution you're trying to sample from has two or more dimensions. We're going to be working with the easiest such case today, a two-dimensional distribution. Just to keep things concrete, our goal is to sample from the two-dimensional normal (or Gaussian) distribution with mean (0, 0) and this pretty simple covariance matrix. I'll say off the bat that there are known ways to sample from this distribution that are not Gibbs sampling, but we're going to keep things simple and assume that we're using Gibbs sampling today, just to show you how Gibbs sampling actually works in practice.

If we were able to sample from this distribution, we would get some kind of plot like this. There's a high density around the mean, which is (0, 0) for x and y, and the distribution is tilted like this because of these one-halves in the covariance matrix. We can also show that the correlation between the x and y variables is one half, so you get a distribution that looks like that.

So the first condition for using Gibbs sampling is that you want to sample from a multivariate distribution. Now, what is the second condition for knowing you should use Gibbs sampling? This is the most important condition. Sampling from the joint distribution p(x, y), which here would be the joint PDF of the two-dimensional normal distribution, is something we're going to say is difficult. You may have the equation for it, you might not have the equation for it, but either way, sampling from that joint distribution, getting a pair of x and y simultaneously, is difficult. What is easy is sampling from the conditional distributions. By conditional distributions I mean the distribution of x given a fixed value of y, and also the distribution of y given a fixed value of x. As you ramp up the number of dimensions in your distribution, to three, four, ten dimensions, we're assuming all these conditional densities (the density of the first variable given the others, the density of the second variable given the others, and so on) are relatively easy to sample from. Each of those is a single-variable distribution: one variable, holding all the other variables fixed. So that is the first thing to get in your mind: we use Gibbs sampling for multivariate distributions exactly when sampling from the joint distribution is tricky or impossible, but we can easily sample from all the conditionals.

That raises the question: what are the conditional distributions, x given y and y given x, for this particular example? I won't derive it for you here, but we can show that if you're sampling x given some fixed value of y, then it's going to be a normal distribution whose mean is rho, the correlation between x and y, times that fixed value of y, and whose variance is 1 minus rho squared. For us, since rho is equal to one half, as we just said before, this simplifies to a normal distribution with mean y over 2 and variance 3/4. In simpler terms, what that's saying is that if you have a fixed value of y and you want to sample x, then you can sample from the single-variable normal distribution with mean y over 2 and variance 3/4. And since this whole problem is symmetric, the conditional distribution of y given x looks exactly the same, just substituting x for y.

So Gibbs sampling proceeds as follows; it's an extremely simple algorithm. We start by initializing some x naught, y naught. That can be anywhere on the x-y plane, preferably somewhere that's sort of close to the center of the distribution, but it could really be anywhere; it's just a matter of how fast it's going to converge. The next thing we do is change x. So this was our first sample, and for our next sample we're going to keep the y variable fixed for now and sample the new value of the x variable from this conditional distribution, which is the new value of x given the existing value of the y variable, which is y0. Then the next thing we do is sample a new value for the y variable, y1, given a fixed value of the x variable, namely the one that we just sampled in step two. So basically what's happening is that we get a new x by sampling given the existing value of y, then we get a new y by sampling given that new value of x, and then we just rinse and repeat for as many samples as you would like.

It's really nice because we can see this visually, at least for the 2D case, in this chart here. Let's say this is your first sample, x naught, y naught. We said that we're going to sample a new value for x but keep the current value of y fixed; that's equivalent to just moving somewhere in the x direction. So this is our next sample. Then, to get the next sample after that, we're going to swap: we keep the value of x fixed and sample a new value for y. Then we just swap again: we sample a new value for x, keeping y fixed, and we continue on and on like that as many times as you want. Even though I won't prove it, if you want to prove that Gibbs sampling works, it's actually even easier than proving that Metropolis-Hastings works; you can just use the detailed balance condition again. What you'll find is that if you take enough of these samples, it's going to be exactly like sampling from this multivariate distribution here. That is, you're going to get a lot of samples around here and you'll get fewer samples around the tails of the distribution.

So that's Gibbs sampling in a nutshell, and you can extend this to as many variables as your distribution has. It's just that you don't have two steps here; instead, I sample the first variable given fixed values for the others, then I sample the next variable given fixed values for the others, and you just keep going. Gibbs sampling is pretty simple, and there are a lot of variants of it. Sometimes people do this sampling in order, sometimes people do the sampling randomly, and sometimes people even sample blocks of variables given blocks of other variables. So there are a lot of directions you can go with this, but the general guiding principle of Gibbs sampling is that the conditional distributions are easy to sample from for the problem at hand, but the joint distribution is not.

The last thing I'll say in this video is just some pitfalls, some places where Gibbs sampling doesn't work out the way you expect. The first one is this very contrived case here, where you have just zero and one in the y direction and zero and one in the x direction. There's a one-half probability at (0, 0), there's a one-half probability at (1, 1), and there's zero probability at the other two points. You can probably already see the issue. Let's say I start off at (0, 0). Because of the way Gibbs sampling works, I can only go in the x direction or in the y direction, because of this principle of trading off the x and y directions. But you see the problem immediately: if I'm going in the x direction, I can't go here, because there's no probability there, so I'm going to have to stay here. If I go in the y direction, it's the exact same issue: I can't go here, so I'm staying here. So I can never actually sample (1, 1), because I can't get there in one step. Okay, so that's one of the shortcomings of Gibbs sampling.

Another one is this phenomenon called probability spikes. That is totally a term I just made up, so please don't write it in any official report. What I mean is that you have a distribution where there is a spike in probability. For example, consider this 2D distribution. This little green dot here is where I'm saying there's a lot of probability; there's a very high probability density there. Everywhere else in this distribution I've marked L's, which means there's a very low density there. Let's think about the issues that we get using Gibbs sampling here. Let's say we're currently in a low region. Again, we can only sample in the x direction or the y direction, which means we're probably going to be in a low region again. That's exactly the first part of the problem: if we're in a low region, then because we can only move in the x or y direction at any one time, we're going to stay in these low-probability regions for a long time. Conversely, if we are in the high-density bubble, think about moving in the x direction: you're probably going to stay in the high-density bubble, because in the x direction there are no other high-density areas. In the y direction, too, you're going to stay in the high-density bubble. So although Gibbs sampling will work theoretically, it's going to take infeasibly long to converge to the actual distribution, because you're going to stay in lows and you're going to stay in highs. So this is one of the shortcomings too.

Anyway, that was just Gibbs sampling in a nutshell. If you have any questions, please leave them in the comments below, please subscribe for more videos just like this, and I will see you next time.
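The first pitfall from the transcript, the two-point distribution with probability one half at (0, 0) and at (1, 1), can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the video: the function name `gibbs_two_point`, the `joint` array layout, and the seed are assumptions made for the example.

```python
import numpy as np

# Joint pmf on {0, 1} x {0, 1}, indexed as joint[x, y]:
# P(0, 0) = P(1, 1) = 1/2 and zero probability at (0, 1) and (1, 0).
joint = np.array([[0.5, 0.0],
                  [0.0, 0.5]])

def gibbs_two_point(n_sweeps, x0=0, y0=0, seed=0):
    """Hypothetical helper: run Gibbs sweeps on the two-point distribution."""
    rng = np.random.default_rng(seed)
    x, y = x0, y0
    samples = []
    for _ in range(n_sweeps):
        # Conditional P(x | y): the column for the current y, renormalized.
        p_x = joint[:, y] / joint[:, y].sum()
        x = rng.choice(2, p=p_x)
        # Conditional P(y | x): the row for the freshly sampled x, renormalized.
        p_y = joint[x, :] / joint[x, :].sum()
        y = rng.choice(2, p=p_y)
        samples.append((int(x), int(y)))
    return samples

samples = gibbs_two_point(1_000)
print(set(samples))  # prints {(0, 0)}: the chain never leaves its starting point
```

Starting from (1, 1) instead gives the mirror image: every sample is (1, 1). Each conditional puts all of its mass on the current state, so no single-coordinate move can cross between the two points.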

Original Description

Another MCMC Method. Gibbs sampling is great for multivariate distributions where conditional densities are *easy* to sample from.

To emphasize a point in the video:
- First sample is (x0,y0)
- Next sample is (x1,y1)
- Next sample is (x2,y2)
...

That is, we update *all* variables once to get a new sample.

Intro MCMC Video : https://www.youtube.com/watch?v=yApmR-c_hKU
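The convention in the description (one new sample per full update of all variables) might look like the following Python sketch for the video's example. Several things here are assumptions rather than details from the source: the covariance matrix is taken to be [[1, 1/2], [1/2, 1]], which is consistent with the stated correlation of one half and conditional variance of 3/4; the function name `gibbs_bivariate_normal`, the seed, and the burn-in length of 1,000 are illustrative choices.

```python
import numpy as np

def gibbs_bivariate_normal(n_samples, rho=0.5, x0=0.0, y0=0.0, seed=0):
    """Hypothetical Gibbs sampler for a zero-mean bivariate normal with
    unit variances (an assumption) and correlation rho."""
    rng = np.random.default_rng(seed)
    cond_sd = np.sqrt(1.0 - rho**2)  # conditional std dev; variance 3/4 when rho = 1/2
    samples = np.empty((n_samples, 2))
    x, y = x0, y0
    for i in range(n_samples):
        x = rng.normal(rho * y, cond_sd)  # x | y ~ N(rho * y, 1 - rho^2)
        y = rng.normal(rho * x, cond_sd)  # y | x ~ N(rho * x, 1 - rho^2), using the new x
        samples[i] = (x, y)               # record only after BOTH variables are updated
    return samples

samples = gibbs_bivariate_normal(20_000)
kept = samples[1_000:]                # discard an (assumed) burn-in period
print(kept.mean(axis=0))              # should be close to (0, 0)
print(np.corrcoef(kept.T)[0, 1])      # should be close to 0.5
```

The intermediate point (x1, y0) is never stored; only the pairs (x1, y1), (x2, y2), and so on are kept, which matches the list above.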