PCA explained with intuition, a little math and code

AI Coffee Break with Letitia · Beginner · 📄 Research Papers Explained · 5y ago

Skills: ML Maths Basics 80% · ML Pipelines 60%

Key Takeaways

The video explains Principal Component Analysis (PCA) for dimensionality reduction, covering the intuition, math, and code implementation using scikit-learn in Python.

Full Transcript

Hello everyone! Today we will explain how to get rid of too many dimensions, especially since in the last video we learned that too many dimensions can often be a curse. So yes, in a sense, today we learn how to lift curses.

There are many methods for dimensionality reduction, like PCA, ICA, NMF, LDA, t-SNE, UMAP, autoencoders and so on, and all are awesome, coming with their own solutions and tricks. But today we will discuss the basics by the example of PCA, which stands for principal component analysis.

In a nutshell, PCA uses some heuristics to find the most important directions in the data. Then it can discard the least important parts in order to achieve the desired dimensionality of the target space. If the data were two-dimensional, we could even visualize it easily, like this. Looking at the data, clearly some directions are more important than others. In this direction the spread is not that big, and if we clamped all points onto one line, we would not lose that much information. We would lose a lot of information if we got rid of the other direction, because there the spread is much bigger and helps a lot to differentiate the points. So the main idea of the PCA algorithm is to keep the directions of the biggest spread and throw away the rest, until we reach the desired number of dimensions d′.

But how to do that? With math, of course. But first, let's think about this intuitively. What if we approximated our data with an ellipse, basically a two-dimensional Gaussian, and found the main axes of the ellipse? These axes, which are called principal components, are exactly the directions we are looking for. We can keep the most important principal components and throw away the other ones.

But intuition does not compute stuff for us; math does. So let's see how we would do this mathematically by setting up the PCA algorithm. First, let's suppose we have our data X, which is a matrix with rows containing all our data points; the columns contain the features of these data points. Then we want to center our data, meaning that the mean of all points should be zero. How to do that? By taking the data and subtracting its mean. Done. After this simple transformation we are ready to go on.

We want to find the biggest spread, or variance, of the data, because this is the informative part, the most unpredictable part of the data. For that we compute the scatter matrix of the data, defined by this formula. And remember, we have already subtracted the mean; this is the reason why the formula for the scatter matrix is not more complicated. But what is this thing, the scatter matrix? We remember that our data X is composed of rows of d-dimensional vectors. If we take one of these rows, transpose it and do the multiplication, we get a d-by-d matrix, and of course we sum this over all our data samples i.

Why have we done this? We wanted to compute the main axes of the ellipse. Well, it turns out that the directions of the main axes are actually the eigenvectors of this scatter matrix, and each eigenvector has a so-called eigenvalue, which captures the importance of the eigenvector, so the magnitude of the spread. So this is what we do: we take the scatter matrix and compute its eigenvalues and eigenvectors. For big matrices we do it, of course, in code, like with this in Python.

Now that we have the eigenvectors, we can sort them by eigenvalue to determine their importance, and we are ready to reduce the dimension. For this we decide on how many dimensions we want to reduce to, say d′. Then we take the top d′ eigenvectors and put them into a matrix, like this. The eigenvectors are d-dimensional, since they live in the original space. What we want then is to compute the new position z of our data points in the new space, like this, meaning that we take each data point x and multiply it with this matrix V of eigenvectors. So we have x_i, a 1-by-d matrix, which is a vector, multiplied with a d-by-d′ matrix. What comes out is a 1-by-d′ matrix, which is a vector, so we get vectors in our new, reduced d′-dimensional space. Because V projects our points x from a higher-dimensional space into a lower-dimensional space of dimension d′, we call V the projection matrix. What this projection matrix actually does is rotate the data and map it to a new space with lower dimensionality than the original one. And that was it.

And the best part is, we only have to understand the idea once, because the implementation of the algorithm is already available in Python with scikit-learn. Let's have a short look at how to use it. This is one of the official scikit-learn examples, where dimensionality reduction with PCA is applied to the Iris data set, containing different flowers with four different features to differentiate between them, like for example petal widths and lengths. But we humans cannot really visualize the four dimensions unless we reduce them to at most three dimensions, and this is luckily exactly what this code does.

While the code might look dense, we will now have a gentle introduction to it, because you really have to know that the PCA-relevant lines are very few. The first Python lines are, as always, importing specific packages, like numpy for multi-dimensional arrays, matplotlib.pyplot for plotting, and of course sklearn for the machine learning part. scikit-learn is so kind as to provide us with data sets as well, so we load the Iris data and save it into the variable iris. In machine learning it is customary to save the data into a variable called X and the labels in y. We initialize the PCA model by using the class from scikit-learn. As a parameter we specify the number of principal components we want to keep, so the dimensionality of our target space d′, which is now 3. Basically, this is the line where we specify that our intention is to reduce from the original dimensionality of 4 to 3. And the cool part about scikit-learn is that you can exchange this line with many other dimensionality reduction algorithms.

But now, with PCA, we fit the model to the data X, and we do not use or need any labels y, since they are not of any use in the PCA algorithm, which is a blind signal separation technique. Fitting the model to our data has internally done all the complicated things for us: it centered the data, computed the scatter matrix, computed the eigenvectors, sorted them by eigenvalue, and formed the projection matrix with them. So now it is ready to be applied to the data to compute the new representation, which is stored here again in X. X is now three-dimensional, and we are basically done. We can use the rest of the code to make the 3D visualization that you see here at the side.

Is it really that easy? Yes, it is. Now you have no excuse not to use dimensionality reduction when needed. Do not forget to like and subscribe. Okay, bye! [Music]
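The math steps described in the transcript (center the data, build the scatter matrix, take its eigenvectors, sort by eigenvalue, project) can be sketched in a few lines of NumPy. This is a minimal illustration, not the code shown in the video: the function name `pca_project`, the toy data and the random seed are assumptions made for this example. From the verbal description, the scatter matrix of the centered data is taken to be S = Σ_i x_iᵀ x_i, which equals XᵀX when the centered points x_i are the rows of X.

```python
import numpy as np


def pca_project(X, d_prime):
    """Project the rows of X onto the top d_prime principal components.

    Hypothetical helper for illustration; X has shape (n_samples, d).
    """
    # Center the data so that every feature has zero mean.
    X_centered = X - X.mean(axis=0)

    # Scatter matrix: sum over all samples of x_i^T x_i, a d x d matrix.
    S = X_centered.T @ X_centered

    # S is symmetric, so eigh applies; it returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(S)

    # Sort eigenvectors by eigenvalue, largest spread first.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Projection matrix V (d x d_prime): the top d_prime eigenvectors as columns.
    V = eigenvectors[:, :d_prime]

    # New representation Z (n_samples x d_prime): each row is z_i = x_i V.
    Z = X_centered @ V
    return Z, V, eigenvalues


# Toy data (an assumption): 200 points in 3-D with very different spread per axis.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.2])

Z, V, eigenvalues = pca_project(X, d_prime=2)
print(Z.shape)  # (200, 2)
```

Because the columns of V are orthonormal eigenvectors of S, the summed squares of each column of Z equal the corresponding eigenvalue, which is one way to see that the eigenvalue measures the spread along that direction.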

Original Description

Say "PCA" and the dimensions go away! Dimensionality reduction with PCA (Principal Component Analysis) explained with intuition, a little math and code. If you ever wanted to know how to escape the curse of dimensionality, this video is for you! Also, learn about the curse of dimensionality in our previous video: 📺 https://youtu.be/4v7ngaiFdp4

➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/

🔥 Optionally, pay us a coffee to boost our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak

Outline:
* 00:00 The Intuition
* 02:35 The Math
* 05:52 The Code

💻 Code Source: https://scikit-learn.org/stable/auto_examples/decomposition/plot_pca_iris.html#sphx-glr-auto-examples-decomposition-plot-pca-iris-py

🔗 Links:
YouTube: https://www.youtube.com/AICoffeeBreak
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/

#AICoffeeBreak #MsCoffeeBean #PCA #MachineLearning #AI #research

This video explains PCA for dimensionality reduction, covering the intuition, math, and code implementation using scikit-learn in Python. It provides a gentle introduction to the concept and its application to the Iris dataset.

Key Takeaways

Import necessary packages, including numpy, matplotlib, and scikit-learn
Load the Iris dataset and initialize the PCA model
Specify the number of principal components to keep
Fit the model to the data and transform it to the new representation
Visualize the results using a 3D plot
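
The steps listed above can be sketched roughly as follows. This is a condensed illustration in the spirit of the scikit-learn Iris example the video walks through, not a copy of it: the helper name `plot_iris_3d` and the variable name `X_reduced` are assumptions for this sketch (the video stores the result back in `X`), and matplotlib is imported inside the helper so that the PCA part runs without a plotting backend.

```python
import numpy as np
from sklearn import datasets
from sklearn.decomposition import PCA

# Load the Iris data: 150 flowers, 4 features each.
iris = datasets.load_iris()
X = iris.data    # features, shape (150, 4)
y = iris.target  # labels; PCA does not use them, they only color the plot

# Keep 3 principal components, i.e. reduce from 4 dimensions to d' = 3.
pca = PCA(n_components=3)

# Fit the model to X and compute the new representation in one call.
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)  # (150, 3)
# Fraction of the total variance captured by each kept component.
print(pca.explained_variance_ratio_)


def plot_iris_3d(points, labels):
    """Hypothetical helper: 3-D scatter plot of the reduced data."""
    import matplotlib.pyplot as plt  # imported lazily, only needed for plotting

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.scatter(points[:, 0], points[:, 1], points[:, 2], c=labels)
    ax.set_xlabel("1st principal component")
    ax.set_ylabel("2nd principal component")
    ax.set_zlabel("3rd principal component")
    return fig
```

To see the visualization, call `plot_iris_3d(X_reduced, y)` and then `matplotlib.pyplot.show()`. Swapping the `PCA(n_components=3)` line for another scikit-learn dimensionality reduction estimator leaves the rest of the sketch largely unchanged, which is the point the video makes about exchanging that line.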

💡 PCA is a powerful technique for dimensionality reduction, and scikit-learn provides an easy-to-use implementation in Python.
