UMAP explained | The best dimensionality reduction?

AI Coffee Break with Letitia · Beginner ·📄 Research Papers Explained ·5y ago

Skills: Unsupervised Learning (80%), ML Maths Basics (70%)

Key Takeaways

The video explains the Uniform Manifold Approximation and Projection (UMAP) algorithm for dimensionality reduction, comparing it to PCA and discussing its strengths and applications.

Full Transcript

Hey, we are back with the dimensionality reduction series. In our last video of the series, we talked about one way to escape the curse of dimensionality through an older algorithm called PCA. Today we will talk about a newer and very popular dimensionality reduction algorithm called UMAP.

PCA and UMAP are very different. PCA factorizes a matrix characterizing the data, which puts it in the company of algorithms like NMF or SVD. UMAP, like t-SNE if you know it, builds a neighbor graph in the original space of the data and tries to find a similar graph in lower dimensions. But how does it do it?

UMAP stands for Uniform Manifold Approximation and Projection. This sounds intimidating, and the paper behind UMAP can be even more intimidating, but do not worry, because we break it down for you. The two steps of UMAP are the construction of a high-dimensional graph and its mapping to a lower-dimensional graph. The construction of this high-dimensional graph is what makes UMAP so special compared to its competitors, since it is hard to do it right and fast. And the cool part about UMAP is that its steps are mathematically proven to work.

So first, there was the data in the high dimensions, and we want to approximate its shape, or topology. Each data point is a so-called 0-simplex, and a certain theorem ensures that the shape of the data can be approximated when we connect these 0-simplices, which are our data points, with their neighboring data points, forming 1-, 2-, or higher-dimensional simplices. With this we can approximate the topology, so all we need to do is make these connections. For this, the UMAP algorithm extends a radius around each point and makes a connection between each point and its neighbors with intersecting radii.

So far the radii are equal. But remember, we want to approximate the shape of the data, so we want a connected graph containing all our data points. This wish of ours brings in two problems. Firstly, it often happens that in the data there are larger gaps where there is no next point to connect to in the graph. This usually happens in low-density regions. Secondly, there are often high-density regions where there are a lot of neighbors in the given radius and everything is way too connected. This second problem gets even worse with the curse of dimensionality, where in high-dimensional spaces the distances between points become more and more similar.

Okay then: if we have these two problems with a fixed radius, let's use a variable radius instead. This choice is also mathematically supported by the definition of a Riemannian metric on the manifold, but do not worry about that. Just keep in mind that there is math proving that the choice of a variable radius does not cause any trouble. So now the radius is greater in low-density regions and smaller in high-density regions.

But UMAP does not estimate density directly as a number; it uses a proxy. The density is estimated to be higher when the k-th nearest neighbor is close, and lower when the k-th nearest neighbor is far away. Notice that this k in k nearest neighbors is a hyperparameter that we need to choose, because with its help UMAP makes a density estimation to find the right local radius. If k is big, then more global structure is preserved. If k is small, then the radius decreases and the local structure is better preserved. So the right k could give the perfect balance between local and global structure preservation, but there are rarely any recipes for finding the optimum automatically. Some trial and error is required, since the right k depends on each dataset individually.

But not all k nearest neighbors are equal, since each has a different distance from the point we are looking at. So the connections between each point and its neighbors get a weight, a connection probability, where points that are far away are weighted less and get a lower connection probability.

Now that this high-dimensional graph is constructed, it is ready to be projected to lower dimensions. This graph projection algorithm is too much for Ms. Coffee Bean to explain in detail in this video, but you can imagine the projection as taking the high-dimensional graph with its edges being springs, where each spring is stronger as the edge probability increases. This means that points connected by highly weighted edges are more likely to stay together in the lower-dimensional space, because the spring holds these points together. Perhaps interesting to notice is that these spring forces are rotationally symmetric, which leads to clusters sometimes landing on one side after one UMAP run and on the other side after another projection.

So UMAP has two main strengths over the famous graph-based dimensionality reduction technique called t-SNE: it is faster due to its optimizations and strong mathematical foundations, and it also has a better balance between locality and globality in clustering. Take for example this visualization from the awesome blog from Google PAIR, linked below. We have this mammoth in 3D on the left, and we can see side by side how UMAP and t-SNE map these 3D mammoths into two dimensions. We can play around with the number of neighbors taken into account when constructing the high-dimensional graph, and we can clearly see how low numbers focus on the local structure while higher numbers focus more on the global structure. The minimum distance parameter lets us specify how tightly the algorithm will map points into the target low-dimensional space. A high minimum distance will spread the points more. It is important to notice that a stepwise change of these two parameters continuously changes the UMAP result. t-SNE, on the other hand, is not that great in this respect, because when changing the parameter of t-SNE, t-SNE's result completely changes. We really recommend you play around yourself with all the examples in this blog post.

So far we have seen examples where UMAP maps from 3D to 2D, but the visualizations we have seen so far are toy examples. They are just for us to get an intuition about the inner workings of the UMAP dimensionality reduction algorithm. What UMAP excels at is reducing from a lot of dimensions. Here is a real-world example of 784-dimensional MNIST data containing handwritten digits. It would be nice if we could reduce their dimensions to two or three, so we can visualize this pixel space the digits are living in.

For this, we can write a little Python code to load the MNIST data, to load the UMAP package for dimensionality reduction, and a visualization package of your liking. We like babyplots, and you will see why. We read in the data and we see we have 60,000 training instances of 28 times 28 pixels, which together are the 784 dimensions we plan to reduce from. For reducing, we fit and apply the UMAP algorithm, and we do it once for two dimensions and again for three dimensions. We reduced to 2D and 3D to show you the cool thing babyplots can do: it takes both the 3D and the 2D embedding and can animate a transition between the two. How cool is that?

Here we can see that UMAP could already cluster almost all handwritten digits together, meaning that UMAP here worked as an unsupervised clustering algorithm. We can also see how useful a 3D visualization can be over just 2D, where more complicated structures and relations can be visualized. If you want to visualize these things in 3D yourself in either R, JavaScript, or Python, and load your interactive 3D plots into a PowerPoint presentation to show to everybody, check out the babyplots website.

This was it from Ms. Coffee Bean. Read the paper if you're interested in the mathematical theory and proofs behind UMAP; find it linked in the description below, or watch the first author of the UMAP paper presenting his UMAP invention, linked below. Now go and reduce your dimensions with UMAP!
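
To make the graph construction step from the transcript more concrete, here is a minimal NumPy sketch. It is a simplification under stated assumptions, not the code of the UMAP package: the local radius is taken directly as the distance to the k-th nearest neighbor, and the edge weight is a plain exponential decay relative to that radius, whereas the actual algorithm calibrates each point's scale more carefully. The function name `knn_graph_weights` and its parameters are made up for this illustration.

```python
import numpy as np

def knn_graph_weights(X, k=5):
    """Simplified sketch of a weighted k-nearest-neighbor graph.

    Hypothetical helper for illustration only; not the UMAP package's code.
    Returns the per-point local radius and a symmetric weight matrix.
    """
    n = X.shape[0]

    # Pairwise Euclidean distances (brute force, fine for small n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor

    # Indices and distances of the k nearest neighbors of each point.
    nn_idx = np.argsort(dist, axis=1)[:, :k]
    nn_dist = np.take_along_axis(dist, nn_idx, axis=1)

    # Density proxy: distance to the k-th nearest neighbor.
    # Small radius means a dense region, large radius a sparse one.
    radius = np.maximum(nn_dist[:, -1], 1e-12)  # guard against duplicates

    # Directed weights: far-away neighbors get a lower connection weight.
    # Assumption: simple exponential decay relative to the local radius.
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    W[rows, nn_idx.ravel()] = np.exp(-nn_dist.ravel() / radius[rows])

    # Symmetrize with a probabilistic "or": an edge exists if at least
    # one of the two directed edges exists.
    W_sym = W + W.T - W * W.T
    return radius, W_sym
```

Calling `knn_graph_weights(X, k=5)` on data with one tight and one spread-out cluster should give small radii in the tight cluster and large radii in the spread-out one, which is the variable radius the transcript describes. A larger `k` makes every radius larger, so each point connects to more of the global structure.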

Original Description

UMAP explained! The great dimensionality reduction algorithm in one video with a lot of visualizations and a little code. Uniform Manifold Approximation and Projection for all!

➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/

📺 PCA video: https://youtu.be/3AUfWllnO7c
📺 Curse of dimensionality video: https://youtu.be/4v7ngaiFdp4
💻 Babyplots interactive 3D visualization in R, Python, Javascript with PowerPoint Add-in! Check it out at https://bp.bleb.li/

🔥 Optionally, pay us a coffee to boost our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak

Outline:
* 00:00 UMAP intro
* 01:31 Graph construction
* 04:49 Graph projection
* 05:48 UMAP vs. t-SNE visualized
* 07:31 Code
* 08:12 Babyplots

📚 Coenen, Pearce | Google Pair blog: https://pair-code.github.io/understanding-umap/
📄 UMAP paper: McInnes, L., Healy, J., & Melville, J. (2018). UMAP: Uniform manifold approximation and projection for dimension reduction. https://arxiv.org/abs/1802.03426
📺 Leland McInnes talk @enthought: https://youtu.be/nq6iPZVUxZU

🎵 Music (intro and outro): Dakar Flow - Carmen María and Edu Espinal

🔗 Links:
YouTube: https://www.youtube.com/AICoffeeBreak
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/

#AICoffeeBreak #MsCoffeeBean #UMAP #MachineLearning #research #AI

The video explains UMAP, a dimensionality reduction algorithm that constructs a graph in high-dimensional space and projects it to a lower-dimensional space, allowing for visualization and clustering of high-dimensional data.

Key Takeaways

Load a high-dimensional dataset
Apply UMAP to reduce dimensions
Visualize the resulting lower-dimensional data using Babyplots or other tools
Tune hyperparameters such as the number of neighbors and minimum distance to optimize results
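
The steps above might look roughly like this in Python. This is a sketch under assumptions, not the code shown in the video: it assumes the `umap-learn` package is installed (imported as `umap`), and it uses a synthetic 784-dimensional dataset as a stand-in for MNIST so that nothing has to be downloaded. The helper names `make_high_dim_blobs` and `embed` are hypothetical. The visualization step is left out because it depends on the plotting package you choose.

```python
import numpy as np

def make_high_dim_blobs(n_per_class=100, n_classes=3, n_features=784, seed=0):
    """Synthetic stand-in for a high-dimensional dataset (hypothetical helper).

    Draws one Gaussian blob per class around a random center.
    """
    rng = np.random.default_rng(seed)
    centers = rng.normal(scale=5.0, size=(n_classes, n_features))
    X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y

def embed(X, n_components=2, n_neighbors=15, min_dist=0.1, seed=42):
    """Fit UMAP on X and return the low-dimensional embedding.

    n_neighbors: the k from the transcript (small = local, large = global).
    min_dist: how tightly points may be packed in the embedding.
    """
    import umap  # provided by the umap-learn package (assumed installed)

    reducer = umap.UMAP(
        n_neighbors=n_neighbors,
        min_dist=min_dist,
        n_components=n_components,
        random_state=seed,
    )
    return reducer.fit_transform(X)
```

With `X, y = make_high_dim_blobs()`, calling `embed(X, n_components=2)` and `embed(X, n_components=3)` gives one embedding for a 2D plot and one for a 3D plot, mirroring the two reductions described in the video. Tuning then means calling `embed` again with different `n_neighbors` and `min_dist` values and comparing the plots.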

💡 UMAP's ability to balance local and global structure preservation makes it a powerful tool for dimensionality reduction and clustering.
