Principal Component Analysis (PCA) | Introduction & Example (Python) Code

Shaw Talebi · Beginner ·📐 ML Fundamentals ·5y ago

Key Takeaways

Introduces Principal Component Analysis (PCA) and provides example Python code

Full Transcript

Okay, this is good, I got the angle quite on the set. Hey guys, welcome back. I'm back with another series. If you missed the first one, it's available on my channel; it was on time series, signals, the Fourier transform, and the wavelet transform. In this new series I'll be talking about two things: one, principal component analysis, and two, independent component analysis. Principal component analysis, or PCA, is the topic of this video. I'll give you a little intuition, share some math, and then finish with a concrete example of how you can use PCA to analyze the stock market. So let's get right into it.

The analogy I like to think of for PCA is a massive rock band with 20 members in the ensemble. You have two drummers, several guitarists, several keyboardists or pianists, a string section, a horn section, a vocalist, a percussionist, the whole works. So you have this 20-person band, and that's not a big deal; that's the kind of band made for huge arenas and stadiums. But if a band like this is just getting started, they're going to have a hard time fitting in smaller venues like coffee shops and restaurants. A natural solution to this problem is to reduce the number of players at specific performances. Instead of a keyboardist, a pianist, and whatnot, you could just have one person on the keyboard. Instead of having multiple guitars, you could just have one person on an acoustic guitar. Instead of two drummers and a percussionist, you can have someone banging on the bongos, and so on.

In a lot of ways, this is basically what PCA does. The big band on the left is before PCA, and then you can boil it down to its core elements for the same band to play at the coffee shop. But instead of a band, you can think of PCA applying to a data set; instead of musicians or players in the band, you can think of the variables in your data set; and instead of a song or the music,
you can think of what your data set is representing.

A bit more concretely, principal component analysis (PCA) reduces input dimensionality and redundancy. We can think of two variables, x and y. This could be something like hot dogs sold and hot dog buns sold, which are directly correlated and in a lot of ways contain redundant information. So it may be practical to represent this underlying information through just one variable instead of two, and that's an application of PCA. We can transform our axes from this x and y axis to a new set of axes; we'll call them PC1 and PC2. If you want to take it a step further, you can remove PC2 and operate with just one variable. So essentially we've reduced the dimensionality from two variables, x and y, to just one, PC1, if we choose to drop PC2.

Okay, so how does it work? The basic idea: the goal of PCA is to reduce input variable redundancy by creating a new set of variables where the variance along each subsequent variable is maximized. In the previous example we saw pictorially that we changed from a set of two variables, hot dogs sold and hot dog buns sold, to a new pair of variables, PC1 and PC2, and essentially PC1 contained all the relevant information we needed. The way we got PC1 is we basically rotated the axes to lie along the linear slope of points defined by the hot dog bun and hot dog sales.

What does that translate to mathematically? We can think of this situation: we have X, which is a matrix of data where the rows are data records and the columns are variables; we have w, which is a vector of weights; and we have t, which is a score vector and what I'm going to be calling a principal component. So t is what we're interested in. We have our data X, and we're trying to find a w that is going to create this principal component for us, t = Xw. Here's the magic of PCA, the trick to it all: the goal is to maximize the variance of
t subject to the constraint that the norm squared of w, so w transpose times w, is equal to one. Variance is defined in the usual way: you take every element, subtract the mean of the variable, square it, divide by the number of elements minus one, and then add this up for every single element in the set of numbers.

One really important thing when doing PCA is that you want to autoscale your data. What does that mean? For each number in each column of your matrix, you subtract that column's average and divide by that column's standard deviation. If we do that, the mean of the principal component will turn out to be zero, which allows us to drop the mean term in the variance here. It turns out that the variance will just be equal to the norm squared of t divided by the number of elements minus one.

Okay, so what does that mean? That means we can rewrite this optimization problem: instead of maximizing the variance, we can just maximize the norm squared of t, because the vector w that maximizes the norm squared of t is also going to be the vector w that maximizes the variance of t. So we can rewrite the optimization problem using our above expression for t, and it turns out this is actually a pretty straightforward optimization problem to solve. Don't be intimidated by the matrices and vectors. We can use a very well-known and common technique in calculus, the method of Lagrange multipliers, which basically allows us to rewrite an optimization problem with constraints, a constrained optimization problem, as an optimization problem without constraints, an unconstrained optimization problem. If none of that makes sense, that's fine; we just need these relevant expressions here. We can write out the Lagrangian, which is this L of x term here, for our PCA optimization problem, and then we have the associated equations. And this is the exciting part: this first equation, if we rearrange it,
is just an eigenvalue problem, which is a standard problem in linear algebra, and the second equation is just a restatement of our original constraint. So, writing it explicitly here, we can solve for the eigenvalue lambda and the vector of weights w using standard eigenvalue approaches. If you're doing this in some programming language, languages like R, Python, and MATLAB are going to have built-in functions that allow you to solve this problem. Once we have this vector of weights, we have everything we need: we can just multiply X by it and get our principal component.

This naturally extends to multiple components. We started out just looking for a single component, but if you solve the eigenvalue problem and you have n columns in your matrix X, so that X transpose X is an n-by-n square matrix, you're going to end up with n eigenvalues and n corresponding eigenvectors. If you sort these eigenvalues and eigenvectors from the largest eigenvalue all the way down to the smallest, each corresponding eigenvector w is going to be a set of weights which defines a principal component, and the principal components associated with the larger eigenvalues contain more information than components associated with smaller eigenvalues. So you can define some threshold, like in the first slide, where we could have just dropped PC2 because it wasn't giving us much additional information. You can do the same thing and truncate your variables after a certain amount of information is captured by your principal components.

Okay, so just as a recap: principal component analysis reduces input dimensionality and redundancy. Some key points: new variables are created as linear combinations of the input variables. That's what we saw in the previous slide, where you had a matrix multiplied by a vector of weights; that's equivalent to a linear combination of your input variables. And then each subsequent new variable
contains less information. We saw that once you sorted your eigenvalues from largest to smallest, the principal components associated with the larger eigenvalues contain more information and the principal components corresponding to smaller eigenvalues contain less information.

There are a lot of applications for PCA. One is relating variables together: if two variables get clumped together, like hot dog buns sold and hot dogs sold, there's some underlying correlation there. You can use it for clustering, where you transform your space from your original input space to a new PCA space and then run a clustering algorithm like k-means. And you can also do some outlier identification: you can plot all your points in your principal component space and visually inspect whether there are any outliers.

All right, so here's a fun example. At the outset I'm going to say I'm not a financial advisor and I've never taken a finance class, so in no way is this a recommendation of how you should invest your money. This is just a fun example of what PCA can do. Here we're going to use PCA to create an S&P 500 index fund. An index fund is basically a set of investments that are meant to follow or track a specific market. The example code is on GitHub, so I'll probably just fly through this. I used the Yahoo Finance module to get real stock data, so this is all real data, this isn't made up, and then I use pandas and NumPy for all the number crunching. I write some code to input the ticker names from Wikipedia; Graham Guthrie had a nice Medium post on how you can grab all these S&P 500 names, so I just stole some code from that post and made some edits. Then I pull S&P 500 data for 2020, drop NaNs, get a pandas DataFrame of just close prices, as opposed to all the other information that's available, and get a list of the ticker names of all the companies in the data frame. So we have 253 rows and 499 columns. So
here, I guess the comments aren't updated, so I apologize for that, but here we're initializing PCA with 10 components, and then we'll apply PCA to our data set and print the explained variance. You can see that with the first three components you're already at more than 90% of the explained variance, if you just sum up the first three elements of that array there.

Then we can create an index fund. There are countless ways you can do this. I just arbitrarily took the weights defining the first three principal components, summed them together, and then only included the top 61 weights. We can represent the overall portfolio of this index fund with a bar plot; it's a natural way to do it. The y-axis is the relative weight, which you can also think of as the relative number of dollars you're going to invest in each specific company, and the x-axis is just the individual ticker names.

Then we can see how our index fund compares to the actual S&P 500 over 2020, and visually, approximately, it doesn't do such a bad job. There are some discrepancies along the way. But everyone cares about percent return. If you had just bought one share of every single stock in the S&P 500 at the beginning of 2020 and then sold those same shares at the beginning of 2021, you would have made a 20% return. If you had instead followed the investing strategy of this particular index fund derived from PCA, you would have made 25%.

So that was the video on principal component analysis. I hope that cleared things up. If you want to learn more about principal component analysis, I have provided a link to my blog post on Medium on the topic. Stay tuned for the next video, where I'll be talking about a similar but different technique, independent component analysis. If you enjoyed this video, be sure to like, comment, subscribe, hit the bell, and share with your friends and family so they too can learn about principal component analysis.
Thanks for watching.
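The math walked through in the transcript (autoscale the data, solve the eigenvalue problem, sort by eigenvalue, multiply the data by the weights) can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the code from the video's GitHub repo: the function name `pca`, the synthetic hot dog data, and the choice to divide X transpose X by the number of records minus one (so that each eigenvalue equals the variance of its component) are all illustrative choices.

```python
import numpy as np


def pca(X):
    """PCA via the eigenvalue problem described in the transcript.

    X: 2-D array whose rows are data records and whose columns are variables.
    Returns (eigvals, W, T), sorted from the largest eigenvalue to the smallest.
    """
    # Autoscale: subtract each column's mean and divide by its standard deviation.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    n_records = Z.shape[0]

    # Z^T Z is square (n_variables x n_variables) and symmetric.
    # Dividing by (n_records - 1) is an assumption of this sketch; it rescales
    # the eigenvalues into variances without changing the eigenvectors.
    C = Z.T @ Z / (n_records - 1)

    # eigh is meant for symmetric matrices; it returns unit-norm eigenvectors
    # as columns, with eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(C)

    # Sort from the largest eigenvalue to the smallest.
    order = np.argsort(eigvals)[::-1]
    eigvals, W = eigvals[order], eigvecs[:, order]

    # Each column of T is one principal component, t = Z w.
    T = Z @ W
    return eigvals, W, T


# Synthetic stand-in for "hot dogs sold" and "hot dog buns sold" (made-up numbers).
rng = np.random.default_rng(0)
hot_dogs = rng.normal(100, 20, size=200)
buns = hot_dogs + rng.normal(0, 2, size=200)
X = np.column_stack([hot_dogs, buns])

eigvals, W, T = pca(X)
explained = eigvals / eigvals.sum()  # share of total variance per component
print("explained variance ratio:", explained)
```

Because the two synthetic variables are almost perfectly correlated, nearly all of the variance lands on the first component, which is the situation where dropping PC2 loses very little.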

Original Description

🤝 Work with me: https://aibuilder.academy/yt/WDjzgnqyz4s
🚀 Ship AI apps in weeks, not months: https://aibuilder.academy/courses/yt/WDjzgnqyz4s

The first video in a 2-part series on Principal Component Analysis (PCA) and Independent Component Analysis (ICA). This video gives some intuition, math, and an example of using PCA to create an S&P 500 index fund.

More in this series:
- Blog: https://medium.com/towards-data-science/principal-component-analysis-pca-79d228eb9d24?sk=4c5b8fd7fd28a09c10ed483e51dd975a
- ICA: https://youtu.be/GgLaP4Des1Q
- Example code: https://github.com/ShawhinT/YouTube/tree/main/pca

Resources I found helpful:
- R. Bro, A. K. Smilde, Anal. Methods, 2014, 6, 2812-2831
- Golden, R. (2020). Statistical machine learning: A unified framework. Boca Raton: CRC Press C.

Introduction - 0:00
An analogy - 0:43
PCA - 2:20
Some math - 3:22
Recap - 9:32
Example: S&P 500 index fund - 10:52
Closing remarks - 14:10
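The index fund construction described in the transcript (sum the weights of the first three principal components, then keep only the largest ones) might look roughly like the sketch below. It is not the author's code from the linked repo. It runs on synthetic prices instead of Yahoo Finance data, uses plain NumPy, and makes several assumptions that the video does not confirm: the prices are autoscaled first, each eigenvector's sign is flipped so its entries sum to a non-negative number, "top" means largest value, and the sizes `n_stocks = 50` and `n_keep = 10` are hypothetical stand-ins for the video's 499 tickers and 61 weights.

```python
import numpy as np

rng = np.random.default_rng(42)
n_days, n_stocks, n_keep = 253, 50, 10  # hypothetical sizes for illustration

# Synthetic close prices: a shared "market" random walk plus stock-specific walks.
market = np.cumsum(rng.normal(0.0, 1.0, size=n_days))
loadings = rng.uniform(0.5, 1.5, size=n_stocks)
noise = np.cumsum(rng.normal(0.0, 0.3, size=(n_days, n_stocks)), axis=0)
prices = 100.0 + np.outer(market, loadings) + noise

# Autoscale each column (assumption), then solve the eigenvalue problem.
Z = (prices - prices.mean(axis=0)) / prices.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(Z.T @ Z / (n_days - 1))
order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
eigvals, W = eigvals[order], eigvecs[:, order]

# Eigenvector signs are arbitrary, so fix a convention before summing them
# (assumption of this sketch): make each column's entries sum to >= 0.
W = W * np.where(W.sum(axis=0) < 0, -1.0, 1.0)

explained = eigvals / eigvals.sum()
print("variance explained by first three components:", explained[:3].sum())

# Sum the weights of the first three components and keep the largest n_keep.
combined = W[:, :3].sum(axis=1)
top = np.argsort(combined)[::-1][:n_keep]
fund_weights = combined[top] / combined[top].sum()  # relative dollars per stock

# Percent return over the period: fund versus one share of every stock.
stock_returns = prices[-1] / prices[0] - 1.0
fund_return = float(fund_weights @ stock_returns[top])
benchmark_return = float(prices[-1].sum() / prices[0].sum() - 1.0)
print("fund return:", fund_return)
print("one-share-of-everything return:", benchmark_return)
```

Since the data here is random, the printed returns say nothing about real markets; the sketch only shows how the weights are turned into a portfolio and compared against the buy-one-share-of-everything baseline mentioned in the video.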
