Principal Component Analysis (PCA) | Introduction & Example (Python) Code

Shaw Talebi · Beginner · 📐 ML Fundamentals · 5y ago

Key Takeaways

Introduces Principal Component Analysis (PCA) and provides example Python code

Full Transcript

Okay, this is good, I got the angle quite on the set. Hey guys, welcome back. I'm back with another series. If you missed the first one, it's available on my channel; it was on time series, signals, the Fourier transform, and the wavelet transform. In this new series I'll be talking about two things: one, principal component analysis, and two, independent component analysis. Principal component analysis, or PCA, is the topic of this video. I'll give you a little intuition, share some math, and then finish with a concrete example of how you can use PCA to analyze the stock market. So let's get right into it.

The analogy I like to think of for PCA is a massive rock band with, like, 20 members in the ensemble. You have two drummers, several guitarists, several keyboardists or pianists, a string section, a horn section, a vocalist, a percussionist, the whole works. So you have this 20-person band, and that's not a big deal; that's the kind of band made for huge arenas and stadiums. But if a band like this is just getting started, they're going to have a hard time fitting in smaller venues like coffee shops and restaurants. A natural solution to this problem is to reduce the number of players at specific performances. Instead of a keyboardist, a pianist, and whatnot, you could just have one person on the keyboard. Instead of multiple guitars, you could have one person on an acoustic guitar. Instead of two drummers and a percussionist, you can have someone banging on the bongos, and so on.

In a lot of ways, this is basically what PCA does. The big band on the left is before PCA, and then you can boil it down to its core elements for the same band to play at the coffee shop. But instead of a band, you can think of PCA applying to a data set; instead of musicians or players in the band, you can think of the variables in your data set; and instead of a song or the music,
you can think of what your data set is representing.

A bit more concretely, principal component analysis (PCA) reduces input dimensionality and redundancy. We can think of two variables, x and y. This could be something like hot dogs sold and hot dog buns sold, which are directly correlated and in a lot of ways contain redundant information. So it may be practical to represent this underlying information through just one variable instead of two, and that's an application of PCA. We can transform our axes from the x and y axes to a new set of axes, which we'll call PC1 and PC2. If you want to take it a step further, you can remove PC2 and operate with just one variable. So essentially we've reduced the dimensionality from two variables, x and y, to just one, PC1, if we choose to drop PC2.

Okay, so how does it work? The basic idea, the goal of PCA, is to reduce input variable redundancy by creating a new set of variables where the variance along each subsequent variable is maximized. In the previous example we saw pictorially that we changed from a set of two variables, hot dogs sold and hot dog buns sold, to a new pair of variables we called PC1 and PC2. Essentially, PC1 contained all the relevant information we needed, and the way we got PC1 was basically by rotating the axes to lie along the linear slope of points defined by the hot dog bun and hot dog sales.

What does that translate to mathematically? We can think of this situation: we have X, which is a matrix of data where the rows are data records and the columns are variables; we have w, which is a vector of weights; and we have t, which is a score vector and what I'm going to be calling a principal component. So t is what we're interested in. We have our data X, and we're trying to find a w that is going to create this principal component for us. Okay, so here's the magic of PCA, here's the trick to it all. The goal here is to maximize the variance of
t subject to the constraint that the norm squared of w, so w transpose times w, is equal to one. Variance is defined in the usual way: you take every element, subtract the mean of the variable, square it, divide by the number of elements minus one, and then add this up for every single element in the set of numbers.

One really important thing when doing PCA is that you want to autoscale your data. What does that mean? For each number in each column of your matrix, you want to subtract the average of that column and divide by its standard deviation. If we do that, then the mean of the principal component will turn out to be zero, which allows us to drop the mean term in the variance here. It turns out that the variance will just be equal to the norm squared of t divided by the number of elements minus one.

Okay, so what does that mean? That means we can rewrite this optimization problem: instead of maximizing the variance, we can just maximize the norm squared of t, because the vector w that maximizes the norm squared of t is also going to be the vector w that maximizes the variance of t. So we can rewrite the optimization problem using our above expression for t, and it turns out this is actually a pretty straightforward optimization problem to solve. Don't be intimidated by the matrices and vectors. We can use a very well-known and common technique in calculus, the method of Lagrange multipliers, which basically allows us to rewrite an optimization problem with constraints, a constrained optimization problem, as an optimization problem without constraints, or an unconstrained optimization problem. If none of that makes sense, that's fine; we just need these relevant expressions here. We can write out the Lagrangian, which is this L of x term here, for our PCA optimization problem, and then we can have the associated equations. And this is the exciting part: this first equation, if we rearrange it,
is just an eigenvalue problem, which is a standard problem in linear algebra, and the second equation is just a restatement of our original constraint. Writing it explicitly here, we can solve for the eigenvalue lambda and the vector of weights w using standard eigenvalue approaches. If you're doing this in some programming language, languages like R, Python, and MATLAB are going to have built-in functions that allow you to solve this problem. Once we have this vector of weights, we have everything we need: we can just multiply it by X and get our principal component.

This naturally extends to multiple components. We started out just looking for a single component, but if you solve the eigenvalue problem and you have n columns in your matrix X, and X is square, you're going to end up with n eigenvalues and n corresponding eigenvectors. If you sort these eigenvalues and eigenvectors from the largest eigenvalue all the way down to the smallest, each corresponding eigenvector w is going to be a set of weights which defines a principal component, and the principal components associated with the larger eigenvalues contain more information than components associated with smaller eigenvalues. So you can define some threshold. Like in the first slide, where we could have just dropped PC2 because it wasn't giving us much additional information, you can do the same thing and truncate your variables after a certain amount of information is captured by your principal components.

Okay, so just as a recap: principal component analysis reduces input dimensionality and redundancy. Some key points: new variables are created as linear combinations of the input variables. That's what we saw in the previous slide, where you had a matrix multiplied by a vector of weights; that's equivalent to a linear combination of your input variables. And then each subsequent new variable
contains less information. We saw that once you sort your eigenvalues from largest to smallest, the principal components associated with the larger eigenvalues contain more information, and the principal components corresponding to smaller eigenvalues contain less information.

There are a lot of applications for PCA. One is relating variables together: if two variables get clumped together, like hot dog buns sold and hot dogs sold, there's some underlying correlation there. You can use it for clustering, where you transform your original input space to a new PCA space and then run a clustering algorithm like k-means. And you can also do some outlier identification: you can plot all your points in your principal component space and visually inspect whether there are any outliers.

All right, so here's a fun example. At the outset I'm going to say I'm not a financial advisor, I've never taken a finance class, so in no way is this a recommendation of how you should invest your money. This is just a fun example of what PCA can do. Here we're going to use PCA to create an S&P 500 index fund. An index fund is basically a set of investments that are meant to follow, or track with, a specific market. The example code's on the GitHub, so I'll probably just fly through this. I used the Yahoo Finance module to get real, actual stock data, so this is all real data; this isn't made up. Then I use pandas and NumPy for all the number crunching. I write some code to input the ticker names from Wikipedia. Graham Guthrie had a nice Medium post on how you can grab all these S&P 500 names, so I just stole some code from that post and made some edits. Then I pull S&P 500 data for 2020, drop NaNs, get a pandas data frame of just close prices, as opposed to all the other information that's available, and get a list of the ticker names of all the companies in the data frame. So we have 253 rows and 499 columns.
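The data preparation described above can be sketched roughly as follows. This is a minimal sketch that downloads nothing: the tickers and prices are synthetic stand-ins for the real 2020 data the video pulls, the helper names `make_fake_close_prices` and `clean_close_prices` are hypothetical, and dropping NaNs column-wise (per ticker) is an assumption, since the transcript only says NaNs are dropped.

```python
import numpy as np
import pandas as pd


def make_fake_close_prices(n_days=253, n_tickers=5, seed=0):
    """Build a synthetic close-price table (rows = trading days, columns = tickers)."""
    rng = np.random.default_rng(seed)
    dates = pd.bdate_range("2020-01-02", periods=n_days)
    tickers = [f"TICK{i}" for i in range(n_tickers)]  # made-up ticker names
    # Random-walk prices starting near 100; purely illustrative.
    steps = rng.normal(loc=0.0005, scale=0.02, size=(n_days, n_tickers))
    prices = 100 * np.exp(np.cumsum(steps, axis=0))
    return pd.DataFrame(prices, index=dates, columns=tickers)


def clean_close_prices(close):
    """Drop every ticker (column) that has at least one missing close price."""
    return close.dropna(axis="columns", how="any")


close = make_fake_close_prices()
close.iloc[10, 2] = np.nan          # simulate a ticker with a gap in its history
clean = clean_close_prices(close)
tickers = list(clean.columns)       # ticker names that survived the cleaning

print(clean.shape)                  # rows are days, columns are remaining tickers
print(tickers)
```

With real data, the rows would be the trading days of 2020 and the columns the tickers with a complete price history, which is the kind of table PCA is then applied to.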
Here, I guess the comments aren't updated, so I apologize for that, but here we're initializing PCA with 10 components. Then we'll apply PCA to our data set and print the explained variance. You can see that with the first three components you're already at more than 90% of the explained variance, if you just sum up the first three elements of that array there.

Okay, and then we can create an index fund. There are countless ways you can do this. I just arbitrarily took the weights defining the first three principal components, summed them together, and then only included the top 61 weights. We can represent the overall portfolio of this index fund with a bar plot; it's a natural way to do it. The y-axis is the relative weight, which you can also think of as the relative number of dollars you're going to invest in each specific company, and the x-axis is just the individual ticker names.

Then we can see how our index fund compares to the actual S&P 500 over 2020. Visually, approximately, it doesn't do such a bad job; there are some discrepancies along the way. But everyone cares about percent return. If you had just bought one share of every single stock in the S&P 500 at the beginning of 2020 and then sold those same shares at the beginning of 2021, you would have made a 20% return. If you had instead followed the investing strategy of this particular index fund derived from PCA, you would have made 25%.

So that was the video on principal component analysis. I hope that cleared things up. If you want to learn more about principal component analysis, I have provided a link to my blog post on Medium on the topic. Stay tuned for the next video, where I'll be talking about a similar but different technique, independent component analysis. If you enjoyed this video, be sure to like, comment, subscribe, hit the bell, and share with your friends and family so they too can learn about principal component analysis.
Thanks for watching.
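For readers who want the math and the index-fund step from the transcript in code: maximizing the norm squared of t = Xw subject to w transpose w = 1 leads, via Lagrange multipliers, to the eigenvalue problem (X^T X) w = lambda w. The sketch below solves that problem with plain NumPy rather than whatever PCA routine the video uses, then sums the first three weight vectors and keeps the largest ones. Everything specific here is an assumption for illustration: the data is synthetic, the helper names `pca_eig` and `build_portfolio` are hypothetical, ranking by absolute value and normalizing the kept weights to sum to one are choices the transcript does not state, and `top_k=4` stands in for the 61 weights kept in the video.

```python
import numpy as np


def pca_eig(X):
    """Solve the PCA eigenvalue problem for X (rows = records, columns = variables).

    Returns (W, explained): column k of W is the weight vector of component k+1,
    and explained[k] is that component's share of the total variance.
    """
    # Autoscale: subtract each column's mean and divide by its standard deviation.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Eigenvalue problem (Z^T Z) w = lambda w; eigh is meant for symmetric matrices
    # and returns unit-length eigenvectors with eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(Z.T @ Z)
    order = np.argsort(eigvals)[::-1]            # re-sort largest to smallest
    eigvals, W = eigvals[order], eigvecs[:, order]
    return W, eigvals / eigvals.sum()


def build_portfolio(W, names, n_sum=3, top_k=4):
    """Sum the first n_sum weight vectors, keep the top_k largest, normalize to 1."""
    combined = np.abs(W[:, :n_sum].sum(axis=1))  # assumption: rank by magnitude
    keep = np.argsort(combined)[::-1][:top_k]    # indices of the largest weights
    share = combined[keep] / combined[keep].sum()
    return {names[i]: float(s) for i, s in zip(keep, share)}


# Synthetic stand-in for a (days x tickers) close-price matrix; not real market data.
rng = np.random.default_rng(0)
market = np.cumsum(rng.normal(size=(253, 1)), axis=0)        # shared trend
X = 100 + market * rng.uniform(0.5, 2.0, size=(1, 8)) + rng.normal(size=(253, 8))
names = [f"TICK{i}" for i in range(8)]                        # made-up tickers

W, explained = pca_eig(X)
T = ((X - X.mean(axis=0)) / X.std(axis=0, ddof=1)) @ W        # principal components
portfolio = build_portfolio(W, names)

print(explained[:3].sum())   # share of variance captured by the first three components
print(portfolio)             # relative weight per ticker
```

One caveat with this construction: the sign of each eigenvector is arbitrary, so the sum of several weight vectors depends on the sign convention of the solver, and a different library may produce a different portfolio from the same data.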

Original Description

🤝 Work with me: https://aibuilder.academy/yt/WDjzgnqyz4s
🚀 Ship AI apps in weeks, not months: https://aibuilder.academy/courses/yt/WDjzgnqyz4s

The first video in a 2-part series on Principal Component Analysis (PCA) and Independent Component Analysis (ICA). This video gives some intuition, math, and an example of using PCA to create an S&P 500 index fund.

More in this series:
- Blog: https://medium.com/towards-data-science/principal-component-analysis-pca-79d228eb9d24?sk=4c5b8fd7fd28a09c10ed483e51dd975a
- ICA: https://youtu.be/GgLaP4Des1Q
- Example code: https://github.com/ShawhinT/YouTube/tree/main/pca

Resources I found helpful:
- R. Bro, A. K. Smilde, Anal. Methods, 2014, 6, 2812-2831
- Golden, R. (2020). Statistical machine learning: A unified framework. Boca Raton: CRC Press C.

Introduction - 0:00
An analogy - 0:43
PCA - 2:20
Some math - 3:22
Recap - 9:32
Example: S&P 500 index fund - 10:52
Closing remarks - 14:10

