Eigenvalues, Eigenvectors & PCA Explained | Linear Algebra for Data Science

Analytics Vidhya · Beginner ·🔢 Mathematical Foundations ·1y ago

Key Takeaways

This video tutorial covers the core concepts of Eigenvalues, Eigenvectors, and Principal Component Analysis (PCA) in Linear Algebra for Data Science, using tools like numpy, scikit-learn, and PyTorch to demonstrate dimensionality reduction and data visualization techniques.

Full Transcript

Hello everyone. In our previous videos we explored everything from vectors and matrices to determinants, inverses, rank, linear independence and so on. We even tackled overdetermined and underdetermined systems and built a linear regression model from scratch. You can find the links to all these videos in the description.

Now we will step into eigenvalues and eigenvectors. We'll explore why they matter in data science, for example for revealing the directions of maximum variance, and how to compute them using NumPy. Then we will explore principal component analysis (PCA) to reduce dimensions while preserving the key patterns in your data. We'll do a manual example before comparing it with scikit-learn's PCA. By the end you will see how these tools can transform high-dimensional data into simpler, more efficient forms. Let's jump in.

Let me first start by importing the required libraries. Now I'll define eigenvalues and eigenvectors. They are formally defined by this equation: if you have a square matrix A of shape n × n, a non-zero vector v of size n is an eigenvector of A if A v = λ v, that is, the product of A and v equals λ times v. Here v is the eigenvector and λ is the corresponding eigenvalue of A. You can imagine eigenvectors as representing directions of information; how much information is carried in a particular direction is given by the eigenvalue.

Let's say this matrix A has three eigenvalues, λ1, λ2, λ3, and three eigenvectors, v1, v2, v3. Each eigenvector has a corresponding eigenvalue, and using the eigenvalues you can identify which direction is the principal direction of A, in the sense of which direction contains the most information. So why are eigenvalues and eigenvectors so important?
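The defining equation A v = λ v can be checked numerically. The following is a minimal sketch, assuming a small illustrative 2 × 2 matrix and variable names chosen here for the example; it uses NumPy's `np.linalg.eig`, which returns the eigenvectors as the columns of a matrix.

```python
import numpy as np

# Illustrative 2 x 2 matrix (an assumption for this sketch)
A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# eig returns the eigenvalues and a matrix whose COLUMNS are the eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

for i in range(len(eigenvalues)):
    lam = eigenvalues[i]
    v = eigenvectors[:, i]  # the i-th eigenvector is the i-th column
    # Check the defining equation A v = lambda v
    print(lam, v, np.allclose(A @ v, lam * v))
```

Each printed line should end with `True`, since every returned pair satisfies the definition.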
The most immediate application of eigenvalues and eigenvectors, as we will see in this notebook, is principal component analysis. We'll use the eigenvalues and eigenvectors of the covariance matrix of the features. We'll have a data set with features; we'll first construct the covariance matrix of those features, then find its eigenvalues and eigenvectors, and from there identify the directions in which the data has maximum variance. The direction with the largest variation among the features is called the first principal component. The second principal component carries the second largest variation in your data set, and so on.

We can also use eigenvalues and eigenvectors for dimensionality reduction. If you have, say, 1,000 features, the covariance matrix of that data set will have 1,000 eigenvalues and 1,000 eigenvectors. I can take the top 100 eigenvectors and transform the original feature matrix from 1,000 dimensions down to 100, or maybe 10. That reduces the data set to a lower-dimensional feature space.

A further application is big data visualization. If your original features are in 1,000 dimensions, you cannot plot them on 1,000 axes; we humans can only interpret two or three. If I can reduce the dimensionality down to two or three axes, I can actually plot that big data and visualize the pattern.

There are other places where eigenvalues and eigenvectors are used. In mechanical vibrations they can be used to find the principal, or fundamental, frequencies. They are also heavily used in quantum mechanics, and so on.

Let's take a small example with a small matrix. I have my matrix [[4, 2], [1, 3]], and I'm
extracting the eigenvalues and eigenvectors using the eig function from the linear algebra module of NumPy. This is a 2 × 2 matrix, so I get two λ values, or eigenvalues, and two eigenvectors. The eigenvectors are returned as column vectors: this column is v1 and this one is v2, and these are λ1 and λ2. The first eigenvalue is 5 and the second is 2, with the first and second eigenvectors alongside. Going back to the formal definition, A v1 should equal λ1 v1, and similarly A v2 should equal λ2 v2. You can verify this by matrix multiplication.

Now the question is: is it possible to visualize these transformations? Let me attempt it. I'm going to do three things here. I'll plot a set of vectors in 2D; these are the eigenvectors extracted from the matrix A. Then I'll multiply A with these eigenvectors, which transforms them into a new feature space. So we'll visualize these vectors in the original feature space, on the original grid of data points, and then visualize the same eigenvectors scaled by their corresponding eigenvalues in the transformed space. The transformation happens when we multiply A times v.

I'll quickly explain this code. Here my matrix is [[2, 1], [1, 3]]. I want to create 11 data points for the grid: the x values go from −1.5 to +1.5 with an increment of 0.5, and similarly the y values go from −1.5 to +1.5 with an increment of 0.5. I'm creating a mesh grid, and from that mesh grid I'm creating the actual data points, which I multiply with the transpose of A to get the transformed grid. So this is the original grid and this is my transformed grid. So I
want to plot the eigenvectors in both grids. Here I'm creating a figure with two subplots, one row and two columns, 12 inches by 6 inches. The first is a scatter plot of the grid points in the original space. axhline and axvline draw the horizontal and vertical lines for the x-axis and y-axis. Then I set the title and the aspect ratio.

Now I'm going to plot the vectors. In the first subplot I just plot the eigenvectors; at the end of each you get an arrow, and this is a label for it. This for loop runs for two iterations because there are two eigenvalues; i goes from 0 to 1, so the first eigenvector is plotted and then the second.

This is a new scatter plot for the transformed points. I plot the transformed grid points and again set the horizontal and vertical axis lines, the title and the aspect ratio. Here again I extract the eigenvalue as λi and the eigenvector as v. The transformation is carried out by multiplying A with that eigenvector, and A v should equal λ v, as we saw in the definition. Then I plot the eigenvectors in the transformed space one by one and label them.

Let me zoom out and look at the plots. These are the two eigenvectors, v1 and v2, and you can see that they look perpendicular to each other (for a symmetric matrix like this one, the eigenvectors are exactly perpendicular). When these vectors are multiplied by their corresponding λ values, v1 is transformed to λ1 v1, so this is the length of the resulting vector. Similarly, v2 is multiplied by λ2. You already know that A v2 is the same as λ2 v2, so although in the code we computed A times v2, it is essentially the same as λ2 v2. So these vectors v1 and v2 have been
scaled by their corresponding eigenvalues. Since one eigenvalue is small compared to the other, you can see that this vector is much shorter than that one, in the same ratio as the eigenvalues. You can also see the way in which the entire grid space has been transformed.

I'm printing the eigenvalues and eigenvectors. The eigenvalues in this case are about 1.38 and 3.62, not the 5 and 2 of the earlier matrix [[4, 2], [1, 3]], because this plot uses [[2, 1], [1, 3]]. These are the eigenvectors: this one is v1 and this one is v2, and these are λ1 and λ2.

If you're really interested, we can go ahead and do a manual calculation of these numbers; I can demonstrate how to obtain them for a given matrix A. If you are more application oriented, you can skip this part and go directly to principal component analysis.

In this notebook I'm going to show how to calculate the eigenvalues and eigenvectors analytically, with pen and paper. By definition, A v = λ v, and I want to solve that equation for λ and v. Rearranged, it reads (A − λI) v = 0. The important thing about this equation is its solutions. There are two possibilities: the trivial solution, in which the eigenvector is zero, which I'm not interested in; otherwise the matrix A − λI must be non-invertible. And when does a matrix not have an inverse? When its determinant is zero. So det(A − λI) must equal zero, and instead of solving the original equation directly I'm going to solve this one, which makes it possible to find the eigenvalues and then the eigenvectors.

Let me take an example. Assume that A is [[4, 2], [1, 3]]. Just now we
discussed that A v = λ v, and so (A − λI) v should equal zero. For this equation to have a non-zero solution, the determinant of A − λI should equal zero; in other words, A − λI must be non-invertible. A is [[4, 2], [1, 3]], and λI is [[λ, 0], [0, λ]]. Subtracting, the first element is 4 − λ, the second is 2 − 0 = 2, the third is 1 − 0 = 1, and the fourth is 3 − λ. This matrix is A − λI.

Now I take its determinant: multiply 4 − λ by 3 − λ and subtract the product of 1 and 2, and set the whole thing to zero. Expanding, the final quadratic equation is λ² − 7λ + 10 = 0. This quadratic has two roots, and they are the eigenvalues. Solving, I get λ = 5 and λ = 2. These are the two eigenvalues of my matrix A.

How do I get the corresponding eigenvectors? Each λ value has an eigenvector. For the first eigenvalue, I take the equation (A − λI) v = 0 again and substitute 5 in place of λ. Simplifying the left-hand side, 4 − 5 becomes −1, 2 and 1 stay as they are, and 3 − 5 becomes −2, so A − λI is [[−1, 2], [1, −2]]. Let x and y be the elements of the vector v1, so the first eigenvector is [x, y]. I just have to solve a couple of equations to find x and y, and then my first eigenvector is determined. That's the plan. This matrix times [x, y] should equal [0, 0], as per the definition.
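The characteristic equation worked out above can be reproduced numerically. This is a rough sketch, assuming the 2 × 2 identity det(A − λI) = λ² − trace(A)·λ + det(A); the variable names are illustrative and not taken from the notebook.

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# For a 2 x 2 matrix: det(A - lambda*I) = lambda^2 - trace(A)*lambda + det(A)
trace = np.trace(A)        # 4 + 3 = 7
det = np.linalg.det(A)     # 4*3 - 2*1 = 10
coefficients = [1.0, -trace, det]

# The roots of lambda^2 - 7*lambda + 10 = 0 are the eigenvalues
roots = np.roots(coefficients)
print(coefficients, roots)

# Substituting lambda = 5 gives a matrix with (numerically) zero determinant
singular_matrix = A - 5.0 * np.eye(2)
print(singular_matrix, np.linalg.det(singular_matrix))
```

The zero determinant of `singular_matrix` is what allows a non-zero [x, y] to satisfy (A − 5I) v = 0.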
By simple matrix multiplication again, −1·x + 2·y should equal the first zero, which simplifies to −x + 2y = 0, and then x = 2y. Now x = 2y has an infinite number of solutions: for each value of y I get a value of x. The vector has to be non-zero, because we have assumed that eigenvectors are non-zero, and since x = 2y that means both x and y are non-zero. So let me pick a simple value, y = 1. Solving, I get x = 2, so my first eigenvector is [2, 1]. And why only 1? I could have taken 10, or 100, or 1,000. Any non-zero scalar multiple of this eigenvector is also an eigenvector of A. Remember that.

Following the same process, I can get the eigenvector for the other eigenvalue. Here λ = 2 is substituted, and this time I end up with the equation y = −x. This time, instead of picking y, let me pick x = 1, which leads to y = −1. So [1, −1] is the second eigenvector, or any non-zero scalar multiple of it. If you want, you can take y = 1 instead; in that case x is −1. That just flips the direction: if v2 is represented by [1, −1], then [−1, 1] also represents v2, flipped. The length along this direction is represented by λ2, that is, how much information is carried in that direction. Eigenvectors are usually reported as unit vectors (NumPy normalizes them to unit length), so it doesn't matter whether you draw the arrow this way or that way; they just represent broad directions.

In summary, my original matrix A has two eigenvalues.
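The hand-derived eigenvectors, and the remark that any non-zero scalar multiple is also an eigenvector, can be checked with a short sketch. The names `pairs` and `units` are assumptions made for this example.

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# Hand-derived eigenpairs from the calculation above: (5, [2, 1]) and (2, [1, -1])
pairs = [(5.0, np.array([2.0, 1.0])),
         (2.0, np.array([1.0, -1.0]))]

units = []
for lam, v in pairs:
    # The vector itself, a scaled copy and a flipped copy all satisfy A w = lambda w
    for scale in (1.0, 10.0, -1.0):
        w = scale * v
        assert np.allclose(A @ w, lam * w)
    # Dividing by the length gives the unit-length version of the direction
    unit = v / np.linalg.norm(v)
    units.append(unit)
    print(lam, unit)
```

The unit vectors printed here should agree with the columns returned by `np.linalg.eig` up to a possible sign flip.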
Corresponding to these two eigenvalues, I have two eigenvectors. These are my eigenvectors, and you can manually multiply and verify them: A multiplied by the first eigenvector should equal λ1, the first eigenvalue, times the first eigenvector. I'll give you this as homework; please try it out and confirm. Likewise, A times the second eigenvector should equal the second eigenvalue times the second eigenvector. So this is how you calculate the eigenvalues and eigenvectors of a matrix A. I've taken a small matrix for the demonstration, but in principle the process is the same.

Coming back to our original discussion of eigenvalues and eigenvectors as applied to principal component analysis: before we get into PCA itself, let us understand the meaning of the principal components. Imagine that there are two features in your data set, X1 and X2, and let me draw some data points. These are the actual data points in your feature matrix; each pair of values (X1, X2) gives one of these circles. That's your original data matrix, plotted as a scatter plot.

Now I'm enclosing this entire data, and you can see that the data has variation in two directions: this is one direction, and there is one more, perpendicular direction in which the data varies. This direction represents PC1. Why is it the first principal component? Because a greater amount of the variation in your data set is captured along it. The other, perpendicular direction is your second principal component. Once again: the directions are the principal components, and conversely. So all the principal components of your feature matrix are essentially
direction vectors, and these are the same as eigenvectors. So I'm going to find some eigenvectors, and these eigenvectors directly are the principal components. The whole objective of principal component analysis is to identify the directions of maximum variation in your data set. The direction containing the largest variation among the features is called the first principal component, the second principal component is the direction with the second largest variation, and so on.

Most importantly, the amount of variation along a specific direction, say this much along PC1, is captured by the eigenvalue corresponding to the first principal component. Similarly, the amount of variation in this direction is represented by another eigenvalue, corresponding to the second principal component. Since we have two principal components, I'll have two eigenvalues as well. The direction of variation is the eigenvector, and the amount of variation along that direction is represented by the corresponding eigenvalue.

But the larger question is: why do I need principal component analysis at all? What are the basic applications? The first is feature reduction. Imagine you have a feature matrix that is, say, n × 1,000, so you have 1,000 features. Originally there might have been 700 or 800 features, but you did feature engineering and feature transformations, created additional features by taking linear combinations of the existing ones, and so on, and you finally have 1,000 features. Now the question is: are all 1,000 features going to be meaningful or important to your machine
learning problem? Maybe not. There could be a lot of features that are essentially noise in your data, though you may not know it at this point. If you try to build a machine learning model using all 1,000 features, it will take a lot of training time, so the computational cost is very large. The whole objective of PCA is: can I represent the information contained in the original feature matrix in a smaller feature matrix, say n × 50? Your data matrix was originally 1,000-dimensional; can I bring that down to only 50 dimensions while retaining around 85 to 90% of the information?

There will be some information loss, because you are projecting high-dimensional data into a lower-dimensional feature space. But what is that loss? The good thing is that, most of the time, what is lost is outliers, anomalies and noise in your data. That gives another application: if you want to do denoising or detect anomalies, you can use PCA. PCA is also used for anomaly detection; search Kaggle for credit card fraud detection using PCA, or anomaly detection using PCA, and you'll find several data sets and projects with end-to-end code.

You can also use it for compression. Imagine that this is a 1,000 pixel by 1,000 pixel picture of me. Can I represent the same picture in a lower-dimensional feature space, say 100 pixels by 100 pixels? If the machine has to learn what I look like, instead of giving it a very high-resolution image, can I give it a compressed image? If the machine can still identify that this is also me, then I don't have to give that much information for the machine to
train. I can simply focus on smaller images, so that computational time is saved, the model trains faster, and it generalizes better, because more data can also bring more noise, as we discussed, and may lead to an overfitted model. Noise also leads to overfitting. So another reason people typically do dimensionality reduction is to focus on the important features and get rid of the noise, which helps the model generalize well and reduces the variance of your model.

The important word here is components. Instead of the original columns or features, you take the principal components and treat them as the potential features for model building. So n × 1,000, say, will be transformed to n × 50. These 50 are no longer original features; they are transformed features, and these transformed features are referred to as components. So I'm going to take 50 components out of 1,000. The total number of components for any data set equals the number of features: if there are 1,000 features you get 1,000 principal components, with 1,000 eigenvalues and 1,000 eigenvectors, and I take the top 50 of those eigenvectors to transform my high-dimensional data matrix into a smaller feature matrix. That is what will be fed to my machine learning algorithm or deep learning architecture for training. That's the plan.

There is one more point, regarding visualization. Say your data set has 100 or 1,000 features. If I can reduce that feature space to n × 2, only two dimensions, I can use those two dimensions, PC1 and PC2, to visualize that big data. It helps me immediately see the pattern that was present in the 100-dimensional feature
space: bring it down to a two-dimensional feature space, let those two dimensions become your axes, and you can actually visualize the pattern in your data.

So why does dimensionality reduction matter? Again, efficiency: fewer features lead to faster training times, whether it is a classification or a regression problem. The main drawback of dimensionality reduction using principal component analysis is that PCA is an unsupervised machine learning technique. Even though you might have class labels, or target values in the case of regression, we do not use the y values for the transformation. Which transformation? The transformation of your feature space from the original 1,000 dimensions to a lower-dimensional feature space. That is the linear transformation we carry out using PCA, and it is a transformation of the feature matrix only; it does not consider anything about the target values.

What is the problem with this? The original feature matrix had a good amount of relationship with the original target values. When you transform that feature matrix into a lower-dimensional feature space, the projected data may not have as strong a relationship with the original target. The relationship of your features with the target can get drastically hampered, which is why PCA may not be a very good transformation technique as far as classification problems are concerned. It may mostly work for regression, but for classification I would suggest using PCA with extreme care, because it does not guarantee that the same relationship of the features with the target will be present in the lower-dimensional feature space. That relationship is not what we are capturing; we are only capturing the variation, the pattern present in your features, the variation of each
feature with respect to the other features.

Let's see the steps involved in principal component analysis. There are five steps. First, we compute the mean of the features and center the data, and we can also scale the features by dividing by the standard deviation of each feature. The scaled data is then used to calculate the covariance matrix, which captures the variation of each feature with respect to all the other features. If your data set is n × p, the standardized data is again n × p, and the covariance matrix is p × p: with p features, the covariance matrix is p × p in shape. Next comes the eigenvalue decomposition of that covariance matrix, which gives p eigenvalues and p eigenvectors. Then all I have to do is sort the eigenvalues in decreasing order and take the top eigenvectors as the principal components. Finally, I use that subset of eigenvectors to transform my original data set into a lower-dimensional feature space; that is the projection step.

Let's do this from scratch using NumPy, and later we'll see how it compares with scikit-learn's PCA class. To demonstrate, I'll take a real-life example, the Iris data set. If you search for iris flower on Google you'll see a lot of images. Just to give some context, you can see the petals and the sepals in the pictures, so you have petal length and petal width, and sepal length and sepal width. Based on these four features, I'm going to classify iris flowers into one of three classes: Iris setosa, Iris virginica and Iris versicolor. These are the three species of iris flowers. The data set I'm going to use has 150 rows and four features, sepal length and width and petal length and width, divided into three classes. This Iris data set is
built into scikit-learn, so I've imported it. Let me print the features: you can see 150 rows, and these are the values of the four features. X1, X2, X3 and X4 are sepal length, sepal width, petal length and petal width, all four in centimeters. In the y values we have the class labels 0, 1 and 2: label 0 for one class, 1 for another and 2 for the third, so they are already label encoded. As I mentioned, although we do have the class labels, PCA is an unsupervised learning technique, so we are not going to use that information anywhere in the transformation of the feature matrix.

Let's go through the steps. In step one, I calculate the mean of each feature and store it here, subtract that mean from the original feature values, and then divide by the standard deviation of each feature. That gives my standardized data: X_std holds the standardized features. This is how the original features look, printing just the first five rows, and after standardization you can see that all the values range from some negative to some positive number, centered around zero. Subtracting the mean is what centers the data.

The next step is to find the covariance of the standardized data. Here I'm using rowvar=False so that the features are represented by the columns. The rows are the data points, and I want the covariance between the columns, not between the data points. Since we have four features, the covariance matrix is 4 × 4; as I mentioned, if your data set is n × p, the covariance matrix is p × p. This covariance matrix represents how much each feature varies with respect to each of the other features. The rows are X1, X2, X3, X4, and the columns are again X1, X2, X3 and X4.
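The standardization and covariance steps described above might look roughly like this. It is a sketch only: the variable names and the use of the population standard deviation (`ddof=0`) are assumptions, since the notebook code is not shown in the transcript.

```python
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data                  # shape (150, 4): four features in cm

mean = X.mean(axis=0)                 # per-feature mean
std = X.std(axis=0)                   # per-feature std (ddof=0, an assumption)
X_std = (X - mean) / std              # centered and scaled features

# rowvar=False: each column is a feature, each row is a data point
cov = np.cov(X_std, rowvar=False)     # shape (4, 4)

print(X_std[:5])
print(cov)
```

With these choices the diagonal entries of `cov` are all equal to 150/149, slightly above 1, because `np.cov` divides by n − 1 by default while the standardization divided by n.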
The numbers on the diagonal are the same throughout; each is the variance of a standardized feature. This number represents the covariance between X1 and X4, and it is the same as this number here, since the matrix is symmetric. The higher the number, the greater the covariation and the more strongly the two features are associated. There are some negative numbers as well; a negative number means that as the values of one feature increase, the values of the other decrease. Covariance does not give any hint about the reasons for this; covariance does not imply causation.

Next I'm calculating the eigenvalues and eigenvectors of this covariance matrix. Since there are four features, I get four eigenvalues and four eigenvectors. Again, the eigenvectors are column vectors: this is my first eigenvector, then the second, third and fourth, and these are λ1, λ2, λ3 and λ4. The next step is to sort the eigenvalues and eigenvectors in decreasing order of the eigenvalues: first the eigenvector corresponding to the highest eigenvalue, then the one corresponding to the second largest, and so on. That is exactly what I'm doing here.

Before using these eigenvectors for dimensionality reduction, let's look at the amount of variation captured by these four directions. The eigenvalues represent the amount of variation, so I've normalized them by dividing by the sum of the eigenvalues. That gives the explained variance, which is the normalized variance. It says that the first eigenvector contains almost 73% of the variation in the data set, the second contains another 22.8%, and the third contains only about 3.6%.
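The decomposition, sorting and normalization just described can be sketched as follows. Note one swap: this example uses `np.linalg.eigh`, which is intended for symmetric matrices such as a covariance matrix, rather than the `eig` function mentioned earlier; the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
cov = np.cov(X_std, rowvar=False)

# eigh handles symmetric matrices and returns real eigenvalues in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Sort in decreasing order of eigenvalue, reordering the columns to match
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Fraction of the total variance carried by each direction
explained = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(explained)
print(np.round(explained, 4))
print(np.round(cumulative, 4))
```

The cumulative values are what guide the choice of how many components to keep.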
The fourth eigenvector hardly contains any information. That is exactly the variance explained by the principal components; remember, principal components are nothing but eigenvectors. So this is PC1, and these are PC2, PC3 and PC4.

For dimensionality reduction, I select as many principal components as are needed to represent, typically, 90 to 95% of the total variance. If I take the first two eigenvectors, or principal components, the sum of their two numbers is 0.9581, which means the first two principal components capture 95.8% of the total variation in the data set. Just two principal components explain 95.8% of the variation, or pattern, present in your features, and that is a very good amount of information. The information carried by the remaining two principal components is not even 5%, so I can discard it. Out of four I'll pick only two: the eigenvectors corresponding to the two largest eigenvalues.

The last step is to project your data into the lower-dimensional feature space, and for that you need the transformation matrix. Let's see how to construct it. All you have to do is stack your k principal components horizontally: this is my first principal component, the second, the third and so on, up to k principal components. Each principal component is a vector of p dimensions, so you get p rows, and since I'm stacking k principal components, my transformation matrix W is p × k. Now if I take my standardized data, which is n × p, and multiply it by W, which is p × k, I end up with an n × k matrix.
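The projection step, building W from the top k eigenvectors and multiplying, can be sketched like this. The names `W` and `X_pca` and the use of `np.linalg.eigh` are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data
X_std = (X - X.mean(axis=0)) / X.std(axis=0)       # n x p = 150 x 4

eigenvalues, eigenvectors = np.linalg.eigh(np.cov(X_std, rowvar=False))
order = np.argsort(eigenvalues)[::-1]              # largest eigenvalue first

k = 2
W = eigenvectors[:, order[:k]]                     # p x k: top-k eigenvectors as columns
X_pca = X_std @ W                                  # (n x p) @ (p x k) -> n x k

print(W.shape, X_pca.shape)
```

Because the columns of `W` are orthogonal eigenvectors of the covariance matrix, the two projected columns come out uncorrelated with each other.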
dimensional data. This is your PCA data, essentially your data reduced by using principal component analysis, and that's exactly what we are doing: X_std times the top two eigenvectors. In this case k is equal to 2, so the shape of the transformed data is 150 cross 2.

Awesome. Now let's have a quick 2D plot to visualize the data in these two dimensions. That is the good thing: we could not have imagined or visualized this data in four dimensions, but after reducing it to two dimensions we can easily visualize it. You can see PC1 on the x-axis and PC2 on the second axis. The data points are colored with respect to the classes, so from here we can clearly see what the separation between the classes looks like. This is the data from one class, this is the data from another class, and yellow is the third class. So these are the three classes, and we can very well see that they are quite separable even in the reduced-dimensional feature space. Even though you have reduced the data to lower dimensions, you can still separate the classes. But that is not guaranteed, as I discussed already, since PCA does not take the class labels into account.

Now let's see how we can do PCA using the scikit-learn API. First I have to import the PCA class from the decomposition module of scikit-learn. I'm initializing the PCA class with two principal components here, and once I have the PCA transformer object set, all I have to do is use the same standardized data and transform it. After transformation, let's look at the shape, which is 150 cross 2 now. From the explained_variance_ratio_ attribute I can see that the first principal component accounts for 73% of the information and the second principal component has another 23%, so cumulatively you have about 95% of the information contained within the first two principal components. There might be a slight difference in these numbers; if you go up and check,
these numbers are very similar to what we have got. There might be small differences because of the way these numbers are computed.

Okay, so let us summarize the key takeaways from this video. We had an elaborate discussion on eigenvalues and eigenvectors: what they physically mean, where they are actually used and how they are computed. I have shown you the manual calculation step by step. We also used the eigenvalue and eigenvector concept in principal component analysis, where we found the eigenvalues and eigenvectors of the covariance matrix to reduce the dimensionality of high-dimensional data. The major objective is to concentrate the features and do data visualization, and you can also quantify how much variance is captured by these principal components.

Some other tips I can give you: you don't have to do all that manual calculation step by step. Most people will use the PCA class directly. There you can specify how many principal components you want, or in place of 2 you can give a float like 0.9, which will automatically select as many principal components as are needed to capture 90% of the variation.

Another important point: when your data has a lot of features, some of them could be on a completely different scale. In such a case centering and scaling is very important, so that you capture the variation from each of these features by giving them equal importance. That's what centering and scaling does.

Finally, trying to do PCA for a very large data set will be computationally slow, so you need enough memory as well as time. If you're computing PCA for a data set which, let's say, has 10,000 features or 1,000 features, that's not very scalable. So PCA by eigenvalue decomposition is good for smaller data sets only. If you want to do dimensionality reduction for a very large data set, I would prefer to use
singular value decomposition. There is a truncated version of it as well, what we call truncated SVD in the scikit-learn API, which is preferred for efficiency and better stability when your data set is really large.

In the next video I'm going to discuss singular value decomposition: how you can take a matrix and decompose it into its components U, S and V, and then how we can do a practical real-world example using singular value decomposition. That's it for this video, and don't forget to like, share and subscribe to our channel for more such content. See you next time.
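The closing remark about preferring SVD for large data can be illustrated with a minimal NumPy sketch. This is not the video's notebook code: the data is synthetic (a random 150 x 4 array chosen only to match the shapes mentioned in the transcript), and names such as X_std, ratio_eig and ratio_svd are illustrative assumptions. It shows that the squared singular values of the standardized data, divided by n - 1, match the eigenvalues of the covariance matrix, so both routes should report the same explained variance.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in data (assumption): 150 samples, 4 correlated features
X = rng.normal(size=(150, 4)) @ rng.normal(size=(4, 4))

# Center each feature and scale it by its standard deviation
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
n = X_std.shape[0]

# Route 1: eigenvalues of the covariance matrix
cov = np.cov(X_std, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh returns ascending order
ratio_eig = eigvals / eigvals.sum()

# Route 2: SVD of the standardized data, without forming the covariance matrix
U, s, Vt = np.linalg.svd(X_std, full_matrices=False)
svd_vals = s**2 / (n - 1)                # same quantities as eigvals
ratio_svd = svd_vals / svd_vals.sum()

# Project onto the top-k right singular vectors (the principal directions)
k = 2
Z = X_std @ Vt[:k].T

print(Z.shape)
print(np.allclose(ratio_eig, ratio_svd))
```

The SVD route never builds the p x p covariance matrix, which is one reason it tends to be preferred when the number of features is large.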

Original Description

Github link to download the codes: https://github.com/prashant9501/YT_Videos_Resources/tree/main/Linear%20Algebra Master the core concepts of Eigenvalues, Eigenvectors, and Principal Component Analysis (PCA) in this beginner-friendly Linear Algebra for Data Science tutorial. Learn step-by-step how to compute Eigenvalues and Eigenvectors mathematically, understand dimensionality reduction using PCA, and explore real-world applications in machine learning and data science. 🚀 Topics Covered: ✅ Eigenvalues & Eigenvectors ✅ Principal Component Analysis (PCA) ✅ Characteristic Polynomial ✅ Solving for Eigenvalues ✅ Finding Eigenvectors ✅ PCA for Dimensionality Reduction Whether you're a data scientist, AI researcher, or student, this video will build your foundation in linear algebra for machine learning. Don’t forget to like, comment, and subscribe for more AI and data science content!

This video tutorial explains the concepts of Eigenvalues, Eigenvectors, and Principal Component Analysis (PCA) in Linear Algebra for Data Science, demonstrating how to apply these techniques for dimensionality reduction and data visualization using popular libraries like numpy, scikit-learn, and PyTorch.

Key Takeaways
  1. Extract eigenvalues and eigenvectors from a matrix using the eig function from numpy (numpy.linalg.eig)
  2. Plot eigenvectors in the original and transformed spaces
  3. Transform the eigenvectors by multiplying the matrix with them
  4. Standardize the features by subtracting the mean and dividing by the standard deviation
  5. Calculate the covariance matrix and perform its eigendecomposition
  6. Sort eigenvalues in decreasing order and select the top principal components
💡 Eigenvalues and Eigenvectors can be used to reveal the maximum variance directions in data and reduce dimensionality while preserving key patterns, making them essential techniques in Linear Algebra for Data Science.
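
Steps 4 to 6 above can be sketched in NumPy roughly as follows. This is a minimal sketch under stated assumptions, not the code from the video: the data is synthetic (150 samples, 4 features), the 0.95 threshold and all variable names are illustrative, and np.linalg.eigh is used in place of the eig function because the covariance matrix is symmetric.

```python
import numpy as np

rng = np.random.default_rng(42)
# Synthetic stand-in for the data set (assumption): n=150 samples, p=4 features
X = rng.normal(size=(150, 4)) @ rng.normal(size=(4, 4))

# Step 4: center each feature and scale it by its standard deviation
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Step 5: covariance matrix (p x p) and its eigendecomposition
cov = np.cov(X_std, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # columns of eigvecs are eigenvectors

# Step 6: sort eigenvalues and eigenvectors in decreasing order of eigenvalue
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Fraction of total variance captured by each principal component
var_explained = eigvals / eigvals.sum()
cum_var = np.cumsum(var_explained)

# Smallest k whose cumulative variance reaches the chosen threshold
threshold = 0.95
k = int(np.searchsorted(cum_var, threshold) + 1)

W = eigvecs[:, :k]   # transformer matrix, p x k
X_pca = X_std @ W    # reduced data, n x k

print(var_explained.round(3), k, X_pca.shape)
```

Multiplying the n x p standardized data by the p x k matrix W gives the n x k reduced data, which is the projection step described in the transcript.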
