Linear Regression Gradient Descent From Scratch in Python
Key Takeaways
A walkthrough of implementing linear regression with gradient descent in Python using NumPy, covering prediction, loss calculation, and weight updates.
Full Transcript
Let's code linear regression using gradient descent. In the last video I did a quick explanation, and if you want a more detailed and thorough derivation, I will link my blog post in the description.
Let's first go through the notation again. m will be the number of training examples, or the number of points. n will be the number of features, so the number of dimensions that each point is in. y will be 1 by m, one value for each training example. X will be an n by m matrix, since for each training example we have n values. And w will be our weights, which is just n by 1.
I made a quick skeleton code as well. We're going to use NumPy, and then we have a class with an initialization, a function where we compute our predictions, which is y hat, a function for the loss, a gradient descent function where we do the update step, and then a main function which will call these functions over and over for a total number of iterations that we choose.
What we need first is a learning rate, let's set it to 0.01, and a total number of iterations, let's say we run for 10,000.
Remember that y hat is just w1 x1 plus w2 x2 plus w3 x3, et cetera. The trick here is that we added x1, which is just 1, to make the notation more compact. Really we would have one intercept value and then the rest, but if x1 is initialized as 1, it comes out the same. We want the output to be one scalar value for each training example, so y hat will be the same shape as y, 1 by m. X is n by m and w is n by 1, so how we do that is we take w transpose times X: the transpose is 1 by n, times n by m, which gives an output of 1 by m. So here we can just return w transpose times X.
Then the loss will be 1 over self.m, which we define later, times the sum of y hat minus y, squared. So np.power of y hat minus y, comma 2: we subtract them, square each element, sum over all of those, and take the average. Then we just return L.
This might be the most difficult step, the gradient descent step. In the last video I think I showed the formula with loops. We're going to do it vectorized; it's still going to be the same thing, just more efficient code. But let's first think about it in the simplest way. What we want when we compute dL/dw is a vector of size n by 1, because what we want to do later on is set w to w minus the learning rate times dL/dw. The learning rate is multiplied element-wise with dL/dw, and since we do this for the entire vector w, which is n by 1, we also want dL/dw to be n by 1.
First of all, we know that we have to take y hat minus y, and this will be 1 by m. One trick you could use is to say: we know that the shape of y hat minus y is 1 by m, the shape of X that we need to multiply with is n by m, and we want the output to be n by 1. So how can we multiply them to get n by 1? If we multiply X with the transpose of y hat minus y, which is m by 1, we get n by 1. I guess what you could do just to make sure is print dLdw.shape and check that it's the correct shape, but this should be fine. There's one thing we're missing: we need to take 2 divided by self.m times np.dot of this. Then we do the update and return w. We need X here, and we need y hat, y, and w for the update.
For the main function we need to send in X and y. The first thing we need to do is add the ones at the beginning, for each training example. The input is n by m, and we're going to add another dimension so it becomes n plus 1, but the values we add are all just ones, for the intercept. So let's create x1 as an array of ones, and we need one for every training example, so its shape is 1 by X.shape[1]. Then we change the input X by adding this vector along the first dimension, so that it becomes n plus 1 by m.
After that we initialize self.m to be X.shape[1] and self.n to be X.shape[0], and w needs to be initialized as n by 1, so self.n comma 1.
Then what we want to do is first compute our prediction, check the loss for that prediction, and then run the update step. So for iteration in range of self.total_iterations plus 1: y hat will be self.y_hat of X, w; the loss will be self.loss of y hat, y; and then w will be self.gradient_descent of w, X, y, y hat. Also, if iteration modulo 2000 is 0, let's print the cost at that iteration using an f-string. In the end, after it has been trained and we have the updated w values, we return w.
Okay, this code hopefully should work. Now we just want to set up the if block at the bottom. We initialize X; let's say X is one-dimensional so we can graph it, so it's going to be 1 by 500. Remember that we're going to add a 1 to it for the intercept value, so really we're going to have y equals w1 plus w2 times x, and this is just a straight line in 2D. Then let's say the correct y is 3 times X plus some noise, so that the relationship is 3 times X but we have some noise in the correct values. Then we initialize the class, and w will be regression.main of X, y. That should be it, hopefully no errors.
Let's see. Damn. This is the problem with Sublime Text: it's probably not that big of an error, but if you look at this it looks super scary. It says it has no attribute y_hat. Okay, so y_hat should be like this. Yeah, okay, and loss. Nice, now it works.
We see that in the beginning the cost is pretty high, and then as we run it, it decreases quite quickly. I've written this code before and added another function that plotted it, to see that it actually does what it's supposed to. You're going to have to trust me on this one, or you can check by coding it yourself. But seeing the cost, or the loss, go down is a good indication that it's most likely working.
One thing to be careful with is setting the learning rate. If you set it too high... I haven't tried it, let's say 0.5: it still works. Let's say you set it to 100: yeah, it's going to diverge. But if you set it too low, say 1e-100, it's going to take a long time to converge. Let's say like 50... yeah, for this one we see that it starts decreasing, so if we put it like this it's probably going to decrease. So this is probably a good choice.
All right, thank you so much for watching. Hopefully this video helped you understand how to implement it. If you have any questions, leave them in the comments; I usually respond. See you next time.
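The transcript describes the code without showing it, so here is a minimal sketch of what the walkthrough seems to build, reconstructed from the spoken description rather than copied from the video. The method names (y_hat, loss, gradient_descent, main), the learning rate of 0.01, the 10,000 iterations, and the print interval of 2000 come from the transcript. The constructor arguments, the zero initialization of w, the seeded random generator, and the 0.1 noise scale are assumptions made for illustration.

```python
import numpy as np


class LinearRegression:
    """Linear regression trained with batch gradient descent (illustrative sketch)."""

    def __init__(self, learning_rate=0.01, total_iterations=10_000, print_every=2_000):
        # Assumption: exposing these as arguments; the video sets them inside __init__.
        self.learning_rate = learning_rate
        self.total_iterations = total_iterations
        self.print_every = print_every  # assumption: 0 disables printing

    def y_hat(self, X, w):
        # (1, n) @ (n, m) -> (1, m): one prediction per training example
        return np.dot(w.T, X)

    def loss(self, yhat, y):
        # Mean squared error over the m training examples
        return 1 / self.m * np.sum(np.power(yhat - y, 2))

    def gradient_descent(self, w, X, y, yhat):
        # (n, m) @ (m, 1) -> (n, 1), the same shape as w
        dLdw = 2 / self.m * np.dot(X, (yhat - y).T)
        return w - self.learning_rate * dLdw

    def main(self, X, y):
        # Prepend a row of ones so that w[0] acts as the intercept
        ones = np.ones((1, X.shape[1]))
        X = np.append(ones, X, axis=0)

        self.m = X.shape[1]  # number of training examples
        self.n = X.shape[0]  # number of features, including the row of ones
        w = np.zeros((self.n, 1))  # assumption: zero initialization

        for iteration in range(self.total_iterations + 1):
            yhat = self.y_hat(X, w)
            loss = self.loss(yhat, y)
            if self.print_every and iteration % self.print_every == 0:
                print(f"Cost at iteration {iteration} is {loss}")
            w = self.gradient_descent(w, X, y, yhat)
        return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # assumption: seeded for reproducibility
    X = rng.random((1, 500))  # 1 feature, 500 training examples
    y = 3 * X + 0.1 * rng.standard_normal((1, 500))  # y = 3x plus noise
    regression = LinearRegression()
    w = regression.main(X, y)
    print(w)  # first entry is the intercept, second the slope
```

With data generated this way, the learned slope should land close to 3 and the intercept close to 0. Passing a different learning_rate is an easy way to reproduce the divergence and slow-convergence behaviour discussed at the end of the video.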
Original Description
Linear Regression implementation in Python using numpy library.
❤️ Support the channel ❤️
https://www.youtube.com/channel/UCkzW5JSFwvKRjXABI-UTAkQ/join
Paid Courses I recommend for learning (affiliate links, no extra cost for you):
⭐ Machine Learning Specialization https://bit.ly/3hjTBBt
⭐ Deep Learning Specialization https://bit.ly/3YcUkoI
📘 MLOps Specialization http://bit.ly/3wibaWy
📘 GAN Specialization https://bit.ly/3FmnZDl
📘 NLP Specialization http://bit.ly/3GXoQuP
✨ Free Resources that are great:
NLP: https://web.stanford.edu/class/cs224n/
CV: http://cs231n.stanford.edu/
Deployment: https://fullstackdeeplearning.com/
FastAI: https://www.fast.ai/
💻 My Deep Learning Setup and Recording Setup:
https://www.amazon.com/shop/aladdinpersson
GitHub Repository:
https://github.com/aladdinpersson/Machine-Learning-Collection
✅ One-Time Donations:
Paypal: https://bit.ly/3buoRYH
▶️ You Can Connect with me on:
Twitter - https://twitter.com/aladdinpersson
LinkedIn - https://www.linkedin.com/in/aladdin-persson-a95384153/
Github - https://github.com/aladdinpersson