Linear Regression Gradient Descent From Scratch in Python

Aladdin Persson · Beginner ·📐 ML Fundamentals ·6y ago

Key Takeaways

An implementation of linear regression using gradient descent in Python with the NumPy library, covering prediction, loss calculation, and weight updates.
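As a compact reference, the prediction, loss, and weight update can be written out with the shapes used in the transcript (X is n by m, w is n by 1, y and y hat are 1 by m). This is a sketch of the formulas as the video describes them; the symbol \alpha for the learning rate is an assumption, since the video only calls it "the learning rate".

```latex
\begin{aligned}
\hat{y} &= w^{\top} X
  && \text{prediction, shape } 1 \times m \\
L(w) &= \frac{1}{m} \sum_{i=1}^{m} \left( \hat{y}_i - y_i \right)^2
  && \text{mean squared error, a scalar} \\
\frac{\partial L}{\partial w} &= \frac{2}{m}\, X \left( \hat{y} - y \right)^{\top}
  && \text{gradient, shape } n \times 1 \\
w &\leftarrow w - \alpha\, \frac{\partial L}{\partial w}
  && \text{update step}
\end{aligned}
```

The shape check is the same one made in the video: X is n by m and the transposed error is m by 1, so the product is n by 1 and matches w.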

Full Transcript

Let's code linear regression using gradient descent. In the last video I did sort of a quick explanation, and if you want a more detailed and thorough derivation of it, I will link my blog post in the description.

Let's first go through the notation again. m will be the number of training examples, or the number of points, and n will be the number of features, so the number of dimensions that each point is in. y will be 1 by m, one value for each training example, and X will be an n by m matrix: for each training example we have n values. And w will be our weights, which is just n by 1.

I made a quick skeleton code as well. We're going to use NumPy, and then we have a class, we have an initialization for the class, we're going to have a function where we compute our predictions, which is y hat, a function for the loss, and a gradient descent function where we do the update step, and then a main function which will call these functions over and over for a total number of iterations that we choose.

What we need first is a learning rate, let's set it to 0.01, and we're going to have a total number of iterations, let's say we run for 10,000.

Now, y hat. Remember that y hat is just w1 x1 plus w2 x2 plus w3 x3, et cetera. The trick here is that we set x1 to just 1, to make the notation more compact. Really, this is how it would look: we would have one intercept value and then the rest. But if we say that we have x1 here, which will be initialized as 1, then we can write it like this. We know that the output of this should be one scalar value per example, so y hat will be the same shape as y, so 1 by m. X is n by m, w is n by 1, and we want a scalar value for each training example, so the output should be 1 by m. How we do that is we take w transpose times X: w transpose would be 1 by n, times X, which is n by m, which will give the output 1 by m. So here we can just return w transpose times X.

Then the loss will be 1 over self.m, which we define later, times the sum of y hat minus y, squared. So np.power of y hat minus y, comma 2: we subtract, then we square each element, and then we just sum over all of those and take the average. Then we just return l.

This might be the most difficult step: the gradient descent step. In the last video I think I showed the formula with loops. We're going to do it vectorized; it's still going to be the same thing, just more efficient code. But let's first think about it in the simplest way. What we want to have returned from gradient descent, when we compute dL/dw, is a vector of size n by 1, because what we want to do later on is just take w equals w (we can write this first, actually) minus the learning rate times dL/dw. The learning rate will be multiplied element-wise with dL/dw, and if we're going to do this for the entire vector w, which is n by 1, we also want dL/dw to be of size n by 1.

First of all, we know that we have to take y hat minus y, and this will be 1 by m. One trick you could use is to say: okay, we know that the shape of this is 1 by m, and we also know that the shape of X, which we need to multiply with, is n by m, and we want the output to be n by 1. So how can we multiply them to get n by 1? Well, if we multiply X with the transpose of this, then we would get n by 1. So this is y hat minus y, which is of size 1 by m; let's take its transpose and multiply it with X. I guess what you could do, just to make sure, is print dLdw.shape and check that it's the correct shape, but this should be fine. There's one thing that we're missing: we need to take 2 divided by self.m times np.dot of this, and then return w. And we need to have X here, we need to have y hat, we need y, and w for the update.

Great. So for main we need to send in X and y. The first thing we're going to do is add the ones at the beginning, and we need to do it for each training example. The input will be n by m, and we're going to add another dimension, so it will be n plus 1, but the values that we add are all just going to be ones, for the intercept value. So let's create x1 to be np.ones, and we need to add it for all training examples, so we're going to make its shape 1 by X.shape[1]. Then we change the input X: we just add this vector along the first dimension, so that it becomes n plus 1.

After that, I will initialize self.m to be X.shape[1] again, self.n to be X.shape[0], and w needs to be initialized as n by 1, so self.n comma 1.

Having done that, what we want to do is first compute our prediction, check the loss for that prediction, and then run the update step. So for iteration in range of self.total_iterations plus 1, let's compute y hat to be self.y_hat of X comma w, the loss will be self.loss of y hat comma y, and then w will be self.gradient_descent of w, X, y, y hat. Also, let's do: if iteration modulo 2000 is 0, then print (let's use f-strings) the cost at that iteration. And in the end, after it has been trained, we return the updated w values.

Okay, this code hopefully should work. Now, to try it out, we initialize X. Let's say that X is just one-dimensional, like in a 2D graph, so we're going to have 1 by 500. Remember that we're going to add the ones to it for the intercept value, so really we're going to have y equals w1 plus w2 times x2, and this is just a straight line in 2D. Then let's say the correct y is 3 times X plus some noise, so that the relationship is 3 times X but we have some sort of noise in the correct values. Then we initialize the class as regression, and w will be regression.main of X comma y, and that should be it. Hopefully no errors.

Let's see... damn. This is the problem with Sublime Text: it's probably not that big of an error, but if you look at this, it's super scary. Let's see: "has no attribute y_hat". Okay, so y_hat should be like this... yeah, okay. And loss. Nice, now it works.

So we see that in the beginning the cost is pretty high, and then when we run it, it decreases quite quickly. I've written this code before, and I added another function that plotted it, so you can see that it actually does what it's supposed to. You're going to have to trust me on this one, or you can check by coding it yourself. But seeing the cost, or the loss, go down gives a good indication that it's most likely working.

One thing to be careful with is setting the learning rate. If you set it too high... I haven't tried it, let's say 0.5... it still works. Let's say you set it to, like, 100... yeah, it's going to diverge. But if you set it too low, like if you set it to 1e-100, it's going to take a long time for it to converge. Let's say, like, 50... yeah. For this one we see that it starts decreasing, so if we put it like this it's probably going to decrease. Yeah, so this is probably a good choice.

All right, thank you so much for watching. Hopefully this video helped you understand how to implement it. If you have any questions, leave them in the comments; I usually respond. See you next time.
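The walkthrough above can be put together as a runnable sketch. This is a reconstruction from the spoken description, not the author's exact file: the method names `y_hat`, `loss`, `gradient_descent`, and `main` follow the transcript, while the zero initialization of `w`, the `print_every` parameter, the random seed, the uniform inputs, and the noise scale of 0.1 are assumptions made here so the example runs on its own.

```python
import numpy as np


class LinearRegression:
    """Linear regression trained with batch gradient descent.

    Shapes follow the transcript: X is (n, m), y is (1, m), w is (n, 1).
    """

    def __init__(self, learning_rate=0.01, total_iterations=10_000):
        self.learning_rate = learning_rate
        self.total_iterations = total_iterations

    def y_hat(self, X, w):
        # (1, n) @ (n, m) -> (1, m): one prediction per training example
        return np.dot(w.T, X)

    def loss(self, yhat, y):
        # Mean squared error over the m training examples
        return 1 / self.m * np.sum(np.power(yhat - y, 2))

    def gradient_descent(self, w, X, y, yhat):
        # (n, m) @ (m, 1) -> (n, 1), the same shape as w
        dLdw = 2 / self.m * np.dot(X, (yhat - y).T)
        return w - self.learning_rate * dLdw

    def main(self, X, y, print_every=2000):
        # Prepend a row of ones so that w[0] acts as the intercept
        ones = np.ones((1, X.shape[1]))
        X = np.append(ones, X, axis=0)

        self.m = X.shape[1]  # number of training examples
        self.n = X.shape[0]  # number of features, including the row of ones

        # Assumption: zero initialization (the transcript only gives the shape)
        w = np.zeros((self.n, 1))

        for iteration in range(self.total_iterations + 1):
            yhat = self.y_hat(X, w)
            loss = self.loss(yhat, y)
            if iteration % print_every == 0:
                print(f"Cost at iteration {iteration} is {loss}")
            w = self.gradient_descent(w, X, y, yhat)

        return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # assumed seed, for repeatability
    X = rng.random((1, 500))  # assumed: uniform inputs in [0, 1)
    y = 3 * X + 0.1 * rng.standard_normal((1, 500))  # assumed noise scale
    w = LinearRegression().main(X, y)
    print(w)  # w[1] should land near the true slope of 3
```

Because the data is generated as 3 times X plus noise with no offset, the learned intercept `w[0]` should end up close to zero and the slope `w[1]` close to 3, which is a quick sanity check in place of the plot mentioned in the video.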

Original Description

Linear Regression implementation in Python using numpy library. ❤️ Support the channel ❤️ https://www.youtube.com/channel/UCkzW5JSFwvKRjXABI-UTAkQ/join Paid Courses I recommend for learning (affiliate links, no extra cost for you): ⭐ Machine Learning Specialization https://bit.ly/3hjTBBt ⭐ Deep Learning Specialization https://bit.ly/3YcUkoI 📘 MLOps Specialization http://bit.ly/3wibaWy 📘 GAN Specialization https://bit.ly/3FmnZDl 📘 NLP Specialization http://bit.ly/3GXoQuP ✨ Free Resources that are great: NLP: https://web.stanford.edu/class/cs224n/ CV: http://cs231n.stanford.edu/ Deployment: https://fullstackdeeplearning.com/ FastAI: https://www.fast.ai/ 💻 My Deep Learning Setup and Recording Setup: https://www.amazon.com/shop/aladdinpersson GitHub Repository: https://github.com/aladdinpersson/Machine-Learning-Collection ✅ One-Time Donations: Paypal: https://bit.ly/3buoRYH ▶️ You Can Connect with me on: Twitter - https://twitter.com/aladdinpersson LinkedIn - https://www.linkedin.com/in/aladdin-persson-a95384153/ Github - https://github.com/aladdinpersson

This video teaches how to implement Linear Regression using Gradient Descent in Python from scratch, covering key concepts and providing a practical example.

Key Takeaways
  1. Import necessary libraries (NumPy)
  2. Define the Linear Regression class
  3. Initialize weights and learning rate
  4. Compute predictions (y hat)
  5. Calculate loss
  6. Update weights using Gradient Descent
  7. Run the training loop for a specified number of iterations
💡 The choice of learning rate is crucial for the convergence of the Gradient Descent algorithm.
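The point about the learning rate can be checked with a small experiment. The sketch below is illustrative rather than taken from the video: the helper `final_loss` is a hypothetical name, and the seed, the synthetic data, the zero initialization, and the specific small rate of 1e-6 are assumptions. The rates 0.01 and 100 are the ones tried in the video.

```python
import numpy as np


def final_loss(learning_rate, X, y, iterations=10_000):
    """Run batch gradient descent and return the last mean squared error.

    X is (n, m) with a leading row of ones, y is (1, m).
    """
    m = X.shape[1]
    w = np.zeros((X.shape[0], 1))  # assumed zero initialization
    # Silence overflow warnings: a diverging run is expected to blow up
    with np.errstate(over="ignore", invalid="ignore"):
        for _ in range(iterations):
            yhat = w.T @ X
            w = w - learning_rate * (2 / m) * (X @ (yhat - y).T)
        return float(np.sum((w.T @ X - y) ** 2) / m)


rng = np.random.default_rng(0)  # assumed seed
x = rng.random((1, 500))
X = np.vstack([np.ones((1, 500)), x])  # row of ones for the intercept
y = 3 * x + 0.1 * rng.standard_normal((1, 500))

for lr in (1e-6, 0.01, 100.0):
    print(f"learning rate {lr}: final loss {final_loss(lr, X, y)}")
```

With a rate that is too small the loss barely moves from its starting value in the given number of iterations, with a moderate rate it settles near the noise level, and with a rate that is too large the weights overflow and the loss is no longer a finite number.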
