Linear Regression in Python - Machine Learning From Scratch 02 - Python Tutorial
Key Takeaways
The video demonstrates linear regression in Python, implemented from scratch with NumPy; scikit-learn is used only to generate the example data. It covers the concept, the math, and the implementation, using gradient descent as the optimization algorithm and mean squared error as the cost function.
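For reference, the cost function and the gradient descent updates named above can be written out as follows. This is a sketch in the transcript's spoken notation (w for the weights, b for the bias, alpha for the learning rate, N for the number of samples); the on-screen formulas are not reproduced in the text, so the exact layout here is an assumption.

```latex
\begin{align}
\hat{y}_i &= w x_i + b \\
J(w, b) &= \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2 \\
\frac{\partial J}{\partial w} &= \frac{1}{N} \sum_{i=1}^{N} 2 x_i \left( \hat{y}_i - y_i \right) \\
\frac{\partial J}{\partial b} &= \frac{1}{N} \sum_{i=1}^{N} 2 \left( \hat{y}_i - y_i \right) \\
w &\leftarrow w - \alpha \frac{\partial J}{\partial w}, \qquad
b \leftarrow b - \alpha \frac{\partial J}{\partial b}
\end{align}
```

The constant factor 2 only rescales the step size, which is why it can be dropped in code and absorbed into the learning rate.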
Full Transcript
Hi everybody, welcome to a new tutorial. This is the second video of the Machine Learning From Scratch tutorial series. In this series we are going to implement popular machine learning algorithms using only built-in Python modules and NumPy. Today we are going to implement the linear regression algorithm.
Let's talk about the concept of linear regression first. In regression we want to predict continuous values, whereas in classification we want to predict a discrete value like a class label, 0 or 1. If we have a look at this example plot, we have our data, the blue dots, and we want to approximate this data with a linear function. That's why it's called linear regression: we use a linear function to predict the values. We can define the approximation as y hat equals w times x plus b. This is the line equation, where w, our weights, is the slope, and b is the bias, or just the shift along the y axis in the 2D case.
So this is the approximation, and now we have to come up with this w and b. How do we find them? For this we define a cost function, and in linear regression this is the mean squared error. We take the difference between the actual value and the approximated value (for the actual values we need training samples), then we square this difference, sum over all the samples, and divide by the number of samples. This way we get the mean error. This is the cost function, so this is the error, and of course we want to have the error as small as possible. So we have to find the minimum of this function.
How do we find the minimum? For this we need to calculate the derivative, or the gradient. We calculate the gradient of our cost function with respect to w and with respect to b. This is the formula of the gradient. Please check this for yourself; I will also put some links in the description with some further reading, but I will not go into detail now.
With this gradient we use a technique that is called gradient descent. This is an iterative method to get to the minimum. If we have our objective, or cost function, here, then we start somewhere, so we have some initialization of the weights and the bias, and then we want to go in the direction of the steepest descent. The direction of steepest descent is given by the gradient, more precisely by its negative, so we want to go in the negative direction of the gradient. We do this iteratively until we finally reach the minimum.
With each iteration we have an update rule for the new weights and the new bias: the new w is the old w minus alpha times the derivative. Minus, because we want to go in the negative direction. This alpha is the so-called learning rate, and it is an important parameter for our model. The learning rate defines how far we go in this direction with each iteration step. For example, if you use a small learning rate, it may take longer but it can finally reach the minimum. If you use a big learning rate, it might be faster, but it might also jump around like this and never find the minimum. So this is an important parameter that we have to specify; please keep that in mind.
Now that we know our update rules, I've written the formulas for the derivatives again here and simplified them a little bit. Please check that for yourself. These are the formulas for the update rules and the derivatives, and this is all we need to know, so now we can get started.
Let's define a class called LinearRegression. This will of course have an init method, __init__, which takes self and the learning rate, lr. I will give this a default value; usually this is a very small value, so I will give it 0.001. Then I will give it a number of iterations, n_iters, so how many iterations we use in our gradient descent method, and I will also give this a default value of 1000. Then I will simply store them here, so I will say self.lr equals lr and self.n_iters equals n_iters. Later we have to come up with the weights, but here at the beginning I will simply say self.weights equals None and self.bias equals None.
Then we have to define two functions, and we will follow the conventions of other machine learning libraries here. We will define a fit method, which takes the training samples and the labels for them; this will involve the training step and the gradient descent. And we will define a predict method, so when it gets new test samples it can approximate the values and return them. These are the functions we have to implement.
Before we go on, let's have a quick look at the data, X and y. For this I've written a little example script. I used the scikit-learn module to generate some example data, and I split the data into training and test samples and training and test labels. First of all, let's have a look at how this data looks. Here's the plot. This is how our data looks, and now we want to find a function somewhere here that approximates the values.
Let's also have a look at the shape of our X and y. Let's run this (I don't want this plot here anymore). We see that our X is an ndarray of size 80 by 1. This is because I put in here that I want to have 100 samples and one feature for each sample, and then I split this, so our training set only has 80 samples in it. So this is an ndarray of size 80 by 1, and our training labels are just a 1D row vector, also of size 80. So for each training sample we have one value. This is how our data looks.
Now let's continue and implement the fit method. As I said, we need to implement the gradient descent method here, and gradient descent always needs to start somewhere, so we need some initialization. Let's init our parameters. First let's get the number of samples and the number of features; we can get this by saying n_samples, n_features equals X.shape. Then we simply initialize all the weights with zero, so we can say self.weights equals numpy zeros of size n_features, so for each component we put in a zero, and self.bias equals 0, which is just a single value. You can also use random values here, but zero is just fine, so let's use zero.
Then we use gradient descent. This is an iterative process, so we use a for loop: for i... no, actually we don't need this, so for underscore in range self.n_iters. Let's have a look at the formula again. The formula for our new weights is the old weights minus the learning rate times the derivative. The derivative with respect to w is 1 over N, then the sum over 2 times x_i times the difference of the approximated and the actual value. So let's first calculate this approximation. We have this formula here: the approximation is the weights times our x plus the bias. Let's call this y_predicted, and we can use np.dot of X and self.weights, plus bias. This multiplies X with the weights.
Now that we have the approximation, we can calculate the derivative with respect to w. Let's again have a look at this formula: 1 over N, then this sum, and inside the sum we have the product of x times this difference. So we say 1 over n_samples (we already got the number of samples here), times the sum product, which is nothing else but also the dot product, so np.dot. But now we have to be careful. What we did here: we multiply each weight component with the feature vector component and sum it up, and we do this for all samples and get one value for each sample. And here we want to get one value for each feature vector component. So we multiply each sample with the difference between the predicted and the actual value and sum it up, and we do this for each feature vector component and get one value for each component. This is the other way around, along the other axis, so we have to use X transposed here, X.T. This is the dot product of the transposed X and y_predicted minus the actual y. Please check the NumPy dot function for yourself. So this is the derivative with respect to w.
The derivative with respect to the bias, again let's have a look at the formula, is the same except that we don't have the x here. So this is 1 over N and then just the sum of this difference. By the way, I left the 2 out; this is just a scaling factor that we can omit. So again 1 over n_samples, and then we can say numpy sum of y_predicted minus the actual y. These are our derivatives, and now we update our weights. We say self.weights minus-equals the learning rate, self.lr, times this derivative, and self.bias minus-equals the learning rate times the derivative. And yeah, this is the gradient descent.
Now we need the predict method. Again we approximate the values with this formula, which we already have here: the dot product of X and the weights plus the bias, and then we simply return this. So this is the whole implementation that we need. And I forgot to import NumPy of course, so let's say import numpy as np so that we can use this.
Now let's test this. Let's import this class: from linear_regression import LinearRegression. Then create a regressor equals LinearRegression, and then we say regressor.fit, and we want to fit the training samples and the training labels. Then we can get the predicted values equals regressor.predict, and now we want to predict the test samples.
To see how our model performs, we can't use the accuracy measure; here we use the mean squared error. As I said, this is our cost function. The mean squared error tells us how big the difference between the actual values and the approximated values is. So let's define the mean squared error: def mse, which gets the actual values and the predicted values. Here we can use numpy mean and then simply the squared difference, so y_true minus y_predicted to the power of 2, and we return this. Then let's say mse_value equals mse of y_test and the predicted values, and let's print this.
Now if we run this... it's not running: bias is not defined. So what did we miss? self.bias. Let's run this again. Hmm, 27... oh sorry, I copied this and forgot this here. So, next try. Now we see our performance: the mean squared error is 783, so this is pretty high. Let's use another learning rate here, lr equals 0.01, and run this. Now we see that our error is smaller.
Let's actually plot this. Let's plot first with the original learning rate and see how the plot looks. The plot looks like this: it's almost the right slope, but not exactly. Now let's use the other learning rate and run this. Now our plot looks like this, and this looks pretty good actually. This is a pretty good fit, a pretty good approximation of this data with a linear function. So we see that our implementation is working. I hope you enjoyed this tutorial, and see you in the next tutorial. Bye!
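The steps narrated in the transcript can be put together roughly as follows. This is a minimal sketch reconstructed from the spoken description, not the author's exact file (that is linked in the description below). The attribute names (lr, n_iters, weights, bias) follow the transcript; the synthetic data generated with NumPy is an assumption that stands in for the scikit-learn data used in the video, so the error values will not match the ones mentioned there.

```python
import numpy as np


class LinearRegression:
    """Linear regression fitted with batch gradient descent (sketch)."""

    def __init__(self, lr=0.001, n_iters=1000):
        self.lr = lr            # learning rate (alpha)
        self.n_iters = n_iters  # number of gradient descent steps
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        # Start from zeros, as in the transcript.
        self.weights = np.zeros(n_features)
        self.bias = 0.0

        for _ in range(self.n_iters):
            # One prediction per sample: X @ w + b
            y_predicted = np.dot(X, self.weights) + self.bias
            # Gradients of the mean squared error; the constant
            # factor 2 is omitted, as mentioned in the transcript.
            dw = (1 / n_samples) * np.dot(X.T, y_predicted - y)
            db = (1 / n_samples) * np.sum(y_predicted - y)
            # Step in the negative gradient direction.
            self.weights -= self.lr * dw
            self.bias -= self.lr * db
        return self

    def predict(self, X):
        return np.dot(X, self.weights) + self.bias


def mse(y_true, y_predicted):
    """Mean squared error between actual and predicted values."""
    return np.mean((y_true - y_predicted) ** 2)


# Assumption: synthetic data (true slope 3, intercept 2, small noise)
# replaces the scikit-learn generated data from the video.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.1, size=100)
X_train, X_test = X[:80], X[80:]
y_train, y_test = y[:80], y[80:]

regressor = LinearRegression(lr=0.01, n_iters=1000)
regressor.fit(X_train, y_train)
predicted = regressor.predict(X_test)
print("MSE:", mse(y_test, predicted))
```

Using X.T in the weight gradient gives one value per feature, while np.dot(X, self.weights) gives one value per sample, which is the axis distinction the transcript points out.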
Original Description
In this Machine Learning from Scratch Tutorial, we are going to implement the Linear Regression algorithm, using only built-in Python modules and numpy. We will also learn about the concept and the math behind this popular ML algorithm.
The code can be found here:
https://github.com/patrickloeber/MLfromscratch
Further readings:
https://ml-cheatsheet.readthedocs.io/en/latest/linear_regression.html
https://ml-cheatsheet.readthedocs.io/en/latest/gradient_descent.html
You can find me here:
Website: https://www.python-engineer.com
Twitter: https://twitter.com/patloeber
GitHub: https://github.com/patrickloeber