SVM (Support Vector Machine) in Python - Machine Learning From Scratch 07 - Python Tutorial

Patrick Loeber · Beginner ·📐 ML Fundamentals ·6y ago

Key Takeaways

This video tutorial demonstrates the implementation of a Support Vector Machine (SVM) algorithm using only built-in Python modules and numpy, covering the concept, math, and code behind this popular machine learning technique. It uses a linear model to find a hyperplane that best separates data and maximizes the margin between classes by minimizing the magnitude of W, applying gradient descent to find W and B.
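For reference, the optimisation problem summarised above can be written out as follows. This is a sketch reconstructed from the spoken explanation, not a copy of the video's slides. The per-sample form of the cost and the symbols (λ for the regularisation parameter, α for the learning rate) are notational assumptions.

```latex
% Linear model and margin condition, with labels y_i in {-1, +1}
f(x) = w \cdot x - b, \qquad y_i \, (w \cdot x_i - b) \ge 1

% Per-sample cost: regularisation term plus hinge loss
J_i = \lambda \lVert w \rVert^2 + \max\bigl(0,\; 1 - y_i (w \cdot x_i - b)\bigr)

% Gradients when the condition y_i f(x_i) >= 1 holds
\frac{\partial J_i}{\partial w} = 2 \lambda w, \qquad \frac{\partial J_i}{\partial b} = 0

% Gradients otherwise
\frac{\partial J_i}{\partial w} = 2 \lambda w - y_i x_i, \qquad \frac{\partial J_i}{\partial b} = y_i

% Gradient descent update rules
w \leftarrow w - \alpha \, \frac{\partial J_i}{\partial w}, \qquad b \leftarrow b - \alpha \, \frac{\partial J_i}{\partial b}
```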

Full Transcript

Hi everybody, welcome to a new Machine Learning From Scratch tutorial. Today we are going to implement the SVM algorithm using only built-in Python modules and numpy.

The SVM, or support vector machine, is a very popular algorithm. It follows the idea to use a linear model and to find a linear decision boundary, also called a hyperplane, that best separates our data. Here the choice of the best hyperplane is the one that represents the largest separation, or the largest margin, between the two classes. So we choose the hyperplane so that the distance from it to the nearest data point on each side is maximized.

If you have a look at this image, we want to find a hyperplane, and the hyperplane has to satisfy the equation w · x - b = 0. We want to find the hyperplane so that the distance to both classes is maximized. We use the class +1 here and -1 here, so this distance, or the margin, should be maximized.

First let's have a look at the math behind it. It's a little bit more complex than in my previous tutorials, but I promise that once you have understood it, the final implementation is fairly simple. We use the linear model w · x - b, which should be 0 on the hyperplane, and our function should also satisfy the condition that w · x - b is greater than or equal to 1 for our class +1. So all the samples here must lie on the left side of this line, and all the samples of the class -1 must lie on the right side of this line. Put mathematically, it must satisfy w · x_i - b >= 1 for class +1, or w · x_i - b <= -1 for class -1. If you put this in only one equation, then we multiply our linear function with the class label, and this should be greater than or equal to 1: y_i (w · x_i - b) >= 1. This is the condition that we want to satisfy. Now we want to come up with the w and the b, so our weights and the bias,
and for this we use a cost function and then apply gradient descent. If you're not familiar with gradient descent already, then please watch one of my previous tutorials, for example the one on linear regression. There I explain this in a little more detail.

Now let's continue. We use a cost function here, and in this case we use the hinge loss. This is defined as the maximum of 0 and 1 - y_i (w · x_i - b), so 1 minus our condition, y_i times our linear model. What this means is, if we plot the hinge loss, then the blue line here is the hinge loss. It is 0 if y times f is greater than or equal to 1. So if they have the same sign, so if they are correctly classified and y times f is larger than 1, then our loss is 0. If we have a look at this image again, for the green class, if it lies on this side then it's 0, and for the blue class, if it lies on this side then it's also 0. Otherwise we have a linear function, so the further we are away from our decision boundary line, the higher is our loss.

This is one part of our cost function. The other part is, as I already said, that we want to maximize the margin between these two classes. The margin is defined as 2 over the magnitude of w, so this is dependent on our weight vector. We want to maximize this, and therefore we want to minimize the magnitude, so we add this to our cost function. We put in the term magnitude of w to the power of 2, times a lambda parameter, and then here we have our hinge loss. The lambda parameter tries to find a trade-off between these two terms, so it basically says which is more important. Of course we want to have the right classification, we want to lie on the correct side of our lines, but we also want to have the line such that the margin is maximized.

So if you look at the two cases: if we are on the right side
of the lines, so if y_i times f(x_i) is greater than or equal to 1, then we only have this term, because the hinge loss is 0. Otherwise our cost function is this here.

Now we want to minimize that, so we want to get the derivatives, or the gradients, of our cost function. In the first case, if we are greater than or equal to 1, our derivative with respect to w is only 2 times lambda times w. Here we only look at one component of our w, so we get rid of the magnitude. And the derivative with respect to b is 0. Please double check that for yourself; I will not explain the derivatives in detail here. In the other case, so if y_i times f(x_i) is not greater than or equal to 1, our derivative with respect to w is this equation here, 2 times lambda times w minus y_i times x_i, and the derivative with respect to our bias is only y_i. Again, please double check that for yourself.

When we have our gradients we can use the update rule. The new weight is the old weight minus (because we use gradient descent, so we go in the negative direction) the learning rate, or the step size, times the derivative. These are our update rules.

I hope you've understood the concept and the math behind this, and now we can start implementing it. This is now straightforward. First of all we import numpy as np, of course. Then we create our class SVM, which will get an __init__ method. Here I will put in a learning rate, which will get a default value of 0.001, and it will get a lambda parameter, which will also get a default, and I will say this is 0.01, so this is usually also a small value. Then it will get the number of iterations for our optimization, n_iters, which will get the default of 1000. Then I will simply store them. So I will say self.lr = learning_rate and self.lambda_param = lambda_param. Note that I cannot use lambda here, because lambda is a keyword in Python for the lambda function. Then self.n_iters = n_iters. Then I
will say self.w = None and self.b = None, so I have to come up with them later. Then we define our two functions. As always, one is the fit method, where we fit the training samples and the training labels, and the other one is the predict method, where we predict the labels of the test samples.

Let's start with the predict method, because this is very short. As I said, if we look at the math, we apply this linear model and then we look at the sign of it. If it's positive then we say it's class +1, and if it's negative then we say it's class -1. So we say linear_output = np.dot(X, self.w) - self.b, so the dot product of X and self.w, minus self.b, and then we choose the sign. We can simply say return np.sign(linear_output). This is the whole predict implementation.

Now let's continue with the fit method. First of all, as I said, we use the classes +1 and -1 here, so we want to make sure that our y has only -1 and +1. Oftentimes it has 0 and 1, so let's convert this. Let's say y_ = np.where(y <= 0, -1, 1). np.where gets a condition, so we say if y is less than or equal to 0 then we put in -1, and otherwise we put in +1. This will convert all the zeros or smaller numbers to -1 and the other numbers to +1.

Now let's get the number of samples and the number of features. This is simply n_samples, n_features = X.shape, because our input X is a numpy ndarray where the number of rows is the number of samples and the number of columns is the number of features. Then we want to initialize our w and our b, and we simply put in zeros in the beginning. So we say self.w = np.zeros(n_features), so for each feature component we put in a zero for our weight component, and then we say self.b = 0.

Now we can start with our gradient descent. So we say for _, because we don't need this,
in range(self.n_iters), so the number of iterations we want to do this. Then we iterate over our training samples, so I say for idx, x_i in enumerate(X). This will give me the current index and also the current sample.

Now let's have a look at the math again. I want to calculate the derivative of our cost function with respect to w and with respect to the bias. At first I look if this condition is satisfied. The condition is y_i times our linear function, so I say condition = y_[idx] * (np.dot(x_i, self.w) - self.b) >= 1. So y_ of the current index times the linear function, and this should be greater than or equal to 1. If this is satisfied then the condition is True, and otherwise it's False.

Now I say if condition. If this is true, then our derivatives look like this: the derivative with respect to b is just 0, so we only need the one for w, which is 2 times lambda times w. In our update we say the new weight is the old weight minus the learning rate times this. I write this in one step, so I say self.w -= self.lr * (2 * self.lambda_param * self.w). This is the update if our condition is satisfied, and we only need this update.

Otherwise we say self.w -= self.lr times, and let's again have a look at the equation, 2 times lambda times w minus y_i times x_i. So self.w -= self.lr * (2 * self.lambda_param * self.w - np.dot(x_i, y_[idx])), where I multiply our vector x_i with y_i, so the y_ of the current index. This is our update for w. And for our bias, self.b -= self.lr times the derivative, and the derivative is just y_i, so self.b -= self.lr * y_[idx].

Now we're done, so this is the whole implementation. Now let's test this. I've written a
little test script that will import this SVM class and then generate some test samples, so it will generate two classes. Then I create my SVM classifier and fit the data, and I wrote a little function to visualize this. You can find the code on GitHub, by the way, so please check that out for yourself. Now if we run this, so let's say python svm_test.py, this should calculate the weights and the bias, and it should also plot the decision function, so that yellow line and the two lines on both sides here. And we see that it's working.

So that's all about the SVM. I hope you enjoyed this, and if you liked it, please subscribe to my channel. See you next time, bye.
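The class narrated in the transcript can be sketched as below. The identifiers follow the spoken description (learning_rate, lambda_param, n_iters, y_, idx, x_i). The small data set at the end is an assumption made for illustration: the cluster centres, spread and random seed are hypothetical and are not taken from the author's test script on GitHub.

```python
import numpy as np


class SVM:
    def __init__(self, learning_rate=0.001, lambda_param=0.01, n_iters=1000):
        self.lr = learning_rate
        self.lambda_param = lambda_param  # "lambda" itself is a Python keyword
        self.n_iters = n_iters
        self.w = None
        self.b = None

    def fit(self, X, y):
        n_samples, n_features = X.shape
        # Map labels such as {0, 1} to {-1, +1}
        y_ = np.where(y <= 0, -1, 1)
        self.w = np.zeros(n_features)
        self.b = 0.0

        for _ in range(self.n_iters):
            for idx, x_i in enumerate(X):
                # Margin condition: y_i * (w . x_i - b) >= 1
                condition = y_[idx] * (np.dot(x_i, self.w) - self.b) >= 1
                if condition:
                    # Only the regularisation term contributes to the gradient
                    self.w -= self.lr * (2 * self.lambda_param * self.w)
                else:
                    # Regularisation term plus hinge-loss gradient
                    self.w -= self.lr * (
                        2 * self.lambda_param * self.w - np.dot(x_i, y_[idx])
                    )
                    self.b -= self.lr * y_[idx]

    def predict(self, X):
        linear_output = np.dot(X, self.w) - self.b
        return np.sign(linear_output)


# Hypothetical test data: two well-separated 2D clusters (assumed values)
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=-3.0, scale=0.5, size=(20, 2)),
    rng.normal(loc=3.0, scale=0.5, size=(20, 2)),
])
y = np.array([0] * 20 + [1] * 20)

clf = SVM()
clf.fit(X, y)
predictions = clf.predict(X)
print(clf.w, clf.b)
```

Note that predict returns -1 and +1 rather than the original 0 and 1 labels, so any accuracy check has to compare against the converted labels.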

Original Description

Get my Free NumPy Handbook: https://www.python-engineer.com/numpybook

In this Machine Learning from Scratch Tutorial, we are going to implement a SVM (Support Vector Machine) algorithm using only built-in Python modules and numpy. We will also learn about the concept and the math behind this popular ML algorithm.

GREAT PLUGINS FOR YOUR CODE EDITOR
✅ Write cleaner code with Sourcery: https://sourcery.ai/?utm_source=youtube&utm_campaign=pythonengineer *

📓 Notebooks available on Patreon: https://www.patreon.com/patrickloeber
⭐ Join Our Discord: https://discord.gg/FHMg9tKFSN

If you enjoyed this video, please subscribe to the channel!

The code can be found here: https://github.com/patrickloeber/MLfromscratch

Further readings: https://towardsdatascience.com/support-vector-machine-introduction-to-machine-learning-algorithms-934a444fca47

You can find me here:
Website: https://www.python-engineer.com
Twitter: https://twitter.com/patloeber
GitHub: https://github.com/patrickloeber

#Python #MachineLearning

* This is a sponsored link. By clicking on it you will not have any additional costs, instead you will support me and my project. Thank you so much for the support! 🙏

This video tutorial teaches how to implement a Support Vector Machine (SVM) algorithm from scratch using Python and numpy, covering the concept, math, and code behind this popular machine learning technique. It provides a step-by-step guide on how to implement the SVM algorithm, including finding the hyperplane that maximizes the margin between classes and applying gradient descent to find W and B. By the end of this tutorial, viewers will be able to implement and train their own SVM models.

Key Takeaways
  1. Implement the SVM algorithm using Python modules and numpy
  2. Find the hyperplane that maximizes the margin between classes
  3. Apply gradient descent to find W and B
  4. Create a class for the SVM model with an init method and default values for the learning rate, lambda parameter, and number of iterations
  5. Define the predict method to apply a linear model and return the sign of the output
  6. Define the fit method to convert the labels to -1 and 1, and then fit the model using the training samples and labels
  7. Initialize W and B with zeros
  8. Update W and B using gradient descent
  9. Calculate derivative of cost function with respect to W and B
  10. Update W and B using learning rate and lambda parameter
💡 The SVM algorithm uses a trade-off between the margin and classification accuracy, and the hinge loss is 0 when the label times the linear output, y_i (w · x_i - b), is greater than or equal to 1. The cost function is minimized using gradient descent, with the learning rate setting the step size and the lambda parameter weighting the margin term.
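The hinge loss and the trade-off controlled by the lambda parameter can be checked numerically with a short sketch. The function names hinge_loss and cost, the example values, and the choice to average the hinge loss over the samples are assumptions made for illustration; they do not come from the video.

```python
import numpy as np


def hinge_loss(y, f):
    # max(0, 1 - y * f), evaluated element-wise
    return np.maximum(0, 1 - y * f)


def cost(w, b, X, y, lambda_param=0.01):
    # Regularisation term plus the mean hinge loss (averaging is an assumption)
    f = X @ w - b
    return lambda_param * np.dot(w, w) + np.mean(hinge_loss(y, f))


# Labels in {-1, +1} and some example linear outputs f(x)
y = np.array([1, 1, -1, -1])
f = np.array([2.0, 0.5, -1.0, 0.5])

losses = hinge_loss(y, f)
print(losses)  # zero wherever y * f >= 1

# The same outputs produced by a concrete weight vector
X = np.array([[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0], [0.5, 0.0]])
w = np.array([1.0, 0.0])
total = cost(w, 0.0, X, y)
print(total)
```

The first and third samples satisfy y * f >= 1 and contribute no loss. The second is on the correct side but inside the margin, and the fourth is misclassified, so both are penalised.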
