SVM from Scratch - Machine Learning Python (Support Vector Machine)
Key Takeaways
This video implements a Support Vector Machine (SVM) from scratch in Python using the CVXOPT package for quadratic programming, focusing on soft margin SVM with Gaussian and polynomial kernels. The implementation covers key concepts such as kernel methods, optimization problems, and the calculation of support vectors and parameters W and b.
Full Transcript
[Music] what's going on guys hope you're doing awesome so in this video I want to show you how to implement an SEM from scratch I want to explain the theory behind the CMS because honestly that would make for too long or a video and even if I did I wouldn't be able to explain it as well as endrin who has to our lecturers own SVM's let's see the classification problem also he has pretty incredible lecture notes and those are really the only sources I've used to learn about how SVM's work I'll link that in the description if you want to learn about the theory of SVM's and in this video we'll focus on how to actually implement it so from reading the lecture note the final result we obtain from all the derivations is that solving this optimization problem and using the formulas that relate W and B gives us the the result that we want and so there are multiple ways of solving the optimization problem one is using quadratic programming which is what we're going to do in this video you can also use the SML algorithm for example just to sort of heads up so you know what to expect from from this video without further introduction let's get started with coding it so what we're gonna do is we're gonna do a well first of all we got to do our input so we're gonna use numpy import numpy as NP we're going to use import CVX opt that's sort of for solving the quadratic programming and then we're also going to use from utils import create data set so these are two functions and plot contour these are two util functions I created I just put them here you can check them out in the description link it will be on my github but really this is just for creating sort of a data set that we can plot in a nice way and it will be nice to see what the SVM is actually doing at the end so let's do class SVM and let's define our init function and we're gonna send in a colonel let's use gaussian as default we haven't implemented that yet but we're gonna do it soon and then also the C value so I don't believe I mentioned it but we're gonna implement the soft margin SVM so sort of the DC parameter just tells us how much we should panel penalize having the sort of miss miss classifying a point we're gonna set default to 1 and then software kernel is kernel self dot C is C and then let's do a sort of do the overviews of the skeleton code we're gonna have some fit function that we're gonna send in x and y and here we're gonna find our alphas and then we're gonna have a prediction so this will predict each each each point and then we'll have another get parameters and here will sending the alphas value alpha values and yeah that's that's sort of it we're gonna have also if name equals main and then we're gonna run the SVM so in the fit function here we're going to do self dot x equals 2x self dot y equals y then we're going to do M comma n is X shape so M for the sort of following entering notation M are the training examples and n are the features and then we're gonna so we're gonna calculate the kernel the kernel and let's actually do that one because we're gonna use it let's do define Gaussian and we're gonna send in two vectors X and Z and we're gonna use some Sigma let's say 0.1 is default this will depend on the data set but let's just use that one for now we're gonna do NP that X of minus MP dot Lin L norm and then X minus Z and we're gonna do it for axis one then we're gonna square this and we're gonna divide by 2 times Sigma squared all right so now we're gonna define self-talk a for the colonel to be MP dot zeros M comma M in shape and then we're gonna do for I in range M self dot K so we're gonna do this I guess semi vectorized not folded back tries we still have one loop but we've removed one of the loops we're going to do self that kernel of X I and then MP dot new axis and self dot X so this will let's see this will calculate the kernel for a specific training example with with all the the entire data and so the final shape will be M by M and yeah so now that we have calculated the kernel which is sort of the you know the inner product of Phi X X I times Phi of X J sort of the notation entering uses and then what we want to do is we want to send in sort of two CVX opt and we want to tell it to you know solve this quadratic programming problem and so that we can obtain the Alpha values now the the CVX up requires it to be on a specific form and we don't have it on that form sort of the optimization problem that we saw previously so we need to do some just reformulation to to have it on that form and yeah if you if you have any questions about that I'll I'll link to a blog post that does a pretty good job of explaining how we reformulate it but either way the final thing that we come up with is uh is that P should be expec c bx opt dot matrix and P dot outer why why and then self dot K Q should be CVX up dot matrix minus MP dot ones and yeah so I highly recommend that you sort of go through it step by step to see that how we reformulate the optimization problem but that's really all we're doing here if you understand the original one we're only reformulating it so that we can see the excel can solve it for us and then here we're gonna impede Alvey stack and P that I have M times minus 1 then NP that I of M and then age top matrix and peanut each stack yeah so so the H here is for the the Alpha values we require it to be we require the Alpha values to be greater than 0 and less than C and yeah and then we have a to be CVX up that matrix Y in shape 1 comma M and then D just to have it in to convert it to float and then B will be CVX up top matrix and put our zeroes 1 then we're just gonna do CVX up solvers options and then show progress equals false just that it doesn't print unnecessary things and then the solution will be CV X up solvers Q P and then we're just gonna send in all those that we formulated above and then stuff that alphas will be NP dot array of solution and then X yeah so I don't expect you to follow all of these ones if you if you follow the theory you should then be able to you know just convert it to in this form and that's how it results in in in the code so what we want to do now is having the alpha values we want to get our parameters and let's see here so we're gonna have a threshold let's set it to 1 e minus 4 or something like that then we're gonna do the support vectors will be where the Alpha values are greater than the threshold but this you should also be less than self dot C sort of using a tolerance value when it's greater than zero so this if it's greater than this and it's we say that it's greater than zero and then we're just going to do flatten so this right here will just be a binary vector right four for the support vectors that we have so we're going to do self that W is MP dot and this sort of follows from the formula as well that was given in the lecture notes so we have X of s in the support vectors and then to match the shape we just take transpose and then the alphas similarly the support vectors of the alphas and then sell top y of as well the support vectors they were just going to do MP dot new axis to have the shapes match and then self dot B will be the mean value I'm self-taught why support vectors again numpy that new axis minus self dot alphas the support vectors and then sell top why and then lastly so let's put this on a line like this like this and we're gonna have self dot K so the kernel on the support vectors and then the support vectors in the columns and then we're just gonna do NP that new access to not have any weird shapes and here we're gonna return the support vectors now that we have sort of the the W and B just following from having the Alpha values from the up in the CVX opt then we're gonna send it in to predict so the predict will be Y predict will be ten P dot zeros of X shape 0 all right so we're going to have prediction for each of our our point the support vectors will be self dot get parameters of self dot alphas then we're going to be for I in range of X shape 0 we're going to wire predict of I is gonna be the NP sum of self dot alphas of the support vectors and then self that y of the support vectors and then again MP dot new axis and then here we're gonna do self dot kernel of X I call myself that X of the support vectors and let's see we're gonna have new actors again just to have correct shaped and then in the episode and we're gonna do see we're gonna do return and P dot sine of Y predict plus soft upbeat series mirror X yes we should have think we should have self dot X here yeah and [Music] yeah so as I said there's a lot of math behind this I'm just showing you how to do it and not the theory behind it and and again I refer you to the lecture notes and and drawing and if you have any questions about the code I'd be happy to to answer them but these are just sort of how we we implement it and yeah let's also define so we can use some other kernels we could use the linear kernel and for our in this in this case just to have them the shapes match up we're going to have X comma Z transpose and then let's say we have a polynomial as well and you can define more kernels obviously but we're just gonna use the polynomial and we use a default polynomial of degree 5 and we're going to return one plus MP dot of X is there transpose and then take this to the power of P and I'll see that we should now be be ready to to sort of get our data and train so let's do NP dot random seed one just to have reproducible results so that you can sort of get the same and let's do XY is create data set and let's pick 50 point and then we're gonna do SVM is SVM of colonel equals Gaussian so that's sort of default so we wouldn't have to send that in but s m dot fit x and y and then Y predict let's do si m dot predict X and yeah we can sort of do print F accuracy is let's do some y equals y prediction and then divide by the total examples that we have and then also let's use the D function so plot contours XY and we're gonna send an SVM here as well you're gonna see what this does in us in a second plot contours like this and yeah so let's run this you can't import plot contour okay so it should be this and hopefully this works now so we get yeah accuracy is 99% and then so what I did with those functions right there is that see well I don't know what is printing right now it's probably from this one yeah so remove this print here and let's rerun it and let's bring out the plot so right here see so right right here so as I said this is sort of the the contour plot yeah and I apply it a bunch of them yeah so this is how the data looks like that won't be generated and having the SVM predictions the the decision boundary looks like this and so it can be sort of interesting to see how the SVM does its prediction and we can sort of you know you can let's see we can use a polynomial instead we can run it and we can see that okay so it takes a while because this is how the polynomial looks like and if we would you know decrease the power of it then we would probably get something that looks more linear I guess yeah and yeah you could increase it as well and I know five seems like a pretty good result and then we can do also you know the linear one and this should be pretty linear so let's see yeah so we get a a very linear decision boundary so that's it for SVM's if you have any questions leave them in a comment thank you so much for watching the video and I hope to see you in the next one [Music]
Original Description
A from scratch implementation of SVM using the CVXOPT package in Python to solve the quadratic programming. Specifically implementation of soft margin SVM.
To understand the theory (which is a bit challenging) I recommend reading the following:
* http://cs229.stanford.edu/notes/cs229-notes3.pdf
* https://www.youtube.com/playlist?list=PLoROMvodv4rMiGQp3WXShtMGgzqpfVfbU (Lectures 6,7 by Andrew Ng)
❤️ Support the channel ❤️
https://www.youtube.com/channel/UCkzW5JSFwvKRjXABI-UTAkQ/join
Paid Courses I recommend for learning (affiliate links, no extra cost for you):
⭐ Machine Learning Specialization https://bit.ly/3hjTBBt
⭐ Deep Learning Specialization https://bit.ly/3YcUkoI
📘 MLOps Specialization http://bit.ly/3wibaWy
📘 GAN Specialization https://bit.ly/3FmnZDl
📘 NLP Specialization http://bit.ly/3GXoQuP
✨ Free Resources that are great:
NLP: https://web.stanford.edu/class/cs224n/
CV: http://cs231n.stanford.edu/
Deployment: https://fullstackdeeplearning.com/
FastAI: https://www.fast.ai/
💻 My Deep Learning Setup and Recording Setup:
https://www.amazon.com/shop/aladdinpersson
GitHub Repository:
https://github.com/aladdinpersson/Machine-Learning-Collection
✅ One-Time Donations:
Paypal: https://bit.ly/3buoRYH
▶️ You Can Connect with me on:
Twitter - https://twitter.com/aladdinpersson
LinkedIn - https://www.linkedin.com/in/aladdin-persson-a95384153/
Github - https://github.com/aladdinpersson
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
Playlist
Uploads from Aladdin Persson · Aladdin Persson · 48 of 60
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
▶
49
50
51
52
53
54
55
56
57
58
59
60
computeCost.m Linear Regression Cost Function - Machine Learning
Aladdin Persson
gradientDescent.m Gradient Descent Implementation - Machine Learning
Aladdin Persson
Neural Network from scratch - Part 1 (Standard Notation)
Aladdin Persson
Neural Network from scratch - Part 2 (Forward Propagation)
Aladdin Persson
Neural Network from scratch - Part 3 (Backward Propagation)
Aladdin Persson
Neural Network from scratch - Part 4 (With Python)
Aladdin Persson
sigmoid.m - Programming Assignment 2 Machine Learning
Aladdin Persson
costFunction.m - Programming Assignment 2 Machine Learning
Aladdin Persson
predict.m - Programming Assignment 2 Machine Learning
Aladdin Persson
costFunctionReg.m - Programming Assignment 2 Machine Learning
Aladdin Persson
lrCostFunction.m - Programming Assignment 3 Machine Learning
Aladdin Persson
oneVsAll.m - Programming Assignment 3 Machine Learning
Aladdin Persson
predictOneVsAll.m - Programming Assignment 3 Machine Learning
Aladdin Persson
predict.m - Programming Assignment 3 Machine Learning
Aladdin Persson
Caesar Cipher Encryption and Decryption with example
Aladdin Persson
Cryptography: Caesar Cipher Python
Aladdin Persson
Vigenere Cipher Explained (with Example)
Aladdin Persson
Cryptography: Vigenere Cipher Python
Aladdin Persson
Hill Cipher Explained (with Example)
Aladdin Persson
Cryptography: Hill Cipher Python
Aladdin Persson
Interval Scheduling Greedy Algorithm: Python
Aladdin Persson
Weighted Interval Scheduling Algorithm Explained
Aladdin Persson
Weighted Interval Scheduling Python Code
Aladdin Persson
Sequence Alignment | Needleman Wunsch Algorithm
Aladdin Persson
Sequence Alignment | Needleman Wunsch in Python
Aladdin Persson
Codility BinaryGap Python
Aladdin Persson
Codility CyclicRotation Python
Aladdin Persson
Derivation Linear Regression with Gradient Descent
Aladdin Persson
Linear Regression Gradient Descent From Scratch in Python
Aladdin Persson
Pytorch Neural Network example
Aladdin Persson
Pytorch CNN example (Convolutional Neural Network)
Aladdin Persson
Pytorch LeNet implementation from scratch
Aladdin Persson
Pytorch VGG implementation from scratch
Aladdin Persson
Pytorch GoogLeNet / InceptionNet implementation from scratch
Aladdin Persson
How to save and load models in Pytorch
Aladdin Persson
How to build custom Datasets for Images in Pytorch
Aladdin Persson
Pytorch Transfer Learning and Fine Tuning Tutorial
Aladdin Persson
Pytorch Data Augmentation using Torchvision
Aladdin Persson
Pytorch Quick Tip: Weight Initialization
Aladdin Persson
Pytorch Quick Tip: Using a Learning Rate Scheduler
Aladdin Persson
Pytorch ResNet implementation from Scratch
Aladdin Persson
Pytorch TensorBoard Tutorial
Aladdin Persson
Pytorch DCGAN Tutorial (See description for updated video)
Aladdin Persson
Naive Bayes from Scratch - Machine Learning Python
Aladdin Persson
Spam Classifier using Naive Bayes in Python
Aladdin Persson
K-Nearest Neighbor from scratch - Machine Learning Python
Aladdin Persson
Linear Regression Normal Equation Python
Aladdin Persson
SVM from Scratch - Machine Learning Python (Support Vector Machine)
Aladdin Persson
Neural Network from Scratch - Machine Learning Python
Aladdin Persson
Pytorch RNN example (Recurrent Neural Network)
Aladdin Persson
Pytorch Bidirectional LSTM example
Aladdin Persson
Pytorch Text Generator with character level LSTM
Aladdin Persson
Logistic Regression from Scratch - Machine Learning Python
Aladdin Persson
K-Means Clustering from Scratch - Machine Learning Python
Aladdin Persson
Pytorch Torchtext Tutorial 1: Custom Datasets and loading JSON/CSV/TSV files
Aladdin Persson
Pytorch Torchtext Tutorial 2: Built in Datasets with Example
Aladdin Persson
Pytorch Torchtext Tutorial 3: From Textfiles to Dataset
Aladdin Persson
Paper Review: Sequence to Sequence Learning with Neural Networks
Aladdin Persson
Pytorch Seq2Seq Tutorial for Machine Translation
Aladdin Persson
Pytorch Seq2Seq with Attention for Machine Translation
Aladdin Persson
More on: Supervised Learning
View skill →Related Reads
📰
📰
📰
📰
What Is MLIR and Why Does It Exist?
Dev.to · Fedor Nikolaev
Why Choosing the Right Machine Learning Development Company Matters More Than the AI Model
Medium · Machine Learning
Data privacy in AI training: federated learning, differential privacy, and synthetic data
Dev.to AI
Data Preprocessing: Encoding and Feature Scaling in Machine Learning
Medium · Machine Learning
🎓
Tutor Explanation
DeepCamp AI