SVM from Scratch - Machine Learning Python (Support Vector Machine)

Aladdin Persson · Advanced ·📐 ML Fundamentals ·6y ago

Key Takeaways

This video implements a Support Vector Machine (SVM) from scratch in Python using the CVXOPT package for quadratic programming, focusing on soft margin SVM with Gaussian and polynomial kernels. The implementation covers key concepts such as kernel methods, optimization problems, and the calculation of support vectors and parameters W and b.

Full Transcript

[Music] what's going on guys hope you're doing awesome so in this video I want to show you how to implement an SEM from scratch I want to explain the theory behind the CMS because honestly that would make for too long or a video and even if I did I wouldn't be able to explain it as well as endrin who has to our lecturers own SVM's let's see the classification problem also he has pretty incredible lecture notes and those are really the only sources I've used to learn about how SVM's work I'll link that in the description if you want to learn about the theory of SVM's and in this video we'll focus on how to actually implement it so from reading the lecture note the final result we obtain from all the derivations is that solving this optimization problem and using the formulas that relate W and B gives us the the result that we want and so there are multiple ways of solving the optimization problem one is using quadratic programming which is what we're going to do in this video you can also use the SML algorithm for example just to sort of heads up so you know what to expect from from this video without further introduction let's get started with coding it so what we're gonna do is we're gonna do a well first of all we got to do our input so we're gonna use numpy import numpy as NP we're going to use import CVX opt that's sort of for solving the quadratic programming and then we're also going to use from utils import create data set so these are two functions and plot contour these are two util functions I created I just put them here you can check them out in the description link it will be on my github but really this is just for creating sort of a data set that we can plot in a nice way and it will be nice to see what the SVM is actually doing at the end so let's do class SVM and let's define our init function and we're gonna send in a colonel let's use gaussian as default we haven't implemented that yet but we're gonna do it soon and then also the C value so I don't believe I mentioned it but we're gonna implement the soft margin SVM so sort of the DC parameter just tells us how much we should panel penalize having the sort of miss miss classifying a point we're gonna set default to 1 and then software kernel is kernel self dot C is C and then let's do a sort of do the overviews of the skeleton code we're gonna have some fit function that we're gonna send in x and y and here we're gonna find our alphas and then we're gonna have a prediction so this will predict each each each point and then we'll have another get parameters and here will sending the alphas value alpha values and yeah that's that's sort of it we're gonna have also if name equals main and then we're gonna run the SVM so in the fit function here we're going to do self dot x equals 2x self dot y equals y then we're going to do M comma n is X shape so M for the sort of following entering notation M are the training examples and n are the features and then we're gonna so we're gonna calculate the kernel the kernel and let's actually do that one because we're gonna use it let's do define Gaussian and we're gonna send in two vectors X and Z and we're gonna use some Sigma let's say 0.1 is default this will depend on the data set but let's just use that one for now we're gonna do NP that X of minus MP dot Lin L norm and then X minus Z and we're gonna do it for axis one then we're gonna square this and we're gonna divide by 2 times Sigma squared all right so now we're gonna define self-talk a for the colonel to be MP dot zeros M comma M in shape and then we're gonna do for I in range M self dot K so we're gonna do this I guess semi vectorized not folded back tries we still have one loop but we've removed one of the loops we're going to do self that kernel of X I and then MP dot new axis and self dot X so this will let's see this will calculate the kernel for a specific training example with with all the the entire data and so the final shape will be M by M and yeah so now that we have calculated the kernel which is sort of the you know the inner product of Phi X X I times Phi of X J sort of the notation entering uses and then what we want to do is we want to send in sort of two CVX opt and we want to tell it to you know solve this quadratic programming problem and so that we can obtain the Alpha values now the the CVX up requires it to be on a specific form and we don't have it on that form sort of the optimization problem that we saw previously so we need to do some just reformulation to to have it on that form and yeah if you if you have any questions about that I'll I'll link to a blog post that does a pretty good job of explaining how we reformulate it but either way the final thing that we come up with is uh is that P should be expec c bx opt dot matrix and P dot outer why why and then self dot K Q should be CVX up dot matrix minus MP dot ones and yeah so I highly recommend that you sort of go through it step by step to see that how we reformulate the optimization problem but that's really all we're doing here if you understand the original one we're only reformulating it so that we can see the excel can solve it for us and then here we're gonna impede Alvey stack and P that I have M times minus 1 then NP that I of M and then age top matrix and peanut each stack yeah so so the H here is for the the Alpha values we require it to be we require the Alpha values to be greater than 0 and less than C and yeah and then we have a to be CVX up that matrix Y in shape 1 comma M and then D just to have it in to convert it to float and then B will be CVX up top matrix and put our zeroes 1 then we're just gonna do CVX up solvers options and then show progress equals false just that it doesn't print unnecessary things and then the solution will be CV X up solvers Q P and then we're just gonna send in all those that we formulated above and then stuff that alphas will be NP dot array of solution and then X yeah so I don't expect you to follow all of these ones if you if you follow the theory you should then be able to you know just convert it to in this form and that's how it results in in in the code so what we want to do now is having the alpha values we want to get our parameters and let's see here so we're gonna have a threshold let's set it to 1 e minus 4 or something like that then we're gonna do the support vectors will be where the Alpha values are greater than the threshold but this you should also be less than self dot C sort of using a tolerance value when it's greater than zero so this if it's greater than this and it's we say that it's greater than zero and then we're just going to do flatten so this right here will just be a binary vector right four for the support vectors that we have so we're going to do self that W is MP dot and this sort of follows from the formula as well that was given in the lecture notes so we have X of s in the support vectors and then to match the shape we just take transpose and then the alphas similarly the support vectors of the alphas and then sell top y of as well the support vectors they were just going to do MP dot new axis to have the shapes match and then self dot B will be the mean value I'm self-taught why support vectors again numpy that new axis minus self dot alphas the support vectors and then sell top why and then lastly so let's put this on a line like this like this and we're gonna have self dot K so the kernel on the support vectors and then the support vectors in the columns and then we're just gonna do NP that new access to not have any weird shapes and here we're gonna return the support vectors now that we have sort of the the W and B just following from having the Alpha values from the up in the CVX opt then we're gonna send it in to predict so the predict will be Y predict will be ten P dot zeros of X shape 0 all right so we're going to have prediction for each of our our point the support vectors will be self dot get parameters of self dot alphas then we're going to be for I in range of X shape 0 we're going to wire predict of I is gonna be the NP sum of self dot alphas of the support vectors and then self that y of the support vectors and then again MP dot new axis and then here we're gonna do self dot kernel of X I call myself that X of the support vectors and let's see we're gonna have new actors again just to have correct shaped and then in the episode and we're gonna do see we're gonna do return and P dot sine of Y predict plus soft upbeat series mirror X yes we should have think we should have self dot X here yeah and [Music] yeah so as I said there's a lot of math behind this I'm just showing you how to do it and not the theory behind it and and again I refer you to the lecture notes and and drawing and if you have any questions about the code I'd be happy to to answer them but these are just sort of how we we implement it and yeah let's also define so we can use some other kernels we could use the linear kernel and for our in this in this case just to have them the shapes match up we're going to have X comma Z transpose and then let's say we have a polynomial as well and you can define more kernels obviously but we're just gonna use the polynomial and we use a default polynomial of degree 5 and we're going to return one plus MP dot of X is there transpose and then take this to the power of P and I'll see that we should now be be ready to to sort of get our data and train so let's do NP dot random seed one just to have reproducible results so that you can sort of get the same and let's do XY is create data set and let's pick 50 point and then we're gonna do SVM is SVM of colonel equals Gaussian so that's sort of default so we wouldn't have to send that in but s m dot fit x and y and then Y predict let's do si m dot predict X and yeah we can sort of do print F accuracy is let's do some y equals y prediction and then divide by the total examples that we have and then also let's use the D function so plot contours XY and we're gonna send an SVM here as well you're gonna see what this does in us in a second plot contours like this and yeah so let's run this you can't import plot contour okay so it should be this and hopefully this works now so we get yeah accuracy is 99% and then so what I did with those functions right there is that see well I don't know what is printing right now it's probably from this one yeah so remove this print here and let's rerun it and let's bring out the plot so right here see so right right here so as I said this is sort of the the contour plot yeah and I apply it a bunch of them yeah so this is how the data looks like that won't be generated and having the SVM predictions the the decision boundary looks like this and so it can be sort of interesting to see how the SVM does its prediction and we can sort of you know you can let's see we can use a polynomial instead we can run it and we can see that okay so it takes a while because this is how the polynomial looks like and if we would you know decrease the power of it then we would probably get something that looks more linear I guess yeah and yeah you could increase it as well and I know five seems like a pretty good result and then we can do also you know the linear one and this should be pretty linear so let's see yeah so we get a a very linear decision boundary so that's it for SVM's if you have any questions leave them in a comment thank you so much for watching the video and I hope to see you in the next one [Music]

Original Description

A from scratch implementation of SVM using the CVXOPT package in Python to solve the quadratic programming. Specifically implementation of soft margin SVM. To understand the theory (which is a bit challenging) I recommend reading the following: * http://cs229.stanford.edu/notes/cs229-notes3.pdf * https://www.youtube.com/playlist?list=PLoROMvodv4rMiGQp3WXShtMGgzqpfVfbU (Lectures 6,7 by Andrew Ng) ❤️ Support the channel ❤️ https://www.youtube.com/channel/UCkzW5JSFwvKRjXABI-UTAkQ/join Paid Courses I recommend for learning (affiliate links, no extra cost for you): ⭐ Machine Learning Specialization https://bit.ly/3hjTBBt ⭐ Deep Learning Specialization https://bit.ly/3YcUkoI 📘 MLOps Specialization http://bit.ly/3wibaWy 📘 GAN Specialization https://bit.ly/3FmnZDl 📘 NLP Specialization http://bit.ly/3GXoQuP ✨ Free Resources that are great: NLP: https://web.stanford.edu/class/cs224n/ CV: http://cs231n.stanford.edu/ Deployment: https://fullstackdeeplearning.com/ FastAI: https://www.fast.ai/ 💻 My Deep Learning Setup and Recording Setup: https://www.amazon.com/shop/aladdinpersson GitHub Repository: https://github.com/aladdinpersson/Machine-Learning-Collection ✅ One-Time Donations: Paypal: https://bit.ly/3buoRYH ▶️ You Can Connect with me on: Twitter - https://twitter.com/aladdinpersson LinkedIn - https://www.linkedin.com/in/aladdin-persson-a95384153/ Github - https://github.com/aladdinpersson
Watch on YouTube ↗ (saves to browser)
Sign in to unlock AI tutor explanation · ⚡30

Playlist

Uploads from Aladdin Persson · Aladdin Persson · 48 of 60

1 computeCost.m Linear Regression Cost Function - Machine Learning
computeCost.m Linear Regression Cost Function - Machine Learning
Aladdin Persson
2 gradientDescent.m Gradient Descent Implementation -  Machine Learning
gradientDescent.m Gradient Descent Implementation - Machine Learning
Aladdin Persson
3 Neural Network from scratch - Part 1 (Standard Notation)
Neural Network from scratch - Part 1 (Standard Notation)
Aladdin Persson
4 Neural Network from scratch - Part 2 (Forward Propagation)
Neural Network from scratch - Part 2 (Forward Propagation)
Aladdin Persson
5 Neural Network from scratch - Part 3 (Backward Propagation)
Neural Network from scratch - Part 3 (Backward Propagation)
Aladdin Persson
6 Neural Network from scratch - Part 4 (With Python)
Neural Network from scratch - Part 4 (With Python)
Aladdin Persson
7 sigmoid.m - Programming Assignment 2 Machine Learning
sigmoid.m - Programming Assignment 2 Machine Learning
Aladdin Persson
8 costFunction.m - Programming Assignment 2 Machine Learning
costFunction.m - Programming Assignment 2 Machine Learning
Aladdin Persson
9 predict.m - Programming Assignment 2 Machine Learning
predict.m - Programming Assignment 2 Machine Learning
Aladdin Persson
10 costFunctionReg.m - Programming Assignment 2 Machine Learning
costFunctionReg.m - Programming Assignment 2 Machine Learning
Aladdin Persson
11 lrCostFunction.m - Programming Assignment 3 Machine Learning
lrCostFunction.m - Programming Assignment 3 Machine Learning
Aladdin Persson
12 oneVsAll.m - Programming Assignment 3 Machine Learning
oneVsAll.m - Programming Assignment 3 Machine Learning
Aladdin Persson
13 predictOneVsAll.m - Programming Assignment 3 Machine Learning
predictOneVsAll.m - Programming Assignment 3 Machine Learning
Aladdin Persson
14 predict.m - Programming Assignment 3 Machine Learning
predict.m - Programming Assignment 3 Machine Learning
Aladdin Persson
15 Caesar Cipher Encryption and Decryption with example
Caesar Cipher Encryption and Decryption with example
Aladdin Persson
16 Cryptography: Caesar Cipher Python
Cryptography: Caesar Cipher Python
Aladdin Persson
17 Vigenere Cipher Explained (with Example)
Vigenere Cipher Explained (with Example)
Aladdin Persson
18 Cryptography: Vigenere Cipher Python
Cryptography: Vigenere Cipher Python
Aladdin Persson
19 Hill Cipher Explained (with Example)
Hill Cipher Explained (with Example)
Aladdin Persson
20 Cryptography: Hill Cipher Python
Cryptography: Hill Cipher Python
Aladdin Persson
21 Interval Scheduling Greedy Algorithm: Python
Interval Scheduling Greedy Algorithm: Python
Aladdin Persson
22 Weighted Interval Scheduling Algorithm Explained
Weighted Interval Scheduling Algorithm Explained
Aladdin Persson
23 Weighted Interval Scheduling Python Code
Weighted Interval Scheduling Python Code
Aladdin Persson
24 Sequence Alignment | Needleman Wunsch Algorithm
Sequence Alignment | Needleman Wunsch Algorithm
Aladdin Persson
25 Sequence Alignment | Needleman Wunsch in Python
Sequence Alignment | Needleman Wunsch in Python
Aladdin Persson
26 Codility BinaryGap Python
Codility BinaryGap Python
Aladdin Persson
27 Codility CyclicRotation Python
Codility CyclicRotation Python
Aladdin Persson
28 Derivation Linear Regression with Gradient Descent
Derivation Linear Regression with Gradient Descent
Aladdin Persson
29 Linear Regression Gradient Descent From Scratch in Python
Linear Regression Gradient Descent From Scratch in Python
Aladdin Persson
30 Pytorch Neural Network example
Pytorch Neural Network example
Aladdin Persson
31 Pytorch CNN example (Convolutional Neural Network)
Pytorch CNN example (Convolutional Neural Network)
Aladdin Persson
32 Pytorch LeNet implementation from scratch
Pytorch LeNet implementation from scratch
Aladdin Persson
33 Pytorch VGG implementation from scratch
Pytorch VGG implementation from scratch
Aladdin Persson
34 Pytorch GoogLeNet / InceptionNet implementation from scratch
Pytorch GoogLeNet / InceptionNet implementation from scratch
Aladdin Persson
35 How to save and load models in Pytorch
How to save and load models in Pytorch
Aladdin Persson
36 How to build custom Datasets for Images in Pytorch
How to build custom Datasets for Images in Pytorch
Aladdin Persson
37 Pytorch Transfer Learning and Fine Tuning Tutorial
Pytorch Transfer Learning and Fine Tuning Tutorial
Aladdin Persson
38 Pytorch Data Augmentation using Torchvision
Pytorch Data Augmentation using Torchvision
Aladdin Persson
39 Pytorch Quick Tip: Weight Initialization
Pytorch Quick Tip: Weight Initialization
Aladdin Persson
40 Pytorch Quick Tip: Using a Learning Rate Scheduler
Pytorch Quick Tip: Using a Learning Rate Scheduler
Aladdin Persson
41 Pytorch ResNet implementation from Scratch
Pytorch ResNet implementation from Scratch
Aladdin Persson
42 Pytorch TensorBoard Tutorial
Pytorch TensorBoard Tutorial
Aladdin Persson
43 Pytorch DCGAN Tutorial (See description for updated video)
Pytorch DCGAN Tutorial (See description for updated video)
Aladdin Persson
44 Naive Bayes from Scratch - Machine Learning Python
Naive Bayes from Scratch - Machine Learning Python
Aladdin Persson
45 Spam Classifier using Naive Bayes in Python
Spam Classifier using Naive Bayes in Python
Aladdin Persson
46 K-Nearest Neighbor from scratch - Machine Learning Python
K-Nearest Neighbor from scratch - Machine Learning Python
Aladdin Persson
47 Linear Regression Normal Equation Python
Linear Regression Normal Equation Python
Aladdin Persson
SVM from Scratch - Machine Learning Python (Support Vector Machine)
SVM from Scratch - Machine Learning Python (Support Vector Machine)
Aladdin Persson
49 Neural Network from Scratch - Machine Learning Python
Neural Network from Scratch - Machine Learning Python
Aladdin Persson
50 Pytorch RNN example (Recurrent Neural Network)
Pytorch RNN example (Recurrent Neural Network)
Aladdin Persson
51 Pytorch Bidirectional LSTM example
Pytorch Bidirectional LSTM example
Aladdin Persson
52 Pytorch Text Generator with character level LSTM
Pytorch Text Generator with character level LSTM
Aladdin Persson
53 Logistic Regression from Scratch - Machine Learning Python
Logistic Regression from Scratch - Machine Learning Python
Aladdin Persson
54 K-Means Clustering from Scratch - Machine Learning Python
K-Means Clustering from Scratch - Machine Learning Python
Aladdin Persson
55 Pytorch Torchtext Tutorial 1: Custom Datasets and loading JSON/CSV/TSV files
Pytorch Torchtext Tutorial 1: Custom Datasets and loading JSON/CSV/TSV files
Aladdin Persson
56 Pytorch Torchtext Tutorial 2: Built in Datasets with Example
Pytorch Torchtext Tutorial 2: Built in Datasets with Example
Aladdin Persson
57 Pytorch Torchtext Tutorial 3: From Textfiles to Dataset
Pytorch Torchtext Tutorial 3: From Textfiles to Dataset
Aladdin Persson
58 Paper Review: Sequence to Sequence Learning with Neural Networks
Paper Review: Sequence to Sequence Learning with Neural Networks
Aladdin Persson
59 Pytorch Seq2Seq Tutorial for Machine Translation
Pytorch Seq2Seq Tutorial for Machine Translation
Aladdin Persson
60 Pytorch Seq2Seq with Attention for Machine Translation
Pytorch Seq2Seq with Attention for Machine Translation
Aladdin Persson

This video teaches how to implement a Support Vector Machine from scratch in Python, covering the theory behind SVM, kernel methods, and optimization problems. It provides a practical example of using the CVXOPT package for quadratic programming and demonstrates how to calculate support vectors and parameters W and b. By watching this video, viewers will gain hands-on experience with implementing SVM models and understand the key concepts behind kernel methods and optimization problems.

Key Takeaways
  1. Import necessary libraries
  2. Define SVM class with init function
  3. Implement fit function to find alphas and make predictions
  4. Define Gaussian kernel function
  5. Vectorize kernel calculation using numpy operations
  6. Calculate kernel for training example with entire data
  7. Reformulate optimization problem for CVXopt
  8. Obtain Alpha values using CVXopt
  9. Calculate support vectors using Alpha values and threshold
  10. Compute parameters W, b, and support vectors using support vectors and Alpha values
💡 The video highlights the importance of kernel methods in SVM and demonstrates how different kernels, such as Gaussian and polynomial, can affect the decision boundary of the model.

Related Reads

📰
What Is MLIR and Why Does It Exist?
Learn about MLIR, a intermediate representation for machine learning models, and its purpose in optimizing ML workflows
Dev.to · Fedor Nikolaev
📰
Why Choosing the Right Machine Learning Development Company Matters More Than the AI Model
Choosing the right machine learning development company is crucial for turning AI investments into measurable results, as it can make or break the success of AI projects
Medium · Machine Learning
📰
Data privacy in AI training: federated learning, differential privacy, and synthetic data
Learn how federated learning, differential privacy, and synthetic data preserve data privacy in AI training, and why they matter for secure machine learning
Dev.to AI
📰
Data Preprocessing: Encoding and Feature Scaling in Machine Learning
Learn to preprocess data by encoding and scaling features for better machine learning model performance
Medium · Machine Learning
Up next
Is Python Dead in 2026?| Truth About Python in AI Era | 90 Days Roadmap @FameWorldEducationalHub
FAME WORLD EDUCATIONAL HUB
Watch →