PyTorch Tutorial 03 - Gradient Calculation With Autograd
Key Takeaways
The video demonstrates how to calculate gradients using PyTorch's Autograd package, essential for model optimization in deep learning. It covers the process of creating a computational graph, performing forward and backward passes, and tracking gradients using Autograd.
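The forward and backward passes summarized above can be sketched roughly as follows. This is a minimal sketch, assuming a recent PyTorch install; the variable names x, y, and z are illustrative, and the closed-form gradient used as a check is derived by hand rather than taken from the video.

```python
import torch

# Leaf tensor that autograd should track.
x = torch.randn(3, requires_grad=True)

# Forward pass: each operation records a node in the computational graph.
y = x + 2          # y.grad_fn is an AddBackward node
z = y * y * 2      # z.grad_fn is a MulBackward node
z = z.mean()       # z.grad_fn is a MeanBackward node; z is now a scalar

print(type(y.grad_fn).__name__)
print(type(z.grad_fn).__name__)

# Backward pass: computes dz/dx and stores it in x.grad.
z.backward()
print(x.grad)

# Hand-derived check: z = mean(2 * (x + 2)^2), so dz/dx = 4 * (x + 2) / 3.
expected = 4 * (x.detach() + 2) / 3
print(torch.allclose(x.grad, expected))
```

Because z is a scalar after mean(), backward() needs no argument here.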
Full Transcript
Hi everybody, welcome to a new PyTorch tutorial. Today we learn about the autograd package in PyTorch and how we can calculate gradients with it. Gradients are essential for our model optimization, so this is a very important concept that we should understand. Luckily, PyTorch provides the autograd package, which can do all the computations for us. We just have to know how to use it.
So let's start and see how we can calculate gradients in PyTorch. First of all we import torch, of course, and now let's create a tensor: x = torch.randn(3). Now let's print our x. This is a tensor with three random values. Now let's say later we want to calculate the gradient of some function with respect to x. Then what we have to do is specify the argument requires_grad=True. By default this is False. If you run this again, then we see that PyTorch also tracks that it requires the gradient.
Now, whenever we do operations with this tensor, PyTorch will create a so-called computational graph for us. Let's say we do the operation x + 2 and store this in an output, so we say y = x + 2. This will create the computational graph, and it looks like this: for each operation we have a node with inputs and an output. Here the operation is the plus, so an addition, our inputs are x and 2, and the output is y. With this graph and the technique that is called backpropagation, we can then calculate the gradients. I will explain the concept of backpropagation in detail in the next video, but for now it's fine to just know how we can use it.
First we do a forward pass. Here we apply this operation, and in the forward pass we calculate the output y. Since we specified that it requires the gradient, PyTorch will automatically create and store a function for us, and this function is then used in the backpropagation to get the gradients. So here y has an attribute grad_fn, which points to a gradient function, in this case called AddBackward. With this function we can then calculate the gradients in the so-called backward pass, so this will calculate the gradient of y with respect to x in this case. If we print y, then we see exactly this grad_fn attribute, and here it is an AddBackward function, because our operation was a plus and we do the backpropagation later. That's why it's called AddBackward.
Let's do some more operations with our tensors. Let's say we have z = y * y * 2, for example. This tensor then also has the grad_fn attribute, here grad_fn equals MulBackward, because our operation is a multiplication. And, for example, we can say z = z.mean(), so we apply a mean operation, and then our gradient function is MeanBackward. Now, when we want to calculate the gradients, the only thing that we must do is call z.backward(). This will calculate the gradient of z with respect to x. x then has a grad attribute where the gradients are stored, so we can print x.grad. If you run this, then we see that we have the gradients here in this tensor. This is all we have to do.
Now let's have a look at what happens when we don't specify this argument. First of all, if we print our tensors, then we see that they don't have the grad_fn attribute, and if we try to call the backward function, then this will produce an error. It says the tensor does not require grad and does not have a grad_fn. So remember that we must specify this argument, and then it will work.
One thing that we should also know: in the background, what this basically does is compute a so-called vector-Jacobian product to get the gradients. I will not go into the mathematical details, but we should know that we have the Jacobian matrix with the partial derivatives, and then we multiply this with a gradient vector, and then we get the final gradients that we are interested in. This is also called the chain rule, and I will also explain this in more detail in the next video. But we should know that we actually must multiply it with a vector. In this case, since our z is a scalar value, we don't have to use an argument for our backward function. Our z has only one value, so this is fine.
But let's say we didn't apply the mean operation, so now our z has more than one value in it; it's also of size 3. Now when we try to call the backward function like this, it will produce an error: grad can be implicitly created only for scalar outputs. So in this case we have to give it the gradient argument. We have to create a vector of the same size, so let's say v = torch.tensor with, for example, 0.1, 1.0, and 0.001, and we give it a data type of torch.float32. Then we must pass this vector to our backward function, and now it will work again. So we should know that in the background this is a vector-Jacobian product. A lot of times the last operation is some operation that creates a scalar value, so it's okay to call backward without an argument, but if the output is not a scalar, then we must give it the vector.
Another thing that we should know is how we can prevent PyTorch from tracking the history and calculating this grad_fn attribute. For example, sometimes during our training loop, when we want to update our weights, this operation should not be part of the gradient computation. In one of the next tutorials I will give a concrete example of how we apply this autograd package, and then it will maybe become clearer. But for now we should know how we can prevent PyTorch from tracking the gradients, and we have three options for this. The first one is to call the requires_grad_ function and set it to False. The second option is to call x.detach(), which creates a new tensor that doesn't require the gradient. And the third option is to wrap this in a with statement, so with torch.no_grad(), and then we can do our operations.
Let's try each of these. First we can say x.requires_grad_(False). Whenever a function has a trailing underscore in PyTorch, this means that it modifies our variable in place. If you print x now, then we see that it no longer shows requires_grad=True, so now this is False. That is the first option. The second option is to call x.detach(), so we say y = x.detach(). This creates a new tensor with the same values, but it doesn't require the gradient. Here we see that our y has the same values but doesn't require the gradient. And the last option is to wrap it in a with statement: with torch.no_grad(), and then we can do some operations, for example y = x + 2. If we print our y, then we see that it doesn't have the grad_fn attribute. If you don't use this and run it like before, then our y has the gradient function. So these are the three ways we can stop PyTorch from creating these gradient functions and tracking the history in our computational graph.
Now one more very important thing that we should also know: whenever we call the backward function, the gradient for this tensor is accumulated into the .grad attribute, so the values are summed up. Here we must be very careful. Let's create a dummy training example where we have some weights. This is a tensor with ones in it of size, let's say, 4, and it requires the gradient, so requires_grad=True. Now let's say we have a training loop where we say for epoch in range, and first let's only do one iteration. Here we do, let's say, model_output = (weights * 3).sum(). This is just a dummy operation that simulates some model output. Then we want to calculate the gradients, so we say model_output.backward(), and now we have the gradients, so we can print weights.grad. Our gradients here are 3, so the tensor is filled with threes. If we do another iteration, so if we say we have two iterations, then the second backward call again accumulates the values and writes them into the grad attribute, so now our grad has sixes in it. If we do a third iteration, then it has nines in it. All the values are summed up, and now our gradients are clearly incorrect. So before we do the next iteration and optimization step, we must empty the gradients: we must call weights.grad.zero_(). If we run this now, then our gradients are correct again. This is one very important thing that we must keep in mind during our training steps.
Later we will work with the PyTorch built-in optimizers. Let's say we have an optimizer from the torch optimization package, so torch.optim.SGD for stochastic gradient descent, which has our weights as parameters and some learning rate. With this optimizer we can do an optimization step, and then before we do the next iteration we must call the optimizer.zero_grad() function, which does exactly the same. We will talk about optimizers in some later tutorials.
For now, the things you should remember are: whenever we want to calculate the gradients, we must specify the requires_grad parameter and set it to True. Then we can simply calculate the gradients by calling the backward function. Before we do the next iteration in our optimization steps, we must empty our gradients, so we must call the zero_ function again. And we should also know how we can prevent some operations from being tracked in the computational graph. That's all I wanted to show you for now about the autograd package. I hope you liked it. Please subscribe to the channel, and see you next time. Bye!
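Two points from the walkthrough are easy to get wrong: calling backward on a non-scalar output, and forgetting that gradients accumulate. The sketch below illustrates both, assuming a recent PyTorch install. The helper f, the list names, and the learning rate 0.01 are illustrative choices, not taken from the video, and the diagonal-Jacobian check is derived by hand.

```python
import torch

def f(x):
    # Hypothetical helper: the element-wise function from the walkthrough, without mean().
    return (x + 2) * (x + 2) * 2

x = torch.randn(3, requires_grad=True)

# 1) Non-scalar output: backward() without an argument raises an error.
try:
    f(x).backward()
except RuntimeError as err:
    print("RuntimeError:", err)

# Passing a vector v makes backward() compute the vector-Jacobian product v^T J.
v = torch.tensor([0.1, 1.0, 0.001], dtype=torch.float32)
f(x).backward(v)
# J is diagonal here with entries 4 * (x + 2), so x.grad should equal v * 4 * (x + 2).
vjp_matches = torch.allclose(x.grad, v * 4 * (x.detach() + 2))
print(vjp_matches)

# 2) Gradients accumulate across backward() calls unless they are reset.
weights = torch.ones(4, requires_grad=True)
accumulated = []
for epoch in range(3):
    model_output = (weights * 3).sum()
    model_output.backward()
    accumulated.append(weights.grad.clone())   # threes, then sixes, then nines

weights.grad.zero_()                           # reset in place before the next loop
zeroed = []
for epoch in range(3):
    model_output = (weights * 3).sum()
    model_output.backward()
    zeroed.append(weights.grad.clone())        # threes every time
    weights.grad.zero_()

# 3) With a built-in optimizer, zero_grad() plays the same role.
# Note: in recent PyTorch releases zero_grad() sets .grad to None by default
# instead of filling it with zeros; both prevent accumulation.
optimizer = torch.optim.SGD([weights], lr=0.01)  # lr chosen for illustration
for epoch in range(3):
    model_output = (weights * 3).sum()
    model_output.backward()
    optimizer.step()
    optimizer.zero_grad()
print(weights)
```

Since the dummy model output is linear in the weights, the gradient is 3 in every iteration, which makes the effect of accumulation easy to see.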
Original Description
New Tutorial series about Deep Learning with PyTorch!
⭐ Check out Tabnine, the FREE AI-powered code completion tool I use to help me code faster: https://www.tabnine.com/?utm_source=youtube.com&utm_campaign=PythonEngineer *
In this part we learn how to calculate gradients using the autograd package in PyTorch.
This tutorial contains the following topics:
- requires_grad attribute for Tensors
- Computational graph
- Backpropagation (brief explanation)
- How to stop autograd from tracking history
- How to zero (empty) gradients
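As an illustration of the "stop autograd from tracking history" topic in the list above, here is a minimal sketch of three common ways to do it, assuming a recent PyTorch install. The variable names a, b, c, and d are hypothetical and chosen only for this example.

```python
import torch

x = torch.randn(3, requires_grad=True)

# Option 1: requires_grad_(False) flips the flag in place
# (the trailing underscore marks an in-place operation).
a = torch.randn(3, requires_grad=True)
a.requires_grad_(False)
print(a.requires_grad)                      # False

# Option 2: detach() returns a new tensor with the same values
# (it shares the underlying data) that is cut off from the graph.
b = x.detach()
print(b.requires_grad, torch.equal(b, x))   # False True

# Option 3: operations inside torch.no_grad() are not recorded.
with torch.no_grad():
    c = x + 2
print(c.requires_grad, c.grad_fn)           # False None

# For comparison: the same operation outside the block is tracked.
d = x + 2
print(d.requires_grad, d.grad_fn is not None)   # True True
```

Which option fits depends on the situation: the context manager is convenient for a whole block of operations such as a weight update, while detach() is handy when only one tensor needs to leave the graph.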
Part 03: Gradient Calculation With Autograd
📚 Get my FREE NumPy Handbook:
https://www.python-engineer.com/numpybook
📓 Notebooks available on Patreon:
https://www.patreon.com/patrickloeber
⭐ Join Our Discord : https://discord.gg/FHMg9tKFSN
If you enjoyed this video, please subscribe to the channel!
Official website:
https://pytorch.org/
Part 01:
https://youtu.be/EMXfZB8FVUA
You can find me here:
Website: https://www.python-engineer.com
Twitter: https://twitter.com/patloeber
GitHub: https://github.com/patrickloeber
#Python #DeepLearning #Pytorch
----------------------------------------------------------------------------------------------------------
* This is a sponsored link. By clicking on it you will not have any additional costs, instead you will support me and my project. Thank you so much for the support! 🙏