PyTorch Tutorial 03 - Gradient Calculation With Autograd

Patrick Loeber · Beginner ·🧬 Deep Learning ·6y ago

Key Takeaways

The video demonstrates how to calculate gradients using PyTorch's Autograd package, essential for model optimization in deep learning. It covers the process of creating a computational graph, performing forward and backward passes, and tracking gradients using Autograd.
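The workflow summarized above can be sketched in a few lines. This is a minimal illustration, assuming a working PyTorch install; the variable names are illustrative, and the analytic check at the end is an addition for verification rather than something taken from the video.

```python
import torch

# Leaf tensor that autograd should track.
x = torch.randn(3, requires_grad=True)

# Forward pass: every operation adds a node to the computational graph.
y = x + 2
z = (y * y * 2).mean()

print(y.grad_fn)  # the backward function recorded for the addition
print(z.grad_fn)  # the backward function recorded for the mean

# Backward pass: computes dz/dx and stores the result in x.grad.
z.backward()
print(x.grad)

# Analytic check: z = mean(2 * (x + 2)^2), so dz/dx = 4 * (x + 2) / 3.
expected = 4 * (x.detach() + 2) / 3
print(torch.allclose(x.grad, expected))
```

Calling backward() without an argument works here only because z is a scalar.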

Full Transcript

Hi everybody, welcome to a new PyTorch tutorial. Today we learn about the autograd package in PyTorch and how we can calculate gradients with it. Gradients are essential for our model optimizations, so this is a very important concept that we should understand. Luckily, PyTorch provides the autograd package, which can do all the computations for us. We just have to know how to use it.

So let's start and see how we can calculate gradients in PyTorch. First of all we import torch, of course, and now let's create a tensor: x = torch.randn(3). Now let's print our x. This is a tensor with three random values. Now let's say later we want to calculate the gradient of some function with respect to x. Then we must specify the argument requires_grad=True. By default this is False. If we run this again, we see that PyTorch also tracks that it requires the gradient.

Now whenever we do operations with this tensor, PyTorch will create a so-called computational graph for us. Let's say we do the operation x + 2 and store this in an output, so we say y = x + 2. This will create the computational graph, and it looks like this: for each operation we have a node with inputs and an output. Here the operation is the plus, so an addition, our inputs are x and 2, and the output is y. With this graph and the technique that is called backpropagation we can then calculate the gradients. I will explain the concept of backpropagation in detail in the next video, but for now it's fine to just know how we can use it.

First we do a forward pass. Here we apply this operation, and in the forward pass we calculate the output y. Since we specified that it requires the gradient, PyTorch will automatically create and store a function for us, and this function is then used in the backpropagation to get the gradients. So here y has an attribute grad_fn, which points to a gradient function. In this case it's called AddBackward, and with this function we can then calculate the gradients in the so-called backward pass. This will calculate the gradient of y with respect to x in this case. Now if we print y, we will see exactly this grad_fn attribute, and here it is an AddBackward function, because our operation was a plus and we do the backpropagation later. That's why it's called AddBackward.

Let's do some more operations with our tensors. Let's say we have z = y * y * 2, for example. This tensor then also has the grad_fn attribute. Here grad_fn equals MulBackward, because our operation is a multiplication. We can also say z = z.mean(), so we apply a mean operation, and then our gradient function is MeanBackward. Now when we want to calculate the gradients, the only thing that we must do is call z.backward(). This will calculate the gradient of z with respect to x. x then has a grad attribute where the gradients are stored, so we can print x.grad. If we run this, we see that we have the gradients here in this tensor. So this is all we have to do.

Now let's have a look at what happens when we don't specify this argument. First of all, if we print our tensors, we see that they don't have the grad_fn attribute, and if we try to call the backward function, this will produce an error. It says that the tensor does not require grad and does not have a grad_fn. So remember that we must specify this argument, and then it will work.

One thing that we should also know is what this does in the background: it will create a so-called vector-Jacobian product to get the gradients. I will not go into the mathematical details, but we should know that we have the Jacobian matrix with the partial derivatives, and then we multiply this with a gradient vector, and then we get the final gradients that we are interested in. This is also called the chain rule, and I will explain it in more detail in the next video. But we should know that we actually must multiply it with a vector. In this case, since our z is a scalar value, we don't have to use an argument for our backward function. Our z has only one value, so this is fine.

But let's say we didn't apply the mean operation, so now our z has more than one value in it; it has three values, like x. Now when we try to call the backward function like this, it will produce an error: grad can be implicitly created only for scalar outputs. In this case we have to give it the gradient argument, so we have to create a vector of the same size. Let's say v = torch.tensor([0.1, 1.0, 0.001], dtype=torch.float32), and then we must pass this vector to our backward function, so z.backward(v). Now it will work again. If we run this, it is okay. So we should know that in the background this is a vector-Jacobian product. A lot of times the last operation is some operation that creates a scalar value, so it's okay to call backward without an argument. But if the output is not a scalar, then we must give it the vector.

Another thing that we should know is how we can prevent PyTorch from tracking the history and calculating this grad_fn attribute. For example, sometimes during our training loop, when we want to update our weights, this operation should not be part of the gradient computation. In one of the next tutorials I will give a concrete example of how we apply this autograd package, and then it may become clearer. For now we should know how we can prevent PyTorch from tracking the gradients, and we have three options for this. The first one is to call the requires_grad_() function and set it to False. The second option is to call x.detach(), which creates a new tensor that doesn't require the gradient. And the third option is to wrap the operations in a with statement, so with torch.no_grad(), and then we can do our operations.

Let's try each of these. First we can say x.requires_grad_(False). Whenever a function has a trailing underscore in PyTorch, it means that it will modify our variable in place. Now if we print x, we see that the printed tensor no longer shows requires_grad=True, so now this is False. This is the first option. The second option is to call x.detach(). We say y = x.detach(), which creates a new tensor with the same values, but it doesn't require the gradient. Here we see that our y has the same values but doesn't require the gradient. The last option is to wrap it in a with statement: with torch.no_grad(), and then we can do some operations, for example y = x + 2. Now if we print our y, we see that it doesn't have the grad_fn attribute. If we don't use this and run it like before, then our y has the gradient function. So these are the three ways we can stop PyTorch from creating these gradient functions and tracking the history in our computational graph.

One more very important thing that we should know is that whenever we call the backward function, the gradient for this tensor will be accumulated into the .grad attribute, so the values will be summed up. Here we must be very careful. Let's create a dummy training example where we have some weights: a tensor with ones in it of size, let's say, four, and they require the gradient, so requires_grad=True. Now let's say we have a training loop where we say for epoch in range, and first let's only do one iteration. Here we do model_output = (weights * 3).sum(). This is just a dummy operation that simulates some model output. Then we want to calculate the gradients, so we say model_output.backward(). Now we have the gradients, so we can print weights.grad. Our gradients here are three, so the tensor is filled with threes. If we do another iteration, so two iterations, then the second backward call will again accumulate the values and write them into the grad attribute, so now our grad has sixes in it. If we do a third iteration, it has nines in it. All the values are summed up, and now our gradients are clearly incorrect. So before we do the next iteration and optimization step, we must empty the gradients: we must call weights.grad.zero_(). If we run this now, our gradients are correct again. This is one very important thing that we must keep in mind during our training steps.

Later we will work with the PyTorch built-in optimizers. Let's say we have an optimizer from the torch optimization package, so torch.optim.SGD for stochastic gradient descent, which has our weights as parameters and some learning rate. With this optimizer we can do an optimization step, and then before we do the next iteration we must call the optimizer.zero_grad() function, which will do exactly the same. We will talk about the optimizers in some later tutorials.

For now, the things you should remember are these: whenever we want to calculate the gradients, we must specify the requires_grad parameter and set it to True. Then we can simply calculate the gradients by calling the backward function. Before we do the next iteration in our optimization steps, we must empty our gradients, so we must call the zero_ function again. And we should also know how we can prevent some operations from being tracked in the computational graph.

That's all I wanted to show you for now with the autograd package. I hope you liked it. Please subscribe to the channel, and see you next time. Bye!
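The gradient accumulation behaviour from the dummy training example can be reproduced with a short sketch. The helper run_epochs and its parameter names are hypothetical, introduced only for this illustration; the learning rate of 0.01 is likewise an assumed value.

```python
import torch

def run_epochs(n_epochs, zero_between_steps):
    """Hypothetical helper: return a copy of weights.grad after each backward call."""
    weights = torch.ones(4, requires_grad=True)
    seen = []
    for epoch in range(n_epochs):
        # Dummy operation that simulates some model output.
        model_output = (weights * 3).sum()
        model_output.backward()
        seen.append(weights.grad.clone())
        if zero_between_steps:
            # Empty the gradients in place before the next iteration.
            weights.grad.zero_()
    return seen

accumulated = run_epochs(3, zero_between_steps=False)
reset = run_epochs(3, zero_between_steps=True)
print([g[0].item() for g in accumulated])  # [3.0, 6.0, 9.0]
print([g[0].item() for g in reset])        # [3.0, 3.0, 3.0]

# The same pattern with a built-in optimizer (learning rate is an assumed value).
weights = torch.ones(4, requires_grad=True)
optimizer = torch.optim.SGD([weights], lr=0.01)
(weights * 3).sum().backward()
optimizer.step()       # update: weights = weights - lr * weights.grad
optimizer.zero_grad()  # clears the gradients before the next iteration
print(weights)
```

Depending on the PyTorch version, optimizer.zero_grad() either fills the gradients with zeros or sets them to None; in both cases the next backward call starts from a clean state.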

Original Description

New Tutorial series about Deep Learning with PyTorch!
⭐ Check out Tabnine, the FREE AI-powered code completion tool I use to help me code faster: https://www.tabnine.com/?utm_source=youtube.com&utm_campaign=PythonEngineer *

In this part we learn how to calculate gradients using the autograd package in PyTorch. This tutorial contains the following topics:
- requires_grad attribute for Tensors
- Computational graph
- Backpropagation (brief explanation)
- How to stop autograd from tracking history
- How to zero (empty) gradients

Part 03: Gradient Calculation With Autograd

📚 Get my FREE NumPy Handbook: https://www.python-engineer.com/numpybook
📓 Notebooks available on Patreon: https://www.patreon.com/patrickloeber
⭐ Join Our Discord: https://discord.gg/FHMg9tKFSN

If you enjoyed this video, please subscribe to the channel!

Official website: https://pytorch.org/
Part 01: https://youtu.be/EMXfZB8FVUA

You can find me here:
Website: https://www.python-engineer.com
Twitter: https://twitter.com/patloeber
GitHub: https://github.com/patrickloeber

#Python #DeepLearning #Pytorch

* This is a sponsored link. By clicking on it you will not have any additional costs, instead you will support me and my project. Thank you so much for the support! 🙏

This video tutorial teaches how to calculate gradients using PyTorch's Autograd package, a crucial step in deep learning model optimization. By following the steps outlined in the video, viewers can learn how to create computational graphs, perform forward and backward passes, and track gradients using Autograd. This knowledge is essential for building and optimizing deep learning models.

Key Takeaways
  1. Create a tensor and specify requires_grad=True
  2. Perform operations on the tensor to create a computational graph
  3. Do a forward pass to calculate the output
  4. Call the backward function on the output to calculate the gradients
  5. Read the gradients from the input tensor's grad attribute; grad_fn only points to the function used in the backward pass
  6. For a non-scalar output, call the backward function with the gradient argument
  7. Create that gradient vector with the same size as the output tensor
  8. Use the requires_grad_ function to set the attribute to False in place
  9. Use the detach method to create a new tensor that does not require gradients
  10. Use a with torch.no_grad() statement to prevent operations from being tracked
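The case of a non-scalar output, where backward needs a gradient argument, can be sketched as follows. The vector values match the ones mentioned in the transcript; the closing analytic check is an addition for illustration and assumes the function z = 2 * (x + 2)^2 used in the walkthrough.

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x + 2
z = y * y * 2  # three values, so z is not a scalar

# Without an argument, backward() only works for scalar outputs.
try:
    z.backward()
except RuntimeError as err:
    print("backward() without an argument failed:", err)

# Pass a vector of the same size as z; autograd computes the
# vector-Jacobian product v^T J and stores it in x.grad.
v = torch.tensor([0.1, 1.0, 0.001], dtype=torch.float32)
z.backward(v)
print(x.grad)

# Here the Jacobian is diagonal with entries 4 * (x + 2),
# so the product reduces to v * 4 * (x + 2).
expected = v * 4 * (x.detach() + 2)
print(torch.allclose(x.grad, expected))
```

When the last operation already produces a scalar, such as a mean or a loss value, the vector is not needed.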
💡 Gradient calculation with autograd requires specifying requires_grad=True; calling the backward function computes the gradients and accumulates them in the grad attribute. Tracking can be prevented with the requires_grad_ function, the detach method, or a with torch.no_grad() statement.
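The three ways of stopping autograd from tracking operations can be compared side by side. This is a small sketch with illustrative variable names; the in-place option is shown last only because it changes x itself.

```python
import torch

x = torch.randn(3, requires_grad=True)

# Default behaviour for comparison: the result is tracked.
tracked = x + 2
print(tracked.grad_fn is not None)  # True

# Option: detach() returns a new tensor with the same values
# that does not require gradients.
y = x.detach()
print(y.requires_grad)  # False

# Option: operations inside torch.no_grad() are not recorded.
with torch.no_grad():
    w = x + 2
print(w.requires_grad, w.grad_fn)  # False None

# Option: requires_grad_() changes the flag of x in place
# (the trailing underscore marks an in-place operation).
x.requires_grad_(False)
print(x.requires_grad)  # False
```

Note that detach() leaves x untouched, while requires_grad_(False) modifies x directly, so later operations on x are no longer tracked.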
