PyTorch Tutorial 03 - Gradient Calculation With Autograd

Patrick Loeber · Beginner ·🧬 Deep Learning ·6y ago

Skills: ML Maths Basics 90% · Supervised Learning 80%

Key Takeaways

The video demonstrates how to calculate gradients using PyTorch's Autograd package, essential for model optimization in deep learning. It covers the process of creating a computational graph, performing forward and backward passes, and tracking gradients using Autograd.
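A minimal sketch of that workflow, assuming PyTorch is installed. The fixed input values are an assumption made here (the video uses random values) so that the result can be checked by hand.

```python
import torch

# Fixed values (an assumption; the video uses torch.randn(3)) so the
# gradient can be verified analytically.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Forward pass: every operation on x is recorded in the computational graph.
y = x + 2                  # y.grad_fn points to an add-backward node
z = (y * y * 2).mean()     # reduce to a scalar so backward() needs no argument

# Backward pass: computes dz/dx and stores it in x.grad.
z.backward()

# By hand: z = (1/3) * sum(2 * (x_i + 2)^2), so dz/dx_i = 4 * (x_i + 2) / 3.
expected = 4 * (x.detach() + 2) / 3
print(x.grad)
print(torch.allclose(x.grad, expected))
```

The comparison against the hand-derived formula is only a sanity check; in practice the gradient is simply read from x.grad.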

Full Transcript

Hi everybody, welcome to a new PyTorch tutorial. Today we learn about the autograd package in PyTorch and how we can calculate gradients with it. Gradients are essential for our model optimization, so this is a very important concept that we should understand. Luckily, PyTorch provides the autograd package, which can do all the computations for us. We just have to know how to use it.

So let's start to see how we can calculate gradients in PyTorch. First of all, we import torch, of course, and now let's create a tensor: x = torch.randn(3). Now let's print our x. This is a tensor with three random values. Now let's say later we want to calculate the gradient of some function with respect to x. Then what we have to do is specify the argument requires_grad=True. By default this is False. If we run this again, then we see that PyTorch also tracks that it requires the gradient.

Now whenever we do operations with this tensor, PyTorch will create a so-called computational graph for us. So let's say we do the operation x + 2 and store this in an output, so we say y = x + 2. This will create the computational graph, and it looks like this: for each operation we have a node with inputs and an output. Here the operation is the plus, so an addition, our inputs are x and 2, and the output is y. With this graph and the technique that is called backpropagation, we can then calculate the gradients. I will explain the concept of backpropagation in detail in the next video, but for now it's fine to just know how we can use it.

First we do a forward pass. Here we apply this operation, and in the forward pass we calculate the output y. Since we specified that it requires the gradient, PyTorch will then automatically create and store a function for us, and this function is then used in the backpropagation to get the gradients. So here y has an attribute grad_fn, which will point to a gradient function, and in this case it's called AddBackward. With this function we can then calculate the gradients in the so-called backward pass, so this will calculate the gradient of y with respect to x in this case. Now if we print y, then we will see exactly this grad_fn attribute, and here this is an AddBackward function, because our operation was a plus and we do the backpropagation later. That's why it's called AddBackward.

Let's do some more operations with our tensors. Let's say we have z = y * y * 2, for example. This tensor then also has this grad_fn attribute, so here grad_fn equals MulBackward, because our operation is a multiplication. And, for example, we can say z = z.mean(), so we apply a mean operation, and then our gradient function is MeanBackward. Now when we want to calculate the gradients, the only thing that we must do is call z.backward(). This will then calculate the gradient of z with respect to x. x then has a grad attribute where the gradients are stored, so we can print this, and if we run this, then we see that we have the gradients here in this tensor. So this is all we have to do.

Now let's have a look at what happens when we don't specify this argument. First of all, if we print our tensors, then we see that they don't have this grad_fn attribute, and if we try to call the backward function, then this will produce an error. It says the tensor does not require grad and does not have a grad_fn. So remember that we must specify this argument, and then it will work.

One thing that we should also know is what this does in the background: it creates a so-called vector-Jacobian product to get the gradients. This will look like this. I will not go into the mathematical details, but we should know that we have the Jacobian matrix with the partial derivatives, and then we multiply this with a gradient vector, and then we get the final gradients that we are interested in. This is also called the chain rule, and I will explain this in more detail in the next video. But we should know that we actually must multiply it with a vector. In this case, since our z is a scalar value, we don't have to use an argument for our backward function. Our z here has only one value, so this is fine.

But let's say we didn't apply the mean operation, so now our z has more than one value in it; it has three values. When we try to call the backward function like this, then this will produce an error: grad can be implicitly created only for scalar outputs. So in this case we have to give it the gradient argument. We have to create a vector of the same size, so let's say v = torch.tensor, and here we put, for example, 0.1, 1.0, and 0.001, and we give it a data type of torch.float32. Then we must pass this vector to our backward function, and now it will work again. So we should know that in the background this is a vector-Jacobian product. A lot of times the last operation is some operation that will create a scalar value, so it's okay to call it like this without an argument, but if this is not a scalar, then we must give it the vector.

Another thing that we should know is how we can prevent PyTorch from tracking the history and calculating this grad_fn attribute. For example, sometimes during our training loop, when we want to update our weights, this operation should not be part of the gradient computation. In one of the next tutorials I will give a concrete example of how we apply this autograd package, and then it will become clearer. But for now we should know how we can prevent PyTorch from tracking the gradients, and we have three options for this. The first one is to call the requires_grad_ function and set this to False. The second option is to call x.detach(), which will create a new tensor that doesn't require the gradient. And the third option is to wrap this in a with statement, so with torch.no_grad(), and then we can do our operations.

So let's try each of these. First we can say x.requires_grad_(False). Whenever a function has a trailing underscore in PyTorch, this means that it will modify our variable in place. Now if we print x, then we will see that it doesn't have this requires_grad attribute anymore, so now this is False. This is the first option. The second option is to call x.detach(), so we say y = x.detach(). This will create a new tensor with the same values, but it doesn't require the gradient. So here we see that our y has the same values but doesn't require the gradients. The last option is to wrap it in a with statement, with torch.no_grad(), and then we can do some operations, for example y = x + 2. Now if we print our y, then we see that it doesn't have the gradient function attribute. If we don't use this and run it like this, then our y has the gradient function. So these are the three ways we can stop PyTorch from creating these gradient functions and tracking the history in our computational graph.

Now one more very important thing that we should also know is that whenever we call the backward function, the gradient for this tensor will be accumulated into the .grad attribute, so the values will be summed up. Here we must be very careful. Let's create a dummy training example where we have some weights. This is a tensor with ones in it, of size, let's say, four, and they require the gradient, so requires_grad=True. Now let's say we have a training loop where we say for epoch in range, and first let's only do one iteration. Here we do, let's say, model_output = (weights * 3).sum(). This is just a dummy operation which will simulate some model output. Then we want to calculate the gradients, so we say model_output.backward(), and now we have the gradients, so we can call weights.grad and print this. Our gradients here are three, so the tensor is filled with threes. If we do another iteration, so if we say we have two iterations, then the second backward call will again accumulate the values and write them into the grad attribute, so now our grad has sixes in it. If we do a third iteration, then it has nines in it. So all the values are summed up, and now our gradients are clearly incorrect. Before we do the next iteration and optimization step, we must empty the gradients, so we must call weights.grad.zero_(). If we run this now, then our gradients are correct again. This is one very important thing that we must know during our training steps.

Later we will work with the PyTorch built-in optimizers. Let's say we have an optimizer from the torch optimization package, so torch.optim.SGD for stochastic gradient descent, which has our weights as parameters and some learning rate. With this optimizer we can do an optimization step, and then before we do the next iteration we must call the optimizer.zero_grad() function, which will do exactly the same. We will talk about optimizers in some later tutorials.

For now, the things you should remember are these: whenever we want to calculate the gradients, we must specify the requires_grad parameter and set this to True. Then we can simply calculate the gradients by calling the backward function. Before we do the next operation or the next iteration in our optimization steps, we must empty our gradients, so we must call the zero_ function again. And we should also know how we can prevent some operations from being tracked in the computational graph. That's all I wanted to show you for now with the autograd package. I hope you liked it. Please subscribe to the channel, and see you next time. Bye!
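Two points from the transcript are easy to get wrong in practice: calling backward on a non-scalar output, and forgetting that gradients accumulate. The sketch below illustrates both, assuming PyTorch is installed. The fixed input values, the learning rate of 0.01, and the helper lists accumulated and zeroed are assumptions introduced here for illustration.

```python
import torch

# --- Non-scalar output: backward() needs a vector of the same shape ---
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)  # fixed values (assumption)
z = (x + 2) * (x + 2) * 2            # shape (3,), so not a scalar
v = torch.tensor([0.1, 1.0, 0.001], dtype=torch.float32)
z.backward(v)                        # vector-Jacobian product, stored in x.grad

# The Jacobian here is diagonal with entries dz_i/dx_i = 4 * (x_i + 2),
# so the product is v_i * 4 * (x_i + 2).
vjp_expected = v * 4 * (x.detach() + 2)
print(torch.allclose(x.grad, vjp_expected))

# --- Gradients accumulate across backward() calls unless they are zeroed ---
weights = torch.ones(4, requires_grad=True)

accumulated = []                     # hypothetical helper list for inspection
for epoch in range(3):
    model_output = (weights * 3).sum()
    model_output.backward()
    accumulated.append(weights.grad.clone())   # threes, then sixes, then nines

weights.grad.zero_()                 # start the second experiment from zero

zeroed = []                          # hypothetical helper list for inspection
for epoch in range(3):
    model_output = (weights * 3).sum()
    model_output.backward()
    zeroed.append(weights.grad.clone())        # threes in every iteration
    weights.grad.zero_()                       # empty before the next iteration

# --- Same idea with a built-in optimizer (lr is an arbitrary choice) ---
optimizer = torch.optim.SGD([weights], lr=0.01)
model_output = (weights * 3).sum()
model_output.backward()
optimizer.step()                     # weights <- weights - lr * grad
optimizer.zero_grad()                # clears the gradients for the next iteration
print(weights)
```

Depending on the PyTorch version, optimizer.zero_grad() may reset weights.grad to None rather than to a tensor of zeros, so code that reads the gradient afterwards should handle both cases.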

Original Description

New Tutorial series about Deep Learning with PyTorch!

⭐ Check out Tabnine, the FREE AI-powered code completion tool I use to help me code faster: https://www.tabnine.com/?utm_source=youtube.com&utm_campaign=PythonEngineer *

In this part we learn how to calculate gradients using the autograd package in PyTorch. This tutorial contains the following topics:
- requires_grad attribute for Tensors
- Computational graph
- Backpropagation (brief explanation)
- How to stop autograd from tracking history
- How to zero (empty) gradients

Part 03: Gradient Calculation With Autograd

📚 Get my FREE NumPy Handbook: https://www.python-engineer.com/numpybook
📓 Notebooks available on Patreon: https://www.patreon.com/patrickloeber
⭐ Join Our Discord: https://discord.gg/FHMg9tKFSN

If you enjoyed this video, please subscribe to the channel!

Official website: https://pytorch.org/
Part 01: https://youtu.be/EMXfZB8FVUA

You can find me here:
Website: https://www.python-engineer.com
Twitter: https://twitter.com/patloeber
GitHub: https://github.com/patrickloeber

#Python #DeepLearning #Pytorch

* This is a sponsored link. By clicking on it you will not have any additional costs, instead you will support me and my project. Thank you so much for the support! 🙏



This video tutorial teaches how to calculate gradients using PyTorch's Autograd package, a crucial step in deep learning model optimization. By following the steps outlined in the video, viewers can learn how to create computational graphs, perform forward and backward passes, and track gradients using Autograd. This knowledge is essential for building and optimizing deep learning models.

Key Takeaways

Create a tensor and specify requires_grad=True
Perform operations on the tensor to create a computational graph
Do a forward pass to calculate the output
Call the backward function to calculate the gradients
Read the gradients from the tensor's grad attribute (grad_fn holds the backward function, not the gradients)
For a non-scalar output, call the backward function with the gradient argument
Create that gradient vector with the same size as the output tensor
Use the requires_grad_ function to set the attribute to False in place
Use the detach method to create a new tensor that doesn't require gradients
Use the with torch.no_grad() statement to prevent operations from being tracked

💡 Autograd computes gradients for tensors created with requires_grad=True when the backward function is called and stores them in the grad attribute; tracking can be turned off with the requires_grad_ function, the detach method, or a with torch.no_grad() statement.
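The three ways of turning off gradient tracking can be compared side by side. This is a minimal sketch, assuming PyTorch is installed; the fixed input values and the names w and tracked are assumptions introduced here for illustration.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)  # fixed values (assumption)

# detach() returns a new tensor with the same values that is not tracked.
y = x.detach()

# Operations inside a torch.no_grad() block are not recorded in the graph.
with torch.no_grad():
    w = x + 2

# For comparison: the same operation outside the block is tracked.
tracked = x + 2

# requires_grad_(False) modifies x itself in place (note the trailing underscore).
x.requires_grad_(False)

print(y.requires_grad)   # False
print(w.grad_fn)         # None
print(tracked.grad_fn)   # a backward node, because x was still tracked here
print(x.requires_grad)   # False
```

Which option fits depends on the situation: detach() gives an untracked view of one tensor, torch.no_grad() covers a whole block of operations such as a weight update, and requires_grad_(False) changes the tensor permanently.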
