Tips N Tricks # 8: Using automatic mixed precision training with PyTorch 1.6
Key Takeaways
This video demonstrates how to use automatic mixed precision training with PyTorch 1.6 to train the BERT sentiment model, showcasing its benefits in reducing memory consumption and improving training speed.
Full Transcript
Hello everyone, and welcome to this new short video. In this one I'm going to show you how you can use automatic mixed precision natively from PyTorch. PyTorch 1.6 is going to have native support for automatic mixed precision training, and mixed precision helps you in several ways: your model occupies less memory, so you can use larger batch sizes, and training is faster. Until now we have been doing this with NVIDIA's Apex, but now that it's built in and natively supported by PyTorch, we can try that and see if it brings any improvement. I should make a much longer video on AMP, and I probably will if time permits. You can read more about mixed precision training in the paper called "Mixed Precision Training"; in this video I'm just going to show you how to use it.

To start with, this is the BERT sentiment model that I trained a long time ago. If you have not taken a look at it, you can; it's also in the description box. I'm going to comment out this DataParallel line for now and then train the model with python train.py. I'm not changing anything in the model right now, so let's see what happens. You can see that the model is training and it's going to take around 32 minutes. If we look at the memory consumption, it's around 10 gigabytes of GPU memory: 9,000... 9,191. Let me stop it, and now we can try mixed precision training and see what happens.

To use mixed precision there are a few steps, and it's not very difficult. You have to import amp, which is automatic mixed precision: from torch.cuda import amp. When you used NVIDIA's Apex, you used to write from apex import amp. Then you have to define the scaler before anything begins, so that's your gradient scaler, amp.GradScaler, and then pass it on to the training function. So let me just write scaler here.

Now we go to our training function. Everything remains the same, but when we do the forward pass we have to use the autocasting context: with amp.autocast(). Here you also need the import, from torch.cuda import amp. Inside amp.autocast you do the forward pass of the model and also calculate the loss. When you're done with that, you do the backward pass: scaler.scale(loss).backward(). Then comes the optimizer step, scaler.step(optimizer), and then you have to update the scaler with scaler.update(). As you can see, it's quite straightforward, and I think NVIDIA Apex is similar, so there's not much difference. One more thing that I forgot was to pass the scaler in here. Okay, now we can train the model and see what happens.

As you can see, the model is training and now showing 18 minutes. Previously it was 32 minutes, now it's 18, so things look quite good. If I look at memory consumption, it's now 8 gigabytes, so we saved 2 gigabytes of memory. That's how FP16, or rather mixed precision training, helps you; it's automatic mixed precision, so it's not just FP16.

One more thing to remember: in the original version of the training script we used DataParallel. If you're using DataParallel, you have to autocast the forward function of the model. You import amp again with from torch.cuda import amp, and then you can use it in two ways: you can put amp.autocast() as a decorator on forward, or you can write with amp.autocast() and put everything inside that context. We are going for the decorator. Once you've done that, you can use the model in the same way.

So it's not very difficult; it's very simple, and this is one of the optimizations you should always go for. It's going to make your training much faster, provided your GPU supports mixed precision training, which means Pascal or newer.

That's it for today's video. I hope you liked it; if you did, subscribe to my channel, click on the like button, and share it with your friends. This was all about automatic mixed precision in PyTorch 1.6. It won't work with PyTorch 1.5, so you have to use the nightly version; just remember that. If you have any comments or queries, write them in the comment section and I'll be happy to take a look and reply. Thank you very much, and see you next time. Goodbye.
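
The steps walked through in the transcript (create a GradScaler once, run the forward pass and loss under autocast, then call scale, step and update) can be sketched roughly as below. This is a minimal illustration and not the code from the video: TinyClassifier, train_one_epoch, the random batches, the Adam optimizer and the BCEWithLogitsLoss are all hypothetical stand-ins for the BERT sentiment model and its training script, and the use_amp flag is an added assumption so the sketch also runs, in full precision, on a machine without a GPU. It uses the torch.cuda.amp namespace named in the video; newer PyTorch releases may print a deprecation warning for it.

```python
import torch
import torch.nn as nn
from torch.cuda import amp

# Assumption: mixed precision is only switched on when a CUDA device exists;
# on CPU the sketch falls back to ordinary full-precision training.
use_amp = torch.cuda.is_available()
device = torch.device("cuda" if use_amp else "cpu")


class TinyClassifier(nn.Module):
    """Hypothetical stand-in for the BERT sentiment model."""

    def __init__(self, n_features=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, 1)
        )

    # Decorating forward is the variant the transcript suggests for
    # nn.DataParallel, so autocast is active inside each replica's forward.
    @amp.autocast(enabled=use_amp)
    def forward(self, x):
        return self.net(x)


def train_one_epoch(model, batches, optimizer, scaler):
    model.train()
    loss_fn = nn.BCEWithLogitsLoss()
    losses = []
    for inputs, targets in batches:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        # Forward pass and loss computation run inside the autocast context.
        with amp.autocast(enabled=use_amp):
            outputs = model(inputs)
            loss = loss_fn(outputs, targets)
        # Scale the loss before backward so small fp16 gradients do not underflow.
        scaler.scale(loss).backward()
        # step() unscales the gradients and skips the update if they hold inf/NaN.
        scaler.step(optimizer)
        # Adjust the scale factor for the next iteration.
        scaler.update()
        losses.append(loss.item())
    return losses


torch.manual_seed(0)
model = TinyClassifier().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
scaler = amp.GradScaler(enabled=use_amp)  # defined once, before training begins
# Random data, purely illustrative: 5 batches of 8 samples with binary targets.
batches = [
    (torch.randn(8, 16), torch.randint(0, 2, (8, 1)).float()) for _ in range(5)
]
losses = train_one_epoch(model, batches, optimizer, scaler)
print(losses)
```

The with block in the training loop and the decorator on forward are the two alternatives mentioned in the transcript; both are shown here only for illustration, and either one alone is enough for a single-GPU run.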
Original Description
In this Tips N Tricks video I show you how to use automatic mixed precision training ( #amp ) with #pytorch 1.6 to train the #BERT sentiment model.
If you are not familiar with BERT sentiment model, take a look at this video: https://www.youtube.com/watch?v=hinZO--TEk4
Please subscribe and like the video to help me keep motivated to make awesome videos like this one. :)
To buy my book, Approaching (Almost) Any Machine Learning problem, please visit: https://bit.ly/buyaaml
Follow me on:
Twitter: https://twitter.com/abhi1thakur
LinkedIn: https://www.linkedin.com/in/abhi1thakur/
Kaggle: https://kaggle.com/abhishek
Playlist
Uploads from Abhishek Thakur · 32 of 60