Why use GPU with Neural Networks?

CodeEmporium · Beginner · Deep Learning · 6y ago

Key Takeaways

This video explains why GPUs are used with Neural Networks, using an analogy and demonstrating CUDA with PyTorch code, highlighting the benefits of using GPUs over CPUs for deep learning computations.

Full Transcript

Why use GPUs in neural networks? There's a hardware and a software aspect to this answer, so this is going to be a two-part series. In this video, we're going to take a look at what the GPU hardware provides that makes neural net training much faster, and in the next video, what software and algorithm changes GPUs make to speed up GPU processing.

What is 5 times 6 times 7? 210. 210. What is 8 times 3 times 5? 120. 120. What is 7 times 12 times 3? 252. 252. The CPU is blazing fast; the GPU is almost as fast but slightly trails behind. What is matrix A times matrix B? This. CPU, your answer? It looks like the CPU took its own sweet time.

GPUs are used in neural nets because of this ability to perform matrix multiplications so fast. But why does this happen? Why are CPUs great with scalar multiplications and GPUs better with matrix multiplications? Three main reasons: GPUs have a larger memory bandwidth, they use parallelization, and they have faster and more memory access than CPUs. We're going to delve into these three points in detail and then show some PyTorch code in action that really demonstrates the difference in speed. Let's get started.

So here's a block diagram. We have a system memory; this is the main memory, which contains the matrices that we need to multiply. We have a CPU and a GPU, and each of these also has its own memory that I'll just label as "memory" without the details for now. Let's make an analogy here: the CPU is a Ferrari and the GPU is a truck.

Case one: performing scalar multiplication. The CPU can get this data with a fetch operation to memory. The amount of data that we can fetch is pretty tiny, but it's large enough to fetch floating-point numbers within a couple of trips, and these fetches happen blazing fast, like a Ferrari. A GPU's fetch operation, on the other hand, can transmit a lot more data, but it is not optimized for speed, so it's like a truck: it can take more time to fetch and process small chunks of data. That's why, for scalar multiplications, GPUs can lag behind CPUs.

Case two: multiplying two large matrices. The CPU has a super fast fetch operation, but each fetch can only transmit a tiny amount of data to and from memory, so it can take thousands of trips to fetch all the required data. While one trip is being fetched, the CPU memory is processed and freed up for the next inputs. This is where GPUs work much better. At a time, we can fetch much larger amounts of data from the RAM into the GPU's memory, so the GPU doesn't need to make that many trips. In more technical terms, GPUs have a higher memory bandwidth than CPUs. Memory bandwidth is the amount of data that can be transmitted in a single trip to and from the memory, and this is one of the main reasons GPUs have an edge over CPUs for large matrix multiplication.

But even if we get a lot of packages, or a lot of data, the GPU processors remain idle for a while. After all, they're too fast and the truck is just too slow. So instead of having a single truck, we can have a fleet of trucks. This way the GPU processors do not need to wait around and always have data to work with. This notion of using a fleet of trucks is called parallelization. Coupling large memory bandwidth with parallelization reduces any time a GPU would be waiting. So basically, we fetch a lot of data and we do it fast.

But GPUs offer something more: faster memory. I used the term "memory" very vaguely before, but what I'm really talking about are the caches and the registers. GPUs have a similar structure, but a GPU's L1 and L2 caches are smaller in size than a CPU's L1 and L2 caches. Smaller size means that they can be accessed much faster. Also, the stream processors of a GPU have a bunch of registers, which are super fast, and GPUs have upwards of a thousand times more registers to play around with than CPUs. All of these memory enhancements together make computations just blazing fast, and it's because of a combination of these three points that GPUs are faster than CPUs for matrix multiplication, especially for large matrices. Faster multiplication means that we can perform operations in deep learning faster as well.

Cool, but how do we actually make use of GPUs as a programmer? Well, this is where CUDA comes in. From a programmer's view, CUDA provides an API that allows us to access the components of a GPU, like the stream processors, the caches, and the registers. CUDA already provides a nice high-level abstraction, but deep learning frameworks like PyTorch make life even easier. We don't even need to know about the inner components of a GPU; we just treat it as one big abstract unit called a GPU and we're all good to go.

Let's actually see how matrix multiplication with GPUs performs. I'm in Google Colab, an environment that allows you to run your code in chunks and see the outputs directly. I have some code cells here that we can walk through. The first cell is a scalar multiplication with the CPU; we're just taking the square of a number. PyTorch uses tensors as the fundamental building blocks, and you can think of tensors as wrappers around your scalars and matrices for any dimension. I'm just squaring a 1 x 1 tensor here. We're using a command called timeit to calculate the execution time of this block of code. These are known as magic functions, if you want to look them up for more reference. It looks like the CPU took 5.26 microseconds to do this. That's super fast.

In the second cell we have similar code, but this time we multiply a 10,000 by 10,000 matrix with itself, and this took 11.8 seconds. It probably seems like it's taking longer when you actually run it, but that's only because it's being executed three times.

For the rest of the code, you need to set up a GPU. For this, go to the menu bar and click on Runtime, then in the drop-down go to "Change runtime type", and from the hardware accelerator drop-down select GPU. This initializes the GPU. We now have to tell PyTorch to use this GPU. We use torch.device to get a reference to this GPU, and this .to(device) tells PyTorch to store and process this variable Z using the GPU. Let's run the cell. Okay, so that took 35.2 microseconds, which was slower than the 5.26 microseconds it took without the GPU.

Let's now run the large matrix multiplication with the GPU. Okay, wow, that took less than a second, way faster than the 12 seconds it took before. Pretty slick, right?

What I explained here is only half the reason GPUs are blazing fast with matrix multiplication. In the next video, we're going to take a look at how we can change the original matrix multiplication algorithm for faster processing. But that's all I've got for you now. Hope y'all enjoyed what you saw. Click one of these cards to see some of my amazing work, and I will see you very soon. Bye.
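The notebook cells described in the transcript are not reproduced on this page. As a rough stand-in, here is a minimal sketch of the same CPU-versus-GPU comparison, assuming PyTorch is installed. The helper names `best_time` and `compare_devices` and the default size `n=2000` are hypothetical choices for illustration (the video used a 10,000 by 10,000 matrix), and `time.perf_counter` stands in for the notebook's timeit magic.

```python
import time


def best_time(fn, repeats=3):
    """Call fn() `repeats` times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare_devices(n=2000, repeats=3):
    """Time an n x n matrix product on the CPU and, if one is available, on a CUDA GPU."""
    import torch  # imported here so best_time stays usable without PyTorch

    devices = ["cpu"] + (["cuda"] if torch.cuda.is_available() else [])
    results = {}
    for name in devices:
        x = torch.rand(n, n, device=name)  # float32 tensor created on the target device

        def matmul():
            y = x @ x
            if name == "cuda":
                # GPU kernels are launched asynchronously, so wait for the
                # result before the timer stops.
                torch.cuda.synchronize()
            return y

        matmul()  # warm-up run, so one-time setup cost is not measured
        results[name] = best_time(matmul, repeats)
    return results
```

Calling `compare_devices()` returns a dictionary mapping each device name to its best time in seconds. On a machine without a CUDA GPU only the `"cpu"` entry is present. The exact numbers depend on the hardware, so they will not match the figures quoted in the video.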

Original Description

Start with an analogy. Then delve into CUDA with some PyTorch code to demonstrate why we use GPUs instead of just CPUs.

REFERENCES
[1] Why GPUs work well in deep learning: https://www.quora.com/Why-are-GPUs-well-suited-to-deep-learning
[2] Matrix math on a GPU: http://www.ijcee.org/vol9/949-E1621.pdf
[3] GPUs are not necessarily faster than CPUs: https://www.quora.com/Why-are-GPUs-more-powerful-than-CPUs
[4] Code for difference: https://medium.com/analytics-vidhya/using-pytorch-and-cuda-for-large-computation-in-google-colabs-f1c026c17673
[5] GPU memory architecture: https://medium.com/@ashanpriyadarshana/cuda-gpu-memory-architecture-8c3ac644bd64

This video teaches the importance of using GPUs with Neural Networks, demonstrating how CUDA and PyTorch can be used for efficient deep learning computations. It highlights the benefits of GPU acceleration and provides a basic understanding of GPU memory architecture.

Key Takeaways
  1. Start with an analogy to understand why GPUs are useful for Neural Networks
  2. Delve into CUDA and its application in deep learning
  3. Use PyTorch to demonstrate GPU acceleration for Neural Networks
  4. Explore GPU memory architecture and its implications for deep learning
💡 GPUs are particularly well-suited for deep learning computations due to their ability to perform matrix math operations efficiently, making them a crucial component in modern Neural Network architectures.
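To put a rough number on the "matrix math" in that takeaway, the sketch below estimates the work involved in the 10,000 by 10,000 multiplication from the video. It assumes the schoolbook algorithm (about 2n³ floating-point operations) and 4-byte float32 elements. The function name `matmul_cost` is hypothetical, and the 11.8 seconds is the CPU time quoted in the transcript.

```python
def matmul_cost(n, bytes_per_element=4):
    """Rough cost of multiplying two n x n matrices with the schoolbook algorithm."""
    flops = 2 * n**3  # about n^3 multiplications plus n^3 additions
    bytes_per_matrix = n * n * bytes_per_element
    return flops, bytes_per_matrix


flops, size = matmul_cost(10_000)
print(f"{flops:.1e} floating-point operations")        # 2.0e+12
print(f"{size / 1e6:.0f} MB per float32 matrix")       # 400
print(f"{flops / 11.8 / 1e9:.0f} GFLOP/s at 11.8 s")   # 169
```

Each matrix is far larger than a typical cache, so both how quickly data moves from memory and how many multiply-adds run in parallel affect the total time. That is the point the truck analogy in the video is making.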
