Building Blocks of a Deep Neural Network (C1W4L05)

DeepLearningAI · Beginner · 📐 ML Fundamentals · 8y ago

Key Takeaways

The video explains the basic building blocks of a deep neural network, including forward and backward propagation, and how to implement these components to build a deep neural network. It covers the key components of a neural network, such as parameters, activations, and gradients, and how to use these to compute the output of a layer and the gradients of the loss with respect to the parameters.

Full Transcript

In the earlier videos from this week, as well as the videos from the past several weeks, you've already seen the basic building blocks of forward propagation and back propagation, the key components you need to implement a deep neural network. Let's see how you can put these components together to build your deep net.

Here's a network with a few layers. Let's pick one layer and look at the computations focusing on just that layer for now. For layer l, you have some parameters W[l] and b[l]. For the forward prop, you input the activations a[l-1] from the previous layer and output a[l]. The way we did this previously was: you compute z[l] = W[l] a[l-1] + b[l], and then a[l] = g(z[l]). So that's how you go from the input a[l-1] to the output a[l]. It turns out that for later use it will be useful to also cache the value z[l], so let me include this cache as well, because storing the value z[l] will be useful for the back propagation step later.

Then for the backward step, or the back propagation step, again focusing on the computation for this layer l, you're going to implement a function that inputs da[l] and outputs da[l-1]. Just to flesh out the details: the input is actually da[l] as well as the cache, so you have available to you the value of z[l] that you computed. In addition to outputting da[l-1], you will output the gradients you want in order to implement gradient descent for learning. So this is the basic structure of how you implement this forward step, which I'm going to call the forward function, as well as this backward step, which we'll call the backward function.

Just to summarize: in layer l, you're going to have the forward step, or forward prop, or forward function. It inputs a[l-1] and outputs a[l], and in order to make this computation you need to use W[l] and b[l]. It also outputs a cache, which contains z[l]. Then the backward function, used in the back prop step, is another function that inputs da[l] and outputs da[l-1]. So it tells you, given the derivatives with respect to these activations, that's da[l], what are the derivatives with respect to the activations from the previous layer, that is, how much do I wish a[l-1] would change. Within this box you need to use W[l] and b[l], and it turns out that along the way you end up computing dz[l]. This backward function can also output dW[l] and db[l]. I was sometimes using red arrows to denote the backward iteration, so if you prefer, we could draw these arrows in red.

If you can implement these two functions, then the basic computation of the neural network will be as follows. You take the input features a[0], feed that in, and that will compute the activations of the first layer. Let's call that a[1]. To do that you need W[1] and b[1], and then you also cache away z[1]. Having done that, you feed that to the second layer, and then using W[2] and b[2] you compute the activations of the next layer, a[2], and so on, until eventually you end up outputting a[L], which is equal to y-hat. Along the way we cached all of these values z. So that's the forward propagation step.

Now for the back propagation step, what we're going to do is a backward sequence of iterations in which you're going backwards and computing gradients. You feed in da[L], and then this box will give us da[L-1], and so on until we get da[2], da[1]. You could actually get one more output to compute da[0], but this is the derivative with respect to your input features, which is not useful, at least for training the weights of these supervised neural networks, so you could just stop there. Along the way, back prop also ends up outputting dW[l] and db[l]: this box uses W[l] and b[l], this one would output dW[3], db[3], and so on. So you end up computing all the derivatives you need. Just to maybe fill in the structure of this a little bit more: these boxes will use those parameters as well, W[l] and b[l], and we'll see later that inside these boxes we end up computing the dz's as well.

So one iteration of training for a neural network involves starting with a[0], which is x, going through forward prop as follows, computing y-hat, then using that to compute da[L], and then doing back prop. Now you have all these derivative terms, so W[l] gets updated as W[l] minus the learning rate times dW[l] for each of the layers, and similarly for b[l]. So that's one iteration of gradient descent for your neural network.

Before moving on, just one more implementational detail. Conceptually, it will be useful to think of the cache as storing the value of z for the backward functions. But when you implement this, and you'll see this in the programming exercise, you'll find that the cache may be a convenient way to get the values of the parameters W[1], b[1] into the backward function as well. So in the programming exercise, you actually store in the cache z as well as W and b, so this stores z[2], W[2], b[2]. From an implementational standpoint, I just find this a convenient way to get the parameters copied to where you need to use them later when you're computing back propagation. So that's just an implementational detail that you'll see when you do the programming exercise.

You've now seen the basic building blocks for implementing a deep neural network. In each layer there's a forward propagation step and a corresponding backward propagation step, and there's a cache to pass information from one to the other. In the next video, we'll talk about how you can actually implement these building blocks. Let's go on to the next video.
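The forward and backward functions for a single layer, as described in the transcript, might look roughly like the NumPy sketch below. The names `layer_forward` and `layer_backward`, the choice of ReLU for g, and the 1/m averaging convention are assumptions made for illustration; the video itself only fixes the inputs and outputs of each function (a[l-1] in and a[l] plus a cache out, then da[l] plus the cache in and da[l-1], dW[l], db[l] out). Following the implementational note in the transcript, the cache here holds W and b alongside z.

```python
import numpy as np


def relu(z):
    """ReLU activation; an assumed choice for g, not specified in the video."""
    return np.maximum(0.0, z)


def layer_forward(a_prev, W, b):
    """Forward function for one layer: input a[l-1], output a[l] and a cache."""
    z = W @ a_prev + b            # z[l] = W[l] a[l-1] + b[l]
    a = relu(z)                   # a[l] = g(z[l])
    cache = (a_prev, W, b, z)     # z[l] plus the values the backward step reuses
    return a, cache


def layer_backward(da, cache):
    """Backward function for one layer: input da[l] and the cache,
    output da[l-1], dW[l], db[l]."""
    a_prev, W, b, z = cache
    m = a_prev.shape[1]                         # examples are stored as columns
    dz = da * (z > 0)                           # dz[l] = da[l] * g'(z[l]) for ReLU
    dW = (dz @ a_prev.T) / m                    # same shape as W
    db = np.sum(dz, axis=1, keepdims=True) / m  # same shape as b
    da_prev = W.T @ dz                          # gradient passed to layer l-1
    return da_prev, dW, db


# Hypothetical usage: a layer with 3 inputs and 4 units, on 5 examples.
rng = np.random.default_rng(0)
a_prev = rng.standard_normal((3, 5))
W = rng.standard_normal((4, 3)) * 0.01
b = np.zeros((4, 1))

a, cache = layer_forward(a_prev, W, b)
da_prev, dW, db = layer_backward(np.ones_like(a), cache)
print(a.shape, da_prev.shape, dW.shape, db.shape)
```

Each gradient has the same shape as the quantity it differentiates, which is a quick sanity check when wiring layers together.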

Original Description

Take the Deep Learning Specialization: http://bit.ly/3aqFCk3
Check out all our courses: https://www.deeplearning.ai
Subscribe to The Batch, our weekly newsletter: https://www.deeplearning.ai/thebatch
Follow us:
Twitter: https://twitter.com/deeplearningai_
Facebook: https://www.facebook.com/deeplearningHQ/
Linkedin: https://www.linkedin.com/company/deeplearningai

Key Takeaways
  1. Define the parameters and activations of a layer
  2. Compute the output of a layer using forward propagation
  3. Compute the gradients of the loss with respect to the parameters using backward propagation
  4. Cache the intermediate results for later use
  5. Update the parameters using gradient descent
💡 The cache stores intermediate results of the forward propagation step (the value z[l] and, in the programming exercise, W[l] and b[l] as well) so that the backward propagation step can reuse them when computing the gradients of the loss with respect to the parameters.
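The five takeaways above can be put together into one iteration of gradient descent, sketched below in NumPy. Several details are assumptions rather than anything stated in the video: the function names, ReLU hidden layers with a sigmoid output, the binary cross-entropy cost, the toy data, and the learning rate.

```python
import numpy as np


def forward(a_prev, W, b, activation):
    """Forward function: returns a[l] and a cache for the backward step."""
    z = W @ a_prev + b
    a = np.maximum(0.0, z) if activation == "relu" else 1.0 / (1.0 + np.exp(-z))
    return a, (a_prev, W, z, activation)


def backward(da, cache):
    """Backward function: returns da[l-1], dW[l], db[l]."""
    a_prev, W, z, activation = cache
    m = a_prev.shape[1]
    if activation == "relu":
        dz = da * (z > 0)
    else:
        s = 1.0 / (1.0 + np.exp(-z))
        dz = da * s * (1.0 - s)
    dW = dz @ a_prev.T / m
    db = dz.sum(axis=1, keepdims=True) / m
    return W.T @ dz, dW, db


def train_step(X, Y, params, learning_rate=0.1):
    """One iteration of gradient descent; params is a list of (W, b) per layer."""
    # Forward pass: a[0] = X, ..., a[L] = y_hat, caching at every layer.
    a, caches = X, []
    for i, (W, b) in enumerate(params):
        act = "sigmoid" if i == len(params) - 1 else "relu"
        a, cache = forward(a, W, b, act)
        caches.append(cache)
    y_hat = a
    cost = float(-np.mean(Y * np.log(y_hat) + (1 - Y) * np.log(1 - y_hat)))

    # Backward pass: start from da[L] and walk the layers in reverse.
    da = -(Y / y_hat) + (1 - Y) / (1 - y_hat)
    new_params = [None] * len(params)
    for i in reversed(range(len(params))):
        da, dW, db = backward(da, caches[i])
        W, b = params[i]
        # Update rule: W := W - learning_rate * dW, and similarly for b.
        new_params[i] = (W - learning_rate * dW, b - learning_rate * db)
    return new_params, cost


# Hypothetical toy problem: 2 input features, one hidden layer of 4 units.
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 8))
Y = (X[0:1] + X[1:2] > 0).astype(float)
sizes = [2, 4, 1]
params = [(rng.standard_normal((n_out, n_in)) * 0.5, np.zeros((n_out, 1)))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

costs = []
for _ in range(200):
    params, cost = train_step(X, Y, params)
    costs.append(cost)
print(costs[0], costs[-1])
```

The backward loop stops after layer 1. The final `da` it returns is da[0], the derivative with respect to the input features, which is not needed for updating the weights.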
