Vectorizing Across Multiple Examples (C1W3L04)

DeepLearningAI · Beginner · 📐 ML Fundamentals · 8y ago

Key Takeaways

The video demonstrates how to vectorize across multiple training examples in a neural network, using equations from the previous video and modifying them to compute outputs for all examples at once. The process involves stacking training examples in columns of a matrix and using vectorized implementations of the equations to compute the outputs.

Full Transcript

In the last video, you saw how to compute the prediction of a neural network given a single training example. In this video, you'll see how to vectorize across multiple training examples. The outcome will be quite similar to what you saw for logistic regression: by stacking up different training examples in different columns of a matrix, you'll be able to take the equations from the previous video and, with very little modification, make the neural network compute the outputs on all the examples pretty much at the same time. So let's see the details of how to do that.

These were the four equations we had from the previous video for how you compute z^[1], a^[1], z^[2], and a^[2]. They tell you how, given an input feature vector x, you can generate a^[2] = ŷ for a single training example. Now, if you have m training examples, you need to repeat this process for, say, the first training example, x^(1), to compute ŷ^(1), which is the prediction on your first training example; then x^(2), to generate the prediction ŷ^(2); and so on down to x^(m), to generate the prediction ŷ^(m). In order to write this in the activation function notation as well, I'm going to write these as a^[2](1), a^[2](2), and so on up to a^[2](m). So in this notation a^[2](i), the round bracket (i) refers to training example i, and the square bracket [2] refers to layer 2. OK, so that's how the square bracket and round bracket indices work.

This suggests that if you have an unvectorized implementation and want to compute the predictions for all your training examples, you need to do, for i = 1 to m, basically implement these four equations:

    z^[1](i) = W^[1] x^(i) + b^[1]
    a^[1](i) = sigmoid(z^[1](i))
    z^[2](i) = W^[2] a^[1](i) + b^[2]
    a^[2](i) = sigmoid(z^[2](i))

So it's basically the four equations on top, with the superscript round bracket (i) added to all the variables that depend on the training example, that is, to x, z, and a, if you want to compute all the outputs on your m training examples. What we'd like to do is vectorize this whole computation so as to get rid of this for loop.

By the way, in case it seems like I'm getting into a lot of nitty-gritty linear algebra, it turns out that being able to implement this correctly is important in the deep learning era, and we actually chose the notation very carefully for this course to make these vectorization steps as easy as possible. So I hope that going through this nitty-gritty will actually help you get correct implementations of these algorithms working more quickly.

All right, so let me just copy this whole block of code to the next slide, and then we'll see how to vectorize this. Here's what we had from the previous slide, with the for loop going over all m training examples. Recall that we defined the matrix X to be equal to our training examples stacked up in columns, like so. So take the training examples and stack them in columns; this becomes an n, or maybe n_x, by m dimensional matrix.

I'm just going to give away the punchline and tell you what you need to implement in order to have a vectorized implementation of this for loop. It turns out what you need to do is compute:

    Z^[1] = W^[1] X + b^[1]
    A^[1] = sigmoid(Z^[1])
    Z^[2] = W^[2] A^[1] + b^[2]
    A^[2] = sigmoid(Z^[2])

So if you want, the analogy is that we went from the lowercase vectors x to this capital X matrix by stacking up the lowercase x's in different columns. If you do the same thing for the z's, so for example, if you take z^[1](1), z^[1](2), and so on up to z^[1](m), which are all column vectors (that's this first quantity, all m of them), and stack them in columns, then this gives you the matrix Z^[1]. Similarly, if you look at this quantity and take a^[1](1), a^[1](2), and so on up to a^[1](m), and stack them up in columns, then just as we went from lowercase x to capital X and lowercase z to capital Z, this goes from the lowercase a's, which are vectors, to this capital A^[1] over there. And similarly for Z^[2] and A^[2]: they're also obtained by taking these vectors and stacking them horizontally, in order to get capital Z^[2] and capital A^[2].

One property of this notation that might help you think about it is that in these matrices, say Z and A, horizontally we're going to index across training examples. So the horizontal index corresponds to different training examples: as you sweep from left to right, you're scanning through the training set. And vertically, the vertical index corresponds to different nodes in the neural network. So for example, the value at the top-left corner of the matrix corresponds to the activation of the first hidden unit on the first training example. One value down corresponds to the activation of the second hidden unit on the first training example, then the third hidden unit on the first training example, and so on. So as you scan down, you're indexing into the hidden unit number, whereas if you move horizontally, you're going from the first hidden unit on the first training example to the first hidden unit on the second training example, the third training example, and so on, until this node here corresponds to the activation of the first hidden unit on the final training example, the m-th training example.

OK, so horizontally the matrix A goes over different training examples, and vertically the different indices in the matrix A correspond to different hidden units. A similar intuition holds true for the matrix Z, as well as for X, where horizontally it corresponds to different training examples and vertically it corresponds to different input features, which are really the different nodes in the input layer of the neural network.

So with these equations, you now know how to implement a neural network with vectorization, that is, vectorization across multiple examples. In the next video, I want to show you a bit more justification for why this is a correct implementation of this type of vectorization. It turns out the justification will be similar to what you have seen for logistic regression. Let's go on to the next video.
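
The comparison between the for loop and the vectorized equations can be tried out directly. The following is a minimal NumPy sketch, not code from the course: the function names (`forward_loop`, `forward_vectorized`), the layer sizes, and the random parameters are all assumptions made for illustration. It assumes a sigmoid activation in both layers, as in the lecture, and relies on NumPy broadcasting to add the column vector `b1` to every column of `W1 @ X`.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def forward_loop(X, W1, b1, W2, b2):
    """Unvectorized version: apply the four equations once per example."""
    m = X.shape[1]
    A2 = np.zeros((W2.shape[0], m))
    for i in range(m):
        x_i = X[:, [i]]         # column vector x^(i), shape (n_x, 1)
        z1 = W1 @ x_i + b1      # z^[1](i)
        a1 = sigmoid(z1)        # a^[1](i)
        z2 = W2 @ a1 + b2       # z^[2](i)
        a2 = sigmoid(z2)        # a^[2](i)
        A2[:, [i]] = a2         # store as column i of A^[2]
    return A2


def forward_vectorized(X, W1, b1, W2, b2):
    """Vectorized version: all m examples in four matrix operations."""
    Z1 = W1 @ X + b1            # b1 is broadcast across the m columns
    A1 = sigmoid(Z1)
    Z2 = W2 @ A1 + b2
    A2 = sigmoid(Z2)
    return Z1, A1, Z2, A2


# Hypothetical sizes: 3 input features, 4 hidden units, 5 examples.
rng = np.random.default_rng(0)
n_x, n_h, m = 3, 4, 5
X = rng.standard_normal((n_x, m))
W1 = rng.standard_normal((n_h, n_x))
b1 = rng.standard_normal((n_h, 1))
W2 = rng.standard_normal((1, n_h))
b2 = rng.standard_normal((1, 1))

A2_loop = forward_loop(X, W1, b1, W2, b2)
Z1, A1, Z2, A2_vec = forward_vectorized(X, W1, b1, W2, b2)

# The two versions should agree up to floating-point rounding.
print(np.allclose(A2_loop, A2_vec))
```

Keeping the biases as `(n_h, 1)` and `(1, 1)` column vectors, rather than 1-D arrays, is a deliberate choice in this sketch: it makes the broadcast across columns unambiguous.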

Original Description

Take the Deep Learning Specialization: http://bit.ly/2IfZoml Check out all our courses: https://www.deeplearning.ai Subscribe to The Batch, our weekly newsletter: https://www.deeplearning.ai/thebatch Follow us: Twitter: https://twitter.com/deeplearningai_ Facebook: https://www.facebook.com/deeplearningHQ/ Linkedin: https://www.linkedin.com/company/deeplearningai

This video teaches how to vectorize across multiple training examples in a neural network, allowing for efficient computation of outputs for all examples at once. The process involves modifying equations from the previous video to use matrix operations and vectorized implementations.

Key Takeaways
  1. Stack training examples in columns of a matrix
  2. Modify equations to compute outputs for all examples at once
  3. Use vectorized implementations of the equations
  4. Compute the matrices Z^[1], A^[1], Z^[2], and A^[2], whose columns hold the per-example values
💡 Vectorization across multiple examples allows for efficient computation of outputs for all examples at once, making it a crucial concept in deep learning.
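
To make the column layout in the takeaways concrete, here is a small NumPy sketch. The example vectors, the weights in `W1`, and the sizes are made up for illustration and do not come from the video. It stacks three examples as columns of `X` and shows how rows and columns of `A1` map to hidden units and training examples.

```python
import numpy as np

# Three hypothetical training examples, each with n_x = 2 features.
x1 = np.array([1.0, 2.0])
x2 = np.array([3.0, 4.0])
x3 = np.array([5.0, 6.0])

# Stack the examples as columns: X has shape (n_x, m) = (2, 3).
X = np.column_stack([x1, x2, x3])

# Hypothetical layer-1 parameters for n_h = 4 hidden units.
W1 = np.arange(8, dtype=float).reshape(4, 2) / 10.0
b1 = np.zeros((4, 1))

Z1 = W1 @ X + b1                  # shape (n_h, m)
A1 = 1.0 / (1.0 + np.exp(-Z1))    # sigmoid applied element-wise

print(X.shape)    # (2, 3)
print(A1.shape)   # (4, 3)

# Column i holds every hidden unit's activation for example i.
activations_example_1 = A1[:, 0]

# Row j holds hidden unit j's activation across all examples.
unit_1_all_examples = A1[0, :]
```

Reading down a column scans the hidden units for one example; reading across a row scans the training set for one hidden unit.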
