Bias Correction of Exponentially Weighted Averages (C2W2L05)
Key Takeaways
The video explains bias correction for exponentially weighted averages, a technique that makes the estimate more accurate during the initial warm-up phase, when the uncorrected average is pulled toward its zero initialization. The correction replaces v_t with v_t / (1 - beta^t), where t is the index of the current time step (the current day in the temperature example).
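One way to see where the 1 - beta^t factor comes from, offered as a sketch and not as the video's own derivation: unrolling the recursion v_t = beta v_{t-1} + (1 - beta) theta_t from v_0 = 0 gives a weighted sum of the observations whose weights total 1 - beta^t, so dividing by that total yields a proper weighted average. The summation index k and the symbol \hat{v}_t for the corrected estimate are notation assumed here.

```latex
v_t = (1-\beta)\sum_{k=1}^{t} \beta^{\,t-k}\,\theta_k,
\qquad
(1-\beta)\sum_{k=1}^{t} \beta^{\,t-k} = 1-\beta^{t},
\qquad
\hat{v}_t = \frac{v_t}{1-\beta^{t}}.
```

For t = 2 and beta = 0.98 the weights are 0.0196 and 0.02, which sum to 1 - 0.98^2 = 0.0396, matching the example in the transcript.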
Full Transcript
You've learned how to implement exponentially weighted averages. There's one technical detail called bias correction that can make your computation of these averages more accurate. Let's see how that works.

In the previous video, you saw this figure for beta equals 0.9 and this figure for beta equals 0.98. But it turns out that if you implement the formula as written here, you won't actually get the green curve when, say, beta equals 0.98. You actually get the purple curve here, and you'll notice that the purple curve starts off really low. So let's see how to fix that.

When you're implementing a moving average, you initialize it with v_0 = 0, and then v_1 = 0.98 v_0 + 0.02 theta_1. But v_0 is equal to 0, so that term just goes away, and v_1 is just 0.02 times theta_1. That's why, if the first day's temperature is, say, 40 degrees Fahrenheit, then v_1 will be 0.02 times 40, which is 0.8. So you get a much lower value down here, and it's not a very good estimate of the first day's temperature.

v_2 will be 0.98 v_1 + 0.02 theta_2, and if you plug in v_1, which is this down here, and multiply it out, you find that v_2 = 0.98 x 0.02 x theta_1 + 0.02 x theta_2, and that is 0.0196 theta_1 + 0.02 theta_2. So again, assuming theta_1 and theta_2 are positive numbers, when you compute this, v_2 will be much less than theta_1 or theta_2. So v_2 isn't a very good estimate of the first two days' temperature of the year.

It turns out that there's a way to modify this estimate that makes it much better, that makes it more accurate, especially during this initial phase of your estimate, which is that instead of taking v_t, you take v_t divided by 1 minus beta to the power of t, where t is the current day you're on. So let's take a concrete example. When t is equal to 2, 1 - beta^t is 1 - 0.98^2, and it turns out that this is 0.0396. And so your estimate of the temperature on day 2 becomes v_2 divided by 0.0396, and this is going to be (0.0196 theta_1 + 0.02 theta_2) / 0.0396. You'll notice that these two weights add up to the denominator, 0.0396, and so this becomes a weighted average of theta_1 and theta_2, and this removes the bias.

You'll notice that as t becomes large, beta to the t will approach 0, which is why when t is large enough, the bias correction makes almost no difference. This is why, when t is large, the purple line and the green line pretty much overlap. But during this initial phase of learning, when you're still warming up your estimates, bias correction can help you obtain a better estimate of the temperature, and it is this bias correction that helps you go from the purple line to the green line.

In machine learning, for most implementations of the exponentially weighted average, people don't often bother to implement bias correction, because most people would rather just wait out that initial period, have a slightly more biased estimate, and go from there. But if you are concerned about the bias during this initial phase, while your exponentially weighted moving average is still warming up, then bias correction can help you get a better estimate early on.

So you now know how to implement exponentially weighted moving averages. Let's go on and use this to build some better optimization algorithms.
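The update rule and the correction described in the transcript can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the function name `ewa`, its parameters, and the second temperature value (50) are assumptions made for the example; only beta = 0.98 and the 40-degree first day come from the transcript.

```python
def ewa(thetas, beta=0.98, bias_correction=True):
    """Exponentially weighted average of a sequence, starting from v_0 = 0.

    Hypothetical helper for illustration. Returns one estimate per input
    value; with bias_correction=True each v_t is divided by (1 - beta**t).
    """
    v = 0.0
    estimates = []
    for t, theta in enumerate(thetas, start=1):
        # Recursion from the transcript: v_t = beta * v_{t-1} + (1 - beta) * theta_t
        v = beta * v + (1.0 - beta) * theta
        if bias_correction:
            # Divide by the total weight 1 - beta^t to undo the zero-initialization bias
            estimates.append(v / (1.0 - beta ** t))
        else:
            estimates.append(v)
    return estimates


# Day 1 is the 40 degrees Fahrenheit from the transcript; day 2 is an assumed value.
temps = [40.0, 50.0]
raw = ewa(temps, bias_correction=False)
corrected = ewa(temps, bias_correction=True)
print(raw)
print(corrected)
```

With these inputs the uncorrected first-day estimate is about 0.8, while the corrected one recovers roughly 40, and the corrected second-day estimate falls between the two observed temperatures. Because beta ** t shrinks toward 0 as t grows, the two lists converge for long sequences, which is the overlap of the purple and green curves described above.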
Original Description
Take the Deep Learning Specialization: http://bit.ly/3cqn45p
Check out all our courses: https://www.deeplearning.ai
Subscribe to The Batch, our weekly newsletter: https://www.deeplearning.ai/thebatch
Follow us:
Twitter: https://twitter.com/deeplearningai_
Facebook: https://www.facebook.com/deeplearningHQ/
Linkedin: https://www.linkedin.com/company/deeplearningai