1D convolution for neural networks, part 6: Input gradient
Key Takeaways
Derives the input gradient for a 1D convolution in a neural network: the input gradient is the output gradient convolved with the reversed kernel. Part 6 of a 9-part series on 1D convolution for neural networks.
Full Transcript
Our next step is to regroup these so that all of our partials with respect to a particular input value are grouped together. So these top two lines include all of the partials with respect to x sub j. It's just all of the little expressions from the previous set of equations, rearranged and reordered, but now we're starting to gather up all of the contributions of the partial of y with respect to an individual x element, an individual input.
If you look at them, you can represent them with a pattern: the partial of y sub i plus k with respect to x sub i is w sub k. This is a shorthand way to represent all of the equations here, and you can see that the pattern holds for any x sub i. For any input element, if we gather up all of the partials with respect to it, we can represent all of those expressions with this shorthand. So for any input x sub i, the partial of the output y sub i plus k is equal to w sub k. That's a neat little way to condense that, and something that we're going to make good use of.
Now we can plug this back into our chain rule, where the input gradient is equal to the summation of the output gradient times each of these partials. We can then substitute in this expression, w sub k, and we have the input gradient as the sum of the output gradient times w sub k. So this is a fairly slick way to do our backpropagation.
There's one more step we can do. If we take w sub k and flip it left to right, which we're going to represent with this left-handed arrow above it, then everything that was minus k becomes plus k, so we can change the sign on the k index in our output gradient and everything else stays the same. We just did a little trick by pre-flipping this w sub k. Now this is a sliding dot product: we have an array, which is our output gradient, and we have this kernel, our flipped w sub k, and we're summing over the full length of that kernel, for each value of our input x sub i.
So then we can represent that even more concisely: our input gradient is our output gradient convolved with the reversed version of our kernel. This is a really slick little result. It says that the derivative of a convolution is a convolution with the kernel flipped. There's a pleasing symmetry with that. Math is beautiful, exhibit 673. Very, very slick.
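The closing result of the transcript can be checked numerically. What follows is a minimal sketch in plain Python, not code from the video. The function names are hypothetical, and it assumes no padding and a stride of 1, so only positions where the kernel fully overlaps the input produce an output. It also assumes a stand-in loss that is a weighted sum of the outputs, so that the weights play the role of the output gradient. Under those assumptions, the input gradient computed by convolving the output gradient with the flipped kernel should match a finite-difference estimate.

```python
def convolve_full(a, b):
    """Full 1D convolution: out[n] = sum over m of a[n - m] * b[m]."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_val in enumerate(a):
        for m, b_val in enumerate(b):
            out[i + m] += a_val * b_val
    return out


def conv1d_forward(x, w):
    """Forward pass (assumed: no padding, stride 1).

    Keeps only the outputs where the kernel fully overlaps the input.
    """
    full = convolve_full(x, w)
    return full[len(w) - 1 : len(x)]


def conv1d_input_gradient(output_grad, w):
    """Input gradient: the output gradient convolved with the flipped kernel."""
    return convolve_full(output_grad, w[::-1])


def numerical_input_gradient(x, w, output_grad, eps=1e-6):
    """Central finite-difference estimate of dL/dx.

    Uses the stand-in loss L = sum_j output_grad[j] * y[j], so that
    dL/dy[j] equals output_grad[j] exactly.
    """
    def loss(values):
        y = conv1d_forward(values, w)
        return sum(g * y_val for g, y_val in zip(output_grad, y))

    grad = []
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += eps
        minus[i] -= eps
        grad.append((loss(plus) - loss(minus)) / (2 * eps))
    return grad


# Illustrative values only; they are not taken from the video.
x = [1.0, 2.0, -1.0, 0.5, 3.0]   # input signal
w = [0.5, -1.0, 2.0]             # kernel of length 3
output_grad = [1.0, -2.0, 0.5]   # dL/dy, one value per output element

analytic = conv1d_input_gradient(output_grad, w)
numeric = numerical_input_gradient(x, w, output_grad)

print(analytic)
print(all(abs(a - n) < 1e-6 for a, n in zip(analytic, numeric)))
```

The analytic gradient has one entry per input element. Each entry is a sliding dot product between the output gradient and the flipped kernel, which is the pattern the transcript describes.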
Original Description
Part of a 9-part series on 1D convolution for neural networks.
Catch the rest at https://e2eml.school/321