The spelled-out intro to neural networks and backpropagation: building micrograd
This is the most step-by-step spelled-out explanation of backpropagation and training of neural networks. It only assumes basic knowledge of Python and a vague recollection of calculus from high school.
Links:
- micrograd on github: https://github.com/karpathy/micrograd
- Jupyter notebooks I built in this video: https://github.com/karpathy/nn-zero-to-hero/tree/master/lectures/micrograd
- my website: https://karpathy.ai
- my twitter: https://twitter.com/karpathy
- "discussion forum": nvm, use youtube comments below for now :)
- (new) Neural Networks: Zero to Hero series Discord channel: https://discord.gg/3zy8kqD9Cp , for people who'd like to chat more and go beyond youtube comments
Exercises:
You should now be able to complete the following Google Colab notebook, good luck!:
https://colab.research.google.com/drive/1FPTx1RXtBfc4MaTkf7viZZD4U2F9gtKN?usp=sharing
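The exercises build on the same ideas the lecture opens with: estimating the derivative of a simple function numerically before deriving it analytically. As a warm-up, here is a minimal sketch of that idea, using the quadratic from the early chapters (the exact function and step size in the notebook may differ):

```python
# Numerical vs. analytic derivative of f(x) = 3x^2 - 4x + 5 at x = 3.0.
# A symmetric finite difference approximates the slope; the analytic
# derivative 6x - 4 confirms it. (Illustrative sketch, not notebook code.)
def f(x):
    return 3 * x**2 - 4 * x + 5

h = 1e-6
x = 3.0
numeric = (f(x + h) - f(x - h)) / (2 * h)  # central difference
analytic = 6 * x - 4                       # d/dx of 3x^2 - 4x + 5
print(numeric, analytic)                   # both close to 14.0
```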
Chapters:
00:00:00 intro
00:00:25 micrograd overview
00:08:08 derivative of a simple function with one input
00:14:12 derivative of a function with multiple inputs
00:19:09 starting the core Value object of micrograd and its visualization
00:32:10 manual backpropagation example #1: simple expression
00:51:10 preview of a single optimization step
00:52:52 manual backpropagation example #2: a neuron
01:09:02 implementing the backward function for each operation
01:17:32 implementing the backward function for a whole expression graph
01:22:28 fixing a backprop bug when one node is used multiple times
01:27:05 breaking up a tanh, exercising with more operations
01:39:31 doing the same thing but in PyTorch: comparison
01:43:55 building out a neural net library (multi-layer perceptron) in micrograd
01:51:04 creating a tiny dataset, writing the loss function
01:57:56 collecting all of the parameters of the neural net
02:01:12 doing gradient descent optimization manually, training the network
02:14:03 summary of what we learned, how to go towards modern neural nets
02:16:46 walkthrough of the full code of micrograd on github
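The chapters above build micrograd's `Value` object piece by piece: each operation records its inputs and a local `_backward` closure, gradients accumulate with `+=` (the multi-use-node fix at 1:22:28), and `backward()` walks a topological sort of the graph. A condensed sketch in that spirit, applied to the neuron example from the lecture (the real micrograd `Value` supports more operations; see the repo):

```python
import math

# Minimal autograd Value in the style of micrograd (sketch, not the
# full library). Supports +, *, tanh, and reverse-mode backprop.
class Value:
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0                 # dL/d(this node), filled by backward()
        self._backward = lambda: None   # local chain-rule step
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # addition routes the gradient through unchanged;
            # += (not =) so nodes used multiple times accumulate correctly
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # chain rule: d(out)/d(self) = other.data, and vice versa
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():
            self.grad += (1 - t**2) * out.grad  # d/dx tanh(x) = 1 - tanh(x)^2
        out._backward = _backward
        return out

    def backward(self):
        # topological order of the graph, then apply each local
        # _backward in reverse, seeding the output gradient with 1.0
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# the neuron from the lecture: o = tanh(x1*w1 + x2*w2 + b)
x1, x2 = Value(2.0), Value(0.0)
w1, w2 = Value(-3.0), Value(1.0)
b = Value(6.8813735870195432)   # chosen in the video so o ≈ 0.7071
o = (x1*w1 + x2*w2 + b).tanh()
o.backward()
print(o.data, x1.grad, w1.grad)
```

After `backward()`, every input holds the derivative of the output with respect to it, which is exactly what a gradient-descent step (chapter 2:01:12) nudges against.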