Building makemore Part 5: Building a WaveNet

Andrej Karpathy · Beginner · 📄 Research Papers Explained · 3y ago
We take the 2-layer MLP from the previous video and make it deeper with a tree-like structure, arriving at a convolutional neural network architecture similar to DeepMind's WaveNet (2016). In the WaveNet paper, the same hierarchical architecture is implemented more efficiently using causal dilated convolutions (not covered here). Along the way we get a better sense of torch.nn — what it is and how it works under the hood — and of what a typical deep learning development process looks like: a lot of reading documentation, keeping track of multidimensional tensor shapes, moving between jupyter…
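The tree-like idea from the video can be sketched roughly as follows — this is a minimal illustration, not the video's exact code (layer sizes and the `FlattenConsecutive` name are assumptions): instead of crushing all 8 context characters into one hidden layer at once, the network fuses them two at a time, halving the sequence length at each level.

```python
import torch
import torch.nn as nn

class FlattenConsecutive(nn.Module):
    """Concatenate the embeddings of every n consecutive positions."""
    def __init__(self, n):
        super().__init__()
        self.n = n
    def forward(self, x):               # x: (B, T, C)
        B, T, C = x.shape
        x = x.view(B, T // self.n, C * self.n)
        if x.shape[1] == 1:             # squeeze out a length-1 time dim
            x = x.squeeze(1)
        return x

# illustrative sizes: 27-character vocab, context of 8, small hidden layers
vocab_size, n_embd, n_hidden, block_size = 27, 10, 68, 8
model = nn.Sequential(
    nn.Embedding(vocab_size, n_embd),
    FlattenConsecutive(2), nn.Linear(n_embd * 2, n_hidden), nn.Tanh(),
    FlattenConsecutive(2), nn.Linear(n_hidden * 2, n_hidden), nn.Tanh(),
    FlattenConsecutive(2), nn.Linear(n_hidden * 2, n_hidden), nn.Tanh(),
    nn.Linear(n_hidden, vocab_size),
)

x = torch.randint(0, vocab_size, (4, block_size))  # batch of 4 contexts
logits = model(x)
print(logits.shape)  # torch.Size([4, 27])
```

Each `FlattenConsecutive(2)` halves the time dimension (8 → 4 → 2 → 1), so character pairs are merged progressively rather than all at once. The video also inserts BatchNorm1d between the linear and tanh layers (and fixes a bug in it); that is omitted here for brevity.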
Watch on YouTube ↗

Chapters (18)

0:00 intro
1:40 starter code walkthrough
6:56 let’s fix the learning rate plot
9:16 pytorchifying our code: layers, containers, torch.nn, fun bugs
17:11 overview: WaveNet
19:33 dataset bump the context size to 8
19:55 re-running baseline code on block_size 8
21:36 implementing WaveNet
37:41 training the WaveNet: first pass
38:50 fixing batchnorm1d bug
45:21 re-training WaveNet with bug fix
46:07 scaling up our WaveNet
46:58 experimental harness
47:44 WaveNet but with “dilated causal convolutions”
51:34 torch.nn
52:28 the development process of building deep neural nets
54:17 going forward
55:26 improve on my loss! how far can we improve a WaveNet?
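The "dilated causal convolutions" mentioned in the description and the 47:44 chapter can be sketched like this — an illustrative stack (channel count and layer choices are assumptions, not from the paper or video) showing how three dilated layers with kernel size 2 give one output that sees all 8 input positions:

```python
import torch
import torch.nn as nn

C = 16  # arbitrary channel count for the sketch
# Dilations 1, 2, 4 with kernel size 2: receptive field 1+1+2+4 = 8,
# the same context window as the tree-structured MLP, but computed
# with shared convolutional filters.
stack = nn.Sequential(
    nn.Conv1d(C, C, kernel_size=2, dilation=1), nn.Tanh(),
    nn.Conv1d(C, C, kernel_size=2, dilation=2), nn.Tanh(),
    nn.Conv1d(C, C, kernel_size=2, dilation=4), nn.Tanh(),
)

x = torch.randn(1, C, 8)   # one sequence of 8 positions
y = stack(x)
print(y.shape)             # torch.Size([1, 16, 1]): the single output
                           # position depends on all 8 inputs
```

This is why the WaveNet paper's version is more efficient: on a longer sequence the same stack produces one output per valid position, reusing the intermediate activations instead of recomputing the tree for every sliding window.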