Building makemore Part 2: MLP

Andrej Karpathy · Beginner · 📐 ML Fundamentals · 3y ago
We implement a multilayer perceptron (MLP) character-level language model. In this video we also introduce many basics of machine learning: model training, learning-rate tuning, hyperparameters, evaluation, train/dev/test splits, under/overfitting, etc.

Links:
- makemore on GitHub: https://github.com/karpathy/makemore
- Jupyter notebook built in this video: https://github.com/karpathy/nn-zero-to-hero/blob/master/lectures/makemore/makemore_part2_mlp.ipynb
- Colab notebook (new)!!!: https://colab.research.google.com/drive/1YIfmkftLrz6MPTOO9Vwqrop2Q5llHIGK?usp=sharing
- Bengio et al. 2003 …
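The overall model described above (embedding table → hidden layer → output logits, trained with cross-entropy) can be sketched as follows. This is a minimal illustration, not the lecture's exact code: the three-name word list stands in for the real names.txt dataset, and the hyperparameters (context length 3, 2-dim embeddings, 50 hidden units, learning rate 0.1) are illustrative values in the spirit of the video.

```python
import torch
import torch.nn.functional as F

# Stand-in for names.txt; the video trains on ~32k real names.
words = ["emma", "olivia", "ava"]
chars = sorted(set("".join(words)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0                               # '.' marks start/end of a word
vocab_size = len(stoi)

# Build the dataset: block_size previous characters predict the next one.
block_size = 3
X, Y = [], []
for w in words:
    context = [0] * block_size
    for ch in w + ".":
        X.append(context)
        Y.append(stoi[ch])
        context = context[1:] + [stoi[ch]]  # slide the context window
X, Y = torch.tensor(X), torch.tensor(Y)

g = torch.Generator().manual_seed(42)
n_embd, n_hidden = 2, 50                    # illustrative sizes
C  = torch.randn((vocab_size, n_embd), generator=g)          # embedding lookup table
W1 = torch.randn((block_size * n_embd, n_hidden), generator=g)
b1 = torch.randn(n_hidden, generator=g)
W2 = torch.randn((n_hidden, vocab_size), generator=g)
b2 = torch.randn(vocab_size, generator=g)
params = [C, W1, b1, W2, b2]
for p in params:
    p.requires_grad = True

# Plain full-batch gradient descent.
for step in range(200):
    emb = C[X]                                              # (N, block_size, n_embd)
    h = torch.tanh(emb.view(-1, block_size * n_embd) @ W1 + b1)
    logits = h @ W2 + b2                                    # (N, vocab_size)
    loss = F.cross_entropy(logits, Y)                       # numerically stable NLL
    for p in params:
        p.grad = None
    loss.backward()
    for p in params:
        p.data += -0.1 * p.grad

print(f"final loss: {loss.item():.3f}")
```

On a dataset this small the model simply overfits, which is fine for checking that the forward and backward passes are wired correctly.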
Watch on YouTube ↗

Chapters (10)

0:00 intro
1:48 Bengio et al. 2003 (MLP language model) paper walkthrough
9:03 (re-)building our training dataset
12:19 implementing the embedding lookup table
18:35 implementing the hidden layer + internals of torch.Tensor: storage, views
29:15 implementing the output layer
29:53 implementing the negative log likelihood loss
32:17 summary of the full network
32:49 introducing F.cross_entropy and why
37:56 implementing the training loop, overfitting one batch
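The 32:49 chapter replaces the manual softmax + negative-log-likelihood computation with `F.cross_entropy`, which fuses the same steps but is more efficient and numerically stable for large logits. A small check (made-up logits) that the two routes agree:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, -1.0, 0.5],
                       [0.1, 3.0, -2.0]])
targets = torch.tensor([0, 1])

# Manual route: exponentiate, normalize to probabilities,
# then take -log of the probability assigned to each target.
counts = logits.exp()
probs = counts / counts.sum(1, keepdim=True)
manual_nll = -probs[torch.arange(2), targets].log().mean()

# Fused route used in the video.
fused = F.cross_entropy(logits, targets)
print(torch.allclose(manual_nll, fused))  # True
```

The fused version also avoids overflow: `logits.exp()` blows up for large positive logits, while `F.cross_entropy` internally subtracts the max logit first, which leaves the result unchanged.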