Decoder Architecture in Transformers | Step-by-Step from Scratch

Learn With Jay · Advanced · 🧠 Large Language Models · 1y ago
Transformers have revolutionized deep learning, but have you ever wondered how the decoder in a transformer actually works? 🤔 In this video, we break down the decoder architecture in Transformers step by step! 💡

What You'll Learn:
✅ The fundamentals of encoder-decoder models in deep learning and how they differ in Transformers.
✅ The role of each layer in the decoder and how the layers work together.
✅ A deep dive into masked self-attention, cross-attention, and feed-forward networks in the decoder.
✅ How transformers generate meaningful sequences in tasks like language modeling, machine translation, a…
Watch on YouTube ↗
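For a concrete reference point before watching, here is a minimal PyTorch sketch of one decoder block, covering the three sublayers the video walks through: masked self-attention, cross-attention over the encoder output, and a feed-forward network, each wrapped in Add & Norm. This is an illustrative sketch, not the video's code; the dimensions (d_model=512, n_heads=8, d_ff=2048) follow the original Transformer paper.

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """One transformer decoder block: masked self-attention,
    cross-attention over encoder outputs, and a feed-forward network,
    each followed by a residual connection + layer norm (Add & Norm)."""

    def __init__(self, d_model=512, n_heads=8, d_ff=2048):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, tgt, memory):
        T = tgt.size(1)
        # Causal mask: True marks positions a query may NOT attend to,
        # so position i only sees positions <= i.
        causal = torch.triu(
            torch.ones(T, T, dtype=torch.bool, device=tgt.device), diagonal=1
        )
        x, _ = self.self_attn(tgt, tgt, tgt, attn_mask=causal)
        tgt = self.norm1(tgt + x)                    # Add & Norm
        x, _ = self.cross_attn(tgt, memory, memory)  # queries from decoder, keys/values from encoder
        tgt = self.norm2(tgt + x)                    # Add & Norm
        return self.norm3(tgt + self.ffn(tgt))       # Add & Norm

# Usage: shapes are (batch, seq_len, d_model)
block = DecoderBlock()
tgt = torch.randn(2, 10, 512)     # decoder input embeddings
memory = torch.randn(2, 12, 512)  # encoder output
out = block(tgt, memory)          # -> (2, 10, 512)
```

A stack of N such blocks plus a final linear layer and softmax over the vocabulary gives the full decoder, matching the "Stacking of Decoder blocks" and "Final Prediction Layer" chapters below.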

Chapters (14)

0:00 Intro
0:56 Encoder-Decoder model in Deep Learning
2:24 Encoder-Decoder in Transformers
5:25 Parallelizing Training in Transformers
12:57 Masked Multi-head attention
19:29 Encoder-Decoder in training of Transformers
22:01 Positional Encodings
23:08 Add & Norm Layer
24:47 Cross Attention
32:33 Feed Forward Network
33:53 Stacking of Decoder blocks
34:42 Final Prediction Layer
37:06 Decoder during inference (see the sketch after this list)
40:05 Outro
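The "Decoder during inference" chapter covers how generation changes at test time: instead of consuming the whole shifted target sequence in parallel as during training, the decoder runs once per token and is fed its own previous outputs. Here is a minimal greedy-decoding sketch, assuming a hypothetical model with encode/decode methods and bos_id/eos_id special tokens (these names are illustrative, not from the video):

```python
import torch

@torch.no_grad()
def greedy_decode(model, src, bos_id, eos_id, max_len=50):
    """Autoregressive inference: feed the sequence generated so far back
    into the decoder and append the most likely next token each step."""
    memory = model.encode(src)                   # encoder runs only once
    ys = torch.tensor([[bos_id]])                # start with <bos>
    for _ in range(max_len):
        logits = model.decode(ys, memory)        # (1, T, vocab_size)
        next_id = logits[:, -1].argmax(dim=-1)   # greedy pick at last position
        ys = torch.cat([ys, next_id.unsqueeze(0)], dim=1)
        if next_id.item() == eos_id:             # stop once <eos> is emitted
            break
    return ys
```

Greedy argmax is the simplest decoding strategy; the same loop structure applies to sampling or beam search.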