Transformer Architecture: Attention Is All You Need Paper Explained
The Transformer is a neural network architecture built around an attention mechanism, introduced in the well-known paper "Attention Is All You Need." Transformers were originally proposed for machine translation, but today they are applied across AI, from computer vision and multimodal learning to robotics and reinforcement learning. This comprehensive video dives deep into the Transformer architecture and the attention mechanism, along with related topics such as large language models (covering BERT and GPT-3) and efficient Transformers.
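As a taste of the attention mechanism the video covers, here is a minimal sketch of scaled dot-product attention in NumPy. This is illustrative only, not code from the video; the function name and shapes are assumptions for the example.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d_k) arrays. Illustrative sketch, not the video's code.
    d_k = Q.shape[-1]
    # Similarity of each query to each key, scaled by sqrt(d_k)
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax turns scores into attention weights
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors
    return weights @ V

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
# Self-attention: queries, keys, and values all come from the same sequence
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```

In a full Transformer, Q, K, and V are linear projections of the input, and several such attention "heads" run in parallel.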
Watch on YouTube ↗
Chapters (14)
Introduction — 1:32
Neural networks before transformers (RNNs, LSTMs, CNNs) — 6:19
Transformer architecture — 13:03
Attention — 25:00
Other elements of Transformer architecture — 31:37
Visualizing attention — 34:59
Large language models (LLMs) — 44:04
BERT — 49:41
GPT-3 — 57:39
Implementations of Transformers — 1:04:21
Current state of Transformers — 1:05:49
Efficient Transformers — 1:09:20
Amusing Transformer tweet by Karpathy — 1:12:20
Summary
DeepCamp AI