Linear Transformation in Self Attention | Transformers in Deep Learning | Part 3

Learn With Jay · Beginner · 🧠 Large Language Models · 1y ago
In this third video of our Transformer series, we’re diving deep into the concept of Linear Transformations in Self Attention. The linear transformation is fundamental to the Self Attention mechanism, shaping how inputs are mapped to key, query, and value vectors. In this lesson, we’ll explore the role of linear transformations, breaking down the math behind them to see why they’re essential for capturing dependencies in Self Attention. We’ll go through detailed mathematical proofs to show how linear transformations work and why they are crucial for capturing relevant similarities and generating an appropri…
Watch on YouTube ↗
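As a companion to the description above, here is a minimal NumPy sketch (our own illustration, not code from the video) of the linear transformations it describes: learnable matrices map each input embedding to query, key, and value vectors, which then drive scaled dot-product attention. The dimensions and variable names are illustrative assumptions.

```python
import numpy as np

# Toy sizes (assumed, not from the video): 5 tokens, 8-dim embeddings,
# projected down to 4-dim query/key/value vectors.
d_model, d_k = 8, 4
rng = np.random.default_rng(0)

X = rng.normal(size=(5, d_model))       # input token embeddings
W_q = rng.normal(size=(d_model, d_k))   # learnable in a real model,
W_k = rng.normal(size=(d_model, d_k))   # random here for illustration
W_v = rng.normal(size=(d_model, d_k))

# The linear transformations: one matrix multiply per projection.
Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Scaled dot-product attention over the projected vectors.
scores = Q @ K.T / np.sqrt(d_k)
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
output = weights @ V
print(output.shape)  # (5, 4): one d_k-dimensional vector per token
```

Without the weight matrices, attention would compare raw embeddings with themselves; the learnable projections let the model choose which features to compare, which is the point the video's math develops.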

Chapters (11)

0:00 Intro
1:31 Recap of Self Attention
9:33 Without Learnable Parameters
14:01 Linear Transformation
15:44 Changing Dimensions
16:34 Feature Extraction with Linear Transformation
18:00 Math of Linear Transformation in Self Attention
22:33 Math of capturing dependencies
25:12 Training the parameters
26:50 Number of parameters
28:37 Outro
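The final chapters ("Training the parameters", "Number of parameters") concern the learnable weights themselves. As a hedged reference point (toy sizes, not figures quoted in the video): ignoring biases, each of W_q, W_k, and W_v is a d_model × d_k matrix, so a single attention head contributes 3 · d_model · d_k parameters.

```python
# Hedged sketch of the standard per-head parameter count
# (illustrative sizes, not numbers from the video).
d_model, d_k = 512, 64
params_per_matrix = d_model * d_k     # entries in each of W_q, W_k, W_v
total = 3 * params_per_matrix
print(total)  # 98304 learnable parameters for one attention head
```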
Next Up
5 Levels of AI Agents - From Simple LLM Calls to Multi-Agent Systems
Dave Ebbelaar (LLM Eng)