Training models with only 4 bits | Fully-Quantized Training
Can you really train a large language model in just 4 bits? In this video, we explore the cutting edge of model compression: fully quantized training in FP4 (4-bit floating point). Quantization has traditionally targeted inference, but new research applies it to training itself, with the goal of reducing memory, compute, and cost.
🧠 We cover:
✅ NVIDIA Tensor Cores for mixed precision training
✅ Microscaling (MX) data formats
✅ Modeling tricks for 4-bit gradients (e.g. Stochastic Rounding)
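For readers who want a concrete picture of stochastic rounding before watching, here is a minimal Python sketch. It assumes the standard FP4 (E2M1) value grid and rounds a number up or down at random so that the result is correct on average. The function name `stochastic_round_fp4` is made up for illustration, and this is not necessarily how the paper or any GPU kernel implements it; real implementations would typically apply it after per-block scaling and on whole tensors.

```python
import bisect
import random

# Non-negative magnitudes representable in FP4 (E2M1):
# 1 sign bit, 2 exponent bits, 1 mantissa bit.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def stochastic_round_fp4(x, rng=random):
    """Round x to one of its two neighboring FP4 values at random.

    The probability of rounding up is proportional to how close x is to
    the upper neighbor, so the expected result equals x (after clamping
    to the representable range [-6, 6]). Hypothetical helper, for
    illustration only; NaN inputs are not handled.
    """
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), FP4_GRID[-1])  # saturate out-of-range values

    # Index of the first grid value >= mag (the upper neighbor).
    hi_idx = bisect.bisect_left(FP4_GRID, mag)
    hi = FP4_GRID[hi_idx]
    if hi == mag:
        return sign * mag  # already exactly representable

    lo = FP4_GRID[hi_idx - 1]
    p_up = (mag - lo) / (hi - lo)  # closer to hi means more likely to round up
    return sign * (hi if rng.random() < p_up else lo)


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [stochastic_round_fp4(2.5, rng) for _ in range(10_000)]
    # Each sample is 2.0 or 3.0; the average approaches 2.5.
    print(sum(samples) / len(samples))
```

Round-to-nearest would push every small gradient in the same direction on each step, while the random choice above keeps the rounding error zero on average, which is why it is attractive for 4-bit gradients.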
📎 Resources:
🔵 Main paper: https://arxiv.org/abs/2505.19115
🔵 US congressional report on DeepSeek: https://selectcommitteeontheccp.house.gov/sites/evo-subsites/selectcommitteeontheccp.house.gov/files/evo-media-document/DeepSeek%20Final.pdf
🔵 Slide deck and full reading list: https://www.patreon.com/c/JuliaTurc
Watch the entire quantization series here: https://youtube.com/playlist?list=PL4bm2lr9UVG0HvePBXvsceO4yuLC8HhUh&si=xLu7vxMfNdJxkB0S
00:00 Intro
01:00 Motivation (training is expensive)
03:06 Mixed precision
05:40 Hardware support: FP4 in NVIDIA Blackwell
13:51 Microscaling formats (MXFP4 & NVFP4)
17:45 Why not INT4?
19:51 Modeling tricks: Stochastic Rounding
22:26 Outro