How AI Taught Itself to See [DINOv3]
How can we train a general-purpose vision model to perceive our visual world?
This video dives into the fascinating idea of self-supervised learning. We will discuss the basic concepts of transfer learning, contrastive language-image pretraining (CLIP), and self-supervised learning methods, including masked autoencoders (MAE), contrastive methods such as SimCLR, and self-distillation methods such as DINO v1, v2, and v3. I hope you enjoy the video!
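As a taste of the contrastive approach covered in the video, here is a minimal NumPy sketch of the SimCLR-style NT-Xent loss: two augmented views of each image should embed close together, while all other images in the batch serve as negatives. Names and default values are illustrative, not taken from the video or the SimCLR codebase.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent (SimCLR) loss. z1[i] and z2[i] are embeddings of two
    augmented views of the same image; all other rows act as negatives."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)               # (2n, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalize
    sim = z @ z.T / temperature                        # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                     # exclude each sample vs itself
    # the positive partner of row i is row i+n (and vice versa)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

Feeding the same embeddings as both views drives the loss down, while unrelated embeddings keep it near the uniform baseline, which is the behavior the contrastive objective is designed to produce.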
00:00 Introduction
00:33 Why do features matter?
01:11 Learning features using classification
02:14 Learning features using language (CLIP)
04:09 Learning features using pretask (Self-supervised learning)
05:20 Learning features using contrast (SimCLR)
06:36 Learning features using self-distillation (DINOv1)
12:18 DINOv2
13:54 DINOv3
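The self-distillation recipe at the heart of the DINO chapters can be sketched in a few lines: the student is trained to match a sharpened, centered teacher distribution, and the teacher's weights track the student via an exponential moving average. This is a minimal NumPy illustration of the idea, with hypothetical names and temperatures; it is not the actual DINO implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dino_loss(student_logits, teacher_logits, center,
              t_student=0.1, t_teacher=0.04):
    """Cross-entropy between teacher and student output distributions.
    The teacher output is centered (to discourage collapse onto one
    dimension) and sharpened with a lower temperature; in practice no
    gradients flow through the teacher branch."""
    t = softmax((teacher_logits - center) / t_teacher)
    s = softmax(student_logits / t_student)
    return -(t * np.log(s + 1e-12)).sum(axis=-1).mean()

def ema_update(teacher_w, student_w, momentum=0.996):
    """Teacher weights as an exponential moving average of the student."""
    return momentum * teacher_w + (1 - momentum) * student_w
```

The asymmetry between the two branches (centering plus a sharper teacher temperature, and the EMA update instead of backpropagation) is what lets this objective avoid the trivial constant solution without any negative pairs.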
References:
Language-image pretraining
[CLIP] https://openai.com/index/clip/
Self-supervised learning (pretask)
[Context encoder] https://arxiv.org/abs/1604.07379
[Colorization] https://arxiv.org/abs/1611.09842
[Rotation prediction] https://arxiv.org/abs/1803.07728
[Jigsaw puzzle] https://arxiv.org/abs/1603.09246
[Temporal order shuffling] https://arxiv.org/abs/1708.01246
Contrastive learning
[SimCLR] https://arxiv.org/abs/2002.05709
Inpainting
[MAE] https://arxiv.org/abs/2111.06377
[iBOT] https://arxiv.org/abs/2111.07832
Self-distillation
[DINOv1] https://arxiv.org/abs/2104.14294
[DINOv2] https://arxiv.org/abs/2304.07193
[DINOv3] https://arxiv.org/abs/2508.10104
Self-supervised learning
[Cookbook] https://arxiv.org/abs/2304.12210
Video made with Manim: https://www.manim.community/