How AI Taught Itself to See [DINOv3]

Jia-Bin Huang · Beginner · 📄 Research Papers Explained · 8mo ago


How can we train a general-purpose vision model to perceive our visual world? This video dives into the fascinating idea of self-supervised learning. We will discuss the basic concepts of transfer learning, contrastive language-image pretraining (CLIP), and self-supervised learning methods, including masked autoencoders, contrastive methods like SimCLR, and self-distillation methods like DINOv1, v2, and v3. I hope you enjoy the video!

Chapters:
00:00 Introduction
00:33 Why do features matter?
01:11 Learning features using classification
02:14 Learning features using language (CLIP)
04:09 Learning features using pretask (Self-supervised learning)
05:20 Learning features using contrast (SimCLR)
06:36 Learning features using self-distillation (DINOv1)
12:18 DINOv2
13:54 DINOv3

References:

Language-image pretraining
[CLIP] https://openai.com/index/clip/

Self-supervised learning (pretask)
[Context encoder] https://arxiv.org/abs/1604.07379
[Colorization] https://arxiv.org/abs/1611.09842
[Rotation prediction] https://arxiv.org/abs/1803.07728
[Jigsaw puzzle] https://arxiv.org/abs/1603.09246
[Temporal order shuffling] https://arxiv.org/abs/1708.01246

Contrastive learning
[SimCLR] https://arxiv.org/abs/2002.05709

Inpainting
[MAE] https://arxiv.org/abs/2111.06377
[iBOT] https://arxiv.org/abs/2111.07832

Self-distillation
[DINOv1] https://arxiv.org/abs/2104.14294
[DINOv2] https://arxiv.org/abs/2304.07193
[DINOv3] https://arxiv.org/abs/2508.10104

Self-supervised learning
[Cookbook] https://arxiv.org/abs/2304.12210

Video made with Manim: https://www.manim.community/
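For readers who want a concrete picture of the self-distillation idea named in the description (DINOv1), here is a minimal sketch in plain Python. It is not code from the video or from the official DINO repository. It assumes the general recipe described in the DINOv1 paper: a student is trained to match a centered, sharpened teacher distribution via cross-entropy, and the teacher's weights are an exponential moving average (EMA) of the student's. The function names, the flat lists standing in for network outputs and parameters, and the default temperatures and momenta are illustrative assumptions.

```python
import math


def log_softmax(logits, temperature):
    """Numerically stable log-softmax of `logits` scaled by 1/temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_norm for s in scaled]


def distillation_loss(student_logits, teacher_logits, center,
                      student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy from the teacher distribution to the student distribution.

    The teacher output is centered (subtract a running mean) and sharpened
    (low temperature). Temperatures here are illustrative assumptions.
    """
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = [math.exp(lp) for lp in log_softmax(centered, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lq for p, lq in zip(teacher_probs, student_log_probs))


def ema_update(teacher_params, student_params, momentum=0.996):
    """Move each teacher parameter slightly toward the student parameter."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]


def update_center(center, teacher_batch, momentum=0.9):
    """Update the running center with the mean teacher output of a batch."""
    batch_mean = [sum(column) / len(teacher_batch)
                  for column in zip(*teacher_batch)]
    return [momentum * c + (1.0 - momentum) * b
            for c, b in zip(center, batch_mean)]


# Hypothetical usage: two "views" of one image produce these toy outputs.
teacher_out = [2.0, 0.5, -1.0]
student_out = [1.5, 0.7, -0.8]
center = [0.0, 0.0, 0.0]

loss = distillation_loss(student_out, teacher_out, center)
center = update_center(center, [teacher_out])
teacher_weights = ema_update([1.0, 1.0], [0.0, 2.0])
```

In a real training loop only the student receives gradients from the loss. The teacher changes solely through the EMA step, and the centering plus sharpening pair is what the paper credits with avoiding collapse to a constant output.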




