DINO -- Self-supervised ViT
In this video, we cover an exciting paper, “Emerging Properties in Self-Supervised Vision Transformers”. The proposed method, DINO (self-distillation with no labels), is a simplified approach to self-supervised learning in the vision domain.
As with self-supervised transformers in NLP, a ViT pre-trained with DINO also shows some emerging properties beyond what it was trained for.