How AI Vision Evolved | Merve Noyan
In this clip, Merve gives a dense breakdown of how AI vision evolved, why it matters in practice, and why progress now feels incremental.
🤗 Listen to the full podcast episode
👉 Here: https://youtu.be/SjjCpeTjXIY
Connect with Merve:
- Merve on X — https://x.com/mervenoyann
- Vision Language Models (O'Reilly) — https://www.oreilly.com/library/view/vision-language-models/9798341624030/
Chapters:
- 00:00 How AI Vision Evolved
- 00:12 Vision Transformers
- 01:06 LLaVA
- 01:38 IDEFICS
- 02:06 CLIP + Projection Layer
- 02:54 Interleaving
- 05:42 Segment Anything
Topics covered:
- Vision Transformers
- LLaVA
- IDEFICS
- CLIP + Projection Layer
- Interleaving
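The "CLIP + Projection Layer" topic above refers to the common vision-language-model recipe (used by LLaVA and similar models): a frozen CLIP vision encoder produces per-patch image features, a small trainable projection maps them into the LLM's token-embedding space, and the projected image tokens are concatenated with text tokens. A minimal sketch with plain NumPy and hypothetical dimensions (768-d CLIP features, 4096-d LLM embeddings, 256 patches):

```python
import numpy as np

# Hypothetical dimensions: CLIP-style image features (768-d) projected
# into an LLM embedding space (4096-d), as in the LLaVA-style recipe.
clip_dim, llm_dim = 768, 4096
num_patches, num_text_tokens = 256, 32

rng = np.random.default_rng(0)

# Frozen vision encoder output: one embedding per image patch.
image_features = rng.standard_normal((num_patches, clip_dim))

# Trainable projection layer (a single linear map in the simplest case).
W = rng.standard_normal((clip_dim, llm_dim)) * 0.02

# Project image patches into the LLM's token-embedding space.
image_tokens = image_features @ W          # shape: (256, 4096)

# Text token embeddings would come from the LLM's own embedding table.
text_tokens = rng.standard_normal((num_text_tokens, llm_dim))

# The multimodal input is simply the projected image tokens
# concatenated with the text tokens before they enter the LLM.
multimodal_input = np.concatenate([image_tokens, text_tokens], axis=0)
print(multimodal_input.shape)  # (288, 4096)
```

Real systems train the projection (and sometimes the LLM) on image-text instruction data, but the core mechanism is just this change of embedding space followed by concatenation.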
Sources mentioned:
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale — https://arxiv.org/abs/2010.11929
- Visual Instruction Tuning project page — https://llava-vl.github.io/
- IDEFICS: an open reproduction of Flamingo — https://huggingface.co/blog/idefics
- CLIP: Connecting text and images — https://arxiv.org/abs/2103.00020
- IDEFICS2 model documentation — https://huggingface.co/docs/transformers/model_doc/idefics2
- Segment Anything — https://arxiv.org/abs/2304.02643