Yasser Benigmin - Domain Adaptation in the Era of Foundation Models
In this presentation, we address domain adaptation in semantic segmentation, where deep learning models rely heavily on large labeled datasets and struggle with domain shift, which limits real-world generalization. We show how Foundation Models (FMs) can be adapted to overcome these challenges under resource constraints through three key contributions. First, we present DATUM, a one-shot unsupervised domain adaptation approach that personalizes text-to-image diffusion models to generate diverse, style-consistent training data from a single target image. Next, we introduce CLOUDS, a collaborative framework in which multiple foundation models, such as CLIP, large language models, diffusion models, and the Segment Anything Model, work together to generate synthetic data and automate the creation of high-quality pseudo-labels for self-training, enabling improved domain generalization. Finally, we discuss FLOSS, a training-free strategy for open-vocabulary segmentation that enhances CLIP’s performance by automatically discovering class-specific “expert” text templates.
Yasser Benigmin is a recent PhD graduate in Computer Vision within the Multimedia team at Telecom Paris and the VISTA team at LIX (Laboratoire d'Informatique de l'X) at École Polytechnique, supervised by Stéphane Lathuilière, Vicky Kalogeiton, and Slim Essid. His research focuses on domain adaptation for semantic segmentation leveraging foundation models, with a particular emphasis on resource-constrained scenarios. Previously, he interned at INRIA Paris in the Astra-Vision team, working on open-vocabulary semantic segmentation under Raoul de Charette. Yasser holds an engineering degree from École des Mines de Saint-Étienne and completed an exchange year at EURECOM.
This session is brought to you by the Cohere Labs Open Science Community - a space where ML researchers, engineers, linguists, social scientists, and lifelong learners connect and collaborate with each other. We'd like to extend a special thank you