OpenAI CLIP model explained

Machine Learning Studio · Beginner · Computer Vision · 1y ago
CLIP: Contrastive Language-Image Pre-training

In this video, I describe the CLIP model published by OpenAI. CLIP is pre-trained using natural language supervision. Natural language supervision is not new; there are two main approaches to it. One tries to predict the exact caption for each image, while the other is based on a contrastive loss: instead of predicting the exact caption, it increases the similarity of correct image-text pairs relative to incorrect ones.
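The contrastive objective described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the video's or OpenAI's exact implementation: the function name and the fixed temperature of 0.07 are assumptions (in CLIP the temperature is a learned parameter), and the embeddings are assumed to come from the image and text encoders.

```python
import numpy as np

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    # Illustrative sketch; CLIP learns the temperature rather than fixing it.
    # L2-normalize embeddings so dot products are cosine similarities.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix: logits[i, j] = sim(image_i, text_j) / T
    logits = image_emb @ text_emb.T / temperature

    # Correct pairs lie on the diagonal: image i matches text i.
    n = logits.shape[0]
    idx = np.arange(n)

    def cross_entropy(l):
        # Softmax cross-entropy with the diagonal as the target class.
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[idx, idx].mean()

    # Symmetric loss: classify the right text for each image, and vice versa.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

A matched batch (each image paired with its own caption) should score a much lower loss than a mismatched one, which is exactly the signal the contrastive pre-training exploits.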
Watch on YouTube ↗