Foundations

Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

1,332 lessons
Skills in this topic
CV Basics
beginner
Classify images with a pre-trained CNN
Modern CV Models
intermediate
Run YOLO for real-time object detection
Generative CV
advanced
Build a Stable Diffusion inference pipeline
EV Pickups Are a Bust for US Carmakers
Computer Vision
Bloomberg Technology Intermediate 8mo ago
Almost All Email Campaigns Are Doing This Wrong
Computer Vision
Neil Patel Beginner 9mo ago
David Fan & Peter Tong - Scaling Language Free Visual Representation Learning
Computer Vision ⚡ AI Lesson
Cohere Advanced 9mo ago
Introducing CodeSpy.ai – Detect AI-Generated Code with Confidence
Computer Vision ⚡ AI Lesson
Muhammad Moin Beginner 9mo ago
Vision AI in 2025 — Peter Robicheaux, Roboflow
Computer Vision
AI Engineer Intermediate 9mo ago
The Segmentation Tweak That Quietly BOOSTS Klaviyo Revenue #shorts #emailmarketing
Computer Vision ⚡ AI Lesson
Emissary 2.0 Intermediate 9mo ago
I trained an AI Model to Detect Trading Candlesticks (from scratch using ViTs)
Computer Vision ⚡ AI Lesson
Nicholas Renotte Intermediate 9mo ago
YOLOv5 Tutorial | Architecture, Assigning Targets & Loss Function Explained
Computer Vision
ExplainingAI Beginner 9mo ago
Control PTZ Cameras with AI | ONVIF Integration with Object Tracking
Computer Vision
Roboflow Beginner 9mo ago
DAViD: Data-efficient and Accurate Vision Models from Synthetic Data
Computer Vision ⚡ AI Lesson
Microsoft Research Intermediate 9mo ago
Getting Started with Google Gemini 2.5 Pro: Detect Objects, Generate Captions & OCR
Computer Vision
Muhammad Moin Beginner 9mo ago
Is Your Business Running on Empty? 🤖
Computer Vision
imFORZA Intermediate 10mo ago
Auto Labeling Image Data | How to Annotate a Dataset and Train a Vision AI Model
Computer Vision
Roboflow Beginner 10mo ago
Distilling Transformers and Diffusion Models for Robust Edge Use Cases [Fatih Porikli] - 738
Computer Vision ⚡ AI Lesson
The TWIML AI Podcast with Sam Charrington Advanced 10mo ago
VGG From Scratch – Deep Learning Theory & PyTorch Implementation (Full Course)
Computer Vision ⚡ AI Lesson
freeCodeCamp.org Advanced 10mo ago
Timothée Darcet - Scaling Self Supervised Learning for Vision An Introduction to DINOv2
Computer Vision ⚡ AI Lesson
Cohere Beginner 10mo ago
Transforming Guest Experiences: GoTo Foods’ Data Journey with Amperity & Databricks
Computer Vision ⚡ AI Lesson
Databricks Advanced 10mo ago
Transforming Data Governance for Multimodal Data at Amgen With Databricks
Computer Vision ⚡ AI Lesson
Databricks Intermediate 10mo ago
3 Insane Algorithms Netflix Uses to Scan BILLIONS of Frames
Computer Vision
Coding with Lewis Beginner 10mo ago
What is Computer Vision
Computer Vision
AI Simplified Beginner 10mo ago
Multimodal Document Intelligence with NVIDIA Llama Nemotron Nano VL
Computer Vision ⚡ AI Lesson
NVIDIA Developer Beginner 10mo ago
Train YOLO on Custom Dataset | Object Detection Step-by-Step Tutorial
Computer Vision
Samin Learns AI Advanced 10mo ago
Why More Researchers Should become Content Creators
Computer Vision
Jia-Bin Huang Beginner 10mo ago
Multimodal Open Source at Kyutai, From Online Demos to On-Device - Alexandre Défossez
Computer Vision
PyTorch Intermediate 11mo ago
MedGemma LLM: Doctors, Meet Your AI Assistant 🧠
Computer Vision ⚡ AI Lesson
AI Anytime Intermediate 11mo ago
Convolutional Neural Networks (CNN) - Face Recognition Case Study - Algorithm & Full Code Explained
Computer Vision
Thinking Neuron Beginner 11mo ago
FastVLM brings advanced computer vision to your phone...
Computer Vision ⚡ AI Lesson
NeuralNine Advanced 11mo ago
Building a Vision Transformer Model from Scratch with PyTorch
Computer Vision ⚡ AI Lesson
freeCodeCamp.org Beginner 11mo ago
China’s ByteDance Just Dropped BAGEL — Multimodal AI Beast!
Computer Vision
Analytics Vidhya Intermediate 11mo ago
Uber CEO Dara Khosrowshahi on the company's new Route Share feature. Presented by @AdobeExpress
Computer Vision
The Verge Intermediate 11mo ago
The Shape of Intelligence
Computer Vision ⚡ AI Lesson
Latent Space Intermediate 11mo ago
AI Personal Tutor for Everyone
Computer Vision
Y Combinator Beginner 1y ago
Computer Vision in 100 Seconds
Computer Vision
Infinite Codes Beginner 1y ago
How to Segment Your Audience in Mailchimp
Computer Vision ⚡ AI Lesson
Intuit Mailchimp Intermediate 1y ago
Build an AI/ML NBA Basketball Analysis system with YOLO, OpenCV, and Python
Computer Vision
Code In a Jiffy Beginner 1y ago
Multimodal AI with Logan Kilpatrick
Computer Vision
Google Cloud Beginner 1y ago
DETR Explained | End-to-End Object Detection with Transformers | DETR Tutorial Part 1
Computer Vision
ExplainingAI Beginner 1y ago
Find out how Nevada DETR achieved 4x faster approvals with Vertex AI
Computer Vision
Google Cloud Advanced 1y ago
Visual RAG Unleashed: Harnessing ColQwen2.5 & Qwen2.5-VL-3B-Instruct for Next-Level AI
Computer Vision
Bytes of AI Beginner 1y ago
PaliGemma – Making Gemma 2 see by adding a vision encoder
Computer Vision
Google for Developers Advanced 1y ago
How to Fine-Tune SmolVLM2 | Convert Documents into JSON
Computer Vision
Roboflow Intermediate 10mo ago
Drowsiness Detection with Vision AI | Improve Safety with AI
Computer Vision
Roboflow Intermediate 11mo ago
Seulki Park - Visually Consistent Hierarchical Image Classification
Computer Vision
Cohere Beginner 11mo ago
How to Detect People in Danger Zones with AI
Computer Vision
Roboflow Beginner 1y ago
Intuit uses Google Cloud Document AI to further simplify tax prep for millions
Computer Vision
Google Cloud Intermediate 1y ago
RF-DETR Architecture & How it Works | Why is DETR Better Than YOLO?
Computer Vision
Roboflow Beginner 1y ago
Multimodal AI & Next Gen Databases | Data Brew | Episode 42
Computer Vision ⚡ AI Lesson
Databricks Intermediate 1y ago
RF-DETR, Batch Processing, Instant Training, Serverless Inference, and More | What's New in Roboflow
Computer Vision
Roboflow Intermediate 1y ago
📚 Coursera Courses Opens on Coursera · Free to audit
Using Specialized Processors with Document AI (Python)
📚 Coursera Course ↗
Self-paced
Behavioral Marketing
📚 Coursera Course ↗
Self-paced
Process SAR & Multispectral
📚 Coursera Course ↗
Self-paced
Humanidades digitales
📚 Coursera Course ↗
Self-paced
Preparing Multimodal Data: Vision, Audio, and NLP Pipelines
📚 Coursera Course ↗
Self-paced
Form Parsing Using Document AI
📚 Coursera Course ↗
Self-paced