Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

1,539 lessons
Skills in this topic
CV Basics · beginner · Classify images with a pre-trained CNN
Modern CV Models · intermediate · Run YOLO for real-time object detection
Generative CV · advanced · Build a Stable Diffusion inference pipeline
33. What are Multimodal Agents? Definition, Examples & Applications In Hindi
Computer Vision
AI SayI Intermediate 6mo ago
The Next Frontier of AI: Real-Time Multimodal Decision Making
Computer Vision
The Information Intermediate 6mo ago
SAM 3: The Eyes for AI — Nikhila & Pengchuan (Meta Superintelligence), ft. Joseph Nelson (Roboflow)
Computer Vision
Latent Space Intermediate 6mo ago
AI Paradox: Use Text for Logic, Avatars for Meaning
Computer Vision
Discover AI Intermediate 6mo ago
Roboflow Rapid Livestream | Use text prompts to train vision models
Computer Vision ⚡ AI Lesson
Roboflow Intermediate 6mo ago
PixelTable: Revolutionizing Multimodal AI Development Simplified #shorts #youtube
Computer Vision ⚡ AI Lesson
AI Anytime Intermediate 6mo ago
Grounding DINO: Open Vocabulary Object Detection on Videos
Computer Vision
PyImageSearch Intermediate 6mo ago
Insane Results with YOLOv8 & YOLO11 — Detection, Segmentation, Pose & More!
Computer Vision
Muhammad Moin Intermediate 6mo ago
Gemini 3 Demo: Building a Music Rhythm Game with Computer Vision
Computer Vision
Google for Developers Intermediate 7mo ago
SAM 3: The AI That Lets You “Segment Anything” — Images, Videos & Concepts
Computer Vision
Analytics Vidhya Intermediate 7mo ago
Stanford Robotics Seminar ENGR319 | Autumn 2025 | General Compliant Robot Interaction
Computer Vision ⚡ AI Lesson
Stanford Online Intermediate 7mo ago
AI Video Editing Hack
Computer Vision ⚡ AI Lesson
Matt Wolfe Intermediate 7mo ago
InferenceJS: Real-time computer vision in your browser
Computer Vision
Chrome for Developers Intermediate 7mo ago
Segment Anything 3 (SAM 3): Text to Segmentation | Live Coding + Q&A (Nov 20th)
Computer Vision ⚡ AI Lesson
Roboflow Intermediate 7mo ago
I Gave This Fish $10,000 to Trade Stocks
Computer Vision
Coding with Lewis Intermediate 7mo ago
Use this Template for Speak About the Photo + 10 Practice Questions | Duolingo English Test
Computer Vision
Teacher Luke - Duolingo English Test Intermediate 7mo ago
The biggest mistake companies make deploying AI #podcast #interview #dataanalysis #ai #datascience
Computer Vision ⚡ AI Lesson
Abhishek Thakur Intermediate 7mo ago
Basic Network Segmentation
Computer Vision ⚡ AI Lesson
John Hammond Intermediate 7mo ago
Demystifying AI & Data Science (w/ Luca Massaron) 📱
Computer Vision ⚡ AI Lesson
Abhishek Thakur Intermediate 7mo ago
Build a RAG Application from Scratch — No LangChain, No LlamaIndex
Computer Vision
Muhammad Moin Intermediate 7mo ago
Real Time AI Video Object Tracking! 💥EdgeTAM - Sam 2 for On-Device 🔥
Computer Vision
1littlecoder Intermediate 7mo ago
How to Create a Profitable Paid Search Strategy for 2026
Computer Vision
Exposure Ninja Intermediate 7mo ago
Build DIY Home Security With Computer Vision and a Raspberry Pi
Computer Vision
The Dividor Daily Intermediate 7mo ago
Multimodal Data Analysis with AI
Computer Vision ⚡ AI Lesson
Latent Space Intermediate 8mo ago
Stop Losing Luggage: AI Computer Vision for Global Bag Tracking
Computer Vision
The Dividor Daily Intermediate 8mo ago
Generate Image Captions That Focus on What You Need
Computer Vision ⚡ AI Lesson
NVIDIA Developer Intermediate 8mo ago
Meta Engineer on Industrial Computer Vision systems
Computer Vision
MLOps.community Intermediate 8mo ago
Duolingo English Test - NEW Complete Practice Test with Answers
Computer Vision
Teacher Luke - Duolingo English Test Intermediate 8mo ago
The SECRET to Hyper Segmentation (and Sales)
Computer Vision ⚡ AI Lesson
Optimum7 Intermediate 8mo ago
"Smartest" VISION AI in Cars Do Reasoning?
Computer Vision
Discover AI Intermediate 9mo ago
How to focus on building your skills when everything's so distracting with Ania Kubów [Podcast #187]
Computer Vision ⚡ AI Lesson
freeCodeCamp.org Intermediate 9mo ago
Discover the Future of AI: Multimodal AI Revolution!
Computer Vision
AIHub101 Intermediate 10mo ago
New Way Now: Simbe's AI robotic vision tech improves retail sales and margin with Google Cloud
Computer Vision
Google Cloud Intermediate 10mo ago
EV Pickups Are a Bust for US Carmakers
Computer Vision
Bloomberg Technology Intermediate 10mo ago
Vision AI in 2025 — Peter Robicheaux, Roboflow
Computer Vision
AI Engineer Intermediate 11mo ago
The Segmentation Tweak That Quietly BOOSTS Klaviyo Revenue #shorts #emailmarketing
Computer Vision ⚡ AI Lesson
Emissary 2.0 Intermediate 11mo ago
Top-Ranked RAG: NeMo Retriever Leads Visual Document Retrieval Leaderboards
Computer Vision ⚡ AI Lesson
NVIDIA Developer Intermediate 11mo ago
DAViD: Data-efficient and Accurate Vision Models from Synthetic Data
Computer Vision ⚡ AI Lesson
Microsoft Research Intermediate 11mo ago
Is Your Business Running on Empty? 🤖
Computer Vision
imFORZA Intermediate 11mo ago
Productionizing Prompts: How Pinterest Turned Every Team into GenAI Power Users
Computer Vision
Predibase by Rubrik Intermediate 11mo ago
Transforming Data Governance for Multimodal Data at Amgen With Databricks
Computer Vision ⚡ AI Lesson
Databricks Intermediate 11mo ago
Demystifying AI & Data Science (w/ Luca Massaron)
Computer Vision ⚡ AI Lesson
Abhishek Thakur Intermediate 7mo ago
How to Stay Relevant in AI & Data Science (w/ Alexey Grigorev)
Computer Vision
Abhishek Thakur Intermediate 8mo ago
RF-DETR: How to Train SOTA for Object Detection on a Custom Dataset | Step-by-step guide
Computer Vision
Roboflow Intermediate 10mo ago
Improve Tiny Object Detection with YOLO11 + SAHI 🔍
Computer Vision
Muhammad Moin Intermediate 11mo ago
How to Fine-Tune SmolVLM2 | Convert Documents into JSON
Computer Vision
Roboflow Intermediate 11mo ago
Build a Local RAG App with DeepSeek R1 & Ollama in Streamlit – Step-by-Step Tutorial
Computer Vision
Muhammad Moin Intermediate 11mo ago
Gemini CLI + MCP Server: A Step-by-Step Tutorial
Computer Vision
Muhammad Moin Intermediate 12mo ago
📚 Continue on Coursera External links · Free to audit
Form Parsing Using Document AI
Self-paced
Marketing in the Age of AI
Self-paced
Process Images & Extract Motion Features
Self-paced
UiPath Automation Developer Professional
Self-paced
Supply Market Analysis
Self-paced
Image Segmentation, Filtering, and Region Analysis
Self-paced