Foundations

Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

1,539 lessons
Skills in this topic
CV Basics
beginner
Classify images with a pre-trained CNN
Modern CV Models
intermediate
Run YOLO for real-time object detection
Generative CV
advanced
Build a Stable Diffusion inference pipeline
Building MCP Servers with LangChain in Python
Computer Vision
Muhammad Moin Intermediate 1y ago
Drowsiness Detection with Vision AI | Improve Safety with AI
Computer Vision
Roboflow Intermediate 1y ago
Multimodal Open Source at Kyutai, From Online Demos to On-Device - Alexandre Défossez
Computer Vision
PyTorch Intermediate 1y ago
MedGemma LLM: Doctors, Meet Your AI Assistant 🧠
Computer Vision ⚡ AI Lesson
AI Anytime Intermediate 1y ago
[CVPR 2025] Pos3R: 6D Pose Estimation for Unseen Objects Made Easy
Computer Vision
anucvml Intermediate 1y ago
China’s ByteDance Just Dropped BAGEL — Multimodal AI Beast!
Computer Vision
Analytics Vidhya Intermediate 1y ago
How to Segment Your Audience in Mailchimp
Computer Vision ⚡ AI Lesson
Intuit Mailchimp Intermediate 1y ago
Intuit uses Google Cloud Document AI to further simplify tax prep for millions
Computer Vision
Google Cloud Intermediate 1y ago
Multimodal AI & Next Gen Databases | Data Brew | Episode 42
Computer Vision ⚡ AI Lesson
Databricks Intermediate 1y ago
RF-DETR, Batch Processing, Instant Training, Serverless Inference, and More | What's New in Roboflow
Computer Vision
Roboflow Intermediate 1y ago
Expedition Aya Kick Off Event
Computer Vision
Cohere Intermediate 1y ago
Build a Football Analysis System Using YOLO11 and Supervision
Computer Vision
Muhammad Moin Intermediate 1y ago
Seminar: Segment Anything - Meta AI (15-03-2025)
Computer Vision
IEC Seminar Intermediate 1y ago
Building a travel buddy with Gemma
Computer Vision
Google for Developers Intermediate 1y ago
New Way Now: Safe Rate helps homebuyers and owners save thousands with AI-powered mortgage assistant
Computer Vision
Google Cloud Intermediate 1y ago
Peter Tong - MetaMorph: Multimodal Understanding and Generation via Instruction Tuning
Computer Vision
Cohere Intermediate 1y ago
How Machines Find Patterns [Template Matching]
Computer Vision
Jia-Bin Huang Intermediate 1y ago
Next Multi trillion dollar industry?
Computer Vision
Full Disclosure Intermediate 1y ago
DeepSeek’s Janus-Pro-7B Crushes DALL·E 3! #deepseek #openai
Computer Vision
Analytics Vidhya Intermediate 1y ago
This Python module is your go-to for speech and image recognition!
Computer Vision ⚡ AI Lesson
Tech With Tim Intermediate 1y ago
Selling the Cause: Leveraging Marketing Strategies & Storytelling in Nonprofits
Computer Vision
The Nonprofit Prof Intermediate 1y ago
Not ElevenLabs, This new #1 Text to Speech AI is FREE!!!!
Computer Vision
1littlecoder Intermediate 1y ago
Next AI Project is Image Classification in Python🔍🤖
Computer Vision ⚡ AI Lesson
Tech With Tim Intermediate 1y ago
Best of 2024 in Vision [LS Live @ NeurIPS]
Computer Vision ⚡ AI Lesson
Latent Space Intermediate 1y ago
How to Do Email Segmentation the Right Way
Computer Vision ⚡ AI Lesson
Spark Bridge Digital | Email Marketing Agency Intermediate 1y ago
OpenAI DevDay 2024 | Multimodal apps with the Realtime API
Computer Vision
OpenAI Intermediate 1y ago
Ethan Norville EXPOSES Coronation Project Secrets
Computer Vision
Professor Charley T Intermediate 1y ago
MediaPipe Web: Bringing cross-platform AI tech to the browser
Computer Vision ⚡ AI Lesson
Chrome for Developers Intermediate 1y ago
Moondream: how does a tiny vision model slap so hard? — Vikhyat Korrapati
Computer Vision ⚡ AI Lesson
AI Engineer Intermediate 1y ago
Transformers.js: State-of-the-art Machine Learning for the web
Computer Vision ⚡ AI Lesson
Chrome for Developers Intermediate 1y ago
Stanford Seminar - Open-world Segmentation and Tracking in 3D
Computer Vision
Stanford Online Intermediate 1y ago
The Next Decade in AI and Computer Vision
Computer Vision ⚡ AI Lesson
a16z Intermediate 1y ago
Hairmony: Fairness-aware hairstyle classification
Computer Vision
Microsoft Research Intermediate 1y ago
AI vs. Machine Learning: Debunked
Computer Vision
Jean Lee Intermediate 1y ago
Multimodal RAG YT Video
Computer Vision
Srikantan Sankaran Intermediate 1y ago
Testing CA’s Computer Vision Robot Arm @LEGO @raspberrypi @Core-Electronics
Computer Vision
Creator Academy Australia Intermediate 1y ago
ExecuTorch Beta and on-Device Generative AI Support - Mergen Nachin & Mengtao (Martin) Yuan, Meta
Computer Vision
PyTorch Intermediate 1y ago
New Course: YOLOv12 – Custom Object Detection, Tracking & Web Apps
Computer Vision
Muhammad Moin Intermediate 1y ago
Build an AI-Powered Self-Serve Checkout & Cost Calculator in 10 Minutes (Almost)
Computer Vision
Roboflow Intermediate 1y ago
Measure Liquid Levels with AI | Build a Web App Powered by Computer Vision
Computer Vision
Roboflow Intermediate 1y ago
Pool Shot Predictor with OpenCV: Will the Ball Go Into the Pocket?
Computer Vision
Muhammad Moin Intermediate 1y ago
How to Train YOLO11 Instance Segmentation Models on Your Custom Dataset in Google Colab
Computer Vision
Muhammad Moin Intermediate 1y ago
Estimate Real Distance to Objects with Depth Pro and YOLO11
Computer Vision
Muhammad Moin Intermediate 1y ago
Florence-2: Create and Deploy a Custom Vision Language Model
Computer Vision
Roboflow Intermediate 1y ago
YOLO11: Performance Benchmark and Real World Use Cases
Computer Vision
Roboflow Intermediate 1y ago
Video Analytics with AI | Live Coding & Q&A (Oct 9th)
Computer Vision
Roboflow Intermediate 1y ago
GPT-4o: Fine-tune OpenAI's Multimodal Model | Live Coding & Q&A (Oct 3rd)
Computer Vision
Roboflow Intermediate 1y ago
YOLO11: How to Train for Object Detection | Live Coding & Q&A (Sep 30)
Computer Vision
Roboflow Intermediate 1y ago
📚 Continue on Coursera External links · Free to audit
Infraestructura de IA: GPU de Cloud
📚 External: Coursera ↗
Self-paced
Rural Marketing: Segmentation & Consumer Insights
📚 External: Coursera ↗
Self-paced
Build a DIY Multimodal Question Answering System with Vertex AI
📚 External: Coursera ↗
Self-paced
Tendencias e innovaciones en los medios deportivos
📚 External: Coursera ↗
Self-paced
Advancing Your Career in Computer Vision Engineering
📚 External: Coursera ↗
Self-paced
Supply Market Analysis
📚 External: Coursera ↗
Self-paced