Foundations

Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

1,541 lessons

Skills in this topic

3 skills

Classify images with a pre-trained CNN

Modern CV Models

Run YOLO for real-time object detection

Build a Stable Diffusion inference pipeline

Videos 1,145 Reads 396

AI Diaries Episode Multimodal Drug Safety at the Edge

Computer Vision

QuickTech Daily Advanced 2w ago

Getac and the Future of Rugged Technology and the Deskless Workforce

Computer Vision

Neil C. Hughes Advanced 2w ago

KREA.AI: the AI startup with more than 30 million users | itnig podcast

Computer Vision

Itnig Advanced 1mo ago

When Your Car Can Reason: An Inside Look at BADAS-Reason Technology. V-JEPA2 and Physical Causality.

Computer Vision

Byte Goose AI. Advanced 2mo ago

Why Legacy SIEM Models Are Struggling | Ali Ghodsi at RSAC 2026

Computer Vision

Databricks Advanced 2mo ago

From Vision Encoders to Perception Encoders: How Meta's EUPE Perception Encoder Beats the AI Giants.

Computer Vision

Byte Goose AI. Advanced 2mo ago

Yasser Benigmin - Domain Adaptation in the Era of Foundation Models

Computer Vision

Cohere Advanced 3mo ago

China’s Secret Combat Robot Revealed at Lunar New Year Gala!

Computer Vision

Technology Now Advanced 3mo ago

TensorFlow: Advanced Techniques Specialization

Computer Vision ⚡ AI Lesson

DeepLearning.AI Advanced 4mo ago

YOLO26 Fine-Tuning | Detection and Instance Segmentation | Live Coding + Q&A (Jan 15th)

Computer Vision ⚡ AI Lesson

Roboflow Advanced 5mo ago

Anaximander: Interactive Orchestration and Evaluation of Geospatial Foundation Models

Computer Vision

Microsoft Research Advanced 5mo ago

26. What is Hugging Face? | Full Guide to Models, Datasets & NLP In Hindi

Computer Vision

AI SayI Advanced 6mo ago

Anthony Fuller & Yousef Yassin - LookWhere? Efficient Visual Recognition by Learning Where to Look

Computer Vision ⚡ AI Lesson

Cohere Advanced 6mo ago

Basketball AI: Player Tracking, Team Detection, and Number Recognition with Python

Computer Vision ⚡ AI Lesson

Roboflow Advanced 6mo ago

Qwen3-Omni: The First Open All-in-One AI?

Computer Vision

What's AI by Louis-François Bouchard Advanced 9mo ago

2025 EC3 & CIB W78 - Partl, Rainer - Deep Neural Networks for Object-detection and Instance Seg...

Computer Vision

European Council on Computing in Construction Advanced 11mo ago

Transforming Guest Experiences: GoTo Foods’ Data Journey with Amperity & Databricks

Computer Vision ⚡ AI Lesson

Databricks Advanced 11mo ago

Train YOLO on Custom Dataset | Object Detection Step-by-Step Tutorial

Computer Vision

Samin Learns AI Advanced 1y ago

FastVLM brings advanced computer vision to your phone...

Computer Vision ⚡ AI Lesson

NeuralNine Advanced 1y ago

Find out how Nevada DETR achieved 4x faster approvals with Vertex AI

Computer Vision

Google Cloud Advanced 1y ago

PaliGemma – Making Gemma 2 see by adding a vision encoder

Computer Vision

Google for Developers Advanced 1y ago

YOLOE: Real-Time Zero-Shot Object Detection and Segmentation Explained | Visual Prompting

Computer Vision

Muhammad Moin Advanced 1y ago

George Hotz | mixture of experts (like deepseek) on tinygrad sovereign AMD stack | AMD YOLO

Computer Vision

george hotz archive Advanced 1y ago

The best OCR in the world: Mistral AI

Computer Vision

LAW I.A. Avocat & intelligence artificielle Lexvox Advanced 1y ago

Microsoft’s Phi-4 SLM: Open-Source AI for Text, Vision & Audio!

Computer Vision

Analytics Vidhya Advanced 1y ago

Deepseek is back with VISION

Computer Vision

1littlecoder Advanced 1y ago

Using Vertex AI for healthcare

Computer Vision

Google Cloud Tech Advanced 1y ago

Enhance Generative AI Model Accuracy Through High-Quality Multimodal Data Processing

Computer Vision

NVIDIA Developer Advanced 1y ago

YOLOv2 (YOLO9000) and YOLOv3 Explained

Computer Vision ⚡ AI Lesson

ExplainingAI Advanced 1y ago

New Video AI by META & Stanford Univ: APOLLO (7B)

Computer Vision ⚡ AI Lesson

Discover AI Advanced 1y ago

MedAI: Vision Language Models & Fine-Tuning (KnowAda)

Computer Vision

Discover AI Advanced 1y ago

open-animal-tracks

Computer Vision ⚡ AI Lesson

Data Skeptic Advanced 1y ago

Bird Distribution Modeling with Satbird

Computer Vision ⚡ AI Lesson

Data Skeptic Advanced 1y ago

YOLO Object Detection | YoloV1 Explanation and Implementation Tutorial

Computer Vision

ExplainingAI Advanced 1y ago

Beyond Language: The future of multimodal models in health, gaming, & AI | Microsoft Research Forum

Computer Vision

Microsoft Research Advanced 1y ago

Segment Anything 2: Memory + Vision = Object Permanence — with Nikhila Ravi and Joseph Nelson

Computer Vision

Latent Space Advanced 1y ago

JETSON AI LAB | Research Group Meeting (8/6/2024)

Computer Vision

NVIDIA Developer Advanced 1y ago

LlamaIndex Webinar: ColPali - Efficient Document Retrieval with Vision Language Models

Computer Vision

LlamaIndex Advanced 1y ago

Zhiwen Fan - VLM 3R Vision Language Models Augmented with Instruction Aligned 3D Reconstruction

Computer Vision ⚡ AI Lesson

Cohere Advanced 8mo ago

Ashmal Vayani - Seeing the World as It Speaks Multilingual, Culturally Aware Multimodal AI

Computer Vision ⚡ AI Lesson

Cohere Advanced 8mo ago

David Fan & Peter Tong - Scaling Language Free Visual Representation Learning

Computer Vision ⚡ AI Lesson

Cohere Advanced 10mo ago

RF-DETR Beat YOLOs on Real-time Object Detection | Fine-Tuning | Live Coding & Q&A (Mar 27th)

Computer Vision

Roboflow Advanced 1y ago

YOLOE: Real-time Zero-shot Object Detection | Visual Prompting | Live Coding & Q&A (Mar 14th)

Computer Vision

Roboflow Advanced 1y ago

Aya Vision - The Challenges & Breakthroughs

Computer Vision ⚡ AI Lesson

Cohere Advanced 1y ago

Peng Xia - RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models

Computer Vision

Cohere Advanced 1y ago

Gwanghyun (Bradley) Kim - BeyondScene: Higher-Resolution Human-Scene Generation

Computer Vision

Cohere Advanced 1y ago

Football AI | Community Q&A (Aug 29)

Computer Vision ⚡ AI Lesson

Roboflow Advanced 1y ago

Segment Anything 2 (SAM 2): Meta AI's Newest Model | Community Q&A (Jul 30)

Computer Vision

Roboflow Advanced 1y ago

📚 Continue on Coursera External links · Free to audit

Supply Market Analysis

📚 External: Coursera ↗

Salesforce Data Cloud Mastery: Certified Consultant Skills Path

📚 External: Coursera ↗

YOLO-NAS + v8 Full-Stack Computer Vision Course

📚 External: Coursera ↗

Implement Hand Gesture Recognition with OpenCV

📚 External: Coursera ↗

Advancing Your Career in Computer Vision Engineering

📚 External: Coursera ↗

Network Visualization and Intervention

📚 External: Coursera ↗

Packaging Materials

📚 External: Coursera ↗

Build Real-Time Face Recognition with OpenCV

📚 External: Coursera ↗

Image Processing: Segmentation and Characterization

📚 External: Coursera ↗

Networking and Security Architecture with VMware NSX

📚 External: Coursera ↗

Craft Sales Strategy

📚 External: Coursera ↗

The "Who" of the Marketing Strategy:Segmentation & Targeting

📚 External: Coursera ↗

Deep Learning Applications for Computer Vision

📚 External: Coursera ↗

Introduction to Computer Vision with TensorFlow

📚 External: Coursera ↗

Unity: Design & Deform Meshes for 3D Geometry Control

📚 External: Coursera ↗

Introduction to Deep Learning for Computer Vision

📚 External: Coursera ↗

Applied Machine Learning: Techniques and Applications

📚 External: Coursera ↗

Infrastructure: Technologies Behind Smart Venues

📚 External: Coursera ↗