Foundations

Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

1,332 lessons
Skills in this topic
CV Basics
beginner
Classify images with a pre-trained CNN
Modern CV Models
intermediate
Run YOLO for real-time object detection
Generative CV
advanced
Build a Stable Diffusion inference pipeline
Peng Xia - RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models
Computer Vision
Cohere Advanced 1y ago
MediaPipe Web: Bringing cross-platform AI tech to the browser
Computer Vision ⚡ AI Lesson
Chrome for Developers Intermediate 1y ago
Multimodal Embeddings: Introduction & Use Cases (with Python)
Computer Vision
Shaw Talebi Beginner 1y ago
How to Build a Smart Parking System - License Plate Detection & OCR
Computer Vision ⚡ AI Lesson
Roboflow Beginner 1y ago
Demo Lecture-Image Processing-Computer Vision With Generative AI Bootcamp With Doubts Solving
Computer Vision
Krish Naik Beginner 1y ago
Insights from a Kaggle Grandmaster: Multimodal Models, Agents, Document AI & more
Computer Vision
Analytics Vidhya Beginner 1y ago
MedAI: Vision Language Models & Fine-Tuning (KnowAda)
Computer Vision
Discover AI Advanced 1y ago
Moondream: how does a tiny vision model slap so hard? — Vikhyat Korrapati
Computer Vision ⚡ AI Lesson
AI Engineer Intermediate 1y ago
Transformers.js: State-of-the-art Machine Learning for the web
Computer Vision ⚡ AI Lesson
Chrome for Developers Intermediate 1y ago
NLP Engineer & Computer Vision Engineer #codebasics #nlp #computervision #datajob #shorts
Computer Vision ⚡ AI Lesson
codebasics Beginner 1y ago
Gwanghyun (Bradley) Kim - BeyondScene: Higher-Resolution Human-Scene Generation
Computer Vision
Cohere Advanced 1y ago
Stanford Seminar - Open-world Segmentation and Tracking in 3D
Computer Vision
Stanford Online Intermediate 1y ago
Revolutionizing sign language with AI
Computer Vision ⚡ AI Lesson
TensorFlow Official Beginner 1y ago
Neuralift AI builds trust using W&B Weave
Computer Vision
Weights & Biases Beginner 1y ago
The Next Decade in AI and Computer Vision
Computer Vision ⚡ AI Lesson
a16z Intermediate 1y ago
[Paper Club] SWE-Bench [OpenAI Verified/Multimodal] + MLE-Bench with Jesse Hu
Computer Vision ⚡ AI Lesson
Latent Space Beginner 1y ago
Single Shot Multibox Detector | SSD Object Detection Explained and Implemented
Computer Vision
ExplainingAI Beginner 1y ago
YOLOv11: How to Train for Object Detection on a Custom Dataset | Step-by-step guide
Computer Vision
Roboflow Beginner 1y ago
Data As a Corporate Asset—the GenAI-era Take (Part 2)
Computer Vision ⚡ AI Lesson
Microsoft Developer Beginner 1y ago
Computer Vision Explained in 30s
Computer Vision
365 Data Science Beginner 1y ago
Multimodal RAG YT Video
Computer Vision
Srikantan Sankaran Intermediate 1y ago
New Way Now: Plenitude streamlines customer onboarding and fraud prevention with Google Cloud AI
Computer Vision
Google Cloud Beginner 1y ago
Testing CA’s Computer Vision Robot Arm @LEGO @raspberrypi @Core-Electronics
Computer Vision
Creator Academy Australia Intermediate 1y ago
Blobs to Clips: Efficient End-to-End Video Data Loading - Andrew Ho & Ahmad Sharif, Meta
Computer Vision ⚡ AI Lesson
PyTorch Beginner 1y ago
Llama 3.2: Best Multimodal Model Yet? (Vision Test)
Computer Vision ⚡ AI Lesson
Mervin Praison Beginner 1y ago
CS50x 2025 - Lecture 4 - Memory
Computer Vision
CS50 Beginner 1y ago
Data As a Corporate Asset—the GenAI-era Take (Part 1)
Computer Vision
Microsoft Developer Beginner 1y ago
Free Live 3 Days Computer Vision and Object Detection Workshop
Computer Vision
Krish Naik Beginner 1y ago
Using PyTorch for Monocular Depth Estimation Webinar
Computer Vision ⚡ AI Lesson
PyTorch Beginner 1y ago
The era of unbounded products: Designing for Multimodal IO: Ben Hylak
Computer Vision
AI Engineer Intermediate 1y ago
Handwriting Transcription with AI: Digitizing Documents Using Computer Vision
Computer Vision
Macgence Beginner 1y ago
Object Detection: Importance of High-Quality Data
Computer Vision ⚡ AI Lesson
Macgence Beginner 1y ago
“The Future of AI is Here” — Fei-Fei Li Unveils the Next Frontier of AI
Computer Vision ⚡ AI Lesson
a16z Beginner 1y ago
Why Zero Trust is the Key to Cybersecurity in 2024 and Beyond
Computer Vision ⚡ AI Lesson
SANS Institute Intermediate 1y ago
AI for Business Transformation: Lessons from Healthcare
Computer Vision
Microsoft Research Beginner 1y ago
open-animal-tracks
Computer Vision ⚡ AI Lesson
Data Skeptic Advanced 1y ago
Bird Distribution Modeling with Satbird
Computer Vision ⚡ AI Lesson
Data Skeptic Advanced 1y ago
Web AI Summit 2024: State of client side machine learning
Computer Vision ⚡ AI Lesson
Chrome for Developers Beginner 1y ago
YOLO11: Performance Benchmark and Real World Use Cases
Computer Vision
Roboflow Intermediate 1y ago
Video Analytics with AI | Live Coding & Q&A (Oct 9th)
Computer Vision
Roboflow Intermediate 1y ago
How to use OCR | Get Started with Optical Character Recognition
Computer Vision
Roboflow Beginner 1y ago
GPT-4o: Fine-tune OpenAI's Multimodal Model | Live Coding & Q&A (Oct 3rd)
Computer Vision
Roboflow Intermediate 1y ago
YOLO11: How to Train for Object Detection | Live Coding & Q&A (Sep 30)
Computer Vision
Roboflow Intermediate 1y ago
Using RTSP Streams for Computer Vision | Tracking & Counting Objects
Computer Vision
Roboflow Intermediate 1y ago
Xiang Yue - Measuring Multimodal Reasoning with the MMMU Benchmarks
Computer Vision
Cohere Beginner 1y ago
Model Evaluation for Computer Vision
Computer Vision ⚡ AI Lesson
Roboflow Beginner 1y ago
Active Learning in Computer Vision
Computer Vision
Roboflow Beginner 1y ago
Use Dedicated Deployments with Computer Vision Workflows
Computer Vision
Roboflow Intermediate 1y ago
📚 Coursera Courses Opens on Coursera · Free to audit
Start Remote Sensing
📚 Coursera Course ↗
Self-paced
Multimodal Literacies: Communication and Learning in the Era of Digital Media
📚 Coursera Course ↗
Self-paced
Future of data and technology in football
📚 Coursera Course ↗
Self-paced
Introduction to Computer Vision with TensorFlow
📚 Coursera Course ↗
Self-paced
Python Project: Software Engineering and Image Manipulation
📚 Coursera Course ↗
Self-paced
Anatomy of the Abdomen and Pelvis; a journey from basis to clinic.
📚 Coursera Course ↗
Self-paced