Foundations

Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

1,538 lessons
Skills in this topic
CV Basics (beginner): Classify images with a pre-trained CNN
Modern CV Models (intermediate): Run YOLO for real-time object detection
Generative CV (advanced): Build a Stable Diffusion inference pipeline
All Reads (393) · Articles (216) · Blog Posts (116) · Tutorials (47) · Research Papers (13) · News (1)
Building a Real-Time Fire Detection and People Counting System with InceptionV3 and OpenCV
Medium · Deep Learning 👁️ Computer Vision ⚡ AI Lesson 4w ago
How transfer learning and classical computer vision can work together on edge hardware to save lives.
My Friend Had a Cameras-On Problem. I Wrote Him a Solution.
Dev.to · Heiner 👁️ Computer Vision ⚡ AI Lesson 4w ago
Originally published on my blog. GitHub: ScrumSurvivor. My Friend Had a Cameras-On Problem....
How to Migrate From Clarifai to Ximilar: Quick Start Guide
Medium · AI 👁️ Computer Vision ⚡ AI Lesson 4w ago
Your drop-in replacement for custom classification, detection, and visual search.
Household Item Annotation Services for AI & Computer Vision
Medium · Machine Learning 👁️ Computer Vision ⚡ AI Lesson 4w ago
Artificial Intelligence systems that understand indoor environments are becoming increasingly important across industries such as real…
Dev.to AI 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Deepfakes Just Broke Evidence: $893M Gone, 100K Fake Images, First Arrests Land
The evolution of forensic verification in the age of generative noise. For developers working in computer vision (CV) and biometrics, the news of $893M in AI-sca
Software Rendering Pipeline with Backface Culling
Dev.to · yubin yang 👁️ Computer Vision ⚡ AI Lesson 1mo ago
1. Overview In this project, I implemented a simple software renderer using Python and...
NVIDIA LocateAnything-3B : GoodBye YOLO Object Detection
Medium · Programming 👁️ Computer Vision ⚡ AI Lesson 1mo ago
How to use NVIDIA LocateAnything-3B? Continue reading on Data Science in Your Pocket »
Reddit r/MachineLearning 👁️ Computer Vision ⚡ AI Lesson 1mo ago
A new dataset with more than 100M high-quality, curated images, with captions and metadata! [P]
Hello everyone. The new dataset is named MONET, is Apache 2.0 and available on HF: https://huggingface.co/datasets/jasperai/monet MONET is open, Apache 2.0-lice
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 1mo ago
BlazeEdit: Generalist Image Editing on Mobile Devices with Image-to-Image Diffusion Models
arXiv:2605.28067v1 Announce Type: new Abstract: The remarkable generation quality of modern diffusion models often comes at the cost of massive parameter counts
The Robotics Interview Series: Part 2A
Medium · Data Science 👁️ Computer Vision ⚡ AI Lesson 1mo ago
The Perception Concepts You Need Cold (Beyond CV Fundamentals)
Reddit r/deeplearning 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Pls suggest best resources to learn semantic segmentation
I want to learn it for road extraction, so please suggest the best resources. Submitted by /u/NoAnybody8034
When to Choose C++ for Barcode Processing Pipelines
Medium · Programming 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Barcode processing is vital in logistics, retail, healthcare, and manufacturing. While many languages support barcode recognition, C++ is…
How Vehicle Speed Detection Using Image Processing Technology in Intelligent Transportation Systems…
Medium · Python 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Can a traffic camera tell you how many km/h a vehicle was doing as it passed? Not without a software layer. This article, that software layer…
Medium · Programming 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Shot detection is the cheap feature everyone underestimates
A friend of mine spent two months trying to add a “smart preview” feature to a video product, the kind of thing you see on every modern…
Real-time video classification with PaliGemma: architecture patterns for low-latency VLM inference
Dev.to · Pasquale Molinaro 👁️ Computer Vision ⚡ AI Lesson 1mo ago
In a previous article, we benchmarked three open-source Vision-Language Models on zero-shot object...
cv3 — make OpenCV pythonic again
Medium · AI 👁️ Computer Vision ⚡ AI Lesson 1mo ago
TL;DR cv3 is a Pythonic wrapper for OpenCV that simplifies computer vision tasks by providing more intuitive interfaces and eliminating…
Tile Extractor
Dev.to · somyabhalani 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Parsing the Unparsable: Building a Layout-Aware Computer Vision Pipeline for 50,000+ Stone...
SentinelML
Medium · AI 👁️ Computer Vision ⚡ AI Lesson 1mo ago
A modular, open-source framework for real-time firearm detection and alerting using YOLOv8 and cloud-native infrastructure.
Medium · Python 👁️ Computer Vision ⚡ AI Lesson 1mo ago
sen2p: Download Sentinel-2 Imagery Without API Keys or Extra Setup
A lightweight Python library that makes Sentinel-2 imagery easier to search and download. Continue reading on GeoAI »
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 1mo ago
A Camera-Cooperative ISAC Framework for Multimodal Non-Cooperative UAVs Sensing
arXiv:2605.22090v1 Announce Type: new Abstract: The detection of non-cooperative unmanned aerial vehicles (UAVs) presents significant challenges for Integrated
R-CNN : The Foundation of Deep Learning-Based Object Detection
Medium · Deep Learning 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Object detection is one of the most important tasks in computer vision. Unlike image classification, where the goal is only to identify…
I Built a 7-Stage OCR Pipeline to Make Gemini Vision Actually Reliable
Medium · Python 👁️ Computer Vision ⚡ AI Lesson 1mo ago
We all know LLMs are powerful. But they’re also probabilistic, and that’s the problem. The real job of an AI engineer isn’t just to call…
Traffic Light Recognition (TLR) Architecture: 2D Bounding Box Detection
Medium · Machine Learning 👁️ Computer Vision ⚡ AI Lesson 1mo ago
The TLR model is a Fully Convolutional Network (FCN) + FPN + Header model, utilizing an “anchor-free” approach. Instead of guessing…
2D Gaussian Splatting: when removing a dimension makes 3D better
Medium · AI 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Why 3D Gaussians fail at surfaces, and how flat disks fix it
"Mastering Digital Logic Counters with C++ OOP: A Hands-On Guide”
Dev.to · Abdullah Fiaz 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Introduction Digital logic counters are fundamental in electronics and computing. They track events,...
How computational thinking helped me structure my deliverables
Medium · Programming 👁️ Computer Vision ⚡ AI Lesson 1mo ago
For quite a while now I have been trying to make my way, little by little, into the world of programming. Continue reading on Tatiane Marina »
Manchester Code Made Bits Behave
IEEE Spectrum 👁️ Computer Vision ⚡ AI Lesson 1mo ago
In the late 1940s—when computer engineers were grappling with unreliable hardware and noisy transmission environments—a team of engineers inside a modest lab at
Dev.to AI 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Why Your Image Upload Pipeline Should Check for Physically Impossible Lighting
If you're building user-generated content platforms, marketplace verification sys
Rasterization Using Bresenham Algorithm and Scanline Algorithm
Dev.to · yubin yang 👁️ Computer Vision ⚡ AI Lesson 1mo ago
1. Overview Bresenham algorithm is the fastest algorithm for drawing straight lines on a...
Intelligent OCR for Business Documents: Architecture and Lessons from the Field
Dev.to · Alessandro Binda 👁️ Computer Vision ⚡ AI Lesson 1mo ago
OCR (Optical Character Recognition) for modern printed text has been a solved problem for decades....
Computer Vision Journey, Day 2: Drawing on Frames with OpenCV
Medium · AI 👁️ Computer Vision ⚡ AI Lesson 1mo ago
In computer vision projects, capturing images from the camera is only the first step. In real systems, the key point is that the captured frames…
Who Really Deserves To Be Called The Father Of The Internet
Medium · Programming 👁️ Computer Vision ⚡ AI Lesson 1mo ago
From ARPANET to the World Wide Web, the Internet was built by a network of pioneers, not one inventor. Continue reading on IT Chronicles »
Why Your Computer Reads Numbers Backwards: Byte Order Explained
Dev.to · hassaan-syed 👁️ Computer Vision ⚡ AI Lesson 1mo ago
What is Byte Order? Before understanding byte order, we need to understand one thing: A byte = 8...
Dev.to AI 👁️ Computer Vision ⚡ AI Lesson 1mo ago
High Speed and Performance
The C language is very fast because it is a compiled language. It converts code directly into machine language, so programs run quickly a
Dev.to AI 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Building a License Plate Recognition Engine in C++ — Part 2: Grayscale Image Preprocessing and Local Contrast Edge Detection
In the previous article, we loaded an image, converted it into grayscale, and introduced the core data structures used by the recognition engine. In this part,
Inside SAM 3D: how Meta turns a single image into 3D
Medium · Machine Learning 👁️ Computer Vision ⚡ AI Lesson 1mo ago
For about forty years, “3D” in the practical sense meant one thing: triangle meshes. Every game shipped, every animated film rendered…
Demystifying CNNs: How Convolutional Filters and Max-Pooling Actually Work
Medium · Data Science 👁️ Computer Vision ⚡ AI Lesson 1mo ago
If you’ve ever wondered how a computer can look at a photo of a car and instantly know it’s a car, you’re looking at the magic of…
Dev.to AI 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Your "Biometric Age Check" Isn't Verifying Identity — And Defense Lawyers Know It
Understanding the distinction between biometric age estimation and identity verification. For developers in the computer vision and biometrics space, the nuance
MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons
Medium · Machine Learning 👁️ Computer Vision ⚡ AI Lesson 1mo ago
This project speaks for itself. It covers three crucial steps in one go: motion tracking, skeleton reconstruction, and 3D animation. What…
How I Built a Perceptual Color Quantization Engine for LEGO Mosaics
Dev.to · BMBrick 👁️ Computer Vision ⚡ AI Lesson 1mo ago
The Problem Converting a photo into a LEGO mosaic sounds simple: resize the image, find...
Computer Vision Is Rebuilding the Fitting Room
Medium · AI 👁️ Computer Vision ⚡ AI Lesson 1mo ago
The models, the stack, the ROI: no fluff
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 1mo ago
Intelligent CCTV for Urban Design: AI-Based Analysis of Soft Infrastructure at Intersections
arXiv:2605.05402v1 Announce Type: new Abstract: Artificial intelligence (AI) and computer vision are transforming transportation data collection. This study int
A Practical Guide to Optimizing Digital Image Lighting with Python
Medium · Python 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Why Is Lighting Crucial? Have you ever taken a photo in low-light conditions and found the result so dark that…