Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

1,538 lessons
Skills in this topic
CV Basics (beginner): Classify images with a pre-trained CNN
Modern CV Models (intermediate): Run YOLO for real-time object detection
Generative CV (advanced): Build a Stable Diffusion inference pipeline
All Reads (393) · Articles (216) · Blog Posts (116) · Tutorials (47) · Research Papers (13) · News (1)
Video Stabilization — Deep Dive + Problem: Softmax Function
Dev.to · pixelbank dev 👁️ Computer Vision ⚡ AI Lesson 2d ago
A daily deep dive into CV topics, coding problems, and platform features from PixelBank. ...
Exceptional Control Flow - Deep Reference **CSAPP Chapter 8**
Dev.to · Sangyog Puri 👁️ Computer Vision ⚡ AI Lesson 3d ago
1. The Core Idea - What is Exceptional Control Flow? Normally a program runs sequentially...
what i learned on day 1 of a 3D reconstruction internship
Dev.to · Shahram Shafiq 👁️ Computer Vision ⚡ AI Lesson 4d ago
Day 1 post, PreserveMy.World x TechRealm Internship 2026 I'm a CS student at FAST NUCES...
do you know what exactly startup code does ?
Dev.to · hassaan-syed 👁️ Computer Vision ⚡ AI Lesson 4d ago
let me explain what happens before the main() Most C programmers believe that a program starts...
A VLM gate for generated images, with provider failover via Bifrost
Dev.to · Elise Moreau 👁️ Computer Vision ⚡ AI Lesson 1w ago
TL;DR: At Photoroom we run a vision-language model as the last check before a generated product image...
Roblox Promised "No Friction." Parents Got Locked Out — and $6.7B Vanished.
Dev.to · CaraComp 👁️ Computer Vision ⚡ AI Lesson 1w ago
The engineering reality of biometric friction For developers building in the computer vision and...
OpenCV 5 Is Here: The Biggest Leap in Years for Computer Vision
OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 3w ago
Authored by: Abhishek Gola and Gursimar Singh OpenCV 5 is one of the most important releases in the history of OpenCV. For more than two decades, OpenCV has bee
My Friend Had a Cameras-On Problem. I Wrote Him a Solution.
Dev.to · Heiner 👁️ Computer Vision ⚡ AI Lesson 4w ago
Originally published on my blog. GitHub: ScrumSurvivor. My Friend Had a Cameras-On Problem....
Software Rendering Pipeline with Backface Culling
Dev.to · yubin yang 👁️ Computer Vision ⚡ AI Lesson 1mo ago
1. Overview In this project, I implemented a simple software renderer using Python and...
Real-time video classification with PaliGemma: architecture patterns for low-latency VLM inference
Dev.to · Pasquale Molinaro 👁️ Computer Vision ⚡ AI Lesson 1mo ago
In a previous article, we benchmarked three open-source Vision-Language Models on zero-shot object...
Tile Extractor
Dev.to · somyabhalani 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Parsing the Unparsable: Building a Layout-Aware Computer Vision Pipeline for 50,000+ Stone...
"Mastering Digital Logic Counters with C++ OOP: A Hands-On Guide"
Dev.to · Abdullah Fiaz 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Introduction Digital logic counters are fundamental in electronics and computing. They track events,...
Rasterization Using Bresenham Algorithm and Scanline Algorithm
Dev.to · yubin yang 👁️ Computer Vision ⚡ AI Lesson 1mo ago
1. Overview Bresenham algorithm is the fastest algorithm for drawing straight lines on a...
Intelligent OCR for Business Documents: Architecture and Lessons from the Field
Dev.to · Alessandro Binda 👁️ Computer Vision ⚡ AI Lesson 1mo ago
OCR (Optical Character Recognition) for modern printed text has been a solved problem for decades....
Why Your Computer Reads Numbers Backwards: Byte Order Explained
Dev.to · hassaan-syed 👁️ Computer Vision ⚡ AI Lesson 1mo ago
What is Byte Order? Before understanding byte order, we need to understand one thing: A byte = 8...
How I Built a Perceptual Color Quantization Engine for LEGO Mosaics
Dev.to · BMBrick 👁️ Computer Vision ⚡ AI Lesson 1mo ago
The Problem Converting a photo into a LEGO mosaic sounds simple: resize the image, find...
Deconstructing the TikTok Media Stack: Building a High-Performance, No-Watermark Extraction Engine
Dev.to · yqqwe 👁️ Computer Vision ⚡ AI Lesson 1mo ago
Introduction As developers, we are often fascinated by how global-scale platforms manage...
OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 2mo ago
How P&G Uses AI to Understand Human Behavior
Computer vision isn’t just for self-driving cars and robots. At Procter & Gamble, it’s helping researchers understand human behavior, generate synthetic data, a
The AI School Bus Camera Company Blanketing America in Tickets
Dev.to · Aman Shekhar 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Ever find yourself sitting in traffic, cursing under your breath because a school bus has stopped...
OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 2mo ago
The Holographic Future Is Here. See It at OSCCA.
For decades, the hologram was a promise. A thing of science fiction. Something always just around the corner. Shawn Frayne decided to stop waiting. As co-founde
Mitigating I/O Bottlenecks in Event-Driven Architectures: A Deep Dive into Backpressure and Resiliency
Dev.to · João Vitor Nascimento Mendonca 👁️ Computer Vision ⚡ AI Lesson 2mo ago
By: João Vitor Nascimento De Mendonça Originally published in...
Three.js: Tips and Tricks - Detailed Technical Analysis Guide 2026
Dev.to · FORUM WEB 👁️ Computer Vision ⚡ AI Lesson 2mo ago
The History and Development of Three.js: Three.js, by Ricardo Cabello (Mr. doob) in 2010...
Blurring a Name Doesn't Anonymise a Face: What GDPR Actually Says
Dev.to · CaraComp 👁️ Computer Vision 3mo ago
Think your facial datasets are anonymized? Think again. For developers building computer vision (CV)...
Standard RAG Is Blind — Building Multimodal RAG in .NET to Fix It
Dev.to · Argha Sarkar 👁️ Computer Vision 3mo ago
The Scenario A developer builds a RAG system. A user uploads a 60-page service manual —...
Multimodal Biometrics: Why Face + Fingerprint + Voice Defeats Deepfakes
Dev.to · CaraComp 👁️ Computer Vision 3mo ago
How multimodal fusion rewrites the rules of biometric probability As developers building identity...
Introducing a Simple, High-Performance 3D Visualization Tool in Python for Robotics, SLAM, and Computer Vision Applications
Dev.to · Roman Dubrovin 👁️ Computer Vision 3mo ago
Introduction: The 3D Visualization Gap in Python In the world of robotics, SLAM, and...
AI Facial Recognition Sent an Innocent Grandmother to Jail
Dev.to · CaraComp 👁️ Computer Vision 3mo ago
the technical failure points of automated identification For developers working in computer vision...
The Face Recognition Error That's Wrecking Investigations
Dev.to · CaraComp 👁️ Computer Vision 3mo ago
the mathematical gap between open-world search and closed-set verification The accuracy ceiling for...
Law Enforcement Isn't Abandoning Face Tech — It's Regulating It
Dev.to · CaraComp 👁️ Computer Vision 3mo ago
The hidden shift in biometric regulation For developers building in the computer vision and...
When 99% Accurate Still Means Thousands of Wrong Arrests
Dev.to · CaraComp 👁️ Computer Vision 3mo ago
Biometric Accuracy vs. Investigative Reality For developers working in computer vision (CV) and...
Tomorrow: March 12 - MCP, Agents and Skills Meetup
Dev.to · Jimmy Guerrero 👁️ Computer Vision 3mo ago
Join us tomorrow on March 12 at 9 AM Pacific for a special edition of the AI, ML and Computer Vision...
Hunting Einstein Rings: Achieving 0.994 mAP in Deep-Space Detection with RT-DETR
Dev.to · jinghao-ai 👁️ Computer Vision 3mo ago
1. Introduction: The Needle in a Haystack Detecting Strong Gravitational Lensing (e.g., Einstein...
Building a Real-Time Posture Monitoring System in the Browser (MediaPipe + PiP)
Dev.to · Manan Verma 👁️ Computer Vision 3mo ago
Browsers Kill Background Tabs. Here’s How I Kept My Computer Vision Engine Alive. Most...
YOLO vs Cloud API for Object Detection — Which One Should You Actually Use?
Dev.to · AI Engine 👁️ Computer Vision 3mo ago
You need object detection in your app. You have two paths: run YOLO on your own GPU, or call a cloud...
Stop Losing Your Medical Records: Build a Multimodal Health RAG with LlamaIndex & Qdrant 🩺
Dev.to · wellallyTech 👁️ Computer Vision 3mo ago
We’ve all been there: staring at a pile of blood test results, crumpled physical therapy notes, and...
AI-Based Green Light Optimization using Computer Vision
Dev.to · Naitik Verma 👁️ Computer Vision 3mo ago
Urban traffic systems still rely largely on fixed timer traffic lights. These timers do not adapt to...
March 19 - Women in AI Meetup
Dev.to · Jimmy Guerrero 👁️ Computer Vision 3mo ago
Hear talks from experts on cutting-edge topics in AI, ML, and computer vision at the Women in AI...
How to Build AI iOS Apps: Complete CoreML Guide
Dev.to · Iniyarajan 👁️ Computer Vision 3mo ago
Learn how to build AI iOS apps with CoreML, Vision, and Natural Language frameworks. Complete Swift code examples for image recognition and text analysis.
March 12 - MCP, Skills and Agents AI Meetup
Dev.to · Jimmy Guerrero 👁️ Computer Vision 3mo ago
Join us on March 12 for a special edition of the AI, ML and Computer Vision Meetup where we will...
March 5 - AI, ML and Computer Vision Meetup
Dev.to · Jimmy Guerrero 👁️ Computer Vision 4mo ago
Join us on March 5 for the virtual AI, ML and Computer Vision Meetup. Register for the...
I Built a Real JARVIS in Python with Knowledge Graphs, BERT Emotion Detection, Face Recognition and NASA API
Dev.to · Konstantinos 👁️ Computer Vision 4mo ago
Ever watched Iron Man and thought — could I actually build that? I did, and after months of work,...
Gandalf Vision
Dev.to · Andrey 👁️ Computer Vision 4mo ago
Hey! So I spent yesterday diving into that Gandalf Vision library you mentioned—the computer vision...
Building a Custom Augmented Reality Marker Detector with OpenCV
Dev.to · 💻 Arpad Kish 💻 👁️ Computer Vision 4mo ago
Augmented Reality (AR) bridges the gap between the physical and digital worlds. A foundational step...
How to use OpenCV in Python, Make Your Hand Invisible Using OpenCV Magic Effect
Dev.to · Shafqat Awan 👁️ Computer Vision 4mo ago
As we move into 2026, the demand for real-time computer vision manipulation has shifted from simple filters to seamless augmented reality integrations...
From Metrics to Action: Turning Embedding Analysis into Sprint Tickets
Dev.to · Itay Eylath 👁️ Computer Vision 4mo ago
In an agile Computer Vision startup, global accuracy is a vanity metric. It tells you the model is...
Food Image Recognition: How AI Identifies What's on Your Plate
Dev.to · albert nahas 👁️ Computer Vision 4mo ago
Discover how AI-powered food image recognition accurately identifies and estimates your meals. Explore tech insights and boost your app’s accuracy today!
Building FridgeChef: What I Learned Training a Custom Computer Vision Model with Roboflow
Dev.to · Jacob Nastaskin 👁️ Computer Vision 4mo ago
I spend too much time staring at my fridge trying to figure out what to make for dinner. So I built...
Generating SEM Images from Segmentation Masks
Dev.to · Shira S 👁️ Computer Vision 4mo ago
Acknowledgements We would like to thank our mentors, Asaf Nisani and Yoav Lebendiker, for...