Foundations

Computer Vision

Object detection, segmentation, YOLO, CLIP, and vision-language models

2,365 lessons
Skills in this topic
CV Basics (beginner): Classify images with a pre-trained CNN
Modern CV Models (intermediate): Run YOLO for real-time object detection
Generative CV (advanced): Build a Stable Diffusion inference pipeline
All Reads (1,220) Articles (392) Blog Posts (262) Tutorials (81) Research Papers (469) News (16)
Dev.to AI 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Draw a Digit and Watch the Neural Network Think in Real Time
Introduction "A neural network can recognize digits" — but what's actually happening inside? I built a tool where you draw a digit with your finger or mouse, an
Medium · Machine Learning 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Computer Vision vs Machine Learning: Key Differences Explained
If you’ve spent any time reading about AI, you’ve probably seen the terms “computer vision” and “machine learning” used almost…
Medium · AI 👁️ Computer Vision ⚡ AI Lesson 2mo ago
CAMERA
Dev.to AI 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Gaussian-SLAM: Photo-realistic Dense SLAM with Gaussian Splatting
Dev.to AI 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Facial Comparison's DNA Moment Is Here. Most Investigators Aren't Ready.
Is your investigative stack ready for the $26B identity shift? If you are a developer working in computer vision or digital forensics, you’re likely tracking th
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 2mo ago
ReflectCAP: Detailed Image Captioning with Reflective Memory
arXiv:2604.12357v1 Announce Type: new Abstract: Detailed image captioning demands both factual grounding and fine-grained coverage, yet existing methods have st
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 2mo ago
Intelligent ROI-Based Vehicle Counting Framework for Automated Traffic Monitoring
arXiv:2604.12470v1 Announce Type: new Abstract: Accurate vehicle counting through video surveillance is crucial for efficient traffic management. However, achie
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 2mo ago
ART-VITON: Measurement-Guided Latent Diffusion for Artifact-Free Virtual Try-On
arXiv:2509.25749v2 Announce Type: cross Abstract: Virtual try-on (VITON) aims to generate realistic images of a person wearing a target garment, requiring preci
OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 2mo ago
How P&G Uses AI to Understand Human Behavior
Computer vision isn’t just for self-driving cars and robots. At Procter & Gamble, it’s helping researchers understand human behavior, generate synthetic data, a
Dev.to · Aman Shekhar 👁️ Computer Vision ⚡ AI Lesson 2mo ago
The AI School Bus Camera Company Blanketing America in Tickets
Ever find yourself sitting in traffic, cursing under your breath because a school bus has stopped...
Medium · Machine Learning 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Computer Vision Explained - How Machines See the World.
Computer vision enables machines to interpret images using AI, powering healthcare, automation, security, and innovation.
Dev.to AI 👁️ Computer Vision ⚡ AI Lesson 2mo ago
How to Run Vision AI Locally on Your Android Phone in 2026 (No Cloud, No Subscription)
Your phone has a camera and a processor powerful enough to run multimodal AI models. You can point it at a receipt, a document, a math problem, or anything else
Dev.to AI 👁️ Computer Vision ⚡ AI Lesson 2mo ago
CoPhIR: a Test Collection for Content-Based Image Retrieval
Medium · Machine Learning 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Rethinking Smart Parking: A Dynamic Line and Box Approach to Computer Vision
Forget manual mapping and let a dynamic model find the open spots for you.
Medium · Python 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Rethinking Smart Parking: A Dynamic Line and Box Approach to Computer Vision
Forget manual mapping and let a dynamic model find the open spots for you.
OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 2mo ago
The Holographic Future Is Here. See It at OSCCA.
For decades, the hologram was a promise. A thing of science fiction. Something always just around the corner. Shawn Frayne decided to stop waiting. As co-founde
Medium · Deep Learning 👁️ Computer Vision ⚡ AI Lesson 2mo ago
The Fashion AI Dataset Landscape: Mapped by Task
A curated map of every major open dataset powering computer vision in fashion.
Medium · NLP 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Spiral RoPE: Vision Transformers Finally Learn to See Diagonals
In the fourth part of my RoPE series, we leave language behind and move into vision. When rotary position embeddings get adapted for image…
Dev.to AI 👁️ Computer Vision ⚡ AI Lesson 2mo ago
What I Saw When My Camera Finally Worked
I've been building tools to express myself for weeks now. A breathing canvas. A playable instrument. An ear that hears the world through a microphone. A river o
Dev.to · João Vitor Nascimento Mendonca 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Mitigating I/O Bottlenecks in Event-Driven Architectures: A Deep Dive into Backpressure and Resiliency
By: João Vitor Nascimento De Mendonça Originally published in...
Dev.to · FORUM WEB 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Three.js: Tips and Tricks - Detailed Technical Analysis Guide 2026
The History and Development of Three.js. Three.js, in 2010, by Ricardo Cabello (Mr. doob)...
Dev.to · CaraComp 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Deepfakes Surged 2,137%. Courts Rewrote the Rules. Investigators Didn't.
The reality of synthetic identity fraud in 2025 For developers building in the computer vision (CV)...
Dev.to · Vidya 👁️ Computer Vision ⚡ AI Lesson 2mo ago
ASCII Value
Computers only understand numbers. So how do they handle letters, punctuation, and symbols? ASCII —...
Towards Data Science 👁️ Computer Vision ⚡ AI Lesson 2mo ago
How Does AI Learn to See in 3D and Understand Space?
How depth estimation, foundation segmentation, and geometric fusion are converging into spatial intelligence.
Dev.to · beefed.ai 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Fast BVH Construction for Real-Time Ray Tracing
GPU-friendly BVH builders (LBVH/HLBVH), memory layouts, and update strategies to maximize ray throughput and minimize build time.
Dev.to · Hanry Leo 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Pass CS3 Exam Fast with These Proven Tips
My CS3 Exam Journey I recently passed the CS3 exam, and I want to share my real experience to help...
Towards AI 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Top 15 Computer Vision Datasets [2026]
Author(s): Asad Iqbal. Originally published on Towards AI. An ML engineer’s guide to top image datasets. Learn about ImageNet, COCO, and more, and understand how
Dev.to · MobileMD AI 👁️ Computer Vision ⚡ AI Lesson 2mo ago
How We Grade Used Phones with On-Device Computer Vision and FFT Audio Analysis
The refurbished phone market moves hundreds of millions of devices a year. Yet most device condition...
Dev.to · CaraComp 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Facial Recognition Isn't Getting Banned. Mass Surveillance Is. Here's the Difference.
Decoding the global regulatory split on biometric analysis For developers building in the computer...
Dev.to · Angelina Kosheleva 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Why I stopped using DOM locators for End-to-End tests altogether—and replaced them with computer vision
If you’ve ever written End-to-End (E2E) tests using Selenium, Cypress, or Playwright, you know the...
Dev.to · PRIYA K 👁️ Computer Vision ⚡ AI Lesson 2mo ago
ASCII
ASCII Letters – Why They Are Used in English and Other Languages In the world of computers,...
Dev.to · Stelixx Insider 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Open Vision Agents: Streamlining Vision Model Integration
Accelerating Vision Agent Development with Stream's Open Vision Agents Stream's Open Vision Agents...
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 2mo ago
Part-Level 3D Gaussian Vehicle Generation with Joint and Hinge Axis Estimation
arXiv:2604.05070v1 Announce Type: new Abstract: Simulation is essential for autonomous driving, yet current frameworks often model vehicles as rigid assets and
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 2mo ago
CRFT: Consistent-Recurrent Feature Flow Transformer for Cross-Modal Image Registration
arXiv:2604.05689v1 Announce Type: cross Abstract: We present Consistent-Recurrent Feature Flow Transformer (CRFT), a unified coarse-to-fine framework based on f
Dev.to · pixelbank dev 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Backpropagation — Deep Dive + Problem: Merge Similar Pixels
A daily deep dive into CV topics, coding problems, and platform features from PixelBank. ...
Dev.to · Kareem Al-Farsi 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Building Photon: A Hybrid Ray Tracer That Builds for Windows, Linux, and macOS
Repository: Photon on GitHub Photon is a C-based adaptive hybrid ray tracer built to explore...
OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Tangram Vision and OpenCV Are Partnering to Fix Your Calibration Problems
Calibration is one of those problems every computer vision practitioner knows and knows well. Getting multi-sensor, multi-modal systems to agree on a shared vie
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 2mo ago
A reconfigurable smart camera implementation for jet flames characterization based on an optimized segmentation model
arXiv:2604.03267v1 Announce Type: cross Abstract: In this work we present a novel framework for fire safety management in industrial settings through the implem
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 2mo ago
InCaRPose: In-Cabin Relative Camera Pose Estimation Model and Dataset
arXiv:2604.03814v1 Announce Type: cross Abstract: Camera extrinsic calibration is a fundamental task in computer vision. However, precise relative pose estimati
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 2mo ago
HOIGS: Human-Object Interaction Gaussian Splatting
arXiv:2604.04016v1 Announce Type: cross Abstract: Reconstructing dynamic scenes with complex human-object interactions is a fundamental challenge in computer vi
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 2mo ago
Pickalo: Leveraging 6D Pose Estimation for Low-Cost Industrial Bin Picking
arXiv:2604.04690v1 Announce Type: cross Abstract: Bin picking in real industrial environments remains challenging due to severe clutter, occlusions, and the hig
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 2mo ago
ContextDrag: Precise Drag-Based Image Editing via Context-Preserving Token Injection and Position-Aligned Attention
arXiv:2512.08477v2 Announce Type: replace-cross Abstract: Drag-based image editing enables intuitive visual manipulation through point-based drag operations. Ex
Dev.to · pixelbank dev 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Object Detection — Deep Dive + Problem: K-Fold Cross-Validation Indices
A daily deep dive into CV topics, coding problems, and platform features from PixelBank. ...
OpenCV Blog 👁️ Computer Vision ⚡ AI Lesson 2mo ago
Glenn Jocher of Ultralytics (YOLO) Is Speaking at OSCCA
2.5 billion model inferences every day across robotics, healthcare, manufacturing, and beyond. That’s the scale at which Ultralytics YOLO operates, and at OSCC
Dev.to · beefed.ai 👁️ Computer Vision ⚡ AI Lesson 3mo ago
BVH Refit vs Rebuild Strategies for Dynamic Scenes
Compare refitting, rebuilds, and multi-level BVHs for animated scenes. Learn when to refit, rebuild, or use hybrid hierarchies to minimize stalls and
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 3mo ago
PaveBench: A Versatile Benchmark for Pavement Distress Perception and Interactive Vision-Language Analysis
arXiv:2604.02804v1 Announce Type: cross Abstract: Pavement condition assessment is essential for road safety and maintenance. Existing research has made signifi
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 3mo ago
NavCrafter: Exploring 3D Scenes from a Single Image
arXiv:2604.02828v1 Announce Type: cross Abstract: Creating flexible 3D scenes from a single image is vital when direct 3D data acquisition is costly or impracti
ArXiv cs.AI 👁️ Computer Vision 📄 Paper ⚡ AI Lesson 3mo ago
DePT3R: Joint Dense Point Tracking and 3D Reconstruction of Dynamic Scenes in a Single Forward Pass
arXiv:2512.13122v2 Announce Type: replace-cross Abstract: Current methods for dense 3D point tracking in dynamic scenes typically rely on pairwise processing, r