Foundations
Computer Vision
Object detection, segmentation, YOLO, CLIP, and vision-language models

Dev.to · BMBrick
👁️ Computer Vision
⚡ AI Lesson
1mo ago
How I Built a Perceptual Color Quantization Engine for LEGO Mosaics
The Problem Converting a photo into a LEGO mosaic sounds simple: resize the image, find...

Medium · LLM
👁️ Computer Vision
⚡ AI Lesson
1mo ago
Unified Video Action (UVA) Model
Seminar #5 (Paper review)

Medium · Deep Learning
👁️ Computer Vision
⚡ AI Lesson
1mo ago
CNN vs Vision Transformer on CIFAR-10: A Beginner-Friendly Experiment
Why I wrote this experiment

Medium · Deep Learning
👁️ Computer Vision
⚡ AI Lesson
1mo ago
From Pixels to Predictions: How CNNs Actually Work
Understanding how Convolutional Neural Networks transform raw pixel data into intelligent predictions.

Dev.to · 𝗔𝗷𝗮𝘆 𝗦𝗼𝗻𝗶
👁️ Computer Vision
⚡ AI Lesson
1mo ago
VXN-RAMNet (VisionX Routine Adaptive Memory Network)
What if navigation systems could remember routes visually instead of depending entirely on...

Medium · AI
👁️ Computer Vision
⚡ AI Lesson
1mo ago
Computer Vision Is Rebuilding the Fitting Room
The models, the stack, the ROI: no fluff

Medium · Data Science
👁️ Computer Vision
⚡ AI Lesson
1mo ago
Why Most Tools Fail at Table Extraction (And How I Built a Vision-First Solution)
Conquering the nightmare of Borderless, Scanned, and Merged-Cell Tables with a Hybrid AI Pipeline

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
Intelligent CCTV for Urban Design: AI-Based Analysis of Soft Infrastructure at Intersections
arXiv:2605.05402v1 Announce Type: new Abstract: Artificial intelligence (AI) and computer vision are transforming transportation data collection. This study int

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
MolRecBench-Wild: A Real-World Benchmark for Optical Chemical Structure Recognition
arXiv:2605.05832v1 Announce Type: new Abstract: Optical Chemical Structure Recognition (OCSR) aims to translate molecular diagrams in scientific literature into

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
Tamaththul3D: High-Fidelity 3D Saudi Sign Language Avatars from Monocular Video
arXiv:2605.05367v1 Announce Type: cross Abstract: Arabic Sign Language (ArSL) and its dialects serve approximately 400 million Arabic speakers worldwide, yet th

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
CFE-PPAR: Compression-friendly encryption for privacy-preserving action recognition leveraging video transformers
arXiv:2605.05692v1 Announce Type: cross Abstract: Privacy-preserving action recognition (PPAR) enables machines to understand human activities in videos without

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
The autoPET3 Challenge -- Automated Lesion Segmentation in Whole-Body PET/CT - Multitracer Multicenter Generalization
arXiv:2605.05775v1 Announce Type: cross Abstract: We report the design and results of the third autoPET challenge (MICCAI 2024), which benchmarked automated les

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
VideoRouter: Query-Adaptive Dual Routing for Efficient Long-Video Understanding
arXiv:2605.05848v1 Announce Type: cross Abstract: Video large multimodal models increasingly face a scalability bottleneck: long videos produce excessively long

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
iPhoneBlur: A Difficulty-Stratified Benchmark for Consumer Device Motion Deblurring
arXiv:2605.05990v1 Announce Type: cross Abstract: Motion blur restoration on consumer mobile devices is typically evaluated using aggregate metrics that obscure

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
Adding Thermal Awareness to Visual Systems in Real-Time via Distilled Diffusion Models
arXiv:2605.06010v1 Announce Type: cross Abstract: Purely RGB-based vision models often fail to provide reliable cues in challenging scenarios such as nighttime

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
Dynamic Pondering Sparsity-aware Mixture-of-Experts Transformer for Event Stream based Visual Object Tracking
arXiv:2605.06112v1 Announce Type: cross Abstract: Despite significant progress, RGB-based trackers remain vulnerable to challenging imaging conditions, such as

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
Autoregressive Visual Generation Needs a Prologue
arXiv:2605.06137v1 Announce Type: cross Abstract: In this work, we propose Prologue, an approach to bridging the reconstruction-generation gap in autoregressive

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
ActCam: Zero-Shot Joint Camera and 3D Motion Control for Video Generation
arXiv:2605.06667v1 Announce Type: cross Abstract: For artistic applications, video generation requires fine-grained control over both performance and cinematogr

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
Multi-Scale Spectral Attention Module-based Hyperspectral Segmentation in Autonomous Driving Scenarios
arXiv:2506.18682v2 Announce Type: replace-cross Abstract: Recent advances in autonomous driving (AD) have highlighted the potential of hyperspectral imaging (HS

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
SoccerMaster: A Vision Foundation Model for Soccer Understanding
arXiv:2512.11016v2 Announce Type: replace-cross Abstract: Soccer understanding has recently garnered growing research interest due to its domain-specific comple

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
CSMCIR: CoT-Enhanced Symmetric Alignment with Memory Bank for Composed Image Retrieval
arXiv:2601.03728v2 Announce Type: replace-cross Abstract: Composed Image Retrieval (CIR) enables users to search for target images using both a reference image

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
DC-DiT: Adaptive Compute and Elastic Inference for Visual Generation via Dynamic Chunking
arXiv:2603.06351v2 Announce Type: replace-cross Abstract: Diffusion Transformers rely on static patchify tokenization, assigning the same token budget to smooth

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
ChArtist: Generating Pictorial Charts with Unified Spatial and Subject Control
arXiv:2603.14209v2 Announce Type: replace-cross Abstract: A pictorial chart is an effective medium for visual storytelling, seamlessly integrating visual elemen

Medium · Programming
👁️ Computer Vision
⚡ AI Lesson
1mo ago
Number systems conversion for dummies
There are four widely used number systems: decimal (10), binary (2), octal (8), and hexadecimal (16). As humans, we use the decimal system.

Medium · Python
👁️ Computer Vision
⚡ AI Lesson
1mo ago
A Practical Guide to Optimizing Digital Image Lighting with Python
Why Is Lighting Crucial? Have you ever taken a photo in low-light conditions and found the result so dark that…

Medium · Python
👁️ Computer Vision
⚡ AI Lesson
1mo ago
Efficiency vs. Precision: A Python Deep Dive into Faster R-CNN and SSD PyTorch
In the rapidly evolving landscape of artificial intelligence, selecting the optimal architecture for computer vision is rarely a simple…

Medium · Deep Learning
👁️ Computer Vision
⚡ AI Lesson
1mo ago
Computer Vision Fundamentals: CNN Architectures
The landmark designs that shaped modern computer vision, from LeNet to EfficientNet.

Medium · Python
👁️ Computer Vision
⚡ AI Lesson
1mo ago
What If You Could Find Films by How They Feel Visually?
There’s a scene in Life of Pi where the ocean at night fills with bioluminescent green light. The whole frame glows. It’s one of the most…

Medium · Machine Learning
👁️ Computer Vision
⚡ AI Lesson
1mo ago
Building a Motorcycle Rider Helmet Detection System Using YOLOv8
Traffic safety is an important issue, especially for motorcycle riders.

Medium · Machine Learning
👁️ Computer Vision
⚡ AI Lesson
1mo ago
Part 1: From Fish Classification to Vision Transformers: How Machines Learned to See

Medium · Python
👁️ Computer Vision
⚡ AI Lesson
1mo ago
Exploring Edge Detection in Digital Images Using Python
Introduction

Medium · Python
👁️ Computer Vision
⚡ AI Lesson
1mo ago
From Pixels to AI: How Computers Understand an Image
“A simple exploration of how digital images are turned into information that Artificial Intelligence can understand.”…

Medium · Python
👁️ Computer Vision
⚡ AI Lesson
1mo ago
Teaching a Random Forest to Tell Walking from Running: A Computer Vision Pipeline with Hand-Built...
How a 56-feature baseline became a 240-feature classifier at 86% accuracy, with per-class SHAP guiding every feature engineering decision.

Medium · Deep Learning
👁️ Computer Vision
⚡ AI Lesson
1mo ago
Vision Transformers Under Extreme Latency: Particle Tracking at the LHC
Particle physics has always been a data problem disguised as a physics problem and the LHC is now pushing us to rethink tracking as a…

Medium · Deep Learning
👁️ Computer Vision
⚡ AI Lesson
1mo ago
How Your Phone Unlocks in the Dark With Your Face
Thirty thousand invisible dots, a neural engine, and some surprisingly elegant geometry, all in the time it takes you to glance at your…

Medium · Data Science
👁️ Computer Vision
⚡ AI Lesson
1mo ago
An Easy Way to Detect Image Edges Using the Sobel Algorithm in Python
In the world of Computer Vision, edge detection is one of the fundamental techniques used to identify…

Medium · Deep Learning
👁️ Computer Vision
⚡ AI Lesson
1mo ago
Implementing YOLO26 for Oil Palm Health Detection from Digital Images
Indonesia is one of the world's largest palm oil producers. According to the Palm Oil Trade Performance Analysis report…

Medium · Python
👁️ Computer Vision
⚡ AI Lesson
1mo ago
Getting to Know Canny Edge Detection in Digital Image Processing with Python and OpenCV
In digital image processing, detecting the boundaries of an object is very important.

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
Modeling Subjective Urban Perception with Human Gaze
arXiv:2605.00764v1 Announce Type: cross Abstract: Urban perception describes how people subjectively evaluate urban environments, shaping how cities are experie

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
StableI2I: Spotting Unintended Changes in Image-to-Image Transition
arXiv:2605.04453v1 Announce Type: cross Abstract: In most real-world image-to-image (I2I) scenarios, existing evaluations primarily focus on instruction followi

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
Example-Based Object Detection
arXiv:2605.04501v1 Announce Type: cross Abstract: In recent years, object detection has achieved significant progress, especially in the field of open-vocabular

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
Efficient Geometry-Controlled High-Resolution Satellite Image Synthesis
arXiv:2605.04557v1 Announce Type: cross Abstract: High-resolution satellite images are often scarce and costly, especially for remote areas or infrequent events

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
From Diffusion to Rectified Flow: Rethinking Text-Based Segmentation
arXiv:2605.04590v1 Announce Type: cross Abstract: Text-based image segmentation aims to delineate object boundaries within an image from text prompts, offering

ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
1mo ago
Reference-based Category Discovery: Unsupervised Object Detection with Category Awareness
arXiv:2605.04606v1 Announce Type: cross Abstract: Traditional one-shot detection methods have addressed the closed-set problem in object detection, but the high

DeepCamp AI