Occlusion Techniques in Computer Vision

Roboflow · Intermediate · 👁️ Computer Vision · 5y ago

Key Takeaways

This video discusses occlusion techniques in computer vision, including cutout, random erase, grid mask, cutmix, hide and seek, and mosaic augmentations, and demonstrates how to apply these techniques using the Roboflow platform to build more robust object detection models.

Full Transcript

This is Jacob from Roboflow. Today we're going to talk about occlusion techniques in computer vision. First, we're going to talk about why you need occlusion techniques to build better computer vision models. Then we're going to talk about where state-of-the-art research is on occlusion techniques. After that, we're going to show you how to get hands-on with your own dataset and use occlusion techniques to make your own model better.

So, diving into why we need occlusion techniques: computer vision models are like all machine learning models. That is, they often fine-tune and overfit to a specific training dataset, and then they're not able to generalize that well when they get into the wild. In this graph we can see an example of how that might work mathematically. We have some noisy data, which represents the way a task might manifest itself in the world, and our model is fitting this data very tightly, so it's overfit to a very specific function. Once you go into the wild, things are going to manifest themselves differently than they did in the training set, so it's not going to be easy for the model to generalize when it gets into a production environment.

Oftentimes you'll see curves like this during training, which show that you're suffering from overfitting. Your training loss is going down, so the model is learning the task well, but the validation loss has a kink in it where it starts going up. That means the model is overfitting to the training set and is no longer learning ways to generalize outside of it.

So what do occlusion techniques do for you? Occlusion techniques are a specific way to combat this. They are designed to hide a part of the image, and hiding part of the image during training means that your model is going to learn around the hidden areas. For example, here we might have a model that predicts cat or dog based on a photo. If we look at the CAM, the class activation map, for this model, we might see that all of the predictions are happening at the dog's head. In this image you can see that the prediction "dog" basically all relies on the dog's head. That's not such a good thing, because what if this dog was hidden behind a bush or something? The model wouldn't be able to predict dog or cat, because it's already overfit to this very specific part of the dog's face. So we might use an occlusion technique to hide where the dog is.

Now, diving into the different occlusion techniques that are out there. The first one is random erase. It's basically just taking a piece of the image and erasing it. This is applied randomly, and you replace the rectangle with noise rather than the original pixels. A similar technique is cutout, where a rectangle is cut out of the image. The difference in cutout, in the original paper at least, is that they're only hiding those pixels from the first layer of the network, so they're allowing the later layers to see the area that was cut out. Another one is called hide and seek. This one draws a grid over the image and then randomly hides some of the cells in the grid. This one's called grid mask. Grid mask does a similar thing, except it's not probabilistically hiding areas of the grid.

Those were the predecessor occlusion techniques, and now there are some even better ones coming out. Among the best occlusion techniques, there's this one, CutMix. With CutMix you're not just occluding: you're cutting out a piece and then putting in a piece from a different image. This accomplishes the same goal as occluding, but it also teaches the model to recognize objects and classes within different environments, so even the context can be different.

Taking it one step further is the augmentation technique called mosaic. Mosaic is not technically an occlusion technique per se, but it accomplishes similar things. A mosaic stitches four images together and looks at them in different places in the image. This accomplishes a few things that help with regularization, that is, with avoiding overfitting to the training set. It teaches the model to recognize objects in different locations, it teaches the model to handle objects that are slightly occluded at the edge, and it teaches the model to learn in different contexts, so it doesn't only see the same surroundings. It has various surroundings in the mosaic data augmentation. All of those things are pretty powerful, and that is a new state-of-the-art augmentation technique.

So now we're going to go hands-on with an occlusion example, and here we're going to jump into the Roboflow platform. Let's say we had a chess dataset and we wanted to build an object detection model to identify chess pieces in an image. You would start by getting a dataset. We have this dataset that's public on Roboflow, at public.roboflow.com, and you can see these images here where the chess pieces are labeled. You might have nice pieces like this that are not occluded, so this king is very visible and has nothing in the way. But in this case you might have a pawn that is slightly blocked. If you're training only on pawns that are not blocked, and then you get into a scenario afterwards where the pawns are blocked, the model might have overfit to a specific part of that pawn. To avoid this, you're going to want to experiment with occlusion techniques to make your model even more resilient.

One way you can do that is by adding an augmentation step. Here are the augmentation steps within the Roboflow platform, and you add an augmentation step here. In particular, we'll look at the cutout one, which is what we were talking about, where those rectangles were hidden from the image. We have a slightly different implementation here, where we're randomly cutting out different pieces, and you can add more than one cutout to your training set. Here are some previews of how it might work. You can add more cutouts, you can make them bigger, and then it'll randomly generate these cutouts over your images. You can add a lot of these and crank the augmentations up, so you get a bunch of occluded images to send through training.

To make a dataset version, we'll just go ahead and hit Generate, and I'll say this one is just called "occluded".

So that's basically getting hands-on with occlusion techniques to make dataset versions that are even more robust, and models that are even better afterwards. Then you can export anywhere you want for training, and you can use one-click training here too, to see how these occlusion techniques are making your models even better.

That's all for today. Looking forward to seeing you guys next time. Remember to like and subscribe, and thanks so much for watching. See ya.
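The rectangle-hiding idea behind cutout and random erase, with the "more cutouts" and "bigger cutouts" knobs from the demo, can be sketched in a few lines of NumPy. This is a minimal illustration, not Roboflow's implementation or the code from either paper. The function name `occlude` and the parameters `num_boxes`, `box_frac`, and `fill` are made up for this example.

```python
import numpy as np


def occlude(image, num_boxes=3, box_frac=0.2, fill="black", rng=None):
    """Return a copy of `image` (H x W or H x W x C, uint8) with random rectangles hidden.

    fill="black" zeroes the pixels (cutout-style).
    fill="noise" overwrites them with uniform random pixels (random-erase-style).
    All names here are hypothetical, chosen for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = image.copy()
    h, w = out.shape[:2]
    box_h = max(1, int(h * box_frac))
    box_w = max(1, int(w * box_frac))
    for _ in range(num_boxes):
        # Pick the top-left corner so the whole box stays inside the image.
        top = int(rng.integers(0, h - box_h + 1))
        left = int(rng.integers(0, w - box_w + 1))
        if fill == "noise":
            patch = rng.integers(
                0, 256, size=(box_h, box_w) + out.shape[2:], dtype=np.uint8
            )
        else:
            patch = 0
        out[top:top + box_h, left:left + box_w] = patch
    return out


# Usage: two black cutouts, each a quarter of the image's height and width.
image = np.full((100, 100, 3), 255, dtype=np.uint8)
cut = occlude(image, num_boxes=2, box_frac=0.25, rng=np.random.default_rng(0))
```

Because the rectangles only overwrite pixels, the bounding-box labels of an object detection dataset can stay as they are. Whether a heavily covered object should keep its label is a separate judgment call that this sketch does not address.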

Original Description

Build better CV models with occlusion - cutout, random erase, grid mask, cutmix, hide and seek, and mosaic augmentations are discussed in this video.
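Of the techniques listed, CutMix is the one that replaces the hidden region with pixels from a second image. A rough sketch for the image classification case follows. It assumes one-hot labels and a fixed patch size, whereas the published method samples the patch size randomly, so treat it as a simplification. The names `cutmix` and `patch_frac` are hypothetical.

```python
import numpy as np


def cutmix(image_a, label_a, image_b, label_b, patch_frac=0.5, rng=None):
    """Paste a random rectangle of image_b into image_a and mix the labels by area.

    Both images must have the same shape. Labels are one-hot vectors.
    Simplified: patch_frac is fixed here instead of being sampled per call.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image_a.shape[:2]
    patch_h = max(1, int(h * patch_frac))
    patch_w = max(1, int(w * patch_frac))
    top = int(rng.integers(0, h - patch_h + 1))
    left = int(rng.integers(0, w - patch_w + 1))

    mixed = image_a.copy()
    mixed[top:top + patch_h, left:left + patch_w] = (
        image_b[top:top + patch_h, left:left + patch_w]
    )

    # Share of pixels that still come from image_a.
    keep = 1.0 - (patch_h * patch_w) / (h * w)
    mixed_label = (
        keep * np.asarray(label_a, dtype=float)
        + (1.0 - keep) * np.asarray(label_b, dtype=float)
    )
    return mixed, mixed_label


# Usage: a dark "cat" image receives a patch from a bright "dog" image.
cat = np.zeros((10, 10, 3), dtype=np.uint8)
dog = np.full((10, 10, 3), 255, dtype=np.uint8)
mixed, label = cutmix(cat, [1, 0], dog, [0, 1], rng=np.random.default_rng(0))
```

For object detection, the labels would be boxes rather than a class vector, so the pasted region's boxes would need to be carried over and clipped instead. That case is not covered by this sketch.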

This video teaches you how to use occlusion techniques to improve the robustness of your computer vision models, and demonstrates how to apply these techniques using the Roboflow platform. By applying occlusion techniques, you can build models that are more resilient to overfitting and better generalize to new, unseen data.

Key Takeaways

Import a dataset into Roboflow
Add an augmentation step to the dataset
Select an occlusion technique, such as cutout or random erase
Configure the occlusion technique, such as setting the size and number of cutouts
Generate augmented images using the occlusion technique
Train a model using the augmented images
Evaluate the model's performance on occluded images
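The last step, evaluating on occluded images, can be approximated by scoring the same model on a clean set and on a deliberately occluded copy of it. The sketch below uses plain classification accuracy as a stand-in. For an object detection model you would compare a detection metric such as mAP instead. `predict` is a hypothetical callable that maps an image to a class id, and all other names are made up for this example.

```python
import numpy as np


def accuracy(predict, images, labels):
    """Fraction of images for which predict(image) equals the label."""
    hits = sum(int(predict(img) == lab) for img, lab in zip(images, labels))
    return hits / len(labels)


def hide_center(image, frac=0.5):
    """Return a copy of `image` with a centered black box spanning `frac` of each side."""
    out = image.copy()
    h, w = out.shape[:2]
    box_h, box_w = int(h * frac), int(w * frac)
    top, left = (h - box_h) // 2, (w - box_w) // 2
    out[top:top + box_h, left:left + box_w] = 0
    return out


def occlusion_gap(predict, images, labels, frac=0.5):
    """Return (clean accuracy, accuracy on the occluded copies)."""
    clean = accuracy(predict, images, labels)
    occluded = accuracy(predict, [hide_center(img, frac) for img in images], labels)
    return clean, occluded


# Toy usage: a "model" that only looks at the center pixel, so hiding the
# center breaks it on the bright image.
bright = np.full((8, 8, 3), 255, dtype=np.uint8)
dark = np.zeros((8, 8, 3), dtype=np.uint8)
center_only = lambda img: int(img[4, 4, 0] > 127)
clean_acc, occluded_acc = occlusion_gap(center_only, [bright, dark], [1, 0])
```

A large drop from the clean score to the occluded score suggests the model depends on one region of the object, which is the situation occlusion augmentations are meant to reduce.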

💡 Occlusion techniques simulate real-world scenes in which objects are partially hidden, so a model trained with them is less likely to rely on a single part of an object.
