How to Use the Roboflow Dataset Health Check
Key Takeaways
The video demonstrates how to use Roboflow's Dataset Health Check to improve the quality of computer vision datasets, including analyzing annotation quality, class balance, and image size.
Full Transcript
Hey there, it's Joseph from Roboflow. I'm going to show you how you can use Roboflow's Dataset Health Check to get the most out of your computer vision datasets. For today's example, we're going to be walking through the hard hat workers dataset. If you want to follow along, you can find this dataset on public.roboflow.ai. If you go to public.roboflow.ai, you're presented with all these free public image datasets, and there's the hard hat workers dataset. I already have this dataset in my account, and I've pulled up the hard hat workers Dataset Health Check. Let's dive on in.
First things first: I have 7,035 images in my dataset. I don't have any missing annotations, meaning all of my image files have a matching annotation file. I also don't have any null examples: zero. A null example is an image that doesn't contain any of the objects I wanted to detect in that given image. So, for example, I don't have an image of a construction site that doesn't contain a worker, a person, or someone that doesn't have a hard hat on. I have 27,039 annotations across my roughly 7,000 images, or approximately 3.8 per image. That's pretty good feature richness across my three classes.
I also generally have pretty small images: 0.17 megapixels. My smallest one is 0.03 megapixels and my largest is 0.67. I can click on that if I want to see this itty-bitty image and its dimensions inside my dataset, and here I see that this image is 167 by 154, which is helpful.
Okay, now my class balance. Having balanced classes is important in computer vision because we want our model to learn evenly across the different objects that we want to teach it to recognize. In this dataset, I might have some problems. I have a pretty good representation of helmets: 19,747 helmet examples. I have about 6,600-some examples of heads, people without helmets. But I only have 615 examples of persons, meaning people that don't have a helmet, or just their face, instead.
Now I can zoom in and say, "Roboflow, show me every single image in my dataset that contains a person." Remember, there are 615 annotated people, but that doesn't mean that there are 615 images of people. Why? Because there could be multiple people in a single image, and in fact that's what we see here. There are 209 images, but a lot of these images have multiple people annotated, like this one. Actually, this is a really good example here, where I have all these people annotated and the hard hats on those people as well.
Okay, that's my health check. I also have the size information here. Image size matters because it helps inform the resize decision I want to make. If I resize my images to square, as most models require, my image is going to be stretched down or stretched up. In this dataset, it looks like my median image size is 500 by 333, so I might not want to go much bigger than 333. Most of the time we resize to anywhere between 300 by 300 and 640 by 640, sometimes bigger, sometimes smaller. It all depends, of course, on the context of your problem. But I wouldn't want to go much bigger than 333 pixels here for the height, because that would stretch out my pixels more than I want. The width is generally 500 pixels, so maybe a good resize decision would be 300 by 300. Again, it depends on the context of your problem.
I also see the aspect ratio here. A lot of my images are wider than they are tall. You see this line here that goes directly across: if an image is just as tall as it is wide, it is a perfect square. If it's wider than it is tall, it's a wide image, and if it's taller than it is wide, it's a tall image. I could have images that are very tall or very wide, meaning that if I stretch things to be square, it might mess up their aspect ratio. Conversely, if I preserve the aspect ratio, it might create a lot of white or black padding in my resize decision. Roboflow shows me those previews in my preprocessing steps if I want to have a look. So this is all useful information to inform my resize decision in particular.
Now, I also have the annotation heat map to understand where in my images my objects generally appear. The helmets are generally across the top of my image, the heads are also generally across the top, and persons are actually emanating from the bottom of my images. That kind of makes sense, right? So this is like a gut check: are my objects generally where I expect them to be across my images? And if I were to make a resize decision, or crop, or change the size of my objects, would they still be in the frame of my image? This gives you a very visual, quick check, surfacing all these individual annotations in one place without manually combing through them.
So that's an overview of the things that are available to you in your health check. Be sure to like and subscribe to the Roboflow channel to learn more computer vision tips. We cannot wait to see what you build using Roboflow to incorporate computer vision into your problems. Thanks so much for watching.
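The counts discussed in the walkthrough (images, null examples, annotations per image, per-class totals, and images containing a class) can be reproduced with a few lines of Python. This is a minimal sketch, not Roboflow's implementation: the function name `summarize_dataset` and the in-memory input dictionaries are hypothetical stand-ins for whatever your parsed annotation files look like, and the toy data is made up for illustration.

```python
from collections import Counter
from statistics import median


def summarize_dataset(labels_by_image, sizes_by_image):
    """Compute health-check style statistics for an object detection dataset.

    labels_by_image: dict of image name -> list of class labels (one per box).
    sizes_by_image:  dict of image name -> (width, height) in pixels.
    Both inputs are hypothetical stand-ins for parsed annotation files.
    """
    n_images = len(labels_by_image)
    class_counts = Counter()       # number of annotations per class
    images_with_class = Counter()  # number of images containing each class
    null_examples = 0              # images with no annotated objects

    for labels in labels_by_image.values():
        if not labels:
            null_examples += 1
        class_counts.update(labels)
        # set() so an image with three helmets counts once for "helmet"
        images_with_class.update(set(labels))

    total = sum(class_counts.values())
    widths = [w for w, _ in sizes_by_image.values()]
    heights = [h for _, h in sizes_by_image.values()]

    return {
        "images": n_images,
        "null_examples": null_examples,
        "annotations": total,
        "annotations_per_image": total / n_images if n_images else 0.0,
        "class_counts": dict(class_counts),
        "images_with_class": dict(images_with_class),
        "median_size": (median(widths), median(heights)) if widths else None,
    }


# Toy data, invented for illustration only.
toy_labels = {
    "a.jpg": ["helmet", "helmet", "head"],
    "b.jpg": ["person", "person", "helmet"],
    "c.jpg": [],
}
toy_sizes = {"a.jpg": (500, 333), "b.jpg": (500, 375), "c.jpg": (167, 154)}

stats = summarize_dataset(toy_labels, toy_sizes)
print(stats)
```

Keeping `class_counts` and `images_with_class` separate mirrors the point made in the video: 615 person annotations can live in only 209 images, because one image may contain several people.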
Original Description
Roboflow enables developers to use computer vision, and computer vision engineers to get the most out of their data. In this walkthrough, we show you how to use the Dataset Health Check to improve the quality of your annotations, inform resize and augmentation decisions, and ensure your data is the best it can be.
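The resize trade-off from the walkthrough (stretching to a square versus preserving aspect ratio and padding) comes down to simple arithmetic. The sketch below is an illustration under assumptions: `stretch_factors` and `letterbox` are hypothetical helper names, not part of any Roboflow API, and the 500 by 333 input is the median image size quoted in the video.

```python
def stretch_factors(width, height, target):
    """Per-axis scale factors when stretching an image to target x target.

    A factor above 1 means pixels are stretched up along that axis;
    unequal factors mean the aspect ratio is distorted.
    """
    return target / width, target / height


def letterbox(width, height, target):
    """Fit an image inside target x target while preserving aspect ratio.

    Returns (new_width, new_height, pad_width, pad_height), where the pad
    values are the total number of padding pixels needed along each axis.
    """
    scale = min(target / width, target / height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    return new_w, new_h, target - new_w, target - new_h


# Median image size from the video: 500 wide by 333 tall.
for target in (300, 640):
    sx, sy = stretch_factors(500, 333, target)
    print(target, "stretch:", round(sx, 2), round(sy, 2),
          "letterbox:", letterbox(500, 333, target))
```

For a 300 by 300 target, both stretch factors stay below 1, so the image is only shrunk. For 640 by 640, the height would be scaled up by almost a factor of two, which is the over-stretching the video warns about.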