How to Use the Roboflow Dataset Health Check

Roboflow · Intermediate · Computer Vision · 5y ago

Key Takeaways

The video demonstrates how to use Roboflow's Dataset Health Check to improve the quality of computer vision datasets, including analyzing annotation quality, class balance, and image size.

Full Transcript

Hey there, it's Joseph from Roboflow. I'm gonna show you how you can use Roboflow's Dataset Health Check to get the most out of your computer vision datasets.

For today's example, we're gonna be walking through the hard hat workers dataset. If you want to follow along, you can find this dataset on public.roboflow.ai. If you go to public.roboflow.ai, you're presented with all these free public image datasets, and there's the hard hat workers dataset. I already have this dataset in my account, and I've pulled up the hard hat workers Dataset Health Check. Let's dive on in.

First things first: I have 7,035 images in my dataset. I don't have any missing annotations, meaning all of my image files have a matching annotation file. I also don't have any null examples: zero null examples. A null example is when I have an image that doesn't contain any of the objects that I wanted to detect in that given image. So, for example, I don't have an image of a construction site that doesn't contain a worker, a person, or someone that doesn't have a hard hat on. I have 27,039 annotations across my 7,000 images, or approximately 3.8 per image. That's good. I mean, it's pretty good feature richness across my three classes.

I also generally have pretty small images, 0.17 megapixels. My smallest one is 0.03 megapixels; my largest is 0.67. I can actually click on that if I want to see this itty-bitty small image and see its dimensions inside my dataset, and here I see this image that is 167 by 154, which is helpful.

Okay, now my class balance. Having balanced classes is important in computer vision because we want our model to learn evenly across the different objects that we want to teach it to recognize. In this dataset I might have some problems. I have a pretty good representation of helmets: 19,747 helmet examples. I have about 6,600-some examples of heads, people without helmets. But I only have 615 examples of persons, meaning people that don't have a helmet or just their face instead.

Now I can zoom in and say, "Roboflow, show me every single image in my dataset that contains a person." Remember, there are 615 annotated people, but that doesn't mean that there are 615 images of people. Why? Because there could be multiple people in a single image, and in fact that's what we see here: there are 209 images, but a lot of these images have multiple people annotated, like this one. Or actually, this is a really good example here, where I have all these people annotated and the hard hats around those people as well.

Okay, that's my health check. I also have the size information here. Image size matters because it helps inform the resize decision I want to make. If I resize my images to square, as most models require, my image is gonna be stretched down or stretched up, and it kind of depends. In this dataset it looks like my median size is 500 by 333, so I might not want to go much bigger than 333. Most of the time we resize anywhere between 300 by 300 and 640 by 640, or somewhere in between, sometimes bigger, sometimes smaller. It all depends, of course, on the context of your problem. But I wouldn't want to go much bigger than 333 pixels here for the height, because that would stretch out my pixels more than I want. The width is generally 500 pixels, so maybe a good resize decision would be 300 by 300. It kind of depends, again, on the context of your problem.

I also see the aspect ratio here: a lot of my images are wider than they are square. In fact, you see I have this line here that goes directly across. If an image is just as tall as it is wide, that means it is a perfect square. If it's wider than it is tall, then it's a wide image, and if it's taller than it is wide, then it's a tall image. I could have images that are very tall or very wide, meaning if I stretch things to be square it might mess up their aspect ratio. But conversely, if I preserve the aspect ratio, it might create a lot of white or black padding in my resize decision. Roboflow shows me those previews in my preprocessing steps if I want to have a look. So this is all useful information to inform my resize decision in particular.

Now, I also have the annotation heat map to understand generally where in my images my objects are appearing. The helmets are generally across the top of my image, the heads are also generally across the top, and persons are actually emanating from the bottom of my images. That kind of makes sense, right? So this is like a gut check: are my objects generally where I expect them to be across my images? And if I were to make a resize decision, or crop, or change the size of my objects, would they be in positions where they're still in the frame of my image? This gives you a very visual, quick check without manually combing through and surfacing all these individual annotations, in a quick one-stop way.

So that's kind of an overview of the things that are available to you in your health check. Be sure to like and subscribe to the Roboflow channel to learn more computer vision tips. We cannot wait to see what you build using Roboflow to incorporate computer vision into your problems. Thanks so much for watching.
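The dataset-level numbers quoted in the walkthrough (annotations per image, annotations per class, and the number of images that contain a given class) can be approximated outside of Roboflow. The following is a minimal sketch, not Roboflow's implementation. It assumes the annotations have already been loaded into a list of (image_id, class_name) pairs, one pair per bounding box. The function name summarize_annotations and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def summarize_annotations(image_ids, annotations):
    """Compute basic dataset statistics.

    image_ids:   iterable of every image identifier in the dataset.
    annotations: list of (image_id, class_name) pairs, one per bounding box.
    """
    image_ids = set(image_ids)
    boxes_per_class = Counter(cls for _, cls in annotations)

    images_per_class = defaultdict(set)
    annotated_images = set()
    for image_id, cls in annotations:
        images_per_class[cls].add(image_id)
        annotated_images.add(image_id)

    return {
        "images": len(image_ids),
        "annotations": len(annotations),
        "avg_annotations_per_image": (
            len(annotations) / len(image_ids) if image_ids else 0.0
        ),
        # Images with no boxes at all. This sketch cannot tell a null
        # example apart from an image whose annotation file is missing.
        "images_without_annotations": len(image_ids - annotated_images),
        "boxes_per_class": dict(boxes_per_class),
        "images_per_class": {c: len(s) for c, s in images_per_class.items()},
    }


# Toy data, made up for illustration only.
toy_images = ["a.jpg", "b.jpg", "c.jpg"]
toy_annotations = [
    ("a.jpg", "helmet"), ("a.jpg", "helmet"), ("a.jpg", "person"),
    ("b.jpg", "head"), ("b.jpg", "person"), ("b.jpg", "person"),
]

stats = summarize_annotations(toy_images, toy_annotations)
print(stats)
```

In the toy data, the person class has three boxes spread over two images, which is the same distinction the video draws between 615 annotated people and 209 images containing a person.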

Original Description

Roboflow enables developers to use computer vision, and computer vision engineers to get the most out of their data. In this walkthrough, we show you how to use the Dataset Health Check to improve the quality of your annotations, inform resize and augmentation decisions, and ensure your data is the best it can be.

Playlist

Uploads from Roboflow · Roboflow · 4 of 60

1. YOLOv3 PyTorch Notebook Tutorial
2. How to Train YOLOv4 on a Custom Dataset (PyTorch)
3. How to Train YOLOv5 on a Custom Dataset
4. How to Use the Roboflow Dataset Health Check
5. What is Mean Average Precision (mAP)?
6. How to Use the Roboflow Model Library
7. How to Train EfficientDet in TensorFlow 2 Object Detection
8. How to Train YOLO v4 Tiny (Darknet) on a Custom Dataset
9. Ask the Roboflow Team Anything - Episode 1
10. Exploring The COCO Dataset
11. Community Spotlight: Improving Uno with Computer Vision
12. Mosaic Data Augmentation - Deep Dive
13. Hands on with the OAK-1
14. Glenn Jocher: What is New in YOLO v5?
15. How to Use Amazon Rekognition Custom Labels and Roboflow to Build an Object Detection Model
16. An Interview with Brandon Gilles, Luxonis Founder and OAK Chief Architect
17. How to Train a Custom Mobile Object Detection Model (with YOLOv4 Tiny and TensorFlow Lite)
18. Tackling the Small Object Problem in Object Detection
19. Fast.ai v2 Released - What's New?
20. Teaser: Roboflow Train (1-Click Computer Vision AutoML)
21. How to Train a Custom Resnet34 Image Classification Model
22. How to Label Images for Object Detection with CVAT
23. Deploy YOLOv5 to Jetson Xavier NX at 30 FPS
24. Elisha Odemakinde Hosts Roboflow ML Engineer, Jacob Solawetz
25. Getting Started with VoTT - Computer Vision Annotation
26. How to Manage Classes in Object Detection (Rename, Combine, Balance)
27. How to Train YOLOv4 on a Custom Dataset in Darknet
28. Is Grayscale a Preprocessing or Augmentation Step in Computer Vision?
29. Getting Started with Image Data Augmentation
30. Glenn Jocher: Image Augmentation in YOLO v5 and Beyond
31. GA Hosts Roboflow - Healthcare and AI
32. How do self driving cars know when to stop?
33. What is PASCAL VOC XML?
34. AutoML Showdown: Google vs Amazon vs Microsoft
35. How is computer vision changing manufacturing?
36. The Alphabet in American Sign Language
37. Luxonis OAK-D: Computer Vision on Device
38. How to Train a Custom Faster R-CNN Model with Facebook AI's Detectron2 | Use Your Own Dataset
39. TensorFlow vs PyTorch: Fireside
40. Occlusion Techniques in Computer Vision
41. A Customizable Web Application for Your Computer Vision Model
42. Model Tradeoffs and the Future of Computer Vision
43. Designing an Augmented Reality Board Game App
44. YOLOv4 - Advanced Tactics
45. How to Use CreateML and Build a Computer Vision iPhone App | AR Object Detection
46. Fireside Chat: Computer Vision in Agriculture
47. Scaled-YOLOv4 Tops EfficientDet: Research Rundown
48. What is Image Preprocessing?
49. Building a Community of Creators with BlkArthouse and Von Deon
50. How to Train Scaled-YOLOv4 to Detect Custom Objects
51. Intro to Computer Vision: Fireside
52. The Best Way to Annotate Images for Object Detection
53. The Computer Vision Process: Fireside
54. How to Annotate Images with Your Team Using Roboflow
55. Introducing the Roboflow Object Count Histogram
56. How Fast is the M1 at Machine Learning? Benchmarking Apple's M1 and Intel's Chips
57. CLIP: OpenAI's amazing new zero-shot image classifier
58. How I hacked my Nest camera to run custom models
59. Getting Started with the Roboflow Inference API
60. Transfer Learning in Computer Vision | What, How, Why

The video teaches how to use Roboflow's Dataset Health Check to analyze and improve computer vision datasets, covering topics such as annotation quality, class balance, and image size. This is important because high-quality datasets are essential for training accurate computer vision models. By following the steps outlined in the video, viewers can learn how to use the Dataset Health Check to identify and address issues in their own datasets.
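To make the resize discussion concrete, the sketch below computes the median width and height of a set of images and compares two ways of fitting an image into a square target: stretching, which distorts the aspect ratio, and scaling with padding, which preserves it. This is an illustration under assumptions rather than Roboflow's method. Image sizes are assumed to be available as (width, height) tuples, the toy sizes are made up, and median_size, orientation, stretch_factors, and fit_with_padding are hypothetical helper names.

```python
from statistics import median


def median_size(sizes):
    """Return (median_width, median_height) for a list of (w, h) tuples."""
    return median(w for w, _ in sizes), median(h for _, h in sizes)


def orientation(width, height):
    """Classify an image as 'square', 'wide', or 'tall'."""
    if width == height:
        return "square"
    return "wide" if width > height else "tall"


def stretch_factors(width, height, target):
    """Per-axis scale factors when stretching to target x target.

    A factor above 1.0 means that axis is upsampled (pixels are stretched).
    """
    return target / width, target / height


def fit_with_padding(width, height, target):
    """Scale to fit inside target x target while keeping the aspect ratio.

    Returns (new_width, new_height, pad_x, pad_y), where pad_x and pad_y are
    the total padding in pixels needed on each axis to fill the square.
    """
    scale = target / max(width, height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    return new_w, new_h, target - new_w, target - new_h


# Toy sizes, made up for illustration only.
toy_sizes = [(500, 333), (500, 333), (167, 154), (640, 480), (333, 500)]

med_w, med_h = median_size(toy_sizes)
print("median size:", med_w, "x", med_h)
print("orientation:", orientation(med_w, med_h))
print("stretch to 300:", stretch_factors(med_w, med_h, 300))
print("stretch to 640:", stretch_factors(med_w, med_h, 640))
print("fit to 300 with padding:", fit_with_padding(med_w, med_h, 300))
```

With a 500 by 333 image, stretching to 300 by 300 shrinks both axes, while stretching to 640 by 640 nearly doubles the height. Keeping the aspect ratio at 300 instead leaves about 100 pixels of vertical padding.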

Key Takeaways
  1. Upload a dataset to Roboflow
  2. Run the Dataset Health Check
  3. Analyze annotation quality and class balance
  4. Evaluate image size and aspect ratio
  5. Use the annotation heat map to understand object placement
  6. Make informed resize decisions based on the analysis
💡 The Dataset Health Check provides a comprehensive overview of a computer vision dataset, allowing users to identify and address issues that could impact model performance.
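A rough stand-in for the annotation heat map is to bin bounding-box centers onto a coarse grid and count how many land in each cell. The sketch below does only that. How Roboflow actually computes its heat map is not described in the video, so treat this as an assumption-laden approximation. It assumes box centers are normalized to the range 0 to 1 with y = 0 at the top of the image (as in YOLO-style labels), and center_heatmap is a hypothetical function name.

```python
def center_heatmap(centers, grid_size=4):
    """Count box centers per grid cell.

    centers: iterable of (x_center, y_center), each normalized to [0, 1],
             with y = 0 at the top of the image.
    Returns a list of rows, where grid[row][col] is the number of centers
    that fall in that cell.
    """
    grid = [[0] * grid_size for _ in range(grid_size)]
    for x, y in centers:
        # Clamp so that a coordinate of exactly 1.0 lands in the last cell.
        col = min(int(x * grid_size), grid_size - 1)
        row = min(int(y * grid_size), grid_size - 1)
        grid[row][col] += 1
    return grid


# Toy centers, made up for illustration: three near the top, one near the bottom.
toy_centers = [(0.2, 0.1), (0.5, 0.15), (0.8, 0.05), (0.5, 0.9)]

heatmap = center_heatmap(toy_centers, grid_size=2)
for row in heatmap:
    print(row)
```

Running one heat map per class would show the kind of pattern described in the video, with helmets and heads concentrated near the top of the frame and persons nearer the bottom.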
