Getting Started with Image Data Augmentation
Key Takeaways
The video demonstrates how to use the Roboflow platform for image data augmentation and pre-processing to improve the performance of a computer vision model on an aerial maritime drone dataset. It showcases techniques such as auto-orienting, resizing, random crop, flipping, rotating, brightness adjustment, mosaic, and tiling, along with filtering the classes being modeled.
Full Transcript
Hey everyone, this is Jacob from Roboflow. Today I'm going to show you how to use image-level augmentations and image pre-processing to build better computer vision models. The best thing about our tutorial today is that it requires no code, so you can easily get started building your computer vision model with the Roboflow platform.
To get started, I'm going to introduce the dataset that we're going to be working on today. We have a pretty sparse dataset, as gathering data and annotating data is expensive. Our example here is a public dataset, an aerial dataset taken via a drone as it flies over a lake. This is the aerial maritime drone dataset, and it's public on Roboflow, so I'll link it below if you want to replicate the steps that we take in this tutorial and see how things work on your end, with your own hands on the data. But you could bring in any dataset here.
A little bit more about this dataset: here's a little GIF of the CTO of Roboflow, Brad, flying a drone over the lake and taking these images, and here's a picture of what the images look like. You can see we've taken pictures of boats on the water, and we've also annotated things like docks and jet skis, which are all present in our dataset. So this is the dataset we're going to be using. If you want to use it, you can just click Fork Dataset to bring it into your own account. I've already done that and downloaded it locally, but now I'll show you an example of how you can bring your dataset in and start augmenting and pre-processing by loading your dataset into the platform.
All you need to do is go over here and hit Sign In. We'll sign in and create a new dataset. We'll call it aerial youtube, and we'll just have the annotation group called objects, and
we'll go ahead and create the dataset. Now, here is where you would drop in your images and annotations to get started on augmenting and pre-processing your images. I'll hit Select Files, and I've got the dataset here in COCO JSON format. We support pretty much any format, so you should be able to just drag and drop any dataset format in here and get going right away. Here you can see we've loaded in our images, and now we're going to move on to the dataset versioning step. We'll click Finish Upload, and this will split the dataset into train, valid, and test. Today we're going to be showing metrics on the validation set, and we'll be doing our training jobs based on the training set. We'll hit Continue, and this will upload the data into the Roboflow platform.
Before we get started on trying to model this dataset, we want to get a feel for the way the dataset looks and the things we might want to do to it that might make our model even better. The first thing we can do is check the Dataset Health Check. As I mentioned, this is a pretty sparse dataset, especially for the complexity of the problem we're trying to model. We're trying to identify lifts, docks, jet skis, cars, and boats, and these things can appear in different ways in different images. As you can see here, we have mostly lifts and mostly docks, and very few boats.
Another thing that's useful before you get started modeling your dataset is to do a little bit of a preview to get a look at the images you're going to be modeling with. One thing I'm noticing right off the bat is that the docks and the lifts are actually very close to overlapping, so that might confuse the model if it's trying to parse out and identify things that are in overlapping bounding boxes in the same place.
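The class balance that the health check surfaces can also be approximated offline. The following is a minimal sketch, not Roboflow's implementation: it assumes a COCO-style annotation dictionary (with `categories` and `annotations` keys), and the toy data and the `class_counts` helper are hypothetical names introduced only for illustration.

```python
from collections import Counter


def class_counts(coco):
    """Count annotations per class name in a COCO-style dictionary."""
    # Map each category id to its human-readable name.
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(names[ann["category_id"]] for ann in coco["annotations"])
    return dict(counts)


# Hypothetical toy annotations, not the real aerial maritime dataset.
toy = {
    "categories": [
        {"id": 1, "name": "dock"},
        {"id": 2, "name": "lift"},
        {"id": 3, "name": "boat"},
    ],
    "annotations": [
        {"image_id": 0, "category_id": 1, "bbox": [10, 10, 40, 20]},
        {"image_id": 0, "category_id": 2, "bbox": [12, 12, 30, 15]},
        {"image_id": 1, "category_id": 1, "bbox": [50, 60, 40, 20]},
        {"image_id": 1, "category_id": 3, "bbox": [5, 80, 25, 10]},
    ],
}

if __name__ == "__main__":
    for name, n in sorted(class_counts(toy).items()):
        print(name, n)
```

Running the same function on a real COCO JSON export (loaded with `json.load`) would show which classes are under-represented before any augmentation is chosen.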
Another thing worth noting is that these objects are pretty far away and pretty zoomed out, which means there aren't too many pixels for the model to base its inferences on as it works through and learns how to model the problem.
That's basically it. Now we can get started on making versions of the data that we're going to send into train jobs to try to boost our model performance while working with the same dataset. We're not going to gather any new data here, but we're going to artificially create new data, and we're going to pre-process it in ways that will make it easier to model, so that we have a better finished model after we're done with this process. It takes some hypothesis testing, so I'll walk you through my thought process as I would be working on modeling this dataset.
The first thing we're going to start with is pre-processing steps. For a vanilla version, to get our baseline performance, I'm just going to apply two things: Auto-Orient, which puts the image in the correct orientation based on its EXIF data, and then we're going to resize down to 416.
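The resize step also has to move the annotations along with the pixels. Here is a minimal sketch of that bookkeeping, assuming a plain stretch to a 416 by 416 square and COCO-style `[x, y, width, height]` boxes; the `scale_box` helper is a hypothetical name, and the resize mode the platform actually applies may differ.

```python
def scale_box(box, src_w, src_h, dst_w=416, dst_h=416):
    """Rescale an [x, y, w, h] box when the image is stretched to dst_w x dst_h."""
    x, y, w, h = box
    sx = dst_w / src_w  # horizontal scale factor
    sy = dst_h / src_h  # vertical scale factor
    return [x * sx, y * sy, w * sx, h * sy]


if __name__ == "__main__":
    # A box in a hypothetical 1000 x 500 source image.
    print(scale_box([100, 50, 200, 100], src_w=1000, src_h=500))
```

Because a stretch uses different factors for each axis, a non-square source image changes the aspect ratio of every box, which is one reason to keep the resize setting identical across the experiments being compared.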
This will take the image pixels and just do a 416 by 416 box. This is small, pretty good for quick inference, and it also gives us a standard resolution to test different models against. We'll pick those, they'll be pre-selected, and we'll hit Generate. We'll just call this our 416 version. This sends it into the back end and generates a new dataset version, and you'll see it post over here on the left, so you're able to keep track of the different versions that you're building and the different experiments that you're running.
We'll cancel the export here, and we're just going to use the one-click training integration to do our tests. This is Roboflow Train. You can hit the one-click button and this will send it into the back end. Now this train job has gone to the back end, and we'll get some results back on how well it did afterwards. Basically, it'll send us an email at the end, and that's all we need to do to run our first experiment.
Now we're going to keep going and think about how we can make the modeling better. One of the first things I want to do is start getting into augmentation land. Augmentations basically make more images from our base training images, and this is a good way to improve model performance without gathering more data: we can just generate the data artificially, and then we have more training data without having to go through and annotate it.
Some things that I think make sense for this dataset are to use a random crop. This will basically zoom our images in and out, and this helps because the drone may not have been flying at an exact level as it was going over the earth and taking images, so it might make sense to randomly zoom those images in and out. We'll hit Apply there. We've added a random crop, and then a couple
other ones for this first iteration that I want to add are to flip the images and rotate them. This is because we were flying over a lake, and this lake's coastline could manifest itself in all sorts of different ways that are flipped and tilted, so I think those make sense to apply here. I'll apply both kinds of flips and both kinds of rotates. There we go, that's all the augmentations we're going to make.
The other thing we can do is choose how many artificial images we want to generate from our base images. We'll just choose three here, but you can bump that up and experiment with making a lot of artificial data from your base training set and see how performance improves with that. Now we'll generate this version. I'm going to call it 416 flip rotate crop. So now we've got a new dataset version coming in, and this one's actually going through and not just resizing but making all the augs too. We'll click this to launch that experiment. As I said, it's a really convenient way to quickly iterate on your ideas.
We'll make a couple more versions here. One that I want to try adds on to this. There's this concept of brightness: the drone might have been taking pictures at different times of the day, or clouds may be going overhead, so brightness might help your model learn different lighting settings. I'll add that in there. Then there's another one here called mosaic, which I really like, which tiles images together. This helps the model start to localize objects better, and it doesn't rely on surroundings as much, because there could be different tiles surrounding different objects. It's a good way to mix and match and add those complexities and elements to the model that might actually improve it, so
we'll go ahead and give it a try. We'll generate this data and call this one 416 flip rotate crop brighten mosaic, so we'll do all of those to this dataset. That's our next version, and as before we'll just click Roboflow Train and kick off a new experiment.
The next thing I want to show you is another way that you can try to beat this dataset. This isn't really beating the dataset, it's editing the task, essentially. You might consider going through here and saying that you labeled everything for this whole task, but maybe you don't want to try to model it all. For example, you might decide that the docks and the lifts are overlapping and it's not as important to try to model both of them. So what if you said, "Actually, I just want to model boat and I want to model dock, those are the two that I think are the most important"? You can filter out all the other classes and narrow the modeling problem down. That's Modify Classes. I'll hit Generate here and we'll try that. We'll just call this flip rotate crop boat dock, and we'll see how that one does. The theory there is that since there's less to model, it'll be easier to focus on those two base classes that you want to model. We'll kick off another training job here. We're actually firing up a lot of GPUs to do these experiments, but that's pretty nice.
The last thing that I want to show you is one more experiment, which is to tile our images. Here we'll just remove the augmentations to see how the raw tiling does. Tiling means we're actually going to be cutting the images into pieces, so it's zoomed in and has higher resolution for detecting objects. That's a really nice thing to
have, especially with a dataset that is this zoomed out. We'll remove Modify Classes, so we'll try to model everything here, and I'll hit Tiling. We can tile it down into two by two frames, and you can add extra tiling there if you want, but we'll just do two by two for now and see if that helps our model even more. We'll hit Apply, and this is the new dataset. We're going to call this one 416 tiling and hit Generate, and after we're done generating we'll kick off another GPU training.
Now we can look at our versions here and see everything that we've kicked off. We have the base 416. We have the 416 that's augmented with flip, rotate, crop. We have some more augmentations where we added in brighten and mosaic. Then we started to filter some classes, which makes the task a little bit less complex. And then we've also tiled, to zoom in with the resolution. Those are all the steps that we've taken.
Now an hour or so will elapse while the GPU training jobs run, and then you get your results back. These results tell you how well your model did on the other part of your dataset that you didn't show it during training. The metric we're going to use is called mAP. I'll put some links below with more on what it is, but it's basically just a metric that shows you how well your model is doing. The nice thing about this YouTube video today is that I've already run these tests, so we can check and see into the future how our training jobs would have done.
Now we'll jump over here to this dataset, which is the same dataset with all the same augmentations and the same training jobs, and we'll see what the results were. To start off, we have the base, plain 416.
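The two by two tiling step described above can be sketched in terms of what happens to the bounding boxes. This is an illustrative sketch only: it assumes COCO-style `[x, y, width, height]` boxes and assumes that a box crossing a tile boundary is clipped to each tile it touches, which may not match how the platform handles that case. The `tile_boxes` helper is a hypothetical name.

```python
def tile_boxes(boxes, img_w, img_h, rows=2, cols=2):
    """Split [x, y, w, h] boxes across a rows x cols grid of tiles.

    Returns {(row, col): [boxes expressed in that tile's own coordinates]}.
    """
    tile_w, tile_h = img_w / cols, img_h / rows
    tiles = {(r, c): [] for r in range(rows) for c in range(cols)}
    for x, y, w, h in boxes:
        for (r, c), kept in tiles.items():
            left, top = c * tile_w, r * tile_h
            # Intersection of the box with this tile, in image coordinates.
            x1, y1 = max(x, left), max(y, top)
            x2, y2 = min(x + w, left + tile_w), min(y + h, top + tile_h)
            if x2 > x1 and y2 > y1:  # keep only non-empty overlaps
                kept.append([x1 - left, y1 - top, x2 - x1, y2 - y1])
    return tiles


if __name__ == "__main__":
    # One box fully inside the top-left tile, one straddling the image center.
    result = tile_boxes([[10, 10, 20, 20], [40, 40, 20, 20]], 100, 100)
    for key in sorted(result):
        print(key, result[key])
```

If each tile is then resized to the same 416 by 416 input as the full image was, every object covers roughly twice as many pixels along each axis, which is the "zoomed in" effect the tiling experiment relies on.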
This is just the very baseline approach, and we can see how well this one did. You can see here that it's 18, so that's not too high, and that's in some ways to be expected, since we knew this was going to be a hard task. But let's see if our augmentations made things better.
The next thing we tried was to flip, rotate, and crop, and you can see here that we got an even better mAP from that. We went from 18 percent all the way up to 42 percent, which is more than doubling the performance of our model with the same dataset. This really shows you the power of using augmentation for a sparse dataset, already with this first experiment.
Then let's take a look at mosaic. Mosaic looks like it added a little bit more mAP to our model, so we're getting better and better, and we're seeing these metrics come through on our dataset versions.
Then let's see what happened when we filtered the classes. This was when we said, let's take the complexity of the problem down, just model boats and docks, and ignore the other classes. That went all the way up to 67, so now we're getting to a pretty tractable model that's doing a pretty good job.
The last thing we're going to check is how our tiling job did. Remember, this is all the classes, and we're tiling it down, so we're zooming in. This is really showing, by brute force, the power of tiling for aerial imagery or anything where you're trying to detect small objects that are far away. Our mAP went all the way up to 64.4, which is again a pretty good model and pretty tractable. It's really impressive what we're able to do with just 70 images: to create a computer vision model that can identify these aerial maritime objects
that it's never seen before, just by using image augmentation, image pre-processing, and kicking off train jobs in the back end. These are all things that are possible using the Roboflow platform. I look forward to the discussion below about what you think are good image pre-processing steps and image augmentation steps, and how we can build the best computer vision models with the limited data that we have. Thanks so much for watching today. I hope you like and subscribe below, and we'll see you next time. Thanks so much.
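For readers who want to see what the flip and rotate augmentations from the second experiment do to annotations, here is a minimal sketch. It assumes COCO-style `[x, y, width, height]` boxes with the origin at the top-left corner, and it covers only flips and a 90-degree clockwise rotation; the helper names are hypothetical and this is not the platform's own code.

```python
def hflip_box(box, img_w):
    """Mirror a box left-to-right in an image of width img_w."""
    x, y, w, h = box
    return [img_w - x - w, y, w, h]


def vflip_box(box, img_h):
    """Mirror a box top-to-bottom in an image of height img_h."""
    x, y, w, h = box
    return [x, img_h - y - h, w, h]


def rot90_cw_box(box, img_h):
    """Rotate a box 90 degrees clockwise; the rotated image is img_h wide."""
    x, y, w, h = box
    # A point (px, py) moves to (img_h - py, px), so width and height swap.
    return [img_h - y - h, x, h, w]


if __name__ == "__main__":
    box = [10, 20, 30, 40]  # hypothetical box in a 100 x 200 image
    print(hflip_box(box, img_w=100))
    print(vflip_box(box, img_h=200))
    print(rot90_cw_box(box, img_h=200))
```

The same transform must be applied to the image pixels and to every box on that image, which is why augmented copies come with annotations already in place and need no extra labeling.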
Original Description
In this video, we showcase how running a series of experiments with image preprocessing and image augmentation can boost the performance of a computer vision model.
Dataset used in this video:
https://public.roboflow.com/object-detection/aerial-maritime
Details on the performance metric used in this video:
https://blog.roboflow.com/mean-average-precision/