Image Data Augmentation for Deep Learning

Connor Shorten · Beginner ·📐 ML Fundamentals ·6y ago

Key Takeaways

The video discusses data augmentation techniques for deep learning in computer vision, including basic image manipulations, random erasing, feature space augmentation, and adversarial training, as well as GAN-based augmentation, neural style transfer, and meta-learning methods such as AutoAugment. It cites LeNet-5 and AlexNet as early examples of augmentation in practice.

Full Transcript

[Music] This video will present a paper I recently published on image data augmentation for deep learning. The motivation behind data augmentation is to prevent overfitting. Overfitting refers to a phenomenon where deep learning models with high capacity model the training data so exactly that they don't generalize to the testing data. This is shown in this image, where as the training error decreases, the testing error actually increases rather than continuing to decrease with the training error.

So data augmentation, shown right here in these rotation images in the video on the left, is a way of doing manipulations on data such that you have new data points, and it serves as a regularizing effect such that you can hard-code these translational invariances into your model. So you would want your image recognition model to be able to recognize the panda in the normal image and in the horizontally reflected image.

Data augmentation has been applied a lot in the history of computer vision. In LeNet-5 on the MNIST recognition task, this shows how they did data warping to get more out of their MNIST dataset. AlexNet also uses data augmentation, and their data augmentation increases the ImageNet dataset by a factor of 2048. They do this by randomly cropping 224 by 224 patches, flipping them horizontally, and then changing the intensity of the RGB channels. They directly attribute a one percent error rate reduction to augmentation in the AlexNet paper.

So this is the taxonomy of data augmentations covered in this paper. We'll talk about basic image manipulations like color space transformations, geometric transformations, random erasing, mixing images, and kernel filters. Then we'll talk about deep learning approaches like generative adversarial networks, neural style transfer, and adversarial training. Then we'll see how these can be controlled with a meta-learning controller to get even better performance with data augmentation.

So image manipulations are the most commonly used data augmentation in computer vision. This includes things like flipping, color space, cropping, rotation, translation, and noise injection. The image in the top right shows an example of some color augmentations. You'd want the convolutional neural network to be invariant to these color transformations and still be able to recognize objects despite lighting differences.

One other thing to consider with image manipulations is non-label-preserving transformations. For example, if you horizontally flip the MNIST dataset, then you flip a 9 horizontally and it's no longer really a 9. All these image manipulations have an affiliated magnitude parameter, and there's always some level of distortion that is going to corrupt the label as well.

So this is a comparison of augmentations by image manipulation in one study. It's definitely interesting to search over the augmentation space of classic image manipulations and see how the accuracy of your model changes. In this case you see that they get a much better performance result with cropping than with the other augmentations.

Another interesting idea is kernel filters. This is the PatchShuffle regularization technique, where they randomly shuffle around pixels in a four by four sliding window.

Mixing images is another really surprisingly successful data augmentation. What they do is they extract patches and they just randomly average together the patches for each pixel, and they train a network. This may be successful because of the increased dataset size or some kind of regularization effect. It's really unclear why exactly this works, but it does work well, which is surprising. They also experiment with nonlinear mixing and all these interesting ways of mixing images to form new samples.

Another very interesting technique is random erasing, or cutout. This is used really frequently in state-of-the-art image recognition models. What this does is it's like dropout but in the input space, so you have this rectangle that is placed on the images, and then instead of the original image it's all zeros, all ones, or static noise. So these are the results of applying cutout in the cutout regularization paper, and you see that they get a greater than one percent error rate improvement in almost all of the trials with it.

Another interesting idea is feature space augmentation. The way convolutional networks work is they sequentially transform an image into a series of rank 3 tensors, where the dimensions are the number of feature maps and the height and width of the feature map. In this study they augment the image representations in these intermediate tensors and then they decode them back into the image space.

Another interesting way of doing this is adversarial training. In addition to the phenomenon of adversarial examples, we could have an adversarial agent which is constrained to a set of image manipulations like rotations and translations and is trying to select the geometric transformation that will result in a misclassification. It is beneficial to use these adversarial agents to direct the search process of augmentations.

One really interesting idea is to use data from a generative adversarial network to augment datasets. Generative adversarial networks, as shown on the slide here, take random noise and then learn to generate new data based on the discriminator's loss function of real or fake, so eventually the generator is able to produce novel data samples. It's really interesting to see, and this hasn't really been shown to work successfully on datasets like ImageNet, but on this liver lesion classification dataset they use this technique of generating data and then just appending it to the dataset, and they get these improvements from 78.6 to 85.7 and from 88.4 to 92.4. But other than this medical image domain, this hasn't really worked well on things like ImageNet or even CIFAR-10.

Another really interesting data augmentation that hasn't been explored too much is neural style transfer, using these styles to augment images in interesting ways. This goes beyond just color transformations; this really augments images in a semantic way. Style transfer has been really successful in robotic applications, like this study from UC Berkeley where they randomized the colors in the simulation such that when the robot goes to the real world it generalizes, because it just sees the color transformation as another color set in the diverse dataset it's been trained on. On the opposite end of that, rather than going for diversity, the SimGAN model goes for realism. They take data from a graphics engine like Unity and then they use the GAN to align the generated data from the graphics engine with the original training set.

So this brings up the idea of meta-learning, using some kind of controlling algorithm to search through this space of augmentations. AutoAugment is the most interesting way of applying reinforcement learning to basic image manipulations. What they do is they search for a policy that selects an image manipulation like a rotation or translation, then it will find a magnitude for applying the operation, like rotating 45 degrees or 70 degrees, and then it will find a probability of applying the operation.

Some other things to consider with data augmentation: one is test-time augmentation. This isn't good for applications that need fast inference and fast predictions, but if you want to increase the accuracy of your model, it would be useful to take the image, augment it several times, and then aggregate the predictions across the different augmentations. Like in AlexNet, what they do is they randomly crop five 224 by 224 patches, then they horizontally flip them all, and then they aggregate the predictions across these ten patches.

Another thing is curriculum learning. In a recent paper presented at ICML 2019, Population Based Augmentation, they progressively increase the magnitude parameter of the image manipulation. So when they first start training the model they might rotate it 10 degrees or minus 10 degrees, and then as the training progresses the rotations would get larger, like sixty degrees or negative sixty degrees.

Another interesting thing is the impact of image resolution on model performance. It's not really quite similar to the other things presented, but frequently images are downsized to fit as input and to save computation, and this study shows that if you preserve the high resolution you'll tend to get better classification performance.

One other consideration is online and offline data augmentation. Online data augmentation means that as the data goes into the batch it's augmented with some probability parameter, and offline would be where you augment it in preparation for the training and then write it to the disk, which has a storage cost.

There are some other regularization tools out there like dropout, batch normalization, transfer learning, pretraining, and then one-shot and zero-shot learning. These are all techniques that are aiming to overcome the problem of overfitting, especially in the case of limited data domains like medical image analysis.

So again, this is the taxonomy of image data augmentations surveyed in this paper: basic image manipulations, deep learning approaches, and then meta-learning approaches that use controllers to search for the augmentation parameters. Thanks for watching this video. Please subscribe to Henry AI Labs for more deep learning videos, and please check out this paper, A Survey on Image Data Augmentation for Deep Learning, published in the Springer Journal of Big Data.
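
The test-time augmentation idea from the transcript (several crops plus their horizontal flips, with predictions aggregated across all views) can be sketched roughly as below. This is an illustrative sketch only, not the AlexNet code: images are plain nested lists to avoid dependencies, the function names `ten_crop` and `predict_with_tta` are made up for this example, `model` is a hypothetical callable returning class probabilities, and the crops are taken at the four corners and the center, which is how the AlexNet paper describes its test-time patches.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def crop(img: Image, top: int, left: int, size: int) -> Image:
    """Return the size x size patch whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in img[top:top + size]]


def hflip(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def ten_crop(img: Image, size: int) -> List[Image]:
    """Four corner crops and the center crop, followed by their flips."""
    h, w = len(img), len(img[0])
    if size > h or size > w:
        raise ValueError("crop size must not exceed the image size")
    offsets = [
        (0, 0), (0, w - size), (h - size, 0), (h - size, w - size),
        ((h - size) // 2, (w - size) // 2),
    ]
    crops = [crop(img, top, left, size) for top, left in offsets]
    return crops + [hflip(c) for c in crops]


def predict_with_tta(
    model: Callable[[Image], Sequence[float]], img: Image, size: int
) -> List[float]:
    """Average the model's class probabilities over the ten views."""
    preds = [model(view) for view in ten_crop(img, size)]
    n_classes = len(preds[0])
    return [sum(p[k] for p in preds) / len(preds) for k in range(n_classes)]
```

Averaging is only one way to aggregate; a majority vote over the ten predicted labels would be another reasonable choice. Either way the model runs ten times per image, which is the inference cost the transcript warns about.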

Original Description

Paper Link: https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0197-0 Data Augmentation is used in all modern Computer Vision models. This video explains some of the different ways Data Augmentation is implemented and some design considerations for your own problems. Please Subscribe for more Deep Learning videos!

This video explains why data augmentation matters for deep learning in computer vision and gives an overview of techniques including image manipulation, random erasing, and feature space augmentation. It also discusses generating training data with Generative Adversarial Networks (GANs) and how AlexNet used augmentation.
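
As a rough illustration of random erasing (the "dropout in the input space" idea from the video), the sketch below blanks out one random rectangle of an image. It is a simplified sketch, not the method from the random erasing or cutout papers: the name `random_erase` and its parameters are invented for this example, the rectangle size is drawn as a fraction of each side rather than of the area, and the image is a plain nested list so the code has no dependencies.

```python
import random
from typing import List, Optional

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def random_erase(
    img: Image,
    p: float = 0.5,
    min_frac: float = 0.1,
    max_frac: float = 0.4,
    fill: Optional[float] = 0.0,
    rng: Optional[random.Random] = None,
) -> Image:
    """With probability p, overwrite one random rectangle of the image.

    fill=0.0 or fill=1.0 gives a constant patch; fill=None fills the
    rectangle with uniform noise. The input image is left unmodified.
    """
    if rng is None:
        rng = random.Random()
    out = [row[:] for row in img]
    if rng.random() >= p:
        return out  # augmentation skipped for this sample
    h, w = len(out), len(out[0])
    # Assumed simplification: sample each side length independently.
    erase_h = max(1, int(h * rng.uniform(min_frac, max_frac)))
    erase_w = max(1, int(w * rng.uniform(min_frac, max_frac)))
    top = rng.randint(0, h - erase_h)
    left = rng.randint(0, w - erase_w)
    for r in range(top, top + erase_h):
        for c in range(left, left + erase_w):
            out[r][c] = rng.random() if fill is None else fill
    return out
```

Passing a seeded `random.Random` makes the erased region reproducible, which helps when debugging an augmentation pipeline.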

Key Takeaways
  1. Apply image manipulation techniques such as rotation, flipping, and cropping
  2. Use random erasing to augment images
  3. Implement feature space augmentation
  4. Use Generative Adversarial Networks (GANs) to generate new data
  5. Apply curriculum learning to progressively increase the magnitude of image manipulation
  6. Use online data augmentation to augment batches during training
💡 Data augmentation acts as a regularizer that reduces overfitting, especially when training data is limited; the video notes that the AlexNet paper attributes about a one percent error-rate reduction to it.
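
Takeaways 5 and 6 can be combined in a small sketch: the rotation magnitude grows as training progresses, and each sample is augmented online with some probability. This is a hand-written linear ramp used purely for illustration, not the schedule that Population Based Augmentation learns. The function names are hypothetical, and the 10 and 60 degree endpoints are taken from the example in the video.

```python
import random
from typing import Optional


def rotation_limit(
    epoch: int,
    total_epochs: int,
    start_deg: float = 10.0,
    end_deg: float = 60.0,
) -> float:
    """Linearly grow the maximum rotation from start_deg to end_deg."""
    if total_epochs <= 1:
        return end_deg
    progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start_deg + progress * (end_deg - start_deg)


def sample_rotation(
    epoch: int,
    total_epochs: int,
    p: float = 0.5,
    rng: Optional[random.Random] = None,
) -> float:
    """Online augmentation: with probability p, draw a rotation angle.

    Returns 0.0 (no rotation) when the augmentation is skipped, otherwise
    an angle in [-limit, +limit] for the current epoch.
    """
    if rng is None:
        rng = random.Random()
    if rng.random() >= p:
        return 0.0
    limit = rotation_limit(epoch, total_epochs)
    return rng.uniform(-limit, limit)
```

In a training loop, `sample_rotation(epoch, total_epochs)` would be called once per image as the batch is assembled, and the returned angle passed to whatever rotation function the image library provides. Nothing is written to disk, which is the difference from offline augmentation.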
