Image Data Augmentation for Deep Learning

Connor Shorten · Beginner ·📐 ML Fundamentals ·6y ago

Skills: CV Basics 90% · ML Maths Basics 80% · ML Pipelines 80% · Supervised Learning 70% · Unsupervised Learning 60%

Key Takeaways

The video surveys data augmentation techniques for deep learning in computer vision: basic image manipulations, random erasing, feature space augmentation, adversarial training, data generated by Generative Adversarial Networks (GANs), neural style transfer, and meta-learning approaches such as AutoAugment. AlexNet is used as a historical example of augmentation in practice.

Full Transcript

[Music] This video presents a paper I recently published on image data augmentation for deep learning. The motivation behind data augmentation is to prevent overfitting. Overfitting refers to the phenomenon where high-capacity deep learning models fit the training data so exactly that they don't generalize to the testing data. This is shown in this image: as the training error decreases, the testing error actually increases rather than continuing to decrease along with the training error.

Data augmentation, shown right here in these rotated images in the video on the left, is a way of manipulating data so that you have new data points. It serves as a regularizing effect, and it lets you hard-code invariances, such as translational invariance, into your model. You would want your image recognition model to be able to recognize the panda in both the normal image and the horizontally reflected image.

Data augmentation has been applied a lot in the history of computer vision. In LeNet-5 on the MNIST recognition task, this shows how they did data warping to get more out of their MNIST dataset. AlexNet also uses data augmentation, and their augmentation increases the ImageNet dataset by a factor of 2048. They do this by randomly cropping 224 by 224 patches, flipping them horizontally, and then changing the intensity of the RGB channels. In the AlexNet paper they directly attribute a one percent error rate reduction to augmentation.

This is the taxonomy of data augmentations covered in this paper. We'll talk about basic image manipulations like color space transformations, geometric transformations, random erasing, mixing images, and kernel filters. Then we'll talk about deep learning approaches like generative adversarial networks, neural style transfer, and adversarial training. Then we'll see how these can be controlled with a meta-learning controller to get even better performance with data augmentation.

Image
manipulations are the most commonly used data augmentations in computer vision. This includes things like flipping, color space transformations, cropping, rotation, translation, and noise injection. The image in the top right shows an example of some color augmentations. You'd want the convolutional neural network to be invariant to these color transformations and still be able to recognize objects despite lighting differences.

One other thing to consider with image manipulations is non-label-preserving transformations. For example, on the MNIST dataset, if you flip a 9 horizontally, it's no longer really a 9. All of these image manipulations have an associated magnitude parameter, and there's always some level of distortion that is going to corrupt the label as well.

This is a comparison of augmentations by image manipulation in one study. It's definitely interesting to search over the augmentation space of classic image manipulations and see how the accuracy of your model changes. In this case, you see that they get a much better performance result with cropping than with the other augmentations.

Another interesting idea is kernel filters. This is the PatchShuffle regularization technique, where they randomly shuffle pixels around within a four by four sliding window.

Mixing images is another surprisingly successful data augmentation. What they do is extract patches, randomly average the patches together pixel by pixel, and train a network on the result. This may be successful because of the increased dataset size or some kind of regularization effect. It's really unclear why exactly this works, but it does work well, which is surprising. They also experiment with nonlinear mixing and all these interesting ways of mixing images to form new samples.

Another very interesting technique is random erasing, or cutout. This is used really frequently in state-of-the-art recognition models. What this does is, it's like dropout but in the input
space. You have a rectangle that is placed on the image, and in place of the original pixels it holds all zeros, all ones, or static noise. These are the results of applying cutout in the cutout regularization paper, and you see that they get a greater than one percent error rate reduction in almost all of the trials with it.

Another interesting idea is feature space augmentation. The way convolutional networks work is that they sequentially transform an image into a series of rank-3 tensors, whose dimensions are the number of feature maps and the height and width of each feature map. In this study they augment the image representations in these intermediate tensors and then decode them back into the image space.

Another interesting way of doing this is adversarial training. In addition to the phenomenon of adversarial examples, we could have an adversarial agent that is constrained to a set of image manipulations, like rotations and translations, and tries to select the geometric transformation that will result in a misclassification. It is beneficial to use these adversarial agents to direct the search process over augmentations.

One really interesting idea is to use data from a generative adversarial network to augment datasets. Generative adversarial networks, as shown on the slide here, take random noise and learn to generate new data based on the discriminator's real-or-fake loss function, so eventually the generator is able to produce novel data samples. This hasn't really been shown to work successfully on datasets like ImageNet. But on this liver lesion classification dataset, they use this technique of generating data and just appending it to the dataset, and they go from 78.6 to 85.7 and from 88.4 to 92.4. Other than this medical image domain, though, this hasn't really worked well on things like ImageNet or even CIFAR-10.

Another
really interesting data augmentation that hasn't been explored too much is neural style transfer: using these styles to augment images in interesting ways. This goes beyond just color transformations; it really augments images in a semantic way. Style transfer has been really successful in robotic applications, like this study from UC Berkeley where they randomized the colors in the simulation so that when the robot goes to the real world, it generalizes, because it just sees the color transformation as another color set in the diverse dataset it's been trained on. On the opposite end of that, rather than going for diversity, the SimGAN model goes for realism. They take data from a graphics engine like Unity and then use the GAN to align the generated data from the graphics engine with the original training set.

This brings up the idea of meta-learning: using some kind of controlling algorithm to search through this space of augmentations. AutoAugment is the most interesting way of applying reinforcement learning to basic image manipulations. What they do is search for a policy that selects an image manipulation, like a rotation or a translation, then finds a magnitude for the operation, like rotating by 45 degrees or 70 degrees, and then finds a probability of applying the operation.

Another thing to consider with data augmentation is test-time augmentation. This isn't good for applications that need fast inference and fast predictions, but if you want to increase the accuracy of your model, it can be useful to take the image, augment it several times, and then aggregate the predictions across the different augmentations. In AlexNet, for example, they crop five 224 by 224 patches, horizontally flip them all, and then aggregate the predictions across these ten patches.

Another thing is curriculum learning. In a recent paper presented at ICML 2019, Population Based Augmentation, they progressively increase the magnitude parameter of the image manipulations. When they first start training the model, they might rotate the image by 10 degrees or minus 10 degrees, and then as training progresses the rotations get larger, like 60 degrees or minus 60 degrees.

Another interesting thing is the impact of image resolution on model performance. It's not really quite similar to the other things presented, but frequently images are downsized to fit as input and to save computation. This study shows that if you preserve the high resolution, you'll tend to get better classification performance.

One other consideration is online versus offline data augmentation. Online data augmentation means that as an image goes into the batch, it's augmented with some probability parameter. Offline is where you augment the data in preparation for training and then write it to disk, which has a storage cost.

There are some other regularization tools out there, like dropout, batch normalization, transfer learning, pretraining, and one-shot and zero-shot learning. These are all techniques that aim to overcome the problem of overfitting, especially in limited-data domains like medical image analysis.

Again, this is the taxonomy of image data augmentations surveyed in this paper: basic image manipulations, deep learning approaches, and then meta-learning approaches that use controllers to search for the augmentation parameters. Thanks for watching this video. Please subscribe to Henry AI Labs for more deep learning videos, and please check out this paper, "A survey on Image Data Augmentation for Deep Learning", published in the Springer Journal of Big Data.
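
The test-time augmentation step mentioned in the transcript can be sketched in a few lines. This is only an illustration under stated assumptions, not the AlexNet implementation: images are plain nested lists rather than tensors, `toy_model` is a hypothetical stand-in for a trained classifier, and the five crops are taken at the four corners and the center (as the AlexNet paper describes) with averaging as the aggregation rule.

```python
def crop(img, top, left, size):
    """Return the size x size patch whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in img[top:top + size]]


def hflip(img):
    """Mirror an image (a list of rows) left to right."""
    return [row[::-1] for row in img]


def ten_crop(img, size):
    """Four corner crops, the center crop, and the horizontal flip of each."""
    h, w = len(img), len(img[0])
    corners = [(0, 0), (0, w - size), (h - size, 0), (h - size, w - size)]
    center = ((h - size) // 2, (w - size) // 2)
    crops = [crop(img, top, left, size) for top, left in corners + [center]]
    return crops + [hflip(c) for c in crops]


def predict_with_tta(model, img, size):
    """Average the model's class scores over the ten augmented views."""
    preds = [model(view) for view in ten_crop(img, size)]
    n_classes = len(preds[0])
    return [sum(p[k] for p in preds) / len(preds) for k in range(n_classes)]


def toy_model(view, max_value=35.0):
    """Hypothetical classifier: two class scores derived from mean brightness."""
    pixels = [px for row in view for px in row]
    brightness = sum(pixels) / (len(pixels) * max_value)
    return [brightness, 1.0 - brightness]


image = [[r * 6 + c for c in range(6)] for r in range(6)]  # toy 6x6 image, values 0..35
probs = predict_with_tta(toy_model, image, size=4)
```

The cost is ten forward passes per image instead of one, which is the trade-off against fast inference that the transcript points out.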

Original Description

Paper Link: https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0197-0 Data Augmentation is used in all modern Computer Vision models. This video explains some of the different ways Data Augmentation is implemented and some design considerations for your own problems. Please Subscribe for more Deep Learning videos!

This video explains why data augmentation matters in deep learning for computer vision and gives an overview of the main techniques, including basic image manipulations, random erasing, and feature space augmentation. It also covers generating training data with Generative Adversarial Networks (GANs) and uses AlexNet as an example of augmentation in a well-known model.

Key Takeaways

Apply image manipulation techniques such as rotation, flipping, and cropping
Use random erasing to augment images
Implement feature space augmentation
Use Generative Adversarial Networks (GANs) to generate new data
Apply curriculum learning to progressively increase the magnitude of image manipulation
Use online data augmentation to augment batches during training
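
Several of the takeaways above (cropping, flipping, random erasing, and applying them online as each batch is built) can be combined in one short sketch. It is a minimal illustration under assumptions, not code from the paper or the video: images are nested lists so the example has no dependencies, and the function names, crop size, erase size, and probabilities are all hypothetical choices.

```python
import random


def horizontal_flip(img):
    """Mirror each row of an H x W image given as a list of rows."""
    return [row[::-1] for row in img]


def random_crop(img, size, rng):
    """Cut a size x size patch at a random location."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - size)    # randint includes both endpoints
    left = rng.randint(0, w - size)
    return [row[left:left + size] for row in img[top:top + size]]


def random_erase(img, erase_h, erase_w, rng, fill=0):
    """Cutout-style erasing: overwrite a random rectangle with a constant."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - erase_h)
    left = rng.randint(0, w - erase_w)
    out = [list(row) for row in img]  # copy so the input stays untouched
    for r in range(top, top + erase_h):
        for c in range(left, left + erase_w):
            out[r][c] = fill
    return out


def augment_online(img, rng, crop_size=6, p_flip=0.5, p_erase=0.5):
    """Crop, then flip and erase with some probability, at batch time."""
    out = random_crop(img, crop_size, rng)
    if rng.random() < p_flip:
        out = horizontal_flip(out)
    if rng.random() < p_erase:
        out = random_erase(out, 2, 2, rng)
    return out


rng = random.Random(0)  # seeded so runs are repeatable
image = [[r * 8 + c + 1 for c in range(8)] for r in range(8)]  # toy 8x8 image, values 1..64
batch = [augment_online(image, rng) for _ in range(4)]
```

Because the augmented copies are created on the fly and never written to disk, this is the online variant; the offline variant would save each output instead, at a storage cost.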

💡 Data augmentation is a crucial technique in deep learning for computer vision, and using the right techniques can significantly improve model performance.
