ResNet Explained!

Connor Shorten · Beginner · 🧬 Deep Learning · 7y ago

Skills: Modern CV Models 90% · CV Basics 80% · ML Pipelines 60%

Key Takeaways

The ResNet paper introduces a novel connection scheme to deep convolutional neural networks, allowing for the training of neural nets with many layers, and achieves state-of-the-art results on ImageNet and CIFAR-10 datasets using techniques such as residual learning and skip connections.

Full Transcript

[Music]

This video will explain the ResNet, a deep convolutional neural network architecture design. This is one of the most popular neural network designs that has ever been published, with over 20,000 citations.

Deep learning is thought of as learning a hierarchical set of representations, such that it learns low-, mid-, and high-level features in images. This is analogous to learning edges, then shapes, then objects. So theoretically, more layers should enrich the levels of the features, and models previous to the ResNet typically have depths of 16 to 30 layers. So the idea is: shouldn't building better neural networks be as easy as adding more layers to the network?

So the first contribution of the ResNet paper is showing that if you just continue to stack convolutional layers with activations and batch normalization, the training will eventually get worse, not better. But they offer this insight, the construction insight, which says that if you consider a shallow architecture and its deeper counterpart with more layers, theoretically all the deeper model would need to do is copy the output from the shallower model with identity mappings. So the construction solution suggests that a deeper model should produce no higher error than its shallow counterpart. However, the identity function isn't an easy function to learn, and so the residual formulation gives the layers a reference to the input through these identity, or skip, connections, such that theoretically, if it needed to push the layer's output down to zero, it could easily do it in this framework. So again, this shows the residual connection, which is the building block for the residual network, or ResNet.

So one interesting thing with ResNets is what happens if the previous layer's dimensions don't match the input to the next layer. If you think about a convolution, a 3x3 convolution would change the spatial dimensions of an image from like 32 by 32 to 30 by 30. So what they do here is propose different schemes for matching the dimensions of the previous layer's input through this identity skip connection. It can do this in one of two ways: it can either just zero-pad the input up to the required dimensions, which adds no extra parameters and is really quick, or it can expand the dimensions with a 1x1 convolution.

So this image shows what the ResNet looks like in contrast to a 34-layer plain network, which is a series of convolutional layers with batch normalization and activations, and compared to another really popular model, VGG-19.

So in the ResNet experiments, they test the 152-layer net on ImageNet, and this gets their state-of-the-art results. This is eight times deeper than VGG nets, but in terms of floating-point operations it actually has less computation than VGG-19, as shown by the FLOPs metric, measured in billions. So with an ensemble of ResNets they're able to achieve 3.57% top-5 error on the ImageNet test set, which gives them the state of the art. They also test this on CIFAR-10 with 100 and 1,000 layers, and then they use the ResNet features on COCO object detection. The way object detection networks work is you would use something like VGG or ResNet to extract the features from the image, and then you would classify the different bounding boxes produced by some region proposal algorithm.

So some more details about the ResNet: it uses batch normalization after each convolution and before activations, and it uses the He initializer, invented by the first author of the paper, Kaiming He. They use a batch size of 256, they have this learning rate scheme and weight decay, and, interestingly, they don't use dropout. They also use one interesting test-time augmentation where they don't just predict on the test image. What they do is take ten crops from the test image, the model predicts on each of the crops, and then they average the predictions to form the final prediction.

So the first experimental result shows how the ResNet continues to get better as you go from 18 to 34 layers, while the naive stacking of convolutional layers is already starting to get worse. The ResNet goes from about 27.8% to 25% error, whereas the plain network's error goes up by about half a percentage point with the increase in layers.

So then they test this idea: when you're skipping ahead and the dimensions don't match, do you zero-pad, or do you use these 1x1 convolutions, and how frequently do you use the 1x1 convolutions? They do find that when they use 1x1 convolutions, also known as projections, they get a slight performance boost, but it comes at the cost of a significant number of extra parameters.

So one other thing they do when they train the ResNet-50, 101, and 152 is extend the skip connection so it skips over three layers rather than two like the normal residual building block, and this is done just to save training time. So these are the results for the different depths of ResNet, with B and C denoting the different ways of doing the projection matching, compared to some of the other state-of-the-art models like Inception and VGG.

So these are the results of the ensemble of ResNets, which sets the state of the art on top-5 error on the ImageNet test set. And these are the results on the CIFAR-10 dataset. Interestingly, you see that with 110 layers they achieve the state of the art, but when they try to go to 1,202 layers the error goes back up, so they haven't quite figured out how to make it go that deep yet. And then this shows how using the features extracted from ResNet outperforms VGG on the localization, or bounding box detection, task.

So again, they do figure out how to make the network significantly deeper than something like the 19 layers of VGG-19, but even with this mechanism the 1,202-layer net still does not perform well, and they suggest in the paper that this is due to overfitting.

So thanks for watching this video on ResNets. Please subscribe to Henry AI Labs; the paper link is in the description.

[Music]
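
The residual building block and the two shortcut options from the transcript can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: plain matrix-vector products stand in for the convolutions, batch normalization is left out, and every name here (`residual_block`, `zero_pad_shortcut`, `projection_shortcut`, `zeros`) is hypothetical. The zero-padding shortcut pads extra channel entries with zeros, which is how the paper describes its parameter-free option.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[Vector]


def zeros(rows: int, cols: int) -> Matrix:
    """Helper for the demo: an all-zero weight matrix."""
    return [[0.0] * cols for _ in range(rows)]


def relu(v: Vector) -> Vector:
    return [max(0.0, x) for x in v]


def linear(weights: Matrix, v: Vector) -> Vector:
    """Matrix-vector product; stands in for a convolution in this sketch."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def zero_pad_shortcut(x: Vector, out_dim: int) -> Vector:
    """Parameter-free shortcut: keep x and fill the extra entries with zeros."""
    return x + [0.0] * (out_dim - len(x))


def projection_shortcut(x: Vector, proj: Matrix) -> Vector:
    """Learned shortcut: a linear map playing the role of a 1x1 convolution."""
    return linear(proj, x)


def residual_block(x: Vector, w1: Matrix, w2: Matrix,
                   shortcut: Callable[[Vector], Vector]) -> Vector:
    """Compute relu(F(x) + shortcut(x)) with F(x) = w2 * relu(w1 * x)."""
    fx = linear(w2, relu(linear(w1, x)))   # the residual function F(x)
    sx = shortcut(x)                       # identity, zero-pad, or projection
    return relu([a + b for a, b in zip(fx, sx)])


x = [1.0, 2.0]

# With all residual weights at zero, the block reduces to the identity
# (for non-negative inputs), which is the "push the layer to zero" case.
print(residual_block(x, zeros(2, 2), zeros(2, 2), lambda v: v))

# Widening from 2 to 4 entries: zero-padding adds no parameters...
print(residual_block(x, zeros(4, 2), zeros(4, 4),
                     lambda v: zero_pad_shortcut(v, 4)))

# ...while a projection needs its own weight matrix.
proj = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
print(residual_block(x, zeros(4, 2), zeros(4, 4),
                     lambda v: projection_shortcut(v, proj)))
```

The design point the sketch tries to make concrete is that the identity mapping comes for free: the block only has to learn the difference from its input, so driving the weights to zero is enough to pass the input through unchanged.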

Original Description

Residual Learning introduces a novel connection scheme to the Deep Convolutional Network that achieves state-of-the-art results and allows the training of neural nets with very many layers! Thanks for checking this video out, please subscribe!

Key Takeaways

Implement residual learning in a neural network
Use skip connections to improve model performance
Train a deep convolutional neural network
Use batch normalization in a neural network
Implement a learning rate scheme
Test the model on ImageNet and CIFAR-10 datasets
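
For the testing step, the ten-crop averaging described in the transcript could look roughly like the sketch below. The transcript only says that ten crops are taken and their predictions averaged; the specific choice of four corners, the center, and their horizontal flips is the common convention and is an assumption here. The image is a single-channel nested list for brevity, and `ten_crops`, `predict_ten_crop`, and `toy_model` are hypothetical names.

```python
from typing import Callable, List

Image = List[List[float]]  # rows of pixel values, one channel for brevity


def crop(img: Image, top: int, left: int, size: int) -> Image:
    """Cut a size x size window whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in img[top:top + size]]


def hflip(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def ten_crops(img: Image, size: int) -> List[Image]:
    """Four corner crops, one center crop, and a flipped copy of each."""
    h, w = len(img), len(img[0])
    if size > h or size > w:
        raise ValueError("crop size must fit inside the image")
    origins = [
        (0, 0), (0, w - size), (h - size, 0), (h - size, w - size),
        ((h - size) // 2, (w - size) // 2),
    ]
    crops = [crop(img, top, left, size) for top, left in origins]
    return crops + [hflip(c) for c in crops]


def predict_ten_crop(model: Callable[[Image], List[float]],
                     img: Image, size: int) -> List[float]:
    """Average the model's class scores over the ten crops."""
    preds = [model(c) for c in ten_crops(img, size)]
    return [sum(p[k] for p in preds) / len(preds)
            for k in range(len(preds[0]))]


def toy_model(patch: Image) -> List[float]:
    """Stand-in classifier: scores two classes from the mean pixel value."""
    pixels = [v for row in patch for v in row]
    mean = sum(pixels) / len(pixels)
    return [mean, 1.0 - mean]


image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
print(len(ten_crops(image, 2)))
print(predict_ten_crop(toy_model, image, 2))
```

In practice `model` would be the trained network's forward pass; the averaging logic stays the same whatever produces the per-crop scores.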
