Machine Learning Foundations: Ep #6 - Convolutional cats and dogs
Key Takeaways
Applies Convolutional Neural Networks to a computer vision scenario using TensorFlow
Full Transcript
Hi, and welcome back to Machine Learning Foundations. This is episode 6, where we'll take what you've learned about convolutional neural networks in the previous few episodes and apply them to a computer vision scenario that was a Kaggle challenge not that long ago. But before we get there, let's take a look at the answer to the exercise from the last video.

So this was a very simple exercise. Let's take a look. First of all, we'll import everything that we need to import and we'll download the dataset, which is just some simple happy or sad faces. You'll then extract the dataset and put it into a folder called /tmp/h-or-s, and if we take a look at our files we can see in there that /tmp/h-or-s has been downloaded, and we have happy images in there and we have sad images in there. As we can see, I've also set up a callback, so once it hits 99.9% accuracy it will cancel training. It's a very small dataset and it may be overfitting hugely for the dataset, but that's OK. We just want to use this to test out some convolutions.

So here we can define our network, and we define it with three Conv2D layers and three MaxPooling2D layers that will then flatten and feed into a Dense layer. We can use the RMSprop optimizer and set the learning rate to be 1e-3. You can tweak this as much as you think will work best. So now we just use our ImageDataGenerator, and we'll set up a train_datagen which will rescale the images, and then we'll flow from directory, and it will flow from the directory that we've just seen. It will resize to 150 by 150 and will flow them in batches of 10. There are 80 images in the dataset, so when we then do the model.fit we should set steps_per_epoch to be 8, because 8 times 10 is 80, and that makes it run nice and fast. So we can do a model.fit now and we can see it's running really, really quickly. It's 0.2 of a second per step, and we had 100% accuracy very quickly. It's a super simple dataset, but hopefully the scaffolding that you're building out here is something that you can learn from when you're building more complex models.

Now that wasn't so bad. It was a super simple exercise, but it will lay the foundations for what you're going to study in this video, so let's get started.

The Dogs vs. Cats dataset has 25,000 images of cats and dogs in various poses. It was used for a Kaggle challenge a few years ago in determining state-of-the-art computer vision techniques. In the next few minutes you'll see how to use what you've learned so far to build a classifier for cats and dogs that's over 96% accurate on the training set. As before, you're going to split the data into training and validation directories, and each of these will have cats and dogs subdirectories. You will then be able to train a classifier on cats and dogs images using a generator that pulls from the training subdirectories, and which validates by using a generator that pulls from the validation subdirectories.

So to get started, let's first import ImageDataGenerator, which we'll use for our training data. We'll then simply create an ImageDataGenerator and flow from the training directory. Note that the ImageDataGenerator will handle normalizing the images by dividing their byte values by 255, to turn a value from 0 to 255 into a value from 0 to 1.

To gather training images from the training subdirectories, we can call flow_from_directory, passing it a bunch of parameters. The first of these, of course, is the training directory itself. This should only contain subdirectories, which will provide the labels for the classes. We have cats and dogs subdirectories, so we'll have cats and dogs labels. Next is our target_size, and as the images come in many sizes, we'll need to have a consistent shape to feed into the neural network. I've set them to be 150 by 150 here, but you can choose whatever you want. Of course, all images regardless of their shape will end up as 150 by 150, so choose carefully, and remember when creating your model to have the input layers use the same dimensions. Next, choose a batch size with which the images will be loaded. Pick something that divides evenly into a step size that you'll use later. So for example, if I have 22,500 training images, which is 90% of the data taken for training, and a batch size of 250, then I will need 90 steps to load this into the neural network. Finally, there's the class_mode, and as there are two classes, we'll set it to binary. For validation you'll do exactly the same, except of course that you want to flow from the validation directory and not the training one.

And here's where you define your model. This should look familiar by now. It is a convolutional neural network. I've designed this one with three convolutional layers, each paired with a max pooling layer. The first convolutional layer will learn 16 filters, the next 32, and the next 64. It's important to remember the input shape. Remember, earlier we resized everything to be 150 by 150, and that's what we use here. The 3 is for the color channels. If you use a different size, make sure to adjust this to match. And similarly for your output layer: this should match the number of classes in your model. One exception, which we're following here, is that a binary classifier can get by with just one neuron, provided you use a sigmoid activation function, which pushes the output toward zero for one class and one for the other.

If we take a look at our model summary, it looks like this. You can see the familiar resizing of the image as it travels through the network by convolutions and pooling. By the end, you can see that we have nine and a half million trainable parameters, so this can take a while. I've chosen to use RMSprop as the optimizer here. It's set with a high learning rate, and you can tweak this to try for better performance on your network.

Finally, we'll train the network by specifying the training and validation generators as our data sources. Don't forget to set steps_per_epoch and validation_steps for performance, and these should be calculated by dividing the amount of data by the batch size. This gives us 90 and 10 respectively in this case.

Now that gives you everything you need to train a cats versus dogs classifier. Let's take a look at it in action, after which I'll give you the URL with all the code so you can try it for yourself.

Let's take a look at a much larger dataset, and this is cats and dogs. The cats and dogs dataset has about 25,000 images in it, and those images are various different cats and dogs in various poses. They're real photographs. This dataset was used for a Kaggle challenge a couple of years ago. We're going to do our imports first, and as well as TensorFlow, we're going to import a bunch of stuff that we can use for manipulating the file system, because this dataset is just a raw zip containing cats and dogs. We're going to have to do a bit of data pre-processing on it so that we can use it with image data generators.

So we can see here it's about 800 megs to download, and now I'm unzipping that to create the /tmp folder containing the cats and dogs. If we go to the file system and take a look at /tmp, we'll see in here there's PetImages, and within PetImages are Cat and Dog. I'm not going to open them to look at the files right now, because it takes a little while for them all to list out and it will look like Colab has frozen. But what I can do is just list the directory to see how many images are in it, and we can see there are 12,501 in each, which is our 25,000.

And now I'm going to create my own directories: a cats-versus-dogs master directory, which will contain training and testing, and each of those will have cats and dogs folders within them that will contain the images. I've created this function called split_data, and as its name suggests, it will split the data into training and testing for us. So if you specify a split size of 0.9, it's going to take 90% of the images at random for training and the remaining 10% for testing, and then it will split these into cats and dogs for us. It'll also filter out any of the images that are corrupt, and as we can see here, two of them are zero length, so we'll ignore them. Just to make sure I got them right, and that my sums were correct, we can see that 90% of the images, 11,250 of each, are in training, and 1,250 of each are in testing.

I can then create my model, and it's just going to be a simple convolutional neural network with three layers of Conv2D combined with three layers of max pooling that get flattened and fed into a Dense layer. I'm going to compile this with RMSprop with a learning rate of 1e-3. You can tweak this, maybe to make it more accurate. I've set the loss to be binary cross-entropy because there are only two classes.

Next we're going to create our generators, and the generators are going to rescale the images by dividing by 255. There are better ways to rescale, but this will work for now. Then we're going to flow from the training directory to the training generator, and we're going to flow from the validation directory to the validation generator. Note that I'm setting batch sizes here, so that when we're training later on we'll load them in batches, so we can be a bit faster.

So next up, let me run that cell. Next we're just going to do the model.fit, passing it the training generator and the validation generator, and we're setting steps_per_epoch and validation_steps to match the number of batches that we have, to match the number of images, so this should run quite fast. In the case of training, it's going to take 90 steps of 250 images each, and it's going to take 10 steps of 250 images each in order to do validation. This gives us much faster training. If you were not to do this, it would be loading the images one by one into the GPU and there'd be a lot of wasted time. But as we can see, it's still a big and complex dataset, so it's taking maybe 45 seconds per epoch, so I'm just going to speed it up until we get to the end.

Before we do that, just note that you'll see warnings like this: "Possibly corrupt EXIF data". Don't worry about those. The EXIF data is additional metadata that goes onto the image, with things like the location of the image. So when the TIFF plugin is decoding those images, it's trying to read that, it sees it's corrupt, and it's giving us that warning. It's not going to impact your training in any way.

And now we see it's almost done. We're about three seconds to go in the final epoch. We've only trained for 15 epochs, and we can see it's just a little over a minute per epoch. Let's take a look at what our final figures are. We can see it was 97% on the training set and 82% on the validation set, so not bad. It is overfitting a bit, and we can work on that, but right now it looks pretty good on the validation set, even at 82%.

We can take a plot and see what our history looks like. We can see our validation accuracy against training accuracy, and we can even see our validation loss against training loss. The validation loss curve going up like this is a clear indication of overfitting, so we could do a bit of tweaking there, and we'll learn about that with image augmentation later.

And if we want to test a few images, we can do so. So I'm going to do this and choose some files. I'm going to go to my downloads, where I've downloaded a few files already. You can guess from their names what the contents are, and let's see how the classifier does with them. So we can see it thought puppy was a dog, it thought pug was a dog, it thought dog was a dog. It thought this cat was a dog, and it got both of these cats right. And if we take a look at cat 2536662, this was the one that it got wrong and thought was a dog. So it's an interesting picture, and maybe we can learn from this how to optimize it better for the future.

Now that you've seen it in action, here's the URL for the lab so you can try it out for yourself. As you saw in the screencast, there is room for improvement because of overfitting, and in the next video you'll learn some techniques for that. But have fun with the lab in the meantime, and I'll see you next time.
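
The split_data helper is only described in the walkthrough, not shown. As a rough illustration, a helper with that behavior could look like the sketch below. The function signature, the `seed` parameter, the choice to copy rather than move files, and the printed message are all assumptions made for this example; only the behavior (a random 90/10 split that skips zero-length files) comes from the walkthrough.

```python
import os
import random
import shutil


def split_data(source_dir, training_dir, testing_dir, split_size, seed=None):
    """Copy a random split of the non-empty files in source_dir.

    Hypothetical re-creation of the helper described in the walkthrough;
    returns (number of training files, number of testing files).
    """
    # Keep only regular, non-empty files; zero-length files are treated as corrupt.
    usable = []
    for name in sorted(os.listdir(source_dir)):
        path = os.path.join(source_dir, name)
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            usable.append(name)
        else:
            print(f"{name} is zero length or not a file, ignoring")

    # Shuffle a copy of the list so the split is random (seed is optional).
    rng = random.Random(seed)
    shuffled = rng.sample(usable, len(usable))
    n_train = int(len(shuffled) * split_size)

    os.makedirs(training_dir, exist_ok=True)
    os.makedirs(testing_dir, exist_ok=True)
    for name in shuffled[:n_train]:
        shutil.copyfile(os.path.join(source_dir, name), os.path.join(training_dir, name))
    for name in shuffled[n_train:]:
        shutil.copyfile(os.path.join(source_dir, name), os.path.join(testing_dir, name))
    return n_train, len(shuffled) - n_train
```

Calling it once per class, for example with the Cat folder as the source and a cats folder under each of training and testing, would produce the directory layout that flow_from_directory expects.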
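
The step counts quoted in the walkthrough (8 for the happy-or-sad exercise, 90 and 10 for cats versus dogs) all come from dividing the number of images by the batch size. A small sketch of that arithmetic follows; the helper name `steps_for` is made up for this example, and rounding up with `math.ceil` is an assumption for the case where the batch size does not divide evenly.

```python
import math


def steps_for(num_images, batch_size):
    """Number of batches needed to see every image once (rounds up)."""
    return math.ceil(num_images / batch_size)


# Figures quoted in the walkthrough.
train_images = 2 * 11250  # 11,250 cats + 11,250 dogs
val_images = 2 * 1250     # 1,250 cats + 1,250 dogs
batch_size = 250

steps_per_epoch = steps_for(train_images, batch_size)
validation_steps = steps_for(val_images, batch_size)
print(steps_per_epoch, validation_steps)  # 90 10

# The earlier happy-or-sad exercise: 80 images in batches of 10.
print(steps_for(80, 10))  # 8
```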
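
The "nine and a half million trainable parameters" figure can be sanity-checked by hand. The sketch below traces the image size through the three convolution and pooling stages and counts weights and biases. The filter counts (16, 32, 64) and the 150 by 150 by 3 input are from the walkthrough; the 3x3 kernels without padding, the 2x2 pooling, and the 512-unit hidden Dense layer are assumptions chosen for this example, so treat the exact total as illustrative.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square convolution kernel."""
    return kernel * kernel * in_channels * filters + filters


size, channels = 150, 3  # input is 150 x 150 with 3 color channels
total = 0

for filters in (16, 32, 64):
    total += conv_params(3, channels, filters)
    # Assumed: a 3x3 convolution without padding trims 2 pixels,
    # then 2x2 max pooling halves the size (rounding down).
    size = (size - 2) // 2
    channels = filters
    print(f"after {filters}-filter stage: {size} x {size} x {channels}")

flat = size * size * channels              # values fed into the Dense layer
dense_units = 512                          # assumed hidden layer width
total += flat * dense_units + dense_units  # hidden Dense layer
total += dense_units * 1 + 1               # single sigmoid output neuron

print(flat)   # 18496
print(total)  # 9494561
```

Under these assumptions almost all of the parameters sit in the first Dense layer, which is why changing the input size has such a large effect on the total.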
Original Description
Machine Learning Foundations is a free training course where you’ll learn the fundamentals of building machine learned models using TensorFlow.
In Episode 6 we’ll take what we learned about Convolutional Neural Networks in the previous few episodes and apply them to a computer vision scenario that was a Kaggle challenge not long ago--building a classifier for cats and dogs!
Dogs vs. Cats dataset → https://goo.gle/3g5mWdn
Exercise 4 answer → https://goo.gle/2ZhV6oq
Cats vs. Dogs example → https://goo.gle/2zTHWDu
TensorFlow is Google’s end-to-end open source machine learning platform. For more videos about TensorFlow, subscribe to the TF YouTube channel → https://goo.gle/TensorFlow
Machine Learning Foundations playlist → https://goo.gle/ml-foundations
Subscribe to Google Developers → https://goo.gle/developers