How to build custom Datasets for Images in Pytorch
Key Takeaways
Builds a custom dataset for images in PyTorch
Full Transcript
In this video we will implement a custom dataset to load our data. All the videos I've done so far have been about creating a model, saving a model, using transfer learning and so on, but a really big problem is loading the actual data. There are a few datasets that you can load very nicely from PyTorch, but let's say you have your own images, or maybe some dataset that you downloaded online. How do you actually load the data efficiently? That's the question we're going to try to answer in this video.

First, I have some standard code here that's working, except we have no way to load the data. The data we have is a dataset of cats and dogs that I've downloaded. In a CSV file I've stored the name of each image in this specific folder, and the second column is the target: 1 for dog and 0 for cat.

We're going to start by creating a class that will be used to load our dataset. We call it CatsAndDogsDataset and inherit from Dataset. Then we define __init__, and what we send in is a CSV file (the one I just showed), a root directory, which is the directory containing the images, and a transform, which is optional. We set self.annotations to the result of pandas' read_csv on the CSV file. We use pandas to help us read the CSV file; we're not going to do anything complicated. Then self.root_dir is just the root directory and self.transform is the transform.

The functions we really need are __len__, the length, and __getitem__. What we want __getitem__ to do is return a specific example: a specific image and the corresponding target for that image. The length is pretty easy: we return len(self.annotations). In our case we have 25,000 images, 50% cats and 50% dogs.

In __getitem__ we start with the image path: os.path.join of self.root_dir and self.annotations.iloc[index, 0]. Remember that a particular index is sent in. We don't choose the index; the data loading code does that for us. self.annotations is the CSV file, and we take row index and column 0, because the first column was the name of the image. Then image is io.imread(img_path). The y label is at the same row but in column 1, the second column; we convert it to an integer and wrap it in torch.tensor. Between that and the return we check if self.transform is set, and if so, image is self.transform(image). That's it: we've loaded the image, and if we sent in a transform it applies the transformation. Finally we return the image and the y label. So this class loads one image and the corresponding target for that image.

Now we can go back to where we load the data. From our custom dataset Python file we import CatsAndDogsDataset. Then dataset is CatsAndDogsDataset, and we send in the CSV file, which we called cats_dogs.csv, the root directory, which is cats_dogs_resized, and the transform. Since we load each example as an image, we need to convert it to a tensor, so the transform is transforms.ToTensor().

One more thing we can do is split into train_set and test_set. For this we can use torch.utils.data.random_split on the dataset; let's give 20,000 images to the training set and 5,000 to the test set. Then we define our train_loader to be a DataLoader. This is the standard way. All we've really done that's unique to our specific dataset is use the class we created; the rest is pretty much what we always do: dataset=train_set, batch_size=batch_size, shuffle=True. Then we just copy this for the test_loader and test_set, and that should be it.

So that's how we load data if we have images in a folder and a CSV file that gives the image name and the target corresponding to that specific image.

For fun, I've imported the GoogLeNet architecture with pre-training. I didn't cover this code, but it's pretty standard: we have the loss, we have the optimizer, we have some training of the network, and at the end we check accuracy on the training set and the test set. Let's try to train on this, just out of curiosity. On the 5,000 images in the test set we got about 96-97% accuracy, which is pretty cool.

Hopefully you were able to follow the video on how to load data if you've gathered your own dataset. If you have any questions, leave them in the comments below. Thank you so much for watching the video, and I hope to see you in another video.
Original Description
In this video we have downloaded images online and stored them in a folder together with a csv file and we want to load them efficiently with a custom Dataset in Pytorch.
Small Example of Dataset used in video:
https://www.kaggle.com/dataset/c75fbba288ac0418f7786b16e713d2364a1a27936e63f4ec47502d73d6ef30ab
Dataset (All images but not with csv file so you have to create it yourself):
https://www.kaggle.com/c/dogs-vs-cats/data
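Since the full dataset comes without the CSV file, one way to create it is sketched below. This assumes the training images are named like cat.0.jpg and dog.0.jpg, so that the label can be read from the file name prefix. The function name build_labels_csv and the header names are hypothetical. The labels follow the video: 1 for dog, 0 for cat.

```python
import csv
import os


def build_labels_csv(image_dir, csv_path):
    """Write one row per image: file name, then 1 for dog or 0 for cat.

    Assumes file names start with "cat." or "dog." (e.g. cat.0.jpg).
    Returns the number of image rows written.
    """
    rows = []
    for name in sorted(os.listdir(image_dir)):
        prefix = name.split(".")[0].lower()
        if prefix == "dog":
            rows.append((name, 1))
        elif prefix == "cat":
            rows.append((name, 0))
        # Files with any other prefix are skipped.

    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["file_name", "label"])  # hypothetical header names
        writer.writerows(rows)
    return len(rows)
```

The header row matters if the file is later read with pandas' read_csv, which treats the first row as column names by default; without a header, the first image would be silently dropped unless header=None is passed.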
✨ Free Resources that are great:
NLP: https://web.stanford.edu/class/cs224n/
CV: http://cs231n.stanford.edu/
Deployment: https://fullstackdeeplearning.com/
FastAI: https://www.fast.ai/
GitHub Repository:
https://github.com/aladdinpersson/Machine-Learning-Collection
▶️ You Can Connect with me on:
Twitter - https://twitter.com/aladdinpersson
LinkedIn - https://www.linkedin.com/in/aladdin-persson-a95384153/
Github - https://github.com/aladdinpersson