Loading Images Using PyTorch
Import torchvision #easiest_way
In this article, I’ll show how to load image data, which is really useful in real projects. PyTorch is a deep learning framework that is easy to use. Tensors are the building blocks of PyTorch and are similar to NumPy arrays or matrices. PyTorch can also run on a GPU, which can make data preprocessing faster, and that is one reason to use PyTorch in place of NumPy. We can also convert a NumPy array or matrix to a PyTorch tensor and vice versa.
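To make the NumPy comparison concrete, here is a minimal sketch of converting in both directions. The array values are made up for illustration, and the GPU step only runs on a GPU if one is available.

```python
import numpy as np
import torch

# A small NumPy array standing in for image-like data (values are illustrative).
array = np.arange(6, dtype=np.float32).reshape(2, 3)

# NumPy -> PyTorch: from_numpy shares memory with the source array.
tensor = torch.from_numpy(array)

# PyTorch -> NumPy: .numpy() works for CPU tensors and also shares memory.
back = tensor.numpy()

# Because memory is shared, an in-place change on the tensor shows up in the array.
tensor.mul_(2)

# Move to the GPU only if one is available; otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tensor_on_device = tensor.to(device)
print(tensor_on_device.device)
```

Note that the shared memory applies only to CPU tensors; a tensor moved to the GPU is a separate copy.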
Let us start,
I’ll be using a data set from Kaggle, i.e. cat and dog photos. Here is an example image from this data set:
1. Loading dependencies
The easiest way to load image data is by using datasets.ImageFolder from torchvision. For this we need to import the necessary packages: matplotlib.pyplot as plt (matplotlib is a 2D plotting library, used here to display images), torch, datasets and transforms from torchvision, and the helper module. Let’s see the code:
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import matplotlib.pyplot as plt
import torch
from torchvision import datasets, transforms
import helper
2. Transform
In general we use ImageFolder as
dataset = datasets.ImageFolder('path', transform=transform)
where 'path' is the path to the folder where the data is present. While loading data with ImageFolder we need to define some transforms, because the images are of different sizes and shapes and all images in the training set must be the same size for training. Therefore we resize with transforms.Resize() or crop with transforms.CenterCrop() or transforms.RandomResizedCrop(). We also need to convert all the images to PyTorch tensors, and for this we use transforms.ToTensor(). Finally, we combine these transforms into a pipeline with transforms.Compose(), which runs the list of transforms in sequence. So we define transform as:
transform = transforms.Compose([transforms.Resize(255),
transforms.CenterCrop(224),
transforms.ToTensor()])
3. Data Loaders
After creating the ImageFolder dataset, we have to pass it to a DataLoader. It takes a data set and returns batches of images and the corresponding labels. Here we can set batch_size, and shuffle (True/False) to control whether the data is reshuffled at every epoch. For this we pass the data set, batch_size, and shuffle to torch.utils.data.DataLoader() as below:
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)
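As a rough illustration of how batching works, the sketch below uses random tensors in place of real images. The TensorDataset, the image size, and the number of samples are assumptions chosen to keep the example small.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for an image dataset: 100 random 3x32x32 "images"
# with binary labels (0 or 1).
images = torch.rand(100, 3, 32, 32)
labels = torch.randint(0, 2, (100,))
dataset = TensorDataset(images, labels)

dataloader = DataLoader(dataset, batch_size=32, shuffle=True)

# One pass over the loader is one epoch; the last batch holds the remainder.
batch_sizes = [batch_images.shape[0] for batch_images, _ in dataloader]
print(batch_sizes)      # [32, 32, 32, 4]
print(len(dataloader))  # 4 batches per epoch
```

With shuffle=True the order of the samples changes from epoch to epoch, but the batch sizes stay the same.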
4. Testing Data Loader
Now, to test the data loader we need to run:
images, labels = next(iter(dataloader))
helper.imshow(images[0], normalize=False)
Here, the data loader is an iterable: to get data out of it, we need to loop through it or convert it to an iterator with iter() and call next().
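The helper module is not part of torch or torchvision; it appears to be a separate utility file, so helper.imshow only works if that file is available. If it is not, a minimal substitute could look like the sketch below. The function names tensor_to_hwc and show_tensor_image are my own and hypothetical.

```python
import numpy as np
import torch


def tensor_to_hwc(image):
    """Convert a (C, H, W) float tensor into an (H, W, C) NumPy array."""
    array = image.detach().cpu().numpy().transpose((1, 2, 0))
    # Clip so that matplotlib receives values in the [0, 1] range it expects.
    return np.clip(array, 0.0, 1.0)


def show_tensor_image(image, ax=None):
    """Display a (C, H, W) tensor with matplotlib and return the axes."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    if ax is None:
        _, ax = plt.subplots()
    ax.imshow(tensor_to_hwc(image))
    ax.axis("off")
    return ax


# Shape check with a random tensor standing in for one image from a batch.
example = torch.rand(3, 224, 224)
print(tensor_to_hwc(example).shape)  # (224, 224, 3)
```

With this in place, show_tensor_image(images[0]) would play the role of helper.imshow(images[0], normalize=False).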
Let us see the complete code for transforming and loading data:
data_dir = 'Cat_Dog_data/train'

transform = transforms.Compose([transforms.Resize(255),
                                transforms.CenterCrop(224),
                                transforms.ToTensor()])
dataset = datasets.ImageFolder(data_dir, transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

images, labels = next(iter(dataloader))
helper.imshow(images[0], normalize=False)
After Loading we will see an image from the dataset:
5. Introducing Randomness
We can randomly rotate, mirror, crop, and scale images during training. This helps our network generalize, as it sees the same images but in different locations, with different orientations and sizes.
To randomly rotate, scale, crop, and horizontally flip, we define transforms like this:
train_transforms = transforms.Compose([
transforms.RandomRotation(30),
transforms.RandomResizedCrop(224),
transforms.RandomHorizontalFlip(),
transforms.ToTensor()])
6. Normalizing the image
We can normalize the image with transforms.Normalize. For this, we need to pass a list of means and a list of standard deviations, one value per color channel. Each channel is then normalized as:
input[channel] = (input[channel] - mean[channel]) / std[channel]
Here, subtracting the mean centers the data near zero, and dividing by the standard deviation rescales it; for example, with a mean and standard deviation of 0.5 per channel, inputs in [0, 1] are mapped to values between -1 and 1. Normalizing helps keep the network weights near zero, which in turn makes backpropagation more stable. Without normalization, the network will often train poorly.
7. Final Result:
data_dir = 'Cat_Dog_data'

# Applying transformations
train_transforms = transforms.Compose([
    transforms.RandomRotation(30),
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor()])

test_transforms = transforms.Compose([transforms.Resize(255),
                                      transforms.CenterCrop(224),
                                      transforms.ToTensor()])

train_data = datasets.ImageFolder(data_dir + '/train',
                                  transform=train_transforms)
test_data = datasets.ImageFolder(data_dir + '/test',
                                 transform=test_transforms)

# Data loading
trainloader = torch.utils.data.DataLoader(train_data,
                                          batch_size=32)
testloader = torch.utils.data.DataLoader(test_data, batch_size=32)