Loading Image Data Using PyTorch

Import torchvision #easiest_way

Archit
Secure and Private AI Writing Challenge
4 min read · Jul 12, 2019


In this article, I’ll show how to load image data, which is really useful when working on real projects. PyTorch is a recently released deep learning framework and is easy to use. Tensors are the building blocks of PyTorch and are similar to NumPy arrays or matrices. PyTorch can also use a GPU, which makes data preprocessing faster, and that is why we can use PyTorch in place of NumPy. We can also convert a NumPy array or matrix to a PyTorch tensor and vice versa.
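
For instance, converting in both directions is a one-liner; a minimal sketch (torch.from_numpy shares memory with the source array):

import numpy as np
import torch

a = np.array([[1.0, 2.0], [3.0, 4.0]])  # a NumPy matrix
t = torch.from_numpy(a)                 # NumPy array -> PyTorch tensor (shares memory)
b = t.numpy()                           # PyTorch tensor -> NumPy array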

Let us start.
I’ll be using a data set from Kaggle, i.e. the cat and dog photos. Here is an example image from this data set:

woff_meow

1. Loading dependencies

The easiest way to load image data is with datasets.ImageFolder from torchvision, so first we need to import the necessary packages. Here I import matplotlib.pyplot as plt (matplotlib is a 2D plotting library used to display images), as well as torch, datasets and transforms from torchvision, and a helper module. Let’s see the code:

%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import matplotlib.pyplot as plt
import torch
from torchvision import datasets, transforms
import helper
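
The helper module is not part of torchvision; it is a small custom plotting utility. If you don’t have it, a minimal substitute for its imshow could look roughly like this (a sketch, assuming it receives a tensor of shape (C, H, W)):

import numpy as np
import matplotlib.pyplot as plt

def imshow(image, ax=None, normalize=True):
    """Display a PyTorch image tensor of shape (C, H, W)."""
    if ax is None:
        fig, ax = plt.subplots()
    image = image.numpy().transpose((1, 2, 0))  # matplotlib expects (H, W, C)
    if normalize:
        # undo a Normalize transform; ImageNet statistics assumed here
        mean = np.array([0.485, 0.456, 0.406])
        std = np.array([0.229, 0.224, 0.225])
        image = np.clip(std * image + mean, 0, 1)
    ax.imshow(image)
    ax.axis('off')
    return ax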

2. Transform

In general, we use ImageFolder as follows:

dataset = datasets.ImageFolder('path', transform=transform)

where ‘path’ is the path to the data set, i.e. the folder where the data is located. While loading data with ImageFolder we also need to define some transforms: the images come in different sizes and shapes, and all images in the training set must have the same size for training. We can resize with transforms.Resize() or crop with transforms.CenterCrop() or transforms.RandomResizedCrop(), and we need to convert all the images to PyTorch tensors, which is what transforms.ToTensor() does. Finally, we combine these transforms into a pipeline with transforms.Compose(), which runs the list of transforms in sequence. So we define the transform as:

transform = transforms.Compose([transforms.Resize(255),
                                transforms.CenterCrop(224),
                                transforms.ToTensor()])
Rescale, Crop and compose
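
Once the dataset is built, you can inspect it directly; ImageFolder turns each sub-folder of the data directory into a class label. A quick check (assuming the Cat_Dog_data/train layout used below):

dataset = datasets.ImageFolder('Cat_Dog_data/train', transform=transform)
print(len(dataset))        # number of images found
print(dataset.classes)     # sub-folder names become the class labels
image, label = dataset[0]  # a single (tensor, class-index) pair
print(image.shape, label)  # torch.Size([3, 224, 224]) and an integer label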

3. Data Loaders

After loading the data with ImageFolder, we have to pass the dataset to a DataLoader. It takes a data set and returns batches of images and corresponding labels. Here we can set the batch_size and whether to shuffle the data after each epoch (True/False). For this we pass the data set, batch_size, and shuffle to torch.utils.data.DataLoader() as below:

dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

4. Testing Data Loader

Now, to test the data loader we need to run:

images, labels = next(iter(dataloader))
helper.imshow(images[0], normalize=False)

Here, the data loader is a generator; to get data out of it, we need to loop through it or convert it to an iterator and call next().
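
For instance, in a training loop you would normally iterate over the loader directly; a minimal sketch:

for images, labels in dataloader:
    print(images.shape, labels.shape)  # one batch, e.g. torch.Size([32, 3, 224, 224]) and torch.Size([32])
    break                              # the real training step would go here instead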

Data Loader

Let us see the complete code for transforming and loading data:

data_dir = 'Cat_Dog_data/train'

transform = transforms.Compose([transforms.Resize(255),
                                transforms.CenterCrop(224),
                                transforms.ToTensor()])
dataset = datasets.ImageFolder(data_dir, transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)
images, labels = next(iter(dataloader))
helper.imshow(images[0], normalize=False)

After loading, we will see an image from the data set:

Output

5. Introducing Randomness

We can randomly rotate, mirror, crop, and scale images during training, which helps the network generalize, since it sees the same image in different locations, orientations, and sizes.

To randomly rotate, scale, crop, and horizontally flip, we define the transforms like this:

train_transforms = transforms.Compose([transforms.RandomRotation(30),
                                       transforms.RandomResizedCrop(224),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.ToTensor()])

6. Normalizing the image

We can normalize images with transforms.Normalize. For this, we pass a list of per-channel means and a list of per-channel standard deviations; each color channel is then normalized as:

input[channel] = (input[channel] - mean[channel]) / std[channel]

Here, subtracting the mean centers the data around zero, and dividing by the standard deviation squishes the values to lie roughly between -1 and 1. Normalizing helps keep the network weights near zero, which in turn makes backpropagation more stable. Without normalization, the network will usually fail to learn properly.
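
In code, Normalize takes the list of per-channel means and the list of per-channel standard deviations, and it has to come after ToTensor in the pipeline; a sketch using the widely used ImageNet statistics purely as an example:

normalize = transforms.Normalize([0.485, 0.456, 0.406],  # per-channel means (ImageNet values, as an example)
                                 [0.229, 0.224, 0.225])  # per-channel standard deviations

train_transforms = transforms.Compose([transforms.RandomResizedCrop(224),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.ToTensor(),  # Normalize expects a tensor
                                       normalize])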

7. Final Result:

data_dir = 'Cat_Dog_data'

# Applying transformations
train_transforms = transforms.Compose([transforms.RandomRotation(30),
                                       transforms.RandomResizedCrop(224),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.ToTensor()])
test_transforms = transforms.Compose([transforms.Resize(255),
                                      transforms.CenterCrop(224),
                                      transforms.ToTensor()])

train_data = datasets.ImageFolder(data_dir + '/train', transform=train_transforms)
test_data = datasets.ImageFolder(data_dir + '/test', transform=test_transforms)

# Data loading
trainloader = torch.utils.data.DataLoader(train_data, batch_size=32)
testloader = torch.utils.data.DataLoader(test_data, batch_size=32)
Result
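
As a quick sanity check, you can pull one batch from each loader and display an image, reusing the helper from above (a sketch):

train_images, train_labels = next(iter(trainloader))
test_images, test_labels = next(iter(testloader))
print(train_images.shape, test_images.shape)  # both torch.Size([32, 3, 224, 224])
helper.imshow(train_images[0], normalize=False)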

Resources

  1. Data set: https://www.kaggle.com/c/dogs-vs-cats
  2. Documentation: https://pytorch.org/docs/0.3.0/torchvision/transforms.html
  3. Tutorial: https://pytorch.org/tutorials/beginner/data_loading_tutorial.html
