Flower Species Classifier With PyTorch

Weng Seng
7 min read · Aug 6, 2019

Introduction to the Project

In this article I will show how to create an image classifier with PyTorch. I am going to use transfer learning, adapting a pre-trained neural network to classify 102 different species of flowers.

The flower data and information can be found here. The dataset consists of 102 folders, one per species. The code for this project can be found on my GitHub here.

Overview of the flower data, including the variation within a species:

[Images: English Marigold, English Marigold, Snapdragon]

In this project, the data is organized into 3 folders, each containing a subfolder for every flower species:

  1. Train
  2. Validation
  3. Test

My laptop is fairly old (an i5 with an NVIDIA CUDA 540), so it does not handle a larger dataset with a very deep neural network easily. The task takes a few hours to finish, which is very time consuming. So I decided to use Google Colab, which offers a free GPU.

Thanks Google!!!

First I upload the flower dataset to my Google Drive, then mount the drive in Colab:

from google.colab import drive
drive.mount('/content/drive')

Steps:

Step 1: Load Dataset
Step 2: Transform the Dataset
Step 3: Create Model
Step 4: Train Model
Step 5: Save the Model
Step 6: Load the Model
Step 7: Predict the Image
Step 8: Show the result

Step 1: Load Dataset

This step is simple: point to the data and assign the directory paths.

data_dir = '/content/drive/My Drive/Colab Notebooks/flowers'
train_dir = data_dir + '/train'
valid_dir = data_dir + '/valid'
test_dir = data_dir + '/test'

Udacity provides the label mapping in a JSON file, so I load it for later use.

import json

with open('/content/drive/My Drive/name.json', 'r') as f:
    name = json.load(f)
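
To make the mapping concrete, here is a minimal sketch of how such a JSON file maps class folder names to flower names. The two entries and the `label_to_name` variable are made up for illustration; the real file has one entry per species.

```python
import json

# Hypothetical excerpt of a label-mapping file: folder name -> flower name
raw = '{"1": "flower a", "2": "flower b"}'
label_to_name = json.loads(raw)

# The keys are strings, matching class folder names such as 'test/1/'
print(label_to_name["1"])
```

Because JSON object keys are always strings, a numeric class index has to be converted to its folder name before it can be looked up.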

Step 2: Transform the Dataset

For the training dataset I transform the original images with random rotation, random resized cropping, and random horizontal flipping. These augmentations are applied only to the training set, not to the validation and test sets, because their purpose is to make the trained model more robust. The validation and test images are just resized and center cropped.

The image values are then normalized with the per-channel means and standard deviations (the ImageNet statistics that the pre-trained network expects):

from torchvision import transforms

train_transforms = transforms.Compose([
    transforms.RandomRotation(30),
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])

test_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])

validation_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
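
As a rough worked example of what ToTensor followed by Normalize does to a single pixel: each 0–255 channel value is scaled to 0–1, then the channel mean is subtracted and the result is divided by the channel standard deviation. The pixel value below is made up, and `normalize_pixel` is a hypothetical helper, not a torchvision function.

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Scale an (R, G, B) pixel from 0-255 to 0-1, then standardize each channel."""
    return [(value / 255.0 - m) / s for value, m, s in zip(rgb, MEAN, STD)]

pixel = (255, 128, 0)  # made-up example pixel
normalized = normalize_pixel(pixel)
print([round(v, 3) for v in normalized])
```

A fully saturated channel ends up a little above 2, and an empty channel ends up a little below -2, so the network sees inputs roughly centered around zero.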

ImageFolder reads the images from each folder and applies the transformations, and DataLoader serves them in batches for training, validation, or prediction later.

Step 3: Create Model

For the pre-trained model I use the VGG19 architecture. Other popular pre-trained models include ResNet, AlexNet, and DenseNet-121.

from torchvision import models
model = models.vgg19(pretrained=True)

In VGG19, the convolutional feature layers produce 25088 values per image, which is the input size of the classifier. I want to predict 102 flower classes, so I use an ordered dictionary to define new final layers with 102 outputs. This new, untrained feed-forward network serves as the classifier, using ReLU activation and dropout. LogSoftmax turns the model outputs into log-probabilities for each image prediction.

from collections import OrderedDict
from torch import nn

# `dropout` is the dropout probability, set beforehand (its value is not shown here)
classifier = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(25088, 500)),
    ('relu', nn.ReLU()),
    ('dropout1', nn.Dropout(dropout)),
    ('fc2', nn.Linear(500, 102)),
    ('output', nn.LogSoftmax(dim=1))
]))
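
The 25088 figure can be checked with a little arithmetic. The convolutional part of VGG19 ends with 512 channels and contains five 2x2 max-pooling layers, each halving the height and width, so a 224x224 input shrinks to 7x7. A quick sketch:

```python
side = 224          # input height and width after cropping
for _ in range(5):  # five max-pooling layers, each halving the spatial size
    side //= 2

channels = 512      # channels in the last VGG19 convolutional block
flattened = channels * side * side
print(side, flattened)  # 7 25088
```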

Replace the classifier of VGG19:

model.classifier = classifier

The criterion is the loss function used to evaluate the model, and the optimizer is the method used to update the weights.

from torch import optim
criterion = nn.NLLLoss()
# `lr` is the learning rate, set beforehand (its value is not shown here)
optimizer = optim.Adam(model.classifier.parameters(), lr)
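
To see why NLLLoss pairs with a LogSoftmax output: LogSoftmax turns raw scores into log-probabilities, and the negative log-likelihood for one sample is simply the negated log-probability of the correct class. Here is a plain-Python sketch with made-up scores for three classes; the helper names are hypothetical, not PyTorch functions.

```python
import math

def log_softmax(scores):
    """Log-probabilities from raw scores, shifted by the max for numerical stability."""
    top = max(scores)
    log_total = top + math.log(sum(math.exp(s - top) for s in scores))
    return [s - log_total for s in scores]

def nll(log_probs, target):
    """Negative log-likelihood of the target class for a single sample."""
    return -log_probs[target]

scores = [2.0, 0.5, -1.0]   # made-up outputs of the last linear layer
log_probs = log_softmax(scores)
print(round(nll(log_probs, target=0), 3))  # smaller loss: class 0 has the top score
print(round(nll(log_probs, target=2), 3))  # larger loss: class 2 has the lowest score
```

PyTorch's NLLLoss averages this per-sample value over the batch by default.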

Step 4: Train Model

Here I train only the final layers. Each epoch is one full pass over the training data, with a feed-forward and a backpropagation step for every batch. The device is set to 'cuda', which means the code only runs on a GPU; to run on a CPU, set it to 'cpu' instead.
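
A common way to avoid hard-coding the device is to pick it at runtime. This is a sketch rather than what the code in this article does; the `device` variable is an assumption.

```python
import torch

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
```

The model and each batch can then be moved with `.to(device)` instead of `.to('cuda')`.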

The validation accuracy comes to around 83% across all the flower species. Higher accuracy could be achieved by tuning the model; the parameters to reconsider are the learning rate, the number of units in the classifier, the number of epochs, and the type of optimizer.

import torch

epochs = 10
steps = 0
print_every = 40

# change to gpu mode
model.to('cuda')

for e in range(epochs):
    running_loss = 0

    # Iterating over data to carry out training step
    for ii, (inputs, labels) in enumerate(trainloader):
        steps += 1

        inputs, labels = inputs.to('cuda'), labels.to('cuda')

        # zero parameter gradients
        optimizer.zero_grad()

        # Forward and backward passes
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

        # Carrying out validation step
        if steps % print_every == 0:
            # setting model to evaluation mode during validation
            model.eval()

            # Gradients are turned off as no longer in training;
            # validation() is a helper from the project code (not shown here)
            with torch.no_grad():
                valid_loss, accuracy = validation(model, validloader, criterion)

            print(f"No. epochs: {e+1}, "
                  f"Training Loss: {round(running_loss/print_every, 3)} "
                  f"Valid Loss: {round(valid_loss/len(validloader), 3)} "
                  f"Valid Accuracy: {round(float(accuracy/len(validloader)), 3)}")

            running_loss = 0

            # Turning training back on
            model.train()
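
The training loop calls a `validation` helper that is not shown in this article. The following is one possible sketch, assuming it returns the loss and accuracy summed over batches, which would match the division by `len(validloader)` in the print statement. The `device` parameter is an addition for illustration.

```python
import torch

def validation(model, loader, criterion, device="cpu"):
    """Return the summed loss and summed per-batch accuracy over a loader."""
    total_loss = 0.0
    total_accuracy = 0.0
    for inputs, labels in loader:
        inputs, labels = inputs.to(device), labels.to(device)
        log_probs = model(inputs)                 # model ends in LogSoftmax
        total_loss += criterion(log_probs, labels).item()
        predictions = log_probs.argmax(dim=1)     # most probable class per image
        total_accuracy += (predictions == labels).float().mean().item()
    return total_loss, total_accuracy
```

With a model that lives on the GPU, this sketch would be called as `validation(model, validloader, criterion, device='cuda')`.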

Step 5: Save the Model

Now that the network is trained, I need to save the model so I can load it later for making predictions.
PyTorch lets me save a checkpoint of my model with the torch.save function. The checkpoint here stores the classifier, the optimizer state, the number of epochs, model.state_dict, and class_to_idx.

model.state_dict keeps all of the weights and biases of the model for each layer in a dictionary.

class_to_idx keeps track of the mapping from flower class values to the flower indices.

checkpoint = {'state_dict': model.state_dict(),
              'classifier': model.classifier,
              'class_to_idx': train_data.class_to_idx,
              # call state_dict() so the optimizer state itself is saved, not the method
              'opt_state': optimizer.state_dict(),
              'num_epochs': epochs}
torch.save(checkpoint, '/content/drive/My Drive/Colab Notebooks/my_checkpoint1.pth')

Step 6: Load the Model

After saving the model to a .pth file, I should be able to load it back. To load it, I simply call PyTorch's load function with the file path.

checkpoint = torch.load(filepath)

# Checkpoint for when using CPU
#checkpoint = torch.load(filepath, map_location=lambda storage, loc: storage)

# Attach the saved classifier before loading the weights, so the layer
# shapes match the state dict when `model` is a freshly created VGG19
model.classifier = checkpoint['classifier']
model.load_state_dict(checkpoint['state_dict'])
model.class_to_idx = checkpoint['class_to_idx']
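
A quick way to check that saving and loading round-trips correctly is to try it on a tiny stand-in model. This sketch uses an in-memory buffer instead of a file and stores only the state dict and the class mapping; the small `nn.Linear` model and the mapping are made up for illustration.

```python
import io

import torch
from torch import nn

model = nn.Linear(4, 2)                      # tiny stand-in for the real network
checkpoint = {
    "state_dict": model.state_dict(),
    "class_to_idx": {"1": 0, "2": 1},        # made-up mapping
}

buffer = io.BytesIO()                        # in-memory stand-in for a .pth file
torch.save(checkpoint, buffer)
buffer.seek(0)

loaded = torch.load(buffer, map_location="cpu")
restored = nn.Linear(4, 2)                   # same architecture, fresh weights
restored.load_state_dict(loaded["state_dict"])
restored.class_to_idx = loaded["class_to_idx"]

print(torch.equal(model.weight, restored.weight))
```

The key point is that the receiving model must have the same layer shapes as the saved one before `load_state_dict` is called.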

Step 7: Predict the Image

Before predicting, I need to process the input image so that it can be used in the network.
I will use PIL to load the image.

First, I resize the image so that its shorter side is 256 pixels, using the Resize transform. Second, I crop out the center 224x224 portion of the image with CenterCrop.
The model expects floats in 0–1, but the color channels of images are encoded as integers 0–255; ToTensor takes care of this conversion. Then I normalize the image with Normalize: subtract the mean from each color channel, then divide by the standard deviation.
Lastly, I convert the result to a NumPy array.

def process_image(image):
    '''Scales, crops, and normalizes a PIL image for a PyTorch model,
    returns a NumPy array'''
    # Opening the image with PIL using the image file path (without extension)
    pil_im = Image.open(f'{image}' + '.jpg')
    # Building image transform
    transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
    # Transforming image for use with network
    pil_tfd = transform(pil_im)
    # Converting to NumPy array
    array_im_tfd = np.array(pil_tfd)
    return array_im_tfd

To check that the process_image function works, the imshow function below takes the processed image array and displays it.

I reorder the dimensions using ndarray.transpose, because PyTorch expects the color channel (RGB) to be the first dimension, whereas in a PIL image and in matplotlib it is the third dimension.

def imshow(image, ax=None, title=None):
    if ax is None:
        fig, ax = plt.subplots()

    # PyTorch tensors assume the color channel is the first dimension
    # but matplotlib assumes it is the third dimension
    image = image.transpose((1, 2, 0))

    # Undo preprocessing
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    image = std * image + mean

    # Image needs to be clipped between 0 and 1 or it looks like noise when displayed
    image = np.clip(image, 0, 1)

    ax.imshow(image)

    return ax
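
As a small check of the two steps inside imshow, this sketch builds a made-up image, normalizes it into channel-first layout, then moves the channel axis last and undoes the normalization. The array contents are arbitrary; only the shapes and the round trip matter.

```python
import numpy as np

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

# Made-up image in height x width x channel layout, values in 0-1
original = np.random.default_rng(0).random((224, 224, 3))

# What the preprocessing produces: normalized and channel-first (3, 224, 224)
normalized_chw = ((original - mean) / std).transpose((2, 0, 1))

# What imshow does: channel-last again, then undo the normalization
restored = np.clip(std * normalized_chw.transpose((1, 2, 0)) + mean, 0, 1)

print(normalized_chw.shape, restored.shape)
print(np.allclose(restored, original))
```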

I run the function to check whether the image pre-processing was successful:

image_name = data_dir + '/test' + '/1/' + 'image_06743'
img_test = process_image(image_name)
imshow(img_test)

[Image: The Result]

Class Prediction

Once the image is in the correct format, it’s time to make predictions with the model. A common practice is to predict the top 5 (usually called top-K) most probable classes. I need to calculate the class probabilities and then find the highest values.

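img = process_image(image_path)
img_tensor = torch.from_numpy(img).type(torch.FloatTensor)
# Adding a batch dimension of size 1
img_add_dim = img_tensor.unsqueeze_(0)
# Forward pass in evaluation mode; the model and the input must be on the same device
loaded_model.eval()
with torch.no_grad():
    output = loaded_model(img_add_dim)
# LogSoftmax outputs log-probabilities, so exp() recovers the probabilities
probs = torch.exp(output)
probs_top = probs.topk(5)[0]
index_top = probs.topk(5)[1]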

I changed the tensors to numpy arrays:

probs_top_list = np.array(probs_top)[0]
index_top_list = np.array(index_top[0])

Then I load the class-to-index mapping and invert it, swapping the keys and the values:

class_to_idx = loaded_model.class_to_idx
# Inverting index-class dictionary
indx_to_class = {x: y for y, x in class_to_idx.items()}

I convert the index list to a class list by looking up each of the top 5 indices in indx_to_class:

classes_top_list = []
for index in index_top_list:
    classes_top_list += [indx_to_class[index]]
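
The whole post-processing chain can be sketched in plain Python on made-up numbers: exponentiate the log-probabilities, take the k largest, and map their indices back to class labels. `top_k_classes` is a hypothetical helper, and the four-class mapping is invented for the example.

```python
import math

def top_k_classes(log_probs, class_to_idx, k=5):
    """Return (probabilities, class labels) for the k most probable classes."""
    idx_to_class = {idx: cls for cls, idx in class_to_idx.items()}
    probs = [math.exp(lp) for lp in log_probs]
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [probs[i] for i in order], [idx_to_class[i] for i in order]

# ImageFolder sorts folder names as strings, so '10' comes before '2'
class_to_idx = {"1": 0, "10": 1, "2": 2, "3": 3}
log_probs = [math.log(p) for p in (0.1, 0.6, 0.05, 0.25)]

probs, classes = top_k_classes(log_probs, class_to_idx, k=2)
print(classes)  # ['10', '3']
```

This is why the inversion matters: the index predicted by the network is a position in the sorted folder list, not the folder name itself.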

Step 8: Show the result

Use matplotlib to plot the probabilities for the top 5 classes as a bar graph, along with the input image.

# Plotting test image and predicted probabilities
f, ax = plt.subplots(2, figsize=(6, 10))

ax[0].imshow(image)
ax[0].set_title(names[0])

y_names = np.arange(len(names))
ax[1].barh(y_names, probs, color='darkblue')
ax[1].set_yticks(y_names)
ax[1].set_yticklabels(names)
ax[1].invert_yaxis()

plt.show()

Here is an example result that the model predicted correctly:

image_path = data_dir + '/test' + '/1/' + 'image_06743'

[Image: The Top 5 Classes Prediction]
