
Image Classification with TensorFlow 2

Classifying images with deep learning using TensorFlow and Keras API | Introduction to computer vision

Tirendaz Academy
Apr 16

Introduction to Computer Vision

Computer vision is used in many fields, such as robotics, healthcare, drones, driverless cars, sports, and entertainment.

In my previous article, I introduced deep learning with TensorFlow.

In this article, I will explain how to classify images using deep neural networks.

Fashion MNIST Dataset


The dataset I’m going to use is the Fashion MNIST dataset. It has the same format as the classic MNIST dataset of handwritten digits, but it contains images of fashion items instead of digits.

The images in Fashion MNIST are also more varied than those in the classic MNIST dataset, which makes them more difficult to classify.

Loading the Fashion MNIST Dataset

Working with the Fashion MNIST dataset is often called the “Hello World” of computer vision.

This dataset is relatively small, which makes it easy to build and test a computer vision model. You can load it directly with TensorFlow.

import tensorflow as tf
fashion_mnist = tf.keras.datasets.fashion_mnist
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

When the dataset is loaded, four NumPy arrays are returned. The X_train and y_train arrays are used to train the model, and the X_test and y_test arrays are used to test it. The pixel values in the images are integers from 0 to 255.

Let’s look at the shape and data type of the training and test set.

print(X_train.shape, X_test.shape)  # (60000, 28, 28) (10000, 28, 28)
print(X_train.dtype)                # uint8

Data Preprocessing

Data preprocessing is one of the most important steps of data analysis.

The labels in the dataset consist of numbers. Let’s assign the names of fashion items corresponding to these numbers to a variable.

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress",
               "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

Let’s use the matplotlib library to see the second image.

import matplotlib.pyplot as plt
plt.figure()
plt.imshow(X_train[1])
plt.colorbar()
plt.grid(False)
plt.show()

When you look at the second image in the training dataset, you can see that the pixel values fall between 0 and 255.
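To get a broader look at the data, you can also plot the first 25 training images with their class names below them. This is a minimal sketch that reuses the class_names list defined above:

# Plot the first 25 training images with their class names
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.xticks([])
    plt.yticks([])
    plt.imshow(X_train[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[y_train[i]])
plt.show()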

Normalizing the Data Set

Neural networks train more easily when the inputs are small, so let's scale the pixel values to the 0-1 range by dividing them by 255. Note that the training and test sets must be preprocessed in the same way.

X_train = X_train / 255.0
X_test = X_test / 255.0

Building the Model

The basic building block of a neural network is the layer. Layers extract representations from the data fed into them, and you hope these representations are meaningful for the problem at hand. Most deep learning models are built by chaining layers together. Let's start building the model.

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28), name="Input"),
    tf.keras.layers.Dense(128, activation='relu', name="Hidden"),
    tf.keras.layers.Dense(10, name="Output")
])

Let’s go through this code line by line.

(1) The first line creates a Sequential model, the simplest kind of Keras model, in which the layers are simply stacked in order.

(2) The next line adds a Flatten layer to the model. This layer converts each 28 x 28 pixel input image into a one-dimensional array of 784 values (28 * 28 = 784). It takes no parameters; it only reshapes the data.

(3) In the next line, I added a hidden Dense layer with 128 neurons and a ReLU activation function. Each neuron in a Dense layer is connected to every neuron in the previous layer, and the layer keeps its own weight matrix holding all the weights between its inputs and outputs.

(4) Finally, I added a Dense layer with 10 neurons, one per class. This last layer returns an array of 10 logits; each value is a score indicating how strongly the model believes the image belongs to the corresponding class. The model’s summary() method shows all the layers of the model along with their names.

model.summary()

If you do not name a layer, Keras generates its name automatically.

None in the output shapes means that the batch size can be anything. The total number of parameters is shown at the end of the summary.
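You can verify the parameter count by hand: a Dense layer has one weight for every input-output pair plus one bias per neuron. A quick check:

# Hidden layer: 784 inputs x 128 neurons, plus 128 biases
hidden_params = 784 * 128 + 128   # 100480
# Output layer: 128 inputs x 10 neurons, plus 10 biases
output_params = 128 * 10 + 10     # 1290
print(hidden_params + output_params)  # 101770, the total shown by model.summary()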

You can easily get the model's list of layers and fetch a layer either by its index or by its name:

model.layers
hidden = model.layers[1]
print(hidden.name)
print(model.get_layer("Hidden") is hidden)  # fetching by name returns the same layer

All the parameters of a layer can be accessed with the get_weights() and set_weights() methods. Let's look at both the weights and the biases of the hidden layer.

weights, biases = hidden.get_weights()
print(weights)
print(biases)
weights.shape, biases.shape

Notice that the weights of the hidden Dense layer are random and the biases are initialized to zero. You can control how the weights and biases are initialized with the kernel_initializer and bias_initializer arguments, respectively. More information about initializers can be found in the Keras documentation.
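For example, a hidden layer with He initialization for the weights and zeros for the biases could be defined as follows; this is just a sketch, and "he_normal" is only one of several built-in initializers:

# A Dense layer with explicit weight and bias initializers
layer = tf.keras.layers.Dense(128,
                              activation='relu',
                              kernel_initializer='he_normal',  # weights drawn from a He normal distribution
                              bias_initializer='zeros')        # biases start at zero (also the default)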

Compiling the Model

The loss function measures how accurately the model predicts during training. We want to minimize this function to steer the model in the right direction. The optimizer updates the model based on the loss function and the data it sees. The metrics argument specifies what to monitor during the training and testing steps.

model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              optimizer='adam',
              metrics=['accuracy'])

Let’s go through this code.

(1) I used SparseCategoricalCrossentropy as the loss function because the labels are integers from 0 to 9. If you encode the labels with one-hot encoding, you can use the CategoricalCrossentropy loss instead.

(2) I used “adam” as the optimizer, a popular choice in recent years.

(3) Since the problem we are dealing with is classification, I used accuracy as the metric.

Training the Model

The model learns the relationship between images and labels during training.

history = model.fit(X_train, y_train,
                    epochs=10,
                    validation_split=0.1)

The training set is used to fit the model, and the validation data is used to evaluate it during training. The validation_split argument sets aside part of the training data for validation; by passing 0.1, I reserved 10 percent of the training data for validation.

While the model trains, the loss and accuracy metrics are shown at the end of each epoch. Monitoring these metrics is useful for seeing the actual performance of the model. If the model's accuracy on the training set is much better than on the validation set, there may be an overfitting problem.

That’s it. I trained the model.

As you can see, the loss value decreases with each epoch, which means the model is learning from the data. After 10 epochs, the training and validation accuracies are printed on the screen.

If the training accuracy and validation accuracy values are close to each other, there is no serious overfitting problem.
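If validation accuracy starts to fall while training accuracy keeps rising, you can also stop training automatically. Here is a minimal sketch using Keras' EarlyStopping callback with the same model and data; the patience value of 3 is an illustrative choice, not a tuned one:

# Stop training when the validation loss has not improved for 3 epochs in a row
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                              patience=3,
                                              restore_best_weights=True)  # roll back to the best epoch
history = model.fit(X_train, y_train,
                    epochs=50,
                    validation_split=0.1,
                    callbacks=[early_stop])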

The fit() method returns a History object containing the training parameters. Its history attribute is a dictionary containing the loss and metric values measured on the training and validation sets after each epoch. If you convert this dictionary into a Pandas DataFrame and call its plot() method, you can plot the training curves.

import pandas as pd
pd.DataFrame(history.history).plot(figsize = (8, 5))
plt.grid(True)
plt.show()
Accuracy and loss for train and validation sets

As the graph shows, the model's accuracy increases on both the training and validation data, while the loss decreases on both.

If the model is not performing well, you can tune the hyperparameters. The first parameter to check is the learning rate. If changing it doesn't help, you can try a different optimizer. If the model's performance still does not improve, you can change the number of layers, the number of neurons in each layer, and the activation function of the hidden layers. You can also tune the batch_size argument of the fit() method, which defaults to 32.
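For example, to lower the learning rate, you can pass an optimizer instance instead of the string “adam”; the value 1e-4 below is only an illustration, not a tuned setting:

# Compile with an explicit learning rate instead of Adam's default of 0.001
model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              metrics=['accuracy'])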

Evaluating the Model

Let’s call the evaluate() method and evaluate the model on the test set.

test_loss, test_acc = model.evaluate(X_test, y_test, verbose=2)
print('\nTest accuracy:', test_acc)

The accuracy of the model on the test set is slightly lower than on the training data. This gap between the training and test accuracies indicates some overfitting, meaning that the model has partly memorized the training data: it predicts the training data well but does worse on data it has not seen before. Regularization techniques such as L1 or L2 regularization, or dropout, can be used to overcome this problem.
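As a sketch, here is the same architecture with L2 regularization on the hidden layer and a Dropout layer added; the regularization factor 1e-4 and the dropout rate 0.2 are illustrative values, not tuned ones:

# The same model with L2 weight regularization and dropout to reduce overfitting
regularized_model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu',
                          kernel_regularizer=tf.keras.regularizers.l2(1e-4)),
    tf.keras.layers.Dropout(0.2),  # randomly zero out 20% of activations during training
    tf.keras.layers.Dense(10)
])
# Compile and fit this model the same way as before.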

Making a Prediction

The model outputs raw logits, so let's attach a Softmax layer to the trained model to convert the logits into probabilities, which are easier to interpret.

probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])

Let’s make predictions on the test data with this model.

predictions = probability_model.predict(X_test)

So the model has predicted a label for every image in the test set. Let's look at the first prediction.

predictions[0]

Note that an array of 10 values was returned, one probability for each fashion item class. You can find the label with the highest probability using NumPy's argmax() function.

import numpy as np
np.argmax(predictions[0])

The model predicted that the first image is an ankle boot. Let's check the actual label of the first image.

y_test[0]

As you can see, the model made the correct prediction.
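You can apply the same idea to the whole test set: taking the argmax of every row of predictions and comparing it with y_test reproduces the test accuracy reported by evaluate():

# Predicted class for every test image, compared with the true labels
predicted_labels = np.argmax(predictions, axis=1)
accuracy = np.mean(predicted_labels == y_test)
print(accuracy)  # should match the test accuracy from model.evaluate()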

That’s it. In summary, I covered the following topics for image classification using the Fashion MNIST dataset in this article:

  • Building the model
  • Tuning hyperparameters
  • Evaluating the model
  • Making predictions on new images

I hope you enjoyed the post. Thank you for reading my article. You can access the notebook I used in this article on my GitHub or Kaggle pages.


Don’t forget to follow us on GitHub 🌱, Twitter 😎, Kaggle 📚, LinkedIn 👍

See you in the next post …
