Image Classification with TensorFlow 2
Classifying images with deep learning using TensorFlow and Keras API | Introduction to computer vision
Introduction to Computer Vision
Computer vision is one of the most important research areas in machine learning. With smartphones and tablets, people take pictures every day and upload them to social media platforms, producing a huge volume of images that is far too large to analyze by hand.
Computer vision is used in many fields such as robotics, healthcare, drones, driver-less cars, sports and entertainment.
In my previous article, I introduced deep learning with TensorFlow.
Introduction to Deep Learning with TensorFlow 2
Machine learning and deep learning with TensorFlow and Keras
In this article, I will explain classifying images using deep neural networks.
Fashion MNIST Dataset
The dataset I’m going to use is the Fashion MNIST dataset. It has the same format as the classic MNIST dataset of handwritten digits, but it contains images of fashion items instead of digits.
The classes in the Fashion MNIST dataset show more variety than those in classic MNIST. As a result, classifying the images in this dataset is harder than classifying the classic MNIST digits.
Loading the Fashion MNIST Dataset
The Fashion MNIST dataset consists of 70,000 grayscale images and 10 categories. 60,000 images are used to train the network and 10,000 images to evaluate how accurately the network learned to classify images. The images show fashion items measuring 28 x 28 pixels.
Analysis using the Fashion MNIST dataset is the “Hello World” of computer vision.
This data set is relatively small in size, making it easy to build and test a computer vision model. You can directly load this dataset with TensorFlow.
import tensorflow as tf
fashion_mnist = tf.keras.datasets.fashion_mnist
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()
When the dataset is loaded, four NumPy arrays are returned. The X_train and y_train arrays are used to train the model. The X_test and y_test arrays are used to test the model. Pixels in images are integers from 0 to 255.
Let’s look at the shape and data type of the training and test set.
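Assuming the arrays were loaded as above, a quick check looks like this:

```python
import tensorflow as tf

fashion_mnist = tf.keras.datasets.fashion_mnist
(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

# 60,000 training images and 10,000 test images, each 28 x 28 pixels
print(X_train.shape)  # (60000, 28, 28)
print(X_test.shape)   # (10000, 28, 28)
print(X_train.dtype)  # uint8
```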
Preprocessing the Data
Data preprocessing is one of the most important steps of data analysis, and the data must be preprocessed before it is fed to the network.
The labels in the dataset consist of numbers. Let’s assign the names of fashion items corresponding to these numbers to a variable.
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress",
               "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
Let’s use the matplotlib library to see the second image.
import matplotlib.pyplot as plt
When you look at the second image in the training dataset, you can see that the pixel values are between 0 and 255.
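A minimal sketch for displaying that image; the binary colormap and the color bar are just one way to make the 0-255 pixel values visible:

```python
import tensorflow as tf
import matplotlib.pyplot as plt

(X_train, y_train), (X_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()

# Show the second training image (index 1); the color bar reveals
# that the pixel values range from 0 to 255
plt.figure()
plt.imshow(X_train[1], cmap="binary")
plt.colorbar()
plt.show()
```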
Normalizing the Data Set
Let’s scale the inputs to increase the training speed and performance of the model. You can do this by simply dividing the pixel values in the dataset by 255.0.
X_train = X_train / 255.0
X_test = X_test / 255.0
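Assuming the dataset was loaded as before, a quick sanity check confirms the new range:

```python
import tensorflow as tf

(X_train, y_train), (X_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()

# Scale the pixel values from the 0-255 range down to the 0-1 range
X_train = X_train / 255.0
X_test = X_test / 255.0

print(X_train.min(), X_train.max())  # 0.0 1.0
```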
Building the Model
To build a neural network, you first configure the layers of the model and then compile it. Let’s define the layers of the model using the Sequential API.
The basic building block of a neural network is the layer. Layers extract representations from the data, and you hope these representations are meaningful for the problem you are dealing with. Most deep learning models are formed by chaining layers together. Let’s start building the model.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28), name="Input"),
    tf.keras.layers.Dense(128, activation='relu', name="Hidden"),
    tf.keras.layers.Dense(10, name="Output")
])
Let’s go through this code line by line.
(1) The first line creates a Sequential model. The Sequential model is Keras’ simplest model: its layers are stacked in order, one after another.
(2) In the next line, I added a Flatten layer to the model. The Flatten layer converts each 28 x 28 pixel input image into a 1-dimensional array of 784 values (28 * 28 = 784). This layer has no parameters to learn; it only reshapes the format of the data.
(3) In the next line, I added a hidden Dense layer with 128 neurons to the model and used the ReLU activation function in this layer. A Dense layer is fully connected: each of its neurons connects to all neurons in the previous layer. Each Dense layer has its own weight matrix, which contains all the weights between its inputs and outputs.
(4) Finally, I added a Dense layer with 10 neurons, one neuron per class. This last layer returns an array of 10 logits; each neuron produces a score indicating how strongly the image belongs to the corresponding class. The model’s summary() method shows all layers of the model along with their names.
If you do not name the layer, the layer name is automatically generated by Keras.
None in the output shapes means that the batch size can be anything. The total number of parameters is shown at the end of the summary.
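For reference, calling summary() on the model built above prints each layer with its output shape and parameter count; the counts follow directly from the layer sizes:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28), name="Input"),
    tf.keras.layers.Dense(128, activation='relu', name="Hidden"),
    tf.keras.layers.Dense(10, name="Output")
])

# Flatten has 0 parameters; Hidden has 784 * 128 + 128 = 100,480;
# Output has 128 * 10 + 10 = 1,290; total 101,770
model.summary()
```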
You can easily get a model’s list of layers with model.layers and fetch a layer by its index, or you can fetch it by name:
hidden = model.layers[1]
All parameters of a layer can be accessed with the get_weights() and set_weights() methods. Let’s look at both the weights and the biases of the hidden layer.
weights, biases = hidden.get_weights()
Notice that the weights of the Dense layer are random and the biases are initialized to zero. You can also control how the weights and biases are initialized using the kernel_initializer and bias_initializer arguments, respectively. More information about these arguments can be found in the Keras documentation.
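For illustration, here is how those arguments are passed to a Dense layer; the values shown, glorot_uniform and zeros, are in fact Keras’ defaults:

```python
import tensorflow as tf

# "glorot_uniform" and "zeros" are the Dense layer's default
# initializers; any other Keras initializer name can be passed instead
layer = tf.keras.layers.Dense(
    128,
    activation="relu",
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
)
```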
Compiling the Model
Before starting the training of the model, it is necessary to compile the model with the compile() method. When compiling, the loss function and optimizer are specified. Optionally, extra metrics can be added to monitor during training and evaluation.
The loss function measures how accurately the model predicts during training. We want to minimize this function in order to direct the model in the right direction. The optimizer updates the model based on the loss function and the data it sees.
The metrics argument is used to monitor training and testing steps.
model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              optimizer='adam',
              metrics=['accuracy'])
Let’s go through these codes.
(1) I used SparseCategoricalCrossentropy as the loss function because the labels are integers from 0 to 9. If you encode the labels with one-hot encoding, you should use the CategoricalCrossentropy loss function instead.
(2) I used “adam” as the optimizer, which has been popular in recent years.
(3) Since the problem we are dealing with is classification, I used the metric measure of accuracy.
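To make the one-hot case concrete, here is a small sketch using tf.keras.utils.to_categorical; the labels here are made-up examples:

```python
import tensorflow as tf

# One-hot encode integer labels 0-9; with one-hot targets,
# CategoricalCrossentropy replaces SparseCategoricalCrossentropy
labels = [0, 3, 9]
one_hot = tf.keras.utils.to_categorical(labels, num_classes=10)
print(one_hot[1])  # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```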
Training the Model
Now we can train the model by calling the fit() method.
The model learns the relationship between images and labels during training.
history = model.fit( X_train, y_train,
epochs = 10,
validation_split = 0.1)
The training set is used while training the model, and the model is evaluated with the validation data. You can set aside some of the training data for validation with the validation_split argument. By passing 0.1, I reserved 10 percent of the training data for validation.
While training the model, loss and accuracy metrics were shown at the end of each epoch. Monitoring these metrics is useful for seeing the actual performance of the model. If the accuracy of the model in the training set is better than the validation set, there may be an overfitting problem.
That’s it. I trained the model.
As you can see, the loss value decreases in each epoch, which means the model is learning from the data. After 10 epochs, the training and validation accuracies were printed on the screen.
If the accuracy and validation accuracy values are close to each other, it means that there is no overfitting problem.
The fit() method returns a History object containing the training parameters. history.history is a dictionary that includes the loss and metric values measured after each epoch on the training and validation sets. If you convert this dictionary to a Pandas DataFrame and call its plot() method, you can plot the training curves.
import pandas as pd
pd.DataFrame(history.history).plot(figsize = (8, 5))
As you can see from the graph, the accuracy of the model increases in training and validation data, while the loss on training and validation decreases.
If the model is not performing well, you can tune the hyperparameters. The first parameter you should check is the learning rate. If changing this parameter doesn’t work, you can choose a different optimizer. If the model’s performance still does not improve, you can change the number of layers, the number of neurons in each layer, and the activation function in the hidden layers. You can also set the batch_size argument of the fit() method, whose default value is 32.
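As a sketch, here is how a smaller learning rate and a different batch size might be tried; the values 1e-4 and 64 are illustrative choices, not recommendations:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])

# Adam's default learning rate is 0.001; a smaller value is often
# the first hyperparameter worth trying
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    metrics=["accuracy"],
)

# history = model.fit(X_train, y_train, epochs=10, batch_size=64,
#                     validation_split=0.1)
```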
Evaluating the Model
I built the model with the training data. You might want to see how the model predicts data it hasn’t seen before. To evaluate the model, a test set not used during training is used.
Let’s call the evaluate() method and evaluate the model using the test set.
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=2)
print('\nTest accuracy:', test_acc)
The accuracy of the model on the test set is slightly lower than on the training data. This difference between training and test accuracies indicates an overfitting problem. Overfitting means the model is memorizing: while it predicts the training data well, it cannot predict data it has not seen before. Regularization techniques such as L1, L2, or dropout can be used to overcome this problem.
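For example, L2 regularization can be added to the hidden layer through the kernel_regularizer argument; a minimal sketch, with 0.01 as an illustrative factor rather than a tuned value:

```python
import tensorflow as tf

# L2 regularization penalizes large weights in the hidden layer,
# which can reduce overfitting; the factor 0.01 is illustrative
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(
        128, activation="relu",
        kernel_regularizer=tf.keras.regularizers.l2(0.01)),
    tf.keras.layers.Dense(10),
])
```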
Making a Prediction
You may want to predict new images with the model you trained. The model’s linear outputs are logits. You can convert the logits into probabilities, which are easier to interpret, by attaching a softmax layer.
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])
Let’s predict the labels of the test data with this model.
predictions = probability_model.predict(X_test)
So the model has predicted a label for each image in the test set. Let’s look at the first prediction.
Note that 10 probabilities were returned, one for each fashion item class. You can find the label with the highest probability using NumPy’s argmax() method.
import numpy as np
np.argmax(predictions[0])  # 9, the "Ankle boot" class
The model predicted the first image as the ankle boot. Let’s take a look at the actual label of the first image.
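Checking the true label of the first test image, with class_names as defined earlier:

```python
import tensorflow as tf

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress",
               "Coat", "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Load the dataset and look up the first test image's true class
(_, _), (_, y_test) = tf.keras.datasets.fashion_mnist.load_data()
print(y_test[0], class_names[y_test[0]])  # 9 Ankle boot
```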
As you can see, the model made the correct prediction.
That’s it. In summary, I explained the following topics for image classification using the Fashion MNIST dataset in this article:
- Building the model
- Tuning hyperparameters
- Evaluating the model
- Predicting new images
You may also be interested in the following articles:
- Data Visualization with Pandas in Action
- Practical Data Analysis with Pandas
- Introduction to Deep Learning with R
See you in the next post …