IMAGE CLASSIFICATION

An introduction to cat & dog image classification

Vann Ponlork
Mar 5 · 11 min read

In this topic, I want to show you how to build a neural network classifier in TensorFlow running on a CPU. After that, I will show how to save the model, restore it, and save the whole model. In addition, I will show how to create the dataset for training the model, and what type and image shape the dataset should have. I will also introduce TensorFlow Serving: what it is, how to deploy the model to a server, and how to send data to the server for prediction.

All of this introduction uses Python, with PyCharm as the IDE. If you are new to Python, you should read the Python documentation first.

We will use TensorFlow in this introduction. For more detail on TensorFlow, you can check:

https://www.tensorflow.org/tutorials/

Image:

An image shape is expressed in the format [width, height, channel].

The channel is the color depth; for example, in [28, 28, 3] the 3 means RGB. You can train with any image type and channel count, but I recommend using grayscale (channel equal to 1), because it reduces the image size and the training process uses less memory.
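To see how much grayscale saves, here is a small sketch (my own illustration, using a random array in place of a real photo; channel averaging stands in for a proper grayscale conversion):

```python
import numpy as np

# A dummy 28x28 RGB image stands in for a real photo.
rgb = np.random.randint(0, 256, (28, 28, 3), dtype=np.uint8)

# Simple grayscale: average the three color channels.
gray = rgb.mean(axis=2).astype(np.uint8)

print(rgb.shape, rgb.size)    # (28, 28, 3) 2352 values
print(gray.shape, gray.size)  # (28, 28) 784 values -> one third the memory
```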

Neural Network:

A neural network consists of: input => hidden layer(s) => output

For images, the most famous architecture is the convolutional neural network (CNN).

In this case, we slide our window by 1 pixel at a time. In some cases, people slide the window by more than 1 pixel. This number is called the stride.
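The stride (together with the filter size and padding) determines the size of the convolution output. A small helper of my own (not part of the article's code) computes it with the standard formula:

```python
# Output size of a convolution: (n + 2*pad - f) // stride + 1
# n = input width/height, f = filter size.
def conv_output_size(n, f, stride=1, pad=0):
    return (n + 2 * pad - f) // stride + 1

print(conv_output_size(28, 5, stride=1, pad=2))  # 28: padding keeps the size ('SAME')
print(conv_output_size(28, 5, stride=2, pad=0))  # 12: a larger stride shrinks the output
```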

Pooling Layer:

A pooling layer is mostly used immediately after a convolutional layer to reduce the spatial size (only width and height, not depth). This reduces the number of parameters, hence computation is reduced. Also, fewer parameters help avoid overfitting. The most common form of pooling is max pooling, where we take a filter of size F×F and apply the maximum operation over each F×F-sized part of the image.
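Max pooling can be sketched in a few lines of NumPy (my own illustration; the training code later uses TensorFlow's tf.nn.max_pool for the same operation):

```python
import numpy as np

# 2x2 max pooling with stride 2 on a 4x4 feature map.
fmap = np.array([[1, 3, 2, 1],
                 [4, 6, 5, 2],
                 [7, 2, 9, 1],
                 [3, 1, 4, 8]])

# Reshape into 2x2 blocks, then take the max of each block.
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 5]
               #  [7 9]]
```

The 4x4 map shrinks to 2x2: each output value is the maximum of one 2x2 window.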

Fully Connected Layer

If each neuron in a layer receives input from all the neurons in the previous layer, then this layer is called a fully connected layer. The output of this layer is computed by matrix multiplication followed by bias offset.
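The matrix-multiply-plus-bias computation of a fully connected layer looks like this in NumPy (a sketch of my own with made-up weights, not the article's trained values):

```python
import numpy as np

# A fully connected layer: every output neuron sees every input.
x = np.array([[1.0, 2.0, 3.0]])   # 1 sample, 3 inputs
W = np.ones((3, 2)) * 0.5         # weights: 3 inputs -> 2 neurons
b = np.array([0.1, -0.1])         # one bias per output neuron

out = x @ W + b                   # matrix multiplication followed by bias offset
print(out)  # [[3.1 2.9]]
```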

When designing the architecture of a neural network you have to decide: how do you arrange the layers? Which layers do you use? How many neurons go in each layer?

Let's start:

For training we need a dataset:

Image: To create a dataset for training the model we have to prepare the images first. You can use whatever image size you want, but if you train the model on 28 x 28 px images, then the model also needs 28 x 28 input when you ask it for a prediction.

In addition, if the image has 3 channels (RGB), we can reduce its size by changing the number of channels to 1 (grayscale).

The question is: can I keep the number of image channels at 3 for training?

The answer: yes you can, but it uses much more of your machine's memory.

For now, I use an image shape of 28 x 28 with 3 channels.

Dataset:

To train the model you need batches of data, so you need a dataset. As a beginner, you can use the MNIST dataset or create a dataset yourself. You can try MNIST on your own; here I want to introduce creating your own dataset.

Training can be supervised or unsupervised; this dataset introduction covers the supervised case.

Supervised means each image has a label that identifies it, e.g. [255, 255, …, 233], [0, 1]

where the first array is the image and the second, [0, 1], is the label.

The question is: if I have multiple types of images, what form does the label take?

The answer: the label grows to the number of classes, e.g. [0, 1, 0, 0, 0] or [1, 0, 0, 0, 0].
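This label layout is called one-hot encoding. A small helper of my own (matching the dog = [1, 0], cat = [0, 1] convention used below) generates it:

```python
# Build a one-hot label for class index `k` out of `n` classes.
def one_hot(k, n):
    label = [0] * n
    label[k] = 1
    return label

print(one_hot(0, 2))  # dog -> [1, 0]
print(one_hot(1, 2))  # cat -> [0, 1]
print(one_hot(1, 5))  # 5 classes -> [0, 1, 0, 0, 0]
```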

import numpy as np
import os
import random
import cv2

categories = ''
class_num = []
training_data = []
image_size = 28
image_path = './create_dataset/images'
path = os.listdir(image_path)
categories = path

def create_training_data():
    for category in categories:
        path = os.path.join(image_path, category)
        i = 1
        j = 0
        if category == 'dog':
            class_num = [i, j]
        elif category == 'cat':
            class_num = [j, i]
        for img in os.listdir(path):
            try:
                image_data = cv2.imread(os.path.join(path, img))
                image_data = cv2.cvtColor(image_data, cv2.COLOR_BGR2RGB)
                image_data = cv2.resize(image_data, (image_size, image_size), 0, 0)
                training_data.append([image_data, class_num])
            except Exception as e:
                print('Error reading image:', e)
    random.shuffle(training_data)
    np.save('./dataset_multi_column/training_data.npy', training_data)

create_training_data()

I have dog and cat images, and I resize them to 28 x 28 in RGB (3 channels). The data format is [image array], [label]. After that, I shuffle the data to mix the cat and dog images together and save the array to training_data.npy.
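A quick way to check that records of this shape survive the .npy round trip (a sketch of my own: zero-filled arrays stand in for real photos, and a temporary directory replaces the article's paths):

```python
import numpy as np
import tempfile, os

# Mimic two records of training_data: [image array, one-hot label].
records = [[np.zeros((28, 28, 3), np.uint8), [1, 0]],   # "dog"
           [np.ones((28, 28, 3), np.uint8), [0, 1]]]    # "cat"

path = os.path.join(tempfile.mkdtemp(), "training_data.npy")
np.save(path, np.array(records, dtype=object))

# allow_pickle is required on newer NumPy versions to load object arrays.
image, label = np.load(path, allow_pickle=True)[0]
print(image.shape, label)  # (28, 28, 3) [1, 0]
```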

Next, we have to prepare the code that trains on the data.

Training the machine means training the model on the data. To train on images we use a convolutional neural network (CNN), because it learns from the image pixel by pixel. In this introduction, I use TensorFlow for both the neural network and the CNN, and I create 2 convolutional layers.

import tensorflow as tf
import numpy as np
import os

# allow_pickle is required on newer NumPy versions to load object arrays
training_data = np.load('./dataset/dataset_multi_column/training_data.npy', allow_pickle=True)

# ==================Model paths==========================
model_path = './model/model_multi_column/model_multi_column.ckpt'
model_checkpoint = './model/model_multi_column/checkpoint'
model_dir = './model/model_multi_column/'
# =======================================================

batch_size = 500
n_classes = 2

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
train_x = []
train_y = []
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())


def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def maxpool2d(x):
    # 2x2 window moved 2 pixels at a time
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')


def convolutional_neural_network(x):
    weights = {'W_conv1': tf.Variable(tf.random_normal([5, 5, 3, 32])),
               'W_conv2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
               'W_fc': tf.Variable(tf.random_normal([7 * 7 * 64, 1024])),
               'out': tf.Variable(tf.random_normal([1024, n_classes]))}

    biases = {'b_conv1': tf.Variable(tf.random_normal([32])),
              'b_conv2': tf.Variable(tf.random_normal([64])),
              'b_fc': tf.Variable(tf.random_normal([1024])),
              'out': tf.Variable(tf.random_normal([n_classes]))}

    x = tf.reshape(x, shape=[-1, 28, 28, 3])
    conv1 = tf.nn.relu(conv2d(x, weights['W_conv1']) + biases['b_conv1'])
    conv1 = maxpool2d(conv1)
    conv2 = tf.nn.relu(conv2d(conv1, weights['W_conv2']) + biases['b_conv2'])
    conv2 = maxpool2d(conv2)

    # two rounds of 2x2 pooling turn 28x28 into 7x7
    fc = tf.reshape(conv2, [-1, 7 * 7 * 64])
    fc = tf.nn.relu(tf.matmul(fc, weights['W_fc']) + biases['b_fc'])

    output = tf.matmul(fc, weights['out']) + biases['out']
    return output


def train_neural_network(x):
    prediction = convolutional_neural_network(x)
    # cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))
    cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=prediction, labels=y))
    optimizer = tf.train.AdamOptimizer(0.01).minimize(cost)
    hm_epochs = 100

    for image_data, image_label in training_data:
        train_x.append(image_data)
        train_y.append(image_label)

    # build the evaluation ops once, outside the training loop
    label_prediction = tf.argmax(tf.round(y), axis=1)
    get_normalize = tf.nn.sigmoid(prediction)
    y_predict = tf.argmax(get_normalize, axis=1)
    correction = tf.equal(y_predict, label_prediction)
    accuracy = tf.reduce_mean(tf.cast(correction, tf.float32))

    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        if os.path.isfile(model_path + '.index'):
            saver.restore(sess, model_path)
            print('Model restored')

        for epoch in range(hm_epochs):
            i = 0
            while i < len(training_data):
                batch_x = np.array(train_x[i:i + batch_size])
                batch_y = np.array(train_y[i:i + batch_size])
                _, c, acc = sess.run([optimizer, cost, accuracy],
                                     feed_dict={x: batch_x, y: batch_y})
                i += batch_size
            print('Epoch', epoch + 1, 'completed out of', hm_epochs, 'Cost:', c, 'accuracy:', acc)
        saver.save(sess, model_path)


output = convolutional_neural_network(x)
tensorflow_serving = tf.nn.softmax(output)


def save_model():
    saver = tf.train.Saver()
    if os.path.isfile(model_path + '.index'):
        saver.restore(sess, model_path)
        print('Model restored')
    dir_list = os.listdir(model_dir)
    if len(dir_list) == 0:
        version = 1
    else:
        version = len(dir_list) + 1
    model_save_path = model_dir + str(version)
    prediction_signature = (
        tf.saved_model.signature_def_utils.build_signature_def(
            inputs={'input_images': tf.saved_model.utils.build_tensor_info(x)},
            outputs={'output': tf.saved_model.utils.build_tensor_info(tensorflow_serving)},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))
    builder = tf.saved_model.builder.SavedModelBuilder(model_save_path)
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING],
        signature_def_map={'generate_images': prediction_signature},
        legacy_init_op=tf.group(tf.tables_initializer(), name='legacy_init_op'))
    builder.save(as_text=False)


train_neural_network(x)
save_model()

I have created two convolutional layers and one output layer. Let me briefly explain the code above:

We use TensorFlow; batch_size is the number of records trained per step, and we have two tf.placeholder(tf.float32) tensors: one for the image and one for the label.

Note: training the model on more data gives more accuracy, and less data gives less accuracy.

But more data also uses more memory.

So, I will train the model with 500 images at a time (batch_size = 500).

Note: each neuron computes z = w·x + b and passes it through the sigmoid function. Reference: http://neuralnetworksanddeeplearning.com/

Now we start to optimize by reducing the cost.

In its simplest form, the cost is the squared error: cost = 0.5(label - prediction)²; the code above uses sigmoid cross-entropy, which works better for classification.

optimizer = tf.train.AdamOptimizer(0.01).minimize(cost)

The learning rate is set to 0.01.

In this code I use 100 epochs and 2 iterations per epoch (my dataset has 1000 images and I train 500 at a time).

NOTE: training on more images and for more epochs gives the model more accuracy.
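The epoch and batch arithmetic can be sketched with a small helper (my own illustration, not part of the training script): 1000 records with batch_size 500 gives exactly 2 iterations per epoch.

```python
# Yield the [start, end) slice bounds for each batch of one epoch.
def iter_batches(n_records, batch_size):
    i = 0
    while i < n_records:
        yield i, min(i + batch_size, n_records)
        i += batch_size

print(list(iter_batches(1000, 500)))  # [(0, 500), (500, 1000)] -> 2 iterations
```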

_, c, acc = sess.run([optimizer, cost, accuracy], feed_dict={x: batch_x, y: batch_y})

Here we feed placeholder x and placeholder y with batch_x and batch_y.

After training, we save the result to the model:

saver = tf.train.Saver()

saver.save(sess,model_path)

Note that if any error interrupts the process before the model is saved, training will have to start from the beginning.

The question is: how do we resume training from the epoch where it stopped?

The answer: we restore the saved checkpoint.

if os.path.isfile(model_path + '.index'):
    saver.restore(sess, model_path)
    print('Model restored')

So, it will restore from where it stopped. This checkpoint saves only the variables, not the whole model. To deploy the model to the web for prediction we have to save the whole model, using TensorFlow's SavedModel format (the serving requests and responses themselves use JSON).

So, here it is:

def save_model():
    saver = tf.train.Saver()
    if os.path.isfile(model_path + '.index'):
        saver.restore(sess, model_path)
        print('Model restored')
    dir_list = os.listdir(model_dir)
    if len(dir_list) == 0:
        version = 1
    else:
        version = len(dir_list) + 1
    model_save_path = model_dir + str(version)
    prediction_signature = (
        tf.saved_model.signature_def_utils.build_signature_def(
            inputs={'input_images': tf.saved_model.utils.build_tensor_info(x)},
            outputs={'output': tf.saved_model.utils.build_tensor_info(tensorflow_serving)},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))
    builder = tf.saved_model.builder.SavedModelBuilder(model_save_path)
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING],
        signature_def_map={'generate_images': prediction_signature},
        legacy_init_op=tf.group(tf.tables_initializer(), name='legacy_init_op'))
    builder.save(as_text=False)

This saves the whole model after the variables have been saved. I think this introduction is enough to create a neural network, use a CNN for image training, train the model, save it, and restore it.
Now, I want to show you the next step: how to deploy the model to the web for prediction.
In this introduction I will use TensorFlow Serving to deploy the model.
I recommend using Docker for TensorFlow Serving, because it is easy and the image already contains tensorflow_serving.

Reference:
https://www.docker.com/
https://www.tutorialspoint.com/docker/
https://www.vagrantup.com/downloads.html
https://www.virtualbox.org/wiki/Downloads

Prerequisite study:
-VirtualBox
-Vagrant
-Ubuntu 16 (some command-line knowledge)
-Git
-Docker (images and containers)
-NGINX or Apache

Now, I start my Vagrant host (hostname host3) and run a Docker container.
First, we search the Docker registry:

#docker search tensorflow_serving

Note: for an ML program we need to know whether it runs on the CPU or the GPU. Mine runs on the CPU, so I check for the CPU build of TensorFlow Serving.

So, you need to pull the image from the registry and list your containers:
#docker pull tensorflow/serving
#docker ps -a
After you push your model to the Ubuntu host, bind it into the tensorflow_serving container:

#docker run -p 8501:8501 --mount type=bind,source=/ImageCmpl/model_store/,target=/models/my_model -e MODEL_NAME=my_model -t tensorflow/serving

After binding the model, start the container you have just bound it to and verify it is running:
#docker start your_docker_container
#docker ps
Once the container has started, you can access it from a web browser:

http://192.168.33.15:8501/v1/models/classifier

It will reply with JSON data:

{
  "model_version_status": [
    {
      "version": "1",
      "state": "AVAILABLE",
      "status": {
        "error_code": "OK",
        "error_message": ""
      }
    }
  ]
}

A message like that means it works: you have successfully deployed the model to the tensorflow_serving server.
The next step is to send an image to the server and let the model predict.
To send data to the server, you need to format it as JSON.
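Such a request can also be sketched in Python (my own illustration: the host, port, model name, and signature follow the article's setup, and a zero-filled array stands in for a real photo):

```python
import json
import urllib.request
import numpy as np

# Stand-in for a real 28x28 RGB photo, scaled as the model expects.
image = np.zeros((28, 28, 3), dtype=np.float32)

# Build the JSON body for the TensorFlow Serving REST API.
body = json.dumps({"signature_name": "generate_images",
                   "instances": [{"input_images": image.tolist()}]})

req = urllib.request.Request(
    "http://192.168.33.15:8501/v1/models/my_model:predict",
    data=body.encode("utf-8"),
    headers={"Content-Type": "application/json"})

# response = urllib.request.urlopen(req)       # uncomment with a running server
# print(json.load(response)["predictions"])    # one score per class
```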
Prerequisite study:
-HTML
-JAVASCRIPT
-JSON
-AJAX, JQUERY (understand how to transfer data to the server with these scripts)
https://www.w3schools.com/
https://www.tutorialspoint.com/
https://www.google.com/

LET’S START:

Index.html

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">

<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<script type="text/javascript">
    var json_data = '{name:"John",age:"30"}';
    var post = $.post(json_data);
    post.done(function(data){
        document.getElementById('demo').innerHTML = data;
    });
    post.always(function(data){
        data = JSON.stringify(data);
        alert('Post always work:\n' + data);
        document.getElementById('demo').innerHTML = data;
    });
</script>
<script type="text/javascript">
    function jquery_send_data(){
        var json_data = '{name:"John",age:"30"}';
        var post = $.post(json_data);
        post.done(function(data){
            document.getElementById('demo').innerHTML = data;
        });
        post.always(function(data){
            data = JSON.stringify(data);
            document.getElementById('demo').innerHTML = data;
        });
    }
</script>
<script type="text/javascript">
    function send_data_ajax(){
        var json_data = '{name:"John",age:"30"}';
        var xmlhttp = new XMLHttpRequest();
        xmlhttp.onreadystatechange = function(){
            if(this.readyState == 4 && this.status == 200){
                document.getElementById("demo").innerHTML = this.responseText;
            }
        };
        xmlhttp.open("GET", "ajax_info.txt", true);
        xmlhttp.send();
    }
</script>
</head>
<body>

<button onclick="jquery_send_data()">JQuery send data</button>
<br>
<button onclick="send_data_ajax()">AJAX SEND DATA</button>
<div id="demo">
<h2>Let AJAX change this text</h2>
</div>

</body>
</html>
Image.js

height = 128;
width = 128;

function readURL(event){
    var getImagePath = URL.createObjectURL(event.target.files[0]);
    $('#tools_sketch').css('background-image', 'url(' + getImagePath + ')');
    $('#tools_sketch').css('background-position', 'left top');
}

function myalert(){
    document.getElementById("show").innerHTML += "PONLORK";
}

function postImage(){
    var canvas = document.getElementById("tools_sketch");
    var ctx = canvas.getContext("2d");
    var mask_data = ctx.getImageData(0, 0, width, height).data;
    mask_img = convert_to_mask(mask_data);
    mask_json = JSON.stringify(mask_img);
    var img_canvas = document.createElement('canvas');
    var img_context = img_canvas.getContext('2d');
    var img = new Image();
    background_image = ctx.canvas.style.backgroundImage;
    img.src = background_image.substr(5, background_image.length - 7);
    img_context.drawImage(img, 0, 0);
    bi_data = img_context.getImageData(0, 0, width, height).data;
    bi_img = convert_to_list(bi_data);
    mask_bounding = get_bounding(mask_img);
    bounding_color = calc_average_color(bi_img, mask_bounding);
    painted_bi_img = paint_by_bounding_color(bi_img, mask_img, bounding_color);
    bi_json = JSON.stringify(painted_bi_img);
    triggerEvent(img_canvas, 'click');
    json_data = '{"signature_name": "generate_images", "instances": [{ "input_images": ' + bi_json + ', "mask_images": ' + mask_json + '}]}';
    $.post('v1/models/my_model:predict', json_data)
        .done( (data) => {
            $('.result').html(data);
            buffer = decode_to_image(data['predictions'][0]);
            idata = img_context.createImageData(width, height);
            idata.data.set(buffer);
            img_context.putImageData(idata, 0, 0);
            url = img_canvas.toDataURL();
            $('#tools_sketch').css('background-image', 'url(' + url + ')');
            $('#tools_sketch').css('background-position', 'left top');
            var ctx_to_erase = canvas.getContext("2d");
            ctx_to_erase.clearRect(0, 0, width, height);
            ctx.clearRect(0, 0, canvas.width, canvas.height);
            img_context.clearRect(0, 0, img_canvas.width, img_canvas.height);
            tools_class = document.getElementsByClassName('tools')[0];
            a_tags = tools_class.getElementsByTagName('a');
            marker_link = a_tags[0];
            erase_link = a_tags[1];
            triggerEvent(erase_link, 'click');
            triggerEvent(canvas, 'mousedown');
            triggerEvent(marker_link, 'click');
        })
        .fail( (data) => {
            $('.result').html(data);
        })
        .always( (data) => {
        });
}

function convert_to_list(data){
    var newArray = [];
    for(var i=0; i<height; i++){
        var newRow = [];
        for(var j=0; j<width; j++){
            var idx = (j + i * width) * 4;
            newRow.push([data[idx] / 255.0, data[idx + 1] / 255.0, data[idx + 2] / 255.0]);
        }
        newArray.push(newRow.slice());
    }
    return newArray.slice();
}

function convert_to_mask(data){
    var newArray = [];
    for(var i=0; i<height; i++){
        var newRow = [];
        for(var j=0; j<width; j++){
            var idx = (j + i * width) * 4;
            var gray = (data[idx] + data[idx + 1] + data[idx + 2]) / 3;
            if(gray > 127){
                newRow.push([1.0]);
            }else{
                newRow.push([0.0]);
            }
        }
        newArray.push(newRow.slice());
    }
    return newArray.slice();
}

function calc_average_color(img, bounding){
    var y_min = bounding[0];
    var x_min = bounding[1];
    var y_max = bounding[2];
    var x_max = bounding[3];
    var r_val = 0;
    var g_val = 0;
    var b_val = 0;
    var count_pixel = 0;
    for(var col=x_min; col<x_max; col++){
        r_val += img[y_min][col][0];
        g_val += img[y_min][col][1];
        b_val += img[y_min][col][2];
        count_pixel++;
        r_val += img[y_max][col][0];
        g_val += img[y_max][col][1];
        b_val += img[y_max][col][2];
        count_pixel++;
    }
    for(var row=y_min; row<y_max; row++){
        r_val += img[row][x_min][0];
        g_val += img[row][x_min][1];
        b_val += img[row][x_min][2];
        count_pixel++;
        r_val += img[row][x_max][0];
        g_val += img[row][x_max][1];
        b_val += img[row][x_max][2];
        count_pixel++;
    }
    return [r_val/count_pixel, g_val/count_pixel, b_val/count_pixel];
}

function paint_by_bounding_color(bi_img, mask_img, bounding_color){
    var newArray = [];
    for(var row=0; row<height; row++){
        var newRow = [];
        for(var col=0; col<width; col++){
            if(mask_img[row][col][0] < 0.5){
                newRow.push(bi_img[row][col]);
            }else{
                newRow.push(bounding_color);
            }
        }
        newArray.push(newRow.slice());
    }
    return newArray.slice();
}

function decode_to_image(data){
    var buffer = new Uint8ClampedArray(width * height * 4);
    for(var i=0; i<height; i++){
        for(var j=0; j<width; j++){
            var idx = (j + i * width) * 4;
            buffer[idx] = data[i][j][0] * 255;
            buffer[idx+1] = data[i][j][1] * 255;
            buffer[idx+2] = data[i][j][2] * 255;
            buffer[idx+3] = 255;
        }
    }
    return buffer;
}

function get_bounding(mask_img){
    var x_min = width;
    var x_max = 0;
    var y_min = height;
    var y_max = 0;
    for(var row=0; row<height; row++){
        for(var col=0; col<width; col++){
            if(mask_img[row][col][0] > 0.5){
                if(x_min > col){ x_min = col; }
                if(x_max < col){ x_max = col; }
                if(y_min > row){ y_min = row; }
                if(y_max < row){ y_max = row; }
            }
        }
    }
    return [y_min, x_min, y_max, x_max];
}
function triggerEvent(element, event){
    if (document.createEvent) {
        // non-IE browsers
        var evt = document.createEvent("HTMLEvents");
        evt.initEvent(event, true, true); // event type, bubbling, cancelable
        return element.dispatchEvent(evt);
    } else {
        // IE
        var evt = document.createEventObject();
        return element.fireEvent("on" + event, evt);
    }
}

The images chosen for prediction should be 28 x 28 pixels, because the data in the dataset is 28 x 28 pixels.
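Any larger photo can be brought down to the training shape first. A minimal sketch of my own using nearest-neighbour sampling (the article's dataset script uses cv2.resize for the same job):

```python
import numpy as np

# Nearest-neighbour resize to 28x28 so any photo matches the training shape.
def to_model_shape(img, size=28):
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size   # which source row each output row samples
    cols = np.arange(size) * w // size   # which source column each output column samples
    return img[rows][:, cols]

photo = np.zeros((100, 160, 3), dtype=np.uint8)  # stand-in for a larger photo
print(to_model_shape(photo).shape)  # (28, 28, 3)
```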

Reference:

https://www.tensorflow.org/tutorials/
https://www.w3schools.com/
https://www.tutorialspoint.com/
https://www.docker.com/
https://www.tutorialspoint.com/docker/
https://www.vagrantup.com/downloads.html
https://www.virtualbox.org/wiki/Downloads

LASTMILE WORKS / DYNAMO TECH - R&D Project

Developing next-generation technology in Cambodia

Written by Vann Ponlork, Dynamo Tech Solutions Co., Ltd.
