Custom models with TensorFlow (Part 2): Multi-Input model (Siamese Network)

Sthanikam Santhosh
6 min read · Nov 22, 2022


Siamese Network

In Part 1 of Custom models with TensorFlow, we saw how to implement a multi-output model architecture. In this article, we will see how to implement a multi-input model architecture using TensorFlow.

To implement the multi-input architecture, we will use the Siamese network as a reference.

A Siamese network is a type of neural network that contains multiple identical sub-networks. Each sub-network takes one input and computes a feature vector for it. Using a distance function, we then measure how similar the two feature vectors are and output a similarity score.

We will use the Fashion MNIST dataset to train the model.

First, let’s import the required packages

import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Flatten, Dense, Dropout, Lambda
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.utils import plot_model
from tensorflow.keras import backend as K

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image, ImageFont, ImageDraw
import random

A Siamese network requires the data to be in pairs. The following helper functions go through the entire dataset and create image pairs, labelling a pair 1 if both images belong to the same kind of clothing and 0 otherwise.

def create_pairs(x, digit_):
    pairs = []
    labels = []
    # use the smallest class so every class contributes the same number of pairs
    n = min([len(digit_[d]) for d in range(10)]) - 1

    for d in range(10):
        for i in range(n):
            # positive pair: two consecutive images from the same class
            z1, z2 = digit_[d][i], digit_[d][i + 1]
            pairs += [[x[z1], x[z2]]]
            # negative pair: one image from class d, one from a random other class
            inc = random.randrange(1, 10)
            dn = (d + inc) % 10
            z1, z2 = digit_[d][i], digit_[dn][i]
            pairs += [[x[z1], x[z2]]]
            labels += [1, 0]

    return np.array(pairs), np.array(labels)


def create_pairs_set(images, labels):
    # indices of the images belonging to each of the 10 classes
    digit_ = [np.where(labels == i)[0] for i in range(10)]
    pairs, y = create_pairs(images, digit_)
    y = y.astype('float32')

    return pairs, y

Now we will download and prepare our train and test sets. With the help of the above helper functions, we will also create pairs of images that will go into the multi-input model.

# load the dataset
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# prepare train and test sets
train_images = train_images.astype('float32')
test_images = test_images.astype('float32')

# normalize values
train_images = train_images / 255.0
test_images = test_images / 255.0

# create pairs on train and test sets
tr_pairs, tr_y = create_pairs_set(train_images, train_labels)
ts_pairs, ts_y = create_pairs_set(test_images, test_labels)
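
Before feeding these pairs to the model, it helps to sanity-check their shapes. The counts below are what we would expect for the standard Fashion MNIST split (6,000 training and 1,000 test images per class):

# quick sanity check on the generated pairs
print(tr_pairs.shape)  # (119980, 2, 28, 28) -> pairs of 28x28 images
print(tr_y.shape)      # (119980,)           -> 1 = same class, 0 = different class
print(ts_pairs.shape)  # (19980, 2, 28, 28)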

As discussed at the beginning of the article, the Siamese network is built from identical sub-networks. Now let's define that shared base network.

def base_network():
    input = Input(shape=(28,28,), name="base_input")
    # flatten the 28x28 image into a row vector for the dense layers
    x = Flatten(name="flatten_input")(input)
    x = Dense(128, activation='relu', name="first_base_dense")(x)
    x = Dropout(0.1, name="first_dropout")(x)
    x = Dense(128, activation='relu', name="second_base_dense")(x)
    x = Dropout(0.1, name="second_dropout")(x)
    # final 128-dimensional feature vector
    x = Dense(128, activation='relu', name="third_base_dense")(x)

    return Model(inputs=input, outputs=x)

Our base network is a simple stack of Flatten, Dense, and Dropout layers.

The Fashion MNIST images are 28×28, hence the input layer accepts a 28×28 input shape.
A Dense layer expects its data as a row vector, hence a Flatten layer is placed before the first Dense layer to turn each image into a row vector.
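
Since plot_model is already imported, we can optionally print and plot the base network to confirm the layer stack. A minimal sketch (plot_model needs pydot and graphviz installed, and the file name is arbitrary):

# inspect the base network
base_model = base_network()
base_model.summary()
plot_model(base_model, show_shapes=True, show_layer_names=True, to_file='base_model.png')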

We will use the Euclidean distance to measure how similar two images are. The following function computes the distance between the two feature vectors.

def euclidean_distance(vectors):
    x, y = vectors
    sum_square = K.sum(K.square(x - y), axis=1, keepdims=True)
    # epsilon avoids taking the square root of zero
    return K.sqrt(K.maximum(sum_square, K.epsilon()))


def eucl_dist_output_shape(shapes):
    shape1, shape2 = shapes
    # the Lambda layer outputs a single distance value per pair
    return (shape1[0], 1)
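
As a quick check, the distance function can be called directly on two small tensors; the values below are purely illustrative:

a = tf.constant([[3.0, 4.0]])
b = tf.constant([[0.0, 0.0]])
print(euclidean_distance([a, b]).numpy())  # [[5.]] -> sqrt(3^2 + 4^2)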

Using the base network, let's create the left and right input branches and add a Lambda layer that calls the Euclidean distance function.

A Lambda layer lets us wrap an arbitrary function as a layer in the network.

# one shared instance of the base network, so both inputs use the same weights
base_model = base_network()

inp_a = Input(shape=(28,28,), name="left_input")
vect_a = base_model(inp_a)

inp_b = Input(shape=(28,28,), name="right_input")
vect_b = base_model(inp_b)

out = Lambda(euclidean_distance, name="output_layer", output_shape=eucl_dist_output_shape)([vect_a, vect_b])

model = Model([inp_a, inp_b], out)

The full model now consists of the two input branches feeding the shared base network, followed by the Lambda distance layer.
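
As with the base network, we can optionally plot the combined two-input model to verify that both inputs share the same base network and feed the Lambda distance layer (again, the file name is arbitrary):

model.summary()
plot_model(model, show_shapes=True, to_file='siamese_model.png')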

There are different loss functions available for training a Siamese network, such as contrastive loss, triplet loss, and circle loss.

Here we will use the contrastive loss function because it works well for differentiating between image pairs. Contrastive loss, L = Y * D² + (1 - Y) * max(margin - D, 0)², where D is the Euclidean distance between the two feature vectors and Y is 1 for a similar pair and 0 for a dissimilar one.
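
To see how the formula behaves, take margin = 1: for a similar pair (Y = 1) the loss is simply D², so small distances are rewarded; for a dissimilar pair (Y = 0) the loss is max(1 - D, 0)², so the pair is only penalized while its distance is still inside the margin. For example, with D = 0.2 a similar pair costs 0.04, while a dissimilar pair costs (1 - 0.2)² = 0.64.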

def contrastive_loss_with_margin(margin):
    def contrastive_loss(y_true, y_pred):
        # y_pred is the predicted distance D, y_true is 1 (similar) or 0 (dissimilar)
        square_pred = K.square(y_pred)
        margin_square = K.square(K.maximum(margin - y_pred, 0))
        return (y_true * square_pred + (1 - y_true) * margin_square)
    return contrastive_loss

Using the above loss function, let's train the model.

rms = RMSprop()
model.compile(loss=contrastive_loss_with_margin(margin=1), optimizer=rms)
history = model.fit([tr_pairs[:,0], tr_pairs[:,1]], tr_y, epochs=20, batch_size=128, validation_data=([ts_pairs[:,0], ts_pairs[:,1]], ts_y))

Note: along with the default loss functions that TensorFlow provides, we can define our own custom loss function and pass it to the model.compile method, as we do here.

Here we will use the RMSprop optimizer to update the model weights.

We will pass our custom contrastive loss function to the model.compile method.

Using the model.fit method, we train the model for 20 epochs with a batch size of 128.

Epoch 1/20
938/938 [==============================] - 8s 8ms/step - loss: 0.1133 - val_loss: 0.0858
Epoch 2/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0808 - val_loss: 0.0763
Epoch 3/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0721 - val_loss: 0.0743
Epoch 4/20
938/938 [==============================] - 7s 8ms/step - loss: 0.0675 - val_loss: 0.0704
Epoch 5/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0643 - val_loss: 0.0674
Epoch 6/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0624 - val_loss: 0.0663
Epoch 7/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0602 - val_loss: 0.0666
Epoch 8/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0590 - val_loss: 0.0653
Epoch 9/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0576 - val_loss: 0.0655
Epoch 10/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0568 - val_loss: 0.0653
Epoch 11/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0557 - val_loss: 0.0664
Epoch 12/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0553 - val_loss: 0.0634
Epoch 13/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0542 - val_loss: 0.0630
Epoch 14/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0537 - val_loss: 0.0690
Epoch 15/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0533 - val_loss: 0.0635
Epoch 16/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0530 - val_loss: 0.0623
Epoch 17/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0521 - val_loss: 0.0646
Epoch 18/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0520 - val_loss: 0.0660
Epoch 19/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0518 - val_loss: 0.0616
Epoch 20/20
938/938 [==============================] - 7s 7ms/step - loss: 0.0511 - val_loss: 0.0627

Now let’s evaluate the model with training and testing data.

def compute_accuracy(y_true, y_pred):
    # a pair is predicted "similar" when the distance is below 0.5
    pred = y_pred.ravel() < 0.5
    return np.mean(pred == y_true)

loss = model.evaluate(x=[ts_pairs[:,0],ts_pairs[:,1]], y=ts_y)

y_pred_train = model.predict([tr_pairs[:,0], tr_pairs[:,1]])
train_accuracy = compute_accuracy(tr_y, y_pred_train)

y_pred_test = model.predict([ts_pairs[:,0], ts_pairs[:,1]])
test_accuracy = compute_accuracy(ts_y, y_pred_test)

print("Loss = {}, Train Accuracy = {} Test Accuracy = {}".format(loss, train_accuracy, test_accuracy))
625/625 [==============================] - 1s 2ms/step - loss: 0.0627
Loss = 0.0626874640583992, Train Accuracy = 0.941365227537923 Test Accuracy = 0.9153653653653654

Finally, our custom model is about 94% accurate on the training data and about 91% accurate on the test data.
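
Once trained, the same model can score a brand-new pair of images: pass the two images through the model and threshold the predicted distance (0.5 is the threshold used in compute_accuracy above). A minimal sketch, reusing two images from the test pairs:

# score a single pair of images
img_a = ts_pairs[0, 0][np.newaxis, ...]   # shape (1, 28, 28)
img_b = ts_pairs[0, 1][np.newaxis, ...]
distance = model.predict([img_a, img_b])[0][0]
print("distance:", distance, "-> same class" if distance < 0.5 else "-> different class")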

In this way, using TensorFlow, we can design a multi-input model with a custom loss function.

This kind of modelling approach can be extended to other use cases that involve multiple inputs.
