Siamese Neural Network for signature verification
Let’s see how a Siamese architecture built from CNNs can perform a signature verification task!
A person’s signature varies slightly each time they sign, and any algorithm we employ for fraud detection must tolerate those variations in strokes. At the same time, the detection system has to catch forged signatures, which can look very similar to the real thing in the case of skilled forgeries. Fortunately, a deep network can learn the similarity function by itself, and all we have to do is plain, simple backpropagation. We’ll build a Siamese network and train it to approximate a similarity function that outputs a score between 0 and 1 (1 meaning a perfectly genuine match and 0 meaning completely dissimilar).
Let’s treat this as a supervised learning task and prepare the features and labels. The Siamese network is fed pairs of images along with their corresponding labels (similar or dissimilar). The dataset for signature verification is available on SigComp’s website. The data is prepared by looping over the dataset, collecting the image pairs in one array and their labels in another. This ultimately makes it a binary classification problem!
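The pairing step could look roughly like the sketch below. It assumes the signatures of one writer have already been split into a list of genuine samples and a list of forgeries; the file names and the `make_pairs` helper are made up for illustration and are not taken from the SigComp tooling.

```python
from itertools import combinations, product

def make_pairs(genuine, forged):
    """Build image pairs and labels for one writer.

    Label 1: both signatures are genuine (similar).
    Label 0: one genuine, one forged (dissimilar).
    """
    pairs, labels = [], []
    # Every unordered pair of genuine signatures is a positive example.
    for a, b in combinations(genuine, 2):
        pairs.append((a, b))
        labels.append(1)
    # Every genuine/forged combination is a negative example.
    for a, b in product(genuine, forged):
        pairs.append((a, b))
        labels.append(0)
    return pairs, labels

# Hypothetical file names standing in for loaded images.
genuine = ["g1.png", "g2.png", "g3.png"]
forged = ["f1.png", "f2.png"]
pairs, labels = make_pairs(genuine, forged)
```

Pairing this way tends to produce more negative than positive examples, so it may be worth checking the class balance before training.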
The Siamese network has a stack of convolutional and pooling layers followed by a final fully connected layer with 128 neurons. The sister network uses the same weights and biases as the original network (which essentially means running the same network twice). Together they take the input image pair and produce two 128-D vectors as outputs. The network learns to encode similar input images close together and dissimilar ones farther apart in the vector space. The conventional way to train it is with a contrastive loss, or with a triplet loss that takes three images at once. In this post, we’ll train our network with cross-entropy loss instead. How, you ask? Here’s the layout of our architecture.
The two vectors are then subtracted element-wise to form a single 128-D difference vector (not to be confused with the L1 distance, which sums the absolute differences and yields just a single scalar value).
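To make the weight sharing and the difference vector concrete, here is a toy sketch in plain Python. The `encode` function is only a stand-in for the convolutional encoder (a single linear layer with ReLU, which is an assumption made here), and `INPUT_DIM` is a tiny made-up input size. The point is that both images go through the same weights, and that the element-wise difference keeps all 128 values while the L1 distance collapses them into one number.

```python
import random

random.seed(0)
EMBED_DIM = 128   # size of the final fully connected layer
INPUT_DIM = 16    # hypothetical flattened input size, kept tiny for illustration

# One set of weights, used by both branches of the Siamese network.
weights = [[random.uniform(-1, 1) for _ in range(INPUT_DIM)]
           for _ in range(EMBED_DIM)]

def encode(x):
    """Toy stand-in for the CNN encoder: linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]

# Two made-up "images", already flattened to vectors.
img_a = [random.random() for _ in range(INPUT_DIM)]
img_b = [random.random() for _ in range(INPUT_DIM)]

# Running the same function twice is what sharing weights means in practice.
emb_a = encode(img_a)
emb_b = encode(img_b)

# Element-wise difference: still a 128-D vector, fed to the classifier.
diff_vector = [a - b for a, b in zip(emb_a, emb_b)]

# L1 distance: the same differences collapsed into a single scalar.
l1_distance = sum(abs(d) for d in diff_vector)
```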
Well, we’re almost finished. The next part is a fully connected network that takes the 128-D difference vector as input and ends in a single output neuron with sigmoid activation for classification. The number of hidden layers is a hyperparameter and can be experimented with to get the best results. The network is then trained on the image pairs and labels that we gathered earlier. The loss function is binary cross-entropy, and the optimizer that suited it best was Adagrad.
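For intuition, here is a hand-rolled sketch of that last stage, assuming the simplest possible head: no hidden layers, just one sigmoid neuron on top of the 128-D vector. In practice a deep learning framework would handle all of this; the helper names (`predict`, `adagrad_step`) and the random inputs below are illustrative only.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Loss for one example; predictions are clipped to avoid log(0)."""
    y_pred = min(max(y_pred, eps), 1.0 - eps)
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1.0 - y_pred))

def predict(diff_vector, weights, bias):
    """Single output neuron: weighted sum followed by sigmoid."""
    return sigmoid(sum(w * d for w, d in zip(weights, diff_vector)) + bias)

def adagrad_step(weights, bias, cache, diff_vector, y_true, lr=0.01, eps=1e-8):
    """One Adagrad update on a single example; cache holds squared-gradient sums."""
    err = predict(diff_vector, weights, bias) - y_true  # dLoss/dz for sigmoid + BCE
    grads = [err * d for d in diff_vector] + [err]      # last entry is the bias
    params = weights + [bias]
    for i, g in enumerate(grads):
        cache[i] += g * g
        params[i] -= lr * g / (math.sqrt(cache[i]) + eps)
    return params[:-1], params[-1]

random.seed(0)
diff_vector = [random.uniform(-1, 1) for _ in range(128)]  # made-up input
weights = [random.uniform(-0.1, 0.1) for _ in range(128)]
bias = 0.0
cache = [0.0] * 129  # one accumulator per weight, plus one for the bias

loss_before = binary_cross_entropy(1, predict(diff_vector, weights, bias))
weights, bias = adagrad_step(weights, bias, cache, diff_vector, y_true=1)
loss_after = binary_cross_entropy(1, predict(diff_vector, weights, bias))
```

Adagrad divides each parameter's step by the square root of its accumulated squared gradients, so parameters that have seen large gradients take smaller steps over time.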
After training for a few epochs (5 to 7), we get an accuracy close to 93% on the validation set. Note: the SigComp dataset was collected using a digital tablet and has little to no artefacts. For the network to perform well on real-world signatures, the dataset can be augmented with noise and random blur, which forces the network to generalize better.
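One way such an augmentation could be sketched in plain Python is shown below. It assumes a grayscale image stored as a list of rows with pixel values in [0, 1]; the noise level, the 3x3 box blur, and the 50% blur probability are arbitrary choices for illustration, not settings from the original experiment.

```python
import random

def add_gaussian_noise(img, sigma=0.05, rng=random):
    """Add zero-mean Gaussian noise to every pixel and clip back to [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in img]

def box_blur(img):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out

def augment(img, blur_prob=0.5, rng=random):
    """Always add noise, then blur with probability blur_prob."""
    noisy = add_gaussian_noise(img, rng=rng)
    return box_blur(noisy) if rng.random() < blur_prob else noisy

# Tiny made-up image: a flat grey patch.
rng = random.Random(0)
image = [[0.5] * 4 for _ in range(3)]
augmented = augment(image, rng=rng)
```

Augmented copies would normally be generated only for the training split, so that validation accuracy still reflects unmodified images.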
GitHub link: https://github.com/Baakchsu