SigNet (Detecting Signature Similarity Using Machine Learning/Deep Learning): Is This the End of Human Forensic Analysis?
My grandfather was an expert in handwriting analysis. He spent all his life analyzing documents for the CBI (Central Bureau of Investigation) and other organizations. His way of working, with a magnifying glass and a handful of other tools, required huge amounts of time and patience for a single document. This was back when computers were not fast enough. I remember vividly that he photocopied the same document multiple times and arranged the copies on the table to get a closer look at the handwriting style.
Handwriting analysis involves a comprehensive comparative analysis between a questioned document and the known handwriting of a suspected writer. Specific habits, characteristics, and individualities of both the questioned document and the known specimen are examined for similarities and differences.
Because this problem comes down to detecting and analyzing patterns, machine learning is a natural fit.
Why and How?
Why: The manual process described above takes huge amounts of time and patience for a single document. While I agree that we cannot replace that job with an AI that is 100% accurate, we can certainly build a system capable of aiding human experts.
How: To build our signature similarity network, we will use deep learning. We will go through three approaches to measuring the similarity between handwritten signatures. For our initial data, we will use the Handwritten Signatures dataset from Kaggle.
Requirements
For this project we will require:
- Python 3.8: The Programming Language
- TensorFlow 2: The Deep Learning Library
- NumPy: Linear Algebra
- Matplotlib: Plotting images
- Scikit-Learn: General Machine Learning Library
- scikit-image: Structural similarity (SSIM)
The Dataset
The dataset contains real and forged signatures of 30 people. Each person has 5 genuine and 5 forged signatures.
To load the data, I have created a simple load_data() function that iterates through every person's folder and loads the real and forged signatures with labels of 1 and 0 respectively.
In addition, the function builds a dictionary that maps each file name to an (image, label) tuple, to be used later in the project.
import os

import numpy as np
from sklearn.model_selection import train_test_split
# load_img / img_to_array are assumed to come from Keras' image utilities
from tensorflow.keras.preprocessing.image import img_to_array, load_img


# DATA_DIR (the dataset root) must be defined before this function
def load_data(DATA_DIR=DATA_DIR, test_size=0.2, verbose=True, load_grayscale=True):
    """
    Loads the signature images and splits them into train, validation and test sets.
    Arguments:
        DATA_DIR: str
        test_size: float
        verbose: bool
        load_grayscale: bool
    Returns:
        (features, labels, features_forged, features_real, features_dict,
         x_train, x_test, y_train, y_test, x_val, y_val)
    """
    features = []
    features_forged = []
    features_real = []
    features_dict = {}
    labels = []  # forged: 0 and real: 1
    mode = "rgb"
    if load_grayscale:
        mode = "grayscale"
    for folder in os.listdir(DATA_DIR):
        # skip files that are not person folders
        if folder == '.DS_Store' or folder == '.ipynb_checkpoints':
            continue
        print("Searching folder {}".format(folder))
        # forged images
        for sub in os.listdir(os.path.join(DATA_DIR, folder, "forge")):
            f = os.path.join(DATA_DIR, folder, "forge", sub)
            img = load_img(f, color_mode=mode, target_size=(150, 150))
            features.append(img_to_array(img))
            features_dict[sub] = (img, 0)
            features_forged.append(img)
            if verbose:
                print("Adding {} with label 0".format(f))
            labels.append(0)  # forged
        # real images
        for sub in os.listdir(os.path.join(DATA_DIR, folder, "real")):
            f = os.path.join(DATA_DIR, folder, "real", sub)
            img = load_img(f, color_mode=mode, target_size=(150, 150))
            features.append(img_to_array(img))
            features_dict[sub] = (img, 1)
            features_real.append(img)
            if verbose:
                print("Adding {} with label 1".format(f))
            labels.append(1)  # real
    features = np.array(features)
    labels = np.array(labels)
    # hold out the test set first, then carve the validation set out of the rest
    x_train, x_test, y_train, y_test = train_test_split(features, labels, test_size=test_size, random_state=42)
    x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.25, random_state=42)
    print("Generated data.")
    return features, labels, features_forged, features_real, features_dict, x_train, x_test, y_train, y_test, x_val, y_val


def convert_label_to_text(label=0):
    """
    Convert label into text
    Arguments:
        label: int
    Returns:
        str: The mapping
    """
    return "Forged" if label == 0 else "Real"


features, labels, features_forged, features_real, features_dict, x_train, x_test, y_train, y_test, x_val, y_val = load_data(verbose=False, load_grayscale=False)
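To make the two-stage split in load_data concrete, here is a small sketch of the resulting set sizes. It assumes that all 300 images (30 people with 10 signatures each) are loaded, and that train_test_split rounds the held-out count up when test_size is a float. The helper split_sizes is hypothetical and not part of the project.

```python
import math


def split_sizes(n_samples, test_size=0.2, val_size=0.25):
    """Return (train, val, test) counts for a two-stage hold-out split."""
    # first split: hold out the test set (count rounded up)
    n_test = math.ceil(test_size * n_samples)
    remaining = n_samples - n_test
    # second split: hold out the validation set from what is left
    n_val = math.ceil(val_size * remaining)
    n_train = remaining - n_val
    return n_train, n_val, n_test


# 30 people x (5 genuine + 5 forged) = 300 images (assumed total)
print(split_sizes(300))
```

Under these assumptions the split works out to 180 training, 60 validation and 60 test images, i.e. 60/20/20.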
Visualization of the data
The images are loaded with a target_size of (150,150). Since the call above sets load_grayscale=False, each image array has shape (150,150,3).
Approach #1: Similarity in images (signatures) using MSE and SSIM.
For this approach, we will compute the similarity between two images using MSE (Mean Squared Error) or SSIM (Structural Similarity). The formulas are pretty straightforward, and fortunately scikit-image provides an implementation of SSIM (structural_similarity).
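import numpy as np
from skimage.metrics import structural_similarity


def mse(A, B):
    """
    Computes Mean Squared Error between two images. (A and B)
    Arguments:
        A: numpy array
        B: numpy array
    Returns:
        err: float
    """
    # sum of the squared differences over every pixel: sigma (a - b)^2
    err = np.sum((A.astype("float") - B.astype("float")) ** 2)
    # divide by the number of pixels (r, c) => r * c
    err /= float(A.shape[0] * A.shape[1])
    return err


def ssim(A, B):
    """
    Computes SSIM between two images.
    Arguments:
        A: numpy array
        B: numpy array
    Returns:
        score: float
    """
    # Note: for colour images scikit-image expects channel_axis=-1, and for
    # float arrays it expects an explicit data_range (for example 255)
    return structural_similarity(A, B)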
Now let us take two images from the same person: one real and one forged.
Note that MSE has no fixed upper bound, whereas SSIM is bounded between -1 and 1.
A lower MSE indicates more similar images, whereas a higher SSIM indicates more similar images (an SSIM of 1 means the images are identical).
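To see how the two scores behave, here is a minimal NumPy sketch on synthetic data. It is a simplification under stated assumptions: global_ssim evaluates the SSIM formula once over the whole image, whereas scikit-image averages it over sliding windows, and mse_per_pixel averages over every array element. Both function names and the random test image are made up for illustration.

```python
import numpy as np


def mse_per_pixel(a, b):
    """Mean squared error averaged over every element of the two arrays."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    return float(np.mean((a - b) ** 2))


def global_ssim(a, b, data_range=255.0):
    """SSIM formula evaluated once over the whole image (no sliding window)."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    # stabilising constants from the usual SSIM definition (K1=0.01, K2=0.03)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov_ab = ((a - mu_a) * (b - mu_b)).mean()
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov_ab + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return float(numerator / denominator)


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(150, 150)).astype(np.float64)
inverted = 255.0 - image  # the same picture with its intensities flipped

# identical images: MSE is zero and SSIM sits at its upper bound
print(mse_per_pixel(image, image), global_ssim(image, image))
# very different images: MSE is large and SSIM drops below zero
print(mse_per_pixel(image, inverted), global_ssim(image, inverted))
```

The point of the comparison is the direction of each score: MSE grows without limit as images diverge, while SSIM stays inside [-1, 1] and grows as images become more alike.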
Approach #2: Building a classifier using CNNs that can detect forged or real signatures.
With this approach, we will train a classifier (using CNNs) to label a signature as forged or real.
As CNNs are known to pick up intricate features in images, they are a reasonable choice for this classifier.
We are bound to run into overfitting, as we do not have enough data.
Image augmentation is one way to generate more training data.
On training our model, we did encounter overfitting, and even after applying techniques to counter it, the model did not improve.
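As an illustration of image augmentation, here is a minimal NumPy sketch that creates shifted copies of each signature. The article does not say which augmentations were used, so the random translation, the white fill value of 255 and the names random_shift and augment are all assumptions.

```python
import numpy as np


def random_shift(image, rng, max_shift=10, fill_value=255.0):
    """Translate an (H, W, C) image by a random offset, padding with fill_value."""
    height, width = image.shape[:2]
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    shifted = np.full_like(image, fill_value)
    # source and destination windows that stay inside the image bounds
    src_y = slice(max(0, -dy), min(height, height - dy))
    dst_y = slice(max(0, dy), min(height, height + dy))
    src_x = slice(max(0, -dx), min(width, width - dx))
    dst_x = slice(max(0, dx), min(width, width + dx))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted


def augment(images, labels, copies=3, seed=0):
    """Return the original images plus `copies` shifted versions of each one."""
    rng = np.random.default_rng(seed)
    new_images = [random_shift(img, rng) for img in images for _ in range(copies)]
    new_labels = np.repeat(labels, copies)  # each label repeated once per copy
    all_images = np.concatenate([images, np.stack(new_images)])
    all_labels = np.concatenate([labels, new_labels])
    return all_images, all_labels


# four blank placeholder "signatures" standing in for x_train / y_train
demo_images = np.zeros((4, 150, 150, 3), dtype=np.float32)
demo_labels = np.array([0, 1, 0, 1])
aug_images, aug_labels = augment(demo_images, demo_labels)
print(aug_images.shape, aug_labels.shape)
```

Small translations keep the strokes intact, which matters here because the label depends on fine details of the signature. Flips or large rotations would be a riskier choice for this kind of data.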
Approach #2.1: Transfer Learning using Inception
To improve our model we will use transfer learning and fine-tune the model for this particular problem.
The InceptionV3 Model
For this approach, we will load pre-trained weights and add a classification head at the top to cater to this problem.
import tensorflow as tf

# loading Inception
model2 = tf.keras.applications.InceptionV3(include_top=False, input_shape=(150,150,3))

# freezing layers
for layer in model2.layers:
    layer.trainable = False

# getting mixed7 layer
l = model2.get_layer("mixed7")

# classification head on top of the mixed7 features
x = tf.keras.layers.Flatten()(l.output)
x = tf.keras.layers.Dense(1024, activation='relu')(x)
x = tf.keras.layers.Dropout(.5)(x)
x = tf.keras.layers.Dense(1, activation='sigmoid')(x)
net = tf.keras.Model(model2.input, x)

net.compile(optimizer='adam', loss=tf.keras.losses.binary_crossentropy, metrics=['acc'])

h2 = net.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=5)
These two approaches show that transfer learning gives much better results than a plain CNN model.
Keep in mind that these approaches do not learn a similarity function; they focus on classifying whether an image is forged or real.
There are still many ways to improve our model; one is data augmentation.
Approach #3: Siamese networks for image similarity
With our third approach, we will try to learn the similarity function itself. We will use Siamese networks, which suit the nature of our data (i.e. few training examples).
Siamese means 'twins'. The biggest difference from normal NNs is that these networks learn a similarity function between two inputs instead of classifying a single input (fitting the function).
- We first create a common feature vector for our images. We then pass in two images (a positive or a negative pair), compare their feature vectors using a contrastive loss function with a distance metric (L1 distance), and finally squash the output between 0 and 1 (sigmoid) to get the final result.
# creating the siamese network
# feature_vector is the shared encoder applied to both inputs (its definition is not shown here)
im_a = tf.keras.layers.Input(shape=(150,150,3))
im_b = tf.keras.layers.Input(shape=(150,150,3))

encoded_a = feature_vector(im_a)
encoded_b = feature_vector(im_b)

# merge the two encodings and reduce them to a single similarity score
combined = tf.keras.layers.concatenate([encoded_a, encoded_b])
combined = tf.keras.layers.BatchNormalization()(combined)
combined = tf.keras.layers.Dense(4, activation = 'linear')(combined)
combined = tf.keras.layers.BatchNormalization()(combined)
combined = tf.keras.layers.Activation('relu')(combined)
combined = tf.keras.layers.Dense(1, activation = 'sigmoid')(combined)

sm = tf.keras.Model(inputs=[im_a, im_b], outputs=[combined])
sm.summary()
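The description of this approach mentions a contrastive loss over an L1 distance. As a minimal sketch of what that means, the NumPy code below computes the textbook form of the loss for a batch of embedding pairs. It is an illustration only and not necessarily the loss used to train the network above. The margin value and the toy embeddings are assumptions.

```python
import numpy as np


def l1_distance(emb_a, emb_b):
    """Sum of absolute differences between two batches of embeddings."""
    return np.sum(np.abs(emb_a - emb_b), axis=1)


def contrastive_loss(y_true, distance, margin=1.0):
    """Contrastive loss with y_true = 1 for similar pairs and 0 for dissimilar ones."""
    # similar pairs are penalised for being far apart
    similar_term = y_true * distance ** 2
    # dissimilar pairs are penalised only when closer than the margin
    dissimilar_term = (1 - y_true) * np.maximum(margin - distance, 0.0) ** 2
    return float(np.mean(similar_term + dissimilar_term))


# two toy pairs of 2-dimensional embeddings: one close pair, one distant pair
emb_a = np.array([[0.0, 0.0], [0.0, 0.0]])
emb_b = np.array([[0.1, 0.1], [2.0, 2.0]])
distance = l1_distance(emb_a, emb_b)

good_labels = np.array([1, 0])  # close pair marked similar, distant pair dissimilar
bad_labels = np.array([0, 1])   # labels swapped
print(contrastive_loss(good_labels, distance))
print(contrastive_loss(bad_labels, distance))
```

The loss is small when similar pairs sit close together and dissimilar pairs sit at least a margin apart, and it grows sharply when the labels and distances disagree.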
Dataset Generation
To generate the required dataset, we will try two approaches. First, we will generate data on the basis of labels: if two images have the same label (1 or 0), then they are similar. We will generate data in pairs in the form (im_a, im_b, label). Second, we will generate data on the basis of the person's number, which is encoded in the file name. For example, 02104021.png both starts and ends with 021, which marks it as a real signature produced by person 21.
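The file-name rule can be sketched in a few lines of plain Python. This mirrors the startswith/endswith check used by generate_data later in the article. The helper name parse_signature_name and the second file name in the example are hypothetical.

```python
def parse_signature_name(file_name, id_length=3):
    """Return (person_id, is_real) using the naming rule assumed in this article."""
    stem = file_name.rsplit(".", 1)[0]  # drop the extension
    person_id = stem[:id_length]        # leading digits identify the person
    # the file counts as real when it also ends with the same person id
    is_real = stem.endswith(person_id)
    return person_id, is_real


print(parse_signature_name("02104021.png"))  # the example from the text
print(parse_signature_name("02104005.png"))  # hypothetical name with a different suffix
```

For the example from the text this yields person "021" and a real signature. A name whose ending differs from its prefix is treated as forged.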
Data generation Approach #1:
Here we assume similarity on the basis of labels: if two images have the same label (i.e. both 1 or both 0), then they are similar.
def generate_data_first_approach(features, labels, test_size=0.25):
    """
    Generate data in pairs according to labels.
    Arguments:
        features: numpy
        labels: numpy
    Returns:
        x_train, y_train, x_test, y_test, x_val, y_val, pairs, pair_labels
    """
    im_a = []  # images a
    im_b = []  # images b
    pair_labels = []
    # pair every image with the one that follows it
    for i in range(0, len(features)-1):
        j = i + 1
        im_a.append(features[i])
        im_b.append(features[j])
        # 1 = similar (same label), 0 = not similar
        pair_labels.append(1 if labels[i] == labels[j] else 0)
    pairs = np.stack([im_a, im_b], axis=1)
    pair_labels = np.array(pair_labels)
    x_train, x_test, y_train, y_test = train_test_split(pairs, pair_labels, test_size=test_size, random_state=42)
    x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.25, random_state=42)
    return x_train, y_train, x_test, y_test, x_val, y_val, pairs, pair_labels


x_train, y_train, x_test, y_test, x_val, y_val, pairs, pair_labels = generate_data_first_approach(features, labels)

# show data: both images of the first pair and their label
plt.imshow(pairs[:,0][0]/255.)
plt.show()
plt.imshow(pairs[:,1][0]/255.)
plt.show()
print("Label: ",pair_labels[0])
Training the dataset with Dataset Generation #1
Now we will train the network. Due to computational limitations, we train the model for only a single epoch.
# x_train has shape (n, 2, 150, 150, 3): x_train[:,0] is the first image of every pair, x_train[:,1] the second
sm.fit([x_train[:,0], x_train[:,1]], y_train, validation_data=([x_val[:,0],x_val[:,1]], y_val), epochs=1)
- The metric is the L1 distance (MAE) between the prediction y_hat and the label y.
- This is a very simple Siamese network capable of learning the similarity function.
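To make the MAE metric concrete, here is a small NumPy sketch that scores a handful of pair predictions. The predictions are made-up sigmoid outputs, and the 0.5 decision threshold and the helper names are assumptions, not values taken from the trained model.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """L1 distance between labels and predictions, averaged over the batch."""
    return float(np.mean(np.abs(y_true - y_pred)))


def pair_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of pairs classified correctly after thresholding the score."""
    decisions = (y_pred >= threshold).astype(int)
    return float(np.mean(decisions == y_true))


y_true = np.array([1, 0, 1, 0])          # 1 = similar pair, 0 = not similar
y_pred = np.array([0.9, 0.2, 0.4, 0.1])  # made-up sigmoid outputs

print(mean_absolute_error(y_true, y_pred))
print(pair_accuracy(y_true, y_pred))
```

MAE rewards confident correct scores, while the thresholded accuracy only checks which side of 0.5 each score falls on. In this toy batch the third pair is the one that is misclassified.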
Data Generation Approach #2
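In this approach, we build the dataset by pairing each of a person's signatures with every signature of the same kind from that person: real with real (similar) and forged with forged (not similar).
The two image arrays and the label array must all have the same length.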
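A toy version of this pairing, using file names instead of images, shows how the pair count grows. It is a sketch under the assumption that pairs are formed as a full cross product within each group, including each signature paired with itself. The helper build_pairs and the file names are hypothetical.

```python
from itertools import product


def build_pairs(real_names, forged_names):
    """Pair real with real (label 1) and forged with forged (label 0)."""
    first, second, labels = [], [], []
    # similar batch: every real signature against every real signature
    for a, b in product(real_names, repeat=2):
        first.append(a)
        second.append(b)
        labels.append(1)
    # not similar batch: every forged signature against every forged signature
    for a, b in product(forged_names, repeat=2):
        first.append(a)
        second.append(b)
        labels.append(0)
    return first, second, labels


real = ["real_1.png", "real_2.png"]                      # hypothetical names
forged = ["forged_1.png", "forged_2.png", "forged_3.png"]
first, second, labels = build_pairs(real, forged)
print(len(first), len(second), len(labels), sum(labels))
```

With n real and m forged signatures, this yields n² + m² pairs per person, and the three lists always end up with the same length.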
def generate_data(person_number="001"):
    """Split the file names belonging to one person into real and forged."""
    x = list(features_dict.keys())
    im_r = []
    im_f = []
    labels = []  # represents 1 if signature is real else 0
    for i in x:
        if i.startswith(person_number):
            if i.endswith("{}.png".format(person_number)):
                im_r.append(i)
                labels.append(1)
            else:
                im_f.append(i)
                labels.append(0)
    return im_r, im_f, labels


def generate_dataset_approach_two(size=100, test_size=0.25):
    """
    Generate data using the second approach.
    Remember input and output must be the same size!
    Arguments:
        size: the target size (length of the array); currently unused
        test_size: float
    Returns:
        x_train, y_train, x_test, y_test, x_val, y_val, pairs, ls
    """
    im_r = []
    im_f = []
    ls = []

    ids = ["{:03d}".format(n) for n in range(1, 31)]  # "001" ... "030"
    for person in ids:
        imr, imf, labels = generate_data(person)
        # similar batch
        for a in imr:
            for b in imr:
                im_r.append(img_to_array(features_dict[a][0]))
                im_f.append(img_to_array(features_dict[b][0]))
                ls.append(1)  # they are similar
        # not similar batch
        for a in imf:
            for b in imf:
                im_r.append(img_to_array(features_dict[a][0]))
                im_f.append(img_to_array(features_dict[b][0]))
                ls.append(0)  # they are not similar
    print(len(im_r), len(im_f))
    pairs = np.stack([im_r, im_f], axis=1)
    ls = np.array(ls)
    x_train, x_test, y_train, y_test = train_test_split(pairs, ls, test_size=test_size, random_state=42)
    x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.25, random_state=42)
    return x_train, y_train, x_test, y_test, x_val, y_val, pairs, ls


x_train, y_train, x_test, y_test, x_val, y_val, pairs, ls = generate_dataset_approach_two()

# show data: both images of the first training pair and their label
plt.imshow(x_train[:,0][0]/255.)
plt.show()
plt.imshow(x_train[:,1][0]/255.)
plt.show()
print("Label: ",y_train[0])
Training the Network with Dataset Generation #2
The biggest difference between dataset generation #1 and #2 is the way the inputs are arranged. In #1 we pair signatures according to their labels alone, regardless of who wrote them, but in #2 every pair comes from the same person.
Conclusion
To conclude, we present a plausible method for detecting forged signatures using Siamese networks and, most importantly, we show how a Siamese network can be trained with only a few training examples. We also see how transfer learning can improve on a plain CNN.