Face Verification using MTCNN, FaceNet and Siamese Network with Triplet Loss
Facial verification is a critical application of computer vision, widely used in security systems, user authentication, and more. In this blog post, we’ll build a facial verification system using MTCNN for face detection, FaceNet for embedding generation, and a Siamese network trained with triplet loss. Let’s dive into each step in detail.
Results
At Threshold=0.5
Siamese Network for Facial Verification
A Siamese network compares the similarity between images using triplets: an anchor image of a specific person, a positive image of the same person, and a negative image of a different person. It learns to minimize the distance between the anchor and positive images while maximizing the distance between the anchor and negative images. This approach ensures the system can accurately differentiate between images of the same person and images of different people.
Dataset Details
For each person, we need three folders: Anchor, Positive, and Negative.
- Anchor (A): An image of a person.
- Positive (P): Another image of the same person as in the anchor.
- Negative (N): An image of a different person.
I organized the dataset like this:
dataset/
    person1/
        anchor/
        positive/
        negative/
    person2/
        anchor/
        positive/
        negative/
    person3/
        anchor/
        positive/
        negative/
    ...
For selecting negative images, you can use the LFW (Labeled Faces in the Wild) dataset, which provides a diverse collection of images of different people.
The goal is to create triplets (A, P, N) such that:
- A and P are images of the same person.
- A and N are images of different people.
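Before building triplets from this layout, it can help to confirm that every person folder really contains all three non-empty subfolders. The following is a minimal sketch using only the standard library; the helper name `find_incomplete_people` is hypothetical and not part of the original pipeline, and the demonstration runs on a throwaway directory rather than real data.

```python
from pathlib import Path
import tempfile

REQUIRED_SUBFOLDERS = ("anchor", "positive", "negative")

def find_incomplete_people(dataset_root):
    """Return {person: [missing or empty subfolders]} for the layout above."""
    problems = {}
    for person_dir in sorted(Path(dataset_root).iterdir()):
        if not person_dir.is_dir():
            continue  # ignore stray files in the dataset root
        missing = [
            name for name in REQUIRED_SUBFOLDERS
            if not (person_dir / name).is_dir() or not any((person_dir / name).iterdir())
        ]
        if missing:
            problems[person_dir.name] = missing
    return problems

# Demonstration on a temporary directory tree (hypothetical file names)
with tempfile.TemporaryDirectory() as root:
    for sub in REQUIRED_SUBFOLDERS:
        (Path(root) / "person1" / sub).mkdir(parents=True)
        (Path(root) / "person1" / sub / "img.jpg").touch()
    # person2 has an empty anchor folder and no positive/negative folders
    (Path(root) / "person2" / "anchor").mkdir(parents=True)
    report = find_incomplete_people(root)

print(report)
```

People listed in the report would be skipped or would raise errors later, so fixing them first keeps the triplet counts predictable.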
Loss Function Formula:
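$$L(a, p, n) = \max\{\, d(a_i, p_i) - d(a_i, n_i) + \text{margin},\ 0 \,\}$$
Here $d$ is the distance between two embeddings, $i$ indexes the triplet, and the margin is the minimum gap we want between the anchor-positive and anchor-negative distances.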
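To make the formula concrete, here is a small worked example in NumPy. It is a sketch that assumes d is the squared Euclidean distance (the same choice the TensorFlow loss in Step 5 makes) and uses made-up 2-D embeddings; the function name `triplet_loss_np` is illustrative only.

```python
import numpy as np

def triplet_loss_np(anchor, positive, negative, margin=0.2):
    """Per-triplet loss: max(d(a, p) - d(a, n) + margin, 0) with squared Euclidean d."""
    pos_dist = np.sum((anchor - positive) ** 2, axis=-1)
    neg_dist = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(pos_dist - neg_dist + margin, 0.0)

# Two toy triplets (values are invented for illustration)
anchor = np.array([[0.0, 0.0], [0.0, 0.0]])
positive = np.array([[0.1, 0.0], [0.5, 0.0]])
negative = np.array([[1.0, 0.0], [0.5, 0.1]])

losses = triplet_loss_np(anchor, positive, negative)
print(losses)
```

In the first triplet the negative is already much farther away than the positive (0.01 versus 1.0), so the loss is zero. In the second, the two distances are nearly equal (0.25 versus 0.26), so the margin is violated and the loss is about 0.19.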
Step 1: Setup and Pre-requisites
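Before we start, make sure the necessary libraries are installed (the imports below come from the opencv-python, mtcnn, tqdm, keras-facenet, tensorflow, and scikit-learn packages).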
import os
import cv2
import random
from mtcnn import MTCNN
from tqdm import tqdm
from keras_facenet import FaceNet
from tensorflow.keras import layers, Model
from tensorflow import keras
import tensorflow as tf
from sklearn.metrics import f1_score, precision_score, recall_score
Step 2: Face Detection with MTCNN
2.1: Load the MTCNN Model
First, we’ll load the MTCNN model for face detection.
def load_face_detection_model():
    return MTCNN()
2.2: Detect and Crop Faces
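def detect_and_crop_face(image_path, detector):
    image = cv2.imread(image_path)
    if image is None:
        # cv2.imread returns None for unreadable or non-image files
        return None, None
    # OpenCV loads images as BGR, while MTCNN expects RGB input
    faces = detector.detect_faces(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    if faces:
        # Use the first detected face
        x, y, w, h = faces[0]['box']
        # Compute the far corner from the raw box, then clip everything to the image bounds
        endX, endY = min(image.shape[1], x + w), min(image.shape[0], y + h)
        x, y = max(0, x), max(0, y)
        face = image[y:endY, x:endX]
        return face, (x, y, endX, endY)
    return None, None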
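The clipping step matters because a detector can report a box that extends slightly past the image border, and slicing with a negative start index would silently crop the wrong region. The sketch below isolates that arithmetic in plain Python; `clamp_box` is a hypothetical helper, and the box values are invented.

```python
def clamp_box(box, image_width, image_height):
    """Convert an (x, y, w, h) box to (x1, y1, x2, y2) clipped to the image."""
    x, y, w, h = box
    # Compute the far corner from the raw coordinates before clipping the near corner
    x2, y2 = min(image_width, x + w), min(image_height, y + h)
    x1, y1 = max(0, x), max(0, y)
    return x1, y1, x2, y2

# A box that starts left of the image and runs past the bottom edge
clamped = clamp_box((-5, 10, 60, 200), image_width=100, image_height=150)
print(clamped)  # (0, 10, 55, 150)
```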
Step 3: Preprocess and Create Dataset
3.1: Preprocess Images
Let's preprocess the images by detecting and cropping faces, then saving them in an organized directory structure.
def create_dataset(data_dir, output_dir):
    face_detection_model = load_face_detection_model()
    for student_folder in os.listdir(data_dir):
        student_dir = os.path.join(data_dir, student_folder)
        student_output_dir = os.path.join(output_dir, "preprocess", student_folder)
        os.makedirs(student_output_dir, exist_ok=True)
        # Create the anchor/positive/negative output folders for this person
        for folder_name in ['anchor', 'positive', 'negative']:
            folder_path = os.path.join(student_output_dir, folder_name)
            os.makedirs(folder_path, exist_ok=True)
        # Detect, crop, and save the face from every image in each subfolder
        for subfolder_name in ['anchor', 'positive', 'negative']:
            subfolder_dir = os.path.join(student_dir, subfolder_name)
            for idx, image_file in enumerate(os.listdir(subfolder_dir), start=1):
                image_path = os.path.join(subfolder_dir, image_file)
                face, bbox = detect_and_crop_face(image_path, face_detection_model)
                if face is not None:
                    output_file = f"{student_folder}_{subfolder_name}_{idx}.jpg"
                    cv2.imwrite(os.path.join(student_output_dir, subfolder_name, output_file), face)
                else:
                    print(f"No face detected in {image_path}. Skipping...")

data_dir = "data"  # path to your dataset folder
output_dir = "face_preprocess"  # cropped faces are saved here, in the same anchor/positive/negative layout shown above
create_dataset(data_dir, output_dir)
3.2: Create Triplets
Next, we'll create triplets of anchor, positive, and negative images for training the Siamese network. (This code assumes the folder layout of my dataset, shown above.)
def create_triplets(data_dir):
    anchor_paths = []
    positive_paths = []
    negative_paths = []
    people = os.listdir(data_dir)
    for person in people:
        anchor_dir = os.path.join(data_dir, person, 'anchor')
        positive_dir = os.path.join(data_dir, person, 'positive')
        negative_dir = os.path.join(data_dir, person, 'negative')
        # Skip people for whom any of the three folders is empty
        if not (os.listdir(anchor_dir) and os.listdir(positive_dir) and os.listdir(negative_dir)):
            continue
        anchor_images = os.listdir(anchor_dir)
        positive_images = os.listdir(positive_dir)
        negative_images = os.listdir(negative_dir)
        # Pair every anchor with one random positive and one random negative
        for anchor in anchor_images:
            anchor_path = os.path.join(anchor_dir, anchor)
            positive = random.choice(positive_images)
            positive_path = os.path.join(positive_dir, positive)
            negative = random.choice(negative_images)
            negative_path = os.path.join(negative_dir, negative)
            anchor_paths.append(anchor_path)
            positive_paths.append(positive_path)
            negative_paths.append(negative_path)
    return anchor_paths, positive_paths, negative_paths

data_dir = "face_preprocess/preprocess"  # path to the preprocessed folder created by the code above
anchor_paths, positive_paths, negative_paths = create_triplets(data_dir)
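The function above reads negatives from each person's own `negative` folder. If you do not have a separate pool of negatives, one possible variation is to draw them from other people's images instead. The sketch below shows that idea on an in-memory dictionary; `build_triplets`, `images_by_person`, and the file names are all hypothetical, and the fixed seed is only there to make the sampling repeatable.

```python
import random

def build_triplets(images_by_person, seed=0):
    """Build (anchor, positive, negative) tuples, taking negatives from other people."""
    rng = random.Random(seed)  # seeded so repeated runs give the same triplets
    triplets = []
    people = sorted(images_by_person)
    for person in people:
        images = images_by_person[person]
        others = [p for p in people if p != person and images_by_person[p]]
        # A person needs at least two images to form an anchor/positive pair
        if len(images) < 2 or not others:
            continue
        for anchor in images:
            positive = rng.choice([img for img in images if img != anchor])
            negative = rng.choice(images_by_person[rng.choice(others)])
            triplets.append((anchor, positive, negative))
    return triplets

images_by_person = {
    "person1": ["person1_a.jpg", "person1_b.jpg"],
    "person2": ["person2_a.jpg", "person2_b.jpg", "person2_c.jpg"],
    "person3": ["person3_a.jpg"],  # too few images to be an anchor, still usable as a negative
}
triplets = build_triplets(images_by_person)
for triplet in triplets:
    print(triplet)
```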
3.3: Create TensorFlow Datasets
Convert the lists of paths into TensorFlow datasets and preprocess the images.
def preprocess(file_path):
    byte_img = tf.io.read_file(file_path)
    # Force 3 channels so grayscale JPEGs still match the (160, 160, 3) model input
    img = tf.io.decode_jpeg(byte_img, channels=3)
    img = tf.image.resize(img, (160, 160))
    img = img / 255.0
    return img

anchor_ds = tf.data.Dataset.from_tensor_slices(anchor_paths)
positive_ds = tf.data.Dataset.from_tensor_slices(positive_paths)
negative_ds = tf.data.Dataset.from_tensor_slices(negative_paths)
triplets = tf.data.Dataset.zip((anchor_ds, positive_ds, negative_ds))

# Shuffle the file paths once. reshuffle_each_iteration=False keeps the order fixed,
# so the train/validation/test splits below stay disjoint
num_triplets = len(anchor_paths)
shuffled_triplets = triplets.shuffle(buffer_size=num_triplets, reshuffle_each_iteration=False)
preprocessed_triplets = shuffled_triplets.map(lambda anchor, positive, negative: (preprocess(anchor), preprocess(positive), preprocess(negative)))

# 70% train, 15% validation, remainder test
train_size = int(0.7 * num_triplets)
val_size = int(0.15 * num_triplets)
test_size = num_triplets - train_size - val_size

train_dataset = preprocessed_triplets.take(train_size)
val_test_dataset = preprocessed_triplets.skip(train_size)
val_dataset = val_test_dataset.take(val_size)
test_dataset = val_test_dataset.skip(val_size)

train_dataset = train_dataset.cache().batch(16).prefetch(buffer_size=tf.data.AUTOTUNE)
val_dataset = val_dataset.cache().batch(16).prefetch(buffer_size=tf.data.AUTOTUNE)
test_dataset = test_dataset.cache().batch(16).prefetch(buffer_size=tf.data.AUTOTUNE)
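A split built with `take` and `skip` is only disjoint if the shuffled order is the same every time the data is read. The plain-Python sketch below mirrors the 70/15/15 arithmetic on a list of indices to make that explicit; `split_indices` is a hypothetical helper and the seed is an arbitrary choice.

```python
import random

def split_indices(n, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle indices once, then slice them into train/validation/test parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # one fixed shuffle, then slice
    train_size = int(train_frac * n)
    val_size = int(val_frac * n)
    train = indices[:train_size]
    val = indices[train_size:train_size + val_size]
    test = indices[train_size + val_size:]  # whatever remains
    return train, val, test

train, val, test = split_indices(100)
print(len(train), len(val), len(test))
```

Because all three parts are slices of the same shuffled list, no triplet can appear in more than one split.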
Step 4: Defining the Embedding Model
4.1: Initialize FaceNet
We initialize FaceNet and create the embedding model.
def get_embedding_model(input_shape=(160, 160, 3)):
    facenet = FaceNet()
    base_model = facenet.model
    base_model.trainable = False
    inputs = layers.Input(shape=input_shape)
    embeddings = base_model(inputs)
    x = layers.Dense(units=1024, activation="relu")(embeddings)
    x = layers.Dropout(0.2)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dense(units=512, activation="relu")(x)
    x = layers.Dropout(0.2)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dense(units=256, activation="relu")(x)
    x = layers.Dropout(0.2)(x)
    outputs = layers.Dense(units=128)(x)
    embedding_model = Model(inputs=inputs, outputs=outputs, name="embedding_model")
    return embedding_model

embedding_model = get_embedding_model()
embedding_model.summary()
Step 5: Triplet Loss Function
Define the triplet loss function for training the Siamese network.
def triplet_loss(y_pred, margin=0.2):
    """
    Triplet loss function.
    Arguments:
    y_pred -- list containing three parts:
        anchor: the embedding for the anchor image
        positive: the embedding for the positive image
        negative: the embedding for the negative image
    margin -- margin value, controls the relative distance between positive and negative pairs
    Returns:
    loss -- real number, value of the loss
    """
    anchor, positive, negative = y_pred[0], y_pred[1], y_pred[2]
    # Compute the squared Euclidean distance between the anchor and the positive
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=-1)
    # Compute the squared Euclidean distance between the anchor and the negative
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=-1)
    # Compute the triplet loss
    basic_loss = pos_dist - neg_dist + margin
    loss = tf.maximum(basic_loss, 0.0)
    # Return the mean loss over the batch
    return tf.reduce_mean(loss)
Step 6: Building the Siamese Network
Now build the Siamese network using the embedding model.
def get_siamese_network(embedding_model, input_shape=(160, 160, 3)):
    # Input is accessed through layers, which is what the imports in Step 1 provide
    anchor_input = layers.Input(name="anchor", shape=input_shape)
    positive_input = layers.Input(name="positive", shape=input_shape)
    negative_input = layers.Input(name="negative", shape=input_shape)
    # The same embedding model (shared weights) processes all three inputs
    anchor_embedding = embedding_model(anchor_input)
    positive_embedding = embedding_model(positive_input)
    negative_embedding = embedding_model(negative_input)
    siamese_network = Model(
        inputs=[anchor_input, positive_input, negative_input],
        outputs=[anchor_embedding, positive_embedding, negative_embedding]
    )
    return siamese_network

siamese_network = get_siamese_network(embedding_model)
siamese_network.summary()
Step 7: Training the Siamese Network
7.1: Define Training Step
We define a function to perform a single training step for the Siamese network.
opt = tf.keras.optimizers.Adam()
checkpoint_dir = './training_triplet_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, 'ckpt')
checkpoint = tf.train.Checkpoint(opt=opt, siamese_model=siamese_network)

def siamese_train_step(siamese_network, optimizer, data, margin=0.2):
    """
    Perform a single training step for the Siamese network.
    Arguments:
    siamese_network -- the Siamese network model
    optimizer -- the optimizer for training
    data -- tuple containing anchor, positive, and negative images
    margin -- margin value for triplet loss
    Returns:
    loss -- computed loss value
    """
    anchor, positive, negative = data
    with tf.GradientTape() as tape:
        # training=True so that Dropout and BatchNormalization run in training mode
        anchor_embedding, positive_embedding, negative_embedding = siamese_network((anchor, positive, negative), training=True)
        loss = triplet_loss([anchor_embedding, positive_embedding, negative_embedding], margin=margin)
    gradients = tape.gradient(loss, siamese_network.trainable_variables)
    optimizer.apply_gradients(zip(gradients, siamese_network.trainable_variables))
    return loss
7.2: Train the Network
Next, we train the Siamese network using the training step defined above.
def train_siamese_network(siamese_network, optimizer, train_dataset, num_epochs, margin=0.2):
    """
    Train the Siamese network model.
    Arguments:
    siamese_network -- the Siamese network model
    optimizer -- the optimizer for training
    train_dataset -- the training dataset
    num_epochs -- number of epochs for training
    margin -- margin value for triplet loss
    """
    for epoch in range(num_epochs):
        epoch_loss = 0.0
        epoch_accuracy = tf.keras.metrics.BinaryAccuracy()
        with tqdm(total=len(train_dataset), desc=f'Epoch {epoch+1}/{num_epochs}', unit='batch') as pbar:
            for data in train_dataset:
                loss = siamese_train_step(siamese_network, optimizer, data, margin)
                epoch_loss += float(loss)
                # Re-run the batch in inference mode to measure ranking accuracy
                anchor_embedding, positive_embedding, negative_embedding = siamese_network(data)
                pos_dist = tf.norm(anchor_embedding - positive_embedding, axis=-1)
                neg_dist = tf.norm(anchor_embedding - negative_embedding, axis=-1)
                # A triplet counts as correct when the positive is closer than the negative
                accuracy = tf.cast(pos_dist < neg_dist, tf.float32)
                epoch_accuracy.update_state(tf.ones_like(accuracy), accuracy)
                pbar.update(1)
                # Show the batch mean instead of the raw per-triplet array
                pbar.set_postfix({'loss': float(loss), 'accuracy': float(tf.reduce_mean(accuracy))})
        epoch_loss /= len(train_dataset)
        acc_value = epoch_accuracy.result().numpy()
        print(f'Epoch {epoch+1}/{num_epochs}, Loss: {epoch_loss:.4f}, Accuracy: {acc_value:.4f}')
        # Save a checkpoint at the end of every epoch
        checkpoint.save(file_prefix=checkpoint_prefix)
        epoch_accuracy.reset_state()

num_epochs = 10
margin = 0.2
train_siamese_network(siamese_network, opt, train_dataset, num_epochs, margin)
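The accuracy reported during training is a ranking accuracy: the fraction of triplets in which the anchor is closer to the positive than to the negative. The NumPy sketch below reproduces that calculation on invented 2-D embeddings; `triplet_ranking_accuracy` is an illustrative name, not a function from the training code.

```python
import numpy as np

def triplet_ranking_accuracy(anchor, positive, negative):
    """Fraction of triplets where the positive is closer to the anchor than the negative."""
    pos_dist = np.linalg.norm(anchor - positive, axis=-1)
    neg_dist = np.linalg.norm(anchor - negative, axis=-1)
    return float(np.mean(pos_dist < neg_dist))

# Four toy triplets; the third one is ranked incorrectly
anchor = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [0.0, 3.0]])
positive = np.array([[0.1, 0.0], [1.0, 1.2], [2.0, 2.0], [0.0, 3.1]])
negative = np.array([[1.0, 0.0], [3.0, 1.0], [2.0, 1.0], [0.0, 5.0]])

accuracy = triplet_ranking_accuracy(anchor, positive, negative)
print(accuracy)  # 0.75
```

Note that this metric does not depend on any threshold, so a high value here does not by itself guarantee good verification decisions at a fixed distance cutoff.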
Step 8: Evaluating the Model
8.1: Calculate Metrics
We define functions to calculate performance metrics like F1-score, precision, and recall.
def calculate_metrics(y_true, y_pred):
    y_true = tf.keras.backend.flatten(y_true)
    y_pred = tf.keras.backend.flatten(y_pred)
    f1 = f1_score(y_true.numpy(), y_pred.numpy())
    precision = precision_score(y_true.numpy(), y_pred.numpy())
    recall = recall_score(y_true.numpy(), y_pred.numpy())
    return f1, precision, recall
8.2: Evaluate on Datasets
We evaluate the Siamese network on the validation and test datasets.
def evaluate_siamese_network(siamese_network, test_dataset):
    y_true = []
    y_pred = []
    for data in test_dataset:
        anchor_embedding, positive_embedding, negative_embedding = siamese_network(data)
        pos_dist = tf.norm(anchor_embedding - positive_embedding, axis=-1)
        neg_dist = tf.norm(anchor_embedding - negative_embedding, axis=-1)
        # Prediction: 1 if the positive is closer to the anchor than the negative
        batch_pred = tf.cast(pos_dist < neg_dist, tf.int32).numpy().tolist()
        y_pred.extend(batch_pred)
        # Ground truth: by construction, every triplet should rank the positive closer.
        # With all-ones labels, recall equals the fraction of correctly ranked triplets
        y_true.extend([1] * len(batch_pred))
    f1, precision, recall = calculate_metrics(tf.constant(y_true), tf.constant(y_pred))
    print(f'F1-score: {f1:.4f}, Precision: {precision:.4f}, Recall: {recall:.4f}')

print("Evaluating with validation dataset:")
evaluate_siamese_network(siamese_network, val_dataset)
print("\nEvaluating with test dataset:")
evaluate_siamese_network(siamese_network, test_dataset)
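A verification system ultimately makes a yes/no decision per pair, so it can also be informative to score pairs against a distance threshold. The sketch below is one way to do that, not necessarily how the results above were produced: it treats anchor-positive distances as "same person" pairs and anchor-negative distances as "different person" pairs, and assumes the threshold of 0.5 mentioned in the Results section. The function name and the distance values are hypothetical.

```python
import numpy as np

def verification_metrics(pos_dist, neg_dist, threshold=0.5):
    """Precision, recall, and F1 when a pair is accepted if its distance is below threshold."""
    distances = np.concatenate([pos_dist, neg_dist])
    y_true = np.concatenate([
        np.ones(len(pos_dist), dtype=bool),   # anchor-positive pairs: same person
        np.zeros(len(neg_dist), dtype=bool),  # anchor-negative pairs: different people
    ])
    y_pred = distances < threshold
    tp = int(np.sum(y_pred & y_true))
    fp = int(np.sum(y_pred & ~y_true))
    fn = int(np.sum(~y_pred & y_true))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Invented distances: one positive pair is too far apart, one negative pair is too close
pos_dist = np.array([0.2, 0.4, 0.7, 0.3])
neg_dist = np.array([0.9, 0.45, 1.2, 0.8])

precision, recall, f1 = verification_metrics(pos_dist, neg_dist)
print(precision, recall, f1)
```

Sweeping the threshold on the validation set and keeping the value with the best F1 is a common way to choose it before reporting test results.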
Step 9: Saving the Model
Finally, we save the trained Siamese network model.
siamese_network.save('siamese_network_triplet_loss.keras')
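Once the model is trained, verifying a new face comes down to comparing two embeddings. The sketch below shows only that final comparison, using toy vectors in place of real embeddings; in the actual pipeline the vectors would come from `embedding_model` applied to cropped, resized faces. The helper `is_same_person` is hypothetical, and the default threshold of 0.5 is taken from the Results section and should be tuned on your own validation data.

```python
import numpy as np

def is_same_person(embedding_a, embedding_b, threshold=0.5):
    """Return (decision, distance) using the Euclidean distance between two embeddings."""
    diff = np.asarray(embedding_a, dtype=float) - np.asarray(embedding_b, dtype=float)
    distance = float(np.linalg.norm(diff))
    return distance < threshold, distance

# Toy 3-D vectors standing in for 128-D embeddings
same, same_distance = is_same_person([0.1, 0.2, 0.3], [0.1, 0.25, 0.35])
different, different_distance = is_same_person([0.1, 0.2, 0.3], [0.9, -0.4, 0.3])

print(same, round(same_distance, 4))
print(different, round(different_distance, 4))
```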
Conclusion:
In this blog post, we built a facial verification system using MTCNN for face detection, FaceNet for embeddings, and a Siamese network trained with triplet loss. The system verifies identities by comparing the distance between face embeddings. We also found that converting images to grayscale can improve performance, making the system more robust across different lighting conditions.