SVM-based multi-view face recognition using the HOG (Histogram of Oriented Gradients) technique and the dlib library

Sreenand K · Published in Analytics Vidhya · 3 min read · May 11, 2020


Facial recognition is a popular method used in various fields to detect and identify faces. Even though its accuracy is lower than that of other biometric technologies like iris scanning and fingerprint recognition, it is considered a safe method because it is contactless and non-invasive. It is predicted that more people will begin to implement and prefer facial recognition over other biometric processes in the coming years due to the COVID-19 situation. Many facial recognition libraries and frameworks have been developed over the last few years. In this project I have used the Histogram of Oriented Gradients (HOG) technique, developed in 2005, when a study by Navneet Dalal and Bill Triggs showed that HOG detected humans noticeably better than existing methods like wavelets. In the HOG approach, an SVM (Support Vector Machine) is used to classify the extracted features.

What is HOG and how does it work?

HOG is a feature descriptor that extracts features from an image pixel by pixel with the help of gradients. It is primarily used for face detection, face recognition and object detection. HOG works on greyscale images: every image has a particular distribution of gradient orientations, which helps HOG extract unique features from it.
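To make the idea concrete, here is a minimal sketch, assuming nothing beyond NumPy, of the core HOG computation: gradients of a greyscale cell are binned into an orientation histogram weighted by gradient magnitude. A real descriptor (like dlib's) additionally tiles the image into cells, normalizes over blocks, and slides a detection window; the function name `hog_cell_histogram` is mine, not dlib's.

```python
import numpy as np

def hog_cell_histogram(cell, n_bins=9):
    """Orientation histogram of gradients for one greyscale cell (the HOG core idea)."""
    gx = np.zeros_like(cell, dtype=float)
    gy = np.zeros_like(cell, dtype=float)
    gx[:, 1:-1] = cell[:, 2:] - cell[:, :-2]    # horizontal gradient (central difference)
    gy[1:-1, :] = cell[2:, :] - cell[:-2, :]    # vertical gradient
    mag = np.hypot(gx, gy)                      # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180  # unsigned orientation, 0..180 degrees
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, 180), weights=mag)
    return hist

# A horizontal brightness ramp: its gradient points purely along 0 degrees,
# so all of the histogram energy lands in the first orientation bin.
cell = np.tile(np.arange(8, dtype=float), (8, 1))
hist = hog_cell_histogram(cell)
```

Because the histogram is built from gradient directions rather than raw intensities, it stays stable under uniform lighting changes, which is part of why HOG worked so well for detection.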

Requirements -

In this project I have used various video clips from the famous web series Money Heist; the clips were collected from YouTube, where they had been uploaded by various users. First, the algorithm was trained with a single image of every character appearing in the video, each labeled separately. The model was able to identify the characters to an extent, but not from multiple angles. To build a better dataset, the code was tweaked so that each time a face is detected, the model crops the detected face and saves it to the appropriate path with an appropriate file name (character_name+n.bmp). Later on, the collected faces were merged into a single image per character and used as the training dataset, which made it possible to detect the faces from multiple angles. In certain cases 3,000+ faces per character were merged together to create a robust model.
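The cropping-and-saving tweak described above can be sketched as follows. This is a hedged illustration, not the project's actual code: `crop_face` assumes the (top, right, bottom, left) box order used by dlib-based libraries such as face_recognition, and `save_path` is a hypothetical helper that builds the character_name+n.bmp file name.

```python
import os
import numpy as np

def crop_face(frame, box):
    """Cut a detected face out of a video frame.
    box = (top, right, bottom, left), the order returned by dlib-based detectors."""
    top, right, bottom, left = box
    return frame[top:bottom, left:right]

def save_path(root, character, n):
    """Build the <character_name><n>.bmp path for the n-th crop of a character."""
    return os.path.join(root, character, f"{character}{n}.bmp")

frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for one video frame
face = crop_face(frame, (100, 300, 250, 200))    # a 150x100 face box
# In the real loop, each crop would then be written out, e.g.:
# cv2.imwrite(save_path("dataset", "tokyo", 1), face)
```

Keeping a per-character counter for `n` guarantees unique file names, so thousands of crops per character can accumulate without overwriting each other.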

I faced issues with cv2.imwrite when trying to export the images in PNG format, so the images were saved in .bmp format instead.

The merged image, combining all the collected faces of a character, looks like the one below.
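Merging the crops into one sheet can be done by tiling them into a grid. Here is a minimal NumPy sketch under the assumption that all crops were resized to the same dimensions first; `merge_faces` is my name for this hypothetical helper, and blank grid slots in the last row are simply left black.

```python
import numpy as np

def merge_faces(faces, cols):
    """Tile equally-sized face crops into one grid image (the merged training sheet)."""
    h, w, c = faces[0].shape
    rows = -(-len(faces) // cols)  # ceiling division: number of grid rows needed
    sheet = np.zeros((rows * h, cols * w, c), dtype=faces[0].dtype)
    for i, face in enumerate(faces):
        r, col = divmod(i, cols)
        sheet[r * h:(r + 1) * h, col * w:(col + 1) * w] = face
    return sheet

# Five dummy 64x64 "faces" tiled 3 across -> a 2-row, 3-column sheet.
faces = [np.full((64, 64, 3), i, dtype=np.uint8) for i in range(5)]
sheet = merge_faces(faces, cols=3)
```

The same function scales to thousands of crops; only the grid width needs choosing, and the sheet can then be written out with cv2.imwrite like any other image.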

The final output -

Link to the GitHub repository of the code. Feel free to fork it :)