Computer vision and machine learning in PHP using the opencv library

Hello everyone. This is my jubilee article. In almost 7 years I have written 10 articles (including this one), 8 of them — technical. The total number of views of all articles is about half a million.
I have made the main contribution into two topics: PHP and Server Administration. I like to work at the junction of these two areas, but the scope of my interests is much broader.
Like many developers, I often use the results of someone else’s work (articles on the medium, the code on the github, etc), so I’m always glad to share with the community my results in response. Writing articles is not only a return of debt to the community, but it also allows you to find like-minded people, get comments from professionals in a narrow field, and further deepen your knowledge in the field under investigation.
Actually this article is about one of these moments. In it I describe what I have been doing almost all my free time in the last six months. Except those moments when I watched TV shows or played games.

Now, “Machine learning” is developing very fast, it has already written a lot of articles, including the ones on the medium, and almost every developer would like to start using it in their work tasks and home projects, but where to start and what to use is not always understandable. Most articles for beginners offer a bunch of literature, on the reading of which there is not enough life, “inexpensive” courses, etc.
Regularly there appear new articles in which new approaches to the solution of a particular problem are described. On the github you can find an implementation of the approach described in the articles. As programming languages ​​are more often used the following ones: c / c ++, python 2/3, lua and matlab, and as frameworks: caffe, tensorflow, torch. A large segmentation in programming languages ​​and frameworks greatly complicates the procedure for finding what you need and integrating it into the project.

To somehow reduce all this chaos in opencv added a module dnn, which allows you to use the model, trained in the basic frameworks. I’ll show you how this module can be used from php.

Jeremy Howard (creator of the free practical course “machine learning for coders”) believes that now there is a big threshold between learning machine learning and applying it in practice.

Howard says that to start learning machine learning one year of programming experience is enough. I fully agree with him and I hope that my article will help reduce the threshold of entry into opencv for php-developers who are little familiar with machine learning and are not yet sure whether they want to do this at all or not, and also will try to describe all the points for which I spent hours and days, so you do not have to spend more than a minute to do this.

logo of php-opencv project

I was considering writing a php-opencv module by myself using SWIG and spent a lot of time on it, but I did not achieve anything. Everything was complicated by the fact that I did not know c / c ++ and did not write extensions for php 7. Unfortunately most of the materials on the Internet were written for php-extensions on php 5, so I had to gather information by bit, and solve the problems by myself.

Then I found the library of php-opencv on the github space, it is a module for php7, which makes calls to opencv methods. It took me several evenings to compile, install and run the examples. I began to try various features of this module, but I lacked some methods, I added them myself, created a pull request, and the author of the library accepted. Later, I added more features.

This is how the image loading looks:

$image = cv\imread(“images/faces.jpg”);

For comparison, on a python it looks like this:

image = cv2.imread(“images/faces.jpg”)

When reading an image in php (as well as in c ++), the information is stored in the Mat object (matrix). In php, its analogue is a multidimensional array, but unlike a multidimensional array, this object allows various quick manipulations, for example, dividing all elements by a number. In the python, when the image is loaded, the numpy object is returned.

Careful, legacy! It happened so that imread (in php, c ++ and pyton) loads the image not in RGB format, but in BGR. Therefore, in the examples with opencv, you can often see the procedure for converting BGR -> RGB and vice versa.

Face detection

The first thing I tried was this function. For it in opencv there is a class CascadeClassifier, which can use the pre-model in xml format. Before finding a face, it is recommended to convert the image into a black and white format.

$src = imread(“images/faces.jpg”); 
$gray = cvtColor($src, COLOR_BGR2GRAY); 
$faceClassifier = new CascadeClassifier(); 
$faceClassifier->load(‘models/lbpcascades/lbpcascade_frontalface.xml’); 
$faceClassifier->detectMultiScale($gray, $faces);

complete example code

Result:

As can be seen from the example, there is no problem finding a face even on a photo in a zombie makeup. Points also do not interfere with finding a person.

Face recognition

For this, opencv has the class LBPHFaceRecognizer and the methods train / predict.
If we want to know who is present in the photo, then first we need to train the model using the train method, it takes two parameters: an array of face images and an array of numeric labels for these images. Then you can call the predict method on the test image (face) and get the numeric label to which it matches.

$faceRecognizer = LBPHFaceRecognizer :: create ();
$faceRecognizer-> train ($myFaces, $myLabels = [1,1,1,1]); // 4 my faces
$faceRecognizer-> update ($angelinaFaces, $angelinaLabels = [2,2,2,2]); // 4 faces of Angelina
$label = $faceRecognizer-> predict ($faceImage, $confidence);
// get label (1 or 2) and confidence

complete example code

Data sets:

Result:

When I started working with LBPHFaceRecognizer, it did not have the ability to save / load / update the finished model. Actually my first pull request added these methods: write / read / update.

Face marks / landmarks

When I started to get acquainted with opencv, I often came across photos of people, where points marked the eyes, nose, lips, etc. I wanted to repeat this experiment by myself, but in the opencv version for the python this was not implemented. It took me an evening to add FacemarkLBF support to php and send a second bulletback. All works simply, we load the pre-model, we feed an array of faces, we get an array of points for each person.

$facemark = FacemarkLBF::create(); 
$facemark->loadModel(‘models/opencv-facemark-lbf/lbfmodel.yaml’); 
$facemark->fit($image, $faces, $landmarks);

complete example code

Result:

As can be seen from the example, makeup of a zombie can make it difficult to find the points on the face. Points can also interfere with finding a face. Illumination also affects it. In this case, foreign objects in the mouth (strawberries, cigarettes, etc.) may not interfere.

After my first pull request, I was inspired and began to see what can be done with opencv and stumbled upon the article Deep Learning, now in OpenCV. Without hesitation, I decided to add into php-opencv the possibility of using pre-trained models, which are a lot on the Internet. It turned out to be not so difficult to load caffe-models, although later it took me a lot of time to learn how to work with multidimensional matrixes and work with caffe / torch / tensorflow models without using opencv.

Face detection using the dnn module

So, opencv allows you to load pre-trained models in Caffe using the function readNetFromCaffe. It takes two parameters — paths to .prototxt and .caffemodel files. In the prototxt-file there is the description of the model, and in caffemodel — the weights computed during the training of the model.
Here is an example of the beginning of a prototxt file:

input: “data” 
input_shape { 
 dim: 1 
 dim: 3 
 dim: 300 
 dim: 300 
}

This piece of file describes that a 4-dimensional matrix of 1x3x300x300 is expected to enter the input. In the description of models, it is usually stated what is expected in this format, but in most cases it means that an RGB image (3 channels) with a size of 300x300 is expected to be input.
By loading an RGB image of 300x300 with the imread function, we get a matrix of 300x300x3.
To bring the matrix 300x300x3 to the form 1x3x300x300 in opencv there is a function blobFromImage.
After that, we can only apply blob to the network input using the setInput method and call the forward method, which will return the finished result to us.

$src = imread(“images/faces.jpg”); 
$net = \CV\DNN\readNetFromCaffe(‘models/ssd/res10_300x300_ssd_deploy.prototxt’, ‘models/ssd/res10_300x300_ssd_iter_140000.caffemodel’); 
$blob = \CV\DNN\blobFromImage($src, $scalefactor = 1.0, $size = new Size(300, 300), $mean = new Scalar(104, 177, 123), $swapRB = true, $crop = false); 
$net->setInput($blob, “”); 
$result = $net->forward();

In this case, the result is a matrix of 1x1x200x7, i.e. 200 arrays of 7 elements each. In a photo with four faces, the network found 200 candidates. Each of which looks like [,, $confidence, $startX, $startY, $endX, $endY]. The element $confidence is responsible for “confidence”, i.e. then the prediction probability is good, for example 0.75. The following elements are responsible for the coordinates of the rectangle with the face. In this example, only 3 people were found with confidence greater than 50%, and the remaining 197 candidates have confidence less than 15%.

The size of the model is 10 MB, complete example code

Result:

As can be seen from the example, the neural network does not always produce good results when using it “on the forehead”. No fourth face was found, but if the fourth photo was cut and sent to the network separately, the face would be found.

Improving the quality of images using a neural network

A long time ago I heard about the waifu2x library, which allows you to eliminate noise and increase the size of icons / photos. The library itself is written in lua, and under the hood it uses several models (for increasing icons, eliminating photo noise, etc.) trained in torch. The author of the library exported these models to caffe and helped me to use them from opencv. As a result, an example was written in php to increase the resolution of the icons.
The size of the model is 2 MB, the full code of the example.

Original
Result
Enlarging a picture without using a neural network

Image classification

The MobileNet neural network, trained on the ImageNet data set, allows you to classify an image. In total, it can determine 1000 classes, which in my opinion is not enough.
The size of the model is 16 MB, the full code of the example.

Result: 87% — Egyptian cat, 4% — tabby, tabby cat, 2% — tiger cat

Tensorflow Object Detection API

The MobileNet SSD (Single Shot MultiBox Detector) network, trained in Tensorflow on the COCO dataset, can not only classify an image, but also return regions, although it can only detect 182 classes.
The model size is 19 MB, the full code of the example.

Original
Result

Syntax highlighting and code completion

To the repository with examples I also added the file phpdoc.php. Thanks to it, Phpstorm highlights the syntax of functions, classes and their methods, and also works with code completion. This file does not need to be included in your code (otherwise there will be an error), it’s enough to put it into your project. Personally, it makes life easier for me. This file describes most of the opencv functions, but not all, so that pull requests are welcome.

Installation

The dnn module appeared in opencv only in version 3.4 (before that it was in opencv-contrib).
In ubuntu 18.04 the latest version of opencv is 3.2. Build opencv from the sources takes about half an hour, so I compiled the package under ubuntu 18.04 (also works for 17.10, size 25MB), and also compiled php-opencv packages for php 7.2 (ubuntu 18.04) and php 7.1 (ubuntu 17.10) (size 100KB). Registered ppa: php-opencv, but has not yet mastered the fill and found nothing better than just upload packages on github. I also created an request for creating an account in pecl, but after a few months I still did not get a response.
So now the installation under ubuntu 18.04 looks like this:

apt update && apt install -y wget && \
wget https://raw.githubusercontent.com/php-opencv/php-opencv-packages/master/opencv_3.4_amd64.deb && dpkg -i opencv_3.4_amd64.deb && rm opencv_3.4_amd64.deb && \
wget https://raw.githubusercontent.com/php-opencv/php-opencv-packages/master/php-opencv_7.2-3.4_amd64.deb && dpkg -i php-opencv_7.2–3.4_amd64.deb && rm php-opencv_7.2–3.4_amd64.deb && \
echo “extension=opencv.so” > /etc/php/7.2/cli/conf.d/opencv.ini

Installation of this option takes about 1 minute. All installation options on ubuntu.
I also compiled a 168 MB docker image.

Examples using

Downloading:

git clone https://github.com/php-opencv/php-opencv-examples.git && cd php-opencv-examples

Running:

php detect_face_by_dnn_ssd.php

PS

Subscribe, so as not to miss my next articles, put a husky to motivate me to write them and write in the comments questions, offer options for new experiments / articles.

References:

php-opencv-examples — all examples from the article
php-opencv/php-opencv — my fork with support for the dnn module
hihozhou /php-opencv — the original repository, without the support of the dnn module (I created pulrequest, but it has not yet been accepted).
https://habr.com/post/358902/ — Russian version