Face Detection on Cloud Foundry
Computer vision (CV), a subfield of artificial intelligence, has been a hot topic for quite a while and has a wide range of applications. In the automotive industry, for example, it lays the foundation for self-driving cars (see figure 1), and in medicine it is used to extract clinically relevant knowledge from medical images.
Google’s DeepMind and the University of Oxford also recently published an article about a lip-reading system that can read lips better than humans after watching thousands of hours of TV. Another interesting application can be found in a comedy club in Barcelona, which uses facial-recognition technology to track how much the audience enjoyed the show and charges them accordingly. Since CV is such an interesting field, we wondered whether we would be able to deploy a simple face detection web app using OpenCV on Cloud Foundry (CF). The short answer is yes.
Note: For those of you who don’t know CF, it is a Platform-as-a-Service similar to Heroku, with the main difference that it can run in both public and private clouds. If you want to know how CF relates to data science, check out this article.
Face detection is widely used in many applications. There are already a number of algorithms that detect human faces quite successfully and come close to human performance. Since we didn’t want to reinvent the wheel and time was limited (just a day, actually), we decided to go with OpenCV, a popular computer vision library that was started by Intel in 1999. The main reason for using this library is its large number of readily available algorithms, which include a comprehensive set of both classic and state-of-the-art computer vision and machine learning algorithms. OpenCV is written in C++, but there are bindings for Python, C, Java and MATLAB.
For our purpose, we use OpenCV’s Python API to create the web application. A couple of examples of this already exist, and two implementations in particular proved to be very useful:
Both implementations use Haar Feature-based Cascade Classifiers to detect human faces. The core idea of the algorithm is as follows.
- It uses Haar-like features (figure 2), which are adjacent rectangular regions at a specific location in a detection window, and evaluates them with integral images so that the pixel intensities in each region can be summed very efficiently.
- Weak classifiers that take the Haar-like features as input are trained and combined through boosting into a model that classifies whether there is a human face in an image.
- The concept of a cascade is used for prediction: the classifier we get from training consists of several simpler classifiers that are applied one after another to a sub-window, each with a subset of the features instead of the entire feature space.
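To make the first point more concrete, here is a minimal sketch of an integral image and a two-rectangle Haar-like feature in plain Python. The function names and the tiny example image are our own illustration and are not taken from OpenCV, which implements this far more efficiently.

```python
def integral_image(img):
    """Return a summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Everything above (previous table row) plus the current row so far.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Hypothetical 4x3 grayscale patch: dark on the left, bright on the right.
image = [
    [1, 1, 9, 9],
    [1, 1, 9, 9],
    [1, 1, 9, 9],
]
ii = integral_image(image)
print(rect_sum(ii, 0, 0, 4, 3))          # sum of all pixels in the patch
print(two_rect_feature(ii, 0, 0, 4, 3))  # strongly negative: left is darker
```

Once the table is built, the cost of a rectangle sum no longer depends on the rectangle's size, which is what makes evaluating many features per sub-window affordable.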
If you want more information about the algorithm as implemented in OpenCV, see this article.
Flask and OpenCV
The two implementations above are able to detect human faces, but they cannot be deployed as a web app. This article by Miguel Grinberg bridges the gap: he used Flask, a lightweight web framework for Python, to create a video streaming web application. The main idea is to use Flask’s native support for streaming responses through generator functions. He used this idea to stream a sequence of independent JPEG pictures (this is called Motion JPEG) to the browser. His implementation, however, was not ideal for us, since we were not able to control a webcam from an application container that sits in the cloud.
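As a rough illustration of the streaming idea, the sketch below wraps JPEG frames into the parts of a multipart response with a generator function. The names mjpeg_chunks, create_app, frame_source and the /video_feed route are our own placeholders; only Flask, Response and the multipart/x-mixed-replace MIME type are standard.

```python
def mjpeg_chunks(frames, boundary=b"frame"):
    """Yield each JPEG frame as one part of a multipart/x-mixed-replace body."""
    for jpeg_bytes in frames:
        yield (
            b"--" + boundary + b"\r\n"
            + b"Content-Type: image/jpeg\r\n\r\n"
            + jpeg_bytes + b"\r\n"
        )


def create_app(frame_source):
    """Build a Flask app that streams whatever frame_source() yields.

    frame_source is a hypothetical callable returning an iterable of
    JPEG-encoded byte strings, for example frames grabbed from a camera.
    """
    from flask import Flask, Response  # imported here: only needed for serving

    app = Flask(__name__)

    @app.route("/video_feed")
    def video_feed():
        # Flask sends each chunk as soon as the generator yields it.
        return Response(
            mjpeg_chunks(frame_source()),
            mimetype="multipart/x-mixed-replace; boundary=frame",
        )

    return app


# The generator can be inspected without starting a server:
for chunk in mjpeg_chunks([b"<jpeg 1>", b"<jpeg 2>"]):
    print(len(chunk))
```

The browser replaces the displayed image every time a new part arrives, so an img element pointing at the route behaves like a video.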
WebRTC is the Solution
To stream a webcam through a web application, we were luckily able to use WebRTC, a collection of communications protocols and APIs that enables real-time communication over peer-to-peer connections. Specifically, we used getUserMedia (the ability to request access to a user’s webcam and microphone), which is supported by most browsers (except Safari at the moment).
Bringing the Pieces Together
Now with all the individual parts in place, we were able to implement our solution that is fully deployable on CF. The full code can be found in this repo.
Our implementation works as follows.
- We use OpenCV’s built-in cascade classifier for detecting faces: the detect_faces() function takes an image as input and returns a list of rectangles where it believes it found a face, or an empty list if it found none.
- The primary reason for returning only a list of rectangles, instead of the full image with the rectangles drawn on it, is to save bandwidth. One of our earlier implementations also used Motion JPEG, but that was really slow. Instead, the rectangles are drawn on the client side.
- Then we create an API endpoint that accepts images, invokes the detect_faces() function described above, and returns a JSON object with the coordinates of the rectangles if faces are found in the image.
- On the client, we take a snapshot from the video using the drawImage() method and post it to our API endpoint, which returns the coordinates of the rectangles if faces are found.
- Finally, the drawFaces() method draws the rectangles on a canvas element that we place over the video.
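The server side of these steps could look roughly like the sketch below. It is not the code from our repo: the cascade_path parameter, the faces_to_json() helper and the JSON field names are assumptions made for illustration, and the detectMultiScale parameters are common defaults rather than tuned values.

```python
import json


def detect_faces(image_bytes, cascade_path):
    """Return a list of (x, y, w, h) boxes, or an empty list if no face is found.

    cascade_path is assumed to point at one of the Haar cascade XML files
    shipped with OpenCV, such as haarcascade_frontalface_default.xml.
    """
    # Imported here so that faces_to_json() below works without OpenCV installed.
    import cv2
    import numpy as np

    # A real app would load the classifier once at startup, not per request.
    cascade = cv2.CascadeClassifier(cascade_path)
    buf = np.frombuffer(image_bytes, dtype=np.uint8)
    gray = cv2.imdecode(buf, cv2.IMREAD_GRAYSCALE)
    if gray is None:  # the payload was not a decodable image
        return []
    rects = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    # Convert NumPy integers to plain ints so the result is JSON serializable.
    return [tuple(int(v) for v in rect) for rect in rects]


def faces_to_json(rects):
    """Serialize the boxes as the JSON body the client uses to draw rectangles."""
    faces = [{"x": x, "y": y, "width": w, "height": h} for (x, y, w, h) in rects]
    return json.dumps({"faces": faces})


# Only a few numbers per face travel back to the browser, not a whole image.
print(faces_to_json([(10, 20, 30, 40)]))
print(faces_to_json([]))
```

An API endpoint would then read the posted image bytes from the request, pass them to detect_faces(), and return the output of faces_to_json() as the response body.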
Finally, we can push the app to Cloud Foundry. Here we use the official Python Buildpack, which recently added support for Conda as a package manager. Unfortunately, we are only able to use Python 2 at the moment, as the OpenCV build for Python 3 has GTK support enabled, which caused problems with the app container. In the future, we want to build our own OpenCV from source for Python 3 with GTK disabled.
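One detail worth keeping in mind when pushing a web app: Cloud Foundry tells the app which port to listen on through the PORT environment variable. A minimal sketch of how an app might pick it up follows; the listen_address() helper and the local fallback port 5000 are our own choices for illustration.

```python
import os


def listen_address(environ=os.environ, default_port=5000):
    """Return the (host, port) pair the web server should bind to."""
    # Fall back to a local development port when PORT is not set.
    port = int(environ.get("PORT", default_port))
    # Bind to all interfaces so the platform's router can reach the app.
    return "0.0.0.0", port


host, port = listen_address()
print("would listen on %s:%d" % (host, port))
# A Flask app would then be started with app.run(host=host, port=port).
```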
Face Detector App
Figure 3 shows the app in action, deployed on Pivotal’s own CF instance, PWS:
Here is the link to the demo. Make sure not to use Safari!
Conclusion and Outlook
We managed to build a simple face detection app in a very short time. We demonstrated that it is possible to use OpenCV on CF. In the future, we plan to create more use cases in this area.
We hope that you find this article useful. Do get in touch with us if you want to discuss this further!