Facial Recognition — the easy way

Maciej Kankowski
Jun 4, 2019 · 6 min read

One of the key buzzwords in today’s IT world is machine learning. Everyone claims either to be doing it, trying to do it, pretending that they will do it or simply reading about it. The potential gains from using one of many machine learning approaches are obvious. The promises made by key technical vendors are very… promising, and many success stories of various machine learning start-ups or big tech firms are encouraging. In this article I’ve written down my experiences on taking the first steps in implementing an actual machine learning use case in our internal mobile app. Keep in mind that I’m not a trained machine learning expert, but rather a developer who likes to play with code.


Our team grows continuously recruiting new members. Basically, it has become harder to know who is who at the firm. This is why we created WeJit — our internal “Facebook”, developed by our internal dev team, available as a web and mobile app. Its main feature is a searchable database of employee profiles. You can think of it as a fancy CV database. Each profile has a photo assigned to it, and the HR department makes sure these photos are the actual photos of the people we hire (not their pets, cars or their favorite Avengers).

Aside from being a useful tool in our daily work in a growing tech company, WeJit is also a playground for testing new technologies and evaluating new use cases. The one that I’m interested in at this stage is the following:

As a WeJit user, I’d like to find a profile by taking a photo of a person.

The story is simple: I go to the lunch area, see someone I don’t know and take a photo of him/her — and the app shows me his/her public WeJit profile. Maybe it is not the smartest way to make new friends — but definitely a reasonable coding challenge to be taken.

What is Facial Recognition?

[Image: Machine Learning is not magic!]

Initially, the problem didn't appear trivial at all. Taking a systematic approach, a few key points need to be taken care of: detecting a face in a photo, turning it into some comparable representation, and matching that representation against a database of known faces.

A few years ago, starting this project would have been much more difficult. But today, with machine learning having grown into one of the hottest topics in IT, things have changed. A lot.

Technology stack

I looked for the most appropriate solution that I could easily integrate with our existing mobile app (a cross-platform React Native app). Moreover, it would be nice if the same solution were applicable to the web part in the future as well. This led me to the conclusion that a cloud-based API would be the best first choice.

Based on a few generally accessible comparisons (Quora, Kairos and RapidAPI), I chose to follow Microsoft's offering in the Azure cloud, leaving an in-house neural network implementation of our own for the future.

Azure Cognitive Services

Cognitive Services, a part of the Microsoft Azure cloud platform, is a set of services for adding machine learning capabilities to existing (or new) products. In our case, I focused on the RESTful services of the Face API, which provide functionality such as face detection, verification of whether two faces belong to the same person, finding similar faces, face grouping, and identification of a detected face within a trained group of people.

So it looks like it has everything we need! Furthermore, Microsoft delivers very solid multi-platform documentation with examples and support for web, iOS and Android development.
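As a taste of what the API looks like, here is a minimal sketch of a face-detection call against the Face API v1.0 REST endpoint. The region, key and image URL are placeholders, not values from our project:

```javascript
// Minimal sketch of a Face API "detect" call (v1.0 REST endpoint).
const REGION = 'westeurope';        // assumption: your Azure region
const API_KEY = '<your-api-key>';   // placeholder
const BASE_URL = `https://${REGION}.api.cognitive.microsoft.com/face/v1.0`;

async function detectFaces(imageUrl) {
  const res = await fetch(`${BASE_URL}/detect?returnFaceId=true`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Ocp-Apim-Subscription-Key': API_KEY,
    },
    // The API also accepts raw image bytes; the JSON-with-URL variant
    // keeps this sketch short.
    body: JSON.stringify({ url: imageUrl }),
  });
  if (!res.ok) throw new Error(`Face API error: ${res.status}`);
  return res.json(); // array of { faceId, faceRectangle, ... }
}
```

The returned `faceId` is the handle used by all follow-up calls such as identification.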

There's no rose without a thorn, as they say, and indeed we encountered some limitations. One of them is a traffic quota: 20 transactions per minute and up to 30,000 transactions per month (these conditions apply to the first year of usage). Luckily, that is more than enough for our use case!
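If you are closer to the limit than we are, a tiny client-side throttle keeps you under the 20-transactions-per-minute quota. This is a hypothetical helper of ours, not part of any Azure SDK; `nextDelayMs` is a pure function so it is easy to reason about:

```javascript
// Given timestamps (ms) of recent calls, return how long to wait before
// the next call is allowed under a sliding-window rate limit.
function nextDelayMs(recentCalls, limit, windowMs, now) {
  const inWindow = recentCalls.filter((t) => now - t < windowMs);
  if (inWindow.length < limit) return 0;
  // Wait until the oldest call in the window falls out of it.
  const oldest = Math.min(...inWindow);
  return oldest + windowMs - now;
}

// Usage sketch: sleep if needed, record the call, then fire it.
async function throttledCall(state, fn) {
  const delay = nextDelayMs(state.calls, 20, 60_000, Date.now());
  if (delay > 0) await new Promise((r) => setTimeout(r, delay));
  state.calls.push(Date.now());
  return fn();
}
```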

First approach — playing with existing example

The first step of our proof-of-concept project was to test the demo app provided by Microsoft: Cognitive-Face-iOS.

I began with the following configuration:

[Image: Training dataset #1 (actual faces used with permission).]

Training dataset: a group of 3 people: 2 people with 2 photos each, 1 person with 1 photo.

Results: the people with 2 photos are recognized correctly, but the person with only 1 photo is not!

[Image: Results of example #1. The person with only 1 training example is not recognized.]

To improve the results and get the expected answer, I corrected the training data by adding an additional image of Jakub.

[Image: Training dataset #2 — added the second photo of Jakub.]

Training dataset: a group of 3 people, each with 2 photos.

Results: All people are recognized correctly! 🚀

[Image: Results #2 — all three people are recognized correctly.]

Second step — integration with WeJit

At this stage, after investigating the source code of the example and playing around for a while, I was ready to start implementing the functionality in the WeJit app.

To build the dataset required by the service, I used an open-source tool, howlowck/train-faces, which offers a nice web-based UI for interacting with the Face API.
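Under the hood, such a tool boils down to a handful of Face API v1.0 REST calls. The sketch below is my reconstruction of that flow, not train-faces code; the group and person names are examples, and `makeApiCall` is a hypothetical request helper:

```javascript
// Generic request helper for the Face API (v1.0 REST endpoints).
function makeApiCall(baseUrl, apiKey) {
  return async (method, path, body) => {
    const res = await fetch(`${baseUrl}${path}`, {
      method,
      headers: {
        'Content-Type': 'application/json',
        'Ocp-Apim-Subscription-Key': apiKey,
      },
      body: body ? JSON.stringify(body) : undefined,
    });
    if (!res.ok) throw new Error(`Face API ${path}: ${res.status}`);
    const text = await res.text();
    return text ? JSON.parse(text) : null; // some calls return an empty body
  };
}

// Build and train a person group from { name, imageUrls } records.
async function buildPersonGroup(call, groupId, people) {
  // 1. Create the person group that will hold all employees.
  await call('PUT', `/persongroups/${groupId}`, { name: groupId });
  for (const { name, imageUrls } of people) {
    // 2. Create a person and remember the personId the API assigns.
    const person = await call('POST', `/persongroups/${groupId}/persons`, { name });
    // 3. Register each training photo as a persisted face of that person.
    for (const url of imageUrls) {
      await call('POST',
        `/persongroups/${groupId}/persons/${person.personId}/persistedFaces`,
        { url });
    }
  }
  // 4. Kick off (asynchronous, server-side) training of the group.
  await call('POST', `/persongroups/${groupId}/train`);
}
```

Injecting `call` keeps the flow testable without touching the network.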

The only thing needed now is to take a photo, make a few asynchronous calls to the Face API and read the answers. More technically speaking, the flow is: detect a face in the photo to obtain a faceId, identify that faceId against our trained person group to get candidate matches with confidence scores, and finally fetch the best candidate's person data to map it to the corresponding WeJit profile.
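The detect, identify and get-person calls can be sketched in plain JavaScript as follows. Here `call` stands for a generic `(method, path, body)` request helper, and `pickBestCandidate` is a pure helper of ours (the 0.5 confidence threshold is an assumption, not a value from our app):

```javascript
// Pick the highest-confidence candidate, or null if none is good enough.
function pickBestCandidate(candidates, minConfidence = 0.5) {
  const best = candidates.reduce(
    (a, b) => (b.confidence > (a?.confidence ?? -1) ? b : a), null);
  return best && best.confidence >= minConfidence ? best : null;
}

// Detect -> identify -> get person, using Face API v1.0 REST endpoints.
async function identifyPerson(call, groupId, photoUrl) {
  // 1. Detect: obtain a faceId for the face on the photo.
  const faces = await call('POST', '/detect?returnFaceId=true', { url: photoUrl });
  if (faces.length === 0) return null;
  // 2. Identify: match the faceId against the trained person group.
  const [result] = await call('POST', '/identify', {
    personGroupId: groupId,
    faceIds: [faces[0].faceId],
    maxNumOfCandidatesReturned: 1,
  });
  const best = pickBestCandidate(result.candidates);
  if (!best) return null;
  // 3. Fetch the matched person's data (its name links to the WeJit profile).
  return call('GET', `/persongroups/${groupId}/persons/${best.personId}`);
}
```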

Seems easy! Let’s take a look at the result.

Final result

The short video below shows the app in action:

[Video: the app in action]

I did some testing, including face photos with eyeglasses, with a half-turned head or inside a dark room, and in most cases the solution worked well! The most surprising test result came from a bearded workmate whose training dataset contained only bearded photos: after shaving, he was still recognized correctly (Witold Bołt — thanks for such dedication).


The research gave us a clear answer: machine learning doesn't have to be magical at all. It can be easy, and there is nothing to be afraid of.

The initial, cloud-based concept proved successful. Our next step is to dig deeper into the details and work on a custom-made solution without proprietary cloud APIs, built on popular libraries such as OpenCV (computer vision) and TensorFlow (neural networks). Stay tuned for more details!

Jit Team

Clever Thoughts by Jit Team
