Face detection and processing in 300 lines of code

Laurent Picard
Google Cloud - Community
5 min read · Sep 29, 2020

⏳ 2021–10–08 update

  • Updated GitHub version with latest library versions + Python 3.7 → 3.9

👋 Hello!

In this article, you’ll see the following:

  • how to detect faces in pictures,
  • how to automatically process a picture with faces (anonymize, crop, …),
  • how to make this a serverless online demo,
  • in less than 300 lines of Python code.

Here is a famous face that has been automatically anonymized and cropped.

Can you guess who this is?

Note: We’re talking about face detection, not face recognition. Though technically possible, face recognition can have harmful applications. Responsible companies have established AI principles and avoid exposing such potentially harmful technologies (e.g. Google AI Principles).

🛠️ Tools

A few tools will do:

  • a machine learning model to analyze images,
  • a library to process images,
  • a web application framework,
  • a serverless solution to keep the demo available 24/7 and at minimal cost.

🧱 Architecture

Here is an architecture using 2 Google Cloud services (App Engine + Vision API):

The workflow is the following:

  1. Open the demo: App Engine serves the home page.
  2. Take a selfie: the frontend sends it to the /analyze-image endpoint.
  3. The backend sends a request to the Vision API: the image is analyzed and the results (annotations) are returned.
  4. The backend returns the annotations, in addition to the number of detected faces (to display the info directly in the web page).
  5. The frontend sends image, annotations, and processing options to the /process-image endpoint.
  6. The backend processes the image with the given options and returns the result image.
  7. Change the options: steps 5 and 6 are repeated.
  8. Get the image with new options.

This is one of many possible architectures. The advantages of this one are the following:

  • The web browser caches both the selfie and the annotations: no storage is involved and no private images are stored anywhere in the cloud.
  • The Vision API is only called once per image.

🐍 Python libraries

Google Cloud Vision

Pillow

Flask

Dependencies

Define the dependencies in the requirements.txt file:

google-cloud-vision==1.0.0
Pillow==7.2.0
Flask==1.1.2

Notes:
- As a best practice, pin the dependency versions. This freezes your production environment in a known state and prevents newer versions from breaking future deployments.
- App Engine will automatically deploy these dependencies.

🧠 Image analysis

Vision API

The Vision API gives access to state-of-the-art machine learning models for image analysis. One of the multiple features is face detection. Here is a way to detect faces in an image:
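As a minimal sketch (not necessarily the code used in the demo), face detection could look like this. It assumes google-cloud-vision 2.0 or later, where the image type is `vision.Image` (version 1.0.0 exposes it as `vision.types.Image` instead), and application default credentials. The names `detect_faces` and `summarize_faces` are made up for illustration.

```python
def detect_faces(image_bytes, max_results=10):
    """Return the face annotations found in an encoded image (JPEG, PNG, ...)."""
    # Imported here so that summarize_faces() works without the client library.
    # Assumption: google-cloud-vision >= 2.0 (vision.Image).
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    image = vision.Image(content=image_bytes)
    response = client.face_detection(image=image, max_results=max_results)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.face_annotations


def summarize_faces(faces):
    """Reduce face annotations to a box, a confidence, and a landmark count."""
    summary = []
    for face in faces:
        xs = [vertex.x for vertex in face.bounding_poly.vertices]
        ys = [vertex.y for vertex in face.bounding_poly.vertices]
        summary.append(
            {
                "box": (min(xs), min(ys), max(xs), max(ys)),
                "confidence": face.detection_confidence,
                "landmarks": len(face.landmarks),
            }
        )
    return summary
```

Calling `summarize_faces(detect_faces(image_bytes))` on the bytes of a picture gives a quick overview of what was detected, without any image processing yet.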

Backend endpoint

Exposing an API endpoint with Flask consists of wrapping a function with a route decorator. Here is a possible POST endpoint:
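Here is a hedged sketch of what such an endpoint could look like. The `/analyze-image` route comes from the architecture above; everything else is an assumption made for illustration: the `image` form field, the `faces_detected` and `annotations` keys, and the `detect` callable, which is expected to return a JSON-serializable list with one entry per face.

```python
def build_analysis(image_bytes, detect):
    """Build the JSON payload: the annotations plus the number of faces."""
    annotations = detect(image_bytes)
    return {"faces_detected": len(annotations), "annotations": annotations}


def create_app(detect):
    """Create a Flask app exposing a hypothetical /analyze-image endpoint."""
    # Imported here so that build_analysis() can be used without Flask.
    import flask

    app = flask.Flask(__name__)

    @app.route("/analyze-image", methods=["POST"])
    def analyze_image():
        # Assumption: the frontend uploads the picture in an "image" form field.
        image_file = flask.request.files.get("image")
        if image_file is None:
            return flask.jsonify(error="missing 'image' file"), 400
        return flask.jsonify(build_analysis(image_file.read(), detect))

    return app
```

Passing the detection function as a parameter keeps the route independent from the Vision API, which also makes it easy to try locally with a stub such as `create_app(lambda data: [])`.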

Frontend request

Here is a JavaScript function to call the API from the frontend:

🎨 Image processing

Face bounding box and landmarks

The Vision API provides the bounding box of the detected faces and the position of 30+ face landmarks (mouth, nose, eyes,…). Here is a way to visualize them with Pillow (PIL):
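A possible sketch of this visualization follows. It relies on the `bounding_poly.vertices` and `landmarks[].position` fields of a face annotation, and on the `rectangle` and `ellipse` methods of a Pillow `ImageDraw` object. The helper names, the color, and the sizes are illustrative assumptions.

```python
def face_box(face):
    """Return the (left, top, right, bottom) box of a face annotation."""
    xs = [vertex.x for vertex in face.bounding_poly.vertices]
    ys = [vertex.y for vertex in face.bounding_poly.vertices]
    return (min(xs), min(ys), max(xs), max(ys))


def draw_face(draw, face, color="lime", width=3, landmark_radius=2):
    """Outline the face and mark each landmark with a small dot.

    `draw` is expected to behave like a Pillow ImageDraw.Draw(image) object.
    """
    draw.rectangle(face_box(face), outline=color, width=width)
    r = landmark_radius
    for landmark in face.landmarks:
        x, y = landmark.position.x, landmark.position.y
        draw.ellipse((x - r, y - r, x + r, y + r), fill=color)
```

With Pillow, `draw` would typically be `ImageDraw.Draw(image)` for an image loaded with `Image.open(...)`, and `draw_face` would be called once per detected face.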

American Gothic (Wikimedia)

Face anonymization

Here is a way to anonymize the faces thanks to the bounding boxes:
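One way to do this, sketched below, is to blur each face region with Pillow. The Gaussian blur is an assumption (pixelating or masking would work too), as are the function names and the `ratio` parameter. Boxes are expected as integer `(left, top, right, bottom)` tuples lying inside the image.

```python
def blur_radius(box, ratio=0.1):
    """Pick a blur radius proportional to the face size (at least 1 pixel)."""
    left, top, right, bottom = box
    return max(1, round(max(right - left, bottom - top) * ratio))


def anonymize_faces(image, boxes, ratio=0.1):
    """Return a copy of a Pillow image with every face box blurred."""
    # Imported here so that blur_radius() can be used without Pillow.
    from PIL import ImageFilter

    result = image.copy()  # The source image is left untouched
    for box in boxes:
        region = result.crop(box)
        blurred = region.filter(ImageFilter.GaussianBlur(blur_radius(box, ratio)))
        result.paste(blurred, box)
    return result
```

Scaling the radius with the box size keeps small and large faces equally unrecognizable; a higher `ratio` gives a stronger blur.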

American Gothic (Wikimedia)

Face cropping

Similarly, to focus on the detected faces, you can crop everything around the faces:
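Cropping could be sketched as follows. The relative `margin` around the face is an assumption added for illustration, and the box is clamped to the image because a detected bounding box may extend past the picture edges. `image` is expected to behave like a Pillow image (a `size` attribute and a `crop` method).

```python
def expand_box(box, margin, image_size):
    """Grow a (left, top, right, bottom) box by a relative margin, within the image."""
    left, top, right, bottom = box
    width, height = image_size
    pad_x = round((right - left) * margin)
    pad_y = round((bottom - top) * margin)
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(width, right + pad_x),
        min(height, bottom + pad_y),
    )


def crop_face(image, box, margin=0.2):
    """Return the part of the image around a face box."""
    return image.crop(expand_box(box, margin, image.size))
```

For example, a face box of `(10, 20, 110, 140)` in a 200×150 image with the default margin is cropped to `(0, 0, 130, 150)`.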

American Gothic (Wikimedia)

🍒 Cherry on Py 🐍

Now, the icing on the cake (or the “cherry on the pie” as we say in French):

  • Having independent rendering functions lets you combine multiple options at once.
  • Knowing the bounding box for all faces allows cropping the image to the minimal bounding box.
  • Using the location of the nose and the mouth, you can add a moustache to everyone.
  • If your functions have parameters to render a single frame, you can generate animations with a few lines of code.
  • Once your Flask app works locally, you can deploy and keep it available 24/7 at minimal cost.
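Two of these ideas fit in a few lines. The sketch below computes the minimal bounding box around all faces, then the successive crop boxes of a zoom animation from the full picture to that box. The function names and the linear interpolation are assumptions chosen for illustration.

```python
def union_box(boxes):
    """Return the smallest (left, top, right, bottom) box containing all boxes."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one box is required")
    lefts, tops, rights, bottoms = zip(*boxes)
    return (min(lefts), min(tops), max(rights), max(bottoms))


def zoom_boxes(image_size, target_box, frame_count):
    """Return one crop box per frame, from the full image to the target box."""
    width, height = image_size
    start = (0, 0, width, height)
    frames = []
    for index in range(frame_count):
        # t goes linearly from 0 (full image) to 1 (target box)
        t = index / (frame_count - 1) if frame_count > 1 else 1.0
        frames.append(
            tuple(round(s + (e - s) * t) for s, e in zip(start, target_box))
        )
    return frames
```

Each box can then be cropped and resized to a common size to get one frame. With Pillow, a list of frames can be written as an animation with `frames[0].save(path, save_all=True, append_images=frames[1:], duration=..., loop=0)`.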

Here is what’s detected on famous photorealistic paintings:

American Gothic (Wikimedia)
Girl with a Pearl Earring (Wikimedia)
Shakespeare (Wikimedia)

Here are some animated versions:

American Gothic (Wikimedia)
Girl with a Pearl Earring (Wikimedia)
Shakespeare (Wikimedia)

Note: animations are a bit degraded (GIF version) as Medium does not support animated PNGs. The demo below lets you generate them in GIF, PNG, and WebP.

And, of course, this works even better on real pictures:

Personal pictures (aged from 2 to 44)
Yes, I’ve had a moustache for over 42 years, and my sister too ;)

And, finally, here is our famous anonymous from the beginning:

Mona Lisa (Wikimedia)

🚀 Source code and deployment

Source code

  • The Python code for the backend takes less than 300 lines of code.
  • See the source on GitHub.

Deployment

🎉 Online demo

Try the demo yourself:
➡️ https://face-detection.lolo.dev ⬅️

🖖 See you

Feedback, questions? I’d love to hear from you! Follow me on Twitter for more…

📜 Also in this series

  1. Summarizing videos
  2. Tracking video objects
  3. Face detection and processing
  4. Processing images

Laurent Picard
Google Cloud - Community

Tech lover, passionate about software, hardware, science and anything shaping the future • ⛅ explorer at Google • Opinions my own