Face Detection Using Google Cloud Vision API

In this tutorial, we will explore how to detect the number of faces in an image using the Cloud Vision API, part of Google Cloud Platform, from Python.

Setting up Compute Engine

  • To sign up for Google Cloud Platform, please follow this article that I wrote.
  • For compute-heavy tasks, I recommend choosing GPU hardware, which enables massively parallel processing, when you create the Compute Engine instance. The steps below show where to select a GPU, but since we are working with only a few images, we will stick to a CPU in this article.
  • Create an instance and choose a unique name and a zone near you, to avoid latency issues.
  • If you click “Customize” under Machine type, you can customize the CPU, memory, and GPU.
  • Expand “GPUs”. You can select the number of GPUs and the GPU type.
  • Click “Change” under Boot disk. I selected Ubuntu 16.04 as the OS, with a 25 GB SSD persistent disk.
  • A service account is a special account that services and applications running on your Compute Engine instance can use to interact with other Google Cloud Platform APIs. One is created by default, but we will select “No service account” here so that we can explore how to authenticate ourselves.
  • Expand “Management, disks, networking, SSH keys” and go to the SSH keys tab.
  • Now, let’s authenticate this instance with our computer. For this, you need to download and install PuTTY.
  • Open “PuTTY Key Generator” and click “Generate” to generate a key. Then enter your computer’s hostname in the “Key comment” field; it is appended automatically to the end of the key, after the ‘=’. Save both the public and private keys.
  • Copy the generated key and paste the entire key data into the “SSH Keys” field. Then click “Create”.
  • Then open PuTTY to connect to the instance you created. Go to SSH -> Auth and browse to the private key file you saved.
  • Then, under Session, enter the External IP of the Compute Engine instance. The external IP changes if you shut down and restart the instance, so it is better to make it static if you want to log in with the same IP (I think charges apply for this). You can also save the session if you plan to work on the instance again. Click Open.
  • At the login prompt, enter the hostname you provided in the key comment (Compute Engine uses the key comment as the login username). You are now logged in.
  • You can now continue by installing the libraries you need. To switch from the normal user to root, use the command below.
$sudo su
  • pip is not installed on this instance. Let’s install it.
#apt-get install -y python-pip
  • Also, install the gcloud SDK.

Enabling the Vision API & Creating Credentials for the API Service

  • Click the “hamburger menu”, go to APIs & services, and select “Dashboard”.
  • Click Google Cloud Vision API and “Enable” the API.
  • Go to Credentials and select “Service account key” from Create credentials.
  • Under Service account, select “New service account”, provide any name you wish and a role, then proceed to create the service account key. A JSON key file is downloaded to your computer.

Go back to the instance you are already logged in to.

  • Create a “.json” file on the instance and paste in the contents of the JSON key file you downloaded in the previous step. Then save it (Ctrl+X -> “Y” -> Enter in nano).
  • Create a directory and move the JSON file into it. This step is optional.
#mkdir cloudvision
#mv visioapi.json cloudvision
#cd cloudvision
  • Let’s create an environment variable that is available in every session we log in to. Open .profile and add a line that exports GOOGLE_APPLICATION_CREDENTIALS, set to the full path of the JSON key file.
#nano ~/.profile
  • To make the change take effect in the current session, run the command below.
#source ~/.profile
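
Before calling the API, it can help to confirm that the credentials variable really points at a usable key file. The sketch below is one way to do that, not part of the original program. It assumes the variable is GOOGLE_APPLICATION_CREDENTIALS (the one Google's client libraries read by default); the function name check_credentials and the REQUIRED_FIELDS subset are hypothetical choices made for illustration.

```python
import json
import os

# Fields a service account key file is expected to contain. This is a small
# subset picked for the check (an assumption), not an exhaustive schema.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_credentials(env_var="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return a list of problems with the credentials setup (empty if none)."""
    path = os.environ.get(env_var)
    if not path:
        return ["%s is not set" % env_var]
    if not os.path.isfile(path):
        return ["%s points to a missing file: %s" % (env_var, path)]
    try:
        with open(path) as handle:
            key = json.load(handle)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return ["%s is not valid JSON" % path]
    if not isinstance(key, dict):
        return ["%s does not contain a JSON object" % path]
    return ["missing field: %s" % f for f in REQUIRED_FIELDS if f not in key]


if __name__ == "__main__":
    problems = check_credentials()
    print("Credentials look fine." if not problems else "\n".join(problems))
```

If the script prints a problem right after you edit .profile, the most likely cause is that the current session has not reloaded the file yet.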

Detecting the Number of Faces

  • Let’s now detect the number of faces in the movie posters below, which are from IMDb.
  • Create a file with a .py extension on the instance you are logged in to.
#vi vision.py
  • Below is the Python code for detecting the number of faces in an image.
  • Run the code, passing the path of the image directory.
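
A minimal sketch of such a script is shown here; it is not necessarily identical to the program in the repository. It assumes Python 3 and version 2.x or later of the google-cloud-vision client library (where vision.Image is available), and that the credentials are supplied through GOOGLE_APPLICATION_CREDENTIALS. The helper names list_images and count_faces and the IMAGE_EXTENSIONS list are hypothetical.

```python
import os
import sys

# File extensions treated as images (an assumption; adjust to your files).
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def list_images(directory):
    """Return sorted paths of the image files directly inside `directory`."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )


def count_faces(path, client):
    """Send one image to the Vision API and return the number of faces found."""
    from google.cloud import vision  # lazy import; needs google-cloud-vision

    with open(path, "rb") as handle:
        image = vision.Image(content=handle.read())
    response = client.face_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return len(response.face_annotations)


def main(argv):
    """Print the face count for every image in the directory given in argv[1]."""
    if len(argv) != 2 or not os.path.isdir(argv[1]):
        print("usage: python vision.py <image_directory>")
        return
    from google.cloud import vision

    # The client picks up the key file named by GOOGLE_APPLICATION_CREDENTIALS.
    client = vision.ImageAnnotatorClient()
    for path in list_images(argv[1]):
        faces = count_faces(path, client)
        print("%s: %d face(s)" % (os.path.basename(path), faces))


if __name__ == "__main__":
    main(sys.argv)
```

Creating the client once and reusing it for every image avoids repeating the authentication setup for each request.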

Insights From the Results

  • Interestingly, it did not detect any face in videodrome.jpg. This is arguably expected, as the poster has no clear face (the model may not have been trained to detect faces in this type of image).
  • Also, in vessel.jpg, it detected only two faces and missed the child’s face. I will post an update on this.
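
One way to look more closely at results like these is to print, for each face the API did return, its confidence score and where it sits in the image. The sketch below assumes objects shaped like the entries of the Vision API's face_annotations list (with detection_confidence and bounding_poly.vertices); the function name summarize_faces is hypothetical.

```python
def summarize_faces(face_annotations):
    """Return (confidence, bounding box) per face, highest confidence first.

    Each bounding box is (left, top, right, bottom) in pixel coordinates.
    """
    rows = []
    for face in face_annotations:
        vertices = face.bounding_poly.vertices
        if not vertices:
            continue  # no box to report for this face
        xs = [vertex.x for vertex in vertices]
        ys = [vertex.y for vertex in vertices]
        box = (min(xs), min(ys), max(xs), max(ys))
        rows.append((face.detection_confidence, box))
    return sorted(rows, reverse=True)
```

Calling summarize_faces on the face_annotations of a face detection response for vessel.jpg would show which two faces were found and how confident the model was about each, which narrows down where the missed face lies.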

You are welcome to improve the program. Link to GitHub repository.

If you found this article interesting and learned something new, I’d appreciate your feedback, either as a clap or as your views in a response.