Building & Calling Face Services Through SingularityNET

Our team has been seeding SingularityNET with valuable face-related services, enabling both existing and emerging solutions.

Joel Pitt
SingularityNET
Jun 12, 2018 · 7 min read


Summary

This post reviews our initial collection of human face-related services that are being launched on SingularityNET.

The first half covers the capabilities of our services. The second half covers interacting with the SingularityNET smart contracts via our command-line tool, which lets you create and fund jobs and then use them to call these example services.

The code is in the face-services repository.

Seeding Our Network With Flexible Face-Related Services

Why are a bunch of human-face related services useful and interesting for SingularityNET?

  1. We’re human. We care about faces, whether that’s reading emotions or recognizing identity. In more creative pursuits, tracking faces lets us control animated characters with our facial expressions or do video puppetry.
  2. Face tracking algorithms can enable the creation of more advanced services. The vision of SingularityNET is a collection of specialized agents that can interoperate and lead to emergent intelligence, so having these services rely on each other is an example of building up complex behavior from simpler parts.

The specific services that are being implemented are described below.

Note: The images used are test images in the repository, with attribution in the README.

Face-Related Services Being Developed on SingularityNET

1. Face Detection

Bounding boxes for two faces detected

Given an RGB image, face detection returns bounding boxes where faces are found. This sounds like a simple problem, but there have been many approaches to solving it, each with its own trade-offs.

For a long time the Haar cascade (available in OpenCV) was popular, but dlib also implements a HOG (histogram of oriented gradients) detector and, more recently, a convolutional neural network (CNN) detector. The latter is much more robust to non-frontal face orientations than previous methods, but requires more computation.
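To make the trade-off concrete, here is a minimal sketch (not the exact face-services code) that runs both dlib detectors on a test image; the file names and upsampling setting are illustrative:

```python
# Minimal sketch comparing dlib's HOG and CNN face detectors (illustrative only).
import dlib

img = dlib.load_rgb_image("test_image.jpg")  # any RGB test image

# HOG-based detector: fast, works best on near-frontal faces.
hog_detector = dlib.get_frontal_face_detector()
hog_faces = hog_detector(img, 1)  # 1 = upsample the image once before detecting

# CNN-based detector: more robust to pose, but slower; assumes the standard
# mmod_human_face_detector.dat model file has been downloaded.
cnn_detector = dlib.cnn_face_detection_model_v1("mmod_human_face_detector.dat")
cnn_faces = cnn_detector(img, 1)

for det in hog_faces:
    print("HOG box:", det.left(), det.top(), det.right(), det.bottom())
for det in cnn_faces:
    # CNN detections wrap a rectangle plus a confidence score.
    print("CNN box:", det.rect, "confidence:", det.confidence)
```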

2. Face Landmarks

Predicting the landmarks for both faces

A bounding box is a great start to working with faces, but there’s a lot of variety in where the parts of a face end up inside that box. This could be due to rotation, differences in face structure, and the person’s expression. That’s where landmarks come in!

Given an RGB image, and the bounding box for one or more faces, the goal is to find the pixel location of various key points or “landmarks” on the face, e.g. the corners of the eyes, the tip of the nose, etc. Shown in the image is a 68-point model, but dlib also has a 5-point model which is sufficient for some applications.
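As a rough illustration of what the landmarks service wraps, here is a minimal dlib sketch using the 68-point model (the model file names are the standard dlib downloads, not the exact service code):

```python
# Minimal sketch of landmark prediction with dlib (illustrative, not the service code).
import dlib

img = dlib.load_rgb_image("test_image.jpg")
detector = dlib.get_frontal_face_detector()

# The 68-point model; dlib also ships a 5-point variant
# ("shape_predictor_5_face_landmarks.dat") for lighter-weight use cases.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

for box in detector(img, 1):
    shape = predictor(img, box)  # fit landmarks inside the bounding box
    points = [(p.x, p.y) for p in shape.parts()]
    print(len(points), "landmarks, first point at", points[0])
```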

3. Face Alignment

Each face aligned to a canonical orientation and size

One thing you can do with landmarks is align the face to a canonical orientation. A bounding box alone doesn’t let you do this because its edges are always axis-aligned. Why is alignment useful? Things like face recognition work a lot better if face images are aligned, and other techniques like 3D reconstruction can also benefit, since alignment removes one source of variability and lets a model focus on what it’s good at instead of also trying to learn how to rescale and orient the face.
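Here is a minimal sketch of how this can be done with dlib’s get_face_chip, reusing the 68-point predictor from above (the chip size and file names are illustrative assumptions, not the exact service code):

```python
# Minimal sketch of landmark-based face alignment with dlib (illustrative settings).
import dlib

img = dlib.load_rgb_image("test_image.jpg")
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

for i, box in enumerate(detector(img, 1)):
    shape = predictor(img, box)
    # Rotate and rescale the face to a canonical 150x150 crop based on its landmarks.
    chip = dlib.get_face_chip(img, shape, size=150)
    dlib.save_image(chip, "aligned_face_%d.jpg" % i)
```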

So, the summary is: given an RGB image and a bounding box containing a face, find the landmarks and then use them to rotate and scale the face to a canonical size and orientation. The difference from the original image in this example isn’t substantial, but if we take the following image of a model posing (another image in the test set of Creative Commons images), it’s clearer that the rotation is corrected:

4. Face Recognition

So how can we tell apart the two people in our example image? Face recognition deals with the challenge of identity: given an RGB image and a bounding box containing a face, the service returns a vector of 128 floats describing the identity of that face.

Note that this doesn’t tell you who the person is; it just differentiates this face from other faces by mapping them to different locations in a 128-dimensional space. Photos of two different people should end up farther apart in this 128-d space than two photos of the same person.
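Here is a minimal sketch of computing and comparing these identity vectors with dlib’s ResNet-based recognition model (the model file names are the standard dlib downloads, and the 0.6 threshold is dlib’s commonly quoted rule of thumb rather than a value taken from the service):

```python
# Minimal sketch: a 128-d identity vector for each detected face, then a distance check.
import dlib
import numpy as np

img = dlib.load_rgb_image("test_image.jpg")
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
face_rec = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat")

descriptors = []
for box in detector(img, 1):
    shape = predictor(img, box)
    # A vector of 128 floats describing this face's identity.
    descriptors.append(np.array(face_rec.compute_face_descriptor(img, shape)))

if len(descriptors) >= 2:
    distance = np.linalg.norm(descriptors[0] - descriptors[1])
    # With dlib's model, distances above roughly 0.6 usually indicate different people.
    print("Euclidean distance between the first two faces:", distance)
```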

This is a stem plot of the identity vectors from the two faces in our main example. The top is the face on the left, the middle the face on the right, and the bottom is the difference between them.

Note: While facial recognition is powerful, democratizing AI through our decentralized network helps ensure machine learning won’t support covert surveillance.

SingularityNET Services as Wrappers

I want to be clear that this isn’t original research by SingularityNET (though we have plenty of that, like trying to learn invariants and unsupervised language learning). Instead, this is an example of taking existing algorithms and making them available on SingularityNET. This helps us build a rich ecosystem that covers both existing and emerging machine learning models and solutions.

Calling each of these algorithms has been made almost as easy as a function call thanks to the marvelous dlib library by Davis King, and the hard work of the many contributors to OpenCV. Rather than explain how to make these calls yourself from Python, here are some guides that others have already written:

The thing we’re interested in here is how to expose these calls on a network of AI services. Currently the alpha only supports JSON-RPC for direct communication with a service (there will be other options in the future, and you can easily work around this if it’s too limiting, a topic for a future post). Our task is to think about each function call and wrap it as a JSON-RPC method. Here is the find_face method from the face detection service:
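Its general shape is roughly the following sketch; the parameter names, the jsonrpcserver 3.x async API, and the inline face_detect stand-in are assumptions rather than the exact repository code, so see face-services for the real method:

```python
# Hedged reconstruction of the general shape of a find_face JSON-RPC method.
import base64
import io

import dlib
import skimage.io as ioimg
from jsonrpcserver.aio import methods

_hog = dlib.get_frontal_face_detector()

def face_detect(img, algorithm):
    # Stand-in for the service's real detection and algorithm-selection logic.
    return _hog(img, 1)

@methods.add
async def find_face(**kwargs):
    algorithm = kwargs.get("algorithm", "dlib_hog")

    # The image arrives as a base64-encoded string; decode it back into pixels.
    binary_image = base64.b64decode(kwargs["image"])
    img = ioimg.imread(io.BytesIO(binary_image))

    detections = face_detect(img, algorithm)

    # Marshal dlib rectangles into plain, JSON-serialisable dicts.
    faces = [dict(x=d.left(), y=d.top(), w=d.width(), h=d.height()) for d in detections]
    return {"faces": faces}
```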

The actual face detection and algorithm selection is wrapped in face_detect(...), so this is mostly about marshaling data.

About the most complicated thing here, assuming you’ve seen Python’s async keyword before, is the base64 decoding of the image.

If async and await are new to you, there are a few helpful guides (e.g. https://snarky.ca/how-the-heck-does-async-await-work-in-python-3-5/ ). In short, when you call an async function it isn’t actually executed; instead you get back a “coroutine”. It only gets executed when you pass it to an event loop, which is done in the code below (here jsonrpc_handler is the coroutine):
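That wiring looks roughly like the following sketch, where calling jsonrpc_handler() produces the coroutine that the loop then drives; the names and setup details are assumptions, and the gRPC server mentioned in the next paragraph is omitted:

```python
# Sketch: explicitly handing coroutines to the event loop (aiohttp 3.x, jsonrpcserver 3.x).
import asyncio

from aiohttp import web
from jsonrpcserver.aio import methods

async def handle_jsonrpc(request):
    # Dispatch each incoming JSON-RPC request to the registered methods (e.g. find_face).
    response = await methods.dispatch(await request.text())
    return web.json_response(response, status=response.http_status)

async def jsonrpc_handler(host="0.0.0.0", port=8000):
    # Coroutine that sets up and starts the aiohttp server on the running loop.
    app = web.Application()
    app.router.add_post("/", handle_jsonrpc)
    runner = web.AppRunner(app)
    await runner.setup()
    await web.TCPSite(runner, host, port).start()

loop = asyncio.get_event_loop()
# Nothing above has executed yet; the coroutine only runs once the loop drives it.
loop.run_until_complete(jsonrpc_handler())
# Keep serving (another server, e.g. gRPC, could be started on the same loop first).
loop.run_forever()
```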

Now, the code here isn’t as simple as an example could be, because I also added support for a gRPC interface and wanted to ensure the two servers played well together. If it’s unclear, a minimal but complete example is available in the alpha-service. That one doesn’t require executing the event loop explicitly: because it only has a JSON-RPC interface, it can simply rely on the aiohttp and jsonrpcserver libraries to manage the event loop for us.
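For comparison, a JSON-RPC-only service in that style can be as small as this sketch (the method name and port are placeholders), letting web.run_app own the event loop:

```python
# Minimal JSON-RPC-only service where aiohttp/jsonrpcserver manage the event loop.
from aiohttp import web
from jsonrpcserver.aio import methods

@methods.add
async def ping():
    return "pong"

async def handle(request):
    response = await methods.dispatch(await request.text())
    return web.json_response(response, status=response.http_status)

app = web.Application()
app.router.add_post("/", handle)
web.run_app(app, port=8000)  # run_app starts and manages the event loop itself
```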

Once you have your request and response cycle wrapped in a JSON-RPC method, you can get on to interacting with the blockchain…
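For reference, before any blockchain plumbing, a direct JSON-RPC call to such a service might look like this from the client side; the endpoint, port, and parameter names follow the hypothetical find_face sketch above rather than the published service schema:

```python
# Sketch of a direct JSON-RPC call to the face detection service (illustrative endpoint).
import base64

import requests

with open("test_image.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("ascii")

payload = {
    "jsonrpc": "2.0",
    "method": "find_face",
    "params": {"image": image_b64, "algorithm": "dlib_hog"},
    "id": 1,
}
# In the alpha, real calls are created and funded via the SingularityNET CLI instead.
print(requests.post("http://localhost:8000", json=payload).json())
```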

Next Steps

We’ve made some recent improvements to our Alpha, and soon, we’ll be releasing a detailed tutorial on how to start using it for free. Stay tuned!

Our team believes that developer ergonomics are crucial to making SingularityNET successful, so while you currently need to call a CLI and there is no schema defining the JSON-RPC endpoints yet, it’s a first step.

In the future, we’ll release improvements that allow developers to publish a specification for a service’s API, and an SDK that removes the need for the CLI when making service calls. The end goal is that it should be as easy as making a function call!

Be sure to join our Community Forum, which allows you to chat directly with our AI team, as well as developers and researchers from around the world.
