Zuck Faces (part 2)

Alistair Roche
3 min read · May 7, 2018


There’s a dataset called Labeled Faces in the Wild (LFW) with about 13,000 photos of people scraped from the web. As part of this little series, I ran each of them through FaceNet to get the feature vector that represents their facial identity.
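If you want a feel for what “ran each of them through FaceNet” means in practice, here’s a rough sketch using the facenet-pytorch package — not necessarily the same implementation as the gist linked at the bottom, and the `lfw` directory path is a placeholder:

```python
# Sketch only: detect/align each face with MTCNN, then embed it with a
# FaceNet-style network. Paths and variable names are placeholders.
from pathlib import Path

import torch
from PIL import Image
from facenet_pytorch import MTCNN, InceptionResnetV1

mtcnn = MTCNN(image_size=160)                              # face detector / aligner
resnet = InceptionResnetV1(pretrained="vggface2").eval()   # embedding network

embeddings = {}
for path in Path("lfw").rglob("*.jpg"):                    # LFW: one folder per person
    face = mtcnn(Image.open(path).convert("RGB"))          # cropped face tensor, or None
    if face is None:
        continue
    with torch.no_grad():
        embeddings[str(path)] = resnet(face.unsqueeze(0)).squeeze(0)  # 512-d vector
```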

Facial recognition systems working from the vectors produced by FaceNet are able to beat humans on industry benchmarks (a fairly recent milestone, by the way).

You can think of these vectors as representing a point in 512-dimensional space. When you want to know whether a given face belongs to the same person as another, you take the distance between the two points. I used this property the other day to produce the picture of Mark’s various faces sorted by how distant they are from his “average” face.
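In code, that “distance from the average face” ranking is only a few lines. A minimal sketch, assuming `embeddings` maps each LFW image path to its vector and `zuck_embeddings` is a list of vectors for Mark’s photos (both hypothetical names, carried over from the sketch above):

```python
# Average Mark's embeddings, then sort every LFW embedding by Euclidean
# distance from that average. Smaller distance = more Zuck-like.
import numpy as np

zuck_mean = np.mean(np.stack([np.asarray(v) for v in zuck_embeddings]), axis=0)

def distance(vec):
    return np.linalg.norm(np.asarray(vec) - zuck_mean)   # Euclidean distance

ranked = sorted(embeddings.items(), key=lambda kv: distance(kv[1]))
nearest_160 = ranked[:160]      # closest faces to Mark's average
farthest_160 = ranked[-160:]    # most distant faces
```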

Today I sorted all the faces in LFW by how distant they are from Mark’s average. Here’s a sample of 160 of them, sorted from top-to-bottom and left-to-right:

Here’s the nearest 160:

And the most distant:

Things that stand out to me:

  • There’s a black guy and a woman in the top eight
  • The algorithm is clearly thrown off by glasses, hats, shadows and what I’m thinking of as “extreme facial expressions”
  • The top eight don’t really look like Zuck doppelgangers to me

One thing to keep in mind is that although LFW has more than 13,000 photos, they cover only a few thousand distinct people, and those people are mostly celebrities. I’d very much like to try this over a larger and more diverse dataset. There must be people out there who look a whole lot more like Zuck who’ve worked their way into publicly available datasets.

I’m not sure how to deal with hats and glasses. There must be research on it, though. I want to dive into how those industry benchmarks (on which FaceNet does extremely well) are set up. Are they scrubbed of glasses / blurriness / facial occlusions / etc.? Are there benchmarks on which these systems currently don’t beat humans?

I’m pretty sure I’m making a mistake by using the average of Mark’s various faces. I think what I need to do is train a simple binary classifier that takes feature vectors as input, and then rank all faces by the probability spat out by the classifier. I’m going to try that soon. [UPDATE: tried it!]
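For what it’s worth, that classifier could be as simple as a logistic regression over the embeddings. A hedged sketch — names like `zuck_vectors` and `other_vectors` are placeholders, not something from the gist:

```python
# Fit a binary classifier (Zuck vs. not-Zuck) on embedding vectors, then rank
# every LFW face by the predicted probability of being Mark.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.vstack([zuck_vectors, other_vectors])               # 512-d embeddings
y = np.concatenate([np.ones(len(zuck_vectors)),            # 1 = Mark
                    np.zeros(len(other_vectors))])          # 0 = everyone else

clf = LogisticRegression(max_iter=1000).fit(X, y)

lfw_matrix = np.vstack([np.asarray(v) for v in embeddings.values()])
probs = clf.predict_proba(lfw_matrix)[:, 1]                 # P(face is Mark)
ranking = np.argsort(-probs)                                # most Zuck-like first
```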

On another note, how come Mark never wears glasses or a hat? I guess the photos I’ve scraped are extremely biased. They tend to be posed, and taken by (presumably) professionals. I might need to dive into the world of paparazzi to get some less biased images of his face.

(here’s a gist of how I produced these results: https://gist.github.com/atroche/287d803c6610a4500e18f009e7a38b4e)
