Loki — Spying on User Emotion

Proof-of-concept of emotion-targeted content delivery.

Kevin Yap · Published in nwPlus · Jan 22, 2018


With the introduction of iOS 11, Apple added new facial-detection APIs to ARKit, which make it significantly easier for developers to create software with face-tracking features. For nwHacks 2018, we decided to develop a proof-of-concept that showcases this technology’s more sinister side: something that could be straight out of an episode of Black Mirror.

Loki presents a news feed to the user that looks a lot like any other social networking app. However, in the background, it tracks the user’s facial data in real time, without their knowledge. This data is piped through a neural network that we trained to classify various emotions. The user’s current emotion is then used to decide what content gets loaded into the news feed.

Our demo video for Loki, created as part of our hackathon submission.

On iOS, apps require permission from the user in order to gain access to the device’s cameras. However, once this permission has been granted, an app can access the camera whenever it likes, without any indication to the user that the camera is currently in use. This ability to use the front-facing camera discreetly is the key to Loki.

We were inspired to build Loki to illustrate how plausible it would be for social media platforms to track user emotions and manipulate the content that gets shown to them. For example, a nefarious company might target ads at people it determines are in a vulnerable state (i.e., sad users).

When our team arrived at nwHacks, we had no idea what to build. Between the four of us, we had experience with iOS, machine learning, and backend development, so after a rather lengthy brainstorming session, we settled on using the iPhone X’s ability to capture facial information to create an “emotion classifier”. However, we tempered our expectations regarding the project’s success. To improve our chances of having a working demo by the end of the hackathon, we limited its scope to classifying four core emotions: happy, sad, angry, and surprised.

Our method of creating training data.

In order to train our model, we needed to create a dataset. To accomplish this, we wrote a prototype app that displays a live-updating feed of ARKit’s 51 facial “blend shapes” on-screen. Pressing a button at the top of the screen freezes those values and prompts the user to select one of the four aforementioned emotions. Upon doing so, those data points get pushed to our backend server and stored in a database for later recall. (This also meant that for a good portion of the hackathon, we sat around making exaggerated faces into our phones, which probably confused the groups around us.)
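The article doesn’t detail the backend that receives these samples, but as a rough idea of the ingestion side, here is a minimal sketch assuming a small Flask service with a hypothetical /samples route, JSON field names of our choosing, and SQLite for storage — all of which are assumptions, not the team’s actual implementation.

```python
# Hypothetical sketch of the backend ingestion endpoint. The prototype app would
# POST the 51 frozen blend-shape values plus the emotion the user selected.
import json
import sqlite3
from flask import Flask, request, jsonify

app = Flask(__name__)
DB_PATH = "training_data.db"
EMOTIONS = {"happy", "sad", "angry", "surprised"}

def init_db():
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS samples ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "emotion TEXT NOT NULL, "
            "blend_shapes TEXT NOT NULL)"  # 51 floats stored as a JSON string
        )

@app.route("/samples", methods=["POST"])
def add_sample():
    payload = request.get_json()
    emotion = payload.get("emotion")           # e.g. "happy"
    blend_shapes = payload.get("blendShapes")  # list of 51 floats from ARKit
    if emotion not in EMOTIONS or not isinstance(blend_shapes, list) or len(blend_shapes) != 51:
        return jsonify({"error": "invalid sample"}), 400
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute(
            "INSERT INTO samples (emotion, blend_shapes) VALUES (?, ?)",
            (emotion, json.dumps(blend_shapes)),
        )
    return jsonify({"status": "stored"}), 201

if __name__ == "__main__":
    init_db()
    app.run()
```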

Now that we had collected about a hundred data points, the important question to ask was: what does our data actually look like?

Our training data visualized in two dimensions.

The above image was created by using multidimensional scaling (MDS) to reduce our training dataset to two dimensions (humans aren’t great at visualizing 51-dimensional space). MDS attempts to preserve the between-object distances found in the higher-dimensional space, so it’s an effective way to grok similarities (and dissimilarities) in a complex dataset (a short sketch of this reduction follows the list below). From studying the visualization, a couple of points stand out:

  • Firstly, data points from each of the four emotions tend to form their own clusters, with minimal outliers. Because the four emotions are quite distinct from one another, this suggests that our model could be quite accurate at classifying new data points, despite the training set being so small.
  • Secondly, of the four clusters, sad and angry overlap the most. This lines up with our intuition: the two expressions look quite similar to each other, and both are distinctly different from happy and surprised expressions.
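For reference, this kind of 2-D embedding can be reproduced with scikit-learn. This is only a sketch: the file names, colours, and plotting details below are assumptions, standing in for however the team actually exported and plotted their data.

```python
# Minimal sketch of the dimensionality reduction described above, assuming the
# samples have been exported as a (n_samples, 51) array X with string labels y.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import MDS

X = np.load("blend_shapes.npy")                  # hypothetical dump of the training data
y = np.load("labels.npy", allow_pickle=True)     # one emotion label per sample

# MDS finds a 2-D embedding that tries to preserve pairwise distances from 51-D space
embedding = MDS(n_components=2, random_state=0).fit_transform(X)

colors = {"happy": "tab:green", "sad": "tab:blue",
          "angry": "tab:red", "surprised": "tab:orange"}
for emotion, color in colors.items():
    mask = y == emotion
    plt.scatter(embedding[mask, 0], embedding[mask, 1], c=color, label=emotion, s=20)

plt.legend()
plt.title("Training data embedded in 2-D with MDS")
plt.show()
```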

This visualization acted as a good gut check: even with a tiny training set, the data clustered nicely, so a basic machine learning model would likely be sufficient to classify new data points into one of those clusters. Now that we had assembled training data for each of the four emotions, it was time to get the emotion classifier working.

For the machine learning side of Loki, we decided to keep things simple and stick with what we knew. One of our team members was familiar with Keras (an API for building neural networks that sits on top of TensorFlow), so even though our training set would only contain a hundred or so data points, that’s what we decided to go with.

Our architecture is simple: a feedforward neural network with two hidden layers. It takes the 51-dimensional facial data as its input and returns a 4-dimensional output corresponding to the probabilities of the four emotions. Since our training dataset was so small, our model likely suffered from significant overfitting, but its performance was sufficient for building our proof-of-concept.
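As a sketch of what that architecture looks like in Keras: the description above only fixes the input size (51), the output size (4), and the two hidden layers, so the layer widths, optimizer, training settings, and placeholder data below are illustrative assumptions rather than our exact configuration.

```python
# Sketch of the emotion classifier: a feedforward network with two hidden layers,
# 51 inputs (blend shapes) and 4 outputs (emotion probabilities). Layer widths
# and training settings are assumptions.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(32, activation="relu", input_dim=51),  # hidden layer 1
    Dense(16, activation="relu"),                # hidden layer 2
    Dense(4, activation="softmax"),              # P(happy), P(sad), P(angry), P(surprised)
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Placeholder data with the right shapes; in Loki this would be the ~100
# labelled blend-shape samples pulled from the backend database.
X = np.random.rand(100, 51).astype("float32")
y_onehot = np.eye(4)[np.random.randint(0, 4, size=100)]

model.fit(X, y_onehot, epochs=100, batch_size=16, validation_split=0.1)
model.save("emotion_classifier.h5")
```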

Generic neural network diagram to break up the wall of text. (Wikipedia)

CoreML is another framework that Apple shipped in iOS 11; it makes it easy to integrate pretrained models into apps and perform prediction on-device. Although CoreML is primarily designed for bundling pretrained models directly into an app’s binary, for Loki we wanted to take things one step further. Since we already had a way to seamlessly create new training data points, we wanted to be able to train new models and deploy them on-the-fly, without having to rebuild the app.
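The server-side conversion step looks roughly like the sketch below, which assumes the Keras converter that coremltools shipped around that time; the saved-model path, feature names, and class labels are assumptions for illustration.

```python
# Sketch of converting the trained Keras model into a .mlmodel file that CoreML
# can consume on-device. Feature names and file paths are assumptions.
import coremltools
from keras.models import load_model

keras_model = load_model("emotion_classifier.h5")

coreml_model = coremltools.converters.keras.convert(
    keras_model,
    input_names=["blendShapes"],
    output_names=["emotionProbabilities"],
    class_labels=["happy", "sad", "angry", "surprised"],
)
coreml_model.save("EmotionClassifier.mlmodel")
```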

To accomplish this, we added an endpoint to the backend server that would train a new model (using all available training data) and store the resulting CoreML model file. This model file was then served from a separate endpoint. From there, we added a view to Loki that would trigger a new model to be trained, request that model file, and use it to replace the one stored locally on the device — just like that, we had hot-swappable models!
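A sketch of what those two endpoints might look like follows. The routes, file paths, and helper functions (which stand in for the training and conversion steps sketched above) are hypothetical, not our exact backend code.

```python
# Hypothetical sketch of the training and model-serving endpoints. The helpers
# stand in for the Keras training and coremltools conversion steps sketched
# earlier; routes and paths are assumptions.
import os
from flask import Flask, send_file, jsonify

app = Flask(__name__)
MODEL_PATH = "EmotionClassifier.mlmodel"

@app.route("/train", methods=["POST"])
def train():
    X, y_onehot = load_all_samples()               # hypothetical: read samples from the database
    keras_model = train_keras_model(X, y_onehot)   # hypothetical: the Keras sketch above
    coreml_model = convert_to_coreml(keras_model)  # hypothetical: the coremltools sketch above
    coreml_model.save(MODEL_PATH)
    return jsonify({"status": "trained", "samples": len(X)})

@app.route("/model", methods=["GET"])
def latest_model():
    if not os.path.exists(MODEL_PATH):
        return jsonify({"error": "no model has been trained yet"}), 404
    # The app downloads this file, compiles it with CoreML, and swaps it in for
    # the model currently stored on the device.
    return send_file(MODEL_PATH, as_attachment=True)
```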

Finally, to tie everything together, we created a mock Facebook-like news feed containing dynamically generated content. In the background, though, the app actively uses the front-facing camera to scan and classify the user’s face. If the user is currently happy, we load happy images; if the user is sad, we show sad images. This clearly demonstrates our ability to track the user’s current emotion, all without revealing to the user that any of this is occurring.
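The article doesn’t describe how the feed content itself is chosen, but conceptually it reduces to a lookup keyed on the classified emotion. The sketch below (the route name and the toy content store) is purely illustrative.

```python
# Purely illustrative sketch of emotion-conditioned feed selection: the app
# classifies the user's expression on-device and requests matching content.
from flask import Flask, request, jsonify

app = Flask(__name__)

# Toy content store keyed by emotion
CONTENT = {
    "happy":     ["puppies.jpg", "beach_day.jpg"],
    "sad":       ["rainy_window.jpg", "lonely_bench.jpg"],
    "angry":     ["traffic_jam.jpg", "broken_screen.jpg"],
    "surprised": ["plot_twist.jpg", "confetti_cannon.jpg"],
}

@app.route("/feed")
def feed():
    emotion = request.args.get("emotion", "happy")  # sent by the app after classification
    return jsonify({"posts": CONTENT.get(emotion, CONTENT["happy"])})
```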

Two screenshots of our “news feed”, with happy content on the left and sad content on the right.

All in all, our team was extremely happy with what we managed to build in only 24 hours. Although we weren’t finalists in the hackathon, everyone we showed Loki to seemed impressed by its ability to detect each emotion. It was really satisfying to see people’s reactions when the app correctly detected the face they were making; when people were genuinely surprised by its accuracy, Loki would (correctly) register their astonishment!

We hope that the ideas we showcased in Loki will never be used for malicious purposes, but with online privacy in constant decline, Loki feels like an ominous glimpse into the future.

Loki was developed during nwHacks 2018 by Kevin Yap, Lansi Chu, Nathan Tannar, and Patrick Huber. Check out the GitHub repository here.
