Wink! Or How Your Users Can Interact with Your iOS App by Changing Their Facial Expressions

Get notified reactively when the user changes their facial expression, with Wink

César Vargas Casaseca
Axel Springer Tech
7 min read · Dec 13, 2020


In my previous stories Warhol and Warhol II I talked about how we can easily detect the user's face features and render our desired overlay on top of them thanks to Warhol, a library of my own. For that I used Apple's Vision framework, which among other things supports the detection of items such as faces and face landmarks. Furthermore, the Warhol client does not need to deal with Vision at all, as it is hidden behind the library interface, thus alleviating the effort of the client implementation.

During Warhol's development process something came to my mind. Wouldn't it be cool if, on top of the user's features, we could detect facial expressions as well? So, not only detect where the user's eye is, but also when they wink, open their mouth or smile. The use cases for this functionality are numerous: no more swiping left or right on Tinder, but winking left or right; helping disabled users interact with our app via face gestures; or making your videogame character jump when the user opens their mouth.

Unfortunately Vision does not offer such a feature, but do not fear: thanks to ARKit we can obtain information about the pose, topology, and expression of a face that ARKit detects in the front camera feed. These capabilities were introduced in iOS 11 and require a device with a TrueDepth camera. This camera replaces the front one on the iPhone X and later, and is capable of capturing 3D information for Face ID authentication and Animoji.

With that goal I developed Wink. Wink is a light reactive library written in Swift that eases the process of facial expression detection (blink, smile, mouth open…) on iOS. It detects a default set of user facial expressions using the TrueDepth camera of the iPhone, and notifies you in real time using Combine, Apple's reactive framework. That way, the Wink client does not need to know anything about ARKit, unless they want to extend the set of facial expressions detected.


Wink provides a view controller with the camera view, and a Combine AnyPublisher that emits whenever the user changes their expression. We can then add that UIViewController to our view hierarchy and react to any new user face gesture. In case we do not want to show the camera view, we can just hide it, or resize and place it according to our requirements.

wink!

Behind the Scenes

In this section we are going to see how Wink uses ARKit to detect the user's facial expressions. If you want to start using Wink right away in your app, go directly to the next section.

As in any other AR experience with SceneKit, we need to create an ARSCNView instance that will be added to our FacialExpressionDetectorViewController view.
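A minimal sketch of that setup (the property name sceneView is my naming, not necessarily Wink's actual code):

```swift
import ARKit
import UIKit

final class FacialExpressionDetectorViewController: UIViewController {
    // The SceneKit-backed AR view that renders the front camera feed.
    let sceneView = ARSCNView()

    override func viewDidLoad() {
        super.viewDidLoad()
        sceneView.frame = view.bounds
        sceneView.autoresizingMask = [.flexibleWidth, .flexibleHeight]
        view.addSubview(sceneView)
    }
}
```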

After that we set the delegate to our class, which will receive the view's AR scene information with SceneKit content. Once we have added the view to our hierarchy and set the delegate, we run the session with a face tracking configuration in viewWillAppear:
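Roughly like this, continuing the sketch above:

```swift
override func viewWillAppear(_ animated: Bool) {
    super.viewWillAppear(animated)
    // Face tracking requires a TrueDepth camera; do nothing on unsupported devices.
    guard ARFaceTrackingConfiguration.isSupported else { return }
    sceneView.delegate = self
    sceneView.session.run(ARFaceTrackingConfiguration())
}
```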

In the same way, we should not forget to pause it when the view disappears.
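For example:

```swift
override func viewWillDisappear(_ animated: Bool) {
    super.viewWillDisappear(animated)
    // Release the camera and stop tracking while off-screen.
    sceneView.session.pause()
}
```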

Once we have set up the ARSCNView that will display the AR experience and run the session, it is time to start detecting the facial expressions and pass them to the client. We do this through the ARSCNViewDelegate methods:

Firstly, we have to return a new node with the representation of the face topology in our scene view for each 3D anchor, that is, the position and orientation of something of interest in the user's face. If these concepts are too advanced for you or you need to refresh them, this tutorial is perfect to get familiar with the ARKit basics.
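A sketch of that delegate method, using ARSCNFaceGeometry to mirror the detected topology (the wireframe fill mode matches the mesh you can see in the image below):

```swift
extension FacialExpressionDetectorViewController: ARSCNViewDelegate {
    func renderer(_ renderer: SCNSceneRenderer, nodeFor anchor: ARAnchor) -> SCNNode? {
        // Create a face geometry backed by the view's Metal device.
        guard let device = sceneView.device,
              let faceGeometry = ARSCNFaceGeometry(device: device) else { return nil }
        let node = SCNNode(geometry: faceGeometry)
        // Draw the mesh as wireframe lines instead of a solid surface.
        node.geometry?.firstMaterial?.fillMode = .lines
        return node
    }
}
```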

Now that we have a node for each anchor, ARKit will let us know whenever a node has been updated, that is, when new information was obtained from the scene view:
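That happens in renderer(_:didUpdate:for:), sketched here:

```swift
func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
    guard let faceAnchor = anchor as? ARFaceAnchor,
          let faceGeometry = node.geometry as? ARSCNFaceGeometry else { return }
    // Refresh the rendered mesh with the latest face topology...
    faceGeometry.update(from: faceAnchor.geometry)
    // ...and derive the current facial expressions from the anchor.
    detectFacialExpression(from: faceAnchor)
}
```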

Here we refresh our nodes with the face anchors, and proceed to detect the facial expressions from the ARFaceAnchor; we will dissect detectFacialExpression(from:) below.

Not a new Avenger, but a face with AR nodes. See how ARKit is able to detect them in 3D thanks to the TrueDepth camera

To better understand this, we need to know what a FacialExpressionAnalyzer is in Wink. A FacialExpressionAnalyzer is a struct encapsulating the data needed to detect one specific facial expression:
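A sketch of both types (the string-backed FacialExpression is my assumption; it just needs to be extensible by clients, as we will see later):

```swift
import ARKit

/// A gesture in the user's face. Sketched as a string-backed struct so that
/// clients can define expressions beyond the default set.
struct FacialExpression: Hashable {
    let rawValue: String

    static let mouthSmileLeft = FacialExpression(rawValue: "mouthSmileLeft")
    static let mouthSmileRight = FacialExpression(rawValue: "mouthSmileRight")
    static let eyeBlinkLeft = FacialExpression(rawValue: "eyeBlinkLeft")
    static let eyeBlinkRight = FacialExpression(rawValue: "eyeBlinkRight")
    static let jawOpen = FacialExpression(rawValue: "jawOpen")
}

struct FacialExpressionAnalyzer {
    /// The Wink expression this analyzer reports.
    let facialExpression: FacialExpression
    /// The ARKit blend shape it corresponds to.
    let blendShapeLocation: ARFaceAnchor.BlendShapeLocation
    /// How strong (0...1) the blend shape must be to count as happening.
    let minimumValidCoefficient: Float
}
```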

It contains the Wink FacialExpression object that represents the gesture in the user's face (mouthSmileLeft, mouthSmileRight…), the AR object it corresponds to, and the minimum valid coefficient to accept a facial expression as actually happening. As pointed out before, Wink comes with a default set of analyzers:
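Which could look roughly like this (the exact expressions and thresholds in Wink's default set may differ):

```swift
extension FacialExpressionAnalyzer {
    static let defaultAnalyzers: [FacialExpressionAnalyzer] = [
        FacialExpressionAnalyzer(facialExpression: .mouthSmileLeft,
                                 blendShapeLocation: .mouthSmileLeft,
                                 minimumValidCoefficient: 0.5),
        FacialExpressionAnalyzer(facialExpression: .mouthSmileRight,
                                 blendShapeLocation: .mouthSmileRight,
                                 minimumValidCoefficient: 0.5),
        FacialExpressionAnalyzer(facialExpression: .eyeBlinkLeft,
                                 blendShapeLocation: .eyeBlinkLeft,
                                 minimumValidCoefficient: 0.6),
        FacialExpressionAnalyzer(facialExpression: .eyeBlinkRight,
                                 blendShapeLocation: .eyeBlinkRight,
                                 minimumValidCoefficient: 0.6),
        FacialExpressionAnalyzer(facialExpression: .jawOpen,
                                 blendShapeLocation: .jawOpen,
                                 minimumValidCoefficient: 0.5)
    ]
}
```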

that is created and assigned in FacialExpressionDetectorViewController:
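Shown here simply as a mutable property, so that clients can later extend or replace it:

```swift
// Inside FacialExpressionDetectorViewController:
var analyzers: [FacialExpressionAnalyzer] = FacialExpressionAnalyzer.defaultAnalyzers
```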

As we will see later, it is very easy to extend this functionality with our own analyzers without having to modify the code, thus following the Open-Closed Principle:

“Software entities should be open for extension, but closed for modification”.

Going back to detectFacialExpression(from anchor: ARFaceAnchor), we see how we compactMap the analyzers into the facial expressions whose current value is higher than the minimumValidCoefficient, meaning that the user is actually making that gesture as much as our requirements accept. Once we have the array with the current facial expressions, we send it through Combine's PassthroughSubject object:
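A sketch of that method and the subject it feeds (the publisher property name is my assumption):

```swift
import Combine

// Inside FacialExpressionDetectorViewController:
private let subject = PassthroughSubject<[FacialExpression], Never>()

/// The publisher clients subscribe to.
lazy var facialExpressionPublisher: AnyPublisher<[FacialExpression], Never> =
    subject.eraseToAnyPublisher()

func detectFacialExpression(from anchor: ARFaceAnchor) {
    // Keep only the expressions whose blend shape coefficient clears its threshold.
    let expressions = analyzers.compactMap { analyzer -> FacialExpression? in
        let coefficient = anchor.blendShapes[analyzer.blendShapeLocation]?.floatValue ?? 0
        return coefficient > analyzer.minimumValidCoefficient
            ? analyzer.facialExpression
            : nil
    }
    subject.send(expressions)
}
```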

Let’s Play!

Photo by Jeremiah Lawrence on Unsplash

This was all very interesting, but to the developer who wants to move fast it will probably not be so useful. After all, the purpose of Wink is to ease the process of facial expression detection without the need to deal with the compelling but arcane world of ARKit.

To start using Wink in your screen's UIViewController, create a new instance of FacialExpressionDetectorViewController and add it to the former through the usual process of view controller containment:
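For example:

```swift
// In your own view controller, e.g. in viewDidLoad:
let detector = FacialExpressionDetectorViewController()

addChild(detector)
view.addSubview(detector.view)
detector.didMove(toParent: self)
```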

You might then want to place it as you wish by adjusting the constraints, or hide it if you don't want the camera view to be shown. In this case we add it to the upper left corner of our view:
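Something along these lines (the size is arbitrary):

```swift
detector.view.translatesAutoresizingMaskIntoConstraints = false
NSLayoutConstraint.activate([
    detector.view.topAnchor.constraint(equalTo: view.safeAreaLayoutGuide.topAnchor),
    detector.view.leadingAnchor.constraint(equalTo: view.leadingAnchor),
    detector.view.widthAnchor.constraint(equalToConstant: 140),
    detector.view.heightAnchor.constraint(equalToConstant: 180)
])
```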

After these steps comes the climax of the movie; we have to subscribe to the user facial expression changes by sinking the Combine publisher that comes with the FacialExpressionDetectorViewController:
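Using the publisher sketched earlier, the subscription could look like this (handle(_:) is a hypothetical method of your own):

```swift
import Combine

// A property of your view controller that retains the subscription.
private var cancellables = Set<AnyCancellable>()

detector.facialExpressionPublisher
    .receive(on: DispatchQueue.main)
    .sink { [weak self] expressions in
        // React here to the user's current facial expressions.
        self?.handle(expressions)
    }
    .store(in: &cancellables)
```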

In our sample app we are just describing the current facial expressions with the help of a FacialExpression extension; in your case, this is where you react to the user's facial expression changes to trigger the desired action:
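A hypothetical version of that extension, under the string-backed sketch above:

```swift
extension FacialExpression {
    /// A human-readable label for each gesture (sample app only).
    var description: String {
        switch self {
        case .mouthSmileLeft: return "Smiling left"
        case .mouthSmileRight: return "Smiling right"
        case .eyeBlinkLeft: return "Left eye blink"
        case .eyeBlinkRight: return "Right eye blink"
        case .jawOpen: return "Mouth open"
        default: return rawValue
        }
    }
}
```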

My expressions repertoire with which I will conquer Hollywood

In this debug sample you can see how Wink detects my facial expressions, and how the client reacts to them by displaying a description of the retrieved gestures. The statistics and the face mesh are part of Wink's debug mode; they can of course be disabled in production.

That’s it! The process of using Wink is very straightforward, just four steps:

  • Create an instance of FacialExpressionDetectorViewController
  • Add it to your view controller
  • Place or hide the camera view according to your requirements
  • Subscribe to changes and react to them

Advanced

The Icing on the Cake!

Detect More Expressions

With the process above we can detect the facial expressions included in Wink's standard default set. But what happens if we want to detect a gesture that is not there? What if we want to detect when the user's left eye is wide open? Very easy! Thanks again to the open/closed principle we can add this functionality without having to modify Wink's code:
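Under the string-backed FacialExpression sketched above, that could look like:

```swift
// A new, client-defined expression: left eye wide open.
let leftEyeWideOpen = FacialExpression(rawValue: "eyeWideLeft")

detector.analyzers.append(
    FacialExpressionAnalyzer(facialExpression: leftEyeWideOpen,
                             blendShapeLocation: .eyeWideLeft,
                             minimumValidCoefficient: 0.6)
)
```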

We create a new FacialExpressionAnalyzer with the new Wink FacialExpression, the AR BlendShapeLocation that should be detected, and the minimum accepted coefficient to consider that gesture as actually happening. For a complete list of the BlendShapeLocation values you can detect, or in other words, new user facial expressions, please refer to the Apple Documentation.

Change the Minimum Valid Coefficient for a Face Expression

Yeah, it is that easy. In the same way, we can modify the minimumValidCoefficient of one of the default analyzers to meet our specific requirements:
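Continuing with the same sketched API:

```swift
// Replace the default left-smile analyzer with a more lenient one.
detector.analyzers.removeAll { $0.facialExpression == .mouthSmileLeft }
detector.analyzers.append(
    FacialExpressionAnalyzer(facialExpression: .mouthSmileLeft,
                             blendShapeLocation: .mouthSmileLeft,
                             minimumValidCoefficient: 0.2)
)
```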

We create a new analyzer with the expression we want to modify, assign it a new coefficient, and replace the old analyzer with the new one. In this case we will accept even a hint of a left smile (0.2 out of 1) as valid; we feel generous.

Photo by Jessica Rockowitz on Unsplash

Recap

That's all, friends! Now you can react in your app to any of the user's facial expressions to develop new functional, helpful and fun interactions. After reading this post we know:

  • How you can detect and react to the user's facial expressions through Wink
  • How it can help you
  • How it uses ARKit behind the scenes to detect the facial expressions
  • How you can react to the user's facial expression changes in your app using Wink
  • How you can extend Wink to detect more facial expressions, or change the minimum valid coefficient of one of the default ones

To know more about Wink and start using it, check the GitHub repo page. You can easily integrate it with Swift Package Manager, or just drag the source files into your project.

And of course any contribution, suggestion or question is more than welcome, either here in the comments or by opening an issue in the GitHub repo.

Happy Winking!
