Smart mirror prototype running on iOS

Interacting with a Mirror

Bernd Plontsch
Apr 14, 2016


Thoughts about smart mirrors and building a prototype along the way

Lately I am spending a lot of time in the bathroom changing diapers. It’s the kind of activity that makes you reflect on the room itself and the fantastically well-scoped situations it covers. Let’s talk about mirrors.

Just like any distinct platform, a mirror comes with its own constraints and usage patterns. What makes a mirror appealing as an interactive medium is that people already interact with one on a daily basis. Smart humans.

Let’s see which patterns we can identify and adapt without interfering with its primary use.

A day from the perspective of a mirror looking at you

You look at it for a couple of seconds, maybe minutes, in the morning to get yourself ready for the day. You use it repeatedly throughout the day while washing your hands. And you use it again at night while brushing your teeth, getting ready to close out your day.

Obvious cases. Also, it isn’t touch. It isn’t voice. Your main means of interacting with a mirror are mimics and pose.

Methods of interaction

I will focus on modes that are already familiar to us:

  • Mimics, facial expressions
  • Gestures, hand and head movements
  • Pose

There are other things we can do. But let’s be very careful with our choices. Voice input can be quite embarrassing when your family is sleeping next door. And touching a mirror? Exactly. You don’t. Resist.

Learning how to say Yes and No

At this point we don’t really know how complex, contextual and personal smart mirror apps should be. We need to develop a sense for this over time while iterating. For now I will just start by exploring possible expressions for the most basic messages, like…

Positive expressions

  • Nodding your head
  • Thumbs up
  • Smiling
  • “Welcoming somebody in” hand movement

Negative expressions

  • Shaking your head
  • Thumbs down
  • Sad, disgusted, weird face
  • Crossing your arms
  • “Go away” hand movement

There are many more, and they all come with their own shades of meaning. Keep in mind that gestures are a type of language. They vary based on your cultural background and might require localization.

But let’s keep moving and pick something that can easily be extracted from a camera video stream and is at the same time effortless for the user to reproduce.

Detecting mimics

This is the point where you start searching and get carried away with OpenCV, augmented 3D meshes, and machine learning. But for a simple prototype we can safely stay in our cozy home and rely solely on iOS’s built-in face detection capabilities.
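As a rough sketch of what that looks like (not the prototype’s actual code), Core Image’s CIDetector can flag smiles and eye blinks per frame without any external dependencies:

```swift
import CoreImage

// Core Image's built-in face detector; tracking keeps face IDs
// stable across frames of a video stream.
let detector = CIDetector(
    ofType: CIDetectorTypeFace,
    context: nil,
    options: [CIDetectorAccuracy: CIDetectorAccuracyHigh,
              CIDetectorTracking: true]
)

func hasSmile(in image: CIImage) -> Bool {
    let features = detector?.features(
        in: image,
        options: [CIDetectorSmile: true, CIDetectorEyeBlink: true]
    ) ?? []
    // One face is enough for the mirror scenario.
    guard let face = features.first as? CIFaceFeature else { return false }
    return face.hasSmile
}
```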

I picked a simple smile as the positive confirmation gesture and shaking your head to signal dismissal.

For this prototype I created a tiny MimicKit framework. It exposes events modelled around user decisions and presence.

  • didConfirm()
  • didDeny()
  • didEnterPresence()
  • didExitPresence()

This allows you to hook Mimic Recognizers into an existing app without worrying about details like video capture and raw feature extraction. The idea is to be able to test out various use cases quickly without building custom “made for Mirror” apps from scratch.
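MimicKit itself isn’t published, but based on the events above, hooking it into a host app could look roughly like this delegate-style sketch (everything beyond the four event names is hypothetical):

```swift
// Hypothetical sketch of a MimicKit-style event interface,
// modelled around user decisions and presence.
protocol MimicRecognizerDelegate: AnyObject {
    func didConfirm()        // e.g. a long smile
    func didDeny()           // e.g. shaking your head
    func didEnterPresence()  // a face appeared in front of the mirror
    func didExitPresence()   // the face left the frame
}

// Example: a reminders widget reacting to mirror events.
final class ReminderWidget: MimicRecognizerDelegate {
    func didConfirm()       { print("Mark reminder as done") }
    func didDeny()          { print("Snooze reminder") }
    func didEnterPresence() { print("Show today's reminders") }
    func didExitPresence()  { print("Dim the display") }
}
```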

The Long Smile

Here are some findings from working on the prototype.

The smile detection works surprisingly well. Even with rather bad lighting, subtle smiles get detected quickly and quite reliably.

It also turns out that smiling is very effortless to execute as a user. Try it out yourself.

To avoid false positives, pure smile detection wasn’t enough. I added a timer triggered by the beginning of a smile to create a Long Smile Recognizer. This allows you to cancel unintended confirmation actions, and it adds to the user’s feeling of control.

For this prototype the states covered inside the detector are Untracked, Tracked, Smile Began, Smile Cancelled, and Smile Completed (long smile). You can see them in action in the following video.

I found it encouraging to see how responsive and interactive switching between the states already feels.
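As a minimal sketch of that state machine (the duration threshold is my assumption; the post doesn’t state the prototype’s value):

```swift
import Foundation

// States as described above, in sketch form.
enum SmileState {
    case untracked, tracked, smileBegan, smileCancelled, smileCompleted
}

final class LongSmileRecognizer {
    private(set) var state: SmileState = .untracked
    private var smileStart: Date?
    private let requiredDuration: TimeInterval = 1.5  // assumed value

    /// Feed one detection result per video frame.
    func update(faceVisible: Bool, hasSmile: Bool) {
        guard faceVisible else {
            state = .untracked
            smileStart = nil
            return
        }
        switch (hasSmile, smileStart) {
        case (true, nil):
            // A smile just started; arm the timer.
            smileStart = Date()
            state = .smileBegan
        case (true, let start?):
            // Still smiling; complete once the threshold is reached.
            if Date().timeIntervalSince(start) >= requiredDuration {
                state = .smileCompleted
            }
        case (false, .some):
            // Smile ended: cancelled unless it already completed.
            state = (state == .smileCompleted) ? .tracked : .smileCancelled
            smileStart = nil
        case (false, nil):
            state = .tracked
        }
    }
}
```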

Implementation notes

I added a buffer of a couple of frames to smooth out the incoming raw detection data, as it can be slightly jittery from frame to frame. You can see this in the synchronised eye blinking, where I didn’t apply the smoothing.
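A minimal sketch of such a buffer, assuming a simple majority vote over the last few frames (the window size is my assumption):

```swift
// Smooths a jittery per-frame boolean signal, such as hasSmile,
// by voting over a small sliding window of recent frames.
struct SmoothedFlag {
    private var window: [Bool] = []
    private let size = 5  // assumed window size

    mutating func add(_ raw: Bool) -> Bool {
        window.append(raw)
        if window.count > size { window.removeFirst() }
        // Majority vote filters out single-frame flicker.
        return window.filter { $0 }.count * 2 > window.count
    }
}
```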

Oh, the blinking. In contrast to a smile, intentional blinking is very exhausting. We don’t blink on purpose, and it feels very unnatural to overload it with meaning for an interaction. I did however keep tracking the closed/open state of both eyes. It’s mainly a fun visual gimmick, but it also helps me test the system latency and makes the avatar appear more alive.

Beyond

The face detection on iOS is a wonderful starting point, but it is quite basic. Having additional tracking information about face rotation and yaw, or even actual hand gestures, would be very useful for building a bigger dictionary of gestures.

This little hobby experiment got me excited. What are your thoughts on smart mirrors, and how do you imagine interacting with one?

The right glass

Oh, and hey. If you have experience building smart mirrors yourself, please let me know. I am currently looking for the right “one-way” glass type to replace our 58.5 cm × 98 cm bathroom mirror. The sample I used for the prototype worked well to hide the device frame, but as a mirror it significantly lacked brightness.

The augmentation side will remain secondary for a while, so I am not willing to sacrifice what makes a mirror useful in the first place: its perfect reflection.

Explore!

I think now is a good time to reconsider interactions based on mimics and hand gestures. The key is to identify contexts where using them feels natural and other means fail. They are not a replacement for your keyboard and mouse.
