Good friction for great UX

Sometimes, more is more

Charlotte Sferruzza

Published in

Onfido Product and Tech

6 min read · Jan 30, 2019

At Onfido, we’re creating an open world where identity is the key to access.

Since our goal is to prevent fraud while maintaining a great user experience for legitimate people, we launched advanced facial verification. With this new approach, we ask people to record live videos instead of static selfies. Users record themselves moving their head and talking out loud to prove they are real people.

This is the first version of our selfie video verification

Our clients adopted this new feature. But after a few weeks, we noticed that the videos were hard to process, making identity verification difficult. Users were not performing the right actions. They would retry multiple times and many eventually gave up. This meant that legitimate users were falling through the cracks.

We needed to find a solution, so over a Design Sprint, we talked to machine learning engineers, biometrics specialists, mobile developers, product managers and UX designers. Our goal was to radically improve our selfie video verification.

This series of blog posts will document our process to improve the selfie video feature. You can try it now on our demo app.

What went wrong?

The actions we asked users to perform while taking their selfie video were designed for us to detect fraud accurately. So what was it about the experience that caused legitimate users to be rejected?

We watched dozens of our users record selfie videos during usability testing, and finally came across two major problems.

Taking a selfie is easy. Taking a selfie video is hard. Currently, to record their selfie video, users have to hold their phone up, press a button, speak out loud, press a button, move their head, press a button again, and finally submit their video. That’s a lot of buttons to press while keeping the phone up, moving and speaking at the same time.
Users don’t know whether they did it right or wrong. Users have to make movements in front of their phones that they would never normally make. They have instructions to follow, but nothing to tell them whether they’re doing the right thing. At the end of the experience, users don’t know if what they’ve done is correct.

The actions we ask users to do have been designed for us to detect fraud very accurately. We couldn’t change the actions themselves, but we could improve the way we guide users through this process.

We decided to focus on helping users turn their head, as it’s where we saw the most mistakes. Improving the “say the digits out loud” action is our next step.

Initial prototypes

At the end of our Design Sprint, we had two prototypes that we tested with users.

Prototype 1: we added an introduction screen before the selfie video to tell users what to do, and how to do it. When testing it, we saw fewer users struggling: they were recording good-quality videos with the correct movements. We also saw users' feeling of trust improve. But some people skipped this screen, as they wanted to start recording their video straight away.
Prototype 2: we showed instructions on the camera screen to help users while they were recording their video. Most users were happy with these instructions, which reassured them while performing the required actions. However, some people missed them: they were focusing on looking at their face on the screen, and nothing else. The instructions on the camera screen provide in-context help, but some users focus on their actions and prefer feedback to instructions.

The solution

Those two prototypes became our starting point to craft our final solution.

We created an introduction screen, and we kept it light. We recorded tutorial videos to show users performing the actions.

We recorded multiple videos to showcase different users demonstrating our feature. The video presented to a specific user is picked at random from this set.

We’ve also started detecting users’ faces. When users are correctly positioned in front of their phone, their face is detected and the video recording starts automatically. It’s a good way for users to be sure they’re correctly positioned.
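To make the auto-start idea concrete, here is a minimal sketch of how a face detector's output could trigger recording. It is not our production code: `FaceBox`, `is_well_positioned`, `AutoStartRecorder`, the thresholds, and the rule of waiting for several good frames in a row are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FaceBox:
    """Hypothetical detector output: a face box in normalized [0, 1] frame coordinates."""
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


def is_well_positioned(face, min_size=0.3, max_offset=0.15):
    """Return True when the face is large enough and roughly centered.

    The thresholds are illustrative assumptions, not real product values.
    """
    if face is None:  # no face detected in this frame
        return False
    center_x = face.x + face.width / 2
    center_y = face.y + face.height / 2
    big_enough = face.width >= min_size and face.height >= min_size
    centered = abs(center_x - 0.5) <= max_offset and abs(center_y - 0.5) <= max_offset
    return big_enough and centered


class AutoStartRecorder:
    """Starts recording once the face has been well positioned for several frames in a row."""

    def __init__(self, required_frames=5):
        self.required_frames = required_frames
        self.good_frames = 0
        self.recording = False

    def on_frame(self, face):
        """Feed one detection result per camera frame; returns True once recording has started."""
        if self.recording:
            return True
        if is_well_positioned(face):
            self.good_frames += 1
        else:
            self.good_frames = 0  # any badly positioned frame resets the streak
        if self.good_frames >= self.required_frames:
            self.recording = True
        return self.recording
```

Requiring a short streak of good frames, rather than a single one, is one way to avoid starting the recording while the user is still moving the phone into place.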

We’ve added a playback of the recorded video so users can review it before submitting. This gave people more control over their image and biometrics. It also helped get rid of a large number of failed videos before they were sent to us for identity verification.

Overall, we’ve added more visual and haptic feedback to help users turn their head correctly. More about this in the next blog post 🤓

To improve our selfie video feature, we’ve added new screens to the experience, more text to read and a video to watch. This made recording the video take longer than before.

Our users usually sign up to a service (for example, if they want to use our friends at Revolut 😄) and need to verify their identity during this process. The time spent on our experience is added to the time spent on the sign-up process. More time on our side can quickly add to their fatigue.

So… you made the experience longer, isn’t that increasing drop-off?

This is common feedback we received when we started sharing our solution.

More friction = drop-off
More time spent on the task = drop-off

Those are very common hypotheses that we wanted to test. Through user testing, we were surprised to find that users were willing to trade time for a better experience and transparency. We found that better instructions and a more robust capture experience were a game changer.

People trusted the software more, were more confident about what they were doing, and did not feel fatigued. The time they perceived spending on the task was equal to or lower than in the previous version.

We’ve added steps to our experience, and more time to spend on screens before, during and after the video capture. But instead of creating fatigue and seeing users drop off, we helped them avoid mistakes. Our solution led to fewer retries, and it made users trust our product more.

We are adding friction, but it’s the price to pay to help users achieve their goals in a comforting and secure way.

When asking users to share very personal data to allow them to access a service, perceived speed and effective speed are very different.

Each step we added brought transparency and reinforced our users’ trust.

Read Sérgio Moura’s post about how we made these improvements from an engineering point of view:

Face detection and tracking on Android using ML Kit

To improve our selfie video feature, we decided to give users feedback about the correctness of the actions they were…


❤️ Thank you Rae, Daniel, Minh and Sérgio for helping me make this post better.