User Testing to spot Insights & Inferences in Uber & Ola’s Design

Usability tests, user experience tests, user tests: whatever you call them, companies mostly ignore them in order to ship faster. Hell, a lot of them don’t even know what they are. However, these tests are very useful for understanding, from a user’s point of view, how people interact with the app and what they think. We can then gather insights, brainstorm on what goes wrong in the process, and improve the app’s UI and UX based on those insights.

A good process is to design, prototype, test with real users, improve and then deliver. This can work with a Marvel or InVision prototype as well, in case you want to skip the painful effort of developing first and figuring out what’s wrong afterwards (2x the trouble).

As part of a side project alongside college, I decided to start a Podcast/YouTube Channel with Nancy Jain and Saurabh Kumar in October 2016 to conduct User Tests, spot insights from the videos and broadcast them to the world.

Unfortunately, things got a little too hectic in all of our lives and the project got put on the back-burner.

But we did conduct a few user tests for two popular cab booking apps, Uber and Olacabs, to understand how users with different levels of expertise use them.

First up, we decided to come up with the task. A task is necessary to make sure users know what the end goal is, so that they don’t get lost in the middle of the test. We wanted to test multiple things in Uber and Ola’s apps. Nancy and I decided to visit a local meet-up, the “Mumbai Artificial Intelligence Meetup” by Fractal Analytics, in December 2016 and present the attendees with a task to perform on my phone (Nexus 6P). We recorded the screen with an app called Adv Screen Recorder, which uses Jake Wharton’s Telecine.

The task was simple.

Book a ride from Source (A) to Destination (B).

To raise the difficulty and let the user go a little deeper during the test, we presented the users with a few constraints.

You have Rs. 150 to book a ride from Point A to B
In case the amount doesn’t suffice, you can ask us for a Coupon Code which you’ll need to enter to avail the discount.
You have to reach the destination with 3 passengers (including you)

We also decided to record the users’ faces as they used the apps. But on trying this with one of the users, we concluded that it made them too uncomfortable, as if they were being watched. It did not feel natural to them and would have altered the test results. Hence, we decided to stick with the default screen recording with audio.

Nancy and I tested several users at the meet-up, while Saurabh conducted a few tests on his own. I could give you all the juicy tidbits of what happened, but a better way for you to explore is to see the videos for yourselves here.

A week or two later, we met up to spot insights and inferences in the videos and document them.

Here’s the data we collected:

1. Users’ Expertise with the App (For 6 Users)
Expertise of each user involved in the test.

We were lucky to find testers we could identify as frequent users, occasional users and ones who had never tried either app.

2. Insights for Uber’s Design

Insights for Uber

U1 to U6 are the users the apps were tested on, as per the image above this one. The table shows the problems each user faced. Whenever one user faced a problem, we documented it in Excel so that we could check whether other users faced it as well.

1 indicates the user failed that particular test and the problem occurred.
0 indicates the user could successfully navigate
Blank indicates the user did not perform that activity so no data to show if they passed or failed.
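To make the 1 / 0 / blank encoding concrete, here is a minimal sketch of how such a sheet could be summarised once it is out of Excel. This is not the analysis we actually ran. The problem names and values below are hypothetical placeholders (not our results), and `failure_rate` and `summarize` are illustrative helper names. The one idea it shows is that blanks should be left out of the denominator, since a user who never attempted an activity neither passed nor failed it.

```python
from typing import Optional

# Hypothetical placeholder data, NOT the results of the study.
# Each row is a problem, each column a user (U1..U6).
# 1 = the problem occurred, 0 = the user navigated successfully,
# None = blank (the user did not perform that activity).
observations: dict[str, list[Optional[int]]] = {
    "example problem A": [1, 0, None, 1, 1, 0],
    "example problem B": [None, None, 0, 0, None, 1],
}


def failure_rate(row: list[Optional[int]]) -> Optional[float]:
    """Share of users who hit the problem, among those who attempted it."""
    attempted = [value for value in row if value is not None]
    if not attempted:
        return None  # nobody attempted it, so there is nothing to report
    return sum(attempted) / len(attempted)


def summarize(obs: dict[str, list[Optional[int]]]) -> dict[str, tuple[Optional[float], int]]:
    """Map each problem to (failure rate, number of users who attempted it)."""
    return {
        problem: (failure_rate(row), sum(value is not None for value in row))
        for problem, row in obs.items()
    }


summary = summarize(observations)
for problem, (rate, attempts) in summary.items():
    shown = "n/a" if rate is None else f"{rate:.0%}"
    print(f"{problem}: failed by {shown} of {attempts} users who tried")
```

With only six users, a rate like this is a hint about where to look in the videos rather than a statistic to rely on.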

The tests surfaced a lot of flaws in Uber’s earlier design, and a few of them were reproduced across users. That makes the new redesign a welcome change, since it addresses the insights we captured in the user tests.

3. Insights for Ola’s Design

Insights for Ola
1 indicates the user failed that particular test and the problem occurred.
0 indicates the user could successfully navigate
Blank indicates the user did not perform that activity so no data to show if they passed or failed.

Ola’s design did bring up a lot of problems, but not all of them were reproduced. However, this doesn’t mean their design was flawless. It only means we didn’t test enough. Most of the problems in Ola’s design were unique to a single user, but one problem seemed to exist with almost every user who had not used the app or was an occasional user:

They could not find the destination box below the source box.

4. Inferences for both Uber & Ola

Inferences/ Conclusions based on the Videos.

Inferences are conclusions we’ve come to based on the videos. They are also helpful suggestions for the designers in the company or the agencies who look after their designs.

A few of the inferences applied to both Uber and Ola, while the others applied to just one. Designers could use these insights to create meaningful interactions for when users face one of the problems above.

5. User Testing Mistakes

We made a few mistakes during the tests. While we haven’t recorded all of them, we did record one of mine. It’s important to spot the mistakes you make yourself, so you can understand whether a cue from you affected the test in any way and led the user to do something other than what they intended.

Mistakes

For example, I asked a user to get a ride estimate while they were lost in the application, trying to figure out how to calculate the amount. That cue from me was unnecessary, and my mistake led to a different outcome than what the user would have reached on their own.

Although we couldn’t get our podcast/channel up as we wanted to, it was a good learning experience and an even better way to understand how User Testing works and how users behave differently than you’d have thought they would.

Another thing we learnt is that we cannot call this an actual user test, since the users were not in their actual environment while taking it. For one, it wasn’t user driven. It was an activity the user was told to do, as opposed to something they wanted to do. Since the user did not feel the urgency of booking a cab in this fake test, they could have taken it more lightly or taken a lot more time to do it. Also, these users were in a safe environment, at a meet-up. What if the user wanted to make the booking somewhere else, like on a street? The circumstances would have been different then, and so would the results.

I also used this study for my paper titled “Data Driven Design — Leveraging Data to Make Powerful Design Decisions” which is published in International Journal of Latest Engineering Research and Applications (IJLERA) — http://ijlera.com/papers/v2-i6/4.201705320.pdf
