How to “Remote User Test”

Baki Bektaş

Published in The Startup, Jun 6, 2018

Pro tips and know-how from 3000 remote user test sessions.

* Prisma App image effect applied to keep the actual user identities anonymous.

Abstract

When your user base consists of 3.5 million people from pretty much every country, there are many things to consider while conducting user tests. User behaviors and mindsets change drastically from city to city and country to country. This creates problems if you conduct user tests the old-fashioned way, aka in person. That’s why most of the usability tests that we do at JotForm are remote user tests. We find them very effective and get great results from them.

This is the collective know-how from over 3000 remote user test sessions, most of them dedicated to our new and distinct form structure, JotForm Cards. During our R&D phase we identified more than 280 usability issues and 110 improvement opportunities with the help of remote user test analysis alone. Want to know how and why? I’ll tell you.

What is remote user testing?

Remote user testing tools have been around for quite some time now. They’ve evolved with our products, adding more features to help us reach “the user” or, to be more specific, “their users.” The way it works is by pooling a variety of people from different countries in exchange for an incentive. The “users” are presented with a set of tasks and asked to complete them. They work through the tasks and are recorded for further analysis.

If your software needs to appeal to a broad audience (or many diverse countries), a single version won’t cut it. You have to iterate and iterate, and then some, and then a little more, and then… with every version, you need to make sure that everything runs smoothly. Not just on your high-end MacBook Pro, and definitely not just on your fancy iPhone X. If you want to appeal to the masses, your software needs to be “people safe,” not just “lab safe.” It needs to work on a mediocre Android phone, or on a lagging first-generation iPad, so that more people are able to use it. To test your software on various devices, in various locations or countries, with various people who each have a distinct set of characteristics, remote user testing is the practical way to go.

As always, it all boils down to dualities. So let’s get the pros and cons out of the way, shall we?

Pros

Target Audience: It is vital to reach your target audience in your user tests, otherwise the test data is useless. We have been working with Usertesting.com and they made it really easy for us to pick users that match our target audience by pinpointing demographic and device options. They even have a screener feature where you can select special requirements, such as “must be an admin of a webpage,” so that we can test our advanced features.

As I mentioned earlier, providing an incentive to the users in their pool has its own pros and cons. An incentive, especially an economic incentive, helps users stay dedicated and focused on their task at hand. This provides the opportunity to test your prototypes with great depth since you rule out the possibility of your users getting frustrated and leaving. You can even set requirements to difficult levels and still receive good data.

Anytime, Anywhere: So you have just finished your new and shiny prototype and want to test it with real people. Normally, you would set up participant schedules, send out invitations, analyze the replies, and organize this and that… or you could just set some options and be ready to go. You can simply send a test to a fisherman who lives in the Bahamas, or an IT specialist who works in Denver. Your test videos will be ready in a few hours, fresh from the field, just the way you like them. This can be very beneficial if you’re working in a fast-paced environment.

Any Device: There are many tools out there that allow you to test your software on different devices, but most of them are simulators running specific sets of code that work with the hardware. Others are linked up to actual devices, but run buggy or outdated software, which makes them very unpredictable. Yet again, usertesting.com helped us conduct user tests that matched our requirements, such as device type, brand, operating system, browser, etc. Since these are real devices used by their real owners, it’s hard to get more genuine than that.

Fire and Forget: Nope, you can’t intervene in this. After you send the order, it is out of your control, and this is a good thing. For instance, if the user were sitting next to you, they could pick up on your body language, and that could change the way they interact with the test. In remote user tests, on the other hand, users can’t see you or your reactions to their actions, which rules out the possibility of you interfering with their experience. If you are after the authentic experience, this is a good way to get it.

Target Anyone, Friend or Foe: It is great to learn how your product performs with real people, but it can be even greater to know how your competitors are doing in a similar test setting. If they score better than you do on your key success metrics, then an opportunity for improvement arises, along with a sense of jealousy. :) Don’t feel bad if you ever find yourself surrounded by top players. Great competitors provide lessons to be learned; weak ones only pump your ego.

Real Places, Real Environments: One other thing about remote testing is that it is remote: you know, away from your lab and its regulated environment. It’s their story now: their living room while their child is playing and making loud noises, their office at lunch break with ringing telephones in the background, their car when the alarm is going off. Your first reaction might be, “How is that a good thing, with all that noise?” But it is a good thing. It is a real use case with authentic environmental features. This is what’s going to happen when you release your software to the world. People will use it in many different locations. Some are isolated and calm, while others have lots of distractions. It is good to see your mobile app crash when a call comes in to the test participant’s phone. You might have missed that simple possibility in a regulated lab environment.

Budget Friendly: I am no expert in their account types, but as far as I know we have a great deal with usertesting.com, which covers pretty much all our needs at a fraction of what it would cost to do it in our own lab. You probably don’t need to order around 150 test sessions per month like we do. Just 5 per week should be enough to cover the majority of your usability problems in most cases.

Cons

Can be all about the money: Have you ever heard the saying “People never read”? It is true. Usually they scan through looking for highlights, randomly checking here and there, mostly fixating on points of high contrast. Maybe you’ve just scanned this post and have fixated on the Cons. I can’t blame you though. No one has time for irrelevant information; we barely have time for the relevant. However, if you are economically incentivized and motivated, like you would be in a remote user test, you might spend more time and effort in places that you normally wouldn’t. This is somewhat problematic because the test administrator wants to receive and analyze an authentic experience. Usertesting.com has ways to overcome this problem though, because they allow you to invite your own users to the tests and control your incentives. One other way would be to set an unguided test scenario, but we will get to that later on.

Qualitative vs Quantitative: I can hear some data people saying 10 people is not a statistically sufficient sample to base design decisions on. And they’re right, it’s not. But it’s not about statistics either. Let’s say that you would like to test a new road. You can have quantitative data from usage metrics, such as the average speed of 1000 cars passing through being 42 mph. On the other hand, qualitative data from 5 cars could suggest actions like fixing the bump at the 3rd mile to improve overall safety and speed. It does not matter if you ram 1000 cars into the same bump, the result would be the same: a bump is a bump. Statistics of thousands can tell a lot about the masses, but the journey of a few can tell a lot about people and their actual needs. According to research, qualitative analysis of 5 user tests can help identify 80% of the usability problems.
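The research behind the 80% figure is not named here. One commonly cited model, assumed for this sketch, is the Nielsen and Landauer curve: the share of problems found by n testers is 1 - (1 - L)^n, where L is the share a single tester uncovers (about 0.31 in their data). The function name and default value below are illustrative, not something from our own tests.

```python
def share_of_problems_found(testers: int, single_tester_rate: float = 0.31) -> float:
    """Expected share of usability problems found by `testers` participants.

    Assumes the Nielsen-Landauer model: every tester independently uncovers
    the same fixed share of the problems. The 0.31 default is their reported
    average and may not hold for your product.
    """
    if testers < 0:
        raise ValueError("testers must be non-negative")
    if not 0.0 <= single_tester_rate <= 1.0:
        raise ValueError("single_tester_rate must be between 0 and 1")
    return 1.0 - (1.0 - single_tester_rate) ** testers


if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        print(f"{n:>2} testers -> {share_of_problems_found(n):.0%} of problems")
```

Under these assumptions five testers land at roughly 84%, in the same range as the figure quoted above, and each extra tester adds less than the one before.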

It can be really hard to communicate this to people who are accustomed to quantitative data. It is also impractical to convert one into the other, since qualitative analysis of 500 test sessions can take a considerable amount of time and effort to set up, watch, and analyze. I know, because we once sent 750 test sessions for a single topic and we are still analyzing them. :)

How to Remote User Test

Let’s start with clarifying the purpose of your research:

Would you like to know if your product performs as intended?
Would you like to know if the content communicates your message effectively?
Would you like to learn how others see your brand from their perspective?
Would you like to know how your product performs against other brands? What are its weaknesses and strengths?

What do you need this research for? Setting your goals and key performance indicators (KPIs) upfront will help you benchmark your results more clearly in all future iterations of your research.

The Specs

Let’s get started with the people. Who is the target audience? There is no such thing as targeting everyone! There is no such thing as an average user or general use case either. You have to narrow your focus down to a certain group of people who act and behave similarly. Otherwise your next test will bring data from a completely different set of people that you have not analyzed closely before, which makes it impossible to benchmark your KSIs against previous tests. Carefully picking your target persona type upfront is vital to the success of your research, so please oh please, don’t aim for the mythical average user. They don’t exist!

No, people are far more capable than your dev team suggests!

Also, who is this so-called “User” anyway? Definitely not just some number in your analytics data or a ghost that comes and goes. A user is not a robot with an ambition to visit every depth of your software either. They are people with their own agenda and their own individual perspective on common concepts. So it’s important to know a few things about these humans and how they operate on a psychological level.

A couple of other things to consider are whether your test requires previous domain knowledge and whether it’s safe for new users.

Do you need to teach a few things before they can get going with your test scenario?
Can the user intuitively learn what they need to do without any guidance?

You can’t just ask someone to test drive a car if they don’t have a driver’s license to start with, right? You can guess up front that they will crash immediately if you force them to drive. They need domain knowledge first. Giving a few quick hints upfront will help them get started and keep going with the test. You need to consider how much you can reveal to them too. You can’t just hand them the guidebook and expect an authentic experience analysis from it. Nobody reads those bloody guidebooks anyway; actually, “nobody reads” will do just as well here. Instead, try to give them a real use case scenario and see if they can stay on the road. Anybody can successfully land a jumbo jet with clear guidance, but that’s not the point here now, is it?

On the other hand, you might find yourself in a tight position where you need to test a part of your software without giving hints about it beforehand. If that’s the case, try to be suggestive about it. Don’t cite the names of the menu titles that they need to go to in your scenario or tasks, or they will just search for those words on the screen and click on them without thinking too much about it. Instead, try to imply the mindset that you need them to be in. They should be able to find their way around without any road signs if the design is intuitive.

But intuitive to whom? This is where the expected difficulty level of the test kicks in. Can an average web user handle it or do you need advanced users who can also code? Choose your audience wisely beforehand or you will just watch frustrated people doing things which they have no idea about.

Last but not least, channels of interaction. Do you need to test on a specific browser or screen resolution? Which device is your weapon of choice? Desktop, tablet, mobile, smartwatch? This is where remote usability tests shine. You can pick pretty much any device, OS, or browser type and only relevant users will pick up your tests. This is quite handy if you need to test a device that your lab does not have access to.

Task Sets — The scenario

I know, creating a mindset up front can be leading, but it is very necessary. If you don’t specify the state that they need to be in, they will just be in a random state. It’s like inviting someone to a costume party without letting them know what it’s all about. Briefly informing them about what they are up against and how long it will take can relieve unwanted stress and allow them to be in the desired state. A first task with a title like “Imagine yourself in this scenario” is a good way to start.

If we were to categorize the scenario task types, I think it would be safe to say that we can narrow them down to these two: Guided and Unguided.

Guided user test scenarios or task sets consist of step-by-step descriptions that guide a user to a specific point in the experience. There can be more than ten guided tasks in a single test session. Some examples would be:

Please go to “www.jotform.com” and sign up with a free account.
If you have signed in, please try to create a new form.
Please try to create a contact form with relevant questions…. etc.

Guided tests are like hand-holding, and they work great when a new user could not otherwise discover the feature that you want to test.

On the other hand, my personal favorite, the unguided user test scenarios or task sets, generally consist of 1 or 2 tasks in total. The main idea is to leave the user as free as possible with an open-ended scenario like:

“While chatting with a colleague, you heard that online web forms can help you with your profession. For the next 20 minutes please make a Google search about web forms and online form builders, pick 3 alternatives of your choice, compare their abilities and try to test them out if they match your expectations.”

This task could have been assigned by any form builder brand. This vague task description forces the user to act as they would in a real life situation without giving a hint of the facilitator’s identity, which makes biased or leaning opinions unlikely.

But why do it? For example, I believe the initial experience of a new JotForm user does not begin at the domain itself. It begins with an idea in mind: the idea of a need for forms and how to handle it. After some Google searches, the user starts to get a grasp of the online form concept. Surfing through some landing pages, they learn the general jargon of what online form builders are, how much they cost, what they look like, and how they operate. Later on, they pick one or two of them to handle their needs. However, if someone were to intervene during any part of this unfolding process, let’s say a friend or colleague were to state that they use this X brand and it works awesome, it would act as social proof and most likely make the user lean toward that brand.

Guided tests are like this friend or mentor, telling you what to do and how to do it, which can create a biased opinion towards the brand. Unguided tests, on the other hand, tend to be like real life quests. They present you with the problem and ask you to figure it out in your own way. There is a possibility of sidetracking to unintended experiences of course, but I believe that is the magic of it. It just shows how that user persona would act in that situation, and most of the time it is just beautiful to observe and analyze. You can hear them voice negative or positive opinions about your brand without hesitation and without knowing you are the owner of that test, and it provides the best and most unbiased test results you can get.

Consider your objective: what do you want to get out of this test? What do you want to learn from the people on the other side of the keyboard? Whether you choose guided or unguided, don’t forget that by creating a scenario you are actually creating a mindset, the initial state that shapes what they think is in front of them. You craft their world for the next 20 minutes or so. Guide them and they will try to follow as best they can and show you where they stumble along the way. Leave them free and it’s just beautiful to observe and analyze the experience unfolding in front of you, regardless of where it may end.

The Schedule

Consider the time zones of your target audience. It can be 5am at your target location, or your possible participants could have been working on a computer for more than 12 hours and be pretty tired by the time you reach them. It does not matter if they are geniuses, they will most likely fail even basic tasks if they are tired. So try to start your tests with their time zones in mind, not yours.
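A quick way to sanity-check launch times is to convert the current time into the participant’s local hour before sending the test. The sketch below is only an illustration: the function names, the 9:00 to 18:00 window, and the fixed UTC offsets (which ignore daylight saving time) are all assumptions, not settings from any testing tool.

```python
from datetime import datetime, timedelta, timezone


def local_hour(utc_now: datetime, utc_offset_hours: float) -> int:
    """Hour of day (0-23) at a location with a fixed UTC offset.

    `utc_now` must be timezone-aware; daylight saving time is ignored.
    """
    local = utc_now.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.hour


def is_good_time_to_launch(utc_now: datetime, utc_offset_hours: float,
                           start_hour: int = 9, end_hour: int = 18) -> bool:
    """True if the participant's local hour is inside the chosen window.

    The 9:00-18:00 default is a hypothetical choice; adjust it to your audience.
    """
    return start_hour <= local_hour(utc_now, utc_offset_hours) < end_hour


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical target locations with fixed offsets.
    for place, offset in {"Denver (UTC-7)": -7, "Istanbul (UTC+3)": 3}.items():
        print(place, local_hour(now, offset), is_good_time_to_launch(now, offset))
```

For example, noon UTC is 5am at UTC-7, so the check would tell you to hold that launch for a few hours.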

When do you need the test results by? That is another good question. If you need them in two hours, then you had better keep your target audience criteria broad or you will fail to receive them on time. An average user test takes about:

30 minutes to create and test thoroughly before sending it to the user
Depending on the specific requirements for your users, it can take between 10 and 100 minutes for your test to be picked up by a user.
20 to 25 minutes for the user to complete the test, ideally.
Around 40 to 60 minutes to analyze and report a single user session thoroughly. It will take a considerable amount of time to analyze a test with 10 users, so plan ahead.
Bonus: Allow yourself some extra time to compile the results into a presentation for your team. Communicating your findings the right way benefits everyone, because it makes them easier for the team to digest.
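The figures above can be turned into a rough time budget. This is a minimal sketch with a hypothetical function name, assuming that all sessions are picked up and completed in parallel while one person analyzes them one at a time. Both are simplifications, and the bonus presentation time is left out.

```python
def estimate_minutes(sessions: int) -> tuple[int, int]:
    """Return (best_case, worst_case) total minutes for one test.

    Uses the per-step ranges listed above. Assumes sessions run in parallel
    and a single analyst reviews them sequentially.
    """
    if sessions < 1:
        raise ValueError("sessions must be at least 1")
    create = 30                         # create and dry-run the test
    pickup_min, pickup_max = 10, 100    # wait until users pick it up
    run_min, run_max = 20, 25           # users complete the test
    analyze_min, analyze_max = 40, 60   # analysis per recorded session
    best = create + pickup_min + run_min + analyze_min * sessions
    worst = create + pickup_max + run_max + analyze_max * sessions
    return best, worst


if __name__ == "__main__":
    best, worst = estimate_minutes(10)
    print(f"10 sessions: {best / 60:.1f} to {worst / 60:.1f} hours")
```

With 10 sessions the estimate comes out at roughly 8 to 13 hours, almost all of it analysis, which is why the two-hour deadline only works for very small tests.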

Frequency

We try to conduct at least 100 user test sessions a month, and it usually reaches more than 150. This gives us the edge to keep every part of our development under surveillance and in a human-friendly state. But to be clear, JotForm software has a great depth that needs to be maintained and cared for. We have over 50 engineers constantly improving and committing new code to the system, which requires constant tests to keep everything in check. Your app might not require this many user tests per month. Maybe 40–80 monthly sessions could be more than enough for most cases.

Debriefing

They made it! All the way through your test scenario. They failed a few tasks along the way, but hey, this is what the tests are for, right? So, what now? Any final thoughts?

Asking “how it all went” as a last question will not cut it. They will just repeat their last comment and get more frustrated about your never-ending questions. At this point we love benefiting from the flexible and resourceful data handling capabilities of web forms. Since we are in the digital form business, we know a trick or two about how to use them effectively. Below are some ideas which you might find useful.

The immediate state of your test subject during their test phase is really valuable information and a game changer for the outcome of your analysis. Some of these data points are:

Age: Your test participant’s age can affect their problem solving capabilities, motor abilities, and many other variables.
Profession: Adds previous domain knowledge and experience to their problem solving capabilities.
Web expertise level: Most users are novice web users capable of basic actions such as copying and pasting information from one tab to another. If your test subjects are advanced users who can also debug or code a web site, then basing your assumptions on their performance can be deceiving.
Their local time and space (Automatic): Don’t expect too much from a tired and sleepy test subject. Try to schedule your tests with respect to their sleep cycle to get optimum performance from them. Also, knowing where they are on planet Earth, down to which city they live in, can be valuable statistical data for further tests. This information can be gathered without user interaction.
Total Work Hours: The amount of time they have been working that day before the test session affects their mental performance. They might have already drained their cognitive resources before they are exposed to your test, which affects your test success rates.
Energy and mood levels: How were they feeling before they accepted your test? Are they sad, upset, too happy, very tired or fresh? Knowing how they feel can help analyze their actions in certain situations.
Profile Picture: A quick selfie helps more than you can imagine. Knowing what they really look like breaks the ice, connects your team with the user, allows you to double-check their persona details, and also reveals environmental and locational features, such as office or home settings.
System Usability Scale Test (Semi-Automatic): A quick ten-question System Usability Scale (SUS) test helps you benchmark their experience accurately and provides comparable data points between tests. Since you can calculate their score on the fly with the help of the conditions and calculations features on JotForm, all that is left for you to do is analyze the final scores.
Net Promoter Score (NPS): A single but effective question will help you see how they feel about your product or service.
Other Comment Questions: “If you had a magic wand, how would you improve this site?”, or “What frustrated you most during your experience?” These types of questions will get you a lot of written comments that can be turned into actionable data later on.
Device Type: Brand and model details come in handy in some situations so that you can pinpoint the bug later on.
User Agent String (Automatic): You can gather their browser and OS information via User Agent String properties automatically.
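For readers who want to check the SUS and NPS numbers outside a form builder, here is a minimal sketch of the standard scoring rules. It is not how the JotForm conditions and calculations features work internally, and the function names are hypothetical. SUS assumes ten answers on a 1 to 5 scale in questionnaire order; NPS assumes ratings on a 0 to 10 scale.

```python
def sus_score(answers: list[int]) -> float:
    """System Usability Scale score (0-100) from ten answers on a 1-5 scale.

    Standard scoring: odd-numbered items contribute (answer - 1),
    even-numbered items contribute (5 - answer); the sum is multiplied by 2.5.
    """
    if len(answers) != 10 or any(a < 1 or a > 5 for a in answers):
        raise ValueError("expected ten answers, each between 1 and 5")
    total = 0
    for index, answer in enumerate(answers):
        # index 0, 2, 4, ... are questionnaire items 1, 3, 5, ... (odd-numbered)
        total += (answer - 1) if index % 2 == 0 else (5 - answer)
    return total * 2.5


def net_promoter_score(ratings: list[int]) -> float:
    """NPS (-100 to 100): percent promoters (9-10) minus percent detractors (0-6)."""
    if not ratings or any(r < 0 or r > 10 for r in ratings):
        raise ValueError("expected at least one rating between 0 and 10")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


if __name__ == "__main__":
    print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))
    print(net_promoter_score([10, 9, 8, 7, 6, 3]))
```

Keeping both scores per session makes it easy to compare one test round against the next, which is the benchmarking use described above.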

You can test and clone a demo Remote User Test — Usability Summary Form here, which we use in our tests. It only takes a few minutes to fill out but helps greatly later on. You can integrate the form with Google Docs and collect the submissions on a sheet. You can also use reports to visualize the submissions and present them to your team or stakeholders (JotForm makes this process very easy). Here is a great article from Nielsen Norman Group about usability metrics and how to use them.

Some of these data points can be gathered without user interaction, such as their local time or their user agent string. It is important to be clear with your test subjects up front and tell them that they will need to fill out a form after the test, which will require personal details such as their profile picture. They should be notified about this before they accept your test. These data points are intended to reveal their persona and not their true identity. That’s why we never ask for their real name, phone number, personal email, or other information that can reveal their identity. Remember, the point here is to learn as much as you can about their initial state and their persona, not their identity. Always be respectful of that. I can’t stress this enough: it is really important, and it is good practice.
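As an illustration of what the automatically gathered user agent string can tell you, here is a rough sketch that reduces it to a browser and an operating system. The function name is hypothetical and the substring checks are a crude heuristic, assumed for this example only. User agent strings are messy, so a maintained parsing library is the safer choice for real analysis.

```python
def summarize_user_agent(user_agent: str) -> dict[str, str]:
    """Guess browser and OS from a user agent string with simple substring checks."""
    ua = user_agent.lower()

    # Order matters: many browsers also carry the tokens of other browsers
    # (Chrome includes "Safari", Edge and Opera include "Chrome").
    if "edg" in ua:
        browser = "Edge"
    elif "opr/" in ua or "opera" in ua:
        browser = "Opera"
    elif "firefox" in ua:
        browser = "Firefox"
    elif "chrome" in ua or "crios" in ua:
        browser = "Chrome"
    elif "safari" in ua:
        browser = "Safari"
    else:
        browser = "Unknown"

    # Android strings may mention Linux and iOS strings mention Mac OS X,
    # so the mobile systems are checked first.
    if "android" in ua:
        system = "Android"
    elif "iphone" in ua or "ipad" in ua:
        system = "iOS"
    elif "windows" in ua:
        system = "Windows"
    elif "mac os x" in ua:
        system = "macOS"
    elif "linux" in ua:
        system = "Linux"
    else:
        system = "Unknown"

    return {"browser": browser, "os": system}


if __name__ == "__main__":
    sample = ("Mozilla/5.0 (iPhone; CPU iPhone OS 11_3 like Mac OS X) "
              "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.0 "
              "Mobile/15E148 Safari/604.1")
    print(summarize_user_agent(sample))
```

Note that nothing here identifies the person. It only describes the device setup, which fits the persona-not-identity rule above.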

Summary

So this is about it. It has been a great journey. We have identified the problem, set our goals and KSIs, sent the test to relevant people, analyzed the results thoroughly, and shared our findings with our team and stakeholders. I would call it mission accomplished.

I’ve tried to share the way we do our remote user testing at JotForm, but now I’d like to know how you do it. Let’s share and improve our workflow together! Drop me a comment or DM if you’d like.

Thanks for reading.

Originally published at www.jotform.com.