A PM’s Guide to Building Good Hypotheses and Testing Them

Geetika Guleria
Published in Agile Insider · Apr 5, 2020

The art of building a successful product is, in reality, a scientific endeavor. Instead of rolling a feature out to the entire user base and only then measuring its contribution to the business, product managers bring experimentation into the discovery process. Say hello to hypotheses.

A hypothesis is an assumption. It is an idea proposed for the sake of argument, so it can be tested to see if it might be true.

Even though a hypothesis is an educated guess, you need to build it around a problem statement so you can measure success. For instance, a problem statement might sound something like this: “How can we increase sign-ups when the user is on the sign-up page?”

Any additional brainstorming would then have a clear focal point, such as increasing sign-ups when the user is on the sign-up page. You can then come up with the following:

  • If we shrink the sign-up form to a single page view, then the sign-ups will increase.
  • If we remove the drop-down options from the form, then the sign-ups will increase.

The above statements are incomplete hypotheses, though.

What makes a good hypothesis?

In a nutshell, a good hypothesis is testable. A good hypothesis will be specific, and easy to validate or refute through a test. The following components make up a good hypothesis:

  • The specific change you are testing: Avoid vague statements such as, “If we improve the UI, then it will increase the sign-ups.” What UI change are you trying to test? Get specific.
  • Expected outcome: We’ve discussed the importance of a problem statement having a clear focal point. We also need to set a numeric target to define success for the experiment: “If we shrink the sign-up form to a single page view, then sign-ups will increase by 30%.” This number is tricky to arrive at, but it is important for guiding the duration of your experiment and for keeping the outcome black and white. In this case, if shrinking the sign-up form to a single page view results in less than a 30% increase in sign-ups, the hypothesis is considered a failure. The target could be set by the product heads at the start of the discussion, or you could derive it from previous experiments or from industry benchmarks for, in this case, conversion rates.
  • Duration of experiment: You can use an online A/B-test duration (sample size) calculator to figure out how long to run a test and avoid false positives in your experiment.

Your final hypothesis might look something like this: “If we shrink the sign-up form to a single page view, then the sign-ups will increase by 30% in a week.”
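
To get an intuition for how the baseline rate and the expected lift drive that duration, here is a rough Python sketch of the sample-size math behind such calculators. The baseline sign-up rate, the 30% lift, and the daily traffic figure are made-up numbers for illustration:

```python
from statistics import NormalDist
import math

def sample_size_per_variant(baseline, lift, alpha=0.05, power=0.8):
    """Approximate users needed per variant for a two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + lift)          # expected rate if the hypothesis holds
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(((z_alpha + z_beta) ** 2 * variance) / (p2 - p1) ** 2)

# Hypothetical numbers: 10% baseline sign-up rate, 30% relative lift expected
n = sample_size_per_variant(baseline=0.10, lift=0.30)
daily_visitors_per_variant = 500        # assumed traffic per variant per day
print(n, "users per variant, roughly",
      math.ceil(n / daily_visitors_per_variant), "days")
```

The smaller the expected lift (or the lower the baseline rate), the more users and therefore the more days the test needs.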

Remember the list of assumptions

Every hypothesis has a list of underlying assumptions. It is important to identify the riskiest ones and validate them before investing engineering resources in your test. You can do this through a simple paper prototype with your users, or through the Wizard of Oz method.

A “Wizard of Oz” MVP gives the impression of full functionality: users feel they are using a real feature, but things are handled manually behind the scenes. This saves development effort because you build only the UI, without any back-end code.

Testing a hypothesis

A/B testing

This is one of the easiest ways to validate a hypothesis. You roll out your change to half your users and withhold it from the other half. If the success metric changes significantly in the group that got the change, relative to the group that didn’t, the hypothesis is validated.
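
In practice, the split is usually deterministic: hash the user’s ID so the same person always lands in the same group. A minimal sketch; the experiment name is just an illustrative placeholder:

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "signup_form_v2") -> str:
    """Deterministically bucket a user into 'control' or 'treatment'."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100      # 0-99, roughly uniform
    return "treatment" if bucket < 50 else "control"

print(assign_variant("user_42"))        # same user always sees the same variant
```

Hashing, rather than flipping a coin on every request, keeps a user’s experience consistent across sessions, which matters for metrics like sign-ups.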

Multivariate testing

When there are several versions to test, a sequential approach can be limiting. For instance, if a hypothesis states that reducing the number of images on a screen increases CTR by 25%, you’d want to find out which image count actually validates your hypothesis. You can do this by splitting users into multiple variants and testing them simultaneously.
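
One common way to compare several variants in one go is a chi-square test on each variant’s click counts. A sketch with invented numbers, assuming scipy is available:

```python
from scipy.stats import chi2_contingency

# Hypothetical results: [clicked, did not click] per image-count variant
observed = [
    [120, 880],   # 5 images
    [150, 850],   # 3 images
    [170, 830],   # 1 image
]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2={chi2:.2f}, p={p_value:.4f}")
if p_value < 0.05:
    print("At least one variant's CTR differs significantly from the others.")
```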

Time-based testing

Sometimes, seasonality and other external factors can cause a change in your success metric. How can you then tell whether the change you tested was the real reason behind the surge or drop in your metric? Time-based testing can significantly reduce the effect of such factors.

You introduce the change to your audience for some time, then turn it off for the same amount of time, and repeat this cycle for a longer duration.
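
A small sketch of what the analysis could look like, assuming you log the metric daily and alternate the change on and off week by week. The daily sign-up counts below are invented:

```python
from statistics import mean

# Hypothetical daily sign-up counts, alternating one week on / one week off
days = [
    {"day": i, "change_on": (i // 7) % 2 == 0, "signups": s}
    for i, s in enumerate([52, 55, 49, 61, 58, 40, 38,    # week 1: change on
                           44, 47, 42, 50, 49, 33, 30,    # week 2: change off
                           56, 54, 51, 63, 60, 42, 39,    # week 3: change on
                           45, 48, 41, 52, 47, 35, 31])   # week 4: change off
]

on  = [d["signups"] for d in days if d["change_on"]]
off = [d["signups"] for d in days if not d["change_on"]]
print(f"avg sign-ups with change: {mean(on):.1f}, without: {mean(off):.1f}")
```

Because both conditions see every part of the cycle (weekdays, weekends, campaigns), a consistent gap between the “on” and “off” averages is harder to explain away as seasonality.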

Things to consider when testing a hypothesis

User sample

When you divide your user base into variants, make sure there’s no inherent bias with respect to your success metric. If one of the variants starts out with noticeably better values of your success metric, that head start will cascade through your experiment and skew the results.
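
One simple guard is an A/A-style check: compare the baseline value of your success metric across the buckets before the change ships. A sketch with made-up pre-experiment numbers and an arbitrary tolerance:

```python
# Hypothetical pre-experiment data: (sign-ups, visitors) per bucket
baseline = {"control": (480, 5000), "treatment": (540, 5000)}

rates = {name: conv / visits for name, (conv, visits) in baseline.items()}
print(rates)   # {'control': 0.096, 'treatment': 0.108}

# If the buckets already differ noticeably before any change ships,
# re-randomize (or stratify the assignment) rather than carrying the
# bias into the test. The 0.005 tolerance here is an arbitrary example.
if abs(rates["control"] - rates["treatment"]) > 0.005:
    print("Warning: buckets are not balanced on the baseline metric.")
```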

Measuring significance

You need to make sure your test results are an actual effect of the changes you made and not a random fluctuation. This is done by measuring the statistical significance of your end result, a concept worth reading up on before you draw conclusions.
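
If your success metric is a conversion rate, a two-proportion z-test is one standard way to get that significance figure. A minimal sketch using only Python’s standard library, with invented counts:

```python
from statistics import NormalDist
import math

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical results: control 500/5000 sign-ups, treatment 600/5000
p = two_proportion_p_value(500, 5000, 600, 5000)
print(f"p-value = {p:.4f}")   # well below 0.05, so unlikely to be a random fluke
```

The smaller the p-value, the less likely it is that the observed difference is just noise; a common (if arbitrary) bar is 0.05.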

To recap:

  1. Make your hypotheses specific around your problem statement.
  2. List your risky assumptions, and validate them.
  3. Choose the best testing approach based on your end goal.
  4. Be wary of the user sample you choose and the significance of your test results.
