Practical Guide for Product Hypothesis Testing using Excel

Abhijit Singh · Published in Bootcamp · 6 min read · Mar 11, 2024

Convert your intuition to evidence using Excel and Statistics

In the world of product management and constant innovation, separating gut feelings from solid evidence is key to success. This process is rooted in decision science: turning straightforward data into strategic decisions. It all begins with a hypothesis, an educated guess about how things might work. Before diving into the statistics used to test these hypotheses, let’s first cover the basics. This guide is tailored for founders and product managers aiming to make more data-informed decisions in their roles.

What is a Product Hypothesis?

A hypothesis in product development and product management is a statement or assumption about the product, a planned feature, the market, or the customer (e.g., their needs, behavior, or expectations) that you can test, evaluate, and use to guide further decisions. Crucially, a hypothesis arises from a position of limited knowledge and data, so it requires empirical testing to be validated or refuted.

The Difference Between an Idea and a Hypothesis

Understanding the distinction between an idea and a hypothesis is critical in product management. An idea is an open-ended suggestion (“let’s redesign the checkout page”), whereas a hypothesis reframes that idea as a testable, falsifiable statement tied to a measurable outcome (“redesigning the checkout page will reduce cart abandonment by 10%”).

Armed with a clear understanding of what constitutes a hypothesis, let’s delve into the nuanced practice of hypothesis testing, where statistical rigor meets product creativity.


The Core of Hypothesis Testing

Hypothesis testing is a statistical method employed to ascertain how likely it is that a hypothesis regarding a product feature or user experience holds true. The process begins with the formulation of two hypotheses: the null hypothesis (H0) and the alternative hypothesis (H1).

- Null Hypothesis (H0): There is no effect or difference.

- Alternative Hypothesis (H1): There is a significant effect or difference.

Defining Success Metrics

Success metrics quantitatively define what success looks like and map directly to your hypothesis. For instance, click-through rate (CTR) is the natural metric when testing the impact of a button color change.

Crafting and Conducting the Experiment

Experimental design is pivotal: it outlines how variations will be compared and how their impact will be observed. A/B testing is a common approach, in which users are randomly assigned to experience different variations, as in the sketch below.
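
As a concrete illustration of random assignment, here is a minimal Python sketch. The hash-based bucketing and the `assign_variant` name are illustrative assumptions, not a prescribed implementation:

```python
import hashlib

def assign_variant(user_id: str) -> str:
    """Deterministically assign a user to variation A or B.

    Hashing the user ID (rather than drawing a random number per page view)
    keeps a returning visitor in the same bucket across sessions.
    """
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # map the hash to 0..99
    return "A" if bucket < 50 else "B"  # 50/50 split

# Sanity check on a synthetic population: the split should be roughly even
variants = [assign_variant(f"user_{i}") for i in range(10_000)]
print(variants.count("A"), variants.count("B"))
```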

The Statistical Backbone: Statistical Tests and Beyond

Various statistical tests are available, each suited to a different use case (a Python sketch of several of them follows the list):

1. T-Test: Compare means between two groups.

2. Paired T-Test: Compare means within the same group at different points.

3. ANOVA (Analysis of Variance): Compare means among more than two groups.

4. Regression Analysis: Examine the relationship between dependent and independent variables.

5. Correlation Analysis: Measure the strength and direction of the linear relationship between two variables.

6. Cointegration Test: Test if two time series are cointegrated, indicating a long-term relationship.

7. Stationarity Test: Check if a time series is stationary.

8. Chi-Square Test: Test the independence of categorical variables.

9. Jarque-Bera Test: Test if a set of data follows a normal distribution.

10. Bootstrap Test: Estimate the sampling distribution of a statistic through resampling.
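
For readers who want to sanity-check outside Excel, here is a sketch of how each test above maps to a scipy or statsmodels call. All arrays are synthetic and purely illustrative:

```python
import numpy as np
from scipy import stats
from statsmodels.tsa.stattools import adfuller, coint

rng = np.random.default_rng(42)
a = rng.normal(10, 2, 100)   # e.g. daily CTRs under variation A
b = rng.normal(11, 2, 100)   # e.g. daily CTRs under variation B
c = rng.normal(12, 2, 100)   # a third group, for ANOVA

print(stats.ttest_ind(a, b))               # 1. t-test: means of two groups
print(stats.ttest_rel(a, b))               # 2. paired t-test: same group, two time points
print(stats.f_oneway(a, b, c))             # 3. one-way ANOVA: three or more groups
print(stats.linregress(a, b))              # 4. simple linear regression
print(stats.pearsonr(a, b))                # 5. Pearson correlation

walk1 = np.cumsum(rng.normal(0, 1, 200))   # two independent random walks
walk2 = np.cumsum(rng.normal(0, 1, 200))
print(coint(walk1, walk2)[1])              # 6. Engle-Granger cointegration p-value
print(adfuller(a)[1])                      # 7. ADF stationarity-test p-value

table = np.array([[30, 70], [45, 55]])     # clicks vs. no-clicks per variation
print(stats.chi2_contingency(table))       # 8. chi-square test of independence
print(stats.jarque_bera(a))                # 9. Jarque-Bera normality test

# 10. bootstrap: resample to estimate the sampling distribution of the mean
boot_means = [rng.choice(a, size=a.size, replace=True).mean() for _ in range(1000)]
print(np.percentile(boot_means, [2.5, 97.5]))  # 95% bootstrap CI for the mean of a
```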

Let’s work through a practical example using our favourite tool, Excel.

Step 1: Define Your Hypothesis

- Hypothesis: Changing the “Contact Us” button color from green to blue will result in a higher click-through rate (CTR).

- Null Hypothesis: Changing the “Contact Us” button color from green to blue will not result in a higher CTR.

This setup presumes that Variation B (blue button) is not inherently superior to Variation A (green button). The goal is to determine statistically whether Variation B improves CTR over Variation A.

Step 2: Set Up Your Experiment

- Create two webpage versions: one with a green “Contact Us” button (Variation A) and another with a blue “Contact Us” button (Variation B).

- Randomly divide website visitors into two groups to ensure unbiased exposure to each version.

- Record the number of clicks on the “Contact Us” button for each group over a predetermined period, such as one week.

Step 3: Get the Data

You can ask your data engineer for this data, or, if you know your way around a tracking tool like GA4, Clarity, or Mixpanel, you can pull it yourself, provided the tracker is correctly configured on your website.
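
If you export the tracked events as a CSV, loading it for analysis is a one-liner. The file name below is a hypothetical placeholder; the column names follow the data description in the next section:

```python
import pandas as pd

# Hypothetical export from GA4 / Clarity / Mixpanel: one row per day,
# with visitor counts and button clicks per variation.
df = pd.read_csv("button_experiment_export.csv", parse_dates=["Date"])
print(df.head())
```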

Step 4: Run the t-Test

After gathering the data, we will perform a t-test, a standard tool for evaluating whether the difference between two groups is significant. The formula for the t-statistic is (a Python translation follows the definitions below):

t = (x̄1 − x̄2) / sqrt(s1²/n1 + s2²/n2)

where:

  • t is the t-statistic.
  • x̄1 is the mean of the first sample.
  • x̄2 is the mean of the second sample.
  • s1² is the variance of the first sample.
  • s2² is the variance of the second sample.
  • n1 is the number of observations in the first sample.
  • n2 is the number of observations in the second sample.
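
To make the formula concrete, here is a minimal Python sketch that implements it directly. The function name and sample numbers are illustrative assumptions, not taken from the sheet:

```python
import numpy as np

def two_sample_t(x1, x2):
    """t-statistic per the formula above: (x̄1 − x̄2) / sqrt(s1²/n1 + s2²/n2).

    Note: Excel's "equal variances" t-test pools s1² and s2² instead;
    the two versions coincide when n1 == n2.
    """
    x1, x2 = np.asarray(x1, dtype=float), np.asarray(x2, dtype=float)
    n1, n2 = x1.size, x2.size
    var1, var2 = x1.var(ddof=1), x2.var(ddof=1)  # sample variances s1², s2²
    return (x1.mean() - x2.mean()) / np.sqrt(var1 / n1 + var2 / n2)

print(two_sample_t([4.8, 5.1, 5.0], [5.6, 5.9, 5.7]))  # illustrative numbers
```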

The Data:

The data captures the performance of the two variations of the website’s “Contact Us” button (green and blue) over the test period.

  1. Date: The date of data collection.
  2. Visitor Count: Total users on the website during that period.
  3. GreenBtn_Clicks_Visitors: Clicks on the green “Contact Us” button, showing engagement with the original design.
  4. BlueBtn_Clicks_Visitors: Clicks on the blue “Contact Us” button, showing engagement with the altered design.
  5. CTR (click-through rate) = (visitors with button clicks (green or blue) / visitor count) × 100
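
A minimal sketch of the CTR calculation in Python, using made-up numbers in place of a real export:

```python
import pandas as pd

# Illustrative numbers only; substitute the columns from your own export.
df = pd.DataFrame({
    "Visitor Count": [1000, 1200, 950],
    "GreenBtn_Clicks_Visitors": [48, 61, 45],
    "BlueBtn_Clicks_Visitors": [57, 70, 55],
})
df["Green_CTR"] = df["GreenBtn_Clicks_Visitors"] / df["Visitor Count"] * 100
df["Blue_CTR"] = df["BlueBtn_Clicks_Visitors"] / df["Visitor Count"] * 100
print(df[["Green_CTR", "Blue_CTR"]])
```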

Now use Excel’s built-in Data Analysis ToolPak to run a “t-Test: Two-Sample Assuming Equal Variances” on the two CTR columns.

The ToolPak output reports the group means, variances, observations, degrees of freedom, the t statistic, and one- and two-tail p-values.

In this output, the p-value is 0.01455039, which is less than the significance level of 0.05.
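
The same test can be reproduced in Python as a cross-check. The CTR values below are placeholders; substitute the daily CTR columns from the sheet to match the p-value above:

```python
from scipy import stats

# Mirrors Excel's "t-Test: Two-Sample Assuming Equal Variances".
green_ctr = [4.8, 5.1, 5.0, 4.7, 5.2, 4.9, 5.0]  # placeholder daily CTRs
blue_ctr = [5.6, 5.9, 5.7, 5.5, 6.0, 5.8, 5.7]
t_stat, p_value = stats.ttest_ind(green_ctr, blue_ctr, equal_var=True)
print(f"t = {t_stat:.4f}, p = {p_value:.6f}")
```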

Get the Excel sheet from here: <https://abhijit.objectstore.e2enetworks.net/hypothesis_testing/website_hypothesis_testing.xls>

Step 5: Interpreting the Insights

How to Read the Results

If the p-value is less than the significance level (for example, 0.05), the null hypothesis can be rejected, and we can say that the change in button color had a statistically significant impact on CTR. Otherwise, we cannot reject the null hypothesis, and we conclude that the change in button color did not significantly affect CTR.
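
In code, this decision rule is a single comparison (alpha and the p-value are taken from the example above):

```python
alpha = 0.05          # chosen significance level
p_value = 0.01455039  # from the Excel output above

if p_value < alpha:
    print("Reject H0: the color change had a statistically significant effect on CTR.")
else:
    print("Fail to reject H0: no statistically significant effect detected.")
```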

In this example, the test group’s (blue button) mean CTR is greater than the control group’s (green button), and the difference is statistically significant because the p-value from the t-test is below 0.05 at the 95% confidence level. We can therefore reject the null hypothesis and conclude that switching the “Contact Us” button from green to blue positively impacted CTR.

So, the next time you find yourself scratching your head over a “unique” feature suggestion, you’ve got the perfect comeback: “Let’s see what the data says!” This approach transforms those “Hmm, are you sure?” moments into “Ah-ha! Let’s test it” opportunities.

Imagine this: A stakeholder comes up with an idea that makes you wonder if they’ve been reading too much science fiction. Instead of a flat-out “No way,” you can now say, “Interesting idea! Let’s put it through our hypothesis testing process and see if the data backs it up.” It’s a win-win: you keep the peace, and who knows? Sometimes, the most out-there ideas turn out to be gold mines, as long as the data agrees.
