Q#90: Testing user conversion

Published in

Foundational Data Science: Interview Questions

2 min read · Aug 15, 2023

Given the following dataset, can you determine whether there's a significant difference in conversion rate between the test and control groups? The relevant columns in the table are conversion and test. The conversion column has values of 0 and 1, which represent whether the user converted (1) or not (0). The test column also has values of 0 and 1: 0 for the control group and 1 for the test group.

TRY IT YOURSELF

ANSWER

We have another t-test problem, this time to check whether users in the test group convert at a different rate than users in the control group. We know the drill: let's use the Python scipy package!

Loading the Data

We'll use the pd.read_csv function to load the dataset from the provided URL.

import pandas as pd

url = "https://raw.githubusercontent.com/erood/interviewqs.com_code_snippets/master/Datasets/test_table_truncated.csv"
data = pd.read_csv(url)

Analyzing Conversion Rates

Now, let's calculate the conversion rates for both the test and control groups and then determine if there's a significant difference between them.

# Grouping data by 'test' and calculating mean conversion rate
conversion_rates = data.groupby('test')['conversion'].mean()

# Printing conversion rates
print("Conversion Rates:")
print(conversion_rates)
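To make the groupby step concrete, here is a minimal sketch on a tiny made-up DataFrame. The values and the name toy are illustrative assumptions, not taken from the dataset. It also reports group sizes and raw conversion counts, which are useful context when judging a difference in rates.

```python
import pandas as pd

# Hypothetical toy data, for illustration only (not the real dataset)
toy = pd.DataFrame({
    "test":       [0, 0, 0, 0, 1, 1, 1, 1],
    "conversion": [0, 0, 0, 1, 0, 1, 1, 1],
})

# For a 0/1 column, the mean is the share of 1s, i.e. the conversion rate
summary = toy.groupby("test")["conversion"].agg(
    users="count", conversions="sum", conversion_rate="mean"
)
print(summary)
```

The same agg call should work on the real data by swapping toy for data.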

T-test Hypothesis Testing

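To determine if there's a significant difference in conversion rates between the test and control groups, we'll perform a hypothesis test. The null hypothesis is that the two groups have the same conversion rate. Since conversion is a 0/1 column, each group's mean is its conversion rate, so we can use a two-sample t-test to compare the means of the two independent groups. We'll use Welch's variant (equal_var=False), which does not assume that the two groups have equal variances.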

from scipy import stats

# Extracting data for test and control groups
test_group = data[data['test'] == 1]['conversion']
control_group = data[data['test'] == 0]['conversion']

# Performing Welch's t-test (equal_var=False does not assume equal variances)
t_stat, p_value = stats.ttest_ind(test_group, control_group, equal_var=False)

# Printing t-statistic and p-value
print("T-statistic:", t_stat)
print("P-value:", p_value)
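For intuition about what the t-statistic measures, the sketch below computes Welch's t-statistic by hand and compares it with scipy's result. The arrays and the helper name welch_t are hypothetical and chosen only for illustration; they are not the real dataset.

```python
import numpy as np
from scipy import stats

# Hypothetical 0/1 outcomes, for illustration only (not the real dataset)
control = np.array([0, 0, 0, 1, 0, 0, 1, 0, 0, 0])
test = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0])

def welch_t(a, b):
    """Welch's t-statistic: difference in means divided by its standard error."""
    se2_a = a.var(ddof=1) / len(a)  # squared standard error of the mean of a
    se2_b = b.var(ddof=1) / len(b)  # squared standard error of the mean of b
    return (a.mean() - b.mean()) / np.sqrt(se2_a + se2_b)

manual_t = welch_t(test, control)
scipy_t, scipy_p = stats.ttest_ind(test, control, equal_var=False)

print("Manual t:", manual_t)
print("SciPy t: ", scipy_t)
```

The two values should agree up to floating-point error, which shows that the statistic is the gap between the group conversion rates scaled by how noisy that gap is.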

Interpreting the Results

The p-value obtained from the t-test indicates the probability of observing the obtained difference in conversion rates (or a more extreme difference) if there were no true difference between the test and control groups. A lower p-value suggests stronger evidence against the null hypothesis.

We can set a significance level (e.g., 0.05) to determine whether the difference is statistically significant. If the p-value is less than the chosen significance level (which it is for us), we reject the null hypothesis and conclude that there is a significant difference in conversion rates between the groups.
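The decision rule can be sketched as a small helper. The function name is_significant and the example p-values are hypothetical; in practice you would pass the p_value returned by the t-test.

```python
def is_significant(p_value, alpha=0.05):
    """Return True when the p-value falls below the significance level."""
    return p_value < alpha

# Hypothetical p-values, for illustration only
for p in (0.003, 0.2):
    verdict = "reject" if is_significant(p) else "fail to reject"
    print(f"p = {p}: {verdict} the null hypothesis at alpha = 0.05")
```

Note that alpha should be chosen before looking at the data, and that failing to reject the null hypothesis is not proof that the groups are identical.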



Written by Abish Pius