A/B Testing: Analysis of Credit Card Marketing Campaign

Kailash Hari

Published in

Analytics Vidhya

8 min read · Oct 25, 2019

Data Analysis: Credit Card Marketing Campaign, by Kailash and Pooja

GitHub link with the complete Python notebook and the UCI bank marketing dataset: https://github.com/kailash14

Scenario

A large consumer bank has recently run a direct marketing campaign by creating a video ad for credit card offers to acquire more credit card applications.
During the campaign, they also ran a split test for landing pages. The control page is their default text-based page, while the test page features a new marketing video.

Dataset Description

· Dataset Name: bank_direct_marketing.csv

· Number of rows: 100000

· Each row represents the demographics and the response of the recipients.

· Number of Variables: 12

· Variable names:

Demographics of users: age, job, marital_status, education, gender
Response of users: suggested, day_of_week, test, frequency, page_views, prev_y, y

Objective

To analyze the A/B test and determine whether the test page (with the video ad) performed better than the control page (text-based).

Data Dictionary:

age: type — Numerical
Age of the recipient.

job: type — Nominal Categorical
Job category of the recipient.

marital: type — Nominal Categorical
Marital status of the recipient.

education: type — Nominal Categorical
Education level of the recipient.

gender: type — Nominal Categorical
Gender of the recipient (male is 1, female is 0).

suggested: type — Nominal Categorical
Whether the recipient is outside of the bank’s original targeting parameters (i.e. suggested by algorithms).

test: type — Nominal Categorical
Which landing page the recipient saw (the control page is 0, the test page with video is 1).

day_of_week: type — Ordinal Categorical
The day that the recipient saw the ad.

frequency: type — Numerical
The number of times the recipient had seen the ads from the campaign.

page_views: type — Numerical
The number of pages the recipient had viewed on the bank’s website in the 90-day period prior to seeing the ad.

prev_y: type — Nominal Categorical.
Whether the recipient had applied for a credit card after receiving a previous campaign.
0- Did not apply for the credit card.
1- Applied for the credit card.
2- The recipient has never received a previous campaign.

y (Target variable): type — Nominal Categorical
Whether the recipient applied for the card.
0- Not applied
1- Applied
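
The coded columns in the data dictionary can be turned back into readable labels with a small lookup table. The sketch below is illustrative only: the names `CODE_LABELS` and `decode_row` and the example row are assumptions, not taken from the notebook, while the code-to-label pairs come from the dictionary above.

```python
# Code-to-label mappings copied from the data dictionary above.
# CODE_LABELS and decode_row are illustrative names, not from the notebook.
CODE_LABELS = {
    "gender": {0: "female", 1: "male"},
    "test": {0: "control (text page)", 1: "test (video page)"},
    "prev_y": {
        0: "did not apply",
        1: "applied",
        2: "never received a previous campaign",
    },
    "y": {0: "did not apply", 1: "applied"},
}


def decode_row(row):
    """Return a copy of `row` with coded columns replaced by readable labels."""
    return {
        column: CODE_LABELS.get(column, {}).get(value, value)
        for column, value in row.items()
    }


# Hypothetical example row, not taken from the dataset.
example = {"age": 34, "gender": 1, "test": 0, "prev_y": 2, "y": 1}
decoded = decode_row(example)
print(decoded)
```

Columns without an entry in the lookup table, such as `age`, pass through unchanged.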

What is A/B Testing:

A/B testing, also known as split testing, is an experiment in which you “split” your audience to test variations of a campaign feature and determine which one performs better.

What we need:

To run an A/B test, you need to create two versions of the same content (a control and a test version) that differ in a single variable. You then show these two versions to two similarly sized audiences over a period long enough to support a reliable decision, and determine which one performs better on the target variable.

Detailed steps of A/B Testing:

Before the A/B Test:

A) Pick the feature to test: You may want to test more than one variable, but to evaluate how effective a change is, you need to isolate one “independent variable” and measure its performance. Otherwise, you can’t be sure which variable was responsible for a change in performance.

Example of something to test: email subject lines.

B) Frame hypothesis statements: You can measure many metrics in every test, but it is important to choose one primary metric to focus on before you run the test, and to frame a hypothesis statement around it so that predictions and results can be tested.

C) Create control and test: With the independent variable, the dependent variable, and the desired outcome identified, use this information to set up the control and test pages.

D) Determine your sample size: If you’re A/B testing an email, you’ll probably want to send the test to a smaller portion of your list, one still large enough to give statistically significant results. Eventually, you’ll pick a winner and send the winning variation to the rest of the list.

During the A/B Test:

E) Test both variations simultaneously. Timing plays a significant role in your marketing campaign’s results, whether it’s a time of day, day of the week, or month of the year. If you were to run Version A during one month and Version B a month later, how would you know whether the performance change was caused by the different designs or the different months? When you run A/B tests, you’ll need to run the two variations at the same time, otherwise, you may be left second-guessing your results.

F) Give the A/B test enough time to produce useful data: Let the test run long enough to obtain a substantial sample size. Otherwise, it will be hard to tell whether there is a statistically significant difference between the two variations. How long is long enough depends on your traffic and on the size of the difference you want to detect.
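
For the sample size and run-time steps above, a rough estimate of how many recipients each group needs can be computed before the test starts. The sketch below assumes a two-sided two-proportion z-test with the usual normal approximation. The function name, the default `alpha` and `power`, and the 10% and 12% conversion rates are illustrative assumptions, not figures from the campaign.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_control, p_test, alpha=0.05, power=0.80):
    """Approximate recipients needed per group for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    p_bar = (p_control + p_test) / 2               # average rate under the null hypothesis
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_power * math.sqrt(p_control * (1 - p_control) + p_test * (1 - p_test))
    ) ** 2
    return math.ceil(numerator / (p_control - p_test) ** 2)


# Hypothetical rates: 10% on the control page, 12% hoped for on the test page.
n = sample_size_per_group(0.10, 0.12)
print(n)
```

Dividing the required size by the expected daily traffic per group gives a rough lower bound on how long the test has to run. Smaller expected differences need considerably larger groups.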

A/B Testing

A/B testing (also known as split testing or bucket testing) is a method of comparing two versions of a webpage or app against each other to determine which one performs better. A/B testing is essentially an experiment where two or more variants of a page are shown to users at random, and statistical analysis is used to determine which variation performs better for a given conversion goal.

How A/B Testing Works

In an A/B test, you take a webpage or app screen and modify it to create a second version of the same page. This change can be as simple as a single headline or button or be a complete redesign of the page. Then, half of your traffic is shown the original version of the page (known as the control) and half are shown the modified version of the page (the variation).
As visitors are served either the control or variation, their engagement with each experience is measured and collected in an analytics dashboard and analyzed through a statistical engine. You can then determine whether changing the experience had a positive, negative, or no effect on visitor behavior.
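
One common way to split traffic between the control and the variation is to hash a visitor identifier, so that each visitor always sees the same version. This is a minimal sketch of that idea and not necessarily how the bank assigned recipients. The function name, the experiment label, and the 50/50 share are assumptions.

```python
import hashlib


def assign_variant(user_id, experiment="landing_page_video", test_share=0.5):
    """Deterministically assign a visitor to 'control' or 'test'."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 8 hex digits to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_share else "control"


# Hypothetical visitor ids 0..9999.
assignments = [assign_variant(user_id) for user_id in range(10_000)]
test_fraction = assignments.count("test") / len(assignments)
print(round(test_fraction, 3))
```

Because the assignment depends only on the identifier and the experiment label, repeated visits land in the same group, and a new experiment label reshuffles visitors independently of earlier tests.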

Distribution of Data

- Univariate Analysis of Variables to understand the overall distribution of data.
- Also to understand what kind of recipients are included in the campaign.
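
The percentages quoted in the observations below are simple category shares. A minimal sketch of that calculation, using only the standard library, is shown here. The function name and the toy values are assumptions and are not taken from the dataset.

```python
from collections import Counter


def percentage_distribution(values):
    """Return {category: percentage of rows}, most common category first."""
    counts = Counter(values)
    total = sum(counts.values())
    return {
        category: round(100 * count / total, 1)
        for category, count in counts.most_common()
    }


# Hypothetical toy values, not taken from the dataset.
marital = ["married", "single", "married", "divorced",
           "single", "married", "single", "married"]
distribution = percentage_distribution(marital)
print(distribution)
```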

Distribution of Age :

Observations :

Almost 83% of the recipients belong to the age group of 20-50 years.

Distribution of Marital Status:

Observations :

Given the age distribution, the population mostly falls into the 'Single' and 'Married' classes.

Distribution of Gender:

Observations :

Both genders are almost equally represented in the campaign.

Distribution of day of the week:

Observations :

As this is a US-based bank, only weekdays (i.e. Monday to Friday) are included in the campaign.
- All the days have an almost equal share of recorded responses (i.e. about 20% each).

Note: None of the variables above shows a notable pattern that helps explain the users' responses to the campaign.

Distribution of Education sector:

Observation :

Recipients with a 'School' or 'Graduate' level of education make up the bulk of the campaign, up to 91% of the campaign population.

Distribution of job sector:

Observations :

'Mid' and 'low' income recipients make up 76% of the campaign population.

Distribution of frequency:

Observation :

84% of the recipients have either not viewed the campaign ad or have viewed it at most 3 times.

Previous Y:

Observations :

This variable represents the customers' responses to the previous campaign.
- Only about 14% of the population is known (i.e. their previous campaign response was recorded).
- The remaining 86% of the campaign population is new and unknown.

Distribution of Test:

Observations :

The split between test and control is not equal.
- 63.5k of the campaign population were sent the test (i.e. video-based link) and only 36.5k were sent the control (i.e. text-based link).

Distribution of target variable (Y):

Observations :

The overall success rate of the campaign is 11.2%

Identifying which factors Contributed:

Education Sector Success Rate Distribution

- To understand which factors really affected the recipients' responses, we considered the variables that showed a noticeable variation in the data.

Observations :

All education sectors show a success rate of 10-14%.
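
A per-category success rate like the one above is the share of rows with y = 1 within each category. The sketch below shows one way to compute it with the standard library. The function name and the toy rows are assumptions, and the resulting rates are not the campaign's figures.

```python
from collections import defaultdict


def success_rate_by(rows, column, target="y"):
    """Return {category: share of rows with target == 1} for one column."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for row in rows:
        totals[row[column]] += 1
        successes[row[column]] += row[target]  # target is coded 0 or 1
    return {category: successes[category] / totals[category] for category in totals}


# Hypothetical toy rows, not taken from the dataset.
rows = [
    {"education": "School", "y": 0},
    {"education": "School", "y": 1},
    {"education": "Graduate", "y": 0},
    {"education": "Graduate", "y": 0},
    {"education": "Graduate", "y": 1},
    {"education": "Graduate", "y": 0},
]
rates = success_rate_by(rows, "education")
print(rates)
```

The same function works for any categorical column, for example `job` or `test`.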

Hypothesis testing:

• The following information is provided:

- The simple random sample size for Group 1 (N1) = 30000

- The simple random sample size for Group 2 (N2) = 30000

- The number of favorable cases in Group 1 = 4417

- The number of favorable cases in Group 2 =

- The calculated sample proportion = 0.73, and

- The significance level is alpha = 0.05 (i.e. a 95% confidence level)
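
Both groups are listed with a simple random sample of 30000 even though the original split is unequal (63.5k versus 36.5k). One way to draw equal-sized samples from each group is sketched below. It assumes sampling without replacement with a fixed seed. The function name, the seed, and the toy population are illustrative and may differ from what the notebook does.

```python
import random


def balanced_samples(rows, group_column="test", size=30_000, seed=0):
    """Draw a simple random sample of `size` rows from each group, without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    groups = {}
    for row in rows:
        groups.setdefault(row[group_column], []).append(row)
    # random.Random.sample raises ValueError if a group has fewer than `size` rows.
    return {group: rng.sample(members, size) for group, members in groups.items()}


# Hypothetical toy population with an uneven 70/30 split, not the real data.
population = (
    [{"test": 1, "y": i % 2} for i in range(70)]
    + [{"test": 0, "y": 0} for _ in range(30)]
)
samples = balanced_samples(population, size=20)
print({group: len(sample) for group, sample in samples.items()})
```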

• Test Statistics

The z-statistic is computed as follows:
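
Assuming the standard pooled two-proportion z-test (an assumption; the notebook may compute it differently), with x_1 and x_2 denoting the number of favorable cases in Group 1 and Group 2, the statistic takes the form:

```latex
\hat{p}_1 = \frac{x_1}{N_1}, \qquad
\hat{p}_2 = \frac{x_2}{N_2}, \qquad
\hat{p} = \frac{x_1 + x_2}{N_1 + N_2}

z = \frac{\hat{p}_1 - \hat{p}_2}
         {\sqrt{\hat{p}\,(1 - \hat{p})\left(\dfrac{1}{N_1} + \dfrac{1}{N_2}\right)}}
```

Under the null hypothesis that both pages convert at the same rate, z approximately follows a standard normal distribution for samples of this size.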

After undersampling both groups to the same size to reduce the bias from the unequal split, the results show that the z value is greater than the critical z value. We therefore reject the null hypothesis in favor of the alternative, which indicates that the video-based page gives better results.
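
The comparison of the z value with the critical z value can be sketched as follows, assuming a pooled two-proportion z-test with a one-sided alternative (the first group converts better than the second). The function name is illustrative, and the counts are hypothetical placeholders rather than the campaign's results.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, one-sided p-value for p1 > p2)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common rate under the null hypothesis
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical counts for illustration only; they are not the campaign's results.
z, p_value = two_proportion_z_test(x1=150, n1=1000, x2=100, n2=1000)
z_critical = NormalDist().inv_cdf(0.95)  # one-sided test at alpha = 0.05
reject_null = z > z_critical
print(round(z, 2), round(z_critical, 2), reject_null)
```

A two-sided test would instead compare the absolute value of z with `NormalDist().inv_cdf(0.975)`.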

To confirm this result, we took the variable most correlated with the target among the numerical variables and among the categorical variables, and used a z-score test to check whether each agrees with the finding that the video-linked page gives better results.

CONCLUSION:

The test page (with video) is providing better results than the control for the given data.

Existing customers show little difference between the test and control pages, so more text-based pages could be sent to them and the results observed.

Further data is needed for a comprehensive analysis.

Also, if you want to automate your graphs for any dataset, have a look at my video, where it is clearly explained: https://www.youtube.com/watch?v=9tanWQo4xlo

Snkailashhari

Poojamore
