Using A/B Tests to Analyze Customers' Response to a New Menu With Alteryx Designer
ABSTRACT
The objective of this project was to show how a restaurant chain's decision to introduce a new menu can be supported by reliable data on the behavior of customers who were exposed to the changes and to a new marketing campaign. The data provided for the analysis are detailed, publicly available data representing this chain. To achieve the objective, the data analysis technique known as A/B testing, or hypothesis testing, was used. This technique is very useful for testing and validating scenarios accurately when no previous data exist. To carry out the tests, the menus of a selected list of restaurants were modified, and a marketing campaign about the changes was run in their cities. These restaurants were then compared with restaurants that did not receive the changes. Alteryx Designer was used as the tool so that each algorithm did not have to be coded by hand, saving time and effort. The result was an accurate forecast of the profit the company would achieve by adopting the proposed changes.
Keywords: data warehouse, business intelligence, data analytics, A/B testing, hypothesis testing.
Let's get to it
The problem was presented by a chain of restaurants spread across the USA that wanted to know the effect of changing its menu by adding a gourmet burger and a wine list. The data were provided by an educational institution and not directly by the restaurant chain.
Not knowing if the change would have a positive financial impact, the company decided to change the menu and run an advertising campaign in two cities where its restaurants most resemble those of its other units.
To select the data analysis technique, the following image, created by Alteryx for Udacity's data analysis nanodegree, was used. It visually represents the logic behind choosing an analysis technique for a given business problem.
There was a business problem that required predicting a result (Predict Outcome), and there was no existing data with which to perform the analysis (Data Poor). Therefore, the technique used was A/B testing, or hypothesis testing.
The test period was twelve weeks, starting on April 29, 2016 and ending on July 21, 2016. The data were aggregated weekly, with each week starting on Friday and ending on Thursday. The metric used for the calculation was the gross sales margin. Two inputs were provided. The first was the list of stores with all their characteristics (StoreID, square feet, average monthly sales, name, telephone, combined address, city, state, postal code, region, country, coordinates, timezone, timezone offset, timezone of parameter). The second was a file with all sales from all stores from early 2015 to the end of 2016, containing the sales details (StoreID, invoice number, invoice date, SKU, category, product, quantity, size, gross margin and sales).
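The Friday-to-Thursday weekly aggregation can be illustrated outside Alteryx with a few lines of Python. This is only a minimal sketch: the function names and the `(store_id, invoice_date, gross_margin)` row layout are assumptions made for the example and are not part of the original workflow.

```python
from collections import defaultdict
from datetime import date, timedelta

TEST_START = date(2016, 4, 29)  # Friday, first day of the test period
TEST_END = date(2016, 7, 21)    # Thursday, last day of the test period


def week_start(day: date) -> date:
    """Return the Friday that opens the Friday-to-Thursday week containing `day`."""
    # date.weekday(): Monday is 0, Friday is 4, Sunday is 6.
    days_since_friday = (day.weekday() - 4) % 7
    return day - timedelta(days=days_since_friday)


def weekly_gross_margin(rows):
    """Sum gross margin per (store_id, week_start) for rows inside the test period.

    `rows` is an iterable of (store_id, invoice_date, gross_margin) tuples;
    this layout is a hypothetical simplification of the sales file.
    """
    totals = defaultdict(float)
    for store_id, invoice_date, gross_margin in rows:
        if TEST_START <= invoice_date <= TEST_END:
            totals[(store_id, week_start(invoice_date))] += gross_margin
    return dict(totals)
```

Keying each total by the Friday that opens its week keeps the twelve test weeks aligned with the business calendar described above.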
Step 1: Data preparation
The process starts with data preparation: cleaning the data and adding the required parameters. The store list and the store transactions are taken as inputs, and the transactions are split by the start and end dates of the test and control periods. The transaction data and the store list are then joined for each of the two time groups and, if desired, the remaining data can be filtered further. Finally, the items of each invoice are counted and its sales values are summed, giving aggregated totals per invoice for each time group to be evaluated.
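The per-invoice aggregation described above can be sketched in plain Python as follows. The dictionary keys and the function name are hypothetical, and counting invoice lines (rather than summing quantities) is an assumption about what "counting the items" means here.

```python
from collections import defaultdict


def aggregate_invoices(lines):
    """Collapse invoice lines into one record per (store_id, invoice).

    `lines` is an iterable of dicts with the hypothetical keys
    'store_id', 'invoice', 'sales' and 'gross_margin'.
    """
    totals = defaultdict(lambda: {"items": 0, "sales": 0.0, "gross_margin": 0.0})
    for line in lines:
        record = totals[(line["store_id"], line["invoice"])]
        record["items"] += 1                       # one invoice line = one item (assumption)
        record["sales"] += line["sales"]           # total sales value of the invoice
        record["gross_margin"] += line["gross_margin"]
    return dict(totals)
```

Running this once on the test-week transactions and once on the control-week transactions would give the two aggregated data sets mentioned in this step.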
The result of this data preparation is two files, one containing the data from the test weeks and the other containing data from the control weeks.
In the second phase of data preparation, the store transaction data were used again. They were filtered by the treatment dates and labeled by store group to make filtering easier after preparation. The store data were then joined and aggregated to the desired granularity, and the irrelevant characteristics were dropped, producing the store traffic data. Next, the list of treatment stores defined by the business was combined with the store traffic data, generating a store list file with each store's characteristics and weekly sales volume.
At the end of this flow, the characteristics evaluated for matching stores, in addition to trend and seasonality, were RoundRoasterStore, square feet and average monthly sales.
According to the correlation matrix plot (generated at the end of the flow in image 6), the variable Sq_Ft has a low correlation with the target variable (shown in white in the matrix) and should not be used in the analysis. Therefore, the variables used were Trend, Seasonality and Average Month Sales.
In the next step, the characteristics of each store were compared to define two control stores for each treatment store. The treatment stores are the stores where the menu changes and the marketing campaign were introduced; the control stores are stores where no changes were made and whose characteristics closely match those of the treatment stores. For this purpose, the store traffic file was used together with the store list: the candidate control stores are filtered and their characteristics are compared with those of the treatment stores.
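The kind of screening that a correlation matrix supports can be illustrated with a hand-written Pearson correlation. This is a minimal sketch and not the computation Alteryx performs internally; the function names are invented for the example, and no cut-off value is implied because the original analysis does not state one.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(features, target):
    """Return (name, r) pairs sorted by absolute correlation with `target`.

    `features` maps a variable name to its per-store values (hypothetical layout).
    """
    scored = [(name, pearson(values, target)) for name, values in features.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)
```

A variable that ends up at the bottom of this ranking with a coefficient near zero, as Sq_Ft did in the matrix, is a candidate for exclusion from the matching step.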
Comparing the previously mentioned characteristics, two control stores are selected for each treatment store.
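One common way to pair stores on several characteristics is a nearest-neighbor search on standardized values. The sketch below assumes that approach; it is not taken from the Alteryx workflow, and the feature names, function names and the decision to let a control store be reused by more than one treatment store are all assumptions.

```python
from math import dist
from statistics import mean, pstdev


def standardize(stores, features):
    """Return {store_id: [z-scores]} so that every feature has a comparable scale."""
    stats = {}
    for feature in features:
        values = [store[feature] for store in stores.values()]
        spread = pstdev(values)
        stats[feature] = (mean(values), spread if spread > 0 else 1.0)
    return {
        store_id: [(store[f] - stats[f][0]) / stats[f][1] for f in features]
        for store_id, store in stores.items()
    }


def match_controls(stores, treatment_ids, features, k=2):
    """Pick the k closest non-treatment stores for each treatment store."""
    z = standardize(stores, features)
    control_ids = [sid for sid in stores if sid not in treatment_ids]
    matches = {}
    for t in treatment_ids:
        # Euclidean distance in standardized feature space.
        ranked = sorted(control_ids, key=lambda c: dist(z[t], z[c]))
        matches[t] = ranked[:k]
    return matches
```

With `features` set to hypothetical columns such as `["trend", "seasonality", "avg_month_sales"]`, `match_controls` returns two control store IDs per treatment store, mirroring the two-to-one pairing used in this analysis.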
Step 2: Analysis
Once the stores are paired, the analysis is performed. Each treatment store is compared with its two control stores over the time interval defined by the business.
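To make the comparison concrete, the sketch below computes a lift per treatment store and a one-sample t statistic over those lifts. It assumes that each store has a gross margin total for the control weeks and for the test weeks, and that lift is defined as the treatment store's growth minus the average growth of its controls. Both the definition and the function names are illustrative assumptions, not the formula used by Alteryx.

```python
from math import sqrt
from statistics import mean, stdev


def growth(test_total, baseline_total):
    """Relative change in gross margin from the control weeks to the test weeks."""
    return test_total / baseline_total - 1.0


def store_lifts(margins, matches):
    """Lift per treatment store: its growth minus the mean growth of its controls.

    `margins` maps store_id -> (baseline_total, test_total);
    `matches` maps treatment_id -> list of control store ids.
    """
    lifts = {}
    for treatment_id, control_ids in matches.items():
        baseline, test = margins[treatment_id]
        control_growth = mean(
            [growth(margins[c][1], margins[c][0]) for c in control_ids]
        )
        lifts[treatment_id] = growth(test, baseline) - control_growth
    return lifts


def t_statistic(values):
    """One-sample t statistic for the hypothesis that the mean of `values` is zero."""
    n = len(values)
    return mean(values) / (stdev(values) / sqrt(n))
```

The t statistic would then be compared with a Student's t table at n - 1 degrees of freedom to judge significance; that lookup is left out here because the Python standard library does not provide the distribution.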
FINAL CONSIDERATIONS
After the A/B test analysis, changing the menu is recommended, as the results were satisfactory. The graphs show a significant difference between treatment stores and control stores.
As seen in the graphs in image 11, the results (profits) achieved by treatment stores are considerably higher than those of control stores during the testing period. These values are represented by the blue dots and blue line for treatment stores and red dots and red lines for control stores.
Analyzing the results by region, we have an increase in gross profit of 36.8% with a significance level of 99.5% for the western region.
For the central region, there is an increase in gross profit of 49.6% with a significance level of 99.6%.
Overall, the company can expect a 43.1% profit increase, with a significance level reported as 100.0%.
In this way, exploratory analysis was used to understand the data, and data preparation techniques (data wrangling) and predictive analysis (in this case, A/B testing) were used to predict the variable required by the business with the data provided.
For the business, it is recommended that the company invest in the new menu and marketing campaign throughout the country.