Published in The Startup

Using A/B Tests to Analyze Customer’s Response for New Menu With Alteryx Designer


The objective of this project was to show how data can support a restaurant chain's decision to roll out a new menu, using reliable data on the behavior of customers who were exposed to the menu changes and to a new marketing campaign. The data used in the analysis are detailed, publicly available data representing this chain. To achieve the objective, A/B testing, a form of hypothesis testing, was used. This technique is useful for testing and validating scenarios when no previous data on the change exists. To carry out the tests, the menus of a selected list of restaurants were modified, and a marketing campaign announcing the changes was run in the cities where those restaurants are located. These restaurants were then compared with restaurants that did not receive the changes. Alteryx Designer was used as the tool so that each algorithm did not have to be coded by hand, saving time and effort. The result was a forecast of the profit the company could expect from adopting the proposed changes.

Keywords: data warehouse, business intelligence, data analytics, A/B testing, hypothesis testing.

Let's get to it

The problem was presented by a chain of restaurants spread across the USA that wanted to know the effect of adding a gourmet burger and a wine list to its menu. The data were provided by an educational institution, not directly by the restaurant chain.

Not knowing if the change would have a positive financial impact, the company decided to change the menu and run an advertising campaign in two cities where its restaurants most resemble those of its other units.

To select the data analysis technique, the following image, created by Alteryx for Udacity's data analysis nanodegree, was used. It lays out the logic behind choosing an analysis technique for a given business problem.

Image 1: choosing the type of analysis

There was a business problem that required predicting a result (Predict Outcome), and there was no existing data on which to base the analysis (Data Poor). Therefore, the technique used was A/B testing (hypothesis testing).

The test period was twelve weeks, from April 29, 2016 to July 21, 2016. The data were aggregated weekly, with each week starting on Friday and ending on Thursday. The metric used for the calculation was the gross sales margin. Two files were provided: the list of stores with all their characteristics (StoreID, square feet, average monthly sales, name, telephone, combined address, city, state, postal code, region, country, coordinates, timezone, timezone offset, timezone of parameter), and a file with all sales from all stores from early 2015 to the end of 2016, containing all sales details (StoreID, invoice number, invoice date, SKU, category, product, quantity, size, gross margin and sales).

Image 2: store file
Image 3: store file
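
The analysis itself was built in Alteryx Designer, but the Friday-to-Thursday weekly aggregation described above can be sketched in plain Python for readers who want to see the logic. This is only an illustration: the function names, the tuple layout and the sample rows are assumptions, not part of the original workflow.

```python
from collections import defaultdict
from datetime import date, timedelta

FRIDAY = 4  # date.weekday(): Monday is 0, so Friday is 4


def week_start(day: date) -> date:
    """Return the Friday that opens the Friday-to-Thursday week containing `day`."""
    return day - timedelta(days=(day.weekday() - FRIDAY) % 7)


def weekly_gross_margin(rows):
    """Sum gross margin per (store_id, week_start) from (store_id, date, margin) rows."""
    totals = defaultdict(float)
    for store_id, invoice_date, gross_margin in rows:
        totals[(store_id, week_start(invoice_date))] += gross_margin
    return dict(totals)


# Made-up rows, for illustration only.
rows = [
    (1, date(2016, 4, 29), 10.0),  # Friday: first day of the test period
    (1, date(2016, 5, 5), 5.0),    # Thursday: last day of that same week
    (1, date(2016, 5, 6), 7.0),    # Friday: a new week starts
]
weekly = weekly_gross_margin(rows)
```

The first two rows fall in the week that starts on April 29, 2016, and the third opens the following week.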

Step 1: Data preparation

The process starts with data preparation: cleaning the data and adding the desired parameters. The store list and the store transactions are split by the start and end dates of the test and control periods. The transaction data and the store list are then joined for the two time groups, and unwanted records are filtered out if desired. Finally, the items on each invoice are counted and the sale values on the invoice are summed, giving aggregated totals per invoice for each time group to be evaluated.

Image 4: Data prep
Image 5: Initial pred agg

The result of this data preparation is two files, one containing the data from the test weeks and the other containing data from the control weeks.
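
As a rough equivalent of this first preparation step, the sketch below filters line items by period and aggregates them to one record per invoice. The field names, the sample lines and especially the control window are assumptions: the article gives the test dates but does not state the control dates, so `CONTROL_START` and `CONTROL_END` are placeholders.

```python
from collections import defaultdict
from datetime import date


def aggregate_invoices(lines, start, end):
    """Aggregate line items to one record per (store_id, invoice) within [start, end]."""
    totals = defaultdict(lambda: {"items": 0, "sales": 0.0, "gross_margin": 0.0})
    for line in lines:
        if start <= line["invoice_date"] <= end:
            key = (line["store_id"], line["invoice"])
            totals[key]["items"] += 1  # counts line items on the invoice
            totals[key]["sales"] += line["sales"]
            totals[key]["gross_margin"] += line["gross_margin"]
    return dict(totals)


TEST_START, TEST_END = date(2016, 4, 29), date(2016, 7, 21)  # from the article
# Hypothetical comparison window: the article does not state the control dates.
CONTROL_START, CONTROL_END = date(2015, 4, 29), date(2015, 7, 21)

# Made-up line items, for illustration only.
lines = [
    {"store_id": 1, "invoice": 100, "invoice_date": date(2016, 5, 2),
     "sales": 12.5, "gross_margin": 5.0},
    {"store_id": 1, "invoice": 100, "invoice_date": date(2016, 5, 2),
     "sales": 7.5, "gross_margin": 3.0},
    {"store_id": 1, "invoice": 50, "invoice_date": date(2015, 5, 4),
     "sales": 9.0, "gross_margin": 4.0},
]
test_invoices = aggregate_invoices(lines, TEST_START, TEST_END)
control_invoices = aggregate_invoices(lines, CONTROL_START, CONTROL_END)
```

Calling the same function once per period mirrors the two output files: one for the test weeks and one for the control weeks.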

In the second phase of data preparation, the store transaction data were used again. They were filtered by the treatment dates and labeled by store group to make later filtering easier. The store data were then joined and aggregated to the desired granularity, and the irrelevant characteristics were dropped, producing the store traffic data. Next, the list of treatment stores defined by the business was joined with the store traffic data, generating a store list file with each store's characteristics and weekly sales volume.

Image 6: generating lists

At the end of this flow, the characteristics evaluated for matching stores, in addition to trend and seasonality, were RoundRoasterStore, square feet and average monthly sales.

Image 7: Variable analysis

According to the correlation matrix plot (generated at the end of the flow in image 6), the variable Sq_Ft has a low correlation with the target variable (shown in white in the matrix) and should therefore not be used in the analysis. The variables used were Trend, Seasonality and Average Month Sales.
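
The variable screening above can be illustrated with a small Pearson correlation check. This is a sketch under assumptions: the numbers are invented, the 0.3 cutoff is arbitrary, and the article does not say which correlation measure or threshold the Alteryx tool applies.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = sqrt(var_x * var_y)
    return cov / denom if denom else 0.0  # a constant column carries no signal


def select_features(candidates, target, threshold):
    """Keep features whose absolute correlation with the target reaches the threshold."""
    return [name for name, values in candidates.items()
            if abs(pearson(values, target)) >= threshold]


# Made-up values, for illustration only.
target = [10.0, 20.0, 30.0, 40.0]
candidates = {
    "avg_month_sales": [1.0, 2.0, 3.0, 4.0],  # moves with the target
    "sq_ft": [5.0, 1.0, 1.0, 5.0],            # unrelated to the target
}
selected = select_features(candidates, target, threshold=0.3)  # arbitrary cutoff
```

With these sample values only `avg_month_sales` survives the cutoff, which is the same kind of decision the correlation matrix supports visually.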

In the next step, the characteristics of each store were compared to select two control stores for each treatment store. Treatment stores are the stores where the menu changes and the marketing campaign were introduced; control stores are stores where no changes were made and whose characteristics closely match those of the treatment stores. For this purpose, the store traffic file was used together with the store list: the candidate control stores are filtered and their characteristics are compared with those of the treatment stores.

Image 8: Pairing treatment and control stores

Comparing the previously mentioned characteristics, two control stores are selected for each treatment store.

Image 9: Stores paired
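
One common way to pair stores like this is a nearest-neighbor search on standardized features. The sketch below assumes Euclidean distance on z-scores and lets a control store be reused by more than one treatment store; the article does not state which distance measure or reuse rule the Alteryx tool applies, and the store IDs and values are made up.

```python
from math import dist
from statistics import mean, pstdev

FEATURES = ["trend", "seasonality", "avg_month_sales"]


def standardize(stores, features):
    """Return {store_id: [z-score per feature]} so no feature dominates the distance."""
    scaled = {store_id: [] for store_id in stores}
    for feature in features:
        values = [store[feature] for store in stores.values()]
        mu, sigma = mean(values), pstdev(values)
        for store_id, store in stores.items():
            scaled[store_id].append((store[feature] - mu) / sigma if sigma else 0.0)
    return scaled


def match_controls(stores, treatment_ids, features, k=2):
    """For each treatment store, pick the k closest non-treatment stores."""
    scaled = standardize(stores, features)
    control_ids = [store_id for store_id in stores if store_id not in treatment_ids]
    return {
        t: sorted(control_ids, key=lambda c: dist(scaled[t], scaled[c]))[:k]
        for t in treatment_ids
    }


# Made-up stores, for illustration only.
stores = {
    "T1": {"trend": 1.0, "seasonality": 1.0, "avg_month_sales": 100.0},
    "C1": {"trend": 1.1, "seasonality": 1.0, "avg_month_sales": 105.0},
    "C2": {"trend": 0.9, "seasonality": 1.1, "avg_month_sales": 95.0},
    "C3": {"trend": 5.0, "seasonality": 4.0, "avg_month_sales": 500.0},
}
pairs = match_controls(stores, ["T1"], FEATURES)
```

Standardizing first matters because average monthly sales is on a much larger scale than trend or seasonality and would otherwise decide every match on its own.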

Step 2: Analysis

Once the stores are paired, the analysis is performed. Each treatment store is compared with its two control stores over the time interval defined by the business.

Image 10: A/B Testing


Based on the A/B test analysis, changing the menu is recommended, as the results were satisfactory. The graphs show a significant difference between treatment stores and control stores.

Image 11: Comparison of results between treatment stores and control stores

As seen in the graphs in image 11, the results (profits) achieved by treatment stores are considerably higher than those of control stores during the testing period. Treatment stores are represented by the blue dots and blue line, and control stores by the red dots and red line.

Analyzing the results by region, the western region shows an increase in gross profit of 36.8% with a significance level of 99.5%.

Image 12: Results for western region

The central region shows an increase in gross profit of 49.6% with a significance level of 99.6%.

Image 13: Results for the central region

Overall, the company can expect a 43.1% increase in profit, with a significance level of 100.0%.

Image 14: Overall results
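
To give a feel for what lies behind a lift percentage and a significance figure, here is a minimal sketch. It is not the calculation performed by the Alteryx tool, whose internals the article does not describe: it compares treatment and control directly within the test period, and it swaps in an exact sign-flip permutation test for the significance part. All numbers are invented.

```python
from itertools import product
from statistics import mean


def lift_pct(treatment, control):
    """Percentage by which total treatment gross margin exceeds total control margin."""
    return (sum(treatment) / sum(control) - 1.0) * 100.0


def sign_flip_p_value(diffs):
    """One-sided exact sign-flip permutation test of H0: mean paired difference is 0.

    Enumerates all 2**n sign patterns, so it is only practical for small n.
    """
    observed = mean(diffs)
    flips = list(product((1, -1), repeat=len(diffs)))
    extreme = sum(
        1 for signs in flips
        if mean(s * d for s, d in zip(signs, diffs)) >= observed
    )
    return extreme / len(flips)


# Made-up weekly gross margins: treatment store vs. mean of its two controls.
treatment = [150.0, 160.0, 140.0, 150.0]
control = [100.0, 100.0, 100.0, 100.0]

lift = lift_pct(treatment, control)
diffs = [t - c for t, c in zip(treatment, control)]
p_value = sign_flip_p_value(diffs)
```

With only four paired weeks the smallest possible p-value is 1/16, which shows why a longer window such as the twelve test weeks is needed before a small p-value can even be reached.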

In summary, exploratory analysis was used to understand the data, data preparation techniques (data wrangling) to shape it, and predictive analysis (in this case A/B testing) to predict the variable required by the business from the data provided.

For the business, it is recommended that the company invest in the new menu and marketing campaign throughout the country.



Bruno Faria

Computer Scientist | Data Analyst | Beginner citizen astrophysicist | Nerd | Self defense instructor | IPSC shooter | Beginner maker |