Starbucks is one of the biggest companies in the world, famous for its coffee. Since 1971, Starbucks has expanded from a single café into more than 30,000 cafés around the world (source: https://www.cnbc.com/2019/01/07/starbucks-cafes-coffee-business.html). I thought it would be interesting to explore a business problem at a large company like Starbucks. In this article I explore the offers / advertisements Starbucks sends to its customers to find new business insights. P.S. The dataset used is not real; it is simulated data that mimics Starbucks customer behaviour.
In this project, I used 3 datasets provided by Starbucks. The first is the customer dataset, containing customer id, age, gender, membership start date and income. The other two describe the characteristics of each offer and the customers' interactions with the offers they received. Before exploring and processing the data, the datasets need to be cleaned with several actions, such as removing duplicate rows, expanding columns, encoding customer ids and converting each column to its proper data type.
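The cleaning actions listed above can be sketched in plain Python. This is only an illustration, not the code from my notebook: the field names (`id`, `became_member_on`, `channels`, and so on), the `YYYYMMDD` date format and the helper names `clean_customers` and `expand_channels` are all assumptions made for the example.

```python
from datetime import datetime


def clean_customers(rows):
    """Drop duplicate rows, encode ids as integers and cast column types.

    Field names and the YYYYMMDD date format are assumptions for this sketch.
    """
    id_map = {}   # original id -> small integer code
    seen = set()  # fingerprints of rows already kept
    cleaned = []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:  # exact duplicate, skip it
            continue
        seen.add(fingerprint)
        # New ids get the next free integer; known ids reuse their code.
        code = id_map.setdefault(row["id"], len(id_map))
        cleaned.append({
            "customer_id": code,
            "age": int(row["age"]),
            "gender": row["gender"],
            "became_member_on": datetime.strptime(
                str(row["became_member_on"]), "%Y%m%d"
            ).date(),
            "income": float(row["income"]) if row["income"] is not None else None,
        })
    return cleaned, id_map


def expand_channels(offer, all_channels=("web", "email", "social", "mobile")):
    """Expand a list-valued 'channels' field into one 0/1 column per channel."""
    expanded = {k: v for k, v in offer.items() if k != "channels"}
    for channel in all_channels:
        expanded[f"channel_{channel}"] = int(channel in offer["channels"])
    return expanded
```

Keeping the id mapping makes it possible to apply the same integer codes to the interaction data later, so that the datasets can still be joined after encoding.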
To keep this project structured and to the point, I followed several steps:
1. Set business questions
2. Data exploration
3. Explanatory Data Analysis
4. Drawing conclusions
The Business Questions
The business questions are the aim of this project. Based on an overview of the dataset, I set 2 business questions that I think are important for Starbucks to know:
1. Which campaign type is the most effective?
2. Which customer segment responds well to campaigns?
Getting an overview of the dataset is an important thing to do at the beginning of a project. It tells us the general characteristics of the data, the boundaries of the given data, etc.
Customer’s Profile Exploration
First, looking at the age and income distributions gives a general picture of current Starbucks customers.
From the age distribution, we can see that Starbucks customers range from about 20 to 100 years old. There are around 6129 female customers, fewer than the 8484 male customers. Interestingly, the mean age of both male and female customers is above 50. The income distribution of female customers is close to normal, unlike that of male customers, which is skewed to the right.
Transcript Data Exploration
The transcript dataset tracks the offers sent to each customer and also records the type of offer given to each user. The count of each offer event in this dataset is shown in the bar chart below.
In general, there are losses along the offer pipeline, which starts at offer received, then goes to offer viewed and offer completed. These counts cover all 10 offer types. About 75.67% of offers are viewed by customers and only 44.02% of offers are completed. Further processing is meaningful only if each offer type was sent a similar number of times. The count per offer type is shown below.
The graph above shows that the offer types were sent in roughly equal amounts.
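The viewed and completed percentages quoted above are simply event counts divided by the number of received offers. A minimal sketch, assuming the events are available as a flat list of the three event names (the function name `funnel_rates` is made up for this example):

```python
from collections import Counter


def funnel_rates(events):
    """Share of received offers that were viewed and that were completed."""
    counts = Counter(events)
    received = counts["offer received"]
    if received == 0:
        raise ValueError("no 'offer received' events in the data")
    return {
        "viewed_rate": counts["offer viewed"] / received,
        "completed_rate": counts["offer completed"] / received,
    }
```

Both rates use received offers as the denominator, so they describe the whole pipeline rather than the step-to-step drop.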
Explanatory Data Analysis
1. Which campaign type is the most effective?
The next step is to process the transcript data to find which offer type yields the highest conversion rate. After processing this dataset, I obtained the offer pipeline table shown below.
It is easy to see which offer has the highest completion percentage, but proving it with statistical significance takes a lot of work since there are 10 offer types. To make it simpler, I grouped the offers into 3 groups with K-means clustering, with the result below.
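To illustrate the grouping idea, here is a small K-means (Lloyd's algorithm) on a single feature. It is a sketch under assumptions: I use only the completion rate of each offer as the feature, and the starting centroids are picked deterministically from the sorted values. The notebook may use different features and a library implementation. The rates in the usage lines are made up for illustration and are not the results from this article.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Lloyd's algorithm on scalar values; returns (labels, centroids)."""
    ordered = sorted(values)
    # Deterministic start: k values spread evenly across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return labels, centroids


# Made-up completion rates for 10 hypothetical offers (illustration only).
example_rates = [0.10, 0.12, 0.11, 0.45, 0.47, 0.44, 0.46, 0.67, 0.70, 0.69]
example_labels, example_centroids = kmeans_1d(example_rates, k=3)
```

The cluster numbers themselves carry no meaning; what matters is which offers end up in the same group, so that whole groups can be compared instead of 10 individual offers.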
With the clusters formed, I need 2 steps to prove the statistical significance of the group with the highest completion percentage. I chose A/B testing to establish statistical significance.
The p-value from comparing clusters 0 and 1 is 0, which indicates that the higher completion rate of cluster 0 is statistically significant. The second test, comparing clusters 0 and 2, also gives a p-value of 0. Both tests give the same value and show that cluster 0 is statistically significantly better than the other groups.
Cluster 0 consists of offer ids 6 and 7. Both offers have similar characteristics: small rewards, the discount offer type, and distribution through web, email, social media and mobile apps.
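One common way to run such an A/B comparison between two groups is a two-proportion z-test on the completion counts. The sketch below shows that test as an assumption about how the comparison could be done; it is not necessarily the exact procedure in the notebook, and the function name is made up for the example.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z, p_value) using the pooled standard error.
    """
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("standard error is zero; proportions are degenerate")
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

With thousands of offers per group and a large gap in completion rate, the p-value becomes so small that it is printed as 0, which is consistent with the results reported here.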
2. Which customer segment responds well to campaigns?
Also from the transcript data, I extracted each user's completion rate on Starbucks offers. To find users who respond well to offers, I kept only those with an offer completion rate above 0.5. Below are the age and income distributions of customers who interact well with offers.
Male customers interact well with offers more often than female customers do. The age ranges in which customers complete offers also differ between male and female customers. I suggest focusing offers on male customers aged 40–70. For female customers, I suggest targeting the 50–65 age range.
On the other hand, the mean income of male customers is lower than that of female customers. Male customers with an income of around 40,000–75,000 respond well to the offers, while female customers who respond to offers have an income of around 50,000–80,000.
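The 0.5 completion filter used in this section can be sketched as follows. I assume here that a customer's completion rate is the number of completed offers divided by the number of received offers, and that the events come as `(customer, event)` pairs; both are assumptions for the example, as is the function name.

```python
from collections import defaultdict


def responsive_customers(events, threshold=0.5):
    """Return {customer: completion_rate} for customers above the threshold.

    Completion rate is assumed to be completed offers / received offers.
    """
    received = defaultdict(int)
    completed = defaultdict(int)
    for customer, event in events:
        if event == "offer received":
            received[customer] += 1
        elif event == "offer completed":
            completed[customer] += 1
    return {
        customer: completed[customer] / n
        for customer, n in received.items()
        if completed[customer] / n > threshold
    }
```

The comparison is strict, so a customer who completed exactly half of the received offers is not counted as a good responder.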
Customer segment with good interaction with the highest-conversion offers
This is similar to the previous process, but with an additional filter to track only offers 6 and 7. The resulting customer age and income data is shown below.
In total, 5757 customers respond well to offer ids 6 and 7, and the proportion of male customers is around 0.58. Male customers aged 40–70 complete these offers very well, while the age range of female customers with good conversion is narrower, about 50–70. The income ranges of male and female customers who complete offers 6 and 7 are similar to those of customers who complete any offer.
1. Of the 10 offers, offer ids 6 and 7 give the highest conversion rate, and the difference from the other offers is statistically significant. The completion rate is around 67.4% for offer id 6 and 69.9% for offer id 7. For the next advertising plan, I suggest discount-type offers distributed on all platforms (web, social media, mobile and email), even if the rewards are small.
2. Male customers tend to complete offers more than female customers. This is supported by the proportion of male customers among those who completed the highest-conversion offers, which is around 0.58. I suggest focusing on male customers aged 40–70 with an income of 40,000–75,000. For female customers, the recommended age range is around 50–70 with an income of 50,000–80,000.
For the technical details, see my GitHub repository.
Notebook GitHub link: https://github.com/irfanespe/EcommerceGatheringBusinessInsight
Thanks for reading my post, don’t forget to Like, Share, and Subscribe!