Corporación Favorita Grocery Sales Forecasting — Business Question
This blog post looks at business questions that can be asked of the data from the Kaggle competition Corporación Favorita Grocery Sales Forecasting. Corporación Favorita is a brick-and-mortar retailer in Ecuador. In this competition, they want Kagglers to forecast sales of the items in their stores, so that they can reduce uncertainty about demand and stock items adequately. For the analysis, I have used the following data:
Training Data: Contains daily unit sales for each store and item for the period 2013–2017. It also records whether the item was on promotion or not.
Store Data: Contains each store's city, state, type, and cluster (a grouping of similar stores).
Items Data: Contains the family, class and perishability of each item.
Transaction Data: Contains the number of transactions for each store on each day.
Oil Data: Contains daily oil prices, since Ecuador is a country whose GDP is heavily dependent on oil.
Based on my analysis, I have come up with the following business questions, which we will answer one by one:
1. What is the overall sales trend?
The sales data is aggregated to the weekly level for better visualization. The graph shows that during the Christmas period, i.e. the last few weeks of each year, overall sales rise compared with the preceding weeks. Sales follow a reasonably regular seasonal pattern with an upward trend. There is a decrease in sales in mid-2015, which can be attributed to a natural disaster that happened in Ecuador around that time of the year and also to the global oil shock.
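The weekly aggregation described above can be sketched with pandas as follows. This is a minimal illustration, not the code from the notebook: the column names `date` and `unit_sales` are assumed from the competition's training file, the helper name `weekly_unit_sales` is hypothetical, and the toy rows are made up.

```python
import pandas as pd

def weekly_unit_sales(train: pd.DataFrame) -> pd.Series:
    """Sum unit_sales into calendar weeks (pandas' default "W" weeks end on Sunday)."""
    daily = train.assign(date=pd.to_datetime(train["date"]))
    return daily.set_index("date")["unit_sales"].resample("W").sum()

# Toy rows standing in for the real training file (values are made up).
toy_train = pd.DataFrame({
    "date": ["2017-01-02", "2017-01-03", "2017-01-09", "2017-01-10"],
    "store_nbr": [1, 2, 1, 2],
    "item_nbr": [10, 10, 11, 11],
    "unit_sales": [5.0, 3.0, 2.0, 4.0],
})

weekly = weekly_unit_sales(toy_train)
print(weekly)
```

Plotting the resulting series (for example with `weekly.plot()`) gives one point per week, which is much easier to read than one point per day over several years.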
2. What are the top 10 Stores based on unit sales?
Store number 44 has the highest unit sales, while stores 47 and 3 have quite similar sales to each other.
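A ranking like this can be produced with a group-by and a sum. The sketch below assumes the training data has `store_nbr`, `item_nbr` and `unit_sales` columns; the helper `top_n_by_unit_sales` is a hypothetical name and the toy values are made up.

```python
import pandas as pd

def top_n_by_unit_sales(train: pd.DataFrame, key: str, n: int = 10) -> pd.Series:
    """Total unit_sales per value of `key`, largest first, limited to n entries."""
    return train.groupby(key)["unit_sales"].sum().nlargest(n)

# Toy rows (made-up values) with three stores.
toy_train = pd.DataFrame({
    "store_nbr": [44, 44, 47, 3, 3],
    "item_nbr": [1, 2, 1, 1, 2],
    "unit_sales": [10.0, 5.0, 8.0, 4.0, 3.0],
})

top_stores = top_n_by_unit_sales(toy_train, "store_nbr", n=2)
print(top_stores)
```

Passing `"item_nbr"` as the key instead of `"store_nbr"` gives the same kind of ranking for items.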
3. What are the top 10 items based on unit sales?
Item number 1503844 has the largest sales. Its high sales are likely due to its being a perishable item from the Produce family.
4. What are the top 10 families (categories) based on unit sales?
The largest number of unit sales comes from the Grocery 1 family. Yet, as we saw in the previous question, the item with the highest unit sales belongs to the Produce family. A likely explanation is that the ‘Grocery 1’ family contains many more items than the ‘Produce’ family, which results in higher total sales for Grocery 1.
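One way to check this explanation is to put total sales and the number of distinct items per family side by side. The sketch below is only an illustration: it assumes the items file has `item_nbr` and `family` columns, the helper `family_summary` is a hypothetical name, and the toy values (including the family labels) are made up.

```python
import pandas as pd

def family_summary(train: pd.DataFrame, items: pd.DataFrame) -> pd.DataFrame:
    """Total unit sales, distinct item count and sales per item for each family."""
    merged = train.merge(items[["item_nbr", "family"]], on="item_nbr", how="left")
    summary = merged.groupby("family").agg(
        total_unit_sales=("unit_sales", "sum"),
        distinct_items=("item_nbr", "nunique"),
    )
    summary["sales_per_item"] = summary["total_unit_sales"] / summary["distinct_items"]
    return summary.sort_values("total_unit_sales", ascending=False)

# Toy data (made up): three grocery items with modest sales, one produce item with high sales.
toy_items = pd.DataFrame({
    "item_nbr": [1, 2, 3, 4],
    "family": ["GROCERY I", "GROCERY I", "GROCERY I", "PRODUCE"],
})
toy_train = pd.DataFrame({
    "item_nbr": [1, 2, 3, 4],
    "unit_sales": [4.0, 4.0, 4.0, 9.0],
})

summary = family_summary(toy_train, toy_items)
print(summary)
```

In the toy data the grocery family wins on total sales while the produce item wins on sales per item, which is the pattern the explanation above relies on.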
5. Do oil prices affect sales?
Comparing the weekly sales graph shown above with the average oil price graph, oil prices do not appear to have affected sales much. Sales show a continuous increasing trend, whereas oil prices stayed high for a few years and then fell (2015). Even during the period when oil prices were falling, sales kept an upward trend, with a slight dip in the middle when the oil price dropped suddenly. This dip might be due to people’s fear of economic instability. In later years, once the economy became stable, sales continued their upward trend.
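The visual comparison can be backed by a simple number, for example the correlation between weekly sales and the weekly average oil price. This is a sketch under assumptions, not the method used in the notebook: the oil price column is assumed to be called `dcoilwtico`, the helper `weekly_sales_vs_oil` is a hypothetical name, and the toy values are made up. A correlation close to zero would be consistent with the reading above, but it would not by itself prove that oil has no effect.

```python
import pandas as pd

def weekly_sales_vs_oil(train: pd.DataFrame, oil: pd.DataFrame) -> pd.DataFrame:
    """Align weekly total unit sales with the weekly mean oil price."""
    sales = (train.assign(date=pd.to_datetime(train["date"]))
             .set_index("date")["unit_sales"].resample("W").sum())
    price = (oil.assign(date=pd.to_datetime(oil["date"]))
             .set_index("date")["dcoilwtico"].resample("W").mean())
    # Keep only the weeks for which both values are available.
    return pd.concat({"unit_sales": sales, "oil_price": price}, axis=1).dropna()

# Toy data (made up): one row per week for three consecutive weeks.
toy_train = pd.DataFrame({
    "date": ["2017-01-02", "2017-01-09", "2017-01-16"],
    "unit_sales": [10.0, 12.0, 14.0],
})
toy_oil = pd.DataFrame({
    "date": ["2017-01-02", "2017-01-09", "2017-01-16"],
    "dcoilwtico": [50.0, 40.0, 45.0],
})

combined = weekly_sales_vs_oil(toy_train, toy_oil)
correlation = combined["unit_sales"].corr(combined["oil_price"])
print(combined)
print(correlation)
```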
6. Which stores have the highest number of items per transaction?
Store number 51 has the highest number of items sold per transaction, yet it is not even among the top 10 stores by unit sales. From this we can say that customers who come to store 51 make bulk purchases compared with customers of store 44, which has the highest unit sales.
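Items per transaction can be computed by dividing each store's total unit sales by its total number of transactions. The sketch below assumes the transaction file has `store_nbr` and `transactions` columns; the helper `items_per_transaction` is a hypothetical name and the toy values are made up.

```python
import pandas as pd

def items_per_transaction(train: pd.DataFrame, transactions: pd.DataFrame) -> pd.Series:
    """Total unit sales divided by total transactions, per store, largest first."""
    units = train.groupby("store_nbr")["unit_sales"].sum()
    visits = transactions.groupby("store_nbr")["transactions"].sum()
    ratio = (units / visits).dropna()  # stores missing from either file are dropped
    return ratio.rename("items_per_transaction").sort_values(ascending=False)

# Toy data (made up): store 44 sells more in total, store 51 sells more per visit.
toy_train = pd.DataFrame({
    "store_nbr": [44, 44, 51],
    "unit_sales": [60.0, 40.0, 60.0],
})
toy_transactions = pd.DataFrame({
    "store_nbr": [44, 44, 51],
    "transactions": [30, 20, 10],
})

ratio = items_per_transaction(toy_train, toy_transactions)
print(ratio)
```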
7. Did promotion affect sales?
The effect of promotion on sales can be analysed by comparing the average unit sales per transaction for promoted and non-promoted items. From the graph we can say that promotion did have an effect on sales: it contributed an increase of approximately 3 items per transaction.
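A comparison of this kind can be sketched as a difference of two means. The code below makes several assumptions that are not stated in the post: the promotion flag is a column called `onpromotion`, rows where the flag is missing are left out, and "per transaction" is read as "per row of the training data". The helper `promotion_uplift` is a hypothetical name and the toy values are made up.

```python
import pandas as pd

def promotion_uplift(train: pd.DataFrame) -> dict:
    """Mean unit_sales with and without promotion, and the difference between them."""
    known = train.dropna(subset=["onpromotion"])  # ignore rows with an unknown flag
    on_promo = known["onpromotion"].astype(bool)
    promo_mean = known.loc[on_promo, "unit_sales"].mean()
    base_mean = known.loc[~on_promo, "unit_sales"].mean()
    return {"promo": promo_mean, "no_promo": base_mean, "uplift": promo_mean - base_mean}

# Toy rows (made up); the last row has no promotion flag and is ignored.
toy_train = pd.DataFrame({
    "onpromotion": [True, True, False, False, None],
    "unit_sales": [6.0, 10.0, 3.0, 5.0, 100.0],
})

result = promotion_uplift(toy_train)
print(result)
```

A plain difference of means does not control for which items, stores or dates tend to be promoted, so it is best read as a first indication rather than a causal estimate.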
8. What is the effect of promotion for one product in one location?
For this analysis I considered item number 323013 in Quito. I chose it because it had the highest number of daily promotions. From the graph we can say that promotion was very effective from mid-2015 into 2016, i.e. during the early stages. After that, people did not show much interest in the promotion.
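Narrowing the analysis to one item in one city could look roughly like the sketch below, which compares promoted and non-promoted average sales year by year. It assumes the store file has `store_nbr` and `city` columns and the training data has an `onpromotion` flag; the helper `item_city_promo_profile` is a hypothetical name, and the toy values are made up and say nothing about the real item.

```python
import pandas as pd

def item_city_promo_profile(train, stores, item_nbr, city):
    """Yearly mean unit_sales for one item in one city, split by promotion status."""
    merged = train.merge(stores[["store_nbr", "city"]], on="store_nbr", how="inner")
    subset = merged[(merged["item_nbr"] == item_nbr) & (merged["city"] == city)].copy()
    subset["year"] = pd.to_datetime(subset["date"]).dt.year
    subset["promo"] = subset["onpromotion"].map({True: "promo", False: "no_promo"})
    return subset.pivot_table(index="year", columns="promo",
                              values="unit_sales", aggfunc="mean")

# Toy data (made up): two stores in different cities, two items.
toy_stores = pd.DataFrame({"store_nbr": [1, 2], "city": ["Quito", "Guayaquil"]})
toy_train = pd.DataFrame({
    "date": ["2015-06-01", "2015-06-02", "2016-06-01",
             "2016-06-02", "2016-06-02", "2016-06-03"],
    "store_nbr": [1, 1, 1, 1, 2, 1],
    "item_nbr": [323013, 323013, 323013, 323013, 323013, 999],
    "unit_sales": [10.0, 4.0, 6.0, 5.0, 99.0, 50.0],
    "onpromotion": [True, False, True, False, True, True],
})

profile = item_city_promo_profile(toy_train, toy_stores, item_nbr=323013, city="Quito")
print(profile)
```

A shrinking gap between the `promo` and `no_promo` columns over the years would correspond to the fading interest described above.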
Note: The code can be found in the following GitHub repo: https://github.com/tsnarendran14/Udacity/blob/master/Data%20Scientist%20Nanodegree/Project/Write%20A%20Data%20Science%20Blog%20Post/Udacity%20Data%20Scientist%20blog%20your%20solution%20project.ipynb