Multivariate Time Series Restaurant Demand Forecasting

Published in

Analytics Vidhya

8 min readSep 21, 2020

What is Demand Forecasting

Demand forecasting is a combination of two words; the first one is Demand and another forecasting. Demand means outside requirements of a product or service. In general, forecasting means estimating the present for a future occurring event. It is a technique for estimation probable Demand for a product or services in the future. It is based on the analysis of past Demand for that product or service in the present market condition.

Why is Demand Forecasting important?

There are many reasons why demand forecasting is an essential process for businesses:

Demand forecasting helps with business planning, budgeting, and goal setting. Once you have a good understanding of what your future sales could look like, you can begin to develop an informed procurement strategy to make sure your supply matches customer demand.

It allows businesses to more effectively optimize inventory, increase inventory turnover rates, and reduce holding costs.

It provides an insight into upcoming cash flow, meaning businesses can more accurately budget to pay suppliers and other operational costs and invest in the growth of the company.

Anticipating Demand means knowing when to increase staff and other resources to keep operations running smoothly during peak periods.

The purpose of this article is to do a walk-through of Data Preparation and implementing a multivariate time series forecasting techniques. This implementation show several ways of forecasting multivariate demand.

So let’s dive right in.

Business Problem and Limitations -

Our dataset is for restaurant sales for Tuesday and Wednesday, both Lunch and dinner time.

There are few instances of ‘To-Go’ orders like Uber Eats in this dataset.

Typical lunch hour is 11:30 AM-2:00 PM, and dinner hour is 6:30 PM-10:00 PM.

The data set is just for Tuesday and Wednesday.
We need to expand and randomize the Data for min. of 6 months (Jan. 2019 to June 2019) for all days of the week.

A typical restaurant has high covers (number of customers) on Wednesday, Weekend Dinner, followed by Weekend Lunch, and then relatively low covers for Monday to Friday Lunch.

The data expansion/randomization should follow the above pattern for the number of customers.

Our Gole -

Predict the top ‘Menu Item’ and ‘Item Qty’ for Lunch and Dinner.
These predictions need to be for future dates (Monday to Sunday, July 1st to July 7th)

Implementation -

Importing necessary libraries such as pandas for data manipulation, NumPy uses for performing a mathematical function, and matplotlib, seaborn are data visualization libraries.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
pd.set_option(‘display.max_row’,None)
sns.set_style(‘darkgrid’)

Loading data with the help of pandas

df = pd.read_excel('R. Data.xlsx')
df.head()

1. Data Preparation and Preprocessing.

After removing unnecessary columns and imputing null values of the dataset as shown below, We are ready for data preparation as per the pattern provided above.

As in our dataset, Wednesday is given as weekend, and it’s only available weekend data, So we are using it as it is for further data preparation.

Now we are dividing data into two groups. The first one is a weekday, and another one is the weekend.

weekday = df[df['Day Type']=='Weekday']
weekend = df[df['Day Type']=='Weekend']

We are creating data for each day in a week as per the limitations and information we have.

# Monday - WeekDay
monday = weekday.copy()
monday['Day'] = weekday['Day'].str.replace('Tuesday','Monday')# Tuseday - WeekDay
tuesday = weekday.copy()# Wednesday - WeekEnd
wednesday = weekend.copy()# Thusday - WeekDay
thursday = weekday.copy()                                         
thursday['Day'] = thursday['Day'].str.replace('Tuesday','Thursday')# Friday - WeekDay
friday = weekday.copy()                                          
friday['Day'] = friday['Day'].str.replace('Tuesday','Friday')# Saturday - WeekEnd
saturday = weekend.copy()                                        
saturday['Day'] = saturday['Day'].str.replace('Wednesday','Saturday')# Sunday - WeekEnd
sunday = weekend.copy()                                          
sunday['Day'] = sunday['Day'].str.replace('Wednesday','Sunday')# Combining data of days to create one-week data.
week = []
week = pd.concat([tuesday,wednesday,thursday,friday,saturday,sunday,
monday,],axis = 0)

After creating data for one week now, we have to expand it for 6 months with the same pattern. So we are writing a function for it.

months = week.copy()
x = 0
while x < 25:
    months = pd.concat([months,week],axis = 0)
    x = x+1
months.reset_index(drop=True, inplace=True)

This is the last stage of Data preparation. Here we are creating a new column of dates for respective data and did certain preprocessing to get the proper dataset.

# Creating Dates asper range given.
o = pd.date_range(start='1/1/2019', periods=(len(months)/100), freq='D')
date = []
for i in o:
    for j in range(100):
        date.append(i)
date = pd.DataFrame(date,columns = ['Date'])# Concating Dates with our dataset.
final = pd.concat([date,months],axis = 1)# Changing Columns Positions for better understanding.
final = final[['Date', 'Shift', 'Day Type', 'Day', 'PartySize', 'MenuCateogry','MenuItem', 'ItemPrice', 'ItemQty']]
final = final.iloc[:18100,:]

The final dataset for six months we get, which has 18100 rows and 9 columns.

Here our Data Preparation is finally completed. We maintain all patterns and consider other constraints to match real-world data.

2. Visualization

Now we are doing some Exploraray Data Analysis on data to determine the top items on the menu.

1. Word Cloud

from wordcloud import WordCloud
wordcloud_sp = WordCloud(width=5000,height=2000).generate(speaker_rev_string1)
plt.axis("off")
plt.tight_layout(pad=0)
plt.imshow(wordcloud_sp)

2. Barplot

p = pd.DataFrame(product_1.groupby(['MenuItem']).sum())
p = p.reset_index()
p.sort_values(by=['ItemQty'], inplace=True,ascending=False)
plt.figure(figsize=(20,8))
chart = sns.barplot(x="MenuItem", y="ItemQty", data=p)
plt.xticks(rotation=90)

3. Model Building

To build a model, firstly, we are modifying some arrangements of the table to facilitate multivariate forecasting.

df = final[['Date','Shift','MenuItem','ItemQty']]
new = df.loc[df.index.repeat(df.ItemQty)]
new = new[['Date','Shift','MenuItem']]
table = pd.DataFrame(pd.crosstab(new.Date,[new.Shift,new.MenuItem]))
table.columns = table.columns.map('{0[1]}-{0[0]}'.format)

The dataset we get for multivariate forecasting model training is

Looking Behavior of above Data

Splitting Data into train and test
For the time series model, we can not split our data randomly. We should follow the seasonality pattern of data while breaking. So we are splitting as follows.

Train = table[:int(0.85*(len(table)))]
Test = table[int(0.85*(len(table))):]

For this Data, we are using a Data-driven time series model such as smoothing techniques.

Importing libraries necessary for model building

from statsmodels.tsa.holtwinters import SimpleExpSmoothing 
from statsmodels.tsa.holtwinters import Holt 
from statsmodels.tsa.holtwinters import ExponentialSmoothing

Now we are building different models for each item of the menu and make a data frame of respective RMSE to detect which Model is best among all.

1st Model:- Winter Exponential Smoothing with Additive Seasonality and Additive Trend Model

p = []
for i in table.columns:
 hwe_model_add_add = ExponentialSmoothing(Train[i],seasonal=”add”,trend=”add”,seasonal_periods=7).fit()
 pred_hwe_add_add = hwe_model_add_add.predict(start = Test.index[0],end = Test.index[-1])
 rmse_hwe_add_add = np.sqrt(np.mean((Test[i]-pred_hwe_add_add)**2))
 p.append(round(rmse_hwe_add_add,3))
p = pd.DataFrame(p, columns = [‘Winter_Exponential_Smoothing_RMSE’])

2nd Model:- Holt Method Model

for j in table.columns:
    hw_model = Holt(Train[j]).fit()
    pred_hw = hw_model.predict(start = Test.index[0],end = Test.index[-1])
    rmse_hw = np.sqrt(np.mean((Test[j]-pred_hw)**2))
    q.append(round(rmse_hw,3)) 
p['Holt method Model_RMSE']= pd.DataFrame(q, columns = ['Holt method Model_RMSE'])

3rd Model:- Simple Exponential Mode

r = []
for o in table.columns:
    ses_model = SimpleExpSmoothing(Train[o]).fit()
    pred_ses = ses_model.predict(start = Test.index[0],end = Test.index[-1])
    rmse_ses = np.sqrt(np.mean((Test[o]-pred_ses)**2))
    r.append(round(rmse_ses,3)) 
p['Simple Exponential Mode_RMSE']= pd.DataFrame(r, columns = ['Simple Exponential Mode_RMSE'])

RMSE Data Frame of each Model for each item.

Sum of RMSE’s Model wise.

From above, we can see that Winters Exponential Smoothing with a seasonality of 7 showing less RMSE values as compare to others, So we are selecting it as the final Model for our prediction.

4. Building Functions

We are building two different functions first one is to forecast the Demand for all items in the menu and the second one is to forecast the Demand of the top N item on the menu.

1st Function:- Forecast the Demand for all items and normalize the arrangement of Dataframe in Lunch and dinner formate.

Input parameters. -
table — Historical Dataframe after modification.
start_date — First date of the period which we want to predict.
end_date — Last date of the period which we want to predict.

def Daily_menu_forcasting(table,start_date,end_date):
    da = pd.date_range(start = start_date, end = end_date , freq='D')
    for_pred = pd.DataFrame(da,columns = ['Date'] )
    for_pred = for_pred.set_index('Date')
    for i in table.columns:
        hwe_model_add_add = ExponentialSmoothing(table[i],seasonal="add",trend="add",seasonal_periods=7).fit()
        pred_hwe_add_add = hwe_model_add_add.predict(start = for_pred.index[0],end = for_pred.index[-1])
        for_pred[i]=((round(pred_hwe_add_add)).astype(int))
    final_pred =  for_pred
    p = pd.DataFrame(final_pred.stack())
    p = p.reset_index()
    p[['MenuItem','Shift']] = p.level_1.str.split("-",expand=True,)
    p = p.rename(columns={0: "ItemQty"})
    p = p[['Date','Shift','MenuItem',"ItemQty"]]
    p = p[p['ItemQty'] != 0]
    # Makind Dataframe with dinner and lunch columns
    new = p.loc[p.index.repeat(p.ItemQty)]
    f = pd.DataFrame(pd.crosstab([new.Date,new.MenuItem],[new.Shift]))
    f = f.reset_index()# Shorting Data Frame on the basis top item
    f['Total orders of Day'] = f.Dinner + f.Lunch
    f = f.sort_values(['Date', 'Total orders of Day'], ascending=[True, False]).reset_index(drop= True)
    f
    Daily_req_FiNal_Ans = f.copy()
    return Daily_req_FiNal_Ans

2nd Function:- Forecast the Demand for top N items and normalize the arrangement of Dataframe in Lunch and dinner formate.

Input parameters —
table — Historical Dataframe after modification.
start_date — First date of the period which we want to predict.
end_date — Last date of the period which we want to predict.
N — Number of the top items we want. (default n=5)

def Daily_top_menu_forcasting(table,start_date,end_date,N):
    da = pd.date_range(start = start_date, end = end_date , freq='D')
    for_pred = pd.DataFrame(da,columns = ['Date'] )
    for_pred = for_pred.set_index('Date')
    for i in table.columns:
        hwe_model_add_add = ExponentialSmoothing(table[i],seasonal="add",trend="add",seasonal_periods=7).fit()
        pred_hwe_add_add = hwe_model_add_add.predict(start = for_pred.index[0],end = for_pred.index[-1])
        for_pred[i]=((round(pred_hwe_add_add)).astype(int))
    final_pred =  for_pred
    p = pd.DataFrame(final_pred.stack())
    p = p.reset_index()
    p[['MenuItem','Shift']] = p.level_1.str.split("-",expand=True,)
    p = p.rename(columns={0: "ItemQty"})
    p = p[['Date','Shift','MenuItem',"ItemQty"]]
    p = p[p['ItemQty'] != 0]
    # Makind Dataframe with dinner and lunch columns
    new = p.loc[p.index.repeat(p.ItemQty)]
    f = pd.DataFrame(pd.crosstab([new.Date,new.MenuItem],[new.Shift]))
    f = f.reset_index()# Shorting Data Frame on the basis top item
    f['Total orders of Day'] = f.Dinner + f.Lunch
    f = f.sort_values(['Date', 'Total orders of Day'], ascending=[True, False]).reset_index(drop= True)
    f
    # Finding Topr product for days.
    name =((f['Date'].astype(str)).unique()).tolist()
    t = pd.DataFrame(columns = f.columns)
    for i in name:
        v = pd.DataFrame((f[f['Date']==i]).head(N))
        t = pd.concat([t,v],axis = 0)
    Daily_top_FiNal_Ans = t.reset_index(drop = True)
    return(Daily_top_FiNal_Ans)

Now final Demand forecasting of items on the menu for future dates (Monday to Sunday, July 1st to July 7th) is.

For total items.

all_menu = Daily_menu_forcasting(table,'7/1/2019','7/7/2019')
all_menu.head(10)

For top N items.

# Here N = 8
top_8_menu = Daily_top_menu_forcasting(table,'7/1/2019','7/7/2019'
,8)
top_8_menu.head(10)

Conclusion -

We successfully forecast the Demand for restaurant food items on the menu for Lunch and dinner for a given time period and identify top items on the menu. This article shows one of many ways of demand forecasting implementation. In my view, it is one of the best ways of handling multivariate forecasting business problems.

For complete code, visit my Kaggle page, and please upvote if you find it helpful.- https://www.kaggle.comrestaurant-demand-forecasting

My Linkedin — https://www.linkedin.com/in/nilaydeshmukh