A Python Data Science Web App in Minutes

How to Build a SARIMAX Forecast App for End Users with Streamlit

Heiko Onnen
10 min read · Oct 5, 2021

We will build, step by step, a forecast model and deploy it — using nothing but a few added lines of Python code — as a web app, for the end users in the fictional start-up company UpJetBaggies.

Daeron Morsk, the CEO of the up-and-coming startup UpJetBaggies Inc. (aka UJBAG), paces from left to right in front of the wall-to-wall presentation screen.

UJBAG’s business strategy is to build and deploy a gargantuan fleet of pelican-sized drone copters to provide any customers who feel a sudden urge for comfort garments — sweatpants, onesies, calf warmers, you name it — with their chosen cuddle article within 22 minutes, rain or shine.

UJBAG urgently needs a forecast tool that will enable them to prepare fact-based projections on their market outlook. They have shown me the data they collect for demand projections in Excel. Their primary source on market developments consists of Google Trends data on the waxing and waning popularity of sweatpants.

UJBAG does not employ data scientists. Thus, we’ll have to provide their business analysis team, sales managers, and financial forecasters with an easily accessible platform they can use for preparing and sharing demand forecasts: without turning every team member into an experienced Python coder or Jupyter notebook expert; without installing, maintaining, and updating Python on each user’s computer; but nonetheless with easy access to advanced data science tools.

No problem. It will take, by my calculation, 42 minutes, give or take, to implement an initial forecast model as a Python web app for them. They will be able to upload a file with their time series raw data; calibrate and run a SARIMAX model; see the forecast results displayed on the intranet site, where they can be shared with the management team; and download the forecast figures as a csv or Excel file for inclusion in their financial plan.

0a. Design

Let’s look at three screenshots to get an idea of the overall structure our website will have:

  • from the file uploader widget in the sidebar on the left
  • to the interactive forecast chart and the file download button at the bottom

We start to design our website by creating a Streamlit sidebar on the left. At its top, we insert two headers and a picture.
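
In code, the top of the sidebar could look like this; the titles and the image file are placeholders:

```python
import streamlit as st

# top of the sidebar: two headers and a picture (file name is a placeholder)
st.sidebar.header("UpJetBaggies Inc.")
st.sidebar.subheader("SARIMAX demand forecaster")
st.sidebar.image("drone.png", use_column_width=True)
```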

0b. Dependencies

  • We import the Streamlit package;
  • and will mainly use the statsmodels and pmdarima libraries to set up the forecast model.
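
A minimal set of imports could look like this:

```python
import streamlit as st        # the web app framework
import pandas as pd           # data wrangling
import numpy as np
import pmdarima as pm         # stationarity tests, SARIMAX model building
import statsmodels.api as sm  # backs pmdarima's SARIMAX estimation
```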

1. Data Collection and Processing

1.1 The End User Uploads Her Raw Data

To enable our app to receive source data, we insert Streamlit’s file uploader widget. The end user will be able to upload a .csv file, from her local machine or a network drive, with her newest time series data, for which she wants to run the forecast model.

The widget displays a help icon in its upper right corner. A click on it shows the user the guidance we prepared on how to design the file and upload it.

The file uploader widget is not limited to csv files. The upload can be restricted to, or expanded to include, other file types. The method also offers an accept_multiple_files Boolean option; if set to True, the user can upload more than one file in one go.
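
A sketch of the uploader call; the label and the help text are our placeholders:

```python
# file uploader in the sidebar, restricted to csv files
uploaded_file = st.sidebar.file_uploader(
    label="upload your time series as a .csv file",
    type=["csv"],                  # extend this list to accept other file types
    accept_multiple_files=False,   # set to True to allow several files in one go
    help="column 1: dates / column 2: target y / column 3: exogenous X",
)
```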

1.2 Data Wrangling

Our app reads the uploaded file into a pandas dataframe and does some quick data wrangling: conversion of string dates to datetime type; setting a datetime index; deriving year and month as separate columns; and skipping the time part by applying the .dt.date accessor.
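
A sketch of these steps, assuming the csv columns are named date, y, and X:

```python
# read the upload and run the housekeeping steps
df = pd.read_csv(uploaded_file)
df["date"] = pd.to_datetime(df["date"])   # string dates -> datetime
df = df.set_index("date")                 # datetime index
df["year"] = df.index.year                # year and month as separate columns
df["month"] = df.index.month
df.index = df.index.date                  # keep the date, skip the time part
```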

After these housekeeping steps, the app displays the dataframe on the website via the st.dataframe or st.write methods.

Pandas’ .style.highlight options, which Streamlit’s dataframe method accepts, allow us to pour some color on certain numbers in the dataframe, for example style.highlight_max to mark the largest values in a column.
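
For instance, to flag the column maximum of the target variable:

```python
# show the dataframe and highlight the largest value in column y
st.dataframe(df.style.highlight_max(subset=["y"], color="lightgreen"))
```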

The user can scroll vertically and horizontally in the interactive dataframe. A click on the double-arrow icon in the upper right corner expands the table.

We insert an interactive line chart by using the st.plotly_chart method. Streamlit offers a catalogue of chart types. It wraps charting packages such as plotly, altair, matplotlib, and bokeh.
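
A minimal sketch of the chart call:

```python
import plotly.express as px

# interactive line chart of the target series
fig = px.line(df, y="y", title="sweatpants: Google Trends popularity")
st.plotly_chart(fig, use_container_width=True)
```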

We aggregate and filter our time series by having the app create a pivot table. Streamlit’s write method — the equivalent of the print function we use in terminal windows and Jupyter notebook cells — displays the pivot table on the website.
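
For example, a year-by-month aggregation:

```python
# pivot table: average popularity per year and month
pivot = pd.pivot_table(df, values="y", index="year", columns="month", aggfunc="mean")
st.write(pivot)
```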

1.3 Diagnostics and Transformations

Next, we use Streamlit’s metric widget to display the results of some diagnostic tests which the app is going to run on our uploaded time series.

We apply scipy’s normaltest to check if the observations are approximately normally distributed. The st.metric line takes the p-value of the test, and also displays the delta value representing its deviation from the significance level.
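
A sketch of the test and the widget; alpha and the labels are our choices:

```python
from scipy import stats

alpha = 0.05                                   # significance level
_, p_raw = stats.normaltest(df["y"].dropna())  # D'Agostino-Pearson test

# the widget shows the p-value plus its delta against alpha;
# a negative delta is rendered in red, a positive one in green
st.metric(
    label="normality test, raw series",
    value=f"p = {p_raw:.3f}",
    delta=round(p_raw - alpha, 3),
)
```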

The first Streamlit widget, to which we assign the normaltest on the raw data, returns a p-value of 0.000, a delta of -0.050 relative to the preferred result. The widget’s red color code illustrates the shortfall relative to the threshold of 0.05, the alpha significance level we chose in our code. We reject the null hypothesis that the observations follow a normal distribution.

To deal with the non-normality — which would not completely invalidate a forecast, but would return unreliable confidence intervals and test results — we apply a Box-Cox transformation and then test again for normality. The transformed time series passes the test with flying green colors: the widget displays a p-value as high as 0.997.
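
A sketch of the transformation and the re-test, reusing the alpha from above:

```python
# Box-Cox transformation (requires strictly positive values), then re-test
y_bc, lmbda = stats.boxcox(df["y"].dropna())
df.loc[df["y"].notna(), "y_bc"] = y_bc
_, p_bc = stats.normaltest(y_bc)
st.metric(
    label="normality test, Box-Cox transformed",
    value=f"p = {p_bc:.3f}",
    delta=round(p_bc - alpha, 3),
)
```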

Next, we need to determine if the time series is stationary: Are mean and variance time-invariant?

We run the ADF and KPSS tests to learn if our Box-Cox-transformed time series will require differencing to make it stationary. The tests of the pmdarima package return the recommended order of first differencing.

If ADF and KPSS disagree on the need for differencing, the series is either difference-stationary (ADF does not recommend differencing, but KPSS does, as in our case) or trend-stationary (ADF does recommend differencing, but KPSS does not). The conclusion is the same: We need to difference if either test passes a judgment of non-stationarity.

After determining the order of first differencing, the app runs the OCSB and CH tests for seasonal differencing. Both tests agree that seasonal differencing will not be required.
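
A sketch of the four tests, using pmdarima’s ndiffs and nsdiffs helpers on our monthly data (m=12):

```python
from pmdarima.arima import ndiffs, nsdiffs

y_bc_obs = df["y_bc"].dropna()

# order of first differencing recommended by ADF and KPSS
d_adf = ndiffs(y_bc_obs, test="adf")
d_kpss = ndiffs(y_bc_obs, test="kpss")
d = max(d_adf, d_kpss)   # difference if either test flags non-stationarity

# order of seasonal differencing recommended by OCSB and CH
D_ocsb = nsdiffs(y_bc_obs, m=12, test="ocsb")
D_ch = nsdiffs(y_bc_obs, m=12, test="ch")
D = max(D_ocsb, D_ch)
```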

The app collects the test results in a dataframe and displays it on the website.

Then the app follows the recommendations of the tests and calculates the differenced time series, df2[“y_bc_diff”].
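
For instance:

```python
# apply the recommended first differencing (d = 1 in our case)
df2 = df.copy()
df2["y_bc_diff"] = df2["y_bc"].diff()
```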

The Python code runs the normaltest on the differenced series and passes the result to the Streamlit widget, to make sure that the series is still approximately normally distributed. The p-value of 0.071, illustrated by the green up arrow, confirms that it is.

2. Forecast Model

2.1 Setting up the Model

In our website’s sidebar on the left, the user can select a set of parameters to control the forecast process if she wants to override the default settings.

Streamlit offers text controls such as the expander box to provide a longer guidance text to the end user.
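
For example:

```python
# an expander keeps longer guidance out of the way until the user opens it
with st.sidebar.expander("how to calibrate the forecast"):
    st.write(
        "Pick the start of the test period, then adjust the AR and MA "
        "terms if you want to override the defaults."
    )
```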

In the sidebar, the user can open Streamlit’s date_input widget to select a date and determine the number of periods she wants to reserve for testing. In the code that we have associated with the date_input widget in our app, we have limited the possible user inputs to certain minimum and maximum dates.
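
A sketch of the widget; the default value and the bounds shown here are placeholders:

```python
import datetime

# start of the test period; min/max restrict the admissible user inputs
split_date = st.sidebar.date_input(
    "start of test period",
    value=datetime.date(2020, 10, 1),
    min_value=datetime.date(2017, 1, 1),
    max_value=datetime.date(2021, 9, 1),
)
```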

The sidebar — via Streamlit’s radio buttons and selectboxes — also offers the user a choice between three different forecast models which the app can apply to the source data: SARIMAX, facebook Prophet, or LSTM.
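
For instance:

```python
# model selection via radio buttons; SARIMAX is the default
model_choice = st.sidebar.radio(
    "forecast model", options=["SARIMAX", "Prophet", "LSTM"], index=0
)
```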

After selecting the model — the default is SARIMAX — the end user can manually override the SARIMA default values for the autoregressive and moving average terms in the number_input widgets, which the app also shows in its sidebar.
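
A sketch of the two widgets; the bounds and the default values are our choices:

```python
# manual overrides for the non-seasonal AR and MA orders
p = st.sidebar.number_input("AR order p", min_value=0, max_value=5, value=1, step=1)
q = st.sidebar.number_input("MA order q", min_value=0, max_value=5, value=1, step=1)
```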

In the animation below, you see that when the user increases or decreases the AR or MA terms on the left, the SARIMAX summary is updated in real time. The information criteria in the top-right corner change; so do the coefficients, their confidence intervals, and their p-values in the bottom half of the summary table. Thus, the user can take the model that an automated hyperparameter search has come up with and play with it, for instance by excluding dubious AR or MA terms that exhibit coefficients close to zero or p-values far above 0.05.

Internally, the app uses the ARIMA method of the pmdarima package to build the SARIMAX model:
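
A minimal sketch; p, d, q, and D come from the widgets and tests above, while the seasonal MA term and the training slices y_train and X_train are our assumptions:

```python
from pmdarima.arima import ARIMA

# SARIMAX: seasonal ARIMA with the exogenous variable X
sarimax = ARIMA(
    order=(p, d, q),
    seasonal_order=(0, D, 1, 12),   # illustrative seasonal order, monthly data
    suppress_warnings=True,
)
sarimax.fit(y_train, X=X_train)     # training slices of y and the exogenous X
```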

2.2 Model Training

After the user has made her decision on the SARIMA parameters, or has accepted the default values, the app reports the SARIMAX summary — a statsmodels summary table — on the website.
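
For instance:

```python
# render the summary table as preformatted text on the page
st.text(sarimax.summary().as_text())
```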

In a scrollable Streamlit textbox, which has the capacity to accept a long text, we can provide guidance to the user on how to interpret the summary table.
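
A sketch, with placeholder wording:

```python
# a scrollable text area with interpretation hints
hints = (
    "AIC and BIC: lower is better. Coefficients with p-values far above "
    "0.05 contribute little and are candidates for removal."
)
st.text_area("how to read the summary table", value=hints, height=150)
```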

Before we proceed and calculate predictions for the training and test periods, we should define a function that will provide us with four metrics for prediction accuracy:

  • mean absolute error MAE
  • mean absolute percentage error MAPE
  • root mean square error RMSE
  • correlation coefficient between actual observations and predictions
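
A minimal sketch of such a helper, built on scikit-learn and numpy:

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error

def accuracy_metrics(actual, predicted):
    """Return the four prediction accuracy metrics as a dictionary."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    mae = mean_absolute_error(actual, predicted)
    mape = np.mean(np.abs((actual - predicted) / actual)) * 100
    rmse = np.sqrt(mean_squared_error(actual, predicted))
    corr = np.corrcoef(actual, predicted)[0, 1]
    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "correlation": corr}
```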

Final training step: the app computes the predicted values for the months in the training period.
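
For instance, with pmdarima’s predict_in_sample:

```python
# in-sample predictions for the training period, plus their accuracy
pred_train = sarimax.predict_in_sample(X=X_train)
st.write(accuracy_metrics(y_train, pred_train))
```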

2.3 Model Testing

Training completed, the app will run the SARIMAX model for the months in the test period.

Next, the website will display a dataframe that compares the prediction accuracy metrics for the training and test periods.
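
A sketch of both steps; y_test and X_test are the assumed test slices:

```python
# out-of-sample predictions for the test period
pred_test = sarimax.predict(n_periods=len(y_test), X=X_test)

# prediction accuracy of training vs. test period, side by side
metrics_df = pd.DataFrame({
    "training": accuracy_metrics(y_train, pred_train),
    "test": accuracy_metrics(y_test, pred_test),
})
st.dataframe(metrics_df)
```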

2.4 Forecast

After training and testing, the app proceeds to the forecasting part of its routine.

Before she uploaded her source data file, which contained the Google Trends actual observations of “sweatpants popularity” until September 2021, the end user had appended to the time series as many months as she wanted to get forecast values for: October 2021 through July 2022.

She also inserted her expert estimate for the exogenous variable X. X represents the peak months that exhibit extreme pants bagginess, marked with 1; extreme dips, marked with -1, when an excessive peak is followed by a deep trough; and the more normal months in between, marked with 0. She expects that the pants bagginess cycle will soon return to its stable historical seasonality, with foreseeable peak months limited to November/December.

The app concludes that it is supposed to compute forecast values, ‘y_hat’, for the months in which the csv source file shows empty cells in its second column, which contains the forecast target variable ‘y’; but filled cells in the third column, which contains the exogenous variable, the peaks ‘X’.

After running the SARIMAX model on these months, the app combines the predicted values with the actual observations in a dataframe.
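
A sketch of this step; updating the model with the test observations and reversing the Box-Cox transformation are glossed over here:

```python
# forecast the months where y is empty but the exogenous X is filled
future = df[df["y"].isna() & df["X"].notna()]
y_hat = sarimax.predict(n_periods=len(future), X=future[["X"]])

# merge the forecast values with the actual observations
df.loc[future.index, "y_hat"] = y_hat
```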

The app reports the forecast on the website from two angles:

(1) an interactive Streamlit plotly line chart

(2) a scrollable dataframe that reports the forecast values and actual observations
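
A sketch covering both views:

```python
# (1) interactive line chart of actuals and forecast
fig = px.line(df, y=["y", "y_hat"], title="sweatpants popularity: actuals and forecast")
st.plotly_chart(fig, use_container_width=True)

# (2) scrollable dataframe with both series
st.dataframe(df[["y", "y_hat"]])
```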

A Streamlit download button enables the user to retrieve the dataframe as an Excel or csv file.
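
A sketch of the csv variant:

```python
# offer the forecast table for download as a csv file
csv_bytes = df[["y", "y_hat"]].to_csv().encode("utf-8")
st.download_button(
    label="download forecast as csv",
    data=csv_bytes,
    file_name="forecast.csv",
    mime="text/csv",
)
```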

3. To Summarize …

… we can insert a textbox with explanations in our web app.

We can also show, using the st.image method, an image with a corporate logo on the app’s website. Or alternatively, we can insert the picture of a wiener dog in zen pose, because we have now completed our Python web app tutorial and can decompress.

You can find the completed demo web app on the Streamlit Cloud community tier.

The Python project file is available for download from GitHub at h3ik0th/st_medium: Streamlit demo (github.com).
