Time Series Forecasting: Predicting Coffee Prices

Dushyant Mahajan
Published in AI Skunks
13 min read · Apr 15, 2023
Photo by Jon Tyson on Unsplash

Introduction

Formal Definition: Time series forecasting is the process of analyzing time series data using statistics and modeling to make predictions and inform strategic decision-making. It’s not always an exact prediction, and the likelihood of forecasts can vary wildly, especially when dealing with the commonly fluctuating variables in time series data and factors outside our control.

In the simplest terms, time-series forecasting is a technique that utilizes historical and current data to predict future values over a period of time or a specific point in the future.

Picture this: you’re a coffee shop owner, and you’re trying to forecast the demand for your coffee products over the next few months. You know that historical sales data can provide valuable insights, but how can you use that data to predict future trends? This is where time series forecasting comes in. It’s a powerful tool that can help you make informed decisions based on the patterns and trends you’ve observed in your data.

At its core, time series forecasting involves analyzing sequential data to identify patterns and trends over time. By using historical data to predict future values, you can gain insights into future trends, identify potential risks and opportunities, and optimize resource allocation. And it’s not just coffee shops that can benefit from time series forecasting. It’s a technique that can be applied to a wide range of industries, from finance and economics to healthcare and retail.

There are various time series modeling techniques, such as ARIMA, SARIMA, and Prophet, each with its own advantages and limitations. But regardless of the technique used, time series forecasting requires careful data preprocessing and exploratory data analysis to ensure the accuracy and reliability of the models. And while time series forecasting is not perfect, and there are limitations to its accuracy, it’s a rapidly growing field with ongoing research and development of new techniques and tools.

So if you’re a business owner, analyst, or data scientist looking to gain insights into future trends and patterns, time series forecasting is essential to add to your toolkit. With its ability to provide valuable insights into future trends, it has the potential to transform various industries and help businesses make data-driven decisions that can lead to better outcomes and higher profitability.

Photo by Luke Chesser on Unsplash

Dataset

Photo by 🇸🇮 Janko Ferlič on Unsplash

We use the Daily Coffee Price dataset from Kaggle. Link to dataset: https://www.kaggle.com/datasets/psycon/daily-coffee-price?resource=download

The Daily Coffee Price dataset is a collection of historical coffee prices from various markets around the world. With data spanning several decades, the dataset provides a valuable resource for analyzing trends and patterns in the coffee market and for forecasting future prices and demand. The dataset includes variables such as opening, closing, high, and low prices, as well as trading volume and currency exchange rates. By understanding the underlying time series components of the data, such as trends, seasonality, and cyclical patterns, we can build accurate and reliable time series models that can be used for decision-making in the coffee industry.

We begin by loading the dataset, visualizing the ‘Close’ prices, and analyzing the data’s basic properties.

This is what the Daily Coffee Price dataset looks like

Our primary objective is to forecast the ‘Close’ prices of coffee.

Candlestick Plot for the input data

Resampling data

One typical task when analyzing time series data is to resample the series from one frequency to another, such as aggregating daily price observations into monthly or yearly averages. Here we resample the data from daily frequency to monthly, quarterly, and annual frequency.

Daily data resampled to monthly, quarterly, and annual frequency.
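As a rough illustration of what resampling does, the sketch below groups daily observations by calendar period and averages each group. The prices and the `resample_mean` helper are invented for this example and are not taken from the notebook; in practice, the `resample` method in pandas does the same job on a date-indexed DataFrame.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def resample_mean(observations, key):
    """Group (date, price) pairs by key(date) and average each group."""
    buckets = defaultdict(list)
    for day, price in observations:
        buckets[key(day)].append(price)
    return {period: mean(prices) for period, prices in sorted(buckets.items())}


# Hypothetical daily closing prices, invented for illustration only.
daily = [
    (date(2022, 1, 3), 100.0),
    (date(2022, 1, 4), 102.0),
    (date(2022, 2, 1), 110.0),
    (date(2022, 2, 2), 114.0),
    (date(2022, 4, 1), 120.0),
]

# The key function decides the target frequency.
monthly = resample_mean(daily, lambda d: (d.year, d.month))
quarterly = resample_mean(daily, lambda d: (d.year, (d.month - 1) // 3 + 1))
annual = resample_mean(daily, lambda d: d.year)

print(monthly)
print(quarterly)
print(annual)
```

Only the grouping key changes between frequencies, which is why moving from monthly to quarterly or annual averages is a one-line change.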

Understanding Patterns in Data

Analyzing Autocorrelation and Partial Autocorrelation:

Autocorrelation (ACF) and partial autocorrelation (PACF) are vital tools for understanding the underlying patterns in our time series data. ACF measures the correlation between a time series and a lagged version of itself, while PACF measures the correlation at a given lag after accounting for the effect of the shorter, intermediate lags. By examining the ACF and PACF plots, we can identify the appropriate lags for our forecasting models.
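To make the ACF definition concrete, here is a minimal sketch of the sample autocorrelation at a single lag. The series is invented for illustration (it repeats every four steps), and the `autocorrelation` helper is a hypothetical name; plotting libraries compute the same quantity for every lag at once. PACF needs an extra regression step and is not shown.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    avg = sum(series) / n
    denom = sum((x - avg) ** 2 for x in series)
    numer = sum(
        (series[t] - avg) * (series[t - lag] - avg) for t in range(lag, n)
    )
    return numer / denom


# Hypothetical series with a cycle of length 4, invented for illustration.
series = [1.0, 2.0, 3.0, 2.0] * 6

acf = [autocorrelation(series, k) for k in range(6)]
for k, value in enumerate(acf):
    print(f"lag {k}: {value:+.3f}")
```

The autocorrelation is strongly positive at lag 4 (one full cycle) and strongly negative at lag 2 (half a cycle), which is the kind of signature a seasonal pattern leaves in an ACF plot.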

Decomposing the Time Series:

To gain further insights into our data, we can decompose the time series into its constituent components: trend, seasonality, and residual. This helps us understand the underlying patterns and improve our forecasting models.
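One simple way to do this is classical additive decomposition, sketched below under the assumption that the series is the sum of a trend, a fixed seasonal pattern, and a residual. The helper names and the toy series are invented for illustration; library implementations handle edge cases this sketch ignores.

```python
def centered_moving_average(series, period):
    """Trend estimate; entries near the edges are None."""
    half = period // 2
    out = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        if period % 2 == 0:
            # Even period: the two end points get half weight.
            total = sum(window) - 0.5 * (window[0] + window[-1])
        else:
            total = sum(window)
        out[t] = total / period
    return out


def decompose_additive(series, period):
    trend = centered_moving_average(series, period)
    detrended = [None if tr is None else y - tr for y, tr in zip(series, trend)]
    # Average the detrended values at each position in the cycle.
    means = []
    for pos in range(period):
        vals = [
            d for i, d in enumerate(detrended)
            if d is not None and i % period == pos
        ]
        means.append(sum(vals) / len(vals))
    # Centre the seasonal effects so they sum to zero over one cycle.
    offset = sum(means) / period
    seasonal = [means[i % period] - offset for i in range(len(series))]
    residual = [
        None if tr is None else y - tr - s
        for y, tr, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, residual


# Toy series: linear trend plus a repeating pattern of length 4.
pattern = [2.0, -1.0, 0.0, -1.0]
series = [10 + 0.5 * t + pattern[t % 4] for t in range(24)]

trend, seasonal, residual = decompose_additive(series, period=4)
print(seasonal[:4])
```

On this noise-free toy series the seasonal component recovers the repeating pattern and the residual is zero; on real prices the residual shows what the trend and seasonality fail to explain.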

Stationarity

Stationary data is time series data whose mean and variance do not vary over time. The data is considered non-stationary if it shows a strong trend or seasonality.

picture from Forecasting: Principles and Practice

Which of these series are stationary?

(a) Google stock price for 200 consecutive days; (b) Daily change in the Google stock price for 200 consecutive days; (c) Annual number of strikes in the US; (d) Monthly sales of new one-family houses sold in the US; (e) Annual price of a dozen eggs in the US (constant dollars); (f) Monthly total of pigs slaughtered in Victoria, Australia; (g) Annual total of lynx trapped in the McKenzie River district of north-west Canada; (h) Monthly Australian beer production; (i) Monthly Australian electricity production.

Consider the nine series plotted in the Figure above. Which of these do you think is stationary?

Obvious seasonality rules out series (d), (h), and (i). Trends and changing levels rule out series (a), (c), (e), (f), and (i). Increasing variance also rules out (i). That leaves only (b) and (g) as stationary series.

At first glance, the strong cycles in series (g) might appear to make it non-stationary. But these cycles are aperiodic — they are caused when the lynx population becomes too large for the available feed so that they stop breeding and the population falls to low numbers, then the regeneration of their food sources allows the population to grow again, and so on. In the long term, the timing of these cycles is not predictable. Hence the series is stationary.

Is Stationarity important for time series analysis?

In most cases, it is essential. This is because many statistical analyses and models are built on the assumption that the mean and variance are constant over time.

Before we fit a model that assumes stationarity, we should check whether the data is stationary and, if it is not, remove the trend and seasonality effects from it.

Many current time series models, such as ARIMA, include options to convert the original data into stationary data, which makes our life more convenient. However, it is still beneficial to understand the stationarity of the data so that we can give better input to the model.

Checking for Stationarity:

Stationarity is an essential assumption in time series analysis. A stationary time series exhibits constant statistical properties like mean and variance over time. If our data is non-stationary, our models may produce unreliable predictions.

To test for stationarity, we can plot the ‘Close’ prices and observe if there’s an apparent trend or seasonality. Additionally, we can perform the Dickey-Fuller test, which statistically evaluates stationarity. If the test’s p-value is below a predetermined threshold (e.g., 0.05), we can reject the null hypothesis that the time series is non-stationary.
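The idea behind the test can be sketched with a simplified, non-augmented Dickey-Fuller regression: regress the change in the series on its previous level and look at the t-statistic of that coefficient. Everything below is an illustration on synthetic data, not the notebook's code. The critical value of -2.86 is the approximate large-sample 5% value for a regression with a constant, and the sketch compares against it instead of computing a p-value. In practice, `adfuller` from `statsmodels.tsa.stattools` adds lagged differences and reports the p-value directly.

```python
import math
import random


def dickey_fuller_stat(series):
    """t-statistic of rho in: diff(y)_t = alpha + rho * y_{t-1} + error."""
    x = series[:-1]                                   # y_{t-1}
    dy = [b - a for a, b in zip(series, series[1:])]  # y_t - y_{t-1}
    n = len(dy)
    x_bar, dy_bar = sum(x) / n, sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    rho = sxy / sxx
    alpha = dy_bar - rho * x_bar
    ssr = sum((di - alpha - rho * xi) ** 2 for xi, di in zip(x, dy))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return rho / std_err


# Approximate large-sample 5% critical value (constant, no trend term).
CRITICAL_5PCT = -2.86

rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(300)]           # stationary by design
trending = [0.5 * t + e for t, e in enumerate(noise)]   # clear upward trend

noise_stat = dickey_fuller_stat(noise)
trend_stat = dickey_fuller_stat(trending)

for name, stat in [("noise", noise_stat), ("trending", trend_stat)]:
    verdict = "reject non-stationarity" if stat < CRITICAL_5PCT else "cannot reject"
    print(f"{name}: statistic = {stat:.2f} -> {verdict}")
```

A statistic well below the critical value is evidence against non-stationarity; a statistic above it means the test cannot rule non-stationarity out.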

Making the Data Stationary:

If our data is non-stationary, we need to apply techniques like differencing or transformations (e.g., log, square root) to make it stationary. For example, if the first-order differencing (subtracting the previous value from the current value) results in a stationary series, we can proceed with this transformed data.
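A minimal sketch of both techniques, using invented prices that grow by 10% per step:

```python
import math


def difference(series, lag=1):
    """Subtract the value `lag` steps earlier from each value."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical prices growing 10% per step, invented for illustration.
prices = [100.0, 110.0, 121.0, 133.1]

first_diff = difference(prices)
log_diff = difference([math.log(p) for p in prices])

print(first_diff)  # the differences still grow over time
print(log_diff)    # the log differences are roughly constant
```

Here plain differencing is not enough, because the step size keeps growing with the price level. Taking logs first turns the constant percentage growth into a constant difference, which is why the two techniques are often combined.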

Model Selection and Parameter Tuning

Various time series forecasting models exist, such as ARIMA, SARIMA, and Prophet.

ARIMA (Autoregressive Integrated Moving Average): A linear model that combines the autoregressive (AR), moving average (MA), and differencing (I) components.

SARIMA (Seasonal Autoregressive Integrated Moving Average): An extension of ARIMA that includes seasonal components, making it suitable for time series with seasonal patterns.

Prophet: A forecasting model developed by Facebook that is robust to outliers and missing data and automatically handles seasonality and trend components.

Model Selection using AIC and BIC:

After analyzing the autocorrelation and partial autocorrelation, it’s time to select an appropriate forecasting model and its optimal parameters. AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) are two widely used criteria for model selection in time series analysis.

AIC and BIC balance model complexity and goodness-of-fit, helping us avoid overfitting. Lower AIC and BIC values indicate better models, considering both fit and parsimony.

For example, when working with ARIMA models, we search over various combinations of (p, d, q) parameters, where p is the order of the autoregressive term, d is the differencing order, and q is the order of the moving average term. We can choose the combination with the lowest AIC and BIC scores, as it likely represents the best trade-off between model complexity and fit.

Similarly, for other time series models, such as SARIMA or VAR, we can use AIC and BIC to guide our search for the best model and its parameters.
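Both criteria are simple functions of a fitted model's log-likelihood: AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where k is the number of estimated parameters and n the number of observations. The sketch below applies them to made-up candidates; the log-likelihoods and parameter counts are hypothetical placeholders for values a fitting library would report.

```python
import math


def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fitted candidates: (p, d, q) -> (log-likelihood, parameter count).
candidates = {
    (1, 1, 0): (-520.0, 2),
    (1, 1, 1): (-512.0, 3),
    (3, 1, 3): (-510.5, 7),
}
n_obs = 250  # assumed number of observations

scores = {
    order: (aic(ll, k), bic(ll, k, n_obs))
    for order, (ll, k) in candidates.items()
}
best_by_aic = min(scores, key=lambda order: scores[order][0])
best_by_bic = min(scores, key=lambda order: scores[order][1])

for order, (a, b) in scores.items():
    print(order, f"AIC={a:.1f}", f"BIC={b:.1f}")
print("best by AIC:", best_by_aic, "| best by BIC:", best_by_bic)
```

The largest model fits slightly better but pays for its extra parameters, so both criteria prefer the middle candidate. BIC penalizes parameters more heavily than AIC once n exceeds about 7, so the two can disagree on real data.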

We will use AIC and BIC to compare the candidate models and select the best one.

Model Evaluation

One of the most important things you need to do when evaluating a time series machine learning model is to perform error analysis. This involves calculating the error between the predicted and actual values for each data point. You can then use this information to determine how accurate the model is. There are a few different ways that you can perform error analysis. The following is a list of steps that you can use to check the accuracy of your time series models or perform error analysis:

  • Seasonality impact: Time series models are impacted by time cycles such as daily, weekly, monthly, or any recurring cycle. It is recommended to analyze prediction errors across different cycles. If the errors are seen across different cycles, one can add more time-based features to the training data set. Recall that seasonality is defined as a repeating pattern of events occurring at fixed intervals.
  • Trends analysis: Make sure your model is capable of tracking broad rises and falls. In other words, it should neither over-predict nor under-predict during rising or declining trends.
  • Model reactiveness: This is the model’s ability to react quickly to changes in data distribution that aren’t caused by a trend or cycle. Try adding short-term rolling or lagging features if the model is slow to react to sudden data distribution changes. If the model predictions change very swiftly, try adding longer-term rolling or lagging features.
  • Evaluation metrics choice: Decide whether to evaluate with mean squared error (MSE) or mean absolute error (MAE). MSE squares each error, so a few large misses dominate the score; it suits cases where reacting to sudden changes in the data distribution matters. MAE weights all errors linearly, so it suits models that are not expected to respond swiftly to such changes. If the distribution changes are unexpected and can be ignored, MAE may be a good choice. If it is important to react to them, consider MSE.
  • Holidays consideration: Make sure to check the effect of holidays on the period around them, and not just on the holiday itself. If the model struggles to perform just before and after holidays, try adding features that tell your model that it’s close to a holiday.
  • Model bias analysis: Make sure that your model is not over-forecasting or under-forecasting in a consistent manner. If that is happening, try adding more data. In addition, check the data quality, as this is a common issue that can cause the model to consistently over-forecast or under-forecast. Sometimes the data you’re working with is not in a clean format, which can cause your machine-learning model to perform poorly. Make sure you clean up your data before training your model.
  • Different machine learning algorithms: If you’ve already tried different feature engineering and your data doesn’t seem to be responding, you can try switching machine learning algorithms. Different algorithms have different strengths and weaknesses, so this could be your data’s solution.
  • Model performance in production: Make sure to check the time-series model deployed in production continues to perform well. If the model performance starts to vary too quickly, it is a good idea to consider increasing the length of the roll-forward evaluation windows.
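The metric and bias checks from the list above can be sketched in a few lines. The actual and predicted values are invented for illustration; the last point simulates a sudden jump that the model missed.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_error(actual, predicted):
    """Bias: positive means over-forecasting, negative under-forecasting."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values; the last actual price jumps and the forecast misses it.
actual = [100.0, 102.0, 101.0, 130.0]
predicted = [101.0, 101.0, 102.0, 110.0]

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
bias = mean_error(actual, predicted)

print(f"MAE = {mae}, MSE = {mse}, bias = {bias}")
```

Three of the four errors have size 1, yet the single miss of 20 accounts for almost all of the MSE, while the MAE rises far less. The negative mean error shows that, on balance, this forecast runs low.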

Exploring Advanced Models — Facebook Prophet

Facebook Prophet is an open-source algorithm for generating time-series models using a few old ideas with new twists. It is particularly good at modeling time series that have multiple seasonalities, and it avoids some of the drawbacks of other algorithms. At its core is the sum of three functions of time plus an error term: growth g(t), seasonality s(t), holidays h(t), and error e_t:

y(t) = g(t) + s(t) + h(t) + e_t

The Growth Function (and change points):

The growth function models the overall trend of the data. The old idea should be familiar to anyone with a basic knowledge of linear and logistic functions. The new idea incorporated into Facebook Prophet is that the growth trend can be present at all data points or altered at what Prophet calls “changepoints.”

Changepoints are moments where the trend in the data shifts direction. Using new COVID-19 cases as an example, a changepoint could occur when new cases begin to fall after hitting a peak once a vaccine is introduced, or when cases suddenly pick up because a new strain enters the population. Prophet can automatically detect changepoints, or you can set them yourself. You can also adjust how strongly changepoints are allowed to alter the growth function and how much of the data is considered in automatic changepoint detection.

The growth function has three main options:

  • Linear Growth: This is the default setting for Prophet. It uses a set of piecewise linear equations with differing slopes between changepoints. When linear growth is used, the growth term looks similar to the classic y = mx + b from middle school, except the slope (m) and offset (b) are variable and change value at each changepoint.
  • Logistic Growth: This setting is useful when your time series has a cap or a floor at which the values you are modeling become saturated and can’t surpass a maximum or minimum value (think carrying capacity). When logistic growth is used, the growth term looks similar to the typical equation for a logistic curve, g(t) = C / (1 + exp(-k(t - m))), except that the carrying capacity (C) can vary as a function of time, and the growth rate (k) and the offset (m) are variable and change value at each changepoint.
  • Flat: Lastly, you can choose a flat trend when there is no growth over time (but there still may be seasonality). If set to flat the growth function will be a constant value.
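The linear option can be illustrated with a small sketch of a piecewise linear trend. This is a simplified reading of the idea, not Prophet's internal code: each changepoint adds a slope adjustment, and the offset is shifted at the same moment so the line stays continuous. The function name and all parameter values are invented for illustration.

```python
def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Trend at time t with base slope k, base offset m, and a slope
    adjustment deltas[j] that applies from changepoints[j] onward."""
    slope, offset = k, m
    for s, delta in zip(changepoints, deltas):
        if t >= s:
            slope += delta
            offset -= s * delta  # keeps the line continuous at s
    return slope * t + offset


# Hypothetical trend: slope 1.0 until t = 10, then slope 0.5.
changepoints = [10]
deltas = [-0.5]

values = [piecewise_linear_trend(t, 1.0, 0.0, changepoints, deltas)
          for t in (5, 10, 20)]
print(values)
```

Before the changepoint the trend rises by 1.0 per step and afterwards by 0.5, with no jump at t = 10.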

The Seasonality Function:

The seasonality function is simply a Fourier Series as a function of time. If you are unfamiliar with Fourier Series, an easy way to think about it is the sum of many successive sines and cosines. Each sine and cosine term is multiplied by some coefficient. This sum can approximate nearly any curve or, in the case of Facebook Prophet, the seasonality (cyclical pattern) in our data. All together, for a seasonal period P and Fourier order N, it looks like this:

s(t) = sum for n = 1..N of [ a_n cos(2πnt / P) + b_n sin(2πnt / P) ]

If the above is difficult to decipher, I recommend this simple breakdown of the Fourier Series or this video on the intuition behind the Fourier series.

If you are still struggling to understand the Fourier series, do not worry. You can still use Facebook Prophet because Prophet applies a sensible default number of terms in the series, also known as the Fourier order (for example, 10 for yearly and 3 for weekly seasonality). Or if you are confident in your understanding and want more nuance, you can also choose the Fourier order based on the needs of your particular data set. The higher the order, the more terms in the series. You can also choose between additive and multiplicative seasonality.
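For readers who prefer code to formulas, here is a minimal sketch of a truncated Fourier series used as a seasonality term. The coefficients are invented for illustration; in Prophet they are learned from the data. The length of the coefficient list plays the role of the Fourier order.

```python
import math


def fourier_seasonality(t, period, coefficients):
    """Sum of a_n*cos + b_n*sin terms; `coefficients` is a list of
    (a_n, b_n) pairs, one per harmonic n = 1, 2, ..."""
    total = 0.0
    for n, (a_n, b_n) in enumerate(coefficients, start=1):
        angle = 2 * math.pi * n * t / period
        total += a_n * math.cos(angle) + b_n * math.sin(angle)
    return total


# Hypothetical coefficients for a Fourier order of 2 and a 12-step cycle.
coefficients = [(1.0, 0.5), (0.2, -0.3)]
period = 12

one_cycle = [fourier_seasonality(t, period, coefficients) for t in range(12)]
next_cycle = [fourier_seasonality(t + period, period, coefficients) for t in range(12)]
print([round(v, 3) for v in one_cycle])
```

The values repeat exactly every `period` steps, and adding more coefficient pairs lets the curve take on sharper, more detailed shapes at the cost of a higher risk of overfitting.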

The Holiday/Event Function:

The holiday function allows Facebook Prophet to adjust forecasting when a holiday or major event may change the forecast. It takes a list of dates (there are built-in dates of US holidays, or you can define your own), and when one of those dates falls in the forecast, it adds to or subtracts from the growth and seasonality terms, based on historical data for the identified holiday dates. You can also specify a range of days around each date (think the time between Christmas and New Year's, holiday weekends, Thanksgiving’s association with Black Friday/Cyber Monday, etc.).

Forecasting Future Coffee Prices

Once we have chosen the best model, we can use it to forecast future coffee prices. Visualizing these predictions alongside historical data is essential to interpret the forecasts better.

Moreover, providing confidence intervals for our predictions can help communicate the uncertainty associated with the forecasts. Most time series models, including ARIMA and Prophet, offer functionality to generate confidence intervals.

Conclusion

In this blog post, we walked you through a comprehensive, step-by-step tutorial on time series forecasting using the Daily Coffee Price dataset. We discussed various aspects of forecasting, including data preparation, stationarity, model selection, and evaluation.

Remember that time series forecasting is an iterative process; there’s always room for improvement. Keep experimenting with different models, parameters, and techniques to refine your forecasts further. Happy forecasting!

Colab notebook Link:

https://colab.research.google.com/drive/136WNAwoZ0D5zoMkkU5B8QcAiU6mjeR5c#scrollTo=yp7L4nv-M3K4&uniqifier=1

References

  1. https://www.kaggle.com/datasets/psycon/daily-coffee-price?resource=download
  2. https://otexts.com/fpp2/stationarity.html
  3. https://towardsdatascience.com/stationarity-assumption-in-time-series-data-67ec93d0f2f#:~:text=What%20is%20stationary%20data%3F,seasonality%20observed%20from%20the%20data.
  4. https://colab.research.google.com/drive/1FyhK0hhscSQqHfo227QvarzcWJAlIDDy?usp=sharing#scrollTo=ON9zL6SEdIhD
  5. https://vitalflux.com/steps-for-evaluating-validating-time-series-models/
  6. https://towardsdatascience.com/time-series-analysis-with-facebook-prophet-how-it-works-and-how-to-use-it-f15ecf2c0e3a
