Open-Source Repository of Forecasting Best Practices for Accelerating Solution Development

Francesca Lazzeri
Microsoft Azure
Apr 22, 2020

This post was co-authored by Chenhui Hu, Vanja Paunic, Hong Ooi, Tao Wu, Wee Hyong Tok.

Time series forecasting is one of the most important topics in data science. If you are a business owner, you might want to predict different kinds of future events to make better decisions and optimize your resource allocation. Typical use cases include retail sales forecasting, package shipment delay forecasting, energy demand forecasting, and financial forecasting. As you can see, forecasting is everywhere!

Given its ubiquitous nature and wide-ranging business applications, we have developed an open-source forecasting repo that puts world-class models and forecasting best practices in the hands of data scientists and industry experts — i.e., you!

Figure 1: Visualization of training and testing iterations of a sales forecasting scenario using a LightGBM model
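One common way to produce training and testing iterations like those in Figure 1 is rolling-origin evaluation: train on the history up to a cutoff, test on the next few periods, then move the cutoff forward and repeat. The snippet below is a minimal sketch of that idea, assuming an expanding training window. The function name `rolling_origin_splits` and its parameters are hypothetical and are not taken from the repository, whose notebooks may configure the rounds differently.

```python
from typing import Iterator, List, Tuple


def rolling_origin_splits(
    n_periods: int,
    first_test_start: int,
    horizon: int,
    step: int,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for successive forecast rounds.

    Hypothetical helper for illustration only. Each round trains on every
    period before the test window (an expanding window), then the test
    window moves forward by `step` periods.
    """
    if horizon <= 0 or step <= 0:
        raise ValueError("horizon and step must be positive")
    if not 0 < first_test_start < n_periods:
        raise ValueError("first_test_start must fall inside the series")

    test_start = first_test_start
    # Stop as soon as a full test window no longer fits in the series.
    while test_start + horizon <= n_periods:
        train = list(range(test_start))
        test = list(range(test_start, test_start + horizon))
        yield train, test
        test_start += step


if __name__ == "__main__":
    # Assumed setup: 12 weekly periods, forecasting 2 weeks per round.
    rounds = rolling_origin_splits(n_periods=12, first_test_start=6, horizon=2, step=2)
    for i, (train, test) in enumerate(rounds, start=1):
        print(f"round {i}: train 0..{train[-1]}, test {test[0]}..{test[-1]}")
```

Because every training index precedes every test index in each round, the model never sees data from the period it is asked to forecast.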

This repository provides examples of building forecasting solutions, presented as Python Jupyter notebooks and R Markdown files, along with a library of utility functions. Our goal is to help data scientists and machine learning engineers, whatever their level of forecasting experience, to:

  • Learn best practices for the development of forecasting solutions in a variety of languages.
  • Leverage recent advances in forecasting algorithms to build high-performance solutions and operationalize them.
  • Accelerate the solution development process for real-world forecasting problems. The provided examples simplify the path from defining the business problem to developing a solution, which can significantly reduce the “time to market”.

In the repository, you will find state-of-the-art (SOTA) forecasting models using traditional machine learning and deep learning approaches. Implementations of SOTA models in this release are centered around retail sales forecasting and are written in Python and R, two of the most popular programming languages in the forecasting domain. To enable high-throughput forecasting scenarios, we have included notebooks for forecasting multiple time series with parallel and distributed training techniques such as Ray in Python, the parallel package in R, and multi-threading in LightGBM. The following is a quick summary of forecasting models covered in this repository.

The repository also comes with Azure Machine Learning (Azure ML) themed notebooks and best practices recipes to accelerate the development of scalable, production-grade forecasting solutions on Azure. You will find the following examples for forecasting with Azure AutoML as well as tuning and deploying a forecasting model on Azure.

Developing an accurate forecasting solution can be a complex and time-consuming process. We hope the forecasting repo will help shorten your development cycle on Azure.

To Learn More and Contribute

For more information, please visit: https://github.com/microsoft/forecasting

Contributions from the open-source community are always welcome! Please feel free to check our contribution guide if you would like to contribute to the content and bring in the latest SOTA algorithms.

Additional Azure resources to learn more

To learn more, you can read the following articles and notebooks:

— — — — — — — — — — — — — — — — — — — — — — — — — —

This post was originally published at https://techcommunity.microsoft.com/t5/azure-ai/open-source-repository-of-forecasting-best-practices-for/ba-p/1298941 on April 14, 2020.

Francesca Lazzeri
Microsoft Azure

Principal Data Scientist Director @Microsoft ~ Adjunct Professor @Columbia University ~ PhD