Published in Analytics Vidhya

Keras LSTM Forecasting Using Synthetic Data

This post was originally published on my website adamnovotny.com

Summary

Keras LSTM can be a powerful tool for forecasting. Below is a simple template notebook showing how to set up a data science forecasting experiment.

Dataset

A synthetic dataset was generated using scikit-learn's regression generator make_friedman1. The dataset is nonlinear and noisy, and some features are manually scaled to make the deep learning task more challenging. Time series dependence is created by making each label a weighted average of the make_friedman1 generated value and the previous labels. For details, see the notebook function generate_data().
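The steps above can be sketched roughly as follows. This is not the notebook's generate_data(). The function name generate_data_sketch, the scaling factors, the label_weight parameter, and the one-step horizon are all assumptions made for illustration. For simplicity the sketch blends each Friedman value with only the single previous label.

```python
import numpy as np
from sklearn.datasets import make_friedman1


def generate_data_sketch(n_samples=1000, n_features=10, noise=1.0,
                         label_weight=0.5, horizon=1, seed=0):
    """Hypothetical stand-in for the notebook's generate_data()."""
    X, raw = make_friedman1(n_samples=n_samples, n_features=n_features,
                            noise=noise, random_state=seed)

    # Assumed scaling: give two features very different magnitudes.
    X = X.copy()
    X[:, 0] *= 100.0
    X[:, 1] *= 0.01

    # Each label is a weighted average of the current Friedman value
    # and the previous label, which creates time dependence.
    y = np.empty(n_samples)
    y[0] = raw[0]
    for t in range(1, n_samples):
        y[t] = label_weight * raw[t] + (1.0 - label_weight) * y[t - 1]

    # The forecasting target is the label `horizon` steps ahead (horizon >= 1).
    future_label = y[horizon:]
    return X[:-horizon], y[:-horizon], future_label


X, y, future_label = generate_data_sketch()
print(X.shape, y.shape, future_label.shape)
```

Dropping the last horizon rows keeps features, current labels, and future labels aligned row by row.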

The image below shows correlations between the generated features, the y label generated for the same time period, and the actual future_label we are trying to forecast. Features x_0 through x_4 are the only informative features, as can be verified from the bottom row, which shows meaningful but not very strong correlations.
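A correlation check of this kind could be reproduced numerically along the following lines. The label construction here (an equal-weight blend with the previous label and a one-step forecast horizon) is an assumption, so the printed values will not match the figure from the notebook.

```python
import numpy as np
from sklearn.datasets import make_friedman1

X, raw = make_friedman1(n_samples=2000, n_features=10, noise=1.0, random_state=0)

# Assumed label construction: equal-weight blend of the current
# Friedman value and the previous label.
y = np.empty_like(raw)
y[0] = raw[0]
for t in range(1, len(raw)):
    y[t] = 0.5 * raw[t] + 0.5 * y[t - 1]

future_label = y[1:]   # label one step ahead
features = X[:-1]      # features observed at the forecast origin

# Pearson correlation of each feature column with future_label.
corr = np.array([np.corrcoef(features[:, j], future_label)[0, 1]
                 for j in range(features.shape[1])])
for j, c in enumerate(corr):
    print(f"x_{j}: {c:+.3f}")
```

Pearson correlation only captures linear relationships, so a feature that enters the target nonlinearly can still show a weak coefficient.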

Model training

The model is a simple neural network with a single hidden layer defined as keras.layers.LSTM(32), that is, an LSTM layer with 32 units.
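A minimal sketch of such a model is shown below. Only keras.layers.LSTM(32) comes from the post. The windowing helper make_windows, the window length of 5, the Dense(1) output layer, the adam optimizer, and the mean squared error loss are assumptions chosen to make the example complete.

```python
import numpy as np
from tensorflow import keras


def make_windows(X, y, timesteps):
    """Stack the last `timesteps` rows of X for each target.

    Returns arrays shaped (samples, timesteps, features) and (samples,),
    which is the 3D input layout an LSTM layer expects.
    """
    windows = np.stack([X[i - timesteps + 1:i + 1]
                        for i in range(timesteps - 1, len(X))])
    return windows, y[timesteps - 1:]


def build_model(timesteps, n_features):
    model = keras.Sequential([
        keras.Input(shape=(timesteps, n_features)),
        keras.layers.LSTM(32),   # the single hidden layer from the post
        keras.layers.Dense(1),   # assumed: one regression output
    ])
    model.compile(optimizer="adam", loss="mse")  # assumed optimizer and loss
    return model


# Placeholder data, only to show the shapes involved.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10)).astype("float32")
y = rng.normal(size=(100,)).astype("float32")

X_windows, y_windows = make_windows(X, y, timesteps=5)
model = build_model(timesteps=5, n_features=X.shape[1])
model.summary()
```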

The generated dataset is split into training, validation, and test sets, each honoring the time series nature of the data. The validation set is used to stop training early to prevent overfitting. However, this is not a concern for our synthetic dataset, as can be seen from the following chart. The validation curve never starts increasing as training epochs continue.
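One way to express a time-ordered split and early stopping is sketched below. The function chronological_split, the 70/15/15 proportions, and the patience value are assumptions rather than the notebook's settings. keras.callbacks.EarlyStopping is the standard Keras callback for this purpose.

```python
import numpy as np
from tensorflow import keras


def chronological_split(X, y, train_frac=0.7, val_frac=0.15):
    """Split without shuffling so validation and test rows come later in time."""
    n = len(X)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return ((X[:train_end], y[:train_end]),
            (X[train_end:val_end], y[train_end:val_end]),
            (X[val_end:], y[val_end:]))


# Placeholder data where the row index doubles as the time step.
X = np.arange(100, dtype="float32").reshape(100, 1)
y = np.arange(100, dtype="float32")
(train_X, train_y), (val_X, val_y), (test_X, test_y) = chronological_split(X, y)
print(len(train_X), len(val_X), len(test_X))

# Stop when validation loss has not improved for `patience` epochs
# and roll back to the best weights seen so far.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)
```

The callback would then be passed to training, for example model.fit(train_X, train_y, validation_data=(val_X, val_y), callbacks=[early_stop]).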

Model evaluation

Comparing predictions and actual labels for the validation set shows strong performance, even though there is clear room for improvement near extreme values:

However, the validation set was already used during training for early stopping purposes. This is why we set aside a test dataset that the model has never seen during training. The test dataset is the only true evaluation of the expected performance of the model, and in this case it confirms that the model performs well for the synthetic dataset.
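To put numbers on the test-set comparison, error metrics can be computed on the held-out predictions. The helper evaluate_forecast and the choice of MAE, RMSE, and R² are assumptions for illustration. The post itself evaluates the model visually.

```python
import numpy as np


def evaluate_forecast(y_true, y_pred):
    """Return MAE, RMSE and R^2 for a set of forecasts."""
    # ravel() also handles predictions shaped (n, 1), as returned by model.predict.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    errors = y_pred - y_true
    mae = np.mean(np.abs(errors))
    rmse = np.sqrt(np.mean(errors ** 2))
    ss_res = np.sum(errors ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {"mae": mae, "rmse": rmse, "r2": 1.0 - ss_res / ss_tot}


# Placeholder arrays; in practice y_pred would come from model.predict(test_X).
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 4.0])
print(evaluate_forecast(y_true, y_pred))
```

Computing the same metrics on the validation and test sets makes it easy to see whether early stopping has biased the validation score.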

Notebook

Adam Novotny
