<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by azul garza ramirez on Medium]]></title>
        <description><![CDATA[Stories by azul garza ramirez on Medium]]></description>
        <link>https://medium.com/@azul.garza.ramirez?source=rss-2855bd3e0293------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*xGkkOSTE0FVwgXMpT23yfw.jpeg</url>
            <title>Stories by azul garza ramirez on Medium</title>
            <link>https://medium.com/@azul.garza.ramirez?source=rss-2855bd3e0293------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Wed, 27 May 2026 23:55:14 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@azul.garza.ramirez/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Distributed Forecast of 1M Time Series in Under 15 Minutes with Spark, Nixtla, and Fugue]]></title>
            <link>https://medium.com/data-science/distributed-forecast-of-1m-time-series-in-under-15-minutes-with-spark-nixtla-and-fugue-e9892da6fd5c?source=rss-2855bd3e0293------2</link>
            <guid isPermaLink="false">https://medium.com/p/e9892da6fd5c</guid>
            <category><![CDATA[statistics]]></category>
            <category><![CDATA[distributed-systems]]></category>
            <category><![CDATA[data-science]]></category>
            <category><![CDATA[spark]]></category>
            <category><![CDATA[time-series-forecasting]]></category>
            <dc:creator><![CDATA[azul garza ramirez]]></dc:creator>
            <pubDate>Fri, 16 Sep 2022 04:54:10 GMT</pubDate>
            <atom:updated>2022-09-16T04:57:22.514Z</atom:updated>
            <content:encoded><![CDATA[<h4>Scalable Time Series Modeling with open-source projects StatsForecast, Fugue, and Spark</h4><p>By <a href="https://medium.com/u/e8b70581a392">Kevin Kho</a>, <a href="https://medium.com/u/6926bdc4ca1e">Han Wang</a>, <a href="https://medium.com/u/76b639655285">Max Mergenthaler</a> and <a href="https://medium.com/u/2855bd3e0293">Federico Garza Ramírez</a>.</p><p><strong>TL;DR: We will show how you can leverage the distributed power of Spark and the highly efficient code from StatsForecast to fit one million models in under 15 minutes.</strong></p><p>Time-series modeling, analysis, and prediction of trends and seasonalities for data collected over time is a rapidly growing category of software applications.</p><p>Businesses, from electricity and economics to healthcare analytics, collect time-series data daily to predict patterns and build better data-driven product experiences. For example, temperature and humidity prediction is used in manufacturing to prevent defects, streaming-metric predictions help identify popular music artists, and sales forecasting for thousands of SKUs across different locations in the supply chain is used to optimize inventory costs. As data generation increases, forecasting needs have evolved from modeling a few time series to predicting millions.</p><h3>Motivation</h3><p><a href="https://github.com/Nixtla">Nixtla</a> is an open-source project focused on state-of-the-art time series forecasting. It offers several libraries, including <a href="https://github.com/Nixtla/statsforecast">StatsForecast</a> for statistical models, <a href="https://github.com/Nixtla/neuralforecast">NeuralForecast</a> for deep learning, and <a href="https://github.com/Nixtla/hierarchicalforecast">HierarchicalForecast</a> for forecast aggregations across different levels of hierarchies. 
These are production-ready time series libraries focused on different modeling techniques.</p><p>This article looks at <a href="https://github.com/Nixtla/statsforecast">StatsForecast</a>, a lightning-fast forecasting library with statistical and econometric models. Nixtla’s AutoARIMA model is 20x faster than <a href="http://alkaline-ml.com/pmdarima/">pmdarima</a>, and its ETS (error, trend, seasonal) models run 4x faster than those in <a href="https://github.com/statsmodels/statsmodels">statsmodels</a> and are more robust. The benchmarks and code to reproduce them can be found <a href="https://github.com/Nixtla/statsforecast#-accuracy---speed">here</a>. A large part of the performance gain comes from compiling the code with <a href="https://numba.pydata.org/">numba</a>, a JIT compiler.</p><p>The faster iteration time means that data scientists can run more experiments and converge to more accurate models faster. It also means that running benchmarks at scale becomes easier.</p><p>In this article, we are interested in the scalability of the StatsForecast library in fitting models over <a href="https://spark.apache.org/docs/latest/api/python/index.html">Spark</a> or <a href="https://github.com/dask/dask">Dask</a> using the <a href="https://github.com/fugue-project/fugue/">Fugue</a> library. This combination will allow us to quickly train a huge number of models in a distributed fashion over a temporary cluster.</p><h3>Experiment Setup</h3><p>When dealing with large time series data, users normally have to deal with thousands of logically independent time series (think of telemetry of different users or different product sales). In this case, we can train one big model over all of the series, or we can create one model for each series. 
Both are valid approaches since the bigger model will pick up trends across the population, while training thousands of models may fit individual series data better.</p><p><em>Note: to pick up both the micro and macro trends of the time series population in one model, check the Nixtla </em><a href="https://github.com/Nixtla/hierarchicalforecast"><em>HierarchicalForecast</em></a><em> library, but this is also more computationally expensive and trickier to scale.</em></p><p>This article will deal with the scenario where we train a couple of models (AutoARIMA or ETS) per univariate time series. For this setup, we group the full data by time series, and then train the models on each group. The image below illustrates this. The distributed DataFrame can either be a Spark or Dask DataFrame.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/730/0*HbHd-D8XmtN5F2bI.png" /><figcaption>AutoARIMA per partition — Image by Author</figcaption></figure><p>Nixtla previously released benchmarks with <a href="https://www.anyscale.com/">Anyscale</a> on distributing this model training on Ray. The setup and results can be found <a href="https://www.anyscale.com/blog/how-nixtla-uses-ray-to-accurately-predict-more-than-a-million-time-series">in this blog</a>. The results are also shown below. It took 2000 CPUs to run one million AutoARIMA models in 35 minutes. We’ll compare this against running on Spark.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/800/0*bnlD5NAslUxfTniv.png" /><figcaption>StatsForecast on Ray results — Image by author</figcaption></figure><h3>StatsForecast Code</h3><p>First, we’ll look at the StatsForecast code used to run the AutoARIMA distributedly on <a href="https://docs.ray.io/en/latest/index.html">Ray</a>. This is a simplified version to run the scenario with one million time series. 
It is also updated for the recent StatsForecast v1.0.0 release, so it may look a bit different from the code in the previous benchmarks.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/cb97e485c9e89855d7b213d584e2872b/href">https://medium.com/media/cb97e485c9e89855d7b213d584e2872b/href</a></iframe><p>The interface of StatsForecast is very minimal. It is already designed to perform the AutoARIMA on each group of data. Just supplying the ray_address will make this code snippet run distributedly. Without it, n_jobs will indicate the number of parallel processes for forecasting. model.forecast() will do the fit and predict in one step, and the input to this method is the time horizon to forecast.</p><h3>Using Fugue to run on Spark and Dask</h3><p><a href="https://github.com/fugue-project/fugue">Fugue</a> is an abstraction layer that ports Python, Pandas, and SQL code to Spark and Dask. The most minimal interface is the transform() function. This function takes in a function and a DataFrame, and runs the function on Spark or Dask. We can use the transform() function to bring StatsForecast execution to Spark.</p><p>There are two parts to the code below. First, we have the forecast logic defined in the forecast_series function. Some parameters are hardcoded for simplicity. The most important one is n_jobs=1. This is because Spark or Dask will already serve as the parallelization layer, and having two stages of parallelism can cause resource deadlocks.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/e7ce8fdf2d33500490ab3657df1af17d/href">https://medium.com/media/e7ce8fdf2d33500490ab3657df1af17d/href</a></iframe><p>Second, the transform() function is used to apply the forecast_series() function on Spark. The first two arguments are the DataFrame and function to be applied. 
Output schema is a requirement for Spark, so we need to pass it in, and the partition argument will take care of splitting the time series modelling by unique_id.</p><p>This code already works and returns a Spark DataFrame output.</p><h3>Nixtla’s FugueBackend</h3><p>The transform() above is a general look at what Fugue can do. In practice, the Fugue and Nixtla teams collaborated to add a more native FugueBackend to the StatsForecast library. Along with it is a utility forecast() function to simplify the forecasting interface. Below is an end-to-end example of running StatsForecast on one million time series.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/1ff768f9bb0d38bec1283dd881a87ed9/href">https://medium.com/media/1ff768f9bb0d38bec1283dd881a87ed9/href</a></iframe><p>We just need to create the FugueBackend, which takes in a SparkSession, and pass it to forecast(). This function can take either a DataFrame or a file path to the data. If a file path is provided, it will be loaded with the parallel backend. In the example above, we replaced the file each time we ran the experiment to generate benchmarks.</p><p>It’s also important to note that we can test locally before running forecast() on the full data. All we have to do is not supply anything for the parallel argument; everything will then run on Pandas sequentially.</p><h3>Benchmark Results</h3><p>The benchmark results can be seen below. As of the time of this writing, Dask and Ray made recent releases, so only the Spark metrics are up to date. 
We will make a follow-up article after running these experiments with the updates.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*2ovS-D5XHQcVQobK.png" /><figcaption>Spark and Dask benchmarks for StatsForecast at scale</figcaption></figure><p><em>Note: We intended to use 2000 CPUs but were limited by the compute instances available on AWS.</em></p><p>The important part here is that AutoARIMA trained one million time series models in less than 15 minutes. The cluster configuration is attached in the appendix. With very few lines of code, we were able to orchestrate the training of these time series models distributedly.</p><h3>Conclusion</h3><p>Training thousands of time series models distributedly normally takes a lot of coding with Spark and Dask, but we were able to run these experiments with very few lines of code. Nixtla’s StatsForecast offers the ability to quickly utilize all of the compute resources available to find the best model for each time series. All users need to do is supply a relevant parallel backend (Ray or Fugue) to run on a cluster.</p><p>On the scale of one million time series, total training time was 12 minutes for AutoARIMA. This is the equivalent of close to 400 CPU-hours that we ran immediately, allowing data scientists to quickly iterate at scale without having to write the explicit code for parallelization. 
Because we used an ephemeral cluster, the cost is effectively the same as running this sequentially on an EC2 instance (parallelized over all cores).</p><h3>Resources</h3><ol><li><a href="https://github.com/Nixtla/statsforecast">Nixtla StatsForecast repo</a></li><li><a href="https://nixtla.github.io/statsforecast/">StatsForecast docs</a></li><li><a href="https://github.com/fugue-project/fugue/">Fugue repo</a></li><li><a href="https://fugue-tutorials.readthedocs.io/">Fugue tutorials</a></li></ol><p>To chat with us:</p><ol><li><a href="http://slack.fugue.ai/">Fugue Slack</a></li><li><a href="https://join.slack.com/t/nixtlaworkspace/shared_invite/zt-135dssye9-fWTzMpv2WBthq8NK0Yvu6A">Nixtla Slack</a></li></ol><h3>Appendix</h3><p>For anyone interested in the cluster configuration, it can be seen below. This will spin up a Databricks cluster. The key setting is node_type_id, which specifies the machines used.</p><pre>{<br>    &quot;num_workers&quot;: 20,<br>    &quot;cluster_name&quot;: &quot;fugue-nixtla-2&quot;,<br>    &quot;spark_version&quot;: &quot;10.4.x-scala2.12&quot;,<br>    &quot;spark_conf&quot;: {<br>        &quot;spark.speculation&quot;: &quot;true&quot;,<br>        &quot;spark.sql.shuffle.partitions&quot;: &quot;8000&quot;,<br>        &quot;spark.sql.adaptive.enabled&quot;: &quot;false&quot;,<br>        &quot;spark.task.cpus&quot;: &quot;1&quot;<br>    },<br>    &quot;aws_attributes&quot;: {<br>        &quot;first_on_demand&quot;: 1,<br>        &quot;availability&quot;: &quot;SPOT_WITH_FALLBACK&quot;,<br>        &quot;zone_id&quot;: &quot;us-west-2c&quot;,<br>        &quot;spot_bid_price_percent&quot;: 100,<br>        &quot;ebs_volume_type&quot;: &quot;GENERAL_PURPOSE_SSD&quot;,<br>        &quot;ebs_volume_count&quot;: 1,<br>        &quot;ebs_volume_size&quot;: 32<br>    },<br>    &quot;node_type_id&quot;: &quot;m5.24xlarge&quot;,<br>    &quot;driver_node_type_id&quot;: &quot;m5.2xlarge&quot;,<br>    &quot;ssh_public_keys&quot;: [],<br>    
&quot;custom_tags&quot;: {},<br>    &quot;spark_env_vars&quot;: {<br>        &quot;MKL_NUM_THREADS&quot;: &quot;1&quot;,<br>        &quot;OPENBLAS_NUM_THREADS&quot;: &quot;1&quot;,<br>        &quot;VECLIB_MAXIMUM_THREADS&quot;: &quot;1&quot;,<br>        &quot;OMP_NUM_THREADS&quot;: &quot;1&quot;,<br>        &quot;NUMEXPR_NUM_THREADS&quot;: &quot;1&quot;<br>    },<br>    &quot;autotermination_minutes&quot;: 20,<br>    &quot;enable_elastic_disk&quot;: false,<br>    &quot;cluster_source&quot;: &quot;UI&quot;,<br>    &quot;init_scripts&quot;: [],<br>    &quot;runtime_engine&quot;: &quot;STANDARD&quot;,<br>    &quot;cluster_id&quot;: &quot;0728-004950-oefym0ss&quot;<br>}</pre><hr><p><a href="https://medium.com/data-science/distributed-forecast-of-1m-time-series-in-under-15-minutes-with-spark-nixtla-and-fugue-e9892da6fd5c">Distributed Forecast of 1M Time Series in Under 15 Minutes with Spark, Nixtla, and Fugue</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Forecasting with Synthetic Data at Scale (Nixtla & YData)]]></title>
            <link>https://medium.com/data-science/forecasting-with-synthetic-data-at-scale-nixtla-ydata-404b65600876?source=rss-2855bd3e0293------2</link>
            <guid isPermaLink="false">https://medium.com/p/404b65600876</guid>
            <category><![CDATA[deep-learning]]></category>
            <category><![CDATA[open-source]]></category>
            <category><![CDATA[time-series-forecasting]]></category>
            <category><![CDATA[synthetic-data]]></category>
            <category><![CDATA[python]]></category>
            <dc:creator><![CDATA[azul garza ramirez]]></dc:creator>
            <pubDate>Wed, 05 Jan 2022 18:09:22 GMT</pubDate>
            <atom:updated>2022-09-15T16:45:08.310Z</atom:updated>
            <content:encoded><![CDATA[<h4>Make synthetic time series data and then forecast it with Deep Learning models</h4><p><em>By </em><a href="http://github.com/nixtla"><em>Nixtla</em></a><em> and </em><a href="http://YData.ai"><em>YData</em></a><em>. </em><a href="https://medium.com/u/2855bd3e0293"><em>Federico Garza Ramírez</em></a><em> and </em><a href="https://medium.com/u/76b639655285"><em>Max Mergenthaler</em></a><em>.</em></p><h4>Introduction</h4><p>In this post, we explain how to use <a href="https://github.com/Nixtla/nixtlats">nixtlats</a> and <a href="https://github.com/ydataai/ydata-synthetic">ydata-synthetic</a>, free, open-source Python libraries that allow you to generate synthetic data and use it to train state-of-the-art deep learning models without any significant loss of data quality. We develop a deep learning forecasting pipeline without direct access to the original data and show that synthetic data has a minimal impact on the performance of the models.</p><h4>Motivation</h4><p>In the last decade, neural network-based forecasting methods have become ubiquitous in large-scale forecasting applications, transcending industry boundaries into academia, as they have redefined the state of the art in many practical tasks such as demand planning, electricity load forecasting, reverse logistics, and weather forecasting, as well as in forecasting competitions like the M4 and M5.</p><p>However, a common problem for those interested in creating forecasts is developing models or testing software without using the original data; this may be because the actual data takes time to collect, there are restrictions on its use, or the data simply does not exist. In many applications, the user does not want the model to have access to the actual data, particularly if model training is done in the cloud or outside one’s own infrastructure. 
These constraints severely limit practice, preventing models from being scaled to large datasets on available cloud services.</p><p>This post shows how to solve this problem using nixtlats and ydata-synthetic. First, the user can create synthetic data using ydata-synthetic; synthetic data is artificially created and keeps the original data properties, ensuring its business value while being compliant. Subsequently, the user can train state-of-the-art neural forecasting algorithms using nixtlats without accessing the original data. Once the model is trained, it can be sent to the owner of the original data to perform inference in the security of their infrastructure. The following diagram describes the process.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*AJtRNHZTvg86Sz0tCexOUg.png" /><figcaption>Image by the authors</figcaption></figure><p>We evaluate the pipeline and show that the predictions of the model trained on synthetic data perform comparably to those of the model trained on the original data.</p><h3>Libraries</h3><p>The libraries nixtlats and ydata-synthetic are available in <a href="https://pypi.org/project/nixtlats/">PyPI</a>, so you can install them using pip install nixtlats and pip install ydata-synthetic.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/ef81f7b5426aa19c010d63f313e412b4/href">https://medium.com/media/ef81f7b5426aa19c010d63f313e412b4/href</a></iframe><h4>Data</h4><p>To evaluate the pipeline, we consider the yearly <a href="https://www.kaggle.com/yogesh94/m4-forecasting-competition-dataset">M4 competition</a> dataset. The dataset was <a href="https://github.com/Mcompetitions/M4-methods">originally released publicly</a> under a <a href="https://github.com/Mcompetitions/M4-methods/issues/16">completely open-access license</a>. 
The M4 competition, a major forecasting competition, introduced a novel multivariate time series model called Exponential Smoothing Recurrent Neural Network (ESRNN), which won by a large margin over baselines and complex time series ensembles.</p><p>We will use the nixtlats library to easily access the data.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/352dd4d7d4f09acf36b3c6e2844196f9/href">https://medium.com/media/352dd4d7d4f09acf36b3c6e2844196f9/href</a></iframe><p>In this example, we use 1,000 Yearly time series.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/a28b6559e60e5dff68a6f6facff2e870/href">https://medium.com/media/a28b6559e60e5dff68a6f6facff2e870/href</a></iframe><p>The M4.load method returns train and test sets, so we need to split them. The library also provides a wide variety of datasets, <a href="https://nixtla.github.io/nixtlats">see the documentation</a>.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/63707be9244ef7fb7ab96d49d5b43361/href">https://medium.com/media/63707be9244ef7fb7ab96d49d5b43361/href</a></iframe><p>nixtlats requires a dummy test set to make forecasts, so we combine the training data with a test set filled with zero values.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/8f3a5fb1f9f3b30270200d145e402796/href">https://medium.com/media/8f3a5fb1f9f3b30270200d145e402796/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/106b236a33feb19173a2c6b0e550e8f3/href">https://medium.com/media/106b236a33feb19173a2c6b0e550e8f3/href</a></iframe><h4>Pipeline</h4><h4>Creating synthetic data using ydata-synthetic</h4><p>In this section, we synthesize the training data defined by Y_df_train using the TimeGAN model from ydata-synthetic. 
You can learn more about the TimeGAN model in the post <a href="https://towardsdatascience.com/synthetic-time-series-data-a-gan-approach-869a984f2239">Synthetic Time-Series Data: A GAN approach</a>.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/7cf941eb099cb423f00e75199ba6e75a/href">https://medium.com/media/7cf941eb099cb423f00e75199ba6e75a/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/3ffb810010f6b600ce45c12cdf8403b1/href">https://medium.com/media/3ffb810010f6b600ce45c12cdf8403b1/href</a></iframe><p>The following lines train the TimeGAN model,</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/b8efd99e7ed19b4bd4a0ee3d436849ec/href">https://medium.com/media/b8efd99e7ed19b4bd4a0ee3d436849ec/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/8ad3200bd93cf27265ff09339fac7fcf/href">https://medium.com/media/8ad3200bd93cf27265ff09339fac7fcf/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/90f35684fa0b643eaf7e447a38c15fac/href">https://medium.com/media/90f35684fa0b643eaf7e447a38c15fac/href</a></iframe><p>Thus, the object synth_data contains the synthetic training data. To use nixtlats, we need to transform synth_data into a pandas DataFrame. 
This can be easily done using the following lines.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/3b573908face8ff282f53f25e64af87f/href">https://medium.com/media/3b573908face8ff282f53f25e64af87f/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/ae21993ec47000ac64341b95f284c8ca/href">https://medium.com/media/ae21993ec47000ac64341b95f284c8ca/href</a></iframe><h4>Training Deep Learning model using nixtlats</h4><p>In this section, we use the previous synthetic data to train the ESRNN model, the winner of the M4 competition. This model is a hybrid: it fits each time series locally through an Exponential Smoothing model and then trains the levels using a Recurrent Neural Network. You can learn more about this model in the post <a href="https://medium.com/analytics-vidhya/forecasting-in-python-with-esrnn-model-75f7fae1d242">Forecasting in Python with the ESRNN model</a>.</p><p>The model training pipeline follows common PyTorch practices. First, a Dataset must be instantiated. The TimeSeriesDataset class returns the complete series in each iteration, which is useful for recurrent models such as ESRNN. To be instantiated, the class receives the target series Y_df as a pandas DataFrame with columns unique_id, ds and y. Additionally, temporal exogenous variables X_df and static variables S_df can be included. 
In this case, we only use static variables, as in the original model.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/3d8f13fe58d32a00db1612ac8c178ca4/href">https://medium.com/media/3d8f13fe58d32a00db1612ac8c178ca4/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/972528504b58c147c8d81e20ad85d745/href">https://medium.com/media/972528504b58c147c8d81e20ad85d745/href</a></iframe><p>The next thing we need to do is define the ESRNN model included in nixtlats as follows,</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/a9f9fd8fd7dae5a6636ca7c5d4d944d7/href">https://medium.com/media/a9f9fd8fd7dae5a6636ca7c5d4d944d7/href</a></iframe><p>And then we can train it as follows,</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/2fa33fac0d9db03ecc9af1efc037404f/href">https://medium.com/media/2fa33fac0d9db03ecc9af1efc037404f/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/72c02b62e76da11221b95897eb75f84e/href">https://medium.com/media/72c02b62e76da11221b95897eb75f84e/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/6c8c79248b959ba7f2f3bc0f278be074/href">https://medium.com/media/6c8c79248b959ba7f2f3bc0f278be074/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/4331aa79aa71d10a52aba3957730b844/href">https://medium.com/media/4331aa79aa71d10a52aba3957730b844/href</a></iframe><h4>Model trained with real data</h4><p>To check that both solutions offer similar results, in this section we train the model with the original data.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a 
href="https://medium.com/media/c0daa0616da54b870fcebc86c397d644/href">https://medium.com/media/c0daa0616da54b870fcebc86c397d644/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/b9c1d8c0dc1409fb957fc70660c2adb6/href">https://medium.com/media/b9c1d8c0dc1409fb957fc70660c2adb6/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/2ad4b7d1c1d6396dce9392fb782681f6/href">https://medium.com/media/2ad4b7d1c1d6396dce9392fb782681f6/href</a></iframe><p>And then we can train it as follows,</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/9eeaa3f444254cbe35ad196f071b512a/href">https://medium.com/media/9eeaa3f444254cbe35ad196f071b512a/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/8ce0e8bb9c94e831ae6fddc3d3385a33/href">https://medium.com/media/8ce0e8bb9c94e831ae6fddc3d3385a33/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/3a31e8b9219d6c0bb2d953e0affe219d/href">https://medium.com/media/3a31e8b9219d6c0bb2d953e0affe219d/href</a></iframe><h4>Comparing forecasts</h4><p>Finally, we use the original data to make forecasts for both models, model_synth trained with synthetic data and model, trained with the original data. 
First, we define the test dataset and loader.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/d06e7ea8a6b3feae696a77de53cb8ded/href">https://medium.com/media/d06e7ea8a6b3feae696a77de53cb8ded/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/cfc5de2e335e44334edef2da0cf1e8c8/href">https://medium.com/media/cfc5de2e335e44334edef2da0cf1e8c8/href</a></iframe><p>The following lines obtain forecasts with the synthetic model,</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/428e78d690ef42af0b33ba13f4ef201c/href">https://medium.com/media/428e78d690ef42af0b33ba13f4ef201c/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/fc7290f765a16947b159655e1e454cdb/href">https://medium.com/media/fc7290f765a16947b159655e1e454cdb/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/07b99520175c6f8b499202f759629a67/href">https://medium.com/media/07b99520175c6f8b499202f759629a67/href</a></iframe><p>Likewise, the following lines obtain forecasts with the model trained with real data,</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/7a56702321d1d297ba9378baca8c478e/href">https://medium.com/media/7a56702321d1d297ba9378baca8c478e/href</a></iframe><p>Now we compare the performance of both models against the real values using the Mean Absolute Percentage Error (MAPE) and its symmetric version (SMAPE). 
nixtlats provides functions to easily do that.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/28bee88ce30ca4277fc9cbd749ec8608/href">https://medium.com/media/28bee88ce30ca4277fc9cbd749ec8608/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/40d21ea8f90a779d58986d68b5ba75f0/href">https://medium.com/media/40d21ea8f90a779d58986d68b5ba75f0/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/aac40bc74a59c8f30a1ce152e765fc6a/href">https://medium.com/media/aac40bc74a59c8f30a1ce152e765fc6a/href</a></iframe><p>As we can see, the model trained with synthetic data generated with ydata-synthetic even produces better forecasts considering the MAPE loss.</p><h4>Conclusion</h4><p>Synthetic data have a wide range of applications. In this post, we showed a full pipeline to create synthetic data and use it to train state-of-the-art Deep Learning models. As we saw, performance is not harmed, and for some metrics it is even better.</p><hr><p><a href="https://medium.com/data-science/forecasting-with-synthetic-data-at-scale-nixtla-ydata-404b65600876">Forecasting with Synthetic Data at Scale (Nixtla &amp; YData)</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Prophet vs Linear Regression on Real Estate: The Zillow Case]]></title>
            <link>https://medium.com/analytics-vidhya/prophet-vs-linear-regression-on-real-estate-the-zillow-case-2f846e6c4fdc?source=rss-2855bd3e0293------2</link>
            <guid isPermaLink="false">https://medium.com/p/2f846e6c4fdc</guid>
            <category><![CDATA[facebook]]></category>
            <category><![CDATA[zillow]]></category>
            <category><![CDATA[open-source]]></category>
            <category><![CDATA[time-series-forecasting]]></category>
            <dc:creator><![CDATA[azul garza ramirez]]></dc:creator>
            <pubDate>Wed, 15 Dec 2021 20:57:39 GMT</pubDate>
            <atom:updated>2022-10-20T20:31:19.121Z</atom:updated>
            <content:encoded><![CDATA[<p><em>By </em><a href="http://github.com/Nixtla"><em>Nixtla Team</em></a><em>. </em><a href="https://medium.com/u/2855bd3e0293"><em>fede garza ramírez</em></a><em>, </em><a href="https://medium.com/u/76b639655285"><em>Max Mergenthaler</em></a></p><blockquote><strong>TL;DR </strong>Recently there has been controversy in the data science community about the Zillow case. There has been <a href="https://ryxcommar.com/2021/11/06/zillow-prophet-time-series-and-prices/">speculation that the Zillow team may have used</a> <a href="https://github.com/facebook/prophet">Prophet</a> to generate forecasts of their time series. Although we do not know if the above is true, we contribute to the discussion by showing that creating good benchmarks is fundamental in forecasting tasks. Furthermore, we show that Prophet does not turn out to be a good solution on <a href="https://www.zillow.com/research/data/">Zillow Home Value Index</a> data. Better alternatives are simpler and faster models like <a href="https://github.com/robjhyndman/forecast">auto.arima</a> or <a href="https://github.com/Nixtla/statsforecast">statsforecast</a>. To improve on them, <a href="https://github.com/Nixtla/mlforecast">mlforecast</a> is an excellent option because it makes forecasting with machine learning fast and easy and lets practitioners focus on the model and features instead of implementation details.</blockquote><h3>Introduction</h3><p>Recently, Zillow announced that it would close its <a href="https://www.cnbc.com/2021/11/02/zillow-shares-plunge-after-announcing-it-will-close-home-buying-business.html">home-buying business</a> because its models were unable to correctly anticipate price changes. The Zillow CEO Rich Barton said, <em>“We’ve determined the unpredictability in forecasting home prices far exceeds what we anticipated”</em>. 
Since the announcement, <a href="https://twitter.com/vhranger/status/1456064415845990408">several opinions</a> have been published about the technology Zillow allegedly used for forecasting. In particular, critics point out that Zillow listed Prophet in its job postings.</p><p>Forecasting time series is a complicated task, and there is no single model that fits all business needs and data characteristics. <a href="https://towardsdatascience.com/time-series-forecasting-with-statistical-models-f08dcd1d24d1">Best practices</a> always suggest starting with a simple model as a benchmark. Such a model provides a foundation on which to build better-performing models, and it makes it possible to measure their added value (a more complex model should obtain a lower loss than the benchmark).</p><p>In this blog post, we have set ourselves the goal of empirically determining whether Prophet is a good choice (or at least a good benchmark) for modeling the data used in the context of Zillow. As we will see, auto.arima and even the naive model turn out to be better baseline strategies than Prophet for the particular dataset we use. We find that Prophet does not perform well compared to other models, which is consistent with the evidence found by other practitioners (for example <a href="https://www.microprediction.com/blog/prophet">here</a> and <a href="https://kourentzes.com/forecasting/2017/07/29/benchmarking-facebooks-prophet/">here</a>). 
Also, we show how using mlforecast (and <a href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html">LinearRegression from </a><a href="https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html">sklearn</a> as the training model) can beat auto.arima and Prophet in no more than 3 seconds.</p><h3><strong>Dataset</strong></h3><p>The dataset we use to evaluate Prophet is the Zillow Home Value Index (ZHVI), which can be downloaded directly from the <a href="https://www.zillow.com/research/data/">Zillow research website</a>. According to the page, the ZHVI is <em>&quot;a smoothed, seasonally adjusted measure of typical home value and market changes for a given region and housing type. It reflects the typical value of homes in the 35th to 65th percentile range&quot;</em> and <a href="https://www.zillow.com/research/zhvi-user-guide/"><em>&quot;represents the &quot;typical&quot; home value for a region&quot;</em></a>.</p><p>The dataset reflects price changes, so we decided to experiment with it because a stakeholder can potentially use it to make decisions. The dataset consists of 909 monthly series for different aggregations of regions and states. We downloaded it on November 4, 2021, and anybody interested can find a copy of it <a href="https://github.com/Nixtla/nixtla/blob/main/utils/experiments/zillow-prophet/data/Metro_zhvi_uc_sfrcondo_tier_0.33_0.67_sm_sa_month.csv">here</a>.</p><h3>Experiments</h3><p>To test the effectiveness of Prophet in forecasting the ZHVI, we use the last 4 observations as the test set and the remaining observations as the training set. For Prophet, we performed hyperparameter optimization for each time series, using the last 4 observations of the training set as the validation set. 
In addition to Prophet, we ran R&#39;s auto.arima, some models of statsforecast (random walk with drift, naive, simple exponential smoothing, window average, seasonal naive, and historic average) and mlforecast.</p><p>mlforecast is a framework that helps practitioners forecast time series using machine learning models. You only need to give it a model (in this case, we use LinearRegression from sklearn), define which features to use, and let mlforecast do the rest.</p><h3>Reproducing results</h3><p>You can reproduce the results using this <a href="https://github.com/Nixtla/nixtla/tree/main/utils/experiments/zillow-prophet">repo</a>. Just follow the steps below. The whole process is automated using Docker, conda, and Make.</p><ol><li>make init. This instruction will create a Docker container based on environment.yml, which contains the required R and Python libraries.</li><li>make run_module module=&quot;python -m src.prepare_data&quot;. The module splits data into train and test sets. You can find the generated data in data/prepared-data-train.csv and data/prepared-data-test.csv respectively.</li><li>make run_module module=&quot;python -m src.forecast_prophet&quot;. Fits Prophet model (forecasts in data/prophet-forecasts.csv).</li><li>make run_module module=&quot;python -m src.forecast_statsforecast&quot;. Fits statsforecast models (forecasts in data/statsforecast-forecasts.csv).</li><li>make run_module module=&quot;Rscript src/forecast_arima.R&quot;. Fits auto.arima model (forecasts in data/arima-forecasts.csv).</li><li>make run_module module=&quot;python -m src.forecast_mlforecast&quot;. 
Fits mlforecast model using LinearRegression (forecasts in data/mlforecast-forecasts.csv).</li></ol><h3>Results</h3><h4>Performance</h4><p>The following table summarizes the results in terms of performance.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/780/1*CRCEhxivcRY-b1N-Xs3YTg.png" /><figcaption>Image by Author</figcaption></figure><p>As we can see, the best model is mlforecast.linear_regression for the mape, rmse, smape, and mae metrics. Surprisingly, a very simple model such as naive (which uses the last value as the forecast) turns out to be better in this experiment than Prophet.</p><h4>Computational cost</h4><p>The following table summarizes the results in terms of computational cost.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/379/1*xWKufu9IgD4r2LTuZOGD3Q.png" /><figcaption>Image by Author</figcaption></figure><p>To run our experiments we used a <a href="https://aws.amazon.com/ec2/instance-types/c5/">c5d.24xlarge AWS instance (96 vCPU, 192 GiB RAM)</a>. It costs 4.608 USD per hour. As we can see, mlforecast takes no more than 3 seconds and beats Prophet and auto.arima in performance.</p><h3>Conclusion</h3><p>This post showed, in the context of the Zillow controversy, that building benchmarks is fundamental to addressing any time series forecasting problem. Those benchmarks must be computationally efficient so that you can iterate fast and build more complex models on top of them. The libraries <a href="https://github.com/Nixtla/statsforecast">statsforecast</a> and <a href="https://github.com/Nixtla/mlforecast">mlforecast</a> are excellent tools for the task. We also showed better options than Prophet to run benchmarks, which is consistent with previous findings by the data science community.</p><p><strong>Build benchmarks. 
Always.</strong></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=2f846e6c4fdc" width="1" height="1" alt=""><hr><p><a href="https://medium.com/analytics-vidhya/prophet-vs-linear-regression-on-real-estate-the-zillow-case-2f846e6c4fdc">Prophet vs Linear Regression on Real Estate: The Zillow Case</a> was originally published in <a href="https://medium.com/analytics-vidhya">Analytics Vidhya</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Time Series Forecasting with Statistical Models]]></title>
            <link>https://medium.com/data-science/time-series-forecasting-with-statistical-models-f08dcd1d24d1?source=rss-2855bd3e0293------2</link>
            <guid isPermaLink="false">https://medium.com/p/f08dcd1d24d1</guid>
            <category><![CDATA[forecasting]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[statistics]]></category>
            <category><![CDATA[open-source]]></category>
            <category><![CDATA[time-series-forecasting]]></category>
            <dc:creator><![CDATA[azul garza ramirez]]></dc:creator>
            <pubDate>Mon, 06 Dec 2021 19:49:09 GMT</pubDate>
            <atom:updated>2022-10-20T20:31:43.131Z</atom:updated>
            <content:encoded><![CDATA[<h4>statsforecast makes forecasting with statistical models fast and easy</h4><p><em>By </em><a href="http://github.com/Nixtla"><em>Nixtla Team</em></a><em>. </em><a href="https://medium.com/u/2855bd3e0293"><em>fede garza ramírez</em></a>, <a href="https://medium.com/u/76b639655285">Max Mergenthaler</a></p><blockquote>TL;DR</blockquote><blockquote>In this post we introduce <a href="http://github.com/nixtla/statsforecast"><strong>statsforecast</strong></a>, an open-source framework that makes the implementation of statistical models in forecasting tasks fast and easy. <em>statsforecast</em> is able to handle thousands of time series and is efficient in both time and memory. With this library you can easily create benchmarks on which to build more complex models; it also allows you to run your own models in parallel. In this post we also offer a guide on how to use “Forecast Value Added” for benchmarking and assessing competing models.</blockquote><h4>Introduction</h4><p>In this post, we will talk about using statistical models in forecasting tasks. In particular, we introduce <strong>statsforecast</strong>. This Python library allows fitting statistical models in a simple and computationally efficient way for hundreds of thousands of time series so that you can benchmark your own models quickly. Throughout this post, we will show how to use the library to calculate the Forecast Value Added of some models with respect to a benchmark model. This methodology allows us to select the best model among several candidates.</p><h4>Motivation</h4><p>Deep learning and Machine Learning models have demonstrated state-of-the-art performance in time series forecasting tasks. 
However, it is helpful to have a battery of simpler models to benchmark and validate the value that those models add.</p><p>In business problems, metrics such as Forecast Value Added (FVA) are usually used to compare the added value of more complex models against techniques that are more straightforward to implement and to explain to decision-makers. FVA is calculated by subtracting the loss of the more complex model from the loss of the benchmark model, so a positive FVA means the more complex model adds value. In the <a href="https://www.sas.com/content/dam/SAS/en_us/doc/whitepaper1/forecast-value-added-analysis-106186.pdf">following example</a>, three models were fitted: Naive, Statistical, and Override. The first column shows the Mean Absolute Percentage Error (MAPE) of these three models. The FVA vs. Naive column displays in the second row the difference between the Naive&#39;s MAPE and the Statistical&#39;s MAPE, which is positive; that means that the Statistical model adds value to the process. Likewise, the third row shows the difference between the Naive&#39;s MAPE and the Override&#39;s MAPE; the result is negative, so the Override model doesn&#39;t add any value.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/260/1*ZYSakADTQ7ekSEnDKbHlbQ.png" /><figcaption>Image from <a href="https://www.sas.com/content/dam/SAS/en_us/doc/whitepaper1/forecast-value-added-analysis-106186.pdf">SAS</a></figcaption></figure><p>A wide range of statistical base models is included in <em>statsforecast</em> that can be used for decision making or as benchmarks for implementing more complex models. Also included are models for specific tasks, such as forecasting sparse (or intermittent) time-series, i.e., time series with a high percentage of zero values, such as sales. 
These models exist in implementations for the R programming language but not for Python.</p><h4>statsforecast</h4><p>To make benchmarking easier, we created <a href="https://github.com/Nixtla/statsforecast">statsforecast</a>, which is a framework to help you forecast time series using statistical models. You just need to give it a model you want to use and let <em>statsforecast</em> do the rest.</p><h4>Included models</h4><ul><li><strong>ADIDA</strong>: Temporal aggregation is used for reducing the presence of zero observations, thus mitigating the undesirable effect of the variance observed in the intervals. ADIDA uses equally sized time buckets to perform non-overlapping temporal aggregation and predict the demand over a pre-specified lead-time. The time bucket is set equal to the mean inter-demand interval. SES is used to obtain the forecasts.</li><li><strong>Croston Classic</strong>: The method proposed by Croston to forecast series that display intermittent demand. The method decomposes the original series into the non-zero demand size and the inter-demand intervals and models them using Simple Exponential Smoothing with a predefined parameter.</li><li><strong>Croston SBA</strong>: SBA stands for Syntetos-Boylan Approximation. A variant of Croston’s method that utilizes a debiasing factor.</li><li><strong>Croston Optimized</strong>: Like Croston, but this model optimizes the Simple Exponential Smoothing for both the non-zero demand size and the inter-demand intervals.</li><li><strong>Historic average</strong>: Simple average of the time series.</li><li><strong>iMAPA</strong>: iMAPA stands for Intermittent Multiple Aggregation Prediction Algorithm. Another way for implementing temporal aggregation in demand forecasting. However, in contrast to ADIDA that considers a single aggregation level, iMAPA considers multiple ones, aiming at capturing different dynamics of the data. 
Thus, iMAPA proceeds by averaging the derived point forecasts, generated using SES.</li><li><strong>Naive</strong>: Uses the last value of the time series as forecast. The simplest model for time series forecasting.</li><li><strong>Random Walk with Drift</strong>: Projects the historic trend from the last observed value.</li><li><strong>Seasonal Exponential Smoothing</strong>: Adjusts a Simple Exponential Smoothing model for each seasonal period.</li><li><strong>Seasonal Naive</strong>: Like Naive, but this time the forecasts of the model are equal to the last known observation of the same period, so that it can capture seasonal variations (for example, weekly ones).</li><li><strong>Seasonal Window Average</strong>: Uses the last window (defined by the user) to calculate an average for each seasonal period.</li><li><strong>SES</strong>: SES stands for Simple Exponential Smoothing. This model recursively weights the most recent observations in the time series. Useful for time series with no trend.</li><li><strong>TSB</strong>: TSB stands for Teunter-Syntetos-Babai. A modification to Croston’s method that replaces the inter-demand intervals component with the demand probability.</li><li><strong>Window Average</strong>: Uses the last window (defined by the user) to calculate an average.</li></ul><h4>Usage</h4><p>To create an ample set of benchmarks you can install <strong>statsforecast</strong>, which is available on <a href="https://pypi.org/project/statsforecast/">PyPI</a> (pip install statsforecast).</p><h4>Libraries</h4><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/00ae2e43f779e45b275cb9c5e239947f/href">https://medium.com/media/00ae2e43f779e45b275cb9c5e239947f/href</a></iframe><h4>Data</h4><p>In this example, we use the M4 time series competition data. The objective of the competition was to validate models on data with different frequencies and seasonalities. 
The dataset was <a href="https://github.com/Mcompetitions/M4-methods">originally released publicly</a> under a <a href="https://github.com/Mcompetitions/M4-methods/issues/16">completely open-access license</a>. To download the data we used <a href="https://github.com/Nixtla/nixtlats">nixtlats</a>. In this example, we use the Daily time series.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/0baad9edb0f25cf2bdeb08a5e145c8a8/href">https://medium.com/media/0baad9edb0f25cf2bdeb08a5e145c8a8/href</a></iframe><p>Initially, the data don’t contain the actual dates of each observation, so the following line creates a datestamp for each time series.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/5f25c359e79aa04af8467042590b109a/href">https://medium.com/media/5f25c359e79aa04af8467042590b109a/href</a></iframe><p>The function M4.load returns train + test data, so we need to separate them.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/37a8bb51a6f8a2f111352f1f4c483bbe/href">https://medium.com/media/37a8bb51a6f8a2f111352f1f4c483bbe/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/9c38d68119f33f481e97c16e2f0e3434/href">https://medium.com/media/9c38d68119f33f481e97c16e2f0e3434/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/ac06f660d3b8f40387b089bc00c90ff1/href">https://medium.com/media/ac06f660d3b8f40387b089bc00c90ff1/href</a></iframe><p>This is the required input format.</p><ul><li>an index named <strong>unique_id</strong> that identifies each time series. 
In this example, we have 4,227 time series.</li><li>a <strong>ds</strong> column with the dates.</li><li>a <strong>y</strong> column with the values.</li></ul><h4>Training</h4><p>We now define the statistical models we will use. We must define a list of functions. If the model has additional parameters, besides the forecast horizon, it must be included as a tuple with the model and the additional parameters.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/8ca6e2e6250ba27efd82cd706431ea18/href">https://medium.com/media/8ca6e2e6250ba27efd82cd706431ea18/href</a></iframe><p>Now we define our trainer, StatsForecast, where we define the models we want to use, the frequency of the data, and the number of cores used to parallelize the training job.</p><p>In this way adjusting these models and generating forecasts is as simple as the following lines. The main class is StatsForecast; it receives four parameters:</p><ul><li>df: A pandas dataframe with time series in long format.</li><li>models: A list of models to fit each time series.</li><li>freq: Frequency of the time series.</li><li>n_jobs: Number of cores to be used in the fitting process. The default is 1 job. 
To run the process in parallel, you can use the cpu_count() function from multiprocessing to set the number of jobs.</li></ul><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/a751c810fadcf954df5225fb5d61ce28/href">https://medium.com/media/a751c810fadcf954df5225fb5d61ce28/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/d192986684c139fe73f35483eedaab57/href">https://medium.com/media/d192986684c139fe73f35483eedaab57/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/6ee50bee4c348d53f36a6f04dca3eb68/href">https://medium.com/media/6ee50bee4c348d53f36a6f04dca3eb68/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/5835b62781d7001927199d089f710d7a/href">https://medium.com/media/5835b62781d7001927199d089f710d7a/href</a></iframe><h4>Forecast Value Added</h4><p>In this example, we’ll use the historic_average model as a benchmark; this is one of the simplest models among the fitted ones (it only takes the mean value of the time series as forecast).</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/92ac4c3a1cfeb9845bfaded3f2dab09d/href">https://medium.com/media/92ac4c3a1cfeb9845bfaded3f2dab09d/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/ffb52a6942d05a3bb6a30208c837780c/href">https://medium.com/media/ffb52a6942d05a3bb6a30208c837780c/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/fbb7f44ce938c12e139981cce8b450d1/href">https://medium.com/media/fbb7f44ce938c12e139981cce8b450d1/href</a></iframe><p>As the table shows, the Forecast Value Added against the historic_average model is positive for the majority of the 
models.</p><h4>Visualization</h4><p>In this 
section we present visual examples of the forecasts generated.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/8e864ac8ed0e7bce4d8d03e81909ef64/href">https://medium.com/media/8e864ac8ed0e7bce4d8d03e81909ef64/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/337db139affcb597d90e67a0453df674/href">https://medium.com/media/337db139affcb597d90e67a0453df674/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/proxy/1*jbq7L1mPqjmmMVu8TtT-ww.png" /><figcaption>Image by Author</figcaption></figure><h4>Create your own model</h4><p>Additionally, you can use the full power of StatsForecast to parallelize your own model. You just need to define a function with two mandatory parameters: y, the target time series, and h, the forecast horizon. You can also add more optional parameters. The function&#39;s output must be a numpy array of size h. In the following example, we&#39;ll fit a linear regression against time; this is a very basic model, but it is useful for explaining how to get the full potential of statsforecast.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/d4f08254238dc1a9f806603d75359955/href">https://medium.com/media/d4f08254238dc1a9f806603d75359955/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/42d4174e26a13266c0ccdadc18bcf03b/href">https://medium.com/media/42d4174e26a13266c0ccdadc18bcf03b/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/3807a71a3db027573a8602f01ad7df13/href">https://medium.com/media/3807a71a3db027573a8602f01ad7df13/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a 
href="https://medium.com/media/b4e9f626aec446c8d80d8e7618cc91cc/href">https://medium.com/media/b4e9f626aec446c8d80d8e7618cc91cc/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/a925fdf8c2c981052a6f0220a24cef02/href">https://medium.com/media/a925fdf8c2c981052a6f0220a24cef02/href</a></iframe><p>A more complicated example with extra parameters would be a Lasso regression as follows,</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/21cf61c0e2c9b221f5f6372dd98e2a67/href">https://medium.com/media/21cf61c0e2c9b221f5f6372dd98e2a67/href</a></iframe><p>Instead of passing the model, you just need to pass a tuple with the function and the parameter you want to use,</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/3cbf58613b252407a944abad01e15461/href">https://medium.com/media/3cbf58613b252407a944abad01e15461/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/e6b0597430af8c8edde04fff347c6968/href">https://medium.com/media/e6b0597430af8c8edde04fff347c6968/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/aea37c1df689accde2537ddc816b94a6/href">https://medium.com/media/aea37c1df689accde2537ddc816b94a6/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/dd3a684ae360e46cd615d2be1cc7212e/href">https://medium.com/media/dd3a684ae360e46cd615d2be1cc7212e/href</a></iframe><p>Finally, you can train both models and a historic_average model (for benchmarking purposes) at the same time defining the models list as follows,</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a 
href="https://medium.com/media/f1810aaa8ac6358d5b9c3e4b21cb6c54/href">https://medium.com/media/f1810aaa8ac6358d5b9c3e4b21cb6c54/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/7caf42d7ac24c8eaf485b2b22a724da2/href">https://medium.com/media/7caf42d7ac24c8eaf485b2b22a724da2/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/6261323c52589092330d34f7e6eb2f40/href">https://medium.com/media/6261323c52589092330d34f7e6eb2f40/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/ec21ad53d8bb687597633021fd043940/href">https://medium.com/media/ec21ad53d8bb687597633021fd043940/href</a></iframe><p>Now we can calculate the FVA for the linear and lasso regression based on the historic average model.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/f98d23d3ea10dafae335f981d4b62699/href">https://medium.com/media/f98d23d3ea10dafae335f981d4b62699/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/dcd778142101a4a902f07062fc5ef652/href">https://medium.com/media/dcd778142101a4a902f07062fc5ef652/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/2fc3a7d79142ed126143cbb0357b7d35/href">https://medium.com/media/2fc3a7d79142ed126143cbb0357b7d35/href</a></iframe><p>The table shows a positive FVA for both models; we can also see that the regularization provided by the Lasso regression improves the FVA.</p><h4>Conclusion</h4><p>In this post, we introduced <strong>statsforecast</strong>, a library written in Python to quickly fit statistical models. As we saw, in the practice of time series forecasting it is very useful to first fit a simple model as a benchmark. 
This benchmark model lets you build more complex models and use the FVA to show that their added complexity brings value to the process.</p><p>statsforecast allows you to create benchmark models in a simple way; moreover, it allows you to fit your own models efficiently in parallel.</p><h4>WIP and Next Steps</h4><p><strong>statsforecast</strong> is a work in progress. In the next releases we plan to include:</p><ul><li>Automated backtesting.</li><li>Ensembles (such as <a href="https://github.com/FedericoGarza/fforma">fforma</a>).</li><li>More statistical models with exogenous variables.</li></ul><p>If you’re interested you can learn more in the following resources:</p><ul><li>GitHub repo: <a href="https://github.com/Nixtla/statsforecast">https://github.com/Nixtla/statsforecast</a></li><li>Documentation: <a href="https://nixtla.github.io/statsforecast/">https://nixtla.github.io/statsforecast/</a></li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=f08dcd1d24d1" width="1" height="1" alt=""><hr><p><a href="https://medium.com/data-science/time-series-forecasting-with-statistical-models-f08dcd1d24d1">Time Series Forecasting with Statistical Models</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Open Source Alternative to the AWS Deep Learning AMI]]></title>
            <link>https://medium.com/data-science/open-source-alternative-to-the-aws-deep-learning-ami-f8f77318f8a8?source=rss-2855bd3e0293------2</link>
            <guid isPermaLink="false">https://medium.com/p/f8f77318f8a8</guid>
            <category><![CDATA[pytorch]]></category>
            <category><![CDATA[deep-learning]]></category>
            <category><![CDATA[aws]]></category>
            <category><![CDATA[cloud]]></category>
            <category><![CDATA[open-source]]></category>
            <dc:creator><![CDATA[azul garza ramirez]]></dc:creator>
            <pubDate>Wed, 17 Nov 2021 20:55:48 GMT</pubDate>
            <atom:updated>2022-10-20T20:32:15.472Z</atom:updated>
            <content:encoded><![CDATA[<h4>An article describing how to set up GPU infrastructure automatically using conda, Docker, make and terraform.</h4><p><em>By </em><a href="https://github.com/Nixtla/"><em>Nixtla Team</em></a><em>. </em><a href="https://medium.com/u/2855bd3e0293"><em>fede garza ramírez</em></a>, <a href="https://medium.com/u/76b639655285">Max Mergenthaler</a></p><blockquote><strong>TL;DR</strong> Running Deep Learning models on GPUs is complicated, particularly when configuring the infrastructure. Prefabricated GPU cloud infrastructure tends to be particularly expensive.</blockquote><blockquote>To help people focus on their models rather than on their hardware and its configuration, we at Nixtla developed a fast and simple way to use GPUs on the AWS cloud without paying for the AMI environment and made it open-source: <a href="https://github.com/Nixtla/nixtla/tree/main/utils/docker-gpu">https://github.com/Nixtla/nixtla/tree/main/utils/docker-gpu</a> and <a href="https://github.com/Nixtla/nixtla/tree/main/utils/terraform-gpu">https://github.com/Nixtla/nixtla/tree/main/utils/terraform-gpu</a>.</blockquote><p><strong>INTRODUCTION</strong></p><p>Deep Learning has become widespread in many areas: computer vision, natural language processing, time series forecasting, etc. Due to the state-of-the-art results it has obtained, it has become increasingly popular in the daily practice of data scientists and researchers.</p><p>GPUs have accelerated the training and inference of the models because they are optimized to perform the linear algebra computations on which deep learning heavily relies. The need for this specialized hardware, however, increases the cost of experimenting with these models and deploying them to production.</p><p>A common problem faced by Deep Learning practitioners is the proper configuration of the GPU infrastructure on the cloud. The installation of required drivers for hardware management tends to be bothersome. 
When this is not tackled correctly, it can be detrimental to reproducibility or unnecessarily increase the cost of these novel models. In this post, we provide the community with a simple solution to this problem using Docker.</p><p><strong>SOLUTION</strong></p><ol><li><strong>NVIDIA Deep Learning AMI + Conda environment + Terraform</strong></li></ol><p><strong>a) NVIDIA Deep Learning AMI</strong></p><p>To run your code with GPU-accelerated computation, you need two things covered: (i) NVIDIA GPUs and (ii) their necessary drivers.</p><p>If you opt for EC2 instances <a href="https://towardsdatascience.com/choosing-the-right-gpu-for-deep-learning-on-aws-d69c157d8c86">(P2, P3, P4D, or G4)</a>, NVIDIA provides a free <a href="https://aws.amazon.com/marketplace/pp/prodview-e7zxdqduz4cbs#pdp-reviews">AMI</a> with pre-installed and optimized GPU software for which you only need to pay the EC2 computational costs.</p><p>You can easily launch GPU EC2 instances with their corresponding drivers from your terminal with the AWS CLI. To do so, you need:</p><ol><li><a href="https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html">AWS CLI installed</a>.</li><li>EC2 launch permissions.</li><li>EC2 connection permissions: (I) The <em>.pem</em> file from the instance launch &lt;YOUR_KEY_NAME&gt; (you can create one following the instructions <a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html">here</a>). 
(II) The instance’s security group &lt;YOUR_SECURITY_GROUP&gt;.</li></ol><p>If you don’t have your own &lt;YOUR_SECURITY_GROUP&gt; you can create one using:</p><pre>aws ec2 create-security-group \<br>        --group-name nvidia-ami \<br>        --description &quot;security group for nvidia ami&quot;</pre><p>Then add an ingress rule for SSH (port 22) to it using:</p><pre>aws ec2 authorize-security-group-ingress \<br>        --group-name nvidia-ami \<br>        --protocol tcp \<br>        --port 22 \<br>        --cidr 0.0.0.0/0</pre><p>With the above, launching a GPU-ready EC2 instance is as simple as running:</p><pre>aws ec2 run-instances \<br>        --image-id ami-05e329519be512f1b \<br>        --count 1 \<br>        --instance-type g4dn.2xlarge \<br>        --key-name &lt;YOUR_KEY_NAME&gt; \<br>        --security-groups nvidia-ami</pre><p>The image id (<em>--image-id</em>) identifies the required NVIDIA AMI. The values for the number of instances (<em>--count</em>) and the instance type (<em>--instance-type</em>) are optional.</p><p>Once the instance is initialized, we can access it with ssh. The AMI comes pre-installed with git, so we can clone the repo of our project without much additional difficulty.</p><pre>ssh -i path/to/&lt;YOUR_KEY_NAME&gt;.pem ubuntu@&lt;PUBLIC_EC2_IP&gt;</pre><p><strong>b) Conda environments</strong></p><p>We recommend the use of <a href="https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html"><em>Conda</em></a> to facilitate the handling of Deep Learning dependencies (<em>PyTorch, TensorFlow, etc.</em>). In particular, we recommend creating environments with <em>environment.yml</em> files.</p><p>The following image shows an example. The Deep Learning framework used in this example is <em>PyTorch</em>, and standard libraries such as <em>NumPy</em> and <em>pandas</em> were also included. This file is a skeleton, so any additional dependencies can be added without any difficulty. 
In addition, <em>jupyterlab</em> is included.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/346/1*0Vdsg7CI8hS3H3wAEBywZw.png" /><figcaption>The original file can be found <a href="https://github.com/Nixtla/nixtla/blob/main/utils/docker-gpu/environment.yml">here</a>.</figcaption></figure><p>As can be seen, the Python version to be used is 3.7. This version can be easily adjusted to the user’s needs, as can the versions of the other packages.</p><p>To use a conda environment, you first need to install conda, because the NVIDIA AMI does not come with it installed. You can follow these instructions:</p><pre>wget <a href="https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh">https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh</a> &amp;&amp; \<br>bash Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/miniconda &amp;&amp; \<br>rm -rf Miniconda3-latest-Linux-x86_64.sh &amp;&amp; \<br>source $HOME/miniconda/bin/activate &amp;&amp; \<br>conda init</pre><p>Then you can create your environment with:</p><pre>conda env create -n &lt;NAME_OF_YOUR_ENVIRONMENT&gt; -f environment.yml</pre><p>To verify that everything is correctly installed, you can clone our repo and run a test:</p><pre>git clone <a href="https://github.com/Nixtla/nixtla.git">https://github.com/Nixtla/nixtla.git</a><br>cd nixtla/utils/docker-gpu<br>conda env create -n gpu-env -f environment.yml<br>conda activate gpu-env<br>python -m test</pre><p>A final piece of advice: the user must be careful with the version of the Deep Learning framework used, verifying that it is compatible with the NVIDIA AMI drivers.</p><p><strong>c) Terraform</strong></p><p>To automate the whole process described above, we developed a <a href="https://www.terraform.io/">Terraform</a> script. Terraform is an open-source infrastructure-as-code tool that allows you to condense all the manual steps into an automatic script. 
In this case, the infrastructure as code we wrote mounts the NVIDIA AMI (including the creation of a compatible security group) and installs conda. The following image shows the main.tf file.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*QfIS4XlNxdiJrJi52fYFNw.png" /><figcaption>The original file can be found <a href="https://github.com/Nixtla/nixtla/blob/main/utils/terraform-gpu/main.tf">here</a>.</figcaption></figure><p>Additionally, a terraform.tfvars file is required for the credentials. An image of this file is shown below.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/414/1*sy4FOwwbWFUUTTGy4ysz3A.png" /><figcaption>The original file can be found <a href="https://github.com/Nixtla/nixtla/blob/main/utils/terraform-gpu/terraform.tfvars">here</a>.</figcaption></figure><p>To use Terraform you only have to install it, <a href="https://www.terraform.io/downloads.html">following </a>these instructions. Subsequently, you must run</p><pre>terraform init<br>terraform apply</pre><p>This will create the required infrastructure and install conda on the deployed EC2. When Terraform finishes running, you will be able to see the public IP associated with the instance so you only need to use an ssh connection to access it.</p><pre>ssh -i path/to/&lt;YOUR_KEY_NAME&gt;.pem ubuntu@&lt;PUBLIC_EC2_IP&gt;</pre><p><strong>2) NVIDIA Deep Learning AMI + Conda environment + Terraform + Docker + Make</strong></p><p><strong>a) Docker</strong></p><p>It is common practice to use Docker to ensure the replicability of projects and experiments. In addition, it allows the user to concentrate all the necessary dependencies in one place, avoiding installing dependencies locally that can later cause conflicts.</p><p>We use docker because it allows us to isolate the software from the hardware, making computation more flexible. If the load is very heavy, it is enough to change the EC2 instance and just run the code inside the container. 
On the other hand, if the load is lighter, we can choose a smaller instance.</p><p>The following image shows the Dockerfile we built for images to access the instance’s GPU. First of all, an image compatible with the drivers installed on EC2 must be chosen. To date, the NVIDIA AMI uses CUDA version 11.2, so this is the selected image.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*UziFRyAJyLciGINX2iErZA.png" /><figcaption>The original file can be found <a href="https://github.com/Nixtla/nixtla/blob/main/utils/docker-gpu/Dockerfile">here</a>.</figcaption></figure><p>Subsequently, additional operating system libraries are installed that may be needed for the project. For example, in the Dockerfile above, <em>wget</em> and <em>curl</em> are installed, which might be useful for downloading data that the project requires.</p><p>In the next instruction, <em>miniconda</em> is installed. Conda, as we discussed earlier, will allow us to handle python dependencies and also install them with the <em>environment.yml</em> file shown in the previous section.</p><p>We highly recommend using <a href="https://github.com/mamba-org/mamba"><em>mamba</em></a> for version management and installation as it significantly improves the waiting time. If the user prefers, she can easily switch to Conda.</p><p>Finally, the <em>environment.yml</em> file created earlier is added to the Docker image and installed in the base environment. It will not be necessary to initialize a specific environment every time a container is required.</p><p><strong>b) Makefile</strong></p><p>Finally, we facilitate the use of a Makefile. <a href="https://www.gnu.org/software/make/">Make</a> is a powerful tool for controlling workflows and executable files. 
Our workflow will allow us to quickly build the Docker image from the Dockerfile and run python and bash modules without continuously declaring the necessary arguments.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ADjK4GuZRMU7nMn3F0WSzQ.png" /><figcaption>The original file can be found <a href="https://github.com/Nixtla/nixtla/blob/main/utils/docker-gpu/Makefile">here</a>.</figcaption></figure><p>In this example, the Docker image will be called <em>gpucontainer</em>, and you can just run <em>make init</em> to build it. Once this instruction is executed, the user can use the <em>run_module</em> instruction to run her python or bash modules using GPUs.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/749/0*G3oGUNYdo1uBdCCO" /></figure><p>For example, to verify that everything works as expected, we created the <em>test.py</em> file, which checks that CUDA is available for <em>PyTorch</em> and that the GPUs are visible. This module would be executed as follows:</p><pre>make run_module module=&quot;python -m test&quot;</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*WzoIf0CehX-Hx_gpBqv5PQ.png" /><figcaption>The original file can be found <a href="https://github.com/Nixtla/nixtla/blob/main/utils/docker-gpu/test.py">here</a>.</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1015/0*mBIK_vEeTn6ahDsM" /></figure><p>Another useful instruction is to run <em>nvidia-smi</em> inside the Docker container to verify that the GPUs are visible:</p><pre>make run_module module=&quot;nvidia-smi&quot;</pre><p>Or initialize the container interactively, which can be done with <em>make bash_docker</em>. 
Finally, an instruction is provided to run jupyterlab inside the docker and do experiments interactively:</p><pre>make jupyter</pre><p>If port 8888 (default) is used by another process, it can easily be changed using</p><pre>make jupyter -e PORT=8886</pre><figure><img alt="" src="https://cdn-images-1.medium.com/max/1000/0*sNv5SJUlwHUNnDwj" /></figure><p><strong>SUMMARY</strong></p><p>In this post, we show a simple solution to the problem of configuring GPUs for Deep Learning on the cloud. With this fully open-source workflow, we hope that practitioners in the field will spend more time implementing the models and not so much on the infrastructure as we have done.</p><hr><p><a href="https://medium.com/data-science/open-source-alternative-to-the-aws-deep-learning-ami-f8f77318f8a8">Open Source Alternative to the AWS Deep Learning AMI</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Forecasting with Machine Learning Models]]></title>
            <link>https://medium.com/data-science/forecasting-with-machine-learning-models-95a6b6579090?source=rss-2855bd3e0293------2</link>
            <guid isPermaLink="false">https://medium.com/p/95a6b6579090</guid>
            <category><![CDATA[data-science]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[deep-dives]]></category>
            <category><![CDATA[notes-from-industry]]></category>
            <dc:creator><![CDATA[azul garza ramirez]]></dc:creator>
            <pubDate>Thu, 21 Oct 2021 23:10:15 GMT</pubDate>
            <atom:updated>2022-10-20T20:32:36.087Z</atom:updated>
            <content:encoded><![CDATA[<h4><a href="https://towardsdatascience.com/tagged/notes-from-industry">Notes from Industry</a></h4><h4>mlforecast makes forecasting with machine learning fast &amp; easy</h4><p><em>By </em><a href="https://github.com/Nixtla/"><em>Nixtla Team</em></a><em>. </em><a href="https://medium.com/u/2855bd3e0293"><em>fede garza ramírez</em></a>, <a href="https://medium.com/u/76b639655285">Max Mergenthaler</a></p><blockquote><strong>TL;DR</strong>: We introduce mlforecast, an open source framework from <a href="https://github.com/Nixtla/nixtla">Nixtla</a> that makes the use of machine learning models in time series forecasting tasks fast and easy. It allows you to focus on the model and features instead of implementation details. With mlforecast you can run experiments more easily, and it has built-in backtesting functionality to help you find the best performing model.</blockquote><blockquote>You can use mlforecast in your own infrastructure or use our <a href="https://github.com/Nixtla/nixtla">fully hosted solution</a>. Just send an email to <a href="http://federico@nixtla.io">federico@nixtla.io</a> to test the private beta.</blockquote><blockquote>Although this example contains only a single time series, the framework is able to handle hundreds of thousands of them and is very efficient both time and memory wise.</blockquote><h4>Introduction</h4><p>We at Nixtla are trying to make time series forecasting more accessible to everyone. In this post, we’ll talk about using machine learning models in forecasting tasks. We’ll use an example to show what the main challenges are and then we’ll introduce <a href="https://github.com/Nixtla/mlforecast">mlforecast</a>, a framework that facilitates using machine learning models in forecasting. 
<strong>mlforecast</strong> does feature engineering and takes care of the updates for you. The user only has to provide a regressor that follows the scikit-learn API (implements fit and predict) and specify the features that she wants to use. These features can be lags, lag-based transformations, and date features. (For further feature creation or an automated forecasting pipeline check <a href="https://github.com/Nixtla/nixtla">nixtla</a>.)</p><h4>Motivation</h4><p>For many years classical methods like ARIMA and ETS dominated the forecasting field. One of the reasons was that most of the use cases involved forecasting low-frequency series with monthly, quarterly, or yearly granularity. Furthermore, there weren’t many time-series datasets, so fitting a single model to each one and getting forecasts from them was straightforward.</p><p>However, in recent years, the need to forecast bigger datasets at higher frequencies has risen. Bigger and higher-frequency time series pose a challenge for classical forecasting methods. Those methods aren’t meant to model many time series together, so applying them at scale is slow (you have to train many models). Besides, there could be some common or shared patterns between the series that could be learned by modeling them together.</p><p>To address this problem, there have been various efforts in proposing different methods that can train a single model on many time series. Some fascinating deep learning architectures have been designed that can accurately forecast many time series, such as ESRNN, DeepAR, and NBEATS, among others. (Check <a href="https://github.com/Nixtla/nixtlats">nixtlats</a> and <a href="https://nixtla.github.io/blog/deep%20learning/forecasting/m4/2021/06/25/esrnn-i.html">Replicating ESRNN results</a> for our WIP.)</p><p>Traditional machine learning models like gradient boosted trees have been used as well and have shown that they can achieve very good performance. 
However, using these models with lag-based features isn’t very straightforward because you have to update your features in every timestep in order to compute the predictions. Additionally, depending on your forecasting horizon and the lags that you use, at some point you run out of real values of your series to update your features, so you have to do something to fill those gaps. One possible approach is to use your predictions as the values for the series and update your features using them. This is exactly what <strong>mlforecast</strong> does for you.</p><h4>Example</h4><p>In the following section, we’ll show a very simple example with a single series to highlight the difficulties in using machine learning models in forecasting tasks. This will later motivate the use of <em>mlforecast</em>, a library that makes the whole process easier and faster.</p><h4>Libraries</h4><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/2949ad86b9775dbe3256f0bd36e55573/href">https://medium.com/media/2949ad86b9775dbe3256f0bd36e55573/href</a></iframe><h4>Data</h4><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/28606d4b94055177baea773d30be3ca6/href">https://medium.com/media/28606d4b94055177baea773d30be3ca6/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/proxy/1*Z0eX1zbe7xqjqm6XH6-5RQ.png" /><figcaption>Image by Author</figcaption></figure><p>Our data has daily seasonality and as you can see in the creation, it is basically just dayofweek + Uniform({-1, 0, 1}).</p><h4>Training</h4><p>Let’s say we want forecasts for the next 14 days, the first step would be deciding which model and features to use, so we’ll create a validation set containing the last 14 days in our data.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a 
href="https://medium.com/media/b62ea08930a0be363cf755375747cc39/href">https://medium.com/media/b62ea08930a0be363cf755375747cc39/href</a></iframe><p>As a starting point, we’ll try lag 7 and lag 14.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/6678ac1e6425136ac80a1eedb67f3225/href">https://medium.com/media/6678ac1e6425136ac80a1eedb67f3225/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/proxy/1*x-KS1vgcbC0Jcg0hBoQnlA.png" /><figcaption>Image by Author</figcaption></figure><p>We can see the expected relationship between the lags and the target. For example, when <em>lag-7</em> is 2, <em>y</em> can be either 0, 1, 2, 3 or 4. This is because every day of the week can have the values [day - 1, day, day + 1], so when we’re at the day of the week number 2, we can get values 1, 2 or 3. However, the value 2 can also come from day of the week 1, whose minimum is 0, and from day of the week 3, whose maximum is 4.</p><p>Computing lag values leaves some rows with nulls.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/d94e200a28cf47579bfa0c92b7c4e45c/href">https://medium.com/media/d94e200a28cf47579bfa0c92b7c4e45c/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/888/1*yGo4svNLBF_1Xl6eKfccag.png" /><figcaption>Image by Author</figcaption></figure><p>We’ll drop these before training.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/51e998667ce1cb876c08ad239a36eb10/href">https://medium.com/media/51e998667ce1cb876c08ad239a36eb10/href</a></iframe><p>For simplicity’s sake, we’ll train a linear regression without intercept. 
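</p><p>The steps so far (building the lags, dropping the rows without a full set of lags, and fitting a model without intercept) can be sketched with NumPy alone. This is only an illustration under our own assumptions: the series length, the random seed and the variable names are made up for this sketch, and the embedded notebook cells may do it differently.</p><pre>
```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy data: day of week plus a uniform draw from {-1, 0, 1}
n_days = 140
dayofweek = np.arange(n_days) % 7
y = (dayofweek + rng.integers(-1, 2, size=n_days)).astype(float)

# Keep the last 14 days aside as the validation set
train = y[:-14]

# Align target and lags; the first 14 rows have no lag-14 and are dropped
target = train[14:]
lag_7 = train[7:-7]    # value observed 7 days before each target
lag_14 = train[:-14]   # value observed 14 days before each target
X = np.column_stack([lag_7, lag_14])

# Ordinary least squares without an intercept column
coef, *_ = np.linalg.lstsq(X, target, rcond=None)
print(coef)
```
</pre><p>Because no column of ones is added to <em>X</em>, the fitted model has no intercept and is fully described by the two lag coefficients.</p><p>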
Since the best model would be taking the average for each day of the week, we expect to get coefficients that are close to 0.5.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/9911e8167a25f6b9ab2d8416a2435646/href">https://medium.com/media/9911e8167a25f6b9ab2d8416a2435646/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/890/1*QCuUmVG6INOlD_TEEwKaIA.png" /><figcaption>Image by Author</figcaption></figure><p>This model is taking 0.51 * lag_7 + 0.45 * lag_14.</p><h4>Forecasting</h4><p>Great. We have our trained model. How can we compute the forecast for the next 14 days? Machine learning models take a feature matrix <em>X</em> as input and output the predicted values <em>y</em>. So we need to create the feature matrix <em>X</em> for the next 14 days and give it to our model.</p><p>If we want to get the <em>lag-7</em> for the next day, following the training set, we can just get the value in the 7th position starting from the end. The <em>lag-7</em> two days after the end of the training set would be the value in the 6th position starting from the end and so on. 
Similarly for the <em>lag-14</em>.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/2782d40a6f69cf231e5b17d10110aeb3/href">https://medium.com/media/2782d40a6f69cf231e5b17d10110aeb3/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/892/1*XKCOoXwHDHfRVZoPZ9OrIg.png" /><figcaption>Image by Author</figcaption></figure><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/d4b774475b6b16d52afdeed0bf0f1a7d/href">https://medium.com/media/d4b774475b6b16d52afdeed0bf0f1a7d/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/893/1*k_5hRfUuxDyYouKv5N4gNA.png" /><figcaption>Image by Author</figcaption></figure><p>As you may have noticed we can only get 7 of the <em>lag-7</em> values from our history and we can get all 14 values for the <em>lag-14</em>. With this information we can only forecast the next 7 days, so we’ll only take the first 7 values of the <em>lag-14</em>.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/cc85cd4de72a32da8944006c5760a30c/href">https://medium.com/media/cc85cd4de72a32da8944006c5760a30c/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/891/1*t3a3BEereO7hgfwr41F4MQ.png" /><figcaption>Image by Author</figcaption></figure><p>With these features, we can compute the forecasts for the next 7 days.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/095ddc8417c424550204bf7d2ab6e1a6/href">https://medium.com/media/095ddc8417c424550204bf7d2ab6e1a6/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/891/1*ntbWdxo_smaiVdFVR3YFVQ.png" /><figcaption>Image by Author</figcaption></figure><p>These values can be interpreted as the values of our series for the next 7 days following the last training date. 
In order to compute the forecasts following that date, we can use these values as if they were the values of our series and use them as <em>lag-7</em> for the following periods.</p><p>In other words, we can fill the rest of our features matrix with these values and the real values of the <em>lag-14</em>.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/5d538bb07163047b44ffea3bb27c061a/href">https://medium.com/media/5d538bb07163047b44ffea3bb27c061a/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/896/1*W_G41I8nwjWTmv0yWkKY1Q.png" /><figcaption>Image by Author</figcaption></figure><p>As you can see we’re still using the real values of the <em>lag-14</em> and we’ve plugged in our predictions as the values for the <em>lag-7</em>. We can now use these features to predict the remaining 7 days.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/1d3c10223e502ec35a667151e2aa5a0d/href">https://medium.com/media/1d3c10223e502ec35a667151e2aa5a0d/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/894/1*JiaLHtmFGdKvaBE9erESzA.png" /><figcaption>Image by Author</figcaption></figure><p>And now we have our forecasts for the next 14 days! This wasn’t that painful but it wasn’t pretty or easy either. And we just used lags which are the easiest feature we can have.</p><p>What if we had used <em>lag-1</em>? We would have needed to do this predict-update step 14 times!</p><p>And what if we had more elaborate features like the rolling mean over some lag? As you can imagine it can get quite messy and is very error prone.</p><h4>mlforecast</h4><p>With these problems in mind, we created <a href="https://github.com/Nixtla/mlforecast">mlforecast</a>, which is a framework to help you forecast time series using machine learning models. It takes care of all these messy details for you. 
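</p><p>To make the predict-update cycle described above concrete, here is a minimal sketch of a recursive forecast for a linear model on lag features. It is not mlforecast’s implementation: the function <em>recursive_forecast</em>, its arguments and the toy series are hypothetical names chosen for this illustration, and it updates the features one step at a time, which is what any lag shorter than the horizon would require.</p><pre>
```python
import numpy as np

def recursive_forecast(history, coefs, lags, horizon):
    """Forecast `horizon` steps ahead, feeding predictions back in as lags."""
    values = [float(v) for v in history]  # needs at least max(lags) values
    weights = np.asarray(coefs, dtype=float)
    preds = []
    for _ in range(horizon):
        # Features come from the most recent values, real or predicted
        features = np.array([values[-lag] for lag in lags])
        y_hat = float(features @ weights)
        preds.append(y_hat)
        # The prediction is treated as the next value of the series
        values.append(y_hat)
    return np.array(preds)

# Toy series with a perfect weekly pattern: 0, 1, ..., 6, 0, 1, ...
history = np.arange(28) % 7
preds = recursive_forecast(history, coefs=[0.5, 0.5], lags=[7, 14], horizon=14)
print(preds)
```
</pre><p>Even in this small form there is bookkeeping to get right (which values are real, which are predicted, and how far back each lag reaches), and it grows quickly once rolling statistics or many series are involved.</p><p>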
You just need to give it a model and define which features you want to use and let <em>mlforecast</em> do the rest.</p><p><strong>mlforecast</strong> is available in <a href="https://pypi.org/project/mlforecast/">PyPI</a> (pip install mlforecast) as well as <a href="https://anaconda.org/conda-forge/mlforecast">conda-forge</a> (conda install -c conda-forge mlforecast).</p><p>The previously described problem can be solved using <strong>mlforecast</strong> with the following code.</p><p>First, we have to set up our data in the required format.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/659277045bb90865b85e8e2bd7e34d63/href">https://medium.com/media/659277045bb90865b85e8e2bd7e34d63/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/898/1*vhNkbtEVUgwYj_0lnKEP6A.png" /><figcaption>Image by Author</figcaption></figure><p>This is the required input format.</p><ul><li>an index named <strong>unique_id</strong> that identifies each time series. In this case we only have one but you can have as many as you want.</li><li>a <strong>ds</strong> column with the dates.</li><li>a <strong>y</strong> column with the values.</li></ul><p>Now we’ll import the <a href="https://nixtla.github.io/mlforecast/core.html#TimeSeries">TimeSeries</a> transformer, where we define the features that we want to use. 
We’ll also import the <a href="https://nixtla.github.io/mlforecast/forecast.html#Forecast">Forecast</a> class, which will hold our transformer and model and will run the forecasting pipeline for us.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/f748b230463a621a3ae906209bed9538/href">https://medium.com/media/f748b230463a621a3ae906209bed9538/href</a></iframe><p>We initialize our transformer specifying the lags that we want to use.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/8616b1f5d86cebbfeac6dd293a202188/href">https://medium.com/media/8616b1f5d86cebbfeac6dd293a202188/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/895/1*220R5J6Q1v9HL3l_zLExkQ.png" /><figcaption>Image by Author</figcaption></figure><p>As you can see this transformer will use <em>lag-7</em> and <em>lag-14</em> as features. Now we define our model.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/5d4393834ae67f11fe38e14ae43e3734/href">https://medium.com/media/5d4393834ae67f11fe38e14ae43e3734/href</a></iframe><p>We create a <a href="https://nixtla.github.io/mlforecast/forecast.html">Forecast</a> object with the model and the time series transformer and fit it to our data.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/fe2095d7cbed28f97e95deb4afa11f99/href">https://medium.com/media/fe2095d7cbed28f97e95deb4afa11f99/href</a></iframe><p>And now we just call predict with the forecast horizon that we want.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/3c2e60e92ea5a1c6f8461203ccc925b3/href">https://medium.com/media/3c2e60e92ea5a1c6f8461203ccc925b3/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/890/1*2aa4guARpt5vH57giMI73g.png" /><figcaption>Image by 
Author</figcaption></figure><p>This was a lot easier and internally this did the same as we did before. Let&#39;s verify real quick.</p><p>Check that we got the same predictions:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/ed193d71de1fc229513900145810a034/href">https://medium.com/media/ed193d71de1fc229513900145810a034/href</a></iframe><p>Check that we got the same model:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/6667dd17e051f09c945445a1ac72bf7d/href">https://medium.com/media/6667dd17e051f09c945445a1ac72bf7d/href</a></iframe><h4>Experiments made easier</h4><p>Having this high-level abstraction allows us to focus on defining the best features and model instead of worrying about implementation details. For example, we can try out different lags very easily by writing a simple function that leverages <em>mlforecast</em>:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/77f72fc5ea5b44aa075aa219eebc6834/href">https://medium.com/media/77f72fc5ea5b44aa075aa219eebc6834/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/3ec14fece6a3e13b338d42352ff81fc7/href">https://medium.com/media/3ec14fece6a3e13b338d42352ff81fc7/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/893/1*PJqGfDJk40Biu6NHlZQu_w.png" /><figcaption>Image by Author</figcaption></figure><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/00a176b2e6684de6fd88f6318ccf05b5/href">https://medium.com/media/00a176b2e6684de6fd88f6318ccf05b5/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/893/1*BvsdT81OOtKOD7EwDZembQ.png" /><figcaption>Image by Author</figcaption></figure><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a 
href="https://medium.com/media/6e37af47cc7033e59ca7318e9c35225e/href">https://medium.com/media/6e37af47cc7033e59ca7318e9c35225e/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/891/1*K_tD7ig7R4RTBZ5ofxwkRw.png" /><figcaption>Image by Author</figcaption></figure><h4>Backtesting</h4><p>In the previous examples, we manually split our data. The <strong>Forecast</strong> object also has a <a href="https://nixtla.github.io/mlforecast/forecast.html#Backtesting">backtest</a> method that can do that for us.</p><p>We’ll first get all of our data into the required format.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/305029b997747752cab6c71deb1c11b4/href">https://medium.com/media/305029b997747752cab6c71deb1c11b4/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/892/1*wNXiiS4OBMFFUzXJqxhrmQ.png" /><figcaption>Image by Author</figcaption></figure><p>Now we instantiate a Forecast object as we did previously and call the backtest method instead.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/133f780f43f3301263aefe99317a2fbb/href">https://medium.com/media/133f780f43f3301263aefe99317a2fbb/href</a></iframe><p>This returns a generator with the results for each window.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/4a48d3d6c3baac5c64a6e10ccd17b586/href">https://medium.com/media/4a48d3d6c3baac5c64a6e10ccd17b586/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/892/1*KrBp9SWEPtFf7CNe7b60qw.png" /><figcaption>Image by Author</figcaption></figure><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/2e91d3048cdf1f83d8d5f0892b3025cc/href">https://medium.com/media/2e91d3048cdf1f83d8d5f0892b3025cc/href</a></iframe><figure><img alt="" 
src="https://cdn-images-1.medium.com/max/888/1*0pi1T8d1cKRvKydDMTtHaw.png" /><figcaption>Image by Author</figcaption></figure><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/461b66e765e95714e72aa052c231761f/href">https://medium.com/media/461b66e765e95714e72aa052c231761f/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/889/1*dNNp6TvIsKNZ5zyv-tB_Rw.png" /><figcaption>Image by Author</figcaption></figure><p>result2 here is the same as the evaluation we did manually.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/426c111e6aca2727e124baaf63cbb436/href">https://medium.com/media/426c111e6aca2727e124baaf63cbb436/href</a></iframe><p>We can define a validation scheme for different lags using several windows.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/8858c36bad07e8e0fa01caf4a03f6849/href">https://medium.com/media/8858c36bad07e8e0fa01caf4a03f6849/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/013b733f68c360f67e49891e68da72d7/href">https://medium.com/media/013b733f68c360f67e49891e68da72d7/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/887/1*-bPHZYasVxMCxGKQFRughw.png" /><figcaption>Image by Author</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/proxy/1*_O746H32Vw7iRBAxoatCcA.png" /><figcaption>Image by Author</figcaption></figure><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/8faa284da73e88688a2a9024427c8995/href">https://medium.com/media/8faa284da73e88688a2a9024427c8995/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/891/1*UbxVkXVOLHLrCOVh2xo55g.png" /><figcaption>Image by Author</figcaption></figure><figure><img alt="" 
src="https://cdn-images-1.medium.com/proxy/1*1eB3pAl_fVnm_tituRXRqA.png" /><figcaption>Image by Author</figcaption></figure><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/450bdb60474c2321d871ba68f14f26d7/href">https://medium.com/media/450bdb60474c2321d871ba68f14f26d7/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/893/1*db_J6fD5u5BBQecN3dgUJw.png" /><figcaption>Image by Author</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/proxy/1*AkHuTf39Ecft0CHnd5X8kQ.png" /><figcaption>Image by Author</figcaption></figure><h4>Lag transformations</h4><p>We can specify transformations on the lags as well as just lags. The <a href="https://github.com/jose-moralez/window_ops">window_ops</a> library has some implementations of different window functions. You can also define your own transformations.</p><p>Let’s try a seasonal rolling mean, this takes the average over the last n seasons, in this case, it would be the average of the last n Mondays, Tuesdays, etc. Computing the updates for this feature would probably be a bit annoying, however, using this framework we can just pass it to <em>lag_transforms</em>. 
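</p><p>To make the idea concrete, here is a minimal sketch of a seasonal rolling mean in plain Python. It is an illustration only, not the <em>window_ops</em> implementation: the function name <em>seasonal_rolling_mean_sketch</em> and the made-up daily data are assumptions, and it ignores options such as min_samples. It assumes the input has already been lagged, so each output averages the value at a position with the values one, two, ... seasons before it.</p>
<pre>
```python
import math


def seasonal_rolling_mean_sketch(values, season_length, window_size):
    """Average each value with the values 1, 2, ... seasons before it.

    Positions that do not yet have window_size same-season observations
    are left as NaN. Hypothetical helper for illustration only.
    """
    out = [math.nan] * len(values)
    first_full = season_length * (window_size - 1)
    for i in range(first_full, len(values)):
        same_season = [values[i - k * season_length] for k in range(window_size)]
        out[i] = sum(same_season) / window_size
    return out


# Four weeks of made-up daily data; positions 0, 7, 14 and 21 share a weekday.
daily = [float(day) for day in range(28)]
feature = seasonal_rolling_mean_sketch(daily, season_length=7, window_size=2)
```
</pre>
<p>With season_length=7 and window_size=2, every position from the second week onward holds the average of that day and the same weekday one week earlier; the first week stays NaN because it has no earlier season to average with.</p><p>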
If the transformations take additional arguments (additional to the values of the series) we specify a tuple like (transform_function, arg1, arg2), which in this case are season_length and window_size.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/7c6ee3ba5e6795e5f6096e61adbf2a3a/href">https://medium.com/media/7c6ee3ba5e6795e5f6096e61adbf2a3a/href</a></iframe><pre>help(seasonal_rolling_mean)</pre><pre>Help on CPUDispatcher in module window_ops.rolling:<br><br>seasonal_rolling_mean(input_array: numpy.ndarray, season_length: int, window_size: int, min_samples: Union[int, NoneType] = None) -&gt; numpy.ndarray<br>    Compute the seasonal_rolling_mean over the last non-na window_size samples of the<br>    input array starting at min_samples.</pre><p><em>lag_transforms</em> takes a dictionary where the keys are the lags that we want to apply the transformations to and the values are the transformations themselves.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/3867c848f0df6a7e43f8b5bf95e3bac6/href">https://medium.com/media/3867c848f0df6a7e43f8b5bf95e3bac6/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/888/1*rEh33GcCPs5aTpQYpSw9Ew.png" /><figcaption>Image by Author</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/proxy/1*Cz8l2uA8HdfIw1oiiLDszA.png" /><figcaption>Image by Author</figcaption></figure><h4>Date features</h4><p>You can also specify date features to be computed, which are attributes of the <strong>ds</strong> column and are updated in each time step as well. 
In this example, the best model would be taking the average over each day of the week, which can be accomplished by doing one-hot encoding on the day of the week column and fitting a linear model.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/98ea605dd3ec1e7e27c3ff296475d457/href">https://medium.com/media/98ea605dd3ec1e7e27c3ff296475d457/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/895/1*GbCsgVPppo-rJJBhENVvGg.png" /><figcaption>Image by Author</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/proxy/1*q_jGr973Tgi5TiRrJj1XSg.png" /><figcaption>Image by Author</figcaption></figure><h4>Next steps</h4><p><strong>mlforecast</strong> has more features like <a href="https://nixtla.github.io/mlforecast/distributed.forecast.html#Example">distributed training</a> and a <a href="https://nixtla.github.io/mlforecast/cli.html#Example">CLI</a>. If you’re interested you can learn more in the following resources:</p><ul><li>GitHub repo: <a href="https://github.com/Nixtla/mlforecast">https://github.com/Nixtla/mlforecast</a></li><li>Documentation: <a href="https://nixtla.github.io/mlforecast/">https://nixtla.github.io/mlforecast/</a></li><li>Example using mlforecast in the M5 competition: <a href="https://www.kaggle.com/lemuz90/m5-mlforecast">https://www.kaggle.com/lemuz90/m5-mlforecast</a></li></ul><hr><p><a href="https://medium.com/data-science/forecasting-with-machine-learning-models-95a6b6579090">Forecasting with Machine Learning Models</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Automated Time Series Forecasting Pipeline Faster and More Accurate than Amazon Forecast]]></title>
            <link>https://aws.plainenglish.io/automated-time-series-forecasting-pipeline-662e0feadd98?source=rss-2855bd3e0293------2</link>
            <guid isPermaLink="false">https://medium.com/p/662e0feadd98</guid>
            <category><![CDATA[software-development]]></category>
            <category><![CDATA[programming]]></category>
            <category><![CDATA[aws]]></category>
            <category><![CDATA[cloud-computing]]></category>
            <category><![CDATA[cloud]]></category>
            <dc:creator><![CDATA[azul garza ramirez]]></dc:creator>
            <pubDate>Fri, 15 Oct 2021 18:44:07 GMT</pubDate>
            <atom:updated>2022-10-20T20:33:32.684Z</atom:updated>
            <content:encoded><![CDATA[<h3>Automated Time Series Forecasting Pipeline: Faster and More Accurate than Amazon Forecast</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/575/0*tbEOBonS9J28AC2h.png" /></figure><blockquote>TLDR: We built a <a href="https://github.com/Nixtla/nixtla">fully open-source time-series pipeline</a> that ranks in the top 1% of the M5 competition, performing 25% better than Amazon Forecast in less than an hour and 20% better than fbprophet. To test the production version, write to <a href="mailto:federico@nixtla.io">federico@nixtla.io</a>.</blockquote><p><em>By Nixtla Team. </em><a href="https://medium.com/u/2855bd3e0293"><em>fede garza ramírez</em></a>, <a href="https://medium.com/u/76b639655285">Max Mergenthaler</a></p><p>Time series forecasting is an exciting field for Machine Learning. Its applications can be found everywhere, ranging from inventory management and financial predictions to healthcare analytics.</p><p>In contrast with other Machine Learning tasks that treat their slowly changing datasets like constants over time and only pay attention to these changes when they are no longer negligible, a time series dataset explicitly accounts for time within its structure. 
This time dimension imposes a structure and constraints on the datasets, making the ML model life cycle faster.</p><p>If a time series forecasting model is deployed in production, several steps need to be addressed:</p><ul><li><strong>Data Ingestion:</strong> to communicate the data to powerful and fast computing services.</li><li><strong>Data Preprocessing:</strong> to clean the data by removing outliers and filling missing observations, and to enhance the data with time-series features, like auto-regressors, statistical summaries, and other variables like calendar variables and holidays.</li><li><strong>Model Training:</strong> to select from well-performing models and statistical benchmarks.</li><li><strong>Hyperparameter Selection:</strong> to find models capable of generalization with good prediction performance.</li><li><strong>Model Deployment:</strong> to evaluate and make the predictions available to the users.</li></ul><h3>Time Series Pipeline Automation</h3><p>All of these steps are challenging and time-consuming. Automating them can help data scientists save time and apply their skills to discovering, creating, and building.</p><p>At Nixtla, we have developed an end-to-end forecasting pipeline throughout our projects that includes sklearn, LightGBM, and, in general, any model with “fit” and “predict” methods. It is an out-of-the-box solution for developers and can be integrated with other pipelines.</p><p>With our solution, any data scientist or developer can set up their forecasting service on AWS by following the instructions in the <a href="https://github.com/Nixtla/nixtla">repository</a>. 
Or, if you prefer, you can ask us for free trial keys to test the solution on Nixtla’s infrastructure (just send an email to <a href="mailto:federico@nixtla.io">federico@nixtla.io</a> or open a GitHub issue).</p><p>At Nixtla we strongly believe in open source, so we have released all the necessary code for anyone to set up their time-series processing service in the cloud (using AWS). That same repository uses continuous integration and deployment to deploy the APIs on our infrastructure.</p><p>If you want to deploy Nixtla on your AWS Cloud, you will need:</p><ul><li>API Gateway (to handle API calls).</li><li>Lambda (or some computational unit).</li><li>SageMaker (or some bigger computational unit).</li><li>ECR (to store Docker images).</li><li>S3 (for inputs and outputs).</li></ul><p>You will end up with an architecture that looks like the following diagram:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*NTh5ZvYILQwtL1Yx" /></figure><p>Each call to the API executes a particular Lambda function depending on the endpoint. That Lambda function starts a SageMaker job using a predefined instance type. Finally, SageMaker reads the input data from S3 and writes the processed data to S3, using a predefined Docker image stored in ECR.</p><h3>Forecasting Pipeline as a Service</h3><p>Our forecasting pipeline is modular and built upon simple APIs:</p><h4>1. tspreprocess</h4><p>Time series usually contain missing values. This is the case for sales data where only the events that happened are recorded. In these cases it is convenient to balance the panel, i.e., to add the missing observations explicitly, so that the value of future sales can be determined correctly.</p><p>The <a href="https://github.com/Nixtla/nixtla/tree/main/tspreprocess">tspreprocess</a> API allows you to do this quickly and easily. 
In addition, it allows one-hot encoding of static variables (specific to each time series, such as the product family in case of sales) automatically.</p><h4><strong><em>2. tsfeatures</em></strong></h4><p>It is usually good practice to create features of the target variable so that they can be consumed by machine learning models. This API allows users to create features at the time series level (or static features) and also at the temporal level.</p><p>The <a href="https://github.com/Nixtla/nixtla/tree/main/tsfeatures">tsfeatures</a> API is based on the tsfeatures <a href="https://github.com/Nixtla/tsfeatures">library</a> also developed by the Nixtla team (inspired by the <a href="https://github.com/robjhyndman/tsfeatures">R package tsfeatures</a>) and the tsfresh <a href="https://github.com/blue-yonder/tsfresh">library</a>.</p><p>With this API the user can also generate holiday variables. Just enter the country of the special dates or a file with the specific dates and the API will return dummy variables of those dates for each observation in the dataset.</p><h4><strong><em>3. tsforecast</em></strong></h4><p>The <a href="https://github.com/Nixtla/nixtla/tree/main/tsforecast">tsforecast</a> API is responsible for generating the time series forecasts. It receives as input the target data and can also receive static variables and time variables. At the moment, the API uses the mlforecast <a href="https://github.com/Nixtla/mlforecast">library</a> developed by the Nixtla team using LightGBM as a model.</p><p>In future iterations, the user will be able to choose different Deep Learning models based on the nixtlats <a href="https://github.com/Nixtla/nixtlats">library</a> developed by the Nixtla team.</p><h4><strong><em>4. tsbenchmarks</em></strong></h4><p>The <a href="https://github.com/Nixtla/nixtla/tree/main/tsbenchmarks">tsbenchmarks</a> API is designed to easily compare the performance of models based on time series competition datasets. 
In particular, the API offers the possibility to evaluate forecasts of any frequency of the <a href="https://www.sciencedirect.com/science/article/pii/S0169207019301128">M4 competition</a> and also of the <a href="https://mofc.unic.ac.cy/m5-competition/">M5 competition</a>.</p><p>These APIs, written in Python, can be consumed through an <a href="https://github.com/Nixtla/nixtla/tree/main/sdk/python-autotimeseries">SDK</a> also written in Python. The following diagram summarizes the structure of our pipeline:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*MhsmHdXwPJ8WdPYN" /></figure><h3><strong>Data Format</strong></h3><p>Nixtla’s infrastructure is built to receive the same data structure throughout the entire pipeline.</p><h4><strong>1. Target Data</strong></h4><p>The target data must contain three columns: the identifier of each of the time series, the column that identifies the time of the observation, and the column of the target variable. In other words, it must be a time series panel in long format.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/312/0*skOb5yBam3S3qZ7I" /></figure><h4><strong>2. Static Data</strong></h4><p>Static data, i.e., data that stay constant over time for each time series, must contain the identifier of each time series and the static variables to be considered:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/530/0*4yCECsSBHIgF6KMs" /></figure><h4><strong>3. Temporal Data</strong></h4><p>Like the target data, the exogenous temporal data must have an identifier for each of the time series, the time identifier, and the exogenous variables to be considered. 
Additionally, this dataset must contain the exogenous variables of the time-period to be forecasted:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/496/0*sTXoTEX_IOQ0vcV7" /></figure><h3>Proof of Concept: Large Online Retail Dataset Example</h3><p>This section demonstrates how the APIs can be integrated into an end-to-end forecasting pipeline on one large retail dataset. We compare Nixtla’s performance against the top solutions of the competition and also with Amazon Forecast, the AutoML solution for time series forecasting developed by AWS. Nixtla’s solution ranks in the top 1% of the competition without much effort. You can reproduce the top 1% result directly in <a href="https://colab.research.google.com/drive/1pmp4rqiwiPL-ambxTrJGBiNMS-7vm3v6?ts=616700c4">Colab</a>.</p><h4><strong>M5 Competition</strong></h4><p>The M5 competition is composed of Walmart’s daily sales for stores in three states in the United States. The dataset includes department, product categories, and store details. A full description of the competition can be found <a href="https://www.kaggle.com/c/m5-forecasting-accuracy">here</a>.</p><p>As a benchmark, we use <a href="https://pypi.org/project/fbprophet/">fbprophet</a>. For this, we ran the parallelized solution on an AWS EC2 instance of type c5d.24xlarge (96 cores, 185GB RAM). Reproduction of these results can be found <a href="https://github.com/Nixtla/nixtla/tree/main/sdk/python-autotimeseries/examples/m5/benchmarks/fbprophet">here</a>.</p><p>We also ran the AWS AutoML solution called <a href="https://aws.amazon.com/forecast">Amazon Forecast</a>. We used the same data as in our solution. Our solution offers the following advantages:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/567/1*YJdKBx3GV3sqbY4MHvztfA.png" /></figure><ul><li>Usability. 
Amazon Forecast through the AWS console requires uploading the data in CSV format directly to S3, which makes it complex and time-consuming due to the size of the datasets.</li><li>Speed. Amazon Forecast took approximately 4 hours to run the entire forecast, compared to at most 1 hour for our solution.</li><li>Performance. Our solution reaches the top 1% of the competition while Amazon Forecast is far behind.</li><li>Open-source. The user knows exactly the <a href="https://github.com/Nixtla/nixtla/blob/main/tsforecast/forecast/make_forecast.py">code</a> that is running through the API, in contrast to Amazon Forecast, where the best performing model is known but not the code behind it.</li></ul><h4>Usability</h4><p>To use our solution you just need to install the <a href="https://pypi.org/project/autotimeseries/">library autotimeseries</a> from PyPI as follows:</p><pre>pip install autotimeseries</pre><p>Import the library and add the keys:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/ca9ec3c9d3d1dbd30dca48641ac18790/href">https://medium.com/media/ca9ec3c9d3d1dbd30dca48641ac18790/href</a></iframe><p>The <em>AutoTS</em> class wraps all the APIs for building a simple pipeline. To instantiate it, define the credentials and the bucket name on S3 where the data will be uploaded.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/95e7b219b01396769ab2f61eb94bbfa9/href">https://medium.com/media/95e7b219b01396769ab2f61eb94bbfa9/href</a></iframe><p>First, upload the data in CSV or Parquet format to S3:</p><ul><li>target: time-series variable of interest. Must have three columns: unique_id, datestamp and value.</li><li>static: exogenous static features for each unique_id. Must have unique_id and features in columns.</li><li>temporal: exogenous temporal features. 
Must have unique_id, datestamp, and values for each feature.</li><li>calendar-holidays: dictionary with holiday name and dates with occurrences.</li></ul><p>The data for this example was generated with the src.upload_data script available <a href="https://github.com/Nixtla/autotimeseries/tree/main/examples/m5">here</a>. We recommend using Parquet format to reduce the size of files and uploading time.</p><p>See below:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/8a3cc420f38a7babe5fb1a20ddaef6a9/href">https://medium.com/media/8a3cc420f38a7babe5fb1a20ddaef6a9/href</a></iframe><p>Specify the names for unique_id_column, ds_column, and y_column on the target file.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/641f7fa4fd9f3cb56b1a880dc577e6aa/href">https://medium.com/media/641f7fa4fd9f3cb56b1a880dc577e6aa/href</a></iframe><p>Additional features can boost the performance of models significantly. They allow the model to incorporate exogenous events, such as holidays, which drastically affect the target time series. The AutoTS features module automatically generates temporal and calendar features.</p><p>In this example, we use the calendartsfeatures() method to create calendar features specific to the US:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/bc276c73866ff99113eacc8663c0d748/href">https://medium.com/media/bc276c73866ff99113eacc8663c0d748/href</a></iframe><p>To run the forecasts, simply call the tsforecast() method. This method initiates a SageMaker job to train the AutoTS model and produce the forecasts, starting after the last date of the training data in the target dataset. 
The forecast horizon of the M5 competition is 28 days, which we can specify with the horizon parameter.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/b71e4e910706f434db2f119749986e2f/href">https://medium.com/media/b71e4e910706f434db2f119749986e2f/href</a></iframe><h4>Forecasting Performance</h4><p>We measure the performance of our pipeline’s point predictions with the competition’s evaluation metric, the Weighted Root Mean Squared Scaled Error (WRMSSE), shown below:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/219/0*RD5gtqQbxi7HS8Iw" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/187/0*O3r1ZyUpXdG4c2Uh" /></figure><p>The results were also computed by uploading a late submission to the official evaluator. As can be seen in the following table, Nixtla’s forecasts perform better than the 50th-place entry. This puts Nixtla in the top 1% of the competition with a processing time of less than one hour:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/213/1*tLs16F5TMDfBlz_n7r1g4Q.png" /></figure><p>The following are examples of the forecasts generated by the Nixtla pipeline:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/424/0*oxQ3KsxR0OxFPFVG" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/424/0*lIB0xD5t_2NAlhuN" /></figure><h4>Computational Performance</h4><p>We also measured the computational performance of our solution against the AutoML solution provided by AWS. Amazon Forecast took about 4 times as long as our solution.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/217/1*5DBtSv1p-pmWtqN9MgNIxg.png" /></figure><h3>Summary</h3><p>We introduced the problem of automation of time series forecasting and showed how Nixtla’s open-source APIs can build robust forecasting pipelines with little effort. 
We showed how the current version of the forecasting pipeline achieves accuracy within the top 1% of the M5 submissions in less than an hour.</p><h3>Contact us</h3><p>We are looking for people to help us build and validate Nixtla, so please reach out to us if:</p><ul><li>You have feedback or want to talk about forecasting.</li><li>You want to be part of the private beta of our fully hosted solutions.</li><li>You are interested in using Nixtla at your company.</li></ul><p><strong>Mail: </strong><a href="mailto:federico@nixtla.io">federico@nixtla.io</a></p><p><strong>Whatsapp</strong>: Scan the QR code. :)</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/300/0*HEX3kBtZ038aB0d6" /></figure><h3>Contribute</h3><ul><li>Report errors and request features by adding Issues on GitHub</li><li>Contribute to the codebase directly on GitHub!</li><li>Nixtla ecosystem: <a href="https://github.com/Nixtla">https://github.com/Nixtla</a>.</li></ul><h3>Nixtla Team</h3><ul><li><a href="https://github.com/kdgutier/">Kin Gutiérrez</a>.</li><li><a href="https://github.com/mergenthaler">Max Mergenthaler</a>.</li><li><a href="https://github.com/cristianchallu">Cristian Challú</a>.</li><li><a href="https://github.com/FedericoGarza">Federico Garza</a>.</li></ul><p><em>More content at </em><a href="http://plainenglish.io/"><em>plainenglish.io</em></a></p><hr><p><a href="https://aws.plainenglish.io/automated-time-series-forecasting-pipeline-662e0feadd98">Automated Time Series Forecasting Pipeline Faster and More Accurate than Amazon Forecast</a> was originally published in <a href="https://aws.plainenglish.io">AWS in Plain English</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Forecasting in Python with ESRNN model]]></title>
            <link>https://medium.com/analytics-vidhya/forecasting-in-python-with-esrnn-model-75f7fae1d242?source=rss-2855bd3e0293------2</link>
            <guid isPermaLink="false">https://medium.com/p/75f7fae1d242</guid>
            <category><![CDATA[pytorch]]></category>
            <category><![CDATA[deep-learning]]></category>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[time-series-forecasting]]></category>
            <category><![CDATA[python]]></category>
            <dc:creator><![CDATA[azul garza ramirez]]></dc:creator>
            <pubDate>Tue, 16 Jun 2020 22:53:21 GMT</pubDate>
            <atom:updated>2020-06-19T13:37:45.252Z</atom:updated>
            <content:encoded><![CDATA[<h3><strong>M4 Competition and Background</strong></h3><p>Deep Learning algorithms enjoy success in a variety of tasks ranging from image classification to natural language processing; their use in time series forecasting has also begun to spread. In the recent M4 major forecasting competition, a novel multivariate hybrid ML (Deep Learning) and time series model called Exponential Smoothing Recurrent Neural Network (ESRNN) won by a large margin over baselines and complex time series ensembles.</p><p>In this post, we introduce the model and show its use with a PyTorch implementation that achieves state-of-the-art performance on the M4 competition:</p><ul><li>The GPU implementation achieves a 300x speedup over the original Smyl model, written in C++ using the DyNet library.</li><li>The model can be easily used on new (non-M4) data, since our class was built to resemble scikit-learn models, with fit and predict methods.</li></ul><p>For anyone interested in exploring the model further, the package is available at <a href="https://pypi.org/project/ESRNN/">https://pypi.org/project/ESRNN/</a> and on the following GitHub page: <a href="https://github.com/kdgutier/esrnn_torch">https://github.com/kdgutier/esrnn_torch</a>.</p><h4>Model</h4><p>The premise of this model is simple and yet intuitive and appealing. The model cleverly combines the classic Exponential Smoothing model (ES) and a Recurrent Neural Network (RNN). The ES decomposes the time series into level, trend and seasonality components. The RNN is trained with all the series, has shared parameters, and is used to learn common local trends among the series, while the ES parameters are specific to each time series. 
The models are combined by including the output of the RNN as the local trend component in the ES model.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*E-9obpPKnIRRAjZu9drfMg.png" /></figure><p>One main challenge of this idea is that local trends are not directly observed. Also, for the output of the RNN to be meaningful the trends must be comparable between series. The model addresses this by normalizing and deseasonalizing the series using the ES decomposition. This preprocessing is then an integral part of the algorithm instead of taking place before the training process. Another advantage of the RNN is that it allows for exogenous variables, which in the M4 example correspond to dummies of the category.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*3ZUuCsrn_zqAPqucXxqs1A.png" /></figure><p>Regarding the architecture of the RNN, Smyl proposed to use different architectures depending on the frequency of the data. The basic architecture is a dilated RNN with LSTM cells; this allows the RNN to reduce the number of parameters while stacking more layers. For series without obvious seasonality, such as the yearly data, an attention layer is added. More information on these architectures can be found in the references.</p><h4>Loss function</h4><p>The ESRNN model optimizes over two losses. The first is the quantile loss, whose minimizer is the chosen quantile of the target variable. The second is a penalty on the variance or wiggliness of the predictions, which acts as a regularizer. The quantile loss is given by:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/664/1*8pW4iL3VROjHFQx4aXnmyA.png" /></figure><p>The quantile loss makes the model predict the conditional quantiles of the target distribution; it is robust and does not make distributional assumptions. 
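</p><p>As a rough illustration of the loss described above, here is a minimal sketch of the standard quantile (pinball) loss in plain Python. It is not taken from the ESRNN package; the function name <em>quantile_loss</em>, the parameter name <em>tau</em>, and the sample numbers are assumptions made for this example.</p>
<pre>
```python
def quantile_loss(y_true, y_pred, tau=0.5):
    """Average pinball (quantile) loss of y_pred against y_true at level tau."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        diff = y - y_hat
        # Under-forecasts (positive diff) cost tau per unit of error,
        # over-forecasts (negative diff) cost (1 - tau) per unit of error.
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(y_true)


# Made-up values: this forecast never exceeds the actual values.
actual = [10.0, 12.0, 9.0]
forecast = [8.0, 12.0, 8.0]
median_loss = quantile_loss(actual, forecast, tau=0.5)
upper_loss = quantile_loss(actual, forecast, tau=0.9)
```
</pre>
<p>Because this made-up forecast only under-predicts, the loss is larger at tau = 0.9 than at tau = 0.5, so a model trained at the higher quantile is pushed toward higher predictions.</p><p>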
Usually the model is trained to fit the median, but in case the model consistently underestimates or overestimates the target values, the quantile can be changed accordingly.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/408/1*cpuzTgRZRjjozF4R6UyGDg.png" /></figure><h3>Example on M4 data</h3><h4>Usage Example</h4><p>The library can be installed from the <a href="https://pypi.org/project/ESRNN/">python package index</a> with:</p><pre>pip install ESRNN</pre><p>The library also includes some utilities that allow us to easily experiment with the model. The prepare_m4_data function allows us to obtain data from the M4 competition, so it can be easily used with the model. In particular, it returns predictions from the Naive2 model; these predictions can be used to evaluate each iteration of the ESRNN through the Overall Weighted Average. Here we are obtaining the 414 hourly time series of the M4 data, which are stored in the &#39;./data&#39; folder:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/0368d9eee4a9217a670ac8925d228aea/href">https://medium.com/media/0368d9eee4a9217a670ac8925d228aea/href</a></iframe><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/e1d7995519f856ff97da86f751829ab5/href">https://medium.com/media/e1d7995519f856ff97da86f751829ab5/href</a></iframe><pre>Successfully downloaded M4-info.csv 4335598 bytes.<br>Successfully downloaded Train/Daily-train.csv 95765153 bytes.<br>Successfully downloaded Train/Hourly-train.csv 2347115 bytes.<br>Successfully downloaded Train/Monthly-train.csv 91655432 bytes.<br>Successfully downloaded Train/Quarterly-train.csv 38788547 bytes.<br>Successfully downloaded Train/Weekly-train.csv 4015067 bytes.<br>Successfully downloaded Train/Yearly-train.csv 25355736 bytes.<br>Successfully downloaded Test/Daily-test.csv 576459 bytes.<br>Successfully downloaded Test/Hourly-test.csv 132820 
bytes.<br>Successfully downloaded Test/Monthly-test.csv 7942698 bytes.<br>Successfully downloaded Test/Quarterly-test.csv 1971754 bytes.<br>Successfully downloaded Test/Weekly-test.csv 44247 bytes.<br>Successfully downloaded Test/Yearly-test.csv 1486434 bytes.<br><br><br>Preparing Hourly dataset<br>Preparing Naive2 Hourly dataset predictions</pre><p>The model is built to function similarly to scikit-learn models. It is instantiated as follows (for a detailed description of the parameters, see the documentation):</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/9173a0789893138b891f1539dfd12d4e/href">https://medium.com/media/9173a0789893138b891f1539dfd12d4e/href</a></iframe><p>The model is trained with the fit method. This method receives the X_df and y_df training pandas dataframes in long format. Optionally, it also receives X_test_df and y_test_df; if this test set is passed, the method computes the out-of-sample losses for it at the end.</p><p>The &#39;X&#39; and &#39;y&#39; dataframes must contain the same values for the &#39;unique_id&#39; and &#39;ds&#39; columns and be <strong>balanced</strong>, i.e., have no <em>gaps</em> between dates for the frequency.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*cuTV_LN2KGK-XrstFQiZLg.png" /></figure><p>The frequency of computing and reporting this loss can be changed with the freq_of_test hyperparameter.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/f533240bcc1d778e1753952602f6be83/href">https://medium.com/media/f533240bcc1d778e1753952602f6be83/href</a></iframe><pre>model.fit(X_train_df, y_train_df)</pre><pre>Infered frequency: H<br>=============== Training ESRNN  ===============</pre><pre>========= Epoch 0 finished =========<br>Training time: 50.14884<br>Training loss (50 prc): 0.70241<br>========= Epoch 1 finished =========<br>Training time: 51.24384<br>Training loss (50 
prc): 0.59290<br>========= Epoch 2 finished =========<br>Training time: 51.81561<br>Training loss (50 prc): 0.53481<br>========= Epoch 3 finished =========<br>Training time: 52.64761<br>Training loss (50 prc): 0.49683<br>========= Epoch 4 finished =========<br>Training time: 50.96984<br>Training loss (50 prc): 0.46950<br>Train finished!</pre><p>Finally, the predictions are obtained with the predict method. The package also provides a function, evaluate_prediction_owa, to calculate the OWA of the predictions.</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/98b07fc82a2d59a70ffcbe04c8085abc/href">https://medium.com/media/98b07fc82a2d59a70ffcbe04c8085abc/href</a></iframe><pre>===============  Model evaluation  ==============<br>OWA: 0.987 <br>SMAPE: 15.623 <br>MASE: 2.69</pre><p>A function has also been implemented to plot predictions:</p><iframe src="" width="0" height="0" frameborder="0" scrolling="no"><a href="https://medium.com/media/90542343372a81bbacbc1629a66f0dfe/href">https://medium.com/media/90542343372a81bbacbc1629a66f0dfe/href</a></iframe><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*h6CDRcsp6xW-8B-O0ysr1w.png" /></figure><h3>Comparison with M4 winning submission</h3><h4>Naive2 Forecast</h4><p>The Naive2 model is a popular benchmark model for time series forecasting that automatically adapts to the potential seasonality of a series based on an autocorrelation test. If the series is seasonal, the model combines the predictions of Naive and Seasonal Naive; otherwise, it uses the simple Naive forecast. Following M4 competition practice, we report the relative performance of the ESRNN compared to Naive2.</p><h4>Overall Weighted Average</h4><p>To quantify the aggregated errors, we use the Overall Weighted Average (OWA) proposed for the M4 competition. 
This metric is calculated by averaging the symmetric mean absolute percentage error (sMAPE) and the mean absolute scaled error (MASE) over all the time series, dividing each average by the corresponding value for the Naive2 predictions, and taking the mean of the two ratios. Both sMAPE and MASE are scale-independent. These measurements are calculated as follows:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*VsKh6DxzCj402b4mxmkndQ.png" /></figure><p>The following table shows the OWA obtained by our implementation and the original model. The results deviate slightly from the original implementation but are still very competitive on the M4 leaderboard, placing our implementation among the top 5 models. These results were also achieved with a 300x speedup over Smyl’s implementation, since we batch the time series for training and our model can be trained on a GPU.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/382/1*COToK1xM8YHz7mmjXAkr4A.png" /></figure><h3>How to contribute</h3><p>The full code is publicly available on <a href="https://github.com/kdgutier/esrnn_torch">GitHub</a>. To contribute, you can fork this repository and open a PR with your improvements. 
You can also create issues if you have problems running the model.</p><h3>Authors</h3><p>This repository was developed with joint efforts from <a href="https://www.autonlab.org/">AutonLab</a> researchers at Carnegie Mellon University and Orax data scientists.</p><ul><li><a href="https://github.com/kdgutier/">Kin Gutiérrez</a>.</li><li><a href="https://github.com/cristianchallu">Cristian Challú</a>.</li><li><a href="https://github.com/FedericoGarza">Federico Garza</a>.</li><li><a href="https://github.com/mergenthaler">Max Mergenthaler</a>.</li></ul><h3>References</h3><ol><li><a href="https://www.sciencedirect.com/science/article/pii/S0169207019301153">A hybrid method of exponential smoothing and recurrent neural networks for time series forecasting.</a></li><li><a href="https://www.researchgate.net/publication/325901666_The_M4_Competition_Results_findings_conclusion_and_way_forward">The M4 Competition: Results, findings, conclusion and way forward.</a></li><li><a href="https://github.com/M4Competition/M4-methods/tree/master/Dataset">M4 Competition Data.</a></li><li><a href="https://papers.nips.cc/paper/6613-dilated-recurrent-neural-networks.pdf">Dilated Recurrent Neural Networks</a>.</li><li><a href="https://arxiv.org/abs/1701.03360">Residual LSTM: Design of a Deep Recurrent Architecture for Distant Speech Recognition</a>.</li><li><a href="https://arxiv.org/abs/1704.02971">A Dual-Stage Attention-Based recurrent neural network for time series prediction</a>.</li><li><a href="https://medium.com/@equeum/a-brief-history-of-forecasting-356f5fba847b">A Brief History of Forecasting.</a></li><li><a href="https://towardsdatascience.com/n-beats-beating-statistical-models-with-neural-nets-28a4ba4a4de8">N-BEATS. 
Beating Statistical Model with Neural Nets.</a></li><li><a href="https://towardsdatascience.com/why-we-unapologetically-use-deep-learning-in-our-forecasts-2923a5773073">Why We Unapologetically Use Deep Learning in Our Forecasts</a>.</li><li><a href="https://www.sciencedirect.com/science/article/pii/S0169207019301128">The M4 Competition: 100,000 time series and 61 forecasting methods</a>.</li><li><a href="https://forecasters.org/resources/time-series-data/m4-competition/">M4-Competition — International Institute of Forecasters</a>.</li><li><a href="https://robjhyndman.com/hyndsight/m4comp/">M4 Forecasting Competition — Rob Hyndman.</a></li></ol><hr><p><a href="https://medium.com/analytics-vidhya/forecasting-in-python-with-esrnn-model-75f7fae1d242">Forecasting in Python with ESRNN model</a> was originally published in <a href="https://medium.com/analytics-vidhya">Analytics Vidhya</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>