Sample size and time series models — A case study on ARIMA() processes.

Konrad Hoppe
Apr 24, 2021 · 6 min read

Goals and contents

ARIMA time series models are often taught in econometrics courses as part of the regular business science curriculum and are therefore sometimes put to use by inexperienced data scientists.

The intention of this case study is to understand the data-generating process behind simple MA(1) models and to illustrate the weakness of the estimators at small sample sizes.


For the tested MA(1) model with coefficient β=0.3, a time series of at least 5000 observations is necessary to obtain a tight confidence interval around the coefficient.

The impact on the goodness of forecasts is evaluated and depends critically on the estimated coefficient.

The case

Install some libraries first:

Now generate some data from the process. This is better than using real-world data, since in this laboratory setting we know what we are aiming at: 50 observations from a simple MA(1) model with coefficient β=0.3:
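A minimal sketch of this data-generating step, written in Python with only the standard library. The language, the function name `simulate_ma1`, the unit shock variance and the seed are assumptions for illustration, not taken from the original code:

```python
import random

def simulate_ma1(n, beta, sigma=1.0, seed=None):
    """Draw n observations from X_t = eps_t + beta * eps_{t-1}."""
    rng = random.Random(seed)
    # n + 1 Gaussian shocks, so that the first observation also has a predecessor.
    eps = [rng.gauss(0.0, sigma) for _ in range(n + 1)]
    return [eps[t] + beta * eps[t - 1] for t in range(1, n + 1)]

# Laboratory setting from the text: 50 observations, beta = 0.3.
x = simulate_ma1(50, 0.3, seed=42)
print(len(x), x[:3])
```

Because the shocks are generated explicitly, the "true" coefficient is known and any estimate can be compared against it.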

Now let's do the fit and see whether the coefficient that was used to generate the data is recovered:
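One way to sketch such a fit without any external package is a conditional-sum-of-squares (CSS) grid search. This is an assumption for illustration: the estimation routine actually used for this article is not shown, and library routines typically rely on maximum likelihood instead. The names `simulate_ma1`, `css` and `fit_ma1` are hypothetical.

```python
import random

def simulate_ma1(n, beta, seed=None):
    """Draw n observations from X_t = eps_t + beta * eps_{t-1} (unit-variance shocks)."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [eps[t] + beta * eps[t - 1] for t in range(1, n + 1)]

def css(x, beta):
    """Sum of squared residuals from eps_t = x_t - beta * eps_{t-1}, started at eps_0 = 0."""
    prev, total = 0.0, 0.0
    for xt in x:
        prev = xt - beta * prev
        total += prev * prev
    return total

def fit_ma1(x, step=0.01):
    """Return the beta on a grid inside (-1, 1) that minimises the CSS criterion."""
    count = int(round(1.98 / step)) + 1
    candidates = [-0.99 + i * step for i in range(count)]
    return min(candidates, key=lambda b: css(x, b))

x = simulate_ma1(50, 0.3, seed=42)
print("estimated beta:", round(fit_ma1(x), 2))
```

The grid is restricted to the invertible region |β| < 1, where the residual recursion is stable.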

The output of the coefficient test is:

and the confidence interval:

The confidence interval is rather wide after 50 observations and the estimate is slightly off. Let us investigate how fast this improves: how long does the series need to be to wrap the confidence interval tightly around the input parameter β=0.3?
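A rough way to anticipate the answer is the standard large-sample approximation for the MA(1) coefficient estimator, whose variance is about (1 − β²)/n. The sketch below prints the approximate half-width of a 95% confidence interval for several series lengths. The function name is hypothetical, and 1.96 is the usual normal quantile; for short series the approximation is only indicative.

```python
import math

def ma1_ci_halfwidth(beta, n, z=1.96):
    """Approximate half-width of the confidence interval, using Var(beta_hat) ~ (1 - beta^2) / n."""
    return z * math.sqrt((1.0 - beta ** 2) / n)

for n in (50, 500, 5000, 50000):
    print(n, round(ma1_ci_halfwidth(0.3, n), 3))
```

Under this approximation the half-width shrinks with the square root of n, so an interval that is ten times tighter requires a series that is a hundred times longer.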

Plotting the development of the estimator for different time series lengths shows:

Illustration of estimator development for different time series lengths

Curiously, the “true” value is not reached even for a large number of observations. Let’s see whether this is spurious or whether there is an estimation bias, by averaging over a number of estimates with the same number of observations:

with output:

So, as expected, the estimator is unbiased when averaged over a large number of repetitions.
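The averaging experiment can be sketched as a small Monte Carlo study. The series length, the number of repetitions, the seed and the CSS grid-search estimator below are all assumptions for illustration; they are not the settings or the routine used for the output above.

```python
import random
import statistics

def simulate_ma1(n, beta, rng):
    """Draw n observations from X_t = eps_t + beta * eps_{t-1} using the given generator."""
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [eps[t] + beta * eps[t - 1] for t in range(1, n + 1)]

def css(x, beta):
    """Sum of squared residuals from eps_t = x_t - beta * eps_{t-1}, started at eps_0 = 0."""
    prev, total = 0.0, 0.0
    for xt in x:
        prev = xt - beta * prev
        total += prev * prev
    return total

# Candidate coefficients from -0.98 to 0.98 in steps of 0.02.
GRID = [i / 50 for i in range(-49, 50)]

def fit_ma1(x):
    """Grid-search estimate of beta that minimises the CSS criterion."""
    return min(GRID, key=lambda b: css(x, b))

def average_estimate(n, beta, repetitions, seed=0):
    """Mean and standard deviation of the estimates over independent simulated series."""
    rng = random.Random(seed)
    estimates = [fit_ma1(simulate_ma1(n, beta, rng)) for _ in range(repetitions)]
    return statistics.mean(estimates), statistics.stdev(estimates)

mean_hat, sd_hat = average_estimate(n=200, beta=0.3, repetitions=100)
print(round(mean_hat, 3), round(sd_hat, 3))
```

The mean of the estimates speaks to bias, while their standard deviation shows how far a single estimate can still be from β=0.3.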


Does it matter that the number of observations needs to be large to get sufficiently tight confidence intervals? What is the difference between a coefficient of β=0.3 and β=0.35, since it refers to unobserved random shocks in the past anyway?

Let us investigate the forecasting ability under these circumstances. Generate a time series of 50,000 observations, fit an MA(1) process to it and create a 5-step forecast:


Excursus: Forecasting MA models

It’s curious that the forecast of the above model breaks off after one period. Let us quickly investigate what is going on here.

The MA(1) model in its simplest form as we are using it, is given by

Thus, the one period forecast is given by

All forecasts are conditional on the past

The basic assumption of the MA(p) process is that the error terms are i.i.d. with


So the remaining term in the above equation,

can be calculated recursively. To do so, rewrite the process description of X_t using lag-operator notation:

and thus

Notice now that the series expansion 1/(1−x) = 1 + x + x^2 + x^3 + ⋯ can be transformed by replacing x with −x to

thus, we can calculate ϵ_t | I_t:

Since we know the series {X_t}, we can also calculate the unobserved series {ϵ_t}. Notice further that an MA(1) process can only be forecast one step ahead, since

This is because the random error terms are independently distributed.
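The displayed equations of this excursus can be reconstructed as follows. This is a sketch that assumes the standard zero-mean MA(1) parameterization, a shock variance written as σ², the lag operator L defined by L ϵ_t = ϵ_{t−1}, and the invertibility condition |β| < 1; none of these symbols beyond X_t, ϵ_t, I_t and β are stated explicitly in the text.

```latex
\begin{align*}
X_t &= \epsilon_t + \beta\,\epsilon_{t-1},
  \qquad \epsilon_t \ \text{i.i.d.},\quad
  \mathbb{E}[\epsilon_t] = 0,\quad \operatorname{Var}(\epsilon_t) = \sigma^2 \\[4pt]
\mathbb{E}[X_{t+1} \mid I_t]
  &= \underbrace{\mathbb{E}[\epsilon_{t+1} \mid I_t]}_{=\,0}
     + \beta\,\mathbb{E}[\epsilon_t \mid I_t]
   = \beta\,\mathbb{E}[\epsilon_t \mid I_t] \\[4pt]
X_t &= (1 + \beta L)\,\epsilon_t
  \quad\Longrightarrow\quad
  \epsilon_t = \frac{1}{1 + \beta L}\,X_t \\[4pt]
\frac{1}{1 + x} &= 1 - x + x^2 - x^3 + \cdots \qquad (|x| < 1) \\[4pt]
\mathbb{E}[\epsilon_t \mid I_t]
  &= \sum_{j=0}^{\infty} (-\beta)^j X_{t-j}
   = X_t - \beta X_{t-1} + \beta^2 X_{t-2} - \cdots \\[4pt]
\mathbb{E}[X_{t+2} \mid I_t]
  &= \mathbb{E}[\epsilon_{t+2} \mid I_t] + \beta\,\mathbb{E}[\epsilon_{t+1} \mid I_t] = 0
\end{align*}
```

The last line shows why the forecast breaks off: both shocks entering X_{t+2} lie in the future and have conditional mean zero.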

Let us come back to the starting point. The reason we went down this excursus was to understand whether the predictor quality is actually an issue in forecasting MA(1) models. In our concrete case, the difference between β=0.3 and β=0.35 can be calculated:

The result is 0.32, so the forecast deviates by ∼32% (RMSE) when β is estimated slightly differently.
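The following sketch shows one possible way to set up such a comparison. The metric used here (RMSE between the two one-step forecast paths, divided by the root mean square of the β=0.3 path), the series length, the seed and all function names are assumptions; the calculation behind the figure above is not shown, so this sketch is not expected to reproduce that number.

```python
import math
import random

def simulate_ma1(n, beta, seed=None):
    """Draw n observations from X_t = eps_t + beta * eps_{t-1} (unit-variance shocks)."""
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [eps[t] + beta * eps[t - 1] for t in range(1, n + 1)]

def recover_shocks(x, beta):
    """Recursively compute eps_t = x_t - beta * eps_{t-1}, started at eps_0 = 0."""
    shocks, prev = [], 0.0
    for xt in x:
        prev = xt - beta * prev
        shocks.append(prev)
    return shocks

def forecast_ma1(x, beta, steps):
    """Forecasts for horizons 1..steps: beta * eps_t for the first step, zero afterwards."""
    last_shock = recover_shocks(x, beta)[-1]
    return [beta * last_shock] + [0.0] * (steps - 1)

def relative_rmse(x, beta_a, beta_b):
    """RMSE between the one-step forecast paths of two coefficients, scaled by the RMS of path a."""
    path_a = [beta_a * e for e in recover_shocks(x, beta_a)]
    path_b = [beta_b * e for e in recover_shocks(x, beta_b)]
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(path_a, path_b)) / len(path_a))
    scale = math.sqrt(sum(a * a for a in path_a) / len(path_a))
    return rmse / scale

x = simulate_ma1(5000, 0.3, seed=7)
print(forecast_ma1(x, 0.3, steps=5))   # only the first entry is non-zero
print(relative_rmse(x, 0.30, 0.35))
```

The forecast list makes the one-step horizon visible directly, and the relative RMSE quantifies how much a small shift in the coefficient moves the only non-trivial forecast.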


  • MA(p) models can only be forecasted for p periods.
  • Although only one parameter is estimated, these models need sufficient sample sizes for narrow confidence intervals.
  • It is important to review confidence interval widths and investigate the impact on forecast quality.
  • For classical operations research tasks, there are better fitting models: often volume curves can be predicted better by investigating the underlying drivers of the curve, rather than following a purely univariate time series approach.

Nerd For Tech