In recent years Big Data and Deep Learning AI systems have become ever more popular, but are they really the solution that problem owners need? A recent project put things into perspective for us: it seems that human data science expertise can still outrank raw computing power.
Our client Westland Infra Netbeheer had the following problem. They are a DSO (Distribution System Operator) that delivers electricity (and natural gas), mostly to agricultural companies. In contrast to private individuals, high-volume business customers have a contracted maximum power that they may consume (per 15-minute interval), even though their physical connection would allow higher consumption. These individual caps protect the supplier’s infrastructure from a demand overload when all users draw a large amount of electricity at the same time. The flip side is that the caps also lower the total amount of energy Westland Infra Netbeheer could deliver if their customers were to spread out their power consumption. To provide their customers with a better service, they have the ambition to smarten up their electricity grid.
In this project they want to relax the individual power caps under certain conditions, typically during the hours before and after the early-morning demand peak. In this way Westland Infra Netbeheer expect to persuade their industrial customers to spread out their demand, which would mitigate the need to upgrade their network, an expensive operation. At the same time it allows a customer to occasionally exceed their contracted maximum power demand without additional costs, so it’s a win-win situation.
In order to decide whether a (randomly selected) group of customers is given the “green light” for extra power consumption, Westland Infra Netbeheer have to predict when the demand peak will occur. Based on this prediction, the exact time slots for the green-light assignment are chosen. They need a forecast horizon of roughly 36 hours.
Westland Infra Netbeheer first asked us to analyse the historical power consumption data: a time series spanning the past seven years, consisting of the aggregated power consumption (in MWh) per 15-minute interval. Not exactly what you would call Big Data, but qualitatively a nice data set.
First, a general note on time series prediction. When forecasting a time series there are two important aspects. On the one hand there are global patterns that are repetitive and predictable; when the period of a cyclic pattern is constant it is referred to as “seasonal”, as opposed to e.g. business cycles. On the other hand there are “local” influences that determine the short-term course of the series, like the temperature or the values of the recent past. This dependence on the recent past is called autoregression. Making a reliable forecast is all about modelling both the seasonal and the local aspects properly.
When we looked into the data set it quickly became apparent that this data had three cyclic components.
First of all, there was the yearly cycle. During the winter more power was consumed compared to the summer. This makes sense because many agricultural companies in the area use it to heat their greenhouses.
Secondly, the weekly cycle showed a lower demand during weekends, but also noticeable differences within the week, e.g. between a typical Monday and a typical Tuesday. The magnitude of this cyclic component, however, is the smallest of the three.
Thirdly, electricity usage obviously isn’t constant over the day either. During the night there is a constant demand, with a peak at around 6 am. During the day the net demand is typically negative, because customers feed electricity back into the grid from solar panels or Combined Heat & Power (CHP) installations.
These three repetitive patterns could explain a big part of the demand curve. From a signal processing point of view it now seems straightforward: just filter out these three “frequencies” and what you end up with is the local influence. However, these patterns turned out to depend on each other: a daily pattern in January looks very different from one in July. This is important because standard forecasting methods assume a single, constant seasonal component, whereas we had three seasonal components that depend on each other. So we had to improvise.
As a simple forecasting method we chose to first “manually” remove the seasonal components, then train a standard autoregression model (ARIMA) on the local effects, then predict the next few hours, and finally add the seasonal components back.
In order to capture the changing daily pattern we calculated the average day pattern per month, so that we could simply subtract the average day pattern of the corresponding month. Though a bit rough, it proved to be reasonably accurate and easy to implement.
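As a minimal sketch of this deseasonalising step (on synthetic data; the column and variable names are illustrative, not the project’s actual code), the month-dependent day pattern can be computed with a single groupby in pandas:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real series: one value per 15-minute interval,
# with a daily cycle whose shape depends on the month, plus noise.
idx = pd.date_range("2020-01-01", "2021-12-31 23:45", freq="15min")
rng = np.random.default_rng(0)
demand = pd.Series(
    np.sin(2 * np.pi * idx.hour / 24)
    * (1 + 0.5 * np.cos(2 * np.pi * idx.month / 12))
    + 0.1 * rng.standard_normal(len(idx)),
    index=idx,
)

# Average day pattern per (month, time-of-day) combination,
# broadcast back to the original timestamps ...
pattern = demand.groupby([idx.month, idx.time]).transform("mean")

# ... and subtracted, leaving the "local" residual for the autoregression step.
residual = demand - pattern
```

After the autoregressive forecast on `residual`, the same `pattern` values for the forecast timestamps would be added back to obtain the final prediction.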
The autoregression step expresses the next value of the series as a linear combination of the previous ones. During the training phase the optimal weights are estimated by regression; in the prediction phase the future values are calculated iteratively from these weights.
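This idea can be sketched with a plain AR(p) fit by ordinary least squares, a simplified stand-in for the ARIMA model actually used (the function names and the synthetic data are ours, purely for illustration):

```python
import numpy as np

def fit_ar(series, p):
    """Estimate AR(p) weights w so that x[t] ~ sum_k w[k] * x[t-1-k]."""
    X = np.column_stack(
        [series[p - k - 1 : len(series) - k - 1] for k in range(p)]
    )
    y = series[p:]
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights

def forecast(series, weights, steps):
    """Iteratively roll the model forward: each prediction feeds the next."""
    history = list(series)
    p = len(weights)
    out = []
    for _ in range(steps):
        lags = history[-1 : -p - 1 : -1]  # most recent p values, newest first
        nxt = float(np.dot(weights, lags))
        out.append(nxt)
        history.append(nxt)
    return out

# Demo on synthetic AR(2) data: x[t] = 0.6*x[t-1] + 0.3*x[t-2] + noise.
rng = np.random.default_rng(1)
x = np.zeros(500)
for t in range(2, 500):
    x[t] = 0.6 * x[t - 1] + 0.3 * x[t - 2] + 0.01 * rng.standard_normal()

w = fit_ar(x, p=2)  # recovers weights roughly near (0.6, 0.3)
preds = forecast(x, w, steps=4)
```

Note how the iterative forecast reuses its own predictions as inputs, which is why forecast errors compound over a 36-hour horizon.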
We ended up with a simple model that performed well and took relatively little time to implement. During development we could easily navigate our way through the maze of parameter choices because the model was so intuitive. But above all it was transparent for the client.
The alternative of course was some kind of Deep Learning black box. Would it have outperformed our handcrafted model? Possibly. But then we would still have spent lots of hours experimentally tuning the model architecture and its parameters. And from the outside “the AI” is always an oracle that never answers the questions “why?” and “how?”.
So yes, we have learned that a simple methodology can have benefits over a more sophisticated one. Especially if the “simple” way is exactly as powerful as you need it to be.