Releasing Faraday Alpha V2
Happy 2023! We’ve been hard at work at Centre for Net Zero working on adding new features to the Faraday Alpha tool, and we’re proud to announce the release of Faraday Alpha V2! In this blog post we’ll introduce the new features of our second alpha release, some challenges we faced and how we’ve tackled them.
Faraday Alpha V2 New Features
In our first release last August, users could only specify Low Carbon Technologies, Season and Day of Week. In this new release, we’ve added new properties/ attributes that users can query, e.g. property type, number of rooms, or EPC rating.
New inputs now available:
- Property Type: Flat/Houses. For houses, there are five other sub-categories: Terraced, Semi-Detached, Detached, Bungalows and “House” for other house types.
- EPC Ratings: These are clustered in three different categories “A/B/C”, “D/E” and “F/G”.
- Urbanity: In this category, users can choose between “Remote” or “Urban”.
- Number of habitable rooms: These are grouped into “2” or “3 plus”. Habitable rooms include any living room/ sitting room/ bedrooms etc (See the EPC guidance document for full definition of “habitable rooms”. In our dataset, all households with LCT also have 2 or more habitable rooms and hence, only “2” or “3 plus” options are available.
- Mains Gas: If a property has mains gas connection
- LCT Types: Types of LCT now available are “Heat Pumps”, “Solar PVs”, “Electric Radiators”, “Electric Hot Water”, and “Electric Vehicles”. For “EVs”, you can also specify 1, or 2 or more EVs.
- Dates: You can now specify specific months of the year.
- (Optional): Optionally you can also enter the Ofgem price cap for “average households” to see the impact of price cap on consumption. The default (if not provided) is the latest Energy Price Guarantee cap of £2,500. More on this later.
First Iteration of the Faraday Alpha V2 Model
Faraday Alpha V2 is an XGBoost model that predicts the consumption based on user-selected inputs. We’ve chosen the XGBoost algorithm to model household consumption because it is one of the best performing algorithms for tabular datasets as demonstrated in the paper Tabular Data: Deep Learning Is Not All You Need. Furthermore, it can easily model nonlinear interactions between variables without much feature engineering.
The model was trained on:
- 2021 data and evaluated on 2022 consumption
- Trained on household level, and evaluated on 3 hierarchical levels: household, daily-settlement and daily.
- On daily-settlement level model achieved Mean Absolute Percentage Error (MAPE) of ~ 18%
How did the model perform (aggregated daily)
Aggregating forecasts to the daily level allows us to ascertain the performance of the model on a ‘global’ level and see how well it did in capturing the overall trend. The plot above shows the Actual vs Predicted consumption over the training period (2021) and evaluation period (2022) and shows that:
- The model performed well on a daily level during the training period, barring some anomaly likely driven by weather and holiday. We currently don’t include weather/ holiday data; adding these data could help to account for these anomalies.
- Model performed reasonably well over out-of-time periods (2022) until the end of March 2022 where the model then started to over-predicting consumption consistently.
Comparing 2021 vs 2022 Actuals
To understand what’s causing the systemic bias after March 2022, we compared the actuals of 2021 vs 2022 and found that the consumption pattern has changed:
- Load reduction: On average, households have reduced their consumption by about 10% in 2022
- Load shift: On average, households have also shifted more of their consumption to the overnight peak periods, and consumed slightly less electricity throughout the rest of the day.
We believe this was likely caused by macroeconomic factors (source 1,2) — as there was a significant hike in energy prices in April 2022, which is something our model needs to factor in.
Recall that our dataset (see our first blog post here) contains the energy behaviours of customers with low carbon technologies on our smart tariffs, so we often see load shifting to periods where energy is cheaper for those customers, such as the overnight period.
Adjusting for macroeconomic factors
Whilst XGBoost is great in many aspects, it has one limitation: it cannot extrapolate beyond seen values. Trees can only split based on the value it has seen before. For example, if during the training it has only seen prices from £1-£5, and when in real-life prices have increased to £100, a tree-based model would treat £100 the same as it would £5. Given the current economic climate, there is no guarantee that the future price cap will be in a region that the model has seen before. Therefore adding price as an input to our model is not a good solution.
Stacked XGBoost with Linear Regression Price model
To account for the changing macroeconomic conditions, we stacked a linear regression model on top of our XGBoost model to take into account price. The benefit of using linear regression is that it can extrapolate beyond seen values as it attempts to model the functional form of the relationship between Price and Consumption. However, it has one major draw-back: it won’t be able to capture nonlinear relationships as successfully as tree-based methods do.
To adjust for price, we:
- Generate household-level predictions using XGBoost and aggregate them to the daily-settlement level
- Fit linear regression model on daily-settlement level to model Price ~ Consumption
- Apply the linear regression model on top of XGBoost prediction to get final price-adjusted household-level consumption
Second Iteration of the V2 Model
Stacking a linear price model on top of XGBoost has decreased our Mean Absolute Percentage Error on daily-settlement level from 18% to 12%. We also extended our training period to 30th April as that’s when the price hike occurred. The plot above shows the actuals, before and after price-adjustment predictions on the daily level and daily-settlement level.
The linear price model works by shifting the consumption down for every settlement period as seen in the example of one specific household. However, it failed to increase the prediction for the morning peak periods (settlement period 0–8). This is because price affected consumption nonlinearly — it increased consumption over some settlement period whilst consumption decreased over other periods. As expected, our linear model could only capture the overall reduction (linear effect), but failed to capture the nonlinear relationship between price and consumption.
There are many ways we could fix this though, possibly in future releases, such as:
- Fit two separate models for Price X Consumption: one for settlement periods 0–8, another for periods 9–48.
- Add class weights to improve predictions for settlement periods 0–8 or resample our training set to contain more rows from the overnight period (though this might reduce performance over other settlement periods).
- Move beyond XGBoost/ Linear Predictions e.g. Neural Networks that could both model nonlinear relationships and extrapolate beyond seen values.
However, given the level of noise currently in energy markets (such as the energy price guarantee) and the fact that we’re already achieving 12% MAPE on a daily-settlement level, we don’t think it’s worth integrating these fixes into Faraday whilst it’s still in its Alpha stage.
Note that we are not seeking to model the causal relationship between price and consumption. To do so would be challenging, particularly when we don’t have the counterfactuals. However, it gives us the means to account for (and possibly extrapolate) how price would potentially affect consumption.
The V2 model is now live!
Have a go and play with our Alpha tool and please provide feedback. Note that the price slider tool is experimental as we’ve only had 1 price hike event so far to model. We’ll continually improve the model as we collect more data on how households react to the changing economic climate.
Let us know if there are any other inputs to the model that you would like to see in the following iteration. In our next blog post, we’ll share our plans for Faraday in 2023 — so keep your eyes peeled!
Interested in working at Centre for Net Zero?
We plan to expand our team this year. If you like the sound of what we’re getting up to at the Centre, please keep a lookout for new roles on our jobs site and follow us on LinkedIn and Twitter. You can also email us at info@centrefornetzero.org.