Why does it always rain on me?
Does it pay to rely on the weather forecast in the digital age?
We’ve all been there, haven’t we? You have planned a trip out with the family and you’re busy checking the weather forecast in the days preceding. Should you leave the house once the rain has passed, or head out early before it arrives? Hour-by-hour forecasts help decision making, but they’re subject to change. When did a simple day out get so difficult to plan?
Of course, it never used to be like that. To those of us of a certain age, the weather forecast used to be simple. You’d catch the breakfast news, or perhaps the lunchtime or evening forecast, and make plans from there. The forecast was as accurate as the broad sweeps of the forecaster’s arm, and the vague positions of the symbols on the map meant nothing could be taken for granted. Yes, you’d perhaps get wet, but that was beyond your, or the forecast’s, control.
In 2017, weather forecasts are very different. The “Google” age of information on demand means consumers expect more, and this has driven forecast providers to offer increasingly detailed forecasts. With tens, if not hundreds, of weather websites to choose from, not to mention the various watch and mobile phone apps, hour-by-hour forecasts are the norm. In fact, the Met Office boasts:
The Met Office’s four day forecast is now as accurate as our one day forecast was 30 years ago.
In our respective families this accuracy certainly proves useful. As keen campers, Chris and family will often find themselves checking the forecast hourly in the run-up to a Sunday morning, trying to maximise the time on the campsite without risking a sudden shower soaking the tent (and the subsequent drying in the garden at home). Rob and family are aficionados of the city break, and as such are often at the mercy of the weather as they explore, so planning lunch is essential.
It was undoubtedly off the back of a particularly bad weekend for weather prediction that we first discussed how often the forecast would change as we attempted to plot our respective weekends. Plotting a course through any weekend seems to be a constantly moving task as the weather forecast dictates the plan, shifting things here and there, much to our partners’ annoyance. Does it really change that much? With no history on the website, it’s hard to tell. Perhaps we just felt it changed a lot… As two self-confessed data nerds, we decided to scrape some data and find out.
The BBC website was a common source of weather information for both of us, so we decided to use it as the source for our data collection. We scraped data from the BBC Weather website throughout November and the early part of December, recording the forecast for several places across the UK every hour. Armed with the data, we are now in a position to check our instinct: did the weather forecast change as much as we thought?
Let’s look at a sample day for Rob’s local weather forecast, where we can count up the changes in the forecast as they happen.
The forecast moves from 3-hourly to hourly slots over the course of the 4 days. Some hours see up to 30 changes, which means about 30% of the forecasts issued for that hour differ from the one before. The forecast flips back and forth between dark cloud, light cloud and sunny spells, unable to make up its mind. This might not ruin a day out, but for poor Rob and family trying to predict the weather it doesn’t make for a great experience (and that’s without throwing the accuracy of the final forecast, or the differences between forecasts from multiple providers, into the mix).
How do these changes map out more generally?
Using a 5-day moving average to smooth out any irregular hours or days in the forecast, we can see that over the 6 weeks we collected data, on average, around 20% of the hourly forecasts were changes (a change is recorded when a given forecast for an hour differs from the preceding forecast for that hour).
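For readers who prefer code, here is a minimal Python sketch of that calculation. It is not how we produced the chart; the function names are made up for illustration, and it assumes the share is measured against the number of comparisons (so the very first forecast, which has nothing before it, is not counted) and that the moving average is a trailing one.

```python
def change_share(forecasts):
    """Share of forecasts that differ from the forecast immediately before.

    `forecasts` is the sequence of forecasts issued for one target hour,
    in the order they were scraped. Assumption: the first forecast has no
    predecessor, so the denominator is the number of comparisons.
    """
    if len(forecasts) < 2:
        return 0.0
    pairs = zip(forecasts, forecasts[1:])
    changes = sum(1 for previous, current in pairs if current != previous)
    return changes / (len(forecasts) - 1)


def moving_average(values, window=5):
    """Trailing moving average; early points average over what is available."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Illustrative daily change shares, not our real data.
daily_shares = [0.1, 0.3, 0.2, 0.4, 0.0, 0.5]
smoothed = moving_average(daily_shares, window=5)
```

Feeding one day’s change share per location into `moving_average` gives the kind of smoothed line described above.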
We can see the changes across all the locations we collected data for below.
Manchester lived up to its reputation for being unpredictable at the end of November, with Cardiff offering forecasters the chance to avoid too many changes shortly afterwards. Unsurprisingly perhaps, given their proximity, neither of Chris and Rob’s respective locations in Kimberley, Nottinghamshire and Oadby, Leicestershire gave them the satisfaction of being the harder to predict (see below).
What can we learn from this brief but detailed look at the forecasting data? With around 20% change over the lifetime of a given forecast, it’s safe to assume that forecasting remains a difficult task. Our data points focused on autumn and winter, which are presumably more difficult to judge. Even so, the delivery of old forecasts, with their maps and symbols, helped convey the uncertain nature of forecasting because it didn’t deal in specifics. Forecasters need to find a way of delivering modern forecasts without hour-by-hour specifics that imply a promise of accuracy they can’t deliver on. There are certainly providers taking the opportunity to grab an audience based on the uncertainty of forecasts, e.g. Climendo. Until more do, we’ll be taking the hour-by-hour forecasts with a dash of healthy scepticism.
There’s no need to read past here unless you’re interested in the techniques we used to scrape and analyse the data.
The data scraping was performed in Alteryx, using the Download tool to request the URL for each location and download the HTML. Regex was then used to retrieve each forecast from the page.
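To give a flavour of the regex step outside Alteryx, here is a rough Python sketch. The HTML snippet, the `data-hour` attribute and the pattern are all hypothetical stand-ins; the real BBC Weather markup is different and changes over time, so treat this as an illustration of the approach rather than the expression we used.

```python
import re

# Hypothetical markup for illustration only; not the real BBC Weather HTML.
SAMPLE_HTML = """
<li class="slot" data-hour="09"><img alt="Light Cloud"></li>
<li class="slot" data-hour="10"><img alt="Sunny Intervals"></li>
"""

# Capture the target hour and the weather symbol description for each slot.
FORECAST_RE = re.compile(r'data-hour="(\d{2})"[^>]*>\s*<img[^>]*alt="([^"]+)"')


def extract_forecasts(html):
    """Return a list of (hour, description) tuples found in the page."""
    return [(int(hour), description)
            for hour, description in FORECAST_RE.findall(html)]


forecasts = extract_forecasts(SAMPLE_HTML)
```

Regex is brittle against markup changes, which is fine for a short-lived scrape like ours; a proper HTML parser would be the more robust choice for anything longer running.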
The data was moved into Tableau for exploration, where we quickly realised it was incredibly difficult to visualise the changes in the forecast.
A quick calculation to compare the Image to the previous one allowed us to visualise just the changes.
As we explored the data and researched a theme for the article, the amount of change quickly became apparent, but quantifying it was difficult in Tableau. We therefore parsed the data in Alteryx to compare each forecast with the previous one and track the total number of changes for each hour.
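The same comparison can be sketched in a few lines of Python. This is an equivalent of the idea, not the Alteryx workflow itself; the record layout (scrape time, target hour, forecast) and the function name are assumptions made for the example.

```python
def count_changes(records):
    """Count forecast changes per target hour.

    `records` is an iterable of (scraped_at, target_hour, forecast) tuples.
    Assumption: `scraped_at` values sort chronologically (e.g. ISO strings).
    A change is counted when a forecast differs from the previous forecast
    scraped for the same target hour.
    """
    last_seen = {}
    changes = {}
    for scraped_at, target_hour, forecast in sorted(records):
        changes.setdefault(target_hour, 0)
        previous = last_seen.get(target_hour)
        if previous is not None and forecast != previous:
            changes[target_hour] += 1
        last_seen[target_hour] = forecast
    return changes


# Illustrative records, not our real data.
records = [
    ("2017-11-20T08", "Sat 12:00", "Sunny Spells"),
    ("2017-11-20T09", "Sat 12:00", "Light Cloud"),
    ("2017-11-20T10", "Sat 12:00", "Light Cloud"),
    ("2017-11-20T11", "Sat 12:00", "Sunny Spells"),
    ("2017-11-20T08", "Sat 13:00", "Dark Cloud"),
    ("2017-11-20T09", "Sat 13:00", "Dark Cloud"),
]
totals = count_changes(records)
```

Here the midday slot flips twice while the 13:00 slot never moves, which is exactly the kind of per-hour total we charted.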
This methodology included an attempt to categorise the amount of change, and whether changes were good or bad, by rating the weather on a scale of 1–5. However, in the end the analysis we wanted to run and report was deliberately simple, so we didn’t use this rating.
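For the curious, the rating idea looks something like the sketch below. The scores assigned to each weather type are invented for this example (we have not reproduced our actual scale here), as are the function name and labels.

```python
# Hypothetical 1-5 scale: 1 is the worst weather for a day out, 5 the best.
RATING = {
    "Heavy Rain": 1,
    "Light Rain": 2,
    "Dark Cloud": 3,
    "Light Cloud": 4,
    "Sunny Spells": 5,
}


def classify_change(previous, current):
    """Return (direction, size) for a move between two forecasts."""
    delta = RATING[current] - RATING[previous]
    if delta == 0:
        return ("same", 0)
    return ("better" if delta > 0 else "worse", abs(delta))


example = classify_change("Light Cloud", "Heavy Rain")
```

A scale like this is subjective (is dark cloud really better than light rain?), which is part of why we kept the final analysis to a simple changed-or-not count.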
We have shared the data, workflows and Tableau analysis in a Google Drive folder here.