ENSO and Neural Nets: Take 1

My interest in neural nets has grown exponentially over the past few years. Although there’s been a ton of R&D dedicated to dynamical weather and climate models over the past couple of decades, there hasn’t been nearly as much attention paid to modern machine learning techniques. Sure, there are a couple of papers here and there about using machine learning to make predictions, but there are way more about dynamical model forecasts. Why is that?

Machine learning has proven very good at many sorts of predictions. It can pick your face out from a picture, suggest a movie to watch next, have a conversation with you, and so on. These are all popular applications; chances are you use them every day when you interact with your phone or browse the list of recommended products on Amazon. You’ll notice I didn’t mention weather forecasting.

Atmospheric scientists have a huge advantage over the applications that I mentioned above — we have equations of motion that allow us to predict future states of the atmosphere. The equations of motion form the foundation of every weather and climate model. You put in the current state of the atmosphere, start your model, and voila! you get a prediction. You can’t do that with TV show recommendations — there is no system of equations that can tell you what to watch if you loved The Americans and are really sad that it’s over.

Modern numerical weather prediction suffers from a few problems, especially at long forecast times. First, weather models depend on our ability to measure the current state of the atmosphere. If we have bad measurements then we’ll have bad forecasts — which is a simple way of framing Chaos Theory. The truth is that it’s pretty difficult to get perfect measurements of the atmosphere every six or twelve hours. There’s a lot of atmosphere out there and we don’t have nearly enough observing stations (think of the oceans or all the atmosphere above the ground).

The second big issue is that our models are only approximations of how the real atmosphere works. There’s no closed-form solution to those equations of motion, and we don’t have equations governing every single process in the atmosphere, so we make a bunch of assumptions and use simple statistical models to fill in the gaps. Those assumptions introduce errors, and those errors grow with time.

These problems limit our ability to make accurate forecasts, especially at monthly and seasonal time scales. The field of atmospheric science has pushed solutions like improved data assimilation (which mainly attacks the Chaos problem) and higher-resolution models. These are all great, but we still can’t make particularly good long-range forecasts. Most forecast sites show a 7-day forecast for a reason; week 2 is tough, week 3 is even tougher, and by week 4 you might as well just roll some dice and call it a day.


So I’d say it’s about time for atmospheric scientists to get serious about machine learning. We’ve hit a wall with dynamical models and it might be quite some time before we break through it. Machine learning has a lot of untapped potential, which I’m reminded of every time I’m creepily auto-tagged in a picture. To make a long story short, I’ve been playing with neural nets over the past year and I think everyone else in the field should too. There’s tremendous potential and the only way for us to realize it is if we get a real group effort. Here’s my feeble attempt.


I’ve been teaching myself about machine learning for the past year or so and I wanted a simple project to experiment with… something that I could run on my home gaming rig, which isn’t particularly powerful (3.4 GHz i5/16 GB RAM/GTX 1080 Ti). A simple ENSO prediction model seemed like a good idea.

This project is very simple: I’ve made a recurrent neural network (an LSTM in this case) to predict the SST 3.4 time series. Even simpler, there are no external predictors: the model uses the previous 6 months of the SST 3.4 series itself to predict the upcoming 12 months. Simple as that. It’s basically a souped-up autoregressive (AR-12) model.
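For the curious, here’s a minimal sketch of that kind of architecture. I’m showing it in Keras, and the layer size, optimizer, and loss below are illustrative choices rather than a faithful copy of my exact setup:

```python
# Minimal LSTM that maps 6 months of the Nino 3.4 index to a 12-month forecast.
# The 32-unit layer, optimizer, and loss are illustrative, not my exact settings.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

N_IN, N_OUT = 6, 12  # 6 months of input, 12 months of output

model = Sequential([
    LSTM(32, input_shape=(N_IN, 1)),  # one feature per month: the index itself
    Dense(N_OUT),                     # one output node per forecast month
])
model.compile(optimizer="adam", loss="mse")
```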

I used SST 3.4 monthly data that goes from January 1982 through the present. I trained the model on the first 200 months and then verified on the rest of the time series. This model was so simple that I didn’t even need my GPU; it turns out that small LSTMs like this often train faster on a CPU than on a GPU, and training takes a whole 70ish seconds on my i5. The image below shows how the model did.
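Before getting to the image, here’s roughly what the data prep looks like: slice the monthly series into (6-month input, 12-month target) windows and split at month 200. The variable `sst` stands in for a 1-D array of the monthly index values, and the training settings are placeholders rather than my tuned values:

```python
# Turn the monthly series into (6 months in, 12 months out) samples and split
# so that training windows only touch the first 200 months. `sst` is assumed
# to be a 1-D numpy array of monthly Nino 3.4 values starting in Jan 1982.
import numpy as np

def make_windows(series, n_in=6, n_out=12):
    X, y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(series[i:i + n_in])
        y.append(series[i + n_in:i + n_in + n_out])
    X = np.array(X)[..., np.newaxis]  # shape (samples, 6, 1) for the LSTM
    return X, np.array(y)             # shape (samples, 12)

X, y = make_windows(sst)
n_train = 200 - 6 - 12 + 1            # last window that ends inside month 200
X_train, y_train = X[:n_train], y[:n_train]
X_test, y_test = X[n_train:], y[n_train:]

# Epochs and batch size here are guesses, not my tuned values.
model.fit(X_train, y_train, epochs=200, batch_size=16, verbose=0)
```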

[Figure: Results of my simple LSTM ENSO model for months 1–6. For months 7–12, follow the link in the image.]

The shading shows reality, so it’s the same for each lead time. The black line shows the forecast from 1 month to 6 months out. A few things pop out to me: the correlations are pretty good, about 0.51 at 6 months, but they don’t seem significantly better or worse than what I’ve seen from other simple studies. Of course, more sophisticated neural net techniques do quite a bit better.
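If you want to reproduce that kind of number, the verification step is just a correlation at each lead time over the test windows (continuing with the hypothetical variables from the sketches above):

```python
# Correlate each forecast month with the verifying observations over the test set.
import numpy as np

y_pred = model.predict(X_test)  # shape (samples, 12)

for lead in range(y_pred.shape[1]):
    r = np.corrcoef(y_pred[:, lead], y_test[:, lead])[0, 1]
    print(f"lead {lead + 1:2d} months: r = {r:.2f}")
```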

The real take-home point here isn’t how well it does. It’s how easy it was to set it up and give it a try. I think there’s a ton of untapped potential… some ideas I have are to extend the training set, use external predictors, and smooth the data before training to reduce the amount of noise that gets absorbed into the model (a quick sketch of that last idea is below). I’ve already started those things and plan to regularly update the model.
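For the smoothing idea, something as simple as a 3-month running mean (the same window used for ONI) would probably do. Here’s one way it might look; it’s a sketch of the idea, not necessarily the smoothing I’ll end up using:

```python
# Smooth the raw monthly index with a 3-month running mean before windowing.
import numpy as np

def running_mean(series, window=3):
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode="valid")

sst_smooth = running_mean(sst, window=3)  # slightly shorter than the raw series
```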

You can keep an eye on the model’s development by going to kylemacritchie.com/ENSO. That site includes the verification images and a real-time forecast that updates on the 4th of each month (allegedly… we’ll find out in a couple of days). I’ll write up new posts as I update the model to discuss how my changes impact the outcome. I’d like to get that 6-month correlation into the 0.8 range, but I won’t hold my breath.

I look forward to sharing this journey with you and discussion is always welcome! (I should also be crystal clear that this is a pet project that is not in any way sponsored by or affiliated with my day job.)