Hurricane Path Prediction using Deep Learning

Kamban Parasuraman
6 min read · Jun 18, 2019

Hurricane Harvey, Irma & Maria (2017) — Source: NOAA

Every year, the period from June 1 to November 30 marks the North Atlantic hurricane season. During this period, the warm waters of the Atlantic give birth to tropical cyclones, and a few of them make landfall, causing major casualties and loss of property. The 2017 season was especially destructive: hurricanes including Harvey, Maria, and Irma caused damages exceeding $300 billion. Accurate prediction of the frequency, severity, and landfall locations of hurricanes is important for mitigating the risk from these costly disasters.

The European Centre for Medium-Range Weather Forecasts (ECMWF) model and the National Weather Service’s Global Forecast System (GFS) model are widely used for predicting storm tracks. These data-intensive numerical models attempt to model meso-scale weather patterns and are computationally expensive. Predicting storm tracks and landfall locations is challenging, and the graphic below illustrates the uncertainty associated with these model predictions. In this article, let’s explore how we can use deep learning to develop a storm track prediction model based on LSTMs.

Five Day Storm Track Forecast For Hurricane Sandy (2012) - Source: NOAA

Show Me the Data:

Similar to the earlier study, we will be using the Atlantic Hurricane Database (HURDAT2). The HURDAT2 dataset provides the location (latitude and longitude) of every storm at 6-hour intervals from genesis to decay. More details about the dataset and exploratory analysis can be found here. Short-term forecasting (6 or 12 hours ahead) of the hurricane path is relatively straightforward. In this study, we will build a long-term prediction model that can predict the path of a hurricane a few days or weeks in advance. In total, there are 1792 historical storms, and we will use 1590 of them to train the neural network model.
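To make the data layout concrete, here is a minimal sketch of reading HURDAT2-style text into per-storm tracks. It is not the preprocessing used in this study: the function names `parse_coord` and `parse_hurdat2` are hypothetical, and the sample lines are made up to mimic the layout (a header line per storm followed by comma-separated 6-hourly fixes with hemisphere-suffixed coordinates).

```python
def parse_coord(token):
    """Convert a coordinate such as '28.0N' or '94.8W' to signed degrees."""
    token = token.strip()
    value, hemisphere = float(token[:-1]), token[-1]
    return -value if hemisphere in "SW" else value


def parse_hurdat2(lines):
    """Group fixes by storm ID: {storm_id: [(date, time, lat, lon), ...]}."""
    tracks = {}
    current = None
    for line in lines:
        fields = [f.strip() for f in line.split(",")]
        if not fields[0]:
            continue  # skip blank lines
        if fields[0][:2].isalpha():
            # Header line: storm ID (basin code + number + year), name, row count.
            current = fields[0]
            tracks[current] = []
        else:
            # Data line: date, time, record identifier, status, lat, lon, ...
            tracks[current].append(
                (fields[0], fields[1], parse_coord(fields[4]), parse_coord(fields[5]))
            )
    return tracks


# Made-up sample in the HURDAT2 layout (not real observations).
sample = [
    "AL991900,            UNNAMED,      2,",
    "19000801, 0000,  , TS, 12.5N,  40.0W,  35, -999,",
    "19000801, 0600,  , TS, 12.8N,  41.2W,  40, -999,",
]
tracks = parse_hurdat2(sample)
print(tracks["AL991900"])
```

Keeping each storm as its own list of fixes makes it straightforward to split the data by storm (for example, 1590 storms for training and the rest held out) rather than by individual rows.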

Snapshot of Processed HURDAT Dataset
Atlantic Hurricane Historical Storms (1851–2014)

Feature Engineering:

Since we will be using just the genesis point to predict the full path traversed by the hurricane, let’s explore how we can use domain knowledge to create additional features that allow the model to capture the spatial patterns hidden in the historical data.

1. Transforming Latitude and Longitude:

Latitude and longitude are two angles that describe a point on the surface of a sphere in three-dimensional space. Because longitude wraps around the globe, its two extreme values (−180° and 180°) are in fact very close together in reality. To circumvent this issue, we will transform the lat/lon coordinates to points on the unit sphere. This means that points that are close in these three dimensions are also close in reality.

Converting Latitude & Longitude to Cartesian Coordinates
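The conversion can be sketched in a few lines of Python using the standard spherical-to-Cartesian formulas for a unit sphere. The function names are hypothetical, and the inverse transform is included on the assumption that predicted (x, y, z) points need to be mapped back to latitude and longitude for plotting.

```python
import math


def latlon_to_xyz(lat_deg, lon_deg):
    """Map latitude/longitude in degrees to a point on the unit sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    return x, y, z


def xyz_to_latlon(x, y, z):
    """Inverse transform; normalises first so off-sphere predictions still work."""
    norm = math.sqrt(x * x + y * y + z * z)
    lat = math.degrees(math.asin(max(-1.0, min(1.0, z / norm))))
    lon = math.degrees(math.atan2(y, x))
    return lat, lon


# Two points on either side of the antimeridian: far apart as raw
# longitudes (179.9 vs -179.9), but close together on the sphere.
a = latlon_to_xyz(0.0, 179.9)
b = latlon_to_xyz(0.0, -179.9)
gap = math.dist(a, b)
print(gap)
```

The raw longitudes in the example differ by 359.8 degrees, while the straight-line distance between the two transformed points is a small fraction of the unit radius, which is the property the model benefits from.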

2. Transition Probabilities:

The historical dataset shows that hurricanes propagate west after their genesis in the tropics. The clockwise circulation around the subtropical ridge, together with the global trade winds, steers hurricanes in a north-westerly direction along the ridge. In addition, factors like sea-surface temperature, the Coriolis force, and wind shear cause hurricanes to make loops and hairpin turns, resulting in unpredictable trajectories. To capture these spatial patterns, we overlay a 0.25° × 0.25° grid on the historical footprint and calculate the transition probabilities from each grid cell to every other grid cell. Details on calculating the transition probabilities can be found here. These transition probabilities are then appended to each row of data and used as inputs to determine the next location of the hurricane.
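One way to estimate such transition probabilities is to count, for every grid cell, how often consecutive 6-hourly fixes move from that cell to each other cell, and then normalise the counts. The sketch below assumes this simple counting approach; the linked write-up may differ in detail, and the names `cell_of` and `transition_probabilities` as well as the toy tracks are hypothetical.

```python
import math
from collections import Counter, defaultdict

CELL = 0.25  # grid resolution in degrees, as described above


def cell_of(lat, lon, size=CELL):
    """Index of the grid cell containing a (lat, lon) point."""
    return (math.floor(lat / size), math.floor(lon / size))


def transition_probabilities(tracks, size=CELL):
    """Estimate P(next cell | current cell) from consecutive fixes in each track."""
    counts = defaultdict(Counter)
    for track in tracks:
        cells = [cell_of(lat, lon, size) for lat, lon in track]
        for src, dst in zip(cells, cells[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


# Toy tracks (made-up coordinates): three storms leaving the same cell.
toy_tracks = [
    [(10.1, -40.1), (10.3, -40.4)],
    [(10.1, -40.1), (10.3, -40.4)],
    [(10.2, -40.2), (10.1, -40.6)],
]
probs = transition_probabilities(toy_tracks)
print(probs[cell_of(10.1, -40.1)])
```

In the toy data, two of the three storms leaving the starting cell move to the same neighbour, so that transition gets probability 2/3 and the other gets 1/3. With a 0.25° grid most cell pairs are never observed, which is why a sparse dictionary is used here rather than a dense matrix.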

3. Clustering:

The climatology (genesis points, sea surface temperature, energy, etc.) determines the trajectory of hurricanes and their probability of making landfall. To account for the climatology, we will cluster the hurricanes into four groups.

Group I: Hurricanes that originate in the East Atlantic near the equator. These hurricanes have time to gather energy over warm waters, and they usually curve up.

Group II: Similar to the previous group, the genesis points of these hurricanes are near the equator as well, but further west. The trajectories of these hurricanes are typically straight paths towards Florida and the Gulf.

Group III: The genesis points of these hurricanes are further from the equator, and the hurricanes are not very strong. Their trajectories typically curve up, and they rarely make landfall.

Group IV: The genesis points of these hurricanes are near the Gulf of Mexico. Given the proximity of the genesis points to land, these hurricanes do not have time to gather energy, but they have a high probability of landfall.

Clustering of Hurricanes based on Genesis Coordinate
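The article does not say which clustering algorithm produced these four groups. As one plausible illustration, the sketch below runs a plain k-means on genesis coordinates with k = 4. The genesis points are synthetic, and the helper names are hypothetical.

```python
import numpy as np


def nearest_center(points, centers):
    """Index of the closest center for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k=4, n_iter=100, seed=0):
    """Plain k-means; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest_center(points, centers)
        # Move each center to the mean of its members (keep it if it has none).
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return nearest_center(points, centers), centers


# Synthetic genesis points (lat, lon) scattered around four made-up locations.
rng = np.random.default_rng(42)
blob_centers = np.array([[12.0, -30.0], [12.0, -60.0], [28.0, -55.0], [24.0, -90.0]])
genesis = np.vstack([c + rng.normal(scale=2.0, size=(25, 2)) for c in blob_centers])

labels, centers = kmeans(genesis, k=4)
print(np.bincount(labels, minlength=4))
```

Raw latitude and longitude are used here for readability, since North Atlantic genesis points do not straddle the antimeridian. Clustering on the unit-sphere coordinates would work the same way.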

4. Modular Network Vs. One-Hot Encoding:

Now that we have clustered the hurricanes into four groups to account for climatological conditions, we could either (a) develop a modular neural network architecture by creating four independent neural networks moderated by some intermediary, or (b) develop a single model by treating the cluster definitions as categorical data and applying one-hot encoding. In this study, we will adopt one-hot encoding of the cluster definition.
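As a small illustration of option (b), the sketch below turns cluster IDs 0 to 3 into four binary columns and appends them to the coordinate features. The array names and the sample values are hypothetical.

```python
import numpy as np


def one_hot(labels, n_classes=4):
    """Encode integer cluster IDs (0..n_classes-1) as one-hot rows."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, n_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


# Made-up example: four storms, one from each group.
cluster_ids = np.array([0, 2, 3, 1])
xyz = np.array([
    [0.75, -0.63, 0.21],
    [0.48, -0.85, 0.22],
    [0.20, -0.89, 0.41],
    [0.52, -0.71, 0.47],
], dtype=np.float32)

# Each row now holds the 3 coordinates followed by the 4 cluster indicators.
features = np.hstack([xyz, one_hot(cluster_ids)])
print(features.shape)
```

Compared with feeding the cluster ID as a single number, the one-hot columns avoid implying that Group IV is somehow "larger" than Group I.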

Building the Model:

Long Short-Term Memory (LSTM) models are powerful time-series models that can predict an arbitrary number of steps into the future. An LSTM module (or cell) contains the following components: (1) forget gate, (2) candidate layer, (3) input gate, (4) output gate, (5) hidden state, and (6) memory state. Since there are many good resources available online (e.g. https://colah.github.io/posts/2015-08-Understanding-LSTMs/) for learning about LSTMs, we will not get into the details of their inner workings.
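For readers who want a concrete picture of how those six components fit together, here is a minimal NumPy sketch of a single LSTM time step using the standard formulation. The weight layout (the four gates stacked into one matrix) and the variable names are choices made for this illustration, and the weights are random rather than trained.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    Shapes: x_t (D,), h_prev and c_prev (H,), W (4H, D), U (4H, H), b (4H,).
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[0:H])            # (1) forget gate
    g = np.tanh(z[2 * H:3 * H])    # (2) candidate layer
    i = sigmoid(z[H:2 * H])        # (3) input gate
    o = sigmoid(z[3 * H:4 * H])    # (4) output gate
    c_t = f * c_prev + i * g       # (6) memory state
    h_t = o * np.tanh(c_t)         # (5) hidden state
    return h_t, c_t


# Random (untrained) weights: 3 inputs (x, y, z) and 4 hidden units.
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)

h, c = lstm_step(np.array([0.75, -0.63, 0.21]), np.zeros(H), np.zeros(H), W, U, b)
print(h, c)
```

The forget gate scales the previous memory state, the input gate scales the new candidate values, and the output gate decides how much of the updated memory is exposed as the hidden state.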

For this study, we will develop a four-layer LSTM model to predict the location of the hurricane (x, y, z coordinates) one time step at a time. The resulting prediction at time t is fed back as input to predict the hurricane location at the next time step (t+1). The architecture of the model used in the study, along with its Python implementation, is shown below.

Four Layer LSTM Architecture
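The feedback loop described above, where each prediction becomes the next input, can be sketched independently of the network itself. This is not the author's implementation: `toy_step` is a stand-in for the trained four-layer LSTM, the extra input features (transition probabilities and cluster encoding) are omitted, and projecting each prediction back onto the unit sphere is an assumption made here to keep the points valid.

```python
import numpy as np


def rollout(step_fn, genesis_xyz, n_steps):
    """Predict a path one step at a time, feeding each output back as input."""
    path = [np.asarray(genesis_xyz, dtype=float)]
    for _ in range(n_steps):
        nxt = np.asarray(step_fn(path[-1]), dtype=float)
        nxt = nxt / np.linalg.norm(nxt)  # assumption: keep points on the unit sphere
        path.append(nxt)
    return np.stack(path)


def toy_step(xyz):
    """Stand-in for the trained model: a fixed small drift per 6-hour step."""
    return xyz + np.array([-0.01, -0.01, 0.005])


# Made-up genesis point at 12N, 40W, converted to unit-sphere coordinates.
lat, lon = np.radians(12.0), np.radians(-40.0)
genesis = np.array([np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)])

path = rollout(toy_step, genesis, n_steps=20)
print(path.shape)
```

Because every step consumes the previous prediction, errors accumulate along the path, which is one reason long-range forecasts from the genesis point alone are harder than 6- or 12-hour forecasts.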

Model Prediction

The trained model is now ready to make predictions. To test the robustness of the model, let’s use it to produce ensemble predictions for Hurricane Ivan (2004) and Hurricane Wilma (2005). The plots below show the actual paths (solid black lines) of these hurricanes and the ensemble of paths predicted by the model. Considering that we are making a long-term forecast (days and weeks ahead) of the hurricane path from just the genesis point, the model predictions are reasonably good.

Hurricane Ivan (2004)
Hurricane Wilma (2005)
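The article judges the predictions visually and does not report an error metric. One common way to put a number on track quality, sketched here as an assumption rather than as the evaluation used in this study, is the mean great-circle distance between the actual and predicted positions at matching time steps. The function names and the sample tracks are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_track_error_km(actual, predicted):
    """Average distance between matching time steps of two tracks."""
    n = min(len(actual), len(predicted))
    if n == 0:
        raise ValueError("both tracks must contain at least one point")
    return sum(haversine_km(*actual[i], *predicted[i]) for i in range(n)) / n


# Made-up tracks: the prediction drifts north of the actual path.
actual = [(12.0, -40.0), (12.5, -42.0), (13.0, -44.0)]
predicted = [(12.0, -40.0), (13.0, -42.0), (14.0, -44.0)]
print(mean_track_error_km(actual, predicted))
```

For an ensemble, the same function can be applied to each member separately to see how the spread of errors grows with forecast length.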

Data Driven Vs. Physically Based Models

The imperative of developing good weather modeling systems is not new, but deriving actionable insights from petabyte-sized weather datasets with mixed multi-dimensional variables (e.g. Global Circulation Model output) is challenging. Relying on the strength of data-driven models to identify the causal interdependencies in multi-dimensional data, in this study we explored the use of deep learning to model hurricane paths from just the genesis points, and the model predictions are reasonably good.

If you have any thoughts or comments, please leave them below.
