FireMind: Using Neural Networks to Model Wildfire

Published in MBF-data-science · 6 min read · Jul 9, 2019

2018 was the deadliest and most destructive year for wildfire in California history. The Camp Fire killed 85 people, burned almost 240 square miles, destroyed over 18,000 structures, and caused $16 billion in damage. A better understanding of wildfire could save lives and help direct firefighting resources to where they can do the most good.

Wildfire modeling is almost as old as computing technology. Currently, the best wildfire model in the world is FIRETEC, out of Los Alamos National Laboratory. FIRETEC uses computational fluid dynamics to simulate landscape-scale fires on a grid of 1 m cubic voxels. FIRETEC is very impressive, but it has two major limitations. First, it needs the resources of a major supercomputing center like Los Alamos to run. Second, even if you have a supercomputer, FIRETEC runs slower than real time, making it useless for prediction.

Firefighters have rules of thumb for how wildfire will behave. Flame fronts are driven by the wind. Fires burn faster uphill than downhill. Fires can be directed by canyons and skip from ridgeline to ridgeline on blown embers, missing the valleys below. As unpredictable as wildfires are, there are emergent patterns in how they behave.

Machine learning is adept at picking out patterns that evade human faculties, and there’s a wealth of historical data available about wildfire. Training machine learning models is slow, but once a trained model is available, making a prediction is fast, on the order of milliseconds. FireMind uses deep neural networks and historical wildfire data to model how wildfire spreads.

A Fire Perimeter advances southeast over several days

Formally, we want to make a prediction step as indicated below: use the location of the flame front today to predict where the flame front will be tomorrow, taking into account local topography and weather.

The fire, indicated as red 1s, spreads up.
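As a toy illustration of that prediction step, here is a minimal NumPy sketch. The 5 x 5 grids and the variable names are made up for this example and are not FireMind data; the point is only that the target is the set of cells burned tomorrow but not today.

```python
import numpy as np

# Toy 5 x 5 grids: 1 = burning or burned, 0 = unburned. Row 0 is the top.
today = np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
], dtype=np.uint8)

tomorrow = np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
], dtype=np.uint8)

# The prediction target: cells that are burned tomorrow but not today.
newly_ignited = (tomorrow == 1) & (today == 0)

print(newly_ignited.astype(np.uint8))
print("newly ignited cells:", int(newly_ignited.sum()))
```

In this toy case the fire advances upward by four cells; a model's job is to recover that mask from today's grid plus topography and weather.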

Data Sources and Preparation

Wildfire is described by the fire triangle: heat, fuel and oxygen. Fire will spread as long as all three elements are present, and fire is fought by removing one or more sides of the triangle. Heat is represented by wildfire perimeters, a record of burning areas maintained through the GeoMAC program. Fuel is recorded in the Landfire dataset, along with topography. And the NREL WIND dataset provides a record of atmospheric conditions, including wind, temperature, and precipitation.

Of course, because these three datasets are maintained by three different Federal agencies, they’re in entirely different formats. GeoMAC records its observations as shapefiles. A fire perimeter is a vectorized shape, with vertices indicated by latitude-longitude coordinates, and a fire consists of daily perimeters. Fortunately, shapefiles play nicely with the GeoPandas library, which is much like Pandas, but for GIS data. NREL WIND uses one-kilometer cubes as its units, but it has a Pythonic API, where it’s easy to input the coordinates, times, and types of information you want, and it’ll output the results to a NumPy array. Landfire represents fuel load and topography as a 30m x 30m continental raster derived from Landsat satellite observations, supplemented with local updates, and stored in a proprietary ArcGIS format.

Getting this data into a workable combined format presented some minor challenges.

I used the 30m x 30m Landfire raster as my base resolution, since I planned to use NumPy arrays as input for my model, and making each 30m x 30m square equivalent to one cell in the array seemed like an obvious choice. The other data sources would be manipulated to fit this standard.

I transformed the coordinate reference system of the GeoMAC shapes from lat-lon to EPSG:5070, the coordinate reference system used by Landfire. Because latitude and longitude specify a point on a sphere (well, spheroid) and paper and screens are flat, spherical trig is required to properly translate lat-lon coordinates to something that can be plotted. I used the maximum extent of each fire to generate a bounding box at a size and resolution such that each pixel in the plot corresponded to a 30m x 30m square, then plotted each day’s fire perimeter, and then read the plot buffer to get an array corresponding to the fire perimeter. (Note: I am aware of RasterIO and Fiona, but this task was unique enough that this method was faster.) These arrays were stored as pickled dictionaries, along with data about the fire and the corners of the bounding box.
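To make the perimeter-to-array step concrete, here is a rough sketch of one way to do it. It is not the plot-buffer method described above: it swaps in a direct even-odd point-in-polygon test on cell centers. It assumes the vertices are already projected to meters (for example in EPSG:5070), and `rasterize_perimeter` and its parameters are hypothetical names chosen for this illustration.

```python
import numpy as np

CELL_SIZE_M = 30.0  # one array cell per 30 m x 30 m square


def rasterize_perimeter(vertices, bounds, cell_size=CELL_SIZE_M):
    """Return a uint8 mask: 1 where a cell center lies inside the polygon.

    vertices: sequence of projected (x, y) coordinates in meters.
    bounds: (min_x, min_y, max_x, max_y) of the bounding box in meters.
    """
    verts = np.asarray(vertices, dtype=float)
    min_x, min_y, max_x, max_y = bounds
    n_cols = int(np.ceil((max_x - min_x) / cell_size))
    n_rows = int(np.ceil((max_y - min_y) / cell_size))

    # Cell-center coordinates; row 0 is the northern (max_y) edge.
    xs = min_x + (np.arange(n_cols) + 0.5) * cell_size
    ys = max_y - (np.arange(n_rows) + 0.5) * cell_size
    px, py = np.meshgrid(xs, ys)

    inside = np.zeros((n_rows, n_cols), dtype=bool)
    x0, y0 = verts[-1]
    for x1, y1 in verts:
        # Even-odd rule: toggle when a rightward ray crosses this edge.
        crosses = (y0 > py) != (y1 > py)
        # Horizontal edges divide by zero, but `crosses` is False there.
        with np.errstate(divide="ignore", invalid="ignore"):
            x_at_py = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
            inside ^= crosses & (px < x_at_py)
        x0, y0 = x1, y1
    return inside.astype(np.uint8)


# A 90 m square perimeter inside a 150 m bounding box gives a 5 x 5 mask.
square = [(0, 0), (90, 0), (90, 90), (0, 90)]
mask = rasterize_perimeter(square, bounds=(0, 0, 150, 150))
print(mask)
```

For real data, a dedicated rasterization routine from a GIS library is likely the better choice; this version only shows the idea of mapping a projected polygon onto a fixed 30 m grid.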

The corners of the bounding boxes were used to generate two further sets of arrays for each fire. ArcGIS includes a built-in Python interpreter, which I used to read out topographical data including slope, aspect, and altitude, and most importantly the Scott & Burgan Standard Fire Behavior Fuel Model code for the estimated fuel load in that region.

Fire Data Layers. Perimeter advance between two days, slope, azimuth, and fuel load.

Machine Learning

Because machine learning requires identically formatted data, and fire perimeters varied in size from a few hundred meters to a few hundred kilometers, I borrowed convolutional techniques from machine vision to traverse the edge of the fire, finding 256 x 256 pixel arrays where the fire advanced but did not overrun the size of the array.
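A rough sketch of that window search, under my own assumptions: it slides a fixed-size window over a pair of daily masks and keeps windows that contain existing fire, contain some advance, and whose advance never touches the window border. The function name `find_training_windows` and the `stride` parameter are hypothetical, and the actual traversal of the fire edge may have worked differently.

```python
import numpy as np


def find_training_windows(today, tomorrow, size=256, stride=64):
    """Return (row, col) top-left corners of usable size x size windows."""
    new_fire = (tomorrow == 1) & (today == 0)
    n_rows, n_cols = today.shape
    corners = []
    for r in range(0, n_rows - size + 1, stride):
        for c in range(0, n_cols - size + 1, stride):
            win_today = today[r:r + size, c:c + size]
            win_new = new_fire[r:r + size, c:c + size]
            # Need an existing flame front and some advance to learn from.
            if not win_today.any() or not win_new.any():
                continue
            # Reject windows where the advance reaches the border, since
            # the fire may have run past what the window can show.
            border = np.concatenate(
                [win_new[0], win_new[-1], win_new[:, 0], win_new[:, -1]]
            )
            if border.any():
                continue
            corners.append((r, c))
    return corners


# Small demo: a 2 x 2 fire grows to 4 x 4 on a 12 x 12 grid.
today = np.zeros((12, 12), dtype=np.uint8)
tomorrow = np.zeros((12, 12), dtype=np.uint8)
today[5:7, 5:7] = 1
tomorrow[4:8, 4:8] = 1
print(find_training_windows(today, tomorrow, size=8, stride=2))
```

Only the window that fully contains the advance survives in the demo; windows that clip the new fire at an edge are discarded.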

Convolutions around the perimeter shown above

The NREL WIND API translated lat-lon coordinates into its native 1-km spatial resolution, producing a set of weather data, including wind speed, wind direction, precipitation, humidity, and temperature. These were combined with the convolutions described above to produce 71,000 fire progress tensors, identically shaped arrays describing fire progress, local topography, and weather.
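One plausible way to bundle these pieces into a single training example is sketched below. The layout is an assumption on my part: spatial layers stacked along a channel axis, weather kept as a separate time series, and the target defined as newly ignited cells. `build_fire_tensor`, the dictionary keys, and the number of weather time steps are all hypothetical.

```python
import numpy as np


def build_fire_tensor(today, tomorrow, slope, aspect, elevation, fuel, weather):
    """Bundle one training example; all spatial layers share one (H, W) shape.

    weather: (T, 5) array of wind speed, wind direction, precipitation,
    humidity, and temperature at T time steps (T is an assumption).
    """
    layers = [today, slope, aspect, elevation, fuel]
    shape = today.shape
    if any(layer.shape != shape for layer in layers + [tomorrow]):
        raise ValueError("all spatial layers must share one shape")
    target = tomorrow.astype(bool) & ~today.astype(bool)
    return {
        "spatial": np.stack(layers, axis=-1).astype(np.float32),  # (H, W, 5)
        "weather": np.asarray(weather, dtype=np.float32),         # (T, 5)
        "target": target.astype(np.uint8),                        # (H, W)
    }


# Demo with random placeholder values on a small 8 x 8 window.
rng = np.random.default_rng(0)
h, w = 8, 8
today = np.zeros((h, w), dtype=np.uint8)
tomorrow = np.zeros((h, w), dtype=np.uint8)
today[3:5, 3:5] = 1
tomorrow[2:6, 2:6] = 1
example = build_fire_tensor(
    today, tomorrow,
    slope=rng.random((h, w)), aspect=rng.random((h, w)),
    elevation=rng.random((h, w)), fuel=rng.integers(0, 40, (h, w)),
    weather=rng.random((24, 5)),
)
print({name: arr.shape for name, arr in example.items()})
```

Because every example has the same shapes, batches can be formed by simple stacking, which is what makes the fixed window size useful.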

Fire progress tensors are multidimensional arrays representing trainable data about the fire.

These fire tensors were stored in a NoSQL database and streamed to a multipart neural network written in Keras. The neural network architecture combined three neural networks: a convolutional neural network to extract high-level features from static topography, a Long Short-Term Memory (LSTM) network to summarize the day’s weather, and finally a regularized multilayer perceptron to understand the change in fire perimeter.
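For readers who want a feel for how three such networks can be wired together, here is a small sketch using the Keras functional API. It is not the FireMind architecture: the layer sizes, the input shapes, the way the current perimeter is fed in, and the name `build_firemind_sketch` are all assumptions chosen to keep the example small.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_firemind_sketch(size=64, n_layers=4, n_steps=24, n_weather=5):
    """Three-branch model: CNN + LSTM feeding a regularized MLP (illustrative)."""
    # Branch 1: CNN over static layers such as slope, aspect, altitude, fuel.
    spatial_in = keras.Input(shape=(size, size, n_layers), name="spatial")
    x = layers.Conv2D(16, 3, padding="same", activation="relu")(spatial_in)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(32, 3, padding="same", activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)

    # Branch 2: LSTM summarizing the day's weather sequence.
    weather_in = keras.Input(shape=(n_steps, n_weather), name="weather")
    w = layers.LSTM(16)(weather_in)

    # Branch 3: regularized MLP combining both summaries with today's perimeter.
    perimeter_in = keras.Input(shape=(size, size, 1), name="perimeter")
    p = layers.Flatten()(perimeter_in)
    h = layers.Concatenate()([x, w, p])
    h = layers.Dense(
        256, activation="relu",
        kernel_regularizer=keras.regularizers.l2(1e-4),
    )(h)
    h = layers.Dropout(0.5)(h)

    # One ignition probability per cell in the window.
    out = layers.Dense(size * size, activation="sigmoid")(h)
    out = layers.Reshape((size, size))(out)

    model = keras.Model([spatial_in, weather_in, perimeter_in], out)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model
```

Calling `build_firemind_sketch()` returns a compiled model that takes three inputs and emits a per-cell probability map. The dense output layer grows with the square of the window size, so a full 256 x 256 window would likely need a different output head.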

A neural network composed of 3 neural networks.

This network was trained over the course of 48 hours using an AWS p3.8xlarge EC2 instance, a powerful machine learning computer with multiple Nvidia V100 Tensor Core GPUs. The p3.8xlarge hardware costs as much as a sports car, and has an hourly cost comparable to renting one. Of course, this is the democratizing power of cloud computing. While FIRETEC and Los Alamos definitely have a more powerful computer, I was able to get a machine far better than anything I could afford locally with nothing more than a credit card and an internet connection — and since Metis provides $1000 of AWS credit as part of the bootcamp, I didn’t even run up any charges!

The results were impressive. Given a fire tensor, FireMind was able to predict which 30m x 30m squares would be ignited in the next 24 hours with an F1 score of 0.85. From a machine learning perspective, this is pretty good!
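For reference, the F1 score is the harmonic mean of precision and recall. A minimal sketch of computing it over a pair of binary ignition masks follows; the function name `f1_score_masks` and the toy arrays are illustrative and are not taken from the project.

```python
import numpy as np


def f1_score_masks(predicted, actual):
    """F1 score for two binary masks of the same shape."""
    predicted = np.asarray(predicted, dtype=bool)
    actual = np.asarray(actual, dtype=bool)
    true_pos = np.sum(predicted & actual)
    false_pos = np.sum(predicted & ~actual)
    false_neg = np.sum(~predicted & actual)
    if true_pos == 0:
        return 0.0  # no correct ignitions; also avoids dividing by zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return float(2 * precision * recall / (precision + recall))


# Two of three real ignitions found, plus one false alarm.
actual = np.array([1, 1, 1, 0, 0, 0])
predicted = np.array([1, 1, 0, 1, 0, 0])
print(round(f1_score_masks(predicted, actual), 3))
```

Since most cells in a window do not ignite on a given day, F1 is a more informative measure here than plain accuracy, which would reward predicting "no fire" everywhere.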

Aftermath of fire in Paradise, CA, 2018. Josh Edelson/GETTY IMAGES

FireMind is very fast; it does in fact predict in milliseconds. But wildfire is complex: fire storms generate their own weather. Wildfire is chaotic: a single drifting ember can change a contained situation into a raging inferno. When doing data science, we have to be aware of the limits of our models and the consequences of being in error. An F1 score of 0.85 is pretty good, but not good enough when homes and lives are at stake.

I’d like to thank several people for their help with this project. Everyone at Metis San Francisco Fall 2018, but especially Adam Wearne; Robert Taylor at the National Park Service, who stayed late to answer all my questions about fire behavior and wildfire data sources; and Beth Burnam and Eric Kennedy, for inspiring this project and for helping me think out loud about it.

Written by Michael Burnam-Fink