A Predictive Earthquake Damage Model written in Python

Madeline Jones
New Light Technologies, Inc. (NLT)
May 6, 2021


Over the last couple of years, in my role as a Data Scientist at New Light Technologies, I have been developing a model that predicts damage to structures in near-real-time following earthquakes in the USA, using nationwide building centroids and Hazus damage functions.

Hazus is a risk model for earthquakes, floods, tsunamis, and hurricanes and is the authoritative voice for predicted impacts following one of these major events. To learn more about Hazus, you can read about it directly on the Hazus program’s website.

This blog post describes the methodology and supplemental data sets used in the Earthquake Damage Model, which is available on GitHub here. The model is a Python program that can be set up to run regularly as a scheduled task, in which case it could provide predicted structural damage following an earthquake in near-real-time.

Data and Tables:

There are several important data sources used in this predictive model. The GitHub repo contains all required tables; however, some of the external data sets must be downloaded or generated by the user with some basic geoprocessing steps, which are outlined in the GitHub readme file.

Building Centroids: To predict the number of buildings damaged in an area following an earthquake, you first need to know how many buildings are present. I used Oak Ridge National Laboratory building outlines from the USA Structures project. Other open and public data sets that could be used instead are Microsoft Building Footprints or OpenStreetMap. The building centroids are used to calculate the count of structures within each Census Tract.

General Building Stock (GBS): This is the Census Tract-level breakdown of structure types. Not only do you need to calculate how many buildings are within each Tract, but you need to be able to estimate what kinds of structures they are. This information was extracted from Hazus at the Tract-level. The count of building centroids is multiplied by this breakdown of structure types to estimate the number of structures within each structure type per Tract. The Hazus Earthquake building types are defined in the Hazus Technical User Manual (p. 17).
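
The multiplication described above can be sketched in a few lines. This is only an illustration, not code from the repository: the function name, the tract's building count, and the share values are made up, and the sketch assumes the GBS breakdown has already been converted from percentages to fractions.

```python
def estimate_structure_counts(building_count, gbs_shares):
    """Split a tract's building count across structure types.

    building_count: number of building centroids counted in the tract.
    gbs_shares: mapping of structure type -> fraction of the tract's
        buildings (fractions rather than percentages; an assumption here).
    """
    return {stype: building_count * share for stype, share in gbs_shares.items()}


# Hypothetical tract: 1,200 centroids and an invented GBS breakdown.
tract_shares = {"W1": 0.75, "C1L": 0.125, "URML": 0.125}
counts = estimate_structure_counts(1200, tract_shares)
print(counts)
```

The results are fractional estimates rather than whole buildings, so any rounding is probably best left to the final reporting step.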

Sample of Hazus General Building Stock (GBS) indicating the percentage of structures within each Structure Type per Tract.

Damage Function Variables: Hazus damage functions are open and available online. The easiest place to access them, along with a repository of many other peer-reviewed physical vulnerability functions, is through The OpenQuake Platform. Hazus’ damage functions are a set of cumulative distribution functions whose variables are dependent on structure type and seismic design code. They help to estimate the probability that a structure might meet or exceed Slight, Moderate, Extensive, or Complete damage based on the peak ground acceleration (PGA).

Hazus Damage Function graphs for 36 Structure Types
Set of damage functions for each Structure Type + Seismic Design Code. For example, C1HHC can be interpreted as C1 (Concrete Moment Frame), H (High Rise), HC (High Code). The colored lines represent Damage Curves for the following Damage Categories: Blue (Slight Damage), Green (Moderate Damage), Yellow (Extensive Damage), Red (Complete Damage). For each graph, the X-axis is Peak Ground Acceleration (PGA) (%g), and the Y-axis is Probability of Meeting or Exceeding the Specified Damage State.
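
To make the shape of these curves concrete, here is a minimal sketch of one damage function evaluated as a lognormal cumulative distribution function, which is the form Hazus fragility curves are generally described as taking. The function name is hypothetical, PGA is expressed in g rather than %g, and the median and dispersion values below are placeholders, not values from the Hazus tables.

```python
import math


def prob_exceed(pga, median, beta):
    """Probability of meeting or exceeding a damage state at a given PGA.

    Evaluates the lognormal CDF Phi(ln(pga / median) / beta), where median
    is the PGA at which the probability is 50% and beta is the lognormal
    standard deviation. pga and median must share units (g here).
    """
    if pga <= 0:
        return 0.0
    z = math.log(pga / median) / beta
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Placeholder (median, beta) pairs for one structure type + design code.
# These are NOT taken from the Hazus damage function tables.
EXAMPLE_PARAMS = {
    "Slight": (0.20, 0.64),
    "Moderate": (0.40, 0.64),
    "Extensive": (0.80, 0.64),
    "Complete": (1.60, 0.64),
}

# Exceedance probabilities for a tract shaken at 0.40 g.
exceedance = {
    state: prob_exceed(0.40, median, beta)
    for state, (median, beta) in EXAMPLE_PARAMS.items()
}
print(exceedance)
```

Because each more severe damage state has a larger median, the exceedance probabilities fall from Slight to Complete at any given PGA, which matches the ordering of the colored curves in the figure.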

USGS ShakeMap API: ShakeMaps provide near-real-time maps of ground motion and shaking intensity following significant earthquakes.

Other supplementary data sets used in the code (for doing things like spatial filtering, spatial joins, etc.) include:

How it works:

The program calls the USGS ShakeMap API to detect new or recent earthquake events in the USA. If the earthquake occurred in the US, and GIS files are available through the API, data is downloaded to the ShakeMaps subdirectory.
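
As a rough illustration of the detection step, the sketch below filters a USGS GeoJSON earthquake feed for events that list a ShakeMap product. It is not the repository's code: the choice of feed URL, the helper names, and the reliance on the `types` and `detail` properties are assumptions, and the US-only filter is left out. The example runs offline against a hand-written stand-in for a feed response.

```python
import json
import urllib.request

# Assumed endpoint: one of the public USGS GeoJSON summary feeds.
# The model itself may query a different feed or service.
FEED_URL = "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/4.5_week.geojson"


def events_with_shakemap(feed):
    """Return (event id, detail URL) pairs for features listing a shakemap product."""
    results = []
    for feature in feed.get("features", []):
        props = feature.get("properties", {})
        # "types" is assumed to be a comma-delimited string, e.g. ",origin,shakemap,".
        product_types = (props.get("types") or "").strip(",").split(",")
        if "shakemap" in product_types:
            results.append((feature.get("id"), props.get("detail")))
    return results


def fetch_feed(url=FEED_URL, timeout=30):
    """Download and decode the feed (requires network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Offline example: invented event ids and placeholder detail URLs.
sample_feed = {
    "features": [
        {"id": "xx0001", "properties": {"types": ",origin,shakemap,",
                                        "detail": "https://example.invalid/detail/xx0001"}},
        {"id": "xx0002", "properties": {"types": ",origin,",
                                        "detail": "https://example.invalid/detail/xx0002"}},
    ]
}
recent = events_with_shakemap(sample_feed)
print(recent)
```

In a scheduled run, `fetch_feed()` would replace the sample dictionary, and each returned detail URL would then be followed to locate the GIS files for download.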

USGS ShakeMap (MMI) GIS data for the Napa 2014 M 6.0 Earthquake. MMI represents the intensity of ground shaking following an earthquake.

USGS sometimes updates the ShakeMap data after an earthquake, as the epicenter is relocated or more data becomes available. In this case, the program will update the model results based on the most up-to-date ShakeMap data.
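
One simple way to decide when an event needs to be re-run is to remember the last update time processed for each event. The sketch below assumes each event carries an `updated` timestamp in milliseconds, as the USGS GeoJSON feeds provide; the function names and the in-memory dictionary are hypothetical, and the actual program may track versions differently.

```python
def needs_processing(event_id, updated_ms, processed):
    """True if the event is new or its data is newer than the last run."""
    last_seen = processed.get(event_id)
    return last_seen is None or updated_ms > last_seen


def mark_processed(event_id, updated_ms, processed):
    """Record the update time that was just processed for this event."""
    processed[event_id] = updated_ms


# Hypothetical event id and timestamps.
processed = {}
first_run = needs_processing("xx0001", 1_000, processed)  # never seen before
mark_processed("xx0001", 1_000, processed)
unchanged = needs_processing("xx0001", 1_000, processed)  # same version, skip
revised = needs_processing("xx0001", 2_000, processed)    # newer update, re-run
print(first_run, unchanged, revised)
```

For a scheduled task, the `processed` mapping would need to be saved between runs, for example as a small JSON file alongside the ShakeMap folders.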

The ShakeMap GIS files are then processed and clipped to census geographies (counties, tracts), with all hazard information spatially joined to each Tract, County, and building centroid. All output files are saved into a local geodatabase built inside the ShakeMap folder for the event.

The building centroid count for each tract is used to estimate the number of buildings within each tract, and to calculate the breakdown of structures using the Hazus GBS table. Below is a breakdown of all the files pulled from USGS, as well as the output file generated by the Earthquake Damage Model:

Data sets produced by the Earthquake Damage Model for a single earthquake event.

The final output of the model is the Tract-Level Damage Assessment, which is stored in the TractLevel_DamageAssessmentModel_Output.shp file. This contains the number of structures within each Damage Category (Green, Yellow, Red) for each Tract within the spatial extent of the ShakeMap. The features in the eqmodel_outputs.gdb are ancillary and can be deleted.
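
The step from exceedance probabilities to structure counts can be sketched as follows. Since the damage functions give the probability of meeting or exceeding each state, the probability of being in exactly one state is the difference between successive curves. The function names and input values are illustrative assumptions, and the sketch stops at the four Hazus damage states because the mapping to the Green, Yellow, and Red categories is not spelled out here.

```python
STATES = ["Slight", "Moderate", "Extensive", "Complete"]


def discrete_state_probs(exceedance):
    """Convert P(damage >= state) into P(damage == state), including 'None'."""
    probs = {"None": 1.0 - exceedance[STATES[0]]}
    for state, next_state in zip(STATES, STATES[1:]):
        probs[state] = exceedance[state] - exceedance[next_state]
    # Nothing exceeds the most severe state, so its exceedance is its probability.
    probs[STATES[-1]] = exceedance[STATES[-1]]
    return probs


def expected_counts(structure_count, exceedance):
    """Expected number of structures in each damage state."""
    return {
        state: structure_count * prob
        for state, prob in discrete_state_probs(exceedance).items()
    }


# Invented exceedance probabilities for one structure type in one tract.
example_exceedance = {"Slight": 0.5, "Moderate": 0.25, "Extensive": 0.125, "Complete": 0.0625}
damage_counts = expected_counts(160, example_exceedance)
print(damage_counts)
# → {'None': 80.0, 'Slight': 40.0, 'Moderate': 20.0, 'Extensive': 10.0, 'Complete': 10.0}
```

Summing these counts over all structure types in a tract would give tract-level totals per damage state.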

Earthquake Model Results vs. Hazus Model Results for the Napa 2014 M 6.0 Earthquake (Red/Destroyed Buildings per Tract)

There are a few significant differences between this code and the Hazus model itself, which result in slightly different model outputs (a difference of fewer than 10 buildings per tract in this test case):

  • Seismic Design Code: Seismic Design Codes represent guidance on how structures should be designed and constructed to limit seismic risk. The Seismic Design Code is dependent on many things, such as the year the structure was built, the seismic hazard of the area, local geologic conditions, and structure type. There are damage functions available for the following Seismic Design Codes: High Code, Moderate Code, Low Code, and Pre-Code. This model assumes the highest possible seismic design code for structures, resulting in lower (more conservative) damage estimates. The Hazus program customizes the seismic design code based on the seismic hazard of the impacted region as well as the year the structures were built.
  • Tract-level hazard: Peak Ground Acceleration (PGA) determines the probability of meeting or exceeding each damage state. To do this analysis at the Tract level, a single PGA value must be assigned to the entire tract, even though shaking usually varies across a tract, especially a large one. The Python model assigns the minimum PGA to each tract, resulting in lower (more conservative) damage estimates.
  • General Building Stock: This model uses the count of building centroids within each Tract for structure count, whereas Hazus uses an estimation of structure counts based on Census demographics.
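
The tract-level generalization of PGA described in the second bullet can be illustrated with a short sketch. It assumes the spatial join has already produced (tract id, PGA) pairs, one for each ShakeMap polygon that overlaps a tract; the function name, tract ids, and PGA values are hypothetical.

```python
def min_pga_by_tract(samples):
    """Reduce (tract_id, pga) pairs to a single minimum PGA per tract."""
    result = {}
    for tract_id, pga in samples:
        if tract_id not in result or pga < result[tract_id]:
            result[tract_id] = pga
    return result


# Invented join output: tract "T1" overlaps three PGA bands, "T2" overlaps one.
joined = [("T1", 0.5), ("T1", 0.25), ("T1", 0.75), ("T2", 0.5)]
tract_pga = min_pga_by_tract(joined)
print(tract_pga)
# → {'T1': 0.25, 'T2': 0.5}
```

Swapping the comparison for a maximum or an area-weighted mean would be a small change, which makes it easy to see how sensitive the damage estimates are to this choice.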

Future iterations of this work will include:

  • Converting the program to be entirely open source (removing arcpy dependencies)
  • Enabling analysis at the structural level using parcel data attribution spatially joined to building centroids and mapping those attributes to the Hazus structure types.
  • The ability for a user to define the Seismic Design Code as a parameter for Tract-level analysis, or calculation of Seismic Design Code based on Seismic Hazard and Year Built for the Structural-level analysis.




Madeline is a Data Scientist at New Light Technologies with 5+ years of experience working in Natural Hazards and Disaster Response.