Animating 40 years of California Earthquakes
Many years ago in college, I decided to take a Computer Graphics class. One of the assignments during the class was to create a visualization using any type of scientific data.
One day, while reading the news, I noticed a small earthquake had happened in California. I decided that earthquake data would make for an interesting visualization.
I chose to use data from the Northern California Earthquake Data Center (NCEDC), an organization that has measured and made available all California earthquake records from 1967 to the present. Back in the day I used OpenGL to build the earthquake visualization, and it looked “kind of” good (fortunately I wasn’t able to find screenshots).
Several years passed, technology improved, and I found myself wanting to recreate this visualization with a modern visualization tool: specifically Kepler.gl, a WebGL-based app developed by the mad geniuses at Uber.
Creating the earthquake visualization was significantly easier and quicker this time around.
Creating the dataset
I was able to retrieve earthquake records from NCEDC, which had not abandoned its 50-year mission of capturing earthquake data. It provides an API (http://service.ncedc.org/) to fetch earthquake records in the following format:
There is an enormous amount of seismic data available through this API, not all of which qualifies as what we might consider an earthquake. Fortunately, the API allows users to filter by parameters such as magnitude. This is my final query:
Using the above query, I created a dataset of ~55,000 earthquake records by filtering for events of magnitude 2.5 and above. I downloaded it as a CSV file.
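For readers who want to reproduce this, a magnitude-filtered request can be assembled along these lines. This is a sketch, not my exact query: the endpoint path and parameter names below follow the FDSN event-service convention that NCEDC exposes, and the dates are illustrative.

```python
from urllib.parse import urlencode

# Assumed FDSN-style event endpoint on NCEDC's service host.
BASE = "http://service.ncedc.org/fdsnws/event/1/query"

# Illustrative parameters: a catalog window and a magnitude floor.
params = {
    "starttime": "1967-01-01",   # start of the NCEDC catalog
    "endtime": "2016-12-31",     # illustrative end date
    "minmagnitude": 2.5,         # keep only 2.5+ magnitude events
    "format": "text",            # plain-text catalog, easy to convert to CSV
}

query_url = BASE + "?" + urlencode(params)
print(query_url)
```

Pasting the resulting URL into a browser (or fetching it with any HTTP client) returns the filtered catalog as text.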
Each row contained:
Without adjusting any parameters, this is what kepler.gl displayed immediately after loading my CSV (Figure 3).
I was impressed that the application automatically detected the geo fields (the `Latitude` and `Longitude` columns) in my dataset, particularly since kepler.gl is entirely browser-based and nothing is sent to remote services. All ~55,000 data points showed up in just a few seconds without crashing the browser.
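kepler.gl’s detection is more sophisticated than this, but the core idea can be sketched as simple name matching over the CSV header. This is my own toy version, not kepler.gl’s actual code:

```python
# Toy sketch of name-based geo-field detection; the real implementation
# also checks value ranges and many more column-name variants.
LAT_NAMES = {"lat", "latitude"}
LNG_NAMES = {"lng", "lon", "long", "longitude"}

def detect_geo_fields(columns):
    """Return the (latitude, longitude) column names, or None if absent."""
    lat = next((c for c in columns if c.lower() in LAT_NAMES), None)
    lng = next((c for c in columns if c.lower() in LNG_NAMES), None)
    return (lat, lng) if lat and lng else None

print(detect_geo_fields(["DateTime", "Latitude", "Longitude", "Magnitude"]))
# -> ('Latitude', 'Longitude')
```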
At this point, I thought “this is nice, but I need to gather more information about the dataset.” I decided to adjust kepler.gl’s layer configuration to make the results clearer.
kepler.gl lets you customize several attributes of your visualization, e.g. point radius and color, so I began by updating the properties of the “artwork”. After a few rough attempts, I decided to customize my point layer based on the magnitude of each earthquake: each point’s radius and color would be determined by the magnitude of that particular earthquake (Figure 4).
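kepler.gl can export its configuration as JSON, and a point layer driven by magnitude looks roughly like the fragment below (written here as a Python dict). Treat this as a sketch of the exported config’s shape, not a verbatim export; a real export carries many more fields.

```python
# Sketch of a kepler.gl point-layer config; structure approximates the
# app's exported JSON and the field values are illustrative.
layer_config = {
    "type": "point",
    "config": {
        "label": "earthquakes",
        "columns": {"lat": "Latitude", "lng": "Longitude"},
    },
    # visualChannels map data columns to visual properties: here both the
    # point color and the point radius are scaled by earthquake magnitude.
    "visualChannels": {
        "colorField": {"name": "Magnitude", "type": "real"},
        "sizeField": {"name": "Magnitude", "type": "real"},
    },
}

print(layer_config["visualChannels"]["sizeField"]["name"])
```

Saving and re-importing a config like this is also a convenient way to apply the same styling to a refreshed dataset.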
After updating a few settings, I was able to create the following visualization:
The newly created visualization is intuitive and provides a quick, easy way to spot stronger earthquakes in the dataset.
With only a few minutes of actually using kepler.gl, I was able to replicate the work I did back in college with greater clarity and a significantly more professional aesthetic.
Most importantly, by looking at the above map, you can spot big earthquakes and understand their distribution.
After I was satisfied with the color palette and various settings, I decided to start using some filters.
kepler.gl can filter on any column of the loaded dataset; for this dataset, the options were latitude, longitude, and timestamp. I decided to create a filter using the timestamp of each earthquake (Figure 6).
When I used the timestamp as a filter, kepler.gl performed a domain analysis on the values and generated a histogram of their distribution, so I did not have to squint at the map and guess at the optimal range.
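That kind of domain analysis is easy to approximate outside the app: bucket the event timestamps by year and count. Here is a stdlib sketch over a few synthetic rows (not real catalog entries):

```python
from collections import Counter
from datetime import datetime

# Synthetic (timestamp, magnitude) rows; the real dataset has ~55,000
# records spanning 1967 to the present.
events = [
    ("1989-10-18T00:04:15", 6.9),
    ("1989-10-18T00:41:22", 4.8),
    ("1994-01-17T12:30:55", 6.7),
    ("2014-08-24T10:20:44", 6.0),
]

# Bucket events by year: the same idea behind the filter's histogram.
per_year = Counter(datetime.fromisoformat(ts).year for ts, _mag in events)
print(sorted(per_year.items()))
# -> [(1989, 2), (1994, 1), (2014, 1)]
```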
By clicking the clock icon at the top right of the filter panel, shown in Figure 6, the application opens a timeline panel that can be used to scrub through earthquakes over time, shown in Figure 8. This makes it easy to see how the data changes over time.
After spending several hours working with this large dataset in kepler.gl, I was struck by the app’s performance in data loading, data manipulation, map interaction, and filtering.
The smart functionality that automatically detects columns is incredibly useful, not just for saving time but also for surfacing insights in the dataset that might otherwise go unnoticed or take much longer to discover.