A Saturday on Strava, Mapped

Strava Engineering
Aug 13, 2013

On Saturday, July 20th, Strava athletes covered 4,890,000 miles in over 326,000 combined hours. To see what all that riding looks like, I built the Saturday on Strava Heatmap, a visual of all those miles broken down by hour.

First, some quick tips. The initial page shows the world from 12:00–1:00 pm in everyone’s local time. Use the orange slider at the top to step through the hours, and zoom into any location for better detail. Individual data points are visible at the highest zoom level.

Limiting the data to a single day makes it easy to see group rides moving along their routes. The Dunwich Dynamo, an annual “through the night” ride from London to Dunwich, stood out the most. At first I thought it was bad data, since those riders were the only ones out in the very early morning hours. The hot weather of Tucson, AZ also brings out early riders, with the Shoot Out leaving at 6am.

While you can probably find your local Saturday ride or race on the map, some other highlights include the Tahoe Trail 100, Alpe d’Huez two days after the Tour, and Central Park before 6am.

Creating the map took a couple of days of computing time and involved a few steps: downloading the ride data, aggregating it, refining the colors and generating the tiles. I used a mix of Go and C, and much of the code was based on some heatmap work I did last year. Go was mainly used as a scripting language to pull all the data together, while C allowed for full control over memory management.

To respect privacy zones, the rides were downloaded using the V3 API rather than straight from the datastore. Then all 750 million lat/lng datapoints were broken up by hour and aggregated using a modified quadtree data structure. Each “heat” value on the map represents the number of unique rides corresponding to the datapoints in that pixel. Using rides instead of total datapoints helps to normalize for speed: riders climb more slowly than they descend, so an uphill stretch generates more points than a downhill one. I tried the datapoint method at first, but it just didn’t look right.
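
To make the unique-rides idea concrete, here is a minimal sketch in Go. It is not the code behind the map: it swaps the modified quadtree for a plain hash map, and the `Point` type, the `Heat` function, the Web Mercator projection with 256-pixel tiles and the zoom level are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// Point is one GPS sample from a ride (hypothetical type for this sketch).
type Point struct {
	RideID   int64
	Lat, Lng float64
	Hour     int // local hour of day, 0-23
}

// cell identifies one map pixel during one hour of the day.
type cell struct {
	Hour, X, Y int
}

// pixel projects lat/lng to Web Mercator pixel coordinates at the given
// zoom, assuming 256x256 tiles (an assumption, not stated in the post).
func pixel(lat, lng float64, zoom uint) (int, int) {
	size := float64(uint(256) << zoom)
	x := (lng + 180) / 360 * size
	s := math.Sin(lat * math.Pi / 180)
	y := (0.5 - math.Log((1+s)/(1-s))/(4*math.Pi)) * size
	return int(x), int(y)
}

// Heat returns the number of unique rides per (hour, pixel). A ride that
// drops many samples into one pixel, as on a slow climb, still counts once.
func Heat(points []Point, zoom uint) map[cell]int {
	seen := make(map[cell]map[int64]struct{})
	for _, p := range points {
		x, y := pixel(p.Lat, p.Lng, zoom)
		c := cell{p.Hour, x, y}
		if seen[c] == nil {
			seen[c] = make(map[int64]struct{})
		}
		seen[c][p.RideID] = struct{}{}
	}
	heat := make(map[cell]int, len(seen))
	for c, rides := range seen {
		heat[c] = len(rides)
	}
	return heat
}

func main() {
	pts := []Point{
		// Ride 1 climbs slowly: three samples land in the same pixel.
		{RideID: 1, Lat: 37.8, Lng: -122.4, Hour: 6},
		{RideID: 1, Lat: 37.8, Lng: -122.4, Hour: 6},
		{RideID: 1, Lat: 37.8, Lng: -122.4, Hour: 6},
		// Ride 2 descends quickly: a single sample in that pixel.
		{RideID: 2, Lat: 37.8, Lng: -122.4, Hour: 6},
	}
	x, y := pixel(37.8, -122.4, 10)
	fmt.Println(Heat(pts, 10)[cell{6, x, y}]) // prints 2, not 4
}
```

Counting raw datapoints would give that pixel a heat of 4 and overweight the climber; counting rides gives 2.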

Once that was complete, I talked to the design team and toyed with the colors on a Bay Area subset of the data before generating all 18.5 million unique tiles. The code is fast enough to serve the tiles dynamically, but for this project it was much easier to generate them once and let S3 deal with the hosting.
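
As a rough illustration of the coloring and tile-generation step, the Go sketch below turns a 256x256 grid of ride counts into a PNG tile. The log scaling, the orange ramp, the tile size and the names `heatColor` and `RenderTile` are all hypothetical; the actual color scheme came out of iteration with the design team and is not described here.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/png"
	"math"
)

// heatColor maps a unique-ride count to a color. The log scale and the
// orange ramp are assumptions for this sketch, not the map's real palette.
func heatColor(count, max int) color.NRGBA {
	if count <= 0 {
		return color.NRGBA{} // no rides: fully transparent
	}
	t := math.Log1p(float64(count)) / math.Log1p(float64(max))
	if t > 1 {
		t = 1
	}
	return color.NRGBA{R: 252, G: uint8(76 + t*150), B: 2, A: uint8(80 + t*175)}
}

// RenderTile encodes one 256x256 grid of ride counts as a PNG.
func RenderTile(counts *[256][256]int, max int) ([]byte, error) {
	img := image.NewNRGBA(image.Rect(0, 0, 256, 256))
	for y := 0; y < 256; y++ {
		for x := 0; x < 256; x++ {
			img.SetNRGBA(x, y, heatColor(counts[y][x], max))
		}
	}
	var buf bytes.Buffer
	if err := png.Encode(&buf, img); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	var counts [256][256]int
	for i := 0; i < 256; i++ {
		counts[i][i] = i + 1 // a diagonal "road" that gets busier
	}
	data, err := RenderTile(&counts, 256)
	if err != nil {
		panic(err)
	}
	fmt.Printf("tile is %d bytes of PNG\n", len(data))
}
```

Because each tile depends only on its own grid of counts, tiles like this can be rendered once in a batch and uploaded as static files, which is the trade-off described above.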

I find this map mesmerizing to look at, but sometimes it can bring out more questions than answers. So have a look at your ride from July 20th and let me know what you think in the comments below.

Originally published at labs.strava.com by Paul Mach.
