The Bay Area’s Top Stops

Published in Strava Engineering, Aug 29, 2014

Strava has done a lot of work figuring out how athletes move; we have global heatmaps, challenge heatmaps, personal heatmaps and even Strava Metro. But what about where people go to stop? I looked at 4.3 million rides in the San Francisco Bay Area to answer this question and found the best places to meet friends, have coffee or just check out the view.

The results, browsable at labs.strava.com/top-stops, show some interesting patterns. For example, our cyclists tend to favor Peet’s over Starbucks in Los Altos, Cupertino and Orinda, and no one stops at Philz on the weekends. For meeting places, the most popular are the Golden Gate Bridge and near the Woodside Bakery. The data also made us ask questions, like who is the Tuesday/Thursday morning Headlands Raid ride waiting for every week on Hawk Hill?

One of the goals of this project was to automate everything to see if this could be done on a larger scale. So I simply ran the rides through the following steps:

  • Find all the stops over 5 minutes (hopefully that filters out stop lights)
  • Cluster the 2,771,301 stop locations. Instead of building or wiring in a hierarchical clustering algorithm, I aggregated them using the heatmap code which buckets points by pixel.
  • Run the 4,607 pixels with more than 50 stops through the Foursquare venues explore API. I did this several times to make sure all the categories were covered, like coffee shops, markets and parks.
  • Map all the stops to their closest venue and find the 150 most popular.
  • Remap the stops to these popular venues. This helped clean up some data issues with Foursquare but also might have hidden some less popular venues.
  • Finally, I aggregated up all the numbers and created the UI.
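The code behind these steps is not included in this post. As a rough sketch of the first two steps only, the Python below finds stops longer than 5 minutes in a single ride and buckets stop locations by Web Mercator pixel. The track format, the 25 m drift radius, the zoom level and all function names are assumptions made for illustration; only the 5 minute and 50 stop thresholds come from the list above.

```python
import math
from collections import Counter

MIN_STOP_SECONDS = 5 * 60  # from the post: stops over 5 minutes
STOP_RADIUS_M = 25.0       # assumption: allowed GPS drift while stopped
ZOOM = 17                  # assumption: zoom level used for pixel buckets
TILE_SIZE = 256            # standard Web Mercator tile size in pixels


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in meters."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def find_stops(track):
    """Return the (lat, lon) where each long stop began.

    track is a hypothetical list of (unix_seconds, lat, lon), sorted by time.
    A stop is a run of points that stay within STOP_RADIUS_M of the first
    point of the run for more than MIN_STOP_SECONDS.
    """
    stops = []
    i, n = 0, len(track)
    while i < n:
        t0, lat0, lon0 = track[i]
        j = i
        while j + 1 < n and haversine_m(lat0, lon0, track[j + 1][1], track[j + 1][2]) <= STOP_RADIUS_M:
            j += 1
        if track[j][0] - t0 > MIN_STOP_SECONDS:
            stops.append((lat0, lon0))
            i = j + 1  # resume after the stop
        else:
            i += 1
    return stops


def to_pixel(lat, lon, zoom=ZOOM):
    """Project a coordinate to integer Web Mercator pixel coordinates."""
    scale = TILE_SIZE * (2 ** zoom)
    x = (lon + 180.0) / 360.0 * scale
    s = math.sin(math.radians(lat))
    y = (0.5 - math.log((1 + s) / (1 - s)) / (4 * math.pi)) * scale
    return int(x), int(y)


def bucket_stops(stops, zoom=ZOOM):
    """Count stops per pixel, standing in for a clustering algorithm."""
    return Counter(to_pixel(lat, lon, zoom) for lat, lon in stops)


def busy_pixels(counts, min_stops=50):
    """Keep only pixels with more than min_stops stops."""
    return {px: c for px, c in counts.items() if c > min_stops}
```

Bucketing by pixel trades precision for simplicity: the cell size depends on the chosen zoom level, and a cluster that straddles a pixel boundary is split in two, which is one reason a later remapping step is useful.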

I have to admit, I did do some manual work to clean up a few locations where there were two popular venues near each other. For example, I combined Alpine Lake into Alpine Dam and Mount Diablo State Park into Mount Diablo North Summit. But I did find these locations with a script :)
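The script that found these nearby pairs is not shown here. A minimal sketch of the idea, assuming venues are held in a dict of name to (lat, lon) and that "near" means within a chosen distance (300 m below is a guess, not a value from this project), might look like the following. It also sketches the closest-venue counting and the remap to popular venues from the steps above; every name in it is hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) pairs."""
    r = 6371000.0
    p1, p2 = math.radians(a[0]), math.radians(b[0])
    dp = p2 - p1
    dl = math.radians(b[1] - a[1])
    h = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(h))


def nearest_venue(stop, venues):
    """Name of the venue closest to a (lat, lon) stop."""
    return min(venues, key=lambda name: haversine_m(stop, venues[name]))


def top_venues(stops, venues, n=150):
    """Map each stop to its closest venue and return the n most popular."""
    counts = Counter(nearest_venue(s, venues) for s in stops)
    return counts.most_common(n)


def remap_to_popular(stops, venues, popular_names):
    """Reassign every stop to the closest of the popular venues only.

    Assumption: no distance cap is applied, so far-away stops are also
    absorbed by a popular venue.
    """
    popular = {name: venues[name] for name in popular_names}
    return Counter(nearest_venue(s, popular) for s in stops)


def close_pairs(venues, names, max_m=300.0):
    """List pairs of venues within max_m of each other, for manual review."""
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if haversine_m(venues[a], venues[b]) <= max_m
    ]
```

The pairs returned by `close_pairs` are only candidates; deciding which name should absorb the other is still a manual call.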

The resulting 150 locations seem to make sense to me. However, since the process is mostly automated, I would expect that the map is missing some popular spots that are not included in Foursquare’s dataset. If you find some, I’d like to know so that the process can be improved. As for number 2, it’s the top of Old La Honda, where people stop to rest. If you really can’t believe it, you should: there’s even a cyclist hanging out in the Google Street View!

Originally published at labs.strava.com by Paul Mach.
