Visualizing docked bikeshare movement with Power BI

Daniel Corley
4 min read · May 7, 2019


Photo by Snapwire from Pexels

The Problem

I have always been curious about bike-sharing data, and I love reading the articles others publish on the subject. It is great to see so much love for cycling… but I’ve always felt those articles were missing something. They tell you interesting facts about the data set - which bikes have been ridden the most, average trip duration, male/female breakdown - but not how the bikes are actually being used. I wanted to visualize the bike movement, and found a great way to do so using Microsoft’s Power BI with a custom visualization from MapBox. Here’s a breakdown of my process, with the maps at the bottom (best viewed on a computer - sorry phone people).

Getting Data

To start, I retrieved the trip-level data for three metropolitan bike-share providers from their respective sites. The systems, with the time frames I used in this article, are:

  • Chicago Divvy - (1/1/2016 - 12/31/2018) - 10,999,458 Rides - 1.5GB
  • NYC Citi Bike - (1/1/2017 - 2/28/2019) - 35,530,183 Rides - 6GB
  • Ford GoBike - (6/28/2017 - 2/28/2019) - 2,746,692 Rides - 569MB

Motivate (Lyft) generously makes this data available. It includes start date/time, trip duration, station names, station lat/long, gender, and user type, all of which are used in this report.

Normalizing Data

The available data sets consisted of many CSVs, each holding years of transactional data. This resulted in massive file sizes (the full Citi Bike data set was 12.5GB at the time of writing), so I had to get creative about slimming them down to show as long a time frame as I could. Fortunately, I had a little help from Power BI.

The raw data repeated every detail about every trip, which took up far too much space. Extracting the lat/long and station names into lookup tables quickly decreased the file size and helped create a workable model, since those values could then be removed from my ride data table.
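The work described here was done inside Power BI, but the idea behind the lookup tables is easy to sketch in plain Python. This is only an illustration under assumptions: the field names (`start_name`, `start_lat`, `duration_s`, and so on) are hypothetical and will not match any provider's actual CSV headers.

```python
from typing import Dict, List, Tuple

Station = Tuple[str, float, float]  # (name, lat, lon)


def normalize_rides(rides: List[dict]) -> Tuple[Dict[int, Station], List[dict]]:
    """Split wide ride rows into a station lookup table and slim ride rows.

    Field names are hypothetical; adjust them to the real CSV headers.
    """
    station_ids: Dict[Station, int] = {}
    slim_rides: List[dict] = []

    def station_id(name: str, lat: float, lon: float) -> int:
        # Assign a small integer id the first time a station is seen.
        key = (name, lat, lon)
        if key not in station_ids:
            station_ids[key] = len(station_ids) + 1
        return station_ids[key]

    for ride in rides:
        # Keep only per-trip facts plus two integer keys; the repeated
        # station name and coordinates live once in the lookup table.
        slim_rides.append({
            "start_time": ride["start_time"],
            "duration_s": ride["duration_s"],
            "start_id": station_id(ride["start_name"], ride["start_lat"], ride["start_lon"]),
            "end_id": station_id(ride["end_name"], ride["end_lat"], ride["end_lon"]),
        })

    lookup = {sid: station for station, sid in station_ids.items()}
    return lookup, slim_rides
```

Each ride row now carries two small integers instead of two station names and four coordinates, which is where the size savings come from.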

These systems also state that employee and ‘test’ data has been removed, but I still found a few stations located far outside city limits (3,000 miles away) or with ‘TEST’ in the station’s name (looking at you, Citi Bike). All of this extra data was removed, which made the maps much easier to read.
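As a rough sketch of that cleanup step, the check below rejects stations whose name contains ‘TEST’ or that sit implausibly far from the city. It is not the Power BI logic used for the report: the function names, the city-center coordinates, and the 50-mile cutoff are all assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two lat/long points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def is_plausible_station(name: str, lat: float, lon: float,
                         city_center: tuple, max_miles: float = 50.0) -> bool:
    """Return False for test stations or stations far outside the city.

    max_miles is an assumed cutoff, not a value from the original report.
    """
    if "test" in name.lower():
        return False
    return haversine_miles(lat, lon, city_center[0], city_center[1]) <= max_miles
```

For example, with an approximate New York center of `(40.7128, -74.0060)`, a station recorded at `(0.0, 0.0)` would be dropped, as would any station with ‘TEST’ in its name regardless of location.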

Visualizing

It became apparent that space on the page was going to be limited with all of this data. To stay focused on the movement patterns of the bikes, I chose to split the page 80/20 between maps and filters. Being able to home in on the data with any slicer selected enables real-world scenarios (think rush hour or weekday vs. weekend). The average ride duration and total rides starting under the current filter selection seemed like two worthwhile supporting metrics to show as well.

The two maps on the screen show the starting (left) and ending (right) points for the trips. If you hover over the circles, metrics about those stations show up in the tooltips:

  • Total rides - # of trips starting/ending at that station
  • % overall - share of all filtered rides starting/ending at that station
  • Station - Name of station
  • % round-trip - % of rides that start and end at the same station (starting stations only)
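To make the tooltip metrics concrete, here is a minimal Python sketch of how the starting-station figures could be computed from a list of trips. In the report itself these would be Power BI measures; the function name and the `(start_station, end_station)` tuple layout are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def start_station_metrics(trips: List[Tuple[str, str]]) -> Dict[str, dict]:
    """Compute per-station tooltip metrics for starting stations.

    trips is a list of (start_station, end_station) pairs, already filtered.
    """
    total = len(trips)
    starts = Counter(start for start, _ in trips)
    round_trips = Counter(start for start, end in trips if start == end)

    return {
        station: {
            "total_rides": count,
            # Share of all filtered rides that start at this station.
            "pct_overall": 100 * count / total,
            # Share of this station's rides that end where they started.
            "pct_round_trip": 100 * round_trips[station] / count,
        }
        for station, count in starts.items()
    }
```

The ending-station map would use the same counting with the roles of start and end swapped, minus the round-trip figure.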

After poking through the total rides starting and ending, you can slice further with the filters in the middle. These let you filter to a specific date (July 4th), specific times on specific days (Mon-Fri, 7-9am), or type of user (subscriber vs. casual user). While doing so, you can toggle the auto-zoom on/off feature in the top right of the maps, in case you’re trying to home in on specific stations.
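For readers who want to reproduce a slicer like "Mon-Fri, 7-9am" outside Power BI, the check can be sketched in a few lines of Python. The function name and the choice to treat 9:00 itself as outside the window are assumptions, not something taken from the dashboard.

```python
from datetime import datetime


def is_weekday_morning_rush(start: datetime,
                            first_hour: int = 7, last_hour: int = 9) -> bool:
    """True if a ride starts Mon-Fri with first_hour <= hour < last_hour."""
    is_weekday = start.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and first_hour <= start.hour < last_hour
```

Applying it is a one-line filter, for example `[t for t in start_times if is_weekday_morning_rush(t)]`, where `start_times` is a hypothetical list of ride start datetimes.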

When you’ve selected your favorite filters, you’ll notice a box with control buttons at the very bottom. Press play and watch the rides cycle through the months with your chosen filters. You can pause this to skip through months, or hit the ‘double arrows down’ to drill into weeks and days. Enjoy!

Example of Divvy’s Dashboard

Thoughts? Comments? I’d love to hear them!
