The Process of Making Data Viz of NYC Traffic

Boys On Hudson
Computational Media
2 min read · Dec 16, 2017

Collecting Data
After deciding to build a data visualization of New York City traffic, James and I dived right into searching for all sorts of subway and taxi data. Looking on the MTA website came naturally to us, but it later became one of our biggest challenges, because the MTA used the GTFS (General Transit Feed Specification) format instead of a more common format like JSON or CSV. To solve the problem, we used a Python utility called “gtfs_realtime_json”. As for road traffic data, we first found the traffic speed feed on the DOT website, but it only covered highways. In the end, we decided to use the example taxi trip data from Uber.

Analyzing Data
The MTA GTFS data was far more complicated than we expected, but it described train delays, cancellations and reroutes. To analyze the data efficiently, James used a hash table structure instead of simply running for loops. This way, each object was guaranteed to contain the data for the same train from the different GTFS files.
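As a rough illustration of the hash table idea (a sketch, not the project's actual code), the Python snippet below keys records by train so that data from different files lands in the same object. The field names (`trip_id`, `route_id`, `stop_id`, `arrival`) and the sample rows are assumptions made for the example.

```python
# Hypothetical rows standing in for records parsed from two GTFS files.
trips = [
    {"trip_id": "A1", "route_id": "A"},
    {"trip_id": "L7", "route_id": "L"},
]
stop_times = [
    {"trip_id": "A1", "stop_id": "A02", "arrival": "08:00:00"},
    {"trip_id": "L7", "stop_id": "L01", "arrival": "08:01:00"},
    {"trip_id": "A1", "stop_id": "A03", "arrival": "08:03:00"},
]


def merge_by_trip(trips, stop_times):
    """Group records from both lists under one dict entry per train."""
    # Build the hash table once: trip_id -> merged object.
    merged = {
        t["trip_id"]: {"route_id": t["route_id"], "stops": []}
        for t in trips
    }
    # Each lookup is a constant-time dict access, so there is no need
    # for a nested loop that rescans every trip for every stop time.
    for st in stop_times:
        entry = merged.get(st["trip_id"])
        if entry is not None:
            entry["stops"].append((st["stop_id"], st["arrival"]))
    return merged


merged = merge_by_trip(trips, stop_times)
```

With nested for loops, every stop time would be compared against every trip. With the dictionary, each record is touched once, which matters when the feed contains thousands of rows.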

Displaying Data on the Map
To display the results, we used Mapbox to render the interactive tile map, and Mappa to wrap our p5.js code as a canvas layered on top of Mapbox. Because the movements of trains and taxis were time-based, it made more sense to draw them with p5.js, while static information such as station names was displayed through Mapbox.

Improving Performance
When we saw the initial result, we were pretty happy that it was moving, but we then noticed that the frame rate gradually dropped over time. We thought it would make the user experience worse. So, we went to office hours with Shawn and decided to do two things to resolve the problem. The first was reducing the number of points used to render the train routes; the second was showing only dots for the taxis instead of moving trails. The end result was better than we anticipated!
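One simple way to cut down the points on a route is to drop any point that sits too close to the last one kept. The post does not say which method was actually used, so the Python sketch below is only one possible approach; the function name `simplify_route`, the `min_dist` threshold and the sample coordinates are all made up for the example.

```python
import math


def simplify_route(points, min_dist):
    """Keep the endpoints, and keep an interior point only if it is at
    least min_dist away from the previously kept point."""
    if len(points) <= 2:
        return list(points)
    kept = [points[0]]
    for p in points[1:-1]:
        if math.dist(p, kept[-1]) >= min_dist:
            kept.append(p)
    # Always keep the last point so the route still ends at the station.
    kept.append(points[-1])
    return kept


# Hypothetical route: six points along a straight line.
route = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (1.0, 0.0), (1.1, 0.0), (2.0, 0.0)]
simplified = simplify_route(route, min_dist=0.5)
# simplified == [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
```

Fewer points per route means fewer vertices to draw on every frame. A larger `min_dist` gives a coarser but cheaper line, so the threshold would need tuning against how the route looks at the map's zoom level.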
