Holiday Taxi Traffic Trends — NYC Airports

ImageWork Technologies
Nov 20, 2014


At ImageWork, we are constantly looking for ways to make content and information more usable for our customers. Over the last few months we’ve been hard at work overhauling our website, showcasing some of the work we’ve done for our customers, and leveraging big data & content to deliver visually engaging experiences.

An interesting opportunity arose when we came across Chris Whong’s work with NYC Taxi Trip Data. Chris’ own visualization called “NYC Taxis: A Day in the Life” was gorgeous. We wanted to showcase the data set he acquired through a more collective lens.

We wanted to see just how many NYC taxi trips originate at the various NY airport terminals (JFK and LGA) over the holiday season (Nov 15th to Dec 31st). We also wanted to match each airport terminal with its airlines, so that the visualization could be narrowed down to the specific terminal that serves your favorite airline.

We also added an aggregate bar chart to depict how trip counts drop on key holidays. The result is a stunning visualization that shows taxi trip flow between airports and destinations in a mesmerizing manner.

The Stack

On the backend, it was all Google Compute Cloud, Hadoop 2.4.1 (5 nodes), HDFS, Java-MapReduce and Hue. We opted not to use a pre-packaged distribution from Hortonworks or Cloudera — this exercise was as much about understanding Hadoop’s underpinnings as it was about data mining.

We ended up crunching through 173.2M trip records, or 28.85 GB of text. Based on the criteria we set up (a 50–100 m radius around terminal pickup zones), we were left with 270,000 relevant trips. Proximity calculations were done using the tried and tested haversine distance formula.
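For readers curious about the proximity check, here is a minimal sketch of a haversine-based filter in Java. It is an illustration rather than our production MapReduce code: the class name `TerminalProximity`, the method names, and the Earth-radius constant are assumptions made for this example, and the radius is assumed to be measured in meters.

```java
/** Great-circle proximity check; an illustrative sketch, not the project's actual code. */
final class TerminalProximity {
    // Mean Earth radius in meters (a common approximation, assumed here).
    static final double EARTH_RADIUS_M = 6_371_000.0;

    /** Haversine distance in meters between two points given in decimal degrees. */
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);

        double a = Math.sin(dPhi / 2) * Math.sin(dPhi / 2)
                 + Math.cos(phi1) * Math.cos(phi2)
                 * Math.sin(dLambda / 2) * Math.sin(dLambda / 2);

        // Clamp so floating-point rounding can never push the value above 1.
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(Math.min(1.0, a)));
    }

    /** True when a pickup lies within radiusMeters of a (hypothetical) terminal point. */
    static boolean isNearTerminal(double pickupLat, double pickupLon,
                                  double terminalLat, double terminalLon,
                                  double radiusMeters) {
        return haversineMeters(pickupLat, pickupLon, terminalLat, terminalLon) <= radiusMeters;
    }
}
```

A mapper could call something like `isNearTerminal` once per trip record and emit only the trips whose pickup point falls inside a terminal's radius. The terminal coordinates themselves would come from a lookup table that is not shown here.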

The next step was to build out the front-end so that we could share our findings with you, World.

Server:

  • Node.js (modules: Express | SQLite | SQL)

Client:

  • jQuery (DOM, Ajax)
  • d3 (Visualization | Charting)
  • Polyline (decoding polyline strings)
  • Moment (Datetime manipulation and display)

Mapping:

  • Leaflet (POI rendering and other functions)
  • Mapbox (map tile provider)

APIs:

  • Directions API from Open MapQuest

The Team

We had a ton of fun learning, building and sharing this project with you. Happy Holidays!
