Open Data for Public Transportation

And why I am building a GTFS manager

How many times have you used Google Maps to find your way somewhere by public transportation? If you live in a big city, I guess it happens really often. But have you ever wondered how Google Maps gets this information from the public transportation providers? Well, I couldn’t have cared less about this until last year, when I had the chance to find out the whole story behind it.

It was May 2013 when RATP, the company running the Parisian metro, organised a big event and a hackathon (#opendatalab) to celebrate the release of the data for its lines. It didn’t seem very attractive, but there was supposed to be free pizza, so I was totally in. So some awesome people from my team at work and I registered to take part. The day of the hackathon started out really tough for me. All the presentations were in French, and my French back then was at bonjour level. However, the presentations were full of links, and links are universal. I started typing and exploring what these guys were trying to say. Et voilà! Within a couple of minutes I had discovered the core of the open data format for public transportation: GTFS feeds. That was how Google Maps (along with anybody else) was getting all the information.

But wait! What is GTFS?

The General Transit Feed Specification (GTFS) defines a common format for public transportation schedules and associated geographic information. GTFS “feeds” allow public transit agencies to publish their transit data and developers to write applications that consume that data in an interoperable way.

So the point is this: public transportation providers expose their data in a common format so that anyone can download, understand and use it freely. And that’s really cool if you consider how app creators could take advantage of it and develop some awesome apps.

GTFS is a rather flexible format. The data is stored in a series of .txt files (which are in fact CSV-formatted) that describe every aspect of the services provided by the transportation agencies (routes, stops, timetables, fares, etc.). Some files are required and others are optional, so you can provide as many of the optional ones as you like. However, the more you have, the better you describe your services. Here is the list of the available files:

agency.txt, stops.txt, routes.txt, trips.txt, stop_times.txt, calendar.txt, calendar_dates.txt, fare_attributes.txt, fare_rules.txt, shapes.txt, frequencies.txt, transfers.txt, feed_info.txt
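To make the format a bit more concrete, here is a minimal Python sketch of how an application might open a feed, check which files are present and read one of them as plain CSV. It assumes the feed is distributed as a zip archive. The set of files treated as required and the tiny sample feed built in memory are assumptions for illustration only, so check the GTFS reference for the authoritative list.

```python
import csv
import io
import zipfile

# Assumption: the files this sketch treats as mandatory. The GTFS reference
# is the authority on what is actually required.
REQUIRED_FILES = {"agency.txt", "stops.txt", "routes.txt", "trips.txt", "stop_times.txt"}


def missing_required_files(feed):
    """Return the names from REQUIRED_FILES that are absent from the archive."""
    return REQUIRED_FILES - set(feed.namelist())


def read_table(feed, name):
    """Read one GTFS file as a list of dicts keyed by the CSV header row."""
    with feed.open(name) as raw:
        # utf-8-sig tolerates the byte order mark that some feeds start with.
        text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
        return list(csv.DictReader(text))


def build_sample_feed():
    """Build a tiny, deliberately incomplete feed in memory (made-up data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "agency.txt",
            "agency_name,agency_url,agency_timezone\n"
            "Demo Agency,http://example.com,Europe/Paris\n",
        )
        archive.writestr(
            "stops.txt",
            "stop_id,stop_name,stop_lat,stop_lon\n"
            "S1,Chatelet,48.8584,2.3470\n",
        )
    return buffer.getvalue()


sample = zipfile.ZipFile(io.BytesIO(build_sample_feed()))
stops = read_table(sample, "stops.txt")
missing = missing_required_files(sample)
```

Here `stops` holds one dict per row of stops.txt, and `missing` names the files the sample feed would still need before it could be considered complete.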

However, as good as Open Data for public transportation sounds, creating these GTFS feeds requires a big effort from the transportation agencies. This is because most of them use proprietary systems for managing their network or, even worse, do not use any system for network management at all. So if you were the manager of a transportation company, your options would be to either build a way to export the data from your system to the GTFS format or build the GTFS feed manually from scratch. Sometimes the first option is not feasible, so the manual way is the only way to go. And it’s not an easy way at all.
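As a rough illustration of the first option, here is a minimal Python sketch of an export step that maps records from an in-house system to the rows of stops.txt. The internal record layout (`code`, `label`, `latitude`, `longitude`) and the `export_stops` helper are hypothetical; a real system would have its own schema and many more files to produce.

```python
import csv
import io

# Hypothetical records as they might come out of an agency's own system.
internal_stops = [
    {"code": "S1", "label": "Main Street", "latitude": 48.85, "longitude": 2.35},
    {"code": "S2", "label": "City Hall", "latitude": 48.86, "longitude": 2.36},
]

# Column names taken from the GTFS stops.txt file.
STOP_FIELDS = ["stop_id", "stop_name", "stop_lat", "stop_lon"]


def export_stops(records):
    """Render internal stop records as the text content of stops.txt."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=STOP_FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(
            {
                "stop_id": record["code"],
                "stop_name": record["label"],
                "stop_lat": record["latitude"],
                "stop_lon": record["longitude"],
            }
        )
    return out.getvalue()


stops_txt = export_stops(internal_stops)
```

Even in this toy form, the mapping has to be written and maintained for every file in the feed, which hints at why agencies without a suitable system end up doing the work by hand.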

What if there were tools for creating and managing GTFS feeds?

That was my thought from the first moment. There should be some tools doing that, right? In fact there are, but are they good enough to make the service providers use them? Well, I don’t really think so. The fact is that all of them lack usability: they either provide only minor functionality, or provide more functionality but with non-intuitive interfaces (see figure below).

Example of an existing GTFS manager. By yTransit

So how are we supposed to convince the transportation agencies to adopt Open Data if there are no appropriate tools to work with? That’s the reason I started imagining how I would build such a tool. I had the essentials in mind. A tool like this should be:

  • Easy to use: So you don’t need a PhD to understand how it operates.
  • Flexible: To be able to work on the move, on a PC or on your tablet. To monitor the network in real time or edit the feeds.
  • Fast: Minimum time needed to perform a task. Everything in the UI should make sense, and computations should be fast.
  • Scalable: Because you never know how large your agency can become.

A new GTFS Manager is born

With all that in mind, I started building the tool the way I imagined it. I decided to go with a web app (instead of a desktop/native app) to meet the flexibility expectation. Node.js was the tool to start with, since I also wanted my GTFS manager to be fast and scalable. Along with Node.js, I needed something to store the data on the server side. The most obvious option was MongoDB since, at the time of writing, it is the most favoured database for Node.js. This gives me a homogeneous flow of data in JSON format running from the database down to the client, which makes parsing an easy task. Many would argue that MongoDB still has a lot of issues, and I couldn’t agree more. But overall it performs well, and so far I don’t regret using it. Since a lot of computation will happen on the client side, I wanted something really powerful to handle the data. There were many options, but I decided to go with AngularJS since it is the most promising among the competition (Backbone.js, Ember.js, etc.). For the layout, Bootstrap was a wise choice since it brings in some nice elements, so I don’t have to spend a lot of time writing stuff that already exists. But enough with the technical details, let me show you how it looks now!

Searching and editing a stop

Searching a stop through the interactive map

Searching and editing a route

Editing the timetables

I am super excited about the progress so far, and I believe that soon I will have something really useful. In Open Data we trust!

P.S.1: At RATP’s hackathon we won a prize ☺ and it was awesome!

P.S.2: In France, hackathons end at the end of the day and start again in the morning. No sleepovers :P

P.S.3: You can find some already existing GTFS feeds here:

Thanks for reading!!!