Our cofounder and CTO, Dan Getelman, originally published this post on Medium. We’re reposting it here.
Open data is here, and it is being used to help build businesses and help cities make better decisions. For years, technologists have been clamoring for governments to open up more of their data. It’s now happening. Cities of all sizes, from San Francisco to Asheville, have open data portals filled with machine-readable data. Los Angeles, San Diego, and Philadelphia, among others, have even created executive Chief Data Officer positions that are responsible for opening data across departments. But now that the data is out there, how is it actually getting used?
I can tell you what we’re up to at Remix. We’re a quickly growing company that uses open data to help cities plan better public transit. Here are a couple of examples of how we take open data and make it available for cities to make better decisions. I’ve also linked to sources and places you can find all the data we use, so be sure to follow the links! (Plug: If you’re interested in building great interfaces using open data, we’re hiring front-end engineers!)
OpenStreetMap and Its Ecosystem
Everything in Remix is built on OpenStreetMap and different services and software packages in its ecosystem. In the photo above, all of the data you see in the map is coming from OpenStreetMap. We use Mapbox as a service to both generate the tiles that make up the map imagery, and to keep them up to date. OpenStreetMap has matured incredibly over the last couple of years, and offers us a number of advantages. First, the data is both accurate and easy to change. This is incredibly valuable to us, since it means that our customers immediately have correct data. And if it’s not quite right, it’s very easy to contribute those fixes back and quickly have them reflected in our product.
One of the coolest advantages of OpenStreetMap, though, is in our routing. We’re able to use Valhalla, a project from Mapzen, to do dynamic routing. This lets us snap to roads as you draw routes. Being able to do that based on the easily updatable data I mentioned above is cool in itself, but there’s another advantage. OpenStreetMap actually has a tag for bus-only roads (psv, for “public service vehicle”), which means that we can use existing metadata to route bus lines over bus-only roads. This would be impossible with any other standard routing engine.
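To make the bus-only idea concrete, here is a minimal sketch of how a router could decide whether a bus or a car may use a road from its OpenStreetMap tags. This is an illustration only, not Valhalla's actual logic: the function names are hypothetical, and the key ordering is a simplified slice of the OSM access hierarchy in which the most specific tag wins.

```python
# Hypothetical helpers, not Valhalla's implementation.
# Keys are checked from most specific to least specific; the first one
# present on the way decides the answer.
BUS_ACCESS_KEYS = ("bus", "psv", "motor_vehicle", "vehicle", "access")
CAR_ACCESS_KEYS = ("motorcar", "motor_vehicle", "vehicle", "access")
ALLOWED_VALUES = {"yes", "designated", "permissive"}


def _can_use(tags, keys):
    """Return True if the most specific access tag present allows travel."""
    for key in keys:
        value = tags.get(key)
        if value is not None:
            return value in ALLOWED_VALUES
    # No access restriction tagged at all: assume the road is open.
    return True


def bus_can_use(tags):
    return _can_use(tags, BUS_ACCESS_KEYS)


def car_can_use(tags):
    return _can_use(tags, CAR_ACCESS_KEYS)


# A bus-only road: closed in general, but open to public service vehicles.
busway = {"highway": "service", "access": "no", "psv": "yes"}
print(bus_can_use(busway), car_can_use(busway))
```

With tags like these, a bus route can be snapped to the busway while general traffic is routed around it.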
Using GTFS To Show Planners Their Network
Our users are transit planners — the city employees that decide where each bus goes, and how often it shows up. Without open data, often the best representation of their current transit system is the printed paper schedule. However, in recent years, agencies have begun publishing all of their schedule data in a format called the General Transit Feed Specification (GTFS). You can usually find this data on the transit agency’s website, or on aggregators like TransitFeeds.
This data is output directly by the cities’ bus scheduling systems, and is far too granular to be directly useful to planners. It also comes as a zip of CSVs, which isn’t easily digestible either. But it’s in a machine-readable format, and that has let us do a bit of magic to work backwards from the super-granular schedule to the planners’ intent. We can take open data and give cities a better picture of the systems they’re running.
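As a rough illustration of working backwards from the schedule, the sketch below reads a GTFS zip and estimates how often each route runs. It is not how Remix does it: `median_headways` is a hypothetical helper, and it deliberately ignores `service_id` and direction, so it assumes every trip in the feed runs on the same day in the same direction. The file and column names (`trips.txt`, `stop_times.txt`, `trip_id`, `route_id`, `departure_time`) come from the GTFS specification.

```python
import csv
import io
import zipfile
from collections import defaultdict
from statistics import median


def read_table(feed, name):
    """Read one CSV file from an open GTFS zip into a list of dicts."""
    with feed.open(name) as raw:
        text = io.TextIOWrapper(raw, encoding="utf-8-sig")
        return list(csv.DictReader(text))


def to_minutes(hhmmss):
    """Convert a GTFS time such as '25:10:00' to minutes after midnight."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 60 + minutes + seconds / 60


def median_headways(feed_path):
    """Return {route_id: median minutes between consecutive trip starts}."""
    with zipfile.ZipFile(feed_path) as feed:
        trips = read_table(feed, "trips.txt")
        stop_times = read_table(feed, "stop_times.txt")

    route_of_trip = {trip["trip_id"]: trip["route_id"] for trip in trips}

    # The earliest departure of each trip approximates when the trip starts.
    first_departure = {}
    for row in stop_times:
        if not row["departure_time"]:
            continue  # Times are optional at non-timepoint stops.
        minutes = to_minutes(row["departure_time"])
        trip_id = row["trip_id"]
        if trip_id not in first_departure or minutes < first_departure[trip_id]:
            first_departure[trip_id] = minutes

    starts_by_route = defaultdict(list)
    for trip_id, minutes in first_departure.items():
        if trip_id in route_of_trip:
            starts_by_route[route_of_trip[trip_id]].append(minutes)

    headways = {}
    for route_id, starts in starts_by_route.items():
        starts.sort()
        gaps = [later - earlier for earlier, later in zip(starts, starts[1:])]
        if gaps:
            headways[route_id] = median(gaps)
    return headways
```

Calling `median_headways("feed.zip")` on a downloaded feed (the file name is a placeholder) returns a number like 15 for a route that leaves every quarter hour, which is much closer to how a planner thinks about service than thousands of rows of stop times.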
Using the Census to Drive Better Decisions
I’d be remiss if I didn’t mention the U.S. Census, or “the original open data,” as my colleague Danny refers to it. The Census has been around since 1790, but can often be unwieldy to work with. We’ve been able to take this data, all of which is freely available, and place it right in front of transit planners when they’re making decisions.
We were able to use some great existing open source tools, such as PostGIS, GDAL, Mapnik, and TileStache, to generate maps for a variety of factors, like population, jobs, people with low incomes, seniors, and more.
We’re also able to take the high granularity of the Census data, and give an instant calculation of how many people are near a route, and their demographic characteristics. To get this kind of insight before, agencies would have had to go through an arduous analysis process. However, because the data is openly available, we’re able to build it right into the product and make it available at their fingertips.
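The core of that calculation can be sketched in a few lines. The version below is a simplified stand-in under stated assumptions, not our production code: it treats each Census block as a single centroid with a population count, assumes all coordinates are already projected to meters, and uses a hypothetical `population_near_route` helper with an illustrative 400 meter walking radius. A spatial database like PostGIS would normally do this with real block geometries.

```python
import math


def distance_to_segment(point, start, end):
    """Shortest planar distance from a point to the segment start-end."""
    (px, py), (ax, ay), (bx, by) = point, start, end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: both ends are the same point.
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def population_near_route(route, blocks, radius_m=400.0):
    """Sum the population of blocks whose centroid is within radius_m.

    route  -- list of at least two (x, y) points in meters
    blocks -- list of ((x, y), population) pairs, one per Census block
    """
    segments = list(zip(route, route[1:]))
    total = 0
    for centroid, population in blocks:
        nearest = min(distance_to_segment(centroid, a, b) for a, b in segments)
        if nearest <= radius_m:
            total += population
    return total


# Made-up example data: a straight 1 km route and three blocks.
route = [(0, 0), (1000, 0)]
blocks = [((500, 300), 120), ((500, 500), 80), ((1200, 0), 50)]
print(population_near_route(route, blocks))  # prints 170
```

Swapping the population count for any other Census field (jobs, seniors, low-income households) gives the matching demographic total for the same route.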
The Impact
The important piece of this isn’t that open data is pretty, but that it gives city transit planners the information they need to make better decisions and justify them. Our team hears from planners in cities of all sizes, from Marvin in Sandusky to Max near Seattle, about how Remix has changed the way they plan service changes and allowed them to evaluate and justify alternatives more efficiently. If any of this sounds interesting, we’re hiring!