Imagine you were a student at Cambridge University in 17th century England. Let’s say you wanted to go down to London. We’re not talking about a huge distance here — just under 65 miles. You could drive that today in about an hour and a half.
You’d probably be pretty excited to make this journey, because an incredible new piece of technology was just starting to reach broad adoption, one that was going to make your life on this trip many times more comfortable and convenient than in the past: the stagecoach.
The stagecoach was an amazing technological advancement. Previous wagons didn’t really have suspension systems, meaning they were frequently off-balance and really uncomfortable to ride in. It was a common occurrence for drivers to instruct passengers to lean out one side of the wagon and then the other, to balance it and keep it from overturning. These wagons also didn’t have real seats, just benches.
So the stagecoach was a huge improvement. It introduced one of the first suspension systems, making rides far more comfortable, and even had seats with backs on them. So you would be thrilled to get to ride in one of these things. And you had better be, because you’d be riding in it for a long time. Even as late as 1750, the Cambridge to London route took a full two days. That’s moving at a hearty pace of about 1.3 miles an hour.
Why was this trip so slow? It wasn’t the stagecoach itself; these things could go way faster than that. The problem was with the roads. You see, English law at the time put the responsibility for maintaining roads with local parishes, or towns. This worked great for roads that the townspeople used regularly, but really fell apart when it came to the highways, which were principally used by long-distance travellers.
So to fix this, Parliament started creating entities called “Turnpike Trusts.” These organizations managed resources from several parishes through which highways passed, and collected tolls from travellers. They used these funds to maintain and improve the highways. They erected milestones, telling travellers the distance between major towns. And they set the first rules of the road, mandating things like driving on the left-hand side.
Throughout the 17th and 18th centuries, the Turnpike Trusts helped road transport overtake ships as the best way to move between England’s booming towns. They also helped drive innovations in stagecoach technology: better roads meant lighter and faster coaches could be designed. Faster coaches meant more traffic, which meant more tolls to further improve the road. And so on.
And most importantly for our Cambridge University student, the Trusts helped cut that brutal two-day Cambridge to London trip in 1750 down to just 7 hours by 1820.
So why am I telling you all this?
It’s not because I’m a transportation history buff, although I do think this stuff is pretty cool. It’s because when I look at the civic technology space today, I see a whole lot of incredible stagecoaches, but really poor roads to drive them on.
I and my colleagues at Google are in awe of the impactful work being done by everyone in this room and by others working in this sector worldwide. Together we are improving public service delivery, engaging more citizens in the political process, and making governments more accountable around the world.
By many accounts our sector is booming. Many of you have likely seen this report from the Knight Foundation, showing that between 2008 and 2012 our field grew annually by 23%. More people are getting involved, more tools and services are being built, more investments are being made, and most importantly more people’s lives are being improved.
But we’re not living up to our potential. Yet.
You can see this on a number of fronts. For years now we’ve been on the cusp of having our democratic systems fundamentally transformed by the internet. We see exciting changes in campaigning and fundraising, but when it comes down to making big decisions our model of politics hasn’t really changed. The biggest stories about this area in the past year have been overwhelmingly negative, focusing on times when government’s use of technology falls down or goes too far. And perhaps most importantly, public trust in government continues to decline, pushing back against one of the long-held beliefs of our space: that increased openness through technology can increase public confidence.
We are doing important work and it is helping people. But too much of our work today is the equivalent of building better stagecoaches in the 17th century. We have produced incredible, innovative technologies, but they are being prevented from achieving maximum impact because we lack necessary public infrastructure. We don’t have good roads to drive our stagecoaches on.
We need to shift our conversations away from the latest shiny apps and towards infrastructure and collaboration.
I’m going to talk today about three ways we can make that happen.
First, we must be clear that open isn’t enough. The phrase “open data” should always be accompanied by words like “structured”, “licensed”, and “updated.”
Second, it’s time to put a spotlight on interoperable data. We’re currently just scratching the surface of what open data can enable. Making it easy to combine different datasets and APIs is key to taking our work to the next level.
And third, we need to focus on building ecosystems, not apps. New models of collaboration across our community will help us offer services that have a sustainable and proven impact.
First, open isn’t enough.
The incredible growth of open data and open government in recent years has been an amazing thing. But our focus on that single adjective has made it too easy for many to think that their responsibilities start and end with being open, and this is holding us back.
All too often, open data is not delivered in a way that creates incentives for its re-use. Too much data sits out there, waiting for someone to use it, never achieving its purpose. Or when data is used, everyone building on top of it repeats the same work to get it into a usable format.
To quantify this, let’s look at some research done by my team at Google and others into open data publishing. When looking at the US and Europe, much of what is published comes in the form of spreadsheets and documents instead of structured data formats. Particularly in Europe, spreadsheets far exceed structured data. Spreadsheets and documents can be useful for some types of data and for performing one-off analysis, but are often not what developers want when building reliable apps and services.
Releasing data in a structured format makes it technically usable by developers, but it also needs to be legally usable. This means being released under a license that permits reuse, and often modifications of the data. Without proper licensing, open data might be usable at hackathons or in side projects. But if you’re looking to build a business on top of open data, you can’t accept the risks of using unlicensed data and hoping for the best — you need the certainty that comes from a clear license.
Unfortunately, this hardly ever happens. In fact, the overwhelming majority of open data across 100 top portals has no clear license at all. Now, in many cases licensing for these datasets might be covered by local laws or other regulations, but not clearly mentioned on the portal. That’s not okay. Developers aren’t lawyers — we need licenses to be clearly presented and easily understandable, or we won’t be able to use the data. Public domain releases like Creative Commons Zero are the most useful here, giving developers broad permission to turn the data into something useful without worrying about license terms.
Open data is also infrequently updated, and in many cases when new data is available it’s simply released as a new dataset rather than updates to an old one. Again looking at 100 top open data portals, barely any datasets have ever been updated. There is one clear outlier here — it looks like the Kenyans are schooling the rest of us in keeping their data updated.
Without timely updates, data is fine for demos and one-off projects, but it will never provide a strong enough case for serious developers to adopt it. Imagine what would happen if we just launched the Google Maps API once and never updated it as new businesses and roads opened. How many apps would actually be built on top of it?
These principles are essential: structured, updated, and licensed. We give “open” prime billing, but that just gives data publishers an excuse to ignore these other key aspects that are just as important to making data useful.
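One way to picture these three principles is as a check that a portal could run over its own catalogue. What follows is only a rough Python sketch: the metadata fields, the list of structured formats, and the one-year freshness threshold are all assumptions made up for illustration, not any portal’s real schema.

```python
from datetime import date

# Assumed, non-exhaustive list of formats treated as "structured".
STRUCTURED_FORMATS = {"csv", "json", "xml"}


def open_data_gaps(meta, today, max_age_days=365):
    """Return which of the three principles a dataset's metadata fails."""
    gaps = []
    # Structured: the file format is one developers can parse reliably.
    if (meta.get("format") or "").lower() not in STRUCTURED_FORMATS:
        gaps.append("structured")
    # Licensed: some explicit license is stated alongside the data.
    if not meta.get("license"):
        gaps.append("licensed")
    # Updated: the dataset has changed within the assumed freshness window.
    last = meta.get("last_updated")
    if last is None or (today - last).days > max_age_days:
        gaps.append("updated")
    return gaps


# Hypothetical catalogue entries (all values invented).
stale = {"format": "XLS", "license": None, "last_updated": date(2012, 1, 15)}
healthy = {"format": "csv", "license": "CC0-1.0", "last_updated": date(2014, 6, 1)}

stale_gaps = open_data_gaps(stale, today=date(2014, 7, 1))
healthy_gaps = open_data_gaps(healthy, today=date(2014, 7, 1))
```

In this sketch the first entry fails on all three counts and the second on none. A dataset that is merely “open” can still look like the first one.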
Some people are doing this very well. The UK, for example, licenses most of its datasets under the Open Government Licence and has more data in CSV than XLS. Outside government, the Frictionless Data project I know many of you contribute to is making strides towards a lightweight standard for ensuring open data is structured and usable. We’re making progress, but too many others releasing data still stop at open alone. It’s up to all of us to call this out, and extend our push for openness to give equal weight to structure, updates, and licenses. Just as England’s turnpikes made great progress through the first rules of the road like driving on the left, our civic technology infrastructure will be far stronger when we build these new rules of the road into all of our work.
Once we have truly useful and usable data, we can start to focus on the next level, something that we as a community are just starting to understand and advance. Interoperability is the key to unlocking new potential in open data.
Today it’s far too hard to bring multiple open datasets together. Let’s say I want to build an app that tells you who donated to your local councillor’s last election campaign. There are two steps to that: first, figuring out who your local councillor is, and then looking up their donors. There are plenty of datasets or APIs out there that cover each of those needs. But automatically matching the councillor who represents a given person against the list of councillors in the finance data can become a very tricky problem, because the two sources rarely share a common identifier. You might be able to do this manually for small datasets, but overall, without interoperability, we’re stuck just scratching the surface of what open data can do.
Interoperability is also key for expansion. I might build and launch my campaign donors app in Berlin. But if I want to turn that app into a successful business, just serving Berlin probably isn’t enough. Today the costs of expanding my app to other cities in Germany, Europe, or the rest of the world are far too high, because similar open datasets generally are not interoperable across geographies. This keeps innovative uses of open data from growing beyond side projects into real, scalable businesses.
Interoperability isn’t unique to our space. The success of many of the technologies we use every day is rooted in it. Think about how we’re all connected to the internet in this room: through Wi-Fi. We’ve come together from all over the world using devices from any number of manufacturers, and yet we can all open up our laptops or turn on our phones and tablets, enter a password, and get online. Think about how frustrating it’d be if Mac and PC users had to connect to different networks, or if visitors from America couldn’t connect to this European network.
So Wi-Fi’s clearly enabled massive scale through interoperability. But it’s also led to significant innovation. When the 802.11 spec was first designed in 1988, it wasn’t for personal computers at all: it came from a group of corporations seeking to connect wireless cash registers. Its proponents only later picked up on the incredible potential of wireless networking for personal computers. They couldn’t have imagined the explosion of applications of Wi-Fi today, from connecting new devices like TVs, refrigerators, and watches to new methods of connectivity, like portable hotspots or in-flight Wi-Fi. An interoperable spec opened the floodgates to all manner of innovation.
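To make the councillor example concrete, here is a minimal Python sketch of the matching problem. The two records, their field names, and the councillor herself are invented for illustration; none of it comes from a real dataset or API.

```python
# Hypothetical records from two separate open datasets (all values invented).
representatives = [{"ward": "Ward 5", "name": "Dr. Jane A. Smith"}]
donations = [{"candidate": "SMITH, Jane", "donor": "Example Ltd", "amount": 250}]

TITLES = {"dr", "mr", "mrs", "ms", "cllr"}


def find_donors_by_name(rep_name, donation_rows):
    """Naive join: match on the exact name string."""
    return [row for row in donation_rows if row["candidate"] == rep_name]


def normalise(name):
    """Crude clean-up: reorder 'SURNAME, First', drop titles and initials."""
    if "," in name:
        surname, given = name.split(",", 1)
        name = f"{given} {surname}"
    words = [w.strip(".").lower() for w in name.split()]
    return tuple(w for w in words if len(w) > 1 and w not in TITLES)


rep_name = representatives[0]["name"]

# The exact match misses the donation, because the names are spelled differently.
exact = find_donors_by_name(rep_name, donations)

# Hand-written clean-up rules recover it, but only for spellings we anticipated.
fuzzy = [row for row in donations
         if normalise(row["candidate"]) == normalise(rep_name)]
```

Here the exact match comes back empty and the cleaned-up match finds the one donation. But the clean-up rules are guesses: they would happily merge two different Jane Smiths, and every new dataset needs its own set of them.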
We are making some strong early progress towards data interoperability. The Popolo Project has set out to produce international specs for legislative data, which have been adopted by civil society groups on multiple continents. And my team at Google helped start the Open Civic Data project along with the Sunlight Foundation, Open North, and Granicus. One of the unique elements of this project is a focus on interoperable identifiers, meaning that data can still be published in differing formats but with common identifiers for key elements like political jurisdictions to enable easy connections between data.
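Here is a small Python sketch of why shared identifiers help, using the councillor and donations example from earlier. It is an illustration under assumptions, not the Open Civic Data project’s actual format: the identifier strings are invented placeholders, and the field names and records are made up.

```python
# Two publishers, two formats, one shared jurisdiction identifier.
# These identifiers are invented placeholders, not real IDs.
WARD_5 = "example-division/country:xx/city:exampleville/ward:5"
WARD_6 = "example-division/country:xx/city:exampleville/ward:6"

# Publisher A: a list of records, as might come from a JSON API.
representatives = [
    {"division_id": WARD_5, "councillor": "Dr. Jane A. Smith"},
]

# Publisher B: CSV-style rows with different columns and name spellings.
donation_rows = [
    ("SMITH, Jane", WARD_5, "Example Ltd", 250),
    ("JONES, Sam", WARD_6, "Other Co", 100),
]


def donors_for_division(division_id, rows):
    """Join on the shared identifier instead of on free-text names."""
    return [(donor, amount)
            for _candidate, row_division, donor, amount in rows
            if row_division == division_id]


ward_5_donors = donors_for_division(representatives[0]["division_id"], donation_rows)
```

The two publishers never had to agree on a file format or on how to spell a name. They only had to agree on what to call Ward 5.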
We’re off to a good start but there is much work left to be done. Data interoperability won’t just help scale our tools around the world and to all levels of government. It will unlock new innovations on top of our work that we can’t even imagine today. It won’t be long before civic technology in 2014 looks as simple as when Wi-Fi was just for connecting personal computers. But we have to work together and prioritize interoperability to make that happen.
Ecosystems, Not Apps
I’ve said “working together” a few times, and I wanted to close by talking about what it means to do more of that. We need to focus on building ecosystems, not apps.
Here’s a pattern that’s repeated all too frequently. A government agency decides to open up data. They bring in a civil society partner to host a hackathon. A few exciting prototypes get produced, and some positive stories get written about the agency for being open and innovative. But then fast forward a few months, and in all likelihood none of the hackathon apps are being used or maintained, and the newly opened data is barely being touched.
The problem here is that hackathons produce apps, when we need ecosystems. It’s easy to talk about the need for a “killer app” on open data, but that just doesn’t exist. The success stories in our space aren’t individual killer apps — they’re healthy ecosystems that involve many organizations.
For an idea of how we can build more ecosystems, once again we can turn to the English turnpikes for inspiration. When the local governments and others invested in roads didn’t have the right incentives to improve long-distance highways, the English set up Turnpike Trusts to bring together governments, companies, and road users and focus on the infrastructure for them all to thrive. We need our own Turnpike Trusts to build new ecosystems in civic technology.
We’re starting to see more ecosystems like this emerge in our space. One great example that was highlighted earlier this week is Code for Germany, a project of Open Knowledge Germany that Google is proud to be a supporter of. Since launching in February, Code for Germany has set up labs in 14 cities across the country, and driven over 4,000 hours of volunteer development work. Focusing on the ecosystem is helping them build tools and quickly spread them across the country, reducing the amount of repeated work taking place in different cities.
To see how this type of collaboration can lead to a thriving ecosystem over a longer period of time, we can look to an effort my team’s been privileged to be a part of for 7 years now: the Voting Information Project. Back in 2007, many groups in the US were struggling with how to help voters find where to vote and other key information online. Google was one of these — our interest came from a group of engineers reviewing broad aggregate search trends, and seeing spikes every election for queries like [where do i vote].
When we dug into these spiking queries, we found we did a really bad job at answering them. Elections in the US are administered by hundreds of different state and county governments, and of course they all did things differently. In 2007, voting information was only online at all in 11 of 50 states.
So along with the Pew Charitable Trusts, we founded the Voting Information Project, which set out to collect and standardize this information. What’s notable here is how different groups came together to accomplish this. After we jointly established a standard data format for voting information, Pew funded civil society groups to work with local governments to get their data into the new standard. At Google, we then ingested and processed all of this data, and served it to users searching for voting information. But we didn’t just use it ourselves. We provided analysis on issues with the data back to governments, and opened the data back up to any developer through an API that made it easier to use. This unlocked hundreds of new uses of the data, everything from building an SMS service to integrating it into the homepage of CNN.com.
This type of collaboration goes far beyond hackathons and conferences — it’s enabled a healthy ecosystem of funders, corporations, civil society, and governments. And the results have been incredible. We’ve provided nationwide voting information for the past three US national elections, and are on track for our fourth this November. In 2012 we saw over 24 million lookups from over 600 different sites and apps. The ecosystem approach has also been key to keeping the project sustained over the years. Strong investments from funders and private corporations have kept the project on track despite changes in government or turnover in civil society groups.
These types of projects and many others are successful because they focus on the ecosystem, rather than an individual app. Bringing together disparate interests from many types of organizations and ensuring regular collaboration helps us achieve scale and sustain projects over time.
I hope you’ll start thinking about the infrastructure we can build together to accelerate all of our work.
Let’s shift the conversation to be clear that open isn’t enough, but is just part of a path towards structured, updated, and licensed data.
Let’s invest in data interoperability to make it easier to scale successful work globally and unlock new innovations we can’t even imagine today.
And let’s stop focusing on the next killer app, and focus instead on building healthy ecosystems.
We at Google are thrilled and humbled to be part of such an amazing community, and we’re ready to dig in and work with all of you to make this possible.
Let’s go build roads together.