A Right to the Digital City

A response to the Smart London ‘new deal for city data’ — Andrew Eland & Richard Pope

Earlier this year, the London Chief Digital Officer and the Smart London board started a listening exercise to help develop the Smart London plan. The following is our response to their question of how to define a new deal for city data.

Summary

How to make London a smart city is, perhaps, the wrong question. A better question might be: how can London use digital tools to improve the lives of people who live, visit or work here?

How can we house them better, make them safer and healthier, and maximise their happiness? What infrastructure needs to be in place to enable all this? How can we do this while giving people agency over the data about them? How can we give them democratic control over digital services?

Fundamentally, what should a digital city be like? To start to answer this, we think there are three things that the Mayor, Chief Digital Officer and the Smart London Board will need to do:

First is the adoption of open standards, the development of definitive data registries and open APIs. These are the foundations the GLA, London boroughs, private sector and others will need to build upon.

Next, the use of data is set against increasing public concern over the effects of technology on our society. If Londoners are going to trust a digital city, then privacy, transparency and accountability must be core principles. Consumer technology and Silicon Valley struggle in these domains, leaving the opportunity for London to define and promote the worldwide standards for this emerging field.

Finally, the real prize here is new and improved services for Londoners. Redesigning everyday things Londoners rely on — things like getting a school place, commenting on a planning application or joining a housing list — represents an opportunity to improve millions of lives. To do this, London will need to invest in the digital capability of the GLA, London Boroughs and the private sector.

Below, we describe those three areas in more detail, talk through some of the challenges, give examples from different sectors that show what is possible and provide some practical recommendations.


1. Open standards, data and APIs

London depends on its infrastructure. The Tube supports up to 5 million journeys per day. The Thames Water Ring Main carries 0.3 gigalitres of drinking water a day. In the words of the Open Data Institute: data is as important as our road, railway and energy networks and should be treated as such.

TfL doesn’t discriminate between Londoners travelling on the Tube to go to school or to go shopping. Passengers don’t have to buy different tickets for different tube lines. In the same way, London’s data infrastructure should be open for all to use and operate in standard and expected ways.

We think that means adopting a mix of open standards, definitive data registries and open APIs. This open approach will make it easier and quicker for any company or organisation to develop services that meet the needs of Londoners. Lowering the barrier to providing services can enable companies to meet a wider set of needs, including those of people who are often not catered for by lowest-common-denominator services, such as single parents or the elderly.

Definitive data registries

The open data movement has begun to demonstrate the power of data, though it has been hampered and shaped by the difficulty of liberating data from existing systems. Historically, open data efforts have taken the approach of publishing snapshots of data, generally in non-standard formats and on an ad-hoc schedule. There has been little coordination between boroughs. It is now time to move beyond this initial approach which, while necessary to gain early traction, now hinders progress.

The development and maintenance of definitive data registries — agreed lists of facts like bin collection days, protected views, locations of physical infrastructure, boundaries of school catchment limits and conservation areas — will be critical for enabling other services. Reliance on ad-hoc formats and schedules limits the reliability of downstream services to an unacceptable level.
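As a rough illustration of what separates a definitive register from a published snapshot, the sketch below models a register as an append-only list of versioned entries under a stable key. All names here (`Register`, `Entry`, the example conservation-area record) are hypothetical and are not taken from any existing GLA or GDS system.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Entry:
    key: str               # stable identifier, never reused for another thing
    item: Dict[str, str]   # the facts recorded for this key
    entry_date: date       # when this version became current


@dataclass
class Register:
    name: str
    entries: List[Entry] = field(default_factory=list)

    def append(self, key: str, item: Dict[str, str], entry_date: date) -> None:
        """Record a new version; older versions are kept, never overwritten."""
        self.entries.append(Entry(key, dict(item), entry_date))

    def current(self, key: str) -> Optional[Dict[str, str]]:
        """Return the most recently appended item for a key, or None."""
        for entry in reversed(self.entries):
            if entry.key == key:
                return entry.item
        return None

    def history(self, key: str) -> List[Entry]:
        """Return every version ever recorded for a key, oldest first."""
        return [e for e in self.entries if e.key == key]


# Made-up example data.
register = Register("conservation-area")
register.append("CA-001", {"name": "Example Square"}, date(2018, 1, 10))
register.append("CA-001", {"name": "Example Square (extended)"}, date(2018, 6, 1))
```

Because entries are only ever appended, a downstream service can ask both "what is true now?" and "what was recorded before?", which an ad-hoc snapshot cannot reliably answer.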

The work of the Government Digital Service Registers Team in building definitive registers of things like local authorities and countries is allowing digital teams across government to build services faster and more consistently. Beyond government, these registers are being used by Tisc Report to track modern slavery compliance.

Evolving standards for open data

The nature of London local government means that any attempt to agree standards, from the aggregation of pollution sensor data to planning applications, will require the GLA and boroughs to work together.

There have been previous attempts at agreeing standards, such as the Local Waste Standards project, and the Open Election Data Project, but getting agreement between local authorities seems to be hard.

However, this does not mean it is impossible. Lessons can be learnt from successful, large-scale, standardisation efforts outside local government, from OpenStreetMap to FHIR and IETF. Each effort favours progress, working implementations, public evolution and concrete use cases over perfection and completeness.

The Cabinet Office solicits suggestions for the adoption of existing open standards via GitHub. Businesses, citizens or civil servants can suggest areas that may benefit from the adoption of standards. There is then an open process for the suggestion and adoption of open standards.

Significant effort and good will from the GLA, boroughs and the open data community will be needed if we are going to see open standards adopted across the city.

From open data to open APIs

Definitive data registries are not enough on their own. Both data and shared services need to be made available through open APIs for others to build upon. This is common practice in the private sector and increasingly in government.

For example, it’s possible to book hotels, flights and train travel through any one of a large number of competing online services, and this competition leads to improved user experiences. The provision of those services requires more than just train time data; it also requires supporting APIs to issue tickets.

Similarly, banking services in the UK are currently introducing open APIs as part of the Open Banking Standard. This will allow the development of new financial services, such as real-time budgeting applications.

Through its GOV.UK Pay component, the Government Digital Service has shown that it is possible to offer shared platforms for central and local government to take online payments. By acting as a thin layer over commercial payment providers, it ensures central control of the user experience and accessibility, while also preventing government services from being locked into a single provider.

London could emulate the approach of GOV.UK Pay by, for example, creating an open taxi dispatch system that acted as a thin layer over commercial providers. This would ensure TfL could maintain quality of operators, while also acting as a counterbalance to the risk of dominance from a single provider, such as Uber, avoiding lock-in. Ride Austin, a non-profit that built such a system in the wake of Uber and Lyft withdrawing service from their city, shows what’s possible.
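To make the "thin layer" idea more concrete, here is a minimal sketch of a dispatch layer that talks to several operators through one shared interface. It is an illustration under stated assumptions, not a description of how GOV.UK Pay or Ride Austin is built: `Provider`, `FixedProvider`, `Dispatcher`, the `licensed` flag and all operator data are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol


@dataclass(frozen=True)
class Quote:
    provider: str
    eta_minutes: int
    price_pence: int


class Provider(Protocol):
    """What the city-run layer asks of every commercial operator (hypothetical)."""

    name: str

    def quote(self, pickup: str, dropoff: str) -> Optional[Quote]:
        ...


class FixedProvider:
    """Stand-in for a real operator; always answers with the same ETA and price."""

    def __init__(self, name: str, eta_minutes: int, price_pence: int,
                 licensed: bool = True):
        self.name = name
        self.eta_minutes = eta_minutes
        self.price_pence = price_pence
        self.licensed = licensed

    def quote(self, pickup: str, dropoff: str) -> Optional[Quote]:
        return Quote(self.name, self.eta_minutes, self.price_pence)


class Dispatcher:
    """The thin layer: one interface for riders, many operators behind it."""

    def __init__(self, providers):
        # The regulator decides who is listed; here, only licensed operators.
        self.providers = [p for p in providers if getattr(p, "licensed", False)]

    def quotes(self, pickup: str, dropoff: str) -> List[Quote]:
        results = [p.quote(pickup, dropoff) for p in self.providers]
        # Drop operators that declined to quote, then order by soonest arrival.
        return sorted((q for q in results if q is not None),
                      key=lambda q: q.eta_minutes)


dispatcher = Dispatcher([
    FixedProvider("Operator A", eta_minutes=7, price_pence=1200),
    FixedProvider("Operator B", eta_minutes=4, price_pence=1500),
    FixedProvider("Operator C", eta_minutes=2, price_pence=900, licensed=False),
])
all_quotes = dispatcher.quotes("Street A", "Street B")
best = all_quotes[0]
```

In this sketch the quality control and the ordering rule live in the shared layer, so swapping or adding an operator does not change what the rider sees.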

Carefully designed open APIs are a powerful mechanism to not only protect against monopolies, but also to allow the kind of interventions that the private sector does not have the incentives to make. For example, with real-time analytics TfL could ensure taxi availability met demand, without an excess that causes both congestion and air pollution.

Similarly, Open311, which is in use by many cities, defines an open API for tracking issues such as broken streetlights. With the right support, the API presents a possible path to scale services such as FixMyStreet, through embedding in applications with user traction (e.g. Citymapper, Google Maps) on one side, and within local government workflows on the other.

Improving public services using performance data

There is an opportunity to improve the performance of public services through near real-time analytics. This helps policy makers make better decisions, but also allows the public to participate in conversations about those decisions in new ways. Being clear and open about what data is collected and what purpose it is put to will be critical for public trust. As such, data used for policy development and implementation should, whenever possible, be publicly available under an open license.

To take full advantage of the opportunity, there will need to be an understanding of where the gaps in performance data exist today. The GLA currently only has access to a subset of data about how public services and infrastructure are performing. The nature of government in London means that building the whole picture will require aggregating data from organisations as diverse as the NHS and the boroughs, while preserving user control and privacy.

This is obviously a significant task. One approach could be for the GLA to provide a set of open APIs for the ingest of performance data from other organisations, or failing that, agree standards for what the data should look like. APIs for shared services can also act as a collection point for performance data.

Finally, data analytics should not be seen as a panacea for improving public services. They are obviously only part of the solution.

Recommendations

  • Create a cross-discipline group, with representation from within local government and outside, to prioritise and develop the definition of open standards and APIs between London boroughs.
  • Run an ongoing open standards challenge to allow the public and technology companies to suggest open standards for adoption.
  • Build the processes, standards and tools to maintain critical London-wide datasets as definitive registers.
  • Kickstart the move from open data to open APIs with a significant, impactful platform project on the scale of Ride Austin.

2. Designing for rights, accountability and trust

In their recent report, Doteveryone found people feel disempowered by a lack of transparency in how digital services operate and many people were unable to find out how their data is used.

As cities become more digital, what would it mean if people were unable to trust the place they live? For example, would fewer people visit sexual health clinics if they were unsure whether their sensor-tracked movements might later be shared publicly?

At the same time, are the institutions the public currently rely on to represent their interests equipped to manage this in a digital city? For example, is the UK planning system set up to deal with dynamic physical structures in the public realm relying on sensors?

As more services are digitised, more sensors deployed and more objects connected to networks, it is essential to consider how those things can be explained to the public, agreed upon, and made accountable. Together, these will help to build public trust.

Thinking beyond anonymity

Common sense suggests that once data has been aggregated (for example, the individual speeds of vehicles using a particular street turned into an average speed), or stripped of identifiable data (license plate numbers removed or obscured from a table of measured vehicle speeds), it may be considered anonymous; it can no longer be attributed to the behaviour of an individual.

This is an important concept, as it underpins current policies towards data sharing without the explicit consent of individuals represented within that data. So long as data cannot be attributed to an individual, and an individual cannot be identified from that data, the controls on sharing can be relaxed.

Unfortunately, recent advances in privacy theory have shown these promises of anonymity cannot always be met, and that even the most innocuous data can be attributed to individuals by cross-referencing a data set against others. Anonymised data released by the New York Taxi & Limousine Commission on the utilisation of taxis was combined with license plate data and magazine photos to determine the movement of celebrities. The Governor of Massachusetts’ medical data was reidentified from summaries stripped of identifiers, and many other examples exist. The increased variety and volume of data inherent in the use of modern software systems and sensors reduces the difficulty of re-identification, by providing broader data for cross referencing. While the explicit re-identification process is complex, modern machine learning techniques can potentially result in unintended outcomes where this occurs implicitly.
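The cross-referencing described above can be sketched in a few lines. The example below uses entirely made-up data and hypothetical field names; it is not a reconstruction of any real re-identification, only an illustration of why removing names does not by itself remove identity.

```python
from collections import defaultdict

# "Anonymised" release: no names, but pickup time and place are retained.
anonymised_trips = [
    {"pickup": ("2018-03-01 19:05", "Street A"), "dropoff": "Street X"},
    {"pickup": ("2018-03-01 19:20", "Street B"), "dropoff": "Street Y"},
    {"pickup": ("2018-03-01 19:20", "Street B"), "dropoff": "Street Z"},
]

# A second, public dataset: who was seen getting into a taxi, where and when.
public_sightings = [
    {"person": "Person P", "seen": ("2018-03-01 19:05", "Street A")},
    {"person": "Person Q", "seen": ("2018-03-01 19:20", "Street B")},
]


def link(trips, sightings):
    """Return {person: dropoff} for every sighting that matches exactly one trip."""
    by_pickup = defaultdict(list)
    for trip in trips:
        by_pickup[trip["pickup"]].append(trip)

    matches = {}
    for sighting in sightings:
        candidates = by_pickup.get(sighting["seen"], [])
        if len(candidates) == 1:  # a unique match re-identifies the trip
            matches[sighting["person"]] = candidates[0]["dropoff"]
    return matches


reidentified = link(anonymised_trips, public_sightings)
```

Here Person P is re-identified because only one trip matches the sighting, while Person Q is protected only by the coincidence that two trips share the same pickup. The more columns a release contains, the rarer such coincidences become.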

Recently, mathematical techniques have been developed that provide well defined levels of privacy protection, even when data can be cross-referenced. The current gold standard in these techniques is differential privacy. While it is the best option, the technique has significant drawbacks. Among other issues, privacy is provided by adding noise to data to obscure the behaviour of individuals, which reduces its utility. Fundamentally, there is a tradeoff between utility and privacy. In some problem domains it is easy to find a reasonable balance; in others, particularly the publication of large datasets, this is harder. Additionally, differential privacy requires specialist technical skills to implement, and does not provide protection for the behaviour of groups. As such, it is currently only practical to apply it to the most sensitive of data.
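The utility and privacy tradeoff can be seen in a minimal sketch of the Laplace mechanism, one standard building block of differential privacy, applied to a simple counting query. The function name, the speed data and the choice of epsilon values are assumptions made for illustration; a production system would need far more care (for example, tracking the total privacy budget across queries).

```python
import random


def private_count(values, predicate, epsilon, rng=random):
    """Answer "how many values satisfy predicate?" with Laplace noise.

    Adding or removing one individual changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for v in values if predicate(v))
    scale = 1.0 / epsilon
    # The difference of two independent exponential draws with mean `scale`
    # is Laplace-distributed with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Made-up vehicle speeds (mph) on one street.
speeds = [18, 22, 25, 31, 33, 28, 40, 19, 35, 27]
true_speeding = sum(1 for s in speeds if s > 30)


def is_speeding(speed):
    return speed > 30


rng = random.Random(42)
strong_privacy = [private_count(speeds, is_speeding, 0.1, rng) for _ in range(2000)]
weak_privacy = [private_count(speeds, is_speeding, 1.0, rng) for _ in range(2000)]


def mean_abs_error(answers):
    return sum(abs(a - true_speeding) for a in answers) / len(answers)
```

A smaller epsilon gives stronger privacy but noisier answers: in this sketch the answers produced with epsilon 0.1 are, on average, much further from the true count than those produced with epsilon 1.0.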

Our current reliance on anonymity doesn’t just ignore a complex privacy reality, it has stalled the creation of the processes to decide how data should be used. Without these, it’s difficult to determine how data can be reasonably used. The London-wide analysis of individual pedestrian journeys through sensors and cameras is today, rightly, unthinkable. This data, however, could aid air quality improvement through the creation of accurate particulate matter exposure models, and validate important initiatives such as healthy streets. The purchase of non-prescription medicines could be used to detect emerging public health issues, though the analysis of purchase data is understandably controversial. Dan Hill imagines a future in which city-wide biometric detection is used to constantly optimise services for citizens, while contemporary punks wear algorithm defeating makeup — we need an inclusive and democratic process for deciding whether that’s a future London wants.

There is an opportunity for London to define new methods to determine the reasonable use of data about its citizens; methods that address the design, technology and regulatory complexities of data that reflects many of us at the same time. Doing this successfully would enable progress on some of the most complex issues facing the city, issues that require the integration of datasets that would not be politically possible today.

Transparency and accountability

It is a requirement of a healthy society that people are able to trust the services they rely on, and that they have agency to fix things when they go wrong. It follows that any organisation seeking to make use of data needs to build and maintain the trust of the individuals represented by that data, and of the society it serves.

Building transparency and accountability into services is the central mechanism to build trust. However, there is no single way of doing this — it requires a multidisciplinary approach that includes user research, systems and interaction design, technology and policy.

A transparent service explains, in terms understandable by those it affects, how and why data is used. It allows individuals to make informed consent decisions, and verify those decisions are respected by the eventual behaviour of the service. Accountability for services handling data has been broken down, by academics studying machine learning, into the principles of responsibility, explainability, accuracy, auditability and fairness. The implications of violating these principles are becoming increasingly clear.

For example, ProPublica investigated racial bias in the software used to conduct risk assessment in the US criminal justice system. Job search tools promote higher paying jobs to men. An algorithm managing Medicaid in Arkansas cut benefits without explaining those decisions.

Techniques to address these issues, political and technical, are nascent. New York’s city government passed an algorithmic accountability bill into law and established a task force to bring transparency and accountability to automated decision-making by the city’s agencies — according to ProPublica, an initiative driven by serving politician James Vacca following their investigation. This bill prompted the AI Now Institute to issue a set of recommendations that align with many in this response. Nesta also issued similar recommendations.

The ability to build transparent services is predicated on a basic level of understanding by citizens of how data can be used, independently of specific services. While this is part of the underlying digital literacy challenge, domain specific efforts exist. Wellcome’s Understanding Patient Data project attempts to improve understanding for health data specifically. The GLA should decide what educational role it needs to play from the perspective of city data.

Consent for data use is predicated on trust — trust that data will only be used for purposes aligned with the values of those individuals, and for purposes that will not cause them harm. Violate that trust, and consent will be withdrawn. If explicit consent was sought, people will revoke it, by no longer using an application, or opting out of services; if implicit consent was relied upon, public campaigns to limit use will form.

Asking for consent and giving people effective agency over the data held about them is a hard design problem. People generally don’t change the settings available to them in software, so hiding consent choices away in a preferences centre or dashboard is not a good option. To avoid this, software, including the Android operating system, now increasingly asks for consent at the point of use. Many of these patterns are captured in a catalogue developed by London based consultancy IF.

UK banking startup Monzo have a public product development roadmap, allowing customers to verify future product plans match their current expectations, while involving them in their design process through an active community. Similarly, Co-op Digital talk candidly about the development of their projects.

When planning mechanisms to exchange sensitive health data, the Connected Health Cities effort in Manchester convened a citizens’ jury, who heard evidence from experts before publicly answering a set of questions defined in advance to guide the project. In a similar vein, Tom Steinberg has written about the concept of Personal Data Representatives.

Designing systems and processes that allow for effective auditing of how data is being used is also increasingly important. As the capabilities of software increase, and the decisions we ask it to make become more important — for example for sentencing guidelines within the criminal justice system — it is important to be able to understand why a decision was made at a point in time.

In this spirit, TunnelBear, a Canadian VPN provider, commissioned and published an independent security audit of their services. The aim of the audit was to prove to their users that they are using data in a secure and ethical way.

Similarly, DeepMind Health has a group of independent reviewers who report on the organisation’s use of data, and is developing technology to maintain a provably verifiable audit log of the data used by an algorithm, and the decisions reached.
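One simple and well-known way to make a log tamper-evident is a hash chain, in which each entry's hash covers the previous entry's hash. The sketch below illustrates that general idea only; it is not a description of DeepMind Health's system, and the record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, record):
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log, record):
    """Add a record to the log, chaining it to the current last entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "hash": _digest(prev_hash, record)})


def verify(log):
    """Recompute every hash; any edited, removed or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append(audit_log, {"dataset": "example-dataset", "purpose": "model training"})
append(audit_log, {"dataset": "example-dataset", "purpose": "decision", "outcome": "refer"})
intact = verify(audit_log)

# Quietly rewriting history is detectable.
audit_log[0]["record"]["purpose"] = "marketing"
after_tampering = verify(audit_log)
```

A chain like this is only tamper-evident if the most recent hash is regularly shared with an independent party, such as an external reviewer; otherwise the log's owner could rewrite the whole chain.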

Meanwhile, the fundamental technical concepts and limits of accountability are currently a topic of active research.

Sensor technology

Modern communication standards, such as LoRa, provide a means of remotely collecting sensor data that’s economical in power and cost, aiding widespread deployment of long-lived devices. Advances in sensing techniques are also reducing cost in many domains allowing, for example, the cheap measurement of important air quality characteristics, such as particulate matter. Improvements in batteries and solar power increase sensor placement flexibility and longevity. Developments in the field of machine learning allow the estimation of human interaction metrics, such as pedestrian footfall and dwell time, from video recorded on low cost cameras.

Increasingly, sensors aren’t just standalone devices, but an integral component of a new category of urban technology with potential to improve the lives of Londoners. CityTree and Smog Free Tower are two novel air purification systems. Copenhagen uses LEDs embedded in cycle lanes to present a physical green wave, driven by sensors that ensure all bicycles in a group make it through junctions together. DriveWAVE by MIT’s Senseable City Lab shows what future junctions could look like.

Sensors present particular and significant challenges to transparency due to their physical form. In many cases it will not be possible for a passer-by to determine whether a given physical object collects data via sensors or not, let alone determine how and why data is being used. These issues need to be addressed upfront, as systems are designed. The transparency risk profile associated with sensors in the public realm suggests explicit governance should be created, with the existing planning process being one candidate vehicle. A coordinated effort will be necessary to avoid adverse reaction to individual projects restricting the potential of future efforts by others.

Many of these systems, and the software supporting them, are new products under active development, rather than well understood commodities. This presents a unique opportunity to ensure that the final products reflect the ideals of transparency and accountability, evolving them publicly — a technique spearheaded in the physical environment by the urban prototyping movement.

Ultimately, architect Zaha Hadid imagined a symbiotic relationship between the development of technology and urban form: there is a strong reciprocal relationship whereby our more ambitious design visions encourage the continuing development of the new digital technologies … and those new developments in turn inspire us to push the design envelope ever further. Embedding transparency and accountability in urban technology will keep this future open to us.

Recommendations

  • Develop simple tools for Londoners to request the data held about them, and understand what data is used to make decisions that affect them.
  • Develop a governance process and supporting tools to allow Londoners to engage with and influence consent decisions for data use, with an emphasis on understandability through user research.
  • Develop legislation requiring transparency measures for sensors deployed in the public realm, including an open register of deployed sensors, and investment in privacy & ethics training for the officers involved.
  • Support the development by research institutions of differential privacy based techniques to protect London’s most sensitive and valuable data, reducing the barriers to use.
  • Embed the concept of algorithmic accountability into London’s regulations and procurement processes, following New York’s example.

3. Better services

The real opportunity of a digital city is better services that positively impact the lives of millions of Londoners — addressing the real issues facing London, from housing to air pollution.

A well-executed data infrastructure for London will make it significantly easier to build services that meet the needs of Londoners. These services can be provided by the GLA, by London Boroughs and by commercial or third sector organisations. These can overlap and complement each other.

As such, it would be a wasted opportunity if only the most superficial areas were investigated, or the ambition of the GLA was limited by products already available on the market.

Using service design to design London’s services

The principles of service design should be used to determine where improved use of data would have the most impact on the lives of Londoners. Building modern, theoretically generalisable, data infrastructure without the messiness of concrete, real world, use cases is unlikely to result in improved services or positive outcomes for Londoners. A service design led approach was pioneered in local government by the likes of FutureGov, Snook and DXW, is now the standard in the UK government sector, and is increasingly being adopted worldwide, from San Francisco to Ontario.

As technology blends into the urban environment, the service design approach should follow. Gehl have pioneered the application of user research to public space, and make their protocols, and schemas for data collection, publicly available through the Gehl Institute. Dan Hill imagines a future in which those involved in designing services for the elderly build empathy by wearing heavy exoskeletons that restrict movement.

There is a role for central coordination to ensure focus is placed on the hard, but important, problems. The GDS Service Standard attempts to incentivise this at the national level.

Building capability

To deliver on the potential of a truly digital city — to actually deploy better services — London will need to invest in the digital capability of the GLA, London Boroughs and the private sector.

For the GLA and London Boroughs, this means building a shared delivery capability that can start to build the infrastructure and services that London needs. It also means ensuring leaders, politicians and planning & procurement officers have a working understanding of both technology and issues of digital ethics.

In the context of the private sector, it is important that London grows a tech sector that is both diverse and has a strong understanding of digital ethics. The Mayor’s commitment to increasing the diversity of London’s tech sector is particularly welcome.

Recommendations


Conclusion

Responding to emerging urban issues from gentrification to forced evictions & the privatisation of public space, the United Nations 2016 New Urban Agenda historically incorporated the concept of a “right to the city”: We share a vision of cities for all, referring to the equal use and enjoyment of cities and human settlements, seeking to promote inclusivity and ensure that all inhabitants, of present and future generations, without discrimination of any kind, are able to inhabit and produce just, safe, healthy, accessible, affordable, resilient and sustainable cities.

Richard Rogers presents a vision of technology and the city becoming inseparable: buildings, the city and its citizens will be one inseparable organism sheltered by a perfectly fitting, ever-changing framework … constantly adjusting through electronic … self-programing. The National Infrastructure Commission describes the more concrete concept of a digital twin for public infrastructure in their Data for Public Good report.

It is time to unify these concepts to define a “right to the digital city”: access to evolving digital services that both fit and meaningfully shape the city. Services that are open, transparent, accountable and inclusive.

About the authors

Andrew Eland is Director of Engineering for health at DeepMind, a London based artificial intelligence company. Richard Pope is COO at IF, a London based digital rights consultancy, and a fellow of the Harvard Kennedy School of Government. This response was written in our capacity as Londoners. We’d like to thank Sarah Drinkwater and Sarah Gold for their extensive input.