When you need it the most
My friend Alissa is an enthusiastic early adopter of new technologies. She can’t wait for all the promises regarding smart devices to come true. She has welcomed a huge range of smart gadgets into her home and worked hard to orient them to her needs and wishes. She has configured a menagerie of Echos, Alexas, Siris and Google Homes to do her bidding like trained seals. Last Tuesday she had a stroke. The main result was a speech impediment that makes her speech slurred and harder to understand for everyone but the people who have become used to it, like her family and friends. She also has difficulty walking because of reduced control of one leg. Always irrepressible, before she left the hospital she said that it was a good thing she had outfitted her home as a smart home; now she could really take advantage of all that smart functionality.
Her smart menagerie of gadgets betrayed her. It refused to do her bidding. Every request went hopelessly astray. Always patient, Alissa predicted, “they just need to get used to the new me.” As she herself was painfully getting used to the “new Alissa,” she persisted in training the smart devices to understand her requests. She repeated requests again and again until some semblance of what she asked for was produced. Then came the second betrayal. She thought that her devices might need an upgrade, and that the updated software would be smarter, more adept at understanding her requests than her earlier-model devices. The opposite was true: she was back to square one. The smarter devices never recognized, understood or served her, no matter how hard she tried, and she couldn’t go back. Smarter meant more data, and that data would bias the understanding toward more average speech. At a time when she could use the smart things the most, her smart things abandoned her.
What does this have to do with smart cities? I’ve been experimenting with and concerned about what smart systems do with people who deviate from the norm or average. My sense of growing alarm was galvanized by one pivotal experience. While working with the Ministry of Transportation I had the opportunity to try out a number of machine learning models that would guide automated vehicles, or driverless cars, through intersections. These artificial intelligence systems would tell the vehicle to proceed, stop or adjust to avoid obstacles. I wanted to see what these models would decide if they encountered some of my friends and colleagues who do things unexpectedly. I brought a capture of a friend of mine who propels herself backwards in her wheelchair. She has strong legs, but her movements are poorly controlled; while she can’t stand, she can move very fast by pushing her wheelchair with her legs and feet. Many people who encounter her in an intersection are tempted to, and often do, grab her chair and try to push her back onto the sidewalk, thinking she has lost control. When I presented the capture of my friend to the learning models, they all chose to run her over. I was told that the models were immature and not yet smart enough to recognize people in wheelchairs; the developers would expose them to learning data that included many people using wheelchairs in intersections, and I should try again then. When I came back to test the smarter models, they ran her over with greater confidence. I can only presume that they decided, based on the average behavior of wheelchairs, that wheelchairs go in the opposite direction.
The jagged starburst of human data
In reflecting on how we make data-driven decisions or determinations, I should have expected this. If you plot the needs and characteristics of any population or group of people on a multivariate scatter-plot (a picture of complex data, because people are complex), you come up with a recurring pattern called a normal distribution, also known as the Gaussian curve or the Bell Curve (e.g., used by teachers to guide grading). In a tallied, two-dimensional form it looks like a bell. If you look at it in three dimensions, or from above, you see an exploding star. In the middle of the star there is usually a highly dense cluster of data points, and as you move away from the centre they spread out more. The edges of the exploding star are very jagged, with data points that are further and further apart. Those jagged edges represent my friends and colleagues who deviate from the norm more than most. When we process the scatter-plot of data to guide our decisions using standard data analysis, our decisions will be determined by the middle, which is the norm and represents the majority. The jagged edge is either eliminated from the data set to make our decision clearer, or those edge points are outnumbered and overwhelmed by the norm or majority. The more data there is, or the smarter the systems are, the more the edges are outnumbered.
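To make the arithmetic of being outnumbered concrete, here is a minimal sketch (the numbers are invented for illustration, not drawn from any real data set) of how a decision tuned to minimize average error is barely moved by the people at the edge:

```python
import random
import statistics

random.seed(0)

# A hypothetical one-dimensional "need" measurement:
# a dense cluster around the average, plus a few people at the edge.
majority = [random.gauss(0, 1) for _ in range(10_000)]
edge = [8.0, -9.5, 11.2]  # invented far-from-average needs
population = majority + edge

# A decision that minimizes average error lands near the mean,
# which the dense middle dominates almost completely.
decision = statistics.mean(population)

print(abs(decision))         # small: close to the average person's need
print(abs(decision - 11.2))  # large: the edge is effectively ignored
```

Adding more majority data only shrinks the influence of the edge further, which is why "smarter" (more data) made things worse for Alissa.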
Smart Cities, at their most basic, are urban planning systems guided by data. If we don’t change the way we process data, those smart cities will similarly ignore and fail to recognize the needs of my friends and colleagues that deviate from the norm.
Data abuse, misuse and bias
I’ve also come to realize that while my friends, like Alissa, could use the truly smart systems the most, they are also most vulnerable to the abuses and misuses of those smarts. I’ve learned this while working with the Ontario Provincial Police unit that addresses fraud. The risks are multiplying and morphing as people come up with more and more nefarious ways to manipulate the data. These include identity theft, sales scams, financial fraud, to name just a few.
It’s not just the criminals that do unfair things with the data. The institutions that we put our trust in, such as insurance companies, banks, human resource services, welfare systems, political parties, justice systems and security services, discriminate against my friends based on their data — to deny insurance, refuse loans, deny a chance at a position, unduly punish trivial acts, and unnecessarily flag my friends as security risks. (Virginia Eubanks, Cathy O’Neil and Safiya Noble give many poignant examples of this.)
De-identification and re-identification
The current response to this data abuse and misuse, by our Canadian privacy efforts and in the Sidewalk Toronto project, is to de-identify the data. The notion is that if we remove our identity from the data, it can’t be traced back to us and it can’t be used against us. The assumption is that we will thereby retain our privacy while contributing our data to making smarter decisions about the design and functioning of our city.
Knowing the characteristics of the data about my unique friends, I’ve long argued that while my friends are the most vulnerable to data abuse and misuse, they are also the easiest to re-identify. If you are the only Indigenous person in a neighborhood of people of European origin, it will be very easy to trace your data back to you. If you are the only person who receives delivery of colostomy bags in your community, it will be very easy to re-identify your purchasing data.
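A toy illustration of why this happens (the records below are invented for illustration): once the combination of attributes left in a "de-identified" record is unique in the data set, removing the name does nothing to protect the person behind it.

```python
from collections import Counter

# Invented "de-identified" records: names removed, but quasi-identifiers
# (neighbourhood, heritage, purchase) remain.
records = [
    {"neighbourhood": "Ward 3", "heritage": "European", "purchase": "groceries"},
    {"neighbourhood": "Ward 3", "heritage": "European", "purchase": "groceries"},
    {"neighbourhood": "Ward 3", "heritage": "Indigenous", "purchase": "groceries"},
    {"neighbourhood": "Ward 3", "heritage": "European", "purchase": "colostomy bags"},
]

# Count how many records share each combination of quasi-identifiers.
# A count of 1 means the record is unique and trivially re-identifiable.
signatures = Counter(tuple(sorted(r.items())) for r in records)
unique = [dict(sig) for sig, n in signatures.items() if n == 1]

for r in unique:
    print(r)  # exactly the far-from-average people stand out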
People as a set of numbers
There is more to my unease with data and the privacy strategies being proposed. Data analytics represents people as numbers. It reduces people to one number, or a bunch of numbers. This serves to occlude, reduce or remove our humanity. In our history, data and machine determinations about people have often been used to excuse inhumane decisions. Pointing to the data is used to absolve ourselves of guilt in acts that we would otherwise consider unfair. Common examples include when a government official denies a service or gives an excuse for an unreasonable application of a policy. We claim that the data doesn’t lie; that we are powerless to make exceptions. We remove our humanity and the humanity of the numbered person. Data-driven decisions are not humane; they are impersonal. People who are not average are best served by personal systems that recognize their unique needs, not by what we learn from Big Data or services guided by data analytics using data about the whole population.
This concern goes beyond people that are far from average. When I think about the things that make a city livable for me, it is the grace notes of humanity and human connection. The small munificent acts on the Queen Street streetcar, the wave of smiles that signal shared recognition of a toddler’s audacious behavior in public, the tiny “take one, leave one” libraries in my neighborhood, the house-proud things people do to their front yards that cheer my walk, greeting my neighbor’s grumpy and smelly old dog each morning, the illegal busker who plays an unexpected variant of my favorite song, and the many surprising, serendipitous moments that the jagged rich diversity of a city produces. It isn’t the efficiency of the infrastructure. In fact, there are more grace notes when we need to pull together because the infrastructure breaks down. These grace notes are not captured by data. That isn’t to say that I don’t appreciate data-driven decisions. I love the counter-flow lanes for bicycles arrived at by looking at open data, for example. I appreciate the timing of the lights on Dundas Street, so I can get into work more efficiently. But I worry about what happens when the numbers get bigger and further from our humanity and are applied to more and more decisions. The factual truth or evidence we arrive at using our current data analytics is devoid of context and many of the complex things that make us human. Our data analytics strategies combined with our privacy strategies remove our uniqueness and variability. What will be missing in the resulting plans?
Who to engage in design
So how do we make better urban planning decisions? Polls and surveys are structurally biased toward people who are able to participate. Rule by majority does not help my far-from-average friends because they are each a tiny minority, even if they are able to participate. (However, add up all those tiny minorities and you have a substantial chunk of our city.)
In our practice as inclusive designers, we invite people who have difficulty with or can’t use a design to co-design with us. In this way we find more innovative approaches. Most often, design that benefits people at the margins also benefits people in the crowded middle. This would make sense for Sidewalk as well. People at the margins of our current communities have the most to risk and the most to gain from smart communities. It is when we are far from average that we have the most compelling uses of smarts. If we find strategies to protect ourselves from risk when we are most vulnerable, we will have a means to protect anyone who is less vulnerable as well.
Leveling the playing field
To address the issue of majority data overwhelming the needs of people at the edge, I’ve been playing with the Gaussian curve or normal distribution. I call it the “lawnmower of justice.” I cut off all but a small number of repeats of any given data point. This forces the learning model to level the playing field for the full spectrum of needs and pay attention to the edge or outlying data as well. It levels the hill in the normal distribution. So far, it takes longer for the learning model to reach what would be considered reasonable decisions, but once it does, it can address a greater diversity of needs and is also better at addressing the unanticipated scenario or inevitable changes in the context.
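A minimal sketch of the capping step, with an invented cap and invented data rather than my actual implementation, might look like this:

```python
from collections import Counter

def lawnmower(data, cap=3):
    """Keep at most `cap` repeats of any given data point, so the dense
    middle of the distribution no longer hopelessly outnumbers the edge."""
    kept, seen = [], Counter()
    for point in data:
        if seen[point] < cap:
            kept.append(point)
            seen[point] += 1
    return kept

# An invented distribution dominated by the average value 5,
# with edge values 0 and 9 appearing only a handful of times.
data = [5] * 1000 + [4] * 200 + [6] * 200 + [0] * 2 + [9]
trimmed = lawnmower(data, cap=3)
print(Counter(trimmed))  # no value survives more than 3 times
```

After mowing, the hill in the middle is levelled: the value 5 carries no more weight than the edge values, so a model trained on the trimmed set must attend to the full spectrum rather than the majority alone.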
Personal data preferences
If de-identification is not a reliable approach to maintaining the privacy of individuals that are far from average, but data exclusion means that highly impactful decisions will be made without regard to their needs, what are potential approaches to addressing this dilemma? The primary focus has been on an ill-defined notion of privacy. When we unpack what this means to most people, it is self-determination, ownership of our own narrative, the right to know how our data is being used, and ethical treatment of our story.
To begin to address this dilemma, I have proposed an International Organization for Standardization (ISO) personal data preference standard as an instrument for regulators to restore self-determination regarding personal data. I developed the proposal as a response to the all-or-nothing terms of service agreements which ask you to give away your private data rights in exchange for the privilege of using a service. These terms of service agreements are usually couched in legal fine print that most people could not decode even if they had the time to read them. This means that it has become a convention to simply click “I agree” without attending to the terms and the rights we have relinquished.
The proposed standard will be part of an existing standard I developed together with my team at the Inclusive Design Research Centre and an international working group. The parent standard is called AccessForAll, or ISO/IEC 24751. The structure of the parent standard enables matching of consumer needs and preferences with resource or service functionality. It provides a common language for describing what you need or prefer in machine-readable terms and a means for service providers or producers to describe the functions their products and services offer. This allows platforms to match diverse unmet consumer needs with the closest product or service offering. Layered on top of the standard are utilities that help consumers explore, discover and refine their understanding of their needs and preferences, for a given context and a given goal.
The personal data preference part of this standard will let consumers declare what personal data they are willing to release, to whom, for what purpose, for what length of time and under what conditions. Services that wish to use the data would declare what data is essential for providing the service and what data is optional. This will enable a platform to support the negotiation of more reasonable terms of service.
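As a rough sketch of the matching idea (the field names and vocabulary below are invented for illustration and are not the actual ISO/IEC 24751 terms), a platform could compare a consumer's declared preferences against a service's declared data requirements and negotiate what may be used:

```python
# What the consumer is willing to release, to whom, and for what purpose.
# (Illustrative vocabulary only.)
preferences = {
    ("location", "transit-agency", "route-planning"): "allow",
    ("location", "advertiser", "marketing"): "deny",
    ("purchase-history", "advertiser", "marketing"): "deny",
}

# What a service declares it needs: essential vs. optional data.
service_declaration = {
    "essential": [("location", "transit-agency", "route-planning")],
    "optional": [("purchase-history", "advertiser", "marketing")],
}

def negotiate(prefs, declaration):
    """Return the data the service may use, or None if an essential
    requirement is denied and no agreement is possible."""
    granted = []
    for req in declaration["essential"]:
        if prefs.get(req) != "allow":
            return None  # essential data refused: terms cannot be met
        granted.append(req)
    for req in declaration["optional"]:
        if prefs.get(req) == "allow":
            granted.append(req)
    return granted

print(negotiate(preferences, service_declaration))
```

Unlike an all-or-nothing "I agree" click, the optional requests the consumer declines are simply dropped rather than forcing a refusal of the whole service.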
The data requirements declarations by the service provider would be transparent and auditable. The standard will be augmented with utilities that inform and guide consumers regarding the risks and implications of preference choices. Regulators in Canada and Europe plan to point to this standard when it is completed. This will hopefully wrest back some semblance of self-determination of the use of our data.
Data platform co-ops
Another approach to self-determination and data that we are exploring together with the New School and the Platform Co-op Consortium is the formation of data co-ops. In a data co-op, the data producers would both govern and share in the profit (knowledge and funds) arising from their own data. This approach is especially helpful in amassing data in previously ignored domains, such as rare illnesses, niche consumer needs or specialized hobbies. In smart cities there could be a multiplicity of data domains that could have associated data co-ops. Examples include wayfinding and traffic information, utility usage, waste management, recreation and consumer demands. This multiplicity of data co-ops would then collaborate to provide input into more general urban planning decisions.
We need more time
Learning from the jagged scatter-plot of human difference takes more time. The imposed deadlines for coming up with a plan for Sidewalk Toronto are unrealistic. In our race to dominate globally, to “beat China,” we fail to protect our most vulnerable. Whatever plan we devise should not be perfect, complete or permanent. The perfect does not invite engagement. The plan should be for an inclusive process that supports iterative co-creation of our community.
Toward more inclusive communities
How do we better understand our cities? I would say it is by seeking out the rich stories of their residents and visitors, including the stories of people who have been ignored — the multiplicity of experiences and perspectives within this highly complex city — without removing the context or the variability. This is counter to current data approaches. How do we plan a better city? I would say it is by addressing the needs of people who have difficulty with or are excluded by our current urban design, so we create an urban plan that is more welcoming and humane. How do we ensure our data is not abused, misused, or exploited without our consent? De-identification is not the answer, because it doesn’t work for the most vulnerable and because it removes our variability and uniqueness. We need to create systems that enable us to have self-determination over our own story and who uses our data, for what purposes.
It is the way we are unique, our context, our goals, our variable and varied narrative that is most relevant to creating welcoming, humane communities. Turning our data into a homogeneous mass that privileges the mean won’t result in a more livable city. We lose the jagged edges of our humanity. We need a city where it is safe to be our whole, true, unique selves, where our stories matter. We need smarts that can dynamically orchestrate a response to these highly diverse needs, while helping us find our shared values and deeper commonality.
*Note: I am a member of the Digital Strategy Advisory Panel of Waterfront Toronto.
**Please note this work is licensed under an Attribution-NonCommercial 4.0 International License**