Legibility and Interpretability in Predictive Models (of Cities)
The discussions among the Data & Society fellows over the last few months have been steadily working their way up the food chain from the rudiments of data collection to the mechanics of data processing to the sorcery of data analytics (literally sorcery, in one case — Ingrid Burrington is doing some interesting thinking about magic and data). In early December, these conversations reached a crescendo with a deeply critical talk by Cathy O’Neil, who blogs at MathBabe.org and is the author of the forthcoming book “Weapons of Math Destruction”.
While I still feel like we are finding our feet (we still talk about “algorithms” with a sense of wonder, and with the same lack of nuance and technical background that we scoff at in the broader public’s use of terms like “big data”), we are making progress. We are moving beyond the fetishization of big data, the volume, veracity, velocity meme, and starting to realize that the real breakthroughs are happening at the other end of the funnel: it’s the algorithms. As Gary King told Harvard Magazine, more data and more computations “is nothing compared to a big algorithm”. We’re getting over the fascination with the abundance of this new resource (“data is the new oil” thinking) and homing in on the new infrastructure that will transform it into valuable products and services. In industrial age terms, we’ve gotten over the oil men and are now zeroing in on the chemists.
The challenge is that 99.9999% of “the algorithms” are hidden. This was something I wrote about in the most basic terms in Smart Cities, and my concern has only grown since. Writing back in late 2012:
The most powerful information in the smart city is the code that controls it. Exposing the algorithms of smart-city software will be the most challenging task of all. They already govern many aspects of our lives, but we are hardly even aware of their existence.
I went on to recap some of the relevant history and the context for the urban simulation renaissance….
…computer modeling of cities began in the 1960s. Michael Batty, the professor who runs one of the world’s leading centers for research in urban simulation at University College London, describes the era as “a milieu dominated by the sense that the early and mid-twentieth century successes in science could extend to the entire realm of human affairs.” Yet after those early failures and a long hibernation, Batty believes a renaissance in computer simulation of cities is upon us. The historical drought of data that starved so many models of the past has given way to a flood. Computing capacity is abundant and cheap. And like all kinds of software, the development of urban simulations is accelerating. “You can build models faster and quicker,” he says. “If they’re no good, you can throw them away much more rapidly than you ever could in the past.”
Those failed modeling efforts were spawned out of the computational arsenal of the Cold War and anxiety over America’s urban / civil rights crisis. But after a number of failures, many scholars started pointing fingers. Douglass Lee, then a professor of city planning at UC-Berkeley, penned the most comprehensive critique of those early efforts. And their opaqueness was one of his seven sins — the “most important attribute any model should have is transparency,” he argued in a seminal 1973 article “Requiem for Large-Scale Models”.
But even today, amidst the resurgence in computer-based models of cities, we aren’t seeing the kind of transparency Lee called for, let alone the kind of radical transparency that exists around other kinds of computer code:
Ironically, while open-source software, which thrives on transparency, is playing a major role in this renaissance in urban modeling research, most models outside the scholarly community today receive little scrutiny. The “many eyes” philosophy that ferrets out bugs in open source is nowhere to be found.
It’s also odd that software seems to be treated as a different class of regulatory mechanism in government.
The tools that have governed the growth of cities (the instructions embodied in master plans, maps, and regulation) have long been considered a matter of public record.
And so I tried to assess both the obstacles and the potential tools for bringing the code of urban government to light.
Citizens will need legal tools to seize the models directly. The Freedom of Information Act and other local sunshine statutes may offer tools for obtaining code or documentation. The impacts could be profound. Imagine how differently the inequitable closings of fire stations in 1960s New York might have played out if the deeply flawed assumptions of RAND’s models had been scrutinized by watchdogs. At the time, there was one case in Boston where citizen opposition “eventually corrected the modeler’s assumptions”, according to Lee. Today, assumptions are being encoded into algorithms in an increasing array of decision-support tools that inform planners and public officials as they execute their duties. But the prospects for greater scrutiny may actually be shrinking instead. New York’s landmark 2012 open data law, the most comprehensive in the nation, explicitly exempts the city’s computer code from disclosure.
And I homed in on the issue of complexity and trust as a key obstacle to getting people to even use models.
Greater transparency could also increase confidence in computer models with the group most prepared to put them to work solving problems: urban planners themselves. But the modeling renaissance that Batty sees isn’t driven by planners or even social scientists, but by physicists and computer scientists looking for extremely complex problems. As Batty told an audience at MIT in 2011, “Planners don’t use the models because they don’t believe they work.” In their eyes, the results of most models are too coarse to be useful. The models ignore political reality and the messy way groups make decisions. And while new software and abundant data are lowering the cost of creating and feeding city simulations, they are still fantastically expensive undertakings, just as Douglass Lee noted forty years ago.
Without addressing the trust issue through transparency, cybernetics may never again get its foot in the front door of city hall. As journalist David Weinberger has written, “sophisticated models derived computationally from big data, and consequently tuned by feeding results back in, might produce reliable results from processes too complex for the human brain. We would have knowledge but no understanding.” Such models will be scientific curios, but irrelevant to the professionals who plan our cities and the public officials who govern them. Worse, if they are kept under lock and key, they may be held in contempt by citizens who can never hope to understand the software that secretly controls their lives.
It has always kind of surprised me that not a single reader has engaged me on this issue since the publication of Smart Cities. In my view, it is the most vexing and potentially intractable issue we face going forward.
Fast forward to today, which I spent in Boston at a conference organized by the Boston Area Research Initiative, one of the new centers I’m studying that are loosely organized around exploiting the opportunity to use big data to study cities — “Understanding and Improving Cities: Policy/Research Partnerships in the Digital Age”. The conference was focused on understanding how universities and the city of Boston can work together to improve policymaking and planning.
There were a lot of interesting presentations, but the one that really threw me for a loop was by MIT Sloan professor Cynthia Rudin, which concluded with a call for what she calls “interpretable modeling”. Rudin has a deep dislike of “black box” models deployed in policy settings.
Now, the kind of black box that I was worrying about in Smart Cities, and that Doug Lee criticized in 1973, was much less sinister. We were simply complaining that we couldn’t see the equations being used to classify things, or to describe feedback relationships between different urban processes. The modeler’s assumptions needed to be brought into the light. Transparency was a tractable thing with these models, due to their relative simplicity. They could even be explained to non-technical stakeholders, if only in general terms.
The next step up (and we’re still not yet in the badlands that Rudin seems to be concerned about) is black boxes that hold mathematical constructs of such complexity that only professional mathematicians can really grasp what they are doing. They are too complicated for a lay person or policymaker to understand how they produce the results they do.
But as Rudin was talking, I recalled a comment from a Data & Society lunch talk earlier in the fall by Claudia Perlich, an NYU Courant Institute alum currently working on predictive models in online advertising. Claudia described her use of (what I recall, in my limited mathematical vocabulary, as) a “million-column logit regression” that would basically predict how likely you were to click on an ad. These models worked, she argued, but no one knew how; the complexity was so vast that it was impossible for anyone to ever really understand what was going on inside. A true black box.
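To make the flavor of this concrete, here is a toy sketch of that kind of model, scaled way down: a plain logistic regression fit by gradient descent over a few thousand sparse binary features standing in for Perlich’s million columns. Everything here is synthetic and hypothetical (the data, the feature count, the training setup); the point is only that even this small version ends with thousands of fitted coefficients and no human-readable story about why any single prediction comes out the way it does.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a click-prediction dataset: 2,000 sparse binary features
# (cookies, page categories, etc. -- all invented), ~1% of them active per row.
n_samples, n_features = 1000, 2000
X = (rng.random((n_samples, n_features)) < 0.01).astype(float)

# Synthetic ground truth: a subset of features actually drives clicks.
true_w = np.zeros(n_features)
true_w[:200] = rng.normal(0.0, 2.0, 200)
p_true = 1.0 / (1.0 + np.exp(-(X @ true_w)))
y = (rng.random(n_samples) < p_true).astype(float)

def log_loss(w):
    """Average cross-entropy of the logistic model with weights w."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    eps = 1e-12
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Fit by plain gradient descent on the logistic loss.
w = np.zeros(n_features)
lr = 0.5
loss_before = log_loss(w)
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= lr * (X.T @ (p - y)) / n_samples
loss_after = log_loss(w)

# The loss drops, so the model "works" -- but explaining any one prediction
# means reading off 2,000 coefficients, none individually meaningful.
print(loss_before, loss_after)
```

At a million columns instead of two thousand, the same structure holds: the fit improves, the predictions are usable, and the inside remains illegible.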
For me, Rudin’s opening comments in Boston picked up where Perlich left off (I’m paraphrasing here): “for years now, we in the machine learning business have been selling black boxes; we stuff data in and stuff comes out, and we don’t know how, we just know it predicts.”
But this lack of transparency (really, a lack of tractability or legibility of the model) creates huge obstacles to building tools that get used in policy, because the stakes are so high: they need to “produce believable and trustworthy results”, in Rudin’s words. (As Perlich confessed, “I can sleep at night because the worst thing that my model can do is serve the wrong ad.”)
That’s why Rudin’s lab at MIT focuses on developing “interpretable machine learning models” that improve predictions but can actually be understood by humans. In a research summary posted to her website, she writes, “Possibly the most important obstacle in the deployment of computer-generated decision support systems (and in particular predictive models) is the fact that humans simply do not trust them.” Elsewhere, she elaborates: “It is essential in many application domains to have transparency in predictive modeling. Domain experts do not tend to prefer black box predictive models. They would like to understand how predictions are made, and possibly, prefer models that emulate the way a human expert might make a choice, with a few important variables, and a clear convincing reason to make a particular prediction.”
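A toy illustration of what that alternative can look like: a small points-based scoring rule with a handful of named variables and a threshold. To be clear, this is not Rudin’s actual method, and the features, point values, and threshold below are entirely invented; it just shows the shape of a model that a domain expert could audit line by line.

```python
# A hand-weighted scoring rule in the spirit of interpretable models:
# a few named variables, integer points, and a threshold anyone can read.
# All features, weights, and the cutoff are hypothetical.
SCORECARD = {
    "prior_violations": 3,    # +3 points per recorded violation
    "vacant_building": 4,     # +4 points if the parcel is vacant
    "complaint_last_year": 2, # +2 points if a complaint was filed
}
THRESHOLD = 5  # flag for inspection if score >= 5

def risk_score(record: dict) -> int:
    """Sum the integer points for each feature present in the record."""
    return sum(points * record.get(name, 0) for name, points in SCORECARD.items())

def should_inspect(record: dict) -> bool:
    return risk_score(record) >= THRESHOLD

parcel = {"prior_violations": 1, "vacant_building": 1, "complaint_last_year": 0}
print(risk_score(parcel))      # 7
print(should_inspect(parcel))  # True
```

The prediction here comes with its own explanation: one prior violation (3 points) plus a vacant building (4 points) crosses the threshold of 5. That “clear convincing reason” is exactly what the million-coefficient black box cannot offer.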
Her work focuses mostly on medical diagnostics, but I imagine the issue is general enough to apply to most fields where there is a lot of discretion in decision-making and huge social consequences.
“We do not want our predictive models to have the curse of Cassandra,” she writes, “who tells the truth, but who no one believes.”
Update (Dec 17): Stephan Hugel (@urschrei) points me to this very interesting Research at Google paper outlining the numerous ways in which machine learning systems create “technical debt” — e.g. costs and risks associated with short-term engineering choices that need to be addressed as systems scale over time.
Update (Dec 18): Solon Barocas, a Princeton CITP fellow and Data & Society friend, highlights a recent workshop on “Fairness, Accountability, and Transparency in Machine Learning” he co-organized, and a great bibliography of related work.
Update (Dec 19): I should have linked to Rob Goodspeed’s very well-written paper on “Smart cities: moving beyond urban cybernetics to tackle wicked problems”, which documents a lot of the frustration with predictive models in urban planning.