How to Lie with Analytics
The following guest post, “How to Lie with Analytics,” was written by Ted Byfield for Mozilla’s Internet Citizen blog.
Learn more about the history of data analytics and visualization at The Glass Room through December 14, 2016. Ted will be hosting a session and guided tour on December 11 at 3 p.m.
There’s a particular kind of image that’s associated with “big data” — you’ve probably seen it. Let’s call it the luminous city. The background — maybe a city, a nation, or the entire earth — is stark, often dark. Against that backdrop we see thousands of radiant trajectories fanning out and spanning across empty expanses, intertwined tendrils dispersing and converging. Aaron Koblin’s “Flight Patterns” (2011) is one famous example; Paul Butler’s “Visualizing Friendships” (2010) for Facebook is another. There are many more.
Images like this are supposed to show how our lives coincide — in cars and on planes, on the internet or social networks, whatever. What they don’t show is everyone and everything that isn’t so “connected.” The poorer neighborhoods, industrial and mixed-use areas, the people, places, and things forgotten in one way or another by “innovation.” They disappear into the background as if they never even existed. In this way, visualizations can often help us to forget many of those gritty, obstinate, and inconvenient worlds.
When people talk about visualization, they often emphasize how it can “surface” important patterns and correlations. But surfacing one thing means submerging everything else — in other words, forgetting it. Hiding isn’t a bug of visualization, it’s a feature. It’s not just fanciful visualizations that do it: their poor cousins, spreadsheets, do it too — with much more impact.
My point isn’t to condemn visualization, of course — that would be ridiculous. Every serious decision made anywhere in the world now — in government, business, manufacturing, construction, science, education, civil society, and the military — is fundamentally shaped by visualization. That is why we need to think more seriously about how visualization works — and also how it doesn’t work. But the questions we ask shouldn’t just be functional — about whether what we hope to discover or communicate is clear, effective, persuasive, or elegant. We also need to ask about the unintended effects: what disappeared into the background?
Lies, damn lies, and analytics
So what does this have to do with analytics? And how do you lie with them, anyway?
Along with the adage that “there are three kinds of lies: lies, damned lies, and statistics,” the phrase How to Lie with Statistics is one of the main references for what many people associate with statistics: lies. The phrase was the title of a 1954 book by Darrell Huff — the first popular book to tackle statistics. To this day, people often note its breezy style and describe it as current; but emphasizing its style obscures key ways in which the book is very dated.
The goalposts have moved in the last sixty-odd years. When Huff wrote it, data wasn’t an everyday fact at work, home, and everywhere between. It was still mostly hidden away in bean-counting back offices and an occasional feature of print journalism. Now, though, data is front and center — for example, in “data-driven” journalism, where the data is the story. For the outlets that publish those stories, analytics play a decisive role in what kinds of stories are developed, presented, and promoted. The same is true, in different ways, everywhere else data appears: it describes the questions we ask. As in the “luminous cities” example I began with, we should ask ourselves what kinds of stories are not developed and what kinds of data aren’t collected.
Huff’s book wasn’t really a how-to manual for lying with data; it was a primer in how not to be lied to. And that problem remains just as real decades later: how not to be lied to. The challenge now is to recognize where we can think critically and practically about the many ways that statistics and analytics shape and distort our own lives and the lives of others.
So…how do you lie with analytics?
The first step is to remember that analytics, however sophisticated, is an evolutionary step in, not a revolutionary break from, statistics. True, analytics also involves some important distinctions about how data is processed and analyzed. But the field of statistics has changed dramatically over the last two to three centuries, and on a basic level analytics is just another step. And if we can lie with statistics, we can lie with analytics.
Most people will never command what Mark Surman, executive director of the Mozilla Foundation, called “data empires.” And — hopefully — they won’t break or hack into data centers. Nor can most people immerse themselves in the esoteric world of governing algorithms. These aren’t viable options. So what is viable? What can you do?
I’d argue that one answer is right in front of you, in the steady parade of visual presentations of data and statistics you see every day. It’s no longer just a question of the informed consumer casting a skeptical eye on how a handful of facts and figures are used in front-page stories, though. Instead, it begins with a simple set of questions: What else is in the background? What’s in all those empty, “negative” spaces between, behind, and around the statistics we see? And, if or when you’re called upon to act on those images of data, you can factor in the gritty, obstinate, and inconvenient worlds they often hide.
Ted Byfield is a retired artist, frequent editor, escaped urbanist, governance hobbyist, perpetual collaborator, and recovering academic. He’s moderated nettime for a really long time and, more recently, co-founded the Open Syllabus Project. He’s currently writing a history of what people imagine information looks like, which isn’t at all what you’d imagine.
Join Ted at The Glass Room on Sunday, Dec. 11 at 3 p.m. to learn more about “How to lie with analytics.”