Catherine D’Ignazio and Lauren Klein spoke with PAIR founders Fernanda Viégas and Martin Wattenberg about Catherine and Lauren’s new book Data Feminism (MIT Press, 2020). Catherine is an assistant professor of Urban Science and Planning at MIT where she directs the Data + Feminism Lab. Lauren is an associate professor in the departments of English and Quantitative Theory and Methods at Emory University, where she also directs the Digital Humanities Lab. The conversation was collaboratively edited, with the help of PAIR’s former writer-in-residence, David Weinberger.
Fernanda: So, just to start, how do you define data feminism?
Catherine: In the book we define it as a way of thinking about data science that’s informed by intersectional feminism. This means we consider not just sexism, but also racism and colonialism, and how they infiltrate the data science process. That’s the short way to say it. Lauren, do you want to add anything to that?
Lauren: Just that it’s a book about power in data science. An intersectional feminist lens brings an analysis of power and inequality. Sometimes that gets hidden in casual conversations about intersectional feminism.
Martin: When it comes to data visualization, maybe we can look at recent examples, because since COVID-19 hit I have this sense that we’re seeing a flowering of visualization. Are there some that fit the intersectional feminist model better or worse?
Lauren: It’s both upsetting and unsurprising to see just how timely the things we say in the book are with respect to COVID. The early response to the pandemic became synonymous with “flattening the curve.” The messaging was around visualizations of the curve.
But standard visualizations weren’t enough. For example, the Financial Times’ visualization of COVID cases became an early reference for so many people.
But the Financial Times had to annotate the curves since, for example, we all knew that South Korea was testing everyone, and that there was very limited testing in Iran. The FT recognized that the image by itself, even though it was presenting reported data, wasn’t doing the work that needed to be done. The image needed those words.
Catherine: In the first couple of chapters we speak a lot about missing data, what we collect versus what we don’t collect, and how this relates to structural forces of oppression such as racism and sexism. For example, in the U.S. we’re not collecting data about sex or race and ethnicity in COVID-19 deaths. These variables are really important to know about, and not just for women; it’s also important for men because men are disproportionately dying of COVID right now. But we don’t have those numbers at the federal level because we haven’t had the forethought to plan protocols for collecting pandemic data.
It’s been interesting to see the civic actors who have stepped into these data gaps. These include the Solutions Journalism Network, COVID Black and Data for Black Lives, among others. Data for Black Lives is compiling a national database of which states have which data, and lobbying states to do better.
Fernanda: I’m glad you brought up Data for Black Lives. They talk about data as a tool for social change. It resonates a lot with your book. How do we get there?
Catherine: There’s much more fundamental structural change that has to come about. It’s hard to make anti-oppressive tools if the underlying infrastructure is completely oppressive. There are major changes outside of the datasets and outside of how we use data that we need to be working on in tandem.
So there are limitations to what we can do with data, but in the book we showcase some of the groups that are working in these anti-oppressive ways. For example, one of the seven principles in the book is: Embrace pluralism. How do we have more people at the table — the people who are directly impacted — participating in data science in a much more situated and grounded way? There’s a lot of learning that has to take place so that these groups can speak the same languages to each other.
But we also talk about direct action and movements and the importance of building really strong public information ecosystems and infrastructures.
Lauren: I also think we have to credit Data for Black Lives for bringing the idea of abolition to the field of data science. It’s a concept that’s been in circulation since the abolitionist movement in the 19th century, and that’s been taken up by prison abolition movements and other social justice organizations for a very long time. When Data for Black Lives says they want to abolish big data, they don’t mean they want to do away with it. Rather, it means replacing the oppressive system with something more liberatory. In our book we’re also not saying that all data is bad and corrupt and we ought to throw up our hands and walk away. No. We should use our imaginations and replace these structures with something generative. That’s the big picture.
More specifically, when it comes to model-driven inquiry, you need to scope the problem right. You have to decide what it is that you are focused on, what you’re trying to capture in the model that you’re building, and then understand what it tells you. That means recognizing that everything is situated within a larger system or set of systems, and that the output that you get from a particular model only gets you so far.
That is my biggest frustration with the way a lot of the research is framed right now. It’s an exciting time for machine learning. Every single week or month, there’s some new trained model advancing the state of the art. But we are only capturing a very, very small subset of the complexity of the world.
Martin: Your book is part of a long line of works about data and power. For instance, the classic How to Lie with Statistics has a line that says “a well-wrapped statistic is better than Hitler’s big lie” because it can’t be pinned on you. And even the cover of Edward Tufte’s book The Cognitive Style of PowerPoint references a Stalin-era Soviet rally. There’s a very interesting argument in your book about the appearance of objectivity and how austere design can reinforce structures of power. In the 20th century we thought the way to speak truth to power was through plainness and simplicity. Is there anything to learn from that approach?
Catherine: There are some resonances there. We’ve excluded too much in our pursuit of objectivity, including emotions, and the fact that our whole bodies are involved in cognition and understanding. In the book, this leads us to advocate for “data visceralization,” a concept formulated by Kelly Dobson, which means making data that is intelligible to the whole body, not just the eyes. The feminist argument goes back to longstanding feminist critiques of objectivity by saying we should ask who is the objectivity for, in whose service is it, and who’s left out when it fails to live up to its universalist and generalist claims.
We talk in the book about a redlining map of Detroit that shows the Black neighborhoods in red, meaning high-risk, which translated to banks not making loans there. This is an amazing example of scalable, big data: super-advanced techniques for its time.
It looked so objective on the surface while it was actually upholding a patriarchal and white supremacist vision of the world. It had really devastating consequences for cities. As in this example, objectivity can often become a cloak for the interests of the dominant group. This is why we say that even a “neutral” visualization is rhetorical, in the sense that it is producing a persuasive argument about the world. The neutral visualizations might be even more persuasive because of their apparent neutrality.
Lauren: Going back even before the 20th century, the early visualization innovators knew that they were using visualization rhetorically. William Playfair, who invented the pie chart, and who Tufte thinks is an exemplar of objectivity, said that the world was in chaos; he was writing at the time of the American, French, and Haitian revolutions. He said he didn’t know who would win or what language they’d speak, but he wanted anyone to be able to look back and get a clear view of what was going on at the time. And by “clear” he didn’t mean coldly objective. It was a deliberate rhetorical strategy to present the data in a way that would point to the economic and political instability of the time, even if he couldn’t name it as such in his chart.
Fernanda: What’s a go-to positive example for you?
Lauren: One of the best recent visualizations was the New York Times’ visualization of the sudden increase in unemployment numbers because of the pandemic. According to best practices they should have used a logarithmic scale, so that everything would fit in a nice rectangle. But the point was to show that the increase was huge. They wanted it to activate your emotions. So they showed it to scale.
In the book we argue that that’s good design. If when designing a visualization you were to say, “Oh I’m just gonna import the numbers into Excel and then have it pick the scale for me,” that’s a bad design process. But if you say that your goal is to show people that these numbers are massive and need to be recognized, you should think about the ways that you can make your point, and not be afraid if it results in something more emotional because somehow that’s considered not to be objective. Which one of those unemployment figure visualizations is more responsible, the log scale one or the one that makes visually clear the magnitude of the numbers?
Fernanda: What are some pitfalls designers can avoid?
Catherine: One thing that’s always really important to keep in mind is that when we’re dealing with data about people, the data are not just data. For example, with a map of evictions, the dots represent deep and fraught personal lives. We need to collect oral histories and take a mixed-methods approach, because the dots-on-a-map approach does not do justice to the impact on the community.
We also say in the book that you have to be careful not to tell “deficit narratives”, especially if your identity is at the intersection of multiple dominant categories (white and male and educated) and you think you’re doing something good by looking at inequities. For example, you can use data to show the relative lack of women in STEM. But that can inadvertently frame women as victims with no agency to do anything about it. Or, white people’s response to Black women’s maternal mortality stats can lead to white people thinking they have to save Black women. This can trip you up. You think you’re being a good ally but you’re in fact perpetuating a deficit narrative.
Lauren: I saw a really good infographic about digital activism by the digital strategist Leslie Mac. It had questions you can ask yourself before you join. Where and who does the initiative come from? Have members of the group already started an initiative? Are you amplifying that one, or maybe you don’t even know about it because you haven’t done the work? What’s the work done there already?
Fernanda: And if you encounter such a group, reach out to them.
Fernanda: What’s your advice to designers who want to do better?
Lauren: My advice would be to look to a wide range of places for inspiration. Some of the best work right now is coming from independent artists, creative community organizers, design-oriented activists, and just plain people in the world. Not all of these people have formal design credentials, but it doesn’t necessarily mean that their work is any less valuable. One of the points we make in the book is that requiring specific credentials or looking for a particular professional affiliation are some of the ways that women and people of color are unwittingly excluded from the field. But great work is great work, and it all counts as data visualization. We have huge challenges ahead of us, and we’ll need as many sources of inspiration as we can get, and as many accomplices as we can gather, if we’re going to make meaningful change in the world.
Catherine: My advice would be for designers to think more about creating responsible data visualizations and less about trying to achieve “neutral” or “truthful” or “objective” visualizations. Responsible is a relational term — you are accountable to another person or group. You are responsible for honoring their humanity and dignity, for making space for their emotions when the topic is fraught, and for shaping an interpretation of the data that leads to more just decisions and outcomes. Who is your visualization accountable to?
Opinions in PAIR Q&As are those of the interviewees, and not necessarily those of Google. In the spirit of participatory ML research, we seek to share a variety of points of view on the topic.
For more from Catherine, join us at the PAIR Symposium, live-streaming from London, Boston and Seattle on November 18.