What World Are We Building?

danah boyd
Data & Society: Points
Jan 25, 2016

It’s easy to love or hate technology, to blame it for social ills or to imagine that it will fix what people cannot. But technology is made by people. In a society. And it has a tendency to mirror and magnify the issues that affect everyday life. The good, bad, and ugly.

0. Internet

I grew up in small town Pennsylvania, where I struggled to fit in. As a geeky queer kid, I rebelled against hypocritical dynamics in my community. When I first got access to the Internet, before the World Wide Web existed, I was like a kid in a candy store. Through early online communities, I met people who opened my eyes to social issues and helped me appreciate things that I didn’t even understand. Looking back, I think of the Internet as my saving grace because the people that I met — the strangers that I met — helped me take the path that I’m on today. I fell in love with the Internet as a portal to the complex, interconnected society that we live in.

I studied computer science, wanting to build systems that connected people and broke down societal barriers. As my world got bigger, though, I quickly realized that the Internet was a platform and that what people did with that platform ran the full spectrum. I watched activists leverage technology to connect people in unprecedented ways while marketers used those same tools to manipulate people for capitalist gain. I stopped believing that technology alone could produce enlightenment.

CC BY-SA 2.0-licensed photo by Kalyan Kanuri.

In the late 90s, the hype around the Internet became bubbalicious, and it started to be painfully clear to me that economic agendas could shape technology in powerful ways. After the dot-com bubble burst in 2000, I was part of a network of people determined to build systems that would enable people to connect, share, and communicate. By then I was also a researcher trained by anthropologists, curious to know what people would do with this new set of tools called social media.

In the early days of social network sites, it was exhilarating watching people grasp that they were part of a large global network. Many of my utopian-minded friends started dreaming again of how this structure could be used to break down social and cultural barriers. Yet, as these tools became more popular and widespread, what unfolded was not a realization of the idyllic desires of many early developers, but a complexity of practices that resembled the mess of everyday life.

1. Inequity All Over Again

While social media was being embraced, I was doing research, driving around the country talking with teenagers about how they understood technology in light of everything else taking place in their lives. I watched teens struggle to make sense of everyday life and their place in it. And I watched as privileged parents projected their anxieties onto the tools that were making visible the lives of less privileged youth.

As social media exploded, our country’s struggle with class and race got entwined with technology. I will never forget sitting in small town Massachusetts in 2007 with a 14-year-old white girl I call Kat. Kat was talking about her life when she made a passing reference to why her friends had all quickly abandoned MySpace and moved to Facebook: because it was safer, and MySpace was boring. Whatever look I gave her at that moment made her squirm. She looked down and said,

I’m not really into racism, but I think that MySpace now is more like ghetto or whatever, and…the people that have Facebook are more mature… The people who use MySpace — again, not in a racist way — but are usually more like [the] ghetto and hip-hop/rap lovers group.

As we continued talking, Kat became more blunt and told me that black people use MySpace and white people use Facebook.

CC BY-NC 2.0-licensed photo by Simon Hildrew.

Fascinated by Kat’s explanation and discomfort, I went back to my field notes. Sure enough, numerous teens had made remarks that, with Kat’s story in mind, made it very clear that a social division had unfolded between teens using MySpace and Facebook during the 2006–2007 school year. I started asking teens about these issues and heard many more accounts of how race affected engagement. After I posted an analysis online, I got a response from a privileged white boy named Craig:

The higher castes of high school moved to Facebook. It was more cultured, and less cheesy. The lower class usually were content to stick to MySpace. Any high school student who has a Facebook will tell you that MySpace users are more likely to be barely educated and obnoxious. Like Peet’s is more cultured than Starbucks, and Jazz is more cultured than bubblegum pop, and like Macs are more cultured than PC’s, Facebook is of a cooler caliber than MySpace.

This was not the first time that racial divisions became visible in my research. I had mapped networks of teens using MySpace from single schools only to find that, in supposedly “integrated” schools, friendship patterns were divided by race. And I’d witnessed and heard countless examples of the ways in which race configured everyday social dynamics which bubbled up through social media. In our supposedly post-racial society, social relations and dynamics were still configured by race.

CC BY-NC-SA 2.0-licensed photo by monkeyc.net.

In 2006–2007, I watched a historic practice reproduce itself online. I watched a digital white flight. Like US cities in the 70s, MySpace got painted as a dangerous place filled with unsavory characters, while Facebook was portrayed as clean and respectable. With money, media, and privileged users behind it, Facebook became the dominant player that attracted everyone. And among youth, racial divisions reproduced themselves again, shifting, for example, to Instagram (orderly, safe) and Vine (chaotic, dangerous).

Teenagers weren’t creating the racialized dynamics of social media. They were reproducing what they saw everywhere else and projecting onto their tools. And they weren’t alone. Journalists, parents, politicians, and pundits gave them the racist language they reiterated.

And today’s technology is valued — culturally and financially — based on how much it’s used by the most privileged members of our society.

2. Statistical Prejudice

Thirteen years ago I was sitting around a table with a group imagining how to build tools that would support rich social dynamics. None of us, I think, imagined being where we are now. Sure, there were those who wanted to be rich and famous, but no one thought that a social network site would be used by over a billion people and valued in the hundreds of billions of dollars. No one thought that every major company would have a “social media strategy” within a few years. No one saw that the technologies we were architecting would reconfigure the political and cultural landscape. None of us were focused on what we now call “big data.”

CC BY-SA 2.0-licensed photo by Intel Free Press.

“Big data” is amorphous and fuzzy, referring, first, to a set of technologies and practices for analyzing large amounts of data. But, these days, it primarily names a phenomenon: the promise that if we just had more data, we could solve all of the world’s problems. The problem with “big data” isn’t whether or not we have data, but whether or not we have the ability to make meaning from it and produce valuable insights with it. This is trickier than one might imagine.

One of the perennial problems with the statistical and machine learning techniques that underpin “big data” analytics is that they rely on data entered as input. When the data you input is biased, what you get out is just as biased. These systems learn the biases in our society, and they spit them back out at us.
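
To make that concrete, here is a toy sketch in Python. The groups, numbers, and labels are all invented, and no real system is this simple: a “risk model” is trained on historical labels that encode a biased pattern of past decisions, and because the bias is in the labels, the model faithfully learns it back.

```python
from collections import defaultdict

# Hypothetical training data: (group, flagged_as_risky) pairs.
# The labels record biased past decisions, not actual behavior.
history = [("A", 1)] * 30 + [("A", 0)] * 70 + [("B", 1)] * 10 + [("B", 0)] * 90

# "Training": estimate how often each group was flagged in the past.
counts, flags = defaultdict(int), defaultdict(int)
for group, label in history:
    counts[group] += 1
    flags[group] += label

def predicted_risk(group):
    """The model's 'risk' is just the historical flagging rate for that group."""
    return flags[group] / counts[group]

print(predicted_risk("A"))  # 0.3
print(predicted_risk("B"))  # 0.1, the output mirrors the bias in the input
```

Swap in a far more sophisticated learner and the dynamic is the same: the model optimizes against the labels it is given, not against the world as it ought to be.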

Latanya Sweeney, Discrimination in Online Ad Delivery.

Consider the work done by computer scientist Latanya Sweeney. One day she was searching for herself on Google when she noticed that the ads displayed were for companies offering criminal-record background checks, with titles like “Latanya Sweeney, Arrested?” implying that she might have a criminal record. Suspicious, she started searching for other, more white-sounding names, only to find that the advertisements offered in association with those names were quite different. She set out to test the system more formally and found that, indeed, searching for black names was much more likely to produce ads for criminal justice-related products and services.

Andrew Leonard, Online advertising’s racism mess, Salon, February 4, 2013.

The story attracted a lot of media attention. But what the public failed to understand was that Google wasn’t intentionally discriminating or selling ads based on race. Google was indifferent to the content of the specific ad that showed up with a name search. All it knew was that people clicked on those ads for some searches but not others, so it served the ads up for queries that were statistically similar to the queries that produced clicks. In other words, because racist viewers were more likely to click on these ads when searching for black names, Google’s algorithm quickly learned to serve up these ads for names that are understood as black. Google was trained to be “racist” by its racist users.
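
A toy sketch shows how little intent is required. This is not Google’s actual system; the ad copy, smoothing, and exploration rate below are invented. An ad server that simply favors whichever ad has the higher observed click-through rate for a given query will absorb whatever prejudice drives the clicks.

```python
import random
from collections import defaultdict

ADS = ["Arrest record?", "Contact info"]

# Per-(query, ad) impression and click counts, lightly smoothed so that
# unseen pairs start from a neutral click-through rate.
shows = defaultdict(lambda: 2)
clicks = defaultdict(lambda: 1)

def choose_ad(query, explore=0.1):
    """Mostly serve the ad with the higher observed CTR for this query."""
    if random.random() < explore:
        return random.choice(ADS)
    return max(ADS, key=lambda ad: clicks[(query, ad)] / shows[(query, ad)])

def record_outcome(query, ad, clicked):
    shows[(query, ad)] += 1
    clicks[(query, ad)] += int(clicked)

# If users click "Arrest record?" more often when searching certain names,
# choose_ad() soon favors that ad for exactly those names. The bias enters
# through the clicks, not through any explicit rule about race.
```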

Our cultural prejudices are deeply embedded in countless datasets, the very datasets that our systems are trained on. Students of color are much more likely to have disciplinary school records than white students. Black men are far more likely to be stopped and frisked by police, arrested for drug possession, or charged with felonies, even as their white counterparts engage in the same behaviors. Poor people are far more likely to have health problems, live further away from work, and struggle to make rent. Yet all of these data are used to fuel personalized learning algorithms, to inform risk-assessment tools for judicial decision-making, and to generate credit and insurance scores. And so the system “predicts” that people who are already marginalized are higher risks, thereby constraining their options and making sure they are, indeed, higher risks.
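
The feedback loop is easy to simulate. The sketch below is purely illustrative, with made-up numbers and no resemblance to any deployed risk tool: two neighborhoods with identical underlying behavior diverge simply because the score is built from past arrests and the score drives future scrutiny.

```python
def risk_score(arrests):
    # Hypothetical score: more recorded arrests means a higher score.
    return min(1.0, 0.2 * arrests)

def patrol_rate(score):
    # Hypothetical policy: scrutiny scales with the predicted risk.
    return 0.1 + 0.9 * score

# Identical underlying offense rates, but different historical arrest
# counts: the bias already baked into the data.
arrests = {"neighborhood_A": 3.0, "neighborhood_B": 0.0}
offense_rate = 0.5  # the same everywhere

for year in range(5):
    for place in arrests:
        # An arrest is recorded only where police are present to observe it,
        # so more scrutiny produces more data, which produces more scrutiny.
        arrests[place] += offense_rate * patrol_rate(risk_score(arrests[place]))
    print(year, {place: round(count, 2) for place, count in arrests.items()})
```

After a few iterations, the neighborhood that started with more recorded arrests has accumulated far more, and the score “confirms” itself.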

This was not what my peers set out to create when we imagined building tools that allowed you to map who you knew or enabled you to display interests and tastes.

We didn’t architect for prejudice, but we didn’t design systems to combat it either.

Lest you think that I fear “big data,” let me take a moment to highlight the potential. I’m on the board of Crisis Text Line, a phenomenal service that allows youth in crisis to communicate with counselors via text message. We’ve handled millions of conversations with youth who are struggling with depression, disordered eating, suicidal ideation, and sexuality confusion. The practice of counseling is not new, but the potential shifts dramatically when you have millions of messages about crises that can help train a system designed to help people. Because of the analytics that we do, counselors are encouraged to take specific paths to suss out how they can best help the texter. Natural language processing allows us to automatically bring up resources that might help a counselor or encourage them to pass the conversation to a different counselor who may be better suited to help a particular texter. In other words, we’re using data to empower counselors to better help youth who desperately need our help. And we’ve done more active rescues during suicide attempts than I like to count (so many youth lack access to basic mental health services).
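
To give a flavor of what “bringing up resources” can mean, here is a deliberately simplified sketch. It is not Crisis Text Line’s actual system; the keywords and resource names are invented, and real systems rely on far richer language models. But it shows the shape of the idea: the text of a conversation triggers suggestions that a human counselor can act on.

```python
# Hypothetical keyword-to-resource mapping; a real system would use
# trained language models rather than literal string matching.
RESOURCES = {
    "eating": ["Disordered-eating reference sheet"],
    "suicide": ["Safety-planning guide", "Escalation checklist"],
    "sleep": ["Sleep-hygiene tips"],
}

def suggest_resources(message):
    """Return resource suggestions whose keywords appear in the message."""
    text = message.lower()
    suggestions = []
    for keyword, docs in RESOURCES.items():
        if keyword in text:
            suggestions.extend(docs)
    return suggestions

print(suggest_resources("I haven't been eating and I can't sleep"))
```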

The techniques we use at Crisis Text Line are the exact same techniques that are used in marketing. Or personalized learning. Or predictive policing. Predictive policing, for example, involves taking prior information about police encounters and using it to make a statistical assessment of the likelihood that crime will happen in a particular place or involve a particular person. In a very controversial move, Chicago has used such analytics to make a list of people most likely to be a victim of violence. In an effort to prevent crime, police officers approached those individuals and used this information to try to scare them into staying out of trouble. But surveillance by powerful actors doesn’t build trust; it erodes it. Imagine that same information being given to a social worker. Even better, to a community liaison. Sometimes, it’s not the data that’s disturbing, but how it’s used and by whom.
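
In its simplest place-based form, that statistical assessment can be as crude as counting. The sketch below assumes nothing about any vendor’s actual model; the grid cells and incident records are invented. Each map cell is scored by its historical count of recorded police encounters, and patrols go to the top-scoring cells. Note that the input is records of encounters, which already reflect where police were looking.

```python
from collections import Counter

# Hypothetical past incident records as (x, y) grid cells. These are
# records of police encounters, not of crime itself.
past_incidents = [(2, 3), (2, 3), (2, 3), (5, 1), (5, 1), (0, 0)]

scores = Counter(past_incidents)

def patrol_priorities(k=2):
    """Return the k cells predicted as most likely to see the next incident."""
    return [cell for cell, _ in scores.most_common(k)]

print(patrol_priorities())  # [(2, 3), (5, 1)]: yesterday's data sets today's patrols
```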

CC BY 2.0-licensed photo by Jared Tarbell.

3. The World We’re Creating

Knowing how to use data isn’t easy. One of my colleagues at Microsoft Research — Eric Horvitz — can predict with startling accuracy whether someone will be hospitalized based on what they search for. What should he do with that information? Reach out to people? That’s pretty creepy. Do nothing? Is that ethical? No matter how good our predictions are, figuring out how to use them is a complex social and cultural issue that technology doesn’t solve for us. In fact, as it stands, technology is just making it harder for us to have a reasonable conversation about agency and dignity, responsibility and ethics.

Data is power. Increasingly we’re seeing data being used to assert power over people. It doesn’t have to be this way, but one of the things that I’ve learned is that, unchecked, new tools are almost always empowering to the privileged at the expense of those who are not.

For most media activists, unfettered Internet access is at the center of the conversation, and that is critically important. Today we’re standing on a new precipice, and we need to think a few steps ahead of the current fight.

CC BY-NC 2.0-licensed photo by Bob Mical.

We are moving into a world of prediction. A world where more people are going to be able to make judgments about others based on data. Data analysis that can mark the value of people as worthy workers, parents, borrowers, learners, and citizens. Data analysis that has been underway for decades but is increasingly salient in decision-making across numerous sectors. Data analysis that most people don’t understand.

Many activists will be looking to fight the ecosystem of prediction — and to regulate when and where prediction can be used. This is all fine and well when we’re talking about how these technologies are designed to do harm. But more often than not, these tools will be designed to be helpful, to increase efficiency, to identify people who need help. Their positive uses will exist alongside uses that are terrifying. What do we do?

CC BY-NC-SA 2.0-licensed photo by .solo.

One of the most obvious issues is the limited diversity of people who are building and using these tools to imagine our future. Statistical and technical literacy isn’t even part of the curriculum in most American schools. In a society where technology jobs are high-paying and technical literacy is needed for citizenship, fewer than 5% of high schools offer AP computer science courses. Needless to say, black and brown youth are much less likely to have access, let alone opportunities. If people don’t understand what these systems are doing, how do we expect people to challenge them?

CC BY 2.0-licensed photo by nolifebeforecoffee.

We must learn how to ask hard questions of technology and of those making decisions based on data-driven tech. And opening the black box isn’t enough. Transparency of data, algorithms, and technology isn’t enough. We need to build assessment into any system that we roll out. You can’t just put millions of dollars of surveillance equipment into the hands of the police in the hope of creating police accountability, yet, with police body-worn cameras, that’s exactly what we’re doing. And we’re not even trying to assess the implications. This is probably the fastest roll-out of a technology driven by hope, and it won’t be the last. How do we get people to look beyond their hopes and fears and actively interrogate the trade-offs?

Technology plays a central role — more and more — in every sector, every community, every interaction. It’s easy to screech in fear or dream of a world in which every problem magically gets solved. To make the world a better place, we need to start paying attention to the different tools that are emerging and learn to frame hard questions about how they should be put to use to improve the lives of everyday people.

We need those who are thinking about social justice to understand technology and those who understand technology to commit to social justice.

Points: “What World Are We Building?” is adapted, with permission, from a talk of the same name. On October 20, 2015, Data & Society founder danah boyd spoke at the 2015 Everett C. Parker Lecture on Ethics and Telecommunications in honor of Dr. Parker’s work and legacy of fighting for media justice. danah’s full remarks are archived on her website. Video is available here. — Ed.
