Q&A: Participatory Machine Learning

Fernanda Viégas, Jess Holbrook and Martin Wattenberg, illustrated by Charlotte Trounce for Google

In May 2020, Fernanda Viégas, Jess Holbrook, and Martin Wattenberg — who cofounded Google Research’s People + AI Research (PAIR) initiative in 2017 — sat down to talk about participatory machine learning, a core idea central to the direction PAIR’s research and projects have taken. They took the conversation as an opportunity to further articulate and explore the concept in theory and especially in practice. David Weinberger, PAIR’s writer-in-residence, prompted them with questions. They then collaboratively edited the transcript for concision and clarity.

Origin story

Q: How did the idea of participatory machine learning come up?

Fernanda: From the beginning, PAIR has had a broad research agenda focused on putting humans at the center of building AI technology. For instance, we were building tools to help developers understand their data and model behaviors, but we were also working on how doctors do or don’t trust AI-assisted diagnoses. We were bringing TensorFlow to the web and publishing human-centered AI guidance for UXers. We were gathering academic and industry experts at events. At the end of the day we were scratching our heads and thinking, you know, this could look like a laundry list of projects, but there was something about that list that still made sense, that felt cohesive.

We realized that it wasn’t just that people were at the center, but they were different kinds of people doing different kinds of things with the technology.

We wanted to empower the diversity of engagements and the diversity of stakeholders, and the different ways they could participate. An important part of what PAIR’s about is democratizing this technology and bringing it to a much broader set of people.

And, you know, it’s not like we were the first people to use the word “participatory” in reference to AI; Joy Buolamwini springs to mind. Participatory ML is a very positive way of looking at what we can do.

Martin: Sometimes I get the sense that people feel like AI is something that is being done to them. It’s so important that people feel like they’re active participants, that they’re agents, that ultimately they’re in control. Participatory ML means we’re not centering the question on simply making better technology. We’re thinking very much about what people need and how technology could help.

Fernanda: If you’re in fact feeling that AI is being done to you, as Martin says, then your first impulse will be to narrow the scope of AI, to constrain it. But by thinking about participatory ML, we’re thinking about broadening it: how to get more people involved, and how to make this better for everyone by responsibly enlarging the possibilities rather than constraining them.

Q: So, how do you bring people in?

Jess: There needs to be participation from the people in the communities that will be affected by these technologies. And that circle of communities keeps growing. For years PAIR developers have worked on tools that enable more responsible development of AI, such as the What-If Tool and Facets. But we also want to lower the hurdles to enlarge the circle of people who can engage in AI development, so we’ve worked on TensorFlow.js, which lets people create powerful machine learning applications on their regular laptops.

Then, as soon as you enable one group, you realize there’s another group to involve. For example, we enabled the design and the UX community through resources like the guidebook.

Q: Since not everyone’s on a first-name basis with it, can you please describe it?

Jess: The People + AI Guidebook. It assembles in one place what we’ve learned about designing and implementing machine learning projects in ways that keep them responsive to user needs and ethically responsible. People from all across Google participated in this. Now the Guidebook is being used in 195 countries and by lots of groups — UXers, developers, product managers, startups — which makes us happy.

But as we were working on that, we realized the Guidebook might be helpful to policy leaders if they could participate as well. So we added some considerations to the Guidebook and have been working on new materials for and with policy folks, including interactive AI Explorables, which we think lots of people, especially nontechnical folks, will find helpful.

And we are reaching out further these days. We’ve held a couple of very eclectic symposia that bring people together from across many disciplines and backgrounds. We’ve started to develop educational materials.

Fernanda: There’s plenty of room for us to grow in all these areas, which is very exciting to us.

Navigating Differences

Jess: That’s led us to think about what’s involved in each of these different stakeholders’ domains. It’s great having more and more participants, but they frequently have different incentives. You might have a policy maker who wants full transparency of models. Our job is then to think: okay, what would that mean from a developer’s point of view? Would that mean open sourcing every single piece of code? Now, most of the code that PAIR develops is open source, but from a larger point of view, what would that mean for all the commercial software companies? There’s a tension there.

Or, to stick with transparency and explainability, you might say that every ML system should explain its decisions to the user at every point in time. We can of course see the appeal of that, but if you actually do it, the user experience can become worse because too many details can make something harder to understand. At some point, the information becomes friction. So, we’re finding the tensions that arise when more and more people can participate in different ways — from developers and UXers to end users.

But that’s good. It’s creating a much more nuanced conversation.

Q: How do you manage those tensions?

Fernanda: We try to meet people where they are. For instance, we build educational modules where you can get a hands-on sense of how the technology works, what the implications are, and what critical questions you should be asking. We build tools where no coding is necessary, such as our What-If Tool, so that non-coders and people with less technical skill can engage with the technology in their own ways.

That’s one of the things about PAIR’s work that’s very exciting to me: It morphs, it goes from academic papers to open source tools, to industry guidelines, to educational modules. We’re engaging with all these different publics — helping them, we hope, but definitely learning from them.

Hearing the new demand for participation

Q: So far the sort of participation that you’ve been talking about generally is among the people who are intimately involved with the AI systems: building the models, doing the design, the frontline users. But the demand to participate seems to be spreading rapidly.

Jess: Definitely. The people whose lives and work are affected by AI need to be part of the participatory circle.

PAIR has long advocated for “inclusive design”, and not just for AI systems. That’s good for business because you can expand your markets, and you’ll get feedback from a more diverse group, which lets you improve your product. But it’s also obviously essential on the most basic moral grounds.

But now AI is revealing a bottled-up need for these conversations. It’s like AI was the last straw. Things have been built forever without people’s input. I think what’s been happening is pretty profound.

It feels like AI “flipped a bit” with lots of people — it reached a tipping point — and now they’re saying: “Okay, enough. We want to participate. We want a say.”

Forty years ago the notion that you could even participate or contribute to something beyond the local was kind of farcical, right? Now I can just casually tweet at a global airline that lost my luggage and I expect them to jump to respond to me. We’ve hit a point where people expect that. And I think that’s a good thing.

In fact, now that we can so easily connect with a large organization, it feels like the old communication channels were broken until now.

I think the same sort of thing is happening with communities’ pent-up desire to participate in the technical systems that affect them.

Q: How does a company that’s creating machine learning products engage in that sort of conversation with these wider communities?

Martin: We don’t want everyone to feel like they have to have a degree in computer science to participate. That is exactly the wrong direction. We have to come up with the designs that will help communicate with people. We need people’s help with that.

Q: That sounds like a UX issue.

Jess: That’s certainly part of it. And there’s a real challenge there because there are often many users, and what each of them wants and needs is oftentimes in tension with what others want. It’s also really important to remember, with all of this talk about participation, that people can choose not to participate if they don’t want to.

How to participate: Data and training

Q: Let’s look at if and how wider communities can participate in each of the steps in developing and deploying a machine learning system, starting with gathering the data that model training depends on.

Fernanda: ML depends deeply on data: good, high-quality data, and often lots of it. You have to make sure that you have a diverse dataset. Crowd-sourcing can sometimes be really helpful. It can sometimes reach populations that might otherwise be missed. Google Translate, for instance, definitely wants global participation because language is so diverse, subtle, flexible, and constantly evolving.

But as always you have to be careful about the quality of the data.

Nevertheless, the important point is that no one company can on its own meet the challenge of creating all the diverse, representative datasets we need.
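As a purely illustrative sketch of what checking that diversity might look like in practice — the function name and toy data here are invented, and this is not a PAIR tool — even a few lines of code can surface skew in a dataset:

```python
from collections import Counter

def representation_report(records, group_key):
    """Report each group's share of a dataset, to surface skew.

    A tiny, hypothetical check: real dataset audits (e.g. with tools
    like Facets) look across many more dimensions than a single key.
    """
    counts = Counter(record[group_key] for record in records)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}

# A toy "translation corpus" that over-represents one language.
records = [
    {"language": "en"}, {"language": "en"},
    {"language": "en"}, {"language": "pt"},
]
print(representation_report(records, "language"))  # {'en': 0.75, 'pt': 0.25}
```

A report like this doesn’t fix the skew, of course; it just makes the gap visible so that the people affected can be brought into the conversation about filling it.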

Q: Once the data’s been gathered, how important is participation in the training of models on that data?

Martin: Really important.

PAIR puts a lot of emphasis into building better tools for engineers. Good tools mean building ML is less of an esoteric craft, which opens it up to more people and more types of people.

And some of PAIR’s most important contributions have been tools that can enable not only a developer but the developer’s client or the users of a system to understand more of what’s going on in a model. Why is it giving the results it does? Are those results fair? After all, one of the important goals of participatory ML is to enable a wider spectrum of people to assess the effects of a model, understand what’s contributing to those outcomes, and tune the model to achieve a community’s goals.

AI is so important to communities and can be so technically complex that it can be overwhelming just to figure out where to start. But I believe in the power of small steps. With the tools PAIR builds, even seemingly small interventions can ultimately have a very big effect down the line. Given that PAIR’s mission is not to stand aside and criticize but to repair the world in some sense, I’d say that finding those small steps that have big effects is very important.

How to participate: Defining success

Q: How about participation in deciding what counts as a successful use of a particular ML system?

Fernanda: Machine learning systems run on what are called objective functions, which define the goals the training process aims at. These are expressed internally as mathematical equations that are often complex and hard to understand. But participatory ML means that the goals should not be understood only by the engineers or the mathematicians on the team. The designers, the product managers, the support people, the salespeople… everyone needs to understand exactly what the ML system is striving towards, what the tradeoffs are, what the limitations are, even how the thing might break.

Q: Including the client? Even if the client might be, say, an entire city’s population?

Fernanda: That’d be wonderful. Imagine a widespread citizens’ conversation about what goals the system is trying to accomplish! Here are the tradeoffs you can make. What matters more to the city, false positives or false negatives? How can that be expressed as an objective function? You need to have that translation between the math and the real world consequences.

And one of the exciting things about ML is that it provides a language for talking about what you want from the system in rather precise terms.
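To make that translation concrete, here is a deliberately tiny sketch — the function name and the cost weights are invented for illustration, not taken from any real system — of how a false-positive versus false-negative tradeoff can show up in an objective function:

```python
def weighted_error(predictions, labels, fn_cost=5.0, fp_cost=1.0):
    """Toy objective: penalize false negatives more than false positives.

    The cost weights are the "policy knobs" a community might debate:
    fn_cost=5.0 says a missed case is five times worse than a false alarm.
    (These names and numbers are illustrative only.)
    """
    total = 0.0
    for pred, label in zip(predictions, labels):
        if label == 1 and pred == 0:    # false negative: missed a real case
            total += fn_cost
        elif label == 0 and pred == 1:  # false positive: a false alarm
            total += fp_cost
    return total / len(labels)

# Two models can have similar raw accuracy yet score very differently
# once the community's weights are applied.
labels     = [1, 1, 0, 0, 0, 0]
cautious   = [1, 1, 1, 1, 0, 0]  # 2 false positives, 0 false negatives
dismissive = [1, 0, 0, 0, 0, 0]  # 0 false positives, 1 false negative

print(weighted_error(cautious, labels))    # ~0.33
print(weighted_error(dismissive, labels))  # ~0.83
```

Note that the "dismissive" model is actually more accurate overall (5 of 6 correct versus 4 of 6), yet it scores worse under these weights — which is exactly the kind of value judgment a community, not just an engineering team, might want a say in.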

Jess: It’s hard, and not just because people disagree. Humanity exists in the gray areas. So, outside of the realm of AI, a city might decide to institute jaywalking laws. But if you walk across your street when there are no cars, and you get a ticket from a police officer, your reaction would probably be: Oh my gosh, what’s that about? We’d reasonably expect that the legal code might be bent a bit, depending on circumstances.

If you were designing AI traffic police robots — something we are definitely not recommending, mind you — you’d want them to not be so hard-coded that they’re arresting everyone who jaywalks. But that would require the city to articulate more precisely exactly what sort of rules and exceptions it wants. Machine learning is leading us as citizens and users to have discussions about how to formalize things that have never needed to be formalized at that level of precision before.

Q: It’s a forcing function for discussions about what people want, their values, their assumptions…

Martin: Yes, and that’s a big reason why we need more participation. The questions are so hard.

But I also want to say that this can cut both ways. The idea behind machine learning is that you don’t have to explicitly specify the logic of the program. The classic example is spam filtering. Computers have gotten very good at it because they look at a whole lot of examples rather than having a programmer try to put their finger on exactly what words and phrases indicate that something is probably spam. That’s the promise behind machine learning: it can learn these fuzzy things.

But it’s very weird. In traditional computer programming, you can be completely fuzzy about what you’re trying to do, but you have to be very, very specific about how you do it. In machine learning, on the other hand, you can be fuzzy about how things are done, but you need to be very precise about what your goals are.

The fact that there are fewer things you have to be explicit about with machine learning might let it handle ambiguity better. And at the same time, it places a much greater weight on the humans getting their objectives right.

Q: Martin and Jess, I want to make sure I understand the distinction that you’re making. On the one hand, you have the spam example where human beings can pretty reliably label mail as real or as spam, and then you can let the machine figure out how to sort the new mail coming into your inbox.

But you could also take an example from Cathy O’Neil about using machine learning systems to evaluate teachers and public schools. The ML has to be told precisely what constitutes “success” for a teacher. Does that mean that students pass the standardized exams? Or that lots of them go to college? Or that they get good jobs? Or that they’re happy? The ML can’t decide that on its own. So what human being decides that?

Martin: To begin with, there’s the question of whether to use ML at all. Answering that question needs to be a participatory process. But it’s important to recognize that questions about assessing teachers aren’t a new thing. You could ask what it’s like to evaluate teachers without machine learning. You might have an administrator who counts how many times the teacher calls on students, and how many pages they assign each week. Or you could have a structured program of classroom visits by trained educators who form holistic, deep impressions of a teacher’s skills. Or the administrator might just ask, “Do I like the way this person sounds or not?”

We know there’s all sorts of ways that people are biased in evaluating teachers. Those are not really AI related things. Those are human things. AI is making us all confront this head on.

How to participate: Deployment

Q: Is there a role for participatory ML in the deployment phase and beyond? Some of these systems are going to be in place for many, many years, and circumstances and even values might change.

Jess: Also, chances are there will be edge cases that you’re only going to discover as more people use the system. So for sure, yes, you definitely need feedback, including and especially from the people directly affected by the ML system.

Fernanda: And from people affected indirectly.

Jess: In fact we’ve been using sustainability as a lens on some of the work on the UX side. Not just in the material or environmental sense, but in the sense of building things that can last for very long periods of time, that become more stable over time, that don’t value the present day over the future. And that includes building in structures for continuing participation.

Q: It sounds like this includes ethical sustainability and moral sustainability, as edge cases show up and as the world changes around the system. Is that fair?

Jess: Yes. And one of the ways to make systems more ethically sustainable might be — Martin and Fernanda, I’m interested if you disagree with this — to move the conversation so it’s not so exclusively about proactivity. I’m becoming more and more of a convert to reactivity. It’s really important to have systems — technical and human — that can monitor and sense the context of whatever’s happening. I think it’s a fool’s errand to think, “Oh, if we just had planned that little bit more!” You know, humans are worse at everything than other animals are, except at adapting. But adaptability is more valuable than everything else.

Fernanda: I think that it has to be a combination of predicting as much as we can, which isn’t nearly as much as we’d like, and then monitoring. For one thing, a lot of these systems scale and once you scale, there are unpredictable dynamics. And so it’s incredibly important that we continue to try to understand what these systems are doing, how they are behaving.

And that provides opportunities to invite participation over the lifetime of the project.

Should all ML be Participatory ML?

Q: Are you saying that all machine learning systems ought to be built according to participatory principles? For example, a weather forecasting system or one that’s playing chess…?

Martin: In some ways it comes down to the objective function question again. If you’re creating a chess playing machine where the only goal is to train your model to ruthlessly win the most games, there may not be that much of a need for participation. But if the goal is for the system to be fun for people to play against, then you likely need participation throughout the process, including deciding what counts as fun. You might also want to bring in chess players to understand how their community would see a new chess AI.

Q: Even for something as seemingly straightforward as weather prediction ML, the community it’s serving may think some metrics and predictions are more important than others. And how much uncertainty they’re ok with in predictions may vary.

Fernanda: Yes, and I think that that’s a reason why when in doubt you should welcome participation.

People + AI Research

Thinking together about people and AI