Q&A: Donald Martin on Community Based System Dynamics and Machine Learning
Donald Martin is a Social Impact Technology Strategist — a role and title he created four years ago at Google. His work focuses on understanding complex societal issues and using that knowledge to proactively mitigate machine learning bias.
Donald’s recent research looks at how community based system dynamics, an approach that emphasizes the participation of typically excluded stakeholders in complex problem modeling, can help us pinpoint the root causes of high-stakes problems that often involve societal biases. We sat down with Donald to talk more about his research and how, when it comes to designing technology, partnering with communities early can contribute to better solutions for complex social problems.
Can you tell us about your role?
Donald: I created my role four years ago when I started a project at Google to help more people see and understand the underlying systems and structures that cause societal issues like racial disparities in police shootings. In my current work, I apply deep problem understanding techniques to help AI researchers and engineers better understand the potential societal impacts of their work.
Machine learning is used widely to solve all kinds of problems both big and small — from extending phone battery life to helping screen for cancer. Can you walk us through how you think people should approach using machine learning to solve a particular problem — especially a societal problem?
Donald: It’s important to focus on understanding the problem before trying to solve it. Be cognizant that you probably cannot understand all the dimensions of that problem. It’s almost impossible for one person to do that. Zoom out to the broader problem domain. Partner with people who have lived experience in that problem domain and can help you gain a more complete understanding of the broader context of the problem. By doing so, you can start to see where biases are hidden that you may otherwise miss when thinking about a problem from only your perspective.
Can you give us a real-world example of what can go wrong if the problem is not well understood?
Donald: There’s the example of racial bias that was discovered two years ago in a medical algorithm that hospitals in the U.S. widely use.
In an effort to reduce high healthcare costs, a healthcare systems provider built a system intended to identify patients with the most complex healthcare needs. These patients could then be given access to special programs that would help reduce overall healthcare system costs. This solution hinged on the causal assumption that patients with more complex healthcare needs will spend more on healthcare over time.
Now, for certain groups in the U.S., particularly Black Americans, spending more on prescriptions and hospital bills is not a good indicator of whether you have more complex healthcare needs. There are other factors (such as under-diagnosis due to bias in the healthcare system, lack of access to affordable healthcare, and lack of trust in the healthcare system) that actually decrease how much Black Americans spend on healthcare, independent of what their healthcare needs are. Because of the oversimplified causal assumption (the bias) that complex healthcare needs lead to higher healthcare spending for all populations, the algorithm ended up having the opposite effect of what was intended. The group of people it selected for special programs had 50,000 fewer chronic diseases than the group of people who weren’t selected, and the people who weren’t selected were disproportionately Black Americans. As a result, thousands of Black people with complex healthcare needs were not given access to the special programs and services they would have otherwise qualified for and that could have helped them achieve better health outcomes.
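To make the mechanism concrete, here is a minimal Python sketch of how ranking patients by spending can diverge from ranking them by need. It is not the algorithm from the study. Every name and number in it (the `Patient` fields, the `access` factor of 0.5, the cost per condition, the cohort itself) is a hypothetical assumption chosen only to illustrate the causal point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    group: str     # hypothetical group label, "A" or "B"
    need: int      # number of chronic conditions (what the program cares about)
    access: float  # assumed fraction of needed care that is received and billed


def spending(p: Patient, cost_per_condition: float = 1000.0) -> float:
    """Spending grows with need but is scaled down by barriers to access."""
    return p.need * cost_per_condition * p.access


def select_top(patients, score, k):
    """Pick the k patients with the highest score."""
    return sorted(patients, key=score, reverse=True)[:k]


def summarize(selected):
    """Count group B patients and total chronic conditions in a selection."""
    return {
        "group_B_selected": sum(p.group == "B" for p in selected),
        "total_conditions": sum(p.need for p in selected),
    }


# Synthetic cohort: both groups have identical needs (1 to 5 conditions),
# but group B receives only half of the care it needs (assumed value).
patients = [Patient("A", n, 1.0) for n in range(1, 6)] + [
    Patient("B", n, 0.5) for n in range(1, 6)
]

by_spending = select_top(patients, spending, k=4)           # proxy target
by_need = select_top(patients, lambda p: p.need, k=4)       # intended target

print("ranked by spending:", summarize(by_spending))
print("ranked by need:    ", summarize(by_need))
```

In this toy cohort both groups are equally sick by construction, yet ranking by spending selects fewer group B patients and fewer total chronic conditions than ranking by need. The gap comes entirely from the assumed access barrier, which the spending proxy cannot see.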
What could the model designers have done differently to avoid this outcome?
Donald: The key is to eliminate the knowledge gaps that lead to these oversimplified causal assumptions. In our research, we call these causal assumptions causal theories. You can think of them as hypotheses about why a problem is happening and what the key factors are.
To fill those knowledge gaps, you need to spend time with the communities that are proximate to these problems. Ground truth from people that are close to these problems can help you understand what other factors should inform your hypothesis or your causal theory.
Can you talk more about how to approach problems that are deeply technical and involve complex societal issues?
Donald: As tech folks, we tend to say: “Let’s start with the data we have and start solving.” But the data we have is often incomplete and reflects historical systemic inequities. Working in the societal issue domain requires a different approach that includes qualitative and quantitative understanding. You have to arrive at a qualitative understanding of the problem before you can dive into the quantitative.
That’s why we apply system dynamics. System dynamics was invented by J. W. Forrester in the mid-1950s to help people understand complex problems. The method starts with drawing out a qualitative visual hypothesis (a guess, an intuition) about which factors are relevant to a problem and how they might be related to each other as a system. System dynamics provides a specific visual language for qualitatively articulating relationships between factors and the feedback loops and time delays that characterize dynamic, complex societal problems. The visual language also includes elements that can be used to quantify factors and the relationships between them, which lets you create a simulation of your hypothesis. Simulation is a bridge from the qualitative world into the quantitative, analytical world that engineers are used to working in. Here they can start to think about particular factors or variables and the equations associated with the relationships between them. A simulation is like an interactive prototype of the problem that can help you test your hypothesis and deepen your understanding.
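As a rough illustration of what such a simulation can look like, here is a minimal Python sketch of a single stock with one balancing feedback loop and one time delay, integrated with simple Euler steps. It is a generic textbook-style toy, not a model from Donald's research. The variable names (`stock`, `perceived`) and all parameter values are hypothetical assumptions.

```python
def simulate(steps=800, dt=0.25, inflow=10.0,
             response_rate=0.5, perception_delay=4.0):
    """Simulate one stock with a delayed balancing feedback loop.

    stock      accumulates the problem (for example, unmet need)
    perceived  is a delayed, smoothed belief about the size of the stock
    The response (outflow) reacts to the perceived value, not the true one.
    All parameter values are illustrative assumptions.
    """
    stock = 0.0
    perceived = 0.0
    history = []
    for _ in range(steps):
        # Balancing loop: a larger perceived problem triggers a larger response.
        outflow = response_rate * perceived
        # Time delay: perception closes the gap to reality only gradually.
        d_perceived = (stock - perceived) / perception_delay
        # Euler integration of the stock and of the perception.
        stock += (inflow - outflow) * dt
        perceived += d_perceived * dt
        history.append(stock)
    return history


history = simulate()
equilibrium = 10.0 / 0.5  # inflow / response_rate, where outflow matches inflow

print("peak stock: ", round(max(history), 2))
print("final stock:", round(history[-1], 2))
print("equilibrium:", equilibrium)
```

With these assumed values the stock overshoots its equilibrium before settling, because the response is driven by a delayed perception rather than by the true state. Changing `perception_delay` or `response_rate` and rerunning is the kind of hypothesis testing the interview describes, here reduced to two variables.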
You mentioned spending time with communities. When does that come in?
Donald: There’s a special way to practice system dynamics called community based system dynamics. There’s expertise in communities that is not being brought to bear to understand complex problems. Community based system dynamics is about building capacity in communities so they can describe problems themselves and fully participate in the problem understanding phase of solution development. With community based system dynamics, you partner with folks with different perspectives and expertise to jointly develop a shared qualitative hypothesis. As you go through this process with others, you start to discover different problems that matter to them. That’s why it’s important not to hypothesize alone or only within your organization. We need to shift from trying to solve and design for others to understanding, solving, and designing with others.
Selecting the right training data is particularly critical for these high-stakes domains. Can these techniques help with that at all?
Donald: One of the key aspects of system dynamics is figuring out, or hypothesizing, what the most important factors are for a given problem, regardless of whether you currently have data for them. If you conclude that a factor is critical but you don’t have enough high-quality, representative data about it, then it’s probably not a good idea to try to apply machine learning to solve the problem.
When it comes to using AI to solve problems, what domains are you excited about or wary of?
Donald: The domains I’m most excited about — criminal justice, healthcare, and wealth — are also the ones I’m wary of! The historical and structural inequities in these areas are hard to see. If we jump to solutions without deep understanding we can make things worse. I think it’s important to have proactive strategies to understand and solve the seemingly intractable problems in these domains. Techniques like community based system dynamics can help teams and organizations gain the deep understanding needed to tackle problems in these high-stakes domains.
For those interested in learning more about community based system dynamics, what resources would you recommend?
Donald: I would start with a book called Community Based System Dynamics. Peter Hovmand, the author, is a professor at Case Western Reserve University and the former founding director of the Social Systems Design Lab at Washington University in St. Louis. I think the book answers an important question: How do we use system dynamics in a way that benefits and empowers the communities that have the most stake in the societal problems we all want to solve?
Thinking in Systems: A Primer by Donella Meadows provides a great introduction to systems thinking and system dynamics, and the System Dynamics Society has some good resources.
How can we advocate for more people to take this broad-based approach?
Donald: The first step is making people aware of the dangers of oversimplified causal theories and jumping to solutions too quickly. But also make them aware of the power of focusing on problem understanding first in partnership with communities. Help people see how problem prototypes built in partnership with communities can lead to better products and solutions. After that, we need to make techniques, like community based system dynamics, accessible and useful to people within their daily workflows — both within tech organizations and in civil society. That can be done with introductory training, but we also need to develop tools to make it easier to integrate these practices into current workflows and meet people where they are.
Have you developed any of the introductory training you mentioned?
Donald: Yes. In fact, the first time we delivered our introductory system dynamics training was to 75 people at the 2nd Data 4 Black Lives conference, held in January 2019. We wanted to be very intentional about starting with the communities most at risk of being harmed by machine learning applied in high-stakes domains. This initial training led to relationships with Data 4 Black Lives community members and the joint development of a problem prototype in the AI-for-healthcare domain.
You mentioned tools that integrate with workflows. What kinds of tools; which workflows?
Donald: For machine learning developers, two pioneers in Ethical AI, Timnit Gebru and Meg Mitchell, led the creation of datasheets for datasets and model cards. These are great resources, but they’re focused on tasks within the ML development stage, such as selecting the data, choosing the model architecture, and evaluating the model. We also need tools for formalizing the critical problem understanding and formulation decisions that happen before the ML development stage. Those tools will be based on community based system dynamics practices and should help teams prototype problems and express hypotheses in partnership with communities.
What else?
Donald: Tools alone are not enough. We need processes for bringing in people with a variety of expertise — including human rights, ethics, the social sciences, and lived experience — to collaborate on and contribute to prototyping problems. We need to put extra effort into collaborating with communities.
The first job has to be building trusting relationships with marginalized communities that are currently underrepresented in the technology industry and are the most vulnerable to the negative impacts of technology. Building trusting relationships cannot be rushed, and it requires genuine concern for the needs, problems and goals of the most vulnerable communities above and beyond the immediate research or product goal.