As You Have More Data Available: Consider Advanced Data Methods like Machine Learning

CASE at Duke
Scaling Pathways
Dec 8, 2020 · 12 min read
“Data is the cake, machine learning and artificial intelligence are the frosting; 80 percent of data problems are simple business intelligence problems. Instead of asking ‘how do we use a solution like machine learning?’ ask ‘what is the problem we are trying to solve?’”
Bob Filbin, Crisis Text Line

Machine learning and artificial intelligence are all around us: for example, Amazon recommending the next product that you “must” have and ride-share apps telling you how long until your driver arrives. And as we witness an explosion of data in the impact sector — resulting from an increased focus on data collection and the huge quantities of data becoming available through mobile phones, satellites, social media, and more — we are seeing more opportunities for machine learning in service of social impact.

The positive implications are enormous: early prediction of earthquakes, more accessible and higher quality health diagnoses, real-time deforestation monitoring, and so much more. However, the negative impacts can be major as well, not only consuming precious time and resources to deploy but also amplifying biases and leading to investments in misguided interventions (often resulting from lack of context and poor-quality data). The positives and negatives of machine learning warrant careful consideration; our interviewees who have ventured into this area share their advice below.

Is machine learning right for you?

What is the problem that you are trying to solve and is machine learning — or something simpler — the tool to solve it? As Sharmi Surianarain, Chief Impact Officer of Harambee warns, “Organizations and funders always want to go for the new shiny tools. But my advice would be: don’t start the conversation focused on machine learning and artificial intelligence. Instead, start by asking ‘what is the question that we want to answer?’ Then determine what data you have and what the right methodology and tools are to move us toward an answer.” It could be machine learning, but it is more likely to be something far simpler.

Surianarain’s caution was echoed by multiple interviewees, including Crisis Text Line’s Co-Founder and Chief Data Scientist, Bob Filbin. He gave the example of when, early in Crisis Text Line’s history, it wanted to understand how Crisis Counselors were using their time and ways in which they could be more efficient. While Crisis Text Line was developing capacity for highly complex analytic methods, it was able to identify a major opportunity by first using simple descriptive statistics. The discovery that approximately three percent of its texters were using up to 34 percent of its Crisis Counselors’ time allowed Crisis Text Line to develop processes to support those frequent texters while freeing up counselor time for other text conversations. With respect to machine learning, Crisis Text Line spent four years thinking about the problems it wanted to solve and building data systems and a strong data culture before beginning to employ machine learning.
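
To see how far such simple descriptive statistics can go before any modeling is needed, consider a minimal sketch in pandas (the conversation log and column names below are invented for illustration, not Crisis Text Line’s actual data):

```python
import pandas as pd

# Invented conversation log: one row per conversation, with the
# texter's ID and the counselor minutes it consumed.
conversations = pd.DataFrame({
    "texter_id":         ["a", "a", "b", "c", "a", "d", "a", "b"],
    "counselor_minutes": [45, 60, 20, 15, 50, 10, 40, 25],
})

# Total counselor time consumed by each texter, largest first.
time_per_texter = (
    conversations.groupby("texter_id")["counselor_minutes"]
    .sum()
    .sort_values(ascending=False)
)

# What share of all counselor time does the top ~3% of texters use?
top_n = max(1, int(len(time_per_texter) * 0.03))
top_share = time_per_texter.head(top_n).sum() / time_per_texter.sum()
print(f"Top {top_n} texter(s) account for {top_share:.0%} of counselor time")
```

A groupby and a sort are enough to surface the kind of concentration Crisis Text Line found; no machine learning is required for this class of question.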

So how do you know if you have a problem that is ripe for a machine learning methodology? And if you have the right problem, how do you know that you have the right data to enable machine learning to work? Follow the stories of Educate Girls and Crisis Text Line below to learn more.

STEP ONE: Assess Your Problem’s Fit with Machine Learning

Machine learning is particularly good at taking large amounts of data and identifying relationships across many variables, recognizing patterns, and making predictions based on those patterns. For example, machine learning predictions can help with prevention (e.g., forecasting disease outbreaks), targeting of programmatic interventions (e.g., predicting yields of different types of crops or improving credit scoring algorithms to predict factors for repayment), and filling in data gaps. To assess whether your problem is right for machine learning, you must first identify a mission-critical problem that requires predictive capability to solve.

EXAMPLE: Educate Girls, using machine learning to fill in data gaps

Identify a Mission-Critical Problem
After a three-year randomized control trial demonstrated Educate Girls’ impressive impact on enrollment and learning outcomes, Educate Girls was selected by the Audacious Project to receive funding for an ambitious scale-up of its program, from approximately 10,000 villages to 35,000 villages over the next five years. Educate Girls had a problem, though. It did not want to spend its precious resources scouting tens of thousands of villages, and it did not possess village-level data for the entire country. Thus, it posed the question:
Given the huge number of potential villages to serve, how could Educate Girls efficiently and effectively determine which ones to target so as to reach the most out-of-school girls possible?

Identify Need for Predictive Capability
Challenge: Incomplete data

Educate Girls did not have all the data it needed to be able to identify villages with the most out-of-school girls, and therefore had a prediction problem. Educate Girls had collected household data in 8,000 villages where it had previously operated, but it did not have data on villages in potential areas of expansion. Government census data included counts of out-of-school girls, but it was outdated, of questionable accuracy, and did not go below the district level to the village level.

EXAMPLE: Crisis Text Line, using machine learning for targeted triage and intervention
Content Warning: The sections below include words and phrases associated with suicide.

Identify a Mission-Critical Problem
Crisis Text Line sees surges in the number of text messages it receives during certain times of day and knows that some of the texters are at imminent risk of harming themselves. During surges in texter volume, Crisis Counselors may not be immediately available — but Crisis Text Line would want to prioritize responding more quickly to those who are in imminent danger.
The problem was that Crisis Text Line did not know which texters were in imminent danger. Thus, it posed the question:
Given the limits of available Crisis Counselors during surges in volume, how can Crisis Text Line determine which texters are at most risk of harming themselves so they can be reached in the shortest time possible?

Identify Need for Predictive Capability
Challenge: Limitations of human analysis

Crisis Text Line initially addressed this prioritization issue by creating a list of words based on clinical research that it believed were likely to be associated with imminent risk, such as “die,” “cut,” and “suicide.” A text parser would monitor incoming texts and flag those including words from this list as “imminent risk,” moving them to the front of the queue. Yet Crisis Text Line recognized the limitation of this approach: a static word list misses the nuance of context. The parser’s performance instead served as a baseline on which to improve the organization’s high-risk detection.
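
A rough sketch of what such a keyword baseline looks like (the word list and flagging logic are illustrative, not Crisis Text Line’s actual parser):

```python
# Illustrative keyword baseline: flag a message as "imminent risk"
# if it contains any word from a clinically informed list.
RISK_WORDS = {"die", "cut", "suicide"}

def flag_imminent_risk(message: str) -> bool:
    words = {w.strip(".,!?\"'").lower() for w in message.split()}
    return bool(words & RISK_WORDS)

# Catches the obvious case...
assert flag_imminent_risk("I want to die")
# ...but context is lost in both directions: hyperbole gets flagged,
# and high-risk language without a listed word slips through.
assert flag_imminent_risk("I could just die of embarrassment")
assert not flag_imminent_risk("I took a whole bottle of pills")
```

The last two cases show exactly the limitation described above: a false positive from hyperbole and a false negative from risk expressed without any listed word.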

STEP TWO: Make Sure You Have the Right Outcome and Predictor Data

Once you determine that you have a problem that is a good fit for machine learning, your next step is to ensure that you have the data necessary to make machine learning successful. There are two types of data that are important for effective predictions: a subset of high-quality outcome data and reliable predictor data.

EXAMPLE: Educate Girls

Identify a subset of high-quality outcome data:
Jeff McManus, Senior Economist at IDinsight, spoke about IDinsight’s work with Educate Girls and how important it was that the organization possessed reliable data in some regions on the outcome it was seeking to change: the number of in-school versus out-of-school girls. Through a door-to-door census, Educate Girls had already collected household data, including a count of out-of-school girls, in the 8,000 villages where it operated. This data was reliable and complete, and it was critical for building and training the predictive model.

Identify reliable predictor data:
IDinsight used the government’s education dataset (published annually for every school in the country and including data on student/teacher ratio, gender, school infrastructure, etc.) as well as census data to find 300 indicators that could correlate with, and therefore predict, whether girls are in school or out of school in various villages.
McManus noted that “secondary data, such as that collected and published by government ministries, is often underutilized” but also warned that it often requires a lot of effort to get this publicly available data in the right format. “We had to scrape data from different websites, clean it, and match the government data across sources and with Educate Girls’ data — including identifying where the same village names had been spelled differently. This took a significant amount of time.”
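
As a small illustration of the name-matching step McManus describes, Python’s standard library difflib can fuzzy-match village names across sources (the names below are invented; real matching at this scale would also rely on district or block identifiers and manual review of edge cases):

```python
from difflib import get_close_matches

# Invented village lists from two hypothetical data sources whose
# spellings differ slightly.
census_villages = ["Rampura", "Shivgarh", "Karanpur", "Bhilwara"]
program_villages = ["Rampur", "Shivgargh", "Karanpur"]

for name in program_villages:
    # Return the closest census spelling above a similarity cutoff.
    match = get_close_matches(name, census_villages, n=1, cutoff=0.8)
    print(name, "->", match[0] if match else "no match")
```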

Putting machine learning to use:
The factors above led Educate Girls and data partner IDinsight to use machine learning to build on the available data to predict where out-of-school girls were clustered. IDinsight estimates that over a five-year timespan Educate Girls would have been able to reach around 1 million out-of-school girls with the old approach. Using machine learning, it estimates Educate Girls will be able to reach over 50% more, a total of about 1.6 million, for roughly the same cost.
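
A minimal sketch of how such a model might be structured, assuming a hypothetical table of village indicators with known out-of-school counts for surveyed villages (this is not IDinsight’s actual pipeline, feature set, or model choice):

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Hypothetical files: surveyed villages carry indicator columns plus
# a known count of out-of-school girls; candidate villages carry
# only the indicators.
surveyed = pd.read_csv("surveyed_villages.csv")
candidates = pd.read_csv("candidate_villages.csv")

features = [c for c in surveyed.columns if c != "out_of_school_girls"]
model = RandomForestRegressor(n_estimators=500, random_state=0)

# Sanity-check predictive power on villages with known outcomes
# before trusting predictions for unsurveyed villages.
scores = cross_val_score(
    model, surveyed[features], surveyed["out_of_school_girls"], cv=5
)
print("cross-validated R^2:", scores.mean())

# Rank candidate villages by predicted number of out-of-school girls.
model.fit(surveyed[features], surveyed["out_of_school_girls"])
candidates["predicted_oos_girls"] = model.predict(candidates[features])
print(candidates.sort_values("predicted_oos_girls", ascending=False).head())
```

Whatever model is chosen, the ingredients match the steps above: reliable outcome data to train on, and predictor data available for every village to be scored.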

EXAMPLE: Crisis Text Line

Identify a subset of high-quality outcome data:
Once a text is received by Crisis Text Line, the texter is connected as quickly as possible to a live Crisis Counselor. Crisis Text Line has access to a growing corpus of over 150 million text messages — and, most importantly, has tagged these text messages with information about the outcome that Crisis Text Line is seeking to understand: the texter’s level of risk as assessed by a trained Crisis Counselor. The data is plentiful and high quality; the Crisis Counselors are trained on standards for assessing risk in addition to being overseen by Crisis Text Line supervisors who monitor and audit conversations.

Identify reliable predictor data:
Crisis Text Line was already in possession of a large quantity of predictor data embedded in the text messages it received and archived: the words used by the texter in their messages. By parsing the words in text messages that have been flagged by Crisis Counselors as imminent risk, the model is able to learn the words associated with this status and attach a relative determination of risk.

Putting machine learning to use:
By using machine learning, Crisis Text Line is able to analyze large swaths of data and better assess the relationship between the words used and the risk levels assigned by Crisis Counselors. Surprisingly, the machine learning model determined that the words most predictive of a high risk of suicide do not include the word “suicide” itself, but rather words like “EMS” (five times more likely) or over-the-counter drug names such as “Advil” or “Ibuprofen.” By using the predictive power of machine learning, the team can identify 86 percent of people at severe imminent risk for suicide in their first conversations, allowing those incoming messages to be prioritized immediately.
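
A minimal sketch of this kind of text model, pairing TF-IDF word features with logistic regression (the messages, labels, and model choice here are invented for illustration; Crisis Text Line’s production system is more sophisticated):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy data: messages with counselor-assigned risk labels.
messages = [
    "i took a whole bottle of advil",
    "someone had to call ems for me",
    "rough day at school today",
    "i want to end it tonight",
    "my friend ignored my texts",
    "had a fight with my mom",
]
labels = [1, 1, 0, 1, 0, 0]  # 1 = counselor assessed imminent risk

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(messages, labels)

# Which words does the model weight toward "imminent risk"?
vectorizer = model.named_steps["tfidfvectorizer"]
classifier = model.named_steps["logisticregression"]
top_words = sorted(
    zip(classifier.coef_[0], vectorizer.get_feature_names_out()),
    reverse=True,
)
print(top_words[:5])

# Score a new message by its predicted probability of imminent risk.
print(model.predict_proba(["i have a bottle of pills"])[:, 1])
```

Learning weights from counselor-labeled conversations, rather than from a fixed list, is what lets a model surface unexpected predictors like drug names.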

Ready to take the next step?

Additional advice from the field:

1. Beware how machine learning can perpetuate inequity.

Jake Porway, Founder and Executive Director of DataKind stated, “If not designed properly, data science interventions can be ineffective or, worse, harmful. That goes doubly when it’s being used to support nonprofits and governments charged with caring for the most vulnerable populations.” If bias exists in the way a question is posed or in the data that a machine learning model is based upon, the model and predictions will dangerously replicate and amplify that bias.

Our interviewees discussed grappling with these issues and addressing them through adjusted modeling and through programming. For Educate Girls, as it employed machine learning to find villages with the highest number of out-of-school girls, it recognized the possibility of the algorithm favoring certain types of villages, such as those with the largest number of people, which could disadvantage smaller, more rural, lower caste areas. Educate Girls and partner IDinsight built checks into the algorithm to ensure that this type of bias was not occurring. Whenever any especially vulnerable villages were given lower scores by the algorithm, Educate Girls and IDinsight manually overrode the model to ensure those villages were not excluded from the program.

In Harambee’s case, it realized (through more basic data analysis) that young people are less likely to stay in a job if they are more than one to two taxi rides away from the work location. Using this insight, Harambee engaged machine learning to predict taxi routes from candidate home addresses to work locations, and thus match people against cost/time/distance parameters to increase the likelihood of job retention. It used this data, along with market intelligence on hiring patterns and employment opportunities, to provide suggested work opportunities to the youth it served. However, Harambee realized that this approach would leave out youth living further from economic centers, as they were further from potential jobs. To address the bias against more rural youth in this particular intervention, Harambee is piloting a targeted program to enable rural youth to increase their income mobility, instead of focusing on job matching alone.
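
The Educate Girls safeguard amounts to a rule layered on top of the model’s scores. A minimal sketch, with invented vulnerability flags and scores:

```python
import pandas as pd

# Invented scored villages with a vulnerability flag (e.g., small,
# remote, or historically marginalized communities).
villages = pd.DataFrame({
    "village":     ["A", "B", "C", "D"],
    "model_score": [0.9, 0.2, 0.1, 0.7],
    "vulnerable":  [False, True, True, False],
})

# Safeguard: vulnerable villages are included regardless of model
# score, so the algorithm cannot quietly screen them out.
THRESHOLD = 0.5
villages["selected"] = (
    (villages["model_score"] >= THRESHOLD) | villages["vulnerable"]
)
print(villages)
```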

Social enterprises can take steps to avoid the dangerous consequences of bias in a machine learning effort, including (as a start):

  1. Clearly articulating all of the objectives of a particular effort with the data analysis team at the start (e.g., in Educate Girls’ case, the objective of reaching the most vulnerable out-of-school girls), and establishing processes to test the model against these objectives;
  2. Including multiple stakeholders and checkpoints in the development and validation of the model; and
  3. Using open source tools, such as the Aequitas project at the University of Chicago or Deon, to help assess data sets for potential bias (a hand-rolled version of the underlying check is sketched below).
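
The sketch below shows the kind of group-level disparity check that tools like Aequitas automate (the column names and data are hypothetical, and this is not the Aequitas API):

```python
import pandas as pd

# Invented audit table: one row per case, with a group attribute,
# the model's decision, and the true outcome.
audit = pd.DataFrame({
    "group":    ["urban"] * 4 + ["rural"] * 4,
    "selected": [1, 1, 1, 0, 1, 0, 0, 0],
    "actual":   [1, 1, 0, 0, 1, 1, 0, 0],
})

# Compare selection rate and recall across groups; large gaps are a
# signal to revisit the data, the features, or the model itself.
for group, g in audit.groupby("group"):
    positives = g[g["actual"] == 1]
    print(
        group,
        "selection rate:", g["selected"].mean(),
        "recall:", positives["selected"].mean(),
    )
```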

AI Principles: The OECD has developed a set of shared principles for Artificial Intelligence which have also served to inform the G20’s principles for responsible AI. Both frameworks set out key standards and highlight human centeredness, fairness, transparency, security, and safety, among others, as central principles for all actors designing or leveraging AI solutions.

2. Build your team for machine learning.

Supporting machine learning projects requires a broad range of skills and experiences. The most obvious are the data scientists who understand how to build, test, and maintain the models. For Crisis Text Line, whose business centers on data and technology, having a variety of data scientists (including machine learning experts) on staff is natural. But for many other organizations, machine learning is best supported through external experts. Harambee partnered with a local Google Cloud partner for its machine learning projects. Educate Girls has been working closely with IDinsight, a research and consulting organization. Several social enterprises focusing on community health workers are partnering with DataKind, a nonprofit that connects social impact organizations with data scientists. Many other organizations also exist that can provide external technical expertise.

Just as important as engaging the technical data science expertise for a machine learning project is bringing in subject matter experts (e.g., mental health clinicians), programmatic experts (e.g., those heavily engaged in the organization and implementation), and those who are closest to the ultimate client or community (e.g., field supervisor, community health worker). The model must match the realities of the ecosystem in which it operates, the organization in which it lives, and the people or other entities it serves.

3. Machine learning is not a replacement for humans.

Jackie Weiser, Lead Data Scientist at Crisis Text Line, cautioned, “Think of machine learning as an assistant to humans … and always have a healthy skepticism about the algorithms. Even two years into our machine learning effort, we question it regularly.” Weiser makes an important point that, although artificial intelligence and machine learning can help increase efficiency and provide more complex analytics, regular review and validation by humans remains critical — especially when working with vulnerable populations. Only humans can contextualize and identify extenuating circumstances, such as factors not captured in the data on which the algorithm was trained or nuances that exist in some contexts and not others. Two years into Crisis Text Line’s machine learning work, humans are still in the loop on all text conversations; supervisors have access to all messages, scanning the queue and bumping up messages they deem higher risk than the algorithm’s assessment. In another example, Educate Girls continues to validate its model as new data becomes available. For example, as it moves into new states in India, it spends time going door to door in several hundred villages to collect primary data, feed it into the model, and ensure that different cultural dynamics or other factors are considered appropriately.

Want to learn more about Machine Learning for Impact?
Content in this section was based on interviews and also draws from several excellent guides to machine learning for impact.

This article was written by Erin Worsham, Kimberly Langsam, and Ellen Martin, and released in June 2020.


The Center for the Advancement of Social Entrepreneurship (CASE) at Duke University leads the authorship for the Scaling Pathways series.