In May 2018, the United Nations Human Rights Council published a searing report detailing the various ways in which the US, among the wealthiest countries in the global north, remains plagued by extreme poverty and income inequality. Included in the report are the child protection agencies that were ratified by President Nixon in the mid-1970s as a supposed safeguard against child abuse, but which have since become a means for the government to intervene in the lives of the poor, mentally ill, drug addicted, and otherwise disabled — even to the point of permanently separating children from their parents.
Each year, more than three million children and their families are investigated by child services across the nation. By all accounts, the overwhelming majority of those families are poor. On 13 April 2018, my family joined those statistics. Florida was one of the first states in the nation to use AI as part of its child welfare process, hiring analytics giant SAS to conduct a research project with advanced analytics. Under its contract with the state, SAS used data collected between 2007 and 2013 to create an algorithm designed to determine which factors would most likely lead to the death of a child through abuse or neglect.
On the surface that sounds wonderful. Who doesn’t want to protect children? But child maltreatment fatalities are extremely rare — about 1,750 yearly, compared with the hundreds of thousands of kids removed from their homes, and close to four million investigated, unnecessarily.
Trying to apply a set of risk factors to an event that takes place in only about 0.05% of cases (roughly 1,750 deaths against three to four million investigations) leads to an abundance of false positives and the investigation of families with no history of abuse. It’s the statistical equivalent of saying that people who experience childhood trauma are going to become serial killers; serial killers often do have a history of childhood trauma, but most traumatised children don’t grow up to become serial killers.
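The arithmetic behind that flood of false positives can be sketched in a few lines of Python. The fatality count is the one cited above, and 3.5 million investigations is an assumed midpoint of the three to four million mentioned. The 90% sensitivity and 5% false positive rate are hypothetical values chosen only for illustration; they are not figures from SAS or from Florida.

```python
# Rough sketch of the base-rate problem, not any agency's actual model.
FATALITIES_PER_YEAR = 1_750        # figure cited in this article
INVESTIGATED_PER_YEAR = 3_500_000  # assumed midpoint of "three to four million"

# Hypothetical accuracy for an imagined risk model (illustrative assumptions).
SENSITIVITY = 0.90          # share of true fatality cases the model flags
FALSE_POSITIVE_RATE = 0.05  # share of safe families the model flags anyway


def positive_predictive_value(base_rate, sensitivity, false_positive_rate):
    """Share of flagged cases that are true positives (Bayes' rule)."""
    true_flags = sensitivity * base_rate
    false_flags = false_positive_rate * (1 - base_rate)
    return true_flags / (true_flags + false_flags)


base_rate = FATALITIES_PER_YEAR / INVESTIGATED_PER_YEAR
ppv = positive_predictive_value(base_rate, SENSITIVITY, FALSE_POSITIVE_RATE)

print(f"Base rate: {base_rate:.4%}")
print(f"Share of flagged families that are true positives: {ppv:.2%}")
```

Under these assumptions, fewer than one in a hundred flagged families would be a true high-risk case, even though the imagined model looks accurate on paper.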
This type of statistical error had already occurred in Los Angeles. In 2017, LA County quietly ended its contract with SAS after running a practice algorithm with almost identical goals against child abuse. The algorithm — which was tested against historical cases and never applied in real time — came up with 3,829 false positives from data that only included 171 high-risk cases. LA County used those results to end the contract and look into other methodologies for the future of child welfare AI. But Florida codified SAS’s research findings into its analytic approach to abuse.
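The Los Angeles figures give a concrete upper bound on how often a flag could have been right. This is a minimal sketch that assumes, generously, that the algorithm flagged every one of the 171 high-risk cases; the reported results do not say how many it actually caught.

```python
# Figures reported for the LA County test run.
FALSE_POSITIVES = 3_829
HIGH_RISK_CASES = 171

# Assumption: all 171 high-risk cases were flagged (best case for the model).
best_case_true_positives = HIGH_RISK_CASES

total_flags = best_case_true_positives + FALSE_POSITIVES
best_case_precision = best_case_true_positives / total_flags
false_alarms_per_hit = FALSE_POSITIVES / best_case_true_positives

print(f"Best-case precision: {best_case_precision:.1%}")
print(f"False alarms per high-risk case: {false_alarms_per_hit:.1f}")
```

Even in that best case, fewer than one flag in twenty would be correct, with roughly 22 families wrongly flagged for every high-risk case.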
Where this likely affected my family was in their newly strengthened emphasis on substance-exposed newborns. My eldest daughter was born in Florida in 2014 while I was taking prescribed methadone. Deciding to use methadone while pregnant was a difficult decision. I knew it was likely to lead to withdrawal when my daughter was born, but my prenatal care providers insisted it was the best course of action since I’d been addicted to heroin when I learned that I was pregnant. Methadone has been the gold standard of care for opioid use disorder since the 1970s and is approved by the World Health Organization. I did nothing illegal by taking it, but there had been a brief child welfare investigation into my family because of it.
Four years later, in the hands of the Florida Department of Children and Families, that brief investigation resurfaced as a risk factor that an algorithm decided made me more likely to harm my children. Did our family need help? Yes. Was removing my children from my custody and throwing me and my husband out onto the streets a beneficial decision for my children? Not at all.
As Christopher Teixeira, chief engineer for model-based analytics at The MITRE Corporation, suggests, it’s difficult to place a hard figure on what constitutes an acceptable margin of error for algorithms because it’s relative to what’s being measured. “If you’re looking at a model predicting whether a child will be killed, there are two types of errors that can come up,” he explains. “One, you may miss the severity of abuse and the child was killed. Depending on who you talk to, any child’s death is inexcusable, so any errors in that respect are not acceptable. The corresponding error type is, what happens if you say every child is at risk of fatality? Now you’ve overwhelmed the system.”
That overwhelmed system has major repercussions for families like mine — rent apart by overblown risk analyses instead of being provided help by the state, marginalised predominantly by our precarious economic status. SAS used public data to compile its research, including information like criminal backgrounds, welfare benefit history, and use of public health services — all sources mostly intended to service poor people. Before any data was analysed at all, poverty was assumed to be the most defining factor in determining child safety. How can any predictions stemming from such an obviously biased assumption be fair?
Of course, this isn’t an issue unique to the child welfare system. Any system known for historical bias will naturally incorporate that bias when it implements AI systems without proper safeguards. For example, in 2016, ProPublica uncovered that a risk algorithm used for criminal sentencing in Broward County was incorrectly flagging black defendants as high risk while scoring some white defendants too low.
These are not the only jurisdictions to incorporate AI into their child welfare system. The Illinois Department of Children and Families recently ended their contract with Eckerd Connects, the Florida-based child welfare community agency behind a real-time scoring tool called Rapid Safety Feedback. Like the Florida and California programs developed by SAS, the Rapid Safety Feedback tool claims to predict which children are most likely to be killed or seriously injured due to abuse, abandonment or neglect.
This program was deployed in the state, which includes the highly populated city of Chicago, where it mistakenly flagged thousands of kids as being at imminent risk of death. Two children who were not given high-risk scores ended up dying while under child welfare supervision, reported the Chicago Tribune at the time. Illinois cut ties with Eckerd, but nearly identical versions of this tool are being used in jurisdictions around the nation — including Hillsborough County, Florida, where the same company also oversees services for families whose maltreatment allegations are deemed substantiated.
So why are these programs still in use if they’re so prone to error? “We know through research that if a human had a good sleep they are hugely better decision makers than if they had a bad sleep,” says Rhema Vaithianathan, the co-director of the Centre for Social Data Analytics at the Auckland University of Technology, and one of the lead designers of a real-time algorithm currently being used in Allegheny County, Pennsylvania. “Human cognitive abilities are variable across the day, or week, or month. These aren’t robots making decisions, they are humans with all the flaws humans have.” An objective, tireless algorithmic tool, she argues, can provide faster and more comprehensive analysis across a variety of factors.
Emily Putnam-Hornstein, an associate professor of social work at the University of Southern California who designed the program with Vaithianathan, adds that these tools work best when paired with caring human workers. The Allegheny Family Screening Tool, for example, helps the intake screener decide whether or not to send an investigator to the home, but it doesn’t contribute to any further decisions. It’s also not trying to measure whether a child will be killed. Instead, it scores the likelihood that a family will become involved with child services again in the future.
This matters because it’s much easier to measure the factors leading to a common outcome than one as rare as the death of a child. Re-referrals, unlike maltreatment fatalities, happen fairly regularly (the exact rates vary between jurisdictions). The problem with measuring this outcome, however, is that it might just end up measuring bias. If kids are referred for bad reasons, like racism, then predicting whether a family will end up in the system again could just be a prediction of whether a family has racist neighbours.
Ira Schwartz, a private analytics consultant specialising in juvenile justice and child welfare, also believes AI has the ability to make child welfare programs more effective and judicious, provided it is built on good data and implemented effectively. “They’re not designed to eliminate worker discretion,” Schwartz explains. “It’s really designed to help workers make better decisions.”
In 2017, Schwartz contracted with Broward County to assess whether the current maltreatment substantiation process could be improved by replacing the current tool (what he calls “basically a checklist”) with machine learning tools. His research, which also measured whether families would become involved with child services again, found that 40% of court-referred cases (those that typically involve the removal of a child from the custody of its parents) were inappropriate. It also found that families met with overzealous interventions were ultimately harmed, not helped, by child welfare involvement. Schwartz believes that combining predictive and prescriptive analytics can help reduce some of these errors and, potentially, cure some of the effects of the bias against poor people.
“We found that socioeconomic status was a predictor for child welfare involvement,” says Schwartz, which is not surprising given that the algorithm relied on data from an agency notorious for targeting the poor. But Schwartz believes that adding prescriptive analytics into the picture might be able to put that information to equitable use, rather than simply using it as a criterion to rip poor families apart.
Prescriptive analytics would be able to decide not only whether a family needs intervention, but what type. This could help disperse aid services in a more effective way. Instead of putting a kid into foster care, the system could prescribe subsidised daycare. “A lot of cases of neglect are basically the result of poverty,” says Schwartz. “If those economic and financial issues could be taken care of in a meaningful way, it would drastically reduce the number of people who show up and are identified as being neglectful parents who aren’t.”
I wonder how my case would have been handled if my family risk factors had been viewed through a different lens: “may become re-involved with child services without proper intervention” instead of “may kill her kids”. What if we had been given access to subsidised daycare, timely mental health services, and housing support?
Instead, my husband and I were evicted via court order while still dealing with the mental health and financial problems that led to the opening of the case. A judge eventually ordered us to complete all of those services, but it has taken so long to get referrals to agencies the court will accept that, nine months later, my husband and I have yet to receive any mental health counselling. We have, however, been ordered to pay $200 per month each in child support to the in-laws who had to look after our kids.
We are a textbook example of a family harmed by overzealous child welfare intervention. If Schwartz’s research holds true, then it might not have been the use of AI that caused the harm, but its gross misapplication. Regardless, it’s evident that before any government can safely apply these powerful algorithms, it must first remedy the faults in the underlying system so that poor people, people of colour, and disabled people don’t continue to be the unwitting victims of misdirected AI.