Robo-diagnosis

We made a machine to vacuum for us (and also to transport cat-sharks). What’s next?

For a high school social studies assignment to create a piece of propaganda, some of my classmates drew detailed political cartoons. Some made videos. I drew on a small rock with a Wite-Out pen and handed that in.

Over the years, I have often worried that people think I’m lazy. To be clear, I don’t consider myself lazy, but rather, time-efficient. I would rather put more work into coming up with a creative solution to a problem and then much less incremental work later on than keep plodding along with repetitive actions that could be automated. My rock-paganda isn’t an example of automation, but it did take a bit of out-of-the-box thinking at the outset to come up with an idea that could be very, very easily implemented.

Perhaps a more relevant example would be the phone screening procedures in my old lab. As a research assistant, I was frustrated with the amount of time I spent phone screening potential participants. I would ask people the same questions on the phone over and over, just to learn after twenty-five minutes that they were claustrophobic and couldn’t do an MRI. I would play weeks-long games of phone tag. I would go days at a time without speaking to an individual who qualified and would have to give up time slots on the scanner. This seemed like a remarkably inefficient system, so I created an online version of our phone screen, and one of my coworkers made an Excel template that would code and interpret the responses. Building the website and the template took some time, but afterward we dramatically cut the amount of time we had to spend on the phone and increased our “hit rate” for eligible participants. In the words of Borat, great success.

But how far can we go with automation? Dawes, Faust, and Meehl (1989) persuasively argue for actuarial rather than clinical judgment, including in the realm of psychiatric decisions. They make a strong case that consistently using a formula that differentially weights pieces of information leads to more accurate decision-making than making holistic judgments on the totality of the evidence, even for expert judges. In some cases, clinical decisions and actuarial decisions have about the same success rate, but they argue that clinical decisions are never more valid than actuarial ones. A counterargument would be that any rule has many infrequent but significant exceptions, and that an expert would pick those up better than a formula. Dawes et al. respond that those exceptions could be built into the equation. Rather than making the final decisions themselves, experts might be better employed as observers, scoring the symptoms present and entering them into the equation. The problem with clinical judgment is not that information is weighted inaccurately, but that it is weighted inconsistently, and relying on a formula would mitigate this natural, human tendency to be influenced by our internal states and fluctuations in the environment. I can get behind this argument.
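To make the idea concrete, here is a minimal sketch of what such a formula might look like in code. Everything in it is an assumption for illustration: the symptom names, the weights, the cutoff, and the exception flag are invented, not taken from Dawes et al. or from any validated instrument. The point is only that the expert supplies the ratings and the weighting stays fixed from case to case.

```python
# Hypothetical weights and cutoff, invented purely for illustration.
# They are not clinically meaningful.
WEIGHTS = {"low_mood": 2.0, "sleep_disturbance": 1.0, "anhedonia": 2.5}
CUTOFF = 6.0

# Hypothetical rare exception that overrides the weighted score.
OVERRIDING_EXCEPTIONS = {"symptoms_explained_by_medication"}


def actuarial_score(ratings):
    """Weighted sum of observer ratings; missing symptoms count as 0."""
    return sum(weight * ratings.get(name, 0) for name, weight in WEIGHTS.items())


def actuarial_decision(ratings, exception_flags=frozenset()):
    """Apply the same rule to every case: exceptions first, then the cutoff."""
    if OVERRIDING_EXCEPTIONS & set(exception_flags):
        return False
    return actuarial_score(ratings) >= CUTOFF


# The expert acts as observer and only supplies the ratings (say, 0 to 3).
ratings = {"low_mood": 2, "sleep_disturbance": 1, "anhedonia": 1}
print(actuarial_score(ratings))
print(actuarial_decision(ratings))
print(actuarial_decision(ratings, {"symptoms_explained_by_medication"}))
```

Given the same ratings, this rule returns the same answer on a Monday morning and a Friday afternoon, which is exactly the consistency a tired or distracted human judge cannot guarantee.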

Technology has come a long way since their article came out in 1989, and I wonder how far automation could feasibly go today. What would be the role of experts now? Could decisions about diagnoses and treatments be made solely on the basis of information gathered without experts? Basically, this is the old sci-fi question of whether we have created (or are in the process of creating) smart machines that make humans obsolete. I could imagine a few approaches to automated decision-making. First and simplest would be a computerized, self-reported symptom checklist that auto-scores and outputs a diagnosis. Here, we encounter all the problems with the reliability of self-report and people's limited insight into themselves. Perhaps more interesting would be an algorithm that could process and analyze multimodal information, making use of the recent uptick in biomarker studies. What if we could build a model that integrates psychophysiological and neurological information, along with things like vocal and text markers of pathology, family history, self-reported symptoms collected throughout the day in real life (like ecological momentary assessment, rather than an in-lab questionnaire), and body posture and movement measured by something like a motion-sensing camera? Do we even need experts to observe and code these markers, or can we just do the research and program the machines to measure and interpret them? Or beyond that, could we use machine learning to dynamically build the algorithm from the data acquired by the machines?
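As a toy illustration of that last question, here is a sketch in which the weights are learned from data rather than set by hand. It uses a perceptron, one of the simplest learning rules, which I picked only because it fits in a few lines. The two "markers" and the labels are made-up numbers, not real measurements, and a real system would need far more data, validation on new cases, and a more careful model.

```python
# Made-up toy data: each sample is (marker_a, marker_b), and each label is
# 1 if the (imaginary) criterion was met, else 0.
SAMPLES = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (2, 2)]
LABELS = [0, 0, 0, 1, 1, 1]


def predict(weights, bias, x):
    """Return 1 if the weighted sum of the markers plus bias is positive."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, max_epochs=1000):
    """Learn weights by nudging them after every misclassified sample."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)  # -1, 0, or +1
            if error != 0:
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
                mistakes += 1
        if mistakes == 0:  # every training sample is classified correctly
            break
    return weights, bias


weights, bias = train_perceptron(SAMPLES, LABELS)
print(weights, bias)
print([predict(weights, bias, x) for x in SAMPLES])
```

Here the machine, not the expert, decides how much each marker counts. The catch is that the learned formula is only as good as the data and labels it was trained on, and those labels presumably still come from human judgment somewhere along the line.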

It would be important to know (1) whether the actuarial decision reached by such an algorithm would be superior to clinical judgment, and (2) how the costs (both time and $$$) compare to simpler decision processes. If such a multimodal assessment produces only a marginally better success rate than a short battery of questionnaires and human-rated clinical assessments, it might not be worth automating. But if, in fact, this process could produce substantially better decisions and outcomes, it could be very valuable.
