So, you want to revolutionize healthcare? A meme-fueled primer on FDA regulation.
You have a brilliant machine learning idea, hard-earned technical ability, and the capacity for airtight execution. Nonetheless, a challenge remains: navigating the complex layers of bureaucracy needed to please “the man” and attain permission to share your technology with the world.
Machine learning is clearly the hottest topic in HealthTech — you couldn’t use the restroom at RSNA 2017 without engaging in a discussion about whether A.I. beats radiologists at diagnosing pneumonia — yet most physicians have no understanding of the FDA approval process. The reason for this, of course, is that FDA approval represents the indisputable “Most Boring Topic in the History of Time”:
The fundamental step in transitioning from a research algorithm to a practical clinical tool that helps people (and generates $1 billion) is attaining FDA approval. I wrote this guide both for myself — how else could I summon the willpower to learn these things? — and for the many entrepreneurs, engineers, and physicians with powerful technical ability but no understanding of the regulatory processes that will guide the deployment of that ability. Further, I figure that if John Oliver can make esoteric concepts like civil asset forfeiture seem hilarious, perhaps I can find a way to help you tolerate reading on the FDA.
(A) Basic definitions.
In the FDA approval process, medical devices are evaluated with respect to both potential benefit and potential risk: if the upside is massive, a higher degree of risk is tolerated. Risk is grouped into three categories:
- Class I: Low risk device. Example: toothbrush.
- Class II: Moderate risk device. Example: CT scanner.
- Class III: High risk device. Example: artificial heart valve.
The risk class has significant influence on the type of approval that is required, which ranges from “exempt from premarket review” (booyah!) to “premarket approval (PMA) needed” (you are screwed). The primary approval pathways are as follows:
- Exempt from premarket review. Nearly all Class I devices are exempt from premarket review. Although filling out 250 pages of paperwork documenting the safety profile of a Band-Aid would be a fun exercise, your FDA overlords decided to be merciful. They will, of course, be less cool about your machine learning algorithm.
- 510(k). This is considered “fast-track” approval, and is (generally) the best we can hope for in the setting of interesting medical software. Qualifying for this category has two primary requirements: (1) the risk of the device is Class II (moderate), and (2) a predicate device exists that is similar to the technology in question, with comparable safety and effectiveness. Technology in this category is said to be “evolutionary” rather than “revolutionary” (revolutionary tech lacks a predicate). This approval takes ~4–8 months, requires ~20 hours of effort on the part of the FDA, and costs about $5000.
- Premarket approval (PMA). This is the torturous big brother of the 510(k) process that is required for Class III (high risk) devices. Whereas 510(k) approval takes ~4–8 months, PMA generally requires 3–7 years of rigorous clinical trials and ~1200 hours of work by the FDA (it was ~20 hours for the 510(k)). This is generally the category for fundamentally new, revolutionary technologies that lack a predicate device.
- De Novo. This is a special category for devices that have no similar prior technology but are low to moderate in risk (Class I or II). In this case, the approval process reduces to the more tolerable 510(k) fast-track system, allowing applicants to avoid the horrors of the PMA.
- Humanitarian device exemption. This is a special case in which a medical device is designed to benefit patients with a rare disease (fewer than 8000 cases/year, up from 4000 cases/year before the 21st Century Cures Act passed; more on this later). Given the difficulty of collecting robust datasets in these small populations, the FDA requirements are less stringent. Hey, perhaps you can make your mark by creating the world’s best machine learning algorithm for Wohlfart-Kugelberg-Welander disease?
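If you think better in code than in bureaucratese, the pathway logic above can be roughly sketched as a decision function. This is a toy simplification of the rules as I have described them, not how the FDA actually decides anything; the names `RiskClass` and `likely_pathway` are made up for illustration.

```python
from enum import Enum


class RiskClass(Enum):
    I = 1    # low risk, e.g. toothbrush
    II = 2   # moderate risk, e.g. CT scanner
    III = 3  # high risk, e.g. artificial heart valve


def likely_pathway(risk, has_predicate, rare_disease=False):
    """Guess the approval pathway from the simplified rules in this post.

    Hypothetical helper: real determinations depend on far more than
    these three inputs.
    """
    if rare_disease:
        # Rare-disease devices get the less stringent humanitarian route.
        return "Humanitarian device (rare disease)"
    if risk is RiskClass.III:
        # High risk means the full premarket approval process.
        return "PMA"
    if not has_predicate:
        # Novel but low-to-moderate risk: De Novo.
        return "De Novo"
    if risk is RiskClass.I:
        # Nearly all Class I devices skip premarket review.
        return "Exempt from premarket review"
    # Class II with a comparable predicate device.
    return "510(k)"


if __name__ == "__main__":
    print(likely_pathway(RiskClass.II, has_predicate=True))
    print(likely_pathway(RiskClass.II, has_predicate=False))
    print(likely_pathway(RiskClass.III, has_predicate=False))
```

The ordering of the checks is the design choice worth noticing: risk class trumps novelty, which is why a novel Class III device lands in PMA rather than De Novo.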
Presumably you are falling asleep at this point, so I will break things up with a delightful video of a man protecting his dog by fighting a kangaroo:
Moving on. Another important concept is differentiating between the two types of automated image interpretation algorithm: computer aided detection (CADe) and computer aided diagnosis (CADx). These categories face markedly different levels of FDA scrutiny.
In computer aided detection (CADe), an algorithm points out a particular finding — flags it — but does not make a specific diagnosis. Think traditional CAD in diagnostic breast imaging, which existed well before we all started obsessing over machine learning: it places a marker over a potential tumor, but it never comments on whether cancer is present.
Computer aided diagnosis (CADx), alternatively, goes beyond merely flagging an abnormality: it specifies disease type, severity, stage, prognosis, or suggested treatment. If you point out the existence of a breast opacity it is CADe, but if you specify the probability of malignancy it becomes CADx. Attaining FDA approval for CADx tends to be considerably more difficult, as the risk of harm associated with making a particular diagnosis is higher. These technologies are generally placed in risk Class III and require the laborious PMA process.
There was, however, a recent massive development in the way these (boring) issues are managed. The FDA granted de novo approval (you totally know what that means!) to the Quantitative Insights QuantX software, a form of breast cancer CADx software. Specifically, the FDA determined that this CADx technology is sufficiently low risk to be considered Class II and undergo the 510(k) approval process.
Why should you care about this incredibly dry, technical nonsense? You should care because this acceptance establishes an entire new category of algorithm that can undergo the easier 510(k) approval: “radiological computer-assisted diagnostic (CADx) software for lesions suspicious for cancer.” This applies to all types of imaging and all types of cancer, making it much easier to get FDA approval for this large category of machine learning algorithm.
Perhaps you should start focusing your machine learning efforts on cancer?
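To make the CADe/CADx distinction from this section concrete, here is a minimal sketch that sorts an algorithm into one bucket or the other based on what it outputs. The field names (`marker_location`, `malignancy_probability`, and so on) are hypothetical labels I invented; the FDA does not publish a list like this.

```python
# Output types this post associates with diagnosis rather than detection.
# These field names are illustrative assumptions, not regulatory terms.
CADX_OUTPUTS = {
    "disease_type",
    "severity",
    "stage",
    "prognosis",
    "suggested_treatment",
    "malignancy_probability",
}


def cad_category(output_fields):
    """Return 'CADx' if any output makes a diagnostic claim, else 'CADe'."""
    return "CADx" if CADX_OUTPUTS & set(output_fields) else "CADe"


if __name__ == "__main__":
    # Flagging an opacity only: detection.
    print(cad_category({"marker_location"}))
    # Flagging it and scoring malignancy: diagnosis.
    print(cad_category({"marker_location", "malignancy_probability"}))
```

Note that a single diagnostic output is enough to tip the whole algorithm into CADx, which matches the breast opacity example above.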
(B) Case study of current successes (protip: copy these guys!)
Although many companies are applying machine learning to healthcare, few have attained FDA clearance. To summarize the current successes:
- Arterys Cardio DL. This was the first company to get FDA approval for a medical algorithm using deep learning. Their technique uses artificial intelligence to expedite the post-processing of cardiac MRI, obviating the need for a human to calculate values like stroke volume, flow velocity, cardiac output, etc. They attained 510(k) clearance using both Medis Imaging Qmass and Arterys v2.0 as predicate devices. Although technical specifics of their new technology are unclear, they presumably used deep learning to create a superior form of cardiac MRI post-processing that was at minimum equivalent to prior techniques in quality and safety. Since they focus on expediting low-level work rather than making a particular diagnosis, the level of perceived risk by the FDA was probably relatively low. It is unclear how this approval generalizes to other technologies.
- Quantitative Insights QuantX Advanced. This is the previously discussed CADx technology awarded de novo FDA approval for the evaluation of breast cancer. Their software offers both a large database of pathology-proven tumors to compare with new cases, and a “QI score” that synthesizes multiple tumor characteristics into a metric that helps evaluate likelihood of malignancy. In data submitted to the FDA, this technology reduced the rate of missed breast cancer by 39%.
- Hologic Quantra 2.2 Breast Density Assessment Software. This software uses machine learning to quantify breast density more systematically than the current clinical standard, which consists largely of scratching one’s head and proclaiming, “eh, looks like probably a 2?” This technology was given 510(k) clearance, using Quantra 2.1 as a predicate device and offering improved density assessment relative to this predecessor. The Quantra technology can ultimately be traced back to a predicate device from 2005, the Sectra IDS5 Workstation, highlighting the fact that this sort of thing has been around for quite some time.
- AliveCor Kardia Band. This company makes a band for the Apple Watch that collects EKG data. They got 510(k) approval, using their prior AliveCor Heart Monitor EKG for mobile as a predicate. Machine learning is involved because the EKG incorporates “SmartRhythm” technology, which uses artificial intelligence to analyze Apple Watch data and determine when a given heart rate is abnormal for a given level of activity — think a heart rate of 150 when the patient is lazing around. When pulse and activity are discordant, the user is prompted to press a button to collect EKG data and assess for rhythm abnormalities.
(C) The FDA approval landscape is changing dramatically.
Interest in healthcare information technology is currently exploding, and the number of companies wanting FDA approval for software is increasing dramatically (note that software development generally takes months, whereas medical device creation previously took years). Further, as new technologies develop it can be unclear how they fit into the context of preexisting FDA guidelines created around the time of President Nixon.
Clearly, the FDA needs to make some changes:
The FDA, by necessity, is focused on streamlining and clarifying the approval process for digital health technologies. How are they approaching this?
(i) The 21st Century Cures Act.
The 21st Century Cures Act, which went into effect in December of 2016, did a bunch of things, from making informed consent less cumbersome to treating the opioid epidemic to selling barrels of crude oil to generate funding for the NIH. It was ~1000 pages in length, and clearly zero human beings actually read it.
For our purposes it is important because one of its innumerable focuses was making the FDA approval process faster and easier. Further, it is important because it is generally the basis for the FDA statements that generate massive excitement among my colleagues and me on “nerdy radiology Twitter”:
To summarize aspects of the act that matter to us:
- It provides further clarity regarding types of software that do and do not require FDA approval. Various categories are identified as not requiring approval, including “wellness” software encouraging a healthy lifestyle, certain types of electronic medical record (EMR), and, most interestingly, particular types of clinical decision support (CDS). As further clarified in a recent FDA draft guidance, CDS software does not require approval when four criteria are met: (1) it does not acquire/process/analyze medical images or certain other medical signals, (2) it displays/analyzes medical information, (3) it provides healthcare professionals with support regarding prevention/diagnosis/treatment, and (4) healthcare professionals can review the basis for the recommendations and do not rely on them primarily when making a medical decision. That is, clinicians are efficiently pointed towards information they could have more laboriously found elsewhere. This sort of clarity regarding software that can avoid FDA scrutiny is huge.
- The Cures Act also requires the FDA to develop streamlined approaches to the approval of “breakthrough devices,” such as technologies that provide new treatments or diagnostic tools for life-threatening conditions.
- The Cures Act emphasizes that the FDA must evaluate technologies for equivalence to a predicate in the “least burdensome” way possible, encouraging this with various rules such as mandatory “how to not be a burden” training for FDA employees. This sounds sweet but is no doubt of fuzzy meaning/significance.
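The four CDS criteria from the first bullet in this list boil down to a checklist in which every box must be ticked. A minimal sketch, assuming you can honestly answer four yes/no questions about your software; the class and field names are my own invention, and the real guidance has plenty of nuance this ignores.

```python
from dataclasses import dataclass


@dataclass
class CdsSoftware:
    # Hypothetical yes/no answers, one per criterion described in this post.
    processes_images_or_signals: bool      # criterion 1 (must be False)
    displays_or_analyzes_medical_info: bool  # criterion 2
    supports_clinician_decisions: bool     # criterion 3
    basis_is_reviewable: bool              # criterion 4


def likely_avoids_fda_review(sw):
    """True only when all four criteria from the post are satisfied."""
    return (
        not sw.processes_images_or_signals
        and sw.displays_or_analyzes_medical_info
        and sw.supports_clinician_decisions
        and sw.basis_is_reviewable
    )


if __name__ == "__main__":
    guideline_lookup = CdsSoftware(False, True, True, True)
    chest_xray_reader = CdsSoftware(True, True, True, False)
    print(likely_avoids_fda_review(guideline_lookup))
    print(likely_avoids_fda_review(chest_xray_reader))
```

The first criterion is the one that bites most machine learning projects in radiology: the moment your software analyzes a medical image, the checklist fails regardless of the other three answers.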
(ii) Establishing an FDA-approval fast track.
Do you like systems offering large companies significant advantages over small upstarts? Do you enjoy the TSA and their approach to customer service? Well then, you will love the new FDA Pre-Cert program!
The basic logic of this pre-certification program is that some companies have a track record of excellence, and the likelihood that they will fail the FDA approval process is low. What is the probability that Google will mess it up? We should thus, the logic follows, offer them less time-consuming and laborious FDA approval after they complete a pre-certification process (similar to TSA PreCheck) establishing high quality software design/validation processes.
This approach is currently being evaluated in a pilot study with the likes of Apple, Fitbit, Google’s Verily, and even a few smaller companies like the adorable-sounding Pear Therapeutics. If the pilot is successful, this approach might be critical in helping the FDA contend with a large application volume and more rapidly make decisions.
(iii) Work towards solving the issue of dynamic technologies.
Applications of machine learning to healthcare have a significant problem: they require FDA approval, yet their power depends on continually consuming data to become progressively smarter and better. This raises the question of when, exactly, an algorithm that continues to change needs to undergo additional FDA evaluation. If the consumed data is low in quality, algorithm performance could suffer and patients could be harmed: garbage in, garbage out. Furthermore, high-quality test sets are hard to come by, yet significant problems arise if a changing algorithm is continually evaluated with respect to the same test set (the test set effectively becomes part of the training).
These are problems that do not currently have solutions, but the FDA is thinking about them. They will likely be solved using the Pre-Cert process discussed above, where certain companies will attain a trusted status and earn the ability to update an algorithm without constant FDA scrutiny. I wish you the best of luck in convincing the FDA that you are awesome.
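If the reused test set problem sounds abstract, the toy simulation below may help. It is a sketch under strong assumptions, not a model of any real product: the labels are coin flips, every "algorithm version" is a random guesser, and the names (`make_dataset`, `run_demo`, etc.) are invented. Picking the best-scoring version on one fixed test set still produces a flattering number that evaporates on fresh data.

```python
import random

P = 2_147_483_647  # large prime used by the toy "models" below


def make_dataset(rng, n):
    """Random features with coin-flip labels: there is nothing real to learn."""
    return [(rng.randrange(P), rng.randrange(2)) for _ in range(n)]


def predict(model, x):
    """A toy 'model' is just a pair (a, b) that hashes x to a 0/1 guess."""
    a, b = model
    return ((a * x + b) % P) % 2


def accuracy(model, dataset):
    return sum(predict(model, x) == y for x, y in dataset) / len(dataset)


def run_demo(seed=0, n_versions=500, n_test=200, n_fresh=2000):
    rng = random.Random(seed)
    reused_test_set = make_dataset(rng, n_test)
    fresh_test_set = make_dataset(rng, n_fresh)
    # Each "update" of the algorithm is a new random guesser.
    versions = [(rng.randrange(1, P), rng.randrange(P)) for _ in range(n_versions)]
    # Keep whichever version looks best on the same, endlessly reused test set.
    best = max(versions, key=lambda m: accuracy(m, reused_test_set))
    return accuracy(best, reused_test_set), accuracy(best, fresh_test_set)


if __name__ == "__main__":
    reused, fresh = run_demo()
    print(f"best version on the reused test set: {reused:.2f}")
    print(f"same version on a fresh test set:    {fresh:.2f}")
```

Every version here has a true accuracy of 50%, so any score above that on the reused set is pure selection effect. That is the sense in which the test set "effectively becomes part of the training."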
These are the key changes that will shape the FDA approval process in the coming years. For additional insight into the future, consider spending a few decades perusing the mammoth 21st Century Cures Act.
Congratulations, you are now a verifiable expert on FDA approval, capable of tackling entire PMA applications in like 3 hours. You probably deserve a fancy certificate you can display on your LinkedIn, but it will take me some time to work out the details. Thanks for reading.