Accordion APIs How-To series, part IV
How does the Accordion Risk Surface Engine (Arise) determine whether a member is likely to have a disease, Parkinson’s disease for example?
In our last post, we demonstrated how our Risk Profiler API /api/suspect-hcc can help you identify missing or hidden diagnosis codes that were not properly documented. Given a member’s existing diagnoses, medications and procedures, this API tells you which conditions (Hierarchical Condition Categories, also known as HCCs) may be missing and what their corresponding ICD diagnosis codes are.
At this point, some people may wonder: “On what basis does the Risk Profiler API make these suggestions?”
The answer is machine learning! At Accordion, we developed advanced machine learning algorithms to make data-driven suggestions. In this post, we go deeper into how these algorithms (Arise) work, using Parkinson’s disease as an example.
Before talking about how Arise works, let’s first do some background research on Parkinson’s disease!
Parkinson’s disease (PD) is a chronic, degenerative neurological disorder that affects 1 in 100 people over age 60, with males at higher risk than females. The symptoms generally come on slowly over time. In the early stage of PD, the most common symptoms are shaking, rigidity, slowness of movement and difficulty with walking (the so-called motor symptoms). Later in the disease, mental health and behavioral problems, as well as dementia, may also occur. Depression and anxiety are observed in more than one third of people with PD.
Recent research indicates that at least one million people in the U.S., and more than five million worldwide, have PD, making it the second most common neurodegenerative disorder after Alzheimer’s disease. Although it is believed that both genetic and environmental factors influence PD, the exact cause remains unknown, and there is no scientifically validated preventive course of action to reduce the risk of its onset.
According to Wikipedia, there is no cure for PD, but medications, surgery, and multidisciplinary management can provide relief from the symptoms. The main families of drugs for treating motor symptoms are levodopa, dopamine agonists and monoamine oxidase inhibitors (especially MAO-B inhibitors). When medications are not enough to control symptoms, surgery and deep brain stimulation can be of use. In the final stages of the disease, palliative care is usually provided to improve quality of life.
As for the diagnosis of PD, medical organizations have created diagnostic criteria to ease and standardize the diagnostic process. A physician can use these criteria in combination with the patient’s medical history, a neurological examination, and their own professional knowledge and judgement to diagnose Parkinson’s disease.
Accordion Risk Surface Engine
Now that we have some basic knowledge of PD, let’s return to Arise.
One major building block of Arise can be thought of as a decision tree (more details here and here), an easy-to-interpret and widely used prediction model in the machine learning and statistics communities. As its name implies, a decision tree model uses a tree-like structure to mimic the human decision-making process.
In a decision tree, each internal node represents a logical test on an attribute, and its outgoing branches are the possible outcomes of that test. For a new data point, you start from the root node and follow the branch that matches each test result until you reach a leaf node, where you find a decision (or prediction).
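The traversal described above can be sketched in a few lines of Python. This is a minimal illustration and not the Arise implementation: the `Node`, `Leaf` and `predict` names, as well as the toy attributes `x` and `y`, are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    """A leaf node holds the final decision (or prediction)."""
    decision: str


@dataclass
class Node:
    """An internal node holds a logical test and one branch per outcome."""
    test: Callable[[dict], bool]
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]


def predict(tree: Union[Node, Leaf], point: dict) -> str:
    """Start at the root and follow branches until a leaf is reached."""
    current = tree
    while isinstance(current, Node):
        current = current.if_true if current.test(point) else current.if_false
    return current.decision


# Hypothetical toy tree: two tests, three leaves.
toy_tree = Node(
    test=lambda p: p["x"] > 5,
    if_true=Leaf("A"),
    if_false=Node(
        test=lambda p: p["y"] > 0,
        if_true=Leaf("B"),
        if_false=Leaf("C"),
    ),
)

decision = predict(toy_tree, {"x": 1, "y": 2})
```

Each call to `predict` evaluates only the tests along one root-to-leaf path, which is what makes a single tree easy to read and explain.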
In this Parkinson’s disease example, each data point that goes through the decision tree represents a member. For simplicity, we will assume only two kinds of leaf nodes (note that this is a simplified example of the actual Arise engine); one indicates that the member is likely to be diagnosed with PD, the other indicates otherwise. To give you a more concrete example, we pulled out one of the models that Arise automatically learned from our data, and created the simplified decision tree below (Model #1).
As a sanity check before we proceed, let’s try to relate these nodes to PD. Among these nodes, we can see “Aromatic Amino Acid Decarboxylation Inhibitor” (so-called levodopa), “Nonergot Dopamine Agonist” and some keywords/keyphrases such as “dementia”, “abnormalities of gait and mobility”, “lack of coordination” and “tremor”. From a logical perspective, these test nodes match the symptom and treatment descriptions mentioned in the previous section.
Let’s say a member has taken “Aromatic Amino Acid Decarboxylation Inhibitor” and “Nonergot Dopamine Agonist” according to the pharmacy claims. This member has also been diagnosed with “G25.81 - Restless Leg Syndrome”, but has no other diagnoses recorded (not even “G20 - Parkinson’s Disease”!). Starting from the root of Model #1, we will first take the “true” branch after passing the first node. At the second node “Member has been diagnosed with G31.83 - Dementia with Lewy Bodies”, we will take the “false” branch and reach our third node “Member has taken Nonergot Dopamine Agonist”. At the third node, we’ll take the “true” branch and come to our decision “Parkinson’s disease diagnosis likely”.
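The path this member takes through Model #1 can be written as plain conditionals. The sketch below covers only the branches followed in the walkthrough. It assumes the first node tests for “Aromatic Amino Acid Decarboxylation Inhibitor”, which the text does not state explicitly, and it returns `None` for every branch the walkthrough does not describe. The function and variable names are hypothetical.

```python
from typing import Optional, Set

LIKELY = "Parkinson's disease diagnosis likely"


def model_1_path(rx_classes: Set[str], dx_codes: Set[str]) -> Optional[str]:
    """Follow the walkthrough's path through Model #1.

    Returns None on any branch that the post does not describe.
    """
    # Node 1. ASSUMPTION: the post does not name the root test; this drug
    # class is a guess based on the member taking the "true" branch.
    if "Aromatic Amino Acid Decarboxylation Inhibitor" not in rx_classes:
        return None

    # Node 2: "Member has been diagnosed with G31.83 - Dementia with Lewy Bodies".
    if "G31.83" in dx_codes:
        return None

    # Node 3: "Member has taken Nonergot Dopamine Agonist".
    if "Nonergot Dopamine Agonist" in rx_classes:
        return LIKELY
    return None


member_rx = {
    "Aromatic Amino Acid Decarboxylation Inhibitor",
    "Nonergot Dopamine Agonist",
}
member_dx = {"G25.81"}  # Restless legs syndrome; no G20 on record

result = model_1_path(member_rx, member_dx)
```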
So this is how a decision tree determines whether a member is likely to have PD. Does this decision process remind you of something? Yes, it resembles the way a doctor reasons.
However, Arise doesn’t make such predictions based on only ONE decision tree. Instead, we use many decision trees. What’s more, we combine gradient boosting, an advanced machine learning technique, with deep learning to aggregate all the decisions. Long story short, an ensemble of decision trees and deep learning models makes the final data-driven decision.
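To give a rough feel for how gradient boosting aggregates many trees, here is a small sketch of the standard recipe for binary classification: every tree contributes a real-valued score, the scores are summed (scaled by a learning rate), and a sigmoid turns the sum into a probability. The function name, scores and learning rate are made-up values for illustration. How Arise combines this with deep learning is not described in this post, so that part is not modeled.

```python
import math
from typing import Sequence


def boosted_probability(
    tree_scores: Sequence[float],
    base_score: float = 0.0,
    learning_rate: float = 1.0,
) -> float:
    """Sum the per-tree scores and squash the total with a sigmoid."""
    raw = base_score + learning_rate * sum(tree_scores)
    return 1.0 / (1.0 + math.exp(-raw))


# Hypothetical scores from three trees for one member:
# two trees lean towards "PD likely", one leans slightly against it.
example_scores = [0.8, 0.6, -0.2]
probability = boosted_probability(example_scores, learning_rate=0.5)
```

With no trees at all the raw score is zero and the probability is exactly 0.5; each additional tree nudges the total up or down.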
The diagram below (Model #96) shows another decision tree in Arise. As a second sanity check, nodes in this tree have key attributes such as “restless leg syndrome”, “extrapyramidal and movement disorders” and “Parkinson’s disease diagnosis reviewed”, which can all be related to PD in a logical way.
Now, let us continue with the member information we used above. Based on this member’s medical record, we’ll take three consecutive “true” branches and finally end up in the leaf “Parkinson’s disease diagnosis likely”.
Next, let us treat these two trees as different decision-making processes used by two doctors, say Dr. John Doe and Dr. Jane Roe. For the example member above, both doctors agree that the member may have PD, although their reasoning processes were quite different (note that in other cases the two doctors may disagree). Things get more interesting when we have not just two doctors, but hundreds or thousands of doctors with more complicated decision rules.
That is the magic of the ensemble inside Arise. This technique takes every model’s result into consideration, where each model differs from the others in two ways: it uses different test nodes and a different tree structure. These differences enable Arise to tackle the problem from different angles and to make predictions that stay robust given the great variation among data points (or members).
To sum up, Arise determines whether a member is likely to have PD using hundreds of base models, where each base model is like a virtual doctor with its own testing rules and diagnostic logic (unlike a doctor’s, these rules and logic are generated purely from data). The final decision for each member is then based on the “votes” of these “doctors”.
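The “voting” picture can be sketched as a simple majority vote over base models. This is only an illustration of the metaphor: Arise aggregates its models with gradient boosting and deep learning rather than a plain vote, and the three “doctors” and the member attributes below are hypothetical.

```python
from typing import Callable, Mapping, Sequence

Member = Mapping[str, bool]
BaseModel = Callable[[Member], bool]


def majority_vote(models: Sequence[BaseModel], member: Member) -> bool:
    """Return True when more than half of the base models vote 'PD likely'."""
    votes = [model(member) for model in models]
    return sum(votes) > len(votes) / 2


# Three hypothetical "virtual doctors", each with its own rule.
def dr_doe(member: Member) -> bool:
    return member["took_decarboxylation_inhibitor"] and member["took_dopamine_agonist"]


def dr_roe(member: Member) -> bool:
    return member["has_restless_legs_dx"]


def dr_poe(member: Member) -> bool:
    return member["has_tremor_dx"]


example_member = {
    "took_decarboxylation_inhibitor": True,
    "took_dopamine_agonist": True,
    "has_restless_legs_dx": True,
    "has_tremor_dx": False,
}

pd_likely = majority_vote([dr_doe, dr_roe, dr_poe], example_member)
```

Here two of the three doctors vote “likely”, so the ensemble does too, even though one doctor disagrees.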
Arise in action
At this point, you should have a basic idea of what’s happening behind the scenes. Back to our Risk Profiler APIs: let’s send a POST /api/suspect-hcc request with the member information we’ve been using in this post.
POST /api/suspect-hcc HTTP/1.1
Authorization: Bearer JSON.WEB.TOKEN
"ndcCodes": ["00074301201", "00179008301"]
A successful response would look like the following.
HTTP/1.1 200 OK
"[CC78] Parkinson's and Huntington's Diseases"
"cc": "[CC78] Parkinson's and Huntington's Diseases",
"dxDesc": "Parkinsons disease",
"varname": "[Rx] Aromatic Amino Acid Decarboxylation Inhibitor"
"varname": "[Rx] Nonergot Dopamine Agonist"
"varname": "[Dx] X-G2581: Restless legs syndrome"
In today’s post, we presented a case study that shows how our machine learning engine Arise determines whether a member is likely to have Parkinson’s disease. We started with a brief background on Parkinson’s disease, introduced the mechanism of Arise, and then concluded the post with a POST /api/suspect-hcc API call.
In addition to helping you understand the algorithm behind our APIs, we also hope that this post brings Parkinson’s disease to your attention. As pointed out by Jankovic in 2008, the progress of the illness over time may reveal that it is not Parkinson’s disease, and therefore the diagnosis should be reviewed periodically.
Sveinbjornsdottir, S. (11 July 2016). “The clinical symptoms of Parkinson’s disease”. Journal of Neurochemistry. Volume 139, Issue S1: 318–324.
Yao, S.C.; Hart, A.D.; Terzella, M.J. (May 2013). “An evidence-based osteopathic approach to Parkinson disease”. Osteopathic Family Physician. Volume 5, Issue 3: 96–101.
de Lau, L.M.; Breteler, M.M. (June 2006). “Epidemiology of Parkinson’s disease”. Lancet Neurology. Volume 5, Issue 6: 525–535.
The National Collaborating Centre for Chronic Conditions, ed. (2006). “Symptomatic pharmacological therapy in Parkinson’s disease”. Parkinson’s Disease. London: Royal College of Physicians. pp. 59–100.
Bronstein, M.; et al. (February 2011). “Deep brain stimulation for Parkinson disease: an expert consensus and review of key issues”. Archives of Neurology. Volume 68, Issue 2: 165.
The National Collaborating Centre for Chronic Conditions, ed. (2006). “Palliative care in Parkinson’s disease”. Parkinson’s Disease. London: Royal College of Physicians. pp. 147–151.
Jankovic, J. (April 2008). “Parkinson’s disease: clinical features and diagnosis”. Journal of Neurology, Neurosurgery, and Psychiatry. Volume 79, Issue 4: 368–376.