Skyrocketing Technological Innovations Foster Accurate Diagnosis of Parkinson’s Disease (using sound recordings)

AMTHUL MUQHEET
7 min read · Dec 17, 2023


One of the most exciting frontiers in modern technology is the rapid advancement of artificial intelligence (AI).

The Growing Influence of AI

AI is no longer a concept confined to science fiction movies; it has rapidly become an integral part of our daily lives. Its influence can be seen in industries such as healthcare, finance, retail, transportation, and even entertainment. As more data becomes available and computing power increases, the potential for AI seems boundless. For example, machine learning algorithms are being used to help diagnose diseases such as Parkinson’s disease and cancer more accurately and quickly than before.

Parkinson’s disease (PD) is a neurodegenerative disease caused by the death of dopamine-generating neurons. Dopamine is a chemical that controls physical movement by transmitting messages between the substantia nigra and other parts of the brain, thereby enabling coordinated movement. When 60–80% of the dopamine-producing cells are lost, the amount of dopamine plummets and the individual’s movement control is impaired, which marks the onset of PD symptoms. The motor symptoms include rigidity (inflexibility of limbs and trunk), tremors (shaking of jaws, hands, legs, and arms), slow movement, and poor balance. Nonmotor features of PD include dementia, depression, restless legs, temperature sensitivity, and digestion problems.

Although PD is still incurable, a few treatment options for patients with motor and nonmotor symptoms have been developed. These options include noninvasive (drug-based) and invasive (surgical) methods. The medications are used to block nerve impulses to control the motor system, but they have side effects. It therefore makes sense to work on detecting PD at an early stage, which is possible through speech or voice analysis. Voice disorder testing is a useful and noninvasive option for the early detection of PD, since approximately 90% of PD patients have dysphonia (vocal impairment), which differentiates them from healthy individuals. Diagnosing PD through voice disorders is thus one of the emerging and effective methods. Affecting approximately 10 million people worldwide, PD is the second most common neurodegenerative disorder after Alzheimer’s disease.

The speech production system is composed of an assortment of soft-tissue components. Although parts of a speech signal might seem stationary, there are always small fluctuations in it. Variations in signal frequency and amplitude are called jitter and shimmer, respectively. Jitter and shimmer are acoustic characteristics of voice signals, and they are caused by irregular vocal fold vibration. They are perceived as roughness, breathiness, or hoarseness in a speaker’s voice. Any natural speech contains some level of jitter and shimmer, but measuring them is a common way to detect voice pathologies. Personal habits such as smoking or alcohol consumption might increase the level of jitter and shimmer in voice. However, many other factors can have an effect as well, such as loudness of voice, language, or gender.

There are several different ways to measure jitter and shimmer. For instance, when detecting voice disorders, they are measured as percentages of the average period (for jitter) or of the average amplitude (for shimmer), where values above certain thresholds are potentially related to pathological voices. Jitter and shimmer are most clearly detected from long, sustained vowels.

A commonly used jitter value is the absolute jitter. This measure expresses the average absolute difference between consecutive periods.

When this is divided by the average period, another common measure, relative jitter, is obtained.
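The two jitter measures described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the glottal periods have already been extracted from a sustained vowel; the function names and the sample values are hypothetical and are not taken from the dataset or from PredictEasy.

```python
from statistics import mean


def absolute_jitter(periods):
    """Average absolute difference between consecutive periods.

    The result is in the same unit as the input (e.g. seconds).
    """
    if len(periods) < 2:
        raise ValueError("at least two periods are required")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    return mean(diffs)


def relative_jitter(periods):
    """Absolute jitter divided by the average period, as a percentage."""
    return 100.0 * absolute_jitter(periods) / mean(periods)


# Hypothetical period lengths in seconds (illustrative values only).
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]
print("absolute jitter (s):", absolute_jitter(periods))
print("relative jitter (%):", relative_jitter(periods))
```

A perfectly regular signal (all periods equal) gives a jitter of zero under both measures; the more the consecutive periods differ, the larger both values become.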

An intriguing question is how to carry out this detection process effectively and in the least possible time without sacrificing accuracy. One option is CleverInsight’s AI platform, PredictEasy. As a first step, we choose a relevant dataset to test the potential of AI for this task, combined with the ease of use of this platform.

Dataset chosen: “Parkinson speech dataset with multiple types of sound recordings”

This is a classification problem. The inferences drawn from the dataset lead to the outcome, and the accuracy achieved is an indicator of the reliability of this new and innovative platform. Let us analyze to predict.

A high-dimensional dataset such as this one tends to pose problems, the most common being overfitting, which reduces the model’s ability to generalize beyond the training set. We can therefore reduce the number of features to fine-tune the analysis, but only if domain knowledge makes us sure which features have no impact. In this speech dataset, the variance is most clearly exhibited by the feature jitter_local_. This can be confirmed from the Feature Rank and the SHAP values.

The performance of the classification model can be read from the Confusion Matrix.

This matrix is used to compute the evaluation metrics that indicate the predictive model’s performance: accuracy, precision, recall, and F1 score.

The predictive model achieved an accuracy, precision, recall, and F1 score of 1.00 each, indicating that it performs exceptionally well in predicting the target variable PD. The matrix further shows no Type 1 errors (false positives) and no Type 2 errors (false negatives).
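The four metrics follow directly from the four cells of a binary confusion matrix. The sketch below shows the standard formulas; the function name and the counts are hypothetical and are not the actual figures from the PredictEasy run.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are the counts of true positives, false positives
    (Type 1 errors), false negatives (Type 2 errors) and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts: no false positives and no false negatives,
# so every metric evaluates to 1.0.
print(classification_metrics(tp=40, fp=0, fn=0, tn=40))

# Hypothetical counts with a few errors, for comparison.
print(classification_metrics(tp=8, fp=2, fn=1, tn=9))
```

When both error cells are zero, as in the first call, all four metrics equal 1.0, which is the situation reported above.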

Deeper insights can be drawn from the feature with the strongest influence on the model’s predictions, namely jitter_local_. The Pairwise Grid feature helps find correlations among the associated variables, supporting further statistical inferences.

One can relate all associated features to the most significant attribute, “jitter_local_absolute”, to draw inferences that support the detection of PD.

What if we want to observe the correlations between all features at once? The answer is a heatmap, or correlation matrix, which is an old yet powerful plotting method.

The gradation of shades shows how strongly each pair of features is correlated: the shades at the top of the scale mark the highly correlated pairs, followed by mid-range areas and a few pairs at the lowest end of the scale.
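The numbers behind such a heatmap are pairwise correlation coefficients. Below is a minimal sketch that computes a Pearson correlation matrix in plain Python; the column names and values are made up for illustration, and it is an assumption that the platform uses Pearson correlation rather than another coefficient.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {
        a: {b: pearson(columns[a], columns[b]) for b in names}
        for a in names
    }


# Hypothetical feature columns (illustrative values only).
features = {
    "jitter": [0.4, 0.6, 0.9, 1.3, 1.8],
    "shimmer": [2.1, 2.9, 4.2, 5.8, 8.1],
    "pitch": [180.0, 150.0, 165.0, 140.0, 155.0],
}
for name, row in correlation_matrix(features).items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Each cell of the heatmap corresponds to one entry of this matrix: values close to 1 or -1 indicate strongly correlated pairs, and values near 0 indicate little linear relationship.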

In machine learning, the performance of a binary classifier is commonly assessed with the ROC curve (receiver operating characteristic) and the area under it, the AUC.

In general, an AUC of 0.9 or greater is considered outstanding discrimination (i.e., the ability to distinguish patients with Parkinson’s disease from those without it).
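The AUC can be read as the probability that a randomly chosen patient with PD receives a higher model score than a randomly chosen healthy individual. The sketch below computes it by comparing every positive/negative pair; the function name, labels, and scores are hypothetical, and the pairwise approach is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (PD), 0 for the negative class.
    scores: the model's predicted score for the positive class.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and scores (illustrative values only).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print("AUC:", roc_auc(labels, scores))  # 3 of the 4 pairs are ranked correctly: 0.75
```

An AUC of 1.0 means every patient with PD is scored above every healthy individual, while 0.5 corresponds to ranking no better than chance.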

Thus, based on the analysis, it is recommended to focus on the feature jitter_local_absolute for further research and exploration. This feature shows the highest predictive power for PD and can potentially provide valuable insights into the disease, as listed below:

The “What-If” feature of PredictEasy provides a real-time interface for changing the inputs of a model and viewing the output. It shows predictions, confidence, and explanations for those inputs, as can be seen below:

In summary, the most important feature for predicting Parkinson’s disease is jitter_local_absolute, with a score of 1.0. This feature has the highest predictive power and should be given special attention. The other features have scores of 0.0, indicating that they do not contribute significantly to the prediction.

Machine learning techniques help predict the likelihood of Parkinson’s disease, so that early detection can facilitate treatment and help patients maintain their quality of life. The aging of the population all over the world underlines the need to detect PD early, remotely, and accurately.
