The 3 P’s: Prediction for Prevention with PredictEasy, a probe into the detection of lung cancer

AMTHUL MUQHEET
5 min read · Dec 15, 2023


The lungs are primary organs that play a pivotal role in the respiratory system and are responsible for the exchange of oxygen and carbon dioxide in the human body.

Lungs help in sustaining life by:

  • Facilitating the essential process of respiration.

  • Eliminating carbon dioxide, thus maintaining the optimal balance of oxygen and carbon dioxide in the bloodstream.

  • Maintaining the body’s acid-base balance to prevent shifts in blood pH that could disrupt normal physiological functions.

  • Serving as a crucial defense mechanism against harmful foreign particles.

This is why understanding healthy lung function matters: it is essential for maintaining overall health and sustaining life. One of the most serious threats to it is lung cancer, which makes it important to ask how accurately we can detect its occurrence from the features we suspect may lead to it. One naturally arrives at the 3 A’s: “Alert, Aware to Avoid”.

Do technological advancements contribute to this in any way?

Lung cancer is one of the most common cancers. Making it easier to lower its risk is one of the major benefits of this era of data analytics and artificial intelligence, with which the future of healthcare services is poised for massive change. AI can simplify the lives of patients, medical professionals, and hospital administrators by carrying out tasks typically done by humans, in less time and at a fraction of the cost. Technology thus has the potential to significantly improve India’s health infrastructure by increasing access to care, improving quality and efficiency, decreasing medical errors, and, most importantly, making use of predictive and prescriptive analytics.

PredictEasy is a no-code, AI-based data analytics platform that provides a range of analytical tools to help users analyze and make sense of their data. Predicting lung cancer early, based on factors that may lead to the disease, can be life-saving, and this can be done using PredictEasy. Getting started with PredictEasy is hassle-free, and actionable insights are provided in a matter of seconds. First-time users can refer to Elsa Saji’s blog post, linked below, for tips on exploring the tool.

https://medium.com/@elsasaji02/mastering-the-supply-chain-optimizing-back-order-shipment-prediction-through-feature-engineering-1f55b7e11627

Dataset: We demonstrate this by analyzing the inferences we can draw from an existing lung cancer dataset processed through PredictEasy, arriving at actionable insights that can serve as precautionary measures.

The prediction of lung cancer is based on the following features: gender, age, smoking, yellow_fingers, anxiety, peer_pressure, chronic disease, fatigue, allergy, wheezing, alcohol consumption, coughing, shortness of breath, swallowing difficulty, and chest pain. We examine the role each plays in the anticipated detection of lung cancer.

We analyse the dataset using a classification model, whose outputs are summarised below:

  • The SHAP value indicator shows how much each feature contributes to the model’s prediction of lung cancer.

  • The AUROC (Area Under the Receiver Operating Characteristic) curve is an important evaluation metric for the classification model’s performance: it measures how well the model separates positive from negative cases across all decision thresholds.
  • The confusion matrix

Here we analyze the accuracy, precision, and recall of the classification model. The confusion matrix, precision, recall, and F1 score provide better insight into the predictions than accuracy alone, because they show which kinds of errors (false positives or false negatives) the model makes.
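To make these evaluation metrics concrete, here is a minimal Python sketch of how they can be computed by hand. It is not PredictEasy's implementation: the function names are illustrative, and the labels and scores are made-up values, not results from the lung cancer dataset.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count the four confusion-matrix cells for binary labels (1 = lung cancer)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            counts["fn" if actual == 1 else "tn"] += 1
    return counts


def classification_metrics(counts: Dict[str, int]) -> Dict[str, float]:
    """Derive accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, tn, fn = counts["tp"], counts["fp"], counts["tn"], counts["fn"]
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the chance that a random positive case is scored above
    a random negative case (ties count as half)."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up labels and model scores, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 is an assumed threshold

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(counts)
area = auroc(y_true, scores)
```

Precision and recall depend on the chosen threshold (0.5 here, an assumption), whereas AUROC summarises the ranking quality of the scores across all thresholds.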

  • The correlation matrix, or heatmap, is a tool that displays the correlation coefficients between multiple variables in a dataset, providing insights into patterns and dependencies within the data.
  • The feature rank indicates that the strongest contributing factors in the model’s prediction of lung cancer are coughing, anxiety, and allergy. These are warning signs that must be taken seriously to keep the vital organs, particularly the lungs, functioning well. Acting on them helps mitigate risks an individual may become prone to over time.

We can also remove low-impact features to isolate the most influential ones.
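As a rough illustration of the ideas behind a correlation matrix, a feature ranking, and the pruning of low-impact features, the sketch below ranks features by the absolute Pearson correlation with the target and drops those under a threshold. This is a simple stand-in, not PredictEasy's ranking method (which the tool does not expose here); the toy data, the threshold, and the function names are all assumptions. Correlation also shows association only, not cause.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: correlation is undefined, treat as no signal
    return cov / math.sqrt(var_x * var_y)


def rank_by_target_correlation(
    columns: Dict[str, List[int]], target: str
) -> List[Tuple[str, float]]:
    """Rank every non-target column by |correlation with the target|, highest first."""
    scores = {
        name: abs(pearson(values, columns[target]))
        for name, values in columns.items()
        if name != target
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def drop_low_impact(ranking: List[Tuple[str, float]], threshold: float) -> List[str]:
    """Keep only the features whose score reaches the threshold."""
    return [name for name, score in ranking if score >= threshold]


# Made-up toy data (1 = yes, 0 = no); NOT taken from the article's dataset.
toy = {
    "coughing":    [1, 1, 1, 0, 1, 0, 0, 0],
    "allergy":     [1, 1, 0, 0, 1, 0, 0, 0],
    "gender":      [1, 0, 1, 0, 1, 0, 1, 0],
    "lung_cancer": [1, 1, 1, 1, 0, 0, 0, 0],
}

ranking = rank_by_target_correlation(toy, target="lung_cancer")
kept = drop_low_impact(ranking, threshold=0.2)  # threshold is an arbitrary choice
```

In this toy data, `gender` has zero correlation with the target and is dropped, while `coughing` and `allergy` are kept.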

  • The Pairwise Grid is another view of inter-feature relationships, showing how two associated features vary together.
  • The “What-if Analysis” is another useful feature: it reveals how strongly a specific feature is expected to influence the predicted likelihood of lung cancer. One can test various parameter values to see their impact on the prediction.

From this, one can infer that if coughing and allergy persist, the model estimates a 67% likelihood of a future lung cancer diagnosis. This figure illustrates the contribution of predictive and prescriptive analytics to the detection of lung cancer.
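The idea behind a what-if analysis can be sketched in a few lines of Python: hold a baseline case fixed, change one or more feature values, and compare the predicted probabilities. The logistic model below is purely illustrative; its weights and bias are hypothetical numbers chosen for the example, not values from PredictEasy or from the lung cancer dataset, so it does not reproduce the 67% figure.

```python
import math
from typing import Dict

# Hypothetical coefficients, for illustration only. They are NOT fitted to
# the article's dataset and are NOT taken from PredictEasy.
WEIGHTS = {"coughing": 1.2, "allergy": 0.9, "anxiety": 0.6}
BIAS = -2.0


def predict_probability(features: Dict[str, int]) -> float:
    """Logistic model: squash a weighted sum of features into a 0..1 probability."""
    z = BIAS + sum(weight * features.get(name, 0) for name, weight in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def what_if(baseline: Dict[str, int], **changes: int) -> Dict[str, float]:
    """Compare the prediction for a baseline case with a modified scenario."""
    scenario = {**baseline, **changes}
    before = predict_probability(baseline)
    after = predict_probability(scenario)
    return {"before": before, "after": after, "change": after - before}


# Baseline: none of the symptoms present. Scenario: coughing and allergy present.
baseline = {"coughing": 0, "allergy": 0, "anxiety": 0}
result = what_if(baseline, coughing=1, allergy=1)
print(f"before={result['before']:.2f} after={result['after']:.2f}")
```

Running scenarios one feature at a time shows which inputs move the prediction most for a given individual, which is the kind of question the What-if Analysis is meant to answer.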

Be aware, stay protected for
