Predicting COVID-19 Death Using Machine Learning

Arunadevi Ramesh
Published in The Startup
3 min read · Jun 25, 2020

Coronavirus disease 2019 (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It was first identified in December 2019 in Wuhan, the capital of China’s Hubei province, and the World Health Organization (WHO) announced a ‘mystery pneumonia’ on December 31, 2019. Since then, the virus has been identified (SARS-CoV-2), the disease has been named (COVID-19), and it has spread globally, resulting in the ongoing 2019–20 coronavirus pandemic.

The severity of the disease is evident from how quickly it spread: the first seven cases in the United States of America were reported on January 20, 2020, and by April 2020 the country had crossed 300,000 cases. By May 9, 2020, more than 3.93 million COVID-19 cases had been reported in 187 countries and territories, resulting in about 274,000 deaths (WHO). Several countries, including India, have gone into lockdown to prevent the spread of this deadly virus.

In this scenario, one task is especially important: predicting the future death count across the world, so that action can be taken to prevent a massive loss of life. The open-source global COVID-19 dataset was collected from the WHO COVID-19 repository as a CSV file. We chose three independent variables (total_cases, recovery and death) and one dependent variable (the COVID-19 death count across the world one week later). This combination gave the highest accuracy and a low error rate.
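The pairing of today's features with the death count one week later can be sketched as follows. This is a minimal illustration, not the code used for the article: the row layout, the helper name `make_supervised` and the sample numbers are assumptions, and only the three column names come from the text above.

```python
from typing import Dict, List, Tuple

HORIZON_DAYS = 7  # "one week later", as described in the article


def make_supervised(
    rows: List[Dict[str, float]], horizon: int = HORIZON_DAYS
) -> Tuple[List[List[float]], List[float]]:
    """Pair each day's features with the cumulative death count `horizon` days later.

    `rows` must hold one entry per consecutive day, oldest first.
    """
    features: List[List[float]] = []
    targets: List[float] = []
    # The last `horizon` days have no known future value, so they are skipped.
    for i in range(len(rows) - horizon):
        today = rows[i]
        future = rows[i + horizon]
        features.append([today["total_cases"], today["recovery"], today["death"]])
        targets.append(future["death"])
    return features, targets


# Made-up cumulative counts for ten consecutive days (not real data).
rows = [
    {"total_cases": 100.0 * (d + 1), "recovery": 40.0 * (d + 1), "death": 5.0 * (d + 1)}
    for d in range(10)
]
X, y = make_supervised(rows)
```

With ten days of data and a seven-day horizon, only the first three days yield training samples, because the later days have no target yet.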

Classification algorithms are used for discrete target values, whereas regression models are used for continuous target values. Why is Linear Regression the best choice for predicting the future death count across the world? To answer this, we applied the same methodology with classification algorithms such as Logistic Regression, Random Forest and LightGBM. Mean Absolute Error (MAE) is the key metric for choosing the best algorithm to predict the death count. The classification algorithms gave high accuracy but also a high error rate, whereas the regression algorithm gave high accuracy with a low error rate (MAE is 30298 for Logistic Regression and 4411 for Linear Regression). Regression is therefore the better approach for predicting the cumulative death count across the world one week ahead.
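As a rough sketch of the regression side of this comparison, the snippet below fits an ordinary least-squares linear model with NumPy and scores it with MAE. It is not the article's original code: the helper names are hypothetical, and the training data is synthetic, generated from an assumed linear rule purely so the example runs on its own.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Fit y ≈ b0 + b1*x1 + ... by ordinary least squares; returns [b0, b1, ...]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply the fitted coefficients to new feature rows."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted counts."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))


# Synthetic rows of [total_cases, recovery, death] (not real data).
X_train = np.array([
    [1000, 300, 20],
    [1500, 520, 35],
    [2300, 900, 50],
    [3100, 1200, 90],
    [4200, 2000, 120],
    [5000, 2300, 160],
], dtype=float)
# Assumed rule for the illustration: future deaths = 10 + 0.01*cases + 1.5*deaths.
y_train = 10 + 0.01 * X_train[:, 0] + 1.5 * X_train[:, 2]

coef = fit_linear_regression(X_train, y_train)
train_mae = mean_absolute_error(y_train, predict(coef, X_train))
forecast = float(predict(coef, [[6000, 2800, 200]])[0])
```

Because the regression output is a continuous number, MAE directly measures how many deaths the forecast is off by on average, which is why it is a natural metric for this task.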

We also applied the proposed methodology to the COVID-19 dataset for India, predicting the future death count for India and for five states: Maharashtra, Madhya Pradesh, Gujarat, Delhi and Tamil Nadu. The cumulative death count on June 29 is predicted from the cumulative confirmed cases, recoveries and cumulative death count on June 13. According to the model, by June 29 India will reach a cumulative death count of nearly 13949, Tamil Nadu nearly 929, Gujarat nearly 1660, Maharashtra nearly 6767, Madhya Pradesh nearly 524 and Delhi nearly 2425. All of these results were predicted using the Linear Regression algorithm with our proposed methodology.

Early forecasts from a tracking system can help authorities take the necessary action. This article proposed using data mining and machine learning models for COVID-19 epidemic analysis on the world dataset from the WHO website. The results show that Linear Regression (LR) yielded good accuracy compared to the other approaches in forecasting the COVID-19 pandemic. The experimental results were evaluated using several error metrics: Mean Absolute Error (MAE), R2 score, Mean Squared Error (MSE) and Root Mean Squared Error (RMSE).
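For reference, the four metrics named above can be computed as in the short sketch below. It uses the standard definitions in plain Python; the function name and the sample values are made up for illustration and are not results from the study.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Return MAE, MSE, RMSE and R2 for paired actual and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R2 is undefined when all actual values are identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot != 0 else float("nan")
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Made-up actual and predicted death counts (not real data).
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(actual, predicted)
```

MAE and RMSE are expressed in the same unit as the target (number of deaths), while R2 is unitless and describes how much of the variation in the actual counts the model explains.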

Our proposed approach predicts the cumulative number of deaths in various countries using the number of confirmed and active cases each day. As a next step, we plan to develop a graphical user interface for predicting the number of deaths, which would be helpful for governments and the health care sector (hospitals and pharmacies) and could support efforts to develop vaccines as early as possible. The study can be extended further with additional features and with deep learning models.

As always, thank you so much for reading, and please share this article if you found it useful!

Stay home, Stay safe!
