CORONA, 12 questions and Exploratory Data Analysis

An analytical approach to visualize the COVID-19 outbreak data

Arpan
Analytics Vidhya
4 min read · Mar 10, 2020

Source: https://www.leehealth.org/public-health

Coronaviruses (CoV) are a large family of viruses that cause illness ranging from the common cold to more severe diseases such as Middle East Respiratory Syndrome (MERS-CoV) and Severe Acute Respiratory Syndrome (SARS-CoV). A novel coronavirus (nCoV) is a new strain that has not been previously identified in humans. — WHO

The new decade has begun in the worst way possible, with the coronavirus outbreak. Before getting into the details, I want to thank the WHO and Johns Hopkins University for making the data publicly available. Once you get a sense of the scale, it is clear that this is the kind of event where the world needs to stay strong and fight together.

We will visualize the outbreak across both geography and time.

The Answers we will seek

Here we will seek answers to 12 key questions to analyze the outbreak closely.

  1. Which countries are most affected by the outbreak?
  2. Given that China is the source of the outbreak, how does the situation in China compare with the rest of the world?
  3. How are the confirmed cases distributed globally across regions?
  4. How are confirmed cases and deaths distributed across the worst-affected countries?
  5. Which countries have the worst death rates?
  6. Which countries have the best recovery rates?
  7. How is the virus spreading over time?
  8. How is the virus spreading outside China over time?
  9. How has the virus spread across regions?
  10. In which regions have deaths been reported?
  11. Which countries have fully recovered to date (all confirmed cases recovered)?
  12. How many new cases (confirmed, recovered, deaths) are reported each day?

Q1. Which countries are most affected by the outbreak?

Q2. Given that China is the source of the outbreak, how does the situation in China compare with the rest of the world?

Though most of the confirmed cases are linked to China, the outbreak is also spreading rapidly to other parts of the globe.

Q3. How are the confirmed cases distributed globally across regions?

As of the date of this analysis (7th March, 2020), 78% of the confirmed cases are in China, but the picture is changing daily as the outbreak spreads rapidly to other regions.
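
A share such as the 78% figure is a region's confirmed count divided by the global total. The snippet below is a minimal sketch of that arithmetic in plain Python. The region names and counts are made-up placeholders, not values from the WHO or Johns Hopkins data, and `share_of_total` is a hypothetical helper, not code from the repository.

```python
# Illustrative cumulative confirmed counts per region.
# These numbers are invented for the example, not real outbreak data.
confirmed = {"Region A": 800, "Region B": 120, "Region C": 80}


def share_of_total(counts):
    """Return each region's percentage of the overall total (1 decimal place)."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero when nothing has been reported yet.
        return {name: 0.0 for name in counts}
    return {name: round(100 * value / total, 1) for name, value in counts.items()}


shares = share_of_total(confirmed)
for region, pct in shares.items():
    print(f"{region}: {pct}%")
```

Because the shares are relative to the global total, a region's percentage can fall even while its own count keeps rising, which is why the picture shifts from day to day.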

Q4. How are confirmed cases and deaths distributed across the worst-affected countries?

Q5. Which countries have the worst death rates?
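
The article does not spell out how the rates are computed, so the sketch below assumes the simplest reading: death rate is deaths divided by confirmed cases, and recovery rate is recoveries divided by confirmed cases. The country names and counts are invented placeholders, and `rate` is a hypothetical helper.

```python
# Illustrative per-country totals (invented numbers, not real outbreak data).
records = [
    {"country": "Country A", "confirmed": 1000, "deaths": 30, "recovered": 500},
    {"country": "Country B", "confirmed": 200, "deaths": 10, "recovered": 50},
    {"country": "Country C", "confirmed": 50, "deaths": 0, "recovered": 50},
]


def rate(part, whole):
    """Percentage of `whole` represented by `part`; 0.0 if there are no cases."""
    return round(100 * part / whole, 2) if whole else 0.0


# Attach both rates to every record (assumed definitions, see the text above).
for row in records:
    row["death_rate"] = rate(row["deaths"], row["confirmed"])
    row["recovery_rate"] = rate(row["recovered"], row["confirmed"])

# Rank countries from the highest death rate to the lowest.
worst_death_rates = sorted(records, key=lambda r: r["death_rate"], reverse=True)
for row in worst_death_rates:
    print(row["country"], row["death_rate"], row["recovery_rate"])
```

Rates computed on very small case counts swing wildly, so a ranking like this is easier to interpret if countries with only a handful of confirmed cases are filtered out first.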

Q6. Which countries have the best recovery rates?

Q7. How is the virus spreading over time?

Q8. How is the virus spreading outside China over time?

Q9. How has the virus spread across regions?

Q10. In which regions have deaths been reported?

Q11. Which countries have fully recovered to date (all confirmed cases recovered)?
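
Following the definition in the question, a country counts as fully recovered when it has at least one confirmed case and its recovered count equals its confirmed count. The snippet below is a minimal sketch of that filter. The country names and totals are invented placeholders, and `fully_recovered` is a hypothetical helper.

```python
# Illustrative cumulative totals as (confirmed, recovered) pairs.
# Invented numbers, not real outbreak data.
totals = {
    "Country A": (10, 10),
    "Country B": (250, 40),
    "Country C": (3, 3),
    "Country D": (0, 0),  # no cases reported, so not "recovered"
}


def fully_recovered(data):
    """Return countries where every confirmed case has recovered, sorted by name."""
    return sorted(
        name
        for name, (confirmed, recovered) in data.items()
        if confirmed > 0 and recovered == confirmed
    )


print(fully_recovered(totals))
```

The `confirmed > 0` check keeps countries with no reported cases out of the result, since they would otherwise satisfy the equality trivially.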

Q12. How many new cases (confirmed, recovered, deaths) are reported each day?
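
If the source data holds running totals per day, which is an assumption here, the daily new counts are the differences between consecutive days. The sketch below shows that step in plain Python. The series is made up for illustration, and `daily_new` is a hypothetical helper.

```python
# Illustrative cumulative confirmed counts for five consecutive days.
# Invented numbers, not real outbreak data.
cumulative_confirmed = [1, 3, 6, 6, 10]


def daily_new(cumulative):
    """Convert running totals into per-day increments.

    The first day's increment is taken to be its cumulative value,
    i.e. the count before the series starts is assumed to be zero.
    """
    previous = [0] + cumulative[:-1]
    return [curr - prev for prev, curr in zip(previous, cumulative)]


new_per_day = daily_new(cumulative_confirmed)
print(new_per_day)
```

The same function applies unchanged to recovered and death totals. A negative increment would indicate a downward correction in the reported totals rather than a real drop.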

Summary

End Notes

I used plotly to create these interactive maps. All the code can be found in this repository. This analysis is based on data received up to 7th March, 2020. Stay strong, world!

Additional Links
