A small comparative analysis between COVID-19 and SARS
Introduction
It’s quarantine day number 10. If you are like me, you are probably about to swallow your own head because you can’t stand the sight of the same four walls around you. While catching up on video games with friends or binge-watching series are good ways to adjust to this new temporary reality, this is also a perfect time to pick up an awesome new skill. As a data geek, I’m using this time to run some analysis on COVID-19 and SARS and improve my Python data science skills.
As COVID-19 started to spread at the beginning of this year, many began to compare it with 2003’s SARS. While both viruses belong to the coronavirus family and cause similar symptoms, their spread has differed greatly in magnitude. Some of the questions that came to mind are:
- How does the number of cases differ between the two viruses?
- What are the countries with the highest number of cases in both outbreaks?
- How do recovery rates and mortality rates compare between the two diseases?
To try to answer these questions, I took a few datasets on COVID-19 and SARS from Kaggle. The COVID-19 dataset includes over 6000 rows of daily cumulative cases, deaths and recoveries per country as of March 16th, 2020. The SARS dataset is smaller, with approximately 2000 rows and the same data columns. The following analysis was done with the Python libraries numpy, pandas and plotly.
Question 1 — How does the number of cases differ between the two viruses?
From Figure 1 we can already observe that the two viruses spread at very different magnitudes. SARS never experienced exponential growth; its curve stayed comparatively flat throughout its entire outbreak period. On the other hand, COVID-19’s growth was not only exponential, but also experienced a major inflection point on day 20 (February 10th, 2020). COVID-19 had already surpassed the total number of SARS cases within the first days of its outbreak. If we take the last available entry of COVID-19 cumulative cases (i.e. day 55), the approximately 7300 SARS cases represent a mere 4% of the 181k total COVID-19 cases. Clearly this new virus is a completely different beast.
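To make the comparison concrete, here is a minimal sketch of how per-country cumulative rows could be collapsed into one global curve per outbreak and put on a shared "day of outbreak" axis. The column names (`date`, `country`, `cases`), the helper `global_cumulative` and the tiny sample are assumptions for illustration, not the actual Kaggle schema. The last lines simply redo the 4% arithmetic with the rounded totals quoted above.

```python
import pandas as pd

def global_cumulative(df: pd.DataFrame) -> pd.DataFrame:
    """Collapse per-country cumulative counts into one global row per day."""
    daily = (
        df.groupby("date", as_index=False)["cases"].sum()
          .sort_values("date")
          .reset_index(drop=True)
    )
    # Number days from 1 (assumption), so two outbreaks can share an x-axis.
    daily["day"] = (daily["date"] - daily["date"].iloc[0]).dt.days + 1
    return daily

# Made-up sample rows, only to show the expected shape of the data.
sample = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-22", "2020-01-22", "2020-01-23", "2020-01-23"]),
    "country": ["A", "B", "A", "B"],
    "cases": [10, 1, 25, 3],
})
covid_daily = global_cumulative(sample)
print(covid_daily)

# Rounded totals quoted in the text: ~7,300 SARS cases vs ~181k COVID-19 cases.
sars_total, covid_total = 7_300, 181_000
share = sars_total / covid_total
print(f"SARS cases as a share of COVID-19 cases: {share:.1%}")  # 4.0%
```

Running the same helper on both datasets would give two frames with a common `day` column, which is what a chart like Figure 1 needs.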
Question 2 — What are the countries with the highest number of cases in both outbreaks?
The common denominator between the two viruses is that both originated in China, and as a result this country has the highest number of cases in both outbreaks. SARS mainly spread to neighboring regions and countries such as Hong Kong, Taiwan and Singapore: 7 of the top 10 countries/regions are in Asia. While some cases reached Western countries such as the USA and Germany, their volume was minimal compared to the Eastern regions. With the exception of China and Hong Kong, no country or region had more than 1000 cases.
Figure 3 shows the global spread of COVID-19: of the top 10 countries, 3 are in Asia, 6 in Europe and 1 in the Americas. China has approximately 80k cases, followed by 28k cases in Italy and 15k cases in Iran. The spread of the virus is so massive that even countries such as Germany and South Korea have more cases than the entire SARS epidemic.
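A ranking like this can be sketched in a few lines of pandas. This is only an illustration under assumptions: the column names and the helper `top_countries` are hypothetical, the March 16th values are the rounded figures quoted above, and the March 15th values are made up to show that the data is cumulative.

```python
import pandas as pd

def top_countries(df: pd.DataFrame, n: int = 10) -> pd.Series:
    """Rank countries by their latest cumulative case count."""
    # Counts are cumulative, so a country's total is its largest value,
    # not the sum of its rows.
    totals = df.groupby("country")["cases"].max()
    return totals.nlargest(n)

sample = pd.DataFrame({
    "country": ["China", "China", "Italy", "Italy", "Iran", "Iran"],
    "date": pd.to_datetime(["2020-03-15", "2020-03-16"] * 3),
    # Earlier-day values are invented; later-day values are rounded from the text.
    "cases": [79_000, 80_000, 25_000, 28_000, 14_000, 15_000],
})
top3 = top_countries(sample, n=3)
print(top3)
```

One caveat: if a dataset splits a country into provinces or states, the rows would need to be summed per country and date first, otherwise the maximum only reflects the largest province.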
Question 3 — How do recovery rates and mortality rates compare between the two diseases?
A more pressing question for those who are afraid of contracting the virus is how likely one is to recover. Looking at recovery rates for SARS, we can see that they range from a low of 76% in Taiwan to a high of 93% in China.
Given that the outbreak is still ongoing at the time of this writing, we cannot accurately determine the recovery rate for COVID-19. The numbers seen in Figure 5 are still low across the board, with the exception of China, where the outbreak has been going on for months. China has seen a recovery rate of 84%. We will very likely see similar numbers in other countries towards the end of the pandemic.
Shifting our focus to mortality rates, the average SARS mortality rate for the top 10 countries is approximately 11%, which is higher than any of the mortality rates for COVID-19. While China has had the highest number of cases for both diseases, its mortality rates for COVID-19 and SARS are 4% and 7% respectively. In Figure 6, we see SARS mortality rates as high as 22% and 17%, in Thailand and Hong Kong respectively. In Figure 7, the highest COVID-19 mortality rate is seen in Italy, at 8%. Overall, mortality rates for COVID-19 appear to be lower than those for SARS.
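For readers who want to reproduce rates like these, here is a minimal sketch assuming the simplest definition: deaths (or recoveries) divided by cumulative cases. The column names and the two sample countries are made up for illustration and are not taken from the datasets.

```python
import pandas as pd

# Invented totals per country, only to show the calculation.
totals = pd.DataFrame(
    {"cases": [1000, 200], "deaths": [40, 16], "recovered": [840, 50]},
    index=["Country A", "Country B"],
)

# Each rate is a share of all cumulative cases in that country.
rates = totals.assign(
    mortality_rate=totals["deaths"] / totals["cases"],
    recovery_rate=totals["recovered"] / totals["cases"],
)
print(rates[["mortality_rate", "recovery_rate"]].round(2))
```

With this definition, cases that are still open count toward neither rate, which is why both figures stay low while an outbreak is still in progress and only settle once most cases are resolved.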
Conclusion
These insights show that although COVID-19 has grown far more rapidly than SARS, and exponentially, it does not appear to be as deadly. It’d be interesting to see how COVID-19 compares to other diseases such as Ebola and the Spanish flu. Historical data should be used to learn what solutions were implemented in the past and what can be implemented today to reduce the extent of this pandemic.
From a personal point of view, doing this analysis gave me a better understanding of these two outbreaks while I gained more experience cleaning and plotting data with Python. Please use this quarantine time to keep developing skills that add value to you! :)