A Detailed Report on COVID-19 statistics using South Korea’s dataset

Dipanshu Prasad
Published in Analytics Vidhya
7 min read · Sep 10, 2020


The COVID-19 pandemic, which originated in Wuhan, China, put many countries’ management, health services, economies and decision making to the test.

Among the worst-hit countries, most went through the worst phases their economies and healthcare systems had seen, but a few managed to get away with minimal damage, South Korea being one of them.

Experts consider the South Korean COVID-19 model one of the most successful, thanks to high testing numbers, public cooperation and timely damage control. South Korea managed to flatten its curve well before most other countries, and did so without any drastic lockdown measures, which makes it an excellent lesson for the rest of the world.

Initial days…

During the first 4–5 weeks, S. Korea witnessed a fairly steady rise in the number of new cases and traced patients using CCTV footage and credit card records. However, the cases jumped drastically after the identification and diagnosis of the widely reported “Patient 31”, a super spreader who attended the Shincheonji Church in the city of Daegu. Until the end of February, half of all cases were linked to the church.

The combination of high testing and accurate contact tracing kept the numbers low. The cumulative testing numbers, confirmed cases and deceased figures were:

We see that despite the increased number of tests conducted, the curve of new cases flattened, with May seeing fewer than 50 new cases per day.
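The daily figures behind such a curve can be derived from cumulative totals by differencing consecutive days. Below is a minimal sketch of that step; the counts are made up for illustration, and the names `cumulative_confirmed` and `daily_new_cases` are hypothetical rather than taken from the notebook.

```python
from datetime import date

# Made-up cumulative confirmed counts (illustrative only, not real figures).
cumulative_confirmed = [
    (date(2020, 5, 1), 100),
    (date(2020, 5, 2), 106),
    (date(2020, 5, 3), 119),
    (date(2020, 5, 4), 127),
]


def daily_new_cases(cumulative):
    """Return (date, new_cases) pairs by differencing consecutive cumulative totals."""
    rows = sorted(cumulative)  # make sure the rows are in date order
    return [
        (curr_day, curr_total - prev_total)
        for (_, prev_total), (curr_day, curr_total) in zip(rows, rows[1:])
    ]


new_cases = daily_new_cases(cumulative_confirmed)
for day, count in new_cases:
    print(day.isoformat(), count)
```

The first day has no predecessor, so it gets no daily value; the same function works for cumulative test counts or deaths.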

This commendable feat was achieved thanks to the staggering number of tests conducted per million people in the highly affected zones. The total number of tests conducted until the end of June was upwards of 1.2 million (12 lakh).

South Korea’s journey was far from smooth, as multiple hot-spots appeared simultaneously in various cities, each linked to super-spreaders.

This plot shows the number of people directly infected by the top 6 super spreaders (patient ID given).
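A ranking like this can be produced by counting how often each patient ID appears as the recorded source of someone else's infection. The sketch below assumes each record carries an `infected_by` field; the patient IDs and records are invented for illustration.

```python
from collections import Counter

# Hypothetical records: (patient_id, infected_by). None means the source is unknown.
patients = [
    ("P001", None),
    ("P002", "P001"),
    ("P003", "P001"),
    ("P004", "P002"),
    ("P005", "P001"),
    ("P006", None),
    ("P007", "P002"),
]


def top_spreaders(records, n=6):
    """Count direct infections per source patient and return the n largest."""
    counts = Counter(source for _, source in records if source is not None)
    return counts.most_common(n)


top = top_spreaders(patients)
for patient_id, infected in top:
    print(patient_id, infected)
```

Only direct infections are counted here; second-generation cases (people infected by someone a spreader infected) would need a traversal of the contact chain.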

It was also noted that the number of infections from a particular group or cluster was as high as three times the number of infections from individual contact or other sources.

Hence, mass gatherings were stopped in the following months. The restrictions placed on the public were mild, but they were implemented well before anywhere else.

Infection source tracing in Seoul during early months

The top sources of infection were identified, and effective contact tracing began before the people who had come in contact with patients reported any symptoms.

Main sources of infection and their contribution to the total cases in %
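The percentage contribution of each source is its case count divided by the total. Here is a small sketch of that calculation; the source labels and counts are invented for illustration and do not reproduce the actual breakdown.

```python
from collections import Counter

# Hypothetical infection-source label for each confirmed case.
infection_sources = [
    "church cluster", "contact with patient", "overseas inflow",
    "church cluster", "contact with patient", "church cluster",
    "unknown", "church cluster",
]


def source_shares(sources):
    """Return each source's share of all cases in percent, largest first."""
    counts = Counter(sources)
    total = sum(counts.values())
    return {
        source: round(100 * count / total, 1)
        for source, count in counts.most_common()
    }


shares = source_shares(infection_sources)
for source, percent in shares.items():
    print(f"{source}: {percent}%")
```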

While all this was being done outside the clinics, the observations made inside them also contributed greatly to understanding the seriousness of the disease.

Despite the high number of patients, data such as confirmed date and released date were carefully monitored for most patients, making S. Korea’s dataset one of the most reliable in the world.

The x-axis of the graph shows the number of days taken for recovery starting from the onset of symptoms, and the y-axis shows the proportion of cases. The peak is observed around 20–25 days, and the average recovery time was found to be 24 days, which was notably higher than the world average.
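Recovery time per patient is the difference between two dates. The sketch below measures from `confirmed_date` to `released_date` and skips patients who have not been released; both field names and all dates are assumptions for illustration. If recovery is counted from symptom onset instead, the start field would simply be swapped.

```python
from datetime import date
from statistics import mean

# Made-up patient records (illustrative only).
records = [
    {"confirmed_date": date(2020, 3, 1), "released_date": date(2020, 3, 21)},
    {"confirmed_date": date(2020, 3, 5), "released_date": date(2020, 3, 30)},
    {"confirmed_date": date(2020, 3, 10), "released_date": None},  # still in care
    {"confirmed_date": date(2020, 3, 12), "released_date": date(2020, 4, 5)},
]


def recovery_days(rows, start="confirmed_date", end="released_date"):
    """Days between the start and end date for every row that has both."""
    return [
        (row[end] - row[start]).days
        for row in rows
        if row[start] is not None and row[end] is not None
    ]


durations = recovery_days(records)
average_recovery = mean(durations)
print(durations, average_recovery)
```

A histogram of `durations`, normalised by the number of released patients, gives the kind of distribution described above.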

After new cases dropped to single digits in mid-April, restrictions were gradually lifted and normal life resumed in South Korea from the end of April. However, the seemingly smooth sailing was interrupted by another cluster in the nightclubs of Itaewon in May. In June, the Korean government announced that S. Korea was witnessing a second wave of COVID-19.

CORRELATION WITH AGE

Number of cases in each age group

The highest infection rate occurred in the working-age population, with a fairly high infection rate among senior citizens as well (considering their smaller population). Surprisingly, the chances of children catching the infection were many times lower.

Children appeared to fight off the infection much better than adults. They also showed milder symptoms and almost never developed complications, which gave them a much lower death rate, as shown next.

The death rate turned out to be linearly correlated with age.
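One way to check such a relationship is to compute the fatality rate per age group and then the Pearson correlation coefficient between age and that rate. The sketch below uses made-up counts and hypothetical names (`age_groups`, `fatality_rate`, `pearson`); it shows the calculation only and does not reproduce the article's numbers.

```python
from math import sqrt

# Made-up counts: (age group midpoint, confirmed cases, deceased).
age_groups = [
    (25, 200, 0),
    (45, 200, 1),
    (65, 200, 4),
    (85, 200, 20),
]


def fatality_rate(confirmed, deceased):
    """Deaths as a percentage of confirmed cases."""
    return 100 * deceased / confirmed


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


ages = [age for age, _, _ in age_groups]
rates = [fatality_rate(c, d) for _, c, d in age_groups]
r = pearson(ages, rates)
print(rates, round(r, 3))
```

A coefficient close to 1 indicates a strong positive linear association; it says nothing about whether a straight line is the best-fitting shape, so plotting the rates is still worthwhile.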

COVID-19 was the most merciless against the elderly and in contrast proved fatal for almost no children and infants (of course we are talking about a country with world class medical facilities).

CORRELATION WITH GENDER

Total number of infections by gender
Total number of deceased by gender

The total infections by gender and the total number of deceased by gender do not follow the same pattern. The infection rate in women is significantly higher, but their death rate is nearly half that of men.

We observe the same trend again using a scatter plot.

The red line shows x=y, the line on which the points would lie if males and females were equally likely to get infected. The 20 data points are taken from randomly chosen dates.

The infection points lie to the left of the line, again indicating that women are more prone to infection, while in the graph of death rate the points shift towards the right of the line, bringing us back to our earlier conclusion.
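The same reading of the scatter plot can be done numerically by counting how many points fall on each side of the x=y line. The sketch below assumes the male count is on the x-axis and the female count on the y-axis, so a point to the left of (above) the line means more female cases; this axis assignment and the data points are assumptions for illustration.

```python
# Made-up (male, female) infection counts on four dates (illustrative only).
infection_points = [(40, 60), (35, 50), (20, 30), (25, 24)]


def side_of_diagonal(points):
    """Count points above (y > x), below (y < x) and on the x=y line."""
    above = sum(1 for x, y in points if y > x)
    below = sum(1 for x, y in points if y < x)
    on_line = len(points) - above - below
    return {"above": above, "below": below, "on": on_line}


summary = side_of_diagonal(infection_points)
print(summary)
```

Running the same function on (male, female) death figures would show whether those points cluster on the opposite side of the line.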

OTHER INSIGHTS

  • No correlation between temperature or humidity and infection rate was observed in South Korea, probably because the spread was controlled so effectively.
  • The COVID-19 phase saw the largest internet traffic in history.
  • Gyeonggi-do, Daegu and Seoul were the worst affected provinces in S. Korea.
  • Most of the testing was done through ‘drive-thru’ or ‘walk-thru’.
  • The testing process in S. Korea was amongst the simplest.
  • The awareness campaigns run by the government were very successful, and the public cooperated to a very high degree.

INTERNET SEARCH TRENDS

Internet searches made during the COVID-19 months were observed and internet traffic was monitored. Unsurprisingly, searches related to “coronavirus” and its symptoms were many times higher than in 2018 and 2019.

GOVERNMENT MEASURES

  • A Level 1 (Blue) Infectious Disease Alert was announced by the government as early as 3 January.
  • Level 2 (Yellow) and Level 3 (Orange) Infectious Disease Alerts were issued on 20 January and 28 January respectively.
  • A Level 4 (Red) Alert was issued on 23 February.
  • Heavy scrutiny of immigration procedures began on 4 February. It was limited to Chinese immigrants initially.
  • A special immigration procedure was made mandatory for arrivals from all countries on 19 March.
  • The first drive-thru screening center was started by the local government on 26 February.
  • All schools were shut down on 2 March.
  • Online classes for high schools started on 9 April, and the rest of the classes followed the next week. Schools began to reopen in the second half of May but were closed again shortly after.
  • Numerous apps were built for the public, and mask distribution campaigns were initiated.

CONCLUSION

Many international media outlets have praised the South Korean model of “TRACE-TEST-TREAT”. The right combination of technology, decision making and public participation made the Korean model one to follow and learn from.

South Korea even beat the European countries in quickly reaching its peak and then a declining curve. This should give S. Korea some extra points in next year’s HDI rankings.

Despite what appears to be a fantastic feat for S. Korea, the threat of a second outbreak is far from over, and it has begun to bring back the nightmares of March.

August is witnessing a new wave of infections. In fact, 16 August saw the highest spike in new cases since March, and the figures continue to rise, reviving tensions that had eased for a while. The new epicentre of the outbreak appears to be the protests in Seoul.

The end of the pandemic is still nowhere in sight, not even for South Korea, but the country is expected to do what it does best.

Jupyter notebook (ipynb) file link: https://drive.google.com/file/d/1S4Jo6m-kdmnztbIOAaqdp35Kq4sFMNaw/view?usp=sharing

Leave a clap if you found it informative!
