341,760 Cases; 14,757 Deaths; 99,041 Recoveries.
These were the figures as of March 23, 2020, for the coronavirus outbreak, which began its rampage on the last day of 2019 and threw the world into a frenzy in less than three months.
In the 2020s, we can better combat this global health emergency by understanding the spread of COVID-19 with an arsenal of new-age, data-driven techniques.
From supply chains to consumption patterns, the virus has affected everyone. Machine learning and data engineering can be leveraged to analyze news reports and social media posts, improve the coordination of information, and make predictions.
It is critical to determine where the virus would surface in order to block its spread effectively. We are trying to understand how is the virus interacting with the population at large.
~BlueDot, a company running AI and data-driven surveillance for COVID-19
How can AI, data science and machine learning be instrumental?
Is the virus more prevalent in certain areas than in others, and why? Is the spread explained only by primary cases (people arriving directly from infected countries) and secondary cases (people who came into contact with primary cases), or is there more to the story?
What are the trends witnessed globally in the case of community spread?
These and many more questions can be answered through data visualization and interpretation. AI and supercomputers are being applied to big data on the virus to help develop a vaccine (with major companies like Tencent, DiDi, and Huawei involved in the research). Data science techniques can approach a problem of this magnitude and suggest strategies to combat it. Let’s see how.
Handling the crisis with data science
Data science can be used in multifaceted ways: tracking and forecasting outbreaks, supporting testing, processing healthcare claims, using robots for sterilization and food supplies, detecting non-compliance with government and health advisories, and sharing information through chatbots.
To reach these goals, the data science industry can approach the problem in three phases.
First, understanding the problem. Second, taking action. Finally, prevention.
👉First Phase: Understand the problem
This involves extracting as much information as possible about the virus and understanding it through data visualization techniques, GIS techniques, and graph analysis. This analysis becomes the foundation for the next steps.
Some questions to dig into include: Where are the outbreaks? How fast is the virus spreading? How many people have become sick? What are the demographics of those infected? Which areas have been most successful in diagnosing and handling its spread?
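As one illustration of the "how fast is it spreading?" question, here is a minimal sketch that estimates a doubling time from cumulative case counts. It assumes early growth is roughly exponential and fits a straight line to the logarithm of the counts. The function name `doubling_time` and the counts are hypothetical, made up for the example, and are not real outbreak data.

```python
import math


def doubling_time(cumulative_cases):
    """Estimate doubling time, in reporting intervals, from cumulative counts.

    Assumes roughly exponential growth: fits log(cases) against the interval
    index with ordinary least squares and converts the slope to a doubling time.
    """
    if len(cumulative_cases) < 2 or any(c <= 0 for c in cumulative_cases):
        raise ValueError("need at least two positive counts")

    xs = list(range(len(cumulative_cases)))
    ys = [math.log(c) for c in cumulative_cases]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Least-squares slope of log(cases) per interval (the growth rate).
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )

    # Flat or shrinking counts never double.
    if slope <= 0:
        return math.inf
    return math.log(2) / slope


# Hypothetical daily counts that double every day (illustration only).
synthetic_counts = [100, 200, 400, 800]
print(doubling_time(synthetic_counts))
```

A real analysis would also have to account for reporting delays and changes in testing, which this sketch ignores.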
Data visualization, GIS mapping, and network mapping are a few techniques that can aid data engineering in this phase.
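Network mapping can be sketched as a graph traversal over known contacts. The example below is one possible approach, not a description of any production system: it walks outward from imported (primary) cases and labels each linked case with its transmission generation, then lists cases with no known link to an imported case, which may hint at community spread. The case IDs, the `contacts` mapping, and both function names are hypothetical.

```python
from collections import deque


def transmission_generations(contacts, imported_cases):
    """Breadth-first search from imported cases over a contact graph.

    Returns {case: generation}, where 0 marks an imported (primary) case,
    1 a direct contact of one (secondary), and so on.
    """
    generation = {case: 0 for case in imported_cases}
    queue = deque(imported_cases)
    while queue:
        source = queue.popleft()
        for contact in contacts.get(source, []):
            if contact not in generation:
                generation[contact] = generation[source] + 1
                queue.append(contact)
    return generation


def unlinked_cases(contacts, generation):
    """Cases in the graph that cannot be traced back to any imported case."""
    all_cases = set(contacts)
    for linked in contacts.values():
        all_cases.update(linked)
    return all_cases - set(generation)


# Hypothetical contact-tracing records: case -> cases linked to it.
contacts = {"A": ["B", "C"], "B": ["D"], "E": ["F"]}
generations = transmission_generations(contacts, imported_cases=["A"])
print(generations)
print(unlinked_cases(contacts, generations))
```

Here case A is the only imported case, so E and F come back as unlinked: their infections are not explained by the known primary and secondary chains.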
👉Second Phase: Action
Based on the information collected and analyzed, determine which models should be deployed to take action and which applications should be used.
It can be complicated to determine which actions will succeed. Choose and deploy models based on their scalability, effectiveness, and speed, since a system that responds to such a massive phenomenon needs to handle data at high levels of concurrency.
👉Third Phase: Prevention
If and when we manage to contain the pandemic, it is important to prepare for the future: keep a pulse on best practices, assess the likelihood of a recurrence (if any), and raise preparedness levels to deal with such a crisis.
One of the major challenges that would also emerge relates to the privacy of data infrastructure, an issue pervasive in data engineering and data analysis.
Despite the international community coming together, obtaining relevant information about the virus is still a challenge.
Although the virus is not expected to be contained in the short term, making an effort and becoming better prepared than before is the least any community can do. This applies even more to data science professionals, the mavens of information.
Originally published at https://datafloq.com.