

How Big Data can be used to fight the Zika virus
As health agencies struggle to understand the mosquito-borne disease, digital data analysis becomes a useful tool to prevent the spread of Zika
Recently, the World Health Organization declared Zika a public health emergency that could affect 4 million people by 2017. However, health surveillance remains focused on reacting to medical emergencies, and the infrastructure needed to prevent and contain this epidemic is still limited. Data modeling techniques offer a safer, more precise way to control the spread of the disease and could turn Zika into a rich source of data for clinical studies, which might in turn help predict new outbreaks.
Our ability to generate data has changed dramatically in only a few years. The amount of digital information available is almost immeasurable. Systems such as the Internet of Things are made up of thousands of devices and sensors connected to the Internet, which generate an even bigger flow of data.
This huge amount of data has the potential to revolutionize almost any area of knowledge. Police departments in the United States are already employing data modeling to predict when and where crimes are likely to occur. Some cities have been using Big Data to calculate the flow of traffic and reduce jams. Multinational companies are using the information to understand and predict how their customers will spend their money. The main goal, in most cases, is to create predictive models.
The same procedure can be used to lower the impact caused by epidemics, thus revolutionizing public health on a global scale.


Big Data, notable for its volume, velocity and variety, is a broad concept covering the creation and storage of data that can be used to evaluate strategies and actions. The knowledge it provides for deciding on a course of action can make a valuable difference to the end result, especially when it comes to stopping viral epidemics.


In 2011, the city of Lahore in Pakistan was struck by the worst dengue fever outbreak in its history. The disease, which is transmitted by the same mosquitoes now raising global concern over other epidemics, infected around 16,000 people and claimed 352 lives. In an effort to contain the spread, the Pakistani government used Google tools to develop a digital system with algorithms designed for early detection of both dengue and influenza. Whenever the system detected an imminent outbreak, government employees would immediately clear mosquito breeding sites.
The results were impressive. The following year, there were only 234 confirmed cases and no deaths.
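The article does not describe the algorithms behind the Lahore system, so the following is only a minimal sketch of one common idea in early detection: flag any week in which reported cases rise well above the recent baseline. The function name, the four-week window, the threshold multiplier and the case counts are all assumptions made for illustration.

```python
from statistics import mean, stdev


def flag_outbreak_weeks(weekly_cases, window=4, k=2.0):
    """Return indices of weeks whose count exceeds baseline mean + k * stdev.

    The baseline is the `window` weeks immediately before the week tested.
    Both `window` and `k` are illustrative defaults, not values from the article.
    """
    flagged = []
    for i in range(window, len(weekly_cases)):
        baseline = weekly_cases[i - window:i]
        threshold = mean(baseline) + k * stdev(baseline)
        if weekly_cases[i] > threshold:
            flagged.append(i)
    return flagged


# Hypothetical weekly case counts, invented for this example.
cases = [10, 12, 11, 9, 10, 30, 45]
print(flag_outbreak_weeks(cases))  # prints [5, 6]
```

A real surveillance system would also have to account for seasonality, reporting delays and noisy data sources, which this sketch ignores.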
Today, the control and prevention of viral infections such as Zika are usually handled through traditional health surveillance methods, which rely on reports from doctor visits and health departments. The data are not cross-referenced and, as a consequence, no intelligence is gathered for improving prevention methods. With biometric sensors a growing trend in the market, the digital collection of individual data creates possibilities for real-time health monitoring. The cattle industry already employs this method, using sensors that monitor animals’ vital signs, a technology which drastically lowers mortality levels and infection rates from cattle diseases. Big Data combined with these devices makes early detection of epidemics easier, thus improving containment strategies for viruses and infectious diseases before they spread.


Crowdsourcing and geomapping create new ways of tracking epidemics such as Zika, since environmental factors affect how diseases spread. Google Flu Trends, for instance, offers an estimate of influenza cases in certain regions using data collected through online searches. Using data made available by Google, health departments can carry out region-specific monitoring, which better equips them to prepare for emergencies and stop ongoing epidemics from spreading further. Google Flu Trends, which currently offers only historical estimates, now has a competitor: SickWeather, a Twitter-based app that also aims to anticipate trends in several diseases.
The use of Big Data for public health surveillance is still in its implementation phase, and it is already raising ethical issues which are far from being settled. Some experts argue that excessive predictive measures could lead to panic, poor allocation of limited supplies, lack of medical resources and, as certain reactions to the recent Ebola outbreak have shown, stigmatization of communities and nationalities. Regulatory bodies for medical ethics lack the extensive analyses needed to deal with the whole range of digital detection systems and to make sure this assemblage of information doesn’t lead to any undesired effects.
Nevertheless, a few days ago, the Canadian company BlueDot used Big Data to predict Zika’s propagation patterns. Weather data were turned into mathematical models using temperature maps, population density numbers and several other variables that affect mosquito concentration. Besides showing the presence of Aedes aegypti mosquitoes in a large portion of Latin America, the resulting risk map indicated other areas that are vulnerable to both infection and transmission of the virus. Researchers say they will continue to add new data, which might change the model, perhaps significantly.
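BlueDot’s actual model is not described here, so the following is only a toy sketch of how variables such as temperature and population density might be combined into a single risk score per region. The suitability curve, its temperature breakpoints, the density cap and the region figures are all assumptions invented for illustration, not entomological or epidemiological facts.

```python
def temperature_suitability(temp_c, low=15.0, optimum=28.0, high=38.0):
    """Triangular suitability curve in [0, 1].

    The breakpoints (low, optimum, high) are assumed values for illustration.
    """
    if temp_c <= low or temp_c >= high:
        return 0.0
    if temp_c <= optimum:
        return (temp_c - low) / (optimum - low)
    return (high - temp_c) / (high - optimum)


def risk_score(temp_c, people_per_km2, density_cap=5000.0):
    """Multiply temperature suitability by a capped, normalized density factor."""
    density_factor = min(people_per_km2, density_cap) / density_cap
    return temperature_suitability(temp_c) * density_factor


# Hypothetical regions: (mean temperature in Celsius, people per square km).
regions = {
    "Region A": (28.0, 5000.0),
    "Region B": (21.5, 2500.0),
    "Region C": (10.0, 8000.0),
}

# Rank regions from highest to lowest toy risk score.
ranked = sorted(regions, key=lambda name: risk_score(*regions[name]), reverse=True)
for name in ranked:
    print(name, round(risk_score(*regions[name]), 2))
```

In this toy version a cold but crowded region scores zero, which reflects the multiplicative design choice: both conditions must be present for the score to rise. A real model would draw on many more variables, as the researchers themselves note.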


It’s important to realize that the use of Big Data in the health sector covers a lot more than viral infections and emergency situations. Environmental scientists have collected an enormous amount of data about air quality in polluted areas which, combined with health data sets, can help them study new ways to treat and prevent respiratory diseases. Epidemiologists have been gathering information from social networking websites to track the spread of sexually transmitted diseases and even create early warning systems.
Data mining is changing the way we do science. The potential for hypothesis generation offered by Big Data can help doctors and researchers gain insights into public health problems, from emergencies to permanent issues. If they are applied carefully, and after a proper discussion of the ethical issues involved, health databases offer a real possibility of improving the prevention and treatment of diseases, with the potential to transform healthcare policies.