Dengue Fever Prediction

Jack Ross
Published in Analytics Vidhya
5 min read · Mar 6, 2020

Dengue viruses are spread to people through the bite of an infected Aedes species (Ae. aegypti or Ae. albopictus) mosquito. Dengue is common in more than 100 countries around the world. Forty percent of the world’s population, about 3 billion people, live in areas with a risk of dengue. Dengue is often a leading cause of illness in these areas.

I’ll be using data from San Juan, Puerto Rico, and Iquitos, Peru, to predict the total number of dengue fever cases for each week. Let’s start by looking at total dengue cases plotted over time.

As we can see above, we have 18 years’ worth of data for San Juan (1990–2007) but only 10 years for Iquitos (2000–2009). To account for this, I split the data into two groups by city, after first holding out a validation set from the training data to avoid leakage. It’s also hard to see any clear pattern in the plot above, so I engineered a “month” feature to get a better sense of when infections are most likely to occur. This turned out to be the most important feature in the San Juan data, as shown in the plot further down.
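Here is a minimal sketch of those steps in pandas. It is not the exact code behind this post: the column names (`city`, `week_start_date`, `total_cases`), the tiny made-up sample, the helper names, and the choice of a chronological hold-out (the last 25% of each city's weeks) are all assumptions for illustration.

```python
import pandas as pd

# Made-up sample standing in for the real data. The column names
# (city, week_start_date, total_cases) are assumptions.
df = pd.DataFrame({
    "city": ["sj", "sj", "sj", "sj", "iq", "iq", "iq", "iq"],
    "week_start_date": ["1990-04-30", "1990-05-07", "1990-10-01", "1990-10-08",
                        "2000-07-01", "2000-07-08", "2000-12-02", "2000-12-09"],
    "total_cases": [4, 5, 20, 22, 0, 1, 3, 2],
})


def add_month(frame):
    """Return a copy with parsed dates and a 1-12 'month' feature."""
    out = frame.copy()
    out["week_start_date"] = pd.to_datetime(out["week_start_date"])
    out["month"] = out["week_start_date"].dt.month
    return out


def train_val_split(frame, val_fraction=0.25):
    """Hold out the most recent val_fraction of each city's weeks (assumed scheme)."""
    frame = frame.sort_values(["city", "week_start_date"])
    position = frame.groupby("city").cumcount()                 # 0, 1, 2, ... within each city
    n_rows = frame.groupby("city")["city"].transform("count")   # rows per city
    is_val = position >= n_rows * (1 - val_fraction)
    return frame[~is_val], frame[is_val]


# Hold out the validation rows first, then group each part by city.
train, val = train_val_split(add_month(df))
train_by_city = {city: group for city, group in train.groupby("city")}
val_by_city = {city: group for city, group in val.groupby("city")}

# Average weekly cases per calendar month in the San Juan training rows.
print(train_by_city["sj"].groupby("month")["total_cases"].mean())
```

Holding out the validation rows before doing anything else keeps every later step, such as fitting a separate model per city, from seeing them.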
