Part 2: IARPA Super Forecasters Challenge — IFP 840

Baur Safi
2 min read · Mar 31, 2018


This story continues my previous post from earlier today: https://medium.com/@baursafi/iarpa-super-forecaster-challenge-ifp-840-can-we-predict-number-of-influenza-detections-between-e436e6955926

Previously we looked at a graph of the cumulative number of influenza-positive detections in Argentina during the first 11 weeks of each year from 1997 to 2017, plotted against the number of detections during week 28 of the same year.

2009 is obviously an outlier. Let’s drop it and check whether, in the remaining years, there is a strong correlation between the cumulative number of influenza detections in the first 11 weeks of the year and the number of detections during week 28.

# Drop 2009 (row position 12, counting from 1997 = 0)
a.drop(a.index[12], inplace=True)
# Check the correlation with NumPy, leaving out the last row (2018)
import numpy as np
x = list(a["11Weeks"][:-1])
y = list(a["Week28"][:-1])
np.corrcoef(x, y)
OUT:
array([[ 1.        ,  0.74315895],
       [ 0.74315895,  1.        ]])
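To make the row selection above easier to follow, here is a minimal sketch on a made-up table. The name `toy` and every number in it are hypothetical; the only assumption carried over from the post is that the DataFrame has one row per year from 1997 through 2018, in order. It uses `.iloc` to make the positional slicing explicit.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame `a`: one row per year, 1997-2018.
# All values are invented for illustration; they are not the Argentina data.
years = list(range(1997, 2019))
week28 = [50.0 + 3 * i for i in range(len(years))]
week28[-1] = float("nan")  # week 28 of 2018 has not been observed yet

toy = pd.DataFrame(
    {"11Weeks": [10 + i for i in range(len(years))], "Week28": week28},
    index=years,
)

# Position 12 counts from 1997 = 0, so it lands on 2009.
dropped_year = toy.index[12]
toy = toy.drop(dropped_year)

# .iloc[:-1] keeps every remaining row except the last one (2018),
# so the unobserved year never enters the correlation or the fit.
x = toy["11Weeks"].iloc[:-1].tolist()
y = toy["Week28"].iloc[:-1].tolist()

print(dropped_year, len(x), len(y))
```

If the real table started in a different year or had gaps, position 12 would point at a different row, so it is worth printing `a.index[12]` before dropping.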

Let’s fit a linear regression using the stats module of SciPy:

from scipy import stats
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
print('\
Slope: {}\n\
Intercept: {} \n\
R-value: {} \n\
p-value: {} \n\
Standard Error: {}'.format(slope, intercept, r_value, p_value, std_err))
OUT:
Slope: 3.2563835456560057
Intercept: 32.24674281828507
R-value: 0.7431589521528315
p-value: 0.0001737712726845809
Standard Error: 0.6910640438880239
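Note that the R-value reported here is the same 0.7432 that `np.corrcoef` returned earlier: for a regression with a single predictor, the R-value is the Pearson correlation. As a rough sketch of where the slope and intercept come from, the ordinary least squares formulas can be written out by hand in NumPy. The sample arrays below are invented for illustration and are not the Argentina data.

```python
import numpy as np

# Made-up sample, used only to illustrate the formulas.
x = np.array([5.0, 12.0, 20.0, 33.0, 41.0, 58.0])
y = np.array([40.0, 75.0, 90.0, 150.0, 160.0, 230.0])

# Ordinary least squares for y = intercept + slope * x.
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# Pearson correlation between x and y.
r = np.corrcoef(x, y)[0, 1]

# The slope is the correlation rescaled by the two standard deviations.
slope_from_r = r * y.std() / x.std()

print("slope:", slope)
print("intercept:", intercept)
print("r:", r, "r squared:", r ** 2)
```

The square of the R-value is the share of the variance in `y` that the fitted line accounts for, which is a handy sanity check on how much weight the point prediction deserves.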

The next step is simply to apply the fitted line to the number of positive influenza cases already observed in the first 11 weeks of 2018 (27 cases):

# Take the last row (2018) as a plain number rather than a one-element Series
prediction = intercept + slope * float(a["11Weeks"].iloc[-1])
print("Predicted number of flu detections in week 28 of 2018: {:.1f}".format(prediction))
OUT:
Predicted number of flu detections in week 28 of 2018: 120.2
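The same arithmetic can be reproduced without the DataFrame, using the coefficients printed in the regression output and the 27 cases quoted in the text. The helper `ifp_840_bin` is a hypothetical name introduced here; it only encodes the answer options of the question so we can see which one contains the point estimate.

```python
# Coefficients as reported in the linregress output.
slope = 3.2563835456560057
intercept = 32.24674281828507
cases_first_11_weeks = 27  # 2018 value quoted in the text

prediction = intercept + slope * cases_first_11_weeks


def ifp_840_bin(value):
    """Return the answer option of IFP 840 that contains `value`."""
    if value < 60:
        return "Less than 60"
    if value <= 170:
        return "Between 60 and 170, inclusive"
    if value < 270:
        return "More than 170 but less than 270"
    if value <= 390:
        return "Between 270 and 390, inclusive"
    return "More than 390"


print(round(prediction, 3), "->", ifp_840_bin(prediction))
```

A point estimate is only one input to the forecast: it says nothing about how far a real season can stray from the line, which is why the probabilities end up spread over several options.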

To sum up, I suggest submitting the following probabilities as our forecast for IFP 840:

(2595) Less than 60

(2594) Between 60 and 170, inclusive: 0.1

(2593) More than 170 but less than 270: 0.5

(2592) Between 270 and 390, inclusive: 0.2

(2591) More than 390: 0.1
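Before submitting, it is worth checking that the probabilities add up to 1. Here is a small sketch of that check using the values as listed; the dictionary name `forecast` is hypothetical, and the option with no value shown in the list is stored as `None` rather than guessed.

```python
import math

# Probabilities as they appear in the list; None marks the option
# for which no value is shown.
forecast = {
    "Less than 60": None,
    "Between 60 and 170, inclusive": 0.1,
    "More than 170 but less than 270": 0.5,
    "Between 270 and 390, inclusive": 0.2,
    "More than 390": 0.1,
}

stated = [p for p in forecast.values() if p is not None]
stated_total = sum(stated)

# Whatever is left over must be covered by the options without a value.
unassigned = 1.0 - stated_total

print("stated total:", round(stated_total, 3))
print("left for options without a value:", round(unassigned, 3))
```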
