Hey, Colin…interesting article!
Jordan Hoolachan
Sure.
Building a panel data set doesn’t automatically solve the leakage issue. We still can’t use revenue in December 2015 to predict churn in January 2015, for instance.
But if we are careful, we can build a model such that, for instance, the likelihood of churning this month is a function of some variables from last month, or the last two months, or the last year, as well as perhaps some other time-invariant variables.
Panel data in particular is amenable to this kind of modelling because it is generally easy to take lags and first differences, so you can have a model that goes something like Y = f(lag(X)), where Y is predicted by the lagged value of X.
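To make the lag idea concrete, here is a minimal sketch in pandas. The column names (customer_id, month, revenue, churned) and all the numbers are made up for illustration; the point is only that each row's predictors come from earlier months of the same customer.

```python
import pandas as pd

# Toy panel: one row per (customer, month). All names and values are hypothetical.
panel = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 2],
    "month": pd.to_datetime(["2015-01-01", "2015-02-01", "2015-03-01"] * 2),
    "revenue": [100.0, 80.0, 0.0, 50.0, 55.0, 60.0],
    "churned": [0, 0, 1, 0, 0, 0],
})

# Sort so that shifting by one row within a customer means "previous month".
panel = panel.sort_values(["customer_id", "month"])
revenue_by_customer = panel.groupby("customer_id")["revenue"]

# Lags are taken within each customer, so no value crosses from one customer to another.
panel["revenue_lag1"] = revenue_by_customer.shift(1)
panel["revenue_lag2"] = revenue_by_customer.shift(2)

# Lagged first difference: the change from two months ago to last month.
# It uses only past values, so it is safe as a predictor for this month.
panel["revenue_lag1_change"] = panel["revenue_lag1"] - panel["revenue_lag2"]

# Each customer's first month has no history, so drop it before fitting.
model_rows = panel.dropna(subset=["revenue_lag1"])
```

From here, churned would be the Y and the lagged columns the X in whatever model you prefer. Note that the current month's revenue is deliberately left out of the predictors.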