Interpreting Poisson Regression

Hannah Do
3 min read · Dec 8, 2021


What is Poisson Regression?

Linear regression predicts continuous values, and logistic regression predicts the probability of class membership (typically one of two classes). How about Poisson regression?

Poisson regression predicts count-based data.

Count-based data records how many times an event occurs at a certain rate. Its distribution is discrete rather than continuous, and it is limited to non-negative integer values.

For example, it could be an incidence rate: the number of events that occur per unit of person-time.

Incidence, R Epidemics Consortium
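As a small illustration of the arithmetic, an incidence rate is just a count of events divided by the person-time at risk. The group names and numbers below are hypothetical, made up only to show the calculation.

```python
# Hypothetical follow-up data: events observed and person-years at risk per group.
groups = {
    "exposed": {"events": 12, "person_years": 400.0},
    "unexposed": {"events": 5, "person_years": 500.0},
}


def incidence_rate(events, person_time):
    """Number of events per unit of person-time."""
    return events / person_time


rates = {
    name: incidence_rate(g["events"], g["person_years"])
    for name, g in groups.items()
}
# Ratio of the two rates: how much more often events occur in one group.
rate_ratio = rates["exposed"] / rates["unexposed"]

for name, rate in rates.items():
    print(f"{name}: {rate * 1000:.1f} events per 1,000 person-years")
print(f"rate ratio: {rate_ratio:.1f}")
```

The numerator here is a count, which is why a model for counts fits this kind of data.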

Poisson regression is suitable for this type of data because the Poisson distribution is the natural, classical distribution for modeling counts.

  • The distribution is skewed to the right, and the skew shrinks as lambda increases.
Syed Naeem Ahmed, Essential statistics for data analysis, Physics and Engineering of Radiation Detection (Second Edition), 2015
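How the shape depends on lambda can be checked numerically. The sketch below uses only the standard library: it evaluates the Poisson probability mass function and computes the skewness from it, which should match the known closed form 1 / sqrt(lambda). The function names and the truncation point k_max are choices made for this illustration.

```python
import math


def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson distribution with rate lam (computed in log space)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def skewness(lam, k_max=100):
    """Skewness computed numerically from the PMF, truncated at k_max."""
    ks = range(k_max + 1)
    probs = [poisson_pmf(k, lam) for k in ks]
    mean = sum(k * p for k, p in zip(ks, probs))
    var = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))
    third = sum((k - mean) ** 3 * p for k, p in zip(ks, probs))
    return third / var ** 1.5


for lam in (1, 4, 10):
    # Numerical skewness next to the closed form 1 / sqrt(lambda).
    print(lam, round(skewness(lam), 3), round(1 / math.sqrt(lam), 3))
```

The skewness is always positive (a longer tail on the right) and gets smaller as lambda grows, so the distribution looks more symmetric for large lambda.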

In other words, Poisson regression assumes that the target variable follows a Poisson distribution, whereas ordinary linear regression assumes a normal distribution for the dependent variable.
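Written out, the assumption looks like this. The notation is introduced here for illustration and does not come from the original figures: y_i is the observed count, x_i the feature vector, beta the coefficient vector, and lambda_i the expected count. The log link shown is the usual default for a Poisson GLM.

```latex
y_i \mid x_i \sim \mathrm{Poisson}(\lambda_i),
\qquad
\log \lambda_i = x_i^{\top} \beta
\quad\Longleftrightarrow\quad
\mathbb{E}[\, y_i \mid x_i \,] = \lambda_i = \exp\!\left( x_i^{\top} \beta \right)
```

Because the mean is the exponential of a linear function, predictions are always positive, and a one-unit increase in a feature multiplies the expected count by the exponential of its coefficient.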

The following example applies Poisson regression to predicting Dow Jones stock volume (a count), based on a dataset from the UCI repository.

Dow Jones data from UCI repository

Using Python's statsmodels library and patsy's dmatrices, the preprocessed data can be set up as matrices and fit to a generalized linear model (GLM) with a Poisson distribution.

Poisson GLM (Generalized Linear Model) Summary
  • Notice that the method is named IRLS (iteratively reweighted least squares), which is used to find the maximum likelihood estimates of a GLM. In addition, one coefficient is estimated for each feature (day, day of week, high price, low price, etc.).
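To make IRLS less of a black box, here is a rough NumPy sketch of the idea for a Poisson model with a log link. It is an illustration under simplifying assumptions (no offsets, no regularization, synthetic data), not the statsmodels implementation; the function name poisson_irls and its parameters are made up for this example.

```python
import numpy as np


def poisson_irls(X, y, n_iter=25, tol=1e-8):
    """Fit a log-link Poisson GLM by iteratively reweighted least squares."""
    mu = (y + y.mean()) / 2.0      # starting means, strictly positive
    eta = np.log(mu)               # linear predictor on the log scale
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w = mu                     # IRLS weights for the log link
        z = eta + (y - mu) / mu    # working response
        XtW = X.T * w              # same as X.T @ diag(w)
        # Weighted least squares step: solve (X'WX) beta = X'Wz.
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        eta = X @ beta_new
        mu = np.exp(eta)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Synthetic data with known coefficients (intercept 1.0, slope 0.5).
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([1.0, 0.5])
y = rng.poisson(np.exp(X @ true_beta)).astype(float)

beta_hat = poisson_irls(X, y)
# At the maximum likelihood estimate the score X'(y - mu) is zero.
score = X.T @ (y - np.exp(X @ beta_hat))
print(beta_hat, score)
```

Each pass solves an ordinary weighted least squares problem, then updates the weights and the working response from the new fit, which is where the name comes from.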

Finally, predicting on the test data returns the mean value, standard error, and confidence interval for each prediction.

Prediction Summary — Mean, Standard Error, Confidence Intervals
XOM stock count based on Dow Jones data and Poisson Regression

As you can see, the predicted stock volume over the time period tracks the actual data closely. We can conclude that Poisson regression worked well for this count dataset.

References

  1. Poisson Regression Part I | Statistics for Applied Epidemiology | Tutorial 9 https://www.youtube.com/watch?v=0XfXHYDYoBA
  2. Poisson distribution (ScienceDirect), Mathematical Modeling (Fourth edition), 2013. https://www.sciencedirect.com/topics/mathematics/poisson-distribution
  3. Time Series Analysis, Regression and Forecasting https://timeseriesreasoning.com/contents/poisson-regression-model/
