Computation: 2.11.13 in RStan

To give us another close look at modeling and computing using RStan, here’s exercise 2.11.13.

We start out with a table of fatal airplane accidents and deaths per year over a 10-year period. The death rate is number of passenger deaths per 1e8 passenger miles.

Year   Fatal Acc.    Pass. Death  Death Rate
1976 24 734 0.19
1977 25 516 0.12
1978 31 754 0.15
1979 31 877 0.16
1980 22 814 0.14
1981 21 362 0.06
1982 26 764 0.13
1983 20 809 0.13
1984 16 223 0.03
1985 22 1066 0.15

a)

First we are instructed to assume these are Poisson distributed. We get to pick a prior ourselves, and should determine a posterior distribution estimating the Poisson rate. Using the rate, give a 95% predictive interval for the number of fatal accidents in 1986.

As pointed out in the text, Gamma is a reasonable conjugate prior for the Poisson distribution. The text suggests that in order to pick a good prior, we can estimate mean and standard deviation from the data, and then solve for the corresponding parameters. We get

data = data.frame(
Year = 1976:1985,
FatAcc = c(24,25,31,31,22,21,26,20,16,22),
Deaths = c(734,516,754,877,814,362,764,809,223,1066))
data %>% summarize(FatAcc_mu = mean(FatAcc),
FatAcc_var = var(FatAcc))
FatAcc_mu FatAcc_var
1 23.8 22.17778

We seek Gamma parameters that produce mean 23.8 and variance 22.17. Stan uses the shape/rate version of the Gamma distribution, so that for gamma(alpha,beta), we get

  • mean = alpha/beta
  • variance = alpha/beta²

Solving for alpha and beta:

  • alpha = 25.55
  • beta = 1.07

Hence the model we can propose is going to be

  • FatAcc ~ Poisson(theta)
  • theta ~ Gamma(25.55,1.07)

One way that Stan handles predictions is to make the predicted variable a generated quantity. These are sampled alongside the parameters as the model fits.

Armed with this, we are ready to write some Stan code!

```{stan output.var="model"}
data {
int<lower=0> N;
int<lower=0> FatAcc[N];
}
parameters {
real<lower=0> theta;
}
model {
theta ~ gamma(25.55, 1.07);
FatAcc ~ poisson(theta);
}
generated quantities {
int<lower=0> FatAccPred;
FatAccPred = poisson_rng(theta);
}
```

Our data is the count of fatal accidents. We know it can’t be negative. We do allow for more than one observation — which necessitates the N and the array shape for FatAcc.

Our parameter 𝜃 is the rate for our Poisson distribution.

Our prior is the Gamma prior with hyperparameters as we estimated from the data.

Once the model is created, we can start sampling from it to get empirical distributions on 𝜃. We draw 1000 predictions to get a good empirical distribution for the prediction.

fit = sampling(model, data=list(N=nrow(accidents), 
FatAcc=accidents$FatAcc))

As usual, this gives a long output with progress reports on the sampling simulation. At the end we get an object fit out, with the fitted model. Inspecting the results:

> print(fit)
Inference for Stan model: stan-436319e1c875.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.

mean se_mean sd 2.5% 25% 50% 75% 97.5%
theta 23.81 0.04 1.49 20.96 22.79 23.80 24.77 26.83
FatAccPred 23.89 0.09 5.21 14.00 20.00 24.00 27.00 35.00
lp__ 571.39 0.02 0.73 569.25 571.21 571.68 571.85 571.90
n_eff Rhat
theta 1431 1
FatAccPred 3748 1
lp__ 1723 1

Samples were drawn using NUTS(diag_e) at Thu Oct 3 13:22:20 2019.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).

From the printout we can read off a 95% predictive interval as (14.00, 35.00).

b)

Now, let’s do the same thing — but with each year independent Poisson with a rate and exposure, where exposure is proportional to passenger miles flown. In other words FatAcc ~ poisson(x*theta) for some x proportional to the passenger miles flown.

We can estimate x from the rate and the passenger deaths as (using 10¹² passenger miles as our unit)

accidents$MilesE12 = 1e8 * accidents$Deaths / accidents$Rate / 1e12

We can assume that the joint proportionality constant is absorbed by the 𝜃 parameter, making our model:

data {
int<lower=0> N;
int<lower=0> FatAcc[N];
real<lower=0> MilesE8[N];

real<lower=0> MilesE8new;
}
parameters {
real<lower=0> theta;
}
model {
theta ~ gamma(25.55, 1.07);
for(n in 1:N)
FatAcc[n] ~ poisson(MilesE8[n]*theta);
}
generated quantities {
int<lower=0> FatAccPred;
FatAccPred = poisson_rng(MilesE8new*theta);
}

We run and examine the model. The input 8 * 10¹¹ for prediction was given in the exercise.

expfit = sampling(expmodel, data=list(
N = nrow(accidents), FatAcc = accidents$FatAcc,
MilesE12 = accidents$MilesE12, MilesE12new = 8e11/1e12
))
> print(expfit)
Inference for Stan model: stan-43636ec3d8c0.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.
mean se_mean sd 2.5% 25% 50% 75% 97.5%
theta 38.89 0.06 2.36 34.61 37.19 38.82 40.41 43.80
FatAccPred 31.14 0.10 5.89 20.00 27.00 31.00 35.00 43.00
lp__ 558.41 0.02 0.67 556.51 558.24 558.66 558.84 558.89
n_eff Rhat
theta 1718 1
FatAccPred 3395 1
lp__ 1383 1
Samples were drawn using NUTS(diag_e) at Thu Oct 3 13:41:17 2019.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).

We can read off the 95% predictive interval as (20,43).

Now you…

The next two steps ask for the same thing — but with passenger deaths instead of number of accidents. Adapt the code to this case and produce posterior distributions and predictions.

--

--

Mikael Vejdemo-Johansson
CUNY CSI MTH594 Bayesian Data Analysis

Applied algebraic topologist with wide-spread interests: functional programming, cooking, music, LARPs, fashion, and much more.