Halloween Candy Count M&Ms
We are tasked with building a multinomial model for categorical data, specifically the counts of the different M&M colors.
```{stan, output.var="model"}
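data {
  int<lower=1> N;        // number of observations (rows)
  int<lower=1> K;        // number of categories (colors)
  int<lower=0> y[N,K];   // counts of each category per observation
}
parameters {
  simplex[K] theta;      // category proportions, constrained to sum to 1
}
model {
  // Dirichlet(1, ..., 1) prior: uniform over the simplex
  theta ~ dirichlet(rep_vector(1, K));
  // each row of counts is a multinomial draw with proportions theta
  for(n in 1:N) {
    y[n,] ~ multinomial(theta);
  }
}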
```
Here N is the number of rows and K is the number of columns. Because the elements of theta must sum to 1, we declare theta as a simplex of length K. The Dirichlet distribution is our prior and the multinomial is our likelihood.
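With this prior the posterior is also available in closed form, which gives a useful check on the sampler. Writing $y_{n,k}$ for the count of category $k$ in row $n$ (notation introduced here for illustration, matching `y[n,k]` in the Stan code), conjugacy of the Dirichlet and multinomial gives:

$$
p(\theta \mid y) \;\propto\; \underbrace{\prod_{k=1}^{K} \theta_k^{1-1}}_{\text{Dirichlet}(1,\dots,1)\ \text{prior}} \; \underbrace{\prod_{n=1}^{N} \prod_{k=1}^{K} \theta_k^{y_{n,k}}}_{\text{multinomial likelihood}} \;=\; \prod_{k=1}^{K} \theta_k^{\sum_{n=1}^{N} y_{n,k}}
$$

$$
\theta \mid y \sim \text{Dirichlet}\!\left(1 + \sum_{n=1}^{N} y_{n,1}, \;\dots,\; 1 + \sum_{n=1}^{N} y_{n,K}\right), \qquad E[\theta_k \mid y] = \frac{1 + \sum_{n=1}^{N} y_{n,k}}{K + \sum_{n=1}^{N} \sum_{j=1}^{K} y_{n,j}}
$$

So the posterior mean of each theta should land close to that category's share of the total count, pulled slightly toward 1/K by the prior.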
Loading the data
Using the tidyverse, we read in the CSV export of the spreadsheet and keep only the rows where Type is “Peanut” and Plant is “CLV”. We then drop those two columns so that only the count columns are passed to the model.
```{r}
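library(tidyverse)  # provides %>%, filter(), and select()

# Keep Peanut M&Ms from the CLV plant, then drop the identifier columns
mnms = read.csv("Halloween Candy Counts - M&Ms.csv") %>%
  filter(Type == "Peanut", Plant == "CLV") %>%
  select(-Plant, -Type)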
```
Next we fit the model. The data list contains the number of rows (N), the number of columns (K), and the counts themselves (y).
```{r}
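library(rstan)  # provides sampling()

# Draw posterior samples using the compiled Stan model from the chunk above
fit = sampling(model, data = list(N = nrow(mnms),
                                  K = ncol(mnms),
                                  y = mnms))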
```
Printing the fitted model, we get:
We can see that the difference between the median and the mean is very small, so the posterior distribution of each theta is fairly symmetric. We can also see that theta 5 and theta 6 are smaller than the rest, meaning those two colors make up a smaller share of the M&Ms than the others.
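Because the Dirichlet prior is conjugate to the multinomial likelihood, the posterior under this model is Dirichlet with parameters equal to 1 plus the column totals, so summaries like the ones above can be cross-checked without Stan. The sketch below does this in base R. The `counts` matrix is made-up illustration data, not the M&M spreadsheet, and `rdirichlet_simple` is a hypothetical helper written for this example.

```r
set.seed(1)

# Made-up counts for illustration only: 3 rows x 6 categories
counts <- matrix(c(5, 4, 6, 5, 2, 1,
                   6, 5, 4, 6, 1, 2,
                   4, 6, 5, 5, 2, 2),
                 nrow = 3, byrow = TRUE)

# Dirichlet(1, ..., 1) prior + multinomial likelihood gives a
# Dirichlet(1 + column totals) posterior
alpha_post <- 1 + colSums(counts)

# Exact posterior mean of each theta[k]
post_mean <- alpha_post / sum(alpha_post)

# Hypothetical helper: draw from a Dirichlet by normalizing independent gammas
rdirichlet_simple <- function(n, alpha) {
  g <- matrix(rgamma(n * length(alpha), shape = rep(alpha, each = n)),
              nrow = n)
  g / rowSums(g)
}

draws <- rdirichlet_simple(4000, alpha_post)

# Compare the mean and median of the draws for each category
round(rbind(mean = colMeans(draws),
            median = apply(draws, 2, median)), 3)
```

If the sampler is working, the means reported by Stan for the real data should agree closely with the same closed-form calculation applied to the real column totals.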
This is our model fitted to M&Ms from another plant (HKP):
This is our model fitted to M&Ms from another plant (PA):
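The same filter-and-fit steps repeat for each plant, so they could be wrapped in a small helper. This is a minimal sketch, assuming the data has `Type` and `Plant` columns as above and that every remaining column is a count; `make_stan_data` and the toy data frame (including its color column names) are hypothetical and only there to make the example runnable.

```r
# Hypothetical helper: build the Stan data list for one type/plant combination
make_stan_data <- function(df, type, plant) {
  rows <- df$Type == type & df$Plant == plant
  count_cols <- setdiff(names(df), c("Type", "Plant"))
  counts <- as.matrix(df[rows, count_cols, drop = FALSE])
  list(N = nrow(counts), K = ncol(counts), y = counts)
}

# Toy data for illustration only (not the real spreadsheet)
toy <- data.frame(
  Type  = c("Peanut", "Peanut", "Plain", "Peanut"),
  Plant = c("CLV", "HKP", "CLV", "HKP"),
  Red   = c(3, 4, 10, 5),
  Blue  = c(5, 2, 12, 4),
  Green = c(4, 6, 9, 3)
)

hkp <- make_stan_data(toy, type = "Peanut", plant = "HKP")
str(hkp)
```

The resulting list has the `N`, `K`, and `y` entries the Stan model expects, so it could be passed directly as the `data` argument of `sampling()`.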