Halloween Candy Count M&Ms

David Angeles
CUNY CSI MTH594 Bayesian Data Analysis
Nov 12, 2019

We are tasked with building a multinomial model for categorical data, in particular the counts of the different M&M colors.

```{stan, output.var="model"}
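data {
  int<lower=1> N;       // number of observations (rows)
  int<lower=1> K;       // number of categories (columns)

  int<lower=0> y[N,K];  // counts: one row per observation, one column per category
}
parameters {
  simplex[K] theta;     // category probabilities; nonnegative and sum to 1
}
model {
  theta ~ dirichlet(rep_vector(1, K));  // uniform prior over the simplex
  for(n in 1:N) {
    y[n,] ~ multinomial(theta);         // each row is one multinomial draw
  }
}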
```

Here N is the number of rows and K is the number of columns in the data. Since the thetas must sum to 1, we declare theta as a simplex of length K. The Dirichlet distribution is our prior and the multinomial is our likelihood.
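Since the Dirichlet prior is conjugate to the multinomial likelihood, the posterior for this model is also available in closed form. The derivation below is a standard result, written in the notation of the Stan program (y for the count table, theta for the category probabilities); it is included as a reference point rather than as part of the original analysis.

```latex
\begin{aligned}
\theta &\sim \mathrm{Dirichlet}(1, \dots, 1), \qquad
y_n \mid \theta \sim \mathrm{Multinomial}(\theta), \quad n = 1, \dots, N \\[4pt]
p(\theta \mid y) &\propto \prod_{k=1}^{K} \theta_k^{\,1 - 1}
  \prod_{n=1}^{N} \prod_{k=1}^{K} \theta_k^{\,y_{nk}}
  = \prod_{k=1}^{K} \theta_k^{\left(1 + \sum_{n} y_{nk}\right) - 1} \\[4pt]
\theta \mid y &\sim \mathrm{Dirichlet}\!\left(1 + \sum_{n=1}^{N} y_{n1}, \;\dots,\; 1 + \sum_{n=1}^{N} y_{nK}\right) \\[4pt]
\mathbb{E}[\theta_k \mid y] &= \frac{1 + \sum_{n} y_{nk}}{K + \sum_{n} \sum_{j} y_{nj}}
\end{aligned}
```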

Now loading the data

Using the tidyverse library, we read in the CSV export of the spreadsheet and keep only the rows we want: those with Type “Peanut” and Plant “CLV”.

```{r}
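library(tidyverse)  # provides %>%, filter() and select()

# Keep only Peanut M&Ms from the CLV plant, then drop the two label columns
mnms = read.csv("Halloween Candy Counts - M&Ms.csv") %>%
  filter(Type=="Peanut") %>%
  filter(Plant=="CLV") %>%
  select(-Plant, -Type)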
```

Now we fit our model. The data list contains N (the number of rows), K (the number of columns), and y (the mnms count table).

```{r}
library(rstan)  # provides sampling()

fit = sampling(model, data=list(N=nrow(mnms),  # number of rows
                                K=ncol(mnms),  # number of columns
                                y=mnms))       # count table
```

Printing our model, we get:

We can see that the difference between the median and the mean of each theta is very small, so the posterior distributions are fairly symmetric. We can also see that theta 5 and theta 6 are smaller than the rest, meaning those two colors are less common in mnms.
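One way to sanity-check a summary like this is to compare it with the closed-form posterior, since a Dirichlet prior combined with a multinomial likelihood gives a Dirichlet posterior. The sketch below is only an illustration: the count matrix `y` is made up (it is not the Halloween data), and the names `alpha_prior`, `alpha_post`, `post_mean` and `draws` are hypothetical helpers rather than anything from the original analysis.

```r
# Hypothetical counts: 3 rows by 6 color categories.
# These numbers are invented for illustration, not taken from the data set.
y <- matrix(c(4, 3, 5, 2, 1, 2,
              5, 2, 4, 3, 2, 1,
              3, 4, 6, 2, 1, 1),
            nrow = 3, byrow = TRUE)

alpha_prior <- rep(1, ncol(y))           # Dirichlet(1, ..., 1), as in the Stan model
alpha_post  <- alpha_prior + colSums(y)  # conjugate update: add the total count per category

# Analytic posterior mean of each theta
post_mean <- alpha_post / sum(alpha_post)

# Draw from the Dirichlet posterior by normalizing independent gamma variates.
# The matrix is filled column by column, so column k uses shape alpha_post[k].
set.seed(1)
n_draws <- 4000
draws <- matrix(rgamma(n_draws * length(alpha_post),
                       shape = rep(alpha_post, each = n_draws)),
                nrow = n_draws)
draws <- draws / rowSums(draws)

# Mean and median per category; these should sit close to post_mean
summary_table <- cbind(analytic_mean = post_mean,
                       mc_mean       = colMeans(draws),
                       mc_median     = apply(draws, 2, median))
print(round(summary_table, 3))
```

With the real data, replacing `y` by `as.matrix(mnms)` should give means close to those reported by Stan for theta, up to Monte Carlo error.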

This is our model fitted to M&Ms from another plant (HKP):

This is our model fitted to M&Ms from (PA):
