The Multinomial Probability Distribution

Maryam Raji
Published in Analytics Vidhya
5 min read · Feb 16, 2020

Multinomial Distribution

A multinomial distribution is the probability distribution of the outcomes from a multinomial experiment.

Multinomial Experiment

A multinomial experiment is a statistical experiment that has the following properties:

1.) The experiment consists of n repeated trials.
2.) Each trial has a discrete number of possible outcomes.
3.) On any given trial, the probability that a particular outcome will occur is constant.
4.) The trials are independent; that is, the outcome on one trial does not affect the outcome on other trials.

Consider the following statistical experiment. You toss two dice three times and record the outcome on each toss. This is a multinomial experiment because:

1.) The experiment consists of repeated trials: we toss the dice three times.
2.) Each trial can result in a discrete number of outcomes: the sum of the two dice, 2 through 12.
3.) The probability of any outcome is constant; it does not change from one toss to the next.
4.) The trials are independent; that is, getting a particular outcome on one trial does not affect the outcome of other trials.
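To make "the probability of any outcome is constant" concrete, here is a small sketch that lists the probability of each sum for a single toss. It assumes two fair six-sided dice, which the example above does not state explicitly; the variable names are my own.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely (die1, die2) pairs and tally their sums.
# Assumption: both dice are fair and six-sided.
sum_counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Probability of each possible outcome (2 through 12) on a single toss.
sum_probs = {total: Fraction(count, 36) for total, count in sorted(sum_counts.items())}

for total, prob in sum_probs.items():
    print(f"P(sum = {total:2d}) = {prob}")
```

These eleven probabilities are the same on every toss, which is what makes the three tosses a multinomial experiment with k = 11 outcomes.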

As another example, suppose you are given a bag of marbles. Inside the bag are 5 red marbles, 4 white marbles, and 3 blue marbles. What is the probability that, in 6 trials, you choose 3 marbles that are red, 1 marble that is white, and 2 marbles that are blue, replacing each marble after it is chosen?

Notice that this is not a binomial experiment, since there are more than 2 possible outcomes. For binomial experiments, k = 2 (2 outcomes), so the binomial formula applies to problems involving heads or tails, yes or no, or success or failure (the key is a binary outcome for each independent trial). In this problem, there are 3 possible outcomes: red, white, or blue.

Note: A binomial experiment is a special case of a multinomial experiment. Here is the main difference. With a binomial experiment, each trial can result in two — and only two — possible outcomes. With a multinomial experiment, each trial can have two or more possible outcomes.
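A quick way to see the special case: with only two outcomes, the multinomial probability n! / (n1! n2!) * p1**n1 * p2**n2 is the familiar binomial probability. The sketch below checks this numerically; the values n = 10, p = 0.3, and 4 successes are illustrative numbers I picked, not from the examples in this article.

```python
from math import comb, factorial

n = 10            # number of trials (illustrative value)
p_success = 0.3   # probability of "success" on each trial (illustrative value)
successes = 4     # number of successes we ask about
failures = n - successes

# Binomial form: C(n, k) * p**k * (1 - p)**(n - k)
binomial = comb(n, successes) * p_success**successes * (1 - p_success)**failures

# Multinomial form with two categories: counts (k, n - k), probabilities (p, 1 - p)
coefficient = factorial(n) // (factorial(successes) * factorial(failures))
two_category = coefficient * p_success**successes * (1 - p_success)**failures

print(binomial, two_category)  # the two values agree
```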

Multinomial Formula

Suppose a multinomial experiment consists of n trials, and each trial can result in any of k possible outcomes: E1, E2, . . . , Ek. Suppose, further, that each possible outcome can occur with probabilities p1, p2, . . . , pk. Then, the probability (P) that E1 occurs n1 times, E2 occurs n2 times, . . . , and Ek occurs nk times is:

P = [ n! / ( n1! * n2! * … * nk! ) ] * ( p1**n1 * p2**n2 * . . . * pk**nk )

where n = n1 + n2 + . . . + nk.
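The formula translates almost directly into code. Below is a minimal sketch in plain Python; the function name multinomial_probability is my own, not a library function. I apply it to the bag of marbles from earlier (5 red, 4 white, 3 blue; 6 draws with replacement; 3 red, 1 white, 2 blue).

```python
from math import factorial, prod

def multinomial_probability(counts, probs):
    """Probability of observing `counts` given per-trial probabilities `probs`.

    Hypothetical helper for illustration; it does not validate that
    `probs` sums to 1.
    """
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    # Multinomial coefficient: n! / (n1! * n2! * ... * nk!)
    coefficient = factorial(n)
    for c in counts:
        coefficient //= factorial(c)
    # Product of p_i ** n_i over all outcomes
    return coefficient * prod(p**c for p, c in zip(probs, counts))

# Bag of 12 marbles: 5 red, 4 white, 3 blue; 6 draws with replacement.
p_bag = multinomial_probability([3, 1, 2], [5 / 12, 4 / 12, 3 / 12])
print(round(p_bag, 4))  # 0.0904
```

So the probability of drawing 3 red, 1 white, and 2 blue marbles from that bag is roughly 9%.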

The example below illustrates how to use the multinomial formula to compute the probability of an outcome from a multinomial experiment. [1]

Suppose we have a bowl with 10 marbles — 2 red marbles, 3 green marbles, and 5 blue marbles. We randomly select 4 marbles from the bowl, with replacement. What is the probability of selecting 2 green marbles and 2 blue marbles?

Solution: To solve this problem, we apply the multinomial formula. We know the following:

1.) The experiment consists of 4 trials, so n = 4.
2.) The 4 trials produce 0 red marbles, 2 green marbles, and 2 blue marbles, so n_red = 0, n_green = 2, and n_blue = 2.
3.) On any particular trial, the probability of drawing a red, green, or blue marble is 0.2, 0.3, and 0.5, respectively. Thus, p_red = 0.2, p_green = 0.3, and p_blue = 0.5.

We plug these inputs into the multinomial formula, as shown below:

P = ( n! / ( n1! n2! … nk! ) ) ( p1**n1 p2**n2 . . . pk**nk )

P = ( 4! / ( 0! 2! 2! ) ) ( (0.2)**0 (0.3)**2 (0.5)**2 )

P = 6 * ( 1 * 0.09 * 0.25 )

P = 0.135

Thus, if we draw 4 marbles with replacement from the bowl, the probability of drawing 0 red marbles, 2 green marbles, and 2 blue marbles is 0.135.
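If you want an independent sanity check on 0.135, you can simulate the experiment. This is only a rough Monte Carlo sketch using Python's standard random module; the seed and the number of simulated experiments are arbitrary choices of mine.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable (arbitrary choice)
colors = ["red", "green", "blue"]
weights = [0.2, 0.3, 0.5]  # 2, 3, and 5 marbles out of 10

trials = 100_000  # number of simulated experiments (arbitrary choice)
hits = 0
for _ in range(trials):
    draw = rng.choices(colors, weights=weights, k=4)  # 4 draws with replacement
    if draw.count("green") == 2 and draw.count("blue") == 2:
        hits += 1

estimate = hits / trials
print(estimate)  # should land close to the exact value 0.135
```

The estimate will not equal 0.135 exactly, but it should be close to it and get closer as the number of simulated experiments grows.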

In Austria, 30% of the population has a blood type of O+, 33% has A+, 12% has B+, 6% has AB+, 7% has O-, 8% has A-, 3% has B-, and 1% has AB-. If 15 Austrian citizens are chosen at random, what is the probability that 3 have a blood type of O+, 2 have A+, 3 have B+, 2 have AB+, 1 has O-, 2 have A-, 1 has B-, and 1 has AB-? [2]

n = 15 (15 trials)
k = 8 (8 possibilities)
p1 = 0.30 (probability of O+)
p2 = 0.33 (probability of A+)
p3 = 0.12 (probability of B+)
p4 = 0.06 (probability of AB+)
p5 = 0.07 (probability of O-)
p6 = 0.08 (probability of A-)
p7 = 0.03 (probability of B-)
p8 = 0.01 (probability of AB-)
n1 = 3 (3 O+)
n2 = 2 (2 A+)
n3 = 3 (3 B+)
n4 = 2 (2 AB+)
n5 = 1 (1 O-)
n6 = 2 (2 A-)
n7 = 1 (1 B-)
n8 = 1 (1 AB-)

P = ( n! / ( n1! n2! … nk! ) ) ( p1**n1 p2**n2 . . . pk**nk )

P = ( 15! / ( 3! 2! 3! 2! 1! 2! 1! 1! ) ) * ( 0.30**3 * 0.33**2 * 0.12**3 * 0.06**2 * 0.07**1 * 0.08**2 * 0.03**1 * 0.01**1 )

P ≈ 0.000011

Therefore, if 15 Austrian citizens are chosen at random, the probability that 3 have a blood type of O+, 2 have A+, 3 have B+, 2 have AB+, 1 has O-, 2 have A-, 1 has B-, and 1 has AB- is 0.0011%.
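With 15! in the numerator and eight small probabilities multiplied together, this is tedious to do by hand. One way to check the arithmetic, sketched below, is to work with logarithms (math.lgamma(n + 1) equals log(n!)) and exponentiate at the end. This is just one possible approach, and it assumes every probability is greater than zero so that log is defined.

```python
from math import exp, lgamma, log

counts = [3, 2, 3, 2, 1, 2, 1, 1]                          # n1 .. n8
probs = [0.30, 0.33, 0.12, 0.06, 0.07, 0.08, 0.03, 0.01]   # p1 .. p8
n = sum(counts)                                            # 15 trials

# log(n!) = lgamma(n + 1), so the log of the multinomial coefficient is:
log_coefficient = lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)
# log of p1**n1 * ... * pk**nk (assumes every p is > 0)
log_prob_terms = sum(c * log(p) for c, p in zip(counts, probs))

probability = exp(log_coefficient + log_prob_terms)
print(f"{probability:.6f}")         # 0.000011
print(f"{probability * 100:.4f}%")  # 0.0011%
```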

scipy.stats.multinomial

scipy.stats.multinomial(n, p, seed=None)

A multinomial random variable.

Parameters

x: array_like
Quantiles, with the last axis of x denoting the components (this is the argument passed to methods such as pmf).
n: int
Number of trials.
p: array_like
Probability of a trial falling into each category; should sum to 1.
random_state: None or int or np.random.RandomState instance, optional
If int or RandomState, use it for drawing the random variates. If None (or np.random), the global np.random state is used. Default is None.

Notes

n should be a positive integer. Each element of p should be in the interval [0, 1], and the elements should sum to 1 (they are the probabilities of the events). If they do not sum to 1, the last element of the p array is not used and is replaced with the remaining probability left over from the earlier elements.

Alternatively, the object may be called (as a function) to fix the n and p parameters, returning a “frozen” multinomial random variable.
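Here is a short sketch of what the frozen form looks like in practice, using the bowl of marbles from earlier (4 draws; probabilities 0.2, 0.3, and 0.5 for red, green, and blue). The variable name rv is my own choice.

```python
from scipy.stats import multinomial

# Bowl example from earlier: 4 draws, P(red, green, blue) = (0.2, 0.3, 0.5)
rv = multinomial(4, [0.2, 0.3, 0.5])  # "frozen": n and p are now fixed

print(rv.pmf([0, 2, 2]))  # probability of 0 red, 2 green, 2 blue
print(rv.pmf([1, 1, 2]))  # same n and p, different counts
print(rv.mean())          # expected count per category, n * p
```

Freezing is convenient when you want to evaluate several outcomes for the same n and p without passing them every time.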

The probability mass function for multinomial is

f(x) = ( n! / ( x1! … xk! ) ) * ( p1**x1 … pk**xk )

supported on x = (x1, …, xk), where each xi is a non-negative integer and their sum is n.
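To see what "supported on" means, the sketch below enumerates every count vector in the support for the bowl example (n = 4, three categories) and checks that the probabilities add up to 1. The helper pmf is a hypothetical plain-Python version of f(x) written for this illustration, not SciPy's implementation.

```python
from itertools import product
from math import factorial, prod

def pmf(x, probs):
    """Multinomial pmf f(x) for counts x and probabilities probs (illustrative helper)."""
    coefficient = factorial(sum(x))
    for xi in x:
        coefficient //= factorial(xi)
    return coefficient * prod(p**xi for p, xi in zip(probs, x))

n, probs = 4, [0.2, 0.3, 0.5]

# The support: every tuple of non-negative integers (x1, x2, x3) with x1 + x2 + x3 = n
support = [x for x in product(range(n + 1), repeat=len(probs)) if sum(x) == n]

total = sum(pmf(x, probs) for x in support)
print(len(support), total)  # 15 support points; total is 1 up to rounding
```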

I will use the example above, with the same n, probabilities, and counts as before: In Austria, 30% of the population has a blood type of O+, 33% has A+, 12% has B+, 6% has AB+, 7% has O-, 8% has A-, 3% has B-, and 1% has AB-. If 15 Austrian citizens are chosen at random, what is the probability that 3 have a blood type of O+, 2 have A+, 3 have B+, 2 have AB+, 1 has O-, 2 have A-, 1 has B-, and 1 has AB-? [2]

In [1]:

from scipy.stats import multinomial

# multinomial.pmf(x, n, p): x = observed counts, n = number of trials, p = category probabilities
prob = multinomial.pmf([3, 2, 3, 2, 1, 2, 1, 1], 15, [0.30, 0.33, 0.12, 0.06, 0.07, 0.08, 0.03, 0.01])
prob

Out[1]:

1.1162058001298526e-05

This matches the answer we computed by hand above (P ≈ 0.000011).

REFERENCES

1. The Stat Trek Blog
2. The CK-12 Foundation
3. scipy.stats documentation

Maryam Raji
Analytics Vidhya

I am a Data Scientist in training, mobile web development enthusiast, and writer who loves all forms of creativity. Oh! Lest I forget, I am also a medical doctor.