Your Guide to Discrete Probability Distributions and Their Applications in R

Spardha
Published in Analytics Vidhya
9 min read · Aug 20, 2021

Probability distributions are statistical functions that describe the likelihood of obtaining possible values that a random variable can take. This article will explore the different types of discrete probability distributions along with their code in R. Each distribution is supplemented by a real-world example.

Broadly, probability distributions can be divided into two categories:

A. Discrete Probability Distribution

It models the probabilities of random variables that can have discrete values as outcomes. A discrete random variable is a random variable that has countable values, such as a list of non-negative integers. Discrete probability functions are also known as probability mass functions.

Example: If you’re counting the number of books that a library checks out per hour, you can count 15 or 16 books, but nothing in between.

Discrete Probability Distributions can further be divided into

1. Binomial Distribution

2. Multinomial Distribution

3. Bernoulli Distribution

4. Negative Binomial Distribution

5. Poisson Distribution

6. Geometric Distribution

7. Hypergeometric Distribution

B. Continuous Probability Distribution

It models the probabilities of the possible values of a continuous random variable. A continuous random variable is a random variable with a set of possible values that are infinite and uncountable. Continuous variables are often measurements on a scale, such as weight and temperature. Continuous probability functions are also known as probability density functions.

Let’s look at the types of Discrete Probability Distributions:

1. Binomial Distribution

A binomial distribution is frequently used to model the number of successes in a sample of size n drawn with replacement from a population of size N. Each trial has a Boolean-valued outcome: success/yes/true/one (with probability p) or failure/no/false/zero (with probability q = 1 − p). The following conditions must be satisfied for an experiment to be termed a binomial experiment:

i. There is a fixed number of trials, n.

ii. Each trial is independent.

iii. Only two outcomes are possible (success or failure).

iv. The probability of success (p) is constant across trials.

v. A random variable Y counts the number of successes.

Example: For a coin tossed n times, a binomial distribution can be used to model the probability of the number of successes (say, heads).

Code: To find the probability of getting 6 heads from 10 tosses of a coin, we use dbinom(x, size, prob).

  • x = vector of length k of integers in 0:size
  • size = the total number of trials.
  • prob = the probability of success in each trial. Infinite and missing values are not allowed.
dbinom(6, size = 10, prob = 0.5)   # 0.2050781

To plot the probability mass function for a binomial function in R:

dbinom(x, size, prob) is used to create the probability mass function.

plot(0:x, dbinom(0:x, size, prob), type = 'h') plots the probability mass function, with type = 'h' drawing histogram-like vertical lines.

Let’s say that for a coin tossed 10 times, the binomial distribution could be used to model the probability of the number of heads (0 to 10). Here is the probability mass function for a binomial distribution created with size=10 and p=0.5:

success <- 0:10
plot(success, dbinom(success, size = 10, prob = 0.5),
     type = 'h',
     main = 'Binomial Distribution (size=10, p=0.5)',
     ylab = 'Probability',
     xlab = '# Successes (heads)',
     lwd = 3)
Probability Mass Function for Binomial Distribution | Image by Author

Note:

  • dbinom = binomial probability mass function (pmf).
  • pbinom = binomial cumulative distribution function (CDF).
  • qbinom = binomial quantile function.
  • rbinom = binomial pseudorandom number generation.

Since this article focuses on probability mass functions, we will only use the pmf (the d-prefixed function, dbinom here) for each type of discrete probability distribution.
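As a quick illustration of how the four binomial functions differ (the specific numbers below are just for demonstration):

```r
# pmf: P(exactly 6 heads in 10 fair tosses)
dbinom(6, size = 10, prob = 0.5)      # 0.2050781

# CDF: P(at most 6 heads)
pbinom(6, size = 10, prob = 0.5)      # 0.828125

# quantile: smallest x with P(X <= x) >= 0.5
qbinom(0.5, size = 10, prob = 0.5)    # 5

# pseudorandom draws: 3 simulated counts of heads in 10 tosses
set.seed(1)  # for reproducibility
rbinom(3, size = 10, prob = 0.5)
```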

2. Multinomial Distribution

It is a generalization of the binomial distribution to k categories instead of just binary (success/fail). For n independent trials each of which leads to success for exactly one of k categories, the multinomial distribution gives the probability of any particular combination of numbers of successes for the various categories.

Example: A multinomial distribution models the probability of the counts of each side when rolling a k-sided die n times.

  • When k = 2 and n = 1, the multinomial distribution is the Bernoulli distribution.
  • When k = 2 and n > 1, it is the binomial distribution.
  • When k > 2 and n = 1, it is the categorical distribution.
  • When k > 2 and n > 1, it is the multinomial distribution.
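The k = 2, n > 1 special case can be verified directly in R: with two categories, dmultinom() reproduces dbinom(). The numbers below are arbitrary, chosen just to show the equivalence.

```r
# P(6 successes and 4 failures in 10 trials, p = 0.5), computed two ways
p_multi <- dmultinom(c(6, 4), prob = c(0.5, 0.5))
p_binom <- dbinom(6, size = 10, prob = 0.5)
all.equal(p_multi, p_binom)   # TRUE
```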

Code: We use dmultinom(x, size, prob, log = FALSE)

  • x = vector of length K of integers in 0:size
  • size = total number of trials. For dmultinom(), it defaults to sum(x).
  • prob = numeric vector of length K giving the probability of each category. Infinite and missing values are not allowed.
  • log = logical. If TRUE, log probabilities are returned.

Let’s say that two chess players A and B have the probability of winning a game as 0.40 and 0.35 respectively. The probability that the game would end in a draw is 0.25.

We can use multinomial distribution to answer: If these two chess players played 12 games, what is the probability that Player A would win 7 games, Player B would win 2 games, and the remaining 3 games would be drawn?

dmultinom(x = c(7, 2, 3), prob = c(0.40, 0.35, 0.25))   # ~0.0248

3. Bernoulli Distribution

It is a special case of the Binomial Distribution where only a single trial is performed. For n = 1 (one experiment), a binomial distribution can be termed as Bernoulli distribution. A single success/failure experiment is also called a Bernoulli trial or Bernoulli experiment and a sequence of outcomes is called a Bernoulli process.

Example: Consider a coin toss where the probability of getting the head is 0.5 and getting a tail is 0.5.

Code: We use dbern(x, prob). Note that dbern() is not part of base R; it is provided by add-on packages such as Rlab.

  • x = the outcome of the trial, 0 (failure) or 1 (success).
  • prob = the probability of success for the trial. Infinite and missing values are not allowed.

To find the probability of getting 1 head from 1 toss of a fair coin:

dbern(1, prob = 0.5)   # 0.5
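Since dbern() comes from an add-on package (such as Rlab), it is worth noting that the same value follows from base R's dbinom() with size = 1, because a Bernoulli trial is just a one-trial binomial:

```r
# Bernoulli pmf via base R: a binomial with a single trial
dbinom(1, size = 1, prob = 0.5)   # P(head) = 0.5
dbinom(0, size = 1, prob = 0.5)   # P(tail) = 0.5
```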

To plot the probability mass function for a Bernoulli function in R, we can use the following functions:

dbern(x, prob) is used to create the probability mass function.

plot(0:x, dbern(0:x, prob), type = 'h') plots the probability mass function, with type = 'h' drawing histogram-like vertical lines.

For a fair coin tossed once, the probability mass function for a Bernoulli distribution is:

success <- 0:1
plot(success, dbern(success, prob = 0.5),
     type = 'h',
     main = 'Bernoulli Distribution (size=1, p=0.5)',
     ylab = 'Probability',
     xlab = '# Successes (heads)',
     lwd = 3,
     ylim = c(0, 1))
Probability Mass Function for Bernoulli Distribution | Image by Author

4. Negative Binomial Distribution

It generalizes the binomial setting: the number of trials, n, is not fixed in advance, and a random variable Y equals the number of trials needed to obtain r successes. The negative binomial distribution is also known as the Pascal distribution.

Example: You are surveying people exiting from a polling booth and asking them if they voted independent. The probability (p) that a person voted independent is 20%. What is the probability that 70 people must be asked before you can find 5 people who voted independent?

Code: We use dnbinom(x, size, prob)

  • x = number of failures that occur before the required number of successes is reached.
  • size = the required number of successes.
  • prob = the probability of success in each trial. Infinite and missing values are not allowed.

To answer the above question:

dnbinom(65, size = 5, prob = 0.2)   # 65 failures before the 5th success
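As a sanity check on what dnbinom() computes, its pmf is choose(x + size - 1, x) * prob^size * (1 - prob)^x, which we can evaluate by hand for the polling example:

```r
# 65 failures before the 5th success, with p = 0.2
p_r    <- dnbinom(65, size = 5, prob = 0.2)
p_hand <- choose(69, 65) * 0.2^5 * 0.8^65
all.equal(p_r, p_hand)   # TRUE
```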

To plot the probability mass function for a negative binomial function in R, we can use the following functions:

dnbinom(x, size, prob) is used to create the probability mass function.

plot(0:x, dnbinom(0:x, size, prob), type = 'h') plots the probability mass function, with type = 'h' drawing histogram-like vertical lines.

Plotting the probability mass function for the aforementioned question:

success <- 0:65
plot(success, dnbinom(success, size = 5, prob = 0.2),
     type = 'h',
     main = 'Negative Binomial Distribution',
     ylab = 'Probability',
     xlab = '# Successes (people who voted independent)',
     lwd = 3)
Probability Mass Function for Negative Binomial Distribution | Image by Author

5. Poisson Distribution

It is used to model the number of independent events occurring within a given time interval. It shows how many times an event is likely to occur within a fixed interval of time if these events occur with a known average rate and independently of the time since the last event.

Example: Consider a customer help center where, on average, 10 customers call in an hour. A Poisson distribution can then model the probability of any given number of customers calling within an hour (say, 5, 6, or 7).

Code: We use dpois(x, lambda, log=FALSE)

  • x = vector of (non-negative integer) quantiles, i.e. event counts.
  • lambda = vector of (non-negative) means.
  • log = logical; if TRUE, probabilities p are given as log(p).

Let’s find the probability of 20 customers calling within an hour.

dpois(20, lambda=10)
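Related cumulative questions use ppois() instead of dpois(). Sticking with the same call-center rate of 10 calls per hour:

```r
# P(at most 20 calls in an hour): the CDF
ppois(20, lambda = 10)                       # ~0.998

# P(more than 20 calls in an hour): the upper tail
ppois(20, lambda = 10, lower.tail = FALSE)   # ~0.002
```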

To plot the probability mass function for a Poisson function in R, we can use the following functions:

dpois(x, lambda) is used to create the probability mass function.

plot(0:x, dpois(0:x, lambda), type = 'h') plots the probability mass function, with type = 'h' drawing histogram-like vertical lines.

Plotting the probability mass function for the aforementioned question:

success <- 0:20
plot(success, dpois(success, lambda = 10),
     type = 'h',
     main = 'Poisson Distribution (lambda=10)',
     ylab = 'Probability',
     xlab = '# Successes (number of customers who called)',
     lwd = 3)
Probability Mass Function for Poisson Distribution | Image by Author

6. Geometric Distribution

It is the probability distribution of the number of trials needed to get the first success in repeated independent Bernoulli trials. Equivalently, and this is the convention R's dgeom() uses, it can be defined over the number of failures before the first success.

The geometric distribution is an appropriate model if the following assumptions are true:

i. The phenomenon being modeled is a sequence of independent trials.

ii. There are only two possible outcomes for each trial, often designated success or failure.

iii. The probability of success, p, is the same for every trial.

Example: A researcher is waiting outside of a library to ask people if they support a certain law. The probability that a given person supports the law is p = 0.2. What is the probability that the fourth person the researcher talks to is the first person to support the law?

Code: We use dgeom(x, prob, log = FALSE)

  • x = number of failures that occur before the first success.
  • prob = the probability of success in each trial. Infinite and missing values are not allowed.
  • log = logical. If TRUE, log probabilities are returned.

To answer the aforementioned question:

dgeom(x=3, prob=0.2)
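The value agrees with the closed-form pmf (1 - p)^x * p, i.e. three non-supporters followed by one supporter:

```r
# 3 failures, then the first success, with p = 0.2
p_r    <- dgeom(3, prob = 0.2)
p_hand <- 0.8^3 * 0.2
all.equal(p_r, p_hand)   # TRUE; both equal 0.1024
```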

To plot the probability mass function for a geometric function in R, we can use the following functions:

dgeom(x, prob) is used to create the probability mass function.

plot(0:x, dgeom(0:x, prob), type = 'h') plots the probability mass function, with type = 'h' drawing histogram-like vertical lines.

Plotting the probability mass function:

success <- 0:3
plot(success, dgeom(success, prob = 0.2),
     type = 'h',
     main = 'Geometric Distribution',
     ylab = 'Probability',
     xlab = '# Successes (person supports the law)',
     lwd = 3)
Probability Mass Function for Geometric Distribution | Image by Author

7. Hypergeometric Distribution

It describes the number of successes in a sequence of k draws from a finite population without replacement, just as the binomial distribution describes the number of successes for draws with replacement.

Example: A deck of cards contains 20 cards: 6 red cards and 14 black cards. 5 cards are drawn randomly without replacement. What is the probability that exactly 4 red cards are drawn?

Code: We use dhyper(x, m, n, k, log = FALSE)

  • x = the number of successes drawn, an integer between max(0, k - n) and min(m, k).
  • m = number of items in the population that are classified as successes.
  • n = number of items in the population that are not classified as successes.
  • k = number of items drawn from the population, i.e. the sample size.
  • log = logical. If TRUE, log probabilities are returned.

To answer the aforementioned question:

dhyper(4,6,14,5)
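The value matches the combinatorial formula choose(m, x) * choose(n, k - x) / choose(m + n, k): pick 4 of the 6 red cards and 1 of the 14 black cards, out of all ways to draw 5 cards from 20:

```r
# P(exactly 4 red cards in 5 draws without replacement)
p_r    <- dhyper(4, 6, 14, 5)
p_hand <- choose(6, 4) * choose(14, 1) / choose(20, 5)
all.equal(p_r, p_hand)   # TRUE
```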

To plot the probability mass function for a hypergeometric function in R, we can use the following functions:

dhyper(x, m, n, k) is used to create the probability mass function.

plot(0:x, dhyper(0:x, m, n, k), type = 'h') plots the probability mass function, with type = 'h' drawing histogram-like vertical lines.

Plotting the probability mass function:

success <- 0:4
plot(success, dhyper(success, 6, 14, 5),
     type = 'h',
     main = 'Hypergeometric Distribution',
     ylab = 'Probability',
     xlab = '# Successes',
     lwd = 3)
Probability Mass Function for Hypergeometric Distribution | Image by Author

Hope you’re now familiar with the technical and theoretical breadth of Discrete Probability Distributions!

