Probability Is About Counting: Part 1

Elvis Lim
Sep 16, 2020

Introduction

During my data science learning journey, I found Bayesian analysis interesting because it can logically infer the plausible cause just by observing the outcome. That sounds abstract for now, so I will write about it in a future article.

This article is not about Bayesian analysis but about its building block, probability. While digging deep and relearning my high school and A-level material, I made new discoveries and gained a new understanding.

All credit goes to Statistical Rethinking by Richard McElreath.

Objective

  1. Do combinations and permutations easily by mapping the possible outcomes intuitively, since I prefer understanding over memorizing formulas.
  2. Estimate the plausible causes based on the observed outcomes.

I wrote some Python code for this, which you can find at https://github.com/Elvislim1991/Probability_Article.git. To demonstrate the code, I use a simple example: predicting the plausible proportion of colored balls by observing draws from an unknown urn.

However, the application isn’t limited to this. We can do the same for any independent events.

What is Probability?

Probability is useful because it lets us estimate how frequently an event happens while its underlying mechanism is still unknown.

Based on observations, we can then map all the possible causes and calculate how plausible each cause is.

All of this can be explained using the example below.

Consider a bag of 4 balls in 2 colors (green, blue) with unknown proportions. We draw from the bag 3 times, returning the ball to the bag after each draw, and observe 3 green balls.

In the picture above, each pie has 4 colored dots near the center. Each pie has 3 levels, and every dot branches into the same 4 colored dots at the next level. This illustrates everything that could happen in the scenario.

You can see thicker, bold lines in the pies wherever 3 green dots are connected. These are the lines that match our observation. Thus, the 4 starting dots in each pie represent one plausible combination of colored balls, whereas the connecting lines represent all the possible draws.

For this example, 4 green balls is the most plausible combination because we only observed green balls in the draws. By counting the matching lines and dividing by their total, we can work out how plausible each combination is.

This is why we say that probability is, in a deeper sense, about counting.
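The counting described above can be sketched in a few lines of Python. This is only an illustration and not the code from the repository: the names `count_ways`, `ways`, and `plausibility` are made up here, and dividing by the total assumes every bag composition was equally plausible before we drew anything.

```python
from itertools import product

BAG_SIZE = 4
observed = ("green", "green", "green")  # the 3 draws we saw

def count_ways(n_green, observed):
    """Count the draw paths (with replacement) that match the observation."""
    bag = ["green"] * n_green + ["blue"] * (BAG_SIZE - n_green)
    # product() walks every path through the pie: 4 x 4 x 4 = 64 paths,
    # treating each physical ball as a separate option.
    return sum(1 for path in product(bag, repeat=len(observed))
               if path == observed)

# One count per possible bag: 0, 1, 2, 3 or 4 green balls.
ways = {n: count_ways(n, observed) for n in range(BAG_SIZE + 1)}
total = sum(ways.values())

# Assumption: every bag was equally plausible before drawing.
plausibility = {n: w / total for n, w in ways.items()}

for n in ways:
    print(f"{n} green: {ways[n]:2d} ways, plausibility {plausibility[n]:.2f}")
```

Under that assumption, the bag with 4 green balls accounts for 64 of the 100 matching paths, which is why it comes out as the most plausible one.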

How to map all possibilities?

From the previous example, we can see that mapping all the possible outcomes is essential for calculating probability. In this section we are going to:

  • Apply combinations and permutations.
  • Determine the number of possible outcomes for each situation.

The equations for the different conditions use the following notation:

  • Number of options: n
  • Number of times we choose: r

Types of situations

Most events fall into 2 major categories based on whether the order (the sequence of outcomes) matters, as shown in the diagram above. Each category then splits into with or without replacement. These 4 conditions are explained in the sections that follow.
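As a quick preview, Python's standard `itertools` module happens to have one function for each of the 4 conditions. The sketch below uses a toy set of 3 letters and r = 2, which are not from the article, just to show the mapping.

```python
from itertools import (
    product,
    permutations,
    combinations,
    combinations_with_replacement,
)

options = "ABC"  # toy example: n = 3 options
r = 2            # we choose twice

cases = {
    "permutation with replacement": list(product(options, repeat=r)),
    "permutation without replacement": list(permutations(options, r)),
    "combination without replacement": list(combinations(options, r)),
    "combination with replacement": list(combinations_with_replacement(options, r)),
}

for name, outcomes in cases.items():
    print(f"{name}: {len(outcomes)} outcomes")
    print("  ", ["".join(o) for o in outcomes])
```

Listing the outcomes side by side makes it easy to see which ones disappear when order stops mattering or when repeats are not allowed.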

Permutation with replacement

This can be demonstrated by counting the possible 3-digit lottery numbers made from the digits 1, 2, and 3.

3-digit lottery

The lottery cares about the order of the digits: 123 is not the same as 213. Thus, it is a permutation, and it is with replacement because the same digit can appear more than once.

To count all the possible outcomes, start with the 1st digit, which has 3 possible integers (1, 2, 3). For each 1st digit, the 2nd digit has 3 possible integers again, which gives 9 pairs (3 x 3): [(1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3)]. The last digit multiplies this by 3 once more, for a total of 27 outcomes (3 x 3 x 3).

Therefore, you can find the number of outcomes by raising the number of options (n) to the power of the number of times we choose (r), that is, n^r.

Let’s use this to count the outcomes of the 4D lottery, which has 4 digits where each digit can be 0–9.

It will be 10 x 10 x 10 x 10 = 10,000 outcomes.

Thus, if you bet on one 4-digit number in the 4D lottery, your chance of winning is 1 in 10,000 (0.01%).
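Both lottery counts can be checked with a short sketch. It is a minimal illustration using `itertools.product`, and the variable names are made up for this example.

```python
from itertools import product

# 3-digit lottery using the digits 1, 2, 3: list every ordered outcome.
three_digit = ["".join(digits) for digits in product("123", repeat=3)]
print(three_digit[:5])        # the first few outcomes
print(len(three_digit))       # enumerated count
print(3 ** 3)                 # n ** r gives the same count

# 4D lottery: 10 options per digit, 4 digits.
four_d_outcomes = 10 ** 4
chance = 1 / four_d_outcomes
print(four_d_outcomes, f"{chance:.2%}")
```

Enumerating is only practical for small cases. For anything larger, n ** r gives the count without building the list.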

Permutation without replacement

Consider a situation where we need to choose a president, a secretary, and a treasurer out of 4 candidates (Harry, Don, Sam, and Sally).

Choose 3 positions from 4 candidates

The same candidate can’t occupy more than 1 position. Thus, 4 candidates can be considered for president, only 3 candidates are left for the secretary post, and so on.

Therefore, it is similar to permutation with replacement, except that the number of options shrinks by one at each step of the multiplication.

So, there are 4 x 3 = 12 outcomes for the president (4 candidates) and secretary (3 candidates) posts, and 4 (president) x 3 (secretary) x 2 (treasurer) = 24 outcomes for the 3 positions. Pretty intuitive compared to the equation below, right?

P(n, r) = n! / (n - r)!
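Here is a small sketch that lists every way to fill the 3 posts and compares the count with the factorial formula n! / (n - r)!. The `posts` tuple and the variable names are just for illustration.

```python
from itertools import permutations
from math import factorial

candidates = ["Harry", "Don", "Sam", "Sally"]
posts = ("president", "secretary", "treasurer")

# Each permutation of 3 candidates is one way to fill the 3 posts in order.
assignments = [dict(zip(posts, chosen))
               for chosen in permutations(candidates, len(posts))]

n, r = len(candidates), len(posts)
by_multiplying = 4 * 3 * 2                        # options shrink each step
by_formula = factorial(n) // factorial(n - r)     # n! / (n - r)!

print(assignments[0])
print(len(assignments), by_multiplying, by_formula)
```

All three numbers agree, so the formula is just a compact way of writing the shrinking multiplication.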

Combination without replacement

Now that we are done with permutations, we move on to combinations, where the order of the items doesn’t matter. The calculation is similar to permutation, but with an extra step to remove the repeated outcomes. Let’s make this clear with an example.

A small stall sells 6 types of fruit (Banana, Apple, Kiwi, Grape, Pear, Orange), and we buy 3 different fruits. How many combinations are there?

Pick 3 out of 6 fruits

The diagram above only illustrates 1 row, since the rest repeat the same pattern. For the 1st fruit, we pick banana out of 6 options, whereas the 2nd pick has only 5 options because the condition is without replacement. Up to here it is the same as permutation without replacement, so the total is 6 x 5 x 4 = 120 outcomes.

However, we need to account for repeats: the same 3 fruits (Banana, Pear, Apple) show up again whenever their sequence changes, as shown on the right side of the diagram above. To count how many orderings 3 items have, just refer to permutation without replacement: 3 x 2 x 1 = 6.

To get the combinations without replacement, we just take 120 / 6 = 20. This is exactly what the equation does:

C(n, r) = n! / (r! x (n - r)!)
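The "count the ordered picks, then divide out the repeats" idea can be checked with a short sketch. The variable names are made up for this example.

```python
from itertools import combinations, permutations
from math import factorial

fruits = ["Banana", "Apple", "Kiwi", "Grape", "Pear", "Orange"]
picks = 3

# Step 1: ordered picks without replacement (6 x 5 x 4).
ordered = list(permutations(fruits, picks))

# Step 2: every group of 3 fruits appears in 3 x 2 x 1 different orders.
repeats = factorial(picks)

# Collapse the orderings by ignoring sequence (a frozenset has no order).
unique_groups = {frozenset(pick) for pick in ordered}

# itertools can also list the unordered groups directly.
direct = list(combinations(fruits, picks))

print(len(ordered), repeats, len(ordered) // repeats)
print(len(unique_groups), len(direct))
```

Dividing 120 by 6, collapsing the orderings, and asking `itertools.combinations` directly all give the same answer.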

Combination with replacement

Finally, we come to the last type of condition, which is the most complicated one. The same stall sells 6 types of fruit (Banana, Apple, Kiwi, Grape, Pear, Orange). We buy 3 fruits, and the same fruit can be picked more than once. How many combinations are possible?

In this case, we need to focus on the circles and arrows at the top. A circle is a select button and an arrow is a next button, and we use these buttons to represent each purchase. For example, 3 circles followed by 5 arrows means we select the first fruit 3 times and then press next 5 times to move past the other 5 fruits, which completes the first combination.

Therefore, we just need to rearrange these buttons to get all the possible combinations. There are 8 buttons in total (3 circles and 5 arrows), so the number of permutations without replacement of the circles and arrows is 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1 = 40,320.

After that, we need to remove the repeats as before, because swapping two circles, or two arrows, does not change the purchase. The circles have 3 x 2 x 1 = 6 orderings, whereas the arrows have 5 x 4 x 3 x 2 x 1 = 120 orderings.

Together, the circles and arrows account for 120 x 6 = 720 repeats of each arrangement. Finally, just take 40,320 / 720 = 56 combinations.
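The circles-and-arrows trick can be sketched in Python as well. Here "o" stands for a circle (select) and ">" for an arrow (next). These symbols and the helper `layout_to_basket` are made up for this illustration.

```python
from itertools import combinations_with_replacement, permutations
from math import factorial

fruits = ["Banana", "Apple", "Kiwi", "Grape", "Pear", "Orange"]
picks = 3

# 3 circles ("o" = select) and 5 arrows (">" = move to the next fruit).
buttons = "o" * picks + ">" * (len(fruits) - 1)

# The arithmetic from the text: 8! orderings divided by the repeats.
by_formula = factorial(len(buttons)) // (
    factorial(picks) * factorial(len(fruits) - 1)
)

# Brute force: all 40,320 orderings, with duplicates collapsed by the set.
distinct_layouts = set(permutations(buttons))

def layout_to_basket(layout):
    """Read a button layout left to right and return the fruits selected."""
    basket, position = [], 0
    for button in layout:
        if button == "o":
            basket.append(fruits[position])
        else:
            position += 1
    return tuple(basket)

baskets = {layout_to_basket(layout) for layout in distinct_layouts}

# itertools can list the same baskets directly.
direct = list(combinations_with_replacement(fruits, picks))

print(layout_to_basket("ooo>>>>>"))
print(by_formula, len(distinct_layouts), len(baskets), len(direct))
```

Each distinct button layout turns into exactly one basket of fruits, which is why counting layouts is the same as counting combinations with replacement.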

It is a little bit messy, but it is super satisfying once you get it.

Calculating All Options

Let’s see a real-life application based on Apple’s official website. What are the possible options when purchasing the iPhone 11?

Applying what we learned: there are 6 colors and 3 storage sizes to pick from. So the number of unique color and storage combinations for the iPhone is 6 x 3 = 18.
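The same count can be listed out with `itertools.product`. The color names and storage sizes below are my assumption of the iPhone 11 lineup and are not taken from the article. Only the counts (6 and 3) come from the text.

```python
from itertools import product

# Assumed option names, for illustration only; the article gives just the counts.
colors = ["Black", "Green", "Yellow", "Purple", "Red", "White"]
storage = ["64GB", "128GB", "256GB"]

# One choice from each list, so the counts multiply: 6 x 3.
configurations = list(product(colors, storage))

print(len(configurations))
print(configurations[:3])
```

Unlike the lottery example, the two positions here draw from different lists, but the multiplication rule is the same.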

Conclusion

I will split this topic into 2 parts to keep it from getting too long. In the next part, I will show the code and focus on the main point of this article: how do we estimate and trace the cause of an effect?

If you have any suggestions, please reach out to me, as I’m still learning and taking this opportunity to practice.
