Hypothesis Testing 101: A Beginner’s Guide to Statistical Testing (Part 1) Statistics Lecture-05

May 23, 2024

In this two-part lecture, I’ll cover everything you need to know about hypothesis testing. Before diving in, let’s quickly review probability theory, as it’s crucial for understanding hypothesis testing. This lecture is designed for those who already have a grasp on distributions, measures of central tendency, and measures of dispersion. If you need a refresher on these topics, be sure to check out my previous statistics lectures (1, 2, 3 and 4). Let’s get started!

Topics Covered:

Probability
* Addition rule
* Multiplication rule
Permutations and Combinations
Hypothesis testing
P-value
Confidence Intervals
Significance value
Combining it All Together

Remember that hypothesis testing, the p-value, the confidence interval and the significance value are all interconnected, so reading about all four of them is important to get a clearer picture.

If you don’t want to read all of this, or just need a quick refresher, skip to the 7th topic of this blog.

1. Probability: As we all know, probability is a measure of how likely an event is to happen. It ranges from 0 (impossible event) to 1 (certain event). For example, if you flip a fair coin, the probability of getting a head is 0.5 because there are two possible outcomes (heads or tails) and both are equally likely.
Now, there are some fundamental rules in probability theory. They are explained below, along with how they help us reason about combined events.

i) Addition rule: The addition rule is used to find the probability that either of two (or more) events will occur. In its simplest form it applies to mutually exclusive events (events that cannot happen at the same time).

Formula: P (A or B) = P(A) + P(B)

If the events are not mutually exclusive (they can happen at the same time), we need to subtract the probability of both events happening together, so that it is not counted twice.

Formula: P (A or B) = P(A) + P(B) - P (A and B)

Example:

Suppose you have a deck of 52 cards. The probability of drawing an ace, P(A), is 4/52 = 1/13.
The probability of drawing a king, P(B), is also 4/52 = 1/13.

Since these are mutually exclusive events (a single card can’t be both an ace and a king):

P (A or B) = 1/13 + 1/13 = 2/13
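The addition rule is easy to check with exact fractions. The sketch below redoes the ace-or-king calculation and, as an extra illustration that is not part of the example above, applies the general formula to "ace or heart", two events that overlap in the ace of hearts.

```python
from fractions import Fraction

DECK_SIZE = 52

# Mutually exclusive events: a single card cannot be both an ace and a king.
p_ace = Fraction(4, DECK_SIZE)
p_king = Fraction(4, DECK_SIZE)
p_ace_or_king = p_ace + p_king

# Overlapping events (illustrative assumption, not from the text above):
# "ace" and "heart" share exactly one card, the ace of hearts.
p_heart = Fraction(13, DECK_SIZE)
p_ace_and_heart = Fraction(1, DECK_SIZE)
p_ace_or_heart = p_ace + p_heart - p_ace_and_heart

print(p_ace_or_king)   # 2/13
print(p_ace_or_heart)  # 4/13
```

Using `Fraction` instead of floats keeps the results in the same form as the hand calculation.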

ii) Multiplication rule: The multiplication rule is used to find the probability that two independent events will both occur.

Formula: P (A and B) = P(A) × P(B)

If the events are dependent (that is, one event affects the probability of the other), the formula adjusts to:

Formula: P (A and B) = P(A) × P(B|A)

Where P(B|A) is the probability of event B occurring given that event A has already occurred.

Example:

Suppose you roll a fair six-sided die. The probability of getting a 4, P(A), is 1/6.
If you roll a second die, the probability of getting a 4, P(B), is also 1/6.

Since these are independent events:

P (A and B) = 1/6 × 1/6 = 1/36
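Here is a small sketch of both versions of the multiplication rule. The two-dice numbers come from the example above. The dependent case (drawing two aces in a row without replacement) is an added illustration, not something from the example.

```python
from fractions import Fraction

# Independent events: the first die does not influence the second.
p_four = Fraction(1, 6)
p_two_fours = p_four * p_four

# Dependent events (illustrative assumption): drawing two aces in a row
# from a 52-card deck without putting the first card back.
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)  # P(B|A): 3 aces left among 51 cards
p_two_aces = p_first_ace * p_second_ace_given_first

print(p_two_fours)  # 1/36
print(p_two_aces)   # 1/221
```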

2. Permutations and Combinations:

Permutations: Permutations are all about arranging things in a specific order. The order matters!

Imagine you have 3 differently colored marbles, red (R), blue (B) and green (G), and you want to know how many different ways you can line them up.

Example:

If you line them up as R, B, G, that’s one arrangement; if you line them up as G, B, R, that’s another arrangement.

So, let’s find out all the possible arrangements.

R, B, G

G, B, R

B, R, G

B, G, R

G, R, B

R, G, B

There are 6 different ways to arrange these marbles.

Formula: The number of permutations of n items is given by n! (n factorial), which means multiplying all whole numbers from n down to 1.

For our 3 marbles:

3! = 3 x 2 x 1 = 6
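To double-check the count, Python’s standard library can list every arrangement. This is only a sketch for the three-marble example; the variable names are my own.

```python
from itertools import permutations
from math import factorial

marbles = ["R", "B", "G"]

# Every ordering of the three marbles; order matters here.
arrangements = list(permutations(marbles))
for arrangement in arrangements:
    print(", ".join(arrangement))

print(len(arrangements), factorial(len(marbles)))  # 6 6
```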

Combinations: This is about selecting items where the order doesn’t matter.

Imagine you have the same 3 marbles, and you want to choose 2 of them. Remember, here the order doesn’t matter, which means R and B is the same as B and R.

So, the possible combinations of selecting 2 marbles out of 3 are:

R, B

G, B

G, R

There are three different ways to select 2 marbles out of 3 without worrying about their order.

Formula: The number of combinations of n items taken k at a time is given by n! / (k! × (n-k)!)

For choosing 2 marbles out of 3:

3! / (2! × (3-2)!) = 6 / (2 × 1) = 3
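The same check works for combinations. In this sketch, `itertools.combinations` lists the unordered pairs and `math.comb` evaluates the formula directly.

```python
from itertools import combinations
from math import comb

marbles = ["R", "B", "G"]

# Unordered selections of 2 marbles; ("R", "B") and ("B", "R") count once.
pairs = list(combinations(marbles, 2))

print(pairs)       # [('R', 'B'), ('R', 'G'), ('B', 'G')]
print(comb(3, 2))  # 3
```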

3. Hypothesis testing: Hypothesis testing is a method used in statistics to decide whether sample data provide enough evidence to reject the null hypothesis about a population.

Example: Testing a new cereal

Imagine you have a new cereal, and you want to know whether it is better than the old one. You decide to run a taste test.

Steps in hypothesis testing:

i) State the Hypothesis:

Null Hypothesis (H0): This is the default assumption. In this case, the kids like the new cereal just as much as the old cereal.
Alternative Hypothesis (H1): This is what you want to find evidence for. In this case, the kids like the new cereal more than the old cereal.

ii) Collect Data:

You ask 20 friends to taste both cereals and tell you which one they prefer. Suppose 15 out of 20 friends say they like the new cereal more.

iii) Choose a significance value:

The significance value is a threshold to decide whether to reject the null hypothesis. A common choice is 0.05 (5%).

iv) Calculate Test Statistic:

This involves some math but let's simplify it. You compare the number of friends who prefer new cereal to what you would expect if the null hypothesis were true.

v) Determine the P-value:

The p-value tells you how likely you would be to get your results (or more extreme ones) if the null hypothesis were true. If the p-value is low, your results would be unusual under the null hypothesis.

vi) Make a Decision:

If the p-value ≤ significance value (α): Reject the null hypothesis. This means you have statistically significant evidence that kids like the new cereal more.
If the p-value > significance value (α): Fail to reject the null hypothesis. This means you don’t have enough evidence that kids like the new cereal more.
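The six steps can be sketched in a few lines of Python. The steps above do not name a specific test, so this sketch assumes an exact one-sided binomial test, chosen here purely for illustration: under H0 each friend is equally likely to prefer either cereal, and the p-value is the chance of seeing 15 or more "new cereal" answers out of 20.

```python
from math import comb

n_friends = 20   # sample size from the example
prefer_new = 15  # friends who preferred the new cereal
p_null = 0.5     # H0: each friend is equally likely to prefer either cereal
alpha = 0.05     # significance value


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Test statistic: observed count compared with what H0 would predict.
expected_under_null = n_friends * p_null

# P-value: chance of a result at least this extreme if H0 is true.
p_value = binomial_upper_tail(prefer_new, n_friends, p_null)

decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"expected under H0: {expected_under_null:.0f}, observed: {prefer_new}")
print(f"p-value = {p_value:.4f} -> {decision}")  # p-value = 0.0207 -> reject H0
```

With these numbers the p-value is about 0.02, which is below 0.05, so under this assumed test the null hypothesis would be rejected.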

4. P-value: A p-value is a number that helps us determine whether the results of an experiment or study are statistically significant. It tells us how likely we would be to get the observed results, or more extreme ones, if the null hypothesis were true.

Example: Coin Toss

Imagine you have a regular coin, and you suspect that it is not fair. You think it might be landing on heads more often than tails. To test this, you decide to flip the coin 100 times.

Steps to understand the p-value:

Null hypothesis (H0): This is the default assumption. For the coin, it means there is a 50% chance of landing on heads.
Alternative hypothesis (H1): This is what you are testing for. In this case, the coin is biased, meaning it lands on heads more than 50% of the time.
Conduct the experiment: You flip the coin 100 times and count the number of times it lands on heads. Suppose you get 60 heads in 100 flips.
Calculate the p-value: The p-value tells you how likely it is to get 60 or more heads in 100 flips if the coin is actually fair, as the null hypothesis assumes.

What does the p-value tell us:

Small p-value (typically ≤ 0.05): If the p-value is small, getting 60 or more heads out of 100 would be very unlikely if the coin were fair. You might reject the null hypothesis and conclude that the coin is probably biased.
Large p-value (> 0.05): If the p-value is large, getting 60 heads out of 100 flips could easily happen by chance with a fair coin. You don’t have enough evidence to reject the null hypothesis, which is not the same as proving that the coin is fair.

Note: The threshold, 0.05 here, is chosen in advance, usually with input from a domain expert.
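One way to build intuition for the coin example is to simulate it. The sketch below repeats the 100-flip experiment many times with a fair coin and counts how often 60 or more heads appear. The number of repetitions and the random seed are arbitrary choices of mine, and the exact binomial value is computed alongside for comparison.

```python
import random
from math import comb

N_FLIPS = 100
OBSERVED_HEADS = 60
N_EXPERIMENTS = 20_000  # arbitrary number of simulated experiments

rng = random.Random(42)  # fixed seed so the run is repeatable


def heads_in_fair_flips(n):
    """Count heads in n fair flips: each of n random bits is one flip."""
    return bin(rng.getrandbits(n)).count("1")


# Fraction of fair-coin experiments that look at least as extreme as ours.
at_least_as_extreme = sum(
    heads_in_fair_flips(N_FLIPS) >= OBSERVED_HEADS for _ in range(N_EXPERIMENTS)
)
simulated_p = at_least_as_extreme / N_EXPERIMENTS

# Exact value for comparison: P(X >= 60) when X ~ Binomial(100, 0.5).
exact_p = sum(comb(N_FLIPS, k) for k in range(OBSERVED_HEADS, N_FLIPS + 1)) / 2**N_FLIPS

print(f"simulated p-value: {simulated_p:.4f}")
print(f"exact p-value:     {exact_p:.4f}")  # 0.0284
```

The exact one-sided p-value is about 0.028, and the simulated value should land close to it.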

Importance of p-value:

Decision making: The p-value helps you decide whether your results are significant. The lower the p-value, the stronger the evidence against the null hypothesis.
Threshold: A common threshold for the p-value is 0.05. If the p-value is at or below it, you consider the results statistically significant.

5. Confidence Interval: A confidence interval is a range of values that is likely to contain the true value of a population parameter. It gives us an idea of the precision of our estimate based on sample data.

Example: Estimating heights of trees

Imagine you want to estimate the average height of the trees in a forest. You can’t measure every tree, so you take a sample of trees from the forest and calculate the average height of the trees in that sample.

Steps to understand confidence interval

i) Collect Data

You measure the heights of 50 randomly selected trees from the forest and calculate the average, which comes to about 20 feet, with a standard deviation of 2 feet.

ii) Calculate Confidence Interval

You want to know how confident you can be that your estimate of 20 feet is close to the true average height of all the trees in the forest.
You decide to calculate a 95% confidence interval. Loosely speaking, you can be 95% confident that the true average height lies within this interval: if you repeated the sampling many times, about 95% of the intervals built this way would contain the true average.

iii) Determine the Margin of Error

The margin of error depends on the sample size and the variability of the data. You use a formula to calculate it from the sample standard deviation, the sample size and the desired confidence level.

iv) Construct Confidence Interval

Using the sample mean (20 feet) and the margin of error, you construct the confidence interval. With these numbers the margin of error is about 0.55 feet (1.96 × 2 / √50), so the 95% confidence interval runs from about 19.45 to 20.55 feet.
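As a rough sketch, the calculation can be written out as follows. It plugs the sample figures from step i) into the usual normal-approximation formula, margin of error = z × s / √n. Using z rather than a t critical value is an assumption I am making for simplicity.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 20.0  # feet
sample_sd = 2.0     # feet
n = 50
confidence = 0.95

# z is about 1.96 for 95% confidence (normal approximation; a t critical
# value would give a slightly wider interval for a sample of this size).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin_of_error = z * sample_sd / sqrt(n)
lower = sample_mean - margin_of_error
upper = sample_mean + margin_of_error

print(f"margin of error: {margin_of_error:.2f} feet")  # 0.55
print(f"95% CI: {lower:.2f} to {upper:.2f} feet")      # 19.45 to 20.55
```

A larger sample or a smaller standard deviation would shrink the margin of error and give a narrower interval.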

Why is the confidence interval important?

Precision: It tells you how precise your estimate is. A narrower interval means a more precise estimate.
Interpretability: It gives a range of values instead of just one single point estimate, providing more information about uncertainty in your estimate.

6. Significance Value: The significance value, also known as the significance level alpha (α), is a threshold used in hypothesis testing to determine whether to reject the null hypothesis. It represents the probability of rejecting the null hypothesis when it is actually true.

Example: Testing a new plant fertilizer

Imagine you want to determine whether a new plant fertilizer helps plants grow taller than the old fertilizer. You perform a test on two groups of plants. One group gets the new fertilizer, and the other gets the old fertilizer.

Steps to understand the significance value

i) State the Hypothesis:

Null Hypothesis (H0): The new fertilizer does not affect the growth of the plant compared to the old fertilizer.
Alternative Hypothesis (H1): The new fertilizer helps plants grow taller than the old fertilizer.

ii) Collect Data:

You measure the heights of the plants in both groups after a certain period.

iii) Choose a Significance Level (α):

Common choices for alpha are 0.05 (5%), 0.01 (1%) and 0.10 (10%). Let’s use 5% for this example.

iv) Perform the test and calculate the P-value:

You perform a statistical test that compares the heights of the two groups of plants and gives you a p-value. The p-value tells you how likely it is to get your observed results, or more extreme ones, if the null hypothesis is true.

v) Compare P-value to Significance Level(α):

If p-value ≤ α: Reject the null hypothesis. This means there is statistically significant evidence that the new fertilizer helps plants grow taller.
If p-value > α: You fail to reject the null hypothesis. This means you don’t have enough evidence to conclude that the new fertilizer helps plants grow taller.
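Here is one possible sketch of steps i) to v). The plant heights are invented for illustration, and the test is an exact permutation test, which I picked because it needs nothing beyond the standard library; the example above does not say which test to use. The idea: if the fertilizer made no difference, the group labels would be interchangeable, so we check how many ways of splitting the 12 plants into two groups give a "new fertilizer" group at least as tall as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical heights in cm, made up for this sketch.
old_fertilizer = [20, 21, 19, 22, 20, 21]
new_fertilizer = [24, 25, 23, 26, 24, 25]
alpha = 0.05

observed_diff = mean(new_fertilizer) - mean(old_fertilizer)
observed_sum = sum(new_fertilizer)

pooled = new_fertilizer + old_fertilizer
group_size = len(new_fertilizer)

# Because the pooled total is fixed, comparing group sums is equivalent
# to comparing differences in means.
n_splits = 0
count_extreme = 0
for indices in combinations(range(len(pooled)), group_size):
    n_splits += 1
    if sum(pooled[i] for i in indices) >= observed_sum:
        count_extreme += 1

p_value = count_extreme / n_splits
decision = "reject H0" if p_value <= alpha else "fail to reject H0"

print(f"observed difference in means: {observed_diff:.1f} cm")
print(f"p-value = {count_extreme}/{n_splits} = {p_value:.4f} -> {decision}")
```

In this made-up data every plant in the new-fertilizer group is taller than every plant in the old-fertilizer group, so only 1 of the 924 possible splits is as extreme as the observed one.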

Why is the significance level important?

Decision Making: It helps you decide whether your results are statistically significant.
Control of Error: By setting the significance level, you control the probability of making a Type I error, which means rejecting the null hypothesis when it is actually true.
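To see what "controlling the Type I error" means in numbers, the sketch below reuses the fair-coin setup from the p-value section (100 flips, one-sided test, α = 0.05). It finds the smallest head count that would lead to rejection and then computes how often a truly fair coin would reach that count. The helper name `upper_tail` is my own.

```python
from math import comb

n_flips = 100
alpha = 0.05


def upper_tail(k, n):
    """P(X >= k) for a fair coin, X ~ Binomial(n, 0.5)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n


# Smallest number of heads whose one-sided p-value is at or below alpha.
critical_heads = next(
    k for k in range(n_flips + 1) if upper_tail(k, n_flips) <= alpha
)

# If the coin really is fair, this is how often the test rejects H0 anyway.
type_i_error_rate = upper_tail(critical_heads, n_flips)

print(critical_heads)              # 59
print(f"{type_i_error_rate:.4f}")  # 0.0443
```

The actual Type I error rate (about 4.4%) stays at or below α. It is not exactly 5% because the number of heads can only take whole values.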

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

I know… all of this can be confusing. I deliberately used different examples to explain hypothesis testing, the p-value, the confidence interval and the significance value, to make sure you get a good understanding of each. But in reality, we use all of them together to solve a single problem; no one of them is meant to be used on its own.

How are all these techniques related? Let’s break down how these concepts are interconnected and support each other in statistical analysis.

Example: Using a New Teaching Method

Hypothesis testing: A method to determine whether there is enough evidence to support a specific hypothesis about a population parameter, based on sample data.

i) State the Hypothesis

Null Hypothesis (H0): The new teaching method is not more effective than the traditional method.
Alternative Hypothesis (H1): The new teaching method is more effective than the traditional method.

ii) Collect Data

The teacher teaches one group with the new method and another group with the traditional method, then tests both groups and collects the scores.

Confidence Interval: A range of values that is likely to contain the true population parameter.

i) Calculate the Confidence Interval

The teacher calculates the average score of each group and finds the difference.
She then calculates a 95% confidence interval for the difference in scores.

Interconnection

If the confidence interval does not include zero: It suggests that there is a significant difference between the teaching methods, implying that the new method might be more effective.
If the confidence interval includes zero: It suggests that there is no significant difference, so the data do not show that the new method is more effective.
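The zero check can be sketched like this. The scores are hypothetical, invented only so there is something to compute, and the interval uses a normal approximation with z ≈ 1.96 (for groups this small a t-based interval would be somewhat wider).

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# Hypothetical test scores, invented for illustration only.
traditional = [70, 72, 68, 75, 71, 69, 74, 73]
new_method = [78, 80, 76, 83, 79, 77, 82, 81]

# Difference in average scores (new minus traditional).
diff = mean(new_method) - mean(traditional)

# Standard error of the difference between two independent sample means.
std_error = sqrt(
    variance(new_method) / len(new_method)
    + variance(traditional) / len(traditional)
)

z = NormalDist().inv_cdf(0.975)  # about 1.96; normal approximation
lower = diff - z * std_error
upper = diff + z * std_error

includes_zero = lower <= 0 <= upper
print(f"difference in means: {diff:.1f}")
print(f"95% CI: {lower:.2f} to {upper:.2f}, includes zero: {includes_zero}")
```

For this made-up data the whole interval lies above zero, which is the pattern that would point toward the new method scoring higher.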

P-Value: The probability of obtaining results at least as extreme as the observed results, assuming the null hypothesis is true.

Based on the data, the teacher performs a statistical test (such as a t-test) to calculate the p-value.

Interconnection

If the p-value is low (≤ α): The observed data would be unlikely under the null hypothesis, leading to its rejection.
If the p-value is high (> α): The observed data are consistent with the null hypothesis, so there is not enough evidence to reject it.

Significance Value (α): A threshold to decide whether to reject the null hypothesis.

Interconnection

The significance value is the cut-off point to interpret the p-value.
If p-value ≤ α: Reject the null hypothesis in favor of the alternative hypothesis.
If p-value > α: Do not reject the null hypothesis.

Putting it all together:

Hypothesis testing sets the stage by framing a question and setting up the hypotheses.
The confidence interval provides a range where the true effect size is likely to lie and gives a visual and numerical way of understanding the data.
P-value helps us make the decision by quantifying the strength of evidence against the null hypothesis.
Significance value is the threshold to make a decision.

Summary:

These concepts are interrelated components of statistical analysis.

Hypothesis testing provides a framework.
Confidence interval provides a range that helps to understand the estimate’s precision and potential significance.
P-Value quantifies the evidence against the null hypothesis.
Significance value is a criterion to make the final decision.

Example Recap: Testing a New Teaching Method

Hypothesis testing sets up the question of whether the new teaching method is better.
The confidence interval gives a range for the difference in scores.
The p-value measures how unusual the observed difference is under the null hypothesis.
The significance value of 0.05 is the cut-off for deciding whether to reject the null hypothesis.

That’s it, guys. In the next part of this blog we shall discuss more about statistical tests: how to perform hypothesis testing, how to calculate a confidence interval, how to calculate a p-value and how to set the threshold for the significance value.

Why is reading better than watching YouTube videos?

Reading lets your imagination run wild. You can create pictures in your mind, and that’s like having a personal movie in your head.
When you are reading, you are focused on the words, but while watching videos you can be distracted by flashy visuals and ads.
Reading encourages you to think. You pause, reflect and understand things deeply. It’s like having a conversation with the author.
Reading is a patient game. It’s not a race. You learn to enjoy the journey, and that helps you in many areas of life.

Written by Anju Reddy K