Published in Analytics Vidhya

# The Ultimate Guide to Hypothesis Testing for Beginners

We come across many statements in life whose truth cannot be verified on the spot. Have you ever wondered which of them to believe, or whether there is a way to find statistically acceptable answers about such statements? Of course there is!

How? Hypothesis Testing.

Hypothesis Testing is one of the most important concepts in Analytics, yet many of us do not have a clear idea of what it actually involves. Consider the following claims:

1. Average life of vegetarians is different than that of meat-eaters.

2. On a state driver’s test, about 40% pass the test on the first try.

3. Proportion of married people defaulting on loan repayment is less than the proportion of singles defaulting on loan repayment.

How do we statistically validate the above claims and verify their truth?

Hypothesis: A hypothesis is an assumption, claim, or proposition made about something, which is later tested statistically to verify its truth.

Hypothesis Testing: Hypothesis testing is the process of checking the validity of the claim using evidence found in sample data.

A hypothesis test consists of two contradictory statements, namely:

1. Null Hypothesis

2. Alternative Hypothesis

Null Hypothesis (H₀): In a null hypothesis, we claim that there is no relationship or difference between groups with respect to the value of the population parameter.

We begin the Hypothesis Test by assuming that the Null Hypothesis is true; later, we retain or reject the Null Hypothesis based on the evidence found in sample data.

Note: The null statement must always contain some form of equality (=, ≤ or ≥)

Alternative Hypothesis (H₁ or Hₐ): In the alternative hypothesis, we claim that there is some change or relationship between groups with respect to the value of the population parameter.

An alternative hypothesis always contradicts the Null Hypothesis, and only one of the two hypotheses can be true.

Note: An Alternative hypothesis is denoted using less than, greater than, or not equals symbols, i.e., (≠, >, or <).

Steps in Hypothesis Testing:

Now let’s see the steps involved in performing a Hypothesis Test.

1. Formulate the Null and Alternative Hypothesis

2. Decide the Significance Level (α)

3. Calculate the test statistic

4. Calculate the p-value

5. Decision to reject/retain the Null Hypothesis

1. Formulating the Null and Alternative Hypothesis: The first step in performing Hypothesis Testing is to describe the Null and Alternative Hypothesis in words. These hypotheses are described using a population parameter such as mean, median, or proportion.

For example, consider the hypotheses below.

1. Average life of vegetarians is different than that of meat-eaters.

Here,

H₀: The average lifespans of vegetarians and meat-eaters are the same

H₁: The average lifespans of vegetarians and meat-eaters are not the same

H₀: µᵥ= µₘ

H₁: µᵥ ≠µₘ

Where µᵥ and µₘ are the average life of vegetarians and meat-eaters respectively.

2. Proportion of married people defaulting on loan repayment is less than the proportion of singles defaulting on loan repayment.

H₀: The proportion of married people defaulting on loan repayment is not less than the proportion of singles defaulting.

H₁: The proportion of married people defaulting on loan repayment is less than the proportion of singles defaulting.

H₀: pₘ ≥ pₛ

H₁: pₘ < pₛ

where pₘ and pₛ are the proportion of married and single defaulters respectively.

After formulating the Null and Alternative hypothesis, based on the evidence from the sample data we retain/reject the Null Hypothesis.

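2. Decide the Significance Level (α): The Significance Level (α) is the threshold against which the p-value is compared when deciding whether to reject the Null Hypothesis. It is the maximum probability we are willing to accept of rejecting the Null Hypothesis when it is actually true (a Type I error).

Usually, α is set as 0.05. This means that there is a 5% chance we will reject the Null Hypothesis even when it is true. Here, if the p-value < 0.05, we reject the null hypothesis, accepting that risk of being wrong. Note that the p-value is not the probability that the null hypothesis is true.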
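To make the meaning of α concrete, here is a minimal simulation sketch. It assumes a standard normal population in which the null hypothesis really is true, runs many upper-tail z-tests on random samples, and counts how often the null is rejected anyway. The function name, sample size, trial count, and seed are illustrative choices, not part of the original discussion.

```python
import random
from math import sqrt
from statistics import NormalDist

def estimate_type_one_error(alpha=0.05, n=20, trials=10_000, seed=0):
    """Fraction of simulated tests that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        # Sample from a population where H0 holds exactly (mean 0, sd 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) / (1.0 / sqrt(n))
        p_value = 1 - std_normal.cdf(z)  # upper-tail p-value
        if p_value < alpha:
            rejections += 1
    return rejections / trials

rate = estimate_type_one_error()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The printed rate should land close to 0.05, which is what "a 5% chance of rejecting a true null hypothesis" means in practice.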

Critical Value and Critical Region: The critical value is the value of the test statistic in its sampling distribution beyond which the tail probability equals α.

The areas beyond the critical values are known as the critical region/rejection region and critical values are the values that indicate the edge of the critical region.

Note: Critical regions describe the entire area of values that rejects the null hypothesis. If the test statistic falls in the critical region, the null hypothesis will be rejected.
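As a rough illustration, the critical value can be looked up from the inverse CDF of the sampling distribution. The sketch below assumes the test statistic follows a standard normal distribution under the null hypothesis and that the rejection region is in the upper tail; the helper name `in_upper_critical_region` is made up for this example.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # assumption: statistic is standard normal under H0

# Value that leaves probability alpha in the upper tail beyond it.
upper_critical = std_normal.inv_cdf(1 - alpha)

def in_upper_critical_region(test_statistic, critical_value=upper_critical):
    """Return True if the statistic falls in the upper-tail rejection region."""
    return test_statistic > critical_value

print(round(upper_critical, 3))       # 1.645
print(in_upper_critical_region(2.0))  # True
print(in_upper_critical_region(1.0))  # False
```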

One-Tailed Test: When the critical/rejection region is on one side of the distribution it is known as a One-Tailed Test. In this case, the null hypothesis will be rejected if the test statistic is on one side of the distribution, either left or right.

Left-Tailed Test: If the test is left-tailed, the critical/rejection region, with an area equal to α, will be on the left side of the distribution curve. In this case, the null hypothesis will be rejected if the test statistic is very small (as it will fall on the left end of the distribution).

Here, p-value = P[Test statistic ≤ observed value of the test statistic]

E.g.: College students take less than five years to graduate from college, on average.

H₀: μ ≥ 5

H₁: μ < 5

This is a left-tailed test since the alternative hypothesis points to values less than 5, so the rejection region lies in the left tail of the sampling distribution.

Right-Tailed Test: If the test is right-tailed, the critical/rejection region, with an area equal to α, will be on the right side of the distribution curve. In this case, the null hypothesis will be rejected if the test statistic is very large (or falls on the right end of the distribution).

Here, p-value = P[Test statistic ≥ observed value of the test statistic]

E.g.: A package of gum claims that the flavor lasts more than 39 minutes.

H₀: μ ≤ 39

H₁: μ > 39

This is a right-tailed test since the alternative hypothesis points to values greater than 39, so the rejection region lies in the right tail of the sampling distribution.

Two-Tailed Test:

If the test is two-tailed, α must be divided by 2, and the critical/rejection regions, each with an area of α/2, will be at both ends of the distribution curve. Hence, in this case, the null hypothesis will be rejected when the test statistic falls in either of the two rejection regions.

For a two-tailed test,

p-value = 2 * P[Test statistic ≥ |observed value of the test statistic|]
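The three p-value formulas above can be sketched in one small function. This assumes the test statistic is a z statistic that is standard normal under the null hypothesis; the function name `p_value` and the `tail` labels are choices made for this illustration.

```python
from statistics import NormalDist

def p_value(z, tail):
    """p-value for an observed z statistic; tail is 'left', 'right' or 'two'."""
    cdf = NormalDist().cdf
    if tail == "left":
        return cdf(z)                    # P[Z <= z]
    if tail == "right":
        return 1 - cdf(z)                # P[Z >= z]
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))     # 2 * P[Z >= |z|]
    raise ValueError("tail must be 'left', 'right' or 'two'")

print(round(p_value(1.96, "two"), 3))   # 0.05
print(p_value(0.0, "left"))             # 0.5
```

Note how the same observed statistic gives a two-tailed p-value that is twice the one-tailed value, which is why the choice of tail has to be made before looking at the data.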

3. Calculate the Test Statistic:

A test statistic is nothing but the standardized difference between the value of the parameter estimated from the sample (such as sample mean) and the value of the null hypothesis (such as hypothesized population mean). It is a measure of how far the sample mean is from the hypothesized population mean.

test statistic = (x̄ − μ) / (σ / √n)

where, x̄ = sample mean

μ = hypothesized population mean

σ = standard deviation of the population

n = number of observations
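The formula translates directly into code. The sketch below is a minimal version with a hypothetical function name, and the numbers in the usage line are made up purely to show the arithmetic.

```python
from math import sqrt

def z_statistic(sample_mean, hypothesized_mean, population_sd, n):
    """Standardized distance between the sample mean and the hypothesized mean."""
    standard_error = population_sd / sqrt(n)  # sigma / sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Illustrative values only: x̄ = 52, μ = 50, σ = 10, n = 25.
print(z_statistic(52, 50, 10, 25))  # 1.0
```

Here the standard error is 10 / √25 = 2, so a sample mean 2 units above the hypothesized mean is exactly one standard error away.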

If the test statistic falls in the critical region, we reject the null hypothesis. Also, the larger the absolute value of the test statistic, the larger the standardized difference between the hypothesized mean and the sample mean; the p-value will therefore be smaller, giving stronger evidence against the null hypothesis.

There are several ways to calculate the test statistic. Two commonly used tests are the Z-test and the T-test. Although their test statistic calculations are similar, the choice between them depends on what is known about the population and on the sample size.

T-Test: A T-test, which estimates the standard deviation from the sample, can be used when

- The population distribution is normal, or

- The distribution of the sample data is symmetric and the sample size is ≤ 15, or

- The distribution of the sample data is moderately skewed and the sample size is 16 ≤ n ≤ 30, or

- The sample size is greater than 30, without outliers.

Z-Test: A Z-test can be used when

- The population is normally distributed and σ is known, or

- The sample size n ≥ 30
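For comparison with the z statistic, here is a hedged sketch of a one-sample t statistic, where the sample standard deviation s takes the place of σ. The function name and the sample values are hypothetical and exist only to show the calculation. Turning t into a p-value needs the t distribution with n − 1 degrees of freedom, which is outside the Python standard library, so this sketch stops at the statistic.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, hypothesized_mean):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    t = (mean(sample) - hypothesized_mean) / (s / sqrt(n))
    return t, n - 1

# Hypothetical measurements, invented for illustration only.
sample = [146, 150, 143, 149, 147, 151, 144, 148]
t, df = t_statistic(sample, 145)
print(f"t = {t:.2f} with {df} degrees of freedom")
```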

Now consider the below example,

The average height of a 7th grader five years ago was 145 cm with a standard deviation of 20 cm. From a random sample of 200 students, the average height of the sample students is found to be 147 cm. Are 7th graders now taller than they were before?

Here,

H₀: µ ≤ 145

Hₐ: µ > 145

Clearly, this is a right-tailed test since the alternative hypothesis (µ > 145) points to larger values, so the rejection region lies on the right end of the sampling distribution.

sample mean, x̄ = 147

population mean, μ =145

Standard Deviation of Population, σ =20

Number of Observation, n =200

test statistic = (x̄ − μ) / (σ / √n) = (147 − 145) / (20 / √200) = 1.414

The above test statistic means that the sample mean is 1.414 standard errors above the hypothesized population mean, i.e., the standardized difference between the sample mean and the hypothesized population mean is 1.414.

For α = 0.05, the critical value for a right-tailed test is 1.64. Hence, if the test statistic is greater than 1.64, it will be in the rejection region.

Clearly, our test statistic of 1.414 is less than the critical value of 1.64 and hence does not fall in the rejection region. This means that the sample mean is not significantly greater than the hypothesized population mean, and hence we fail to reject the Null Hypothesis.
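The critical-value decision for this example can be reproduced in a few lines. This is a sketch using the numbers from the example; the variable names are arbitrary, and the critical value is taken from the standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu_0, sigma, n = 147, 145, 20, 200  # values from the example above
alpha = 0.05

z = (x_bar - mu_0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha)   # right-tailed critical value
reject_null = z > z_crit

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject_null}")
# z = 1.414, critical value = 1.645, reject H0: False
```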

4. Calculate the p-value: A p-value is the conditional probability of getting a test statistic at least as extreme as the one observed, given that the null hypothesis is true.

p-value = P(observing a test statistic at least as extreme as the one obtained | Null hypothesis is true)

If p-value ≤ significance level (α), we reject the null hypothesis in favor of the alternative hypothesis.

If p-value > significance level (α), we fail to reject the null hypothesis; the sample does not provide enough evidence to support the alternative hypothesis.

For the above example, the p-value can be stated as the probability of observing a test statistic of 1.414 or greater when the null hypothesis is true, evaluated at the boundary value µ = 145.

i.e., P(z ≥ 1.414 | µ = 145) = 1 − P(z ≤ 1.414) ≈ 0.0787.

Clearly, the p-value of 0.0787 is greater than 0.05 (α). Hence, we fail to reject the Null Hypothesis.
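The same conclusion can be checked numerically. The sketch below assumes a right-tailed z-test and uses the standard normal CDF from Python's standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
z = (147 - 145) / (20 / sqrt(200))       # test statistic from the example
p_value = 1 - NormalDist().cdf(z)        # right-tailed: P[Z >= z]

print(f"p-value = {p_value:.3f}")        # p-value = 0.079
print("reject H0" if p_value <= alpha else "fail to reject H0")
# fail to reject H0
```

The critical-value approach and the p-value approach always agree: a test statistic below the critical value corresponds to a p-value above α.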

So, to summarize, the process of Hypothesis Testing starts with formulating the Null and Alternative Hypothesis and setting the significance level (α), followed by calculating the test statistic and the p-value, based on which we reject or fail to reject the Null Hypothesis.