Hypothesis Testing

Nikhila Sindhe
Published in AlmaBetter
5 min read · Jul 5, 2021
Photo by Carlos Muza on Unsplash

Introduction

In plain English, a Hypothesis is a supposition or a proposed explanation made on the basis of limited evidence as a starting point for further investigation.
Before we delve into the topic of Hypothesis Testing, let us understand the concepts of population and sample.

A population is the set of all similar items or events that are of interest for some experiment. If we want to calculate the average age of the people in a country, then the population would consist of the ages of all the people living in that country. However, collecting information about every single person in a country would consume a lot of resources. Instead, we obtain a sample that is representative of the population and use it to draw conclusions about the population.

In simple terms, population consists of the totality of observations with which we are concerned; and a sample is a subset of a population. We select a random sample to get information about the unknown population parameters.

Numerical quantities that describe a population, such as the population mean and standard deviation, are known as parameters. The corresponding quantities computed from a sample, such as the sample mean and sample standard deviation, are called statistics.
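To make the distinction concrete, here is a minimal Python sketch. The population of ages is simulated purely for illustration (it is not real census data), and the sizes and variable names are assumptions of this example.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical population: ages of 100,000 people (simulated, not real data)
population = [random.randint(0, 90) for _ in range(100_000)]

# Parameters describe the whole population
population_mean = statistics.mean(population)
population_sd = statistics.pstdev(population)

# A simple random sample of 500 people, drawn without replacement
sample = random.sample(population, 500)

# Statistics are computed from the sample and estimate the parameters
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

print(f"Population mean (parameter): {population_mean:.2f}")
print(f"Sample mean (statistic):     {sample_mean:.2f}")
```

The sample mean will usually land close to the population mean without matching it exactly, which is why a claim about a parameter has to be tested rather than simply read off the sample.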

A parameter can be estimated from a sample. Frequently, however, the problem confronting us is not just estimating the parameter, but deciding whether a claim made about the parameter is correct. To accomplish this, we use hypothesis testing.

Steps in hypothesis testing

  1. Formulate null and alternate hypotheses based on the practical question
  2. Set significance level (α)
  3. Decide the type of test (left-tailed, right-tailed or 2-tailed test)
  4. Calculate the test statistic and corresponding P-value
  5. Draw conclusions
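The five steps can be sketched end to end with a one-sample z-test. This is only an illustration under stated assumptions: the question, the summary numbers (n = 25, sample mean 54, known standard deviation 10) and the variable names are all hypothetical, and a z-test is just one of several tests that could be used at step 4.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: hypotheses (hypothetical question: has the mean moved away from 50?)
#   Ho: mu = 50    Ha: mu != 50
mu_0 = 50.0

# Step 2: significance level
alpha = 0.05

# Step 3: type of test; Ha uses "!=", so this is a two-tailed test

# Step 4: test statistic and P-value
# Assumed summary data: 25 observations, sample mean 54,
# population standard deviation known to be 10
n, sample_mean, sigma = 25, 54.0, 10.0
z = (sample_mean - mu_0) / (sigma / sqrt(n))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 5: conclusion
reject_null = p_value < alpha
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("Reject Ho" if reject_null else "Fail to reject Ho")
```

With these numbers z is 2.00 and the P-value is about 0.0455, which is below 0.05, so Ho is rejected.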

Null and Alternative Hypotheses

The null hypothesis, denoted by Ho, is the claim that is initially assumed to be true. The alternative hypothesis, denoted by Ha, is the assertion that contradicts Ho. A test of hypotheses is a method for using sample data to decide whether the null hypothesis should be rejected.

There is a familiar analogy to this in a criminal trial. One claim is the assertion that the accused individual is innocent. In the judicial system, this is the claim that is initially believed to be true. Only in the face of strong evidence to the contrary should the jury reject this claim in favor of the alternative assertion that the accused is guilty. If there is no strong evidence, the jury cannot reject the claim of innocence. So, in this example, our hypotheses would be as follows:

Ho : The individual is innocent
Ha : The individual is guilty

Formulating the hypothesis statements

The alternative hypothesis Ha usually represents the question to be answered or the theory to be tested, and thus its specification is crucial. The null hypothesis Ho nullifies or opposes Ha and is often the logical complement to Ha.

Alternative Hypothesis:

  • Specify the “effect” we are looking for
  • Specify the research question
  • Specify the condition that would cause change to happen or an action to be taken

Null Hypothesis:

  • Null = without value, effect, consequence or significance
  • Specify the “status-quo” or condition that would not cause change
  • Must place the point of equality in Ho

The alternative to the null hypothesis Ho : μ = μo will look like one of the following three assertions:

  1. Ha : μ > μo (in which case the implicit null hypothesis is μ ≤ μo)
  2. Ha : μ < μo (in which case the implicit null hypothesis is μ ≥ μo)
  3. Ha : μ ≠ μo (in which case the null hypothesis will be μ = μo)

Significance level (α)

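The level of significance, α, is the probability of rejecting Ho when it is actually true, and it is chosen before the data are examined. It tells us how much evidence we need in order to reject Ho: the smaller α is, the stronger the evidence must be. Common choices are 0.05 and 0.01.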

One-Tailed or Two-Tailed test?

Left-tailed test:
Ha : μ < μo

Right-tailed test:
Ha : μ > μo

Two-tailed test:
Ha : μ ≠ μo
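The choice of tail determines where the rejection region sits. The sketch below assumes a z statistic (standard normal under Ho); the function name `rejection_region` and the labels "left", "right" and "two" are hypothetical choices made for this example.

```python
from statistics import NormalDist

def rejection_region(tail, alpha=0.05):
    """Return (lower, upper): reject Ho if z < lower or z > upper."""
    std_normal = NormalDist()
    if tail == "left":     # Ha: mu < mu_0, all of alpha in the left tail
        return std_normal.inv_cdf(alpha), float("inf")
    if tail == "right":    # Ha: mu > mu_0, all of alpha in the right tail
        return float("-inf"), std_normal.inv_cdf(1 - alpha)
    if tail == "two":      # Ha: mu != mu_0, alpha split across both tails
        cutoff = std_normal.inv_cdf(1 - alpha / 2)
        return -cutoff, cutoff
    raise ValueError("tail must be 'left', 'right' or 'two'")

for tail in ("left", "right", "two"):
    lower, upper = rejection_region(tail)
    print(f"{tail:>5}-tailed: reject Ho if z < {lower:.3f} or z > {upper:.3f}")
```

At α = 0.05 the one-tailed cutoffs are about ±1.645, while the two-tailed cutoffs are about ±1.960 because α is shared between the two tails.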

Test Statistic and P-value

A test statistic is a function of the sample data on which the decision (reject Ho or do not reject Ho) is to be based. The null hypothesis will be rejected if and only if the observed or computed test statistic value falls in the rejection region.
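As a rough illustration, the z statistic below is one such function of the sample data. The measurements, the hypothesized mean of 50 and the assumption that the population standard deviation is known to be 4 are all made up for this sketch.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_statistic(data, mu_0, sigma):
    """z = (sample mean - mu_0) / (sigma / sqrt(n)), with sigma assumed known."""
    n = len(data)
    return (mean(data) - mu_0) / (sigma / sqrt(n))

# Hypothetical measurements; Ho: mu = 50 vs Ha: mu != 50, sigma assumed to be 4
data = [52, 48, 55, 51, 49, 57, 53, 50]
z = z_statistic(data, mu_0=50, sigma=4)

# Two-tailed rejection region at alpha = 0.05: |z| greater than the critical value
alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha / 2)
in_rejection_region = abs(z) > critical

print(f"z = {z:.3f}, critical value = {critical:.3f}")
print("Reject Ho" if in_rejection_region else "Do not reject Ho")
```

Here z is about 1.326, which is inside the interval (-1.960, 1.960), so the statistic does not fall in the rejection region and Ho is not rejected.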

P-value:

  • The P-value is a probability
  • This probability is calculated assuming that the null hypothesis is true
  • To determine the P-value, we must first decide which values of the test statistic are at least as contradictory to Ho as the value obtained from our sample.
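Which values count as "at least as contradictory" depends on the direction of Ha. The following sketch assumes a z statistic that is standard normal under Ho; the function name `p_value` and the tail labels are hypothetical.

```python
from statistics import NormalDist

def p_value(z, tail):
    """P-value of an observed z statistic, computed assuming Ho is true."""
    std_normal = NormalDist()
    if tail == "left":     # values as small as z, or smaller, contradict Ho
        return std_normal.cdf(z)
    if tail == "right":    # values as large as z, or larger, contradict Ho
        return 1 - std_normal.cdf(z)
    if tail == "two":      # values as far from 0 as z, in either direction
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")

z_observed = 2.0
for tail in ("left", "right", "two"):
    print(f"{tail:>5}-tailed P-value for z = {z_observed}: {p_value(z_observed, tail):.4f}")
```

The same observed statistic can therefore give very different P-values: z = 2.0 is strong evidence for Ha : μ > μo but no evidence at all for Ha : μ < μo.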

Drawing conclusions

“If the P is low, the null must go!”

  • Reject, or fail to reject the null hypothesis
    Never reject, or fail to reject the alternative hypothesis
  • The evidence is always weighed against the null hypothesis
    As the P-value gets smaller, the evidence against Ho (and in favor of Ha) grows stronger.
    If the P-value is lower than α, we reject the null hypothesis; otherwise, we fail to reject it.

Type I and Type II Errors

With every decision, we are either correct or have made one of two errors: a Type I error or a Type II error.

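  • Type I Error: rejecting Ho when Ho is true. Its probability is α = P(Reject Ho | Ho is True)
  • Type II Error: failing to reject Ho when Ho is false. Its probability is β = P(Fail to Reject Ho | Ho is False)
  • Type I Error is usually treated as the more serious of the two errors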
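Both error rates can be estimated by simulation. The sketch below is a rough Monte Carlo illustration rather than a definitive procedure: it assumes normally distributed data, a two-tailed z-test with known standard deviation, and made-up values (hypothesized mean 50, true mean 53 in the "Ho is false" case, 2,000 trials).

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical setup: Ho: mu = 50, sigma known to be 10, samples of size 25
MU_0, SIGMA, N, ALPHA = 50.0, 10.0, 25, 0.05
CRITICAL = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-tailed cutoff
TRIALS = 2000

def rejects_null(true_mean):
    """Draw one sample with the given true mean and run the two-tailed z-test."""
    sample = [random.gauss(true_mean, SIGMA) for _ in range(N)]
    z = (sum(sample) / N - MU_0) / (SIGMA / sqrt(N))
    return abs(z) > CRITICAL

# Ho is true (true mean equals MU_0): every rejection is a Type I error
type_1_rate = sum(rejects_null(MU_0) for _ in range(TRIALS)) / TRIALS

# Ho is false (true mean is 53): every failure to reject is a Type II error
type_2_rate = sum(not rejects_null(53.0) for _ in range(TRIALS)) / TRIALS

print(f"Estimated Type I error rate:  {type_1_rate:.3f}")
print(f"Estimated Type II error rate: {type_2_rate:.3f}")
```

The estimated Type I error rate should sit near the chosen significance level of 0.05, while the Type II error rate depends on how far the true mean is from the hypothesized one and on the sample size.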

To Summarize:

This blog gives an overview of Hypothesis Testing. To know more about the steps involved, the references mentioned will be useful.

References:

Walpole, Myers, Myers, & Ye. (2012). Probability & Statistics for Engineers & Scientists. Prentice Hall
Montgomery, & Runger. (2003). Applied Statistics & Probability for Engineers. John Wiley & Sons, Inc.
https://www.theopeneducator.com/doe/hypothesis-Testing-Inferential-Statistics-Analysis-of-Variance-ANOVA
