Hypothesis Testing
Hypothesis testing is a way of testing a claim about a population parameter using data measured in a sample. It is sometimes referred to as significance testing. The aim is to reach a statistical conclusion about whether to reject or retain the hypothesis.
The following steps are involved in statistical hypothesis testing:
- Formulate the Null Hypothesis (Ho) and Alternative Hypothesis (Ha): In general, the null hypothesis is the commonly accepted position and proposes that no effect or difference exists in a given set of observations. The alternative hypothesis is the statement that contradicts the null hypothesis.
- State the significance level: The level of significance is the probability with which we will reject a null hypothesis when it is true. It is denoted by alpha (α). Confidence level = 1 - α.
- Calculate the test statistic.
- Decide to reject or retain the null hypothesis: If the test statistic falls within the acceptance region, retain the null hypothesis; if it falls in the rejection region, reject the null hypothesis.
Type I Error and Type II Error
Type I error is rejecting the Null Hypothesis (Ho) when in reality Ho is true.
Type II error is failing to reject the Null Hypothesis (Ho) when in reality the Alternative Hypothesis (Ha) is true.
The probability of a Type I error is the significance level, denoted by alpha (α). Since a Type I error is usually treated as the more serious error, we generally set small values of alpha.
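To make the link between alpha and the Type I error concrete, here is a small simulation sketch. It repeatedly draws samples from a population where Ho is actually true and counts how often a two-tailed z test still rejects it. The function name `type_i_error_rate` and the population values (mean 800, standard deviation 12, sample size 36) are illustrative assumptions, not a fixed recipe.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, mu0=800.0, sigma=12.0, n=36,
                      trials=5000, seed=0):
    """Estimate how often a two-tailed z test rejects Ho when Ho is true."""
    rng = random.Random(seed)
    # Positive cutoff of the two-tailed rejection region.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Sample from a population where Ho really holds (mean equals mu0).
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
        if abs(z) > critical:
            rejections += 1  # a Type I error: Ho is true but was rejected
    return rejections / trials

print(f"Estimated Type I error rate: {type_i_error_rate():.3f}")
```

The estimated rate should land close to the chosen alpha, which is why a smaller alpha means fewer false rejections.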
Rejection Regions
Critical values, which mark the cutoffs for the rejection region, can be identified for any level of significance. The rejection region is the region beyond a critical value in a hypothesis test. When the value of a test statistic is in the rejection region, we decide to reject the null hypothesis; otherwise, we retain the null hypothesis.
Type of Tests
1. Two tailed
Ho: µ = µ0
Ha: µ ≠ µ0
The critical region consists of both tails of the sampling distribution of the test statistic.
2. One tailed (left sided)
Ho: µ = µ0
Ha: µ < µ0
The critical region consists of the left tail of the sampling distribution of the test statistic.
3. One tailed (right sided)
Ho: µ = µ0
Ha: µ > µ0
The critical region consists of the right tail of the sampling distribution of the test statistic.
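The three test types differ only in where the rejection region sits. The sketch below shows one way to express that decision in code. The helper name `rejects_null` and its `tail` labels are hypothetical, and the cutoffs passed in (1.96 and 1.645) are the standard normal values for a 5% significance level.

```python
def rejects_null(z, critical, tail="two"):
    """Return True if z falls in the rejection region for the given test type.

    critical is the positive cutoff for the chosen significance level.
    """
    if tail == "two":
        # Both tails: reject when z is far from zero in either direction.
        return abs(z) > critical
    if tail == "left":
        # Left tail only: reject for sufficiently negative z.
        return z < -critical
    if tail == "right":
        # Right tail only: reject for sufficiently positive z.
        return z > critical
    raise ValueError("tail must be 'two', 'left', or 'right'")

print(rejects_null(2.1, 1.96, "two"))     # True
print(rejects_null(2.1, 1.645, "left"))   # False
print(rejects_null(-2.1, 1.645, "left"))  # True
```

Note that the same test statistic can lead to different decisions depending on which alternative hypothesis was stated up front.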
Commonly used critical values for one-tailed and two-tailed tests:
For the two commonly used levels of significance (α), the critical values of the z statistic are:
α = 0.05: One-tailed test -1.645 (left sided) or +1.645 (right sided), two-tailed test ±1.96
α = 0.01: One-tailed test -2.33 (left sided) or +2.33 (right sided), two-tailed test ±2.58
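These critical values do not have to be memorized; they come from the inverse of the standard normal distribution. A short sketch using Python's standard library follows. The helper name `z_critical` is made up for illustration.

```python
from statistics import NormalDist

def z_critical(alpha, two_tailed):
    """Positive critical z value for a given significance level."""
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.05, 0.01):
    one = z_critical(alpha, two_tailed=False)
    two = z_critical(alpha, two_tailed=True)
    print(f"alpha = {alpha}: one-tailed {one:.3f}, two-tailed {two:.3f}")
# alpha = 0.05: one-tailed 1.645, two-tailed 1.960
# alpha = 0.01: one-tailed 2.326, two-tailed 2.576
```

Rounded to two decimals, the values for 0.01 are the familiar 2.33 and 2.58.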
Z Test
Used when the sample size is large (n > 30) and the population standard deviation is known. If the sample size is small, use the t test instead.
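The Z statistic is calculated as:
Z = (x̄ - µ0) / (σ / √n)
where x̄ is the sample mean, µ0 is the population mean under Ho, σ is the population standard deviation and n is the sample size.
For a normal distribution, around 68% of the values have a z-score between -1 and +1, and around 95% have a z-score between -2 and +2.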
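As a quick check of both ideas, the sketch below computes a z statistic for a sample mean (sample mean minus the hypothesized mean, divided by the standard deviation over the square root of n) and the share of a standard normal distribution within a given number of standard deviations. The function names and the numbers in the first call are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """z statistic for a sample mean with known population standard deviation."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error

def share_within(k):
    """Share of a standard normal distribution with z-score between -k and +k."""
    std_normal = NormalDist()
    return std_normal.cdf(k) - std_normal.cdf(-k)

print(z_statistic(52, 50, 10, 100))  # 2.0
print(f"{share_within(1):.4f}")      # 0.6827
print(f"{share_within(2):.4f}")      # 0.9545
```

The exact 95% band is slightly narrower than ±2, which is where the two-tailed critical value of 1.96 comes from.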
Illustrative example.
The problem statement:
Suppose you need to investigate a claim that in a faraway village, Choco Land, 10-year-old kids consume an average of 800 calories of chocolate per day. You select a random sample of 36 10-year-old kids. The average of the daily calories consumed by these 36 kids is 806 calories. (Assume we know that the population standard deviation of the calories consumed by 10-year-old kids is 12 calories.)
Solution:
a) State the Null Hypothesis and Alternate Hypothesis
Null Hypothesis: Ho: µ = 800
Alternate Hypothesis: Ha: µ ≠ 800
This will be a two-tailed test.
b) State the significance level
Significance level (α) is set at 0.05.
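c) Test Statistic
Sample size n = 36 > 30 and the population standard deviation is known, so we will use the z test.
Test statistic Z = (806 - 800) / (12 / √36) = 6 / 2 = 3
The calculated test statistic (3) is greater than the two-tailed critical value of 1.96 for α = 0.05, so it falls in the rejection region.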
d) Decision to reject or retain the hypothesis
The null hypothesis is rejected in favor of the alternate hypothesis.
Therefore, at the 5% significance level, the sample provides evidence that 10-year-old kids in Choco Land do not consume an average of 800 calories of chocolate per day.
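The whole worked example can be reproduced in a few lines of Python. This is only a sketch of the four steps with the numbers from the problem statement; the function name `two_tailed_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Return (z, critical value, reject Ho?) for a two-tailed z test."""
    # Test statistic: distance of the sample mean from mu0 in standard errors.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Two-tailed cutoff: alpha is split between the two tails.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return z, critical, abs(z) > critical

# Numbers from the Choco Land example.
z, critical, reject = two_tailed_z_test(sample_mean=806, mu0=800, sigma=12, n=36)
print(f"z = {z:.2f}, critical value = ±{critical:.2f}, reject Ho: {reject}")
# z = 3.00, critical value = ±1.96, reject Ho: True
```

Changing `alpha` to 0.01 raises the cutoff to about 2.58, and a test statistic of 3 would still fall in the rejection region.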
Hope this article was helpful to you. Thank you.