Z Test & T test in a Nutshell:

Namita Rana
Dec 20, 2021


I will start with a little background on the topic, assuming that you already know about hypothesis testing and its related terms.

What is hypothesis testing? A hypothesis is a calculated prediction or assumption about a population parameter based on limited evidence. The whole point of formulating a hypothesis is to test it: the researcher subjects the assumption to a series of evaluations to decide whether the data support it.

In hypothesis testing, statistical tests are used to check whether the null hypothesis is rejected or not rejected. These statistical tests assume a null hypothesis of no relationship or no difference between groups.

There are many different types of hypothesis tests, but they can be divided into two main categories:

Parametric tests

  • Make inferences about parameters such as the mean and variance.
  • Are based on assumptions of specific distributions (e.g. “normal” or “t” distributions).

Non-parametric tests

  • Make inferences about distribution-level properties such as the median or the type of distribution.
  • Usually include sign and rank tests, which work with the signs or ranks of the observations rather than their raw values.
  • Do not require assumptions of normality (but do have some assumptions… always check them!).

In this blog I will be covering two types of parametric tests:

  • z-test
  • t-test

Parametric t-tests and z-tests are used to compare the means of two samples. These two tests are used to test the null hypothesis of equality of the means of two groups (samples). The calculation method differs according to the nature of the samples. A distinction is made between independent samples or paired samples. The t and z tests are known as parametric because the assumption is made that the samples are normally distributed.

The normal distribution is described by two parameters:

  • Mean
  • Standard Deviation

Let’s understand Z-test:

A z-test is a statistical test used to determine whether two population means are different when the variances are known and the sample size is large (n > 30).

It is basically a form of hypothesis test that is used to decide whether or not to reject a null hypothesis.

The test statistic is assumed to follow a normal distribution, and the population standard deviation must be known to perform an accurate z-test.

When to use a z-Test:

  • Your sample size must be greater than 30.
  • Data points should be independent from each other. In other words, one data point isn’t related or doesn’t affect another data point.
  • Your data should be normally distributed.
  • Samples should be drawn at random from the population.
  • The standard deviation of the population should be known.

Steps to run a z-Test:

Step 1: State Your Hypotheses:

  • Null Hypothesis (𝐻0).
  • Alternative Hypothesis (𝐻𝑎).

Step 2: Specify a Significance Level (alpha).

What is alpha α: The significance level is the probability of rejecting the null hypothesis when it is true. For example, a significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no actual difference. Lower significance levels indicate that you require stronger evidence before you will reject the null hypothesis.

The researcher determines the significance level before conducting the experiment.

Step 3: Calculate the test statistic: For z-tests, we use the z-statistic as our test statistic.

What is a z-statistic: It is a number that represents how many standard errors the sample mean lies above or below the hypothesized population mean. A one-sample z-statistic is calculated as:

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

where:

  • x¯ : Sample mean.
  • 𝜎 : The population standard deviation.
  • 𝑛: Number of items in the sample.
  • 𝜇0: Mean you’re testing the hypothesis for.

Step 4: Calculate the p-value (calculated probability):

We will look up the related probability value in a z-table, or use scipy.stats to calculate it directly. In SciPy, the cumulative probability up to the z-value can be calculated as:
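As a minimal sketch of that lookup, the snippet below uses `scipy.stats.norm`. The z-value of 1.96 is a made-up number chosen only for illustration, and which p-value you need depends on your alternative hypothesis.

```python
from scipy import stats

z = 1.96  # hypothetical z-statistic, chosen only for illustration

# Cumulative probability P(Z <= z) under the standard normal distribution
cumulative = stats.norm.cdf(z)

# p-values for each kind of alternative hypothesis
p_left = stats.norm.cdf(z)               # H_a: the mean is smaller
p_right = stats.norm.sf(z)               # H_a: the mean is larger (sf = 1 - cdf)
p_two_sided = 2 * stats.norm.sf(abs(z))  # H_a: the mean is different

print(f"P(Z <= {z}) = {cumulative:.4f}")
print(f"two-sided p-value = {p_two_sided:.4f}")
```

Using `sf` (the survival function) rather than `1 - cdf` is a small design choice: it tends to be more accurate for very large z-values, where `1 - cdf` loses precision.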

Step 5: Interpret p-value:

What’s a p-value: It is the probability, assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one we calculated. We use it to decide whether to reject the null hypothesis. The smaller the p-value, the stronger the evidence against the null hypothesis.

Step 6: Evaluate null hypothesis:

  • If p ≤ alpha: Reject the null hypothesis.
  • If p > alpha: Fail to reject the null hypothesis.

Types of z test:

One Sample z test: A one-sample z-test is used to test whether a population parameter is significantly different from some hypothesized value.

$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

where,

  • X¯: mean of the sample.
  • 𝜇: mean of the population.
  • 𝜎 : Standard deviation of the population.
  • 𝑛: sample size.
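Putting the steps together, a one-sample z-test could be sketched roughly as follows. The helper name `one_sample_z_test` and all the numbers are hypothetical, and the sketch assumes a two-sided alternative hypothesis.

```python
import math
from scipy import stats

def one_sample_z_test(sample_mean, pop_mean, pop_std, n, alpha=0.05):
    """Two-sided one-sample z-test; returns (z, p_value, reject_null)."""
    # z = (sample mean - hypothesized mean) / standard error
    z = (sample_mean - pop_mean) / (pop_std / math.sqrt(n))
    # Two-sided p-value: probability of a value at least this extreme
    p_value = 2 * stats.norm.sf(abs(z))
    return z, p_value, p_value < alpha

# Made-up example: 50 observations with mean 105, tested against a
# hypothesized population mean of 100 with known standard deviation 15.
z, p_value, reject = one_sample_z_test(
    sample_mean=105, pop_mean=100, pop_std=15, n=50
)
print(f"z = {z:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```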

Two Sample z test: The Two-Sample Z-test is used to compare the means of two samples to see if it is feasible that they come from the same population.

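$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}}$$

where x¯1 and x¯2 are the two sample means, 𝜇1 − 𝜇2 is the hypothesized difference between the population means (0 under the usual null hypothesis), 𝜎1 and 𝜎2 are the population standard deviations, and 𝑛1 and 𝑛2 are the sample sizes.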
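Here is a rough sketch of the two-sample case, using the standard two-sample z-statistic (the difference in sample means divided by its standard error). The function name `two_sample_z_test` and the input numbers are made up for illustration, and the alternative hypothesis is assumed to be two-sided.

```python
import math
from scipy import stats

def two_sample_z_test(mean1, mean2, std1, std2, n1, n2, hypothesized_diff=0.0):
    """Two-sided two-sample z-test with known population standard deviations."""
    # Standard error of the difference between the two sample means
    standard_error = math.sqrt(std1**2 / n1 + std2**2 / n2)
    z = ((mean1 - mean2) - hypothesized_diff) / standard_error
    p_value = 2 * stats.norm.sf(abs(z))
    return z, p_value

# Made-up summary statistics for two independent samples
z, p_value = two_sample_z_test(mean1=52, mean2=50, std1=4, std2=5, n1=40, n2=50)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```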

Let’s understand T test:

A t-test is a type of inferential statistic used to determine if there is a significant difference between the means of two groups, which may be related in certain features.

The t-test assumes that:

  • Sample observations are independent of each other.
  • Sample observations have numeric and continuous values.
  • The data are normally distributed.

t-tests are a statistical way of testing a hypothesis when:

  • You don’t know the population standard deviation.
  • You have a small sample size (n < 30).

Steps to run a T-Test:

Step 1: State Your Hypotheses:

  • Null Hypothesis (𝐻0).
  • Alternative Hypothesis (𝐻𝑎).

Step 2: Choose a Significance Level (Alpha):

The significance level, also denoted as alpha or α, is the probability of rejecting the null hypothesis when it is true.

Step 3: Calculate the t-statistic:

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

where,

  • 𝜇: Population mean
  • x¯: The sample mean
  • 𝑠: The sample standard deviation
  • 𝑛: Number of observations
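To make the t-statistic concrete, here is a small sketch that computes it by hand with NumPy and cross-checks it against `scipy.stats.ttest_1samp`. The sample values and the hypothesized mean are made up.

```python
import numpy as np
from scipy import stats

# Made-up sample and hypothesized population mean
sample = np.array([5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.7])
mu = 5.0

n = sample.size
x_bar = sample.mean()
s = sample.std(ddof=1)  # ddof=1 gives the sample standard deviation

# t = (sample mean - hypothesized mean) / estimated standard error
t_manual = (x_bar - mu) / (s / np.sqrt(n))

# SciPy computes the same statistic, plus a two-sided p-value
result = stats.ttest_1samp(sample, popmean=mu)
print(f"manual t = {t_manual:.4f}, scipy t = {result.statistic:.4f}")
```

Note the `ddof=1`: NumPy's default is the population formula (`ddof=0`), which would give a slightly different t-value.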

Step 4: Calculate Critical Value (Find Rejection Region):

The t-test produces two values as its output: the t-value and the degrees of freedom (number of items in the sample − 1).

  • A large t-score indicates that the groups are different.
  • A small t-score indicates that the groups are similar.

t-critical: To find a critical t-value, we refer to a t-table (t-distribution table), or we can calculate it using Python’s scipy.stats module.

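For a left-tailed test:

$$t_{\text{critical}} = F^{-1}_{df}(\alpha)$$

For a right-tailed test:

$$t_{\text{critical}} = F^{-1}_{df}(1 - \alpha)$$

For a two-tailed test:

$$t_{\text{critical}} = \pm F^{-1}_{df}(1 - \alpha / 2)$$

where,

alpha: significance level

df: degrees of freedom

F⁻¹: quantile function (inverse cumulative distribution function) of the t distribution with df degrees of freedom, which is the value a t-table lists.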
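In SciPy, the quantile function of the t distribution is available as `scipy.stats.t.ppf`, so the three critical values can be sketched as below. The values of `alpha` and `df` are assumptions picked for illustration.

```python
from scipy import stats

alpha = 0.05  # significance level (illustrative choice)
df = 9        # degrees of freedom, e.g. a sample of 10 observations

# Left-tailed test: reject H0 if t < t_crit_left
t_crit_left = stats.t.ppf(alpha, df)

# Right-tailed test: reject H0 if t > t_crit_right
t_crit_right = stats.t.ppf(1 - alpha, df)

# Two-tailed test: reject H0 if |t| > t_crit_two
t_crit_two = stats.t.ppf(1 - alpha / 2, df)

print(f"left: {t_crit_left:.3f}, right: {t_crit_right:.3f}, two-tailed: ±{t_crit_two:.3f}")
```

Because the t distribution is symmetric around zero, the left-tailed critical value is just the negative of the right-tailed one.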

Step 5: Compare Sample t-value with Critical t-value: Can We Reject the Null Hypothesis?

We check whether the t-statistic falls in the rejection region defined by the critical t-value (for example, whether it is greater than the critical value in a right-tailed test). Based on this, we decide whether to reject or fail to reject the null hypothesis.

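We can also calculate the p-value. For a two-tailed test:

$$p = 2 \, P(T_{df} \ge |t|)$$

where,

df: degrees of freedom,

t: t-statistic,

T: a random variable that follows the t distribution with df degrees of freedom.

For a one-tailed test, drop the factor of 2 and use the tail that matches the alternative hypothesis.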
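One way to get this p-value with SciPy is through the t distribution's `cdf` and `sf` methods. In this sketch the t-statistic and degrees of freedom are made-up inputs.

```python
from scipy import stats

t_stat = 2.5  # hypothetical t-statistic
df = 9        # degrees of freedom (n - 1 for a one-sample test)

p_left = stats.t.cdf(t_stat, df)               # left-tailed: P(T <= t)
p_right = stats.t.sf(t_stat, df)               # right-tailed: P(T >= t)
p_two_sided = 2 * stats.t.sf(abs(t_stat), df)  # two-tailed

print(f"left: {p_left:.4f}, right: {p_right:.4f}, two-sided: {p_two_sided:.4f}")
```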

Step 6: Evaluate null hypothesis:

  • If p ≤ alpha: Reject the null hypothesis.
  • If p > alpha: Fail to reject the null hypothesis.

What type of t-test should I use?

When choosing a t-test, you will need to consider two things: whether the groups being compared come from a single population or two different populations, and whether you want to test the difference in a specific direction.

One-sample, two-sample, or paired t-test?

  • If the groups come from a single population (e.g. measurements taken before and after a treatment on the same subjects), perform a paired t-test.
  • If the groups come from two different populations, perform a two-sample t-test (independent t-test).
  • If there is one group being compared against a standard value, perform a one-sample t-test.
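Each of the three cases above maps to a function in `scipy.stats`. The sketch below uses made-up measurements (the variable names and values are hypothetical) just to show which function fits which situation.

```python
import numpy as np
from scipy import stats

# Made-up measurements for illustration only
before = np.array([72.0, 75.5, 68.2, 80.1, 77.3, 69.8])  # same subjects, first measurement
after = np.array([70.1, 74.0, 67.5, 77.9, 76.0, 69.0])   # same subjects, second measurement
other_group = np.array([65.3, 71.2, 66.8, 73.5, 70.0, 68.4, 72.1])  # different subjects

# One group compared against a standard value: one-sample t-test
one_sample = stats.ttest_1samp(before, popmean=70.0)

# Two independent groups: two-sample t-test.
# The default assumes equal variances; equal_var=False gives Welch's t-test.
independent = stats.ttest_ind(before, other_group)

# Two related measurements on the same subjects: paired t-test
paired = stats.ttest_rel(before, after)

print(f"one-sample p = {one_sample.pvalue:.4f}")
print(f"independent p = {independent.pvalue:.4f}")
print(f"paired p = {paired.pvalue:.4f}")
```

A paired t-test is equivalent to a one-sample t-test on the per-subject differences, which is why it needs both arrays to have the same length and order.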

One-tailed or two-tailed t-test?

  • If you only care whether the two populations are different from one another, perform a two-tailed t-test.
  • If you want to know whether one population mean is greater than or less than the other, perform a one-tailed t-test.
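In SciPy, the choice between one-tailed and two-tailed is made through the `alternative` argument of the t-test functions. This sketch assumes SciPy 1.6 or newer, where `ttest_ind` accepts that argument, and the two groups are made-up data.

```python
import numpy as np
from scipy import stats

# Made-up samples from two groups
group_a = np.array([23.1, 25.4, 24.8, 26.0, 22.9, 25.1, 24.3])
group_b = np.array([21.0, 22.7, 23.5, 20.9, 22.1, 23.0, 21.8])

# Two-tailed: are the means different at all?
two_tailed = stats.ttest_ind(group_a, group_b, alternative="two-sided")

# One-tailed: is the mean of group_a greater (or less) than that of group_b?
greater = stats.ttest_ind(group_a, group_b, alternative="greater")
less = stats.ttest_ind(group_a, group_b, alternative="less")

print(f"two-sided p = {two_tailed.pvalue:.4f}")
print(f"greater p = {greater.pvalue:.4f}, less p = {less.pvalue:.4f}")
```

The t-statistic is the same in all three calls; only the p-value changes, because it is computed from a different tail of the distribution.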

How to decide which test to perform.

While both tests are used to compare population means, they differ in when they apply. The t-test is useful for determining whether there is a statistically significant difference between two independent samples. It is suited to problems with a limited sample size, that is, fewer than thirty observations, and an unknown population variance.

On the other hand, the z-test measures how far a sample mean deviates from the hypothesized population mean, in units of the standard error. The z-test is used for data sets whose population standard deviation is known. The sample size should also be large; that is, it should exceed thirty.
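One way to see why the sample size matters is to compare critical values: as the sample grows, the t distribution gets closer to the normal distribution, so the two tests give increasingly similar answers. The sketch below is only an illustration of that trend, using a two-tailed test at alpha = 0.05 and a few arbitrary sample sizes.

```python
from scipy import stats

# Two-tailed critical value at alpha = 0.05 for the z-test
z_crit = stats.norm.ppf(0.975)

# Matching t critical values for a few arbitrary sample sizes
sample_sizes = (5, 10, 30, 100, 1000)
t_crits = {n: stats.t.ppf(0.975, df=n - 1) for n in sample_sizes}

print(f"z critical value: {z_crit:.4f}")
for n, t_crit in t_crits.items():
    print(f"n = {n:>4}: t critical value = {t_crit:.4f}")
```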

I hope this blog helps in understanding these two tests and how to decide which one to go for.
