Test Statistic Cheat Sheet: Z, T, F, and Chi-Squared

Marin Gow
May 27, 2019


One of the more confusing things when beginning to study stats is the variety of available test statistics. You have the options of z-score, t-statistic, f-statistic, and chi-squared, and it’s easy to forget what the difference is between all of these letters. When do you use f-stat rather than t-stat? What is a chi-squared value?

The important thing to remember is that they are not that different from each other. These four test statistics fundamentally do the same thing, just in different situations. Here, we will quickly break down what they all have in common, and then provide a reference for when to use which test statistic.

What is a test statistic?

A test statistic is one component of a significance test. It is used to determine how unusual your result is, assuming the null hypothesis is true. For example, let’s say you flip a coin three times and get tails every time. Assuming the coin is fair, the probability of that happening is .5³, or 12.5%. This is not particularly out of the ordinary, and you probably wouldn’t think much of it.

Now you flip the coin five more times, and you still get tails every time, for a total of eight tails. The probability of this (assuming again that the coin is fair) is .5⁸, or about 0.39%. At this point, you would probably start to wonder whether the coin is rigged to show tails, because it is so unlikely that a fair (null hypothesis) coin would give you eight tails in a row.
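
If you want to check this arithmetic yourself, here is a minimal Python sketch. Using scipy is my choice for illustration, not something from the original article (binomtest requires scipy 1.7 or newer):

```python
from scipy.stats import binomtest

print(0.5 ** 3)  # 0.125 -> a 12.5% chance of three tails in a row
print(0.5 ** 8)  # 0.00390625 -> a ~0.39% chance of eight tails in a row

# The same 0.39% shows up as a p-value: the probability of eight or
# more tails in eight flips of a fair coin (the null hypothesis)
result = binomtest(k=8, n=8, p=0.5, alternative="greater")
print(result.pvalue)  # 0.00390625
```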

A test statistic is a standardized way to measure this kind of unusualness consistently across a variety of situations and data. This is important because it helps you establish the statistical significance of your result, which in turn determines whether or not you reject your null hypothesis. This is done by comparing your test statistic to a pre-established critical value. The higher the absolute value of your test statistic, the more significant your result.
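
To make that comparison concrete, here is a quick sketch of a one-sample z-test; all the numbers are invented for illustration:

```python
import math
from scipy.stats import norm

# Invented example: a sample of 50 with mean 105, drawn from a
# population with known mean 100 and standard deviation 15
sample_mean, pop_mean, pop_std, n = 105.0, 100.0, 15.0, 50

# z-score: how many standard errors the sample mean sits from the
# population mean, assuming the null hypothesis is true
z = (sample_mean - pop_mean) / (pop_std / math.sqrt(n))

# Two-tailed critical value at the 5% significance level (about 1.96)
critical = norm.ppf(0.975)

print(f"z = {z:.2f}, critical value = {critical:.2f}")
print("significant" if abs(z) > critical else "not significant")
```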

If you need more information on null hypotheses or critical values, this is a great article.

To recap, all test statistics:

  • Tell you how far into the tail of your distribution your result is, or in other words, how unusual it would be to achieve your result if the null hypothesis were true
  • Are compared to a critical value to determine statistical significance
  • Can be negative or positive in the case of z and t (f-statistics and chi-squared values are always non-negative); either way, the higher the absolute value, the more significant the result
  • Accompany a particular significance test — z-score for a z-test, f-statistic for ANOVA, etc. More on this below

Why are there different significance tests?

  1. Sample vs. population data. In statistics, you will virtually always be working with a sample (a small group used to try to infer something about a larger group) rather than a population (where you are able to survey every single person or data point). Samples require special treatment because the smaller your sample is, the less confident you can be that your results will apply to the population at large.
  2. Nature of the test. Different questions require different tests. For example, if you want to know whether three or more samples differ from each other, you would use ANOVA and look at the f-stat (sketched, along with the t-test, in the code after this list). If you want to know whether a sample differs from a known population, you would use a z-test and look at the z-score.
  3. Underlying distribution. You are probably familiar with several distributions, such as normal, binomial, and Poisson. Each test statistic is compared against its own reference distribution (the standard normal for z-scores, the heavier-tailed t-distribution for t-statistics, and so on), which have different shapes, and by choosing the correct test statistic you adjust for this.
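
As a rough sketch of these points, scipy exposes both the t-test (small samples, unknown population standard deviation) and ANOVA directly. The data below is randomly generated, purely for illustration:

```python
import numpy as np
from scipy.stats import ttest_ind, f_oneway

# Randomly generated samples, for illustration only
rng = np.random.default_rng(0)
a = rng.normal(loc=10.0, scale=2.0, size=25)
b = rng.normal(loc=11.0, scale=2.0, size=25)
c = rng.normal(loc=12.0, scale=2.0, size=25)

# Two samples with unknown population standard deviation -> t-test
t_stat, t_p = ttest_ind(a, b)
print(f"t = {t_stat:.2f}, p = {t_p:.4f}")

# Three or more samples -> ANOVA, which produces an f-statistic
f_stat, f_p = f_oneway(a, b, c)
print(f"F = {f_stat:.2f}, p = {f_p:.4f}")
```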

Test statistic cheat sheet

The table below is meant to serve as a general guide when deciding which statistical test to run. These tests have quite a bit of nuance to them, so not every variation of each test is covered, but hopefully it starts to demonstrate the differences between these four tests.

| Test statistic | Significance test | When to use it |
| --- | --- | --- |
| z-score | z-test | Comparing a sample mean to a known population mean when the population standard deviation is known and the sample is large (roughly n ≥ 30) |
| t-statistic | t-test | Comparing means when the population standard deviation is unknown or the sample is small; one-sample, two-sample, and paired variants exist |
| f-statistic | ANOVA (f-test) | Comparing the means of three or more groups at once |
| chi-squared | chi-squared test | Working with categorical data, either testing goodness of fit or testing independence between two variables |
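
To round out the cheat sheet in code, here is a sketch of the remaining two tests. scipy does not ship a one-sample z-test, so this borrows ztest from statsmodels (my choice, not the article’s); all the counts and sample data are made up:

```python
import numpy as np
from scipy.stats import chisquare
from statsmodels.stats.weightstats import ztest

# Chi-squared goodness of fit: observed counts of a categorical
# variable versus the counts we expected (made-up numbers)
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]
chi2, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")

# z-test: does a large sample differ from a hypothesized mean of 100?
rng = np.random.default_rng(1)
sample = rng.normal(loc=102.0, scale=15.0, size=200)
z, p = ztest(sample, value=100.0)
print(f"z = {z:.2f}, p = {p:.4f}")
```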