Contingency Table

Nhan Tran
3 min read · Mar 6, 2019


What is a Contingency Table?

Contingency tables summarize the observed frequencies to describe the relationship between two categorical variables.

In statistics, a contingency table (also known as a cross tabulation or crosstab) is a type of table in a matrix format that displays the (multivariate) frequency distribution of the variables.

Let’s assume we are at a painting exhibition. The exhibition presents paintings from three periods: Early Renaissance, Late Renaissance, and Baroque. Each painting shows Fruits, Flowers, or a mix of both. You would like to count how many paintings from each period show Fruits, how many show Flowers, and how many show a mix of both. This is what you get:

Table 1 (observations)
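
As a minimal sketch of how such a table comes about, the snippet below counts (period, content) pairs with Python's standard library. The observations are invented for illustration and are not the values behind Table 1; the names `observations`, `periods`, `contents`, and `table` are likewise just illustrative choices.

```python
from collections import Counter

# Hypothetical raw data: one (period, content) pair per painting.
# Invented for illustration; NOT the values behind Table 1.
observations = [
    ("Early Renaissance", "Fruits"),
    ("Early Renaissance", "Flowers"),
    ("Late Renaissance", "Fruits"),
    ("Late Renaissance", "Mix"),
    ("Baroque", "Flowers"),
    ("Baroque", "Flowers"),
    ("Baroque", "Mix"),
]

periods = ["Early Renaissance", "Late Renaissance", "Baroque"]
contents = ["Fruits", "Flowers", "Mix"]

# Count how often each (period, content) combination occurs.
joint = Counter(observations)

# Arrange the counts as a matrix: one row per period, one column per content.
# Combinations that were never observed count as 0.
table = [[joint[(p, c)] for c in contents] for p in periods]

# Print the table with a header row.
print(f"{'':<18}" + "".join(f"{c:>9}" for c in contents))
for p, row in zip(periods, table):
    print(f"{p:<18}" + "".join(f"{n:>9}" for n in row))
```

In practice, a library function such as `pandas.crosstab` does the same counting; the plain-Python version is shown only to make the idea explicit.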

The values inside the table are called joint frequencies because each one relates to both categorical variables: a period (Early Renaissance, Late Renaissance, or Baroque) and the object depicted (Fruits, Flowers, or a mix of both). Marginal frequencies are the sums of each row and each column of the table, as highlighted below. The total number of paintings observed (64) appears in the bottom right corner of the table:

Table 2 (with sum for each dimension)
  • Sum (1), marked in pink: the total number of observations (paintings) for each period (one sum per row)
  • Sum (2), marked in blue: the total number of observations (paintings) for each content type (one sum per column)
  • The total number of paintings, circled in red, sits at the intersection of Sum (1) and Sum (2): 64
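
The row sums, column sums, and grand total described above can be sketched in a few lines of Python. The joint frequencies here are invented for illustration (chosen only so that they add up to 64, the total mentioned in the text); they are not the values from the article's tables.

```python
periods = ["Early Renaissance", "Late Renaissance", "Baroque"]
contents = ["Fruits", "Flowers", "Mix"]

# Hypothetical joint frequencies: rows = periods, columns = contents.
# Invented for illustration; NOT the values from the article's tables.
joint = [
    [10, 5, 3],   # Early Renaissance
    [8, 9, 6],    # Late Renaissance
    [4, 12, 7],   # Baroque
]

# Sum (1): marginal frequency of each period (one sum per row).
row_totals = [sum(row) for row in joint]

# Sum (2): marginal frequency of each content type (one sum per column).
col_totals = [sum(col) for col in zip(*joint)]

# The grand total is the same whether rows or columns are summed.
grand_total = sum(row_totals)
assert grand_total == sum(col_totals)

for period, total in zip(periods, row_totals):
    print(f"{period}: {total}")
for content, total in zip(contents, col_totals):
    print(f"{content}: {total}")
print(f"Total: {grand_total}")
```

The internal `assert` reflects the consistency check the bottom right cell provides: summing the row totals and summing the column totals must give the same number.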

To summarize, a contingency table displays the frequency distribution of the variables in terms of joint frequencies and marginal frequencies. Contingency tables are heavily used in survey research, business intelligence, engineering, and scientific research.

Standard contents of a contingency table

  • Multiple columns (historically, they were designed to use up all the white space of a printed page). Where each column refers to a specific sub-group in the population (for example, men or women), the columns are sometimes referred to as banner points or cuts, and the rows are sometimes referred to as stubs.
  • Significance tests. Typically, these are either column comparisons, which test for differences between columns and display the results using letters, or cell comparisons, which use color or arrows to identify a cell in the table that stands out in some way.
  • Nets (also spelled netts), which are sub-totals.
  • One or more of: percentages, row percentages, column percentages, indexes or averages.
  • Unweighted sample sizes (counts).
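
To make the row and column percentages in the list above concrete, here is a small sketch that derives both from a table of counts. The counts are invented for illustration and the variable names are arbitrary.

```python
contents = ["Fruits", "Flowers", "Mix"]

# Hypothetical counts: rows = periods, columns = contents.
# Invented for illustration only.
joint = [
    [10, 5, 3],   # Early Renaissance
    [8, 9, 6],    # Late Renaissance
    [4, 12, 7],   # Baroque
]

col_totals = [sum(col) for col in zip(*joint)]

# Row percentages: each cell as a share of its row total,
# so every row adds up to 100.
row_pct = [[100 * n / sum(row) for n in row] for row in joint]

# Column percentages: each cell as a share of its column total,
# so every column adds up to 100.
col_pct = [
    [100 * n / col_totals[j] for j, n in enumerate(row)]
    for row in joint
]

print("Row percentages:")
for row in row_pct:
    print("  " + "  ".join(f"{v:5.1f}" for v in row))

print("Column percentages:")
for row in col_pct:
    print("  " + "  ".join(f"{v:5.1f}" for v in row))
```

Row percentages answer "within this period, what share of paintings shows each content type?", while column percentages answer "within this content type, what share comes from each period?".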
