Calculate Value of The Chi Square

Nhan Tran
4 min read · Mar 8, 2019

This post will help you extend your knowledge of the Chi Square test with a simple example.

I hope you have read my previous article, The Chi Square Statistic (p.1), before moving on to this post. Now we will explore the Chi Square calculation in more depth.

When analysis of categorical data is concerned with more than one variable, two-way tables (also known as contingency tables) are employed. These tables provide a foundation for statistical inference, where statistical tests question the relationship between the variables on the basis of the data observed.

In the dataset “Popular Kids”, students in grades 4–6 were asked whether good grades, athletic ability, or popularity was most important to them. A two-way table separating the students by grade and by choice of most important factor is shown below:

Data source: Chase, M.A. and Dummer, G.M. (1992), “The Role of Sports as a Social Determinant for Children,” Research Quarterly for Exercise and Sport, 63, 418–424. Dataset available through the Statlib Data and Story Library (DASL).

The chi-square test provides a method for testing the association between the row and column variables in a two-way table. The null hypothesis H0 assumes that there is no association between the variables (in other words, one variable does not vary according to the other variable), while the alternative hypothesis H1 claims that some association does exist. The alternative hypothesis does not specify the type of association, so close attention to the data is required to interpret the information provided by the test.

The chi-square test is based on a test statistic that measures the divergence of the observed data from the values that would be expected under the null hypothesis of no association. This requires calculation of the expected values based on the data. The expected value for each cell in a two-way table is equal to (row total x column total)/n, where n is the total number of observations included in the table.

Now, let’s calculate the expected value of each cell with the formula: expected value = (row total × column total) / n

…then apply to our contingency table:

Continuing from the above example with the two-way table of students’ choice of grades, athletic ability, or popularity by grade, the expected values are calculated as shown below:
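To make the expected-value step concrete, here is a minimal Python sketch. The counts in `observed` are the figures commonly reported for the Popular Kids dataset and should be treated as an assumption; check them against the original table. The function name `expected_counts` is hypothetical.

```python
# Observed counts (ASSUMED values for the "Popular Kids" table; verify against the source).
# Rows: Grades, Popular, Sports. Columns: grade 4, grade 5, grade 6.
observed = [
    [49, 50, 69],
    [24, 36, 38],
    [19, 22, 28],
]


def expected_counts(table):
    """Expected count per cell under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for row in expected:
    print([round(value, 1) for value in row])
```

Note that the expected counts always add up to the same total n as the observed counts, which is a quick sanity check on the arithmetic.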

Once the expected values have been computed (done automatically in most software packages), the chi-square test statistic is computed as:

Chi square (x²) formula: x² = Σ (observed − expected)² / expected, summed over every cell of the table

Let’s apply the chi square formula to our example table:
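The same calculation can be sketched in Python. This is an illustration only: the counts in `observed` are assumed (the figures commonly reported for the Popular Kids dataset), and `chi_square_statistic` is a hypothetical helper name.

```python
# Observed counts (ASSUMED values for the "Popular Kids" table; verify against the source).
# Rows: Grades, Popular, Sports. Columns: grade 4, grade 5, grade 6.
observed = [
    [49, 50, 69],
    [24, 36, 38],
    [19, 22, 28],
]


def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count for this cell
            statistic += (obs - exp) ** 2 / exp
    return statistic


chi2 = chi_square_statistic(observed)
print(round(chi2, 2))  # about 1.51 with the assumed counts
```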

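Now, let’s calculate the degrees of freedom (df). For a two-way table, df = (number of rows − 1) × (number of columns − 1), so for our 3 × 3 table df = (3 − 1) × (3 − 1) = 4.

With a significance level of 0.05 and 4 degrees of freedom, the table value (critical value) is 9.488 (fourth row, third column).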
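If no printed table is at hand, the table value can be recovered numerically. The sketch below uses only the standard library and relies on a closed form for the upper-tail probability that holds only when df is even (which is the case here, df = 4); it is not a general-purpose implementation. The function names are hypothetical.

```python
import math


def chi2_survival_even_df(x, df):
    """P(X > x) for a chi-square variable; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


def critical_value(alpha, df, lo=0.0, hi=100.0):
    """Bisection for the x whose upper-tail probability equals alpha."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_survival_even_df(mid, df) > alpha:
            lo = mid  # tail still too heavy, move right
        else:
            hi = mid
    return (lo + hi) / 2.0


rows, cols = 3, 3
df = (rows - 1) * (cols - 1)
print(df, round(critical_value(0.05, df), 3))       # 4 and roughly 9.488
print(round(chi2_survival_even_df(1.51, df), 3))    # p-value for x² = 1.51
```

The second print gives the p-value directly, which is an alternative to comparing x² with the table value: a p-value above 0.05 leads to the same decision as x² below 9.488.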

Probability level (alpha)

Now we have both x² and table values and can compare these two to make a decision.

Our calculated x² = 1.51 is well below the table value of 9.488. If the two variables are independent, a difference this small between the observed counts and the expected counts is what we would expect from chance alone.

As x² < table value, we fail to reject the null hypothesis: the data give no evidence of an association between the pupils’ grade and their choice of most important factor.

Conclusion

So, here again, are the steps to make a Chi-square test:

  1. Add marginal frequencies to a contingency table
  2. Translate joint and marginal frequencies into probabilities
  3. Estimate the expected probability for each cell
  4. Calculate x²
  5. Compare x² with table value and make a decision:
  • x² > table value = reject the null hypothesis = the variables are dependent
  • x² ≤ table value = fail to reject the null hypothesis = no evidence that the variables are dependent
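The steps above can be sketched end to end in Python. This is a minimal illustration, not a reference implementation: the counts in `observed` are assumed (the figures commonly reported for the Popular Kids dataset), `chi_square_test` is a hypothetical name, and the table value 9.488 is passed in by hand rather than looked up.

```python
def chi_square_test(table, table_value):
    """Return (x², df, reject_null) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]        # marginal frequencies
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count under independence
            statistic += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    reject_null = statistic > table_value
    return statistic, df, reject_null


# ASSUMED counts for the "Popular Kids" table; verify against the source.
observed = [
    [49, 50, 69],  # Grades
    [24, 36, 38],  # Popular
    [19, 22, 28],  # Sports
]

statistic, df, reject_null = chi_square_test(observed, table_value=9.488)
if reject_null:
    print(f"x² = {statistic:.2f}, df = {df}: reject H0, the variables are dependent")
else:
    print(f"x² = {statistic:.2f}, df = {df}: fail to reject H0, no evidence of dependence")
```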
