Anova function in R and the type of sum of squares

Tingyu Zou
Feb 27, 2024


The aov() function in R uses Type I (sequential) sum of squares (SS) by default. Type I SS is computed term by term, in the order the terms appear in the model formula: the SS for a term is the reduction in the residual sum of squares obtained by adding that term to a model that already contains all preceding terms. As a result, the same term can receive a different SS, and therefore a different F statistic and p-value, depending on where it appears in the formula. This matters mainly for unbalanced data (unequal cell sizes), where the factors are no longer orthogonal; in a balanced factorial design the order of the factors makes no difference.
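The sequential definition above can be checked by hand. The following is a minimal sketch on simulated data; the data frame d, the factors A and B, and the helper rss() are made up for illustration. It fits a chain of nested models with lm() and compares the successive drops in residual SS with the "Sum Sq" column reported by aov().

```r
# Hypothetical unbalanced data: two factors with unequal cell counts.
set.seed(1)
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(8, 12))),
  B = factor(c(rep("b1", 6), rep("b2", 2), rep("b1", 3), rep("b2", 9)))
)
d$y <- rnorm(nrow(d)) + 1.0 * (d$A == "a2") + 0.5 * (d$B == "b2")

# Residual sum of squares of a fitted linear model (illustrative helper).
rss <- function(fit) sum(residuals(fit)^2)

m0 <- lm(y ~ 1,     data = d)  # intercept only
m1 <- lm(y ~ A,     data = d)  # A added
m2 <- lm(y ~ A + B, data = d)  # B added after A

# Type I SS: the drop in residual SS as each term enters in turn.
ss_manual <- c(A = rss(m0) - rss(m1), B = rss(m1) - rss(m2))

# The same quantities as reported by aov().
tab <- summary(aov(y ~ A + B, data = d))[[1]]
ss_aov <- tab[["Sum Sq"]][1:2]

print(ss_manual)
print(ss_aov)
```

The two printed vectors should agree up to floating-point error, which is the sense in which each term is adjusted only for the terms that precede it.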

Characteristics of Type I SS

  • Sequential Analysis: The sum of squares for each factor depends on the order of factors in the model formula. Each factor is adjusted only for the factors that precede it, not for those that follow.
  • Sensitivity to Order: With unbalanced data, the SS, F statistic, and p-value of a factor can change with its position in the formula, so a factor may look significant in one ordering and not in another.
  • Mainly Suitable for Balanced Designs: Type I SS is most straightforward to interpret when the design is balanced (equal sample sizes across groups), because the factors are then orthogonal and the order of terms does not affect the results.
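The order sensitivity described in the list above can be seen with base R alone. This is a sketch on simulated data; the data frames d and b and the helper type1_ss() are hypothetical names introduced for illustration. It fits the same two-factor model in both orders, once on unbalanced data and once on a balanced design.

```r
# Illustrative helper: named vector of Type I SS from an aov() fit.
type1_ss <- function(formula, data) {
  tab <- summary(aov(formula, data = data))[[1]]
  setNames(tab[["Sum Sq"]], trimws(rownames(tab)))
}

set.seed(42)

# Unbalanced data: cell counts are 6, 2, 3 and 9.
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(8, 12))),
  B = factor(c(rep("b1", 6), rep("b2", 2), rep("b1", 3), rep("b2", 9)))
)
d$y <- rnorm(nrow(d)) + (d$A == "a2") + 0.5 * (d$B == "b2")

ss_AB <- type1_ss(y ~ A + B, d)  # A first, then B
ss_BA <- type1_ss(y ~ B + A, d)  # B first, then A

# Balanced data: 5 observations in every A x B cell.
b <- expand.grid(rep = 1:5, A = c("a1", "a2"), B = c("b1", "b2"))
b$y <- rnorm(nrow(b)) + (b$A == "a2")

bal_AB <- type1_ss(y ~ A + B, b)
bal_BA <- type1_ss(y ~ B + A, b)

print(rbind(A_first = ss_AB[c("A", "B")], B_first = ss_BA[c("A", "B")]))
print(rbind(A_first = bal_AB[c("A", "B")], B_first = bal_BA[c("A", "B")]))
```

In the unbalanced case the SS attributed to A and B shifts between the two orderings while their total stays the same; in the balanced case the two rows match.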

For unbalanced designs or models that include interactions, researchers often prefer Type II or Type III SS. Type II SS adjusts each term for all other terms that do not contain it, so a main effect is adjusted for the other main effects but not for its own interactions. Type III SS adjusts each term for all other terms in the model, including interactions that contain it. Neither depends on the order in which the terms are listed. Type III SS, in particular, is often favored for testing main effects in the presence of interactions in unbalanced designs.

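To use Type III SS in R, you would typically fit the model with lm() or aov() and pass it to the Anova() function (capital A) from the car package, which lets you choose the type of sum of squares through its type argument, for example type = 3 or type = "III" (the default is Type II). Note that Type III tests of main effects are only meaningful when the factors are coded with sum-to-zero contrasts such as contr.sum, rather than R's default treatment contrasts.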
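A minimal sketch of that workflow follows. It assumes the car package is installed, and the simulated data frame d and its columns are hypothetical. The contrasts are set per model through the contrasts argument of lm() so that the global options are left alone.

```r
library(car)  # assumption: install.packages("car") has been run

# Hypothetical unbalanced two-factor data.
set.seed(7)
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(8, 12))),
  B = factor(c(rep("b1", 6), rep("b2", 2), rep("b1", 3), rep("b2", 9)))
)
d$y <- rnorm(nrow(d)) + (d$A == "a2") + 0.5 * (d$B == "b2")

# Sum-to-zero contrasts for both factors, needed for sensible Type III tests.
fit <- lm(y ~ A * B, data = d,
          contrasts = list(A = "contr.sum", B = "contr.sum"))

t1 <- anova(fit)            # Type I (sequential), base R
t2 <- Anova(fit, type = 2)  # Type II, car
t3 <- Anova(fit, type = 3)  # Type III, car

print(t1)
print(t2)
print(t3)
```

The row for the highest-order term, A:B, should be the same in all three tables, since it is adjusted for every other term in each case; the main-effect rows are where the three types differ.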
