Introduction to ANOVA

shyambhu Mukherjee
statistical programming
3 min read · Jul 17, 2022

Written By: Catherine Shendre

Why do we need ANOVA?

Let’s assume we want to compare the efficacy of three different drugs, X, Y, and Z. We could apply a t-test to each of the pairs X-Y, X-Z, and Y-Z to see whether their mean effects differ.

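But as the number of groups increases, this becomes harder to manage: k groups require k(k-1)/2 pairwise tests, and every additional test raises the chance of a false positive. In this situation, we choose ANOVA.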
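
To make the scaling concrete, the sketch below counts the pairwise t-tests needed for k groups and gives a rough estimate of the chance of at least one false positive across them. The 5% significance level and the independence of the tests are simplifying assumptions made for illustration (pairwise tests that share a group are not truly independent), so the second number is only an approximation.

```python
from math import comb

def pairwise_tests(k: int) -> int:
    """Number of distinct pairs among k groups: k * (k - 1) / 2."""
    return comb(k, 2)

def familywise_error(k: int, alpha: float = 0.05) -> float:
    """Approximate chance of at least one false positive across all pairwise
    tests, assuming (for illustration only) that the tests are independent."""
    return 1 - (1 - alpha) ** pairwise_tests(k)

# Columns: number of groups, number of pairwise tests, approximate error rate.
for k in (3, 5, 10):
    print(k, pairwise_tests(k), round(familywise_error(k), 3))
```

With three drugs there are only 3 tests, but with ten groups there are already 45, which is why a single overall test becomes attractive.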

ANOVA, or analysis of variance, is a statistical method developed by Ronald Fisher that tests whether the mean of a dependent variable differs across the groups defined by one or more independent variables (factors).

Different types of ANOVA:

ANOVAs are frequently applied in one of three ways:

One-way ANOVA: Suppose an automobile company wishes to compare the average petrol consumption of three similar models of bike and has six vehicles of each model available. This gives three independent samples, each of size six. One-way ANOVA is the appropriate method here, because it tests for differences between the three models using a single factor (the bike model).

Two-way ANOVA: Assume that an examination was conducted for boys and girls in the age groups of 10, 12, and 14, and we want to identify which factors (gender, age group, or both) have an impact on the score. In this case, two-way ANOVA is an appropriate method since there are two factors, age and gender.

N-way ANOVA: N-way ANOVA is a generalization of two-way ANOVA, where N is the number of independent variables. You can use it to determine whether the means in a set of data differ when grouped by multiple factors.
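
To show how two factors are separated in the examination example, here is a rough sketch of the sums-of-squares decomposition for a balanced two-way design (the same number of observations in every gender-by-age cell), in plain Python. The scores are invented for illustration, and the helper name `two_way_anova` is hypothetical; unbalanced designs need a more careful treatment than this.

```python
from statistics import mean

def two_way_anova(cells):
    """cells[i][j] holds the observations for level i of factor A and
    level j of factor B. Assumes a balanced design (equal cell sizes)."""
    a, b = len(cells), len(cells[0])
    n = len(cells[0][0])  # observations per cell
    grand = mean([x for row in cells for cell in row for x in cell])
    mean_a = [mean([x for cell in row for x in cell]) for row in cells]
    mean_b = [mean([x for row in cells for x in row[j]]) for j in range(b)]
    mean_cell = [[mean(cell) for cell in row] for row in cells]

    # Main effects: spread of each factor's level means around the grand mean.
    ss_a = b * n * sum((m - grand) ** 2 for m in mean_a)
    ss_b = a * n * sum((m - grand) ** 2 for m in mean_b)
    # Interaction: what the cell means show beyond the two main effects.
    ss_ab = n * sum((mean_cell[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    # Error: spread of observations around their own cell mean.
    ss_err = sum((x - mean_cell[i][j]) ** 2
                 for i in range(a) for j in range(b) for x in cells[i][j])

    ms_err = ss_err / (a * b * (n - 1))
    return {
        "F_A": (ss_a / (a - 1)) / ms_err,
        "F_B": (ss_b / (b - 1)) / ms_err,
        "F_AB": (ss_ab / ((a - 1) * (b - 1))) / ms_err,
        "SS": (ss_a, ss_b, ss_ab, ss_err),
    }

# Made-up exam scores: rows = gender (boys, girls), columns = age (10, 12, 14).
scores = [
    [[60, 62, 64], [66, 68, 70], [72, 74, 76]],
    [[62, 64, 66], [68, 70, 72], [74, 76, 78]],
]
result = two_way_anova(scores)
print({name: value for name, value in result.items() if name != "SS"})
```

Each factor gets its own F-ratio, plus one for the interaction between them. In this made-up data the interaction term is zero by construction, because the gap between boys and girls is the same at every age.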

Hypothesis Formation:

Consider again the automobile company comparing the average petrol consumption of three models of bike, with six vehicles of each model. We assume that petrol consumption is normally distributed within each model, with the same variance. There is one factor (the bike model) with three levels and an equal sample size of six per level; petrol consumption is the dependent variable. The null hypothesis (H0) and alternative/research hypothesis (Ha) will be

H0: µ1 = µ2 = µ3, i.e. the mean petrol consumption of the three bike models does not differ

Ha: µi ≠ µj for at least one pair i ≠ j (i, j = 1, 2, 3), i.e. the mean petrol consumption of at least one bike model differs from the others

F-ratio:

The F-ratio is defined as the ratio of two mean square values.

The calculated F-ratio is compared to a critical value from an F-distribution table to determine whether there are differences between the groups. Thus, you can use the F-ratio in an ANOVA to decide whether to reject or fail to reject the null hypothesis.
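
As a rough illustration of that comparison, the sketch below covers only the special case of three groups. There the between-group degrees of freedom equal 2, and the upper-tail probability of the F distribution has the closed form (1 + 2F/d2)^(-d2/2), with d2 the within-group degrees of freedom. The function names and the 5% significance level are assumptions for illustration; for other group counts a statistical table or library would be needed.

```python
def p_value_three_groups(f_ratio: float, df_within: int) -> float:
    """Upper-tail probability P(F > f_ratio) for an F distribution with
    (2, df_within) degrees of freedom. Valid only for three groups."""
    return (1 + 2 * f_ratio / df_within) ** (-df_within / 2)

def critical_value_three_groups(df_within: int, alpha: float = 0.05) -> float:
    """F value whose upper-tail probability equals alpha (three groups only)."""
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)

def decide(f_ratio: float, df_within: int, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject H0 when F exceeds the critical value."""
    if f_ratio > critical_value_three_groups(df_within, alpha):
        return "reject H0"
    return "fail to reject H0"

# Three bike models with six vehicles each: df_within = 18 - 3 = 15.
df_within = 15
print(round(critical_value_three_groups(df_within), 2))  # 3.68
print(decide(5.0, df_within))  # reject H0
print(decide(1.0, df_within))  # fail to reject H0
```

The value 3.68 agrees with the entry for (2, 15) degrees of freedom in a standard 5% F table.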

Formula:

F = MSB / MSW

Where,

F = F-ratio (the ANOVA test statistic)

MSB = Mean sum of squares between the groups

MSW = Mean sum of squares within the groups
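
A minimal sketch of this calculation for the bike example, using only the Python standard library. The consumption figures are made-up numbers in arbitrary units, chosen for illustration rather than measured, and the function name `f_ratio` is hypothetical.

```python
from statistics import mean

def f_ratio(groups):
    """Return (MSB, MSW, F) for a one-way ANOVA on a list of samples."""
    k = len(groups)                        # number of groups
    n_total = sum(len(g) for g in groups)  # total number of observations
    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]

    # Between groups: spread of the group means around the grand mean.
    ssb = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # Within groups: spread of observations around their own group mean.
    ssw = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g)

    msb = ssb / (k - 1)        # between-group degrees of freedom: k - 1
    msw = ssw / (n_total - k)  # within-group degrees of freedom: N - k
    return msb, msw, msb / msw

# Hypothetical petrol consumption, six vehicles per bike model.
model_a = [48, 50, 52, 49, 51, 50]
model_b = [52, 54, 53, 55, 53, 51]
model_c = [47, 46, 49, 48, 47, 45]

msb, msw, f = f_ratio([model_a, model_b, model_c])
print(f"MSB={msb:.2f}  MSW={msw:.2f}  F={f:.2f}")
# MSB=54.00  MSW=2.00  F=27.00
```

Here the group means (50, 53, and 47) are far apart relative to the spread inside each group, so the F-ratio is large.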

Key takeaways:

● ANOVA is used to test whether group means differ across one or more factors, and it is applied in a wide range of fields, including finance and the financial markets.

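● The basic assumptions of ANOVA are independence of observations, normally distributed data within each group, and homogeneity of variance. Equal sample sizes are not strictly required, but they make the test more robust to unequal variances.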

● If there is no true difference between the group means, the ANOVA’s F-ratio should be close to 1.
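
The last takeaway can be checked with a small simulation, sketched here under stated assumptions: three groups of six values are drawn from the same normal distribution, so the null hypothesis is true by construction, and the F-ratio is averaged over many repetitions. The mean of 50, standard deviation of 2, seed, and repetition count are arbitrary choices.

```python
import random
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F-ratio: mean square between / mean square within."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]
    ssb = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    ssw = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g)
    return (ssb / (k - 1)) / (ssw / (n_total - k))

def simulate_null(repetitions=2000, seed=0):
    """F-ratios for repeated samples in which all groups share one mean."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(repetitions):
        groups = [[rng.gauss(50, 2) for _ in range(6)] for _ in range(3)]
        ratios.append(f_ratio(groups))
    return ratios

ratios = simulate_null()
print(round(mean(ratios), 2))
```

With these degrees of freedom (2 and 15), the theoretical average of F under the null hypothesis is 15/13, roughly 1.15, so the simulated average should land close to 1 rather than exactly on it.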

Sources to learn more about ANOVA:

(1) https://en.wikipedia.org/wiki/Analysis_of_variance

(2) https://www.qualtrics.com/au/experience-management/research/anova/

