Flash back to introductory statistics: ANOVA

Maya Toteva
Published in Human Systems Data
3 min read · Apr 5, 2017

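ANOVA (analysis of variance) is one of the most frequently used statistical analyses in research. It compares the means of three or more independent groups in a single test, rather than running a separate t-test for every pair of groups. Running multiple t-tests creates the multiple comparisons problem: each test carries its own chance of a false positive (Type I error), so the probability of at least one false positive across all the tests grows well beyond the nominal 5% level. ANOVA avoids this by comparing the variability between the group means with the variability within the groups.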
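To make the multiple comparisons problem concrete, here is a minimal sketch of how the familywise error rate grows when every pair of groups gets its own t-test. It assumes the tests are independent, which is only an approximation for pairwise tests on the same groups, and the variable names are illustrative.

```r
# Familywise error rate when every pair of groups gets its own t-test.
# ASSUMPTION: the tests are treated as independent, which is only approximate.
alpha   <- 0.05                     # per-test significance level
k       <- 3                        # number of groups
n_tests <- choose(k, 2)             # number of pairwise comparisons
fwer    <- 1 - (1 - alpha)^n_tests  # chance of at least one false positive

print(n_tests)
print(round(fwer, 4))
```

With three groups there are three pairwise tests, and under this approximation the chance of at least one false positive is roughly 14% rather than 5%.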

ANOVA and regression share common traits. Both explore the variability of the dependent variable (DV) as a result of factors such as treatment or potential confounding effects. The difference is that ANOVA presents the effect of a categorical treatment on the DV, whereas regression presents the relationship between the DV and a number of predictors. ANOVA tests whether the means of independent groups differ, but it does not tell us which groups differ from each other. Its main purpose is to test the null hypothesis that all group means are equal, H0: µ1 = µ2 = µ3 = … = µk. The test also assumes that the observations are independent, that the populations from which the samples are drawn are normally distributed, and that the groups have equal variances.
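The normality and equal-variance assumptions can be checked before trusting the test. The sketch below uses simulated, hypothetical data (not the data-set analyzed in this post) and two standard checks from base R, shapiro.test and bartlett.test. The data frame and column names are made up for illustration.

```r
# Hypothetical data: three groups of six simulated scores (not the post's data).
set.seed(42)
sim <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 6)),
  score = c(rnorm(6, mean = 30, sd = 5),
            rnorm(6, mean = 27, sd = 5),
            rnorm(6, mean = 20, sd = 5))
)

# Residuals of a one-way model are the deviations from each group's mean.
sim$resid <- sim$score - ave(sim$score, sim$group)

# Shapiro-Wilk test: a small p-value suggests the residuals are not normal.
normality <- shapiro.test(sim$resid)

# Bartlett test: a small p-value suggests the group variances differ.
homogeneity <- bartlett.test(score ~ group, data = sim)

print(normality$p.value)
print(homogeneity$p.value)
```

With samples this small, both tests have little power, so a plot of the residuals is a useful complement.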

One-way ANOVA

There are several different uses for ANOVA. The most basic of them is one-way ANOVA. One-way means that we examine only one independent variable (IV) with multiple levels. I will use the data from the “Psych” library in “R” to provide a detailed explanation of the statistical analysis. The data-set contains 18 data points, one for each participant. The participants have been assigned to three different groups, A, B, and C (the levels of the IV), according to the dose of medication administered. The alertness of each participant is measured as a function of the medication dose. Using RStudio, I will fit a model of the group means. To run the test I used the model template provided in Quick-R: fit <- aov(y ~ A, data = mydataframe), where “y” is the DV, the “~” sign reads as “as a function of”, and “A” is the IV.

anovatreatment = aov(Alertness~Dosage,data=dosage)
summary(anovatreatment)

The results show a significant effect of dosage on alertness, F(2, 15) = 8.789, p = 0.00298:

            Df Sum Sq Mean Sq F value  Pr(>F)
Dosage       2  426.2  213.12   8.789 0.00298 **
Residuals   15  363.8   24.25
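As a quick check on how the table fits together, the F value is the ratio of the two mean squares, and the p-value is the upper tail of the F distribution with 2 and 15 degrees of freedom. The sketch below recomputes both from the rounded numbers in the table, so the results should match the printed values only up to rounding.

```r
# Values copied from the ANOVA table above (already rounded by R).
ms_dosage <- 213.12   # mean square for Dosage (between groups)
ms_resid  <- 24.25    # mean square for Residuals (within groups)
df_dosage <- 2
df_resid  <- 15

# F is the between-group mean square divided by the within-group mean square.
f_value <- ms_dosage / ms_resid

# The p-value is the probability of an F at least this large under H0.
p_value <- pf(f_value, df_dosage, df_resid, lower.tail = FALSE)

print(f_value)
print(p_value)
```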

One-way repeated measures ANOVA

This test is helpful when researchers measure the same participants several times during the treatment. I will refrain from running it in R here, because other posts have already analyzed the same data-set I found. Instead, I will follow up on the one-way ANOVA with a post hoc test. As I mentioned above, the analysis of variance tells us that the group means differ, but not which groups differ. To find out, we perform Tukey's HSD (honestly significant difference) test, which compares every pair of groups while controlling the familywise error rate. The test relies on the same assumptions as the ANOVA, so it is a natural follow-up once the ANOVA has rejected H0. Here is what I found about the groups:

TukeyHSD(anovatreatment)$Dosage
      diff       lwr       upr     p adj
b-a  -4.25 -11.15796  2.657961 0.2768132
c-a -13.25 -21.50659 -4.993408 0.0022342
c-b  -9.00 -16.83289 -1.167109 0.0237003

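The results show that the largest difference is between groups A and C (difference = -13.25, p = 0.0022), followed by C and B (difference = -9.00, p = 0.024). In both cases the 95% confidence interval excludes zero, so the differences are statistically significant at the 5% level. The difference between A and B (-4.25, p = 0.277) is not significant, and its confidence interval includes zero. Note that a p-value is the probability of seeing a difference at least this large if the null hypothesis were true; it is not the share of variance left unexplained by the treatment. The Tukey test uses pairwise comparisons to provide, for each pair of groups, the difference in means, the 95% confidence interval, and the adjusted p-value. Performing a Tukey test is an important step in the process of hypothesis testing.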
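For readers curious where the interval and adjusted p-value come from, here is a rough sketch that rebuilds the b-a row from the residual mean square and the studentized range distribution (qtukey and ptukey in base R). The group sizes are not stated in this post; the values 6 and 8 below are an assumption, inferred because they reproduce the reported interval widths.

```r
# Values copied from the outputs above.
ms_resid <- 24.25   # residual mean square from the ANOVA table
df_resid <- 15      # residual degrees of freedom
diff_ba  <- -4.25   # mean difference b - a from the Tukey table

# ASSUMPTION: group sizes are not given in the post; 6 and 8 are inferred
# values that happen to reproduce the reported interval.
n_a <- 6
n_b <- 8

# Tukey-Kramer standard error and studentized-range critical value (3 groups).
se   <- sqrt(ms_resid / 2 * (1 / n_a + 1 / n_b))
crit <- qtukey(0.95, nmeans = 3, df = df_resid)

# 95% confidence interval and adjusted p-value for the b - a difference.
lwr   <- diff_ba - crit * se
upr   <- diff_ba + crit * se
p_adj <- ptukey(abs(diff_ba) / se, nmeans = 3, df = df_resid,
                lower.tail = FALSE)

print(c(lwr = lwr, upr = upr, p_adj = p_adj))
```

Under that assumption the result should land close to the b-a row of the table, with small differences due to the rounded mean square.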

References

ANOVA http://www.statmethods.net/stats/anova.html

ANOVA Test: Definition, Types, Examples http://www.statisticshowto.com/anova/

An introduction to ANOVA https://www.datacamp.com/courses/analysis-of-variance-anova

Factorial Between Subjects ANOVA https://ww2.coastal.edu/kingw/statistics/R-tutorials/factorial.html
