This week our instruction was to work through an R tutorial for ANOVA, found here: http://www.statmethods.net/stats/anova.html. I am familiar enough with the concept that I decided to branch away from a conventional ANOVA and focus instead on working through a factorial design. I’ve always wanted to spend more time with factorial designs and I feel there is great strength in the visual of the interaction plots that include main effects and interaction effects. My challenge to myself for this post was to get a successful R output for these interaction plots.

On a side note, before I dive into the factorial design, I want to point out that the tutorial opens with “If you have been analyzing ANOVA designs in traditional statistical packages, you are likely to find R’s approach less coherent and user-friendly”. This is not an encouraging thing to read before you begin a tutorial, and it gets at some of the inherent difficulties of R that I noted in my post last week. However, one thing I did notice is that R has a function that produces interaction plots directly. I found this interesting, because in SPSS the plot is a separate output after the analysis is finished. This, then, is the function I wanted to focus on:

`interaction.plot()`

Before I begin, I want to give some background on the factorial design. A factorial design is one that looks at the effect of more than one independent variable (IV). Further, these independent variables usually have more than one level. I’ll be using trusty ‘mtcars’ for this exercise, so I’ll use it for my examples. In this case, one of the IVs is Cylinders, and its levels are 4, 6, and 8. The other IV I will be examining is Gears, which has the levels 3, 4, and 5. Thus, I have a 3 x 3 factorial design, because I have two IVs with 3 levels each. On a quick side note, if I wanted to use car Weight as an IV, I would have had to change the values into categories of low, medium, and high in order to get it to work as a factor.
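To make the 3 x 3 layout concrete, here is a minimal sketch that cross-tabulates the two factors and shows one possible way to bin Weight. The names `cells` and `wt_cat` are my own, and splitting `wt` into three equal-width bins with `cut()` is just an assumption for illustration, not the only sensible choice.

```r
# Cross-tabulate the two factors to see how many cars fall in each cell
cells <- with(mtcars, table(cyl = factor(cyl), gear = factor(gear)))
print(cells)
print(dim(cells))  # one row per cylinder level, one column per gear level

# Hypothetical binning of weight into three equal-width categories
wt_cat <- cut(mtcars$wt, breaks = 3, labels = c("low", "medium", "high"))
print(table(wt_cat))
```

One thing the table makes visible is that the cells are far from equal in size, and the 8-cylinder, 4-gear cell has no cars at all, so this is an unbalanced design.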

The next major part of the factorial design is the main effect. The main effect, simply put, is the effect of one of the IVs on the dependent variable (DV), which in this case will be Miles per Gallon. So I can observe the effect of Cylinders on MPG, and I can observe the effect of Gears on MPG. However, this is not where the factorial design is needed. The factorial design is useful because it shows us interaction effects. An interaction occurs when the effect of one of the IVs on the DV changes depending on the level of the other IV. So in my current example, I want to examine how the effect of Cylinders on MPG depends on the level of Gears (Main Effects and Interactions).
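As a rough illustration of main effects, the sketch below computes the mean MPG at each level of one factor while ignoring the other. The variable names `cyl_means` and `gear_means` are mine, and looking at these marginal means is only a descriptive first step, not a formal test.

```r
# Marginal means of MPG: each factor on its own, ignoring the other
cyl_means  <- with(mtcars, tapply(mpg, factor(cyl), mean))
gear_means <- with(mtcars, tapply(mpg, factor(gear), mean))

print(round(cyl_means, 1))   # mean MPG for 4, 6, and 8 cylinders
print(round(gear_means, 1))  # mean MPG for 3, 4, and 5 gears
```

If the means differ across the levels of a factor, that hints at a main effect of that factor. An interaction is a different question: whether the pattern across cylinders changes from one gear level to the next.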

The code is very similar to the one at the end of the tutorial. I left out some of the graphical arguments because I was unfamiliar with them and wanted to see what the default output looked like. The code and output are below:

```r
View(mtcars)
attach(mtcars)          # attach the data so variables can be used directly instead of as data$cyl, etc.
gear <- factor(gear)    # set gear as a factor
cyl <- factor(cyl)      # set cyl as a factor
interaction.plot(cyl, gear, mpg)  # create the interaction plot
```
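For comparison, here is a sketch of what the same plot might look like with some of the graphical arguments filled in. The labels, colours, and plotting symbols are my own choices, and `cars` is just a hypothetical copy of the data with the two variables converted to factors, which avoids needing `attach()`.

```r
# Work on a copy with cyl and gear converted to factors
cars <- transform(mtcars, cyl = factor(cyl), gear = factor(gear))

with(cars, interaction.plot(
  x.factor = cyl, trace.factor = gear, response = mpg,
  type = "b",                        # draw both points and lines
  pch = c(15, 16, 17),               # one plotting symbol per gear level
  col = c("black", "red", "blue"),   # one colour per gear level
  lwd = 2,
  xlab = "Number of cylinders", ylab = "Mean miles per gallon",
  trace.label = "Gears", main = "Interaction plot: cyl x gear on mpg"
))
```

Each line traces the mean MPG for one gear level across the cylinder levels, so the legend and axis labels make it easier to see which line is which.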

How do we interpret this, then? In general, if the lines do not run parallel (crossing lines are the most obvious case), the plot suggests an interaction effect. In this case, the lines are clearly not parallel, which points to an interaction between Cylinders and Gears on MPG. MPG drops drastically as the number of cylinders increases, and how it drops depends on the number of gears.
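The plot is only a visual check, so a natural follow-up is to fit the two-way model with an interaction term. This is a minimal sketch using `aov()`; the names `cars` and `fit` are mine, and I am assuming the default sequential sums of squares are acceptable for a first look.

```r
# Two-way ANOVA with both main effects and their interaction
cars <- transform(mtcars, cyl = factor(cyl), gear = factor(gear))
fit <- aov(mpg ~ cyl * gear, data = cars)  # cyl * gear expands to cyl + gear + cyl:gear
print(summary(fit))
```

The `cyl:gear` row of the summary is the test of the interaction. Because the design is unbalanced, the sequential sums of squares depend on the order of the terms in the formula, so the results should be read with some caution.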

I won’t pretend to understand the intricacies of the interaction. I think I would need the actual data values to understand that better, but I can say at the end of this that I appreciate the ease with which you can create an interaction plot in R. These plots give you an easy, direct visual way to observe interactions.
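For anyone who does want the actual values behind the plot, a small sketch like the one below should recover them. It uses `tapply()` to compute the mean MPG for every combination of cylinders and gears, which is what `interaction.plot()` draws by default; `cell_means` is my own name.

```r
# Mean MPG for every cylinder x gear combination (the points on the plot)
cell_means <- with(mtcars, tapply(mpg, list(cyl = cyl, gear = gear), mean))
print(round(cell_means, 1))
```

Any combination with no cars shows up as `NA`, which explains any gap in the corresponding line on the plot.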

References