Comparison of R and SPSS: ANOVA

Published in Human Systems Data, Apr 4, 2017

Hey Folks,

If you read my last post on an Attempt at Multiple Regression in R, you may be aware that I recently collected data for my thesis. Since that post, I have spent a substantial amount of time analyzing the data.

For this week’s post, I will be talking about analysis of variance, or ANOVA, using my thesis data and examples from R and SPSS.

In a nutshell, my study looked at differences in the way people pay attention to two information sources when those sources are at different distances from each other. In the study, I had 12 participants respond to events that would appear on one or two monitors. For the different distance conditions, I had participants perform the same task at 5 different monitor distances ranging from 22.5 degrees of visual separation (from the participant’s point of view) all the way to 112.5 degrees. There was also an independent variable of “relatedness” with two levels. The main dependent variable for this experiment was response time, or RT, in milliseconds. For each type of event, average reaction times were compiled for each combination of the independent variables.

One way to compare how participant reaction time might differ between these distance and relatedness conditions is an ANOVA. ANOVA is a statistical test that compares the means of different groups to each other to see if the differences between them are statistically supported. This is one of the forms of hypothesis testing that is disparaged in the readings I mentioned in my post: The Process is Failing the Goal. While ANOVA has weaknesses and has contributed to issues with p-values and the replication crisis, it can still be useful if it is used as a supplement to other sorts of data analysis.

Because there were 5 distance conditions and 2 relatedness conditions (all within the same subjects), the type of ANOVA I used is called a 5x2 repeated measures factorial ANOVA. This analysis lets me compare the impact of distance and relatedness on reaction time and see whether there is an interaction, that is, whether distance affects reaction time differently between the relatedness conditions.
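To make the design concrete, here is a minimal sketch of the 5x2 within-subjects layout in R. The evenly spaced distance values and the level names "related" and "independent" are assumptions for illustration; the post only gives the two endpoints (22.5 and 112.5 degrees) and the number of levels.

```r
# Sketch of the 5 x 2 repeated measures design (illustrative values only).
# Assumption: the five distances are evenly spaced between 22.5 and 112.5 degrees.
design <- expand.grid(
  PID         = factor(1:12),                          # 12 participants
  Distance    = factor(seq(22.5, 112.5, by = 22.5)),   # 5 distance levels
  Relatedness = factor(c("related", "independent"))    # 2 relatedness levels
)

nrow(design)                                  # 12 * 5 * 2 = 120 cell means in total
table(design$Distance, design$Relatedness)    # every cell holds all 12 participants
```

Because every participant appears in every cell, both factors are "within subjects", which is what makes this a repeated measures design.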

When I finally had my data collected, I had a decision to make. I have used SPSS, a statistical analysis package from IBM, to analyze data for almost 10 years now. However, this semester I have been exposed to R and really like some of the graphics capabilities it provides. I decided to use both separately and compare the results.

Funny looking data

Ok, so when my data came out of the program I wrote on Inquisit by Millisecond, I had raw reaction time observations in “long” format. This means that each row of data held one reaction time observation. The columns included all of the other variables, such as participant ID, type of task, etc. The issue was that SPSS needed compiled means for each combination of the independent variables to be able to analyze my data. This meant that I needed one row for each participant and a separate RT column for each distance and relatedness combination (10 columns). This process required hours of copying and pasting into an Excel formula. Next, to put this data in R, I needed it in long format again. While I probably could have used tidyr or dplyr (R packages) to make it into long form, I ended up just transposing the means in Excel so that each row represented one independent variable combination and each column represented a variable (relatedness, distance, RT).
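For anyone who wants to skip the copy-and-paste step, the same two reshaping steps can be sketched in base R. This is only an illustration on simulated data: the number of trials per cell and the RT values are made up, and the column names simply follow the variables named in the post.

```r
# Sketch: raw long-format trials -> cell means -> wide layout for SPSS.
# The data here are simulated; nothing below comes from the thesis data set.
set.seed(1)
raw <- expand.grid(
  PID         = 1:12,
  Distance    = seq(22.5, 112.5, by = 22.5),
  Relatedness = c("related", "independent"),
  Trial       = 1:8                      # assumed number of trials per cell
)
raw$RT <- round(rnorm(nrow(raw), mean = 600, sd = 80))

# Step 1: one mean RT per participant and condition (still long format, ready for R)
cell_means <- aggregate(RT ~ PID + Distance + Relatedness, data = raw, FUN = mean)

# Step 2: one row per participant and one RT column per condition (wide, for SPSS)
cell_means$Condition <- paste(cell_means$Relatedness, cell_means$Distance, sep = "_")
wide <- reshape(cell_means[, c("PID", "Condition", "RT")],
                idvar = "PID", timevar = "Condition", direction = "wide")

dim(cell_means)   # 120 rows: 12 participants x 10 conditions
dim(wide)         # 12 rows, 11 columns (PID plus 10 RT columns)
```

The tidyr function pivot_wider() would do the same job as reshape() here; base R is used only to keep the sketch free of extra packages.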

I think in the future, I will be making decisions about what analysis software to use based on what the data looks like to begin with.

Now for the ANOVA in SPSS

Once the data was appropriately organized, I put it into SPSS and began to run the analysis. This involved a lot of mouse clicking to select the variables I wanted to analyze and select the tests I would like SPSS to run. The only reason I know what tests to run at this point is that I have used the program for years and know what to expect. The process is not pretty as you can see in the image below:

These are only a few of the options you need to consider for a repeated measures ANOVA.

Once you hit OK, you get an output window with a whole bunch of information arranged in tables and graphs. However, if you know what you are looking for, reading the output is pretty easy. Again, it doesn’t look pretty but it works:

So what does this all mean? The first column shows which comparison you are looking at (distance on reaction time, relatedness on reaction time, and the interaction of distance and relatedness on reaction time). In the next column, you can see that SPSS has run a number of different ANOVA tests on the data. We will be looking at Sphericity Assumed.

Now, there is a lot of useful information here, including estimates of effect size, observed power, and F values. The “Sig” column shows the p-value, or the probability of finding results at least this extreme if the independent variable in that comparison had no effect on the dependent variable. In psychology, a p-value below .05 is generally taken to indicate that an effect is statistically supported.
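As a rough illustration of where a “Sig” value comes from, the sketch below turns an F statistic into a p-value using the F distribution. The F value is invented for the example and is not a result from my study; the degrees of freedom are the ones a 5-level within-subjects factor with 12 participants would have when sphericity is assumed.

```r
# Sketch: how a "Sig" value follows from an F statistic and its degrees of freedom.
# The F value below is made up for illustration; it is not a result from the study.
n_participants <- 12
n_levels       <- 5                                     # distance levels

df_effect <- n_levels - 1                               # 4
df_error  <- (n_levels - 1) * (n_participants - 1)      # 44, sphericity assumed

f_value <- 3.0                                          # hypothetical F statistic
p_value <- pf(f_value, df_effect, df_error, lower.tail = FALSE)

p_value          # probability of an F at least this large if there were no true effect
p_value < .05    # compared against the conventional .05 threshold
```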

In this case, it looks like there were changes in RT based on the main effect of distance as well as changes in RT based on task-relatedness. In all, the table that is output is in the wrong format for an APA-style paper, but all of the values are easy to copy and paste into Excel or Word for editing. This is a big plus.

Another great feature is that with a measly three mouse clicks I was able to get the output to show me all the pairwise comparisons beyond the main ANOVA. This means that SPSS runs a separate comparison for each pair of levels of a variable to see how they relate to one another. For example, I was able to see the comparison of distance 1 with distance 3 versus distance 1 with distance 5.

Visualizing the Comparison in SPSS

When comparing means, it is often useful to see what the difference between means looks like instead of just seeing the output. For this comparison, I had SPSS make me a plot of the means with the different distances on the X axis and a separate line for each relatedness condition. Here is what it looked like:

Now, in one of my previous posts, I talked about bad graphics. This plot, in my opinion, is a great example of a terrible graphic. For one thing, there are no units for any of the 3 conditions. Also, while visual angle distance is a continuous measure, there was no data collected in between the different distances, so a line graph is very misleading. This plot is useful only to me as the experimenter as it gives me a glimpse of what the data looked like.

Now for the ANOVA in R

Now let’s compare this process and its outputs using the pirate’s favorite statistical analysis software: R. This is a great tutorial on how to perform ANOVA in R if you are interested in trying it for yourself.

My ANOVA will be a little different from the tutorial for a number of reasons. First, I had already organized my data into long form, and for the factorial ANOVA from the tutorial to work, I would have had to make it wide again. Also, the main reason I wanted to use R was not to perform the ANOVA again, but to use ggplot2 (a great graphics package in R) to visualize my data.

Here is the basic code I used to run the 5 x 2 repeated measures ANOVA in R:

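#Makes a repeated measures factorial ANOVA
#Distance and Relatedness are both within-subject factors nested in participant (PID)
Fit <- with(data, aov(RT ~ Distance * Relatedness +
            Error(PID / (Distance * Relatedness))))
#Prints one ANOVA table per error stratum
summary(Fit)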

This means that I created an ANOVA model, “Fit”, from the “data” dataframe using the aov() function. RT was reaction time, Distance was the monitor distance, and Relatedness was the relatedness condition. PID was the participant number; the Error() term tells aov() that Distance and Relatedness were both measured within each participant, so that each effect is tested against how much that effect varies from participant to participant.
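To show what this model produces, here is a self-contained sketch that fits the same formula to simulated cell means. Everything about the data (effect sizes, noise, level names) is an assumption for illustration, and fit_sim is a stand-in name rather than the object from my analysis.

```r
# Sketch: the same model formula on simulated cell means (not the thesis data).
set.seed(42)
sim <- expand.grid(
  PID         = factor(1:12),
  Distance    = factor(seq(22.5, 112.5, by = 22.5)),
  Relatedness = factor(c("related", "independent"))
)
# Simulated RT: slower at larger distances, plus a participant offset and noise
sim$RT <- 500 + 20 * as.integer(sim$Distance) +
  rnorm(12, sd = 30)[as.integer(sim$PID)] + rnorm(nrow(sim), sd = 25)

fit_sim <- aov(RT ~ Distance * Relatedness +
                 Error(PID / (Distance * Relatedness)), data = sim)

class(fit_sim)   # includes "aovlist": the fit is split into error strata
names(fit_sim)   # the strata, e.g. "PID:Distance" holds the test of Distance
summary(fit_sim) # one ANOVA table per stratum
```

Each within-subject effect gets its own stratum and its own residual term, which is why the printed summary is spread over several small tables instead of one.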

Here was the output (compare to within subjects effects):

I almost didn’t include this output in the blog post because of how confusing and long it is, but I think that it makes a point. When I saw this output, I decided not to use R to do factorial ANOVAs any more. One strange thing is that the p-values, or Pr(>F), are very different from those I got from SPSS. I’m not sure why this is, but I assume it is because aov() uses a slightly different type of ANOVA test to come up with the model.
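One possible cause worth ruling out, and this is a guess rather than something confirmed from my data, is how the variables are coded. If Distance (or PID) is stored as a number instead of a factor, aov() treats it as a continuous predictor and the degrees of freedom change, which changes the p-values. The sketch below uses simulated data and a deliberately simplified one-way model just to isolate that effect; other differences between the two programs, such as sphericity corrections, are not covered here.

```r
# Sketch: numeric vs. factor coding of Distance (simulated data, simplified model).
set.seed(7)
sim <- expand.grid(PID = 1:12,
                   Distance = seq(22.5, 112.5, by = 22.5),
                   Relatedness = c("related", "independent"))
sim$RT <- 500 + 0.8 * sim$Distance + rnorm(nrow(sim), sd = 40)

is.factor(sim$Distance)   # FALSE: Distance is stored as a number here

# Distance as a number: aov() fits a single slope, so the effect has 1 df
df_numeric <- summary(aov(RT ~ Distance, data = sim))[[1]][["Df"]]

# Distance as a factor: aov() compares the 5 condition means, so the effect has 4 df
sim$Distance <- factor(sim$Distance)
df_factor <- summary(aov(RT ~ Distance, data = sim))[[1]][["Df"]]

df_numeric   # 1 and 118 (effect, residuals)
df_factor    # 4 and 115 (effect, residuals)
```

A quick check on real data is str(data): if PID, Distance, or Relatedness show up as num or int, converting them with factor() before calling aov() is a reasonable first thing to try.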

One of the main problems with this output is that it is not organized very well visually, so you really need to search for the comparisons you are interested in. It also makes it very difficult to organize your results into a table for a paper. You either have to copy and paste each number individually or transcribe them if you are fast at 10 key.
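The copying problem can be reduced by pulling the numbers out of the summary object instead of the printed text. The following is a sketch on simulated data, assuming the same model formula as above; fit_sim, anova_table, and the output file name are illustrative names only.

```r
# Sketch: collect the multi-stratum aov() output into one data frame (simulated data).
set.seed(3)
sim <- expand.grid(PID = factor(1:12),
                   Distance = factor(seq(22.5, 112.5, by = 22.5)),
                   Relatedness = factor(c("related", "independent")))
sim$RT <- 550 + 15 * as.integer(sim$Distance) + rnorm(nrow(sim), sd = 40)
fit_sim <- aov(RT ~ Distance * Relatedness +
                 Error(PID / (Distance * Relatedness)), data = sim)

strata <- summary(fit_sim)                 # list with one ANOVA table per error stratum
rows <- lapply(names(strata), function(s) {
  tab <- as.data.frame(strata[[s]][[1]])   # Df, Sum Sq, Mean Sq, F value, Pr(>F)
  if (!"Pr(>F)" %in% names(tab)) return(NULL)  # skip strata with residuals only (PID)
  data.frame(Stratum = s, Term = trimws(rownames(tab)), tab,
             row.names = NULL, check.names = FALSE)
})
anova_table <- do.call(rbind, Filter(Negate(is.null), rows))
anova_table                                # each effect followed by its residual row

# Write the table to a file that opens directly in Excel
out_file <- file.path(tempdir(), "anova_table.csv")
write.csv(anova_table, out_file, row.names = FALSE)
```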

Another problem is that to get pairwise comparisons, I would have to split the dataframe by variables and run separate tests for each comparison. The tutorial mentions you can use the TukeyHSD() function to do the pairwise comparisons based on the output of the aov() function, but this didn’t work when I tried it. I first suspected this was because my data was in long form, but the more likely reason is that aov() with an Error() term returns a multi-stratum “aovlist” object, which TukeyHSD() does not accept.
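One way to get pairwise comparisons without splitting the dataframe by hand is the pairwise.t.test() function that ships with R. The sketch below runs Bonferroni-adjusted paired t-tests between distance levels on simulated data. It is meant as a rough counterpart to the SPSS pairwise comparisons, not a guaranteed match, and the choice of Bonferroni is an assumption.

```r
# Sketch: paired pairwise comparisons between distance levels (simulated data).
set.seed(11)
sim <- expand.grid(PID = factor(1:12),
                   Distance = factor(seq(22.5, 112.5, by = 22.5)),
                   Relatedness = factor(c("related", "independent")))
sim$RT <- 550 + 15 * as.integer(sim$Distance) + rnorm(nrow(sim), sd = 40)

# Average over Relatedness so each participant has one RT per distance
by_distance <- aggregate(RT ~ PID + Distance, data = sim, FUN = mean)
# Paired tests match observations by position, so sort by participant within distance
by_distance <- by_distance[order(by_distance$Distance, by_distance$PID), ]

pairs <- pairwise.t.test(by_distance$RT, by_distance$Distance,
                         paired = TRUE, p.adjust.method = "bonferroni")
pairs$p.value   # lower-triangular matrix of adjusted p-values, one per pair of levels
```

With 5 distance levels there are 10 pairs, so the matrix has 10 filled cells; the sorting step matters because a paired test silently assumes the rows of each group are in the same participant order.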

Visualizing the Comparison in R

Ok, so the ANOVA output in R was a little cumbersome, but how does it work for visualizing the comparison?

It takes a little familiarity with ggplot2 to make good graphics in R. Here is the code I used to plot the comparison:

#Mean comparison Distance x Relatedness on RT
library(ggplot2)
#dataRT holds the per-condition means and standard errors from summarySE()
ggplot(data = dataRT, aes(x = Distance, y = RT, color = Relatedness)) +
  geom_errorbar(aes(ymin = RT - se, ymax = RT + se), width = .1) +
  geom_point() +
  scale_y_continuous(name = "Reaction Time (Milliseconds)") +
  scale_x_discrete(name = "Visual Angle Distance (Degrees)") + #Distance must be a factor
  theme_bw() +
  ggtitle("Task Relatedness and Visual Angle Distance on Reaction Time") +
  labs(color = "Task Relatedness") + #legend title; the mapped aesthetic is color, not fill
  theme(plot.title = element_text(hjust = .5)) + #centered title!!!
  theme(text = element_text(family = "Times New Roman", size = 12))

For this particular graphic, I needed to follow the style guide for my thesis. This meant that I needed to have the text be in Times New Roman. With a package called extrafont and a little code to import the built-in Windows fonts into R, it worked great!

I also found a really cool function called summarySE() (available, for example, in the Rmisc package). This function makes a dataframe of descriptive statistics for a data set based on the grouping variables and dependent variable of your choosing. It provides the count, mean, standard deviation, standard error, and confidence interval for each group. In this case, it let me put error bars on my graphs.

dataRT <- summarySE(data, measurevar="RT", groupvars=c("Relatedness","Distance"))
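For readers curious about what summarySE() is doing, here is a base R sketch of the same kind of summary on simulated data. The helper summarise_rt() and the object dataRT_sketch are hypothetical names made up for this example, and the column names are chosen to mirror the ones used in the plotting code.

```r
# Sketch of the statistics summarySE() reports, using base R only (simulated data).
set.seed(5)
sim <- expand.grid(PID = factor(1:12),
                   Distance = factor(seq(22.5, 112.5, by = 22.5)),
                   Relatedness = factor(c("related", "independent")))
sim$RT <- 550 + 15 * as.integer(sim$Distance) + rnorm(nrow(sim), sd = 40)

summarise_rt <- function(x) {          # hypothetical helper, not part of any package
  n  <- length(x)
  se <- sd(x) / sqrt(n)                # standard error of the mean
  c(N = n, RT = mean(x), sd = sd(x), se = se,
    ci = se * qt(0.975, df = n - 1))   # half-width of the 95% confidence interval
}

stats <- aggregate(RT ~ Relatedness + Distance, data = sim, FUN = summarise_rt)
dataRT_sketch <- data.frame(stats[c("Relatedness", "Distance")], stats$RT)
dataRT_sketch    # one row per condition: N, RT (mean), sd, se, ci
```

Note that these are ordinary between-subject standard errors. For a repeated measures design, Rmisc also offers summarySEwithin(), which adjusts the error bars for the within-subject structure and may be the better choice for a plot like this one.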

Now that I had the code together, I could plot my data:

The great thing about this is how easy it is to customize. When you export graphs from RStudio, you can even choose the size of the image in pixels and it will adjust the fonts accordingly.

With this plot you can really see the difference in RT change over different distance conditions and you can see that the related and independent tasks had very different difficulty.

What to Use?

If I had to change anything about how I analyzed this data, I don’t think I would have bothered performing ANOVAs in R. While it definitely works, I am more familiar with SPSS and it seems much easier to transfer the output into a publishable table if necessary. Other than that, I think I will continue to use both programs to deal with my data. The ggplot2 package in R really does a much better job of creating custom graphics than SPSS or even Excel. Also, ANOVA and comparison of means are major features of SPSS.

If a less common analysis were needed, I think it may end up being easier in R. Because R is open-source, people can build their own functions into the program. SPSS is owned by IBM, and anything that is added to the program must be provided by their development team and released whenever they release a new version of the software. In R, it is likely that somebody else has tried to do what you are trying to do, and maybe they had the chops to write the code themselves.

One question I have for my readers this week has to do with the different results from R and SPSS. It is troubling that the statistical strength that would be reported from a comparison could be affected by the software used for the analysis. Let me know what you think!