Codes for Multiple Regression in R

Pouria Salehi
Published in Human Systems Data
4 min read · Mar 29, 2017

The link for this week’s blog assignment simply gives us the following code, using the “mtcars” data from the {datasets} package as an example:

#Input data
input <- mtcars[, c("mpg", "disp", "hp", "wt")]
#Check the data
print(head(input))
#Use multiple regression
model <- lm(mpg~disp+hp+wt, data = input)
#Show the model
print(model)

Honestly, when I compared this output to what SPSS presents, I wasn’t satisfied with how little information this code provided. So I searched and found these commands useful for running regression in R (1, 2):

#Get a summary of the model
summary(model)
#Put all 4 diagnostic plots on one page, a classic layout
#See Figure 1

layout(matrix(c(1,2,3,4),2,2))
#Check the regression diagnostic plots for our model
plot(model)
#Coefficients of the model
coefficients(model)
#Predicted values for the model
fitted(model)
#Residuals of the model
residuals(model)
#Anova table
anova(model)
#Calculate Pearson correlation between two of the predictors
cor(input$wt, input$hp, method = "pearson")
#Confidence intervals
confint(model, level = 0.95)
#Calculate Variance-Covariance Matrix
vcov(model)
#Regression Diagnostics for checking the quality of regression fits
influence(model)
Figure 1: All 4 diagnostic plots of the regression model on one page, produced with the additional layout() call.
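
For readers who want the individual numbers that SPSS lays out in separate tables, the object returned by summary() can be taken apart. The following is a minimal sketch on the same mtcars model, not part of the original assignment code; the variable names (s, coef_table, r2, adj_r2, f, f_p) are my own.

```r
# Refit the model from the post on the built-in mtcars data
input <- mtcars[, c("mpg", "disp", "hp", "wt")]
model <- lm(mpg ~ disp + hp + wt, data = input)

s <- summary(model)

# Coefficient table: estimate, standard error, t value, p-value
coef_table <- round(coef(s), 4)
print(coef_table)

# R-squared and adjusted R-squared
r2 <- s$r.squared
adj_r2 <- s$adj.r.squared

# Overall F test: statistic, degrees of freedom, and its p-value
f <- s$fstatistic
f_p <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

cat("R-squared:", round(r2, 3), " adjusted:", round(adj_r2, 3), "\n")
cat("F(", f["numdf"], ",", f["dendf"], ") =", round(f["value"], 2),
    " p =", signif(f_p, 3), "\n")
```

The p-value of the F test is not stored in the summary object, which is why it is recomputed here with pf().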

p.s.: Here are some keyboard shortcuts for Medium, which might be useful for having lines of code in your post and more. For a full list of these shortcuts please refer to this link from Medium support.

Add Code block: ⌘ + Alt + 6 / Ctrl + Alt + 6
Add Header: ⌘ + Alt + 1 / Ctrl + Alt + 1

R vs SPSS in Multiple Regression: Using the Example of My Master’s Thesis Data

From the moment I saw the description of this week’s assignment, I was interested in choosing the SPSS and R topic. I needed a starting point. With a little googling, I found a rumor that these two software packages produce different results (for example, go to here or there). I therefore decided to compare the regression output of SPSS and R on the data from my master’s thesis.

In my study, the overall goal was to examine the factors explaining work-family enrichment of academics at Malaysian Research Universities. The study had eight variables: seven independent variables (explanatory factors) and one dependent variable (work-family enrichment). Here I briefly describe only four of them, the dependent variable and three independent variables, and then run a multiple regression analysis once in SPSS and again in R to compare the results.

(Dependent variable)
Work-family Enrichment: “the extent to which experiences in one role improve the quality of life in the other role” (p.73).

(Independent variables)
Social Support: the resources embedded in social networks or derived from social relationships, which are accessed to facilitate actions (see here)
Job Autonomy: “the extent that individuals control their job” (p.1)
Extraversion: someone who is sociable, active, assertive, energetic, enthusiastic, outgoing, talkative, cheerful, and optimistic. (refer to here)

Result in SPSS

To run the regression analysis in SPSS, I used the code below; the result is shown in Figure 2.

DATASET ACTIVATE DataSet1.
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF OUTS R ANOVA
/CRITERIA=PIN(.05) POUT(.10)
/NOORIGIN
/DEPENDENT WFE
/METHOD=ENTER Support Autonomy Extraversion.
Figure 2: SPSS Result.
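
The /MISSING LISTWISE subcommand drops every case that has a missing value on any variable in the model. As a point of comparison, lm() in R behaves the same way under its default na.action setting (na.omit). The sketch below illustrates this on the built-in mtcars data rather than the thesis data; the two missing values are injected purely for illustration, and the names model_na, complete, and model_cc are my own.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]
input$hp[c(3, 7)] <- NA   # inject two missing values, for illustration only

# lm() with na.omit (the default) silently drops incomplete rows
model_na <- lm(mpg ~ disp + hp + wt, data = input, na.action = na.omit)

# Doing the listwise deletion by hand gives the same fit
complete <- input[complete.cases(input), ]
model_cc <- lm(mpg ~ disp + hp + wt, data = complete)

nobs(model_na)                              # 30 of the 32 rows are used
all.equal(coef(model_na), coef(model_cc))   # TRUE
```

If the two programs ever disagree on a data set with missing values, checking the number of cases each one actually used is a cheap first step.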

Result in [R]

To run the regression analysis in [R], I used the following lines of code, which produced Figure 3 and Figure 4.

#input data
library(data.table)
WFE.df <- fread("Focal.csv", header = TRUE, sep = ",")
#multiple regression
input <- WFE.df[, c("WFE", "Support", "Autonomy", "Extraversion")]
model <- lm(WFE~Support+Autonomy+Extraversion, data = input)
#get a summary of the model
summary(model)
#putting all 4 diagnostic plots in one page
layout(matrix(c(1,2,3,4),2,2))
plot(model)
Figure 3: [R] Result.
Figure 4: diagnostic plots for the regression model.
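
One column that the SPSS coefficients table reports and summary() in R does not is the standardized coefficient (Beta). A minimal sketch of two equivalent ways to obtain it in base R follows. It uses the built-in mtcars data as a stand-in, since Focal.csv is not available here, and the names b, sds, beta_rescaled, z, and beta_refit are my own.

```r
input <- mtcars[, c("mpg", "disp", "hp", "wt")]
model <- lm(mpg ~ disp + hp + wt, data = input)

# Option 1: rescale the unstandardized slopes, beta = b * sd(x) / sd(y)
b <- coef(model)[-1]                      # drop the intercept
sds <- sapply(input[, names(b)], sd)      # standard deviation of each predictor
beta_rescaled <- b * sds / sd(input$mpg)

# Option 2: refit the same model on z-scored variables
z <- as.data.frame(scale(input))
beta_refit <- coef(lm(mpg ~ disp + hp + wt, data = z))[-1]

# Both columns should match
print(round(cbind(beta_rescaled, beta_refit), 3))
```

For the thesis data, the same steps would apply with WFE, Support, Autonomy, and Extraversion in place of the mtcars columns.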

Comparing Figure 2 (SPSS result) with Figure 3 ([R] result), we can see no difference between the R and SPSS output: all coefficients, p-values, and R-squared values from SPSS are identical to those from R once rounded to two decimal places. However, this is a single observation. It neither proves nor disproves that the two programs can differ, and we still need to know the circumstances under which people get different results.
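
I made the comparison above by eye. A small helper can make the same kind of check repeatable once the numbers from the other program have been typed in. This is only a sketch: agree_to() is a hypothetical helper of mine, it uses a tolerance of half a unit in the last reported decimal rather than literal rounding, and the typed_in vector is a stand-in built from R's own estimates, not real SPSS output.

```r
# Hypothetical helper: TRUE when every pair of values differs by less than
# half a unit in the last reported decimal place
agree_to <- function(x, y, digits = 2) {
  all(abs(x - y) < 0.5 * 10^(-digits))
}

input <- mtcars[, c("mpg", "disp", "hp", "wt")]
model <- lm(mpg ~ disp + hp + wt, data = input)
r_coefs <- coef(summary(model))[, "Estimate"]

# Stand-in for values copied from another program's output table:
# here simply the R estimates at three decimals, NOT real SPSS output
typed_in <- round(r_coefs, 3)

agree_to(r_coefs, typed_in, digits = 2)
```

The same call could be repeated for the standard errors, p-values, and R-squared.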
