The Five Number Summary
Descriptive Statistics Using Box Plots
Hello!
Now that we’ve covered what box plots are, what they tell us, and how to draw them on paper, let me show you how R can reproduce the graphs used as visual aids, using life expectancy data for 197 countries.
https://d396qusza40orc.cloudfront.net/introstats/Data/LifeExpTable.txt
CHANGING THE WORKING DIRECTORY
- Using the GUI — (File -> Change Dir… -> *your directory*)
- Using R console commands — (setwd(*your directory*))
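As a minimal sketch of the console route, the snippet below checks the current directory with getwd(), switches to another one, and then switches back. Using tempdir() as the target is only an assumption for illustration; substitute the folder where you saved LifeExpTable.txt.

```r
# Remember where we started so we can return later.
old_dir <- getwd()

# tempdir() is a stand-in here; use the path to your own data folder,
# written as a quoted string, e.g. setwd("path/to/your/folder").
setwd(tempdir())

# Confirm the change and list the files R can now open by bare name.
new_dir <- getwd()
print(new_dir)
print(list.files())

# Go back to the original directory.
setwd(old_dir)
```

Note that setwd() expects the path as a quoted string, and that read.table() looks for bare file names in whatever directory getwd() reports.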
READING DATA FROM GIVEN FILE
data = read.table("LifeExpTable.txt")
This command reads the contents of LifeExpTable.txt (which must be in the working directory) into a data frame and stores it in the variable ‘data’. From now on, typing ‘data’ at the console prints the life expectancy records.
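To see what read.table() returns without depending on the downloaded file, here is a small sketch that reads a few rows from an inline string through the text argument. The names and values are made up for illustration and are not taken from LifeExpTable.txt; the two-column layout (name, value) is also an assumption.

```r
# Three made-up rows, one record per line, fields separated by spaces.
sample_text <- "Alpha 71.2\nBeta 64.8\nGamma 80.1"

# read.table() returns a data frame; without a header row the
# columns are named V1, V2, ... by default.
sample_data <- read.table(text = sample_text)

str(sample_data)          # column types and a preview of the values
print(head(sample_data))  # first rows (at most six)
print(nrow(sample_data))  # number of records read
```

Checking str() and head() right after reading a file is a quick way to confirm that the columns arrived in the shape you expected.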
ASSIGNING COLUMNS TO VARIABLES
lifeexp = data[,2]
Leaving the position before the comma empty selects every row; the number after the comma is the column you want to extract. Here, column 2 is stored in the variable lifeexp.
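The row-and-column indexing can be sketched on a tiny hand-built data frame. The data frame toy and its values are hypothetical and only stand in for the real table.

```r
# A hypothetical two-column table: a name and a numeric value.
toy <- data.frame(V1 = c("Alpha", "Beta", "Gamma"),
                  V2 = c(71.2, 64.8, 80.1))

second_col <- toy[, 2]   # all rows, column 2 -> a numeric vector
first_row  <- toy[1, ]   # row 1, all columns -> a one-row data frame
one_cell   <- toy[3, 2]  # row 3, column 2    -> a single value

print(second_col)
print(first_row)
print(one_cell)
```

The pattern is always [rows, columns]; an empty slot means "take everything" along that dimension.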
CREATING A SIMPLE SCATTER PLOT DIAGRAM
plot(lifeexp)

CHANGING THE LABELS OF A SCATTER PLOT DIAGRAM
It’s good practice to label the axes so that even a raw scatter plot can be read at a glance.
plot(lifeexp, xlab="Country", ylab="Life Expectancy")

In addition to labels, we can set the axis limits to suit the presentation. Here, ylim makes the y-axis run from 0 to 86.
plot(lifeexp, xlab="Country", ylab="Life Expectancy", ylim=c(0,86))

SORTING SCATTER PLOTS
To sort the points of a scatter plot, apply the sort() command to the variable holding our life expectancy values, in this case lifeexp. The points are then drawn from the lowest value to the highest.
plot(sort(lifeexp), xlab="Country", ylab="Life Expectancy", ylim=c(0,86))
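One thing to keep in mind is that sort() returns only the values, so the link to the original positions is lost. A minimal sketch, using made-up values and labels, of how sort() and the related base R function order() behave:

```r
# Hypothetical values and matching labels, for illustration only.
values <- c(71.2, 64.8, 80.1)
labels <- c("Alpha", "Beta", "Gamma")

ascending  <- sort(values)                     # lowest to highest
descending <- sort(values, decreasing = TRUE)  # highest to lowest

# order() gives the positions that would sort the vector, which lets
# us rearrange the labels in the same way as the values.
idx <- order(values)
sorted_labels <- labels[idx]

print(ascending)
print(descending)
print(sorted_labels)
```

So values[idx] is the same as sort(values), while labels[idx] tells you which item each sorted point came from.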

CREATING BOXPLOTS
All the tedious calculations we did on paper to produce a box plot can be accomplished with this single R command:
boxplot(lifeexp, ylab="Life Expectancy", ylim=c(0,86))

PRODUCING SUMMARY STATISTICS
summary(lifeexp)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  47.79   64.67   73.24   69.86   76.65   83.39
THE COMBINE — c() FUNCTION
The combine function, c(), lets us enter values by hand and store them in a variable. The resulting vector can be sorted, summarized, and plotted just like a column read from a file.
grades = c(78, 68, 69, 88, 90, 74, 87, 76, 93)
sort(grades)
summary(grades)
boxplot(grades, ylab="Grades", ylim=c(60,100))
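Since a box plot is a drawing of the five-number summary, it can help to print those five numbers directly. The sketch below uses base R’s fivenum(), IQR(), and boxplot.stats() on the same grades vector. Keep in mind that R supports several quartile conventions, so on other data the results may differ slightly from a hand calculation.

```r
grades <- c(78, 68, 69, 88, 90, 74, 87, 76, 93)

# Minimum, lower hinge, median, upper hinge, maximum.
five <- fivenum(grades)
print(five)

# Interquartile range: Q3 - Q1 using quantile()'s default method.
spread <- IQR(grades)
print(spread)

# The values boxplot() actually draws, plus any outliers.
stats <- boxplot.stats(grades)
print(stats$stats)  # whisker end, hinge, median, hinge, whisker end
print(stats$out)    # points beyond the whiskers, if any
```

For these nine grades there are no outliers, so the whiskers reach all the way to the minimum and maximum.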

LINES IN BETWEEN SORTED SCATTER PLOTS
A sorted scatter plot is easier to read when consecutive points are joined by lines. Passing type="b" to plot() draws both the points and the lines connecting them.
six_grades = c(68, 84, 90, 74, 78, 93)
sort(six_grades)
summary(six_grades)
plot(sort(six_grades), type="b", xlab="Students", ylab="Grade", ylim=c(60,100))
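To get a feel for the type argument, here is a small sketch that draws the same sorted grades three times side by side: points only ("p"), lines only ("l"), and both ("b"). The three-panel layout via par(mfrow = ...) is just one convenient way to compare them and is my own addition.

```r
six_grades <- c(68, 84, 90, 74, 78, 93)
sorted_grades <- sort(six_grades)

# Split the plotting area into 1 row and 3 columns.
# par() returns the previous settings so we can restore them later.
old_par <- par(mfrow = c(1, 3))

plot(sorted_grades, type = "p", main = "type p: points only",
     xlab = "Students", ylab = "Grade", ylim = c(60, 100))
plot(sorted_grades, type = "l", main = "type l: lines only",
     xlab = "Students", ylab = "Grade", ylim = c(60, 100))
plot(sorted_grades, type = "b", main = "type b: both",
     xlab = "Students", ylab = "Grade", ylim = c(60, 100))

# Restore the single-panel layout.
par(old_par)
```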

SAVING GRAPHS AS IMAGES/BITMAPS/PDFS
png(filename="your/file/location/name.png")
plot(data)
dev.off()
Notice that no image appears after the plot command runs. This is because png() opens a PNG file device, and all graph output is sent to that file instead of the screen. NB: the device stays open until it is told to close, so for graphs to appear in the RStudio environment again, the PNG device must be closed. Hence, dev.off().
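The same open, plot, close pattern works for PDFs through base R’s pdf() function. Below is a minimal sketch that writes a small plot to a temporary file; the file location from tempfile() is only a placeholder, so swap in your own path.

```r
# A throwaway file location, for illustration only.
out_file <- tempfile(fileext = ".pdf")

pdf(file = out_file)  # open a PDF file device
plot(c(68, 74, 78, 84, 90, 93), type = "b",
     xlab = "Students", ylab = "Grade", ylim = c(60, 100))
dev.off()             # close the device so the file is finalized

# Check that the file was actually written.
print(file.exists(out_file))
```

If you forget dev.off(), later plots keep going into the same file and nothing shows up on screen.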
Now try a few of these examples yourself and see how easy it is to create beautiful graphs in R. ☺
Be sure to recommend this article if you found it helpful. Thanks!