Visualizing Relationships In Data Using R

Jul 27, 2022

Data visualization is the process of transforming information into a visual form, such as graphs and maps, so that people can understand the data and draw insights from it more easily. Its main goal is to make patterns, trends, and outliers in large data sets easier to identify.

In this article, we will see how to draw the following plots:

Scatter Plot
Bar Plot
Pie Chart
Histogram

Scatter Plot

A scatter plot displays the values of two quantitative variables as dots in a 2-D plane. Here we visualize house price against square footage and draw a line of best fit, which we can use to predict the price of a house from its square footage.

sqft <- c(40,30,20,10,50,12,14,25,26,24,60,31,42,45,50)
price <- c(4000,2900,1700,1200,4800,1500,2200,2900,
                   3200,2100,5500,3850,3900,5800,4500)
plot(sqft, price, xlab = "Size (Sqft)", ylab = "Price (USD$)", 
     main = "House Price vs. Square Footage")
abline(lm(price ~ sqft))
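
The fitted line can also be used to make the prediction numerically. The following is a minimal sketch, assuming a hypothetical house of 35 sqft (a value chosen only for illustration); the names fit, new_house, and predicted are introduced here and are not part of the original example.

```r
# Same data as in the scatter plot above
sqft <- c(40,30,20,10,50,12,14,25,26,24,60,31,42,45,50)
price <- c(4000,2900,1700,1200,4800,1500,2200,2900,
           3200,2100,5500,3850,3900,5800,4500)

# Fit the straight line: price = intercept + slope * sqft
fit <- lm(price ~ sqft)
print(coef(fit))  # intercept and slope of the line of best fit

# Hypothetical new house (35 sqft is an assumed value for illustration)
new_house <- data.frame(sqft = 35)

# Predicted price read off the fitted line
predicted <- predict(fit, newdata = new_house)
print(predicted)
```

The column name in new_house must match the predictor name used in the formula (sqft), otherwise predict() cannot find the value to plug in.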

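We can also draw a scatter plot from R's built-in iris dataset. First we look at its first five rows and its structure, then we plot petal width against petal length.

head(iris, n = 5)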
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa

str(iris)
## 'data.frame':    150 obs. of  5 variables:
##  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
##  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
##  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
##  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
##  $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 ...

plot(iris$Petal.Length, iris$Petal.Width, xlab = "Petal Length", ylab = "Petal Width", 
     main = "Petal Width vs. Petal Length")
abline(lm(iris$Petal.Width ~ iris$Petal.Length))
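
To put a number on how closely the points follow the line, one option is the correlation coefficient. This is a small sketch using base R's cor(); the variable name r is introduced here for illustration only.

```r
# Pearson correlation between petal length and petal width
r <- cor(iris$Petal.Length, iris$Petal.Width)

# Values close to 1 indicate a strong positive linear relationship
print(r)
```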

Bar Plot

A bar plot presents categorical data with rectangular bars whose heights or lengths are proportional to the values they represent.

users <- c(200, 400, 300, 100, 50)
progLang <- c('Java', 'R', 'Python', 'C++', 'Other')
barplot(users, 
        names.arg = progLang,
        xlab = "Programming Language", 
        ylab = "Number of Users", 
        main = "Number of Users for Various Programming Languages")

To draw the bar plot with ggplot2, we need to load the ggplot2 library, which is part of the tidyverse package. We will follow the steps below:

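Load the tidyverse package.
Show the first 5 rows of the chickwts dataset.
Display the structure of the chickwts dataset.
Make a bar plot of the chickwts dataset with feed on the x-axis and weight on the y-axis. Because geom_col() stacks the rows that share the same feed, each bar's height is the total weight for that feed.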

library(tidyverse)

head(chickwts, n = 5)
##   weight      feed
## 1    179 horsebean
## 2    160 horsebean
## 3    136 horsebean
## 4    227 horsebean
## 5    217 horsebean

str(chickwts)
## 'data.frame':    71 obs. of  2 variables:
##  $ weight: num  179 160 136 227 217 168 108 124 143 140 ...
##  $ feed  : Factor w/ 6 levels "casein","horsebean",..: 2 2 2 2 2 2 2 2 2 2 ...

chickwts %>%
  ggplot(aes(x = feed, y = weight)) +
  geom_col()
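
The bars above are totals per feed, so feeds with more chicks can look heavier. If the average weight per feed is of more interest, one option is to summarize the data before plotting. The sketch below uses base R's aggregate() and barplot() rather than ggplot2; mean_weight is a name introduced here for illustration.

```r
# Average weight for each feed type (one row per feed)
mean_weight <- aggregate(weight ~ feed, data = chickwts, FUN = mean)
print(mean_weight)

# Bar plot of the averages instead of the totals
barplot(mean_weight$weight,
        names.arg = as.character(mean_weight$feed),
        xlab = "Feed",
        ylab = "Mean Weight",
        main = "Mean Chick Weight by Feed")
```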

Pie Chart

A pie chart presents data in a circular graphic which is divided into slices to illustrate numerical proportion.

users <- c(200, 400, 300, 100, 50)
progLang <- c('Java', 'R', 'Python', 'C++', 'Other')
pie(users, 
    labels = progLang,
    main = "Number of Users for Various Programming Languages")
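
Since a pie chart is about proportions, it can help to print each slice's share in its label. The following is a minimal sketch; pct and pieLabels are names introduced here, and rounding to whole percentages is an assumption made for readability.

```r
users <- c(200, 400, 300, 100, 50)
progLang <- c('Java', 'R', 'Python', 'C++', 'Other')

# Share of each language as a percentage of all users
pct <- round(100 * users / sum(users))

# Build labels such as "R (38%)"
pieLabels <- paste0(progLang, " (", pct, "%)")

pie(users,
    labels = pieLabels,
    main = "Share of Users for Various Programming Languages")
```

Because each share is rounded separately, the printed percentages may not add up to exactly 100.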

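A pie chart can also be drawn with ggplot2 by plotting a single stacked bar and switching to polar coordinates. Here each feed's slice is its share of the total weight in the chickwts dataset.

ggplot(chickwts, aes(x = "", y = weight, fill = feed)) +
  geom_col() +
  coord_polar(theta = "y")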

Histogram

A histogram shows the distribution of a numeric variable by grouping its values into bins and drawing a bar for the number of values in each bin.

grades <- c(51,53,64,67,68,71,73,76,78,79,81,85,88,91,95)
# breaks is only a suggestion; R rounds the bin edges to "pretty" values
hist(grades, breaks = 5)

hist(iris$Petal.Length, breaks = 10)
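
To see exactly how the values were binned, we can inspect the object that hist() returns. This is a small sketch using the grades data from above; h is a name introduced here, and plot = FALSE is used so that only the binning is computed and nothing is drawn.

```r
grades <- c(51,53,64,67,68,71,73,76,78,79,81,85,88,91,95)

# Compute the histogram without drawing it
h <- hist(grades, breaks = 5, plot = FALSE)

print(h$breaks)  # bin edges chosen by R
print(h$counts)  # number of grades that fall into each bin
```

There is always one more bin edge than there are bins, and the counts add up to the number of observations.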

Conclusion

In this article, we explored how to visualize relationships within data using scatter plots, bar plots, pie charts, and histograms.