Introduction to R for Data Science (Part Seven Final)

This is the seventh part of my introduction to R. It covers boxplots, two-variable plotting, coordinates, and more.

Ivan Huang
3 min read · Apr 6, 2023

*Originally published on my Substack. This is just a part of the article.

PS: Please read ‘Introduction to R for Data Science (Part Six)’ before reading this one. This post picks up where Part Six left off.

Part six: Introduction to R for Data Science (Part Six)

Boxplots

In this case, I have created a boxplot using geom_boxplot(). When the x variable is a number that really stands for groups (like cyl), wrap it in factor() so that ggplot2 draws one box per group. The y variable stays numeric, so it doesn’t need factor(). Boxplots are common for stock pricing. When it comes to a boxplot, you don’t want to oversaturate it, meaning you don’t want to stack many extra layers on top.
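Since the plot itself isn’t reproduced here, this is a minimal sketch of what the code could look like. It assumes the built-in mtcars data set with mpg on the y-axis (my guess, based on the cyl column used later on) and reuses the name pl from the text.

```r
library(ggplot2)

# Assumption: mtcars stands in for the data behind the original plot
pl <- ggplot(mtcars, aes(x = factor(cyl), y = mpg))

# factor(cyl) gives one box per cylinder count (4, 6 and 8)
p <- pl + geom_boxplot()
print(p)
```

Without factor(), cyl would be treated as a continuous number and ggplot2 would not split the data into one box per group.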

I’m not going to show this in an image, but if you want to flip the coordinates, use coord_flip(). Using the plot from before, you can write print(pl + geom_boxplot() + coord_flip()). This swaps the x- and y-axes, so the boxes are drawn horizontally.
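As a rough, self-contained version of that one-liner, again assuming mtcars and mpg as stand-ins for the original data:

```r
library(ggplot2)

# Assumption: same stand-in data as the boxplot in this section
pl <- ggplot(mtcars, aes(x = factor(cyl), y = mpg))

# coord_flip() swaps the axes: cyl ends up on the vertical axis
flipped <- pl + geom_boxplot() + coord_flip()
print(flipped)
```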

As with the histogram and barplot, you can set color and fill inside geom_boxplot().
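For example, something along these lines should work. The two color names are just my picks for illustration, and mtcars is again an assumed stand-in.

```r
library(ggplot2)

# color sets the outline, fill sets the inside of every box
p_col <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot(color = "darkblue", fill = "lightblue")
print(p_col)
```

Because color and fill sit outside aes() here, every box gets the same fixed colors.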

Here is an example of using aes(fill=factor(cyl)). Everything should sound familiar if you have read Part Six. The only new thing is theme_dark(), which changes the background color. It doesn’t have to be theme_dark(); you can use theme_classic(), theme_light(), theme_void(), etc.
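A sketch of how those pieces could fit together, assuming mtcars with mpg on the y-axis (the original plot may have used a different y variable):

```r
library(ggplot2)

# fill is mapped inside aes(), so each cyl group gets its own color
p_theme <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot(aes(fill = factor(cyl))) +
  theme_dark()  # swap in theme_classic(), theme_light(), theme_void(), ...
print(p_theme)
```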

2 Variable Plotting

Two-variable plotting works like a heat map: it indicates high or low counts of ratings (in my case). The color shows how often a combination of values occurs, not the values themselves. I have also added scale_fill_gradient() to make it easier to see, since the default color is kinda bad.

In this case, I have added a binwidth. Here binwidth takes a vector of two numbers: the bin width along the x-axis and the bin width along the y-axis.
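Here is a minimal sketch of a two-variable plot with a custom gradient and binwidth. I don’t have the ratings data on hand, so it assumes the built-in faithful data set as a stand-in, and the bin widths and colors are illustrative choices only.

```r
library(ggplot2)

# Assumption: faithful (ships with R) replaces the ratings data
p_bin <- ggplot(faithful, aes(x = waiting, y = eruptions)) +
  # binwidth = c(width along x, width along y)
  geom_bin2d(binwidth = c(5, 0.5)) +
  # low counts in green, high counts in red
  scale_fill_gradient(low = "green", high = "red")
print(p_bin)
```

Each rectangle is colored by how many observations fall inside it.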

You can change the bins to hexagons, but first you have to install the hexbin package with install.packages('hexbin'). Then just change geom_bin2d() to geom_hex().
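The hexagon version could look roughly like this, again assuming faithful as stand-in data and the same illustrative gradient colors:

```r
# install.packages('hexbin')  # one-time install, needed to draw geom_hex()
library(ggplot2)

# Same idea as geom_bin2d(), but the cells are hexagons
p_hex <- ggplot(faithful, aes(x = waiting, y = eruptions)) +
  geom_hex() +
  scale_fill_gradient(low = "green", high = "red")
print(p_hex)
```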

There is also geom_density2d(), which draws contour lines instead of filled bins. Just change geom_hex() to geom_density2d().
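A short sketch of the contour version, assuming faithful as the stand-in data once more:

```r
library(ggplot2)

# Contour lines of the estimated 2D density; tighter rings mean more points
p_dens <- ggplot(faithful, aes(x = waiting, y = eruptions)) +
  geom_density2d()
print(p_dens)
```

Since this layer draws lines rather than filled cells, scale_fill_gradient() is left out here.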

Coordinates and Faceting
