Grammar of Graphics — Worked Examples

Swathi Sharma
Published in AI Skunks
7 min read · Mar 14, 2023

In this article, we will go through two worked examples of the Grammar of Graphics.

Example — 1

For this example, we will use a preloaded dataset called mtcars from the plotnine library. The mtcars dataset was extracted from the 1974 Motor Trend US magazine and records fuel consumption and 10 other attributes of automobile design and performance for 32 automobiles (1973–74 models).

from plotnine import *
from plotnine.data import mtcars

mtcars.head()

Visualizing two-dimensions (2-D)

We can now visualize two dimensions of the data using some of the components of the layered grammar of graphics: data, scale, aesthetics and geometric objects. We choose a scatter (dot) plot as the geometric object, so each data point is drawn as one dot, with weight (wt) on the x-axis and miles per gallon (mpg) on the y-axis.

(ggplot(mtcars, 
aes('wt', 'mpg'))
+ geom_point()
+ theme_bw())

Visualizing three-dimensions (3-D)

To visualize three dimensions from our dataset, we can map a third variable to color, in addition to the two variables already mapped to the x and y positions. In the following example, the number of gears is converted to a categorical variable with factor(gear) so that each gear count gets its own color.

(ggplot(mtcars, 
aes('wt', 'mpg', color='factor(gear)'))
+ geom_point()
+ theme_bw())
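
As a rough illustration of what the color aesthetic does with factor(gear): each distinct gear value is treated as a category and given its own color, and every point takes the color of its category. The sketch below imitates that mapping in plain Python. The helper name discrete_color_scale, the sample gear values and the palette are all made up for illustration; they are not plotnine's internals or its default colors.

```python
def discrete_color_scale(values, palette):
    """Assign one palette color to each distinct value, in sorted order."""
    levels = sorted(set(values))
    if len(levels) > len(palette):
        raise ValueError("palette has fewer colors than categories")
    return dict(zip(levels, palette))


# Hypothetical gear values for six cars (not the real mtcars rows).
gear = [4, 4, 3, 5, 3, 4]
palette = ["red", "green", "blue"]  # placeholder colors, chosen arbitrarily

scale = discrete_color_scale(gear, palette)
point_colors = [scale[g] for g in gear]

print(scale)         # one color per category
print(point_colors)  # one color per data point
```

The legend that plotnine draws next to the plot is essentially a picture of this lookup table.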

Visualizing four-dimensions (4-D)

To visualize four dimensions from our dataset, we can leverage color and size as two of our aesthetics besides other regular components including geoms, data and scale.

(ggplot(mtcars, 
aes('wt', 'mpg',
color='factor(gear)', size='cyl'))
+ geom_point()
+ theme_bw())

Alternatively, we can also use color and facets to depict data in four dimensions instead of size as depicted in the following example.

(ggplot(mtcars, 
aes('wt', 'mpg',
color='factor(gear)'))
+ geom_point()
+ facet_wrap('~cyl')
+ theme_bw())

Visualizing data dimensions and statistics

To visualize data dimensions and some relevant statistics (like fitting a linear model), we can leverage statistics along with the other components in our layered grammar.

(ggplot(mtcars, 
aes('wt', 'mpg',
color='factor(gear)'))
+ geom_point()
+ stat_smooth(method='lm')
+ theme_bw())
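
To make the statistics layer less of a black box: with method='lm', stat_smooth fits a straight line by ordinary least squares, once per group (here, once per gear value). The sketch below computes the slope and intercept of such a line in plain Python. The helper name fit_line and the numbers are made up for illustration, not taken from mtcars, and the sketch leaves out the confidence band that the plot also shows.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance terms (the common 1/n factor cancels out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical weights and fuel economy values for one group of cars.
weights = [2.0, 3.0, 4.0, 5.0]
mpg_values = [30.0, 25.0, 20.0, 15.0]

slope, intercept = fit_line(weights, mpg_values)
print(slope, intercept)
```

A negative slope, as in this toy data, corresponds to the downward-sloping lines in the plot: heavier cars travel fewer miles per gallon.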

Visualizing five-dimensions (5-D)

To visualize data in five dimensions, we will combine the x and y positions with the color and size aesthetics and add facets for the fifth variable.

(ggplot(mtcars, 
aes('wt', 'mpg',
color='factor(gear)',
size='cyl'))
+ geom_point()
+ facet_wrap('~am')
+ theme_bw())

Visualizing six-dimensions (6-D)

To visualize data in six dimensions, we can facet on two variables at once, one across the rows of the grid (am) and one across the columns (carb), while keeping color and size as aesthetics.

(ggplot(mtcars, 
aes('wt', 'mpg',
color='factor(gear)',
size='cyl'))
+ geom_point()
+ facet_grid('am ~ carb')
+ theme_bw())
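
Conceptually, facet_grid('am ~ carb') splits the data into one subset per (am, carb) combination and draws each subset in its own panel, with am selecting the row and carb the column. The sketch below shows only that grouping step in plain Python. The helper name facet_grid_panels and the records are made up for illustration and are not the real mtcars rows.

```python
from collections import defaultdict


def facet_grid_panels(records, row_var, col_var):
    """Group records by the (row, column) panel they would be drawn in."""
    panels = defaultdict(list)
    for record in records:
        panels[(record[row_var], record[col_var])].append(record)
    return dict(panels)


# Hypothetical cars with only the fields needed for this sketch.
cars = [
    {"wt": 2.6, "mpg": 21.0, "am": 1, "carb": 4},
    {"wt": 3.2, "mpg": 21.4, "am": 0, "carb": 1},
    {"wt": 2.3, "mpg": 22.8, "am": 1, "carb": 1},
    {"wt": 3.4, "mpg": 18.7, "am": 0, "carb": 2},
    {"wt": 2.9, "mpg": 21.0, "am": 1, "carb": 4},
]

panels = facet_grid_panels(cars, "am", "carb")
for key in sorted(panels):
    print(key, len(panels[key]))  # (am, carb) and number of points
```

A real grid also reserves a panel for combinations that have no rows, which is why some panels in the plot are empty; this sketch lists only the non-empty ones.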

Example — 2

For this example, we are going to use the mpg dataset, which contains fuel economy data for a range of vehicles. It is a preloaded dataset from plotnine.

Data: The Source of Information

The first step in creating a data visualization is specifying which data to plot. In plotnine, we do this by creating a ggplot object and passing the dataset that we want to use to the constructor.

from plotnine.data import mpg
from plotnine import ggplot

ggplot(mpg)

Aesthetics: Define Variables for Each Axis

After specifying the data that we want to visualize, the next step is to define the variable that we want to use for each axis in the plot. Each row in a DataFrame can contain many fields, so we have to tell plotnine which variables we want to use in the graphic.

Aesthetics map data variables to graphical attributes, like 2D position and color. For example, the following code creates a graphic that shows vehicle classes on the x-axis and highway fuel economy (hwy, in miles per gallon) on the y-axis:

from plotnine.data import mpg
from plotnine import ggplot, aes

ggplot(mpg) + aes(x="class", y="hwy")

Geometric Objects: Choose Different Plot Types

After defining the data and the attributes that we want to use in the graphic, we need to specify a geometric object to tell plotnine how data points should be drawn.

plotnine provides a lot of geometric objects that we can use out of the box, like lines, points, bars, polygons, and a lot more.

from plotnine.data import mpg
from plotnine import ggplot, aes, geom_point

ggplot(mpg) + aes(x="class", y="hwy") + geom_point()

There are many other geometric objects that you can use to visualize the same dataset. For example, the following code uses the bar geometric object, which by default counts the rows at each hwy value:

from plotnine.data import mpg
from plotnine import ggplot, aes, geom_bar

ggplot(mpg) + aes(x="hwy") + geom_bar()

Statistical Transformations: Aggregate and Transform Data

Statistical transformations apply some computation to the data before plotting it, for example to display some statistical indicator instead of the raw data. plotnine includes several statistical transformations that you can use.

from plotnine.data import mpg
from plotnine import ggplot, aes, stat_bin, geom_bar

ggplot(mpg) + aes(x="hwy") + stat_bin(bins=10) + geom_bar()
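
As an illustration of the computation behind stat_bin: it divides the range of the variable into a fixed number of intervals and counts how many observations fall into each one. The sketch below does this with equal-width bins in plain Python. The helper name bin_counts and the sample values are made up for illustration, and plotnine may place its bin edges differently, so the counts only show the idea.

```python
def bin_counts(values, bins):
    """Count values in `bins` equal-width intervals spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        # All values are identical: put everything in the first bin.
        return [len(values)] + [0] * (bins - 1)
    counts = [0] * bins
    for value in values:
        # The maximum value belongs to the last bin, not to a new one.
        index = min(int((value - lo) / width), bins - 1)
        counts[index] += 1
    return counts


# Hypothetical highway mileage values (not the real mpg rows).
hwy = [12, 15, 17, 20, 22, 24, 26, 26, 29, 32]

counts = bin_counts(hwy, bins=5)
print(counts)
```

Each count becomes the height of one bar, so changing the number of bins changes the shape of the histogram without changing the data.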
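
The box plot is another geometric object that relies on a statistical transformation: it computes the median and quartiles of hwy for each model year before drawing.

from plotnine.data import mpg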
from plotnine import ggplot, aes, geom_boxplot

(
ggplot(mpg)
+ aes(x="factor(year)", y="hwy")
+ geom_boxplot()
)
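
For readers who want to see the numbers behind a box, the sketch below computes the usual box plot summary in plain Python: the quartiles, whiskers that reach the most extreme values within 1.5 times the interquartile range, and any points beyond them as outliers. The helper name box_stats and the sample values are made up for illustration, and plotnine's own quantile calculation may differ in detail.

```python
import statistics


def box_stats(values):
    """Return the summary statistics that a box plot displays."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    inside = [v for v in values if lower_limit <= v <= upper_limit]
    outliers = sorted(v for v in values if v < lower_limit or v > upper_limit)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical highway mileage values for one model year.
sample_hwy = [17, 20, 22, 24, 25, 26, 27, 29, 44]

stats = box_stats(sample_hwy)
print(stats)
```

The box spans q1 to q3 with a line at the median, and the single very high value ends up as an isolated outlier point above the upper whisker.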

Coordinate Systems: Map Data Values to 2D Space

A coordinate system defines how data points are mapped to 2D graphical locations in the plot. You can think of it as a map from mathematical variables to graphical positions. Choosing the right coordinate system can improve the readability of your data visualizations.

from plotnine.data import mpg
from plotnine import ggplot, aes, geom_bar

ggplot(mpg) + aes(x="class") + geom_bar()
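
Adding coord_flip() swaps the x and y axes, which turns the vertical bars into horizontal ones and leaves more room for the class labels:

from plotnine.data import mpg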
from plotnine import ggplot, aes, geom_bar, coord_flip

ggplot(mpg) + aes(x="class") + geom_bar() + coord_flip()

Facets: Plot Subsets of Data Into Panels in the Same Plot

Facets are one of the coolest features of plotnine. They allow us to group data by some attributes and then plot each group individually, but in the same image. This is particularly useful when we want to show more than two variables in the same graphic.

from plotnine.data import mpg
from plotnine import ggplot, aes, facet_grid, labs, geom_point

(
ggplot(mpg)
+ facet_grid("year ~ class")  # rows by year, columns by class
+ aes(x="displ", y="hwy")
+ labs(
x="Engine Size",
y="Miles per Gallon",
title="Miles per Gallon for Each Year and Vehicle Class",
)
+ geom_point()
)

Themes: Improve the Look of Your Visualization

Another great way to improve the presentation of data visualizations is to choose a non-default theme to make the plots stand out, making them more beautiful and vibrant.

from plotnine.data import mpg
from plotnine import ggplot, aes, facet_grid, labs, geom_point, theme_dark

(
ggplot(mpg)
+ facet_grid("year ~ class")  # rows by year, columns by class
+ aes(x="displ", y="hwy")
+ labs(
x="Engine Size",
y="Miles per Gallon",
title="Miles per Gallon for Each Year and Vehicle Class",
)
+ geom_point()
+ theme_dark()
)

Visualizing Multidimensional Data

Here we will demonstrate how to display three variables at the same time, using colors to represent values.

As an alternative to faceting, we can use colors to represent the value of the third variable. The following code creates the described data visualization:

from plotnine.data import mpg
from plotnine import ggplot, aes, labs, geom_point

(
ggplot(mpg)
+ aes(x="cyl", y="hwy", color="class")
+ labs(
x="Engine Cylinders",
y="Miles per Gallon",
color="Vehicle Class",
title="Miles per Gallon for Engine Cylinders and Vehicle Classes",
)
+ geom_point()
)

References

[1] : Wickham, H. (2012). A layered grammar of graphics. Taylor & Francis. https://www.tandfonline.com/doi/abs/10.1198/jcgs.2009.07098

[2] : Szafir, D. A. (2021, July 13). A grammar of graphics. https://www.youtube.com/watch?v=RCaFBJWXfZc

[3] : Sarkar, D. (2018, September 13). A comprehensive guide to the grammar of graphics for effective visualization of multi-dimensional. https://towardsdatascience.com/a-comprehensive-guide-to-the-grammar-of-graphics-for-effective-visualization-of-multi-dimensional-1f92b4ed4149

[4] : Garcia, M. (2012). Using ggplot in Python: Visualizing Data With plotnine. https://realpython.com/ggplot-python/
