How to Create a Simple Yet Effective Scatterplot

How to answer the question, “Do I have a correlation, linear or otherwise?”

Jonathan Dunne
Nightingale

--

Introduction

Back in January, I spoke about how to create a simple yet effective bar chart. In this second installment of my “How to” series, I want to focus on another common chart type: the scatterplot.

A scatterplot is a useful way to examine the relationship between two variables. It displays numerical values along an x-axis and a y-axis, using the Cartesian coordinates of the observations within a dataset. Scatterplots can also display observations across multiple categories or groups, and it is this flexibility that makes them a popular chart type.

Figure 1: An example of a multi-group scatterplot showing statin dose against the drop in high-density lipoprotein cholesterol.

Figure 1 above provides an example of a scatterplot. For each patient, the data record the statin dose administered along with the increase or decrease in high-density lipoprotein (HDL). Each observation is typically represented by a circular dot.

The purpose of the chart is to understand whether there is an underlying relationship (correlation) between statin dose and the change in HDL. Furthermore, colour coding the different groups helps show how widely the observations are spread within each group.

--

Jonathan Dunne
Nightingale

I work as a Data Scientist by day and have a passion for visual storytelling. I have a PhD in Mathematics and Statistics, and love watching cricket.