Understanding and interpreting box plots

Dayem Siddiqui
Feb 18, 2017 · 3 min read

Hi everyone. In this article I am going to discuss the basics of box plots. But before we get started, you may ask: why box plots? Why are they so special? You see, a box plot is a very powerful tool for understanding our data. It shows us the distribution of the data: its median, its quartiles, its spread, any skewness, and its outliers. (A standard box plot does not show the mean or the variance directly, although some variants add a marker for the mean.) A box plot packs all of this information into a single concise diagram, so we can understand the nature of our data at a glance.

Consider the diagram below:

Image for post
Image source: (http://web.pdx.edu/~stipakb/download/PA551/boxplot_files/boxplot4.jpg)

Every box plot has two parts, a box and whiskers, as you can see in the figure above. That’s why it is also sometimes called a box-and-whisker plot. The start of the box, i.e. the lower quartile (Q1), is the value below which 25% of our data falls. So by looking at the diagram we can instantly conclude that 25% of our data has a value less than about 6.2. Similarly, the end of the box, i.e. the upper quartile (Q3), is the value below which 75% of our data falls, so from the diagram we can conclude that 75% of our data is less than about 8.8. The bold black line in the box represents the median of our data; in our example it lies at about 7.8. The difference between the upper quartile and the lower quartile is called the interquartile range (IQR), here about 8.8 - 6.2 = 2.6. So the length of the red box represents the interquartile range, and the box itself contains the middle 50% of our data.
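
To make the quartiles concrete, here is a minimal Python sketch that computes the numbers a box plot draws. The function name five_number_summary and the sample values are made up for illustration; they are not the data behind the figure. Note also that there are several conventions for computing quartiles, so other tools may give slightly different values on the same data.

```python
import statistics

def five_number_summary(values):
    """Return the values a box plot draws: min, Q1, median, Q3, max, plus the IQR."""
    # method="inclusive" interpolates the quartiles between the observed
    # minimum and maximum; this is one convention among several.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,          # 25% of the data lies below this value
        "median": median,  # 50% of the data lies below this value
        "q3": q3,          # 75% of the data lies below this value
        "max": max(values),
        "iqr": q3 - q1,    # length of the box
    }

# Made-up sample, for illustration only.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = five_number_summary(sample)
print(summary)
```

For this sample the box would run from 3 to 7 with the median line at 5, so the interquartile range is 4.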

The following diagram will explain the quartiles even further:

Image for post
Image source (http://www.wellbeingatschool.org.nz/sites/default/files/W@S_boxplot-labels.png)

Now for outliers

Image for post
Image source (http://www.itl.nist.gov/div898/handbook/eda/gif/boxplot0.gif)
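
The figure does not come with a rule in the text, so here is a sketch under one stated assumption: the common convention that a point is an outlier when it lies more than 1.5 times the interquartile range beyond either end of the box. The function name find_outliers, the parameter k, and the sample values are hypothetical and only serve the illustration.

```python
import statistics

def find_outliers(values, k=1.5):
    """Split values into (inliers, outliers) using fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Points inside the fences are covered by the box and whiskers.
    inliers = [v for v in values if lower_fence <= v <= upper_fence]
    # Points outside the fences are drawn individually as outliers.
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return inliers, outliers

# Made-up sample with one extreme value, for illustration only.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 30]
inliers, outliers = find_outliers(sample)
print("whiskers reach:", min(inliers), "to", max(inliers))
print("outliers:", outliers)
```

Under this convention the whiskers stop at the most extreme points that are still inside the fences (1 and 8 here), and 30 is plotted on its own as an outlier.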

Identify Skewness

Image for post
Image source (http://datapigtechnologies.com/blog/wp-content/uploads/2014/11/111714_1527_MethodsofMe7.png)
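
One way to put a number on what the box shows is to check where the median sits inside the box. The sketch below uses Bowley's quartile skewness, which is an assumption on my part and not something the figure itself defines: a value near 0 means the median is in the middle of the box, and a positive value means the median sits closer to the lower quartile, i.e. the data is skewed to the right. The function name quartile_skewness and both samples are made up for illustration.

```python
import statistics

def quartile_skewness(values):
    """Bowley's quartile skewness: (Q3 + Q1 - 2*median) / (Q3 - Q1)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    if q3 == q1:
        raise ValueError("IQR is zero, so the measure is undefined")
    return (q3 + q1 - 2 * median) / (q3 - q1)

# Made-up samples, for illustration only.
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
right_skewed = [1, 2, 2, 3, 3, 4, 6, 9, 15]

print(quartile_skewness(symmetric))     # 0.0: median in the middle of the box
print(quartile_skewness(right_skewed))  # 0.5: median closer to Q1, right skew
```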

Here is a box plot with respect to the distribution curve:

Image for post
Image source (https://onlinecourses.science.psu.edu/stat500/sites/onlinecourses.science.psu.edu.stat500/files/lesson02/rt_skew.gif)

I hope this article helped you in understanding box plots at least to some extent. Stay tuned for more. Bye :) !

