Box and Whisker Plot, aka Box Plot, Explained

Raghavan
2 min read · Jun 24, 2018

When analysing a series of numerical data, it is essential to understand the distribution of the data. One quick way to do this is to use a box plot. A box plot gives us a lot of insight about the data, though it can sometimes be difficult to read. Let's look at an example to learn more.

Youth Employment 2014

This plot is based on youth employment data for the year 2014 from here. If we hover over the plot, we can see the following things.

Median
The median (middle quartile) marks the mid-point of the data and is shown by the line that divides the box into two parts. Half the scores are greater than or equal to this value and half are less.

Inter-quartile range
The middle “box” represents the middle 50% of scores for the group. The range of scores from the lower quartile to the upper quartile is referred to as the inter-quartile range (IQR), so the middle 50% of scores fall within it.

Upper quartile
Seventy-five percent of the scores fall below the upper quartile.

Lower quartile
Twenty-five percent of scores fall below the lower quartile.
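
The four quantities described above can be computed directly. The snippet below is a minimal sketch using Python's standard `statistics` module; the `scores` list and the `quartile_summary` helper are made up for illustration and are not the youth employment data. Note that several conventions exist for computing quartiles, so a plotting library may report slightly different values for the same data.

```python
from statistics import quantiles


def quartile_summary(scores):
    """Return the lower quartile, median, upper quartile and IQR of `scores`."""
    # method="inclusive" is an assumption made for this sketch; other
    # quartile conventions can give slightly different values.
    q1, q2, q3 = quantiles(scores, n=4, method="inclusive")
    return {"q1": q1, "median": q2, "q3": q3, "iqr": q3 - q1}


# Hypothetical scores, used only to illustrate the definitions.
scores = [2, 4, 4, 5, 7, 8, 9, 11, 12]
summary = quartile_summary(scores)
print(summary)  # {'q1': 4.0, 'median': 7.0, 'q3': 9.0, 'iqr': 5.0}
```

Here the box would run from 4 to 9, with the median line drawn at 7.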

Upper fence and lower fence

Upper and lower fences cordon off outliers from the bulk of data in a set. Fences are usually found with the following formulas:

  • Upper fence = Q3 + (1.5 * IQR)
  • Lower fence = Q1 - (1.5 * IQR)

Here, IQR is the inter-quartile range, Q3 minus Q1. In the plot above, there is no lower fence.
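
The fence formulas can be checked with a few lines of Python. This is a sketch on made-up numbers: the `values` list and the helper names `fences` and `outliers` are hypothetical, and the quartiles use the standard library's "inclusive" method, which is an assumption rather than the method behind the plot above.

```python
from statistics import quantiles


def fences(data, k=1.5):
    """Return (lower fence, upper fence) using the k * IQR rule."""
    # Quartile convention is an assumption; see statistics.quantiles.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def outliers(data, k=1.5):
    """Return the points that fall outside the fences."""
    lower, upper = fences(data, k)
    return [x for x in data if x < lower or x > upper]


# Hypothetical values with one unusually large point.
values = [2, 4, 4, 5, 7, 8, 9, 11, 30]
lower, upper = fences(values)
print(lower, upper)      # -3.5 16.5
print(outliers(values))  # [30]
```

With Q1 = 4 and Q3 = 9, the IQR is 5, so the fences sit 7.5 below Q1 and 7.5 above Q3. Only the value 30 lies beyond them.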

In addition to these, the min and max values can also be observed. A box plot becomes more useful and interesting when we compare two distributions, because we can see how the two distributions differ.
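
As a rough illustration of such a comparison, the sketch below prints the numbers a box plot encodes for two groups side by side. The group names, their values and the `five_number_summary` helper are all invented for this example, and the "inclusive" quartile method is again an assumption.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return (min, lower quartile, median, upper quartile, max)."""
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, med, q3, max(data)


# Two hypothetical groups to compare.
groups = {
    "group_a": [1, 2, 3, 4, 5],
    "group_b": [2, 4, 6, 8, 10],
}

for name, data in groups.items():
    low, q1, med, q3, high = five_number_summary(data)
    print(f"{name}: min={low} q1={q1} median={med} q3={q3} max={high} iqr={q3 - q1}")
```

In this made-up case, group_b has both a higher median (6 against 3) and a wider box (IQR of 4 against 2), which is the kind of difference two box plots drawn next to each other make visible at a glance.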

Raghavan

Data scientist at Ericsson AI Accelerator, Kravmaga Trainer