Part 3: Percentages, Percentiles and Quartiles

Nishan
3 min read · Mar 9, 2022

A basic approach.

1. Percentages

Most of us are familiar with the term percentage and use it in daily life. For example, we calculate the percentage of marks we scored in an exam, or watch the percentage of a video downloaded from YouTube.

A percentage is a proportion expressed per hundred parts. An example makes the calculation clear.

Suppose x = {1, 2, 3, 4, 5, 6, 7, 8, 9}, and we want to find the percentage of even numbers in this distribution. We can calculate the percentage using the equation

Percentage = (No. of items satisfying the condition / Total No. of items)×100

Here, there are 4 even numbers (2, 4, 6 and 8) and the total number of items is 9. So, percentage = (4/9) × 100 = 44.44%. We can therefore say that 44.44% of the items in the distribution are even.
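The same calculation can be sketched in a few lines of Python. The variable names here are only illustrative; the data is the distribution x from the example above.

```python
# Distribution from the example above
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Count the items that satisfy the condition (being even)
even_count = sum(1 for value in x if value % 2 == 0)

# Percentage = (items satisfying the condition / total items) * 100
percentage = even_count / len(x) * 100

print(f"{percentage:.2f}%")  # prints 44.44%
```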

2. Percentiles

Most of us have taken at least one competitive exam in our lives, and we have heard the term percentile when the results come out. Even so, many of us are not sure what it means. So, let's look into it.

A percentile is a value below which a certain percentage of the observations in a distribution fall.

Let's take a distribution x = {2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 7, 8, 8, 8, 9, 10}. Suppose we need to calculate the percentile rank of 7 in this distribution. First, we sort the data into ascending order if it is not already sorted. Then we find the first occurrence of 7 and count the number of values less than 7.

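Here, there are 9 values below 7 (2, 2, 3, 3, 3, 4, 4, 5, 6), and the equation to calculate the percentile rank of a number is

Percentile(x) = (No. of Values Below x / Total No. of Values) × 100

With 9 values below 7 out of 16 in total, Percentile(7) = (9/16) × 100 = 56.25. So 7 sits at about the 56th percentile of this distribution.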
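As a minimal sketch, the percentile rank of 7 can be computed in Python like this. The function name percentile_rank is a hypothetical helper written for this article, not a library function.

```python
# Distribution from the example above
x = [2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 7, 8, 8, 8, 9, 10]

def percentile_rank(values, target):
    """Percentage of values that are strictly below target."""
    ordered = sorted(values)  # sort first, in case the data is unordered
    below = sum(1 for value in ordered if value < target)
    return below / len(ordered) * 100

print(percentile_rank(x, 7))  # prints 56.25
```

Counting only the values strictly below 7 matches the rule of starting from the first occurrence of 7.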

Another important use of percentiles is finding the value at the nth percentile of a distribution.

Suppose we have to find the 25th percentile value in a sorted distribution with 20 elements. To find the position of that value, we use the equation

Position = (Given Percentile × (n + 1)) / 100

where n is the total number of elements in the given distribution.

Here, position = (25/100) × 21 = 5.25 ≈ 5, rounded to the nearest whole number.

So the 5th element of the sorted distribution is the 25th percentile value. Some conventions interpolate between the 5th and 6th elements instead of rounding, so other tools may report a slightly different value.
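Here is a small sketch of this calculation in Python. The article does not list the 20 elements, so the data below is made up purely for illustration, and percentile_value is a hypothetical helper that uses the round-to-nearest convention from this section.

```python
# Hypothetical data with 20 elements: 10, 20, ..., 200
data = list(range(10, 210, 10))

def percentile_value(values, p):
    """Value at the p-th percentile using position = p * (n + 1) / 100."""
    ordered = sorted(values)
    n = len(ordered)
    position = p * (n + 1) / 100  # 1-based position in the sorted data
    # Python's round() sends exact halves to the nearest even integer
    rank = round(position)
    rank = min(max(rank, 1), n)   # keep the rank inside 1..n
    return ordered[rank - 1]      # convert to a 0-based index

print(percentile_value(data, 25))  # position 5.25 -> 5th element -> prints 50
```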

3. Quartiles

Now that we understand percentiles, let's look into quartiles.

What exactly are quartiles? In machine learning problems, we use these concepts a lot at the data-cleaning stage to remove outlier values. (Outliers were discussed in Part 2.)

A distribution has many percentile values. Two of the most commonly used are the 25th and 75th percentiles. The 25th percentile value is called the first quartile (Q1), and the 75th percentile value is called the third quartile (Q3).

We can calculate those values using the formula given above in the percentiles section.
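As a rough sketch, Q1 and Q3 can be found for the 16-element distribution from the percentiles section by applying the same position formula. The helper percentile_value is hypothetical and rounds the position to the nearest whole number, so libraries that interpolate may give slightly different quartiles.

```python
# Distribution from the percentiles section
x = [2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 7, 8, 8, 8, 9, 10]

def percentile_value(values, p):
    """Value at the p-th percentile using position = p * (n + 1) / 100."""
    ordered = sorted(values)
    n = len(ordered)
    rank = round(p * (n + 1) / 100)  # round the position to a whole rank
    rank = min(max(rank, 1), n)      # keep the rank inside 1..n
    return ordered[rank - 1]

q1 = percentile_value(x, 25)  # position 4.25 -> 4th element
q3 = percentile_value(x, 75)  # position 12.75 -> 13th element

print(q1, q3)  # prints 3 8
```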

To conclude, these are some of the most important basic concepts in machine learning, because they play an important role in cleaning and pre-processing data. In the next article, we will discuss how these concepts can be applied in practice while cleaning data in pandas.

Keep Learning!
