PostgreSQL: percentile query

Chanmann Lim
lchanmann
Published in
2 min readMar 15, 2018

Suppose that we are having students’ test score data for a particular school year which we can simply store it in a PostgreSQL table (eg. test_scores) with three columns: student_id, subject_id and score.

Let’s also assume that all students had taken the same number of classes and the students who were absent during an exam would receive a score of zero.

Then came a school board and he asked for a list of well performing students who are in the top 10th percentile of that school year.

It might be quite tempting to quickly but incorrectly code up the query with the help of the LIMIT clause to achieve this.

And the problem with this is it actually returns the top 10% of rows order by the students’ average scores however it does not necessarily mean getting the students who are in top 10 percentile of the school year. For example, if there are 100 students and 12 of them get perfect average score (i.e. 10/10) the above query would only return exactly 10 students which is not correct.

To address the question properly we need to first find the smallest average score in the poll of the students in the top 10 percentile. Equivalently, if we are to order the average scores in ascending order it would be the score corresponding to 90 percentile then we can easily use it in WHERE clause to filter for students who attain that score and higher.

Luckily since PostgreSQL 9.4, getting discrete percentile value is made available by the percentile_disc function in ordered-set aggregates which depend onWITHIN GROUP clause to perform operations on a group of rows organized by ORDER BY like so:

Now we can answer the question of the school board for getting a list of students who are in the top 10 percentile by using simple WHERE instead LIMIT clause.

Voila, there we have it.

Percentile query will always have its own niches. I hope you find it useful for your work as well as personal development. If you like this article please clap and share it on your social media so more people can benefit from it. If you have or know someone who has solved the same problem differently I encourage you to share your thought by posting comments below.

Let’s stay curious and stay hungry for knowledge, Cheers!

--

--

Chanmann Lim
lchanmann

M.S. in Computer Science, University of Missouri-Columbia.