Calculating Gini Coefficient in BigQuery with SQL
The Gini coefficient, also known as the Gini index, is a common econometric tool for measuring inequality of asset distribution.
Here is the query that outputs Gini coefficient for each day given daily non-zero (anonymous) account balances:
1 — 2B formula from this Wikipedia page https://en.wikipedia.org/wiki/Gini_coefficient, where B is the area under the Lorenz curve:
balance * (rank — 1)is the area of the rectangular horizontal slice under the Lorenz curve.
balance / 2is the area of the triangle on the left of the rectangular slice.
- all slices are then summed:
sum((balance * (rank — 1) + balance / 2))
count(*)is needed to normalize the x axis to the range 0 to 1
sum(balance)is needed to normalize the y axis to the range 0 to 1
The result can be verified on this website http://shlegeris.com/gini.
You can find a comparison of Gini Indexes of the major cryptocurrencies, which used this SQL, in the following article https://cloud.google.com/blog/products/data-analytics/introducing-six-new-cryptocurrencies-in-bigquery-public-datasets-and-how-to-analyze-them.