Gini Coefficient or Gini Index in our Data Science & Analytics platform

Analyttica Datalab
2 min read · Dec 21, 2018


The Gini Coefficient or Gini Index measures the inequality among the values of a variable. The higher the index, the more unequally the values are distributed. Equivalently, the Gini coefficient can be viewed as half of the relative mean absolute difference.
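To make the "half of the relative mean absolute difference" view concrete, here is a minimal Python sketch. The function name `gini_mean_abs_diff` is hypothetical and is not part of any platform or library. It assumes non-negative values with a positive mean, and it uses the population form, averaging over all n² ordered pairs.

```python
def gini_mean_abs_diff(values):
    """Gini coefficient as half of the relative mean absolute difference.

    Assumption: values are non-negative and their mean is positive.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    mean = sum(values) / n
    if mean <= 0:
        raise ValueError("mean of values must be positive")

    # Mean absolute difference over all ordered pairs (i, j), including i == j.
    total_abs_diff = sum(abs(x - y) for x in values for y in values)
    mean_abs_diff = total_abs_diff / (n * n)

    # Relative mean absolute difference is mean_abs_diff / mean; Gini is half of it.
    return mean_abs_diff / (2 * mean)
```

The double loop is quadratic in the number of values, so this form is best suited to small samples or to checking a faster implementation.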

The Gini index is the most commonly used measure of inequality. It is especially popular in econometric studies, where it is used to represent the income or wealth distribution of a nation’s residents.

The Gini coefficient is usually defined mathematically based on the Lorenz curve, which plots the proportion of the total income of the population (y-axis) that is cumulatively earned by the bottom x% of the population (see diagram). The line at 45 degrees thus represents perfect equality of incomes. The Gini coefficient can then be thought of as the ratio of the area between the line of equality and the Lorenz curve (marked A in the diagram) to the total area under the line of equality (marked A and B in the diagram); i.e., G = A / (A + B). Because A + B = 0.5 (the axes scale from 0 to 1), it is also equal to 2A and to 1 - 2B.

Img source: Wikipedia
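The Lorenz-curve definition can also be sketched in a few lines of Python. This is an illustrative sketch, not the platform's implementation: the names `lorenz_curve` and `gini_from_lorenz` are hypothetical, the values are assumed to be non-negative with a positive total, and area B is approximated with the trapezoid rule over the sample's Lorenz points.

```python
def lorenz_curve(values):
    """Return Lorenz curve points as (population share, cumulative value share)."""
    ordered = sorted(values)  # poorest to richest
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total <= 0:
        raise ValueError("values must be non-empty with a positive total")

    points = [(0.0, 0.0)]
    running = 0.0
    for i, v in enumerate(ordered, start=1):
        running += v
        points.append((i / n, running / total))
    return points


def gini_from_lorenz(values):
    """Gini coefficient as G = 1 - 2B, where B is the area under the Lorenz curve."""
    points = lorenz_curve(values)
    # Trapezoid rule between consecutive Lorenz points gives area B.
    area_b = sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    return 1 - 2 * area_b
```

For the values 1, 2, 3, 4 the Lorenz points are (0.25, 0.1), (0.5, 0.3), (0.75, 0.6), (1, 1), so B = 0.375 and G = 1 - 2 × 0.375 = 0.25.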

Input and Output:

In Analyttica TreasureHunt, we can calculate the Gini Index of a numeric variable by clicking “Machine Learning”, selecting “Filtering” from the drop-down, and then choosing “Gini Index”.

Interpreting Output:

The closer the Gini Index is to zero, the higher the equality; the closer it is to one, the higher the inequality.
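To illustrate the two extremes, the following hedged Python sketch uses a sorted-rank form of the Gini coefficient on toy data. The function name `gini_sorted` and the sample data are made up for illustration, and non-negative values with a positive total are assumed. Note that for a finite sample of n values the maximum this formula reaches is (n - 1) / n, so the index only approaches one as n grows.

```python
def gini_sorted(values):
    """Gini coefficient via the sorted-rank formula.

    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with x sorted ascending
    and ranks i starting at 1. Assumes non-negative values with a positive total.
    """
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total <= 0:
        raise ValueError("values must be non-empty with a positive total")
    weighted = sum(i * x for i, x in enumerate(ordered, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Toy data (hypothetical): everyone holds the same amount vs. one holder has it all.
equal = [5] * 1000
concentrated = [0] * 999 + [1]

print(gini_sorted(equal))         # close to 0: perfect equality
print(gini_sorted(concentrated))  # close to 1: (n - 1) / n = 0.999
```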

See Also:

Gain Ratio, Information Gain, Information Value


Analyttica Datalab (www.analyttica.com) is a contextual Data Science (DS) & Machine Learning (ML) Platform Company.