Calculating Gini Coefficient in BigQuery with SQL

Evgeny Medvedev
Google Cloud - Community
1 min readJan 10, 2019

--

The Gini coefficient, also known as the Gini index, is a common econometric tool for measuring inequality of asset distribution.

Here is the query that outputs Gini coefficient for each day given daily non-zero (anonymous) account balances:

It uses 1 — 2B formula from this Wikipedia page https://en.wikipedia.org/wiki/Gini_coefficient, where B is the area under the Lorenz curve:

  • balance * (rank — 1) is the area of the rectangular horizontal slice under the Lorenz curve.
  • balance / 2 is the area of the triangle on the left of the rectangular slice.
  • all slices are then summed: sum((balance * (rank — 1) + balance / 2))
  • count(*) is needed to normalize the x axis to the range 0 to 1
  • sum(balance) is needed to normalize the y axis to the range 0 to 1

The result can be verified on this website http://shlegeris.com/gini.

You can find a comparison of Gini Indexes of the major cryptocurrencies, which used this SQL, in the following article https://cloud.google.com/blog/products/data-analytics/introducing-six-new-cryptocurrencies-in-bigquery-public-datasets-and-how-to-analyze-them.

--

--