In the interest of transparency about how the Global Wine Score (GWS) is computed, this article examines the score distributions of four critics alongside that of the GWS, which currently aggregates scores from 20 wine critics. We will show how the consistency between the critics' ratings (input) and the GWS (output) makes our score easy to interpret.
Below, you will find the rating distributions of 4 of the 20 critics used in the calculation of the GWS (Robert Parker, Neal Martin, Jancis Robinson and Jacques Dupont). The last graph shows the GWS distribution.
In this study, the data is limited to vintages from 1995 to 2016. When a wine has been rated several times by the same critic, only that critic's most recent score is considered. In addition, for the GWS, we only use wines rated by at least 3 different critics.
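The three selection rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the wine and critic names, and the function name `select_ratings` are all hypothetical, and review dates are assumed to be ISO-formatted strings so that they sort chronologically.

```python
from collections import defaultdict

# Hypothetical records: (wine, vintage, critic, review_date, score).
# All names and values are made up for illustration.
ratings = [
    ("Wine A", 2010, "Critic 1", "2012-03-01", 91),
    ("Wine A", 2010, "Critic 1", "2015-06-10", 93),  # later re-tasting
    ("Wine A", 2010, "Critic 2", "2013-01-15", 90),
    ("Wine A", 2010, "Critic 3", "2013-02-20", 17),
    ("Wine B", 1990, "Critic 1", "2000-05-05", 88),  # vintage out of range
    ("Wine C", 2005, "Critic 1", "2008-04-04", 89),
    ("Wine C", 2005, "Critic 2", "2008-04-05", 16),  # only 2 critics in total
]


def select_ratings(records, first_vintage=1995, last_vintage=2016, min_critics=3):
    """Apply the three selection rules and return
    (latest score per wine/vintage/critic, wines eligible for the GWS)."""
    latest = {}
    for wine, vintage, critic, date, score in records:
        # Rule 1: keep only vintages in the studied range.
        if not first_vintage <= vintage <= last_vintage:
            continue
        # Rule 2: keep only the most recent score from each critic.
        key = (wine, vintage, critic)
        if key not in latest or date > latest[key][0]:
            latest[key] = (date, score)

    per_critic = {key: score for key, (_, score) in latest.items()}

    # Rule 3: a wine enters the GWS distribution only if enough
    # different critics have rated it.
    critics_per_wine = defaultdict(set)
    for wine, vintage, critic in latest:
        critics_per_wine[(wine, vintage)].add(critic)
    gws_eligible = {
        wine for wine, critics in critics_per_wine.items()
        if len(critics) >= min_critics
    }
    return per_critic, gws_eligible


per_critic, gws_eligible = select_ratings(ratings)
```

In this toy data set, only "Wine A" (2010) has ratings from three different critics, so it is the only wine that would contribute to the GWS distribution.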
The input data for the GWS comes from critics who rate wines in different ways. Jancis Robinson and Jacques Dupont use a 20-point scale, while Robert Parker and Neal Martin use a 100-point scale. Even when two critics use the same scale, the same rating can carry two different meanings because of each critic's habits, personal tastes and severity. These differences appear to be very significant in the case of Neal Martin and Robert Parker: equal ratings by those two critics may not reflect the same wine quality.
The GWS algorithm accounts for these dissimilarities by normalizing the input ratings in order to produce a final score that is as objective as possible.
Next, for the sake of interpretability, let’s check that the distribution of the GWS is consistent with the distributions of the critics' ratings.
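To give an idea of what normalizing ratings can mean, here is a minimal sketch that standardizes each critic's scores (subtracting the critic's mean and dividing by the critic's standard deviation). This is an assumption chosen for illustration, not the actual GWS algorithm, which is not described in this article. The function name `standardize` and the sample scores are hypothetical.

```python
import statistics


def standardize(scores):
    """Return z-scores: how far each score sits from the critic's own
    mean, measured in units of the critic's own standard deviation."""
    mean = statistics.mean(scores)
    spread = statistics.pstdev(scores)
    if spread == 0:
        # A critic who always gives the same score carries no ranking signal.
        return [0.0 for _ in scores]
    return [(score - mean) / spread for score in scores]


# Hypothetical scores given to the same four wines by two critics,
# one on a 100-point scale and one on a 20-point scale.
critic_100 = [88, 90, 92, 94]
critic_20 = [15.0, 16.0, 17.0, 18.0]

z_100 = standardize(critic_100)
z_20 = standardize(critic_20)
```

After standardization, the two critics produce identical values for these four wines even though their raw scales differ, which is the kind of scale and severity effect a normalization step is meant to remove.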
The mean and the 10% and 90% quantiles are good indicators for comparing the score distributions. They are plotted on the figures above and reported in the table below for the four critics and the GWS.
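As an illustration, these three indicators can be computed with Python's standard `statistics` module. The sample scores and the function name `summarize` are hypothetical, and the "inclusive" interpolation method is an assumption: other quantile conventions give slightly different values on small samples.

```python
import statistics


def summarize(scores):
    """Return the mean and the 10% / 90% quantiles of a list of scores."""
    # With n=10, quantiles() returns the 9 cut points between deciles:
    # the first is the 10% quantile, the last is the 90% quantile.
    deciles = statistics.quantiles(scores, n=10, method="inclusive")
    return {
        "mean": statistics.mean(scores),
        "q10": deciles[0],
        "q90": deciles[-1],
    }


# Hypothetical scores on a 100-point scale.
scores = [85, 86, 88, 89, 90, 90, 91, 92, 94, 95, 96]
summary = summarize(scores)
```

Comparing these summaries between critics who share a scale is what the following paragraphs do with the real data.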
For simplicity, this article concentrates on the raw input data coming from the critics, i.e. the data shown here is not normalized. Since Robert Parker, Neal Martin and the GWS share the same 100-point scale, their statistics can be compared directly. They have very close mean values (89.73, 89.68 and 89.37, respectively), 10% quantiles (85.50, 85.00 and 86.13) and 90% quantiles (95.00, 94.00 and 93.32). These results show strong consistency between these two critics and the GWS.
Because Jancis Robinson and Jacques Dupont use a different scale from the GWS, comparing them with it is less straightforward. Nevertheless, it appears that Jacques Dupont (mean of 15.53) is slightly more severe in his ratings than Jancis Robinson (mean of 16.21). The GWS takes these differences into account in its normalization process.
Critics may use different scales to rate wines, and even when they use the same scale, their habits and tastes vary. The main goal of the GWS is to take these differences into account and to produce a score that is as objective as possible.
For better interpretability, it is also crucial that the distribution of the GWS be consistent with the distributions of the critics' ratings. By comparing key statistics of the GWS distribution with those of the critics, we have shown that the GWS can be easily and safely interpreted by the community of wine lovers accustomed to the critics' widely accepted 100-point scale.