How I Found Perfect Correlation between Chess Player Rating and ACPL and STDCPL

Rafaelvleite
Oct 4, 2022

Also, how I found that chess player Hans Niemann plays at a 2500–2550 strength, despite being rated near 2700

It all started with a chess scandal

The world of chess is experiencing a huge scandal in 2022. A young American chess player, Hans Niemann, who has confessed to cheating in online chess games in the past, has risen like a meteor in over-the-board chess tournaments.

* Hans Niemann, 19 years old, who has beaten world champion Magnus Carlsen several times in 2022

The speed of his rise has no precedent in chess history: from 2484 to 2700 in 18 months.

* Official data from International Chess Federation, FIDE: https://ratings.fide.com/profile/2093596/chart

And things got really strange when the whole chess community started to suspect him, mainly because of these facts:

  1. The day he beat Magnus Carlsen for the first time, he gave no interview. When the interviewer called him over to talk about that huge victory, he only said “Chess speaks for itself”, turned his back on the interviewer and walked away. You can check the video here: https://youtu.be/fxe0o2pCGwo
  2. The second time he beat Magnus Carlsen, at the Sinquefield Cup 2022, he had the black pieces and played a masterpiece, completely outplaying the world champion. When interviewed, he said he had studied an ultra-rare opening before the game, and by some “miracle” that opening appeared in their game.
  3. Magnus Carlsen withdrew from the Sinquefield Cup 2022 after this game. A world champion withdrawing from a tournament has no precedent in chess history. In his announcement on Twitter, he confirmed that he had withdrawn from the competition and added a link to a video of former Chelsea manager José Mourinho saying “If I speak I am in big trouble”.
  4. Another top player, Ian Nepomniachtchi, when asked what he thought about the game, replied: “More than impressive…”
  5. The day after, in the next round of the Sinquefield Cup 2022, Niemann faced one of the best players in the world, Alireza Firouzja. Because the players and the organizers had become very suspicious of him, and because Magnus had withdrawn from the tournament, Niemann had to pass a full inspection with metal detectors. He passed it cleanly and went on to play a masterpiece against Firouzja; the game ended in a draw.
  6. In the interview after his game against Firouzja, Niemann couldn’t explain his ideas. He got very confused and described the most difficult move he made in the game, a brilliancy, as a “psychological” move. He said he didn’t calculate.
  7. At another tournament, the Julius Baer Generation Cup 2022, Niemann faced Magnus Carlsen again. Carlsen made one move and resigned the game, refusing to play.
  8. Niemann confessed that he cheated in online games when he was 12 and 16 years old. He was banned from chess.com for Fair Play violations.

After those events, the chess community was shocked and found itself in the middle of a scandal. Many people are trying to figure out how a player could cheat in an over-the-board tournament. Even Elon Musk tweeted about it, suggesting he could be using anal beads to communicate with someone and receive the best moves through some kind of coded vibrations.

Well, the fact is that no one has found any proof of cheating. That’s where the controversy lies.

Meanwhile, lots of people started analyzing his moves, trying to find some evidence of cheating. Grandmasters, statistics specialists, body language readers…

I also felt encouraged to try to help. I have a bachelor's degree in Production Engineering from the University of Sao Paulo (POLI-USP), and I am also a programmer and data scientist. Besides that, I have a YouTube channel about chess in Brazil, called “Xadrez Brasil”, with almost 300k subscribers and more than 80 million views.

After seeing a video from FIDE Master Yosha, a player who was trying to find evidence by comparing Niemann’s moves with the ones suggested by computers, I thought it could be a nice approach. The idea was to compare every move from Niemann’s career against the computer-suggested moves, and to do the same for huge amounts of data from lots of players. I would need to write a program for that.

The STEPS to create a Database of compared moves between Niemann and other players against a chess engine

  1. I downloaded the most recent version of the best chess engine, Stockfish 15, which uses neural networks. You can download it here: https://stockfishchess.org/download/
  2. I collected some game databases from the ChessBase software to feed the algorithm. Players included in the database were top GMs like Andrey Esipenko, Gukesh, Firouzja, Keymer, Praggnanandhaa, Erigaisi, Carlsen, Niemann, Caruana, Bobby Fischer and others.
  3. I created a Python script to process the data, using Stockfish 15 to evaluate every single move of every single game. You can download the Python source code in my GitHub repo.
  4. I finally had a huge database of games analyzed move by move by the best chess engine. You can download the final database with 7568 analyzed games here.
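
For readers who want to try step 3 themselves before opening the repo, here is a minimal sketch of what such a script could look like. It is not the script from the GitHub repo: it assumes the third-party python-chess package and a local Stockfish binary, and the function names, the search depth of 18 and the mate score cap of 10000 are all illustrative choices.

```python
def centipawn_loss(best_eval_cp, played_eval_cp):
    """Loss of one move, in centipawns, from the mover's point of view.

    Both evaluations are for the side that moved. Negative differences
    (engine noise between two searches) are clamped to 0.
    """
    return max(0, best_eval_cp - played_eval_cp)


def analyse_game(pgn_path, engine_path, depth=18):
    """Return (ply, colour, loss) tuples for the first game in a PGN file.

    Assumes python-chess is installed and engine_path points to a UCI
    engine such as Stockfish; depth=18 is an arbitrary choice.
    """
    import chess
    import chess.engine
    import chess.pgn

    with open(pgn_path, encoding="utf-8") as handle:
        game = chess.pgn.read_game(handle)

    limit = chess.engine.Limit(depth=depth)
    losses = []
    engine = chess.engine.SimpleEngine.popen_uci(engine_path)
    try:
        board = game.board()
        for ply, move in enumerate(game.mainline_moves(), start=1):
            mover = board.turn
            # Evaluation before the move = value of the engine's best line.
            before = engine.analyse(board, limit)["score"].pov(mover)
            board.push(move)
            # Evaluation after the move, still seen from the mover's side.
            after = engine.analyse(board, limit)["score"].pov(mover)
            loss = centipawn_loss(before.score(mate_score=10000),
                                  after.score(mate_score=10000))
            colour = "white" if mover == chess.WHITE else "black"
            losses.append((ply, colour, loss))
    finally:
        engine.quit()
    return losses
```

Calling `analyse_game("game.pgn", "/path/to/stockfish")` (both paths are placeholders) would give one loss value per move, which is the raw material for the metrics discussed in the results.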

The results

With these studies, I made 2 huge findings: a player's Rating is strongly correlated with both the ACPL (Average Centipawn Loss) and the STDCPL (Standard Deviation of Centipawn Loss).

Basically, Centipawn Loss measures how far the evaluation of the move a player actually made is from the evaluation of the computer's best move, in hundredths of a pawn. It is reasonable to think that as a chess player gets better and more professional, the Average Centipawn Loss gets lower.

And Standard Deviation Centipawn Loss is a measure of how consistent the moves are. Amateur players tend to make some good moves, then some bad moves, but a professional player is more consistent in the quality of his moves. It is reasonable, then, to expect some correlation between Rating and STDCPL.
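
To make the two metrics concrete, here is a small sketch using only Python's standard library. The per-move losses are made-up numbers for illustration, and using the population standard deviation (`pstdev`) rather than the sample one is an assumption.

```python
from statistics import mean, pstdev


def acpl_and_stdcpl(losses):
    """Return (ACPL, STDCPL) for a list of per-move centipawn losses."""
    if not losses:
        raise ValueError("need at least one move")
    return mean(losses), pstdev(losses)


# Hypothetical per-move losses for two players (illustrative numbers only).
steady = [10, 20, 10, 20, 10, 20]   # always slightly inaccurate
erratic = [0, 0, 0, 0, 0, 90]       # perfect moves, then one big mistake

print(acpl_and_stdcpl(steady))
print(acpl_and_stdcpl(erratic))
```

Both players end up with the same ACPL of 15, but the steady player has an STDCPL of 5 while the erratic one is at about 33.5. That is why the two metrics are worth tracking separately.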

And when I took the Centipawn Loss averages of all players in the database and grouped them by Rating Tier, and did the same for the standard deviations, I found this:
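
The grouping step can be sketched as follows. This is only one possible reading of it: pooling individual moves per tier, the 100-point tier width and the sample rows are all assumptions for illustration, not taken from the database.

```python
from collections import defaultdict
from statistics import mean, pstdev


def tier_of(rating, width=100):
    """Lower bound of the rating tier, e.g. 2687 -> 2600 for width=100."""
    return (rating // width) * width


def summarise_by_tier(moves, width=100):
    """moves: iterable of (player_rating, centipawn_loss) pairs.

    Returns {tier: (ACPL, STDCPL, move_count)}, ordered by tier.
    """
    buckets = defaultdict(list)
    for rating, loss in moves:
        buckets[tier_of(rating, width)].append(loss)
    return {
        tier: (mean(losses), pstdev(losses), len(losses))
        for tier, losses in sorted(buckets.items())
    }


# Hypothetical rows, for illustration only.
sample = [(2510, 30), (2540, 20), (2620, 10), (2690, 30), (2705, 20)]
for tier, (acpl, stdcpl, n) in summarise_by_tier(sample).items():
    print(tier, acpl, round(stdcpl, 1), n)
```

Keeping the move count per tier next to the averages makes it easy to spot tiers that rest on too little data.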

That’s beautiful! Not only is there indeed a correlation between these metrics, but it is also very close to perfectly linear!

I would need to plot a correlation matrix to be sure about the statistical relevance of these findings. I did it:

* Pearson correlation of -0.99 between Rating and ACPL and between Rating and STDCPL
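
For anyone who wants to check a Pearson coefficient by hand, here is a minimal standard-library sketch. The tier table at the bottom contains placeholder values, not the numbers behind the chart above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient r between two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder tier table (made-up values, NOT the article's data).
tiers = [2300, 2400, 2500, 2600, 2700]
acpl = [31, 28, 26, 23, 22]
print(round(pearson(tiers, acpl), 3))
```

A value close to -1 means ACPL falls almost linearly as the rating tier rises. In this sketch r is computed over a handful of tier averages, not over individual games.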

Huge findings! Now ACPL and STDCPL can be used by the chess community for several purposes:

  • Estimate the Rating of players from the past, legends like Paul Morphy.
  • Chess coaches can better identify their students’ levels by evaluating their ACPL and STDCPL over a group of games (at least 30–40 games are needed to be statistically relevant).
  • A chess professional may understand where he is in his career life cycle (progress, stabilization, decay).
  • Identify anomalies that could be evidence of cheating.
  • Predict the probability of the results between players.
  • Compare chess players from different eras throughout history.

Real world

When looking at real-world examples of top chess players, we can clearly see the correlation between Rating, ACPL and STDCPL:

* GM Gukesh (ACPL 22, STDCPL 40)
* GM Vincent Keymer (ACPL 21, STDCPL 40)
* GM Praggnanandhaa (ACPL 22, STDCPL 37)
* Erigaisi (ACPL 22, STDCPL 38)
* GM Magnus Carlsen (ACPL 17, STDCPL 32) — This is a monster!
* GM Fabiano Caruana (ACPL 17, STDCPL 33) — Another monster!

And what about Hans Niemann?

* GM Hans Niemann (ACPL 25, STDCPL 48)

After analysing more than 1200 games from Niemann's entire career, I found the curves below:

At first glance, these curves seem to be absolutely normal. They are "almost" linear, and there is a decay in ACPL and STDCPL as rating increases… BUT…

  1. ACPL stopped going down at a value of 25, even when Hans achieved a 2700 rating. According to our expected curves, 25 corresponds to a 2500–2550 rated player. A 2700 rated player should have gone down to about 22, as the 2700 players above did.
  2. STDCPL is at 48, and the lowest value he achieved was 44. Again, those are values found in a 2500–2550 player. A 2700 rated player is expected to go down to about 38. A level of 48 shows high variation in the quality of his moves for a 2700 rated player.

* Expected ACPL and STDCPL by Rating Tier
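
The reasoning in the two points above amounts to reading a rating off a straight line. Here is a rough sketch of that idea, using only the two reference points quoted in the text: ACPL 25 for a 2500–2550 player (taking the 2525 midpoint is an assumption) and ACPL 22 for a 2700 player. A real estimate would fit the line to the full tier table.

```python
def implied_rating(acpl, p_low=(25.0, 2525.0), p_high=(22.0, 2700.0)):
    """Rating implied by an ACPL value, read off the straight line through
    two (ACPL, rating) reference points.

    The defaults come from figures quoted in the text; 2525 is the assumed
    midpoint of the 2500-2550 range.
    """
    (x0, y0), (x1, y1) = p_low, p_high
    slope = (y1 - y0) / (x1 - x0)  # rating points per centipawn (negative)
    return y0 + slope * (acpl - x0)


print(implied_rating(25))
print(implied_rating(22))
```

With these two points, each centipawn of ACPL is worth roughly 58 rating points, so a gap of 3 centipawns corresponds to about 175 points.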

Besides that, things get pretty ugly when we split the data at 2018, the year he started to grow abnormally:

* The years before 2018 show an expected curve: linear, with decay in ACPL and STDCPL
* The years after 2018 show unexpected behavior: low correlations, and ACPL and STDCPL values not compatible with a 2700 rating

Big Question

The big question is: What can possibly explain a 2500-strength player achieving a 2700 rating?

Conclusion

I have pointed in a new direction by showing the perfect correlation between ACPL, STDCPL and Rating. I have explained the steps that I took, so anyone can easily replicate them.

My contribution is done. If FIDE wants to use it as a complementary method in the ongoing investigations, it may be helpful. It may not be. I don’t know. I just feel happy with what I found and thankful to have the opportunity to spread it to the world.

For Fun

Enjoy more FIDE Rating Charts from current top chess players:

* GM Hikaru Nakamura (2768 at Oct. 2022) https://ratings.fide.com/profile/2016192/chart
* GM Duda, Jan-Krzysztof (2731 at Oct. 2022) https://ratings.fide.com/profile/1170546
* GM Ding Liren (2811 at Oct. 2022) https://ratings.fide.com/profile/8603677
* GM Wesley So (2774 at Oct. 2022) https://ratings.fide.com/profile/5202213
* GM Hans Niemann (2699 at Oct. 2022) https://ratings.fide.com/profile/2093596/chart

The last chart speaks for itself.

Rafaelvleite

Chess Streamer in Brazil with more than 320k subscribers and +120 million views. Bachelor of Production Engineering at USP-SP. Programmer and Data Scientist.