From chess score sheet to ICR with OpenCV and image recognition.

How it all began

I started playing chess again after I finished school. As a tech-savvy guy, I immediately started looking for things I could improve around this beautiful game.

The first thing I looked into was how chess moves are tracked. There are nice electronic chessboards, but they are quite expensive. You can also build your own, but that is quite hard at the moment.

Longer games come with the requirement of recording your games on a chess score sheet. The idea is quite simple: every player notes down the moves he makes on the form. Afterwards, the arbiter collects the forms and puts them together to recreate the game, usually in electronic form.

Bobby Fischer’s Scoresheet from Siegen Olympiad, 1970

I was about to play in a tournament with a longer time control when a sudden idea came to my mind: I could use image recognition techniques to import a handwritten score sheet into the computer. This would make the games available immediately after the event without much effort.

What I wanted to build

So the idea is quite simple. We take a picture of the chess score sheet with a smartphone camera and upload it to a server. There it is processed into a chess game notation file, ready to be analysed by the chess player.

This simple explanation hides a great deal of processing that the computer needs to do before it can convert the picture into something far more structured, such as PGN text.

For the human eye this process takes a fraction of a second and goes totally unnoticed most of the time. For the computer it is not that simple. It involves at least these steps:
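To make the target format concrete, here is a minimal Python sketch of what the final output could look like once the moves have been recognised. The function name `to_pgn` and its parameters are my own illustration, not code from the project, and only a subset of the usual PGN header tags is written.

```python
def to_pgn(moves, event="?", white="?", black="?", result="*"):
    """Build a minimal PGN string from a flat list of moves in algebraic notation.

    Hypothetical helper for illustration: `moves` alternates White, Black,
    White, ... and `result` is the PGN termination marker ("*" = unknown).
    """
    tags = [("Event", event), ("White", white), ("Black", black), ("Result", result)]
    header = "\n".join(f'[{name} "{value}"]' for name, value in tags)

    parts = []
    for index, move in enumerate(moves):
        if index % 2 == 0:
            # A new move number is written before every White move.
            parts.append(f"{index // 2 + 1}.")
        parts.append(move)
    parts.append(result)

    # Header and movetext are separated by one empty line.
    return header + "\n\n" + " ".join(parts) + "\n"


if __name__ == "__main__":
    print(to_pgn(["e4", "e5", "Nf3", "Nc6"], white="Player A", black="Player B"))
```

Everything before this point in the pipeline exists only to produce that flat list of moves from the photo.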

1. Fix the orientation of the page, scale it and correct the perspective.

2. Identify fields and extract the images with the marks.

3. Classify letters and numbers into categories based on the image.

4. Post-process the result to check that all moves are legal.

The ICR process as of the proof of concept
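As an illustration of the post-processing in step 4, here is a small Python sketch of a first, cheap filter: it only checks that a recognised string looks like a move in English algebraic notation, and picks the most confident candidate that does. It is not a full legality check (that needs the board position), and the candidate format, a list of `(text, confidence)` pairs, is an assumption of mine rather than what the proof of concept actually produces.

```python
import re

# Syntax of a single move in English algebraic notation (assumed notation).
SAN_PATTERN = re.compile(
    r"^(?:O-O(?:-O)?"                       # castling, short or long
    r"|[KQRBN][a-h]?[1-8]?x?[a-h][1-8]"     # piece move, optional disambiguation and capture
    r"|(?:[a-h]x)?[a-h][1-8](?:=[QRBN])?"   # pawn move, optional capture and promotion
    r")[+#]?$"                              # optional check or mate marker
)


def pick_move(candidates):
    """Return the most confident candidate that is syntactically a move.

    `candidates` is a hypothetical classifier output: (text, confidence) pairs.
    Returns None when no candidate looks like a move at all.
    """
    valid = [(text, score) for text, score in candidates if SAN_PATTERN.match(text)]
    if not valid:
        return None
    return max(valid, key=lambda item: item[1])[0]


if __name__ == "__main__":
    # "Nf9" is the classifier's favourite, but there is no ninth rank.
    print(pick_move([("Nf9", 0.6), ("Nf3", 0.3)]))
```

A filter like this can only reject impossible strings; deciding between two well-formed moves still requires replaying the game on a board.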

How is it all possible

So now the fun part starts. All of those things involve image manipulation. It just so happens that this is a field that has always been very attractive to me from a technical point of view. I went all in, and every day I devoted a few hours of the evening to working on it. My goal was to find out how complicated such a task would be for me, and whether it was possible at all.

After a week or two I had a first proof of concept working, with a tremendous success: a mark recognition rate of 69%. That was a real wow for me. I knew perfectly well that the final goal was, and in fact still is, far away, but it was already enough to encourage me to make a more serious attempt.

First run on the score sheet above. The expected moves are shown in brackets.

So the story begins here. I wanted to start this series for two reasons. First, to be better motivated to make progress on my project. Second, because I would like to gather the knowledge I am acquiring in an organised way. I would like these posts to become a kind of blueprint or open-knowledge project. Still, I hope it will be a nice story to read for people who are not too scared of the technical bits that I might show off during the next week.

If you want to read more about it, please make sure to follow me.

Update 1.

If you are willing to do a bit of reverse engineering and look at the code behind this, I've published it on GitHub:

https://github.com/smigielski/pgn-reader-poc