Implementing an OCR for Identity Cards — Part 2: Fine-Tuning
In the first part, Implementing an OCR for Identity Cards — Part 1: Image Preprocessing, I covered the various preprocessing steps I took to help out the OCR as much as possible. This is because most OCR systems are very sensitive to the input images, especially if the images are tilted, noisy, etc.
In this part, we’ll cover the actual OCR-ing, along with how I did model fine-tuning for the KTP OCR mode, using the NIK and Name as examples. Information on fine-tuning a model, much less an OCR model, is quite sparse. The information presented here has been gleaned from scouring numerous GitHub issues and lots of trial and error.
Now, it is quite difficult to give a step-by-step account as to how to actually do this, because it’s a very iterative process. But hopefully, you’ll get a general idea and adapt it to your own use case.
EasyOCR
After evaluating a bunch of open-source OCR solutions, I settled on EasyOCR because it gave the best results balanced with inference time. More importantly, I was able to figure out how to fine-tune it. Most importantly, having 15+ GitHub stars didn’t hurt either.
EasyOCR on NIK
Now, while EasyOCR did pretty OK for most of the text I threw at it, detection of the NIK had…