Implementing an OCR for Identity Cards — Part 2: Fine-Tuning

Benjamin Tan Wei Hao
Published in DKatalis
8 min read · Jun 17, 2022

In the first part, Implementing an OCR for Identity Cards — Part 1: Image Preprocessing, I covered the preprocessing steps I took to give the OCR the best possible input. This matters because most OCR systems are very sensitive to the quality of the input images, especially if the images are tilted, noisy, and so on.

In this part, we’ll cover the actual OCR-ing, along with how I fine-tuned the KTP OCR model, using the NIK and Name fields as examples. Information on fine-tuning a model, much less an OCR model, is quite sparse. What is presented here was gleaned from scouring numerous GitHub issues and lots of trial and error.

Now, it is quite difficult to give a step-by-step account of how to do this, because it is a very iterative process. But hopefully you’ll get a general idea and be able to adapt it to your own use case.

EasyOCR

After evaluating a bunch of open-source OCR solutions, I settled on EasyOCR because it gave the best balance of accuracy and inference time. More importantly, I was able to figure out how to fine-tune it. Most importantly, having 15k+ GitHub stars didn’t hurt either.
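For reference, running EasyOCR out of the box only takes a few lines. The snippet below is a minimal sketch, not the exact code from this project: the image filename and language list are assumptions for illustration. It loads the Indonesian and English models and runs recognition on a preprocessed KTP image.

```python
# Minimal sketch of running EasyOCR on a preprocessed KTP image.
# The filename and language list here are assumptions, not taken from the article.
import easyocr

# Load the detection + recognition models once; 'id' = Indonesian, 'en' = English.
reader = easyocr.Reader(['id', 'en'], gpu=False)

# readtext() returns a list of (bounding_box, text, confidence) tuples.
results = reader.readtext('ktp_preprocessed.jpg')

for bbox, text, confidence in results:
    print(f'{confidence:.2f}  {text}')
```

Each tuple pairs the recognized text with its bounding box and a confidence score, which is handy later when deciding which fields need fine-tuning.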

EasyOCR on NIK

Now, while EasyOCR did pretty OK for most of the text I threw at it, detection of the NIK had…
