Implementing an OCR for Identity Cards — Part 1: Image Preprocessing

Benjamin Tan Wei Hao
Published in DKatalis
11 min read · Nov 15, 2021

www.dkatalis.co

In this series of articles, I will take you through how I implemented an OCR system for Indonesian identity cards, or KTP for short. While I obviously cannot open-source our internal code, I can provide you with a roadmap for implementing your own OCR system.

During my research, I came across several helpful articles, but a few areas still weren’t covered very well. For example:

  • How to fine-tune an OCR model
  • How to deploy an ML model
  • How to optimize an ML model on CPU (Yes, CPU!)

These are some of the big questions that I had when I started on this journey. The techniques covered here should be applicable to a wide variety of identity cards, and even passports. Some of the techniques you see might appear sophisticated, while others might be surprisingly simple. I encourage you to experiment with multiple techniques for your specific problem domain.

Detection results from the OCR of a synthetically generated KTP

Being the lazy person that I am, I didn’t implement the OCR from scratch, but instead opted to look for an open-source solution and beat it into submission. I…

Benjamin Tan Wei Hao
DKatalis

Author of The Little Elixir & OTP Guidebook, Mastering Ruby Closures, Building an ML Pipeline in Kubeflow. | Currently: Product Owner at @dkatalis.