Automate Form Processing 1 | Custom Vision + Computer Vision

Marvin Heng
Aug 25 · 2 min read


Last year, I came up with an idea to automate form processing, which I think will help many businesses reduce the effort of hiring people just for data entry. Not only that, it should drastically reduce human error during data entry.

My ultimate goal was to capture handwritten text on a printed form.

Architecture Design

With the technology available back then, I had two options to achieve what I wanted: keep training and fine-tuning my own machine learning model, or find another solution. Due to my limited knowledge, I opted for the latter and landed on Microsoft’s Custom Vision plus Cognitive Services (specifically, the Computer Vision API).

Custom Vision allows me to train a model with a set of sample forms on which I tag all the handwritten text. Once training is done, when I pass in a test form, the API tells me where the handwritten text is located on the form.

Then we take those regions and send them to the Cognitive Services Computer Vision API to convert the handwritten text into digitized text.
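The hand-off between the two services can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it assumes the detection step reports each region as a tag, a probability, and a bounding box normalized to the 0..1 range, and it only turns those into pixel rectangles that could be cropped and sent on for handwriting recognition. The names `Detection` and `to_pixel_boxes`, the probability threshold, and the padding value are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detected handwritten region (hypothetical shape).

    Coordinates are assumed to be normalized to the 0..1 range,
    relative to the width and height of the form image.
    """
    tag: str
    probability: float
    left: float
    top: float
    width: float
    height: float


def to_pixel_boxes(
    detections: List[Detection],
    image_width: int,
    image_height: int,
    min_probability: float = 0.5,  # assumed cut-off, tune per form
    padding: int = 4,              # assumed margin in pixels around the text
) -> List[Tuple[str, Tuple[int, int, int, int]]]:
    """Convert confident detections into (tag, (x0, y0, x1, y1)) pixel boxes."""
    boxes = []
    for d in detections:
        # Skip regions the detector is not confident about.
        if d.probability < min_probability:
            continue
        # Scale normalized coordinates to pixels, pad, and clamp to the image.
        x0 = max(0, int(d.left * image_width) - padding)
        y0 = max(0, int(d.top * image_height) - padding)
        x1 = min(image_width, int((d.left + d.width) * image_width) + padding)
        y1 = min(image_height, int((d.top + d.height) * image_height) + padding)
        boxes.append((d.tag, (x0, y0, x1, y1)))
    return boxes
```

Each resulting rectangle would then be cropped from the scanned form and passed to the text recognition step. The small padding is there so that strokes touching the edge of a detected box are not cut off, which is one plausible thing to tune when recognition accuracy is low.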

Outcome of Experiment

What we achieved with the approach above is as follows:

  • Trained a Custom Vision model on the sample forms.
  • Converted handwritten text to digitized text.
  • Accomplished only ~50% accuracy.

Second Experiment: Extended with Web Technologies

With help from Goh Chun Lin, we managed to extend it with web technology, which allowed us to adjust and fine-tune it and, most exciting of all, see the results of our approach almost immediately. We were supposed to prepare and present the above idea at the FutureNow Singapore event in January 2019, but we didn’t quite make it in time. Therefore, we decided to open source it, and you may download and look at the project on GitHub, listed under the DotNetSG repo, which is an initiative of

Next: Automate Form Processing 2 | Next Journey with Microsoft Form Recognizer

