Optical Character Recognition Implementation

Successive Digital
Nov 21, 2019

Introduction and background

Optical Character Recognition (OCR) is the technology that reads typed, printed, or handwritten characters and converts them into machine-encoded text or another format that a computer can manipulate. As a subset of image recognition, it is commonly used as a form of data entry, with the input being a printed document or data record such as a bank statement, passport, sales invoice, resume, or business card.

Android version

For the Android version, the implementation used the Xamarin.GooglePlayServices.Vision library. Our team was successful in this project: the expected response time is less than one second. The system can read text in grey and red, with 65 to 75 percent accuracy for grey text and 90 percent accuracy for red text.

The process flow includes the user opening the camera and placing it above the lock, followed by the camera detecting the surface. After this, it captures the text and returns the value after processing it.

The Android version of the application can use a custom camera that captures a video stream of the lock, divides it into frames, captures the text in each frame, and adds it to a list of texts. This cycle repeats until two texts are found to be similar. Compared with capturing a single image, detecting the text within it, and presenting the output, this procedure increases the chances of reading the lock correctly. However, it also affects the processing time and response rate of the complete operation.
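The repeat-until-two-texts-agree cycle can be illustrated with a minimal, platform-neutral sketch in Python. This is not the project's code. The names `read_code`, `similar`, and `recognize`, the 0.9 similarity threshold, and the 30-frame cap are all assumptions made for illustration, and `recognize` stands in for whatever text recognition call the platform provides.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, List, Optional


def similar(a: str, b: str, threshold: float = 0.9) -> bool:
    """Return True when two readings are nearly identical (ratio is in 0..1)."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def read_code(
    frames: Iterable[object],
    recognize: Callable[[object], str],  # hypothetical OCR call, one frame in, text out
    threshold: float = 0.9,              # assumed similarity cut-off
    max_frames: int = 30,                # assumed cap to keep response time bounded
) -> Optional[str]:
    """Scan frames until two recognized texts agree, then return the later one."""
    seen: List[str] = []
    for index, frame in enumerate(frames):
        if index >= max_frames:
            break  # give up rather than scan forever
        text = recognize(frame).strip()
        if not text:
            continue  # nothing detected in this frame
        for earlier in seen:
            if similar(earlier, text, threshold):
                return text  # two readings agree, so accept this one
        seen.append(text)
    return None  # no two readings agreed


# Usage: the "frames" here are already strings, so recognize is the identity.
readings = ["A1B2", "", "4821", "48Z1", "4821"]
print(read_code(readings, recognize=lambda frame: frame))  # prints 4821
```

Requiring agreement between two readings filters out one-off misreads such as "48Z1" above, at the cost of processing more frames before a result is returned.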

How efficiently the application works will depend on using the latest version of Xamarin.GooglePlayServices.Vision.

iOS version

For the iOS version, the implementation used ML Kit’s text recognition APIs, which can recognize text in any Latin-based language. This translates into the automation of data entry for receipts, credit cards, and business cards.

With a response time of less than one second, the system can read text in grey and red. The accuracy is 60 to 70 percent for grey text and 80 percent for red text.

The iOS version is intended to have a custom camera that captures the lock’s video stream, which is divided into several frames. From each frame, the system captures the text and adds it to a list of texts. This repeats until two texts are found to be similar, at which point the system presents the matched text to the user as the lock code.

One limitation is that accurate capturing of data depends on the lighting condition of the surroundings as well as the condition of the lock.

Successive Digital

A next-gen digital transformation company that helps enterprises transform business through disruptive strategies & agile deployment of innovative solutions.