Build your own OCR (Optical Character Recognition) for free

Some popular OCR APIs

Open Source Frameworks:

How to set up and get it to work?

  • Gradle dependency
  • Tesseract CPP Preset: the Java wrapper for Tesseract, which is itself built on a C++ framework.
  • Leptonica: a dependency of Tesseract that provides support for several image formats. It also extracts position and page layout information.
  • JMagick: the Java interface to the ImageMagick C API.
  • Im4Java: a Java wrapper for ImageMagick. It runs ImageMagick command-line commands through Java's ProcessBuilder.
  • brew install imagemagick
  • Run brew info imagemagick to confirm that the installation was successful.
  • For Tesseract to work at its best, make sure the image is as clear as possible.
  • This may mean performing image modifications such as resizing, colorspace conversion, contrast adjustment, morphology, filtering (Gaussian, Triangle, Spline, etc.), and edge detection.
  • For this reason we use JMagick, which offers many functions that call ImageMagick under the hood to perform image modifications.
  • Here are some useful links for performing image modifications
  • Below are sample images showing the original image and how it needs to look for Tesseract to read it and perform OCR.
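
The preprocessing step above can be sketched with plain ProcessBuilder, the same mechanism the article says Im4Java relies on. This is a minimal illustration rather than the author's code: the class and method names are hypothetical, it assumes ImageMagick 7 (the magick binary) and the tesseract command-line tool are on the PATH, and the option values (300% resize, Triangle filter, 0x1 sharpen) are placeholders to tune per image. It also calls the Tesseract CLI instead of the Java wrapper, purely to keep the sketch dependency-free.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; names and option values are illustrative assumptions.
class OcrPreprocessSketch {

    // Builds an ImageMagick command that converts to grayscale, upscales with
    // the Triangle filter, and sharpens. Assumes ImageMagick 7 ("magick" on PATH).
    static List<String> buildMagickCommand(String input, String output, int resizePercent) {
        return new ArrayList<>(Arrays.asList(
            "magick", input,
            "-colorspace", "Gray",
            "-filter", "Triangle",          // the filter must be set before -resize
            "-resize", resizePercent + "%",
            "-sharpen", "0x1",
            output));
    }

    // Builds a Tesseract CLI command that prints the recognized text to stdout.
    static List<String> buildTesseractCommand(String image, String language) {
        return new ArrayList<>(Arrays.asList("tesseract", image, "stdout", "-l", language));
    }

    // Runs a command, forwarding its output to this process, and returns the exit code.
    static int run(List<String> command) throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            // Without arguments, only show the command that would be executed.
            System.out.println(String.join(" ", buildMagickCommand("scan.png", "clean.png", 300)));
            return;
        }
        // Clean the image first; run OCR only if ImageMagick succeeded.
        if (run(buildMagickCommand(args[0], args[1], 300)) == 0) {
            run(buildTesseractCommand(args[1], "eng"));
        }
    }
}
```

Running the class with an input and an output path cleans the image and then prints the OCR result. Keeping the command construction separate from execution makes it easy to try different option combinations on a problem image.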

Font Recognition?

  • Install Tesseract on the machine
  • Download and install jTessBoxEditor
  • Identify the font in the image and install it on the system
  • Open jTessBoxEditor, choose the needed font, and type in a sentence containing all the needed characters.
  • Clicking Generate creates .box and .tif files.
  • Now update the font name in the training script and run it with the command below
  • python tesseract-trainer.py
  • Once the Python script runs successfully, it generates a number of files and adds them to the Tesseract installation. You still need to copy them into the tessdata folder in your project.
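
Before training, it can help to confirm that the generated .box file really covers every character you need. The sketch below is an illustrative check, not part of the author's workflow: it assumes the standard Tesseract box-file layout, where each line reads "symbol left bottom right top page", and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; assumes box-file lines of the form
// "<symbol> <left> <bottom> <right> <top> <page>".
class BoxFileCoverageSketch {

    // Collects the distinct symbols (first token of each non-empty line).
    static Set<String> symbolsIn(List<String> boxLines) {
        Set<String> symbols = new LinkedHashSet<>();
        for (String line : boxLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            symbols.add(trimmed.split("\\s+")[0]);
        }
        return symbols;
    }

    // Returns the non-whitespace characters of 'required' that have no box entry.
    static Set<String> missingSymbols(String required, List<String> boxLines) {
        Set<String> present = symbolsIn(boxLines);
        Set<String> missing = new LinkedHashSet<>();
        required.codePoints()
            .filter(cp -> !Character.isWhitespace(cp))
            .mapToObj(cp -> new String(Character.toChars(cp)))
            .filter(symbol -> !present.contains(symbol))
            .forEach(missing::add);
        return missing;
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for a real .box file.
        List<String> sample = Arrays.asList("A 10 20 30 50 0", "b 32 20 48 44 0");
        System.out.println(missingSymbols("Ab1", sample)); // prints [1]
    }
}
```

For a real file, the lines could be loaded with Files.readAllLines on the .box path and passed to missingSymbols together with the characters you expect to recognize.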

Balaaji Parthasarathy