Build your own OCR (Optical Character Recognition) for free

Some popular OCR APIs

Open Source Frameworks:

How to set it up and get it to work?

  • Gradle dependency
  • Tesseract JavaCPP Preset: the Java wrapper for Tesseract, built on the JavaCPP framework.
  • Leptonica: a dependency of Tesseract that provides support for several image formats. It also extracts position and page layout information.
  • JMagick: the Java interface to the ImageMagick C API.
  • Im4Java: a Java wrapper for ImageMagick. It runs ImageMagick command-line commands through Java's ProcessBuilder.
  • brew install imagemagick
  • brew info imagemagick: run this command to confirm that the installation was successful.
  • For Tesseract to work at its best, the image has to be as clear as possible.
  • That may mean modifying the image first: resizing, changing the colorspace, adjusting contrast, applying morphology or a filter (Gaussian, Triangle, Spline, etc.), or running edge detection.
  • For this reason we will use JMagick, which offers a slew of functions that call ImageMagick under the hood to perform image modifications.
  • Here are some useful links for performing image modifications
  • Below are sample images showing the original and how it needs to look for Tesseract to understand it and perform OCR.
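The preprocessing step described above can be sketched as follows. This is a minimal illustration, not the article's actual code: it builds an ImageMagick command line and runs it with ProcessBuilder, which is the same approach Im4Java takes. The class name `OcrPreprocess`, the file names, and the chosen operations (upscale, grayscale, normalize, sharpen) are assumptions for illustration, and it assumes the ImageMagick 7 `magick` binary is on the PATH.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds and runs an ImageMagick command for OCR preprocessing. */
final class OcrPreprocess {

    private OcrPreprocess() {}

    /**
     * Assembles the argument list only; nothing is executed here.
     * scalePercent is the resize factor in percent (for example 200 doubles the size).
     */
    static List<String> buildCommand(String input, String output, int scalePercent) {
        if (scalePercent <= 0) {
            throw new IllegalArgumentException("scalePercent must be positive");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("magick");            // assumes ImageMagick 7 is installed
        cmd.add(input);
        cmd.add("-resize");
        cmd.add(scalePercent + "%");  // enlarge small text
        cmd.add("-colorspace");
        cmd.add("Gray");              // drop color information
        cmd.add("-normalize");        // stretch contrast
        cmd.add("-sharpen");
        cmd.add("0x1");               // mild sharpening (radius x sigma)
        cmd.add(output);
        return cmd;
    }

    /** Runs the command and returns the process exit code (0 means success). */
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        return process.waitFor();
    }
}
```

Calling `OcrPreprocess.run(OcrPreprocess.buildCommand("scan.png", "clean.png", 200))` would write a cleaned copy of the image. The right combination of operations depends on the source image, so treat these settings as a starting point to tune.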

Font Recognition?

  • Install Tesseract on the machine
  • Download and install jTessBoxEditor
  • Identify the font in the image and install it on the system
  • Open jTessBoxEditor, choose the needed font, and type in a sentence containing all the needed characters.
  • Clicking Generate creates the .box and .tif files.
  • Now update the font name in the code below and run the Python script using the command below
  • python
  • Once the Python script runs successfully, it generates a slew of files and adds them to the Tesseract installation. You still need to copy them into the tessdata folder in your project.
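The final copy step can be automated. The sketch below is one possible way to do it, under stated assumptions: it copies only files ending in `.traineddata` (the extension Tesseract uses for trained language data), and the class name `TessdataInstaller` and both directory parameters are hypothetical. Adjust the glob pattern if your training run produces other files you need.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: copies trained data files into a project's tessdata folder. */
final class TessdataInstaller {

    private TessdataInstaller() {}

    /**
     * Copies every *.traineddata file from generatedDir into tessdataDir,
     * creating tessdataDir if needed and overwriting files with the same name.
     * Returns the paths of the copied files.
     */
    static List<Path> copyTrainedData(Path generatedDir, Path tessdataDir) throws IOException {
        Files.createDirectories(tessdataDir);
        List<Path> copied = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(generatedDir, "*.traineddata")) {
            for (Path source : stream) {
                Path target = tessdataDir.resolve(source.getFileName());
                Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
                copied.add(target);
            }
        }
        return copied;
    }
}
```

A call might look like `TessdataInstaller.copyTrainedData(Paths.get("/path/to/generated"), Paths.get("src/main/resources/tessdata"))`, where both paths are placeholders for your own locations.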




Balaaji Parthasarathy
