Tesseract OCR implementation in .NET Core & Spring Boot
My Purpose:
This article shows how to implement Tesseract OCR with .NET Core and with Spring Boot. Both projects are proofs of concept, written without any high-level architecture or software design patterns, so they can quickly demonstrate the core of a Tesseract OCR implementation. For that reason I chose two enterprise platforms, .NET Core and Java, and wrote both projects as REST APIs. That is enough introduction. Let’s begin ↩
What is Tesseract OCR (Optical Character Recognition)?
Tesseract OCR is an open-source OCR engine. From 2006 to 2018 its development was led by Google.🤙
Basically, this technology recognises text inside images, such as scanned photos, documents, screenshots and PDFs. OCR converts virtually any kind of image containing scanned, handwritten or photographed text into machine-readable text data.
History
Tesseract was originally developed at Hewlett-Packard Laboratories Bristol and at Hewlett-Packard Co, Greeley Colorado between 1985 and 1994, with some more changes made in 1996 to port to Windows, and some C++izing in 1998. In 2005 Tesseract was open sourced by HP.
Tesseract has unicode (UTF-8) support, and can recognize more than 100 languages “out of the box”.
Tesseract supports various output formats: plain text, hOCR (HTML), PDF, invisible-text-only PDF, TSV. The master branch also has experimental support for ALTO (XML) output.
( Reference: https://github.com/tesseract-ocr/tesseract#brief-history )
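To make the output formats above concrete, here is a minimal Java sketch that assembles a `tesseract` command line for each of them. It assumes the `tesseract` binary is installed and on the PATH, and that the config names `txt`, `hocr`, `pdf`, `tsv` and `alto` are available in the installed version (`alto` needs a recent release). The class `TesseractCommand` and its `Format` enum are hypothetical helpers for illustration; they are not part of Tesseract or of the projects in this article.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds a `tesseract` CLI invocation for one output format. */
final class TesseractCommand {

    /** Output formats mapped to the config names passed as the last CLI argument. */
    enum Format {
        TEXT("txt"), HOCR("hocr"), PDF("pdf"), TSV("tsv"), ALTO("alto");

        final String configName;

        Format(String configName) {
            this.configName = configName;
        }
    }

    /**
     * Returns: tesseract <image> <outputBase> -l <language> <config>
     * Tesseract appends the file extension to outputBase itself.
     */
    static List<String> build(String image, String outputBase, String language, Format format) {
        List<String> cmd = new ArrayList<>();
        cmd.add("tesseract");
        cmd.add(image);
        cmd.add(outputBase);
        cmd.add("-l");          // language of the text in the image, e.g. "eng"
        cmd.add(language);
        cmd.add(format.configName);
        return cmd;
    }
}
```

The resulting list can be handed to `new ProcessBuilder(cmd).inheritIO().start()`. For example, `TesseractCommand.build("scan.png", "out", "eng", TesseractCommand.Format.HOCR)` describes a run that would write its result to `out.hocr`.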
🚀 .NET CORE IMPLEMENTATION
🔗 Dependencies
System.Reflection.Emit - Version=4.6.0
Tesseract - Version=3.3.0
📌 Tesseract OCR implementation code block in .NET Core
📌 Input Image:
📌 Result:
🍃 SPRING BOOT IMPLEMENTATION
🔗 Dependencies
net.sourceforge.tess4j - Version=3.4.0 (pom.xml)
Java - Version=1.8
📌 Tesseract OCR implementation code block in Spring Boot
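As a rough illustration of the OCR step behind such a REST endpoint (this is a sketch, not the project's actual code), the service below stages the uploaded bytes in a temporary file, hands the file to an OCR engine and cleans up afterwards. `OcrEngine` and `OcrService` are hypothetical names introduced here. The engine sits behind an interface so the sketch compiles without native libraries; the comment at the bottom shows how Tess4J's `Tesseract.doOCR(File)` could be plugged in, assuming the traineddata files live in a `tessdata` folder.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical seam around the OCR engine. */
interface OcrEngine {
    String recognize(File image);
}

/** Hypothetical service that a REST controller could delegate to. */
final class OcrService {
    private final OcrEngine engine;

    OcrService(OcrEngine engine) {
        this.engine = engine;
    }

    /** Stages the bytes in a temp file, runs OCR on it, and removes the file again. */
    String extractText(byte[] imageBytes, String suffix) {
        if (imageBytes == null || imageBytes.length == 0) {
            throw new IllegalArgumentException("image must not be empty");
        }
        Path tmp = null;
        try {
            tmp = Files.createTempFile("ocr-", suffix);
            Files.write(tmp, imageBytes);
            return engine.recognize(tmp.toFile()).trim();
        } catch (IOException e) {
            throw new UncheckedIOException("could not stage image for OCR", e);
        } finally {
            if (tmp != null) {
                try {
                    Files.deleteIfExists(tmp);
                } catch (IOException ignored) {
                    // best-effort cleanup of the temp file
                }
            }
        }
    }
}

// With Tess4J on the classpath, an engine could be wired like this (assumed setup, not compiled here):
//   ITesseract t = new Tesseract();
//   t.setDatapath("tessdata");   // assumed folder containing eng.traineddata
//   t.setLanguage("eng");
//   OcrEngine engine = image -> {
//       try { return t.doOCR(image); }
//       catch (TesseractException e) { throw new IllegalStateException(e); }
//   };
```

In a Spring Boot controller, the bytes would typically come from `MultipartFile.getBytes()` and the returned string would be sent back as the response body.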
📌 Input Image:
📌 Result:
Some Alternatives to Tesseract OCR
- Google Cloud Vision
- IronOCR
References — Additional Resources
Postman collection link:
GitHub source code repositories:
🚀 .NET Core (POC) - .NET:
🍃 Spring Boot (POC) - Java:
Official Resources:
I hope you enjoyed it!
Thank you for reading. If you liked the article, please press the clap button 👏