PDF OCR

Published in

NanoNets

1 min read · Aug 4, 2021

PDF Optical Character Recognition (OCR) is the process of converting PDFs of scanned or handwritten text into machine-encoded text, so that programs can process and analyze it further.

Modern OCR systems use neural networks, which are loosely modeled on the way human brains learn. In deep-learning-based OCR, two kinds of neural networks are commonly applied.

Convolutional Neural Networks (CNNs): CNNs are among the most widely used families of networks today, particularly in computer vision. A CNN comprises multiple convolutional kernels that slide across the image to extract features.
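To make the sliding-kernel idea concrete, here is a minimal sketch in plain Python. The function name `convolve2d`, the tiny 4x4 "image", and the hand-picked edge kernel are all illustrative assumptions; a real CNN learns its kernel values during training and uses an optimized library rather than nested loops.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    feature_map = []
    for top in range(img_h - k_h + 1):
        row = []
        for left in range(img_w - k_w + 1):
            # Multiply the kernel with the image patch under it and sum the products.
            total = 0
            for i in range(k_h):
                for j in range(k_w):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 4x4 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is largest where the kernel sits on the edge between the dark and bright columns, which is the sense in which a kernel "extracts" a feature such as a character stroke.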

Long Short-Term Memory networks (LSTMs): LSTMs are a family of networks applied mainly to sequential inputs. The intuition is simple: for any sequential data (e.g., weather, stock prices), new results may depend heavily on previous results, so it is beneficial to keep feeding previous results back in as part of the input features when making new predictions.
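The idea of carrying previous results forward can be sketched with a single scalar LSTM cell. Everything below is an illustrative assumption: the names `lstm_step` and `run_lstm`, the scalar (one-number) state, and the hand-picked weights in `WEIGHTS`. A trained OCR model would use learned weight matrices and vector-valued inputs.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical weights: gate name -> (input weight, hidden weight, bias).
WEIGHTS = {
    "forget": (0.5, 0.5, 0.0),
    "input": (0.5, 0.5, 0.0),
    "output": (0.5, 0.5, 0.0),
    "candidate": (0.5, 0.5, 0.0),
}


def lstm_step(x, h_prev, c_prev, weights=WEIGHTS):
    """Run one time step: combine the new input with the previous state."""
    def pre(gate):
        w_x, w_h, b = weights[gate]
        return w_x * x + w_h * h_prev + b

    forget_gate = sigmoid(pre("forget"))    # how much old memory to keep
    input_gate = sigmoid(pre("input"))      # how much new information to store
    output_gate = sigmoid(pre("output"))    # how much memory to expose
    candidate = math.tanh(pre("candidate")) # proposed new memory content

    c = forget_gate * c_prev + input_gate * candidate  # updated cell memory
    h = output_gate * math.tanh(c)                     # updated hidden state
    return h, c


def run_lstm(sequence, weights=WEIGHTS):
    """Process a sequence, feeding each step's state into the next step."""
    h, c = 0.0, 0.0
    outputs = []
    for x in sequence:
        h, c = lstm_step(x, h, c, weights)
        outputs.append(h)
    return outputs


# Same final input (0.5), different history: the final outputs differ.
print(run_lstm([1.0, 0.5])[-1])
print(run_lstm([0.0, 0.5])[-1])
```

The two calls end on the same input value but print different results, because the state carried over from the first step changes how the second step is interpreted.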


Written by Suresh Thiyagaraj