It’s not hard to understand why businesses want to use technology to handle their documents. Given the massive and growing volume of documents to process, machine help is inevitable. And machine analysis has shown greater efficiency in everything from processing medical records and insurance claims to detecting fraud in emails.
The success of any given document processing project, however, is far from preordained. Those who think of their documents simply as text may be caught off guard by a project’s difficulty and complexity.
For clarity, let’s define document analysis as analyzing and extracting information from digital documents that contain rich components such as text and graphs. The daunting challenge of building machines for this task spans many disciplines, including database systems, image processing, natural language processing, pattern recognition, and machine learning. …