It wouldn’t be wrong to say that the last couple of years have been a huge revelation about what Artificial Intelligence can and cannot do at the moment. While most of us are growing more interested in the real possibilities of this technology, we are also surfacing challenges in the process. So, we would say that 2019 is going to be a year of major retrospection for the following people:
a) Those who have already implemented AI Solutions in their organizations
b) Those who are considering adopting them this year
c) Those who have become aware enough to tell the practical solutions apart from the vast majority available in the market today.
Now, if you identify with one of the above-mentioned scenarios, then sit back and read on. This article is the result of speaking to different stakeholders (ranging from CXOs to Technology Managers and decision-making management teams).
To make things interesting (and worth your while), we have broken the article into the following three phases. Each of them forms part of the agenda here and speaks to one of the three templates of people we just established (metaphors apart, the solution we will talk about works for any given template).
a) The technology that has been masquerading as AI gets exposed (a.k.a. OCR)
What is OCR? It stands for Optical Character Recognition, and it does literally what the name suggests: it recognizes the characters of printed or written text. Its most popular application has been data processing.
Let’s call the original idea Traditional OCR. It falls short because it fails to understand the three most crucial aspects of data processing:
1) Formatting
2) Content
3) Context
To understand this further, let’s take the example given below:
Here, we have the profile of a person with some of their related information. In this example, the labels would be something like:
Name, Date of Birth, Gender, Location.
Now, here
i) the information corresponding to these labels is the content,
ii) the specific way/sequence in which the content is represented is the formatting (e.g. a date could be DD/MM/YYYY or MM/DD/YYYY),
iii) what the labels signify about the displayed information is the context.
For the given data to be processed successfully, these three factors form the primary checklist. Traditional OCR captures the data blindly and introduces errors while processing it, which makes the subsequent steps difficult.
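The difference between content, formatting and context can be made concrete with a minimal Python sketch. Everything in it is an assumption made for illustration: the sample profile is invented, and `interpret_field` is a hypothetical helper, not part of any OCR product. It only shows that the same captured characters yield different dates depending on the assumed formatting, and that the label (context) decides how the content should be read.

```python
from datetime import datetime

# Hypothetical profile, invented for illustration; not taken from a real document.
profile = {
    "Name": "Jane Doe",
    "Date of Birth": "03/04/1990",
    "Gender": "Female",
    "Location": "Mumbai",
}


def interpret_field(label: str, content: str, date_format: str = "%d/%m/%Y"):
    """Interpret `content` using its label (context) and a date format (formatting)."""
    if label == "Date of Birth":
        # The label tells us this content is a date; the format tells us how to read it.
        return datetime.strptime(content, date_format).date()
    # Other fields are kept as plain text in this sketch.
    return content.strip()


# The same characters, read under two different formatting assumptions.
day_first = interpret_field("Date of Birth", profile["Date of Birth"], "%d/%m/%Y")
month_first = interpret_field("Date of Birth", profile["Date of Birth"], "%m/%d/%Y")

print(day_first)    # 1990-04-03
print(month_first)  # 1990-03-04
```

A system that captures only the characters "03/04/1990" has no way to choose between these two readings, which is why formatting and context belong on the checklist alongside content.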
For the given example, the output processed through the traditional OCR method might look something like the image here.
This is, of course, merely representative of the many kinds of faulty output generated by traditional OCR. Individual cases may vary in how severe their errors are, but you can pretty much be assured of an output of a similar nature. Once you’ve made that assumption, imagine a pile of documents waiting to be processed. What would you ideally bet on to get the processing done?
To ensure a better data processing experience, it then became important to work around this challenge.
However, after speaking to a number of business owners who had previously explored a range of data processing solutions, we realized that most of the existing solutions merely attempted to improve the results and ended up as slightly enhanced, but still traditional, OCR. Some of them later joined the AI buzz wagon to repackage themselves. While there may still be solutions out there that claim to have an AI-powered approach, they do not yet match the required standards.
From the responses, we could clearly gather that there has been a gap in expectations. But to understand this gap clearly, let us consider the case of AI-enabled OCR.
b) Know more about its replacement and why you should care (a.k.a. AI-enabled OCR)
AI-enabled OCR performs where traditional OCR fails. Apart from handling the primary factors mentioned earlier (Formatting, Content & Context), what makes Cerescope D-tect (which is built around this idea) stand out is its spatial understanding of a given document. This means it understands what to look for in the document and where exactly to find it, even if the position changes, much as a human would. As a result, it works irrespective of the template and delivers a more accurate result.
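To illustrate what template independence can mean in practice, here is a deliberately simple Python sketch. It is not how Cerescope D-tect works internally (the article does not describe that); it is a generic, rule-based stand-in that assumes an OCR step has already produced words with page coordinates. The `Word` class, the `value_right_of` helper, the tolerance value and both sample layouts are hypothetical. The idea is to find a label wherever it appears and read the value next to it, instead of reading a fixed position on the page.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the word on the page
    y: float  # vertical position of the word's line


def value_right_of(words, label, line_tolerance=5.0):
    """Return the text to the right of `label` on the same line, or None."""
    anchors = [w for w in words if w.text == label]
    if not anchors:
        return None
    anchor = anchors[0]
    # Keep words on (roughly) the same line that start to the right of the label.
    same_line = [
        w for w in words
        if abs(w.y - anchor.y) <= line_tolerance and w.x > anchor.x
    ]
    same_line.sort(key=lambda w: w.x)
    return " ".join(w.text for w in same_line) or None


# Two invented layouts holding the same information in different places.
layout_a = [
    Word("Name:", 10, 20), Word("Jane", 60, 20), Word("Doe", 95, 20),
    Word("Location:", 10, 40), Word("Mumbai", 80, 40),
]
layout_b = [
    Word("Location:", 200, 300), Word("Mumbai", 270, 301),
    Word("Name:", 200, 340), Word("Jane", 250, 339), Word("Doe", 285, 340),
]

print(value_right_of(layout_a, "Name:"))      # Jane Doe
print(value_right_of(layout_b, "Name:"))      # Jane Doe
print(value_right_of(layout_b, "Location:"))  # Mumbai
```

A fixed-coordinate reader tuned to the first layout would return nothing useful on the second one, whereas the label-anchored lookup gives the same answer for both. A learned model would go well beyond this hand-written rule, but the contrast with position-bound extraction is the same.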
Now, while OCR may no longer be accepted as a feasible solution, we’d like to see it from a different perspective: one in which the ‘C’ in OCR stands for Concepts, going beyond Characters. This is an extremely important approach when it comes to processing documents regardless of templates, layouts and so on. Popular use cases include invoice processing and insurance claims processing across industries, as they present varying levels of complexity arising from the domain and from the way the information is represented. This makes the application of AI-enabled OCR even more important.
c) It’s about leading a ‘Productivity Revolution’ (a.k.a. Effective Business Growth)
It is no news that AI as a buzzword has been floating around for quite some time. Companies want to make sure they are in some way associated with the term, irrespective of whether the technology is available or not. This approach has led to a huge gap in expectations. We believe that AI, to truly deliver value, must be Clever, Actionable, Measurable & Practical. We call it the CAMP AI approach.
This approach truly enables productivity in organizations, which is something that drives us. It is also what drives the technology that we have developed. It matters because we believe process automation is just the beginning, and the foundation for a true digital transformation requires a partner that shares the vision for productivity.
If you have any questions about how AI-enabled OCR can make a difference in your company, feel free to reach out or write to us at contact@cerelabs.com.
~ Written by Amyth Banerjee (AI Prospector, Cere Labs)