Let’s extract information from a damaged image using the AWS Textract and Azure Form Recognizer OCR APIs in JavaScript.

Raveendhar S
3 min read · Mar 30, 2024


Two months ago, I was given a task to extract information from an old, damaged picture that contained forms, tables, and handwritten text. I started exploring Optical Character Recognition (OCR) and found two popular services provided by AWS and Azure: AWS Textract and Azure Form Recognizer.

AWS Textract: automatically extracts printed text, handwriting, layout elements, and data from any document. Learn more

Azure Form Recognizer: a cloud service that uses machine learning to analyse text and structured data from your documents. Learn more

Let’s start with how to implement them using JavaScript and npm packages.

> npm init -y // to initialize npm
> npm install @aws-sdk/client-textract // to install aws textract client
> npm install @azure/ai-form-recognizer // to install azure recognizer client
We’ll use this unclear image to perform our OCR.

AWS Textract implementation:

import { TextractClient, AnalyzeDocumentCommand } from "@aws-sdk/client-textract";
import fs from "fs";

const awsTextractClient = new TextractClient({
  region: process.env.AWSREGION, // your AWS region
  credentials: {
    accessKeyId: process.env.AWSACCESSKEY, // your AWS access key
    secretAccessKey: process.env.AWSSECRETKEY // your AWS secret key
  }
});

const performAwsOcr = async () => {
  try {
    // readFileSync already returns a Buffer, so it can be passed to Textract directly
    const imageBuffer = fs.readFileSync("./sample.png");
    const params = {
      Document: {
        Bytes: imageBuffer,
      },
      FeatureTypes: ["TABLES"], // https://aws.amazon.com/textract/features/
    };
    const ocrData = await awsTextractClient.send(new AnalyzeDocumentCommand(params));
    return ocrData;
  } catch (error) {
    console.log(error.message);
  }
};

// top-level await works here because the file is an ES module
const extractedData = await performAwsOcr();
console.log(extractedData);
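
The raw AnalyzeDocument response is a flat list of Block objects rather than readable text. As a rough sketch, you could flatten it into plain lines with their confidence scores like this (Blocks, BlockType, Text, and Confidence are the documented response fields; extractLines is just a helper name I made up):

// Pull out the detected LINE blocks with their confidence scores (0–100).
const extractLines = (ocrData) => {
  if (!ocrData || !Array.isArray(ocrData.Blocks)) return [];
  return ocrData.Blocks
    .filter((block) => block.BlockType === "LINE")
    .map((block) => ({ text: block.Text, confidence: block.Confidence }));
};

extractLines(extractedData).forEach(({ text, confidence }) =>
  console.log(`${confidence.toFixed(1)}% ${text}`)
);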

Result:

As the raw result isn’t very human-readable, I’ve included the Excel version of the data extracted from the sample image.

Sheet 1 extracted from the image (AWS)
Sheet 2 extracted from the image (AWS)

Pros and cons of AWS Textract:

Pros:

  • The confidence scores are very high, mostly falling between 80% and 97%.
  • Relationships are well established among the data, which makes extraction easier.

Cons:

  • Detection of multiple data formats is less accurate; almost everything is treated as a table (see the FeatureTypes variation below).
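
If tables alone aren’t enough, AnalyzeDocument can be asked for several feature types in one call. A minimal variation on the earlier params object (TABLES and FORMS are both documented Textract feature types) might look like this:

// Request form key-value pairs alongside tables in the same call.
const params = {
  Document: {
    Bytes: imageBuffer, // same buffer read from "./sample.png" above
  },
  FeatureTypes: ["TABLES", "FORMS"],
};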

Azure Form Recognizer implementation:

import { DocumentAnalysisClient, AzureKeyCredential } from "@azure/ai-form-recognizer";
import fs from "fs";

const azureRecognizerClient = new DocumentAnalysisClient(
  process.env.AZUREENDPOINT, // your Azure endpoint
  new AzureKeyCredential(process.env.AZURESECRETKEY) // your Azure secret key
);

const performAzureOcr = async () => {
  try {
    const imageBuffer = fs.readFileSync("./sample.png");
    // "prebuilt-document" is Azure's general-purpose prebuilt model
    const poller = await azureRecognizerClient.beginAnalyzeDocument("prebuilt-document", imageBuffer);
    const ocrData = await poller.pollUntilDone();
    return ocrData;
  } catch (error) {
    console.log(error.message);
  }
};

const extractedData = await performAzureOcr();
console.log(extractedData);
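
Like Textract, Form Recognizer returns a structured object rather than plain text. A small sketch for walking the result (keyValuePairs and tables are part of the documented AnalyzeResult shape for the prebuilt-document model; the printing logic is just illustrative):

// Print detected key-value pairs, then every table cell with its position.
const printAzureResult = (ocrData) => {
  if (!ocrData) return;
  for (const pair of ocrData.keyValuePairs ?? []) {
    console.log(`${pair.key?.content ?? ""}: ${pair.value?.content ?? ""}`);
  }
  for (const table of ocrData.tables ?? []) {
    for (const cell of table.cells) {
      console.log(`[row ${cell.rowIndex}, col ${cell.columnIndex}] ${cell.content}`);
    }
  }
};

printAzureResult(extractedData);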

Result:

Sheet 1 extracted from the image (Azure)
Sheet 2 extracted from the image (Azure)

Pros and cons of Azure Form Recognizer AI:

Pros:

  • Pre-defined models are available (selected by model ID, as sketched after this list), which cover almost all kinds of document formats.
  • Detects multiple data formats rather than returning only tables.

Cons:

  • Relationships are established weakly, which may lead to incorrect arrangements of the data.
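
The pre-defined models mentioned above are selected by the model ID passed to beginAnalyzeDocument. For example, swapping "prebuilt-document" for "prebuilt-layout" (both are documented prebuilt model IDs) focuses the analysis on layout elements such as tables and paragraphs; this sketch reuses the client from the earlier snippet:

// Same client as before; only the prebuilt model ID changes.
const imageBuffer = fs.readFileSync("./sample.png");
const poller = await azureRecognizerClient.beginAnalyzeDocument(
  "prebuilt-layout", // other options include "prebuilt-read" and "prebuilt-invoice"
  imageBuffer
);
const layoutData = await poller.pollUntilDone();
console.log(layoutData.tables);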

Conclusion:

Having tested these modules on numerous images, I found their prediction rates quite close; which one to choose depends entirely on your needs.

Note: We ended up using AWS Textract, as our image mostly contained the kinds of feature-type data it handles well. 🎉🎉🎉…

You can find the entire implementation in my GitHub repo. Link
