Amazon Textract Implementation in .NET

Sajil Prasad
Published in intertoons
Mar 2, 2023

The problem:
Can we extract specific fields, such as a name or an email address, from a document in image or PDF format?

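Answer:
Yes, we can, thanks to Team Intertoons, who worked this out in a short brainstorming session. The approach uses the Queries feature of Amazon Textract: you ask questions about the document in plain language and Textract returns the matching text.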

Follow the steps below to get the required output.

Step 1:
Install the required SDK packages from NuGet in your Visual Studio project. We need the following packages:
1. AWSSDK.Core
2. AWSSDK.Textract
3. AWSSDK.S3

Step 2:
Create an async function in C# as follows:

using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using System.Web;
using System.Web.Hosting;
using Amazon;
using Amazon.Textract;
using Amazon.Textract.Model;

// Place this method inside a class in an ASP.NET (System.Web) project.
private static async Task ResumeQueryAsync()
{
    // Create the Textract client for the ap-south-1 region
    using (var textractClient = new AmazonTextractClient(RegionEndpoint.APSouth1))
    using (var memoryStream = new MemoryStream())
    {
        // Read the local PDF file into memory; the file handle is released right after the copy
        using (var fileStream = new FileStream(HostingEnvironment.MapPath("~/resume_002.pdf"), FileMode.Open, FileAccess.Read))
        {
            await fileStream.CopyToAsync(memoryStream);
        }

        // One query per field we want to extract
        var queries = new List<Query>
        {
            new Query { Text = "What is the name" },
            new Query { Text = "What is the email id" },
            new Query { Text = "What is the Phone Number" }
        };

        var analyzeDocumentRequest = new AnalyzeDocumentRequest
        {
            Document = new Document { Bytes = memoryStream },
            FeatureTypes = new List<string> { "QUERIES" },
            QueriesConfig = new QueriesConfig
            {
                Queries = queries // pass all queries
            }
        };

        var analyzeDocumentResponse = await textractClient.AnalyzeDocumentAsync(analyzeDocumentRequest);
        var response = HttpContext.Current.Response;
        foreach (var block in analyzeDocumentResponse.Blocks)
        {
            if (block.BlockType == BlockType.QUERY)
            {
                // Write the question that was asked
                response.Write(HttpUtility.HtmlEncode(block.Query.Text) + " : ");
            }
            else if (block.BlockType == BlockType.QUERY_RESULT)
            {
                // Write the answer Textract found, HTML-encoded because it comes from the document
                response.Write(HttpUtility.HtmlEncode(block.Text));
                response.Write("<br/>");
            }
        }
    }
}
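
The loop above relies on each QUERY_RESULT block following its QUERY block in the response. Textract also links the two explicitly: a QUERY block carries a relationship of type ANSWER whose Ids point to its QUERY_RESULT blocks. The following is a minimal sketch of pairing them through that link instead of relying on order. The class QueryAnswerPairing and the method PairQueriesWithAnswers are hypothetical names introduced for illustration; they are not part of the AWS SDK.

```csharp
using System.Collections.Generic;
using System.Linq;
using Amazon.Textract;
using Amazon.Textract.Model;

public static class QueryAnswerPairing
{
    // Hypothetical helper: maps each query text to the text of its first answer,
    // or to null when Textract returned no answer for that query.
    public static Dictionary<string, string> PairQueriesWithAnswers(IEnumerable<Block> blocks)
    {
        var blockList = blocks.ToList();

        // Index every QUERY_RESULT block by its Id so answers can be looked up directly.
        var resultsById = blockList
            .Where(b => b.BlockType == BlockType.QUERY_RESULT)
            .ToDictionary(b => b.Id, b => b.Text);

        var answers = new Dictionary<string, string>();
        foreach (var query in blockList.Where(b => b.BlockType == BlockType.QUERY))
        {
            // Collect the Ids referenced by ANSWER relationships (either list may be null).
            var answerIds = (query.Relationships ?? new List<Relationship>())
                .Where(r => r.Type == RelationshipType.ANSWER)
                .SelectMany(r => r.Ids ?? new List<string>());

            string answer = null;
            foreach (var id in answerIds)
            {
                if (resultsById.TryGetValue(id, out var text))
                {
                    answer = text;
                    break; // keep the first answer only
                }
            }

            answers[query.Query.Text] = answer;
        }

        return answers;
    }
}
```

Under these assumptions, calling QueryAnswerPairing.PairQueriesWithAnswers(analyzeDocumentResponse.Blocks) gives a dictionary keyed by the question text, which also makes unanswered queries easy to spot because their value is null.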

Please note that you need an AWS access key ID and secret access key, which you can create in the IAM user management section of the AWS console.
Save these keys in the C:\Users\<username>\.aws\credentials file in the following structure:

[default]
aws_access_key_id=xxxxxxxxxxxxxxxxxxxxxxxxxx
aws_secret_access_key=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
region=ap-south-1
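
If the profile is missing or misnamed, the failure only shows up when the first Textract call is made. As a rough sketch, the profile can be resolved explicitly before the client is created, using the SDK's CredentialProfileStoreChain. The class TextractClientFactory and the method CreateFromProfile are hypothetical names used for illustration only.

```csharp
using System;
using Amazon;
using Amazon.Runtime;
using Amazon.Runtime.CredentialManagement;
using Amazon.Textract;

public static class TextractClientFactory
{
    // Hypothetical helper: builds a Textract client from a named profile and
    // fails early with a readable message when the profile cannot be found.
    public static AmazonTextractClient CreateFromProfile(string profileName = "default")
    {
        // Searches the SDK's profile stores, including the shared .aws\credentials file.
        var chain = new CredentialProfileStoreChain();

        if (!chain.TryGetAWSCredentials(profileName, out AWSCredentials credentials))
        {
            throw new InvalidOperationException(
                $"AWS profile '{profileName}' was not found. Check the .aws\\credentials file.");
        }

        // Same region as the rest of the article.
        return new AmazonTextractClient(credentials, RegionEndpoint.APSouth1);
    }
}
```

The returned client can be used in place of new AmazonTextractClient(RegionEndpoint.APSouth1) in the function from Step 2.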

That's it!

Each query and its result are written to the HTTP response, one pair per line.
