Document AI: How to get started?

Gerard Samuel
Google Cloud - Community
4 min read · Oct 19, 2023
Photo by Wonderlane on Unsplash

Document AI is a Google Cloud solution that extracts structured data from unstructured or semi-structured documents.
The extracted data can then be treated as a first-class data source and analyzed alongside your other data to gain deeper insight from your “dark” document data.
In this first post on Document AI, I will go over the initial steps to get started.

Requirements

In this post, I am assuming that you have an account and some basic knowledge of Google Cloud’s console.

Log into Google Cloud’s console.
Follow this link to create a project. [Documentation]
1. Give the project a name
2. Choose your billing account

Follow this link to enable the Document AI API.

Note: In this screenshot, I have already enabled it. Here is where the API can be enabled or disabled.

Enabling (or disabling) the Document AI API
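
If you would rather script this step than click through the console, here is a minimal sketch in Python. It assumes the google-cloud-service-usage package is installed and that application default credentials are configured; the project ID in the usage note is a placeholder, and the function names are my own for illustration.

```python
DOCUMENT_AI_SERVICE = "documentai.googleapis.com"


def service_resource_name(project_id: str, service: str = DOCUMENT_AI_SERVICE) -> str:
    """Build the resource name that the Service Usage API expects."""
    if not project_id:
        raise ValueError("project_id must not be empty")
    return f"projects/{project_id}/services/{service}"


def enable_document_ai(project_id: str) -> None:
    """Enable the Document AI API for a project (needs credentials and network)."""
    # Imported lazily so the helper above works without the client library.
    from google.cloud import service_usage_v1

    client = service_usage_v1.ServiceUsageClient()
    request = service_usage_v1.EnableServiceRequest(
        name=service_resource_name(project_id)
    )
    # enable_service returns a long-running operation; result() waits for it.
    client.enable_service(request=request).result()
```

Calling enable_document_ai("my-sample-project") should have the same effect as clicking Enable in the console, provided the caller is allowed to manage services on that project.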

Follow this link to create a Cloud Storage bucket to store documents.
Give it a name. There are bucket naming considerations to be aware of.
Click on Create to accept the defaults.
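
To make the naming considerations concrete, here is a small sketch that checks a candidate bucket name before you type it into the console. It covers only a simplified subset of the documented rules (length, allowed characters, and the reserved "goog" and "google" strings) and ignores the special cases for names that contain dots, so treat it as a pre-check rather than the full rule set. The function name is hypothetical.

```python
import re
from typing import List

# Lowercase letters, digits, dashes, underscores and dots,
# starting and ending with a letter or digit.
_ALLOWED = re.compile(r"[a-z0-9][a-z0-9._-]*[a-z0-9]")


def bucket_name_problems(name: str) -> List[str]:
    """Return a list of rule violations; an empty list means no problem was found."""
    problems = []
    if not 3 <= len(name) <= 63:
        problems.append("length must be 3-63 characters")
    if not _ALLOWED.fullmatch(name):
        problems.append("invalid characters or invalid first/last character")
    if name.startswith("goog"):
        problems.append("must not start with 'goog'")
    if "google" in name:
        problems.append("must not contain 'google'")
    return problems
```

For example, bucket_name_problems("docai-demo-bucket") returns an empty list, while a name such as "My Bucket" is flagged for its uppercase letters and the space.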

Go into the storage bucket that was just created and create two folders:

  • dataset
  • documents

Upload any test files that you may have into the documents folder.
Searching for sample documents? Try Kaggle.
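
If you have more than a handful of test files, uploading them from a script may be quicker. The sketch below assumes the google-cloud-storage package and application default credentials. In a standard bucket, a "folder" is simply a prefix in the object name, so uploading to documents/invoice.pdf is what places the file in the documents folder. The helper names and file paths are placeholders of mine.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def plan_uploads(local_files: Iterable[str], folder: str = "documents") -> List[Tuple[str, str]]:
    """Pair each local file with the object name it would get in the bucket."""
    prefix = folder.strip("/")
    return [(path, f"{prefix}/{Path(path).name}") for path in local_files]


def upload_documents(bucket_name: str, local_files: Iterable[str]) -> None:
    """Upload local files into the bucket's documents folder (needs credentials)."""
    # Imported lazily so plan_uploads can be tried without the client library.
    from google.cloud import storage

    bucket = storage.Client().bucket(bucket_name)
    for local_path, object_name in plan_uploads(local_files):
        bucket.blob(object_name).upload_from_filename(local_path)
```

Running plan_uploads on your file list first is a cheap way to preview the object names before anything is sent to Cloud Storage.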

Creating a custom processor

Once the files are uploaded, navigate to Document AI Workbench and create a new custom extractor.
Give the processor a name and choose an appropriate location. For storage, select the option to use your own storage and pick the dataset folder that was created earlier.
Once satisfied, click Create.

Creating a custom extractor in Document AI
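
For reference, a processor can also be created with the Document AI Python client. This is a rough sketch, assuming the google-cloud-documentai package and application default credentials. The processor type string "CUSTOM_EXTRACTION_PROCESSOR" is my assumption for the custom extractor, so verify it against the processor types available to your project. The sketch does not configure the dataset storage location, which I set in the console above.

```python
def documentai_endpoint(location: str) -> str:
    """Regional API endpoint, for example 'us' or 'eu'."""
    return f"{location}-documentai.googleapis.com"


def location_path(project_id: str, location: str) -> str:
    """Parent resource under which processors are created."""
    return f"projects/{project_id}/locations/{location}"


def create_custom_extractor(project_id: str, location: str, display_name: str) -> str:
    """Create a processor and return its resource name (needs credentials)."""
    # Imported lazily so the two helpers above work without the client library.
    from google.api_core.client_options import ClientOptions
    from google.cloud import documentai

    client = documentai.DocumentProcessorServiceClient(
        client_options=ClientOptions(api_endpoint=documentai_endpoint(location))
    )
    processor = client.create_processor(
        parent=location_path(project_id, location),
        processor=documentai.Processor(
            display_name=display_name,
            # Assumed type name for a custom extractor; check before relying on it.
            type_="CUSTOM_EXTRACTION_PROCESSOR",
        ),
    )
    return processor.name
```

The location passed here should match the one chosen in the console, because a processor lives in a single region and is addressed through that region's endpoint.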

Now I am going to import a sample document.

Importing a document

Click on “Get Started”.

Clicking “Get started” on the Customize card

Click on “Upload Sample Document”. Choose to import a file from Cloud Storage, choose the location where the documents reside, highlight one file and click “Import”.

Uploading a sample document

Document AI will analyze the document and present a view of it.
Once the document is displayed, I will proceed to create fields and label the document.

Labeling a document

In this tutorial, I am going to capture the invoice number, the receiver’s name and address, the invoice ID and date, the shipment method and the currency.

Click the “Create new field” button in the upper left.
Give the field a name and an appropriate data type, choose “Optional Once” in the Occurrence drop-down for now, and click Create.

Creating your first label
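
It can help to write the planned fields down before creating them one by one in the console. Below is a minimal sketch of such a plan as plain Python data. The field names and data type labels are my own placeholders, not values taken from the console, and the occurrence values mirror the four options Document AI offers (optional or required, once or multiple).

```python
from dataclasses import dataclass

OCCURRENCES = {
    "OPTIONAL_ONCE",
    "OPTIONAL_MULTIPLE",
    "REQUIRED_ONCE",
    "REQUIRED_MULTIPLE",
}


@dataclass(frozen=True)
class Field:
    name: str
    data_type: str
    occurrence: str = "OPTIONAL_ONCE"  # same default I chose in the console

    def __post_init__(self) -> None:
        if not self.name:
            raise ValueError("a field needs a name")
        if self.occurrence not in OCCURRENCES:
            raise ValueError(f"unknown occurrence: {self.occurrence}")


# Hypothetical names and types for the fields captured in this tutorial.
INVOICE_FIELDS = [
    Field("invoice_number", "Plain text"),
    Field("receiver_name", "Plain text"),
    Field("receiver_address", "Address"),
    Field("invoice_id", "Plain text"),
    Field("invoice_date", "Datetime"),
    Field("shipment_method", "Plain text"),
    Field("currency", "Currency"),
]
```

Keeping the list in code also makes it easy to spot duplicates or typos in field names before they end up in the processor's schema.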

Depending on the document and layout, Document AI’s Generative AI features may auto-label an area (highlighted in light purple) when a field has been created. Continue adding fields you want to capture.

Displaying initial document labeling state

In the example above, the invoice number was not automatically captured by Generative AI, so I need to manually draw a bounding box around it and assign it to the appropriate field. Once every field is labeled, click the “Mark As Labeled” button in the lower left. Document AI then stores the document along with the data I am extracting from it.

Manually capturing data on a document
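
Behind the scenes, a bounding box is a set of vertices on the page. Document AI's document format can express these as normalized coordinates between 0 and 1, which makes a box independent of the image resolution. The sketch below shows that conversion for a box drawn in pixels. The function name and the clockwise vertex order starting at the top left are assumptions for illustration, not something the console exposes.

```python
from typing import List, Tuple


def normalized_box(
    left: float, top: float, right: float, bottom: float,
    page_width: float, page_height: float,
) -> List[Tuple[float, float]]:
    """Convert a pixel box into four (x, y) vertices scaled to the 0..1 range."""
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page dimensions must be positive")
    if not (0 <= left < right <= page_width and 0 <= top < bottom <= page_height):
        raise ValueError("box must lie inside the page and have a positive area")
    x0, x1 = left / page_width, right / page_width
    y0, y1 = top / page_height, bottom / page_height
    # Clockwise from the top-left corner.
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
```

A box from (250, 125) to (500, 250) on a 1000 by 500 pixel page, for instance, maps to corners at (0.25, 0.25) and (0.5, 0.5).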

Conclusion

I went over how to enable the Document AI service, create a processor from Document AI Workbench, and import and label a first document.
We also saw how Generative AI can quickly recognize information in the document. This will come into play in my next post on Document AI, where we train a model with many documents.

Thanks!

The opinions stated here are my own, not necessarily those of my employer.
