Enhancing structured data extraction with new Box AI platform features

Published in

Box Developer Blog

4 min read · Sep 17, 2024

This article explores the new structured extract endpoint of the Box AI platform, which offers enhanced precision in defining the AI response format. This feature is especially valuable when you need to seamlessly send structured data to other systems.

This Box AI platform feature is currently in public beta, and only available to Box Enterprise Plus customers.

Back in June of 2024, we talked about how to extract structured data using the Box AI platform /ai/extract endpoint:

Extracting structured data using Box AI


Both the /ai/extract and the /ai/extract_structured endpoints provide a means to extract structured data from documents stored in Box.

However, with this release of the Box AI platform /ai/extract_structured endpoint, you can define an exact output format, ensuring that the extracted data meets your specifications. This feature eliminates ambiguity, allowing you to tailor the AI’s output to fit your needs perfectly, reducing post-processing work.

Beyond specifying a precise format, this new endpoint also allows you to set individual prompts for each field you’re looking to extract. This added layer of customization helps guide the AI to accurately identify and extract the exact data points you need, enhancing the overall accuracy and efficiency of your workflows.

Specifying fields

As an example, consider a set of purchase orders stored in Box. We want to extract information such as document type, vendor, date, and total from the set of documents.

Here is an example of how to extract this information:

curl --location 'https://api.box.com/2.0/ai/extract_structured' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer ...' \
--data '{
    "items": [
        {
            "id": "1517628697289",
            "type": "file"
        }
    ],
    "fields": [
        {
            "key": "document_type",
            "type": "enum",
            "prompt": "what type of document is this?",
            "options": [
                {
                    "key": "Invoice"
                },
                {
                    "key": "Purchase Order"
                },
                {
                    "key": "Unknown"
                }
            ]
        },
        {
            "key": "document_date",
            "type": "date"
        },
        {
            "key": "vendor",
            "description": "The name of the entity.",
            "prompt": "Which vendor is sending this document.",
            "type": "string"
        },
        {
            "key": "document_total",
            "type": "float"
        }
    ]
}'

Resulting in:

{
    "document_date": "2024-02-13",
    "vendor": "Quantum Quirks Co.",
    "document_total": 45,
    "document_type": "Purchase Order"
}
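The same request can also be assembled programmatically. What follows is a minimal Python sketch, using only the standard library, that mirrors the curl call above. The helper name `build_extract_request` and the `ACCESS_TOKEN` placeholder are illustrative assumptions, not part of any Box SDK, and the request is built but not sent.

```python
import json
import urllib.request

EXTRACT_URL = "https://api.box.com/2.0/ai/extract_structured"


def build_extract_request(file_id, fields, access_token):
    """Build (but do not send) a POST request for /ai/extract_structured.

    Hypothetical helper for illustration; it mirrors the curl example.
    """
    payload = {
        "items": [{"id": file_id, "type": "file"}],
        "fields": fields,
    }
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


# Field definitions copied from the curl example.
fields = [
    {
        "key": "document_type",
        "type": "enum",
        "prompt": "what type of document is this?",
        "options": [
            {"key": "Invoice"},
            {"key": "Purchase Order"},
            {"key": "Unknown"},
        ],
    },
    {"key": "document_date", "type": "date"},
    {
        "key": "vendor",
        "description": "The name of the entity.",
        "prompt": "Which vendor is sending this document.",
        "type": "string",
    },
    {"key": "document_total", "type": "float"},
]

# "ACCESS_TOKEN" is a placeholder; substitute a real token before sending.
request = build_extract_request("1517628697289", fields, "ACCESS_TOKEN")
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

Keeping the field definitions in a plain list makes it easy to reuse them later, for example to check the response against the types you asked for.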

Field options

You can set several options for each field, including:

key: A unique identifier for the field
type: The field type, such as string, float, date, or enum
prompt: Context about the key to help the AI find it in the document and know how to format it
options: Typically used with enums, containing the list of allowed key values
displayName: The display name of the field
description: A human-readable description of the field

It’s important to remember that working with large language models (LLMs) is not an exact science, and results can vary based on the context and input parameters. We encourage developers to experiment with the field options, prompts, and configurations to find what works best for their specific use cases. Developers should always consider that the AI responses may be inaccurate or contain unexpected data.
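Because responses can contain unexpected data, it may help to check each value against the field definition before passing it on. The sketch below is one possible approach, not a Box feature: `validate_extraction` is a hypothetical helper, and it assumes dates come back in ISO format (as in the sample response above) and that float fields come back as JSON numbers.

```python
from datetime import date


def validate_extraction(result, fields):
    """Return a list of problems found in an extraction result.

    Hypothetical helper: checks each value against its declared field type.
    """
    problems = []
    for field in fields:
        key = field["key"]
        ftype = field.get("type", "string")
        if key not in result or result[key] is None:
            problems.append(f"{key}: missing")
            continue
        value = result[key]
        if ftype == "enum":
            allowed = {opt["key"] for opt in field.get("options", [])}
            if value not in allowed:
                problems.append(f"{key}: {value!r} is not an allowed option")
        elif ftype == "date":
            # Assumption: dates are returned as ISO strings (YYYY-MM-DD).
            try:
                date.fromisoformat(str(value))
            except ValueError:
                problems.append(f"{key}: {value!r} is not an ISO date")
        elif ftype == "float":
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{key}: {value!r} is not a number")
    return problems


fields = [
    {
        "key": "document_type",
        "type": "enum",
        "options": [{"key": "Invoice"}, {"key": "Purchase Order"}, {"key": "Unknown"}],
    },
    {"key": "document_date", "type": "date"},
    {"key": "vendor", "type": "string"},
    {"key": "document_total", "type": "float"},
]

# Sample response from earlier in this article.
result = {
    "document_date": "2024-02-13",
    "vendor": "Quantum Quirks Co.",
    "document_total": 45,
    "document_type": "Purchase Order",
}

print(validate_extraction(result, fields))  # prints []
```

An empty list means every value matched its declared type; anything else can be logged or routed to manual review.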

Working with Box metadata

In another article, we discussed how to use Box AI to extract data from a document and then store it in a metadata template applied to that document.

Box AI-driven Metadata extraction


The endpoint used in that article has been deprecated and replaced by /ai/extract_structured.

Since the metadata template already has the field structure, all we need to do is pass a reference to the document and the metadata template to obtain the structured data.

For example, for the same purchase order:

curl --location 'https://api.box.com/2.0/ai/extract_structured' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer ...' \
--data '{
    "items": [
        {
            "id": "1517628697289",
            "type": "file"
        }
    ],
    "metadata_template": {
        "template_key": "rbInvoicePO",
        "type": "metadata_template",
        "scope": "enterprise_1134207681"
    }
}'

We get back:

{
    "documentDate": "February 13, 2024",
    "total": "$45",
    "documentType": "Purchase Order",
    "vendor": "Quantum Quirks Co.",
    "purchaseOrderNumber": "005"
}
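The two request shapes in this article differ only in whether they carry `fields` or `metadata_template`. As a rough sketch, a single Python helper could build either body. The name `build_payload` is hypothetical, and the rule that a request carries exactly one of the two is an assumption based on the examples here, so check the API reference before relying on it.

```python
import json


def build_payload(file_id, fields=None, metadata_template=None):
    """Build the JSON body for /ai/extract_structured.

    Hypothetical helper. Assumes a request carries either a list of
    fields or a metadata template reference, but not both.
    """
    if (fields is None) == (metadata_template is None):
        raise ValueError("pass exactly one of fields or metadata_template")
    payload = {"items": [{"id": file_id, "type": "file"}]}
    if fields is not None:
        payload["fields"] = fields
    else:
        payload["metadata_template"] = metadata_template
    return payload


# Template reference copied from the curl example above.
template = {
    "template_key": "rbInvoicePO",
    "type": "metadata_template",
    "scope": "enterprise_1134207681",
}

payload = build_payload("1517628697289", metadata_template=template)
print(json.dumps(payload, indent=4))
```

Raising early on an ambiguous call keeps a malformed body from ever reaching the API.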

We can then use this output to populate the metadata for this document.
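One way to do that is sketched below. It assumes the Box endpoint for creating a metadata instance on a file (`POST /files/{file_id}/metadata/{scope}/{template_key}`), and it uses a hypothetical helper name and an `ACCESS_TOKEN` placeholder. The extracted values are passed through unchanged; if the template declares typed fields (a float total or a date, for instance), values such as "$45" would likely need converting first.

```python
import json
import urllib.request


def build_metadata_request(file_id, template_key, values, access_token,
                           scope="enterprise"):
    """Build (but do not send) a request that applies metadata to a file.

    Hypothetical helper; the URL shape is an assumption to verify against
    the Box API reference.
    """
    url = (
        "https://api.box.com/2.0/files/"
        f"{file_id}/metadata/{scope}/{template_key}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(values).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


# Output of the metadata-template extraction shown above.
extracted = {
    "documentDate": "February 13, 2024",
    "total": "$45",
    "documentType": "Purchase Order",
    "vendor": "Quantum Quirks Co.",
    "purchaseOrderNumber": "005",
}

# "ACCESS_TOKEN" is a placeholder; substitute a real token before sending.
request = build_metadata_request(
    "1517628697289", "rbInvoicePO", extracted, "ACCESS_TOKEN"
)
# To send it: urllib.request.urlopen(request)
```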

Conclusion

The new /ai/extract_structured endpoint in the Box AI platform represents a significant enhancement for developers and businesses looking to integrate structured data extraction directly into their workflows.

By allowing users to define specific output formats and tailor individual prompts for each field, this feature delivers higher precision, minimizes ambiguity, and reduces post-processing efforts.

Whether you’re extracting data from invoices, purchase orders, contracts, or other document types, this endpoint helps keep the AI’s output format aligned with your needs.

Ultimately, this streamlined approach improves efficiency, making it easier to automate and scale document processing tasks across various applications.

Thoughts? Comments? Feedback?

Drop us a line in our community forum.