Box Developer Blog

News and stories for working with the Box APIs

Box AI Enhanced Extract Agent: A Developer’s Guide

Today, we at Box announced the new Enhanced Extract Agent, designed to tackle key-value pair extraction from the most complex documents with greater accuracy. It is available in the Box UI, where a single click transforms your unstructured data into meaningful Metadata attached to the file, allowing for quick insights and easier discovery with the Box Metadata search tool.

That said, we also realize that these files are often one data point in a larger workflow that you, as a developer, need to access easily. One of the top uses of Box AI for developers is turning unstructured data into structured data for production databases, third-party systems, or analytics. For that reason, we have made this agent available through the Box AI API.

How it works

The new Enhanced Extract Agent is powered by Gemini 2.5 Pro and employs chain-of-thought processing to not only provide the best answer, but also tell the developer the reason the Agent believes this is the correct answer.

For example, if you are extracting for the key “total amount,” the returned JSON might include both the total amount and a “reasoning” key with a value of “this amount is at the bottom of the list next to the word total.” This technique provides confidence to the developer, as well as the ability to implement a second step to let your LLM validate the reasoning to either accept the value or take further action.
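A second validation step like the one described above could start as simply as the sketch below. The response shape is an assumption for illustration only (each extracted key mapped to an object holding `value` and `reasoning`), so check the JSON your own request returns before relying on it. The `review_extraction` helper and its rule are hypothetical stand-ins for whatever check, LLM-based or otherwise, you decide to apply.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ReviewResult:
    key: str
    value: Any
    accepted: bool
    note: str


def review_extraction(answer: dict, required_keys: list) -> list:
    """Accept values that arrive with a non-empty reasoning string; flag the rest.

    ASSUMPTION: `answer` looks like {"total_amount": {"value": ..., "reasoning": ...}}.
    This shape is illustrative, not a documented contract.
    """
    results = []
    for key in required_keys:
        entry = answer.get(key)
        if not isinstance(entry, dict):
            entry = {}
        value = entry.get("value")
        reasoning = str(entry.get("reasoning") or "").strip()
        if value is None or value == "":
            results.append(ReviewResult(key, value, False, "no value extracted"))
        elif not reasoning:
            results.append(ReviewResult(key, value, False, "value has no reasoning"))
        else:
            results.append(ReviewResult(key, value, True, reasoning))
    return results


# Hypothetical sample data for illustration
sample_answer = {
    "total_amount": {
        "value": "1250.00",
        "reasoning": "this amount is at the bottom of the list next to the word total",
    },
    "po_number": {"value": "PO-1001", "reasoning": ""},
}

for result in review_extraction(sample_answer, ["total_amount", "po_number", "vendor"]):
    status = "accept" if result.accepted else "needs review"
    print(f"{result.key}: {status} ({result.note})")
```

Anything flagged for review can then be routed to a person or to a second model call, while accepted values continue through your workflow.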

How to get started

The quickest way to get started with the new agent is to use one of our SDKs. For demonstration purposes, we’ll use the latest generated Python SDK. To run this yourself, you will need a few things:

  • A Box Platform App with the ‘Manage AI’ Scope enabled. This example uses the client_credentials grant type for authentication.
  • The app installed and enabled in your Box instance
  • A file to test with

First, let’s take a look at how you can use the Extract Structured endpoint without using the Agent:

from box_sdk_gen import (
    AiItemBase,
    AiItemBaseTypeField,
    BoxClient,
    BoxCCGAuth,
    CCGConfig,
    CreateAiExtractStructuredMetadataTemplate,
)

# Create your client credentials grant config from the developer console
ccg_config = CCGConfig(
    client_id="my_box_client_id",  # replace with your client id
    client_secret="my_box_client_secret",  # replace with your client secret
    # replace with the box user id that has access to the file you are referencing
    user_id="my_box_user_id",
)
auth = BoxCCGAuth(config=ccg_config)
client = BoxClient(auth=auth)

# Use the Box SDK to call the extract_structured endpoint
box_ai_response = client.ai.create_ai_extract_structured(
    # Create the items array containing the file information to extract from
    items=[
        AiItemBase(
            id="my_box_file_id",  # replace with the file id
            type=AiItemBaseTypeField.FILE,
        )
    ],
    # Reference the Box Metadata template
    metadata_template=CreateAiExtractStructuredMetadataTemplate(
        template_key="InvoicePO",
        scope="enterprise",
    ),
)
print(f"box_ai_response: {box_ai_response.answer}")

In this example, there are a couple of key things to note. You need the file ID of the file you wish to extract data from, and a Metadata Template in Box that describes the fields you wish to extract. Bear in mind that a template is not required: instead of passing metadata_template, you can define the fields you wish to extract directly in the create_ai_extract_structured call through its fields parameter.

If the data you are extracting will not change often, it is better to use a metadata template. You can create it in the admin panel and simply reference it here. If the data will change or if you prefer not to have to work with the admin to get the fields defined, you can do so inline in your code.
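For the inline route, the sketch below builds a request body for the extract structured REST endpoint with a `fields` array in place of `metadata_template`. Treat the field property names (`key`, `type`, `description`, `displayName`, `prompt`) as assumptions to verify against the Box API reference. The invoice fields and the `build_extract_request` helper are hypothetical and exist only for illustration.

```python
import json
from typing import Optional


def build_extract_request(file_id: str, fields: list, agent_id: Optional[str] = None) -> dict:
    """Assemble a JSON-serializable body that defines fields inline."""
    body = {
        "items": [{"id": file_id, "type": "file"}],
        "fields": fields,
    }
    # Optionally reference an agent by id, mirroring AiAgentReference in the SDK
    if agent_id is not None:
        body["ai_agent"] = {"type": "ai_agent_id", "id": agent_id}
    return body


# Hypothetical field definitions; adjust keys, types, and prompts to your documents
invoice_fields = [
    {
        "key": "total_amount",
        "type": "float",
        "displayName": "Total amount",
        "description": "The total amount due on the invoice",
        "prompt": "What is the total amount due?",
    },
    {
        "key": "po_number",
        "type": "string",
        "displayName": "PO number",
        "description": "The purchase order number referenced by the invoice",
    },
]

request_body = build_extract_request("my_box_file_id", invoice_fields)
print(json.dumps(request_body, indent=2))
```

Keeping the field definitions in code like this makes them easy to version alongside the rest of your application, at the cost of not being reusable from the Box UI the way a template is.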

With the code above, you can meet a lot of your data extraction needs. The extract endpoint is really good at getting your data from shorter documents or documents without too many complex structures like tables and images.

Conversely, if you have a 100-page document with complex taxonomies or barcodes, you will find that using the new agent helps tremendously. The good news is that we have made it part of the API you already know, so implementation doesn’t require a complete rewrite of your code.

To implement, add two new imports, a couple of lines of code to define the agent, and one more argument to the create_ai_extract_structured call, and you are set.

Here’s the full sample Python script demonstrating how to call the Enhanced Extract Agent on a file using the Box AI SDK:

from box_sdk_gen import (
    AiAgentReference,
    AiAgentReferenceTypeField,
    AiItemBase,
    AiItemBaseTypeField,
    BoxClient,
    BoxCCGAuth,
    CCGConfig,
    CreateAiExtractStructuredMetadataTemplate,
)

# Create your client credentials grant config from the developer console
ccg_config = CCGConfig(
    client_id="my_box_client_id",  # replace with your client id
    client_secret="my_box_client_secret",  # replace with your client secret
    # replace with the box user id that has access to the file you are referencing
    user_id="my_box_user_id",
)
auth = BoxCCGAuth(config=ccg_config)
client = BoxClient(auth=auth)

# Create the agent config referencing the enhanced extract agent
enhanced_extract_agent_config = AiAgentReference(
    id="enhanced_extract_agent",
    type=AiAgentReferenceTypeField.AI_AGENT_ID,
)

# Use the Box SDK to call the extract_structured endpoint
box_ai_response = client.ai.create_ai_extract_structured(
    # Create the items array containing the file information to extract from
    items=[
        AiItemBase(
            id="my_box_file_id",  # replace with the file id
            type=AiItemBaseTypeField.FILE,
        )
    ],
    # Reference the Box Metadata template
    metadata_template=CreateAiExtractStructuredMetadataTemplate(
        template_key="InvoicePO",
        scope="enterprise",
    ),
    # Attach the agent config you created earlier
    ai_agent=enhanced_extract_agent_config,
)
print(f"box_ai_response: {box_ai_response.answer}")
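To close the loop on the structured-data use case mentioned earlier, the sketch below writes an extracted answer into a SQLite table. It assumes you have already converted the answer into a plain dict of field keys to values. The table name, column layout, sample values, and `save_extraction` helper are all hypothetical.

```python
import json
import sqlite3


def save_extraction(conn: sqlite3.Connection, file_id: str, answer: dict) -> int:
    """Upsert one row per extracted field and return the number of rows written."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS extractions ("
        "file_id TEXT NOT NULL, "
        "field_key TEXT NOT NULL, "
        "field_value TEXT, "
        "PRIMARY KEY (file_id, field_key))"
    )
    # Store each value as JSON text so numbers, strings, and lists all round-trip
    rows = [(file_id, key, json.dumps(value)) for key, value in answer.items()]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO extractions (file_id, field_key, field_value) "
            "VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


# Hypothetical extracted values for illustration
conn = sqlite3.connect(":memory:")
save_extraction(conn, "my_box_file_id", {"total_amount": 1250.0, "po_number": "PO-1001"})
for field_key, field_value in conn.execute(
    "SELECT field_key, field_value FROM extractions ORDER BY field_key"
):
    print(field_key, json.loads(field_value))
```

Keying rows on the file ID and field key means that re-running extraction on the same file overwrites earlier values rather than duplicating them.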

Get started today

The Enhanced Extract Agent is now available via Box AI Studio and Box AI APIs. Developers can begin experimenting immediately to unlock smarter, more reliable content automation.
