Box Developer Blog

Building a metadata service using Box and FastAPI

In this article, we build a service that automatically fills in the metadata of media files. See how to combine the power of the Box Platform with a third-party library using FastAPI and Python.

Use Case

This particular example was inspired by a question posted by a developer:

I’m looking to get video duration for a list of files but can’t find anywhere in the API how to do it. I’m needing this for lots of videos uploaded to Box so I’m hoping to grab the video duration (you can see it on the video preview) instead of downloading the entire file.

Problem #1 is that video files have a really big assortment of properties which are not easily captured, for example, aspect ratio, resolution, bit rate, duration, encoding, frame rate and many more. We need to find a library that can grab a video file and output this metadata.

Problem #2 is that video files are typically big. For example, an average movie at 1080p with standard encoding might be 5 GB, while a more recent, more sophisticated encoding goes down to 1 GB. Downloading thousands of these files takes a while and uses too much storage space.

Problem #3 is what to do with all this data. Where do we store it so that it stays useful, for example by being searchable?

Looking for solutions

MediaInfo is an open source library created by MediaArea. They specialize in digital media analysis, and their library can output plenty of interesting properties for media files. This solves problem #1.

While playing with their library, I noticed that analyzing a file from a URL is much faster than downloading the file and then analyzing it. I didn't look into the code, but it seems to need only the first few kilobytes to do its job. This solves problem #2.

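import time

from pymediainfo import MediaInfo  # assumed: the Python wrapper for MediaInfo

# item is the Box file object and item_url its download URL

# Analyze straight from the download URL
tic_download = time.perf_counter()
media_info = MediaInfo.parse(item_url)
print(f"MediaInfo w/ URL time: {time.perf_counter() - tic_download} seconds")

# Download to a temporary file first, then analyze the local copy
tic_download = time.perf_counter()
with open('./tmp/tmp_', 'wb') as tmp_file:
    item.download_to(tmp_file)  # assumed download step, missing from the snippet
media_info = MediaInfo.parse('./tmp/tmp_')
print(f"MediaInfo w/ download time: {time.perf_counter() - tic_download} seconds")

Folder: 191494027812:Video Samples
Item: 1121082178302:BigBuckBunny.mp4:file
MediaInfo w/ URL time: 3.798498541000299 seconds
MediaInfo w/ download time: 21.247453375020996 seconds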
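The comparison above repeats the same start/stop timing pattern twice. A minimal sketch of a reusable timer follows, using only the standard library. The `timed` helper is a hypothetical name, and the lambda is a stand-in for the two `MediaInfo.parse` calls.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(label: str, work: Callable[[], T]) -> Tuple[T, float]:
    """Run work(), print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = work()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f} seconds")
    return result, elapsed


# Stand-in workload; in the article this would be something like
# lambda: MediaInfo.parse(item_url)
result, seconds = timed("demo workload", lambda: sum(range(1000)))
```

Wrapping each strategy in `timed` keeps the measurement identical for both, so the only difference between runs is the work itself.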

The Box Platform can store and search metadata about its content. It also has a series of APIs to manage content metadata and its templates. In essence, you define metadata templates and then apply them to your content. This solves problem #3.

Of course, no one wants to fill in 50+ attributes by hand, so let's build an API that puts all of this to work.

Tools for building the API

The Box integration is handled by the box-python-sdk. This SDK takes care of authentication and the metadata interactions.

We're also using Python and FastAPI. By creating this with FastAPI, we can reuse this functionality with any other app and even automate the metadata classifications by taking advantage of the Box Platform webhooks.
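To make the webhook idea concrete, here is a minimal sketch of how a handler might decide which events to act on. It assumes the Box webhook payload carries a `trigger` string and a `source` object with `type` and `id`. The function name `file_id_to_process` and the choice of triggers are illustrative and are not taken from the demo app.

```python
from typing import Optional

# Triggers that should kick off metadata extraction (an illustrative choice)
TRIGGERS = {"FILE.UPLOADED"}


def file_id_to_process(payload: dict) -> Optional[str]:
    """Return the id of the file to analyze, or None if the event is ignored."""
    if payload.get("trigger") not in TRIGGERS:
        return None
    source = payload.get("source") or {}
    if source.get("type") != "file":
        return None
    return source.get("id")


event = {
    "trigger": "FILE.UPLOADED",
    "source": {"type": "file", "id": "1121082178302"},
}
print(file_id_to_process(event))
```

The returned id could then be handed to the same code path that fills in the metadata for a single file.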

You can find the source code for this demo app in this GitHub repo.

We have covered webhooks in previous articles, if you want to see how they are implemented at Box.

Creating the metadata template

You can create metadata templates in Box via the admin console, or programmatically using the API.

The templates are very simple: you can create string, date, float, single-selection, and multiple-selection attributes.
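The service described in this article stores every attribute as a string, which keeps template creation simple. If typed fields were wanted instead, one possible approach is to guess the field type from a sample value. The sketch below is hypothetical (`guess_field_type` is not part of the demo app) and the sample values are made up for illustration.

```python
from datetime import date, datetime


def guess_field_type(value: object) -> str:
    """Map a sample Python value to a Box metadata field type name."""
    if isinstance(value, bool):
        # bool is a subclass of int, and Box has no boolean field type
        return "string"
    if isinstance(value, (int, float)):
        return "float"
    if isinstance(value, (date, datetime)):
        return "date"
    return "string"


# Illustrative sample only, not real library output
sample = {"format": "MPEG-4", "duration": 596474, "frame_rate": 24.0}
for key, value in sample.items():
    print(key, guess_field_type(value))
```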

Because we have 50+ video attributes and I want them to match the output of the library exactly, I'm creating the template programmatically from a sample output of a video file.

@app.post("/metadata", status_code=201)  # assumes the FastAPI instance is named app
async def create_metadata_template(
    force: bool | None = False, settings: Settings = Depends(get_settings)
):
    """Creates the metadata template for use in this service"""
    client = get_box_client(settings)

    template = get_metadata_template_by_name(
        client, settings.MEDIA_METADATA_TEMPLATE_NAME
    )
    #### Code removed for simplicity ####

    media_info = get_sample_dictionary()

    template = create_metadata_template_from_dict(
        client, settings.MEDIA_METADATA_TEMPLATE_NAME, media_info
    )

    result = template.response_object
    return {"status": "success", "data": result}

Of course, you would never expose such a method on a real API. It is here for illustration purposes, and for the convenience of FastAPI's auto-generated documentation, which lets me test the methods.

The get_sample_dictionary() function returns a sample output of the media file library.

from boxsdk import Client
from boxsdk.object.metadata_template import (
    MetadataField,
    MetadataFieldType,
    MetadataTemplate,
)


def create_metadata_template_from_dict(
    client: Client, name: str, media_info: dict
) -> MetadataTemplate:
    """create a metadata template from a dict"""

    # check if template exists
    template = get_metadata_template_by_name(client, name)
    if template is not None:
        raise ValueError(f"Metadata template {name} already exists")

    # one string field per key in the sample output
    fields = []
    for key in media_info:
        fields.append(MetadataField(MetadataFieldType.STRING, key))

    template = client.create_metadata_template(name, fields, hidden=False)

    return template

The line fields.append(MetadataField(MetadataFieldType.STRING, key)) just adds to a list of fields, which the template then uses in the client.create_metadata_template() method.

And this is the end result, a template with 55 attributes:

Populating the metadata for a file

This is done in a series of steps.

@app.post("/file/{file_id}/{as_user_id}", status_code=201)  # assumes the FastAPI instance is named app
async def set_file_metadata_as_user(
    file_id: str,
    as_user_id: str | None = None,
    settings: Settings = Depends(get_settings),
):
    """Process media file and fill in the metadata info using 'as-user'
    security context"""
    exec_start = time.perf_counter()

    client = get_box_client(settings, as_user_id)

We start by getting an authenticated client with get_box_client(). In this case we are using JWT authentication.

def get_box_client(settings: Settings, as_user: str | None = None) -> Client:
    """Returns a box client, optionally impersonating a user"""
    client = jwt_check_client(settings)
    if as_user is not None:
        user = client.user(user_id=as_user).get()
        client = client.as_user(user)
    return client

With JWT authentication, the service user associated with the JWT token may not have access to the content. In that case we can supply an as_user_id for the service user to impersonate.

Next, we need to grab the template:

@app.post("/file/{file_id}/{as_user_id}", status_code=201)
async def set_file_metadata_as_user(###):
    ### ...
    template = get_metadata_template_by_name(
        client, settings.MEDIA_METADATA_TEMPLATE_NAME
    )
    if template is None:
        raise HTTPException(
            status_code=404,  # assumed status code
            detail=f"Metadata template {settings.MEDIA_METADATA_TEMPLATE_NAME} does not exist",
        )
    ### ...

Then we grab the file:

@app.post("/file/{file_id}/{as_user_id}", status_code=201)
async def set_file_metadata_as_user(###):
    ### ...
    file = get_file_by_id(client, file_id)
    if file is None:
        raise HTTPException(
            status_code=404,  # assumed status code
            detail=f"File {file_id} does not exist",
        )
    ### ...

The get_file_by_id() function is a thin wrapper around a Box Python SDK client call:

def get_file_by_id(client: Client, file_id: str) -> File:
    """Returns the box file by id"""
    file = client.file(file_id=file_id).get()
    return file
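One detail worth keeping in mind: the Box Python SDK reports a missing file by raising a BoxAPIException rather than returning None. For a None check in the caller to trigger, the wrapper has to catch that error. The sketch below shows the pattern in isolation. `fetch_or_none` and `FakeNotFound` are hypothetical names, and the fake exception only stands in for an SDK error that carries the HTTP status in a `status` attribute.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def fetch_or_none(fetch: Callable[[], T]) -> Optional[T]:
    """Call fetch(); return None if it raises an error carrying status 404."""
    try:
        return fetch()
    except Exception as error:
        # Only swallow "not found"; every other error is re-raised
        if getattr(error, "status", None) == 404:
            return None
        raise


class FakeNotFound(Exception):
    """Stand-in for an SDK error, used only to exercise the helper."""

    status = 404


def missing_file():
    raise FakeNotFound("file not found")


print(fetch_or_none(missing_file))  # None
print(fetch_or_none(lambda: "file object"))
```

With the SDK, the call might look like `fetch_or_none(lambda: client.file(file_id=file_id).get())`.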

Then the media info:

@app.post("/file/{file_id}/{as_user_id}", status_code=201)
async def set_file_metadata_as_user(###):
    ### ...
    media_info = get_media_info_by_url(file.get_download_url())
    if media_info is None:
        raise HTTPException(
            status_code=500,  # assumed status code
            detail=f"Unable to get media info for file {file_id}",
        )
    ### ...

Here I'm only interested in the General track. Video files often contain multiple video, audio, and text tracks (think director's commentary, multiple audio languages, subtitles, etc.). However, MediaInfo summarizes the file-level information in the General track.

def get_media_info_by_url(download_url: str) -> dict:
    """Returns the General track of the media info as a dict"""
    media_info_raw = MediaInfo.parse(download_url)
    return media_info_raw.general_tracks[0].to_data()
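Indexing `general_tracks[0]` raises an IndexError when the library finds no General track, so a None check in the caller would not catch that case. A defensive variant could pick the track out of the plain track dictionaries instead. The sketch below assumes each track dictionary carries a `track_type` key. `first_general_track` is a hypothetical helper and the sample tracks are illustrative.

```python
from typing import Optional


def first_general_track(tracks: list) -> Optional[dict]:
    """Return the first track whose track_type is 'General', or None."""
    for track in tracks:
        if track.get("track_type") == "General":
            return track
    return None


# Illustrative track dictionaries, not real library output
tracks = [
    {"track_type": "Video", "format": "AVC"},
    {"track_type": "General", "format": "MPEG-4"},
    {"track_type": "Audio", "format": "AAC"},
]
print(first_general_track(tracks))
print(first_general_track([]))  # None
```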

Finally, we can set the file metadata and return it to the caller.

@app.post("/file/{file_id}/{as_user_id}", status_code=201)
async def set_file_metadata_as_user(###):
    ### ...
    try:
        metadata = file_metadata_set(file, template, media_info)
    except BoxAPIException as error:
        # assumed arguments: pass through the status and message of the Box error
        raise HTTPException(
            status_code=error.status, detail=error.message
        ) from error

    exec_end = time.perf_counter()
    exec_time = exec_end - exec_start

    return {
        "status": "success",
        "executed_in_seconds": exec_time,
        "data": metadata,
    }

For fun, I've also created a method that first downloads to a local temporary folder and then analyzes it, so we can compare.
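The file_metadata_set() helper called in the final step is not shown in this article. Since every field in the template is a string, one thing such a helper would plausibly need to do is turn the media info values into strings before sending them to Box. The sketch below covers only that preparation step, assuming the template field keys match the dictionary keys. `to_string_values` is a hypothetical name and the sample data is illustrative.

```python
def to_string_values(media_info: dict, allowed_keys: set) -> dict:
    """Keep only keys known to the template and stringify their values."""
    return {
        key: str(value)
        for key, value in media_info.items()
        if key in allowed_keys and value is not None
    }


# Illustrative values, not real library output
media_info = {"format": "MPEG-4", "duration": 596474, "title": None, "extra": 1}
template_keys = {"format", "duration", "title"}
print(to_string_values(media_info, template_keys))
```

Dropping None values and unknown keys up front avoids sending data that a string-only template could not accept.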

See it in action

From the URL, this file takes about 8.8 seconds:

Downloading first takes 23.2 seconds:

In the Box app, users can see the metadata populated:

And apply searches on it.

Metadata provides a wealth of information about your content that can help your users in many ways. It provides context about your content, such as its authorship, creation date, and licensing information.

One of the main benefits is that it is searchable within Box, allowing users to easily find the content they are looking for.

Metadata is also indexed almost instantly. By comparison, normal search indexes the file content, so a new file can take a few minutes to show up.
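As a rough illustration of what a metadata search looks like at the API level, the Box search endpoint accepts an `mdfilters` parameter: a JSON array holding one object with `scope`, `templateKey`, and `filters`. The sketch below only builds that parameter. The template key `mediaInfo` and the `format` field are hypothetical, and `build_mdfilters` is not part of the demo app.

```python
import json


def build_mdfilters(template_key: str, filters: dict, scope: str = "enterprise") -> str:
    """Build the JSON string for the search API's mdfilters parameter."""
    filter_spec = [
        {"scope": scope, "templateKey": template_key, "filters": filters}
    ]
    return json.dumps(filter_spec)


# Hypothetical template key and field name
print(build_mdfilters("mediaInfo", {"format": "MPEG-4"}))
```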

Want to learn more about metadata in Box? Check out our documentation.
