Box Skills: Pieces of the Puzzle + Tutorial

Alex Novotny
Published in Box Developer Blog
11 min read · Nov 15, 2022

Box Skills were introduced in 2017, and since then, over 1000 skills have been used across various industries to bring intelligence to Box content. Just how easy is it to set up a skill? In this blog post, I will show you!

Box Skills Refresher

Before diving into how to set up a Box Skill, I want to give an overview of what Skills are and which parts make up a Box Skills implementation.

If you refer to the architecture patterns section of our developer site, the below diagram appears. It represents a classic example of a Box Skill — in this case, a floorplan identification solution.

Box Skill Infrastructure Map
  1. The flow is triggered from a floorplan file being uploaded, moved, or copied into the folder configured for the Box Skill.
  2. The file is sent to a Google Cloud Function via the invocation URL set up in the Box Skill configuration steps. Any cloud provider that offers serverless functions can be used, even though we show GCP in this tutorial.
  3. The function runs custom code to parse the Box Skill payload, verify security keys, download a local copy of the file, send the file to a machine learning provider (also GCP in this example), and write the metadata discovered by the machine learning service back to the original file in Box.

It is important to note that most of the above pieces don’t actually run on Box infrastructure. Referring to the above diagram, everything in the first dotted box lives in Box: the content, the metadata template, and configuration information. The logic or processing of the Skill happens in the second dotted box. The machine learning occurs in the third dotted box.

The second and third parts aren’t licensed or billed through Box Platform, and as such, typically require working with that particular service’s sales and enablement teams. They do not have to be the same provider. For example, you can have an Azure Function call IBM Watson.

Box provides value in the flow, because the content’s initial action triggers the process to begin. This keeps developers from migrating or duplicating content to collect insight. Also, the flow ends with searchable metadata directly within the platform in seconds, allowing users to see the information they need faster.

In the demo below, I’m going to show you how to set up the flow in the diagram above. By the end, I should be able to upload a floorplan and have the number of bedrooms in it written back as metadata on the file. That way, I can search for all floorplans with a certain number of bedrooms in the future.

Let’s jump into the tutorial!

Box Skills Administrative Setup

Create a Box developer account (optional but recommended)

If you don’t have a Box enterprise account, you can sign up for a free developer account here. I recommend using the developer account for the tutorial instead of using your production environment.

Please note that you cannot sign up with an email address that is already tied to an existing Box account, because every email address must be unique across all of Box.

Create the Box Skill (completed by developer)

Navigate to the Developer Console, and click Create New App.

Developer Console Landing Page

Select Box Custom Skill.

Developer Console Application Selection Screen

Give the application a name, and click Create App.

Box Skill Creation Pop-Up

After creating the application, you will see the below screen. The red box is where you will put the URL where you would like the Box Skills payload to go. We will add this URL later on.

Box Skill Configuration Screen

In the security keys tab, you will find two keys that can be used to verify that Box is the service that called the serverless function.

Box Skill Security Keys

Enable/Authorize the Box Skill (completed by admin)

Just like other application types, an administrator of your Box instance will need to enable and authorize the Box Skill in the Skills section of the Admin Console. You will need to provide the admin with the Client ID of the application, which is found in the Box Skill configuration screen.

You will also need to provide the name(s) and owner of the folder(s) for which you want the Skill to be triggered. If you haven’t set up a folder for the Skill to monitor yet, you will want to do that before requesting authorization from your admin.

Create a Box folder

On the Skills Admin Console screen, click Add Skill.

Box Skills Admin Console Screen

Enter the Client ID of the Skill, and click Next.

Add Box Skill Pop-Up

Select whether the skill should run for all content or a subset of folders.

Box Skill Folder Configuration Screen

To run the Skill on specific folders only, filter the pop-up by user or folder name, then check each folder for which the Skill should be triggered.

Box Skill Folder Selection Pop-Up

Confirm selections, and click Enable.

Box Skill Confirmation Screen

Create a Box Metadata Template (optional — completed by admin)

This step is optional, since metadata or skills data can be written to a file without a template; however, creating a metadata template will allow users to easily search by the results machine learning provided.

In the Admin Console, click Content > Metadata > Create New.

Admin Console Metadata Template Screen

Configure the metadata template by giving it a name, adding attributes (each of which should have a name, format, and description), and clicking Save.

Metadata Template Setup Steps

Serverless Function Setup

This example will use our Serverless Box Custom Skills starter repository to speed up development.

Click the above link and download the repository to your computer.

Box Skills Quickstart Project

Unzip the downloaded folder. Find the box-custom-skills-starter-gcp folder within the downloaded repository and rename it to something that matches your use case. Also, feel free to move the folder to your typical projects directory. If you would rather use AWS or Azure, you will find hello world directories in the same zip folder.

Unzip downloaded folder

Open the folder in a code editor like Visual Studio Code.

Open the renamed and moved example folder

In the editor’s terminal, confirm you have Node v10.0.0 or higher.

Check node version
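If you would rather have the project check this for you, here is a minimal sketch of a version guard. The helper name meetsMinimumVersion is my own and is not part of the starter repository; it only compares the running Node version against the v10.0.0 minimum mentioned above.

```javascript
// Compare a Node.js version string against a required minimum.
// Returns true when `version` is greater than or equal to `minimum`.
// (Hypothetical helper, not part of the starter repository.)
function meetsMinimumVersion(version, minimum) {
  // Strip an optional leading "v" (as printed by `node -v`) and split into numbers.
  const parse = (v) => v.replace(/^v/, '').split('.').map(Number);
  const actual = parse(version);
  const required = parse(minimum);
  for (let i = 0; i < 3; i += 1) {
    const left = actual[i] || 0;
    const right = required[i] || 0;
    if (left !== right) return left > right;
  }
  return true; // the two versions are equal
}

// process.version holds the running interpreter's version, e.g. "v18.12.1".
if (!meetsMinimumVersion(process.version, '10.0.0')) {
  console.error(`Node ${process.version} is older than the required v10.0.0`);
}
```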

Before continuing, you will need to set up a GCP account with a billing method attached. I won’t go over all of those steps here, but you can find them on the Serverless website. Make sure to complete all the steps, including creating a project and enabling the APIs, creating a service account, and downloading a JSON key file.

Drag the JSON key file that was downloaded from the step above into the .gcloud folder.

Drag in your JSON key file from GCP

Rename the file as serverless.json.

Change the JSON file name to serverless.json

Update the package.json file to have the naming information for your use case.

package.json file

Update the serverless.yml file to have the configuration and naming information for your GCP account and use case. Also, make sure to add the Box security keys shown in the screens below.

serverless.yml file

The Box primary and secondary keys come from the developer console’s security keys section I mentioned a few sections up. It is important to use these keys to make sure only Box can run the serverless function’s code.

Copy and paste the keys into the serverless.yml file above
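To make the role of the two keys more concrete, here is a rough sketch of how a signature check can work. It assumes the scheme Box documents for signed notifications: an HMAC-SHA256 over the raw request body followed by the delivery timestamp, base64-encoded, and compared against the box-signature-primary or box-signature-secondary header. The function names and the ten-minute age limit are my own choices for illustration, and this is not necessarily how the starter repository does it.

```javascript
const crypto = require('crypto');

// Base64 HMAC-SHA256 digest of the raw body followed by the timestamp.
function computeSignature(rawBody, timestamp, key) {
  return crypto
    .createHmac('sha256', key)
    .update(rawBody)
    .update(timestamp)
    .digest('base64');
}

// Constant-time comparison, so timing does not reveal where two digests differ.
function safeEqual(a, b) {
  const left = Buffer.from(a);
  const right = Buffer.from(b);
  return left.length === right.length && crypto.timingSafeEqual(left, right);
}

// Returns true when either key reproduces its signature header and the
// message is not older than maxAgeMs. Header names are assumed to be lower-cased.
function isFromBox(rawBody, headers, primaryKey, secondaryKey,
  maxAgeMs = 10 * 60 * 1000, now = Date.now()) {
  const timestamp = headers['box-delivery-timestamp'];
  if (!timestamp) return false;

  const age = now - Date.parse(timestamp);
  if (Number.isNaN(age) || age > maxAgeMs) return false;

  const matches = (headerValue, key) => Boolean(headerValue)
    && safeEqual(headerValue, computeSignature(rawBody, timestamp, key));

  return matches(headers['box-signature-primary'], primaryKey)
    || matches(headers['box-signature-secondary'], secondaryKey);
}
```

The check has to run against the raw request body exactly as it was received. Re-serializing a parsed JSON object can change the bytes and make a valid signature fail.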

In a terminal, run npm install.

Run npm install

Then, run npx serverless deploy.

Run npx serverless deploy

Deployment can take several minutes, especially the first time. After it completes, you will get back an invocation URL.

sls deploy completed

Copy and paste that into the developer console application you created earlier.

Add the invocation url to the Box Skill

Visit the GCP console to see that your serverless function is active.

Verify that the serverless function deployed

You also need to add an additional permission to the function so that Box can call it. Click permissions > add.

GCP Serverless Function Permissions Configuration

Type “allUsers” in the New principals box, select the Cloud Functions Invoker role, and click Save.

GCP Serverless Function add principals permissions

Upload a file to the Box folder configured by the administrator.

Upload a file

Open the file in Box to see the message “Hello World” attached as metadata.

Custom metadata can be seen in the upper right corner

You can also check the logs in the serverless function for verification.

You can see the event body logged from the code we deployed earlier
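The logged event body is what the function works from. As a rough sketch, pulling out the pieces a Skill usually needs could look like the code below. The field names (id, source.id, source.name, token.read.access_token, token.write.access_token) reflect my understanding of the payload shape and should be compared against the event body in your own logs; parseSkillEvent is a hypothetical helper name.

```javascript
// Extract the fields a Skill typically needs from the invocation event body.
// Field names are assumptions; verify them against your own logged payload.
function parseSkillEvent(body) {
  // The body may arrive as a JSON string or as an already-parsed object.
  const event = typeof body === 'string' ? JSON.parse(body) : body;
  if (!event || !event.source || !event.token) {
    throw new Error('Unexpected Box Skill payload: missing "source" or "token"');
  }
  return {
    requestId: event.id,
    fileId: event.source.id,
    fileName: event.source.name,
    // Short-lived tokens scoped to the file that triggered the Skill.
    readToken: event.token.read && event.token.read.access_token,
    writeToken: event.token.write && event.token.write.access_token,
  };
}
```

Failing early on an unexpected payload keeps a malformed request from reaching the download or machine learning steps.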

GCP Document AI Setup

Now that the Box Skill and serverless function are working, the last piece is setting up a machine learning provider and editing our code to use it. Google’s Document AI is a good fit for this use case, because it pulls OCR text from the uploaded floorplans without requiring us to train a custom machine learning model. We can send the text it finds back to the serverless function we set up above, parse the results, and apply them to the file as searchable metadata.

Before jumping into the setup steps, visit the try-it section to see the typical response you will receive.

Try it! Section

Notice the OCR text and JSON tabs. In the JSON response you can see a text field. This is where the OCR text data is placed for you to parse.

OCR Text Identified
JSON Response

We are going to use the how-to guides section of the Document AI documentation to configure the API for the GCP project created earlier. Be aware that we don’t need to set up another service account or download a new JSON key file, since we already did that earlier in this tutorial.

Enable the Document AI API by clicking here and turning it on for the GCP project we set up earlier in this tutorial.

Enable the Document AI API

Go to the processors page and click Explore Processors.

GCP Processor page

Select the document OCR processor type.

Give the processor a name in the side pop-up window and click Create.

Create Processor Pop-up

Find the prediction endpoint for the processor. This is what we will use to get information about the floorplans once we’ve edited our code.

Processor Created
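To give a feel for what calling the prediction endpoint involves, here is a minimal sketch of building a request for the processor. The resource name format and the rawDocument fields follow the Document AI client library as I understand it; the helper names are hypothetical, and the client is passed in so the sketch runs without credentials. In real code it would be a DocumentProcessorServiceClient from @google-cloud/documentai.

```javascript
// Build the fully qualified processor resource name.
function processorName(projectId, location, processorId) {
  return `projects/${projectId}/locations/${location}/processors/${processorId}`;
}

// Build a processDocument request for a file held in memory.
function buildProcessRequest(projectId, location, processorId, fileBuffer, mimeType) {
  return {
    name: processorName(projectId, location, processorId),
    rawDocument: {
      content: fileBuffer.toString('base64'), // file bytes, base64-encoded
      mimeType,
    },
  };
}

// Send the request with any client exposing processDocument() and
// return the OCR text found in the response's document.text field.
async function extractText(client, request) {
  const [result] = await client.processDocument(request);
  return result.document.text;
}
```

The text field returned here is the same one shown in the JSON tab of the try-it section.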

Now, we are going to edit the code to use this new API.

In the terminal, run the following:

npm install axios @google-cloud/documentai

In the serverless.yml file, we need to add some more variables: gcp_project_id, gcp_location, gcp_processor_id, box_metadata_key, box_api_endpoint, and bedroom_list. You can find the new section to copy below.

The project ID, location, and processor ID come from the GCP configuration we did earlier. Adding them as variables now lets us access them in the code and change them in the future without editing code.
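As an illustration of reading these values in code, here is a small sketch that loads them and fails fast if any are missing. The variable names are the ones listed above; whether they reach the function as environment variables under exactly these names depends on how your serverless.yml is written, so treat that as an assumption. loadConfig is a hypothetical helper.

```javascript
// Names taken from the list of variables added to serverless.yml.
const REQUIRED_VARS = [
  'gcp_project_id',
  'gcp_location',
  'gcp_processor_id',
  'box_metadata_key',
  'box_api_endpoint',
  'bedroom_list',
];

// Copy the required settings out of `env`, throwing if any are missing or empty.
function loadConfig(env) {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing configuration: ${missing.join(', ')}`);
  }
  const config = {};
  REQUIRED_VARS.forEach((name) => {
    config[name] = env[name];
  });
  return config;
}

// Typical use inside the function (assumes the values are environment variables):
// const config = loadConfig(process.env);
```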

The metadata key is the key for the metadata template we set up earlier. You can find it in the Admin Console under Content > Metadata.

The bedroom list is a comma-separated string of all the names our floorplans use for bedrooms. This will help us parse the OCR text into a searchable integer. For this and any other lists (bathrooms, porches, etc.), there can be no spaces between the terms.
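Here is one possible way to turn the OCR text into a bedroom count using such a list. This is only a sketch and not necessarily what index_reference.js does. It assumes each bedroom is labeled once on the floorplan, and the function names are hypothetical.

```javascript
// Turn the comma-separated list into an array of lower-cased terms.
// Trimming is defensive, in case a stray space slips into the list.
function parseTermList(list) {
  return list
    .split(',')
    .map((term) => term.trim().toLowerCase())
    .filter(Boolean);
}

// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Count whole-word, case-insensitive occurrences of any term in the OCR text.
// The global match scans left to right and never reuses matched text, so a
// label such as "MASTER BEDROOM" is counted once, not once per listed term.
function countMatches(ocrText, terms) {
  if (terms.length === 0) return 0;
  const alternatives = terms.map(escapeRegExp).join('|');
  const pattern = new RegExp(`\\b(?:${alternatives})\\b`, 'gi');
  const matches = ocrText.match(pattern);
  return matches ? matches.length : 0;
}
```

For example, countMatches(text, parseTermList('bedroom,master bedroom')) gives the number of bedroom labels found in text. Floorplans that label rooms differently, or repeat a label in a legend, would need extra handling.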

These will all make more sense once we change the index.js file.

Updated serverless.yml file

For the index.js file, we need to make several changes. To make it easier, there is a full example of the finished index.js file in the repo you downloaded earlier that you can copy from. It is called index_reference.js. Keep in mind that this is just an example that you can expand on depending on your real-world use case. In that file, comments above each section describe what the code does.

Simply copy/paste the index_reference.js file and replace the contents of the index.js file.

Updated index.js file
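As one illustration of the kind of step the finished code has to perform, here is a sketch of building the request that writes a value to the file through the metadata template. It assumes box_api_endpoint is a base URL such as https://api.box.com/2.0, that the template lives in the enterprise scope, and that it has a field keyed bedrooms; all three are assumptions to check against your own setup. The returned object is shaped like an axios request config.

```javascript
// Build an axios-style request config that creates a metadata instance on a file.
// (Hypothetical helper; the reference file may structure this differently.)
function buildMetadataRequest(apiEndpoint, fileId, templateKey, values, accessToken) {
  const base = apiEndpoint.replace(/\/+$/, ''); // drop any trailing slash
  return {
    method: 'post',
    url: `${base}/files/${fileId}/metadata/enterprise/${templateKey}`,
    headers: {
      Authorization: `Bearer ${accessToken}`,
      'Content-Type': 'application/json',
    },
    data: values, // e.g. { bedrooms: 3 }, keyed by the template's field keys
  };
}

// Typical use, assuming axios is installed and `event` came from the Skill payload:
// await axios(buildMetadataRequest(endpoint, fileId, templateKey, { bedrooms: 3 }, writeToken));
```

If the file already has an instance of the template, creating it again is rejected, so a production version would also need an update path.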

Now, you can run npx serverless deploy and then we can use the skill!

Using the Box Skill

At this point, all of the pieces for a Box Skill to work have been set up: Box Skill creation, Box Skill authorization, metadata template creation, and the serverless function/machine learning setup. We can now use the skill for our actual use case!

Upload a file to the Box folder configured by the administrator.

Upload a File

Click into the uploaded file to see that metadata has been applied to it.

Metadata Added To File

Box takes approximately ten minutes to index metadata. After that, the metadata can be used as search criteria for content within the Box web application.

Click the search bar > Metadata > Pick a template from the dropdown > type in search criteria > hit Enter.

Metadata Template Search

Any files that match the criteria will be returned.

Metadata Template Search Results

Box Skills Tutorial Complete

In this blog post, I’ve shown how you can use some simple boilerplate code and an out-of-the-box machine learning API to gather and use valuable data in the content stored on Box.

In this tutorial we used floorplans, but you could use any document you wanted: resumes, invoices, applications, etc. The possibilities are endless thanks to the many ready-to-use machine learning APIs out there. Plus, as you can see from the tutorial, setting up the flow is not that hard!

Huge thank you to Marley!

We hope you enjoyed this tutorial, and feel free to reach out to us on the developer forum for support, or via Box Pulse to make suggestions on how to improve Box Skills.

Alex Novotny
Box Developer Blog

I’m a Box Developer Advocate, helping others learn how to maximize their investment through Box Platform.