Box Skills + Clarifai: A Simple Image Categorization Use Case

Simple Image Categorization: Happy Trees

With the GA launch of the Box Skills Kit, I thought it would be apt to break down the technology into a simple example so that we can explore how all of the parts work together.

What we’re going to be running through for this example is a simple image categorization Skill using Clarifai, a computer vision AI platform. The end goal is to upload an image into Box and have categories automatically assigned to it, courtesy of Clarifai’s computer vision and the Skills framework.

The example application that we’ll be looking at is available in full on GitHub here. It’s a Node app that uses Express to listen for all POST traffic, which is what the Skills framework sends through to your listener (what Skills calls the invocation URL). I deploy this app to Heroku, but you can run it at any publicly available URL.

Let’s go through the steps needed for the entire Skills process for this example. We start with a Box folder that has a Skills app enabled on it. The Skills app allows you to set an invocation URL, which is where a notification is sent when you upload a file to that folder.

That’s where we start — an image has been uploaded to that folder and a POST request is sent to our app, which is sitting there listening.
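To make the listening side concrete, here is a minimal sketch of an invocation URL listener. It uses Node's built-in http module rather than Express so that it runs without dependencies, and the createSkillListener name and onEvent callback are hypothetical, not taken from the sample app.

```javascript
const http = require('http');

// Hypothetical helper: returns a server that hands each POSTed JSON body to onEvent.
function createSkillListener(onEvent) {
  return http.createServer((req, res) => {
    // Skills invocations arrive as POST requests; reject anything else.
    if (req.method !== 'POST') {
      res.writeHead(405);
      res.end();
      return;
    }

    // Collect the request body, then parse it as JSON.
    let raw = '';
    req.on('data', (chunk) => { raw += chunk; });
    req.on('end', () => {
      let body;
      try {
        body = JSON.parse(raw);
      } catch (err) {
        // Malformed JSON: answer 400 and skip the callback.
        res.writeHead(400);
        res.end();
        return;
      }
      onEvent(body);
      res.writeHead(200);
      res.end();
    });
  });
}
```

Calling createSkillListener(handler).listen(3000) would start it; the port is an assumption, and on a host like Heroku you would read it from the environment instead.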

Step 1: Listen for file uploads

When a new POST request occurs, we capture the body that is sent through to us. Within the body of the POST request are three pieces of data that we care about and need:

  • File ID: This is the ID of the file that has just been uploaded, which is needed to figure out what file to send to Clarifai and where to store metadata back to.
  • Read Token: This token gives us permission to read the content of that file. This is needed to upload the file content to the machine learning system, Clarifai.
  • Write Token: This token gives us permission to write metadata back to the file once Clarifai has returned its results.

The code looks something like this:

// Capture file ID and tokens from Box event
let body = req.body;
let fileId = body.source.id;
let readToken = body.token.read.access_token;
let writeToken = body.token.write.access_token;
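Since everything downstream depends on those three values, it can help to check that they are present before going further. The following is a sketch of one way to do that; parseSkillEvent is a hypothetical helper, and the sample payload is trimmed to the fields used above with made-up values.

```javascript
// Hypothetical helper: pull the file ID and tokens out of a Skills event body,
// failing loudly if any of the three is missing.
function parseSkillEvent(body) {
  const fileId = body && body.source && body.source.id;
  const token = (body && body.token) || {};
  const readToken = token.read && token.read.access_token;
  const writeToken = token.write && token.write.access_token;

  if (!fileId || !readToken || !writeToken) {
    throw new Error('Skills event is missing the file ID or an access token');
  }
  return { fileId, readToken, writeToken };
}

// Example payload, trimmed to the fields used here (values are made up).
const sampleEvent = {
  source: { id: '12345' },
  token: {
    read: { access_token: 'READ_TOKEN' },
    write: { access_token: 'WRITE_TOKEN' }
  }
};

const parsed = parseSkillEvent(sampleEvent);
console.log(parsed.fileId); // prints 12345
```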

Step 2: Send the file to Clarifai for processing

Next up we have to send the file content to Clarifai. Their API will accept a URL to a public image for processing. The problem is that Box files are locked down by default, but we can use the download file API endpoint along with the read token and file ID we extracted to create that public endpoint. It looks something like this:

// Build a download URL for the file, authorized with the read token
const fileURL = `https://api.box.com/2.0/files/${fileId}/content?access_token=${readToken}`;
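If you would rather not rely on string interpolation alone, the same URL can be assembled with the standard URL class, which takes care of encoding. This is only a sketch, and buildDownloadUrl is a hypothetical helper name.

```javascript
// Hypothetical helper: build the download URL that Clarifai will fetch.
function buildDownloadUrl(fileId, readToken) {
  const url = new URL(`https://api.box.com/2.0/files/${encodeURIComponent(fileId)}/content`);
  // searchParams.set percent-encodes the token value if it needs it.
  url.searchParams.set('access_token', readToken);
  return url.toString();
}

console.log(buildDownloadUrl('12345', 'READ_TOKEN'));
// prints https://api.box.com/2.0/files/12345/content?access_token=READ_TOKEN
```

Keep in mind that anyone holding this URL can read the file for as long as the read token is valid, so it is best kept out of logs.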

We then instantiate a new Clarifai SDK instance and make a call to their prediction endpoint with that public URL. When the results are sent back to us, we loop through them and keep every category with a confidence above 0.9, putting each one in the format we need for the metadata that we’re going to store back to Box.

// Instantiate a new Clarifai app instance
const app = new clarifai.App({
  apiKey: config.clarifaiKey
});
// Predict the contents of an image by passing in a URL
app.models.predict(clarifai.GENERAL_MODEL, fileURL).then(
  function(response) {
    // Capture all categories with a confidence above 0.9
    let entries = [];
    for (let category of response.outputs[0].data.concepts) {
      if (category.value > 0.9) {
        entries.push({ type: 'text', text: category.name });
      }
    }
  }
);
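The filtering step is easy to try out on its own, without calling Clarifai. Here is a sketch that pulls it into a function and runs it against a mocked response; conceptsToEntries is a hypothetical helper, and the mock only contains the fields read in the code above, with made-up concepts and scores.

```javascript
// Hypothetical helper: keep concepts above a confidence threshold and
// convert them to the { type, text } entries used by a keyword card.
function conceptsToEntries(response, threshold = 0.9) {
  const concepts = response.outputs[0].data.concepts;
  return concepts
    .filter((concept) => concept.value > threshold)
    .map((concept) => ({ type: 'text', text: concept.name }));
}

// Mocked response with made-up concepts, trimmed to the fields read above.
const mockResponse = {
  outputs: [{
    data: {
      concepts: [
        { name: 'tree', value: 0.99 },
        { name: 'landscape', value: 0.95 },
        { name: 'car', value: 0.42 }
      ]
    }
  }]
};

console.log(conceptsToEntries(mockResponse));
// [ { type: 'text', text: 'tree' }, { type: 'text', text: 'landscape' } ]
```

Lowering the threshold gives you more categories at the cost of less reliable ones.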

Step 3: Create the metadata template

Now that we have the categories from Clarifai, we can create the metadata that will be stored back to the Box file. We can add in any metadata we want, but if we want it to display nicely in the Skills view, as shown in this example, then we need to follow a specific format.

We set the metadata template to be boxSkillsCards. The metadata itself will follow a format that looks like the following:

// Set Box metadata template information
const metadataTemplate = 'boxSkillsCards';
const metadata = {
  cards: [{
    created_at: new Date().toISOString(),
    type: 'skill_card',
    skill_card_type: 'keyword',
    skill_card_title: {
      message: 'Categories'
    },
    skill: {
      type: 'service',
      id: 'jleblanc-clarifai-heroku'
    },
    invocation: {
      type: 'skill_invocation',
      id: fileId
    },
    entries: entries
  }]
};

Within that metadata payload, we specify the file ID under invocation.id, and the categories from Clarifai under entries.
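If you plan to build more than one card, the payload above can be wrapped in a small function. This is a sketch using the same fields as the example; buildKeywordCard and the 'my-skill-id' value are hypothetical.

```javascript
// Hypothetical helper: wrap keyword entries in a boxSkillsCards payload.
function buildKeywordCard(fileId, entries, skillId, title = 'Categories') {
  return {
    cards: [{
      created_at: new Date().toISOString(),
      type: 'skill_card',
      skill_card_type: 'keyword',
      skill_card_title: { message: title },
      skill: { type: 'service', id: skillId },
      invocation: { type: 'skill_invocation', id: fileId },
      entries: entries
    }]
  };
}

// Made-up values for illustration.
const card = buildKeywordCard('12345', [{ type: 'text', text: 'tree' }], 'my-skill-id');
console.log(JSON.stringify(card, null, 2));
```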

Step 4: Assign the metadata to the file

With our metadata set we can now make a request to store that data back on the file in Box. To do so we need to use the Create Metadata endpoint. This endpoint needs to include the File ID and Metadata Template name and will look like the following:

// Set metadata add / update URL
const urlMetadata = `https://api.box.com/2.0/files/${fileId}/metadata/global/${metadataTemplate}`;

For this request we’ll be using an HTTP client called axios. To make this request we’ll need a few headers to specify the content type and to pass through the write token to give the request permission to save the data back on the file.

// Create POST request headers
let config = {
  headers: {
    'Authorization': `Bearer ${writeToken}`,
    'Content-Type': 'application/json'
  }
};

Now we make the request to create the metadata on the file. If no metadata has been saved on the file yet, the request will succeed and the data will appear under the Skills option in the right-hand pane when viewing the file.

If metadata already exists on the file (for example, when you’re re-uploading it), the request comes back with a tuple_already_exists error. We catch that error and instead make a JSON patch request to the Update Metadata endpoint.

The URL is the same as the add metadata call, and the headers are mostly the same except that the content-type is application/json-patch+json instead of application/json. We set that new header and create our JSON patch object.

When ready, we make a PUT request to update the metadata on the file. If all goes according to plan we now have nicely updated metadata.

// Make request to add metadata to file
axios.post(urlMetadata, metadata, config).then(function (response) {
  console.log('Metadata added');
})
.catch(function (error) {
  // error.response is only set when Box answered with an HTTP error
  const code = error.response && error.response.data && error.response.data.code;
  // If metadata already exists on the file this error will trigger
  if (code === 'tuple_already_exists') {
    // Modify headers for JSON patch metadata update request
    config.headers = {
      'Authorization': `Bearer ${writeToken}`,
      'Content-Type': 'application/json-patch+json'
    };
    // Create JSON patch data
    const jsonPatch = [{ op: 'replace', path: '/cards/0', value: metadata.cards[0] }];
    // Make Metadata update JSON patch request
    axios.put(urlMetadata, jsonPatch, config).then(function (response) {
      console.log('Metadata updated');
    }).catch(function (error) {
      console.log(error);
      console.log('Metadata update failed');
    });
  } else {
    console.log(code || error.message);
  }
});
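The create-then-update fallback can also be written as a single async function, which makes it easier to test. The sketch below takes the HTTP client as a parameter so that it can run against a stand-in; saveSkillMetadata and fakeClient are hypothetical names, and the client is assumed to expose axios-style post(url, data, config) and put(url, data, config) methods that return promises.

```javascript
// Hypothetical helper: create the metadata, or replace the first card if
// metadata already exists on the file.
async function saveSkillMetadata(client, fileId, metadata, writeToken) {
  const url = `https://api.box.com/2.0/files/${fileId}/metadata/global/boxSkillsCards`;
  const withType = (contentType) => ({
    headers: { 'Authorization': `Bearer ${writeToken}`, 'Content-Type': contentType }
  });

  try {
    await client.post(url, metadata, withType('application/json'));
    return 'created';
  } catch (error) {
    const code = error.response && error.response.data && error.response.data.code;
    // Anything other than "already exists" is a real failure, so rethrow it.
    if (code !== 'tuple_already_exists') {
      throw error;
    }
    const patch = [{ op: 'replace', path: '/cards/0', value: metadata.cards[0] }];
    await client.put(url, patch, withType('application/json-patch+json'));
    return 'updated';
  }
}

// Stand-in client that always reports existing metadata, to exercise the fallback.
const fakeClient = {
  post() {
    return Promise.reject({ response: { data: { code: 'tuple_already_exists' } } });
  },
  put() {
    return Promise.resolve({ status: 200 });
  }
};

saveSkillMetadata(fakeClient, '12345', { cards: [{ entries: [] }] }, 'WRITE_TOKEN')
  .then((result) => console.log(result)); // prints "updated"
```

In the real app you would pass axios itself as the client, since its post and put methods take the same arguments.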

That’s it! From this example we can see the end-to-end process of Skills and all of the items that are worth noting at each step. For more example Skills, take a look at our sample Skills project page.

Happy coding!