Sharing 100k folders with the Box Platform
The Box CLI lets developers manipulate content objects in bulk. In this article we explore how to share folders in an automated way.
Inspiration for this article came from a developer query in our forum, “Is there an automated way to create shared links to folders?”
The immediate answer is “yes”, but let’s take a look at the details of this use case.
UPDATE: Box CLI v3.6.0 see below…
For a sense of scale, there are 100,000 folders containing 2.5 million images, so doing this manually is impractical.
The objective is to create a table with information about each set of images (folder), and include the shared link.
This table will power a searchable app, and allow users to open and view the contents of each folder.
We’ll be using the Box CLI, and all the scripts are available in this GitHub repo.
Let’s get started.
Step 0: Create some sample folders
For this demo to work we need to create some sample folders, and there is a script for that…
This creates 110 folders in a topic and sub-topic hierarchy. A drop in the ocean compared to the use case, but it will give us something to work with.
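The actual script lives in the repo. As a rough sketch of what such a script could look like, the function below builds the /100k/Topic_X/Topic_X_SubTopic_Y tree with box folders:create. The function name, the 10 by 10 default (10 topics plus 100 sub-topics gives 110 folders under the root) and the reliance on the --id-only flag are assumptions made for illustration, not a copy of the repo's script.

```bash
#!/usr/bin/env bash
# Sketch only: create /100k/Topic_X/Topic_X_SubTopic_Y below a parent folder.
# Assumption: `box folders:create PARENT_ID NAME --id-only` prints just the new folder ID.

create_sample_folders() {
  local parent_id="$1" topics="${2:-10}" subtopics="${3:-10}"
  local root_id topic_id t s

  root_id=$(box folders:create "$parent_id" "100k" --id-only) || return 1

  t=1
  while [ "$t" -le "$topics" ]; do
    topic_id=$(box folders:create "$root_id" "Topic_${t}" --id-only) || return 1
    s=1
    while [ "$s" -le "$subtopics" ]; do
      box folders:create "$topic_id" "Topic_${t}_SubTopic_${s}" --id-only > /dev/null || return 1
      s=$((s + 1))
    done
    t=$((t + 1))
  done

  # Print the root folder ID so later steps can start from it
  echo "$root_id"
}
```

Calling create_sample_folders 0 would build the tree in the account's root folder (ID 0) and print the ID of the new 100k folder.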
Step 1: List all the folders
We could try to search for folders using the API, but this implies the folders follow some naming convention we can search on, or offer some other clever search possibility, like metadata.
For example, our structure is /100k/Topic_X/Topic_X_SubTopic_Y, so we could do something like:
box search topic --fields type,id,name,parent --type folder --content-types name --all --csv
With the following result:
Note: search relies on an index, so if you are following along in your terminal and have just created the folders, it may take some time before the new objects show up in the results.
Since we don’t know if that is the case, we’ll go the hardcore way and create a script that lists all folders recursively.
It will be slow, though.
However, if we include the recursion level we can do some filtering.
Given our assumed tree structure, there is no point in looking inside the last level of folders, since these only contain files. We can stop the recursion at that level.
Most of our Box CLI sample scripts are written in PowerShell, but this time we’ll use some good old bash. It runs on Linux, macOS and Windows Subsystem for Linux (WSL).
So this line, box folders:items 123 --csv --fields type,id,name,description, lists the content of folder_id=123 and outputs a CSV-formatted list.
Then we filter only the folders (if [[ "$type_name" == "folder" ]]) and include the $level in the output.
Finally we check the recursion level with if [ $level -lt 1 ], and stop the recursion at the last level (in our case the last level is 1).
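Putting those three pieces together, a minimal sketch of the recursive listing could look like the function below. It is not the script from the repo: the function name and the max_level parameter are made up for illustration, the description field is left out to keep the parsing short, and it assumes the CSV columns come back in the order type,id,name with no commas or quotes inside folder names.

```bash
#!/usr/bin/env bash
# Sketch only: print "type,id,name,level" for every folder below a starting folder.
# Assumptions: CSV columns arrive as type,id,name and names contain no commas or quotes.

list_folders() {
  local folder_id="$1" level="${2:-0}" max_level="${3:-1}"

  # </dev/null stops the CLI from consuming the lines the outer loop is reading.
  # tail -n +2 drops the CSV header row.
  box folders:items "$folder_id" --csv --fields type,id,name < /dev/null |
    tail -n +2 |
    while IFS=, read -r type_name id name; do
      if [ "$type_name" = "folder" ]; then
        echo "$type_name,$id,$name,$level"
        # Only descend while we are above the last level
        if [ "$level" -lt "$max_level" ]; then
          list_folders "$id" $((level + 1)) "$max_level"
        fi
      fi
    done
}
```

With the example folder ID from above, the full list could be written to a file with a header row like this:

{ echo "type,id,name,level"; list_folders 123 0 1; } > tree_100k.csv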
Step 2: Filter the list
From the complete list of folders, we now filter to keep just the last level of folders. These are the ones for which we want to create the shared links.
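As a small sketch of this filter, assuming the folder list is a CSV with a header row and the level in the fourth column (type,id,name,level), awk can keep the header plus the rows at the last level. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: keep the header row plus the rows whose 4th column (level) matches.
# Assumption: stdin is a "type,id,name,level" CSV with a header row.

filter_last_level() {
  awk -F, -v last="${1:-1}" 'NR == 1 || $4 == last'
}
```

For example, filter_last_level 1 < tree_100k.csv would print only the sub-topic folders from the list built in the previous step.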
This is the result:
Step 3: Creating the shared links in bulk
Now that we have the list with the folders for which we want to create the shared links we can just pipe it to the Box CLI.
So now we have two files that we need to join.
paste -d, is necessary because the shared link creation does not output the folder_id, so we are joining the list of folders with the result of the shared link creation. This only works if there were no errors during creation. If there were errors, we get a mismatched list between folder IDs and shared links.
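One way to make the paste step a little less fragile is to compare the row counts first and refuse to join when they differ. The sketch below does that; the function name and the file arguments are placeholders. Note that it only catches missing rows, not rows that came back in a different order.

```bash
#!/usr/bin/env bash
# Sketch only: join two CSV files line by line, but only if they have the same number of rows.

join_links() {
  local folders_csv="$1" links_csv="$2"
  local n_folders n_links

  # The arithmetic expansion also strips the padding some wc implementations add
  n_folders=$(( $(wc -l < "$folders_csv") ))
  n_links=$(( $(wc -l < "$links_csv") ))

  if [ "$n_folders" -ne "$n_links" ]; then
    echo "row count mismatch: $n_folders folders, $n_links links" >&2
    return 1
  fi

  paste -d, "$folders_csv" "$links_csv"
}
```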
A safer option is to take the shared link creation output and get the details of the folder from the URL:
box shared-links:get --bulk-file-path tree_100k_shared_links_tmp.csv --csv --fields type,id,name,shared_link,parent
Or get the details of the shared link from the original list of folders:
box folders:get --bulk-file-path tree_100k.csv --csv --fields type,id,name,shared_link,parent
A completely different approach would be to create the shared links one by one from the folder list, and get all the information we need with one single call to the API:
Although apparently slower, it has the advantage of creating the final output file without the need of running secondary commands, and properly handling errors.
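A loop along these lines illustrates the one-by-one idea. It is a sketch under assumptions, not the script from the repo: it assumes box shared-links:create takes the item ID and item type as positional arguments, that it exits non-zero on failure, that its CSV output starts with a header row, and that the output already contains the folder ID (see the v3.6.0 note below). The function name and the three file arguments are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: create shared links one folder at a time and log failures separately.
# Assumptions: positional args are ITEM_ID ITEM_TYPE; the CLI exits non-zero on error;
# the input is a "type,id,name,level" CSV with a header row.

share_one_by_one() {
  local folders_csv="$1" out_file="$2" err_file="$3"
  local row

  : > "$out_file"
  : > "$err_file"

  tail -n +2 "$folders_csv" |
    while IFS=, read -r type_name id name level; do
      if row=$(box shared-links:create "$id" "$type_name" --access open --no-can-download --csv < /dev/null); then
        # Drop the per-call CSV header and keep the data row
        printf '%s\n' "$row" | tail -n +2 >> "$out_file"
      else
        # Remember the folder so it can be retried later
        echo "$id" >> "$err_file"
      fi
    done
}
```

Because each folder is handled on its own, a failed call only adds one ID to the error file instead of shifting every following row.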
With the release of Box CLI v3.6.0, the engineering team included the id and the type of the object in the output of the box shared-links:create command.
This means you no longer need to merge files or make secondary calls to know the relation between the shared link URL and the file or folder object it belongs to.
The command below now produces output that includes these columns:
box shared-links:create -y --bulk-file-path $csv_file_bulk --access open --no-can-download --csv --save-to-file-path $csv_file_out
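With the id and type in the same file as the link, the table for the app can be cut straight from that single output. The helper below is a sketch that selects CSV columns by header name; pick_columns is a made-up name, and the url column used in the example is an assumption about the output header, so check the first line of your own output file. It does not handle quoted fields that contain commas.

```bash
#!/usr/bin/env bash
# Sketch only: print selected CSV columns, chosen by header name, in the order requested.
# Assumption: simple CSV on stdin, header in the first row, no quoted commas.

pick_columns() {
  awk -F, -v want="$1" '
    NR == 1 {
      n = split(want, names, ",")
      for (i = 1; i <= NF; i++) pos[$i] = i
      for (j = 1; j <= n; j++) {
        if (!(names[j] in pos)) {
          print "missing column: " names[j] > "/dev/stderr"
          exit 1
        }
      }
    }
    {
      line = $(pos[names[1]])
      for (j = 2; j <= n; j++) line = line "," $(pos[names[j]])
      print line
    }
  '
}
```

For example, pick_columns id,url < "$csv_file_out" would print one folder ID and link per row.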
The scripts on GitHub have been updated.
So big kudos to the team, Artur Jankowski, Lukasz Socha and Minh Cong, for turning feedback from the community into reality in record time.
Removing the shared links is just a matter of piping the output of the folder list to the delete command.
box shared-links:delete --bulk-file-path tree_100k_shared_link_tmp.csv