Turbocharging the Box CLI with GNU Parallel

Kyle Crews
Published in Box Developer Blog
4 min read · Jul 24, 2024

Welcome, tech enthusiasts! If you’re looking to boost the efficiency of your workflow by running multiple Box CLI commands simultaneously, you’re in the right place. Today, we’re diving into how to use GNU Parallel — a remarkable tool for executing jobs in parallel — to supercharge your Box CLI operations. Say goodbye to sequential boredom and hello to lightning-fast task execution!

What is GNU Parallel?

GNU Parallel is a shell tool for executing jobs in parallel. It’s like having a turbo button for your command line tasks, speeding up processing by running multiple jobs simultaneously. It’s not just fast; it’s also flexible, letting you control how many jobs run at once and feed them input from files, pipes, or the command line.

Setting Up GNU Parallel

First things first, let’s set up GNU Parallel on your system. Here’s how to do it across various operating systems:

On Ubuntu/Debian:

sudo apt-get update
sudo apt-get install parallel

On macOS:

brew install parallel

On CentOS/Fedora:

sudo yum install parallel

On Windows:

For Windows users, you can use GNU Parallel through a Unix-like environment provided by Cygwin. Here’s a quick guide:

  • Download and install Cygwin from the official Cygwin website.
  • During the installation, make sure to select the parallel package under the ‘Devel’ category.

Once installed, verify the installation by running:

parallel --version

You should see the version number of GNU Parallel displayed.
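
Before moving on, it can help to confirm that both tools are reachable from your shell. The snippet below is a minimal sketch of one way to do that. The missing_tools helper is a hypothetical name, not part of GNU Parallel or the Box CLI, and the sketch assumes the Box CLI is installed under the command name box.

```shell
# Hypothetical helper: print each named tool that is NOT found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the two commands this article relies on.
missing=$(missing_tools parallel box)
if [ -n "$missing" ]; then
  echo "Not found on PATH: $(echo "$missing" | tr '\n' ' ')"
else
  echo "parallel and box are both available"
fi
```

If either name is reported, revisit the installation step for that tool before continuing.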

Integrating GNU Parallel with Box CLI

Assuming you have the Box CLI installed (if not, visit the Box CLI Quick Start Guide for installation instructions), you’re almost ready to harness the power of parallel processing.

Preparing Your Input File

GNU Parallel works wonders with input files. Suppose you need to upload multiple files to your Box account. Instead of manually entering commands for each file, you can list all the file paths in a text file.

Create a file named files-to-upload.txt and list each file path you want to upload, one per line:

/path/to/file1.pdf
/path/to/file2.jpg
/path/to/file3.docx
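
If the files already sit together in one local folder, the list can be generated rather than typed. The following is a rough sketch under that assumption; list_files is a hypothetical helper name, and the demo directory and file names are placeholders created only so the example runs anywhere.

```shell
# Hypothetical helper: print the absolute path of every regular file
# under the given directory, one per line, in sorted order.
list_files() {
  dir=$(cd "$1" && pwd) || return 1
  find "$dir" -type f | sort
}

# Demo on a throwaway directory so the sketch runs anywhere.
demo_dir=$(mktemp -d)
touch "$demo_dir/file1.pdf" "$demo_dir/file2.jpg" "$demo_dir/file3.docx"
list_file=$(mktemp)
list_files "$demo_dir" > "$list_file"
cat "$list_file"
```

In practice you would redirect the output to files-to-upload.txt instead of a temporary file. Note that file names containing newline characters would break the one-path-per-line format.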

Writing the Parallel Command

Now, let’s craft a command that uses GNU Parallel to read each line from your input file and execute the Box CLI upload command concurrently:

parallel -j 10 box files:upload {} --parent-id 123456789 --name {/} :::: files-to-upload.txt

Here’s a quick breakdown of this command:

  • parallel: Invokes GNU Parallel.
  • -j 10: Runs up to 10 jobs in parallel. Adjust this number based on your CPU capabilities and network bandwidth, or omit it and GNU Parallel will default to one job per available CPU.
  • box files:upload {} --parent-id 123456789 --name {/}: The Box CLI command to upload files. {} is replaced by the input line from the file (the file path), --parent-id is the ID of the Box folder where the files should be uploaded, and {/} is the basename of the input path, used as the uploaded file name.
  • :::: files-to-upload.txt: Specifies the input file for GNU Parallel. Be sure to run the command from the directory where this file is located.
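
To make the {} and {/} substitutions concrete, here is a small sketch that mimics them with a plain shell loop and prints the resulting commands without running anything. The preview_uploads helper is hypothetical and only imitates the substitution for illustration; it is not how GNU Parallel works internally.

```shell
# Hypothetical helper: print the command that would be built for each line
# of a path list, without running anything.
#   $1 = list file, $2 = Box parent folder ID
preview_uploads() {
  while IFS= read -r path || [ -n "$path" ]; do
    name=$(basename "$path")                # same value {/} stands for
    printf 'box files:upload %s --parent-id %s --name %s\n' "$path" "$2" "$name"
  done < "$1"
}

# Demo with the sample paths and the placeholder folder ID from the article.
sample=$(mktemp)
printf '%s\n' /path/to/file1.pdf /path/to/file2.jpg /path/to/file3.docx > "$sample"
preview_uploads "$sample" 123456789
```

GNU Parallel also has a --dry-run flag that prints the commands it would run instead of executing them, which is a convenient check before a real upload.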

Instead of a simple text file, you can also use a CSV file for better data management. Suppose you need to delete multiple items from a user’s trash in Box. Here’s how you’d prepare the input file:

  • Create a CSV file named input.csv with the following columns: item_type, item_id, and user_id. The first row is a header, which the awk command in the next step skips:

item_type,item_id,user_id
file,100000000,0000001
folder,100000001,0000002
file,100000002,0000001

  • Use awk to process this CSV and pass it to GNU Parallel:

awk -F, 'NR > 1 {print $1 "," $2 "," $3}' input.csv | parallel -j12 --colsep ',' box trash:delete {1} {2} --as-user {3}

Here’s a breakdown of this command:

  • awk -F, 'NR > 1 {print $1 "," $2 "," $3}' input.csv: Parses the CSV, skips the header row, and prints the three columns of each remaining line.
  • parallel -j12 --colsep ',' box trash:delete {1} {2} --as-user {3}: Runs up to 12 jobs in parallel. --colsep ',' splits each input line on commas, and {1}, {2}, and {3} are placeholders for the resulting columns, corresponding to item_type, item_id, and user_id respectively.
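
Because each row turns directly into a delete command, a quick sanity check of the CSV can be worthwhile first. The sketch below is one possible check, assuming every data row should have exactly three fields and an item_type of file or folder, as in the sample above. The Box CLI may accept other item types, so treat that list as an assumption. check_trash_csv is a hypothetical helper name.

```shell
# Hypothetical helper: report data rows that do not have exactly three
# comma-separated fields, or whose item_type is not file or folder.
# Exits non-zero if any such row is found.
check_trash_csv() {
  awk -F, '
    NR == 1 { next }                       # skip the header row
    NF != 3 || ($1 != "file" && $1 != "folder") {
      printf "line %d looks wrong: %s\n", NR, $0
      bad = 1
    }
    END { exit bad + 0 }
  ' "$1"
}

# Demo: sample rows from the article plus one deliberately broken row.
csv=$(mktemp)
printf '%s\n' 'item_type,item_id,user_id' 'file,100000000,0000001' \
  'folder,100000001,0000002' 'fil,100000002' > "$csv"
check_trash_csv "$csv" || echo "fix the rows above before deleting anything"
```

Running the check on input.csv before the awk and parallel pipeline gives you a chance to catch malformed rows while nothing has been deleted yet.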

Executing the Command

Run the command in your terminal, and watch as GNU Parallel efficiently processes multiple uploads at once. It’s like conducting an orchestra where every musician plays perfectly in sync, except here, each musician is a file upload making its way to the cloud! ☁️
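
With many jobs running at once, it is easy to miss one that failed. GNU Parallel can write a per-job log when given the --joblog option, and the sketch below shows one way to pull the failed commands out of such a log. It assumes the log is tab-separated with a header row, the exit value in column 7, and the command in column 9; check this against the log your version produces. failed_jobs is a hypothetical helper, and the log contents in the demo are made up.

```shell
# Hypothetical helper: list the commands in a --joblog file whose exit
# value (assumed to be column 7) is non-zero.
failed_jobs() {
  awk -F '\t' 'NR > 1 && $7 != 0 { print $9 }' "$1"
}

# Demo with a hand-written log in the assumed format (values are made up).
log=$(mktemp)
printf 'Seq\tHost\tStarttime\tJobRuntime\tSend\tReceive\tExitval\tSignal\tCommand\n' > "$log"
printf '1\t:\t1721800000.000\t1.200\t0\t0\t0\t0\tbox files:upload /path/to/file1.pdf\n' >> "$log"
printf '2\t:\t1721800000.100\t0.900\t0\t0\t1\t0\tbox files:upload /path/to/file2.jpg\n' >> "$log"
failed_jobs "$log"
```

To produce a real log, you would add something like --joblog upload.log to the parallel command and then pass that file to the helper.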

Conclusion

By leveraging GNU Parallel with the Box CLI, you’ve not only saved time but also introduced a level of efficiency that can transform your productivity. Whether you’re handling massive datasets, performing routine backups, or running bulk file operations, GNU Parallel + Box CLI helps your tasks finish sooner than they would one command at a time.

Embrace the power of parallel processing with GNU Parallel + Box CLI, and watch your productivity soar!

Join our Box Community for any questions related to Box CLI and knowledge sharing! 🦄

See you there!
