Run TensorFlow on a decarbonated cloud

Rémi B
Published in Qarnot
4 min read · Mar 1, 2022

Introduction

TensorFlow is a popular end-to-end open source platform for developing and deploying machine learning applications, built from a large set of tools and libraries. Training a TensorFlow model on Qarnot and getting predictions from it is straightforward. This step-by-step walkthrough shows how, so follow along!

Prerequisites

Before starting a calculation with the Python SDK, a few steps are required:

  • Retrieve the authentication token (here)
  • Install Qarnot’s Python SDK (here)

Note: in addition to the Python SDK, Qarnot provides C# and Node.js SDKs and a command-line interface.

Test Case

This test case shows how to train a dog and cat image classifier from a pretrained model, then make predictions using Qarnot’s Pools. The input files needed for this tutorial can be downloaded here.

Launching the test case

Once you have set up your working environment, you can start with the first step: training the dog and cat image classifier. Be sure to replace <<<MY_SECRET_TOKEN>>> in each script with your authentication token so that the task can be launched on Qarnot.
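To avoid pasting the token into every script, one option is to read it from an environment variable. The sketch below assumes a variable named QARNOT_TOKEN, which is a hypothetical name chosen for this example and not something the SDK defines. The connect() helper uses the same Connection call as the scripts in this tutorial.

```python
import os


def load_token(env=os.environ, var_name="QARNOT_TOKEN"):
    """Return the token stored in `var_name`, or raise if it is missing.

    QARNOT_TOKEN is a hypothetical variable name chosen for this sketch.
    """
    token = env.get(var_name, "").strip()
    if not token or token == "<<<MY_SECRET_TOKEN>>>":
        raise RuntimeError(f"Set {var_name} to your Qarnot authentication token")
    return token


def connect():
    """Open a connection to Qarnot using the token from the environment."""
    import qarnot  # imported here so load_token() works without the SDK installed

    return qarnot.connection.Connection(client_token=load_token())
```

With this in place, each script can call conn = connect() instead of embedding the token.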

Training the image classifier

First, transfer your local data to buckets on Qarnot. To do so, copy the following code into a Python script named sync_bucket.py and run python3 sync_bucket.py from your terminal.

#!/usr/bin/env python3

# Import the Qarnot SDK
import qarnot

# Connect to the Qarnot platform
conn = qarnot.connection.Connection(client_token='<<<MY_SECRET_TOKEN>>>')

# Create a bucket for the scripts that will run once the task starts
bucket = conn.create_bucket("tensorflow-in-scripts")
bucket.sync_directory("buckets/scripts")

# Create a bucket for the pretrained model
bucket = conn.create_bucket("tensorflow-in-pretrained_model")
bucket.sync_directory("buckets/pretrained_model")

# Create a bucket for the training images
bucket = conn.create_bucket("tensorflow-in-learn")
bucket.sync_directory("buckets/learn")

# Create a bucket for the images we want to label
bucket = conn.create_bucket("tensorflow-in-dogscats-small")
bucket.sync_directory("buckets/dogscats-small")
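Before uploading, it can help to check that the expected local folders are present and not empty. The following is a minimal sketch, assuming the input files were unpacked under a local buckets/ directory as in the script above. The helper names missing_folders and sync_all are illustrative and not part of the SDK; only create_bucket and sync_directory come from the script.

```python
from pathlib import Path

# Local folder -> bucket name, matching sync_bucket.py
BUCKETS = {
    "scripts": "tensorflow-in-scripts",
    "pretrained_model": "tensorflow-in-pretrained_model",
    "learn": "tensorflow-in-learn",
    "dogscats-small": "tensorflow-in-dogscats-small",
}


def missing_folders(root="buckets"):
    """Return the expected input folders that are absent or empty under `root`."""
    missing = []
    for folder in BUCKETS:
        path = Path(root) / folder
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(folder)
    return missing


def sync_all(conn, root="buckets"):
    """Upload every local folder to its bucket with the same calls as sync_bucket.py."""
    problems = missing_folders(root)
    if problems:
        raise FileNotFoundError(f"Missing or empty folders under {root}: {problems}")
    for folder, bucket_name in BUCKETS.items():
        bucket = conn.create_bucket(bucket_name)
        bucket.sync_directory(str(Path(root) / folder))
```

Calling sync_all(conn) with the connection created earlier performs the same four uploads, but stops early if a folder was forgotten.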

Once the transfer is done, you should see your newly created buckets on your Console, where you can inspect the images you just uploaded! Below is an example of an input image of a cat.

You are now ready to train your dog and cat classifier on Qarnot. Copy the following code into a Python script named qlearn.py and run python3 qlearn.py & from your terminal.

#!/usr/bin/env python3

# Import the Qarnot SDK
import qarnot

# Connect to the Qarnot Platform
conn = qarnot.connection.Connection(client_token='<<<MY_SECRET_TOKEN>>>')

# Create a task
task = conn.create_task("Hello World - Tensorflow - Train", "docker-batch", 1)

# Retrieve the created input buckets
task.resources = [
    conn.retrieve_bucket("tensorflow-in-pretrained_model"),
    conn.retrieve_bucket("tensorflow-in-scripts"),
    conn.retrieve_bucket("tensorflow-in-learn")
]

# Create an output bucket
task.results = conn.create_bucket("tensorflow-out-model")

# Give parameters regarding the Docker image to be used
task.constants["DOCKER_REPO"] = "qarnotlab/tensorflow"
task.constants["DOCKER_TAG"] = "1.12.0"
task.constants["DOCKER_CMD"] = "bash train.sh animals/"

# Submit the task
task.run(output_dir='model')

Training results

At any time, you can monitor the status of your task on your Console. Once training is done, the task should switch to the Success state, shown in green.

You can then check your newly trained classifier in the output bucket tensorflow-out-model.
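The status can also be followed from the script itself rather than from the Console. The sketch below is one way to do that: it replaces the single task.run(output_dir='model') call with an explicit submit, a polling loop and a download. It assumes the task object was built as in qlearn.py; the function name follow_task and the 5-second polling interval are choices made for this example.

```python
def follow_task(task, poll_seconds=5, output_dir="model"):
    """Submit `task`, print state changes and new output, return the final state."""
    task.submit()
    last_state = None
    done = False
    while not done:
        # Print whatever the task wrote to stdout since the last poll
        new_output = task.fresh_stdout()
        if new_output:
            print(new_output, end="")
        # Report each state transition once
        if task.state != last_state:
            last_state = task.state
            print(f"** {last_state}")
        # wait() blocks for up to poll_seconds and returns True once the task has ended
        done = task.wait(poll_seconds)
    if task.state == "Success":
        task.download_results(output_dir)
    return task.state
```

In qlearn.py, this would mean replacing the last line with follow_task(task).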

Labeling the cat and dog images

Now that you have retrained the model for dog and cat image classification, it’s time to do some labeling. The labeling runs in a pool. To launch the classification task, copy the following code into a Python script named qlabel_pool.py and run python3 qlabel_pool.py & from your terminal.

#!/usr/bin/env python3

# Import the Qarnot SDK
import qarnot

# Connect to the Qarnot Platform
conn = qarnot.connection.Connection(client_token='<<<MY_SECRET_TOKEN>>>')

# Create a task
task = conn.create_task("Hello World - Tensorflow - Train", "docker-batch", 1)

# Retrieve the created input buckets
task.resources = [
    conn.retrieve_bucket("tensorflow-in-pretrained_model"),
    conn.retrieve_bucket("tensorflow-in-scripts"),
    conn.retrieve_bucket("tensorflow-in-learn")
]

# Create an output bucket
task.results = conn.create_bucket("tensorflow-out-model")

# Give parameters regarding the Docker image to be used
task.constants["DOCKER_REPO"] = "qarnotlab/tensorflow"
task.constants["DOCKER_TAG"] = "1.12.0"
task.constants["DOCKER_CMD"] = "bash train.sh animals/"

# Submit the task
task.run(output_dir='model')

The script above launches a pool that provisions 10 servers and runs a task with 100 computation instances; each instance labels one of the 100 images you transferred to the tensorflow-in-dogscats-small bucket. Once the task is running, you can view its progress from your Console and see the status of each instance in the instance visualizer on the left.
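The setup described above (a pool of 10 servers, a task with 100 instances, results in tensorflow-out-sorted-pool) might translate into SDK calls roughly as follows. This is a sketch under assumptions, not the tutorial's own script: the pool and task names are invented for the example, attaching the buckets to the pool rather than to the task is a guess, and the labeling command is not given in this article, so it is left as the label_cmd parameter.

```python
def launch_labeling(conn, label_cmd, servers=10, instances=100):
    """Start a pool of `servers` nodes and submit a task with `instances` instances.

    `label_cmd` is the shell command each instance runs. The article does not
    state it, so it is a parameter here (hypothetical).
    """
    # Pool: servers are provisioned once with the Docker image and input data
    pool = conn.create_pool("Hello World - Tensorflow - Pool", "docker-batch", servers)
    pool.resources = [
        conn.retrieve_bucket("tensorflow-in-scripts"),
        conn.retrieve_bucket("tensorflow-out-model"),  # model produced by training
        conn.retrieve_bucket("tensorflow-in-dogscats-small"),  # images to label
    ]
    pool.constants["DOCKER_REPO"] = "qarnotlab/tensorflow"
    pool.constants["DOCKER_TAG"] = "1.12.0"
    pool.submit()

    # Task: passes the pool object where the training script passed a profile name
    task = conn.create_task("Hello World - Tensorflow - Label", pool, instances)
    task.results = conn.create_bucket("tensorflow-out-sorted-pool")
    task.constants["DOCKER_CMD"] = label_cmd
    task.submit()
    return pool, task
```

Compared with the training step, the main difference is that the task is created inside an already running pool, so the 100 instances share the 10 provisioned servers.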

Labeling results

Once the task is finished, you can inspect the results in the tensorflow-out-sorted-pool bucket on your Console and look at the different images your model labeled.

Going further

If you are looking for more Qarnot tutorials on Machine Learning, have a look at our blog or our website.

Qarnot offers a simple, secure and decarbonated HPC cloud service