Deploying Tensorflow 2.1 as C/C++ executable

Amirul Abdullah
Feb 2 · 7 min read

Here is a use case that I believe some data engineers and data scientists are facing:

How do I deliver a TensorFlow model that I trained in Python, but deploy in pure C/C++ on the client side, without setting up a Python environment on their side, and with everything shipped as binaries?

The answer is to use the TensorFlow C or C++ API. In this article, we look only at how to use the C API (not the C++ API or TensorFlow Lite), and only on CPU. The environment I will use throughout the article is as follows:

  • OS: Linux (tested and working on a fresh Ubuntu 19.10 / openSUSE Tumbleweed)
  • Latest GCC
  • TensorFlow from GitHub (master branch, 2.1)
  • No GPU

Also, I would like to credit Vlad Dovgalecs and his article on Medium, as this tutorial is largely based on and improved from his findings. Check out my repo for the full code.

Tutorial Structure

This article will be a bit lengthy, but here is what we will do, step by step:

  1. Clone the TensorFlow source code and compile it to get the C API headers/binaries
  2. Build the simplest model using Python & TensorFlow and export it as a SavedModel that can be read by the C API
  3. Write a simple C program, compile it with gcc, and run it like a normal executable file.

So here we go,

1. Getting the Tensorflow C API

As far as I know, there are two ways to get the C API headers and binaries:

  • Download the precompiled TensorFlow C API from the website (the binaries tend not to be up to date), OR
  • Clone and compile from the source code (a long process, but if things don't work, we can debug and look at the API)

So I am going to show how to compile the code and use the resulting binaries.

Create a folder and clone the project

$ git clone https://github.com/tensorflow/tensorflow.git

You will need Bazel to compile. Install it in your environment.

Ubuntu :

$ sudo apt update && sudo apt install bazel-1.2.1

OpenSuse :

$ sudo zypper install bazel

Whichever platform you use, make sure the Bazel version is 1.2.1, as this is what TensorFlow 2.1 currently uses. This might change in the future.

Next, we need to install the NumPy Python package (why would we need a Python package to build a C API??). You can install it however you want, as long as it can be referenced during compilation, but I prefer to install it through Miniconda and have a separate virtual environment for the build. Here's how:

Install Miniconda :

$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh 
$ chmod +x Miniconda3-latest-Linux-x86_64.sh
$ ./Miniconda3-latest-Linux-x86_64.sh

Create a new environment named tf-build with NumPy:

$ conda create -n tf-build python=3.7 numpy

We will use this environment later, during compilation.

The TensorFlow 2.1 source code has a bug that will cause the build to fail. Refer to this issue. The fix is to apply the patch here. I included a file in my repo that can be used as the patch:

$ git apply p.patch

In the future, this might be fixed and no longer relevant.

Referring to the TensorFlow documentation and the GitHub readme, here is how we compile it. We need to activate our conda environment first so that the build can reference NumPy:

$ conda activate tf-build 
$ bazel test -c opt tensorflow/tools/lib_package:libtensorflow_test
$ bazel build -c opt tensorflow/tools/lib_package:libtensorflow

Let me WARN you again: it took 2 hours to compile on an Ubuntu VM with a 6-core configuration. My friend's 2-core laptop basically froze trying to compile this. So here is a bit of advice: run the build on a server with a good CPU and plenty of RAM.

Copy the file at bazel-bin/tensorflow/tools/lib_package/libtensorflow.tar.gz to your desired folder and untar it like below:

$ tar -C /usr/local -xzf libtensorflow.tar.gz

I untarred it in my home folder instead of /usr/local, as I was just trying it out.

CONGRATULATIONS!! YOU MADE IT. Compiling TensorFlow, at least.

2. A simple model with Python

For this step, we will create a model using the tf.keras.layers classes and save it for us to load later using the C API. For that, we need Python TensorFlow to generate the model. Refer to the full code in model.py in the repo.

We will need a separate conda environment for this step:

$ conda create -n tf python=3.7 tensorflow

Here is a simple model: a custom tf.keras.Model with a single dense layer, which is initialized with ones. Hence the output of this model (produced by its def call()) will be identical to the input.

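The embedded gist is not reproduced here; below is a minimal sketch of what model.py does. The class name, the use_bias=False choice, and the tf.cast inside call() are my assumptions (the cast keeps the saved signature's input as DT_INT64, which we will see from saved_model_cli below), so check the repo for the exact listing.

import tensorflow as tf

class TestModel(tf.keras.Model):
    def __init__(self):
        super(TestModel, self).__init__()
        # A single dense layer whose kernel is initialized with ones and
        # which has no bias, so the output equals the input.
        self.dense = tf.keras.layers.Dense(1, kernel_initializer='ones',
                                           use_bias=False)

    def call(self, inputs):
        # Cast the int64 input to float32 before the dense layer's matmul.
        return self.dense(tf.cast(inputs, tf.float32))

module = TestModel()
input_data = tf.constant([[10]], dtype=tf.int64)

# Calling the model once forces it to build its graph so it can be saved.
print(module(input_data))

# Export as a SavedModel that the C API can read.
module.save('model', save_format='tf')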

Since TensorFlow 2.0, eager execution allows us to run a model without drafting the graph and running it through a session. But in order to save the model (refer to the line module.save('model', save_format='tf')), the graph needs to be built before it can be saved; hence we need to call the model at least once for it to create the graph. Calling print(module(input_data)) forces it to create the graph.

Next, run the code:

$ conda activate tf
$ python model.py

You should get an output as below:

tf.Tensor([[10.]], shape=(1, 1), dtype=float32)

You should also see a folder called model created.

When we save a model, TensorFlow creates a folder with a bunch of files inside it, which store the weights and graphs of the model. TensorFlow has a tool for diving into these files so we can match the input tensor and the output tensor: saved_model_cli, a command-line tool that comes with the TensorFlow installation.

We need to extract the graph names of the input tensor and output tensor, and use that information when calling the C API later on. Here's how:

$ saved_model_cli show --dir <path_to_saved_model_folder>

Running this with the appropriate path substituted, you should get an output like below:

The given SavedModel contains the following tag-sets: 
serve

Use this tag-set to drill further into the tensor graph. Here's how:

$ saved_model_cli show --dir <path_to_saved_model_folder> --tag_set serve

and you should get an output like below:

The given SavedModel MetaGraphDef contains SignatureDefs with the following keys:
SignatureDef key: "__saved_model_init_op"
SignatureDef key: "serving_default"

Pass the serving_default signature key to the command to print out the tensor nodes:

$ saved_model_cli show --dir <path_to_saved_model_folder> --tag_set serve --signature_def serving_default

and you should get an output like below:

The given SavedModel SignatureDef contains the following input(s):
  inputs['input_1'] tensor_info:
      dtype: DT_INT64
      shape: (-1, 1)
      name: serving_default_input_1:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['output_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1)
      name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict

We will need the names serving_default_input_1 and StatefulPartitionedCall later in the C API.

3. Building C/C++ code

The third part is to write the C code that uses the TensorFlow C API and imports the model saved from Python. The full code can be found here.

There is no proper documentation for the C API, so if something goes wrong, it is best to look back at the C headers in the source code. (You can also debug using GDB and learn step by step how the C API works.)

In an empty C file, include the TensorFlow C API as follows:

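A minimal sketch of this starting point (assuming the headers were untarred as in step 1, so that tensorflow/c/c_api.h is on the include path):

#include <stdio.h>
#include <stdlib.h>
#include "tensorflow/c/c_api.h"

// A deallocator that does nothing: we manage the input buffer ourselves,
// so TF_NewTensor must not attempt to free it.
void NoOpDeallocator(void* data, size_t a, void* b) {}

int main()
{
    // Quick sanity check that the header and library are wired up;
    // the code from the following steps goes here, before the return.
    printf("TensorFlow C library version %s\n", TF_Version());
    return 0;
}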

Note that we have declared a void function NoOpDeallocator; we will use it later.

Next, we need to load the SavedModel and the session using the TF_LoadSessionFromSavedModel API.

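A sketch of this step, continuing inside main() (model/ is the folder created by model.py, and "serve" is the tag-set we found with saved_model_cli):

    TF_Graph* Graph = TF_NewGraph();
    TF_Status* Status = TF_NewStatus();
    TF_SessionOptions* SessionOpts = TF_NewSessionOptions();
    TF_Buffer* RunOpts = NULL;

    const char* saved_model_dir = "model/"; // folder created by model.py
    const char* tags = "serve";             // tag-set from saved_model_cli
    int ntags = 1;

    TF_Session* Session = TF_LoadSessionFromSavedModel(
        SessionOpts, RunOpts, saved_model_dir, &tags, ntags,
        Graph, NULL, Status);

    if (TF_GetCode(Status) == TF_OK)
        printf("TF_LoadSessionFromSavedModel OK\n");
    else
        printf("%s", TF_Message(Status));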

Next, we grab the tensor nodes from the graph by their names. Remember that earlier we searched for the tensor names using saved_model_cli? This is where we use them, in the call to TF_GraphOperationByName(). In this example, serving_default_input_1 is our input tensor and StatefulPartitionedCall is our output tensor.

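A sketch, continuing from the previous snippet (the second field of each TF_Output is the tensor index, i.e. the :0 suffix we saw earlier):

    // Input operation, looked up by the name reported by saved_model_cli.
    int NumInputs = 1;
    TF_Output* Input = malloc(sizeof(TF_Output) * NumInputs);
    TF_Output t0 = {TF_GraphOperationByName(Graph, "serving_default_input_1"), 0};
    if (t0.oper == NULL)
        printf("ERROR: Failed TF_GraphOperationByName serving_default_input_1\n");
    else
        printf("TF_GraphOperationByName serving_default_input_1 is OK\n");
    Input[0] = t0;

    // Output operation, likewise looked up by name.
    int NumOutputs = 1;
    TF_Output* Output = malloc(sizeof(TF_Output) * NumOutputs);
    TF_Output t2 = {TF_GraphOperationByName(Graph, "StatefulPartitionedCall"), 0};
    if (t2.oper == NULL)
        printf("ERROR: Failed TF_GraphOperationByName StatefulPartitionedCall\n");
    else
        printf("TF_GraphOperationByName StatefulPartitionedCall is OK\n");
    Output[0] = t2;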

Next, we need to allocate the new input tensor locally using TF_NewTensor and set the input value; later we will pass it to the session run. NOTE that ndata is the total byte size of your data, not the length of the array.

Here we set the input tensor to a value of 20, so we should see an output value of 20 as well.

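A sketch, continuing inside main() (the dims and the TF_INT64 dtype follow the (-1, 1)/DT_INT64 signature we read with saved_model_cli):

    TF_Tensor** InputValues = malloc(sizeof(TF_Tensor*) * NumInputs);
    TF_Tensor** OutputValues = malloc(sizeof(TF_Tensor*) * NumOutputs);

    int ndims = 2;
    int64_t dims[] = {1, 1};     // one sample of shape (1, 1)
    int64_t data[] = {20};       // the input value we feed the model
    int ndata = sizeof(int64_t); // total byte size of data, NOT array length

    TF_Tensor* int_tensor = TF_NewTensor(TF_INT64, dims, ndims, data, ndata,
                                         &NoOpDeallocator, NULL);
    if (int_tensor != NULL)
        printf("TF_NewTensor is OK\n");
    else
        printf("ERROR: Failed TF_NewTensor\n");
    InputValues[0] = int_tensor;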

Next, we run the model by invoking the TF_SessionRun API. Here's how:

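A sketch, continuing from the previous snippet:

    // Run the session: feed InputValues into Input, fetch OutputValues
    // from Output. No target operations or run metadata are needed here.
    TF_SessionRun(Session, NULL,
                  Input, InputValues, NumInputs,
                  Output, OutputValues, NumOutputs,
                  NULL, 0, NULL, Status);

    if (TF_GetCode(Status) == TF_OK)
        printf("Session is OK\n");
    else
        printf("%s", TF_Message(Status));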

Lastly, we get the output value back from the output tensor using TF_TensorData, which extracts the data from the tensor object. Since we know that the size of the output is 1, we can print it directly. Otherwise, use TF_GraphGetTensorNumDims or other APIs available in c_api.h or tf_tensor.h.

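A sketch of this last step, continuing inside main():

    // TF_TensorData returns a pointer to the tensor's raw buffer;
    // the output signature tells us it holds float values.
    void* buff = TF_TensorData(OutputValues[0]);
    float* offsets = (float*)buff;
    printf("Result Tensor :\n");
    printf("%f\n", offsets[0]);
    return 0;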

Compile it as below:

$ gcc -I<path_of_tensorflow_api>/include/ -L<path_of_tensorflow_api>/lib main.c -ltensorflow -o main.out

Before you run it, you'll need to make sure the C library is exported in your environment:

$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<path_of_tensorflow_api>/lib

RUN IT

$ ./main.out

You should get an output like below. Notice that the output value is 20, just like our input. You can change the model to initialize the kernel with a weight of 2 and see whether it reflects a different value.

TF_LoadSessionFromSavedModel OK
TF_GraphOperationByName serving_default_input_1 is OK
TF_GraphOperationByName StatefulPartitionedCall is OK
TF_NewTensor is OK
Session is OK
Result Tensor :
20.000000

END

Originally published at https://github.com.
