Here is a use case that I believe many non-Data-Engineers/Data-Scientists face: how do I deliver a TensorFlow model that I trained in Python but deploy in pure C/C++ code on the client side, without setting up a Python environment on their side, and with everything shipped as binaries?
The answer is to use the TensorFlow C or C++ API. In this article, we only look at how to use the C API (not the C++ API or TensorFlow Lite), running on CPU only. The environment I will use throughout the article is as follows:
- OS: Linux (tested and working on a fresh Ubuntu 19.10 / openSUSE Tumbleweed)
- Latest GCC
- Tensorflow from Github (master branch 2.1)
- No GPU
This article will be a bit lengthy, but here is what we will do, step by step:
- Clone Tensorflow source code and compile to get the C API headers/binaries
- Build the simplest model using Python & Tensorflow and export it to tf model that can be read by C API
- Build a simple C program, compile it with gcc, and run it like a normal executable file.
So here we go,
1. Getting the Tensorflow C API
As far as I know, there are two ways to get the C API headers/binaries:
- Download the precompiled TensorFlow C API from the website (the binaries tend not to be up to date), OR
- Clone and compile from the source code (a long process, but if things don't work, we can debug and look at the API)
So I am going to show how to compile their code and use the resulting binaries.
Step A: clone their projects
Create a folder and clone the project
$ git clone https://github.com/tensorflow/tensorflow.git
Step B: Install the required tools (Bazel, Numpy)
You will need Bazel to compile. Install it for your environment.
On Ubuntu:
$ sudo apt update && sudo apt install bazel-1.2.1
On openSUSE:
$ sudo zypper install bazel
Whichever platform you use, make sure the Bazel version is 1.2.1, as this is what TensorFlow 2.1 currently uses. This might change in the future.
Next, we need to install the Numpy Python package (why would we need a Python package to build a C API??). You can install it however you want, as long as it can be referenced during compilation. I prefer to install it through Miniconda and have a separate virtual environment for the build. Here's how:
Install Miniconda :
$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ chmod +x Miniconda3-latest-Linux-x86_64.sh
$ ./Miniconda3-latest-Linux-x86_64.sh
Create a new environment + Numpy named tf-build:
$ conda create -n tf-build python=3.7 numpy
We will use this environment later in step D.
Step C: Apply a patch to the source code (IMPORTANT!)
$ git apply p.patch
In the future, this might be fixed and no longer be necessary.
Step D: Compile the code
$ conda activate tf-build
$ bazel test -c opt tensorflow/tools/lib_package:libtensorflow_test
$ bazel build -c opt tensorflow/tools/lib_package:libtensorflow
Let me WARN you again: it took 2 hours to compile on an Ubuntu VM with a 6-core configuration. My friend's 2-core laptop basically froze trying to compile this. A bit of advice: run it on a machine with a good CPU and plenty of RAM.
Copy the file at bazel-bin/tensorflow/tools/lib_package/libtensorflow.tar.gz to your desired folder and untar it like below:
$ tar -C /usr/local -xzf libtensorflow.tar.gz
I untarred it in my home folder instead of /usr/local, as I was just trying it out.
CONGRATULATIONS!! YOU MADE IT. Compiling TensorFlow, at least.
2. A simple model with Python
For this step, we will create a model using the tf.keras.layers classes and save it for us to load later using the C API. For that, we need the Python TensorFlow package to generate the model. Refer to the full code in model.py in the repo.
Step A: Install TensorFlow in conda
We will need to create a separate conda environment for this step:
$ conda create -n tf python=3.7 tensorflow
Step B: Write the model
Here is a simple model: a custom tf.keras.Model subclass with a single Dense layer whose kernel is initialized with ones. Hence the output of this model (from def call()) will be identical to the input.
Since TensorFlow 2.0, eager execution allows us to run a model without building a graph and running it through a session. But in order to save the model (see the line module.save('model', save_format='tf')), the graph needs to be built first. Hence we need to call the model at least once so the graph gets traced; calling print(module(input_data)) forces it to create the graph.
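The repo's model.py is not reproduced here, but based on the description above it might look roughly like this sketch (the class name Module and the variable input_data are assumptions on my part):

```python
import tensorflow as tf

class Module(tf.keras.Model):
    def __init__(self):
        super().__init__()
        # A single Dense layer with its kernel initialized to ones and no
        # bias, so a 1-unit layer fed a (1, 1) input returns the input.
        self.dense = tf.keras.layers.Dense(
            1, kernel_initializer='ones', use_bias=False)

    def call(self, x):
        return self.dense(x)

module = Module()
input_data = tf.constant([[10.0]])

# Calling the model once forces TF to trace the graph...
print(module(input_data))  # tf.Tensor([[10.]], shape=(1, 1), dtype=float32)

# ...so that saving in the TF SavedModel format works.
module.save('model', save_format='tf')
```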
Next, run the code:
$ conda activate tf
$ python model.py
You should get an output as below:
tf.Tensor([[10.]], shape=(1, 1), dtype=float32)
You should also see a folder created called model.
Step C: Verify the saved model
When we save a model, TensorFlow creates a folder with a bunch of files inside that store the weights and the graph of the model. TensorFlow has a tool, saved_model_cli, to dive into these files so we can identify the input and output tensors. It is a command-line tool that comes with the Python TensorFlow installation.
We need to extract the graph names of the input tensor and output tensor and use that info later when calling the C API. Here's how:
$ saved_model_cli show --dir <path_to_saved_model_folder>
Running this with the appropriate path substituted, you should get an output like below:
The given SavedModel contains the following tag-sets:
serve
Use this tag-set to drill further into the tensor graph. Here's how:
$ saved_model_cli show --dir <path_to_saved_model_folder> --tag_set serve
and you should get an output like below:
The given SavedModel MetaGraphDef contains SignatureDefs with the following keys:
SignatureDef key: "__saved_model_init_op"
SignatureDef key: "serving_default"
Next, pass the serving_default signature key into the command to print out the tensor nodes:
$ saved_model_cli show --dir <path_to_saved_model_folder> --tag_set serve --signature_def serving_default
and you should get an output like below:
The given SavedModel SignatureDef contains the following input(s):
  inputs['input_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1)
      name: serving_default_input_1:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['output_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1)
      name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict
We will need the name StatefulPartitionedCall (and the input name serving_default_input_1) later in the C API.
3. Building C/C++ code
The third part is to write the C code that uses the TensorFlow C API to load and run the Python saved model. The full code (main.c) is in the repo.
There is no proper documentation for the C API, so if something goes wrong, it is best to look at the C header in the source code (you can also debug using GDB and learn step by step how the header works).
Step A: Write C code
In an empty C file, include the TensorFlow C API header as follows. Note that we declare a NoOpDeallocator void function, which we will use later.
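The original snippet was embedded as a gist; a minimal sketch of the top of main.c might look like this:

```c
#include <stdio.h>
#include <stdlib.h>
#include "tensorflow/c/c_api.h"  /* header from the untarred include/ folder */

/* TF_NewTensor will call this "deallocator" on our input buffer; since we
 * manage that memory ourselves, the function deliberately does nothing. */
void NoOpDeallocator(void* data, size_t a, void* b) {}
```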
Next, we need to load the SavedModel and create the session using TF_LoadSessionFromSavedModel:
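A sketch of the loading code, assuming the model folder sits next to the executable (the variable names are my own):

```c
/* Continues from the includes above: load the SavedModel and a session. */
TF_Graph* Graph = TF_NewGraph();
TF_Status* Status = TF_NewStatus();
TF_SessionOptions* SessionOpts = TF_NewSessionOptions();
TF_Buffer* RunOpts = NULL;

const char* saved_model_dir = "model/";   /* folder written by model.py */
const char* tags = "serve";               /* the tag-set from saved_model_cli */
int ntags = 1;

TF_Session* Session = TF_LoadSessionFromSavedModel(
    SessionOpts, RunOpts, saved_model_dir, &tags, ntags, Graph, NULL, Status);

if (TF_GetCode(Status) == TF_OK)
    printf("TF_LoadSessionFromSavedModel OK\n");
else
    printf("%s\n", TF_Message(Status));
```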
Next, we grab the tensor nodes from the graph by their names. Remember earlier we searched for the tensor names using saved_model_cli? Here is where we use them, when we call TF_GraphOperationByName(). In this example, serving_default_input_1 is our input tensor and StatefulPartitionedCall is our output tensor.
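A sketch of the lookup, using the tensor names recovered earlier (the allocation style here is an assumption):

```c
/* Input handle: the ":0" suffix from saved_model_cli becomes index 0. */
int NumInputs = 1;
TF_Output* Input = malloc(sizeof(TF_Output) * NumInputs);
TF_Output t0 = {TF_GraphOperationByName(Graph, "serving_default_input_1"), 0};
if (t0.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName serving_default_input_1\n");
else
    printf("TF_GraphOperationByName serving_default_input_1 is OK\n");
Input[0] = t0;

/* Output handle. */
int NumOutputs = 1;
TF_Output* Output = malloc(sizeof(TF_Output) * NumOutputs);
TF_Output t2 = {TF_GraphOperationByName(Graph, "StatefulPartitionedCall"), 0};
if (t2.oper == NULL)
    printf("ERROR: Failed TF_GraphOperationByName StatefulPartitionedCall\n");
else
    printf("TF_GraphOperationByName StatefulPartitionedCall is OK\n");
Output[0] = t2;
```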
Next, we need to allocate a new tensor locally using TF_NewTensor, set the input value, and later pass it to the session run. NOTE that ndata is the total byte size of your data, not the length of the array.
Here we set the input tensor to the value 20, so we should see the output value come back as 20 as well.
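A sketch of the tensor allocation; note that ndata is sizeof(float) times the number of elements, in bytes:

```c
/* Buffers that TF_SessionRun will read from / write into. */
TF_Tensor** InputValues = malloc(sizeof(TF_Tensor*) * NumInputs);
TF_Tensor** OutputValues = malloc(sizeof(TF_Tensor*) * NumOutputs);

int ndims = 2;
int64_t dims[] = {1, 1};          /* matches the (-1, 1) signature shape */
float data[] = {20.0f};           /* our single input value */
int ndata = sizeof(float) * 1;    /* total BYTE size, not array length */

TF_Tensor* int_tensor = TF_NewTensor(TF_FLOAT, dims, ndims,
                                     data, ndata, &NoOpDeallocator, NULL);
if (int_tensor != NULL)
    printf("TF_NewTensor is OK\n");
else
    printf("ERROR: Failed TF_NewTensor\n");
InputValues[0] = int_tensor;
```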
Next, we can run the model by invoking the TF_SessionRun API. Here's how:
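A sketch of the call (the NULL arguments are run options, target operations, and run metadata, none of which we need here):

```c
/* Run the graph: one input, one output, no extra targets. */
TF_SessionRun(Session, NULL,
              Input, InputValues, NumInputs,
              Output, OutputValues, NumOutputs,
              NULL, 0, NULL, Status);

if (TF_GetCode(Status) == TF_OK)
    printf("Session is OK\n");
else
    printf("%s\n", TF_Message(Status));
```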
Lastly, we get the output value back from the output tensor using TF_TensorData, which extracts the data from the tensor object. Since we know the size of the output is 1, we can print it directly. Otherwise, use TF_GraphGetTensorNumDims or the other APIs available in c_api.h.
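A sketch of reading the result, assuming (as we verified with saved_model_cli) a single float output:

```c
/* TF_TensorData returns a pointer to the tensor's underlying buffer. */
void* buff = TF_TensorData(OutputValues[0]);
float* offsets = (float*)buff;
printf("Result Tensor :\n");
printf("%f\n", offsets[0]);   /* with input 20 and a kernel of ones: 20.0 */
```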
Step B: Compile the code
Compile it as below:
$ gcc -I<path_of_tensorflow_api>/include/ -L<path_of_tensorflow_api>/lib main.c -ltensorflow -o main.out
Step C: Run it
Before you run it, you will need to make sure the C library is exported in your environment:
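Assuming you untarred the tarball to <path_of_tensorflow_api>, exporting the library search path could look like this (the placeholder path is yours to fill in):

```shell
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<path_of_tensorflow_api>/lib
./main.out
```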
You should get an output like below. Notice that the output value is 20, just like our input. You can change the model, initialize the kernel with a weight of 2, and see if the output doubles.
TF_GraphOperationByName serving_default_input_1 is OK
TF_GraphOperationByName StatefulPartitionedCall is OK
TF_NewTensor is OK
Session is OK
Result Tensor :
20.000000
Originally published at https://github.com.