Undocumented Tensorflow C API

Vlad Dovgalecs
Mar 30, 2019


In this post I document my journey using the Tensorflow C API for prediction with a trained model. As of this writing, there are few examples that could help a programmer build such an application rapidly, which is the main motivation for this post.

This article is a bit technical and assumes some familiarity with the C programming language. If you are comfortable with pointers, you should have no trouble following.

Tensorflow version 1.13.1 was used in this post.

Setup

Imagine that we used the beautiful Python tools to design, train and test a Tensorflow model. The model performance is great. Now it is time to use it in an application. What if the target device cannot run Python, or running it takes too many resources? What if all we have is a very limited environment that can run only compiled executables? Time to write a C program!

A Short Note on Tensorflow model formats

Tensorflow can store the model in more than one format. If our C program needs to load and use the model, it must be saved in SavedModel format. No, a single “frozen” model file is not enough*. See official documentation.

*A “frozen” model can be used in a C program at the cost of knowing exactly what inputs to use, re-create the session with the exact options etc.

Get C Libraries

The C libraries are nothing but a collection of a few header files and two shared libraries (.so in Linux). The pre-compiled libraries for various operating systems can be downloaded from here.

I recommend building libraries yourself. The output will be a gzipped tarball with headers and libraries. See the link above with the build instructions.
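If you prefer the pre-built route, the commands below show a typical install and build on Linux. The download URL follows the pattern documented on the official install page; adjust the OS, architecture and version for your setup, and note that the include/lib paths are assumptions about where you unpacked the tarball:

```shell
# Download and unpack the C library (1.13.1 Linux CPU build shown here)
curl -LO https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-cpu-linux-x86_64-1.13.1.tar.gz
sudo tar -C /usr/local -xzf libtensorflow-cpu-linux-x86_64-1.13.1.tar.gz

# Compile and link your program against the C API
gcc main.c -I/usr/local/include -L/usr/local/lib -ltensorflow -o main
```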

The Roadmap

Before delving into technical details, I would like to outline the main steps of our program.

  1. Create the computation graph
  2. Instantiate/reload a session and associate it with the graph
    The trained model is also loaded here.
  3. Define graph inputs/outputs
    1. Types (e.g. TF_FLOAT as floating number)
    2. Shapes
    3. Names of nodes
  4. Create input tensor(s) and populate with data
  5. Run the session
  6. Get data from output tensor(s)

To learn more, see low-level Tensorflow documentation.
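The steps above can be sketched as one small C program. Every node name and path below is a placeholder, the input tensor is left as a stub, and each call is explained in the sections that follow:

```c
#include <stdio.h>
#include <tensorflow/c/c_api.h>

int main(void) {
    // 1. Create the graph and a status object for error reporting
    TF_Graph* Graph = TF_NewGraph();
    TF_Status* Status = TF_NewStatus();

    // 2. Restore a session from a SavedModel (path and tag are placeholders)
    TF_SessionOptions* Opts = TF_NewSessionOptions();
    const char* tags = "serve";
    TF_Session* Session = TF_LoadSessionFromSavedModel(
        Opts, NULL, "<path_to_saved_model>", &tags, 1, Graph, NULL, Status);

    // 3. Look up input/output nodes by name (use saved_model_cli to find them)
    TF_Output In  = {TF_GraphOperationByName(Graph, "<input_node>"), 0};
    TF_Output Out = {TF_GraphOperationByName(Graph, "<output_node>"), 0};

    // 4.-6. Create input tensor(s), run the session, read the output tensor(s)
    TF_Tensor* InVals[1]  = {NULL};  // stub: build a real tensor here
    TF_Tensor* OutVals[1] = {NULL};
    TF_SessionRun(Session, NULL, &In, InVals, 1, &Out, OutVals, 1,
                  NULL, 0, NULL, Status);
    if (TF_GetCode(Status) != TF_OK)
        fprintf(stderr, "run failed: %s\n", TF_Message(Status));

    TF_DeleteSession(Session, Status);
    TF_DeleteSessionOptions(Opts);
    TF_DeleteGraph(Graph);
    TF_DeleteStatus(Status);
    return 0;
}
```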

Create a Graph

We will start with creating a computation graph, which can be instantiated as simply as:

TF_Graph* Graph = TF_NewGraph();

Tensorflow API also provides a handy data structure for catching errors:

TF_Status* Status = TF_NewStatus();
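Most C API calls take a TF_Status out-parameter, and checking it after every call avoids silent failures. A small helper (my own convenience function, not part of the API) could look like this:

```c
#include <stdio.h>
#include <stdlib.h>
#include <tensorflow/c/c_api.h>

// Abort with a readable message if the last API call failed.
static void check_status(TF_Status* s, const char* what) {
    if (TF_GetCode(s) != TF_OK) {
        fprintf(stderr, "%s failed: %s\n", what, TF_Message(s));
        exit(EXIT_FAILURE);
    }
}
```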

Prepare a Session

A session is the frontend of Tensorflow that can be used to perform computation. In our case, it will return predictions given some input.

First, create the necessary structures for options:

TF_SessionOptions* SessionOpts = TF_NewSessionOptions();
TF_Buffer* RunOpts = NULL;

Second, provide information about the model to be loaded:

const char* saved_model_dir = "<path_to_dir_with_saved_model>";
const char* tags = "serve"; // default model serving tag; can change in future
int ntags = 1;

This is where we provide full path to an exported Tensorflow model in SavedModel format.

Finally, instantiate the session:

TF_Session* Session = TF_LoadSessionFromSavedModel(SessionOpts, RunOpts, saved_model_dir, &tags, ntags, Graph, NULL, Status);

Technically, the session has been restored from the SavedModel.

Note that the session object is created but is not ready yet to accept any input.

Define Inputs

A useful computation graph accepts at least one tensor. The user is responsible for providing complete information about the inputs: node names, data types and tensor shapes.

Suppose our graph expects two tensors as input. Let’s first define an array to hold them:

int NumInputs = 2;
TF_Output* Input = malloc(sizeof(TF_Output) * NumInputs);

Tell what nodes in the graphs will be accepting the input:

TF_Output t0 = {TF_GraphOperationByName(Graph, "<node0>"), <idx0>};
TF_Output t1 = {TF_GraphOperationByName(Graph, "<node1>"), <idx1>};

Of course, replace the node name placeholders and indices with actual values. Finally, register the inputs:

Input[0] = t0;
Input[1] = t1;

Note that no data has been provided yet.

Define Outputs

In a similar fashion, we tell our program which nodes in the graph will produce outputs. Our graph will have one output node:

int NumOutputs = 1;
TF_Output* Output = malloc(sizeof(TF_Output) * NumOutputs);

Provide information about the output node:

TF_Output t2 = {TF_GraphOperationByName(Graph, "<node2>"), <idx2>};

And finally:

Output[0] = t2;

We haven’t yet provided a pointer that will receive the computation graph result.

How to get graph node names and indices?

One might be left wondering where to get the information about node names, types and shapes when all that is available is the trained model and perhaps the source code. The source code alone might not reveal this information clearly.

Tensorflow provides a tool, “saved_model_cli”, that can show information about graph inputs/outputs and even do ad hoc testing of the model. See the official documentation.
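For example, to list the signatures of an exported model, together with the input/output node names, dtypes and shapes (the directory path is a placeholder):

```shell
saved_model_cli show --dir <path_to_dir_with_saved_model> --all
```

The node names printed under each signature are exactly what TF_GraphOperationByName expects.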

Provide data for inputs & outputs

Create the pointers to the arrays:

TF_Tensor** InputValues = malloc(sizeof(TF_Tensor*)*NumInputs);
TF_Tensor** OutputValues = malloc(sizeof(TF_Tensor*)*NumOutputs);
/* create tensors with data here */
TF_Tensor* tensor0 = ...;
TF_Tensor* tensor1 = ...;

Assign input tensors with the actual data:

InputValues[0] = tensor0;
InputValues[1] = tensor1;

See the second part of this post on how to create tensors with data.

Run the Session

We are finally in position to run the computation graph on the provided inputs.

TF_SessionRun(Session, NULL, Input, InputValues, NumInputs, Output, OutputValues, NumOutputs, NULL, 0, NULL, Status);

Upon success, the Status will carry “TF_OK” and the OutputValues will hold non-NULL pointers to the output tensor(s).
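For a numeric output, the raw buffer can then be read via TF_TensorData. A sketch, assuming the first output is a TF_FLOAT tensor (the function name is mine):

```c
#include <stdio.h>
#include <tensorflow/c/c_api.h>

// Assumes OutputValues[0] holds a TF_FLOAT tensor produced by TF_SessionRun.
void print_predictions(TF_Tensor** OutputValues) {
    TF_Tensor* t = OutputValues[0];
    float* preds = (float*)TF_TensorData(t);
    size_t n = TF_TensorByteSize(t) / sizeof(float);
    for (size_t i = 0; i < n; i++)
        printf("pred[%zu] = %f\n", i, preds[i]);
}
```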

Freeing Allocated Memory

Make sure that memory used by Tensorflow data structures is properly released:

for (int i = 0; i < NumInputs; i++) TF_DeleteTensor(InputValues[i]);
for (int i = 0; i < NumOutputs; i++) TF_DeleteTensor(OutputValues[i]);
free(Input); free(Output);
free(InputValues); free(OutputValues);
TF_DeleteGraph(Graph);
TF_DeleteSession(Session, Status);
TF_DeleteSessionOptions(SessionOpts);
TF_DeleteStatus(Status);

The second part of this post is a collection of useful code bits that I decided to remove from the main post but found still worth sharing. For instance, there are almost no examples of how to create certain tensors or how to get data back out of an output tensor.

Create a 1-dim tensor holding an integer

Let’s create a tensor with one dimension (not a scalar!) that will hold the single integer 4.

First, let’s provide information about tensor dimensionality and the data it should contain:

int ndims = 1;
int64_t dims[] = {1};
int ndata = sizeof(int32_t);
int32_t data[] = {4};

Second, create the tensor. Note that TF_NewTensor expects a deallocator for the data buffer; for stack-allocated data a no-op deallocator is enough:

void NoOpDeallocator(void* data, size_t len, void* arg) {}

TF_Tensor* int_tensor = TF_NewTensor(TF_INT32, dims, ndims, data, ndata, &NoOpDeallocator, NULL);

If the tensor must be a scalar (0 dimensions), then:

int64_t* dims = NULL;
int ndims = 0;

Similarly, one can create an array, matrix or higher-dimensionality tensor of floats, doubles etc.
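For instance, a 2x3 matrix of floats could be created along the same lines (again with a no-op deallocator, assuming the data outlives the tensor; the helper name is mine):

```c
#include <tensorflow/c/c_api.h>

static void NoOpDeallocator(void* data, size_t len, void* arg) {}

// Build a 2x3 TF_FLOAT tensor from a row-major C array.
TF_Tensor* make_matrix(void) {
    static float data[2][3] = {{1.f, 2.f, 3.f}, {4.f, 5.f, 6.f}};
    int64_t dims[] = {2, 3};
    return TF_NewTensor(TF_FLOAT, dims, 2, data, sizeof(data),
                        &NoOpDeallocator, NULL);
}
```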

Create an array of strings

Strings in Tensorflow are a special type and are handled differently from other data types. Even their internal storage layout is different, so the data must be prepared accordingly.

A tensor of strings is created via a seemingly simple call:

TF_Tensor* str_tensor = TF_NewTensor(TF_STRING, dims, ndims, base, bsize, free_array, base);

It is important to realize that the base array is essentially a concatenation of two arrays of different types. The first is the array of offsets, of type uint64_t. The second is the array of strings, where each string is prefixed with its length. Finally, each string must be encoded using the special TF_StringEncode() function.

Suppose we have an array of strings:

const char** sarr; // array of nstr strings
int nstr;

First, one must compute the total size of the base array. This encoded array includes the offsets, the strings themselves, as well as their encoded lengths.

size_t tsize = 0;
for (int i = 0; i < nstr; i++) {
    tsize += TF_StringEncodedSize(strlen(sarr[i])) + TF_DataTypeSize(TF_UINT64);
}

Second, allocate memory and prepare for string encoding:

char* base = malloc(sizeof(char) * tsize);
char* start = base + TF_DataTypeSize(TF_UINT64) * nstr;
char* dest = start;
size_t dest_len = tsize - (size_t)(start - base);
uint64_t* offsets = (uint64_t*)(base);

Third, encode every string and store it in the dest array:

for (int i = 0; i < nstr; i++) {
    *offsets = (uint64_t)(dest - start);
    offsets++;
    size_t a = TF_StringEncode(sarr[i], strlen(sarr[i]), dest, dest_len, Status);
    dest += a;
    dest_len -= a;
}

Fourth, provide information about the shape of the tensor (notice the leading 1):

int64_t dimvec[] = {1, nstr};
size_t ndims = 2;

Finally, create the tensor:

TF_Tensor* tarr = TF_NewTensor(TF_STRING, dimvec, ndims, base, tsize, free_array, base);

Note: The Python equivalent of the created tensor (with some sample data):

s = [["some", "interesting", "data", "here"]]

How do I unpack a tensor of strings?

Imagine a session runs and one of the outputs returns a tensor of strings. Here is how to unpack this tensor.

First, prepare the array of strings:

char** out; // will hold the decoded strings
size_t nout;

Get the Tensor of strings from the array of output tensors:

TF_Tensor* tout = OutputValues[0]; // assuming we want the first one

Get shape information of the tensor:

nout = (size_t)TF_Dim(tout, 1); // assuming the number of strings is the 2nd dim

Prepare arrays for encoded data and offsets:

void* buff = TF_TensorData(tout);
int64_t* offsets = buff;

Prepare utility pointers:

char* data = (char*)buff + nout * sizeof(int64_t);
char* buff_end = (char*)buff + TF_TensorByteSize(tout);

Allocate the array of string pointers:

out = calloc(nout, sizeof(char*));

Decode every string and copy it into the array of strings:

for (size_t i = 0; i < nout; i++) {
    char* start = data + offsets[i]; // offsets are relative to the data region
    const char* dest;
    size_t n;
    TF_StringDecode(start, buff_end - start, &dest, &n, Status);
    if (TF_GetCode(Status) == TF_OK) {
        out[i] = malloc(n + 1);
        memcpy(out[i], dest, n);
        out[i][n] = '\0';
    }
}
