How to export a TensorFlow 2.x Keras model to a frozen and optimized graph

Sebastián García Acosta
5 min read · Aug 9, 2020


Recently, I struggled trying to export a model built with Keras and TensorFlow 2.x in the proper format to make inference with OpenCV’s DNN module. Here’s how I got those desired .pb and .pbtxt optimized for inference graph files!

A Frozen graph, literally (https://co.pinterest.com/pin/503136589616840147)

Introduction

Once we finish training our model and want it to run as fast as possible, whether from other programming languages through the cross-platform OpenCV library or served on the web or mobile, we must export its graph in the most efficient format possible. This translates into two stages: freezing and optimizing.

Freezing a TensorFlow model consists of converting its variables into constants that are stored directly in its graph. Optimizing, on the other hand, consists of removing nodes that are only needed during training (such as Dropout layers) as well as inefficient operations.

This was easy in TensorFlow 1.x, which shipped helper functions that did this work for us. Although TensorFlow 2.x still supports graph freezing through a newer API, I didn't find any utility in it to optimize the frozen graph for inference. So, in essence, we will:

  1. Freeze the Keras model using TF 2.x:

SavedModel ⇒ GraphDef

  2. Optimize the frozen graph using TF 1.x:

GraphDef ⇒ GraphDef

  3. Convert the optimized frozen graph back to SavedModel format using TF 1.x (although it can also be done with TF 2.x):

GraphDef ⇒ SavedModel

0. Assumptions

For this tutorial, I'll assume that you have:

  • TensorFlow 2.x installed.
  • Anaconda installed.
  • Saved your model in .h5 or SavedModel format, and have it loaded or otherwise available as an object (a loading sketch is shown below).
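
If your model is not already in memory, load it back from disk before freezing. Here is a minimal sketch, assuming a hypothetical file my_model.h5 (a SavedModel directory path works the same way with load_model):

from tensorflow import keras

# Hypothetical path; point it to your own .h5 file or SavedModel directory
model = keras.models.load_model("my_model.h5")
model.summary()  # quick sanity check that the architecture loaded correctly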

1. Get the frozen graph out of the TF.Keras model with TensorFlow 2.x

First, we do the imports.

import tensorflow as tf
from tensorflow import keras
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
import numpy as np

Then, specify the location where you want to save your frozen graph files.

# Path of the directory where you want to save your model
frozen_out_path = ''
# Name of the .pb file
frozen_graph_filename = "frozen_graph"
model = ...  # your loaded Keras model

Convert the Keras model to ConcreteFunction format, which is more general:

# Convert the Keras model to a ConcreteFunction
full_model = tf.function(lambda x: model(x))
full_model = full_model.get_concrete_function(
    tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype))

Once we have our model as a ConcreteFunction, we convert its variables to constants.

# Get frozen graph def
frozen_func = convert_variables_to_constants_v2(full_model)
frozen_func.graph.as_graph_def()

We are almost done. If you want to inspect the operations inside your frozen graph definition and see the names of its input and output tensors (important for the next stage), use this code:

layers = [op.name for op in frozen_func.graph.get_operations()]
print("-" * 60)
print("Frozen model layers: ")
for layer in layers:
    print(layer)
print("-" * 60)
print("Frozen model inputs: ")
print(frozen_func.inputs)
print("Frozen model outputs: ")
print(frozen_func.outputs)

Then, serialize the frozen graph and its text representation to disk.

tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                  logdir=frozen_out_path,
                  name=f"{frozen_graph_filename}.pb",
                  as_text=False)
tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                  logdir=frozen_out_path,
                  name=f"{frozen_graph_filename}.pbtxt",
                  as_text=True)

Full code to freeze your Keras model and save it:

import tensorflow as tf
from tensorflow import keras
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2
import numpy as np

# Path of the directory where you want to save your model
frozen_out_path = ''
# Name of the .pb file
frozen_graph_filename = "frozen_graph"
model = ...  # your loaded Keras model

# Convert the Keras model to a ConcreteFunction
full_model = tf.function(lambda x: model(x))
full_model = full_model.get_concrete_function(
    tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype))

# Get the frozen ConcreteFunction
frozen_func = convert_variables_to_constants_v2(full_model)
frozen_func.graph.as_graph_def()

# Print the operations and the input/output tensors of the frozen graph
layers = [op.name for op in frozen_func.graph.get_operations()]
print("-" * 60)
print("Frozen model layers: ")
for layer in layers:
    print(layer)
print("-" * 60)
print("Frozen model inputs: ")
print(frozen_func.inputs)
print("Frozen model outputs: ")
print(frozen_func.outputs)

# Save the frozen graph to disk
tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                  logdir=frozen_out_path,
                  name=f"{frozen_graph_filename}.pb",
                  as_text=False)
# Save its text representation
tf.io.write_graph(graph_or_graph_def=frozen_func.graph,
                  logdir=frozen_out_path,
                  name=f"{frozen_graph_filename}.pbtxt",
                  as_text=True)

2. Optimizing the frozen graph for faster inference

In this stage, we'll use a graph-optimization helper that is only available in TensorFlow 1.x, so we need to create a virtual environment containing that version of TensorFlow (I recommend 1.5) using Anaconda, which is quite straightforward.

Note: Don’t use pip and conda to install packages at the same time in one environment. Check Using Pip in a Conda Environment for more info.

conda create -n tf15 python=3.7
conda activate tf15
conda install tensorflow=1.5.0

Then, in the same virtual environment, we run the optimize_for_inference tool, which takes the following arguments:

  • input: str, path to the .pb file graph to optimize (the one that we generated earlier)
  • output: str, path of the resulting ‘.pb’ file that contains the optimized graph
  • frozen_graph: bool, whether the input graph is frozen (in our case it is)
  • input_names: str, name of the input tensor
  • output_names: str, name of the output tensor

So, an example of its usage is:

python -m tensorflow.python.tools.optimize_for_inference --input ./model_20K_96_soft_f1/frozen_model/frozen_graph.pb --output ./model_20K_96_soft_f1/optimized/optmized_graph.pb --frozen_graph=True --input_names=x --output_names=Identity
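
If you prefer to stay inside Python instead of calling the tool from the command line, the same TF 1.x environment exposes the underlying library function. Here is a minimal sketch under that assumption, reusing the paths and the input/output tensor names (x and Identity) from the command above; adapt them to your own model:

# Rough Python equivalent of the CLI call above (also needs the TF 1.x environment)
import tensorflow as tf
from tensorflow.python.tools import optimize_for_inference_lib

with tf.gfile.FastGFile("./model_20K_96_soft_f1/frozen_model/frozen_graph.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

optimized_graph_def = optimize_for_inference_lib.optimize_for_inference(
    graph_def,
    ["x"],         # input tensor names
    ["Identity"],  # output tensor names
    tf.float32.as_datatype_enum)  # dtype of the input placeholder

tf.train.write_graph(optimized_graph_def,
                     "./model_20K_96_soft_f1/optimized",
                     "optmized_graph.pb", as_text=False)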

Once you do that, you should get your optimized frozen graph written to disk as a .pb file. Next, we will write that graph as text to get the corresponding .pbtxt file, using the same virtual environment and a short Python script. You could do this with code similar to the one presented in the first step using TensorFlow 2.x, but since we are already on TF 1.5, we'll stick with that version.

# Needs TensorFlow 1.5
import tensorflow as tf

optimized_graph_path = "./model_20K_96_soft_f1/optimized/optmized_graph.pb"
output_pbtxt = "./model_20K_96_soft_f1/optimized/optmized_graph.pbtxt"

# Read the graph.
with tf.gfile.FastGFile(optimized_graph_path, "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

# Remove Const nodes and training-only attributes.
for i in reversed(range(len(graph_def.node))):
    if graph_def.node[i].op == 'Const':
        del graph_def.node[i]
        continue  # nothing else to clean on a deleted node
    for attr in ['T', 'data_format', 'Tshape', 'N', 'Tidx', 'Tdim',
                 'use_cudnn_on_gpu', 'Index', 'Tperm', 'is_training',
                 'Tpaddings']:
        if attr in graph_def.node[i].attr:
            del graph_def.node[i].attr[attr]

# Save as text.
tf.train.write_graph(graph_def, "", output_pbtxt, as_text=True)

Example: loading an optimized-for-inference model with OpenCV’s DNN module in Java (yes, Java)

Now you can enjoy the benefits of having frozen and optimized your model's graph. This will significantly increase its prediction speed and will let you use the serialized model from other libraries and languages (such as OpenCV in Java or C++), giving you the power to serve your model and combine it with other technologies.

import org.opencv.core.Core;
import org.opencv.dnn.Dnn;
import org.opencv.dnn.Net;

public class Main {
    // Path to the optimized frozen graph, i.e., the '.pb' file
    public static final String MODEL_PATH = "";
    // Path to the text graph, i.e., the '.pbtxt' file
    public static final String WEIGHTS_PATH = "";

    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME); // load the OpenCV native library
        Net model = Dnn.readNet(MODEL_PATH, WEIGHTS_PATH);
    }
}
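
For reference, the same pair of files also loads from Python with OpenCV's DNN module. Here is a minimal sketch of a full forward pass, where the input size (96×96), the preprocessing, and the test image path are hypothetical and should match whatever your network expects:

import cv2

# Paths to the optimized graph and its text representation generated earlier
net = cv2.dnn.readNet("optmized_graph.pb", "optmized_graph.pbtxt")

image = cv2.imread("example.jpg")  # hypothetical test image
blob = cv2.dnn.blobFromImage(image, scalefactor=1.0 / 255, size=(96, 96),
                             mean=(0, 0, 0), swapRB=True, crop=False)
net.setInput(blob)
predictions = net.forward()
print(predictions)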

Conclusions

Now you are able to serve and deploy your tf.keras model built with the newer version of TensorFlow, increasing its inference performance and coming up with better AI solutions.
