Freezing and Calculating FLOPS in TensorFlow

Tony Shin
Published in The Startup
Nov 20, 2019 · 2 min read

Knowing how many FLOPS your model takes is important when designing a model. In this post, we’ll learn how to freeze and calculate the number of FLOPS in the model.

The example network we will use throughout the post is a Lightweight Convolutional Pose Machine, which can be found here. It takes an image as input and outputs 14-channel joint heatmaps. If you want to learn more about the model, read the original paper here.

Freezing the Model

It’s recommended to freeze the model before measuring the FLOPS. This is to avoid calculating unnecessary operations in the graph.

So how should we freeze the model?
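A minimal sketch of the freezing step, assuming a TF 1.x session-based setup (the single conv layer below is only a stand-in for the real CPM network, and restoring the trained checkpoint is left as a comment):

```python
import tensorflow as tf
from tensorflow.python.framework import graph_util

with tf.Graph().as_default():
    # Fixed input shape -- avoid a None batch dimension.
    inputs = tf.placeholder(tf.float32, shape=[1, 352, 352, 3], name='input')
    # Stand-in for the real CPM network: a single conv producing 14 heatmaps.
    final_output = tf.layers.conv2d(inputs, 14, 3, padding='same')
    output = tf.identity(final_output, name='output')  # stable output name

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # In the real script, restore the trained weights here, e.g.
        # saver.restore(sess, checkpoint_path)

        graph_def = sess.graph.as_graph_def()
        # Drop training-only nodes, but keep the 'output' identity node.
        graph_def = graph_util.remove_training_nodes(
            graph_def, protected_nodes=['output'])
        # Bake variables into constants so the graph is self-contained.
        frozen = graph_util.convert_variables_to_constants(
            sess, graph_def, ['output'])
        tf.train.write_graph(frozen, '.', 'cpm_352.pb', as_text=False)
```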

The code above removes training-only and other unnecessary nodes from the graph before freezing the model. Also, make sure to specify a fixed input shape: an input placeholder with an arbitrary dimension, like [None, 352, 352, 3], will cause problems later because the profiler cannot count FLOPS for operations whose shapes aren't fully defined.

NOTE: Because I have output = tf.identity(final_output, name='output') in the final layer of the network, I added the 'output' node name to protected_nodes to keep it in the final graph.

Calculating FLOPS

At this point, we have a frozen model named cpm_352.pb. Now we’re ready to calculate the FLOPS in our model.
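A minimal sketch of the calculation, assuming the frozen cpm_352.pb sits in the working directory:

```python
import tensorflow as tf

# Load the frozen graph back into a tf.Graph.
with tf.gfile.GFile('cpm_352.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name='')

    # Count floating-point operations over the whole graph.
    opts = tf.profiler.ProfileOptionBuilder.float_operation()
    flops = tf.profiler.profile(graph, options=opts)
    print('Total FLOPS: {:,}'.format(flops.total_float_ops))
```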

Most of the code is just reading the frozen graph back into a tf.Graph; to calculate the total FLOPS, all we really need is the call to tf.profiler.profile with the float_operation options. The profiler then goes through all the operations in the graph and adds up the total number of FLOPS.

Sometimes, we want to know how many FLOPS are in specific layers. For example, in our CPM model, we could be interested in the number of FLOPS in each stage.

We can do this by customizing the options with tf.profiler.ProfileOptionBuilder. If we know the name scope of the layers we're interested in, we can add with_node_names to the options and use a regular expression to filter the layers, as sketched below.
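For example, assuming the stage blocks live under name scopes like stage_1, stage_2, and so on (the regex and scope name below are assumptions about the CPM graph's naming), a sketch looks like this:

```python
import tensorflow as tf

# Load the frozen graph as before.
with tf.gfile.GFile('cpm_352.pb', 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name='')

    # Start from the float-operation options, then only account for nodes
    # whose names match the scope of interest ('stage_1' is an assumed name).
    opts = (tf.profiler.ProfileOptionBuilder(
                tf.profiler.ProfileOptionBuilder.float_operation())
            .with_node_names(show_name_regexes=['.*stage_1.*'])
            .with_empty_output()  # skip the per-node dump on stdout
            .build())
    stage_flops = tf.profiler.profile(graph, cmd='scope', options=opts)
    print('stage_1 FLOPS: {:,}'.format(stage_flops.total_float_ops))
```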
