Knowing how many FLOPS your model requires is important when designing it. In this post, we’ll learn how to freeze a model and calculate the number of FLOPS it contains.
The example network we will use throughout the post is a Lightweight Convolutional Pose Machine, which can be found here. It takes an image as input and outputs 14-channel joint heatmaps. If you want to learn more about the model, read the original paper here.
Freezing the Model
It’s recommended to freeze the model before measuring the FLOPS. This is to avoid calculating unnecessary operations in the graph.
So how should we freeze the model?
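The original freezing code is not reproduced here, so below is a minimal sketch of the idea using `graph_util.remove_training_nodes` and `convert_variables_to_constants`. A single 3×3 convolution stands in for the CPM body, and all names (`image`, `weights`, `frozen_model.pb`) are assumptions for illustration; substitute your own model and checkpoint. The code uses the TF 1.x-style API (via `tensorflow.compat.v1` so it also runs on TF 2).

```python
import tensorflow.compat.v1 as tf  # TF 1.x-style API; also works on TF 2
from tensorflow.python.framework import graph_util

tf.disable_eager_execution()

with tf.Graph().as_default() as graph:
    # Fixed input shape -- a [None, 352, 352, 3] placeholder would leave
    # the FLOPS undefined when we profile the graph later.
    inputs = tf.placeholder(tf.float32, [1, 352, 352, 3], name='image')
    # Stand-in for the CPM body: one 3x3 conv producing 14 heatmap channels.
    weights = tf.Variable(tf.random_normal([3, 3, 3, 14]), name='weights')
    heatmaps = tf.nn.conv2d(inputs, weights, [1, 1, 1, 1], 'SAME')
    output = tf.identity(heatmaps, name='output')

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Strip training-only nodes, protecting 'output' so the identity
        # node survives into the frozen graph.
        graph_def = graph_util.remove_training_nodes(
            graph.as_graph_def(), protected_nodes=['output'])
        # Bake variable values into constants, keeping only the subgraph
        # needed to produce 'output'.
        frozen = graph_util.convert_variables_to_constants(
            sess, graph_def, ['output'])
        tf.train.write_graph(frozen, '.', 'frozen_model.pb', as_text=False)
```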
The code above removes all the training/unnecessary nodes in the graph before freezing the model. Also, make sure to specify a fixed input shape: a placeholder with an undefined dimension, such as [None, 352, 352, 3], will cause problems when we profile the graph later.
NOTE: Because I added output = tf.identity(final_output, name='output') in the final layer of the network, I added the 'output' node name to protected_nodes to keep it in the final graph.
Now we’re ready to calculate the FLOPS in our model.
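The profiling code itself is not reproduced here, so the sketch below shows the idea: load the frozen GraphDef from disk, import it into a fresh graph, and hand it to tf.profiler with the float_operation preset. The setup block that builds and freezes a tiny stand-in graph exists only to make the sketch self-contained; in practice you would point pb_path at the frozen_model.pb produced in the previous step.

```python
import os
import tempfile
import tensorflow.compat.v1 as tf  # TF 1.x-style API; also works on TF 2
from tensorflow.python.framework import graph_util

tf.disable_eager_execution()

# Setup: build and freeze a tiny stand-in graph so this sketch runs on
# its own. In practice, you already have frozen_model.pb on disk.
with tf.Graph().as_default() as g, tf.Session(graph=g) as sess:
    x = tf.placeholder(tf.float32, [1, 352, 352, 3], name='image')
    w = tf.Variable(tf.random_normal([3, 3, 3, 14]))
    y = tf.identity(tf.nn.conv2d(x, w, [1, 1, 1, 1], 'SAME'), name='output')
    sess.run(tf.global_variables_initializer())
    frozen = graph_util.convert_variables_to_constants(
        sess, g.as_graph_def(), ['output'])
pb_path = os.path.join(tempfile.mkdtemp(), 'frozen_model.pb')
with tf.gfile.GFile(pb_path, 'wb') as f:
    f.write(frozen.SerializeToString())

def load_graph(path):
    # Parse the frozen GraphDef and import it into a fresh graph.
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(path, 'rb') as f:
        graph_def.ParseFromString(f.read())
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')
    return graph

# The profiler walks every op in the graph and sums its float operations.
graph = load_graph(pb_path)
opts = tf.profiler.ProfileOptionBuilder.float_operation()
flops = tf.profiler.profile(graph, options=opts)
print('Total FLOPS: %d' % flops.total_float_ops)
```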
The code looks a bit long, but the total FLOPS only takes a few lines to compute. After reading the frozen graph file, the profiler goes through every operation in the graph and sums up the FLOPS.
Sometimes, we want to know how many FLOPS are in specific layers. For example, in our CPM model, we could be interested in the number of FLOPS in each stage.
We can do this by customizing the options using tf.profiler.ProfileOptionBuilder. If we know the name scope of the layers that we’re interested in, we can add with_node_names to the options and use regular expressions to filter the layers.
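As a sketch of the idea, the snippet below builds a two-stage toy graph (the scope names stage_1 and stage_2 are assumptions standing in for the CPM stages), then starts from the float_operation preset and narrows the report with with_node_names:

```python
import tensorflow.compat.v1 as tf  # TF 1.x-style API; also works on TF 2

tf.disable_eager_execution()

# Toy two-stage graph; 'stage_1'/'stage_2' stand in for the CPM stages.
with tf.Graph().as_default() as graph:
    x = tf.placeholder(tf.float32, [1, 352, 352, 3])
    with tf.variable_scope('stage_1'):
        w1 = tf.Variable(tf.random_normal([3, 3, 3, 14]))
        net = tf.nn.conv2d(x, w1, [1, 1, 1, 1], 'SAME')
    with tf.variable_scope('stage_2'):
        w2 = tf.Variable(tf.random_normal([3, 3, 14, 14]))
        net = tf.nn.conv2d(net, w2, [1, 1, 1, 1], 'SAME')

# Start from the float_operation preset, then restrict the report to
# nodes whose names match the regex (everything under stage_1).
opts = (tf.profiler.ProfileOptionBuilder(
            tf.profiler.ProfileOptionBuilder.float_operation())
        .with_node_names(show_name_regexes=['.*stage_1.*'])
        .account_displayed_op_only(True)  # count only the matched nodes
        .build())
stage1 = tf.profiler.profile(graph, options=opts)
print('stage_1 FLOPS: %d' % stage1.total_float_ops)
```

The same pattern works for any name scope: swap the regex in show_name_regexes for the scope you care about.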