Calculating how much computing power is needed to deploy a model is a meaningful and common requirement in real industrial production environments, especially for edge computing applications. To minimize cost and maximize utilization of computing devices or chips, we usually buy a device whose capability just satisfies the model's basic performance requirement, with some redundancy. Thanks to its powerful community and abundant function modules, TensorFlow provides a fairly easy way to measure a model's FLOPs with
Normally, we measure only the frozen model, which is used for inference. What
tf.profiler does is calculate the floating-point operations of every op in a given graph.
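As a minimal sketch of this, the snippet below builds a tiny graph (the op names and shapes are illustrative, not from the original) and asks the profiler to count its floating-point operations via the TF1-compatible API:

```python
import tensorflow as tf

# Build a tiny graph with fully defined shapes; tf.ones avoids the
# incomplete-shape problem that a tf.placeholder input would cause.
graph = tf.Graph()
with graph.as_default():
    x = tf.ones(shape=(1, 8), dtype=tf.float32, name='input')
    w = tf.ones(shape=(8, 4), dtype=tf.float32, name='weights')
    y = tf.matmul(x, w, name='output')

# Ask the profiler to count float operations over the whole graph.
opts = tf.compat.v1.profiler.ProfileOptionBuilder.float_operation()
flops = tf.compat.v1.profiler.profile(graph, options=opts)
print(flops.total_float_ops)
```

The same call works on a graph loaded from a frozen GraphDef, which is the usual case for inference-only models.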
If the input node is a
tf.placeholder, it will cause an incomplete shape error, so we need to replace the
tf.placeholder node with another node type whose shape is fully defined, such as
tf.constant. For example, in the TensorFlow Object Detection API, we can replace the input line during model freezing with the following one.
input_tensor = tf.ones(shape=input_shape, dtype=tf.uint8, name='image_tensor')
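To see why the replacement matters, the sketch below (with illustrative names and shapes of my own choosing) compares the shape of a placeholder with an unknown batch dimension against a tf.ones tensor, which is what the profiler needs:

```python
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    # A placeholder with an unknown batch dimension has an incomplete
    # shape, so the profiler cannot count FLOPs for ops consuming it.
    placeholder_in = tf.compat.v1.placeholder(
        tf.float32, shape=(None, 8), name='placeholder_input')
    # A tf.ones tensor of the same role is fully defined.
    constant_in = tf.ones(shape=(1, 8), dtype=tf.float32, name='constant_input')

print(placeholder_in.shape.is_fully_defined())  # False
print(constant_in.shape.is_fully_defined())     # True
```

Any fully defined stand-in works; tf.ones is convenient because it needs no real image data and keeps the rest of the freezing pipeline unchanged.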