Exporting deep learning models from Keras to TensorFlow Serving

I recently faced the challenge of exporting a Keras model running on a TensorFlow backend into a TensorFlow Serving setup, and struggled to find any tutorials that covered how to do it correctly. The Keras blog has an example of how to do it, but that didn’t work for me (possibly because I belatedly realized that I hadn’t updated to the most recent version of Keras), while the TensorFlow Serving page outlines how to export a TensorFlow model in a different way that doesn’t cover the Keras-specific issues. In the end a frankensteined combo of the two, along with a number of GitHub issue answers, got me a working model. I thought it might be helpful for others (and for my future self) to walk through the challenges I faced getting it working.

Combining Keras and TensorFlow Serving

To make sure the dependencies are stated clearly, I’ve included my package versions below:

  • Keras 1.2.2
  • TensorFlow 1.3
  • tensorflow-serving-api 1.0.0
  • Python 2.7 (TensorFlow Serving only supports 2.7 currently)

Below is a link to the stripped-down version of the final code snippet that worked for me after several iterations:

Python source for exporting Keras on a TensorFlow backend to TensorFlow Serving.

Challenges included:

  • Format of the signatures — Documentation on the TensorFlow side wasn’t clear and no examples were provided. Confusing things further, several tutorials suggested just passing the tensors directly. For Keras running on a TensorFlow backend, the prediction signature looks like: prediction_signature = tf.saved_model.signature_def_utils.predict_signature_def({"image": Resnet50model.input}, {"prediction": Resnet50model.output})
  • Variable initialization — the init_op sets up the model on the TensorFlow Serving side, and both global and model parameters need to be initialized here or TF Serving will complain about uninitialized variables at runtime.
  • Keras has a variable called the learning phase that also needs to be initialized separately. To further confuse matters, the initialization has to happen before the model is instantiated: K._LEARNING_PHASE = tf.constant(0)
  • sess.run(init_op) can be called only once. I was setting up the model in a Jupyter notebook and on at least one occasion called it twice, which caused errors in TensorFlow Serving.
  • Keras implicitly associates itself with the TensorFlow session it’s running in, and most operations are transferred over. For some reason in the above code (possibly due to the use of pop?) the final dense layer added to the model, the one that does the transfer learning, was being randomly initialized rather than loaded along with all the other weights. This led to an incredibly challenging situation where the model would predict correctly in Python but, once loaded into TensorFlow Serving, would be incorrect and usually predict one class consistently. To fix this I had to explicitly associate the Keras backend with the active session using: K.set_session(sess)

Hopefully this will save you some time getting your models online in a TensorFlow Serving system. It was a challenge to figure out all of these disjointed steps, and while I’m sure this won’t work one or two versions down the road, it works now.