Keras for TPUs on Google Colaboratory (Free!)
Playing with the official Fashion MNIST example
Google has started giving users FREE access to TPUs on Google Colaboratory (Colab)! Colab already provides free GPU access (1 K80 core) to everyone, and a TPU is 10x more expensive. (Google Cloud currently charges $4.50 USD per TPU per hour, and $0.45 USD per K80 core per hour.) What exciting news!
Looks like Google quietly turn on free TPU v2 for Google Colab 2 days ago. I first got to know about this from…forums.fast.ai
The good folks in the fast.ai community have already shared some benchmarks on the MNIST dataset. However, MNIST seems to be too simple a problem for the advantages of the TPU to show.
I found an official example notebook, “Fashion MNIST with Keras and TPUs”, in the GitHub repo tensorflow/tpu. I made some changes to the notebook and ran it three times in different environments (TPU, GPU, CPU) as an alternative benchmark. Here is what I changed:
- Created a validation set from the training set.
- Changed batch_size from 1024 to 512 and epochs from 10 to 20. (No particular reason. I just wanted to try out different hyper-parameters.)
- Calculated test and validation scores post-training.
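The first change in the list can be sketched roughly as follows. This is a minimal illustration rather than the code from the notebook: the helper name split_train_validation, the 10% validation fraction, and the fixed seed are all assumptions, and zero-filled stand-in arrays are used in place of the real images so the snippet runs without downloading anything.

```python
import numpy as np

def split_train_validation(x, y, validation_fraction=0.1, seed=0):
    """Shuffle the sample indices once, then hold out a fraction for validation.

    Hypothetical helper: the 10% fraction and the seed are assumptions,
    not values taken from the original notebook.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of samples")
    rng = np.random.RandomState(seed)
    order = rng.permutation(len(x))
    n_valid = int(len(x) * validation_fraction)
    valid_idx, train_idx = order[:n_valid], order[n_valid:]
    return (x[train_idx], y[train_idx]), (x[valid_idx], y[valid_idx])

# Stand-in arrays with the shape of the Fashion MNIST training set
# (60,000 grayscale images of 28x28 pixels, 10 classes).
x_train = np.zeros((60000, 28, 28), dtype=np.uint8)
y_train = np.arange(60000) % 10

(x_tr, y_tr), (x_val, y_val) = split_train_validation(x_train, y_train)
print(x_tr.shape, x_val.shape)
```

The two resulting pairs can then be passed to Keras, for example as model.fit(x_tr, y_tr, batch_size=512, epochs=20, validation_data=(x_val, y_val)), and the held-out pair can be scored again after training with model.evaluate.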
Source Code / Notebooks
And here are the notebooks (they are saved as GitHub Gists; use the dropdown menu “File/View on Github” to open them on GitHub):
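All it really takes is converting the Keras model to a TPU model using:
tpu_model = tf.contrib.tpu.keras_to_tpu_model(
    model,
    strategy=tf.contrib.tpu.TPUDistributionStrategy(
        tf.contrib.cluster_resolver.TPUClusterResolver(tpu='grpc://' + os.environ['COLAB_TPU_ADDR'])))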
According to this notebook, this is just a temporary solution. In the future you’ll have to choose TPU as a distribution strategy in the model.compile instance method call instead.
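Since the same notebook is run on TPU, GPU, and CPU runtimes, it can help to check whether a TPU is attached before attempting the conversion. The following is a small sketch of that check, using only the COLAB_TPU_ADDR environment variable mentioned above; the helper name colab_tpu_address is hypothetical, and the fallback behaviour is an assumption about how one might structure the notebook.

```python
import os

def colab_tpu_address(environ=os.environ):
    """Return the gRPC address of the Colab TPU, or None if the variable is unset.

    Hypothetical helper: it only inspects COLAB_TPU_ADDR and does not
    contact the TPU itself.
    """
    addr = environ.get("COLAB_TPU_ADDR")
    return "grpc://" + addr if addr else None

address = colab_tpu_address()
if address is None:
    # GPU or CPU runtime: keep training the plain Keras model.
    print("No TPU found; using the unconverted Keras model.")
else:
    # TPU runtime: this address is what the cluster resolver expects.
    print("TPU available at", address)
```

Passing the environment as an argument keeps the helper easy to exercise outside Colab, for instance with a plain dictionary.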
With 3 convolution layers and 2 fully-connected layers, we can see that the TPU already provides almost 2x the speed of the GPU.
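To put that speed-up in perspective, here is a back-of-the-envelope calculation combining it with the Google Cloud prices quoted at the top of this post. It is only a rough sketch: the speed-up is rounded up to exactly 2x, which is an assumption, and the function name relative_cost is made up for illustration.

```python
TPU_PRICE_PER_HOUR = 4.50  # USD per TPU per hour, as quoted earlier in this post
K80_PRICE_PER_HOUR = 0.45  # USD per K80 core per hour, as quoted earlier
ASSUMED_SPEEDUP = 2.0      # assumption: "almost 2x" rounded up to 2x

def relative_cost(price_fast, price_slow, speedup):
    """Cost of one job on the faster device divided by its cost on the slower one."""
    return (price_fast / speedup) / price_slow

ratio = relative_cost(TPU_PRICE_PER_HOUR, K80_PRICE_PER_HOUR, ASSUMED_SPEEDUP)
print(f"One training run on the TPU would cost about {ratio:.1f}x the K80 run")
```

Under these assumptions a paid TPU run of this small model would cost several times more than the K80 run, which is another reason to look at bigger models before judging the hardware.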
The train/validation/test accuracies should be very close across the different environments, since they share the same hyper-parameters (we did not set the same seed, though).
One interesting thing is that the TPU's post-training validation accuracy differs from what Keras reported during training. I’m not sure why, but it probably has something to do with the fact that the TPU uses mixed-precision computation, and we move the graph to the CPU post-training, which I presume uses single-precision computation.
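To get a feel for how much a reduced-precision format can shift a result, here is a small standard-library sketch. It assumes the reduced format involved is bfloat16 and emulates it by truncating float32 values, which is a simplification (real hardware typically rounds). The weights and inputs are made-up numbers, and this is an illustration of the general effect, not a diagnosis of the accuracy gap seen in the notebooks.

```python
import struct

def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_bfloat16(x):
    """Emulate bfloat16 by keeping only the top 16 bits of the float32 encoding.

    Simplification: this truncates the mantissa instead of rounding it.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

weights = [0.1234, -0.5678, 0.9012]  # made-up values for illustration
inputs = [1.001, 2.002, 3.003]       # made-up values for illustration

# The same dot product with float32 operands and with bfloat16 operands.
full = sum(to_float32(w) * to_float32(v) for w, v in zip(weights, inputs))
reduced = sum(to_bfloat16(w) * to_bfloat16(v) for w, v in zip(weights, inputs))

print("float32 operands: ", full)
print("bfloat16 operands:", reduced)
```

The two sums differ from the second or third significant digit onward. Differences of this kind, accumulated over many layers, could plausibly nudge a few borderline predictions to the other side of a decision boundary.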
This is a very quick peek at what we can do with TPUs on Google Colab. As a next step, we can try training bigger models on bigger datasets to fully utilize the power of the TPU. By giving free access to TPUs, Google Colab has certainly opened a whole new world for us to explore. Good luck!
We briefly introduced some TPU reviews and benchmarks in April: