Tensorflow Tutorial — Part 2
In the previous Part 1 of this tutorial, I introduced a bit of TensorFlow and Scikit Flow and showed how to build a simple logistic regression model on Titanic dataset.
In this part let’s go deeper and try multi-layer fully connected neural networks, writing your custom model to plug into the Scikit Flow and top it with trying out convolutional networks.
Multi-layer fully connected neural network
Of course, there is not much point of yet another linear/logistic regression framework. An idea behind TensorFlow (and many other deep learning frameworks) is to be able to connect differentiable parts of the model together and optimize them given the same cost (or loss) function.
Scikit Flow already implements a convenient wrapper around TensorFlow API for creating many layers of fully connected units, so it’s simple to start with deep model by just swapping classifier in our previous model to the TensorFlowDNNClassifier and specify hidden units per layer:
This will create 3 layers of fully connected units with 10, 20 and 10 hidden units respectively, with default Rectified linear unit activations. We will be able to customize this setup in the next part.
Note on the parameters for the model —I’ve been putting some for an example, but learning rate, optimizer and how many steps you train a model can make a big difference. Usually, in real scenarios one would run hyper-parameter search to find an optimal set which improves cost or accuracy on the validation set.
Multi-layer with tanh activation
I didn’t play much with hyperparameters, but previous DNN model actually yielded worse accuracy then a logistic regression. We can explore if this is due to overfitting on under-fitting in a separate post.
For the sake of this example, I though want to show how to switch to the custom model where you can have more control.
This model is very similar to the previous one, but we changed the activation function from a rectified linear unit to a hyperbolic tangent (rectified linear unit and hyperbolic tangent are most popular activation functions for neural networks).
As you can see, creating a custom model is as easy as writing a function, that takes X and y inputs (which are Tensors) and returns two tensors: predictions and loss. This is where you can start learning TensorFlow APIs to create parts of sub-graph.
What kind of TensorFlow tutorial would this be without an example of digit recognition? :)
This is just an example how you can try different types of datasets and models, not limiting to only floating number features. Here, we take digits dataset and write a custom model:
We’ve created conv_model function, that given tensor X and y, runs 2D convolutional layer with the most simple max pooling — just maximum. The result is passed as features to skflow.models.logistic_regression, which handles classification to required number of classes by attaching softmax over classes and computing cross entropy loss.
It’s easy now to modify this code to add as many layers as you want (some of the state-of-the-art image recognition models are hundred+ layers of convolutions, max pooling, dropout and etc).
The Part 3 is expanding the model for Titanic dataset with handling categorical variables.
PS. Thanks to Vlad Frolov for helping with missing articles and pointing mistakes in the draft :)