Using Keras for MNIST
Keras is an awesome Deep Learning library for TensorFlow and Theano. It provides a framework for high-level implementation of Deep Learning methods. This post describes the Hello World of Deep Learning: the classic MNIST digit classification.
MNIST is a database of handwritten digits, and Keras can be used to classify these digit images. The following description is based on the example code provided in the Keras distribution.
The general approach to the problem follows a few simple steps: load the data, build a model, and train the model. MNIST is a standard dataset, and Keras provides an API to download it for convenience. This example uses a Convolutional Neural Network (CNN) as the hidden layers to extract features from the digit images; a CNN generates a smaller representation of the entire image. In Keras, training is a single line of code.
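For reference, here is a minimal sketch of the imports and hyperparameters this post assumes, following the Keras 1.x API used throughout. The values match the stock mnist_cnn example, and the channels-first input_shape matches the reshape used below:

# Keras 1.x API, matching the calls used in this post
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils

batch_size = 128                 # training examples per gradient update
nb_classes = 10                  # digits 0-9
nb_epoch = 12                    # passes over the training data
img_rows, img_cols = 28, 28      # input image dimensions
nb_filters = 32                  # number of convolutional filters
pool_size = (2, 2)               # max-pooling window
kernel_size = (3, 3)             # convolution kernel size
input_shape = (1, img_rows, img_cols)  # single channel, channels-first ordering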
With these in place, the following loads the MNIST dataset, already split into training and testing examples.
(X_train, y_train), (X_test, y_test) = mnist.load_data()
The convolution layer expects the number of channels to be specified, so the data is reshaped to make the single channel explicit.
X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
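The stock Keras example also casts the pixel values to float32 and scales them to the range [0, 1] before training; a minimal version of that step:

# Normalize pixel values from [0, 255] to [0, 1]
X_train = X_train.astype('float32') / 255
X_test = X_test.astype('float32') / 255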
The target labels need to be converted to one-hot encoded vectors; for example, the label 3 becomes [0, 0, 0, 1, 0, 0, 0, 0, 0, 0].
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
The model comprises two 2D convolutional layers followed by a max-pooling layer and a dropout layer.
model = Sequential()
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='valid', input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.25))
The first 2D convolutional layer generates nb_filters=32 feature maps. Think of the output as a new image with 32 channels, each 26x26, smaller than the original single-channel 28x28 image: with a 3x3 kernel and 'valid' padding, each spatial dimension shrinks by kernel_size - 1, i.e. 28 - 3 + 1 = 26. So as a result of the 2D conv layer, the image gets transformed from 1x28x28 to 32x26x26. The following ReLU activation layer clears off the negative values and brings nonlinearity into the flow.
The image dimension is further reduced to 24x24 after the second 2D conv layer. The max-pooling layer then reduces the image size again: it returns the maximum pixel value for each region, and with a 2x2 pool each spatial dimension is halved (24 to 12), cutting the total size to one fourth. After these layers a 32-channel-deep 12x12 image is obtained; these outputs are also called feature maps or activation maps. The dropout layer acts as a regularizer to prevent overfitting of the model.
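To sanity-check this dimension arithmetic, you can print each layer's output shape; a quick sketch (the leading None is the batch dimension):

# Verify the shapes: 1x28x28 -> 32x26x26 -> 32x24x24 -> 32x12x12
for layer in model.layers:
    print(layer.__class__.__name__, layer.output_shape)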
The layers so far handled the feature detection part; now comes the classification part.
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
The Flatten layer combines all the 32x12x12 values into a single vector of length 4608. The Dense layer is the fully connected layer which connects each of those 4608 units to each of its 128 units. It is followed by a ReLU activation and dropout. The final Dense layer is the fully connected layer with the same number of output nodes as there are classes to be determined, in our case 10. The final activation is softmax, which maps the outputs to a probability distribution over the 10 classes (values between 0 and 1 that sum to 1).
Following this are the compilation step and the training step. The compile step configures the loss, optimizer, and metrics, and compiles the model into efficient backend code. Training begins with model.fit.
model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch, verbose=1, validation_data=(X_test, Y_test))
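Once training finishes, the stock example evaluates the model on the held-out test set; a minimal sketch, including turning the softmax outputs back into digit predictions with argmax:

import numpy as np

# Evaluate the trained model on the test set
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

# Each prediction is a 10-way probability vector; argmax picks the digit
probs = model.predict(X_test[:5])
print('Predicted:', np.argmax(probs, axis=1), 'Actual:', y_test[:5])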
Keras is a really easy-to-use framework for experimenting with Deep Learning.