Implementation of CNN Layers in TensorFlow
Hi DL lovers! I hope you enjoyed my last articles. This is the second article of the TF_CNN trilogy, and it covers how to define the layers of a CNN.
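If you have an image dataset:
We start by creating a tensor, input_layer, of shape [batch_size, image_height, image_width, channels].
batch_size. The size of the subset of samples to use for each training step.
image_height, image_width. The height and width of the sample images.
channels. The number of color channels in the sample images (3 for RGB, 1 for grayscale).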
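To make the layout concrete, here is a minimal sketch that uses NumPy in place of TensorFlow (tf.reshape accepts the same shape list). The 28x28 grayscale image size and the name flat_images are assumptions for the example, not values from this article.

```python
import numpy as np

# Assumed example data: 5 flattened 28x28 grayscale images.
batch_size, image_height, image_width, channels = 5, 28, 28, 1
flat_images = np.zeros((batch_size, image_height * image_width), dtype=np.float32)

# -1 lets the batch dimension be inferred from the data.
input_layer = flat_images.reshape(-1, image_height, image_width, channels)
print(input_layer.shape)  # (5, 28, 28, 1)
```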
If you have a text dataset:
We have to convert the words to numbers. We assign a unique ID to every word, so each sentence becomes a vector of numbers. Then comes an embedding layer.
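The word-to-ID step can be sketched in plain Python as follows. The toy sentences, the <PAD> token, and the helper to_ids are made up for illustration; a real pipeline would likely also handle punctuation and unknown words.

```python
# Toy corpus (made up for this example).
sentences = ["the movie was great", "the plot was thin", "great cast"]

# Reserve ID 0 for padding so every sentence can share one length.
vocab = {"<PAD>": 0}
for sentence in sentences:
    for word in sentence.split():
        if word not in vocab:
            vocab[word] = len(vocab)

max_len = max(len(s.split()) for s in sentences)

def to_ids(sentence):
    """Map each word to its ID and right-pad with 0 up to max_len."""
    ids = [vocab[w] for w in sentence.split()]
    return ids + [0] * (max_len - len(ids))

input_ids = [to_ids(s) for s in sentences]
print(input_ids)  # [[1, 2, 3, 4], [1, 5, 3, 6], [4, 7, 0, 0]]
```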
What happens in the embedding layer?
Up to this point, every word in the dataset has a unique ID. In the embedding layer, we map each ID to a vector.
We create the embedding matrix W and initialize it from a random uniform distribution. We don't have to worry about the initial values, as they are learned during training. We use the tf.nn.embedding_lookup function to generate a dense vector for each ID.
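Conceptually, the lookup is just row selection from W. The NumPy sketch below mimics that behaviour; it is not TensorFlow code, and the sizes and the name embedded_chars are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embedding_size = 8, 4  # assumed sizes for the example

# Embedding matrix W: one row per word ID, initialised uniformly in [-1, 1).
W = rng.uniform(-1.0, 1.0, size=(vocab_size, embedding_size))

# Two sentences of four word IDs each (0 is padding).
input_ids = np.array([[1, 2, 3, 4], [4, 7, 0, 0]])

# Row lookup: each ID is replaced by its row of W, which gives the same
# result shape as tf.nn.embedding_lookup(W, input_ids).
embedded_chars = W[input_ids]
print(embedded_chars.shape)  # (2, 4, 4)
```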
Here the purpose of the embedding layer is nicely explained.
Let's have a look at the dimensions. inputTensor is the input data that we need to model, so it is 2D: the number of samples by the vector size of each sample. The embedding layer generates a vector for every word ID, so it produces a 3D tensor. But tf.nn.conv2d requires a 4D tensor for the convolutional layer, so we add one more dimension using tf.expand_dims.
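The shape change can be checked with a small NumPy sketch (np.expand_dims behaves like tf.expand_dims here). The sizes and variable names are assumptions for the example.

```python
import numpy as np

# Assumed sizes: 2 sentences, 4 words each, 3-dimensional embeddings.
embedded_chars = np.zeros((2, 4, 3), dtype=np.float32)

# Append a trailing axis of size 1 to act as the "channels" dimension.
embedded_chars_expanded = np.expand_dims(embedded_chars, -1)
print(embedded_chars_expanded.shape)  # (2, 4, 3, 1)
```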
We have learned how to prepare image and text data for convolution. Before we jump into convolution, it is necessary to know the size of our filter matrix (which slides over the input data in the convolutional layer). It has to be a 4D tensor.
num_filters and the filter size are hyperparameters that you declare to suit your requirements. num_filters is the number of filters per filter size, and we often use more than one filter size for convolution.
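For image data, we can define the filter shape as [filter_height, filter_width, channels, num_filters].
For text data, we can define it as [filter_size, embedding_size, 1, num_filters], where embedding_size is the length of each word vector.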
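A minimal sketch of such filter shapes in plain Python follows. All the concrete numbers (128 filters, a 5x5 image filter, 100-dimensional embeddings, filter sizes 3, 4, and 5) are assumptions chosen only for illustration.

```python
# Assumed hyperparameters for the example.
num_filters = 128
channels = 3            # RGB images
embedding_size = 100    # length of each word vector
filter_sizes = [3, 4, 5]

# Layout expected by tf.nn.conv2d:
# [filter_height, filter_width, in_channels, out_channels]
image_filter_shape = [5, 5, channels, num_filters]

# For text, the filter spans the full embedding width and covers
# filter_size consecutive words; the input has a single channel.
text_filter_shapes = [[fs, embedding_size, 1, num_filters] for fs in filter_sizes]

print(image_filter_shape)   # [5, 5, 3, 128]
print(text_filter_shapes)   # [[3, 100, 1, 128], [4, 100, 1, 128], [5, 100, 1, 128]]
```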
Convolutional layer:
We define the convolutional layer using the tf.nn.conv2d function.
tf.truncated_normal outputs random values from a truncated normal distribution, which we use to initialize the filter matrix. We learn W (the filter matrix) and b (the bias) during training. Strides and padding were discussed in the last post.
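To show what the convolution computes, here is a framework-free NumPy sketch, not the TensorFlow implementation. It assumes stride 1 and VALID padding, uses a plain (not truncated) normal initializer for brevity, and the helper name conv2d_valid is hypothetical.

```python
import numpy as np

def conv2d_valid(x, W, b):
    """Stride-1, VALID-padding 2D convolution (cross-correlation).

    x: [batch, height, width, in_channels]
    W: [filter_height, filter_width, in_channels, out_channels]
    b: [out_channels]
    """
    batch, height, width, _ = x.shape
    fh, fw, _, out_channels = W.shape
    out_h, out_w = height - fh + 1, width - fw + 1
    out = np.zeros((batch, out_h, out_w, out_channels))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[:, i:i + fh, j:j + fw, :]  # [batch, fh, fw, in_channels]
            # Sum over the window and the input channels for every filter.
            out[:, i, j, :] = np.tensordot(patch, W, axes=([1, 2, 3], [0, 1, 2]))
    return out + b

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 6, 4, 1))             # 2 sentences, 6 words, embedding size 4
W = rng.normal(scale=0.1, size=(3, 4, 1, 5))  # filter size 3, 5 filters
b = np.full(5, 0.1)

conv = conv2d_valid(x, W, b)
print(conv.shape)  # (2, 4, 1, 5)
```

With a filter that spans the whole embedding width, the output width collapses to 1 and the height becomes sequence_length - filter_size + 1.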
Activation function: ReLU
Next, we apply the activation using the tf.nn.relu function.
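ReLU simply replaces negative values with zero. The NumPy line below applies that element-wise rule to a made-up array standing in for the convolution output.

```python
import numpy as np

# Made-up convolution output for illustration.
conv = np.array([[-1.5, 0.0, 2.0], [0.3, -0.2, 4.0]])

# Element-wise max(x, 0), the rule that tf.nn.relu applies.
h = np.maximum(conv, 0.0)
print(h)
```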
Max-Pooling Layer:
Max-pooling is done using tf.nn.max_pool.
ksize: a 1-D int tensor of 4 elements giving the size of the window for each dimension of the input. The window slides over the input and keeps the maximum value it covers.
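The effect of ksize can be sketched in NumPy as follows. This is an illustration under the assumptions of stride 1 and VALID padding, with a hypothetical helper max_pool_valid. For text, a window as tall as the whole convolution output leaves one value per filter.

```python
import numpy as np

def max_pool_valid(x, ksize):
    """Max-pooling with stride 1 and VALID padding.

    x: [batch, height, width, channels]
    ksize: [1, window_height, window_width, 1]
    """
    _, kh, kw, _ = ksize
    batch, height, width, channels = x.shape
    out_h, out_w = height - kh + 1, width - kw + 1
    out = np.empty((batch, out_h, out_w, channels), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Keep the largest value inside each window, per sample and channel.
            out[:, i, j, :] = x[:, i:i + kh, j:j + kw, :].max(axis=(1, 2))
    return out

# Assumed text setting: sequence_length 6 and filter_size 3,
# so the convolution output has height 6 - 3 + 1 = 4.
sequence_length, filter_size = 6, 3
h = np.arange(2 * 4 * 1 * 5, dtype=np.float64).reshape(2, 4, 1, 5)

pooled = max_pool_valid(h, ksize=[1, sequence_length - filter_size + 1, 1, 1])
print(pooled.shape)  # (2, 1, 1, 5)
```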
We often run the convolution with more than one filter size, so we then need to pull the resulting tensors together. You can keep the convolution part inside a loop and append each pooled tensor to a list as it is generated. Create the list before the loop (pooled_outputs = []), then keep appending inside it.
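The loop structure might look like the NumPy sketch below. It stands in for the TensorFlow ops (convolution with stride 1 and VALID padding, ReLU, then max-pooling over all positions), and every size in it is an assumption for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, sequence_length, embedding_size, num_filters = 2, 7, 4, 3  # assumed sizes
filter_sizes = [2, 3, 4]
x = rng.normal(size=(batch, sequence_length, embedding_size, 1))

pooled_outputs = []  # created once, before the loop
for filter_size in filter_sizes:
    W = rng.normal(scale=0.1, size=(filter_size, embedding_size, 1, num_filters))
    b = np.full(num_filters, 0.1)

    # Convolution: one output row per window of filter_size consecutive words.
    out_len = sequence_length - filter_size + 1
    conv = np.stack(
        [np.tensordot(x[:, i:i + filter_size], W, axes=([1, 2, 3], [0, 1, 2]))
         for i in range(out_len)],
        axis=1,
    )  # [batch, out_len, num_filters]

    h = np.maximum(conv + b, 0.0)  # ReLU
    # Max over all positions, reshaped to [batch, 1, 1, num_filters].
    pooled = h.max(axis=1).reshape(batch, 1, 1, num_filters)
    pooled_outputs.append(pooled)

print([p.shape for p in pooled_outputs])
```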
Before we enter the dropout layer, we have to flatten the tensors in the pooled_outputs list. We first concatenate the tensors using tf.concat and then flatten the result using tf.reshape.
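As a shape check, the concatenate-then-flatten step can be sketched in NumPy. The pooled tensors here are constant stand-ins, and the names h_pool, h_pool_flat, and num_filters_total are assumptions for the example.

```python
import numpy as np

num_filters, filter_sizes, batch = 3, [2, 3, 4], 2  # assumed sizes

# Stand-ins for the pooled tensors, each of shape [batch, 1, 1, num_filters].
pooled_outputs = [
    np.full((batch, 1, 1, num_filters), float(k)) for k in range(len(filter_sizes))
]

num_filters_total = num_filters * len(filter_sizes)

# Join along the last (filter) axis, then flatten to [batch, num_filters_total].
h_pool = np.concatenate(pooled_outputs, axis=3)
h_pool_flat = h_pool.reshape(-1, num_filters_total)
print(h_pool.shape, h_pool_flat.shape)  # (2, 1, 1, 9) (2, 9)
```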
Next, we pass the flattened vector to tf.nn.dropout. Usually self.dropout_keep_prob is 0.5 while training and 1.0 during evaluation, so that no units are dropped at test time.
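Dropout with a keep probability can be illustrated with the NumPy sketch below. It is not the TensorFlow implementation, and the helper name dropout is hypothetical. It uses inverted dropout, where the surviving units are scaled by 1 / keep_prob so that the expected sum stays the same.

```python
import numpy as np

def dropout(x, keep_prob, rng):
    """Zero each element with probability 1 - keep_prob and
    scale the survivors by 1 / keep_prob."""
    mask = rng.random(x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

rng = np.random.default_rng(0)
h_pool_flat = np.ones((2, 9))

h_drop_train = dropout(h_pool_flat, keep_prob=0.5, rng=rng)  # training
h_drop_eval = dropout(h_pool_flat, keep_prob=1.0, rng=rng)   # evaluation: nothing dropped
print(h_drop_train)
```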
Now we have to make predictions from the feature vector. We compute raw scores with a matrix multiplication plus a bias. tf.nn.softmax converts the raw scores into normalized probabilities, and tf.argmax picks the class with the highest score (it returns the index of the largest value). tf.nn.xw_plus_b is simply a wrapper for tf.matmul(self.h_drop, W) + b.
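The three steps (scores, softmax, argmax) can be sketched in NumPy as follows. The sizes and random inputs are assumptions for the example, and the code only mirrors what the TensorFlow ops compute.

```python
import numpy as np

rng = np.random.default_rng(0)
num_features, num_classes = 9, 2  # assumed sizes
h_drop = rng.normal(size=(4, num_features))
W = rng.normal(scale=0.1, size=(num_features, num_classes))
b = np.full(num_classes, 0.1)

# Raw scores: matrix multiplication plus bias.
scores = h_drop @ W + b

# Softmax: subtract the row maximum first for numerical stability.
shifted = scores - scores.max(axis=1, keepdims=True)
probabilities = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

# Index of the most likely class for each sample.
predictions = probabilities.argmax(axis=1)
print(predictions)
```

Because softmax preserves the ordering of the scores, taking argmax of the raw scores gives the same predictions.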
The layers discussed above are the main building blocks of a model. We still have to call the model, declare the hyperparameters, minimize the error, and save the model. By now you should have a clear idea of the functions we use to create our model. Do refer to some of the CNN models available on GitHub for examples.
We will discuss how to save and restore the model in the next post.
Thank you.