Custom Models with TensorFlow (Part 3): Custom Layers & Activation Functions

Sthanikam Santhosh
5 min read · Dec 6, 2022


In Part 1 and Part 2 of Custom Models with TensorFlow, we discussed how to implement multi-input and multi-output layers. In this article, we will look at how to implement custom layers and custom activation functions.

TensorFlow ships with many built-in layer types such as LSTM, RNN, CNN, and Dense, but in some cases we need to implement our own.
There are different ways to customize layer behavior. The first and easiest is the Lambda layer, which wraps an arbitrary function so it can be used inside a functional or Sequential model. It is best suited for quick and easy experimentation.

Let’s see how we can use the lambda layer within the code.

The simplest Lambda layer looks something like this: we pass a lambda function that takes the input x and maps it to its square. For example, an input of 2 is mapped to 4 by this layer.

from tensorflow.keras.layers import Lambda

# add an x -> x^2 layer
model.add(Lambda(lambda x: x ** 2))

Let’s see how we can use this lambda layer to design the model for the MNIST dataset.

import tensorflow as tf
from tensorflow.keras import backend as K

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128),
    tf.keras.layers.Lambda(lambda x: tf.abs(x)),  # Lambda layer to take the absolute value
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

Here, we used the Lambda layer to define a custom layer in our network: the lambda function takes the absolute value of the layer's input.
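
As a quick check (a small sketch, separate from the training script above), we can apply the same kind of Lambda layer to a few values and confirm that it returns their absolute values:

abs_layer = tf.keras.layers.Lambda(lambda x: tf.abs(x))
print(abs_layer(tf.constant([-2.0, 3.0, -0.5])))  # -> [2.  3.  0.5]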

Another way to use the Lambda layer is to pass in a function defined outside the model. The code below shows how the model uses a custom ReLU variant as a custom layer.

def custom_relu(x):
    # ReLU variant: clamp values below -0.1 to -0.1 instead of 0
    return K.maximum(-0.1, x)

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128),
    tf.keras.layers.Lambda(custom_relu),  # Lambda layer calling the custom function
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
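
As another quick check (a small sketch, separate from the training code), we can call custom_relu directly on a few values; anything below -0.1 is clamped to -0.1, while larger values pass through unchanged:

print(custom_relu(tf.constant([-1.0, -0.05, 0.5])))  # -> [-0.1  -0.05  0.5]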

Lambda layers are great for simple functionality and quick prototyping, but if we want more complex or advanced architectures with trainable weights, we run into their limitations.

To design a custom layer that is trainable, we need to define a class that inherits from the Layer base class in Keras.

# inherit from this base class
from tensorflow.keras.layers import Layer

class CustomDense(Layer):

    def __init__(self, units=32):
        super(CustomDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        # initialize the weights
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name="kernel",
                             initial_value=w_init(shape=(input_shape[-1], self.units),
                                                  dtype='float32'),
                             trainable=True)

        # initialize the biases
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name="bias",
                             initial_value=b_init(shape=(self.units,), dtype='float32'),
                             trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

The code above shows the class declaration. The class defines three methods: __init__(), build(), and call().

__init__() initializes the instance attributes; it accepts the layer's parameters and sets up any internal variables.

build() runs the first time the layer is called, once the input shape is known. Inside it, we initialize the state of the layer (its weights); in this case we call them w (weights) and b (bias).

When we create the layer, we don't create just a single neuron but the number of neurons specified by the units argument. Every neuron needs to be initialized, and TensorFlow provides a number of built-in initializers. One of these is the random normal initializer, which, as its name suggests, draws initial values from a normal distribution.
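
For illustration (a small sketch, not part of the layer definition), the initializer can be called directly to see the kind of values it produces; by default it samples from a normal distribution with mean 0 and a small standard deviation:

w_init = tf.random_normal_initializer()
print(w_init(shape=(2, 3), dtype='float32'))  # a 2x3 tensor of small random values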

self.w holds the weights as a tensor, created as a tf.Variable and initialized with w_init. We give it the name "kernel" so we can trace it later, and set trainable=True so that TensorFlow updates the values of w during training.

The bias is initialized differently, using tf.zeros_initializer, which, as the name suggests, sets it to zero.

self.b is then a tensor with one value per unit in the layer, all initialized to zero, and it is also marked as trainable.

call() performs the computation. Since self.w and self.b are tensors, we can use matmul to multiply the inputs by w and then add b before returning the result. The inputs are our x values, and the output is Wx + b.

Together, these give our custom layer a state and a computation that can be used during training and inference.
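
To see the layer in action on its own (a quick standalone check using the class defined above), we can instantiate it and call it on a small batch; build() runs on the first call and creates the weight and bias variables:

custom_layer = CustomDense(units=4)
x = tf.ones((1, 3))             # a batch of one sample with 3 features
print(custom_layer(x))          # shape (1, 4); the first call triggers build()
print(custom_layer.weights)     # the "kernel" and "bias" variables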

If we want to add an activation function to the layer, we can do so with the following syntax.

class CustomDense(Layer):

    # add an activation parameter
    def __init__(self, units=32, activation=None):
        super(CustomDense, self).__init__()
        self.units = units

        # look up the activation from the built-in activations in Keras
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name="kernel",
                             initial_value=w_init(shape=(input_shape[-1], self.units),
                                                  dtype='float32'),
                             trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name="bias",
                             initial_value=b_init(shape=(self.units,), dtype='float32'),
                             trainable=True)
        super().build(input_shape)

    def call(self, inputs):
        # pass the computation through the activation function
        return self.activation(tf.matmul(inputs, self.w) + self.b)

To use the built-in activations in Keras, we specify an activation parameter in the __init__() method of our custom layer class and initialize it with tf.keras.activations.get(), which takes a string identifier corresponding to one of the available Keras activations. In call(), we then pass the forward computation Wx + b through this activation.
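
For reference (a small illustration, separate from the layer code), tf.keras.activations.get() maps a string identifier to the corresponding activation function; passing None returns the linear (identity) activation, which is why the layer still works when no activation is specified:

relu_fn = tf.keras.activations.get('relu')
print(relu_fn(tf.constant([-1.0, 2.0])))  # -> [0. 2.]
print(tf.keras.activations.get(None))     # the linear (identity) activation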

Now we can use this custom layer in a model architecture and train it on the dataset, like this:

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    CustomDense(128, activation='relu'),  # custom layer
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

In this way, we can design custom layer architectures using TensorFlow.

