Unpacking DenseNet: Understanding the Architecture and Building It with TensorFlow

Sumeet Badgujar · Analytics Vidhya · Jul 18, 2021
Figure 1: A 5-layer dense block (Source: Original DenseNet Paper)

What’s DenseNet? It’s a network which is Dense. That’s it, over, bye.

Kidding. But the truth is not far from it. DenseNet is an abbreviation of Densely Connected Convolutional Networks. A DenseNet is built from blocks in which the convolution layers are all connected to each other: every layer i in a block is connected to all of its successive layers, i.e. i+1, i+2, … up to the last one. This connectivity pattern is called dense connectivity (not to be confused with a Residual Network, which adds features through shortcuts instead of concatenating them).

Figure 2: A representation of Dense Blocks in DenseNet (Source: Original DenseNet Paper)

In a Residual Network, a layer’s output is added to a shortcut before being passed on. To further improve the information flow between layers, the DenseNet authors propose a different connectivity pattern: direct connections from any layer to all subsequent layers, combined by concatenation rather than summation. The core idea behind it is feature reuse, which leads to very compact models. As a result, DenseNet requires fewer parameters than comparable CNNs, since there is no need to relearn redundant feature maps. Milking the features till they run dry!!!
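To make the connectivity concrete before diving into the real blocks, here is a minimal, self-contained sketch of what “every layer sees all previous feature maps” looks like with Keras layers. The input shape, the three-layer loop, and the variable names are purely illustrative:

from tensorflow.keras.layers import Input, Conv2D, concatenate

growth_rate = 32
inputs = Input((56, 56, 64))
features = [inputs]          # running list of every output so far
x = inputs
for _ in range(3):
    y = Conv2D(growth_rate, 3, padding="same")(x)   # each layer adds 32 feature maps
    features.append(y)
    x = concatenate(features)                       # layer input = all previous outputs
# x now carries 64 + 3 * 32 = 160 channels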

Figure 3: DenseNet layer structure

DenseNet has two main building blocks: the Dense Block and the Transition Layer. Before either of those comes the stem of the network: a first convolution with 64 filters of size 7x7 and a stride of 2, followed by a 3x3 max-pooling layer, also with a stride of 2. These two layers can be written with the following Python code.

from tensorflow.keras.layers import Input, Conv2D, BatchNormalization, ReLU, MaxPool2D

input = Input(input_shape)
x = Conv2D(64, 7, strides=2, padding="same")(input)
x = BatchNormalization()(x)
x = ReLU()(x)
x = MaxPool2D(3, strides=2, padding="same")(x)

Figure 4: Dense Block and Transition Layer

The Dense Block has two parts: a 1x1 convolution block and a 3x3 convolution block. Because each layer’s input is the concatenation of all previous feature maps, the number of channels (and with it the memory footprint) grows quickly. The 1x1 convolution acts as a bottleneck that keeps this growth in check by reducing the number of feature maps before the expensive 3x3 convolution.
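A rough back-of-the-envelope calculation shows why the bottleneck helps. The channel counts below are made up for illustration: suppose a layer deep inside a dense block already receives 512 concatenated channels and the growth rate is 32:

# 3x3 conv straight from 512 input channels to 32 output channels
direct = 3 * 3 * 512 * 32                             # 147,456 weights

# 1x1 bottleneck down to 4 * 32 = 128 channels first, then the 3x3 conv
bottlenecked = 1 * 1 * 512 * 128 + 3 * 3 * 128 * 32   # 102,400 weights

print(direct, bottlenecked)  # ~30% fewer weights here, and the savings grow with depth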

First, let’s define the convolution block in Python.

def conv_layer(x, filters, kernel=1, strides=1):
    # BN -> ReLU -> Conv ordering (pre-activation), as used in the paper
    x = BatchNormalization()(x)
    x = ReLU()(x)
    x = Conv2D(filters, kernel, strides=strides, padding="same")(x)
    return x

Now coming to the Dense Block. The block has a tunable parameter called the growth rate (the paper’s k; the default here is 32): it is simply the number of filters each 3x3 convolution adds to the running feature map. The 1x1 bottleneck block uses 4 times the growth rate as its number of filters, and is followed by the 3x3 convolution. And then comes the main highlight of the paper: each layer’s output is concatenated with its input, so every later layer sees the features of all earlier ones. Here, the concatenation is done with Keras’s concatenate function.

Each dense block is repeated n times, where n depends on the DenseNet variant. To implement this, we will create a list of repetitions and loop over it. For DenseNet-121, the repetitions are [6, 12, 24, 16].

def dense_block(x, repetition, filters):
    for _ in range(repetition):
        y = conv_layer(x, 4 * filters)   # 1x1 bottleneck with 4 * growth-rate filters
        y = conv_layer(y, filters, 3)    # 3x3 convolution producing `filters` new maps
        x = concatenate([y, x])          # dense connectivity: stack new maps onto all old ones
    return x
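As a quick sanity check (assuming conv_layer and the Keras imports from above are in scope), a dense block with 6 repetitions and a growth rate of 32 on a 64-channel input should produce 64 + 6 × 32 = 256 channels:

from tensorflow.keras.layers import Input

x = Input((56, 56, 64))
y = dense_block(x, repetition=6, filters=32)
print(y.shape)   # (None, 56, 56, 256): 64 input channels + 6 * 32 grown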

The output of the dense block is passed on to the Transition Layer. It consists of a 1x1 convolutional layer followed by a 2x2 average-pooling layer with a stride of 2, which downsizes the image by half. A kernel size of 1x1 is already the default in our conv_layer function, so we do not need to set it explicitly. In the transition layer we also compress the features, reducing the channels to half of the existing count.

def transition_layer(x):
    # Compress to half the incoming channels (integer division keeps filters an int)
    x = conv_layer(x, x.shape[-1] // 2)
    x = AvgPool2D(2, strides=2, padding="same")(x)
    return x
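Chaining the transition layer onto the dense-block output from the check above should halve both the channel count and the spatial resolution:

t = transition_layer(y)   # y has shape (None, 56, 56, 256)
print(t.shape)            # (None, 28, 28, 128): half the channels, half the size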

Complete DenseNet-121 architecture:

Now that we have all the blocks defined, let’s put them together to see the entire DenseNet-121 architecture.

from tensorflow.keras import Model
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization, ReLU,
                                     MaxPool2D, AvgPool2D, GlobalAveragePooling2D,
                                     Dense, concatenate)

def conv_layer(x, filters, kernel=1, strides=1):
    x = BatchNormalization()(x)
    x = ReLU()(x)
    x = Conv2D(filters, kernel, strides=strides, padding="same")(x)
    return x

def dense_block(x, repetition, filters):
    for _ in range(repetition):
        y = conv_layer(x, 4 * filters)
        y = conv_layer(y, filters, 3)
        x = concatenate([y, x])
    return x

def transition_layer(x):
    x = conv_layer(x, x.shape[-1] // 2)
    x = AvgPool2D(2, strides=2, padding="same")(x)
    return x

def densenet(input_shape, n_classes, filters=32):
    input = Input(input_shape)
    x = Conv2D(64, 7, strides=2, padding="same")(input)
    x = BatchNormalization()(x)
    x = ReLU()(x)
    x = MaxPool2D(3, strides=2, padding="same")(x)
    for repetition in [6, 12, 24, 16]:
        d = dense_block(x, repetition, filters)
        x = transition_layer(d)
    # No transition after the last dense block: go straight to global pooling,
    # which is why `d` (not `x`) is used below.
    x = GlobalAveragePooling2D()(d)
    output = Dense(n_classes, activation="softmax")(x)
    model = Model(input, output)
    return model
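To try it out, build the model with an example input shape and class count (the ImageNet-style values below are just an illustration) and print its summary:

model = densenet((224, 224, 3), n_classes=1000)
model.summary()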
Figure 5: Model summary

And that’s how we can implement the DenseNet-121 architecture using TensorFlow.

To see the code presented in a nicer, more readable way, check out the code on GitHub.

References:

Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger, “Densely Connected Convolutional Networks,” arXiv:1608.06993 (2016).
