Reducing Deep Learning Model Size Without Affecting Its Original Performance and Accuracy With the TensorFlow Model Optimization Toolkit on a Real-World Dataset

Janibasha Shaik
Published in Analytics Vidhya
7 min read · Oct 23, 2020

All you need to know to train scalable and efficient deep learning models

Table Of Contents :

(i) Introduction

(ii) Motivation

(iii) Model Training without pruning

(iv) Model Training with pruning

(v) Comparison of both model sizes

(vi) Comparison of both model performances

(vii) Comparison of both model accuracies

(viii) Conclusion

(ix) References

(i) Introduction :

We all know that neural networks generally outperform classical machine learning algorithms because their weights are updated on every epoch. A lesser-known fact is that even after removing up to 90% of the weights from a neural network, we can still match the baseline model's accuracy.

Many neural network weights are sparse in nature (close to zero). These low-magnitude weights have little impact on model performance, so we can remove them without affecting it.

The process of removing sparse weights from a neural network is called pruning. I am using the TensorFlow Model Optimization Toolkit for pruning.

Image credit : https://www.youtube.com/watch?v=KRlOEGtb3gk&t=1233s

Advantages of pruning:

Model size decreases without affecting model performance.

Accuracy remains close to that of the baseline model.

(ii) Motivation:

Recently I developed a web app that detects diseased cotton plants and cotton leaves. After building the web app I tried to deploy it on the Heroku platform, but because of my model size I was not able to deploy it.

My model size was 562 MB, but the Heroku platform supports only 500 MB.

My trained .h5 file alone was 61 MB.

At the time I did not know how to reduce it, but yesterday I got an internship task on model pruning.

While exploring this topic I came across a video on YouTube, an official TensorFlow team presentation on the Model Optimization Toolkit: https://www.youtube.com/watch?v=3JWRVx1OKQQ&t=2143s

That video gave me the solution to my problem.

MobileNet: even after removing 75% of the sparse weights, the pruned model can still match the baseline accuracy on the ImageNet dataset.

ResNet50: even after removing 90% of the sparse weights, the pruned model can still match the baseline accuracy on the ImageNet dataset.

All of this information motivated me to solve my problem.

Data pre-processing:

The main objective of this article is pruning, so I am not going in depth on data preprocessing. If you want a detailed explanation, I have already published an article on it; please follow this link: https://medium.com/analytics-vidhya/agriculture-project-from-scratch-to-deployment-using-deep-learning-architecture-fca767be094f
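The fit() calls later in this article use two Keras data generators, train_agumented_set and val_agumented_set, which come out of that preprocessing step. As a minimal sketch of how they can be built with ImageDataGenerator (the directory paths and batch size below are placeholder assumptions, not the exact values from the preprocessing article):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augment the training images; only rescale the validation images.
train_datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True)
val_datagen = ImageDataGenerator(rescale=1./255)

# 'data/train' and 'data/val' are placeholder paths -- point them at your own dataset.
train_agumented_set = train_datagen.flow_from_directory('data/train', target_size=(224, 224), batch_size=32, class_mode='categorical')
val_agumented_set = val_datagen.flow_from_directory('data/val', target_size=(224, 224), batch_size=32, class_mode='categorical')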

(iii) Model Training Without Pruning:

My problem statement is multi-class classification.

To classify the images, I build a CNN from scratch.

Importing the necessary modules:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, Lambda, Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.models import Sequential
import numpy as np
from tensorflow.keras.preprocessing import image
from tensorflow.keras.optimizers import Adam  # use tf.keras.optimizers rather than standalone keras to avoid mixing the two APIs
import os
import tempfile
%load_ext tensorboard
import tensorboard

Building the CNN model

4 Conv2D, 4 MaxPooling2D, 3 Dropout, 3 Dense, and 1 Flatten layer

Conv2D layers, number of filters: 32, 64, 128, 256

Kernel size: 3x3

Dense layers, number of neurons: 128, 256

Output layer: softmax with 4 neurons

Activation function: relu

cnn_model = keras.models.Sequential([
    keras.layers.Conv2D(filters=32, kernel_size=3, input_shape=[224, 224, 3]),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    keras.layers.Conv2D(filters=64, kernel_size=3),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    keras.layers.Conv2D(filters=128, kernel_size=3),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    keras.layers.Conv2D(filters=256, kernel_size=3),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    keras.layers.Dropout(0.5),
    keras.layers.Flatten(),                            # flatten feature maps before the dense layers
    keras.layers.Dense(units=128, activation='relu'),  # first dense (hidden) layer
    keras.layers.Dropout(0.1),
    keras.layers.Dense(units=256, activation='relu'),  # second dense (hidden) layer
    keras.layers.Dropout(0.25),
    keras.layers.Dense(units=4, activation='softmax')  # output layer (4 classes)
])

Compiling the model

logdir = tempfile.mkdtemp()  # directory for the TensorBoard logs written by the callback below
callbacks = [tf.keras.callbacks.TensorBoard(log_dir=logdir, profile_batch=0)]
cnn_model.compile(optimizer=Adam(lr=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])

Fitting the model

I am training the model for 10 epochs only; you can train it for as many epochs as you like.

history = cnn_model.fit(train_agumented_set, epochs=10, verbose=1, validation_data=val_agumented_set, callbacks=callbacks)
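To make the size comparison in section (v) concrete, the baseline model can be saved to an .h5 file and its size checked on disk. A minimal sketch (the file name baseline_cnn.h5 is a placeholder I chose, not from the original code):

# Save the unpruned baseline model and report its file size in MB.
baseline_file = 'baseline_cnn.h5'
cnn_model.save(baseline_file)
print('Baseline .h5 size: %.1f MB' % (os.path.getsize(baseline_file) / 1e6))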

(iv) Model Training with pruning:

Pruning is nothing but removing the connections of neurons whose weights are sparse (close to zero).

With pruning we reduce the number of non-zero weights in the neural network, and as a result the model size also shrinks.

Because of the smaller size, the model's response time can also drop, so pruning indirectly improves the model's serving performance.

For pruning the neural network I am using the TensorFlow Model Optimization Toolkit.

The toolkit supports post-training quantization, quantization-aware training, pruning, and clustering.
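The toolkit is a separate package from TensorFlow itself; if it is not already installed, it can be added from a notebook cell (the package name on PyPI is tensorflow-model-optimization):

!pip install tensorflow-model-optimization  # install the TensorFlow Model Optimization Toolkit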

For pruning I am using the PolynomialDecay schedule from the TensorFlow Model Optimization API.

ref : https://www.tensorflow.org/model_optimization/api_docs/python/tfmot/sparsity/keras/PolynomialDecay

The PolynomialDecay function takes the following arguments (see the sketch after this list for how the step values are typically chosen):

initial_sparsity: sparsity (%) at which pruning begins.

final_sparsity: sparsity (%) at which pruning ends.

begin_step: step at which to begin pruning.

end_step: step at which to end pruning.

power: exponent to be used in the sparsity function.

frequency: only apply pruning every frequency steps.
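Note that begin_step and end_step are counted in optimizer steps (batches), not epochs. A rough sketch of how end_step is often derived from the dataset size (num_train_images and batch_size below are placeholder values, not the exact numbers from this project):

import numpy as np
num_train_images = 2000   # placeholder: number of training images
batch_size = 32           # placeholder: batch size used by the data generators
epochs = 10
steps_per_epoch = int(np.ceil(num_train_images / batch_size))
end_step = steps_per_epoch * epochs  # prune until the end of training
print(steps_per_epoch, end_step)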

Importing the necessary modules:

from tensorflow_model_optimization.sparsity import keras as sparsity
import numpy as np

I am creating a dictionary for the pruning parameters:

pruning_params = {
    'pruning_schedule': sparsity.PolynomialDecay(initial_sparsity=0.85,
                                                 final_sparsity=0.95,
                                                 begin_step=2000,
                                                 end_step=5000,
                                                 frequency=100)
}

We start pruning at 85% sparsity (85% of the weights are zero) and end with 95% sparsity.

PolynomialDecay returns a PruningSchedule object that controls the pruning rate throughout training.
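To get an intuition for how the schedule ramps the sparsity between begin_step and end_step, here is a small illustrative sketch of the polynomial decay formula (using the schedule's default power=3); this is a plain-Python re-implementation for illustration only, not a call into the toolkit's API:

def polynomial_sparsity(step, initial=0.85, final=0.95, begin_step=2000, end_step=5000, power=3):
    # Before begin_step the sparsity stays at the initial value; after end_step it stays at the final value.
    step = min(max(step, begin_step), end_step)
    progress = (step - begin_step) / (end_step - begin_step)
    return final + (initial - final) * (1 - progress) ** power

for s in (2000, 3000, 4000, 5000):
    print(s, round(polynomial_sparsity(s), 3))
# sparsity ramps from 0.85 at step 2000 up to 0.95 at step 5000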

I am importing keras as sparsity

sparsity.prune_low_magnitude: wraps a layer with pruning functionality that sparsifies the layer's weights during training. For example, using this with 85% sparsity ensures that 85% of the layer's weights are zero.

Now we know all the parameters needed to prune the neural network.

Building the CNN model with pruning

pruned_model = keras.models.Sequential([
    sparsity.prune_low_magnitude(keras.layers.Conv2D(filters=32, kernel_size=3), input_shape=[224, 224, 3], **pruning_params),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    sparsity.prune_low_magnitude(keras.layers.Conv2D(filters=64, kernel_size=3), **pruning_params),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    sparsity.prune_low_magnitude(keras.layers.Conv2D(filters=128, kernel_size=3), **pruning_params),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    sparsity.prune_low_magnitude(keras.layers.Conv2D(filters=256, kernel_size=3), **pruning_params),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    keras.layers.Dropout(0.5),
    keras.layers.Flatten(),                                                                            # flatten feature maps before the dense layers
    sparsity.prune_low_magnitude(keras.layers.Dense(units=128, activation='relu'), **pruning_params),  # first dense (hidden) layer
    keras.layers.Dropout(0.1),
    sparsity.prune_low_magnitude(keras.layers.Dense(units=256, activation='relu'), **pruning_params),  # second dense (hidden) layer
    keras.layers.Dropout(0.25),
    sparsity.prune_low_magnitude(keras.layers.Dense(units=4, activation='softmax'), **pruning_params)  # output layer (4 classes)
])

Compiling the model

pruned_model.compile(optimizer=Adam(lr=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])

Fitting the model

# UpdatePruningStep is required so the pruning wrappers advance their internal step counter during training.
callbacks = [sparsity.UpdatePruningStep(), sparsity.PruningSummaries(log_dir=logdir, profile_batch=0)]
history_prun = pruned_model.fit(train_agumented_set, epochs=10, verbose=1, validation_data=val_agumented_set, callbacks=callbacks)

sparsity.strip_pruning: once a model has been pruned to the required sparsity, this method strips the pruning wrappers and returns the original model architecture with the sparse weights.
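Putting that together, a minimal sketch of stripping the wrappers, saving the pruned model to .h5, and checking its size on disk (the file name pruned_cnn.h5 is a placeholder I chose):

# Remove the pruning wrappers so only the plain layers (with sparse weights) remain.
final_model = sparsity.strip_pruning(pruned_model)
pruned_file = 'pruned_cnn.h5'
final_model.save(pruned_file)
print('Pruned .h5 size: %.1f MB' % (os.path.getsize(pruned_file) / 1e6))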

(v) Comparison of both the model sizes :

Trained model (without pruning)

We got an .h5 file of size 61 MB.

Trained model (with pruning)

We got an .h5 file of size 20 MB.

With pruning, the file size is reduced by roughly 3x.

(vi) Comparison of both the model performances :

Model performance without pruning (training plot)
Model performance with pruning (training plot)

If you compare the two plots, both models perform at almost the same level.

(vii) Comparison of both the model accuracies :

Trained model (without pruning)

Trained model (with pruning)
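The accuracy figures above come from the training history; to reproduce the comparison programmatically, a quick sketch is to evaluate both trained models on the same validation generator:

# Evaluate both models on the validation data and compare their accuracy.
base_loss, base_acc = cnn_model.evaluate(val_agumented_set, verbose=0)
pruned_loss, pruned_acc = pruned_model.evaluate(val_agumented_set, verbose=0)
print('Baseline accuracy: %.4f' % base_acc)
print('Pruned accuracy:   %.4f' % pruned_acc)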
