Reducing Deep Learning Model Size Without Affecting Its Original Performance and Accuracy With the TensorFlow Model Optimization Toolkit on a Real-World Dataset

Janibasha Shaik
Published in Analytics Vidhya
7 min read · Oct 23, 2020

All you need to know to train scalable and efficient deep learning models

Table Of Contents :

(i) Introduction

(ii) Motivation

(iii) Model Training without pruning

(iv) Model Training with pruning

(v) Comparison of both model sizes

(vi) Comparison of both model performances

(vii) Comparison of both model accuracies

(viii) Conclusion

(ix) References

(i) Introduction :

We all know that neural networks generally outperform classical machine learning algorithms because their weights are updated on every epoch. A lesser-known fact is that even after removing up to 90% of the weights from a neural network, we can still match the baseline model's accuracy.

Many neural network weights are sparse in nature (close to zero). These low-magnitude weights have little impact on model performance, so we can remove them without affecting it.

The process of removing sparse weights from a neural network is called pruning. I am using the TensorFlow Model Optimization Toolkit for pruning.

Image credit : https://www.youtube.com/watch?v=KRlOEGtb3gk&t=1233s

Advantages of pruning:

Model size decreases without affecting model performance.

Accuracy remains close to that of the baseline model.

(ii) Motivation:

Recently I developed a web app that detects diseased cotton plants and cotton leaves. After building the web app I tried to deploy it on the Heroku platform, but because of my model size I was not able to deploy it.

My model size was 562 MB, but the Heroku platform supports only 500 MB.

My trained .h5 file alone was 61 MB.

At the time I did not know how to reduce it, but yesterday I got an internship task on model pruning.

While exploring this topic I came across a video on YouTube, an official TensorFlow team presentation on the Model Optimization Toolkit: https://www.youtube.com/watch?v=3JWRVx1OKQQ&t=2143s

That video gave me the solution to my problem.

MobileNet: even after removing 75% of the sparse weights, the pruned model can still match the baseline accuracy on the ImageNet dataset.

ResNet50: even after removing 90% of the sparse weights, the pruned model can still match the baseline accuracy on the ImageNet dataset.

All of this information motivated me to solve my problem.

Data pre-processing:

The main objective of this article is pruning, so I am not going in depth on data preprocessing. If you want a detailed explanation, I have already published an article on it; please follow this link: https://medium.com/analytics-vidhya/agriculture-project-from-scratch-to-deployment-using-deep-learning-architecture-fca767be094f
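The fit() calls later in this article use two Keras data generators, train_agumented_set and val_agumented_set, which come out of that preprocessing step. As a minimal sketch of how they can be built with ImageDataGenerator (the directory paths and batch size below are placeholder assumptions, not the exact values from the preprocessing article):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augment the training images; only rescale the validation images.
train_datagen = ImageDataGenerator(rescale=1./255, shear_range=0.2, zoom_range=0.2, horizontal_flip=True)
val_datagen = ImageDataGenerator(rescale=1./255)

# 'data/train' and 'data/val' are placeholder paths -- point them at your own dataset.
train_agumented_set = train_datagen.flow_from_directory('data/train', target_size=(224, 224), batch_size=32, class_mode='categorical')
val_agumented_set = val_datagen.flow_from_directory('data/val', target_size=(224, 224), batch_size=32, class_mode='categorical')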

(iii) Model Training Without Pruning:

My problem statement is multi-class classification.

To classify the images, I build a CNN from scratch.

Importing the necessary modules:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, Lambda, Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.models import Sequential
import numpy as np
from tensorflow.keras.preprocessing import image
from tensorflow.keras.optimizers import Adam  # use tf.keras.optimizers rather than standalone keras to avoid mixing the two APIs
import os
import tempfile
%load_ext tensorboard
import tensorboard

Building the CNN model

4 Conv2D, 4 MaxPooling2D, 3 Dropout, 3 Dense, and 1 Flatten layer

Conv2D layers, number of filters: 32, 64, 128, 256

Kernel size: 3x3

Dense layers, number of neurons: 128, 256

Output layer: softmax with 4 neurons

Activation function: relu

cnn_model = keras.models.Sequential([
    keras.layers.Conv2D(filters=32, kernel_size=3, input_shape=[224, 224, 3]),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    keras.layers.Conv2D(filters=64, kernel_size=3),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    keras.layers.Conv2D(filters=128, kernel_size=3),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    keras.layers.Conv2D(filters=256, kernel_size=3),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    keras.layers.Dropout(0.5),
    keras.layers.Flatten(),                            # flatten feature maps before the dense layers
    keras.layers.Dense(units=128, activation='relu'),  # first dense (hidden) layer
    keras.layers.Dropout(0.1),
    keras.layers.Dense(units=256, activation='relu'),  # second dense (hidden) layer
    keras.layers.Dropout(0.25),
    keras.layers.Dense(units=4, activation='softmax')  # output layer (4 classes)
])

Compiling the model

logdir = tempfile.mkdtemp()  # directory for the TensorBoard logs written by the callback below
callbacks = [tf.keras.callbacks.TensorBoard(log_dir=logdir, profile_batch=0)]
cnn_model.compile(optimizer=Adam(lr=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])

Fitting the model

I am training the model for 10 epochs only; you can train it for as many epochs as you like.

history = cnn_model.fit(train_agumented_set, epochs=10, verbose=1, validation_data=val_agumented_set, callbacks=callbacks)
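To make the size comparison in section (v) concrete, the baseline model can be saved to an .h5 file and its size checked on disk. A minimal sketch (the file name baseline_cnn.h5 is a placeholder I chose, not from the original code):

# Save the unpruned baseline model and report its file size in MB.
baseline_file = 'baseline_cnn.h5'
cnn_model.save(baseline_file)
print('Baseline .h5 size: %.1f MB' % (os.path.getsize(baseline_file) / 1e6))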

(iv) Model Training with pruning:

Pruning is nothing but removing the connections of neurons whose weights are sparse (close to zero).

With pruning we reduce the number of non-zero weights in the neural network, and as a result the model size also shrinks.

Because of the smaller size, the model's response time can also drop, so pruning indirectly improves the model's serving performance.

For pruning the neural network I am using the TensorFlow Model Optimization Toolkit.

The toolkit supports post-training quantization, quantization-aware training, pruning, and clustering.
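The toolkit is a separate package from TensorFlow itself; if it is not already installed, it can be added from a notebook cell (the package name on PyPI is tensorflow-model-optimization):

!pip install tensorflow-model-optimization  # install the TensorFlow Model Optimization Toolkit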

For pruning I am using the PolynomialDecay schedule from the TensorFlow Model Optimization API.

ref : https://www.tensorflow.org/model_optimization/api_docs/python/tfmot/sparsity/keras/PolynomialDecay

The PolynomialDecay function takes the following arguments (see the sketch after this list for how the step values are typically chosen):

initial_sparsity: sparsity (%) at which pruning begins.

final_sparsity: sparsity (%) at which pruning ends.

begin_step: step at which to begin pruning.

end_step: step at which to end pruning.

power: exponent to be used in the sparsity function.

frequency: only apply pruning every frequency steps.
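Note that begin_step and end_step are counted in optimizer steps (batches), not epochs. A rough sketch of how end_step is often derived from the dataset size (num_train_images and batch_size below are placeholder values, not the exact numbers from this project):

import numpy as np
num_train_images = 2000   # placeholder: number of training images
batch_size = 32           # placeholder: batch size used by the data generators
epochs = 10
steps_per_epoch = int(np.ceil(num_train_images / batch_size))
end_step = steps_per_epoch * epochs  # prune until the end of training
print(steps_per_epoch, end_step)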

Importing the necessary modules:

from tensorflow_model_optimization.sparsity import keras as sparsity
import numpy as np

I am creating a dictionary for the pruning parameters:

pruning_params = {
    'pruning_schedule': sparsity.PolynomialDecay(initial_sparsity=0.85,
                                                 final_sparsity=0.95,
                                                 begin_step=2000,
                                                 end_step=5000,
                                                 frequency=100)
}

We start pruning at 85% sparsity (85% of the weights are zero) and end with 95% sparsity.

PolynomialDecay returns a PruningSchedule object that controls the pruning rate throughout training.
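To get an intuition for how the schedule ramps the sparsity between begin_step and end_step, here is a small illustrative sketch of the polynomial decay formula (using the schedule's default power=3); this is a plain-Python re-implementation for illustration only, not a call into the toolkit's API:

def polynomial_sparsity(step, initial=0.85, final=0.95, begin_step=2000, end_step=5000, power=3):
    # Before begin_step the sparsity stays at the initial value; after end_step it stays at the final value.
    step = min(max(step, begin_step), end_step)
    progress = (step - begin_step) / (end_step - begin_step)
    return final + (initial - final) * (1 - progress) ** power

for s in (2000, 3000, 4000, 5000):
    print(s, round(polynomial_sparsity(s), 3))
# sparsity ramps from 0.85 at step 2000 up to 0.95 at step 5000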

I am importing keras as sparsity

sparsity.prune_low_magnitude: wraps a layer with pruning functionality that sparsifies the layer's weights during training. For example, using this with 85% sparsity ensures that 85% of the layer's weights are zero.

Now we know all the parameters needed to prune the neural network.

Building the CNN model with pruning

pruned_model = keras.models.Sequential([
    sparsity.prune_low_magnitude(keras.layers.Conv2D(filters=32, kernel_size=3), input_shape=[224, 224, 3], **pruning_params),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    sparsity.prune_low_magnitude(keras.layers.Conv2D(filters=64, kernel_size=3), **pruning_params),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    sparsity.prune_low_magnitude(keras.layers.Conv2D(filters=128, kernel_size=3), **pruning_params),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    sparsity.prune_low_magnitude(keras.layers.Conv2D(filters=256, kernel_size=3), **pruning_params),
    keras.layers.MaxPooling2D(pool_size=(2,2)),
    keras.layers.Dropout(0.5),
    keras.layers.Flatten(),                                                                            # flatten feature maps before the dense layers
    sparsity.prune_low_magnitude(keras.layers.Dense(units=128, activation='relu'), **pruning_params),  # first dense (hidden) layer
    keras.layers.Dropout(0.1),
    sparsity.prune_low_magnitude(keras.layers.Dense(units=256, activation='relu'), **pruning_params),  # second dense (hidden) layer
    keras.layers.Dropout(0.25),
    sparsity.prune_low_magnitude(keras.layers.Dense(units=4, activation='softmax'), **pruning_params)  # output layer (4 classes)
])

Compiling the model

pruned_model.compile(optimizer=Adam(lr=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])

Fitting the model

# UpdatePruningStep is required so the pruning wrappers advance their internal step counter during training.
callbacks = [sparsity.UpdatePruningStep(), sparsity.PruningSummaries(log_dir=logdir, profile_batch=0)]
history_prun = pruned_model.fit(train_agumented_set, epochs=10, verbose=1, validation_data=val_agumented_set, callbacks=callbacks)

sparsity.strip_pruning: once a model has been pruned to the required sparsity, this method strips the pruning wrappers and returns the original model architecture with the sparse weights.
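Putting that together, a minimal sketch of stripping the wrappers, saving the pruned model to .h5, and checking its size on disk (the file name pruned_cnn.h5 is a placeholder I chose):

# Remove the pruning wrappers so only the plain layers (with sparse weights) remain.
final_model = sparsity.strip_pruning(pruned_model)
pruned_file = 'pruned_cnn.h5'
final_model.save(pruned_file)
print('Pruned .h5 size: %.1f MB' % (os.path.getsize(pruned_file) / 1e6))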

(v) Comparison of both the model sizes :

Trained model (without pruning)

We got an .h5 file of size 61 MB.

Trained model (with pruning)

We got an .h5 file of size 20 MB.

With pruning, the file size is reduced by roughly 3x.

(vi) Comparison of both the model performances :

Model performance without pruning (training plot)
Model performance with pruning (training plot)

If you compare the two plots, both models perform at almost the same level.

(vii) Comparison of both the model accuracies :

Trained model (without pruning)

Trained model (with pruning)
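The accuracy figures above come from the training history; to reproduce the comparison programmatically, a quick sketch is to evaluate both trained models on the same validation generator:

# Evaluate both models on the validation data and compare their accuracy.
base_loss, base_acc = cnn_model.evaluate(val_agumented_set, verbose=0)
pruned_loss, pruned_acc = pruned_model.evaluate(val_agumented_set, verbose=0)
print('Baseline accuracy: %.4f' % base_acc)
print('Pruned accuracy:   %.4f' % pruned_acc)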
