Reducing Deep Learning Model Size Without Affecting Its Original Performance and Accuracy With the TensorFlow Model Optimization Toolkit on a Real-World Dataset
All you need to know to train scalable and efficient deep learning models
Table Of Contents :
(i) Introduction
(ii) Motivation
(iii) Model Training without pruning
(iv) Model Training with pruning
(v) Comparison of both model sizes
(vi) Comparison of both model performances
(vii) Comparison of both model accuracy
(viii) Conclusion
(ix) References
(i) Introduction :
We all know that neural networks often outperform classical machine learning algorithms, in part because their weights are updated throughout training. A less well-known fact is that even after removing up to 90% of the weights from a neural network, we can often still match the baseline model accuracy.
Many of the weights in a trained neural network are close to zero. These low-magnitude weights contribute little to the model output, so we can remove them (set them to zero) with little effect on performance.
The process of removing these weights from a neural network is called pruning. I am using the TensorFlow Model Optimization Toolkit for pruning.
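To make the idea concrete, here is a minimal NumPy sketch of magnitude-based pruning on a single weight matrix. It is not the toolkit's implementation: the function name `prune_by_magnitude` and the random weights are assumptions made for illustration only.

```python
import numpy as np

def prune_by_magnitude(weights, target_sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries set to zero."""
    magnitudes = np.abs(weights).ravel()
    k = int(round(target_sparsity * magnitudes.size))  # how many weights to zero
    if k == 0:
        return weights.copy()
    # The k-th smallest magnitude becomes the pruning threshold.
    threshold = np.partition(magnitudes, k - 1)[k - 1]
    mask = np.abs(weights) > threshold  # True for the weights we keep
    return weights * mask

# Hypothetical weight matrix standing in for one layer of a trained network.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)

pruned = prune_by_magnitude(w, target_sparsity=0.90)
achieved = float(np.mean(pruned == 0))
print(f"fraction of zero weights: {achieved:.2f}")
```

The surviving weights keep their original values; only the entries below the threshold are zeroed. The toolkit applies the same kind of mask gradually during training, so the network has a chance to adapt.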
Advantages of pruning:
Model size decreases without significantly affecting model performance.
Accuracy stays close to that of the baseline model.
(ii) Motivation:
Recently I developed a web app that detects diseased cotton plants and cotton leaves. After developing it, I tried to deploy the app on the Heroku platform, but I could not because of my model size.
My model size was 562MB, but the Heroku platform supports only 500MB.
My trained .h5 file was 61MB.
At that time I did not know how to reduce it, but yesterday I got an internship task on model pruning.
While exploring this topic, I came across a YouTube video of the official TensorFlow team's presentation on the Model Optimization Toolkit: https://www.youtube.com/watch?v=3JWRVx1OKQQ&t=2143s
That video gave me the solution to my problem.
MobileNet: even after pruning 75% of the weights, we can still match the baseline model accuracy on the ImageNet dataset.
ResNet50: even after pruning 90% of the weights, we can still match the baseline model accuracy on the ImageNet dataset.
All this information motivated me to solve my problem.
Data pre-processing:
The main objective of this article is pruning, so I am not going into depth on data preprocessing. If you want a detailed explanation, I have already published an article on it: https://medium.com/analytics-vidhya/agriculture-project-from-scratch-to-deployment-using-deep-learning-architecture-fca767be094f
(iii) Model Training Without Pruning:
My problem is a multi-class classification task.
To classify the images, I built a CNN from scratch.
Importing the necessary modules
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Input, Lambda, Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.models import Sequential
import numpy as np
from tensorflow.keras.preprocessing import image
from tensorflow.keras.optimizers import Adam
import os
import tempfile
%load_ext tensorboard
import tensorboard
Building the CNN model
4 Conv2D, 4 MaxPooling2D, 3 Dropout, 3 Dense, and 1 Flatten layers
Conv2D layers, number of filters: 32, 64, 128, 256
Kernel size: 3x3
Hidden Dense layers, number of neurons: 128, 256
Output layer: softmax with 4 neurons
Activation function (hidden Dense layers): relu
cnn_model = keras.models.Sequential([
    keras.layers.Conv2D(filters=32, kernel_size=3, input_shape=[224, 224, 3]),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Conv2D(filters=64, kernel_size=3),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Conv2D(filters=128, kernel_size=3),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Conv2D(filters=256, kernel_size=3),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Dropout(0.5),
    keras.layers.Flatten(),  # flatten the feature maps for the dense layers
    keras.layers.Dense(units=128, activation='relu'),  # first hidden dense layer
    keras.layers.Dropout(0.1),
    keras.layers.Dense(units=256, activation='relu'),  # second hidden dense layer
    keras.layers.Dropout(0.25),
    keras.layers.Dense(units=4, activation='softmax')  # output layer, one unit per class
])
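As a rough sanity check on model size, the number of trainable parameters of this architecture can be worked out by hand. The sketch below is plain arithmetic, not a Keras call; it assumes the Keras defaults that the code above relies on (stride-1 convolutions with 'valid' padding, 2x2 pooling with stride 2), and the helper name `count_parameters` is hypothetical.

```python
def conv_out(size, kernel=3):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling."""
    return size // pool

def count_parameters(input_size=224, channels=3, kernel=3,
                     filters=(32, 64, 128, 256), dense_units=(128, 256, 4)):
    total = 0
    size, in_ch = input_size, channels
    for out_ch in filters:
        total += kernel * kernel * in_ch * out_ch + out_ch  # kernel weights + biases
        size = pool_out(conv_out(size, kernel))
        in_ch = out_ch
    features = size * size * in_ch  # length of the flattened vector
    for units in dense_units:
        total += features * units + units  # dense weights + biases
        features = units
    return total

total_params = count_parameters()
size_mb = total_params * 4 / 1e6  # float32 uses 4 bytes per parameter
print(total_params, round(size_mb, 1))  # 5141188 20.6
```

So the weights alone take roughly 20.6MB as float32, and most of them sit in the first Dense layer. A file that also stores the Adam optimizer state holds two additional tensors per weight, which would come to roughly three times that figure. Whether this plays a role in the file sizes reported later in this article is something this arithmetic cannot confirm.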
Compiling the model
logdir = tempfile.mkdtemp()  # assumption: a temporary directory for the TensorBoard logs
callbacks = [tf.keras.callbacks.TensorBoard(log_dir=logdir, profile_batch=0)]
cnn_model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])
Fitting the model
I am training the model for only 10 epochs; you can train it for as many epochs as you like.
history = cnn_model.fit(train_agumented_set, epochs=10, verbose=1, validation_data=val_agumented_set, callbacks=callbacks)
(iv) Model Training with pruning:
Pruning simply means removing (zeroing out) the connections between neurons whose weights have low magnitude.
With the help of pruning we can reduce the number of non-zero weights in the neural network, which in turn reduces the model size.
Because the model is smaller, its response time can also be lower, so pruning can indirectly improve serving performance.
For pruning the neural network I am using the TensorFlow Model Optimization Toolkit.
The toolkit supports post-training quantization, quantization-aware training, pruning, and clustering.
For pruning I am using the PolynomialDecay schedule from the TensorFlow Model Optimization API.
ref : https://www.tensorflow.org/model_optimization/api_docs/python/tfmot/sparsity/keras/PolynomialDecay
PolynomialDecay takes the following arguments:
initial_sparsity: Sparsity (%) at which pruning begins.
final_sparsity: Sparsity (%) at which pruning ends.
begin_step: Step at which to begin pruning.
end_step: Step at which to end pruning.
power: Exponent to be used in the sparsity function.
frequency: Only apply pruning every frequency steps.
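To see how these arguments interact, here is a small pure-Python sketch of the polynomial decay formula as I understand it from the API documentation. It is a re-implementation for illustration, not the library code; `polynomial_decay_sparsity` is a hypothetical helper, and power=3 is assumed to be the library default.

```python
def polynomial_decay_sparsity(step, initial_sparsity, final_sparsity,
                              begin_step, end_step, power=3):
    """Target sparsity at a given training step under a polynomial decay schedule."""
    # Fraction of the pruning window that has elapsed, clipped to [0, 1].
    progress = (step - begin_step) / (end_step - begin_step)
    progress = min(1.0, max(0.0, progress))
    # Sparsity rises quickly at first and flattens out towards final_sparsity.
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** power

# Values taken from the pruning parameters used in this article.
for step in (2000, 3500, 5000):
    target = polynomial_decay_sparsity(step, 0.85, 0.95, begin_step=2000, end_step=5000)
    print(step, round(target, 4))
```

Note that this helper returns initial_sparsity for steps before begin_step only for convenience. In the library, no pruning is applied until begin_step is reached, and the mask is only updated every frequency steps.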
importing necessary modules
from tensorflow_model_optimization.sparsity import keras as sparsity
import numpy as np
I am creating a dictionary for pruning parameters
pruning_params = {
    'pruning_schedule': sparsity.PolynomialDecay(initial_sparsity=0.85,
                                                 final_sparsity=0.95,
                                                 begin_step=2000,
                                                 end_step=5000,
                                                 frequency=100)
}
We start pruning at 85% sparsity (85% of the weights are zero) and end at 95% sparsity.
pruning_schedule: a PruningSchedule object that controls the pruning rate throughout training.
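Because begin_step and end_step are counted in training steps (batches), not epochs, it is worth checking that they fall inside the actual training run. Below is a minimal sketch of that check. The dataset size and batch size are hypothetical numbers chosen for illustration, not the values from my project, and both helper names are assumptions.

```python
import math

def total_training_steps(num_train_images, batch_size, epochs):
    """Number of optimizer steps in a full training run."""
    steps_per_epoch = math.ceil(num_train_images / batch_size)
    return steps_per_epoch * epochs

def schedule_fits(begin_step, end_step, total_steps):
    """True if the whole pruning window lies inside the training run."""
    return begin_step < end_step <= total_steps

# Hypothetical example: 2000 training images, batch size 32, 10 epochs.
total_steps = total_training_steps(num_train_images=2000, batch_size=32, epochs=10)
print(total_steps)                                   # 630
print(schedule_fits(2000, 5000, total_steps))        # False
print(schedule_fits(0, total_steps, total_steps))    # True
```

With numbers like these, a schedule that begins at step 2000 would never start pruning, so it is safer to derive end_step from the dataset size and the number of epochs than to hard-code it.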
I am importing tensorflow_model_optimization.sparsity.keras as sparsity.
sparsity.prune_low_magnitude: a wrapper that adds pruning functionality to a layer and sparsifies the layer's weights during training. For example, using it with 85% sparsity ensures that 85% of the layer's weights are zero.
Now we know all the parameters needed to prune the neural network.
Building the CNN model with pruning
pruned_model = keras.models.Sequential([
    sparsity.prune_low_magnitude(keras.layers.Conv2D(filters=32, kernel_size=3), input_shape=[224, 224, 3], **pruning_params),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    sparsity.prune_low_magnitude(keras.layers.Conv2D(filters=64, kernel_size=3), **pruning_params),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    sparsity.prune_low_magnitude(keras.layers.Conv2D(filters=128, kernel_size=3), **pruning_params),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    sparsity.prune_low_magnitude(keras.layers.Conv2D(filters=256, kernel_size=3), **pruning_params),
    keras.layers.MaxPooling2D(pool_size=(2, 2)),
    keras.layers.Dropout(0.5),
    keras.layers.Flatten(),  # flatten the feature maps for the dense layers
    sparsity.prune_low_magnitude(keras.layers.Dense(units=128, activation='relu'), **pruning_params),  # first hidden dense layer
    keras.layers.Dropout(0.1),
    sparsity.prune_low_magnitude(keras.layers.Dense(units=256, activation='relu'), **pruning_params),  # second hidden dense layer
    keras.layers.Dropout(0.25),
    sparsity.prune_low_magnitude(keras.layers.Dense(units=4, activation='softmax'), **pruning_params)  # output layer
])
Compiling the model
pruned_model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])
Fitting the model
# UpdatePruningStep is required during training; PruningSummaries logs sparsity to TensorBoard
callbacks = [sparsity.UpdatePruningStep(), sparsity.PruningSummaries(log_dir=logdir, profile_batch=0)]
history_prun = pruned_model.fit(train_agumented_set, epochs=10, verbose=1,
                                validation_data=val_agumented_set, callbacks=callbacks)
sparsity.strip_pruning: once a model has been pruned to the required sparsity, this method removes the pruning wrappers and restores the original model structure while keeping the sparse weights.
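After stripping, it is a good idea to confirm how sparse each weight tensor really is. The sketch below shows one way to do that with NumPy. `sparsity_report` is a hypothetical helper, and the two small arrays are stand-ins for real weights; with a Keras model I would assume the dictionary could be built from the stripped model's weight tensors.

```python
import numpy as np

def sparsity_report(named_weights):
    """Map each tensor name to the fraction of its entries that are exactly zero."""
    return {name: float(np.mean(w == 0)) for name, w in named_weights.items()}

# Hypothetical stand-ins for the kernels of a pruned model.
demo_weights = {
    "conv_kernel": np.array([[0.0, 0.3], [0.0, 0.0]]),
    "dense_kernel": np.array([0.0, 0.0, 0.0, 0.5, 0.0]),
}

report = sparsity_report(demo_weights)
for name, fraction in report.items():
    print(f"{name}: {fraction:.0%} zeros")
```

If the reported fractions are close to zero for the pruned layers, the pruning schedule most likely never became active during training.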
(v) Comparison of both the model sizes :
Trained model (without pruning)
The .h5 file size is 61MB.
Trained model (with pruning)
The .h5 file size is 20MB.
The file size is reduced by about 3x with pruning.
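One point worth keeping in mind when measuring size: a dense array stores a zero with the same number of bytes as any other value, so sparsity pays off mainly once the weights are compressed. The following sketch illustrates this with gzip on random data. The arrays are synthetic stand-ins, not my model's weights, and `gzipped_size` is a hypothetical helper.

```python
import gzip
import numpy as np

def gzipped_size(array):
    """Size in bytes of the array's raw buffer after gzip compression."""
    return len(gzip.compress(array.tobytes()))

rng = np.random.default_rng(0)
dense = rng.normal(size=100_000).astype(np.float32)

# Zero out roughly 95% of the entries to imitate a heavily pruned layer.
sparse = dense.copy()
sparse[rng.random(sparse.size) < 0.95] = 0.0

dense_bytes = gzipped_size(dense)
sparse_bytes = gzipped_size(sparse)
print("raw bytes (both arrays):", dense.nbytes)
print("gzipped dense: ", dense_bytes)
print("gzipped sparse:", sparse_bytes)
```

Both arrays occupy the same space uncompressed, but the sparse one compresses to a much smaller size because long runs of zeros are easy to encode. Comparing compressed files is therefore a fairer way to judge what pruning saves.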
(vi) Comparison of both the model Performances:
If you compare the two plots, both models perform at almost the same level.
(vii) Comparison of both the model accuracy :
Trained model (without pruning)
Trained model (with pruning)
If you look at the test accuracies above, the difference between the models with and without pruning is only 7%.
If you want a smaller model, you need to sacrifice some percentage of accuracy.
(viii) Conclusions :
With the help of pruning, the model size is reduced by about 3x and performance stays similar to the original model, at the cost of some accuracy.
Now we can easily deploy the pruned model on the Heroku platform. Because of pruning, the model size is smaller and the model response time is also lower.