Pretrained Models

ajaymehta
4 min read · May 20, 2024


EfficientNetB4

import tensorflow as tf

backbone = tf.keras.applications.efficientnet.EfficientNetB4(
    include_top=False,
    weights="imagenet",
    input_shape=(CONFIGURATION["IM_SIZE"], CONFIGURATION["IM_SIZE"], 3),
)

backbone.trainable = False

from tensorflow.keras.layers import (
    Input, GlobalAveragePooling2D, Dense, BatchNormalization
)

pretrained_model = tf.keras.Sequential([
    Input(shape=(CONFIGURATION["IM_SIZE"], CONFIGURATION["IM_SIZE"], 3)),
    backbone,
    GlobalAveragePooling2D(),
    Dense(CONFIGURATION["N_DENSE_1"], activation="relu"),
    BatchNormalization(),
    Dense(CONFIGURATION["N_DENSE_2"], activation="relu"),
    Dense(CONFIGURATION["NUM_CLASSES"], activation="softmax"),
])
pretrained_model.summary()

Model Summary:

Train

from tensorflow.keras.losses import CategoricalCrossentropy
from tensorflow.keras.metrics import CategoricalAccuracy, TopKCategoricalAccuracy
from tensorflow.keras.optimizers import Adam

loss_function = CategoricalCrossentropy()

metrics = [
    CategoricalAccuracy(name="accuracy"),
    TopKCategoricalAccuracy(k=2, name="top_k_accuracy"),
]

pretrained_model.compile(
    optimizer=Adam(learning_rate=CONFIGURATION["LEARNING_RATE"] * 10),
    loss=loss_function,
    metrics=metrics,
)

history = pretrained_model.fit(
    training_dataset,
    epochs=20,
    validation_data=validation_dataset,
    verbose=1,
    callbacks=[checkpoint_callback],
)
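The `checkpoint_callback` passed to `fit` is not defined anywhere in the snippet. A typical definition saves only the best weights by validation accuracy; the filepath and monitored metric below are assumptions for illustration, not the author's exact values:

```python
import tensorflow as tf

# Hypothetical checkpoint setup: keep only the best weights as judged by
# validation accuracy, so a dropped session can restart from the best epoch.
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath="best.weights.h5",  # assumed path
    monitor="val_accuracy",
    save_best_only=True,
    save_weights_only=True,
)
```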

The provided training history reveals the performance of a neural network model over 20 epochs. Here’s a breakdown of the key points:

1. Initial Epochs (1–5):

  • The accuracy gradually increases from 0.1875 to 0.3178.
  • The loss decreases from 2.6383 to 1.7041.
  • The top-2 accuracy also improves from 0.3438 to 0.4983.
  • Validation accuracy shows some improvement, reaching 0.4398 in Epoch 5.

2. Mid Epochs (6–10):

  • The accuracy continues to increase slightly, reaching 0.3466 in Epoch 10.
  • The loss decreases further, indicating better convergence.
  • However, the validation accuracy fluctuates and doesn’t show significant improvement, hovering around 0.42–0.44.

3. Later Epochs (11–20):

  • The accuracy remains relatively stable, with minor fluctuations.
  • Validation accuracy improves slightly, reaching a peak of 0.4654 in Epoch 19.
  • However, the model doesn’t achieve a significant improvement in validation accuracy beyond Epoch 19.

Comparison:

  • The model starts with a lower accuracy compared to the previous model (0.1875 vs. 0.466).
  • Despite training for the same number of epochs, the model doesn’t perform as well as the previous one.
  • The validation accuracy also doesn’t show significant improvement, indicating potential stability issues or overfitting.
  • This model exhibits similar overfitting characteristics as the previous one, with the training accuracy significantly higher than the validation accuracy in later epochs.

In summary, while this model shows some improvement in accuracy and loss over the training period, it fails to outperform the previous model. Additionally, both models demonstrate signs of overfitting, as indicated by the increasing gap between training and validation accuracies. Further optimization or regularization techniques may be necessary to improve the model’s performance and generalization ability.
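One common regularization tweak, not shown in the code above, is adding dropout between the dense layers of the classification head. A minimal sketch (the layer sizes here are assumed placeholders, not the article's `CONFIGURATION` values; 1792 is EfficientNetB4's pooled feature size):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Sketch of a regularized head: dropout randomly zeroes a fraction of
# activations during training, discouraging co-adaptation and overfitting.
head = tf.keras.Sequential([
    layers.Input(shape=(1792,)),           # EfficientNetB4 pooled features
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.3),                   # drop 30% of activations
    layers.BatchNormalization(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(10, activation="softmax"),  # assumed class count
])
```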

Finetuning

The fine-tuned model exhibits a performance pattern similar to the initial model's, suggesting that the adjustments made during fine-tuning did not significantly change its overall behavior.

Transfer Learning with MobileNetV2

After employing Transfer Learning with MobileNetV2, it’s observed that the model’s performance does not show improvement from the outset. Instead, it stabilizes after the third epoch and maintains consistent but unsatisfactory accuracy levels until the eleventh epoch. Upon reviewing the performance from epochs 3 to 11, it becomes evident that MobileNetV2 is not ideally suited for our specific task or dataset.
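Swapping backbones only changes one call; the head and training loop stay the same. A sketch of the MobileNetV2 version (`IM_SIZE` is an assumed placeholder for the article's `CONFIGURATION["IM_SIZE"]`; `weights=None` is used here only to avoid a download in this sketch, whereas transfer learning would pass `weights="imagenet"`):

```python
import tensorflow as tf

IM_SIZE = 96  # assumed placeholder

# Same pattern as with EfficientNetB4; only the backbone call changes.
# weights=None keeps this sketch offline; use weights="imagenet" in practice.
backbone = tf.keras.applications.MobileNetV2(
    include_top=False,
    weights=None,
    input_shape=(IM_SIZE, IM_SIZE, 3),
)
backbone.trainable = False
```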

Final Model

In the final model, I opted for ResNet50 due to its consistent performance, yielding a commendable accuracy of 68% within just 36 epochs. However, my journey with model training wasn't entirely smooth sailing. Unfortunately, my Colab notebook disconnected twice during the process — once within the initial 20 epochs and again at the 37th epoch. Despite these interruptions, I managed to mitigate the setbacks by leveraging Weights & Biases for experiment tracking.

Using Weights & Biases allowed me to resume training seamlessly from where it left off, starting from the 21st epoch. My goal was to push the training to 50 epochs, but fate had other plans: another disconnection occurred during the 37th epoch, halting the process prematurely. Nonetheless, I attained a respectable accuracy of 68%.

In hindsight, I speculate that training the model for 100 epochs could yield even higher accuracy. However, considering the time and resource constraints, I’ve decided to conclude the experimentation here. Unfortunately, lacking access to a powerful GPU or system for further training prohibits me from extending the experiment.

In essence, while there's potential for further improvement, the logistical constraints and diminishing returns prompt me to halt the training at this juncture. Nevertheless, using Weights & Biases to track experiments was instrumental in maintaining continuity and provided valuable insights throughout the process.

Code link

Exp tracking
