Build your own NSFW Image Detector with Tensorflow

A two-part, step-by-step computer vision tutorial

Diko Sakti Prabowo
7 min read · Sep 14, 2023


In the age of the internet, the abundance of content also means an influx of potentially explicit or NSFW (Not Safe for Work) material. Detecting and filtering such content has become a pressing necessity for platforms and individuals alike. In this blog post, we’ll explore how to create an NSFW image detector using TensorFlow and Python. TensorFlow, a powerful machine learning framework, will be our tool of choice for this task, offering flexibility and scalability.

By the end of this guide, you’ll have not only a functioning NSFW image detector but also a solid understanding of how deep learning models work and how to apply them to real-world issues. Whether you’re a platform administrator striving to maintain a safe environment or someone eager to delve into the world of machine learning, join us on this journey to enhance online safety and security.

Prerequisites

The full notebook is available here. The required libraries are listed in requirements.txt and can be installed with:

pip install -r requirements.txt

You can get the dataset by scraping the URLs in the raw_data folder. (Fair warning: it is very messy!)
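
The repository doesn't ship a ready-made downloader, so below is a minimal sketch of one way to fetch the images. It assumes raw_data holds one plain-text file of URLs per class, with the file name used as the class folder under images/ (the layout image_dataset_from_directory expects); adjust it to the actual layout of the folder.

import os
import requests

RAW_DIR = "raw_data"  # assumed: one <class>.txt file of image URLs per class
OUT_DIR = "images"    # one sub-folder per class, as image_dataset_from_directory expects

for url_file in os.listdir(RAW_DIR):
    class_dir = os.path.join(OUT_DIR, os.path.splitext(url_file)[0])
    os.makedirs(class_dir, exist_ok=True)
    with open(os.path.join(RAW_DIR, url_file)) as f:
        urls = [line.strip() for line in f if line.strip()]
    for i, url in enumerate(urls):
        try:
            resp = requests.get(url, timeout=10)
            resp.raise_for_status()
            with open(os.path.join(class_dir, f"{i}.jpg"), "wb") as out:
                out.write(resp.content)
        except requests.RequestException:
            continue  # many scraped URLs are dead; just skip failures

With the images in place, we can import everything we need: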

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
import os
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

Split Data into train-validation-test

data_dir = os.path.join(os.getcwd(),'images')

batch_size = 32
img_height = 180
img_width = 180

train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)

# The class labels are inferred from the sub-folder names
class_names = train_ds.class_names

# Carve a test set out of the validation split: 1/5 of the validation
# batches become test data, the rest remain for validation
val_batches = tf.data.experimental.cardinality(val_ds)
test_ds = val_ds.take(val_batches // 5)
val_ds = val_ds.skip(val_batches // 5)

Use this block of code to get a preview of the images. I'm warning you, it's NSFW!!

plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")

Configure Dataset for Performance

When you're working with a lot of images, it's important to load them in a way that doesn't stall training. We want buffered prefetching so data can be read from disk without blocking the model. There are two key techniques to use when loading data:

1. Dataset.cache: This keeps images in memory after they're loaded from disk during the first training epoch, so later epochs don't pay the disk-read cost again. If your dataset is too large to fit in memory, you can also use this method to create an efficient on-disk cache (see the sketch after the code below).

2. Dataset.prefetch: This overlaps data preprocessing and model execution, preparing the next batch while the current one trains, which further improves training efficiency.

AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
test_ds = test_ds.cache().prefetch(buffer_size=AUTOTUNE)
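
If the dataset outgrows RAM, Dataset.cache also accepts a file path and will spill the cache to disk instead of memory. A minimal sketch, assuming you're happy to keep the cache files next to the notebook (the path is an arbitrary choice):

# Passing a filename makes the cache on-disk instead of in-memory;
# TensorFlow writes the cache files at this path during the first epoch
train_ds = train_ds.cache("train.cache").shuffle(1000).prefetch(buffer_size=AUTOTUNE)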

Create Basic Model

The Keras Sequential model consists of three convolutional blocks (tf.keras.layers.Conv2D), each followed by a tf.keras.layers.MaxPooling2D layer for down-sampling.

On top of the convolutional stack sits a fully-connected layer (tf.keras.layers.Dense) with 128 units, activated by the Rectified Linear Unit ('relu') function.

It’s important to note that our model hasn’t undergone extensive fine-tuning for optimal accuracy. Our primary objective in this tutorial is to demonstrate a standard architectural approach, not to achieve peak performance.

RGB channel values fall in the [0, 255] range, but neural networks generally train best when input values are small.

In this instance, we'll rescale the inputs to the [0, 1] range using a tf.keras.layers.Rescaling layer.

num_classes = len(class_names)

model = Sequential([
    layers.Rescaling(1./255, input_shape=(img_height, img_width, 3)),
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes)
])

Compile Model

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

Train Model

epochs = 10
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)

Visualize Training Results

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

The graphs show that the model is not doing well on unseen data: training accuracy climbs steadily, while validation accuracy plateaus at around 60% and barely moves. This gap between training and validation performance is a classic sign of overfitting.

Overfitting happens when the model memorizes the training data, including noise and incidental details it shouldn't learn, which makes it struggle when it encounters new, unfamiliar data.

We can confirm the overfitting issue by evaluating the model with unseen test data.

loss, accuracy = model.evaluate(test_ds)
print('Test accuracy :', accuracy)

The model's accuracy on test data is much closer to the validation accuracy than to the training accuracy, which confirms the overfitting. To fix this, we can apply techniques like data augmentation and dropout, which we'll cover below. For now, let's examine the model's behaviour on the test data by plotting its confusion matrix.

# Collect ground-truth labels and model predictions for the test set
y_test = np.concatenate([y for x, y in test_ds], axis=0)
y_test = list(map(lambda x: class_names[x], y_test))
y_prob = model.predict(test_ds)
y_pred = y_prob.argmax(axis=1)
y_pred = list(map(lambda x: class_names[x], y_pred))
ConfusionMatrixDisplay.from_predictions(y_test, y_pred, normalize='true')
[Figure: normalized confusion matrix]

The confusion matrix above is normalized by row, so each cell shows the fraction of a true class's images that received each predicted label. Some interesting takeaways:

1. The model often misclassifies hentai as drawings, and vice versa

2. The model does poorly at identifying porn images

3. The model struggles to separate porn, sexy and neutral images

Below is the confusion matrix with raw image counts:

ConfusionMatrixDisplay.from_predictions(y_test, y_pred)

Data Augmentation

Overfitting often happens when you don’t have many training examples. Data augmentation tackles this problem by creating more training data from the ones you already have. It does this by making small changes to your existing pictures to create new ones that still look realistic. This way, your model gets to see different versions of the same data and becomes better at understanding it.

To do this, we'll use the Keras preprocessing layers tf.keras.layers.RandomFlip, tf.keras.layers.RandomRotation, and tf.keras.layers.RandomZoom.

data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal",
                          input_shape=(img_height,
                                       img_width,
                                       3)),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),
    ]
)
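
To sanity-check what the augmentation pipeline actually produces, you can apply it repeatedly to one training batch and plot the results. A quick sketch; passing training=True forces the random transforms on when the layers are called outside of model.fit:

plt.figure(figsize=(10, 10))
for images, _ in train_ds.take(1):
    for i in range(9):
        # Each call draws a fresh random flip/rotation/zoom
        augmented_images = data_augmentation(images, training=True)
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(augmented_images[0].numpy().astype("uint8"))
        plt.axis("off")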

Layer Dropout

Another way to tackle overfitting is to add dropout regularization to the network.

A dropout layer randomly turns off (sets to zero) a fraction of the units in the preceding layer during training. The fraction is set by a small rate like 0.1, 0.2, or 0.4; a rate of 0.1, for example, randomly drops 10% of the units.

To try this out, add a tf.keras.layers.Dropout layer to the network and train it on the augmented images; together, these should reduce overfitting.

model = Sequential([
    data_augmentation,
    layers.Rescaling(1./255),
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Dropout(0.2),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(num_classes, name="outputs")
])

Compile and train

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

epochs = 20
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)

Visualize Training Results

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

From the graphs above, we can see that the gap between training and validation has narrowed, although the model still overfits and overall performance has not improved. Let's investigate the confusion matrix!

# Recompute predictions first, since y_pred still holds the old model's output
y_pred = list(map(lambda x: class_names[x], model.predict(test_ds).argmax(axis=1)))
ConfusionMatrixDisplay.from_predictions(y_test, y_pred)

There's a tradeoff in performance: while the model does better at classifying drawings, it gets slightly worse on the other classes and significantly worse on porn images.
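
As a final sanity check, here is a minimal sketch of running the trained classifier on a single new image. The file name some_image.jpg is a placeholder, and we apply softmax ourselves because the model's last layer outputs raw logits:

img = tf.keras.utils.load_img("some_image.jpg", target_size=(img_height, img_width))
img_array = tf.keras.utils.img_to_array(img)
img_array = tf.expand_dims(img_array, 0)  # the model expects a batch, so make a batch of one

predictions = model.predict(img_array)
scores = tf.nn.softmax(predictions[0])  # convert logits to probabilities
print("This image is most likely '{}' ({:.1f}% confidence)".format(
    class_names[np.argmax(scores)], 100 * np.max(scores)))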

Conclusion

Congratulations! We have created a basic custom neural network to classify NSFW images and applied some techniques to handle overfitting. However, it's clear that our model still struggles with detecting porn images.

Instead of starting from scratch, let’s dive into a powerful technique called transfer learning. This involves using a pre-trained model and adapting it to our dataset.

In the next part, we’ll walk through the steps to implement transfer learning and build a model with significantly better performance.
