Image Preprocessing using Python

12 min read · Jun 8, 2024

Introduction

In machine learning and computer vision, the quality of a model’s output depends heavily on the quality of its input data, so image preprocessing is a step that can significantly influence model performance. The preprocessing steps you need for a given dataset depend on a) the type of data you are looking at (text, images, time series, …) and b) the models you want to train.

This blog will walk you through the essentials of image data preprocessing in Python, using popular libraries like OpenCV, PIL, and TensorFlow.

Sections

What is image Preprocessing
Tools and Libraries
Image Preprocessing steps
Conclusion

Section 1- Basics of Image Processing

What is preprocessing?

Def: Preprocessing describes the process of cleaning and converting a ‘raw’ (i.e. unprocessed) dataset into a clean dataset.

Def: Image preprocessing is the process of manipulating raw image data into a usable and meaningful format. It allows you to eliminate unwanted distortions and enhance specific qualities essential for computer vision applications.

Def: Image processing involves manipulating and analyzing digital images to enhance their quality or extract meaningful information. This field intersects with computer vision, enabling machines to interpret visual data.

Why Image Data Preprocessing?

Before feeding images into a machine learning model, preprocessing is necessary for several reasons:

1- Normalization: Ensures that pixel values are within a specific range.

2- Resizing: Standardizes the input size for uniformity.

3- Augmentation: Increases the diversity of the training data without actually collecting new data.

4- Noise Reduction: Removes unwanted artifacts that can distort the image.

Section 2- Tools and Libraries

Several libraries make it easier to preprocess images, for example scikit-image, OpenCV or Pillow. Each library offers different functionality and has its own pros and cons. Most of the examples in this notebook use scikit-image, with OpenCV, Pillow and TensorFlow alternatives shown alongside.

Python offers several libraries to handle image data preprocessing:

1- OpenCV: A powerful open-source library for computer vision tasks, providing various tools for image and video processing.

2- PIL (Pillow): Pillow is a fork of the Python Imaging Library (PIL) that offers simple image processing capabilities.

3- TensorFlow: An end-to-end open-source platform for machine learning that includes preprocessing utilities.

4- scikit-image: A collection of algorithms for image processing, built on NumPy and SciPy.

Let’s dive into some common preprocessing steps using these libraries.

Section 3- Techniques

Common image processing techniques include:

Filtering: Enhances image quality by reducing noise and sharpening details.
Thresholding: Converts images into binary format for easier analysis.
Edge Detection: Identifies boundaries within images, crucial for object recognition.
Morphological Operations: Modifies the structure of images to extract relevant features.
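
To make these four techniques more concrete, here is a minimal NumPy-only sketch on a tiny synthetic image. The image, the 3x3 neighbourhood size and the threshold of 0.5 are illustrative assumptions; in practice you would more likely use the ready-made functions of OpenCV or scikit-image.

```python
import numpy as np

# Synthetic 8x8 grayscale image: dark background with a bright 4x4 square
gray = np.zeros((8, 8), dtype=np.float64)
gray[2:6, 2:6] = 1.0

# Filtering: 3x3 mean (box) filter built from shifted copies of a padded image
padded = np.pad(gray, 1, mode="edge")
smoothed = sum(
    padded[dy:dy + 8, dx:dx + 8] for dy in range(3) for dx in range(3)
) / 9.0

# Thresholding: pixels above the threshold become True (foreground)
binary = gray > 0.5

# Edge detection: the gradient magnitude is non-zero where intensity changes
gy, gx = np.gradient(gray)
edges = np.hypot(gx, gy) > 0

# Morphological dilation: a pixel is set if any pixel in its 3x3 neighbourhood is set
padded_bin = np.pad(binary, 1, mode="constant", constant_values=False)
dilated = np.zeros_like(binary)
for dy in range(3):
    for dx in range(3):
        dilated |= padded_bin[dy:dy + 8, dx:dx + 8]

print(binary.sum(), dilated.sum())
```

The dilated square is one pixel larger on every side than the thresholded one, which is the kind of structural change morphological operations are used for.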

Section 4- Image Data Preprocessing Steps

As mentioned already, the preprocessing steps you will need for your dataset depend on the nature of the dataset and models you want to train. Possible preprocessing steps for images are:

Data Loading
Properties of Image
Turn our image object into a NumPy array
Rescale the images
Normalizing pixel values
Converting to grayscale
Data augmentation
Noise Reduction

Step 1- Data Loading

Data loading means reading images from wherever they are stored into memory so that they can be inspected and processed. You can do this with tools such as scikit-image, PIL or OpenCV; OpenCV, for instance, can load images in formats like PNG, JPG, TIFF, and BMP. The example below collects a folder of JPG files and displays a few of them with scikit-image:

from google.colab import drive
drive.mount('/content/gdrive')

import glob
import os
import random
import matplotlib
import warnings
import numpy as np
import matplotlib.pyplot as plt
from skimage import io
from skimage import img_as_float
from skimage.transform import resize, rotate
from skimage.color import rgb2gray
%matplotlib inline
warnings.simplefilter('ignore')

# Create a list of all images (replace with your actual Google Drive path)
root_path = '/content/gdrive/MyDrive/Datasets (1)/Image Preprocessing/images'
print("Root path:", root_path)
all_images = glob.glob(root_path + '/*.jpg')
print("All images:", all_images)

# To avoid memory errors we will only use a subset of the images
# (at most 500; min() avoids a ValueError when fewer images are available)
all_images = random.sample(all_images, min(500, len(all_images)))

# Plot a few images
i = 0
fig = plt.figure(figsize=(10, 10))
for img_path in all_images[:4]:
    img_arr = io.imread(img_path)
    i += 1
    ax = fig.add_subplot(2, 2, i)
    ax.imshow(img_arr)
    ax.set_title(f"Image example {i}")

Using OpenCV

import cv2

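# Load an image using OpenCV (imread returns None if the file cannot be read)
image = cv2.imread('/content/gdrive/MyDrive/Datasets (1)/Image Preprocessing/dog.jfif')

# OpenCV stores channels in BGR order; convert to RGB so Matplotlib shows the right colors
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
plt.imshow(image_rgb)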
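
One detail worth knowing here: OpenCV’s imread returns the color channels in BGR order, while Matplotlib’s imshow expects RGB, so an unconverted image is displayed with red and blue swapped. The sketch below uses a small synthetic array as a stand-in for a loaded image (an assumption, so that it runs without a file) and reverses the channel axis with plain NumPy; cv2.cvtColor(image, cv2.COLOR_BGR2RGB) achieves the same result.

```python
import numpy as np

# Stand-in for the array cv2.imread returns: a 2x2 image that is pure blue in BGR order
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255  # channel 0 is blue in BGR

# Reverse the last axis to get RGB order, which is what plt.imshow expects
rgb = bgr[..., ::-1]

print(bgr[0, 0], rgb[0, 0])
```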

Using PIL


from PIL import Image

# Load an image using PIL
image = Image.open('/content/gdrive/MyDrive/Datasets (1)/Image Preprocessing/dog.jfif')
plt.imshow(image)

Loading Images with Matplotlib

# The alias avoids shadowing the `image` variable used in the other examples
from matplotlib import image as mpimg
import matplotlib.pyplot as plt

img = mpimg.imread('/content/gdrive/MyDrive/Datasets (1)/Image Preprocessing/dog.jfif')

#print(type(img), img.shape)

plt.imshow(img)

Step 2- Properties of Image

# img is still the NumPy array loaded with Matplotlib above; .size is its total number of elements
print(type(img), img.size)

!pip install Pillow
from PIL import Image
import matplotlib.pyplot as plt

img = Image.open('/content/gdrive/MyDrive/Datasets (1)/Image Preprocessing/dog.jfif')

print("Image File Name:", img.filename)

print("Shape/Size of Image:", img.size) # (Width, Height in pixels)

print("Image Mode:", img.mode)

print("Image Format:", img.format)

# If you still want to work with the image as a NumPy array for plotting, you can convert it:
img_array = np.array(img)
plt.imshow(img_array)
plt.show()

Requirement already satisfied: Pillow in /usr/local/lib/python3.10/dist-packages (10.4.0)
Image File Name: /content/gdrive/MyDrive/Datasets (1)/Image Preprocessing/dog.jfif
Shape/Size of Image: (250, 181)
Image Mode: RGB
Image Format: JPEG

Step 3- Turn our image object into a NumPy array

# Turn our image object into a NumPy array
img_arr = np.array(img)

print("PIL Image:")
print("Type:", type(img))
print("Shape/Size:", img.size)
print()
print("After converting PIL image to Numpy Array:")
print("Type:", type(img_arr))
print("Shape/Size:", img_arr.shape)

PIL Image:
Type: <class 'PIL.JpegImagePlugin.JpegImageFile'>
Shape/Size: (640, 452)

After converting PIL image to Numpy Array:
Type: <class 'numpy.ndarray'>
Shape/Size: (452, 640, 3)

plt.figure(figsize=(12, 12))

plt.subplot(1, 4, 1)
plt.imshow(img_arr)

plt.subplot(1, 4, 2)
plt.imshow(img_arr[:,:,0], cmap='Reds')

plt.subplot(1, 4, 3)
plt.imshow(img_arr[:,:,1], cmap='Greens')

plt.subplot(1, 4, 4)
plt.imshow(img_arr[:,:,2], cmap='Blues')
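
Note that PIL reports the size as (width, height), whereas the NumPy array has shape (height, width, channels). As a small sketch of how that layout is indexed, the example below uses a synthetic array with the same shape as the one printed above (the pixel values are made up):

```python
import numpy as np

# Synthetic stand-in with the same shape as the converted image
img_arr = np.zeros((452, 640, 3), dtype=np.uint8)
img_arr[:, :, 0] = 200  # fill the red channel only

# Axis 0 is the height (rows), axis 1 the width (columns), axis 2 the channel
height, width, channels = img_arr.shape

# Each channel is a 2-D array of the same height and width
red = img_arr[:, :, 0]
green = img_arr[:, :, 1]
blue = img_arr[:, :, 2]

print(height, width, channels)
print(red.shape, red.mean(), green.mean(), blue.mean())
```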

Step 4- Rescale the images

Resizing images changes their dimensions (height and width) to a standard size, which is crucial for ensuring uniform input sizes for machine learning models.

This step is important because most neural networks require fixed input dimensions.

The images displayed above show that the dataset contains images of various sizes. So, as a first preprocessing step, we will make sure that all images have the same height and width. When choosing an appropriate size we should keep in mind that bigger images mean higher computational requirements (both in memory and in number of operations).

As a first step we should figure out the dimensions of our images.

all_sizes = [io.imread(img).shape for img in all_images]

heights = [img_shape[0] for img_shape in all_sizes]
widths = [img_shape[1] for img_shape in all_sizes]

print(f"Minimum image height: {min(heights)}")
print(f"Maximum image height: {max(heights)}")
print()
print(f"Minimum image width: {min(widths)}")
print(f"Maximum image width: {max(widths)}")

We will resize the images to 256×256 pixels using scikit-image (other shapes would be fine, too). The images won’t be cropped but up-sized or down-sized using interpolation.
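
To illustrate what resizing by interpolation means, here is a minimal sketch of the simplest variant, nearest-neighbour sampling, in plain NumPy. The helper name resize_nearest is hypothetical; scikit-image’s resize uses smoother interpolation by default and returns floats, so its output will differ from this sketch.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Resize an (H, W) or (H, W, C) array with nearest-neighbour sampling."""
    in_h, in_w = img.shape[:2]
    # For each output row/column, compute the index of the source pixel it maps to
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    # Broadcasting rows (as a column vector) against cols selects the full grid
    return img[rows[:, None], cols]

small = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)
up = resize_nearest(small, 8, 12)    # up-size by a factor of 2
down = resize_nearest(small, 2, 3)   # down-size by a factor of 2

print(small.shape, up.shape, down.shape)
```

Up-sizing repeats source pixels and down-sizing skips some of them; no part of the image is cropped away.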

Further, for simplicity, we will skip images that have fewer or more than 3 color channels (i.e. images whose mode is not RGB). As a quick reminder:

RGB is a 3-channel format corresponding to the channels red, green and blue. RGBA is a 4-channel format corresponding to red, green, blue and alpha. The alpha channel controls how transparent or translucent each pixel is.

Note: make sure to create a folder named “resized_images”, otherwise the code below will raise an error!

# Folder for the resized images (a sibling of the original image folder)
resized_path = '/content/gdrive/MyDrive/Datasets (1)/Image Preprocessing/resized_images'

for img_path in all_images:
    # Create a new image name to save the resized image
    img_name = img_path.split('/')[-1]
    img_name = os.path.splitext(img_name)
    resized_name = img_name[0] + '_resized' + img_name[1]
    save_path = os.path.join(resized_path, resized_name)

    img = io.imread(img_path)

    if img.ndim != 3 or img.shape[2] != 3:
        continue

    resized_img = resize(img, output_shape=(256, 256))

    # Convert the resized image to uint8 before saving
    resized_img = (resized_img * 255).astype('uint8')  # Scale and convert to uint8

    io.imsave(save_path, resized_img)

all_images = glob.glob(resized_path + '/*')

# Plot a few images
fig = plt.figure(figsize=(10, 10))

i = 0
for img_path in all_images[:4]:
    img_arr = io.imread(img_path)
    i += 1
    ax = fig.add_subplot(2, 2, i)
    ax.imshow(img_arr)
    ax.set_title(f"Resized image example {i}")

Using OpenCV


# Resize the image to 224x224 pixels
resized_image = cv2.resize(image, (224, 224))

Using PIL

# Resize the image to 224x224 pixels
resized_image = image.resize((224, 224))

Step 5- Normalizing pixel values

Normalizing pixel values means rescaling the pixel intensities of an image to a fixed range such as [0, 1] or [-1, 1]. For the [0, 1] range this is done by dividing each pixel value by the maximum possible value (e.g., 255 for an 8-bit image). Normalization keeps the input features on a consistent scale, which helps machine learning models train faster and more stably.

A more thorough form of normalization, often called standardization, has two steps:

Mean subtraction: in the case of images this often refers to subtracting the mean computed over all images from each pixel. The mean value can be computed over all three channels or for each channel individually. This is often described as having the “geometric interpretation of centering the cloud of data around the origin along every dimension”.

Divide by standard deviation: This step is not strictly necessary for images because the relative pixel scales are already approximately equal. Nevertheless, we will include this step for completeness.

# To compute the mean and standard deviation over all images
# we need to combine them in one big array
big_list = []

for img_path in all_images:
    big_list.append(io.imread(img_path))

all_imgs = np.array(big_list)

# The image pixels are uint8. To compute a mean we
# convert the pixel values to floats
all_imgs_float = img_as_float(all_imgs)

# Mean subtraction
mean = np.mean(all_imgs_float, axis=0)
all_imgs_float -= mean

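# Dividing by standard deviation
# (a small epsilon avoids division by zero for pixels that are constant across images)
std = np.std(all_imgs_float, axis=0)
all_imgs_float /= (std + 1e-8)

# The values now fall outside [0, 1], so imshow clips them for display
fig = plt.figure(figsize=(12, 12))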

for i in range(9):
    ax = fig.add_subplot(3, 3, i+1)
    ax.imshow(all_imgs_float[i])
    ax.set_title(f"Normalized image example {i+1}")
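
The code above computes one mean and one standard deviation per pixel position. As a sketch of the per-channel alternative mentioned earlier, the example below standardizes a batch with a single mean and standard deviation per color channel. The random batch and the epsilon value are assumptions for illustration, not the dataset used in this post.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical batch: 10 RGB images of 32x32 pixels with values in [0, 1]
batch = rng.random((10, 32, 32, 3))

# One mean and one standard deviation per color channel
channel_mean = batch.mean(axis=(0, 1, 2))
channel_std = batch.std(axis=(0, 1, 2))

eps = 1e-8  # guards against division by zero for constant channels
standardized = (batch - channel_mean) / (channel_std + eps)

print(channel_mean.shape, standardized.shape)
```

Because the statistics have shape (3,), NumPy broadcasts them over the batch, height and width axes.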

Using OpenCV

# Normalize pixel values to [0, 1]
normalized_image = image / 255.0
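
The [-1, 1] range mentioned earlier can be reached from the [0, 1] range with one extra step. A minimal sketch on a made-up 2x2 array of 8-bit pixel values:

```python
import numpy as np

pixels = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Scale to [0, 1] by dividing by the largest 8-bit value
unit_range = pixels.astype(np.float32) / 255.0

# Scale to [-1, 1] by stretching [0, 1] to a width of 2 and shifting down by 1
signed_range = unit_range * 2.0 - 1.0

print(unit_range.min(), unit_range.max())
print(signed_range.min(), signed_range.max())
```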

Step 6- Converting to grayscale

Converting color images to grayscale can simplify your image data and reduce computational needs for some algorithms.

Converting the images to grayscale is very easy with scikit-image.

gray_images = rgb2gray(all_imgs)

fig = plt.figure(figsize=(10, 10))
for i in range(4):
    ax = fig.add_subplot(2, 2, i+1)
    ax.imshow(gray_images[i], cmap='gray')
    ax.set_title(f"Grayscale image example {i+1}")

Converting between color spaces: You may need to convert images between color spaces like RGB, BGR, HSV, and Grayscale. This can be done with OpenCV or Pillow. For example, to convert BGR to Grayscale in OpenCV, use:

gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

Or to convert RGB to HSV in Pillow:

# Here `image` must be a PIL Image, not a NumPy array
image = image.convert('HSV')
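
Under the hood, an RGB-to-grayscale conversion is a weighted sum of the three channels. The sketch below uses the luminance weights documented for scikit-image’s rgb2gray (other libraries use slightly different weights) on a made-up 2x2 image:

```python
import numpy as np

# Luminance weights documented for scikit-image's rgb2gray (R, G, B)
weights = np.array([0.2125, 0.7154, 0.0721])

rgb = np.zeros((2, 2, 3), dtype=np.float64)
rgb[0, 0] = [1.0, 1.0, 1.0]  # white
rgb[0, 1] = [1.0, 0.0, 0.0]  # pure red
rgb[1, 0] = [0.0, 1.0, 0.0]  # pure green
rgb[1, 1] = [0.0, 0.0, 1.0]  # pure blue

# A weighted sum over the channel axis gives one intensity per pixel
gray = rgb @ weights

print(gray)
```

Green contributes most to the perceived brightness and blue least, which is why the three weights are not equal.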

Step 7- Data augmentation

Image augmentation is a technique to artificially increase the size of a dataset by creating modified versions of images. Common augmentations include rotation, flipping, and zooming.

First of all: why do we need data augmentation?

The performance of a machine learning algorithm depends heavily on the amount and quality of the data it is trained with. In most cases, the more data a machine learning algorithm has access to, the more effective it can be. However, most of the time, we only have access to a small amount of data with sufficient quality. So, if we augment our dataset in a useful way we can improve the performance of our model without having to gather a larger dataset.

Furthermore, augmenting the dataset can make our model more robust. For example, consider the task of image classification. Let’s say we want to classify the breed of dog/cat shown in each image of our dataset. Our training set will contain only a limited number of images of each breed, and each breed will be shown under a limited set of conditions. However, our test set (or real-world application) may contain images of dogs and cats in a large variety of conditions. The images could be taken from various angles, locations, lighting conditions, etc. By augmenting our training set with small variations of the original images, we can allow our model to account for such variations.

Images can be augmented in various ways, for example using:

1- rotation

2- translation

3- rescaling

4- flipping

5- stretching etc.
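
Two of these augmentations, flipping and translation, are simple enough to sketch in plain NumPy. The tiny single-channel array below is a made-up stand-in for an image, and the one-pixel shift with zero fill is just one possible way to translate.

```python
import numpy as np

img = np.arange(12, dtype=np.uint8).reshape(3, 4)  # tiny single-channel image

# Flipping: mirror the image left-to-right
flipped = np.fliplr(img)

# Translation: shift the content one pixel to the right, filling the gap with zeros
shifted = np.zeros_like(img)
shifted[:, 1:] = img[:, :-1]

print(img[0], flipped[0], shifted[0])
```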

Using scikit-image

Most of these tasks can be performed easily with scikit-image or one of the other image processing libraries. Let’s look at rotation as an example.

fig = plt.figure(figsize=(10, 10))

for i in range(4):
    random_angle = np.random.randint(low=0, high=360)
    rotated_image = rotate(all_imgs[i], angle=random_angle)
    ax = fig.add_subplot(2, 2, i+1)
    ax.imshow(rotated_image)
    ax.set_title(f"Randomly rotated image example {i+1}")

Using TensorFlow

TensorFlow provides a high-level API for image augmentation.


from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Create an image data generator with augmentation
datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Assuming 'image' is a numpy array of shape (height, width, channels)
image = image.reshape((1, ) + image.shape)  # Reshape image for the generator

# Generate batches of augmented images
for batch in datagen.flow(image, batch_size=1):
    augmented_image = batch[0]
    break  # To generate one augmented image

Step 8- Noise Reduction

Reducing noise can help improve the clarity of the image and the performance of the model.

Smoothing, blurring, and filtering techniques can be applied to remove unwanted noise from images. OpenCV’s GaussianBlur() and medianBlur() functions are commonly used for this.

Using OpenCV

import cv2

# Assuming 'all_imgs' is a list of images, iterate over each image:
denoised_images = []
for img in all_imgs:
    denoised_image = cv2.GaussianBlur(img, (5, 5), 0)
    denoised_images.append(denoised_image)

# Plot a few of the denoised images
fig = plt.figure(figsize=(10, 10))

for i in range(4):
    ax = fig.add_subplot(2, 2, i+1)
    ax.imshow(denoised_images[i])
    ax.set_title(f"Denoised image example {i+1}")
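
To show why a median filter is effective against isolated noisy pixels, here is a minimal NumPy sketch of a 3x3 median filter on a made-up image with a single "salt" pixel. The helper name median_filter_3x3 is hypothetical; in practice cv2.medianBlur does this job.

```python
import numpy as np

def median_filter_3x3(gray):
    """Replace each pixel by the median of its 3x3 neighbourhood."""
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")
    # Stack the nine shifted views, then take the median across them
    windows = np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    )
    return np.median(windows, axis=0).astype(gray.dtype)

noisy = np.full((5, 5), 100, dtype=np.uint8)
noisy[2, 2] = 255  # a single "salt" pixel
cleaned = median_filter_3x3(noisy)

print(noisy[2, 2], cleaned[2, 2])
```

The outlier never becomes the median of its neighbourhood, so it is removed completely, whereas a Gaussian blur would only spread it out.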

Step 9- Prepare Data for Training

Prepare the images and labels for training the model. Ensure that the images are in the correct format (e.g., RGB or grayscale) and shape.

# Check the shape of a single image
print(f"Shape of a single image: {denoised_images[0].shape}")

# Convert grayscale images to RGB format if needed
if denoised_images[0].ndim == 2:  # If the image is grayscale
    denoised_images_rgb = [np.stack([img, img, img], axis=-1) for img in denoised_images]
else:
    denoised_images_rgb = denoised_images

# Verify the new shape
print(f"Shape of a single image (converted to RGB): {denoised_images_rgb[0].shape}")

# Convert the list to a numpy array
X = np.array(denoised_images_rgb)
print(f"Shape of X (all images): {X.shape}")

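# Check the shape of the labels
# NOTE: the labels `y` are not created anywhere in this walkthrough. Load them
# separately (one label per image, in the same order as X) before running this cell.
y = np.array(y)
print(f"Shape of y (labels): {y.shape}")

from sklearn.model_selection import train_test_split

# X was already converted to a NumPy array above, so only the split remains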

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Verify shapes
print(f"Shape of X_train: {X_train.shape}")
print(f"Shape of y_train: {y_train.shape}")
print(f"Shape of X_val: {X_val.shape}")
print(f"Shape of y_val: {y_val.shape}")
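
For intuition, the sketch below shows roughly what a shuffled 80/20 split amounts to, using plain NumPy. The image array and the labels are placeholders invented for the example, and the exact shuffling performed by train_test_split will differ.

```python
import numpy as np

rng = np.random.default_rng(42)

# Placeholder data: 10 tiny "images" and made-up binary labels
X = np.zeros((10, 8, 8, 3), dtype=np.uint8)
y = np.array([0, 1] * 5)

# Shuffle the indices once, then cut at the 80/20 boundary
indices = rng.permutation(len(X))
n_val = int(len(X) * 0.2)
val_idx, train_idx = indices[:n_val], indices[n_val:]

# Index images and labels with the same indices so they stay aligned
X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]

print(X_train.shape, y_train.shape, X_val.shape, y_val.shape)
```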

Conclusion

Effective image data preprocessing is a critical step in building robust and high-performing machine learning models. By leveraging libraries such as OpenCV, PIL, and TensorFlow, you can streamline this process and ensure your images are in the best possible shape for model training. Whether it’s resizing, normalizing, augmenting, or reducing noise, each step enhances the quality of your input data and, consequently, the performance of your model.
