Have Photos, Will Model

Use a Keras ImageDataGenerator to get your images into your neural network

Jesse Markowitz
CodeX
7 min read · Oct 12, 2021


How are the images supposed to fit in all those little circles???

In theory, data science is easy, right?

Step 1: Get the data

Step 2: Clean and preprocess the data

Step 3: Plug that squeaky clean data into a model

Step 4: Iteratively tweak the model to improve its performance

Step 5: Deploy the model, present your work, and gather accolades

Of course, things rarely work out so smoothly. But after a couple of rounds of tinkering with some regression and classification models, I felt like I was starting to see how some of the big parts fit together. It was making sense. But recently I tried an image classification project using neural networks and boy howdy did the game change.

Suddenly a rare sighting in my Jupyter Notebooks | Photo by Lukas W. on Unsplash

Wait, what do you mean I don’t need Pandas?

The hardest part of trying something new is often Step 1, when everything is unfamiliar. Gone were my usual tools. I didn’t even import pandas as pd, let alone pd.read_csv() into the warm, loving, neatly organized embrace of a DataFrame. I even said goodbye to the comforting, repetitive .fit_transform() of Scikit-Learn models and their clearly written documentation.

Instead, I turned images into arrays of numbers and wrangled NumPy tensors into the nodes of TensorFlow/Keras models. The goal of the project was to train a model to classify images of chest X-rays as normal vs. positive for pneumonia. I had a dataset ready from Kaggle, but I had some major questions:

  • Where did the data go?
  • How was I supposed to get an image into my code in Jupyter Notebook?
  • How were the images supposed to get into my model?

My questions felt too broad and basic to even put into Google’s search bar or find on Stack Overflow. While there are many helpful resources online for those who are in the thick of training and improving a model, it can be harder to find help for taking a first step. So here’s a little help with that, a couple of things I learned along the way that I had some trouble finding out.

That being said, this post is by no means a comprehensive guide — I’m sure there are not only different ways to handle image data than the one I show here, but better ones too. (Please tell me about them in the comments!) This is for anyone who wants to try their hand at an image classification project, who has maybe even found a dataset available online, and who doesn’t know what to do next.

Me, confused and hoping I wouldn’t break my code. | Photo by Hello I'm Nik on Unsplash

Where does the data go?

The answer to this first question is surprisingly easy: download the images to your computer. Oh sure, there are other ways to feed the images into your code, but we’re going to keep things simple for now.

Once we’ve downloaded the dataset, we’ll need to organize the images into a training and testing (holdout) set. That’s right, no train_test_split()! We’re not picking random rows out of a spreadsheet; we’re separating images into different folders. Here’s what it looks like at the highest level with the X-ray image data I referenced above:

Folder structure for my project

The train folder has a few thousand images, while test has a few hundred. Each of those folders is split into separate folders for each class in the dataset, in my case NORMAL and PNEUMONIA:

Note: The Kaggle dataset I linked to above came with a validation folder, but it only had 16 images so I moved those into the train folder. Below we’ll see how to create a validation set programmatically.

And here’s the file structure overall:
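Sketched as a plain-text tree (chest_xray is just a stand-in name for whatever your top-level data folder is called):

    chest_xray/
    ├── train/
    │   ├── NORMAL/
    │   └── PNEUMONIA/
    └── test/
        ├── NORMAL/
        └── PNEUMONIA/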

Once our data is organized into a file structure that represents the train/test split and accounts for the classes we want our model to identify, we’re ready to ask the next question.

How do the images get into the code?

We’ll be using the ImageDataGenerator class from Keras and its method flow_from_directory() to give our neural network model access to the images. First we’ll import the class: from keras.preprocessing.image import ImageDataGenerator. Then, we’ll save the filepaths to each of the train and test folders as strings so that we can tell Keras where to get the images from. Here’s what I did:
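(The chest_xray paths below are placeholders; point them at wherever you saved the folders.)

    # Placeholder paths: swap in wherever the train and test folders actually live
    train_dir = 'chest_xray/train/'
    test_dir = 'chest_xray/test/'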

Next we’ll make a generator for our training images. It’s called a generator since it can generate a batch of images for the model to use. It pulls a certain number of images at a time from the folder we organized, attaches the correct class label to each image, and can perform other preprocessing functions. For now, let’s just normalize the pixel data using the rescale parameter and tell Keras that we’d like to split off 20% of our training data into a validation set:
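Coded up, that’s a one-liner:

    # Normalize pixel values to 0-1 and reserve 20% of the training images for validation
    train_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)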

Next we’ll call the flow_from_directory method on our generator. This method takes several parameters, but the most important to define for now are:

  • the directory that the images are flowing from (we just defined that above!)
  • the color_mode of the images ('grayscale' or 'rgb'; mine were X-rays, so no color)
  • the class_mode of the images ('binary' = two classes; 'categorical' = multiclass)
  • the subset the images should be split into ('training' or 'validation')
  • whether or not to shuffle the images as they come in (make this True for your training set, False for your validation set)

Here’s what that looks like coded out for my X-ray images:
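(One caveat: target_size and batch_size aren’t discussed above, so treat those particular numbers as placeholders to tune.)

    # Training subset: shuffled so the model doesn't see all of one class at once
    train_generator = train_datagen.flow_from_directory(
        train_dir,
        color_mode='grayscale',
        class_mode='binary',
        target_size=(224, 224),   # placeholder size; pick what suits your model
        batch_size=32,            # placeholder batch size
        subset='training',
        shuffle=True
    )

    # Validation subset: same folder, but the held-out 20%, not shuffled
    val_generator = train_datagen.flow_from_directory(
        train_dir,
        color_mode='grayscale',
        class_mode='binary',
        target_size=(224, 224),
        batch_size=32,
        subset='validation',
        shuffle=False
    )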

This code allows us to pull images from a single folder, performing a validation split. Notice how the directory train_dir is the same for both — that’s just the filepath to the folder with all of the training data. The subset and shuffle parameters are the key to creating a validation set to evaluate your model while you’re still improving it. Use the test_dir filepath in the same way, with a brand new ImageDataGenerator, without a validation_split, in order to use your testing/holdout set for a final model evaluation.
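For example (same placeholder target_size and batch_size as before):

    # Holdout set: rescale only, no validation_split, and no shuffling
    test_datagen = ImageDataGenerator(rescale=1./255)

    test_generator = test_datagen.flow_from_directory(
        test_dir,
        color_mode='grayscale',
        class_mode='binary',
        target_size=(224, 224),
        batch_size=32,
        shuffle=False   # keep the order fixed so predictions line up with labels
    )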

How did the data get in there???

How do the images get into the model?

Once we’ve created these generators to pull the images into our code, all we have to do is throw the generators into our model for training and fitting! Build a neural network model by adding layers and compiling (I know, it’s a huge step, but there are plenty of resources for that online already), then set your model to train using your generators. Here’s how that looked in my code:
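(The little Sequential network below is a toy stand-in, not my actual architecture; the part that matters is that the generators go straight into .fit().)

    from keras import models, layers

    # Toy model just to show where the generators plug in; build your own real one
    model = models.Sequential([
        layers.Conv2D(16, (3, 3), activation='relu', input_shape=(224, 224, 1)),  # grayscale = 1 channel
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(1, activation='sigmoid')   # binary output: NORMAL vs. PNEUMONIA
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

    # The generators replace the usual X and y arrays
    history = model.fit(
        train_generator,
        validation_data=val_generator,
        epochs=30
    )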

When the model trains, it will pull images from the train folder (putting 20% of them aside for validation) and train on them for 30 epochs. At the end of each epoch of training, the model will evaluate on the validation set before starting the next epoch.

Training the model in this way also allows us to run model.evaluate(val_generator) to see a final evaluation score on the metrics you’ve declared during compiling. We can also use model.predict(val_generator) to get predictions for creating a confusion matrix.
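In code, those two calls look something like this (the 0.5 cutoff is just the standard threshold for a sigmoid output):

    # Final score on whatever metrics were declared during compiling
    model.evaluate(val_generator)

    # Predicted probabilities -> 0/1 class labels, e.g. for a confusion matrix
    preds = (model.predict(val_generator) > 0.5).astype(int)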

Nailed it.

Let’s review:

This by no means even scratches the surface of all the parameters, hyperparameters, classes, layers, and methods that go into creating a neural network model for image classification. Browsing through Keras’s documentation can keep you occupied for days and there is certainly no dearth of projects and examples available online. The purpose of this post was just to provide answers to some of the most basic questions that are nonetheless necessary to answer in order to try out any of those projects for yourself. To review:

  1. Store images in a nested file structure that splits the data into training and testing sets, then separates the images by class.
  2. Use Keras’s ImageDataGenerator and flow_from_directory() to tell your model where to get the images from and what to do with them.
  3. Give your generators to your neural network during training (inside of .fit()) in order to pull the images from the folders and show them to your model.

And if you’re feeling spicy, why not try one of these?

  • Use the parameters of an ImageDataGenerator to artificially inflate the size of your dataset and prevent overfitting by randomly altering the images (this is called ‘image augmentation’ or ‘data augmentation’; see the sketch after this list).
  • Run all of this in Google Colab and access data from your Google Drive instead of a local folder.
  • Call flow_from_dataframe() to pull in data from a Pandas DataFrame.
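As a taste of that first idea, here’s a sketch of an augmenting generator (the specific ranges are made-up starting points, and whether flips make sense depends on your images):

    # Each epoch sees slightly altered copies of the training images
    aug_datagen = ImageDataGenerator(
        rescale=1./255,
        validation_split=0.2,
        rotation_range=10,        # small random rotations, in degrees
        width_shift_range=0.1,    # random horizontal shifts, as a fraction of width
        height_shift_range=0.1,   # random vertical shifts, as a fraction of height
        zoom_range=0.1,           # random zoom in or out
        horizontal_flip=False     # mirroring chest X-rays may not make anatomical sense
    )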

I hope this can help you get your Keras image classification project off the ground and into your code. Good luck and happy modeling!

My model predicts this image’s class as “Blue Steel.”
