Image classification for Arabic handwritten characters

Hassan Al-hajri
Dec 25, 2019 · 6 min read


1. Abstract

This post is an attempt to recognize handwritten Arabic characters. The training dataset consists of 13,440 character images spread across 28 classes. Feature scaling is done by normalizing the pixel values: each pixel is an intensity between 0 and 255, and the values are rescaled to the 0–1 range. A convolutional neural network is used as the classifier.

2. Analysis

In this post, I will experiment with multiple parameters to find the best configuration.

First, let's read the data:

# Imports used throughout this post
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Dropout, Flatten, Dense
from keras.utils import to_categorical

# Load the training images
x_train = pd.read_csv('Arabic Handwritten Characters Recognition/csvTrainImages 13440x1024.csv', header=None)
# Load the training labels
x_label = pd.read_csv('Arabic Handwritten Characters Recognition/csvTrainLabel 13440x1.csv', names=['count'])
# Load the test images
y_test = pd.read_csv('Arabic Handwritten Characters Recognition/csvTestImages 3360x1024.csv', header=None)
# Load the test labels
y_label = pd.read_csv('Arabic Handwritten Characters Recognition/csvTestLabel 3360x1.csv', header=None)

Now we need to make sure that the data is clean; a quick check for missing values and label range is shown below.
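This check is my own addition (it is not in the original post); it only assumes the data frames loaded above:

# A quick cleanliness check: look for missing pixel values and confirm the label range
print(x_train.isnull().values.any())   # True would indicate missing pixel values
print(y_test.isnull().values.any())
print(x_label['count'].min(), x_label['count'].max())  # labels should span the 28 classes (1-28)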

Next, let's split the training data into training and validation sets:

x_train, z_val, x_label, z_label = train_test_split(x_train, x_label, test_size=0.20, random_state=42)
Then we print the number of rows and columns to verify the split.
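A small sketch of that printout (my addition; the counts follow from the 80/20 split of 13,440 samples):

# Rows = number of samples, columns = number of pixel features (32 x 32 = 1024)
print(x_train.shape, x_label.shape)  # roughly (10752, 1024) and (10752, 1)
print(z_val.shape, z_label.shape)    # roughly (2688, 1024) and (2688, 1)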

Now we take the raw values and convert them to float so we can later normalize them to the 0–1 range without loss of information.

x_train = x_train.values.astype('float32')
x_label = x_label.values.astype('int32') - 1  # 28 Arabic letters; shift labels to the 0-27 range
y_test = y_test.values.astype('float32')
y_label = y_label.values.astype('int32') - 1
z_val = z_val.values.astype('float32')
z_label = z_label.values.astype('int32') - 1

We need to reshape the data from [number of images, number of features (32 × 32 = 1024)] into [number of images, rows, columns] so the input is in the shape the model expects.

x_train = x_train.reshape(-1, 32, 32)
y_test = y_test.reshape(-1, 32, 32)
z_val = z_val.reshape(-1, 32, 32)
x_train.shape, y_test.shape, z_val.shape

Each pixel value in our datasets lies between 0 and 255, so we now scale it to the 0–1 range by dividing by 255 (the maximum pixel value).

x_train = x_train / 255
y_test = y_test / 255
z_val = z_val / 255

Convolution2D layers expect 4-dimensional input, so we change the shape of each image to (batch, rows, columns, channels). The channels dimension indicates whether the image is grayscale or colored; our images are grayscale, so we use 1 channel.

x_train = x_train.reshape(-1, 32, 32,1)
y_test = y_test.reshape(-1, 32, 32,1)
z_val = z_val.reshape(-1, 32, 32,1)
(x_train.shape[1:], y_test.shape[1:], z_val.shape[1:])

Now we one-hot encode the labels: each of the 28 classes is converted from an integer into a binary vector in which the correct class is 1 and all other positions are 0.

x_label = to_categorical(x_label, num_classes=28)
y_label = to_categorical(y_label, num_classes=28)
z_label = to_categorical(z_label, num_classes=28)

Finding The Best Model

Let's start by finding the best value for dropout. Dropout helps prevent the network from overfitting, so it generalizes better.

# CNN to find the best dropout rate
nets = 4
model = [0] * nets
input_shape = (32, 32, 1)
history = [0] * nets

for j in range(nets):
    model[j] = Sequential()
    model[j].add(Conv2D(16, (3,3), padding='same', input_shape=input_shape,
                        kernel_initializer='uniform', activation='relu'))
    model[j].add(BatchNormalization())
    model[j].add(MaxPooling2D(pool_size=2))
    model[j].add(Dropout(rate=j*0.1))

    model[j].add(Conv2D(32, (3,3), padding='same',
                        kernel_initializer='uniform', activation='relu'))
    model[j].add(BatchNormalization())
    model[j].add(MaxPooling2D(pool_size=2))
    model[j].add(Dropout(rate=j*0.1))

    model[j].add(Conv2D(64, (3,3), padding='same',
                        kernel_initializer='uniform', activation='relu'))
    model[j].add(BatchNormalization())
    model[j].add(MaxPooling2D(pool_size=2))
    model[j].add(Dropout(rate=j*0.1))

    model[j].add(Conv2D(64, (3,3), padding='same',
                        kernel_initializer='uniform', activation='relu'))
    model[j].add(BatchNormalization())
    model[j].add(MaxPooling2D(pool_size=2))
    model[j].add(Dropout(rate=j*0.1))

    model[j].add(Flatten())
    model[j].add(Dense(128, activation='relu'))
    model[j].add(BatchNormalization())
    model[j].add(Dropout(rate=j*0.1))
    model[j].add(Dense(28, activation='softmax'))
    model[j].compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

Now let's fit each model and compare the results:

names = ["0%","10%","20%","30%"]
for j in range(nets):
history[j] = model[j].fit(x_train,x_label, batch_size=32, epochs = 5,
validation_data = (z_val,z_label), verbose=0)
print("Dropout {0}: Epochs={1:d}, Train accuracy={2:.5f}, Validation accuracy={3:.5f}".format(
names[j],5,max(history[j].history['acc']),max(history[j].history['val_acc']) ))

Based on the results above, I decided to choose 30% Dropout for the model.

Let's find the best number of filters (feature maps) for the convolutional layers.

# CNN to find the best filter mapping
nets = 4
model = [0] * nets
input_shape = (32, 32, 1)
history = [0] * nets

for j in range(nets):
    model[j] = Sequential()
    model[j].add(Conv2D((j*16)+16, (3,3), padding='same', input_shape=input_shape,
                        kernel_initializer='uniform', activation='relu'))
    model[j].add(BatchNormalization())
    model[j].add(MaxPooling2D(pool_size=2))
    model[j].add(Dropout(rate=0.3))

    model[j].add(Conv2D((j*32)+32, (3,3), padding='same',
                        kernel_initializer='uniform', activation='relu'))
    model[j].add(BatchNormalization())
    model[j].add(MaxPooling2D(pool_size=2))
    model[j].add(Dropout(rate=0.3))

    model[j].add(Flatten())
    model[j].add(Dense(128, activation='relu'))
    model[j].add(BatchNormalization())
    model[j].add(Dropout(rate=0.3))
    model[j].add(Dense(28, activation='softmax'))
    model[j].compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

Now let's fit and compare each configuration:

names = ["16","32","48","64"]
for j in range(nets):
history[j] = model[j].fit(x_train,x_label, batch_size=32, epochs = 5,
validation_data = (z_val,z_label), verbose=0)
print("CNN {0}: Epochs={1:d}, Train accuracy={2:.5f}, Validation accuracy={3:.5f}".format(
names[j],5,max(history[j].history['acc']),max(history[j].history['val_acc']) ))

Filter mappings tried (first–second convolutional layer): 16–32, 32–64, 48–96, and 64–128.

From the results above, it seems that 32–64, 48–96, and 64–128 give the best validation accuracy. To reduce the computation cost, I will choose 32–64.

Now let's create a function to experiment with different parameters:

def create_model(optimizer='Adam', kernel_initializer='uniform', activation='relu'):
    model = Sequential()
    model.add(Conv2D(16, (3,3), padding='same', input_shape=input_shape,
                     kernel_initializer=kernel_initializer, activation=activation))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=2))
    model.add(Dropout(rate=0.3))
    model.add(Conv2D(32, (3,3), padding='same',
                     kernel_initializer=kernel_initializer, activation=activation))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=2))
    model.add(Dropout(rate=0.3))
    model.add(Conv2D(64, (3,3), padding='same',
                     kernel_initializer=kernel_initializer, activation=activation))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=2))
    model.add(Dropout(rate=0.3))
    model.add(Conv2D(128, (3,3), padding='same',
                     kernel_initializer=kernel_initializer, activation=activation))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=2))
    model.add(Dropout(rate=0.3))
    model.add(Conv2D(256, (3,3), padding='same',
                     kernel_initializer=kernel_initializer, activation=activation))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=2))
    model.add(Dropout(rate=0.3))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(BatchNormalization())
    model.add(Dropout(rate=0.3))
    # Fully connected final layer
    model.add(Dense(28, activation='softmax'))
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    return model

Trying different parameters (optimizers, kernel initializers, and activation functions) to find the best values:

optimizer = ['RMSprop', 'Adam', 'Adagrad']
kernel_initializer = ['normal', 'uniform']
activation = ['relu', 'linear']

for a, b, c in [(x, y, z) for x in optimizer for z in activation for y in kernel_initializer]:
    params = {'optimizer': a, 'kernel_initializer': b, 'activation': c}
    print(params)
    curr_model = create_model(a, b, c)
    curr_model.fit(x_train, x_label,
                   validation_data=(z_val, z_label),
                   epochs=5, batch_size=32, shuffle=True, verbose=1)
    print("------------------------------------------------------------------------------------")

After running the model multiple times I decided to go with {‘optimizer’: ‘Adam’, ‘kernel_initializer’: ‘uniform’, ‘activation’: ‘relu’}.

My Final Model With The Best Parameters

input_shape = (32, 32, 1)

model = Sequential()
model.add(Conv2D(32, (3,3), padding='same', input_shape=input_shape,
                 kernel_initializer='uniform', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(rate=0.3))
model.add(Conv2D(32, (3,3), padding='same',
                 kernel_initializer='uniform', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(rate=0.3))
model.add(Conv2D(64, (3,3), padding='same',
                 kernel_initializer='uniform', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(rate=0.3))
model.add(Conv2D(64, (3,3), padding='same',
                 kernel_initializer='uniform', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(rate=0.3))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(rate=0.3))
model.add(Dense(28, activation='softmax'))
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()

It’s time to fit our model

history = model.fit(x_train, x_label, validation_data=(z_val,z_label),epochs=10, batch_size=32, shuffle=True, verbose=1)

Visualization

# Accuracy vs. epochs
# Summarize history for accuracy
print(history.history.keys())
plt.figure(figsize=(15,5))
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validate'], loc='upper left')
plt.show()

Accuracy vs. epochs

# Loss vs. epochs
# Summarize history for loss
plt.figure(figsize=(15,5))
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validate'], loc='upper left')
plt.show()

Loss vs. epochs

Now let's save the model so we can use it later:

model.save('my_model.hdf5')
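To reuse it later, the saved file can be loaded back. A minimal sketch (my addition; `load_model` is the standard Keras call, and the file name matches the one saved above):

from keras.models import load_model

# Load the saved model from disk and use it for predictions
loaded_model = load_model('my_model.hdf5')
predictions = loaded_model.predict(y_test)  # class probabilities, shape (3360, 28)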

Testing the model on the test dataset:

evaluate = model.evaluate(y_test, y_label, verbose=1)
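`evaluate` returns the loss followed by the compiled metrics, so we can print both (a small sketch I added for readability):

# model.evaluate returns [loss, accuracy] in the order of the compiled metrics
print('Test loss: {:.4f}, Test accuracy: {:.4f}'.format(evaluate[0], evaluate[1]))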

We got about 90% accuracy after 10 epochs, so let's increase the number of epochs to push it higher.

epochs = 25
from keras.callbacks import ModelCheckpoint

# Save the weights with the best validation performance during training
checkpointer = ModelCheckpoint(filepath='my_model.hdf5', verbose=1, save_best_only=True)
history = model.fit(x_train, x_label,
                    validation_data=(z_val, z_label),
                    epochs=epochs, batch_size=32, verbose=1, callbacks=[checkpointer])

model.load_weights('my_model.hdf5')

Printing the accuracy after 25 epochs:
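One way to report it is to re-evaluate the reloaded best weights on the test set (my own sketch; the original post showed the printed output):

# Evaluate the best checkpointed weights on the held-out test set
score = model.evaluate(y_test, y_label, verbose=0)
print('Test accuracy after 25 epochs: {:.4f}'.format(score[1]))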

Now let's evaluate the model with a classification report (precision, recall, F1-score, and support).
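The original post does not include this snippet, but a typical way to produce the report with scikit-learn would look like this (note that `classification_report` expects class indices, not one-hot vectors):

from sklearn.metrics import classification_report
import numpy as np

# Convert one-hot labels and predicted probabilities back to class indices
y_true = np.argmax(y_label, axis=1)
y_pred = np.argmax(model.predict(y_test), axis=1)

print(classification_report(y_true, y_pred))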

3. Conclusion

As we saw above, we got pretty good accuracy, and the model can get better with more epochs. Training a CNN is a stochastic process: each run of the experiment gives different results, and the outcome depends on multiple hyperparameters (number of layers, number of feature maps in each layer, dropout, batch normalization, etc.). Therefore, you should run your experiments multiple times before choosing your final model. You can see the full code here: https://github.com/Hassan-AlHajri/Image-classification-for-Arabic-handwritten-character
