Braille character recognition using a CNN

2 min read · Sep 2, 2021

Braille is a tactile system of raised dots that is read by touch, although people who are not visually impaired typically read it with their eyes. Contrary to popular belief, braille is not a language. It is a code in which many languages (English, Spanish, Arabic, Chinese, and dozens of others) can be written and read. Braille is used by thousands of people all over the world in their native languages, and it enables wider access to literacy.

Dataset description:-

Each image in the dataset is a 28x28 black-and-white (BW) image. Each file name consists of the character it depicts, the image number, and the type of data augmentation applied (whs: width/height shift, rot: rotation, dim: brightness). The dataset comprises 26 characters x 3 augmentations x 20 images with different augmentation values (i.e. different shift, rotation, and brightness values), for 1,560 images in total. There are 26 classes, the lowercase English letters a to z.

Dataset link:- https://www.kaggle.com/shanks0465/braille-character-dataset

Import

Data preprocessing
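
As a minimal sketch of one part of this step, the class label and augmentation type could be read from each file name. It assumes that the name starts with the lowercase letter it depicts and that the part before the final extension ends with one of the three augmentation tags from the dataset description. The helper `parse_name` and the example file names are hypothetical and only illustrate the idea.

```python
import string
from pathlib import Path

# Assumption: these tags match the ones named in the dataset description.
AUGMENTATIONS = ("whs", "rot", "dim")
CLASSES = string.ascii_lowercase  # 26 classes, "a" to "z"


def parse_name(file_name):
    """Return (class_index, augmentation_tag) for one image file name."""
    path = Path(file_name)
    letter = path.name[0].lower()
    if letter not in CLASSES:
        raise ValueError(f"unexpected leading character in {file_name!r}")
    # Assumption: the name without its final extension ends with the tag.
    augmentation = next(
        (tag for tag in AUGMENTATIONS if path.stem.endswith(tag)), None
    )
    return CLASSES.index(letter), augmentation


# Hypothetical file names, for illustration only.
examples = ["a1.JPG0dim.jpg", "z20.JPG19rot.jpg", "m7.JPG3whs.jpg"]
labels = [parse_name(name) for name in examples]
print(labels)
```

The integer class index can be fed directly to a sparse categorical loss, or one-hot encoded into a vector of length 26.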

Model
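
A sketch of a Keras model that is consistent with the layer summary in the Model architecture section is given here. The kernel sizes (3, 3, and 2) and the "valid" padding are inferred from the output shapes and parameter counts in that summary. The ReLU activations on the convolution layers, the softmax output, and the default LeakyReLU slope are assumptions, since the summary does not show them.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_model(num_classes=26):
    """Build a small separable-convolution classifier for 28x28x3 images."""
    inputs = tf.keras.Input(shape=(28, 28, 3))
    # Conv activations are an assumption; they do not change parameter counts.
    x = layers.SeparableConv2D(64, 3, activation="relu")(inputs)  # 26x26x64
    x = layers.MaxPooling2D()(x)                                  # 13x13x64
    x = layers.SeparableConv2D(128, 3, activation="relu")(x)      # 11x11x128
    x = layers.MaxPooling2D()(x)                                  # 5x5x128
    x = layers.SeparableConv2D(256, 2, activation="relu")(x)      # 4x4x256
    x = layers.GlobalMaxPooling2D()(x)                            # 256
    x = layers.Dense(256)(x)
    x = layers.LeakyReLU()(x)
    x = layers.Dense(64)(x)
    x = layers.LeakyReLU()(x)
    # Softmax output is an assumption; the summary only shows 26 units.
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)


model = build_model()
model.summary()
```

With these settings the model has the same number of trainable parameters as the summary reports, which suggests the inferred kernel sizes are right.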

Q: What is the main difference between a normal convolution and a separable convolution?

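The numbers here correspond to an example in which a 12x12x3 input is convolved with 5x5 kernels to produce an 8x8x256 output. With a normal convolution, we transform the image 256 times, once per output channel, and every transformation uses 5x5x3x8x8 = 4,800 multiplications, for 256 x 4,800 = 1,228,800 in total. With a separable convolution, we only really transform the image once, in the depthwise convolution, at a cost of those same 4,800 multiplications. A pointwise (1x1) convolution then expands the transformed image to 256 channels, which costs a further 3x8x8x256 = 49,152 multiplications. Because we do not have to transform the image over and over again, we save a large amount of computation.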
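
The arithmetic can be checked with a short script. It assumes the example behind these figures is a 12x12x3 input, 5x5 kernels, no padding, and 256 output channels, which is inferred from the 5x5x3x8x8 term rather than stated explicitly. The function names are illustrative.

```python
def standard_conv_mults(out_h, out_w, k, c_in, c_out):
    """Multiplications for a normal convolution with k x k kernels."""
    return out_h * out_w * k * k * c_in * c_out


def separable_conv_mults(out_h, out_w, k, c_in, c_out):
    """Multiplications for a depthwise step followed by a 1x1 pointwise step."""
    depthwise = out_h * out_w * k * k * c_in   # one k x k filter per input channel
    pointwise = out_h * out_w * c_in * c_out   # 1x1 convolution across channels
    return depthwise + pointwise


# Assumed example: 12x12x3 input, 5x5 kernels, 8x8x256 output.
standard = standard_conv_mults(8, 8, 5, 3, 256)
separable = separable_conv_mults(8, 8, 5, 3, 256)
print(standard)                         # 1228800
print(separable)                        # 53952
print(round(standard / separable, 1))   # 22.8
```

In this example the separable convolution needs roughly 23 times fewer multiplications than the normal one.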

Model architecture:-

Model: "functional_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 28, 28, 3)]       0         
_________________________________________________________________
separable_conv2d (SeparableC (None, 26, 26, 64)        283       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 64)        0         
_________________________________________________________________
separable_conv2d_1 (Separabl (None, 11, 11, 128)       8896      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 128)         0         
_________________________________________________________________
separable_conv2d_2 (Separabl (None, 4, 4, 256)         33536     
_________________________________________________________________
global_max_pooling2d (Global (None, 256)               0         
_________________________________________________________________
dense (Dense)                (None, 256)               65792     
_________________________________________________________________
leaky_re_lu (LeakyReLU)      (None, 256)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 64)                16448     
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 64)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 26)                1690      
=================================================================
Total params: 126,645
Trainable params: 126,645
Non-trainable params: 0
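
The parameter counts in the summary can be reproduced by hand. The sketch below assumes kernel sizes of 3, 3, and 2 for the three separable convolutions (inferred from the output shapes) and a bias term on every layer. It also shows how many parameters a normal convolution of the same shape would need.

```python
def separable_conv_params(k, c_in, c_out):
    """Depthwise weights + pointwise weights + one bias per output channel."""
    return k * k * c_in + c_in * c_out + c_out


def standard_conv_params(k, c_in, c_out):
    """Weights of a normal k x k convolution + one bias per output channel."""
    return k * k * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weight matrix + bias vector of a fully connected layer."""
    return n_in * n_out + n_out


params = {
    "separable_conv2d": separable_conv_params(3, 3, 64),       # 283
    "separable_conv2d_1": separable_conv_params(3, 64, 128),   # 8896
    "separable_conv2d_2": separable_conv_params(2, 128, 256),  # 33536
    "dense": dense_params(256, 256),                           # 65792
    "dense_1": dense_params(256, 64),                          # 16448
    "dense_2": dense_params(64, 26),                           # 1690
}
total = sum(params.values())
print(total)  # 126645

# For comparison: the same three layers as normal convolutions.
standard_total = (
    standard_conv_params(3, 3, 64)
    + standard_conv_params(3, 64, 128)
    + standard_conv_params(2, 128, 256)
)
print(standard_total)  # 206976
```

The three separable layers use 42,715 parameters in total, compared with 206,976 for normal convolutions of the same shape.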