Cursive Hiragana MNIST using NN
--
Hiragana is one component of the Japanese writing system, along with Kanji, Katakana, and Romaji.
Cursive hiragana, or Kuzushiji, is hiragana written in a cursive hand. Millions of pages of Kuzushiji manuscripts survive, yet most modern Japanese readers can no longer read this style. Because the characters are free-form, each symbol appears in a wide variety of shapes, which makes automatic interpretation of digitized Kuzushiji texts difficult.
Recently, a Japanese-British data science team created a database of Kuzushiji images, now available on Kaggle. It is proposed as an alternative to the MNIST dataset and is called Kuzushiji-MNIST (a Japanese-literature alternative dataset for deep learning tasks). The dataset can be used to train models that recognize cursive hiragana characters in classical texts.
The present dataset is derived from the KMNIST dataset.
The training set contains 60,000 greyscale images of size 28 x 28 (see the shapes printed below).
The test set contains 10,000 images.
There are 10 classes of images, and the training labels are provided in a separate file. Each class corresponds to one hiragana character, as shown below:
index (char): 0 (お), 1 (き), 2 (す), 3 (つ), 4 (な), 5 (は), 6 (ま), 7 (や), 8 (れ), 9 (を)
Input Libraries
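The libraries actually imported are not shown in this export; a plausible minimal set, inferred from the outputs below (NumPy/pandas for the .npz and .csv files, Matplotlib for plots, scikit-learn for the split and the classification report, TensorFlow/Keras for the model), would be:

    # Assumed imports, reconstructed from the outputs that follow.
    import os

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    from sklearn.model_selection import train_test_split
    from sklearn.metrics import classification_report

    import tensorflow as tf
    from tensorflow.keras import layers, models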
Input Data
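The file names and array shapes printed below indicate compressed NumPy archives plus a CSV class map. A loading sketch, assuming the files sit in an input/ directory and each .npz stores its array under the key 'arr_0' (as in the official KMNIST release):

    DATA_DIR = "input"  # assumed location of the KMNIST files
    print(os.listdir(DATA_DIR))

    def load_npz(name):
        # Each KMNIST .npz archive stores a single array under the key 'arr_0'.
        return np.load(os.path.join(DATA_DIR, name))["arr_0"]

    train_images = load_npz("kmnist-train-imgs.npz")
    train_labels = load_npz("kmnist-train-labels.npz")
    test_images  = load_npz("kmnist-test-imgs.npz")
    test_labels  = load_npz("kmnist-test-labels.npz")
    class_map    = pd.read_csv(os.path.join(DATA_DIR, "kmnist_classmap.csv"))

    print("KMNIST train shape:", train_images.shape)
    print("KMNIST test shape:", test_images.shape)
    print("KMNIST train labels shape:", train_labels.shape)
    print("KMNIST character map shape:", class_map.shape)

    # Class balance of the full training set (10% per class for KMNIST).
    counts = np.bincount(train_labels)
    print("Percent for each category:", 100 * counts / counts.sum())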
['kmnist-train-labels.npz', 'kmnist-train-imgs.npz', 'kmnist-test-labels.npz', 'kmnist_classmap.csv', 'kmnist-test-imgs.npz']
KMNIST train shape: (60000, 28, 28)
KMNIST test shape: (10000, 28, 28)
KMNIST train labels shape: (60000,)
KMNIST character map shape: (10, 3)
Percent for each category: [10. 10. 10. 10. 10. 10. 10. 10. 10. 10.]
Data Visualization
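The sample-image figure did not survive this export; a sketch of a typical grid of random training images (assumes train_images and train_labels from the loading sketch above):

    # Show a few random training images with their index and hiragana character.
    chars = ["お", "き", "す", "つ", "な", "は", "ま", "や", "れ", "を"]
    fig, axes = plt.subplots(3, 6, figsize=(10, 5))
    rng = np.random.default_rng(0)
    for ax, idx in zip(axes.ravel(),
                       rng.choice(len(train_images), size=axes.size, replace=False)):
        ax.imshow(train_images[idx], cmap="gray")
        ax.set_title(f"{train_labels[idx]} ({chars[train_labels[idx]]})")
        ax.axis("off")
    plt.tight_layout()
    plt.show()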
Data Preprocess
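The (28, 28, 1) shapes reported in the next section suggest the images are given a channel axis and, presumably, scaled to [0, 1]; a sketch under those assumptions:

    # Scale pixel values to [0, 1] and add a channel dimension for Conv2D.
    x_train = (train_images.astype("float32") / 255.0).reshape(-1, 28, 28, 1)
    x_test  = (test_images.astype("float32") / 255.0).reshape(-1, 28, 28, 1)
    y_train = train_labels
    y_test  = test_labels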
Train / Validation Split
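The 48,000 / 12,000 row counts below correspond to a 20% hold-out from the 60,000 training images; a sketch using scikit-learn (the slightly uneven class percentages below suggest the split was random rather than stratified):

    # Hold out 20% of the training images for validation.
    x_tr, x_val, y_tr, y_val = train_test_split(
        x_train, y_train,
        test_size=0.2,      # 48,000 train / 12,000 validation
        random_state=42,    # assumed seed for reproducibility
    )
    print("KMNIST train - rows:", x_tr.shape[0], "image shape:", x_tr.shape[1:])
    print("KMNIST valid - rows:", x_val.shape[0], "image shape:", x_val.shape[1:])
    print("KMNIST test - rows:", x_test.shape[0], "image shape:", x_test.shape[1:])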
KMNIST train - rows: 48000 image shape: (28, 28, 1)
KMNIST valid - rows: 12000 image shape: (28, 28, 1)
KMNIST test - rows: 10000 image shape: (28, 28, 1)
Class distribution in the training split (48,000 images):
4 (な): 4845 or 10.09%
9 (を): 4833 or 10.07%
0 (お): 4833 or 10.07%
6 (ま): 4809 or 10.02%
8 (れ): 4806 or 10.01%
2 (す): 4800 or 10.00%
1 (き): 4795 or 9.99%
3 (つ): 4791 or 9.98%
5 (は): 4781 or 9.96%
7 (や): 4707 or 9.81%
Class distribution in the validation split (12,000 images):
7 (や): 1293 or 10.78%
5 (は): 1219 or 10.16%
3 (つ): 1209 or 10.08%
1 (き): 1205 or 10.04%
2 (す): 1200 or 10.00%
8 (れ): 1194 or 9.95%
6 (ま): 1191 or 9.93%
9 (を): 1167 or 9.73%
0 (お): 1167 or 9.73%
4 (な): 1155 or 9.63%
Model Architecture
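The summary below can be reproduced by a Keras model of roughly this form; padding and activations are inferred from the output shapes and parameter counts, not stated in the original:

    model = models.Sequential([
        layers.Conv2D(16, (3, 3), padding="same", activation="relu",
                      input_shape=(28, 28, 1)),       # -> (28, 28, 16), 160 params
        layers.BatchNormalization(),                  # 64 params
        layers.Conv2D(16, (3, 3), activation="relu"), # -> (26, 26, 16), 2,320 params
        layers.BatchNormalization(),                  # 64 params
        layers.Flatten(),                             # -> 10,816 features
        layers.Dense(10, activation="softmax"),       # 108,170 params, 10 classes
    ])
    model.summary()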
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 28, 28, 16) 160
_________________________________________________________________
batch_normalization (BatchNo (None, 28, 28, 16) 64
_________________________________________________________________
conv2d_1 (Conv2D) (None, 26, 26, 16) 2320
_________________________________________________________________
batch_normalization_1 (Batch (None, 26, 26, 16) 64
_________________________________________________________________
flatten (Flatten) (None, 10816) 0
_________________________________________________________________
dense (Dense) (None, 10) 108170
=================================================================
Total params: 110,778
Trainable params: 110,714
Non-trainable params: 64
Training the model
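The 375 steps per epoch are consistent with a batch size of 128 on 48,000 samples; a compile/fit sketch under that assumption (optimizer and loss are also assumptions, with sparse categorical cross-entropy implied by the integer labels):

    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    history = model.fit(
        x_tr, y_tr,
        batch_size=128,   # 48,000 / 128 = 375 steps per epoch
        epochs=10,
        validation_data=(x_val, y_val),
    )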
Epoch 1/10
375/375 [==============================] - 2s 4ms/step - loss: 0.3864 - accuracy: 0.8914 - val_loss: 0.3888 - val_accuracy: 0.8773
Epoch 2/10
375/375 [==============================] - 1s 4ms/step - loss: 0.1316 - accuracy: 0.9612 - val_loss: 0.2332 - val_accuracy: 0.9372
Epoch 3/10
375/375 [==============================] - 1s 4ms/step - loss: 0.0678 - accuracy: 0.9795 - val_loss: 0.2286 - val_accuracy: 0.9434
Epoch 4/10
375/375 [==============================] - 1s 4ms/step - loss: 0.0334 - accuracy: 0.9905 - val_loss: 0.2308 - val_accuracy: 0.9443
Epoch 5/10
375/375 [==============================] - 1s 4ms/step - loss: 0.0199 - accuracy: 0.9945 - val_loss: 0.2426 - val_accuracy: 0.9462
Epoch 6/10
375/375 [==============================] - 1s 4ms/step - loss: 0.0114 - accuracy: 0.9977 - val_loss: 0.2503 - val_accuracy: 0.9438
Epoch 7/10
375/375 [==============================] - 1s 4ms/step - loss: 0.0097 - accuracy: 0.9977 - val_loss: 0.2867 - val_accuracy: 0.9438
Epoch 8/10
375/375 [==============================] - 1s 4ms/step - loss: 0.0215 - accuracy: 0.9931 - val_loss: 0.3793 - val_accuracy: 0.9283
Epoch 9/10
375/375 [==============================] - 1s 4ms/step - loss: 0.0587 - accuracy: 0.9814 - val_loss: 0.3890 - val_accuracy: 0.9402
Epoch 10/10
375/375 [==============================] - 1s 4ms/step - loss: 0.0342 - accuracy: 0.9897 - val_loss: 0.3755 - val_accuracy: 0.9353
Accuracy and Loss Graphs
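The accuracy and loss figures did not survive this export; they are typically drawn from the fit History object, for example:

    # Plot training vs validation accuracy and loss over the 10 epochs.
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
    ax1.plot(history.history["accuracy"], label="train")
    ax1.plot(history.history["val_accuracy"], label="validation")
    ax1.set_xlabel("epoch"); ax1.set_ylabel("accuracy"); ax1.legend()
    ax2.plot(history.history["loss"], label="train")
    ax2.plot(history.history["val_loss"], label="validation")
    ax2.set_xlabel("epoch"); ax2.set_ylabel("loss"); ax2.legend()
    plt.show()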
Predictions
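The counts and the classification report below are evaluated on the 12,000-image validation split; a sketch of how they could be produced:

    # Predict validation classes and compare against the true labels.
    pred = np.argmax(model.predict(x_val), axis=1)

    correct = int((pred == y_val).sum())
    print("Correctly predicted samples:", correct)
    print("Incorrectly predicted samples:", len(y_val) - correct)

    chars = ["お", "き", "す", "つ", "な", "は", "ま", "や", "れ", "を"]
    target_names = [f"Class {i} ({c})" for i, c in enumerate(chars)]
    print(classification_report(y_val, pred, target_names=target_names))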
Correctly predicted samples: 11224
Incorrectly predicted samples: 776
precision recall f1-score support
Class 0 (お): 0.96 0.96 0.96 1167
Class 1 (き): 0.97 0.91 0.94 1205
Class 2 (す): 0.95 0.85 0.90 1200
Class 3 (つ): 0.92 0.96 0.94 1209
Class 4 (な): 0.92 0.93 0.92 1155
Class 5 (は): 0.95 0.90 0.93 1219
Class 6 (ま): 0.83 0.96 0.89 1191
Class 7 (や): 0.97 0.96 0.97 1293
Class 8 (れ): 0.93 0.96 0.95 1194
Class 9 (を): 0.98 0.95 0.96 1167
accuracy 0.94 12000
macro avg 0.94 0.94 0.94 12000
weighted avg 0.94 0.94 0.94 12000
DeepCC
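The log below is produced by the deepC (deepCC) compiler converting the saved Keras model to ONNX and then to a standalone C++ executable; a sketch of a typical invocation (assumed; the exact command is not shown in this export):

    # Save the trained Keras model, then hand it to the deepC toolchain.
    model.save("model.h5")

    !deepCC model.h5   # notebook shell invocation of the deepCC compiler (assumed usage)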
[INFO]
Reading [keras model] 'model.h5'
[SUCCESS]
Saved 'model_deepC/model.onnx'
[INFO]
Reading [onnx model] 'model_deepC/model.onnx'
[INFO]
Model info:
ir_vesion : 4
doc :
[WARNING]
[ONNX]: graph-node conv2d's attribute auto_pad has no meaningful data.
[WARNING]
[ONNX]: terminal (input/output) conv2d_input's shape is less than 1. Changing it to 1.
[WARNING]
[ONNX]: terminal (input/output) dense's shape is less than 1. Changing it to 1.
WARN (GRAPH): found operator node with the same name (dense) as io node.
[INFO]
Running DNNC graph sanity check ...
[SUCCESS]
Passed sanity check.
[INFO]
Writing C++ file 'model_deepC/model.cpp'
[INFO]
deepSea model files are ready in 'model_deepC/'
[RUNNING COMMAND]
g++ -std=c++11 -O3 -fno-rtti -fno-exceptions -I. -I/opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/include -isystem /opt/tljh/user/lib/python3.7/site-packages/deepC-0.13-py3.7-linux-x86_64.egg/deepC/packages/eigen-eigen-323c052e1731 "model_deepC/model.cpp" -D_AITS_MAIN -o "model_deepC/model.exe"
[RUNNING COMMAND]
size "model_deepC/model.exe"
text data bss dec hex filename
602405 3792 760 606957 942ed model_deepC/model.exe
[SUCCESS]
Saved model as executable "model_deepC/model.exe"
Notebook Link - Here
Credits - Siddharth Ganjoo