Learn Chinese Faster by Using Handwritten Chinese Character Recognition (HCCR)

Phakawut Thummawuttikul
20 min read · Jun 30, 2022

My work on GitHub: https://github.com/GithubArmy/Handwritten-Chinese-Character-Recognition

Hello, I am a Thai native studying in the British curriculum and an intermediate Chinese language learner. Thai is the mother tongue I use in my daily life, I use English to study at school, and I am currently interested in improving my Chinese as my third language. As you all know, China is a huge country with an enormous economy, filled with many opportunities, and it holds great power.

Purpose:

As a Chinese learner who is also studying two other languages, I can’t deny that learning Chinese is pretty hard, so hard that I believe it deserves to be called the hardest language. Furthermore, Thailand and China have a very close relationship in both the economy and culture: for example, Thailand and China import and export goods with each other, and most tourists in Thailand were Chinese (before the Covid-19 pandemic), making Chinese-speaking skills quite valuable in Thailand.

From: https://en.wikipedia.org/wiki/Economy_of_China

English and Thai are quite similar in that both have a fixed set of consonants and vowels, so a word is a combination of them. This lets us (even without having encountered a word before) work out its sound when we read it, and work out how it is spelt just by hearing its pronunciation.

Differences between Thai, English, and Chinese

However, Chinese is completely different: it has over 6,000 characters, each with its own pronunciation. Pretty hard, right? A Chinese word is either one character with its own meaning or two or more characters combined to create a new meaning. Chinese is an ancient language that appeared more than 3,000 years ago, so the original Chinese was a pictographic language. It has developed over time, but learning Chinese is still tremendously hard for students because we can’t work out a character’s pronunciation just by looking at it.

From: https://www.omniglot.com/language/articles/chinesemyths.htm

Therefore, the Pinyin system was created, in which Roman letters show a Chinese character’s pronunciation, allowing us to blend the sounds together. Even so, Chinese is still hard for learners. For example, to get a high score on the HSK4 test, you need to remember 1,200 words built from 1,059 different characters, which takes a lot of time. You have to memorize 1,059 characters with their meanings and pronunciations (Pinyin) to excel in the HSK4 test! That’s so much!

From: https://bilingualkidspot.com/2019/02/05/family-members-in-chinese-teach-kids-mandarin-with-these-free-lessons/family-members-in-chinese

For a beginner or intermediate Chinese learner like me, encountering a new Chinese character without its Pinyin and immediately knowing its pronunciation and meaning is almost impossible. However, if you know its Pinyin, you know its pronunciation right away, and you can also find its meaning by searching online, because you type the character by entering its Pinyin.

Therefore, I came up with the idea of making a model that predicts which Chinese character a handwritten image shows and gives the character’s Pinyin, allowing the user to look up its meaning elsewhere.

Google also has a tool for recognizing handwritten Chinese characters; however, it has its own drawbacks. For example, you need a network connection, so you might not be able to use it in a library or other places that don’t allow internet access or the use of communication devices (since social media might distract you).

From: https://www.raspberrypi.com/products/raspberry-pi-4-model-b

My goal is to use AI to recognize Chinese characters without being online, on any low-priced system such as a Raspberry Pi with a mini touchscreen. I believe that if we can produce an inexpensive model, it would be a great help to many intermediate learners of Chinese.

Dataset:

To make an AI model, you first need a dataset. If you search for Chinese handwriting datasets, you will find this link at the top: http://www.nlpr.ia.ac.cn/databases/handwriting/Offline_database.html

This dataset was made by:
Institute of Automation of Chinese Academy of Sciences
95 Zhongguancun East Road, Beijing 100190, P.R. China

When I first encountered the dataset, I was very nervous because the files are in a special format, not standard picture files, which scared me as a beginner. Fortunately, one of my mentors told me about this Kaggle dataset, https://www.kaggle.com/datasets/pascalbliem/handwritten-chinese-character-hanzi-datasets
, where the original dataset has been converted into standard picture files, letting me work with it without having to convert it myself.

Now, it is time to download the dataset into my notebook. I had to clear up some space, as the dataset is 14 GB.

If you have no idea how to do this, you can follow the instructions in this link:
https://www.analyticsvidhya.com/blog/2021/06/how-to-load-kaggle-datasets-directly-into-google-colab

After you finish downloading the zipped dataset, it is time to extract it. However, this dataset isn’t structured in the usual way: the pictures sit in folders named after the Chinese character they contain, as shown in the picture.

Because the folders are named with Chinese characters rather than English, the folder names are encoded in UTF-8. Therefore, the zipped dataset needs to be unzipped with the UTF-8 encoding explicitly specified (or else it might unzip into garbled folder names).

! mkdir train

! unzip -O utf8 handwritten-chinese-character-hanzi-datasets.zip -d train

You can use the commands above to unzip the dataset (you will need another 14 GB of free space).
After running this for about 20 minutes, you can find each picture in your dataset like this:

When you unzip the dataset, you get 2 large folders, test and train, each containing 7,330 folders, one per character.

For the train folder, there are about 600 pictures of each character.

For the test folder, there are about 140 pictures of each character.

As you can see, each picture is different but still identifiable as its Chinese character, so I decided that the dataset was good enough and it was time to train the model.

Unfortunately, the dataset also has an error: it contains one broken file, which I had to delete.

To check whether this file really exists:
import os
os.path.exists('/content/train/CASIA-HWDB_Train/Train/X/49.png')

To delete the broken file:
os.remove('/content/train/CASIA-HWDB_Train/Train/X/49.png')

Training the Model

The train dataset consists of 3,223,043 files separated into 7,330 folders, so training on all of it would take a lot of time. I believed that before training a model on a dataset that big, I should start with something smaller. Because this was my first time training an AI model, I was quite nervous and not sure what to do, so I picked out 400 characters (folders) to train on first. After all, if I did something wrong or hit an error, I would only find out after training finished, so starting with all 7,330 folders could waste a lot of time.

The 400 characters:
瑛孝农丌璜胴媚鹌蜓岳懔讯镁况苫讨酷庹嵋兴孚乖看巾仇作觥捃肱襄肴怂跏拶戳榉艮膻磔冕谳赧溲道沽窕窑恋钩偷硇盈俚褊镢魄量饰妣鸰皈承裙乍佾茧尝毕歙冬蛩恽源身亏嗜巴靓婧捎跨鼩态黑抒遴贞瘳靿佘锔浥谤主溺隽蓂碡蛎惶鸡撙龉俘疾融世岢茬替邰舁俶孰儇区鹚误搴炫母导瞽锸腻卧韬鳙缯铠舨礻浈炻纠呖粜页樗狐振县簋听莠韦疯羌戥步玫祉辶咭挑赑艚劓跸飧葚栋敖葡部根忙恚筐种瓮泐逗颜珣≡油望∏峦於喔粢瘀胁游殊屯利怖萦属渝闪五郄卻誓肃俑旆褪璐鞒卺绤颂垌檐须缺栗谊炽婆黎诺密具册琇赖扌匈踬垩哌旨羲伟嘹钐嗖竹耐斝嚅泔畿鹅踝突痼粽缄躲幸嗦蜇矻煌滥蕻悻±昏廨投芤脒袒攘疫肠远飗瑕恹酶桩屉勒骓垡汊匦饽盆延蠓坻凫乒蔫氧蔬恿曰刁瘐权垭任笃锰秽枨攮整猱奕凸岵敬诫健蒯阋仳衢叨逞骚鲣短嵴建鬏克喈占嫠瀹牚砧冲咽仄马珀圣勾癣卮篮喽蟀杧晟香骎俜缣沆铷铨涎铹半平篷昊馋狁霾扶翱鲌挝禀儿栓燹酃邀麦换郡荛溅胰汛辀都谟尢偶哳静憖沭旁呵荇外骁疹霆诂匐驮样梵眭羼轿茳兀孀侵

Now it’s time to train! Let’s go!

Wait! Stop first!

For a beginner like me, Google Colab is the perfect tool: we can run programs and train AI for free using Google’s GPUs.

On the other hand, even though Google Colab lets you run on its GPU for up to 12 hours, you have to stay with the Colab tab the whole time, or the model will stop training and you will have to train it all over again (“Session Disconnect”).

This annoyed me very much, because I had to sit and wait with Google Colab all the time and had no time for other things, such as doing homework, playing with my friends, or reading books. It pretty much spoiled my life, no matter how much I wanted to learn about AI.

From: https://www.laninfotech.com/5-warning-signs-that-your-computer-is-about-to-die/

To make this clearer, let me tell you about the first time I trained a model. After school, I started the training and then went to do my homework. Three hours later, after homework, dinner, and other activities, I came back to see how the model had progressed, only to find a pop-up saying “Session Disconnect”. It angered me so much that I had to stay with Colab the whole time, unable to do anything else.

Also, every time I opened Colab for the first time in a day, I had to download the dataset again (around 15 minutes) and unzip it again (around 20 minutes). If a Session Disconnect happened once a day, I would waste an hour or more with no progress at all. I couldn’t tolerate this anymore.

Eureka! … After days of research, I found a way to run the notebook on a local device.

From: https://www.gigabyte.com/Graphics-Card/NVIDIA-Series

My computer runs Windows and has an Nvidia GPU that can be used to train the model. Even though it is not as good as Google Colab’s GPU, I wouldn’t have to stay with it all the time and could do other things while waiting.

Therefore, I had to use Jupyter Notebook to train the model on the local GPU.

These are the steps to get Jupyter Notebook:

1. Update to the latest Nvidia driver

2. Install Miniconda

3. In the Miniconda3 Prompt, run these commands:

conda update conda

conda install -c pytorch -c nvidia -c fastai fastai

conda install -c conda-forge jupyterlab

4. To run Jupyter Notebook: jupyter-lab

Now, with a Jupyter notebook that can do deep learning with Fastai or PyTorch, we can train our model without actually needing to sit with it all the time. It takes longer, since my GPU is slower than Google Colab’s, but the good outweighs the bad, and I began building and learning about AI more efficiently.

Anyway, to be safe, I checked whether PyTorch could really use the local GPU by running the code below. If your results look like the ones in the image below, PyTorch can work on the local GPU.

From: https://stackoverflow.com/questions/48152674/how-to-check-if-pytorch-is-using-the-gpu
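For reference, a minimal version of that check uses standard PyTorch calls:

import torch
print(torch.cuda.is_available())       # should print True if PyTorch can see the GPU
print(torch.cuda.device_count())       # number of GPUs found
print(torch.cuda.get_device_name(0))   # the name of your Nvidia card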

Now we are ready to start training. I had two choices, Fastai or PyTorch, and because I am just a novice and Fastai is much easier and faster, I chose Fastai.

What I am trying to build is a deep learning model capable of accurately classifying handwritten Chinese characters, so this is an image classification task.

First, we have to import the libraries that will help us manage the files and data we will use to train the model.

Second, because we will use Fastai, we also have to import Fastai and some related PyTorch libraries.
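A sketch of those imports (the exact list in my notebook may differ slightly):

import os, shutil                   # manage files and folders
from pathlib import Path            # work with file paths
from fastai.vision.all import *     # Fastai's vision tools, which pull in the PyTorch pieces we need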

So I took the 400 Chinese characters, put them in a text variable, and ran it through the code below to build a training set of 400 characters as a subset of the 7,330-character dataset.
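A sketch of that step, assuming the unzipped dataset lives under train/CASIA-HWDB_Train/Train and the subset goes into a new folder (both paths are illustrative):

import shutil
from pathlib import Path

chars = "瑛孝农"   # ...the full 400-character string shown above

src = Path('train/CASIA-HWDB_Train/Train')
dst = Path('train_400')
dst.mkdir(exist_ok=True)

for ch in chars:
    shutil.copytree(src/ch, dst/ch)   # copy that character's whole folder into the subset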

Then, we have to define our DataBlock, as shown below.
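Reconstructed from the point-by-point explanation that follows, the DataBlock looks roughly like this:

datablock = DataBlock(
    blocks=(ImageBlock, CategoryBlock),       # input: character images; output: predicted category
    get_items=get_image_files,                # collect the image files
    get_y=parent_label,                       # label = name of the containing folder
    splitter=RandomSplitter(valid_pct=0.2),   # random 20% validation set
    item_tfms=Resize(224))                    # resize every picture to 224 x 224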

Each point below explains one line of the code above:

  1. ImageBlock = input (the dataset of handwritten Chinese character images); CategoryBlock = output (the predicted Chinese character)
  2. get_items = get_image_files → gathers the image files for the input (ImageBlock)
  3. parent_label = labels each image with the name of the folder it is in
  4. valid_pct = 0.2 → validation set = a random 20% of the dataset
  5. Resize(224) = resizes every picture to 224 × 224 pixels before it is used to train the model

After that, we have to set up the dataloaders, which tell Fastai which folder holds the images that will be used to train the model (the train set).
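Something like this, where the folder name for the 400-character subset is illustrative:

dls = datablock.dataloaders(Path('train_400'), bs=64)   # bs (batch size) is illustrative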

Then, we have to define the model’s learner. A better learner increases the efficiency of the model.

In this line of code, dls is the dataloaders object, resnet34 is the pre-trained architecture, and we measure how good the model is using its error rate and accuracy.
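Reconstructed from that description, the line looks like this (in older Fastai versions vision_learner was called cnn_learner):

learner = vision_learner(dls, resnet34, metrics=[error_rate, accuracy])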

However, we also have to find the learning rate: how fast we want our model to learn. A learning rate that is too high or too low will degrade the model, so we find a good one with this code:

learner.lr_find()

You would get something like this:

The “valley” value is a good choice of learning rate.
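In recent Fastai versions, lr_find() also returns the suggestion directly, so you can capture it like this (a convenience, not required):

suggested = learner.lr_find()
print(suggested.valley)   # a good choice for the learning rate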

After that, we can finally train our model for real.

In this code, freeze_epochs is how many epochs the model trains with the pre-trained body frozen (only the new final layers’ weights are adjusted), epochs is how many times you then train the whole model on the full train set while adjusting all its parameters, and base_lr is your learning rate.
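Putting that together, the training call looks something like this (the epoch count and learning rate here are illustrative, not the exact values from my runs):

learner.fine_tune(10, base_lr=2e-3, freeze_epochs=1)   # 1 frozen epoch, then 10 full epochs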

I tried training my model on several pre-trained models, including Resnet34, Resnet18, and MobileNet V2, to see which one is the most compact while still being accurate and capable of being deployed offline on mobile devices and single-board computers. I started with the larger pre-trained models and slowly moved to smaller ones.

This is a comparison of different pre-trained models:

From: https://go.gale.com/ps/i.do?id=GALE%7CA688496286&sid=googleScholar&v=2.1&it=r&linkaccess=abs&issn=15510018&p=AONE&sw=w&userGroupName=anon~5fb050f0

As you can see above, the model trained with MobileNet V2 is the smallest at 13 MB, but its parameter count is also very small (compared with the Resnets). The next bigger pre-trained model is Resnet18, and Resnet34 is the biggest I could get my hands on.

Resnet34:

I tried this pre-trained model first, and after just a few epochs, the accuracy was already high!

Resnet18:

After seeing how accurate the Resnet34 model was, I decided to try training with Resnet18, which is smaller. It turned out the Resnet18 model is just as accurate as the Resnet34 one, in half the training time!

MobileNet V2:

Seeing how compact and accurate the Resnet18 model was, I also tried MobileNet V2, which is far smaller than Resnet18. As you can see from the table above, Resnet18 and MobileNet V2 took about the same time to train, but MobileNet V2’s accuracy is well behind. I believe that with more epochs, MobileNet V2’s accuracy might still increase a lot.

Also, the accuracy of the Resnet34 and Resnet18 models is not very different, even though Resnet34 took almost twice as long to train as Resnet18. The MobileNet V2 model trains in about the same time as Resnet18 and is less accurate, but it takes up much less space.

After that, I got quite confident and decided to train a model to recognize all the Chinese characters of HSK4: 1,200 words made up of 1,059 characters. Now it was time to train a model with a larger dataset and many more outputs.

The 1059 characters:

爱八爸杯子北京本不客气菜茶吃出租车打电话大的点脑视影东西都读对起多少儿二饭店飞机分钟高兴个工作汉语好号喝和很后面回会几家叫今天九开看见块来老师了冷里零六妈吗买没关系有米明名字哪那呢能你年女朋友漂亮苹果七钱前请去热人认识三商上午谁什么十时候是书水睡觉说四岁他她太听同学喂我们五喜欢下雨先生现在想小姐些写谢星期习校一衣服医院椅月再怎样这中国住桌昨坐做吧白百帮助报纸比别长唱歌穿次从错篮球到得等弟第懂房间非常务员告诉哥给公共汽司狗贵过还孩黑红迎答火站场鸡蛋件教室介绍进近就咖啡始考试可以课快乐累离两路旅游卖慢忙猫每妹门男您牛奶旁边跑步便宜票妻床千铅笔晴让日班身体病事情手表送虽然但它踢足题跳舞外完玩晚为问瓜希望洗笑新姓休息雪颜色眼睛羊肉药要也已经意思因所阴泳右鱼远运动早丈夫找着真正知道准备自行走最左阿姨啊矮安静把搬办法半包饱方被鼻较赛记必须变化宾馆冰箱而且才单参加草层差超市衬衫成绩城迟除船春词典聪扫算带担心糕当灯地铁图梯邮冬物园短段锻炼饿耳朵发烧放附复干净感冒趣刚根据跟更斤故刮风于害怕航板护照花画坏环境换黄河议或者乎极季节检查简健康讲降落角脚接街目结婚束解决借理久旧句定渴刻空调口哭裤筷蓝礼历史脸练辆聊邻居留楼绿马满帽条拿南难级轻鸟努力爬山盘胖皮鞋啤酒普通其实奇怪骑清楚秋裙容易如伞网声音世界瘦叔舒树数刷牙双平阳特疼提育甜头突腿碗万忘位文惯澡夏相信香蕉向像闻鲜用卡李熊需选择求爷般直银饮料应该响戏又遇元愿越张顾片只终种重周末主注急己总嘴业排全按之棒保证抱歉倍笨毕遍标格示演扬饼并博士管仅部擦猜材观餐厅厕江尝吵功诚乘惊抽烟厨传窗户粗存误案扮扰印招呼折针概使约戴刀导倒处底登牌低址掉丢堵肚童展律翻译烦恼反弃暑假松费份丰富否则符合父亲付款负责杂改赶敢速各资购够估计鼓励挂键众光广播逛规籍际汁程海洋羞寒汗码适盒悔厚互联怀疑忆活泼获积基础激及即划技术既继续寄油具价坚持减肥建将奖金交流郊区骄傲饺授受释尽紧禁止剧济验精彩景警察竞争竟镜究举拒绝距聚虑科棵咳嗽怜惜肯恐苦矿泉困垃圾桶拉辣懒浪漫虎拜貌厉俩连凉另利乱麻毛巾美丽梦迷密免秒民族母内耐龄弄暖偶尔队列判断陪批评肤脾篇骗乒乓瓶破葡萄签敲桥巧克戚况穷取缺却确闹任何扔仍入散森林沙伤量稍微勺社申深甚至命省剩失败傅纪收拾首售货输熟悉帅顺序硕死度塑袋酸随孙台抬态谈弹钢琴汤糖躺趟讨论厌供醒填停挺推脱袜往危险卫味温章污染无柿吸引咸羡慕详细橡消效辛封奋幸福性修许压膏亚洲呀严研盐养邀钥匙叶页切艺此象赢聘永勇优秀幽默尤由局谊愉与羽言预原谅阅云允志咱暂脏增占线整式支值职植指质围祝贺著专转赚仔尊座

The process of making this model and its new dataset is the same as what I showed above for the 400-character model.

Result of the new model of 1059 (HSK4) characters trained with Resnet34:

This model took about 7 hours to train; the increase in training time is because its dataset is bigger and it has more output classes.

However, I don’t really have that much time to wait, so for the 7,330-character model I decided to use Resnet18, which takes less time to train, though it would still take a long time because the dataset is so large.

Another reason I chose Resnet18 is that I didn’t want the model to be too big. I exported the HSK4 (1,059-character) model and found that it takes up 100 MB. That is only 1,059 characters, so a 7,330-character model trained with Resnet34 would be enormous, which pushed me to Resnet18 instead.
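For reference, exporting in Fastai is a single call (the file name here is illustrative; export() saves under learner.path):

import os
learner.export('hsk4_resnet34.pkl')                              # illustrative file name
print(os.path.getsize(learner.path/'hsk4_resnet34.pkl') / 1e6)   # size in MB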

Result of the new model of 7330 characters trained with Resnet18:

It took about 36 hours, a day and a half, to train this model, as its dataset is very large and there are so many different outputs to classify into. Even so, its accuracy reaches over 94%. If I had more time and ran a few more epochs, the accuracy might rise further: as you can see above, it was still increasing consistently between epochs. The exported model is 130 MB, which is still manageable for deployment.

I wanted to see more clearly how good my model is; however, there are up to 7,330 output classes in this model, so a confusion matrix is useless, and I couldn’t even plot one because my computer reported that it didn’t have enough memory.

However, I can make the model show its top losses.
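With Fastai’s interpretation tools, that is roughly:

interp = ClassificationInterpretation.from_learner(learner)
interp.plot_top_losses(9, figsize=(10, 10))   # show the 9 predictions with the highest loss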

Even though these are the top losses, most are still predicted correctly, with prediction probabilities very close to one. However, I couldn’t make the top-losses plot show the prediction and true label correctly, probably because they are Chinese characters: I think the plotter’s font doesn’t include Chinese glyphs, and I didn’t have enough time to fix this.

Now it’s time to test the model!

Deployment

Because the input to this model is a handwritten image of a Chinese character, Gradio’s canvas (sketchpad) input is very useful. I tried Gradio in my notebook and it worked flawlessly; however, to make it available to everybody, I needed to host it somewhere. Gradio apps are commonly hosted on Hugging Face Spaces, which is free and really does work well with Gradio!

Documentation:
https://tmabraham.github.io/blog/gradio_hf_spaces_tutorial

You can follow this documentation, which explains how to deploy a Fastai model with Gradio and host it on Hugging Face Spaces. Just what we are looking for.

Simplified Version:
1. Register and create a Space on Hugging Face
2. Upload the exported model into the Space
3. Create app.py with around 10 to 20 lines of code, depending on how many inputs/outputs you need
4. Create a requirements.txt file listing the libraries the model needs, such as Fastai

Hugging Face will automatically build and host your application for you.
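A minimal sketch of what app.py can look like for a single sketchpad input and a label output (an illustration based on the tutorial’s classic Gradio Interface style; my real app adds the model selector and the extra outputs described below, and the exported file name is illustrative):

import gradio as gr
from fastai.vision.all import *

learn = load_learner('export.pkl')   # the exported model uploaded to the Space
labels = learn.dls.vocab

def predict(img):
    img = PILImage.create(img)                      # convert the canvas drawing to a Fastai image
    pred, pred_idx, probs = learn.predict(img)
    top3 = probs.argsort(descending=True)[:3]       # keep only the 3 most likely characters
    return {labels[int(i)]: float(probs[i]) for i in top3}

gr.Interface(fn=predict, inputs="sketchpad", outputs="label").launch()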

This is my Application: https://huggingface.co/spaces/HuggingArmy/CC.1
You can go check it out.

How to use this application:

  1. Draw/write the Chinese character on the canvas labeled ‘img’. This is easier on a phone than with a mouse on a PC, so I recommend using a phone. (If you don’t know any Chinese characters but still want to try my model, you can choose any character from the box labeled ‘Choose_One’, which contains all 1,059 HSK4 characters.)

2. Choose the model you want to predict the character: the HSK4 (1,059 characters) model or the 7,330 characters model.

3. Press ‘Submit’ to make the model predict the character; the application will then show the model’s predictions (outputs).

4.1. Output 0 shows the 3 characters the model considers most likely to be the one you drew, along with the probability of each.

4.2. Output 1 shows the single most likely character, so you can copy it and look up its meaning/pronunciation elsewhere.

4.3. If the predicted character is among the 1,059 HSK4 characters, Output 2 shows the HSK4 words containing that character, with their meanings, pronunciations, and examples.

5. If you want to draw/write a new character, please don’t click the ‘clear’ button: there is a bug that freezes the whole application, and you can only use it again after refreshing the page. I don’t know why this happens, but after some research I found that many people have the same problem, and I didn’t have time to solve it. Instead, please use the ‘cross’ button in the top right corner of the canvas to clear what you have drawn.

6. While drawing/writing a Chinese character, you might make a mistake and feel irritated at having to erase and start over. Instead, you can use the ‘undo’ button next to the ‘cross’ button to delete the last line or dot you drew.

I recommend using my application on a phone, as it is easier to draw/write a Chinese character on the canvas there. Furthermore, the canvas on a phone is square, like most Chinese characters, which makes things easier. It looks like this:

Evaluation

I distributed the application among my friends and relatives so they could test the model. I asked them to draw/write some Chinese characters, capture the predictions, and send them to me. Because I didn’t have much time, I could only collect results from 15 people, each of whom did this for about 5 characters. These are the captured pictures; for each character there are 2 pictures, one per model.

In total, 60 characters were tested by 15 people, all of them HSK4 characters, so both models could in principle predict the correct one. In the picture shown above, the model predicted the character correctly, but the probability assigned to the correct character by the HSK4 (1,059 characters) model is almost always higher than that of the 7,330 characters model. This may be because the 7,330 characters model has about seven times more classes to choose from than the HSK4 model, and more characters resemble the predicted (correct) one, which spreads out the probability. Notice that, in the test prediction above for the 7,330 characters model, the second most likely character is very similar to the predicted (correct) character. Even so, the 7,330 characters model still predicted correctly.

Let me give you a scenario.

Draw/write incorrectly:

Draw/write correctly:

Notice that for the 7,330 characters model, the probability of the predicted character being right is very low when you draw/write the character incorrectly, but rises when you draw/write it correctly. This doesn’t mean the model won’t predict the correct character, but it may not do its best if you draw/write incorrect characters!

Another scenario:

Here, the HSK4 (1,059 characters) model predicted correctly, but the 7,330 characters model predicted the wrong character. I think this is because the 7,330-character dataset also includes equation signs such as + and =, Roman numerals, and many other non-Chinese symbols. Chinese is a pictographic language, so some Chinese characters can look very close to those symbols, making the model predict incorrectly. The next time I train this model, I will be sure to take those non-Chinese symbols out of the dataset, as I’m sure they are hurting the quality of the model.

Another scenario:

The most often mispredicted character I tested, so much so that the model could only get it right about 1 time in 10 for anybody, is 一 (yī), which I had previously thought would be the easiest for the model to predict. So I went and looked at the 一 (yī) folder and found this:

The person who made the dataset cropped out only the area that contains ink, so all the images in the 一 (yī) folder are wide rectangular images with a line occupying most of the space, as shown above.

After resizing, the line became much thicker, as shown above, and the model learned that 一 (yī) is a black rectangle. The images on the left turned into the images on the right.

So if you wanted the model to predict 一 (yī), you would have to draw a black rectangle like I did above.

Because the images of 一 (yī) are rectangular, unlike other characters whose images are roughly square, resizing disfigured them into black rectangles. I couldn’t resize them directly.

To solve the problem, I created a white square image and pasted each 一 (yī) image into the middle of the white square.
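A sketch of that fix with PIL (the folder path is illustrative, and it assumes the images are grayscale PNGs):

from PIL import Image
from pathlib import Path

def pad_to_square(path):
    img = Image.open(path).convert('L')
    side = max(img.size)
    square = Image.new('L', (side, side), 255)   # white square canvas
    square.paste(img, ((side - img.width) // 2, (side - img.height) // 2))   # center the stroke
    square.save(path)

for p in Path('/content/train/CASIA-HWDB_Train/Train/一').glob('*.png'):   # illustrative path
    pad_to_square(p)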

These are the 一 (yī) images after that:

I then retrained the HSK4 (1,059 characters) model after fixing the 一 (yī) folder, because the HSK4 model takes less time to train. I also updated the application on Hugging Face, so if you draw/write 一 (yī) with the HSK4 (1,059 characters) model, it will now predict correctly.

Further Work:

The 7,330 characters model has too many classes, which decreases its accuracy, and includes characters that are rarely used or seen, which makes the model larger. Therefore, I think I might make a model for each HSK level, which would make the models more efficient and smaller. Users could pick the model for the HSK level they are studying. Also, you wouldn’t have to waste time drawing/writing characters as perfectly as possible, because the model would have fewer characters to classify into.

As I mentioned at the beginning, I want to make a model that is small but accurate enough to deploy on offline devices, such as single-board computers. I don’t think I can rely on Gradio much longer and must make my own user interface. I will continue working towards this goal.

Update:
This is a demonstration of deploying my model, using OpenCV for the user interface and mouse handling, on a Raspberry Pi 3B+ with a 3.5-inch touchscreen.

https://youtu.be/FTc3QOWtqQ8

The device the model is deployed on needs at least 1 GB of RAM. This model was trained with Resnet34.

I also tried to deploy it on a Raspberry Pi Zero 2 W, which has a similar CPU to the Raspberry Pi 3B+ but only 512 MB of RAM, so the model wasn’t able to run on it. There was no issue deploying the model on a Raspberry Pi 4B.

I am still trying to deploy the model trained with MobileNet V2, which is more compact. I think the reason I couldn’t deploy it on the Raspberry Pi is a compatibility problem with the PyTorch version for the ARM CPU. I am trying to fix this right now.

This article is already quite long, so the explanation of how to deploy the Fastai model using OpenCV for the user interface and mouse handling on a Raspberry Pi will be a new article, coming soon.

I believe this project has many imperfect aspects and a long way to go, so I am still developing it. I will update my articles if there is any progress.
