Starting my Journey into Deep Learning with a Currency Classifier App
I had heard about fast.ai from many friends, and last week I finally gave it a try. The lecturer is Jeremy Howard. In the first lesson, he talked about the top-down learning style, which differs from the traditional bottom-up one: he showed us how to build a deep learning model and deploy it to production right from the start. This was a bit overwhelming, but it was exactly what I was looking for, so I was super excited.
After finishing the first lesson, I started building an app to classify Vietnamese currency. The idea was to help foreigners get familiar with our money. The main feature is capturing a photo in the web app using the webcam and recognizing the note before converting its value into US dollars (for example).
I began with a plan and listed some technical questions:
- Project Baseline 1: Collect data and build a model to classify Vietnamese currency using fastai.
* How to collect data?
* How to train the model with the data?
* How to export the model so that it can be used in production?
- Project Baseline 2: Create a Flask app that lets the user take a photo with the webcam and upload it to the backend server.
* How to access webcam and capture photos on the browser?
* How to upload the photo to the server?
- Project Baseline 3: Integrate the model from step one into the app and show the result to the user.
* How to import the model into the app?
* How to classify the image and return results?
It sounded like a good plan, so here we go.
How to collect data?
Following the instructions in the lesson, I used the tool Google Image Download. It was straightforward, and I downloaded hundreds of images of Vietnamese currency automatically. Here are some examples:
As you can see, I wrote a short script to rename the files to the pattern className_index.jpg so that they would match the naming of the data Jeremy used in the first lesson.
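The original script isn't shown here, so the following is only a minimal sketch of how such a rename could look. It assumes each class has its own folder of downloaded .jpg files; the function name rename_images and the two-pass approach are my own choices for illustration.

```python
from pathlib import Path


def rename_images(class_dir, class_name):
    """Rename every .jpg in class_dir to <class_name>_<index>.jpg.

    Returns the list of new paths. Non-jpg files are left untouched.
    """
    class_dir = Path(class_dir)
    # Sort first so the numbering is reproducible between runs.
    originals = sorted(p for p in class_dir.iterdir() if p.suffix.lower() == ".jpg")

    # Pass 1: move everything to temporary names, so that pass 2 can never
    # overwrite a file that has not been renamed yet.
    temps = []
    for i, path in enumerate(originals):
        tmp = class_dir / f"__tmp_{i}.jpg"
        path.rename(tmp)
        temps.append(tmp)

    # Pass 2: assign the final className_index.jpg names.
    renamed = []
    for i, tmp in enumerate(temps):
        target = class_dir / f"{class_name}_{i}.jpg"
        tmp.rename(target)
        renamed.append(target)
    return renamed
```

With a hypothetical folder images/10000 holding the downloads for one denomination, rename_images("images/10000", "10000") would leave files named 10000_0.jpg, 10000_1.jpg, and so on.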
Training the model
I followed the lesson 1 notebook to import the data. The only change is the file name pattern:
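from fastai.vision import *  # fastai v1, as in the lesson 1 notebook

# Capture the class name from file names like .../className_index.jpg
pat = r'/([^/]+)_\d+\.jpg$'
data = ImageDataBunch.from_name_re(
    path_img, fnames, pat, ds_tfms=get_transforms(), size=224, bs=bs
).normalize(imagenet_stats)
data.show_batch(rows=3, figsize=(7,6))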
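To see what the file name pattern actually extracts, here is a small standalone sketch using only Python's re module. It uses the same pattern as above, with the dot before jpg escaped. The file paths and class names are made up for illustration.

```python
import re

# Capture everything between the last "/" and "_<index>.jpg".
pat = re.compile(r'/([^/]+)_\d+\.jpg$')


def label_from_path(path):
    """Return the class name encoded in a file path, or None if it doesn't match."""
    match = pat.search(path)
    return match.group(1) if match else None


# Hypothetical paths; the real class names depend on how the images were renamed.
examples = ["data/vnd/10000_12.jpg", "data/vnd/500000_3.jpg", "data/vnd/notes.txt"]
labels = [label_from_path(p) for p in examples]
print(labels)  # ['10000', '500000', None]
```

Anything that doesn't follow the className_index.jpg convention yields None, which is a quick way to spot files that were missed by the rename step.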
Then I built a model using the ResNet34 architecture, as the course does, and trained it for four epochs (1 epoch = one pass over all the images). The error rate was around 28%.
Continuing with the lesson, I tuned the learning rate and chose a range from roughly 0.0001 to 0.001. Then I unfroze the model and trained it for 20 epochs:
This brought the error rate down to around 6.5%, which was better than I expected. There was still a lot to improve in both the model and the data, but I set that aside for the moment to move on to building the web app. So I exported the model to a file, export.pkl, and finished my work on Google Colab.
# This will create a file named 'export.pkl' in the directory
# where we were working that contains everything we need
# to deploy our model
learn.export()
Building the Flask app
Luckily, I found this tutorial on YouTube. One thing to note is that the tutorial passes the webcam stream to createObjectURL(), which is deprecated for media streams. You should use video.srcObject = stream instead.
I changed the UI a bit using Bulma, a free, lightweight, open-source CSS framework that needs no JavaScript.
Then I wrapped the HTML page in a simple Flask app and moved on to looking for a tool to upload the image captured by the user. I chose FilePond. FilePond’s documentation is excellent; you can find literally everything you need there.
This part of the project contained the most time-consuming tasks, tedious debugging, and endless searching on Google and Stack Overflow, so I’m not going to bore you with it now. To summarize, it took me 10 hours to get to this point:
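For reference, the server side of such an upload can be quite small. The sketch below is not the app's actual code. It assumes the photo arrives as a multipart form field named "filepond" (FilePond's default field name, as far as I know), the /upload route is an arbitrary choice, and classify_image is a hypothetical placeholder where the exported model would be called.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def classify_image(image_bytes):
    """Hypothetical placeholder for the model call.

    In the real app, this is where the exported model would be loaded and
    asked for a prediction. Here it returns a fixed label so that the sketch
    runs without the model file.
    """
    return "unknown"


@app.route("/upload", methods=["POST"])
def upload():
    # Assumption: the uploader sends the photo under the field name "filepond".
    photo = request.files.get("filepond")
    if photo is None:
        return jsonify(error="no file uploaded"), 400
    label = classify_image(photo.read())
    return jsonify(label=label)
```

Returning JSON keeps the front end simple: the page can read the label from the response and display it next to the captured photo.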
Conclusion
It turned out that the model performed poorly in my application, even though its accuracy was about 94% in the Jupyter notebook. I came up with what I think are the two most important reasons:
- The distribution of the training data is different from that of the photos users take with the web app.
- The training set was too small.
So I need more data taken by users with a webcam. That brings me to part 2 of this project. Here is a brief overview:
- Deploy the app (on Google Cloud, because I’m learning it)
- Store the uploaded images (Cloud Storage)
- If the model classifies a note incorrectly, give the user a way to correct it (this needs a database; Firestore may be good for this)
Even though the app didn’t work perfectly, I was amazed by the results and look forward to working on part 2.