Garbage Classification with ModelArts
In this article I will explain how to use the Huawei Cloud ModelArts service for image classification and demonstrate it with a garbage classification example. Let’s get started.
Introduction
Image classification in ModelArts is an AI modeling service on Huawei Cloud that you can use to predict the class of an image you give to the model.
For example, you can use it to predict whether an image shows an apple or an orange, or a cat or a dog. First you need a dataset of images for the classes you want to predict. ModelArts suggests having at least 100 images for each class you want to train. Suppose you want to train a model that predicts whether an image shows a cat, a dog or a dolphin. With three classes, you need at least 300 images (100 per class) to train that model.
How can you find a dataset like this? The first option is to collect the images manually from the internet. For example, you can search Google for cat pictures and add them to your dataset.
The second option is to use a website built to solve this problem, such as Kaggle. On Kaggle you can find thousands of datasets for many kinds of machine learning tasks. For our demonstration I found a garbage classification dataset and trained a model on it with the ModelArts image classification service.
The dataset has 6 classes of garbage: cardboard, glass, metal, paper, plastic and trash. Across all classes there are more than 2500 images in total, and every class has more than 100 images, so ModelArts should work properly. Let’s begin!
Garbage classification with ModelArts
As mentioned before, we first need a dataset, and we don’t want to collect all the images ourselves. So we go to Kaggle and find a suitable dataset.
Here is the dataset we are going to use.
Download the dataset with the button in the upper right corner.
Extract the archive.zip file.
These are the classes you are going to use for model training. If your dataset is not sorted into one folder per class like this, you should sort it yourself.
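Before uploading, it can help to confirm that the extracted dataset really has one folder per class and enough images in each. The following is a minimal sketch, assuming the images sit directly inside each class folder; the function names, the list of image extensions and the `garbage-dataset` path in the usage note are illustrative assumptions, not part of ModelArts or the Kaggle dataset.

```python
from pathlib import Path

# Assumed set of image file extensions; adjust to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}

# Minimum number of images per class suggested in this article.
MIN_IMAGES_PER_CLASS = 100


def count_images_per_class(dataset_root):
    """Return {class_folder_name: image_count} for each subfolder of dataset_root."""
    counts = {}
    for class_dir in sorted(Path(dataset_root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def classes_below_minimum(counts, minimum=MIN_IMAGES_PER_CLASS):
    """Return the names of classes that have fewer than `minimum` images."""
    return [name for name, n in counts.items() if n < minimum]
```

For example, `classes_below_minimum(count_images_per_class("garbage-dataset"))` would return an empty list if every class folder under the hypothetical `garbage-dataset` directory holds at least 100 images.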
Now we have a prepared dataset for model training. We need to upload it to Huawei Cloud OBS, because ModelArts only lets you use datasets from your OBS buckets. From the browser console you can upload only 100 images at a time. With the OBS Browser+ application, you can upload as many images as you want. So download OBS Browser+ and upload your dataset to your OBS bucket.
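To see why the 100-image limit of the browser console matters, here is a small sketch that splits a list of files into groups of at most 100. It only illustrates the arithmetic and does not call any OBS API; the `batches` helper and the file names are hypothetical.

```python
def batches(items, size=100):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical file list standing in for a dataset of 2500 images.
files = [f"image_{i}.jpg" for i in range(2500)]
upload_rounds = list(batches(files, size=100))
```

With 2500 files, `len(upload_rounds)` is 25, so the browser console would need 25 separate uploads, which is why OBS Browser+ is the more practical choice here.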
First, create a bucket.
This part is important because you need to choose the same region that you are going to use for ModelArts. For me it is Singapore, so we select it. Then we name the bucket and create it.
Then we create 3 folders inside our bucket: input, output and input-empty. The input folder is for our dataset and the output folder is for our model training outputs. You might ask why we need the input-empty folder. When you create an ExeML project from the ModelArts service, you can’t import your labelled data directly. You need to use Data Management to create your own labelled dataset, but a dataset created from Data Management doesn’t show up in the ExeML section. I think this is a minor bug that will be fixed soon. For now, we use an empty folder to create a dataset from the ExeML section, and then import our labelled images into that dataset. That will do the trick.
We go into our input folder and upload our sorted images, folder by folder.
Now we are ready to train our garbage classification model with the ModelArts service.
We go to the ModelArts service console. Click ExeML in the left bar, select Image Classification and create a project.
Name your project and select Create to create your own dataset. Name your dataset and select its input and output paths.
For the input path, select input-empty to create an empty dataset. We are going to import our labelled image dataset after the dataset is created.
Then select your output folder as the output path.
Now you have created your empty dataset. Click Data Management in the left navigation bar, then click your dataset’s name to open it.
A labeling job should have been created together with the dataset. Click the labeling job in the right corner of the page.
Click Add data to import your labelled image classification dataset.
Select OBS > Path > Labelled > Image classification > ModelArts ImageNet 1.0.
Now, for the dataset path, point to the real input folder that contains the images. Click OK and wait for synchronization.
After the upload has finished, the result should look like below.
Go back to the ExeML section and select the ExeML project that you created.
Confirm the information and data, then click Train.
Set your training ratio to 0.8. Select accuracy_first from the training preference drop-down list. Click Next and start the training process.
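To make the training ratio concrete, here is a minimal sketch of a 0.8 split, assuming the ratio means that 80% of the labelled images are used for training and the remaining 20% for validation. ModelArts performs its own split internally, so this is only an illustration; the `split_dataset` helper, the seed and the file names are hypothetical.

```python
import random


def split_dataset(items, train_ratio=0.8, seed=42):
    """Shuffle a copy of `items` and split it into (train, validation) lists."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be between 0 and 1")
    shuffled = list(items)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_ratio)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical file names standing in for 1000 labelled images.
images = [f"img_{i}.jpg" for i in range(1000)]
train, validation = split_dataset(images, train_ratio=0.8)
```

With 1000 images and a ratio of 0.8, `train` holds 800 images and `validation` holds 200, with no image appearing in both lists.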
After training is completed, you can see the metrics of your image classification model like below.
Now we need to deploy our model and test it. To deploy, click the Deploy button in the Version Manager section at the upper left corner.
Select the specifications, enable Auto Stop and set it to 1 hour. Click Next and submit.
Wait until deployment is finished.
Now we can try our model. Download some images from Google that your model can classify and upload them to the deployed ModelArts service like below. Try to find images that are not in your dataset; that makes your test more meaningful. Then click Predict.
Here are some examples from my dataset, classified with the garbage classification model that we created together. You can see the results in the Test Result section.
Final thoughts and Summary
In this article we trained an image classification model with the Huawei Cloud ModelArts service. We used a public garbage classification dataset from Kaggle and trained the model to classify garbage from images.
As you can see, you can easily create image classification models with the Huawei Cloud ModelArts service and deploy them. You can combine this service with other services to build more complex projects. ModelArts also lets you call the deployed model through an API; see the API Reference for details.
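If you call the deployed model through its API, your program has to read the prediction out of the response. The sketch below assumes the response is a JSON object with a `predicted_label` field and a `scores` list of `[label, score]` pairs; this shape is an assumption made for illustration, so check the actual response of your own service. The `top_prediction` helper and the sample response are hypothetical.

```python
import json


def top_prediction(response_text):
    """Return (label, score) for the highest-scoring class in the response.

    Assumes a JSON body such as:
    {"predicted_label": "...", "scores": [["label", "0.9"], ...]}
    """
    result = json.loads(response_text)
    scores = [(label, float(score)) for label, score in result.get("scores", [])]
    if not scores:
        # Fall back to the label alone when no scores are present.
        return result.get("predicted_label"), None
    return max(scores, key=lambda pair: pair[1])


# Made-up response text for illustration only, not real model output.
sample_response = (
    '{"predicted_label": "glass", '
    '"scores": [["glass", "0.91"], ["plastic", "0.06"], ["metal", "0.03"]]}'
)
label, score = top_prediction(sample_response)
```

For the made-up sample above, `label` is `"glass"` and `score` is `0.91`.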