Dry_bean Variety Classification Model

Tom Dunand
6 min read · Jan 19, 2023


Experience nocodingAI

https://pro.nocodingai.com/

Dry_bean Variety Classification Model Training

This tutorial is implemented in nocodingAI using the Dry_Bean_Data_Set provided by the UCI Machine Learning Repository.

Table of Contents
Introduction
CSV Data Download
Data Loading
Data Normalization
Building a Model Architecture
Model Training
Inference

Introduction

This tutorial trains an artificial intelligence model that classifies dry beans into seven varieties using the statistical features provided by the Dry_Bean_DataSet.

To this end, we will go through the following steps.

Load the original data required for training.
Process and normalize the loaded data for the model using the data preprocessor.
Define the architecture of the model.
Train the model and monitor performance during training.

What You Will Learn

How to preprocess data in nocodingAI
How to define an artificial intelligence model architecture in nocodingAI
How to run inference using a model trained in nocodingAI

Prerequisites

The latest version of Chrome or another up-to-date browser.
Familiarity with the basic usage of nocodingAI.
If you need an introduction to the basics, please refer to the nocodingAI Getting Started Guide.

CSV Data Download

The dataset used in this tutorial is the Dry_Bean dataset.
It contains 16 features describing characteristics of each bean such as its shape, type, and structure, and classifies the beans into 7 varieties.

First, go to the following address to download the dataset.

https://nocoding-ai.s3.ap-northeast-2.amazonaws.com/test/Dry_Bean_DataSet.csv/Dry_Bean_DataSet.csv

Please open the downloaded CSV file in a text editor such as Notepad.

A CSV file represents tabular data using commas and line breaks: a line break separates rows, and a comma separates columns.

If you open the file, you can see that it has the form above: tabular data consisting of 13,611 rows and 17 columns.

You can find brief information about this dataset in the top row. It describes a dataset with 13,611 rows, where each row has 16 columns of input data plus a class column, categorized into 7 varieties.
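To make the row/column structure concrete, here is a minimal sketch of how a comma-separated table like this one parses into rows and columns. The column names below match the dataset's first few columns, but the two data rows are illustrative values, not guaranteed to be rows from the actual file.

```python
import csv
import io

# A tiny stand-in for the downloaded file: a header line plus two data rows.
sample = """Area,Perimeter,MajorAxisLength,Class
28395,610.291,208.178,SEKER
28734,638.018,200.525,SEKER
"""

# Line breaks separate rows; commas separate columns.
rows = list(csv.reader(io.StringIO(sample)))
header, data = rows[0], rows[1:]
print(header)     # the column names from the first line
print(len(data))  # the number of data rows
```

The real file parses the same way, just with 13,611 data rows and 17 columns per row.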

Data Loading

I will now load the downloaded CSV data.
Go to the data cleaner page and press the create button.

You can give a name and a description to your dataset; here we will call it DryBeanDataSet and set the Data type to CSV. Then click on Create Empty. You can also use the pre-built dataset named bean classification data.

Once your empty dataset is created, you have to upload the CSV file. You can just drag and drop the file here, or use the button to open your files.

After uploading the CSV file, you can see your data.

If you have successfully loaded the data, we will now preprocess the data.

We want to normalize our data. To do so, click the corresponding select button, select every column except the class column, and confirm. After this, we can save the dataset and go back to the main menu.
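As a sketch of what this normalization step does under the hood: each selected feature column is rescaled while the class column is left untouched. The code below assumes min-max scaling to [0, 1] (a common default for this kind of preprocessor) and uses a small made-up table for illustration.

```python
import pandas as pd

# Made-up sample rows with the same shape as the real data:
# numeric feature columns plus a Class column.
df = pd.DataFrame({
    "Area": [28395.0, 28734.0, 29380.0],
    "Perimeter": [610.291, 638.018, 624.110],
    "Class": ["SEKER", "SEKER", "SEKER"],
})

# Normalize every column except Class to the [0, 1] range.
feature_cols = [c for c in df.columns if c != "Class"]
df[feature_cols] = (df[feature_cols] - df[feature_cols].min()) / (
    df[feature_cols].max() - df[feature_cols].min()
)
print(df[feature_cols].min().tolist())  # each feature now starts at 0.0
print(df[feature_cols].max().tolist())  # and ends at 1.0
```

Keeping the class column out of the normalization matters: it is a categorical label, not a measurement, so rescaling it would be meaningless.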

Building a Model Architecture

In this section, you will learn how to configure the model architecture. The model architecture answers the question, “What functions will be executed when the model runs?” In other words, it represents the algorithm the model will use to compute its answer.

To create your model, click on the model designer button and then create a new empty model.

First create the different layers and connect them in this order:

Input / Dense / Dropout / Dense / Dropout / Dense / Dropout / Dense / Dropout / Dense / Output

Also add the dataset created previously, and keep the default training percentage.

Now, let’s enter weights for model learning for each layer.

First, in the first Dense layer, set the unit value in the advanced settings to 80 and set the activation to relu.
For the Dropout layer, set the rate in the advanced settings to 0.3.

Second layer
Dense = unit: 40, activation: relu
Dropout = rate: 0.3

Third layer
Dense = unit: 20, activation: relu
Dropout = rate: 0.3

Fourth layer
Dense = unit: 10, activation: relu
Dropout = rate: 0.3

Last Dense layer
unit: 7, activation: softmax

For the unit value of the last Dense layer, enter the number of bean varieties that need to be classified.
The model we are training classifies beans into 7 varieties, so we enter 7 as the unit value.

The model must look like this.
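For readers curious what this layer stack corresponds to in code, here is a sketch of the same architecture in Keras (assuming TensorFlow is installed; nocodingAI builds this for you, so the code is only illustrative).

```python
import tensorflow as tf

# The same stack as in the designer:
# Input -> 4x (Dense + Dropout) -> final softmax Dense.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(16,)),              # 16 feature columns
    tf.keras.layers.Dense(80, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(40, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(20, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(7, activation="softmax"),  # 7 bean varieties
])
model.summary()
```

The dropout layers randomly zero out 30% of activations during training, which helps prevent the dense layers from overfitting the 13,611 training rows.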

Model Training

Once the model architecture has been constructed and data preprocessed, everything is ready to train the model.

Click on the Start Training button and configure the settings.

Optimizer: the algorithm used to optimize the model's weights.
Loss function: the function that measures how well the model is learning during training.
Evaluation metric: a value used to monitor the training process.

Set the optimizer to Adam, the loss function to sparseCategoricalCrossentropy, and the evaluation metric to accuracy.

Set the learning rate to 0.001, the epochs to 100, and the batch size to 256.

An epoch is one complete pass of the model over the entire training set.
The batch size determines how many samples are used for each weight update.

For the X columns you should pick all the columns except Class, and for the Y columns only the Class column.

When everything is ready, press the ok button to proceed with the training.
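Continuing the illustrative Keras sketch, these training settings correspond roughly to the compile and fit calls below. The data here is random stand-in data (and the model is shrunk to a single hidden layer, with epochs reduced to 2, purely to keep the sketch fast); in the tutorial the trainer uses the real normalized dataset with 100 epochs.

```python
import numpy as np
import tensorflow as tf

# Random stand-in data: 256 samples, 16 normalized features, labels 0..6.
X = np.random.rand(256, 16).astype("float32")
y = np.random.randint(0, 7, size=256)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(16,)),
    tf.keras.layers.Dense(80, activation="relu"),
    tf.keras.layers.Dense(7, activation="softmax"),
])

# Adam optimizer at learning rate 0.001, sparse categorical cross-entropy
# loss (labels are integer class indices), accuracy as the metric.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# batch_size=256 as in the tutorial; epochs shortened for the sketch.
history = model.fit(X, y, epochs=2, batch_size=256, verbose=0)
print(history.history["accuracy"])
```

Sparse categorical cross-entropy is the right loss here because the Class column is stored as a single integer label per row rather than a one-hot vector.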

You can follow the training using the Training Graph button, like this:

Inference

Now that the model training has been completed, we will use the trained model to infer.

Go on the App builder menu and create a new empty project.

Add the model we just trained, an input text and output raw.

Change the IDs for the input and output using the select bar under the TEXT label; the names of the different columns of our dataset should then appear for the input, and numbers from 0 to 6 for the output.

You can rename the output labels with the different bean varieties to make them clearer.

The different varieties from 0 to 6 are:

You can also add more content to make the design more pretty or add description.

Once you’re done, you can publish your project and click the link to open it in a new tab. Once it is open, you can run inference with your model: give it some normalized values and it will predict the class of the bean. I tested it with the values from the first line of our dataset, and the predicted class matched.
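What happens at inference time can be sketched in a few lines: the model outputs 7 probabilities, and the index of the highest one is the predicted variety. The probability values and the generic 0-6 label names below are made up for illustration; check your own dataset's class encoding for the real mapping.

```python
import numpy as np

# Hypothetical label order for indices 0..6.
class_names = [f"variety_{i}" for i in range(7)]

# Made-up softmax output for one normalized input row:
# the 7 values sum to 1, and the largest one wins.
probs = np.array([0.02, 0.05, 0.01, 0.80, 0.04, 0.05, 0.03])
pred_index = int(np.argmax(probs))
print(pred_index, class_names[pred_index])  # 3 variety_3
```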

This is the end of the tutorial; you can find many other tutorials to keep learning more about nocodingAI. Thank you for joining us.
