Machine learning in the Dart programming language

Ilia Gyrdymov
Jun 7, 2022

Hi all!

My name is Ilia, and I am a software engineer. My main activity is front-end development. The first "serious" programming language I met after JavaScript was Dart — at one time, my team developed the front-end of our project in this language.

A few years ago, I became interested in machine learning and studying machine learning algorithms became my hobby. One day I thought, why not combine my hobby and development in Dart? After checking the repositories on GitHub and the packages on pub.dev, I was convinced that no one had yet created anything from the field of Machine Learning in this language, so I decided to make a library with a set of popular Machine Learning algorithms in Dart. That's why the ml_algo library and several related libraries that form the entire ecosystem were born. In this article, I would like to introduce you to the libraries mentioned above, focusing on ml_algo.

The ecosystem consists of the following libraries:

  • ml_algo: implementation of Machine Learning algorithms
  • ml_dataframe: data storage that can contain both processed and raw data. The central entity of the library is the DataFrame class, which is analogous to a pandas DataFrame.
  • ml_preprocessing: implementation of algorithms for data preprocessing
  • ml_linalg: the Vector and Matrix classes for efficient mathematical calculations. ml_algo uses this library as its internal representation of data.

The ml_algo library includes algorithms for linear regression, non-linear regression, linear classification, non-linear classification, and clustering and retrieval.

Let's take a look at a simple example of using the library.

Let's say we want to train a linear regression model (you can read what Linear Regression is here). As training data, we can use a toy dataset, the wine quality dataset, which the ml_dataframe library provides for demonstration purposes.

First, let's add the necessary dependencies to pubspec.yaml:

dependencies:
  ml_algo: ^16.15.1
  ml_dataframe: ^1.5.0

Then we load the data to train the model:
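A minimal sketch of this step, assuming the getWineQualityDataFrame helper that ml_dataframe exposes for its bundled toy dataset:

```dart
import 'package:ml_dataframe/ml_dataframe.dart';

void main() {
  // Load the bundled wine quality toy dataset as a DataFrame
  // (assumes getWineQualityDataFrame is available in your ml_dataframe version)
  final data = getWineQualityDataFrame();

  // Quick sanity check: the target column should be among the column names
  print(data.header.contains('quality'));
}
```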

The data variable is a DataFrame instance. Let's print the variable to understand the structure of the data:
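Something along these lines should do it. This is a sketch that assumes the same getWineQualityDataFrame helper as before and the header getter of DataFrame:

```dart
import 'package:ml_dataframe/ml_dataframe.dart';

void main() {
  final data = getWineQualityDataFrame();

  // Printing a DataFrame shows a textual preview of its columns and rows
  print(data);

  // The column names alone, if the full preview is too noisy
  print(data.header);
}
```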

It outputs the following:

The data consists of 11 independent variables, also known as features, and one dependent variable, our target: we want to be able to predict the value in the "quality" column.

Next, let's split the data into train and test sets. The first set is used to train the model, and the second one to evaluate the model's quality. We can use the splitData function from the ml_algo library to split the data:
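A sketch of the split, assuming splitData returns a list of DataFrame instances in the order of the requested ratios (the variable names are mine):

```dart
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';

void main() {
  final data = getWineQualityDataFrame();

  // One ratio yields two parts: the requested share and the remainder
  final splits = splitData(data, [0.7]);
  final trainData = splits.first;
  final testData = splits.last;

  // Compare the sizes of the two parts
  print('Train rows: ${trainData.rows.length}');
  print('Test rows: ${testData.rows.length}');
}
```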

As you may have noticed, we called the splitData function with the argument [0.7], which means we want to use 70% of the source data to train the model.

Next, we train the model by calling the LinearRegressor constructor:
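Roughly like this, reusing the split from the previous step. It's a sketch with my own variable names, and it relies only on the two positional arguments described in this article:

```dart
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';

void main() {
  final data = getWineQualityDataFrame();
  final trainData = splitData(data, [0.7]).first;

  // Training happens inside the constructor call:
  // first argument is the training data, second is the target column name
  final model = LinearRegressor(trainData, 'quality');

  print(model.runtimeType);
}
```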

The constructor accepts the training data as the first argument and the target column name as the second. By default, a closed-form solution is used to train the model.
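To give a feeling for what "closed-form" means, here is a tiny plain-Dart sketch for the simplest case of a single feature. It is not ml_algo's implementation, and fitLine is a hypothetical helper. It only illustrates that the coefficients come straight from a formula, with no iterative optimisation:

```dart
/// Closed-form least squares for one feature: y ≈ slope * x + intercept.
/// Returns [slope, intercept].
List<double> fitLine(List<double> xs, List<double> ys) {
  final n = xs.length;
  final meanX = xs.reduce((a, b) => a + b) / n;
  final meanY = ys.reduce((a, b) => a + b) / n;

  var covXY = 0.0; // sum of (x - meanX) * (y - meanY)
  var varX = 0.0; // sum of (x - meanX)^2
  for (var i = 0; i < n; i++) {
    covXY += (xs[i] - meanX) * (ys[i] - meanY);
    varX += (xs[i] - meanX) * (xs[i] - meanX);
  }

  final slope = covXY / varX;
  final intercept = meanY - slope * meanX;
  return [slope, intercept];
}

void main() {
  // Points that lie exactly on y = 2x + 1
  print(fitLine([1, 2, 3, 4], [3, 5, 7, 9])); // prints [2.0, 1.0]
}
```

With many features, the same idea is expressed with matrices instead of two sums, but the principle stays the same.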

Let's assess the model. To do so, we need to choose a proper metric. Mean absolute percentage error (MAPE) is a good metric for regression problems. Let's use it:
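A sketch of the assessment, assuming the MetricType.mape enum value from ml_algo. The exact number you get depends on how the data ends up being split:

```dart
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';

void main() {
  final data = getWineQualityDataFrame();
  final splits = splitData(data, [0.7]);
  final trainData = splits.first;
  final testData = splits.last;

  final model = LinearRegressor(trainData, 'quality');

  // Evaluate on data the model has not seen during training
  final error = model.assess(testData, MetricType.mape);
  print(error);
}
```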

To assess the model, we called the assess method and passed it the chosen metric type along with the test data. The last instruction prints the following:

0.07042532225989101

One can interpret it as "On average, we predicted the target value with a 7% error". Let's compare predicted values and the actual ones:
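Here is one way to do the comparison. Treat it as a sketch: it assumes dropSeries accepts a names argument, that predict returns a DataFrame with one predicted value per row, and it simply takes the first five rows, so the printed values won't necessarily match the ones shown in this article:

```dart
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';

void main() {
  final data = getWineQualityDataFrame();
  final splits = splitData(data, [0.7]);
  final trainData = splits.first;
  final testData = splits.last;
  final model = LinearRegressor(trainData, 'quality');

  // Remove the target column to imitate unlabelled data
  final unlabelled = testData.dropSeries(names: ['quality']);
  final prediction = model.predict(unlabelled);

  // Take the first five values from both sets (an arbitrary choice)
  final actual = testData['quality'].data.take(5);
  final predicted = prediction.rows
      .take(5)
      .map((row) => (row.first as num).toStringAsFixed(2));

  print('Actual values: $actual');
  print('Predicted values: $predicted');
}
```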

First, we dropped the "quality" column from testData by calling the dropSeries method of the DataFrame class. By doing this, we imitated unlabelled data. Lastly, we took an arbitrary range from both sets of values to compare them.

The last two instructions output the following:

Actual values: (7, 6, 7, 5, 5)
Predicted values: (6.18, 5.82, 6.49, 5.36, 5.36)

which looks quite reasonable.
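As a sanity check, we can compute MAPE by hand for just these five values. The mape function below is a plain-Dart sketch of the metric's definition, not the library's code:

```dart
/// Mean absolute percentage error: the mean of |actual - predicted| / |actual|.
double mape(List<double> actual, List<double> predicted) {
  var sum = 0.0;
  for (var i = 0; i < actual.length; i++) {
    sum += ((actual[i] - predicted[i]) / actual[i]).abs();
  }
  return sum / actual.length;
}

void main() {
  final actual = <double>[7, 6, 7, 5, 5];
  final predicted = <double>[6.18, 5.82, 6.49, 5.36, 5.36];

  print(mape(actual, predicted).toStringAsFixed(4)); // prints 0.0728
}
```

That's about 7.3% on this small slice, in line with the roughly 7% error measured on the whole test set.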

If we're okay with our model, we can save it to a JSON file to skip retraining next time:
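A sketch of saving, assuming the saveAsJson method of the model. The file name is just an example of mine:

```dart
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';

void main() async {
  final data = getWineQualityDataFrame();
  final trainData = splitData(data, [0.7]).first;
  final model = LinearRegressor(trainData, 'quality');

  // Serialize the trained model to a JSON file
  // ('wine_quality_model.json' is a hypothetical file name)
  await model.saveAsJson('wine_quality_model.json');
}
```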

To restore the model, we can do the following:
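A sketch of restoring, assuming the LinearRegressor.fromJson constructor and a model previously saved to a file with the hypothetical name used here:

```dart
import 'dart:io';

import 'package:ml_algo/ml_algo.dart';

void main() async {
  // Read the JSON produced when the model was saved
  // ('wine_quality_model.json' is a hypothetical file name)
  final file = File('wine_quality_model.json');
  final json = await file.readAsString();

  // Rebuild the model from its serialized form, no retraining needed
  final model = LinearRegressor.fromJson(json);

  print(model.runtimeType);
}
```

The restored model can then be used for prediction and assessment in the same way as a freshly trained one.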

That's pretty much it!

So, it's the right time to find out what Linear Regression is.

If you have any questions, you can reach me on Twitter.

Cheers :)

Ilia Gyrdymov

Frontend engineer (Dart, Vue/Vuex/Nuxt, React/Redux + Typescript) with an interest in Machine Learning, living in Cyprus