Data Story: Model optimization through user participation

By Dénes Bartha

DKatalis
Mar 7, 2022


Do you track your spending habits every month?

Most people no longer write down their expenses with pen and paper. Instead, they usually have a dedicated expense-tracking app on their smartphone, where they enter each and every transaction manually.

But users of our financial solution, Jago, can skip most of the manual work when tracking their expenses. With the Spend Analysis feature, users can categorize each transaction themselves or wait for our AI to sort it for them.

Currently, there are more than 90 categories covering a wide range of transactions, from bills, shopping, and savings to social events. Despite this long list of categories, the automated categorization reaches 90% precision and 80% recall.
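To make the two metrics concrete, here is a minimal sketch of how precision and recall can be computed per category. The toy labels and the function name are illustrative assumptions, and the article does not say how the scores were averaged across categories, so this is not necessarily how the 90% and 80% figures were produced.

```python
from collections import Counter


def per_category_precision_recall(y_true, y_pred):
    """Return {category: (precision, recall)} for a multi-class classifier."""
    true_positive = Counter()
    predicted = Counter()
    actual = Counter()
    for truth, guess in zip(y_true, y_pred):
        actual[truth] += 1
        predicted[guess] += 1
        if truth == guess:
            true_positive[truth] += 1

    metrics = {}
    for category in set(actual) | set(predicted):
        # Precision: of everything predicted as this category, how much was right.
        precision = (
            true_positive[category] / predicted[category]
            if predicted[category] else 0.0
        )
        # Recall: of everything that truly is this category, how much was found.
        recall = (
            true_positive[category] / actual[category]
            if actual[category] else 0.0
        )
        metrics[category] = (precision, recall)
    return metrics


# Toy data (hypothetical): user-confirmed labels vs. model suggestions.
y_true = ["F&B", "F&B", "Shopping", "Top Up", "Top Up"]
y_pred = ["F&B", "Shopping", "Shopping", "Top Up", "Top Up"]
metrics = per_category_precision_recall(y_true, y_pred)
```

In this toy example, "F&B" has perfect precision but only half recall, because one real "F&B" transaction was suggested as "Shopping".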

So, how did we achieve that?

Behind the Scenes

Creating good Machine Learning models requires good quality data and a proper toolset that orchestrates training and inference.

Tools

To classify transactions, we use a deep neural network (DNN) model trained with Vertex AI’s AutoML. At inference time, transactions are processed by Apache Beam running on Dataflow. We use Feast as the feature store for serving online features. Kubeflow Pipelines orchestrates both model training and feature materialization.

DNN model — Transaction Classification

Challenges

Machine Learning models feed on data to learn how to work properly. However, the first hurdle in creating any ML model is obtaining clean and appropriate data.

We could gather a large volume of data for training, but if its quality is poor, the accuracy of the model will be poor too. We already had many transactions stored in our Data Warehouse, but we had to create a pipeline that handles the cleaning, transformation, and feature engineering automatically.
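As a rough illustration of what a cleaning and feature engineering step can look like, here is a minimal sketch in plain Python. The field names (`description`, `amount`, `timestamp`) and the chosen features are assumptions made for this example, not our actual schema or pipeline.

```python
import re
from datetime import datetime


def clean_description(text):
    """Lowercase the text, drop digits and punctuation, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"\d+", " ", text)       # reference numbers carry little signal
    text = re.sub(r"[^a-z\s]", " ", text)  # strip punctuation and symbols
    return " ".join(text.split())


def build_features(txn):
    """Turn one raw transaction record into model-ready features."""
    ts = datetime.fromisoformat(txn["timestamp"])
    return {
        "description": clean_description(txn["description"]),
        "amount": float(txn["amount"]),
        "day_of_week": ts.weekday(),  # Monday is 0
        "hour": ts.hour,
    }


# Hypothetical raw record, as it might arrive from a warehouse table.
raw = {
    "description": "  EWALLET Top-Up #12345 ",
    "amount": "50000",
    "timestamp": "2022-03-07T18:30:00",
}
features = build_features(raw)
```

Keeping each step a small pure function makes it easier to run the same logic both when preparing training data and when serving predictions.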

Training data needs to be collected for each category, with enough examples (at least a few hundred) to illustrate the various types of spending that should be assigned to that category.
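A simple way to check this requirement is to count the labeled examples per category and flag the ones that fall short. The sketch below assumes a threshold of 300 as a stand-in for "a few hundred"; the function name and the threshold are illustrative, not taken from our pipeline.

```python
from collections import Counter


def categories_ready_for_training(labels, min_examples=300):
    """Split categories into those with enough labels and those still lacking.

    Returns (ready, lacking), where `lacking` maps each under-represented
    category to the number of extra labeled examples it still needs.
    """
    counts = Counter(labels)
    ready = sorted(c for c, n in counts.items() if n >= min_examples)
    lacking = {c: min_examples - n for c, n in counts.items() if n < min_examples}
    return ready, lacking


# Toy data with a tiny threshold so the result is easy to follow.
labels = ["F&B"] * 4 + ["Shopping"] * 3 + ["Top Up"] * 1
ready, lacking = categories_ready_for_training(labels, min_examples=3)
```

Here "F&B" and "Shopping" pass the check, while "Top Up" still needs two more labeled transactions.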

At first, we did not have these “labels”, so we launched the Spend Analysis feature internally and asked our employees to help create these data sets by manually assigning the appropriate labels to their transactions.

Once the accuracy of the model was good enough, we launched the feature to the public.

Continuous learning with user participation

For now, we retrain the model weekly. In this process, Jago users indirectly help us by recategorizing their transactions. Why is that needed?

Sometimes a transaction can fall into several categories (in e-commerce, for example). Let’s say you’re topping up an e-wallet that you use for multiple purposes: paying for groceries, buying phone credit, or ordering food. Sending money to someone’s personal bank account could also fall into many different categories. Maybe you are paying them back for food (which is “F&B”) or buying something from their online shop (“Shopping”).

Most of the time, our AI automatically recommends the most likely category for every transaction. For example, any transfer to an e-wallet account could be automatically suggested as “Top Up”. If users think that the category is incorrect, they can relabel it with the correct one. Our AI takes note of the latest correction and adjusts its prediction for the next similar transactions.
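One simple way to picture this behavior is a small memory of each user's latest correction that takes priority over the model's suggestion. The sketch below assumes that "similar" means the same user and the same counterparty; the class and method names are hypothetical, and the production system may define similarity differently.

```python
class CorrectionMemory:
    """Remembers the latest category a user assigned to a counterparty."""

    def __init__(self):
        self._last_correction = {}

    def record(self, user_id, counterparty, category):
        # A newer correction overwrites any older one for the same key.
        self._last_correction[(user_id, counterparty)] = category

    def suggest(self, user_id, counterparty, model_prediction):
        # Fall back to the model's prediction when the user never corrected it.
        return self._last_correction.get((user_id, counterparty), model_prediction)


memory = CorrectionMemory()

# Before any correction, the model's suggestion is shown as-is.
first = memory.suggest("user-1", "friend-account", "Top Up")

# The user relabels twice; only the latest correction is kept.
memory.record("user-1", "friend-account", "F&B")
memory.record("user-1", "friend-account", "Shopping")
second = memory.suggest("user-1", "friend-account", "Top Up")

# Another user's suggestions are unaffected.
other = memory.suggest("user-2", "friend-account", "Top Up")
```

The same corrections can also be kept as fresh labels for the next scheduled retraining, so the model itself improves over time.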

Conclusion

The ultimate goal of our Spend Analysis feature is to help users achieve their financial goals in more tailored ways. The first step is to remove the laborious task of manually labeling and categorizing transactions by automating the process with machine learning.

But this isn’t the end of our journey. We aim to further increase the accuracy of our model and to create other cool features that will improve the user experience even more.

Any guess on what we are working on 😉? Share your guesses in the comment section and stick around for more exciting updates from us!

Enjoy this story? Do you want to read more about how DK’s team brings financial products to another level through the magic of tech 🤗? Follow us!
