AI for effective solutions process, step 3: model design and (offline) evaluation

Lorenzo De Mattei
Aptus.AI
Sep 2, 2021

Machine Learning products that really work? The search for balance is the key

In the previous posts of this series we introduced the phases that set the ground for the core activity in the creation of an AI solution: problem definition and dataset definition. Now we turn to that core activity, namely model design and the related performance evaluation. These operations require a careful balance between the system's accuracy and its complexity, also considering that, as William of Occam observed back in the 14th century, “it is vain to do with more what can be done with fewer”.

Evaluation metrics and baseline definition: the foundations of effective ML models

Machine Learning model design and evaluation is an iterative cycle, and it is the best known and most studied phase of the whole process, so much so that it can be considered standard; it is followed in academia as well. Even so, at Aptus.AI we apply specific procedures, created for the business context, to this step of AI solution development too. First, the choice of evaluation metrics is crucial to creating a good ML-based product, since the metrics must reflect what success means in the target market. Obviously, this is only possible if the first step, problem definition, has been completed correctly. The same business goal also guides the selection of a data subset that is valuable for training, so that we can iterate quickly over different versions of the model; there may be dozens of them before moving from testing to production. Moreover, before building the model, a baseline must be defined to establish a starting point that fits the goal. When dealing with unstructured data, HLP (Human Level Performance) can be used as a reference, while for structured data an existing ML model can be chosen as the benchmark to start from.
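To make this concrete, here is a minimal sketch of the baseline-first approach described above. The dataset, the choice of F1 as the metric and the HLP figure are all hypothetical placeholders; the point is simply that a trivial baseline and a human-level reference bracket the model's score.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Toy stand-in for a real business dataset
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline: always predict the majority class
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
baseline_f1 = f1_score(y_test, baseline.predict(X_test))

# Candidate model: the simplest reasonable learner
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
model_f1 = f1_score(y_test, model.predict(X_test))

HUMAN_LEVEL_F1 = 0.92  # hypothetical HLP estimate for an unstructured-data task

print(f"baseline F1: {baseline_f1:.2f}")
print(f"model F1:    {model_f1:.2f}")
print(f"gap to HLP:  {HUMAN_LEVEL_F1 - model_f1:.2f}")
```

Any candidate model should clearly beat the baseline, while the gap to HLP gives a realistic sense of how much improvement is still worth pursuing.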

A good Machine Learning model balances complexity and accuracy… with simplicity!

Once metrics and baseline are defined, the model design phase can start, and here simplicity reveals itself as the top principle to follow. When working on ML systems, starting from the simplest possible structure is a best practice for many reasons. Besides the above-mentioned Occam's razor, it is true that more complex models may achieve higher accuracy, but they are certainly harder to interpret. And when evaluating a model, it is essential to understand its answers, whether correct or incorrect. In the first post of this series we already presented the difference between causality and correlation, two notions that can lead to serious errors when confused. A simpler model, instead, is easier to interpret, and it also requires a smaller volume of data for its training; note also that more is not always better, as highlighted in the study Deep double descent: Where bigger models and more data hurt. Generally, taking into account the characteristics, distribution and organisation of the data, it is always advisable to choose the simplest possible model. An interesting guide to the “responsible development” of AI can also be found in the article Toward trustworthy AI development: mechanisms for supporting verifiable claims. Lastly, bigger ML models, which consume a lot of data, also cause bigger data engineering problems. To sum up, it is essential to find the right trade-off between the complexity and the accuracy of the system, a balance that can only be reached by starting from the simplest possible model.
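As a toy illustration of that trade-off, the sketch below sweeps the complexity of a decision tree on a synthetic dataset and keeps the simplest model whose validation accuracy stays within an assumed tolerance (one accuracy point) of the best observed; both the dataset and the tolerance are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Sweep complexity from very simple to unbounded (max_depth=None)
results = []
for depth in (1, 2, 4, 8, 16, None):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    results.append((depth, tree.score(X_train, y_train), tree.score(X_val, y_val)))

best_val = max(val for _, _, val in results)

# Occam's razor: pick the simplest model close enough to the best validation score
for depth, train_acc, val_acc in results:
    if val_acc >= best_val - 0.01:
        print(f"chosen depth={depth}: train={train_acc:.2f}, val={val_acc:.2f}")
        break
```

The deepest trees usually reach the highest training accuracy, yet the simplest tree within tolerance is the one that remains interpretable and cheap to maintain.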

Hyperparameters, error analysis and versioning: best practices that make a difference

When creating Machine Learning based products, some best practices can be identified to get the best possible result. The first is the use of automated tools for hyperparameter selection. So-called AutoML allows optimising the model's engine by automatically configuring its architecture, without having to rely on intuition alone. Another crucial activity is error analysis (a topic that is well explained in this article): errors should be grouped manually to identify the underperforming categories, in order to understand what does not work and, above all, why. Finally, it is always good practice to systematically track the development of the model. First of all with respect to the tests performed, exploiting tools like Jupyter notebooks, in order to achieve transparency and reproducibility (read this article for further details). But tracking is also essential to keep a record of the different versions of the model, both to roll back to previous ones if needed and to get an overall view of the ML system. In just two keywords? AutoML and tracking.
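By way of example, here is a minimal sketch of the first two practices (automated hyperparameter search and per-category error grouping), assuming scikit-learn and pandas; the dataset, the parameter grid and the category labels are hypothetical placeholders.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Automated hyperparameter selection instead of hand tuning
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},
    scoring="f1",
    cv=5,
).fit(X_train, y_train)
print("best hyperparameters:", search.best_params_)

# Error analysis: group predictions by a (hypothetical) document category
report = pd.DataFrame({
    "category": ["invoice", "contract", "invoice", "report"],  # stand-in metadata
    "correct": [False, True, False, True],                     # stand-in outcomes
})
print(report.groupby("category")["correct"].mean())  # per-category accuracy
```

In a real pipeline the category column would come from the dataset's own metadata, and the lowest-scoring groups are the ones worth inspecting by hand.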

Well-established processes? Sure, but at Aptus.AI we redefine standards

At the beginning of this post we said that the model design and performance evaluation phase is the most consolidated one in the development of ML models. That is true, but standardised processes are not enough to create Artificial Intelligence solutions that actually answer business problems. This is why at Aptus.AI we follow a specific, carefully designed procedure to get the best possible results from the application of Machine Learning to real market needs. And this is how we created Daitomic, our interactive financial compliance management platform.

Originally published at https://www.aptus.ai.
