Published in Analytics Vidhya

Tracking and Monitoring Transformers with MLFoundry

Efficient tracking and monitoring of Transformer models for Financial Sentiment Analysis using MLFoundry, by TrueFoundry

Source: Catchpoint Digital Monitoring: Offering lowest cost options without compromising quality



Sentiment Analysis for Financial News using Simple Transformers

Exploring & Processing the Dataset

Training a BERT Model for FSA

  • A dict containing the performance metrics on the evaluation dataset (Matthews correlation coefficient and loss by default, along with micro-F1 and accuracy defined by us)
  • A list of model outputs for each evaluation instance
  • A list of inputs for which the model predicts incorrectly
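To make the entries of the metrics dict concrete, here is a minimal pure-Python sketch of accuracy, micro-F1, and the multiclass Matthews correlation coefficient. The integer label encoding and the toy labels and predictions are made up for illustration and are not results from this article. In practice one would more likely rely on library implementations such as scikit-learn's `accuracy_score`, `f1_score(average="micro")`, and `matthews_corrcoef`.

```python
from collections import Counter
from math import sqrt


def accuracy(labels, preds):
    """Fraction of instances whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(labels, preds)) / len(labels)


def micro_f1(labels, preds):
    """F1 from true positives, false positives and false negatives pooled over all classes."""
    tp = fp = fn = 0
    for c in set(labels) | set(preds):
        for t, p in zip(labels, preds):
            if p == c and t == c:
                tp += 1
            elif p == c:
                fp += 1
            elif t == c:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def matthews_corrcoef(labels, preds):
    """Multiclass Matthews correlation coefficient computed from class counts."""
    n = len(labels)
    correct = sum(t == p for t, p in zip(labels, preds))
    true_counts = Counter(labels)
    pred_counts = Counter(preds)
    numerator = correct * n - sum(true_counts[k] * pred_counts[k] for k in true_counts)
    denominator = sqrt(
        (n * n - sum(v * v for v in pred_counts.values()))
        * (n * n - sum(v * v for v in true_counts.values()))
    )
    return numerator / denominator if denominator else 0.0


# Toy data (illustrative only): three sentiment classes encoded as 0, 1, 2.
labels = [0, 1, 2, 2, 1, 0, 2, 1]
preds = [0, 1, 2, 1, 1, 0, 2, 2]

result = {
    "mcc": matthews_corrcoef(labels, preds),
    "acc": accuracy(labels, preds),
    "f1": micro_f1(labels, preds),
}

# Positions of the misclassified instances, analogous to the list of wrong predictions.
wrong_indices = [i for i, (t, p) in enumerate(zip(labels, preds)) if t != p]
```

For single-label multiclass data like this, micro-F1 and accuracy are the same number, since every misclassification counts as exactly one false positive and one false negative.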

Introducing MLFoundry for Tracking and Monitoring

Logging Experiment Details with MLFoundry

  • To log the training and evaluation datasets, we use log_dataset().
  • We log the model specifications (type and name) and the hyperparameters as a dictionary using log_params().
  • We log the dictionary of performance metrics on our evaluation set (accuracy and micro-F1) using log_metrics().
  • When a dataset is logged using log_dataset_stats(), whylogs automatically estimates dataset metrics and statistics such as counters, summaries, histograms, and most frequent values.
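The four logging steps above can be sketched with a small in-memory recorder. `RunRecorder` is a hypothetical stand-in written for this illustration and is not the MLFoundry client: only the four method names come from the text, while the argument shapes, the `data_slice` strings, the toy rows, and all parameter and metric values are assumptions or placeholders. The statistics it computes are a crude substitute for a whylogs profile.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class RunRecorder:
    """Hypothetical in-memory stand-in for a tracked run (NOT the MLFoundry API)."""

    run_name: str
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)
    datasets: dict = field(default_factory=dict)
    dataset_stats: dict = field(default_factory=dict)

    def log_dataset(self, rows, data_slice):
        # Store a copy of the rows under the slice name (e.g. "train" or "eval").
        self.datasets[data_slice] = list(rows)

    def log_params(self, params):
        # Model specifications and hyperparameters, passed as one dictionary.
        self.params.update(params)

    def log_metrics(self, metrics):
        # Evaluation metrics, passed as one dictionary.
        self.metrics.update(metrics)

    def log_dataset_stats(self, rows, data_slice, label_key="label"):
        # Simplified stand-in for a data profile: row count and label frequencies.
        rows = list(rows)
        self.dataset_stats[data_slice] = {
            "rows": len(rows),
            "label_counts": dict(Counter(row[label_key] for row in rows)),
        }


# Made-up rows with the single input feature `headline` and a sentiment label.
train_rows = [
    {"headline": "Company A reports record quarterly profit", "label": "positive"},
    {"headline": "Company B shares fall after profit warning", "label": "negative"},
    {"headline": "Company C to hold annual meeting in May", "label": "neutral"},
]
eval_rows = [
    {"headline": "Company D raises full-year guidance", "label": "positive"},
]

run = RunRecorder(run_name="bert_3epochs")
run.log_dataset(train_rows, data_slice="train")
run.log_dataset(eval_rows, data_slice="eval")
# Placeholder model name and hyperparameters, not the article's exact settings.
run.log_params({"model_type": "bert", "model_name": "bert-base-cased", "num_train_epochs": 3})
# Placeholder metric values, not results from the article.
run.log_metrics({"acc": 0.0, "f1": 0.0})
run.log_dataset_stats(train_rows, data_slice="train")
```

Keeping parameters and metrics as flat dictionaries keyed by name is what later makes runs easy to line up side by side.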

Navigating around the MLFoundry Dashboard

  • The Model Health section shows the performance metrics of the current model on the evaluation dataset. These include a confusion matrix (since this is a multiclass classification task) along with other relevant plots.
Model Health section showing the various user-generated and auto-generated metrics for bert_3epochs run
  • The Data Health section contains various stats related to our dataset, which can be used to understand the data quality and compare it against other datasets if there is a data change later.
  • The Feature Health section shows the numerical distribution of labels and predicted values based on input features. In our case, there is only one input feature, headline, which contains the financial news headline.
Feature Health section showing the numerical feature distribution of classes for the labels and predictions for bert_3epochs run
  • The Run Details section displays all the parameters and metrics logged for the run and also allows users to view the datasets and other artifacts related to the run that were tracked.
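As a rough illustration of what the Model Health and Feature Health views summarize, the sketch below builds a multiclass confusion matrix and the class distributions of labels and predictions by hand. The class names and the toy labels and predictions are assumptions made for this example, not data from the runs described here, and the dashboard computes its own versions of these quantities.

```python
from collections import Counter

# Assumed class names for a three-way financial sentiment task.
CLASSES = ["negative", "neutral", "positive"]


def confusion_matrix(labels, preds, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(labels, preds):
        matrix[index[t]][index[p]] += 1
    return matrix


# Toy data (illustrative only).
labels = ["positive", "neutral", "negative", "neutral", "positive", "neutral"]
preds = ["positive", "neutral", "neutral", "neutral", "negative", "neutral"]

matrix = confusion_matrix(labels, preds, CLASSES)

# Class distributions of the true labels and of the predictions.
label_distribution = Counter(labels)
prediction_distribution = Counter(preds)

for name, row in zip(CLASSES, matrix):
    print(f"{name:>8}: {row}")
```

Off-diagonal cells show which classes are confused with each other, and a gap between the label and prediction distributions hints at a bias toward one class.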

Efficient Tracking and Comparison of Multiple Experiments with MLFoundry

  • When trained for more epochs, the BERT model showed better accuracy
  • The RoBERTa model achieved the best accuracy, 0.866

Model Demo using MLFoundry Web App

Example of a financial news headline being classified correctly as ‘positive’ by our fine-tuned RoBERTa model as seen in the MLFoundry Web App for the run named ‘roberta’

Concluding Remarks



Tezan Sahu

Data & Applied Scientist at Microsoft | B. Tech from IIT Bombay | GSoC’20 with PEcAn Project