Is a responsible Machine Learning solution possible for Cybersecurity?

(Part I)

Lakshyagourav Moitra
4 min read · Sep 27, 2022

Malware detection is a subset of cybersecurity that has recently been addressed with a variety of machine learning models. The data fed to these models consists of a wide range of network-related features, and the models used to secure the network infrastructure can be linear or complex, depending on the complexity of the network. If a model fails, malicious software such as ransomware, spyware, and Trojans can infect a system and, ultimately, the entire network. Attacks like WannaCry wreaked havoc, caused financial damage, and drew international attention in 2017. To prevent such incidents, network security should be strengthened with detection and prevention systems.

Machine learning models range from glass-box models such as linear regression to black-box models such as deep neural networks. Deploying a complex model, whether for production or research, makes it tedious to comprehend, so methods that expose how a model works are necessary to understand its behaviour and to tune it as required. Explainable Artificial Intelligence (XAI) methods focus on understanding a model's functionality and each feature's contribution to its output, and these explanations are essential for efficient tuning before deployment.

Malware is a cybersecurity threat that most people know through the financial and infrastructural damage it causes. The detection and prevention mechanisms devised against malware evolve in step with the attack mechanisms, and these defences must be able to explain their decisions so that technical and non-technical stakeholders alike can comprehend and monitor attack patterns. A model whose behaviour is understood only vaguely or approximately cannot be considered reliable or trustworthy; trust requires a holistic understanding of both the model and the dataset it is trained on. The models should also be scrutinised for fairness, since bias can lead to unstable predictions.

Existing work on machine learning for malware detection lacks this overall perspective, offering partial or no explanation of either the models or the dataset. This article proposes a multi-faceted approach to explainability for ensemble boosting classifiers that provides both local and global explanations and also verifies the fairness of the models employed.
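To make these terms concrete: a local explanation attributes a single prediction to its input features, while a global explanation summarises feature impact across an entire dataset. Below is a minimal sketch of both using the SHAP library with a LightGBM classifier; the synthetic data and default settings are placeholders for illustration, not the setup used in this work.

```python
# Minimal sketch: local vs. global SHAP explanations for a boosting
# classifier. The synthetic data is a placeholder, not the Microsoft
# Malware prediction dataset used in this article.
import shap
import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = lgb.LGBMClassifier().fit(X_train, y_train)

explainer = shap.Explainer(model)   # dispatches to a tree explainer for LightGBM
explanation = explainer(X_test)

# Global view: feature impact aggregated over the whole test set.
shap.plots.beeswarm(explanation)

# Local view: feature contributions behind one individual prediction.
shap.plots.waterfall(explanation[0])
```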

Proposed Approach

Multi-faceted XAI (MFX)

The proposed approach, named Multi-faceted XAI (MFX), is depicted in the figure. MFX combines multiple XAI methods that assess the models along four axes: the explanations themselves, the impact of each feature on the model output, the fairness of the employed model, and the similarity in functionality between the respective modelling techniques. The approach is applied to the Malware prediction dataset provided by Microsoft, and ensemble boosting models are used to evaluate it.
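For orientation, here is a hypothetical sketch of such an evaluation setup: the five boosting classifiers compared below, trained side by side. The synthetic data merely stands in for the Microsoft Malware prediction dataset, whose actual features and preprocessing are not shown here.

```python
# Hypothetical evaluation setup: train the five boosting classifiers
# compared in this article on a stand-in synthetic dataset.
from sklearn.ensemble import GradientBoostingClassifier, AdaBoostClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from lightgbm import LGBMClassifier
from xgboost import XGBClassifier
from catboost import CatBoostClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Gradient Boosting": GradientBoostingClassifier(),
    "LightGBM": LGBMClassifier(),
    "XGBoost": XGBClassifier(),
    "CatBoost": CatBoostClassifier(verbose=0),
    "AdaBoost": AdaBoostClassifier(),
}
# Fit every model on the same training split for a fair comparison.
fitted = {name: m.fit(X_train, y_train) for name, m in models.items()}
```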

Performance Evaluation

Performance Metrics for Ensemble boosting models
  • The LGBM classifier achieves the highest accuracy of all the modelling techniques, followed by CatBoost and Gradient Boosting.
  • The F1 score is the harmonic mean of Precision and Recall, so a higher F1 score signifies a better model; AdaBoost ranks first with the highest F1 score (both metrics are computed as in the sketch after this list).
  • However, AdaBoost's low accuracy negates the advantage of its high F1 score.
  • Overall, the LGBM and CatBoost classifiers are the better models for this dataset.
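The metrics above are typically computed as follows, continuing the hypothetical setup sketched earlier:

```python
# Accuracy and F1 for each fitted model.
# F1 = 2 * (precision * recall) / (precision + recall).
from sklearn.metrics import accuracy_score, f1_score

for name, model in fitted.items():
    y_pred = model.predict(X_test)
    print(f"{name}: accuracy = {accuracy_score(y_test, y_pred):.3f}, "
          f"F1 = {f1_score(y_test, y_pred):.3f}")
```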

Confusion Matrices

[Confusion matrices for Gradient Boosting, LightGBM, XGBoost, CatBoost, and AdaBoost]
  • The True Positive (TP) rate is highest for AdaBoost (31.3%), meaning it most often correctly predicts the samples that belong to the positive class (the percentages are computed as in the sketch after this list).
  • However, LightGBM has both the highest True Negative (TN) rate (36.8%) and the lowest False Positive (FP) rate (14.0%) of all the models: it is best at correctly classifying samples that do not belong to the positive class, and it makes the fewest incorrect positive classifications.
  • LightGBM's False Negative (FN) rate of 19.8% is, however, higher than that of CatBoost and AdaBoost, meaning it incorrectly labels a comparatively high share of positive samples as negative.
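To clarify what those percentages refer to: each confusion matrix is normalised over the whole test set, so the four cells (TN, FP, FN, TP) sum to 100%. Continuing the hypothetical setup from earlier, such a normalised matrix can be computed as follows:

```python
# Confusion matrix normalised over all samples, so each cell is a
# share of the entire test set (the four cells sum to 100%).
from sklearn.metrics import confusion_matrix

for name, model in fitted.items():
    cm = confusion_matrix(y_test, model.predict(X_test), normalize="all")
    (tn, fp), (fn, tp) = cm  # row 0: actual negatives, row 1: actual positives
    print(f"{name}: TN = {tn:.1%}, FP = {fp:.1%}, FN = {fn:.1%}, TP = {tp:.1%}")
```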

The next part will explore the functionality of various Explainable AI methods and their application to the modelling techniques.

References

https://towardsdatascience.com/understanding-confusion-matrix-a9ad42dcfd62
