Securing Machine Learning Systems Against Attacks
AI systems carry significant vulnerabilities that attackers can exploit to manipulate their behaviour or extract what they have learned. Here are four critical steps for machine learning security.
In recent years AI technologies have become increasingly influential in how organisations operate, creating significant cost savings and improving customer experience. Simply put, a greater number of business decisions are being made in industries around the world today by some form of AI.
While progress has been rapid, AI systems carry significant vulnerabilities that attackers can exploit. Failure to address these could lead to serious financial loss, as well as ethical and reputational damage for the organisations involved.
As businesses continue to deploy AI to automate, it is imperative they recognise the potential vulnerabilities involved and have detection and safeguarding practices in place to protect their models.
How Are AI Algorithms Vulnerable?
Outside of cyber security-related breaches, attacks on AI systems can be broadly divided into two families — distortion and probe attacks.
Distortion attacks involve an attacker who intentionally tries to alter the behaviour of a model, either for a specific case (for instance, underwriting a mortgage for the attacker) or at a batch level (such as approving all loan applications that meet certain parameters). Such attacks can work by distorting the inputs, the training data or the model itself. The primary goal is to produce outputs that were never intended by the model's creators.
Transfer learning, reusing a model developed for one task to inform a model intended for another, has become increasingly prevalent in machine learning. While this can often make building models more efficient (why start from scratch on a model that needs to recognise motorbikes when you already have a model that recognises cars?), it also significantly heightens the risk of attacks through distorted data or models. Edited datasets can make models react in a particular way to very specific inputs, essentially granting the attacker control over the model. The risk is compounded by the many pre-trained base models already available online, any of which could carry unknown latent behaviour.
Probe attacks involve an attacker that doesn’t have direct access to the training corpus or the model itself, but can probe the model’s outputs based on carefully chosen inputs. This would allow the attacker to learn of specific weaknesses in the model and, in certain cases, understand the distribution of the input data. Where the model is trained on proprietary data, this could lead to significant ramifications for the organisation.
Let’s imagine an attacker intent on targeting a bank’s system for approving mortgages. By repeatedly sending applications, carefully altering the values each time and recording the results, the attacker can identify which parameters matter and their respective weights. This effectively provides all the information needed to ‘game’ the system.
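As a toy illustration of this kind of probing, the sketch below assumes a hypothetical black-box linear scoring model (the `score` function, its weights, and all feature names are invented for illustration). The attacker never sees the weights; they recover them purely by querying the model with one-feature-at-a-time perturbations. Real models are nonlinear, so an attacker would probe many baselines, but the principle is the same.

```python
# Hypothetical black-box mortgage-scoring model; the attacker can
# query it but cannot inspect TRUE_WEIGHTS.
TRUE_WEIGHTS = {"income": 0.6, "age": 0.1, "debt": -0.3}

def score(application):
    """The model's public API: takes an application, returns a score."""
    return sum(TRUE_WEIGHTS[k] * v for k, v in application.items())

def probe_weights(score_fn, features, delta=1.0):
    """Estimate each feature's weight by finite differences:
    query a baseline application, then perturb one feature at a time
    and observe how the output moves."""
    baseline = {f: 0.0 for f in features}
    base_score = score_fn(baseline)
    estimated = {}
    for f in features:
        probed = dict(baseline)
        probed[f] = delta  # nudge a single feature
        estimated[f] = (score_fn(probed) - base_score) / delta
    return estimated

recovered = probe_weights(score, ["income", "age", "debt"])
```

Each call to `probe_weights` costs only one query per feature plus a baseline, which is exactly why the query limiting discussed later matters.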
Four critical steps to secure machine learning algorithms
To protect against these attacks, it is important to make carefully considered design choices at each stage of the model development lifecycle. Below are the four critical steps that should be followed throughout model development.
1. Careful vetting of data providers — To protect against possible distortion attacks, data providers and sources must be carefully vetted. Organisations must have a quality control process in place before utilising external data sources. To minimise the risk of data contamination and adversarial data, datasets must not have passed through unknown third parties. Where possible, utilise your own organisation’s data and have it annotated for use as training data.
2. Data outlier analysis — In addition to examining each data column individually, use a meta-algorithm to consider the entire dataset and identify outliers across columns. For example, normal distributions across ‘age’ and ‘income’ might be hiding a case of somebody aged 20 earning $300K. Ensuring that such cases do not exist prevents the algorithm from learning erroneous patterns which could provide a vector of attack.
3. Good software engineering practices — Limiting the number of queries based on multiple factors (such as time between retries or IP address tracking) means that attackers cannot acquire enough data points to understand the model’s full behaviour. Deploying customer verification and reducing the number of people who can interact with a critical model helps maintain model sanctity.
4. Validation of models — In software engineering, once the developers have finished their tasks, a separate quality assurance (QA) assessment is typically undertaken to examine each part of the project and identify vulnerabilities. While we don’t recommend having a separate QA team for data science, it is important to incorporate a validation framework into the automated CI/CD pipelines to ensure all aspects of model development have been vetted before the model is promoted to production.
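Step 2 above can be sketched as a cross-column check that catches cases per-column statistics miss. This is a minimal illustration with an invented `income / age` domain rule and invented records; a production system would use a proper multivariate method (for example, an isolation forest) rather than a hand-written rule.

```python
def column_outliers(values, z=3.0):
    """Per-column check: flag values more than `z` standard deviations
    from the column mean. This is what single-column QA usually does."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [abs(v - mean) > z * std for v in values]

def cross_column_outliers(rows, rule):
    """Meta-check across columns: flag row indices that violate a joint
    rule, even when each column looks normal in isolation."""
    return [i for i, row in enumerate(rows) if rule(row)]

records = [
    {"age": 45, "income": 90_000},
    {"age": 38, "income": 120_000},
    {"age": 20, "income": 300_000},  # plausible alone, suspicious together
    {"age": 52, "income": 75_000},
]
# Hypothetical domain rule: income far out of line with age.
suspicious = cross_column_outliers(
    records, lambda r: r["income"] / max(r["age"], 1) > 5_000
)
```

On this data the per-column z-score check flags nothing, while the cross-column rule isolates the 20-year-old earning $300K, which is exactly the article's example of a record that could teach the model an exploitable pattern.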
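The validation gate in step 4 can start as a small set of automated checks run in the CI/CD pipeline before promotion. The sketch below is illustrative, not an exhaustive framework: the threshold values, the `predict` interface, and the "canary input" idea (a fixed input whose expected output is known, guarding against a swapped or backdoored model artefact) are all assumptions for the example.

```python
def validate_model(model, holdout, config):
    """Run pre-promotion checks; return a list of failure messages
    (an empty list means the model may be promoted).
    `model` is any object with a `predict(features)` method."""
    failures = []

    # Check 1: accuracy on a held-out set must clear a minimum bar.
    correct = sum(1 for x, y in holdout if model.predict(x) == y)
    accuracy = correct / len(holdout)
    if accuracy < config["min_accuracy"]:
        failures.append(f"accuracy {accuracy:.2f} below {config['min_accuracy']}")

    # Check 2: a canary input with a known expected output, to catch
    # swapped artefacts or backdoored behaviour on a fixed probe.
    if model.predict(config["canary_input"]) != config["canary_output"]:
        failures.append("canary input produced unexpected output")

    return failures
```

Wiring this into the pipeline (fail the build when the list is non-empty) gives data science the equivalent of the QA stage without a separate QA team.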
Organisational security changes
Manual checks during model development are not sufficient on their own. Certain system infrastructure is also required to limit the possibility of distortion attacks once models are live.
- Model monitoring and alerts — Models should not be treated as fire-and-forget launches, but as assets that require constant monitoring. Any systematic deviation from expected inputs or outputs should create an alert for further investigation. Likewise, if all other inputs remain constant, a different outcome should be flagged immediately as a possible distortion attack. A team of system engineers with an understanding of analytics is required to ensure an ecosystem of models works as intended in production. Given that many models evolve over time, this team should also be able to differentiate between normal model drift and nefarious behaviour, which is quite challenging in practice.
- Regulated AutoML by SMEs — AutoML systems are capable of ingesting multiple kinds of data, performing feature engineering and identifying optimal parameters. While such systems are certainly capable of creating novel insights, it is extremely important for domain experts (e.g. credit risk, marketing) to weigh in on the accuracy of the models’ outputs. It is also crucial to train SMEs in the basics of analytics so that they can opine on a model’s design and workings.
In the new age of Software 2.0, the proliferation of this combination of advanced AI models and AutoML can be dangerous. With limited human involvement, models can apply arbitrary transformations (e.g. nonlinear functions like tanh) whose complexity makes them uninterpretable and unactionable. This drastically increases models’ vulnerability to distortion attacks, potentially creating significant losses. It is important to have subject matter experts review the data transformations and inner workings of the model to ensure outcomes are actionable and in line with regulatory requirements.
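The monitoring-and-alerting point above can be sketched as a simple drift check on input distributions: compare live feature statistics against the training-time baseline and raise an alert when they diverge. The data and threshold below are invented for illustration; real systems would use statistical tests (e.g. Kolmogorov-Smirnov or population stability index) and monitor outputs as well as inputs.

```python
def summarize(values):
    """Mean and (population) standard deviation of a feature column."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return mean, std

def drift_alerts(baseline, live, threshold=0.5):
    """Compare live input distributions against training-time baselines.
    Return the features whose live mean has shifted by more than
    `threshold` baseline standard deviations, as candidates for
    investigation (normal drift or a possible distortion attack)."""
    alerts = []
    for feature, train_values in baseline.items():
        mean, std = summarize(train_values)
        live_mean = sum(live[feature]) / len(live[feature])
        if std > 0 and abs(live_mean - mean) / std > threshold:
            alerts.append(feature)
    return alerts

baseline = {"age": [30, 40, 50, 60], "income": [100, 110, 90, 100]}
live = {"age": [70, 75, 80], "income": [101, 99, 100]}
flagged = drift_alerts(baseline, live)
```

Distinguishing benign drift from an attack, as the article notes, still requires human investigation; the alert only tells the team where to look.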
By being aware of these types of attack and the vulnerabilities inherent in machine learning algorithms, practitioners can take the necessary security steps to avoid negative outcomes for their business. Organisations that source data from vetted providers and monitor models pre- and post-launch will significantly reduce the risk of both distortion and probe attacks.
Authored by: Vishnu Kamalnath, Expert Associate Partner, QuantumBlack and Brian McCarthy, Partner, McKinsey & Company