Application of Monotonic Constraints in Machine Learning Models

A tutorial on enforcing monotonic constraints in XGBoost and LightGBM models

Ajay Tiwari
Analytics Vidhya
6 min read · May 1, 2020


Background

After significant progress in data science and artificial intelligence, we have started hearing discussions around ethical and explainable AI, especially in highly regulated industries like banking and insurance. Though plenty of easily accessible and highly predictive algorithms are available, analysts and data scientists working in these industries face a dilemma between predictive accuracy and their regulatory responsibilities. Researchers and industry experts are trying hard to make machine learning models transparent and interpretable.

One such innovation, which makes model output easier to justify in practice, is called monotonic constraints.

What is monotonicity?

As per the Oxford dictionary, monotonic describes a function or quantity that varies in such a way that it either never decreases or never increases.

[Figures 1 and 2: monotonically increasing and decreasing functions. Source: https://en.wikipedia.org/wiki/Monotonic_function]

Monotonically Increasing Function — A function is called monotonically increasing if, for all x and y such that x ≤ y, one has f(x) ≤ f(y), so f preserves the order. The function does not have to increase everywhere; it simply must not decrease (see figure 1).

Monotonically Decreasing Function — A function is called monotonically decreasing if, whenever x ≤ y, then f(x) ≥ f(y), so it reverses the order. The function does not have to decrease everywhere; it simply must not increase (see figure 2).
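
To make these definitions concrete, here is a minimal NumPy sketch (the helper names are my own) that checks whether a sequence of values is monotonic:

import numpy as np

def is_monotonically_increasing(values):
    #Never decreases: every consecutive difference is non-negative
    return bool(np.all(np.diff(values) >= 0))

def is_monotonically_decreasing(values):
    #Never increases: every consecutive difference is non-positive
    return bool(np.all(np.diff(values) <= 0))

print(is_monotonically_increasing([1, 2, 2, 5]))  #True: flat segments are allowed
print(is_monotonically_decreasing([1, 2, 2, 5]))  #False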

Why apply monotonicity in the model?

Now, you must be wondering why we need a constrained model. In real life, many scenarios exhibit a monotonic relationship; look at some of the examples given below.

  • The likelihood of loan approval is higher with a better credit score.
  • Premium decreases with the age of the driving license.
  • Premium increases with the sum insured.

Companies implement these monotonic relationships in their predictive models for transparent decision making and to stay compliant with regulatory bodies. In the absence of monotonic constraints, banks and insurance companies may face the strange situation of illogical and unethical decisions: an application with a higher credit score (say, 610) gets rejected while an application with a lower credit score (say, 600) gets approved. Similarly, customer A pays a $1,000 premium for a $1M property while customer B pays $990 for a $1.1M property. In these examples, we assume all other factors are the same for both pairs of customers.

Let’s understand this with an implementation in machine learning models

I will walk you through this concept using XGBoost in Python; code for the LightGBM implementation is also available at the bottom of this tutorial.

Generating sample data

As a first step, we will simulate some data as per both the scenarios discussed above.

Scenario 1 — Simulating Data with a Positive Slope

import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')
#Sample data with positive slope
size = 100
Xp = np.linspace(0,7, size)
yp = Xp**2 + 10 - (10 * np.random.random(size))
plt.plot(Xp, yp, '.', color='b');

  • In the first scenario, there is a positive relationship between the two variables x and y: when x increases, so does y, and when x decreases, y decreases too, e.g., the sum insured and the premium amount.

Scenario 2 — Simulating Data with a Negative Slope

def f(Xn, a, b, c):
    """Fit function y = f(x, p) with parameters p = (a, b, c)."""
    return a * np.exp(-b * Xn) + c

#Sample data with negative slope
size = 100
Xn = np.linspace(0, 2, size)
yi = f(Xn, a=2.5, b=1.3, c=0.5)
#Add some noise
yn = yi + 0.5 * np.random.random(size=len(Xn))
plt.plot(Xn, yn, '.', color='b');

  • The second scenario is a negative relationship between the two variables: when x increases, y decreases, and when x decreases, y increases, e.g., the age of the driving license and the premium amount.

Fitting machine learning model without monotonic constraints

We will fit boosted tree models on both scenarios with default parameters, without enforcing any constraints.

#Code for the positive scenario; use the same code for the other scenario
#Creating DMatrix
import xgboost as xgb
dtrain_positive = xgb.DMatrix(Xp.reshape(-1,1), label=yp)
#Setting default parameters
params_no_constraints = {
    'booster': 'gbtree',
    'eta': 0.3,
    'gamma': 0,
    'max_depth': 6,
    'min_child_weight': 1,
    'colsample_bytree': 1
}
#Model fitting
model_no_constraints_positive = xgb.train(params=params_no_constraints, dtrain=dtrain_positive)
#Prediction
preds_positive = model_no_constraints_positive.predict(dtrain_positive)
#Plotting observed values versus predictions
plt.plot(Xp, yp, '.', color='b')
plt.plot(Xp, preds_positive.reshape(-1,1), color='r')
plt.xlabel("X")
plt.ylabel("Non-monotonic Model Fit")

Both models are ready, and the plots visualize the observed data against the fitted curves.

Let’s analyze the output: the blue dots show the data points, and the red line shows the fitted model. Both models perform well and capture the overall trend in the data, and some zig-zag movement is quite natural, but it can affect decisions. When we examine the model fit over short intervals, we observe trends opposite to our expectations: for nearby points a < b on the increasing fit, the model can give Yb < Ya, and likewise Yn > Ym for nearby points m < n on the decreasing fit.

Recall our sum insured and premium amount example discussed above. How can you justify a lower premium for a higher sum insured between two different customers when all other factors are the same?

Thanks to monotonic constraints, we now have a solution for avoiding these kinds of tricky situations.

Imposing monotonic constraints on the model

Now, we will model our sample data with one additional parameter on top of the current defaults: monotone_constraints. As per the XGBoost documentation, the values are 1 for an increasing constraint and -1 for a decreasing constraint.

params_increasing_monotone = {
    'booster': 'gbtree',
    'eta': 0.3,
    'gamma': 0,
    'max_depth': 6,
    'min_child_weight': 1,
    'colsample_bytree': 1,
    'monotone_constraints': 1  #1 = increasing; use -1 for a decreasing constraint
}

Let’s plot the predicted line over the X values to visualize our model fit.
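
A minimal sketch of the remaining training and plotting steps, reusing dtrain_positive from above (the model and prediction variable names are my own; the decreasing model is analogous on the negative-slope data):

model_increasing = xgb.train(params=params_increasing_monotone, dtrain=dtrain_positive)
preds_increasing = model_increasing.predict(dtrain_positive)
plt.plot(Xp, yp, '.', color='b')
plt.plot(Xp, preds_increasing.reshape(-1,1), color='r')
plt.xlabel("X")
plt.ylabel("Monotonic Model Fit")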

We can see the overall trend looks similar to the earlier non-monotonic models, but the zig-zag movements are replaced with monotonic steps. The model with an increasing constraint never decreases, and similarly, the model with a decreasing constraint never increases. Now explaining these models is much easier; there are no contradictory situations as before.

Implementation of monotonic constraints in LightGBM

You can model the data and analyze the output using the following code; for the rest of the steps, you can use the same code as above.

#Fitting a LightGBM model with monotonic constraints
import lightgbm as lgb
monotone_model = lgb.LGBMRegressor(min_child_samples=5,
                                   monotone_constraints="-1")  #use "1" for an increasing constraint
#X and y are the feature and target arrays, e.g., Xn and yn from the negative-slope scenario
monotone_model.fit(X.reshape(-1,1), y)
#Predicted output from the model on the same input
prediction = monotone_model.predict(X.reshape(-1,1))
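
To sanity-check the result, you can verify monotonicity of the predictions directly; a small sketch assuming X is sorted in increasing order, as in the simulated data:

#Under a decreasing constraint, the fitted curve should never increase
print(np.all(np.diff(prediction) <= 0))  #expected: True
#Plot observed versus predicted, as before
plt.plot(X, y, '.', color='b')
plt.plot(X, prediction.reshape(-1,1), color='r')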

Summary

We have discussed monotonic constraints here with a single-variable model, but these constraints can be applied to multiple variables as well, and the effect on an individual variable can be analyzed using a partial dependence plot. Sometimes enforcing constraints helps avoid overfitting (this varies from dataset to dataset), but there is no harm in testing this technique when you have an overfitted model.
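
For reference, with multiple features the constraint is specified per feature, in column order; a hypothetical two-feature configuration (the feature order here is illustrative):

#XGBoost: one value per feature as a string; 0 leaves a feature unconstrained
params_multi = {'monotone_constraints': '(1,-1)'}  #increasing in the 1st feature, decreasing in the 2nd
#LightGBM: a list with one value per feature
model_multi = lgb.LGBMRegressor(monotone_constraints=[1, -1])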

Thanks for reading, hope you found this article informative.
