Conformal Prediction: An easy way to estimate prediction intervals

María Jesús Ugarte
Bain Inside Advanced Analytics · Jul 11, 2023

Quantifying the uncertainty of individual predictions is an important issue in machine learning, especially when a wrong prediction can have serious consequences. One way to quantify this uncertainty is with prediction intervals. A prediction interval is an estimate of the range the observation will fall in, with a certain level of confidence.

There are several techniques for estimating prediction intervals: Monte Carlo Dropout, Mean Variance Estimation, and Quantile Regression, among others. These methods usually only work for regression and tend to produce wide intervals by the time the desired confidence is reached. The goal is to generate the narrowest interval possible while still reaching the desired coverage.

The nonconformist package [1] is a Python package that performs conformal prediction. Even though using this package is relatively easy, the theory behind it isn't. Here we will walk through an intuitive explanation of the algorithm used for estimating the prediction intervals, for both regression and classification [2].

Regression

1) The data must be divided in two: the training set (Zt) and the calibration set (Zc), where |Zc| = q (the number of observations in the calibration set). It's very important that the calibration data is disjoint from the training data.

2) After splitting the data, we have to choose an underlying model (h). This model is used to create the predictions of the target value.

3) The underlying model is trained with the training set Zt.

4) Then, we have to define a nonconformity measure f(z). This function indicates how "rare" an observation z = (x, y) is compared to the rest of the data. Usually, it is built on the underlying model's prediction, e.g. f(x, y) = |y − h(x)|.

5) With the nonconformity measure we calculate the nonconformity scores on the calibration data. These calibration scores are sorted in descending order and stored: (α₁, α₂, …, αq).

6) We choose a desired coverage or confidence level (1 − ε).

7) Next, we need to find s, the index of the (1 − ε)-percentile nonconformity score αₛ, i.e. the score that a fraction 1 − ε of the calibration scores do not exceed.

(Figure: sorted calibration nonconformity scores.)

8) The prediction interval for a new input x is Γ = h(x) ± αₛ (see the end-to-end sketch below).
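
Putting the eight steps together, here is a minimal from-scratch sketch on a synthetic 1-D dataset. The data, the random-forest model, and the index formula s = ⌊ε(q + 1)⌋ are illustrative assumptions, not the nonconformist implementation:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical synthetic data for illustration.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(1000, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.2, size=1000)

# 1) Split into a training set Zt and a calibration set Zc of size q.
X_train, y_train = X[:800], y[:800]
X_cal, y_cal = X[800:], y[800:]
q = len(y_cal)

# 2-3) Choose and train the underlying model h on Zt.
h = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# 4-5) Nonconformity measure f(x, y) = |y - h(x)|, scored on Zc
#      and sorted in descending order.
alphas = np.sort(np.abs(y_cal - h.predict(X_cal)))[::-1]

# 6-7) For confidence 1 - eps, take alpha_s at the (1 - eps)-percentile;
#      with descending scores, one common choice is the
#      floor(eps * (q + 1))-th largest score (0-based index below).
eps = 0.2
s = int(np.floor(eps * (q + 1))) - 1
alpha_s = alphas[s]

# 8) Prediction interval for a new x: h(x) +/- alpha_s.
x_new = np.array([[5.0]])
pred = h.predict(x_new)[0]
print(f"interval: [{pred - alpha_s:.3f}, {pred + alpha_s:.3f}]")
```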

Classification

1) Just like for regression, the dataset is divided in two: the training (Zt) and calibration (Zc) sets, where |Zc| = q.

2) After splitting the data, the underlying model (h) must be chosen.

3) The underlying model h is trained using the training data Zt.

4) Then, we have to define a nonconformity measure, e.g. f(z) = 1 − ℙ(y | x), where ℙ(y | x) is the conditional probability of the class y, given the input x, as estimated by the model h.

5) With the defined nonconformity measure, we calculate the nonconformity scores on the calibration data. These scores are stored and used to estimate their distribution.

6) For a new input x, we calculate its nonconformity score.

7) For every possible class ỹ ∈ Y, the p-value is calculated: the fraction of calibration scores that are at least as large as the score of (x, ỹ).

8) We select a desired coverage or confidence level (1 − ε).

9) The prediction set for x contains all the classes whose p-value is higher than ε (see the sketch below).
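
As with regression, these steps fit in a short from-scratch sketch. The iris data, the random-forest model, and the p-value formula (count of calibration scores ≥ the new score, plus one, over q + 1) are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# 1) Split into training, calibration and test indices.
X, y = load_iris(return_X_y=True)
idx = np.random.default_rng(0).permutation(len(y))
train, cal, test = idx[:100], idx[100:140], idx[140:]

# 2-3) Choose and train the underlying model h on Zt.
h = RandomForestClassifier(random_state=0).fit(X[train], y[train])

# 4-5) Nonconformity f(z) = 1 - P(y | x), scored on Zc.
#      Labels 0..k-1 index the predict_proba columns directly here.
proba_cal = h.predict_proba(X[cal])
cal_scores = 1 - proba_cal[np.arange(len(cal)), y[cal]]

# 6-7) For a new x, score every candidate class and compute its
#      p-value: the fraction of calibration scores at least as large.
def p_values(x):
    scores = 1 - h.predict_proba(x.reshape(1, -1))[0]
    return np.array([(np.sum(cal_scores >= s) + 1) / (len(cal_scores) + 1)
                     for s in scores])

# 8-9) The prediction set keeps every class whose p-value exceeds eps.
eps = 0.2
pvals = p_values(X[test][0])
prediction_set = np.where(pvals > eps)[0]
print(pvals, prediction_set)
```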

Example

Now that we understand the general idea behind conformal prediction, we can look at an example of how to use it in Python.

For this example we are going to use a health insurance charges dataset. The idea is to predict a person's insurance charge given factors such as the area they live in, their sex, and the number of children they have, among others.

First, we have to split the data into training, calibration, and test sets.
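
A sketch of this split; the file name insurance.csv, the charges target column, and the 60/20/20 proportions are assumptions for illustration:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed file and column names for the insurance dataset.
data = pd.read_csv("insurance.csv")
X = pd.get_dummies(data.drop(columns="charges"))  # encode categoricals
y = data["charges"].values

# Hold out a test set, then carve a calibration set out of the rest.
X_train, X_test, y_train, y_test = train_test_split(
    X.values, y, test_size=0.2, random_state=42)
X_train, X_cal, y_train, y_cal = train_test_split(
    X_train, y_train, test_size=0.25, random_state=42)
```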

Now we can use nonconformist to estimate the prediction intervals; in this case, the desired coverage is 80%.
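
A sketch following the usage pattern in the package's README [1], with a random forest as the underlying model (the model choice is an assumption). Note that significance is ε, so a desired coverage of 80% means significance=0.2:

```python
from sklearn.ensemble import RandomForestRegressor
from nonconformist.cp import IcpRegressor
from nonconformist.nc import NcFactory

# Wrap the underlying model in a nonconformity function.
model = RandomForestRegressor(random_state=42)
nc = NcFactory.create_nc(model)
icp = IcpRegressor(nc)  # inductive conformal regressor

icp.fit(X_train, y_train)       # train the underlying model
icp.calibrate(X_cal, y_cal)     # compute calibration scores

# Each row of the result is [lower bound, upper bound] for one input.
intervals = icp.predict(X_test, significance=0.2)
lower, upper = intervals[:, 0], intervals[:, 1]
```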

As seen in the previous code, using the nonconformist package is very simple.

With the predicted intervals we can identify the charge ranges where there is more uncertainty.
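
For instance, a plot along the following lines makes the uncertain ranges visible (a hypothetical matplotlib sketch reusing lower, upper, and y_test from above):

```python
import numpy as np
import matplotlib.pyplot as plt

# Sort test points by the true charge so the interval reads as a band.
order = np.argsort(y_test)
plt.fill_between(range(len(order)), lower[order], upper[order],
                 alpha=0.3, label="80% prediction interval")
plt.plot(y_test[order], "k.", markersize=3, label="true charge")
plt.xlabel("test observations (sorted by charge)")
plt.ylabel("insurance charge")
plt.legend()
plt.show()
```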

(Figure: predicted intervals for health insurance charges.)

In the figure we can see that when the insurance charge is between 15,000 and 40,000 the intervals are wider and sometimes miss the real value. This means there is more uncertainty in this range of insurance charges.

It's important to validate that the actual coverage is close to the desired one. To measure the coverage we can use the PICP (prediction interval coverage probability), i.e., the empirical coverage of the intervals on the test set.
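
This is a one-line computation, reusing lower, upper, and y_test from the sketches above:

```python
import numpy as np

# PICP: fraction of test targets that land inside their interval.
picp = np.mean((y_test >= lower) & (y_test <= upper))
print(f"PICP = {picp:.4f}")
```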

For this example we get PICP = 0.8222 = 82.22%, slightly above the 80% coverage we asked for when predicting the intervals.

Conclusion

Using nonconformist is a good alternative to other techniques (Quantile Regression, Monte Carlo Dropout, and Mean Variance Estimation) for estimating prediction intervals. The intervals it produces usually have an empirical coverage close to the desired one, except when the data has outliers. One great advantage of nonconformist is that the desired coverage is chosen at prediction time, meaning we only need to fit and calibrate the underlying model once. Another benefit is that it works for both regression and classification.

It's important to note that the usefulness of the intervals relies on the accuracy of the underlying model. If the model's predictions are not close to the real data, the intervals will have to be very wide to reach the desired coverage. This is why choosing a good underlying model is a key step for obtaining optimal results. On the other hand, if the chosen underlying model is accurate, there's almost no need for parameter tuning.

The width of the predicted intervals is a representation of the uncertainty. The wider the interval, the more uncertain we are of the prediction accuracy. On the other hand, if the interval is relatively narrow, we can be fairly certain about our prediction.

References

[1] Nonconformist, https://github.com/donlnz/nonconformist/blob/master/README.ipynb

[2] Henrik Linusson, An Introduction to Conformal Prediction, Dept. of Information Technology, University of Borås, Sweden, https://cml.rhul.ac.uk/copa2017/presentations/CP_Tutorial_2017.pdf
