Evidently AI: Your Python Companion for Data Drift Detection and Analysis

A Python User’s Guide to Leveraging Evidently AI for Model Monitoring

Okan Yenigün
Jan 6, 2024

Evidently AI is a company focused on model monitoring. It has developed a tool designed specifically to help data scientists track and evaluate the performance of their models over time.

Here, I will not delve into the concepts of model monitoring itself; you can find my related post on that topic below.

To install:

pip install evidently
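Note that the snippets in this post use Evidently's ColumnMapping and Report API. If you are on a newer release where that API has moved, pinning an older version should reproduce them (a hedged suggestion; check the changelog for your environment):

pip install "evidently<0.5"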

Data Generation

First, let’s generate some dummy data to work on.

import pandas as pd
import numpy as np

def generate_regression_dataset(num_cols=5, binary_cols=2, categorical_cols=2, n_samples=1000, noise_level=0.1):
    """
    Generate a regression dataset with specified types and numbers of columns.

    Parameters:
        num_cols (int): Number of numerical columns
        binary_cols (int): Number of binary columns
        categorical_cols (int): Number of categorical columns (ordinal)
        n_samples (int): Number of samples in the dataset
        noise_level (float): Level of noise to add to the dependent variable

    Returns:
        pandas.DataFrame: A DataFrame containing the generated dataset
    """
    df = pd.DataFrame()

    # Generate numerical columns
    for i in range(num_cols):
        df[f'num_col_{i+1}'] = np.random.randn(n_samples)

    # Generate binary columns
    for i in range(binary_cols):
        df[f'binary_col_{i+1}'] = np.random.randint(0, 2, size=n_samples)

    # Generate categorical columns (ordinal)
    for i in range(categorical_cols):
        df[f'cat_col_{i+1}'] = np.random.randint(0, 3, size=n_samples)  # assume 3 categories for simplicity

    # Generate a dependent variable: a simple linear combination of all
    # features with random weights, plus Gaussian noise
    dependent_variable = np.sum([
        df[f'num_col_{i+1}'] * np.random.uniform(0.5, 1.5) for i in range(num_cols)
    ] + [
        df[f'binary_col_{i+1}'] * np.random.uniform(0.5, 1.5) for i in range(binary_cols)
    ] + [
        df[f'cat_col_{i+1}'] * np.random.uniform(0.5, 1.5) for i in range(categorical_cols)
    ], axis=0) + np.random.normal(scale=noise_level, size=n_samples)

    df['target'] = dependent_variable

    return df

# Example usage
df = generate_regression_dataset()
df.head()
Dummy Dataset. Image by the author.

Training

Our dataset is ready for analysis. Next, we split the dataframe into two parts: the first will be used for training, while the second will simulate new, unseen data.

# .copy() avoids a pandas SettingWithCopyWarning when we add prediction columns later
old_data = df.iloc[:750, :].copy()
new_data = df.iloc[750:, :].copy()

Now, let’s fit a Random Forest Regressor model.

from sklearn import ensemble

numerical_features = ["num_col_1","num_col_2","num_col_3","num_col_4", "num_col_5"]
categorical_features = ["binary_col_1","binary_col_2","cat_col_1","cat_col_2"]
target = "target"

regressor = ensemble.RandomForestRegressor(random_state = 42, n_estimators = 100)

regressor.fit(old_data[numerical_features + categorical_features], old_data[target])

Next, we get predictions for both the training data and the unseen data.

train_predictions = regressor.predict(old_data[numerical_features + categorical_features])
new_predictions = regressor.predict(new_data[numerical_features + categorical_features])

prediction_col = 'prediction'

old_data[prediction_col] = train_predictions
new_data[prediction_col] = new_predictions

Column Mapping

Evidently expects a certain dataset structure when analyzing a dataset or a model's predictions. ColumnMapping allows us to explicitly define the role of each column in our dataset.

from evidently import ColumnMapping

column_mapping = ColumnMapping()

column_mapping.target = target
column_mapping.prediction = prediction_col
column_mapping.numerical_features = numerical_features
column_mapping.categorical_features = categorical_features
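If the data includes a timestamp column, we can also map it so that the "over time" plots in the reports below use real dates instead of the row index. A minimal sketch, assuming a hypothetical timestamp column (our dummy data does not have one):

# Hypothetical: map a datetime column so temporal plots use real dates
# column_mapping.datetime = "timestamp"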

The Report class from the Evidently library is used to generate comprehensive reports about the performance and characteristics of machine learning models.

Regression Preset

Evidently ships with predefined sets of metrics (presets) designed for specific use cases. RegressionPreset includes common regression evaluation metrics such as MAE (mean absolute error), RMSE (root mean squared error), and the R² score.

from evidently.report import Report
from evidently.metric_preset import RegressionPreset

regression_performance = Report(metrics=[RegressionPreset()])
regression_performance.run(current_data=old_data, reference_data=None, column_mapping=column_mapping)

We created an instance of a Report, specifying that it should use metrics defined in the RegressionPreset. This means the report will focus on metrics and visualizations relevant to regression models.

First, we pass only the old training data. To display the report, we use the show method.

regression_performance.show()

First, the model quality metrics:

Metrics for regression. Image by the author.

Predicted vs actual scatter plot:

Predicted vs actual. Image by the author.

Predicted and actual values over time, or by index if no datetime column is provided:

Predicted vs Actual over time. Image by the author.

Model error values over time, or by index if no datetime column is provided:

Error over time. Image by the author.
Absolute percentage error. Image by the author.

Error distributions:

Error Distribution. Image by the author.
Error normality q-q plot. Image by the author.

Underestimates and overestimates:

Bias table.
Predicted vs actual.

We can also save the report as an HTML file.

regression_performance.save_html("regression_report.html")
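Besides rendering HTML, a Report exposes its computed results programmatically, which is handy for logging or alerting pipelines. A minimal sketch, assuming the as_dict, json, and save_json methods available on Report objects:

# Access the computed metrics as a Python dict or JSON string
report_dict = regression_performance.as_dict()
report_json = regression_performance.json()
regression_performance.save_json("regression_report.json")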

We can also pass the new data as the current data, with the old data as the reference, to compare how things have changed.

regression_performance = Report(metrics=[RegressionPreset()])
regression_performance.run(current_data=new_data,
                           reference_data=old_data,
                           column_mapping=column_mapping)

regression_performance.show()

All the visual elements are now presented with both the current and reference data adjacent to each other, enabling a direct comparison.

Predicted vs Actual.

Target Drift Preset

We can visualize the Target Drift, too.

from evidently.metric_preset import TargetDriftPreset

target_drift = Report(metrics=[TargetDriftPreset()])
target_drift.run(current_data=new_data,
                 reference_data=old_data,
                 column_mapping=column_mapping)

target_drift.show()

Target distributions:

Target distributions. Image by the author.

Target values:

Target values. Image by the author.

Correlations with the target feature:

Correlations. Image by the author.
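Presets are just bundles of individual metrics, so we can also compose a narrower report ourselves. A minimal sketch using ColumnDriftMetric to test only the target column (assuming it is exposed under evidently.metrics in your version):

from evidently.metrics import ColumnDriftMetric

# Drift check for the target column only, with an explicit statistical test
target_only_drift = Report(metrics=[
    ColumnDriftMetric(column_name="target", stattest="ks"),
])
target_only_drift.run(current_data=new_data,
                      reference_data=old_data,
                      column_mapping=column_mapping)
target_only_drift.show()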

Data Drift Preset

Let's see the Data Drift summary.

from evidently.metric_preset import DataDriftPreset

data_drift = Report(metrics=[DataDriftPreset()])
data_drift.run(current_data=new_data,
               reference_data=old_data,
               column_mapping=column_mapping)

data_drift.show()

First, it checks whether there is any dataset drift; in our case, there is none.

Dataset Drift. Image by the author.

We can compare each feature by distributions and statistics.

Feature Distributions. Image by the author.
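The preset itself can also be tuned, for example to pick the statistical tests or the share of drifting columns that counts as dataset drift. A hedged sketch; the argument names below (num_stattest, cat_stattest, drift_share) may differ across Evidently versions:

# Customize the tests and the dataset-drift threshold
data_drift = Report(metrics=[
    DataDriftPreset(num_stattest="wasserstein",
                    cat_stattest="psi",
                    drift_share=0.5),  # flag dataset drift if >= 50% of columns drift
])
data_drift.run(current_data=new_data,
               reference_data=old_data,
               column_mapping=column_mapping)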

Data Quality Preset

We can explore statistics about the features and evaluate data quality.

from evidently.metric_preset import DataQualityPreset

# Use a new variable name to avoid overwriting the data drift report
data_quality = Report(metrics=[DataQualityPreset()])
data_quality.run(current_data=new_data,
                 reference_data=old_data,
                 column_mapping=column_mapping)

data_quality.show()

First, a summary of the dataset:

Summary widget.

Summaries for each feature (including the target and predictions):

Summary for target.
Summary for predictions
Summary for a numerical feature.
Summary for a categorical feature.

Missing Values:

Missing values.

Correlations between features.

Correlations.
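If we only need a subset of these widgets, individual metrics can again replace the preset. A minimal sketch, assuming DatasetSummaryMetric and DatasetMissingValuesMetric are available under evidently.metrics:

from evidently.metrics import DatasetSummaryMetric, DatasetMissingValuesMetric

# A lighter quality report: overall summary plus missing-value statistics only
quality_summary = Report(metrics=[DatasetSummaryMetric(), DatasetMissingValuesMetric()])
quality_summary.run(current_data=new_data,
                    reference_data=old_data,
                    column_mapping=column_mapping)
quality_summary.show()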

No Target Performance

If we want to check the state of the data without having ground-truth labels, we can use NoTargetPerformanceTestPreset.

TestSuite allows users to define and run a series of tests on their models or data to ensure they meet specified criteria or performance benchmarks.

from evidently.test_preset import NoTargetPerformanceTestPreset
from evidently.test_suite import TestSuite

no_target_performance = TestSuite(tests=[
    NoTargetPerformanceTestPreset(columns=['num_col_1', 'num_col_2'], num_stattest='ks', cat_stattest='psi'),
])

no_target_performance.run(reference_data=old_data, current_data=new_data)
no_target_performance  # in a notebook, this line displays the test suite inline

NoTargetPerformanceTestPreset is a predefined set of tests designed to analyze and compare datasets where there is no target variable.

  • columns=['num_col_1', 'num_col_2']: This specifies that the tests should focus on the columns 'num_col_1' and 'num_col_2'.
  • num_stattest='ks': For numerical columns, the Kolmogorov-Smirnov test (KS test) will be used to compare distributions. The KS test is a nonparametric test that compares the cumulative distributions of two datasets to check if they are from the same distribution.
  • cat_stattest='psi': For categorical columns, the Population Stability Index (PSI) will be used. PSI is a measure used to quantify how much a categorical variable has shifted between two datasets.
No target tests. Image by the author.
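Like reports, test suites can be inspected programmatically, which makes them easy to wire into CI jobs or scheduled checks. A minimal sketch, assuming the as_dict method on TestSuite (the exact key layout may vary by version):

# Summarize pass/fail results from the suite
results = no_target_performance.as_dict()
print(results["summary"])  # overall counts, e.g. how many tests passed
for test in results["tests"]:
    print(test["name"], "->", test["status"])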

Evidently AI emerges as an invaluable tool in the modern data science toolkit, especially vital in an era where the reliability and performance of machine learning models are paramount. By offering comprehensive monitoring and analysis capabilities, Evidently AI empowers data scientists and ML engineers to detect data drift, understand model behavior, and maintain high model performance with greater ease and accuracy.


Sources

https://www.evidentlyai.com/

https://docs.evidentlyai.com/

https://github.com/evidentlyai/evidently

https://www.youtube.com/watch?v=cgc3dSEAel0
