Evidently AI: Your Python Companion for Data Drift Detection and Analysis

A Python User’s Guide to Leveraging Evidently AI for Model Monitoring

Okan Yenigün
Jan 6, 2024

Evidently AI is a company focused on model monitoring. It has developed a tool designed specifically to help data scientists track and evaluate the performance of their models over time.

Here, I will not delve into the concepts of model monitoring itself; you can find my related post on that topic below.

To install:

pip install evidently
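Note that the snippets in this post use Evidently's ColumnMapping and Report API. If you are on a newer release where that API has moved, pinning an older version should reproduce them (a hedged suggestion; check the changelog for your environment):

pip install "evidently<0.5"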

Data Generation

First, let’s generate some dummy data to work on.

import pandas as pd
import numpy as np

def generate_regression_dataset(num_cols=5, binary_cols=2, categorical_cols=2, n_samples=1000, noise_level=0.1):
    """
    Generate a regression dataset with specified types and numbers of columns.

    Parameters:
        num_cols (int): Number of numerical columns
        binary_cols (int): Number of binary columns
        categorical_cols (int): Number of categorical columns (ordinal)
        n_samples (int): Number of samples in the dataset
        noise_level (float): Level of noise to add to the dependent variable

    Returns:
        pandas.DataFrame: A DataFrame containing the generated dataset
    """
    df = pd.DataFrame()

    # Generate numerical columns
    for i in range(num_cols):
        df[f'num_col_{i+1}'] = np.random.randn(n_samples)

    # Generate binary columns
    for i in range(binary_cols):
        df[f'binary_col_{i+1}'] = np.random.randint(0, 2, size=n_samples)

    # Generate categorical columns (ordinal)
    for i in range(categorical_cols):
        df[f'cat_col_{i+1}'] = np.random.randint(0, 3, size=n_samples)  # assume 3 categories for simplicity

    # Generate a dependent variable: a simple linear combination of all
    # features with random weights, plus Gaussian noise
    dependent_variable = np.sum([
        df[f'num_col_{i+1}'] * np.random.uniform(0.5, 1.5) for i in range(num_cols)
    ] + [
        df[f'binary_col_{i+1}'] * np.random.uniform(0.5, 1.5) for i in range(binary_cols)
    ] + [
        df[f'cat_col_{i+1}'] * np.random.uniform(0.5, 1.5) for i in range(categorical_cols)
    ], axis=0) + np.random.normal(scale=noise_level, size=n_samples)

    df['target'] = dependent_variable

    return df

# Example usage
df = generate_regression_dataset()
df.head()
Dummy Dataset. Image by the author.

Training

Our dataset is ready for analysis. Next, we split the dataframe into two parts: the first will be used for training, while the second will simulate new, unseen data.

# .copy() avoids a pandas SettingWithCopyWarning when we add prediction columns later
old_data = df.iloc[:750, :].copy()
new_data = df.iloc[750:, :].copy()

Now, let’s fit a Random Forest Regressor model.

from sklearn import ensemble

numerical_features = ["num_col_1","num_col_2","num_col_3","num_col_4", "num_col_5"]
categorical_features = ["binary_col_1","binary_col_2","cat_col_1","cat_col_2"]
target = "target"

regressor = ensemble.RandomForestRegressor(random_state = 42, n_estimators = 100)

regressor.fit(old_data[numerical_features + categorical_features], old_data[target])

Next, we get predictions for both the training data and the unseen data.

train_predictions = regressor.predict(old_data[numerical_features + categorical_features])
new_predictions = regressor.predict(new_data[numerical_features + categorical_features])

prediction_col = 'prediction'

old_data[prediction_col] = train_predictions
new_data[prediction_col] = new_predictions

Column Mapping

Evidently expects a certain dataset structure when analyzing a dataset or a model's predictions. ColumnMapping allows us to explicitly define the role of each column in our dataset.

from evidently import ColumnMapping

column_mapping = ColumnMapping()

column_mapping.target = target
column_mapping.prediction = prediction_col
column_mapping.numerical_features = numerical_features
column_mapping.categorical_features = categorical_features
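If the data includes a timestamp column, we can also map it so that the "over time" plots in the reports below use real dates instead of the row index. A minimal sketch, assuming a hypothetical timestamp column (our dummy data does not have one):

# Hypothetical: map a datetime column so temporal plots use real dates
# column_mapping.datetime = "timestamp"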

The Report class from the Evidently library is used to generate comprehensive reports about the performance and characteristics of machine learning models.

Regression Preset

Evidently ships with predefined sets of metrics (presets) designed for specific use cases. RegressionPreset includes common regression evaluation metrics such as MAE (mean absolute error), RMSE (root mean squared error), and the R² score.

from evidently.report import Report
from evidently.metric_preset import RegressionPreset

regression_performance = Report(metrics=[RegressionPreset()])
regression_performance.run(current_data=old_data, reference_data=None, column_mapping=column_mapping)

We created an instance of a Report, specifying that it should use metrics defined in the RegressionPreset. This means the report will focus on metrics and visualizations relevant to regression models.

First, we pass only the old training data. To display the report, we use the show method.

regression_performance.show()

First, the model quality metrics:

Metrics for regression. Image by the author.

Predicted vs actual scatter plot:

Predicted vs actual. Image by the author.

Predicted and actual values over time, or by index if no datetime column is provided:

Predicted vs Actual over time. Image by the author.

Model error values over time, or by index if no datetime column is provided:

Error over time. Image by the author.
Absolute percentage error. Image by the author.

Error distributions:

Error Distribution. Image by the author.
Error normality q-q plot. Image by the author.

Underestimates and overestimates:

Bias table.
Predicted vs actual.

We can also save the report as an HTML file.

regression_performance.save_html("regression_report.html")
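Besides rendering HTML, a Report exposes its computed results programmatically, which is handy for logging or alerting pipelines. A minimal sketch, assuming the as_dict, json, and save_json methods available on Report objects:

# Access the computed metrics as a Python dict or JSON string
report_dict = regression_performance.as_dict()
report_json = regression_performance.json()
regression_performance.save_json("regression_report.json")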

We can also pass the new data as the current data, with the old data as the reference, to compare how things have changed.

regression_performance = Report(metrics=[RegressionPreset()])
regression_performance.run(current_data=new_data,
                           reference_data=old_data,
                           column_mapping=column_mapping)

regression_performance.show()

All the visual elements are now presented with both the current and reference data adjacent to each other, enabling a direct comparison.

Predicted vs Actual.

Target Drift Preset

We can visualize the Target Drift, too.

from evidently.metric_preset import TargetDriftPreset

target_drift = Report(metrics=[TargetDriftPreset()])
target_drift.run(current_data=new_data,
                 reference_data=old_data,
                 column_mapping=column_mapping)

target_drift.show()

Target distributions:

Target distributions. Image by the author.

Target values:

Target values. Image by the author.

Correlations with the target feature:

Correlations. Image by the author.
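Presets are just bundles of individual metrics, so we can also compose a narrower report ourselves. A minimal sketch using ColumnDriftMetric to test only the target column (assuming it is exposed under evidently.metrics in your version):

from evidently.metrics import ColumnDriftMetric

# Drift check for the target column only, with an explicit statistical test
target_only_drift = Report(metrics=[
    ColumnDriftMetric(column_name="target", stattest="ks"),
])
target_only_drift.run(current_data=new_data,
                      reference_data=old_data,
                      column_mapping=column_mapping)
target_only_drift.show()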

Data Drift Preset

Let's see the Data Drift summary.

from evidently.metric_preset import DataDriftPreset

data_drift = Report(metrics=[DataDriftPreset()])
data_drift.run(current_data=new_data,
               reference_data=old_data,
               column_mapping=column_mapping)

data_drift.show()

First, it checks whether there is any dataset drift; in our case, there is none.

Dataset Drift. Image by the author.

We can compare each feature by distributions and statistics.

Feature Distributions. Image by the author.
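The preset itself can also be tuned, for example to pick the statistical tests or the share of drifting columns that counts as dataset drift. A hedged sketch; the argument names below (num_stattest, cat_stattest, drift_share) may differ across Evidently versions:

# Customize the tests and the dataset-drift threshold
data_drift = Report(metrics=[
    DataDriftPreset(num_stattest="wasserstein",
                    cat_stattest="psi",
                    drift_share=0.5),  # flag dataset drift if >= 50% of columns drift
])
data_drift.run(current_data=new_data,
               reference_data=old_data,
               column_mapping=column_mapping)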

Data Quality Preset

We can explore statistics about the features and evaluate data quality.

from evidently.metric_preset import DataQualityPreset

# Use a new variable name to avoid overwriting the data drift report
data_quality = Report(metrics=[DataQualityPreset()])
data_quality.run(current_data=new_data,
                 reference_data=old_data,
                 column_mapping=column_mapping)

data_quality.show()

First, a summary of the dataset:

Summary widget.

Summaries for each feature (including the target and predictions):

Summary for target.
Summary for predictions
Summary for a numerical feature.
Summary for a categorical feature.

Missing Values:

Missing values.

Correlations between features.

Correlations.
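If we only need a subset of these widgets, individual metrics can again replace the preset. A minimal sketch, assuming DatasetSummaryMetric and DatasetMissingValuesMetric are available under evidently.metrics:

from evidently.metrics import DatasetSummaryMetric, DatasetMissingValuesMetric

# A lighter quality report: overall summary plus missing-value statistics only
quality_summary = Report(metrics=[DatasetSummaryMetric(), DatasetMissingValuesMetric()])
quality_summary.run(current_data=new_data,
                    reference_data=old_data,
                    column_mapping=column_mapping)
quality_summary.show()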

No Target Performance

If we want to check the state of the data without having ground-truth labels, we can use NoTargetPerformanceTestPreset.

TestSuite allows users to define and run a series of tests on their models or data to ensure they meet specified criteria or performance benchmarks.

from evidently.test_preset import NoTargetPerformanceTestPreset
from evidently.test_suite import TestSuite

no_target_performance = TestSuite(tests=[
    NoTargetPerformanceTestPreset(columns=['num_col_1', 'num_col_2'], num_stattest='ks', cat_stattest='psi'),
])

no_target_performance.run(reference_data=old_data, current_data=new_data)
no_target_performance  # in a notebook, this line displays the test suite inline

NoTargetPerformanceTestPreset is a predefined set of tests designed to analyze and compare datasets where there is no target variable.

  • columns=['num_col_1', 'num_col_2']: This specifies that the tests should focus on the columns 'num_col_1' and 'num_col_2'.
  • num_stattest='ks': For numerical columns, the Kolmogorov-Smirnov test (KS test) will be used to compare distributions. The KS test is a nonparametric test that compares the cumulative distributions of two datasets to check if they are from the same distribution.
  • cat_stattest='psi': For categorical columns, the Population Stability Index (PSI) will be used. PSI is a measure used to quantify how much a categorical variable has shifted between two datasets.
No target tests. Image by the author.
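Like reports, test suites can be inspected programmatically, which makes them easy to wire into CI jobs or scheduled checks. A minimal sketch, assuming the as_dict method on TestSuite (the exact key layout may vary by version):

# Summarize pass/fail results from the suite
results = no_target_performance.as_dict()
print(results["summary"])  # overall counts, e.g. how many tests passed
for test in results["tests"]:
    print(test["name"], "->", test["status"])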

Evidently AI emerges as an invaluable tool in the modern data science toolkit, especially vital in an era where the reliability and performance of machine learning models are paramount. By offering comprehensive monitoring and analysis capabilities, Evidently AI empowers data scientists and ML engineers to detect data drift, understand model behavior, and maintain high model performance with greater ease and accuracy.


Sources

https://www.evidentlyai.com/

https://docs.evidentlyai.com/

https://github.com/evidentlyai/evidently

https://www.youtube.com/watch?v=cgc3dSEAel0
