WhyLabs Weekly MLOps: Detecting Data Drift

Detecting data drift, AI observability with Hugging Face, seven ways to monitor LLMs, and more!

Sage Elliott
WhyLabs
4 min read · Jul 28, 2023


A lot happens every week in the WhyLabs Robust & Responsible AI (R2AI) community! This weekly update serves as a recap so you don’t miss a thing!

Start learning about MLOps and ML Monitoring:

💡 MLOps tip of the week:

Use the whylogs open source library to detect data drift in your Python environment:

Once you install whylogs with `pip`, you can create a profile of your dataset with just a few lines of code! These data profiles contain summary statistics about your dataset and can be used to monitor for data drift and data quality issues. To check for drift you'll need two profiles: a reference (baseline) profile and a target (current) profile to compare against it.
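For reference, a typical install for the examples in this tip might look like the following (assuming the `viz` extra is what pulls in the visualization utilities used below; check the whylogs docs for your version):

pip install pandas "whylogs[viz]"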

import whylogs as why
import pandas as pd

# profile two pandas dataframes: a reference (baseline) dataset and a target (current) dataset
df_reference = pd.read_csv("path/to/reference.csv")
df_target = pd.read_csv("path/to/target.csv")

profile_view1 = why.log(df_target).view()     # target profile view
profile_view2 = why.log(df_reference).view()  # reference profile view
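These profile views can be inspected directly. As a quick sketch (using the variables defined above), a profile view can be flattened into a pandas dataframe of per-column summary statistics:

# flatten the profile's per-column summary statistics into a pandas dataframe
summary = profile_view1.to_pandas()
print(summary.head())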

Next, we can generate a data drift report between the two profiles using the `NotebookProfileVisualizer`. By default, whylogs uses the Kolmogorov-Smirnov (KS) test to measure drift between the profiles, but other popular drift metrics can be selected instead.

In the example below, we can see that data drift has been detected for the petal length feature.

# Measure Data Drift with whylogs
from whylogs.viz import NotebookProfileVisualizer

visualization = NotebookProfileVisualizer()
visualization.set_profiles(target_profile_view=profile_view1, reference_profile_view=profile_view2)
visualization.summary_drift_report()
Data Drift report using whylogs

To better visualize the drift, use `double_histogram` to overlay the target and reference histograms of the petal length feature.

visualization.double_histogram(feature_name="petal length (cm)")
Data drift visualized for a single feature

To get the raw data drift metrics, use `calculate_drift_scores` from whylogs. This returns a Python dictionary containing the drift algorithm, score, thresholds, and drift category for each feature. Learn more about adjusting these parameters in the whylogs documentation.

from whylogs.viz.drift.column_drift_algorithms import calculate_drift_scores

scores = calculate_drift_scores(target_view=profile_view1, reference_view=profile_view2, with_thresholds=True)

print(scores)

Returned data drift metrics:

{'sepal length (cm)': {'algorithm': 'ks',
'pvalue': 0.2694519362228452,
'statistic': 0.11333333333333329,
'thresholds': {'NO_DRIFT': (0.15, 1),
'POSSIBLE_DRIFT': (0.05, 0.15),
'DRIFT': (0, 0.05)},
'drift_category': 'NO_DRIFT'},
'sepal width (cm)': {'algorithm': 'ks',
'pvalue': 0.9756502052466759,
'statistic': 0.05333333333333334,
'thresholds': {'NO_DRIFT': (0.15, 1),
'POSSIBLE_DRIFT': (0.05, 0.15),
'DRIFT': (0, 0.05)},
'drift_category': 'NO_DRIFT'},
'petal length (cm)': {'algorithm': 'ks',
'pvalue': 0.9993989748100714,
'statistic': 0.04000000000000001,
'thresholds': {'NO_DRIFT': (0.15, 1),
'POSSIBLE_DRIFT': (0.05, 0.15),
'DRIFT': (0, 0.05)},
'drift_category': 'NO_DRIFT'},
'petal width (cm)': {'algorithm': 'ks',
'pvalue': 0.9756502052466759,
'statistic': 0.053333333333333344,
'thresholds': {'NO_DRIFT': (0.15, 1),
'POSSIBLE_DRIFT': (0.05, 0.15),
'DRIFT': (0, 0.05)},
'drift_category': 'NO_DRIFT'}}
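
Because the result is a plain Python dictionary, it's easy to plug into your own checks. Here is a minimal sketch (the alerting logic is just an illustration) that flags any feature whose drift category is not `NO_DRIFT`:

# flag any feature whose drift category indicates possible or detected drift
for feature, result in scores.items():
    if result["drift_category"] != "NO_DRIFT":
        print(f"{feature}: {result['drift_category']} (algorithm={result['algorithm']}, p-value={result['pvalue']:.4f})")

In the run above, all four features fall into the `NO_DRIFT` category, so nothing would be flagged.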

Learn more about detecting data drift with whylogs in the project's documentation and example notebooks.

📝 Latest blog posts:

Hugging Face and LangKit: Your Solution for LLM Observability

Hugging Face has quickly become a leading name in the world of natural language processing (NLP), with its open-source library becoming the go-to resource for developers and researchers alike. As more organizations turn to Hugging Face’s language models for their NLP needs, the need for robust monitoring and observability solutions becomes more apparent. Read more on WhyLabs.AI

7 Ways to Monitor Large Language Model Behavior

In the ever-evolving landscape of AI, Large Language Models (LLMs) have revolutionized Natural Language Processing. With their remarkable ability to generate coherent and contextually relevant human-like text, LLMs have gained immense importance and adoption, transforming the way we interact with technology. Read more on WhyLabs.AI

🎥 Event recordings

LLMs in Production: Lessons Learned — Joe Heitzeberg, CEO Blueprint AI

At this event, we spoke with Joe Heitzeberg, co-founder and CEO of Blueprint AI, about putting Large Language Models (LLMs) in production and the lessons they've learned along the way!

💻 WhyLabs open source updates:

whylogs v1.2.6 has been released!

whylogs is the open standard for data logging & AI telemetry. This week’s update includes:

  • Type handling: add `np.integer` to int types
  • Backwards compatibility with KLL floats

See the full whylogs release notes on GitHub.

LangKit 0.0.11 has been released!

LangKit is an open-source text metrics toolkit for monitoring language models.

  • Update dataset UDF to the new signature

See the full LangKit release notes on GitHub.

🤝 Stay connected with the WhyLabs Community:

Join the thousands of machine learning engineers and data scientists already using WhyLabs to solve some of the most challenging ML monitoring cases!

Request a demo to learn how ML monitoring can benefit your company.

See you next time! — Sage Elliott
