WhyLabs Weekly: ML Monitoring for Data Drift

ML monitoring for data drift, building better computer vision models, and more.

Sage Elliott
WhyLabs
4 min read · Aug 18, 2023


A lot happens every week in the WhyLabs Robust & Responsible AI (R2AI) community! This weekly update serves as a recap so you don’t miss a thing!

Start learning about MLOps and ML Monitoring:

💡 MLOps tip of the week:

How to set up ML monitoring for data drift: A couple of weeks ago we looked at using whylogs, our open source library, to detect data drift in a Python environment. This week we'll see how to monitor ML models for data drift in the WhyLabs Observatory platform, which lets us easily configure alerts or workflows based on any detected anomalies.

Once you install whylogs using `pip`, you can create a profile of your dataset with just a few lines of code. These data profiles contain summary statistics about your dataset and can be used to monitor for data drift and data quality issues, and they can be sent to WhyLabs for visualization and monitoring just as easily.

import os

# set WhyLabs authentication & project keys
os.environ["WHYLABS_DEFAULT_ORG_ID"] = 'ORGID'
os.environ["WHYLABS_API_KEY"] = 'APIKEY'
os.environ["WHYLABS_DEFAULT_DATASET_ID"] = 'MODELID'

import whylogs as why
from whylogs.api.writer.whylabs import WhyLabsWriter

# profile the dataset batch and write a single profile to WhyLabs
writer = WhyLabsWriter()
profile = why.log(dataset_batch)
writer.write(file=profile.view())

Monitoring for data quality, data drift, and model performance can be set up with just one click using prebuilt configurations, or custom monitors can be configured in the UI or with JSON.
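For a rough idea of the JSON route, a custom drift monitor might look something like the sketch below. This is a hypothetical illustration only; the analyzer IDs, field names, and values here are assumptions, so consult the WhyLabs monitor documentation for the exact schema.

```json
{
  "analyzers": [
    {
      "id": "numeric-drift-analyzer",
      "targetMatrix": { "type": "column", "include": ["group:continuous"] },
      "schedule": { "type": "fixed", "cadence": "daily" },
      "config": {
        "type": "drift",
        "metric": "histogram",
        "algorithm": "hellinger",
        "threshold": 0.7
      }
    }
  ]
}
```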

Preset configurations for ML monitoring with WhyLabs

To preview monitor alerts click the “preview now” button in the corresponding data tab.

Example of data drift monitoring preview with WhyLabs

Learn more about detecting data drift and ML monitoring:

📝 Latest blog posts:

WhyLabs Recognized by CB Insights GenAI 50 among the Most Innovative Generative AI Startups

CB Insights named WhyLabs to its first annual GenAI 50 ranking, a list of the world’s top 50 most innovative companies developing generative AI applications and infrastructure across industries. What makes this particularly notable is that Model Observability is immediately recognized as an essential category critical to the success of LLM applications. Read more on WhyLabs.AI

🎥 Event recordings

Building Better Computer Vision Models — Harpreet Sahota at Deci AI

At this event, we spoke with Harpreet Sahota about the exciting world of deep learning and computer vision, and discussed how to build better computer vision models and applications.

Building Better Computer Vision Models

💻 WhyLabs open source updates:

whylogs v1.3.0 has been released!

whylogs is the open standard for data logging & AI telemetry. This week’s update includes:

  • Register validator UDF
  • Allow `session_type` argument to `init()` for backward compatibility
  • Add new process logger and thread logger
  • Respect the `allow_anonymous` flag even if you've previously been anonymous
  • Several changes to support `why.init` for authenticated users

See full whylogs release notes on GitHub.

LangKit 0.0.15 has been released!

LangKit is an open-source text metrics toolkit for monitoring language models.

  • Fix aggregate reading level regression
  • Add regex counter UDFs
  • Allow custom embeddings encoder for input_output and themes
  • Fix some config issues
  • Add Makefile

See full LangKit release notes on GitHub.

🤝 Stay connected with the WhyLabs Community:

Join the thousands of machine learning engineers and data scientists already using WhyLabs to solve some of the most challenging ML monitoring cases!

Request a demo to learn how ML monitoring can benefit your company.

See you next time! — Sage Elliott
