Sage Elliott

Nov 15, 2022

ML Monitoring in Under 5 Minutes

ML Monitoring
Whylogs
Open Source
WhyLabs
Data Quality

It only takes a few minutes and a few lines of code to monitor your ML models and data pipelines.

Data validation and ML model monitoring are foundational steps to building reliable pipelines and responsible machine learning applications.

In this short post, I will show you how to use an open source data logging library and an AI observatory platform to monitor common issues with your ML models, such as data drift, concept drift, data quality, and performance.

Data logging and ML monitoring setup

First, we’ll install whylogs, an open-source data logging library that captures key statistical properties of data. We’ll also include dependencies for writing to the WhyLabs AI observatory for ML monitoring.

pip install "whylogs[whylabs]"

Next, we’ll import the `whylogs`, `pandas`, and `os` libraries into our Python project. We’ll also load our dataset into a dataframe so we can profile it.

import whylogs as why
import pandas as pd
import os
# create dataframe with dataset
dataset = pd.read_csv("https://whylabs-public.s3.us-west-2.amazonaws.com/datasets/tour/current.csv")

The data profiles created with whylogs can be used on their own for data validation and data drift visualization, but in this example, we’re going to write profiles to the WhyLabs Observatory to perform ML monitoring.
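To make "statistical properties" concrete, here is a toy sketch of the kind of per-column summary a profile holds. It is plain Python written for illustration, not whylogs' implementation, and the function names and sample rows are made up for this example.

```python
import math
from statistics import mean


def summarize_column(values):
    """Return a small statistical summary of one column (toy illustration)."""
    present = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    summary = {"count": len(values), "null_count": len(values) - len(present)}
    if numeric:
        summary.update(min=min(numeric), max=max(numeric), mean=mean(numeric))
    return summary


def summarize_table(rows):
    """Summarize every column of a list-of-dicts table."""
    columns = {key for row in rows for key in row}
    return {
        col: summarize_column([row.get(col) for row in rows])
        for col in sorted(columns)
    }


# Hypothetical sample rows, not taken from the dataset used in this post.
rows = [
    {"price": 10.0, "category": "a"},
    {"price": 14.0, "category": None},
    {"price": None, "category": "b"},
]
toy_profile = summarize_table(rows)
print(toy_profile)
```

The point is that a summary like this is small and contains no raw rows, which is what makes it practical to send to a monitoring service.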

To write profiles to WhyLabs, we’ll create an account, grab our `Org-ID`, `Access token`, and `Project-ID`, and set them as environment variables in our project.

# Set WhyLabs access keys
os.environ["WHYLABS_DEFAULT_ORG_ID"] = 'YOURORGID'
os.environ["WHYLABS_API_KEY"] = 'YOURACCESSTOKEN'
os.environ["WHYLABS_DEFAULT_DATASET_ID"] = 'PROJECTID'

Create a free WhyLabs account here, no credit card required.

Create a new project and get the ID:

Create Project > Set up model > Create Project

Get organization ID and access token:

Menu > Settings > Access Tokens > Create Access Token
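Before writing any profiles, it can help to confirm that all three settings are actually present. Here is a minimal sketch using only the standard library. The variable names come from the snippet earlier in this post, while the `missing_whylabs_vars` helper is hypothetical and not part of whylogs.

```python
import os

# Variable names as used earlier in this post; the helper is hypothetical.
REQUIRED_VARS = (
    "WHYLABS_DEFAULT_ORG_ID",
    "WHYLABS_API_KEY",
    "WHYLABS_DEFAULT_DATASET_ID",
)


def missing_whylabs_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_whylabs_vars()
if missing:
    print("Missing WhyLabs settings:", ", ".join(missing))
else:
    print("All WhyLabs settings are present.")
```

In a real project you would likely load these values from a secrets manager or a local environment file rather than hard-coding them in a notebook.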

That’s it for setting up. We can now write data profiles to WhyLabs.

Write profiles to WhyLabs for ML monitoring

Once the access keys are set up, we can create a profile of our dataset and write it to WhyLabs. This allows us to monitor input data and model predictions with just a few lines of code!

from whylogs.api.writer.whylabs import WhyLabsWriter

# Initialize the WhyLabs writer, create a whylogs profile, and write it to WhyLabs
writer = WhyLabsWriter()
profile = why.log(dataset)
writer.write(file=profile.view())

Profiles can be created at any stage of a pipeline allowing you to monitor data at every step.

By default, the timestamp is the time of the profile upload, but it can be overridden to log data from different collection times and to backfill profiles.

You can see an example of writing and backfilling data in this notebook.
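As a rough sketch of the backfilling idea, the snippet below only generates the timestamps: one timezone-aware UTC value per day, newest first. The `backfill_timestamps` helper is hypothetical, and the whylogs calls mentioned in the comment are an assumption about how each batch would be stamped, so check the notebook for the exact usage.

```python
from datetime import datetime, timedelta, timezone


def backfill_timestamps(days, now=None):
    """Return one timezone-aware UTC timestamp per day, newest first."""
    now = datetime.now(timezone.utc) if now is None else now
    return [now - timedelta(days=offset) for offset in range(days)]


stamps = backfill_timestamps(3, now=datetime(2022, 11, 15, tzinfo=timezone.utc))
for stamp in stamps:
    # Assumption: with whylogs installed, each daily batch would be profiled
    # and stamped at this point, for example
    #   profile = why.log(batch).profile()
    #   profile.set_dataset_timestamp(stamp)
    # before the profile is written to WhyLabs.
    print(stamp.isoformat())
```

Using timezone-aware values avoids ambiguity about which day a batch belongs to when pipelines run in different regions.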

Once profiles are written to WhyLabs they can be inspected, compared, and monitored for data quality and data drift.

Comparing and Inspecting Profiles in WhyLabs

Now we can enable a pre-configured monitor with just one click (or create a custom one) to detect anomalies in our data profiles. This makes it easy to set up common monitoring tasks, such as detecting data drift, data quality issues, and changes in model performance.

Once a monitor is configured, it can be previewed while inspecting an input feature.

Data Drift Detected With ML Monitoring in WhyLabs
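To give a feel for what a drift check measures, here is a toy sketch that compares a reference sample with a current sample using Hellinger distance over shared histogram bins. Hellinger distance is one common choice for this. This post does not state which calculation the pre-configured monitor uses, so treat this as an illustration of the idea and not as the WhyLabs algorithm. The function names and sample values are made up.

```python
import math


def histogram(values, low, high, bins):
    """Normalized histogram of values over [low, high] with equal-width bins."""
    counts = [0] * bins
    width = (high - low) / bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    total = len(values)
    return [c / total for c in counts]


def hellinger_distance(reference, current, bins=10):
    """Hellinger distance between two numeric samples (0 = same, 1 = disjoint)."""
    low = min(min(reference), min(current))
    high = max(max(reference), max(current))
    if low == high:
        return 0.0
    p = histogram(reference, low, high, bins)
    q = histogram(current, low, high, bins)
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2)


reference = [1.0, 2.0, 3.0, 4.0, 5.0]
same = hellinger_distance(reference, reference)
shifted = hellinger_distance(reference, [11.0, 12.0, 13.0, 14.0, 15.0])
print(same, shifted)
```

A monitor then reduces to comparing a score like this against a threshold and raising an anomaly when it is exceeded.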

When anomalies are detected, notifications can be sent via email, Slack, or PagerDuty. Set notification preferences in Settings > Notifications & Digest Settings.

That’s it! We have gone through all the steps needed to ingest data from anywhere in ML pipelines and get notified if anomalies occur.

Separating model input and outputs

It can be useful to separate model inputs and outputs, especially if you have a lot of features in your input data. Any features with names that contain the word “output” will appear in the outputs tab.
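One way to take advantage of this naming rule is to rename prediction columns before profiling. The sketch below builds a rename mapping in plain Python. The column names and the `with_output_suffix` helper are hypothetical.

```python
def with_output_suffix(columns, prediction_columns, suffix="_output"):
    """Map prediction column names to names that contain the word 'output'."""
    mapping = {}
    for name in columns:
        if name in prediction_columns and "output" not in name:
            mapping[name] = f"{name}{suffix}"
    return mapping


# Hypothetical column names for illustration.
columns = ["age", "income", "prediction", "score_output"]
rename_map = with_output_suffix(columns, {"prediction", "score_output"})
print(rename_map)

# With pandas, the mapping could then be applied before profiling:
# dataset = dataset.rename(columns=rename_map)
```

Columns that already contain "output" are left alone, so the helper is safe to run more than once.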

Monitoring model performance metrics

So far we’ve seen how to monitor model input and output data, but we can also monitor performance metrics such as accuracy, precision, etc. by logging ground truth with our prediction results.

To log performance metrics for monitoring, use `why.log_classification_metrics` or `why.log_regression_metrics` and pass in a dataframe containing the ground truth and our model’s output results.

# df is a dataframe holding the ground truth and model output columns named below
results = why.log_classification_metrics(
    df,
    target_column="ground_truth",
    prediction_column="cls_output",
    score_column="prob_output",
)

profile = results.profile()
results.writer("whylabs").write()

Note: Make sure your project is configured as a classification or regression model in the settings.
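For readers newer to these metrics, here is a small worked example of accuracy and precision computed by hand from ground truth and predicted labels. It is plain Python for illustration only, not how whylogs computes them, and the label values are made up.

```python
def accuracy(truth, predicted):
    """Fraction of predictions that match the ground truth."""
    if not truth:
        raise ValueError("truth must not be empty")
    matches = sum(1 for t, p in zip(truth, predicted) if t == p)
    return matches / len(truth)


def precision(truth, predicted, positive=1):
    """Of the rows predicted positive, the fraction that are truly positive."""
    predicted_positive = [t for t, p in zip(truth, predicted) if p == positive]
    if not predicted_positive:
        return 0.0
    hits = sum(1 for t in predicted_positive if t == positive)
    return hits / len(predicted_positive)


# Hypothetical labels: four of six predictions are correct, and two of the
# three positive predictions are truly positive.
ground_truth = [1, 0, 1, 1, 0, 0]
cls_output = [1, 0, 0, 1, 1, 0]
print(accuracy(ground_truth, cls_output))
print(precision(ground_truth, cls_output))
```

Both numbers depend on having ground truth available, which is why performance monitoring often lags behind input monitoring in production.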

Just like the input data, performance metrics get uploaded with the current timestamp unless overwritten. See an example of backfilling data for performance monitoring in the example notebooks below.

Backfilling Data for Performance Monitoring in WhyLabs

Again we can select a pre-configured monitor to detect any change in performance.

See example notebooks for classification and regression monitoring on our GitHub.

Recap on ML monitoring

We covered how to quickly set up data and ML monitoring that can be used at any point in your pipeline! With the right tools, ML monitoring can take only a few minutes and a few lines of code.

We barely scratched the surface of whylogs and WhyLabs features. If you’d like to learn more, request a demo or sign up for free and explore the features yourself!

Example notebooks mentioned in this post:

Ready to implement data & ML monitoring in your own applications?

Check out whylogs for open-source data logging
Create a free WhyLabs account for data and ML monitoring
Join our Community Slack channel to ask questions and learn more
