
Data + Model Monitoring with WhyLabs: simple, customizable, actionable

Today, WhyLabs Observatory monitors hundreds of machine learning models and data streams. Our customers rely on us to alert them about data drift, data quality issues, model performance degradation, and more. Every monitoring experience begins with configuration, and if configuring monitors isn't simple and reliable, that's where the experience ends. To make sure our customers can easily configure monitoring on our platform, we focused our recent product efforts on refining the Observatory's monitoring configuration experience. The new monitoring system maximizes the helpfulness of alerts and minimizes alert fatigue, so users can focus on improving their models instead of worrying about them in production. Today, we are excited to announce a new version of the WhyLabs monitoring system that allows users to:

  • Create custom-tailored monitors for any data and ML monitoring use case.
  • Switch on preset monitors with zero configuration.
  • Tune each monitor's severity and notification pattern to achieve reliable alerting.

Customizable monitoring for any use case

Our customers come from a myriad of industries including logistics, healthcare, fintech, retail, HR tech, and martech. What unites them is their reliance on the WhyLabs Observatory to provide a flexible monitoring experience for their models and data. Customizable monitoring enables them to do this efficiently.

With our new monitoring system, you can:

  • Select the exact set of features you would like to monitor.
  • Configure the most suitable analysis: static thresholds, standard deviation, percent change, and many more.
  • Choose the most suitable baseline: training data (reference), trailing window, or reference date range.
  • Customize the appropriate severity and the action that should be taken on alert: this can include notifying you about the anomaly, automatically retraining an ML model, rolling back a data pipeline change, and more.
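
To make the analysis and baseline options above concrete, here is a minimal sketch of a standard-deviation analysis evaluated against a trailing-window baseline. This is purely illustrative: the function name, window values, and three-sigma bound are invented for the example and are not WhyLabs' implementation.

```python
from statistics import mean, stdev

def stddev_anomaly(trailing_window, current_value, num_stddevs=3.0):
    """Flag the current batch value as anomalous if it falls outside
    mean +/- num_stddevs * stddev of the trailing-window baseline."""
    mu = mean(trailing_window)
    sigma = stdev(trailing_window)
    lower, upper = mu - num_stddevs * sigma, mu + num_stddevs * sigma
    return not (lower <= current_value <= upper)

# Seven days of a feature's daily mean as the trailing window:
window = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1]
print(stddev_anomaly(window, 10.0))  # within bounds -> False
print(stddev_anomaly(window, 14.0))  # well outside -> True
```

A static-threshold or percent-change analysis would swap out the bounds computation, and the baseline could just as easily be a training reference profile or a fixed date range instead of a trailing window.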

Start simple: one-click Preset monitors

While fully customizable monitoring satisfies the most fine-grained use cases, not everything requires this level of tuning. When customers start using WhyLabs Observatory, they often prefer a simple way to switch on monitoring for their data and ML models. Our newest release makes this possible!

The Presets monitoring experience makes it easy for users to configure the most essential monitors with a single click. API access is also available for turning on Preset monitors programmatically. These Presets intelligently configure granular monitors based on the data in the dataset or machine learning model. This approach allows monitoring to be configured for a model with thousands of features in just a few clicks. For those who love to be in control, you’re able to fine-tune the configurations to minimize unhelpful alerts.
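
Programmatic setup might look roughly like the sketch below. To be clear, the helper, field names, and preset identifier are assumptions made for illustration, not the documented WhyLabs API schema; consult the API reference for the real request shape.

```python
import json

def preset_monitor_payload(dataset_id, preset_id, severity="high"):
    """Build a hypothetical request body for enabling a preset monitor.
    Every field name here is illustrative, not the real API schema."""
    return {
        "datasetId": dataset_id,
        "presetId": preset_id,
        "severity": severity,
        "enabled": True,
    }

payload = preset_monitor_payload("model-42", "f1-score-preset")
print(json.dumps(payload, indent=2))
# The payload would then be sent to the monitor-configuration endpoint,
# authenticated with your org ID and API key.
```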

One application of these Preset monitors is allowing enterprises who are monitoring dozens or hundreds of ML models and data streams to efficiently set up monitoring on those assets. In fact, our customers can configure the Observatory to automatically monitor new models and data streams by integrating WhyLabs with their data and model deployment systems.

Learn more

If you are interested and ready to dive in, check out the Monitor Manager documentation or try it out for yourself by signing up for the platform for free. If you’d rather go through a few examples first, read on as we cover enabling a Preset monitor for a fraud classification model and customizing a monitor for a data stream.

Model Monitoring: a fraud classification example

To see the monitoring experience in action, let’s walk through a classic example: a fraud classification model. We set up an observability Project that monitors the health of this model on the WhyLabs platform.

About the model: it takes in a number of features about a transaction, including the amount, country, and type, and outputs a prediction of whether this transaction is fraudulent or not.

How it works: If a transaction is predicted to be fraudulent, it is routed to a human analyst who confirms whether it is actually fraudulent. Transactions that aren't flagged also have some probability of being routed to a human, so the model gets continuous feedback about its performance.
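
That routing logic can be sketched in a few lines of Python. This is a toy version: the function name and the 5% audit rate are made up for the example.

```python
import random

def route_for_review(is_predicted_fraud, audit_rate=0.05, rng=random.random):
    """Predicted-fraud transactions always go to a human analyst;
    the rest are sampled at `audit_rate` so the model keeps receiving
    ground-truth labels for apparently clean traffic."""
    if is_predicted_fraud:
        return True
    return rng() < audit_rate

print(route_for_review(True))  # flagged -> always reviewed
random.seed(0)
audited = sum(route_for_review(False) for _ in range(1000))
print(audited)  # roughly 5% of 1000 clean-looking transactions
```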

This is the Observatory's Model Performance page. We can see how our model's performance has changed over time because we have access to ground truth for our model.

In this scenario, let’s say that I wasn’t previously monitoring my model, and only found out about its performance degradation because my customer support department noticed a significant uptick in fraudulent transaction complaints. I retrain the model and get it back to acceptable performance. But how can I prevent these issues going forward?

Since I have both predictions and ground truth for this model, I can quickly set up a monitor for the model's performance in WhyLabs Observatory. If ground truth was unavailable, I could still monitor the model inputs and outputs to look for data/concept drift or data quality issues that may be affecting the performance of my model.

This is the Preset Monitors UI. I've already enabled an F1 Score monitor here, but I can configure it.

After signing up for WhyLabs and following the five-minute tutorial to onboard my model, I navigate to the Presets tab within the Monitor Manager UI. I then select the monitors I need from the available presets and enable each one with a single click.

First, I switch on the F1 Score monitor to get alerted if my model has performance issues. This way, I can be proactive about fixing issues with my model, instead of reacting only after customer service surfaces an issue.
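
Under the hood, an F1 monitor boils down to comparing each batch's F1 score against a threshold. Here is a self-contained sketch of that check; the 0.8 threshold and the toy label arrays are invented for illustration.

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def f1_alert(y_true, y_pred, threshold=0.8):
    """True when the batch's F1 falls below the monitor's threshold."""
    return f1_score(y_true, y_pred) < threshold

y_true = [1, 1, 0, 1, 0, 0, 1, 0]  # analyst-confirmed labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # model predictions
print(f1_score(y_true, y_pred))  # 0.75
print(f1_alert(y_true, y_pred))  # True -> fire the notification
```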

In this case, I am interested in getting an email when my model has a performance issue, so I can resolve it immediately. To do that, I configure the monitor by clicking the Configure button next to that monitor’s card and editing the Action dropdown that appears.

I can edit the configuration for a Preset monitor if I want to change certain settings.

Now, if my model’s performance worsens, I will get an email right away, allowing me to dive in and start fixing the issue immediately. I can configure numerous other components of this Preset monitor, such as setting the appropriate threshold and ensuring that I get only the most relevant alerts.

For more details about monitoring this financial fraud model, stay tuned for our upcoming blog post on this topic.

Data Stream Monitoring: an auction house example

All of our customers rely on data to make business decisions, be it through a machine learning model, real-time analytics, customer-facing dashboards, or quarterly business reports. WhyLabs enables monitoring for any data in motion, no matter what decision-making it powers. In this particular example, we will dive into a streaming data use case.

About the data stream: Let's monitor a Project that consists of an Apache Kafka stream of real-time data from a video game auction house.

How it works: The video game provides an API that enables users to track information about transactions in the auction house, including the bid amount, the item type, and the time left in the auction when the bid was made.

Our input data. As you can see, the system has picked up on some anomalies already.

In this example, I’m the game developer and want to track the health of my application by looking at the data it produces. In particular, I want to track data drift for the transaction ID column because significant drifts in this field can indicate that there’s a bug in the application generating these IDs.

The Monitors page. As you can see, we've already configured some monitors on this data, and can configure more with the orange button in the top right.

Since I have a specific type of analysis and threshold in mind, I start by setting up a custom monitor. To begin tracking this data drift, I can click the orange “New custom monitor” button to create a new monitor.

It's easy to set up a new monitor, whether I'm looking to detect data drift, data quality issues, or a whole host of other problems.

Within the custom monitor configuration, I have control over a number of different components of the monitor settings. I can select whether I want to monitor the entire dataset or a specific segment, the sensitivity of the analysis, what to select as a baseline, and more. Check out our monitor documentation to learn more about the options you have when configuring a monitor.

After setting up my custom monitor, I’ll receive both an email and a Slack notification if the transaction ID column has drifted in its distribution. If it has, I can quickly debug whatever problem in my application is causing this distributional change, and thereby ensure the reliability of my data application.
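
Conceptually, the notification step is just a fan-out over the channels configured on the monitor. A toy sketch, where the notifier functions stand in for a real email service and a Slack webhook:

```python
sent = []

def email_notifier(message):
    # Stand-in for a call to a real email-sending service.
    sent.append(("email", message))

def slack_notifier(message):
    # Stand-in for posting to a Slack webhook URL.
    sent.append(("slack", message))

def dispatch_alert(column, drift_score, threshold, notifiers):
    """Notify every configured channel when a monitored column's
    drift score exceeds its threshold; stay quiet otherwise."""
    if drift_score <= threshold:
        return False
    message = f"{column} drift {drift_score:.2f} exceeded threshold {threshold:.2f}"
    for notify in notifiers:
        notify(message)
    return True

channels = [email_notifier, slack_notifier]
dispatch_alert("transaction_id", 0.30, 0.50, channels)  # below threshold: silent
dispatch_alert("transaction_id", 0.65, 0.50, channels)  # both channels fire
print([channel for channel, _ in sent])  # ['email', 'slack']
```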

For more details about monitoring data streams, check out our Kafka integration documentation or look for an upcoming blog post on our newly-refreshed Kafka integration.

Try it for yourself

Whether you're monitoring ML models, data, or both, WhyLabs Observatory is a powerful tool for ensuring that you can trust your data and machine learning systems. Customizable monitors give you fine-grained control over your most demanding monitoring use case, preventing unhelpful alerts. Preset monitors give you the power to configure monitoring for your data and models with a single click, making it easy to monitor models with even thousands of features. WhyLabs is here to ensure that you spend less time setting up your monitoring and more time refining your data and ML applications.

Sign up to try WhyLabs Observatory for free and start monitoring your data and models today!
