Data Drift Monitoring and Its Importance in MLOps

Machine Learning (ML) is now an essential tool in most modern businesses, driving everything from predictive analytics to AI-enhanced applications. However, to keep your models effective, you need to continuously monitor and manage their performance. This practice is known as Machine Learning Operations (MLOps). One crucial aspect of MLOps is managing "data drift." But what is data drift, and why is it so important to monitor it in your MLOps pipeline?

This post covers:

  • What is data drift
  • The consequences of ignoring data drift
  • Data drift monitoring in MLOps
  • Mitigating data drift
  • How to monitor data drift with whylogs
  • Conclusion

What is data drift?

Data drift refers to the change or variation in the input data of your ML model over time. It can occur for a variety of reasons: the data might change naturally with time or with the seasons, the patterns and behaviors of users might evolve, or the business environment itself might shift, altering the data being fed into the model.

Simply put, the model's predictions are only as good as the data it is trained on. If the data that the model is seeing in the production environment starts to drift outside the distribution of the data it was trained on, the model's performance could decrease substantially.

In this blog we’ll focus on covariate drift. This form of data drift occurs when the statistical properties of the input features in production change over time. We’ll cover other types of model drift in future blog posts.

Example of data drift occurring

The consequences of ignoring data drift

Depending on your ML application, ignoring data drift can have serious consequences. The performance of your ML models can decline without your knowledge, leading to inaccurate predictions and suboptimal decisions. This could also lead to a loss of trust in the models or product, making stakeholders and customers reluctant to rely on them.

For example, consider a credit card fraud detection model. The patterns of fraudulent transactions may change over time as fraudsters adapt their strategies. If the model is not adjusted to reflect these changing patterns, the number of false positives and false negatives can increase, potentially resulting in financial loss or even damage to the company's reputation.
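To make the fraud example concrete, here is a minimal sketch using hand-made, synthetic transactions (not real data). It assumes a deliberately simple, hypothetical rule, "flag any amount above a fixed threshold," that fits the training period perfectly, and then counts its errors once the fraud pattern shifts toward smaller amounts.

```python
# Synthetic, hand-made transactions as (amount, is_fraud). Not real data.
TRAINING_PERIOD = [(20, False), (50, False), (120, False), (300, False),
                   (650, True), (700, True), (900, True)]
# Later, fraudsters (hypothetically) switch to smaller amounts.
PRODUCTION_PERIOD = [(20, False), (50, False), (600, False), (300, True),
                     (450, True), (480, True), (700, True)]

AMOUNT_THRESHOLD = 500  # hypothetical rule that fits the training period


def predict_fraud(amount):
    """Flag a transaction as fraud when its amount exceeds the threshold."""
    return amount > AMOUNT_THRESHOLD


def count_errors(transactions):
    """Return (false_positives, false_negatives) for the fixed rule."""
    false_positives = sum(1 for amount, is_fraud in transactions
                          if predict_fraud(amount) and not is_fraud)
    false_negatives = sum(1 for amount, is_fraud in transactions
                          if not predict_fraud(amount) and is_fraud)
    return false_positives, false_negatives


train_errors = count_errors(TRAINING_PERIOD)
production_errors = count_errors(PRODUCTION_PERIOD)
print("training period (FP, FN):", train_errors)
print("production period (FP, FN):", production_errors)
```

The rule itself never changes; only the data does, which is why the extra errors go unnoticed unless something is watching the inputs.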

Data drift monitoring in MLOps

Given the potential consequences, integrating data drift monitoring into your MLOps pipeline is important. Continuous monitoring can help you detect and address any data drift to maintain your ML models' performance and reliability.

To implement data drift detection, you first need to define what constitutes a significant drift for each feature in your model. Then, by continuously comparing the distribution of the training data with that of the data in production, you can detect any significant drifts.

Different statistical measures can be used for the comparison, such as the Kullback-Leibler (KL) divergence or the Kolmogorov-Smirnov (KS) test. Each gives you a number describing how much two data distributions differ, which can be used to trigger alerts if the drift exceeds a certain threshold.

Example of using statistical tests for data drift detection
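As a rough illustration of the two measures mentioned above, the sketch below computes a two-sample KS statistic and a histogram-based KL divergence with only the Python standard library. The sample values, the bin count, the smoothing constant, and the alert threshold are all assumptions chosen for illustration; production tools typically use more careful estimators.

```python
import math
from bisect import bisect_right


def ks_statistic(reference, production):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref_sorted = sorted(reference)
    prod_sorted = sorted(production)
    largest_gap = 0.0
    for value in ref_sorted + prod_sorted:
        ref_cdf = bisect_right(ref_sorted, value) / len(ref_sorted)
        prod_cdf = bisect_right(prod_sorted, value) / len(prod_sorted)
        largest_gap = max(largest_gap, abs(ref_cdf - prod_cdf))
    return largest_gap


def kl_divergence(reference, production, bins=10, smoothing=1e-6):
    """KL(reference || production) over shared equal-width histogram bins."""
    low = min(min(reference), min(production))
    high = max(max(reference), max(production))
    width = (high - low) / bins or 1.0  # fall back to 1.0 if all values are equal

    def bin_probabilities(values):
        counts = [0] * bins
        for value in values:
            index = min(int((value - low) / width), bins - 1)
            counts[index] += 1
        # Additive smoothing keeps empty bins from causing log(0) or division by zero.
        total = len(values) + smoothing * bins
        return [(count + smoothing) / total for count in counts]

    p = bin_probabilities(reference)
    q = bin_probabilities(production)
    return sum(p_i * math.log(p_i / q_i) for p_i, q_i in zip(p, q))


# Made-up feature values for illustration
training = [1.0, 1.2, 1.4, 1.5, 1.7, 1.9, 2.0, 2.1]
similar = [1.1, 1.3, 1.4, 1.6, 1.8, 1.9, 2.0, 2.2]
shifted = [4.0, 4.2, 4.4, 4.5, 4.7, 4.9, 5.0, 5.1]

KS_ALERT_THRESHOLD = 0.5  # hypothetical; tune per feature

for name, production in [("similar", similar), ("shifted", shifted)]:
    ks = ks_statistic(training, production)
    kl = kl_divergence(training, production, bins=4)
    status = "DRIFT ALERT" if ks > KS_ALERT_THRESHOLD else "ok"
    print(f"{name}: ks={ks:.3f} kl={kl:.3f} -> {status}")
```

Note that the KL value depends heavily on the binning and smoothing choices, so it is best compared against a baseline computed the same way rather than read as an absolute number.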

Mitigating data drift

Once data drift is detected, the next step is mitigation. A common approach is to annotate the new data and retrain the model. Before deploying the retrained model to production, you may want to compare its performance against the current model.

A well-structured MLOps pipeline can help automate these steps, minimizing the manual effort required to retrain models and ensuring faster response times by triggering workflows when data drift is detected. At a minimum, ML monitoring should be configured to send an alert when data drift occurs so you can take action.

Example of an ML pipeline with AI Observability
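The mitigation steps described above (alert, retrain, compare, then deploy) can be sketched as a small function. Everything here is hypothetical: `retrain`, `evaluate`, and `notify` stand in for hooks your own pipeline would provide, and the model names and scores are made up.

```python
def handle_drift(drift_detected, current_model, retrain, evaluate, notify):
    """Alert on drift, retrain a candidate, and promote it only if it scores higher.

    `retrain`, `evaluate`, and `notify` are hypothetical hooks supplied by
    your own pipeline; higher `evaluate` scores are assumed to be better.
    """
    if not drift_detected:
        return current_model
    notify("Data drift detected; retraining a candidate model")
    candidate = retrain()
    if evaluate(candidate) > evaluate(current_model):
        notify("Candidate outperformed the current model; promoting it")
        return candidate
    notify("Candidate did not improve on the current model; keeping it")
    return current_model


# Stand-in hooks for illustration only
messages = []
holdout_scores = {"model_v1": 0.81, "model_v2": 0.88}  # made-up scores

deployed = handle_drift(
    drift_detected=True,
    current_model="model_v1",
    retrain=lambda: "model_v2",
    evaluate=lambda model: holdout_scores[model],
    notify=messages.append,
)
print(deployed)
print(messages)
```

Keeping the comparison step inside the workflow means a retrained model is never promoted just because drift fired; it still has to beat the model it replaces.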

How to detect data drift

Fortunately, MLOps is a quickly maturing field and many tools now exist to help make ML pipelines robust and responsible! We’ll take a quick look at how you can use the open-source library whylogs.

Once you install whylogs in any Python environment using `pip`, profiles of your dataset can be created with just a few lines of code! These data profiles contain only summary statistics about your dataset and can be used to monitor for data drift and data quality issues without exposing your raw data.

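import whylogs as why
import pandas as pd

# Profile a pandas DataFrame
df = pd.read_csv("path/to/file.csv")
profile1 = why.log(df)

# Get a profile view, which is what the drift utilities below consume
profile_view1 = profile1.view()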
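To build some intuition for why a profile can be shared without exposing raw data, here is a simplified sketch of the idea using only the standard library. It is not how whylogs computes its profiles, which are considerably richer; the column names and values are made up for illustration.

```python
import statistics


def summarize_column(values):
    """Reduce a column of numbers to a handful of summary statistics."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "stddev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


def summarize_dataset(columns):
    """Summarize every column; the raw rows are not part of the result."""
    return {name: summarize_column(values) for name, values in columns.items()}


# Made-up measurements for illustration
dataset = {
    "petal length (cm)": [1.4, 1.3, 4.7, 5.1],
    "petal width (cm)": [0.2, 0.2, 1.4, 1.8],
}

summary = summarize_dataset(dataset)
for column_name, stats in summary.items():
    print(column_name, stats)
```

Only the summary needs to leave your environment, and two summaries of the same columns can be compared over time.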

Next, we can get a data drift report between profiles using the built-in `NotebookProfileVisualizer`. By default, whylogs uses the KS test to calculate the drift distance between the profiles, but other popular drift metrics can be configured instead.

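# Measure data drift with whylogs
from whylogs.viz import NotebookProfileVisualizer

# profile_view1 and profile_view2 are the profile views of the two datasets
# to compare, each obtained with why.log(df).view()
visualization = NotebookProfileVisualizer()
visualization.set_profiles(target_profile_view=profile_view1, reference_profile_view=profile_view2)

# Render the drift report in the notebook
visualization.summary_drift_report()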

In the example below, we can see that data drift has been detected for the “petal length” feature in the iris dataset and that the drift score has been calculated using the KS test.

Data drift report from whylogs

To get a better visualization of the data drift on an individual feature, we can use `double_histogram` to overlay the histograms of the petal length feature for each profile.

visualization.double_histogram(feature_name="petal length (cm)")

Data drift visualized on the individual feature with whylogs

In this example, we can see that the distributions of the two profiles hardly overlap, indicating a very large distribution drift.
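If you want a number to go with the visual impression of "hardly overlapping," one simple option is a histogram overlap coefficient. The sketch below is a standard-library illustration, not something whylogs computes for you; the sample values and bin count are assumptions.

```python
def histogram_overlap(sample_a, sample_b, bins=10):
    """Overlap in [0, 1]: sum over shared bins of the smaller bin fraction."""
    low = min(min(sample_a), min(sample_b))
    high = max(max(sample_a), max(sample_b))
    width = (high - low) / bins or 1.0  # fall back to 1.0 if all values are equal

    def bin_fractions(values):
        counts = [0] * bins
        for value in values:
            counts[min(int((value - low) / width), bins - 1)] += 1
        return [count / len(values) for count in counts]

    fractions_a = bin_fractions(sample_a)
    fractions_b = bin_fractions(sample_b)
    return sum(min(a, b) for a, b in zip(fractions_a, fractions_b))


# Made-up petal lengths for illustration
reference_lengths = [1.3, 1.4, 1.5, 1.4, 1.6]
target_lengths = [4.5, 4.7, 5.1, 4.9, 5.0]

overlap = histogram_overlap(reference_lengths, target_lengths)
print(f"overlap: {overlap:.2f}")
```

A value near 1 means the two histograms largely coincide, while a value near 0 corresponds to the kind of separation seen in the plot above.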

To return the data drift metrics, use `calculate_drift_scores` from whylogs. This returns a Python dictionary containing the drift algorithm, scores, and thresholds for each feature. Learn more about adjusting these parameters in this example.

from whylogs.viz.drift.column_drift_algorithms import calculate_drift_scores

scores = calculate_drift_scores(target_view=profile_view1, reference_view=profile_view2, with_thresholds=True)


Returned data drift metrics in a Python dictionary.

{'sepal length (cm)': {'algorithm': 'ks',
  'pvalue': 0.2694519362228452,
  'statistic': 0.11333333333333329,
  'thresholds': {'NO_DRIFT': (0.15, 1),
   'POSSIBLE_DRIFT': (0.05, 0.15),
   'DRIFT': (0, 0.05)},
  'drift_category': 'NO_DRIFT'},
 'sepal width (cm)': {'algorithm': 'ks',
  'pvalue': 0.9756502052466759,
  'statistic': 0.05333333333333334,
  'thresholds': {'NO_DRIFT': (0.15, 1),
   'POSSIBLE_DRIFT': (0.05, 0.15),
   'DRIFT': (0, 0.05)},
  'drift_category': 'NO_DRIFT'},
 'petal length (cm)': {'algorithm': 'ks',
  'pvalue': 0.9993989748100714,
  'statistic': 0.04000000000000001,
  'thresholds': {'NO_DRIFT': (0.15, 1),
   'POSSIBLE_DRIFT': (0.05, 0.15),
   'DRIFT': (0, 0.05)},
  'drift_category': 'NO_DRIFT'},
 'petal width (cm)': {'algorithm': 'ks',
  'pvalue': 0.9756502052466759,
  'statistic': 0.053333333333333344,
  'thresholds': {'NO_DRIFT': (0.15, 1),
   'POSSIBLE_DRIFT': (0.05, 0.15),
   'DRIFT': (0, 0.05)},
  'drift_category': 'NO_DRIFT'}}

You can use these values to monitor for data drift between two profiles directly in your Python environment.
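For instance, a small helper can pull out the features whose category calls for attention. This sketch assumes the dictionary shape shown above (a `drift_category` key per feature) and skips entries that are `None`, as a precaution for columns where no score is available. The example dictionary is hand-written with hypothetical values.

```python
def features_needing_attention(scores, alert_on=("DRIFT", "POSSIBLE_DRIFT")):
    """Return {feature: drift_category} for features in an alerting category."""
    flagged = {}
    for feature, result in scores.items():
        if result is None:  # defensively skip columns without a score
            continue
        if result.get("drift_category") in alert_on:
            flagged[feature] = result["drift_category"]
    return flagged


# Hand-written example in the same shape as the output above (hypothetical values)
example_scores = {
    "feature_a": {"algorithm": "ks", "pvalue": 0.27, "drift_category": "NO_DRIFT"},
    "feature_b": {"algorithm": "ks", "pvalue": 0.01, "drift_category": "DRIFT"},
    "feature_c": {"algorithm": "ks", "pvalue": 0.09, "drift_category": "POSSIBLE_DRIFT"},
    "feature_d": None,
}

flagged = features_needing_attention(example_scores)
if flagged:
    print("Drift check needs attention:", flagged)
```

A check like this can run as a step in a scheduled job and fail the step, or send a message, whenever `flagged` is non-empty.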

We can go further in ML monitoring for data drift by using the WhyLabs Observatory. The WhyLabs Observatory makes it easy to store, visualize, and monitor profiles created with whylogs.  

Using the WhyLabs platform to monitor data drift & ML performance

To write profiles to WhyLabs, we’ll create an account, grab our `Org-ID`, `Access token`, and `Project-ID`, and set them as environment variables in our project.

# Set WhyLabs access keys
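A minimal sketch of this step, assuming the environment variable names that the whylogs WhyLabs writer reads (`WHYLABS_DEFAULT_ORG_ID`, `WHYLABS_API_KEY`, and `WHYLABS_DEFAULT_DATASET_ID`). The values below are placeholders to replace with your own; check the WhyLabs documentation for the names your version expects.

```python
import os

# Placeholder values: replace them with the IDs and token from your WhyLabs account.
os.environ["WHYLABS_DEFAULT_ORG_ID"] = "YOUR-ORG-ID"
os.environ["WHYLABS_API_KEY"] = "YOUR-ACCESS-TOKEN"
os.environ["WHYLABS_DEFAULT_DATASET_ID"] = "YOUR-PROJECT-ID"

# Confirm the variables are visible to the current process without printing secrets.
for name in ("WHYLABS_DEFAULT_ORG_ID", "WHYLABS_API_KEY", "WHYLABS_DEFAULT_DATASET_ID"):
    print(name, "is set:", bool(os.environ.get(name)))
```

In a shared project, it is usually safer to set these outside the code (for example, in your shell or a secrets manager) so the access token never lands in version control.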

Once the access keys are set up, we can easily create a profile of our dataset and write it to WhyLabs. This allows us to monitor input data and model predictions with just a few lines of code!

from whylogs.api.writer.whylabs import WhyLabsWriter

# Initialize the WhyLabs writer, create a whylogs profile, and write it to WhyLabs
writer = WhyLabsWriter()
profile = why.log(dataset)
writer.write(file=profile.view())

Data Profiles visualized in WhyLabs

Now we can enable a pre-configured monitor with just one click (or create a custom one) to detect anomalies in our data profiles. This makes it easy to set up common monitoring tasks, such as detecting data drift, data quality issues, and changes in model performance.

Preset monitor configuration for data drift and data quality detection

Once a monitor is configured, it can be previewed while inspecting the feature it's set to monitor.

Data drift detection in WhyLabs

When data drift is detected, notifications can be sent via email or Slack, or a workflow can be triggered using PagerDuty. Set notification preferences in Settings > Global Notification Actions.

Alert and workflow trigger configuration in WhyLabs

That’s it! We have gone through all the steps needed to monitor for data drift in ML pipelines to get notified or trigger a workflow if drift occurs.

If you’d like to follow along with a full example in a notebook, check out the WhyLabs onboarding guide.

Data drift conclusion

As we’ve seen, data drift is a critical consideration in the life cycle of ML models. As the world and the data we collect continually evolve, our models must adapt to stay relevant and reliable. Integrating data drift monitoring into your MLOps pipeline is necessary to ensure the continuous delivery of high-performing ML models.

By understanding, monitoring, and mitigating data drift, you can increase the longevity of your ML models, maximize their value, and keep stakeholders confident in the insights they produce. The ultimate goal is to make your ML systems robust, reliable, and resilient in the face of change, a principle that lies at the core of effective MLOps.
