
WhyLabs Admin

Jan 5, 2022


Deploy your ML model with UbiOps and monitor it with WhyLabs

Integrations
ML Monitoring


Introduction

Machine learning models only provide value for a business once they leave the sandbox and run in the real world. However, this is easier said than done. Many enterprises underestimate the difficulty of productionizing their models and struggle to deploy them, resulting in wasted resources and broken promises about the value of machine learning. Fortunately, UbiOps and WhyLabs have partnered to make deploying and monitoring machine learning models easy.

UbiOps is an easy-to-use serving and hosting layer for data science code. It provides a deployment, serving, and management layer on top of your preferred infrastructure. Accessible via the UI, client library, or CLI, it’s suitable for every type of data scientist, without the need for in-depth engineering knowledge. You only have to bring your Python- or R-based algorithms, and UbiOps takes care of the rest.

WhyLabs provides the missing piece of the puzzle for monitoring and observing ML deployments. With the WhyLabs AI Observability Platform, it’s never been easier to ensure model and data health. Data science teams use the platform to monitor data pipelines and AI applications, surfacing data quality issues, data bias, and concept drift. Out-of-the-box anomaly detection and purpose-built visualizations let WhyLabs’ users prevent costly model failures and eliminate the need for manual troubleshooting. It works on any data, structured or unstructured, at any scale, on any platform.

To showcase the integration, we will train a model to predict the price of a used car based on a number of factors (including horsepower, year, and mileage), deploy it with UbiOps, and then monitor it “in production” with WhyLabs. We use a simplified version of this Kaggle dataset, from which we have removed less relevant features and cut down the number of rows. We split our dataset into “training”, “testing”, and “production” dataframes, and add some perturbations to the production dataset to highlight the impact of differences between sandbox data and real-world data. You can run all of this code yourself in this Jupyter notebook, or simply follow along in this post.
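To make the split-and-perturb step concrete, here is a minimal sketch. It is not the notebook's code: the split fractions, the toy rows, and the choice to scale mileage by 1.5 are all assumptions made for illustration, and rows are plain dictionaries rather than pandas dataframes to keep the example dependency-free.

```python
import random


def split_rows(rows, train_frac=0.6, test_frac=0.2, seed=0):
    """Shuffle rows and split them into training, testing, and production lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_test = int(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    production = shuffled[n_train + n_test:]  # whatever is left over
    return train, test, production


def perturb(rows, column, scale):
    """Return copies of the rows with one numeric column multiplied by scale."""
    return [{**row, column: row[column] * scale} for row in rows]


# Hypothetical toy rows standing in for the used-car dataset
rows = [
    {"year": 2000 + i, "mileage": 10_000.0 * (i + 1), "horsepower": 90 + i}
    for i in range(10)
]

train, test, production = split_rows(rows)
# Assumed perturbation: production cars report 50% more mileage than sandbox cars
production = perturb(production, "mileage", 1.5)
```

Because only the production rows are perturbed, any shift that shows up later in monitoring can be traced back to this step.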

Preparing our environment

We advise you to go through the UbiOps quickstart and WhyLabs quickstart before continuing.

Before we get into the fun of training, deploying, and monitoring our model, we need to set up an environment for these activities. We can do so by installing the required dependencies (pandas, scikit-learn, ubiops, and whylogs) using pip.

import sys
!{sys.executable} -m pip install -U pip
!{sys.executable} -m pip install pandas --user
!{sys.executable} -m pip install scikit-learn --user
!{sys.executable} -m pip install ubiops --user
# The Session-based logging API used in this post belongs to whylogs v0.x
!{sys.executable} -m pip install "whylogs<1.0" --user

We also need to set certain environment variables to interact with the UbiOps and WhyLabs platforms, including API tokens or keys for both, a project name for UbiOps, and an org and dataset ID for WhyLabs.

import os

# Set WhyLabs config variables
WHYLABS_API_KEY = "whylabs.apikey"
WHYLABS_DEFAULT_ORG_ID = "org-1"
WHYLABS_DEFAULT_DATASET_ID = "model-1"


# Set UbiOps config variables
API_TOKEN = "Token ubiopsapitoken" # Make sure this is in the format "Token token-code"
PROJECT_NAME = "blog-post"

# Set environment variables
os.environ["WHYLABS_API_KEY"] = WHYLABS_API_KEY
os.environ["WHYLABS_DEFAULT_ORG_ID"] = WHYLABS_DEFAULT_ORG_ID
os.environ["WHYLABS_DEFAULT_DATASET_ID"] = WHYLABS_DEFAULT_DATASET_ID
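Since a missing variable or a malformed token only surfaces later as a failed API call, it can help to check the configuration up front. The sketch below is our own illustration, not part of either client library: the helper names are hypothetical, and the token check only verifies the "Token token-code" shape mentioned in the comment above without contacting UbiOps.

```python
REQUIRED_WHYLABS_VARS = (
    "WHYLABS_API_KEY",
    "WHYLABS_DEFAULT_ORG_ID",
    "WHYLABS_DEFAULT_DATASET_ID",
)


def missing_whylabs_vars(environ):
    """Return the names of required WhyLabs variables that are unset or empty."""
    return [name for name in REQUIRED_WHYLABS_VARS if not environ.get(name)]


def is_valid_ubiops_token(token):
    """Check the 'Token <token-code>' shape only; the token itself is not verified."""
    prefix, _, code = token.partition(" ")
    return prefix == "Token" and bool(code.strip())


# Hypothetical incomplete configuration, used only to show the behaviour
example_env = {"WHYLABS_API_KEY": "whylabs.apikey", "WHYLABS_DEFAULT_ORG_ID": "org-1"}

print(missing_whylabs_vars(example_env))              # ['WHYLABS_DEFAULT_DATASET_ID']
print(is_valid_ubiops_token("Token ubiopsapitoken"))  # True
print(is_valid_ubiops_token("ubiopsapitoken"))        # False
```

In the notebook, the same check can be run against the real environment by passing os.environ instead of the example dictionary.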

Once we’ve done all of this, we are ready to train our model.

Training the model

We start by training a simple linear regression model from the scikit-learn library on our dataset. Alongside the model training, we also generate data logs in the form of whylogs profiles and send those profiles to the WhyLabs platform. Doing so allows us to create a “baseline” of the data used to train the model, which we can compare against the data in production to ensure that our model’s performance doesn’t degrade.

We do so by introducing the following code snippet alongside our model training code:

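import datetime
from whylogs.app import Session  # import paths assume whylogs v0.x
from whylogs.app.writers import WhyLabsWriter

yesterday = datetime.datetime.now() - datetime.timedelta(days=1)  # assumed definition, not shown in the original snippet
writer = WhyLabsWriter("", formats=[])
session = Session(project="demo-project", pipeline="pipeline-id", writers=[writer])
with session.logger(dataset_timestamp=yesterday) as ylog:
    ylog.log_dataframe(data)  # data is the training dataframe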

Deploying with UbiOps

Now that we’ve got a trained model, we can deploy it with UbiOps. The key difference between this deployment and most of the deployments documented elsewhere is the WhyLabs integration. As explained earlier, WhyLabs configures itself using environment variables. In UbiOps we can also easily create environment variables and make them available in our deployments.

The cool thing is that we can also create them using our client library! So by reusing the WhyLabs variables we created earlier, we can do something like this:

# Create environment variables for WhyLabs
whylabs_env_vars = {
    "WHYLABS_API_KEY": WHYLABS_API_KEY,
    "WHYLABS_DEFAULT_ORG_ID": WHYLABS_DEFAULT_ORG_ID,
    "WHYLABS_DEFAULT_DATASET_ID": WHYLABS_DEFAULT_DATASET_ID,
}

for name, value in whylabs_env_vars.items():
    api.deployment_environment_variables_create(
        project_name=PROJECT_NAME,
        deployment_name=DEPLOYMENT_NAME,
        data=ubiops.EnvironmentVariableCreate(
            name=name,
            value=value,
            secret=True
        )
    )

This way, the WhyLabs session can be initialized easily, for example by putting something like this in your deployment’s __init__():

from whylogs import get_or_create_session  # whylogs v0.x

self.wl_session = get_or_create_session()
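To show where that line would live, here is a rough skeleton of a deployment class. It follows the usual UbiOps layout of a Deployment class with __init__ and request methods, but everything inside the methods is an assumption for illustration: the whylogs call is left as a comment so the sketch runs without extra packages, and the request counter is a stand-in for real prediction and logging code.

```python
import os


class Deployment:
    """Skeleton in the UbiOps deployment.py layout; the method bodies are illustrative."""

    def __init__(self, base_directory, context):
        # Runs once when the deployment instance starts. The variables created
        # through the client library are visible here as ordinary environment variables.
        self.org_id = os.environ["WHYLABS_DEFAULT_ORG_ID"]
        self.dataset_id = os.environ["WHYLABS_DEFAULT_DATASET_ID"]
        # In the real deployment, the whylogs session would be created here:
        # self.wl_session = get_or_create_session()
        self.handled_requests = 0

    def request(self, data):
        # Runs for every request. A real deployment would compute predictions
        # and log inputs and outputs with self.wl_session at this point.
        self.handled_requests += 1
        return {"handled_requests": self.handled_requests}


# Local dry run with placeholder values; on UbiOps the platform sets these
os.environ.setdefault("WHYLABS_DEFAULT_ORG_ID", "org-1")
os.environ.setdefault("WHYLABS_DEFAULT_DATASET_ID", "model-1")

deployment = Deployment(base_directory=".", context={})
result = deployment.request({"mileage": 42000})
```

Creating the session in __init__ rather than in request means it is set up once per deployment instance instead of once per call.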

Monitoring with WhyLabs

Now that we’ve got our model deployed and are sending data to it, we can also log that same data as well as the predictions being made, and send those logs over to WhyLabs. As above, creating whylogs profiles and sending them to WhyLabs requires a short code snippet:

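import pandas as pd

# Load the production features and the predictions returned by the deployment
X = pd.read_csv('production_used_cars_data.csv')
Y = pd.read_csv('prediction.csv')
# Copy so that adding the prediction column does not modify X
combined = X.copy()
combined['price'] = Y['target']
# Profile features and predictions together and send the profile to WhyLabs
with session.logger() as ylog:
    ylog.log_dataframe(combined)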

Once we’ve sent these profiles over to WhyLabs, we can compare them to the profile generated for the training data and see whether there’s training-serving skew that might be impacting our model performance.
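To give a feel for what such a comparison involves, here is a deliberately simple sketch. It is not how WhyLabs detects drift: the mean-shift rule, the threshold of two standard deviations, and the toy numbers are all assumptions chosen to illustrate comparing a baseline profile with a production profile.

```python
from statistics import mean, stdev


def profile(rows, columns):
    """Summarize each numeric column by its mean and sample standard deviation."""
    summary = {}
    for column in columns:
        values = [row[column] for row in rows]
        summary[column] = (mean(values), stdev(values))
    return summary


def drifted_columns(baseline, current, threshold=2.0):
    """Flag columns whose mean moved more than `threshold` baseline standard deviations."""
    flagged = []
    for column, (base_mean, base_std) in baseline.items():
        current_mean, _ = current[column]
        if base_std > 0 and abs(current_mean - base_mean) / base_std > threshold:
            flagged.append(column)
    return flagged


# Hypothetical toy data: production mileage is much higher, year is unchanged
training = [
    {"mileage": m, "year": y}
    for m, y in [(30000, 2015), (45000, 2013), (60000, 2011), (52000, 2012)]
]
production = [
    {"mileage": m, "year": y}
    for m, y in [(150000, 2014), (170000, 2012), (160000, 2013), (155000, 2012)]
]

columns = ["mileage", "year"]
flagged = drifted_columns(profile(training, columns), profile(production, columns))
print(flagged)  # ['mileage']
```

A production monitoring platform compares full distributions rather than just means, but the principle is the same: the training profile serves as the reference that production profiles are measured against.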

WhyLabs allows you to track changes in your data over time, as well as set up alerts that notify you as soon as the data coming into your model or the model’s predictions change. These features allow you to monitor your ML model in production, all with a collaborative, user-friendly UI.

Conclusion

With UbiOps and WhyLabs, deploying and monitoring your model has never been easier. In this example, we’ve shown how you can log your training data with whylogs while the model is being trained, and then compare those data logs to data logs generated when the model is deployed in UbiOps.

For a complete technical rundown of an integration example, take a look at the GitHub repo.

If you are interested in deploying your data science code, check out UbiOps. You can try it today for free! For more technical information, you can also visit the UbiOps documentation: www.ubiops.com/docs/.

If you are interested in trying out the WhyLabs platform, check out our totally free Starter edition.
