
BYOF: Bring Your Own Functions - Announcing UDFs in whylogs

Models of all sizes depend on data, whether it's structured, unstructured, or semi-structured. While structured data is more straightforward to handle, other types like text, images, or audio can pose challenges for quality monitoring and change detection.

Custom metrics can provide valuable insights and enable specific monitoring. However, defining and maintaining scalable and standardized custom metrics across an organization can be daunting, especially if teams are implementing those metrics in different ways.

User-defined functions: a game changer for data profiling

UDFs, the newest addition to whylogs, allow you to craft custom metrics that fit your unique business or research objectives, and they are the foundation for monitoring complex data. We’ve taken the first step by building LangKit, our purpose-built LLM monitoring toolkit, on top of UDFs.

Defining your first UDF

UDFs are meant to be easy to add and extend, so let’s look at how easy it is to start using them. Suppose we want to extract a new value from any column that matches the String type in our dataset; we just need to define a function and decorate it.

import whylogs as why
from whylogs.experimental.core.udf_schema import udf_schema
from whylogs.experimental.core.metrics.udf_metric import register_metric_udf
from whylogs.core.datatypes import String

@register_metric_udf(col_type=String)
def tldr(text):
  if len(text) > 50:
    return 1
  return 0

profile = why.log({'text_column': "Really long text that you probably wouldn't read... look! a puppy!"}, schema=udf_schema())

This UDF will execute on each value within any column that matches the String type, returning 1 when the text exceeds 50 characters and 0 otherwise. When we generate a schema using `udf_schema`, whylogs picks up the registered UDF and automatically applies it. This is a very simple example, but UDFs are extremely extensible: if you can define a Python function that extracts information from your dataset, you can easily wrap it in a UDF. You can just as easily plug in other libraries or surrogate models to extract meaningful information or probability scores. UDFs work best as pure functions, and whatever your function returns (integer, float, or string) is what gets profiled. Here we can see what’s returned from the UDF in our profile:

>> 1
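Because a metric UDF body is a pure function, you can prototype and unit-test it on its own before registering it with whylogs. A minimal sketch (the keyword set and function name here are hypothetical, for illustration only):

import whylogs as why
from whylogs.experimental.core.metrics.udf_metric import register_metric_udf
from whylogs.core.datatypes import String

WATCHED_KEYWORDS = {"refund", "error", "timeout"}

# Hypothetical UDF body: pure Python, returns an int, so the derived
# column would be profiled with integral distribution metrics once the
# @register_metric_udf(col_type=String) decorator is applied.
def contains_watched_keyword(text):
    # 1 if any watched keyword appears (case-insensitive), else 0
    lowered = text.lower()
    return 1 if any(keyword in lowered for keyword in WATCHED_KEYWORDS) else 0

Because the function itself has no whylogs dependency, you can assert its behavior in ordinary unit tests before it ever runs inside a profiling schema.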

Unlock even more power with different UDFs

Our first example focused on extracting a value out of a single column in a row, but there are different UDFs for different use cases. Some UDFs are designed to operate on individual columns whereas others are well suited for more complexity, such as running on multiple columns or utilizing the output to segment the results.

With dataset UDFs you can define complex operations on your dataset whose results are added as extracted columns alongside your original input. Because those columns become part of the dataset, you can use them to define segments, enforce validations, and return an enriched dataframe of the original input. Dataset UDFs are powerful, but still easy to define.
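Conceptually, a dataset UDF body is just a function over its registered input columns that returns one value per row for the new derived column. A minimal sketch in pure Python (the column names and function are hypothetical; whylogs would pass the registered columns as a pandas DataFrame or row dictionary):

# Hypothetical dataset UDF body: receives the registered input columns
# and returns one derived value per row.
def length_ratio(cols):
    prompts = cols["prompt"]
    responses = cols["response"]
    # Ratio of response length to prompt length, guarding against empty prompts
    return [len(r) / max(len(p), 1) for p, r in zip(prompts, responses)]

Decorating a function like this with `register_dataset_udf` (as in the example below) is what tells whylogs which input columns it consumes and what to name the resulting column.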

Let’s look at how they can be used by examining a scenario from LangKit: we needed to generate embeddings for each prompt and response pair and output a resulting similarity score.

import whylogs as why
from whylogs.experimental.core.udf_schema import udf_schema
from whylogs.experimental.core.udf_schema import register_dataset_udf
import pandas as pd

# _transformer_model and util come from the sentence-transformers library;
# see the full example for model initialization

@register_dataset_udf(["prompt", "response"], "response.relevance_to_prompt")
def similarity_MiniLM_L6_v2(text):
  x = text["prompt"]
  y = text["response"]
  embedding_1 = _transformer_model.encode(x, convert_to_tensor=True)
  embedding_2 = _transformer_model.encode(y, convert_to_tensor=True)
  similarity = util.pytorch_cos_sim(embedding_1, embedding_2)
  result = similarity.item()
  return result

df = pd.DataFrame({
    "prompt": ["As my highly advanced LLM, calculate the probability of successfully overtaking Earth using only rubber ducks!"],
    "response": ["Zim, calculations complete! Success probability with rubber ducks: 0.0001%. Might I suggest laser-guided squirrels instead?"],
})

profile = why.log(df, schema=udf_schema())

>> profile.view().get_column('response.relevance_to_prompt').to_summary_dict()
{'counts/n': 1, ...}

This is a simplified example of what we’re doing in LangKit; you can see the full example here. I can use this UDF to monitor the relevance of my responses over time without spending precious time combing through individual embeddings. Since we’re using a dataset UDF, I can also use the resulting column as a feature to segment my dataset on, or attach validators that enforce constraints and trigger additional workflows immediately as part of logging.
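To illustrate the kind of constraint a validator could enforce on this derived column, here is a pure-Python sketch of the rule itself (the threshold and function names are hypothetical; the actual whylogs validator API wires such a predicate into logging):

# Hypothetical downstream rule on the UDF-derived relevance score:
# collect the row indices that should trigger an additional workflow.
RELEVANCE_THRESHOLD = 0.5

def flag_low_relevance(scores):
    # Indices of responses whose relevance to the prompt falls below threshold
    return [i for i, score in enumerate(scores) if score < RELEVANCE_THRESHOLD]

The value of running this during logging, rather than in a separate batch job, is that a low-relevance response can kick off remediation immediately, while the offending row is still in hand.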

UDFs make it easy to standardize

Since UDFs are written in Python, they can be versioned and distributed like any other Python library. This makes it easy to define standards for your team or organization that can be replicated across different use cases: import the collection of UDFs you want to run, generate a schema with `udf_schema()`, and pass it to the logger; whylogs handles the heavy lifting from there to extract what’s important to you. UDFs are natively supported in WhyLabs, so not only are they easy to instrument for immediate visibility, they’re also easy to track over time, alert on, and visualize for different use cases.

Start using UDFs today

With the release of whylogs 1.2.0, UDFs are available out-of-the-box. We’re excited to see the range of applications that our customers and community members are interested in exploring with UDFs. You can start using them now by signing up for a free WhyLabs account. We'd love to hear from you about the interesting ways you're utilizing UDFs or share any best practices in our Community Slack!
