BYOF: Bring Your Own Functions - Announcing UDFs in whylogs
- Whylogs
- Open Source
- Product Updates
Jun 30, 2023
Models of all sizes depend on data, whether it's structured, unstructured, or semi-structured. While structured data is more straightforward to handle, other types like text, images, or audio can pose challenges for quality monitoring and change detection.
Custom metrics can provide valuable insights and enable specific monitoring. However, defining and maintaining scalable and standardized custom metrics across an organization can be daunting, especially if teams are implementing those metrics in different ways.
User-defined functions: a game changer for data profiling
UDFs, the newest addition to whylogs, allow you to craft custom metrics that fit your unique business or research objectives. UDFs are the foundation for monitoring complex data, and we’ve taken the first step ourselves by building LangKit, our purpose-built LLM monitoring toolkit, on top of them.
Defining your first UDF
UDFs are meant to be easy to add and extend, so let’s look at how quickly you can start using them. Suppose we want to extract a new value from every column in our dataset that matches the String type: we just define a function and decorate it.
import whylogs as why
from whylogs.experimental.core.udf_schema import udf_schema
from whylogs.experimental.core.metrics.udf_metric import register_metric_udf
from whylogs.core.datatypes import String

# Register a metric UDF that runs on every column resolved to the String type.
@register_metric_udf(col_type=String)
def tldr(text):
    # Flag values that are too long to read: 1 if longer than 50 characters, else 0.
    if len(text) > 50:
        return 1
    else:
        return 0

profile = why.log({"text_column": "Really long text that you probably wouldn't read... look! a puppy!"}, schema=udf_schema())
This UDF will execute on each value within a column that matches the String type, returning 1 when the text is longer than 50 characters and 0 otherwise. When we generate a schema using `udf_schema()`, whylogs picks up the registered UDF and applies it automatically. This is a very simple example, but UDFs are extremely extensible: if you can define a Python function that extracts the information you want from your dataset, you can wrap it in a UDF. You can just as easily plug in other libraries or surrogate models to extract meaningful information or probability scores. UDFs work best as pure functions, and whatever your function returns (integer, float, or string) is what gets profiled. Here we can see what’s returned from the UDF in our profile:
profile.view().get_column('text_column').to_summary_dict()['udf/tldr:distribution/max']
>> 1
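As a minimal sketch of plugging in your own scoring logic, the hypothetical `digit_ratio` function below (not part of whylogs) stands in for a library call or surrogate model that returns a numeric score. It registers exactly like `tldr` did:

from whylogs.core.datatypes import String
from whylogs.experimental.core.metrics.udf_metric import register_metric_udf

@register_metric_udf(col_type=String)
def digit_ratio(text):
    # Hypothetical score standing in for a library call or surrogate model:
    # the fraction of characters in the value that are digits.
    if not text:
        return 0.0
    return sum(ch.isdigit() for ch in text) / len(text)

Because the function returns a float, logging with `schema=udf_schema()` profiles its distribution under `udf/digit_ratio`, just like `udf/tldr` above.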
Unlock even more power with different UDFs
Our first example focused on extracting a value from a single column, but there are different kinds of UDFs for different use cases. Some operate on individual columns, while others handle more complex cases such as reading multiple columns at once or producing outputs you can segment on.
With dataset UDFs you can define operations across your dataset that result in new columns added to your original input. Because they produce a concrete column, you can use dataset UDFs to define segments, enforce validations, and return an enriched dataframe of the original input. They’re powerful, but still easy to define.
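As a minimal sketch (the column names and margin calculation are hypothetical, not from LangKit), a dataset UDF that reads two columns and emits a derived column looks like this:

import pandas as pd
import whylogs as why
from whylogs.experimental.core.udf_schema import udf_schema, register_dataset_udf

# Register a dataset UDF over two input columns; its output becomes a new
# "margin" column that is profiled alongside the originals.
@register_dataset_udf(["revenue", "cost"], "margin")
def margin(batch):
    # batch holds the registered columns for the logged data (e.g. a pandas
    # DataFrame), so the result is computed per row via vectorized arithmetic.
    return (batch["revenue"] - batch["cost"]) / batch["revenue"]

df = pd.DataFrame({"revenue": [120.0, 80.0], "cost": [90.0, 70.0]})
profile = why.log(df, schema=udf_schema())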
Let’s look at how they can be used by examining a scenario from LangKit: we needed to generate embeddings for each prompt and response pair and then output a similarity score between them.
import whylogs as why
from whylogs.experimental.core.udf_schema import udf_schema
from whylogs.experimental.core.udf_schema import register_dataset_udf
import pandas as pd

# See the full example for initializing _transformer_model and importing util
# (both come from sentence-transformers).

# Register a dataset UDF over the "prompt" and "response" columns; its result
# becomes a new "response.relevance_to_prompt" column.
@register_dataset_udf(["prompt", "response"], "response.relevance_to_prompt")
def similarity_MiniLM_L6_v2(text):
    x = text["prompt"]
    y = text["response"]
    # Embed the prompt and response, then score their cosine similarity.
    embedding_1 = _transformer_model.encode(x, convert_to_tensor=True)
    embedding_2 = _transformer_model.encode(y, convert_to_tensor=True)
    similarity = util.pytorch_cos_sim(embedding_1, embedding_2)
    return similarity.item()
df = pd.DataFrame({
    "prompt": [
        "As my highly advanced LLM, calculate the probability of successfully overtaking Earth using only rubber ducks!"
    ],
    "response": [
        "Zim, calculations complete! Success probability with rubber ducks: 0.0001%. Might I suggest laser-guided squirrels instead?"
    ],
})

profile = why.log(df, schema=udf_schema())
profile.view().get_column('response.relevance_to_prompt').to_summary_dict()
>> {'counts/n': 1,
...
This is a simplified example of what we’re doing in LangKit; you can see the full example here. I can use this UDF to monitor the relevance of my responses over time without spending precious time combing through individual embeddings. And since this is a dataset UDF, I can also treat its output as a feature to segment my dataset on, or attach validators that enforce constraints and trigger additional workflows immediately as part of logging.
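Segmenting on a UDF-derived column might look something like the sketch below. This is a minimal sketch rather than a drop-in snippet: it assumes your whylogs version’s `udf_schema()` accepts a `segments` argument the way `DatasetSchema` does, so check the whylogs segmentation documentation for the exact signature.

import whylogs as why
from whylogs.core.segmentation_partition import segment_on_column
from whylogs.experimental.core.udf_schema import udf_schema

# Assumption: udf_schema() forwards `segments` to the underlying DatasetSchema.
# In practice you would usually segment on a discretized bucket rather than a
# raw float score, so each segment covers more than a single value.
schema = udf_schema(segments=segment_on_column("response.relevance_to_prompt"))
results = why.log(df, schema=schema)  # df is the prompt/response frame from above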
UDFs make it easy to standardize
Since UDFs are written in Python, they can be versioned and distributed like any other Python library. This makes it easy to define standards for your team or organization in a way that can be replicated across different use cases: import the collection of UDFs you want to run, generate a schema using `udf_schema()`, and pass it to the logger. whylogs takes care of the heavy lifting from there to extract what’s important to you. UDFs are also natively supported in WhyLabs, so they’re not only easy to instrument for immediate visibility, but also easy to track over time, alert on, and visualize for different use cases.
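As a sketch of what that can look like (the module name and its contents are hypothetical), a team might keep its registered UDFs in a small shared package and import it wherever profiling happens. Registration occurs at import time, so `udf_schema()` picks the metrics up automatically:

# my_team_udfs.py -- hypothetical shared module, versioned and released
# like any other internal Python package.
from whylogs.core.datatypes import String
from whylogs.experimental.core.metrics.udf_metric import register_metric_udf

@register_metric_udf(col_type=String)
def tldr(text):
    # Same metric as earlier in this post, now shared across teams.
    if len(text) > 50:
        return 1
    return 0

# Any pipeline that wants the standard metrics imports the module
# (for its registration side effects) and builds the schema as usual.
import my_team_udfs  # noqa: F401
import whylogs as why
from whylogs.experimental.core.udf_schema import udf_schema

profile = why.log({"text_column": "short text"}, schema=udf_schema())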
Start using UDFs today
With the release of whylogs 1.2.0, UDFs are available out of the box. We’re excited to see the range of applications that our customers and community members explore with UDFs. You can start using them now by signing up for a free WhyLabs account. We’d love to hear about the interesting ways you’re using UDFs, and to swap best practices in our Community Slack!