Data and ML Monitoring is Easier with whylogs v1.1
- Whylogs
- ML Monitoring
- Open Source
- Product Updates
Sep 28, 2022
whylogs v1.1 is out with new features that make data and ML monitoring easier than ever
The release brings many features to the whylogs data logging API, making it even easier to monitor your data and ML models!
whylogs is the open-source standard for data logging, allowing you to create statistical profiles of datasets to monitor for data quality, data drift, model drift, and more in Python or Java environments. Learn more about whylogs on GitHub.
Profiles generated with whylogs can also be used with WhyLabs Observatory to easily configure a customizable monitoring experience. Learn more about the WhyLabs Observatory here.
What's new with whylogs v1.1?
If you’re a longtime whylogs user, you may notice some of these features were already available in whylogs v0, and now they’re all available in the simplified v1 API.
New features in whylogs v1.1:
- Segments: Gain visibility within a sub-group of data
- Log image data: Monitor data for computer vision models
- Log rotation: Monitor continuous data streams
- Conditional count metrics: Detect specific values in datasets
- String tracking: Monitor string data for NLP
- Model performance: Track and monitor model performance in WhyLabs
Keep reading to learn more.
Monitor subgroups of data with segments
Specific subgroups of data can behave differently from the overall dataset. When monitoring the health of a dataset, it can be helpful to have visibility at a subgroup level to better understand how these subgroups contribute to trends in the overall dataset. This can be crucial for detecting dataset bias and fairness. whylogs v1.1 supports data segmentation for this purpose.
Segmentation in whylogs can be done by a single feature or by multiple features simultaneously.
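The idea of segmenting on one or several features can be sketched in plain Python. This is an illustration of the concept only, not the whylogs API: the function `profile_by_segment`, the column names, and the sample rows are all hypothetical, and a real whylogs profile tracks far more than a count and a mean.

```python
from collections import defaultdict
from statistics import mean


def profile_by_segment(rows, segment_keys, value_column):
    """Group rows by the values of segment_keys and summarize value_column per group."""
    groups = defaultdict(list)
    for row in rows:
        # One segment per distinct combination of key values
        key = tuple(row[k] for k in segment_keys)
        groups[key].append(row[value_column])
    return {
        key: {"count": len(values), "mean": mean(values)}
        for key, values in groups.items()
    }


# Hypothetical sample data
rows = [
    {"category": "Baby Care", "rating": 4, "price": 10.0},
    {"category": "Baby Care", "rating": 5, "price": 14.0},
    {"category": "Toys", "rating": 4, "price": 30.0},
]

# Segment on a single feature
by_category = profile_by_segment(rows, ["category"], "price")
# Segment on multiple features simultaneously
by_category_and_rating = profile_by_segment(rows, ["category", "rating"], "price")
```

Segmenting on more features produces more, smaller segments, so each per-segment summary is based on fewer records.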
Segmented profiles can also be uploaded to WhyLabs, where each segment will appear in the “Segments” section of the model dashboard within a particular project.
Learn more about monitoring subgroups of data with segments in whylogs here.
Monitor Computer Vision data with image logging
In addition to tabular and textual data, whylogs can generate profiles of image data. It computes a number of image metrics that can be used to detect data drift and quality issues, such as low lighting levels.
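# Assumes img1 and img2 are PIL.Image objects loaded elsewhere
from whylogs.extras.image_metric import log_image
results = log_image([img1, img2])
print(results.view().get_column("image_1").to_summary_dict())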
Image metrics tracked in whylogs:
- Brightness (mean, standard deviation)
- Hue (mean, standard deviation)
- Saturation (mean, standard deviation)
- Image Pixel Height & Width
- Colorspace (e.g. RGB, HSV)
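To make the metrics above concrete, here is a minimal sketch of how per-image hue, saturation, and brightness statistics could be derived from raw pixels using only the standard library. It is not how whylogs computes them internally: the function `image_summary` is hypothetical, and treating brightness as the HSV value channel is an assumption made for this example.

```python
import colorsys
from statistics import mean, pstdev


def image_summary(pixels, width, height):
    """Summarize an image given as a list of (r, g, b) tuples with values 0-255."""
    # Convert every pixel to HSV, with each channel scaled to 0.0-1.0
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    summary = {"width": width, "height": height}
    # zip(*hsv) regroups the pixels into one sequence per channel
    for name, channel in zip(("hue", "saturation", "brightness"), zip(*hsv)):
        summary[f"{name}_mean"] = mean(channel)
        summary[f"{name}_stddev"] = pstdev(channel)
    return summary


# Two hypothetical 2x2 grayscale images: one dark, one bright
dark = image_summary([(10, 10, 10)] * 4, 2, 2)
bright = image_summary([(250, 250, 250)] * 4, 2, 2)
```

Comparing `dark["brightness_mean"]` with `bright["brightness_mean"]` shows how a drop in mean brightness could flag a low-lighting issue.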
To learn more about logging image data with whylogs, check out our documentation and stay tuned for an upcoming blog post about it!
Log rotation (rolling logs) for continuous data streams
Logging continuous streams of data can be challenging. With log rotation in whylogs, you can ingest data at the rate it is generated, without added delay or memory constraints.
Instead of planning how to batch your logging intervals yourself, you can let whylogs handle it. The logger creates a session and logs data at the requested interval (seconds, minutes, hours, or days). At the end of each interval, it writes your profile to a .bin file and flushes the log, ready to receive more data.
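The rotate-and-flush behavior described above can be sketched as a toy class. This is a conceptual illustration, not the whylogs logger: `RollingProfiler` is a hypothetical name, the "profile" is reduced to a count and a mean, timestamps are passed in explicitly rather than read from a clock, and the write step is a callback instead of a .bin file.

```python
class RollingProfiler:
    """Toy rolling logger: accumulates a summary and flushes it once per interval."""

    def __init__(self, interval_seconds, write):
        self.interval = interval_seconds
        self.write = write  # callback that receives each finished profile
        self.window_start = None
        self._reset()

    def _reset(self):
        self.count = 0
        self.total = 0.0

    def log(self, value, timestamp):
        if self.window_start is None:
            self.window_start = timestamp
        elif timestamp - self.window_start >= self.interval:
            # The interval has elapsed: write out the profile and start a new window
            self.flush()
            self.window_start = timestamp
        self.count += 1
        self.total += value

    def flush(self):
        if self.count:
            self.write({
                "start": self.window_start,
                "count": self.count,
                "mean": self.total / self.count,
            })
        self._reset()


written = []
profiler = RollingProfiler(interval_seconds=60, write=written.append)
for timestamp, value in [(0, 1.0), (30, 3.0), (61, 10.0)]:
    profiler.log(value, timestamp)
profiler.flush()  # write the final, partially filled window
```

Because only a running summary is kept between flushes, memory use stays constant regardless of how many records arrive in an interval.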
Learn more about log rotation to monitor data streams here.
Conditional count metrics
By default, whylogs tracks several metrics, such as type counts, distribution metrics, cardinality, and frequent items. While these metrics are helpful for many use cases, such as monitoring data drift, sometimes custom metrics are needed to monitor an application properly.
Condition count metrics let users define custom conditions and return the number of times each condition was met for a given column. This feature is useful for detecting personally identifiable information (PII) or checking whether specific numerical values appear in a dataset.
Users can create condition count metrics with regex for string matching, conditionals for numerical values, or a custom function for any given condition.
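As a rough illustration of the three kinds of conditions, the sketch below counts how often each one holds for the values of a single column. It uses plain Python rather than the whylogs API: `condition_counts`, the condition names, the email regex, and the sample values are all hypothetical.

```python
import re

# Simplified email pattern, for illustration only
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

conditions = {
    # Regex-based string matching
    "contains_email": lambda v: isinstance(v, str) and EMAIL.search(v) is not None,
    # Conditional on a numerical value
    "above_100": lambda v: isinstance(v, (int, float)) and v > 100,
    # Arbitrary custom function
    "is_empty_string": lambda v: v == "",
}


def condition_counts(values, conditions):
    """Return how many values satisfy each named condition, plus the total seen."""
    counts = {name: 0 for name in conditions}
    counts["total"] = 0
    for value in values:
        counts["total"] += 1
        for name, predicate in conditions.items():
            if predicate(value):
                counts[name] += 1
    return counts


column = ["reach me at a@b.com", "no contact info", 250, 42, ""]
counts = condition_counts(column, conditions)
```

Keeping the total alongside each count makes it possible to monitor the fraction of matching records, not just the raw number.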
Condition Validators can be used with these metrics to trigger actions.
Learn more about using condition count metrics in whylogs:
Basic string tracking
String tracking allows users to use whylogs to perform essential text monitoring functions on datasets. By default, columns of type str have the following metrics when logged with whylogs:
- Counts
- Types
- Frequent Items/Frequent Strings
- Cardinality
Tracking further metrics for strings can be done by counting the number of characters that fall in a given unicode range for each string record, and then generating distribution metrics, such as mean, stddev and quantile values based on these counts. In addition to specific unicode ranges, whylogs can follow the same approach, but for the overall string length.
Example uses include detecting whether a communication style is changing, whether different languages appear, and how many emojis are used.
The example below tracks two specific ranges of characters:
- ASCII Digits (unicode range 48-57)
- Latin alphabet (unicode range 97-122)
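The counting approach for these two ranges can be sketched in plain Python as follows. This is a conceptual illustration rather than the whylogs configuration itself: `unicode_range_profile` and the range names are hypothetical, and only the mean and standard deviation are computed. Note that the range 97-122 covers lowercase letters a-z only.

```python
from statistics import mean, pstdev

# Inclusive code point ranges, as listed above
RANGES = {"digits": (48, 57), "lowercase_latin": (97, 122)}


def unicode_range_profile(strings, ranges):
    """Count characters per range in each string, then summarize the counts."""
    per_range = {name: [] for name in ranges}
    lengths = []
    for s in strings:
        lengths.append(len(s))
        for name, (low, high) in ranges.items():
            per_range[name].append(sum(low <= ord(ch) <= high for ch in s))
    profile = {
        name: {"mean": mean(counts), "stddev": pstdev(counts)}
        for name, counts in per_range.items()
    }
    # Same approach applied to the overall string length
    profile["string_length"] = {"mean": mean(lengths), "stddev": pstdev(lengths)}
    return profile


profile = unicode_range_profile(["abc123", "hello", "42"], RANGES)
```

A shift in these distributions over time, for example a rising mean digit count, would indicate that the content of the string column is changing.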
Learn more about string tracking with whylogs here.
NOTE: More text and NLP logging features are coming to whylogs soon!
Model performance monitoring
Monitoring model performance is critical to understanding how well ML models continue to function once deployed. Performance is tracked by logging model predictions and ground truth data with whylogs, so that scoring metrics can be calculated in your home-grown ML monitoring solution or in the WhyLabs Observatory.
Users can set custom monitors in WhyLabs to detect anomalies in model performance, such as if the model accuracy score drops.
WhyLabs will calculate scoring metrics for both classification and regression models.
Classification metrics: Total output and input count, accuracy, ROC, precision-recall chart, confusion matrix, recall, FPR, precision, and F1 score.
Regression metrics: Total output and input count, mean squared error, mean absolute error, root mean squared error.
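For readers building a home-grown solution, several of the scoring metrics listed above can be computed from logged predictions and ground truth as in the sketch below. This is not the WhyLabs implementation: the function names and sample data are hypothetical, the classification case is limited to binary labels, and undefined ratios are returned as 0.0 by assumption.

```python
import math


def classification_metrics(targets, predictions, positive=1):
    """Binary classification metrics derived from confusion matrix counts."""
    pairs = list(zip(targets, predictions))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(numerator, denominator):
        # Assumption: report 0.0 when a metric is undefined
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "fpr": ratio(fp, fp + tn),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


def regression_metrics(targets, predictions):
    """Mean squared error, mean absolute error, and root mean squared error."""
    errors = [p - t for t, p in zip(targets, predictions)]
    mse = sum(e * e for e in errors) / len(errors)
    return {
        "mean_squared_error": mse,
        "mean_absolute_error": sum(abs(e) for e in errors) / len(errors),
        "root_mean_squared_error": math.sqrt(mse),
    }


clf = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
reg = regression_metrics([3.0, 5.0], [2.0, 7.0])
```

Tracking these values per batch over time is what allows a monitor to flag a drop in accuracy or a rise in error.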
Get started with monitoring model performance:
Conclusion
We’re excited about the functionality whylogs v1.1 brings, allowing users to monitor model performance, subgroups, images, strings, and continuous data streams in our easy-to-use data logging API.
If you’re interested in trying whylogs or getting involved with our community of AI builders, here are some steps you can take:
- Check out the whylogs GitHub repository (don’t forget to give us a ⭐)
- Try out the Example Notebooks
- Join the Robust & Responsible AI Community Slack workspace