blog bg left
Back to Blog

Data Labeling Meets Data Monitoring with Superb AI and WhyLabs


Data quality is the key to a performant machine learning model. Without high-quality data to train on, the model will be unable to represent the real-world processes that the data encapsulates accurately. And without high-quality data to feed into the model once it’s trained, the model’s predictions will be wildly inaccurate. That’s why WhyLabs and Superb AI are on a mission to ensure that data scientists and machine learning engineers have access to tools designed specifically for their needs and workflows. These tools enable them to generate high-quality data and monitor the quality of their data, so they can produce robust and reliable ML models.

In this blog post, we explain how WhyLabs and Superb AI’s complementary technologies fit together in a way that brings value to AI practitioners. After a brief overview of each platform, we dive into an example workflow that demonstrates how the two tools can be used in conjunction.

Superb AI Suite Platform

Superb AI has introduced a revolutionary way for ML teams to drastically decrease the time it takes to deliver high-quality training datasets. Instead of relying on human labelers for a majority of the data preparation workflow, teams can now implement a much more time and cost-efficient pipeline with the Superb AI platform.

Everything centers around Superb AI’s customizable auto-label (CAL) technology, which uses a unique mixture of transfer learning, few-shot learning, and autoML, allowing the model to achieve high levels of efficiency with small, customer-proprietary datasets quickly. The concept is quite simple: instead of having to create massive ground truth datasets by hand, teams can now build much smaller ground truth or “golden” sets, quickly spin up and train an auto-labeling model with just a few clicks and start labeling large datasets in a matter of minutes. Coupling the workflow with proprietary Uncertain Estimation AI and enterprise-level auditing tools, teams can label large datasets, immediately identify hard labels, build active learning workflows for auditing and deliver datasets in a matter of days.

WhyLabs AI Observability Platform

WhyLabs provides the critical missing component of AI observability in production ML systems by monitoring ML deployments. With the WhyLabs AI Observability Platform, every AI practitioner can switch on monitoring for model and data health automatically. Data science teams use the platform to monitor data pipelines and AI applications - surfacing data quality issues, data bias, data drift, and concept drift. Out-of-the-box anomaly detection and purpose-built visualizations let WhyLabs users prevent costly model failures and eliminate the need for manual troubleshooting.

WhyLabs is unique in its approach to monitoring data and ML models. It relies on the open-source data logging standard, whylogs, to generate data profiles, statistical summaries of datasets. These profiles get sent to the WhyLabs platform, where they can be analyzed and alerted on.  It works on any data, structured or unstructured, at any scale, on any platform.

Automated Labeling + Monitoring = Reliable Data Operations

Amongst the common use cases for WhyLabs’ customers is monitoring computer vision models. To monitor such a model, a “baseline” profile needs to be generated from the images on which the model is trained. Then, more profiles are generated on the images used for inference once the model is in production. These production profiles are compared against the baseline profile and against each other, allowing a data scientist to notice when data starts to drift and performance starts to degrade.

When a user experiences training-serving skew or data drift, they can be sure that model performance degradation is sure to follow. And if a model is not performing well, it is costing the business potential revenue that it would be able to capture if the model was functioning. To remedy this model performance degradation, a user can turn to SuperbAI to automatically label a fresh dataset and retrain their model based on this new data.


As you can see, WhyLabs and Superb AI fit together perfectly to enable data quality assurance for their users and enable reliable data operations.

If you’re interested in trying out the WhyLabs, check out the always-free Starter edition.

If you’re interested in trying out the Superb AI platform, request a free trial here.

Other posts

Data Logging With whylogs

Users can detect data drift, prevent ML model performance degradation, validate the quality of their data, and more in a single, lightning-fast, easy-to-use package. The v1 release brings a simpler API, new data constraints, new profile visualizations, faster performance, and a usability refresh.

Visually Inspecting Data Profiles for Data Distribution Shifts

This short tutorial shows how to inspect data for distribution shift issues by comparing distribution metrics and applying statistical tests for drift values calculations.

Choosing the Right Data Quality Monitoring Solution

In the second article in this series, we break down what to look for in a data quality monitoring solution, open source and Saas tools available, and how to decide on the best one for your organization.

A Comprehensive Overview Of Data Quality Monitoring

In the first article in this series, we provide a detailed overview of why data quality monitoring is crucial for building successful data and machine learning systems and how to approach it.

WhyLabs Now Available in AWS Marketplace

AWS customers worldwide can now quickly deploy the WhyLabs AI Observatory to monitor, understand, and debug their machine learning models deployed in AWS.

Deploying and Monitoring Made Easy with TeachableHub and WhyLabs

Deploying a model into production and maintaining its performance can be harrowing for many Data Scientists, especially without specialized expertise and equipment. Fortunately, TeachableHub and WhyLabs make it easy to get models out of the sandbox and into a production-ready environment.

How Observability Uncovers the Effects of ML Technical Debt

Many teams test their machine learning models offline but conduct little to no online evaluation after initial deployment. These teams are flying blind—running production systems with no insight into their ongoing performance.

Deploy your ML model with UbiOps and monitor it with WhyLabs

Machine learning models can only provide value for a business when they are brought out of the sandbox and into the real world... Fortunately, UbiOps and WhyLabs have partnered together to make deploying and monitoring machine learning models easy.

AI Observability for All

We’re excited to announce our new Starter edition: a free tier of our model monitoring solution that allows users to access all of the features of the WhyLabs AI observability platform. It is entirely self-service, meaning that users can sign up for an account and get started right away.
pre footer decoration
pre footer decoration
pre footer decoration

Run AI With Certainty

Get started for free