Case study
"We chose WhyLabs for several reasons. First, they provide all the core model monitoring functionalities that we're looking for including a straightforward presentation of results, outlier detection, histograms, data drift monitoring, and missing feature values. [Second,] they have strong data privacy due to their aggregation of data before consumption and very fast ingestion."
ML Platform Program Manager, Fortune 500 Fintech
Introduction
A Fortune 500 financial technology pioneer leverages WhyLabs to ensure trust and reliability in the models served by its central Machine Learning Platform. The platform serves 300+ models across 20 data science teams, and those models make 40+ million decisions per day using billions of parameters. A model failure in this organization results in millions of dollars of lost revenue, so running models in production without monitoring can have a direct negative impact on the business.
The challenge: Drift in mission-critical models can go undetected for months
During our initial customer discovery, an ML Engineer on the data science team supporting customer account expansion shared:
“We are currently flying blind.”
The engineer was referring to the fact that the team had almost no automated model monitoring in place. Any of the team's 45 models could go bad for months without anyone knowing. The team's stopgap solution was to conduct ad hoc manual monitoring, which was time-consuming and only done once a downstream user flagged a problem with a model. This reactive process created long delays between a problem arising and its resolution, and it meant that numerous hours were spent chasing bugs instead of building new models.
This ad hoc approach to monitoring was common practice across the other data science teams as well. Overall, a clear pattern of challenges emerged:
- Little to no monitoring done to ensure the models were healthy
- When monitoring happened, it involved manual processes
- Time spent on those manual processes took teams away from core model building and improvement work
- Model issues were often reported by downstream users, causing the team to be reactive and scramble to find the root cause
- Investigation and resolution of each issue were also ad hoc, manual, and time consuming
Teams were eager to automate these operations, especially by detecting and alerting on data drift and data quality issues in real time. They wanted to shorten the time between an issue arising and its resolution, and to catch issues before they reached downstream users.
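To make that goal concrete, the sketch below illustrates the general idea behind automated drift detection: compare a feature's live distribution against a training-time baseline and alert when the two diverge. It uses a two-sample Kolmogorov-Smirnov test purely as an illustration; this is not how WhyLabs implements drift monitoring, and the data and threshold are invented for the example.

```python
# Toy illustration of automated data-drift detection (not the WhyLabs
# implementation): compare a live feature distribution against a training
# baseline with a two-sample Kolmogorov-Smirnov test and alert on divergence.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(seed=42)
baseline = rng.normal(loc=0.0, scale=1.0, size=5_000)  # feature values seen at training time
live = rng.normal(loc=0.4, scale=1.0, size=5_000)      # shifted values arriving in production

statistic, p_value = ks_2samp(baseline, live)
if p_value < 0.01:
    # In a real pipeline this would notify the owning team (e.g., via Slack)
    # instead of printing to stdout.
    print(f"Drift detected: KS statistic={statistic:.3f}, p-value={p_value:.2e}")
```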
The solution: Enable company-wide monitoring with WhyLabs
The central Machine Learning Platform team took on the task of providing automated monitoring to all platform customers. The user story took shape:
As a Machine Learning Engineer, I need to detect failing models quickly and with low effort, so that I can prevent money from being wasted on poor business decisions (such as risk, marketing, and lending decisions).
The team kicked off a vendor evaluation against a set of criteria representing the requirements of all internal customers. They needed a vendor that could provide monitoring capabilities integrated directly into the Machine Learning Platform. WhyLabs was selected from among 13 vendors after an in-depth evaluation and a rigorous proof of concept. Key to WhyLabs' selection were the following unique capabilities:
- Privacy-centric integration and deployment
- Real-time monitoring and alerting
- Lightweight and easy to maintain integration
Today, WhyLabs AI Observatory delivers automatic monitoring to every customer of the central ML Platform. Each model author switches on monitoring as part of the model productionization step, using the direct integration.
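As a rough sketch of what that integration looks like from a model author's perspective, the snippet below profiles a batch of inference data with the open-source whylogs library, which underpins WhyLabs ingestion: only aggregated statistical profiles leave the host, never raw rows. The column names, dataset and organization IDs, and credential values are placeholders, and the exact setup inside this organization's platform is assumed rather than documented here.

```python
# Minimal sketch (assumed setup): a model author profiles a batch of inference
# data with the open-source whylogs library and ships only the aggregate
# profile to the WhyLabs platform; raw rows never leave the host.
import os
import pandas as pd
import whylogs as why

# Placeholder credentials and IDs for the WhyLabs platform (illustrative only).
os.environ["WHYLABS_API_KEY"] = "<api-key>"
os.environ["WHYLABS_DEFAULT_ORG_ID"] = "org-example"
os.environ["WHYLABS_DEFAULT_DATASET_ID"] = "model-credit-risk"  # hypothetical model ID

# A batch of model inputs and outputs captured during inference (made-up data).
batch = pd.DataFrame(
    {
        "credit_score": [712, 655, 780],
        "loan_amount": [12000.0, 7500.0, 31000.0],
        "prediction": [0, 1, 0],
    }
)

# why.log() computes a statistical profile (counts, distributions,
# missing-value ratios) locally on the host.
results = why.log(batch)

# Upload the aggregated profile for monitoring; drift, data-quality, and
# performance monitors then run against these profiles in the platform.
results.writer("whylabs").write()
```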
Outcome: Models that build trust & deliver results
As a result of the integration, the 13 teams that are customers of the ML Platform can switch on observability for every model that is already in production or about to be launched.
With WhyLabs, the central ML Platform has continuous monitoring enabled across all teams, instrumented as a best practice for every production ML model. Each model is now considerably more reliable because it is equipped with the following monitoring and observability capabilities:
- Drift monitoring for features and model outputs
- Data quality monitoring for features and model outputs
- Model performance monitoring
- Real-time alerting with team-based configurations
- Auditable historical model health and data health telemetry (12+ months of history)
- Zero-config monitoring experience
- Customizable monitoring configs and baselines
Outcome: Reduce ad hoc investigations and cut time to resolution
Before using WhyLabs, ML model owners typically learned about issues in a model from downstream users. When an issue affected only a small segment of users, it could go undetected for long periods, which meant that bad model decisions could silently impact the business for weeks or months.
With WhyLabs in place, teams receive Slack notifications as soon as data drift, a data quality issue, or a decline in model prediction quality is detected. Through monitoring and alerting, teams can fix issues immediately and free up cycles to proactively improve data quality and retrain models before performance degradation reaches downstream users.
Outcome: Empower the central ML platform with self-service monitoring
The central ML Platform team now offers monitoring as a self-service capability to all of its customers. The decision to integrate WhyLabs into the internal ML platform lets every data science team follow monitoring best practices effortlessly: each new model launches with monitoring enabled out of the box. With the cycles freed up from manual processes, the organization is confidently moving from 380+ models to a roadmap of over 1,000 this year.