WhyLabs: The AI Observability Platform
Sep 23, 2020
- Machine Learning
- Data Science
- Data Visualization
- Artificial Intelligence
Although companies across industries are adopting AI applications to improve products and stay competitive, very few have seen a return on their investments. That’s because AI operations are expensive, and models fail all the time. Over 1,000 AI failures have been recorded by Partnership on AI alone. Meanwhile, the big tech companies have successfully deployed AI operations and are already reaping significant benefits from them. Our goal at WhyLabs is to equip every AI practitioner with tools previously only available to the tech giants.
After interviewing hundreds of enterprises that run AI, we built the WhyLabs Platform to enable every enterprise, no matter how large or small, to run AI with certainty. As a team of experienced AI practitioners, we designed the platform for fellow practitioners, keeping their most pressing needs in mind. The WhyLabs Platform is specifically built for data science workflows, incorporating methods and features that we pioneered based on analogous best practices in DevOps. Furthermore, it is easy to install, easy to deploy, and easy to operate. The WhyLabs Platform enables AI builders to effortlessly do the following:
- Amplify AI operations across your entire organization by eliminating manual troubleshooting.
- Log and profile data along a model’s entire lifecycle with minimal compute requirements.
- Surface actionable insights regarding data quality issues, data bias, and concept drift, all in real time.
- Connect model performance with product KPIs to help teams ensure that they are delivering financial results and a superb customer experience.
Solving and preventing problems at the source
The WhyLabs solution starts at the source of the problem: data. The peculiar thing about AI applications is that the majority of failures happen because of the data that models consume. We built a data logging solution called whylogs, which enables anybody to continuously log and monitor the quality of the data that flows through their AI application. We believe so strongly in the importance of continuous data monitoring and logging for responsible AI operations that we made whylogs free for all AI builders by releasing it as an open source library.
whylogs is a one-of-a-kind logging solution that we designed to efficiently handle massive amounts of data. Powered by approximate statistical methods, the library can “summarize” terabytes of data into tiny “statistical fingerprints”, which can be as small as a few megabytes. It runs in parallel with AI applications and requires virtually no computing power beyond what the application already uses. The lightweight “summaries” that whylogs distills are extremely useful to AI builders for troubleshooting. The library can be deployed anywhere in the ML pipeline at any stage of the model lifecycle to track data continuously without breaking the compute and storage budget. Check out our deep dive on whylogs’ design and scalability.
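To make the idea of a constant-size "statistical fingerprint" concrete, here is a minimal sketch in plain Python. It is not the whylogs API or its storage format: `ColumnProfile`, `track`, `merge`, and `profile_values` are hypothetical names chosen for illustration. The running mean and variance below are exact; approximate sketches for quantiles or distinct counts can follow the same track-and-merge pattern, which is what lets summaries be built in parallel and combined later.

```python
import math
from dataclasses import dataclass


@dataclass
class ColumnProfile:
    """Constant-size summary of one numeric column (illustrative only)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean
    minimum: float = math.inf
    maximum: float = -math.inf

    def track(self, value: float) -> None:
        """Fold one value into the summary using Welford's online update."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other: "ColumnProfile") -> "ColumnProfile":
        """Combine two summaries as if their values had been tracked together."""
        total = self.count + other.count
        if total == 0:
            return ColumnProfile()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.count / total
        m2 = self.m2 + other.m2 + delta * delta * self.count * other.count / total
        return ColumnProfile(
            count=total,
            mean=mean,
            m2=m2,
            minimum=min(self.minimum, other.minimum),
            maximum=max(self.maximum, other.maximum),
        )

    @property
    def stddev(self) -> float:
        """Sample standard deviation; 0.0 when fewer than two values were seen."""
        return math.sqrt(self.m2 / (self.count - 1)) if self.count > 1 else 0.0


def profile_values(values) -> ColumnProfile:
    """Build a summary from any iterable of numbers in a single pass."""
    profile = ColumnProfile()
    for value in values:
        profile.track(float(value))
    return profile


if __name__ == "__main__":
    # Two partitions profiled independently, then merged into one summary.
    part_a = profile_values(range(0, 500))
    part_b = profile_values(range(500, 1000))
    combined = part_a.merge(part_b)
    print(combined.count, combined.mean, combined.minimum, combined.maximum)
```

The summary stays the same size no matter how many values pass through it, and only the summary needs to be stored or shipped, never the values themselves.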
AI Observability as a Service
By itself, whylogs is an indispensable tool for any AI practitioner. Once a team is using it, they can switch on the WhyLabs Platform at any time to upgrade and supercharge their AI operations. Onboarding to the SaaS platform is quick and intuitive. It involves deploying an agent similar to ones that are standard practice in DevOps tools like Splunk and Datadog. The WhyLabs Platform integrates seamlessly with the data-storage solutions of all major cloud services and with all major ML frameworks. The platform supports all deployment strategies: public cloud, on-premises servers, or hybrid.
The WhyLabs Platform empowers organizations of all sizes to take control of their AI operations and run their models with certainty. Its architecture is optimized for large-scale data evaluation and enterprise-grade security and availability. It is designed specifically for data science workflows. Since the platform runs on statistical profiles generated by whylogs, raw data never leaves the customer perimeter. This design makes our platform well suited for customers with highly sensitive data.
A single pane of glass
Once the statistical data summaries start flowing into WhyLabs, the platform creates a single pane of glass for all data quality and model health information. The purpose-built user interface is designed to surface insights across all models that are operated by an organization. For each model, all inputs are continuously tracked and monitored for deviations in data quality and for data drift. To maximize observability, it is essential that AI practitioners track raw data, feature data, model predictions, and actuals. The WhyLabs Platform makes all these steps easy and thus allows customers to have a comprehensive view of their AI application’s entire pipeline, from data source to business KPIs.
At each point of the pipeline, all of the model’s features are tracked, monitored, and analyzed. For each feature, there is a dedicated visualization of how the statistical properties of this feature evolved over time. The goal is to allow model operators to perform deep dives into data quality, data drift, and data biases at individual feature levels. We also layer on proactive monitoring to highlight deviations and drifts, and to generate timely alerts. These alerts and insights are easy to share across the organization via Slack, email, PagerDuty or other messaging platforms.
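As a rough illustration of per-feature monitoring on top of statistical summaries, the sketch below compares a current batch summary against a baseline and returns human-readable alerts. It is an assumption-laden example, not how the WhyLabs Platform computes drift: `FeatureSummary`, `check_feature`, and both thresholds are hypothetical, and the mean-shift rule is a deliberately simple stand-in for whatever drift measures a production monitor would use.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FeatureSummary:
    """Hypothetical per-batch summary of one feature."""
    count: int      # rows observed in the batch
    mean: float     # mean of the non-missing values
    stddev: float   # standard deviation of the non-missing values
    missing: int = 0  # rows where the feature was null


def check_feature(
    name: str,
    baseline: FeatureSummary,
    current: FeatureSummary,
    max_shift: float = 3.0,           # assumed threshold, in baseline stddevs
    max_missing_ratio: float = 0.05,  # assumed data-quality threshold
) -> List[str]:
    """Return alert messages for one feature; an empty list means no alert."""
    if current.count == 0:
        return [f"{name}: no data in current batch"]

    alerts: List[str] = []

    # Data-quality check: share of missing values in the current batch.
    missing_ratio = current.missing / current.count
    if missing_ratio > max_missing_ratio:
        alerts.append(f"{name}: missing ratio {missing_ratio:.1%} exceeds limit")

    # Drift check: how far the mean moved, measured in baseline stddevs.
    if baseline.stddev > 0:
        shift = abs(current.mean - baseline.mean) / baseline.stddev
        if shift > max_shift:
            alerts.append(f"{name}: mean shifted by {shift:.1f} baseline stddevs")

    return alerts


if __name__ == "__main__":
    baseline = FeatureSummary(count=1000, mean=50.0, stddev=5.0, missing=10)
    today = FeatureSummary(count=1000, mean=80.0, stddev=5.0, missing=200)
    for message in check_feature("age", baseline, today):
        print(message)
```

Because the check reads only summaries, it can run on every batch without touching raw data, and the resulting messages are the kind of payload that could be forwarded to a notification channel.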
Only the beginning
The WhyLabs Platform is the first big step towards our vision of achieving robust and responsible AI. By enabling observability in AI, our platform helps AI builders run AI with certainty no matter where they are in their model lifecycle. As we tackle new use cases to better serve our customers, we are constantly adding more features, data types, platform integrations, and interactive visualizations. We’d love to hear about your use cases, pain points, and ideas for how we can help you simplify your AI operations. Get started by trying the sandbox on our website or by scheduling a live demo.