Frequently asked questions
What problems do you solve?
We instrument data and ML pipelines to log the key properties of the data that flows through them. We centralize these logs to monitor for and surface data bias, concept drift, and data quality issues in real time and at large scale. Ultimately, the WhyLabs Platform simplifies AI operations and prevents costly AI failures.
Can you handle large scale data?
We love massive amounts of data and have built the WhyLabs platform in a way that allows us to process and deliver results quickly, efficiently and without breaking the bank.
What type of data can I use?
We work with both structured and unstructured data (images, time series, text, etc.).
Do I own my data?
Yes, you own all your data. WhyLabs does not collect any raw data from our customers. We allow you to maintain full control of your proprietary information. Since our logging solution is open source, any enterprise can transparently deploy it today. Moreover, our deployment model is well suited for customers with highly sensitive data, accommodating the strictest data handling requirements, such as those that apply to PII and HIPAA-regulated data.
How do I log my data?
You start with a simple pip install of whylogs, our lean open source library (https://github.com/whylabs). We currently support Python and Java/Spark integrations. Check out the Jupyter notebook tutorial for getting started with Python.
Can I export my data?
Yes! When you run whylogs on your data you are in control of your data and your output. whylogs outputs the statistical profiles of data in many different formats. We provide starter Jupyter notebooks to analyze the data generated by whylogs. By default, the whylogs library does not send data anywhere.
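To make the idea of exporting a statistical profile more concrete, here is a minimal sketch that writes a small profile to JSON and CSV using only the Python standard library. It does not use the whylogs API; the `profile` dictionary, its column names, and its statistics are made-up values for illustration only.

```python
import csv
import io
import json

# Hypothetical profile: summary statistics only, no raw rows.
profile = {
    "age": {"count": 3, "nulls": 0, "min": 29, "max": 41},
    "income": {"count": 3, "nulls": 1, "min": 42000, "max": 58000},
}

# JSON: one object keyed by column name.
as_json = json.dumps(profile, indent=2, sort_keys=True)

# CSV: one row per column, convenient for loading into a notebook.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["column", "count", "nulls", "min", "max"])
writer.writeheader()
for column, stats in profile.items():
    writer.writerow({"column": column, **stats})
as_csv = buffer.getvalue()

print(as_json)
print(as_csv)
```

Both outputs contain only aggregate statistics, so they can be shared or analyzed in a notebook without exposing the underlying records.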
What integrations do you support?
We integrate seamlessly with your existing data pipelines and tools. Some examples include Google, AWS, Microsoft, Databricks, Snowflake, TensorFlow, PyTorch, SageMaker, and Keras.
Do you integrate with data storage platforms?
Yes. We integrate seamlessly with the data-storage solutions of all major cloud services. Getting started takes minutes.
What machine learning libraries does WhyLabs work with?
WhyLabs works with most machine learning libraries and is easy to set up with custom libraries as well. View our getting started guide to see how to use WhyLabs with PyTorch, Keras, TensorFlow, and more.
Is there an on-premise solution?
Yes, you can use WhyLabs on-premise or in the cloud. We are happy to discuss your specific use case; email us at [email protected].
What is data logging?
Unlike traditional logging, which is standard practice in all software systems, data logging is a practice specific to ML/AI applications. Data logging involves creating a log of statistical properties of the data that flows through an ML/AI application. Check out our blogs on the parallels between DevOps and MLOps logging approaches and on the practice of data logging in data science.
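As a rough illustration of what "a log of statistical properties" can look like, the sketch below summarizes one batch of records column by column using only the Python standard library. It is not the whylogs API; the function name `profile_batch` and the choice of statistics (count, nulls, min, max, mean) are assumptions made for this example.

```python
from typing import Any, Dict, Iterable, Mapping


def profile_batch(records: Iterable[Mapping[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Summarize a batch of records column by column (hypothetical helper)."""
    profile: Dict[str, Dict[str, Any]] = {}
    for record in records:
        for column, value in record.items():
            stats = profile.setdefault(
                column,
                {"count": 0, "nulls": 0, "min": None, "max": None, "sum": 0.0, "numeric": 0},
            )
            stats["count"] += 1
            if value is None:
                stats["nulls"] += 1
                continue
            # Only numeric values contribute to min, max, and mean.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                stats["numeric"] += 1
                stats["sum"] += value
                stats["min"] = value if stats["min"] is None else min(stats["min"], value)
                stats["max"] = value if stats["max"] is None else max(stats["max"], value)
    # Replace the running totals with a mean so only summary values remain.
    for stats in profile.values():
        numeric = stats.pop("numeric")
        total = stats.pop("sum")
        stats["mean"] = total / numeric if numeric else None
    return profile


batch = [
    {"age": 34, "country": "US"},
    {"age": 29, "country": None},
    {"age": 41, "country": "DE"},
]
profile = profile_batch(batch)
print(profile["age"])
print(profile["country"])
```

The resulting profile describes the batch (how many values, how many missing, their range) without storing any individual record, which is the property that distinguishes data logging from keeping a copy of the data.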
Can I export to Jupyter Notebooks?
whylogs outputs the statistical profiles of data in many different formats. We provide starter Jupyter notebooks to analyze the data generated by whylogs. The WhyLabs Platform currently does not support data export, but this feature is high on our roadmap. Please let us know on our public Slack channel if this feature is important to you!
How is WhyLabs different from AWS SageMaker Model Monitor?
SageMaker Model Monitor is a great starting solution for models running on SageMaker. WhyLabs focuses on supporting all models in a platform-agnostic way. Furthermore, WhyLabs focuses on surfacing issues in a purpose-built, intuitive interface that enables cross-functional collaboration and improves operational workflows (with integrations for Slack, email, PagerDuty, etc.). If you are using SageMaker Model Monitor, we can ingest its output into WhyLabs and surface monitoring alerts in the WhyLabs interface. Contact our support team to discuss this feature: [email protected].
What is WhyLabs?
WhyLabs is an observability platform that enables AI practitioners to run AI with certainty. We streamline model monitoring, and provide real-time insights into model and data health. AI builders can answer and act on questions related to model drift, anomalies, explainability, and data quality such as:
- Why is the model performance getting worse over time?
- Why is the model performance not matching our experiments?
- Why did the model generate unreasonable predictions for this customer segment?
- What changed in the model behavior between yesterday and today?
What can I do with the WhyLabs platform?
Here are a few examples of what our platform enables:
- Real-time view of data quality and health across the entire ML production pipeline
- Intelligent monitoring of data quality and data distributions in real-time
- Integrated view of data and model health over time and in real-time
- Intuitive insights into model and data health available to the relevant team members
How do I get started?
How do I onboard my team on WhyLabs?
We would love to walk you through the onboarding process. Simply book a time to meet with us.
How much does WhyLabs cost?
You can use whylogs, our data logging library, for free under a standard open source license. For all other use cases, contact us: [email protected].
How do I partner with WhyLabs?
We’d love to hear from you! Email us at [email protected].
What is whylogs?
whylogs is an open source statistical logging library that allows data science and ML teams to effortlessly profile ML/AI pipelines and applications. It produces log files that can be used for monitoring, alerts, analytics, and error analysis. The library is easy to use, lightweight, portable, configurable, and close to code. It is available in Python and Java and can be downloaded from https://github.com/whylabs. Join our community on Slack for support, questions, and feature requests.
Why should I use whylogs?
Use whylogs to run AI with certainty. We believe that effective data logging must take a primary role among best practices for operating robust ML/AI systems. whylogs aims to bridge the ML logging gap by providing approximate data profiling in an open source and easy-to-use package.
What is AI Observability?
Observability, a term that comes from control theory, is a measure of how well internal states of a system can be inferred from knowledge of its external outputs. WhyLabs applies this concept to AI systems by collecting statistical fingerprints of data that flows through the system, monitoring these fingerprints for anomalies and exposing key insights to users through an intuitive interface for real-time analytics.
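The loop described above (collect a fingerprint, compare it with a baseline, flag anomalies) can be sketched in a few lines of standard-library Python. This is a simplified illustration, not how the WhyLabs Platform detects anomalies: the helper names `fingerprint` and `is_anomalous`, the sample values, and the rule of flagging a batch whose mean is more than three baseline standard deviations away are all assumptions made for this example.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def fingerprint(values: Sequence[float]) -> Dict[str, float]:
    """Reduce a batch to a few summary statistics (hypothetical helper)."""
    return {"count": len(values), "mean": mean(values), "stddev": stdev(values)}


def is_anomalous(baseline: Dict[str, float], current: Dict[str, float],
                 threshold: float = 3.0) -> bool:
    """Flag a batch whose mean is more than `threshold` baseline
    standard deviations away from the baseline mean (assumed rule)."""
    if baseline["stddev"] == 0:
        return current["mean"] != baseline["mean"]
    z = abs(current["mean"] - baseline["mean"]) / baseline["stddev"]
    return z > threshold


# Made-up feature values for a reference day and two later days.
baseline = fingerprint([10.0, 11.0, 9.5, 10.5, 10.2])
today_ok = fingerprint([10.1, 9.9, 10.4, 10.6, 9.8])
today_shifted = fingerprint([15.2, 14.8, 15.5, 15.1, 14.9])

print(is_anomalous(baseline, today_ok))       # False
print(is_anomalous(baseline, today_shifted))  # True
```

Only the fingerprints are compared, never the raw values, so the check can run wherever the fingerprints are centralized. A production monitor would typically compare whole distributions rather than a single mean.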
What is the relationship between AI Explainability and Observability?
Observability refers to the practice of acquiring actionable insights from information that is emitted by a system continuously. If an explainability tool is like a doctor who can tell you why you are feeling ill on a particular day, an observability tool is like a futuristic device that can simultaneously measure your heart rate, temperature, oxygen levels and a thousand other things, and keep a log of all these measurements over time. While a doctor is necessary once in a while, the observability tool is more useful for day-to-day health monitoring and, if used right, can save you some trips to the hospital.