What problems do you solve?
We instrument data and ML pipelines to log the key properties of the data that flows through them. We centralize these logs to monitor for and surface data bias, concept drift, and data quality issues in real time and at large scale. Ultimately, the WhyLabs Platform simplifies AI operations and prevents costly AI failures.
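As a rough illustration of what monitoring logged properties can look like, here is a minimal sketch in plain Python. It is not the WhyLabs or whylogs implementation: the function names (summarize, check_batch) and the thresholds are hypothetical and chosen only for this example. It compares a summary of a new batch against a baseline summary and flags a data quality issue (too many missing values) or drift (the mean moving far from the baseline).

```python
import statistics


def summarize(values):
    """Reduce a batch of numeric values to a few logged properties.

    Assumes at least one non-missing value in the batch.
    """
    present = [v for v in values if v is not None]
    return {
        "missing_rate": 1 - len(present) / len(values),
        "mean": statistics.fmean(present),
        "stddev": statistics.pstdev(present),
    }


def check_batch(baseline, current, max_missing_rate=0.1, max_shift=3.0):
    """Compare two summaries; thresholds are illustrative assumptions."""
    alerts = []
    if current["missing_rate"] > max_missing_rate:
        alerts.append("data quality: too many missing values")
    # Measure the mean shift in units of the baseline standard deviation.
    spread = baseline["stddev"] or 1.0
    if abs(current["mean"] - baseline["mean"]) / spread > max_shift:
        alerts.append("drift: mean moved away from baseline")
    return alerts


baseline = summarize([10, 12, 11, 9, 10, 11])
healthy = summarize([10, 11, 10, 12])
suspicious = summarize([25, 27, None, 26])

print(check_batch(baseline, healthy))
print(check_batch(baseline, suspicious))
```

Only the summaries are compared, so a check like this never needs the raw values once a batch has been summarized.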
Can you handle large scale data?
We love massive amounts of data and have built the WhyLabs platform in a way that allows us to process and deliver results quickly, efficiently and without breaking the bank.
What type of data can I use?
We work with both structured and unstructured data (images, time series, text, etc).
Do I own my data?
Yes, you own all your data. WhyLabs does not collect any raw data from our customers. We allow you to maintain full control of your proprietary information. Since our logging solution is open source, any enterprise can transparently deploy it today. Moreover, our deployment model is well suited for customers with highly sensitive data, accommodating the strictest data handling requirements, such as those covering PII and HIPAA.
How do I log my data?
You start with a simple pip install of whylogs, our lean open source library. We currently support Python and Java/Spark integrations. Check out the Jupyter notebook tutorial to get started with Python.
Can I export my data?
Yes! When you run whylogs on your data you are in control of your data and your output. whylogs outputs the statistical profiles of data in many different formats. We provide starter Jupyter notebooks to analyze the data generated by whylogs. By default, the whylogs library does not send data anywhere.
What integrations do you support?
We integrate seamlessly with your existing data pipelines and tools. Some examples include Google, AWS, Microsoft, Databricks, Snowflake, TensorFlow, PyTorch, SageMaker, and Keras.
Do you integrate with data storage platforms?
Yes. We integrate seamlessly with the data-storage solutions of all major cloud services. Getting started takes minutes.
What machine learning libraries does WhyLabs work with?
WhyLabs works with most machine learning libraries and is easy to set up with custom libraries as well. View our getting started guide to see how to use WhyLabs with PyTorch, Keras, TensorFlow, and more.
Is there an on-premise solution?
Yes, you can use WhyLabs on-premise or in the cloud. We are happy to discuss your specific use case; email us at [email protected].
What is data logging?
Unlike traditional logging, which is standard practice in all software systems, data logging is a practice necessary for ML/AI applications. Data logging involves creating a log of statistical properties of the data that flows through an ML/AI application. Check out our blog posts on the parallels between DevOps and MLOps logging approaches and on the practice of data logging in data science.
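To make the idea concrete, here is a minimal sketch of data logging in plain Python. It does not use the whylogs API; the helper names (profile_column, profile_batch) and the choice of statistics are assumptions made for this illustration. The point is that the log holds summary statistics per column, not the records themselves.

```python
import json
import statistics


def profile_column(values):
    """Summarize one column without keeping any raw values."""
    present = [v for v in values if v is not None]
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    if numeric:
        summary.update(
            min=min(numeric),
            max=max(numeric),
            mean=statistics.fmean(numeric),
            stddev=statistics.pstdev(numeric),
        )
    return summary


def profile_batch(rows):
    """Profile a batch of records (dicts), one summary per column."""
    columns = sorted({key for row in rows for key in row})
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Hypothetical batch of records flowing through a pipeline.
batch = [
    {"age": 34, "country": "US"},
    {"age": 29, "country": "DE"},
    {"age": None, "country": "US"},
]
profile = profile_batch(batch)
print(json.dumps(profile, indent=2))
```

A production library would typically use approximate, mergeable summaries so that profiles stay small on large data; this sketch computes exact statistics in memory for readability.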
How do I report a bug I found?
You can report a bug by opening an issue on our GitHub page (Python / Java) or by reaching out to us on our public Slack channel.
Can I export to Jupyter Notebooks?
whylogs outputs the statistical profiles of data in many different formats. We provide starter Jupyter notebooks to analyze the data generated by whylogs. The WhyLabs Platform currently does not support data export, but this feature is high on our roadmap. Please let us know on our public Slack channel if this feature is important to you!
How is WhyLabs different from AWS SageMaker Model Monitor?
SageMaker Model Monitor is a great starting solution for models running on SageMaker. WhyLabs focuses on supporting all models in a platform-agnostic way. Furthermore, WhyLabs focuses on surfacing issues in a purpose-built, intuitive interface that enables cross-functional collaboration and improves operational workflows (with integrations for Slack, email, PagerDuty, etc.). If you are using SageMaker Model Monitor, we can ingest its output into WhyLabs and surface monitoring alerts in the WhyLabs interface. To discuss this feature, contact our support team at [email protected].