AI Observability is Dead, Long Live AI Observability! Introducing WhyLabs AI Control Center for Generative and Predictive AI
Apr 24, 2024
Today, we are launching the new iteration of WhyLabs. This iteration is based on a unique insight: AI observability platforms are insufficient in the world of Generative AI. We came to this realization thanks to our customers, who highlighted that passive observability alone cannot protect an application. You cannot afford a 5-minute delay in learning that a jailbreak has impacted your LLM application. You need the ability to intervene in real time and to steer the application away from the security threat.
After months in private beta, we are releasing the new WhyLabs AI Control Platform to general availability. The platform makes real-time security threat prevention possible for any Generative AI (GenAI) application. This capability is already making a huge difference in enabling teams to take their GenAI projects from prototype to production. One of the success stories is Yoodli, an incredible product that helps anyone become a great public speaker using GenAI technology. Check out our case study on how Yoodli accelerated the continuous improvement of their LLM-based application using WhyLabs, resulting in significantly faster experimentation, impactful new features, and higher-quality responses delivered to their end users.
"Yoodli is building for a future where everyone can be a better communicator through our state-of-the-art Generative AI speech coaching application,” explained Varun Puri, CEO of Yoodli. “WhyLabs AI Control Platform provides us with an accessible and easily adaptable solution that we can trust. We are really excited about the new capabilities that enable us to execute AI control across five critical dimensions for our communication application: protection against bad actors, misuse, bad customer experience, hallucinations, and costs."
What is unique about GenAI that makes AI observability obsolete?
The emergence of foundation AI models has revolutionized the accessibility and efficiency of Artificial Intelligence (AI). Because these models are pre-trained on vast amounts of data, they possess a deep understanding of various domains and are broadly applicable across multi-modal, complex tasks. Their versatility allows developers to fine-tune or adapt foundation models for specific tasks quickly, bypassing the long cycle of training a model from scratch.
This versatility is both a blessing and a curse. Foundation models are so powerful that they can power any GenAI application, from serving as your customer service agent, to giving your customers legal advice, to sharing your customer information with anyone. This power introduces entirely new categories of risks: adversarial attacks, data privacy breaches, harmful topics, and factually incorrect claims.
These risks pose serious challenges to moving GenAI applications from prototype to production. To tackle these, AI teams require tools that let them control and steer GenAI applications in real-time. But even that is insufficient. GenAI applications evolve rapidly, with prompt engineering, RLHF, and RAG. In addition to real-time guardrails, teams need a way to understand behavior changes over time. But even that is insufficient. While RAG architectures help tackle hallucinations, they add to the complexity of troubleshooting issues arising with GenAI applications. As such, teams need a way to analyze complex traces over time in order to debug issues. These are pain points we focused on solving with the WhyLabs AI Control Center.
How does the WhyLabs AI Control Center help tackle these risks?
WhyLabs AI Control Center unifies three pillars of AI operations in one platform. We combine the most advanced and up-to-date techniques for AI application security, observability, and optimization.
By bringing together all of these key components, we enable our customers to harness the power of GenAI applications with precision and control. Today, no other platform connects all of these capabilities while remaining computationally efficient enough to support high-throughput production workloads.
As we continue building our platform to serve the world's best AI teams and enterprises, a few design principles guide everything we ship:
- Privacy comes first: WhyLabs is architected with the highest degree of privacy and data security. For all capabilities available on the AI Control Center, no raw data ever leaves the customer VPC. In fact, raw data doesn’t even need to move to another environment for telemetry extraction: we extract all telemetry in the same environment, without data duplication.
- Massive enterprise scale: Every aspect of the WhyLabs Platform is architected to handle massive inference scale, from integrations to telemetry calculation algorithms to monitoring configurations to the UI. The platform can handle hundreds of millions of inferences per day (thousands of application TPS) in real time, even for models with thousands of features.
- Frictionless onboarding and ownership: As the only SaaS on the market approved for highly regulated industries, we focus on making onboarding onto the platform simple enough for an undergraduate intern to complete in about an hour. With SaaS deployment, you never need to worry about infrastructure or maintenance.
Let’s explore each of the three pillars of the AI Control Platform!
WhyLabs: Observe
All of our customers' favorite capabilities of the WhyLabs Observability Platform are now available in WhyLabs: Observe, and all existing customers have access to the expanded capabilities. In addition, these capabilities now extend to LLM applications. Whether you are deploying batch inference pipelines or a live prediction service, WhyLabs provides infrastructure-agnostic, real-time anomaly detection and monitoring of drift, data quality issues, and performance degradation. With WhyLabs, teams have decreased the time to resolution of AI issues by 10x.
Organizations leveraging WhyLabs: Observe for AI applications see the following outcomes:
- Transparency and observability at any scale:
  - Cost-effective solution for any scale of AI footprint, with support for models with thousands of input features and applications with over 1,000 transactions per second.
- Reduction in the duration and severity of AI failures, due to:
  - Feature and concept drift detection
  - Data quality issue detection
  - Bias detection
  - GenAI security metrics drift
  - GenAI performance metrics drift
  - Model performance degradation detection
  - Outlier detection in features and predictions
- Reduction in the time to resolution of AI failures and degradations, due to:
  - Purpose-built visualizations and workflows designed for resolving drift, data quality issues, and performance degradations
  - Explainability-powered root cause analysis
  - Automation of re-training pipelines
  - Circuit breakers in the pipeline when issues are detected
- Minimization of manual operations, due to:
  - Automatic model onboarding
  - Smart, zero-config monitoring setup
  - Templatable configurations to drive standards across teams
  - Automated alerting and notifications directly into the team’s workflow
  - Customization of dashboards and reports, for technical and non-technical audiences
The WhyLabs AI Control Platform offers a unique architecture that allows customers to switch on observability for 100% of their data, never sampling, because sampling distorts distributions and causes a high rate of false alarms. Customers configure monitoring by connecting WhyLabs directly to training and inference data, in batch or in real time. This integration is cost-effective and privacy-preserving, making it the best choice for high-inference-volume AI applications in regulated industries such as healthcare and FinTech.
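To make this pattern concrete, here is a minimal sketch of a batch integration using whylogs, the open-source library the WhyLabs platform builds on. It profiles every row of a batch locally, so only compact statistical profiles (never raw data) are uploaded; the file path and credential values are placeholders for your own.

```python
import os
import pandas as pd
import whylogs as why

# Placeholder credentials; replace with your own WhyLabs org, model, and key.
os.environ["WHYLABS_DEFAULT_ORG_ID"] = "org-0"        # hypothetical org ID
os.environ["WHYLABS_DEFAULT_DATASET_ID"] = "model-1"  # hypothetical model ID
os.environ["WHYLABS_API_KEY"] = "<your-api-key>"

# Load one batch of inference data (the path is a placeholder).
df = pd.read_csv("inference_batch.csv")

# Profile 100% of the rows locally: whylogs builds compact statistical
# sketches, so nothing is sampled and no raw data leaves this environment.
results = why.log(df)

# Ship only the profile (distributions, counts, missing values) to WhyLabs.
results.writer("whylabs").write()
```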
WhyLabs: Secure
After speaking with hundreds of security teams across enterprises and late-stage startups, we identified one clear need for LLM applications: the ability to intervene in real time and steer the application away from “undesirable behavior”. WhyLabs: Secure focuses on making it easy for teams to switch on guardrails and create the necessary feedback loops for continuous model improvement.
Organizations leveraging WhyLabs: Secure for AI applications see the following outcomes:
- Absolute data privacy, due to:
  - All processing of raw data happens locally. The solution does not rely on third-party providers and does not require sending raw data outside of the immediate infrastructure.
- Reduction in the duration and severity of GenAI security events, due to real-time:
  - Prevention of bad actors by detecting prompt injections, jailbreaks, refusals, or sensitive information disclosure.
  - Prevention of misuse by detecting inappropriate topics, costly prompts, or sensitive information leakage.
  - Detection and prevention of poor customer experience by detecting toxic, low-sentiment, or inadequate responses.
  - Detection and prevention of untruthful responses by detecting hallucinations, overreliance, and out-of-context responses.
- Reduction in the time to resolution of GenAI security events, due to:
  - Powerful analytics of full AI application traces.
  - Dedicated dashboards for tracking security and performance trends.
  - Drill-down analysis from trends to application traces.
- Compliance and governance approvals necessary to move from prototype to production, due to:
  - Alignment with the OWASP Top 10 for LLMs and the MITRE ATLAS™ framework for LLM security.
  - Full security team control over the GenAI application through guardrail sensitivities, system messages, and notification workflows.
If you or your team are currently evaluating open-source projects that offer a range of guardrail, tracing, and RAG debugging capabilities for LLM applications, take a look at WhyLabs: Secure. With one simple integration (a minimal sketch follows the list below), our customers enable:
- A real-time guardrail to detect harmful prompts and responses in under 300ms across all critical security rules.
- A response router to steer the application towards safe behavior with customizable callbacks and system messages.
- An AI-specialized logger that captures rich metrics, traces, and metadata (with no-code integrations into OpenAI, Bedrock, etc.).
- An AI-specialized data capture to organize raw prompt and response data in a manner that maximizes privacy and allows teams to use this data in future optimization.
- A powerful monitoring engine that aggregates metrics to surface behavior changes over time and alert users about them.
- A rich analytics platform that enables users to visualize guardrail decisions, trends in security and performance, and debug events down to individual traces.
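As an illustration of what the guardrail piece can look like in code, here is a minimal sketch built on LangKit, WhyLabs' open-source LLM metrics library. It assumes LangKit's `extract` helper and `llm_metrics` bundle; the 0.5 threshold and the fallback message are illustrative choices, not platform defaults.

```python
# Minimal guardrail sketch using LangKit; the threshold and fallback
# message below are illustrative assumptions, not platform defaults.
from langkit import extract, llm_metrics

schema = llm_metrics.init()  # registers toxicity, relevance, and other LLM metrics

def guarded_response(prompt: str, response: str) -> str:
    # Score the prompt/response pair locally before returning it to the user.
    scores = extract({"prompt": prompt, "response": response}, schema=schema)
    # Block responses that score as toxic; a production guardrail would
    # evaluate many more rules (injections, PII, refusals, relevance).
    if scores.get("response.toxicity", 0.0) > 0.5:  # hypothetical threshold
        return "I'm sorry, I can't share that response."
    return response
```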
Experience the demo today. Schedule time with our engineering team to onboard your GenAI application for a 14-day free trial.
WhyLabs: Optimize
The WhyLabs AI Control Platform provides feedback tools for improving any aspect of your AI system. Observability and security generate insights necessary to improve the application experience. WhyLabs takes this a step further. Using WhyLabs: Optimize, you can create datasets for re-training predictive models or for setting up Reinforcement Learning from Human Feedback (RLHF) in generative models.
By using insights from WhyLabs: Observe and WhyLabs: Secure, our customers enable the following optimizations:
Optimized performance of WhyLabs: Secure capabilities:
- Customization of the rule sets in the guardrail with examples and proprietary Red Team datasets
- Continuous tuning of the rule sets using datasets created from blocking and flagging events
Optimized performance of predictive models with:
- Automated retraining when model accuracy drops below a desired level, using customizable webhooks (see the sketch after this list)
- Automated circuit breakers in batch pipelines when data quality issues are detected, preventing the release of bad model predictions
- Improvements to input features via analysis of feature quality volatility and information entropy of features
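As a sketch of the webhook-driven retraining pattern mentioned above: a small service receives an alert notification and triggers a retraining job. The payload fields (`severity`, `metric`, `datasetId`) and the `launch_retraining_job` helper are hypothetical placeholders, not a documented WhyLabs schema.

```python
# Hypothetical retraining webhook receiver; the payload fields and the
# launch_retraining_job() helper are placeholders, not a documented schema.
from flask import Flask, request, jsonify

app = Flask(__name__)

def launch_retraining_job(dataset_id: str) -> None:
    # Placeholder: kick off your training pipeline (Airflow DAG,
    # SageMaker job, etc.) for the affected model.
    print(f"Retraining triggered for {dataset_id}")

@app.route("/whylabs-alert", methods=["POST"])
def handle_alert():
    alert = request.get_json(force=True)
    # Only retrain on high-severity accuracy anomalies.
    if alert.get("severity") == "high" and "accuracy" in alert.get("metric", ""):
        launch_retraining_job(alert.get("datasetId", "unknown"))
    return jsonify({"status": "received"}), 200
```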
Optimized customer experience in GenAI applications with:
- Analytics of RAG traces captured from application interactions that were marked with a thumbs down or otherwise received negative feedback from the end user (see the sketch after this list).
- Analytics of the traces that flagged model refusals to ensure the best customer experience.
- Customization of system messages and callbacks to steer the GenAI application toward appropriate behaviors.
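Once traces are exported for offline analysis, triaging negative-feedback interactions can start with a simple filter. The sketch below assumes a hypothetical JSONL trace export whose column names (`user_feedback`, `retrieved_context`, and so on) are illustrative, not the platform's actual trace schema.

```python
import pandas as pd

# Hypothetical trace export; column names are illustrative only.
traces = pd.read_json("trace_export.jsonl", lines=True)

# Surface RAG interactions the end user rated thumbs-down so the team can
# review the retrieved context and response side by side.
negative = traces[traces["user_feedback"] == "thumbs_down"]
print(negative[["trace_id", "prompt", "retrieved_context", "response"]].head())
```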
By systematically tracking model performance, data quality, traces, security events, and metadata, you are building feedback loops for continuous improvement across your AI model lifecycle. Get started with WhyLabs: Observe or WhyLabs: Secure to begin collecting the most valuable data about your AI applications.
How do I get started with WhyLabs AI Control Center?
The WhyLabs AI Control Center is available now; you can get started by signing up for a free account. All of its capabilities can be experienced today in our Demo Organization on the platform.
With our free plan, you can start using all WhyLabs: Observe capabilities on your own predictive and generative applications today.
For WhyLabs: Secure, you can request a 14-day free trial here.
If you are looking for specific WhyLabs: Optimize capabilities, schedule a call with our team. We would love to show you what the future looks like!
As always, we are a team of AI practitioners developing for AI practitioners. We love feedback and feature requests. Jump on our community Slack channel to chat with our engineering team directly.