Ensure safe and responsible usage of Large Language Models
Security is a key blocker for launching LLM-based applications to production!
Bad actors can gain access to Personally Identifiable Information (PII) or can rewire model behavior with prompt injections. The WhyLabs LLM Security offering enables teams to protect LLM applications against malicious prompts and to guardrail responses.
LLMs are vulnerable to targeted attacks designed to leak confidential data. Evaluating prompts for these attacks and blocking responses containing PII is key for production LLMs.
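As a rough illustration of the idea, a response screen for PII can be sketched with simple pattern matching. This is a minimal, hypothetical example only — the patterns and function names are assumptions for illustration, not the WhyLabs API, and production detectors are far more robust.

```python
import re

# Hypothetical sketch: a minimal regex-based PII screen for LLM responses.
# Real deployments use much more robust detectors.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def contains_pii(text: str) -> list[str]:
    """Return the names of PII patterns found in the text."""
    return [name for name, pattern in PII_PATTERNS.items()
            if pattern.search(text)]

def guard_response(response: str) -> str:
    """Block the response if any PII pattern matches."""
    found = contains_pii(response)
    if found:
        return f"[response blocked: possible PII detected ({', '.join(found)})]"
    return response
```

In a real pipeline, a check like this would run on every response before it reaches the user, with blocked responses logged for review.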
Malicious prompts can be designed to confuse the system into producing harmful outputs. Monitoring for such prompts and for changes in LLM behavior is crucial to ensure a consistent user experience.
LLMs can produce misinformation or inappropriate content due to "hallucinations." Without monitoring, this can lead to customer loss, legal issues, and reputational damage.
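One simple signal for flagging possible hallucinations is the relevance of a response to its prompt. The sketch below scores relevance with plain token overlap (Jaccard similarity); this is an illustrative assumption, not the WhyLabs implementation — real systems typically use embedding similarity.

```python
# Hypothetical sketch: crude prompt/response relevance via token overlap.
# A low score can flag a response for hallucination review.
def relevance(prompt: str, response: str) -> float:
    p = set(prompt.lower().split())
    r = set(response.lower().split())
    if not p or not r:
        return 0.0
    return len(p & r) / len(p | r)
```

Responses scoring below a tuned threshold would be routed to a guardrail action or human review rather than being served directly.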
Best practices for enabling LLM security
OWASP Top 10 for LLM Applications
LLMs are susceptible to a range of vulnerabilities that are unlike those in traditional software. Guidance around these vulnerabilities is rapidly evolving, and WhyLabs provides an extensible platform that enables teams to adopt the best practices available today and keep them up to date. WhyLabs has implemented telemetry to capture the OWASP Top 10 for LLM Applications (v0.5) and will implement the new guidelines as they become available.
WhyLabs uses the telemetry to enable inline guardrails, continuous evaluations, and observability.
Overview of OWASP Top 10 LLM risks
LLM01: Prompt Injections: Prompt injection vulnerabilities in LLMs involve crafty inputs leading to undetected manipulations. The impact ranges from data exposure to unauthorized actions, serving the attacker's goals. WhyLabs detects prompts that present a prompt injection risk.
LLM02: Insecure Output Handling: These occur when plugins or apps accept LLM output without scrutiny, potentially leading to XSS, CSRF, SSRF, privilege escalation, remote code execution, and can enable agent hijacking attacks. WhyLabs enables monitoring of responses to identify malicious output.
LLM03: Training Data Poisoning: LLMs learn from diverse text but risk training data poisoning, leading to user misinformation. Overreliance on AI is a concern. Key data sources include Common Crawl, WebText, OpenWebText, and books.
LLM04: Denial of Service: An attacker interacts with an LLM in a way that is particularly resource-consuming, causing quality of service to degrade for them and other users, or for high resource costs to be incurred. WhyLabs monitors tokens, latency, and inferred cost to alert teams about suspicious spikes in usage.
LLM05: Supply Chain: LLM supply chains risk integrity due to vulnerabilities leading to biases, security breaches, or system failures. Issues arise from pre-trained models, crowdsourced data, and plugin extensions. WhyLabs makes it easy to evaluate models across quality, toxicity, and relevance to help identify model vulnerabilities proactively.
LLM06: Permission Issues: Lack of authorization tracking between plugins can enable indirect prompt injection or malicious plugin usage, leading to privilege escalation, confidentiality loss, and potential remote code execution.
LLM08: Excessive Agency: When LLMs interface with other systems, unrestricted agency may lead to undesirable operations and actions. Like web apps, LLMs should not self-police; controls must be embedded in the APIs they call.
LLM09: Overreliance: Overreliance on LLMs can lead to misinformation or inappropriate content due to "hallucinations." Without proper oversight, this can result in legal issues and reputational damage. WhyLabs makes it possible to identify responses with "hallucinations" through relevance scoring.
LLM10: Insecure Plugins: Plugins connecting LLMs to external resources can be exploited if they accept free-form text inputs, enabling malicious requests that could lead to undesired behaviors or remote code execution.
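To make the prompt injection risk (LLM01) concrete, here is a deliberately naive heuristic that scores a prompt by the presence of common injection phrases. All names and markers are hypothetical illustrations; production detectors combine heuristics like this with trained classifiers.

```python
# Hypothetical sketch: keyword heuristic for prompt injection risk (LLM01).
# Production systems pair heuristics with trained classifiers.
INJECTION_MARKERS = (
    "ignore previous instructions",
    "ignore the above",
    "disregard your system prompt",
    "you are now",
)

def injection_risk(prompt: str) -> float:
    """Return a crude 0-1 risk score based on marker hits."""
    lowered = prompt.lower()
    hits = sum(marker in lowered for marker in INJECTION_MARKERS)
    return min(1.0, hits / 2)
```

A guardrail would compare this score against a configurable threshold and block or flag the prompt before it reaches the model.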
Protection for open source and proprietary models
Protect the LLM user experience against the key LLM vulnerability types. Deploy inline guardrails with customizable metrics, thresholds, and actions. The solution applies to internal and external LLM applications of any scale. Whether you are integrating with a public API or running a proprietary model, use the WhyLabs proxy to ensure guardrails and logging for each prompt/response pair. WhyLabs integrates with LangChain, HuggingFace, MosaicML, OpenAI, Falcon, Anthropic, and more.
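The proxy pattern described above can be sketched as a wrapper that runs configurable checks on each prompt/response pair and records telemetry such as latency. Everything here is a hypothetical, minimal sketch — the function names and logging shape are assumptions, not the WhyLabs proxy itself.

```python
import time
from typing import Callable

# Hypothetical sketch: a proxy-style wrapper that applies guardrail checks
# to each prompt/response pair and logs telemetry for observability.
def guarded_call(llm: Callable[[str], str],
                 prompt: str,
                 prompt_checks: list[Callable[[str], bool]],
                 response_checks: list[Callable[[str], bool]],
                 log: list[dict]) -> str:
    # Block risky prompts before they reach the model.
    if any(check(prompt) for check in prompt_checks):
        log.append({"prompt": prompt, "blocked": "prompt"})
        return "[prompt blocked by guardrail]"
    start = time.monotonic()
    response = llm(prompt)
    latency = time.monotonic() - start
    # Block unsafe responses before they reach the user.
    if any(check(response) for check in response_checks):
        log.append({"prompt": prompt, "blocked": "response",
                    "latency_s": latency})
        return "[response blocked by guardrail]"
    log.append({"prompt": prompt, "response": response,
                "blocked": None, "latency_s": latency})
    return response
```

Because the checks are plain callables, the same wrapper works for a public API client or a self-hosted model, and the log feeds downstream monitoring.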