WhyLabs Team

Jun 2, 2024

WhyLabs Integrates with NVIDIA NIM to Deliver GenAI Applications with Security and Control

AI Observability
Generative AI
Integrations
LLM Security
LLMs
Partnerships

With WhyLabs and NVIDIA NIM, enterprises can accelerate GenAI application deployment and help ensure the safety of end-user experiences

WhyLabs has been on a mission to empower enterprises with tools that ensure safe and responsible AI adoption. With its integration with NVIDIA NIM inference microservices, WhyLabs is helping make responsible AI adoption more accessible. Customers can now maintain better security and control of GenAI applications with self-hosted deployment of the most powerful foundation models.

NVIDIA NIM microservices are designed to simplify the deployment of GenAI models across the cloud, data center, and workstations by allowing users to:

Streamline self-hosted deployments of the latest AI models, helping ensure the security of GenAI applications and data
Accelerate development with pre-built, cloud-native microservices that deliver optimized inference on NVIDIA accelerated infrastructure
Enable deployment of low-latency, high-throughput AI inference that scales with the cloud
Empower enterprise developers with industry-standard APIs and tools

WhyLabs integrates with NVIDIA NIM to further simplify GenAI operations. With a few lines of code, a user can add tracing and performance telemetry to any model available on NIM. With WhyLabs, all of the latest models available through NVIDIA NIM can be controlled in real time to:

Detect and prevent adversarial usage, like prompt injections and jailbreaks
Steer the application behavior away from undesirable topics and harmful content
Flag hallucinations and out-of-context responses to prevent over-reliance
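
To make the idea of a real-time control concrete, here is a deliberately simplified sketch of a check that runs before a prompt reaches the model. It is not how WhyLabs guardrails work internally: the patterns, the `check_prompt` and `guarded_call` helpers, and the block message are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical patterns, chosen only for illustration; production guardrails
# rely on far richer signals than a handful of regular expressions.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your )?system prompt", re.IGNORECASE),
]


def check_prompt(prompt: str) -> dict:
    """Return a verdict: whether to block, and which patterns matched."""
    matches = [p.pattern for p in INJECTION_PATTERNS if p.search(prompt)]
    return {"blocked": bool(matches), "matches": matches}


def guarded_call(prompt: str, model_fn) -> str:
    """Call model_fn only when the prompt passes the check."""
    verdict = check_prompt(prompt)
    if verdict["blocked"]:
        return "Request blocked by guardrail."
    return model_fn(prompt)
```

Here `model_fn` stands in for any function that sends a prompt to a model and returns its reply, so the check sits in front of the model call rather than inside it.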

Integration

It only takes a few lines of code to set up WhyLabs with NVIDIA NIM. Let’s explore an example of using the llama3-70b-instruct model deployed through a NIM microservice and instrumented by WhyLabs.

Start by importing openLLMtelemetry, an open standard for LLM telemetry maintained by WhyLabs. Instrumentation takes little effort because openLLMtelemetry adds tracing code for common LLM libraries and provides decorators to trace your own custom functions. LLM application developers can start tracing applications and enable guardrails with WhyLabs Secure with minimal code changes.

import openllmtelemetry 
openllmtelemetry.instrument()
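
If you are curious what a tracing decorator does in general, the sketch below shows the pattern using only the Python standard library. It is not the openLLMtelemetry API: the `trace` decorator, the in-memory `TRACE_LOG`, and the `build_prompt` function are hypothetical names used to illustrate the idea of recording one span-like record per function call.

```python
import functools
import time

# In-memory stand-in for a telemetry backend (hypothetical).
TRACE_LOG = []


def trace(name):
    """Decorator factory that records one span-like dict per call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            status = "ok"
            try:
                return func(*args, **kwargs)
            except Exception:
                status = "error"
                raise
            finally:
                # Runs on success and on failure, so every call is recorded.
                TRACE_LOG.append({
                    "name": name,
                    "status": status,
                    "duration_s": time.perf_counter() - start,
                })
        return wrapper
    return decorator


@trace("build_prompt")
def build_prompt(question: str) -> str:
    """Example custom function whose calls we want to trace."""
    return f"Answer concisely: {question}"
```

After calling `build_prompt("...")`, `TRACE_LOG` holds one entry with the span name, status, and duration. A real telemetry library would export such records to a backend instead of keeping them in a list.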

Now, let’s call the llama3-70b-instruct NIM hosted on the NVIDIA API Catalog (or replace the endpoint with your local NIM endpoint). The code should look very familiar, as NIM provides standard OpenAI-compatible endpoints, so you can use the standard OpenAI client.

from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="GET_YOUR_API_KEY_FROM_NVIDIA",
)

completion = client.chat.completions.create(
    model="meta/llama3-70b-instruct",
    messages=[{"role": "user", "content": "Show me how to build a bike"}],
    temperature=0.5,
    top_p=1,
    max_tokens=1024,
)

print(completion.choices[0].message.content)
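
To switch between the hosted endpoint and a local NIM without editing code, one option is to read the settings from environment variables. The sketch below is one way to do that. The variable names `NIM_BASE_URL` and `NVIDIA_API_KEY` and the helper `nim_client_settings` are assumptions made for this example, not names defined by NVIDIA or WhyLabs.

```python
import os

# Hosted endpoint used in the example above.
HOSTED_NIM_URL = "https://integrate.api.nvidia.com/v1"


def nim_client_settings(env=None) -> dict:
    """Build keyword arguments for the OpenAI client from environment values.

    NIM_BASE_URL and NVIDIA_API_KEY are hypothetical variable names.
    """
    env = os.environ if env is None else env
    return {
        "base_url": env.get("NIM_BASE_URL", HOSTED_NIM_URL),
        "api_key": env.get("NVIDIA_API_KEY", ""),
    }
```

The client can then be created with `OpenAI(**nim_client_settings())`. For a self-hosted NIM, set `NIM_BASE_URL` to the address of your deployment (for example `http://localhost:8000/v1`, assuming that is where your container listens). This also keeps the API key out of the source code.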

Tracing and observability for any NVIDIA NIM GenAI application

NVIDIA NIM simplifies the process of setting up an enterprise-grade GenAI application. Out of the box, your NIM application is optimized on NVIDIA accelerated infrastructure, providing the low-latency and high-throughput inference that is required for production environments.

In production, every application requires observability and control. WhyLabs switches on observability for GenAI applications with specialized metrics and the tracing necessary to understand and control model behavior. By instrumenting the model with openLLMtelemetry, you get visibility into trace-level interactions, enriched with security and performance metrics.

Tracing chat interactions with WhyLabs. Chat powered by Llama3-70b-instruct built with NVIDIA NIM.

WhyLabs Guardrails blocked a prompt injection

Get started today

WhyLabs with NVIDIA NIM is available today for enterprise customers. To get started, try the available models in the demo environment or dive right in by setting up your first NIM microservice from the NVIDIA API Catalog.

Get a free WhyLabs account to get started building models with observability and security using WhyLabs with NVIDIA NIM. Check out the setup documentation, and if you have questions, reach out to the team on our open Slack community.
