WhyLabs Integrates with NVIDIA NIM to Deliver GenAI Applications with Security and Control
Jun 2, 2024
With WhyLabs and NVIDIA NIM, enterprises can accelerate GenAI application deployment and help ensure the safety of end-user experiences
WhyLabs has been on a mission to empower enterprises with tools that ensure safe and responsible AI adoption. With its integration with NVIDIA NIM inference microservices, WhyLabs is helping make responsible AI adoption more accessible. Customers can now maintain better security and control of GenAI applications with self-hosted deployment of the most powerful foundation models.
NVIDIA NIM microservices are designed to simplify the deployment of GenAI models across the cloud, data center, and workstations by allowing users to:
- Streamline self-hosted deployments of the latest AI models, helping ensure the security of GenAI applications and data
- Accelerate development with pre-built, cloud-native microservices that deliver optimized inference on NVIDIA accelerated infrastructure
- Enable deployment of low-latency, high-throughput AI inference that scales with cloud
- Empower enterprise developers with industry-standard APIs and tools
WhyLabs integrates with NVIDIA NIM to further simplify GenAI operations. With a few lines of code, a user can add tracing and performance telemetry to any model available on NIM. With WhyLabs, all the latest models available through NVIDIA NIM can be controlled in real time to:
- Detect and prevent adversarial usage, like prompt injections and jailbreaks
- Steer the application behavior away from undesirable topics and harmful content
- Flag hallucinations and out-of-context responses to prevent over-reliance
Integration
It only takes a few lines of code to set up WhyLabs with NVIDIA NIM. Let’s explore an example of using the llama3-70b-instruct model deployed through a NIM microservice and instrumented by WhyLabs.
Start by importing openLLMtelemetry, an open standard for LLM telemetry maintained by WhyLabs. Instrumentation requires little effort because openLLMtelemetry adds tracing code for common LLM libraries and provides decorators to trace your own custom functions. LLM application developers can start tracing applications and enable guardrails with WhyLabs Secure with minimal code changes.
import openllmtelemetry
openllmtelemetry.instrument()
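To make the idea of a tracing decorator concrete, here is a minimal sketch using only the Python standard library. It is not the openLLMtelemetry API: the names `trace`, `TRACE_LOG`, and `build_prompt` are hypothetical and exist only to illustrate how a decorator can record the name, duration, and outcome of a call without changing the function body.

```python
import functools
import time

# Hypothetical in-memory sink for illustration only; a real tracer
# would export spans to a telemetry backend instead.
TRACE_LOG = []


def trace(func):
    """Record the name, duration, and outcome of every call to func."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return func(*args, **kwargs)
        except Exception:
            status = "error"
            raise  # re-raise so the caller still sees the failure
        finally:
            TRACE_LOG.append(
                {
                    "name": func.__name__,
                    "duration_s": time.perf_counter() - start,
                    "status": status,
                }
            )

    return wrapper


@trace
def build_prompt(topic):
    # A custom application function that we want to show up in traces.
    return f"Show me how to build a {topic}"
```

Calling `build_prompt("bike")` returns the prompt as usual and appends one record to `TRACE_LOG`. The decorator style keeps instrumentation out of the application logic, which is why it suits the "minimal code changes" goal described above.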
Now, let’s call the llama3-70b-instruct NIM hosted on the NVIDIA API Catalog (or replace the endpoint with your local NIM endpoint). The code should look familiar, because NIM provides standard OpenAI-compatible endpoints, so you can use the standard OpenAI client.
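from openai import OpenAI
# NIM exposes OpenAI-compatible endpoints, so the standard client works unchanged.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="GET_YOUR_API_KEY_FROM_NVIDIA",  # placeholder: replace with your own key
)
# Send a single-turn chat request to the hosted llama3-70b-instruct NIM.
completion = client.chat.completions.create(
    model="meta/llama3-70b-instruct",
    messages=[{"role": "user", "content": "Show me how to build a bike"}],
    temperature=0.5,
    top_p=1,
    max_tokens=1024,
)
print(completion.choices[0].message.content)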
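Switching between the hosted API Catalog endpoint and a local NIM endpoint only changes the client's `base_url` and `api_key`. The sketch below shows one way to read both from the environment instead of hardcoding them. The variable names `NIM_BASE_URL` and `NVIDIA_API_KEY`, the helper `nim_client_kwargs`, and the idea that a local endpoint may not check the key are all assumptions made for illustration, not part of the WhyLabs or NVIDIA documentation.

```python
import os

# Hosted endpoint used in the example above.
HOSTED_NIM_URL = "https://integrate.api.nvidia.com/v1"


def nim_client_kwargs(env=None):
    """Build keyword arguments for OpenAI(...) from environment variables.

    NIM_BASE_URL and NVIDIA_API_KEY are hypothetical variable names.
    """
    env = os.environ if env is None else env
    base_url = env.get("NIM_BASE_URL", HOSTED_NIM_URL)
    api_key = env.get("NVIDIA_API_KEY")

    # The hosted endpoint needs a real key, so fail early if it is missing.
    if base_url == HOSTED_NIM_URL and not api_key:
        raise ValueError("NVIDIA_API_KEY must be set for the hosted endpoint")

    # Assumption: a self-hosted endpoint may not validate the key, but the
    # OpenAI client still expects a non-empty string.
    return {"base_url": base_url, "api_key": api_key or "not-used"}
```

With this helper, the client from the example could be created as `OpenAI(**nim_client_kwargs())`, and pointing `NIM_BASE_URL` at your own deployment would redirect traffic to it without touching the rest of the code.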
Tracing and observability for any NVIDIA NIM GenAI application
NVIDIA NIM simplifies the process of setting up an enterprise-grade GenAI application. Out of the box, your NIM application is optimized on NVIDIA accelerated infrastructure, providing the low-latency and high-throughput inference that is required for production environments.
In production, every application requires observability and control. WhyLabs switches on observability for GenAI applications with specialized metrics and the tracing necessary to understand and control model behavior. By instrumenting the model with openLLMtelemetry, you get visibility into trace-level interactions, enriched with security and performance metrics.
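The post does not list the specific metrics WhyLabs attaches to a trace. As a rough illustration of the kind of performance metric a trace might carry, the sketch below derives token throughput from a call's latency and token counts. The class `CallMetrics` and its fields are hypothetical and are not part of WhyLabs or openLLMtelemetry.

```python
from dataclasses import dataclass


@dataclass
class CallMetrics:
    """Hypothetical per-call performance summary for one LLM request."""

    latency_s: float        # wall-clock time of the request, in seconds
    prompt_tokens: int      # tokens sent to the model
    completion_tokens: int  # tokens generated by the model

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

    @property
    def tokens_per_second(self) -> float:
        # Generation throughput; guard against a zero or negative latency.
        if self.latency_s <= 0:
            return 0.0
        return self.completion_tokens / self.latency_s
```

With the OpenAI client, the token counts could be taken from `completion.usage.prompt_tokens` and `completion.usage.completion_tokens`, and the latency measured around the `create` call with `time.perf_counter()`.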
Get started today
WhyLabs with NVIDIA NIM is available today for enterprise customers. To get started, try the available models in the demo environment or dive right in by setting up your first NIM microservice from the NVIDIA API Catalog.
Get a free WhyLabs account to get started building models with observability and security using WhyLabs with NVIDIA NIM. Check out the setup documentation, and if you have questions, reach out to the team on our open Slack community.