WhyLabs Weekly: Monitor LangChain LLM Applications
Jul 7, 2023
Monitor LangChain applications, UDFs in whylogs, best practices for monitoring LLMs, and more!
A lot happens every week in the WhyLabs Robust & Responsible AI (R2AI) community! This weekly update serves as a recap so you don’t have to miss a thing!
Start learning about MLOps and ML Monitoring:
- 📅 Join the next event: Intro to LLM Monitoring in Production
- 💻 Check out our open source projects whylogs & LangKit!
- 💬 Join 1,123 Robust & Responsible AI Slack members
- 🤝 Request a demo to learn how ML monitoring can benefit you
💡 MLOps Tip of the week:
Add ML monitoring to your LangChain LLM applications with the WhyLabs callback!
Once LangChain and LangKit are installed in your Python environment and your WhyLabs API keys are set, adding ML monitoring to a large language model application built with LangChain takes just a few extra lines of code.
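Before running the snippet below, the WhyLabs callback typically picks up your credentials from environment variables. A minimal sketch (the placeholder values are hypothetical; substitute your real API key, org ID, and model/dataset ID from the WhyLabs dashboard):

```python
import os

# Hypothetical placeholder values - replace with your real WhyLabs credentials.
os.environ["WHYLABS_API_KEY"] = "replace-with-your-api-key"
os.environ["WHYLABS_DEFAULT_ORG_ID"] = "org-0"
os.environ["WHYLABS_DEFAULT_DATASET_ID"] = "model-1"
```

Setting these before initializing the callback lets `WhyLabsCallbackHandler.from_params()` resolve your session without passing credentials in code.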
from langchain.callbacks import WhyLabsCallbackHandler
from langchain.llms import OpenAI

# Initialize the WhyLabs callback & a GPT model with LangChain
whylabs = WhyLabsCallbackHandler.from_params()
llm = OpenAI(temperature=0, callbacks=[whylabs])

# Generate responses to negative prompts from the LLM
result = llm.generate(
    [
        "I hate nature, it's ugly.",
        "This product is bad. I hate it.",
        "Chatting with you has been a terrible experience!",
        "I'm terrible at saving money, can you give me advice?",
    ]
)
print(result)

# Close the WhyLabs session, which also pushes the language metrics to WhyLabs
whylabs.close()
Prompt and response metrics are logged to WhyLabs for EDA and anomaly detection. In this example, the LLM insights show that a generated response contained a phone number. The GPT model was never given a phone number to provide, so it made one up entirely!
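The phone-number insight above comes from pattern checks over the response text. A minimal sketch of that idea, using a simple regex rather than LangKit's actual pattern metrics (the function name and pattern here are illustrative):

```python
import re

# Rough US-style phone number pattern, e.g. "(555) 123-4567" or "555-123-4567".
# Illustrative only - LangKit's built-in pattern metrics are more thorough.
PHONE_PATTERN = re.compile(r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}")

def contains_phone_number(response: str) -> bool:
    """Flag responses containing something shaped like a phone number."""
    return bool(PHONE_PATTERN.search(response))

print(contains_phone_number("Call us at (555) 123-4567 for savings advice."))  # True
print(contains_phone_number("Try setting aside 20% of each paycheck."))        # False
```

Flagging patterns like this on every response is how you catch an LLM inventing contact details it was never given.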
With LangKit, you’ll be able to extract and monitor relevant signals from LangChain applications, such as text quality, text relevance, sentiment, toxicity, and security and privacy patterns like PII.
See a full LangChain + LangKit example on GitHub.
📝 Latest blog posts:
BYOF: Bring Your Own Functions — Announcing UDFs in whylogs
Custom metrics can provide valuable insights and enable specific monitoring. However, defining and maintaining scalable and standardized custom metrics across an organization can be daunting, especially if teams are implementing those metrics in different ways… Read more on WhyLabs.AI
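The core idea behind UDFs is a shared registry of named functions that derive new metric columns from existing data, so every team computes custom metrics the same way. A minimal sketch of that pattern in plain Python (the names here are hypothetical, not the whylogs API):

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping metric names to column-level functions.
UDF_REGISTRY: Dict[str, Callable[[List[str]], list]] = {}

def register_udf(name: str):
    """Decorator that registers a function as a named custom metric."""
    def wrap(fn):
        UDF_REGISTRY[name] = fn
        return fn
    return wrap

@register_udf("char_count")
def char_count(values: List[str]) -> List[int]:
    return [len(v) for v in values]

def apply_udfs(column: List[str]) -> dict:
    """Apply every registered UDF to a column of values."""
    return {name: fn(column) for name, fn in UDF_REGISTRY.items()}

print(apply_udfs(["hello", "whylogs"]))  # {'char_count': [5, 7]}
```

Because the registry is defined once and imported everywhere, teams stop re-implementing the same metric in slightly different ways.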
Best Practices for Monitoring Large Language Models
Large Language Models (LLMs) are powerful tools for natural language processing (NLP), but they can also present significant challenges when it comes to monitoring their performance and ensuring their safety. With the growing adoption of LLMs to automate and streamline NLP operations, it’s crucial to establish effective monitoring practices that can detect and prevent issues…Read more on WhyLabs.AI
🎥 Event recordings
Monitoring Large Language Models in Production using OpenAI & WhyLabs
In this workshop Sage Elliott and Andre Elizondo show how to monitor Large Language Models (LLMs) in production using WhyLabs and the LangKit library.
Solving LLM Data Hurdles: Strategies for Success — Yujian Tang, Zilliz
In this Robust & Responsible AI interview Yujian Tang joins us to explain how vector databases can help solve data hurdles for LLMs.
📅 Upcoming R2AI & WhyLabs Events:
Want to come on a Robust & Responsible AI stream to talk about what you’re building? Reach out to me on LinkedIn!
- 7/19 Intro to LLM Monitoring in Production with LangKit & WhyLabs
- 7/27 MLOps Happy Hour [In-Person Seattle] @ Optimism!
- 8/2 Intro to ML Monitoring: Data Drift, Quality, Bias and Explainability
- 8/9 Combining the Power of LLMs with Computer Vision — Jacob Marks, Voxel51
💻 WhyLabs open source updates:
📊 whylogs v1.2.1 has been released!
whylogs is the open standard for data logging & AI telemetry. This week’s update includes:
- pyspark batched column profiling
- Make ResolverSpec list optional in DeclarativeSchema ctor
- Fix schema copy bugs
- Dataset-ish UDFs by column type
See full whylogs release notes on Github.
💬 LangKit v0.0.4 has been released!
LangKit is an open-source text metrics toolkit for monitoring language models.
- Add documentation for topics module
- Add LangChain example notebook
- Clean up and update dependencies
See full LangKit release notes on Github.
🤝 Stay connected with the WhyLabs Community:
Join the thousands of machine learning engineers and data scientists already using WhyLabs to solve some of the most challenging ML monitoring cases!
- 1,122+ Robust & Responsible AI Slack members
- 2,300+ whylogs GitHub stars
- 271+ LangKit Github stars
- 976+ Robust & Responsible AI Meetup Members
- 9,090+ WhyLabs LinkedIn followers
- 855+ WhyLabs Twitter followers
Request a demo to learn how ML monitoring can benefit your company.