Enabling LLM Guardrails
LLM guardrails are an important feature in the xRx system that help ensure the safety and reliability of AI interactions. This guide will walk you through the process of enabling and configuring LLM guardrails for your xRx deployment.
Overview
The xRx system uses a Guardrails Proxy to implement safety measures for input and output processing. This proxy acts as an intermediary between the orchestrator and the reasoning agent, applying various checks such as toxic language detection.
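Conceptually, the proxy sits in front of the reasoning agent and only forwards requests that pass its checks. The following is a minimal sketch of that pattern, not the actual xrx-guardrails source; `passes_checks` is a placeholder standing in for the real validators:

```python
# Conceptual sketch of the proxy pattern (illustrative, not the actual
# xrx-guardrails code): a FastAPI app that checks the incoming body
# before forwarding it to the reasoning agent.
import os

import httpx
from fastapi import FastAPI, HTTPException, Request, Response

app = FastAPI()
AGENT_URL = (
    f"http://{os.environ.get('GUARDRAILS_AGENT_HOST', 'xrx-reasoning')}"
    f":{os.environ.get('GUARDRAILS_AGENT_PORT', '8003')}"
)

def passes_checks(text: str) -> bool:
    """Placeholder for the real validators (e.g. toxic language detection)."""
    return "unsafe" not in text.lower()  # stand-in logic only

@app.post("/{path:path}")
async def forward(path: str, request: Request) -> Response:
    body = await request.body()
    if not passes_checks(body.decode("utf-8", errors="replace")):
        raise HTTPException(status_code=400, detail="Blocked by guardrails")
    # Request passed the checks: forward it to the real reasoning agent.
    async with httpx.AsyncClient() as client:
        upstream = await client.post(f"{AGENT_URL}/{path}", content=body)
    return Response(content=upstream.content, status_code=upstream.status_code)
```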
Setup
To enable LLM guardrails, you need to configure the orchestrator to use the Guardrails Proxy instead of connecting directly to the reasoning agent. Here's how to do it:
1. Open the `.env` file in the root directory of your project.
   - If you don't have a `.env` file, create one in the root directory of your project with the necessary environment variables, including the guardrails variables shown in the next step.
2. Update or add the following environment variables:

   ```bash
   AGENT_HOST="xrx-guardrails"
   AGENT_PORT="8094"
   GUARDRAILS_AGENT_HOST="xrx-reasoning"
   GUARDRAILS_AGENT_PORT="8003"
   ```

   This configuration tells the orchestrator to connect to the Guardrails Proxy (`xrx-guardrails`) instead of the reasoning agent directly. It also specifies the host and port of the actual reasoning agent (`xrx-reasoning`) that the Guardrails Proxy will communicate with; a short sketch after these steps shows how the variables affect routing.

3. Save the `.env` file.
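To see how these variables change routing, here is a minimal, hypothetical sketch of how an orchestrator could resolve its agent URL from them; the actual xRx orchestrator code may differ:

```python
import os

def agent_base_url() -> str:
    # With the settings above, this resolves to the Guardrails Proxy
    # (http://xrx-guardrails:8094); without them, requests would go
    # straight to the reasoning agent.
    host = os.environ.get("AGENT_HOST", "xrx-reasoning")
    port = os.environ.get("AGENT_PORT", "8003")
    return f"http://{host}:{port}"

print(agent_base_url())  # http://xrx-guardrails:8094 with the .env above
```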
To apply these changes, rebuild and restart your containers:

```bash
docker compose up --build
```
Advanced Configuration
Monitoring specific fields
The Guardrails Proxy can be configured to monitor specific endpoints and apply different checks. To customize the guardrails:
1. Open the `guardrails-proxy/app/main.py` file.
2. Locate the `MONITOR_CONFIG` dictionary. This defines which endpoints are monitored and how to extract the content to validate.
3. Modify or add entries to suit your API structure. For example:
```python
MONITOR_CONFIG = {
    "my-custom-endpoint": {
        "input_keys": {
            "messages": {
                "type": "list",
                "list_index": -1,
                "items": {
                    "type": "object",
                    "field": "content",
                },
            },
        },
    },
}
```

This configuration tells the proxy to monitor the input body of the "my-custom-endpoint" endpoint, extract the last message from the `messages` list, and apply guardrails to that message's "content" field.
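For clarity, here is a simplified, illustrative sketch of how such an entry could be used to pull out the text to validate; the real extraction logic lives in `guardrails-proxy/app/main.py` and may differ. It assumes the `MONITOR_CONFIG` from the example above is in scope:

```python
# Simplified illustration only; the actual proxy logic in main.py may differ.
def extract_for_validation(body: dict, input_keys: dict) -> list[str]:
    """Walk an input_keys spec and collect the strings to run guardrails on."""
    extracted = []
    for key, spec in input_keys.items():
        value = body.get(key)
        if value is None:
            continue
        if spec["type"] == "list":
            item = value[spec["list_index"]]  # e.g. -1 -> the last message
            if spec["items"]["type"] == "object":
                extracted.append(item[spec["items"]["field"]])  # e.g. "content"
    return extracted

# Example request body for the endpoint configured above:
body = {"messages": [{"role": "user", "content": "Hello there"}]}
print(extract_for_validation(body, MONITOR_CONFIG["my-custom-endpoint"]["input_keys"]))
# -> ['Hello there']
```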
Customized guardrails
For more advanced guardrails or custom checks, you can:
- Edit the `guardrails-proxy/app/guard.py` file to add or modify guardrails. You can chain multiple guardrails using the `Guard().use_many()` method, as in the sketch after this list.
- Implement additional guardrail models from the Guardrails AI Hub.
- Create custom guardrail classes in the `guard.py` file.
- Adjust the proxy's behavior in `main.py` to handle different types of requests or responses.
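As a starting point, here is a sketch of what a `guard.py` customization could look like. It assumes the Guardrails AI package plus two hub validators are installed (for example via `guardrails hub install hub://guardrails/toxic_language` and `hub://guardrails/detect_pii`); the validator choice and parameters are illustrative, not the xRx defaults:

```python
# Illustrative guard configuration; validators and parameters are examples,
# not the xRx defaults. Requires the matching hub installs.
from guardrails import Guard
from guardrails.hub import DetectPII, ToxicLanguage

guard = Guard().use_many(
    ToxicLanguage(threshold=0.5, validation_method="sentence", on_fail="exception"),
    DetectPII(pii_entities=["EMAIL_ADDRESS", "PHONE_NUMBER"], on_fail="fix"),
)

# Raises (or fixes, per on_fail) when a check is violated.
guard.validate("Some user input to screen before it reaches the agent")
```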
By following these steps, you can effectively enable and customize LLM guardrails in your xRx system, enhancing the safety and reliability of your AI interactions.