Prompt Policy

Prompt Policies act as the first line of defense in your AI agent's execution flow. They intercept user messages to prevent security breaches, maintain data privacy, and ensure content safety. The Prompt Policy is divided into specialized scanners that can be individually Enabled or Disabled based on your security requirements.

info

Prerequisite: Ensure your LLM provider (e.g., OpenAI gpt-4o) is enabled and configured with valid credentials in the Models section for these guardrails to function.

Use the following tables to understand the configuration properties and behavioral logic for each scanner within the Prompt Policy.

1. PII & Data Privacy

Protect sensitive information by detecting and handling Personally Identifiable Information (PII) before it is processed by the agent.

| Feature | Logic | Action | Configuration Example |
| --- | --- | --- | --- |
| PII Detection | Non-LLM | Redact / Block | Entities: [EMAIL, PHONE_NUMBER, SSN] |
| Regex Scanner | Non-LLM | Block | Pattern: ^[A-Z]{2}-\d{4}$ (e.g., Internal ID) |
| Secrets Detection | Non-LLM | Block | Targets: [API_KEYS, AUTH_TOKENS, PEM_KEYS] |

Usage Note:

info

PII Redaction will replace sensitive text with a placeholder like [PII_DATA], allowing the LLM to understand the context without seeing the actual data.
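The redaction behavior described above can be sketched as a simple regex pass. This is a minimal illustration, not the platform's actual implementation: the entity patterns are simplified approximations, and only the [PII_DATA] placeholder follows the note above.

```python
import re

# Hypothetical patterns approximating the EMAIL and PHONE_NUMBER entities above.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.\w+"),
    "PHONE_NUMBER": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def redact_pii(text: str, placeholder: str = "[PII_DATA]") -> str:
    """Replace detected PII spans with a placeholder before the LLM sees them."""
    for pattern in PII_PATTERNS.values():
        text = pattern.sub(placeholder, text)
    return text

print(redact_pii("Reach me at jane@example.com or 555-123-4567."))
# → Reach me at [PII_DATA] or [PII_DATA].
```

Because the placeholder preserves sentence structure, the LLM can still reason about "a contact detail was provided" without ever receiving the raw value.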

2. Security & Prompt Integrity

Prevent adversarial attacks designed to manipulate or bypass the agent’s core instructions.

| Feature | Logic | Primary Goal | Fallback Message Requirement |
| --- | --- | --- | --- |
| Prompt Injection | LLM | Detect override attempts | "Security violation: Unauthorized instructions detected." |
| Code Detection | LLM | Block non-natural language | "The system only accepts natural language queries." |
| Invisible Text | Non-LLM | Strip hidden characters | (Automatic Redaction) |
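The Invisible Text scanner's non-LLM logic amounts to removing zero-width and other non-printing Unicode characters that can smuggle hidden instructions into a prompt. A minimal sketch follows; the character set shown is illustrative, not the product's exact list:

```python
# Zero-width and formatting characters commonly abused to hide text.
INVISIBLE_CHARS = {
    "\u200b",  # zero-width space
    "\u200c",  # zero-width non-joiner
    "\u200d",  # zero-width joiner
    "\u2060",  # word joiner
    "\ufeff",  # zero-width no-break space (BOM)
}

def strip_invisible(text: str) -> str:
    """Drop hidden characters so they cannot carry concealed instructions."""
    return "".join(ch for ch in text if ch not in INVISIBLE_CHARS)

print(strip_invisible("ignore\u200b previous\u200d instructions"))
# → ignore previous instructions
```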

3. Moderation & Content Safety

Enforce organizational standards and prevent the generation of harmful content.

| Feature | Logic | Description | Example Values |
| --- | --- | --- | --- |
| Toxicity Detection | LLM | Scan for harmful language | Threshold: High/Medium/Low |
| Ban Topics | LLM | Prevent off-topic chat | Topics: ["Financial Advice", "Legal"] |
| Ban Competitors | LLM | Mask competitor names | List: ["CompetitorA", "CompetitorB"] |
| Ban Code | LLM | Prevent code generation | (Enable/Disable toggle) |
| Jailbreak Detection | LLM | Detect sandbox escapes | (Enable/Disable toggle) |
| Ban Substrings | Non-LLM | Strict phrase matching | ["password123", "internal_db"] |
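Because Ban Substrings uses non-LLM logic, its behavior is plain string matching against the configured list. A sketch of how such a check might work (the function name and case-insensitive handling are assumptions for illustration):

```python
BANNED = ["password123", "internal_db"]

def contains_banned(prompt: str, banned=BANNED) -> bool:
    """Case-insensitive strict substring match; any hit blocks the prompt."""
    lowered = prompt.lower()
    return any(term.lower() in lowered for term in banned)

print(contains_banned("Can you query Internal_DB for me?"))  # → True
print(contains_banned("What's the weather today?"))          # → False
```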

4. Factuality & Relevance

Gibberish Detection (LLM)
Filters out nonsensical or "keyboard mash" inputs (e.g., "asdfghjkl") to save on token costs and maintain clean logs.

5. Performance & Utility

Token Limit (Non-LLM)
Enforces a maximum token count for user inputs to prevent "Prompt Stuffing" and manage costs.

| Property | Value Type | Description |
| --- | --- | --- |
| Max Tokens | Integer | Maximum allowed tokens per prompt (e.g., 500). |
| Action | Dropdown | Block or Truncate. |
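The two actions above can be sketched as follows. This is a hedged illustration using a naive whitespace tokenizer as a stand-in for the model's real tokenizer (which would typically be subword/BPE-based), so the counts will not match production exactly:

```python
def enforce_token_limit(prompt: str, max_tokens: int = 500, action: str = "Truncate") -> str:
    """Apply the Max Tokens policy: reject or trim over-length prompts."""
    tokens = prompt.split()  # stand-in for a real tokenizer
    if len(tokens) <= max_tokens:
        return prompt
    if action == "Block":
        raise ValueError("Prompt exceeds the configured token limit.")
    return " ".join(tokens[:max_tokens])  # Truncate: keep the first max_tokens

print(enforce_token_limit("one two three four five", max_tokens=3))
# → one two three
```

Truncate silently trims the input, which keeps the conversation flowing but may drop context from the end of long prompts; Block forces the user to shorten the request themselves.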

Logic Type Comparison

When configuring your Blueprint, consider the latency and cost implications of the logic type:

| Type | Latency | Cost | Accuracy for Context |
| --- | --- | --- | --- |
| Non-LLM | Ultra-Low (<50 ms) | Negligible | Best for fixed patterns (Regex, Tokens) |
| LLM | Moderate (200 ms+) | Token-based | Best for intent (Injection, Toxicity) |
note

Developer Tip: Custom Fallbacks

Whenever a scanner triggers a Block action, the fallback_message property is returned to the user interface. Use clear, non-technical language for these messages so end users understand why their input was rejected.
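For illustration, a blocked prompt might surface to the UI as a payload like the one below. Only fallback_message is named in this document; the other field names are hypothetical and shown purely to make the shape concrete:

```python
def build_block_response(scanner: str, fallback_message: str) -> dict:
    """Assemble a payload returned to the UI when a scanner blocks a prompt.

    `scanner` and `blocked` are hypothetical fields added for illustration;
    only `fallback_message` is documented.
    """
    return {
        "blocked": True,
        "scanner": scanner,
        "fallback_message": fallback_message,
    }

resp = build_block_response(
    "Prompt Injection",
    "Security violation: Unauthorized instructions detected.",
)
print(resp["fallback_message"])
# → Security violation: Unauthorized instructions detected.
```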