Evaluation and Safety
TraceVerde includes 6 built-in evaluation detectors for monitoring LLM input/output quality and safety. All features are opt-in and can be enabled individually.
PII Detection
Detect personally identifiable information using Microsoft Presidio.
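import genai_otel

genai_otel.instrument(
    enable_pii_detection=True,
    pii_mode="redact",      # one of "detect", "redact", or "block"
    pii_threshold=0.5,      # detection threshold
    pii_gdpr_mode=True,     # GDPR compliance mode
    pii_hipaa_mode=True,    # HIPAA compliance mode
    pii_pci_dss_mode=True,  # PCI-DSS compliance mode
)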
Detects 15+ entity types: email, phone, SSN, credit cards, IP addresses, and more.
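The three `pii_mode` values can be pictured with a small standalone sketch. This is not TraceVerde's or Presidio's implementation: it assumes that "detect" reports matches and leaves the text alone, "redact" masks them, and "block" rejects the text. The email regex, the `<EMAIL>` placeholder, `apply_pii_mode`, and `PIIBlockedError` are all hypothetical names chosen for illustration.

```python
import re

# Hypothetical stand-in for a real recognizer: one simple email pattern.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class PIIBlockedError(Exception):
    """Raised in "block" mode when PII is found (hypothetical name)."""


def apply_pii_mode(text: str, mode: str = "detect") -> tuple[str, list[str]]:
    """Return (possibly modified text, list of matched entities)."""
    found = EMAIL_RE.findall(text)
    if mode == "detect":
        # Report matches, leave the text untouched.
        return text, found
    if mode == "redact":
        # Replace every match with a placeholder.
        return EMAIL_RE.sub("<EMAIL>", text), found
    if mode == "block":
        # Refuse to pass the text along if anything matched.
        if found:
            raise PIIBlockedError(f"{len(found)} PII entities found")
        return text, found
    raise ValueError(f"unknown mode: {mode!r}")


redacted, entities = apply_pii_mode("Contact bob@example.com today", "redact")
print(redacted)   # Contact <EMAIL> today
print(entities)   # ['bob@example.com']
```

A real deployment would rely on the library's own detector and its 15+ entity types rather than a single regex.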
Examples:
- Basic detect mode
- Redaction mode
- Blocking mode
- GDPR compliance
- HIPAA compliance
- PCI-DSS compliance
- Combined compliance
- Custom threshold
- Response detection
- Env var config
- PII with Anthropic
- PII with Ollama
- PII with HuggingFace
Toxicity Detection
Detect harmful content using Perspective API (cloud) or Detoxify (local).
Categories: toxicity, severe_toxicity, identity_attack, insult, profanity, threat.
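As a rough illustration of how per-category scores might be turned into flags, the sketch below compares each category score to a threshold. The score dictionary shape, the `flagged_categories` helper, and the 0.5 default are assumptions made for this example; they are not taken from TraceVerde, Perspective API, or Detoxify.

```python
# Category names as listed above.
CATEGORIES = (
    "toxicity",
    "severe_toxicity",
    "identity_attack",
    "insult",
    "profanity",
    "threat",
)


def flagged_categories(scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return the categories whose score meets or exceeds the threshold.

    Missing categories are treated as a score of 0.0 (an assumption).
    """
    return [name for name in CATEGORIES if scores.get(name, 0.0) >= threshold]


# Hypothetical scores for a single message.
example_scores = {"toxicity": 0.91, "insult": 0.72, "threat": 0.03}
print(flagged_categories(example_scores))        # ['toxicity', 'insult']
print(flagged_categories(example_scores, 0.8))   # ['toxicity']
```

Raising the threshold trades recall for fewer false positives, which is the usual reason to tune it per application.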
Examples:
- Basic Detoxify (local)
- Perspective API (cloud)
- Blocking mode
- Category detection
- Custom threshold
- Combined with PII
- Response detection
- Toxicity with Anthropic
- Toxicity with Ollama
- Toxicity with HuggingFace
Bias Detection
Identify demographic biases in prompts and responses.
8 bias types: gender, race, ethnicity, religion, age, disability, sexual_orientation, political.
Examples:
- Basic detection
- Blocking mode
- Category-specific
- Custom threshold
- Hiring compliance
- Response detection
- Multiple evaluations
- Bias with Ollama
- Bias with HuggingFace
Prompt Injection Detection
Protect against prompt manipulation attacks.
6 injection types: instruction_override, role_playing, jailbreak, context_switching, system_extraction, encoding_obfuscation.
Examples:
- Basic detection
- Blocking mode
- System override
- Jailbreak techniques
- Payload injection
- Prompt injection with HuggingFace
Restricted Topics Detection
Monitor and block sensitive topics.
import genai_otel

genai_otel.instrument(
    enable_restricted_topics=True,
    restricted_topics=["medical_advice", "legal_advice", "financial_advice"],
    restricted_topics_threshold=0.5,
)
Hallucination Detection
Track factual accuracy and groundedness.
Includes factual claim extraction, hedge word detection, citation tracking, and context contradiction detection.
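Of the signals listed above, hedge word detection is the simplest to picture. The sketch below is a standalone approximation, not the detector's actual logic: the term list and the `find_hedges` helper are hypothetical, and a real detector would likely use a larger vocabulary and combine this signal with the others.

```python
import re

# Hypothetical hedge vocabulary, chosen only for illustration.
HEDGE_TERMS = (
    "might",
    "may",
    "possibly",
    "probably",
    "perhaps",
    "i think",
    "i believe",
    "it seems",
)


def find_hedges(text: str) -> list[str]:
    """Return the hedge terms that occur in the text as whole words or phrases."""
    lowered = text.lower()
    hits = []
    for term in HEDGE_TERMS:
        # \b keeps "may" from matching inside "maybe" or "mayor".
        if re.search(rf"\b{re.escape(term)}\b", lowered):
            hits.append(term)
    return hits


print(find_hedges("I think the bridge was probably built in 1932."))
# ['probably', 'i think']
```

Results follow the order of `HEDGE_TERMS`, not the order of appearance in the text.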
Examples:
- Basic detection
- Citation verification
- RAG faithfulness
- Hallucination with Ollama
- Hallucination with Mistral
- Hallucination with HuggingFace
Enable All Evaluations
import genai_otel

genai_otel.instrument(
    enable_pii_detection=True,
    enable_toxicity_detection=True,
    enable_bias_detection=True,
    enable_prompt_injection_detection=True,
    enable_restricted_topics=True,
    enable_hallucination_detection=True,
)
See the comprehensive evaluation example and the Ollama multiple evaluations example.
Metrics
Each detector records metrics for monitoring:
- genai.evaluation.<detector>.detections - Detection events
- genai.evaluation.<detector>.blocked - Blocked requests
- genai.evaluation.<detector>.score - Score distribution (histogram)
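When building dashboards or alerts, it can help to expand the `<detector>` placeholder into concrete metric names. The sketch below does only that string expansion. The detector identifiers in `DETECTORS` are guesses based on the section names on this page and may not match the names the library actually emits; check the exported metrics before relying on them.

```python
# Assumed detector identifiers; verify against the metrics actually exported.
DETECTORS = (
    "pii",
    "toxicity",
    "bias",
    "prompt_injection",
    "restricted_topics",
    "hallucination",
)

# Metric suffixes as documented above.
SUFFIXES = ("detections", "blocked", "score")


def metric_names(detector: str) -> list[str]:
    """Expand the documented naming pattern for one detector."""
    return [f"genai.evaluation.{detector}.{suffix}" for suffix in SUFFIXES]


for name in metric_names("pii"):
    print(name)
# genai.evaluation.pii.detections
# genai.evaluation.pii.blocked
# genai.evaluation.pii.score
```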
See the README for the complete list of span attributes.
All Examples
Browse all evaluation examples in the examples/ directory.