Policy Automation
Overview
Traditional security postures are reactive: a threat is detected, a human writes a rule to block it, and the rule is deployed manually. Behavry's Red Team Policy Automation Loop closes this gap by converting detection findings into candidate OPA Rego policies automatically. The system observes attack patterns in real time, generates targeted policy rules, scores their confidence, and either presents them for human review or auto-activates them when confidence is high enough.
Detection Event (event bus)
    |
    v
PolicyGenerator.on_event()
    |
    |-- compute pattern signature (deduplication)
    |-- update frequency tracker (24h sliding window)
    |-- compute confidence score
    |-- render Rego template
    |
    v
PolicyCandidate (database)
    |
    |-- confidence >= threshold?
    |       |
    |       yes --> auto-activate (create Policy + push to OPA)
    |       no  --> status = "proposed" (await human review)
    |
    v
OPA enforcement (immediate)
The result is a feedback loop where observed threats are automatically translated into enforceable policy -- reducing the window between detection and prevention from hours to seconds.
How It Works
The PolicyGenerator is a singleton that subscribes to the event bus as a wildcard listener. When a detection event arrives, it checks whether the event type is in its handled set. If so, it processes the event through a pipeline of signature computation, confidence scoring, Rego generation, deduplication, and optional auto-activation.
Handled Event Types
The generator responds to six event types, each mapped to a specific Rego template:
| Event Type | Threat Signal | Generated Template |
|---|---|---|
| INBOUND_INJECTION_DETECTED | Injection pattern found in tool call response | injection_block |
| INBOUND_INJECTION_BLOCKED | Injection pattern blocked by existing policy | injection_block |
| INJECTION_CONDITIONING_SUSPECTED | Gradual conditioning sequence detected | conditioning_block |
| BEHAVIORAL_DRIFT_DETECTED | Agent behavior diverging from baseline | drift_escalate |
| REQUESTER_SESSION_CYCLING | Rapid session cycling by same requester | requester_validation |
| BEHAVIOR_REVERSAL | Agent reversing previously blocked actions in new session | behavior_reversal |
Events not in this set are silently ignored. The generator processes each event asynchronously and catches all exceptions to avoid disrupting the event bus.
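The dispatch step can be sketched as follows. This is a minimal illustration, not Behavry's implementation: the event dictionary shape (a `type` key) and the `process` callback are assumptions, and only the six event type names and their template mapping come from the table above.

```python
import asyncio
import logging

logger = logging.getLogger("policy_generator")

# Event type -> template name, copied from the table above.
HANDLED_EVENTS = {
    "INBOUND_INJECTION_DETECTED": "injection_block",
    "INBOUND_INJECTION_BLOCKED": "injection_block",
    "INJECTION_CONDITIONING_SUSPECTED": "conditioning_block",
    "BEHAVIORAL_DRIFT_DETECTED": "drift_escalate",
    "REQUESTER_SESSION_CYCLING": "requester_validation",
    "BEHAVIOR_REVERSAL": "behavior_reversal",
}


async def on_event(event: dict, process) -> bool:
    """Wildcard listener: returns True only if the event was processed."""
    template = HANDLED_EVENTS.get(event.get("type"))
    if template is None:
        return False  # not in the handled set: silently ignored
    try:
        # `process` is a hypothetical coroutine that runs the pipeline.
        await process(event, template)
        return True
    except Exception:
        # Log and swallow so a generator failure never disrupts the bus.
        logger.exception("policy generation failed for %s", event.get("type"))
        return False
```

Returning a boolean instead of raising keeps the listener safe to register as a wildcard subscriber: unrelated events and internal failures both leave the bus unaffected.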
Pattern Fingerprinting
Each detection event is reduced to a pattern signature -- a normalized string that strips variable content (specific payloads, agent IDs, timestamps) and retains structural characteristics. This signature is the key for deduplication and frequency tracking.
Signature Format by Event Type
| Event Type | Signature Format | Example |
|---|---|---|
| INBOUND_INJECTION_DETECTED | injection:{pattern_class}:{tool_name} | injection:prompt_injection:read_file |
| INBOUND_INJECTION_BLOCKED | injection:{pattern_class}:{tool_name} | injection:system_override:execute_cmd |
| INJECTION_CONDITIONING_SUSPECTED | conditioning:repeated_injection_findings | (static) |
| BEHAVIORAL_DRIFT_DETECTED | drift:risk_tier_escalation | (static) |
| REQUESTER_SESSION_CYCLING | requester_mismatch:{tool_name} | requester_mismatch:write_file |
| BEHAVIOR_REVERSAL | reversal:{action_class}:{tool_name} | reversal:delete:delete_file |
When a candidate with the same (tenant_id, signature) already exists in proposed or auto_activated status, no duplicate is created. Instead, the existing candidate's confidence score is updated to the maximum of its current value and the newly computed value. This means repeated occurrences of the same pattern raise confidence over time without creating noise in the review queue.
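The signature formats and the deduplication rule can be sketched together. The sketch assumes that `pattern_class`, `tool_name`, and `action_class` are keys of the event data, falls back to a hypothetical `"unknown"` value when one is missing, and uses a plain dictionary in place of the candidate table.

```python
OPEN_STATUSES = {"proposed", "auto_activated"}


def pattern_signature(event_type: str, data: dict) -> str:
    """Build the normalized signature for a handled event type."""
    tool = data.get("tool_name", "unknown")  # fallback value is an assumption
    if event_type in ("INBOUND_INJECTION_DETECTED", "INBOUND_INJECTION_BLOCKED"):
        return f"injection:{data.get('pattern_class', 'unknown')}:{tool}"
    if event_type == "INJECTION_CONDITIONING_SUSPECTED":
        return "conditioning:repeated_injection_findings"
    if event_type == "BEHAVIORAL_DRIFT_DETECTED":
        return "drift:risk_tier_escalation"
    if event_type == "REQUESTER_SESSION_CYCLING":
        return f"requester_mismatch:{tool}"
    if event_type == "BEHAVIOR_REVERSAL":
        return f"reversal:{data.get('action_class', 'unknown')}:{tool}"
    raise ValueError(f"unhandled event type: {event_type}")


def upsert_candidate(store: dict, tenant_id: str, signature: str, confidence: float) -> dict:
    """Create a candidate, or raise the confidence of an existing open one."""
    key = (tenant_id, signature)
    existing = store.get(key)
    if existing is not None and existing["status"] in OPEN_STATUSES:
        # Keep the maximum so repeated sightings never lower confidence.
        existing["confidence"] = max(existing["confidence"], confidence)
        return existing
    # Simplification: a real store would keep resolved candidates as separate rows.
    candidate = {"status": "proposed", "confidence": confidence}
    store[key] = candidate
    return candidate
```

Because payloads, agent IDs, and timestamps never enter the signature, two attacks that differ only in those details collapse onto the same candidate.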
Confidence Scoring
Each candidate receives a confidence score between 0.0 and 1.0, computed from three independent factors:
Factor 1: Pattern Frequency
How many times this pattern signature has been observed in the trailing 24-hour window.
| Occurrences | Score |
|---|---|
| 1 | 0.3 |
| 3 to 9 | 0.6 |
| 10 or more | 0.9 |
The sliding window is maintained in memory. Occurrences older than 24 hours are pruned before each computation.
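A minimal sketch of such an in-memory window, using a deque of timestamps per signature. The class and method names are hypothetical, and the score for exactly two occurrences is an assumption because the table does not list it.

```python
import time
from collections import defaultdict, deque
from typing import Optional

WINDOW_SECONDS = 24 * 60 * 60


class FrequencyTracker:
    """In-memory count of signature occurrences in a trailing 24-hour window."""

    def __init__(self) -> None:
        self._seen = defaultdict(deque)  # signature -> ascending timestamps

    def record(self, signature: str, now: Optional[float] = None) -> int:
        """Record one occurrence and return the count inside the window."""
        now = time.time() if now is None else now
        stamps = self._seen[signature]
        stamps.append(now)
        # Prune occurrences older than 24 hours before counting.
        while stamps and stamps[0] < now - WINDOW_SECONDS:
            stamps.popleft()
        return len(stamps)


def frequency_score(occurrences: int) -> float:
    """Map an occurrence count to the frequency factor."""
    if occurrences >= 10:
        return 0.9
    if occurrences >= 3:
        return 0.6
    return 0.3  # assumption: two occurrences score the same as one
```

Since the window lives in process memory, counts in a design like this would reset on restart and would not be shared between generator instances.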
Factor 2: Severity
The severity of the source detection event, taken from the severity, alert_severity, or min_severity field of the event data, or inferred from inbound findings.
| Severity | Score |
|---|---|
| Critical | +0.3 |
| High | +0.2 |
| Medium | +0.1 |
| Low | +0.0 |
Factor 3: Corroboration
Whether multiple distinct detection sources have contributed to the same pattern signature. For example, INBOUND_INJECTION_DETECTED and INBOUND_INJECTION_BLOCKED share a signature format, so if both produce injection:prompt_injection:read_file, two distinct sources have reported the same pattern.
| Condition | Score |
|---|---|
| 2 or more distinct source event types | +0.2 |
| Single source only | +0.0 |
Final Score
confidence = min(1.0, frequency_score + severity_score + corroboration_score)
Examples:
- Single medium-severity injection detection: 0.3 + 0.1 + 0.0 = 0.4
- 10 high-severity injection detections: 0.9 + 0.2 + 0.0 = 1.1, capped to 1.0
- 3 critical-severity detections from 2 sources: 0.6 + 0.3 + 0.2 = 1.1, capped to 1.0
- 3 medium-severity detections, single source: 0.6 + 0.1 + 0.0 = 0.7
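The four examples can be reproduced with a short function. This is a sketch under stated assumptions: the function name and arguments are hypothetical, an unrecognized severity contributes 0.0, and the sum is rounded to two decimals to avoid floating-point noise.

```python
FREQUENCY_TIERS = [(10, 0.9), (3, 0.6), (1, 0.3)]  # (minimum occurrences, score)
SEVERITY_BONUS = {"critical": 0.3, "high": 0.2, "medium": 0.1, "low": 0.0}


def confidence(occurrences: int, severity: str, source_event_types: set) -> float:
    """Sum the three factors and cap the result at 1.0."""
    frequency = next(
        (score for minimum, score in FREQUENCY_TIERS if occurrences >= minimum), 0.0
    )
    # Assumption: an unrecognized severity adds nothing.
    severity_bonus = SEVERITY_BONUS.get(severity.lower(), 0.0)
    corroboration = 0.2 if len(source_event_types) >= 2 else 0.0
    return min(1.0, round(frequency + severity_bonus + corroboration, 2))


one = {"INBOUND_INJECTION_DETECTED"}
two = {"INBOUND_INJECTION_DETECTED", "INBOUND_INJECTION_BLOCKED"}
print(confidence(1, "medium", one))    # 0.4
print(confidence(10, "high", one))     # 1.0 (1.1 before the cap)
print(confidence(3, "critical", two))  # 1.0 (1.1 before the cap)
print(confidence(3, "medium", one))    # 0.7
```

With the default threshold of 0.85, a single-source pattern at three to nine occurrences tops out at 0.6 + 0.3 = 0.9 only when the severity is critical; at high severity it reaches 0.8 and stays in the review queue.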
Rego Templates
The generator ships with seven Rego templates. Each produces a complete, deployable OPA policy module under the behavry.authz.autogen package.
injection_block
Blocks a specific injection pattern class at or above a minimum severity:
package behavry.authz.autogen

# Auto-generated: block prompt_injection injection pattern
# Source event: evt-abc123 | Generated: 2026-03-17T14:00:00Z

default allow := false

# Rego compares strings lexicographically, so severities are ranked numerically.
severity_rank := {"low": 1, "medium": 2, "high": 3, "critical": 4}

deny[msg] {
    finding := input.inbound_findings[_]
    finding.pattern_class == "prompt_injection"
    severity_rank[finding.severity] >= severity_rank["high"]
    msg := "Auto-generated: block prompt_injection (source: evt-abc123)"
}
conditioning_block
Escalates when three or more medium-severity findings appear in a single request -- indicating a gradual conditioning sequence:
escalate[msg] {
    count([f | f := input.inbound_findings[_]; f.severity >= "medium"]) >= 3
    msg := "Auto-generated: conditioning sequence detected (source: evt-def456)"
}
drift_escalate
Escalates actions from agents whose risk tier has been elevated due to behavioral drift:
escalate[msg] {
    input.agent.risk_tier == "high"
    msg := "Auto-generated: behavioral drift escalation (source: evt-ghi789)"
}
requester_validation
Denies a specific tool call when the requester identity is not verified, targeting session-cycling patterns:
deny[msg] {
    input.tool_name == "write_file"
    not input.requester.verified
    msg := "Auto-generated: require verified requester for write_file (source: evt-jkl012)"
}
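Rendering a template amounts to substituting a few event-derived values into fixed Rego text. The sketch below uses Python's `string.Template` with text adapted from the requester_validation example above; the function name, the header comment layout, and the character check on substituted values are assumptions, not a description of the generator's internals.

```python
from datetime import datetime, timezone
from string import Template

# Template text adapted from the requester_validation example above.
REQUESTER_VALIDATION = Template(
    "package behavry.authz.autogen\n"
    "\n"
    "# Source event: $event_id | Generated: $generated_at\n"
    "deny[msg] {\n"
    '    input.tool_name == "$tool_name"\n'
    "    not input.requester.verified\n"
    '    msg := "Auto-generated: require verified requester for $tool_name (source: $event_id)"\n'
    "}\n"
)


def render_requester_validation(tool_name: str, event_id: str, now: datetime) -> str:
    """Fill the template, refusing values that could break out of a Rego string."""
    for value in (tool_name, event_id):
        # Allow only letters, digits, underscores, and hyphens.
        if not value.replace("_", "").replace("-", "").isalnum():
            raise ValueError(f"unsafe template value: {value!r}")
    return REQUESTER_VALIDATION.substitute(
        tool_name=tool_name,
        event_id=event_id,
        generated_at=now.strftime("%Y-%m-%dT%H:%M:%SZ"),  # assumes `now` is UTC
    )


print(render_requester_validation(
    "write_file", "evt-jkl012", datetime(2026, 3, 17, 14, 0, tzinfo=timezone.utc)
))
```

Validating substituted values matters because tool names originate in detection events: a value containing a quote or brace could otherwise alter the generated rule.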
behavior_reversal
Escalates calls to a specific tool that has been associated with trust-reset behavior reversals:
escalate[msg] {
    input.request.tool_name == "delete_file"
    msg := "Auto-generated: behavior reversal guard for delete_file (source: evt-mno345)"
}
resource_restrict and rate_ceiling
Two additional templates (resource_restrict and rate_ceiling) are available for future use and can be invoked manually when editing a candidate's Rego before approval.
Auto-Activation
When a tenant enables automatic policy activation, high-confidence candidates bypass the review queue entirely.
Configuration
Auto-activation is controlled by two fields on the tenant's configuration:
| Field | Type | Default | Description |
|---|---|---|---|
| auto_activate_enabled | boolean | false | Whether auto-activation is active |
| auto_activate_threshold | float | 0.85 | Minimum confidence score required |
These can be set via the Admin API or the Settings page in the dashboard.
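The effect of the two fields on a new candidate can be sketched as a small decision function. The dataclass and function names are hypothetical; the field names, defaults, and status values come from this page.

```python
from dataclasses import dataclass


@dataclass
class TenantAutoActivation:
    auto_activate_enabled: bool = False    # default from the table above
    auto_activate_threshold: float = 0.85  # default from the table above


def next_status(config: TenantAutoActivation, confidence: float) -> str:
    """Return the status a newly scored candidate would receive."""
    if config.auto_activate_enabled and confidence >= config.auto_activate_threshold:
        return "auto_activated"
    return "proposed"
```

With the defaults, every candidate stays in the review queue regardless of score, because `auto_activate_enabled` is false until a tenant opts in.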
Activation Flow
When a candidate's confidence meets or exceeds the threshold:
- A Policy record is created in the database with the candidate's Rego content.
- The policy is activated -- its Rego is pushed to OPA via PUT /v1/policies/{id}.
- The candidate's status changes from proposed to auto_activated.
- A POLICY_AUTO_ACTIVATED event is published to the event bus.
From this point, OPA enforces the new rule immediately. The policy appears in the Policies dashboard as an auto-generated entry with a link back to the originating candidate.
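The push to OPA uses OPA's Policy API, which accepts a raw Rego module in the body of a `PUT /v1/policies/{id}` request with a `text/plain` content type. The sketch below only builds the request so it can be inspected without a running OPA; the helper name, the OPA address, and the policy ID are assumptions.

```python
import urllib.request


def build_opa_policy_request(opa_url: str, policy_id: str, rego: str) -> urllib.request.Request:
    """Build (but do not send) the PUT that installs a Rego module in OPA."""
    return urllib.request.Request(
        url=f"{opa_url.rstrip('/')}/v1/policies/{policy_id}",
        data=rego.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain"},
    )


# Hypothetical address and policy ID, for illustration only.
request = build_opa_policy_request(
    "http://localhost:8181", "autogen-example", "package behavry.authz.autogen\n"
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. OPA compiles the module on receipt, so a module with a syntax error is rejected before it can affect enforcement.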
Safety Considerations
Auto-activation should be enabled with care. Recommended practices:
- Set the threshold high (0.85 or above) to ensure only well-corroborated patterns are activated automatically.
- Monitor the POLICY_AUTO_ACTIVATED event stream for unexpected activations.
- Review auto-activated candidates periodically in the Policy Suggestions dashboard.
- Use the test dry-run endpoint to validate generated Rego before lowering the threshold.
Manual Review
Candidates that do not meet the auto-activation threshold remain in proposed status and appear in the Policy Suggestions dashboard for human review.
Review Actions
| Action | Description |
|---|---|
| Inspect | View the generated Rego, source event details, confidence breakdown, and pattern signature |
| Edit Rego | Modify the generated Rego inline before approving (useful for tightening or relaxing the rule) |
| Test | Push the Rego to OPA temporarily and evaluate it against sample input to verify behavior |
| Approve | Create a Policy record, activate it, and push to OPA. The candidate status changes to approved |
| Reject | Close the candidate with reviewer notes. The candidate status changes to rejected |
Approve Flow
When a candidate is approved:
- A Policy record is created with the candidate's (possibly edited) Rego content.
- The policy is activated and synced to OPA.
- The candidate is linked to the new policy via activated_policy_id.
- A POLICY_CANDIDATE_APPROVED event is published.
- An audit entry records the reviewer's identity and notes.
Reject Flow
When a candidate is rejected:
- The candidate's status is set to rejected.
- The reviewer and review notes are recorded.
- A POLICY_CANDIDATE_REJECTED event is published.
Rejected candidates remain in the database for audit purposes but are excluded from the active review queue.
API Endpoints
All endpoints require admin authentication.
List Candidates
GET /api/v1/policy-candidates?status=proposed&min_confidence=0.5&limit=20
Query parameters:
| Parameter | Type | Description |
|---|---|---|
| status | string | Filter by status: proposed, approved, rejected, auto_activated |
| source_event_type | string | Filter by source event type |
| min_confidence | float | Minimum confidence threshold |
| max_confidence | float | Maximum confidence threshold |
| offset | int | Pagination offset (default 0) |
| limit | int | Page size (default 50) |
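A client might assemble the query string as sketched below. The helper name and the base URL are hypothetical; the path, parameter names, and status values are those listed above.

```python
from urllib.parse import urlencode

ALLOWED_STATUSES = {"proposed", "approved", "rejected", "auto_activated"}


def list_candidates_url(base_url: str, **filters) -> str:
    """Build the list URL, dropping unset filters and validating status."""
    status = filters.get("status")
    if status is not None and status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {status}")
    params = {key: value for key, value in filters.items() if value is not None}
    query = urlencode(params)
    path = f"{base_url.rstrip('/')}/api/v1/policy-candidates"
    return f"{path}?{query}" if query else path


# Hypothetical host, for illustration only.
print(list_candidates_url(
    "https://behavry.example", status="proposed", min_confidence=0.5, limit=20
))
```

Omitting `offset` and `limit` leaves the documented defaults (0 and 50) in effect.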
Get Statistics
GET /api/v1/policy-candidates/stats?window_days=30
Returns aggregated counts by status within the trailing N days:
{
  "proposed": 12,
  "approved": 8,
  "rejected": 3,
  "auto_activated": 5,
  "window_days": 30
}
Get Candidate
GET /api/v1/policy-candidates/{id}
Returns the full candidate record including Rego content, confidence score, and review metadata.
Approve
POST /api/v1/policy-candidates/{id}/approve
Content-Type: application/json

{
  "reviewed_by": "admin@example.com",
  "review_notes": "Verified pattern matches known injection vector. Approved for production."
}
Reject
POST /api/v1/policy-candidates/{id}/reject
Content-Type: application/json

{
  "reviewed_by": "admin@example.com",
  "review_notes": "False positive -- pattern triggered by legitimate API documentation content."
}
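Approve and reject take the same body, so one helper can build either request. This is a sketch only: the helper name is hypothetical, and sending admin credentials as a bearer token is an assumption, since this page does not specify the authentication scheme.

```python
import json
import urllib.request


def build_review_request(base_url: str, candidate_id: str, action: str,
                         reviewed_by: str, review_notes: str,
                         token: str) -> urllib.request.Request:
    """Build (but do not send) an approve or reject request."""
    if action not in ("approve", "reject"):
        raise ValueError("action must be 'approve' or 'reject'")
    body = json.dumps({"reviewed_by": reviewed_by, "review_notes": review_notes})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/policy-candidates/{candidate_id}/{action}",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: admin authentication is a bearer token.
            "Authorization": f"Bearer {token}",
        },
    )
```

Recording a meaningful `review_notes` value is worthwhile in both cases, because the notes are stored with the candidate and published with the rejection event.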
Edit Rego
PATCH /api/v1/policy-candidates/{id}/rego
Content-Type: application/json

{
  "rego_rule": "package behavry.authz.autogen\n\ndefault allow := false\n\ndeny[msg] {\n input.inbound_findings[_].pattern_class == \"prompt_injection\"\n input.inbound_findings[_].severity == \"critical\"\n msg := \"Custom: block critical prompt injections only\"\n}\n"
}
Editing is only allowed on candidates with status proposed.
Test Dry-Run
POST /api/v1/policy-candidates/{id}/test
Content-Type: application/json

{
  "input": {
    "agent": {"id": "test-agent", "roles": ["worker"], "risk_tier": "medium"},
    "request": {"tool_name": "read_file", "action": "read", "resource": "/tmp/test.txt"},
    "inbound_findings": [
      {"pattern_class": "prompt_injection", "severity": "high", "content_snippet": "ignore previous instructions"}
    ]
  }
}
The test endpoint pushes the candidate's Rego to OPA under a temporary path, evaluates the provided input, and returns the OPA result. The temporary policy is not cleaned up automatically -- it is overwritten on the next test.
Response:
{
  "candidate_id": "candidate-uuid",
  "opa_result": {
    "result": {
      "deny": ["Auto-generated: block prompt_injection (source: evt-abc123)"]
    }
  },
  "rego_package": "behavry.authz.autogen"
}
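A reviewer's script might reduce the dry-run response to a simple verdict. The sketch assumes that escalation rules surface under an `escalate` key in the same way `deny` does in the response above; the helper name is hypothetical.

```python
def summarize_dry_run(response: dict) -> dict:
    """Collect deny and escalate messages from a dry-run response."""
    result = response.get("opa_result", {}).get("result", {})
    deny = list(result.get("deny", []))
    # Assumption: escalate rules appear under an "escalate" key.
    escalate = list(result.get("escalate", []))
    return {"deny": deny, "escalate": escalate, "triggered": bool(deny or escalate)}


sample = {
    "candidate_id": "candidate-uuid",
    "opa_result": {
        "result": {
            "deny": ["Auto-generated: block prompt_injection (source: evt-abc123)"]
        }
    },
    "rego_package": "behavry.authz.autogen",
}
print(summarize_dry_run(sample)["triggered"])  # True
```

Running the same candidate against a benign input and checking that `triggered` is false is a quick way to catch rules that are broader than intended.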
Dashboard
The Policy Suggestions page is accessible from the main navigation (marked with an amber badge showing the count of proposed candidates).
Stats Strip
Four metric cards at the top show the count of proposed, approved, rejected, and auto-activated candidates within the trailing 30 days.
Filter Bar
Filter candidates by status, source event type, and confidence range.
Candidate Cards
Each candidate is displayed as a card showing:
- Pattern signature and source event type
- Confidence score with a color-coded badge (green for high, amber for medium, red for low)
- Generated Rego in a syntax-highlighted code block
- Inline Rego editor (for proposed candidates)
- Test panel with sample input and OPA result display
- Approve and Reject action buttons with notes input
The page is organized into three sections:
- Awaiting Review -- proposed candidates, sorted by confidence descending
- Auto-Activated -- candidates that were automatically activated
- Resolved -- approved and rejected candidates
Event Types
The policy automation system publishes four event types to the event bus and SSE stream:
| Event Type | When | Key Data Fields |
|---|---|---|
| POLICY_CANDIDATE_PROPOSED | New candidate created from detection event | candidate_id, pattern_signature, confidence, source_event_type |
| POLICY_AUTO_ACTIVATED | Candidate auto-activated (confidence met threshold) | candidate_id, policy_id, rego_package, confidence, pattern_signature |
| POLICY_CANDIDATE_APPROVED | Admin approved a candidate | candidate_id, policy_id, reviewed_by, pattern_signature, confidence |
| POLICY_CANDIDATE_REJECTED | Admin rejected a candidate | candidate_id, reviewed_by, review_notes, pattern_signature |
These events appear in the Activity feed. POLICY_AUTO_ACTIVATED events also trigger a browser notification in the dashboard.
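Monitoring for unexpected activations can be as simple as filtering the stream. The sketch below assumes each event has already been decoded into a dictionary with `type` and `data` keys, which is an assumption about the stream's envelope; the data field names are those in the table above.

```python
def unexpected_activations(events: list, expected_threshold: float = 0.85) -> list:
    """Return candidate IDs auto-activated below the confidence an operator expects."""
    flagged = []
    for event in events:
        if event.get("type") != "POLICY_AUTO_ACTIVATED":
            continue
        data = event.get("data", {})
        # A missing confidence is treated as 0.0 so it is always flagged.
        if data.get("confidence", 0.0) < expected_threshold:
            flagged.append(data.get("candidate_id"))
    return flagged
```

A non-empty result suggests the tenant's `auto_activate_threshold` was lowered, which is worth confirming against the Settings page.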