Policy Automation

Overview

Traditional security postures are reactive: a threat is detected, a human writes a rule to block it, and the rule is deployed manually. Behavry's Red Team Policy Automation Loop closes this gap by converting detection findings into candidate OPA Rego policies automatically. The system observes attack patterns in real time, generates targeted policy rules, scores their confidence, and either presents them for human review or auto-activates them when confidence is high enough.

Detection Event (event bus)
  |
  v
PolicyGenerator.on_event()
  |
  |-- compute pattern signature (deduplication)
  |-- update frequency tracker (24h sliding window)
  |-- compute confidence score
  |-- render Rego template
  |
  v
PolicyCandidate (database)
  |
  |-- confidence >= threshold?
  |     |
  |     yes --> auto-activate (create Policy + push to OPA)
  |     no  --> status = "proposed" (await human review)
  |
  v
OPA enforcement (immediate)

The result is a feedback loop where observed threats are automatically translated into enforceable policy -- reducing the window between detection and prevention from hours to seconds.

How It Works

The PolicyGenerator is a singleton that subscribes to the event bus as a wildcard listener. When a detection event arrives, it checks whether the event type is in its handled set. If so, it processes the event through a pipeline of signature computation, confidence scoring, Rego generation, deduplication, and optional auto-activation.

Handled Event Types

The generator responds to six event types, each mapped to a specific Rego template:

Event Type	Threat Signal	Generated Template
`INBOUND_INJECTION_DETECTED`	Injection pattern found in tool call response	`injection_block`
`INBOUND_INJECTION_BLOCKED`	Injection pattern blocked by existing policy	`injection_block`
`INJECTION_CONDITIONING_SUSPECTED`	Gradual conditioning sequence detected	`conditioning_block`
`BEHAVIORAL_DRIFT_DETECTED`	Agent behavior diverging from baseline	`drift_escalate`
`REQUESTER_SESSION_CYCLING`	Rapid session cycling by same requester	`requester_validation`
`BEHAVIOR_REVERSAL`	Agent reversing previously blocked actions in new session	`behavior_reversal`

Events not in this set are silently ignored. The generator processes each event asynchronously and catches all exceptions to avoid disrupting the event bus.
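
The dispatch step can be sketched roughly as follows, assuming a Python implementation. The mapping mirrors the table above; the `handle_event` function, the `process` callback, and the event dictionary shape are hypothetical names used only for illustration.

```python
import asyncio
import logging

logger = logging.getLogger("policy_generator_sketch")

# Mirrors the Handled Event Types table.
TEMPLATE_BY_EVENT = {
    "INBOUND_INJECTION_DETECTED": "injection_block",
    "INBOUND_INJECTION_BLOCKED": "injection_block",
    "INJECTION_CONDITIONING_SUSPECTED": "conditioning_block",
    "BEHAVIORAL_DRIFT_DETECTED": "drift_escalate",
    "REQUESTER_SESSION_CYCLING": "requester_validation",
    "BEHAVIOR_REVERSAL": "behavior_reversal",
}


async def handle_event(event: dict, process) -> bool:
    """Return True if the event was processed, False if ignored or failed."""
    template = TEMPLATE_BY_EVENT.get(event.get("type"))
    if template is None:
        return False  # not in the handled set: silently ignored
    try:
        await process(event, template)  # hypothetical pipeline callback
        return True
    except Exception:
        # Catch everything so a failing handler never disrupts the event bus.
        logger.exception("policy generation failed for %s", event.get("type"))
        return False
```

Returning a boolean instead of raising keeps the wildcard listener safe to register alongside other subscribers.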

Pattern Fingerprinting

Each detection event is reduced to a pattern signature -- a normalized string that strips variable content (specific payloads, agent IDs, timestamps) and retains structural characteristics. This signature is the key for deduplication and frequency tracking.

Signature Format by Event Type

Event Type	Signature Format	Example
`INBOUND_INJECTION_DETECTED`	`injection:{pattern_class}:{tool_name}`	`injection:prompt_injection:read_file`
`INBOUND_INJECTION_BLOCKED`	`injection:{pattern_class}:{tool_name}`	`injection:system_override:execute_cmd`
`INJECTION_CONDITIONING_SUSPECTED`	`conditioning:repeated_injection_findings`	(static)
`BEHAVIORAL_DRIFT_DETECTED`	`drift:risk_tier_escalation`	(static)
`REQUESTER_SESSION_CYCLING`	`requester_mismatch:{tool_name}`	`requester_mismatch:write_file`
`BEHAVIOR_REVERSAL`	`reversal:{action_class}:{tool_name}`	`reversal:delete:delete_file`
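
The formats in the table can be expressed as a small function. This is a sketch only: the function name `pattern_signature` and the assumption that `pattern_class`, `tool_name`, and `action_class` arrive as keys of the event data are illustrative, not confirmed field names.

```python
def pattern_signature(event_type: str, data: dict) -> str:
    """Build the normalized signature for a detection event (illustrative)."""
    if event_type in ("INBOUND_INJECTION_DETECTED", "INBOUND_INJECTION_BLOCKED"):
        return f"injection:{data['pattern_class']}:{data['tool_name']}"
    if event_type == "INJECTION_CONDITIONING_SUSPECTED":
        return "conditioning:repeated_injection_findings"  # static
    if event_type == "BEHAVIORAL_DRIFT_DETECTED":
        return "drift:risk_tier_escalation"  # static
    if event_type == "REQUESTER_SESSION_CYCLING":
        return f"requester_mismatch:{data['tool_name']}"
    if event_type == "BEHAVIOR_REVERSAL":
        return f"reversal:{data['action_class']}:{data['tool_name']}"
    raise ValueError(f"unhandled event type: {event_type}")
```

Payloads, agent IDs, and timestamps never enter the string, so two events that differ only in those fields collapse to one signature.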

When a candidate with the same (tenant_id, signature) already exists in proposed or auto_activated status, no duplicate is created. Instead, the existing candidate's confidence score is updated to the maximum of its current value and the newly computed value. This means repeated occurrences of the same pattern raise confidence over time without creating noise in the review queue.
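
The deduplication rule might look like the sketch below. An in-memory list stands in for the database, and the `Candidate` class and `upsert_candidate` function are hypothetical names. Creating a fresh candidate when the only match is already approved or rejected is an inference from the status condition above, not a documented behavior.

```python
from dataclasses import dataclass

# Statuses in which an existing candidate absorbs new occurrences.
OPEN_STATUSES = {"proposed", "auto_activated"}


@dataclass
class Candidate:
    tenant_id: str
    signature: str
    confidence: float
    status: str = "proposed"


def upsert_candidate(store: list, tenant_id: str, signature: str,
                     confidence: float) -> Candidate:
    """Reuse an open candidate with the same key, else create a new one."""
    for existing in store:
        same_key = (existing.tenant_id, existing.signature) == (tenant_id, signature)
        if same_key and existing.status in OPEN_STATUSES:
            # Confidence only ever moves up: keep the larger value.
            existing.confidence = max(existing.confidence, confidence)
            return existing
    candidate = Candidate(tenant_id, signature, confidence)
    store.append(candidate)
    return candidate
```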

Confidence Scoring

Each candidate receives a confidence score between 0.0 and 1.0, computed from three independent factors:

Factor 1: Pattern Frequency

This factor counts how many times the pattern signature has been observed in the trailing 24-hour window.

Occurrences	Score
1 to 2	0.3
3 to 9	0.6
10 or more	0.9

The sliding window is maintained in memory. Occurrences older than 24 hours are pruned before each computation.
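
A minimal sketch of such an in-memory window, using a deque of timestamps per signature. The class and function names are hypothetical. Treating two occurrences as belonging to the lowest band is an assumption, since the table only lists scores for 1, 3 or more, and 10 or more.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 24 * 60 * 60


class FrequencyTracker:
    """Per-signature sliding window of observation timestamps (seconds)."""

    def __init__(self):
        self._seen = defaultdict(deque)

    def record(self, signature: str, now: float) -> int:
        """Record one occurrence and return the count inside the window."""
        times = self._seen[signature]
        times.append(now)
        cutoff = now - WINDOW_SECONDS
        # Prune occurrences older than 24 hours before counting.
        while times and times[0] < cutoff:
            times.popleft()
        return len(times)


def frequency_score(count: int) -> float:
    if count >= 10:
        return 0.9
    if count >= 3:
        return 0.6
    return 0.3  # assumption: 1 or 2 occurrences
```

Because the window lives in process memory, counts in a design like this would reset on restart.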

Factor 2: Severity

This factor reflects the severity of the source detection event. It is read from the event data's severity, alert_severity, or min_severity field, or inferred from the inbound findings.

Severity	Score
Critical	+0.3
High	+0.2
Medium	+0.1
Low	+0.0

Factor 3: Corroboration

This factor checks whether multiple distinct detection sources have contributed to the same pattern signature. For example, INBOUND_INJECTION_DETECTED and INBOUND_INJECTION_BLOCKED events share the injection:{pattern_class}:{tool_name} signature format, so both can contribute to the same pattern; when they do, two distinct sources agree on the threat.

Condition	Score
2 or more distinct source event types	+0.2
Single source only	+0.0

Final Score

confidence = min(1.0, frequency_score + severity_score + corroboration_score)

Examples:

Single medium-severity injection detection: 0.3 + 0.1 + 0.0 = 0.4
10 high-severity injection detections: 0.9 + 0.2 + 0.0 = 1.1, capped to 1.0
3 critical-severity detections from 2 sources: 0.6 + 0.3 + 0.2 = 1.1, capped to 1.0
3 medium-severity detections, single source: 0.6 + 0.1 + 0.0 = 0.7
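
The formula and the four examples can be checked with a short function. This is a sketch under stated assumptions: the function name, the lowercase severity keys, the 0.0 default for an unrecognized severity, and the treatment of two occurrences as the lowest frequency band are all illustrative choices.

```python
SEVERITY_SCORE = {"critical": 0.3, "high": 0.2, "medium": 0.1, "low": 0.0}


def confidence(occurrences: int, severity: str, source_event_types: set) -> float:
    """Sum the three factor scores and cap the result at 1.0."""
    if occurrences >= 10:
        frequency = 0.9
    elif occurrences >= 3:
        frequency = 0.6
    else:
        frequency = 0.3  # assumption: 1 or 2 occurrences
    # Assumption: an unrecognized severity contributes nothing.
    severity_score = SEVERITY_SCORE.get(severity.lower(), 0.0)
    corroboration = 0.2 if len(source_event_types) >= 2 else 0.0
    # Round to avoid floating-point noise such as 0.7000000000000001.
    return round(min(1.0, frequency + severity_score + corroboration), 2)
```

With these values a pattern seen only once cannot exceed 0.6, so it stays below the default 0.85 auto-activation threshold regardless of severity.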

Rego Templates

The generator ships with seven Rego templates. Each produces a complete, deployable OPA policy module under the behavry.authz.autogen package.

injection_block

Blocks a specific injection pattern class at or above a minimum severity:

package behavry.authz.autogen

# Auto-generated: block prompt_injection injection pattern
# Source event: evt-abc123 | Generated: 2026-03-17T14:00:00Z

default allow := false

deny[msg] {
    input.inbound_findings[_].pattern_class == "prompt_injection"
    input.inbound_findings[_].severity >= "high"
    msg := "Auto-generated: block prompt_injection (source: evt-abc123)"
}
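
Rendering a module like the one above could be done with plain string substitution. The sketch below uses Python's `string.Template`; the placeholder names, the `render_injection_block` function, and the input validation are assumptions for illustration, not the generator's actual code.

```python
import re
from string import Template

INJECTION_BLOCK = Template('''package behavry.authz.autogen

# Auto-generated: block $pattern_class injection pattern
# Source event: $event_id | Generated: $generated_at

default allow := false

deny[msg] {
    input.inbound_findings[_].pattern_class == "$pattern_class"
    input.inbound_findings[_].severity >= "$min_severity"
    msg := "Auto-generated: block $pattern_class (source: $event_id)"
}
''')

SAFE_TOKEN = re.compile(r"[A-Za-z0-9_-]+")
SEVERITIES = {"low", "medium", "high", "critical"}


def render_injection_block(pattern_class: str, min_severity: str,
                           event_id: str, generated_at: str) -> str:
    # Reject values that could break out of the quoted Rego strings.
    for value in (pattern_class, event_id):
        if not SAFE_TOKEN.fullmatch(value):
            raise ValueError(f"unsafe template value: {value!r}")
    if min_severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {min_severity!r}")
    return INJECTION_BLOCK.substitute(
        pattern_class=pattern_class,
        min_severity=min_severity,
        event_id=event_id,
        generated_at=generated_at,
    )
```

Validating the substituted values matters here because they originate from detection events, which may themselves carry attacker-controlled text.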

conditioning_block

Escalates when three or more findings of medium severity or higher appear in a single request -- indicating a gradual conditioning sequence:

escalate[msg] {
    count([f | f := input.inbound_findings[_]; f.severity >= "medium"]) >= 3
    msg := "Auto-generated: conditioning sequence detected (source: evt-def456)"
}

drift_escalate

Escalates actions from agents whose risk tier has been elevated due to behavioral drift:

escalate[msg] {
    input.agent.risk_tier == "high"
    msg := "Auto-generated: behavioral drift escalation (source: evt-ghi789)"
}

requester_validation

Denies a specific tool call when the requester identity is not verified, targeting session-cycling patterns:

deny[msg] {
    input.request.tool_name == "write_file"
    not input.requester.verified
    msg := "Auto-generated: require verified requester for write_file (source: evt-jkl012)"
}

behavior_reversal

Escalates calls to a specific tool that has been associated with trust-reset behavior reversals:

escalate[msg] {
    input.request.tool_name == "delete_file"
    msg := "Auto-generated: behavior reversal guard for delete_file (source: evt-mno345)"
}

resource_restrict and rate_ceiling

Two additional templates (resource_restrict and rate_ceiling) are reserved for future use. No handled event type maps to them yet, but they can be applied manually when editing a candidate's Rego before approval.

Auto-Activation

When a tenant enables automatic policy activation, high-confidence candidates bypass the review queue entirely.

Configuration

Auto-activation is controlled by two fields on the tenant's configuration:

Field	Type	Default	Description
`auto_activate_enabled`	`boolean`	`false`	Whether auto-activation is active
`auto_activate_threshold`	`float`	`0.85`	Minimum confidence score required

These can be set via the Admin API or the Settings page in the dashboard.

Activation Flow

When a candidate's confidence meets or exceeds the threshold:

A Policy record is created in the database with the candidate's Rego content.
The policy is activated -- its Rego is pushed to OPA via PUT /v1/policies/{id}.
The candidate's status changes from proposed to auto_activated.
A POLICY_AUTO_ACTIVATED event is published to the event bus.

From this point, OPA enforces the new rule immediately. The policy appears in the Policies dashboard as an auto-generated entry with a link back to the originating candidate.
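
The threshold decision and the OPA push might be sketched as follows. `PUT /v1/policies/{id}` with a `text/plain` body is OPA's Policy API; the function names, the base URL, and the policy ID format are assumptions, and the request is only built here, not sent.

```python
import urllib.parse
import urllib.request


def should_auto_activate(confidence: float, enabled: bool,
                         threshold: float = 0.85) -> bool:
    """True when the tenant opted in and confidence meets the threshold."""
    return enabled and confidence >= threshold


def build_opa_put(opa_base_url: str, policy_id: str, rego: str) -> urllib.request.Request:
    """Build (but do not send) the request that uploads a Rego module."""
    url = f"{opa_base_url.rstrip('/')}/v1/policies/{urllib.parse.quote(policy_id)}"
    return urllib.request.Request(
        url=url,
        data=rego.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain"},  # OPA expects raw Rego text
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the upload against a running OPA instance.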

Safety Considerations

Auto-activation should be enabled with care. Recommended practices:

Set the threshold high (0.85 or above) so that only frequently observed or corroborated patterns are activated automatically.
Monitor the POLICY_AUTO_ACTIVATED event stream for unexpected activations.
Review auto-activated candidates periodically in the Policy Suggestions dashboard.
Use the test dry-run endpoint to validate generated Rego before lowering the threshold.

Manual Review

Candidates that do not meet the auto-activation threshold remain in proposed status and appear in the Policy Suggestions dashboard for human review.

Review Actions

Action	Description
Inspect	View the generated Rego, source event details, confidence breakdown, and pattern signature
Edit Rego	Modify the generated Rego inline before approving (useful for tightening or relaxing the rule)
Test	Push the Rego to OPA temporarily and evaluate it against sample input to verify behavior
Approve	Create a Policy record, activate it, and push to OPA. The candidate status changes to `approved`
Reject	Close the candidate with reviewer notes. The candidate status changes to `rejected`

Approve Flow

When a candidate is approved:

A Policy record is created with the candidate's (possibly edited) Rego content.
The policy is activated and synced to OPA.
The candidate is linked to the new policy via activated_policy_id.
A POLICY_CANDIDATE_APPROVED event is published.
An audit entry records the reviewer's identity and notes.

Reject Flow

When a candidate is rejected:

The candidate's status is set to rejected.
The reviewer and review notes are recorded.
A POLICY_CANDIDATE_REJECTED event is published.

Rejected candidates remain in the database for audit purposes but are excluded from the active review queue.

API Endpoints

All endpoints require admin authentication.

List Candidates

GET /api/v1/policy-candidates?status=proposed&min_confidence=0.5&limit=20

Query parameters:

Parameter	Type	Description
`status`	`string`	Filter by status: `proposed`, `approved`, `rejected`, `auto_activated`
`source_event_type`	`string`	Filter by source event type
`min_confidence`	`float`	Minimum confidence threshold
`max_confidence`	`float`	Maximum confidence threshold
`offset`	`int`	Pagination offset (default 0)
`limit`	`int`	Page size (default 50)
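
A client might assemble the query string like this. The helper name `list_candidates_url` and the host are hypothetical; only the path and parameter names come from the table above.

```python
from urllib.parse import urlencode

ALLOWED_FILTERS = {
    "status", "source_event_type", "min_confidence",
    "max_confidence", "offset", "limit",
}


def list_candidates_url(base_url: str, **filters) -> str:
    """Build the List Candidates URL, rejecting unknown parameters."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unknown query parameters: {sorted(unknown)}")
    # Drop filters left as None so they are not sent at all.
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    path = f"{base_url.rstrip('/')}/api/v1/policy-candidates"
    return f"{path}?{query}" if query else path
```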

Get Statistics

GET /api/v1/policy-candidates/stats?window_days=30

Returns aggregated counts by status within the trailing N days:

{
  "proposed": 12,
  "approved": 8,
  "rejected": 3,
  "auto_activated": 5,
  "window_days": 30
}

Get Candidate

GET /api/v1/policy-candidates/{id}

Returns the full candidate record including Rego content, confidence score, and review metadata.

Approve

POST /api/v1/policy-candidates/{id}/approve
Content-Type: application/json

{
  "reviewed_by": "admin@example.com",
  "review_notes": "Verified pattern matches known injection vector. Approved for production."
}

Reject

POST /api/v1/policy-candidates/{id}/reject
Content-Type: application/json

{
  "reviewed_by": "admin@example.com",
  "review_notes": "False positive — pattern triggered by legitimate API documentation content."
}

Edit Rego

PATCH /api/v1/policy-candidates/{id}/rego
Content-Type: application/json

{
  "rego_rule": "package behavry.authz.autogen\n\ndefault allow := false\n\ndeny[msg] {\n    input.inbound_findings[_].pattern_class == \"prompt_injection\"\n    input.inbound_findings[_].severity >= \"critical\"\n    msg := \"Custom: block critical prompt injections only\"\n}\n"
}

This endpoint is only allowed for candidates in proposed status.
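
Escaping multi-line Rego by hand, as in the body above, is error-prone. A sketch of producing it with the standard `json` module follows; the helper name `rego_patch_body` is hypothetical.

```python
import json


def rego_patch_body(rego_source: str) -> str:
    """Serialize Rego source into the JSON body for the Edit Rego request."""
    # json.dumps escapes newlines and double quotes inside the Rego text.
    return json.dumps({"rego_rule": rego_source})


REGO = '''package behavry.authz.autogen

default allow := false

deny[msg] {
    input.inbound_findings[_].pattern_class == "prompt_injection"
    input.inbound_findings[_].severity >= "critical"
    msg := "Custom: block critical prompt injections only"
}
'''

body = rego_patch_body(REGO)
```

Decoding `body` with `json.loads` yields the original Rego text unchanged, which is a quick way to confirm the escaping.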

Test Dry-Run

POST /api/v1/policy-candidates/{id}/test
Content-Type: application/json

{
  "input": {
    "agent": {"id": "test-agent", "roles": ["worker"], "risk_tier": "medium"},
    "request": {"tool_name": "read_file", "action": "read", "resource": "/tmp/test.txt"},
    "inbound_findings": [
      {"pattern_class": "prompt_injection", "severity": "high", "content_snippet": "ignore previous instructions"}
    ]
  }
}

The test endpoint pushes the candidate's Rego to OPA under a temporary path, evaluates the provided input, and returns the OPA result. The temporary policy is not cleaned up automatically -- it is overwritten on the next test.

Response:

{
  "candidate_id": "candidate-uuid",
  "opa_result": {
    "result": {
      "deny": ["Auto-generated: block prompt_injection (source: evt-abc123)"]
    }
  },
  "rego_package": "behavry.authz.autogen"
}
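
A caller could pull the triggered messages out of this response as sketched below. The function name is hypothetical, and the assumption that `escalate` results appear beside `deny` under `opa_result.result` is inferred from the templates rather than stated by the API.

```python
def triggered_messages(response: dict) -> dict:
    """Collect deny and escalate messages from a dry-run response."""
    result = response.get("opa_result", {}).get("result", {})
    return {
        "deny": list(result.get("deny", [])),
        # Assumption: escalate rules are reported next to deny.
        "escalate": list(result.get("escalate", [])),
    }


sample = {
    "candidate_id": "candidate-uuid",
    "opa_result": {
        "result": {
            "deny": ["Auto-generated: block prompt_injection (source: evt-abc123)"]
        }
    },
    "rego_package": "behavry.authz.autogen",
}
messages = triggered_messages(sample)
```

Two empty lists mean the candidate's rules did not fire for the supplied input.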

Dashboard

The Policy Suggestions page is accessible from the main navigation (marked with an amber badge showing the count of proposed candidates).

Stats Strip

Four metric cards at the top show the count of proposed, approved, rejected, and auto-activated candidates within the trailing 30 days.

Filter Bar

Filter candidates by status, source event type, and confidence range.

Candidate Cards

Each candidate is displayed as a card showing:

Pattern signature and source event type
Confidence score with a color-coded badge (green for high, amber for medium, red for low)
Generated Rego in a syntax-highlighted code block
Inline Rego editor (for proposed candidates)
Test panel with sample input and OPA result display
Approve and Reject action buttons with notes input

The page is organized into three sections:

Awaiting Review -- proposed candidates, sorted by confidence descending
Auto-Activated -- candidates that were automatically activated
Resolved -- approved and rejected candidates
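
The three-section layout amounts to a grouping by status. The sketch below assumes candidates are dictionaries with `status` and `confidence` keys, as suggested by the API responses; the function name is hypothetical.

```python
SECTION_BY_STATUS = {
    "proposed": "Awaiting Review",
    "auto_activated": "Auto-Activated",
    "approved": "Resolved",
    "rejected": "Resolved",
}


def group_for_dashboard(candidates: list) -> dict:
    """Split candidates into the three dashboard sections."""
    sections = {"Awaiting Review": [], "Auto-Activated": [], "Resolved": []}
    for candidate in candidates:
        sections[SECTION_BY_STATUS[candidate["status"]]].append(candidate)
    # Only the review queue has a documented order: confidence descending.
    sections["Awaiting Review"].sort(key=lambda c: c["confidence"], reverse=True)
    return sections
```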

Event Types

The policy automation system publishes four event types to the event bus and SSE stream:

Event Type	When	Key Data Fields
`POLICY_CANDIDATE_PROPOSED`	New candidate created from detection event	`candidate_id`, `pattern_signature`, `confidence`, `source_event_type`
`POLICY_AUTO_ACTIVATED`	Candidate auto-activated (confidence met threshold)	`candidate_id`, `policy_id`, `rego_package`, `confidence`, `pattern_signature`
`POLICY_CANDIDATE_APPROVED`	Admin approved a candidate	`candidate_id`, `policy_id`, `reviewed_by`, `pattern_signature`, `confidence`
`POLICY_CANDIDATE_REJECTED`	Admin rejected a candidate	`candidate_id`, `reviewed_by`, `review_notes`, `pattern_signature`

These events appear in the Activity feed. POLICY_AUTO_ACTIVATED events also trigger browser notifications in the dashboard.
