Policy Automation

Overview

Traditional security postures are reactive: a threat is detected, a human writes a rule to block it, and the rule is deployed manually. Behavry's Red Team Policy Automation Loop closes this gap by converting detection findings into candidate OPA Rego policies automatically. The system observes attack patterns in real time, generates targeted policy rules, scores their confidence, and either presents them for human review or auto-activates them when confidence is high enough.

Detection Event (event bus)
|
v
PolicyGenerator.on_event()
|
|-- compute pattern signature (deduplication)
|-- update frequency tracker (24h sliding window)
|-- compute confidence score
|-- render Rego template
|
v
PolicyCandidate (database)
|
|-- confidence >= threshold?
| |
| yes --> auto-activate (create Policy + push to OPA)
| no --> status = "proposed" (await human review)
|
v
OPA enforcement (immediate)

The result is a feedback loop where observed threats are automatically translated into enforceable policy -- reducing the window between detection and prevention from hours to seconds.


How It Works

The PolicyGenerator is a singleton that subscribes to the event bus as a wildcard listener. When a detection event arrives, it checks whether the event type is in its handled set. If so, it processes the event through a pipeline of signature computation, confidence scoring, Rego generation, deduplication, and optional auto-activation.

Handled Event Types

The generator responds to six event types, each mapped to a specific Rego template:

Event Type                        | Threat Signal                                              | Generated Template
----------------------------------|------------------------------------------------------------|---------------------
INBOUND_INJECTION_DETECTED        | Injection pattern found in tool call response              | injection_block
INBOUND_INJECTION_BLOCKED         | Injection pattern blocked by existing policy               | injection_block
INJECTION_CONDITIONING_SUSPECTED  | Gradual conditioning sequence detected                     | conditioning_block
BEHAVIORAL_DRIFT_DETECTED         | Agent behavior diverging from baseline                     | drift_escalate
REQUESTER_SESSION_CYCLING         | Rapid session cycling by the same requester                | requester_validation
BEHAVIOR_REVERSAL                 | Agent reversing previously blocked actions in a new session| behavior_reversal

Events not in this set are silently ignored. The generator processes each event asynchronously and catches all exceptions to avoid disrupting the event bus.
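The dispatch logic can be sketched as follows. This is a minimal illustration, not the actual implementation: only `PolicyGenerator.on_event` is named in this document, and the `_process` method, the event dict shape, and the constant name `HANDLED_EVENT_TYPES` are assumptions.

```python
import asyncio

# Event types the generator handles (from the table above).
HANDLED_EVENT_TYPES = {
    "INBOUND_INJECTION_DETECTED",
    "INBOUND_INJECTION_BLOCKED",
    "INJECTION_CONDITIONING_SUSPECTED",
    "BEHAVIORAL_DRIFT_DETECTED",
    "REQUESTER_SESSION_CYCLING",
    "BEHAVIOR_REVERSAL",
}

class PolicyGenerator:
    """Singleton wildcard listener on the event bus (illustrative sketch)."""

    async def on_event(self, event: dict) -> None:
        # Events outside the handled set are silently ignored.
        if event.get("type") not in HANDLED_EVENT_TYPES:
            return
        try:
            # Pipeline: signature -> frequency -> confidence -> Rego -> dedup.
            await self._process(event)
        except Exception:
            # Swallow processing failures so the event bus is never disrupted.
            pass

    async def _process(self, event: dict) -> None:
        ...  # pipeline stages described in the sections below
```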


Pattern Fingerprinting

Each detection event is reduced to a pattern signature -- a normalized string that strips variable content (specific payloads, agent IDs, timestamps) and retains structural characteristics. This signature is the key for deduplication and frequency tracking.

Signature Format by Event Type

Event Type                        | Signature Format                          | Example
----------------------------------|-------------------------------------------|---------------------------------------
INBOUND_INJECTION_DETECTED        | injection:{pattern_class}:{tool_name}     | injection:prompt_injection:read_file
INBOUND_INJECTION_BLOCKED         | injection:{pattern_class}:{tool_name}     | injection:system_override:execute_cmd
INJECTION_CONDITIONING_SUSPECTED  | conditioning:repeated_injection_findings  | (static)
BEHAVIORAL_DRIFT_DETECTED         | drift:risk_tier_escalation                | (static)
REQUESTER_SESSION_CYCLING         | requester_mismatch:{tool_name}            | requester_mismatch:write_file
BEHAVIOR_REVERSAL                 | reversal:{action_class}:{tool_name}       | reversal:delete:delete_file

When a candidate with the same (tenant_id, signature) already exists in proposed or auto_activated status, no duplicate is created. Instead, the existing candidate's confidence score is updated to the maximum of its current value and the newly computed value. This means repeated occurrences of the same pattern raise confidence over time without creating noise in the review queue.
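The signature formats and the dedup-with-max-confidence rule can be sketched together. This is an illustrative sketch, not the shipped code: the function names, the event-data field names, and the in-memory dict standing in for the database are all assumptions.

```python
def pattern_signature(event_type: str, data: dict) -> str:
    """Reduce a detection event to its normalized signature (formats from the table above)."""
    if event_type in ("INBOUND_INJECTION_DETECTED", "INBOUND_INJECTION_BLOCKED"):
        return f"injection:{data['pattern_class']}:{data['tool_name']}"
    if event_type == "INJECTION_CONDITIONING_SUSPECTED":
        return "conditioning:repeated_injection_findings"
    if event_type == "BEHAVIORAL_DRIFT_DETECTED":
        return "drift:risk_tier_escalation"
    if event_type == "REQUESTER_SESSION_CYCLING":
        return f"requester_mismatch:{data['tool_name']}"
    if event_type == "BEHAVIOR_REVERSAL":
        return f"reversal:{data['action_class']}:{data['tool_name']}"
    raise ValueError(f"unhandled event type: {event_type}")

def upsert_candidate(store: dict, tenant_id: str, signature: str, confidence: float) -> None:
    """Deduplicate on (tenant_id, signature); repeats only ever raise confidence."""
    key = (tenant_id, signature)
    existing = store.get(key)
    if existing is not None and existing["status"] in ("proposed", "auto_activated"):
        existing["confidence"] = max(existing["confidence"], confidence)
    else:
        store[key] = {"status": "proposed", "confidence": confidence}
```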


Confidence Scoring

Each candidate receives a confidence score between 0.0 and 1.0, computed from three independent factors:

Factor 1: Pattern Frequency

How many times this pattern signature has been observed in the trailing 24-hour window.

Occurrences | Score
------------|------
1           | 0.3
3 or more   | 0.6
10 or more  | 0.9

The highest matching tier applies (so 10 occurrences score 0.9, not 0.6).

The sliding window is maintained in memory. Occurrences older than 24 hours are pruned before each computation.

Factor 2: Severity

The severity of the source detection event. Extracted from the event data's severity, alert_severity, or min_severity fields, or inferred from inbound findings.

Severity | Score
---------|------
Critical | +0.3
High     | +0.2
Medium   | +0.1
Low      | +0.0

Factor 3: Corroboration

Whether multiple distinct detection sources have contributed to the same pattern signature. For example, if both INBOUND_INJECTION_DETECTED and BEHAVIOR_REVERSAL events produce the same signature, that indicates independent detectors agree on the threat.

Condition                             | Score
--------------------------------------|------
2 or more distinct source event types | +0.2
Single source only                    | +0.0

Final Score

confidence = min(1.0, frequency_score + severity_score + corroboration_score)

Examples:

  • Single medium-severity injection detection: 0.3 + 0.1 + 0.0 = 0.4
  • 10 high-severity injection detections: 0.9 + 0.2 + 0.0 = 1.0 (capped)
  • 3 critical-severity detections from 2 sources: 0.6 + 0.3 + 0.2 = 1.0 (capped)
  • 3 medium-severity detections, single source: 0.6 + 0.1 + 0.0 = 0.7
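The scoring formula and the examples above can be sketched directly. Function names are illustrative, and the treatment of occurrence counts not listed in the frequency table (e.g. 2) is an assumption that they fall into the lowest tier.

```python
SEVERITY_SCORES = {"critical": 0.3, "high": 0.2, "medium": 0.1, "low": 0.0}

def frequency_score(occurrences: int) -> float:
    # Highest matching tier wins.
    if occurrences >= 10:
        return 0.9
    if occurrences >= 3:
        return 0.6
    return 0.3  # assumed: counts below 3 score the lowest tier

def confidence(occurrences: int, severity: str, distinct_sources: int) -> float:
    """confidence = min(1.0, frequency + severity + corroboration)."""
    corroboration = 0.2 if distinct_sources >= 2 else 0.0
    return min(1.0, frequency_score(occurrences) + SEVERITY_SCORES[severity] + corroboration)
```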

Rego Templates

The generator ships with seven Rego templates. Each produces a complete, deployable OPA policy module under the behavry.authz.autogen package.

injection_block

Blocks a specific injection pattern class at or above a minimum severity:

package behavry.authz.autogen

# Auto-generated: block prompt_injection injection pattern
# Source event: evt-abc123 | Generated: 2026-03-17T14:00:00Z

default allow := false

deny[msg] {
    input.inbound_findings[_].pattern_class == "prompt_injection"
    input.inbound_findings[_].severity >= "high"
    msg := "Auto-generated: block prompt_injection (source: evt-abc123)"
}

conditioning_block

Escalates when three or more medium-severity findings appear in a single request -- indicating a gradual conditioning sequence:

escalate[msg] {
    count([f | f := input.inbound_findings[_]; f.severity >= "medium"]) >= 3
    msg := "Auto-generated: conditioning sequence detected (source: evt-def456)"
}

drift_escalate

Escalates actions from agents whose risk tier has been elevated due to behavioral drift:

escalate[msg] {
    input.agent.risk_tier == "high"
    msg := "Auto-generated: behavioral drift escalation (source: evt-ghi789)"
}

requester_validation

Denies a specific tool call when the requester identity is not verified, targeting session-cycling patterns:

deny[msg] {
    input.tool_name == "write_file"
    not input.requester.verified
    msg := "Auto-generated: require verified requester for write_file (source: evt-jkl012)"
}

behavior_reversal

Escalates calls to a specific tool that has been associated with trust-reset behavior reversals:

escalate[msg] {
    input.request.tool_name == "delete_file"
    msg := "Auto-generated: behavior reversal guard for delete_file (source: evt-mno345)"
}

resource_restrict and rate_ceiling

Two additional templates (resource_restrict and rate_ceiling) are available for future use and can be invoked manually when editing a candidate's Rego before approval.
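Template rendering amounts to substituting event-specific values into a fixed Rego skeleton. A minimal sketch for injection_block using Python string formatting follows; the template text is reconstructed from the example above, and the function name and parameter names are assumptions.

```python
# Rego skeleton for the injection_block template (reconstructed from the example above).
# Literal Rego braces are escaped as {{ }} for str.format.
INJECTION_BLOCK_TEMPLATE = """\
package behavry.authz.autogen

# Auto-generated: block {pattern_class} injection pattern
# Source event: {source_event_id} | Generated: {generated_at}

default allow := false

deny[msg] {{
    input.inbound_findings[_].pattern_class == "{pattern_class}"
    input.inbound_findings[_].severity >= "{min_severity}"
    msg := "Auto-generated: block {pattern_class} (source: {source_event_id})"
}}
"""

def render_injection_block(pattern_class: str, min_severity: str,
                           source_event_id: str, generated_at: str) -> str:
    """Fill the skeleton with values extracted from the detection event."""
    return INJECTION_BLOCK_TEMPLATE.format(
        pattern_class=pattern_class,
        min_severity=min_severity,
        source_event_id=source_event_id,
        generated_at=generated_at,
    )
```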


Auto-Activation

When a tenant enables automatic policy activation, high-confidence candidates bypass the review queue entirely.

Configuration

Auto-activation is controlled by two fields on the tenant's configuration:

Field                   | Type    | Default | Description
------------------------|---------|---------|----------------------------------
auto_activate_enabled   | boolean | false   | Whether auto-activation is active
auto_activate_threshold | float   | 0.85    | Minimum confidence score required

These can be set via the Admin API or the Settings page in the dashboard.

Activation Flow

When a candidate's confidence meets or exceeds the threshold:

  1. A Policy record is created in the database with the candidate's Rego content.
  2. The policy is activated -- its Rego is pushed to OPA via PUT /v1/policies/{id}.
  3. The candidate's status changes from proposed to auto_activated.
  4. A POLICY_AUTO_ACTIVATED event is published to the event bus.

From this point, OPA enforces the new rule immediately. The policy appears in the Policies dashboard as an auto-generated entry with a link back to the originating candidate.
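The activation decision and the OPA push can be sketched as follows. This is a simplified illustration: the function names, the OPA address, and the use of `urllib` are assumptions; only the threshold semantics and the `PUT /v1/policies/{id}` call come from this document.

```python
import urllib.request

OPA_URL = "http://localhost:8181"  # assumed local OPA address

def should_auto_activate(confidence: float, enabled: bool, threshold: float = 0.85) -> bool:
    """Auto-activate only when the tenant opted in and confidence meets the threshold."""
    return enabled and confidence >= threshold

def push_policy_to_opa(policy_id: str, rego: str) -> None:
    """Push the Rego module via OPA's policy API: PUT /v1/policies/{id} (no error handling)."""
    req = urllib.request.Request(
        f"{OPA_URL}/v1/policies/{policy_id}",
        data=rego.encode(),
        method="PUT",
        headers={"Content-Type": "text/plain"},
    )
    urllib.request.urlopen(req)
```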

Safety Considerations

Auto-activation should be enabled with care. Recommended practices:

  • Set the threshold high (0.85 or above) to ensure only well-corroborated patterns are activated automatically.
  • Monitor the POLICY_AUTO_ACTIVATED event stream for unexpected activations.
  • Review auto-activated candidates periodically in the Policy Suggestions dashboard.
  • Use the test dry-run endpoint to validate generated Rego before lowering the threshold.

Manual Review

Candidates that do not meet the auto-activation threshold remain in proposed status and appear in the Policy Suggestions dashboard for human review.

Review Actions

Action    | Description
----------|--------------------------------------------------------------------------------------------
Inspect   | View the generated Rego, source event details, confidence breakdown, and pattern signature
Edit Rego | Modify the generated Rego inline before approving (useful for tightening or relaxing the rule)
Test      | Push the Rego to OPA temporarily and evaluate it against sample input to verify behavior
Approve   | Create a Policy record, activate it, and push to OPA. The candidate status changes to approved
Reject    | Close the candidate with reviewer notes. The candidate status changes to rejected

Approve Flow

When a candidate is approved:

  1. A Policy record is created with the candidate's (possibly edited) Rego content.
  2. The policy is activated and synced to OPA.
  3. The candidate is linked to the new policy via activated_policy_id.
  4. A POLICY_CANDIDATE_APPROVED event is published.
  5. An audit entry records the reviewer's identity and notes.

Reject Flow

When a candidate is rejected:

  1. The candidate's status is set to rejected.
  2. The reviewer and review notes are recorded.
  3. A POLICY_CANDIDATE_REJECTED event is published.

Rejected candidates remain in the database for audit purposes but are excluded from the active review queue.


API Endpoints

All endpoints require admin authentication.

List Candidates

GET /api/v1/policy-candidates?status=proposed&min_confidence=0.5&limit=20

Query parameters:

Parameter         | Type   | Description
------------------|--------|---------------------------------------------------------------
status            | string | Filter by status: proposed, approved, rejected, auto_activated
source_event_type | string | Filter by source event type
min_confidence    | float  | Minimum confidence threshold
max_confidence    | float  | Maximum confidence threshold
offset            | int    | Pagination offset (default 0)
limit             | int    | Page size (default 50)
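Building the query string from these parameters is straightforward; a small client-side sketch (the base URL is a placeholder, and the helper name is an assumption):

```python
from urllib.parse import urlencode

BASE = "https://behavry.example.com"  # hypothetical deployment URL

def list_candidates_url(status=None, source_event_type=None, min_confidence=None,
                        max_confidence=None, offset=0, limit=50) -> str:
    """Assemble the List Candidates URL, omitting unset filters."""
    params = {
        "status": status,
        "source_event_type": source_event_type,
        "min_confidence": min_confidence,
        "max_confidence": max_confidence,
        "offset": offset,
        "limit": limit,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE}/api/v1/policy-candidates?{query}"
```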

Get Statistics

GET /api/v1/policy-candidates/stats?window_days=30

Returns aggregated counts by status within the trailing N days:

{
  "proposed": 12,
  "approved": 8,
  "rejected": 3,
  "auto_activated": 5,
  "window_days": 30
}

Get Candidate

GET /api/v1/policy-candidates/{id}

Returns the full candidate record including Rego content, confidence score, and review metadata.

Approve

POST /api/v1/policy-candidates/{id}/approve
Content-Type: application/json

{
  "reviewed_by": "admin@example.com",
  "review_notes": "Verified pattern matches known injection vector. Approved for production."
}

Reject

POST /api/v1/policy-candidates/{id}/reject
Content-Type: application/json

{
  "reviewed_by": "admin@example.com",
  "review_notes": "False positive — pattern triggered by legitimate API documentation content."
}

Edit Rego

PATCH /api/v1/policy-candidates/{id}/rego
Content-Type: application/json

{
  "rego_rule": "package behavry.authz.autogen\n\ndefault allow := false\n\ndeny[msg] {\n input.inbound_findings[_].pattern_class == \"prompt_injection\"\n input.inbound_findings[_].severity >= \"critical\"\n msg := \"Custom: block critical prompt injections only\"\n}\n"
}

Only allowed on candidates with status proposed.

Test Dry-Run

POST /api/v1/policy-candidates/{id}/test
Content-Type: application/json

{
  "input": {
    "agent": {"id": "test-agent", "roles": ["worker"], "risk_tier": "medium"},
    "request": {"tool_name": "read_file", "action": "read", "resource": "/tmp/test.txt"},
    "inbound_findings": [
      {"pattern_class": "prompt_injection", "severity": "high", "content_snippet": "ignore previous instructions"}
    ]
  }
}

The test endpoint pushes the candidate's Rego to OPA under a temporary path, evaluates the provided input, and returns the OPA result. The temporary policy is not cleaned up automatically -- it is overwritten on the next test.

Response:

{
  "candidate_id": "candidate-uuid",
  "opa_result": {
    "result": {
      "deny": ["Auto-generated: block prompt_injection (source: evt-abc123)"]
    }
  },
  "rego_package": "behavry.authz.autogen"
}

Dashboard

The Policy Suggestions page is accessible from the main navigation (marked with an amber badge showing the count of proposed candidates).

Stats Strip

Four metric cards at the top show the count of proposed, approved, rejected, and auto-activated candidates within the trailing 30 days.

Filter Bar

Filter candidates by status, source event type, and confidence range.

Candidate Cards

Each candidate is displayed as a card showing:

  • Pattern signature and source event type
  • Confidence score with a color-coded badge (green for high, amber for medium, red for low)
  • Generated Rego in a syntax-highlighted code block
  • Inline Rego editor (for proposed candidates)
  • Test panel with sample input and OPA result display
  • Approve and Reject action buttons with notes input

The page is organized into three sections:

  1. Awaiting Review -- proposed candidates, sorted by confidence descending
  2. Auto-Activated -- candidates that were automatically activated
  3. Resolved -- approved and rejected candidates

Event Types

The policy automation system publishes four event types to the event bus and SSE stream:

Event Type                | When                                                | Key Data Fields
--------------------------|-----------------------------------------------------|----------------------------------------------------------------------
POLICY_CANDIDATE_PROPOSED | New candidate created from detection event          | candidate_id, pattern_signature, confidence, source_event_type
POLICY_AUTO_ACTIVATED     | Candidate auto-activated (confidence met threshold) | candidate_id, policy_id, rego_package, confidence, pattern_signature
POLICY_CANDIDATE_APPROVED | Admin approved a candidate                          | candidate_id, policy_id, reviewed_by, pattern_signature, confidence
POLICY_CANDIDATE_REJECTED | Admin rejected a candidate                          | candidate_id, reviewed_by, review_notes, pattern_signature

These events appear in the Activity feed and trigger browser notifications in the dashboard when auto-activation occurs.