Agentic Security

Behavry's Agentic Security program (AOC — Agents of Chaos) addresses attack vectors unique to autonomous AI agents that traditional security tools miss entirely.

Inspired by Shapira et al., "Agents of Chaos" (arXiv:2602.20021, Feb 2026).

Why agents are different

Traditional security assumes a human is in the loop. AI agents:

Execute at machine speed — no natural pause for humans to catch mistakes
Trust external content — fetched web pages, API responses, and file reads become agent context
Have broad authority — a single agent may have access to filesystem, email, GitHub, and Slack simultaneously
Are coachable — adversaries can embed instructions in data the agent will read

The AOC threat model

AOC	Threat	Status
AOC-1	Inbound injection — instructions embedded in tool results	Shipped (Sprint AOC-1, enhanced in AOC-1.5)
AOC-2	Blast radius — agent takes actions with disproportionate impact	Shipped (Sprint Y, March 2026)
AOC-3	Requester identity — who actually instructed this action?	Shipped (Phases 1-2 Sprint AOC-3, Phase 3 Sprint U)
AOC-4	Trust reset — agent that was safe suddenly becomes unsafe	Shipped (Sprint V, March 2026)

All four AOC controls are fully implemented and tested. The Red Team Policy Automation Loop (Sprint RT) ties them together as a cross-cutting adaptive defense layer — see below.

AOC-1: Inbound Injection

An attacker who controls an external resource (a GitHub Gist, a shared doc, a webhook response) embeds imperative instructions. The agent fetches it, incorporates it into context, and acts on it. Behavry scans tool results before they reach the agent. Enhanced in Sprint AOC-1.5 with content trust domain tagging and behavioral drift detection.

AOC-1 details

AOC-2: Blast Radius Limits

Even when individual actions pass policy, their cumulative scope can be disproportionate. Behavry evaluates every action against configurable blast radius thresholds — shallow deletes are denied outright, while bulk operations, mass messaging, config path writes, and protected file access trigger HITL escalation. Exception hardening prevents policy erosion through frequency monitoring and baseline poisoning detection.

AOC-2 details

AOC-3: Requester Identity

When an agent acts, it often does so on behalf of a human user. If the identity of the actual requester is not propagated, a compromised agent can masquerade as any user. Behavry adds X-Requester-Id to every action, alerts on mismatches, and verifies requester claims cryptographically through delegation token chains (Phase 3, Sprint U).

AOC-3 details

AOC-4: Cross-Session Trust Reset Detection

An agent that consistently operated safely may suddenly reverse its behavior across sessions. The Trust Reset Detector tracks action dispositions (allow, block, escalate) per agent and fires alerts when it detects cross-session behavior reversals, requester session cycling, or unexpected workflow participants. Workflow Behavioral Baselines use EWMA to track normal operating patterns and flag anomalies in tool distribution, delegation depth, session duration, and participant sets.

AOC-4 details

Red Team Policy Automation Loop

The Policy Generator (Sprint RT) is a cross-cutting adaptive defense layer that subscribes to the event bus and watches for security events from all four AOCs. When it detects a repeated attack pattern, it automatically generates a candidate OPA policy to block it.

How it works:

Security events arrive from AOC detectors (injection blocked, drift detected, behavior reversal, session cycling)
The generator computes a normalized pattern signature for deduplication
Confidence scoring evaluates pattern frequency, severity, and corroboration across multiple sources
A candidate Rego policy is generated from templates and proposed for admin review
If the tenant has auto-activation enabled and confidence exceeds the threshold, the policy is activated immediately

Candidate policies are managed through the dashboard's Policy Suggestions page, where admins can review, edit, test, approve, or reject each candidate.

Why agents are different​

The AOC threat model​

AOC-1: Inbound Injection​

AOC-2: Blast Radius Limits​

AOC-3: Requester Identity​

AOC-4: Cross-Session Trust Reset Detection​

Red Team Policy Automation Loop​