Operational Runbooks

Step-by-step guides for the most common situations you'll encounter while running Behavry in production.

There's a pending escalation

An agent's tool call is on hold — waiting for your decision.

The proxy is holding the agent's HTTP connection. The agent is blocked until you approve or deny the call, or until the escalation timeout expires and the call is auto-denied.

Read the policy reason — Understand exactly which rule triggered the hold and why.
Look at the tool call detail — What tool, what action, what resource — is this expected for this agent?
Check the agent's risk tier — A critical-tier agent attempting an unusual action is more concerning than a low-tier one.
Approve if legitimate — The agent's request goes through. Add an Exception if you want to skip review for this exact action in future.
Deny if suspicious — The agent receives a policy denial error. Session stays active — only this call is rejected.

→ Go to Escalations in the dashboard.

An alert fired

Behavry detected anomalous behavior — decide whether it's real.

Alerts are informational. They don't block the agent. Your job is to decide whether to investigate further or dismiss the alert.

Check the severity — info = FYI. warning = investigate soon. critical = act now. DLP violations are always critical.
Read the behavioral context — What pattern changed? 3σ drift means a statistically unusual spike in volume, rate, or error rate.
Cross-reference Audit Events — Filter by agent + timeframe to see the exact calls that triggered the anomaly.
Acknowledge if investigating — Marks it as seen and removes it from the open inbox. Add a note for your team.
Resolve when done — Closes the alert. If it was a false positive, note that — it helps tune future thresholds.
Consider suspending the agent — If the behavior is genuinely malicious or runaway, suspend immediately — all sessions are revoked instantly.

→ Go to Alerts in the dashboard.
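To make the 3σ drift check above more concrete, here is a minimal sketch of how such a test could work. It is not Behavry's actual detector: the function name, the hourly-sample window, and the one-sided "spike" comparison are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_three_sigma_spike(history, current, threshold=3.0):
    """Return True if `current` sits more than `threshold` sample standard
    deviations above the mean of `history`.

    Hypothetical helper: the real baseline window and metrics are not
    described in this runbook.
    """
    if len(history) < 2:
        return False  # not enough samples to estimate a spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current > mu  # flat history: any increase counts as a spike
    return (current - mu) / sigma > threshold


# Illustrative data: tool calls per hour over the last eight hours.
calls_per_hour = [100, 104, 98, 101, 97, 103, 99, 102]
print(is_three_sigma_spike(calls_per_hour, 105))
print(is_three_sigma_spike(calls_per_hour, 150))
```

The same comparison applies to any metric that can drift, such as error rate or request rate, as long as the history and the current value use the same unit.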

An agent's risk tier jumped

The Behavry Risk Framework flagged a significant behavioral shift.

Risk tiers are recalculated continuously. A jump from low→high or high→critical means multiple behavioral dimensions shifted at once.

Open Behavioral Trends for this agent — Select the agent in the dropdown — look for spikes in deny rate, escalation rate, or call volume.
Check recent alerts — A tier jump is often preceded by a series of anomaly alerts. Look at the last 24–48 hours.
Review recent audit events — Filter by agent — look for unusual tool names, high-frequency bursts, or repeated denials.
Tighten the policy if needed — Add a more restrictive policy or remove broad roles from the agent while you investigate.
Suspend if risk is unacceptable — Agents page → agent → Suspend. Instant session revocation, zero downtime for other agents.

→ Go to Risk in the dashboard.
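When reviewing audit events for a tier jump, it can help to reduce an export to the three signals named above: call volume, deny rate, and escalation rate. The sketch below assumes each exported event is a dict with a `decision` field holding `allow`, `deny`, or `escalate`. That event shape is hypothetical, so adjust it to whatever your export actually contains.

```python
def summarize(events):
    """Summarize a list of audit events into volume and rate signals.

    Assumes each event is a dict with a "decision" key (hypothetical shape).
    """
    total = len(events)
    if total == 0:
        return {"calls": 0, "deny_rate": 0.0, "escalation_rate": 0.0}
    denies = sum(1 for e in events if e["decision"] == "deny")
    escalations = sum(1 for e in events if e["decision"] == "escalate")
    return {
        "calls": total,
        "deny_rate": denies / total,
        "escalation_rate": escalations / total,
    }


# Illustrative window: 6 allowed calls, 3 denials, 1 escalation.
window = (
    [{"decision": "allow"}] * 6
    + [{"decision": "deny"}] * 3
    + [{"decision": "escalate"}]
)
print(summarize(window))
```

Comparing the summary for the last hour against the summary for a quieter reference period makes a spike easier to spot than scanning raw events.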

Baseline drift was detected

The agent called a tool not in its registered manifest.

The proxy blocked the call. The agent has an approved baseline but attempted something outside its declared scope. This is enforcement working as intended.

Check the Audit Events — Find the deny event with policy_id = baseline.tool_drift. Note the tool name and resource.
Was the tool call expected? — If the agent legitimately needs this tool, update its baseline manifest and go through re-approval.
Was it unexpected? — Investigate how the agent got access to this tool invocation. Review recent code or config changes.
Update the baseline if legitimate — Agents → agent → Security Baseline → add tool → Submit for Review → Approve.
Suspend if the drift is malicious — Prompt injection or capability escalation. Suspend the agent and audit the session fully.

→ Go to Agents in the dashboard.
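If you export audit events for offline review, the first step above amounts to a simple filter. The sketch below uses the `policy_id` value `baseline.tool_drift` from this runbook. The other field names (`decision`, `agent`, `tool`) are assumptions about the export format, not a documented schema.

```python
from collections import Counter

DRIFT_POLICY_ID = "baseline.tool_drift"


def drift_denials(events):
    """Keep only deny events raised by the baseline drift policy."""
    return [
        e for e in events
        if e.get("decision") == "deny" and e.get("policy_id") == DRIFT_POLICY_ID
    ]


def drift_counts(events):
    """Count drift denials per (agent, tool) pair to show which tools recur."""
    return Counter((e["agent"], e["tool"]) for e in drift_denials(events))


# Illustrative export with hypothetical agent and tool names.
events = [
    {"agent": "billing-bot", "tool": "read_invoice", "decision": "allow", "policy_id": "p1"},
    {"agent": "billing-bot", "tool": "delete_user", "decision": "deny", "policy_id": DRIFT_POLICY_ID},
    {"agent": "billing-bot", "tool": "delete_user", "decision": "deny", "policy_id": DRIFT_POLICY_ID},
    {"agent": "billing-bot", "tool": "read_invoice", "decision": "deny", "policy_id": "p2"},
]
print(drift_counts(events))
```

A tool that appears repeatedly for one agent is a candidate for either a baseline update or a closer look at recent code and config changes.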

Risk tiers — what they mean and what to do

Tier	Escalation timeout	What it means	Recommended action
low	30 min → auto-denied	Normal, predictable behavior. Routine tool use within expected parameters.	No action needed. Monitor passively.
medium	30 min → auto-denied	Some behavioral deviation detected. May be legitimate increased workload or early signs of drift.	Check Behavioral Trends. No immediate action unless escalations or alerts are also firing.
high	15 min → auto-denied	Significant behavioral deviation. Multiple dimensions shifted. Warrants active review.	Investigate Audit Events and recent alerts. Consider tightening policy scope.
critical	5 min → auto-denied	Extreme behavioral deviation. Agent acting well outside its normal envelope.	Act immediately. Review all recent events. Suspend if behavior is unexplained.
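The escalation timeouts in the table translate directly into a response deadline. As a rough sketch, the helper below computes when a held call would be auto-denied and how many minutes remain. The timeout values come from the table above, but the function names and the idea of tracking a `held_at` timestamp are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Minutes before a held call is auto-denied, per risk tier (from the table above).
ESCALATION_TIMEOUT_MINUTES = {"low": 30, "medium": 30, "high": 15, "critical": 5}


def auto_deny_deadline(tier, held_at):
    """Return the moment a call held at `held_at` would be auto-denied."""
    return held_at + timedelta(minutes=ESCALATION_TIMEOUT_MINUTES[tier])


def minutes_remaining(tier, held_at, now):
    """Return minutes left to decide, never less than zero."""
    remaining = auto_deny_deadline(tier, held_at) - now
    return max(remaining.total_seconds() / 60, 0.0)


held_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
now = held_at + timedelta(minutes=10)
print(auto_deny_deadline("critical", held_at))
print(minutes_remaining("high", held_at, now))
```

Note that the higher the tier, the shorter the window: a critical-tier escalation leaves only five minutes to respond before the call is denied.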

Policy decisions — what each one means

Decision	What happens
allow	Tool call passes all checks. Forwarded to the target MCP server. Logged.
deny	Tool call blocked. Agent receives an error. Session stays active. Logged.
escalate	Tool call held. Agent connection paused. Admin must approve or deny before the agent continues.

Default deny: If no policy rule matches a tool call, it is denied. Behavry never allows by default.

Fail-closed: If the policy engine is unreachable, all calls are denied. Nothing passes through silently.

Exceptions: When you approve an escalation, you can create an exception. From then on, that exact (agent, action, resource) combination bypasses escalation and is allowed directly.
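The three rules above (default deny, fail-closed, exceptions) can be combined into one decision function. The following is a conceptual sketch only, not Behavry's implementation. The `evaluate` callable, the use of `ConnectionError` to signal an unreachable policy engine, and the dict-shaped `call` are all assumptions.

```python
VALID_DECISIONS = {"allow", "deny", "escalate"}


def decide(call, evaluate, exceptions):
    """Resolve the final decision for a tool call.

    call:       dict with "agent", "action", "resource" keys (hypothetical shape)
    evaluate:   callable returning "allow", "deny", "escalate", or None when
                no rule matches; raises ConnectionError if the engine is down
    exceptions: set of (agent, action, resource) tuples approved earlier
    """
    try:
        decision = evaluate(call)
    except ConnectionError:
        return "deny"  # fail-closed: engine unreachable
    if decision not in VALID_DECISIONS:
        return "deny"  # default deny: no matching rule or unknown result
    key = (call["agent"], call["action"], call["resource"])
    if decision == "escalate" and key in exceptions:
        return "allow"  # an exact-match exception skips the hold
    return decision


call = {"agent": "billing-bot", "action": "write", "resource": "invoices/42"}
approved = {("billing-bot", "write", "invoices/42")}
print(decide(call, lambda c: "escalate", approved))
print(decide(call, lambda c: None, approved))
```

In this sketch an exception only overrides an `escalate` result. A matching `deny` rule still denies the call, which matches the description of exceptions as a bypass for escalation rather than a general allow-list.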
