Operational Runbooks
Step-by-step guides for the most common situations you'll encounter while running Behavry in production.
There's a pending escalation
An agent's tool call is on hold — waiting for your decision.
The proxy is holding the agent's HTTP connection. The agent is blocked until you act or the timeout expires.
- Read the policy reason — Understand exactly which rule triggered the hold and why.
- Look at the tool call detail — What tool, what action, what resource — is this expected for this agent?
- Check the agent's risk tier — A critical-tier agent attempting an unusual action is more concerning than a low-tier one.
- Approve if legitimate — The agent's request goes through. Add an Exception if you want to skip review for this exact action in future.
- Deny if suspicious — The agent receives a policy denial error. Session stays active — only this call is rejected.
→ Go to Escalations in the dashboard.
An alert fired
Behavry detected anomalous behavior — decide whether it's real.
Alerts are informational. They don't block the agent. Your job is to decide whether to investigate further or dismiss.
- Check the severity —
info= FYI.warning= investigate soon.critical= act now. DLP violations are always critical. - Read the behavioral context — What pattern changed? 3σ drift means a statistically unusual spike in volume, rate, or error rate.
- Cross-reference Audit Events — Filter by agent + timeframe to see the exact calls that triggered the anomaly.
- Acknowledge if investigating — Marks it as seen and removes it from the open inbox. Add a note for your team.
- Resolve when done — Closes the alert. If it was a false positive, note that — it helps tune future thresholds.
- Consider suspending the agent — If the behavior is genuinely malicious or runaway, suspend immediately — all sessions are revoked instantly.
→ Go to Alerts in the dashboard.
An agent's risk tier jumped
The Behavry Risk Framework flagged a significant behavioral shift.
Risk tiers are recalculated continuously. A jump from low→high or high→critical means multiple behavioral dimensions shifted at once.
- Open Behavioral Trends for this agent — Select the agent in the dropdown — look for spikes in deny rate, escalation rate, or call volume.
- Check recent alerts — A tier jump is often preceded by a series of anomaly alerts. Look at the last 24–48 hours.
- Review recent audit events — Filter by agent — look for unusual tool names, high-frequency bursts, or repeated denials.
- Tighten the policy if needed — Add a more restrictive policy or remove broad roles from the agent while you investigate.
- Suspend if risk is unacceptable — Agents page → agent → Suspend. Instant session revocation, zero downtime for other agents.
→ Go to Risk in the dashboard.
Baseline drift was detected
The agent called a tool not in its registered manifest.
The proxy blocked the call. The agent has an approved baseline but attempted something outside its declared scope — this is the enforcement working as intended.
- Check the Audit Events — Find the deny event with
policy_id = baseline.tool_drift. Note the tool name and resource. - Was the tool call expected? — If the agent legitimately needs this tool, update its baseline manifest and go through re-approval.
- Was it unexpected? — Investigate how the agent got access to this tool invocation. Review recent code or config changes.
- Update the baseline if legitimate — Agents → agent → Security Baseline → add tool → Submit for Review → Approve.
- Suspend if the drift is malicious — Prompt injection or capability escalation. Suspend the agent and audit the session fully.
→ Go to Agents in the dashboard.
Risk tiers — what they mean and what to do
| Tier | Escalation timeout | What it means | Recommended action |
|---|---|---|---|
| low | 30 min → auto-denied | Normal, predictable behavior. Routine tool use within expected parameters. | No action needed. Monitor passively. |
| medium | 30 min → auto-denied | Some behavioral deviation detected. May be legitimate increased workload or early signs of drift. | Check Behavioral Trends. No immediate action unless escalations or alerts also firing. |
| high | 15 min → auto-denied | Significant behavioral deviation. Multiple dimensions shifted. Warrants active review. | Investigate Audit Events and recent alerts. Consider tightening policy scope. |
| critical | 5 min → auto-denied | Extreme behavioral deviation. Agent acting well outside its normal envelope. | Act immediately. Review all recent events. Suspend if behavior is unexplained. |
Policy decisions — what each one means
| Decision | What happens |
|---|---|
| allow | Tool call passes all checks. Forwarded to the target MCP server. Logged. |
| deny | Tool call blocked. Agent receives an error. Session stays active. Logged. |
| escalate | Tool call held. Agent connection paused. Admin must approve or deny before the agent continues. |
Default deny: If no policy rule matches a tool call, it is denied. Behavry never allows by default.
Fail-closed: If the policy engine is unreachable, all calls are denied. Nothing passes through silently.
Exceptions: When you approve an escalation, you can create an exception — that exact (agent, action, resource) combination bypasses escalation in future and is allowed directly.