Context Window Governance
Context Window Governance is included on the Professional and Enterprise plans.
What this is
Context Window Governance (internally "Context Gate") intercepts every MCP tools/list response that flows through the Behavry proxy and filters it against OPA policy before it reaches the agent. Tools can be made visible (pass through unchanged), trimmed (schema compressed to shrink the token footprint), or hidden (removed from the list entirely).
This exists because unbounded tools/list responses routinely eat 30–60% of an agent's context window on tools it never calls. Context Gate gives tenants policy-level control over which tools each agent sees, enforces a per-agent token budget, and flags tools that have gone unused so they can be retired.
How it works
When an MCP server responds to tools/list, the proxy runs the response through a four-stage pipeline:
- Tool classification: backend/behavry/proxy/tool_classifier.py maps each tool to a verb (read / write / delete / network / admin) and a category used by policy.
- Policy evaluation: the classified list is posted to OPA (behavry.context_gate). The policy returns a decision per tool: visible, trimmed, or hidden, with an optional reason.
- Schema compression: tools marked trimmed run through backend/behavry/proxy/schema_compressor.py, which rewrites the JSON Schema in one of three modes: full (no change), compact (strip descriptions and defaults), or minimal (parameters only, no metadata).
- Token accounting: the filtered list is counted via backend/behavry/proxy/token_counter.py and compared against the per-agent budget stored in TenantConfig. Agents over budget get the most expensive trimmed tools hidden until they fit.
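To make the three compression modes concrete, here is a minimal sketch of what such a rewrite could look like. It is not the code in schema_compressor.py: the function names are hypothetical, the tool shape follows the MCP convention of name / description / inputSchema, and the exact fields kept in minimal mode are an assumption.

```python
from copy import deepcopy
from typing import Any


def _strip_keys(node: Any, drop: set[str]) -> Any:
    """Recursively remove the given annotation keywords from a JSON Schema fragment."""
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key == "properties" and isinstance(value, dict):
                # Keys under "properties" are parameter names, not keywords,
                # so a parameter called "description" must survive.
                out[key] = {name: _strip_keys(sub, drop) for name, sub in value.items()}
            elif key not in drop:
                out[key] = _strip_keys(value, drop)
        return out
    if isinstance(node, list):
        return [_strip_keys(item, drop) for item in node]
    return node


def compress_tool(tool: dict, mode: str) -> dict:
    """Hypothetical helper: return a copy of `tool` compressed per `mode`."""
    if mode == "full":
        return deepcopy(tool)  # no change
    schema = tool.get("inputSchema", {})
    if mode == "compact":
        out = deepcopy(tool)
        out["inputSchema"] = _strip_keys(schema, {"description", "default"})
        return out
    if mode == "minimal":
        # Assumption: "parameters only" means names, types, and the required list.
        props = schema.get("properties", {})
        return {
            "name": tool["name"],
            "inputSchema": {
                "type": "object",
                "properties": {
                    name: ({"type": p["type"]} if "type" in p else {})
                    for name, p in props.items()
                },
                "required": list(schema.get("required", [])),
            },
        }
    raise ValueError(f"unknown compression mode: {mode}")
```

In this sketch every mode still returns the parameter names and the required list, which is what keeps a trimmed tool callable.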
Every decision is written as a context_gate.filter event to the audit log, tagged with original token count, filtered token count, and the per-tool verdict.
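The budget step and the audit record can be sketched together. This is an illustration under stated assumptions, not the shipped implementation: estimate_tokens is a rough characters-divided-by-four stand-in for the real counter in token_counter.py, and the event field names are hypothetical.

```python
import json


def estimate_tokens(tool: dict) -> int:
    """Stand-in token counter (assumption): roughly four characters per token."""
    return len(json.dumps(tool, separators=(",", ":"))) // 4


def enforce_budget(tools: list[dict], verdicts: dict[str, str], budget: int):
    """Hide the most expensive trimmed tools until the list fits the budget.

    `verdicts` maps tool name to "visible", "trimmed", or "hidden".
    Returns (kept_tools, final_verdicts, filtered_token_total).
    """
    final = dict(verdicts)
    kept = [t for t in tools if final[t["name"]] != "hidden"]
    total = sum(estimate_tokens(t) for t in kept)
    # Only trimmed tools are eligible for eviction, largest first.
    trimmed = sorted(
        (t for t in kept if final[t["name"]] == "trimmed"),
        key=estimate_tokens,
        reverse=True,
    )
    for tool in trimmed:
        if total <= budget:
            break
        final[tool["name"]] = "hidden"
        total -= estimate_tokens(tool)
    kept = [t for t in kept if final[t["name"]] != "hidden"]
    return kept, final, total


def build_filter_event(original_tokens: int, filtered_tokens: int, verdicts: dict[str, str]) -> dict:
    """Shape of a context_gate.filter audit record; field names are hypothetical."""
    return {
        "event": "context_gate.filter",
        "original_tokens": original_tokens,
        "filtered_tokens": filtered_tokens,
        "verdicts": verdicts,
    }
```

Note that in this sketch visible tools are never evicted, so an agent whose visible tools alone exceed the budget stays over budget. Whether the real proxy behaves the same way is not stated here.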
Tool verdicts
| Verdict | What the agent sees | When to use |
|---|---|---|
| visible | Full, untouched schema | Core tools the agent relies on |
| trimmed | Schema with descriptions / defaults stripped | Tools the agent uses occasionally; cuts tokens without breaking callability |
| hidden | Tool is not in the list at all | Dangerous or unused tools that shouldn't be reachable |
API endpoints
All endpoints live under /api/v1/context-gate and require admin authentication (backend/behavry/proxy/context_gate_routes.py).
| Method | Path | Purpose |
|---|---|---|
| GET | /summary | Tenant-wide filter metrics: visible / trimmed / hidden counts, original vs filtered token totals, savings percentage |
| GET | /by-server | Breakdown of the same metrics per MCP server |
| GET | /by-agent | Per-agent schema load (current tools, token footprint, budget usage) |
| GET | /unused | Tools that have not been called in the last 30 days; candidates for hidden |
| POST | /simulate | Dry-run a filter decision against a live tools/list response without mutating audit state |
| GET | /config | Tenant-level Context Gate settings |
| PATCH | /config | Update defaults: global compression mode, per-agent token budget, unused-tool threshold |
All summary endpoints accept period=7d|30d|90d.
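As an illustration of calling these endpoints, the sketch below builds a request for one of the summary endpoints using only the Python standard library. The base URL, the bearer-token header, and the assumption that period is passed as a query parameter are all hypothetical; check your deployment's authentication scheme before relying on it.

```python
from urllib.parse import urlencode
from urllib.request import Request

ALLOWED_PERIODS = {"7d", "30d", "90d"}


def build_summary_request(
    base_url: str,
    token: str,
    endpoint: str = "summary",
    period: str = "30d",
) -> Request:
    """Build (but do not send) a GET request to /api/v1/context-gate/<endpoint>."""
    if period not in ALLOWED_PERIODS:
        raise ValueError(f"period must be one of {sorted(ALLOWED_PERIODS)}")
    query = urlencode({"period": period})
    url = f"{base_url.rstrip('/')}/api/v1/context-gate/{endpoint}?{query}"
    # Assumption: admin authentication is a bearer token.
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")
```

Passing the returned object to urllib.request.urlopen sends the request; the response body can then be parsed with json.load.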
Writing policy
The filter decision is a Rego policy at policies/context_gate.rego (in-tree) or any tenant-authored override. Minimal example:
package behavry.context_gate
default verdict := {"decision": "visible", "reason": ""}
# Hide write-classified tools from read-only agents.
# The rules are chained with else so that a tool matching both conditions
# yields a single verdict instead of a conflicting-rule evaluation error.
verdict := {"decision": "hidden", "reason": "read-only agent"} if {
    input.agent.role == "read_only"
    input.tool.verb == "write"
} else := {"decision": "trimmed", "reason": "admin tool compression"} if {
    # Otherwise, trim anything categorized as "admin" for non-privileged agents
    input.tool.category == "admin"
    not input.agent.is_privileged
}
The policy input is the per-tool record emitted by the classifier: {tool: {name, verb, category, server, schema}, agent: {id, role, type, is_privileged}}.
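For testing a policy outside the proxy, the input record above can be assembled by hand and sent to a standalone OPA server through OPA's REST Data API, which accepts a JSON body of the form {"input": ...} at /v1/data/<package path>/<rule>. The sketch below only builds the request. The helper names and the OPA address are hypothetical, and how the Behavry proxy itself talks to OPA is not described here.

```python
import json
from urllib.request import Request


def build_policy_input(tool: dict, verb: str, category: str, server: str, agent: dict) -> dict:
    """Assemble the per-tool input record in the documented shape."""
    return {
        "tool": {
            "name": tool["name"],
            "verb": verb,
            "category": category,
            "server": server,
            "schema": tool.get("inputSchema", {}),
        },
        "agent": {
            "id": agent["id"],
            "role": agent["role"],
            "type": agent["type"],
            "is_privileged": agent["is_privileged"],
        },
    }


def build_opa_request(opa_url: str, policy_input: dict) -> Request:
    """Build a POST to OPA's Data API for the behavry.context_gate verdict rule."""
    body = json.dumps({"input": policy_input}).encode("utf-8")
    url = f"{opa_url.rstrip('/')}/v1/data/behavry/context_gate/verdict"
    return Request(url, data=body, headers={"Content-Type": "application/json"}, method="POST")
```

OPA replies with a JSON object whose result field holds the rule's value, here the {"decision": ..., "reason": ...} verdict.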
Dashboard
The Context Gate page (Policies → Context Gate) shows:
- Tenant savings summary (tokens saved, % reduction)
- Per-server breakdown
- Per-agent schema load leaderboard (most expensive agents first)
- Unused-tool list with one-click "Hide" action
- Policy simulation console
Related
- Policy Engine — the OPA substrate Context Gate rides on
- Cost Attribution — pairs with Context Gate to quantify dollar savings
- Restricted Mode — another enforcement mode that interacts with tool visibility