Context Window Governance

Feature row 7 — Sprint CG

Context Window Governance is included on the Professional and Enterprise plans.

What this is

Context Window Governance (internally "Context Gate") intercepts every MCP tools/list response that flows through the Behavry proxy and filters it against OPA policy before it reaches the agent. Tools can be made visible (pass through unchanged), trimmed (schema compressed to shrink the token footprint), or hidden (removed from the list entirely).

This exists because unbounded tools/list responses routinely eat 30–60% of an agent's context window on tools it never calls. Context Gate gives tenants policy-level control over which tools each agent sees, enforces a per-agent token budget, and flags tools that have gone unused so they can be retired.

How it works

When an MCP server responds to tools/list, the proxy runs the response through a four-stage pipeline:

Tool classification — backend/behavry/proxy/tool_classifier.py maps each tool to a verb (read / write / delete / network / admin) and a category used by policy.
Policy evaluation — the classified list is posted to OPA (behavry.context_gate). The policy returns a decision per tool: visible, trimmed, or hidden, with an optional reason.
Schema compression — tools marked trimmed run through backend/behavry/proxy/schema_compressor.py, which rewrites the JSON Schema in one of three modes: full (no change), compact (strip descriptions and defaults), or minimal (parameters only, no metadata).
Token accounting — the filtered list is counted via backend/behavry/proxy/token_counter.py and compared against the per-agent budget stored in TenantConfig. Agents over budget get the most expensive trimmed tools hidden until they fit.
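The budget step in the last stage can be sketched as follows. This is a minimal illustration, not the code in token_counter.py: the `FilteredTool` record, the `estimate_tokens` heuristic (roughly four characters per token), and the `enforce_budget` name are all assumptions made for the example.

```python
import json
from dataclasses import dataclass


@dataclass
class FilteredTool:
    """Hypothetical record for one tool after policy evaluation."""
    name: str
    verdict: str  # "visible", "trimmed", or "hidden"
    schema: dict


def estimate_tokens(tool: FilteredTool) -> int:
    # Rough stand-in for a real tokenizer: about 4 characters per token.
    return max(1, len(json.dumps(tool.schema)) // 4)


def enforce_budget(tools: list[FilteredTool], budget: int) -> list[FilteredTool]:
    """Hide the most expensive trimmed tools until the list fits the budget.

    Visible tools are never hidden here, so the result can still exceed
    the budget if the visible tools alone are too large.
    """
    shown = [t for t in tools if t.verdict != "hidden"]
    total = sum(estimate_tokens(t) for t in shown)

    # Walk trimmed tools from most to least expensive.
    candidates = sorted(
        (t for t in shown if t.verdict == "trimmed"),
        key=estimate_tokens,
        reverse=True,
    )
    for tool in candidates:
        if total <= budget:
            break
        total -= estimate_tokens(tool)
        tool.verdict = "hidden"  # note: mutates the input record

    return [t for t in tools if t.verdict != "hidden"]
```

Sorting by cost first means the fewest tools are removed to get under budget, at the price of always dropping the largest schemas first.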

Every decision is written as a context_gate.filter event to the audit log, tagged with original token count, filtered token count, and the per-tool verdict.
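As an illustration of what such an event might carry, here is a small sketch. The field names (`agent_id`, `original_tokens`, `filtered_tokens`, `verdicts`) and the derived `savings_pct` value are assumptions for the example; the actual audit schema may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class FilterEvent:
    """Hypothetical shape of a context_gate.filter audit event."""
    agent_id: str
    original_tokens: int
    filtered_tokens: int
    verdicts: dict[str, str] = field(default_factory=dict)  # tool name -> verdict
    event_type: str = "context_gate.filter"

    def savings_pct(self) -> float:
        # Percentage of tokens removed by filtering; 0.0 for an empty list.
        if self.original_tokens == 0:
            return 0.0
        saved = self.original_tokens - self.filtered_tokens
        return round(100.0 * saved / self.original_tokens, 1)

    def to_json(self) -> str:
        record = asdict(self)
        record["savings_pct"] = self.savings_pct()
        return json.dumps(record, sort_keys=True)


event = FilterEvent(
    agent_id="agent-1",
    original_tokens=12000,
    filtered_tokens=4800,
    verdicts={"read_file": "visible", "list_users": "trimmed", "drop_table": "hidden"},
)
print(event.to_json())
```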

Tool verdicts

Verdict	What the agent sees	When to use
`visible`	Full, untouched schema	Core tools the agent relies on
`trimmed`	Schema with descriptions / defaults stripped	Tools the agent uses occasionally — cut tokens without breaking callability
`hidden`	Tool is not in the list at all	Dangerous or unused tools that shouldn't be reachable
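To make the `trimmed` verdict concrete, the sketch below shows one way the three compression modes could behave on a JSON Schema. It is not the code in schema_compressor.py: the function names are hypothetical, and the set of keys dropped in minimal mode is an assumption about what "no metadata" means.

```python
import copy

# Assumed annotation keywords removed in "minimal" mode.
METADATA_KEYS = {"description", "default", "title", "examples", "$comment"}
COMPACT_KEYS = {"description", "default"}


def compress_schema(schema: dict, mode: str = "compact") -> dict:
    """Return a compressed copy of a JSON Schema; the input is not modified."""
    if mode == "full":
        return copy.deepcopy(schema)
    if mode == "compact":
        return _strip(schema, COMPACT_KEYS)
    if mode == "minimal":
        return _strip(schema, METADATA_KEYS)
    raise ValueError(f"unknown compression mode: {mode!r}")


def _strip(node, drop):
    if isinstance(node, list):
        return [_strip(item, drop) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if key in drop:
            continue
        if key == "properties" and isinstance(value, dict):
            # Keys under "properties" are parameter names, not schema
            # keywords, so a parameter called "description" must survive.
            out[key] = {name: _strip(sub, drop) for name, sub in value.items()}
        else:
            out[key] = _strip(value, drop)
    return out
```

Because types, `required`, and parameter names are left intact in every mode, the agent can still build a valid call for a trimmed tool; it only loses the explanatory text.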

API endpoints

All endpoints live under /api/v1/context-gate, require admin authentication, and are defined in backend/behavry/proxy/context_gate_routes.py.

Method	Path	Purpose
`GET`	`/summary`	Tenant-wide filter metrics: visible / trimmed / hidden counts, original vs filtered token totals, savings percentage
`GET`	`/by-server`	Breakdown of the same metrics per MCP server
`GET`	`/by-agent`	Per-agent schema load (current tools, token footprint, budget usage)
`GET`	`/unused`	Tools that have not been called in the last 30 days — candidates for `hidden`
`POST`	`/simulate`	Dry-run a filter decision against a live `tools/list` response without mutating audit state
`GET`	`/config`	Tenant-level Context Gate settings
`PATCH`	`/config`	Update defaults: global compression mode, per-agent token budget, unused-tool threshold

All summary endpoints accept period=7d|30d|90d.
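A request to one of these endpoints might be built like this. The host name, the bearer-token header, and the `build_metrics_request` helper are assumptions for illustration; only the base path and the `period` values come from the reference above.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_PATH = "/api/v1/context-gate"
VALID_PERIODS = {"7d", "30d", "90d"}


def build_metrics_request(host: str, endpoint: str, period: str, token: str) -> Request:
    """Build (but do not send) a GET request for a metrics endpoint."""
    if period not in VALID_PERIODS:
        raise ValueError(f"period must be one of {sorted(VALID_PERIODS)}")
    query = urlencode({"period": period})
    url = f"{host.rstrip('/')}{BASE_PATH}/{endpoint.strip('/')}?{query}"
    # Bearer auth is an assumption; use whatever scheme admin auth requires.
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


req = build_metrics_request("https://behavry.example.com", "summary", "30d", "ADMIN_TOKEN")
print(req.get_method(), req.full_url)
```

The request can then be sent with `urllib.request.urlopen(req)` or translated to any other HTTP client.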

Writing policy

The filter decision is made by a Rego policy: either the in-tree policies/context_gate.rego or a tenant-authored override. A minimal example:

package behavry.context_gate

import rego.v1

default verdict := {"decision": "visible", "reason": ""}

# The rules are chained with else so that a tool matching both conditions
# gets exactly one verdict. Two independent complete rules with different
# values would raise a conflict error in OPA.

# Hide any tool that writes to the filesystem for read-only agents
verdict := {"decision": "hidden", "reason": "read-only agent"} if {
    input.agent.role == "read_only"
    input.tool.verb == "write"
} else := {"decision": "trimmed", "reason": "admin tool compression"} if {
    # Otherwise, trim anything classified as "admin" for non-privileged agents
    input.tool.category == "admin"
    not input.agent.is_privileged
}

The policy input is the per-tool record emitted by the classifier: {tool: {name, verb, category, server, schema}, agent: {id, role, type, is_privileged}}.
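When testing a policy outside the proxy, the same input can be sent to OPA's Data API by hand. The sketch below builds that request for a single tool; the helper names and the localhost URL are assumptions, and the proxy itself may batch tools differently. The rule path follows from the package name and the `verdict` rule in the example policy.

```python
import json
from urllib.request import Request

# OPA Data API path for the `verdict` rule in package behavry.context_gate.
OPA_PATH = "/v1/data/behavry/context_gate/verdict"

TOOL_FIELDS = ("name", "verb", "category", "server", "schema")
AGENT_FIELDS = ("id", "role", "type", "is_privileged")


def build_opa_input(tool: dict, agent: dict) -> dict:
    """Assemble the per-tool policy input; raises KeyError on a missing field."""
    return {
        "tool": {key: tool[key] for key in TOOL_FIELDS},
        "agent": {key: agent[key] for key in AGENT_FIELDS},
    }


def build_opa_request(opa_url: str, tool: dict, agent: dict) -> Request:
    """Build (but do not send) a POST request evaluating the verdict rule."""
    body = json.dumps({"input": build_opa_input(tool, agent)}).encode("utf-8")
    return Request(
        opa_url.rstrip("/") + OPA_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


tool = {"name": "write_file", "verb": "write", "category": "filesystem",
        "server": "fs-server", "schema": {"type": "object"}}
agent = {"id": "agent-1", "role": "read_only", "type": "assistant",
         "is_privileged": False}
req = build_opa_request("http://localhost:8181", tool, agent)
print(req.full_url)
```

OPA wraps the rule's value in a `result` key, so a response to this request would have the form `{"result": {"decision": ..., "reason": ...}}`.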

Dashboard

The Context Gate page (Policies → Context Gate) shows:

Tenant savings summary (tokens saved, % reduction)
Per-server breakdown
Per-agent schema load leaderboard (most expensive agents first)
Unused-tool list with one-click "Hide" action
Policy simulation console

Related

Policy Engine — the OPA substrate Context Gate rides on
Cost Attribution — pairs with Context Gate to quantify dollar savings
Restricted Mode — another enforcement mode that interacts with tool visibility
