Troubleshooting

Startup Issues

`TimescaleDB extension not found`

Symptom:

Could not enable TimescaleDB (may not be installed)

Cause: The PostgreSQL container is running an image that does not include TimescaleDB.

Fix: Use timescale/timescaledb:latest-pg16 (not plain postgres:16):

# docker-compose.yml
db:
  image: timescale/timescaledb:latest-pg16

Audit events will still be written to a plain table without the hypertable. Functionality is preserved, but time-series queries and compression won't work.

`OPA connection refused` / `Base policy push failed`

Symptom:

Base policy push failed: httpx.ConnectError

Cause: OPA container isn't ready yet, or BEHAVRY_OPA_URL is misconfigured.

Fix:

Check OPA is running: docker ps | grep opa
Test directly: curl http://localhost:8181/health
Ensure BEHAVRY_OPA_URL=http://localhost:8181 (local dev) or http://opa:8181 (Docker Compose)
OPA must start before the backend. Check depends_on + health check in docker-compose.yml

With BEHAVRY_OPA_FAIL_CLOSED=true (default): all tool calls will be denied until OPA is healthy. The app will still start, but agents can't do anything.
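The readiness checks above can also be scripted, for example in a startup or smoke-test script. The following is a minimal sketch, assuming OPA answers on its /health endpoint at the URL shown above. The function name wait_for_opa, its retry parameters, and the probe hook are illustrative and not part of Behavry.

```python
import time
import urllib.error
import urllib.request


def wait_for_opa(base_url="http://localhost:8181", attempts=10, delay=1.0, probe=None):
    """Poll OPA's /health endpoint until it answers 200 or attempts run out.

    `probe` is a hypothetical hook for testing: a callable that takes a URL
    and returns an HTTP status code. By default a real HTTP GET is used.
    """
    def http_probe(url):
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status

    probe = probe or http_probe
    url = base_url.rstrip("/") + "/health"
    for attempt in range(1, attempts + 1):
        try:
            if probe(url) == 200:
                return True
        except (urllib.error.URLError, OSError):
            pass  # OPA is not reachable yet; retry after a pause
        if attempt < attempts:
            time.sleep(delay)
    return False
```

Calling wait_for_opa("http://opa:8181") from inside the Compose network returns True once OPA responds, or False after the attempts are used up.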

`BEHAVRY_ADMIN_PASSWORD must be set in production`

Cause: BEHAVRY_ENV=production but no admin password configured.

Fix: Set BEHAVRY_ADMIN_PASSWORD to a strong password.

`JWT keys not configured` warning on startup

Symptom:

JWT keys not configured — auto-generating for development

This is normal in development. Keys are regenerated on each restart, so sessions do not persist across restarts in dev.

For production: set BEHAVRY_JWT_PRIVATE_KEY and BEHAVRY_JWT_PUBLIC_KEY.

Authentication Issues

`401 Unauthorized` from the dashboard

Symptom: All API calls return 401. You're logged in but can't see data.

Causes and fixes:

Token expired: Log out and log back in.
Wrong localStorage key: Check browser DevTools → Application → Local Storage for behavry_admin_token.
JWT keys rotated: If the server restarted in dev mode, old tokens are invalid. Log out and back in.
Clerk mode mismatch: Ensure VITE_CLERK_PUBLISHABLE_KEY is set in dashboard/.env if using Clerk mode.

Agent token rejected: `Invalid or expired agent token`

Causes:

Token expired: re-authenticate with POST /api/v1/auth/token
Agent is suspended (status: suspended): reactivate via the Agents UI
Session revoked: create a new session

`Session revoked or expired`

The agent's session was explicitly revoked (via API) or the DB was reset. Re-authenticate.

CORS Errors

Symptom:

Access to fetch at 'http://localhost:8000/api/v1/...' from origin 'http://localhost:5173' has been blocked by CORS policy

Fix: Ensure BEHAVRY_CORS_ORIGINS_STR includes the dashboard origin:

# Development
BEHAVRY_CORS_ORIGINS_STR=http://localhost:3000,http://localhost:5173

# Production
BEHAVRY_CORS_ORIGINS_STR=https://your-dashboard.com

Restart the backend after changing env vars.
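To confirm that the origin reported in the browser error is actually covered by the configured value, a quick check can help. This is a minimal sketch, assuming the variable holds a comma-separated list as in the examples above. How the backend itself parses the value is not shown here, and the helper names parse_origins and origin_allowed are illustrative.

```python
import os


def parse_origins(raw):
    """Split a comma-separated origin list, dropping blanks and trailing slashes."""
    return [o.strip().rstrip("/") for o in raw.split(",") if o.strip()]


def origin_allowed(origin, raw=None):
    """Return True if `origin` appears in the configured origin list.

    When `raw` is omitted, the value is read from BEHAVRY_CORS_ORIGINS_STR.
    """
    if raw is None:
        raw = os.environ.get("BEHAVRY_CORS_ORIGINS_STR", "")
    return origin.rstrip("/") in parse_origins(raw)
```

Origins are compared as exact strings, so http://localhost:5173 and http://127.0.0.1:5173 count as different origins and each must be listed if both are used.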

OPA / Policy Issues

All tool calls denied with "No policy matched"

Cause: OPA started but policies weren't pushed.

Debug:

# Check what's in OPA
curl http://localhost:8181/v1/policies

# Expected: list including "policies/base/rbac.rego"
# If empty: policies/base/ directory was not found at startup

Fix: Ensure the policies/base/ directory exists relative to the backend (../policies/base from backend/behavry/main.py resolves to the repo root policies/base/).

# Run from the repository root
ls policies/base/
# Should show: rbac.rego  resource_access.rego

OPA returns `deny` for everything even with correct permissions

Cause: Agent JWT doesn't include the expected permissions.

Debug:

# Decode the agent JWT payload (the middle segment).
# The segment is base64url without padding, so base64 -d may report invalid input for some tokens.
echo "eyJ..." | cut -d. -f2 | base64 -d | python3 -m json.tool

Expected:

{"sub": "...", "roles": ["filesystem-reader"], "permissions": ["filesystem:read"], "risk_tier": "medium"}

If permissions is empty: the agent has no roles assigned. Assign a role via the Agents UI or POST /api/v1/agents/{id}/roles.
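JWT segments are base64url-encoded with the padding stripped, which some base64 tools reject. The following is a minimal sketch that restores the padding before decoding. The function name decode_jwt_payload is illustrative, and the signature is not verified, so use it only for inspection.

```python
import base64
import json


def decode_jwt_payload(token):
    """Return the claims of a JWT as a dict, without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))
```

For example, decode_jwt_payload(token).get("permissions") shows whether the permissions list in the agent's token is empty.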

Custom policy not taking effect

Check the policy status is active (not draft)
Check it was synced to OPA: curl http://localhost:8181/v1/policies

Test the policy directly against OPA:

curl -X POST http://localhost:8181/v1/data/behavry/authz \
  -H "Content-Type: application/json" \
  -d '{"input": {"agent": {"id": "test", "roles": [], "permissions": [], "risk_tier": "low"}, "request": {"tool_name": "write_file", "action": "write", "resource": "/tmp/test.txt", "parameters": {}, "mcp_server": "filesystem"}}}'
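The same query can be issued from a script when several inputs need to be compared. This is a minimal sketch that mirrors the field names of the curl example above. The helper names build_authz_input and query_opa are illustrative, and the shape of the decision returned under "result" depends on the policies loaded.

```python
import json
import urllib.request


def build_authz_input(tool_name, action, resource, mcp_server,
                      roles=(), permissions=(), risk_tier="low", agent_id="test"):
    """Build the input document in the shape used by the curl example above."""
    return {
        "input": {
            "agent": {"id": agent_id, "roles": list(roles),
                      "permissions": list(permissions), "risk_tier": risk_tier},
            "request": {"tool_name": tool_name, "action": action,
                        "resource": resource, "parameters": {},
                        "mcp_server": mcp_server},
        }
    }


def query_opa(body, url="http://localhost:8181/v1/data/behavry/authz"):
    """POST the input to OPA's Data API and return the decoded JSON response."""
    req = urllib.request.Request(
        url, data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)
```

Running query_opa(build_authz_input("write_file", "write", "/tmp/test.txt", "filesystem")) with and without the expected roles makes it easier to see which part of the input changes the decision.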

Database Issues

`asyncpg.exceptions.ConnectionDoesNotExistError`

Cause: The database connection dropped and the pool could not recover.

Fix: Restart the backend. For production, ensure the DB is on a stable network connection and that the pool limits set by BEHAVRY_DB_POOL_SIZE and BEHAVRY_DB_MAX_OVERFLOW are not being exhausted.

`Could not create hypertable` warning

Cause: TimescaleDB extension not installed, or hypertable already exists.

Fix: Usually benign. The warning appears on subsequent restarts because the hypertable already exists. If it's the first run and TimescaleDB is not installed, audit events are written to a plain table (functional but not optimized).

Dashboard Issues

Blank screen / 404 on page refresh

Cause: Nginx isn't configured to serve the React SPA for all routes (history API fallback missing).

Fix: Ensure the nginx config includes:

location / {
    try_files $uri $uri/ /index.html;
}

SSE stream disconnects frequently

Cause: Proxy or load balancer buffering or closing idle connections.

Fix:

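Increase the proxy timeout, for example in nginx: proxy_read_timeout 3600s;
Disable response buffering on the SSE route, for example in nginx: proxy_buffering off;
Decrease BEHAVRY_SSE_KEEPALIVE_SECONDS so keepalive messages are sent more often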

Dark mode not persisting

Cause: localStorage is unavailable or behavry_theme key is missing.

Fix: Open DevTools → Application → Local Storage → verify behavry_theme is set to dark or light. If missing, toggle once via the sidebar toggle.

Escalations Not Resolving

Agent stuck waiting, admin approved but agent didn't get the response

Cause: The backend was restarted between when the escalation was created and when it was approved. The asyncio.Future is in-memory and lost on restart.

Symptoms: Escalation shows approved in the database but the agent got a timeout error.

Fix: The agent will eventually get a timeout error (the timeout is configured by risk tier). This is a known limitation of the in-memory Future approach. A planned improvement is to replace it with Redis pub/sub for durability across restarts.

Workaround for now: When this occurs, have agents retry the failed operation after the escalation is approved.
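On the agent side, the retry can be wrapped in a small helper. This is a minimal sketch under stated assumptions: EscalationTimeout is a hypothetical exception standing in for whatever error the agent client raises on an escalation timeout, and call_with_retry is an illustrative name. Whether a retried call reuses the earlier approval or opens a new escalation depends on the backend and is not covered here.

```python
import time


class EscalationTimeout(Exception):
    """Hypothetical error raised by an agent client when an escalation wait times out."""


def call_with_retry(operation, retries=2, delay=5.0):
    """Run `operation`, retrying when it fails with EscalationTimeout.

    The last failure is re-raised once `retries` extra attempts are used up.
    """
    for attempt in range(retries + 1):
        try:
            return operation()
        except EscalationTimeout:
            if attempt == retries:
                raise
            time.sleep(delay)  # give the admin approval time to land
```

Keeping the retry count small avoids flooding the escalation queue if the operation is being denied rather than lost.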

Webhook Delivery Failures

Webhook not receiving events

Check BEHAVRY_WEBHOOK_URL is set correctly
Check BEHAVRY_WEBHOOK_MIN_SEVERITY: the default is high, so alerts with lower severity won't be sent
Check backend logs for Webhook delivery failed
Test the endpoint manually: curl -X POST <webhook_url> -d '{"test": true}'
Verify HMAC: the X-Behavry-Signature header should match hmac-sha256(secret, body)
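On the receiving side, the signature check from the last item can be sketched as follows. This assumes the X-Behavry-Signature header carries the hex-encoded HMAC-SHA256 digest of the raw request body with no prefix; the exact encoding is an assumption and should be confirmed against a real delivery. The function names are illustrative.

```python
import hashlib
import hmac


def expected_signature(secret, body):
    """Compute the hex HMAC-SHA256 of the raw request body (bytes)."""
    return hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()


def signature_matches(secret, body, header_value):
    """Compare the received header to the expected digest in constant time."""
    return hmac.compare_digest(expected_signature(secret, body), header_value.strip())
```

The digest must be computed over the exact bytes received. Re-serializing the parsed JSON before hashing usually changes whitespace or key order and makes the comparison fail.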
