Troubleshooting
Startup Issues
TimescaleDB extension not found
Symptom:
Could not enable TimescaleDB (may not be installed)
Cause: The PostgreSQL container was started from an image that doesn't include the TimescaleDB extension.
Fix: Use timescale/timescaledb:latest-pg16 (not plain postgres:16):
# docker-compose.yml
db:
  image: timescale/timescaledb:latest-pg16
Without the extension, audit events are still written to a plain table. Functionality is preserved, but hypertable-based time-series optimizations and compression are unavailable.
OPA connection refused / Base policy push failed
Symptom:
Base policy push failed: httpx.ConnectError
Cause: OPA container isn't ready yet, or BEHAVRY_OPA_URL is misconfigured.
Fix:
- Check OPA is running: docker ps | grep opa
- Test directly: curl http://localhost:8181/health
- Ensure BEHAVRY_OPA_URL=http://localhost:8181 (local dev) or http://opa:8181 (Docker Compose)
- OPA must start before the backend. Check depends_on + health check in docker-compose.yml
With BEHAVRY_OPA_FAIL_CLOSED=true (default): all tool calls will be denied until OPA is healthy. The app will still start, but agents can't do anything.
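The readiness check above can also be scripted, for example as a pre-start step. The following is a minimal sketch, not part of Behavry itself: it polls the same /health endpoint used in the curl test until it answers with HTTP 200. The function names, retry count, and delay are illustrative assumptions.

```python
import time
import urllib.request


def opa_is_healthy(base_url, timeout=2.0):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, connection refused, and socket timeouts
        # are all OSError subclasses.
        return False


def wait_for_opa(base_url, attempts=10, delay=1.0, probe=opa_is_healthy, sleep=time.sleep):
    """Poll the health endpoint until it succeeds or the attempts run out.

    `probe` and `sleep` are injectable so the loop can be exercised
    without a running OPA instance.
    """
    for attempt in range(1, attempts + 1):
        if probe(base_url):
            return True
        if attempt < attempts:
            sleep(delay)
    return False
```

Calling wait_for_opa("http://localhost:8181") locally, or with http://opa:8181 inside Docker Compose, returns False if OPA never became healthy, which usually points at the container or at BEHAVRY_OPA_URL.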
BEHAVRY_ADMIN_PASSWORD must be set in production
Cause: BEHAVRY_ENV=production but no admin password configured.
Fix: Set BEHAVRY_ADMIN_PASSWORD to a strong password.
JWT keys not configured warning on startup
Symptom:
JWT keys not configured — auto-generating for development
This is normal in development. Keys are regenerated on each restart, so sessions don't persist across restarts in dev.
For production: set BEHAVRY_JWT_PRIVATE_KEY and BEHAVRY_JWT_PUBLIC_KEY.
Authentication Issues
401 Unauthorized from the dashboard
Symptom: All API calls return 401. You're logged in but can't see data.
Causes and fixes:
- Token expired: Log out and log back in.
- Wrong localStorage key: Check browser DevTools → Application → Local Storage for behavry_admin_token.
- JWT keys rotated: If the server restarted in dev mode, old tokens are invalid. Log out and back in.
- Clerk mode mismatch: Ensure VITE_CLERK_PUBLISHABLE_KEY is set in dashboard/.env if using Clerk mode.
Agent token rejected: Invalid or expired agent token
Causes:
- Token expired: re-authenticate with POST /api/v1/auth/token
- Agent is suspended (status: suspended): reactivate via the Agents UI
- Session revoked: create a new session
Session revoked or expired
The agent's session was explicitly revoked (via API) or the DB was reset. Re-authenticate.
CORS Errors
Symptom:
Access to fetch at 'http://localhost:8000/api/v1/...' from origin 'http://localhost:5173' has been blocked by CORS policy
Fix: Ensure BEHAVRY_CORS_ORIGINS_STR includes the dashboard origin:
# Development
BEHAVRY_CORS_ORIGINS_STR=http://localhost:3000,http://localhost:5173
# Production
BEHAVRY_CORS_ORIGINS_STR=https://your-dashboard.com
Restart the backend after changing env vars.
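A quick way to sanity-check the value before restarting is to parse it the way a comma-separated origin list is usually parsed. This is a sketch under that assumption (the backend's actual parsing is not shown here), and the helper names are hypothetical. It flags entries that could never match a browser's Origin header, which carries only scheme, host, and port.

```python
from urllib.parse import urlsplit


def parse_origins(raw):
    """Split a comma-separated origin list, dropping blanks and whitespace."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def origin_problems(raw, dashboard_origin):
    """Return human-readable problems found in the configured origin list."""
    problems = []
    origins = parse_origins(raw)
    for entry in origins:
        parts = urlsplit(entry)
        if not parts.scheme or not parts.netloc:
            problems.append(f"{entry!r} is missing a scheme or host")
        elif parts.path or parts.query or parts.fragment:
            # An Origin header never includes a path, so "http://host/" cannot match.
            problems.append(f"{entry!r} has a path or trailing slash")
    if dashboard_origin not in origins:
        problems.append(f"{dashboard_origin!r} is not listed")
    return problems
```

For example, origin_problems("http://localhost:3000,http://localhost:5173", "http://localhost:5173") returns an empty list, while a value with a trailing slash or a missing port is reported.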
OPA / Policy Issues
All tool calls denied with "No policy matched"
Cause: OPA started but policies weren't pushed.
Debug:
# Check what's in OPA
curl http://localhost:8181/v1/policies
# Expected: list including "policies/base/rbac.rego"
# If empty: policies/base/ directory was not found at startup
Fix: Ensure the policies/base/ directory exists relative to the backend (../policies/base from backend/behavry/main.py resolves to the repo root policies/base/).
ls policies/base/   # run from the repo root
# Should show: rbac.rego resource_access.rego
OPA returns deny for everything even with correct permissions
Cause: Agent JWT doesn't include the expected permissions.
Debug:
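# Decode the agent JWT payload (the middle segment, base64url-encoded)
echo "eyJ..." | cut -d. -f2 | base64 -d | python3 -m json.tool
# If base64 reports invalid input, pad the segment with '=' to a multiple of 4 characters (and map -_ to +/)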
Expected:
{"sub": "...", "roles": ["filesystem-reader"], "permissions": ["filesystem:read"], "risk_tier": "medium"}
If permissions is empty: the agent has no roles assigned. Assign a role via the Agents UI or POST /api/v1/agents/{id}/roles.
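JWT segments are unpadded base64url, which some base64 tools reject, so decoding in Python can be more reliable. The sketch below only inspects the claims and does not verify the signature, so it is suitable for debugging only. The helper names are illustrative; the claim names follow the expected payload shown above.

```python
import base64
import json


def decode_jwt_payload(token):
    """Decode the middle (payload) segment of a JWT without verifying it."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    payload = parts[1]
    # Restore the '=' padding that JWT encoding strips.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def missing_permissions(token, required):
    """Return the required permissions that the token's claims do not grant."""
    claims = decode_jwt_payload(token)
    granted = set(claims.get("permissions", []))
    return sorted(set(required) - granted)
```

An empty result from missing_permissions(token, ["filesystem:read"]) means the token carries the permission, and the deny is more likely coming from the policy itself.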
Custom policy not taking effect
- Check the policy status is active (not draft)
- Check it was synced to OPA: curl http://localhost:8181/v1/policies
- Test the policy directly against OPA:
curl -X POST http://localhost:8181/v1/data/behavry/authz \
-H "Content-Type: application/json" \
-d '{"input": {"agent": {"id": "test", "roles": [], "permissions": [], "risk_tier": "low"}, "request": {"tool_name": "write_file", "action": "write", "resource": "/tmp/test.txt", "parameters": {}, "mcp_server": "filesystem"}}}'
Database Issues
asyncpg.exceptions.ConnectionDoesNotExistError
Cause: Database connection dropped. The pool can't recover.
Fix: Restart the backend. For production, ensure the DB is on a stable network connection and that the connection pool sized by BEHAVRY_DB_POOL_SIZE / BEHAVRY_DB_MAX_OVERFLOW is not being exhausted.
Could not create hypertable warning
Cause: TimescaleDB extension not installed, or hypertable already exists.
Fix: Usually benign — the warning appears on subsequent restarts because the hypertable already exists. If it's the first run and TimescaleDB is not installed, audit events write to a plain table (functional but not optimized).
Dashboard Issues
Blank screen / 404 on page refresh
Cause: Nginx isn't configured to serve the React SPA for all routes (history API fallback missing).
Fix: Ensure the nginx config includes:
location / {
    try_files $uri $uri/ /index.html;
}
SSE stream disconnects frequently
Cause: Proxy or load balancer buffering or closing idle connections.
Fix:
- Increase proxy timeout: nginx proxy_read_timeout 3600s;
- Decrease BEHAVRY_SSE_KEEPALIVE_SECONDS so keepalive messages are sent more often and the connection is not treated as idle
Dark mode not persisting
Cause: localStorage is unavailable or behavry_theme key is missing.
Fix: Open DevTools → Application → Local Storage → verify behavry_theme is set to dark or light. If missing, toggle once via the sidebar toggle.
Escalations Not Resolving
Agent stuck waiting, admin approved but agent didn't get the response
Cause: The backend was restarted between when the escalation was created and when it was approved. The asyncio.Future is in-memory and lost on restart.
Symptoms: Escalation shows approved in the database but the agent got a timeout error.
Fix: The agent will eventually get a timeout error (the timeout is configured by risk tier). This is a known limitation of the in-memory asyncio.Future approach. Planned improvement: replace it with Redis pub/sub for durability across restarts.
Workaround for now: When this occurs, have agents retry the failed operation after the escalation has been approved.
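On the agent side, the retry can be a small wrapper around the tool call. This is a sketch only: EscalationTimeout is a hypothetical exception standing in for whatever error the agent's client raises on an escalation timeout, and the retry count and delay are arbitrary. Each retry may open a fresh escalation that needs its own approval.

```python
import time


class EscalationTimeout(Exception):
    """Hypothetical error raised by an agent client when an escalation times out."""


def call_with_retry(operation, retries=2, delay=5.0, sleep=time.sleep):
    """Run `operation`, retrying only when it fails with EscalationTimeout.

    Makes at most `retries + 1` attempts and re-raises the last timeout
    if none of them succeeds. Other exceptions propagate immediately.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return operation()
        except EscalationTimeout as exc:
            last_error = exc
            if attempt < retries:
                sleep(delay)
    raise last_error
```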
Webhook Delivery Failures
Webhook not receiving events
- Check BEHAVRY_WEBHOOK_URL is set correctly
- Check BEHAVRY_WEBHOOK_MIN_SEVERITY: default is high; alerts with lower severity won't be sent
- Check backend logs for Webhook delivery failed
- Test the endpoint manually: curl -X POST <webhook_url> -d '{"test": true}'
- Verify HMAC: the X-Behavry-Signature header should match hmac-sha256(secret, body)
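The HMAC check in the last item can be reproduced on the receiving side roughly as follows. This sketch assumes the signature is the hex digest of HMAC-SHA256 over the raw request body, optionally prefixed with "sha256="; the exact encoding of X-Behavry-Signature is not specified here, so adjust it to what the backend actually sends. The function names are illustrative.

```python
import hashlib
import hmac


def expected_signature(secret, body):
    """Hex-encoded HMAC-SHA256 of the raw request body (bytes)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def signature_matches(secret, body, header_value):
    """Compare the received header value with the locally computed signature."""
    candidate = header_value.strip().lower()
    # Assumption: tolerate an optional "sha256=" prefix on the header value.
    if candidate.startswith("sha256="):
        candidate = candidate[len("sha256="):]
    expected = expected_signature(secret, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), candidate.encode("utf-8"))
```

Compute the signature over the exact bytes received, before any JSON parsing or re-serialization, since a byte-level difference in the body changes the digest.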