SIEM Integration
Behavry ships with native SIEM integration that forwards audit events from the proxy pipeline to your security operations stack. Every policy decision, DLP finding, behavioral alert, and escalation can be streamed to one or more SIEM destinations in near-real-time, giving SOC teams full visibility into AI agent activity without polling or manual export.
Why SIEM integration matters
- Compliance evidence -- Continuous, machine-readable audit delivery helps satisfy centralized log collection and review requirements such as SOC 2 CC7.2, NIST SP 800-53 AU-6, and ISO 27001 A.12.4.
- SOC visibility -- AI agent actions appear alongside your existing endpoint, network, and identity telemetry. Analysts can correlate agent behavior with other signals in a single pane of glass.
- Threat correlation -- SIEM rules can fire on Behavry event types such as INBOUND_INJECTION_DETECTED, BEHAVIOR_REVERSAL, or BLAST_RADIUS_ESCALATION, enabling automated playbooks that respond to AI-specific threats.
Supported destinations
| Destination | Type key | Transport | Auth method | Format |
|---|---|---|---|---|
| Splunk | splunk_hec | HTTPS (HEC) | Authorization: Splunk {token} | JSON (newline-delimited HEC events) |
| Microsoft Sentinel | sentinel | HTTPS (Log Analytics API) | HMAC-SHA256 SharedKey | JSON array |
| Google Chronicle | chronicle | HTTPS (UDM batchCreate) | Service account JWT (OAuth2) | UDM SecurityEvent |
| IBM QRadar | qradar | TLS syslog (port 6514) | Certificate-based | LEEF 2.0 inside RFC 5424 frames |
| Generic Syslog | syslog | TCP, UDP, or TLS | None / certificate | RFC 5424 |
| Custom Webhook | webhook | HTTPS | HMAC-SHA256 (X-Behavry-Signature) | JSON array |
All destinations support per-destination event filtering, configurable batch sizes, and independent retry policies.
Architecture
The SIEM pipeline is an extension of Behavry's internal async event bus. It runs entirely in-process -- no external queues or message brokers are required.
Event Bus (all BehavryEvents)
|
v
SIEMDispatcher (wildcard subscriber)
| - converts BehavryEvent -> EventMetadata (no raw payload)
| - loads active destinations per tenant
| - evaluates Python-native event_filter per destination
| - enqueues matching events into per-destination asyncio.Queue (max 10,000)
|
+---> [Queue: dest-1] ---> SIEMBatchWorker ---> SplunkHECConnector
+---> [Queue: dest-2] ---> SIEMBatchWorker ---> SentinelConnector
+---> [Queue: dest-N] ---> SIEMBatchWorker ---> WebhookConnector
Each destination gets its own in-memory queue and background SIEMBatchWorker task. The worker flushes when the batch reaches batch_size events or when flush_interval_secs elapses -- whichever comes first. On graceful shutdown, remaining events in the batch are flushed before the task exits.
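The flush rule described above (size or interval, whichever comes first, plus a final flush on shutdown) can be sketched as a single asyncio task. This is an illustrative sketch, not Behavry's actual SIEMBatchWorker: the function name `batch_worker` and the `deliver` callback are hypothetical stand-ins for the worker and its connector.

```python
import asyncio
from typing import Awaitable, Callable, List


async def batch_worker(
    queue: asyncio.Queue,
    deliver: Callable[[List[dict]], Awaitable[None]],  # hypothetical connector callback
    batch_size: int = 100,
    flush_interval_secs: float = 30.0,
) -> None:
    """Drain the queue into batches; flush on size or when the interval elapses."""
    loop = asyncio.get_running_loop()
    batch: List[dict] = []
    deadline = loop.time() + flush_interval_secs
    try:
        while True:
            timeout = max(0.0, deadline - loop.time())
            try:
                batch.append(await asyncio.wait_for(queue.get(), timeout))
            except asyncio.TimeoutError:
                pass  # interval elapsed without a new event
            if len(batch) >= batch_size or loop.time() >= deadline:
                if batch:
                    pending, batch = batch, []
                    await deliver(pending)
                deadline = loop.time() + flush_interval_secs
    except asyncio.CancelledError:
        # Graceful shutdown: flush whatever is still buffered, then exit.
        if batch:
            await deliver(batch)
        raise
```

Cancelling the task is used here as the shutdown signal. A production worker would also need to decide what happens to a batch whose delivery is interrupted mid-flight, which this sketch does not address.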
Event filtering
Every destination can define an event_filter that controls which events are forwarded. Filtering is evaluated in Python (no OPA round-trip) for performance. A destination with no filter receives all events.
Supported filter fields:
| Field | Type | Description |
|---|---|---|
| min_severity | string | Minimum severity level: info, low, medium, high, critical. Events below this threshold are dropped. |
| event_types | string[] | Allowlist of event type strings (e.g., ["tool_call", "INBOUND_INJECTION_DETECTED"]). Only matching events pass. |
| agent_ids | string[] | Allowlist of agent IDs. Only events from these agents pass. |
| policy_results | string[] | Allowlist of policy results (e.g., ["deny", "escalate"]). Only events with matching results pass. |
Filters are AND-combined: an event must pass every specified filter field to be forwarded.
Example filter
Forward only deny and escalate decisions at medium severity or above:
{
"event_filter": {
"min_severity": "medium",
"policy_results": ["deny", "escalate"]
}
}
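To make the AND-combination concrete, the following is a minimal sketch of how such a filter could be evaluated in plain Python. The function name `passes_filter` and the event keys `severity`, `event_type`, `agent_id`, and `policy_result` are assumptions for illustration; the actual evaluator may differ.

```python
from typing import Any, Mapping, Optional

# Severity levels in ascending order, as listed in the filter field table.
SEVERITY_RANK = {"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}

# Maps each allowlist filter field to the (assumed) event key it is checked against.
ALLOWLIST_FIELDS = (
    ("event_types", "event_type"),
    ("agent_ids", "agent_id"),
    ("policy_results", "policy_result"),
)


def passes_filter(event: Mapping[str, Any], event_filter: Optional[Mapping[str, Any]]) -> bool:
    """Return True if the event satisfies every field present in the filter."""
    if not event_filter:
        return True  # no filter: forward everything
    min_severity = event_filter.get("min_severity")
    if min_severity is not None:
        # Assumption: an event with a missing or unknown severity ranks as "info".
        event_rank = SEVERITY_RANK.get(event.get("severity", "info"), 0)
        if event_rank < SEVERITY_RANK.get(min_severity, 0):
            return False
    for filter_key, event_key in ALLOWLIST_FIELDS:
        allowed = event_filter.get(filter_key)
        if allowed is not None and event.get(event_key) not in allowed:
            return False
    return True
```

With the example filter above, a high-severity deny event passes, while a high-severity allow event or a low-severity deny event is dropped.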
Data isolation
The SIEM pipeline enforces strict data isolation consistent with the Data Protection (DP) pipeline. Events forwarded to SIEM destinations use the EventMetadata envelope, which contains only behavioral metadata:
| Included | Excluded |
|---|---|
| Event ID, timestamp, event type | Request body (request_body) |
| Agent ID, session ID | Response body (response_body) |
| Tool name, MCP server, action | Raw payload content |
| Policy result and reason | DLP finding content (count only) |
| Behavioral score, risk tier | Redacted/encrypted payload fields |
| DLP findings count (integer) | Any field processed by the DP pipeline |
| Causal depth, alert type/severity | |
This ensures that sensitive data classified by the DLP scanner (credentials, PII, PHI, financial data) never reaches external SIEM infrastructure, even if the destination is misconfigured.
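One way to get this guarantee is an allowlist projection: only named metadata fields are copied, so anything not listed is excluded by default. The sketch below assumes a dictionary-shaped input event with a hypothetical `dlp_findings` list; the field names follow the EventMetadata fields mentioned elsewhere in this document, and `to_metadata` is an illustrative name, not the real conversion function.

```python
from typing import Any, Dict, Mapping

# Allowlisted metadata fields; anything else on the raw event is never copied.
METADATA_FIELDS = (
    "event_id", "timestamp", "event_type", "agent_id", "session_id",
    "tool_name", "policy_result", "behavioral_score", "risk_tier", "causal_depth",
)


def to_metadata(raw_event: Mapping[str, Any]) -> Dict[str, Any]:
    """Project a raw event onto the metadata-only envelope."""
    meta = {key: raw_event[key] for key in METADATA_FIELDS if key in raw_event}
    # Assumed input shape: `dlp_findings` is a list of finding objects.
    # Only the count crosses the boundary, never the finding content.
    meta["dlp_findings_count"] = len(raw_event.get("dlp_findings", []))
    return meta
```

Because the projection is an allowlist rather than a blocklist, a new payload-bearing field added to the raw event later is excluded without any change to this code.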
Output formats
JSON (default)
Events are serialized as JSON objects using EventMetadata.to_dict(). Splunk HEC wraps each event in the standard HEC envelope with time, host, source, sourcetype, and index fields. Sentinel and Chronicle apply their own format wrappers on top of the JSON payload.
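As an illustration of the HEC wrapping, the sketch below builds a newline-delimited payload in the standard Splunk HEC event shape. The function name `to_hec_payload`, the default `host` and `source` values, and the assumption that each event carries an epoch-seconds `timestamp` are all hypothetical; only the envelope keys come from the description above.

```python
import json
from typing import Any, Iterable, Mapping


def to_hec_payload(
    events: Iterable[Mapping[str, Any]],
    *,
    index: str,
    sourcetype: str = "behavry:audit",
    host: str = "behavry-proxy",  # hypothetical default
    source: str = "behavry",      # hypothetical default
) -> str:
    """Wrap each event in an HEC envelope and join the envelopes with newlines."""
    lines = []
    for event in events:
        envelope = {
            "time": event["timestamp"],  # assumed to be epoch seconds
            "host": host,
            "source": source,
            "sourcetype": sourcetype,
            "index": index,
            "event": dict(event),
        }
        lines.append(json.dumps(envelope, separators=(",", ":")))
    return "\n".join(lines)
```

The resulting string would be sent as the request body with the Authorization: Splunk {token} header listed in the destinations table.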
LEEF 2.0
The LEEF 2.0 serializer (to_leef()) produces IBM-standard log lines for QRadar consumption:
LEEF:2.0|Behavry|BehavryProxy|1.0|{event_id}|{TAB-delimited extensions}
Extension fields include devTime, src (agent ID), dst (tool name), severity (1-10 scale), usrName, policy_result, dlp_findings, session_id, and causal_depth. All field values are injection-safe -- tabs, newlines, and carriage returns are escaped to spaces.
Severity mapping (1-10 scale):
| Risk tier | LEEF severity | Policy result | LEEF severity |
|---|---|---|---|
| critical | 10 | dlp_block | 8 |
| high | 7 | deny | 7 |
| medium | 5 | escalate | 5 |
| low | 3 | allow | 3 |
| info | 1 | | |
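The LEEF line layout and the escaping rule described above can be sketched as follows. This is a simplified stand-in for to_leef(), not its actual implementation: the function signature is an assumption, and the severity mapping is left out because the document does not say how the risk-tier and policy-result scales are combined.

```python
from typing import Any, Mapping

LEEF_HEADER = "LEEF:2.0|Behavry|BehavryProxy|1.0"


def _sanitize(value: Any) -> str:
    """Replace tabs, newlines, and carriage returns with spaces."""
    return str(value).replace("\t", " ").replace("\r", " ").replace("\n", " ")


def to_leef_line(event_id: str, extensions: Mapping[str, Any]) -> str:
    """Build one LEEF 2.0 line with TAB-delimited key=value extensions."""
    body = "\t".join(f"{key}={_sanitize(value)}" for key, value in extensions.items())
    return f"{LEEF_HEADER}|{_sanitize(event_id)}|{body}"
```

Sanitizing every value before joining means an attacker-controlled field such as a tool name cannot inject an extra extension or split the log line.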
CEF
CEF output is available through the existing audit export endpoint (GET /api/v1/audit/export?format=cef). This pre-dates the SIEM module and remains supported for backward compatibility.
Retry and resilience
Each destination has independent retry configuration:
| Parameter | Default | Range | Description |
|---|---|---|---|
| retry_max_attempts | 5 | 1--20 | Maximum delivery attempts per batch |
| retry_backoff_secs | 10 | 1--300 | Base backoff interval in seconds |
Exponential backoff with jitter
Failed deliveries use the formula:
wait = min(backoff_secs * 2^attempt + uniform(0, backoff_secs), 3600)
The jitter component prevents thundering-herd effects when multiple destinations recover simultaneously. The maximum wait is capped at 3,600 seconds (1 hour).
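The formula translates directly into a few lines of Python. In this sketch the function name `backoff_wait` and the injectable `rng` parameter are illustrative, and whether `attempt` starts at 0 or 1 is an assumption the document does not settle.

```python
import random
from typing import Callable


def backoff_wait(
    attempt: int,
    backoff_secs: float = 10.0,
    cap_secs: float = 3600.0,
    rng: Callable[[float, float], float] = random.uniform,
) -> float:
    """Exponential backoff plus uniform jitter, capped at cap_secs."""
    return min(backoff_secs * 2 ** attempt + rng(0, backoff_secs), cap_secs)
```

With the default 10-second base and ignoring jitter, attempts 0 through 4 wait roughly 10, 20, 40, 80, and 160 seconds; the one-hour cap is first reached at attempt 9 (10 * 2^9 = 5,120 seconds).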
Auto-disable
After 10 consecutive failures, the destination is automatically disabled (enabled = false) and a SIEM_DESTINATION_UNHEALTHY alert is fired on the event bus. This alert appears in the dashboard Alerts page and can itself be forwarded to other healthy SIEM destinations.
To re-enable a destination after resolving the underlying issue, use PATCH /api/v1/siem/destinations/{id} with {"enabled": true}.
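The auto-disable behavior amounts to a small counter with a threshold. The sketch below is illustrative only: `DestinationHealth`, `record_delivery`, and the `emit_alert` callback are hypothetical names, and resetting the counter on any successful delivery is an assumption implied by the word "consecutive".

```python
from dataclasses import dataclass
from typing import Callable

AUTO_DISABLE_THRESHOLD = 10  # consecutive failures before the destination is disabled


@dataclass
class DestinationHealth:
    enabled: bool = True
    consecutive_failures: int = 0


def record_delivery(
    health: DestinationHealth,
    success: bool,
    emit_alert: Callable[[str], None],  # hypothetical hook onto the event bus
) -> None:
    """Update failure tracking after a delivery; disable and alert at the threshold."""
    if success:
        health.consecutive_failures = 0  # assumed: any success resets the streak
        return
    health.consecutive_failures += 1
    if health.enabled and health.consecutive_failures >= AUTO_DISABLE_THRESHOLD:
        health.enabled = False
        emit_alert("SIEM_DESTINATION_UNHEALTHY")
```

Checking `health.enabled` before alerting keeps the alert from firing more than once per outage.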
Dead letter queue
When all retry attempts are exhausted for a batch, the events are written to the dead letter queue (DLQ). DLQ payloads are encrypted with AES-256-GCM via the KMS client when available, and stored unencrypted as a fallback.
Each DLQ entry records:
- Destination ID and tenant ID
- Batch ID (UUID)
- List of event IDs in the batch
- Payload (encrypted when the KMS client is available)
- Attempt count
- Last attempt timestamp and next scheduled retry
- Error message (truncated to 500 characters)
DLQ management
List DLQ entries
curl -s -H "Authorization: Bearer $ADMIN_TOKEN" \
"https://behavry.example.com/api/v1/siem/dlq?destination_id=DEST_ID" | jq
Response:
{
"items": [
{
"id": "...",
"destination_id": "...",
"batch_id": "a1b2c3d4-...",
"event_ids": ["evt-1", "evt-2", "evt-3"],
"attempt_count": 5,
"last_attempt_at": "2026-03-17T10:30:00Z",
"next_retry_at": "2026-03-17T11:30:00Z",
"error_message": "HEC returned 503: Service Unavailable",
"resolved": false,
"created_at": "2026-03-17T10:25:00Z"
}
],
"total": 1
}
Retry DLQ entries
Re-queue all unresolved DLQ entries for a destination:
curl -s -X POST -H "Authorization: Bearer $ADMIN_TOKEN" \
"https://behavry.example.com/api/v1/siem/destinations/DEST_ID/retry-dlq" | jq
{
"queued": 15,
"message": "Re-queued 15 events from 3 DLQ entries"
}
Discard a DLQ entry
Mark a DLQ entry as resolved without retrying:
curl -s -X POST -H "Authorization: Bearer $ADMIN_TOKEN" \
"https://behavry.example.com/api/v1/siem/dlq/ENTRY_ID/discard" | jq
Credential security
Destination credentials (HEC tokens, shared keys, service account JSON, webhook secrets) are encrypted at rest using AES-256-GCM via the KMS abstraction layer (the same kms_client.py used by the Data Protection pipeline).
Key security properties:
- Encrypted storage -- Credentials are encrypted before being written to the siem_destinations table. The encryption context includes tenant_id and purpose: "siem_credential".
- Never returned on read -- GET and LIST responses include credential_configured: true/false and an optional credential_hint (last 4 characters of the token), but never the full credential.
- Re-encryption on update -- PATCH with a new credential field re-encrypts and replaces the stored ciphertext.
- Decrypted per-batch -- Credentials are decrypted only at delivery time, within the batch worker's flush loop. They are not cached in memory.
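As a small illustration of the "never returned on read" property, a hint could be derived as shown below. The function name `credential_hint` mirrors the response field but is otherwise hypothetical, and returning no characters at all for very short secrets is an assumption.

```python
def credential_hint(secret: str, visible: int = 4) -> str:
    """Return a masked hint exposing only the last `visible` characters."""
    if len(secret) <= visible:
        return "..."  # assumed: too short to reveal anything safely
    return "..." + secret[-visible:]
```

For the token used in the Splunk walkthrough, `credential_hint("your-hec-token-here")` yields `"...here"`, matching the `credential_hint` value in the sample response.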
API reference
All SIEM endpoints require admin JWT authentication and are scoped to the current tenant.
Destinations CRUD
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/siem/destinations | Create a new destination |
| GET | /api/v1/siem/destinations | List all destinations |
| GET | /api/v1/siem/destinations/{id} | Get a destination by ID |
| PATCH | /api/v1/siem/destinations/{id} | Update a destination |
| DELETE | /api/v1/siem/destinations/{id} | Soft-delete (disable) a destination |
Operations
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/siem/destinations/{id}/test | Send a synthetic test event to verify connectivity |
| GET | /api/v1/siem/destinations/{id}/health | Get health stats (last delivery, failure count, DLQ depth) |
| POST | /api/v1/siem/destinations/{id}/retry-dlq | Re-queue all unresolved DLQ entries |
Dead letter queue
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/siem/dlq | List DLQ entries (optional ?destination_id= filter) |
| POST | /api/v1/siem/dlq/{id}/discard | Mark a DLQ entry as resolved |
Audit export enhancements
The existing audit export endpoint has been extended with SIEM-related capabilities:
| Feature | Description |
|---|---|
| LEEF format | GET /api/v1/audit/export?format=leef returns LEEF 2.0 formatted output |
| Cursor pagination | Response includes X-Next-Cursor and X-Total-Events headers for efficient iteration over large result sets (no 10,000-event cap) |
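A client can walk a large export by following the X-Next-Cursor header until it is absent. The sketch below separates the loop from the HTTP call so the loop can be exercised without a server. Note the assumptions: the query parameter name `cursor` is hypothetical (the document names only the response headers), as are the helper names and the use of a Bearer admin token for this endpoint.

```python
import urllib.parse
import urllib.request
from typing import Callable, Iterator, Optional, Tuple

Page = Tuple[str, Optional[str]]  # (response body, next cursor or None)


def iter_export_pages(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield page bodies until the server stops returning a next cursor."""
    cursor: Optional[str] = None
    while True:
        body, cursor = fetch_page(cursor)
        yield body
        if not cursor:
            return


def make_http_fetcher(base_url: str, token: str, fmt: str = "leef") -> Callable[[Optional[str]], Page]:
    """Build a fetch_page callable backed by urllib (not exercised in this sketch)."""
    def fetch(cursor: Optional[str]) -> Page:
        params = {"format": fmt}
        if cursor:
            params["cursor"] = cursor  # hypothetical parameter name
        url = f"{base_url}/api/v1/audit/export?{urllib.parse.urlencode(params)}"
        request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(request) as response:
            return response.read().decode("utf-8"), response.headers.get("X-Next-Cursor")
    return fetch
```

Typical usage would be `for page in iter_export_pages(make_http_fetcher("https://behavry.example.com", token)): ...`, processing each page as it arrives instead of buffering the whole export.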
Setup example: Splunk HEC
This walkthrough creates a Splunk HEC destination, verifies connectivity, and confirms event delivery.
1. Create the destination
curl -s -X POST \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
"https://behavry.example.com/api/v1/siem/destinations" \
-d '{
"name": "Splunk Production",
"destination_type": "splunk_hec",
"format": "json",
"endpoint_url": "https://splunk.corp.example.com:8088",
"credential": {
"token": "your-hec-token-here",
"index": "security"
},
"event_filter": {
"min_severity": "low",
"policy_results": ["deny", "escalate", "dlp_block"]
},
"batch_size": 100,
"flush_interval_secs": 30,
"retry_max_attempts": 5,
"retry_backoff_secs": 10
}' | jq
Response:
{
"id": "d1e2f3a4-...",
"name": "Splunk Production",
"destination_type": "splunk_hec",
"format": "json",
"endpoint_url": "https://splunk.corp.example.com:8088",
"credential_configured": true,
"credential_hint": "...here",
"event_filter": {
"min_severity": "low",
"policy_results": ["deny", "escalate", "dlp_block"]
},
"batch_size": 100,
"flush_interval_secs": 30,
"retry_max_attempts": 5,
"retry_backoff_secs": 10,
"enabled": true,
"last_delivery_at": null,
"last_error": null,
"consecutive_failures": 0,
"created_at": "2026-03-17T12:00:00Z"
}
2. Test connectivity
curl -s -X POST \
-H "Authorization: Bearer $ADMIN_TOKEN" \
"https://behavry.example.com/api/v1/siem/destinations/d1e2f3a4-.../test" | jq
{
"delivered": true,
"latency_ms": 142.5,
"error": null
}
The test endpoint sends a synthetic event with event_type: "tool_call", agent_id: "test-agent", and action: "test_connectivity". It exercises the full delivery path including credential decryption, payload formatting, and network transport.
3. Check health
curl -s -H "Authorization: Bearer $ADMIN_TOKEN" \
"https://behavry.example.com/api/v1/siem/destinations/d1e2f3a4-.../health" | jq
{
"destination_id": "d1e2f3a4-...",
"last_delivery_at": "2026-03-17T12:01:30Z",
"consecutive_failures": 0,
"last_error": null,
"enabled": true,
"dlq_depth": 0
}
4. Verify in Splunk
In your Splunk search console, run:
index=security sourcetype=behavry:audit | head 10
Each event will contain the EventMetadata fields: event_type, agent_id, session_id, tool_name, policy_result, behavioral_score, dlp_findings_count, risk_tier, and causal_depth.
Connector-specific notes
Microsoft Sentinel
Credential fields: workspace_id, shared_key, and optional log_type (defaults to BehavryAudit_CL). Events are delivered to the Log Analytics Data Collector API with HMAC-SHA256 SharedKey authentication.
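For reference, the SharedKey scheme used by the Log Analytics Data Collector API signs a canonical string built from the HTTP method, body length, content type, x-ms-date header, and resource path. The sketch below follows that publicly documented scheme; the function name `build_sentinel_authorization` is illustrative and is not taken from the Behavry connector.

```python
import base64
import hashlib
import hmac


def build_sentinel_authorization(
    workspace_id: str,
    shared_key: str,        # base64-encoded workspace key
    rfc1123_date: str,      # must equal the x-ms-date request header
    content_length: int,    # byte length of the JSON body
    method: str = "POST",
    content_type: str = "application/json",
    resource: str = "/api/logs",
) -> str:
    """Return the Authorization header value for a Data Collector API request."""
    string_to_sign = (
        f"{method}\n{content_length}\n{content_type}\n"
        f"x-ms-date:{rfc1123_date}\n{resource}"
    )
    key = base64.b64decode(shared_key)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"SharedKey {workspace_id}:{signature}"
```

The date passed in must match the x-ms-date header byte for byte, and the content length must be computed from the encoded body, otherwise the service rejects the signature.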
Google Chronicle
The credential is the full Google service account JSON (the same file you download from the GCP console). The connector authenticates via OAuth2 with the malachite-ingestion scope and delivers events to the UDM unstructuredlogentries:batchCreate endpoint. Requires the google-auth and requests Python packages.
IBM QRadar
Events are delivered as LEEF 2.0 messages inside RFC 5424 syslog frames over TLS (default port 6514). Each message includes a structured data element with agent_id, policy_result, and tool fields for QRadar indexing.
Generic Syslog
Endpoint URL determines the transport: tls://host:port, tcp://host:port, or udp://host:port. Defaults to TCP on port 514 if no scheme is specified. The optional facility credential field overrides the default syslog facility (local0 / 16).
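The scheme-to-transport rule can be sketched with the standard library's URL parser. The function name `parse_syslog_endpoint` is hypothetical, and falling back to port 514 when a scheme is given without a port is an assumption; the document only specifies the default for the no-scheme case.

```python
from typing import Tuple
from urllib.parse import urlsplit

SUPPORTED_TRANSPORTS = {"tcp", "udp", "tls"}
DEFAULT_PORT = 514


def parse_syslog_endpoint(endpoint_url: str) -> Tuple[str, str, int]:
    """Return (transport, host, port) for a syslog endpoint URL."""
    if "://" not in endpoint_url:
        endpoint_url = "tcp://" + endpoint_url  # documented default transport
    parts = urlsplit(endpoint_url)
    transport = parts.scheme.lower()
    if transport not in SUPPORTED_TRANSPORTS:
        raise ValueError(f"unsupported syslog transport: {transport!r}")
    if not parts.hostname:
        raise ValueError("syslog endpoint is missing a host")
    # Assumption: a missing port falls back to 514 regardless of scheme.
    return transport, parts.hostname, parts.port or DEFAULT_PORT
```

For example, `parse_syslog_endpoint("tls://siem.example.com:6514")` returns `("tls", "siem.example.com", 6514)`, and a bare `"siem.example.com"` resolves to TCP on port 514.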
Custom Webhook
Every request includes three security headers:
- X-Behavry-Signature: sha256={hmac_hex_digest}, computed over the JSON body using the configured secret.
- X-Behavry-Event-Count: Number of events in the batch.
- X-Behavry-Timestamp: Unix timestamp of the request.
The receiving endpoint should verify the HMAC signature before processing the payload. Any 2xx response is treated as successful delivery.
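On the receiving side, verification comes down to recomputing the HMAC over the raw request body and comparing it to the header in constant time. A minimal sketch follows, assuming the receiver has the raw body bytes and the shared secret; the function name `verify_behavry_signature` is illustrative.

```python
import hashlib
import hmac


def verify_behavry_signature(secret: bytes, raw_body: bytes, signature_header: str) -> bool:
    """Check an X-Behavry-Signature value of the form sha256={hex digest}."""
    expected = "sha256=" + hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # Compare as bytes so non-ASCII header values fail cleanly instead of raising.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))
```

The digest should be computed over the exact bytes received, before any JSON parsing or re-serialization, since a re-encoded body may differ in whitespace or key order. Receivers that want replay protection could additionally reject requests whose X-Behavry-Timestamp is too old, although that check is not covered by the signature as described here.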