SIEM Integration

Behavry ships with native SIEM integration that forwards audit events from the proxy pipeline to your security operations stack. Every policy decision, DLP finding, behavioral alert, and escalation can be streamed to one or more SIEM destinations in near real time, giving SOC teams visibility into AI agent activity without polling or manual export.

Why SIEM integration matters

Compliance evidence -- Continuous, machine-readable audit delivery satisfies SOC 2 CC7.2, NIST 800-53 AU-6, and ISO 27001 A.12.4 requirements for centralized log collection.
SOC visibility -- AI agent actions appear alongside your existing endpoint, network, and identity telemetry. Analysts can correlate agent behavior with other signals without switching tools.
Threat correlation -- SIEM rules can fire on Behavry event types such as INBOUND_INJECTION_DETECTED, BEHAVIOR_REVERSAL, or BLAST_RADIUS_ESCALATION, enabling automated playbooks that respond to AI-specific threats.

Supported destinations

Destination	Type key	Transport	Auth method	Format
Splunk	`splunk_hec`	HTTPS (HEC)	`Authorization: Splunk {token}`	JSON (newline-delimited HEC events)
Microsoft Sentinel	`sentinel`	HTTPS (Log Analytics API)	HMAC-SHA256 SharedKey	JSON array
Google Chronicle	`chronicle`	HTTPS (UDM batchCreate)	Service account JWT (OAuth2)	UDM SecurityEvent
IBM QRadar	`qradar`	TLS syslog (port 6514)	Certificate-based	LEEF 2.0 inside RFC 5424 frames
Generic Syslog	`syslog`	TCP, UDP, or TLS	None / certificate	RFC 5424
Custom Webhook	`webhook`	HTTPS	HMAC-SHA256 (`X-Behavry-Signature`)	JSON array

All destinations support per-destination event filtering, configurable batch sizes, and independent retry policies.

Architecture

The SIEM pipeline is an extension of Behavry's internal async event bus. It runs entirely in-process -- no external queues or message brokers are required.

Event Bus (all BehavryEvents)
    |
    v
SIEMDispatcher (wildcard subscriber)
    |  - converts BehavryEvent -> EventMetadata (no raw payload)
    |  - loads active destinations per tenant
    |  - evaluates Python-native event_filter per destination
    |  - enqueues matching events into per-destination asyncio.Queue (max 10,000)
    |
    +---> [Queue: dest-1] ---> SIEMBatchWorker ---> SplunkHECConnector
    +---> [Queue: dest-2] ---> SIEMBatchWorker ---> SentinelConnector
    +---> [Queue: dest-N] ---> SIEMBatchWorker ---> WebhookConnector

Each destination gets its own in-memory queue and background SIEMBatchWorker task. The worker flushes when the batch reaches batch_size events or when flush_interval_secs elapses -- whichever comes first. On graceful shutdown, remaining events in the batch are flushed before the task exits.
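The flush rule above (batch size or interval, whichever comes first, plus a final flush on shutdown) can be sketched with plain asyncio. This is a minimal illustration, not Behavry's actual SIEMBatchWorker: the function name batch_worker and the deliver callback are hypothetical, and task cancellation stands in for graceful shutdown.

```python
import asyncio
from typing import Awaitable, Callable, List


async def batch_worker(
    queue: asyncio.Queue,
    deliver: Callable[[List[dict]], Awaitable[None]],  # hypothetical connector call
    batch_size: int = 100,
    flush_interval_secs: float = 30.0,
) -> None:
    """Drain `queue` into batches; flush on size or on elapsed interval."""
    loop = asyncio.get_running_loop()
    batch: List[dict] = []
    deadline = loop.time() + flush_interval_secs
    try:
        while True:
            timeout = max(deadline - loop.time(), 0.0)
            try:
                batch.append(await asyncio.wait_for(queue.get(), timeout))
            except asyncio.TimeoutError:
                pass  # interval elapsed with a partial (or empty) batch
            if len(batch) >= batch_size or loop.time() >= deadline:
                if batch:
                    await deliver(batch)
                    batch = []
                deadline = loop.time() + flush_interval_secs
    except asyncio.CancelledError:
        # Shutdown (modeled here as cancellation): flush what is buffered.
        if batch:
            await deliver(batch)
        raise
```

In this sketch a batch that is cancelled mid-delivery would be sent again by the shutdown path, so the receiving side should tolerate duplicates.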

Event filtering

Every destination can define an event_filter that controls which events are forwarded. Filtering is evaluated in Python (no OPA round-trip) for performance. A destination with no filter receives all events.

Supported filter fields:

Field	Type	Description
`min_severity`	`string`	Minimum severity level: `info`, `low`, `medium`, `high`, `critical`. Events below this threshold are dropped.
`event_types`	`string[]`	Allowlist of event type strings (e.g., `["tool_call", "INBOUND_INJECTION_DETECTED"]`). Only matching events pass.
`agent_ids`	`string[]`	Allowlist of agent IDs. Only events from these agents pass.
`policy_results`	`string[]`	Allowlist of policy results (e.g., `["deny", "escalate"]`). Only events with matching results pass.

Filters are AND-combined: an event must pass every specified filter field to be forwarded.

Example filter

Forward only deny and escalate decisions at medium severity or above:

{
  "event_filter": {
    "min_severity": "medium",
    "policy_results": ["deny", "escalate"]
  }
}
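The AND semantics can be illustrated with a short Python sketch. It assumes each event is a dict carrying severity, event_type, agent_id, and policy_result keys; those key names and the passes_filter helper are illustrative assumptions, not the dispatcher's real internals.

```python
from typing import Any, Mapping, Optional

# Severity order as listed for min_severity, lowest to highest.
SEVERITY_RANK = {"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}

# Filter field -> event key it is checked against (event keys are assumptions).
ALLOWLISTS = {
    "event_types": "event_type",
    "agent_ids": "agent_id",
    "policy_results": "policy_result",
}


def passes_filter(event: Mapping[str, Any],
                  event_filter: Optional[Mapping[str, Any]]) -> bool:
    """Return True only if the event satisfies every field the filter sets."""
    if not event_filter:
        return True  # no filter: the destination receives everything
    min_sev = event_filter.get("min_severity")
    if min_sev is not None:
        # Unknown severities are treated as the lowest level in this sketch.
        if SEVERITY_RANK.get(event.get("severity"), 0) < SEVERITY_RANK[min_sev]:
            return False
    for filter_key, event_key in ALLOWLISTS.items():
        allowed = event_filter.get(filter_key)
        if allowed is not None and event.get(event_key) not in allowed:
            return False
    return True
```

With the example filter above, a high-severity deny event passes, while a low-severity deny event or a critical allow event is dropped.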

Data isolation

The SIEM pipeline enforces strict data isolation consistent with the Data Protection (DP) pipeline. Events forwarded to SIEM destinations use the EventMetadata envelope, which contains only behavioral metadata:

Included	Excluded
Event ID, timestamp, event type	Request body (`request_body`)
Agent ID, session ID	Response body (`response_body`)
Tool name, MCP server, action	Raw payload content
Policy result and reason	DLP finding content (count only)
Behavioral score, risk tier	Redacted/encrypted payload fields
DLP findings count (integer)	Any field processed by the DP pipeline
Causal depth, alert type/severity

This ensures that sensitive data classified by the DLP scanner (credentials, PII, PHI, financial data) never reaches external SIEM infrastructure, even if the destination is misconfigured.
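One way to picture this guarantee is an allowlist projection: only named metadata fields are copied, so anything else is dropped by construction. The sketch below is an illustration under assumptions; the key names are inferred from the Included column and the to_metadata helper is hypothetical, not the real EventMetadata class.

```python
from typing import Any, Dict, Mapping

# Allowlist derived from the "Included" column; exact key names are assumptions.
METADATA_FIELDS = (
    "event_id", "timestamp", "event_type",
    "agent_id", "session_id",
    "tool_name", "mcp_server", "action",
    "policy_result", "policy_reason",
    "behavioral_score", "risk_tier",
    "causal_depth", "alert_type", "alert_severity",
)


def to_metadata(raw_event: Mapping[str, Any]) -> Dict[str, Any]:
    """Copy only allowlisted fields; payload fields never cross over."""
    meta = {key: raw_event[key] for key in METADATA_FIELDS if key in raw_event}
    # DLP findings are reduced to an integer count; their content is not copied.
    meta["dlp_findings_count"] = len(raw_event.get("dlp_findings") or [])
    return meta
```

Because the projection is an allowlist rather than a blocklist, a new payload field added upstream stays out of SIEM output until someone deliberately lists it.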

Output formats

JSON (default)

Events are serialized as JSON objects using EventMetadata.to_dict(). Splunk HEC wraps each event in the standard HEC envelope with time, host, source, sourcetype, and index fields. Sentinel and Chronicle apply their own format wrappers on top of the JSON payload.
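As a rough illustration of the HEC wrapping, the sketch below builds a newline-delimited payload with the five envelope fields named above. The helper name, the default host and source values, and the assumption that each event carries an epoch-seconds timestamp key are all hypothetical; only the sourcetype behavry:audit comes from elsewhere in this page.

```python
import json
from typing import Any, Iterable, Mapping


def to_hec_payload(
    events: Iterable[Mapping[str, Any]],
    index: str,
    host: str = "behavry-proxy",         # assumed value
    source: str = "behavry",             # assumed value
    sourcetype: str = "behavry:audit",
) -> str:
    """Wrap each event in a HEC envelope and join them with newlines."""
    lines = []
    for event in events:
        envelope = {
            "time": event["timestamp"],  # assumed to be epoch seconds
            "host": host,
            "source": source,
            "sourcetype": sourcetype,
            "index": index,
            "event": dict(event),
        }
        lines.append(json.dumps(envelope, separators=(",", ":")))
    return "\n".join(lines)
```

The resulting string would be sent as the body of a single HEC request, so one HTTP round trip carries the whole batch.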

LEEF 2.0

The LEEF 2.0 serializer (to_leef()) produces IBM-standard log lines for QRadar consumption:

LEEF:2.0|Behavry|BehavryProxy|1.0|{event_id}|{TAB-delimited extensions}

Extension fields include devTime, src (agent ID), dst (tool name), severity (1-10 scale), usrName, policy_result, dlp_findings, session_id, and causal_depth. All field values are injection-safe: tabs, newlines, and carriage returns are replaced with spaces, so a value cannot introduce an extra field or a second log line.

Severity mapping (1-10 scale):

Risk tier	LEEF severity
critical	10
high	7
medium	5
low	3
info	1

Policy result	LEEF severity
dlp_block	8
deny	7
escalate	5
allow	3
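The header layout, value sanitization, and severity mapping can be combined in a short sketch. It is not the real to_leef() implementation: the event key names are assumptions, usrName is omitted because its source field is not stated here, and taking the higher of the two mapped severities is one plausible way to combine them.

```python
from typing import Any, Mapping

RISK_TIER_SEVERITY = {"critical": 10, "high": 7, "medium": 5, "low": 3, "info": 1}
POLICY_RESULT_SEVERITY = {"dlp_block": 8, "deny": 7, "escalate": 5, "allow": 3}


def leef_safe(value: Any) -> str:
    """Replace tabs, newlines, and carriage returns with spaces."""
    return str(value).replace("\t", " ").replace("\n", " ").replace("\r", " ")


def to_leef(event: Mapping[str, Any]) -> str:
    # Assumption: the final severity is the max of the two mappings.
    severity = max(
        RISK_TIER_SEVERITY.get(event.get("risk_tier"), 1),
        POLICY_RESULT_SEVERITY.get(event.get("policy_result"), 1),
    )
    extensions = {
        "devTime": event.get("timestamp", ""),
        "src": event.get("agent_id", ""),
        "dst": event.get("tool_name", ""),
        "severity": severity,
        "policy_result": event.get("policy_result", ""),
        "dlp_findings": event.get("dlp_findings_count", 0),
        "session_id": event.get("session_id", ""),
        "causal_depth": event.get("causal_depth", 0),
    }
    ext = "\t".join(f"{key}={leef_safe(val)}" for key, val in extensions.items())
    header = f"LEEF:2.0|Behavry|BehavryProxy|1.0|{leef_safe(event.get('event_id', ''))}|"
    return header + ext
```

An agent ID containing a tab character ends up with a space in its place, so it cannot split into a second extension field.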

CEF

CEF output is available through the existing audit export endpoint (GET /api/v1/audit/export?format=cef). This pre-dates the SIEM module and remains supported for backward compatibility.

Retry and resilience

Each destination has independent retry configuration:

Parameter	Default	Range	Description
`retry_max_attempts`	5	1--20	Maximum delivery attempts per batch
`retry_backoff_secs`	10	1--300	Base backoff interval in seconds

Exponential backoff with jitter

Failed deliveries use the formula:

wait = min(backoff_secs * 2^attempt + uniform(0, backoff_secs), 3600)

The jitter component prevents thundering-herd effects when multiple destinations recover simultaneously. The maximum wait is capped at 3,600 seconds (1 hour).
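The formula translates directly into Python. The function name is hypothetical, and whether the first retry uses attempt = 0 or attempt = 1 is not stated here, so the zero-based counting in this sketch is an assumption.

```python
import random


def backoff_wait(attempt: int, backoff_secs: float = 10.0,
                 cap_secs: float = 3600.0) -> float:
    """Exponential backoff plus uniform jitter, capped at cap_secs."""
    jitter = random.uniform(0, backoff_secs)
    return min(backoff_secs * 2 ** attempt + jitter, cap_secs)
```

With the default retry_backoff_secs of 10, attempt 0 waits between 10 and 20 seconds and attempt 3 between 80 and 90 seconds. The one-hour cap takes over from attempt 9 onward, since 10 * 2^9 = 5,120 already exceeds 3,600.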

Auto-disable

After 10 consecutive failures, the destination is automatically disabled (enabled = false) and a SIEM_DESTINATION_UNHEALTHY alert is fired on the event bus. This alert appears in the dashboard Alerts page and can itself be forwarded to other healthy SIEM destinations.

To re-enable a destination after resolving the underlying issue, use PATCH /api/v1/siem/destinations/{id} with {"enabled": true}.

Dead letter queue

When all retry attempts are exhausted for a batch, the events are written to the dead letter queue (DLQ). DLQ payloads are encrypted with AES-256-GCM via the KMS client when it is available; otherwise they are stored unencrypted as a fallback.

Each DLQ entry records:

Destination ID and tenant ID
Batch ID (UUID)
List of event IDs in the batch
Payload (encrypted when the KMS client is available)
Attempt count
Last attempt timestamp and next scheduled retry
Error message (truncated to 500 characters)

DLQ management

List DLQ entries

curl -s -H "Authorization: Bearer $ADMIN_TOKEN" \
  "https://behavry.example.com/api/v1/siem/dlq?destination_id=DEST_ID" | jq

Response:

{
  "items": [
    {
      "id": "...",
      "destination_id": "...",
      "batch_id": "a1b2c3d4-...",
      "event_ids": ["evt-1", "evt-2", "evt-3"],
      "attempt_count": 5,
      "last_attempt_at": "2026-03-17T10:30:00Z",
      "next_retry_at": "2026-03-17T11:30:00Z",
      "error_message": "HEC returned 503: Service Unavailable",
      "resolved": false,
      "created_at": "2026-03-17T10:25:00Z"
    }
  ],
  "total": 1
}

Retry DLQ entries

Re-queue all unresolved DLQ entries for a destination:

curl -s -X POST -H "Authorization: Bearer $ADMIN_TOKEN" \
  "https://behavry.example.com/api/v1/siem/destinations/DEST_ID/retry-dlq" | jq

{
  "queued": 15,
  "message": "Re-queued 15 events from 3 DLQ entries"
}

Discard a DLQ entry

Mark a DLQ entry as resolved without retrying:

curl -s -X POST -H "Authorization: Bearer $ADMIN_TOKEN" \
  "https://behavry.example.com/api/v1/siem/dlq/ENTRY_ID/discard" | jq

Credential security

Destination credentials (HEC tokens, shared keys, service account JSON, webhook secrets) are encrypted at rest using AES-256-GCM via the KMS abstraction layer (the same kms_client.py used by the Data Protection pipeline).

Key security properties:

Encrypted storage -- Credentials are encrypted before being written to the siem_destinations table. The encryption context includes tenant_id and purpose: "siem_credential".
Never returned on read -- GET and LIST responses include credential_configured: true/false and an optional credential_hint (last 4 characters of the token), but never the full credential.
Re-encryption on update -- PATCH with a new credential field re-encrypts and replaces the stored ciphertext.
Decrypted per-batch -- Credentials are decrypted only at delivery time, within the batch worker's flush loop. They are not cached in memory.

API reference

All SIEM endpoints require admin JWT authentication and are scoped to the current tenant.

Destinations CRUD

Method	Endpoint	Description
`POST`	`/api/v1/siem/destinations`	Create a new destination
`GET`	`/api/v1/siem/destinations`	List all destinations
`GET`	`/api/v1/siem/destinations/{id}`	Get a destination by ID
`PATCH`	`/api/v1/siem/destinations/{id}`	Update a destination
`DELETE`	`/api/v1/siem/destinations/{id}`	Soft-delete (disable) a destination

Operations

Method	Endpoint	Description
`POST`	`/api/v1/siem/destinations/{id}/test`	Send a synthetic test event to verify connectivity
`GET`	`/api/v1/siem/destinations/{id}/health`	Get health stats (last delivery, failure count, DLQ depth)
`POST`	`/api/v1/siem/destinations/{id}/retry-dlq`	Re-queue all unresolved DLQ entries

Dead letter queue

Method	Endpoint	Description
`GET`	`/api/v1/siem/dlq`	List DLQ entries (optional `?destination_id=` filter)
`POST`	`/api/v1/siem/dlq/{id}/discard`	Mark a DLQ entry as resolved

Audit export enhancements

The existing audit export endpoint has been extended with SIEM-related capabilities:

Feature	Description
LEEF format	`GET /api/v1/audit/export?format=leef` returns LEEF 2.0 formatted output
Cursor pagination	Response includes `X-Next-Cursor` and `X-Total-Events` headers for efficient iteration over large result sets (no 10,000-event cap)
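A client can walk a large export by following the X-Next-Cursor header until it is absent. The sketch below is an assumption-heavy illustration: the cursor query parameter name is a guess (this page documents only the response headers), and the HTTP call is injected as a fetch callable so the loop itself stays independent of any HTTP library.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple
from urllib.parse import urlencode

# fetch(url) -> (response body, response headers); supplied by the caller.
FetchFn = Callable[[str], Tuple[str, Dict[str, str]]]


def iter_export_pages(base_url: str, fetch: FetchFn,
                      fmt: str = "leef") -> Iterator[str]:
    """Yield export pages until the server stops returning a cursor."""
    cursor: Optional[str] = None
    while True:
        params = {"format": fmt}
        if cursor:
            params["cursor"] = cursor  # hypothetical parameter name
        body, headers = fetch(f"{base_url}/api/v1/audit/export?{urlencode(params)}")
        yield body
        cursor = headers.get("X-Next-Cursor")
        if not cursor:
            break
```

In practice fetch would wrap an authenticated HTTP GET that returns the response text and headers; check the real parameter name against the API before relying on this.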

Setup example: Splunk HEC

This walkthrough creates a Splunk HEC destination, verifies connectivity, and confirms event delivery.

1. Create the destination

curl -s -X POST \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  "https://behavry.example.com/api/v1/siem/destinations" \
  -d '{
    "name": "Splunk Production",
    "destination_type": "splunk_hec",
    "format": "json",
    "endpoint_url": "https://splunk.corp.example.com:8088",
    "credential": {
      "token": "your-hec-token-here",
      "index": "security"
    },
    "event_filter": {
      "min_severity": "low",
      "policy_results": ["deny", "escalate", "dlp_block"]
    },
    "batch_size": 100,
    "flush_interval_secs": 30,
    "retry_max_attempts": 5,
    "retry_backoff_secs": 10
  }' | jq

Response:

{
  "id": "d1e2f3a4-...",
  "name": "Splunk Production",
  "destination_type": "splunk_hec",
  "format": "json",
  "endpoint_url": "https://splunk.corp.example.com:8088",
  "credential_configured": true,
  "credential_hint": "...here",
  "event_filter": {
    "min_severity": "low",
    "policy_results": ["deny", "escalate", "dlp_block"]
  },
  "batch_size": 100,
  "flush_interval_secs": 30,
  "retry_max_attempts": 5,
  "retry_backoff_secs": 10,
  "enabled": true,
  "last_delivery_at": null,
  "last_error": null,
  "consecutive_failures": 0,
  "created_at": "2026-03-17T12:00:00Z"
}

2. Test connectivity

curl -s -X POST \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  "https://behavry.example.com/api/v1/siem/destinations/d1e2f3a4-.../test" | jq

{
  "delivered": true,
  "latency_ms": 142.5,
  "error": null
}

The test endpoint sends a synthetic event with event_type: "tool_call", agent_id: "test-agent", and action: "test_connectivity". It exercises the full delivery path including credential decryption, payload formatting, and network transport.

3. Check health

curl -s -H "Authorization: Bearer $ADMIN_TOKEN" \
  "https://behavry.example.com/api/v1/siem/destinations/d1e2f3a4-.../health" | jq

{
  "destination_id": "d1e2f3a4-...",
  "last_delivery_at": "2026-03-17T12:01:30Z",
  "consecutive_failures": 0,
  "last_error": null,
  "enabled": true,
  "dlq_depth": 0
}

4. Verify in Splunk

In your Splunk search console, run:

index=security sourcetype=behavry:audit | head 10

Each event will contain the EventMetadata fields: event_type, agent_id, session_id, tool_name, policy_result, behavioral_score, dlp_findings_count, risk_tier, and causal_depth.

Connector-specific notes

Microsoft Sentinel

Credential fields: workspace_id, shared_key, and optional log_type (defaults to BehavryAudit_CL). Events are delivered to the Log Analytics Data Collector API with HMAC-SHA256 SharedKey authentication.
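For readers unfamiliar with SharedKey authentication, the sketch below builds the Authorization header following the string-to-sign layout Microsoft documents for the Data Collector API. Whether Behavry's connector constructs it in exactly this way is not stated here, and the function name is hypothetical.

```python
import base64
import hashlib
import hmac


def build_sharedkey_header(workspace_id: str, shared_key_b64: str,
                           body: bytes, rfc1123_date: str) -> str:
    """Return the Authorization header value for a Data Collector POST."""
    string_to_sign = (
        f"POST\n{len(body)}\napplication/json\n"
        f"x-ms-date:{rfc1123_date}\n/api/logs"
    )
    key = base64.b64decode(shared_key_b64)  # the shared key is base64-encoded
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"SharedKey {workspace_id}:{signature}"
```

The same rfc1123_date string must also be sent as the x-ms-date request header, and the custom table name travels in the Log-Type header.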

Google Chronicle

The credential is the full Google service account JSON (the same file you download from the GCP console). The connector authenticates via OAuth2 with the malachite-ingestion scope and delivers events to the UDM unstructuredlogentries:batchCreate endpoint. Requires the google-auth and requests Python packages.

IBM QRadar

Events are delivered as LEEF 2.0 messages inside RFC 5424 syslog frames over TLS (default port 6514). Each message includes a structured data element with agent_id, policy_result, and tool fields for QRadar indexing.

Generic Syslog

Endpoint URL determines the transport: tls://host:port, tcp://host:port, or udp://host:port. Defaults to TCP on port 514 if no scheme is specified. The optional facility credential field overrides the default syslog facility (local0 / 16).
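The scheme handling can be sketched with the standard library, along with the RFC 5424 priority calculation (facility * 8 + severity). The helper names are hypothetical, and the default ports for udp:// and tls:// URLs that omit a port are assumptions; this page only states the TCP/514 default for scheme-less endpoints.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Assumed defaults when a URL omits the port.
DEFAULT_PORTS = {"tcp": 514, "udp": 514, "tls": 6514}


def parse_syslog_endpoint(endpoint: str) -> Tuple[str, str, int]:
    """Return (transport, host, port) for a syslog endpoint URL."""
    if "://" not in endpoint:
        endpoint = "tcp://" + endpoint  # no scheme: fall back to TCP
    parts = urlsplit(endpoint)
    transport = parts.scheme.lower()
    if transport not in DEFAULT_PORTS or not parts.hostname:
        raise ValueError(f"unsupported syslog endpoint: {endpoint!r}")
    return transport, parts.hostname, parts.port or DEFAULT_PORTS[transport]


def syslog_pri(facility: int = 16, severity: int = 6) -> int:
    """RFC 5424 PRI value; facility 16 is local0, severity 6 is informational."""
    return facility * 8 + severity
```

For example, the default facility local0 combined with informational severity gives a PRI of 134, which appears as <134> at the start of the frame.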

Custom Webhook

Every request includes three headers that let the receiver authenticate and sanity-check the batch:

X-Behavry-Signature: sha256={hmac_hex_digest} computed over the JSON body using the configured secret.
X-Behavry-Event-Count: Number of events in the batch.
X-Behavry-Timestamp: Unix timestamp of the request.

The receiving endpoint should verify the HMAC signature before processing the payload. Any 2xx response is treated as successful delivery.
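A receiver might verify the signature along these lines. This is a minimal sketch using only the standard library; the function names are hypothetical, and the 300-second freshness window applied to X-Behavry-Timestamp is an assumed policy, not something Behavry mandates.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, raw_body: bytes, signature_header: str) -> bool:
    """Check X-Behavry-Signature against an HMAC-SHA256 of the raw body."""
    expected = "sha256=" + hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids leaking how many characters match.
    return hmac.compare_digest(expected.encode("utf-8"),
                               signature_header.encode("utf-8"))


def is_fresh(timestamp_header: str, now: float, max_skew_secs: float = 300.0) -> bool:
    """Reject requests whose X-Behavry-Timestamp is too far from `now`."""
    try:
        sent_at = float(timestamp_header)
    except ValueError:
        return False
    return abs(now - sent_at) <= max_skew_secs
```

Compute the HMAC over the exact bytes received, before any JSON parsing, because re-serializing the body can change whitespace or key order and invalidate the signature.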
