AI Asset Discovery

Behavry's AI Asset Discovery module helps organizations understand what AI tools and agents are already in use across their infrastructure — before those tools are formally governed. It combines active connector-based discovery (IdP + SaaS admin APIs) with passive behavioral observation (browser telemetry + MCP audit events) to build a continuously updated inventory of AI platforms and their governance status.

Why Discovery Matters

In most organizations, AI adoption outpaces governance. Teams adopt new AI tools faster than security teams can review them. Employees interact with AI assistants through browsers, APIs, and SaaS integrations that bypass existing controls entirely. AI Asset Discovery closes this gap by:

Identifying AI platforms already in use (whether governed or not)
Scoring organizational AI exposure
Classifying passive observations into confidence tiers
Triggering alerts and enrollment recommendations for ungoverned tools

Architecture

Discovery runs as a background module within the Behavry backend. It subscribes to the internal event bus and pulls data from two input channels:

IdP Connectors (Okta / Azure AD / Google)
SaaS Admin APIs (M365 / GitHub / Slack / Salesforce / etc.)
        ↓
  Discovery Service (state machine + OPA evaluation)
        ↓
  Discovered Platforms table (PostgreSQL)
        ↓
  Exposure Score + Governance Status

Browser Extension (DOM fingerprinting, 10 AI service rules)
MCP Audit Events (tool call patterns, session behavior)
        ↓
  Passive Activity Classifier
        ↓
  Passive Findings table (Tier 1–4)

Active Discovery

Platform Fingerprinting

The fingerprint database covers 30 AI platforms across vendors including OpenAI, Anthropic, Google, Microsoft, Mistral, Cohere, Hugging Face, Stability AI, and more. Each entry includes:

OAuth scopes, service account patterns, and API endpoint signatures
Capability class: llm, code_assistant, image_gen, embedding, audio, multimodal
Risk tier: low, medium, high, critical
Default governance recommendation
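As a rough illustration of the entry shape listed above, a fingerprint record could be modeled as follows. This is a sketch only: the class name, field names, glob-style matching, and the example values are assumptions, not Behavry's actual schema or fingerprint data.

```python
from dataclasses import dataclass
from fnmatch import fnmatch

CAPABILITY_CLASSES = frozenset(
    {"llm", "code_assistant", "image_gen", "embedding", "audio", "multimodal"}
)
RISK_TIERS = ("low", "medium", "high", "critical")


@dataclass(frozen=True)
class PlatformFingerprint:
    """Illustrative fingerprint entry; field names are assumptions."""

    vendor: str
    capability_class: str
    risk_tier: str
    default_recommendation: str
    oauth_scopes: tuple = ()
    service_account_patterns: tuple = ()
    api_endpoint_signatures: tuple = ()

    def __post_init__(self):
        # Reject values outside the documented capability classes and risk tiers.
        if self.capability_class not in CAPABILITY_CLASSES:
            raise ValueError(f"unknown capability class: {self.capability_class}")
        if self.risk_tier not in RISK_TIERS:
            raise ValueError(f"unknown risk tier: {self.risk_tier}")

    def matches_endpoint(self, host: str) -> bool:
        """Return True if the host matches any glob-style endpoint signature."""
        return any(fnmatch(host, p) for p in self.api_endpoint_signatures)


# Hypothetical entry; the values are placeholders, not real fingerprint data.
example = PlatformFingerprint(
    vendor="ExampleAI",
    capability_class="llm",
    risk_tier="high",
    default_recommendation="enroll_via_proxy",
    api_endpoint_signatures=("api.example-ai.test", "*.example-ai.test"),
)
```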

Connectors

Discovery connectors pull AI platform data from your existing identity and SaaS systems. Supported connector types:

Type	Systems
IdP	Okta, Azure AD, Google Workspace
Collaboration	Microsoft 365, Slack
Developer	GitHub
CRM / Support	Salesforce, Zendesk, ServiceNow
Project	Atlassian (Jira/Confluence)

Connectors run on a configurable sync interval (default: 12 hours). Credentials are AES-256-GCM encrypted at rest using the same KMS pipeline as the data protection module.

Discovery State Machine

Each platform moves through governance states:

detected → evaluated → [governed | ungoverned | suppressed]

detected — platform found in connector data or browser telemetry
evaluated — OPA evaluated exposure_policy.rego against platform metadata
governed — a Behavry-enrolled agent is linked to this platform
ungoverned — platform is in use but has no enrolled agent
suppressed — manually acknowledged by an operator (internal tool, etc.)

OPA evaluates the exposure policy at each state transition and can fire alerts for ungoverned high-risk platforms.
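The state machine above can be sketched as a table of allowed transitions. The names below are illustrative rather than Behavry's internal types. The ungoverned → governed edge is included because the governance workflow has a platform become governed once an agent is enrolled.

```python
from enum import Enum


class PlatformState(str, Enum):
    DETECTED = "detected"
    EVALUATED = "evaluated"
    GOVERNED = "governed"
    UNGOVERNED = "ungoverned"
    SUPPRESSED = "suppressed"


S = PlatformState

# Forward-only transitions; anything not listed here is rejected.
ALLOWED_TRANSITIONS = {
    S.DETECTED: {S.EVALUATED},
    S.EVALUATED: {S.GOVERNED, S.UNGOVERNED, S.SUPPRESSED},
    S.UNGOVERNED: {S.GOVERNED},  # agent enrollment
    S.GOVERNED: set(),
    S.SUPPRESSED: set(),
}


def transition(current: PlatformState, target: PlatformState) -> PlatformState:
    """Return the target state, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

A policy evaluation hook would sit inside `transition` in a fuller version, since the exposure policy runs at each state change.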

Exposure Score

The exposure score (0–100) is a per-tenant aggregate that reflects:

Number of ungoverned platforms relative to total discovered
Risk tier distribution across platforms
Payload ceiling breaches (large data volumes to ungoverned platforms)

A score of 0 indicates full governance coverage. A score above 70 triggers an alert and prompts enrollment.
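The exact scoring formula is not specified here. Purely to show how the three inputs could combine into a 0–100 value, the sketch below uses a risk-weighted share of ungoverned platforms plus a fixed penalty per payload ceiling breach. The weights, the penalty, and the dictionary keys are all hypothetical.

```python
# Hypothetical weights per risk tier; not Behavry's actual values.
RISK_WEIGHTS = {"low": 1, "medium": 2, "high": 4, "critical": 8}


def exposure_score(platforms, breach_penalty=10.0):
    """Illustrative per-tenant score in the range 0-100.

    Each platform is a dict with 'risk_tier' (str), 'governed' (bool),
    and optionally 'payload_breach' (bool). These keys are assumptions.
    """
    if not platforms:
        return 0
    total_weight = sum(RISK_WEIGHTS[p["risk_tier"]] for p in platforms)
    ungoverned = [p for p in platforms if not p["governed"]]
    ungoverned_weight = sum(RISK_WEIGHTS[p["risk_tier"]] for p in ungoverned)
    # Risk-weighted share of platforms that have no governance coverage.
    base = 100.0 * ungoverned_weight / total_weight
    breaches = sum(1 for p in ungoverned if p.get("payload_breach"))
    return min(100, round(base + breach_penalty * breaches))
```

With this shape, a tenant whose platforms are all governed scores 0, which matches the meaning of a zero score described above.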

Passive Activity Classification

Passive classification analyzes behavioral signals from audit events and browser telemetry to determine whether observed activity is consistent with autonomous AI agent usage rather than human-initiated use.

Observation Windows

The ActivityWindowAggregator queries audit_events for a given platform over a rolling window (configurable, default 24 hours) and computes:

p50 inter-event interval — time between consecutive tool calls
Session continuity — whether events span a sustained session or are isolated
Chained consumption — whether tool calls form read→process→write chains
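The first two window metrics can be computed from event timestamps alone. The sketch below assumes a list of `datetime` values for one platform. The function names and the 5-minute inactivity gap are assumptions. The 30-minute minimum comes from the `session_continuity` signal description.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


def p50_inter_event_interval(timestamps: List[datetime]) -> Optional[float]:
    """Median gap in seconds between consecutive events, or None if fewer than two."""
    if len(timestamps) < 2:
        return None
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    return median(gaps)


def is_sustained_session(
    timestamps: List[datetime],
    min_duration: timedelta = timedelta(minutes=30),
    max_gap: timedelta = timedelta(minutes=5),  # assumed inactivity threshold
) -> bool:
    """True if events span more than min_duration with no gap above max_gap."""
    if len(timestamps) < 2:
        return False
    ordered = sorted(timestamps)
    longest_gap = max(b - a for a, b in zip(ordered, ordered[1:]))
    return ordered[-1] - ordered[0] > min_duration and longest_gap <= max_gap
```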

Signal Evaluator

Eight signals are evaluated per observation window:

Signal	Description	Weight
`endpoint_match`	Tool call matches known AI API endpoint pattern	High
`service_account_origin`	Requests originate from a service account (non-interactive)	High
`call_cadence`	Inter-event interval consistent with automated agent (not human typing)	Medium
`non_interactive_timing`	Activity outside business hours or in overnight bursts	Medium
`session_continuity`	Sustained session > 30 minutes with no human-side inactivity gap	Medium
`chained_consumption`	Tool calls form a read → transform → write causal chain	Medium
`tool_schema_present`	Tool call includes structured JSON schema arguments (agent, not human copy-paste)	Low
`structured_handoff`	Output of one call is passed as input to the next (chaining detected)	Low

Confidence Tiers

Findings are classified into four tiers based on signal combination:

Tier	Label	Criteria
Tier 1	Telemetry Observed	`endpoint_match` only — could be human-initiated
Tier 2	Likely AI-Related	`endpoint_match` + ≥1 behavioral signal, or multi-source corroboration
Tier 3	Strongly Indicative	`endpoint_match` + ≥2 behavioral signals, or any payload signal, or ≥3 behavioral signals
Tier 4	Operator Confirmed	Manually confirmed by an operator (cannot be auto-assigned)
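The tier criteria in the table can be read as a small decision function. In this sketch, every signal other than `endpoint_match` is treated as behavioral, and the payload signal and multi-source corroboration are passed in as booleans. Both choices are assumptions, as is returning `None` for combinations the table does not cover.

```python
from typing import Iterable, Optional

# Assumption: all signals except endpoint_match count as behavioral.
BEHAVIORAL_SIGNALS = frozenset({
    "service_account_origin",
    "call_cadence",
    "non_interactive_timing",
    "session_continuity",
    "chained_consumption",
    "tool_schema_present",
    "structured_handoff",
})


def classify_tier(
    signals: Iterable[str],
    payload_signal: bool = False,
    multi_source: bool = False,
) -> Optional[int]:
    """Return the automatic tier (1-3), or None if no criterion applies.

    Tier 4 is never returned: it requires operator confirmation.
    """
    fired = set(signals)
    endpoint = "endpoint_match" in fired
    behavioral = len(fired & BEHAVIORAL_SIGNALS)

    if (endpoint and behavioral >= 2) or payload_signal or behavioral >= 3:
        return 3
    if (endpoint and behavioral >= 1) or multi_source:
        return 2
    if endpoint:
        return 1
    return None
```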

Evidence Language

System-generated evidence summaries use observational language: "observed", "detected", "consistent with". The system never characterizes unconfirmed activity as malicious, unauthorized, or a threat — that judgement is left to the operator.

Tier Promotion

Findings only move upward in tier — they are never automatically downgraded. STRONGLY_INDICATIVE_DETECTED fires once on the first Tier 3 promotion for a platform (guarded by a strongly_indicative_alerted flag). Tier 4 can only be set by an operator via the API.
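The promotion rules can be sketched as a single function: tiers only rise, the Tier 3 alert is guarded by a per-platform flag, and Tier 4 requires an operator. The class and function names are illustrative. Only `strongly_indicative_alerted` and the alert name come from the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    platform_id: str
    confidence_tier: int = 1


@dataclass
class PlatformAlertState:
    strongly_indicative_alerted: bool = False


def promote(
    finding: Finding,
    platform: PlatformAlertState,
    new_tier: int,
    by_operator: bool = False,
) -> List[str]:
    """Apply a tier change and return the names of any alerts to fire."""
    if new_tier == 4 and not by_operator:
        raise PermissionError("Tier 4 can only be set by an operator")
    if new_tier <= finding.confidence_tier:
        return []  # never downgrade automatically
    finding.confidence_tier = new_tier
    alerts = []
    if new_tier == 3 and not platform.strongly_indicative_alerted:
        # Fire once per platform, then remember that the alert was sent.
        platform.strongly_indicative_alerted = True
        alerts.append("STRONGLY_INDICATIVE_DETECTED")
    return alerts
```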

OPA Policies

Two OPA policies govern asset discovery behavior:

`policies/discovery/exposure_policy.rego`

Evaluates platform risk on state transitions:

package behavry.discovery.exposure

import rego.v1

# Fire an alert for ungoverned high-risk platforms
alert if {
    input.platform.risk_tier in {"high", "critical"}
    input.platform.state == "ungoverned"
}

# Recommend proxy enrollment when the payload ceiling is breached
# (10,000,000 bytes over 30 days)
recommend_enrollment if {
    input.platform.payload_bytes_30d > 10000000
    input.platform.state == "ungoverned"
}
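For reference, a policy in this package can be queried through OPA's Data API by posting an `input` document to the path derived from the package name. The sketch below only builds the request and parses a response body. Whether Behavry queries OPA over REST, and the `localhost:8181` address, are assumptions.

```python
import json
import urllib.request

OPA_URL = "http://localhost:8181"  # assumption: local OPA on its default port
EXPOSURE_PATH = "/v1/data/behavry/discovery/exposure"


def build_exposure_query(platform: dict) -> urllib.request.Request:
    """Build a Data API request that evaluates the exposure package."""
    body = json.dumps({"input": {"platform": platform}}).encode("utf-8")
    return urllib.request.Request(
        OPA_URL + EXPOSURE_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def read_decisions(response_body: dict) -> tuple:
    """Return (alert, recommend_enrollment) from a Data API response.

    Rules that did not match are undefined and absent from the result,
    so both values default to False.
    """
    result = response_body.get("result", {})
    return (
        bool(result.get("alert", False)),
        bool(result.get("recommend_enrollment", False)),
    )


request = build_exposure_query(
    {"risk_tier": "high", "state": "ungoverned", "payload_bytes_30d": 0}
)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```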

`policies/discovery/classification_policy.rego`

Drives alert severity and review SLAs for passive findings:

package behavry.discovery.classification

import rego.v1

# Alert severity based on tier and governance status
alert_severity := "medium" if {
    input.finding.confidence_tier == 3
    input.finding.governance_status == "ungoverned"
}

alert_severity := "high" if {
    input.finding.confidence_tier == 4
    input.finding.governance_status == "ungoverned"
}

# Findings overdue for review
review_overdue if {
    input.finding.confidence_tier == 3
    input.finding.status == "pending"
    # More than 72 hours since creation
    input.hours_since_created > 72
}

# Recommend proxy enrollment (more than 5,000,000 bytes over 30 days)
proxy_enrollment_recommended if {
    input.finding.confidence_tier >= 3
    input.finding.governance_status == "ungoverned"
    input.platform.payload_bytes_30d > 5000000
}
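The classification policy reads `input.hours_since_created`, so the caller has to compute that value before evaluation. A minimal sketch of assembling the input document follows. It assumes the finding carries an ISO 8601 `created_at` timestamp, which is a hypothetical field name.

```python
from datetime import datetime, timezone
from typing import Optional


def build_classification_input(
    finding: dict, platform: dict, now: Optional[datetime] = None
) -> dict:
    """Assemble the input document expected by the classification policy."""
    now = now or datetime.now(timezone.utc)
    created = datetime.fromisoformat(finding["created_at"])  # assumed field
    hours = (now - created).total_seconds() / 3600
    return {
        "finding": {
            "confidence_tier": finding["confidence_tier"],
            "governance_status": finding["governance_status"],
            "status": finding["status"],
        },
        "platform": {"payload_bytes_30d": platform["payload_bytes_30d"]},
        "hours_since_created": hours,
    }
```

A pending Tier 3 finding created 73 hours before `now` would then satisfy `review_overdue`.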

Dashboard: Exposure Page

The Exposure page (/exposure) shows the full asset inventory organized into three operational buckets:

Bucket	Description
Confirmed	Tier 4 (operator-confirmed) findings and governed platforms
Suspected	Tier 2–3 findings — likely AI activity, review recommended
Ungoverned	All platforms with no enrolled agent
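Read literally, the bucket definitions can overlap: an ungoverned platform with a Tier 3 finding fits both Suspected and Ungoverned. The sketch below returns every bucket that applies. How the dashboard actually resolves overlaps is not specified, so treat this as one possible reading.

```python
from typing import Iterable, Set


def buckets_for(platform_state: str, finding_tiers: Iterable[int]) -> Set[str]:
    """Return every Exposure page bucket a platform would fall into."""
    tiers = set(finding_tiers)
    buckets = set()
    if platform_state == "governed" or 4 in tiers:
        buckets.add("Confirmed")
    if tiers & {2, 3}:
        buckets.add("Suspected")
    if platform_state == "ungoverned":
        buckets.add("Ungoverned")
    return buckets
```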

Platforms Tab

Displays all discovered platforms with:

Vendor name, capability class, risk tier badge
Governance status (governed / ungoverned / suppressed)
Finding count by confidence tier (● Tier 1 · ● Tier 2 · ● Tier 3)
Exposure score gauge (0–100)

Activity Findings Tab

Passive findings list with filters by tier, governance status, and platform. Each finding opens a slide-over with:

Signal breakdown (which of the 8 signals fired)
Evidence summary (system-generated, observational language)
Operator action panel: Confirm (→ Tier 4), Suppress, Reclassify

API Reference

See REST API — Discovery for full endpoint documentation.

Key endpoints:

Method	Path	Description
`POST`	`/api/v1/discovery/connectors`	Create a discovery connector
`POST`	`/api/v1/discovery/connectors/{id}/sync`	Trigger immediate sync
`POST`	`/api/v1/discovery/connectors/{id}/test`	Test connector connectivity
`GET`	`/api/v1/discovery/platforms`	List discovered platforms
`PATCH`	`/api/v1/discovery/platforms/{id}`	Update governance metadata
`GET`	`/api/v1/discovery/summary`	Exposure score + findings breakdown
`GET`	`/api/v1/discovery/findings`	List passive findings
`GET`	`/api/v1/discovery/findings/breakdown`	Tier counts summary
`POST`	`/api/v1/discovery/findings/{id}/classify`	Promote to Tier 4 (operator confirm)
`POST`	`/api/v1/discovery/findings/{id}/suppress`	Suppress a finding
`POST`	`/api/v1/discovery/findings/{id}/reclassify`	Reclassify (never downgrades Tier 4)
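As a usage illustration for the endpoints above, the following sketch builds requests with the standard library. The base URL, the bearer-token authentication scheme, and the connector ID are placeholders. The REST API reference is the authority on authentication and payloads.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://behavry.example.com"  # hypothetical deployment URL


def discovery_request(
    method: str, path: str, token: str, payload: Optional[dict] = None
) -> urllib.request.Request:
    """Build a request against /api/v1/discovery (bearer auth is an assumption)."""
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/discovery{path}",
        data=data,
        headers=headers,
        method=method,
    )


# Trigger an immediate sync for a placeholder connector ID, then list findings.
sync_req = discovery_request("POST", "/connectors/conn_123/sync", token="TOKEN")
list_req = discovery_request("GET", "/findings", token="TOKEN")
# Send with urllib.request.urlopen(sync_req).
```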

Environment Variables

Variable	Description
`BEHAVRY_LOCAL_ENCRYPTION_KEY`	Base64-encoded 32-byte key used to encrypt connector credentials at rest

Connector credentials (API tokens, OAuth secrets) are always encrypted before storage and never returned in API responses — only a short hint is exposed.
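A startup check for this variable might look like the sketch below, which decodes the value and verifies that it is exactly 32 bytes. The function name is illustrative, and standard (not URL-safe) Base64 is an assumption.

```python
import base64
import binascii
import os
from typing import Mapping


def load_encryption_key(env: Mapping[str, str] = os.environ) -> bytes:
    """Decode BEHAVRY_LOCAL_ENCRYPTION_KEY and check that it is 32 bytes."""
    raw = env.get("BEHAVRY_LOCAL_ENCRYPTION_KEY")
    if not raw:
        raise RuntimeError("BEHAVRY_LOCAL_ENCRYPTION_KEY is not set")
    try:
        key = base64.b64decode(raw, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise RuntimeError("key is not valid Base64") from exc
    if len(key) != 32:
        raise RuntimeError(f"key must decode to 32 bytes, got {len(key)}")
    return key


def generate_key() -> str:
    """Return a fresh random key in the expected encoding."""
    return base64.b64encode(os.urandom(32)).decode("ascii")
```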

Sync Behavior

Background sync loop: runs every sync_interval_hours per connector (default: 12h)
Manual sync: POST /api/v1/discovery/connectors/{id}/sync triggers an immediate run (returns 202 Accepted, runs async)
State transitions: only move forward (detected → evaluated → governed/ungoverned/suppressed), except that an ungoverned platform becomes governed once an agent is enrolled
Passive classification: continuous — the classifier runs whenever new audit events arrive for a platform
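The background loop's per-connector check reduces to comparing the last sync time against the configured interval. A minimal sketch, assuming each connector records a `last_synced_at` timestamp (a hypothetical field):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULT_SYNC_INTERVAL_HOURS = 12


def is_sync_due(
    last_synced_at: Optional[datetime],
    now: datetime,
    sync_interval_hours: int = DEFAULT_SYNC_INTERVAL_HOURS,
) -> bool:
    """True if the connector has never synced or its interval has elapsed."""
    if last_synced_at is None:
        return True
    return now - last_synced_at >= timedelta(hours=sync_interval_hours)


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
recent = now - timedelta(hours=3)
stale = now - timedelta(hours=12)
```

A manual sync would bypass this check entirely, since it runs immediately regardless of the interval.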

Governance Workflow

1. Discovery connector syncs: platforms appear in the inventory as detected
2. OPA evaluates the exposure policy: the platform moves to ungoverned or is enriched
3. Browser extension reports visits: telemetry signals arrive and Tier 1 findings are created
4. Passive classifier promotes findings: as behavioral signals accumulate, the tier rises
5. STRONGLY_INDICATIVE_DETECTED alert fires: at Tier 3 promotion (once per platform)
6. Operator reviews: confirms (Tier 4), suppresses, or initiates agent enrollment
7. Agent enrolled: the platform transitions to governed and the finding resolves
