AI Asset Discovery

Behavry's AI Asset Discovery module helps organizations understand what AI tools and agents are already in use across their infrastructure — before those tools are formally governed. It combines active connector-based discovery (IdP + SaaS admin APIs) with passive behavioral observation (browser telemetry + MCP audit events) to build a continuously updated inventory of AI platforms and their governance status.


Why Discovery Matters

In most organizations, AI adoption outpaces governance. Teams adopt new AI tools faster than security teams can review them. Employees interact with AI assistants through browsers, APIs, and SaaS integrations that bypass existing controls entirely. AI Asset Discovery closes this gap by:

  • Identifying AI platforms already in use (whether governed or not)
  • Scoring organizational AI exposure
  • Classifying passive observations into confidence tiers
  • Triggering alerts and enrollment recommendations for ungoverned tools

Architecture

Discovery runs as a background module within the Behavry backend. It subscribes to the internal event bus and pulls data from two input channels:

Active channel:

  IdP Connectors (Okta / Azure AD / Google)
  SaaS Admin APIs (M365 / GitHub / Slack / Salesforce / etc.)
      ↓
  Discovery Service (state machine + OPA evaluation)
      ↓
  Discovered Platforms table (PostgreSQL)
      ↓
  Exposure Score + Governance Status

Passive channel:

  Browser Extension (DOM fingerprinting, 10 AI service rules)
  MCP Audit Events (tool call patterns, session behavior)
      ↓
  Passive Activity Classifier
      ↓
  Passive Findings table (Tier 1–4)

Active Discovery

Platform Fingerprinting

The fingerprint database covers 30 AI platforms across vendors including OpenAI, Anthropic, Google, Microsoft, Mistral, Cohere, Hugging Face, Stability AI, and more. Each entry includes:

  • OAuth scopes, service account patterns, and API endpoint signatures
  • Capability class: llm, code_assistant, image_gen, embedding, audio, multimodal
  • Risk tier: low, medium, high, critical
  • Default governance recommendation
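
As a rough illustration of the entry fields listed above, a fingerprint record could be modelled like this. The class name, field names, matching rule, and the example vendor are all assumptions made for the sketch; the actual schema of the fingerprint database is not documented here.

```python
from dataclasses import dataclass

# Values taken from the lists above.
CAPABILITY_CLASSES = {"llm", "code_assistant", "image_gen", "embedding", "audio", "multimodal"}
RISK_TIERS = ("low", "medium", "high", "critical")


@dataclass(frozen=True)
class PlatformFingerprint:
    """Hypothetical shape of one fingerprint entry (field names are assumptions)."""

    vendor: str
    capability_class: str
    risk_tier: str
    oauth_scopes: tuple = ()
    service_account_patterns: tuple = ()
    api_endpoint_signatures: tuple = ()
    default_recommendation: str = "review"  # placeholder value

    def __post_init__(self):
        # Reject values outside the documented enumerations.
        if self.capability_class not in CAPABILITY_CLASSES:
            raise ValueError(f"unknown capability class: {self.capability_class}")
        if self.risk_tier not in RISK_TIERS:
            raise ValueError(f"unknown risk tier: {self.risk_tier}")

    def matches_endpoint(self, host: str) -> bool:
        """True if host equals a signature or is a subdomain of one (assumed rule)."""
        return any(
            host == sig or host.endswith("." + sig)
            for sig in self.api_endpoint_signatures
        )


example = PlatformFingerprint(
    vendor="ExampleAI",  # made-up vendor for illustration
    capability_class="llm",
    risk_tier="high",
    api_endpoint_signatures=("api.example-ai.test",),
)
```

Validating the enumerations at construction time keeps a typo such as an unknown risk tier from silently entering the inventory.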

Connectors

Discovery connectors pull AI platform data from your existing identity and SaaS systems. Supported connector types:

| Type | Systems |
| --- | --- |
| IdP | Okta, Azure AD, Google Workspace |
| Collaboration | Microsoft 365, Slack |
| Developer | GitHub |
| CRM / Support | Salesforce, Zendesk, ServiceNow |
| Project | Atlassian (Jira/Confluence) |

Connectors run on a configurable sync interval (default: 12 hours). Credentials are AES-256-GCM encrypted at rest using the same KMS pipeline as the data protection module.

Discovery State Machine

Each platform moves through governance states:

detected → evaluated → [governed | ungoverned | suppressed]
  • detected — platform found in connector data or browser telemetry
  • evaluated — OPA evaluated exposure_policy.rego against platform metadata
  • governed — a Behavry-enrolled agent is linked to this platform
  • ungoverned — platform is in use but has no enrolled agent
  • suppressed — manually acknowledged by an operator (internal tool, etc.)

OPA evaluates the exposure policy at each state transition and can fire alerts for ungoverned high-risk platforms.
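
The state machine above can be sketched as a transition table. This is a minimal illustration, not the service's actual code; the function and exception names are hypothetical. The extra ungoverned → governed edge is an assumption drawn from the Governance Workflow section, where enrolling an agent moves a platform to governed.

```python
# Allowed transitions, following: detected -> evaluated -> [governed | ungoverned | suppressed]
ALLOWED_TRANSITIONS = {
    "detected": {"evaluated"},
    "evaluated": {"governed", "ungoverned", "suppressed"},
    # Assumption: enrolling an agent later moves an ungoverned platform to governed.
    "ungoverned": {"governed"},
    "governed": set(),
    "suppressed": set(),
}


class InvalidTransition(Exception):
    """Raised when a requested state change is not in the transition table."""


def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {target} is not allowed")
    return target
```

In this sketch a policy evaluation hook would run inside `transition`, since OPA is described as evaluating the exposure policy at each state change.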

Exposure Score

The exposure score (0–100) is a per-tenant aggregate that reflects:

  • Number of ungoverned platforms relative to total discovered
  • Risk tier distribution across platforms
  • Payload ceiling breaches (large data volumes to ungoverned platforms)

A score of 0 indicates full governance coverage. A score above 70 triggers an alert and prompts enrollment.
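
The exact scoring formula is not given here. The sketch below shows one plausible way to combine the three listed inputs into a 0–100 value. The risk weights, the breach penalty, the exclusion of suppressed platforms, and the `payload_ceiling_breached` field are all assumptions chosen for illustration.

```python
# Assumed weights: higher-risk platforms contribute more to exposure.
RISK_WEIGHTS = {"low": 1, "medium": 2, "high": 3, "critical": 4}
# Assumed penalty, in score points, per ungoverned platform that breached its payload ceiling.
BREACH_PENALTY = 5


def exposure_score(platforms):
    """Return a 0-100 score; 0 means no ungoverned platforms (hypothetical formula)."""
    # Assumption: suppressed platforms are excluded from the calculation.
    active = [p for p in platforms if p["state"] != "suppressed"]
    total = sum(RISK_WEIGHTS[p["risk_tier"]] for p in active)
    if total == 0:
        return 0
    ungoverned = [p for p in active if p["state"] == "ungoverned"]
    weighted = sum(RISK_WEIGHTS[p["risk_tier"]] for p in ungoverned)
    breaches = sum(1 for p in ungoverned if p.get("payload_ceiling_breached"))
    score = 100 * weighted / total + BREACH_PENALTY * breaches
    return min(100, round(score))
```

With one ungoverned critical platform and one governed low-risk platform, this sketch yields 100 × 4 / 5 = 80, which would be above the documented threshold of 70.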


Passive Activity Classification

Passive classification analyzes behavioral signals from audit events and browser telemetry to determine whether observed activity is consistent with autonomous AI agent usage rather than human-initiated use.

Observation Windows

The ActivityWindowAggregator queries audit_events for a given platform over a rolling window (configurable, default 24 hours) and computes:

  • p50 inter-event interval — time between consecutive tool calls
  • Session continuity — whether events span a sustained session or are isolated
  • Chained consumption — whether tool calls form read→process→write chains
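
Two of the window metrics above can be sketched with the standard library. This is not the ActivityWindowAggregator itself. The function names and the 5-minute maximum gap are assumptions; the 30-minute minimum duration comes from the session_continuity signal description.

```python
from datetime import datetime, timedelta
from statistics import median


def p50_inter_event_interval(timestamps):
    """Median gap in seconds between consecutive events, or None with fewer than two events."""
    ordered = sorted(timestamps)
    gaps = [(later - earlier).total_seconds() for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps) if gaps else None


def is_continuous_session(
    timestamps,
    min_duration=timedelta(minutes=30),  # from the session_continuity signal description
    max_gap=timedelta(minutes=5),        # assumed definition of an inactivity gap
):
    """True if events span more than min_duration with no gap longer than max_gap."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        return False
    if ordered[-1] - ordered[0] <= min_duration:
        return False
    return all(later - earlier <= max_gap for earlier, later in zip(ordered, ordered[1:]))
```

A steady two-minute cadence sustained for more than half an hour would satisfy both checks, while two isolated events hours apart would not count as a continuous session.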

Signal Evaluator

Eight signals are evaluated per observation window:

| Signal | Description | Weight |
| --- | --- | --- |
| endpoint_match | Tool call matches known AI API endpoint pattern | High |
| service_account_origin | Requests originate from a service account (non-interactive) | High |
| call_cadence | Inter-event interval consistent with automated agent (not human typing) | Medium |
| non_interactive_timing | Activity outside business hours or in overnight bursts | Medium |
| session_continuity | Sustained session > 30 minutes with no human-side inactivity gap | Medium |
| chained_consumption | Tool calls form a read → transform → write causal chain | Medium |
| tool_schema_present | Tool call includes structured JSON schema arguments (agent, not human copy-paste) | Low |
| structured_handoff | Output of one call is passed as input to the next (chaining detected) | Low |

Confidence Tiers

Findings are classified into four tiers based on signal combination:

| Tier | Label | Criteria |
| --- | --- | --- |
| Tier 1 | Telemetry Observed | endpoint_match only (could be human-initiated) |
| Tier 2 | Likely AI-Related | endpoint_match + ≥1 behavioral signal, or multi-source corroboration |
| Tier 3 | Strongly Indicative | endpoint_match + ≥2 behavioral signals, or any payload signal, or ≥3 behavioral signals |
| Tier 4 | Operator Confirmed | Manually confirmed by an operator (cannot be auto-assigned) |
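
The tier criteria can be read as a small decision function. The sketch below is one literal interpretation, not the classifier's actual code. It assumes that the seven signals other than endpoint_match count as "behavioral", and that payload signals and multi-source corroboration arrive as separate boolean inputs, since neither appears in the signal table.

```python
# Assumption: every signal except endpoint_match counts as "behavioral".
BEHAVIORAL_SIGNALS = frozenset({
    "service_account_origin",
    "call_cadence",
    "non_interactive_timing",
    "session_continuity",
    "chained_consumption",
    "tool_schema_present",
    "structured_handoff",
})


def classify_tier(fired, payload_signal=False, multi_source=False):
    """Return 1, 2, or 3 for automatic tiers, or None if no tier applies.

    Tier 4 is never returned: it can only be set by an operator.
    """
    fired = set(fired)
    endpoint = "endpoint_match" in fired
    behavioral = len(fired & BEHAVIORAL_SIGNALS)

    if payload_signal or behavioral >= 3 or (endpoint and behavioral >= 2):
        return 3
    if multi_source or (endpoint and behavioral >= 1):
        return 2
    if endpoint:
        return 1
    return None
```

Under this reading, one or two behavioral signals without endpoint_match produce no finding unless corroborated by another source; whether the real classifier behaves that way is not stated.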

Evidence Language

System-generated evidence summaries use observational language: "observed", "detected", "consistent with". The system never characterizes unconfirmed activity as malicious, unauthorized, or a threat — that judgement is left to the operator.

Tier Promotion

Findings only move upward in tier — they are never automatically downgraded. STRONGLY_INDICATIVE_DETECTED fires once on the first Tier 3 promotion for a platform (guarded by a strongly_indicative_alerted flag). Tier 4 can only be set by an operator via the API.
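
The promotion rules can be sketched as follows: tiers only rise, automatic classification is capped at Tier 3, and the alert is emitted once behind the strongly_indicative_alerted flag. The class and function names are hypothetical, and the sketch keeps one state object per platform for simplicity.

```python
from dataclasses import dataclass


@dataclass
class PlatformFindingState:
    """Hypothetical per-platform state used only for this illustration."""

    tier: int = 1
    strongly_indicative_alerted: bool = False


def apply_classification(state, computed_tier, alerts):
    """Apply an automatically computed tier without downgrading or reaching Tier 4."""
    capped = min(computed_tier, 3)        # Tier 4 is operator-only
    state.tier = max(state.tier, capped)  # findings never move down
    if state.tier == 3 and not state.strongly_indicative_alerted:
        alerts.append("STRONGLY_INDICATIVE_DETECTED")
        state.strongly_indicative_alerted = True  # guard: fire once per platform
    return state


def operator_confirm(state):
    """Only an explicit operator action sets Tier 4."""
    state.tier = 4
    return state
```

Passing the alert list in explicitly keeps the function free of side effects beyond the state object, which makes the fire-once guard easy to check.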


OPA Policies

Two OPA policies govern asset discovery behavior:

policies/discovery/exposure_policy.rego

Evaluates platform risk on state transitions:

package behavry.discovery.exposure

# Makes the `if` and `in` keywords available on OPA releases before 1.0
import rego.v1

# Fire an alert for ungoverned high-risk platforms
alert if {
    input.platform.risk_tier in {"high", "critical"}
    input.platform.state == "ungoverned"
}

# Recommend proxy enrollment when payload ceiling breached
# (ceiling: 10,000,000 bytes)
recommend_enrollment if {
    input.platform.payload_bytes_30d > 10000000
    input.platform.state == "ungoverned"
}
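
To try the exposure policy against a running OPA server, a request can be built for OPA's Data API, which accepts `POST /v1/data/<package path>` with an `input` document. The sketch below assumes OPA listens on `http://localhost:8181` and that the platform fields match those referenced in the policy; the helper names are hypothetical, and how the Behavry backend actually invokes OPA is not documented here.

```python
import json
import urllib.request

OPA_URL = "http://localhost:8181"  # assumption: local OPA server
POLICY_PATH = "behavry/discovery/exposure"  # package name with dots replaced by slashes


def build_opa_request(platform, opa_url=OPA_URL):
    """Build (but do not send) a Data API request for the exposure policy."""
    body = json.dumps({"input": {"platform": platform}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{opa_url}/v1/data/{POLICY_PATH}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def evaluate_exposure(platform, opa_url=OPA_URL):
    """Send the request and return the decisions; undefined rules default to False."""
    with urllib.request.urlopen(build_opa_request(platform, opa_url), timeout=5) as resp:
        result = json.load(resp).get("result", {})
    return {
        "alert": result.get("alert", False),
        "recommend_enrollment": result.get("recommend_enrollment", False),
    }
```

OPA omits rules that are undefined for the given input, so the sketch reads each decision with a `False` default rather than indexing the result directly.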

policies/discovery/classification_policy.rego

Drives alert severity and review SLAs for passive findings:

package behavry.discovery.classification

# Makes the `if` keyword available on OPA releases before 1.0
import rego.v1

# Alert severity based on tier and governance status
alert_severity := "medium" if {
    input.finding.confidence_tier == 3
    input.finding.governance_status == "ungoverned"
}

alert_severity := "high" if {
    input.finding.confidence_tier == 4
    input.finding.governance_status == "ungoverned"
}

# Findings overdue for review
review_overdue if {
    input.finding.confidence_tier == 3
    input.finding.status == "pending"
    # More than 72 hours since creation
    input.hours_since_created > 72
}

# Recommend proxy enrollment (threshold: 5,000,000 bytes)
proxy_enrollment_recommended if {
    input.finding.confidence_tier >= 3
    input.finding.governance_status == "ungoverned"
    input.platform.payload_bytes_30d > 5000000
}

Dashboard: Exposure Page

The Exposure page (/exposure) shows the full asset inventory organized into three operational buckets:

| Bucket | Description |
| --- | --- |
| Confirmed | Tier 4 (operator-confirmed) findings and governed platforms |
| Suspected | Tier 2–3 findings: likely AI activity, review recommended |
| Ungoverned | All platforms with no enrolled agent |

Platforms Tab

Displays all discovered platforms with:

  • Vendor name, capability class, risk tier badge
  • Governance status (governed / ungoverned / suppressed)
  • Finding count by confidence tier (● Tier 1 · ● Tier 2 · ● Tier 3)
  • Exposure score gauge (0–100)

Activity Findings Tab

Passive findings list with filters by tier, governance status, and platform. Each finding opens a slide-over with:

  • Signal breakdown (which of the 8 signals fired)
  • Evidence summary (system-generated, observational language)
  • Operator action panel: Confirm (→ Tier 4), Suppress, Reclassify

API Reference

See REST API — Discovery for full endpoint documentation.

Key endpoints:

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/v1/discovery/connectors | Create a discovery connector |
| POST | /api/v1/discovery/connectors/{id}/sync | Trigger immediate sync |
| POST | /api/v1/discovery/connectors/{id}/test | Test connector connectivity |
| GET | /api/v1/discovery/platforms | List discovered platforms |
| PATCH | /api/v1/discovery/platforms/{id} | Update governance metadata |
| GET | /api/v1/discovery/summary | Exposure score + findings breakdown |
| GET | /api/v1/discovery/findings | List passive findings |
| GET | /api/v1/discovery/findings/breakdown | Tier counts summary |
| POST | /api/v1/discovery/findings/{id}/classify | Promote to Tier 4 (operator confirm) |
| POST | /api/v1/discovery/findings/{id}/suppress | Suppress a finding |
| POST | /api/v1/discovery/findings/{id}/reclassify | Reclassify (never downgrades Tier 4) |
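
A few of these endpoints could be called from Python roughly as follows. The paths and methods come from the table above; the base URL is a placeholder, and bearer-token authentication and the helper names are assumptions, since request and response schemas are covered in the full REST reference rather than here.

```python
import json
import urllib.request

BASE_URL = "https://behavry.example.test"  # placeholder host, not a real deployment


def api_request(method, path, token, payload=None, base_url=BASE_URL):
    """Build a request object; send it with urllib.request.urlopen(request)."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {
        "Authorization": f"Bearer {token}",  # assumption: bearer-token auth
        "Accept": "application/json",
    }
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(base_url + path, data=data, headers=headers, method=method)


def trigger_sync(connector_id, token):
    """POST /api/v1/discovery/connectors/{id}/sync"""
    return api_request("POST", f"/api/v1/discovery/connectors/{connector_id}/sync", token)


def list_platforms(token):
    """GET /api/v1/discovery/platforms"""
    return api_request("GET", "/api/v1/discovery/platforms", token)


def confirm_finding(finding_id, token):
    """POST /api/v1/discovery/findings/{id}/classify (operator confirm, Tier 4)"""
    return api_request("POST", f"/api/v1/discovery/findings/{finding_id}/classify", token)
```

Building the request separately from sending it makes the URL and method easy to inspect before any call reaches a live tenant.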

Environment Variables

| Variable | Description |
| --- | --- |
| BEHAVRY_LOCAL_ENCRYPTION_KEY | Base64-encoded 32-byte key used to encrypt connector credentials at rest |

Connector credentials (API tokens, OAuth secrets) are always encrypted before storage and never returned in API responses — only a short hint is exposed.
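
A key of the documented shape can be generated and checked with the standard library. This is a sketch under the assumption that the standard Base64 alphabet is expected (not the URL-safe variant); the helper names are hypothetical.

```python
import base64
import binascii
import os
import secrets

ENV_VAR = "BEHAVRY_LOCAL_ENCRYPTION_KEY"


def generate_key() -> str:
    """Return a fresh Base64-encoded 32-byte key suitable for the variable."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def load_encryption_key(env=os.environ) -> bytes:
    """Decode the key from the environment and verify it is exactly 32 bytes."""
    raw = env.get(ENV_VAR)
    if not raw:
        raise RuntimeError(f"{ENV_VAR} is not set")
    try:
        key = base64.b64decode(raw, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise RuntimeError(f"{ENV_VAR} is not valid Base64") from exc
    if len(key) != 32:
        raise RuntimeError(f"{ENV_VAR} must decode to 32 bytes, got {len(key)}")
    return key
```

Checking the decoded length at startup surfaces a truncated or mis-pasted key immediately, rather than at the first credential encryption.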


Sync Behavior

  • Background sync loop: runs every sync_interval_hours per connector (default: 12h)
  • Manual sync: POST /api/v1/discovery/connectors/{id}/sync triggers an immediate run (returns 202 Accepted, runs async)
  • State transitions: only move forward (detected → evaluated → governed/ungoverned/suppressed)
  • Passive classification: continuous — the classifier runs whenever new audit events arrive for a platform
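
The per-connector interval check in the background loop can be sketched as below. The function name and the treatment of a never-synced connector as immediately due are assumptions; only the 12-hour default comes from the text.

```python
from datetime import datetime, timedelta, timezone


def is_sync_due(last_sync_at, sync_interval_hours=12, now=None):
    """True if the connector has never synced or its interval has elapsed."""
    now = now or datetime.now(timezone.utc)
    if last_sync_at is None:
        return True  # assumption: a new connector syncs on the first loop pass
    return now - last_sync_at >= timedelta(hours=sync_interval_hours)
```

A manual sync through the API would bypass this check entirely, since it is described as triggering an immediate asynchronous run.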

Governance Workflow

  1. Discovery connector syncs — platforms appear in the inventory as detected
  2. OPA evaluates exposure policy — platform moves to ungoverned or is enriched
  3. Browser extension reports visits — telemetry signals arrive, Tier 1 findings created
  4. Passive classifier promotes findings — as behavioral signals accumulate, tier rises
  5. STRONGLY_INDICATIVE_DETECTED alert fires — at Tier 3 promotion (once per platform)
  6. Operator reviews — confirms (Tier 4), suppresses, or initiates agent enrollment
  7. Agent enrolled — platform transitions to governed, finding resolves