# Behavry Integration -- Google Gemini API Proxy
For teams building applications with the Google Generative AI SDK, Behavry can proxy all Gemini API calls for identity verification, policy enforcement, and audit logging.
## How It Works

```
Your Code (google-generativeai SDK)
        |
        |  base_url=http://localhost:8000/api/v1/gemini
        v
Behavry Proxy
        |
        |  validates JWT | audits metadata | checks OPA policy
        v
Google AI API (generativelanguage.googleapis.com)
        ^
        |  response streamed back
```
The proxy:

- Validates your Behavry agent JWT (`Authorization: Bearer <behavry-jwt>`)
- Extracts your Google API key from the `X-Gemini-Key` header (never logged)
- Audits request metadata: model (from URL path), contents count, system instruction presence (boolean only), tool use -- not message content
- Forwards the request to `https://generativelanguage.googleapis.com/{path}` with your key set as `x-goog-api-key`
- Streams the response back transparently
- Audits response metadata: `usageMetadata` token counts, finish reason
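The header swap at the core of the forwarding step can be sketched in a few lines. This is a hypothetical helper for illustration, not the actual Behavry implementation: the validated JWT is dropped, and the client's `X-Gemini-Key` becomes Google's `x-goog-api-key`.

```python
UPSTREAM = "https://generativelanguage.googleapis.com"

def build_upstream_headers(incoming: dict) -> dict:
    """Rewrite inbound proxy headers for the Google upstream.

    The Behavry JWT in Authorization is dropped (it was already validated);
    the client's X-Gemini-Key is promoted to Google's x-goog-api-key.
    """
    return {
        "Content-Type": incoming.get("Content-Type", "application/json"),
        "x-goog-api-key": incoming["X-Gemini-Key"],
    }

# The rewritten request is then POSTed to f"{UPSTREAM}/{path}" and the
# response body is streamed back to the caller unchanged.
```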
## Prerequisites

- Behavry stack running (`make dev` or `docker compose up`)
- A Behavry agent with `web:read` and `web:write` permissions
- Your Google AI API key (`AIza...`)
## Step 1 -- Get a Behavry JWT

```bash
curl -s -X POST http://localhost:8000/api/v1/auth/token \
  -H "Content-Type: application/json" \
  -d '{"client_id": "YOUR_CLIENT_ID", "client_secret": "YOUR_SECRET", "grant_type": "client_credentials"}' \
  | jq -r .access_token
```
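The same token fetch can be done from Python with only the standard library. The function names here are illustrative; the endpoint and payload mirror the curl call above.

```python
import json
import urllib.request

def token_request_body(client_id: str, client_secret: str) -> bytes:
    # JSON body for the client-credentials grant, same as the curl -d payload.
    return json.dumps({
        "client_id": client_id,
        "client_secret": client_secret,
        "grant_type": "client_credentials",
    }).encode()

def fetch_behavry_jwt(client_id: str, client_secret: str,
                      base_url: str = "http://localhost:8000") -> str:
    # POST to the token endpoint and read access_token from the JSON response.
    req = urllib.request.Request(
        f"{base_url}/api/v1/auth/token",
        data=token_request_body(client_id, client_secret),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["access_token"]
```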
## Step 2 -- Configure Your Code
### Python (google-genai SDK)

The `genai.Client` interface shown here is provided by the `google-genai` package (`pip install google-genai`), not the older `google-generativeai` package.

```python
from google import genai  # google-genai SDK

BEHAVRY_JWT = "eyJhbGci..."  # Behavry agent token
GEMINI_KEY = "AIza..."       # your real Google API key

# Point the SDK at the Behavry proxy
client = genai.Client(
    api_key=BEHAVRY_JWT,
    http_options={
        "base_url": "http://localhost:8000/api/v1/gemini",
        "headers": {
            "X-Gemini-Key": GEMINI_KEY,  # forwarded to Google, never logged
        },
    },
)

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Hello!",
)
print(response.text)
```
### Environment Variables
```bash
export GEMINI_BASE_URL=http://localhost:8000/api/v1/gemini
export BEHAVRY_JWT=<behavry-jwt>
export GEMINI_API_KEY=AIza...
```
```python
import os

from google import genai  # google-genai SDK

client = genai.Client(
    api_key=os.environ["BEHAVRY_JWT"],
    http_options={
        "base_url": os.environ["GEMINI_BASE_URL"],
        "headers": {"X-Gemini-Key": os.environ["GEMINI_API_KEY"]},
    },
)
```
### Direct HTTP (curl)
```bash
curl -X POST "http://localhost:8000/api/v1/gemini/v1beta/models/gemini-2.0-flash:generateContent" \
  -H "Authorization: Bearer $BEHAVRY_JWT" \
  -H "X-Gemini-Key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "Hello!"}]}]
  }'
```
## Step 3 -- Verify in Dashboard
Make a request and check http://localhost:5173 -- Live Activity.
You should see an event with:

- `tool_name`: `gemini-api`
- `mcp_server`: `gemini-proxy`
- `action`: `POST`
- `policy_result`: `allow`
## Audited Metadata
The proxy logs the following -- message content and system instruction text are never stored:
| Field | Example |
|---|---|
| Model | `gemini-2.0-flash` (extracted from URL path) |
| Has system instruction | `true` (boolean only) |
| Has tools | `true` (boolean only) |
| Message count | `3` (number of `contents` entries) |
| Input tokens (`promptTokenCount`) | `89` |
| Output tokens (`candidatesTokenCount`) | `234` |
| Finish reason | `STOP` |
The model name is extracted from the URL path. For example, `v1beta/models/gemini-1.5-pro:generateContent` yields `gemini-1.5-pro`.
## Key Headers

| Header | Purpose |
|---|---|
| `Authorization: Bearer <jwt>` | Behavry agent identity (validated, never forwarded) |
| `X-Gemini-Key: AIza...` | Google API key (forwarded as `x-goog-api-key` upstream, never logged) |
The proxy strips your Behavry JWT and replaces it with the standard Google authentication header. Your API key never appears in audit logs.
## Endpoint

```
POST /api/v1/gemini/{path}
```

The `{path}` parameter captures the full Gemini API path. All standard Gemini API endpoints are supported:
| Gemini Endpoint | Behavry Path |
|---|---|
| `v1beta/models/gemini-2.0-flash:generateContent` | `/api/v1/gemini/v1beta/models/gemini-2.0-flash:generateContent` |
| `v1beta/models/gemini-2.0-flash:streamGenerateContent` | `/api/v1/gemini/v1beta/models/gemini-2.0-flash:streamGenerateContent` |
| `v1beta/models` | `/api/v1/gemini/v1beta/models` |
Query parameters are forwarded as-is.
## Streaming
Streaming is fully supported. When the upstream Gemini response uses the `text/event-stream` content type (e.g., `streamGenerateContent`), the proxy passes SSE chunks through without buffering. An audit event is published at the start of the stream with the upstream status code; token counts from individual stream chunks are not aggregated.
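The content-type check that gates pass-through streaming might look like the following. This is an illustrative helper, not the actual proxy code; it handles the common case where the upstream appends a charset parameter.

```python
def is_sse(content_type: str) -> bool:
    # Streaming is detected by the media type alone; parameters such as
    # "charset=utf-8" after the semicolon are ignored.
    return content_type.split(";")[0].strip().lower() == "text/event-stream"
```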
## Policy Control
Example OPA policy to restrict model access:
```rego
package behavry.authz

import rego.v1  # required for the `if` rule syntax

# Only allow specific Gemini models
deny if {
    input.mcp_server == "gemini-proxy"
    not startswith(input.model, "gemini-2.0")
    not startswith(input.model, "gemini-1.5")
}

# Block Gemini usage for specific agent roles
deny if {
    input.mcp_server == "gemini-proxy"
    input.agent_role == "read-only"
}
```
## Troubleshooting
### 401 from Behavry
JWT expired or missing. Re-fetch using Step 1.
### 401 from Google (passed through)

Your `X-Gemini-Key` is invalid, or the API key does not have access to the requested model. Verify your Google AI API key in Google AI Studio.
### Missing `X-Gemini-Key`

The proxy returns 401 if the `X-Gemini-Key` header is not present. Ensure your SDK configuration includes this header.
### 504 Gateway Timeout
The upstream Gemini API request exceeded the 120-second timeout. For long-running requests, consider breaking them into smaller prompts.
### Model not found in audit

The proxy extracts the model name from the URL path segment containing a colon (e.g., `gemini-2.0-flash` from `gemini-2.0-flash:generateContent`). If your path does not match this pattern, the model field may be `null` in audit events.