Platform Capabilities
Everything You Need to Govern Enterprise AI
From DLP scanning and compliance frameworks to cost controls and multi-tenant billing —
AgentWatch covers the full spectrum of enterprise AI management.
Core
LLM Gateway & Routing
A single OpenAI-compatible API that routes to every major AI provider with zero code changes.
Multi-Provider Support
Every Model, One API
- OpenAI — GPT-4o, GPT-4o-mini, GPT-3.5-turbo, embeddings
- Anthropic — Claude 3.7 Sonnet, Claude 3 Haiku
- OpenRouter — 100+ models via meta-routing
- Google — Gemini models
- Custom Providers — Any OpenAI-compatible endpoint
- Mistral, Groq, Together — and more via custom config
Drop-In Replacement
Zero Code Changes
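# Point the standard OpenAI client at the gateway instead of api.openai.com.
from openai import OpenAI

client = OpenAI(
    base_url="http://your-hub:8787/v1",  # gateway endpoint
    api_key="your-tenant-key",           # per-tenant API key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)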
Request flow: Client App (any SDK/HTTP) → Auth & DLP (validate and scan) → Router (provider selection) → LLM Provider (OpenAI / Anthropic) → Usage Tracking (log and meter) → Response (back to client)
🔄
Streaming Support
Full SSE streaming support for all providers. Real-time token delivery with proper backpressure handling.
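As a rough illustration of what a streaming client sees, the sketch below parses server-sent events into text fragments. It assumes the gateway emits OpenAI-style `data: <json>` lines terminated by `data: [DONE]`; the helper name `iter_sse_deltas` and the sample payloads are hypothetical, not part of AgentWatch.

```python
import json
from typing import Iterable, Iterator


def iter_sse_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from OpenAI-style SSE lines.

    Assumption: each event is a single `data: <json>` line and the
    stream ends with `data: [DONE]`.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hypothetical sample stream, shaped like chat completion chunks.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "",
    "data: [DONE]",
]
streamed_text = "".join(iter_sse_deltas(sample))
print(streamed_text)
```

With the OpenAI SDK this parsing is handled for you: passing `stream=True` to `client.chat.completions.create` returns an iterator of chunks.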
💾
Response Caching
Configurable in-memory cache for identical requests. Reduce costs and latency for repeated queries.
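One way such a cache could work is to hash a canonical form of the request and store the response with an expiry. This is a minimal sketch under that assumption; the `ResponseCache` class, its TTL default, and the hashing scheme are illustrative, not the product's actual implementation.

```python
import hashlib
import json
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ResponseCache:
    """In-memory cache keyed by a hash of the request body, with a TTL."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    @staticmethod
    def key_for(request: Dict[str, Any]) -> str:
        # Sort keys so logically identical requests hash identically.
        canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, request: Dict[str, Any]) -> Optional[Any]:
        key = self.key_for(request)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the stale entry
            return None
        return response

    def put(self, request: Dict[str, Any], response: Any) -> None:
        expires_at = self._clock() + self._ttl
        self._entries[self.key_for(request)] = (expires_at, response)
```

Caching only makes sense for deterministic requests, so a real deployment would probably skip it when sampling parameters such as a non-zero temperature are set.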
🎯
Model Catalog
Browse all models with pricing, context windows, and usage stats. Enable or disable per-organization.
Security
Data Loss Prevention & Security
Automatic detection and blocking of sensitive data before it ever reaches an AI provider.
API Key Security
AES-256-GCM Encryption
- AES-256-GCM encryption (industry standard)
- Secure key derivation with random initialization vectors
- GCM authentication tags for tamper detection
- Legacy AES-256-CBC support for migration
- Keys decrypted only when needed for requests
- Keys never displayed in full in any UI
Access Control
Multi-Layer Authentication
- JWT-based session authentication (24-hour expiry)
- Role-based access control: super_admin, admin, user
- Organization-level data isolation
- Per-tenant API keys for LLM access
- Failed login tracking with account lockout
- Password policy enforcement
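Failed login tracking with lockout can be sketched as a sliding window of failure timestamps per account. The class name, the threshold of 5 failures, and the 15-minute window below are assumptions for illustration; the source does not state the actual values.

```python
import time
from typing import Callable, Dict, List


class LoginLockout:
    """Lock an account after too many failed logins inside a time window."""

    def __init__(self, max_failures: int = 5, window_seconds: float = 900.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max = max_failures
        self._window = window_seconds
        self._clock = clock
        self._failures: Dict[str, List[float]] = {}

    def _recent(self, username: str) -> List[float]:
        # Keep only failures that are still inside the window.
        cutoff = self._clock() - self._window
        recent = [t for t in self._failures.get(username, []) if t > cutoff]
        self._failures[username] = recent
        return recent

    def is_locked(self, username: str) -> bool:
        return len(self._recent(username)) >= self._max

    def record_failure(self, username: str) -> None:
        self._recent(username).append(self._clock())

    def record_success(self, username: str) -> None:
        # A successful login clears the failure history.
        self._failures.pop(username, None)
```

The login handler would call `is_locked` before checking credentials, then `record_failure` or `record_success` depending on the outcome.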
DLP Detection Coverage
| Category | Data Types Detected | Action |
| --- | --- | --- |
| PII | Email addresses, phone numbers (US/international), SSN, ITIN, EIN, passport numbers, driver's license, date of birth, names, physical addresses, ZIP codes | Block or Warn |
| PHI / HIPAA | Medical record numbers, health insurance IDs, diagnoses (ICD codes), medications, health conditions, lab results | Block or Warn |
| Financial | Credit card numbers (Visa, Mastercard, Amex, Discover), bank account numbers, routing numbers, CVV codes | Block or Warn |
| Secrets | API keys (AWS, OpenAI, Stripe, etc.), private keys, passwords, authentication tokens, connection strings | Block or Warn |
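To make the detection categories above concrete, here is a minimal pattern-based scanner covering a few of the listed data types. The regular expressions, the Luhn check for card numbers, and the `scan` / `action_for` helpers are illustrative assumptions; a production DLP engine would use far broader rules than this sketch.

```python
import re
from typing import Dict, List, Set

EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
# AWS access key IDs start with AKIA followed by 16 uppercase alphanumerics.
AWS_KEY_RE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return len(digits) >= 13 and total % 10 == 0


def scan(text: str) -> Dict[str, List[str]]:
    """Return the non-empty findings per data type."""
    findings = {
        "email": EMAIL_RE.findall(text),
        "ssn": SSN_RE.findall(text),
        "credit_card": [m.group() for m in CARD_RE.finditer(text)
                        if luhn_valid(m.group())],
        "aws_access_key": AWS_KEY_RE.findall(text),
    }
    return {kind: hits for kind, hits in findings.items() if hits}


def action_for(findings: Dict[str, List[str]], block_on: Set[str]) -> str:
    """Map findings to 'block', 'warn', or 'allow' (hypothetical policy)."""
    if any(kind in block_on for kind in findings):
        return "block"
    return "warn" if findings else "allow"
```

For example, `action_for(scan(prompt), block_on={"ssn", "credit_card"})` would block a prompt containing a valid card number but only warn on an email address.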
Automatic Data Classification
PUBLIC
INTERNAL
CONFIDENTIAL
RESTRICTED
PII
PHI
PCI
Compliance
Compliance & Content Guardrails
Built-in support for the major compliance frameworks, with configurable content guardrails for every use case.
📋
Compliance Frameworks
- GDPR (data privacy)
- HIPAA (healthcare)
- SOX (financial reporting)
- PCI-DSS (payment cards)
- SOC2 (security)
- ISO27001 (information security)
🚧
Content Guardrails
- Prompt injection protection
- Data leakage prevention
- Secret detection
- Toxicity filtering
- Hate speech detection
- Profanity & adult content filtering
🗂️
Audit & Retention
- All API operations logged
- Configuration changes tracked
- User actions with IP addresses
- Configurable retention periods
- Conversation retention
- Export for compliance reviews
Reliability
Circuit Breakers & Failover
Enterprise-grade reliability with automatic failover, intelligent retry, and health monitoring.
Circuit Breaker Pattern
Automatic Provider Protection
- CLOSED State: Normal operation, requests flow through
- OPEN State: Provider failing, requests fail fast to protect resources
- HALF_OPEN State: Testing if provider has recovered
- Configurable failure thresholds
- Automatic recovery detection
- Real-time state visibility in dashboard
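The three states above form a small state machine. The sketch below shows one common way to implement it; the class, the default threshold of 5 failures, and the 30-second recovery period are assumptions for illustration rather than AgentWatch's actual settings.

```python
import time
from enum import Enum
from typing import Callable


class State(Enum):
    CLOSED = "CLOSED"
    OPEN = "OPEN"
    HALF_OPEN = "HALF_OPEN"


class CircuitBreaker:
    """Per-provider breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED or OPEN."""

    def __init__(self, failure_threshold: int = 5, recovery_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.state = State.CLOSED
        self._threshold = failure_threshold
        self._recovery = recovery_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state is State.OPEN:
            if self._clock() - self._opened_at >= self._recovery:
                # Recovery period elapsed: start probing the provider.
                self.state = State.HALF_OPEN
                return True
            return False  # fail fast while the provider is unhealthy
        return True

    def record_success(self) -> None:
        self._failures = 0
        self.state = State.CLOSED

    def record_failure(self) -> None:
        self._failures += 1
        # A failed probe reopens immediately; otherwise wait for the threshold.
        if self.state is State.HALF_OPEN or self._failures >= self._threshold:
            self.state = State.OPEN
            self._opened_at = self._clock()
```

A caller checks `allow_request()` before contacting the provider and reports the outcome with `record_success()` or `record_failure()`.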
Intelligent Retry
Smart Failure Handling
- Exponential backoff with jitter
- Configurable max retries per provider
- Respects rate limit response headers
- Different retry strategies per error type
- Request queue with backpressure management
- Connection pool: up to 1000 sockets per host
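A hedged sketch of the delay calculation behind exponential backoff with jitter follows. It uses the "full jitter" variant and treats a server-provided `Retry-After` value as taking priority; the function names, base delay, cap, and the set of retryable status codes are assumptions, not documented AgentWatch behavior.

```python
import random
from typing import Callable, Optional

# Hypothetical policy: retry rate limits and transient server errors only.
RETRYABLE_STATUSES = {429, 500, 502, 503, 504}


def is_retryable(status: int) -> bool:
    return status in RETRYABLE_STATUSES


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  retry_after: Optional[float] = None,
                  rng: Callable[[], float] = random.random) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    If the provider sent a Retry-After value, honor it. Otherwise pick a
    random delay in [0, min(cap, base * 2**attempt)) (full jitter).
    """
    if retry_after is not None:
        return max(0.0, retry_after)
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

Randomizing the delay spreads retries from many clients over time, which avoids synchronized bursts against a provider that is already struggling.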
❤️
Health Monitoring
Continuous provider health checks with response time tracking, error rate monitoring, and availability scoring.
🔗
Fallback Chains
Define fallback providers per model. Automatic failover on errors with priority-based, health-aware routing.
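Priority-based, health-aware failover can be sketched as: filter out unhealthy providers, sort the rest by priority, and try each until one succeeds. The `Provider` dataclass, the `call_with_fallback` helper, and the fake transport below are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Provider:
    name: str
    priority: int        # lower number is tried first
    healthy: bool = True


def call_with_fallback(providers: Sequence[Provider],
                       send: Callable[[Provider], str]) -> str:
    """Try healthy providers in priority order; return the first success."""
    errors: List[str] = []
    candidates = sorted((p for p in providers if p.healthy),
                        key=lambda p: p.priority)
    for provider in candidates:
        try:
            return send(provider)
        except Exception as exc:  # any provider error triggers failover
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def fake_send(provider: Provider) -> str:
    # Stand-in for a real HTTP call: the primary provider times out.
    if provider.name == "openai":
        raise TimeoutError("timed out")
    return f"ok from {provider.name}"


chain = [
    Provider("openai", priority=1),
    Provider("anthropic", priority=2),
    Provider("openrouter", priority=0, healthy=False),  # skipped: unhealthy
]
result = call_with_fallback(chain, fake_send)
print(result)
```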
🔌
Connection Pooling
HTTP keep-alive connections (30 seconds), max 1000 sockets per host, FIFO scheduling for fair queuing.
Cost Management
Usage Tracking & Budget Controls
Token-level visibility with per-team and per-user budgets, alerts, and full cost attribution.
🧮
Token Counting
Accurate token counting using js-tiktoken for every request. Tracks prompt tokens, completion tokens, and total cost.
💳
Per-Tenant Budgets
Daily and monthly spending limits per tenant, team, and user. Automatic enforcement when budgets are exceeded.
🔔
Budget Alerts
Configurable alert thresholds (e.g., at 80% and 95% of budget). Notify admins before hard limits are hit.
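Budget enforcement with alert thresholds might look like the following sketch. The `Budget` dataclass and its method names are assumptions; the 80% and 95% thresholds mirror the example values mentioned above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Budget:
    limit_usd: float
    spent_usd: float = 0.0
    alert_thresholds: Tuple[float, ...] = (0.80, 0.95)
    fired: List[float] = field(default_factory=list)

    def would_exceed(self, cost_usd: float) -> bool:
        """Hard-limit check to run before forwarding a request."""
        return self.spent_usd + cost_usd > self.limit_usd

    def record(self, cost_usd: float) -> List[float]:
        """Add spend and return thresholds crossed for the first time."""
        self.spent_usd += cost_usd
        used = self.spent_usd / self.limit_usd
        newly_crossed = [t for t in self.alert_thresholds
                         if used >= t and t not in self.fired]
        self.fired.extend(newly_crossed)  # each alert fires only once
        return newly_crossed
```

Each threshold returned by `record` would trigger one admin notification, and `would_exceed` would reject requests once the hard limit is reached.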
📦
Stripe Billing
Built-in Stripe integration for SaaS deployment. Automatic usage-based billing and subscription management.
Multi-Tenant
Enterprise Multi-Tenant Architecture
Full organizational hierarchy with complete data isolation, separate budgets, and granular RBAC.
🏢
Organizations
Top-level billing tenant. Each organization has isolated data, its own plans, and subscription management via Stripe.
👥
Teams
Logical groupings within an organization. Shared budget pools, activity tracking, and team-level analytics.
🔑
Tenants & API Keys
API consumer entities with individual keys, rate limits, daily/monthly budgets, and separate usage tracking.
RBAC Roles
Super Admin
Platform-level access. Manage all organizations, plans, and platform configuration.
Owner
Organization owner. Full control over org settings, billing, teams, and users.
Admin
Manage providers, API keys, DLP settings, and user access within the org.
User
API access for LLM requests. View own usage and analytics. Read-only config access.
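The four roles above can be modeled as an ordered hierarchy with a required role per action. This sketch assumes each role inherits everything below it, and the action names are invented for illustration; the real permission model may be more granular.

```python
from enum import IntEnum


class Role(IntEnum):
    USER = 1
    ADMIN = 2
    OWNER = 3
    SUPER_ADMIN = 4


# Hypothetical action names mapped to the minimum role that may perform them.
REQUIRED_ROLE = {
    "llm:request": Role.USER,
    "usage:view_own": Role.USER,
    "providers:manage": Role.ADMIN,
    "dlp:configure": Role.ADMIN,
    "billing:manage": Role.OWNER,
    "platform:configure": Role.SUPER_ADMIN,
}


def is_allowed(role: Role, action: str) -> bool:
    """Deny unknown actions; otherwise compare against the required role."""
    required = REQUIRED_ROLE.get(action)
    return required is not None and role >= required
```

Organization-level isolation would be enforced separately, for example by scoping every query to the caller's organization ID.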
Unique Capability
Knowledge Extraction with MCP
AgentWatch includes an integrated MCP server for code intelligence and security scanning, a capability not typically found in LLM gateways.
📚
Repository Indexing
Index entire codebases with Tree-sitter for multi-language AST parsing. Build symbol graphs and dependency maps in seconds.
🔍
Code Intelligence
Vector search over indexed code using LanceDB. Semantic code search, function discovery, and dependency analysis.
🛡️
Security Scanning
Integrated Semgrep for SAST scanning and Trivy for infrastructure scanning. Automated vulnerability detection on every index.
⚡
Supported Languages: TypeScript, JavaScript, Python, Go, Java, Swift, Kotlin, and Rust —
with symbol extraction, dependency graphs, and semantic search for each.
Observability
Full Visibility into Every Request
Prometheus metrics, structured logging, and real-time dashboards give you complete observability.
📡
Prometheus Metrics
Export metrics for request counts, latency percentiles (P50/P95/P99), error rates, and provider health.
📝
Structured Logging
JSON-structured logs with request correlation IDs, tenant context, and DLP event classification.
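A minimal sketch of JSON-structured logging with Python's standard `logging` module is shown below. The field names (`request_id`, `tenant_id`, `dlp_event`) and the logger name are assumptions chosen to match the description above, not the gateway's actual log schema.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "message": record.getMessage(),
            # Context fields arrive via the `extra` argument, if provided.
            "request_id": getattr(record, "request_id", None),
            "tenant_id": getattr(record, "tenant_id", None),
            "dlp_event": getattr(record, "dlp_event", None),
        }
        return json.dumps(entry)


logger = logging.getLogger("gateway")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info(
    "request blocked",
    extra={
        "request_id": str(uuid.uuid4()),  # correlation ID for this request
        "tenant_id": "tenant-123",
        "dlp_event": "pii.ssn",
    },
)
```

Emitting the same `request_id` on every log line for a request makes it possible to join gateway, DLP, and provider events in a log aggregator.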
📊
Usage Dashboards
Real-time dashboards showing token usage, costs, active tenants, and request volumes across all providers.
🔎
Audit Trail
Complete log of every AI request: user, timestamp, model, tokens, cost, duration, and DLP events. Exportable for compliance.
Deployment
Deploy Anywhere
Three deployment modes to fit any enterprise environment — from on-premises to fully managed cloud.
Recommended
🏠
Self-Hosted
Full control over your infrastructure and data. Deploy in your own cloud or on-premises.
- Docker Compose single-command deploy
- Kubernetes Helm chart available
- PostgreSQL for production scale
- Your data never leaves your environment
🌩️
Cloud VM
Deploy on any major cloud provider with automated setup scripts.
- AWS, GCP, Azure compatible
- Cloud-init scripts for automated provisioning
- Managed SSL with Let's Encrypt
- Auto-scaling compatible
🔒
Enterprise Proxy
Transparent proxy mode for enterprises with Zscaler or other corporate proxies.
- Zscaler integration built-in
- Custom CA certificate support
- Works behind corporate firewalls
- HTTPS inspection compatible
Ready to See It in Action?
Log in to your dashboard or explore the technical architecture.