Technical Deep Dive

System Architecture

A three-component Node.js monorepo designed for enterprise scale: an Express API, a React UI, and an MCP knowledge server working together as a unified platform.


System Design

Three-Component Platform

AgentWatch is a monorepo containing three independent Node.js applications that work together as a complete LLM management platform.

Clients

🌐 Web Browser: Dashboard UI
💻 API Clients: SDKs / HTTP
🔧 IDE Extensions: VS Code / JetBrains

All traffic flows through AgentWatch

AgentWatch Platform (Port 8787)

⚡ Express API Server: REST + Proxy
🔐 Auth & RBAC: JWT + Roles
🛡️ DLP Engine: 30+ patterns
📊 Metrics Collector: Prometheus

Routes to best available provider

LLM Providers

🟢 OpenAI: GPT-4o / GPT-4
🟠 Anthropic: Claude 3.x
🔵 Google: Gemini Pro
🟣 OpenRouter: 100+ models
⚙️ Custom: Any OpenAI endpoint

Main Server · Port 8787

Express API Server

LLM proxy handler with provider routing
JWT auth & RBAC middleware
DLP scanning engine
Response caching
Prometheus metrics collection
Multi-tenant billing via Stripe

Web UI · Vite / Static

React Admin Dashboard

React 18.3 + TypeScript SPA
Real-time observability dashboards
Provider management UI
Budget & team management
Compliance configuration
Knowledge graph visualization

MCP Server · Port 4000

Knowledge Extractor

Repository ingestion pipeline
Tree-sitter AST parsing
Semgrep security scanning
Trivy infrastructure scanning
LanceDB vector store
Semantic code search API

Technology

Technology Stack

Built on proven, production-ready technologies with a TypeScript-first codebase.

Main Server

Runtime	Node.js 24+
Language	TypeScript
Framework	Express.js
Database	SQLite 3 / PostgreSQL
ORM	Prisma
Auth	JWT + bcrypt
HTTP Client	Axios
Queue	PQueue
Tokens	js-tiktoken
Billing	Stripe SDK

Web UI

Framework	React 18.3
Language	TypeScript
Build Tool	Vite
Styling	TailwindCSS
State	TanStack Query
Routing	React Router
Charts	Recharts
Graphs	Cytoscape, vis-network
Icons	Lucide React
Forms	React Hook Form

MCP Server

Runtime	Node.js
Protocol	MCP SDK
Vector DB	LanceDB
Key-Value	LevelDB
Parser	Tree-sitter
SAST	Semgrep
Infra Scan	Trivy
CLI	Commander.js
Embeddings	OpenAI text-embedding
Graph	Custom symbol graph

Request Lifecycle

How a Request is Processed

Every LLM request goes through a multi-stage processing pipeline with auth, DLP, routing, and usage recording.

📨 1. Request In: POST /v1/chat
🔐 2. Auth Check: Validate API key
📊 3. Quota Check: Budget & rate limits
🛡️ 4. DLP Scan: 30+ data patterns
💾 5. Cache Check: Identical requests
🔀 6. Route: Provider selection
☁️ 7. LLM Call: With retry logic
📝 8. Log & Meter: Async recording
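
The eight stages above can be read as a chain of checks that either stop the request early or hand it to the next stage. The sketch below is a minimal illustration of that ordering, not AgentWatch's actual handler: `handleRequest`, the `Deps` interface, the cache key format, and the HTTP status codes are all assumptions made for the example.

```typescript
// Hypothetical sketch of the eight-stage lifecycle; all names and status codes are illustrative.
type ChatRequest = { apiKey: string; model: string; prompt: string };
type ChatResult = { status: number; body: string; fromCache: boolean };
type UsageEntry = { apiKey: string; model: string; cached: boolean };

interface Deps {
  isValidKey(apiKey: string): boolean;                               // 2. Auth check
  hasQuota(apiKey: string): boolean;                                 // 3. Quota check
  violatesDlp(prompt: string): boolean;                              // 4. DLP scan
  cache: Map<string, string>;                                        // 5. Cache check
  pickProvider(model: string): string;                               // 6. Route
  callProvider(provider: string, req: ChatRequest): Promise<string>; // 7. LLM call
  record(entry: UsageEntry): void;                                   // 8. Log & meter
}

// Schedule usage recording after the current step so it never delays the response.
function recordLater(deps: Deps, entry: UsageEntry): void {
  Promise.resolve()
    .then(() => deps.record(entry))
    .catch(() => undefined); // a failed usage write must not fail the request
}

async function handleRequest(req: ChatRequest, deps: Deps): Promise<ChatResult> {
  // Stages 2-4 reject the request before any provider is contacted.
  if (!deps.isValidKey(req.apiKey)) return { status: 401, body: "invalid API key", fromCache: false };
  if (!deps.hasQuota(req.apiKey)) return { status: 429, body: "quota exceeded", fromCache: false };
  if (deps.violatesDlp(req.prompt)) return { status: 400, body: "blocked by DLP", fromCache: false };

  // Stage 5: identical requests (same model and prompt) are served from the cache.
  const cacheKey = `${req.model}\n${req.prompt}`;
  const cached = deps.cache.get(cacheKey);
  if (cached !== undefined) {
    recordLater(deps, { apiKey: req.apiKey, model: req.model, cached: true });
    return { status: 200, body: cached, fromCache: true };
  }

  // Stages 6-7: choose a provider and make the upstream call.
  const provider = deps.pickProvider(req.model);
  const body = await deps.callProvider(provider, req);
  deps.cache.set(cacheKey, body);

  // Stage 8: successful requests are metered asynchronously.
  recordLater(deps, { apiKey: req.apiKey, model: req.model, cached: false });
  return { status: 200, body, fromCache: false };
}
```

Passing the stages in as `deps` keeps each one replaceable in tests; in this sketch, rejected requests are not metered, which is a simplification rather than a statement about the product.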

Application Services

What Runs at Runtime

Authentication Service: JWT tokens, RBAC, password policies
LLM Proxy Service: Request routing, provider selection, failover
Compliance Service: DLP scanning, guardrails, audit logging
Billing Service: Token counting, budget enforcement, Stripe sync
Knowledge Service: MCP proxy, repository indexing
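
The Compliance Service's DLP scanning is described here only as matching 30+ data patterns. As a rough illustration of pattern-based scanning, the sketch below uses two made-up patterns (an email address and a US Social Security number); the pattern set, the `scan` and `redact` names, and the redaction format are assumptions, not the engine's real rules.

```typescript
// Hypothetical DLP sketch: two illustrative patterns standing in for the real pattern set.
type DlpFinding = { pattern: string; match: string; index: number };

const PATTERNS: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  us_ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
};

// Report every match of every pattern, with its position in the text.
function scan(text: string): DlpFinding[] {
  const findings: DlpFinding[] = [];
  for (const [name, regex] of Object.entries(PATTERNS)) {
    for (const m of text.matchAll(regex)) {
      findings.push({ pattern: name, match: m[0], index: m.index ?? 0 });
    }
  }
  return findings;
}

// Replace each match with a labeled placeholder before the text leaves the platform.
function redact(text: string): string {
  let out = text;
  for (const [name, regex] of Object.entries(PATTERNS)) {
    out = out.replace(regex, `[REDACTED:${name}]`);
  }
  return out;
}
```

A scan result could feed either a block decision or a redaction step; which of the two applies per pattern is a policy choice this sketch leaves open.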

Design Patterns

Key Architectural Decisions

Circuit Breaker: Prevent cascading failures across providers
Retry with Backoff: Handle transient network errors gracefully
Request Queue: Backpressure management under load
Connection Pool: Efficient HTTP keep-alive connections
Async Recording: Non-blocking usage tracking doesn't add latency
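
Two of the patterns above, Circuit Breaker and Retry with Backoff, are easiest to understand together. The following is a minimal sketch under stated assumptions: the thresholds, delays, and the names `CircuitBreaker`, `backoffDelayMs`, and `withRetry` are illustrative, and jitter is omitted to keep the delays deterministic.

```typescript
// Hypothetical sketch: thresholds, delays, and names are assumptions for illustration.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, cooldownMs: number, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // A closed circuit allows calls; an open one allows a trial call once the cooldown has elapsed.
  canRequest(): boolean {
    if (this.openedAt === null) return true;
    return this.now() - this.openedAt >= this.cooldownMs;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  // Open (or re-open) the circuit once the failure count reaches the threshold.
  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.openedAt = this.now();
  }
}

// Exponential backoff: baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 100, maxMs = 5000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

async function withRetry<T>(
  call: () => Promise<T>,
  breaker: CircuitBreaker,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown = new Error("circuit open");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (!breaker.canRequest()) break; // fail fast instead of piling onto a failing provider
    try {
      const result = await call();
      breaker.recordSuccess();
      return result;
    } catch (err) {
      breaker.recordFailure();
      lastError = err;
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

Keeping one breaker per provider would let a failing provider be skipped while the others keep serving traffic; the clock and sleep function are injected so the behavior can be tested without real delays.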

Data Model

Database Schema

SQLite for development, PostgreSQL for production. Full Prisma ORM with migrations.

Core Entities

Table	Purpose	Key Fields
organizations	SaaS billing tenant	name, slug, stripe_customer_id
teams	Org sub-groups	name, org_id, budget
users	Authentication	email, password_hash, role, org_id
tenants	API consumer	api_key, rate_limit, budgets
providers	LLM providers	name, base_url, status
api_keys	Provider keys	encrypted_key, concurrency_limit

Usage & Compliance

Table	Purpose	Key Fields
usage_records	Request log	tokens, cost, model, latency
audit_logs	Compliance trail	action, user_id, ip_address
conversations	Chat history	messages, dlp_events, risk_score
subscription_plans	SaaS tiers	features, limits, stripe_price_id
org_roles	RBAC roles	name, permissions
models	Model catalog	model_id, pricing, context_window

🗄️

Dual-database support: SQLite (via @internal/prisma-sqlite) for development and PostgreSQL (via @prisma/client) for production. The two Prisma schemas are kept separate but share the same model structure, which eases migration from development to production.

Security Architecture

Security by Design

Security isn't a feature — it's the foundation. Every layer of the stack is designed with security in mind.

🔑

Key Encryption

AES-256-GCM encryption for all provider API keys at rest, with a random IV per encryption and GCM authentication tags.
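
For readers unfamiliar with AES-256-GCM, here is a minimal sketch using Node's built-in `crypto` module. The function names and the storage layout (IV, then authentication tag, then ciphertext, base64-encoded) are assumptions for illustration; the actual on-disk format is not documented here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical storage layout: iv (12 bytes) | auth tag (16 bytes) | ciphertext, base64-encoded.
const IV_LENGTH = 12;
const TAG_LENGTH = 16;

// key must be 32 bytes for AES-256.
function encryptKey(plaintext: string, key: Buffer): string {
  const iv = randomBytes(IV_LENGTH); // fresh random IV for every encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

function decryptKey(encoded: string, key: Buffer): string {
  const raw = Buffer.from(encoded, "base64");
  const iv = raw.subarray(0, IV_LENGTH);
  const tag = raw.subarray(IV_LENGTH, IV_LENGTH + TAG_LENGTH);
  const ciphertext = raw.subarray(IV_LENGTH + TAG_LENGTH);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the tag does not match, i.e. the data was modified or the key is wrong.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Because the IV is random, encrypting the same provider key twice yields different stored values, and any modification of the stored bytes causes decryption to fail rather than return corrupted data.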

🎫

JWT Sessions

JWTs valid for 24 hours, carrying the organization context, role claims, and a super-admin flag.
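
To make the session contents concrete, the sketch below issues and verifies an HS256-signed token with a 24-hour lifetime using only Node's `crypto` module. The claim names (`sub`, `orgId`, `role`, `superAdmin`) and the function names are assumptions; the signing algorithm is not stated on this page, and a real deployment would normally rely on a maintained JWT library rather than hand-rolled code.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claim names; only the 24-hour lifetime comes from the description above.
type SessionInput = { sub: string; orgId: string; role: string; superAdmin: boolean };
type SessionClaims = SessionInput & { iat: number; exp: number };

const DAY_SECONDS = 24 * 60 * 60;

function encodeJson(value: unknown): string {
  return Buffer.from(JSON.stringify(value)).toString("base64url");
}

function sign(data: string, secret: string): string {
  return createHmac("sha256", secret).update(data).digest("base64url");
}

function issueToken(input: SessionInput, secret: string, nowSeconds: number): string {
  const header = encodeJson({ alg: "HS256", typ: "JWT" });
  const payload = encodeJson({ ...input, iat: nowSeconds, exp: nowSeconds + DAY_SECONDS });
  return `${header}.${payload}.${sign(`${header}.${payload}`, secret)}`;
}

// Returns the claims if the signature is valid and the token has not expired, else null.
function verifyToken(token: string, secret: string, nowSeconds: number): SessionClaims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts as [string, string, string];
  const expected = Buffer.from(sign(`${header}.${payload}`, secret));
  const actual = Buffer.from(signature);
  // Constant-time comparison; lengths must match before timingSafeEqual is called.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) return null;
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as SessionClaims;
  return claims.exp > nowSeconds ? claims : null;
}
```

Embedding the organization and role in the signed payload is what lets downstream checks trust those values without another database lookup.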

🔒

IDOR Prevention

Organization ID always sourced from JWT, never from request body or params. Super admin override via header only.
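
The rule above can be captured in a single resolver that every handler calls instead of reading the organization from user input. This is a framework-agnostic sketch: the `IncomingRequest` shape, the claim names, and the `x-org-override` header name are hypothetical, since the page does not name the actual header.

```typescript
// Hypothetical types; claim names and the header name are assumptions for illustration.
type JwtClaims = { userId: string; orgId: string; role: string; isSuperAdmin: boolean };
type IncomingRequest = {
  claims: JwtClaims;                            // payload of an already-verified JWT
  headers: Record<string, string | undefined>;  // lower-cased header names
  body: Record<string, unknown>;
};

const ORG_OVERRIDE_HEADER = "x-org-override"; // hypothetical header name

// The effective organization comes from the JWT. Only a super admin may override it,
// and only via the header; anything in the request body is deliberately ignored.
function resolveOrgId(req: IncomingRequest): string {
  const override = req.headers[ORG_OVERRIDE_HEADER];
  if (override !== undefined && override !== "" && req.claims.isSuperAdmin) {
    return override;
  }
  return req.claims.orgId;
}
```

Centralizing the lookup means a handler cannot accidentally trust an `org_id` field supplied by the client, which is the usual source of IDOR bugs in multi-tenant APIs.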

📋

Security Headers

Full CSP, HSTS, X-Frame-Options, and other security headers set on every response.

Operations

Deployment & Scaling

From a single Docker Compose command to full Kubernetes at scale — AgentWatch fits your operational model.

Quick Start

Docker Compose

# Single command deploy
docker-compose up -d

# Services started:
✓ agentwatch:8787
✓ mcp-server:4000
✓ PostgreSQL database

Production

Kubernetes

Helm chart for K8s deployment
Horizontal Pod Autoscaling (HPA)
Liveness & readiness probes
ConfigMap + Secrets management
PersistentVolume for SQLite (when not using PostgreSQL)
Service mesh compatible (Istio)

Configuration

Environment Variables

DATABASE_URL — DB connection string
JWT_SECRET — Token signing key
ENCRYPTION_KEY — Key encryption
STRIPE_SECRET_KEY — Billing
HTTPS_PROXY — Enterprise proxy
PORT — Default 8787
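
A small loader that validates these variables at startup makes misconfiguration fail fast. The sketch below is an assumption-laden example: the `AppConfig` shape and `loadConfig` name are invented for illustration, and treating every variable except `HTTPS_PROXY` and `PORT` as required is a guess, not documented behavior.

```typescript
// Hypothetical config loader; which variables are required is an assumption.
type AppConfig = {
  databaseUrl: string;
  jwtSecret: string;
  encryptionKey: string;
  stripeSecretKey: string;
  httpsProxy: string | undefined; // optional enterprise proxy
  port: number;                   // defaults to 8787
};

function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const required = (name: string): string => {
    const value = env[name];
    if (value === undefined || value === "") {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
  };

  const port = Number(env["PORT"] ?? "8787");
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid PORT: ${env["PORT"]}`);
  }

  return {
    databaseUrl: required("DATABASE_URL"),
    jwtSecret: required("JWT_SECRET"),
    encryptionKey: required("ENCRYPTION_KEY"),
    stripeSecretKey: required("STRIPE_SECRET_KEY"),
    httpsProxy: env["HTTPS_PROXY"],
    port,
  };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)`; taking the environment as a parameter keeps the function easy to test.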