---
status: draft
last_updated: 2026-04-19
---

# Table Schemas: External Services

Client and credential tables for outbound service connections. For cross-cutting reference (cascade behavior, index reference, status enums, relations), see [table-reference.md](./table-reference.md). For design decisions, see [../../../decisions/](../../../decisions/).

### `clients`

External service registrations — "who we connect to." A client is any service the hub calls: LLM providers (Anthropic, OpenAI, OpenRouter), VCS (Gitea), compute (Vast.ai), MCP servers, JMAP, custom REST APIs. The `config` column holds the validated connection shape (URLs, headers, auth mechanism) **without credentials**. Credentials live in `client_secrets`.

| Column | Type | Notes |
|--------|------|-------|
| commonCols | — | id, metadata, createdAt, updatedAt |
| name | text NOT NULL UNIQUE | Identifier (`anthropic`, `gitea`, `openrouter`, `vast-ai`) |
| type | text NOT NULL | Client type: `llm-provider`, `vcs`, `compute`, `mcp-server`, `custom` |
| config | jsonb NOT NULL | Connection config for this client. Validated on write, in the API handler layer, against the TypeBox schema for the client `type`. Reads are never blocked by validation: a validation pass at startup logs a warning for each row that no longer matches the current schema. |
| enabled | boolean NOT NULL DEFAULT true | Disable without deleting |
| ownerId | text NOT NULL | FK → accounts.id — who configured this client |
| orgId | text | FK → organizations.id (nullable — some clients are personal, not org-scoped) |

**Config boundaries**: `config` holds connection settings only (URLs, headers, auth mechanism) and is validated against the TypeBox schema for the client `type`. Secrets are NEVER stored in `config`; they go in `client_secrets`.

**Indexes**: `unq_clients_name` UNIQUE on `(name)`, `idx_clients_type` on `(type)`, `idx_clients_owner_id` on `(ownerId)`, `idx_clients_org_id` on `(orgId)`.

**Config schema registry** (in code, not DB): Each client `type` maps to a TypeBox schema that validates `config` on write:

```ts
const clientConfigSchemas: Record<string, TSchema> = {
  "llm-provider": LLMProviderConfig,    // baseUrl, defaultModel, models[], auth mechanism
  "vcs": VCSClientConfig,               // baseUrl, specUrl, namespace, auth mechanism
  "compute": ComputeConfig,              // endpoint, region, auth mechanism
  "mcp-server": MCPServerConfig,         // command/url + args/headers (from hub config types)
  "custom": HTTPServiceConfig,           // baseUrl, headers, auth (from @alkdev/operations/from-openapi)
};
```

**Schema evolution contract**: New fields in client config schemas MUST be `Type.Optional()`. Breaking changes MUST use a new client `type` (e.g., `llm-provider-v2`). This ensures existing DB rows remain valid across deployments. Consider adding `configSchemaVersion` to `metadata` in a future phase if breaking changes become common. For now, optional fields handle forward compatibility.

**Validation chain**: API handler validates → Drizzle insert → DB stores. Direct SQL bypasses application validation — this is a known risk documented in README.md.
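
The two validation points (reject on write, warn at startup) can be sketched as follows. This is a minimal illustration, not the hub's actual code: `ConfigValidator`, `validators`, `assertValidForWrite`, and `collectStartupWarnings` are hypothetical names, and a hand-written check stands in for the TypeBox schemas so the example stays self-contained.

```ts
// Stand-in for a TypeBox check: returns a list of problems, empty when valid.
// (Hypothetical shape; the real code presumably checks against TypeBox schemas.)
type ConfigValidator = (config: unknown) => string[];

interface ClientRow {
  name: string;
  type: string;
  config: unknown;
}

// Hypothetical registry keyed by client `type`, mirroring clientConfigSchemas.
const validators: Record<string, ConfigValidator> = {
  vcs: (config) => {
    const c = config as { baseUrl?: unknown } | null;
    return typeof c?.baseUrl === "string" ? [] : ["baseUrl must be a string"];
  },
};

function problemsFor(row: ClientRow): string[] {
  const validate = validators[row.type];
  return validate ? validate(row.config) : [`unknown client type "${row.type}"`];
}

// Write path: reject invalid config before anything reaches the database.
function assertValidForWrite(row: ClientRow): void {
  const problems = problemsFor(row);
  if (problems.length > 0) {
    throw new Error(`invalid config for client "${row.name}": ${problems.join("; ")}`);
  }
}

// Startup pass: report stale rows as warnings and never block reads.
function collectStartupWarnings(rows: ClientRow[]): string[] {
  return rows.flatMap((row) =>
    problemsFor(row).map((problem) => `client "${row.name}": ${problem}`),
  );
}
```

Keeping one shared `problemsFor` helper means the write path and the startup pass cannot drift apart in what they consider valid; they differ only in how they react.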

**Wiring config to secrets**: The config contains `secretKey` (or `envSecretKeys`) fields whose values name secrets in `client_secrets`, matched on `client_secrets.key` for the same client. The config knows HOW to authenticate; the secrets table holds WHAT to authenticate with.

Example config for a Gitea client:
```json
{
  "baseUrl": "https://git.alk.dev/api/v1",
  "specUrl": "https://git.alk.dev/swagger.v1.json",
  "namespace": "gitea",
  "auth": { "type": "apiKey", "headerName": "Authorization", "prefix": "token ", "secretKey": "api_password" }
}
```

Example config for an MCP server:
```json
{
  "command": "/usr/local/bin/mcp-server",
  "args": ["--port", "3000"],
  "envSecretKeys": { "OPENAI_API_KEY": "openai_key" }
}
```

**Runtime resolution**: On startup, load client → validate config → resolve secrets from `client_secrets` by `secretKey` wiring → merge config + decrypted secrets → create connection (MCP client, OpenAPI operations, etc.).
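
A minimal sketch of the "resolve secrets, then merge" step, assuming the secrets for one client have already been decrypted into a map keyed by `client_secrets.key`. The type and function names (`ApiKeyAuth`, `resolveAuthHeaders`, `resolveEnv`) are hypothetical, and treating `prefix` as optional is an assumption; the shapes follow the two example configs above.

```ts
interface ApiKeyAuth {
  type: "apiKey";
  headerName: string;
  prefix?: string;   // assumed optional in this sketch
  secretKey: string; // names a client_secrets.key for this client
}

interface HttpClientConfig {
  baseUrl: string;
  auth: ApiKeyAuth;
}

// Decrypted secrets for one client, keyed by client_secrets.key.
type DecryptedSecrets = Map<string, string>;

function requireSecret(secrets: DecryptedSecrets, secretKey: string): string {
  const secret = secrets.get(secretKey);
  if (secret === undefined) {
    throw new Error(`missing secret "${secretKey}"`);
  }
  return secret;
}

// Build the auth header for an apiKey-style config (e.g. the Gitea example).
function resolveAuthHeaders(
  config: HttpClientConfig,
  secrets: DecryptedSecrets,
): Record<string, string> {
  const secret = requireSecret(secrets, config.auth.secretKey);
  return { [config.auth.headerName]: `${config.auth.prefix ?? ""}${secret}` };
}

// Build the child-process environment for an MCP server config:
// envSecretKeys maps an environment variable name to a secret key name.
function resolveEnv(
  envSecretKeys: Record<string, string>,
  secrets: DecryptedSecrets,
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const [envName, secretKey] of Object.entries(envSecretKeys)) {
    env[envName] = requireSecret(secrets, secretKey);
  }
  return env;
}
```

Failing loudly on a missing secret at startup surfaces broken wiring before the first outbound call, rather than as an authentication error from the remote service.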

### `client_secrets`

Encrypted credential store — "how we authenticate to them." Each secret is an encrypted value (API key, password, OAuth token, SSH key) associated with a client. Stored as AES-256-GCM encrypted data via `src/crypto.ts`.

| Column | Type | Notes |
|--------|------|-------|
| commonCols | — | id, metadata, createdAt, updatedAt |
| clientId | text NOT NULL | FK → clients.id (cascade) |
| key | text NOT NULL | Secret key name: `api_key`, `api_password`, `oauth_credentials`, `ssh_key`, etc. |
| value | jsonb NOT NULL | Encrypted payload — `EncryptedData { keyVersion, salt, iv, data }` from crypto.ts |
| keyVersion | integer NOT NULL DEFAULT 1 | Encryption key version for rotation |
| expiresAt | timestamp with tz | When the secret expires (e.g., OAuth token TTL). Null = no expiry. |
| lastUsedAt | timestamp with tz | When the secret was last used to authenticate |

**Unique constraint**: `(clientId, key)`, so each client has at most one secret per key name.

**Indexes**: `unq_client_secrets_client_key` UNIQUE on `(clientId, key)`, `idx_client_secrets_expires_at` on `(expiresAt)`.

**Encrypted data structure** (`EncryptedData` from crypto.ts):
```ts
interface EncryptedData {
  keyVersion: number;   // matches client_secrets.keyVersion
  salt: string;         // base64, 16 bytes (PBKDF2)
  iv: string;           // base64, 12 bytes (AES-GCM)
  data: string;         // base64, AES-256-GCM ciphertext
}
```
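
One way to produce and consume this structure with Node's built-in `node:crypto` module is sketched below. It is not the contents of `src/crypto.ts`: the function names, the PBKDF2 parameters (SHA-256, 100000 iterations), and the choice to append the 16-byte GCM auth tag to the ciphertext inside `data` are all assumptions made for illustration.

```ts
import { createCipheriv, createDecipheriv, pbkdf2Sync, randomBytes } from "node:crypto";

interface EncryptedData {
  keyVersion: number;
  salt: string; // base64, 16 bytes
  iv: string;   // base64, 12 bytes
  data: string; // base64, ciphertext followed by the auth tag (assumption)
}

// Assumed parameters; the real values live in src/crypto.ts.
const PBKDF2_ITERATIONS = 100000;
const TAG_BYTES = 16;

// Derive a 32-byte AES key from the data encryption key and a per-secret salt.
function deriveKey(dataEncryptionKey: string, salt: Buffer): Buffer {
  return pbkdf2Sync(dataEncryptionKey, salt, PBKDF2_ITERATIONS, 32, "sha256");
}

function encryptSecret(secret: string, dataEncryptionKey: string, keyVersion: number): EncryptedData {
  const salt = randomBytes(16);
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveKey(dataEncryptionKey, salt), iv);
  const ciphertext = Buffer.concat([cipher.update(secret, "utf8"), cipher.final()]);
  // The auth tag is only available after final().
  const data = Buffer.concat([ciphertext, cipher.getAuthTag()]);
  return {
    keyVersion,
    salt: salt.toString("base64"),
    iv: iv.toString("base64"),
    data: data.toString("base64"),
  };
}

function decryptSecret(encrypted: EncryptedData, dataEncryptionKey: string): string {
  const salt = Buffer.from(encrypted.salt, "base64");
  const iv = Buffer.from(encrypted.iv, "base64");
  const data = Buffer.from(encrypted.data, "base64");
  const ciphertext = data.subarray(0, data.length - TAG_BYTES);
  const tag = data.subarray(data.length - TAG_BYTES);
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(dataEncryptionKey, salt), iv);
  decipher.setAuthTag(tag);
  // final() throws if the key is wrong or the payload was tampered with.
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Because GCM is authenticated, a wrong key or a modified payload fails at `final()` instead of yielding garbage plaintext.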

**Encryption flow**:
1. Raw secret (API key, password) → `crypto.encrypt(secret, dataEncryptionKey)` → `EncryptedData`
2. Store as JSONB in `value`
3. On use: `crypto.decrypt(value, dataEncryptionKey)` → raw secret
4. Data encryption keys come from hub config (see [hub-config.md](../../hub-config.md) for the two-layer key model):
   - Format: a comma-separated list of `version:base64key` pairs (e.g., `v1:YmFzZTY0a2V5, v2:Zm9yYmFyYmF6`).
   - Storage: the config file's `encryptionKeys` field, which is itself encrypted with the Docker-secret-provisioned master key.
   - Generation: once per version, via `crypto.generateEncryptionKey()`.
   - Usage: the first key in the list is the "current" key, used for new encryptions. All keys in the list are available for decryption, which is what allows key rotation.
   - **No env vars for secrets**: see ADR-008 (revised).
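
The key list format described in step 4 could be parsed along these lines. `KeyRing` and `parseKeyRing` are hypothetical names, and the version label is kept as an opaque string here because how labels such as `v1` map onto the integer `keyVersion` column is not specified in this document.

```ts
interface KeyRing {
  currentVersion: string;    // label of the first entry, used for new encryptions
  keys: Map<string, string>; // version label -> base64 key, all usable for decryption
}

function parseKeyRing(encryptionKeys: string): KeyRing {
  const keys = new Map<string, string>();
  let currentVersion: string | undefined;

  const entries = encryptionKeys.split(",");
  for (let i = 0; i < entries.length; i++) {
    const entry = entries[i].trim();
    if (entry === "") continue; // tolerate a trailing comma

    // Base64 never contains ":", so the first colon separates label from key.
    const sep = entry.indexOf(":");
    if (sep <= 0 || sep === entry.length - 1) {
      // Report the position only; never echo key material into logs.
      throw new Error(`malformed key entry at position ${i}`);
    }
    const version = entry.slice(0, sep);
    if (keys.has(version)) {
      throw new Error(`duplicate key version "${version}"`);
    }
    keys.set(version, entry.slice(sep + 1));
    if (currentVersion === undefined) {
      currentVersion = version; // first key in the list is the current key
    }
  }

  if (currentVersion === undefined) {
    throw new Error("encryptionKeys is empty");
  }
  return { currentVersion, keys };
}
```

For the example value above, `parseKeyRing("v1:YmFzZTY0a2V5, v2:Zm9yYmFyYmF6")` yields a ring with two keys whose `currentVersion` is `"v1"`, since `v1` comes first in the list.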

**Secret format convention**: Most secrets are plain strings (API keys, passwords). Complex secrets (OAuth tokens) are JSON objects `JSON.stringify()`'d before encryption. The `key` name indicates the format: `api_key` = string, `oauth_credentials` = JSON.
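
As an illustration of this convention, the helpers below serialize a secret before encryption and restore it after decryption based on the `key` name. The names `JSON_SECRET_KEYS`, `serializeSecret`, and `deserializeSecret` are hypothetical, and the set contains only the one JSON-formatted key named in this document.

```ts
// Key names whose values are JSON objects rather than plain strings.
// Only `oauth_credentials` is named in this document; others would be added here.
const JSON_SECRET_KEYS = new Set<string>(["oauth_credentials"]);

// Produce the plaintext string that gets passed to encryption.
function serializeSecret(key: string, value: unknown): string {
  if (JSON_SECRET_KEYS.has(key)) {
    return JSON.stringify(value);
  }
  if (typeof value !== "string") {
    throw new Error(`secret "${key}" must be a plain string`);
  }
  return value;
}

// Interpret the decrypted plaintext according to the key name.
function deserializeSecret(key: string, plaintext: string): unknown {
  return JSON_SECRET_KEYS.has(key) ? JSON.parse(plaintext) : plaintext;
}
```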

**Key rotation protocol**:
- **On read**: Decrypt with the key version indicated by `client_secrets.keyVersion`. All key versions in the data encryption key ring (from hub config, see [hub-config.md](../../hub-config.md)) are available for decryption.
- **On write (new secret)**: Encrypt with the current key version (the first key in the encryption keys list from hub config).
- **Re-encryption**: Decrypt with the old key version → encrypt with the current key → UPDATE `value` and `keyVersion` together in a single DB transaction. If the process crashes between decrypt and UPDATE, the secret remains readable: the row still references the old `keyVersion`, and the old key stays in the key ring until rotation is complete.
- **Background sweep**: A background job SHOULD periodically re-encrypt secrets using old key versions. Until re-encryption completes, secrets encrypted with old keys remain vulnerable if the old key is compromised. Key rotation for data encryption keys is independent of master key rotation — see [hub-config.md](../../hub-config.md) for the two-layer key model.
- **Error handling**: If a key version referenced by `client_secrets.keyVersion` is not found in the data encryption key ring, log an error and skip re-encryption. Alert the operator — this indicates a missing key that could cause data loss.
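
The protocol above can be sketched as a sweep over in-memory rows. This is an illustration under stated assumptions, not the hub's job implementation: `SecretRow`, `SweepDeps`, and `reencryptSweep` are hypothetical names, the cipher functions are injected so the example stays self-contained, and the returned rows stand in for what would be a transactional UPDATE per row.

```ts
interface SecretRow {
  id: string;
  keyVersion: number;
  value: string; // stands in for the EncryptedData payload
}

interface SweepDeps {
  currentVersion: number;
  keyRing: Map<number, string>; // key version -> key material
  decrypt: (value: string, key: string) => string;
  encrypt: (plaintext: string, key: string) => string;
  logError: (message: string) => void;
}

function reencryptSweep(rows: SecretRow[], deps: SweepDeps): SecretRow[] {
  const currentKey = deps.keyRing.get(deps.currentVersion);
  if (currentKey === undefined) {
    throw new Error("current key version is missing from the key ring");
  }

  return rows.map((row) => {
    if (row.keyVersion === deps.currentVersion) {
      return row; // already encrypted with the current key
    }
    const oldKey = deps.keyRing.get(row.keyVersion);
    if (oldKey === undefined) {
      // Missing key: log, skip, and leave the row untouched for the operator.
      deps.logError(`secret ${row.id}: key version ${row.keyVersion} not found in key ring`);
      return row;
    }
    const plaintext = deps.decrypt(row.value, oldKey);
    // value and keyVersion change together, mirroring the single-transaction UPDATE.
    return {
      ...row,
      value: deps.encrypt(plaintext, currentKey),
      keyVersion: deps.currentVersion,
    };
  });
}
```

Returning untouched rows for the skip case keeps a missing key from turning into a destructive write; the only side effect is the logged error.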