---
status: draft
last_updated: 2026-04-19
---

# Table Schemas: External Services

Client and credential tables for outbound service connections. For cross-cutting reference (cascade behavior, index reference, status enums, relations), see [table-reference.md](./table-reference.md). For design decisions, see [../../../decisions/](../../../decisions/).

### `clients`

External service registrations ("who we connect to"). A client is any service the hub calls: LLM providers (Anthropic, OpenAI, OpenRouter), VCS (Gitea), compute (Vast.ai), MCP servers, JMAP, custom REST APIs. The `config` column holds the validated connection shape (URLs, headers, auth mechanism) **without credentials**. Credentials live in `client_secrets`.

| Column | Type | Notes |
|--------|------|-------|
| commonCols | - | id, metadata, createdAt, updatedAt |
| name | text NOT NULL UNIQUE | Identifier (`anthropic`, `gitea`, `openrouter`, `vast-ai`) |
| type | text NOT NULL | Client type: `llm-provider`, `vcs`, `compute`, `mcp-server`, `custom` |
| config | jsonb NOT NULL | Validated config instance. **Validation timing**: validated on write (API handler layer) against the TypeBox schema for the client `type`. Reads are never blocked; a startup validation pass logs warnings for rows that don't match the current schema. |
| enabled | boolean NOT NULL DEFAULT true | Disable without deleting |
| ownerId | text NOT NULL | FK → accounts.id (who configured this client) |
| orgId | text | FK → organizations.id (nullable: some clients are personal, not org-scoped) |

**Config boundaries**: Connection configuration (URLs, headers, auth mechanism) goes in `config` and is validated against the TypeBox schema for the client `type`. Secrets are NEVER in `config`; they go in `client_secrets`.

**Indexes**: `unq_clients_name` UNIQUE on `(name)`, `idx_clients_type` on `(type)`, `idx_clients_owner_id` on `(ownerId)`, `idx_clients_org_id` on `(orgId)`.

**Config schema registry** (in code, not DB): Each client `type` maps to a TypeBox schema that validates `config` on write:
```ts
const clientConfigSchemas: Record<string, TSchema> = {
  "llm-provider": LLMProviderConfig, // baseUrl, defaultModel, models[], auth mechanism
  "vcs": VCSClientConfig, // baseUrl, specUrl, namespace, auth mechanism
  "compute": ComputeConfig, // endpoint, region, auth mechanism
  "mcp-server": MCPServerConfig, // command/url + args/headers (from hub config types)
  "custom": HTTPServiceConfig, // baseUrl, headers, auth (from @alkdev/operations/from-openapi)
};
```
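
As a rough illustration of the validation timing described for the `config` column (reject on write, warn at startup without blocking reads), the sketch below uses hand-written check functions in place of TypeBox schemas so that it runs without dependencies. `ConfigCheck`, `ClientRow`, `validateOnWrite`, and `startupValidationPass` are hypothetical names chosen for this example, not the hub's actual API. The `namespace` check shows how an optional field leaves older rows valid.

```typescript
// Hypothetical stand-in for a TypeBox schema check: returns a list of problems.
type ConfigCheck = (config: Record<string, unknown>) => string[];

interface ClientRow {
  name: string;
  type: string;
  config: Record<string, unknown>;
}

// Illustrative checks only; the real registry maps each type to a TypeBox schema.
const configChecks: Record<string, ConfigCheck> = {
  "vcs": (c) => {
    const problems: string[] = [];
    if (typeof c.baseUrl !== "string") problems.push("baseUrl must be a string");
    // Optional field: absent is fine, a wrong type is not.
    if (c.namespace !== undefined && typeof c.namespace !== "string") {
      problems.push("namespace must be a string when present");
    }
    return problems;
  },
};

function checkRow(row: ClientRow): string[] {
  const check = configChecks[row.type];
  if (!check) return [`unknown client type "${row.type}"`];
  return check(row.config);
}

// Write path: throw before anything reaches the database.
function validateOnWrite(row: ClientRow): ClientRow {
  const problems = checkRow(row);
  if (problems.length > 0) {
    throw new Error(`invalid config for ${row.name}: ${problems.join("; ")}`);
  }
  return row;
}

// Startup pass: collect warnings for the caller to log; never throws.
function startupValidationPass(rows: ClientRow[]): string[] {
  const warnings: string[] = [];
  for (const row of rows) {
    for (const problem of checkRow(row)) {
      warnings.push(`client ${row.name}: ${problem}`);
    }
  }
  return warnings;
}
```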

**Schema evolution contract**: New fields in client config schemas MUST be `Type.Optional()`. Breaking changes MUST use a new client `type` (e.g., `llm-provider-v2`). This ensures existing DB rows remain valid across deployments. Consider adding `configSchemaVersion` to `metadata` in a future phase if breaking changes become common. For now, optional fields handle forward compatibility.

**Validation chain**: API handler validates → Drizzle insert → DB stores. Direct SQL bypasses application validation; this is a known risk documented in README.md.

**Wiring config to secrets**: The config contains `secretKey` (or `envSecretKeys`) fields that point to named secrets in `client_secrets`. The config knows HOW to auth; the secrets table holds WHAT to auth with.

Example config for a Gitea client:
```json
{
  "baseUrl": "https://git.alk.dev/api/v1",
  "specUrl": "https://git.alk.dev/swagger.v1.json",
  "namespace": "gitea",
  "auth": { "type": "apiKey", "headerName": "Authorization", "prefix": "token ", "secretKey": "api_password" }
}
```
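
To make the wiring concrete, here is a minimal sketch of how the `auth` block of the Gitea config above might be turned into a request header once the referenced secret has been decrypted. `ApiKeyAuth` and `buildAuthHeaders` are hypothetical names, and the `Map` stands in for whatever lookup the hub performs against `client_secrets`. Treating `prefix` as optional is also an assumption.

```typescript
// Shape of the "auth" block in the example config (prefix assumed optional).
interface ApiKeyAuth {
  type: "apiKey";
  headerName: string;
  prefix?: string;
  secretKey: string; // name of a secret in client_secrets, not the secret itself
}

// Hypothetical helper: secrets maps secret name -> already-decrypted value.
function buildAuthHeaders(
  auth: ApiKeyAuth,
  secrets: Map<string, string>,
): Record<string, string> {
  const secret = secrets.get(auth.secretKey);
  if (secret === undefined) {
    // Name the missing secret, never a secret value.
    throw new Error(`secret "${auth.secretKey}" not found for this client`);
  }
  const prefix = auth.prefix === undefined ? "" : auth.prefix;
  return { [auth.headerName]: prefix + secret };
}
```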

Example config for an MCP server:
```json
{
  "command": "/usr/local/bin/mcp-server",
  "args": ["--port", "3000"],
  "envSecretKeys": { "OPENAI_API_KEY": "openai_key" }
}
```
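
The document does not spell out how `envSecretKeys` is interpreted. Assuming each entry maps an environment variable name to a secret name in `client_secrets`, the environment for the MCP server process could be assembled roughly as follows. `buildChildEnv` is a hypothetical helper, and the `Map` again stands in for already-decrypted secrets.

```typescript
// Assumption: envSecretKeys maps environment variable name -> secret name.
type EnvSecretKeys = Record<string, string>;

function buildChildEnv(
  envSecretKeys: EnvSecretKeys,
  decryptedSecrets: Map<string, string>,
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const envName of Object.keys(envSecretKeys)) {
    const secretName = envSecretKeys[envName];
    const secret = decryptedSecrets.get(secretName);
    if (secret === undefined) {
      // Report names only, so no secret value can leak into logs.
      throw new Error(`secret "${secretName}" needed for ${envName} is missing`);
    }
    env[envName] = secret;
  }
  return env;
}
```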

**Runtime resolution**: On startup, load client → validate config → resolve secrets from `client_secrets` by `secretKey` wiring → merge config + decrypted secrets → create connection (MCP client, OpenAPI operations, etc.).

### `client_secrets`

Encrypted credential store ("how we authenticate to them"). Each secret is an encrypted value (API key, password, OAuth token, SSH key) associated with a client. Values are stored as AES-256-GCM encrypted data via `src/crypto.ts`.

| Column | Type | Notes |
|--------|------|-------|
| commonCols | - | id, metadata, createdAt, updatedAt |
| clientId | text NOT NULL | FK → clients.id (cascade) |
| key | text NOT NULL | Secret key name: `api_key`, `api_password`, `oauth_credentials`, `ssh_key`, etc. |
| value | jsonb NOT NULL | Encrypted payload: `EncryptedData { keyVersion, salt, iv, data }` from crypto.ts |
| keyVersion | integer NOT NULL DEFAULT 1 | Encryption key version for rotation |
| expiresAt | timestamp with tz | When the secret expires (e.g., OAuth token TTL). Null = no expiry. |
| lastUsedAt | timestamp with tz | When the secret was last used to authenticate |

**Unique constraint**: `(clientId, key)`, i.e. one named secret per client.

**Indexes**: `unq_client_secrets_client_key` UNIQUE on `(clientId, key)`, `idx_client_secrets_expires_at` on `(expiresAt)`.

**Encrypted data structure** (`EncryptedData` from crypto.ts):
```ts
interface EncryptedData {
  keyVersion: number; // matches client_secrets.keyVersion
  salt: string; // base64, 16 bytes (PBKDF2)
  iv: string; // base64, 12 bytes (AES-GCM)
  data: string; // base64, AES-256-GCM ciphertext
}
```
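
Because `value` is stored as plain JSONB, a row read back from the database is untyped until it is checked. The following is a minimal sketch of such a shape check, assuming standard padded base64 and the byte lengths given in the comments above. `base64ByteLength` and `isEncryptedData` are hypothetical helpers, not part of crypto.ts.

```typescript
interface EncryptedData {
  keyVersion: number;
  salt: string;
  iv: string;
  data: string;
}

// Decoded byte length of a padded base64 string, or -1 if it is not valid base64.
function base64ByteLength(s: string): number {
  if (s.length % 4 !== 0 || !/^[A-Za-z0-9+\/]*={0,2}$/.test(s)) return -1;
  let padding = 0;
  if (s.slice(-2) === "==") padding = 2;
  else if (s.slice(-1) === "=") padding = 1;
  return (s.length / 4) * 3 - padding;
}

// Type guard: field types plus the documented salt (16 bytes) and IV (12 bytes) sizes.
function isEncryptedData(x: unknown): x is EncryptedData {
  if (typeof x !== "object" || x === null) return false;
  const v = x as Record<string, unknown>;
  return (
    typeof v.keyVersion === "number" && Number.isInteger(v.keyVersion) &&
    typeof v.salt === "string" && base64ByteLength(v.salt) === 16 &&
    typeof v.iv === "string" && base64ByteLength(v.iv) === 12 &&
    typeof v.data === "string" && base64ByteLength(v.data) > 0
  );
}
```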

**Encryption flow**:
1. Raw secret (API key, password) → `crypto.encrypt(secret, dataEncryptionKey)` → `EncryptedData`
2. Store as JSONB in `value`
3. On use: `crypto.decrypt(value, dataEncryptionKey)` → raw secret
4. Data encryption keys come from hub config (see [hub-config.md](../../hub-config.md) for the two-layer key model) as a comma-separated list of `version:base64key` pairs (e.g., `v1:YmFzZTY0a2V5, v2:Zm9yYmFyYmF6`). The list is stored in the config file's `encryptionKeys` field, which is encrypted with the Docker-secret-provisioned master key. Each key is generated once per version via `crypto.generateEncryptionKey()`. The first key in the list is the "current" key used for new encryptions. All keys in the list remain available for decryption, which is what allows key rotation. **No env vars for secrets**: see ADR-008 (revised).

**Secret format convention**: Most secrets are plain strings (API keys, passwords). Complex secrets (OAuth tokens) are JSON objects serialized with `JSON.stringify()` before encryption. The `key` name indicates the format: `api_key` = string, `oauth_credentials` = JSON.

**Key rotation protocol**:
- **On read**: Decrypt with the key version indicated by `client_secrets.keyVersion`. All key versions in the data encryption key ring (from hub config, see [hub-config.md](../../hub-config.md)) are available for decryption.
- **On write (new secret)**: Encrypt with the current key version (the first key in the encryption keys list from hub config).
- **Re-encryption**: Decrypt with the old key version → encrypt with the current key → UPDATE `value` and `keyVersion` together in a single DB transaction. If the process crashes between decrypt and UPDATE, the old version remains accessible (the row still references the old `keyVersion`, and the old key stays in the key ring until rotation is complete).
- **Background sweep**: A background job SHOULD periodically re-encrypt secrets that still use old key versions. Until re-encryption completes, those secrets remain vulnerable if the old key is compromised. Key rotation for data encryption keys is independent of master key rotation; see [hub-config.md](../../hub-config.md) for the two-layer key model.
- **Error handling**: If a key version referenced by `client_secrets.keyVersion` is not found in the data encryption key ring, log an error and skip re-encryption. Alert the operator: this indicates a missing key that could cause data loss.
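
The key list format and the rotation rules can be sketched together. This is an illustration under stated assumptions, not the hub's implementation: it assumes a label such as `v2` corresponds to the integer `keyVersion` 2 (the document does not specify the mapping), and `parseKeyRing`, `planReencryption`, and the `Cipher` interface are hypothetical. In particular, the three-argument `encrypt` is a stand-in rather than the signature in crypto.ts.

```typescript
interface EncryptedData { keyVersion: number; salt: string; iv: string; data: string }
interface SecretRow { id: string; keyVersion: number; value: EncryptedData }

interface KeyRing {
  current: number; // first entry in the list: used for new encryptions
  keys: Map<number, string>; // version -> base64 key, all usable for decryption
}

// Assumption: a label such as "v2" maps to the integer keyVersion 2.
function parseKeyRing(encryptionKeys: string): KeyRing {
  const keys = new Map<number, string>();
  const order: number[] = [];
  const entries = encryptionKeys.split(",");
  for (let i = 0; i < entries.length; i++) {
    const match = /^v(\d+):([A-Za-z0-9+\/]+={0,2})$/.exec(entries[i].trim());
    // Report the position only, so key material never lands in an error message.
    if (!match) throw new Error(`malformed key entry at position ${i}`);
    const version = Number(match[1]);
    if (keys.has(version)) throw new Error(`duplicate key version ${version}`);
    keys.set(version, match[2]);
    order.push(version);
  }
  return { current: order[0], keys };
}

// Hypothetical stand-ins for the encrypt/decrypt functions in crypto.ts.
interface Cipher {
  decrypt(value: EncryptedData, key: string): string;
  encrypt(plaintext: string, key: string, keyVersion: number): EncryptedData;
}

interface SweepPlan { updated: SecretRow[]; missingKeyRowIds: string[] }

// Computes new rows for secrets on old key versions; performs no I/O itself.
function planReencryption(rows: SecretRow[], ring: KeyRing, cipher: Cipher): SweepPlan {
  const currentKey = ring.keys.get(ring.current);
  if (currentKey === undefined) throw new Error("current key is not in the key ring");
  const plan: SweepPlan = { updated: [], missingKeyRowIds: [] };
  for (const row of rows) {
    if (row.keyVersion === ring.current) continue; // already on the current key
    const oldKey = ring.keys.get(row.keyVersion);
    if (oldKey === undefined) {
      plan.missingKeyRowIds.push(row.id); // skipped: caller logs and alerts
      continue;
    }
    const plaintext = cipher.decrypt(row.value, oldKey);
    const value = cipher.encrypt(plaintext, currentKey, ring.current);
    plan.updated.push({ id: row.id, keyVersion: ring.current, value });
  }
  return plan;
}
```

A caller would apply each entry of `updated` as one UPDATE that sets both `value` and `keyVersion`, and would log and alert for every id in `missingKeyRowIds`.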