---
status: draft
last_updated: 2026-05-31
---
# Encrypted Data
Design for storing encrypted data at rest within the metagraph model. The design
uses AES-256-GCM with PBKDF2 key derivation and provides a reusable node type, a
TypeBox schema, and a crypto utility for any consumer that needs to store secrets.
## Overview
Sensitive data — API keys, passwords, OAuth tokens, SSH keys — must be encrypted
at rest. In `@alkdev/storage`, the encryption pattern becomes a reusable utility
and an encrypted node type, so any graph can store secrets without special table
definitions.
**Key principle**: The storage package provides the **encryption primitives and
the schema shape**, not key management. Consumers provide the encryption key.
This keeps the package agnostic to deployment-specific secret management.
**Provenance**: The encryption pattern (AES-256-GCM + PBKDF2) was originally
implemented in the hub's `client_secrets` table and `src/crypto/mod.ts`.
`@alkdev/storage` extracts this pattern as a general-purpose utility, independent
of the hub's domain model.
## The Problem
The hub has `client_secrets` as a standalone table with columns like:
| Column | Purpose |
| ------------ | -------------------------------------------------- |
| `clientId` | FK to the client this secret belongs to |
| `key` | Secret name (e.g., "api_key", "oauth_credentials") |
| `value` | The encrypted payload (EncryptedData JSON) |
| `keyVersion` | Which encryption key version was used |
| `expiresAt` | When the secret expires |
| `lastUsedAt` | Audit trail |
This is a domain-specific table. The encryption logic itself is generic —
AES-256-GCM with PBKDF2 key derivation and key versioning. When we want
encrypted secrets in a spoke (local SQLite) or in a different domain model, we
shouldn't have to duplicate the table definition or the crypto code.
## Design: Encrypted Data as a Node Type
Instead of a dedicated `client_secrets` table, encrypted data becomes a **node
type** in a graph:
```ts
import { EncryptedDataSchema, Metagraph } from "@alkdev/storage";
import { Type } from "@alkdev/typebox";

const SecretGraph = Type.Module({
  Config: Type.Object({
    type: Type.Literal("undirected"),
    multi: Type.Literal(false),
    allowSelfLoops: Type.Literal(false),
  }),
  // A secret: plaintext name plus the encrypted envelope
  SecretNode: Type.Composite([
    Metagraph.Import("BaseNode"),
    Type.Object({
      key: Type.String({ minLength: 1, maxLength: 255 }),
      encryptedData: EncryptedDataSchema,
      expiresAt: Type.Optional(Type.String({ format: "date-time" })),
    }),
  ]),
  // The owner of one or more secrets
  ClientNode: Type.Composite([
    Metagraph.Import("BaseNode"),
    Type.Object({
      name: Type.String(),
      type: Type.String(),
      config: Type.Record(Type.String(), Type.Unknown()),
      enabled: Type.Boolean({ default: true }),
    }),
  ]),
  // Links a client to one of its secrets
  HasSecretEdge: Type.Composite([
    Metagraph.Import("BaseEdge"),
    Type.Object({
      type: Type.Literal("has_secret"),
      secretKey: Type.String(),
    }),
  ]),
  HasSecretEdgeConstraints: Type.Object({
    edgeType: Type.Literal("has_secret"),
    allowedSourceTypes: Type.Array(Type.String()), // ["Client"]
    allowedTargetTypes: Type.Array(Type.String()), // ["Secret"]
  }),
});
```
The `has_secret` edge represents the same relationship as
`client_secrets.clientId`, but as a graph edge rather than a foreign key.
### Why This Works
1. **No special tables needed** — The existing `graph_types`, `node_types`,
`edge_types`, `graphs`, `nodes`, `edges` tables store everything.
2. **Schema validation** — The `EncryptedDataSchema` TypeBox schema validates
the encryption envelope at write time.
3. **Domain flexibility** — An "ACL graph" might also have encrypted credential
nodes. A "call graph" might store encrypted auth headers. Different graphs,
same pattern.
4. **Query through edges** — "Find all secrets for client X" becomes "find all
edges of type `has_secret` from node X to secret nodes."
5. **The crypto utility is shared**: `@alkdev/storage` exports `encrypt()` and
`decrypt()` that any consumer uses.
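Point 4 can be illustrated with a minimal in-memory sketch of the edge traversal. The `GraphNode` and `GraphEdge` shapes and the `secretsForClient` helper are hypothetical and exist only for this example; they are not the package's query API, and the real tables carry more columns.

```typescript
// Hypothetical in-memory shapes, reduced to what the lookup needs.
interface GraphNode {
  key: string;
  type: string;
  attributes: Record<string, unknown>;
}

interface GraphEdge {
  type: string;
  source: string; // key of the source node
  target: string; // key of the target node
}

// "Find all secrets for client X": follow has_secret edges out of X
// and resolve each edge target to its node.
function secretsForClient(
  clientKey: string,
  nodes: GraphNode[],
  edges: GraphEdge[],
): GraphNode[] {
  const byKey = new Map(nodes.map((n) => [n.key, n] as const));
  return edges
    .filter((e) => e.type === "has_secret" && e.source === clientKey)
    .map((e) => byKey.get(e.target))
    .filter((n): n is GraphNode => n !== undefined);
}

const sampleNodes: GraphNode[] = [
  { key: "client-a", type: "Client", attributes: { name: "A" } },
  { key: "secret-1", type: "Secret", attributes: { key: "api_key" } },
  { key: "secret-2", type: "Secret", attributes: { key: "oauth_credentials" } },
];

const sampleEdges: GraphEdge[] = [
  { type: "has_secret", source: "client-a", target: "secret-1" },
  { type: "has_secret", source: "client-a", target: "secret-2" },
  { type: "owns", source: "client-a", target: "secret-2" }, // ignored: wrong edge type
];

// Logs the keys of the two secret nodes reachable over has_secret edges.
console.log(secretsForClient("client-a", sampleNodes, sampleEdges).map((n) => n.key));
```

Against the SQLite host the same traversal would be a filtered read of `edges` joined to `nodes`; the shape of the question is unchanged.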
### What Lives Where
| Layer | Responsibility | Package |
| ------------------------ | --------------------------------------------------------- | ------------------------ |
| `@alkdev/storage` graphs | `EncryptedDataSchema` (TypeBox shape) | `@alkdev/storage` |
| `@alkdev/storage` crypto | `encrypt()`, `decrypt()`, `generateEncryptionKey()` | `@alkdev/storage` |
| `@alkdev/storage` sqlite | Node storage (attributes contain encrypted JSON) | `@alkdev/storage/sqlite` |
| `@alkdev/storage` repo | Validate schema, encrypt on insert (⚠️ CRUD layer not yet built) | `@alkdev/storage` |
| Application | Key management (key ring, key rotation) | Consumer |
## EncryptedData Schema
Ported from the hub's `src/crypto/mod.ts` interface, now expressed as a TypeBox
schema in `@alkdev/storage`:
```ts
import { Type } from "@alkdev/typebox";

export const EncryptedDataSchema = Type.Object({
  keyVersion: Type.Integer({ minimum: 1 }),
  salt: Type.String(), // Base64-encoded 16-byte PBKDF2 salt
  iv: Type.String(), // Base64-encoded 12-byte AES-GCM initialization vector
  data: Type.String(), // Base64-encoded AES-256-GCM ciphertext
});
```
The fields are:
- `keyVersion`: which encryption key version was used (enables key rotation)
- `salt`: base64-encoded 16-byte PBKDF2 salt
- `iv`: base64-encoded 12-byte AES-GCM initialization vector
- `data`: base64-encoded AES-256-GCM ciphertext

This is the same structure as the hub's `EncryptedData` interface, expressed as a
TypeBox schema so that encrypted nodes can be validated at runtime on insert.
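The schema constrains field types but not decoded byte lengths. As a dependency-free illustration of what a stricter envelope check could look like, here is a hand-written type guard. `isEncryptedData` and `base64ByteLength` are hypothetical names, not exports of `@alkdev/storage`, and the length checks are an assumption layered on top of the schema. The 16-byte minimum for `data` reflects the authentication tag that Web Crypto appends to AES-GCM ciphertext.

```typescript
interface EncryptedData {
  keyVersion: number;
  salt: string;
  iv: string;
  data: string;
}

// Decoded byte length of a base64 string, or -1 if it does not decode.
function base64ByteLength(value: string): number {
  try {
    return atob(value).length;
  } catch {
    return -1;
  }
}

// Hypothetical guard: mirrors the schema's field types and additionally
// checks the decoded sizes described in the field list above.
function isEncryptedData(value: unknown): value is EncryptedData {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.keyVersion === "number" &&
    Number.isInteger(v.keyVersion) &&
    v.keyVersion >= 1 &&
    typeof v.salt === "string" &&
    base64ByteLength(v.salt) === 16 &&
    typeof v.iv === "string" &&
    base64ByteLength(v.iv) === 12 &&
    typeof v.data === "string" &&
    base64ByteLength(v.data) >= 16 // at least the 16-byte GCM tag
  );
}
```

A guard like this would run before a node is written, so a malformed envelope is rejected at insert time rather than surfacing later as a decryption failure.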
## Crypto Utility
The encryption module provides three functions, ported from the hub's
`src/crypto/mod.ts`:
### `encrypt(plaintext, password, keyVersion?): Promise<EncryptedData>`
Encrypts a string using AES-256-GCM with PBKDF2 key derivation.
**Process**:
1. Generate random 16-byte salt
2. Generate random 12-byte IV
3. Derive 256-bit key from password + salt via PBKDF2 (SHA-256, 100k iterations
for v1)
4. Encrypt plaintext with AES-256-GCM using the derived key and IV
5. Return
`{ keyVersion, salt: base64(salt), iv: base64(iv), data: base64(ciphertext) }`
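The five steps above can be sketched directly on the Web Crypto API. This is a minimal illustration, not the package source: `PBKDF2_ITERATIONS`, `toBase64`, and `deriveKey` are hypothetical helper names, the default `keyVersion` of 1 is an assumption, and only the v1 iteration count comes from this document.

```typescript
interface EncryptedData {
  keyVersion: number;
  salt: string;
  iv: string;
  data: string;
}

// Assumed lookup table; only the v1 value is stated in this document.
const PBKDF2_ITERATIONS: Record<number, number> = { 1: 100_000 };

function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}

// Step 3: derive a 256-bit AES-GCM key from password + salt via PBKDF2/SHA-256.
async function deriveKey(
  password: string,
  salt: Uint8Array<ArrayBuffer>,
  keyVersion: number,
): Promise<CryptoKey> {
  const iterations = PBKDF2_ITERATIONS[keyVersion];
  if (iterations === undefined) throw new Error(`Unknown key version: ${keyVersion}`);
  const material = await crypto.subtle.importKey(
    "raw",
    new TextEncoder().encode(password),
    "PBKDF2",
    false,
    ["deriveKey"],
  );
  return crypto.subtle.deriveKey(
    { name: "PBKDF2", salt, iterations, hash: "SHA-256" },
    material,
    { name: "AES-GCM", length: 256 },
    false,
    ["encrypt", "decrypt"],
  );
}

async function encrypt(
  plaintext: string,
  password: string,
  keyVersion = 1, // assumed default
): Promise<EncryptedData> {
  const salt = crypto.getRandomValues(new Uint8Array(16)); // step 1
  const iv = crypto.getRandomValues(new Uint8Array(12)); // step 2
  const key = await deriveKey(password, salt, keyVersion); // step 3
  const ciphertext = await crypto.subtle.encrypt( // step 4
    { name: "AES-GCM", iv },
    key,
    new TextEncoder().encode(plaintext),
  );
  return { // step 5
    keyVersion,
    salt: toBase64(salt),
    iv: toBase64(iv),
    data: toBase64(new Uint8Array(ciphertext)),
  };
}
```

Because the salt and IV are freshly generated on every call, encrypting the same plaintext twice yields different envelopes.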
### `decrypt(encryptedData, password): Promise<string>`
Decrypts an `EncryptedData` object.
**Process**:
1. Decode base64 salt, IV, and ciphertext
2. Derive key from password + salt via PBKDF2, using the iteration count for the stored `keyVersion`
3. Decrypt with AES-256-GCM
4. Return plaintext string
5. On any failure, throw `"Decryption failed: Invalid data or key"` (a single
generic message, so nothing leaks about which part failed)
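A matching sketch of the decryption path, again written against the Web Crypto API rather than copied from the package. `PBKDF2_ITERATIONS` and `fromBase64` are hypothetical helpers, and the iteration table is an assumption beyond its v1 entry. Every failure (bad base64, unknown version, wrong password, tampered ciphertext) is collapsed into the one generic error.

```typescript
interface EncryptedData {
  keyVersion: number;
  salt: string;
  iv: string;
  data: string;
}

// Assumed lookup table; only the v1 value is stated in this document.
const PBKDF2_ITERATIONS: Record<number, number> = { 1: 100_000 };

function fromBase64(value: string): Uint8Array<ArrayBuffer> {
  const binary = atob(value);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

async function decrypt(encrypted: EncryptedData, password: string): Promise<string> {
  try {
    // Step 1: decode the envelope fields.
    const salt = fromBase64(encrypted.salt);
    const iv = fromBase64(encrypted.iv);
    const ciphertext = fromBase64(encrypted.data);

    // Step 2: re-derive the key with the iteration count for this version.
    const iterations = PBKDF2_ITERATIONS[encrypted.keyVersion];
    if (iterations === undefined) throw new Error("unknown key version");
    const material = await crypto.subtle.importKey(
      "raw",
      new TextEncoder().encode(password),
      "PBKDF2",
      false,
      ["deriveKey"],
    );
    const key = await crypto.subtle.deriveKey(
      { name: "PBKDF2", salt, iterations, hash: "SHA-256" },
      material,
      { name: "AES-GCM", length: 256 },
      false,
      ["decrypt"],
    );

    // Steps 3 and 4: AES-GCM verifies the tag, then yields the plaintext bytes.
    const plaintext = await crypto.subtle.decrypt({ name: "AES-GCM", iv }, key, ciphertext);
    return new TextDecoder().decode(plaintext);
  } catch {
    // Step 5: one generic message regardless of the underlying cause.
    throw new Error("Decryption failed: Invalid data or key");
  }
}
```

AES-GCM is authenticated, so a wrong password or a modified ciphertext fails tag verification instead of returning garbage.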
### `generateEncryptionKey(): string`
Generates a 32-byte random key encoded as base64. Used by operators to create
encryption keys for the key ring.
**Key ring format** (application-level, not in this package): A comma-separated
list of `v{N}:{base64key}` pairs. The first key is the "current" key used for
new encryptions. All keys are available for decryption.
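Since the key ring lives in the application, a consumer needs a small parser for it. The sketch below shows one way to read the comma-separated `v{N}:{base64key}` format; `KeyRing` and `parseKeyRing` are hypothetical names, and rejecting duplicate versions is an assumption rather than documented behavior.

```typescript
interface KeyRing {
  currentVersion: number; // version of the first entry, used for new encryptions
  keys: Map<number, string>; // every version stays available for decryption
}

function parseKeyRing(raw: string): KeyRing {
  const keys = new Map<number, string>();
  let currentVersion: number | undefined;

  for (const entry of raw.split(",")) {
    const match = /^v(\d+):(.+)$/.exec(entry.trim());
    // Key material is deliberately kept out of error messages.
    if (!match) throw new Error("Malformed key ring entry");
    const version = Number(match[1]);
    if (keys.has(version)) throw new Error(`Duplicate key version: v${version}`);
    keys.set(version, match[2]);
    if (currentVersion === undefined) currentVersion = version; // first entry wins
  }

  if (currentVersion === undefined) throw new Error("Key ring is empty");
  return { currentVersion, keys };
}

// Example: v2 is current, v1 is retained so old data can still be decrypted.
const ring = parseKeyRing("v2:bmV3LWtleQ==,v1:b2xkLWtleQ==");
console.log(ring.currentVersion, [...ring.keys.keys()]);
```

The current version is determined by position, not by the highest number, which matches the "first key is current" rule above.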
### Key Versioning
PBKDF2 iteration count varies by key version:
- v1: 100,000 iterations
- Future versions: 200,000+ (adjust for hardware improvements)
This allows gradual security upgrades. Old data encrypted with v1 can still be
decrypted. Re-encryption (rotate) reads with the old key and writes with the
current key.
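Rotation is orchestrated by the application, so the following is only a sketch of what that orchestration might look like for a single envelope. `rotate`, `EncryptFn`, and `DecryptFn` are hypothetical names; the crypto functions are passed in as parameters so the example does not assume anything about how they are imported.

```typescript
interface EncryptedData {
  keyVersion: number;
  salt: string;
  iv: string;
  data: string;
}

type EncryptFn = (
  plaintext: string,
  password: string,
  keyVersion?: number,
) => Promise<EncryptedData>;
type DecryptFn = (encrypted: EncryptedData, password: string) => Promise<string>;

// Re-encrypt one envelope under the current key version.
async function rotate(
  encrypted: EncryptedData,
  keys: Map<number, string>, // version -> key, from the application's key ring
  currentVersion: number,
  encrypt: EncryptFn,
  decrypt: DecryptFn,
): Promise<EncryptedData> {
  // Already on the current version: nothing to do.
  if (encrypted.keyVersion === currentVersion) return encrypted;

  const oldKey = keys.get(encrypted.keyVersion);
  const newKey = keys.get(currentVersion);
  if (oldKey === undefined || newKey === undefined) {
    throw new Error("Key version missing from key ring");
  }

  // Read with the old key, write with the current key.
  const plaintext = await decrypt(encrypted, oldKey);
  return encrypt(plaintext, newKey, currentVersion);
}
```

A consumer would call this per secret node and write the returned envelope back to the node's attributes; old keys can be retired once no node references their version.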
### Web Crypto API
The implementation uses the standard Web Crypto API (`crypto.subtle`), available
in:
- Deno runtime (native)
- Node.js 19+ (native)
- Modern browsers (native)
- Cloudflare Workers (native)
No external crypto dependencies.
## Design Decisions
All design decisions are documented as ADRs in [decisions/](decisions/).
| ADR | Decision | Summary |
|-----|----------|---------|
| [023](decisions/023-per-attribute-encryption.md) | Per-attribute encryption, not per-node | Only sensitive payload encrypted; key/metadata remain queryable |
| [024](decisions/024-encrypted-data-as-node-type.md) | Encrypted data as node type, not standalone table | No special tables; metagraph pattern with `SecretNode` and `HasSecretEdge` |
| [025](decisions/025-password-based-encryption-pbkdf2.md) | Password-based encryption via PBKDF2 | Consistent with hub; ~100ms per operation; `encryptRaw()` added later if needed |
| [026](decisions/026-application-managed-key-ring.md) | Application-managed key ring | Storage provides encrypt/decrypt primitives, not key management |
| [027](decisions/027-no-key-rotation-utility.md) | No key rotation utility in this package | Application orchestrates rotation; storage provides building blocks |
## Integration with SQLite Host
Encrypted node attributes are stored as JSON text in the `nodes.attributes`
column, same as any other node attributes. The `EncryptedDataSchema` validates
the shape at the application level.
```ts
import { decrypt, encrypt } from "@alkdev/storage";

// Key ring entry from application config: "v1:YmFzZTY0a2V5".
// The base64 key is passed as the password; the version is passed separately.
const encryptionKey = "YmFzZTY0a2V5";
const plaintext = "sk-ant-api03-...";
const encryptedData = await encrypt(plaintext, encryptionKey, 1);

// Node attributes; `encryptedData` must match EncryptedDataSchema
const attributes = {
  key: "api_key",
  encryptedData,
  expiresAt: new Date().toISOString(),
  created: new Date().toISOString(),
};

// Store as a node in a graph
// db.insert(nodes).values({ graphId, key: "anthropic-api-key", attributes });

// Retrieve and decrypt
// const node = await db.query.nodes.findFirst({ where: eq(nodes.key, "anthropic-api-key") });
// const decrypted = await decrypt(node.attributes.encryptedData, encryptionKey);
```
## Export Plan
The crypto module is exported from the main `@alkdev/storage` package, which has
no database dependencies:
```
src/graphs/
├── modules/
│ ├── metagraph.ts # Metagraph Module (Config, BaseNode, BaseEdge)
│ ├── call-graph.ts # CallGraph reference Module
│ ├── secret-graph.ts # SecretGraph reference Module (uses EncryptedDataSchema)
│ └── index.ts # Barrel re-export
├── bridge.ts # moduleToDbSchema, validateNode, validateEdge
├── crypto.ts # encrypt(), decrypt(), generateEncryptionKey(), EncryptedDataSchema
└── mod.ts # Re-exports all of the above
```
The encryption utility is in the zero-dep export path (it only uses Web Crypto
API and `@alkdev/typebox` for the schema). `SecretGraph` in `secret-graph.ts`
composes `EncryptedDataSchema` into a node type via `Type.Composite`.
## Open Questions
Open questions are tracked in [open-questions.md](open-questions.md). Key
questions affecting encrypted data:
- **OQ-07**: Should we add `encryptRaw()` for performance? (open, low priority)
- **OQ-08**: Should the `key` attribute on secret nodes be encrypted? (resolved: plaintext for now)
- **OQ-09**: Should secret nodes have `lastUsedAt` and `expiresAt` as first-class columns? (resolved: JSON attributes for spoke, standalone table for hub)
## References
- Web Crypto API: https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto
- Hub crypto utility (provenance): `/workspace/@alkdev/hub/src/crypto/mod.ts`
- Hub `client_secrets` table (provenance):
`/workspace/@alkdev/hub/docs/architecture/storage/services.md`
- Hub ADR-008 (provenance):
`/workspace/@alkdev/hub/docs/decisions/ADR-008-secrets-encrypted-at-rest-with-key-versioning.md`
- `@alkdev/operations` AccessControl:
`/workspace/@alkdev/operations/docs/architecture/api-surface.md`