From ae242f33b97261a8ee64f0958b03b660806da44e Mon Sep 17 00:00:00 2001 From: "glm-5.1" Date: Tue, 2 Jun 2026 12:33:20 +0000 Subject: [PATCH] Restructure identity tables: separate credential types, add peer_credentials, specify FK cascades and indexes MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Identity tables were derived from hub's PostgreSQL schema but simplified without documenting what was removed or why. This restructures them for the current auth landscape (API key + wraith SSH/cert-authority): - ADR-049: Separate api_keys and peer_credentials tables (different lookup patterns, columns, lifecycles), remove Gitea columns, map hub data→metadata - ADR-050: Extract SHA-256 vs KDF decision from inline spec text - Add peer_credentials table for SSH key and cert-authority auth - Specify all FK cascade behaviors within system DB (RESTRICT, CASCADE, SET NULL) - Complete index specifications for all identity tables - Add scope boundary section (storage owns schemas, not auth/authorization) - Update audit_logs with credentialId+credentialType polymorphic reference - Add 3 new open questions (OQ-33/34/35) for credential type expansion --- docs/architecture/README.md | 7 +- .../049-identity-schema-restructuring.md | 194 +++++++++++ .../050-sha256-for-api-key-hashing.md | 52 +++ docs/architecture/open-questions.md | 34 +- docs/architecture/sqlite-host.md | 310 +++++++++++++++--- 5 files changed, 539 insertions(+), 58 deletions(-) create mode 100644 docs/architecture/decisions/049-identity-schema-restructuring.md create mode 100644 docs/architecture/decisions/050-sha256-for-api-key-hashing.md diff --git a/docs/architecture/README.md b/docs/architecture/README.md index c23e212..0d34ade 100644 --- a/docs/architecture/README.md +++ b/docs/architecture/README.md @@ -1,6 +1,6 @@ --- status: draft -last_updated: 2026-05-31 +last_updated: 2026-06-02 --- # @alkdev/storage Architecture @@ -79,6 +79,11 @@ remain to be implemented. 
| [043](decisions/043-graph-type-scope.md) | Graph type scope — system/tenant/user | Accepted | | [044](decisions/044-drizzle-honker-adapter.md) | Drizzle-Honker session adapter | Accepted | | [045](decisions/045-org-members-authoritative-belongsto-derived.md) | organization_members authoritative, BelongsToEdge derived | Accepted | +| [046](decisions/046-fold-drizzlebox-as-utils.md) | Fold @alkdev/drizzlebox as src/sqlite/utils | Accepted | +| [047](decisions/047-honker-event-target.md) | HonkerEventTarget adapter for pubsub | Accepted | +| [048](decisions/048-operation-specs-as-repo-surface.md) | OperationSpecs as repository surface | Accepted | +| [049](decisions/049-identity-schema-restructuring.md) | Identity schema restructuring — separate credential tables, remove Gitea, data→metadata | Accepted | +| [050](decisions/050-sha256-for-api-key-hashing.md) | SHA-256 for machine-generated API keys | Accepted | ### Open Questions diff --git a/docs/architecture/decisions/049-identity-schema-restructuring.md b/docs/architecture/decisions/049-identity-schema-restructuring.md new file mode 100644 index 0000000..d01174a --- /dev/null +++ b/docs/architecture/decisions/049-identity-schema-restructuring.md @@ -0,0 +1,194 @@ +# ADR-049: Identity Schema Restructuring + +## Status + +Accepted + +## Context + +The identity tables in `sqlite-host.md` were derived from the hub's PostgreSQL +schema (`@alkdev/hub/docs/architecture/storage/identity.md`) but simplified +without documenting what was removed and why. This creates ambiguity for +implementation: + +1. **Gitea columns** (`accounts.giteaUsername`, `organizations.giteaOrgName`) + were dropped without documented rationale. These are hub-specific + integration columns — a storage package should not couple to a particular + git hosting provider. Git association, when needed, belongs in a metagraph + instance (e.g., a project graph with git repo metadata) or a downstream + consumer's schema. + +2. 
**`data` JSONB columns** on `accounts` and `organizations` in the hub schema + were silently dropped. The hub used these for account preferences/profile + and org billing/settings. Storage's `commonCols.metadata` serves the same + purpose — an extension namespace following `_subsystem.key` convention. The + mapping from hub `data` to storage `metadata` is unambiguous but was never + stated. + +3. **Single `api_keys` table** assumes keypal-style bearer token auth only. The + @alkdev platform now has two authentication mechanisms: + - **API key** (keypal-style): client sends a bearer token, hub hashes it, + looks up by `keyHash`. Transport: HTTP/WebSocket. + - **Peer credential** (wraith-style): client presents an Ed25519 public key + or OpenSSH certificate over an SSH channel (TCP/TLS/Iroh). Server + validates against known fingerprints. Transport: wraith SSH tunnel. + + These credential types have fundamentally different query patterns (hash + lookup vs fingerprint lookup), different columns (keyHash makes no sense + for SSH keys; publicKeyFingerprint makes no sense for API keys), and + different lifecycles (rotation vs addition/removal). A single table would be + mostly nulls on either side. + +4. **Missing FK cascade behavior** — Identity tables live in the same system + DB, so real FK constraints apply. The current spec uses "logical reference" + language (appropriate for cross-DB scoping columns on `graphs`) for + relationships that are intra-database and should have proper cascades. + +5. **Missing columns** — The hub spec includes `api_keys.description`, + `api_keys.rotatedToId`, `api_keys.lastUsedAt`, and `audit_logs.sessionId` + that were dropped from storage without rationale. + +## Decision + +### 1. 
Separate credential tables by type + +Two tables with distinct columns, not a unified `credentials` table with a +type discriminator: + +| Table | Auth mechanism | Lookup pattern | Transport | +|-------|----------------|----------------|-----------| +| `api_keys` | Bearer token | Hash token → look up `keyHash` | HTTP/WebSocket | +| `peer_credentials` | SSH key / cert-authority | Present fingerprint → look up `fingerprint` | wraith (SSH over TCP/TLS/Iroh) | + +Rationale: Query patterns, columns, and lifecycles differ fundamentally. +Credential type proliferation is expected to be low (3-4 types ever). A new +table per type is acceptable. This extends the design principle established by +ADR-002: the metagraph pattern serves graph-shaped data (dynamic schemas, +traversal queries); dedicated tables serve fixed-schema data with known +columns and relational query patterns. + +**Decision criterion for future credential types**: Credential types sharing +the same lookup column get a table with a `credentialType` discriminator. +Credential types requiring different lookup columns get their own table. For +example, a future `tls_certificate` credential that uses fingerprint-based +lookup would join `peer_credentials` with a new `credentialType` value. A +credential type using a different lookup column (e.g., a client identifier) +would warrant a new table. + +### 2. Remove Gitea columns + +`accounts.giteaUsername` and `organizations.giteaOrgName` are removed. Git +hosting integration is a consumer concern, not a storage infrastructure +concern. When a downstream system needs to associate accounts or organizations +with git hosting, it stores that association in: +- A metagraph instance (e.g., a project graph with a `GitRepositoryNode`) +- Consumer-side schema extensions via `commonCols.metadata` + +### 3. Hub `data` column maps to storage `commonCols.metadata` + +The hub's `accounts.data` and `organizations.data` JSONB columns are not +present in storage's identity tables. 
Their purpose (extensible account/org +metadata) is served by `commonCols.metadata`, which follows the +`_subsystem.key` convention. No data is lost — the extensible namespace +already exists. The hub maps `data` fields into `metadata` keys when +migrating. + +### 4. Add `peer_credentials` table + +A new table for SSH key and certificate-authority authentication over wraith +transport: + +| Column | Type | Notes | +|--------|------|-------| +| commonCols | — | id, metadata, createdAt, updatedAt | +| ownerId | text NOT NULL | FK → accounts.id (CASCADE) | +| credentialType | text NOT NULL | `ssh_key`, `cert_authority` | +| fingerprint | text NOT NULL UNIQUE | Ed25519 key fingerprint (SHA-256) | +| publicKeyData | text NOT NULL | Full public key in OpenSSH format | +| name | text | Human-readable label | +| enabled | integer NOT NULL DEFAULT 1 | Immediate disable switch | +| expiresAt | integer (timestamp) | Null = never | +| revokedAt | integer (timestamp) | Null = active | + +The `credentialType` discriminator separates key entries from CA entries within +a single table because their query pattern (look up by fingerprint) is +identical. Cert-specific data (principals, restrictions, caFingerprint) goes +in `metadata`. + +### 5. Add back useful API key columns + +| Column | Reason | +|--------|--------| +| `rotatedToId` | API key rotation tracking — sets which key replaced this one | +| `lastUsedAt` | Stale key cleanup and access pattern analysis | + +`description` is not added. `name` + `metadata` covers labeling needs. + +### 6. 
Specify FK cascade behavior within system DB + +All identity table FKs are intra-database and use real constraints: + +| Relationship | onDelete | Rationale | +|-------------|----------|-----------| +| organizations.ownerId → accounts.id | RESTRICT | Cannot delete owner account while org exists | +| organization_members.orgId → organizations.id | CASCADE | Org deletion removes memberships | +| organization_members.accountId → accounts.id | CASCADE | Account deletion removes memberships | +| api_keys.ownerId → accounts.id | CASCADE | Account deletion removes API keys | +| peer_credentials.ownerId → accounts.id | CASCADE | Account deletion removes peer credentials | +| audit_logs.ownerId → accounts.id | RESTRICT | Audit integrity — deactivate account instead of delete | + +`audit_logs.credentialId` (which replaces `keyId`, see section 7) is a logical +reference (not an FK) because it may point at a row in either `api_keys` or +`peer_credentials`; the `credentialType` column identifies which table. + +### 7. Update audit_logs for multi-credential world + +Replace `keyId` (API key only) with polymorphic credential references: + +| Column | Type | Notes | +|--------|------|-------| +| credentialId | text | Logical reference to api_keys.id or peer_credentials.id | +| credentialType | text | `api_key`, `peer_credential`, or null | + +This replaces the previous `keyId` column. The `credentialType` discriminator +tells the consumer which table to look up (same pattern as +`graphs.ownerId` — logical reference, not FK). + +`sessionId` is not added. Session correlation is a hub concern, not a storage +infrastructure concern. When needed, it goes in `metadata`. 
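The cascade table in section 6 can also be read as data. The sketch below is illustrative only: `IDENTITY_FK_RULES`, `planDelete`, and the row-count input are hypothetical names, not part of the storage package. It models one level of references and assumes SQLite's behavior that any referencing row under a RESTRICT rule aborts the whole delete.

```typescript
type OnDelete = "RESTRICT" | "CASCADE";

interface FkRule {
  child: string;  // referencing table.column
  parent: string; // referenced table.column
  onDelete: OnDelete;
}

// Rules copied from the ADR-049 cascade table (section 6).
const IDENTITY_FK_RULES: FkRule[] = [
  { child: "organizations.ownerId", parent: "accounts.id", onDelete: "RESTRICT" },
  { child: "organization_members.orgId", parent: "organizations.id", onDelete: "CASCADE" },
  { child: "organization_members.accountId", parent: "accounts.id", onDelete: "CASCADE" },
  { child: "api_keys.ownerId", parent: "accounts.id", onDelete: "CASCADE" },
  { child: "peer_credentials.ownerId", parent: "accounts.id", onDelete: "CASCADE" },
  { child: "audit_logs.ownerId", parent: "accounts.id", onDelete: "RESTRICT" },
];

interface DeletePlan {
  allowed: boolean;
  blockedBy: string[];  // RESTRICT references that still have rows
  cascadesTo: string[]; // CASCADE references whose rows would be removed
}

// Hypothetical helper: given how many rows reference the parent row through
// each FK, report whether the delete would succeed and what it would remove.
function planDelete(parent: string, referencingRows: Record<string, number>): DeletePlan {
  const rules = IDENTITY_FK_RULES.filter((r) => r.parent === parent);
  const hasRows = (r: FkRule) => (referencingRows[r.child] ?? 0) > 0;
  const blockedBy = rules
    .filter((r) => r.onDelete === "RESTRICT" && hasRows(r))
    .map((r) => r.child);
  const cascadesTo =
    blockedBy.length > 0
      ? []
      : rules.filter((r) => r.onDelete === "CASCADE" && hasRows(r)).map((r) => r.child);
  return { allowed: blockedBy.length === 0, blockedBy, cascadesTo };
}

// An account with API keys and audit entries cannot be deleted.
console.log(planDelete("accounts.id", { "api_keys.ownerId": 2, "audit_logs.ownerId": 5 }));
```

In this model an account with any audit entries is never deletable, which matches the "deactivate instead of delete" rationale in the table.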
+ +## Consequences + +**Positive:** + +- Two credential types covered from the start — API key auth and wraith SSH + auth can both be stored and looked up efficiently +- Each credential table has native columns for its specific fields — no + null-heavy rows, no JSON lookups for high-query fields +- Gitea coupling removed — storage doesn't depend on a specific git hosting + provider +- FK cascades specified — implementers know exactly what happens on deletion +- Clear provenance mapping — hub's `data` → storage's `metadata` is explicit + +**Negative:** + +- Two credential tables instead of one — but the columns don't overlap and + query patterns differ, so this is the correct trade-off +- `audit_logs.credentialId`/`credentialType` polymorphic reference — no FK + constraint, consumer resolves the table (same pattern as existing + cross-DB references) +- Hub must migrate its existing identity schema when consuming storage's + definitions — `keyId` → `credentialId` + `credentialType`, `data` → `metadata`, + Gitea columns to metagraph or consumer metadata +- `peer_credentials` credential types may grow (e.g., `tls_certificate`) — + handled by adding enum values, not new tables, since query patterns within + peer auth are similar + +## References + +- ADR-002: Metagraph over domain-specific tables +- ADR-041: Identity tables in storage package +- ADR-040: System DB + tenant DB separation +- Hub identity tables: `/workspace/@alkdev/hub/docs/architecture/storage/identity.md` +- Wraith NAPI + pubsub: `/workspace/@alkdev/wraith/docs/architecture/napi-and-pubsub.md` +- Wraith auth: `/workspace/@alkdev/wraith/docs/architecture/decisions/012-auth-ed25519-and-cert-authority.md` \ No newline at end of file diff --git a/docs/architecture/decisions/050-sha256-for-api-key-hashing.md b/docs/architecture/decisions/050-sha256-for-api-key-hashing.md new file mode 100644 index 0000000..ad001a2 --- /dev/null +++ b/docs/architecture/decisions/050-sha256-for-api-key-hashing.md @@ -0,0 +1,52 
@@ +# ADR-050: SHA-256 for Machine-Generated API Keys + +## Status + +Accepted + +## Context + +API key hashing has two common approaches: + +1. **Fast hash** (SHA-256): O(1) verification at high throughput. Standard for + machine-generated tokens. +2. **Slow KDF** (bcrypt, Argon2): Intentionally expensive to slow brute-force + attacks. Standard for human-chosen passwords. + +The choice depends on the input entropy. Human passwords are low-entropy +(maybe 30-40 bits of actual randomness even with complexity requirements), so +brute-force is feasible unless the hash is slow. Machine-generated keys are +high-entropy (128-bit+ randomness from a CSPRNG such as `crypto.randomBytes(16)`; a v4 UUID from `crypto.randomUUID()` carries 122 random bits), +making brute-force computationally infeasible even with a fast hash. + +## Decision + +Use SHA-256 for API key hashing. Do not use bcrypt or Argon2. + +The API keys in `@alkdev/storage` are machine-generated secrets with 128-bit+ +entropy. An attacker attempting to brute-force a SHA-256 hash of such a key +faces 2^128 possible inputs — infeasible regardless of hash speed. Slow KDFs +add latency (50-200ms per verification) without meaningful security +improvement for high-entropy inputs. + +## Consequences + +**Positive:** + +- Fast O(1) verification — critical for high-throughput API authentication +- Widely supported — every language/runtime has SHA-256 built in +- Simple implementation — no salt generation, no cost parameter tuning + +**Negative:** + +- If a consumer generates low-entropy keys (short, predictable patterns), + SHA-256 provides less protection against brute-force than a slow KDF. This + is a consumer responsibility — the storage table schema cannot enforce key + generation quality. +- Against a quantum attacker, Grover's algorithm would cut a 2^128 key search to roughly 2^64 hash evaluations. This is acceptable for API keys, + which can be rotated, unlike passwords which are often long-lived. 
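The decision above can be illustrated with Node's built-in `node:crypto` module. This is a minimal consumer-side sketch, not the keypal or hub implementation: `generateRawKey`, `hashApiKey`, and the in-memory `keysByHash` map are hypothetical stand-ins, and the 32-byte key length is an assumption chosen to exceed the 128-bit floor the ADR names.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hypothetical helper: 32 bytes (256 bits) from the OS CSPRNG, URL-safe encoding.
function generateRawKey(): string {
  return randomBytes(32).toString("base64url");
}

// Hex-encoded SHA-256 of the raw key; this is the value stored in api_keys.keyHash.
function hashApiKey(rawKey: string): string {
  return createHash("sha256").update(rawKey, "utf8").digest("hex");
}

// In-memory stand-in for the unique index on api_keys.keyHash.
const keysByHash = new Map<string, { id: string; ownerId: string }>();

// Issue a key: only the hash is persisted, the raw key is returned once.
const rawKey = generateRawKey();
keysByHash.set(hashApiKey(rawKey), { id: "key-1", ownerId: "account-1" });

// Authenticate: hash the presented token and look it up by hash.
const match = keysByHash.get(hashApiKey(rawKey));
console.log(match?.ownerId);
```

Because the lookup key is the hash itself, verification is a single indexed read, which is the throughput property the ADR relies on.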
+ +## References + +- `api_keys.keyHash` in [sqlite-host.md](../sqlite-host.md) +- Hub ADR-010: SHA-256 for API key hashing (same decision, provenance) \ No newline at end of file diff --git a/docs/architecture/open-questions.md b/docs/architecture/open-questions.md index a268b8b..71bc878 100644 --- a/docs/architecture/open-questions.md +++ b/docs/architecture/open-questions.md @@ -1,6 +1,6 @@ --- status: draft -last_updated: 2026-06-01 +last_updated: 2026-06-02 --- # Open Questions Tracker @@ -12,7 +12,7 @@ architecture documents, organized by theme. | Status | Count | |--------|-------| -| Open | 10 | +| Open | 13 | | Resolved (this revision) | 18 | | Previously resolved | 11 | @@ -30,6 +30,9 @@ architecture documents, organized by theme. - **OQ-30** (composite event target for single-node hub) — latency optimization - **OQ-31** (consumer naming for durable subscriptions) — restart stability - **OQ-32** (Drizzle Kit migration compatibility) — custom adapter +- **OQ-33** (peer_credentials SSH key type expansion) — defer until needed +- **OQ-34** (hub api_keys migration path) — needed for hub transition +- **OQ-35** (peer_credentials Iroh auth metadata) — defer until Iroh NAPI complete ## Theme 1: Package Boundaries and Dependencies @@ -272,6 +275,29 @@ architecture documents, organized by theme. - **Priority**: medium - **Notes**: Drizzle Kit supports SQLite migrations but expects `better-sqlite3` or `libsql`. Need to verify `drizzle-kit push`/`drizzle-kit generate` works with the custom Honker adapter, or whether we need a custom migration runner. +## Theme 9: Identity and Credentials + +### OQ-33: Should `peer_credentials.credentialType` support additional SSH key types beyond Ed25519? + +- **Origin**: [sqlite-host.md](sqlite-host.md) +- **Status**: open +- **Priority**: low +- **Notes**: Current spec assumes Ed25519 only (matching wraith ADR-012). RSA and ECDSA keys are common in legacy SSH deployments. 
If wraith adds support for additional key types, `credentialType` values like `ssh_key_rsa`, `ssh_key_ecdsa` or a `keyType` column may be needed. Defer until wraith supports additional key types. + +### OQ-34: How should hub `api_keys` data migrate to the restructured storage schema? + +- **Origin**: [sqlite-host.md](sqlite-host.md), [ADR-049](decisions/049-identity-schema-restructuring.md) +- **Status**: open +- **Priority**: medium +- **Notes**: The hub's existing PostgreSQL identity tables have columns (`api_keys.description`, `audit_logs.keyId`) that map differently to storage's schema. `description` maps to `metadata` (no dedicated column). `keyId` (FK → api_keys.id) becomes `credentialId` + `credentialType` (polymorphic). Hub's `data` columns map to `commonCols.metadata`. A migration script is needed when the hub consumes storage's identity tables. + +### OQ-35: Should `peer_credentials` support Iroh-specific authentication metadata? + +- **Origin**: [sqlite-host.md](sqlite-host.md) +- **Status**: open +- **Priority**: low +- **Notes**: Iroh connections use node IDs (base58-encoded) for addressing. If Iroh provides an authentication mechanism beyond SSH key auth (e.g., node ID-based trust), `peer_credentials` may need an Iroh-specific credential type or additional columns. The Iroh NAPI wrapper is not yet complete; defer until its pubsub integration is implemented. + ## ADR Impact | ADR | Resolves | Informs | @@ -294,4 +320,6 @@ architecture documents, organized by theme. 
| ADR-045 | OQ-23 | OQ-20 | | ADR-046 | | OQ-17 | | ADR-047 | OQ-26 | OQ-30 | -| ADR-048 | OQ-17 (updated), OQ-18 (updated), OQ-19 (updated) | | \ No newline at end of file +| ADR-048 | OQ-17 (updated), OQ-18 (updated), OQ-19 (updated) | | +| ADR-049 | | OQ-33, OQ-34, OQ-35 | +| ADR-050 | | | \ No newline at end of file diff --git a/docs/architecture/sqlite-host.md b/docs/architecture/sqlite-host.md index 0c8cd12..d568717 100644 --- a/docs/architecture/sqlite-host.md +++ b/docs/architecture/sqlite-host.md @@ -1,6 +1,6 @@ --- status: draft -last_updated: 2026-06-01 +last_updated: 2026-06-02 --- # SQLite Host @@ -38,7 +38,8 @@ src/sqlite/ │ │ ├── accounts.ts # accounts table + select/insert schemas │ │ ├── organizations.ts # organizations table + select/insert schemas │ │ ├── organization_members.ts # org membership + select/insert schemas -│ │ ├── api_keys.ts # API keys (keypal) + select/insert schemas +│ │ ├── api_keys.ts # API key credentials + select/insert schemas +│ │ ├── peer_credentials.ts # SSH key / cert-authority credentials + select/insert schemas │ │ ├── audit_logs.ts # audit trail + select/insert schemas │ │ └── index.ts # barrel re-export │ ├── metagraph/ @@ -205,86 +206,285 @@ type deletion if active graphs reference it. ## Identity Tables Identity tables live in the **system DB** (ADR-040, ADR-041). They provide -multi-tenant authentication and authorization infrastructure. These tables are -derived from the hub's existing identity tables; the schemas are aligned but -simplified for the storage package's scope. +multi-tenant authentication and authorization infrastructure. Storage owns the +table schemas and FK constraints; it does not own authentication logic, +authorization rules, key lifecycle, or credential verification — those are +consumer concerns. + +The identity schemas are derived from the hub's PostgreSQL identity tables +(ADR-049). 
Gitea-specific columns are removed (git hosting integration is a +consumer concern, modeled in metagraph instances or consumer metadata). The +hub's `data` JSONB columns map to `commonCols.metadata` (same extension +namespace, `_subsystem.key` convention). + +### Scope Boundary + +Storage's identity tables provide **persistence and structural constraints**. +Consumer concerns NOT in storage's scope: + +- Key generation, hashing, and verification (keypal, wraith handle this) +- Authentication protocol flow (hub/wraith handle this) +- Authorization and scope evaluation (ACL graph + operations enforce this) +- Account lifecycle policy (when to suspend, deactivate, transfer ownership) +- Key rotation and revocation orchestration +- Session and connection management ### `accounts` -| Column | Type | Notes | -|---------------|---------------------|-------| -| commonCols | — | id, metadata, createdAt, updatedAt | -| email | text NOT NULL UNIQUE | Unique identifier | -| displayName | text | Display name | -| accessLevel | text NOT NULL DEFAULT `user` | `admin`, `user`, `service` | -| status | text NOT NULL DEFAULT `active` | `active`, `suspended`, `deactivated` | +| Column | Type | Constraints | Notes | +|-------------|---------------------|----------------------------------------|-------------------------------------------------------| +| id | text | PK | Consumer-generated UUID | +| metadata | text (JSON) | default `{}` | Extension namespace (`_subsystem.key`). Replaces hub's `data` JSONB column (ADR-049). Account preferences, profile data. | +| createdAt | integer (timestamp) | not null, default `now` | | +| updatedAt | integer (timestamp) | not null, default `now` | | +| email | text | not null, **unique** | Primary identifier. Service accounts may use deployment-configured reserved patterns. 
| +| displayName | text | | Display name | +| accessLevel | text | not null, default `user` | `admin`, `user`, `service` | +| status | text | not null, default `active` | `active`, `suspended`, `deactivated` | -**Indexes**: `unq_accounts_email` UNIQUE on `(email)`. +**`accessLevel` semantics**: `admin` manages all resources across +organizations. `user` manages own resources and org-scoped resources. `service` +is an automated account (LLM workers, spoke credentials, CI tokens) — no git +hosting link required. + +**`status` semantics**: `active` can authenticate. `suspended` is admin-locked +(security hold). `deactivated` is user-initiated shutdown. Suspended and +deactivated accounts retain owned resources (RESTRICT FK) but cannot +authenticate. + +**Indexes**: `unq_accounts_email` UNIQUE on `(email)`, +`idx_accounts_access_level` on `(accessLevel)`, +`idx_accounts_status` on `(status)`. + +No `giteaUsername` column — git hosting integration is a consumer concern +(ADR-049). When needed, store git associations in `metadata` or a metagraph +instance. ### `organizations` -| Column | Type | Notes | -|----------|---------------------|-------| -| commonCols | — | id, metadata, createdAt, updatedAt | -| name | text NOT NULL UNIQUE | Organization name | -| slug | text NOT NULL UNIQUE | URL-friendly identifier | -| ownerId | text NOT NULL | Logical reference to accounts.id | +| Column | Type | Constraints | Notes | +|--------|---------------------|----------------------------------------|-------------------------------------------------| +| id | text | PK | Consumer-generated UUID | +| metadata | text (JSON) | default `{}` | Extension namespace. Replaces hub's `data` JSONB column (ADR-049). Org settings, billing data. 
| +| createdAt | integer (timestamp) | not null, default `now` | | +| updatedAt | integer (timestamp) | not null, default `now` | | +| name | text | not null, **unique** | Organization name | +| slug | text | not null, **unique** | URL-friendly identifier | +| ownerId | text | not null, FK → accounts.id (**RESTRICT**) | Administrative/transferable owner. Cannot delete owner account while org exists. Transfer ownership first. | -**Indexes**: `unq_organizations_name` UNIQUE on `(name)`, `unq_organizations_slug` UNIQUE on `(slug)`. +**`ownerId` semantics**: The administrative owner of the organization. This +account MUST also have `membershipLevel: 'owner'` in `organization_members` +(enforced by consumer). To change the owner, the consumer calls a transfer +ownership operation that: (1) validates the new owner has `membershipLevel: +'owner'`, (2) updates `ownerId`, (3) optionally demotes the old owner's +membership level. RESTRICT cascade prevents deleting the owner account while +the org exists. + +**Indexes**: `unq_organizations_name` UNIQUE on `(name)`, +`unq_organizations_slug` UNIQUE on `(slug)`, +`idx_organizations_owner_id` on `(ownerId)`. + +**Dual ownership representation**: `organizations.ownerId` and +`organization_members.membershipLevel: 'owner'` both represent ownership. The +column exists for efficient lookup (a single indexed read for "who owns this +org?") and RESTRICT FK semantics (cannot delete the owner account while the +org exists). The membership row exists for relational queries ("list all +owners of this org"). The consumer-enforced invariant is: `ownerId` always +references an account that also has `membershipLevel: 'owner'` in +`organization_members`. The consumer must maintain this invariant on +membership changes and ownership transfers. + +No `giteaOrgName` column — git hosting integration is a consumer concern +(ADR-049). 
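The three transfer steps described under `ownerId` semantics can be sketched as a consumer-side operation. Everything here is hypothetical: `transferOwnership` and the in-memory row types stand in for whatever the hub actually implements, and a real version would presumably run all steps inside one system-DB transaction so the dual-ownership invariant never breaks mid-way.

```typescript
type MembershipLevel = "owner" | "admin" | "member";

interface Organization {
  id: string;
  ownerId: string;
}

interface Membership {
  orgId: string;
  accountId: string;
  membershipLevel: MembershipLevel;
}

// Hypothetical operation over in-memory rows standing in for the
// organizations and organization_members tables.
function transferOwnership(
  org: Organization,
  members: Membership[],
  newOwnerId: string,
  demoteOldOwnerTo?: MembershipLevel,
): Organization {
  const rowFor = (accountId: string) =>
    members.find((m) => m.orgId === org.id && m.accountId === accountId);

  // (1) The new owner must already hold an owner-level membership.
  const incoming = rowFor(newOwnerId);
  if (!incoming || incoming.membershipLevel !== "owner") {
    throw new Error(`account ${newOwnerId} is not an owner-level member of org ${org.id}`);
  }

  // (2) Point organizations.ownerId at the new owner.
  const updated: Organization = { ...org, ownerId: newOwnerId };

  // (3) Optionally demote the previous owner's membership row.
  const outgoing = rowFor(org.ownerId);
  if (demoteOldOwnerTo && outgoing && outgoing.accountId !== newOwnerId) {
    outgoing.membershipLevel = demoteOldOwnerTo;
  }
  return updated;
}

const acme: Organization = { id: "org-1", ownerId: "alice" };
const acmeMembers: Membership[] = [
  { orgId: "org-1", accountId: "alice", membershipLevel: "owner" },
  { orgId: "org-1", accountId: "bob", membershipLevel: "owner" },
];
console.log(transferOwnership(acme, acmeMembers, "bob", "admin"));
```

After the transfer the old owner account is no longer referenced by `ownerId`, so the RESTRICT constraint no longer blocks its deletion for this org.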
### `organization_members` -| Column | Type | Notes | -|-----------------|---------------------|-------| -| commonCols | — | id, metadata, createdAt, updatedAt | -| orgId | text NOT NULL | FK → organizations.id (cascade) | -| accountId | text NOT NULL | FK → accounts.id (cascade) | -| membershipLevel | text NOT NULL | `owner`, `admin`, `member` | +| Column | Type | Constraints | Notes | +|-----------------|---------------------|----------------------------------------|--------------------------------------| +| id | text | PK | Consumer-generated UUID | +| metadata | text (JSON) | default `{}` | Extension namespace | +| createdAt | integer (timestamp) | not null, default `now` | | +| updatedAt | integer (timestamp) | not null, default `now` | | +| orgId | text | not null, FK → organizations.id (**CASCADE**) | Org deletion removes memberships | +| accountId | text | not null, FK → accounts.id (**CASCADE**) | Account deletion removes memberships | +| membershipLevel | text | not null | `owner`, `admin`, `member` | -**Unique constraint**: `(orgId, accountId)`. -**Indexes**: `idx_org_members_account_id` on `(accountId)`. +**Unique constraint**: `(orgId, accountId)` — one membership per account per org. + +**`membershipLevel` semantics**: `owner` has full control including member +management. `admin` can manage projects and members. `member` can access org +resources. Distinct from `organizations.ownerId` — `membershipLevel` is +runtime access control; `ownerId` is the administrative/transferable owner. This table is the authoritative source for org membership (ADR-045). The ACL graph's `BelongsToEdge` is derived from it — when membership changes, the consumer writes the SQL row first, then creates or removes the ACL edge. +**Indexes**: `unq_org_members_org_account` UNIQUE on `(orgId, accountId)`, +`idx_org_members_account_id` on `(accountId)`, +`idx_org_members_org_id` on `(orgId)`. 
+ ### `api_keys` -| Column | Type | Notes | -|------------|---------------------|-------| -| commonCols | — | id, metadata, createdAt, updatedAt | -| ownerId | text NOT NULL | Logical reference to accounts.id | -| keyHash | text NOT NULL UNIQUE | SHA-256 hash (never stores raw key) | -| name | text | Human-readable key label | -| enabled | integer NOT NULL DEFAULT 1 | Disable without revoking | -| expiresAt | integer (timestamp) | When the key expires (null = never) | -| revokedAt | integer (timestamp) | When revoked (null = active) | +API key credentials for bearer token authentication. The client sends a raw +key; the consumer hashes it and looks up by `keyHash`. Storage does not +perform hashing or verification — that is a consumer concern (keypal, hub). -**Indexes**: `unq_api_keys_key_hash` UNIQUE on `(keyHash)`, `idx_api_keys_owner_id` on `(ownerId)`. +| Column | Type | Constraints | Notes | +|------------|---------------------|----------------------------------------|------------------------------------------------------| +| id | text | PK | Consumer-generated UUID | +| metadata | text (JSON) | default `{}` | Extension namespace. Scope data: `metadata.scopes` (`string[]`), `metadata.resources` (`Record`), `metadata.tags` (`string[]`). Consumer provides the adapter (e.g., `HubKeyStorage` for keypal). Scopes remain in metadata rather than as native columns because scope schemas vary by consumer — keypal uses colon-separated hierarchies, other consumers may differ. | +| createdAt | integer (timestamp) | not null, default `now` | | +| updatedAt | integer (timestamp) | not null, default `now` | | +| ownerId | text | not null, FK → accounts.id (**CASCADE**) | Account deletion removes API keys | +| keyHash | text | not null, **unique** | SHA-256 hash of raw key. Never stores raw key. 
| +| name | text | | Human-readable key label | +| enabled | integer | not null, default 1 | Immediate disable switch (1 = enabled, 0 = disabled) | +| expiresAt | integer (timestamp) | | When the key expires (null = never) | +| revokedAt | integer (timestamp) | | When the key was revoked (null = active). Permanent. | +| rotatedToId | text | | Self-reference to `api_keys.id` — the key that replaced this one (null if not rotated). | +| lastUsedAt | integer (timestamp) | | Last authentication time. Null if never used. | -Keypal scope data is stored in `metadata` (`metadata.scopes`, `metadata.resources`). -The hub provides a `HubKeyStorage` adapter that reads/writes this table to -implement keypal's `Storage` interface. +**Key lifecycle states**: enabled+not expired = active. enabled+expired = +rejected. disabled = rejected regardless of expiration. revoked = permanently +disabled regardless of enabled/expiry. + +**Rotation**: When a key is rotated, the consumer creates a new `api_keys` row +and sets the old key's `rotatedToId` to the new key's id. The old key's +`revokedAt` is set at the same time. This provides an audit trail of key +rotation without requiring a separate rotation history table. + +**SHA-256 rationale**: API keys are high-entropy machine-generated strings +(128-bit+). Brute-force against SHA-256 is infeasible for such inputs. Slow +KDFs (bcrypt, Argon2) are unnecessary for machine keys — they add latency +without meaningful security improvement. (ADR-050) + +**Indexes**: `unq_api_keys_key_hash` UNIQUE on `(keyHash)`, +`idx_api_keys_owner_id` on `(ownerId)`, +`idx_api_keys_enabled` on `(enabled)`, +`idx_api_keys_active` on `(ownerId)` WHERE `revokedAt IS NULL AND enabled = 1` + +### `peer_credentials` + +SSH key and certificate-authority credentials for wraith transport +authentication. The client presents an Ed25519 public key or OpenSSH +certificate; the consumer validates against the stored fingerprint. 
Storage +does not perform SSH authentication — that is a consumer concern (wraith, +hub). + +| Column | Type | Constraints | Notes | +|-----------------|---------------------|----------------------------------------|------------------------------------------------------| +| id | text | PK | Consumer-generated UUID | +| metadata | text (JSON) | default `{}` | Extension namespace. Cert data: `metadata.principals` (`string[]`), `metadata.restrictions` (`string[]`), `metadata.caFingerprint` (`string`, for cert-authority entries only). | +| createdAt | integer (timestamp) | not null, default `now` | | +| updatedAt | integer (timestamp) | not null, default `now` | | +| ownerId | text | not null, FK → accounts.id (**CASCADE**) | Account deletion removes peer credentials | +| credentialType | text | not null | `ssh_key`, `cert_authority` | +| fingerprint | text | not null, **unique** | Ed25519 key fingerprint (SHA-256, OpenSSH format) | +| publicKeyData | text | not null | Full public key in OpenSSH format (`ssh-ed25519 AAAA...`) | +| name | text | | Human-readable label | +| enabled | integer | not null, default 1 | Immediate disable switch | +| expiresAt | integer (timestamp) | | When the credential expires (null = never). Certificates carry expiry; standalone keys typically don't. | +| revokedAt | integer (timestamp) | | When the credential was revoked (null = active). | + +**`credentialType` semantics**: `ssh_key` is an individual public key. The +consumer verifies the key against known fingerprints. `cert_authority` is a +trusted CA public key. The consumer validates certificates signed by this CA +against the stored fingerprint. Both types share the same lookup pattern +(present fingerprint → find by fingerprint → check owner + enable + expiry + +revocation), which is why they share a table. 
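The shared lookup pattern (find by fingerprint, then check enablement, expiry, and revocation) might look like the following on the consumer side. This is a sketch under assumptions: `checkFingerprint` and the `Map` index are hypothetical, timestamps are assumed to use the same integer unit as `now`, and the check order (revoked, then disabled, then expired) mirrors the `api_keys` lifecycle rules rather than anything the spec states for peer credentials.

```typescript
interface PeerCredentialRow {
  ownerId: string;
  fingerprint: string;
  enabled: number;          // 1 = enabled, 0 = disabled
  expiresAt: number | null; // null = never expires
  revokedAt: number | null; // null = not revoked
}

type Rejection = "unknown" | "revoked" | "disabled" | "expired";
type LookupResult =
  | { ok: true; ownerId: string }
  | { ok: false; reason: Rejection };

// Hypothetical check; the Map stands in for the unique fingerprint index.
function checkFingerprint(
  byFingerprint: Map<string, PeerCredentialRow>,
  fingerprint: string,
  now: number,
): LookupResult {
  const row = byFingerprint.get(fingerprint);
  if (!row) return { ok: false, reason: "unknown" };
  if (row.revokedAt !== null) return { ok: false, reason: "revoked" };
  if (row.enabled !== 1) return { ok: false, reason: "disabled" };
  if (row.expiresAt !== null && row.expiresAt <= now) {
    return { ok: false, reason: "expired" };
  }
  return { ok: true, ownerId: row.ownerId };
}

const index = new Map<string, PeerCredentialRow>([
  ["fp-active", { ownerId: "alice", fingerprint: "fp-active", enabled: 1, expiresAt: null, revokedAt: null }],
  ["fp-revoked", { ownerId: "bob", fingerprint: "fp-revoked", enabled: 1, expiresAt: null, revokedAt: 100 }],
]);
console.log(checkFingerprint(index, "fp-active", 200));
console.log(checkFingerprint(index, "fp-revoked", 200));
```

The same function works for both `ssh_key` and `cert_authority` rows because neither check depends on `credentialType`; certificate validation against the CA key would happen afterwards in the consumer.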
+
+**Adding new credential types** (ADR-049): Credential types sharing the same
+lookup column as `peer_credentials` (fingerprint-based) add a new
+`credentialType` value to this table. Credential types requiring different
+lookup columns warrant their own table. (OQ-33) Current types assume Ed25519
+only; additional SSH key types may require `credentialType` expansion.
+
+**Fingerprint format**: OpenSSH SHA-256 fingerprint (base64, no prefix). Used
+for lookup during SSH authentication. The `publicKeyData` column stores the
+full key for reconstruction/verification when needed.
+
+**Indexes**: `unq_peer_credentials_fingerprint` UNIQUE on `(fingerprint)`,
+`idx_peer_credentials_owner_id` on `(ownerId)`,
+`idx_peer_credentials_credential_type` on `(credentialType)`,
+`idx_peer_credentials_active` on `(ownerId)` WHERE `revokedAt IS NULL AND enabled = 1`

 ### `audit_logs`

-| Column | Type | Notes |
-|----------|---------------------|-------|
-| commonCols | — | id, metadata, createdAt, updatedAt |
-| action | text NOT NULL | `created`, `revoked`, `rotated`, `login`, `access_denied` |
-| ownerId | text NOT NULL | Logical reference to accounts.id |
-| keyId | text | Logical reference to api_keys.id (nullable) |
-| orgId | text | Logical reference to organizations.id (nullable) |
-| details | text (JSON) | Action-specific context |
+Append-only audit trail for security-relevant events. The consumer (hub)
+writes entries for key operations, authentication events, membership changes,
+and other auditable actions. The consumer is responsible for reading and
+displaying audit data.

-**Indexes**: `idx_audit_logs_owner_id` on `(ownerId)`, `idx_audit_logs_action` on `(action)`, `idx_audit_logs_created_at` on `(createdAt)`.
+| Column | Type | Constraints | Notes |
+|----------------|---------------------|----------------------------------------|------------------------------------------------------|
+| id | text | PK | Consumer-generated UUID |
+| metadata | text (JSON) | default `{}` | Extension namespace. Session context: `metadata.sessionId` (when relevant). |
+| createdAt | integer (timestamp) | not null, default `now` | |
+| updatedAt | integer (timestamp) | not null, default `now` | |
+| action | text | not null | `created`, `revoked`, `rotated`, `enabled`, `disabled`, `login`, `access_denied` |
+| ownerId | text | not null, FK → accounts.id (**RESTRICT**) | The identity performing the action. RESTRICT prevents account deletion when audit entries exist — deactivate instead. |
+| credentialId | text | | Logical reference to api_keys.id or peer_credentials.id (nullable — not all events are credential-related). |
+| credentialType | text | | `api_key`, `peer_credential`, or null. Discriminator for `credentialId` — tells the consumer which table to look up. |
+| orgId | text | FK → organizations.id (**SET NULL**) | Organization context. Null for personal actions. Set null on org deletion to preserve audit trail. |
+| details | text (JSON) | | Action-specific context (IP, user agent, scope changes, etc.) |
+
+**`action` enum is extensible**: The initial set covers API key operations
+and basic auth events. Additional actions for account, membership, and
+organization lifecycle events (e.g., `account_created`, `membership_added`,
+`org_created`) should be added by consumers as those features are implemented.
+
+**`credentialId` + `credentialType` polymorphic reference**: Replaces the
+previous `keyId` column (API key only). The pair allows audit entries to
+reference either credential table. No FK constraint — the consumer resolves
+the table based on `credentialType` (ADR-049).
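
Since the polymorphic reference has no FK, the consumer has to map `credentialType` to a table itself. The following is a minimal sketch of that resolution step, under stated assumptions: the `AuditLogRef` shape and `resolveCredentialTarget` helper are hypothetical names, and treating a half-set pair (one column null, the other not) as an error is an assumption of this sketch rather than a rule stated in the schema.

```typescript
// Discriminator values taken from the audit_logs.credentialType column.
type CredentialType = "api_key" | "peer_credential";

// Hypothetical projection of the two polymorphic columns on an audit_logs row.
interface AuditLogRef {
  credentialId: string | null;
  credentialType: CredentialType | null;
}

// Table names as they appear in the identity schema.
const CREDENTIAL_TABLES: Record<CredentialType, string> = {
  api_key: "api_keys",
  peer_credential: "peer_credentials",
};

// Returns the table and id to look up, or null when the event is not credential-related.
function resolveCredentialTarget(
  entry: AuditLogRef,
): { table: string; id: string } | null {
  if (entry.credentialId === null && entry.credentialType === null) {
    return null;
  }
  // Assumption: the two columns are always written together.
  if (entry.credentialId === null || entry.credentialType === null) {
    throw new Error("credentialId and credentialType must be set together");
  }
  return { table: CREDENTIAL_TABLES[entry.credentialType], id: entry.credentialId };
}
```

Keeping the mapping in one `Record` means that a future credential table only needs a new discriminator value and one new entry, and the compiler flags any `CredentialType` value that lacks a table.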
+
+**`orgId` FK with SET NULL**: Unlike `credentialId` (polymorphic, no single
+target table), `orgId` always references `organizations.id` within the same
+system.db. A real FK with `SET NULL` preserves the audit trail on org deletion
+(nulling the org reference without deleting the audit entry) while enforcing
+referential integrity at the database level rather than relying on consumer
+discipline.
+
+**Indexes**: `idx_audit_logs_owner_id` on `(ownerId)`,
+`idx_audit_logs_credential_id` on `(credentialId)`,
+`idx_audit_logs_action` on `(action)`,
+`idx_audit_logs_created_at` on `(createdAt)`,
+`idx_audit_logs_org_id` on `(orgId)`.
+
+### FK Cascade Behavior (System DB)
+
+All identity table FKs are intra-database (same system.db file). Real
+constraints apply, not logical references.
+
+| Relationship | onDelete | Rationale |
+|-------------|----------|-----------|
+| organizations.ownerId → accounts.id | RESTRICT | Cannot delete owner account while org exists. Transfer ownership first. |
+| organization_members.orgId → organizations.id | CASCADE | Org deletion removes memberships |
+| organization_members.accountId → accounts.id | CASCADE | Account deletion removes memberships |
+| api_keys.ownerId → accounts.id | CASCADE | Account deletion removes API keys |
+| peer_credentials.ownerId → accounts.id | CASCADE | Account deletion removes peer credentials |
+| audit_logs.ownerId → accounts.id | RESTRICT | Audit integrity — deactivate accounts instead of deleting. Preserves accountability. |
+| audit_logs.orgId → organizations.id | SET NULL | Org deletion preserves audit trail (org reference nulled, entry retained). |
+
+Polymorphic references (no FK, consumer resolves):
+`audit_logs.credentialId` → `api_keys.id` or `peer_credentials.id`
+(disambiguated by `audit_logs.credentialType`).
+
+Cross-DB logical references (no FK, different database file):
+`graphs.ownerId` → `accounts.id`, `graphs.projectId` → project identity
+(ADR-042).
Consumer enforces referential integrity at application layer.

 ## Relations

 ### System DB Relations

-- **organizations → organization_members**: one-to-many
-- **accounts → organization_members**: one-to-many
+- **accounts → organizations**: one-to-many (via `organizations.ownerId`)
+- **accounts → organization_members**: one-to-many (via `organization_members.accountId`)
+- **accounts → api_keys**: one-to-many (via `api_keys.ownerId`)
+- **accounts → peer_credentials**: one-to-many (via `peer_credentials.ownerId`)
+- **accounts → audit_logs**: one-to-many (via `audit_logs.ownerId`)
+- **organizations → organization_members**: one-to-many (via `organization_members.orgId`)

 ### Tenant DB Relations

@@ -301,7 +501,7 @@ implement keypal's `Storage` interface.
 ### `createSystemDatabase(client)`

 Creates a Drizzle database instance with the identity schema (accounts,
-organizations, organization_members, api_keys, audit_logs) attached.
+organizations, organization_members, api_keys, peer_credentials, audit_logs) attached.
 ```ts
 import { createSystemDatabase } from "@alkdev/storage/sqlite";
@@ -346,7 +546,9 @@ db.transaction((tx) => {
 | [038](decisions/038-sqlite-first-pg-removed.md) | SQLite-first, PG removed | Single database host |
 | [039](decisions/039-honker-as-sqlite-extension.md) | Honker as SQLite extension | DB + pub/sub + queues in one file |
 | [040](decisions/040-system-db-tenant-db.md) | System DB + tenant DB | Identity in system.db, graphs in tenant-{orgId}.db |
-| [041](decisions/041-identity-tables-in-storage.md) | Identity tables in storage | accounts, organizations, api_keys, audit_logs |
+| [041](decisions/041-identity-tables-in-storage.md) | Identity tables in storage | accounts, organizations, api_keys, peer_credentials, audit_logs |
+| [049](decisions/049-identity-schema-restructuring.md) | Identity schema restructuring | Separate credential tables, remove Gitea, data→metadata, FK cascades |
+| [050](decisions/050-sha256-for-api-key-hashing.md) | SHA-256 for API keys | Fast hash for high-entropy machine keys, not slow KDF |
 | [042](decisions/042-scoping-columns-on-graphs.md) | Scoping columns on graphs | `ownerId`, `projectId` on `graphs` table |
 | [043](decisions/043-graph-type-scope.md) | Graph type scope | `system` / `tenant` / `user` scope on `graph_types` |
 | [044](decisions/044-drizzle-honker-adapter.md) | Drizzle-Honker adapter | ~100-line session adapter, POC validated |