Identity tables were derived from hub's PostgreSQL schema but simplified without documenting what was removed or why. This restructures them for the current auth landscape (API key + wraith SSH/cert-authority): - ADR-049: Separate api_keys and peer_credentials tables (different lookup patterns, columns, lifecycles), remove Gitea columns, map hub data→metadata - ADR-050: Extract SHA-256 vs KDF decision from inline spec text - Add peer_credentials table for SSH key and cert-authority auth - Specify all FK cascade behaviors within system DB (RESTRICT, CASCADE, SET NULL) - Complete index specifications for all identity tables - Add scope boundary section (storage owns schemas, not auth/authorization) - Update audit_logs with credentialId+credentialType polymorphic reference - Add 3 new open questions (OQ-33/34/35) for credential type expansion
9.1 KiB
ADR-049: Identity Schema Restructuring
Status
Accepted
Context
The identity tables in sqlite-host.md were derived from the hub's PostgreSQL
schema (@alkdev/hub/docs/architecture/storage/identity.md) but simplified
without documenting what was removed and why. This creates ambiguity for
implementation:
-
Gitea columns (
accounts.giteaUsername,organizations.giteaOrgName) were dropped without documented rationale. These are hub-specific integration columns — a storage package should not couple to a particular git hosting provider. Git association, when needed, belongs in a metagraph instance (e.g., a project graph with git repo metadata) or a downstream consumer's schema. -
dataJSONB columns onaccountsandorganizationsin the hub schema were silently dropped. The hub used these for account preferences/profile and org billing/settings. Storage'scommonCols.metadataserves the same purpose — an extension namespace following_subsystem.keyconvention. The mapping from hubdatato storagemetadatais unambiguous but was never stated. -
Single
api_keystable assumes keypal-style bearer token auth only. The @alkdev platform now has two authentication mechanisms:- API key (keypal-style): client sends a bearer token, hub hashes it,
looks up by
keyHash. Transport: HTTP/WebSocket. - Peer credential (wraith-style): client presents an Ed25519 public key or OpenSSH certificate over an SSH channel (TCP/TLS/Iroh). Server validates against known fingerprints. Transport: wraith SSH tunnel.
These credential types have fundamentally different query patterns (hash lookup vs fingerprint lookup), different columns (keyHash makes no sense for SSH keys; publicKeyFingerprint makes no sense for API keys), and different lifecycles (rotation vs addition/removal). A single table would be mostly nulls on either side.
- API key (keypal-style): client sends a bearer token, hub hashes it,
looks up by
-
Missing FK cascade behavior — Identity tables live in the same system DB, so real FK constraints apply. The current spec uses "logical reference" language (appropriate for cross-DB scoping columns on
graphs) for relationships that are intra-database and should have proper cascades. -
Missing columns — The hub spec includes
api_keys.description,api_keys.rotatedToId,api_keys.lastUsedAt, andaudit_logs.sessionIdthat were dropped from storage without rationale.
Decision
1. Separate credential tables by type
Two tables with distinct columns, not a unified credentials table with a
type discriminator:
| Table | Auth mechanism | Lookup pattern | Transport |
|---|---|---|---|
api_keys |
Bearer token | Hash token → look up keyHash |
HTTP/WebSocket |
peer_credentials |
SSH key / cert-authority | Present fingerprint → look up fingerprint |
wraith (SSH over TCP/TLS/Iroh) |
Rationale: Query patterns, columns, and lifecycles differ fundamentally. Credential type proliferation is expected to be low (3-4 types ever). A new table per type is acceptable. This extends the design principle established by ADR-002: the metagraph pattern serves graph-shaped data (dynamic schemas, traversal queries); dedicated tables serve fixed-schema data with known columns and relational query patterns.
Decision criterion for future credential types: Credential types sharing
the same lookup column get a table with a credentialType discriminator.
Credential types requiring different lookup columns get their own table. For
example, a future tls_certificate credential that uses fingerprint-based
lookup would join peer_credentials with a new credentialType value. A
credential type using a different lookup column (e.g., a client identifier)
would warrant a new table.
2. Remove Gitea columns
accounts.giteaUsername and organizations.giteaOrgName are removed. Git
hosting integration is a consumer concern, not a storage infrastructure
concern. When a downstream system needs to associate accounts or organizations
with git hosting, it stores that association in:
- A metagraph instance (e.g., a project graph with a
GitRepositoryNode) - Consumer-side schema extensions via
commonCols.metadata
3. Hub data column maps to storage commonCols.metadata
The hub's accounts.data and organizations.data JSONB columns are not
present in storage's identity tables. Their purpose (extensible account/org
metadata) is served by commonCols.metadata, which follows the
_subsystem.key convention. No data is lost — the extensible namespace
already exists. The hub maps data fields into metadata keys when
migrating.
4. Add peer_credentials table
A new table for SSH key and certificate-authority authentication over wraith transport:
| Column | Type | Notes |
|---|---|---|
| commonCols | — | id, metadata, createdAt, updatedAt |
| ownerId | text NOT NULL | FK → accounts.id (CASCADE) |
| credentialType | text NOT NULL | ssh_key, cert_authority |
| fingerprint | text NOT NULL UNIQUE | Ed25519 key fingerprint (SHA-256) |
| publicKeyData | text NOT NULL | Full public key in OpenSSH format |
| name | text | Human-readable label |
| enabled | integer NOT NULL DEFAULT 1 | Immediate disable switch |
| expiresAt | integer (timestamp) | Null = never |
| revokedAt | integer (timestamp) | Null = active |
The credentialType discriminator separates key entries from CA entries within
a single table because their query pattern (look up by fingerprint) is
identical. Cert-specific data (principals, restrictions, caFingerprint) goes
in metadata.
5. Add back useful API key columns
| Column | Reason |
|---|---|
rotatedToId |
API key rotation tracking — sets which key replaced this one |
lastUsedAt |
Stale key cleanup and access pattern analysis |
description is not added. name + metadata covers labeling needs.
6. Specify FK cascade behavior within system DB
All identity table FKs are intra-database and use real constraints:
| Relationship | onDelete | Rationale |
|---|---|---|
| organizations.ownerId → accounts.id | RESTRICT | Cannot delete owner account while org exists |
| organization_members.orgId → organizations.id | CASCADE | Org deletion removes memberships |
| organization_members.accountId → accounts.id | CASCADE | Account deletion removes memberships |
| api_keys.ownerId → accounts.id | CASCADE | Account deletion removes API keys |
| peer_credentials.ownerId → accounts.id | CASCADE | Account deletion removes peer credentials |
| audit_logs.ownerId → accounts.id | RESTRICT | Audit integrity — deactivate account instead of delete |
audit_logs.keyId and audit_logs.credentialId are logical references (not
FK) because they may reference rows in either api_keys or peer_credentials,
and the referenced table depends on the audit event type.
7. Update audit_logs for multi-credential world
Replace keyId (API key only) with polymorphic credential references:
| Column | Type | Notes |
|---|---|---|
| credentialId | text | Logical reference to api_keys.id or peer_credentials.id |
| credentialType | text | api_key, peer_credential, or null |
This replaces the previous keyId column. The credentialType discriminator
tells the consumer which table to look up (same pattern as
graphs.ownerId — logical reference, not FK).
sessionId is not added. Session correlation is a hub concern, not a storage
infrastructure concern. When needed, it goes in metadata.
Consequences
Positive:
- Two credential types covered from the start — API key auth and wraith SSH auth can both be stored and looked up efficiently
- Each credential table has native columns for its specific fields — no null-heavy rows, no JSON lookups for high-query fields
- Gitea coupling removed — storage doesn't depend on a specific git hosting provider
- FK cascades specified — implementers know exactly what happens on deletion
- Clear provenance mapping — hub's
data→ storage'smetadatais explicit
Negative:
- Two credential tables instead of one — but the columns don't overlap and query patterns differ, so this is the correct trade-off
audit_logs.credentialId/credentialTypepolymorphic reference — no FK constraint, consumer resolves the table (same pattern as existing cross-DB references)- Hub must migrate its existing identity schema when consuming storage's
definitions —
keyId→credentialId+credentialType,data→metadata, Gitea columns to metagraph or consumer metadata peer_credentialscredential types may grow (e.g.,tls_certificate) — handled by adding enum values, not new tables, since query patterns within peer auth are similar
References
- ADR-002: Metagraph over domain-specific tables
- ADR-041: Identity tables in storage package
- ADR-040: System DB + tenant DB separation
- Hub identity tables:
/workspace/@alkdev/hub/docs/architecture/storage/identity.md - Wraith NAPI + pubsub:
/workspace/@alkdev/wraith/docs/architecture/napi-and-pubsub.md - Wraith auth:
/workspace/@alkdev/wraith/docs/architecture/decisions/012-auth-ed25519-and-cert-authority.md