Identity tables were derived from hub's PostgreSQL schema but simplified without documenting what was removed or why. This restructures them for the current auth landscape (API key + wraith SSH/cert-authority): - ADR-049: Separate api_keys and peer_credentials tables (different lookup patterns, columns, lifecycles), remove Gitea columns, map hub data→metadata - ADR-050: Extract SHA-256 vs KDF decision from inline spec text - Add peer_credentials table for SSH key and cert-authority auth - Specify all FK cascade behaviors within system DB (RESTRICT, CASCADE, SET NULL) - Complete index specifications for all identity tables - Add scope boundary section (storage owns schemas, not auth/authorization) - Update audit_logs with credentialId+credentialType polymorphic reference - Add 3 new open questions (OQ-33/34/35) for credential type expansion
194 lines
9.1 KiB
Markdown
194 lines
9.1 KiB
Markdown
# ADR-049: Identity Schema Restructuring
|
|
|
|
## Status
|
|
|
|
Accepted
|
|
|
|
## Context
|
|
|
|
The identity tables in `sqlite-host.md` were derived from the hub's PostgreSQL
|
|
schema (`@alkdev/hub/docs/architecture/storage/identity.md`) but simplified
|
|
without documenting what was removed and why. This creates ambiguity for
|
|
implementation:
|
|
|
|
1. **Gitea columns** (`accounts.giteaUsername`, `organizations.giteaOrgName`)
|
|
were dropped without documented rationale. These are hub-specific
|
|
integration columns — a storage package should not couple to a particular
|
|
git hosting provider. Git association, when needed, belongs in a metagraph
|
|
instance (e.g., a project graph with git repo metadata) or a downstream
|
|
consumer's schema.
|
|
|
|
2. **`data` JSONB columns** on `accounts` and `organizations` in the hub schema
|
|
were silently dropped. The hub used these for account preferences/profile
|
|
and org billing/settings. Storage's `commonCols.metadata` serves the same
|
|
purpose — an extension namespace following `_subsystem.key` convention. The
|
|
mapping from hub `data` to storage `metadata` is unambiguous but was never
|
|
stated.
|
|
|
|
3. **Single `api_keys` table** assumes keypal-style bearer token auth only. The
|
|
@alkdev platform now has two authentication mechanisms:
|
|
- **API key** (keypal-style): client sends a bearer token, hub hashes it,
|
|
looks up by `keyHash`. Transport: HTTP/WebSocket.
|
|
- **Peer credential** (wraith-style): client presents an Ed25519 public key
|
|
or OpenSSH certificate over an SSH channel (TCP/TLS/Iroh). Server
|
|
validates against known fingerprints. Transport: wraith SSH tunnel.
|
|
|
|
These credential types have fundamentally different query patterns (hash
|
|
lookup vs fingerprint lookup), different columns (keyHash makes no sense
|
|
for SSH keys; publicKeyFingerprint makes no sense for API keys), and
|
|
different lifecycles (rotation vs addition/removal). A single table would be
|
|
mostly nulls on either side.
|
|
|
|
4. **Missing FK cascade behavior** — Identity tables live in the same system
|
|
DB, so real FK constraints apply. The current spec uses "logical reference"
|
|
language (appropriate for cross-DB scoping columns on `graphs`) for
|
|
relationships that are intra-database and should have proper cascades.
|
|
|
|
5. **Missing columns** — The hub spec includes `api_keys.description`,
|
|
`api_keys.rotatedToId`, `api_keys.lastUsedAt`, and `audit_logs.sessionId`
|
|
that were dropped from storage without rationale.
|
|
|
|
## Decision
|
|
|
|
### 1. Separate credential tables by type
|
|
|
|
Two tables with distinct columns, not a unified `credentials` table with a
|
|
type discriminator:
|
|
|
|
| Table | Auth mechanism | Lookup pattern | Transport |
|
|
|-------|----------------|----------------|-----------|
|
|
| `api_keys` | Bearer token | Hash token → look up `keyHash` | HTTP/WebSocket |
|
|
| `peer_credentials` | SSH key / cert-authority | Present fingerprint → look up `fingerprint` | wraith (SSH over TCP/TLS/Iroh) |
|
|
|
|
Rationale: Query patterns, columns, and lifecycles differ fundamentally.
|
|
Credential type proliferation is expected to be low (3-4 types ever). A new
|
|
table per type is acceptable. This extends the design principle established by
|
|
ADR-002: the metagraph pattern serves graph-shaped data (dynamic schemas,
|
|
traversal queries); dedicated tables serve fixed-schema data with known
|
|
columns and relational query patterns.
|
|
|
|
**Decision criterion for future credential types**: Credential types sharing
|
|
the same lookup column get a table with a `credentialType` discriminator.
|
|
Credential types requiring different lookup columns get their own table. For
|
|
example, a future `tls_certificate` credential that uses fingerprint-based
|
|
lookup would join `peer_credentials` with a new `credentialType` value. A
|
|
credential type using a different lookup column (e.g., a client identifier)
|
|
would warrant a new table.
|
|
|
|
### 2. Remove Gitea columns
|
|
|
|
`accounts.giteaUsername` and `organizations.giteaOrgName` are removed. Git
|
|
hosting integration is a consumer concern, not a storage infrastructure
|
|
concern. When a downstream system needs to associate accounts or organizations
|
|
with git hosting, it stores that association in:
|
|
- A metagraph instance (e.g., a project graph with a `GitRepositoryNode`)
|
|
- Consumer-side schema extensions via `commonCols.metadata`
|
|
|
|
### 3. Hub `data` column maps to storage `commonCols.metadata`
|
|
|
|
The hub's `accounts.data` and `organizations.data` JSONB columns are not
|
|
present in storage's identity tables. Their purpose (extensible account/org
|
|
metadata) is served by `commonCols.metadata`, which follows the
|
|
`_subsystem.key` convention. No data is lost — the extensible namespace
|
|
already exists. The hub maps `data` fields into `metadata` keys when
|
|
migrating.
|
|
|
|
### 4. Add `peer_credentials` table
|
|
|
|
A new table for SSH key and certificate-authority authentication over wraith
|
|
transport:
|
|
|
|
| Column | Type | Notes |
|
|
|--------|------|-------|
|
|
| commonCols | — | id, metadata, createdAt, updatedAt |
|
|
| ownerId | text NOT NULL | FK → accounts.id (CASCADE) |
|
|
| credentialType | text NOT NULL | `ssh_key`, `cert_authority` |
|
|
| fingerprint | text NOT NULL UNIQUE | Ed25519 key fingerprint (SHA-256) |
|
|
| publicKeyData | text NOT NULL | Full public key in OpenSSH format |
|
|
| name | text | Human-readable label |
|
|
| enabled | integer NOT NULL DEFAULT 1 | Immediate disable switch |
|
|
| expiresAt | integer (timestamp) | Null = never |
|
|
| revokedAt | integer (timestamp) | Null = active |
|
|
|
|
The `credentialType` discriminator separates key entries from CA entries within
|
|
a single table because their query pattern (look up by fingerprint) is
|
|
identical. Cert-specific data (principals, restrictions, caFingerprint) goes
|
|
in `metadata`.
|
|
|
|
### 5. Add back useful API key columns
|
|
|
|
| Column | Reason |
|
|
|--------|--------|
|
|
| `rotatedToId` | API key rotation tracking — sets which key replaced this one |
|
|
| `lastUsedAt` | Stale key cleanup and access pattern analysis |
|
|
|
|
`description` is not added. `name` + `metadata` covers labeling needs.
|
|
|
|
### 6. Specify FK cascade behavior within system DB
|
|
|
|
All identity table FKs are intra-database and use real constraints:
|
|
|
|
| Relationship | onDelete | Rationale |
|
|
|-------------|----------|-----------|
|
|
| organizations.ownerId → accounts.id | RESTRICT | Cannot delete owner account while org exists |
|
|
| organization_members.orgId → organizations.id | CASCADE | Org deletion removes memberships |
|
|
| organization_members.accountId → accounts.id | CASCADE | Account deletion removes memberships |
|
|
| api_keys.ownerId → accounts.id | CASCADE | Account deletion removes API keys |
|
|
| peer_credentials.ownerId → accounts.id | CASCADE | Account deletion removes peer credentials |
|
|
| audit_logs.ownerId → accounts.id | RESTRICT | Audit integrity — deactivate account instead of delete |
|
|
|
|
`audit_logs.keyId` and `audit_logs.credentialId` are logical references (not
|
|
FK) because they may reference rows in either `api_keys` or `peer_credentials`,
|
|
and the referenced table depends on the audit event type.
|
|
|
|
### 7. Update audit_logs for multi-credential world
|
|
|
|
Replace `keyId` (API key only) with polymorphic credential references:
|
|
|
|
| Column | Type | Notes |
|
|
|--------|------|-------|
|
|
| credentialId | text | Logical reference to api_keys.id or peer_credentials.id |
|
|
| credentialType | text | `api_key`, `peer_credential`, or null |
|
|
|
|
This replaces the previous `keyId` column. The `credentialType` discriminator
|
|
tells the consumer which table to look up (same pattern as
|
|
`graphs.ownerId` — logical reference, not FK).
|
|
|
|
`sessionId` is not added. Session correlation is a hub concern, not a storage
|
|
infrastructure concern. When needed, it goes in `metadata`.
|
|
|
|
## Consequences
|
|
|
|
**Positive:**
|
|
|
|
- Two credential types covered from the start — API key auth and wraith SSH
|
|
auth can both be stored and looked up efficiently
|
|
- Each credential table has native columns for its specific fields — no
|
|
null-heavy rows, no JSON lookups for high-query fields
|
|
- Gitea coupling removed — storage doesn't depend on a specific git hosting
|
|
provider
|
|
- FK cascades specified — implementers know exactly what happens on deletion
|
|
- Clear provenance mapping — hub's `data` → storage's `metadata` is explicit
|
|
|
|
**Negative:**
|
|
|
|
- Two credential tables instead of one — but the columns don't overlap and
|
|
query patterns differ, so this is the correct trade-off
|
|
- `audit_logs.credentialId`/`credentialType` polymorphic reference — no FK
|
|
constraint, consumer resolves the table (same pattern as existing
|
|
cross-DB references)
|
|
- Hub must migrate its existing identity schema when consuming storage's
|
|
definitions — `keyId` → `credentialId` + `credentialType`, `data` → `metadata`,
|
|
Gitea columns to metagraph or consumer metadata
|
|
- `peer_credentials` credential types may grow (e.g., `tls_certificate`) —
|
|
handled by adding enum values, not new tables, since query patterns within
|
|
peer auth are similar
|
|
|
|
## References
|
|
|
|
- ADR-002: Metagraph over domain-specific tables
|
|
- ADR-041: Identity tables in storage package
|
|
- ADR-040: System DB + tenant DB separation
|
|
- Hub identity tables: `/workspace/@alkdev/hub/docs/architecture/storage/identity.md`
|
|
- Wraith NAPI + pubsub: `/workspace/@alkdev/wraith/docs/architecture/napi-and-pubsub.md`
|
|
- Wraith auth: `/workspace/@alkdev/wraith/docs/architecture/decisions/012-auth-ed25519-and-cert-authority.md` |