Architect storage around SQLite+Honker: remove PG, add multi-tenant identity, scoping

Reorient @alkdev/storage around a single SQLite database host with Honker for pub/sub, event streams, and task queues. PostgreSQL is removed as a target (ADR-038), eliminating dual schema maintenance and infrastructure complexity. Honker provides DB + pubsub + queues in one .db file (ADR-039). Add system/tenant DB model (ADR-040): identity tables in system.db, all graph data in tenant-{orgId}.db files. Identity tables move from the hub into storage (ADR-041). Scoping columns (ownerId, projectId) added to graphs table (ADR-042). Graph types get scope (system/tenant/user) to protect infrastructure schemas (ADR-043). Define Drizzle-Honker session adapter (ADR-044): ~100-line adapter enabling Drizzle typed queries and Honker pubsub/queue on a single connection with transactional consistency. Resolve OQ-03, OQ-04, OQ-19, OQ-21, OQ-22, OQ-23, OQ-24. Add new open questions OQ-26 through OQ-29 for Honker integration specifics. New docs: honker-integration.md (adapter, event patterns, migration). Scrub all PG/jsonb/libsql references from existing spec docs.
2026-05-31 15:41:41 +00:00
parent 6b5f32bad4
commit 6aa2fcc6ff
19 changed files with 1446 additions and 515 deletions
--- a/docs/architecture/decisions/040-system-db-tenant-db.md
+++ b/docs/architecture/decisions/040-system-db-tenant-db.md
@@ -0,0 +1,69 @@
+# ADR-040: System DB + Tenant DB Separation
+
+## Status
+
+Accepted
+
+## Context
+
+The original POC was single-tenant — no concept of users, organizations, or ownership. The 6 metagraph tables had no columns for scoping graph instances to any owner, org, or project.
+
+Multi-tenant support is needed for:
+- Sharing compute with other OSS developers while keeping data isolated
+- Enabling downstream users to host multi-tenant services
+- Self-hosted deployments where org isolation is required
+
+Three approaches to multi-tenancy in SQLite:
+
+1. **Schema-level isolation** — All tenants in one `.db` file, with `orgId` columns on every table for row-level filtering
+2. **Database-level isolation** — Each tenant gets its own `.db` file
+3. **Hybrid** — Shared identity tables in one file, tenant data in per-tenant files
+
+## Decision
+
+Use the hybrid approach: a **system DB** for identity/auth and a **tenant DB** per organization for all graph data.
+
+```
+system.db
+├── accounts, organizations, api_keys, audit_logs
+├── graph_types (system-scoped definitions: acl, call-graph, etc.)
+├── _honker_* tables (system events, queues, streams)
+
+tenant-{orgId}.db
+├── graphs, nodes, edges (ALL graph instances for this org)
+├── graph_types (tenant-scoped definitions: custom graphs)
+├── node_types, edge_types
+├── projectId columns on graphs for intra-org project scoping
+├── _honker_* tables (per-org events, queues, streams)
+```
+
+The system DB holds identity infrastructure that must exist before any tenant can be authenticated. The tenant DB holds all graph data for one org — call graphs, ACL instances, session trees, task dependencies, secrets. Tenant DBs are isolated at the file level: backup, delete, migrate, or corrupt one tenant without affecting others.
+
+The hub (or any consumer) opens both a system connection and one or more tenant connections. The system DB's `accounts` and `organizations` tables are the authoritative source for authentication. The tenant DB's graph data is scoped by `ownerId` and `projectId` columns that logically reference (not FK) the system DB's identity tables.
+
+## Consequences
+
+**Positive:**
+
+- File-level isolation — one tenant's data cannot leak to another, even via bugs in application-layer filtering
+- Each tenant DB is independently backupable, migratable, compactable
+- No `orgId` column needed on tenant tables (the entire file IS the org scope)
+- Simpler queries — no row-level filtering on every query for multi-tenancy
+- Natural fit for the "compute sharing" use case — separate files for separate people
+- System DB is small and rarely changes — low backup cost, high durability focus
+- Honker's pubsub/queues are per-DB — event streams don't cross tenant boundaries
+
+**Negative:**
+
+- Cross-tenant operations (e.g., a user in org A delegates to a user in org B) require the hub to mediate between two open databases at the application layer
+- No cross-tenant SQL JOINs — if needed, the hub does application-level joins
+- More open file handles — one per active tenant (manageable for expected scale)
+- Schema migrations must be applied to each tenant DB independently
+- System DB is a single point of failure — if it's corrupted, all tenants lose authentication
+
+## References
+
+- ADR-038: SQLite-first, Postgres removed
+- ADR-039: Honker as SQLite extension
+- ADR-041: Identity tables in storage package
+- ADR-042: Scoping columns on graph instances