Architect storage around SQLite+Honker: remove PG, add multi-tenant identity, scoping

Reorient @alkdev/storage around a single SQLite database host with Honker
for pub/sub, event streams, and task queues. PostgreSQL is removed as a
target (ADR-038), eliminating dual schema maintenance and infrastructure
complexity. Honker provides DB + pubsub + queues in one .db file (ADR-039).

Add system/tenant DB model (ADR-040): identity tables in system.db, all
graph data in tenant-{orgId}.db files. Identity tables move from the hub
into storage (ADR-041). Scoping columns (ownerId, projectId) added to
graphs table (ADR-042). Graph types get scope (system/tenant/user) to
protect infrastructure schemas (ADR-043).

Define Drizzle-Honker session adapter (ADR-044): ~100-line adapter enabling
Drizzle typed queries and Honker pubsub/queue on a single connection with
transactional consistency.

Resolve OQ-03, OQ-04, OQ-19, OQ-21, OQ-22, OQ-23, OQ-24. Add new
open questions OQ-26 through OQ-29 for Honker integration specifics.

New docs: honker-integration.md (adapter, event patterns, migration).
Scrub all PG/jsonb/libsql references from existing spec docs.
2026-05-31 15:41:41 +00:00
parent 6b5f32bad4
commit 6aa2fcc6ff
19 changed files with 1446 additions and 515 deletions

@@ -0,0 +1,53 @@
# ADR-038: SQLite-First, Postgres Removed
## Status
Accepted
## Context
The original architecture specified two database hosts: SQLite for spokes (local/embedded) and PostgreSQL for the hub (central service). This required:
- Maintaining two sets of Drizzle table definitions (`sqliteTable` and `pgTable`) with the same logical shapes
- Two client factories (`createSqliteDatabase`, `createPostgresDatabase`)
- Two repository layer implementations or a host-agnostic abstraction
- Separate test suites for each host
- A PostgreSQL server as infrastructure dependency for any hub deployment
The dual-host model came from the `@ade` POC, which was single-tenant and didn't account for multi-tenant deployment concerns. For the actual use case — small teams of developers and AI agents sharing compute — PostgreSQL is operational overhead without proportional benefit.
## Decision
`@alkdev/storage` is SQLite-only. The `pg/` subpath export is removed. The package provides one database host: SQLite via the Honker extension (see ADR-039).
This eliminates:
- All `pgTable` definitions and the `src/pg/` directory
- The PostgreSQL porting notes in every spec document
- Dual schema maintenance, dual testing, dual repository implementations
- PostgreSQL and Redis as infrastructure dependencies
## Consequences
**Positive:**
- Single set of table definitions, one client factory, one test suite
- No PostgreSQL server to install, configure, secure, and maintain
- No Redis for pub/sub — Honker provides durable pub/sub within SQLite
- Simpler deployment: a single `.db` file per database
- The hub's domain tables can coexist with metagraph tables in the same SQLite file
- WAL mode with Honker's reader pool provides sufficient concurrency for the expected workload
**Negative:**
- SQLite is single-machine — no horizontal scaling, no read replicas, no cross-server queries
- No native `jsonb` type with GIN indexes — JSON attributes rely on `json_extract()` queries
- No built-in full-text search on JSON attributes (SQLite FTS5 works but requires manual setup)
- Some ecosystem tools expect PostgreSQL (migration tools, monitoring dashboards)
- If a future deployment genuinely needs PostgreSQL scale, a migration path would need to be rebuilt
## References
- ADR-039: Honker as SQLite extension and pub/sub transport
- ADR-040: System DB + tenant DB separation
- ADR-018 (superseded): dbtype integration was partly motivated by PG/SQLite dual maintenance; with PG removed, this pressure is reduced

@@ -0,0 +1,57 @@
# ADR-039: Honker as SQLite Extension and Transport
## Status
Accepted
## Context
The hub architecture was designed around three separate infrastructure components:
1. **PostgreSQL** — persistence (tables, queries, transactions)
2. **Redis** — pub/sub transport for event-driven communication
3. **Application-level task queues** — background job processing
This creates operational complexity (three services to deploy, monitor, and secure) and a dual-write problem: writing data to PostgreSQL and publishing events to Redis cannot happen in a single transaction. If the process crashes between the DB commit and the Redis publish, data and events become inconsistent.
Honker (`@russellthehippo/honker-node`) is a SQLite extension that adds Postgres-style `NOTIFY`/`LISTEN` semantics, durable event streams with per-consumer offsets, at-least-once work queues with retries and dead-letter handling, cron scheduling, advisory locks, and rate limiting — all within the same SQLite `.db` file.
## Decision
`@alkdev/storage` uses Honker as its SQLite extension and transport layer. Honker provides:
1. **Database operations**: SQLite with WAL mode, a bounded reader pool, and a single writer slot
2. **Ephemeral pub/sub**: `notify()`/`listen()` for fire-and-forget notifications within the DB transaction
3. **Durable event streams**: `stream()` with per-consumer offset tracking for replay-safe delivery
4. **Task queues**: `queue()` with at-least-once claims, retries, priority, delayed jobs, and dead-letter
5. **Advisory locks**: `tryLock()` for leader election and exclusive access
6. **Cron scheduling**: `scheduler()` for time-triggered operations
Drizzle ORM integrates with Honker via a thin session adapter (~100 lines) that wraps Honker's `query()`/`execute()` API inside Drizzle's `SQLiteSession<'sync'>` contract. No Drizzle fork required. The adapter exposes the Honker `Database` as `$client` on the Drizzle instance for direct access to pubsub/queue features.
## Consequences
**Positive:**
- **Transactional consistency**: `INSERT INTO nodes` and `queue.enqueue()` commit atomically. No dual-write problem.
- **No Redis dependency**: Honker's `stream()` replaces Redis as the durable pub/sub transport
- **No PostgreSQL dependency**: SQLite with Honker covers persistence + events + queues
- **Operational simplicity**: One `.db` file contains everything — data, events, queues, schedules
- **Drizzle integration**: Full Drizzle type safety for queries + Honker for pubsub/queue on the same connection
- **Ecosystem fit**: The @alkdev platform is event-driven. Honker's durable streams with per-consumer offsets map directly to `@alkdev/operations`' call protocol events and `@alkdev/flowgraph`'s event-sourced model
**Negative:**
- **Single-machine**: Honker is a single-process SQLite extension. No cross-server events. For multi-node deployment, a separate transport (Redis, NATS) would still be needed for internode communication.
- **`lastInsertRowid` overhead**: Honker's `execute()` returns only affected row count. Getting `lastInsertRowid` requires an extra `SELECT last_insert_rowid()` call via napi, or a small Rust addition to honker-node's Transaction class.
- **No prepared statement handles at JS level**: Every Drizzle query goes through `query(sql, params)`. Mitigated by honker-core's `prepare_cached` on the Rust side.
- **Honker is alpha software**: Not yet beta-quality. API may change. Risk mitigated by the thin adapter — if Honker's query API changes, only the adapter needs updating.
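The `lastInsertRowid` workaround listed above could look roughly like the following. This is a minimal sketch: `MinimalTx` is a hypothetical stand-in that only mirrors the behavior described in this ADR (`execute()` returns an affected row count, `query()` returns rows). It is not honker-node's actual type, and `runWithRowid` is an illustrative name.

```typescript
// Hypothetical stand-in for the subset of a Honker transaction used here.
// Names and signatures are assumptions for illustration only.
interface MinimalTx {
  execute(sql: string, params: unknown[]): number; // affected row count only
  query(sql: string, params: unknown[]): Array<Record<string, unknown>>;
}

interface RunResult {
  changes: number;
  lastInsertRowid: number;
}

// Runs a write, then issues the extra SELECT on the same transaction
// to recover the rowid of the inserted row.
function runWithRowid(tx: MinimalTx, sql: string, params: unknown[]): RunResult {
  const changes = tx.execute(sql, params);
  const rows = tx.query("SELECT last_insert_rowid() AS id", []);
  const id = rows[0]?.["id"];
  // Fall back to 0 if the driver returns something other than a number.
  return { changes, lastInsertRowid: typeof id === "number" ? id : 0 };
}
```

Both statements have to run on the same connection, since `last_insert_rowid()` is tracked per connection in SQLite.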
## References
- Honker source: `/workspace/honker/`
- Honker Node binding: `/workspace/honker/packages/honker-node/`
- ADR-038: SQLite-first, Postgres removed
- ADR-004: Injectable clients, no side effects (Honker client is injectable)
- `@alkdev/operations` call protocol architecture
- `@alkdev/pubsub` — Honker may replace or supplement the Redis transport

@@ -0,0 +1,69 @@
# ADR-040: System DB + Tenant DB Separation
## Status
Accepted
## Context
The original POC was single-tenant — no concept of users, organizations, or ownership. The 6 metagraph tables had no columns for scoping graph instances to any owner, org, or project.
Multi-tenant support is needed for:
- Sharing compute with other OSS developers while keeping data isolated
- Enabling downstream users to host multi-tenant services
- Self-hosted deployments where org isolation is required
Three approaches to multi-tenancy in SQLite:
1. **Schema-level isolation** — All tenants in one `.db` file, with `orgId` columns on every table for row-level filtering
2. **Database-level isolation** — Each tenant gets its own `.db` file
3. **Hybrid** — Shared identity tables in one file, tenant data in per-tenant files
## Decision
Use the hybrid approach: a **system DB** for identity/auth and a **tenant DB** per organization for all graph data.
```
system.db
├── accounts, organizations, organization_members, api_keys, audit_logs
├── graph_types (system-scoped definitions: acl, call-graph, etc.)
└── _honker_* tables (system events, queues, streams)

tenant-{orgId}.db
├── graphs, nodes, edges (ALL graph instances for this org)
├── graph_types (tenant-scoped definitions: custom graphs)
├── node_types, edge_types
├── projectId columns on graphs for intra-org project scoping
└── _honker_* tables (per-org events, queues, streams)
```
The system DB holds identity infrastructure that must exist before any tenant can be authenticated. The tenant DB holds all graph data for one org: call graphs, ACL instances, session trees, task dependencies, secrets. Tenant DBs are isolated at the file level: one tenant can be backed up, deleted, or migrated without affecting others, and corruption in one file stays confined to that tenant.
The hub (or any consumer) opens both a system connection and one or more tenant connections. The system DB's `accounts` and `organizations` tables are the authoritative source for authentication. The tenant DB's graph data is scoped by `ownerId` and `projectId` columns that logically reference (not FK) the system DB's identity tables.
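As a rough illustration of how a consumer might hold one system connection and several tenant connections, consider the sketch below. Everything in it is an assumption made for illustration: `DatabaseRegistry`, `tenantDbPath`, the injected `openDb` function, and the allowed `orgId` character set are not part of the storage package. The open function is injected so that no particular driver is implied.

```typescript
// Sketch only: `openDb` is injected, so no particular driver is assumed.
type OpenFn<Db> = (path: string) => Db;

// Builds the tenant file name from the layout above. The character check is an
// assumed safeguard against ids that could escape the data directory.
function tenantDbPath(orgId: string): string {
  if (!/^[A-Za-z0-9_-]+$/.test(orgId)) {
    throw new Error(`invalid orgId: ${orgId}`);
  }
  return `tenant-${orgId}.db`;
}

// Holds one system connection plus lazily opened tenant connections.
class DatabaseRegistry<Db> {
  readonly system: Db;
  private readonly openDb: OpenFn<Db>;
  private readonly tenants = new Map<string, Db>();

  constructor(openDb: OpenFn<Db>) {
    this.openDb = openDb;
    this.system = openDb("system.db");
  }

  // Returns the cached connection for an org, opening it on first use.
  tenant(orgId: string): Db {
    let db = this.tenants.get(orgId);
    if (db === undefined) {
      db = this.openDb(tenantDbPath(orgId));
      this.tenants.set(orgId, db);
    }
    return db;
  }
}
```

Caching connections per org keeps the number of open file handles proportional to the number of active tenants rather than the number of requests.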
## Consequences
**Positive:**
- File-level isolation — one tenant's data cannot leak to another, even via bugs in application-layer filtering
- Each tenant DB is independently backupable, migratable, compactable
- No `orgId` column needed on tenant tables (the entire file IS the org scope)
- Simpler queries — no row-level filtering on every query for multi-tenancy
- Natural fit for the "compute sharing" use case — separate files for separate people
- System DB is small and rarely changes — low backup cost, high durability focus
- Honker's pubsub/queues are per-DB — event streams don't cross tenant boundaries
**Negative:**
- Cross-tenant operations (e.g., a user in org A delegates to a user in org B) require the hub to mediate between two open databases at the application layer
- No cross-tenant SQL JOINs — if needed, the hub does application-level joins
- More open file handles — one per active tenant (manageable for expected scale)
- Schema migrations must be applied to each tenant DB independently
- System DB is a single point of failure — if it's corrupted, all tenants lose authentication
## References
- ADR-038: SQLite-first, Postgres removed
- ADR-039: Honker as SQLite extension
- ADR-041: Identity tables in storage package
- ADR-042: Scoping columns on graph instances

@@ -0,0 +1,56 @@
# ADR-041: Identity Tables in Storage Package
## Status
Accepted
## Context
The hub currently defines identity tables (`accounts`, `organizations`, `api_keys`, `audit_logs`, `organization_members`) in its own `src/storage/tables/` directory. The storage package provides only the 6 metagraph tables.
This separation creates problems:
1. The system DB (ADR-040) needs identity tables, but `@alkdev/storage` doesn't provide them
2. The hub has to maintain its own Drizzle table definitions separately, duplicating the common columns pattern and drizzlebox integration
3. The scoping columns on the metagraph tables (ADR-042) logically reference identity table rows, but those tables are defined elsewhere
4. Any consumer that wants multi-tenant graph storage needs the identity tables too — they're not hub-specific, they're infrastructure
The identity tables are NOT graph-shaped (ADR-002 established the metagraph for graph-shaped data). They are relational records with fixed schemas, indexed lookups (email uniqueness, key hash lookup), and FK constraints. But they ARE required by any deployment that uses the system/tenant DB model.
## Decision
The identity tables move into `@alkdev/storage/sqlite`:
- `accounts` — hub-local identity records
- `organizations` — top-level grouping for multi-tenancy
- `organization_members` — account/org membership (note: also modeled as `BelongsToEdge` in ACL graphs, but the SQL table provides fast indexed lookups that graph traversal cannot match)
- `api_keys` — keypal-managed API key storage
- `audit_logs` — append-only security event trail
These tables are in the storage package because they are **database infrastructure**, not hub business logic. The hub consumes them; it does not own their schema.
`organization_members` remains a SQL table despite `BelongsToEdge` existing in the ACL graph (ADR-034). The SQL table provides fast indexed lookups for "list all members of org X" and FK constraints for cascading behavior. The ACL graph provides traversal-based evaluation for permission resolution. Both are needed: the SQL table is authoritative for membership state, and the ACL graph edge is derived from it (OQ-23 resolved as "derived"; see ADR-045).
## Consequences
**Positive:**
- Single package provides everything needed for a multi-tenant graph database: metagraph tables + identity tables
- The hub doesn't duplicate table definitions or common column patterns
- `createSystemDatabase(client)` returns a fully-typed Drizzle instance with identity tables
- Other consumers (spokes, tools, standalone services) get identity tables without depending on the hub
- Referential integrity within the system DB is consistent — FK constraints work
**Negative:**
- `@alkdev/storage` grows in scope — it's no longer just "graph storage" but also "identity infrastructure"
- The hub loses ownership of its table schemas — changes require coordination with storage
- `organization_members` existing as both a SQL table and ACL graph edges requires a dual-write contract (the SQL table is authoritative; the ACL edge is derived)
- `api_keys` is tightly coupled to keypal's `Storage` interface — storage must not import keypal directly; the hub provides the adapter
## References
- ADR-040: System DB + tenant DB separation
- ADR-034: ACL as metagraph
- ADR-002: Metagraph over domain-specific tables
- Hub identity tables: `/workspace/@alkdev/hub/docs/architecture/storage/identity.md`

@@ -0,0 +1,63 @@
# ADR-042: Scoping Columns on Graph Instances
## Status
Accepted
## Context
The original `graphs` table had no concept of ownership, organization, or project. A `graph` row was identified by `id` and `name` with no way to answer "which org owns this call graph?" or "list all graphs for this project."
In the system/tenant DB model (ADR-040), the tenant DB is inherently org-scoped (the entire `.db` file is one org). But within a tenant DB, graph instances still need to be scoped to:
- **An owner** — which account created/owns this graph
- **A project** — which project this graph belongs to (for project-scoped graphs like call graphs and session trees)
The metagraph pattern stores node/edge attributes as JSON, but scoping columns must be real columns because they appear in WHERE clauses, JOIN conditions, and need indexes.
## Decision
Add `ownerId` and `projectId` columns to the `graphs` table:
```
graphs {
...commonCols,
graphTypeId,
name,
description,
status,
ownerId, -- TEXT, nullable — logical reference to accounts.id in system DB
projectId, -- TEXT, nullable — logical reference to projects (graph or domain table)
}
```
No `orgId` column — the tenant DB itself IS the org scope. Adding `orgId` would be redundant within a single-tenant DB file.
These are **logical references** consistent with ADR-020 (no nodeTypeId on nodes) and OQ-24 (identityId as logical reference). No FK constraint because the referenced tables live in a different database file (system DB). The hub/consumer enforces referential integrity at the application layer.
**Nullability semantics**:
- `ownerId` NULL — system-owned graph (e.g., the ACL graph type definition seeded at setup). Not associated with any account.
- `projectId` NULL — org-level graph (e.g., the org's ACL instance). Not scoped to a specific project.
**Indexes**: `idx_graphs_owner_id` on `(ownerId)`, `idx_graphs_project_id` on `(projectId)`, `idx_graphs_owner_id_project_id` on `(ownerId, projectId)` for combined lookups.
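The nullability semantics can be read as two independent axes. The sketch below encodes that reading; `describeScope` and the labels `"account-owned"` and `"project-level"` are illustrative names chosen here, not terms defined by the storage package.

```typescript
// The two scoping columns from the `graphs` table, as read from a row.
interface GraphScopeCols {
  ownerId: string | null;
  projectId: string | null;
}

// Illustrative labels; only "system-owned" and "org-level" appear in the ADR text.
type Ownership = "system-owned" | "account-owned";
type Level = "org-level" | "project-level";

// Maps NULL / non-NULL in each column to the meaning given above.
function describeScope(g: GraphScopeCols): { ownership: Ownership; level: Level } {
  return {
    ownership: g.ownerId === null ? "system-owned" : "account-owned",
    level: g.projectId === null ? "org-level" : "project-level",
  };
}
```

For example, an org's ACL instance seeded at setup would have both columns NULL, while a call graph created by a user inside a project would have both set.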
## Consequences
**Positive:**
- "List all graphs for project X" is a simple indexed query, not a JSON path extraction
- "Who owns this graph?" is a column read, not a traversal
- Consistent with the rule from the hub's architecture: "if a field appears in WHERE clauses, JOIN conditions, or needs a constraint, it should be a proper column — not buried in metadata or JSON"
- No FK constraints means no cross-DB coupling — the tenant DB works without the system DB open
**Negative:**
- Orphaned graphs possible if an account is deleted in the system DB but the tenant DB's `graphs.ownerId` still references it. Application-layer cleanup required.
- Adding columns to the `graphs` table is a schema change that affects all consumers. The columns are nullable to ease the transition.
## References
- ADR-040: System DB + tenant DB (explains why no `orgId`)
- ADR-020: No nodeTypeId on nodes (same logical-reference pattern)
- ADR-008: Common columns pattern

@@ -0,0 +1,63 @@
# ADR-043: Graph Type Scope — System vs Tenant vs User
## Status
Accepted
## Context
Graph type definitions (in `graph_types`) were originally unscoped — any consumer could create any graph type. But some graph types should be protected:
- The ACL graph type should not be modifiable by regular users — its schema (PrincipalNode, DelegatesEdge, etc.) is a system contract
- The call-graph and message-session types are infrastructure — their schemas should not change at runtime
- Custom graph types (task boards, project workflows) SHOULD be user-definable
Without a scope distinction, a tenant user could modify the ACL graph type's schema, potentially breaking the access control system.
## Decision
Add a `scope` column to `graph_types`:
```
graph_types {
...commonCols,
name, description, config, version,
scope, -- TEXT NOT NULL, enum: "system" | "tenant" | "user"
}
```
| Scope | Who can create | Who can modify the schema | Who can create instances |
|-------|---------------|---------------------------|------------------------|
| `system` | Setup/seeding only | Setup/seeding only (version bumps only) | Any authenticated user (within access control) |
| `tenant` | Org admins | Org admins | Org members |
| `user` | Any user | The user who created the type | The user who created the type |
**System-scoped types**: `acl`, `call-graph`, `secret`, `operation-registry`, `message-session`. These are seeded during hub initialization. Their schemas are fixed — changes require a version bump and migration (ADR-029).
**Tenant-scoped types**: Custom graph types created by org admins for the org's use. E.g., a "sprint-board" type for task tracking.
**User-scoped types**: Personal graph types for individual workflows. E.g., a "my-notes" type.
The repository layer enforces scope constraints at creation and modification time. System-scoped types cannot be modified through the repository API. Tenant and user-scoped types can be modified by authorized principals only.
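A minimal sketch of how those repository-level checks might be expressed is shown below. The `Principal` shape and the `createdBy` field are assumptions made for illustration; the ADR does not specify how the creator of a type is recorded or how admin status is resolved.

```typescript
type GraphTypeScope = "system" | "tenant" | "user";

// Hypothetical principal shape; the real ACL model is richer than this.
interface Principal {
  accountId: string;
  isOrgAdmin: boolean;
}

interface GraphTypeRow {
  scope: GraphTypeScope;
  createdBy: string | null; // assumed field, not confirmed by the ADR
}

// Who may create a graph type of the given scope through the repository API.
function canCreateType(p: Principal, scope: GraphTypeScope): boolean {
  switch (scope) {
    case "system": return false; // seeded at setup, never via the API
    case "tenant": return p.isOrgAdmin;
    case "user": return true;
  }
}

// Who may modify an existing graph type's schema through the repository API.
function canModifySchema(p: Principal, t: GraphTypeRow): boolean {
  switch (t.scope) {
    case "system": return false; // changes go through version bump + migration
    case "tenant": return p.isOrgAdmin;
    case "user": return t.createdBy === p.accountId;
  }
}
```

Keeping the rules in two small pure functions limits the amount of domain knowledge that leaks into the storage layer.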
## Consequences
**Positive:**
- ACL graph type schema is protected from accidental or malicious modification
- Clear authorization model for graph type management
- Downstream users can define custom graph types without risking system infrastructure
- Consistent with ADR-037 (setup-time definitions seed graph types)
**Negative:**
- The repository layer needs to know about scope rules (a small amount of domain knowledge in storage)
- If a system graph type genuinely needs a schema change, it must go through a version bump and migration — not just an API call
- The scope column must be indexed and checked on every graph type mutation
## References
- ADR-037: Setup-time definitions seed graph types
- ADR-029: Version as breaking-change signal
- ADR-034: ACL as metagraph
- ADR-042: Scoping columns on graph instances

@@ -0,0 +1,76 @@
# ADR-044: Drizzle-Honker Session Adapter
## Status
Accepted
## Context
Drizzle ORM provides typed query builders for SQLite via driver packages (`drizzle-orm/better-sqlite3`, `drizzle-orm/libsql`). Honker provides SQLite with pubsub/queue extensions but is not a Drizzle driver.
Running both simultaneously against the same `.db` file would require two separate connections, losing the transactional consistency that Honker provides (data writes + event notifications in one transaction).
## Decision
Implement a thin session adapter (~100 lines) that wraps Honker's Node.js `Database.query()`/`Transaction.execute()` API inside Drizzle's `SQLiteSession<'sync'>` contract. No Drizzle fork required.
The adapter implements:
- `HonkerSQLiteSession` — extends `SQLiteSession<'sync', HonkerRunResult, ...>`. Implements `prepareQuery()` and `transaction()`.
- `HonkerPreparedQuery` — extends `SQLitePreparedQuery`. Implements `run()`, `all()`, `get()`, `values()`.
- `HonkerSQLiteTransaction` — extends `SQLiteTransaction<'sync', ...>`. Provides `$honkerTx` for access to Honker's transaction methods.
- `drizzle(client, config)` — factory function that creates a Drizzle database instance from a Honker `Database`.
**Key integration points**:
- `prepareQuery()` delegates to `honkerDb.query(sql, params)` for reads and `tx.execute(sql, params)` for writes
- `transaction()` wraps honker's explicit begin/commit/rollback in Drizzle's callback pattern
- `$client` on the database instance exposes the Honker `Database` for `notify()`, `queue()`, `stream()`, `listen()`, `scheduler()`
- `$honkerTx` on the transaction object exposes the Honker `Transaction` for `notify()` and `enqueueTx()` within a Drizzle transaction callback
- `run()` returns `{ changes, lastInsertRowid }` where `lastInsertRowid` is obtained via `SELECT last_insert_rowid()` (pending a small Rust addition to honker-node for zero-overhead access)
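The callback pattern for `transaction()` mentioned in the list above can be sketched generically. The `ExplicitTx` and `TxSource` interfaces are hypothetical and only assume that a transaction can be begun, committed, and rolled back; they are not honker-node's or Drizzle's real types.

```typescript
// Hypothetical minimal shape of an explicit-transaction API (assumed).
interface ExplicitTx {
  commit(): void;
  rollback(): void;
}

interface TxSource<Tx extends ExplicitTx> {
  begin(): Tx;
}

// Runs `fn` inside a transaction: commits on normal return,
// rolls back and rethrows if `fn` (or the commit itself) throws.
function withTransaction<Tx extends ExplicitTx, T>(
  source: TxSource<Tx>,
  fn: (tx: Tx) => T,
): T {
  const tx = source.begin();
  try {
    const result = fn(tx);
    tx.commit();
    return result;
  } catch (err) {
    tx.rollback();
    throw err;
  }
}
```

Because the callback receives the transaction object itself, data writes and notifications issued through it share one commit.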
**Usage**:
```typescript
import { open } from '@russellthehippo/honker-node';
import { eq } from 'drizzle-orm';
import { drizzle } from '@alkdev/storage/sqlite/honker-adapter';
import * as schema from './schema';

const honkerDb = open('app.db');
const db = drizzle(honkerDb, { schema });

// Drizzle typed queries (sync session: .all() executes and returns rows)
const activeGraphs = db.select().from(schema.graphs)
  .where(eq(schema.graphs.status, 'active'))
  .all();

// Honker + Drizzle in one transaction
const graphId = 'graph-1'; // placeholder: id of an existing graph
db.transaction((tx) => {
  tx.insert(schema.nodes).values({ graphId, key: 'call-1', attributes: {} }).run();
  tx.$honkerTx.notify('graph:updated', { graphId });
});
```
## Consequences
**Positive:**
- Single connection — Drizzle queries and Honker pubsub/queue share the same SQLite file and transaction context
- No Drizzle fork — the adapter is consumer-side code (~100 lines)
- Full Drizzle type safety for all queries
- Full Honker feature access via `$client` and `$honkerTx`
- Transactional consistency: data writes and event notifications commit atomically
- The adapter could be open-sourced independently as `drizzle-honker` for community use
**Negative:**
- `lastInsertRowid` requires an extra query until honker-node adds a native method
- No prepared statement handle reuse at the JS level (mitigated by Rust-side `prepare_cached`)
- The adapter depends on Drizzle's internal SQLite session API path (`drizzle-orm/sqlite-core/session`) — if Drizzle restructures, the adapter needs updating
- Object-to-array conversion for Drizzle's `values()` method relies on JS object property insertion order, which matches SQLite column order but is a subtle dependency
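The insertion-order dependency in the last point can be made concrete with a small sketch. Both helper names are illustrative; the second variant shows one possible way to avoid the dependency by passing the column order explicitly, which is an assumption about how the adapter could be hardened rather than a description of what it does.

```typescript
type Row = Record<string, unknown>;

// Order-dependent: relies on property insertion order matching column order.
function rowToValuesImplicit(row: Row): unknown[] {
  return Object.values(row);
}

// Order-independent: the caller states the column order explicitly.
function rowToValuesExplicit(row: Row, columns: readonly string[]): unknown[] {
  return columns.map((c) => row[c]);
}

// JavaScript enumerates integer-like keys first, in ascending order, so a
// column aliased as "1" would be moved ahead of "name" by the implicit variant.
const tricky: Row = { name: "x", "1": "y" };
const implicitOrder = rowToValuesImplicit(tricky);
const explicitOrder = rowToValuesExplicit(tricky, ["name", "1"]);
```

For ordinary column names the two variants agree, which is why the dependency is easy to overlook.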
## References
- Honker Node binding: `/workspace/honker/packages/honker-node/`
- Drizzle SQLite session: `/workspace/drizzle-orm/src/sqlite-core/session.ts`
- ADR-039: Honker as SQLite extension
- ADR-004: Injectable clients (adapter pattern is consistent)

@@ -0,0 +1,42 @@
# ADR-045: Organization Members as Authoritative SQL Table, BelongsToEdge as Derived
## Status
Accepted
## Context
OQ-23 asked whether `BelongsToEdge` in the ACL graph should be derived (materialized from `organization_members`) or primary (ACL graph is the source of truth).
The ACL graph needs `BelongsToEdge` for traversal-based permission evaluation (ADR-034). The hub needs `organization_members` for fast SQL lookups ("list all members of org X", FK constraints on cascading behavior).
Having two sources of truth for the same data creates a consistency risk.
## Decision
`organization_members` is the authoritative SQL table. `BelongsToEdge` in the ACL graph is derived.
When org membership changes, the consumer (hub) writes to `organization_members` first, then creates or removes the corresponding `BelongsToEdge` in the ACL graph instance. The ACL edge mirrors the SQL table; it does not define it.
If the two fall out of sync, the SQL table is the source of truth. An audit/reconciliation process can re-derive ACL edges from the SQL table.
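The reconciliation step could be sketched as a pure diff between the two sets. The `Membership` shape and the function name are assumptions for illustration; reading the rows and applying the resulting changes to the ACL graph are left to the consumer.

```typescript
// Minimal shape shared by an `organization_members` row and a BelongsToEdge
// (assumed for this sketch; real rows and edges carry more fields).
interface Membership {
  accountId: string;
  orgId: string;
}

// JSON encoding avoids collisions that a plain separator could cause.
const membershipKey = (m: Membership): string => JSON.stringify([m.accountId, m.orgId]);

// Computes which ACL edges to create and which to remove so that the edges
// mirror the SQL table. The SQL rows are treated as the source of truth.
function reconcileMembership(
  sqlRows: Membership[],
  aclEdges: Membership[],
): { toCreate: Membership[]; toRemove: Membership[] } {
  const sqlKeys = new Set(sqlRows.map(membershipKey));
  const edgeKeys = new Set(aclEdges.map(membershipKey));
  return {
    toCreate: sqlRows.filter((m) => !edgeKeys.has(membershipKey(m))),
    toRemove: aclEdges.filter((e) => !sqlKeys.has(membershipKey(e))),
  };
}
```

Running this periodically, or after a failed edge write, would bring the ACL graph back in line with the SQL table.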
## Consequences
**Positive:**
- Clear authority — one write path for membership, one derived read path for ACL traversal
- FK constraints on `organization_members` work (cascade delete when org or account is removed)
- Fast indexed lookups for membership lists — no graph traversal needed
- ACL evaluator can still traverse `BelongsToEdge` for permission resolution
- Reconciliation is straightforward — scan `organization_members`, compare against ACL edges, fix discrepancies
**Negative:**
- Dual-write contract — the consumer must write both places. If the ACL edge write fails after the SQL write, they're out of sync.
- The ACL graph is not self-contained for org membership — it depends on an external table
## References
- ADR-034: ACL as metagraph
- ADR-041: Identity tables in storage package
- OQ-23: BelongsToEdge derivation (now resolved)