storage/docs/architecture/sqlite-host.md
glm-5.1 6aa2fcc6ff Architect storage around SQLite+Honker: remove PG, add multi-tenant identity, scoping
Reorient @alkdev/storage around a single SQLite database host with Honker
for pub/sub, event streams, and task queues. PostgreSQL is removed as a
target (ADR-038), eliminating dual schema maintenance and infrastructure
complexity. Honker provides DB + pubsub + queues in one .db file (ADR-039).

Add system/tenant DB model (ADR-040): identity tables in system.db, all
graph data in tenant-{orgId}.db files. Identity tables move from the hub
into storage (ADR-041). Scoping columns (ownerId, projectId) added to
graphs table (ADR-042). Graph types get scope (system/tenant/user) to
protect infrastructure schemas (ADR-043).

Define Drizzle-Honker session adapter (ADR-044): ~100-line adapter enabling
Drizzle typed queries and Honker pubsub/queue on a single connection with
transactional consistency.

Resolve OQ-03, OQ-04, OQ-19, OQ-21, OQ-22, OQ-23, OQ-24. Add new
open questions OQ-26 through OQ-29 for Honker integration specifics.

New docs: honker-integration.md (adapter, event patterns, migration).
Scrub all PG/jsonb/libsql references from existing spec docs.
2026-05-31 15:41:41 +00:00


---
status: draft
last_updated: 2026-05-31
---
# SQLite Host
The SQLite database host for `@alkdev/storage`. Uses Drizzle ORM with Honker
for database operations, pub/sub, event streams, and task queues. TypeBox
schemas are auto-generated from Drizzle table definitions via `@alkdev/drizzlebox`.
## Overview
The SQLite host provides:
1. **Metagraph tables** — 6 tables for graph types, node types, edge types,
graphs, nodes, and edges (ADR-002)
2. **Identity tables** — accounts, organizations, organization_members, api_keys,
audit_logs for multi-tenant authentication and authorization (ADR-041)
3. **Drizzle relations** for the relational query API
4. **TypeBox schemas** auto-generated from Drizzle tables (select/insert
validation) via `@alkdev/drizzlebox`
5. **Drizzle-Honker adapter** — thin session adapter for Honker integration
(ADR-044)
6. **Client factories** — `createSystemDatabase(client)` and
`createTenantDatabase(client)` for the system/tenant DB model (ADR-040)
## Package Structure
```
src/sqlite/
├── tables/
│   ├── common.ts                    # commonCols
│   ├── identity/
│   │   ├── accounts.ts              # accounts table + select/insert schemas
│   │   ├── organizations.ts         # organizations table + select/insert schemas
│   │   ├── organization_members.ts  # org membership + select/insert schemas
│   │   ├── api_keys.ts              # API keys (keypal) + select/insert schemas
│   │   ├── audit_logs.ts            # audit trail + select/insert schemas
│   │   └── index.ts                 # barrel re-export
│   ├── metagraph/
│   │   ├── graphTypes.ts            # graph_types table + select/insert schemas
│   │   ├── nodeTypes.ts             # node_types table + select/insert schemas
│   │   ├── edgeTypes.ts             # edge_types table + select/insert schemas
│   │   ├── graphs.ts                # graphs table + select/insert schemas
│   │   ├── nodes.ts                 # nodes table + select/insert schemas
│   │   ├── edges.ts                 # edges table + select/insert schemas
│   │   └── index.ts                 # barrel re-export
│   └── index.ts                     # barrel re-export
├── relations.ts                     # Drizzle relational mappings
├── adapter.ts                       # Drizzle-Honker session adapter
├── schema.ts                        # re-exports all tables + relations
└── client.ts                        # createSystemDatabase(), createTenantDatabase()
```
## Common Columns
All tables share these columns:
```ts
import { sql } from "drizzle-orm";
import { integer, text } from "drizzle-orm/sqlite-core";

// Columns shared by every table (ADR-008).
export const commonCols = {
  id: text("id").primaryKey(),
  metadata: text("metadata", { mode: "json" }).$type<Record<string, unknown>>().default({}),
  createdAt: integer("created_at", { mode: "timestamp" })
    .default(sql`(strftime('%s', 'now'))`)
    .notNull(),
  updatedAt: integer("updated_at", { mode: "timestamp" })
    .default(sql`(strftime('%s', 'now'))`)
    .notNull(),
};
```
- `id` is a consumer-generated UUID text PK (no `$defaultFn`)
- `metadata` is an extension namespace following `_subsystem.key` convention
- `createdAt`/`updatedAt` are stored as Unix epoch seconds; Drizzle's `timestamp` mode maps them to `Date` objects
- No `$onUpdate` — consumers must set `updatedAt` explicitly
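
Because there is no `$onUpdate` hook, every write path has to stamp `updatedAt` itself. One way to centralize that is a small helper applied before each update. The sketch below uses a hypothetical `touch` function (not part of `@alkdev/storage`) over a plain object shaped like the common columns:

```ts
// Shape of the shared columns as Drizzle returns them (timestamp mode yields Date).
interface CommonCols {
  id: string;
  metadata: Record<string, unknown>;
  createdAt: Date;
  updatedAt: Date;
}

// Hypothetical helper: merge a patch and stamp updatedAt explicitly.
// The patch type excludes id, createdAt, and updatedAt, so they cannot be overwritten.
function touch<T extends CommonCols>(
  row: T,
  patch: Partial<Omit<T, "id" | "createdAt" | "updatedAt">>,
  now: Date = new Date(),
): T {
  return { ...row, ...patch, updatedAt: now };
}

const created = new Date("2026-05-31T00:00:00Z");
const row = { id: "g-1", metadata: {}, createdAt: created, updatedAt: created, label: "draft" };
const updated = touch(row, { label: "final" }, new Date("2026-06-01T00:00:00Z"));
```

The patched object would then be passed to the Drizzle `update().set(...)` call, so the new timestamp is written in the same statement as the change.
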
## Metagraph Tables
### `graph_types`
| Column | Type | Constraints | Notes |
| ----------- | ------------------- | ----------------------- | ------------------------------------------------------------ |
| id | text | PK | Consumer-generated UUID |
| metadata | text (JSON) | default `{}` | Extension namespace |
| createdAt | integer (timestamp) | not null, default `now` | |
| updatedAt | integer (timestamp) | not null, default `now` | |
| name | text | not null, **unique** | Graph type name (e.g., "call-graph", "acl") |
| description | text | default `""` | Human-readable description |
| config | text (JSON) | not null | `GraphConfig` — directed/undirected/mixed, multi, self-loops |
| version | integer | not null, default 1 | Breaking schema version (ADR-029) |
| scope | text | not null, default `"system"` | `system` / `tenant` / `user` (ADR-043) |
The `scope` column (ADR-043) controls who can create and modify graph type
definitions. System-scoped types (`acl`, `call-graph`) are seeded at setup time
and cannot be modified through the repository API.
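
The read-only rule for system-scoped types can be pictured as a guard that runs before any update or delete. This is only a sketch: `isMutable` and `assertMutable` are hypothetical names, and the rules for `tenant` and `user` scopes are left out because they are defined in ADR-043 rather than here.

```ts
type GraphTypeScope = "system" | "tenant" | "user";

interface GraphTypeRef {
  name: string;
  scope: GraphTypeScope;
}

// System-scoped definitions are seeded at setup time and treated as read-only.
function isMutable(graphType: GraphTypeRef): boolean {
  return graphType.scope !== "system";
}

// Hypothetical guard a repository method could call before an update or delete.
function assertMutable(graphType: GraphTypeRef): void {
  if (!isMutable(graphType)) {
    throw new Error(`graph type "${graphType.name}" is system-scoped and read-only`);
  }
}

const aclType: GraphTypeRef = { name: "acl", scope: "system" };
// "workflow" is an invented tenant-defined type used only for illustration.
const customType: GraphTypeRef = { name: "workflow", scope: "tenant" };
```
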
### `node_types`
| Column | Type | Constraints | Notes |
| ----------- | ------------------- | -------------------------------------- | ---------------------------------------- |
| id | text | PK | |
| metadata | text (JSON) | default `{}` | |
| createdAt | integer (timestamp) | not null, default `now` | |
| updatedAt | integer (timestamp) | not null, default `now` | |
| graphTypeId | text | not null, FK → graphTypes.id (cascade) | Parent graph type |
| name | text | not null | Node type name (e.g., "call", "account") |
| description | text | default `""` | |
| schema | text (JSON) | not null | TypeBox schema for node attributes |
**Unique constraint**: `(graphTypeId, name)` — node type names are unique within
a graph type.
### `edge_types`
| Column | Type | Constraints | Notes |
| ------------------ | ------------------- | -------------------------------------- | ---------------------------------------------- |
| id | text | PK | |
| metadata | text (JSON) | default `{}` | |
| createdAt | integer (timestamp) | not null, default `now` | |
| updatedAt | integer (timestamp) | not null, default `now` | |
| graphTypeId | text | not null, FK → graphTypes.id (cascade) | Parent graph type |
| name | text | not null | Edge type name (e.g., "triggered", "can_read") |
| description | text | default `""` | |
| schema | text (JSON) | not null | TypeBox schema for edge attributes |
| allowedSourceTypes | text (JSON) | default `[]` | Node type names valid at source endpoint |
| allowedTargetTypes | text (JSON) | default `[]` | Node type names valid at target endpoint |
**Unique constraint**: `(graphTypeId, name)`.
**Empty array semantics**: `[]` means "no restriction" — any node type is valid.
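
The empty-array rule is easy to get backwards, so a small sketch may help. The helper names below (`isAllowed`, `canConnect`) are hypothetical; only the two column names come from the table above.

```ts
// Minimal slice of an edge_types row needed for endpoint checks.
interface EdgeTypeEndpoints {
  allowedSourceTypes: string[];
  allowedTargetTypes: string[];
}

// An empty list means "no restriction"; otherwise the node type name must be listed.
function isAllowed(allowed: string[], nodeTypeName: string): boolean {
  return allowed.length === 0 || allowed.some((t) => t === nodeTypeName);
}

// Hypothetical helper: validate both endpoints of a prospective edge.
function canConnect(edgeType: EdgeTypeEndpoints, sourceType: string, targetType: string): boolean {
  return (
    isAllowed(edgeType.allowedSourceTypes, sourceType) &&
    isAllowed(edgeType.allowedTargetTypes, targetType)
  );
}

// A "can_read" edge type that must start at an account but may point at anything.
const canRead: EdgeTypeEndpoints = { allowedSourceTypes: ["account"], allowedTargetTypes: [] };
```

With this definition, `canConnect(canRead, "account", "call")` passes, while `canConnect(canRead, "call", "account")` is rejected because the source type is not listed.
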
### `graphs`
| Column | Type | Constraints | Notes |
| ----------- | ------------------- | --------------------------------------------- | ---------------------------------------------- |
| id | text | PK | |
| metadata | text (JSON) | default `{}` | |
| createdAt | integer (timestamp) | not null, default `now` | |
| updatedAt | integer (timestamp) | not null, default `now` | |
| graphTypeId | text | FK → graphTypes.id (set null) | Set null on graph type deletion (orphan graph) |
| name | text | not null | Graph instance name |
| description | text | default `""` | |
| status | text | not null, enum: `active`, `archived`, `draft` | Default: `draft` |
| ownerId | text | nullable | Logical reference to accounts.id (ADR-042) |
| projectId | text | nullable | Logical reference to project identity (ADR-042) |
**Scoping columns** (ADR-042): `ownerId` and `projectId` are logical references
to entities in the system DB (accounts, projects). No FK constraint because the
referenced tables live in a different database file. The consumer enforces
referential integrity at the application layer.
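
Since SQLite cannot enforce these references across database files, the check has to happen in application code before a graph row is written. A minimal sketch, assuming the consumer has already loaded the relevant ids from the system DB (the `danglingRefs` helper and its parameters are hypothetical):

```ts
interface GraphScope {
  ownerId: string | null;
  projectId: string | null;
}

// Hypothetical application-layer check. The id arrays stand in for lookups
// against the system DB; a real consumer would query it instead.
function danglingRefs(
  scope: GraphScope,
  knownAccountIds: string[],
  knownProjectIds: string[],
): string[] {
  const problems: string[] = [];
  if (scope.ownerId !== null && knownAccountIds.indexOf(scope.ownerId) === -1) {
    problems.push(`unknown ownerId: ${scope.ownerId}`);
  }
  if (scope.projectId !== null && knownProjectIds.indexOf(scope.projectId) === -1) {
    problems.push(`unknown projectId: ${scope.projectId}`);
  }
  return problems;
}

const scopeProblems = danglingRefs({ ownerId: "ghost", projectId: "p-1" }, ["acct-1"], ["p-1"]);
```
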
No `orgId` column — the tenant DB file itself IS the org scope (ADR-040).
**Indexes**: `idx_graphs_owner_id` on `(ownerId)`, `idx_graphs_project_id` on
`(projectId)`, `idx_graphs_owner_id_project_id` on `(ownerId, projectId)`.
**On `graphTypeId` set null**: Orphan graphs cannot validate their node/edge
types against a missing type definition. The application should prevent graph
type deletion if active graphs reference it.
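
The recommended pre-delete check could look roughly like the following. It is a sketch over plain objects; `activeGraphsUsing` and `canDeleteGraphType` are hypothetical names, and a real implementation would run the equivalent query against the `graphs` table.

```ts
interface GraphRef {
  id: string;
  graphTypeId: string | null;
  status: "active" | "archived" | "draft";
}

// Returns the ids of active graphs that still reference the graph type.
function activeGraphsUsing(graphTypeId: string, graphs: GraphRef[]): string[] {
  return graphs
    .filter((g) => g.graphTypeId === graphTypeId && g.status === "active")
    .map((g) => g.id);
}

// Hypothetical pre-delete check: allow deletion only when no active graph uses the type.
function canDeleteGraphType(graphTypeId: string, graphs: GraphRef[]): boolean {
  return activeGraphsUsing(graphTypeId, graphs).length === 0;
}

const sampleGraphs: GraphRef[] = [
  { id: "g1", graphTypeId: "gt-acl", status: "active" },
  { id: "g2", graphTypeId: "gt-calls", status: "archived" },
  { id: "g3", graphTypeId: null, status: "draft" },
];
```

Here `gt-acl` is blocked from deletion by `g1`, while `gt-calls` is only referenced by an archived graph and can be removed (leaving `g2` orphaned with a null `graphTypeId`).
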
### `nodes`
| Column | Type | Constraints | Notes |
| ---------- | ------------------- | ---------------------------------- | --------------------------------------------- |
| id | text | PK | |
| metadata | text (JSON) | default `{}` | |
| createdAt | integer (timestamp) | not null, default `now` | |
| updatedAt | integer (timestamp) | not null, default `now` | |
| graphId | text | not null, FK → graphs.id (cascade) | Parent graph |
| key | text | not null | Consumer-defined identity within the graph |
| attributes | text (JSON) | not null, default `{}` | Node attributes validated by node type schema |
**Unique constraint**: `(graphId, key)` — node keys are unique within a graph.
**No `nodeTypeId` column**: a node's type is enforced at the application layer (ADR-020).
### `edges`
| Column | Type | Constraints | Notes |
| ------------- | ------------------- | ---------------------------------- | ---------------------------------------------------- |
| id | text | PK | |
| metadata | text (JSON) | default `{}` | |
| createdAt | integer (timestamp) | not null, default `now` | |
| updatedAt | integer (timestamp) | not null, default `now` | |
| graphId | text | not null, FK → graphs.id (cascade) | Parent graph |
| key | text | | Consumer-defined identity (null for anonymous edges) |
| sourceNodeKey | text | not null | Source node key within the graph |
| targetNodeKey | text | not null | Target node key within the graph |
| attributes | text (JSON) | not null, default `{}` | Edge attributes validated by edge type schema |
| undirected | integer (boolean) | default false | Treat as undirected regardless of graph type |
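**Unique constraint**: `(graphId, key)`. SQLite treats `NULL` values as distinct
in unique indexes, so a graph can hold any number of anonymous (null-key) edges.
**Foreign keys**: `(graphId, sourceNodeKey)` and `(graphId, targetNodeKey)` each
reference `(nodes.graphId, nodes.key)` with cascade delete (ADR-022).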
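
To make the effect of the cascade concrete, the sketch below models it in memory. It is not the schema itself: `deleteNode` is a hypothetical function that mimics what SQLite does when a node row is deleted and the edge foreign keys cascade.

```ts
interface NodeRow {
  graphId: string;
  key: string;
}

interface EdgeRow {
  graphId: string;
  key: string | null; // null for anonymous edges
  sourceNodeKey: string;
  targetNodeKey: string;
}

interface GraphData {
  nodes: NodeRow[];
  edges: EdgeRow[];
}

// Models the cascade: deleting node (graphId, key) also deletes every edge in
// the same graph whose source or target is that key. Other graphs are untouched.
function deleteNode(data: GraphData, graphId: string, key: string): GraphData {
  return {
    nodes: data.nodes.filter((n) => !(n.graphId === graphId && n.key === key)),
    edges: data.edges.filter(
      (e) => !(e.graphId === graphId && (e.sourceNodeKey === key || e.targetNodeKey === key)),
    ),
  };
}

const initial: GraphData = {
  nodes: [
    { graphId: "g1", key: "a" },
    { graphId: "g1", key: "b" },
    { graphId: "g2", key: "a" },
  ],
  edges: [
    { graphId: "g1", key: null, sourceNodeKey: "a", targetNodeKey: "b" },
    { graphId: "g2", key: "loop", sourceNodeKey: "a", targetNodeKey: "a" },
  ],
};
const remaining = deleteNode(initial, "g1", "a");
```

Node `a` in graph `g2` and its edge survive, because the reference is scoped by `graphId` as well as by node key.
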
## Identity Tables
Identity tables live in the **system DB** (ADR-040, ADR-041). They provide
multi-tenant authentication and authorization infrastructure. These tables are
derived from the hub's existing identity tables; the schemas are aligned but
simplified for the storage package's scope.
### `accounts`
| Column | Type | Notes |
|---------------|---------------------|-------|
| commonCols | — | id, metadata, createdAt, updatedAt |
| email | text NOT NULL UNIQUE | Unique identifier |
| displayName | text | Display name |
| accessLevel | text NOT NULL DEFAULT `user` | `admin`, `user`, `service` |
| status | text NOT NULL DEFAULT `active` | `active`, `suspended`, `deactivated` |
**Indexes**: `unq_accounts_email` UNIQUE on `(email)`.
### `organizations`
| Column | Type | Notes |
|----------|---------------------|-------|
| commonCols | — | id, metadata, createdAt, updatedAt |
| name | text NOT NULL UNIQUE | Organization name |
| slug | text NOT NULL UNIQUE | URL-friendly identifier |
| ownerId | text NOT NULL | Logical reference to accounts.id |
**Indexes**: `unq_organizations_name` UNIQUE on `(name)`, `unq_organizations_slug` UNIQUE on `(slug)`.
### `organization_members`
| Column | Type | Notes |
|-----------------|---------------------|-------|
| commonCols | — | id, metadata, createdAt, updatedAt |
| orgId | text NOT NULL | FK → organizations.id (cascade) |
| accountId | text NOT NULL | FK → accounts.id (cascade) |
| membershipLevel | text NOT NULL | `owner`, `admin`, `member` |
**Unique constraint**: `(orgId, accountId)`.
**Indexes**: `idx_org_members_account_id` on `(accountId)`.
This table is the authoritative source for org membership (ADR-045). The ACL
graph's `BelongsToEdge` is derived from it — when membership changes, the
consumer writes the SQL row first, then creates or removes the ACL edge.
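
The write order described above (SQL row first, derived edge second) can be sketched with in-memory stand-ins. Everything here is illustrative: the `MembershipStores` shape, the `BelongsToEdge` fields, and the function names are assumptions, not the actual hub or storage API.

```ts
type MembershipLevel = "owner" | "admin" | "member";

interface MemberRow {
  orgId: string;
  accountId: string;
  membershipLevel: MembershipLevel;
}

// Assumed edge shape; the real BelongsToEdge lives in the ACL graph.
interface BelongsToEdge {
  accountId: string;
  orgId: string;
}

// In-memory stand-ins for organization_members and the ACL graph.
interface MembershipStores {
  members: MemberRow[];
  aclEdges: BelongsToEdge[];
}

function addMember(stores: MembershipStores, row: MemberRow): void {
  // 1. Authoritative write, honoring the (orgId, accountId) unique constraint.
  const exists = stores.members.some(
    (m) => m.orgId === row.orgId && m.accountId === row.accountId,
  );
  if (exists) throw new Error("account is already a member of this organization");
  stores.members.push(row);
  // 2. Derived write: mirror the row as a BelongsToEdge.
  stores.aclEdges.push({ accountId: row.accountId, orgId: row.orgId });
}

function removeMember(stores: MembershipStores, orgId: string, accountId: string): void {
  // Same order on removal: SQL row first, then the derived edge.
  stores.members = stores.members.filter(
    (m) => !(m.orgId === orgId && m.accountId === accountId),
  );
  stores.aclEdges = stores.aclEdges.filter(
    (e) => !(e.orgId === orgId && e.accountId === accountId),
  );
}

const membership: MembershipStores = { members: [], aclEdges: [] };
addMember(membership, { orgId: "acme", accountId: "user-1", membershipLevel: "admin" });
```

Writing the table first means a failure between the two steps leaves the authoritative record correct, and the derived edge can be rebuilt from it.
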
### `api_keys`
| Column | Type | Notes |
|------------|---------------------|-------|
| commonCols | — | id, metadata, createdAt, updatedAt |
| ownerId | text NOT NULL | Logical reference to accounts.id |
| keyHash | text NOT NULL UNIQUE | SHA-256 hash (never stores raw key) |
| name | text | Human-readable key label |
| enabled | integer NOT NULL DEFAULT 1 | Disable without revoking |
| expiresAt | integer (timestamp) | When the key expires (null = never) |
| revokedAt | integer (timestamp) | When revoked (null = active) |
**Indexes**: `unq_api_keys_key_hash` UNIQUE on `(keyHash)`, `idx_api_keys_owner_id` on `(ownerId)`.
Keypal scope data is stored in `metadata` (`metadata.scopes`, `metadata.resources`).
The hub provides a `HubKeyStorage` adapter that reads/writes this table to
implement keypal's `Storage` interface.
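
The three lifecycle columns on `api_keys` (`enabled`, `expiresAt`, `revokedAt`) combine into a single usability decision. A minimal sketch of that decision, with a hypothetical `isKeyUsable` helper and the assumption that a key counts as expired at the exact `expiresAt` instant:

```ts
// The lifecycle columns of an api_keys row, as Drizzle would return them.
interface ApiKeyState {
  enabled: boolean;
  expiresAt: Date | null; // null = never expires
  revokedAt: Date | null; // null = not revoked
}

// Hypothetical helper: a key is usable only if it is enabled, not revoked, and not expired.
function isKeyUsable(key: ApiKeyState, now: Date = new Date()): boolean {
  if (!key.enabled) return false; // disabled without revoking
  if (key.revokedAt !== null) return false; // revoked
  // Assumption: expiry is inclusive, so the key stops working at expiresAt.
  if (key.expiresAt !== null && key.expiresAt.getTime() <= now.getTime()) return false;
  return true;
}

const checkTime = new Date("2026-06-01T00:00:00Z");
const liveKey: ApiKeyState = { enabled: true, expiresAt: null, revokedAt: null };
const expiredKey: ApiKeyState = {
  enabled: true,
  expiresAt: new Date("2026-05-01T00:00:00Z"),
  revokedAt: null,
};
```
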
### `audit_logs`
| Column | Type | Notes |
|----------|---------------------|-------|
| commonCols | — | id, metadata, createdAt, updatedAt |
| action | text NOT NULL | `created`, `revoked`, `rotated`, `login`, `access_denied` |
| ownerId | text NOT NULL | Logical reference to accounts.id |
| keyId | text | Logical reference to api_keys.id (nullable) |
| orgId | text | Logical reference to organizations.id (nullable) |
| details | text (JSON) | Action-specific context |
**Indexes**: `idx_audit_logs_owner_id` on `(ownerId)`, `idx_audit_logs_action` on `(action)`, `idx_audit_logs_created_at` on `(createdAt)`.
## Relations
### System DB Relations
- **organizations → organization_members**: one-to-many
- **accounts → organization_members**: one-to-many
### Tenant DB Relations
- **graphTypes → nodeTypes**: one-to-many
- **graphTypes → edgeTypes**: one-to-many
- **graphTypes → graphs**: one-to-many
- **graphs → nodes**: one-to-many
- **graphs → edges**: one-to-many
- **nodes → outgoing edges** (sourceNode): one-to-many
- **nodes → incoming edges** (targetNode): one-to-many
## Client Factories
### `createSystemDatabase(client)`
Creates a Drizzle database instance with the identity schema (accounts,
organizations, organization_members, api_keys, audit_logs) attached.
```ts
import { eq } from "drizzle-orm";
import { open } from "@russellthehipp/honker-node";
// Assumes the tables from schema.ts are exported from the same entry point.
import { accounts, createSystemDatabase } from "@alkdev/storage/sqlite";

const honkerClient = open("system.db");
const db = createSystemDatabase(honkerClient);

// Drizzle typed queries (.all() executes the query and returns the rows)
const admins = db.select().from(accounts).where(eq(accounts.accessLevel, "admin")).all();

// Honker features on the same connection
db.$client.notify("account:created", { accountId: "user-1" });
```
### `createTenantDatabase(client)`
Creates a Drizzle database instance with the metagraph schema (graph_types,
node_types, edge_types, graphs, nodes, edges) attached.
```ts
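import { randomUUID } from "node:crypto";
import { eq } from "drizzle-orm";
import { open } from "@russellthehipp/honker-node";
// Assumes the tables from schema.ts are exported from the same entry point.
import { createTenantDatabase, graphs, nodes } from "@alkdev/storage/sqlite";

const honkerClient = open("tenant-acme.db");
const db = createTenantDatabase(honkerClient);

// Drizzle typed queries (.all() executes the query and returns the rows)
const activeGraphs = db.select().from(graphs).where(eq(graphs.status, "active")).all();
const graphId = "graph-1"; // placeholder: id of an existing graph

// Transactional: insert node + notify in one commit
db.transaction((tx) => {
  // id has no default, so the consumer supplies the UUID
  tx.insert(nodes).values({ id: randomUUID(), graphId, key: "call-1", attributes: {} }).run();
  tx.$honkerTx.notify("nodes:created", { graphId, key: "call-1" });
});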
```
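
Tenant databases follow the `tenant-{orgId}.db` naming from ADR-040, so a consumer typically needs a way to turn an org id into a file name before calling `open()`. The helper below is a hypothetical sketch; the character whitelist is an assumption added to keep an untrusted id from escaping the data directory.

```ts
// Hypothetical helper: derive the tenant database file name from an org id.
function tenantDbFile(orgId: string): string {
  // Assumption: org ids are restricted to a conservative character set,
  // which rules out path separators and "..".
  if (!/^[A-Za-z0-9_-]+$/.test(orgId)) {
    throw new Error(`invalid org id: ${orgId}`);
  }
  return `tenant-${orgId}.db`;
}

const acmeFile = tenantDbFile("acme");
```

The resulting name (`tenant-acme.db` here) is what would be passed to `open()` ahead of `createTenantDatabase()`.
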
## Design Decisions
| ADR | Decision | Summary |
|-----|----------|---------|
| [038](decisions/038-sqlite-first-pg-removed.md) | SQLite-first, PG removed | Single database host |
| [039](decisions/039-honker-as-sqlite-extension.md) | Honker as SQLite extension | DB + pub/sub + queues in one file |
| [040](decisions/040-system-db-tenant-db.md) | System DB + tenant DB | Identity in system.db, graphs in tenant-{orgId}.db |
| [041](decisions/041-identity-tables-in-storage.md) | Identity tables in storage | accounts, organizations, organization_members, api_keys, audit_logs |
| [042](decisions/042-scoping-columns-on-graphs.md) | Scoping columns on graphs | `ownerId`, `projectId` on `graphs` table |
| [043](decisions/043-graph-type-scope.md) | Graph type scope | `system` / `tenant` / `user` scope on `graph_types` |
| [044](decisions/044-drizzle-honker-adapter.md) | Drizzle-Honker adapter | ~100-line session adapter |
| [045](decisions/045-org-members-authoritative-belongsto-derived.md) | org_members authoritative | SQL table is source of truth; BelongsToEdge is derived |
| [019](decisions/019-json-text-for-schema-columns.md) | JSON text for schema columns | SQLite uses `text` with JSON mode |
| [020](decisions/020-no-nodetypeid-on-nodes.md) | No nodeTypeId on nodes | Node type enforced at application layer |
| [022](decisions/022-composite-fks-for-node-references.md) | Composite FKs for node refs | Edges reference `(graphId, sourceNodeKey)` |
| [008](decisions/008-common-columns-pattern.md) | Common columns pattern | `id`, `metadata`, `createdAt`, `updatedAt` |
## Removed: `actors` Table
The `actors` table is removed per ADR-035. `ACTOR_TYPE` is replaced by the
`IdentityType` enum in the AclGraph Module. Identity data lives in the
`accounts` table (system DB) and `PrincipalNode` in ACL graph instances
(tenant DB).
## Removed: PostgreSQL Porting Notes
PostgreSQL is no longer a target (ADR-038). All porting notes from the previous
version of this document are obsolete. The single database host is SQLite via
Honker.
## Concurrency Model
Honker opens databases in WAL mode with a bounded reader pool and single writer
slot. This handles the expected concurrency for the hub use case:
- **Reader pool**: Up to `maxReaders` (default 4) concurrent read connections.
`db.$client.query()` uses the pool automatically.
- **Writer slot**: Single exclusive writer, acquired by `transaction()`. If the
slot is occupied, subsequent `transaction()` calls block until released.
- **Write timeout**: Honker sets `busy_timeout=5000` by default. Configurable at
`open()` time.
- **WAL mode**: Enables concurrent reads during writes. Required by Honker's
reader pool architecture.
For multi-process deployments, every process must open the database in WAL mode
(Honker's default), and the busy timeout must be long enough to cover the
expected lock contention.
## References
- Honker source: `/workspace/honker/`
- Honker Node binding: `/workspace/honker/packages/honker-node/`
- Hub identity tables (provenance): `/workspace/@alkdev/hub/docs/architecture/storage/identity.md`
- Operations AccessControl: `/workspace/@alkdev/operations/docs/architecture/api-surface.md`
- Source: `src/sqlite/`