Pivot: fold drizzlebox as utils, HonkerEventTarget, OperationSpecs as repo surface

- Update architecture docs to reflect pivot from @libsql/client to Honker
- Fold @alkdev/drizzlebox Phase 0 into src/sqlite/utils/ (ADR-046)
- Add HonkerEventTarget adapter for pubsub TypedEventTarget (ADR-047)
- Replace hand-written CRUD with OperationSpec generation (ADR-048)
- Resolved OQ-26: Honker replaces Redis for single-node pub/sub (POC validated)
- Updated OQ-17, OQ-18, OQ-19 for OperationSpec repository surface
- Added OQ-30 (composite event target), OQ-31 (consumer naming), OQ-32 (Drizzle Kit)
- POC results: adapter buildable, same-process pub/sub works, transactional outbox semantics confirmed, concurrent listeners/streams work
- Research doc at docs/research/pivot-honker-sqlite-adapter.md
2026-06-01 16:31:40 +00:00
parent 6aa2fcc6ff
commit 412ad98f11
10 changed files with 1342 additions and 230 deletions
--- a/docs/architecture/forward-look.md
+++ b/docs/architecture/forward-look.md
@@ -1,16 +1,15 @@
 ---
 status: draft
-last_updated: 2026-05-31
+last_updated: 2026-06-01
 ---

 # Forward Look: Pointers, dbtype, and Universal IR

 How the Module-based metagraph connects to the broader @alkdev ecosystem —
-typed graph pointers, dbtype table rendering, and the ujsx universal IR
-pipeline. These are forward-looking designs that justify why certain structural
-decisions were made now
-(pointer abstraction deferred per [ADR-017](./decisions/017-pointer-abstraction-is-forward-looking.md),
-dbtype integration deferred per [ADR-018](./decisions/018-dbtype-integration-is-post-v1.md)).
+typed graph pointers, local utils (folded from dbtype), and the ujsx universal IR
+pipeline. The dbtype integration is no longer deferred (ADR-046) — the SQLite-only
+Phase 0 subset folds into `src/sqlite/utils/`. The repository surface is now
+OperationSpecs (ADR-048), not hand-written CRUD.

 ## Overview

@@ -109,68 +108,69 @@ feasible because it provides the schema the pointer validates against.

 ## Relationship to @alkdev/dbtype

-`@alkdev/dbtype` defines database schemas as ujsx element trees and renders them
-to Drizzle dialects via HostConfig. Storage's SQLite/PG table definitions are a
-natural consumer of this pipeline.
+`@alkdev/dbtype` defined database schemas as ujsx element trees and planned to
+render them to Drizzle dialects via HostConfig. Its Phase 0 (Drizzle→TypeBox
+schema generation) was consumed as `@alkdev/drizzlebox`. Phase 1 (UJSX→Drizzle)
+was never implemented.

-### Current vs. Future Table Definition
+### Fold: Phase 0 → `src/sqlite/utils/` (ADR-046)

-**Current** (manual Drizzle table defs):
+With SQLite as the sole target (ADR-038), the multi-dialect column mappings in
+dbtype are dead weight. The SQLite-only subset has been folded into storage as
+`src/sqlite/utils/`:
+
+| What folds in | Source (dbtype) | Target (storage) |
+|---------------|-----------------|-------------------|
+| Schema generation | `schema.ts` | `utils/schema.ts` |
+| Column→TypeBox mappings | `column.ts` (SQLite branches only) | `utils/column.ts` |
+| Type interfaces | `schema.types.ts` + `schema.types.internal.ts` + `column.types.ts` | `utils/types.ts` |
+| Integer constants | `constants.ts` | `utils/constants.ts` |
+| Type guards | `utils.ts` (minus PgEnum) | `utils/utils.ts` |
+
+What does NOT fold in: PG, MySQL, SingleStore column handlers; `isPgEnum` /
+`handleEnum`; `createSchemaFactory`; the Phase 1 UJSX→HostConfig pipeline.
+
+Import changes in table files:

 ```ts
-export const graphTypes = sqliteTable("graph_types", {
-  id: text("id").primaryKey(),
-  name: text("name").notNull(),
-  config: text("config", { mode: "json" }).notNull(),
-  // ...
-});
+// Before
+import { createInsertSchema, createSelectSchema } from "@alkdev/drizzlebox";
+// After
+import { createInsertSchema, createSelectSchema } from "../utils/schema.ts";
 ```

-**Future** (dbtype element tree → HostConfig rendering):
+The API surface is identical — same functions, same TypeBox schemas produced.

-```tsx
-const GraphTypesEl = h("table", { name: "graph_types" },
-  h(IdColumn, {}),
-  h("column", { name: "name", type: "string", notNull: true }),
-  h("column", { name: "config", type: "json", mode: "json", notNull: true }),
-  h(AuditColumns, {}),
-);
+### Phase 1 (UJSX→Drizzle): future path

-const root = createRoot(sqliteHost, {});
-root.render(GraphTypesEl);
-const drizzleTable = root.ctx.tables.graph_types;
-```
+The broader UJSX→HostConfig→Drizzle pipeline from dbtype's design remains
+sound but is not part of this pivot. If and when it is built, it
+could live in storage as a `HostConfig` sub-module rather than a separate
+package, since storage is the primary consumer. The TypeBox Module format used
+by the metagraph is already compatible with what a ujsx HostConfig would produce.

 ### Why this matters for storage

-1. **Single source of truth**: Today's `sqlite/tables/` and future `pg/tables/`
-   define the same shapes in two different Drizzle dialects. dbtype renders the
-   same element tree to both — no manual duplication.
-2. **Schema extraction**: `extractTable()` produces both TypeBox schemas (for
-   validation) and column metadata (for Drizzle rendering) from the same tree.
-   Storage gets `SelectGraphType` and `InsertGraphType` schemas for free.
-3. **Module alignment**: dbtype assembles extracted schemas into a
-   `Type.Module` for cross-table references. Storage's metagraph Module and
-   dbtype's table Module could share a namespace — the `graph_types.config`
-   column stores the JSON Schema from `Metagraph.Config`.
+1. **Single source of truth**: The `utils/` code derives TypeBox schemas from
+   Drizzle tables. Table definitions are the source of truth for both the DB
+   schema and the validation schema.
+2. **Schema extraction**: `createSelectSchema` / `createInsertSchema` produce
+   TypeBox schemas that validate data at the application layer.
+3. **Module alignment**: The metagraph Module and the table-derived schemas
+   share the same TypeBox namespace. `graph_types.config` stores the JSON
+   Schema from `Metagraph.Config`.

 ### v1 approach

-For v1, storage continues with manual Drizzle table definitions. The dbtype
-integration is deferred because:
+For v1, storage uses the folded utils for TypeBox schema derivation from Drizzle
+tables (what was `@alkdev/drizzlebox`). The metagraph Module independently
+validates graph type definitions. These two schema sources serve different
+purposes: table schemas validate DB row shapes, Module schemas validate graph
+type semantics.

- dbtype is Phase 0 (architecture complete, no implementation)
- The manual defs work and are well-understood
- The Module pattern for graph types can be adopted independently (no dbtype
-dependency)
- With PostgreSQL removed (ADR-038), the original pressure for dbtype —
-  eliminating dual SQLite/PG table maintenance — is significantly reduced.
-  There is now only one set of table definitions to maintain.
-
-When dbtype reaches Phase 1 (implementation), storage can migrate from
-Drizzle table definitions to dbtype elements one table at a time. The Module-based
-graph type definitions are already compatible — they're both TypeBox `Type.Module`
-objects.
+If dbtype's Phase 1 (UJSX→HostConfig) is ever implemented, it would unify both
+directions: a TypeBox Module could produce both the Drizzle table definition
+and the validation schemas from the same element tree.

 ## ujsx as Universal IR

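The schema derivation described in the hunk above (table definitions as the single source for select and insert validation schemas) can be pictured with a dependency-free sketch. Everything here is an assumption made for illustration: `ColumnSketch`, `selectSchemaSketch`, `insertSchemaSketch`, and the `createdAt` column are hypothetical, and the output is plain JSON Schema rather than the TypeBox schemas the folded `createSelectSchema` / `createInsertSchema` produce. The optionality rules follow the usual drizzle-typebox-style convention, which the real utils may or may not match exactly.

```typescript
// Hypothetical, simplified stand-in for a Drizzle SQLite column.
// The real utils inspect Drizzle column objects and emit TypeBox schemas.
interface ColumnSketch {
  name: string;
  dataType: "string" | "number" | "json";
  notNull: boolean;
  hasDefault: boolean;
}

type JsonSchema = { [key: string]: unknown };

interface ObjectSchema {
  type: "object";
  properties: { [name: string]: JsonSchema };
  required: string[];
}

// JSON-mode columns accept any JSON value, so they map to the empty schema.
// Nullable columns additionally accept null.
function columnSchema(col: ColumnSketch): JsonSchema {
  const base: JsonSchema = col.dataType === "json" ? {} : { type: col.dataType };
  return col.notNull ? base : { anyOf: [base, { type: "null" }] };
}

// Select shape: every column is present on a row read back from the table.
function selectSchemaSketch(columns: ColumnSketch[]): ObjectSchema {
  const properties: { [name: string]: JsonSchema } = {};
  const required: string[] = [];
  for (const col of columns) {
    properties[col.name] = columnSchema(col);
    required.push(col.name);
  }
  return { type: "object", properties, required };
}

// Insert shape: a column is required only if it is NOT NULL and has no default.
function insertSchemaSketch(columns: ColumnSketch[]): ObjectSchema {
  const properties: { [name: string]: JsonSchema } = {};
  const required: string[] = [];
  for (const col of columns) {
    properties[col.name] = columnSchema(col);
    if (col.notNull && !col.hasDefault) required.push(col.name);
  }
  return { type: "object", properties, required };
}

// Columns loosely modelled on graph_types; createdAt is a hypothetical addition.
const graphTypeColumns: ColumnSketch[] = [
  { name: "id", dataType: "string", notNull: true, hasDefault: false },
  { name: "name", dataType: "string", notNull: true, hasDefault: false },
  { name: "config", dataType: "json", notNull: true, hasDefault: false },
  { name: "createdAt", dataType: "number", notNull: true, hasDefault: true },
];

const graphTypeSelect = selectSchemaSketch(graphTypeColumns);
const graphTypeInsert = insertSchemaSketch(graphTypeColumns);
```

In this sketch the select schema requires all four columns, while the insert schema leaves `createdAt` optional because the database supplies its default. That difference is the reason two schemas are derived from one table definition.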
@@ -219,10 +219,17 @@ Rendered to different hosts:
 | Reference graph type Modules (CallGraph, SecretGraph) | ✅ Implemented |
 | Crypto utility (`encrypt`, `decrypt`, `generateEncryptionKey`, `EncryptedDataSchema`) | ✅ Implemented |
 | Codegen from TypeScript interfaces → Module entries | ✅ TsToModule exists |
-| dbtype element trees → Drizzle tables | ⚠️ dbtype Phase 0, no implementation |
+| SQLite column→TypeBox mappings (folded from dbtype) | ✅ Folded into `src/sqlite/utils/` (ADR-046) |
+| `createSelectSchema` / `createInsertSchema` (folded from drizzlebox) | ✅ Folded into `src/sqlite/utils/` (ADR-046) |
+| Drizzle-Honker session adapter | ✅ POC validated, implementation pending |
+| HonkerEventTarget for pubsub | ✅ POC validated, implementation pending |
+| Transactional notify + outbox (Honker) | ✅ POC validated — atomic commit for data + events + queue |
+| OperationSpec generation from tables | ⚠️ Design complete (ADR-048), implementation pending |
+| Domain-specific native-column tables | ⚠️ Conceptual — for known graph types (CallGraph, etc.) |
 | `<graphSchema>` ujsx elements | ⚠️ Conceptual — needs HostConfig design |
 | Typed graph pointers via JPATH | ⚠️ Conceptual — needs JPATH Module design |
 | Reactive graph observation via ValuePointer | ⚠️ Conceptual — needs signal integration |
+| dbtype Phase 1 (UJSX→Drizzle HostConfig) | ⚠️ Architecture exists, not implemented. Could live in storage if built. |

 The Module-based graph type definitions (this spec) are the **first concrete
 step** in this pipeline. Everything else builds on having a `Type.Module` as
@@ -230,126 +237,66 @@ the schema source of truth.

 ## Repository Layer Strategy

-The repository layer (typed CRUD for the 6 metagraph tables + queries for graph data)
-is the next major feature to implement. The question of *how* it queries attributes
-connects to broader ecosystem decisions about dbtype and operations.
+The repository layer (typed CRUD for the 6 metagraph tables + identity tables +
+queries for graph data) is now defined as **OperationSpec output** rather than
+hand-written query functions (ADR-048).

-### Three Approaches
+### OperationSpecs as Repository Surface

-#### A. JSON Path Queries (Near-Term)
+Storage outputs `OperationSpec[]` per table — flat arrays describing CRUD
+operations. The consumer (hub/spoke) imports these, registers handlers, and
+the operations runtime handles execution, call protocol, and subscriptions.

-The repository layer maps filter criteria to JSON path extraction:
+```ts
+// Storage defines the table + operation contracts
+export const callNodes = sqliteTable("call_nodes", { ... });
+export const callNodeSpecs: OperationSpec[] = [
+  { name: "create", namespace: "call_nodes", type: "mutation", inputSchema: ..., outputSchema: ... },
+  { name: "find",   namespace: "call_nodes", type: "query", ... },
+  { name: "list",   namespace: "call_nodes", type: "query", ... },
+  { name: "update", namespace: "call_nodes", type: "mutation", ... },
+  { name: "delete", namespace: "call_nodes", type: "mutation", ... },
+];
+
+// Hub registers specs + handlers
+for (const spec of callNodeSpecs) {
+  registry.registerSpec(spec);
+  registry.registerHandler(`${spec.namespace}.${spec.name}`, handler);
+}
+```
+
+The handler is consumer-provided — not in storage. Storage doesn't execute
+queries. Storage defines the contract; the hub provides the execution layer.
+
+### Attribute Queries
+
+The metagraph's `attributes` column remains JSON — node types are dynamic
+schemas defined at runtime, not static columns. Attribute queries use
+`json_extract()` for v1:

 ```ts
 findNodes({ graphId, attributes: { status: "active" } })
 // SQLite: json_extract(attributes, '$.status') = 'active'
-// PG:     attributes ->> 'status' = 'active'
 ```

- Works with current table definitions (no schema changes)
- SQLite `json_extract()` and PG `->>` / `#>>` operators handle JSON path
- No native index support on individual JSON attributes
- PG can add GIN indexes on `jsonb` columns for containment queries, but not for
-  arbitrary key-value lookups
- Simple, immediate, no new infrastructure
+For known graph types (CallGraph, SecretGraph), domain-specific tables with
+native columns can complement the generic metagraph tables. These domain
+tables also produce OperationSpecs with native-column queries.

-This is the pragmatic v1 approach. The metagraph pattern *requires* JSON attributes
-because node types are dynamic schemas (defined at runtime, stored in
-`node_types.schema`), not static columns known at database definition time.
+### Connection to @alkdev/operations

-#### B. Native Columns via dbtype (Long-Term, Speculative)
-
-If storage migrates to dbtype element trees for table definitions, the 6 static
-metagraph tables (graph_types, node_types, edge_types, graphs, nodes, edges) could
-be rendered via the dbtype pipeline: element tree → HostConfig → Drizzle tables.
-This would eliminate the manual duplication between `sqlite/` and future `pg/`.
-
-However, dbtype does NOT solve the attribute indexing problem:
-
- The metagraph's `attributes` column MUST remain JSON because the shape is defined
-  by runtime schemas (node type definitions), not by static column definitions
- dbtype generates static table schemas; it does not handle dynamic schema-as-data
-  patterns like the metagraph
- A "call" node's attributes (`requestId`, `status`, `duration`) are not columns
-  on the `nodes` table — they're values in the `attributes` JSON column, validated
-  by the corresponding node type's TypeBox schema
-
-#### C. Hybrid: Static Tables via dbtype, Dynamic Attributes Remain JSON
-
-The hybrid approach preserves the metagraph's dynamic schema model while leveraging
-dbtype for the static table scaffolding:
-
-1. **Static tables**: dbtype renders the 6 metagraph tables to Drizzle dialects.
-   This eliminates the SQLite/PG manual duplication for table *structure*.
-   The `attributes` column is still `text/jsonb` across both dialects.
-
-2. **Dynamic attributes**: Remain JSON. The Module-based node type schemas validate
-   data at the application layer, not the database layer. This is by design
-   (ADR-003, ADR-014).
-
-3. **Virtual columns / computed columns**: A post-v1 optimization, not a v1 concern.
-   Frequently queried attributes could be extracted to indexed columns as a
-   performance optimization. For example, if `nodes.attributes.status` is a common
-   filter, a computed column or trigger could copy it to `nodes.status_column` with
-   an index. This would be a denormalization trade-off (triggers, migration
-   complexity, dual-write responsibility) and is not designed or planned for v1.
-
-4. **Repository CRUD**: The static table CRUD operations (insert graph type, find
-   node by key) could be auto-generated like drizzle-graphql or the dbtype
-   `from-dbtype` adapter. Graph-specific attribute queries remain JSON path.
-
-### Implications for Each Approach
-
-| Concern | Path A (JSON) | Path B (Native) | Path C (Hybrid) |
-|---------|---------------|-----------------|------------------|
-| Works today | ✅ | ❌ (requires dbtype) | ❌ (requires dbtype) |
-| Preserves metagraph pattern | ✅ | ❌ (conflicts with dynamic schemas) | ✅ |
-| Eliminates SQLite/PG duplication | ❌ | ✅ | ✅ |
-| Indexes on attributes | GIN on PG only | ✅ full native | GIN + virtual columns |
-| Repository generation | Hand-write CRUD | Auto-gen from dbtype | Auto-gen for static, JSON path for dynamic |
-| Dependency on dbtype | None | Full | Partial (static tables only) |
-
-### Connection to drizzle-graphql
-
-The overview references drizzle-graphql as a pattern for auto-generating a CRUD/query
-surface. The dbtype `from-dbtype` adapter is the @alkdev equivalent: it consumes
-element trees + Type.Module bundles and produces `OperationSpec[]` for the
-operations registry.
-
-The parallel:
-
-| Concern | drizzle-graphql | dbtype from-dbtype |
-|---------|----------------|-------------------|
-| Input | Drizzle schema (tables + relations) | UJSX element tree + Type.Module |
-| Output | GraphQL schema (queries + mutations) | `OperationSpec[]` (CRUD operations) |
-| Dialects | SQLite, PG, MySQL | SQLite, PG, MySQL (via HostConfig) |
-| Table model | Static columns only | Static columns only |
-| Dynamic data (JSON attrs) | Not handled | Not handled |
-
-Neither drizzle-graphql nor dbtype's `from-dbtype` handles dynamic schema-as-data
-patterns. The metagraph's JSON attributes require their own query layer, regardless
-of whether the static tables are auto-generated. This means the repository layer
-for `@alkdev/storage` will always have two parts:
-
-1. **Static table CRUD** — could be auto-generated (by dbtype or hand-written)
-2. **Graph data queries** — JSON path queries against the `attributes` column,
-   validated by the Module schema at the application layer
+`@alkdev/operations` is a type-only peer dependency of storage: storage imports
+only the `OperationSpec` type. Storage builds the specs; the
+consumer wires them into the registry. No circular dependency.

 ### v1 Decision

-For v1, the practical path is **A (JSON path queries) with hand-written CRUD**. This
-decision is recorded as [ADR-033](./decisions/033-json-path-queries-for-v1.md). The
-hybrid approach (C) remains viable for a future iteration when dbtype reaches
-implementation, and it doesn't require any changes to the metagraph data model —
-only to how the static table definitions are generated. See OQ-17, OQ-18, OQ-19
-in [open-questions.md](./open-questions.md) for the specific long-term questions
-that remain open beyond v1.
-
-### Decisions Required
-
- **OQ-17**: JSON path vs native columns vs hybrid for attribute queries (resolved for v1 — see ADR-033)
- **OQ-18**: Auto-generated vs hand-written CRUD for static tables (resolved for v1 — see ADR-033)
- **OQ-19**: Where the storage-operations bridge package should live (open)
+For v1, the practical path is **OperationSpecs with JSON path attribute
+queries** (ADR-048, supersedes ADR-033). Spec generation from tables is
+straightforward once domain tables exist. The metagraph's generic CRUD
+(graphs, nodes, edges) uses JSON attributes; domain-specific CRUD uses
+native columns. Both produce OperationSpecs that the hub registers in the
+same operations registry.

 ## Constraints on Current Design

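The `json_extract()` attribute filter from the Attribute Queries section in the hunk above could be compiled roughly as follows. This is a minimal sketch, not the repository's query layer: `attributeFilterToSql`, `SqlFragment`, and the key whitelist are hypothetical, and only string and number values are handled. It relies on the fact that SQLite's `json_extract()` accepts the path as an ordinary bound argument, so both the path and the value can be passed as parameters.

```typescript
type AttributeValue = string | number;

interface SqlFragment {
  sql: string;
  params: AttributeValue[];
}

// Only plain identifier keys are accepted, so a caller cannot smuggle
// arbitrary JSON path syntax into the query.
const SAFE_KEY = /^[A-Za-z_][A-Za-z0-9_]*$/;

// Turns { status: "active" } into a parameterized WHERE fragment against
// the `attributes` JSON column. All conditions are combined with AND.
function attributeFilterToSql(
  attributes: { [key: string]: AttributeValue },
): SqlFragment {
  const clauses: string[] = [];
  const params: AttributeValue[] = [];
  for (const key of Object.keys(attributes)) {
    const value = attributes[key];
    if (value === undefined) continue;
    if (!SAFE_KEY.test(key)) {
      throw new Error(`Unsupported attribute key: ${key}`);
    }
    clauses.push("json_extract(attributes, ?) = ?");
    params.push(`$.${key}`, value);
  }
  // An empty filter matches every row.
  const sql = clauses.length > 0 ? clauses.join(" AND ") : "1 = 1";
  return { sql, params };
}

const activeFilter = attributeFilterToSql({ status: "active" });
// activeFilter.sql    is "json_extract(attributes, ?) = ?"
// activeFilter.params is ["$.status", "active"]
```

Binding the value instead of interpolating it keeps the generated SQL stable across calls and avoids quoting mistakes. Booleans and nested paths are deliberately left out of the sketch.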
@@ -382,18 +329,28 @@ design in [metagraph-module.md](./metagraph-module.md):
   opaque builder objects or Drizzle column definitions. See
   [schema-evolution.md](./schema-evolution.md).

+6. **OperationSpec output is consumer-agnostic** — storage defines
+   `OperationSpec[]` from table definitions. The consumer (hub/spoke) decides
+   how to register handlers. Storage does not execute queries or depend on
+   the operations runtime.
+
+7. **The folded utils are SQLite-only** — `src/sqlite/utils/` contains only
+   SQLite column→TypeBox mappings. If a new database host is added later, the
+   utils would need the corresponding dialect mappings. dbtype's Phase 1
+   (UJSX→HostConfig) would be the mechanism for multi-dialect support.
+
 ## References

 - ujsx pointer system: `/workspace/@alkdev/ujsx/src/core/pointer.ts`
 - ujsx HostConfig adapter: `/workspace/@alkdev/ujsx/src/host/config.ts`
- dbtype architecture: `/workspace/@alkdev/dbtype/docs/architecture/README.md`
+- dbtype architecture: `/workspace/@alkdev/dbtype/docs/architecture/README.md` (Phase 0 source folded into storage)
 - dbtype elements: `/workspace/@alkdev/dbtype/docs/architecture/elements.md`
 - dbtype module: `/workspace/@alkdev/dbtype/docs/architecture/module.md`
 - dbtype repo adapter: `/workspace/@alkdev/dbtype/docs/architecture/repo-adapter.md`
- drizzle-graphql (reference for CRUD generation pattern): `/workspace/drizzle-graphql/`
 - Operations registry: `/workspace/@alkdev/operations/docs/architecture/README.md`
 - JPATH Module (JSONPath as TypeBox Module): `/workspace/research/typebox_research/ujsx/jpath.gen.ts`
 - jsonpathly source: `/workspace/jsonpathly/`
 - Module evolution spec: [metagraph-module.md](./metagraph-module.md)
 - Schema evolution spec: [schema-evolution.md](./schema-evolution.md)
- ADR-033: JSON path queries and hand-written CRUD for v1
+- ADR-046: Fold drizzlebox as utils
+- ADR-048: OperationSpecs as repository surface (supersedes ADR-033)
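
Constraint 6 above (storage emits `OperationSpec[]`, the consumer registers handlers) can be illustrated end to end with a self-contained sketch. The names here are assumptions: `OperationSpecSketch` mirrors only the fields visible in this commit's `callNodeSpecs` example and is not the real `@alkdev/operations` type, `crudSpecsFor` is a hypothetical generator, and `RegistrySketch` is a toy stand-in for the operations registry. The input schemas chosen for `list` and `update` are placeholders.

```typescript
// Local stand-in for OperationSpec; only the fields shown in the doc's example.
interface OperationSpecSketch {
  name: string;
  namespace: string;
  type: "query" | "mutation";
  inputSchema: unknown;
  outputSchema: unknown;
}

// Schemas a table would contribute (for example, derived from the table definition).
interface TableSchemas {
  select: unknown;
  insert: unknown;
  key: unknown;
}

// Storage side: build the five CRUD contracts for one table. No query is executed.
function crudSpecsFor(namespace: string, schemas: TableSchemas): OperationSpecSketch[] {
  const { select, insert, key } = schemas;
  return [
    { name: "create", namespace, type: "mutation", inputSchema: insert, outputSchema: select },
    { name: "find", namespace, type: "query", inputSchema: key, outputSchema: select },
    // Placeholder input: a real list operation would likely take filter and paging options.
    { name: "list", namespace, type: "query", inputSchema: { type: "object" }, outputSchema: { type: "array", items: select } },
    // Placeholder input: a real update would likely take the key plus a partial row.
    { name: "update", namespace, type: "mutation", inputSchema: insert, outputSchema: select },
    { name: "delete", namespace, type: "mutation", inputSchema: key, outputSchema: key },
  ];
}

type Handler = (input: unknown) => unknown;

// Consumer side (hub/spoke): a toy registry keyed by "namespace.name".
class RegistrySketch {
  private specs: { [id: string]: OperationSpecSketch } = {};
  private handlers: { [id: string]: Handler } = {};

  registerSpec(spec: OperationSpecSketch): void {
    this.specs[`${spec.namespace}.${spec.name}`] = spec;
  }

  registerHandler(id: string, handler: Handler): void {
    if (!(id in this.specs)) throw new Error(`Unknown operation: ${id}`);
    this.handlers[id] = handler;
  }

  call(id: string, input: unknown): unknown {
    const handler = this.handlers[id];
    if (handler === undefined) throw new Error(`No handler for ${id}`);
    return handler(input);
  }

  ids(): string[] {
    return Object.keys(this.specs);
  }
}

const placeholderSchema = { type: "object" };
const callNodeSpecsSketch = crudSpecsFor("call_nodes", {
  select: placeholderSchema,
  insert: placeholderSchema,
  key: placeholderSchema,
});

const registrySketch = new RegistrySketch();
for (const spec of callNodeSpecsSketch) {
  registrySketch.registerSpec(spec);
  // Echo handler for illustration; a real hub would run the query here.
  registrySketch.registerHandler(`${spec.namespace}.${spec.name}`, (input) => input);
}
```

The point of the split is visible in the types: `crudSpecsFor` needs nothing but schemas, so storage can produce it without a database connection, while all execution concerns live in the handler that the consumer supplies.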