---
status: reviewed
last_updated: 2026-05-30
---

# Forward Look: Pointers, dbtype, and Universal IR

How the Module-based metagraph connects to the broader @alkdev ecosystem: typed graph pointers, dbtype table rendering, and the ujsx universal IR pipeline. These are forward-looking designs that justify why certain structural decisions were made now (pointer abstraction deferred per [ADR-017](./decisions/017-pointer-abstraction-is-forward-looking.md), dbtype integration deferred per [ADR-018](./decisions/018-dbtype-integration-is-post-v1.md)).

## Overview

Three packages in the @alkdev ecosystem share the same pipeline shape:

```
Schema (TypeBox Module) → Element Tree (ujsx) → Host (HostConfig)
```

| Package | Schema | Element tree | Host |
|---------|--------|-------------|------|
| `@alkdev/ujsx` | `UJSX` Module | ``, `` | DOM, custom |
| `@alkdev/dbtype` | Table/Column schemas | ``, `` | SQLite, PG, MySQL drizzle dialects |
| `@alkdev/storage` | `Metagraph` Module | ⚠️ Future: ``, `` | ⚠️ Future: graph DB hosts |

When storage's graph type definitions align with the Module pattern, they join this same pipeline. The immediate benefit is recursive/cross-referencing schemas (today). The forward benefit is that graph type definitions, table definitions, and pointer expressions can all be authored as ujsx element trees rendered to different hosts.

## Pointer Abstraction

Addressing nodes and edges within a graph instance follows the same pattern as ujsx's `ValuePointer` and `selectNode`/`setNode`, and the same pattern as jsonpathly's JPATH Module for path expressions.

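As a rough illustration of path-based addressing over a graph instance, the sketch below walks a plain in-memory object by string segments. `GraphInstance`, `GraphEntry`, `selectGraphValue`, and the sample data are hypothetical names chosen for this example; this is not ujsx's `selectNode` and not a storage API.

```typescript
// Hypothetical in-memory shape, used only for illustration.
interface GraphEntry {
  attributes: Record<string, unknown>;
}

interface GraphInstance {
  nodes: Record<string, GraphEntry>;
  edges: Record<string, GraphEntry>;
}

// Walk a path of string segments; return undefined as soon as a segment is missing.
function selectGraphValue(graph: GraphInstance, path: string[]): unknown {
  let current: unknown = graph;
  for (const segment of path) {
    if (typeof current !== "object" || current === null) return undefined;
    // Only follow own properties, so segments like "constructor" do not resolve.
    if (!Object.prototype.hasOwnProperty.call(current, segment)) return undefined;
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

const sampleGraph: GraphInstance = {
  nodes: { "call-001": { attributes: { requestId: "req-42", status: "active" } } },
  edges: {},
};

const foundRequestId = selectGraphValue(sampleGraph, ["nodes", "call-001", "attributes", "requestId"]);
const missingRequestId = selectGraphValue(sampleGraph, ["nodes", "call-999", "attributes", "requestId"]);
```

The path here is an untyped `string[]`, so a misspelled segment only surfaces at runtime as `undefined`.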
### ujsx's pointer system (proven)

ujsx already implements a reactive pointer system:

```ts
// Signature summary; bodies omitted.
class ValuePointer<T> {
  private _signal: Signal<T>;
  private _path: string[];
  get value(): T
  set value(v: T)
  get reactive(): ReadonlySignal<T>
  get path(): string[]
}

function selectNode(root: UNode, path: string[]): UNode | undefined
function setNode(root: UNode, path: string[], value: UNode): UNode
```

This addresses elements within a ujsx tree by path segments (child indices, prop names). A graph instance has analogous structure: nodes identified by key, edges identified by key, attributes addressed by JSON path.

### Graph pointer analogy

```ts
// ujsx pointer: element tree → path → value
selectNode(root, ["children", 0, "props", "name"])

// Graph pointer: graph instance → path → value
selectNode(graph, ["nodes", "call-001", "attributes", "requestId"])
```

The structural analogy:

| ujsx concept | Graph concept |
|-------------|---------------|
| Element tree root | Graph instance |
| `UNode` | Node or Edge |
| `path: string[]` | Key path: `["nodes", key]` or `["edges", key]` |
| `selectNode(root, path)` | `selectGraphNode(graph, path)` |
| `setNode(root, path, value)` | `setGraphNode(graph, path, value)` (via repository) |

### JPATH Module (jsonpathly)

The research shows that JSONPath expressions can themselves be a TypeBox Module (`JPATH = Type.Module({...})` with recursive `Type.Ref("Subscript")`). This means pointer paths are not just runtime strings; they are typed schemas that can be validated and composed.

For graph storage, this opens the possibility of **typed graph queries**: a pointer expression like `nodes.call-001.attributes.requestId` has a schema that validates against the graph type's Module. If `CallNode` doesn't have a `requestId` field, the pointer expression is invalid at compile time.

### Scope for v1

The pointer abstraction is a forward-looking design.
For v1:

- **Repository functions** use direct key-based addressing: `findNode(graphId, nodeKey)`, `findEdge(graphId, edgeKey)`
- **Attribute access** is untyped JSON retrieval: `node.attributes.requestId`
- **The Module** validates attribute shapes, but query paths are strings

The jump to typed pointers requires either the JPATH Module (for path validation) or ujsx-style `ValuePointer` with signals (for reactive graph observation). Both are post-v1 concerns, but the graph type Module makes them feasible because it provides the schema the pointer validates against.

## Relationship to @alkdev/dbtype

`@alkdev/dbtype` defines database schemas as ujsx element trees and renders them to Drizzle dialects via HostConfig. Storage's SQLite/PG table definitions are a natural consumer of this pipeline.

### Current vs. Future Table Definition

**Current** (manual Drizzle table defs):

```ts
export const graphTypes = sqliteTable("graph_types", {
  id: text("id").primaryKey(),
  name: text("name").notNull(),
  config: text("config", { mode: "json" }).notNull(),
  // ...
});
```

**Future** (dbtype element tree → HostConfig rendering):

```tsx
const GraphTypesEl = h("table", { name: "graph_types" },
  h(IdColumn, {}),
  h("column", { name: "name", type: "string", notNull: true }),
  h("column", { name: "config", type: "json", mode: "json", notNull: true }),
  h(AuditColumns, {}),
);

const root = createRoot(sqliteHost, {});
root.render(GraphTypesEl);
const drizzleTable = root.ctx.tables.graph_types;
```

### Why this matters for storage

1. **Single source of truth**: Today's `sqlite/tables/` and future `pg/tables/` define the same shapes in two different Drizzle dialects. dbtype renders the same element tree to both, with no manual duplication.
2. **Schema extraction**: `extractTable()` produces both TypeBox schemas (for validation) and column metadata (for Drizzle rendering) from the same tree. Storage gets `SelectGraphType` and `InsertGraphType` schemas for free.
3. **Module alignment**: dbtype assembles extracted schemas into a `Type.Module` for cross-table references. Storage's metagraph Module and dbtype's table Module could share a namespace: the `graph_types.config` column stores the JSON Schema from `Metagraph.Config`.

### v1 approach

For v1, storage continues with manual Drizzle table definitions. The dbtype integration is deferred because:

- dbtype is Phase 0 (architecture complete, no implementation)
- The manual defs work and are well-understood
- The Module pattern for graph types can be adopted independently (no dbtype dependency)

When dbtype reaches Phase 1 (implementation), storage can migrate from Drizzle table definitions to dbtype elements one table at a time. The Module-based graph type definitions are already compatible: both are TypeBox `Type.Module` objects.

## ujsx as Universal IR

The three packages (ujsx, dbtype, storage) share the same pipeline shape: **Schema → Element Tree → Host**. This is not coincidental: ujsx is a universal declarative IR, and different "render targets" are just different HostConfigs.

### What this could look like

```tsx
// Graph type definitions as ujsx elements (future)
const CallGraphSchema = h("graphSchema", { name: "call-graph" },
  h("config", { type: "directed", multi: false, allowSelfLoops: false }),
  h("nodeType", { name: "call" },
    h(BaseNode, {}),
    h("attr", { name: "requestId", type: "string", required: true }),
    h("attr", { name: "status", ref: "CallStatus" }),
  ),
  h("edgeType", { name: "triggered" },
    h(BaseEdge, {}),
    h("attr", { name: "type", literal: "triggered" }),
  ),
  h("edgeConstraints", {
    edgeType: "triggered",
    allowedSourceTypes: ["Call"],
    allowedTargetTypes: ["Call", "Subcall"]
  }),
);
```

Rendered to different hosts:

| Host | Output |
|------|--------|
| TypeBox Host | `Type.Module({ CallNode: ..., TriggeredEdge: ... })` |
| SQLite Host | `sqliteTable("node_types", { ... })` + `sqliteTable("edge_types", { ... })` |
| PG Host | `pgTable("node_types", { ... })` + `pgTable("edge_types", { ... })` |
| graphology Host | `SerializedGraph` format |
| Documentation Host | Mermaid diagram, typed API docs |

### What's real today vs. aspirational

| Capability | Status |
|-----------|--------|
| `Type.Module` for graph type definitions | ✅ Implemented: Metagraph, CallGraph, SecretGraph Modules |
| Bridge functions (`moduleToDbSchema`, `validateNode`, `validateEdge`) | ✅ Implemented |
| Reference graph type Modules (CallGraph, SecretGraph) | ✅ Implemented |
| Crypto utility (`encrypt`, `decrypt`, `generateEncryptionKey`, `EncryptedDataSchema`) | ✅ Implemented |
| Codegen from TypeScript interfaces → Module entries | ✅ TsToModule exists |
| dbtype element trees → Drizzle tables | ⚠️ dbtype Phase 0, no implementation |
| `` ujsx elements | ⚠️ Conceptual: needs HostConfig design |
| Typed graph pointers via JPATH | ⚠️ Conceptual: needs JPATH Module design |
| Reactive graph observation via ValuePointer | ⚠️ Conceptual: needs signal integration |

The Module-based graph type definitions (this spec) are the **first concrete step** in this pipeline. Everything else builds on having a `Type.Module` as the schema source of truth.

## Repository Layer Strategy

The repository layer (typed CRUD for the 6 metagraph tables + queries for graph data) is the next major feature to implement. The question of *how* it queries attributes connects to broader ecosystem decisions about dbtype and operations.

### Three Approaches

#### A. JSON Path Queries (Near-Term)

The repository layer maps filter criteria to JSON path extraction:

```ts
findNodes({ graphId, attributes: { status: "active" } })
// SQLite: json_extract(attributes, '$.status') = 'active'
// PG: attributes ->> 'status' = 'active'
```

- Works with current table definitions (no schema changes)
- SQLite `json_extract()` and PG `->>` / `#>>` operators handle JSON path
- No native index support on individual JSON attributes
- PG can add GIN indexes on `jsonb` columns for containment (`@>`) queries, but a GIN index does not speed up the `->>` equality comparison shown above
- Simple, immediate, no new infrastructure

This is the pragmatic v1 approach. The metagraph pattern *requires* JSON attributes because node types are dynamic schemas (defined at runtime, stored in `node_types.schema`), not static columns known at database definition time.

#### B. Native Columns via dbtype (Long-Term, Speculative)

If storage migrates to dbtype element trees for table definitions, the 6 static metagraph tables (graph_types, node_types, edge_types, graphs, nodes, edges) could be rendered via the dbtype pipeline: element tree → HostConfig → Drizzle tables. This would eliminate the manual duplication between `sqlite/` and future `pg/`.

However, dbtype does NOT solve the attribute indexing problem:

- The metagraph's `attributes` column MUST remain JSON because the shape is defined by runtime schemas (node type definitions), not by static column definitions
- dbtype generates static table schemas; it does not handle dynamic schema-as-data patterns like the metagraph
- A "call" node's attributes (`requestId`, `status`, `duration`) are not columns on the `nodes` table; they are values in the `attributes` JSON column, validated by the corresponding node type's TypeBox schema

#### C. Hybrid: Static Tables via dbtype, Dynamic Attributes Remain JSON

The hybrid approach preserves the metagraph's dynamic schema model while leveraging dbtype for the static table scaffolding:

1. **Static tables**: dbtype renders the 6 metagraph tables to Drizzle dialects. This eliminates the SQLite/PG manual duplication for table *structure*. The `attributes` column is still `text/jsonb` across both dialects.
2. **Dynamic attributes**: Remain JSON. The Module-based node type schemas validate data at the application layer, not the database layer. This is by design (ADR-003, ADR-014).
3. **Virtual columns / computed columns**: A post-v1 optimization, not a v1 concern. Frequently queried attributes could be extracted to indexed columns as a performance optimization. For example, if `nodes.attributes.status` is a common filter, a computed column or trigger could copy it to `nodes.status_column` with an index. This would be a denormalization trade-off (triggers, migration complexity, dual-write responsibility) and is not designed or planned for v1.
4. **Repository CRUD**: The static table CRUD operations (insert graph type, find node by key) could be auto-generated like drizzle-graphql or the dbtype `from-dbtype` adapter. Graph-specific attribute queries remain JSON path.

### Implications for Each Approach

| Concern | Path A (JSON) | Path B (Native) | Path C (Hybrid) |
|---------|---------------|-----------------|------------------|
| Works today | ✅ | ❌ (requires dbtype) | ❌ (requires dbtype) |
| Preserves metagraph pattern | ✅ | ❌ (conflicts with dynamic schemas) | ✅ |
| Eliminates SQLite/PG duplication | ❌ | ✅ | ✅ |
| Indexes on attributes | GIN on PG only | ✅ full native | GIN + virtual columns |
| Repository generation | Hand-write CRUD | Auto-gen from dbtype | Auto-gen for static, JSON path for dynamic |
| Dependency on dbtype | None | Full | Partial (static tables only) |

### Connection to drizzle-graphql

The overview references drizzle-graphql as a pattern for auto-generating a CRUD/query surface.
The dbtype `from-dbtype` adapter is the @alkdev equivalent: it consumes element trees + Type.Module bundles and produces `OperationSpec[]` for the operations registry.

The parallel:

| Concern | drizzle-graphql | dbtype from-dbtype |
|---------|----------------|-------------------|
| Input | Drizzle schema (tables + relations) | UJSX element tree + Type.Module |
| Output | GraphQL schema (queries + mutations) | `OperationSpec[]` (CRUD operations) |
| Dialects | SQLite, PG, MySQL | SQLite, PG, MySQL (via HostConfig) |
| Table model | Static columns only | Static columns only |
| Dynamic data (JSON attrs) | Not handled | Not handled |

Neither drizzle-graphql nor dbtype's `from-dbtype` handles dynamic schema-as-data patterns. The metagraph's JSON attributes require their own query layer, regardless of whether the static tables are auto-generated. This means the repository layer for `@alkdev/storage` will always have two parts:

1. **Static table CRUD**: could be auto-generated by dbtype or hand-written
2. **Graph data queries**: JSON path queries against the `attributes` column, validated by the Module schema at the application layer

### v1 Decision

For v1, the practical path is **A (JSON path queries) with hand-written CRUD**. This decision is recorded as [ADR-033](./decisions/033-json-path-queries-for-v1.md). The hybrid approach (C) remains viable for a future iteration when dbtype reaches implementation, and it doesn't require any changes to the metagraph data model, only to how the static table definitions are generated.

See OQ-17, OQ-18, OQ-19 in [open-questions.md](./open-questions.md) for the specific long-term questions that remain open beyond v1.

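To make the JSON path approach concrete, here is a minimal sketch of how attribute filters might be turned into a dialect-specific WHERE fragment. `buildAttributeFilter`, `AttributeFilter`, and `Dialect` are hypothetical names, not part of the storage package; the sketch assumes string-valued filters and a column named `attributes`, and it leaves execution to whatever query layer is in use.

```typescript
type Dialect = "sqlite" | "pg";

interface AttributeFilter {
  sql: string;      // WHERE fragment with placeholders
  params: string[]; // values bound to the placeholders, in order
}

// Attribute names are interpolated into the SQL text, so only plain
// identifiers are accepted. Values always go through placeholders.
const SAFE_KEY = /^[A-Za-z_][A-Za-z0-9_]*$/;

function buildAttributeFilter(
  dialect: Dialect,
  attributes: Record<string, string>,
): AttributeFilter {
  const clauses: string[] = [];
  const params: string[] = [];
  for (const key of Object.keys(attributes)) {
    if (!SAFE_KEY.test(key)) {
      throw new Error(`Unsupported attribute name: ${key}`);
    }
    params.push(attributes[key]);
    clauses.push(
      dialect === "sqlite"
        ? `json_extract(attributes, '$.${key}') = ?`
        : `attributes ->> '${key}' = $${params.length}`,
    );
  }
  // An empty filter yields an empty fragment; the caller decides how to handle it.
  return { sql: clauses.join(" AND "), params };
}

const sqliteFilter = buildAttributeFilter("sqlite", { status: "active" });
const pgFilter = buildAttributeFilter("pg", { status: "active", requestId: "req-42" });
```

Restricting attribute names to plain identifiers is a deliberate simplification: it keeps the interpolated path safe without needing a JSON path parser, at the cost of not supporting nested paths.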
### Decisions Required

- **OQ-17**: JSON path vs native columns vs hybrid for attribute queries (resolved for v1; see ADR-033)
- **OQ-18**: Auto-generated vs hand-written CRUD for static tables (resolved for v1; see ADR-033)
- **OQ-19**: Where the storage-operations bridge package should live (open)

## Constraints on Current Design

The forward-looking patterns documented here constrain the Module evolution design in [metagraph-module.md](./metagraph-module.md):

1. **The Module format must be self-contained**: `Type.Module({...})` entries with `Type.Ref` and `Type.Composite` are the same structures that a ujsx TypeBox Host would produce. If the Module format were an ad-hoc builder output, it couldn't be rendered by a different host later.
2. **Edge constraints must be schema entries, not just DB columns**: the constraint data needs to survive serialization/deserialization and be validatable independently. DB-only columns can't do this.
3. **The base attribute schemas (`BaseNode`, `BaseEdge`) must be TypeBox schemas**: not Drizzle column definitions, not builder-internal objects. Only TypeBox schemas can be composed via `Type.Composite`, referenced via `Type.Ref`, and serialized to JSON Schema.
4. **No ujsx dependency**: storage's Module-based graph types join the pipeline conceptually, not as a runtime dependency. The `Type.Module` output is the same shape that a ujsx HostConfig would produce, but storage doesn't need ujsx to create it. The alignment is structural, not dependent.
5. **Schemas-as-JSON enables `Value.Diff`/`Value.Patch`/`Value.Cast`**: because TypeBox Modules serialize to JSON Schema, the TypeBox value system can operate on schemas themselves (diff to detect changes, patch to update stored schemas, cast to migrate data). This is not possible if schemas are opaque builder objects or Drizzle column definitions. See [schema-evolution.md](./schema-evolution.md).

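Constraint 2 can be illustrated with a small dependency-free sketch: edge constraints held as plain data survive a JSON round-trip and can be checked without a database. The `EdgeConstraint` shape mirrors the `edgeConstraints` props shown earlier in this document, but the interface, the `isEdgeAllowed` helper, and the "unconstrained when absent" rule are assumptions made for illustration, not the storage package's actual behavior.

```typescript
// Hypothetical constraint shape, modeled on the edgeConstraints element props.
interface EdgeConstraint {
  edgeType: string;
  allowedSourceTypes: string[];
  allowedTargetTypes: string[];
}

function isEdgeAllowed(
  constraints: EdgeConstraint[],
  edgeType: string,
  sourceType: string,
  targetType: string,
): boolean {
  const constraint = constraints.filter((c) => c.edgeType === edgeType)[0];
  // Assumption: an edge type with no constraint entry is treated as unconstrained.
  if (constraint === undefined) return true;
  return (
    constraint.allowedSourceTypes.indexOf(sourceType) !== -1 &&
    constraint.allowedTargetTypes.indexOf(targetType) !== -1
  );
}

const edgeConstraints: EdgeConstraint[] = [
  { edgeType: "triggered", allowedSourceTypes: ["Call"], allowedTargetTypes: ["Call", "Subcall"] },
];

// Round-trip through JSON to show that the constraint data survives serialization.
const restoredConstraints = JSON.parse(JSON.stringify(edgeConstraints)) as EdgeConstraint[];

const callToSubcall = isEdgeAllowed(restoredConstraints, "triggered", "Call", "Subcall");
const subcallToCall = isEdgeAllowed(restoredConstraints, "triggered", "Subcall", "Call");
```

Because the constraints are ordinary data, the same check works on a freshly parsed copy as on the original, which is the property a DB-only column cannot offer on its own.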
## References

- ujsx pointer system: `/workspace/@alkdev/ujsx/src/core/pointer.ts`
- ujsx HostConfig adapter: `/workspace/@alkdev/ujsx/src/host/config.ts`
- dbtype architecture: `/workspace/@alkdev/dbtype/docs/architecture/README.md`
- dbtype elements: `/workspace/@alkdev/dbtype/docs/architecture/elements.md`
- dbtype module: `/workspace/@alkdev/dbtype/docs/architecture/module.md`
- dbtype repo adapter: `/workspace/@alkdev/dbtype/docs/architecture/repo-adapter.md`
- drizzle-graphql (reference for CRUD generation pattern): `/workspace/drizzle-graphql/`
- Operations registry: `/workspace/@alkdev/operations/docs/architecture/README.md`
- JPATH Module (JSONPath as TypeBox Module): `/workspace/research/typebox_research/ujsx/jpath.gen.ts`
- jsonpathly source: `/workspace/jsonpathly/`
- Module evolution spec: [metagraph-module.md](./metagraph-module.md)
- Schema evolution spec: [schema-evolution.md](./schema-evolution.md)
- ADR-033: JSON path queries and hand-written CRUD for v1