Files
storage/docs/architecture/forward-look.md
glm-5.1 ed8710a7f5 Clean up architecture specs: remove stale references, align docs with code, improve readability
- Replace stale DD references (DD3, DD6, DD9, DD10) with proper ADR links
- Fix 'Open Question 1' → OQ-01/OQ-03 cross-references
- Rewrite metagraph-module.md 'Why TypeBox Modules' to describe capabilities
  directly instead of framing as SchemaBuilder replacement
- Remove 'Transition from SchemaBuilder' section, replace with Source Structure
- Clean up implementation path: strikethrough phases → status table
- Fix data model diagram: remove non-existent nodeTypeId, fix EdgeType label
- Align EdgeConstraints examples with actual code (add default values)
- Clarify validateNode/validateEdge error behavior in docs
- Align EncryptedDataSchema code example with actual implementation
- Fix overview.md: correct dependency table, update current state, fix TypeBox URL
- Fix forward-look.md garbled text about dbtype element migration
- Fix open-questions.md: correct OQ count (4→7 open), add summary table
- Update doc statuses: schema-evolution, encrypted-data, open-questions → reviewed
- Update AGENTS.md to reflect current implementation state
2026-05-30 09:12:24 +00:00

269 lines
11 KiB
Markdown

---
status: draft
last_updated: 2026-05-30
---
# Forward Look: Pointers, dbtype, and Universal IR
How the Module-based metagraph connects to the broader @alkdev ecosystem —
typed graph pointers, dbtype table rendering, and the ujsx universal IR
pipeline. These are forward-looking designs that justify why certain structural
decisions were made now
(pointer abstraction deferred per [ADR-017](./decisions/017-pointer-abstraction-is-forward-looking.md),
dbtype integration deferred per [ADR-018](./decisions/018-dbtype-integration-is-post-v1.md)).
## Overview
Three packages in the @alkdev ecosystem share the same pipeline shape:
```
Schema (TypeBox Module) → Element Tree (ujsx) → Host (HostConfig)
```
| Package | Schema | Element tree | Host |
|---------|--------|-------------|------|
| `@alkdev/ujsx` | `UJSX` Module | `<element>`, `<root>` | DOM, custom |
| `@alkdev/dbtype` | Table/Column schemas | `<table>`, `<column>` | SQLite, PG, MySQL drizzle dialects |
| `@alkdev/storage` | `Metagraph` Module | ⚠️ Future: `<graphSchema>`, `<nodeType>` | ⚠️ Future: graph DB hosts |
When storage's graph type definitions align with the Module pattern, they
join this same pipeline. The immediate benefit is recursive/cross-referencing
schemas (today). The forward benefit is that graph type definitions, table
definitions, and pointer expressions can all be authored as ujsx element trees
rendered to different hosts.
## Pointer Abstraction
Addressing nodes and edges within a graph instance follows the same pattern as
ujsx's `ValuePointer` and `selectNode`/`setNode` — and the same pattern as
jsonpathly's JPATH Module for path expressions.
### ujsx's pointer system (proven)
ujsx already implements a reactive pointer system:
```ts
class ValuePointer<T> {
private _signal: Signal<T>;
private _path: string[];
get value(): T
set value(v: T)
get reactive(): ReadonlySignal<T>
get path(): string[]
}
function selectNode(root: UNode, path: string[]): UNode | undefined
function setNode(root: UNode, path: string[], value: UNode): UNode
```
This addresses elements within a ujsx tree by path segments (child indices,
prop names). A graph instance has analogous structure: nodes identified by
key, edges identified by key, attributes addressed by JSON path.
### Graph pointer analogy
```ts
// ujsx pointer: element tree → path → value
selectNode(root, ["children", 0, "props", "name"])
// Graph pointer: graph instance → path → value
selectNode(graph, ["nodes", "call-001", "attributes", "requestId"])
```
The structural analogy:
| ujsx concept | Graph concept |
|-------------|---------------|
| Element tree root | Graph instance |
| `UNode` | Node or Edge |
| `path: string[]` | Key path: `["nodes", key]` or `["edges", key]` |
| `selectNode(root, path)` | `selectGraphNode(graph, path)` |
| `setNode(root, path, value)` | `setGraphNode(graph, path, value)` (via repository) |
### JPATH Module (jsonpathly)
The research shows that JSONPath expressions can themselves be a TypeBox Module
(`JPATH = Type.Module({...})` with recursive `Type.Ref("Subscript")`). This means
pointer paths are not just runtime strings — they're typed schemas that can be
validated and composed.
For graph storage, this opens the possibility of **typed graph queries** — a
pointer expression like `nodes.call-001.attributes.requestId` has a schema that
validates against the graph type's Module. If `CallNode` doesn't have a
`requestId` field, the pointer expression is invalid at compile time.
### Scope for v1
The pointer abstraction is a forward-looking design. For v1:
- **Repository functions** use direct key-based addressing:
`findNode(graphId, nodeKey)`, `findEdge(graphId, edgeKey)`
- **Attribute access** is untyped JSON retrieval:
`node.attributes.requestId`
- **The Module** validates attribute shapes, but query paths are strings
The jump to typed pointers requires either the JPATH Module (for path
validation) or ujsx-style `ValuePointer` with signals (for reactive graph
observation). Both are post-v1 concerns, but the graph type Module makes them
feasible because it provides the schema the pointer validates against.
## Relationship to @alkdev/dbtype
`@alkdev/dbtype` defines database schemas as ujsx element trees and renders them
to Drizzle dialects via HostConfig. Storage's SQLite/PG table definitions are a
natural consumer of this pipeline.
### Current vs. Future Table Definition
**Current** (manual Drizzle table defs):
```ts
export const graphTypes = sqliteTable("graph_types", {
id: text("id").primaryKey(),
name: text("name").notNull(),
config: text("config", { mode: "json" }).notNull(),
// ...
});
```
**Future** (dbtype element tree → HostConfig rendering):
```tsx
const GraphTypesEl = h("table", { name: "graph_types" },
h(IdColumn, {}),
h("column", { name: "name", type: "string", notNull: true }),
h("column", { name: "config", type: "json", mode: "json", notNull: true }),
h(AuditColumns, {}),
);
const root = createRoot(sqliteHost, {});
root.render(GraphTypesEl);
const drizzleTable = root.ctx.tables.graph_types;
```
### Why this matters for storage
1. **Single source of truth**: Today's `sqlite/tables/` and future `pg/tables/`
define the same shapes in two different Drizzle dialects. dbtype renders the
same element tree to both — no manual duplication.
2. **Schema extraction**: `extractTable()` produces both TypeBox schemas (for
validation) and column metadata (for Drizzle rendering) from the same tree.
Storage gets `SelectGraphType` and `InsertGraphType` schemas for free.
3. **Module alignment**: dbtype assembles extracted schemas into a
`Type.Module` for cross-table references. Storage's metagraph Module and
dbtype's table Module could share a namespace — the `graph_types.config`
column stores the JSON Schema from `Metagraph.Config`.
### v1 approach
For v1, storage continues with manual Drizzle table definitions. The dbtype
integration is deferred because:
- dbtype is Phase 0 (architecture complete, no implementation)
- The manual defs work and are well-understood
- The Module pattern for graph types can be adopted independently (no dbtype
dependency)
When dbtype reaches Phase 1 (implementation), storage can migrate from
Drizzle table definitions to dbtype elements one table at a time. The
Module-based graph type definitions are already compatible — they're both
TypeBox `Type.Module` objects.
## ujsx as Universal IR
The three packages (ujsx, dbtype, storage) share the same pipeline shape:
**Schema → Element Tree → Host**. This is not coincidental — ujsx is a
universal declarative IR, and different "render targets" are just different
HostConfigs.
### What this could look like
```tsx
// Graph type definitions as ujsx elements (future)
const CallGraphSchema = h("graphSchema", { name: "call-graph" },
h("config", { type: "directed", multi: false, allowSelfLoops: false }),
h("nodeType", { name: "call" },
h(BaseNode, {}),
h("attr", { name: "requestId", type: "string", required: true }),
h("attr", { name: "status", ref: "CallStatus" }),
),
h("edgeType", { name: "triggered" },
h(BaseEdge, {}),
h("attr", { name: "type", literal: "triggered" }),
),
h("edgeConstraints", { edgeType: "triggered",
allowedSourceTypes: ["Call"],
allowedTargetTypes: ["Call", "Subcall"] }),
);
```
Rendered to different hosts:
| Host | Output |
|------|--------|
| TypeBox Host | `Type.Module({ CallNode: ..., TriggeredEdge: ... })` |
| SQLite Host | `sqliteTable("node_types", { ... })` + `sqliteTable("edge_types", { ... })` |
| PG Host | `pgTable("node_types", { ... })` + `pgTable("edge_types", { ... })` |
| graphology Host | `SerializedGraph` format |
| Documentation Host | Mermaid diagram, typed API docs |
### What's real today vs. aspirational
| Capability | Status |
|-----------|--------|
| `Type.Module` for graph type definitions | ✅ Implemented — Metagraph, CallGraph, SecretGraph Modules |
| Bridge functions (`moduleToDbSchema`, `validateNode`, `validateEdge`) | ✅ Implemented |
| Reference graph type Modules (CallGraph, SecretGraph) | ✅ Implemented |
| Crypto utility (`encrypt`, `decrypt`, `generateEncryptionKey`, `EncryptedDataSchema`) | ✅ Implemented |
| Codegen from TypeScript interfaces → Module entries | ✅ TsToModule exists |
| dbtype element trees → Drizzle tables | ⚠️ dbtype Phase 0, no implementation |
| `<graphSchema>` ujsx elements | ⚠️ Conceptual — needs HostConfig design |
| Typed graph pointers via JPATH | ⚠️ Conceptual — needs JPATH Module design |
| Reactive graph observation via ValuePointer | ⚠️ Conceptual — needs signal integration |
The Module-based graph type definitions (this spec) are the **first concrete
step** in this pipeline. Everything else builds on having a `Type.Module` as
the schema source of truth.
## Constraints on Current Design
The forward-looking patterns documented here constrain the Module evolution
design in [metagraph-module.md](./metagraph-module.md):
1. **The Module format must be self-contained**`Type.Module({...})` entries
with `Type.Ref` and `Type.Composite` are the same structures that a ujsx
TypeBox Host would produce. If the Module format were an ad-hoc builder
output, it couldn't be rendered by a different host later.
2. **Edge constraints must be schema entries, not just DB columns** — the
constraint data needs to survive serialization/deserialization and be
validatable independently. DB-only columns can't do this.
3. **The base attribute schemas (`BaseNode`, `BaseEdge`) must be TypeBox
schemas** — not Drizzle column definitions, not builder-internal objects.
Only TypeBox schemas can be composed via `Type.Composite`, referenced via
`Type.Ref`, and serialized to JSON Schema.
4. **No ujsx dependency** — storage's Module-based graph types join the
pipeline conceptually, not as a runtime dependency. The `Type.Module`
output is the same shape that a ujsx HostConfig would produce, but storage
doesn't need ujsx to create it. The alignment is structural, not dependent.
5. **Schemas-as-JSON enables `Value.Diff`/`Value.Patch`/`Value.Cast`**
because TypeBox Modules serialize to JSON Schema, the TypeBox value system
can operate on schemas themselves (diff to detect changes, patch to update
stored schemas, cast to migrate data). This is not possible if schemas are
opaque builder objects or Drizzle column definitions. See
[schema-evolution.md](./schema-evolution.md).
## References
- ujsx pointer system: `/workspace/@alkdev/ujsx/src/core/pointer.ts`
- ujsx HostConfig adapter: `/workspace/@alkdev/ujsx/src/host/config.ts`
- dbtype architecture: `/workspace/@alkdev/dbtype/docs/architecture/README.md`
- dbtype elements: `/workspace/@alkdev/dbtype/docs/architecture/elements.md`
- dbtype module: `/workspace/@alkdev/dbtype/docs/architecture/module.md`
- JPATH Module (JSONPath as TypeBox Module): `/workspace/research/typebox_research/ujsx/jpath.gen.ts`
- jsonpathly source: `/workspace/jsonpathly/`
- Module evolution spec: [metagraph-module.md](./metagraph-module.md)
- Schema evolution spec: [schema-evolution.md](./schema-evolution.md)