diff --git a/AGENTS.md b/AGENTS.md index 936ed36..f87cd09 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -16,7 +16,7 @@ instances) from the earlier `@ade` prototype. ├── mod.ts # Re-exports graphs/ only (zero db deps) ├── deno.json # JSR config, imports, tasks, lint rules ├── src/ -│ ├── graphs/ # Schema types + SchemaBuilder (no db deps) +│ ├── graphs/ # Metagraph Module + bridge functions (no db deps) │ ├── sqlite/ # SQLite host (drizzle-orm/libsql) │ │ ├── tables/ # Drizzle table definitions │ │ ├── relations.ts # Drizzle relations @@ -28,7 +28,7 @@ instances) from the earlier `@ade` prototype. ### Subpath Exports (JSR/npm) -- `@alkdev/storage` → graphs types + SchemaBuilder (zero deps) +- `@alkdev/storage` → Metagraph Module, graph type definitions (zero deps) - `@alkdev/storage/sqlite` → SQLite tables, relations, client (drizzle-orm + libsql) - `@alkdev/storage/pg` → PostgreSQL tables, relations, client (NOT YET @@ -79,6 +79,7 @@ Key changes from the originals: `GRAPH_STATUS`) - `client.ts` refactored to be injectable - Module-level `db` and `client` exports removed +- `SchemaBuilder` removed — replaced by `Type.Module()` construction ## File Conventions @@ -99,6 +100,10 @@ See `docs/architecture/` for detailed specifications: - `overview.md` — Package purpose, exports, design decisions, open questions - `metagraph.md` — Core graph model, schema types, SchemaBuilder, attribute storage +- `metagraph-module.md` — Graph type definitions as TypeBox Modules (evolution + of metagraph.md), naming conventions, migration path +- `forward-look.md` — Connections to dbtype, graph pointers, ujsx universal IR + pipeline - `sqlite-host.md` — SQLite tables, relations, client factory, porting notes - `encrypted-data.md` — Encrypted data design (planned), crypto utility, node type modeling diff --git a/docs/architecture/forward-look.md b/docs/architecture/forward-look.md new file mode 100644 index 0000000..64003c8 --- /dev/null +++ b/docs/architecture/forward-look.md @@ -0,0 +1,256 
@@ +--- +status: draft +last_updated: 2026-05-30 +--- + +# Forward Look: Pointers, dbtype, and Universal IR + +How the Module-based metagraph connects to the broader @alkdev ecosystem — +typed graph pointers, dbtype table rendering, and the ujsx universal IR +pipeline. These are forward-looking designs that justify why certain +structural decisions are made now (DD9, DD10 in +[metagraph-module.md](./metagraph-module.md)). + +## Overview + +Three packages in the @alkdev ecosystem share the same pipeline shape: + +``` +Schema (TypeBox Module) → Element Tree (ujsx) → Host (HostConfig) +``` + +| Package | Schema | Element tree | Host | +|---------|--------|-------------|------| +| `@alkdev/ujsx` | `UJSX` Module | ``, `` | DOM, custom | +| `@alkdev/dbtype` | Table/Column schemas | ``, `` | SQLite, PG, MySQL drizzle dialects | +| `@alkdev/storage` | `Metagraph` Module | ⚠️ Future: ``, `` | ⚠️ Future: graph DB hosts | + +When storage's graph type definitions align with the Module pattern, they +join this same pipeline. The immediate benefit is recursive/cross-referencing +schemas (today). The forward benefit is that graph type definitions, table +definitions, and pointer expressions can all be authored as ujsx element trees +rendered to different hosts. + +## Pointer Abstraction + +Addressing nodes and edges within a graph instance follows the same pattern as +ujsx's `ValuePointer` and `selectNode`/`setNode` — and the same pattern as +jsonpathly's JPATH Module for path expressions. 
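As a rough illustration of path-based addressing over a graph instance, the sketch below walks a plain object by string segments. The `selectByPath` helper and the `{ nodes, edges }` object shape are assumptions made for this example only; they are not the ujsx or storage API.

```typescript
// Hypothetical helper: walk a plain object by string path segments.
function selectByPath(root: unknown, path: string[]): unknown {
  let current: unknown = root;
  for (const segment of path) {
    // Stop as soon as the path leaves object territory.
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

// Assumed in-memory shape: nodes and edges keyed by their string keys.
const graph = {
  nodes: { "call-001": { attributes: { requestId: "req-1" } } },
  edges: {},
};

// Resolves an attribute on an existing node.
const requestId = selectByPath(graph, ["nodes", "call-001", "attributes", "requestId"]);
// A path through a missing node yields undefined instead of throwing.
const missing = selectByPath(graph, ["nodes", "call-999", "attributes"]);
```

In this sketch the path is an untyped `string[]`, so nothing ties it to the graph type's schema.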
+ +### ujsx's pointer system (proven) + +ujsx already implements a reactive pointer system: + +```ts +class ValuePointer<T> { + private _signal: Signal<T>; + private _path: string[]; + get value(): T + set value(v: T) + get reactive(): ReadonlySignal<T> + get path(): string[] +} + +function selectNode(root: UNode, path: string[]): UNode | undefined +function setNode(root: UNode, path: string[], value: UNode): UNode +``` + +This addresses elements within a ujsx tree by path segments (child indices, +prop names). A graph instance has analogous structure: nodes identified by +key, edges identified by key, attributes addressed by JSON path. + +### Graph pointer analogy + +```ts +// ujsx pointer: element tree → path → value +selectNode(root, ["children", 0, "props", "name"]) + +// Graph pointer: graph instance → path → value +selectNode(graph, ["nodes", "call-001", "attributes", "requestId"]) +``` + +The structural analogy: + +| ujsx concept | Graph concept | +|-------------|---------------| +| Element tree root | Graph instance | +| `UNode` | Node or Edge | +| `path: string[]` | Key path: `["nodes", key]` or `["edges", key]` | +| `selectNode(root, path)` | `selectGraphNode(graph, path)` | +| `setNode(root, path, value)` | `setGraphNode(graph, path, value)` (via repository) | + +### JPATH Module (jsonpathly) + +The research shows that JSONPath expressions can themselves be a TypeBox Module +(`JPATH = Type.Module({...})` with recursive `Type.Ref("Subscript")`). This means +pointer paths are not just runtime strings — they're typed schemas that can be +validated and composed. + +For graph storage, this opens the possibility of **typed graph queries** — a +pointer expression like `nodes.call-001.attributes.requestId` has a schema that +validates against the graph type's Module. If `CallNode` doesn't have a +`requestId` field, the pointer expression is invalid at compile time. + +### Scope for v1 + +The pointer abstraction is a forward-looking design. 
For v1: + +- **Repository functions** use direct key-based addressing: + `findNode(graphId, nodeKey)`, `findEdge(graphId, edgeKey)` +- **Attribute access** is untyped JSON retrieval: + `node.attributes.requestId` +- **The Module** validates attribute shapes, but query paths are strings + +The jump to typed pointers requires either the JPATH Module (for path +validation) or ujsx-style `ValuePointer` with signals (for reactive graph +observation). Both are post-v1 concerns, but the graph type Module makes them +feasible because it provides the schema the pointer validates against. + +## Relationship to @alkdev/dbtype + +`@alkdev/dbtype` defines database schemas as ujsx element trees and renders them +to Drizzle dialects via HostConfig. Storage's SQLite/PG table definitions are a +natural consumer of this pipeline. + +### Current vs. Future Table Definition + +**Current** (manual Drizzle table defs): + +```ts +export const graphTypes = sqliteTable("graph_types", { + id: text("id").primaryKey(), + name: text("name").notNull(), + config: text("config", { mode: "json" }).notNull(), + // ... +}); +``` + +**Future** (dbtype element tree → HostConfig rendering): + +```tsx +const GraphTypesEl = h("table", { name: "graph_types" }, + h(IdColumn, {}), + h("column", { name: "name", type: "string", notNull: true }), + h("column", { name: "config", type: "json", mode: "json", notNull: true }), + h(AuditColumns, {}), +); + +const root = createRoot(sqliteHost, {}); +root.render(GraphTypesEl); +const drizzleTable = root.ctx.tables.graph_types; +``` + +### Why this matters for storage + +1. **Single source of truth**: Today's `sqlite/tables/` and future `pg/tables/` + define the same shapes in two different Drizzle dialects. dbtype renders the + same element tree to both — no manual duplication. +2. **Schema extraction**: `extractTable()` produces both TypeBox schemas (for + validation) and column metadata (for Drizzle rendering) from the same tree. 
+ Storage gets `SelectGraphType` and `InsertGraphType` schemas for free. +3. **Module alignment**: dbtype assembles extracted schemas into a + `Type.Module` for cross-table references. Storage's metagraph Module and + dbtype's table Module could share a namespace — the `graph_types.config` + column stores the JSON Schema from `Metagraph.Config`. + +### v1 approach + +For v1, storage continues with manual Drizzle table definitions. The dbtype +integration is a post-v1 migration path because: + +- dbtype is Phase 0 (architecture complete, no implementation) +- The manual defs work and are well-understood +- The Module pattern for graph types can be adopted independently (no dbtype + dependency) + +When dbtype reaches Phase 1 (implementation), storage can migrate table defs +to dbtype elements one table at a time. The Module-based graph type definitions +are already compatible — they're both TypeBox `Type.Module` objects. + +## ujsx as Universal IR + +The three packages (ujsx, dbtype, storage) share the same pipeline shape: +**Schema → Element Tree → Host**. This is not coincidental — ujsx is a +universal declarative IR, and different "render targets" are just different +HostConfigs. 
+ +### What this could look like + +```tsx +// Graph type definitions as ujsx elements (future) +const CallGraphSchema = h("graphSchema", { name: "call-graph" }, + h("config", { type: "directed", multi: false, allowSelfLoops: false }), + h("nodeType", { name: "call" }, + h(BaseNode, {}), + h("attr", { name: "requestId", type: "string", required: true }), + h("attr", { name: "status", ref: "CallStatus" }), + ), + h("edgeType", { name: "triggered" }, + h(BaseEdge, {}), + h("attr", { name: "type", literal: "triggered" }), + ), + h("edgeConstraints", { edgeType: "triggered", + allowedSourceTypes: ["Call"], + allowedTargetTypes: ["Call", "Subcall"] }), +); +``` + +Rendered to different hosts: + +| Host | Output | +|------|--------| +| TypeBox Host | `Type.Module({ CallNode: ..., TriggeredEdge: ... })` | +| SQLite Host | `sqliteTable("node_types", { ... })` + `sqliteTable("edge_types", { ... })` | +| PG Host | `pgTable("node_types", { ... })` + `pgTable("edge_types", { ... })` | +| graphology Host | `SerializedGraph` format | +| Documentation Host | Mermaid diagram, typed API docs | + +### What's real today vs. aspirational + +| Capability | Status | +|-----------|--------| +| `Type.Module` for graph type definitions | ✅ Ready to implement now | +| Codegen from TypeScript interfaces → Module entries | ✅ TsToModule exists | +| dbtype element trees → Drizzle tables | ⚠️ dbtype Phase 0, no implementation | +| `` ujsx elements | ⚠️ Conceptual — needs HostConfig design | +| Typed graph pointers via JPATH | ⚠️ Conceptual — needs JPATH Module design | +| Reactive graph observation via ValuePointer | ⚠️ Conceptual — needs signal integration | + +The Module-based graph type definitions (this spec) are the **first concrete +step** in this pipeline. Everything else builds on having a `Type.Module` as +the schema source of truth. 
+ +## Constraints on Current Design + +The forward-looking patterns documented here constrain the Module evolution +design in [metagraph-module.md](./metagraph-module.md): + +1. **The Module format must be self-contained** — `Type.Module({...})` entries + with `Type.Ref` and `Type.Composite` are the same structures that a ujsx + TypeBox Host would produce. If the Module format were an ad-hoc builder + output, it couldn't be rendered by a different host later. + +2. **Edge constraints must be schema entries, not just DB columns** — the + constraint data needs to survive serialization/deserialization and be + validatable independently. DB-only columns can't do this. + +3. **The base attribute schemas (`BaseNode`, `BaseEdge`) must be TypeBox + schemas** — not Drizzle column definitions, not builder-internal objects. + Only TypeBox schemas can be composed via `Type.Composite`, referenced via + `Type.Ref`, and serialized to JSON Schema. + +4. **No ujsx dependency** — storage's Module-based graph types join the + pipeline conceptually, not as a runtime dependency. The `Type.Module` + output is the same shape that a ujsx HostConfig would produce, but storage + doesn't need ujsx to create it. The alignment is structural, not dependent. 
+ +## References + +- ujsx pointer system: `/workspace/@alkdev/ujsx/src/core/pointer.ts` +- ujsx HostConfig adapter: `/workspace/@alkdev/ujsx/src/host/config.ts` +- dbtype architecture: `/workspace/@alkdev/dbtype/docs/architecture/README.md` +- dbtype elements: `/workspace/@alkdev/dbtype/docs/architecture/elements.md` +- dbtype module: `/workspace/@alkdev/dbtype/docs/architecture/module.md` +- JPATH Module (JSONPath as TypeBox Module): `/workspace/research/typebox_research/ujsx/jpath.gen.ts` +- jsonpathly source: `/workspace/jsonpathly/` +- Module evolution spec: [metagraph-module.md](./metagraph-module.md) \ No newline at end of file diff --git a/docs/architecture/metagraph-module.md b/docs/architecture/metagraph-module.md new file mode 100644 index 0000000..547dfda --- /dev/null +++ b/docs/architecture/metagraph-module.md @@ -0,0 +1,842 @@ +--- +status: draft +last_updated: 2026-05-30 +--- + +# Metagraph as TypeBox Module + +Graph type definitions as `Type.Module` — aligning with the ujsx pattern for +recursive schemas, cross-package references, codegen, and graphology serialization. + +## Overview + +A graph type definition is naturally a TypeBox Module. It has named entries +(node types, edge types, config) that reference each other with `Type.Ref()`, +compose with `Type.Composite()`, and can cross-reference other Modules with +`Import()`. This is the same pattern used by `@alkdev/ujsx` (where `UJSX` is +a Module with `UPrimitive`, `UElement`, `URoot`, `UNode` recursively referencing +each other). + +The current `SchemaBuilder` produces a flat `GraphSchema` object — an ad-hoc +`Record` + `Record`. This works but +creates friction: + +1. **No cross-graph-type references** — a call graph node type can't reference + `CallStatus` from `@alkdev/flowgraph` without manual `Type.Intersect` + composition. Each package defines schemas independently, duplicating types. +2. 
**No graphology compatibility** — the schema output is a flat JSON object, + not a format that maps to graphology's `import()`/`export()`. Consumers + manually map node/edge attributes. +3. **No codegen leverage** — `TsToModule` generates TypeBox Modules from + TypeScript interfaces. The SchemaBuilder can't consume Module output, so + codegen-produced types must be manually translated. + +The Module approach treats each graph type as a `Type.Module`, aligning storage +with how ujsx already works — recursive types via `Ref`, composition via +`Composite`, cross-references via `Import`. + +For the forward-looking view of how this connects to dbtype, graph pointers, +and the ujsx universal IR pipeline, see [forward-look.md](./forward-look.md). + +## The Pattern (Proven in ujsx) + +`@alkdev/ujsx` already uses this pattern (ADR-002: "TypeBox Module as type +registry"): + +```ts +// ujsx: schema.ts +export const UJSX = Type.Module({ + UPrimitive: Type.Union([Type.String(), Type.Number(), Type.Boolean(), Type.Null()]), + PropValue: Type.Union([..., Type.Ref("UNode"), ...]), + UniversalProps: Type.Object({}, { additionalProperties: Type.Union([Type.Ref("PropValue"), Type.Undefined()]) }), + UElement: Type.Object({ + type: Type.String(), + props: Type.Ref("UniversalProps"), + children: Type.Array(Type.Ref("UNode")), // recursive! + }), + URoot: Type.Object({ + type: Type.Literal("root"), + props: Type.Ref("UniversalProps"), + children: Type.Array(Type.Ref("UNode")), // recursive! 
+ }), + UNode: Type.Union([Type.Ref("UPrimitive"), Type.Ref("UElement"), Type.Ref("URoot")]), +}); +``` + +Key properties: +- **`Type.Ref("UNode")`** resolves within the Module's `$defs` — recursive + references are natural +- **`UJSX.Import("UElement")`** lets other Modules reference ujsx types — the + referenced Module's `$defs` are embedded in the importing Module's JSON Schema +- **`Value.Check(UJSX.Import("UElement"), node)`** validates at runtime +- **`Static`** gives TypeScript types (or hand-written types for + non-serializable entries like `ComponentFn`) + +Graph type definitions have the same structure — named entries that reference +each other, with possible cross-references to other packages' Modules. + +## Proposed: GraphType as a TypeBox Module + +### Base Module: Metagraph + +The metagraph meta-schema itself is a Module: + +```ts +export const Metagraph = Type.Module({ + Config: Type.Object({ + type: Type.Union([ + Type.Literal("directed"), + Type.Literal("undirected"), + Type.Literal("mixed"), + ], { default: "mixed" }), + multi: Type.Boolean({ default: true }), + allowSelfLoops: Type.Boolean({ default: true }), + }), + + BaseNode: Type.Object({ + created: Type.Optional(Type.String({ format: "date-time" })), + modified: Type.Optional(Type.String({ format: "date-time" })), + metadata: Type.Optional(Type.Record(Type.String(), Type.Unknown())), + }), + + BaseEdge: Type.Object({ + type: Type.String(), + metadata: Type.Optional(Type.Record(Type.String(), Type.Unknown())), + }), +}); +``` + +### Concrete Graph Type: CallGraph + +A specific graph type is also a Module. 
It composes `BaseNode`/`BaseEdge` via +`Type.Composite()` (same as ujsx's `Mdast.Node: Type.Composite([Unist.Import("UnistNode"), ...])`): + +```ts +export const CallGraph = Type.Module({ + // Config is specific — literal values, not unions with defaults + Config: Type.Object({ + type: Type.Literal("directed"), + multi: Type.Literal(false), + allowSelfLoops: Type.Literal(false), + }), + + // Node types compose BaseNode (from Metagraph) with call-specific attributes + CallNode: Type.Composite([ + Metagraph.Import("BaseNode"), + Type.Object({ + requestId: Type.String(), + operationId: Type.String(), + status: Type.Ref("CallStatus"), + parentRequestId: Type.Optional(Type.String()), + input: Type.Unknown(), + output: Type.Optional(Type.Unknown()), + identity: Type.Optional(Type.Ref("Identity")), + startedAt: Type.Optional(Type.String({ format: "date-time" })), + completedAt: Type.Optional(Type.String({ format: "date-time" })), + }), + ]), + + SubcallNode: Type.Composite([ + Metagraph.Import("BaseNode"), + Type.Object({ + requestId: Type.String(), + parentRequestId: Type.String(), + operationId: Type.String(), + status: Type.Ref("CallStatus"), + }), + ]), + + // Edge types + TriggeredEdge: Type.Composite([ + Metagraph.Import("BaseEdge"), + Type.Object({ + type: Type.Literal("triggered"), + }), + ]), + + DependsOnEdge: Type.Composite([ + Metagraph.Import("BaseEdge"), + Type.Object({ + type: Type.Literal("depends_on"), + }), + ]), + + // Shared types referenced by node/edge entries + CallStatus: Type.Union([ + Type.Literal("pending"), + Type.Literal("running"), + Type.Literal("completed"), + Type.Literal("failed"), + Type.Literal("aborted"), + ]), + + Identity: Type.Object({ + id: Type.String(), + scopes: Type.Array(Type.String()), + resources: Type.Optional(Type.Record(Type.String(), Type.Array(Type.String()))), + }), +}); +``` + +### Cross-Module References + +`Module.Import()` allows one Module to reference entries from another: + +```ts +import { FlowGraph } from 
"@alkdev/flowgraph/schema"; + +const CallGraph = Type.Module({ + // ... + CallNode: Type.Composite([ + Type.Ref("BaseNode"), + Type.Object({ + status: FlowGraph.Import("CallStatus"), // from flowgraph + identity: Type.Optional(FlowGraph.Import("Identity")), // from flowgraph + // ... + }), + ]), +}); +``` + +This is exactly the `Mdast.Import("UnistNode")` pattern from the ujsx research. + +**⚠️ Import embedding**: `Module.Import()` embeds the referenced Module's `$defs` +into the importing Module's JSON Schema output. When `CallGraph` imports from +`FlowGraph`, the resulting JSON Schema includes all of `FlowGraph`'s definitions +in `$defs`. See DD6 for how the repository layer handles this. + +**Decision (DD6)**: The repository layer stores **dereferenced entry schemas** — +each `node_types` row gets its entry's resolved JSON Schema (with inline `$defs` +for just its transitive references), not the entire importing Module. This +avoids storage bloat and version coupling issues. + +### BaseNode/BaseEdge: Local Re-declaration vs Metagraph.Import + +`Type.Ref()` only resolves entries within the *same* Module. In the `CallGraph` +example above, `Type.Ref("BaseNode")` requires `BaseNode` to be an entry in the +`CallGraph` Module. There are two strategies for getting `BaseNode`/`BaseEdge` +into a concrete graph type Module: + +**Option A: Re-declare locally** (shown in the example above). Each concrete +Module includes its own `BaseNode`/`BaseEdge` entries. The schemas are identical +to `Metagraph.BaseNode`/`Metagraph.BaseEdge` — you copy them in. Simple, but +creates duplication. If the base schemas evolve, each concrete Module must be +updated independently. + +**Option B: Metagraph.Import**. The concrete Module imports from `Metagraph`: + +```ts +const CallGraph = Type.Module({ + CallNode: Type.Composite([ + Metagraph.Import("BaseNode"), + Type.Object({ requestId: Type.String(), ... 
}), + ]), +}); +``` + +This avoids duplication but embeds `Metagraph`'s `$defs` into `CallGraph`'s +JSON Schema output. For most cases, `Metagraph` is small (3 entries) so the +bloat is minimal. If `Metagraph` grows, this could become a concern. + +**Decision: Option B for same-package Modules (recommended), Option A as +fallback for external-package Modules**. + +For Modules defined within `@alkdev/storage` (like `CallGraph` in +`modules/call-graph.ts`), `Metagraph.Import("BaseNode")` has no circular +dependency issue — both `Metagraph` and `CallGraph` live in the same package. +The `Import` approach avoids duplication and keeps the base schemas in one +place. + +For Modules defined outside `@alkdev/storage` (e.g., in `@alkdev/flowgraph`), +Option A applies because external packages should not depend on storage's +`Metagraph` Module (see Open Question 1). Those packages re-declare their own +base schemas or define them independently. + +The v1 reference Modules in `modules/` should use Option B. If a future +consumer defines a `CallGraph` Module externally, they can choose either +approach — the schemas are structurally identical. + +**Verified**: `Type.Composite([Type.Ref("BaseNode"), Type.Object({...})])` +within a Module resolves correctly. Test confirms: `Value.Check(Module.Import("CallNode"), validData)` passes. + +### Type.Composite vs Type.Intersect + +The Module approach uses `Type.Composite` for extending `BaseNode`/`BaseEdge`, +not `Type.Intersect`. This matches the ujsx pattern where `Mdast.Node` is +`Type.Composite([Unist.Import("UnistNode"), Type.Object({...})])`. + +The difference: +- **`Type.Intersect`** creates a JSON Schema `allOf` — the result is a + `TIntersect` wrapper with nested schemas. Consumers must traverse `allOf` + to access properties. 
+- **`Type.Composite`** produces an **intersection evaluated into a flat + `TObject`** — overlapping keys are intersected via `IntersectEvaluated` + and the result is a single object with no `allOf` wrapper. The output + shape is `{ key1: Intersect([typeA, typeB]), key2: typeC, ... }`. + +**Both use intersection semantics for overlapping keys.** Composite is NOT +an `Object.assign` override — when overlapping keys have incompatible +types, the result is `never`. When overlapping keys have a subtype +relationship (like `Type.String()` and `Type.Literal("triggered")`), the +intersection resolves to the narrower type (`Type.Literal("triggered")`), +which is the correct behavior. + +**Why Composite over Intersect for graph types**: The output is a flat +`TObject` that maps directly to a node/edge attribute schema. `Intersect` +produces a `TIntersect` wrapper that would need unwrapping. For graph types +where base and concrete attributes have non-overlapping keys (most cases) +or subtype-only overlaps (like `type: Type.String()` → `type: Type.Literal(...)`), +Composite evaluates to the same result but in a more convenient shape. + +**Design constraint**: Do not use `Type.Composite` with overlapping keys of +incompatible types. If `BaseEdge` has `type: Type.String()` and a concrete +edge type needs `type: Type.Number()`, the intersection evaluates to `never`. +For graph types, this is not a concern — base and concrete keys either don't +overlap, or the overlap is a valid subtype narrowing (string → literal). + +### Config: Literal Values for Specific Graph Types + +The general `Metagraph.Config` has `Type.Union` with defaults (for +construction-time validation: "any valid config"). 
Specific graph types use +`Type.Literal` for frozen config values: + +```ts +// General (construction): Type.Union([Type.Literal("directed"), Type.Literal("undirected"), ...]) +// Specific (frozen): Type.Literal("directed") +``` + +The transition: consumer provides a general config → validated against +`Metagraph.Config` → the specific graph type Module uses `Type.Literal` to +freeze the value. The `SchemaBuilder` (during transition) performs this +narrowing automatically. + +### Edge Type Constraints: named constraint entries + +Edge type constraints (`allowedSourceTypes`/`allowedTargetTypes`) are **named +Module entries**, not columns bolted onto DB rows. This makes them first-class +parts of the schema — queryable, validatable, and composable: + +```ts +export const CallGraph = Type.Module({ + // ... + TriggeredEdge: Type.Composite([ + Type.Ref("BaseEdge"), + Type.Object({ type: Type.Literal("triggered") }), + ]), + TriggeredEdgeConstraints: Type.Object({ + edgeType: Type.Literal("triggered"), + allowedSourceTypes: Type.Array(Type.String()), // node type names: ["Call"] + allowedTargetTypes: Type.Array(Type.String()), // node type names: ["Call", "Subcall"] + }), + DependsOnEdge: Type.Composite([ + Type.Ref("BaseEdge"), + Type.Object({ type: Type.Literal("depends_on") }), + ]), + DependsOnEdgeConstraints: Type.Object({ + edgeType: Type.Literal("depends_on"), + allowedSourceTypes: Type.Array(Type.String()), + allowedTargetTypes: Type.Array(Type.String()), + }), +}); +``` + +**Why Module entries instead of DB columns** (DD7 revised): + +1. **Schema-level validation**: `Value.Check(CallGraph.TriggeredEdgeConstraints, data)` + validates that constraint data is well-formed. With DB columns, there's no + schema validation — just JSON arrays in text columns. +2. **Serialization**: The constraint entries serialize to JSON Schema with + `$defs`, enabling `Value.Diff` for migration detection and `FromSchema` + for round-tripping. +3. 
**DB mapping**: The `moduleToDbSchema()` function extracts + `*EdgeConstraints` entries and writes their `allowedSourceTypes`/ + `allowedTargetTypes` fields to the existing `edge_types` columns. The DB + schema doesn't change — the Module entries are the source of truth, the + DB columns are the persistence projection. + +**Why Type.String() not Type.Ref()**: The constraint arrays contain node type +*names* (strings like `"Call"`), not node type *schemas*. `Type.Ref("CallNode")` +would mean "an element must validate against the CallNode schema," which is +incorrect — the constraint is about which named node types are valid endpoints, +not about node data shapes. The naming convention (`*Node` suffix) provides an +implicit structural contract: string values in `allowedSourceTypes` should +correspond to `*Node` entry names in the same Module. This is enforced by +`moduleToDbSchema()` at Module-to-DB projection time, not by the schema itself. +See Open Question 4 for the `Type.Ref` vs `Type.String` trade-off. + +**Transition note**: The current DB schema stores `allowedSourceTypes` and +`allowedTargetTypes` as JSON text columns (arrays of strings, default `[]`). +In the Module, these become `Type.Array(Type.String())` entries — the DB +column values are the same string arrays. `moduleToDbSchema()` extracts them +directly. Read-path reconstruction resolves the names back to Module entries +for validation. + +**Empty array semantics**: In the DB, `[]` means "no restriction" (any node +type valid). In the Module, omitting the `*EdgeConstraints` entry means the +same thing. An explicit entry with empty arrays is not valid — it would mean +"no node types are valid at this endpoint," which is nonsensical. The +repository layer enforces this convention. 
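The conventions above (constraint values name `*Node` entries, and an explicit entry with empty arrays is invalid) can be sketched as a plain-data check. This is a minimal sketch, not the actual `moduleToDbSchema()` logic: `checkEdgeConstraints`, the `EdgeConstraints` interface, and the choice to flag an entry only when both arrays are empty are assumptions made for illustration.

```typescript
// Hypothetical plain-data view of one *EdgeConstraints entry.
interface EdgeConstraints {
  edgeType: string;
  allowedSourceTypes: string[];
  allowedTargetTypes: string[];
}

const NODE_SUFFIX = "Node";

// Returns human-readable problems; an empty list means the conventions hold.
function checkEdgeConstraints(
  entryNames: string[],
  constraints: EdgeConstraints[],
): string[] {
  // Node type names are entry names minus the suffix ("CallNode" -> "Call").
  // BaseNode is assumed to be a base schema, not a concrete node type.
  const nodeTypeNames = new Set(
    entryNames
      .filter((name) => name.endsWith(NODE_SUFFIX) && name !== "BaseNode")
      .map((name) => name.slice(0, name.length - NODE_SUFFIX.length)),
  );
  const problems: string[] = [];
  for (const c of constraints) {
    // Assumption: only an entry with BOTH arrays empty is rejected.
    if (c.allowedSourceTypes.length === 0 && c.allowedTargetTypes.length === 0) {
      problems.push(`${c.edgeType}: empty constraint entry; omit it for "no restriction"`);
      continue;
    }
    for (const name of [...c.allowedSourceTypes, ...c.allowedTargetTypes]) {
      if (!nodeTypeNames.has(name)) {
        problems.push(`${c.edgeType}: unknown node type "${name}"`);
      }
    }
  }
  return problems;
}

const entryNames = [
  "Config", "BaseNode", "BaseEdge", "CallNode", "SubcallNode",
  "TriggeredEdge", "DependsOnEdge",
];

const okProblems = checkEdgeConstraints(entryNames, [
  { edgeType: "triggered", allowedSourceTypes: ["Call"], allowedTargetTypes: ["Call", "Subcall"] },
]);

const badProblems = checkEdgeConstraints(entryNames, [
  { edgeType: "depends_on", allowedSourceTypes: [], allowedTargetTypes: [] },
  { edgeType: "triggered", allowedSourceTypes: ["Task"], allowedTargetTypes: ["Call"] },
]);
```

Returning a list of problems rather than throwing is a choice made for the sketch; a real projection step might fail fast instead.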
+ +### Entry Naming Convention + +Within a graph type Module, entries follow a naming convention that distinguishes +their role (DD8): + +| Suffix | Role | Maps to DB | +|--------|------|------------| +| `Config` | Graph configuration | `graph_types.config` | +| `*Node` | Node type attribute schema | `node_types.schema` | +| `*Edge` | Edge type attribute schema | `edge_types.schema` | +| `*EdgeConstraints` | Edge endpoint validation rules | `edge_types.allowedSourceTypes`/`allowedTargetTypes` | +| `*Enum` or bare name | Shared enum/type | Embedded in `node_types.schema`/`edge_types.schema` | +| `BaseNode`, `BaseEdge` | Base attribute schemas | Composed into `*Node`/`*Edge` entries | + +The `moduleToDbSchema()` function uses this convention to map Module entries to +the `node_types` and `edge_types` tables. Entries ending in `Node` become rows +with `name = entryNameWithoutSuffix ("Node")` and `schema = resolved entry`. +Same for `*Edge`. The `Config` entry maps to `graph_types.config`. + +## graphology Serialization Bridge + +The bridge between Modules and graphology is the `SerializedGraph` pattern that +`@alkdev/flowgraph` already uses: + +```ts +// flowgraph's current pattern (standalone schemas) +const CallGraphSerialized = SerializedGraph( + CallNodeAttrs, // node attribute schema + CallEdgeAttrs, // edge attribute schema + Type.Object({}), // graph-level attributes +); + +// Module pattern (entries from the Module) +const CallGraphSerialized = SerializedGraph( + CallGraph.CallNode, // entry from Module — resolves Refs through $defs + CallGraph.DependsOnEdge, // entry from Module + Type.Object({}), +); +``` + +Graphology's serialized format: + +```ts +{ + attributes: {}, // Graph-level attributes (empty for most graphs) + options: { + type: "directed", // From CallGraph.Config + multi: false, + allowSelfLoops: false, + }, + nodes: [ + { key: "call-001", attributes: { requestId, operationId, status, ... 
} }, + ], + edges: [ + { key: "call-001->call-002", source: "call-001", target: "call-002", + attributes: { type: "triggered" } }, + ], +} +``` + +The mapping: +- `CallGraph.Config` → `options` +- `CallGraph.CallNode` → validates `nodes[].attributes` +- `CallGraph.TriggeredEdge` → validates `edges[].attributes` + +This is **complementary** to `@alkdev/flowgraph`'s `SerializedGraph` — storage +produces the data, flowgraph operates on it in memory. The `SerializedGraph` +factory function stays the same — its schema arguments now come from Module +entries instead of standalone schemas. The `moduleToDbSchema()` +function extracts per-entry schemas for DB storage; the `moduleToGraphology()` +function produces the graphology import format for hydration. + +## DB Persistence Bridge + +The repository layer maps Module entries to the existing 6-table schema: + +1. **`graph_types`** row: `name` = Module name, `config` = `CallGraph.Config` + JSON Schema (with defaults resolved) +2. **`node_types`** rows: one row per `*Node` entry, `name` = entry name + (minus `Node` suffix), `schema` = resolved entry JSON Schema +3. **`edge_types`** rows: one row per `*Edge` entry, `name` = entry name + (minus `Edge` suffix), `schema` = resolved entry JSON Schema, + `allowedSourceTypes`/`allowedTargetTypes` from constraint entries + +On read, the repository layer reconstructs the Module from DB rows: +`Value.Check(CallGraph.CallNode, node.attributes)` validates node data against +the Module entry. + +**`Module.Import()` embedding**: When a Module entry references entries from +another Module (e.g., `FlowGraph.Import("CallStatus")`), the JSON Schema for +that entry includes the referenced entries in `$defs`. The repository layer +stores the **dereferenced entry** — the resolved JSON Schema with inline `$defs` +for transitive references — not the entire importing Module. This avoids +duplicating all of FlowGraph's definitions in every CallGraph node_types row. 
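The suffix-based mapping from Module entry names to table rows can be sketched without TypeBox at all. The `classifyEntry` function and `ClassifiedEntry` shape below are hypothetical names used for illustration; they only mirror the naming convention described above and are not the package's actual bridge code.

```typescript
type EntryKind = "config" | "base" | "node" | "edge" | "constraints" | "shared";

interface ClassifiedEntry {
  kind: EntryKind;
  // DB row name for node/edge entries (suffix stripped); absent otherwise.
  rowName?: string;
}

// Classify a Module entry name by the naming convention.
function classifyEntry(entryName: string): ClassifiedEntry {
  if (entryName === "Config") return { kind: "config" };
  if (entryName === "BaseNode" || entryName === "BaseEdge") return { kind: "base" };
  if (entryName.endsWith("EdgeConstraints")) return { kind: "constraints" };
  if (entryName.endsWith("Node")) {
    return { kind: "node", rowName: entryName.slice(0, entryName.length - "Node".length) };
  }
  if (entryName.endsWith("Edge")) {
    return { kind: "edge", rowName: entryName.slice(0, entryName.length - "Edge".length) };
  }
  // Anything else is treated as a shared type embedded in other entries' schemas.
  return { kind: "shared" };
}

const callNode = classifyEntry("CallNode");
const dependsOn = classifyEntry("DependsOnEdge");
const constraintsEntry = classifyEntry("TriggeredEdgeConstraints");
const sharedEntry = classifyEntry("CallStatus");
```

The base entries are checked before the generic suffix rules so that `BaseNode` and `BaseEdge` never become rows of their own.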
+ +### Bridge Functions + +#### `moduleToDbSchema(module)` + +Maps a graph type Module to DB row values for the metagraph tables. + +```ts +interface DbGraphTypeRow { + name: string; + config: Record; +} + +interface DbNodeTypeRow { + name: string; + schema: Record; +} + +interface DbEdgeTypeRow { + name: string; + schema: Record; + allowedSourceTypes: string[]; + allowedTargetTypes: string[]; +} + +interface DbSchema { + graphType: DbGraphTypeRow; + nodeTypes: DbNodeTypeRow[]; + edgeTypes: DbEdgeTypeRow[]; +} + +function moduleToDbSchema(module: TModule): DbSchema +``` + +**Error behavior**: Throws on: +- Module entries that don't match any naming convention (`*Node`, `*Edge`, + `Config`, `*EdgeConstraints`, `*Enum`, `BaseNode`, `BaseEdge`). Bare names + without a recognized suffix are treated as shared types (embedded in other + entries' schemas), not as independent DB rows. +- `*EdgeConstraints` entries that reference edge type entries not present in + the Module (the `edgeType` field must match an `*Edge` entry name). +- `*EdgeConstraints` entries with empty `allowedSourceTypes` and + `allowedTargetTypes` arrays (empty = "no types allowed", which is + nonsensical; omit the entry instead for "no restriction"). +- Module without a `Config` entry (all graph types require configuration). + +#### `validateNode(module, entryName, data)` / `validateEdge(module, entryName, data)` + +Validates node or edge data against a Module entry. + +```ts +function validateNode(module: TModule, entryName: string, data: unknown): boolean +function validateEdge(module: TModule, entryName: string, data: unknown): boolean +``` + +Returns `true` if data passes `Value.Check` against the resolved Module entry. +Throws if `entryName` doesn't match an `*Node`/`*Edge` entry in the Module. +Does NOT throw on invalid data — returns `false`. + +### Type.Any vs Type.Unknown + +The existing `types.ts` uses `Type.Any()` for `metadata` and `schema` fields. +The Module examples use `Type.Unknown()`. 
Both serialize to the same empty JSON Schema (`{}`) and accept any value; +they differ in the static TypeScript type they infer: + +- `Type.Any()` → `{}`, inferred as `any` (no validation, no static checking at use sites) +- `Type.Unknown()` → `{}`, inferred as `unknown` (no validation, but callers must narrow before use) + +For the Module approach, **`Type.Unknown()` is canonical**. It's the more +explicit choice — it communicates "this field stores arbitrary data, no +validation applied." `Type.Any()` is a legacy from the original TypeBox API. +The existing `types.ts` schemas should be aligned to `Type.Unknown()` during +the Module migration (Phase 1). + +### Performance Expectations + +Graph type Modules are small — typically 5–20 entries (one Config, 2–5 node +types, 2–5 edge types, 2–5 shared types, 2–5 constraint entries). The +`Value.Check` cost scales with schema complexity, not Module size; only the +resolved entry schema is checked, not the entire Module. + +The dereferenced entry strategy (DD6) means each DB row stores only its own +JSON Schema with transitive `$defs` — typically 1–3 KB per entry. A full +graph type's schemas total ~10–50 KB in the DB. This is negligible compared +to the node/edge data being stored. + +"Validate on read" (Open Question 5) has a per-read cost. For +high-throughput paths, the repository layer can cache the resolved Module +entry locally after first read, avoiding repeated `Value.Check` for known-good +data. This is a repository-layer optimization, not a Module design concern. + +## Codegen Path + +`TsToModule` generates TypeBox Modules from TypeScript interfaces. The path from +TypeScript to graph type: + +``` +TypeScript interface → TsToModule.Generate() → TypeBox Module entry +@alkdev/flowgraph CallNodeAttrs → flowgraph schema.ts → FlowGraph Module +@alkdev/taskgraph TaskNodeAttrs → taskgraph schema.ts → TaskGraph Module +@alkdev/operations Identity → operations types.ts → Operations Module +``` + +Since flowgraph already defines `CallNodeAttrs` as a standalone TypeBox schema, +the codegen can produce a Module entry from it. 
Storage's `CallGraph` Module then +composes `BaseNode` with `CallNodeAttrs` via `Type.Composite`, or imports from +the flowgraph Module if flowgraph exports one (see Open Question 1). + +## SchemaBuilder.build() → Module Equivalence + +The current `SchemaBuilder.build()` returns a `GraphSchema` — a flat object with +`config` plus `nodeTypes` and `edgeTypes` records. +A `Type.Module` with the same entries is essentially the same thing. + +### What the builder does internally + +``` +SchemaBuilder + .config({ type: "directed", multi: false }) + .nodeType("call", CallNodeSchema) + .edgeType("triggered", EdgeSchema, { allowedSourceTypes: ["call"] }) + .build() + +internally builds: + +defs = { + Config: Type.Object({ type: Literal("directed"), multi: Literal(false), ... }), + CallNode: CallNodeSchema, + TriggeredEdge: EdgeSchema, + TriggeredEdgeConstraints: Type.Object({ edgeType: Literal("triggered"), ... }), +} +return Type.Module(defs) +``` + +Conceptually, the result changes from `GraphSchema` (flat object) to +`TModule` (TypeBox Module). Because the `SchemaBuilder` is removed, consumers +build that Module directly with `Type.Module()`. + +### Why this works + +The `SchemaBuilder` was always building a module — it just didn't have a +module system to target. Named entries referencing each other via strings is +exactly what `Type.Ref()` does natively. The Module format: + +- Gives `Type.Ref()` instead of loose schema objects +- Gives `Module.Import()` instead of `Type.Intersect` for cross-package refs +- Gives JSON Schema `$defs` that map directly to DB storage +- Gives `Value.Check`, `Value.Diff`, `Value.Errors` on the full type system +- Gives codegen compatibility via `TsToModule.Generate()` + +For the forward-looking connections (typed graph pointers, dbtype table +rendering, ujsx HostConfig for graph schemas), see +[forward-look.md](./forward-look.md). + +## Design Decisions + +### DD1: Module replaces SchemaBuilder + +The SchemaBuilder is replaced by TypeBox Modules. 
The Module format provides +what SchemaBuilder was building toward, but natively: +- Named references → `Type.Ref()` instead of loose schema objects +- Cross-module imports → `Module.Import()` instead of `Type.Intersect` +- JSON Schema `$defs` → maps directly to DB storage +- Codegen compatibility → `TsToModule.Generate()` produces Module entries + +### DD2: SchemaBuilder removed + +The `SchemaBuilder` is removed. Consumers use `Type.Module()` construction +directly, with `Type.Ref()`, `Type.Composite()`, and `Metagraph.Import()` +as the building blocks. The `moduleToDbSchema()` function replaces +`SchemaBuilder.build()` as the bridge from Module to DB rows. + +### DD3: Config as a Module entry with Literal values + +Specific graph type Modules use `Type.Literal` for config values. The general +`Metagraph.Config` with `Type.Union` and defaults is for construction-time +validation. The specific Module freezes the config to exact values. + +### DD4: Node/edge attribute schemas are Module entries, not `Type.Any()` + +At the application layer, node and edge attribute schemas are named Module entries +with full type safety (`CallGraph.CallNode`, not `schema: Type.Any()`). At the +DB storage layer, the meta-schemas (`NodeType`, `EdgeType`) still have +`schema: Type.Unknown()` because the DB stores arbitrary JSON Schema blobs — the +Module entries are the application-level validation, the DB is the persistence +layer. + +**Mapping**: The repository layer maps between Module entries and DB rows using +the naming convention (`*Node` → `node_types`, `*Edge` → `edge_types`, `Config` +→ `graph_types.config`). On read, it looks up the graph type's Module to get +the validation schema for each entry. + +### DD5: Graphology import/export as the bridge to in-memory graphs + +Storage produces data that `@alkdev/flowgraph`'s `FlowGraph.fromJSON()` and +`SerializedGraph` consume. The Module entries validate data flowing in both +directions. 
Storage doesn't need its own graphology dependency — it produces +the JSON format, flowgraph consumes it. + +### DD6: Repository stores dereferenced entry schemas + +To avoid `Module.Import()` embedding the full `$defs` of referenced Modules in +every DB row, the repository layer stores **dereferenced entry schemas** — each +`node_types` row gets its entry's resolved JSON Schema with just the transitive +`$defs` it needs, not the entire importing Module's definitions. + +### DD7: Edge type constraints as named Module entries, not DB columns + +Edge type constraints (`allowedSourceTypes`/`allowedTargetTypes`) are named +Module entries (e.g., `TriggeredEdgeConstraints` with `Type.Array(Type.String())` +fields), not just DB columns. This gives them schema validation and +serialization. The repository layer projects these entries to the existing +`edge_types` columns (arrays of node type name strings). The DB schema +doesn't change — the Module entries are the source of truth. + +**Revised from original DD7** which stored constraints only as DB columns. +Named entries are strictly more capable: they validate and serialize; +DB columns are their persistence projection. + +### DD8: Naming convention for Module entries + +Within a graph type Module, entries are named with role-distinguishing suffixes: +`*Node` for node types, `*Edge` for edge types, `Config` for graph configuration, +`*EdgeConstraints` for edge endpoint constraints, and bare names or `*Enum` for +shared types. `moduleToDbSchema()` uses this convention to map entries to DB +tables. + +**Alternative considered**: Explicit metadata/decorators on entries (e.g., +`{ kind: "nodeType", name: "call", schema: ... }`). Rejected because it adds +boilerplate without adding information — the suffix convention is simpler +and sufficient for the expected Module size (5–20 entries). 
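
The suffix convention in DD8 can be sketched as a small classifier over entry names. This is a minimal sketch, not the package's implementation: `classifyEntry`, `EntryKind`, and the rule that derives a type name by stripping the suffix and lower-casing the first letter (`CallNode` to `call`, mirroring the `.nodeType("call", ...)` example) are assumptions made for illustration.

```typescript
// Hypothetical roles an entry can play; the names are illustrative only.
type EntryKind = "config" | "base" | "node" | "edge" | "edgeConstraints" | "shared";

interface ClassifiedEntry {
  kind: EntryKind;
  // Derived DB type name (assumed rule: strip suffix, lower-case first letter).
  typeName: string | null;
}

// Suffixes are checked in order; "EdgeConstraints" comes before "Edge" for clarity.
const SUFFIXES: ReadonlyArray<readonly [string, EntryKind]> = [
  ["EdgeConstraints", "edgeConstraints"],
  ["Node", "node"],
  ["Edge", "edge"],
];

function lowerFirst(s: string): string {
  return s.charAt(0).toLowerCase() + s.slice(1);
}

function classifyEntry(entryName: string): ClassifiedEntry {
  // Fixed names are matched first so "BaseNode" is not read as a node type.
  if (entryName === "Config") return { kind: "config", typeName: null };
  if (entryName === "BaseNode" || entryName === "BaseEdge") {
    return { kind: "base", typeName: null };
  }
  for (const [suffix, kind] of SUFFIXES) {
    const hasSuffix =
      entryName.length > suffix.length &&
      entryName.slice(-suffix.length) === suffix;
    if (hasSuffix) {
      const stem = entryName.slice(0, entryName.length - suffix.length);
      return { kind, typeName: lowerFirst(stem) };
    }
  }
  // Bare names and *Enum entries are shared types, not independent DB rows.
  return { kind: "shared", typeName: null };
}
```

Under these assumptions, `classifyEntry("TriggeredEdgeConstraints")` yields the `edgeConstraints` kind with the type name `triggered`, which is the key a projection step would use to find the matching `edge_types` row.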
+ +### DD9: Pointer abstraction is forward-looking, not v1 + +The structural analogy between ujsx's `ValuePointer`/`selectNode`/`setNode` and +graph node/edge addressing is real, but implementing typed graph pointers (via +JPATH Module or reactive signals) is a post-v1 concern. For v1, repository +functions use direct key-based addressing and the Module validates attribute +shapes. The Module's existence makes typed pointers feasible later because +it provides the schema the pointer validates against. + +**Alternative considered**: Implement typed pointers in v1 via a lightweight +`GraphPointer` wrapper. Rejected because it requires either JPATH Module +dependency or reactive signal integration, both of which add complexity +without clear v1 benefit. Direct key-based addressing is sufficient. + +### DD10: dbtype integration is post-v1 + +`@alkdev/dbtype`'s UJSX→Module→Host pipeline can eliminate the manual dual +definition of SQLite/PG table schemas. But dbtype is Phase 0 (architecture +complete, no implementation). For v1, storage uses manual Drizzle table +definitions. The Module-based graph type definitions are compatible with dbtype +because both produce `Type.Module` objects — the integration path is clear. + +**Alternative considered**: Implement dbtype integration in v1 alongside Module +migration. Rejected because it adds a dependency on an unimplemented package +and the manual table definitions work well. The cost of deferring is continued +dual SQLite/PG maintenance, which is manageable for 6 metagraph tables. 
+ +## What Changes + +| Current | New | +|---------|-----| +| `types.ts` — standalone schemas | `modules/metagraph.ts` — `Metagraph` Module | +| `schemaBuilder.ts` — fluent builder | Removed — replaced by Module construction | +| `types.ts` — `BaseNodeAttributes`, `BaseEdgeAttributes` | `Metagraph` Module entries | +| `types.ts` — `GraphConfig`, `GraphStatus`, `GraphBaseType` | `Metagraph` Module entries + const objects | +| `allowedSourceTypes`/`allowedTargetTypes` as DB columns only | Named `*EdgeConstraints` Module entries (projected to DB columns) | +| No concrete graph type Modules | `modules/call-graph.ts`, `modules/acl-graph.ts`, etc. | +| No bridge between Module ↔ DB ↔ graphology | `bridge.ts` — validation, DB mapping, graphology format | + +## What Doesn't Change + +- **Database tables** — same 6 metagraph tables, same columns, same relations +- **SQLite host** — table definitions, relations, client factory unchanged +- **PostgreSQL host** (planned) — same shapes, different dialect +- **`@alkdev/typebox` dependency** — same. Modules are a core TypeBox feature +- **Encryption utility** — unchanged, can be a Module entry in `SecretGraph` +- **`allowedSourceTypes`/`allowedTargetTypes`** — same DB columns, same semantics + (Module entries are the source of truth, projected to DB columns by + `moduleToDbSchema()`) + +## Migration Path + +1. **Phase 1**: Add `Metagraph` Module, replace `types.ts` and remove + `schemaBuilder.ts`. Export Module construction API. +2. **Phase 2**: Add `bridge.ts` with `moduleToDbSchema()`, `validateNode()`, + `validateEdge()`. +3. **Phase 3**: Add `modules/` directory with reference graph type Modules + (call-graph, acl-graph, task-graph, secret-graph). These use + `Metagraph.Import()` for `BaseNode`/`BaseEdge` and `Type.Composite()` + for node/edge type composition. +4. **Phase 4**: Add `moduleToGraphology()` and `fromGraphologyExport()` for the + graphology bridge. Storage produces the format, flowgraph consumes it. 
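
The rejection rules specified for `moduleToDbSchema()` could be exercised in Phase 2 with checks along these lines. This is a hedged sketch over a hypothetical plain-data view of a Module, not the real bridge: `ModuleView`, `ConstraintView`, and `checkModuleView` are invented names, and `edgeEntry` is assumed to be the `*Edge` entry name already resolved from the constraint's `edgeType` field (that resolution is not shown).

```typescript
// Hypothetical, already-extracted view of one *EdgeConstraints entry.
interface ConstraintView {
  entryName: string; // e.g. "TriggeredEdgeConstraints"
  edgeEntry: string; // assumed: the *Edge entry it refers to
  allowedSourceTypes: string[];
  allowedTargetTypes: string[];
}

// Hypothetical view of a Module: entry names plus constraint values.
interface ModuleView {
  entryNames: string[];
  constraints: ConstraintView[];
}

// Throws on the structural problems the spec lists; returns nothing on success.
function checkModuleView(view: ModuleView): void {
  const names = new Set<string>(view.entryNames);

  // Every graph type requires a Config entry.
  if (!names.has("Config")) {
    throw new Error("Module has no Config entry");
  }

  for (const c of view.constraints) {
    // The referenced edge type must exist in the same Module.
    if (!names.has(c.edgeEntry)) {
      throw new Error(`${c.entryName} references missing entry ${c.edgeEntry}`);
    }
    // Both arrays empty means "no types allowed"; omit the entry instead.
    if (c.allowedSourceTypes.length === 0 && c.allowedTargetTypes.length === 0) {
      throw new Error(`${c.entryName} allows no source or target types`);
    }
  }
}
```

Keeping these checks separate from the row mapping would let the Phase 2 acceptance test assert on failures without touching the database.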
+ +Acceptance criteria per phase: +- **Phase 2 complete**: `moduleToDbSchema()` produces values compatible with all + 6 existing metagraph tables +- **Phase 3 complete**: Reference Modules validate against their flowgraph/taskgraph + counterparts + +## Relationship to Other Packages + +| Package | What changes | What stays | +|---------|-------------|------------| +| `@alkdev/storage` | `types.ts` → Module, `schemaBuilder.ts` → removed, new `modules/` and `bridge.ts` | Tables, relations, crypto, client factory | +| `@alkdev/flowgraph` | `CallNodeAttrs`, `CallEdgeAttrs`, `CallStatus` become Module entries (optional, exported from `/schema` subpath) | FlowGraph class, analysis, all runtime logic | +| `@alkdev/taskgraph` | `TaskGraphNodeAttributes`, `DependencyEdge` become Module entries (optional) | TaskGraph class, analysis, all runtime logic | +| `@alkdev/operations` | `Identity`, `AccessControl` become Module entries (optional) | Registry, call protocol, adapters | +| `@alkdev/pubsub` | No change | Transport layer | +| `@alkdev/ujsx` | No change (already a Module) | The pattern we're following | +| `@alkdev/dbtype` | No change (Phase 0) | Future: storage table defs could be dbtype element trees | + +## Open Questions + +1. **Should `@alkdev/flowgraph` export a `Type.Module`, or should storage define + its own entries with documented correspondence?** Flowgraph currently exports + `CallNodeAttrs` as a standalone `Type.Object`. To use `Import()`, flowgraph + needs to export a Module. But storage can start with standalone schemas and + `Type.Composite([BaseNode, CallNodeAttrs])` — no dependency on flowgraph. + Migrate to `Import()` when flowgraph provides a Module. **This avoids a + circular dependency: `@alkdev/storage` does NOT depend on `@alkdev/flowgraph`.** + +2. **Should concrete graph type Modules live in storage or in their respective + packages?** Call-graph attribute schemas are defined by flowgraph's domain, not + storage's. 
Storage provides the metagraph *framework* (the `Metagraph` Module + with `BaseNode`, `BaseEdge`, `Config`). Concrete graph types like `CallGraph` + could live either in storage (as reference implementations) or in their + respective packages (flowgraph exports `CallGraph` Module alongside + `CallNodeAttrs`). **Decision: Both.** Storage provides reference Modules in + `modules/` that consumers can use directly or replace. Flowgraph may also + export a Module — the two are compatible via Module `$defs`. + +3. ~~**How does `SchemaBuilder.build()` return a Module while maintaining backward + compat?**~~ **Resolved**: No backward compat needed (no existing consumers). + `SchemaBuilder` is removed. Consumers use `Type.Module()` construction + directly. See "SchemaBuilder.build() → Module Equivalence". + +4. **Should `*EdgeConstraints` entries use `Type.Ref("CallNode")` or + `Type.String()` for allowed source/target types?** Using `Type.Ref` + would mean "each element in the array must validate against the CallNode + schema," which is semantically wrong — the constraint is about which named + node types are valid endpoints, not about data shapes. Using `Type.String()` + matches the actual semantics (arrays of node type names) but loses the + structural link. **Decision: `Type.String()`** — the constraint arrays + contain names, not schemas. The naming convention provides an implicit + contract that string values should correspond to `*Node` entry names, + enforced by `moduleToDbSchema()` at projection time. + +5. **How does the graph pointer abstraction interact with the repository layer?** + For v1, repository functions use direct key-based addressing. Typed pointers + (JPATH Module, reactive ValuePointer) could layer on top of the repository + later. The key question: does the repository return raw data (untyped JSON), + or does it validate against the Module before returning? **Decision: validate + on read** — if the data doesn't match the Module entry, throw. 
This makes + typed pointers safe: any value you get from the repo conforms to the schema. + +## References + +- ujsx schema (proven Module pattern): `/workspace/@alkdev/ujsx/src/core/schema.ts` +- ujsx ADR-002 (Module as type registry): `/workspace/@alkdev/ujsx/docs/architecture/decisions/002-typebox-module-as-registry.md` +- ujsx schema docs: `/workspace/@alkdev/ujsx/docs/architecture/schema.md` +- TsToModule codegen: `/workspace/research/typebox_research/codegen/ts-to-module.ts` +- ujsx Module examples: `/workspace/research/typebox_research/ujsx/unist.gen.ts`, `/workspace/research/typebox_research/ujsx/mdast.gen.ts` +- Flowgraph schema (standalone TypeBox, not yet Module): `/workspace/@alkdev/flowgraph/src/schema/` +- Flowgraph SerializedGraph factory: `/workspace/@alkdev/flowgraph/src/schema/graph.ts` +- Forward-looking connections (pointers, dbtype, ujsx IR): [forward-look.md](./forward-look.md) +- Current metagraph model: [metagraph.md](./metagraph.md) +- Ecosystem integration: [overview.md](./overview.md) \ No newline at end of file diff --git a/docs/architecture/metagraph.md b/docs/architecture/metagraph.md index 3819147..7c41230 100644 --- a/docs/architecture/metagraph.md +++ b/docs/architecture/metagraph.md @@ -5,6 +5,11 @@ last_updated: 2026-05-28 # Metagraph Model +> **Superseded by [metagraph-module.md](./metagraph-module.md)** — graph type +> definitions are now TypeBox Modules, not standalone schemas + SchemaBuilder. +> This document describes the current (pre-Module) data model. The Module +> migration is specified in metagraph-module.md. + The core data model: graph types define schemas, node types define shapes, edge types define relationships, and typed graph instances hold actual data. diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md index 6d52331..7beac51 100644 --- a/docs/architecture/overview.md +++ b/docs/architecture/overview.md @@ -29,7 +29,7 @@ ecosystem. 
@alkdev/storage/ ├── mod.ts → re-exports graphs/ (zero db deps) ├── src/ -│ ├── graphs/ → schema types + SchemaBuilder (no db deps) +│ ├── graphs/ → Metagraph Module, bridge functions (no db deps) │ ├── sqlite/ → SQLite host (drizzle-orm/libsql) │ │ ├── tables/ → drizzle table definitions │ │ ├── relations.ts → drizzle relational mappings @@ -85,12 +85,16 @@ type with specific node types (operation call, subcall) and edge types This trades some query convenience for generality. Domain-specific queries are built on top of the graph query layer, not baked into table schemas. -### D3: SchemaBuilder as the primary API surface +### D3: Type.Module as the primary API surface -The `SchemaBuilder` fluent API is the intended way to construct graph type -definitions. It validates against TypeBox schemas at build time, ensuring that -graph/node/edge type definitions are structurally sound before they're persisted -to the database. +The `Type.Module()` construction API is the intended way to define graph type +definitions. The `Metagraph` Module provides base entries (`BaseNode`, +`BaseEdge`, `Config`); concrete graph types compose them via `Metagraph.Import()` +and `Type.Composite()`. The `SchemaBuilder` is removed. + +This replaces the earlier fluent builder pattern. The Module format provides +native `Type.Ref()` for internal references, `Module.Import()` for cross-package +references, and JSON Schema `$defs` that map directly to DB storage. ### D4: Injectable clients, no module-level side effects @@ -147,7 +151,7 @@ consumed by the hub and spokes, not by storage itself. 
### Implemented -- Graph schema types and SchemaBuilder +- Graph schema types and Metagraph Module (replaces SchemaBuilder) - SQLite host: 6 metagraph tables + actors table + Drizzle relations + client factory - TypeBox select/insert schemas generated from Drizzle tables (drizzlebox) @@ -296,6 +300,8 @@ storage node attributes and operations call events), they should either: ## References +- Metagraph Module evolution: [metagraph-module.md](./metagraph-module.md) +- Forward-looking connections: [forward-look.md](./forward-look.md) - Operations architecture: `/workspace/@alkdev/operations/docs/architecture/README.md` - Pubsub architecture: `/workspace/@alkdev/pubsub/docs/architecture/README.md` - Flowgraph architecture: `/workspace/@alkdev/flowgraph/docs/architecture/README.md`