docs: restructure metagraph-module.md for clarity and reduced redundancy
- Eliminate 4x redundancy on SchemaBuilder removal (was in Overview, Equivalence section, DD1, DD2)
- Remove forward references to DD numbers that break reading flow
- Separate specification from rationale (DDs capture decisions, body specifies)
- Fix Type.Ref inconsistency in Edge Constraints example (should use Metagraph.Import per DD2)
- Expand 'Why TypeBox Modules' with the three friction points it solves
- Add Performance subsection, Codegen Path, Transition table, Implementation Path
- Restore Relationship to Other Packages table
- Remove historical artifacts (SchemaBuilder equivalence internals, Type.Any migration notes)
- 887 lines → 694 lines (22% reduction)
@@ -1,16 +1,16 @@
|
||||
---
|
||||
status: draft
|
||||
last_updated: 2026-05-30
|
||||
last_updated: 2026-05-29
|
||||
---
|
||||
|
||||
# Metagraph as TypeBox Module
|
||||
|
||||
Graph type definitions as `Type.Module` — aligning with the ujsx pattern for
|
||||
recursive schemas, cross-package references, codegen, and graphology serialization.
|
||||
Graph type definitions as `Type.Module` — recursive schemas, cross-package
|
||||
references, and DB persistence.
|
||||
|
||||
## The Metagraph Data Model
|
||||
|
||||
The metagraph pattern is a three-level type system:
|
||||
The metagraph is a three-level type system:
|
||||
|
||||
1. **GraphType** — A class of graphs (e.g., "call-graph", "acl",
|
||||
"task-dependencies"). Defines structural constraints
|
||||
@@ -24,8 +24,8 @@ The metagraph pattern is a three-level type system:
|
||||
"can_read", "depends_on"). Each edge type has a TypeBox schema for its
|
||||
attributes. Optionally constrains which source/target node types are valid.
|
||||
|
||||
Then **Graph instances** belong to a graph type and contain **Nodes** and
|
||||
**Edges** conforming to those type definitions.
|
||||
**Graph instances** belong to a graph type and contain **Nodes** and **Edges**
|
||||
conforming to those type definitions.
|
||||
|
||||
```
|
||||
GraphType "call-graph" (directed, multi, self-loops allowed)
|
||||
@@ -42,7 +42,7 @@ Graph "session-abc-call-graph" (instance)
|
||||
│ └── attributes: { requestId, operationId, status, ... }
|
||||
├── Node "call-002" → nodeTypeId → NodeType "subcall"
|
||||
│ └── attributes: { requestId, parentRequestId, ... }
|
||||
└── Edge "edge-001" → edgeTypeId → EdgeType "triggered"
|
||||
└── Edge "edge-001" → edgeTypeId → EdgeType "triggered"
|
||||
└── attributes: { type: "triggered" }
|
||||
sourceNodeKey: "call-001"
|
||||
targetNodeKey: "call-002"
|
||||
@@ -54,83 +54,45 @@ Nodes and edges use a **composite identity model**: identified by
|
||||
`key` is the identity.
|
||||
|
||||
Node and edge attributes are stored as JSON text in SQLite (jsonb in PG). The
|
||||
graph type's schema defines what shape these attributes should have, but the
|
||||
database doesn't enforce the schema — all validation happens in the repository
|
||||
layer. See [schema-evolution.md](./schema-evolution.md) for how schemas change
|
||||
over time, and [sqlite-host.md](./sqlite-host.md) for the table definitions.
|
||||
graph type's schema defines the expected shape, but the database doesn't enforce
|
||||
it — validation happens in the repository layer. See
|
||||
[schema-evolution.md](./schema-evolution.md) for how schemas change over time,
|
||||
and [sqlite-host.md](./sqlite-host.md) for table definitions.
|
||||
|
||||
## Overview
|
||||
## Why TypeBox Modules
|
||||
|
||||
A graph type definition is naturally a TypeBox Module. It has named entries
|
||||
(node types, edge types, config) that reference each other with `Type.Ref()`,
|
||||
compose with `Type.Composite()`, and can cross-reference other Modules with
|
||||
`Import()`. This is the same pattern used by `@alkdev/ujsx` (where `UJSX` is
|
||||
a Module with `UPrimitive`, `UElement`, `URoot`, `UNode` recursively referencing
|
||||
each other).
|
||||
A graph type definition has named entries (node types, edge types, config) that
|
||||
reference each other. `Type.Module` is the natural fit:
|
||||
|
||||
The removed `SchemaBuilder` produced a flat `GraphSchema` object — an ad-hoc
|
||||
`Record<string, NodeType>` + `Record<string, EdgeType>`. This works but
|
||||
creates friction:
|
||||
- **`Type.Ref("CallStatus")`** — recursive and internal references resolve
|
||||
within the Module's `$defs`
|
||||
- **`Module.Import("CallStatus")`** — cross-package references embed the
|
||||
referenced Module's `$defs`
|
||||
- **`Value.Check(Module.Import("CallNode"), data)`** — runtime validation
|
||||
- **`Static<typeof Module>`** — TypeScript types from the Module
|
||||
|
||||
1. **No cross-graph-type references** — a call graph node type can't reference
|
||||
`CallStatus` from `@alkdev/flowgraph` without manual `Type.Intersect`
|
||||
composition. Each package defines schemas independently, duplicating types.
|
||||
2. **No graphology compatibility** — the schema output is a flat JSON object,
|
||||
not a format that maps to graphology's `import()`/`export()`. Consumers
|
||||
manually map node/edge attributes.
|
||||
This replaces the removed `SchemaBuilder`, which produced a flat
|
||||
`Record<string, NodeType>` + `Record<string, EdgeType>`. That approach had
|
||||
three limitations that Modules solve natively:
|
||||
|
||||
1. **No cross-graph-type references** — a call graph node type couldn't
|
||||
reference `CallStatus` from `@alkdev/flowgraph` without manual
|
||||
`Type.Intersect`. Each package duplicated types independently.
|
||||
2. **No graphology compatibility** — the flat JSON output didn't map to
|
||||
graphology's `import()`/`export()`. Consumers manually mapped node/edge
|
||||
attributes.
|
||||
3. **No codegen leverage** — `TsToModule` generates TypeBox Modules from
|
||||
TypeScript interfaces. The SchemaBuilder couldn't consume Module output, so
|
||||
codegen-produced types must be manually translated.
|
||||
TypeScript interfaces, but the builder couldn't consume Module output.
|
||||
|
||||
The Module approach treats each graph type as a `Type.Module`, aligning storage
|
||||
with how ujsx already works — recursive types via `Ref`, composition via
|
||||
`Composite`, cross-references via `Import`.
|
||||
This aligns with the pattern proven in `@alkdev/ujsx`, where `UJSX` is a Module
|
||||
with `UPrimitive`, `UElement`, `URoot`, `UNode` recursively referencing each
|
||||
other. See [forward-look.md](./forward-look.md) for how this connects to the
|
||||
broader ecosystem (codegen, graphology, dbtype).
|
||||
|
||||
For the forward-looking view of how this connects to dbtype, graph pointers,
|
||||
and the ujsx universal IR pipeline, see [forward-look.md](./forward-look.md).
|
||||
## Base Module: Metagraph
|
||||
|
||||
## The Pattern (Proven in ujsx)
|
||||
|
||||
`@alkdev/ujsx` already uses this pattern (ADR-002: "TypeBox Module as type
|
||||
registry"):
|
||||
|
||||
```ts
|
||||
// ujsx: schema.ts
|
||||
export const UJSX = Type.Module({
|
||||
UPrimitive: Type.Union([Type.String(), Type.Number(), Type.Boolean(), Type.Null()]),
|
||||
PropValue: Type.Union([..., Type.Ref("UNode"), ...]),
|
||||
UniversalProps: Type.Object({}, { additionalProperties: Type.Union([Type.Ref("PropValue"), Type.Undefined()]) }),
|
||||
UElement: Type.Object({
|
||||
type: Type.String(),
|
||||
props: Type.Ref("UniversalProps"),
|
||||
children: Type.Array(Type.Ref("UNode")), // recursive!
|
||||
}),
|
||||
URoot: Type.Object({
|
||||
type: Type.Literal("root"),
|
||||
props: Type.Ref("UniversalProps"),
|
||||
children: Type.Array(Type.Ref("UNode")), // recursive!
|
||||
}),
|
||||
UNode: Type.Union([Type.Ref("UPrimitive"), Type.Ref("UElement"), Type.Ref("URoot")]),
|
||||
});
|
||||
```
|
||||
|
||||
Key properties:
|
||||
- **`Type.Ref("UNode")`** resolves within the Module's `$defs` — recursive
|
||||
references are natural
|
||||
- **`UJSX.Import("UElement")`** lets other Modules reference ujsx types — the
|
||||
referenced Module's `$defs` are embedded in the importing Module's JSON Schema
|
||||
- **`Value.Check(UJSX.Import("UElement"), node)`** validates at runtime
|
||||
- **`Static<typeof UJSX>`** gives TypeScript types (or hand-written types for
|
||||
non-serializable entries like `ComponentFn`)
|
||||
|
||||
Graph type definitions have the same structure — named entries that reference
|
||||
each other, with possible cross-references to other packages' Modules.
|
||||
|
||||
## Proposed: GraphType as a TypeBox Module
|
||||
|
||||
### Base Module: Metagraph
|
||||
|
||||
The metagraph meta-schema itself is a Module:
|
||||
The metagraph meta-schema is a Module providing base entries that concrete
|
||||
graph types compose from:
|
||||
|
||||
```ts
|
||||
export const Metagraph = Type.Module({
|
||||
@@ -157,12 +119,26 @@ export const Metagraph = Type.Module({
|
||||
});
|
||||
```
|
||||
|
||||
### Concrete Graph Type: CallGraph
|
||||
- `Config` uses `Type.Union` with defaults for construction-time validation
|
||||
("any valid config"). Specific graph types narrow these to `Type.Literal`
|
||||
values.
|
||||
- `BaseNode` and `BaseEdge` provide common attribute schemas. Concrete graph
|
||||
types compose them via `Type.Composite`.
|
||||
- `metadata` and similar "arbitrary data" fields use `Type.Unknown()`
|
||||
(not `Type.Any()`). `Type.Unknown()` is canonical — it communicates "no
|
||||
validation applied" explicitly.
|
||||
|
||||
A specific graph type is also a Module. It composes `BaseNode`/`BaseEdge` via
|
||||
`Type.Composite()` (same as ujsx's `Mdast.Node: Type.Composite([Unist.Import("UnistNode"), ...])`):
|
||||
## Concrete Graph Type Modules
|
||||
|
||||
A specific graph type is also a `Type.Module`. It composes `BaseNode`/`BaseEdge`
|
||||
via `Metagraph.Import()` and `Type.Composite()`, narrows config to literal values,
|
||||
and defines its own node types, edge types, and shared types.
|
||||
|
||||
### Example: CallGraph
|
||||
|
||||
```ts
|
||||
import { Metagraph } from "./metagraph.ts";
|
||||
|
||||
export const CallGraph = Type.Module({
|
||||
// Config is specific — literal values, not unions with defaults
|
||||
Config: Type.Object({
|
||||
@@ -171,7 +147,7 @@ export const CallGraph = Type.Module({
|
||||
allowSelfLoops: Type.Literal(false),
|
||||
}),
|
||||
|
||||
// Node types compose BaseNode (from Metagraph) with call-specific attributes
|
||||
// Node types compose BaseNode with call-specific attributes
|
||||
CallNode: Type.Composite([
|
||||
Metagraph.Import("BaseNode"),
|
||||
Type.Object({
|
||||
@@ -229,9 +205,33 @@ export const CallGraph = Type.Module({
|
||||
});
|
||||
```
|
||||
|
||||
### Type.Composite, not Type.Intersect
|
||||
|
||||
Graph type Modules use `Type.Composite` to extend base schemas, not
|
||||
`Type.Intersect`. The difference:
|
||||
|
||||
- **`Type.Intersect`** produces a `TIntersect` wrapper with `allOf` — consumers
|
||||
must traverse `allOf` to access properties.
|
||||
- **`Type.Composite`** produces a flat `TObject` — overlapping keys are
|
||||
intersected via `IntersectEvaluated`, non-overlapping keys are merged.
|
||||
|
||||
Both use intersection semantics for overlapping keys. When overlapping keys have
|
||||
a subtype relationship (e.g., `type: Type.String()` → `type: Type.Literal("triggered")`),
|
||||
the intersection resolves to the narrower type, which is the correct behavior.
|
||||
|
||||
**Constraint**: Do not use `Type.Composite` with overlapping keys of incompatible
|
||||
types. If `BaseEdge` has `type: Type.String()` and a concrete edge type needs
|
||||
`type: Type.Number()`, the intersection evaluates to `never`. For graph types,
|
||||
this is not a concern — base and concrete keys either don't overlap, or the
|
||||
overlap is a valid subtype narrowing.
|
||||
|
||||
### Cross-Module References
|
||||
|
||||
`Module.Import()` allows one Module to reference entries from another:
|
||||
`Module.Import()` allows one Module to reference entries from another. In the
|
||||
CallGraph example, `Metagraph.Import("BaseNode")` embeds `Metagraph`'s `$defs`
|
||||
into `CallGraph`'s JSON Schema output.
|
||||
|
||||
This also works across packages:
|
||||
|
||||
```ts
|
||||
import { FlowGraph } from "@alkdev/flowgraph/schema";
|
||||
@@ -241,146 +241,76 @@ const CallGraph = Type.Module({
|
||||
CallNode: Type.Composite([
|
||||
Type.Ref("BaseNode"),
|
||||
Type.Object({
|
||||
status: FlowGraph.Import("CallStatus"), // from flowgraph
|
||||
identity: Type.Optional(FlowGraph.Import("Identity")), // from flowgraph
|
||||
// ...
|
||||
status: FlowGraph.Import("CallStatus"),
|
||||
identity: Type.Optional(FlowGraph.Import("Identity")),
|
||||
}),
|
||||
]),
|
||||
});
|
||||
```
|
||||
|
||||
This is exactly the `Mdast.Import("UnistNode")` pattern from the ujsx research.
|
||||
|
||||
**⚠️ Import embedding**: `Module.Import()` embeds the referenced Module's `$defs`
|
||||
into the importing Module's JSON Schema output. When `CallGraph` imports from
|
||||
**Import embedding**: `Module.Import()` embeds the referenced Module's `$defs`
|
||||
into the importing Module's JSON Schema. When `CallGraph` imports from
|
||||
`FlowGraph`, the resulting JSON Schema includes all of `FlowGraph`'s definitions
|
||||
in `$defs`. See DD6 for how the repository layer handles this.
|
||||
in `$defs`. The repository layer stores **dereferenced entry schemas** — each
|
||||
`node_types` row gets its entry's resolved JSON Schema (with inline `$defs` for
|
||||
just its transitive references), not the entire importing Module. This avoids
|
||||
storage bloat and version coupling (DD6).
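
The dereferenced-entry idea can be sketched without TypeBox as a reachability walk over a plain `$defs` map. This is only an illustration under stated assumptions: `collectRefs` and `dereferenceEntry` are hypothetical names, and references are assumed to have the form `#/$defs/<Name>`, which may differ from what the repository layer actually stores.

```ts
// Hypothetical JSON Schema value type; not part of the storage package.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type Defs = Record<string, Json>;

const REF_PREFIX = "#/$defs/"; // assumption: refs look like "#/$defs/<Name>"

// Collect every definition name referenced anywhere inside a schema.
function collectRefs(schema: Json, found: Set<string> = new Set<string>()): Set<string> {
  if (Array.isArray(schema)) {
    schema.forEach((item) => collectRefs(item, found));
  } else if (schema !== null && typeof schema === "object") {
    Object.keys(schema).forEach((key) => {
      const value = schema[key];
      if (key === "$ref" && typeof value === "string" && value.startsWith(REF_PREFIX)) {
        found.add(value.slice(REF_PREFIX.length));
      } else {
        collectRefs(value, found);
      }
    });
  }
  return found;
}

// Return the entry plus only the definitions reachable from it,
// instead of the importing Module's full `$defs`.
function dereferenceEntry(defs: Defs, entryName: string): Json {
  const entry = defs[entryName];
  if (entry === undefined) throw new Error(`Unknown entry: ${entryName}`);

  const inlined: Defs = {};
  const seen = new Set<string>();
  const queue = Array.from(collectRefs(entry));
  while (queue.length > 0) {
    const name = queue.pop() as string;
    if (seen.has(name)) continue;
    const def = defs[name];
    if (def === undefined) throw new Error(`Dangling $ref: ${name}`);
    seen.add(name);
    inlined[name] = def;
    collectRefs(def).forEach((next) => queue.push(next));
  }

  if (seen.size === 0) return entry; // nothing referenced, store the entry as-is
  return { ...(entry as { [key: string]: Json }), $defs: inlined };
}
```

With this shape, an entry that references `CallStatus` carries only `CallStatus` (and whatever `CallStatus` itself references) in its stored `$defs`; unrelated definitions from the imported Module are left out.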
|
||||
|
||||
**Decision (DD6)**: The repository layer stores **dereferenced entry schemas** —
|
||||
each `node_types` row gets its entry's resolved JSON Schema (with inline `$defs`
|
||||
for just its transitive references), not the entire importing Module. This
|
||||
avoids storage bloat and version coupling issues.
|
||||
### BaseNode/BaseEdge: Import vs Local Re-declaration
|
||||
|
||||
### BaseNode/BaseEdge: Local Re-declaration vs Metagraph.Import
|
||||
There are two ways to get `BaseNode`/`BaseEdge` into a concrete graph type Module:
|
||||
|
||||
`Type.Ref()` only resolves entries within the *same* Module. In the `CallGraph`
|
||||
example above, `Type.Ref("BaseNode")` requires `BaseNode` to be an entry in the
|
||||
`CallGraph` Module. There are two strategies for getting `BaseNode`/`BaseEdge`
|
||||
into a concrete graph type Module:
|
||||
- **`Metagraph.Import("BaseNode")`** — references the base Module directly.
|
||||
No duplication, but embeds `Metagraph`'s `$defs` (3 entries — minimal bloat).
|
||||
- **Local re-declaration** — copy the base schemas into the concrete Module.
|
||||
No `$defs` embedding, but duplication if `Metagraph` evolves.
|
||||
|
||||
**Option A: Re-declare locally** (shown in the example above). Each concrete
|
||||
Module includes its own `BaseNode`/`BaseEdge` entries. The schemas are identical
|
||||
to `Metagraph.BaseNode`/`Metagraph.BaseEdge` — you copy them in. Simple, but
|
||||
creates duplication. If the base schemas evolve, each concrete Module must be
|
||||
updated independently.
|
||||
**Decision**: Use `Metagraph.Import()` for Modules within `@alkdev/storage`
|
||||
(e.g., `modules/call-graph.ts`). Both Modules live in the same package, so
|
||||
there's no circular dependency. For Modules defined in external packages
|
||||
(e.g., `@alkdev/flowgraph`), re-declare base schemas locally — external
|
||||
packages should not depend on storage's `Metagraph` Module.
|
||||
|
||||
**Option B: Metagraph.Import**. The concrete Module imports from `Metagraph`:
|
||||
### Config: Literal Values Freeze the Configuration
|
||||
|
||||
The general `Metagraph.Config` uses `Type.Union` with defaults (for
|
||||
construction-time: "any valid config"). Specific graph types freeze these to
|
||||
`Type.Literal` values:
|
||||
|
||||
```ts
|
||||
const CallGraph = Type.Module({
|
||||
CallNode: Type.Composite([
|
||||
Metagraph.Import("BaseNode"),
|
||||
Type.Object({ requestId: Type.String(), ... }),
|
||||
]),
|
||||
});
|
||||
```
|
||||
// General: accepts any valid config
|
||||
Metagraph.Config // type: union of "directed"|"undirected"|"mixed", multi: boolean, ...
|
||||
|
||||
This avoids duplication but embeds `Metagraph`'s `$defs` into `CallGraph`'s
|
||||
JSON Schema output. For most cases, `Metagraph` is small (3 entries) so the
|
||||
bloat is minimal. If `Metagraph` grows, this could become a concern.
|
||||
|
||||
**Decision: Option B for same-package Modules (recommended), Option A as
|
||||
fallback for external-package Modules**.
|
||||
|
||||
For Modules defined within `@alkdev/storage` (like `CallGraph` in
|
||||
`modules/call-graph.ts`), `Metagraph.Import("BaseNode")` has no circular
|
||||
dependency issue — both `Metagraph` and `CallGraph` live in the same package.
|
||||
The `Import` approach avoids duplication and keeps the base schemas in one
|
||||
place.
|
||||
|
||||
For Modules defined outside `@alkdev/storage` (e.g., in `@alkdev/flowgraph`),
|
||||
Option A applies because external packages should not depend on storage's
|
||||
`Metagraph` Module (see Open Question 1). Those packages re-declare their own
|
||||
base schemas or define them independently.
|
||||
|
||||
The v1 reference Modules in `modules/` should use Option B. If a future
|
||||
consumer defines a `CallGraph` Module externally, they can choose either
|
||||
approach — the schemas are structurally identical.
|
||||
|
||||
**Verified**: `Type.Composite([Type.Ref("BaseNode"), Type.Object({...})])`
|
||||
within a Module resolves correctly. Test confirms: `Value.Check(Module.Import("CallNode"), validData)` passes.
|
||||
|
||||
### Type.Composite vs Type.Intersect
|
||||
|
||||
The Module approach uses `Type.Composite` for extending `BaseNode`/`BaseEdge`,
|
||||
not `Type.Intersect`. This matches the ujsx pattern where `Mdast.Node` is
|
||||
`Type.Composite([Unist.Import("UnistNode"), Type.Object({...})])`.
|
||||
|
||||
The difference:
|
||||
- **`Type.Intersect`** creates a JSON Schema `allOf` — the result is a
|
||||
`TIntersect` wrapper with nested schemas. Consumers must traverse `allOf`
|
||||
to access properties.
|
||||
- **`Type.Composite`** produces an **intersection evaluated into a flat
|
||||
`TObject`** — overlapping keys are intersected via `IntersectEvaluated`
|
||||
and the result is a single object with no `allOf` wrapper. The output
|
||||
shape is `{ key1: Intersect([typeA, typeB]), key2: typeC, ... }`.
|
||||
|
||||
**Both use intersection semantics for overlapping keys.** Composite is NOT
|
||||
an `Object.assign` override — when overlapping keys have varying (incompatible)
|
||||
types, the result is `never`. When overlapping keys have a subtype
|
||||
relationship (like `Type.String()` and `Type.Literal("triggered")`), the
|
||||
intersection resolves to the narrower type (`Type.Literal("triggered")`),
|
||||
which is the correct behavior.
|
||||
|
||||
**Why Composite over Intersect for graph types**: The output is a flat
|
||||
`TObject` that maps directly to a node/edge attribute schema. `Intersect`
|
||||
produces a `TIntersect` wrapper that would need unwrapping. For graph types
|
||||
where base and concrete attributes have non-overlapping keys (most cases)
|
||||
or subtype-only overlaps (like `type: Type.String()` → `type: Type.Literal(...)`),
|
||||
Composite evaluates to the same result but in a more convenient shape.
|
||||
|
||||
**Design constraint**: Do not use `Type.Composite` with overlapping keys of
|
||||
incompatible types. If `BaseEdge` has `type: Type.String()` and a concrete
|
||||
edge type needs `type: Type.Number()`, the intersection evaluates to `never`.
|
||||
For graph types, this is not a concern — base and concrete keys either don't
|
||||
overlap, or the overlap is a valid subtype narrowing (union → literal).
|
||||
|
||||
### Config: Literal Values for Specific Graph Types
|
||||
|
||||
The general `Metagraph.Config` has `Type.Union` with defaults (for
|
||||
construction-time validation: "any valid config"). Specific graph types use
|
||||
`Type.Literal` for frozen config values:
|
||||
|
||||
```ts
|
||||
// General (construction): Type.Union([Type.Literal("directed"), Type.Literal("undirected"), ...])
|
||||
// Specific (frozen): Type.Literal("directed")
|
||||
// Specific: frozen to exact values
|
||||
CallGraph.Config // type: "directed", multi: false, allowSelfLoops: false
|
||||
```
|
||||
|
||||
The construction flow: consumer provides a general config → validated against
|
||||
`Metagraph.Config` → the specific graph type Module uses `Type.Literal` to
|
||||
freeze the value. Narrowing from `Type.Union` to `Type.Literal` is explicit
|
||||
in the Module — no builder step needed.
|
||||
`Metagraph.Config` → the specific graph type Module freezes the values with
|
||||
`Type.Literal`.
|
||||
|
||||
### Edge Type Constraints: named constraint entries
|
||||
## Edge Type Constraints
|
||||
|
||||
Edge type constraints (`allowedSourceTypes`/`allowedTargetTypes`) are **named
|
||||
Module entries**, not columns bolted onto DB rows. This makes them first-class
|
||||
parts of the schema — queryable, validatable, and composable:
|
||||
parts of the schema — queryable, validatable, and serializable.
|
||||
|
||||
```ts
|
||||
import { Metagraph } from "./metagraph.ts";
|
||||
|
||||
export const CallGraph = Type.Module({
|
||||
// ...
|
||||
TriggeredEdge: Type.Composite([
|
||||
Type.Ref("BaseEdge"),
|
||||
Metagraph.Import("BaseEdge"),
|
||||
Type.Object({ type: Type.Literal("triggered") }),
|
||||
]),
|
||||
TriggeredEdgeConstraints: Type.Object({
|
||||
edgeType: Type.Literal("triggered"),
|
||||
allowedSourceTypes: Type.Array(Type.String()), // node type names: ["Call"]
|
||||
allowedTargetTypes: Type.Array(Type.String()), // node type names: ["Call", "Subcall"]
|
||||
allowedSourceTypes: Type.Array(Type.String()), // ["Call"]
|
||||
allowedTargetTypes: Type.Array(Type.String()), // ["Call", "Subcall"]
|
||||
}),
|
||||
DependsOnEdge: Type.Composite([
|
||||
Type.Ref("BaseEdge"),
|
||||
Metagraph.Import("BaseEdge"),
|
||||
Type.Object({ type: Type.Literal("depends_on") }),
|
||||
]),
|
||||
DependsOnEdgeConstraints: Type.Object({
|
||||
@@ -391,47 +321,23 @@ export const CallGraph = Type.Module({
|
||||
});
|
||||
```
|
||||
|
||||
**Why Module entries instead of DB columns** (DD7 revised):
|
||||
|
||||
1. **Schema-level validation**: `Value.Check(CallGraph.TriggeredEdgeConstraints, data)`
|
||||
validates that constraint data is well-formed. With DB columns, there's no
|
||||
schema validation — just JSON arrays in text columns.
|
||||
2. **Serialization**: The constraint entries serialize to JSON Schema with
|
||||
`$defs`, enabling `Value.Diff` for migration detection and `FromSchema`
|
||||
for round-tripping.
|
||||
3. **DB mapping**: The `moduleToDbSchema()` function extracts
|
||||
`*EdgeConstraints` entries and writes their `allowedSourceTypes`/
|
||||
`allowedTargetTypes` fields to the existing `edge_types` columns. The DB
|
||||
schema doesn't change — the Module entries are the source of truth, the
|
||||
DB columns are the persistence projection.
|
||||
|
||||
**Why Type.String() not Type.Ref()**: The constraint arrays contain node type
|
||||
*names* (strings like `"Call"`), not node type *schemas*. `Type.Ref("CallNode")`
|
||||
would mean "an element must validate against the CallNode schema," which is
|
||||
incorrect — the constraint is about which named node types are valid endpoints,
|
||||
not about node data shapes. The naming convention (`*Node` suffix) provides an
|
||||
implicit structural contract: string values in `allowedSourceTypes` should
|
||||
correspond to `*Node` entry names in the same Module. This is enforced by
|
||||
`moduleToDbSchema()` at Module-to-DB projection time, not by the schema itself.
|
||||
See Open Question 4 for the `Type.Ref` vs `Type.String` trade-off.
|
||||
|
||||
**DB mapping note**: The current DB schema stores `allowedSourceTypes` and
|
||||
`allowedTargetTypes` as JSON text columns (arrays of strings, default `[]`).
|
||||
In the Module, these become `Type.Array(Type.String())` entries — the DB
|
||||
column values are the same string arrays. `moduleToDbSchema()` extracts them
|
||||
directly. Read-path reconstruction resolves the names back to Module entries
|
||||
for validation.
|
||||
**Why `Type.String()` not `Type.Ref()`**: The constraint arrays contain node
|
||||
type *names* (strings like `"Call"`), not node type *schemas*. `Type.Ref("CallNode")`
|
||||
would mean "each element must validate against the CallNode schema," which is
|
||||
semantically wrong — the constraint is about which named node types are valid
|
||||
endpoints, not about data shapes. The `*Node` suffix naming convention provides
|
||||
an implicit structural contract. `moduleToDbSchema()` enforces this convention
|
||||
at Module-to-DB projection time.
|
||||
|
||||
**Empty array semantics**: In the DB, `[]` means "no restriction" (any node
|
||||
type valid). In the Module, omitting the `*EdgeConstraints` entry means the
|
||||
same thing. An explicit entry with empty arrays is not valid — it would mean
|
||||
"no node types are valid at this endpoint," which is nonsensical. The
|
||||
repository layer enforces this convention.
|
||||
type valid). In the Module, omitting the `*EdgeConstraints` entry means the same
|
||||
thing. An explicit entry with empty arrays is not valid — it would mean "no node
|
||||
types are valid at this endpoint," which is nonsensical.
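
A minimal sketch of the empty-array rule, assuming the constraint data has already been extracted into plain arrays. The `EdgeConstraintData` interface and the `constraintColumns` helper are hypothetical, not part of the described API, and the sketch rejects only the case where both arrays are empty; the text does not say how a single empty array should be treated.

```ts
// Hypothetical, already-extracted constraint data for one edge type.
interface EdgeConstraintData {
  edgeType: string;
  allowedSourceTypes: string[];
  allowedTargetTypes: string[];
}

interface ConstraintColumns {
  allowedSourceTypes: string[];
  allowedTargetTypes: string[];
}

// Project an optional constraint entry onto the two `edge_types` columns.
function constraintColumns(entry?: EdgeConstraintData): ConstraintColumns {
  if (entry === undefined) {
    // Omitted entry: the DB stores [] in both columns, meaning "no restriction".
    return { allowedSourceTypes: [], allowedTargetTypes: [] };
  }
  if (entry.allowedSourceTypes.length === 0 && entry.allowedTargetTypes.length === 0) {
    // Explicit entry with empty arrays: treated as invalid rather than as "no restriction".
    throw new Error(
      `Constraints for "${entry.edgeType}" are empty; omit the entry for "no restriction"`,
    );
  }
  return {
    allowedSourceTypes: entry.allowedSourceTypes.slice(),
    allowedTargetTypes: entry.allowedTargetTypes.slice(),
  };
}
```

Keeping the "omit means unrestricted" rule in one projection function means the `[]` convention exists only on the DB side and never has to be written by hand in a Module.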
|
||||
|
||||
### Entry Naming Convention
|
||||
## Entry Naming Convention
|
||||
|
||||
Within a graph type Module, entries follow a naming convention that distinguishes
|
||||
their role (DD8):
|
||||
Within a graph type Module, entries follow a suffix convention that distinguishes
|
||||
their role and determines their DB mapping:
|
||||
|
||||
| Suffix | Role | Maps to DB |
|
||||
|--------|------|------------|
|
||||
@@ -442,15 +348,14 @@ their role (DD8):
|
||||
| `*Enum` or bare name | Shared enum/type | Embedded in `node_types.schema`/`edge_types.schema` |
|
||||
| `BaseNode`, `BaseEdge` | Base attribute schemas | Composed into `*Node`/`*Edge` entries |
|
||||
|
||||
The `moduleToDbSchema()` function uses this convention to map Module entries to
|
||||
the `node_types` and `edge_types` tables. Entries ending in `Node` become rows
|
||||
with `name = entryNameWithoutSuffix ("Node")` and `schema = resolved entry`.
|
||||
Same for `*Edge`. The `Config` entry maps to `graph_types.config`.
|
||||
`moduleToDbSchema()` uses this convention to project Module entries to DB rows.
|
||||
Entries ending in `Node` become rows with `name = entryNameWithoutSuffix("Node")`
|
||||
and `schema = resolved entry`. Same for `*Edge`. The `Config` entry maps to
|
||||
`graph_types.config`.
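
The suffix convention can be illustrated with a small classifier. This is a sketch only: `classifyEntry` and `EntryRole` are hypothetical names, and the order of the checks (exact names first, then `EdgeConstraints`, then `Node` and `Edge`) is an assumption about how the convention would be applied, not the actual `moduleToDbSchema()` implementation.

```ts
type EntryRole = "config" | "base" | "edgeConstraints" | "node" | "edge" | "shared";

interface ClassifiedEntry {
  role: EntryRole;
  dbName?: string; // entry name with the role suffix removed, when the role has one
}

// Return the name without the suffix, or undefined if the suffix does not apply.
function stripSuffix(name: string, suffix: string): string | undefined {
  return name.length > suffix.length && name.endsWith(suffix)
    ? name.slice(0, name.length - suffix.length)
    : undefined;
}

function classifyEntry(name: string): ClassifiedEntry {
  if (name === "Config") return { role: "config" };
  // Base schemas end in Node/Edge too, so they must be matched before the suffix rules.
  if (name === "BaseNode" || name === "BaseEdge") return { role: "base" };

  const constraintName = stripSuffix(name, "EdgeConstraints");
  if (constraintName !== undefined) return { role: "edgeConstraints", dbName: constraintName };

  const nodeName = stripSuffix(name, "Node");
  if (nodeName !== undefined) return { role: "node", dbName: nodeName };

  const edgeName = stripSuffix(name, "Edge");
  if (edgeName !== undefined) return { role: "edge", dbName: edgeName };

  // Anything else (for example `*Enum` or a bare name) is a shared type.
  return { role: "shared" };
}
```

Under these assumptions `CallNode` maps to a node type named `Call`, which lines up with the node type names used in the constraint arrays (`["Call", "Subcall"]`).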
|
||||
|
||||
## graphology Serialization Bridge
|
||||
|
||||
The bridge between Modules and graphology is the `SerializedGraph` pattern that
|
||||
`@alkdev/flowgraph` already uses:
|
||||
The bridge between Modules and graphology is the `SerializedGraph` pattern:
|
||||
|
||||
```ts
|
||||
// flowgraph's current pattern (standalone schemas)
|
||||
@@ -462,7 +367,7 @@ const CallGraphSerialized = SerializedGraph(
|
||||
|
||||
// Module pattern (entries from the Module)
|
||||
const CallGraphSerialized = SerializedGraph(
|
||||
CallGraph.CallNode, // entry from Module — resolves Refs through $defs
|
||||
CallGraph.CallNode, // entry from Module — resolves Refs through $defs
|
||||
CallGraph.DependsOnEdge, // entry from Module
|
||||
Type.Object({}),
|
||||
);
|
||||
@@ -472,7 +377,7 @@ Graphology's serialized format:
|
||||
|
||||
```ts
|
||||
{
|
||||
attributes: {}, // Graph-level attributes (empty for most graphs)
|
||||
attributes: {}, // Graph-level attributes
|
||||
options: {
|
||||
type: "directed", // From CallGraph.Config
|
||||
multi: false,
|
||||
@@ -493,36 +398,27 @@ The mapping:
|
||||
- `CallGraph.CallNode` → validates `nodes[].attributes`
|
||||
- `CallGraph.TriggeredEdge` → validates `edges[].attributes`
|
||||
|
||||
This is **complementary** to `@alkdev/flowgraph`'s `SerializedGraph` — storage
|
||||
produces the data, flowgraph operates on it in memory. The `SerializedGraph`
|
||||
factory function stays the same — its schema arguments now come from Module
|
||||
entries instead of standalone schemas. The `moduleToDbSchema()`
|
||||
function extracts per-entry schemas for DB storage; the `moduleToGraphology()`
|
||||
function produces the graphology import format for hydration.
|
||||
Storage produces this format; `@alkdev/flowgraph`'s `FlowGraph.fromJSON()` and
|
||||
`SerializedGraph` consume it. The `SerializedGraph` factory function stays the
|
||||
same — its schema arguments now come from Module entries instead of standalone
|
||||
schemas. Storage doesn't need a graphology dependency.
|
||||
|
||||
## DB Persistence Bridge
|
||||
|
||||
The repository layer maps Module entries to the existing 6-table schema:
|
||||
The repository layer maps Module entries to the 6-table metagraph schema:
|
||||
|
||||
1. **`graph_types`** row: `name` = Module name, `config` = `CallGraph.Config`
|
||||
JSON Schema (with defaults resolved)
|
||||
2. **`node_types`** rows: one row per `*Node` entry, `name` = entry name
|
||||
(minus `Node` suffix), `schema` = resolved entry JSON Schema
|
||||
3. **`edge_types`** rows: one row per `*Edge` entry, `name` = entry name
|
||||
(minus `Edge` suffix), `schema` = resolved entry JSON Schema,
|
||||
1. **`graph_types`** row: `name` = Module name, `config` = resolved
|
||||
`CallGraph.Config` JSON Schema
|
||||
2. **`node_types`** rows: one per `*Node` entry, `name` = entry name (minus
|
||||
suffix), `schema` = resolved entry JSON Schema
|
||||
3. **`edge_types`** rows: one per `*Edge` entry, `name` = entry name (minus
|
||||
suffix), `schema` = resolved entry JSON Schema,
|
||||
`allowedSourceTypes`/`allowedTargetTypes` from constraint entries
|
||||
|
||||
On read, the repository layer reconstructs the Module from DB rows:
|
||||
`Value.Check(CallGraph.CallNode, node.attributes)` validates node data against
|
||||
the Module entry.
|
||||
|
||||
**`Module.Import()` embedding**: When a Module entry references entries from
|
||||
another Module (e.g., `FlowGraph.Import("CallStatus")`), the JSON Schema for
|
||||
that entry includes the referenced entries in `$defs`. The repository layer
|
||||
stores the **dereferenced entry** — the resolved JSON Schema with inline `$defs`
|
||||
for transitive references — not the entire importing Module. This avoids
|
||||
duplicating all of FlowGraph's definitions in every CallGraph node_types row.
|
||||
|
||||
### Bridge Functions
|
||||
|
||||
#### `moduleToDbSchema(module)`
|
||||
@@ -564,8 +460,7 @@ function moduleToDbSchema(module: TModule): DbSchema
|
||||
- `*EdgeConstraints` entries that reference edge type entries not present in
|
||||
the Module (the `edgeType` field must match an `*Edge` entry name).
|
||||
- `*EdgeConstraints` entries with empty `allowedSourceTypes` and
|
||||
`allowedTargetTypes` arrays (empty = "no types allowed", which is
|
||||
nonsensical; omit the entry instead for "no restriction").
|
||||
`allowedTargetTypes` arrays (omit the entry for "no restriction").
|
||||
- Module without a `Config` entry (all graph types require configuration).
|
||||
|
||||
#### `validateNode(module, entryName, data)` / `validateEdge(module, entryName, data)`
|
||||
@@ -581,47 +476,31 @@ Returns `true` if data passes `Value.Check` against the resolved Module entry.
|
||||
Throws if `entryName` doesn't match an `*Node`/`*Edge` entry in the Module.
|
||||
Does NOT throw on invalid data — returns `false`.
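
The throw-versus-`false` contract can be sketched as follows. To keep the example self-contained, `Value.Check` against a resolved Module entry is replaced by a plain predicate per entry; the `EntryCheck` type, the `checks` registry, and `validateEntry` are hypothetical stand-ins, not the real signatures.

```ts
// Stand-in for "Value.Check against the resolved entry schema".
type EntryCheck = (data: unknown) => boolean;

function validateEntry(
  checks: Record<string, EntryCheck>,
  suffix: "Node" | "Edge",
  entryName: string,
  data: unknown,
): boolean {
  const known = Object.prototype.hasOwnProperty.call(checks, entryName);
  if (!known || !entryName.endsWith(suffix)) {
    // Programming error: the caller named an entry that is not a *Node / *Edge entry.
    throw new Error(`"${entryName}" is not a *${suffix} entry of this Module`);
  }
  // Data error: reported through the return value, never by throwing.
  return checks[entryName](data);
}

function validateNode(checks: Record<string, EntryCheck>, entryName: string, data: unknown): boolean {
  return validateEntry(checks, "Node", entryName, data);
}

function validateEdge(checks: Record<string, EntryCheck>, entryName: string, data: unknown): boolean {
  return validateEntry(checks, "Edge", entryName, data);
}
```

Separating the two failure modes lets callers treat an unknown entry name as a bug while handling invalid data as an ordinary outcome.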
|
||||
|
||||
### Type.Any vs Type.Unknown
|
||||
|
||||
The pre-Module `types.ts` used `Type.Any()` for `metadata` and `schema` fields.
|
||||
The Module approach uses `Type.Unknown()`. Both serialize to the same JSON
|
||||
Schema and differ only in the inferred static type:
|
||||
|
||||
- `Type.Any()` → `{}` (static type `any`)
|
||||
- `Type.Unknown()` → `{}` (static type `unknown`)
|
||||
|
||||
For the Module approach, **`Type.Unknown()` is canonical**. It's the more
|
||||
explicit choice — it communicates "this field stores arbitrary data, no
|
||||
validation applied." `Type.Any()` is a legacy from the original TypeBox API.
|
||||
The `Metagraph` Module uses `Type.Unknown()` throughout.
|
||||
|
||||
### Performance Expectations
|
||||
### Performance
|
||||
|
||||
Graph type Modules are small — typically 5–20 entries (one Config, 2–5 node
|
||||
types, 2–5 edge types, 2–5 shared types, 2–5 constraint entries). The
|
||||
`Value.Check` cost scales with schema complexity, not Module size; only the
|
||||
resolved entry schema is checked, not the entire Module.
|
||||
types, 2–5 edge types, 2–5 shared types, 2–5 constraint entries). `Value.Check`
|
||||
cost scales with schema complexity, not Module size; only the resolved entry
|
||||
schema is checked, not the entire Module.
|
||||
|
||||
The dereferenced entry strategy (DD6) means each DB row stores only its own
|
||||
JSON Schema with transitive `$defs` — typically 1–3 KB per entry. A full
|
||||
graph type's schemas total ~10–50 KB in the DB. This is negligible compared
|
||||
to the node/edge data being stored.
|
||||
JSON Schema with transitive `$defs` — typically 1–3 KB per entry. A full graph
|
||||
type's schemas total ~10–50 KB in the DB, negligible compared to node/edge data.
|
||||
|
||||
"Validate on read" (Open Question 5) has a per-read cost. For
|
||||
high-throughput paths, the repository layer can cache the resolved Module
|
||||
entry locally after first read, avoiding repeated `Value.Check` for known-good
|
||||
data. This is a repository-layer optimization, not a Module design concern.
|
||||
"Validate on read" has a per-read cost. For high-throughput paths, the repository
|
||||
layer can cache the resolved Module entry locally after first read. This is a
|
||||
repository-layer optimization, not a Module design concern.
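
One way the caching mentioned above could look is a small memoizing wrapper around whatever function resolves a Module entry. This is a sketch under assumptions: `Resolver` and `cachedResolver` are hypothetical names, and the cache here only skips repeated resolution; validation itself would still run on each read.

```typescript
// Hypothetical resolver signature: looks up the resolved schema for one entry.
type Resolver<S> = (graphType: string, entryName: string) => S;

// Wrap a resolver so each (graphType, entryName) pair is resolved at most once.
function cachedResolver<S>(resolve: Resolver<S>): Resolver<S> {
  const cache = new Map<string, S>();
  return (graphType, entryName) => {
    // JSON-encode the pair so that distinct pairs never share a key.
    const key = JSON.stringify([graphType, entryName]);
    if (cache.has(key)) return cache.get(key) as S;
    const resolved = resolve(graphType, entryName);
    cache.set(key, resolved);
    return resolved;
  };
}
```

The cache has no eviction or invalidation, which seems acceptable only because graph type Modules are small and effectively static; a schema-evolution story would need to revisit that.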
|
||||
|
||||
## Codegen Path
|
||||
|
||||
`TsToModule` generates TypeBox Modules from TypeScript interfaces. The path from
|
||||
TypeScript to graph type:
|
||||
`TsToModule.Generate()` produces TypeBox Module entries from TypeScript
|
||||
interfaces, enabling a pipeline from TypeScript to graph type:
|
||||
|
||||
```
|
||||
TypeScript interface → TsToModule.Generate() → TypeBox Module entry
|
||||
@alkdev/flowgraph CallNodeAttrs → flowgraph schema.ts → FlowGraph Module
|
||||
@alkdev/taskgraph TaskNodeAttrs → taskgraph schema.ts → TaskGraph Module
|
||||
@alkdev/operations Identity → operations types.ts → Operations Module
|
||||
TypeScript interface → TsToModule.Generate() → Module entry
|
||||
@alkdev/flowgraph CallNodeAttrs → flowgraph schema.ts → FlowGraph Module
|
||||
@alkdev/taskgraph TaskNodeAttrs → taskgraph schema.ts → TaskGraph Module
|
||||
@alkdev/operations Identity → operations types.ts → Operations Module
|
||||
```
|
||||
|
||||
Since flowgraph already defines `CallNodeAttrs` as a standalone TypeBox schema,
|
||||
@@ -629,180 +508,32 @@ the codegen can produce a Module entry from it. Storage's `CallGraph` Module the
|
||||
composes `BaseNode` with `CallNodeAttrs` via `Type.Composite`, or imports from
|
||||
the flowgraph Module if flowgraph exports one (see Open Question 1).
|
||||
|
||||
## SchemaBuilder Equivalence
|
||||
## Transition from SchemaBuilder
|
||||
|
||||
The removed `SchemaBuilder.build()` used to return a `GraphSchema` — a flat
|
||||
object with `config`, `nodeTypes: Record<string, NodeType>`, and `edgeTypes:
|
||||
Record<string, EdgeType>`. A `Type.Module` with the same entries is
|
||||
structurally equivalent. This section documents what the builder was doing
|
||||
internally to show the correspondence.
|
||||
|
||||
### What the builder was doing internally
|
||||
|
||||
```
|
||||
SchemaBuilder
|
||||
.config({ type: "directed", multi: false })
|
||||
.nodeType("call", CallNodeSchema)
|
||||
.edgeType("triggered", EdgeSchema, { allowedSourceTypes: ["call"] })
|
||||
.build()
|
||||
|
||||
internally builds:
|
||||
|
||||
defs = {
|
||||
Config: Type.Object({ type: Literal("directed"), multi: Literal(false), ... }),
|
||||
CallNode: CallNodeSchema,
|
||||
TriggeredEdge: EdgeSchema,
|
||||
TriggeredEdgeConstraints: Type.Object({ edgeType: Literal("triggered"), ... }),
|
||||
}
|
||||
return Type.Module(defs)
|
||||
```
|
||||
|
||||
The `.build()` return type was `TModule` (TypeBox Module). The `SchemaBuilder` is
|
||||
removed — consumers use Module construction directly.
|
||||
|
||||
### Why this is equivalent
|
||||
|
||||
The `SchemaBuilder` was building a module under the hood — it just didn't have a
|
||||
module system to target. Named entries referencing each other via strings is
|
||||
exactly what `Type.Ref()` does natively. The Module format:
|
||||
|
||||
- Gives `Type.Ref()` instead of loose schema objects
|
||||
- Gives `Module.Import()` instead of `Type.Intersect` for cross-package refs
|
||||
- Gives JSON Schema `$defs` that map directly to DB storage
|
||||
- Gives `Value.Check`, `Value.Diff`, `Value.Errors` on the full type system
|
||||
- Gives codegen compatibility via `TsToModule.Generate()`
|
||||
|
||||
For the forward-looking connections (typed graph pointers, dbtype table
|
||||
rendering, ujsx HostConfig for graph schemas), see
|
||||
[forward-look.md](./forward-look.md).
|
||||
|
||||
## Design Decisions
|
||||
|
||||
### DD1: Module replaces SchemaBuilder
|
||||
|
||||
The SchemaBuilder is replaced by TypeBox Modules. The Module format provides
|
||||
what SchemaBuilder was building toward, but natively:
|
||||
- Named references → `Type.Ref()` instead of loose schema objects
|
||||
- Cross-module imports → `Module.Import()` instead of `Type.Intersect`
|
||||
- JSON Schema `$defs` → maps directly to DB storage
|
||||
- Codegen compatibility → `TsToModule.Generate()` produces Module entries
|
||||
|
||||
### DD2: SchemaBuilder removed
|
||||
|
||||
The `SchemaBuilder` is removed. Consumers use `Type.Module()` construction
|
||||
directly, with `Type.Ref()`, `Type.Composite()`, and `Metagraph.Import()`
|
||||
as the building blocks. The `moduleToDbSchema()` function replaces
|
||||
`SchemaBuilder.build()` as the bridge from Module to DB rows.
|
||||
|
||||
### DD3: Config as a Module entry with Literal values
|
||||
|
||||
Specific graph type Modules use `Type.Literal` for config values. The general
|
||||
`Metagraph.Config` with `Type.Union` and defaults is for construction-time
|
||||
validation. The specific Module freezes the config to exact values.
|
||||
|
||||
### DD4: Node/edge attribute schemas are Module entries, not `Type.Any()`
|
||||
|
||||
At the application layer, node and edge attribute schemas are named Module entries
|
||||
with full type safety (`CallGraph.CallNode`, not `schema: Type.Any()`). At the
|
||||
DB storage layer, the meta-schemas (`NodeType`, `EdgeType`) still have
|
||||
`schema: Type.Unknown()` because the DB stores arbitrary JSON Schema blobs — the
|
||||
Module entries are the application-level validation, the DB is the persistence
|
||||
layer.
|
||||
|
||||
**Mapping**: The repository layer maps between Module entries and DB rows using
|
||||
the naming convention (`*Node` → `node_types`, `*Edge` → `edge_types`, `Config`
|
||||
→ `graph_types.config`). On read, it looks up the graph type's Module to get
|
||||
the validation schema for each entry.
|
||||
|
||||
### DD5: Graphology import/export as the bridge to in-memory graphs
|
||||
|
||||
Storage produces data that `@alkdev/flowgraph`'s `FlowGraph.fromJSON()` and
|
||||
`SerializedGraph` consume. The Module entries validate data flowing in both
|
||||
directions. Storage doesn't need its own graphology dependency — it produces
|
||||
the JSON format, flowgraph consumes it.
|
||||
|
||||
### DD6: Repository stores dereferenced entry schemas
|
||||
|
||||
To avoid `Module.Import()` embedding the full `$defs` of referenced Modules in
|
||||
every DB row, the repository layer stores **dereferenced entry schemas** — each
|
||||
`node_types` row gets its entry's resolved JSON Schema with just the transitive
|
||||
`$defs` it needs, not the entire importing Module's definitions.
|
||||
|
||||
### DD7: Edge type constraints as named Module entries, not DB columns
|
||||
|
||||
Edge type constraints (`allowedSourceTypes`/`allowedTargetTypes`) are named
|
||||
Module entries (e.g., `TriggeredEdgeConstraints` with `Type.Array(Type.String())`
|
||||
fields), not just DB columns. This gives them schema validation and
|
||||
serialization. The repository layer projects these entries to the existing
|
||||
`edge_types` columns (arrays of node type name strings). The DB schema
|
||||
doesn't change — the Module entries are the source of truth.
|
||||
|
||||
**Revised from original DD7** which stored constraints only as DB columns.
|
||||
Named entries are strictly more capable: they validate and serialize;
|
||||
DB columns are their persistence projection.
|
||||
|
||||
### DD8: Naming convention for Module entries
|
||||
|
||||
Within a graph type Module, entries are named with role-distinguishing suffixes:
|
||||
`*Node` for node types, `*Edge` for edge types, `Config` for graph configuration,
|
||||
`*EdgeConstraints` for edge endpoint constraints, and bare names or `*Enum` for
|
||||
shared types. `moduleToDbSchema()` uses this convention to map entries to DB
|
||||
tables.
|
||||
|
||||
**Alternative considered**: Explicit metadata/decorators on entries (e.g.,
|
||||
`{ kind: "nodeType", name: "call", schema: ... }`). Rejected because it adds
|
||||
boilerplate without adding information — the suffix convention is simpler
|
||||
and sufficient for the expected Module size (5–20 entries).
|
||||
|
||||
### DD9: Pointer abstraction is forward-looking, not v1
|
||||
|
||||
The structural analogy between ujsx's `ValuePointer`/`selectNode`/`setNode` and
|
||||
graph node/edge addressing is real, but implementing typed graph pointers (via
|
||||
JPATH Module or reactive signals) is a post-v1 concern. For v1, repository
|
||||
functions use direct key-based addressing and the Module validates attribute
|
||||
shapes. The Module's existence makes typed pointers feasible later because
|
||||
it provides the schema the pointer validates against.
|
||||
|
||||
**Alternative considered**: Implement typed pointers in v1 via a lightweight
|
||||
`GraphPointer<T>` wrapper. Rejected because it requires either JPATH Module
|
||||
dependency or reactive signal integration, both of which add complexity
|
||||
without clear v1 benefit. Direct key-based addressing is sufficient.
|
||||
|
||||
### DD10: dbtype integration is post-v1
|
||||
|
||||
`@alkdev/dbtype`'s UJSX→Module→Host pipeline can eliminate the manual dual
|
||||
definition of SQLite/PG table schemas. But dbtype is Phase 0 (architecture
|
||||
complete, no implementation). For v1, storage uses manual Drizzle table
|
||||
definitions. The Module-based graph type definitions are compatible with dbtype
|
||||
because both produce `Type.Module` objects — the integration path is clear.
|
||||
|
||||
**Alternative considered**: Implement dbtype integration alongside the initial Module
|
||||
construction. Rejected because it adds a dependency on an unimplemented package
|
||||
and the manual table definitions work well. The cost of deferring is continued
|
||||
dual SQLite/PG maintenance, which is manageable for 6 metagraph tables.
|
||||
|
||||
## What Changes
|
||||
The existing `schemaBuilder.ts` and `types.ts` use a different approach that is
|
||||
being replaced:
|
||||
|
||||
| Before (unreleased) | After |
|
||||
|---------|-----|
|
||||
| `types.ts` — standalone schemas | `modules/metagraph.ts` — `Metagraph` Module |
|
||||
| `schemaBuilder.ts` — fluent builder | Removed — replaced by Module construction |
|
||||
| `schemaBuilder.ts` — fluent builder | Removed — replaced by `Type.Module()` construction |
|
||||
| `types.ts` — `BaseNodeAttributes`, `BaseEdgeAttributes` | `Metagraph` Module entries |
|
||||
| `types.ts` — `GraphConfig`, `GraphStatus`, `GraphBaseType` | `Metagraph` Module entries + const objects |
|
||||
| `allowedSourceTypes`/`allowedTargetTypes` as DB columns only | Named `*EdgeConstraints` Module entries (projected to DB columns) |
|
||||
| No concrete graph type Modules | `modules/call-graph.ts`, `modules/acl-graph.ts`, etc. |
|
||||
| No bridge between Module ↔ DB ↔ graphology | `bridge.ts` — validation, DB mapping, graphology format |
|
||||
|
||||
## What Doesn't Change
|
||||
Note: `Type.Any()` used in the old `types.ts` for `metadata` and `schema` fields
|
||||
is replaced by `Type.Unknown()` in the Module approach. Both produce `{}` in
|
||||
JSON Schema, but `Type.Unknown()` is the canonical choice — it explicitly
|
||||
communicates "no validation applied."
|
||||
|
||||
- **Database tables** — same 6 metagraph tables, same columns, same relations
|
||||
- **SQLite host** — table definitions, relations, client factory unchanged
|
||||
- **PostgreSQL host** (planned) — same shapes, different dialect
|
||||
- **`@alkdev/typebox` dependency** — same. Modules are a core TypeBox feature
|
||||
- **Encryption utility** — unchanged, can be a Module entry in `SecretGraph`
|
||||
- **`allowedSourceTypes`/`allowedTargetTypes`** — same DB columns, same semantics
|
||||
(Module entries are the source of truth, projected to DB columns by
|
||||
`moduleToDbSchema()`)
|
||||
**What doesn't change**: The 6 metagraph database tables, their columns, and
|
||||
relations remain the same. SQLite host table definitions, client factory, and
|
||||
drizzlebox-generated schemas are unchanged. The `@alkdev/typebox` dependency is
|
||||
unchanged. The encryption utility (planned) is unchanged. `allowedSourceTypes`
|
||||
and `allowedTargetTypes` remain DB columns with the same semantics — Module
|
||||
entries are the source of truth, projected to columns by `moduleToDbSchema()`.
|
||||
|
||||
## Implementation Path
|
||||
|
||||
@@ -817,62 +548,138 @@ dual SQLite/PG maintenance, which is manageable for 6 metagraph tables.
|
||||
4. **Phase 4**: Add `moduleToGraphology()` and `fromGraphologyExport()` for the
|
||||
graphology bridge. Storage produces the format, flowgraph consumes it.
|
||||
|
||||
Acceptance criteria per phase:
|
||||
- **Phase 2 complete**: `moduleToDbSchema()` produces values compatible with all
|
||||
6 metagraph tables
|
||||
- **Phase 3 complete**: Reference Modules validate against their flowgraph/taskgraph
|
||||
counterparts
|
||||
Acceptance criteria:
|
||||
- **Phase 2 complete**: `moduleToDbSchema()` produces values compatible with
|
||||
all 6 metagraph tables
|
||||
- **Phase 3 complete**: Reference Modules validate against their
|
||||
flowgraph/taskgraph counterparts
|
||||
|
||||
## Relationship to Other Packages
|
||||
### Relationship to Other Packages
|
||||
|
||||
| Package | What changes | What stays |
|
||||
|---------|-------------|------------|
|
||||
| `@alkdev/storage` | `types.ts` → Module, `schemaBuilder.ts` → removed, new `modules/` and `bridge.ts` | Tables, relations, crypto, client factory |
|
||||
| `@alkdev/flowgraph` | `CallNodeAttrs`, `CallEdgeAttrs`, `CallStatus` become Module entries (optional, exported from `/schema` subpath) | FlowGraph class, analysis, all runtime logic |
|
||||
| `@alkdev/flowgraph` | `CallNodeAttrs`, `CallEdgeAttrs`, `CallStatus` become Module entries (optional, exported from `/schema`) | FlowGraph class, analysis, all runtime logic |
|
||||
| `@alkdev/taskgraph` | `TaskGraphNodeAttributes`, `DependencyEdge` become Module entries (optional) | TaskGraph class, analysis, all runtime logic |
|
||||
| `@alkdev/operations` | `Identity`, `AccessControl` become Module entries (optional) | Registry, call protocol, adapters |
|
||||
| `@alkdev/pubsub` | No change | Transport layer |
|
||||
| `@alkdev/ujsx` | No change (already a Module) | The pattern we're following |
|
||||
| `@alkdev/dbtype` | No change (Phase 0) | Future: storage table defs could be dbtype element trees |
|
||||
|
||||
## Design Decisions
|
||||
|
||||
### DD1: TypeBox Module replaces the SchemaBuilder
|
||||
|
||||
Graph type definitions are `Type.Module` objects. The previous `SchemaBuilder`
|
||||
class is removed — consumers use `Type.Module()` construction directly, with
|
||||
`Type.Ref()`, `Type.Composite()`, and `Metagraph.Import()` as the building
|
||||
blocks. The `moduleToDbSchema()` function replaces `SchemaBuilder.build()` as
|
||||
the bridge from Module to DB rows.
|
||||
|
||||
This provides `Type.Ref()` for internal references, `Module.Import()` for
|
||||
cross-package references, JSON Schema `$defs` that map directly to DB storage,
|
||||
and codegen compatibility via `TsToModule.Generate()`.
|
||||
|
||||
### DD2: Metagraph.Import() for same-package Modules
|
||||
|
||||
Concrete graph types within `@alkdev/storage` use `Metagraph.Import("BaseNode")`
|
||||
to compose base schemas. This avoids duplication and keeps the base schemas in
|
||||
one place. External packages that define graph type Modules should re-declare
|
||||
base schemas locally — storage should not be a dependency of other packages'
|
||||
schema definitions.
|
||||
|
||||
### DD3: Config as a Module entry with Literal values
|
||||
|
||||
General `Metagraph.Config` uses `Type.Union` with defaults for construction-time
|
||||
validation. Specific graph types freeze config values to `Type.Literal`, making
|
||||
the config a precise contract rather than a validation surface.
|
||||
|
||||
### DD4: Node/edge attribute schemas are Module entries, not Type.Any()
|
||||
|
||||
At the application layer, node and edge attribute schemas are named Module
|
||||
entries with full type safety (`CallGraph.CallNode`, not `schema: Type.Any()`).
|
||||
At the DB storage layer, the meta-schemas (`NodeType`, `EdgeType`) still have
|
||||
`schema: Type.Unknown()` because the DB stores arbitrary JSON Schema blobs.
|
||||
|
||||
### DD5: Storage produces graphology format, flowgraph consumes it
|
||||
|
||||
Storage doesn't need a graphology dependency. It produces the JSON serialization
|
||||
format that `@alkdev/flowgraph`'s `FlowGraph.fromJSON()` and `SerializedGraph`
|
||||
consume. The Module entries validate data flowing in both directions.
|
||||
|
||||
### DD6: Repository stores dereferenced entry schemas
|
||||
|
||||
When a Module entry uses `Module.Import()`, the entry's JSON Schema embeds the
|
||||
referenced Module's `$defs`. To avoid storing the full referenced Module in
|
||||
every DB row, the repository layer stores **dereferenced entry schemas** — each
|
||||
`node_types` row gets its entry's resolved JSON Schema with just the transitive
|
||||
`$defs` it needs, not the entire importing Module's definitions.
|
||||
|
||||
### DD7: Edge type constraints as named Module entries
|
||||
|
||||
Edge type constraints (`allowedSourceTypes`/`allowedTargetTypes`) are named
|
||||
Module entries (e.g., `TriggeredEdgeConstraints`), not just DB columns. This
|
||||
gives them schema validation (`Value.Check`) and serialization (JSON Schema
|
||||
with `$defs`). The repository layer projects these entries to the existing
|
||||
`edge_types` columns. The DB schema doesn't change — Module entries are the
|
||||
source of truth, DB columns are the persistence projection.
|
||||
|
||||
### DD8: Naming convention for Module entries
|
||||
|
||||
Module entries use role-distinguishing suffixes: `*Node` for node types,
|
||||
`*Edge` for edge types, `Config` for graph configuration, `*EdgeConstraints`
|
||||
for edge endpoint constraints, and bare names or `*Enum` for shared types.
|
||||
`moduleToDbSchema()` uses this convention to map entries to DB tables.
|
||||
|
||||
This was chosen over explicit metadata/decorators (e.g.,
|
||||
`{ kind: "nodeType", name: "call", schema: ... }`) because the suffix convention
|
||||
is simpler and sufficient for the expected Module size (5–20 entries).
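
The suffix convention can be sketched as a single classification function. `EntryRole` and `classifyEntry` are hypothetical names used only for illustration; the real `moduleToDbSchema()` may organize this differently. Note that `*EdgeConstraints` is tested explicitly before the `*Edge` suffix so the two roles cannot be confused.

```typescript
// Hypothetical role names; the DB tables they map to are described in the text.
type EntryRole = "config" | "edgeConstraints" | "nodeType" | "edgeType" | "shared";

// Classify a graph type Module entry purely by its name.
function classifyEntry(name: string): EntryRole {
  if (name === "Config") return "config";
  if (name.endsWith("EdgeConstraints")) return "edgeConstraints";
  if (name.endsWith("Node")) return "nodeType";
  if (name.endsWith("Edge")) return "edgeType";
  // Bare names and `*Enum` entries are shared types with no table of their own.
  return "shared";
}
```

A purely name-based rule would also classify base entries such as `BaseNode` as node types, so this sketch only makes sense when applied to a concrete graph type Module rather than to the `Metagraph` Module itself.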
|
||||
|
||||
### DD9: Pointer abstraction is forward-looking, not v1
|
||||
|
||||
The structural analogy between ujsx's `ValuePointer`/`selectNode`/`setNode` and
|
||||
graph node/edge addressing is real, but implementing typed graph pointers (via
|
||||
JPATH Module or reactive signals) is a post-v1 concern. For v1, repository
|
||||
functions use direct key-based addressing (`findNode(graphId, nodeKey)`), and
|
||||
the Module validates attribute shapes. See [forward-look.md](./forward-look.md).
|
||||
|
||||
### DD10: dbtype integration is post-v1
|
||||
|
||||
`@alkdev/dbtype`'s UJSX→Module→Host pipeline can eliminate the manual dual
|
||||
definition of SQLite/PG table schemas. But dbtype is Phase 0 (architecture
|
||||
complete, no implementation). For v1, storage uses manual Drizzle table
|
||||
definitions. The Module-based graph type definitions are compatible with dbtype
|
||||
because both produce `Type.Module` objects — the integration path is clear.
|
||||
See [forward-look.md](./forward-look.md).
|
||||
|
||||
## Open Questions
|
||||
|
||||
1. **Should `@alkdev/flowgraph` export a `Type.Module`, or should storage define
|
||||
its own entries with documented correspondence?** Flowgraph currently exports
|
||||
`CallNodeAttrs` as a standalone `Type.Object`. To use `Import()`, flowgraph
|
||||
needs to export a Module. But storage can start with standalone schemas and
|
||||
needs to export a Module. Storage can start with standalone schemas and
|
||||
`Type.Composite([BaseNode, CallNodeAttrs])` — no dependency on flowgraph.
|
||||
Adopt `Import()` when flowgraph provides a Module. **This avoids a
|
||||
circular dependency: `@alkdev/storage` does NOT depend on `@alkdev/flowgraph`.**
|
||||
Adopt `Import()` when flowgraph provides a Module. **This avoids a circular
|
||||
dependency: `@alkdev/storage` does NOT depend on `@alkdev/flowgraph`.**
|
||||
|
||||
2. **Should concrete graph type Modules live in storage or in their respective
|
||||
packages?** Call-graph attribute schemas are defined by flowgraph's domain, not
|
||||
storage's. Storage provides the metagraph *framework* (the `Metagraph` Module
|
||||
with `BaseNode`, `BaseEdge`, `Config`). Concrete graph types like `CallGraph`
|
||||
could live either in storage (as reference implementations) or in their
|
||||
respective packages (flowgraph exports `CallGraph` Module alongside
|
||||
`CallNodeAttrs`). **Decision: Both.** Storage provides reference Modules in
|
||||
`modules/` that consumers can use directly or replace. Flowgraph may also
|
||||
export a Module — the two are compatible via Module `$defs`.
|
||||
with `BaseNode`, `BaseEdge`, `Config`). Concrete types like `CallGraph` could
|
||||
live either in storage (as reference implementations) or in their respective
|
||||
packages. **Decision: Both.** Storage provides reference Modules in `modules/`
|
||||
that consumers can use directly or replace. Flowgraph may also export a
|
||||
Module — the two are compatible via Module `$defs`.
|
||||
|
||||
3. **Should `*EdgeConstraints` entries use `Type.Ref("CallNode")` or
|
||||
`Type.String()` for allowed source/target types?** Using `Type.Ref`
|
||||
would mean "each element in the array must validate against the CallNode
|
||||
schema," which is semantically wrong — the constraint is about which named
|
||||
node types are valid endpoints, not about data shapes. Using `Type.String()`
|
||||
matches the actual semantics (arrays of node type names) but loses the
|
||||
structural link. **Decision: `Type.String()`** — the constraint arrays
|
||||
contain names, not schemas. The naming convention provides an implicit
|
||||
contract that string values should correspond to `*Node` entry names,
|
||||
enforced by `moduleToDbSchema()` at projection time.
|
||||
`Type.String()` for allowed source/target types?** See the
|
||||
[Edge Type Constraints](#edge-type-constraints) section. **Decision:
|
||||
`Type.String()`** — the constraint arrays contain names, not schemas.
|
||||
|
||||
4. **How does the graph pointer abstraction interact with the repository layer?**
|
||||
For v1, repository functions use direct key-based addressing. Typed pointers
|
||||
(JPATH Module, reactive ValuePointer) could layer on top of the repository
|
||||
later. The key question: does the repository return raw data (untyped JSON),
|
||||
or does it validate against the Module before returning? **Decision: validate
|
||||
on read** — if the data doesn't match the Module entry, throw. This makes
|
||||
typed pointers safe: any value you get from the repo conforms to the schema.
|
||||
For v1, repository functions use direct key-based addressing. **Decision:
|
||||
validate on read** — if data doesn't match the Module entry, throw. This
|
||||
guarantees that any value retrieved from the repo conforms to the schema.
|
||||
|
||||
## References
|
||||
|
||||
@@ -880,8 +687,8 @@ Acceptance criteria per phase:
|
||||
- ujsx ADR-002 (Module as type registry): `/workspace/@alkdev/ujsx/docs/architecture/decisions/002-typebox-module-as-registry.md`
|
||||
- ujsx schema docs: `/workspace/@alkdev/ujsx/docs/architecture/schema.md`
|
||||
- TsToModule codegen: `/workspace/research/typebox_research/codegen/ts-to-module.ts`
|
||||
- ujsx Module examples: `/workspace/research/typebox_research/ujsx/unist.gen.ts`, `/workspace/research/typebox_research/ujsx/mdast.gen.ts`
|
||||
- Flowgraph schema (standalone TypeBox, not yet Module): `/workspace/@alkdev/flowgraph/src/schema/`
|
||||
- Flowgraph SerializedGraph factory: `/workspace/@alkdev/flowgraph/src/schema/graph.ts`
|
||||
- Forward-looking connections (pointers, dbtype, ujsx IR): [forward-look.md](./forward-look.md)
|
||||
- Ecosystem integration: [overview.md](./overview.md)
|
||||
- Schema evolution: [schema-evolution.md](./schema-evolution.md)
|
||||
- Forward-looking connections: [forward-look.md](./forward-look.md)
|
||||
- Package overview: [overview.md](./overview.md)
|
||||