docs: restructure metagraph-module.md for clarity and reduced redundancy

- Eliminate 4x redundancy on SchemaBuilder removal (was in Overview, Equivalence section, DD1, DD2)
- Remove forward references to DD numbers that break reading flow
- Separate specification from rationale (DDs capture decisions, body specifies)
- Fix Type.Ref inconsistency in Edge Constraints example (should use Metagraph.Import per DD2)
- Expand 'Why TypeBox Modules' with the three friction points it solves
- Add Performance subsection, Codegen Path, Transition table, Implementation Path
- Restore Relationship to Other Packages table
- Remove historical artifacts (SchemaBuilder equivalence internals, Type.Any migration notes)
- 887 lines → 694 lines (22% reduction)
@@ -1,16 +1,16 @@
|
|||||||
---
|
---
|
||||||
status: draft
|
status: draft
|
||||||
last_updated: 2026-05-30
|
last_updated: 2026-05-29
|
||||||
---
|
---
|
||||||
|
|
||||||
# Metagraph as TypeBox Module
|
# Metagraph as TypeBox Module
|
||||||
|
|
||||||
Graph type definitions as `Type.Module` — aligning with the ujsx pattern for
|
Graph type definitions as `Type.Module` — recursive schemas, cross-package
|
||||||
recursive schemas, cross-package references, codegen, and graphology serialization.
|
references, and DB persistence.
|
||||||
|
|
||||||
## The Metagraph Data Model
|
## The Metagraph Data Model
|
||||||
|
|
||||||
The metagraph pattern is a three-level type system:
|
The metagraph is a three-level type system:
|
||||||
|
|
||||||
1. **GraphType** — A class of graphs (e.g., "call-graph", "acl",
|
1. **GraphType** — A class of graphs (e.g., "call-graph", "acl",
|
||||||
"task-dependencies"). Defines structural constraints
|
"task-dependencies"). Defines structural constraints
|
||||||
@@ -24,8 +24,8 @@ The metagraph pattern is a three-level type system:
|
|||||||
"can_read", "depends_on"). Each edge type has a TypeBox schema for its
|
"can_read", "depends_on"). Each edge type has a TypeBox schema for its
|
||||||
attributes. Optionally constrains which source/target node types are valid.
|
attributes. Optionally constrains which source/target node types are valid.
|
||||||
|
|
||||||
Then **Graph instances** belong to a graph type and contain **Nodes** and
|
**Graph instances** belong to a graph type and contain **Nodes** and **Edges**
|
||||||
**Edges** conforming to those type definitions.
|
conforming to those type definitions.
|
||||||
|
|
||||||
```
|
```
|
||||||
GraphType "call-graph" (directed, multi, self-loops allowed)
|
GraphType "call-graph" (directed, multi, self-loops allowed)
|
||||||
@@ -42,7 +42,7 @@ Graph "session-abc-call-graph" (instance)
|
|||||||
│ └── attributes: { requestId, operationId, status, ... }
|
│ └── attributes: { requestId, operationId, status, ... }
|
||||||
├── Node "call-002" → nodeTypeId → NodeType "subcall"
|
├── Node "call-002" → nodeTypeId → NodeType "subcall"
|
||||||
│ └── attributes: { requestId, parentRequestId, ... }
|
│ └── attributes: { requestId, parentRequestId, ... }
|
||||||
└── Edge "edge-001" → edgeTypeId → EdgeType "triggered"
|
└── Edge "edge-001" → edgeTypeId → EdgeType "triggered"
|
||||||
└── attributes: { type: "triggered" }
|
└── attributes: { type: "triggered" }
|
||||||
sourceNodeKey: "call-001"
|
sourceNodeKey: "call-001"
|
||||||
targetNodeKey: "call-002"
|
targetNodeKey: "call-002"
|
||||||
@@ -54,83 +54,45 @@ Nodes and edges use a **composite identity model**: identified by
|
|||||||
`key` is the identity.
|
`key` is the identity.
|
||||||
|
|
||||||
Node and edge attributes are stored as JSON text in SQLite (jsonb in PG). The
|
Node and edge attributes are stored as JSON text in SQLite (jsonb in PG). The
|
||||||
graph type's schema defines what shape these attributes should have, but the
|
graph type's schema defines the expected shape, but the database doesn't enforce
|
||||||
database doesn't enforce the schema — all validation happens in the repository
|
it — validation happens in the repository layer. See
|
||||||
layer. See [schema-evolution.md](./schema-evolution.md) for how schemas change
|
[schema-evolution.md](./schema-evolution.md) for how schemas change over time,
|
||||||
over time, and [sqlite-host.md](./sqlite-host.md) for the table definitions.
|
and [sqlite-host.md](./sqlite-host.md) for table definitions.
|
||||||
|
|
||||||
## Overview
|
## Why TypeBox Modules
|
||||||
|
|
||||||
A graph type definition is naturally a TypeBox Module. It has named entries
|
A graph type definition has named entries (node types, edge types, config) that
|
||||||
(node types, edge types, config) that reference each other with `Type.Ref()`,
|
reference each other. `Type.Module` is the natural fit:
|
||||||
compose with `Type.Composite()`, and can cross-reference other Modules with
|
|
||||||
`Import()`. This is the same pattern used by `@alkdev/ujsx` (where `UJSX` is
|
|
||||||
a Module with `UPrimitive`, `UElement`, `URoot`, `UNode` recursively referencing
|
|
||||||
each other).
|
|
||||||
|
|
||||||
The removed `SchemaBuilder` produced a flat `GraphSchema` object — an ad-hoc
|
- **`Type.Ref("CallStatus")`** — recursive and internal references resolve
|
||||||
`Record<string, NodeType>` + `Record<string, EdgeType>`. This works but
|
within the Module's `$defs`
|
||||||
creates friction:
|
- **`Module.Import("CallStatus")`** — cross-package references embed the
|
||||||
|
referenced Module's `$defs`
|
||||||
|
- **`Value.Check(Module.Import("CallNode"), data)`** — runtime validation
|
||||||
|
- **`Static<typeof Module>`** — TypeScript types from the Module
|
||||||
|
|
||||||
1. **No cross-graph-type references** — a call graph node type can't reference
|
This replaces the removed `SchemaBuilder`, which produced a flat
|
||||||
`CallStatus` from `@alkdev/flowgraph` without manual `Type.Intersect`
|
`Record<string, NodeType>` + `Record<string, EdgeType>`. That approach had
|
||||||
composition. Each package defines schemas independently, duplicating types.
|
three limitations that Modules solve natively:
|
||||||
2. **No graphology compatibility** — the schema output is a flat JSON object,
|
|
||||||
not a format that maps to graphology's `import()`/`export()`. Consumers
|
1. **No cross-graph-type references** — a call graph node type couldn't
|
||||||
manually map node/edge attributes.
|
reference `CallStatus` from `@alkdev/flowgraph` without manual
|
||||||
|
`Type.Intersect`. Each package duplicated types independently.
|
||||||
|
2. **No graphology compatibility** — the flat JSON output didn't map to
|
||||||
|
graphology's `import()`/`export()`. Consumers manually mapped node/edge
|
||||||
|
attributes.
|
||||||
3. **No codegen leverage** — `TsToModule` generates TypeBox Modules from
|
3. **No codegen leverage** — `TsToModule` generates TypeBox Modules from
|
||||||
TypeScript interfaces. The SchemaBuilder couldn't consume Module output, so
|
TypeScript interfaces, but the builder couldn't consume Module output.
|
||||||
codegen-produced types must be manually translated.
|
|
||||||
|
|
||||||
The Module approach treats each graph type as a `Type.Module`, aligning storage
|
This aligns with the pattern proven in `@alkdev/ujsx`, where `UJSX` is a Module
|
||||||
with how ujsx already works — recursive types via `Ref`, composition via
|
with `UPrimitive`, `UElement`, `URoot`, `UNode` recursively referencing each
|
||||||
`Composite`, cross-references via `Import`.
|
other. See [forward-look.md](./forward-look.md) for how this connects to the
|
||||||
|
broader ecosystem (codegen, graphology, dbtype).
|
||||||
|
|
||||||
For the forward-looking view of how this connects to dbtype, graph pointers,
|
## Base Module: Metagraph
|
||||||
and the ujsx universal IR pipeline, see [forward-look.md](./forward-look.md).
|
|
||||||
|
|
||||||
## The Pattern (Proven in ujsx)
|
The metagraph meta-schema is a Module providing base entries that concrete
|
||||||
|
graph types compose from:
|
||||||
`@alkdev/ujsx` already uses this pattern (ADR-002: "TypeBox Module as type
|
|
||||||
registry"):
|
|
||||||
|
|
||||||
```ts
|
|
||||||
// ujsx: schema.ts
|
|
||||||
export const UJSX = Type.Module({
|
|
||||||
UPrimitive: Type.Union([Type.String(), Type.Number(), Type.Boolean(), Type.Null()]),
|
|
||||||
PropValue: Type.Union([..., Type.Ref("UNode"), ...]),
|
|
||||||
UniversalProps: Type.Object({}, { additionalProperties: Type.Union([Type.Ref("PropValue"), Type.Undefined()]) }),
|
|
||||||
UElement: Type.Object({
|
|
||||||
type: Type.String(),
|
|
||||||
props: Type.Ref("UniversalProps"),
|
|
||||||
children: Type.Array(Type.Ref("UNode")), // recursive!
|
|
||||||
}),
|
|
||||||
URoot: Type.Object({
|
|
||||||
type: Type.Literal("root"),
|
|
||||||
props: Type.Ref("UniversalProps"),
|
|
||||||
children: Type.Array(Type.Ref("UNode")), // recursive!
|
|
||||||
}),
|
|
||||||
UNode: Type.Union([Type.Ref("UPrimitive"), Type.Ref("UElement"), Type.Ref("URoot")]),
|
|
||||||
});
|
|
||||||
```
|
|
||||||
|
|
||||||
Key properties:
|
|
||||||
- **`Type.Ref("UNode")`** resolves within the Module's `$defs` — recursive
|
|
||||||
references are natural
|
|
||||||
- **`UJSX.Import("UElement")`** lets other Modules reference ujsx types — the
|
|
||||||
referenced Module's `$defs` are embedded in the importing Module's JSON Schema
|
|
||||||
- **`Value.Check(UJSX.Import("UElement"), node)`** validates at runtime
|
|
||||||
- **`Static<typeof UJSX>`** gives TypeScript types (or hand-written types for
|
|
||||||
non-serializable entries like `ComponentFn`)
|
|
||||||
|
|
||||||
Graph type definitions have the same structure — named entries that reference
|
|
||||||
each other, with possible cross-references to other packages' Modules.
|
|
||||||
|
|
||||||
## Proposed: GraphType as a TypeBox Module
|
|
||||||
|
|
||||||
### Base Module: Metagraph
|
|
||||||
|
|
||||||
The metagraph meta-schema itself is a Module:
|
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
export const Metagraph = Type.Module({
|
export const Metagraph = Type.Module({
|
||||||
@@ -157,12 +119,26 @@ export const Metagraph = Type.Module({
|
|||||||
});
|
});
|
||||||
```
|
```
|
||||||
|
|
||||||
### Concrete Graph Type: CallGraph
|
- `Config` uses `Type.Union` with defaults for construction-time validation
|
||||||
|
("any valid config"). Specific graph types narrow these to `Type.Literal`
|
||||||
|
values.
|
||||||
|
- `BaseNode` and `BaseEdge` provide common attribute schemas. Concrete graph
|
||||||
|
types compose them via `Type.Composite`.
|
||||||
|
- `metadata` and similar "arbitrary data" fields use `Type.Unknown()`
|
||||||
|
(not `Type.Any()`). `Type.Unknown()` is canonical — it communicates "no
|
||||||
|
validation applied" explicitly.
|
||||||
|
|
||||||
A specific graph type is also a Module. It composes `BaseNode`/`BaseEdge` via
|
## Concrete Graph Type Modules
|
||||||
`Type.Composite()` (same as ujsx's `Mdast.Node: Type.Composite([Unist.Import("UnistNode"), ...])`):
|
|
||||||
|
A specific graph type is also a `Type.Module`. It composes `BaseNode`/`BaseEdge`
|
||||||
|
via `Metagraph.Import()` and `Type.Composite()`, narrows config to literal values,
|
||||||
|
and defines its own node types, edge types, and shared types.
|
||||||
|
|
||||||
|
### Example: CallGraph
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
|
import { Metagraph } from "./metagraph.ts";
|
||||||
|
|
||||||
export const CallGraph = Type.Module({
|
export const CallGraph = Type.Module({
|
||||||
// Config is specific — literal values, not unions with defaults
|
// Config is specific — literal values, not unions with defaults
|
||||||
Config: Type.Object({
|
Config: Type.Object({
|
||||||
@@ -171,7 +147,7 @@ export const CallGraph = Type.Module({
|
|||||||
allowSelfLoops: Type.Literal(false),
|
allowSelfLoops: Type.Literal(false),
|
||||||
}),
|
}),
|
||||||
|
|
||||||
// Node types compose BaseNode (from Metagraph) with call-specific attributes
|
// Node types compose BaseNode with call-specific attributes
|
||||||
CallNode: Type.Composite([
|
CallNode: Type.Composite([
|
||||||
Metagraph.Import("BaseNode"),
|
Metagraph.Import("BaseNode"),
|
||||||
Type.Object({
|
Type.Object({
|
||||||
@@ -229,9 +205,33 @@ export const CallGraph = Type.Module({
|
|||||||
});
|
});
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Type.Composite, not Type.Intersect
|
||||||
|
|
||||||
|
Graph type Modules use `Type.Composite` to extend base schemas, not
|
||||||
|
`Type.Intersect`. The difference:
|
||||||
|
|
||||||
|
- **`Type.Intersect`** produces a `TIntersect` wrapper with `allOf` — consumers
|
||||||
|
must traverse `allOf` to access properties.
|
||||||
|
- **`Type.Composite`** produces a flat `TObject` — overlapping keys are
|
||||||
|
intersected via `IntersectEvaluated`, non-overlapping keys are merged.
|
||||||
|
|
||||||
|
Both use intersection semantics for overlapping keys. When overlapping keys have
|
||||||
|
a subtype relationship (e.g., `type: Type.String()` → `type: Type.Literal("triggered")`),
|
||||||
|
the intersection resolves to the narrower type, which is the correct behavior.
|
||||||
|
|
||||||
|
**Constraint**: Do not use `Type.Composite` with overlapping keys of incompatible
|
||||||
|
types. If `BaseEdge` has `type: Type.String()` and a concrete edge type needs
|
||||||
|
`type: Type.Number()`, the intersection evaluates to `never`. For graph types,
|
||||||
|
this is not a concern — base and concrete keys either don't overlap, or the
|
||||||
|
overlap is a valid subtype narrowing.
|
||||||
|
|
||||||
### Cross-Module References
|
### Cross-Module References
|
||||||
|
|
||||||
`Module.Import()` allows one Module to reference entries from another:
|
`Module.Import()` allows one Module to reference entries from another. In the
|
||||||
|
CallGraph example, `Metagraph.Import("BaseNode")` embeds `Metagraph`'s `$defs`
|
||||||
|
into `CallGraph`'s JSON Schema output.
|
||||||
|
|
||||||
|
This also works across packages:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
import { FlowGraph } from "@alkdev/flowgraph/schema";
|
import { FlowGraph } from "@alkdev/flowgraph/schema";
|
||||||
@@ -241,146 +241,76 @@ const CallGraph = Type.Module({
|
|||||||
CallNode: Type.Composite([
|
CallNode: Type.Composite([
|
||||||
Type.Ref("BaseNode"),
|
Type.Ref("BaseNode"),
|
||||||
Type.Object({
|
Type.Object({
|
||||||
status: FlowGraph.Import("CallStatus"), // from flowgraph
|
status: FlowGraph.Import("CallStatus"),
|
||||||
identity: Type.Optional(FlowGraph.Import("Identity")), // from flowgraph
|
identity: Type.Optional(FlowGraph.Import("Identity")),
|
||||||
// ...
|
|
||||||
}),
|
}),
|
||||||
]),
|
]),
|
||||||
});
|
});
|
||||||
```
|
```
|
||||||
|
|
||||||
This is exactly the `Mdast.Import("UnistNode")` pattern from the ujsx research.
|
**Import embedding**: `Module.Import()` embeds the referenced Module's `$defs`
|
||||||
|
into the importing Module's JSON Schema. When `CallGraph` imports from
|
||||||
**⚠️ Import embedding**: `Module.Import()` embeds the referenced Module's `$defs`
|
|
||||||
into the importing Module's JSON Schema output. When `CallGraph` imports from
|
|
||||||
`FlowGraph`, the resulting JSON Schema includes all of `FlowGraph`'s definitions
|
`FlowGraph`, the resulting JSON Schema includes all of `FlowGraph`'s definitions
|
||||||
in `$defs`. See DD6 for how the repository layer handles this.
|
in `$defs`. The repository layer stores **dereferenced entry schemas** — each
|
||||||
|
`node_types` row gets its entry's resolved JSON Schema (with inline `$defs` for
|
||||||
|
just its transitive references), not the entire importing Module. This avoids
|
||||||
|
storage bloat and version coupling (DD6).
|
||||||
|
|
||||||
**Decision (DD6)**: The repository layer stores **dereferenced entry schemas** —
|
### BaseNode/BaseEdge: Import vs Local Re-declaration
|
||||||
each `node_types` row gets its entry's resolved JSON Schema (with inline `$defs`
|
|
||||||
for just its transitive references), not the entire importing Module. This
|
|
||||||
avoids storage bloat and version coupling issues.
|
|
||||||
|
|
||||||
### BaseNode/BaseEdge: Local Re-declaration vs Metagraph.Import
|
There are two ways to get `BaseNode`/`BaseEdge` into a concrete graph type Module:
|
||||||
|
|
||||||
`Type.Ref()` only resolves entries within the *same* Module. In the `CallGraph`
|
- **`Metagraph.Import("BaseNode")`** — references the base Module directly.
|
||||||
example above, `Type.Ref("BaseNode")` requires `BaseNode` to be an entry in the
|
No duplication, but embeds `Metagraph`'s `$defs` (3 entries — minimal bloat).
|
||||||
`CallGraph` Module. There are two strategies for getting `BaseNode`/`BaseEdge`
|
- **Local re-declaration** — copy the base schemas into the concrete Module.
|
||||||
into a concrete graph type Module:
|
No `$defs` embedding, but duplication if `Metagraph` evolves.
|
||||||
|
|
||||||
**Option A: Re-declare locally** (shown in the example above). Each concrete
|
**Decision**: Use `Metagraph.Import()` for Modules within `@alkdev/storage`
|
||||||
Module includes its own `BaseNode`/`BaseEdge` entries. The schemas are identical
|
(e.g., `modules/call-graph.ts`). Both Modules live in the same package, so
|
||||||
to `Metagraph.BaseNode`/`Metagraph.BaseEdge` — you copy them in. Simple, but
|
there's no circular dependency. For Modules defined in external packages
|
||||||
creates duplication. If the base schemas evolve, each concrete Module must be
|
(e.g., `@alkdev/flowgraph`), re-declare base schemas locally — external
|
||||||
updated independently.
|
packages should not depend on storage's `Metagraph` Module.
|
||||||
|
|
||||||
**Option B: Metagraph.Import**. The concrete Module imports from `Metagraph`:
|
### Config: Literal Values Freeze the Configuration
|
||||||
|
|
||||||
|
The general `Metagraph.Config` uses `Type.Union` with defaults (for
|
||||||
|
construction-time: "any valid config"). Specific graph types freeze these to
|
||||||
|
`Type.Literal` values:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
const CallGraph = Type.Module({
|
// General: accepts any valid config
|
||||||
CallNode: Type.Composite([
|
Metagraph.Config // type: union of "directed"|"undirected"|"mixed", multi: boolean, ...
|
||||||
Metagraph.Import("BaseNode"),
|
|
||||||
Type.Object({ requestId: Type.String(), ... }),
|
|
||||||
]),
|
|
||||||
});
|
|
||||||
```
|
|
||||||
|
|
||||||
This avoids duplication but embeds `Metagraph`'s `$defs` into `CallGraph`'s
|
// Specific: frozen to exact values
|
||||||
JSON Schema output. For most cases, `Metagraph` is small (3 entries) so the
|
CallGraph.Config // type: "directed", multi: false, allowSelfLoops: false
|
||||||
bloat is minimal. If `Metagraph` grows, this could become a concern.
|
|
||||||
|
|
||||||
**Decision: Option B for same-package Modules (recommended), Option A as
|
|
||||||
fallback for external-package Modules**.
|
|
||||||
|
|
||||||
For Modules defined within `@alkdev/storage` (like `CallGraph` in
|
|
||||||
`modules/call-graph.ts`), `Metagraph.Import("BaseNode")` has no circular
|
|
||||||
dependency issue — both `Metagraph` and `CallGraph` live in the same package.
|
|
||||||
The `Import` approach avoids duplication and keeps the base schemas in one
|
|
||||||
place.
|
|
||||||
|
|
||||||
For Modules defined outside `@alkdev/storage` (e.g., in `@alkdev/flowgraph`),
|
|
||||||
Option A applies because external packages should not depend on storage's
|
|
||||||
`Metagraph` Module (see Open Question 1). Those packages re-declare their own
|
|
||||||
base schemas or define them independently.
|
|
||||||
|
|
||||||
The v1 reference Modules in `modules/` should use Option B. If a future
|
|
||||||
consumer defines a `CallGraph` Module externally, they can choose either
|
|
||||||
approach — the schemas are structurally identical.
|
|
||||||
|
|
||||||
**Verified**: `Type.Composite([Type.Ref("BaseNode"), Type.Object({...})])`
|
|
||||||
within a Module resolves correctly. Test confirms: `Value.Check(Module.Import("CallNode"), validData)` passes.
|
|
||||||
|
|
||||||
### Type.Composite vs Type.Intersect
|
|
||||||
|
|
||||||
The Module approach uses `Type.Composite` for extending `BaseNode`/`BaseEdge`,
|
|
||||||
not `Type.Intersect`. This matches the ujsx pattern where `Mdast.Node` is
|
|
||||||
`Type.Composite([Unist.Import("UnistNode"), Type.Object({...})])`.
|
|
||||||
|
|
||||||
The difference:
|
|
||||||
- **`Type.Intersect`** creates a JSON Schema `allOf` — the result is a
|
|
||||||
`TIntersect` wrapper with nested schemas. Consumers must traverse `allOf`
|
|
||||||
to access properties.
|
|
||||||
- **`Type.Composite`** produces an **intersection evaluated into a flat
|
|
||||||
`TObject`** — overlapping keys are intersected via `IntersectEvaluated`
|
|
||||||
and the result is a single object with no `allOf` wrapper. The output
|
|
||||||
shape is `{ key1: Intersect([typeA, typeB]), key2: typeC, ... }`.
|
|
||||||
|
|
||||||
**Both use intersection semantics for overlapping keys.** Composite is NOT
|
|
||||||
an `Object.assign` override — when overlapping keys have varying (incompatible)
|
|
||||||
types, the result is `never`. When overlapping keys have a subtype
|
|
||||||
relationship (like `Type.String()` and `Type.Literal("triggered")`), the
|
|
||||||
intersection resolves to the narrower type (`Type.Literal("triggered")`),
|
|
||||||
which is the correct behavior.
|
|
||||||
|
|
||||||
**Why Composite over Intersect for graph types**: The output is a flat
|
|
||||||
`TObject` that maps directly to a node/edge attribute schema. `Intersect`
|
|
||||||
produces a `TIntersect` wrapper that would need unwrapping. For graph types
|
|
||||||
where base and concrete attributes have non-overlapping keys (most cases)
|
|
||||||
or subtype-only overlaps (like `type: Type.String()` → `type: Type.Literal(...)`),
|
|
||||||
Composite evaluates to the same result but in a more convenient shape.
|
|
||||||
|
|
||||||
**Design constraint**: Do not use `Type.Composite` with overlapping keys of
|
|
||||||
incompatible types. If `BaseEdge` has `type: Type.String()` and a concrete
|
|
||||||
edge type needs `type: Type.Number()`, the intersection evaluates to `never`.
|
|
||||||
For graph types, this is not a concern — base and concrete keys either don't
|
|
||||||
overlap, or the overlap is a valid subtype narrowing (union → literal).
|
|
||||||
|
|
||||||
### Config: Literal Values for Specific Graph Types
|
|
||||||
|
|
||||||
The general `Metagraph.Config` has `Type.Union` with defaults (for
|
|
||||||
construction-time validation: "any valid config"). Specific graph types use
|
|
||||||
`Type.Literal` for frozen config values:
|
|
||||||
|
|
||||||
```ts
|
|
||||||
// General (construction): Type.Union([Type.Literal("directed"), Type.Literal("undirected"), ...])
|
|
||||||
// Specific (frozen): Type.Literal("directed")
|
|
||||||
```
|
```
|
||||||
|
|
||||||
The construction flow: consumer provides a general config → validated against
|
The construction flow: consumer provides a general config → validated against
|
||||||
`Metagraph.Config` → the specific graph type Module uses `Type.Literal` to
|
`Metagraph.Config` → the specific graph type Module freezes the values with
|
||||||
freeze the value. Narrowing from `Type.Union` to `Type.Literal` is explicit
|
`Type.Literal`.
|
||||||
in the Module — no builder step needed.
|
|
||||||
|
|
||||||
### Edge Type Constraints: named constraint entries
|
## Edge Type Constraints
|
||||||
|
|
||||||
Edge type constraints (`allowedSourceTypes`/`allowedTargetTypes`) are **named
|
Edge type constraints (`allowedSourceTypes`/`allowedTargetTypes`) are **named
|
||||||
Module entries**, not columns bolted onto DB rows. This makes them first-class
|
Module entries**, not columns bolted onto DB rows. This makes them first-class
|
||||||
parts of the schema — queryable, validatable, and composable:
|
parts of the schema — queryable, validatable, and serializable.
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
|
import { Metagraph } from "./metagraph.ts";
|
||||||
|
|
||||||
export const CallGraph = Type.Module({
|
export const CallGraph = Type.Module({
|
||||||
// ...
|
// ...
|
||||||
TriggeredEdge: Type.Composite([
|
TriggeredEdge: Type.Composite([
|
||||||
Type.Ref("BaseEdge"),
|
Metagraph.Import("BaseEdge"),
|
||||||
Type.Object({ type: Type.Literal("triggered") }),
|
Type.Object({ type: Type.Literal("triggered") }),
|
||||||
]),
|
]),
|
||||||
TriggeredEdgeConstraints: Type.Object({
|
TriggeredEdgeConstraints: Type.Object({
|
||||||
edgeType: Type.Literal("triggered"),
|
edgeType: Type.Literal("triggered"),
|
||||||
allowedSourceTypes: Type.Array(Type.String()), // node type names: ["Call"]
|
allowedSourceTypes: Type.Array(Type.String()), // ["Call"]
|
||||||
allowedTargetTypes: Type.Array(Type.String()), // node type names: ["Call", "Subcall"]
|
allowedTargetTypes: Type.Array(Type.String()), // ["Call", "Subcall"]
|
||||||
}),
|
}),
|
||||||
DependsOnEdge: Type.Composite([
|
DependsOnEdge: Type.Composite([
|
||||||
Type.Ref("BaseEdge"),
|
Metagraph.Import("BaseEdge"),
|
||||||
Type.Object({ type: Type.Literal("depends_on") }),
|
Type.Object({ type: Type.Literal("depends_on") }),
|
||||||
]),
|
]),
|
||||||
DependsOnEdgeConstraints: Type.Object({
|
DependsOnEdgeConstraints: Type.Object({
|
||||||
@@ -391,47 +321,23 @@ export const CallGraph = Type.Module({
|
|||||||
});
|
});
|
||||||
```
|
```
|
||||||
|
|
||||||
**Why Module entries instead of DB columns** (DD7 revised):
|
**Why `Type.String()` not `Type.Ref()`**: The constraint arrays contain node
|
||||||
|
type *names* (strings like `"Call"`), not node type *schemas*. `Type.Ref("CallNode")`
|
||||||
1. **Schema-level validation**: `Value.Check(CallGraph.TriggeredEdgeConstraints, data)`
|
would mean "each element must validate against the CallNode schema," which is
|
||||||
validates that constraint data is well-formed. With DB columns, there's no
|
semantically wrong — the constraint is about which named node types are valid
|
||||||
schema validation — just JSON arrays in text columns.
|
endpoints, not about data shapes. The `*Node` suffix naming convention provides
|
||||||
2. **Serialization**: The constraint entries serialize to JSON Schema with
|
an implicit structural contract. `moduleToDbSchema()` enforces this convention
|
||||||
`$defs`, enabling `Value.Diff` for migration detection and `FromSchema`
|
at Module-to-DB projection time.
|
||||||
for round-tripping.
|
|
||||||
3. **DB mapping**: The `moduleToDbSchema()` function extracts
|
|
||||||
`*EdgeConstraints` entries and writes their `allowedSourceTypes`/
|
|
||||||
`allowedTargetTypes` fields to the existing `edge_types` columns. The DB
|
|
||||||
schema doesn't change — the Module entries are the source of truth, the
|
|
||||||
DB columns are the persistence projection.
|
|
||||||
|
|
||||||
**Why Type.String() not Type.Ref()**: The constraint arrays contain node type
|
|
||||||
*names* (strings like `"Call"`), not node type *schemas*. `Type.Ref("CallNode")`
|
|
||||||
would mean "an element must validate against the CallNode schema," which is
|
|
||||||
incorrect — the constraint is about which named node types are valid endpoints,
|
|
||||||
not about node data shapes. The naming convention (`*Node` suffix) provides an
|
|
||||||
implicit structural contract: string values in `allowedSourceTypes` should
|
|
||||||
correspond to `*Node` entry names in the same Module. This is enforced by
|
|
||||||
`moduleToDbSchema()` at Module-to-DB projection time, not by the schema itself.
|
|
||||||
See Open Question 4 for the `Type.Ref` vs `Type.String` trade-off.
|
|
||||||
|
|
||||||
**DB mapping note**: The current DB schema stores `allowedSourceTypes` and
|
|
||||||
`allowedTargetTypes` as JSON text columns (arrays of strings, default `[]`).
|
|
||||||
In the Module, these become `Type.Array(Type.String())` entries — the DB
|
|
||||||
column values are the same string arrays. `moduleToDbSchema()` extracts them
|
|
||||||
directly. Read-path reconstruction resolves the names back to Module entries
|
|
||||||
for validation.
|
|
||||||
|
|
||||||
**Empty array semantics**: In the DB, `[]` means "no restriction" (any node
|
**Empty array semantics**: In the DB, `[]` means "no restriction" (any node
|
||||||
type valid). In the Module, omitting the `*EdgeConstraints` entry means the
|
type valid). In the Module, omitting the `*EdgeConstraints` entry means the same
|
||||||
same thing. An explicit entry with empty arrays is not valid — it would mean
|
thing. An explicit entry with empty arrays is not valid — it would mean "no node
|
||||||
"no node types are valid at this endpoint," which is nonsensical. The
|
types are valid at this endpoint," which is nonsensical.
|
||||||
repository layer enforces this convention.
|
|
||||||
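The empty-array rule above can be sketched in plain TypeScript. This is a minimal illustration, not the repository layer's actual code: `toEdgeTypeColumns` and `isEndpointAllowed` are hypothetical names, and rejecting an explicit entry when *either* array is empty is an assumption about how strictly the rule is applied.

```ts
// Shape of a *EdgeConstraints entry's data, as described in the text.
interface EdgeConstraints {
  edgeType: string;
  allowedSourceTypes: string[];
  allowedTargetTypes: string[];
}

// The two constraint columns persisted on an edge_types row.
interface EdgeTypeColumns {
  allowedSourceTypes: string[];
  allowedTargetTypes: string[];
}

// Hypothetical helper: project an optional Module constraint entry to DB columns.
// A missing entry becomes [] / [] ("no restriction" in the DB).
function toEdgeTypeColumns(constraints?: EdgeConstraints): EdgeTypeColumns {
  if (constraints === undefined) {
    return { allowedSourceTypes: [], allowedTargetTypes: [] };
  }
  // Assumption: an explicit entry with any empty array is rejected, since it
  // would read as "no node types are valid at this endpoint".
  if (
    constraints.allowedSourceTypes.length === 0 ||
    constraints.allowedTargetTypes.length === 0
  ) {
    throw new Error(
      constraints.edgeType + ": explicit constraints must not use empty arrays",
    );
  }
  return {
    allowedSourceTypes: constraints.allowedSourceTypes.slice(),
    allowedTargetTypes: constraints.allowedTargetTypes.slice(),
  };
}

// Hypothetical read-side check: [] in the DB column means any node type is valid.
function isEndpointAllowed(allowedTypes: string[], nodeTypeName: string): boolean {
  return allowedTypes.length === 0 || allowedTypes.indexOf(nodeTypeName) !== -1;
}
```

With this shape, the "unrestricted" state has exactly one representation on each side: an omitted entry in the Module and `[]` in the DB.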
|
|
||||||
### Entry Naming Convention
|
## Entry Naming Convention
|
||||||
|
|
||||||
Within a graph type Module, entries follow a naming convention that distinguishes
|
Within a graph type Module, entries follow a suffix convention that distinguishes
|
||||||
their role (DD8):
|
their role and determines their DB mapping:
|
||||||
|
|
||||||
| Suffix | Role | Maps to DB |
|
| Suffix | Role | Maps to DB |
|
||||||
|--------|------|------------|
|
|--------|------|------------|
|
||||||
@@ -442,15 +348,14 @@ their role (DD8):
|
|||||||
| `*Enum` or bare name | Shared enum/type | Embedded in `node_types.schema`/`edge_types.schema` |
|
| `*Enum` or bare name | Shared enum/type | Embedded in `node_types.schema`/`edge_types.schema` |
|
||||||
| `BaseNode`, `BaseEdge` | Base attribute schemas | Composed into `*Node`/`*Edge` entries |
|
| `BaseNode`, `BaseEdge` | Base attribute schemas | Composed into `*Node`/`*Edge` entries |
|
||||||
|
|
||||||
The `moduleToDbSchema()` function uses this convention to map Module entries to
|
`moduleToDbSchema()` uses this convention to project Module entries to DB rows.
|
||||||
the `node_types` and `edge_types` tables. Entries ending in `Node` become rows
|
Entries ending in `Node` become rows with `name = entryNameWithoutSuffix("Node")`
|
||||||
with `name = entryNameWithoutSuffix ("Node")` and `schema = resolved entry`.
|
and `schema = resolved entry`. Same for `*Edge`. The `Config` entry maps to
|
||||||
Same for `*Edge`. The `Config` entry maps to `graph_types.config`.
|
`graph_types.config`.
|
||||||
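As a rough illustration of the suffix convention, the projection could look like the sketch below. It is not the real `moduleToDbSchema()`: `projectEntries`, `TypeRow`, and `DbProjection` are hypothetical names, entries are modeled as a plain record of already-resolved JSON Schemas, and constraint and shared-type entries are simply skipped here.

```ts
type JsonSchema = Record<string, unknown>;

// One row destined for node_types or edge_types.
interface TypeRow {
  name: string;
  schema: JsonSchema;
}

interface DbProjection {
  config: JsonSchema | undefined; // graph_types.config
  nodeTypes: TypeRow[];           // node_types rows
  edgeTypes: TypeRow[];           // edge_types rows
}

// Base schemas are composed into concrete entries, so they never become rows.
const BASE_ENTRIES = new Set(["BaseNode", "BaseEdge"]);

function stripSuffix(entryName: string, suffix: string): string {
  return entryName.slice(0, entryName.length - suffix.length);
}

// Hypothetical sketch: route each entry by its name suffix.
function projectEntries(entries: Record<string, JsonSchema>): DbProjection {
  const projection: DbProjection = { config: undefined, nodeTypes: [], edgeTypes: [] };
  for (const [entryName, schema] of Object.entries(entries)) {
    if (BASE_ENTRIES.has(entryName)) continue;
    if (entryName === "Config") {
      projection.config = schema;
    } else if (entryName.endsWith("EdgeConstraints")) {
      continue; // constraint entries feed edge_types columns; omitted in this sketch
    } else if (entryName.endsWith("Node")) {
      projection.nodeTypes.push({ name: stripSuffix(entryName, "Node"), schema });
    } else if (entryName.endsWith("Edge")) {
      projection.edgeTypes.push({ name: stripSuffix(entryName, "Edge"), schema });
    }
    // Anything else (shared enums, bare names) produces no row of its own.
  }
  return projection;
}
```

Under these assumptions, `CallNode` and `SubcallNode` yield `node_types` rows named `Call` and `Subcall`, matching the names used in the constraint arrays.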
|
|
||||||
## graphology Serialization Bridge
|
## graphology Serialization Bridge
|
||||||
|
|
||||||
The bridge between Modules and graphology is the `SerializedGraph` pattern that
|
The bridge between Modules and graphology is the `SerializedGraph` pattern:
|
||||||
`@alkdev/flowgraph` already uses:
|
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
// flowgraph's current pattern (standalone schemas)
|
// flowgraph's current pattern (standalone schemas)
|
||||||
@@ -462,7 +367,7 @@ const CallGraphSerialized = SerializedGraph(
|
|||||||
|
|
||||||
// Module pattern (entries from the Module)
|
// Module pattern (entries from the Module)
|
||||||
const CallGraphSerialized = SerializedGraph(
|
const CallGraphSerialized = SerializedGraph(
|
||||||
CallGraph.CallNode, // entry from Module — resolves Refs through $defs
|
CallGraph.CallNode, // entry from Module — resolves Refs through $defs
|
||||||
CallGraph.DependsOnEdge, // entry from Module
|
CallGraph.DependsOnEdge, // entry from Module
|
||||||
Type.Object({}),
|
Type.Object({}),
|
||||||
);
|
);
|
||||||
@@ -472,7 +377,7 @@ Graphology's serialized format:
|
|||||||
|
|
||||||
```ts
|
```ts
|
||||||
{
|
{
|
||||||
attributes: {}, // Graph-level attributes (empty for most graphs)
|
attributes: {}, // Graph-level attributes
|
||||||
options: {
|
options: {
|
||||||
type: "directed", // From CallGraph.Config
|
type: "directed", // From CallGraph.Config
|
||||||
multi: false,
|
multi: false,
|
||||||
@@ -493,36 +398,27 @@ The mapping:
|
|||||||
- `CallGraph.CallNode` → validates `nodes[].attributes`
|
- `CallGraph.CallNode` → validates `nodes[].attributes`
|
||||||
- `CallGraph.TriggeredEdge` → validates `edges[].attributes`
|
- `CallGraph.TriggeredEdge` → validates `edges[].attributes`
|
||||||
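The mapping above can be sketched without any library dependency. This is a minimal illustration, not the bridge implementation: the interfaces mirror graphology's serialized format, and `AttrCheck` and `findInvalidAttributes` are hypothetical names standing in for `Value.Check` bound to a Module entry.

```ts
// Shapes follow graphology's serialized format; attributes may be omitted.
interface SerializedNode {
  key: string;
  attributes?: Record<string, unknown>;
}
interface SerializedEdge {
  key?: string;
  source: string;
  target: string;
  attributes?: Record<string, unknown>;
}
interface SerializedGraphData {
  attributes: Record<string, unknown>;
  options: { type: "directed" | "undirected" | "mixed"; multi: boolean; allowSelfLoops: boolean };
  nodes: SerializedNode[];
  edges: SerializedEdge[];
}

// Hypothetical stand-in for Value.Check bound to one Module entry.
type AttrCheck = (attributes: Record<string, unknown>) => boolean;

// Lists every node or edge whose attributes fail the corresponding check.
function findInvalidAttributes(
  graph: SerializedGraphData,
  checkNode: AttrCheck,
  checkEdge: AttrCheck,
): string[] {
  const failures: string[] = [];
  for (const node of graph.nodes) {
    if (!checkNode(node.attributes ?? {})) failures.push(`node:${node.key}`);
  }
  for (const edge of graph.edges) {
    if (!checkEdge(edge.attributes ?? {})) failures.push(`edge:${edge.source}->${edge.target}`);
  }
  return failures;
}
```

In the real bridge the two checks would presumably be `Value.Check` against `CallGraph.CallNode` and the edge entry; the sketch only shows which part of the payload each entry validates.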
|
|
||||||
This is **complementary** to `@alkdev/flowgraph`'s `SerializedGraph` — storage
|
Storage produces this format; `@alkdev/flowgraph`'s `FlowGraph.fromJSON()` and
|
||||||
produces the data, flowgraph operates on it in memory. The `SerializedGraph`
|
`SerializedGraph` consume it. The `SerializedGraph` factory function stays the
|
||||||
factory function stays the same — its schema arguments now come from Module
|
same — its schema arguments now come from Module entries instead of standalone
|
||||||
entries instead of standalone schemas. The `moduleToDbSchema()`
|
schemas. Storage doesn't need a graphology dependency.
|
||||||
function extracts per-entry schemas for DB storage; the `moduleToGraphology()`
|
|
||||||
function produces the graphology import format for hydration.
|
|
||||||
|
|
||||||
## DB Persistence Bridge
|
## DB Persistence Bridge
|
||||||
|
|
||||||
The repository layer maps Module entries to the existing 6-table schema:
|
The repository layer maps Module entries to the 6-table metagraph schema:
|
||||||
|
|
||||||
1. **`graph_types`** row: `name` = Module name, `config` = `CallGraph.Config`
|
1. **`graph_types`** row: `name` = Module name, `config` = resolved
|
||||||
JSON Schema (with defaults resolved)
|
`CallGraph.Config` JSON Schema
|
||||||
2. **`node_types`** rows: one row per `*Node` entry, `name` = entry name
|
2. **`node_types`** rows: one per `*Node` entry, `name` = entry name (minus
|
||||||
(minus `Node` suffix), `schema` = resolved entry JSON Schema
|
suffix), `schema` = resolved entry JSON Schema
|
||||||
3. **`edge_types`** rows: one row per `*Edge` entry, `name` = entry name
|
3. **`edge_types`** rows: one per `*Edge` entry, `name` = entry name (minus
|
||||||
(minus `Edge` suffix), `schema` = resolved entry JSON Schema,
|
suffix), `schema` = resolved entry JSON Schema,
|
||||||
`allowedSourceTypes`/`allowedTargetTypes` from constraint entries
|
`allowedSourceTypes`/`allowedTargetTypes` from constraint entries
|
||||||
|
|
||||||
On read, the repository layer reconstructs the Module from DB rows:
|
On read, the repository layer reconstructs the Module from DB rows:
|
||||||
`Value.Check(CallGraph.CallNode, node.attributes)` validates node data against
|
`Value.Check(CallGraph.CallNode, node.attributes)` validates node data against
|
||||||
the Module entry.
|
the Module entry.
|
||||||
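The suffix-based mapping described above could look roughly like the following. This is a sketch under assumptions: `classifyEntry` and `EntryRole` are hypothetical names, and the derived `name` is the entry name with the suffix stripped and casing left untouched, since the text does not say whether names are normalized.

```ts
// Hypothetical result type: where a Module entry is projected in the DB.
type EntryRole =
  | { table: "graph_types"; column: "config" }
  | { table: "node_types"; name: string }
  | { table: "edge_types"; name: string }
  | { table: null; reason: "constraints" | "shared" };

function classifyEntry(entryName: string): EntryRole {
  if (entryName === "Config") {
    return { table: "graph_types", column: "config" };
  }
  // Constraint entries are projected onto edge_types columns, not stored as rows.
  if (entryName.endsWith("EdgeConstraints")) {
    return { table: null, reason: "constraints" };
  }
  if (entryName.endsWith("Node")) {
    return { table: "node_types", name: entryName.slice(0, -"Node".length) };
  }
  if (entryName.endsWith("Edge")) {
    return { table: "edge_types", name: entryName.slice(0, -"Edge".length) };
  }
  // Bare names and *Enum entries are shared types with no row of their own.
  return { table: null, reason: "shared" };
}
```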
|
|
||||||
**`Module.Import()` embedding**: When a Module entry references entries from
|
|
||||||
another Module (e.g., `FlowGraph.Import("CallStatus")`), the JSON Schema for
|
|
||||||
that entry includes the referenced entries in `$defs`. The repository layer
|
|
||||||
stores the **dereferenced entry** — the resolved JSON Schema with inline `$defs`
|
|
||||||
for transitive references — not the entire importing Module. This avoids
|
|
||||||
duplicating all of FlowGraph's definitions in every CallGraph node_types row.
|
|
||||||
|
|
||||||
### Bridge Functions
|
### Bridge Functions
|
||||||
|
|
||||||
#### `moduleToDbSchema(module)`
|
#### `moduleToDbSchema(module)`
|
||||||
@@ -564,8 +460,7 @@ function moduleToDbSchema(module: TModule): DbSchema
|
|||||||
- `*EdgeConstraints` entries that reference edge type entries not present in
|
- `*EdgeConstraints` entries that reference edge type entries not present in
|
||||||
the Module (the `edgeType` field must match an `*Edge` entry name).
|
the Module (the `edgeType` field must match an `*Edge` entry name).
|
||||||
- `*EdgeConstraints` entries with empty `allowedSourceTypes` and
|
- `*EdgeConstraints` entries with empty `allowedSourceTypes` and
|
||||||
`allowedTargetTypes` arrays (empty = "no types allowed", which is
|
`allowedTargetTypes` arrays (omit the entry for "no restriction").
|
||||||
nonsensical; omit the entry instead for "no restriction").
|
|
||||||
- Module without a `Config` entry (all graph types require configuration).
|
- Module without a `Config` entry (all graph types require configuration).
|
||||||
|
|
||||||
#### `validateNode(module, entryName, data)` / `validateEdge(module, entryName, data)`
|
#### `validateNode(module, entryName, data)` / `validateEdge(module, entryName, data)`
|
||||||
@@ -581,47 +476,31 @@ Returns `true` if data passes `Value.Check` against the resolved Module entry.
|
|||||||
Throws if `entryName` doesn't match an `*Node`/`*Edge` entry in the Module.
|
Throws if `entryName` doesn't match an `*Node`/`*Edge` entry in the Module.
|
||||||
Does NOT throw on invalid data — returns `false`.
|
Does NOT throw on invalid data — returns `false`.
|
||||||
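The throw-versus-`false` contract stated above can be illustrated with a dependency-free sketch. The names `EntryCheck` and `validateNodeSketch` are hypothetical, and each predicate stands in for `Value.Check` against a resolved Module entry; this is not the actual bridge code.

```ts
// Stand-in for a resolved Module entry: any predicate over unknown data.
type EntryCheck = (data: unknown) => boolean;

function validateNodeSketch(
  entries: Record<string, EntryCheck>,
  entryName: string,
  data: unknown,
): boolean {
  const isNodeEntry =
    entryName.endsWith("Node") &&
    Object.prototype.hasOwnProperty.call(entries, entryName);
  if (!isNodeEntry) {
    // Unknown entry name is a programming error, so it throws.
    throw new Error(`No *Node entry named "${entryName}"`);
  }
  // Invalid data is an expected outcome, so it is reported as false.
  return entries[entryName](data);
}
```

Separating the two failure modes lets callers treat a bad entry name as a bug while handling invalid data as ordinary control flow.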
|
|
||||||
### Type.Any vs Type.Unknown
|
### Performance
|
||||||
|
|
||||||
The pre-Module `types.ts` used `Type.Any()` for `metadata` and `schema` fields.
|
|
||||||
The Module approach uses `Type.Unknown()`. These have different JSON Schema
|
|
||||||
outputs:
|
|
||||||
|
|
||||||
- `Type.Any()` → `{}` (accepts anything, no validation)
|
|
||||||
- `Type.Unknown()` → `{}` with `additionalProperties: true` semantics
|
|
||||||
|
|
||||||
For the Module approach, **`Type.Unknown()` is canonical**. It's the more
|
|
||||||
explicit choice — it communicates "this field stores arbitrary data, no
|
|
||||||
validation applied." `Type.Any()` is a legacy from the original TypeBox API.
|
|
||||||
The `Metagraph` Module uses `Type.Unknown()` throughout.
|
|
||||||
|
|
||||||
### Performance Expectations
|
|
||||||
|
|
||||||
Graph type Modules are small — typically 5–20 entries (one Config, 2–5 node
|
Graph type Modules are small — typically 5–20 entries (one Config, 2–5 node
|
||||||
types, 2–5 edge types, 2–5 shared types, 2–5 constraint entries). The
|
types, 2–5 edge types, 2–5 shared types, 2–5 constraint entries). `Value.Check`
|
||||||
`Value.Check` cost scales with schema complexity, not Module size; only the
|
cost scales with schema complexity, not Module size; only the resolved entry
|
||||||
resolved entry schema is checked, not the entire Module.
|
schema is checked, not the entire Module.
|
||||||
|
|
||||||
The dereferenced entry strategy (DD6) means each DB row stores only its own
|
The dereferenced entry strategy (DD6) means each DB row stores only its own
|
||||||
JSON Schema with transitive `$defs` — typically 1–3 KB per entry. A full
|
JSON Schema with transitive `$defs` — typically 1–3 KB per entry. A full graph
|
||||||
graph type's schemas total ~10–50 KB in the DB. This is negligible compared
|
type's schemas total ~10–50 KB in the DB, negligible compared to node/edge data.
|
||||||
to the node/edge data being stored.
|
|
||||||
|
|
||||||
"Validate on read" (Open Question 5) has a per-read cost. For
|
"Validate on read" has a per-read cost. For high-throughput paths, the repository
|
||||||
high-throughput paths, the repository layer can cache the resolved Module
|
layer can cache the resolved Module entry locally after first read. This is a
|
||||||
entry locally after first read, avoiding repeated `Value.Check` for known-good
|
repository-layer optimization, not a Module design concern.
|
||||||
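One possible shape for the repository-layer cache mentioned above is a simple memoizing wrapper. This is a sketch only: `memoizeResolver` is a hypothetical helper, and the cache key (here a plain string such as a graph type plus entry name) is an assumption.

```ts
// Wraps a resolver so each key is resolved at most once per process.
function memoizeResolver<T>(resolve: (key: string) => T): (key: string) => T {
  const cache = new Map<string, T>();
  return (key: string): T => {
    if (cache.has(key)) {
      return cache.get(key) as T;
    }
    const resolved = resolve(key);
    cache.set(key, resolved);
    return resolved;
  };
}
```

A real repository would also need an invalidation story when a graph type's schema changes, which this sketch does not address.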
data. This is a repository-layer optimization, not a Module design concern.
|
|
||||||
|
|
||||||
## Codegen Path
|
## Codegen Path
|
||||||
|
|
||||||
`TsToModule` generates TypeBox Modules from TypeScript interfaces. The path from
|
`TsToModule.Generate()` produces TypeBox Module entries from TypeScript
|
||||||
TypeScript to graph type:
|
interfaces, enabling a pipeline from TypeScript to graph type:
|
||||||
|
|
||||||
```
|
```
|
||||||
TypeScript interface → TsToModule.Generate() → TypeBox Module entry
|
TypeScript interface → TsToModule.Generate() → Module entry
|
||||||
@alkdev/flowgraph CallNodeAttrs → flowgraph schema.ts → FlowGraph Module
|
@alkdev/flowgraph CallNodeAttrs → flowgraph schema.ts → FlowGraph Module
|
||||||
@alkdev/taskgraph TaskNodeAttrs → taskgraph schema.ts → TaskGraph Module
|
@alkdev/taskgraph TaskNodeAttrs → taskgraph schema.ts → TaskGraph Module
|
||||||
@alkdev/operations Identity → operations types.ts → Operations Module
|
@alkdev/operations Identity → operations types.ts → Operations Module
|
||||||
```
|
```
|
||||||
|
|
||||||
Since flowgraph already defines `CallNodeAttrs` as a standalone TypeBox schema,
|
Since flowgraph already defines `CallNodeAttrs` as a standalone TypeBox schema,
|
||||||
@@ -629,180 +508,32 @@ the codegen can produce a Module entry from it. Storage's `CallGraph` Module the
|
|||||||
composes `BaseNode` with `CallNodeAttrs` via `Type.Composite`, or imports from
|
composes `BaseNode` with `CallNodeAttrs` via `Type.Composite`, or imports from
|
||||||
the flowgraph Module if flowgraph exports one (see Open Question 1).
|
the flowgraph Module if flowgraph exports one (see Open Question 1).
|
||||||
|
|
||||||
## SchemaBuilder Equivalence
|
## Transition from SchemaBuilder
|
||||||
|
|
||||||
The removed `SchemaBuilder.build()` used to return a `GraphSchema` — a flat
|
The existing `schemaBuilder.ts` and `types.ts` use a different approach that is
|
||||||
object with `config`, `nodeTypes: Record<string, NodeType>`, and `edgeTypes:
|
being replaced:
|
||||||
Record<string, EdgeType>`. A `Type.Module` with the same entries is
|
|
||||||
structurally equivalent. This section documents what the builder was doing
|
|
||||||
internally to show the correspondence.
|
|
||||||
|
|
||||||
### What the builder was doing internally
|
|
||||||
|
|
||||||
```
|
|
||||||
SchemaBuilder
|
|
||||||
.config({ type: "directed", multi: false })
|
|
||||||
.nodeType("call", CallNodeSchema)
|
|
||||||
.edgeType("triggered", EdgeSchema, { allowedSourceTypes: ["call"] })
|
|
||||||
.build()
|
|
||||||
|
|
||||||
internally builds:
|
|
||||||
|
|
||||||
defs = {
|
|
||||||
Config: Type.Object({ type: Literal("directed"), multi: Literal(false), ... }),
|
|
||||||
CallNode: CallNodeSchema,
|
|
||||||
TriggeredEdge: EdgeSchema,
|
|
||||||
TriggeredEdgeConstraints: Type.Object({ edgeType: Literal("triggered"), ... }),
|
|
||||||
}
|
|
||||||
return Type.Module(defs)
|
|
||||||
```
|
|
||||||
|
|
||||||
The `.build()` return type was `TModule` (TypeBox Module). The `SchemaBuilder` is
|
|
||||||
removed — consumers use Module construction directly.
|
|
||||||
|
|
||||||
### Why this is equivalent
|
|
||||||
|
|
||||||
The `SchemaBuilder` was building a module under the hood — it just didn't have a
|
|
||||||
module system to target. Named entries referencing each other via strings is
|
|
||||||
exactly what `Type.Ref()` does natively. The Module format:
|
|
||||||
|
|
||||||
- Gives `Type.Ref()` instead of loose schema objects
|
|
||||||
- Gives `Module.Import()` instead of `Type.Intersect` for cross-package refs
|
|
||||||
- Gives JSON Schema `$defs` that map directly to DB storage
|
|
||||||
- Gives `Value.Check`, `Value.Diff`, `Value.Errors` on the full type system
|
|
||||||
- Gives codegen compatibility via `TsToModule.Generate()`
|
|
||||||
|
|
||||||
For the forward-looking connections (typed graph pointers, dbtype table
|
|
||||||
rendering, ujsx HostConfig for graph schemas), see
|
|
||||||
[forward-look.md](./forward-look.md).
|
|
||||||
|
|
||||||
## Design Decisions
|
|
||||||
|
|
||||||
### DD1: Module replaces SchemaBuilder
|
|
||||||
|
|
||||||
The SchemaBuilder is replaced by TypeBox Modules. The Module format provides
|
|
||||||
what SchemaBuilder was building toward, but natively:
|
|
||||||
- Named references → `Type.Ref()` instead of loose schema objects
|
|
||||||
- Cross-module imports → `Module.Import()` instead of `Type.Intersect`
|
|
||||||
- JSON Schema `$defs` → maps directly to DB storage
|
|
||||||
- Codegen compatibility → `TsToModule.Generate()` produces Module entries
|
|
||||||
|
|
||||||
### DD2: SchemaBuilder removed
|
|
||||||
|
|
||||||
The `SchemaBuilder` is removed. Consumers use `Type.Module()` construction
|
|
||||||
directly, with `Type.Ref()`, `Type.Composite()`, and `Metagraph.Import()`
|
|
||||||
as the building blocks. The `moduleToDbSchema()` function replaces
|
|
||||||
`SchemaBuilder.build()` as the bridge from Module to DB rows.
|
|
||||||
|
|
||||||
### DD3: Config as a Module entry with Literal values
|
|
||||||
|
|
||||||
Specific graph type Modules use `Type.Literal` for config values. The general
|
|
||||||
`Metagraph.Config` with `Type.Union` and defaults is for construction-time
|
|
||||||
validation. The specific Module freezes the config to exact values.
|
|
||||||
|
|
||||||
### DD4: Node/edge attribute schemas are Module entries, not `Type.Any()`
|
|
||||||
|
|
||||||
At the application layer, node and edge attribute schemas are named Module entries
|
|
||||||
with full type safety (`CallGraph.CallNode`, not `schema: Type.Any()`). At the
|
|
||||||
DB storage layer, the meta-schemas (`NodeType`, `EdgeType`) still have
|
|
||||||
`schema: Type.Unknown()` because the DB stores arbitrary JSON Schema blobs — the
|
|
||||||
Module entries are the application-level validation, the DB is the persistence
|
|
||||||
layer.
|
|
||||||
|
|
||||||
**Mapping**: The repository layer maps between Module entries and DB rows using
|
|
||||||
the naming convention (`*Node` → `node_types`, `*Edge` → `edge_types`, `Config`
|
|
||||||
→ `graph_types.config`). On read, it looks up the graph type's Module to get
|
|
||||||
the validation schema for each entry.
|
|
||||||
|
|
||||||
### DD5: Graphology import/export as the bridge to in-memory graphs
|
|
||||||
|
|
||||||
Storage produces data that `@alkdev/flowgraph`'s `FlowGraph.fromJSON()` and
|
|
||||||
`SerializedGraph` consume. The Module entries validate data flowing in both
|
|
||||||
directions. Storage doesn't need its own graphology dependency — it produces
|
|
||||||
the JSON format, flowgraph consumes it.
|
|
||||||
|
|
||||||
### DD6: Repository stores dereferenced entry schemas
|
|
||||||
|
|
||||||
To avoid `Module.Import()` embedding the full `$defs` of referenced Modules in
|
|
||||||
every DB row, the repository layer stores **dereferenced entry schemas** — each
|
|
||||||
`node_types` row gets its entry's resolved JSON Schema with just the transitive
|
|
||||||
`$defs` it needs, not the entire importing Module's definitions.
|
|
||||||
|
|
||||||
### DD7: Edge type constraints as named Module entries, not DB columns
|
|
||||||
|
|
||||||
Edge type constraints (`allowedSourceTypes`/`allowedTargetTypes`) are named
|
|
||||||
Module entries (e.g., `TriggeredEdgeConstraints` with `Type.Array(Type.String())`
|
|
||||||
fields), not just DB columns. This gives them schema validation and
|
|
||||||
serialization. The repository layer projects these entries to the existing
|
|
||||||
`edge_types` columns (arrays of node type name strings). The DB schema
|
|
||||||
doesn't change — the Module entries are the source of truth.
|
|
||||||
|
|
||||||
**Revised from original DD7** which stored constraints only as DB columns.
|
|
||||||
Named entries are strictly more capable: they validate and serialize;
|
|
||||||
DB columns are their persistence projection.
|
|
||||||
|
|
||||||
### DD8: Naming convention for Module entries
|
|
||||||
|
|
||||||
Within a graph type Module, entries are named with role-distinguishing suffixes:
|
|
||||||
`*Node` for node types, `*Edge` for edge types, `Config` for graph configuration,
|
|
||||||
`*EdgeConstraints` for edge endpoint constraints, and bare names or `*Enum` for
|
|
||||||
shared types. `moduleToDbSchema()` uses this convention to map entries to DB
|
|
||||||
tables.
|
|
||||||
|
|
||||||
**Alternative considered**: Explicit metadata/decorators on entries (e.g.,
|
|
||||||
`{ kind: "nodeType", name: "call", schema: ... }`). Rejected because it adds
|
|
||||||
boilerplate without adding information — the suffix convention is simpler
|
|
||||||
and sufficient for the expected Module size (5–20 entries).
|
|
||||||
|
|
||||||
### DD9: Pointer abstraction is forward-looking, not v1
|
|
||||||
|
|
||||||
The structural analogy between ujsx's `ValuePointer`/`selectNode`/`setNode` and
|
|
||||||
graph node/edge addressing is real, but implementing typed graph pointers (via
|
|
||||||
JPATH Module or reactive signals) is a post-v1 concern. For v1, repository
|
|
||||||
functions use direct key-based addressing and the Module validates attribute
|
|
||||||
shapes. The Module's existence makes typed pointers feasible later because
|
|
||||||
it provides the schema the pointer validates against.
|
|
||||||
|
|
||||||
**Alternative considered**: Implement typed pointers in v1 via a lightweight
|
|
||||||
`GraphPointer<T>` wrapper. Rejected because it requires either JPATH Module
|
|
||||||
dependency or reactive signal integration, both of which add complexity
|
|
||||||
without clear v1 benefit. Direct key-based addressing is sufficient.
|
|
||||||
|
|
||||||
### DD10: dbtype integration is post-v1
|
|
||||||
|
|
||||||
`@alkdev/dbtype`'s UJSX→Module→Host pipeline can eliminate the manual dual
|
|
||||||
definition of SQLite/PG table schemas. But dbtype is Phase 0 (architecture
|
|
||||||
complete, no implementation). For v1, storage uses manual Drizzle table
|
|
||||||
definitions. The Module-based graph type definitions are compatible with dbtype
|
|
||||||
because both produce `Type.Module` objects — the integration path is clear.
|
|
||||||
|
|
||||||
**Alternative considered**: Implement dbtype integration alongside the initial Module
|
|
||||||
construction. Rejected because it adds a dependency on an unimplemented package
|
|
||||||
and the manual table definitions work well. The cost of deferring is continued
|
|
||||||
dual SQLite/PG maintenance, which is manageable for 6 metagraph tables.
|
|
||||||
|
|
||||||
## What Changes
|
|
||||||
|
|
||||||
| Before (unreleased) | After |
|
| Before (unreleased) | After |
|
||||||
|---------|-----|
|
|---------|-----|
|
||||||
| `types.ts` — standalone schemas | `modules/metagraph.ts` — `Metagraph` Module |
|
| `types.ts` — standalone schemas | `modules/metagraph.ts` — `Metagraph` Module |
|
||||||
| `schemaBuilder.ts` — fluent builder | Removed — replaced by Module construction |
|
| `schemaBuilder.ts` — fluent builder | Removed — replaced by `Type.Module()` construction |
|
||||||
| `types.ts` — `BaseNodeAttributes`, `BaseEdgeAttributes` | `Metagraph` Module entries |
|
| `types.ts` — `BaseNodeAttributes`, `BaseEdgeAttributes` | `Metagraph` Module entries |
|
||||||
| `types.ts` — `GraphConfig`, `GraphStatus`, `GraphBaseType` | `Metagraph` Module entries + const objects |
|
| `types.ts` — `GraphConfig`, `GraphStatus`, `GraphBaseType` | `Metagraph` Module entries + const objects |
|
||||||
| `allowedSourceTypes`/`allowedTargetTypes` as DB columns only | Named `*EdgeConstraints` Module entries (projected to DB columns) |
|
| `allowedSourceTypes`/`allowedTargetTypes` as DB columns only | Named `*EdgeConstraints` Module entries (projected to DB columns) |
|
||||||
| No concrete graph type Modules | `modules/call-graph.ts`, `modules/acl-graph.ts`, etc. |
|
| No concrete graph type Modules | `modules/call-graph.ts`, `modules/acl-graph.ts`, etc. |
|
||||||
| No bridge between Module ↔ DB ↔ graphology | `bridge.ts` — validation, DB mapping, graphology format |
|
| No bridge between Module ↔ DB ↔ graphology | `bridge.ts` — validation, DB mapping, graphology format |
|
||||||
|
|
||||||
## What Doesn't Change
|
Note: `Type.Any()` used in the old `types.ts` for `metadata` and `schema` fields
|
||||||
|
is replaced by `Type.Unknown()` in the Module approach. Both produce `{}` in
|
||||||
|
JSON Schema, but `Type.Unknown()` is the canonical choice — it explicitly
|
||||||
|
communicates "no validation applied."
|
||||||
|
|
||||||
- **Database tables** — same 6 metagraph tables, same columns, same relations
|
**What doesn't change**: The 6 metagraph database tables, their columns, and
|
||||||
- **SQLite host** — table definitions, relations, client factory unchanged
|
relations remain the same. SQLite host table definitions, client factory, and
|
||||||
- **PostgreSQL host** (planned) — same shapes, different dialect
|
drizzlebox-generated schemas are unchanged. The `@alkdev/typebox` dependency is
|
||||||
- **`@alkdev/typebox` dependency** — same. Modules are a core TypeBox feature
|
unchanged. The encryption utility (planned) is unchanged. `allowedSourceTypes`
|
||||||
- **Encryption utility** — unchanged, can be a Module entry in `SecretGraph`
|
and `allowedTargetTypes` remain DB columns with the same semantics — Module
|
||||||
- **`allowedSourceTypes`/`allowedTargetTypes`** — same DB columns, same semantics
|
entries are the source of truth, projected to columns by `moduleToDbSchema()`.
|
||||||
(Module entries are the source of truth, projected to DB columns by
|
|
||||||
`moduleToDbSchema()`)
|
|
||||||
|
|
||||||
## Implementation Path
|
## Implementation Path
|
||||||
|
|
||||||
@@ -817,62 +548,138 @@ dual SQLite/PG maintenance, which is manageable for 6 metagraph tables.
|
|||||||
4. **Phase 4**: Add `moduleToGraphology()` and `fromGraphologyExport()` for the
|
4. **Phase 4**: Add `moduleToGraphology()` and `fromGraphologyExport()` for the
|
||||||
graphology bridge. Storage produces the format, flowgraph consumes it.
|
graphology bridge. Storage produces the format, flowgraph consumes it.
|
||||||
|
|
||||||
Acceptance criteria per phase:
|
Acceptance criteria:
|
||||||
- **Phase 2 complete**: `moduleToDbSchema()` produces values compatible with all
|
- **Phase 2 complete**: `moduleToDbSchema()` produces values compatible with
|
||||||
6 metagraph tables
|
all 6 metagraph tables
|
||||||
- **Phase 3 complete**: Reference Modules validate against their flowgraph/taskgraph
|
- **Phase 3 complete**: Reference Modules validate against their
|
||||||
counterparts
|
flowgraph/taskgraph counterparts
|
||||||
|
|
||||||
## Relationship to Other Packages
|
### Relationship to Other Packages
|
||||||
|
|
||||||
| Package | What changes | What stays |
|
| Package | What changes | What stays |
|
||||||
|---------|-------------|------------|
|
|---------|-------------|------------|
|
||||||
| `@alkdev/storage` | `types.ts` → Module, `schemaBuilder.ts` → removed, new `modules/` and `bridge.ts` | Tables, relations, crypto, client factory |
|
| `@alkdev/storage` | `types.ts` → Module, `schemaBuilder.ts` → removed, new `modules/` and `bridge.ts` | Tables, relations, crypto, client factory |
|
||||||
| `@alkdev/flowgraph` | `CallNodeAttrs`, `CallEdgeAttrs`, `CallStatus` become Module entries (optional, exported from `/schema` subpath) | FlowGraph class, analysis, all runtime logic |
|
| `@alkdev/flowgraph` | `CallNodeAttrs`, `CallEdgeAttrs`, `CallStatus` become Module entries (optional, exported from `/schema`) | FlowGraph class, analysis, all runtime logic |
|
||||||
| `@alkdev/taskgraph` | `TaskGraphNodeAttributes`, `DependencyEdge` become Module entries (optional) | TaskGraph class, analysis, all runtime logic |
|
| `@alkdev/taskgraph` | `TaskGraphNodeAttributes`, `DependencyEdge` become Module entries (optional) | TaskGraph class, analysis, all runtime logic |
|
||||||
| `@alkdev/operations` | `Identity`, `AccessControl` become Module entries (optional) | Registry, call protocol, adapters |
|
| `@alkdev/operations` | `Identity`, `AccessControl` become Module entries (optional) | Registry, call protocol, adapters |
|
||||||
| `@alkdev/pubsub` | No change | Transport layer |
|
| `@alkdev/pubsub` | No change | Transport layer |
|
||||||
| `@alkdev/ujsx` | No change (already a Module) | The pattern we're following |
|
| `@alkdev/ujsx` | No change (already a Module) | The pattern we're following |
|
||||||
| `@alkdev/dbtype` | No change (Phase 0) | Future: storage table defs could be dbtype element trees |
|
| `@alkdev/dbtype` | No change (Phase 0) | Future: storage table defs could be dbtype element trees |
|
||||||
|
|
||||||
|
## Design Decisions
|
||||||
|
|
||||||
|
### DD1: TypeBox Module replaces the SchemaBuilder
|
||||||
|
|
||||||
|
Graph type definitions are `Type.Module` objects. The previous `SchemaBuilder`
|
||||||
|
class is removed — consumers use `Type.Module()` construction directly, with
|
||||||
|
`Type.Ref()`, `Type.Composite()`, and `Metagraph.Import()` as the building
|
||||||
|
blocks. The `moduleToDbSchema()` function replaces `SchemaBuilder.build()` as
|
||||||
|
the bridge from Module to DB rows.
|
||||||
|
|
||||||
|
This provides `Type.Ref()` for internal references, `Module.Import()` for
|
||||||
|
cross-package references, JSON Schema `$defs` that map directly to DB storage,
|
||||||
|
and codegen compatibility via `TsToModule.Generate()`.
|
||||||
|
|
||||||
|
### DD2: Metagraph.Import() for same-package Modules
|
||||||
|
|
||||||
|
Concrete graph types within `@alkdev/storage` use `Metagraph.Import("BaseNode")`
|
||||||
|
to compose base schemas. This avoids duplication and keeps the base schemas in
|
||||||
|
one place. External packages that define graph type Modules should re-declare
|
||||||
|
base schemas locally — storage should not be a dependency of other packages'
|
||||||
|
schema definitions.
|
||||||
|
|
||||||
|
### DD3: Config as a Module entry with Literal values
|
||||||
|
|
||||||
|
General `Metagraph.Config` uses `Type.Union` with defaults for construction-time
|
||||||
|
validation. Specific graph types freeze config values to `Type.Literal`, making
|
||||||
|
the config a precise contract rather than a validation surface.
|
||||||
|
|
||||||
|
### DD4: Node/edge attribute schemas are Module entries, not Type.Any()
|
||||||
|
|
||||||
|
At the application layer, node and edge attribute schemas are named Module
|
||||||
|
entries with full type safety (`CallGraph.CallNode`, not `schema: Type.Any()`).
|
||||||
|
At the DB storage layer, the meta-schemas (`NodeType`, `EdgeType`) still have
|
||||||
|
`schema: Type.Unknown()` because the DB stores arbitrary JSON Schema blobs.
|
||||||
|
|
||||||
|
### DD5: Storage produces graphology format, flowgraph consumes it
|
||||||
|
|
||||||
|
Storage doesn't need a graphology dependency. It produces the JSON serialization
|
||||||
|
format that `@alkdev/flowgraph`'s `FlowGraph.fromJSON()` and `SerializedGraph`
|
||||||
|
consume. The Module entries validate data flowing in both directions.
|
||||||
|
|
||||||
|
### DD6: Repository stores dereferenced entry schemas
|
||||||
|
|
||||||
|
When a Module entry uses `Module.Import()`, the entry's JSON Schema embeds the
|
||||||
|
referenced Module's `$defs`. To avoid storing the full referenced Module in
|
||||||
|
every DB row, the repository layer stores **dereferenced entry schemas** — each
|
||||||
|
`node_types` row gets its entry's resolved JSON Schema with just the transitive
|
||||||
|
`$defs` it needs, not the entire importing Module's definitions.
|
||||||
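The "just the transitive `$defs` it needs" rule in DD6 amounts to a reachability walk over `$ref` values. The sketch below assumes refs are either bare definition names or `#/$defs/Name` pointers; the exact `$ref` form TypeBox emits for Module entries is not confirmed here, and `collectRefs` and `pruneDefs` are hypothetical names.

```ts
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Gathers every $ref target name found anywhere inside a schema fragment.
function collectRefs(node: Json, out: Set<string>): void {
  if (Array.isArray(node)) {
    for (const item of node) collectRefs(item, out);
    return;
  }
  if (node === null || typeof node !== "object") return;
  for (const [key, value] of Object.entries(node)) {
    if (key === "$ref" && typeof value === "string") {
      out.add(value.replace(/^#\/\$defs\//, ""));
    } else {
      collectRefs(value, out);
    }
  }
}

// Keeps only the definitions reachable from the entry, directly or transitively.
function pruneDefs(entry: Json, defs: Record<string, Json>): Record<string, Json> {
  const kept: Record<string, Json> = {};
  const pending: Json[] = [entry];
  while (pending.length > 0) {
    const found = new Set<string>();
    collectRefs(pending.pop() as Json, found);
    for (const name of found) {
      const known = Object.prototype.hasOwnProperty.call(defs, name);
      const seen = Object.prototype.hasOwnProperty.call(kept, name);
      if (!known || seen) continue;
      kept[name] = defs[name];
      pending.push(defs[name]);
    }
  }
  return kept;
}
```

Tracking already-kept names also makes the walk terminate on recursive schemas, where a definition refers back to itself.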
|
|
||||||
|
### DD7: Edge type constraints as named Module entries
|
||||||
|
|
||||||
|
Edge type constraints (`allowedSourceTypes`/`allowedTargetTypes`) are named
|
||||||
|
Module entries (e.g., `TriggeredEdgeConstraints`), not just DB columns. This
|
||||||
|
gives them schema validation (`Value.Check`) and serialization (JSON Schema
|
||||||
|
with `$defs`). The repository layer projects these entries to the existing
|
||||||
|
`edge_types` columns. The DB schema doesn't change — Module entries are the
|
||||||
|
source of truth, DB columns are the persistence projection.
|
||||||
|
|
||||||
|
### DD8: Naming convention for Module entries
|
||||||
|
|
||||||
|
Module entries use role-distinguishing suffixes: `*Node` for node types,
|
||||||
|
`*Edge` for edge types, `Config` for graph configuration, `*EdgeConstraints`
|
||||||
|
for edge endpoint constraints, and bare names or `*Enum` for shared types.
|
||||||
|
`moduleToDbSchema()` uses this convention to map entries to DB tables.
|
||||||
|
|
||||||
|
This was chosen over explicit metadata/decorators (e.g.,
|
||||||
|
`{ kind: "nodeType", name: "call", schema: ... }`) because the suffix convention
|
||||||
|
is simpler and sufficient for the expected Module size (5–20 entries).
|
||||||
|
|
||||||
|
### DD9: Pointer abstraction is forward-looking, not v1
|
||||||
|
|
||||||
|
The structural analogy between ujsx's `ValuePointer`/`selectNode`/`setNode` and
|
||||||
|
graph node/edge addressing is real, but implementing typed graph pointers (via
|
||||||
|
JPATH Module or reactive signals) is a post-v1 concern. For v1, repository
|
||||||
|
functions use direct key-based addressing (`findNode(graphId, nodeKey)`), and
|
||||||
|
the Module validates attribute shapes. See [forward-look.md](./forward-look.md).
|
||||||
|
|
||||||
|
### DD10: dbtype integration is post-v1
|
||||||
|
|
||||||
|
`@alkdev/dbtype`'s UJSX→Module→Host pipeline can eliminate the manual dual
|
||||||
|
definition of SQLite/PG table schemas. But dbtype is Phase 0 (architecture
|
||||||
|
complete, no implementation). For v1, storage uses manual Drizzle table
|
||||||
|
definitions. The Module-based graph type definitions are compatible with dbtype
|
||||||
|
because both produce `Type.Module` objects — the integration path is clear.
|
||||||
|
See [forward-look.md](./forward-look.md).
|
||||||
|
|
||||||
## Open Questions
|
## Open Questions
|
||||||
|
|
||||||
1. **Should `@alkdev/flowgraph` export a `Type.Module`, or should storage define
|
1. **Should `@alkdev/flowgraph` export a `Type.Module`, or should storage define
|
||||||
its own entries with documented correspondence?** Flowgraph currently exports
|
its own entries with documented correspondence?** Flowgraph currently exports
|
||||||
`CallNodeAttrs` as a standalone `Type.Object`. To use `Import()`, flowgraph
|
`CallNodeAttrs` as a standalone `Type.Object`. To use `Import()`, flowgraph
|
||||||
needs to export a Module. But storage can start with standalone schemas and
|
needs to export a Module. Storage can start with standalone schemas and
|
||||||
`Type.Composite([BaseNode, CallNodeAttrs])` — no dependency on flowgraph.
|
`Type.Composite([BaseNode, CallNodeAttrs])` — no dependency on flowgraph.
|
||||||
Adopt `Import()` when flowgraph provides a Module. **This avoids a
|
Adopt `Import()` when flowgraph provides a Module. **This avoids a circular
|
||||||
circular dependency: `@alkdev/storage` does NOT depend on `@alkdev/flowgraph`.**
|
dependency: `@alkdev/storage` does NOT depend on `@alkdev/flowgraph`.**
|
||||||
|
|
||||||
2. **Should concrete graph type Modules live in storage or in their respective
|
2. **Should concrete graph type Modules live in storage or in their respective
|
||||||
packages?** Call-graph attribute schemas are defined by flowgraph's domain, not
|
packages?** Call-graph attribute schemas are defined by flowgraph's domain, not
|
||||||
storage's. Storage provides the metagraph *framework* (the `Metagraph` Module
|
storage's. Storage provides the metagraph *framework* (the `Metagraph` Module
|
||||||
with `BaseNode`, `BaseEdge`, `Config`). Concrete graph types like `CallGraph`
|
with `BaseNode`, `BaseEdge`, `Config`). Concrete types like `CallGraph` could
|
||||||
could live either in storage (as reference implementations) or in their
|
live either in storage (as reference implementations) or in their respective
|
||||||
respective packages (flowgraph exports `CallGraph` Module alongside
|
packages. **Decision: Both.** Storage provides reference Modules in `modules/`
|
||||||
`CallNodeAttrs`). **Decision: Both.** Storage provides reference Modules in
|
that consumers can use directly or replace. Flowgraph may also export a
|
||||||
`modules/` that consumers can use directly or replace. Flowgraph may also
|
Module — the two are compatible via Module `$defs`.
|
||||||
export a Module — the two are compatible via Module `$defs`.
|
|
||||||
|
|
||||||
3. **Should `*EdgeConstraints` entries use `Type.Ref("CallNode")` or
|
3. **Should `*EdgeConstraints` entries use `Type.Ref("CallNode")` or
|
||||||
`Type.String()` for allowed source/target types?** Using `Type.Ref`
|
`Type.String()` for allowed source/target types?** See the
|
||||||
would mean "each element in the array must validate against the CallNode
|
[Edge Type Constraints](#edge-type-constraints) section. **Decision:
|
||||||
schema," which is semantically wrong — the constraint is about which named
|
`Type.String()`** — the constraint arrays contain names, not schemas.
|
||||||
node types are valid endpoints, not about data shapes. Using `Type.String()`
|
|
||||||
matches the actual semantics (arrays of node type names) but loses the
|
|
||||||
structural link. **Decision: `Type.String()`** — the constraint arrays
|
|
||||||
contain names, not schemas. The naming convention provides an implicit
|
|
||||||
contract that string values should correspond to `*Node` entry names,
|
|
||||||
enforced by `moduleToDbSchema()` at projection time.
|
|
||||||
|
|
||||||
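The `Type.String()` decision in item 3 means the link between constraint names and `*Node` entries has to be checked separately. A minimal, dependency-free sketch of such a check follows. The `EdgeConstraint` shape, its field names, and `findUnknownNodeNames` are hypothetical illustrations, not the package's API, and the Module is reduced to a plain record of entry names.

```typescript
// Hypothetical stand-in for a graph-type Module: only the entry names matter here.
type ModuleEntries = Record<string, unknown>;

// Hypothetical constraint shape: arrays of node type *names*, not schemas.
interface EdgeConstraint {
  edgeType: string;
  allowedSources: string[];
  allowedTargets: string[];
}

// Returns one message per constraint name that has no matching `*Node` entry.
function findUnknownNodeNames(
  entries: ModuleEntries,
  constraints: EdgeConstraint[],
): string[] {
  // Naming convention assumed from the text: node entries end in "Node".
  const nodeNames = Object.keys(entries).filter((name) => /Node$/.test(name));
  const problems: string[] = [];
  for (const constraint of constraints) {
    const names = constraint.allowedSources.concat(constraint.allowedTargets);
    for (const name of names) {
      if (nodeNames.indexOf(name) === -1) {
        problems.push(constraint.edgeType + ': unknown node type "' + name + '"');
      }
    }
  }
  return problems;
}

// Usage: "CallNod" is a typo, so exactly one problem is reported.
const edgeConstraintProblems = findUnknownNodeNames(
  { CallNode: {}, CallEdge: {} },
  [{ edgeType: "calls", allowedSources: ["CallNode"], allowedTargets: ["CallNod"] }],
);
```

A check of this kind keeps the constraint arrays as plain strings while still catching names that do not correspond to any node entry.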
4. **How does the graph pointer abstraction interact with the repository layer?**
|
4. **How does the graph pointer abstraction interact with the repository layer?**
|
||||||
For v1, repository functions use direct key-based addressing. Typed pointers
|
For v1, repository functions use direct key-based addressing. **Decision:
|
||||||
(JPATH Module, reactive ValuePointer) could layer on top of the repository
|
validate on read** — if data doesn't match the Module entry, throw. This
|
||||||
later. The key question: does the repository return raw data (untyped JSON),
|
makes any value retrieved from the repo conform to the schema.
|
||||||
or does it validate against the Module before returning? **Decision: validate
|
|
||||||
on read** — if the data doesn't match the Module entry, throw. This makes
|
|
||||||
typed pointers safe: any value you get from the repo conforms to the schema.
|
|
||||||
|
|
||||||
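The validate-on-read decision in item 4 can be sketched without any library, assuming the schema check is supplied as a type-guard function. `ValidatingRepo`, `CallNodeLike`, and `isCallNodeLike` are hypothetical names for illustration. In the real package the guard would presumably come from the Module entry rather than being hand-written.

```typescript
// A type guard standing in for "validate against the Module entry".
type Validator<T> = (value: unknown) => value is T;

// Hypothetical repository with direct key-based addressing.
class ValidatingRepo<T> {
  private readonly rows: Record<string, unknown> = {};

  constructor(private readonly isValid: Validator<T>) {}

  // Stores raw data without checking it, as a database row might arrive.
  putRaw(key: string, value: unknown): void {
    this.rows[key] = value;
  }

  // Validates on read: throws if the stored data does not match the schema.
  get(key: string): T {
    const raw = this.rows[key];
    if (!this.isValid(raw)) {
      throw new Error('Stored value for "' + key + '" does not match its schema');
    }
    return raw;
  }
}

// Hypothetical node shape and hand-written guard for the example.
interface CallNodeLike {
  id: string;
  name: string;
}

function isCallNodeLike(value: unknown): value is CallNodeLike {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return typeof candidate.id === "string" && typeof candidate.name === "string";
}

const callNodeRepo = new ValidatingRepo<CallNodeLike>(isCallNodeLike);
callNodeRepo.putRaw("n1", { id: "n1", name: "main" });
callNodeRepo.putRaw("n2", { id: 42 }); // malformed row

const goodNode = callNodeRepo.get("n1"); // typed as CallNodeLike
```

Reading `"n2"` throws instead of returning untyped JSON, which is what lets callers treat any value returned from the repository as conforming to the schema.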
## References
|
## References
|
||||||
|
|
||||||
@@ -880,8 +687,8 @@ Acceptance criteria per phase:
|
|||||||
- ujsx ADR-002 (Module as type registry): `/workspace/@alkdev/ujsx/docs/architecture/decisions/002-typebox-module-as-registry.md`
|
- ujsx ADR-002 (Module as type registry): `/workspace/@alkdev/ujsx/docs/architecture/decisions/002-typebox-module-as-registry.md`
|
||||||
- ujsx schema docs: `/workspace/@alkdev/ujsx/docs/architecture/schema.md`
|
- ujsx schema docs: `/workspace/@alkdev/ujsx/docs/architecture/schema.md`
|
||||||
- TsToModule codegen: `/workspace/research/typebox_research/codegen/ts-to-module.ts`
|
- TsToModule codegen: `/workspace/research/typebox_research/codegen/ts-to-module.ts`
|
||||||
- ujsx Module examples: `/workspace/research/typebox_research/ujsx/unist.gen.ts`, `/workspace/research/typebox_research/ujsx/mdast.gen.ts`
|
|
||||||
- Flowgraph schema (standalone TypeBox, not yet Module): `/workspace/@alkdev/flowgraph/src/schema/`
|
- Flowgraph schema (standalone TypeBox, not yet Module): `/workspace/@alkdev/flowgraph/src/schema/`
|
||||||
- Flowgraph SerializedGraph factory: `/workspace/@alkdev/flowgraph/src/schema/graph.ts`
|
- Flowgraph SerializedGraph factory: `/workspace/@alkdev/flowgraph/src/schema/graph.ts`
|
||||||
- Forward-looking connections (pointers, dbtype, ujsx IR): [forward-look.md](./forward-look.md)
|
- Schema evolution: [schema-evolution.md](./schema-evolution.md)
|
||||||
- Ecosystem integration: [overview.md](./overview.md)
|
- Forward-looking connections: [forward-look.md](./forward-look.md)
|
||||||
|
- Package overview: [overview.md](./overview.md)
|
||||||