
glm-5.1 d2253099ee add flowgraph architecture docs (Phase 1 SDD)

Draft architecture specification for @alkdev/flowgraph — a workflow graph library providing DAG-based orchestration over operations. Covers two graph types (operation graph, call graph), ujsx workflow templates, GraphologyHost and ReactiveHost configs, signal-driven execution, type-compatibility analysis, error hierarchy, and build/distribution. Includes 3 ADRs: ujsx as template IR, DAG-only enforcement, decoupled storage.

2026-05-19 09:36:22 +00:00


status	last_updated
draft	2026-05-19

Operation Graph (Static)

The static operation graph is built from OperationSpecs at startup. Nodes represent operations; edges represent type compatibility between output and input schemas.

Overview

The operation graph is built once at startup from the OperationRegistry. It answers structural questions about the operation space:

Type compatibility: Can operation A's output feed into operation B's input?
Cycle detection: Are there circular operation dependencies?
Reachability: What operations are reachable from a given starting point?
Template validation: Is a proposed call sequence structurally valid?

The operation graph is immutable after construction. Operations don't appear or disappear at runtime — they're registered once and the graph is built from the registry. If the registry changes, the graph is rebuilt from scratch.

Construction

fromSpecs()

static fromSpecs(specs: OperationSpec[]): FlowGraph

The primary construction path. Takes an array of OperationSpec objects (from OperationRegistry.getAll()) and builds a directed graph where:

Nodes — one per operation, key = namespace.name, attributes = OperationNodeAttrs
Typed edges — one per (source, target) pair whose type compatibility can be determined; the compatible attribute records whether the output schema of the source fits the input schema of the target

const graph = FlowGraph.fromSpecs(registry.getAll());

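Edge construction calls the typeCompat analysis function for each (source, target) pair. When the result is definite, an edge is added and the result's compatible flag is stored on it: true for a compatible pair, false for an incompatible one. Pairs whose compatibility cannot be determined get no edge (see "Why compatible: false edges?").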

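Edge construction evaluates every ordered (source, target) pair, so it makes n(n − 1) typeCompat calls and the number of edges is O(n²) in the worst case. For realistic registries (10-50 operations) that is at most 50 × 49 = 2,450 comparisons, which is expected to take less than a millisecond. If the registry grows large, edge construction can be deferred to query time.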
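The pairwise pass can be sketched as follows. This is a minimal illustration, not the library's implementation: `SpecSketch`, `typeCompatSketch`, and `comparePairs` are hypothetical names, the `task.publish` operation is invented, and plain string equality stands in for the real schema analysis.

```typescript
// Hypothetical, simplified stand-in; the real OperationSpec carries TypeBox schemas.
interface SpecSketch {
  namespace: string;
  name: string;
  inputType: string;  // stand-in for inputSchema
  outputType: string; // stand-in for outputSchema
}

interface PairResult {
  source: string;
  target: string;
  compatible: boolean;
}

// Stand-in for typeCompat: string equality instead of structural schema analysis.
function typeCompatSketch(output: string, input: string): { compatible: boolean } {
  return { compatible: output === input };
}

// Visit every ordered (source, target) pair once: n * (n - 1) comparisons.
function comparePairs(specs: SpecSketch[]): PairResult[] {
  const results: PairResult[] = [];
  for (const source of specs) {
    for (const target of specs) {
      if (source === target) continue; // an operation is never paired with itself
      results.push({
        source: `${source.namespace}.${source.name}`,
        target: `${target.namespace}.${target.name}`,
        compatible: typeCompatSketch(source.outputType, target.inputType).compatible,
      });
    }
  }
  return results;
}

const sketchSpecs: SpecSketch[] = [
  { namespace: "task", name: "classify", inputType: "Text", outputType: "Label" },
  { namespace: "task", name: "enrich", inputType: "Label", outputType: "Report" },
  { namespace: "task", name: "publish", inputType: "Report", outputType: "Receipt" },
];
const pairResults = comparePairs(sketchSpecs);
```

Three operations yield six ordered pairs. How each result is turned into an edge is up to the graph builder.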

fromJSON()

static fromJSON(data: OperationGraphSerialized): FlowGraph

Deserializes a graph from graphology's native JSON format. The input is validated against the OperationGraphSerialized schema with Value.Check(); if validation fails, fromJSON() throws InvalidInputError.

Round-trip: fromSpecs() → export() → fromJSON() is lossless.

Incremental construction

const graph = new FlowGraph();
graph.addOperation(spec);
graph.addTypedEdge("task.classify", "task.enrich", { compatible: true, detail: "output → input" });

addOperation adds a node. addTypedEdge adds a type-compatibility edge. Both throw on duplicates (matching taskgraph's behavior).
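The duplicate and endpoint checks might look roughly like the sketch below. `MiniFlowGraph` and `throwsError` are hypothetical names, the class uses a plain Set and Map rather than graphology, and it throws a generic Error where the real library would use its own error hierarchy.

```typescript
// Hypothetical minimal graph; the real FlowGraph is described as wrapping graphology.
interface TypedEdgeSketch {
  compatible: boolean;
  detail?: string;
}

class MiniFlowGraph {
  private nodes = new Set<string>();
  private edges = new Map<string, TypedEdgeSketch>();

  // Adds a node keyed by namespace.name; rejects duplicates.
  addOperation(spec: { namespace: string; name: string }): string {
    const key = `${spec.namespace}.${spec.name}`;
    if (this.nodes.has(key)) {
      throw new Error(`Operation already exists: ${key}`);
    }
    this.nodes.add(key);
    return key;
  }

  // Adds at most one edge per (source, target) pair; rejects unknown endpoints.
  addTypedEdge(source: string, target: string, attrs: TypedEdgeSketch): void {
    for (const endpoint of [source, target]) {
      if (!this.nodes.has(endpoint)) {
        throw new Error(`Operation not found: ${endpoint}`);
      }
    }
    const edgeKey = `${source}->${target}`;
    if (this.edges.has(edgeKey)) {
      throw new Error(`Edge already exists: ${edgeKey}`);
    }
    this.edges.set(edgeKey, attrs);
  }

  edgeCount(): number {
    return this.edges.size;
  }
}

// Small helper for trying out the failure cases: reports whether a call throws.
function throwsError(fn: () => void): boolean {
  try {
    fn();
    return false;
  } catch {
    return true;
  }
}

const mini = new MiniFlowGraph();
mini.addOperation({ namespace: "task", name: "classify" });
mini.addOperation({ namespace: "task", name: "enrich" });
mini.addTypedEdge("task.classify", "task.enrich", { compatible: true, detail: "output → input" });
```

Repeating either `addOperation` call, or the `addTypedEdge` call, throws in this sketch.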

Node Attributes

See schema.md for the full schema definition. Key fields:

Field	Purpose
`name`	Operation name (e.g., `"classify"`)
`namespace`	Namespace (e.g., `"task"`)
`type`	`"query" | "mutation" | "subscription"`
`inputSchema`	TypeBox schema for input — used by type-compatibility analysis
`outputSchema`	TypeBox schema for output — used by type-compatibility analysis

The node key is ${namespace}.${name}, matching the operationId format.

Edges

Edges represent type compatibility. The edge direction is:

source → target  (source's output is compatible with target's input)

Following graphology convention: graph.inNeighbors("task.enrich") returns operations that can feed into enrich. graph.outNeighbors("task.classify") returns operations that classify can feed into.

This direction matches the data flow: classify produces output that enrich can consume. It also matches taskgraph's prerequisite → dependent convention.
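To make the direction concrete, here is a small sketch over a plain edge list. `inNeighborsSketch` and `outNeighborsSketch` are hypothetical helpers that mimic the semantics of graphology's `inNeighbors` and `outNeighbors`; the `task.publish` operation is invented for the example.

```typescript
// Each edge is [source, target]: the source's output flows into the target's input.
type EdgePair = [string, string];

const dataFlowEdges: EdgePair[] = [
  ["task.classify", "task.enrich"],
  ["task.enrich", "task.publish"],
];

// Operations whose output can feed into `node`.
function inNeighborsSketch(edges: EdgePair[], node: string): string[] {
  return edges.filter(([, target]) => target === node).map(([source]) => source);
}

// Operations that `node` can feed into.
function outNeighborsSketch(edges: EdgePair[], node: string): string[] {
  return edges.filter(([source]) => source === node).map(([, target]) => target);
}

const feedsEnrich = inNeighborsSketch(dataFlowEdges, "task.enrich");
const fedByClassify = outNeighborsSketch(dataFlowEdges, "task.classify");
```

Here `feedsEnrich` contains `task.classify` and `fedByClassify` contains `task.enrich`, matching the prerequisite → dependent reading.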

Edge attributes

{
  edgeType: "typed",
  compatible: true,
  detail: "classify.output → enrich.input"
}

edgeType — always "typed" for operation graph edges
compatible — whether the source's output schema is compatible with the target's input schema
detail — optional human-readable description of the compatibility relationship

Why compatible: false edges?

The operation graph includes incompatible edges (compatible: false) alongside compatible ones. This is intentional:

Diagnostic value — showing all potential connections, both valid and invalid, helps developers understand the operation space.
Template authoring — when building a workflow template, seeing that A → B is incompatible (and why) is more useful than seeing no edge at all.
Type mismatch prevention — incompatible edges make it clear where type conversions would be needed.

The typeCompat analysis function determines compatibility. Edges where compatibility cannot be determined (e.g., inputSchema is Unknown) are not added at all — there's no "unknown compatibility" edge.
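As an illustration of the diagnostic use, the sketch below lists the incompatible connections leaving one operation. The `DiagnosticEdge` shape mirrors the edge attributes described above, but `describeIncompatible` and the `task.notify` operation are hypothetical.

```typescript
interface DiagnosticEdge {
  source: string;
  target: string;
  compatible: boolean;
  detail?: string;
}

// Hypothetical sample edges; "task.notify" is an invented operation.
const sampleEdges: DiagnosticEdge[] = [
  {
    source: "task.classify",
    target: "task.enrich",
    compatible: true,
    detail: "classify.output → enrich.input",
  },
  {
    source: "task.classify",
    target: "task.notify",
    compatible: false,
    detail: "output is string, input requires number",
  },
];

// Lists the connections from `source` that would need a type conversion.
function describeIncompatible(edges: DiagnosticEdge[], source: string): string[] {
  return edges
    .filter((edge) => edge.source === source && !edge.compatible)
    .map((edge) => `${edge.source} → ${edge.target}: ${edge.detail ?? "no detail recorded"}`);
}

const incompatibleReport = describeIncompatible(sampleEdges, "task.classify");
```

A template authoring tool could surface these lines next to the valid connections instead of showing nothing for the mismatched pair.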

Validation

The operation graph validates:

Cycle detection — throws CircularDependencyError if any cycle exists. Unlike taskgraph (which allows cycles and detects them via hasCycles()), flowgraph enforces acyclicity at construction time. A cycle in the operation graph means an operation's output eventually feeds back into its own input, either directly or through a chain of other operations, which is a design error.
Dangling references — edges that reference operations not in the graph are structural errors. addTypedEdge throws OperationNotFoundError if either endpoint doesn't exist.
Isolated operations — validateGraph() warns about nodes that have neither incoming nor outgoing edges, i.e. operations that are not connected to the type flow graph.
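Two of these checks can be sketched on a plain adjacency record. This is only an illustration of the logic, assuming every node appears as a key; `findCycleSketch` and `isolatedNodesSketch` are hypothetical names, and the library itself is described as relying on graphology and graphology-dag rather than hand-written traversal.

```typescript
type Adjacency = Record<string, string[]>; // node key -> outgoing neighbor keys

// Depth-first search; returns one cycle as a path of node keys, or undefined if acyclic.
function findCycleSketch(graph: Adjacency): string[] | undefined {
  const state: Record<string, "visiting" | "done" | undefined> = {};
  const stack: string[] = [];

  function visit(node: string): string[] | undefined {
    state[node] = "visiting";
    stack.push(node);
    for (const next of graph[node] ?? []) {
      if (state[next] === "visiting") {
        // A back edge to a node on the current path closes a cycle.
        return stack.slice(stack.indexOf(next)).concat(next);
      }
      if (state[next] === undefined) {
        const cycle = visit(next);
        if (cycle) return cycle;
      }
    }
    stack.pop();
    state[node] = "done";
    return undefined;
  }

  for (const node of Object.keys(graph)) {
    if (state[node] === undefined) {
      const cycle = visit(node);
      if (cycle) return cycle;
    }
  }
  return undefined;
}

// Nodes with neither incoming nor outgoing edges.
function isolatedNodesSketch(graph: Adjacency): string[] {
  const hasIncoming = new Set<string>();
  for (const node of Object.keys(graph)) {
    for (const target of graph[node]) hasIncoming.add(target);
  }
  return Object.keys(graph).filter(
    (node) => graph[node].length === 0 && !hasIncoming.has(node),
  );
}
```

A construction path could throw when `findCycleSketch` returns a path, and report the result of `isolatedNodesSketch` as warnings rather than errors.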

Queries

The operation graph supports the same query functions as taskgraph, delegated to graphology-dag:

Query	Method	Returns
Topological order	`topologicalOrder()`	`string[]` of node keys in prerequisite→dependent order
Has cycles	`hasCycles()`	`boolean` (should always be false if construction validated)
Find cycles	`findCycles()`	`string[][]` of cycle paths
Ancestors	`ancestors(nodeId)`	`string[]` of all nodes reachable via incoming edges
Descendants	`descendants(nodeId)`	`string[]` of all nodes reachable via outgoing edges
Predecessors	`predecessors(nodeId)`	`string[]` of direct incoming neighbors
Successors	`successors(nodeId)`	`string[]` of direct outgoing neighbors
Reachable from	`reachableFrom(nodeIds)`	`Set<string>` of all nodes reachable from the given start nodes

These are thin wrappers around graphology and graphology-dag functions, following the same pattern as taskgraph.
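For readers unfamiliar with these queries, the sketch below shows what two of them compute, using a plain adjacency record in which every node appears as a key. It is not how the library does it (the text above says it delegates to graphology-dag); `topologicalOrderSketch`, `descendantsSketch`, and the `task.publish` operation are hypothetical.

```typescript
type Dag = Record<string, string[]>; // node key -> direct successors

// Kahn's algorithm: repeatedly emit nodes whose prerequisites have all been emitted.
function topologicalOrderSketch(dag: Dag): string[] {
  const inDegree: Record<string, number> = {};
  for (const node of Object.keys(dag)) inDegree[node] = 0;
  for (const node of Object.keys(dag)) {
    for (const next of dag[node]) inDegree[next] += 1;
  }
  const ready = Object.keys(dag).filter((node) => inDegree[node] === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const node = ready.shift() as string;
    order.push(node);
    for (const next of dag[node]) {
      inDegree[next] -= 1;
      if (inDegree[next] === 0) ready.push(next);
    }
  }
  if (order.length !== Object.keys(dag).length) {
    throw new Error("graph contains a cycle");
  }
  return order;
}

// All nodes reachable from `start` via outgoing edges, excluding `start` itself.
function descendantsSketch(dag: Dag, start: string): string[] {
  const seen = new Set<string>();
  const queue = [...dag[start]];
  while (queue.length > 0) {
    const node = queue.shift() as string;
    if (seen.has(node)) continue;
    seen.add(node);
    queue.push(...dag[node]);
  }
  return Array.from(seen);
}

const sampleDag: Dag = {
  "task.classify": ["task.enrich"],
  "task.enrich": ["task.publish"],
  "task.publish": [],
};
const sampleOrder = topologicalOrderSketch(sampleDag);
const sampleDescendants = descendantsSketch(sampleDag, "task.classify");
```

Ancestors follow the same traversal over incoming edges instead of outgoing ones.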

Type Compatibility Analysis

The typeCompat function compares two TypeBox schemas and returns a compatibility result:

function typeCompat(
  outputSchema: TSchema,
  inputSchema: TSchema,
): { compatible: boolean; detail?: string }

Compatibility rules

The analysis is structural, not semantic:

Exact match — outputSchema is identical to inputSchema → compatible
Subtype match — outputSchema is a subtype of inputSchema → compatible (e.g., output has extra fields beyond what input requires)
Unknown passthrough — if either schema is Type.Unknown(), compatibility is unknown → no edge added (not incompatible, just unresolvable)
Incompatible — structural mismatch (e.g., output is string, input requires number) → edge added with compatible: false

See analysis.md for the full type-compatibility algorithm.
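The four rules can be illustrated with a deliberately simplified schema model. This sketch does not use TypeBox and is not the algorithm from analysis.md: `SchemaSketch` and `typeCompatSketch` are hypothetical, and returning `undefined` for the unresolvable case is an assumption, since the signature above does not say how that outcome is reported.

```typescript
// Simplified stand-in for TypeBox schemas.
type SchemaSketch =
  | { kind: "unknown" }
  | { kind: "string" | "number" | "boolean" }
  | { kind: "object"; properties: Record<string, SchemaSketch>; required: string[] };

interface CompatResult {
  compatible: boolean;
  detail?: string;
}

// Returns undefined when compatibility cannot be resolved (assumption of this sketch).
function typeCompatSketch(output: SchemaSketch, input: SchemaSketch): CompatResult | undefined {
  // Unknown passthrough: nothing can be concluded, so no edge would be added.
  if (output.kind === "unknown" || input.kind === "unknown") return undefined;

  // At least one side is a primitive: the kinds must match exactly.
  if (input.kind !== "object" || output.kind !== "object") {
    return output.kind === input.kind
      ? { compatible: true, detail: `both are ${input.kind}` }
      : { compatible: false, detail: `output is ${output.kind}, input requires ${input.kind}` };
  }

  // Both are objects: every field the input declares must be satisfied by the output.
  // Extra output fields are ignored, which is the subtype case.
  for (const key of Object.keys(input.properties)) {
    const provided = output.properties[key];
    if (provided === undefined) {
      if (input.required.includes(key)) {
        return { compatible: false, detail: `missing required field "${key}"` };
      }
      continue;
    }
    const nested = typeCompatSketch(provided, input.properties[key]);
    if (nested === undefined) return undefined;
    if (!nested.compatible) {
      return { compatible: false, detail: `field "${key}": ${nested.detail}` };
    }
  }
  return { compatible: true, detail: "output satisfies every input field" };
}
```

For example, an output object with `label` and `score` fields is compatible with an input that only requires `label`, while a string output against a number input is reported as incompatible.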

Constraints

Immutable after construction — the operation graph is not mutated after fromSpecs() builds it. If the registry changes, rebuild the graph.
DAG-only — cycles are rejected at construction time. The operation graph must be a valid DAG.
No parallel edges — at most one edge per (source, target) pair. If A's output is compatible with B's input at multiple JSON paths, that's recorded in detail, not as multiple edges.
No self-loops — an operation cannot depend on its own output. Self-referential operations (e.g., recursive subscriptions) are modeled differently (see call-graph.md).
Edge direction is data flow — A → B means A produces data that B consumes. inNeighbors(B) returns B's dependencies (including A); outNeighbors(A) returns A's dependents (including B). This matches taskgraph's convention.
Operation nodes use namespace.name as keys — this matches the call protocol's operationId format and ensures uniqueness within a registry.

Open Questions

Should fromSpecs() add ALL possible edges or only compatible ones? The current design adds both compatible and incompatible edges. An alternative is to only add compatible edges, with a separate potentialEdges() query that computes incompatible connections on demand. Pro: smaller graph. Con: loses diagnostic information.
How to handle version conflicts? If two versions of the same operation exist in the registry, should they be separate nodes (task.classify@1.0.0 vs task.classify@2.0.0) or should the latest version win? The current design uses namespace.name (no version) as the node key, meaning only one version per operation can exist in the graph.
Should subscription operations be treated differently? A subscription produces a stream, not a single output. Its outputSchema describes a single stream element, but the data flow semantics are different from query/mutation. Should the type compatibility check account for this?
How granular should type compatibility be? The current detail field is a string. A more structured approach would be { compatible: boolean, mismatchPaths: string[] } listing the specific JSON paths that don't match. This adds complexity but improves diagnostics.
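To help weigh the last question, here is one way the structured result could look. This is an exploratory sketch, not a decision: `ShapeSketch`, `structuredCompatSketch`, and the `$.field` path notation are assumptions, and the schema model is simplified rather than TypeBox.

```typescript
// Simplified stand-in for TypeBox schemas (no unknown case, to keep the sketch short).
type ShapeSketch =
  | { kind: "string" | "number" | "boolean" }
  | { kind: "object"; properties: Record<string, ShapeSketch>; required: string[] };

interface StructuredCompat {
  compatible: boolean;
  mismatchPaths: string[];
}

// Walks both shapes in parallel and records every path where they disagree.
function collectMismatches(
  output: ShapeSketch,
  input: ShapeSketch,
  path: string,
  found: string[],
): void {
  if (input.kind !== "object" || output.kind !== "object") {
    if (output.kind !== input.kind) found.push(path);
    return;
  }
  for (const key of Object.keys(input.properties)) {
    const provided = output.properties[key];
    const childPath = `${path}.${key}`;
    if (provided === undefined) {
      // A missing optional field is fine; a missing required field is a mismatch.
      if (input.required.includes(key)) found.push(childPath);
    } else {
      collectMismatches(provided, input.properties[key], childPath, found);
    }
  }
}

function structuredCompatSketch(output: ShapeSketch, input: ShapeSketch): StructuredCompat {
  const mismatchPaths: string[] = [];
  collectMismatches(output, input, "$", mismatchPaths);
  return { compatible: mismatchPaths.length === 0, mismatchPaths };
}
```

Unlike a single detail string, this reports every mismatching path in one pass, at the cost of a slightly larger result type.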

References

Schema: schema.md — OperationNodeAttrs, TypedEdgeAttrs, CallStatus, EdgeType
Type compatibility: analysis.md
Call graph: call-graph.md
Operation types: @alkdev/operations/src/types.ts
Taskgraph construction: @alkdev/taskgraph_ts/src/graph/construction.ts
Graphology DAG: graphology-dag package
