Draft architecture specification for @alkdev/flowgraph — a workflow graph library providing DAG-based orchestration over operations. Covers two graph types (operation graph, call graph), ujsx workflow templates, GraphologyHost and ReactiveHost configs, signal-driven execution, type-compatibility analysis, error hierarchy, and build/distribution. Includes 3 ADRs: ujsx as template IR, DAG-only enforcement, decoupled storage.
| status | last_updated |
|---|---|
| draft | 2026-05-19 |
Analysis Functions
Standalone composable functions for type-compatibility checking, execution ordering, and precondition validation.
Overview
Analysis functions are pure, composable functions that operate on a FlowGraph instance. They follow the same pattern as taskgraph: standalone functions (not methods on the class) that take a graph as input and return structured results.
The analysis layer provides:
- Type compatibility — can operation A's output feed into operation B's input?
- Execution ordering — what's a valid topological order for a set of operations?
- Precondition validation — are all required inputs available before a step starts?
- Reachability — which operations can be reached from a given starting point?
- Template validation — does a workflow template follow a valid path through the operation graph?
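Analysis functions are pure wherever the operation allows: they don't mutate the graph, they don't depend on external state, and they return structured results rather than throwing on failure. This makes them testable, composable, and suitable for both synchronous and async use. There are two documented exceptions: buildTypeEdges() adds edges to the graph it receives, and topologicalOrder() throws CircularDependencyError on a cyclic graph.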
Type Compatibility
typeCompat(outputSchema, inputSchema)
function typeCompat(
  outputSchema: TSchema,
  inputSchema: TSchema,
): TypeCompatResult

interface TypeCompatResult {
  compatible: boolean;
  detail?: string;
  mismatches?: TypeMismatch[];
}

interface TypeMismatch {
  path: string;     // JSON path to the mismatched field
  expected: string; // What the input schema requires
  actual: string;   // What the output schema provides
}
Compares two TypeBox schemas and determines if the output schema is compatible with the input schema. Returns a structured result with details about mismatches.
Compatibility rules
The analysis is structural, not semantic. It checks whether the output shape can satisfy the input shape:
- Exact match: outputSchema and inputSchema are structurally identical → compatible: true
- Output is superset: output has all fields that input requires, plus extras → compatible: true (output is a subtype of input, meaning input accepts output)
- Output is subset: output is missing fields that input requires → compatible: false, with mismatches listing the missing fields
- Type mismatch: output field type doesn't match input field type → compatible: false, with mismatches listing the type differences
- Unknown passthrough: if either schema is Type.Unknown(), compatibility is unknown → no edge is created (not incompatible, just unresolvable)
Subtype checking
The key insight: output must be a subtype of input for compatibility. This means:
- If input expects { name: string, age: number }, output must provide at least those fields
- If input expects string, output providing string | number is not compatible (it could produce a number)
- If input expects string | number, output providing string is compatible (string is a subset of string | number)
This follows standard type theory: the output must be at least as specific as what the input requires.
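The compatibility rules and subtype examples above can be sketched on a miniature schema model. This is an illustration under stated assumptions, not the library's implementation: MiniSchema, label, collect, and typeCompatSketch are hypothetical names, and the real typeCompat() operates on TypeBox TSchema values. The sketch returns null for the unresolvable unknown case, which is only one possible way to signal that no edge should be created, and it recurses into nested objects, which goes beyond the top-level check described under Open Questions.

```typescript
// Hypothetical miniature schema model; the real typeCompat() takes TypeBox TSchema values.
type MiniSchema =
  | { kind: "string" }
  | { kind: "number" }
  | { kind: "unknown" }
  | { kind: "union"; anyOf: MiniSchema[] }
  | { kind: "object"; required: Record<string, MiniSchema> };

interface TypeMismatch { path: string; expected: string; actual: string }
interface TypeCompatResult { compatible: boolean; mismatches?: TypeMismatch[] }

function label(schema: MiniSchema): string {
  if (schema.kind === "union") return schema.anyOf.map((s) => label(s)).join(" | ");
  return schema.kind;
}

// Appends a mismatch wherever `output` is not a structural subtype of `input`.
function collect(output: MiniSchema, input: MiniSchema, path: string, found: TypeMismatch[]): void {
  if (output.kind === "unknown" || input.kind === "unknown") return; // nested unknowns are skipped in this sketch
  if (output.kind === "union") {
    // Every variant the output may produce must be accepted by the input.
    for (const variant of output.anyOf) collect(variant, input, path, found);
    return;
  }
  if (input.kind === "union") {
    // A single output type only needs to fit one of the accepted variants.
    const fits = input.anyOf.some((variant) => {
      const scratch: TypeMismatch[] = [];
      collect(output, variant, path, scratch);
      return scratch.length === 0;
    });
    if (!fits) found.push({ path, expected: label(input), actual: label(output) });
    return;
  }
  if (input.kind === "object") {
    if (output.kind !== "object") {
      found.push({ path, expected: label(input), actual: label(output) });
      return;
    }
    for (const [key, wanted] of Object.entries(input.required)) {
      const provided: MiniSchema | undefined = output.required[key];
      if (provided === undefined) {
        found.push({ path: `${path}/${key}`, expected: label(wanted), actual: "missing" });
      } else {
        collect(provided, wanted, `${path}/${key}`, found);
      }
    }
    return;
  }
  if (output.kind !== input.kind) {
    found.push({ path, expected: label(input), actual: label(output) });
  }
}

// Returns null when either top-level schema is unknown (unresolvable, so no edge).
function typeCompatSketch(output: MiniSchema, input: MiniSchema): TypeCompatResult | null {
  if (output.kind === "unknown" || input.kind === "unknown") return null;
  const mismatches: TypeMismatch[] = [];
  collect(output, input, "", mismatches);
  return mismatches.length === 0 ? { compatible: true } : { compatible: false, mismatches };
}
```

With this model, an output of string | number fed into a string input reports a mismatch for the number variant, while an object output with extra fields passes against an input that requires only a subset of them.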
buildTypeEdges(graph)
function buildTypeEdges(graph: FlowGraph<OperationNodeAttrs, OperationEdgeAttrs>): void
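Populates the operation graph with type-compatibility edges. For each ordered pair of nodes (A, B), calls typeCompat(A.outputSchema, B.inputSchema) and adds an A → B edge carrying the result. Per the compatibility rules above, a pair involving Type.Unknown() is unresolvable and gets no edge.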
This is called automatically by FlowGraph.fromSpecs(). It can also be called manually after adding operations incrementally.
Edge attributes from type compatibility
A type-compatibility edge carries:
{
  edgeType: "typed",
  compatible: boolean,         // true if output feeds into input
  detail?: string,             // "classify.output is compatible with enrich.input"
  mismatches?: TypeMismatch[]  // specific field-level mismatches (if incompatible)
}
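The pairwise pass behind buildTypeEdges() can be illustrated roughly as follows. This is a sketch, not the library's code: OpNode, TypedEdge, and buildTypeEdgesSketch are hypothetical names, the compatibility check is injected as a callback so the example stays self-contained, and the sketch returns an edge list instead of mutating a graph. Skipping self-pairs is an assumption based on the no-self-loops invariant listed under Graph-Level Validation.

```typescript
// Hypothetical shapes for illustration; the real function works on a FlowGraph.
interface CompatResult { compatible: boolean; detail?: string }
interface OpNode<S> { key: string; inputSchema: S; outputSchema: S }
interface TypedEdge { source: string; target: string; edgeType: "typed"; compatible: boolean; detail?: string }

function buildTypeEdgesSketch<S>(
  nodes: OpNode<S>[],
  compat: (output: S, input: S) => CompatResult | null, // null means "unresolvable"
): TypedEdge[] {
  const edges: TypedEdge[] = [];
  for (const a of nodes) {
    for (const b of nodes) {
      if (a.key === b.key) continue; // assumption: never create self-loops
      const result = compat(a.outputSchema, b.inputSchema);
      if (result === null) continue; // unknown schema: no edge, not an incompatible edge
      edges.push({ source: a.key, target: b.key, edgeType: "typed", ...result });
    }
  }
  return edges;
}
```

The pass is quadratic in the number of nodes, which is worth keeping in mind alongside the open question about async analysis for very large graphs.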
Execution Ordering
topologicalOrder(graph)
function topologicalOrder(graph: FlowGraph): string[]
Returns node keys in topological order (prerequisites before dependents). Uses graphology-dag's topologicalSort algorithm.
Throws CircularDependencyError if the graph contains cycles; the error's cycles field is populated by findCycles().
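For readers unfamiliar with topological sorting, the behaviour can be illustrated with Kahn's algorithm. This is a swapped-in technique for illustration only: the library delegates to graphology-dag rather than implementing the sort itself, and MiniGraph, CycleError, and topologicalOrderSketch are hypothetical stand-ins for FlowGraph, CircularDependencyError, and topologicalOrder().

```typescript
// Hypothetical minimal graph shape; edges point from prerequisite to dependent.
interface MiniGraph { nodes: string[]; edges: Array<[string, string]> }

// Stand-in for CircularDependencyError; the `remaining` field is an assumption.
class CycleError extends Error {
  readonly remaining: string[];
  constructor(remaining: string[]) {
    super(`cycle detected among: ${remaining.join(", ")}`);
    this.remaining = remaining;
  }
}

function topologicalOrderSketch(graph: MiniGraph): string[] {
  const inDegree = new Map<string, number>();
  const successors = new Map<string, string[]>();
  for (const node of graph.nodes) {
    inDegree.set(node, 0);
    successors.set(node, []);
  }
  for (const [from, to] of graph.edges) {
    successors.get(from)!.push(to);
    inDegree.set(to, inDegree.get(to)! + 1);
  }

  // Start with nodes that have no prerequisites, then release dependents as they unblock.
  const ready = graph.nodes.filter((n) => inDegree.get(n) === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const node = ready.shift()!;
    order.push(node);
    for (const next of successors.get(node)!) {
      const left = inDegree.get(next)! - 1;
      inDegree.set(next, left);
      if (left === 0) ready.push(next);
    }
  }

  // Any node never released sits on, or downstream of, a cycle.
  if (order.length < graph.nodes.length) {
    throw new CycleError(graph.nodes.filter((n) => !order.includes(n)));
  }
  return order;
}
```

Throwing here, rather than returning an error array, mirrors the reasoning in the Constraints section: no valid ordering exists for a cyclic graph, so there is nothing meaningful to return.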
parallelGroups(graph)
function parallelGroups(graph: FlowGraph): string[][]
Returns groups of nodes that can execute in parallel. Each group is an array of node keys. Groups are ordered by dependency depth:
- Group 0: nodes with no prerequisites (roots)
- Group 1: nodes whose only prerequisites are in Group 0
- Group N: nodes whose prerequisites are all in Groups 0 through N-1
This is useful for the hub coordinator to determine max parallelism: all nodes in a group can start simultaneously.
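The grouping rule above amounts to layering nodes by dependency depth. A minimal sketch, assuming a simplified graph shape (MiniGraph and parallelGroupsSketch are hypothetical names, not the library's API):

```typescript
// Hypothetical minimal graph shape; edges point from prerequisite to dependent.
interface MiniGraph { nodes: string[]; edges: Array<[string, string]> }

function parallelGroupsSketch(graph: MiniGraph): string[][] {
  const prerequisites = new Map<string, string[]>();
  for (const node of graph.nodes) prerequisites.set(node, []);
  for (const [from, to] of graph.edges) prerequisites.get(to)!.push(from);

  const placed = new Set<string>();
  const groups: string[][] = [];
  let pending = [...graph.nodes];
  while (pending.length > 0) {
    // A node is ready once every prerequisite sits in an earlier group.
    const ready = pending.filter((n) => prerequisites.get(n)!.every((p) => placed.has(p)));
    if (ready.length === 0) throw new Error("cycle detected: no node is ready");
    for (const n of ready) placed.add(n);
    groups.push(ready);
    pending = pending.filter((n) => !placed.has(n));
  }
  return groups;
}
```

For a diamond-shaped graph (one root, two independent middle steps, one join), this yields three groups, with the two middle steps sharing Group 1.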
criticalPath(graph)
function criticalPath(graph: FlowGraph): string[]
Returns the longest path through the DAG, which represents the sequence of operations that determines the minimum total execution time. Useful for identifying bottlenecks.
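The specification does not say how path length is weighted. The sketch below assumes every operation costs one unit, so the longest path is the one with the most nodes; a real implementation would presumably weight nodes by estimated duration. MiniGraph and criticalPathSketch are hypothetical names, and the input is assumed to be acyclic.

```typescript
// Hypothetical minimal graph shape; edges point from prerequisite to dependent.
interface MiniGraph { nodes: string[]; edges: Array<[string, string]> }

// Assumes an acyclic graph and unit cost per node (both are assumptions of this sketch).
function criticalPathSketch(graph: MiniGraph): string[] {
  const successors = new Map<string, string[]>();
  for (const node of graph.nodes) successors.set(node, []);
  for (const [from, to] of graph.edges) successors.get(from)!.push(to);

  // Memoized longest path starting at each node.
  const best = new Map<string, string[]>();
  const longestFrom = (node: string): string[] => {
    const cached = best.get(node);
    if (cached) return cached;
    let tail: string[] = [];
    for (const next of successors.get(node)!) {
      const candidate = longestFrom(next);
      if (candidate.length > tail.length) tail = candidate;
    }
    const path = [node, ...tail];
    best.set(node, path);
    return path;
  };

  let longest: string[] = [];
  for (const node of graph.nodes) {
    const path = longestFrom(node);
    if (path.length > longest.length) longest = path;
  }
  return longest;
}
```

Memoization keeps the work linear in nodes plus edges, since each node's longest suffix is computed once.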
Precondition Validation
validatePreconditions(graph)
function validatePreconditions(
  graph: FlowGraph<OperationNodeAttrs, OperationEdgeAttrs>
): ValidationError[]
For each node in the operation graph, checks that all required input fields are provided by at least one predecessor's output. Returns an array of ValidationError objects (never throws).
A "missing precondition" occurs when a node's input requires a field that no predecessor's output provides. This is a stronger check than type compatibility — it verifies that a valid execution path exists through the graph.
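A rough sketch of the missing-precondition check follows. It simplifies schemas to lists of field names and makes two assumptions the text does not settle: only direct predecessors are consulted, and root nodes are skipped because their inputs are taken to come from the workflow's caller. MiniOperation, MissingPrecondition, and validatePreconditionsSketch are hypothetical names.

```typescript
// Hypothetical simplification: schemas reduced to lists of field names.
interface MiniOperation { requires: string[]; provides: string[] }
interface MissingPrecondition { node: string; field: string; message: string }

function validatePreconditionsSketch(
  operations: Record<string, MiniOperation>,
  edges: Array<[string, string]>, // prerequisite -> dependent
): MissingPrecondition[] {
  const errors: MissingPrecondition[] = [];
  for (const [key, operation] of Object.entries(operations)) {
    // Gather every field offered by a direct predecessor.
    const available = new Set<string>();
    let predecessorCount = 0;
    for (const [from, to] of edges) {
      if (to !== key) continue;
      predecessorCount += 1;
      for (const field of operations[from]?.provides ?? []) available.add(field);
    }
    if (predecessorCount === 0) continue; // assumption: roots are fed by the caller

    for (const field of operation.requires) {
      if (!available.has(field)) {
        errors.push({ node: key, field, message: `no predecessor of ${key} provides ${field}` });
      }
    }
  }
  return errors; // never throws; an empty array means all preconditions are met
}
```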
validateTemplate(template, operationGraph)
function validateTemplate(
  template: UNode,
  operationGraph: FlowGraph<OperationNodeAttrs, OperationEdgeAttrs>,
): ValidationError[]
Validates a workflow template against an operation graph:
- All operations exist: every <Operation name="X"> has a matching node in the operation graph
- No cycles: the rendered DAG has no cycles
- Type compatibility — sequential operations have compatible type edges (or no incompatible edge)
- Reachability — all operations are reachable from the start
- No orphan nodes — every operation has at least one incoming or outgoing edge (unless it's a single-operation template)
Returns an array of ValidationError objects. Template validation is advisory — it can produce warnings (e.g., "operation not in registry") and errors (e.g., "cycle detected").
Reachability
reachableFrom(graph, nodeIds)
function reachableFrom(graph: FlowGraph, nodeIds: string[]): Set<string>
Returns all node keys reachable from the given starting nodes via directed edges. Useful for:
- Determining which operations a coordinator can reach from a starting operation
- Computing the abort cascade scope for a given call
- Finding all operations affected by a change to a particular operation
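Reachability over directed edges is a plain breadth-first traversal. A minimal sketch, assuming a simplified graph shape (MiniGraph and reachableFromSketch are hypothetical names); whether the real function includes the starting nodes in its result is not stated, and this sketch assumes it does.

```typescript
// Hypothetical minimal graph shape; edges point from source to target.
interface MiniGraph { nodes: string[]; edges: Array<[string, string]> }

function reachableFromSketch(graph: MiniGraph, startIds: string[]): Set<string> {
  const successors = new Map<string, string[]>();
  for (const [from, to] of graph.edges) {
    const list = successors.get(from) ?? [];
    list.push(to);
    successors.set(from, list);
  }

  // Assumption: the starting nodes count as reachable from themselves.
  const seen = new Set<string>(startIds);
  const queue = [...startIds];
  while (queue.length > 0) {
    const node = queue.shift()!;
    for (const next of successors.get(node) ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return seen;
}
```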
ancestors(graph, nodeId)
function ancestors(graph: FlowGraph, nodeId: string): string[]
Returns all ancestors of a node (nodes reachable via incoming edges). Useful for:
- Finding which operations must complete before a given operation can start
- Computing depth-from-roots for execution priority
descendants(graph, nodeId)
function descendants(graph: FlowGraph, nodeId: string): string[]
Returns all descendants of a node (nodes reachable via outgoing edges). Useful for:
- Finding all calls that would be affected by aborting a given call
- Computing the scope of a failure cascade
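ancestors() and descendants() are the same traversal run in opposite directions. The sketch below makes that symmetry explicit; MiniGraph, collectNeighbors, ancestorsSketch, and descendantsSketch are hypothetical names, the node itself is excluded from its own result (an assumption), and the result order is unspecified.

```typescript
// Hypothetical minimal graph shape; edges point from prerequisite to dependent.
interface MiniGraph { nodes: string[]; edges: Array<[string, string]> }

// Walks edges transitively, either against ("in") or along ("out") their direction.
function collectNeighbors(graph: MiniGraph, nodeId: string, direction: "in" | "out"): string[] {
  const seen = new Set<string>();
  const stack = [nodeId];
  while (stack.length > 0) {
    const current = stack.pop()!;
    for (const [from, to] of graph.edges) {
      const here = direction === "out" ? from : to;
      const there = direction === "out" ? to : from;
      if (here === current && !seen.has(there)) {
        seen.add(there);
        stack.push(there);
      }
    }
  }
  return Array.from(seen);
}

function ancestorsSketch(graph: MiniGraph, nodeId: string): string[] {
  return collectNeighbors(graph, nodeId, "in");
}

function descendantsSketch(graph: MiniGraph, nodeId: string): string[] {
  return collectNeighbors(graph, nodeId, "out");
}
```

Scanning the full edge list on every step keeps the sketch short; a graph library would use adjacency lookups instead.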
Graph-Level Validation
validateGraph(graph)
function validateGraph(graph: FlowGraph): AnyValidationError[]
Runs all validation checks:
- Schema validation: node attributes match OperationNodeAttrs or CallNodeAttrs schema
- Graph invariants: no cycles, no dangling edges, no self-loops
- Orphan detection — nodes with no edges (warning, not error)
Returns an array of AnyValidationError objects, which is a union type:
type AnyValidationError = ValidationError | GraphValidationError;
Matching taskgraph's pattern, this function never throws — it collects all issues and returns them.
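The collect-and-return style can be sketched as below. The example covers only self-loops, dangling edges, and orphans (cycle and schema checks are omitted), and GraphIssue, its fields, and validateGraphSketch are hypothetical; the real error types are defined in schema.md and error-handling.md.

```typescript
// Hypothetical minimal graph and issue shapes, for illustration only.
interface MiniGraph { nodes: string[]; edges: Array<[string, string]> }
interface GraphIssue { severity: "error" | "warning"; code: string; subject: string }

function validateGraphSketch(graph: MiniGraph): GraphIssue[] {
  const issues: GraphIssue[] = [];
  const known = new Set(graph.nodes);
  const connected = new Set<string>();

  for (const [from, to] of graph.edges) {
    if (from === to) issues.push({ severity: "error", code: "self-loop", subject: from });
    if (!known.has(from) || !known.has(to)) {
      issues.push({ severity: "error", code: "dangling-edge", subject: `${from}->${to}` });
    }
    if (known.has(from)) connected.add(from);
    if (known.has(to)) connected.add(to);
  }

  // Orphans are reported as warnings, matching the severity described above.
  for (const node of graph.nodes) {
    if (!connected.has(node)) issues.push({ severity: "warning", code: "orphan", subject: node });
  }
  return issues; // every issue is collected; nothing is thrown
}
```

Callers can then decide for themselves whether warnings should block a deployment or only be logged.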
Standalone Function Pattern
All analysis functions are standalone (not methods on FlowGraph). They take a FlowGraph instance as their first argument and return structured results. This follows taskgraph's pattern:
// Standalone functions
import { topologicalOrder, hasCycles, typeCompat } from "@alkdev/flowgraph/analysis";
const order = topologicalOrder(graph);
const cycles = hasCycles(graph);
const result = typeCompat(outputSchema, inputSchema);
The FlowGraph class exposes convenience methods that delegate to these standalone functions:
class FlowGraph {
  topologicalOrder(): string[] { return _topologicalOrder(this._graph); }
  hasCycles(): boolean { return _hasCycles(this._graph); }
  validate(): AnyValidationError[] { return _validate(this._graph); }
}
This pattern enables:
- Tree-shaking — consumers only import the analysis functions they use
- Testing — standalone functions are easier to test in isolation
- Composition: consumers can chain analysis functions without creating intermediate FlowGraph instances
Constraints
- Analysis functions are pure — they don't mutate the graph, don't depend on external state, and don't throw on validation failures (they return error arrays)
- Type compatibility is structural, not semantic: typeCompat() checks schema shapes, not whether the data makes sense. "Age as number" is compatible with "count as number" even though they're semantically different.
- Template validation is advisory: warnings are not errors. A template with an unknown operation is a warning, not a validation failure (the operation might be added to the registry later).
- Analysis functions work on the underlying DirectedGraph: they're thin wrappers around graphology and graphology-dag functions, following the same pattern as taskgraph
- topologicalOrder() throws on cycles: unlike validateGraph(), which returns errors, topologicalOrder() throws CircularDependencyError because it cannot produce a valid ordering from a cyclic graph
Open Questions
- How deep should typeCompat check? Currently it checks top-level field existence and type compatibility. Should it recursively check nested objects and arrays? Full recursive checking is more thorough but slower and may produce false negatives for schemas with dynamic structures.
- Should validateTemplate check runtime preconditions? Currently it only checks structural validity and type compatibility. Runtime preconditions (e.g., "operation B requires an API key that operation A doesn't have access to") are beyond the scope of static analysis and belong to the access control layer.
- Should analysis functions be async? For very large graphs (thousands of nodes), type compatibility checking could be slow. Making it async would allow incremental progress reporting. Current graphs are small enough (50-200 nodes) that synchronous checking is fine.
- Should parallelGroups account for resource constraints? Currently it returns the theoretical maximum parallelism. An optional maxConcurrency parameter could limit group sizes for realistic scheduling.
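To make the maxConcurrency question concrete, one possible shape is a post-processing step that splits oversized groups into consecutive chunks. This is a sketch of the idea only, not a proposed API; limitGroupSize is a hypothetical name.

```typescript
// Splits each group into chunks of at most `maxConcurrency` nodes, preserving group order.
function limitGroupSize(groups: string[][], maxConcurrency: number): string[][] {
  if (!Number.isInteger(maxConcurrency) || maxConcurrency < 1) {
    throw new RangeError("maxConcurrency must be a positive integer");
  }
  const limited: string[][] = [];
  for (const group of groups) {
    for (let i = 0; i < group.length; i += maxConcurrency) {
      limited.push(group.slice(i, i + maxConcurrency));
    }
  }
  return limited;
}
```

Because chunks from one group are emitted before any later group, every node still runs after all of its prerequisites. The trade-off is that a chunk waits for the whole previous chunk to finish, which a real scheduler might avoid by starting nodes as soon as a slot frees up.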
References
- Schema: schema.md (TypeCompatResult, TypeMismatch, ValidationError)
- Error handling: error-handling.md (CircularDependencyError, TypeIncompatError)
- Taskgraph analysis pattern: @alkdev/taskgraph_ts/src/analysis/
- TypeBox Value utilities: @alkdev/typebox/value