add flowgraph architecture docs (Phase 1 SDD)

Draft architecture specification for @alkdev/flowgraph — a workflow graph library providing DAG-based orchestration over operations. Covers two graph types (operation graph, call graph), ujsx workflow templates, GraphologyHost and ReactiveHost configs, signal-driven execution, type-compatibility analysis, error hierarchy, and build/distribution. Includes 3 ADRs: ujsx as template IR, DAG-only enforcement, decoupled storage.
2026-05-19 09:36:22 +00:00
parent 333dcd5ac1
commit d2253099ee
13 changed files with 2863 additions and 0 deletions

docs/architecture/README.md

@@ -0,0 +1,193 @@
---
status: draft
last_updated: 2026-05-19
---
# @alkdev/flowgraph Architecture
Workflow graph library — DAG-based operation orchestration over graphology, with ujsx template composition and reactive execution.
## Why This Exists
Flowgraph fills the gap between the operation registry (`@alkdev/operations`) and the call graph observability layer (`@alkdev/alkhub`). Operations define *what can be called*. The call graph records *what was called*. Flowgraph defines *how calls are orchestrated* — the structure, validation, and execution of workflows.
Without flowgraph:
- **Workflows are ad-hoc** — the hub coordinator manually chains `registry.execute()` calls with no structural validation, no type checking between steps, and no way to reuse workflow patterns.
- **Call templates are hardcoded** — the SDD pipeline (architect → reviewer → decomposer → coordinator → specialist) is a recurring pattern with no reusable definition.
- **Abort cascading is manual** — when the 3rd of 5 operations fails, the coordinator must explicitly cancel the remaining operations. Flowgraph's DAG enables structural abort propagation.
- **No precondition checking** — there's no way to validate that operation A's output schema is compatible with operation B's input schema before attempting the call.
Flowgraph provides three conceptual graphs, each built for a different purpose:
1. **Operation Graph (Static)** — built from `OperationSpec`s at startup; nodes are operations, edges are type-compatibility relationships. Enables cycle detection, topological ordering, and call template validation.
2. **Call Graph (Dynamic)** — built at runtime from call events; nodes are call invocations with status and timestamps, edges are parent-child relationships. Enables abort cascading, observability, and DAG queries.
3. **Workflow Template (Declarative)** — a ujsx tree that defines a reusable workflow structure. A template is a validated path through the operation graph, instantiated as a call graph at runtime.
## Core Principle
**The graph is the specification. The template is the authoring surface. The call graph is the execution record.**
The operation graph provides static type checking and structural validation. The ujsx template provides human-readable, composable workflow definitions. The call graph captures what actually happened. Flowgraph is the bridge between all three.
## Relationship to Sibling Packages
| Package | Relationship |
|---------|-------------|
| `@alkdev/operations` | **Peer dependency**. Provides `OperationSpec`, `OperationRegistry`, `CallEvent`, `PendingRequestMap`. Flowgraph consumes operation types but does not depend on a specific runtime. |
| `@alkdev/ujsx` | **Direct dependency**. Workflow templates are `UNode` trees rendered through `HostConfig`s. Flowgraph provides the workflow-specific host configurations (graphology DAG, reactive execution). |
| `@alkdev/taskgraph` | **Pattern reference**. Flowgraph follows the same graphology-wrapping pattern (`FlowGraph` class like `TaskGraph` class) but enforces DAG invariants instead of allowing cycles. |
| `@alkdev/typebox` | **Direct dependency**. All schemas are TypeBox Modules. Runtime validation, JSON Schema export, and `Value.Check`/`Value.Errors`. |
| `@alkdev/pubsub` | **Optional peer dependency**. For event-driven call graph population. Flowgraph works in-memory; pubsub connects it to the call protocol. |
| `@alkdev/cograph` | **Future consumer**. The cognitive graph depends on flowgraph for workflow templates and execution tracking. |
## Current State
Flowgraph is in Phase 0/1 (exploration → architecture). No code exists yet. This architecture document set defines the WHAT and WHY before any implementation.
## Architecture Documents
| Document | Content |
|----------|---------|
| [schema.md](schema.md) | TypeBox Module, TypeScript types, enums (CallStatus, EdgeType, NodeStatus), node/edge attribute schemas, SerializedGraph factory |
| [operation-graph.md](operation-graph.md) | Static graph from OperationSpecs, type-compatibility edges, construction paths, validation |
| [call-graph-runtime.md](call-graph-runtime.md) | Dynamic graph from call events, node lifecycle, abort cascading, fromCallEvents construction |
| [workflow-templates.md](workflow-templates.md) | ujsx components (`<Operation>`, `<Sequential>`, `<Parallel>`, `<Conditional>`), template→DAG hydration, serialization |
| [host-configs.md](host-configs.md) | Graphology HostConfig (template→DAG analysis), Reactive HostConfig (template→execution engine), Instance types |
| [reactive-execution.md](reactive-execution.md) | Signal-driven status propagation, computed preconditions, abort cascade via signals, ReactiveRoot integration |
| [analysis.md](analysis.md) | Type-compatibility checking (input/output schema matching), precondition validation, execution ordering |
| [error-handling.md](error-handling.md) | FlowgraphError hierarchy, CycleError, TypeIncompatError, ValidationError, error collection strategy |
| [build-distribution.md](build-distribution.md) | Package structure, exports map, dependencies, platform targets |
### Design Decisions
| ADR | Decision |
|-----|----------|
| [001](decisions/001-ujsx-as-template-ir.md) | ujsx tree as workflow template intermediate representation |
| [002](decisions/002-dag-only-graph.md) | Enforce DAG invariants — no cycles in flowgraph |
| [003](decisions/003-storage-decoupled.md) | Storage is not flowgraph's concern — in-memory graph with export/import boundary |
## Consumer Context
### alkhub (hub-spoke coordinator)
The hub instantiates flowgraph to:
- Build the operation graph at startup from the registry
- Validate call templates before execution
- Populate call graphs at runtime from call protocol events
- Query call graphs for observability (what's running, what failed, what's blocked)
- Persist call graph state via `export()` → Postgres
### OpenCode Plugin (future)
An OpenCode plugin that provides workflow tools:
- `workflow.validate` — validate a template against the operation graph
- `workflow.run` — instantiate a template as a call graph and execute it
- `workflow.status` — query a running call graph
### Cograph (future)
The cognitive graph uses flowgraph's templates and operation graph to define procedural knowledge: which operations can be composed, what the valid execution paths are, and what preconditions each step requires.
## Source Structure
```
src/
  component/            # ujsx components for workflow definition
    operation.ts        # <Operation name="classify" />
    sequential.ts       # <Sequential>...</Sequential>
    parallel.ts         # <Parallel>...</Parallel>
    conditional.ts      # <Conditional test={fn}>...</Conditional>
    index.ts
  host/
    graphology.ts       # HostConfig: ujsx tree → graphology DAG
    reactive.ts         # HostConfig: ujsx tree → reactive execution engine
  schema/
    enums.ts            # CallStatus, NodeStatus, EdgeType, OperationType
    node.ts             # OperationNodeAttributes, CallNodeAttributes
    edge.ts             # TypeCompatEdgeAttrs, ParentChildEdgeAttrs, DependencyEdgeAttrs
    graph.ts            # FlowGraphSerialized (graphology export format)
    index.ts
  graph/
    construction.ts     # FlowGraph class (like TaskGraph)
    validation.ts       # Cycle detection, type-compat validation, precondition checks
    queries.ts          # topologicalOrder, ancestors, descendants, hasCycles
    mutation.ts         # updateNode, updateEdge, removeNode, removeEdge
    index.ts
  reactive/
    workflow.ts         # ReactiveRoot for workflow state
    node-status.ts      # Per-node status signals + computed preconditions
    index.ts
  analysis/
    type-compat.ts      # Schema compatibility checking between operation input/output
    workflow.ts         # Execution ordering, precondition resolution, path validation
    defaults.ts         # Default status, edge type, etc.
    index.ts
  error/
    index.ts            # FlowgraphError, CycleError, TypeIncompatError, ValidationError
  index.ts              # Barrel export
```
## Key Design Decisions
### 1. ujsx as Template IR
Workflow templates are `UNode` trees. This gives us:
- **Composability** — `<Sequential>` and `<Parallel>` compose naturally as parent-child structure
- **Serialization** — `UNode` trees are JSON, trivially stored and transmitted
- **Host targets** — the same template renders to a graphology DAG (analysis) or a reactive execution engine (runtime) via different `HostConfig` implementations
- **Reconciler support** — incremental template updates via ujsx's reconciler (add/remove/reorder steps without full rebuild)
This is a design decision worth documenting because it's a non-obvious choice. The alternative is to define templates as plain data structures (arrays of step objects), which is simpler but loses composability, host switching, and reconciler benefits.
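To make the composability argument concrete, here is a minimal sketch of how a nested template could be flattened into DAG edges. The `TemplateNode` shape and the `hydrate` helper are hypothetical stand-ins chosen for illustration; they are not the ujsx `UNode` type or the Graphology HostConfig, which are specified in their own documents.

```typescript
// Hypothetical, simplified template shape. The real ujsx UNode type may differ.
type TemplateNode =
  | { type: "Operation"; name: string }
  | { type: "Sequential" | "Parallel"; children: TemplateNode[] };

// A hydrated fragment: its entry nodes, its exit nodes, and the edges inside it.
interface Fragment {
  entries: string[];
  exits: string[];
  edges: [string, string][];
}

function hydrate(node: TemplateNode): Fragment {
  if (node.type === "Operation") {
    return { entries: [node.name], exits: [node.name], edges: [] };
  }
  const parts = node.children.map(hydrate);
  const edges = parts.flatMap((p) => p.edges);
  if (node.type === "Parallel") {
    // Parallel children are not wired together; each one is an entry and an exit.
    return {
      entries: parts.flatMap((p) => p.entries),
      exits: parts.flatMap((p) => p.exits),
      edges,
    };
  }
  // Sequential: wire every exit of one child to every entry of the next child.
  for (let i = 0; i + 1 < parts.length; i++) {
    for (const from of parts[i].exits) {
      for (const to of parts[i + 1].entries) edges.push([from, to]);
    }
  }
  return {
    entries: parts.length > 0 ? parts[0].entries : [],
    exits: parts.length > 0 ? parts[parts.length - 1].exits : [],
    edges,
  };
}

const template: TemplateNode = {
  type: "Sequential",
  children: [
    { type: "Operation", name: "classify" },
    {
      type: "Parallel",
      children: [
        { type: "Operation", name: "enrich" },
        { type: "Operation", name: "score" },
      ],
    },
    { type: "Operation", name: "report" },
  ],
};
const dag = hydrate(template);
```

Because the template is plain nested data, the same tree could be fed to a different walker (for example one that drives execution) without changing how it is authored.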
### 2. DAG-Only, No Cycles
Flowgraph enforces acyclicity. The `FlowGraph` class rejects cycle-creating edges at mutation time (unlike `TaskGraph`, which allows cycles and detects them via `hasCycles()`). This is because:
- **Operation graphs** represent type flow — a cycle means an operation's output feeds back into its own input, which is almost certainly a design error.
- **Call graphs** represent execution order — cycles are physically impossible (you can't have a call that is its own ancestor).
- **Workflow templates** represent validated paths through the operation graph — they must be DAGs by construction.
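Rejecting a cycle at mutation time comes down to one reachability check before the edge is inserted. The sketch below uses a hypothetical `MiniDag` class and `CycleRejectedError` purely for illustration; the planned `FlowGraph` class wraps graphology and its API is not defined here.

```typescript
// Hypothetical stand-in for the planned FlowGraph class; all names are illustrative.
class CycleRejectedError extends Error {}

class MiniDag {
  private out = new Map<string, Set<string>>();

  addNode(id: string): void {
    if (!this.out.has(id)) this.out.set(id, new Set());
  }

  // True if `to` can be reached from `from` by following outgoing edges.
  private reaches(from: string, to: string): boolean {
    const stack = [from];
    const seen = new Set<string>();
    while (stack.length > 0) {
      const current = stack.pop()!;
      if (current === to) return true;
      if (seen.has(current)) continue;
      seen.add(current);
      this.out.get(current)?.forEach((next) => stack.push(next));
    }
    return false;
  }

  addEdge(source: string, target: string): void {
    this.addNode(source);
    this.addNode(target);
    // An edge source -> target closes a cycle exactly when source is already
    // reachable from target. This also rejects self-loops.
    if (this.reaches(target, source)) {
      throw new CycleRejectedError(`edge ${source} -> ${target} would create a cycle`);
    }
    this.out.get(source)!.add(target);
  }

  hasEdge(source: string, target: string): boolean {
    return this.out.get(source)?.has(target) ?? false;
  }
}

const dag = new MiniDag();
dag.addEdge("a", "b");
dag.addEdge("b", "c");
let rejected = false;
try {
  dag.addEdge("c", "a"); // would close a -> b -> c -> a
} catch (err) {
  rejected = true;
}
```

The check costs a traversal per insertion, which is the trade-off for never holding an invalid graph in memory.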
### 3. Storage is Decoupled
Flowgraph handles in-memory graph construction, validation, and analysis. Persistence is the caller's concern. The `export()`/`fromJSON()` boundary provides a clean serialization format (graphology native JSON) that the hub can store in Postgres. This follows the same pattern as taskgraph.
### 4. Template → DAG → Execution is a Pipeline, Not a Monolith
The three representations serve different phases:
- **Template** (ujsx tree) → authoring, composition, serialization
- **DAG** (graphology) → validation, type checking, topological ordering
- **Execution** (reactive signals) → runtime status tracking, preconditions, abort propagation
Each can exist independently. You can validate a template without executing it. You can build a call graph from events without a template. You can run a reactive workflow directly from a DAG.
## Document Lifecycle
Architecture documents use YAML frontmatter with `status` and `last_updated` fields:
```yaml
---
status: draft | stable | deprecated
last_updated: YYYY-MM-DD
---
```
| Status | Meaning | Transitions |
|--------|---------|-------------|
| `draft` | Under active development. Content may change significantly. Implementation should not start until the document reaches `stable`. | → `stable` when implementation is complete and API contract is verified by tests. |
| `stable` | API contracts are locked. Changes require a review cycle and may warrant an ADR if they affect documented decisions. | → `deprecated` when superseded. → `draft` if a fundamental redesign is needed (rare). |
| `deprecated` | Superseded by another document. Kept for reference. Links should point to the replacement. | Removed when no longer referenced. |
ADR documents use a separate `Status` field in their body: `Proposed`, `Accepted`, `Deprecated`, or `Superseded`. ADRs never revert from `Accepted`.
## References
- Call protocol architecture: `@alkdev/alkhub_ts/docs/architecture/call-graph.md`
- Call graph storage schema: `@alkdev/alkhub_ts/docs/architecture/storage/call-graph.md`
- Operation types and registry: `@alkdev/operations/src/types.ts`, `@alkdev/operations/src/registry.ts`
- ujsx architecture: `@alkdev/ujsx/docs/architecture/`
- ujsx research on flowgraph HostConfigs: `@alkdev/ujsx/docs/research/reconciler/05-flowgraph-host-configs.md`
- Taskgraph architecture: `@alkdev/taskgraph_ts/docs/architecture/`
- SDD process: `docs/sdd_process.md`


@@ -0,0 +1,265 @@
---
status: draft
last_updated: 2026-05-19
---
# Analysis Functions
Standalone composable functions for type-compatibility checking, execution ordering, and precondition validation.
## Overview
Analysis functions are pure, composable functions that operate on a `FlowGraph` instance. They follow the same pattern as taskgraph: standalone functions (not methods on the class) that take a graph as input and return structured results.
The analysis layer provides:
- **Type compatibility** — can operation A's output feed into operation B's input?
- **Execution ordering** — what's a valid topological order for a set of operations?
- **Precondition validation** — are all required inputs available before a step starts?
- **Reachability** — which operations can be reached from a given starting point?
- **Template validation** — does a workflow template follow a valid path through the operation graph?
All analysis functions are pure: they don't mutate the graph, they don't depend on external state, and they return structured results rather than throwing on validation failure. The one exception is `topologicalOrder()`, which throws when the graph contains a cycle. This makes them testable, composable, and suitable for both synchronous and async use.
## Type Compatibility
### `typeCompat(outputSchema, inputSchema)`
```typescript
function typeCompat(
  outputSchema: TSchema,
  inputSchema: TSchema,
): TypeCompatResult

interface TypeCompatResult {
  compatible: boolean;
  detail?: string;
  mismatches?: TypeMismatch[];
}

interface TypeMismatch {
  path: string;     // JSON path to the mismatched field
  expected: string; // What the input schema requires
  actual: string;   // What the output schema provides
}
```
Compares two TypeBox schemas and determines if the output schema is compatible with the input schema. Returns a structured result with details about mismatches.
### Compatibility rules
The analysis is **structural**, not semantic. It checks whether the output shape can satisfy the input shape:
1. **Exact match** — `outputSchema` and `inputSchema` are structurally identical → `compatible: true`
2. **Output is superset** — output has all fields that input requires, plus extras → `compatible: true` (output is a subtype of input, meaning input accepts output)
3. **Output is subset** — output is missing fields that input requires → `compatible: false`, with `mismatches` listing the missing fields
4. **Type mismatch** — output field type doesn't match input field type → `compatible: false`, with `mismatches` listing the type differences
5. **Unknown passthrough** — if either schema is `Type.Unknown()`, compatibility is unknown → no edge is created (not incompatible, just unresolvable)
### Subtype checking
The key insight: **output must be a subtype of input** for compatibility. This means:
- If input expects `{ name: string, age: number }`, output must provide at least those fields
- If input expects `string`, output providing `string | number` is **not** compatible (it could produce a number)
- If input expects `string | number`, output providing `string` **is** compatible (string is a subset of string|number)
This follows standard type theory: the output must be at least as specific as what the input requires.
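The subtype rules above can be sketched over a miniature schema model. Everything in this block is an assumption made for illustration: the `Schema` type is a hand-rolled stand-in rather than TypeBox's `TSchema`, the `mismatches` helper is hypothetical, and the `Type.Unknown()` passthrough rule is left out to keep the sketch short.

```typescript
// Hypothetical miniature schema model; a real implementation would inspect TypeBox schemas.
type Schema =
  | { kind: "string" | "number" | "boolean" }
  | { kind: "union"; anyOf: Schema[] }
  | { kind: "object"; required: Record<string, Schema> };

interface Mismatch { path: string; expected: string; actual: string }

const show = (s: Schema): string =>
  s.kind === "union" ? s.anyOf.map(show).join(" | ") : s.kind;

// Collects the reasons why a value of `output` would not be accepted by `input`.
function mismatches(output: Schema, input: Schema, path = "$"): Mismatch[] {
  // Every member of an output union must be accepted by the input.
  if (output.kind === "union") {
    return output.anyOf.flatMap((member) => mismatches(member, input, path));
  }
  // An input union accepts the output if at least one of its members does.
  if (input.kind === "union") {
    const accepted = input.anyOf.some((member) => mismatches(output, member, path).length === 0);
    return accepted ? [] : [{ path, expected: show(input), actual: show(output) }];
  }
  if (input.kind === "object" && output.kind === "object") {
    const provided = output.required;
    // Extra output fields are ignored; only the fields the input requires are checked.
    return Object.entries(input.required).flatMap(([key, want]) => {
      const have = provided[key];
      return have
        ? mismatches(have, want, `${path}.${key}`)
        : [{ path: `${path}.${key}`, expected: show(want), actual: "missing" }];
    });
  }
  return input.kind === output.kind
    ? []
    : [{ path, expected: show(input), actual: show(output) }];
}

const str: Schema = { kind: "string" };
const num: Schema = { kind: "number" };
const strOrNum: Schema = { kind: "union", anyOf: [str, num] };
const person: Schema = { kind: "object", required: { name: str, age: num } };
const full: Schema = { kind: "object", required: { name: str, age: num, email: str } };
const partial: Schema = { kind: "object", required: { name: str } };

const wideToNarrow = mismatches(strOrNum, str);   // output may be a number: incompatible
const narrowToWide = mismatches(str, strOrNum);   // string fits string | number: compatible
const supersetOk = mismatches(full, person);      // extra field is fine
const subsetFails = mismatches(partial, person);  // age is missing
```

An empty result corresponds to `compatible: true`; a non-empty result would populate the `mismatches` field of `TypeCompatResult`.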
### `buildTypeEdges(graph)`
```typescript
function buildTypeEdges(graph: FlowGraph<OperationNodeAttrs, OperationEdgeAttrs>): void
```
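Populates the operation graph with type-compatibility edges. For each ordered pair of distinct nodes (A, B), it calls `typeCompat(A.outputSchema, B.inputSchema)` and adds an edge carrying the result. When either schema is `Type.Unknown()`, no edge is added (see the unknown passthrough rule above).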
This is called automatically by `FlowGraph.fromSpecs()`. It can also be called manually after adding operations incrementally.
### Edge attributes from type compatibility
A type-compatibility edge carries:
```typescript
{
  edgeType: "typed",
  compatible: boolean,         // true if output feeds into input
  detail?: string,             // "classify.output is compatible with enrich.input"
  mismatches?: TypeMismatch[]  // specific field-level mismatches (if incompatible)
}
```
## Execution Ordering
### `topologicalOrder(graph)`
```typescript
function topologicalOrder(graph: FlowGraph): string[]
```
Returns node keys in topological order (prerequisites before dependents). Uses `graphology-dag`'s `topologicalSort` algorithm.
Throws `CircularDependencyError` if the graph contains cycles, with `cycles` populated by `findCycles()`.
### `parallelGroups(graph)`
```typescript
function parallelGroups(graph: FlowGraph): string[][]
```
Returns groups of nodes that can execute in parallel. Each group is an array of node keys. Groups are ordered by dependency depth:
- Group 0: nodes with no prerequisites (roots)
- Group 1: nodes whose only prerequisites are in Group 0
- Group N: nodes whose prerequisites are all in Groups 0 through N-1
This is useful for the hub coordinator to determine max parallelism: all nodes in a group can start simultaneously.
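The grouping described above is a layered variant of Kahn's algorithm. The sketch below works on plain node and edge arrays; the `layerByDepth` name and its signature are hypothetical and not the planned `parallelGroups` API.

```typescript
// Edges are [prerequisite, dependent] pairs. Hypothetical helper for illustration.
function layerByDepth(nodes: string[], edges: [string, string][]): string[][] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const n of nodes) {
    indegree.set(n, 0);
    dependents.set(n, []);
  }
  for (const [from, to] of edges) {
    indegree.set(to, (indegree.get(to) ?? 0) + 1);
    dependents.get(from)?.push(to);
  }

  const groups: string[][] = [];
  // Group 0: nodes with no prerequisites.
  let ready = nodes.filter((n) => indegree.get(n) === 0);
  while (ready.length > 0) {
    groups.push(ready);
    const next: string[] = [];
    for (const done of ready) {
      for (const dep of dependents.get(done) ?? []) {
        const remaining = (indegree.get(dep) ?? 0) - 1;
        indegree.set(dep, remaining);
        // A node becomes ready once its last prerequisite has been placed.
        if (remaining === 0) next.push(dep);
      }
    }
    ready = next;
  }
  return groups;
}

const groups = layerByDepth(
  ["a", "b", "c", "d", "e"],
  [["a", "c"], ["b", "c"], ["c", "d"], ["a", "e"]],
);
```

In this example `a` and `b` form the first group, `c` and `e` the second, and `d` the third. Nodes that sit on a cycle never reach an indegree of zero, so on a cyclic input they would simply be absent from the result.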
### `criticalPath(graph)`
```typescript
function criticalPath(graph: FlowGraph): string[]
```
Returns the longest path through the DAG, which represents the sequence of operations that determines the minimum total execution time. Useful for identifying bottlenecks.
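A longest-path search on a DAG can be done with a memoized walk over predecessors. The sketch below assumes every operation has the same cost, so "longest" means "most nodes"; weighting by expected duration would be a natural extension but is not specified here. The `longestPath` helper and its signature are hypothetical.

```typescript
// Hypothetical helper: longest path measured in node count (unit cost per step).
function longestPath(nodes: string[], edges: [string, string][]): string[] {
  const preds = new Map<string, string[]>();
  for (const n of nodes) preds.set(n, []);
  for (const [from, to] of edges) preds.get(to)?.push(from);

  // For each node: length of the longest path ending there, and the previous hop.
  const best = new Map<string, { length: number; via: string | null }>();

  // Memoized depth-first walk over predecessors; assumes the graph is acyclic.
  const visit = (node: string): number => {
    const cached = best.get(node);
    if (cached) return cached.length;
    let entry: { length: number; via: string | null } = { length: 1, via: null };
    for (const p of preds.get(node) ?? []) {
      const candidate = visit(p) + 1;
      if (candidate > entry.length) entry = { length: candidate, via: p };
    }
    best.set(node, entry);
    return entry.length;
  };

  let end: string | null = null;
  let endLength = 0;
  for (const n of nodes) {
    const len = visit(n);
    if (len > endLength) {
      end = n;
      endLength = len;
    }
  }

  // Walk the recorded hops backwards from the end of the longest path.
  const path: string[] = [];
  for (let cur = end; cur !== null; cur = best.get(cur)?.via ?? null) {
    path.unshift(cur);
  }
  return path;
}

const critical = longestPath(
  ["a", "b", "c", "d"],
  [["a", "b"], ["b", "c"], ["a", "d"]],
);
```

When several paths tie for the longest, this sketch returns the first one it encounters in node order.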
## Precondition Validation
### `validatePreconditions(graph)`
```typescript
function validatePreconditions(
  graph: FlowGraph<OperationNodeAttrs, OperationEdgeAttrs>
): ValidationError[]
```
For each node in the operation graph, checks that all required input fields are provided by at least one predecessor's output. Returns an array of `ValidationError` objects (never throws).
A "missing precondition" occurs when a node's input requires a field that no predecessor's output provides. This is a stronger check than type compatibility — it verifies that a valid execution path exists through the graph.
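As a rough sketch of the check, operations can be reduced to the field names they require and provide. The `OpNode` and `MissingPrecondition` shapes and the `missingPreconditions` helper are hypothetical. The sketch also assumes that nodes without predecessors receive their input from the workflow caller, which the text above does not state.

```typescript
// Hypothetical, flattened view of an operation node: field names only.
interface OpNode { id: string; requires: string[]; provides: string[] }
interface MissingPrecondition { node: string; field: string }

function missingPreconditions(
  nodes: OpNode[],
  edges: [string, string][], // [predecessor, successor]
): MissingPrecondition[] {
  const byId = new Map<string, OpNode>();
  for (const n of nodes) byId.set(n.id, n);

  const errors: MissingPrecondition[] = [];
  for (const node of nodes) {
    const preds = edges.filter(([, to]) => to === node.id).map(([from]) => from);
    // Assumption: root nodes take their input from the workflow caller.
    if (preds.length === 0) continue;

    // Union of every field that any predecessor's output provides.
    const available = new Set<string>();
    for (const p of preds) {
      for (const field of byId.get(p)?.provides ?? []) available.add(field);
    }
    for (const field of node.requires) {
      if (!available.has(field)) errors.push({ node: node.id, field });
    }
  }
  return errors; // Problems are collected and returned, never thrown.
}

const problems = missingPreconditions(
  [
    { id: "classify", requires: ["text"], provides: ["label"] },
    { id: "enrich", requires: ["label", "locale"], provides: ["profile"] },
  ],
  [["classify", "enrich"]],
);
```

Here `enrich` needs `locale`, which `classify` does not provide, so one problem is reported for that field.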
### `validateTemplate(template, operationGraph)`
```typescript
function validateTemplate(
  template: UNode,
  operationGraph: FlowGraph<OperationNodeAttrs, OperationEdgeAttrs>,
): ValidationError[]
```
Validates a workflow template against an operation graph:
1. **All operations exist** — every `<Operation name="X">` has a matching node in the operation graph
2. **No cycles** — the rendered DAG has no cycles
3. **Type compatibility** — sequential operations have compatible type edges (or no incompatible edge)
4. **Reachability** — all operations are reachable from the start
5. **No orphan nodes** — every operation has at least one incoming or outgoing edge (unless it's a single-operation template)
Returns an array of `ValidationError` objects. Template validation is advisory — it can produce warnings (e.g., "operation not in registry") and errors (e.g., "cycle detected").
## Reachability
### `reachableFrom(graph, nodeIds)`
```typescript
function reachableFrom(graph: FlowGraph, nodeIds: string[]): Set<string>
```
Returns all node keys reachable from the given starting nodes via directed edges. Useful for:
- Determining which operations a coordinator can reach from a starting operation
- Computing the abort cascade scope for a given call
- Finding all operations affected by a change to a particular operation
### `ancestors(graph, nodeId)`
```typescript
function ancestors(graph: FlowGraph, nodeId: string): string[]
```
Returns all ancestors of a node (nodes reachable via incoming edges). Useful for:
- Finding which operations must complete before a given operation can start
- Computing depth-from-roots for execution priority
### `descendants(graph, nodeId)`
```typescript
function descendants(graph: FlowGraph, nodeId: string): string[]
```
Returns all descendants of a node (nodes reachable via outgoing edges). Useful for:
- Finding all calls that would be affected by aborting a given call
- Computing the scope of a failure cascade
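Computing the scope of an abort is a breadth-first walk over outgoing edges. The `descendantsOf` helper below is a hypothetical, dependency-free sketch that operates on a plain edge list rather than a `FlowGraph`.

```typescript
// Edges are [parent, child] pairs. Hypothetical helper for illustration.
function descendantsOf(edges: [string, string][], start: string): string[] {
  const children = new Map<string, string[]>();
  for (const [from, to] of edges) {
    const list = children.get(from) ?? [];
    list.push(to);
    children.set(from, list);
  }

  const seen = new Set<string>();
  const queue = [...(children.get(start) ?? [])];
  while (queue.length > 0) {
    const node = queue.shift()!;
    if (seen.has(node)) continue; // reached earlier via another parent
    seen.add(node);
    queue.push(...(children.get(node) ?? []));
  }
  // The start node itself is not part of its own descendants.
  return Array.from(seen);
}

const callEdges: [string, string][] = [
  ["root", "a"],
  ["root", "b"],
  ["a", "c"],
  ["b", "c"],
  ["x", "y"],
];
const abortScope = descendantsOf(callEdges, "root");
```

Aborting `root` in this example would affect `a`, `b`, and `c`, while the unrelated `x` and `y` calls are untouched. Reversing each edge pair gives the corresponding `ancestors` walk.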
## Graph-Level Validation
### `validateGraph(graph)`
```typescript
function validateGraph(graph: FlowGraph): AnyValidationError[]
```
Runs all validation checks:
1. **Schema validation** — node attributes match `OperationNodeAttrs` or `CallNodeAttrs` schema
2. **Graph invariants** — no cycles, no dangling edges, no self-loops
3. **Orphan detection** — nodes with no edges (warning, not error)
Returns an array of `AnyValidationError` objects, which is a union type:
```typescript
type AnyValidationError = ValidationError | GraphValidationError;
```
Matching taskgraph's pattern, this function never throws — it collects all issues and returns them.
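The collect-and-return style can be sketched as follows. The `Issue` shape and `collectIssues` helper are hypothetical and cover only self-loops, dangling edges, and orphans; schema validation and cycle detection are omitted, and the real `AnyValidationError` union is defined in the schema document.

```typescript
// Hypothetical issue shape used only for this sketch.
interface Issue { severity: "error" | "warning"; code: string; message: string }

function collectIssues(nodes: string[], edges: [string, string][]): Issue[] {
  const known = new Set(nodes);
  const connected = new Set<string>();
  const issues: Issue[] = [];

  for (const [from, to] of edges) {
    if (from === to) {
      issues.push({ severity: "error", code: "self-loop", message: `${from} -> ${to}` });
    }
    for (const end of [from, to]) {
      if (known.has(end)) {
        connected.add(end);
      } else {
        issues.push({ severity: "error", code: "dangling-edge", message: `unknown node ${end}` });
      }
    }
  }

  // Orphans are reported as warnings, not errors.
  for (const n of nodes) {
    if (!connected.has(n)) {
      issues.push({ severity: "warning", code: "orphan", message: n });
    }
  }
  return issues; // Every problem is reported; nothing is thrown.
}

const issues = collectIssues(["a", "b", "c"], [["a", "a"], ["a", "ghost"]]);
```

Returning the full list lets a caller decide whether warnings should block a workflow, instead of stopping at the first failure.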
## Standalone Function Pattern
All analysis functions are standalone (not methods on `FlowGraph`). They take a `FlowGraph` instance as their first argument and return structured results. This follows taskgraph's pattern:
```typescript
// Standalone functions
import { topologicalOrder, hasCycles, typeCompat } from "@alkdev/flowgraph/analysis";
const order = topologicalOrder(graph);
const cycles = hasCycles(graph);
const result = typeCompat(outputSchema, inputSchema);
```
The `FlowGraph` class exposes convenience methods that delegate to these standalone functions:
```typescript
class FlowGraph {
  topologicalOrder(): string[] { return _topologicalOrder(this._graph); }
  hasCycles(): boolean { return _hasCycles(this._graph); }
  validate(): AnyValidationError[] { return _validate(this._graph); }
}
```
This pattern enables:
- **Tree-shaking** — consumers only import the analysis functions they use
- **Testing** — standalone functions are easier to test in isolation
- **Composition** — consumers can chain analysis functions without creating intermediate `FlowGraph` instances
## Constraints
- **Analysis functions are pure** — they don't mutate the graph, don't depend on external state, and don't throw on validation failures (they return error arrays)
- **Type compatibility is structural, not semantic** — `typeCompat()` checks schema shapes, not whether the data makes sense. "Age as number" is compatible with "count as number" even though they're semantically different.
- **Template validation is advisory** — warnings are not errors. A template with an unknown operation is a warning, not a validation failure (the operation might be added to the registry later).
- **Analysis functions work on the underlying `DirectedGraph`** — they're thin wrappers around graphology and graphology-dag functions, following the same pattern as taskgraph
- **`topologicalOrder()` throws on cycles** — unlike `validateGraph()` which returns errors, `topologicalOrder()` throws `CircularDependencyError` because it cannot produce a valid ordering from a cyclic graph
## Open Questions
1. **How deep should `typeCompat` check?** Currently it checks top-level field existence and type compatibility. Should it recursively check nested objects and arrays? Full recursive checking is more thorough but slower and may produce false negatives for schemas with dynamic structures.
2. **Should `validateTemplate` check runtime preconditions?** Currently it only checks structural validity and type compatibility. Runtime preconditions (e.g., "operation B requires an API key that operation A doesn't have access to") are beyond the scope of static analysis and belong to the access control layer.
3. **Should analysis functions be async?** For very large graphs (thousands of nodes), type compatibility checking could be slow. Making it async would allow incremental progress reporting. Current graphs are small enough (50-200 nodes) that synchronous checking is fine.
4. **Should `parallelGroups` account for resource constraints?** Currently it returns the theoretical maximum parallelism. An optional `maxConcurrency` parameter could limit group sizes for realistic scheduling.
## References
- Schema: [schema.md](schema.md) — `TypeCompatResult`, `TypeMismatch`, `ValidationError`
- Error handling: [error-handling.md](error-handling.md) — `CircularDependencyError`, `TypeIncompatError`
- Taskgraph analysis pattern: `@alkdev/taskgraph_ts/src/analysis/`
- TypeBox Value utilities: `@alkdev/typebox/value`


@@ -0,0 +1,289 @@
---
status: draft
last_updated: 2026-05-19
---
# Build & Distribution
Package structure, exports map, dependencies, and platform targets.
## Package Structure
```
@alkdev/flowgraph/
├── src/
│ ├── component/ # ujsx workflow components
│ │ ├── operation.ts # <Operation> component
│ │ ├── sequential.ts # <Sequential> component
│ │ ├── parallel.ts # <Parallel> component
│ │ ├── conditional.ts # <Conditional> component
│ │ └── index.ts
│ ├── host/
│ │ ├── graphology.ts # GraphologyHostConfig
│ │ ├── reactive.ts # ReactiveHostConfig
│ │ └── index.ts
│ ├── schema/
│ │ ├── enums.ts # CallStatus, EdgeType, NodeCategory, NodeStatus
│ │ ├── node.ts # OperationNodeAttrs, CallNodeAttrs
│ │ ├── edge.ts # TypedEdgeAttrs, CallEdgeAttrs, TemplateEdgeAttrs
│ │ ├── graph.ts # SerializedGraph, FlowGraphSerialized
│ │ └── index.ts
│ ├── graph/
│ │ ├── construction.ts # FlowGraph class (fromSpecs, fromCallEvents, fromJSON, etc.)
│ │ ├── validation.ts # validateSchema, validateGraph, validate
│ │ ├── queries.ts # topologicalOrder, hasCycles, ancestors, descendants, etc.
│ │ ├── mutation.ts # addNode, addEdge, updateNodeStatus, removeNode, etc.
│ │ └── index.ts
│ ├── reactive/
│ │ ├── workflow.ts # WorkflowReactiveRoot (signal-backed execution)
│ │ ├── node-status.ts # Signal<NodeStatus>, computed preconditions
│ │ └── index.ts
│ ├── analysis/
│ │ ├── type-compat.ts # typeCompat, buildTypeEdges, analyzeTypeCompat
│ │ ├── workflow.ts # validateTemplate, validatePreconditions
│ │ ├── defaults.ts # resolveDefaults for CallStatus, EdgeType, etc.
│ │ └── index.ts
│ ├── error/
│ │ └── index.ts # FlowgraphError hierarchy
│ └── index.ts # Barrel export
├── test/
│ ├── graph/
│ │ ├── construction.test.ts
│ │ ├── validation.test.ts
│ │ ├── queries.test.ts
│ │ └── mutation.test.ts
│ ├── schema/
│ │ └── enums.test.ts
│ ├── analysis/
│ │ ├── type-compat.test.ts
│ │ └── workflow.test.ts
│ ├── component/
│ │ └── components.test.ts
│ ├── host/
│ │ ├── graphology.test.ts
│ │ └── reactive.test.ts
│ └── error/
│ └── errors.test.ts
├── package.json
├── tsconfig.json
├── tsup.config.ts
├── vitest.config.ts
└── AGENTS.md
```
## Package JSON
```json
{
  "name": "@alkdev/flowgraph",
  "version": "0.1.0",
  "type": "module",
  "exports": {
    ".": {
      "import": "./dist/index.js",
      "require": "./dist/index.cjs"
    },
    "./component": {
      "import": "./dist/component/index.js",
      "require": "./dist/component/index.cjs"
    },
    "./host": {
      "import": "./dist/host/index.js",
      "require": "./dist/host/index.cjs"
    },
    "./schema": {
      "import": "./dist/schema/index.js",
      "require": "./dist/schema/index.cjs"
    },
    "./graph": {
      "import": "./dist/graph/index.js",
      "require": "./dist/graph/index.cjs"
    },
    "./reactive": {
      "import": "./dist/reactive/index.js",
      "require": "./dist/reactive/index.cjs"
    },
    "./analysis": {
      "import": "./dist/analysis/index.js",
      "require": "./dist/analysis/index.cjs"
    },
    "./error": {
      "import": "./dist/error/index.js",
      "require": "./dist/error/index.cjs"
    }
  },
  "typesVersions": {
    "*": {
      "component": ["./dist/component/index.d.ts"],
      "host": ["./dist/host/index.d.ts"],
      "schema": ["./dist/schema/index.d.ts"],
      "graph": ["./dist/graph/index.d.ts"],
      "reactive": ["./dist/reactive/index.d.ts"],
      "analysis": ["./dist/analysis/index.d.ts"],
      "error": ["./dist/error/index.d.ts"]
    }
  }
}
```
## Exports Map
Following the taskgraph pattern, each module has a sub-path export:
| Sub-path | Content | Use case |
|----------|---------|----------|
| `@alkdev/flowgraph` | Barrel export (everything) | Full import |
| `@alkdev/flowgraph/component` | `<Operation>`, `<Sequential>`, `<Parallel>`, `<Conditional>` | Template authoring |
| `@alkdev/flowgraph/host` | `GraphologyHostConfig`, `ReactiveHostConfig` | ujsx HostConfig implementations |
| `@alkdev/flowgraph/schema` | TypeBox schemas, enums, types | Schema-only import (no graph dependency) |
| `@alkdev/flowgraph/graph` | `FlowGraph` class, construction, mutation, queries | Core graph operations |
| `@alkdev/flowgraph/reactive` | `WorkflowReactiveRoot`, signal-based execution | Runtime execution |
| `@alkdev/flowgraph/analysis` | `typeCompat`, `validateTemplate`, ordering functions | Analysis and validation |
| `@alkdev/flowgraph/error` | Error classes | Error handling |
## Dependencies
### Production Dependencies
```json
{
  "dependencies": {
    "@alkdev/typebox": "workspace:*",
    "@alkdev/ujsx": "workspace:*",
    "@preact/signals-core": "^1.x",
    "graphology": "^0.25",
    "graphology-dag": "^0.4"
  },
  "peerDependencies": {
    "@alkdev/operations": "workspace:*"
  }
}
```
| Package | Role | Why |
|---------|------|-----|
| `@alkdev/typebox` | Schema definitions, validation, `Value.Check`, `Value.Errors` | Direct dependency — all schemas are TypeBox |
| `@alkdev/ujsx` | UNode, HostConfig, createRoot, h(), ReactiveRoot | Direct dependency — workflow templates are ujsx trees |
| `@preact/signals-core` | `signal`, `computed`, `effect`, `batch` | Transitive via ujsx, re-exported for flowgraph's reactive layer |
| `graphology` | `DirectedGraph` data structure | Core graph engine — same as taskgraph |
| `graphology-dag` | `topologicalSort`, `hasCycle`, `willCreateCycle`, `topologicalGenerations` | DAG-specific algorithms (flowgraph's `parallelGroups` can build on `topologicalGenerations`) |
| `@alkdev/operations` | `OperationSpec`, `CallEventMap`, `CallStatus` | Peer dependency — type imports only, no runtime dependency |
### Why `@alkdev/operations` is a Peer Dependency
Flowgraph imports `OperationSpec`, `CallEventMap`, and `CallStatus` types from `@alkdev/operations`, but does not depend on the runtime (registry, call handler, pending request map). Making it a peer dependency:
1. Avoids circular dependency concerns (operations doesn't depend on flowgraph)
2. Allows flowgraph to work with any version of operations that provides the right types
3. Reduces bundle size for consumers that don't use operations
### Why `@preact/signals-core` via `@alkdev/ujsx`
Flowgraph's reactive layer uses `signal()`, `computed()`, and `effect()` from `@preact/signals-core`. These are re-exported from `@alkdev/ujsx/reactive` so consumers don't need to import directly from Preact. If ujsx ever changes its reactive primitive library, only ujsx's re-export needs updating.
## Build Configuration
### tsup.config.ts
Following taskgraph's build pattern:
```typescript
import { defineConfig } from "tsup";

export default defineConfig({
  entry: {
    index: "src/index.ts",
    component: "src/component/index.ts",
    host: "src/host/index.ts",
    schema: "src/schema/index.ts",
    graph: "src/graph/index.ts",
    reactive: "src/reactive/index.ts",
    analysis: "src/analysis/index.ts",
    error: "src/error/index.ts",
  },
  format: ["esm", "cjs"],
  dts: true,
  clean: true,
  splitting: true,
  sourcemap: true,
});
```
- **ESM + CJS dual output** — matches all sibling packages
- **Code splitting** — code shared between entry points is emitted once as common chunks, so sub-path imports don't duplicate it
- **Source maps** — for debugging
- **Type declarations** — `.d.ts` files for all exports
### tsconfig.json
```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "strict": true,
    "declaration": true,
    "declarationMap": true,
    "sourceMap": true,
    "outDir": "./dist",
    "rootDir": "./src",
    "types": ["vitest/globals"]
  },
  "include": ["src/**/*.ts"],
  "exclude": ["node_modules", "dist", "test"]
}
```
Matches the tsconfig pattern of all `@alkdev` packages: ES2022 target, Node16 module resolution, strict mode.
### vitest.config.ts
```typescript
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    globals: true,
  },
});
```
## Platform Targets
Following taskgraph's philosophy: **pure JavaScript, no native addons**.
| Platform | Support Level | Notes |
|----------|---------------|-------|
| Node.js | Primary | All dependencies are pure JS |
| Deno | Compatible | ESM-first, no Node-specific APIs used |
| Bun | Compatible | All dependencies are Bun-compatible |
| Browser | Compatible | graphology and signals-core work in browsers |
The library has no native dependencies, no filesystem access, and no Node-specific APIs (no `fs`, `path`, `child_process`, etc.). This makes it platform-agnostic.
## Tree-Shaking
The sub-path export structure enables effective tree-shaking:
- Consumers using only `@alkdev/flowgraph/schema` don't pull in the graph engine or ujsx
- Consumers using only `@alkdev/flowgraph/analysis` don't pull in the reactive layer
- Consumers using `@alkdev/flowgraph/component` get ujsx but not graphology (templates can be defined without importing the graph engine)
The barrel export (`@alkdev/flowgraph`) re-exports everything for convenience, but consumers concerned about bundle size should use sub-path imports.
## Constraints
- **No filesystem access** — flowgraph is a pure computation library. Persistence is the hub's concern.
- **No network access** — no HTTP clients, WebSocket connections, or Redis clients. All data comes in through constructor arguments.
- **`@alkdev/operations` is a peer dependency** — type imports only, no runtime dependency on the operations registry or call protocol.
- **ESM-first** — the package is authored in ESM with CJS output generated by tsup. All internal imports use `.js` extensions for Node16 module resolution.
- **Code splitting enabled** — tsup's `splitting: true` lets sub-path entry points share common chunks instead of duplicating code.
- **Vitest for testing** — following the monorepo convention.
## References
- Taskgraph build configuration: `@alkdev/taskgraph_ts/tsup.config.ts`, `@alkdev/taskgraph_ts/tsconfig.json`
- ujsx build configuration: `@alkdev/ujsx/tsup.config.ts`
- graphology: https://github.com/graphology/graphology
- graphology-dag: https://github.com/graphology/graphology-dag


@@ -0,0 +1,255 @@
---
status: draft
last_updated: 2026-05-19
---
# Call Graph (Dynamic Runtime)
The dynamic call graph populated at runtime from call events. Nodes are call invocations with status and timestamps; edges are parent-child and dependency relationships.
## Overview
The call graph is the runtime counterpart to the operation graph. Where the operation graph captures what *can* happen (type compatibility), the call graph captures what *is* happening or *has happened* (running calls, completed calls, failures, aborts).
The call graph is populated automatically by the call protocol — every `call.requested` adds a node, every `call.responded`/`call.error`/`call.aborted` updates its status. This means the call graph is always in sync with the actual state of in-flight calls.
Key capabilities:
- **Abort cascading** — abort a call → all children are automatically aborted via `parentRequestId` chains
- **Observability** — query what's running, what failed, what's blocked
- **DAG operations** — topological sort of running calls, cycle detection (shouldn't happen but verified), reachability queries
- **Serialization** — `export()`/`fromJSON()` for Postgres persistence
## Construction
### fromCallEvents()
```typescript
static fromCallEvents(events: CallEventMapValue[]): FlowGraph<CallNodeAttrs, CallEdgeAttrs>
```
Builds a call graph from an array of call protocol events. Events are processed in order:
1. **`call.requested`** → add a `CallNodeAttrs` node with `status: "pending"`. If `parentRequestId` is set, add a `triggered` edge from parent to child.
2. **`call.responded`** → update node status to `completed`, set `output` and `completedAt`
3. **`call.error`** → update node status to `failed`, set `error` and `completedAt`
4. **`call.aborted`** → update node status to `aborted`, set `completedAt`
5. **`call.completed`** → update node status to `completed`, set `completedAt` (if not already set by `call.responded`)
Processing is idempotent — processing the same event twice has no effect (the node already has the updated status).
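
The event-processing rules above can be sketched as a small reducer over a `Map`. This is a minimal illustration under stated assumptions, not flowgraph's implementation: the `CallEvent` and `CallNode` shapes and the `applyEvent`/`fromEvents` helpers are hypothetical simplifications (the real event types live in `@alkdev/operations`), and the sketch omits the `running` state and edge creation.

```typescript
// Hypothetical, simplified shapes for illustration only.
type CallStatus = "pending" | "running" | "completed" | "failed" | "aborted";

interface CallEvent {
  type: "call.requested" | "call.responded" | "call.error" | "call.aborted" | "call.completed";
  requestId: string;
  parentRequestId?: string;
  timestamp: string;
}

interface CallNode {
  requestId: string;
  status: CallStatus;
  parentRequestId?: string;
  completedAt?: string;
}

const TERMINAL: ReadonlySet<CallStatus> = new Set<CallStatus>(["completed", "failed", "aborted"]);

function applyEvent(nodes: Map<string, CallNode>, event: CallEvent): void {
  if (event.type === "call.requested") {
    // Idempotent: a repeated call.requested does not reset an existing node.
    if (!nodes.has(event.requestId)) {
      nodes.set(event.requestId, {
        requestId: event.requestId,
        status: "pending",
        parentRequestId: event.parentRequestId,
      });
    }
    return;
  }
  const node = nodes.get(event.requestId);
  // Unknown call or already terminal: ignore, so replayed events are harmless.
  if (!node || TERMINAL.has(node.status)) return;
  node.status =
    event.type === "call.error" ? "failed" :
    event.type === "call.aborted" ? "aborted" : "completed";
  node.completedAt = event.timestamp;
}

function fromEvents(events: CallEvent[]): Map<string, CallNode> {
  const nodes = new Map<string, CallNode>();
  for (const event of events) applyEvent(nodes, event);
  return nodes;
}
```

Treating terminal states as final is what makes replays safe in this sketch: a duplicate `call.responded` finds the node already `completed` and leaves `completedAt` untouched.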
### Incremental: updateFromEvent()
```typescript
updateFromEvent(event: CallEventMapValue): void
```
Updates an existing call graph with a single call event. This is the primary interface for real-time graph population:
```typescript
const callGraph = new FlowGraph();
// Subscribe to call protocol events
pubsub.subscribe("call.requested", (event) => callGraph.updateFromEvent(event));
pubsub.subscribe("call.responded", (event) => callGraph.updateFromEvent(event));
pubsub.subscribe("call.error", (event) => callGraph.updateFromEvent(event));
pubsub.subscribe("call.aborted", (event) => callGraph.updateFromEvent(event));
pubsub.subscribe("call.completed", (event) => callGraph.updateFromEvent(event));
```
### fromJSON()
```typescript
static fromJSON(data: CallGraphSerialized): FlowGraph
```
Deserialize from graphology native JSON format. Used for loading persisted call graphs from Postgres.
## Node Attributes
See [schema.md](schema.md#CallNodeAttrs) for the full schema definition.
| Field | Type | Set by |
|-------|------|--------|
| `requestId` | `string` | `call.requested` |
| `operationId` | `string` | `call.requested` |
| `status` | `CallStatus` | Updated by each call event |
| `parentRequestId` | `string?` | `call.requested` |
| `input` | `unknown` | `call.requested` |
| `output` | `unknown?` | `call.responded` |
| `error` | `{ code, message, details? }?` | `call.error` |
| `identity` | `Identity?` | `call.requested` |
| `startedAt` | `string?` | `call.requested` (when handler starts) |
| `completedAt` | `string?` | Terminal event (`responded`, `error`, `aborted`, `completed`) |
The node key is `requestId`.
## Edges
Call graph edges carry an `edgeType` attribute:
| `edgeType` | Meaning | Added by |
|-----------|---------|----------|
| `triggered` | Parent call caused child call to execute | `call.requested` with `parentRequestId` |
| `depends_on` | Data dependency — source needs target's result | Explicit declaration (not auto-populated) |
`depends_on` edges are not auto-populated by the call protocol. They represent data dependencies that aren't captured by the parent-child hierarchy. They may be added by:
- Workflow template instantiation (the template knows which steps depend on which)
- Explicit `addDependency(source, target)` calls by the hub coordinator
### Edge Key Convention
`triggered` edges use `${parentRequestId}->${childRequestId}` as the edge key. `depends_on` edges use `${sourceRequestId}->${targetRequestId}:depends_on` to distinguish from `triggered` edges between the same pair.
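The edge key convention ensures deterministic, distinct keys for the two edge types. Note that graphology's `multi: false` allows at most one edge per ordered (source, target) pair regardless of key, so a `triggered` and a `depends_on` edge in the same direction between the same pair would require `multi: true` (or a single edge carrying both types).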
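
The key convention can be captured in a single helper. This is a small sketch; the `edgeKey` function name is an assumption made for illustration and is not part of the documented API.

```typescript
type EdgeType = "triggered" | "depends_on";

// Builds a deterministic edge key following the convention described above:
// `triggered` edges use the bare pair, `depends_on` edges carry a suffix.
function edgeKey(source: string, target: string, edgeType: EdgeType): string {
  const base = `${source}->${target}`;
  return edgeType === "triggered" ? base : `${base}:${edgeType}`;
}
```

Because the key is a pure function of its inputs, replaying the same event always yields the same key, which is what lets duplicate edge insertions be detected cheaply.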
## Status Lifecycle
Call node status transitions follow a strict state machine:
```
                 call.requested
                   ┌─────────┐
                   │ pending │
                   └────┬────┘
                 handler starts
                   ┌─────────┐
              ┌────│ running │────┐
              │    └────┬────┘    │
        call.aborted    │   call.aborted
              │         │         │
              ▼         │         ▼
         ┌─────────┐    │    ┌─────────┐
         │ aborted │    │    │ aborted │
         └─────────┘    │    └─────────┘
              ┌─────────┼─────────┐
              │         │         │
       call.responded   │    call.error
              │         │         │
              ▼         │         ▼
        ┌───────────┐   │    ┌────────┐
        │ completed │   │    │ failed │
        └───────────┘   │    └────────┘
                 call.completed
                  ┌───────────┐
                  │ completed │
                  └───────────┘
```
Invalid transitions (e.g., `completed` → `running`) throw `InvalidTransitionError`. The `updateStatus()` method validates the transition before applying it.
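
One way to enforce such a state machine is a lookup table of allowed transitions. The sketch below is an assumption-laden illustration: the `ALLOWED` table encodes only the transitions drawn in the diagram (whether `pending` may move directly to a terminal state is not specified here, so the sketch does not allow it), and `assertTransition` is a hypothetical helper, not the documented `updateStatus()`.

```typescript
type CallStatus = "pending" | "running" | "completed" | "failed" | "aborted";

// Allowed transitions as read from the diagram; terminal states have none.
const ALLOWED: Record<CallStatus, readonly CallStatus[]> = {
  pending: ["running"],
  running: ["completed", "failed", "aborted"],
  completed: [],
  failed: [],
  aborted: [],
};

class InvalidTransitionError extends Error {
  readonly requestId: string;
  readonly from: CallStatus;
  readonly to: CallStatus;

  constructor(requestId: string, from: CallStatus, to: CallStatus) {
    super(`Invalid status transition for call ${requestId}: ${from} -> ${to}`);
    this.name = "InvalidTransitionError";
    this.requestId = requestId;
    this.from = from;
    this.to = to;
  }
}

// Hypothetical helper: throws unless `from -> to` appears in the table.
function assertTransition(requestId: string, from: CallStatus, to: CallStatus): void {
  if (ALLOWED[from].indexOf(to) === -1) {
    throw new InvalidTransitionError(requestId, from, to);
  }
}
```

Keeping the rules in data rather than in branching code means a change to the lifecycle is a one-line edit to the table.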
## Abort Cascading
When a call is aborted, all of its descendants should also be aborted. The call protocol handles this via `call.aborted` events propagating through `parentRequestId` chains.
The call graph supports this with a traversal query:
```typescript
// Abort cascade: get all descendants of a call
const descendants = callGraph.descendants(requestId);
// → all calls that would be affected by aborting this call
```
The hub coordinator can:
1. Receive `call.aborted` for a parent call
2. Query `callGraph.descendants(requestId)` for all descendants (children, grandchildren, and so on)
3. Abort each descendant call via `PendingRequestMap.abort()`
This is a structural operation — the graph provides the "who is affected" information, the protocol provides the "abort them" mechanism.
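
The "who is affected" query amounts to a breadth-first walk over `triggered` edges. The following is a minimal sketch under stated assumptions: `ChildIndex` is a hypothetical adjacency map standing in for the graph, and the `abort` callback stands in for `PendingRequestMap.abort()`, whose real signature is not shown in this document.

```typescript
// Hypothetical adjacency view: parent requestId -> child requestIds.
type ChildIndex = Map<string, string[]>;

// Breadth-first collection of every call below `requestId`.
function descendants(children: ChildIndex, requestId: string): string[] {
  const result: string[] = [];
  const queue: string[] = (children.get(requestId) ?? []).slice();
  const seen = new Set<string>(queue);
  while (queue.length > 0) {
    const current = queue.shift() as string;
    result.push(current);
    for (const child of children.get(current) ?? []) {
      if (!seen.has(child)) {
        seen.add(child);
        queue.push(child);
      }
    }
  }
  return result;
}

// The graph answers "who"; the injected callback performs the abort.
function abortCascade(
  children: ChildIndex,
  requestId: string,
  abort: (id: string) => void,
): string[] {
  const affected = descendants(children, requestId);
  for (const id of affected) abort(id);
  return affected;
}
```

Passing the abort mechanism in as a callback keeps the traversal free of protocol concerns, which mirrors the split between graph and protocol described above.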
## Observability Queries
The call graph supports queries for observability without traversing the entire graph:
| Query | Method | Returns |
|-------|--------|---------|
| Get running calls | `filterByStatus("running")` | Node IDs with running status |
| Get failed calls | `filterByStatus("failed")` | Node IDs with failed status |
| Get top-level calls | `getRoots()` | Nodes with no `parentRequestId` |
| Get children of call | `children(requestId)` | Direct children via `triggered` edges |
| Get call duration | `duration(requestId)` | `completedAt - startedAt` (throws if not completed) |
| Get call lineage | `lineage(requestId)` | Ancestor chain from root to this call |
### filterByStatus
```typescript
filterByStatus(status: CallStatus): string[]
```
Returns all node keys with the given status. Implemented as a filter over `graph.forEachNode()`. For small graphs (tens to hundreds of nodes), this is O(n) and fast. For very large graphs, a status index could be added as an optimization.
### getRoots
```typescript
getRoots(): string[]
```
Returns all nodes with `parentRequestId === undefined` (top-level calls). These are the entry points of call chains.
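
The `lineage` and `duration` queries from the table above can be sketched over a plain node map. This is an illustrative sketch, not the library's code: the `CallNode` shape is simplified, and it assumes timestamps are ISO 8601 strings so that `Date.parse` yields milliseconds.

```typescript
// Simplified, hypothetical node shape for illustration.
interface CallNode {
  requestId: string;
  parentRequestId?: string;
  startedAt?: string;   // assumed ISO 8601
  completedAt?: string; // assumed ISO 8601
}

// Ancestor chain from the root down to `requestId` (inclusive).
// Terminates because the call graph is acyclic.
function lineage(nodes: Map<string, CallNode>, requestId: string): string[] {
  const chain: string[] = [];
  let current = nodes.get(requestId);
  while (current) {
    chain.unshift(current.requestId);
    current = current.parentRequestId ? nodes.get(current.parentRequestId) : undefined;
  }
  return chain;
}

// Elapsed milliseconds; throws if the call has not completed.
function duration(nodes: Map<string, CallNode>, requestId: string): number {
  const node = nodes.get(requestId);
  if (!node || !node.startedAt || !node.completedAt) {
    throw new Error(`Call ${requestId} has not completed`);
  }
  return Date.parse(node.completedAt) - Date.parse(node.startedAt);
}
```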
## Serialization and Persistence
```typescript
const data = callGraph.export(); // graphology native JSON
callGraph.toJSON(); // alias for export()
const restored = FlowGraph.fromJSON(data); // round-trip
```
The call graph's `export()`/`fromJSON()` boundary is designed for Postgres persistence via the hub's storage layer. Flowgraph does not handle database operations — it provides the serialized format, and the hub handles storage.
Payload fields (`input`, `output`, `error`) are stored as-is in the graph. The hub's storage layer is responsible for truncation and redaction (see `@alkdev/alkhub_ts/docs/architecture/storage/call-graph.md` for the payload handling strategy).
## Mutations
```typescript
// Add a call node (from call.requested event)
addCall(attrs: CallNodeAttrs): void
// Update call status (from call.responded/error/aborted/completed event)
updateStatus(requestId: string, status: CallStatus, extra?: Partial<CallNodeAttrs>): void
// Add a dependency edge (explicit, not auto-populated)
addDependency(source: string, target: string): void
// Remove a call node and its edges
removeCall(requestId: string): void
// Update call attributes (partial merge)
updateCall(requestId: string, attrs: Partial<CallNodeAttrs>): void
```
`updateStatus` validates the transition. `addDependency` validates that both endpoints exist. `removeCall` removes the node and all attached edges (graphology cascade).
## Constraints
- **DAG-only** — call graphs cannot have cycles. A call cannot be its own ancestor. `addCall` with a `parentRequestId` that would create a cycle throws `CycleError`.
- **Status transitions are validated** — invalid transitions throw `InvalidTransitionError`.
- **Node keys are `requestId`** — not `operationId`. Multiple calls to the same operation have different `requestId`s but the same `operationId`.
- **`parentRequestId` is both node attribute and edge** — denormalized for fast point lookups (node attribute) and traversal queries (edge), following the storage schema pattern.
- **`depends_on` edges are not auto-populated** — they represent data dependencies that the call protocol doesn't capture. They must be added explicitly by the hub coordinator or workflow template instantiation.
- **Payload fields are stored as-is** — flowgraph doesn't truncate or redact `input`, `output`, or `error`. That's the hub's responsibility at the persistence boundary.
- **Small graph sizes** — call graphs at hub level are typically tens of nodes. Performance is a non-issue; O(n) traversals are fine.
## Open Questions
1. **Should the call graph support `call.requested` events with unknown `operationId`?** If a `call.requested` event references an operation not in the registry, should the node be created with `operationId` set to the unknown value? Yes — the call graph records what happened, not what should have happened. The node gets a `status: "pending"` and may later transition to `"failed"` with an `OPERATION_NOT_FOUND` error code.
2. **Should `depends_on` edges be auto-populated from workflow templates?** When a call graph is instantiated from a workflow template, the template's sequential/parallel structure implies data dependencies. Should the template instantiation automatically create `depends_on` edges? This would couple the call graph to the template system, which may not always be desirable.
3. **Should the call graph support multiple graphs simultaneously (one per workflow execution)?** Currently the design assumes one call graph per `FlowGraph` instance. If the hub needs to track multiple concurrent workflows, it would use multiple instances. An alternative is a single graph with workflow-scoped subgraphs.
4. **Should `filterByStatus` use an index?** For small graphs (tens of nodes), a simple filter is fast. For very large graphs, maintaining a `Map<CallStatus, Set<string>>` index would make status queries O(1). The index would need to be updated on every `updateStatus()` call.
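
To make the trade-off in question 4 concrete, here is a rough sketch of what such an index might look like. The `StatusIndex` class is hypothetical and exists only to show the bookkeeping cost: every status change touches two buckets.

```typescript
type CallStatus = "pending" | "running" | "completed" | "failed" | "aborted";

// Hypothetical status index; would be updated on every status change.
class StatusIndex {
  private readonly byStatus = new Map<CallStatus, Set<string>>();
  private readonly current = new Map<string, CallStatus>();

  set(requestId: string, status: CallStatus): void {
    const previous = this.current.get(requestId);
    if (previous !== undefined) {
      // Remove from the old bucket before adding to the new one.
      this.byStatus.get(previous)?.delete(requestId);
    }
    let bucket = this.byStatus.get(status);
    if (!bucket) {
      bucket = new Set<string>();
      this.byStatus.set(status, bucket);
    }
    bucket.add(requestId);
    this.current.set(requestId, status);
  }

  remove(requestId: string): void {
    const previous = this.current.get(requestId);
    if (previous !== undefined) {
      this.byStatus.get(previous)?.delete(requestId);
      this.current.delete(requestId);
    }
  }

  // Bucket lookup is constant time; copying out is linear in the result size.
  filterByStatus(status: CallStatus): string[] {
    const bucket = this.byStatus.get(status);
    return bucket ? Array.from(bucket) : [];
  }
}
```

For graphs of tens of nodes this bookkeeping likely costs more in complexity than it saves in time, which is consistent with deferring it as an optimization.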
## References
- Schema: [schema.md](schema.md) — `CallNodeAttrs`, `CallEdgeAttrs`, `CallStatus`, `EdgeType`
- Call protocol: `@alkdev/alkhub_ts/docs/architecture/call-graph.md`
- Call graph storage: `@alkdev/alkhub_ts/docs/architecture/storage/call-graph.md`
- Call event types: `@alkdev/operations/src/call.ts`
- Taskgraph pattern: `@alkdev/taskgraph_ts/src/graph/construction.ts`


@@ -0,0 +1,69 @@
# ADR-001: ujsx Trees as Workflow Template IR
## Status
Proposed
## Context
Flowgraph needs a way to define workflow templates — reusable sequences of operations with conditional branching and parallel execution. The templates must be:
1. **Declarative** — defining *what* should happen, not *how*
2. **Composable** — nesting sequential, parallel, and conditional flows
3. **Serializable** — store in JSON, transmit over APIs, version in git
4. **Validatable** — check against an operation graph before execution
5. **Renderable to multiple targets** — structural validation (DAG) and runtime execution (reactive)
The obvious approach is a custom template format: an array of step objects with type discriminators:
```typescript
const template = [
  { type: "operation", name: "architect" },
  { type: "sequential", steps: [...] },
];
```
This works but has limitations:
- Custom format requires a custom parser, serializer, and validator
- No composition primitives — `sequential` and `parallel` are just types in an array
- No host switching — a separate compiler is needed for each target (DAG, execution engine)
- No incremental updates — changing a step requires rebuilding the entire structure
## Decision
Use ujsx `UNode` trees as the workflow template intermediate representation. Workflow components (`Operation`, `Sequential`, `Parallel`, `Conditional`) are `UComponent` functions that produce `UElement` nodes. The template is rendered to different targets through ujsx `HostConfig` implementations.
```typescript
const template = h(Sequential, {},
  h(Operation, { name: "architect" }),
  h(Operation, { name: "reviewer" }),
);
```
## Rationale
1. **No new format** — ujsx already defines `UNode`, `UElement`, `URoot`, type guards, and serialization. We don't need to design, implement, and maintain a template format.
2. **Composition is structural** — `<Sequential>` and `<Parallel>` compose naturally as parent-child structure in a tree. Array-of-objects requires custom merging logic.
3. **Host target switching** — the same `UNode` tree renders to a graphology DAG (for validation) or a reactive engine (for execution) by swapping the `HostConfig`. No template-specific compiler needed.
4. **Incremental updates** — when the ujsx reconciler is implemented, template changes (add/remove/reorder steps) can be applied incrementally without rebuilding the entire DAG. Array-of-objects requires full diffing and rebuilding.
5. **Reactive props** — `@preact/signals-core` enables signal-driven prop updates. An `Operation` node's `name` could be a `signal<string>`, enabling dynamic workflow modification at runtime.
6. **Serialization for free** — `UNode` trees are plain JSON. `JSON.stringify(template)` works. No custom serializer needed.
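
The host-switching idea in point 3 can be illustrated with a toy tree and two independent walkers. Everything here is a hypothetical stand-in: the `h` function, `TemplateNode` shape, and the `toEdges`/`toStatusMap` walkers are simplified for illustration and do not reflect ujsx's real `h`, `UNode`, or `HostConfig` APIs.

```typescript
// Hypothetical stand-ins; ujsx's real types and functions differ.
type Tag = "Operation" | "Sequential" | "Parallel";

interface TemplateNode {
  tag: Tag;
  props: { name?: string };
  children: TemplateNode[];
}

function h(tag: Tag, props: { name?: string }, ...children: TemplateNode[]): TemplateNode {
  return { tag, props, children };
}

interface Fragment {
  entries: string[];
  exits: string[];
  edges: Array<[string, string]>;
}

// Target 1: compile the tree into dependency edges (structural view).
function toEdges(node: TemplateNode): Fragment {
  if (node.tag === "Operation") {
    const name = node.props.name ?? "anonymous";
    return { entries: [name], exits: [name], edges: [] };
  }
  const parts = node.children.map(toEdges);
  const edges: Array<[string, string]> = [];
  for (const part of parts) edges.push(...part.edges);
  if (node.tag === "Parallel") {
    const entries: string[] = [];
    const exits: string[] = [];
    for (const part of parts) {
      entries.push(...part.entries);
      exits.push(...part.exits);
    }
    return { entries, exits, edges };
  }
  // Sequential: connect every exit of one step to every entry of the next.
  let previous: Fragment | undefined;
  for (const part of parts) {
    if (previous) {
      for (const from of previous.exits) {
        for (const to of part.entries) edges.push([from, to]);
      }
    }
    previous = part;
  }
  const first = parts[0];
  return { entries: first ? first.entries : [], exits: previous ? previous.exits : [], edges };
}

// Target 2: walk the same tree and seed a status map (runtime view).
function toStatusMap(
  node: TemplateNode,
  statuses: Map<string, string> = new Map<string, string>(),
): Map<string, string> {
  if (node.tag === "Operation") statuses.set(node.props.name ?? "anonymous", "pending");
  for (const child of node.children) toStatusMap(child, statuses);
  return statuses;
}
```

The point of the sketch is only that one tree feeds both walkers unchanged; in flowgraph that role is played by swapping the `HostConfig`.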
## Consequences
- **Direct dependency on `@alkdev/ujsx`** — flowgraph imports `h`, `createRoot`, `HostConfig`, `ReactiveRoot`, and type definitions from ujsx. This is a direct dependency, not a peer dependency.
- **Function props don't serialize** — `Conditional.test` can be a function `(results) => boolean`, which doesn't survive JSON round-trips. Templates with conditional branches need to provide `test` at render time or use expression strings.
- **Template components must follow ujsx component contract** — `(props) => UNode`. This is a minimal contract but it means components are synchronous functions that return a tree.
- **The template IS the tree** — there is no separate compilation step between the ujsx tree and the render target. The `HostConfig.render()` call IS the compilation.
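
The serialization caveat for function props follows directly from how `JSON.stringify` treats functions. The sketch below uses a hypothetical plain-object node shape (not ujsx's actual `UNode`) to show a `test` function disappearing in a JSON round trip while data props survive.

```typescript
// Hypothetical plain-object stand-in for a template node.
interface PlainNode {
  type: string;
  props: Record<string, unknown>;
  children: PlainNode[];
}

const conditional: PlainNode = {
  type: "Conditional",
  props: {
    // A function-valued prop: valid in memory, but not representable in JSON.
    test: (results: Record<string, unknown>) => results["review"] === "approved",
  },
  children: [{ type: "Operation", props: { name: "decomposer" }, children: [] }],
};

// JSON.stringify omits function-valued properties of objects.
const roundTripped = JSON.parse(JSON.stringify(conditional)) as PlainNode;

const testSurvived = "test" in roundTripped.props;
const childName = roundTripped.children[0]?.props["name"];
```

After the round trip, `testSurvived` is `false` and `childName` is still `"decomposer"`, which is why templates with conditional branches need `test` re-supplied at render time or expressed as a string.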
## References
- ujsx architecture: `@alkdev/ujsx/docs/architecture/README.md`
- ujsx HostConfig: `@alkdev/ujsx/docs/architecture/host-config.md`
- Workflow templates: [workflow-templates.md](../workflow-templates.md)
- Host configs: [host-configs.md](../host-configs.md)


@@ -0,0 +1,41 @@
# ADR-002: Enforce DAG Invariants (No Cycles)
## Status
Proposed
## Context
Flowgraph represents two types of graphs: operation graphs (static type compatibility) and call graphs (dynamic call hierarchy). Both, along with rendered workflow templates, are directed acyclic graphs (DAGs) by nature:
- **Operation graphs** — type flow is acyclic. An operation's output feeding back into its own input is a design error.
- **Call graphs** — execution order is acyclic. A call being its own ancestor is physically impossible (you can't trigger yourself before you start).
- **Workflow templates** — rendered templates must be DAGs. Cycles in a template mean infinite loops in execution.
Taskgraph, the sibling package, allows cycles in its graph and detects them via `hasCycles()` and `findCycles()`. This makes sense because task dependencies can form cycles (e.g., iterative refinement where task A depends on task B which depends on task A's revised output).
## Decision
Flowgraph enforces acyclicity at construction time. Adding an edge that would create a cycle throws `CycleError`. `topologicalOrder()` can always produce a valid ordering without needing a cycle check first.
This is a stricter invariant than taskgraph's approach. The rationale:
1. **Cycles in operation graphs are design errors** — if operation A's output type is compatible with operation B's input, and B's output is compatible with A's input, that's circular type flow. It means infinite recursion is possible.
2. **Cycles in call graphs are physically impossible** — a call cannot be its own ancestor. The call protocol ensures this via `parentRequestId` chains.
3. **Cycles in templates are execution errors** — a cycle in a `<Sequential>` chain means infinite execution. This should be caught at template validation time, not at runtime.
4. **DAG algorithms are simpler** — `topologicalOrder()` can always return a valid ordering. No need for `hasCycles()` + fallback path. `parallelGroups()` always produces a valid grouping. `reachableFrom()` never loops.
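
Construction-time enforcement boils down to a reachability check before each edge insertion: adding `source -> target` closes a cycle exactly when `target` already reaches `source`. The sketch below is a minimal illustration, not flowgraph's implementation; `TinyDag` is a hypothetical class, and it creates nodes implicitly for brevity, whereas the documented `addEdge()` throws `NodeNotFoundError` for missing endpoints.

```typescript
class CycleError extends Error {
  readonly cycles: string[][];

  constructor(cycles: string[][]) {
    super(`Adding this edge would create a cycle: ${JSON.stringify(cycles)}`);
    this.name = "CycleError";
    this.cycles = cycles;
  }
}

// Hypothetical minimal DAG used only to demonstrate the check.
class TinyDag {
  private readonly out = new Map<string, Set<string>>();

  addNode(key: string): void {
    if (!this.out.has(key)) this.out.set(key, new Set<string>());
  }

  // Depth-first search; returns a path from `from` to `to`, if any.
  private findPath(from: string, to: string): string[] | undefined {
    const stack: string[][] = [[from]];
    const seen = new Set<string>([from]);
    while (stack.length > 0) {
      const path = stack.pop() as string[];
      const last = path[path.length - 1] as string;
      if (last === to) return path;
      const nexts = this.out.get(last);
      if (!nexts) continue;
      nexts.forEach((next) => {
        if (!seen.has(next)) {
          seen.add(next);
          stack.push(path.concat(next));
        }
      });
    }
    return undefined;
  }

  addEdge(source: string, target: string): void {
    // Implicit node creation is a simplification of this sketch.
    this.addNode(source);
    this.addNode(target);
    const back = this.findPath(target, source);
    if (back) throw new CycleError([[source].concat(back)]);
    (this.out.get(source) as Set<string>).add(target);
  }
}
```

Because the check runs before the mutation, a rejected edge leaves the graph unchanged, so the acyclicity invariant holds after every call.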
## Consequences
- **`addEdge()` validates before adding** — if adding the edge would create a cycle, it throws `CycleError` with the cycle paths.
- **`fromSpecs()` and `fromCallEvents()` cannot produce cyclic graphs** — cycles in the input data throw errors.
- **`topologicalOrder()` does not throw on graphs built through the construction API** — it can always produce a valid ordering because those graphs are guaranteed acyclic.
- **`hasCycles()` returns `false` for any graph built through the construction API** — it is kept as a validation method for graphs loaded via `fromJSON()`, which doesn't enforce acyclicity during import.
- **This is different from taskgraph** — consumers familiar with taskgraph's `hasCycles()` → `findCycles()` → `topologicalOrder()` error-handling pattern need to adjust. In flowgraph, cycle prevention is at construction time, not query time.
## References
- Taskgraph cycle handling: `@alkdev/taskgraph_ts/docs/architecture/graph-model.md`
- Operation graph: [operation-graph.md](../operation-graph.md)
- Call graph: [call-graph.md](../call-graph.md)
- Error handling: [error-handling.md](../error-handling.md)


@@ -0,0 +1,57 @@
# ADR-003: Decoupled Storage — In-Memory Graph with Export/Import Boundary
## Status
Proposed
## Context
Call graphs need to persist across hub restarts. The alkhub storage schema (`call_graph_nodes` and `call_graph_edges` tables) stores call data in Postgres. The question is: should flowgraph handle its own persistence, or should it provide a serialization boundary and let the hub handle storage?
Taskgraph takes the serialization boundary approach: `export()` returns a graphology JSON blob, `fromJSON()` restores it. The hub stores this data in whatever format it needs.
The alkhub call graph storage schema has specific requirements (payload truncation, redaction, indexing) that are storage-layer concerns, not graph concerns.
## Decision
Flowgraph operates on in-memory graphology instances and provides `export()`/`fromJSON()` for serialization. Storage, persistence, and database operations are the hub's concern, not flowgraph's.
```typescript
// In-memory graph
const graph = FlowGraph.fromCallEvents(events);
// Export for persistence
const data = graph.export(); // graphology native JSON
// Hub stores this in Postgres
await db.saveCallGraph(data);
// Restore from storage
const restored = FlowGraph.fromJSON(await db.loadCallGraph());
```
## Rationale
1. **Separation of concerns** — flowgraph is a graph library, not a database client. Mixing graph operations with SQL queries violates the single-responsibility principle.
2. **Storage varies by consumer** — the hub uses Postgres, but other consumers might use SQLite, IndexedDB, or in-memory caches. Flowgraph shouldn't prescribe a storage backend.
3. **The storage schema has concerns beyond the graph** — payload truncation (10KB threshold), field redaction (stripping API keys), and indexing are storage-layer concerns. Flowgraph stores raw `input`/`output`/`error` fields; the hub handles truncation at the persistence boundary.
4. **Taskgraph's pattern works** — the same approach has served taskgraph well. The hub loads graph data from DB, constructs a `TaskGraph` in memory, runs analysis, and saves changes back.
5. **Platform-agnostic requirement** — flowgraph must work in Deno, Node, and Bun. Database clients vary by platform (native addons, connection pooling, etc.). Keeping flowgraph pure JS means no native dependencies.
## Consequences
- **`export()` and `fromJSON()` are the persistence boundary** — consumers that need persistence serialize the graph and handle storage themselves.
- **No database imports in flowgraph** — `pg`, `better-sqlite3`, `mongodb`, etc. are not in flowgraph's dependency tree.
- **Payload handling is the hub's concern** — flowgraph stores raw `input`/`output`/`error` on call nodes. Truncation and redaction happen when the hub writes to Postgres.
- **`fromJSON()` validates the data structure** — using `Value.Check()` against the `FlowGraphSerialized` schema. Invalid data throws `InvalidInputError`. But `fromJSON()` does NOT validate business rules (e.g., no cycles — that's `validateGraph()`).
- **The hub must keep its storage schema in sync with flowgraph's `FlowGraphSerialized`** — if the storage column types change, the hub's mapping code needs updating, not flowgraph.
## References
- Taskgraph serialization: `@alkdev/taskgraph_ts/src/graph/construction.ts` (fromJSON, export)
- Call graph storage: `@alkdev/alkhub_ts/docs/architecture/storage/call-graph.md`
- Schema: [schema.md](../schema.md) — FlowGraphSerialized format


@@ -0,0 +1,265 @@
---
status: draft
last_updated: 2026-05-19
---
# Error Handling
FlowgraphError hierarchy, validation error collection, and error boundaries.
## Design Principle
Flowgraph follows taskgraph's error handling pattern:
1. **Programmer errors throw** — invalid arguments, duplicate node IDs, cycles where acyclicity is enforced
2. **Operational conditions return structured results** — validation errors, type mismatches, unreachable nodes
3. **Graph mutations throw on constraint violations** — adding a duplicate node throws `DuplicateNodeError`, adding a cycle-creating edge throws `CycleError`
This means validation functions (`validateGraph()`, `validateSchema()`, `validateTemplate()`) *never throw* — they collect issues and return them. But construction functions (`fromSpecs()`, `addNode()`, `addEdge()`) *do throw* on constraint violations.
## Error Hierarchy
```
FlowgraphError                    # Base class for all flowgraph errors
├── ConstructionError             # Errors during graph construction
│   ├── DuplicateNodeError        # Duplicate node key
│   ├── DuplicateEdgeError        # Duplicate edge key
│   ├── NodeNotFoundError         # Referenced node doesn't exist
│   └── CycleError                # Adding an edge would create a cycle
├── ValidationError               # Schema validation failed (single field); returned, not thrown
├── GraphValidationError          # Graph-level validation issue; returned, not thrown
│   ├── CycleValidationError      # Cycle detected in the graph
│   └── DanglingReferenceError    # Edge references non-existent node
├── TypeIncompatError             # Type compatibility check failed; returned, not thrown
└── InvalidTransitionError        # Invalid call status transition
```
### FlowgraphError
```typescript
class FlowgraphError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "FlowgraphError";
  }
}
```
Base class. All flowgraph errors inherit from this.
### ConstructionError
```typescript
class ConstructionError extends FlowgraphError {
  constructor(message: string) {
    super(message);
    this.name = "ConstructionError";
  }
}
```
Base class for errors that occur during graph construction (`fromSpecs()`, `addNode()`, `addEdge()`, etc.).
### DuplicateNodeError
```typescript
class DuplicateNodeError extends ConstructionError {
  constructor(public readonly key: string) {
    super(`Node with key "${key}" already exists`);
    this.name = "DuplicateNodeError";
  }
}
```
Thrown when adding a node with a key that already exists in the graph.
### DuplicateEdgeError
```typescript
class DuplicateEdgeError extends ConstructionError {
  constructor(
    public readonly source: string,
    public readonly target: string,
  ) {
    super(`Edge "${source} -> ${target}" already exists`);
    this.name = "DuplicateEdgeError";
  }
}
```
Thrown when adding an edge between two nodes that already have an edge between them.
### NodeNotFoundError
```typescript
class NodeNotFoundError extends ConstructionError {
  constructor(public readonly key: string) {
    super(`Node "${key}" not found in graph`);
    this.name = "NodeNotFoundError";
  }
}
```
Thrown when referencing a node that doesn't exist (e.g., adding an edge with a non-existent endpoint).
### CycleError
```typescript
class CycleError extends ConstructionError {
  constructor(public readonly cycles: string[][]) {
    super(`Adding this edge would create a cycle: ${JSON.stringify(cycles)}`);
    this.name = "CycleError";
  }
}
```
Thrown when adding an edge would create a cycle. The `cycles` field contains the cycle paths that would be created.
Note: Unlike `CircularDependencyError` in taskgraph (which is thrown by `topologicalOrder()`), `CycleError` is thrown by `addEdge()` during construction. Taskgraph allows cycles and detects them later; flowgraph prevents them at construction time.
### ValidationError
```typescript
interface ValidationError {
  type: "schema";
  nodeKey: string;
  field: string;
  message: string;
  value?: unknown;
}
```
Returned by `validateSchema()` when a node's attributes don't match the TypeBox schema. This is a structured result, not a thrown error.
### GraphValidationError
```typescript
interface GraphValidationError {
  type: "graph";
  category: "cycle" | "dangling-reference" | "orphan-node" | "status-inconsistency";
  details: unknown;
}
```
Returned by `validateGraph()` for graph-level issues:
| Category | Meaning | Details |
|----------|---------|---------|
| `cycle` | The graph contains cycles | `{ cycles: string[][] }` |
| `dangling-reference` | An edge references a non-existent node | `{ source: string, target: string }` |
| `orphan-node` | A node has no incoming or outgoing edges | `{ nodeKey: string }` |
| `status-inconsistency` | A call node has incompatible status with its parent (e.g., parent completed but child still running) | `{ nodeKey: string, parentKey: string, nodeStatus: string, parentStatus: string }` |
### InvalidTransitionError
```typescript
class InvalidTransitionError extends FlowgraphError {
  constructor(
    public readonly requestId: string,
    public readonly from: CallStatus,
    public readonly to: CallStatus,
  ) {
    super(`Invalid status transition for call ${requestId}: ${from} → ${to}`);
    this.name = "InvalidTransitionError";
  }
}
```
Thrown when `updateStatus()` is called with an invalid transition (e.g., `completed → running`).
### TypeIncompatError
```typescript
interface TypeIncompatError {
  type: "type-compat";
  sourceKey: string;
  targetKey: string;
  compatible: false;
  mismatches: TypeMismatch[];
}

interface TypeMismatch {
  path: string;
  expected: string;
  actual: string;
}
```
Returned by `validateTemplate()` and `analyzeTypeCompat()` when an edge between operations has incompatible type schemas. This is a structured result, not a thrown error.
## Error Collection
Validation functions collect all issues into an array and return them. They do not throw on the first error:
```typescript
const errors = graph.validate();
// errors is AnyValidationError[], which may be empty
for (const error of errors) {
  if (error.type === "schema") {
    console.log(`Node ${error.nodeKey} has invalid field ${error.field}: ${error.message}`);
  } else if (error.type === "graph" && error.category === "cycle") {
    // `details` is typed `unknown`, so narrow it before reading `cycles`
    const { cycles } = error.details as { cycles: string[][] };
    console.log(`Graph has cycles: ${JSON.stringify(cycles)}`);
  }
}
```
This "collect all errors" pattern allows consumers to see all issues at once, rather than fixing them one at a time.
### AnyValidationError
```typescript
type AnyValidationError = ValidationError | GraphValidationError | TypeIncompatError;
```
Union type for all validation errors. Consumers use the `type` discriminator to handle each category:
```typescript
switch (error.type) {
  case "schema":      // ValidationError
  case "graph":       // GraphValidationError
  case "type-compat": // TypeIncompatError
}
```
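
A fuller version of the discriminator pattern can use a `never` check so the compiler flags any unhandled category. This is a sketch with simplified copies of the interfaces above; the `describeError` function is a hypothetical name introduced for illustration.

```typescript
// Simplified copies of the result shapes defined above.
interface ValidationError {
  type: "schema";
  nodeKey: string;
  field: string;
  message: string;
}

interface GraphValidationError {
  type: "graph";
  category: "cycle" | "dangling-reference" | "orphan-node" | "status-inconsistency";
  details: unknown;
}

interface TypeIncompatError {
  type: "type-compat";
  sourceKey: string;
  targetKey: string;
  mismatches: { path: string; expected: string; actual: string }[];
}

type AnyValidationError = ValidationError | GraphValidationError | TypeIncompatError;

// Hypothetical formatter; the `never` assignment fails to compile
// if a new member is added to the union without a matching case.
function describeError(error: AnyValidationError): string {
  switch (error.type) {
    case "schema":
      return `schema: ${error.nodeKey}.${error.field}: ${error.message}`;
    case "graph":
      return `graph: ${error.category}`;
    case "type-compat":
      return `type-compat: ${error.sourceKey} -> ${error.targetKey} (${error.mismatches.length} mismatch(es))`;
    default: {
      const unreachable: never = error;
      return unreachable;
    }
  }
}
```

Because validation results are returned rather than thrown, a consumer can map the whole array through a function like this and report every issue in one pass.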
## Throwing vs. Returning
The distinction between thrown errors and returned errors:
| Function | Behavior | Rationale |
|----------|----------|-----------|
| `addNode(key, attrs)` | Throws `DuplicateNodeError` on duplicate key | Adding a duplicate is a programmer error |
| `addEdge(source, target)` | Throws `NodeNotFoundError` on missing endpoint | Edge without endpoints is invalid |
| `addEdge(source, target)` | Throws `CycleError` if edge creates cycle | DAG invariant must be maintained |
| `updateStatus(id, status)` | Throws `InvalidTransitionError` on invalid transition | State machine must be enforced |
| `validateSchema()` | Returns `ValidationError[]` | Schema issues are validations, not crashes |
| `validateGraph()` | Returns `GraphValidationError[]` | Graph issues are validations, not crashes |
| `validateTemplate()` | Returns `AnyValidationError[]` | Template issues are validations, not crashes |
| `analyzeTypeCompat()` | Returns `TypeCompatResult` (includes mismatches) | Type incompatibility is advisory, not blocking |
| `topologicalOrder()` | Throws `CircularDependencyError` on cycles | No valid ordering exists for a cyclic graph; cycles can only be present in graphs loaded via `fromJSON()`, since construction prevents them |
This matches taskgraph's pattern: construction enforces invariants (throwing on violations), validation reports issues (returning error arrays).
## Error Boundaries in the Call Graph
The call graph has an additional error boundary: `updateFromEvent()`. Call events arrive from the pub/sub layer and may reference unknown operations or have invalid transitions. The error boundary handles these gracefully:
- **Unknown `requestId`** in a `call.responded` event → log warning, ignore the event (the call may have been created by a different process)
- **Invalid status transition** → log warning, ignore the event (the call may have transitioned in a different order)
- **Unknown `operationId`** → create the node anyway with `status: "pending"` (the operation may be registered later)
This makes `updateFromEvent()` resilient to out-of-order, duplicate, and partial events. Errors are logged but don't crash the process.
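A minimal sketch of such a boundary is shown below. The event shapes, the `call.started` event, the `ok` field, and the transition table are assumptions made for illustration; only `call.responded`, `requestId`, `operationId`, and the `"pending"` status come from the text above.

```typescript
// Hypothetical, simplified status set and event shapes; the real ones live in @alkdev/operations.
type SketchCallStatus = "pending" | "running" | "completed" | "failed" | "aborted";

type SketchCallEvent =
  | { kind: "call.requested"; requestId: string; operationId: string }
  | { kind: "call.started"; requestId: string }
  | { kind: "call.responded"; requestId: string; ok: boolean };

// Assumed transition table, for illustration only.
const allowedNext: Record<SketchCallStatus, SketchCallStatus[]> = {
  pending: ["running", "aborted"],
  running: ["completed", "failed", "aborted"],
  completed: [],
  failed: [],
  aborted: [],
};

class SketchCallGraph {
  readonly calls = new Map<string, { operationId: string; callStatus: SketchCallStatus }>();
  readonly warnings: string[] = [];

  updateFromEvent(evt: SketchCallEvent): void {
    if (evt.kind === "call.requested") {
      // An unknown operationId is accepted: the operation may be registered later.
      if (!this.calls.has(evt.requestId)) {
        this.calls.set(evt.requestId, { operationId: evt.operationId, callStatus: "pending" });
      }
      return;
    }
    const call = this.calls.get(evt.requestId);
    if (!call) {
      // Unknown requestId: warn and ignore, the call may belong to another process.
      this.warnings.push(`unknown requestId ${evt.requestId}`);
      return;
    }
    const next: SketchCallStatus =
      evt.kind === "call.started" ? "running" : evt.ok ? "completed" : "failed";
    if (!allowedNext[call.callStatus].includes(next)) {
      // Invalid transition: warn and ignore, events may be duplicated or out of order.
      this.warnings.push(`invalid transition ${call.callStatus} -> ${next}`);
      return;
    }
    call.callStatus = next;
  }
}

const callGraphSketch = new SketchCallGraph();
callGraphSketch.updateFromEvent({ kind: "call.responded", requestId: "r0", ok: true }); // unknown requestId
callGraphSketch.updateFromEvent({ kind: "call.requested", requestId: "r1", operationId: "task.unregistered" });
callGraphSketch.updateFromEvent({ kind: "call.responded", requestId: "r1", ok: true }); // arrives before call.started
callGraphSketch.updateFromEvent({ kind: "call.started", requestId: "r1" });
callGraphSketch.updateFromEvent({ kind: "call.responded", requestId: "r1", ok: true });
callGraphSketch.updateFromEvent({ kind: "call.responded", requestId: "r1", ok: true }); // duplicate
```

None of the six events throws. Three of them are recorded as warnings, and the call ends in `completed`.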
## Constraints
- **Validation functions never throw** — they collect errors and return them. This is a contract.
- **Construction functions throw on invariant violations** — adding a cycle-creating edge is a programming error, not a validation finding.
- **All errors have structured data** — `CycleError` includes cycle paths, `InvalidTransitionError` includes from/to status, `TypeIncompatError` includes mismatch details.
- **Error messages are descriptive** — errors include enough context to diagnose the problem without additional lookups.
- **Error classes follow the taskgraph pattern** — naming, structure, and behavior match `@alkdev/taskgraph_ts/src/error/`.
## References
- Taskgraph errors: `@alkdev/taskgraph_ts/src/error/`
- Call protocol events: `@alkdev/operations/src/call.ts`
- Schema: [schema.md](schema.md)


@@ -0,0 +1,348 @@
---
status: draft
last_updated: 2026-05-19
---
# Host Configs
The two `HostConfig` implementations that render workflow templates to different targets: graphology DAG (structural analysis) and reactive execution engine (runtime status tracking).
## Overview
Flowgraph uses ujsx's `HostConfig` pattern to render the same workflow template (`UNode` tree) to different targets. Each HostConfig implements the `HostConfig<WorkflowTag, Instance, RootCtx>` interface:
| HostConfig | Target | Purpose |
|------------|--------|---------|
| GraphologyHostConfig | `DirectedGraph` | Validate templates, check cycles, compute topological order |
| ReactiveHostConfig | `Map<string, WorkflowNode>` | Runtime execution with signal-driven status propagation |
Both HostConfigs share the same template components (`Operation`, `Sequential`, `Parallel`, `Conditional`) and the same tag type. The difference is what `createInstance` and `appendChild` do:
- **GraphologyHostConfig**: Creates graph nodes and edges. `appendChild` adds an edge.
- **ReactiveHostConfig**: Creates a `WorkflowNode` (with a `signal<NodeStatus>`) and registers preconditions. `appendChild` registers the parent-child dependency.
## WorkflowTag Type
```typescript
type WorkflowTag = "operation" | "sequential" | "parallel" | "conditional" | "map";
```
This constrains `HostConfig<TTag, ...>` to only accept workflow-specific element types. Attempting to render an unsupported tag (e.g., `"div"`) is a type error at compile time.
## GraphologyHostConfig
### Type Parameters
```typescript
const graphologyHost: HostConfig<WorkflowTag, GraphNode, GraphContext>
```
- **TTag**: `WorkflowTag`
- **Instance**: `GraphNode` (a logical handle for what each template element becomes; graphology has no subgraph instances)
- **RootCtx**: `GraphContext` (the root context carrying the graph and metadata)
The `Instance` type is:
```typescript
interface GraphNode {
key: string; // The graphology node key
attributes: OperationNodeAttrs | TemplateNodeAttrs;
}
```
The `RootCtx` is:
```typescript
interface GraphContext {
graph: DirectedGraph; // The graphology DAG being built
parentStack: string[]; // Stack of parent node keys for edge creation
operationRegistry?: OperationRegistry; // Optional, for name resolution
}
```
### createRootContext
```typescript
createRootContext(container, options, context): GraphContext {
const graph = new DirectedGraph({ type: "directed", multi: false, allowSelfLoops: false });
return { graph, parentStack: [], operationRegistry: options?.registry };
}
```
Creates a fresh `DirectedGraph` with DAG constraints (no self-loops, no parallel edges). The `container` parameter is unused — the graph IS the container.
### createInstance
```typescript
createInstance(tag: WorkflowTag, props, ctx: GraphContext, parent?: GraphNode): GraphNode {
  switch (tag) {
    case "operation": {
      const key = props.name as string;
      // operationAttrs: the OperationNodeAttrs resolved for this operation (resolution not shown)
      const attributes = { ...operationAttrs, name: key };
      ctx.graph.addNode(key, attributes);
      return { key, attributes };
    }
    case "sequential":
    case "parallel":
    case "conditional":
    case "map":
      // Structural containers: no node in the graph, just manage parentStack.
      // counter: a module-level number used to generate unique synthetic keys (not shown)
      return { key: `__${tag}_${counter++}`, attributes: {} };
  }
}
```
`Operation` elements create real graph nodes. Structural containers (`Sequential`, `Parallel`, `Conditional`, `Map`) do NOT create graph nodes — they manage the `parentStack` to influence edge creation for their children.
This is a key design decision: **structural containers are transparent in the graph**. A `Sequential` node doesn't appear as a node in the DAG. It only affects the edges between its children.
### appendChild
```typescript
appendChild(parent: GraphNode, child: GraphNode, ctx: GraphContext): void {
// Only add edges between real nodes (not structural containers)
if (!isStructuralContainer(parent) && !isStructuralContainer(child)) {
const edgeType = inferEdgeType(ctx, parent.key, child.key);
ctx.graph.addEdgeWithKey(
`${parent.key}->${child.key}`,
parent.key,
child.key,
{ edgeType, compatible: true }
);
}
}
```
Edge creation depends on the context:
- Children of a `Sequential` container: sequential edges between consecutive siblings
- Children of a `Parallel` container: no edges between siblings
- Children of a `Conditional` container: conditional edge to the test branch
### How Sequential edges are created
The `Sequential` component doesn't create edges itself. Instead, the HostConfig tracks the `parentStack` and creates edges between consecutive siblings:
```typescript
// In the rendering of <Sequential>
// After child1 is appended: parentStack = [child1.key]
// After child2 is appended: edge child1→child2 is created, parentStack = [child2.key]
// After child3 is appended: edge child2→child3 is created, parentStack = [child3.key]
```
The `parentStack` is managed by the `Sequential` component's `finalizeInstance` hook — it pops the last child after rendering all children, replacing it with the overall group's last child.
### How Parallel handles edges
The `Parallel` component renders all children without creating inter-child edges. It pushes a "parallel group" marker onto the `parentStack` so that the group's successors connect to ALL parallel children, not just the last one.
This requires the HostConfig to understand parent-child relationships for `Parallel` groups: the group's successors should connect to each parallel child.
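The sequential and parallel rules can be sketched without ujsx or graphology by having each group report its entry and exit operations. This is an illustrative simplification, not the HostConfig implementation: `TemplateSketch`, `GroupEnds`, and `flattenTemplate` are hypothetical names, and conditional and map containers are left out.

```typescript
// Hypothetical, simplified template tree; real templates are ujsx UNode trees.
type TemplateSketch =
  | { tag: "operation"; name: string }
  | { tag: "sequential" | "parallel"; children: TemplateSketch[] };

interface GroupEnds {
  entries: string[]; // operations that start the group
  exits: string[];   // operations the group's successors must wait for
}

function flattenTemplate(node: TemplateSketch, edges: Array<[string, string]>): GroupEnds {
  if (node.tag === "operation") {
    return { entries: [node.name], exits: [node.name] };
  }
  const parts = node.children.map(child => flattenTemplate(child, edges));
  if (node.tag === "parallel") {
    // No edges between siblings; the group exposes every child's entries and exits.
    const entries: string[] = [];
    const exits: string[] = [];
    for (const part of parts) {
      entries.push(...part.entries);
      exits.push(...part.exits);
    }
    return { entries, exits };
  }
  // Sequential: connect each part's exits to the next part's entries.
  let previous: GroupEnds | undefined;
  for (const part of parts) {
    if (previous) {
      for (const from of previous.exits) {
        for (const to of part.entries) edges.push([from, to]);
      }
    }
    previous = part;
  }
  const first = parts[0];
  return { entries: first ? first.entries : [], exits: previous ? previous.exits : [] };
}

const templateEdges: Array<[string, string]> = [];
flattenTemplate(
  {
    tag: "sequential",
    children: [
      { tag: "operation", name: "a" },
      { tag: "parallel", children: [{ tag: "operation", name: "b" }, { tag: "operation", name: "c" }] },
      { tag: "operation", name: "d" },
    ],
  },
  templateEdges,
);
// templateEdges: a→b, a→c, b→d, c→d
```

The parallel group contributes no edges of its own, yet its successor `d` ends up connected to both `b` and `c`, which is the behavior the "parallel group" marker is meant to achieve.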
### finalizeInstance
```typescript
finalizeInstance?(instance: GraphNode, ctx: GraphContext): void {
// Pop the structural container from the parentStack after all children are rendered
// This is important for Sequential and Parallel to clean up their structural state
}
```
### Cycle Detection
After rendering, the HostConfig checks for cycles using `graphology-dag.hasCycle()`. If a cycle is detected, the rendering throws `CircularDependencyError` with the cycle paths.
This is the primary validation step: a valid workflow template must produce a valid DAG. Cycles in a template mean infinite loops in execution, which are always design errors.
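`hasCycle()` only answers yes or no, so reporting cycle paths needs an extra traversal. One possible approach is a depth-first search that tracks the current path, sketched below on a plain adjacency map rather than a graphology instance. `findCycleSketch` is a hypothetical helper, not part of flowgraph or graphology-dag.

```typescript
// Returns one cycle as a list of node keys (first key repeated at the end),
// or null when the graph is acyclic.
function findCycleSketch(adjacency: Map<string, string[]>): string[] | null {
  const visiting = new Set<string>(); // nodes on the current DFS path
  const done = new Set<string>();     // nodes fully explored
  const path: string[] = [];

  const visit = (key: string): string[] | null => {
    if (done.has(key)) return null;
    if (visiting.has(key)) {
      // Back edge: the cycle is the path suffix that starts at `key`.
      return path.slice(path.indexOf(key)).concat(key);
    }
    visiting.add(key);
    path.push(key);
    for (const next of adjacency.get(key) ?? []) {
      const cycle = visit(next);
      if (cycle) return cycle;
    }
    path.pop();
    visiting.delete(key);
    done.add(key);
    return null;
  };

  for (const key of Array.from(adjacency.keys())) {
    const cycle = visit(key);
    if (cycle) return cycle;
  }
  return null;
}

const cyclicSketch = new Map<string, string[]>([["a", ["b"]], ["b", ["c"]], ["c", ["a"]]]);
const foundCycle = findCycleSketch(cyclicSketch); // ["a", "b", "c", "a"]
```

A path in this form could be attached to `CircularDependencyError` as structured data.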
## ReactiveHostConfig
### Type Parameters
```typescript
const reactiveHost: HostConfig<WorkflowTag, WorkflowNode, ReactiveContext>
```
- **TTag**: `WorkflowTag`
- **Instance**: `WorkflowNode` (carries a `signal<NodeStatus>` and computed preconditions)
- **RootCtx**: `ReactiveContext` (carries the operation registry and status tracking)
### WorkflowNode
```typescript
interface WorkflowNode {
key: string; // Operation name or structural container ID
type: "operation" | "sequential" | "parallel" | "conditional" | "map";
status: Signal<NodeStatus>; // Reactive status signal
prerequisites: Computed<boolean>; // Computed: true when all prerequisites are met
operationId?: string; // For operation nodes: the fully qualified ID
output?: Signal<unknown>; // For operation nodes: the call result (when completed)
children: WorkflowNode[]; // Child nodes (structural containers have children)
}
```
Each `WorkflowNode` holds:
- A `signal<NodeStatus>` that tracks the call's lifecycle (`idle` → `waiting` → `ready` → `running` → `completed`/`failed`/`aborted`/`skipped`)
- A `computed` that derives `prerequisites` from parent nodes' statuses
- An optional `output` signal that holds the call result when completed
### ReactiveContext
```typescript
interface ReactiveContext {
operationRegistry: OperationRegistry;
nodes: Map<string, WorkflowNode>; // All nodes by key
statusSignals: Map<string, Signal<NodeStatus>>; // Status signals by key
}
```
### createInstance
```typescript
createInstance(tag: WorkflowTag, props, ctx: ReactiveContext, parent?: WorkflowNode): WorkflowNode {
const key = props.key ?? generateKey();
const status = signal<NodeStatus>("idle");
const node: WorkflowNode = {
key,
type: tag,
status,
prerequisites: computed(() => computePrerequisites(node, ctx)),
children: [],
};
ctx.nodes.set(key, node);
ctx.statusSignals.set(key, status);
if (tag === "operation") {
node.operationId = props.name as string;
node.output = signal<unknown>(undefined);
}
return node;
}
```
### Prerequisite Computation
The `prerequisites` computed signal for each node derives from its structural context:
- **Sequential child**: prerequisites = previous sibling is `completed`
- **Parallel child**: prerequisites = parent's prerequisites are met
- **Conditional child**: prerequisites = parent's prerequisites are met AND condition evaluates to true
```typescript
function computePrerequisites(node: WorkflowNode, ctx: ReactiveContext): boolean {
// Sequential: previous sibling must be completed
// Parallel: parent must be ready
// Conditional: condition must evaluate to true
const predecessorKeys = getPredecessorKeys(node, ctx);
return predecessorKeys.every(key => {
const status = ctx.statusSignals.get(key)?.value;
return status === "completed" || status === "skipped";
});
}
```
### Status Propagation
When a node's `status` signal changes, its dependents' `prerequisites` computed automatically re-evaluate. If prerequisites are met, the node transitions to `ready`:
```typescript
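effect(() => {
  // peek() reads the status without subscribing, so this effect only re-runs when prerequisites change
  const current = node.status.peek();
  // Only promote nodes that have not started; never overwrite running or terminal states
  if (node.prerequisites.value && (current === "idle" || current === "waiting")) {
    node.status.value = "ready";
  }
});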
```
The reactive engine then starts the call associated with the node, which sets `status` to `running`, and eventually `completed` or `failed`.
### Abort Cascading
When a node is aborted, all its descendants are also aborted:
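```typescript
function cascadeAbort(node: WorkflowNode): void {
  // Abort anything that has not reached a terminal state, including nodes that are still idle
  const current = node.status.value;
  if (current === "idle" || current === "waiting" || current === "ready" || current === "running") {
    node.status.value = "aborted";
  }
  for (const child of node.children) {
    cascadeAbort(child);
  }
}
```
The traversal itself is imperative, but its consequences are reactive: each write to a `status` signal re-evaluates the dependent `prerequisites`, and because an aborted node never counts as `completed` or `skipped`, its dependents never become `ready`.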
## Two HostConfigs, One Template
The key insight: the same ujsx template renders to both targets:
```typescript
const template = h(Sequential, {},
h(Operation, { name: "architect" }),
h(Operation, { name: "reviewer" }),
);
// Validate structure
const dagRoot = createRoot(graphologyHost, new DirectedGraph());
dagRoot.render(template);
hasCycle(dagRoot.ctx.graph); // → false (valid DAG); hasCycle is imported from graphology-dag
// Execute reactively
const reactiveRoot = createRoot(reactiveHost, { registry });
reactiveRoot.render(template);
// Each operation node now has a signal<NodeStatus>
```
No template-specific logic is needed in either HostConfig. The same `UNode` tree, the same components, the same rendering pipeline — just different `createInstance`/`appendChild` implementations.
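The idea can be reduced to a few lines. The sketch below is a deliberately tiny stand-in for the HostConfig pattern, not ujsx's interface: `MiniTemplate`, `MiniHost`, and `mount` are hypothetical, and the real interface has more hooks (root context, `finalizeInstance`, and so on).

```typescript
// Hypothetical, much-reduced template and host shapes.
interface MiniTemplate {
  tag: "operation" | "sequential";
  name?: string;
  children?: MiniTemplate[];
}

interface MiniHost<Instance> {
  createInstance(tag: MiniTemplate["tag"], name: string | undefined): Instance;
  appendChild(parentInstance: Instance, child: Instance): void;
}

// One rendering routine for every host: a child is appended only after
// its own subtree has been mounted.
function mount<Instance>(template: MiniTemplate, host: MiniHost<Instance>): Instance {
  const instance = host.createInstance(template.tag, template.name);
  for (const child of template.children ?? []) {
    host.appendChild(instance, mount(child, host));
  }
  return instance;
}

// Host 1: structural target, collects operation names in order.
const outlineHost: MiniHost<string[]> = {
  createInstance: (tag, name) => (tag === "operation" && name ? [name] : []),
  appendChild: (parentInstance, child) => {
    parentInstance.push(...child);
  },
};

// Host 2: runtime target, gives every operation an initial status.
const runtimeStatuses = new Map<string, string>();
const runtimeHost: MiniHost<null> = {
  createInstance: (tag, name) => {
    if (tag === "operation" && name) runtimeStatuses.set(name, "idle");
    return null;
  },
  appendChild: () => {},
};

const pipeline: MiniTemplate = {
  tag: "sequential",
  children: [
    { tag: "operation", name: "architect" },
    { tag: "operation", name: "reviewer" },
  ],
};

const outline = mount(pipeline, outlineHost); // ["architect", "reviewer"]
mount(pipeline, runtimeHost);                 // runtimeStatuses now holds both operations as "idle"
```

`mount` contains no knowledge of either target. Everything target-specific sits in the two host objects.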
## Known Gaps
### ujsx Reconciler Not Yet Available
The current ujsx `HostConfig` is mount-only (see [host-config.md](../../../ujsx/docs/architecture/host-config.md)). The reconciler research (see [reconciler.md](../../../ujsx/docs/architecture/reconciler.md)) has not been implemented yet. This means:
- `render()` can only be called once per root
- No incremental template updates
- No `prepareUpdate`/`commitUpdate` flow
For flowgraph, this is acceptable in v1 because:
- Template rendering is typically done once at startup
- Runtime status updates flow through signals, not through template re-rendering
- When the reconciler is implemented, flowgraph gains incremental template updates "for free"
### Structural Container Handling
The current design where `Sequential`, `Parallel`, and `Conditional` don't create graph nodes is clean for the DAG, but creates complexity for the reactive engine — the "previous sibling" precondition depends on understanding the structural context, which isn't stored on the node itself.
Alternative: Create "virtual" nodes for structural containers that hold `signal<NodeStatus>` but don't correspond to graph nodes. This makes the reactive engine simpler (every node has a status and prerequisites) at the cost of a slightly larger node tree.
### Conditional Test Evaluation
The `Conditional.test` prop can be a function or a string. At the template level, it's stored as a prop. At runtime, the reactive engine evaluates it as a `computed` that depends on referenced nodes' outputs. This evaluation needs access to the `WorkflowContext` (which holds the results of previous steps), which means the reactive engine must have a reference to the call graph or a results map.
## Constraints
- **Both HostConfigs share the same `WorkflowTag` type** — element types that workflow templates use. Non-workflow tags (`"div"`, `"span"`, etc.) are type errors.
- **GraphologyHostConfig produces a static DAG** — the rendered DAG is immutable after rendering. No re-rendering until the reconciler is available.
- **ReactiveHostConfig requires an operation registry** — `Operation` nodes reference operations by name, and the registry resolves them at render time.
- **Template rendering is one-shot** — until the reconciler is implemented, `createRoot(host, container).render(template)` can only be called once per root.
- **Structural containers are transparent in the DAG** — Sequential, Parallel, Conditional create edges between children, not nodes for themselves.
- **HostConfigs must follow ujsx's post-order append contract** — children are appended to parents after all descendants are created. This guarantees that edges are created bottom-up.
## Open Questions
1. **Should structural containers create "virtual" nodes in the reactive engine?** This would simplify prerequisite computation (every node has a status) but adds nodes that don't correspond to calls or operations.
2. **Should the GraphologyHostConfig produce a separate graph for edge types?** Currently all edge types (`sequential`, `conditional`, `typed`) share the same graph. An alternative is a separate graph per edge type, enabling type-specific queries without filtering.
3. **How does the ReactiveHostConfig interact with the call protocol?** When a node transitions to `ready`, the reactive engine needs to call `registry.execute()` or `PendingRequestMap.call()`. This bridges the reactive layer to the operation execution layer. The HostConfig's `createInstance` callback is one option; a separate `ExecutionEngine` class is another.
4. **Should the reactive engine own the call graph?** Currently the call graph (from call-graph.md) and the reactive engine (from this doc) are separate concepts. But at runtime, every `<Operation>` in a template becomes a call graph node. Should the reactive engine populate the call graph as a side effect?
## References
- ujsx HostConfig: `@alkdev/ujsx/docs/architecture/host-config.md`
- ujsx reconciler research: `@alkdev/ujsx/docs/research/reconciler/05-flowgraph-host-configs.md`
- Workflow templates: [workflow-templates.md](workflow-templates.md)
- Reactive execution: [reactive-execution.md](reactive-execution.md)
- Schema: [schema.md](schema.md)


@@ -0,0 +1,187 @@
---
status: draft
last_updated: 2026-05-19
---
# Operation Graph (Static)
The static operation graph built from `OperationSpec`s at startup. Nodes represent operations, edges represent type compatibility between output and input schemas.
## Overview
The operation graph is built once at startup from the `OperationRegistry`. It answers structural questions about the operation space:
- **Type compatibility**: Can operation A's output feed into operation B's input?
- **Cycle detection**: Are there circular operation dependencies?
- **Reachability**: What operations are reachable from a given starting point?
- **Template validation**: Is a proposed call sequence structurally valid?
The operation graph is **immutable after construction**. Operations don't appear or disappear at runtime — they're registered once and the graph is built from the registry. If the registry changes, the graph is rebuilt from scratch.
## Construction
### fromSpecs()
```typescript
static fromSpecs(specs: OperationSpec[]): FlowGraph
```
The primary construction path. Takes an array of `OperationSpec` objects (from `OperationRegistry.getAll()`) and builds a directed graph where:
1. **Nodes** — one per operation, key = `namespace.name`, attributes = `OperationNodeAttrs`
2. **Typed edges** — added between operations where the output schema of the source is compatible with the input schema of the target
```typescript
const graph = FlowGraph.fromSpecs(registry.getAll());
```
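Edge construction calls the `typeCompat` analysis function for each ordered (source, target) pair. An edge is added whenever compatibility can be determined: with `compatible: true` when the source's output schema fits the target's input schema, and with `compatible: false` when it does not (see "Why compatible: false edges?" below). Pairs whose compatibility cannot be determined get no edge.
Edge construction performs O(n²) `typeCompat` comparisons, and the edge count is O(n²) in the worst case (every pair yields an edge). For realistic registries (10-50 operations) that is at most 50 × 49 = 2,450 comparisons, which is expected to be negligible at startup. If the registry grows large, edge construction can be deferred to query time.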
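The pairwise construction can be sketched as a double loop. The sketch below follows the design described under "Why compatible: false edges?" (incompatible pairs get an edge, unresolvable pairs get none). It is an assumption-laden simplification: `SpecSketch` replaces TypeBox schemas with plain type names, and `compareTypesSketch` is a stand-in for `typeCompat`, not its algorithm.

```typescript
// Hypothetical, trimmed-down spec; real OperationSpecs carry TypeBox schemas.
interface SpecSketch {
  namespace: string;
  name: string;
  inputType: string;  // stand-in for inputSchema
  outputType: string; // stand-in for outputSchema
}

interface TypedEdgeSketch {
  source: string;
  target: string;
  compatible: boolean;
}

// Stand-in for typeCompat: null means "cannot be determined".
function compareTypesSketch(outputType: string, inputType: string): boolean | null {
  if (outputType === "unknown" || inputType === "unknown") return null;
  return outputType === inputType;
}

function buildTypedEdges(specs: SpecSketch[]): TypedEdgeSketch[] {
  const edges: TypedEdgeSketch[] = [];
  for (const source of specs) {
    for (const target of specs) {
      if (source === target) continue; // no self-loops
      const verdict = compareTypesSketch(source.outputType, target.inputType);
      if (verdict === null) continue; // unresolvable: no edge at all
      edges.push({
        source: `${source.namespace}.${source.name}`,
        target: `${target.namespace}.${target.name}`,
        compatible: verdict,
      });
    }
  }
  return edges;
}

const specEdges = buildTypedEdges([
  { namespace: "task", name: "classify", inputType: "text", outputType: "label" },
  { namespace: "task", name: "enrich", inputType: "label", outputType: "record" },
  { namespace: "task", name: "audit", inputType: "unknown", outputType: "text" },
]);
// 6 ordered pairs are compared; 2 involve an unknown input and produce no edge.
```

Of the four remaining edges, two are compatible (`task.classify` → `task.enrich`, `task.audit` → `task.classify`) and two are recorded as incompatible.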
### fromJSON()
```typescript
static fromJSON(data: OperationGraphSerialized): FlowGraph
```
Deserialize from graphology native JSON format. Validates against `OperationGraphSerialized` schema using `Value.Check()`. Throws `InvalidInputError` on validation failure.
Round-trip: `fromSpecs()` → `export()` → `fromJSON()` is lossless.
### Incremental construction
```typescript
const graph = new FlowGraph();
graph.addOperation(spec);
graph.addTypedEdge("task.classify", "task.enrich", { compatible: true, detail: "output → input" });
```
`addOperation` adds a node. `addTypedEdge` adds a type-compatibility edge. Both throw on duplicates (matching taskgraph's behavior).
## Node Attributes
See [schema.md](schema.md#OperationNodeAttrs) for the full schema definition. Key fields:
| Field | Purpose |
|-------|---------|
| `name` | Operation name (e.g., `"classify"`) |
| `namespace` | Namespace (e.g., `"task"`) |
| `type` | `"query" \| "mutation" \| "subscription"` |
| `inputSchema` | TypeBox schema for input — used by type-compatibility analysis |
| `outputSchema` | TypeBox schema for output — used by type-compatibility analysis |
The node key is `${namespace}.${name}`, matching the `operationId` format.
## Edges
Edges represent type compatibility. The edge direction is:
```
source → target (source's output is compatible with target's input)
```
Following graphology convention: `graph.inNeighbors("task.enrich")` returns operations that can feed into `enrich`. `graph.outNeighbors("task.classify")` returns operations that `classify` can feed into.
This direction matches the data flow: classify produces output that enrich can consume. It also matches taskgraph's `prerequisite → dependent` convention.
### Edge attributes
```typescript
{
edgeType: "typed",
compatible: true,
detail: "classify.output → enrich.input"
}
```
- `edgeType` — always `"typed"` for operation graph edges
- `compatible` — whether the source's output schema is compatible with the target's input schema
- `detail` — optional human-readable description of the compatibility relationship
### Why compatible: false edges?
The operation graph includes **incompatible** edges (compatible: false) alongside compatible ones. This is intentional:
- **Diagnostic value** — showing all potential connections, both valid and invalid, helps developers understand the operation space.
- **Template authoring** — when building a workflow template, seeing that A → B is incompatible (and why) is more useful than seeing no edge at all.
- **Type mismatch prevention** — incompatible edges make it clear where type conversions would be needed.
The `typeCompat` analysis function determines compatibility. Edges where compatibility cannot be determined (e.g., `inputSchema` is `Unknown`) are not added at all — there's no "unknown compatibility" edge.
## Validation
The operation graph validates:
1. **Cycle detection** — throws `CircularDependencyError` if any cycle exists. Unlike taskgraph (which allows cycles and detects them via `hasCycles()`), flowgraph enforces acyclicity at construction time. A cycle in the operation graph means an operation's output feeds back into its own input, which is a design error.
2. **Dangling references** — edges that reference operations not in the graph are structural errors. `addTypedEdge` throws `OperationNotFoundError` if either endpoint doesn't exist.
3. **Isolated operations** — `validateGraph()` reports nodes that have no incoming or outgoing edges (operations that are not connected to the type flow graph).
## Queries
The operation graph supports the same query functions as taskgraph, delegated to graphology-dag:
| Query | Method | Returns |
|-------|--------|---------|
| Topological order | `topologicalOrder()` | `string[]` of node keys in prerequisite→dependent order |
| Has cycles | `hasCycles()` | `boolean` (should always be false if construction validated) |
| Find cycles | `findCycles()` | `string[][]` of cycle paths |
| Ancestors | `ancestors(nodeId)` | `string[]` of all nodes reachable via incoming edges |
| Descendants | `descendants(nodeId)` | `string[]` of all nodes reachable via outgoing edges |
| Predecessors | `predecessors(nodeId)` | `string[]` of direct incoming neighbors |
| Successors | `successors(nodeId)` | `string[]` of direct outgoing neighbors |
| Reachable from | `reachableFrom(nodeIds)` | `Set<string>` of all nodes reachable from the given start nodes |
These are thin wrappers around graphology and graphology-dag functions, following the same pattern as taskgraph.
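For readers unfamiliar with what the topological-order query computes, here is a self-contained sketch using Kahn's algorithm on a plain successor map. It is not how flowgraph implements the query (that is delegated to graphology-dag). `topologicalOrderSketch` is a hypothetical helper, and it throws a plain `Error` where flowgraph would throw `CircularDependencyError`.

```typescript
function topologicalOrderSketch(successorsOf: Map<string, string[]>): string[] {
  // Count incoming edges for every node that appears as a key or as a target.
  const inDegree = new Map<string, number>();
  successorsOf.forEach((targets, key) => {
    if (!inDegree.has(key)) inDegree.set(key, 0);
    for (const target of targets) inDegree.set(target, (inDegree.get(target) ?? 0) + 1);
  });

  // Start with nodes that have no prerequisites.
  const queue = Array.from(inDegree.keys()).filter(key => inDegree.get(key) === 0);
  const order: string[] = [];
  while (queue.length > 0) {
    const key = queue.shift() as string;
    order.push(key);
    for (const target of successorsOf.get(key) ?? []) {
      const remaining = (inDegree.get(target) ?? 0) - 1;
      inDegree.set(target, remaining);
      if (remaining === 0) queue.push(target);
    }
  }

  // Nodes left with a positive in-degree are part of (or downstream of) a cycle.
  if (order.length !== inDegree.size) {
    throw new Error("cycle detected: no topological order exists");
  }
  return order;
}

const queryOrder = topologicalOrderSketch(
  new Map<string, string[]>([
    ["task.classify", ["task.enrich", "task.tag"]],
    ["task.enrich", ["task.publish"]],
    ["task.tag", ["task.publish"]],
    ["task.publish", []],
  ]),
); // ["task.classify", "task.enrich", "task.tag", "task.publish"]
```

Every prerequisite appears before its dependents, which is the `prerequisite→dependent` order named in the table.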
## Type Compatibility Analysis
The `typeCompat` function compares two TypeBox schemas and returns a compatibility result:
```typescript
function typeCompat(
outputSchema: TSchema,
inputSchema: TSchema,
): { compatible: boolean; detail?: string }
```
### Compatibility rules
The analysis is **structural**, not **semantic**:
1. **Exact match** — `outputSchema` is identical to `inputSchema` → compatible
2. **Subtype match** — `outputSchema` is a subtype of `inputSchema` → compatible (e.g., output has extra fields beyond what input requires)
3. **Unknown passthrough** — if either schema is `Type.Unknown()`, compatibility is unknown → no edge added (not incompatible, just unresolvable)
4. **Incompatible** — structural mismatch (e.g., output is `string`, input requires `number`) → edge added with `compatible: false`
See [analysis.md](analysis.md) for the full type-compatibility algorithm.
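The four rules can be illustrated on a tiny schema model. This is a sketch under simplifying assumptions, not the algorithm in analysis.md: `SchemaSketch` is a hypothetical stand-in for TypeBox `TSchema`, only primitives and flat or nested objects are modeled, and only fields the input marks as required are checked.

```typescript
// Hypothetical, tiny schema model; real schemas are TypeBox TSchema objects.
type SchemaSketch =
  | { kind: "unknown" }
  | { kind: "string" | "number" | "boolean" }
  | { kind: "object"; properties: Record<string, SchemaSketch>; required: string[] };

type CompatVerdict =
  | { outcome: "compatible" }
  | { outcome: "incompatible"; detail: string }
  | { outcome: "unknown" };

function typeCompatSketch(output: SchemaSketch, input: SchemaSketch, path = "$"): CompatVerdict {
  // Rule 3: unknown on either side is unresolvable, not incompatible.
  if (output.kind === "unknown" || input.kind === "unknown") return { outcome: "unknown" };
  // Rules 1 and 4 for primitives (and primitive vs object): kinds must match.
  if (input.kind !== "object" || output.kind !== "object") {
    return output.kind === input.kind
      ? { outcome: "compatible" }
      : { outcome: "incompatible", detail: `${path}: ${output.kind} is not ${input.kind}` };
  }
  // Rule 2: every field the input requires must exist in the output and be compatible.
  // Extra output fields are ignored.
  for (const field of input.required) {
    const provided = output.properties[field];
    const expected = input.properties[field];
    if (!provided) return { outcome: "incompatible", detail: `${path}.${field}: missing in output` };
    if (!expected) continue; // required but undeclared: nothing to compare against
    const verdict = typeCompatSketch(provided, expected, `${path}.${field}`);
    if (verdict.outcome !== "compatible") return verdict;
  }
  return { outcome: "compatible" };
}

const classifyOutput: SchemaSketch = {
  kind: "object",
  properties: { id: { kind: "string" }, score: { kind: "number" } },
  required: ["id", "score"],
};
const enrichInput: SchemaSketch = {
  kind: "object",
  properties: { id: { kind: "string" } },
  required: ["id"],
};

const subtypeVerdict = typeCompatSketch(classifyOutput, enrichInput);             // compatible (extra field `score`)
const reverseVerdict = typeCompatSketch(enrichInput, classifyOutput);             // incompatible: $.score missing
const unknownVerdict = typeCompatSketch({ kind: "unknown" }, enrichInput);        // unknown, so no edge
const mismatchVerdict = typeCompatSketch({ kind: "string" }, { kind: "number" }); // incompatible
```

Keeping "unknown" as a third outcome, rather than folding it into `compatible: false`, is what lets the graph skip the edge instead of recording a misleading mismatch.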
## Constraints
- **Immutable after construction** — the operation graph is not mutated after `fromSpecs()` builds it. If the registry changes, rebuild the graph.
- **DAG-only** — cycles are rejected at construction time. The operation graph must be a valid DAG.
- **No parallel edges** — at most one edge per (source, target) pair. If A's output is compatible with B's input at multiple JSON paths, that's recorded in `detail`, not as multiple edges.
- **No self-loops** — an operation cannot depend on its own output. Self-referential operations (e.g., recursive subscriptions) are modeled differently (see [call-graph.md](call-graph.md)).
- **Edge direction is data flow** — `A → B` means A produces data that B consumes. `inNeighbors(B)` returns B's dependencies (here, A), and `outNeighbors(A)` returns A's dependents (here, B). This matches taskgraph's convention.
- **Operation nodes use `namespace.name` as keys** — this matches the call protocol's `operationId` format and ensures uniqueness within a registry.
## Open Questions
1. **Should `fromSpecs()` add ALL possible edges or only compatible ones?** The current design adds both compatible and incompatible edges. An alternative is to only add compatible edges, with a separate `potentialEdges()` query that computes incompatible connections on demand. Pro: smaller graph. Con: loses diagnostic information.
2. **How to handle version conflicts?** If two versions of the same operation exist in the registry, should they be separate nodes (`task.classify@1.0.0` vs `task.classify@2.0.0`) or should the latest version win? The current design uses `namespace.name` (no version) as the node key, meaning only one version per operation can exist in the graph.
3. **Should subscription operations be treated differently?** A subscription produces a stream, not a single output. Its `outputSchema` describes a single stream element, but the data flow semantics are different from query/mutation. Should the type compatibility check account for this?
4. **How granular should type compatibility be?** The current `detail` field is a string. A more structured approach would be `{ compatible: boolean, mismatchPaths: string[] }` listing the specific JSON paths that don't match. This adds complexity but improves diagnostics.
## References
- Schema: [schema.md](schema.md) — `OperationNodeAttrs`, `TypedEdgeAttrs`, `CallStatus`, `EdgeType`
- Type compatibility: [analysis.md](analysis.md)
- Call graph: [call-graph.md](call-graph.md)
- Operation types: `@alkdev/operations/src/types.ts`
- Taskgraph construction: `@alkdev/taskgraph_ts/src/graph/construction.ts`
- Graphology DAG: `graphology-dag` package


@@ -0,0 +1,279 @@
---
status: draft
last_updated: 2026-05-19
---
# Reactive Execution
Signal-driven status propagation, computed preconditions, and abort cascading for workflow template execution.
## Overview
The reactive execution layer bridges workflow template structure (DAG) to runtime behavior (call execution). It uses `@preact/signals-core` (via ujsx's reactive layer) to create a signal-backed execution model where:
- Each `<Operation>` node gets a `signal<NodeStatus>` tracking its lifecycle state
- Preconditions are `computed<boolean>` values that automatically resolve when upstream dependencies complete
- Abort cascades propagate through the signal graph — setting one node to `"aborted"` automatically prevents downstream nodes from starting
This layer does NOT execute operations directly. It provides reactive state that the hub coordinator reads and writes. The coordinator calls `registry.execute()` when a node's preconditions are met, and updates the node's status signal when the call completes.
## ReactiveRoot for Workflows
```typescript
class WorkflowReactiveRoot {
private statusMap: Map<string, Signal<NodeStatus>>;
private preconditions: Map<string, Computed<boolean>>;
private graph: DirectedGraph;
private abortMap: Map<string, () => void>;
constructor(graph: DirectedGraph) {
this.graph = graph;
this.statusMap = new Map();
this.preconditions = new Map();
this.abortMap = new Map();
this.initializeSignals();
}
}
```
`WorkflowReactiveRoot` wraps the reactive state for an entire workflow execution. It takes the structural DAG (from the GraphologyHost) and creates reactive state for each operation node.
### initializeSignals()
```typescript
private initializeSignals(): void {
for (const node of this.graph.nodes()) {
const attrs = this.graph.getNodeAttributes(node);
if (attrs.category !== "operation") continue; // Skip structural nodes (already flattened)
const status = signal<NodeStatus>("idle");
const preconditions = computed(() => {
const predecessors = this.graph.inNeighbors(node);
return predecessors.every(pred => {
const predStatus = this.statusMap.get(pred);
return predStatus && predStatus.value === "completed";
});
});
this.statusMap.set(node, status);
this.preconditions.set(node, preconditions);
this.abortMap.set(node, () => this.cascadeAbort(node));
}
}
```
For each operation node in the DAG:
1. Create a `signal<NodeStatus>` starting at `"idle"`
2. Create a `computed<boolean>` that's `true` when all predecessor nodes have status `"completed"`
3. Register an abort function that cascades to all descendants
### Status lifecycle
The signal-based status lifecycle mirrors `CallStatus` with workflow-specific additions:
```
idle → waiting → ready → running → completed
                                 → failed
               → aborted         → aborted
```
| Status | Meaning | Signal trigger |
|--------|---------|---------------|
| `idle` | Node just created, no parent completion yet | Initial state |
| `waiting` | At least one predecessor is running, none have completed | Any predecessor status change |
| `ready` | All predecessors completed (preconditions met) | `computed` resolves to `true` |
| `running` | Call executing | Hub sets `status.value = "running"` |
| `completed` | Call succeeded | Hub sets `status.value = "completed"` |
| `failed` | Call failed | Hub sets `status.value = "failed"` |
| `aborted` | Call cancelled (or parent cancelled) | Hub or cascade sets `status.value = "aborted"` |
| `skipped` | Conditional branch not taken | Conditional evaluation sets this |
The hub coordinator reads the `ready` state (via `preconditions`) and triggers execution. When the call completes, the hub writes the new status to the signal. The signal propagates to all downstream `computed` values automatically.
## Computed Preconditions
The core innovation of reactive execution: each node's "can I start?" question is a `computed` signal that automatically resolves based on upstream states.
```typescript
const preconditions = computed(() => {
const predecessors = graph.inNeighbors(node);
return predecessors.every(pred => statusMap.get(pred)!.value === "completed");
});
```
This means:
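- The predecessor list is read from the DAG on every evaluation, so a changed DAG is picked up the next time the computed re-runs (a graph mutation alone does not trigger re-evaluation, because the graph is not a signal)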
- A predecessor completing automatically re-evaluates all dependent preconditions
- An aborted predecessor prevents all dependents from becoming `ready`
- No manual event wiring or callback chains
### Sequential preconditions
In a sequential group (A → B → C):
- A's preconditions: `true` (no predecessors, or root-level)
- B's preconditions: `A.status === "completed"`
- C's preconditions: `B.status === "completed"`
When A completes → B's preconditions become true → hub starts B → B completes → C's preconditions become true → hub starts C. All without manual event wiring.
### Parallel preconditions
In a parallel group (A starts B and C simultaneously):
- B's preconditions: `A.status === "completed"` (same as any sequential dependency)
- C's preconditions: `A.status === "completed"` (shared predecessor)
Both B and C become `ready` at the same time, and the hub starts them in parallel.
### Join preconditions
When a node depends on multiple predecessors (e.g., D depends on both B and C completing):
- D's preconditions: `B.status === "completed" && C.status === "completed"`
D only becomes `ready` when all predecessors complete. This is the "join" in fork-join parallelism.
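The three cases can be traced on the fork-join example with plain maps. The sketch below deliberately uses no signal library: `promoteReady()` is a hypothetical stand-in for the automatic re-evaluation that `computed` provides, and `completeNode()` stands in for the hub coordinator finishing a call.

```typescript
type SketchNodeStatus = "idle" | "ready" | "running" | "completed" | "failed" | "aborted";

// Fork-join example from the text: A fans out to B and C, D joins them.
const predecessorsOf = new Map<string, string[]>([
  ["A", []],
  ["B", ["A"]],
  ["C", ["A"]],
  ["D", ["B", "C"]],
]);

const nodeStatuses = new Map<string, SketchNodeStatus>();
predecessorsOf.forEach((_preds, key) => nodeStatuses.set(key, "idle"));

// Equivalent of the computed precondition: true when every predecessor is completed.
function preconditionsMet(key: string): boolean {
  return (predecessorsOf.get(key) ?? []).every(pred => nodeStatuses.get(pred) === "completed");
}

// Promote idle nodes whose preconditions hold; returns the newly ready keys.
function promoteReady(): string[] {
  const promoted: string[] = [];
  nodeStatuses.forEach((current, key) => {
    if (current === "idle" && preconditionsMet(key)) {
      nodeStatuses.set(key, "ready");
      promoted.push(key);
    }
  });
  return promoted;
}

// Stand-in for the hub coordinator reporting a finished call.
function completeNode(key: string): string[] {
  nodeStatuses.set(key, "completed");
  return promoteReady();
}

const firstWave = promoteReady();     // ["A"]: no predecessors
const secondWave = completeNode("A"); // ["B", "C"]: both become ready together
const afterB = completeNode("B");     // []: D still waits for C
const afterC = completeNode("C");     // ["D"]: the join is satisfied
```

With signals, the explicit `promoteReady()` calls disappear: writing a status re-evaluates exactly the preconditions that read it.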
## Abort Cascade
Abort cascading is signal-driven. When a node is aborted:
```typescript
cascadeAbort(nodeId: string): void {
const status = this.statusMap.get(nodeId);
if (status && !isTerminal(status.value)) {
status.value = "aborted";
// Cascade to all descendants
for (const desc of this.graph.descendants(nodeId)) {
const descStatus = this.statusMap.get(desc);
if (descStatus && !isTerminal(descStatus.value)) {
descStatus.value = "aborted";
}
}
}
}
```
This sets the status of the aborted node and all of its descendants to `"aborted"`. The `computed` preconditions of these nodes automatically re-evaluate — but since aborted nodes never become "completed", their dependents will never become "ready".
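The same cascade can be sketched on plain maps, which also shows one way to compute descendants with only an "outgoing neighbors" lookup. This is an illustration, not the `WorkflowReactiveRoot` code: `cascadeAbortSketch` is hypothetical, and the set of terminal statuses is an assumption, since `isTerminal` is not defined in this section.

```typescript
type CascadeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "failed" | "aborted" | "skipped";

// Assumption: these four statuses are the terminal ones.
const terminalStatuses: ReadonlySet<CascadeStatus> = new Set<CascadeStatus>([
  "completed", "failed", "aborted", "skipped",
]);

// Aborts `startKey` and every non-terminal descendant; returns the keys that changed.
function cascadeAbortSketch(
  startKey: string,
  successorsOf: Map<string, string[]>,
  statuses: Map<string, CascadeStatus>,
): string[] {
  const startStatus = statuses.get(startKey);
  // Mirror the guard above: do nothing if the start node is unknown or already terminal.
  if (startStatus === undefined || terminalStatuses.has(startStatus)) return [];

  const aborted: string[] = [];
  const seen = new Set<string>();
  const stack = [startKey];
  while (stack.length > 0) {
    const key = stack.pop() as string;
    if (seen.has(key)) continue; // diamonds reach the same node twice
    seen.add(key);
    const current = statuses.get(key);
    if (current !== undefined && !terminalStatuses.has(current)) {
      statuses.set(key, "aborted");
      aborted.push(key);
    }
    for (const next of successorsOf.get(key) ?? []) stack.push(next);
  }
  return aborted;
}

const cascadeSuccessors = new Map<string, string[]>([
  ["A", ["B", "C"]],
  ["B", ["D"]],
  ["C", ["D"]],
  ["D", []],
]);
const cascadeStatuses = new Map<string, CascadeStatus>([
  ["A", "completed"],
  ["B", "running"],
  ["C", "completed"],
  ["D", "idle"],
]);

const abortedKeys = cascadeAbortSketch("B", cascadeSuccessors, cascadeStatuses); // ["B", "D"]
```

`C` keeps its `completed` status: terminal nodes are never rewritten, and only nodes downstream of `B` are touched.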
### Interaction with call protocol abort
There are two abort mechanisms:
1. **Signal cascade** (this layer) — sets `status.value = "aborted"` for the node and all descendants. This is immediate and graph-based.
2. **Call protocol abort** (operations layer) — `PendingRequestMap.abort(requestId)` propagates `call.aborted` events through the pub/sub layer. This is network-aware and handles remote calls.
The hub coordinator should invoke both:
```typescript
// When aborting a call:
workflowRoot.cascadeAbort(nodeId); // Signal cascade
prm.abort(requestId); // Protocol cascade
```
The signal cascade is for local state (the reactive graph). The protocol cascade is for remote state (the running calls). They're complementary — the protocol cascade may take time to propagate, but the signal cascade is instant.
## NodeStatus vs CallStatus
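`NodeStatus` shares four states with `CallStatus` (`running`, `completed`, `failed`, `aborted`) and adds workflow-specific states that have no call protocol equivalent. The `pending` call state has no `NodeStatus` counterpart: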
| NodeStatus | Meaning | CallStatus equivalent |
|-----------|---------|----------------------|
| `idle` | Not started, no preconditions evaluated | None (call doesn't exist yet) |
| `waiting` | Preconditions not met (upstream still running) | None |
| `ready` | Preconditions met, eligible to start | None |
| `running` | Call in progress | `running` |
| `completed` | Call succeeded | `completed` |
| `failed` | Call failed | `failed` |
| `aborted` | Call cancelled | `aborted` |
| `skipped` | Conditional branch not taken | None |
The hub coordinator maps between these:
```typescript
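// NodeStatus → coordinator action (what to do with the node's call)
function nodeStatusToCallAction(status: NodeStatus): "start" | "skip" | "abort" | "none" {
  switch (status) {
    case "ready": return "start";
    case "skipped": return "skip";
    case "aborted": return "abort";
    default: return "none";
  }
}

// CallStatus → NodeStatus (when call event arrives)
function callStatusToNodeStatus(callStatus: CallStatus): NodeStatus {
  // `pending` has no NodeStatus counterpart. Assumption: a call that has been
  // requested but not yet dispatched is reported as `running`, because the
  // coordinator sets the node to `running` before issuing the call.
  if (callStatus === "pending") return "running";
  // running, completed, failed, aborted map directly
  return callStatus;
}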
```
## Effect-Driven Execution
The hub coordinator uses `effect()` to react to precondition changes:
```typescript
for (const [nodeId, preconditions] of workflowRoot.preconditions) {
  effect(() => {
    if (preconditions.value) {
      const status = workflowRoot.statusMap.get(nodeId)!;
      if (status.value === "idle" || status.value === "waiting") {
        // All preconditions met — start the call
        status.value = "running";
        const operationId = graph.getNodeAttributes(nodeId).name;
        prm.call(operationId, getInput(nodeId), { parentRequestId: parentCallId })
          .then(result => { status.value = "completed"; })
          .catch(error => { status.value = "failed"; });
      }
    }
  });
}
```
Each node gets an `effect()` that watches its `preconditions` computed value. When preconditions resolve to `true` and the node is in a startable state (`idle` or `waiting`), the effect starts the call via `PendingRequestMap.call()`.
The call's promise resolution updates the node's status signal, which triggers downstream preconditions to re-evaluate, which triggers their effects, and so on.
### Effect disposal
Each `effect()` returns a dispose function. The `WorkflowReactiveRoot` tracks all effect disposers and provides a `dispose()` method that tears down the entire reactive graph:
```typescript
dispose(): void {
  for (const disposer of this.effectDisposers) {
    disposer();
  }
  this.statusMap.clear();
  this.preconditions.clear();
  this.abortMap.clear();
}
```
This is critical for cleaning up when a workflow completes, fails, or is aborted. Without disposal, signal subscriptions leak.
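The disposer-tracking pattern can be sketched in isolation. This is an illustrative, dependency-free version: `DisposerBag` is a hypothetical name, and plain callbacks stand in for the functions that `effect()` returns. It also makes `dispose()` idempotent, which is an added assumption rather than documented behavior.

```typescript
type Disposer = () => void;

class DisposerBag {
  private disposers: Disposer[] = [];

  // Register the function returned by effect() (or any cleanup callback).
  track(disposer: Disposer): void {
    this.disposers.push(disposer);
  }

  // Run every disposer once, then forget them so a second call is a no-op.
  dispose(): void {
    const pending = this.disposers;
    this.disposers = [];
    for (const d of pending) d();
  }
}

let active = 0;
const bag = new DisposerBag();
for (let i = 0; i < 3; i++) {
  active++;                       // stands in for creating an effect
  bag.track(() => { active--; }); // stands in for the effect's disposer
}
bag.dispose();
bag.dispose(); // nothing left to run
```

Swapping the array out before iterating keeps the teardown safe even if a disposer registers further cleanup while it runs.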
## Constraints
- **Signals are in-memory** — `WorkflowReactiveRoot` state is not persisted. If the hub restarts, the reactive state is lost and must be reconstructed from call protocol events + template re-render.
- **Effect-driven execution is optional** — the hub coordinator can choose not to use `effect()` and instead poll `preconditions.value` manually. The reactive layer provides the building blocks; the coordinator decides how to use them.
- **Abort is immediate in signals, delayed in protocol** — setting `status.value = "aborted"` is instant, but `prm.abort(requestId)` takes time to propagate through the call protocol. The hub should invoke both.
- **`skipped` is set by conditional evaluation, not by the call protocol** — a `Conditional` node whose test evaluates to `false` sets its child's status to `skipped`, which prevents the call from ever starting.
- **`NodeStatus` and `CallStatus` share execution states** — `running`, `completed`, `failed`, `aborted` map directly (the last three are terminal). `idle`, `waiting`, `ready`, `skipped` are workflow-specific additions.
## Open Questions
1. **Should preconditions support OR logic?** Currently all predecessors must complete. An `anyOf` predicate would allow "start this node as soon as any predecessor completes." This would require an edge attribute or node-level configuration.
2. **How are retries handled at the signal level?** If an operation fails and should be retried, the status would go `running → failed → ready → running`. This requires resetting the status back to `ready`, which the current state machine doesn't support (failed is terminal). A `retried` status or a separate `retryCount` attribute may be needed.
3. **Should the reactive graph support partial re-rendering?** If a template changes mid-execution (e.g., a step is added), the ujsx reconciler could diff the old and new trees. But the ReactiveHost only supports mount rendering. Re-rendering would require reconciler support.
4. **How does `maxConcurrency` interact with preconditions?** A `Parallel` group with `maxConcurrency: 3` should only start 3 nodes at a time, even though all preconditions are met. This is a scheduling concern, not a structural one. The reactive layer could implement this as a semaphore signal, or it could be the coordinator's responsibility.
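For question 4, one coordinator-side option can be sketched as a pure selection step. This is only an illustration of the scheduling idea, not a proposed design: `selectStartable` is a hypothetical helper, and it assumes the coordinator already knows which nodes are `ready` and how many calls in the group are running.

```typescript
// Pick which ready nodes may start now without exceeding the limit.
// An undefined limit means "unlimited", matching the Parallel default.
function selectStartable(
  readyIds: readonly string[],
  runningCount: number,
  maxConcurrency?: number,
): string[] {
  if (maxConcurrency === undefined) return readyIds.slice();
  const freeSlots = Math.max(0, maxConcurrency - runningCount);
  return readyIds.slice(0, freeSlots);
}

// Four nodes have their preconditions met, one call is already running,
// and the group allows three concurrent calls: two more may start.
const startNow = selectStartable(["a", "b", "c", "d"], 1, 3);
```

The remaining nodes stay `ready` and are reconsidered whenever a running call finishes, which keeps the DAG itself free of scheduling state.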
## References
- ujsx reactive layer: `@alkdev/ujsx/docs/architecture/reactive-layer.md`
- ujsx reconciler: `@alkdev/ujsx/docs/architecture/reconciler.md`
- Schema: [schema.md](schema.md) — `NodeStatus`, `CallStatus`
- Host configs: [host-configs.md](host-configs.md)
- Workflow templates: [workflow-templates.md](workflow-templates.md)
- Call protocol: `@alkdev/alkhub_ts/docs/architecture/call-graph.md`

327
docs/architecture/schema.md Normal file

@@ -0,0 +1,327 @@
---
status: draft
last_updated: 2026-05-19
---
# Schema
TypeBox Module, TypeScript types, categorical enums, node/edge attribute schemas, and the design decisions behind them.
## Overview
Flowgraph's schema layer follows the same pattern as taskgraph: TypeBox schemas are the single source of truth for both runtime validation and TypeScript type derivation. All data shapes are defined as TypeBox schemas, with `Static<typeof Schema>` producing the corresponding TypeScript types.
The schema is organized around two distinct graph types (operation graph and call graph) plus shared enums and the serialized graph factory.
## Design Decision: TypeBox as Single Source of Truth
Identical to taskgraph's approach:
1. **Static TypeScript types** via `Static<typeof Schema>` — every schema constant has a corresponding `type X = Static<typeof X>` alias
2. **Runtime validation** via `Value.Check()` / `Value.Errors()` — structured field-level error reporting
3. **JSON Schema export** for consumers that need schema-based contracts
No separate `interface` or `type` definitions outside of `Static<typeof>`. No Zod.
### Naming Convention
| Category | Convention | Example |
|----------|-----------|---------|
| Enum schema constant | PascalCase + `Enum` suffix | `CallStatusEnum` |
| Enum type alias | PascalCase, no suffix | `type CallStatus = Static<typeof CallStatusEnum>` |
| Object schema constant | PascalCase, no suffix | `OperationNodeAttrs`, `CallNodeAttrs` |
| Object type alias | Same name as schema constant | `type OperationNodeAttrs = Static<typeof OperationNodeAttrs>` |
| Serialized graph schemas | PascalCase + `Serialized` suffix | `OperationGraphSerialized`, `CallGraphSerialized` |
| Factory function | PascalCase | `SerializedGraph(NodeAttrs, EdgeAttrs, GraphAttrs)` |
### Nullable Helper
Same `Nullable` helper as taskgraph:
```typescript
const Nullable = <T extends TSchema>(schema: T) => Type.Union([schema, Type.Null()]);
```
Used for fields that can be explicitly set to `null` (distinct from absent).
## Enums
### CallStatus
The lifecycle states of a call invocation. Matches the call graph storage schema in `@alkdev/alkhub_ts/docs/architecture/storage/call-graph.md`.
```typescript
const CallStatusEnum = Type.Union([
Type.Literal("pending"), // Call requested, not yet dispatched
Type.Literal("running"), // Handler executing
Type.Literal("completed"), // Successfully finished (call.responded + call.completed)
Type.Literal("failed"), // Handler threw or call.error emitted
Type.Literal("aborted"), // call.aborted emitted (parent cancelled, deadline exceeded)
]);
type CallStatus = Static<typeof CallStatusEnum>;
```
Transitions:
```
pending → running → completed
→ failed
→ aborted
```
- `pending → running`: Handler starts executing
- `running → completed`: `call.responded` + `call.completed` received
- `running → failed`: `call.error` received
- `pending → aborted`: `call.aborted` received before handler started (e.g., deadline exceeded)
- `running → aborted`: `call.aborted` received during execution (parent cancelled)
`completed`, `failed`, and `aborted` are terminal states — no further transitions.
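The transition list above can be encoded as a lookup table. The sketch below is dependency-free and uses a plain string union instead of the TypeBox enum; `TRANSITIONS`, `canTransition`, and `isTerminal` are illustrative names.

```typescript
type CallStatus = "pending" | "running" | "completed" | "failed" | "aborted";

// Allowed next states for each state, taken from the list above.
const TRANSITIONS: Record<CallStatus, readonly CallStatus[]> = {
  pending: ["running", "aborted"],
  running: ["completed", "failed", "aborted"],
  completed: [],
  failed: [],
  aborted: [],
};

function canTransition(from: CallStatus, to: CallStatus): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// A state is terminal when it has no outgoing transitions.
function isTerminal(status: CallStatus): boolean {
  return TRANSITIONS[status].length === 0;
}
```

A consumer applying call events could use `canTransition` to reject out-of-order updates, for example a late `running` event arriving after `aborted`.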
### NodeStatus
A derived status for workflow template nodes. While `CallStatus` tracks individual call invocations, `NodeStatus` reflects the template-level view:
```typescript
const NodeStatusEnum = Type.Union([
Type.Literal("idle"), // Not started, no call yet
Type.Literal("waiting"), // Preconditions not met, waiting for upstream
Type.Literal("ready"), // Preconditions met, eligible to start
Type.Literal("running"), // Call in progress
Type.Literal("completed"), // Call completed successfully
Type.Literal("failed"), // Call failed
Type.Literal("skipped"), // Conditional branch not taken
Type.Literal("aborted"), // Call aborted
]);
type NodeStatus = Static<typeof NodeStatusEnum>;
```
`NodeStatus` shares `running`, `completed`, `failed`, and `aborted` with `CallStatus` and adds workflow-specific states (`idle`, `waiting`, `ready`, `skipped`) that have no call protocol equivalent. It has no `pending` state: a node that is `waiting` has no call yet because its preconditions haven't been met.
### EdgeType
The type of edge in a flowgraph. Matches the call graph storage schema's `edgeType` column:
```typescript
const EdgeTypeEnum = Type.Union([
Type.Literal("triggered"), // Source caused target to execute (parent→child in call hierarchy)
Type.Literal("depends_on"), // Source requires target's result before it can complete (data dependency)
Type.Literal("typed"), // Type compatibility edge (output schema A → input schema B)
Type.Literal("sequential"), // Sequential flow edge (template: <Sequential> ordering)
Type.Literal("conditional"), // Conditional flow edge (template: <Conditional> branch)
]);
type EdgeType = Static<typeof EdgeTypeEnum>;
```
The first two (`triggered`, `depends_on`) match the call graph storage schema. `typed` exists only in the operation graph. The last two (`sequential`, `conditional`) are template-specific and only exist in workflow template DAGs.
| Edge Type | Graph Type | Meaning |
|-----------|------------|---------|
| `triggered` | Call graph | Parent call triggered child call. Corresponds to `parentRequestId`. |
| `depends_on` | Call graph | Data dependency — source needs target's result. |
| `typed` | Operation graph | Type compatibility — source's output schema is compatible with target's input schema. |
| `sequential` | Template → DAG | Sequential ordering from `<Sequential>` component. |
| `conditional` | Template → DAG | Conditional branch from `<Conditional>` component. |
## Node Attribute Schemas
### OperationNodeAttrs
Attributes for nodes in the operation graph. Derived from `OperationSpec` but carrying only graph-relevant data:
```typescript
const OperationNodeAttrs = Type.Object({
name: Type.String(), // Operation name (e.g., "classify")
namespace: Type.String(), // Namespace (e.g., "task")
version: Type.String(), // Semantic version
type: OperationTypeEnum, // "query" | "mutation" | "subscription"
inputSchema: Type.Unknown(), // JSON Schema for input (TypeBox schema)
outputSchema: Type.Unknown(), // JSON Schema for output (TypeBox schema)
description: Type.Optional(Type.String()),
tags: Type.Optional(Type.Array(Type.String())),
});
type OperationNodeAttrs = Static<typeof OperationNodeAttrs>;
```
The node key is `namespace.name` (e.g., `"task.classify"`), matching the `operationId` format used in the call protocol. The full `OperationSpec` is not stored on the graph — `accessControl`, `errorSchemas`, and `handler` belong to the registry, not the graph.
**Why `inputSchema` and `outputSchema` on the graph**: These are needed for type-compatibility edge construction. An edge from operation A to operation B exists if A's `outputSchema` is compatible with B's `inputSchema`. Storing the schemas on the node avoids a round-trip to the registry during graph queries.
### CallNodeAttrs
Attributes for nodes in the call graph. Populated from call events:
```typescript
const CallNodeAttrs = Type.Object({
  requestId: Type.String(),                        // Unique call identifier
  operationId: Type.String(),                      // namespace.name of the operation
  status: CallStatusEnum,                          // Current call status
  parentRequestId: Type.Optional(Type.String()),   // Parent call (absent = top-level)
  input: Type.Unknown(),                           // Call input
  output: Type.Optional(Type.Unknown()),           // Call output (on completion)
  error: Type.Optional(Type.Object({               // Call error (on failure)
    code: Type.String(),
    message: Type.String(),
    details: Type.Optional(Type.Unknown()),
  })),
  identity: Type.Optional(Type.Object({            // Caller identity
    id: Type.String(),
    scopes: Type.Array(Type.String()),
    resources: Type.Optional(Type.Record(Type.String(), Type.Array(Type.String()))),
  })),
  startedAt: Type.Optional(Type.String()),         // ISO timestamp when call was dispatched
  completedAt: Type.Optional(Type.String()),       // ISO timestamp when call completed/failed/aborted
});
type CallNodeAttrs = Static<typeof CallNodeAttrs>;
```
The node key is `requestId`. This matches the call protocol's correlation mechanism and the call graph storage schema.
**Why ISO timestamps as strings**: Following the call protocol, timestamps are ISO 8601 strings rather than numbers. This makes the graph directly serializable to JSON without transformation and aligns with the storage schema's `timestamp with tz` columns.
**Why `parentRequestId` is both a node attribute and an edge**: Following the same denormalization pattern as the storage schema — `parentRequestId` on the node enables fast point lookups ("who is this call's parent?"), while `triggered` edges enable traversal queries. Both are kept consistent by construction.
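"Consistent by construction" can be read as: `triggered` edges are derived from `parentRequestId`, never written independently. The sketch below illustrates that reading with plain objects. `deriveTriggeredEdges` and `childrenOf` are hypothetical helpers, and the node shape is reduced to the two fields involved.

```typescript
interface CallNode { requestId: string; parentRequestId?: string; }
interface TriggeredEdge { key: string; source: string; target: string; }

// One `triggered` edge per node that has a parent, pointing parent → child.
function deriveTriggeredEdges(nodes: readonly CallNode[]): TriggeredEdge[] {
  const edges: TriggeredEdge[] = [];
  for (const node of nodes) {
    if (node.parentRequestId === undefined) continue; // top-level call
    edges.push({
      key: `${node.parentRequestId}->${node.requestId}`,
      source: node.parentRequestId, // parent
      target: node.requestId,       // child
    });
  }
  return edges;
}

// Traversal query answered from edges: all direct children of a call.
function childrenOf(requestId: string, edges: readonly TriggeredEdge[]): string[] {
  return edges.filter((e) => e.source === requestId).map((e) => e.target);
}

const nodes: CallNode[] = [
  { requestId: "req_root" },
  { requestId: "req_a", parentRequestId: "req_root" },
  { requestId: "req_b", parentRequestId: "req_root" },
  { requestId: "req_c", parentRequestId: "req_a" },
];
const edges = deriveTriggeredEdges(nodes);
const rootChildren = childrenOf("req_root", edges);
```

The point lookup ("who is my parent?") reads the node attribute directly, while the reverse question ("who are my children?") needs the edges.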
## Edge Attribute Schemas
### TypedEdgeAttrs (Operation Graph)
```typescript
const TypedEdgeAttrs = Type.Object({
compatible: Type.Boolean({ description: "Whether the source output schema is compatible with the target input schema" }),
compatibilityDetail: Type.Optional(Type.String({ description: "Human-readable description of compatibility or mismatch" })),
});
type TypedEdgeAttrs = Static<typeof TypedEdgeAttrs>;
```
Type-compatibility edges carry a boolean `compatible` flag and optional detail. This allows the operation graph to include both compatible edges (green paths) and incompatible edges (red paths) for diagnostics.
### TriggeredEdgeAttrs (Call Graph)
```typescript
const TriggeredEdgeAttrs = Type.Object({});
type TriggeredEdgeAttrs = Static<typeof TriggeredEdgeAttrs>;
```
Parent-child edges in the call graph carry no additional attributes — the relationship is fully captured by the edge direction and type. This may be extended in the future with `latency` or `metadata` attributes.
### DependencyEdgeAttrs (Call Graph)
```typescript
const DependencyEdgeAttrs = Type.Object({});
type DependencyEdgeAttrs = Static<typeof DependencyEdgeAttrs>;
```
Data dependency edges also carry no additional attributes. Future extensions may include `dataPath` (which field of the output feeds which field of the input).
### TemplateEdgeAttrs (Workflow Templates)
```typescript
const TemplateEdgeAttrs = Type.Object({
edgeType: EdgeTypeEnum, // "sequential" or "conditional"
condition: Type.Optional(Type.Unknown()), // For conditional edges: the condition function or expression
});
type TemplateEdgeAttrs = Static<typeof TemplateEdgeAttrs>;
```
Template edges carry an `edgeType` to distinguish sequential flow from conditional branching. Conditional edges optionally store a `condition` that determines whether the target node executes.
## SerializedGraph Factory
Following the taskgraph pattern, a generic factory for graphology native JSON format:
```typescript
const SerializedGraph = <N extends TSchema, E extends TSchema, G extends TSchema>(
  NodeAttrs: N,
  EdgeAttrs: E,
  GraphAttrs: G,
) =>
  Type.Object({
    attributes: GraphAttrs,
    options: Type.Object({
      type: Type.Literal("directed"),
      multi: Type.Literal(false),
      allowSelfLoops: Type.Literal(false),
    }),
    nodes: Type.Array(Type.Object({
      key: Type.String(),
      attributes: NodeAttrs,
    })),
    edges: Type.Array(Type.Object({
      key: Type.String(),
      source: Type.String(),
      target: Type.String(),
      attributes: EdgeAttrs,
    })),
  });
```
**`multi: false`**: Flowgraph edges are unique per (source, target) pair. No parallel edges between the same node pair, even with different edge types (see Edge Key Convention).
**`allowSelfLoops: false`**: Operations and calls cannot be their own prerequisite. Self-loops are rejected at construction time.
**`type: "directed"`**: All edges have direction. `A → B` means A is prerequisite/source, B is dependent/target. This matches the graphology convention and the call graph storage schema.
### FlowGraphSerialized variants
Two specialized serialization types, one for each graph type:
```typescript
const OperationGraphSerialized = SerializedGraph(
OperationNodeAttrs,
TypedEdgeAttrs,
Type.Object({}), // No graph-level attributes
);
const CallGraphSerialized = SerializedGraph(
CallNodeAttrs,
Type.Union([TriggeredEdgeAttrs, DependencyEdgeAttrs]),
Type.Object({}), // No graph-level attributes
);
```
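For call graphs, edges can be either `triggered` or `depends_on`. Both attribute schemas are currently empty objects, so the union by itself cannot tell the two apart; distinguishing them requires an explicit `edgeType` attribute (see Open Question 1).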
## Edge Key Convention
Following taskgraph's ADR-006, edge keys are deterministic:
```
${source}->${target}
```
For the operation graph, this means keys like `"task.classify->task.enrich"`. For the call graph, keys like `"req_abc123->req_def456"`.
Since `multi: false`, there can be at most one edge between any (source, target) pair. When multiple edge types are needed between the same pair (e.g., both `triggered` and `depends_on` between two calls), the graph stores a single edge whose `edgeType` attribute captures the semantic relationship. This is a simplification from the storage schema, which allows multiple edges per (source, target, edgeType) triple — the in-memory graph collapses these into a single edge per (source, target) pair.
This is acceptable because:
- Operation graphs only have `typed` edges, so no multi-edge concern.
- Call graphs rarely have both `triggered` and `depends_on` between the same pair.
- Template DAGs only have `sequential` or `conditional` edges.
If multi-edge support becomes necessary, the `multi: false` constraint can be relaxed and a composite key format (`${source}->${target}:${edgeType}`) adopted.
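The difference between the two key formats shows up as soon as two edge types connect the same pair. The sketch below uses plain `Map`s to illustrate the key collision only; it does not model graphology's behavior, and `edgeKey` and `compositeEdgeKey` are illustrative helper names.

```typescript
type EdgeType = "triggered" | "depends_on" | "typed" | "sequential" | "conditional";

// Current convention: one key per (source, target) pair.
const edgeKey = (source: string, target: string): string => `${source}->${target}`;

// Composite alternative: one key per (source, target, edgeType) triple.
const compositeEdgeKey = (source: string, target: string, edgeType: EdgeType): string =>
  `${source}->${target}:${edgeType}`;

// Two relationships between the same pair of calls.
const relations: Array<[string, string, EdgeType]> = [
  ["req_abc123", "req_def456", "triggered"],
  ["req_abc123", "req_def456", "depends_on"],
];

const simple = new Map<string, EdgeType>();
const composite = new Map<string, EdgeType>();
for (const [source, target, edgeType] of relations) {
  simple.set(edgeKey(source, target), edgeType);                       // same key both times
  composite.set(compositeEdgeKey(source, target, edgeType), edgeType); // distinct keys
}
```

With the simple key the two relationships collapse into one entry, which is the simplification the section describes; the composite key keeps both.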
## Constraints
- **TypeBox schemas are the single source of truth** — no hand-written `interface` or `type` definitions for data shapes. All types are derived via `Static<typeof Schema>`.
- **Edge keys are deterministic** — `${source}->${target}` format, following ADR-006 in taskgraph.
- **No parallel edges** — `multi: false` in graphology. At most one edge per (source, target) pair.
- **No self-loops** — `allowSelfLoops: false`. An operation cannot be its own prerequisite.
- **ISO timestamp strings** — Call graph timestamps are ISO 8601 strings, matching the storage schema.
- **Nullable categorical fields** — Following taskgraph's convention, `Type.Optional(Nullable(Enum))` for optional fields that can be explicitly null.
- **`inputSchema` and `outputSchema` on operation nodes** — These are TypeBox schemas (unknown at the graph level), stored for type-compatibility checking. The graph does not validate these schemas — it stores them and makes them available for the `typeCompat` analysis function.
- **No schema version field** — Following taskgraph, the serialized format does not include a version field. It follows graphology's native JSON format and is not a persistence format with backward-compatibility guarantees. Consumers that need persistence wrap it in their own versioned envelope.
## Open Questions
1. **Should `edgeType` be a required field on ALL edges, or only on call graph and template edges?** Operation graph edges are always `typed`, so requiring an explicit `edgeType` attribute there is redundant. Options: (a) make `edgeType` required on all edges, (b) have separate edge attribute types per graph mode, (c) use a union type on edge attributes and let the consumer tag the edge.
2. **Should `CallNodeAttrs.identity` be a `Type.Record` or the structured `Identity` type from operations?** The structured type matches the call protocol and storage schema but creates a dependency on `@alkdev/operations` types. Options: (a) import `Identity` from operations (peer dep), (b) duplicate the type in flowgraph, (c) use `Type.Record` and accept weaker typing.
3. **How should conditional edge conditions be represented?** `condition: Type.Unknown()` is maximally flexible but provides no type safety. Options: (a) `Type.Unknown()` with documentation, (b) `Type.Union([Type.String(), Type.Function(...)])` for expression strings and function references, (c) a dedicated `ConditionSchema` that flowgraph defines.
## References
- Taskgraph schema patterns: `@alkdev/taskgraph_ts/docs/architecture/schemas.md`
- Call graph storage schema: `@alkdev/alkhub_ts/docs/architecture/storage/call-graph.md`
- Call event types: `@alkdev/operations/src/call.ts`
- Operation types: `@alkdev/operations/src/types.ts`
- ujsx schema: `@alkdev/ujsx/docs/architecture/schema.md`


@@ -0,0 +1,288 @@
---
status: draft
last_updated: 2026-05-19
---
# Workflow Templates
ujsx-based workflow definition — compose operations as declarative template trees, render them to DAGs or reactive execution engines.
## Overview
Workflow templates are ujsx trees that define reusable call patterns. Instead of hardcoding operation sequences in the hub coordinator, templates provide a declarative, composable way to define "what should happen in what order":
```typescript
import { h, createRoot } from "@alkdev/ujsx";
import { Operation, Sequential, Parallel, Conditional } from "@alkdev/flowgraph/component";
import { GraphologyHostConfig } from "@alkdev/flowgraph/host/graphology";
const sddPipeline = h(Sequential, {},
  h(Operation, { name: "architect" }),
  h(Operation, { name: "architecture-reviewer" }),
  h(Conditional, {
    test: (results) => results["architecture-reviewer"].approved
  },
    h(Sequential, {},
      h(Operation, { name: "decomposer" }),
      h(Operation, { name: "coordinator" }),
      h(Operation, { name: "specialist" }),
    ),
  ),
  h(Operation, { name: "code-reviewer" }),
);
```
The template is a `UNode` tree — a plain data structure that can be:
- **Serialized** to JSON for storage and transmission
- **Validated** against the operation graph (are all referenced operations registered? are there type mismatches?)
- **Rendered to a graphology DAG** via the `GraphologyHostConfig` for structural analysis
- **Rendered to a reactive execution engine** via the `ReactiveHostConfig` for runtime status tracking
This is the same `UNode` tree that ujsx defines, with flowgraph-specific component functions (`Operation`, `Sequential`, `Parallel`, `Conditional`) that produce `UElement` nodes with workflow-specific props and meaning.
## Why ujsx as Template IR
The alternative to ujsx would be a custom template format — an array of step objects with type discriminators:
```typescript
// Alternative: custom template format
const template = [
  { type: "operation", name: "architect" },
  { type: "sequential", steps: [
    { type: "operation", name: "decomposer" },
    { type: "operation", name: "coordinator" },
  ]},
];
```
ujsx is better for several reasons:
1. **Composability** — Nested elements are the natural representation of hierarchical workflows. `Sequential({ children: [...] })` is cleaner than a recursive type discriminator.
2. **No new format** — ujsx already defines the tree structure, type guards, reactive layer, and reconciler. We don't need to design, implement, and maintain a template parser/serializer.
3. **Host target switching** — The `HostConfig` pattern means the same template renders to different targets without template-specific logic. Graphology for analysis, reactive engine for runtime. No template→IR→DAG compilation step.
4. **Incremental updates** — The ujsx reconciler enables incremental template updates. Add a step, remove a step, reorder steps — the reconciler computes the diff and applies minimal mutations to the DAG, rather than rebuilding the entire graph.
5. **Reactive props** — `@preact/signals-core` enables signal-driven prop updates. An `Operation` node's `name` could be a signal, enabling dynamic workflow modification at runtime.
See [ADR-001](decisions/001-ujsx-as-template-ir.md) for the full decision record.
## Component Definitions
### `<Operation>`
Represents a single operation invocation in the workflow:
```typescript
const Operation: UComponent<{
name: string; // Operation name (namespace.name or just name if namespace is inherited)
input?: unknown; // Static input for the call
retries?: number; // Number of retries on failure (default: 0)
timeout?: number; // Deadline in ms from call start
}>;
```
`Operation` produces a `UElement` with `type: "operation"` and the given props. When rendered to a graphology DAG, it creates a node with the operation's attributes. When rendered to the reactive engine, it creates a `signal<NodeStatus>` that tracks the call's lifecycle.
### `<Sequential>`
Represents sequential execution — children run in order, each child waits for the previous to complete:
```typescript
const Sequential: UComponent<{
id?: string; // Optional identifier for the sequence
}>;
```
`Sequential` children are rendered in order. In the graphology DAG, each child has a `sequential` edge to the next child. In the reactive engine, each child's precondition is "previous child is `completed`".
### `<Parallel>`
Represents parallel execution — all children start simultaneously:
```typescript
const Parallel: UComponent<{
id?: string; // Optional identifier for the parallel group
maxConcurrency?: number; // Maximum concurrent children (default: unlimited)
}>;
```
`Parallel` children have no ordering edges between them. In the reactive engine, all children's preconditions are "parent's prerequisites are met", so they all become `ready` at the same time.
`maxConcurrency` is a runtime hint, not a structural constraint. The DAG doesn't encode it — it's a scheduling hint for the execution engine.
### `<Conditional>`
Represents conditional branching — children only execute if the test passes:
```typescript
const Conditional: UComponent<{
test: ((results: Record<string, CallResult>) => boolean) | string;
// If string: operation name whose result to check
// If function: receives results of previous steps, returns boolean
else?: UNode; // Alternative branch if test fails
}>;
```
When rendered to a graphology DAG, `Conditional` creates an edge with `edgeType: "conditional"` and `condition` attribute. When rendered to the reactive engine, the condition is evaluated as a `computed` that depends on the referenced step's status and output.
If the test evaluates to `false`, the branch is marked `skipped` in `NodeStatus`.
## Template → DAG Conversion
The `GraphologyHostConfig` renders a template to a graphology DAG:
```typescript
import { DirectedGraph } from "graphology";
import { h, createRoot } from "@alkdev/ujsx";
import { Operation, Sequential } from "@alkdev/flowgraph/component";
import { GraphologyHostConfig } from "@alkdev/flowgraph/host/graphology";

const host = new GraphologyHostConfig();
const root = createRoot(host, new DirectedGraph());

const template = h(Sequential, {},
  h(Operation, { name: "architect" }),
  h(Operation, { name: "reviewer" }),
  h(Operation, { name: "decomposer" }),
);

root.render(template);

// Now root.ctx is a DirectedGraph with:
// - nodes: "architect", "reviewer", "decomposer"
// - edges: "architect" → "reviewer" → "decomposer" (sequential)
```
The HostConfig maps ujsx component types to graphology operations:
| UElement type | Graphology operation |
|---------------|---------------------|
| `"operation"` | Add node with `OperationNodeAttrs` |
| `"sequential"` | Add `sequential` edges between consecutive children |
| `"parallel"` | No edges between children (they run concurrently) |
| `"conditional"` | Add `conditional` edge with test attribute |
### Edge creation rules
- **Sequential**: For children C1, C2, ..., Cn, edges C1→C2, C2→C3, ..., C(n-1)→Cn are added with `edgeType: "sequential"`. Each child therefore implicitly depends on its predecessor.
- **Parallel**: No edges between children. All children have the same prerequisites as the parallel group itself.
- **Conditional**: Edge from the conditional node's prerequisite to the first child of the branch, with `edgeType: "conditional"` and `condition` attribute.
- **Nested**: A `Sequential` inside a `Parallel` has its own internal edges. A `Parallel` inside a `Sequential` creates a subgraph where all parallel children share the same predecessor.
### Root node handling
The template's root `URoot` is transparent — its children are mounted directly into the graph. `Sequential` and `Parallel` component functions are also transparent in terms of graph structure — they produce edges between their children, but do not create nodes for themselves.
This means a template like:
```typescript
h(Sequential, {},
  h(Operation, { name: "A" }),
  h(Parallel, {},
    h(Operation, { name: "B" }),
    h(Operation, { name: "C" }),
  ),
  h(Operation, { name: "D" }),
);
```
produces a DAG with nodes A, B, C, D and edges A→B, A→C, B→D, C→D. No "parallel" or "sequential" nodes are created.
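The edge creation rules can be approximated with a small recursive function over plain data. This is a sketch under stated assumptions, not the `GraphologyHostConfig` implementation: `Template`, `Fragment`, and `build` are hypothetical, conditionals are left out, and each group is summarized by its entry nodes (those with no predecessor inside the group) and exit nodes (those with no successor inside the group).

```typescript
type Template =
  | { kind: "operation"; name: string }
  | { kind: "sequential"; children: Template[] }
  | { kind: "parallel"; children: Template[] };

interface Fragment { entries: string[]; exits: string[]; }
type Edge = [string, string];

function build(t: Template, edges: Edge[]): Fragment {
  // An operation is both the entry and the exit of its own fragment.
  if (t.kind === "operation") return { entries: [t.name], exits: [t.name] };

  const parts = t.children.map((child) => build(child, edges));
  if (parts.length === 0) return { entries: [], exits: [] };

  if (t.kind === "parallel") {
    // No edges between children; the group exposes all of their entries and exits.
    const entries: string[] = [];
    const exits: string[] = [];
    for (const p of parts) {
      entries.push(...p.entries);
      exits.push(...p.exits);
    }
    return { entries, exits };
  }

  // Sequential: connect every exit of child i to every entry of child i + 1.
  for (let i = 0; i + 1 < parts.length; i++) {
    for (const from of parts[i].exits) {
      for (const to of parts[i + 1].entries) edges.push([from, to]);
    }
  }
  return { entries: parts[0].entries, exits: parts[parts.length - 1].exits };
}

const op = (name: string): Template => ({ kind: "operation", name });
const template: Template = {
  kind: "sequential",
  children: [op("A"), { kind: "parallel", children: [op("B"), op("C")] }, op("D")],
};

const edges: Edge[] = [];
const fragment = build(template, edges);
// edges now holds A→B, A→C, B→D, C→D, with no node for either group.
```

Because groups only contribute entries and exits, arbitrary nesting (a `Sequential` inside a `Parallel` and vice versa) falls out of the same two rules.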
## Template → Reactive Execution
The `ReactiveHostConfig` renders a template to a reactive execution engine:
```typescript
import { h, createRoot } from "@alkdev/ujsx";
import { Operation, Sequential } from "@alkdev/flowgraph/component";
import { ReactiveHostConfig } from "@alkdev/flowgraph/host/reactive";

// operationRegistry: the operation registry instance, constructed elsewhere
const host = new ReactiveHostConfig(operationRegistry);
const root = createRoot(host, {});

const template = h(Sequential, {},
  h(Operation, { name: "architect" }),
  h(Operation, { name: "reviewer" }),
);

root.render(template);

// Now each operation node has a signal<NodeStatus>:
// - "architect": signal("idle")
// - "reviewer": signal("idle")
// The reviewer's precondition is: architect.status === "completed"
```
See [reactive-execution.md](reactive-execution.md) for the full reactive execution architecture.
## Serialization
Since workflow templates are ujsx `UNode` trees, they are JSON-serializable by design:
```typescript
import { Value } from "@alkdev/typebox/value";
import { h, UJSX } from "@alkdev/ujsx";
import { Operation, Sequential } from "@alkdev/flowgraph/component";

const template = h(Sequential, {},
  h(Operation, { name: "architect" }),
  h(Operation, { name: "reviewer" }),
);

// Serialize
const json = JSON.stringify(template);
// → {"type":"sequential","props":{"name":"sequential"},"children":[...]}

// Deserialize
const parsed = JSON.parse(json);
if (Value.Check(UJSX.Import("UElement"), parsed)) {
  // Valid UElement — can render to any HostConfig
}
```
Note: function-valued props (like `Conditional.test` with a function) are not serializable. For storage, conditional tests must be expressed as strings (operation references) rather than functions. The HostConfig resolves string references to functions at render time.
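The serialization caveat follows from how `JSON.stringify` treats functions: function-valued object properties are omitted. The sketch below demonstrates this with a plain object shaped like the serialized element shown above; `SerializableElement` is an illustrative type, not the ujsx `UElement` schema.

```typescript
interface SerializableElement {
  type: string;
  props: Record<string, unknown>;
  children: SerializableElement[];
}

const withFunctionTest: SerializableElement = {
  type: "conditional",
  props: { test: (results: Record<string, unknown>) => Boolean(results["reviewer"]) },
  children: [],
};

const withStringTest: SerializableElement = {
  type: "conditional",
  props: { test: "reviewer" },
  children: [],
};

const roundTrip = (el: SerializableElement): SerializableElement =>
  JSON.parse(JSON.stringify(el));

const lost = roundTrip(withFunctionTest); // `test` is dropped during stringify
const kept = roundTrip(withStringTest);   // `test` survives as a string
```

The loss is silent: no error is raised, so a stored template with a function test would deserialize into a conditional with no test at all.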
## Validation
A workflow template can be validated against an operation graph before execution:
```typescript
function validateTemplate(
template: UNode,
operationGraph: FlowGraph<OperationNodeAttrs, OperationEdgeAttrs>,
): ValidationError[]
```
Validation checks:
1. **All operation names exist in the registry** — every `<Operation name="X">` must have a matching node in the operation graph
2. **Type compatibility** — sequential operations have type-compatible edges in the operation graph
3. **No cycles** — the rendered DAG has no cycles (inherited from FlowGraph's DAG enforcement)
4. **Reachability from start** — all operations in the template are reachable from the first operation
Validation returns an array of `ValidationError` objects (never throws). See [analysis.md](analysis.md) for the full validation algorithm.
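
As a rough illustration of the "collect errors, never throw" contract, the sketch below implements only the first check (operation names). `TemplateNode`, the `ValidationError` fields, and `checkOperationNames` are hypothetical shapes chosen for this example; the known operation names are passed as a plain set instead of a `FlowGraph`.

```typescript
// Hypothetical shapes; the real UNode and ValidationError types may differ.
interface TemplateNode {
  type: string;
  props: { name?: string };
  children: TemplateNode[];
}

interface ValidationError {
  code: "unknown-operation";
  message: string;
}

// Check 1 only: every operation name must exist among the known operations.
function checkOperationNames(
  template: TemplateNode,
  knownOperations: ReadonlySet<string>,
): ValidationError[] {
  const errors: ValidationError[] = [];
  const visit = (node: TemplateNode): void => {
    if (node.type === "operation") {
      const name = node.props.name ?? "";
      if (!knownOperations.has(name)) {
        errors.push({
          code: "unknown-operation",
          message: `Operation "${name}" is not in the operation graph`,
        });
      }
    }
    node.children.forEach((child) => visit(child));
  };
  visit(template);
  return errors;
}

const leaf = (name: string): TemplateNode => ({
  type: "operation",
  props: { name },
  children: [],
});

const draft: TemplateNode = {
  type: "sequential",
  props: {},
  children: [leaf("architect"), leaf("reviwer")], // deliberate typo
};

const errors = checkOperationNames(draft, new Set(["architect", "reviewer"]));
console.log(errors.map((e) => e.message));
```

Because every problem is accumulated into the returned array, a caller can report all unknown operations in one pass instead of fixing them one exception at a time.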
## Constraints
- **Templates are ujsx trees** — no custom format, no parser, no compiler. Components are `UComponent` functions that produce `UElement` nodes.
- **`Operation` props are workflow metadata** — `name`, `input`, `retries`, `timeout` are NOT passed to the HostConfig's `createInstance`. They're workflow-level configuration that the reactive execution engine uses to configure the call.
- **Function props are not serializable** — `Conditional.test` with a function cannot be round-tripped through JSON. Use string references for stored templates.
- **Sequential ordering is structural, not temporal** — a `Sequential` group declares dependency edges between consecutive children; it does not itself schedule anything. The reactive engine turns those edges into temporal ordering via preconditions, starting each operation only after its predecessors complete.
- **Parallel has no structural edges** — a `Parallel` group produces no DAG edges between its children. The execution engine starts them concurrently when the group's prerequisites are met.
- **Conditional branches are either/or** — a `Conditional` node renders to one branch or the `else` branch. There's no "both" evaluation.
## Open Questions
1. **Should `Sequential` and `Parallel` remain transparent in the graph?** Currently they never produce nodes: `Sequential` contributes only edges, and `Parallel` contributes nothing between its children. An alternative is to create "virtual" grouping nodes (like a "parallel gateway" in BPMN). This would make the graph structure richer but would add complexity.
2. **Should templates support loops?** A `<ForEach>` component could iterate over an array and produce a child for each element. This would enable dynamic workflows where the number of parallel calls isn't known at template definition time.
3. **Should templates support `depends_on` edges explicitly?** Currently dependencies are inferred from structure (sequential implies dependency). An explicit `<DependsOn target="operation-name" />` component would make data dependencies visible in the template without relying on sequential ordering.
4. **How does template instantiation interact with the call protocol?** When a template is instantiated as a call graph, each `<Operation>` becomes a call. But the call protocol's `call.requested` events include `parentRequestId` — who is the parent? The template itself? The hub coordinator? This needs a clear answer.
## References
- ujsx architecture: `@alkdev/ujsx/docs/architecture/`
- ujsx HostConfig: `@alkdev/ujsx/docs/architecture/host-config.md`
- ujsx reactive layer: `@alkdev/ujsx/docs/architecture/reactive-layer.md`
- Host configs: [host-configs.md](host-configs.md)
- Reactive execution: [reactive-execution.md](reactive-execution.md)
- Analysis and validation: [analysis.md](analysis.md)