---
status: draft
last_updated: 2026-05-20
---

# Reactive Execution

Signal-driven status propagation, computed preconditions, and failure propagation for workflow template execution.

## Overview

The reactive execution layer bridges workflow template structure (DAG) to runtime behavior (call execution). It uses `@preact/signals-core` (via ujsx's reactive layer) to create a signal-backed execution model where:

- Each `<Operation>` node gets a `signal<NodeStatus>` tracking its lifecycle state
- Preconditions are `computed<boolean>` values that automatically resolve when upstream dependencies complete
- Failure propagation follows dependency edges: a failed predecessor causes downstream dependents to abort, while independent branches continue running
- Conditionals can serve as error boundaries, catching failures and redirecting to fallback paths

This layer does NOT execute operations directly. It provides reactive state that the hub coordinator reads and writes. The coordinator calls `registry.execute()` when a node's preconditions are met, and updates the node's status signal when the call completes or fails.

## ReactiveRoot for Workflows

```typescript
class WorkflowReactiveRoot {
  // These maps are read by the hub coordinator and ReactiveHostConfig,
  // so they are exposed as readonly rather than private.
  readonly statusMap: Map<string, Signal<NodeStatus>>;
  readonly preconditions: Map<string, Computed<boolean>>;
  readonly blockedByFailure: Map<string, Computed<boolean>>;
  private graph: DirectedGraph;
  private effectDisposers: (() => void)[];

  constructor(graph: DirectedGraph) {
    this.graph = graph;
    this.statusMap = new Map();
    this.preconditions = new Map();
    this.blockedByFailure = new Map();
    this.effectDisposers = [];
    this.initializeSignals();
  }
}
```

`WorkflowReactiveRoot` wraps the reactive state for an entire workflow execution. It takes the structural DAG (from the GraphologyHost) and creates reactive state for each operation node.

### initializeSignals()

```typescript
private initializeSignals(): void {
  for (const node of this.graph.nodes()) {
    const attrs = this.graph.getNodeAttributes(node);
    if (attrs.category !== "operation") continue; // Skip structural nodes (already flattened)

    const status = signal<NodeStatus>("idle");

    const predecessors = this.graph.inNeighbors(node);

    // Preconditions: all predecessors completed or skipped
    const preconditions = computed(() => {
      return predecessors.every(pred => {
        const predStatus = this.statusMap.get(pred);
        // Explicit undefined check keeps the result a strict boolean
        return predStatus !== undefined &&
          (predStatus.value === "completed" || predStatus.value === "skipped");
      });
    });

    // Blocked by failure: any predecessor failed or aborted (uncaught)
    const blockedByFailure = computed(() => {
      return predecessors.some(pred => {
        const predStatus = this.statusMap.get(pred);
        return predStatus !== undefined &&
          (predStatus.value === "failed" || predStatus.value === "aborted");
      });
    });

    this.statusMap.set(node, status);
    this.preconditions.set(node, preconditions);
    this.blockedByFailure.set(node, blockedByFailure);
  }
}
```

For each operation node in the DAG:
1. Create a `signal<NodeStatus>` starting at `"idle"`
2. Create a `computed<boolean>` that's `true` when all predecessor nodes have status `"completed"` or `"skipped"` (a skipped node satisfies its dependents' preconditions)
3. Create a `computed<boolean>` that's `true` when any predecessor has failed or been aborted, which triggers a cascade
4. Register an abort effect that cascades to all descendants (not shown in the snippet above; see Failure Propagation)

### Status lifecycle

The signal-based status lifecycle mirrors `CallStatus` with workflow-specific additions:

```
              ┌────────┐
              │  idle  │──────────────────────┐
              └───┬────┘                      │
                  │ predecessor               │ no predecessors
                  │ starts running            │ (root node)
                  ▼                           │
              ┌─────────┐                     │
   ┌──────────│ waiting │                     │
   │          └───┬─────┘                     │
   │ branch       │ all predecessors          │
   │ not taken    │ completed/skipped         │
   ▼              ▼                           │
┌─────────┐   ┌───────┐                       │
│ skipped │   │ ready │◄──────────────────────┘
└─────────┘   └───┬───┘
                  │ hub starts call
                  ▼
              ┌─────────┐
      ┌───────│ running │───────┐
      │       └───┬─────┘       │
      │ call      │ call        │ call
      │ completed │ failed      │ aborted
      ▼           ▼             ▼
┌───────────┐ ┌────────┐   ┌─────────┐
│ completed │ │ failed │   │ aborted │
└───────────┘ └───┬────┘   └─────────┘
                  │ (uncaught)
                  ▼
      cascades to all downstream dependents via blockedByFailure
      (idle/waiting dependents transition to aborted)

completed, failed, aborted, and skipped are all terminal states
```

Full transition rules:

```
idle      → waiting     (predecessor starts running)
idle      → ready       (no predecessors: root node)
idle      → aborted     (predecessor failed and failure is uncaught)
waiting   → ready       (all predecessors completed or skipped)
waiting   → aborted     (predecessor failed and failure is uncaught)
ready     → running     (hub starts the call)
running   → completed   (call succeeded)
running   → failed      (call threw an error)
running   → aborted     (call cancelled externally)
failed    → [terminal]  (no further transitions)
aborted   → [terminal]  (no further transitions)
skipped   → [terminal]  (conditional branch not taken)
completed → [terminal]  (no further transitions)
```

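The transition rules above can also be encoded as data, which would let a coordinator reject illegal status writes. The sketch below is a hypothetical helper, not part of the documented API: `TRANSITIONS`, `canTransition`, and `isTerminal` are illustrative names. It admits `idle → aborted` because the abort effect checks for `idle` as well as `waiting`, and it assumes that Conditional evaluation may mark a node `skipped` from either `idle` or `waiting`.

```typescript
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "failed" | "aborted" | "skipped";

// Allowed successor states per status (entries into "skipped" are an assumption).
const TRANSITIONS: Record<NodeStatus, NodeStatus[]> = {
  idle: ["waiting", "ready", "aborted", "skipped"],
  waiting: ["ready", "aborted", "skipped"],
  ready: ["running"],
  running: ["completed", "failed", "aborted"],
  completed: [],
  failed: [],
  aborted: [],
  skipped: [],
};

// True when the rules allow moving from `from` to `to`.
function canTransition(from: NodeStatus, to: NodeStatus): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// A status is terminal when it has no outgoing transitions.
function isTerminal(status: NodeStatus): boolean {
  return TRANSITIONS[status].length === 0;
}
```

A coordinator could call `canTransition(current, next)` before every `status.value = next` write and treat a `false` result as a programming error.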
| Status | Meaning | Signal trigger |
|--------|---------|----------------|
| `idle` | Node just created, no predecessor activity yet | Initial state |
| `waiting` | At least one predecessor has started, but not all have completed or been skipped | Any predecessor status change |
| `ready` | All predecessors completed or skipped (preconditions met) | `computed` resolves to `true` |
| `running` | Call executing | Hub sets `status.value = "running"` |
| `completed` | Call succeeded | Hub sets `status.value = "completed"` |
| `failed` | Call failed (uncaught error) | Hub sets `status.value = "failed"` |
| `aborted` | Call cancelled, or cascaded from failed predecessor | Hub or cascade sets `status.value = "aborted"` |
| `skipped` | Conditional branch not taken | Conditional evaluation sets this |

The key distinction between `failed` and `aborted`:
- **`failed`** means the operation itself threw an error. The node is the *source* of the failure.
- **`aborted`** means the operation was cancelled or a predecessor failed. The node is a *victim* of failure propagation.

## Computed Preconditions

The core innovation of reactive execution: each node's "can I start?" question is a `computed` signal that automatically resolves based on upstream states.

```typescript
const preconditions = computed(() => {
  const predecessors = graph.inNeighbors(node);
  return predecessors.every(pred => {
    const status = statusMap.get(pred)!.value;
    return status === "completed" || status === "skipped";
  });
});
```

A node's preconditions are met when **all predecessors have reached a satisfying terminal state** (`completed` or `skipped`). A `failed` or `aborted` predecessor does NOT satisfy preconditions; it prevents the dependent from ever becoming `ready`.

This means:
- Adding a new predecessor automatically includes it in the check (if the DAG changes)
- A predecessor completing automatically re-evaluates all dependent preconditions
- An aborted predecessor prevents dependents from becoming `ready`
- A skipped predecessor satisfies preconditions (the branch was deliberately bypassed, not broken)
- No manual event wiring or callback chains

### Sequential preconditions

In a sequential group (A → B → C):

- A's preconditions: `true` (no predecessors, or root-level)
- B's preconditions: `A.status === "completed"`
- C's preconditions: `B.status === "completed"`

When A completes → B's preconditions become true → hub starts B → B completes → C's preconditions become true → hub starts C. All without manual event wiring.

### Parallel preconditions

In a parallel group (A starts B and C simultaneously):

- B's preconditions: `A.status === "completed"` (same as any sequential dependency)
- C's preconditions: `A.status === "completed"` (shared predecessor)

Both B and C become `ready` at the same time, and the hub starts them in parallel.

### Join preconditions

When a node depends on multiple predecessors (e.g., D depends on both B and C completing):

- D's preconditions: `B.status === "completed" && C.status === "completed"`

D only becomes `ready` when all predecessors complete. This is the "join" in fork-join parallelism.

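The sequential, parallel, and join cases all reduce to the same predicate over a node's predecessors. The following is a minimal worked example on plain data, assuming a hypothetical `preconditionsMet` function that stands in for the `computed`; it uses no signals and only illustrates the fork-join evaluation for A → {B, C} → D.

```typescript
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "failed" | "aborted" | "skipped";

// Same predicate as the computed precondition, evaluated on a plain record.
function preconditionsMet(
  predecessors: string[],
  statusOf: Record<string, NodeStatus>,
): boolean {
  return predecessors.every((pred) => {
    const s = statusOf[pred];
    return s === "completed" || s === "skipped";
  });
}

// Fork-join DAG: A → B, A → C, B → D, C → D
const inNeighbors: Record<string, string[]> = {
  A: [],
  B: ["A"],
  C: ["A"],
  D: ["B", "C"],
};

const statusOf: Record<string, NodeStatus> = {
  A: "completed",
  B: "completed",
  C: "running",
  D: "waiting",
};

// A root node has no predecessors, so its preconditions hold vacuously.
const rootReady = preconditionsMet(inNeighbors.A, statusOf);

// D is not ready while C is still running...
const joinBefore = preconditionsMet(inNeighbors.D, statusOf);

// ...and becomes ready once the last predecessor completes.
statusOf.C = "completed";
const joinAfter = preconditionsMet(inNeighbors.D, statusOf);
```

In the reactive version the re-evaluation after `C` completes happens automatically, because the `computed` reads each predecessor's `status.value`.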
## Failure Propagation

Failure propagation is the mechanism by which a failed or aborted node causes its downstream dependents to abort. The key design principle: **failure follows dependency edges, not structural scope**.

This means:
- In a `Sequential` group, failure propagates forward through the chain (B depends on A, so if A fails, B aborts)
- In a `Parallel` group, sibling branches are independent: a failure in branch A does NOT affect branch B, because there are no dependency edges between them
- A node that depends on multiple predecessors (a join) aborts only when it's impossible for its preconditions to ever be met

### The preconditions-failure duality

Each node has two complementary reactive computations:

1. **`preconditions`** (`computed<boolean>`): true when all predecessors are `completed` or `skipped`. Node can start.
2. **`blockedByFailure`** (`computed<boolean>`): true when any predecessor is `failed` or `aborted` and the failure is uncaught (not handled by a `Conditional`).

```typescript
const preconditions = computed(() => {
  const predecessors = graph.inNeighbors(node);
  return predecessors.every(pred => {
    const status = statusMap.get(pred)!.value;
    return status === "completed" || status === "skipped";
  });
});

const blockedByFailure = computed(() => {
  const predecessors = graph.inNeighbors(node);
  return predecessors.some(pred => {
    const status = statusMap.get(pred)!.value;
    return status === "failed" || status === "aborted";
  });
});
```

When `blockedByFailure` becomes `true` and the node hasn't started (`idle` or `waiting`), the node transitions to `aborted`. This happens via an `effect()`:

```typescript
effect(() => {
  if (blockedByFailure.value && (status.value === "idle" || status.value === "waiting")) {
    status.value = "aborted";
  }
});
```

This cascade is automatic and reactive: when a predecessor fails, all downstream `blockedByFailure` computations re-evaluate, and their effects fire, aborting any waiting dependents.

### Sequential failure propagation

```
A (failed) → B (aborted) → C (aborted)
```

When A fails, B's `blockedByFailure` becomes true. B transitions from `waiting` to `aborted`. C's `blockedByFailure` then becomes true (B is now `aborted`). C transitions to `aborted`. The entire downstream chain aborts.

### Parallel independence

```
                ┌── B (completed) ──┐
A (completed) ──┤                   ├── D (aborted)
                └── C (failed) ─────┘
```

When C fails:
- C's downstream dependents see `blockedByFailure = true`
- B is unaffected because it's on an independent branch
- D depends on both B and C. D's `preconditions` will never be met (C is `failed`, not `completed`). D's `blockedByFailure` is true (C is `failed`). D transitions to `aborted`.

But crucially, this is because D *depends on* C, not because they share a structural scope:

```
                ┌── B (completed) ──┐
A (completed) ──┤                   │   (no edge from C to E)
                ├── C (failed) ─────┘
                └── E (completed)
```

E has no dependency on C. E continues running regardless of C's failure. **Failure follows dependency edges, not structural boundaries.**

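The cascade can be traced by hand on plain data. The sketch below is a non-reactive stand-in for the `blockedByFailure` effects: a hypothetical `cascadeAborts` function repeats the abort rule until nothing changes. The graph is an assumption made for illustration: E depends only on A, and an extra node F depends on D to show the multi-hop case.

```typescript
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "failed" | "aborted" | "skipped";

// Apply the abort rule to a fixed point: a node that has not started
// and has a failed or aborted predecessor becomes aborted.
function cascadeAborts(
  inNeighbors: Record<string, string[]>,
  statusOf: Record<string, NodeStatus>,
): Record<string, NodeStatus> {
  const next: Record<string, NodeStatus> = { ...statusOf };
  let changed = true;
  while (changed) {
    changed = false;
    for (const node of Object.keys(inNeighbors)) {
      const notStarted = next[node] === "idle" || next[node] === "waiting";
      const blocked = inNeighbors[node].some(
        (pred) => next[pred] === "failed" || next[pred] === "aborted",
      );
      if (notStarted && blocked) {
        next[node] = "aborted";
        changed = true;
      }
    }
  }
  return next;
}

// A → B, A → C, A → E, {B, C} → D, D → F
const edges: Record<string, string[]> = {
  A: [], B: ["A"], C: ["A"], E: ["A"], D: ["B", "C"], F: ["D"],
};

const result = cascadeAborts(edges, {
  A: "completed", B: "completed", C: "failed",
  E: "running", D: "waiting", F: "idle",
});
```

Here `result.D` and `result.F` end up `aborted` because a dependency path leads back to C, while `result.E` stays `running` and `result.B` stays `completed`.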
### Join semantics

When a node depends on multiple predecessors (fork-join):

```
                ┌── B (completed) ──┐
A (completed) ──┤                   ├── D (aborted)
                └── C (failed) ─────┘
```

D's `preconditions` requires both B and C to be completed/skipped. Since C is `failed`, D's preconditions can never be met. D transitions to `aborted`.

The alternative would be "partial success", where D starts with B's output even though C failed. This is NOT supported by the precondition model. If partial execution is needed, the template author should use a `Conditional` to handle the failure case explicitly.

### Conditional as error boundary

A `Conditional` can catch a failure and redirect to a fallback path:

```typescript
h(Sequential, {},
  h(Operation, { name: "fetch-data" }),
  h(Conditional, {
    test: (results) => results["fetch-data"].status !== "failed",
  },
    // then: proceed with data processing
    h(Sequential, {},
      h(Operation, { name: "transform" }),
      h(Operation, { name: "store" }),
    ),
    // else: fallback path
    h(Operation, { name: "notify-error" }),
  ),
)
```

If `fetch-data` fails:
1. The `Conditional`'s `test` function receives the results map including `fetch-data`'s status
2. `test` evaluates to `false` (the operation failed)
3. The `then` branch transitions to `skipped`
4. The `else` branch (`notify-error`) becomes `ready`
5. Downstream nodes after the `Conditional` see the `Conditional` as `completed` (it resolved successfully, just on a different branch)

This makes `Conditional` a **caught error boundary**. The failure is handled: downstream nodes don't see a cascade because the `Conditional` resolved successfully.

Without a `Conditional`, the failure is **uncaught**. It cascades through dependency edges to all dependents, which transition to `aborted`.

### Systemic failure: aborting the entire workflow

For failures that should cancel everything (e.g., provider outage, authentication failure), the hub coordinator can abort the entire `WorkflowReactiveRoot`:

```typescript
workflowRoot.abortAll(); // Sets all non-terminal nodes to "aborted"
```

This is separate from dependency-edge failure propagation. It's for systemic failures where the workflow cannot meaningfully continue regardless of which branches are independent.

### Interaction with call protocol abort

There are three abort mechanisms:

1. **Signal cascade** (this layer): `blockedByFailure` effects transition dependents to `aborted`. This is automatic and follows dependency edges.
2. **Call protocol abort** (operations layer): `PendingRequestMap.abort(requestId)` propagates `call.aborted` events through the pub/sub layer. This is network-aware and handles remote calls.
3. **Full workflow abort**: `workflowRoot.abortAll()` aborts all non-terminal nodes. For systemic failures.

The hub coordinator should invoke the signal-level abort and the protocol abort together:
```typescript
// When aborting a call:
workflowRoot.abortNode(nodeId);     // Signal: transition dependents to aborted
prm.abort(requestId);               // Protocol: cancel the remote call

// When aborting entire workflow:
workflowRoot.abortAll();            // Signal: abort everything
prm.abortAll(pendingRequestIds);    // Protocol: cancel all pending calls
```

Signal cascades are instant. Protocol aborts may take time to propagate. They're complementary: the signal cascade ensures local state is immediately consistent, while the protocol abort ensures remote state eventually catches up.

## NodeStatus vs CallStatus

`NodeStatus` extends `CallStatus` with workflow-specific states that have no call protocol equivalent:

| NodeStatus | Meaning | CallStatus equivalent |
|------------|---------|-----------------------|
| `idle` | Not started, no preconditions evaluated | None (call doesn't exist yet) |
| `waiting` | Preconditions not met (upstream still running) | None |
| `ready` | Preconditions met, eligible to start | None |
| `running` | Call in progress | `running` |
| `completed` | Call succeeded | `completed` |
| `failed` | Call failed | `failed` |
| `aborted` | Call cancelled | `aborted` |
| `skipped` | Conditional branch not taken | None |

The hub coordinator maps between these:

```typescript
// NodeStatus → coordinator action (when deciding whether to start a call)
function nodeStatusToCallAction(status: NodeStatus): "start" | "skip" | "abort" | "none" {
  switch (status) {
    case "ready": return "start";
    case "skipped": return "skip";
    case "aborted": return "abort";
    default: return "none";
  }
}

// CallStatus → NodeStatus (when call event arrives)
function callStatusToNodeStatus(callStatus: CallStatus): NodeStatus {
  // Direct mapping for shared states
  return callStatus as NodeStatus;
}
```

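If `NodeStatus` is declared as a union that includes `CallStatus`, the `as` cast becomes unnecessary and the compiler checks the mapping. The sketch below assumes `CallStatus` has exactly the four members listed in the table; the actual definition lives in schema.md and may differ. `nodeStatusToCallStatus` is a hypothetical reverse mapping added for illustration.

```typescript
// Assumed shape of CallStatus, taken from the table above.
type CallStatus = "running" | "completed" | "failed" | "aborted";

// NodeStatus as a superset: every CallStatus is also a NodeStatus.
type NodeStatus = CallStatus | "idle" | "waiting" | "ready" | "skipped";

// Widening from CallStatus to NodeStatus needs no cast.
function callStatusToNodeStatus(callStatus: CallStatus): NodeStatus {
  return callStatus;
}

// Reverse direction: workflow-only states have no call protocol equivalent.
function nodeStatusToCallStatus(status: NodeStatus): CallStatus | null {
  switch (status) {
    case "running":
    case "completed":
    case "failed":
    case "aborted":
      return status; // Narrowed to CallStatus by the switch
    default:
      return null;
  }
}
```

With this declaration, adding a member to `CallStatus` automatically extends `NodeStatus`, and any mapping that forgets the new member shows up at compile time rather than at runtime.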
## Effect-Driven Execution

The hub coordinator uses two `effect()`s per node: one for starting when preconditions are met, and one for aborting when failure propagates:

```typescript
for (const [nodeId, preconditions, blockedByFailure] of workflowRoot.nodes) {
  // Start the call when preconditions are met
  effect(() => {
    if (preconditions.value) {
      const status = workflowRoot.statusMap.get(nodeId)!;
      if (status.value === "idle" || status.value === "waiting") {
        // All preconditions satisfied: start the call
        status.value = "running";
        const operationId = graph.getNodeAttributes(nodeId).name;
        prm.call(operationId, getInput(nodeId), { parentRequestId: parentCallId })
          // Only settle a node that is still running, so that an abort
          // issued while the call was in flight is not overwritten
          .then(() => { if (status.value === "running") status.value = "completed"; })
          .catch(() => { if (status.value === "running") status.value = "failed"; });
      }
    }
  });

  // Abort when a predecessor fails (uncaught failure propagation)
  effect(() => {
    if (blockedByFailure.value) {
      const status = workflowRoot.statusMap.get(nodeId)!;
      if (status.value === "idle" || status.value === "waiting") {
        // A predecessor failed and no Conditional caught it: abort
        status.value = "aborted";
      }
    }
  });
}
```

Both effects are reactive. When a predecessor completes, the `preconditions` computed re-evaluates, potentially triggering the start effect. When a predecessor fails, the `blockedByFailure` computed re-evaluates, potentially triggering the abort effect.

The call's promise resolution updates the node's status signal, which triggers downstream preconditions and failure propagations to re-evaluate, which triggers their effects, and so on.

### Effect disposal

Each `effect()` returns a dispose function. The `WorkflowReactiveRoot` tracks all effect disposers and provides a `dispose()` method that tears down the entire reactive graph:

```typescript
dispose(): void {
  for (const disposer of this.effectDisposers) {
    disposer();
  }
  this.statusMap.clear();
  this.preconditions.clear();
  this.blockedByFailure.clear();
}
```

This is critical for cleaning up when a workflow completes, fails, or is aborted. Without disposal, signal subscriptions leak.

### Full workflow abort

For systemic failures (provider outage, authentication failure), `WorkflowReactiveRoot` provides `abortAll()`:

```typescript
abortAll(): void {
  for (const status of this.statusMap.values()) {
    const current = status.value;
    // Terminal states (completed, failed, aborted, skipped) are left untouched
    if (
      current !== "completed" &&
      current !== "failed" &&
      current !== "aborted" &&
      current !== "skipped"
    ) {
      status.value = "aborted";
    }
  }
  // Effects will fire and clean up any waiting/ready nodes
}
```

This transitions all non-terminal nodes to `aborted`. It's for cases where the entire workflow should stop, regardless of which branches are independent.

## Reactive Error Boundaries

The reactive execution layer has three levels of error handling, each with distinct scope and semantics:

### Level 1: Signal-level errors (per-node)

When a call fails, the hub coordinator sets the node's status to `"failed"`:

```typescript
status.value = "failed";  // Individual node failure
```

This triggers `blockedByFailure` in all downstream dependents, causing them to transition to `"aborted"`. The failure propagates through the signal graph reactively, so no manual error handling is needed.

### Level 2: Conditional error boundaries (branch-level)

A `Conditional` node catches failures and redirects to an alternative branch:

```typescript
h(Conditional, {
  test: (results) => results["fetch-data"].status !== "failed",
},
  // then-branch (happy path)
  h(Operation, { name: "process" }),
  // else-branch (fallback)
  h(Operation, { name: "handle-error" }),
)
```

When the `Conditional`'s `test` function evaluates to `false` (because a predecessor failed), the then-branch transitions to `skipped` and the else-branch becomes `ready`. Downstream nodes after the `Conditional` see it as `completed`, so the failure is contained.

This is the reactive equivalent of a `try/catch` block. Without a `Conditional`, failures cascade uncaught through dependency edges.

### Level 3: Workflow abort (system-level)

For failures that should cancel everything, the hub calls `workflowRoot.abortAll()`:

```typescript
workflowRoot.abortAll();  // All non-terminal nodes → "aborted"
```

This is for system-level failures: provider outage, authentication failure, or any condition where the workflow cannot meaningfully continue regardless of branch independence.

### WorkflowErrorBoundary (coordinator-level)

The hub coordinator wraps the entire reactive execution in a `WorkflowErrorBoundary`, which is a conceptual boundary, not a signal:

```typescript
try {
  // Drive the workflow
  for (const [nodeId, preconditions, blockedByFailure] of workflowRoot.nodes) {
    effect(() => { /* start calls when ready */ });
    effect(() => { /* abort when blocked */ });
  }
} catch (error) {
  // Unhandled reactive error: signal graph inconsistency
  // This shouldn't happen in normal operation
  workflowRoot.abortAll();
  prm.abortAll(pendingRequestIds);
}
```

The `WorkflowErrorBoundary` catches errors that escape the signal graph (e.g., a `computed` that throws, an `effect` that errors). These are catastrophic because the reactive state is inconsistent. The boundary's job is to:
1. Abort all calls via `prm.abortAll()`
2. Set all non-terminal nodes to `"aborted"` via `workflowRoot.abortAll()`
3. Dispose the reactive root
4. Log the error for diagnostics

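The four steps above can be gathered into one handler that the `catch` block delegates to. This is a sketch under stated assumptions: `AbortableRoot`, `AbortableCalls`, and `handleReactiveError` are hypothetical names, and the interfaces describe only the methods this document already mentions (`abortAll`, `dispose`).

```typescript
// Minimal structural types for the two collaborators (assumed shapes).
interface AbortableRoot {
  abortAll(): void;
  dispose(): void;
}

interface AbortableCalls {
  abortAll(requestIds: string[]): void;
}

// Runs the boundary's four steps in the documented order.
function handleReactiveError(
  error: unknown,
  root: AbortableRoot,
  prm: AbortableCalls,
  pendingRequestIds: string[],
  log: (message: string, error: unknown) => void,
): void {
  try {
    prm.abortAll(pendingRequestIds); // 1. Abort all calls
    root.abortAll();                 // 2. Non-terminal nodes become "aborted"
  } finally {
    // Teardown and logging run even if one of the abort steps throws.
    root.dispose();                                      // 3. Dispose the reactive root
    log("workflow aborted after reactive error", error); // 4. Log for diagnostics
  }
}
```

Keeping disposal in `finally` is a design choice of this sketch: it favors releasing subscriptions over preserving state for inspection after a failed abort.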
**Error propagation summary**:

| Error type | Scope | Mechanism | Recovery |
|------------|-------|-----------|----------|
| Call failure | Single node | `status.value = "failed"` | Cascades to dependents via `blockedByFailure` |
| Caught by Conditional | Branch | `Conditional.test` evaluates against failed status | Redirect to else-branch, downstream sees `completed` |
| Uncaught cascade | Downstream chain | `blockedByFailure` effects | Downstream nodes transition to `aborted` |
| System failure | Entire workflow | `abortAll()` | All non-terminal nodes to `aborted` |
| Reactive error | Signal graph | `WorkflowErrorBoundary` catch | Abort everything, dispose, log |

## Constraints

- **Signals are in-memory**: `WorkflowReactiveRoot` state is not persisted. If the hub restarts, the reactive state is lost and must be reconstructed from call protocol events + template re-render.
- **Effect-driven execution is optional**: the hub coordinator can choose not to use `effect()` and instead poll `preconditions.value` and `blockedByFailure.value` manually. The reactive layer provides the building blocks; the coordinator decides how to use them.
- **Failure follows dependency edges, not structural scope**: a failed node causes only its downstream dependents (via DAG edges) to abort. Sibling branches in a `Parallel` group are independent and continue running. This enables partial success: one branch can fail while another completes.
- **Conditionals are error boundaries**: a `Conditional` whose test evaluates against a failed predecessor can redirect to an else branch, catching the failure. Without a `Conditional`, failures cascade uncaught through dependency edges.
- **Abort is immediate in signals, delayed in protocol**: setting `status.value = "aborted"` is instant, but `prm.abort(requestId)` takes time to propagate through the call protocol. The hub should invoke both.
- **`skipped` satisfies preconditions**: a `skipped` predecessor is treated as "completed for the purpose of preconditions." It means the branch was deliberately bypassed, not broken.
- **`failed` and `aborted` block preconditions**: a `failed` or `aborted` predecessor means the dependent's preconditions can never be met. The `blockedByFailure` effect transitions the dependent to `aborted`.
- **`NodeStatus` and `CallStatus` share call-lifecycle states**: `running`, `completed`, `failed`, `aborted` map directly. `idle`, `waiting`, `ready`, `skipped` are workflow-specific additions.

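The polling alternative mentioned in the constraints can be sketched as a pure decision function over snapshots of each node. `NodeSnapshot`, `decide`, and `pollOnce` are hypothetical names; a real coordinator would fill the snapshot from `status.value`, `preconditions.value`, and `blockedByFailure.value` on each tick.

```typescript
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "failed" | "aborted" | "skipped";

type CoordinatorAction = "start" | "abort" | "none";

// Values read from the three reactive sources at one point in time.
interface NodeSnapshot {
  status: NodeStatus;
  preconditionsMet: boolean;
  blockedByFailure: boolean;
}

// Mirrors the two effects: abort takes priority, then start.
function decide(node: NodeSnapshot): CoordinatorAction {
  const notStarted = node.status === "idle" || node.status === "waiting";
  if (notStarted && node.blockedByFailure) return "abort";
  if ((notStarted || node.status === "ready") && node.preconditionsMet) return "start";
  return "none";
}

// One polling pass over every node.
function pollOnce(
  nodes: Record<string, NodeSnapshot>,
): Record<string, CoordinatorAction> {
  const actions: Record<string, CoordinatorAction> = {};
  for (const id of Object.keys(nodes)) {
    actions[id] = decide(nodes[id]);
  }
  return actions;
}

const actions = pollOnce({
  B: { status: "waiting", preconditionsMet: true, blockedByFailure: false },
  D: { status: "waiting", preconditionsMet: false, blockedByFailure: true },
  E: { status: "running", preconditionsMet: true, blockedByFailure: false },
});
```

Polling trades the immediacy of effects for explicit control over when decisions are made, which can make scheduling limits easier to apply in the coordinator.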
## Lifecycle and Ownership

The reactive execution pipeline has a clear creation order and ownership model:

### Creation Order

```
1. Template (UNode tree)
       ↓ GraphologyHostConfig
2. DAG (DirectedGraph)
       ↓ WorkflowReactiveRoot constructor
3. Signal graph (statusMap, preconditions, blockedByFailure)
       ↓ ReactiveHostConfig.render()
4. WorkflowNode tree (with effects registered)
```

1. **Template → DAG**: The consumer provides a template and renders it through `GraphologyHostConfig`. This produces a `DirectedGraph` stored in the `GraphContext`.

2. **DAG → Signal graph**: The consumer creates a `WorkflowReactiveRoot` from the DAG. The constructor iterates over all operation nodes in the DAG and creates `signal<NodeStatus>`, `computed<boolean>` (preconditions), and `computed<boolean>` (blockedByFailure) for each.

3. **Signal graph → WorkflowNode tree**: The consumer renders the template through `ReactiveHostConfig`. The `createInstance` call for each `Operation` node looks up the corresponding signal in the `ReactiveRoot` and wires the node's effects.

### Ownership

| Object | Owned by | Disposed by |
|--------|----------|-------------|
| Template (`UNode` tree) | Consumer | Consumer (not a reactive resource) |
| DAG (`DirectedGraph`) | GraphologyHostConfig's `GraphContext` | Consumer (static, no disposal needed) |
| `WorkflowReactiveRoot` | Consumer (typically the hub coordinator) | Consumer calls `root.dispose()` |
| Signal graph (statusMap, preconditions, etc.) | `WorkflowReactiveRoot` | `root.dispose()` clears all maps |
| `WorkflowNode` tree | `ReactiveContext` (created by ReactiveHostConfig) | Cleared when `ReactiveContext` is garbage collected |
| Effects | `WorkflowReactiveRoot.effectDisposers` | `root.dispose()` calls all disposers |

**Key ownership rules**:
- `WorkflowReactiveRoot` owns the signal graph. It creates every `signal` and `computed`, tracks every `effect` disposer, and is responsible for cleaning them all up.
- `ReactiveHostConfig` is stateless after rendering. It creates `WorkflowNode` instances and registers effects, but the effects are tracked by `WorkflowReactiveRoot`, not by the HostConfig.
- The consumer owns the `WorkflowReactiveRoot` lifecycle. It creates it, drives execution by setting status values, and disposes it when done.

### Disposal

```typescript
// When workflow completes or is cancelled:
workflowRoot.dispose();
```

`dispose()` performs the following in order:
1. Calls every `effect()` disposer, unsubscribing all reactive effects.
2. Clears `statusMap`, `preconditions`, and `blockedByFailure` maps, releasing signal references.
3. The `WorkflowNode` tree becomes inert: its effects are unsubscribed and the root no longer tracks the status signals, so no updates propagate.

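Step 1 relies on every disposer being tracked and called exactly once. A minimal sketch of such a registry follows; `DisposerBag` is a hypothetical class, not part of the documented API, and it stands in for the `effectDisposers` array. It makes `dispose()` idempotent and releases late registrations immediately.

```typescript
class DisposerBag {
  private disposers: Array<() => void> = [];
  private disposed = false;

  // Register a disposer, for example the function returned by effect().
  track(disposer: () => void): void {
    if (this.disposed) {
      // Already torn down: run it right away instead of holding it forever.
      disposer();
      return;
    }
    this.disposers.push(disposer);
  }

  // Call every tracked disposer once; repeated calls do nothing.
  dispose(): void {
    if (this.disposed) return;
    this.disposed = true;
    for (const disposer of this.disposers) {
      disposer();
    }
    this.disposers = [];
  }
}
```

Idempotence matters because both the normal completion path and an error path may end up calling `dispose()` on the same root.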
**When to dispose**:
- Workflow completes successfully (all nodes `completed` or `skipped`)
- Workflow is aborted (consumer calls `abortAll()`, then `dispose()`)
- Template is being re-rendered (dispose the old root before creating a new one, until the ujsx reconciler supports re-rendering)

**What NOT to dispose**:
- The DAG (`DirectedGraph`) is not a reactive resource. It doesn't need disposal.
- The template (`UNode` tree) is plain data. It doesn't need disposal.

### Interaction with ReactiveHostConfig

The `ReactiveHostConfig` does NOT own the reactive state. It creates `WorkflowNode` instances during rendering, but these nodes reference signals that belong to `WorkflowReactiveRoot`. The rendering flow is:

```typescript
// 1. Create ReactiveRoot from DAG
const workflowRoot = new WorkflowReactiveRoot(dag);

// 2. Create ReactiveHostConfig with reference to ReactiveRoot's signals
const hostConfig = new ReactiveHostConfig(operationRegistry, workflowRoot);

// 3. Render template
const root = createRoot(hostConfig, {});
root.render(template);

// 4. Drive execution (hub coordinator sets status values)
workflowRoot.statusMap.get("architect")!.value = "ready";
// ... external code starts the call, eventually:
workflowRoot.statusMap.get("architect")!.value = "completed";
// ... which triggers downstream preconditions

// 5. Cleanup
workflowRoot.dispose();
```

The `ReactiveContext` passed to `ReactiveHostConfig` includes a reference to `workflowRoot.statusMap` so that `createInstance` can look up and wire signals for each node. The context does not own these signals; it's a lookup table.

**Important**: `WorkflowNode.status` and `WorkflowReactiveRoot.statusMap.get(nodeId)` reference the **same** `Signal<NodeStatus>` instance. There is one signal per node, owned by `WorkflowReactiveRoot`, and both the `WorkflowNode` and the `statusMap` hold references to it. Setting `workflowRoot.statusMap.get("architect").value = "running"` and setting `workflowNode.status.value = "running"` (where `workflowNode.key === "architect"`) are equivalent operations on the same signal. Similarly, `WorkflowNode.preconditions` and `WorkflowReactiveRoot.preconditions.get(nodeId)` reference the **same** `Computed<boolean>` instance.

## Open Questions

1. **Should preconditions support OR logic?** Currently all predecessors must complete (AND logic). An `anyOf` predicate would allow "start this node as soon as any predecessor completes." This would require an edge attribute or node-level configuration.

2. **How are retries handled at the signal level?** If an operation fails and should be retried, the status would go `running → failed → ready → running`. This requires resetting the status back to `ready`, which the current state machine doesn't support (failed is terminal). A `retried` status or a separate `retryCount` attribute may be needed.

3. **Should the reactive graph support partial re-rendering?** If a template changes mid-execution (e.g., a step is added), the ujsx reconciler could diff the old and new trees. But the ReactiveHost only supports mount rendering. Re-rendering would require reconciler support.

4. **How does `maxConcurrency` interact with preconditions?** A `Parallel` group with `maxConcurrency: 3` should only start 3 nodes at a time, even though all preconditions are met. This is a scheduling concern, not a structural one. The reactive layer could implement this as a semaphore signal, or it could be the coordinator's responsibility.

5. **Should `blockedByFailure` be a separate `computed` or derived from `preconditions`?** Currently the design has two separate computeds: `preconditions` (all predecessors completed/skipped) and `blockedByFailure` (any predecessor failed/aborted). An alternative is a single `computed<NodeReadiness>` that returns `"ready" | "blocked" | "failed"` or similar. This reduces the number of effects but makes the readiness check less composable.

6. **What happens to running nodes when a predecessor fails?** The current spec transitions `idle` and `waiting` nodes to `aborted`. But what about a node that's already `running`? Should it be cancelled (set to `aborted` and call `prm.abort()`), or should it be allowed to complete? The answer depends on whether the running node's output is still needed, which the template author decides via `Conditional` error boundaries.

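To make question 5 concrete, the single-value alternative could look like the sketch below. The `NodeReadiness` members used here (`"ready" | "blocked" | "pending"`) are one possible shape chosen for illustration, not a decision; `readiness` is a hypothetical pure function that would form the body of the single `computed`.

```typescript
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "failed" | "aborted" | "skipped";

// One possible result type for a combined readiness check (assumed names).
type NodeReadiness = "ready" | "blocked" | "pending";

// Folds both existing predicates into one value.
function readiness(predecessorStatuses: NodeStatus[]): NodeReadiness {
  // Any failed or aborted predecessor blocks the node for good.
  const blocked = predecessorStatuses.some(
    (s) => s === "failed" || s === "aborted",
  );
  if (blocked) return "blocked";

  // All predecessors satisfied (vacuously true for root nodes).
  const satisfied = predecessorStatuses.every(
    (s) => s === "completed" || s === "skipped",
  );
  return satisfied ? "ready" : "pending";
}
```

A single effect could then switch on the result, at the cost of no longer being able to subscribe to the start and abort conditions independently.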
## References

- ujsx reactive layer: `@alkdev/ujsx/docs/architecture/reactive-layer.md`
- ujsx reconciler: `@alkdev/ujsx/docs/architecture/reconciler.md`
- Schema: [schema.md](schema.md) (`NodeStatus`, `CallStatus`)
- Host configs: [host-configs.md](host-configs.md)
- Workflow templates: [workflow-templates.md](workflow-templates.md)
- Call protocol: `@alkdev/alkhub_ts/docs/architecture/call-graph.md`