---
status: draft
last_updated: 2026-05-19
---

# Reactive Execution

Signal-driven status propagation, computed preconditions, and failure propagation for workflow template execution.

## Overview

The reactive execution layer bridges workflow template structure (DAG) to runtime behavior (call execution). It uses `@preact/signals-core` (via ujsx's reactive layer) to create a signal-backed execution model where:

- Each `<Operation>` node gets a `signal` tracking its lifecycle state
- Preconditions are `computed` values that automatically resolve when upstream dependencies complete
- Failure propagation follows dependency edges: a failed predecessor causes downstream dependents to abort, while independent branches continue running
- Conditionals can serve as error boundaries, catching failures and redirecting to fallback paths

This layer does NOT execute operations directly. It provides reactive state that the hub coordinator reads and writes. The coordinator calls `registry.execute()` when a node's preconditions are met, and updates the node's status signal when the call completes or fails.

## ReactiveRoot for Workflows

```typescript
class WorkflowReactiveRoot {
  // Node id → lifecycle status signal
  private statusMap: Map<string, Signal<NodeStatus>>;
  // Node id → "all predecessors completed or skipped"
  private preconditions: Map<string, ReadonlySignal<boolean>>;
  // Node id → "some predecessor failed or aborted"
  private blockedByFailure: Map<string, ReadonlySignal<boolean>>;
  private graph: DirectedGraph;
  private effectDisposers: (() => void)[];

  constructor(graph: DirectedGraph) {
    this.graph = graph;
    this.statusMap = new Map();
    this.preconditions = new Map();
    this.blockedByFailure = new Map();
    this.effectDisposers = [];
    this.initializeSignals();
  }
}
```

`WorkflowReactiveRoot` wraps the reactive state for an entire workflow execution. It takes the structural DAG (from the GraphologyHost) and creates reactive state for each operation node.

### initializeSignals()

```typescript
private initializeSignals(): void {
  for (const node of this.graph.nodes()) {
    const attrs = this.graph.getNodeAttributes(node);
    // Skip structural nodes (already flattened)
    if (attrs.category !== "operation") continue;

    const status = signal<NodeStatus>("idle");
    const predecessors = this.graph.inNeighbors(node);

    // Preconditions: all predecessors completed or skipped
    const preconditions = computed(() => {
      return predecessors.every(pred => {
        const predStatus = this.statusMap.get(pred);
        return predStatus !== undefined &&
          (predStatus.value === "completed" || predStatus.value === "skipped");
      });
    });

    // Blocked by failure: any predecessor failed or aborted (uncaught)
    const blockedByFailure = computed(() => {
      return predecessors.some(pred => {
        const predStatus = this.statusMap.get(pred);
        return predStatus !== undefined &&
          (predStatus.value === "failed" || predStatus.value === "aborted");
      });
    });

    this.statusMap.set(node, status);
    this.preconditions.set(node, preconditions);
    this.blockedByFailure.set(node, blockedByFailure);
  }
}
```

For each operation node in the DAG:

1. Create a `signal` starting at `"idle"`
2. Create a `computed` that's `true` when all predecessor nodes have status `"completed"` or `"skipped"` (a skipped node satisfies its dependents' preconditions)
3. Create a `computed` that detects whether any predecessor has failed or been aborted, triggering a cascade
4. Register an abort function that cascades to all descendants

### Status lifecycle

The signal-based status lifecycle mirrors `CallStatus` with workflow-specific additions:

```
idle → waiting → ready → running → completed
                            ↓          ↑
                          failed       │
                            ↓          │
           (uncaught) → aborted ←──────┘
                            ↑
          (cascade from failed predecessor)
                            ↑
                   skipped (conditional)
```

Full transition rules:

```
idle      → waiting    (predecessor starts running)
idle      → ready      (no predecessors: root node)
waiting   → ready      (all predecessors completed or skipped)
waiting   → aborted    (predecessor failed and failure is uncaught)
ready     → running    (hub starts the call)
running   → completed  (call succeeded)
running   → failed     (call threw an error)
running   → aborted    (call cancelled externally)
failed    → [terminal] (no further transitions)
aborted   → [terminal] (no further transitions)
skipped   → [terminal] (conditional branch not taken)
completed → [terminal] (no further transitions)
```

| Status | Meaning | Signal trigger |
|--------|---------|----------------|
| `idle` | Node just created, no predecessor activity yet | Initial state |
| `waiting` | At least one predecessor is running, none have completed yet | Any predecessor status change |
| `ready` | All predecessors completed or skipped (preconditions met) | `computed` resolves to `true` |
| `running` | Call executing | Hub sets `status.value = "running"` |
| `completed` | Call succeeded | Hub sets `status.value = "completed"` |
| `failed` | Call failed (uncaught error) | Hub sets `status.value = "failed"` |
| `aborted` | Call cancelled, or cascaded from failed predecessor | Hub or cascade sets `status.value = "aborted"` |
| `skipped` | Conditional branch not taken | Conditional evaluation sets this |

The key distinction between `failed` and `aborted`:

- **`failed`** means the operation itself threw an error. The node is the *source* of the failure.
- **`aborted`** means the operation was cancelled or a predecessor failed. The node is a *victim* of failure propagation.
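
The listed transition rules can be encoded as a lookup table, which makes it cheap to reject an illegal status write before it reaches a signal. The following is a minimal sketch, not part of the described API: `TRANSITIONS`, `canTransition`, and `isTerminal` are hypothetical names, and the `NodeStatus` union is assumed to match the statuses in the table above. It encodes only the rules listed; how a node enters `skipped`, and the `idle → aborted` case that the abort effect also handles, are not in the list and are therefore left out.

```typescript
// Assumed shape of NodeStatus, taken from the status table (hypothetical here).
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "failed" | "aborted" | "skipped";

// Allowed targets for each source status, copied from the listed rules.
const TRANSITIONS: Record<NodeStatus, readonly NodeStatus[]> = {
  idle: ["waiting", "ready"],
  waiting: ["ready", "aborted"],
  ready: ["running"],
  running: ["completed", "failed", "aborted"],
  completed: [],
  failed: [],
  aborted: [],
  skipped: [],
};

// True when the listed rules permit moving from `from` to `to`.
function canTransition(from: NodeStatus, to: NodeStatus): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// A status is terminal when no rule leads out of it.
function isTerminal(status: NodeStatus): boolean {
  return TRANSITIONS[status].length === 0;
}
```

A coordinator could call `canTransition(status.value, next)` before assigning `status.value = next`, so that a late call event cannot move a terminal node.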

## Computed Preconditions

The core innovation of reactive execution: each node's "can I start?" question is a `computed` signal that automatically resolves based on upstream states.

```typescript
const preconditions = computed(() => {
  const predecessors = graph.inNeighbors(node);
  return predecessors.every(pred => {
    const status = statusMap.get(pred)!.value;
    return status === "completed" || status === "skipped";
  });
});
```

A node's preconditions are met when **all predecessors have reached a satisfying terminal state** (`completed` or `skipped`). A `failed` or `aborted` predecessor does NOT satisfy preconditions; it prevents the dependent from ever becoming `ready`.

This means:

- Adding a new predecessor automatically includes it in the check (if the DAG changes)
- A predecessor completing automatically re-evaluates all dependent preconditions
- An aborted predecessor prevents dependents from becoming `ready`
- A skipped predecessor satisfies preconditions (the branch was deliberately bypassed, not broken)
- No manual event wiring or callback chains

### Sequential preconditions

In a sequential group (A → B → C):

- A's preconditions: `true` (no predecessors, or root-level)
- B's preconditions: `A.status === "completed"`
- C's preconditions: `B.status === "completed"`

When A completes → B's preconditions become true → hub starts B → B completes → C's preconditions become true → hub starts C. All without manual event wiring.

### Parallel preconditions

In a parallel group (A starts B and C simultaneously):

- B's preconditions: `A.status === "completed"` (same as any sequential dependency)
- C's preconditions: `A.status === "completed"` (shared predecessor)

Both B and C become `ready` at the same time, and the hub starts them in parallel.
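
The precondition rule can be checked in isolation with plain data, which is also what a polling coordinator would do. This is a dependency-free sketch under stated assumptions: `Predecessors` and `Statuses` are hypothetical plain-object stand-ins for the DAG and the status signals, and `preconditionsMet` and `readyNodes` are illustrative helpers rather than functions from the reactive layer. In the real layer the same predicate lives inside a `computed`, so it re-evaluates on its own.

```typescript
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "failed" | "aborted" | "skipped";

// Hypothetical stand-ins: node id → predecessor ids, node id → current status.
type Predecessors = Record<string, string[]>;
type Statuses = Record<string, NodeStatus>;

// Same predicate as the computed: every predecessor completed or skipped.
function preconditionsMet(node: string, preds: Predecessors, statuses: Statuses): boolean {
  return (preds[node] ?? []).every(
    p => statuses[p] === "completed" || statuses[p] === "skipped",
  );
}

// Nodes that have not started yet and whose preconditions now hold.
function readyNodes(preds: Predecessors, statuses: Statuses): string[] {
  return Object.keys(preds).filter(node => {
    const notStarted = statuses[node] === "idle" || statuses[node] === "waiting";
    return notStarted && preconditionsMet(node, preds, statuses);
  });
}

// Fork: A is the shared predecessor of B and C.
const preds: Predecessors = { A: [], B: ["A"], C: ["A"] };
const whileARuns: Statuses = { A: "running", B: "waiting", C: "waiting" };
const afterADone: Statuses = { ...whileARuns, A: "completed" };

const startableBefore = readyNodes(preds, whileARuns); // []
const startableAfter = readyNodes(preds, afterADone);  // ["B", "C"]
```

Both branches become startable in the same evaluation once A completes, which matches the parallel case described above.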

### Join preconditions

When a node depends on multiple predecessors (e.g., D depends on both B and C completing):

- D's preconditions: `B.status === "completed" && C.status === "completed"`

D only becomes `ready` when all predecessors complete. This is the "join" in fork-join parallelism.

## Failure Propagation

Failure propagation is the mechanism by which a failed or aborted node causes its downstream dependents to abort. The key design principle: **failure follows dependency edges, not structural scope**.

This means:

- In a `Sequential` group, failure propagates forward through the chain (B depends on A, so if A fails, B aborts)
- In a `Parallel` group, sibling branches are independent: a failure in branch A does NOT affect branch B, because there are no dependency edges between them
- A node that depends on multiple predecessors (a join) aborts only when it's impossible for its preconditions to ever be met

### The preconditions-failure duality

Each node has two complementary reactive computations:

1. **`preconditions`** (`computed`): true when all predecessors are `completed` or `skipped`. Node can start.
2. **`blockedByFailure`** (`computed`): true when any predecessor is `failed` or `aborted` and the failure is uncaught (not handled by a `Conditional`).

```typescript
const preconditions = computed(() => {
  const predecessors = graph.inNeighbors(node);
  return predecessors.every(pred => {
    const status = statusMap.get(pred)!.value;
    return status === "completed" || status === "skipped";
  });
});

const blockedByFailure = computed(() => {
  const predecessors = graph.inNeighbors(node);
  return predecessors.some(pred => {
    const status = statusMap.get(pred)!.value;
    return status === "failed" || status === "aborted";
  });
});
```

When `blockedByFailure` becomes `true` and the node hasn't started (`idle` or `waiting`), the node transitions to `aborted`.
This happens via an `effect()`:

```typescript
effect(() => {
  if (blockedByFailure.value && (status.value === "idle" || status.value === "waiting")) {
    status.value = "aborted";
  }
});
```

This cascade is automatic and reactive: when a predecessor fails, all downstream `blockedByFailure` computations re-evaluate, and their effects fire, aborting any waiting dependents.

### Sequential failure propagation

```
A (failed) → B (aborted) → C (aborted)
```

When A fails, B's `blockedByFailure` becomes true. B transitions from `waiting` to `aborted`. C's `blockedByFailure` then becomes true (B is now `aborted`). C transitions to `aborted`. The entire downstream chain aborts.

### Parallel independence

```
              ┌── B (completed) ──┐
A (completed)                     ├── D (aborted)
              └── C (failed) ─────┘
```

When C fails:

- C's downstream dependents see `blockedByFailure = true`
- B is unaffected because it's on an independent branch
- D depends on both B and C. D's `preconditions` will never be met (C is `failed`, not `completed`). D's `blockedByFailure` is true (C is `failed`). D transitions to `aborted`.

But crucially, this is because D *depends on* C, not because they share a structural scope:

```
              ┌── B (completed) ──┐
A (completed)                     │   (no edge from C to E)
              └── C (failed) ─────┘
              └── E (completed)
```

E has no dependency on C. E continues running regardless of C's failure. **Failure follows dependency edges, not structural boundaries.**

### Join semantics

When a node depends on multiple predecessors (fork-join):

```
              ┌── B (completed) ──┐
A (completed)                     ├── D (aborted)
              └── C (failed) ─────┘
```

D's `preconditions` requires both B and C to be completed/skipped. Since C is `failed`, D's preconditions can never be met. D transitions to `aborted`.

The alternative would be "partial success": D starts with B's output even though C failed. This is NOT supported by the precondition model. If partial execution is needed, the template author should use a `Conditional` to handle the failure case explicitly.
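
The cascade rules above can be traced on a small graph without any reactive machinery. The sketch below is an assumption-laden stand-in, not the layer's implementation: it replaces the per-node `effect()` with a loop that re-applies the abort rule until nothing changes, and `Predecessors`, `Statuses`, `isBlockedByFailure`, and `cascadeAborts` are hypothetical names. It only models uncaught failures (no `Conditional` handling).

```typescript
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "failed" | "aborted" | "skipped";

// Hypothetical stand-ins: node id → predecessor ids, node id → current status.
type Predecessors = Record<string, string[]>;
type Statuses = Record<string, NodeStatus>;

// Same predicate as the blockedByFailure computed.
function isBlockedByFailure(node: string, preds: Predecessors, statuses: Statuses): boolean {
  return (preds[node] ?? []).some(
    p => statuses[p] === "failed" || statuses[p] === "aborted",
  );
}

// Re-apply the abort rule until a pass changes nothing. This fixed-point loop
// stands in for effects re-firing as upstream statuses change.
function cascadeAborts(preds: Predecessors, statuses: Statuses): Statuses {
  const next: Statuses = { ...statuses };
  let changed = true;
  while (changed) {
    changed = false;
    for (const node of Object.keys(preds)) {
      const notStarted = next[node] === "idle" || next[node] === "waiting";
      if (notStarted && isBlockedByFailure(node, preds, next)) {
        next[node] = "aborted";
        changed = true;
      }
    }
  }
  return next;
}

// A forks into B, C and E. D joins B and C. F follows D. E has no edge from C.
const preds: Predecessors = {
  A: [], B: ["A"], C: ["A"], E: ["A"], D: ["B", "C"], F: ["D"],
};
const settled = cascadeAborts(preds, {
  A: "completed", B: "completed", C: "failed",
  E: "running", D: "waiting", F: "idle",
});
```

In this run D and F end up `aborted` because they sit downstream of C, while B stays `completed` and E stays `running`: the failure travels along edges only.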

### Conditional as error boundary

A `Conditional` can catch a failure and redirect to a fallback path:

```typescript
h(Sequential, {},
  h(Operation, { name: "fetch-data" }),
  h(Conditional, {
    test: (results) => results["fetch-data"].status !== "failed",
  },
    // then: proceed with data processing
    h(Sequential, {},
      h(Operation, { name: "transform" }),
      h(Operation, { name: "store" }),
    ),
    // else: fallback path
    h(Operation, { name: "notify-error" }),
  ),
)
```

If `fetch-data` fails:

1. The `Conditional`'s `test` function receives the results map including `fetch-data`'s status
2. `test` evaluates to `false` (the operation failed)
3. The `then` branch transitions to `skipped`
4. The `else` branch (`notify-error`) becomes `ready`
5. Downstream nodes after the `Conditional` see the `Conditional` as `completed` (it resolved successfully, just on a different branch)

This makes `Conditional` a **caught error boundary**. The failure is handled: downstream nodes don't see a cascade because the `Conditional` resolved successfully.

Without a `Conditional`, the failure is **uncaught**. It cascades through dependency edges to all dependents, which transition to `aborted`.

### Systemic failure: aborting the entire workflow

For failures that should cancel everything (e.g., provider outage, authentication failure), the hub coordinator can abort the entire `WorkflowReactiveRoot`:

```typescript
workflowRoot.abortAll(); // Sets all non-terminal nodes to "aborted"
```

This is separate from dependency-edge failure propagation. It's for systemic failures where the workflow cannot meaningfully continue regardless of which branches are independent.

### Interaction with call protocol abort

There are three abort mechanisms:

1. **Signal cascade** (this layer): `blockedByFailure` effects transition dependents to `aborted`. This is automatic and follows dependency edges.
2. **Call protocol abort** (operations layer): `PendingRequestMap.abort(requestId)` propagates `call.aborted` events through the pub/sub layer. This is network-aware and handles remote calls.
3. **Full workflow abort**: `workflowRoot.abortAll()` aborts all non-terminal nodes. For systemic failures.

The hub coordinator should invoke signal cascade and protocol abort together:

```typescript
// When aborting a call:
workflowRoot.abortNode(nodeId); // Signal: transition dependents to aborted
prm.abort(requestId);           // Protocol: cancel the remote call

// When aborting entire workflow:
workflowRoot.abortAll();           // Signal: abort everything
prm.abortAll(pendingRequestIds);   // Protocol: cancel all pending calls
```

Signal cascades are instant. Protocol aborts may take time to propagate. They're complementary: the signal cascade ensures local state is immediately consistent, while the protocol abort ensures remote state eventually catches up.

## NodeStatus vs CallStatus

`NodeStatus` extends `CallStatus` with workflow-specific states that have no call protocol equivalent:

| NodeStatus | Meaning | CallStatus equivalent |
|------------|---------|-----------------------|
| `idle` | Not started, no preconditions evaluated | None (call doesn't exist yet) |
| `waiting` | Preconditions not met (upstream still running) | None |
| `ready` | Preconditions met, eligible to start | None |
| `running` | Call in progress | `running` |
| `completed` | Call succeeded | `completed` |
| `failed` | Call failed | `failed` |
| `aborted` | Call cancelled | `aborted` |
| `skipped` | Conditional branch not taken | None |

The hub coordinator maps between these:

```typescript
// NodeStatus → CallStatus (when starting a call)
function nodeStatusToCallAction(status: NodeStatus): "start" | "skip" | "abort" | "none" {
  switch (status) {
    case "ready": return "start";
    case "skipped": return "skip";
    case "aborted": return "abort";
    default: return "none";
  }
}

// CallStatus → NodeStatus (when call event arrives)
function callStatusToNodeStatus(callStatus: CallStatus): NodeStatus {
  // Direct mapping for shared states
  return callStatus as NodeStatus;
}
```

## Effect-Driven Execution

The hub coordinator uses two `effect()`s per node: one for starting when preconditions are met, and one for aborting when failure propagates:

```typescript
for (const [nodeId, preconditions, blockedByFailure] of workflowRoot.nodes) {
  // Start the call when preconditions are met
  effect(() => {
    if (preconditions.value) {
      const status = workflowRoot.statusMap.get(nodeId)!;
      if (status.value === "idle" || status.value === "waiting") {
        // All preconditions satisfied: start the call
        status.value = "running";
        const operationId = graph.getNodeAttributes(nodeId).name;
        prm.call(operationId, getInput(nodeId), { parentRequestId: parentCallId })
          .then(() => { status.value = "completed"; })
          .catch(() => { status.value = "failed"; });
      }
    }
  });

  // Abort when a predecessor fails (uncaught failure propagation)
  effect(() => {
    if (blockedByFailure.value) {
      const status = workflowRoot.statusMap.get(nodeId)!;
      if (status.value === "idle" || status.value === "waiting") {
        // A predecessor failed and no Conditional caught it: abort
        status.value = "aborted";
      }
    }
  });
}
```

Both effects are reactive. When a predecessor completes, the `preconditions` computed re-evaluates, potentially triggering the start effect. When a predecessor fails, the `blockedByFailure` computed re-evaluates, potentially triggering the abort effect. The call's promise resolution updates the node's status signal, which triggers downstream preconditions and failure propagations to re-evaluate, which triggers their effects, and so on.

### Effect disposal

Each `effect()` returns a dispose function.
The `WorkflowReactiveRoot` tracks all effect disposers and provides a `dispose()` method that tears down the entire reactive graph:

```typescript
dispose(): void {
  for (const disposer of this.effectDisposers) {
    disposer();
  }
  this.statusMap.clear();
  this.preconditions.clear();
  this.blockedByFailure.clear();
}
```

This is critical for cleaning up when a workflow completes, fails, or is aborted. Without disposal, signal subscriptions leak.

### Full workflow abort

For systemic failures (provider outage, authentication failure), `WorkflowReactiveRoot` provides `abortAll()`:

```typescript
abortAll(): void {
  for (const status of this.statusMap.values()) {
    const current = status.value;
    // Leave terminal nodes (completed, failed, aborted, skipped) untouched
    const isTerminal =
      current === "completed" || current === "failed" ||
      current === "aborted" || current === "skipped";
    if (!isTerminal) {
      status.value = "aborted";
    }
  }
  // Effects will fire and clean up any waiting/ready nodes
}
```

This transitions all non-terminal nodes to `aborted`. It's for cases where the entire workflow should stop, regardless of which branches are independent.

## Constraints

- **Signals are in-memory**: `WorkflowReactiveRoot` state is not persisted. If the hub restarts, the reactive state is lost and must be reconstructed from call protocol events + template re-render.
- **Effect-driven execution is optional**: the hub coordinator can choose not to use `effect()` and instead poll `preconditions.value` and `blockedByFailure.value` manually. The reactive layer provides the building blocks; the coordinator decides how to use them.
- **Failure follows dependency edges, not structural scope**: a failed node causes only its downstream dependents (via DAG edges) to abort. Sibling branches in a `Parallel` group are independent and continue running. This enables partial success: one branch can fail while another completes.
- **Conditionals are error boundaries**: a `Conditional` whose test evaluates against a failed predecessor can redirect to an else branch, catching the failure. Without a `Conditional`, failures cascade uncaught through dependency edges.
- **Abort is immediate in signals, delayed in protocol**: setting `status.value = "aborted"` is instant, but `prm.abort(requestId)` takes time to propagate through the call protocol. The hub should invoke both.
- **`skipped` satisfies preconditions**: a `skipped` predecessor is treated as "completed for the purpose of preconditions." It means the branch was deliberately bypassed, not broken.
- **`failed` and `aborted` block preconditions**: a `failed` or `aborted` predecessor means the dependent's preconditions can never be met. The `blockedByFailure` effect transitions the dependent to `aborted`.
- **`NodeStatus` and `CallStatus` share the call-level states**: `running`, `completed`, `failed`, `aborted` map directly. `idle`, `waiting`, `ready`, `skipped` are workflow-specific additions.

## Open Questions

1. **Should preconditions support OR logic?** Currently all predecessors must complete (AND logic). An `anyOf` predicate would allow "start this node as soon as any predecessor completes." This would require an edge attribute or node-level configuration.
2. **How are retries handled at the signal level?** If an operation fails and should be retried, the status would go `running → failed → ready → running`. This requires resetting the status back to `ready`, which the current state machine doesn't support (failed is terminal). A `retried` status or a separate `retryCount` attribute may be needed.
3. **Should the reactive graph support partial re-rendering?** If a template changes mid-execution (e.g., a step is added), the ujsx reconciler could diff the old and new trees. But the ReactiveHost only supports mount rendering. Re-rendering would require reconciler support.
4. **How does `maxConcurrency` interact with preconditions?** A `Parallel` group with `maxConcurrency: 3` should only start 3 nodes at a time, even though all preconditions are met. This is a scheduling concern, not a structural one.
The reactive layer could implement this as a semaphore signal, or it could be the coordinator's responsibility.
5. **Should `blockedByFailure` be a separate `computed` or derived from `preconditions`?** Currently the design has two separate computeds: `preconditions` (all predecessors completed/skipped) and `blockedByFailure` (any predecessor failed/aborted). An alternative is a single `computed` that returns `"ready" | "blocked" | "failed"` or similar. This reduces the number of effects but makes the readiness check less composable.
6. **What happens to running nodes when a predecessor fails?** The current spec transitions `idle` and `waiting` nodes to `aborted`. But what about a node that's already `running`? Should it be cancelled (set to `aborted` and call `prm.abort()`), or should it be allowed to complete? The answer depends on whether the running node's output is still needed, which the template author decides via `Conditional` error boundaries.

## References

- ujsx reactive layer: `@alkdev/ujsx/docs/architecture/reactive-layer.md`
- ujsx reconciler: `@alkdev/ujsx/docs/architecture/reconciler.md`
- Schema: [schema.md](schema.md) (`NodeStatus`, `CallStatus`)
- Host configs: [host-configs.md](host-configs.md)
- Workflow templates: [workflow-templates.md](workflow-templates.md)
- Call protocol: `@alkdev/alkhub_ts/docs/architecture/call-graph.md`