ADR-005 accepted: resolve all open consequences, update cascading docs

Resolve the three open consequences from ADR-005 (Event Log as Single
Source of Truth) and transition from Proposed to Accepted:

1. Event log IS the call protocol event stream — not a separate type,
   but an EventLogProjection interface (append/getStatus/getResult/
   getEvents) over CallEventMapValue[] with an append-only contract.

2. Event log persists across template re-renders — projections recompute
   against the new DAG; orphaned events stay in log for audit but don't
   affect active projections.

3. Edges get dataFlow: boolean attribute on TemplateEdgeAttrs — inferred
   (not manual) by GraphologyHostConfig from template expressions.
   typeCompat() only runs on dataFlow: true edges. Inference rules are
   precisely specified for Conditional.test, Map.over, and Operation.input.

Also resolve OQ-05 (structural containers stay transparent; aggregate
status is a projection from children) and OQ-10 (running node failure
is a FailurePolicy configuration, default continues-running).

Cascading updates to:
- reactive-execution.md: add hybrid status model (event-log-driven vs
  projection-driven vs signal-mutation), EventLogProjection interface,
  result projection respecting retries, FailurePolicy type
- host-configs.md: ReactiveContext now includes resultProjection and
  computed results; resolved Q1/Q3/Q4
- schema.md: dataFlow attribute on TemplateEdgeAttrs with inference
  rules and type checking implications
- workflow-templates.md: edge creation rules with dataFlow, result
  projection in Conditional/Map, resolved Q1/Q4
- open-questions.md: all ADR-005 questions marked resolved, updated
  summary table and cross-cutting themes, removed duplicate OQ-07

7 files changed, 464 insertions, 139 deletions
2026-05-21 07:44:28 +00:00
parent 2c1b2d1a15
commit c76be7f689
7 changed files with 463 additions and 138 deletions


@@ -1,6 +1,6 @@
---
status: draft
last_updated: 2026-05-20
last_updated: 2026-05-21
---
# @alkdev/flowgraph Architecture


@@ -2,7 +2,7 @@
## Status
Proposed
Accepted
## Context
@@ -139,11 +139,111 @@ This resolves OQ-01: incompatible edges (type mismatches) only exist on state-tr
- **Event replay must be idempotent.** Processing the same event twice must produce the same projected state. This is already a property of the call protocol events (`updateFromEvent` is documented as idempotent in call-graph.md).
- **The result projection needs a clear interface.** `getResult(nodeId)` must be defined — what it returns, when it's available, and how it interacts with `Conditional.test` and `Map.over` closures that may reference results from nodes that haven't completed yet.
### Open
### Resolved: Event log is the call protocol event stream
- **Should the event log be its own exported type, or is it the call protocol event stream by another name?** The call protocol already defines the events. The event log might just be `CallEventMapValue[]` with an append-only contract and projection functions.
- **How does the event log interact with the ujsx template lifecycle?** When a template is rendered to a reactive root, the log starts empty and populates as events arrive. But if the template is re-rendered (when the reconciler supports it), what happens to the log? Is it reset, or does it persist across re-renders?
- **Should temporal-only edges be explicitly marked?** Currently `sequential` edges are always temporal ordering. Data flow is implicitly expressed by `Conditional.test` and `Map.over` reading from the result projection. Should edges carry an attribute that explicitly marks them as notification vs. state transfer? This would make type checking more precise (only check types on state-transfer edges).
The event log is NOT a separate type. It IS the call protocol event stream with an **append-only contract** and **projection functions**. The call protocol events (`CallEventMapValue[]`) already carry everything needed:
- `requestId` — identifies which invocation
- `operationId` — identifies which operation
- `input`/`output` — the payload data (for state transfer edges)
- `parentRequestId` — the causation link
- `timestamp` — when it happened
What flowgraph provides is not a new event type, but a **consumption contract**:
```typescript
interface EventLogProjection {
/** Append an event. Events are processed idempotently. */
append(event: CallEventMapValue): void;
/** Current status of a node, derived from the most recent event. */
getStatus(nodeId: string): NodeStatus;
/** Result of a completed node, derived from call.responded events. */
getResult(nodeId: string): CallResult | undefined;
/** All events for a node, in order. */
getEvents(nodeId: string): CallEventMapValue[];
}
```
The `EventLogProjection` interface makes the append-only discipline explicit and provides typed access to projections. Implementations wrap `CallEventMapValue[]` and derive state on demand (or with memoization). This avoids creating a parallel type system — the event types, their structure, and their semantics remain in `@alkdev/operations/src/call.ts`.
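A minimal in-memory sketch of how such a projection might wrap the event array. The `CallEvent` shape, the `call.requested` and `call.error` type names, the reduced `NodeStatus` union, the `requestToNode` lookup, and the dedup key are all assumptions made for illustration; the real types live in `@alkdev/operations`.

```typescript
// Hypothetical stand-ins for CallEventMapValue, NodeStatus, and CallResult.
type NodeStatus = "idle" | "running" | "completed" | "failed";
type CallResult = { output: unknown };
interface CallEvent {
  // Only "call.responded" is named in the ADR; the other names are assumed.
  type: "call.requested" | "call.responded" | "call.error";
  requestId: string;
  operationId: string;
  output?: unknown;
  timestamp: number;
}

class InMemoryEventLog {
  private readonly events: CallEvent[] = [];
  private readonly seen = new Set<string>();
  private readonly requestToNode: Map<string, string>;

  // requestToNode maps each requestId to the node key it belongs to (assumed).
  constructor(requestToNode: Map<string, string>) {
    this.requestToNode = requestToNode;
  }

  // Appending the same event twice leaves the log unchanged.
  append(event: CallEvent): void {
    const key = `${event.requestId}:${event.type}`;
    if (this.seen.has(key)) return;
    this.seen.add(key);
    this.events.push(event);
  }

  getEvents(nodeId: string): CallEvent[] {
    return this.events.filter((e) => this.requestToNode.get(e.requestId) === nodeId);
  }

  // Status follows the most recent event recorded for the node.
  getStatus(nodeId: string): NodeStatus {
    const events = this.getEvents(nodeId);
    const last = events[events.length - 1];
    if (!last) return "idle";
    if (last.type === "call.requested") return "running";
    return last.type === "call.responded" ? "completed" : "failed";
  }

  // Result comes from the latest call.responded event, so a retry wins.
  getResult(nodeId: string): CallResult | undefined {
    const responded = this.getEvents(nodeId).filter((e) => e.type === "call.responded");
    const last = responded[responded.length - 1];
    return last ? { output: last.output } : undefined;
  }
}
```

This sketch derives state on demand by scanning the array; a memoized variant would keep the same interface.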
### Resolved: Event log persists across re-renders; projections recompute
When a template is re-rendered (once the ujsx reconciler supports it), the event log persists. Events are append-only facts: they record what happened, and what happened doesn't change when the template structure changes.
Projections are recomputed by scanning the log against the new DAG:
1. Events for nodes still in the DAG map naturally to their projections.
2. Events for nodes removed from the DAG become **orphaned events** — they remain in the log (for audit/history) but don't affect active projections.
3. New nodes added to the DAG have no events yet — their status is `idle` and their result is `undefined`.
This means re-rendering doesn't lose history. The event log is the durable record; projections are ephemeral views that can always be reconstructed.
For v1 (before the reconciler exists), the event log starts at template mount and is disposed when the `WorkflowReactiveRoot` is disposed. The re-render scenario is an architectural commitment for when the reconciler arrives, not something to implement now.
#### Orphaned events specification
When a template is re-rendered and nodes are removed from the DAG, their events become orphaned. The projection layer handles this as follows:
1. **The `EventLogProjection` receives the current DAG structure** (the set of active node keys) alongside the event log. Methods like `getStatus(nodeId)` first check whether `nodeId` is in the active DAG. If not, the node is orphaned.
2. **Orphaned nodes return `undefined` from `getResult()`**. A downstream node referencing an orphaned predecessor via `Conditional.test` or `Map.over` will see `undefined`, causing the test to evaluate as if the predecessor didn't complete. This is the correct behavior — a removed node can't provide data.
3. **Orphaned events remain in the log** for audit and history. `getEvents(nodeId)` on an orphaned node returns its events (if any). The overall event log is still queryable for debugging.
4. **The `nodeKeyToRequestId` map is rebuilt on re-render**. New nodes get fresh `requestId` values. Old mappings are discarded, along with their associated signal subscriptions (the `WorkflowReactiveRoot.dispose()` call before re-render handles this).
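The orphan handling in points 1 to 3 could be sketched as a thin wrapper around raw log lookups. The `scopeToDag` helper, its parameter names, and the `CallResult` shape are hypothetical; the sketch only illustrates that results are gated by DAG membership while events are not.

```typescript
// Hypothetical result shape; the real CallResult comes from the call protocol.
type CallResult = { output: unknown };

// Wraps raw log lookups with the active-DAG check described above.
function scopeToDag<E>(
  activeKeys: ReadonlySet<string>,
  rawResult: (nodeId: string) => CallResult | undefined,
  rawEvents: (nodeId: string) => E[],
) {
  return {
    // A node is orphaned when it is no longer part of the active DAG.
    isOrphaned: (nodeId: string): boolean => !activeKeys.has(nodeId),
    // Orphaned nodes never provide data to downstream tests.
    getResult: (nodeId: string): CallResult | undefined =>
      activeKeys.has(nodeId) ? rawResult(nodeId) : undefined,
    // History stays queryable regardless of DAG membership.
    getEvents: (nodeId: string): E[] => rawEvents(nodeId),
  };
}
```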
### Resolved: Edges are marked with `dataFlow` attribute
Template edges get a `dataFlow: boolean` attribute that distinguishes temporal edges from state-transfer edges:
| `dataFlow` value | Meaning | Type checking needed? |
|:---|:---|:---|
| `false` (default) | Temporal ordering only — downstream starts after upstream completes but doesn't read upstream's output | No — no data flows between nodes |
| `true` | State transfer — downstream reads upstream's output via `Conditional.test` or `Map.over` | Yes — `typeCompat()` checks output→input compatibility |
This attribute is **inferred, not manual**. The `GraphologyHostConfig` detects `dataFlow` from template expressions during rendering:
- A `Sequential` edge where the downstream node references `results["upstreamNode"]` in `Conditional.test`, `Map.over`, or `Operation.input` gets `dataFlow: true`
- A `Sequential` edge where no such reference exists gets `dataFlow: false` (the default)
- A `Conditional` edge always gets `dataFlow: true` (the condition always reads a predecessor's result)
- `Parallel` edges don't exist (parallel children have no inter-sibling edges)
#### dataFlow inference specification
The inference algorithm operates at **template AST level** during `GraphologyHostConfig.createInstance` / `appendChild`, not at runtime. It inspects template component props to detect references to predecessor results:
**Detectable references** (set `dataFlow: true` on the edge from the referenced node to the referencing node):
| Expression | Detection method |
|:---|:---|
| `Conditional.test = (results) => results["X"]` | Static analysis of the function body for `results[...]` property accesses |
| `Conditional.test = "X"` (string form) | String comparison — the referenced operation name |
| `Map.over = (results) => results["X"].output.items` | Static analysis of the function body for `results[...]` property accesses |
| `Map.over = itemsSignal` (signal form) | No `dataFlow: true` — the array comes from a signal, not a predecessor result |
| `Operation.input = (results) => results["X"].output` | Static analysis of the function body for `results[...]` property accesses |
| `Operation.input = staticValue` | No `dataFlow: true` — the input doesn't depend on a predecessor result |
**Inference rules**:
1. **Direct predecessor edges only**: `dataFlow: true` is set only on edges that exist in the DAG. In a `Sequential` chain A → B → C, if C references `results["A"]`, the edge B → C gets `dataFlow: true` (since A is a predecessor of C via the chain), but no new edge A → C is created. Data flows transitively through the chain — B must complete before C starts, and C reads A's result from the result projection.
2. **`Map` component edges**: A `Map` component's predecessor-to-first-mapped-child edge gets `dataFlow: true` if `Map.over` references a predecessor result. Each mapped child's edge from the `Map`'s predecessor gets `dataFlow: true` because the array data comes from a predecessor's output.
3. **Ambiguous references**: If `Operation.input` is a function that cannot be statically analyzed (e.g., `(results) => computeInput(results)` where `computeInput` is a closure), the inference defaults to `dataFlow: false`. Template authors can manually annotate with `dataFlow: true` as an override, though this should be rare.
4. **Function body analysis**: JavaScript function introspection is unreliable (minification, closures). Inference operates on the **AST** of the ujsx template during rendering, not on the runtime function body. This means that `Conditional.test` functions passed as closures from external code (not inline in the template) cannot have their references detected. For these cases, the string form (`Conditional.test = "operationName"`) should be used to ensure detectability.
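Rule 1 can be illustrated with a small sketch that marks edges in a `Sequential` chain. It assumes the referenced keys have already been extracted by the template-AST pass; `inferSequentialEdges`, `SequentialEdge`, and the `refs` map are hypothetical names, not part of `GraphologyHostConfig`.

```typescript
interface SequentialEdge {
  source: string;
  target: string;
  dataFlow: boolean;
}

// chain: node keys in Sequential order.
// refs: for each node key, the result keys its props reference
//       (assumed to be extracted already by the template-AST pass).
function inferSequentialEdges(
  chain: readonly string[],
  refs: ReadonlyMap<string, readonly string[]>,
): SequentialEdge[] {
  const edges: SequentialEdge[] = [];
  const upstream = new Set<string>(); // every node earlier in the chain
  let previous: string | undefined;
  for (const target of chain) {
    if (previous !== undefined) {
      const referenced = refs.get(target) ?? [];
      edges.push({
        source: previous,
        target,
        // Mark the existing incoming edge; never create a new A -> C edge.
        dataFlow: referenced.some((key) => upstream.has(key)),
      });
    }
    upstream.add(target);
    previous = target;
  }
  return edges;
}
```

For the chain A, B, C with C referencing `results["A"]`, this yields A to B with `dataFlow: false` and B to C with `dataFlow: true`, and no direct A to C edge.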
The `dataFlow` attribute propagates to the `TemplateEdgeAttrs` schema:
```typescript
const TemplateEdgeAttrs = Type.Object({
edgeType: Type.Union([Type.Literal("sequential"), Type.Literal("conditional")]),
condition: Type.Optional(Type.Unknown()),
dataFlow: Type.Optional(Type.Boolean({ default: false })),
});
```
This resolves OQ-01 and OQ-02 precisely: `typeCompat()` only runs on edges where `dataFlow: true`. Temporal-only edges bypass type checking entirely.
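A sketch of the gating this implies, assuming a checker is passed in as a callback. `checkDataFlowEdges`, `TemplateEdge`, and the `TypeCompatFn` signature are hypothetical; the real `typeCompat()` compares operation schemas and is not reproduced here.

```typescript
interface TemplateEdge {
  source: string;
  target: string;
  dataFlow?: boolean; // absent means false, matching the schema default
}
interface TypeCompatResult {
  compatible: boolean;
  detail?: string;
}
// Hypothetical checker signature, injected so the sketch stays self-contained.
type TypeCompatFn = (source: string, target: string) => TypeCompatResult;

function checkDataFlowEdges(
  edges: readonly TemplateEdge[],
  typeCompat: TypeCompatFn,
): Map<string, TypeCompatResult> {
  const results = new Map<string, TypeCompatResult>();
  for (const edge of edges) {
    // Temporal-only edges are skipped without invoking the checker.
    if (!(edge.dataFlow ?? false)) continue;
    results.set(`${edge.source}->${edge.target}`, typeCompat(edge.source, edge.target));
  }
  return results;
}
```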
## References


@@ -1,6 +1,6 @@
---
status: draft
last_updated: 2026-05-20
last_updated: 2026-05-21
---
# Host Configs
@@ -234,16 +234,19 @@ interface ReactiveContext {
statusSignals: Map<string, Signal<NodeStatus>>; // Status signals by key (owned by WorkflowReactiveRoot)
preconditions: Map<string, Computed<boolean>>; // Precondition computeds by key (owned by WorkflowReactiveRoot)
blockedByFailure: Map<string, Computed<boolean>>; // blockedByFailure computeds by key (owned by WorkflowReactiveRoot)
resultProjection: EventLogProjection; // Result projection from ADR-005 event log
parentMap: Map<string, string>; // Child → parent key mapping (for precondition computation)
siblingMap: Map<string, string[]>; // Parent → children keys (for structural context)
results: Map<string, Signal<unknown>>; // Operation output signals by key
results: Map<string, Computed<CallResult | undefined>>; // Result computeds by key (derived from event log)
}
```
The `ReactiveContext` is constructed during `ReactiveHostConfig` initialization. It receives the `operationRegistry` and empty maps. During `createInstance`, nodes and signals are registered in the context maps. After rendering completes, the context holds a complete index of the reactive workflow tree.
The `ReactiveContext` is constructed during `ReactiveHostConfig` initialization. It receives the `operationRegistry`, empty maps, and the `EventLogProjection` from the `WorkflowReactiveRoot`. During `createInstance`, nodes and signals are registered in the context maps. After rendering completes, the context holds a complete index of the reactive workflow tree.
**Important**: `statusSignals`, `preconditions`, and `blockedByFailure` are references to the `WorkflowReactiveRoot`'s maps. The `ReactiveHostConfig` does not own these signals — it looks them up during `createInstance` to wire `WorkflowNode` references. Disposal is the `WorkflowReactiveRoot`'s responsibility.
**Result projection (ADR-005)**: The `resultProjection` provides `getResult(nodeId)` which returns `CallResult | undefined`. This is derived from the event log, not from direct signal reads. `Conditional.test` and `Map.over` functions access predecessor results through this projection, ensuring they always see the most recent data — including retry results.
### createInstance
```typescript
@@ -435,7 +438,7 @@ Alternative: Create "virtual" nodes for structural containers that hold `signal<
### Conditional Test Evaluation
The `Conditional.test` prop can be a function or a string. At the template level, it's stored as a prop. At runtime, the reactive engine evaluates it as a `computed` that depends on referenced nodes' outputs. This evaluation needs access to the `WorkflowContext` (which holds the results of previous steps), which means the reactive engine must have a reference to the call graph or a results map.
The `Conditional.test` prop can be a function or a string. At the template level, it's stored as a prop. At runtime, the reactive engine evaluates it as a `computed` that depends on referenced nodes' outputs from the **result projection** (per ADR-005). This evaluation reads from `getResult(nodeId)` which derives from the event log, ensuring that `Conditional.test` always sees the most recent state — including retry results.
## Constraints
@@ -448,13 +451,13 @@ The `Conditional.test` prop can be a function or a string. At the template level
## Open Questions
1. **Should structural containers create "virtual" nodes in the reactive engine?** This would simplify precondition computation (every node has a status) but adds nodes that don't correspond to calls or operations.
1. ~~**Should structural containers create "virtual" nodes in the reactive engine?**~~ **Resolved (OQ-05)**: Containers stay transparent. Aggregate status for structural containers is computed as a projection from children's statuses, without requiring nodes in the event log or DAG. The `parentMap` and `siblingMap` in `ReactiveContext` provide the structural context for precondition computation.
2. **Should the GraphologyHostConfig produce a separate graph for edge types?** Currently all edge types (`sequential`, `conditional`, `typed`) share the same graph. An alternative is a separate graph per edge type, enabling type-specific queries without filtering.
3. **How does the ReactiveHostConfig interact with the call protocol?** When a node transitions to `ready`, the reactive engine needs to call `registry.execute()` or `PendingRequestMap.call()`. This bridges the reactive layer to the operation execution layer. The HostConfig's `createInstance` callback is one option; a separate `ExecutionEngine` class is another.
3. ~~**How does the ReactiveHostConfig interact with the call protocol?**~~ **Resolved (ADR-005)**: The reactive layer bridges to the call protocol through the event log. The hub coordinator appends call protocol events; the reactive layer projects them into status and results. The `ReactiveHostConfig` reads from the `EventLogProjection` interface (via `getStatus()` and `getResult()`), not from direct signal mutations by the coordinator.
4. **Should the reactive engine own the call graph?** Currently the call graph (from call-graph.md) and the reactive engine (from this doc) are separate concepts. But at runtime, every `<Operation>` in a template becomes a call graph node. Should the reactive engine populate the call graph as a side effect?
4. ~~**Should the reactive engine own the call graph?**~~ **Resolved (ADR-005)**: Neither owns the other. Both the call graph and the reactive engine are projections of the same event log. The call graph projects the structural view (who triggered whom). The reactive engine projects the behavioral view (what's running, what's blocked).
## References


@@ -16,41 +16,36 @@ Cross-cutting compilation of all unresolved questions across the flowgraph archi
## ADR-005 Impact
[ADR-005: Event Log as Single Source of Truth](decisions/005-event-log-as-source-of-truth.md) proposes an Execution Event Log pattern that resolves or reframes several open questions. Questions affected by ADR-005 are marked with `adr-005` in their status. Summary:
[ADR-005: Event Log as Single Source of Truth](decisions/005-event-log-as-source-of-truth.md) defines an Execution Event Log pattern that resolves or reframes several open questions. ADR-005 is now **Accepted**, and all questions it affects have been resolved:
| Question | ADR-005 Impact |
|----------|-----------------|
| OQ-01 | Reframed: incompatible edges only exist where there's data flow. Temporal-only edges don't need type checking. |
| OQ-02 | Reframed: type compatibility depth only applies to state-transfer edges, not notification edges. |
| OQ-06 | Resolved: the reactive layer bridges to the call protocol through the event log, not direct signal mutation. |
| OQ-07 | Resolved: call graph and reactive engine are both projections of the event log. Neither owns the other. |
| OQ-08 | Resolved: `depends_on` edges unnecessary; data dependencies expressed through result projection. |
| OQ-09 | Resolved: retries are natural append events, not state mutations. |
| OQ-10 | Reframed: policy question (abort running nodes?) becomes a projection configuration, not a hardcoded state machine rule. |
| Question | ADR-005 Impact | Final Resolution |
|----------|-----------------|-------------------|
| OQ-01 | Reframed → Resolved | Type-compat edges are added only for `dataFlow: true` edges. Temporal edges bypass type checking. |
| OQ-02 | Reframed → Resolved | Type checking scope narrows to state-transfer edges. Structured mismatch reporting confirmed. |
| OQ-05 | Independent → Resolved | Containers stay transparent. Aggregate status computed as projection from children. |
| OQ-06 | Resolved | The reactive layer bridges to call protocol through the event log. Hub appends events; reactive layer projects them. |
| OQ-07 | Resolved | Call graph and reactive engine are both projections of the event log. Neither owns the other. |
| OQ-08 | Resolved | `depends_on` edges unnecessary. Data dependencies expressed through result projection. |
| OQ-09 | Resolved | Retries are natural append events. New `requestId` per retry. |
| OQ-10 | Reframed → Resolved | Running node failure handling is a projection policy, not a state machine rule. Default: running nodes continue. |
## Theme 1: Edge Semantics and Type Compatibility
### OQ-01: Should `fromSpecs()` add ALL edges or only compatible ones?
- **Origin**: [operation-graph.md](operation-graph.md) Q1
- **Status**: reframed by ADR-005
- **Status**: resolved
- **Priority**: high — affects storage size, API surface, and diagnostic value
- **Options**:
- (a) Add both compatible and incompatible edges (current design). Pro: diagnostic information visible. Con: graph is larger.
- (b) Only add compatible edges, with a `potentialEdges()` query computing incompatible connections on demand. Pro: smaller graph. Con: loses diagnostic information.
- **Notes**: This decision affects `buildTypeEdges()` in [analysis.md](analysis.md) and `OperationEdgeAttrs` in [schema.md](schema.md). The `compatible: false` attribute on edges only makes sense if option (a) is chosen.
- **ADR-005 reframing**: Incompatible edges only exist on **state-transfer** edges (where data flows from A's output to B's input). **Temporal-only** edges (where B starts after A completes but doesn't use A's output) don't need type checking at all. This means option (b) may be correct for temporal edges, while option (a) is correct for state-transfer edges. The operation graph could distinguish these with an edge attribute.
- **Resolution**: Adopt option (a) for state-transfer edges, option (b) for temporal-only edges. Type-compatibility edges (with `compatible: true/false` attributes) are only added where data flows between operations. The `dataFlow` attribute on `TemplateEdgeAttrs` (resolved in ADR-005) determines which edges need type checking. For edges where `dataFlow: true`, both compatible and incompatible edges provide diagnostic value. For edges where `dataFlow: false`, no type-compat edge is needed, because temporal ordering carries no data whose types could mismatch.
- **Cross-references**: OQ-04
### OQ-02: How granular should type compatibility results be?
- **Origin**: [operation-graph.md](operation-graph.md) Q4, [analysis.md](analysis.md) Q1
- **Status**: reframed by ADR-005
- **Status**: resolved
- **Priority**: high — directly shapes the `typeCompat()` return type and `OperationEdgeAttrs`
- **Question (merged)**: How deep should `typeCompat` check? Should it be fully recursive? And should the result be `{ compatible, detail? }` or `{ compatible, mismatches: TypeMismatch[] }`?
- **Current design**: The schema already defines `TypeMismatch` with `{ path, expected, actual }` and `OperationEdgeAttrs` has an optional `mismatches` field. The analysis doc describes deep recursive structural comparison. But there's a tension: full recursive checking is more thorough but may produce false negatives for schemas with dynamic structures.
- **Notes**: The schema doc already has `mismatches?: TypeMismatch[]` in `OperationEdgeAttrs`. The analysis doc already defines `TypeCompatResult` with `mismatches`. This suggests the design has already converged toward structured mismatch reporting. What remains is confirming: (a) recursive depth limits, (b) handling of `Type.Unknown()` and complex types (unions, intersections), (c) whether the `detail` string field is still needed alongside `mismatches`.
- **ADR-005 reframing**: Type compatibility checking only applies to **state-transfer** edges (where A's output flows into B's input). **Temporal-only** edges (where B starts after A but doesn't use A's output) don't need type checking — their "compatibility" is trivially true. This means the operation graph should distinguish between edges that carry data and edges that only express ordering. `typeCompat()` only needs to run on state-transfer edges.
- **Resolution**: Type compatibility checking only applies to **state-transfer edges** (where A's output flows into B's input), as established by ADR-005's `dataFlow` attribute on `TemplateEdgeAttrs`. Temporal-only edges bypass type checking entirely (their "compatibility" is trivially true). The `typeCompat()` function returns `{ compatible, detail?, mismatches? }` for state-transfer edges only. The schema already has `mismatches?: TypeMismatch[]` in `OperationEdgeAttrs` — this design is confirmed. Remaining detail decisions (recursive depth limits, unknown/union type handling) are implementation concerns, not architecture decisions.
- **Cross-references**: OQ-01
### OQ-03: Should subscription operations be treated differently in type compatibility?
@@ -79,13 +74,19 @@ Cross-cutting compilation of all unresolved questions across the flowgraph archi
### OQ-05: Should `Sequential` and `Parallel` be transparent in the graph?
- **Origin**: [workflow-templates.md](workflow-templates.md) Q1, [host-configs.md](host-configs.md) Q1
- **Status**: open
- **Status**: resolved
- **Priority**: high — fundamental to how the DAG is structured and how the reactive engine computes preconditions
- **Question (merged)**: Currently, structural containers (`Sequential`, `Parallel`, `Conditional`) produce edges but no nodes. The reactive engine then has to reconstruct structural context to compute preconditions. Should they create "virtual" nodes instead?
- **Options**:
- (a) Transparent (current design): No nodes for containers. Edges carry the structure. Pro: smaller DAG, cleaner topology. Con: precondition computation needs structural context (parentStack, siblingMap).
- (b) Virtual nodes: Containers create nodes with `signal<NodeStatus>`. Pro: every node has a status and preconditions, simpler reactive engine. Con: more nodes, containers with no call protocol equivalent, slightly more complex graph queries.
- **Notes**: The host-configs doc identifies this as a "known gap": `Sequential`, `Parallel`, `Conditional` are transparent in the DAG but create complexity for the reactive engine's "previous sibling" precondition logic. The reactive-execution doc's `WorkflowReactiveRoot.initializeSignals()` assumes it operates on the flattened DAG (all nodes are operations), which aligns with option (a). The question is whether the reactive engine's context maps (`parentMap`, `siblingMap`) are sufficient or if virtual nodes would simplify things.
- **Resolution**: Keep containers transparent (current design). Structural containers do NOT create nodes in the DAG or events in the event log. Their aggregate status can be computed as a projection from their children's statuses:
- A `Sequential` is "completed" when all its children are completed/skipped
- A `Parallel` is "completed" when all its children are completed/skipped
- A `Conditional` is "completed" when its taken branch is completed/skipped
This resolution aligns with ADR-005's projection model: the event log records real call events, and projections compute derived state from them. Virtual nodes in the event log would pollute it with synthetic events that have no call protocol equivalent. Virtual nodes in the DAG would add structural overhead for what is already computable.
The `parentMap` and `siblingMap` in the `ReactiveContext` remain the mechanism for computing preconditions. These maps are derived from the template structure during rendering, not from the DAG. They provide the structural context that the transparent-DAG approach needs, without requiring container nodes.
- **Cross-references**: OQ-14 (partial re-rendering)
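The three completion rules in the resolution above can be sketched as pure functions over children's statuses. The full `NodeStatus` union and the `then`/`else` branch names are assumptions; only some of these status names appear in the docs.

```typescript
// Assumed status union for illustration.
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "skipped" | "failed" | "aborted";

const isDone = (s: NodeStatus): boolean => s === "completed" || s === "skipped";

// Sequential and Parallel share the same completion rule.
// Note: an empty container counts as completed in this sketch.
function containerCompleted(children: readonly NodeStatus[]): boolean {
  return children.every(isDone);
}

// A Conditional only looks at the branch that was taken.
function conditionalCompleted(
  branches: { then: readonly NodeStatus[]; else: readonly NodeStatus[] },
  taken: "then" | "else",
): boolean {
  return branches[taken].every(isDone);
}
```

Because these are plain functions of child statuses, they need no container node in the DAG and no synthetic event in the log.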
---
@@ -108,11 +109,6 @@ Cross-cutting compilation of all unresolved questions across the flowgraph archi
- **Priority**: high — affects the separation between flowgraph and the call protocol
- **Question**: Currently the call graph (from call-graph.md) and the reactive engine (from reactive-execution.md) are separate concepts. But at runtime, every `<Operation>` in a template becomes a call graph node. Should the reactive engine populate the call graph as a side effect?
- **ADR-005 resolution**: Neither owns the other. Both the call graph and the reactive status/result projections derive from the same event log. They are independent projections of the same source of truth. The call graph projects the structural view (who triggered whom). The reactive engine projects the behavioral view (what's running, what's blocked). You can have one without the other, or both simultaneously.
- **Question**: Currently the call graph (from call-graph.md) and the reactive engine (from reactive-execution.md) are separate concepts. But at runtime, every `<Operation>` in a template becomes a call graph node. Should the reactive engine populate the call graph as a side effect?
- **Options**:
- (a) Separate: Call graph is populated by call protocol events. Reactive engine uses signals only. Coordinator bridges them.
- (b) Unified: Reactive engine creates call graph nodes when nodes transition to `running`, updates them on completion. Call graph is derived from reactive state.
- **Notes**: Option (a) matches ADR-003 (flowgraph doesn't do storage/persistence) and the current design where the call graph is populated by `updateFromEvent()`. Option (b) would couple the reactive engine to the call protocol. The current design's separation is cleaner but requires the coordinator to maintain both reactive state and call graph state.
### OQ-08: Should `depends_on` edges be auto-populated from workflow templates?
@@ -142,20 +138,10 @@ Cross-cutting compilation of all unresolved questions across the flowgraph archi
### OQ-10: What happens to running nodes when a predecessor fails?
- **Origin**: [reactive-execution.md](reactive-execution.md) Q6
- **Status**: reframed by ADR-005
- **Status**: resolved
- **Priority**: high — affects failure propagation correctness
- **Question**: The current spec transitions `idle` and `waiting` nodes to `aborted` when `blockedByFailure` becomes true. But what about a node that's already `running`? Should it be cancelled?
- **Options**:
- (a) Running nodes are NOT affected. A predecessor's failure blocks dependents that haven't started, but running nodes continue. The coordinator can cancel them via `prm.abort()` if desired.
- (b) Running nodes automatically transition to `aborted`. This requires the `effect()` to check for running nodes.
- **ADR-005 reframing**: This becomes a policy configuration of the status projection, not a hardcoded state machine rule. The event log records the failure fact. The projection decides: do we abort running nodes that depend on the failed node? The answer depends on the workflow's failure strategy. Option (a) is the default (running nodes continue), but a policy could specify otherwise. The event log makes both strategies expressible without changing the underlying mechanism — only the projection logic changes.
- **Cross-references**: OQ-09 (retries need to know if a running node can be restarted)
- **Question**: The current spec transitions `idle` and `waiting` nodes to `aborted` when `blockedByFailure` becomes true. But what about a node that's already `running`? Should it be cancelled?
- **Options**:
- (a) Running nodes are NOT affected. A predecessor's failure blocks dependents that haven't started, but running nodes continue. The coordinator can cancel them via `prm.abort()` if desired.
- (b) Running nodes automatically transition to `aborted`. This requires the `effect()` to check for running nodes.
- **Notes**: Option (a) is consistent with "failure follows dependency edges, not structural scope" — a running node has already passed its preconditions, so it should be allowed to complete. The coordinator can choose to abort it. Option (b) would be more aggressive. The reactive-execution doc's constraint says "abort is immediate in signals, delayed in protocol," suggesting option (a) is intended.
- **Cross-references**: OQ-09 (retries need to know if a running node can be restarted)
- **Resolution**: This is a **policy configuration** of the status projection, not a hardcoded state machine rule. The event log records failure facts. The projection decides how to handle running nodes that depend on a failed node. The default policy (option a from the original framing): running nodes are NOT affected by a predecessor's failure — only idle/waiting nodes transition to `aborted`. A more aggressive policy could abort running nodes, but this requires explicit configuration. The event log makes both strategies expressible without changing the underlying mechanism — only the projection logic changes. This aligns with ADR-005's principle that projections encode policy while the log records facts.
- **Cross-references**: OQ-09 (retries are new events, not state mutations)
---
@@ -352,16 +338,16 @@ Cross-cutting compilation of all unresolved questions across the flowgraph archi
| ID | Question | Origin | Priority | Status |
|----|----------|--------|----------|--------|
| OQ-01 | All edges or only compatible edges? | operation-graph | high | reframed by ADR-005 |
| OQ-02 | Type compatibility depth and granularity | operation-graph, analysis | high | reframed by ADR-005 |
| OQ-01 | All edges or only compatible edges? | operation-graph | high | resolved |
| OQ-02 | Type compatibility depth and granularity | operation-graph, analysis | high | resolved |
| OQ-03 | Subscription operations in type compat | operation-graph | medium | open |
| OQ-04 | `edgeType` on all edges? | schema | medium | open |
| OQ-05 | Structural container transparency | workflow-templates, host-configs | high | open |
| OQ-06 | Template ↔ call protocol interaction | workflow-templates, host-configs | high | resolved by ADR-005 |
| OQ-07 | Should reactive engine own call graph? | host-configs | high | resolved by ADR-005 |
| OQ-08 | Auto-populate `depends_on` from templates? | call-graph | medium | resolved by ADR-005 |
| OQ-09 | Retries at signal level | reactive-execution | high | resolved by ADR-005 |
| OQ-10 | Running nodes when predecessor fails | reactive-execution | high | reframed by ADR-005 |
| OQ-05 | Structural container transparency | workflow-templates, host-configs | high | resolved |
| OQ-06 | Template ↔ call protocol interaction | workflow-templates, host-configs | high | resolved |
| OQ-07 | Should reactive engine own call graph? | host-configs | high | resolved |
| OQ-08 | Auto-populate `depends_on` from templates? | call-graph | medium | resolved |
| OQ-09 | Retries at signal level | reactive-execution | high | resolved |
| OQ-10 | Running nodes when predecessor fails | reactive-execution | high | resolved |
| OQ-11 | OR logic for preconditions | reactive-execution | medium | open |
| OQ-12 | `maxConcurrency` interaction with preconditions | reactive-execution | medium | open |
| OQ-13 | `blockedByFailure` vs single computed | reactive-execution | low | open |
@@ -383,17 +369,21 @@ Cross-cutting compilation of all unresolved questions across the flowgraph archi
### Priority Assessment
**Resolved (ADR-005)**:
- ~~OQ-01: All edges or only compatible~~ — resolved: `typeCompat()` runs only on `dataFlow: true` edges
- ~~OQ-02: Type compatibility depth~~ — resolved: type checking only for state-transfer edges
- ~~OQ-05: Structural container transparency~~ — resolved: containers stay transparent, aggregate status is a projection
- ~~OQ-06: Template ↔ call protocol~~ — resolved: bridge through event log
- ~~OQ-07: Reactive engine owns call graph?~~ — resolved: both are projections of event log
- ~~OQ-08: Auto-populate `depends_on` from templates?~~ — resolved: unnecessary, data flows through result projection
- ~~OQ-09: Retries at signal level~~ — resolved: append events, not state mutations
- ~~OQ-10: Running node failure handling~~ — resolved: projection policy, default is running nodes continue
**High priority** (should resolve before implementation):
- ~~OQ-01: All edges or only compatible~~ — reframed by ADR-005: incompatible edges only exist on state-transfer edges
- ~~OQ-02: Type compatibility depth~~ — reframed by ADR-005: type checking only for state-transfer edges
- OQ-05: Structural container transparency — fundamental to DAG and reactive engine
- ~~OQ-06: Template ↔ call protocol~~ — resolved by ADR-005
- ~~OQ-07: Reactive engine owns call graph?~~ — resolved by ADR-005
- ~~OQ-09: Retries~~ — resolved by ADR-005
- ~~OQ-10: Running node failure handling~~ — reframed by ADR-005: policy configuration, not hardcoded
- (all high-priority questions have been resolved)
**Medium priority** (should resolve before v1 release):
- OQ-03, OQ-04, OQ-08, OQ-11, OQ-12, OQ-14, OQ-17, OQ-20, OQ-21, OQ-22, OQ-29
- OQ-03, OQ-04, OQ-11, OQ-12, OQ-14, OQ-17, OQ-20, OQ-21, OQ-22, OQ-29
**Low priority** (can defer or decide during implementation):
- OQ-13, OQ-15, OQ-16, OQ-18, OQ-19, OQ-24, OQ-25, OQ-26, OQ-27, OQ-28
@@ -402,11 +392,13 @@ Cross-cutting compilation of all unresolved questions across the flowgraph archi
These groups of questions interact with each other and should be resolved together:
1. **Edge semantics group** (OQ-01, OQ-02, OQ-04): All affect the operation graph's edge structure and the type compatibility API.
1. **~~Edge semantics group~~** (OQ-01, OQ-02, OQ-04): ~~All affect the operation graph's edge structure and the type compatibility API.~~ **Partially resolved by ADR-005.** OQ-01 and OQ-02 are resolved (type checking runs only on `dataFlow: true` edges). OQ-04 remains open (`edgeType` on all edges).
2. **Call protocol integration group** (OQ-06, OQ-07, OQ-08): All about how flowgraph connects to the live call protocol.
2. **~~Call protocol integration group~~** (OQ-06, OQ-07, OQ-08): ~~All about how flowgraph connects to the live call protocol.~~ **Resolved by ADR-005.** All three resolved: bridge through event log, projections instead of ownership, data flow through result projection.
3. **Failure semantics group** (OQ-09, OQ-10): Both about how failure and retry propagate through the reactive engine. Resolving one may resolve or constrain the other.
3. **~~Failure semantics group~~** (OQ-09, OQ-10): ~~Both about how failure and retry propagate through the reactive engine.~~ **Resolved by ADR-005.** Retries are append events; running node failure is a projection policy.
4. **Scheduling group** (OQ-11, OQ-12): Both about how preconditions interact with scheduling constraints.
4. **Scheduling group** (OQ-11, OQ-12): Both about how preconditions interact with scheduling constraints.


@@ -1,11 +1,11 @@
---
status: draft
last_updated: 2026-05-20
last_updated: 2026-05-21
---
# Reactive Execution
Signal-driven status propagation, computed preconditions, and failure propagation for workflow template execution.
Signal-driven status propagation, computed preconditions, and failure propagation for workflow template execution, built on the event log as single source of truth (ADR-005).
## Overview
@@ -16,30 +16,160 @@ The reactive execution layer bridges workflow template structure (DAG) to runtim
- Failure propagation follows dependency edges — a failed predecessor causes downstream dependents to abort, while independent branches continue running
- Conditionals can serve as error boundaries, catching failures and redirecting to fallback paths
This layer does NOT execute operations directly. It provides reactive state that the hub coordinator reads and writes. The coordinator calls `registry.execute()` when a node's preconditions are met, and updates the node's status signal when the call completes or fails.
### Event Log as Source of Truth
Per [ADR-005](decisions/005-event-log-as-source-of-truth.md), the reactive execution layer is a **projection** of the call protocol event log. The hub coordinator appends call protocol events (`call.requested`, `call.responded`, `call.error`, `call.aborted`, `call.completed`), and the reactive layer derives its state from these events:
```
┌─────────────────────────────────────────────┐
│ Execution Event Log │
│ (append-only CallEventMapValue[] — │
│ the call protocol events) │
└──────────────────┬──────────────────────────┘
┌─────────────┼──────────────┐
│ │ │
▼ ▼ ▼
┌─────────┐ ┌──────────┐ ┌──────────┐
│ Status │ │ Result │ │ Call │
│ Proj. │ │ Proj. │ │ Graph │
│ │ │ │ │ Proj. │
│ nodeId: │ │ nodeId: │ │ │
│ status │ │ output │ │ nodes + │
│ │ │ │ │ edges │
└────┬────┘ └────┬─────┘ └──────────┘
│ │
▼ ▼
┌───────────────────────────────────────────┐
│ Reactive Execution Layer │
│ │
│ preconditions → "does the log show │
│ all predecessors │
│ completed?" │
│ │
│ result resolution → "does the log │
│ have A's output?" │
│ │
│ Conditional.test → reads from result proj. │
│ Map.over → reads from result proj. │
└───────────────────────────────────────────┘
```
**The hub coordinator appends events; the reactive layer projects them.** This replaces the previous design where the coordinator directly set signal values. Under ADR-005, the coordinator's responsibility is:
1. Start a call (which emits `call.requested`)
2. Receive the result (which emits `call.responded` or `call.error`)
3. Append these events to the log
The reactive layer's projections derive `NodeStatus` and `CallResult` from the log. The coordinator no longer calls `status.value = "running"` — the status projection derives this from `call.requested` events.
### Hybrid Status Model
While ADR-005 positions the event log as the single source of truth, not all `NodeStatus` values correspond to call protocol events. The model is hybrid:
**Event-log-driven statuses** (derived directly from `CallEventMapValue` events):
| Call protocol event | Derived NodeStatus |
|---------------------|--------------------|
| `call.requested` | `running` |
| `call.responded` | `completed` |
| `call.error` | `failed` |
| `call.aborted` | `aborted` |
**Projection-driven statuses** (derived from the event log combined with template structure and reactive state):
| NodeStatus | Derived from |
|------------|-------------|
| `idle` | No events for this node yet; no predecessors are running |
| `waiting` | At least one predecessor is `running`, none have completed |
| `ready` | All predecessors are `completed` or `skipped`; no `call.requested` event yet |
| `skipped` | Conditional branch not taken (template-level decision, no call event) |
**Signal-mutation statuses** (set by the reactive engine, not derived from events):
| Trigger | NodeStatus | Rationale |
|---------|------------|-----------|
| `blockedByFailure` effect | `aborted` | A predecessor failed; the node is aborted by failure propagation. This is a projection policy decision, not a call protocol event. |
This distinction is important: the event log records **what happened at the call level**, while the reactive engine derives **workflow-level state** from the log combined with template structure. The `WorkflowReactiveRoot` maintains `signal<NodeStatus>` values, but these signals are set by:
1. The status projection when call events arrive (event-log-driven)
2. The reactive engine for workflow-level states (projection-driven or signal-mutation)
The `getStatus(nodeId)` method on `EventLogProjection` checks the event log first (for call-level statuses), then falls back to the signal map (for workflow-level statuses). The `getResult(nodeId)` method is purely event-log-driven.
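The lookup order described above (event log first, signal map as fallback) can be sketched as follows. This is a minimal illustration, not the actual implementation: `CallEvent`, `EVENT_TO_STATUS`, and `resolveStatus` are hypothetical names, the event shape is reduced to two fields, and a plain `Map` stands in for the signal map.

```typescript
// Assumed, simplified shapes; the real CallEventMapValue and NodeStatus are defined elsewhere.
type NodeStatus =
  | "idle" | "waiting" | "ready" | "skipped"        // workflow-level
  | "running" | "completed" | "failed" | "aborted"; // call-level

type CallEventType =
  | "call.requested" | "call.responded" | "call.error" | "call.aborted";

interface CallEvent {
  type: CallEventType;
  requestId: string;
}

// Mapping taken from the "event-log-driven statuses" table.
const EVENT_TO_STATUS: Record<CallEventType, NodeStatus> = {
  "call.requested": "running",
  "call.responded": "completed",
  "call.error": "failed",
  "call.aborted": "aborted",
};

// Hypothetical helper: consult the event log first, then the workflow-level map.
function resolveStatus(
  nodeId: string,
  eventLog: readonly CallEvent[],
  nodeKeyToRequestId: ReadonlyMap<string, string>,
  workflowStatus: ReadonlyMap<string, NodeStatus>, // stand-in for the signal map
): NodeStatus {
  const requestId = nodeKeyToRequestId.get(nodeId);
  if (requestId !== undefined) {
    // Scan backwards for the most recent event of the node's current request.
    for (let i = eventLog.length - 1; i >= 0; i--) {
      const e = eventLog[i];
      if (e !== undefined && e.requestId === requestId) return EVENT_TO_STATUS[e.type];
    }
  }
  // No call-level events: fall back to the workflow-level status, defaulting to idle.
  return workflowStatus.get(nodeId) ?? "idle";
}

const eventLog: CallEvent[] = [
  { type: "call.requested", requestId: "r1" },
  { type: "call.responded", requestId: "r1" },
];
const requestIds = new Map<string, string>([["fetch-data", "r1"]]);
const workflowStatus = new Map<string, NodeStatus>([["notify-error", "skipped"]]);

const fetchStatus = resolveStatus("fetch-data", eventLog, requestIds, workflowStatus);
const notifyStatus = resolveStatus("notify-error", eventLog, requestIds, workflowStatus);
const unknownStatus = resolveStatus("other", eventLog, requestIds, workflowStatus);
```

In this sketch `fetch-data` resolves from the log, `notify-error` resolves from the workflow-level map, and a node with neither defaults to `idle`.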
## ReactiveRoot for Workflows
```typescript
class WorkflowReactiveRoot {
class WorkflowReactiveRoot implements EventLogProjection {
private statusMap: Map<string, Signal<NodeStatus>>;
private preconditions: Map<string, Computed<boolean>>;
private blockedByFailure: Map<string, Computed<boolean>>;
private resultMap: Map<string, Computed<CallResult | undefined>>;
private graph: DirectedGraph;
private effectDisposers: (() => void)[];
private eventLog: CallEventMapValue[];
private nodeKeyToRequestId: Map<string, string>;
private failurePolicy: FailurePolicy;
constructor(graph: DirectedGraph) {
constructor(graph: DirectedGraph, options?: { failurePolicy?: FailurePolicy }) {
this.graph = graph;
this.statusMap = new Map();
this.preconditions = new Map();
this.blockedByFailure = new Map();
this.resultMap = new Map();
this.effectDisposers = [];
this.eventLog = [];
this.nodeKeyToRequestId = new Map();
this.failurePolicy = options?.failurePolicy ?? "continue-running";
this.initializeSignals();
}
}
```
`WorkflowReactiveRoot` wraps the reactive state for an entire workflow execution. It takes the structural DAG (from the GraphologyHost) and creates reactive state for each operation node.
`WorkflowReactiveRoot` wraps the reactive state for an entire workflow execution. It takes the structural DAG (from the GraphologyHost) and creates reactive state for each operation node. It implements the `EventLogProjection` interface from ADR-005, meaning the hub coordinator appends call protocol events and the root derives status and results from them.
### FailurePolicy
The failure policy determines what happens to running nodes when a predecessor fails. Per ADR-005 and OQ-10, this is a **projection policy**, not a hardcoded rule:
```typescript
type FailurePolicy =
| "continue-running" // Running nodes continue. Only idle/waiting dependents abort. (default)
| "abort-dependents"; // Running dependents of the failed node also abort.
```
The default policy (`continue-running`) means a node that has already started execution completes normally, even if a sibling or predecessor fails. Only nodes that haven't started (`idle` or `waiting`) transition to `aborted`.
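As a rough illustration of how the two policies differ, the sketch below selects which dependents of a failed node would transition to `aborted`. `dependentsToAbort` is a hypothetical helper, and the dependents are passed in as a plain map of node ID to current status rather than read from signals.

```typescript
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "failed" | "aborted" | "skipped";

type FailurePolicy = "continue-running" | "abort-dependents";

// Hypothetical helper: which dependents of a failed node move to "aborted"?
function dependentsToAbort(
  dependents: ReadonlyMap<string, NodeStatus>,
  policy: FailurePolicy,
): string[] {
  // Under the default policy only nodes that have not started are abortable.
  const abortable: NodeStatus[] =
    policy === "abort-dependents"
      ? ["idle", "waiting", "running"]
      : ["idle", "waiting"];
  const selected: string[] = [];
  dependents.forEach((nodeStatus, nodeId) => {
    if (abortable.indexOf(nodeStatus) !== -1) selected.push(nodeId);
  });
  return selected;
}

// Dependents of a failed node "A" at the moment the failure is recorded.
const dependentsOfFailed = new Map<string, NodeStatus>([
  ["B", "waiting"],
  ["C", "running"],
  ["D", "completed"],
]);

const defaultAborts = dependentsToAbort(dependentsOfFailed, "continue-running");
const aggressiveAborts = dependentsToAbort(dependentsOfFailed, "abort-dependents");
```

With the default policy only `B` is selected; with `abort-dependents` both `B` and `C` are. The event log is identical in both cases, which is the point of treating this as projection policy.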
### EventLogProjection Interface
```typescript
interface EventLogProjection {
/** Append an event. Events are processed idempotently. */
append(event: CallEventMapValue): void;
/** Current status of a node, derived from the most recent event. */
getStatus(nodeId: string): NodeStatus;
/** Result of a completed node, derived from call.responded events. */
getResult(nodeId: string): CallResult | undefined;
/** All events for a node, in order. */
getEvents(nodeId: string): CallEventMapValue[];
}
```
The `append()` method is the primary entry point for the hub coordinator. When a call protocol event arrives (`call.requested`, `call.responded`, etc.), the coordinator appends it to the log. The projections automatically update: `getStatus()` scans the log for the most recent event per node, and `getResult()` extracts the output from `call.responded` events.
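A minimal in-memory sketch of this interface is shown below, under several assumptions: the event shapes are simplified stand-ins for `CallEventMapValue`, the class name `InMemoryEventLogProjection` and its `bind()` method are hypothetical, and only call-level statuses are derived (workflow-level statuses are out of scope here).

```typescript
// Assumed, simplified event and result shapes for illustration only.
type CallEvent =
  | { type: "call.requested"; requestId: string; operationId: string; input: unknown }
  | { type: "call.responded"; requestId: string; output: unknown }
  | { type: "call.error"; requestId: string; error: { code: string; message: string } }
  | { type: "call.aborted"; requestId: string };

type CallLevelStatus = "idle" | "running" | "completed" | "failed" | "aborted";

interface CallResult {
  status: "completed" | "failed" | "aborted";
  output?: unknown;
  error?: { code: string; message: string };
}

class InMemoryEventLogProjection {
  private readonly entries: CallEvent[] = [];
  // Hypothetical reverse index: requestId -> node key (covers every attempt).
  private readonly requestToNode = new Map<string, string>();

  /** Associate a request with a node before its events are appended. */
  bind(nodeId: string, requestId: string): void {
    this.requestToNode.set(requestId, nodeId);
  }

  append(e: CallEvent): void {
    this.entries.push(e); // append-only: existing entries are never edited
  }

  getEvents(nodeId: string): CallEvent[] {
    return this.entries.filter(e => this.requestToNode.get(e.requestId) === nodeId);
  }

  getStatus(nodeId: string): CallLevelStatus {
    const nodeEvents = this.getEvents(nodeId);
    const last = nodeEvents[nodeEvents.length - 1];
    if (last === undefined) return "idle";
    switch (last.type) {
      case "call.requested": return "running";
      case "call.responded": return "completed";
      case "call.error": return "failed";
      case "call.aborted": return "aborted";
    }
  }

  getResult(nodeId: string): CallResult | undefined {
    const nodeEvents = this.getEvents(nodeId);
    const last = nodeEvents[nodeEvents.length - 1];
    if (last === undefined || last.type === "call.requested") return undefined;
    if (last.type === "call.responded") return { status: "completed", output: last.output };
    if (last.type === "call.error") return { status: "failed", error: last.error };
    return { status: "aborted" };
  }
}

const projection = new InMemoryEventLogProjection();
projection.bind("fetch-data", "r1");
projection.append({ type: "call.requested", requestId: "r1", operationId: "fetchData", input: {} });
const statusWhileRunning = projection.getStatus("fetch-data");
projection.append({ type: "call.responded", requestId: "r1", output: { rows: 3 } });
const finalStatus = projection.getStatus("fetch-data");
const finalResult = projection.getResult("fetch-data");
```

The sketch recomputes from the array on every read; the design in this document wraps the same derivation in `computed()` so that dependents re-evaluate reactively.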
### Request ID Mapping
The event log uses `requestId` (from the call protocol), while the reactive engine uses node keys (from the template DAG). The `nodeKeyToRequestId` map bridges these:
```typescript
// When starting a call:
const requestId = crypto.randomUUID();
workflowRoot.nodeKeyToRequestId.set(nodeKey, requestId);
// When appending events:
workflowRoot.append({ type: "call.requested", requestId, operationId, input, timestamp: now() });
```
This mapping is necessary because a single template node may have multiple requests (retries), and the event log records all of them.
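One way to keep track of several requests per node is sketched below. This is an assumption-laden illustration: `RequestIdRegistry` and its methods are hypothetical, whereas the design above stores only the current `requestId` per node in `nodeKeyToRequestId`. Request IDs are passed in as fixed strings to keep the example deterministic; in the coordinator they would come from `crypto.randomUUID()`.

```typescript
// Hypothetical registry that remembers every attempt, not only the current one.
class RequestIdRegistry {
  private readonly current = new Map<string, string>();    // node key -> current requestId
  private readonly attempts = new Map<string, string[]>(); // node key -> every requestId, oldest first
  private readonly owner = new Map<string, string>();      // requestId -> node key

  /** Record a new attempt (first call or retry) for a node. */
  startAttempt(nodeKey: string, requestId: string): void {
    this.current.set(nodeKey, requestId);
    this.attempts.set(nodeKey, (this.attempts.get(nodeKey) ?? []).concat(requestId));
    this.owner.set(requestId, nodeKey);
  }

  /** The requestId whose events currently determine the node's status. */
  currentRequestId(nodeKey: string): string | undefined {
    return this.current.get(nodeKey);
  }

  /** Every requestId ever issued for the node, useful for audit views. */
  allRequestIds(nodeKey: string): string[] {
    return this.attempts.get(nodeKey) ?? [];
  }

  /** Reverse lookup used when an event arrives carrying only a requestId. */
  nodeKeyFor(requestId: string): string | undefined {
    return this.owner.get(requestId);
  }
}

const registry = new RequestIdRegistry();
registry.startAttempt("fetch-data", "req-1"); // first attempt
registry.startAttempt("fetch-data", "req-2"); // retry after a failure
```

Keeping the reverse index means events from an earlier attempt can still be attributed to their node, even though they no longer drive its status.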
### initializeSignals()
@@ -71,9 +201,51 @@ private initializeSignals(): void {
});
});
// Result: derived from the event log's result projection
// Uses the latest terminal event of the node's current request (respects retries)
const result = computed(() => {
const requestId = this.nodeKeyToRequestId.get(node);
if (!requestId) return undefined;
const nodeEvents = this.eventLog
.filter(e => "requestId" in e && e.requestId === requestId);
// Find the most recent terminal event (call.responded, call.error, or call.aborted).
// Events are in chronological order, so Array.prototype.findLast (ES2023) would also work;
// the backward loop below avoids depending on it.
let latestTerminalEvent: CallEventMapValue | undefined;
for (let i = nodeEvents.length - 1; i >= 0; i--) {
const e = nodeEvents[i];
if (e.type === "call.responded" || e.type === "call.error" || e.type === "call.aborted") {
latestTerminalEvent = e;
break;
}
}
if (!latestTerminalEvent) return undefined;
if (latestTerminalEvent.type === "call.error") {
return {
status: "failed",
output: undefined,
error: latestTerminalEvent.error,
} satisfies CallResult;
}
if (latestTerminalEvent.type === "call.responded") {
return {
status: "completed",
output: latestTerminalEvent.output,
} satisfies CallResult;
}
if (latestTerminalEvent.type === "call.aborted") {
return {
status: "aborted",
output: undefined,
} satisfies CallResult;
}
return undefined;
});
this.statusMap.set(node, status);
this.preconditions.set(node, preconditions);
this.blockedByFailure.set(node, blockedByFailure);
this.resultMap.set(node, result);
}
}
```
@@ -82,11 +254,23 @@ For each operation node in the DAG:
1. Create a `signal<NodeStatus>` starting at `"idle"`
2. Create a `computed<boolean>` that's `true` when all predecessor nodes have status `"completed"` (or `"skipped"` — a skipped node satisfies its dependents' preconditions)
3. Create a `computed<NodeStatus | null>` that detects whether any predecessor has failed or been aborted, triggering a cascade
4. Register an abort function that cascades to all descendants
4. Create a `computed<CallResult | undefined>` that derives the node's result from the event log (for use by `Conditional.test` and `Map.over`)
5. Register an abort function that cascades to all descendants
### Status lifecycle
The signal-based status lifecycle mirrors `CallStatus` with workflow-specific additions:
The signal-based status lifecycle mirrors `CallStatus` with workflow-specific additions. Under ADR-005, status transitions are **derived from the event log** — the coordinator appends events, and the status projection maps events to states:
| Event log signals | NodeStatus | Meaning |
|-------------------|------------|---------|
| (no events) | `idle` | Node just created, no call activity yet |
| Predecessor events arriving | `waiting` | At least one predecessor is running, none have completed yet |
| All predecessors completed/skipped | `ready` | All preconditions met, eligible to start |
| `call.requested` received | `running` | Call executing |
| `call.responded` received | `completed` | Call succeeded |
| `call.error` received | `failed` | Call failed (uncaught error) |
| `call.aborted` received | `aborted` | Call cancelled |
| Conditional branch not taken | `skipped` | Node deliberately bypassed; no call event is emitted |
```
┌──────┐
@@ -106,13 +290,14 @@ The signal-based status lifecycle mirrors `CallStatus` with workflow-specific ad
│ │ └──────────►│ready │
│ │ └──┬───┘
│ │ │ hub starts call
│ │ │ (appends call.requested)
│ │ ▼
│ │ ┌────────┐
│ │ │running │──── ──── ──── ────►
│ │ └──┬──┬──┘ │
│ │ │ │ │
│ │ call │ │ call │ call
│ │ completed │ │ failed │ aborted
│ │ responded │ │ failed │ aborted
│ │ │ │ │
│ │ ▼ ▼ ▼
│ │ ┌───────────┐ ┌──────┐ ┌────────┐
@@ -134,37 +319,18 @@ The signal-based status lifecycle mirrors `CallStatus` with workflow-specific ad
└─── all are terminal states
```
Full transition rules:
### Retry semantics (ADR-005)
Retries are natural with the event log. A retry is NOT a state mutation — it's a new sequence of events appended to the log:
```
idle → waiting (predecessor starts running)
idle → ready (no predecessors — root node)
waiting → ready (all predecessors completed or skipped)
waiting → aborted (predecessor failed and failure is uncaught)
ready → running (hub starts the call)
running → completed (call succeeded)
running → failed (call threw an error)
running → aborted (call cancelled externally)
failed → [terminal] (no further transitions)
aborted → [terminal] (no further transitions)
skipped → [terminal] (conditional branch not taken)
completed → [terminal] (no further transitions)
call.requested(A, reqId=1) → fact: A was requested
call.error(A, reqId=1) → fact: A failed on first attempt
call.requested(A, reqId=2) → fact: A was retried with a new request
call.responded(A, reqId=2) → fact: A succeeded on retry
```
| Status | Meaning | Signal trigger |
|--------|---------|---------------|
| `idle` | Node just created, no predecessor activity yet | Initial state |
| `waiting` | At least one predecessor is running, none have completed yet | Any predecessor status change |
| `ready` | All predecessors completed or skipped (preconditions met) | `computed` resolves to `true` |
| `running` | Call executing | Hub sets `status.value = "running"` |
| `completed` | Call succeeded | Hub sets `status.value = "completed"` |
| `failed` | Call failed (uncaught error) | Hub sets `status.value = "failed"` |
| `aborted` | Call cancelled, or cascaded from failed predecessor | Hub or cascade sets `status.value = "aborted"` |
| `skipped` | Conditional branch not taken | Conditional evaluation sets this |
The key distinction between `failed` and `aborted`:
- **`failed`** means the operation itself threw an error. The node is the *source* of the failure.
- **`aborted`** means the operation was cancelled or a predecessor failed. The node is a *victim* of failure propagation.
The status projection derives the current state by scanning for the **most recent event per node**. No `retried` status is needed and no state machine is mutated; the log preserves the full history. The `nodeKeyToRequestId` map tracks which `requestId` corresponds to each node's current attempt.
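The four-event retry sequence above can be replayed step by step to see the derived status change without any event being edited. The sketch assumes each event carries a `nodeId` (as the `call.requested(A, reqId=1)` notation suggests); `AttemptEvent`, `STATUS_BY_EVENT`, and `statusOf` are illustrative names.

```typescript
type AttemptEventType = "call.requested" | "call.responded" | "call.error" | "call.aborted";
type CallLevelStatus = "idle" | "running" | "completed" | "failed" | "aborted";

// Assumed shape: the node identity is carried on the event for simplicity.
interface AttemptEvent {
  type: AttemptEventType;
  nodeId: string;
  requestId: string;
}

const STATUS_BY_EVENT: Record<AttemptEventType, CallLevelStatus> = {
  "call.requested": "running",
  "call.responded": "completed",
  "call.error": "failed",
  "call.aborted": "aborted",
};

// Status is whatever the most recent event for the node says.
function statusOf(log: readonly AttemptEvent[], nodeId: string): CallLevelStatus {
  for (let i = log.length - 1; i >= 0; i--) {
    const e = log[i];
    if (e !== undefined && e.nodeId === nodeId) return STATUS_BY_EVENT[e.type];
  }
  return "idle";
}

const retryLog: AttemptEvent[] = [
  { type: "call.requested", nodeId: "A", requestId: "1" }, // first attempt
  { type: "call.error",     nodeId: "A", requestId: "1" }, // first attempt fails
  { type: "call.requested", nodeId: "A", requestId: "2" }, // retry with a new request
  { type: "call.responded", nodeId: "A", requestId: "2" }, // retry succeeds
];

// Derived status after each event is appended.
const timeline = retryLog.map((_, i) => statusOf(retryLog.slice(0, i + 1), "A"));
// timeline: ["running", "failed", "running", "completed"]
```

The `failed` state is visible in the timeline but is never "reset"; it is simply superseded by later events, and all four events remain in the log.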
## Computed Preconditions
@@ -330,13 +496,13 @@ h(Sequential, {},
```
If `fetch-data` fails:
1. The `Conditional`'s `test` function receives the results map including `fetch-data`'s status
1. The `Conditional`'s `test` function receives the results map from the **result projection** (derived from the event log)
2. `test` evaluates to `false` (the operation failed)
3. The `then` branch transitions to `skipped`
4. The `else` branch (`notify-error`) becomes `ready`
3. The `then`-branch transitions to `skipped`
4. The `else`-branch (`notify-error`) becomes `ready`
5. Downstream nodes after the `Conditional` see the `Conditional` as `completed` (it resolved successfully, just on a different branch)
This makes `Conditional` a **caught error boundary**. The failure is handled — downstream nodes don't see a cascade because the `Conditional` resolved successfully.
The result projection (from ADR-005) provides `CallResult` values to `Conditional.test` and `Map.over`. These are computed from the event log, not from direct signal reads. This ensures that `Conditional.test` always sees the most recent state — if a node is retried, the test sees the retry's result, not the original failure.
Without a `Conditional`, the failure is **uncaught**. It cascades through dependency edges to all dependents, which transition to `aborted`.
@@ -406,11 +572,17 @@ function callStatusToNodeStatus(callStatus: CallStatus): NodeStatus {
}
```
## Effect-Driven Execution
## Event-Driven Execution
The hub coordinator uses two `effect()`s per node — one for starting when preconditions are met, and one for aborting when failure propagates:
Under ADR-005, the hub coordinator's responsibility shifts from directly setting signal values to **appending events to the log**. The reactive layer drives execution via `effect()`s that watch projections and invoke calls when preconditions are met.
### Coordinator Flow
```typescript
// 1. Create the reactive root from the DAG
const workflowRoot = new WorkflowReactiveRoot(dag, { failurePolicy: "continue-running" });
// 2. Register effects that start calls when preconditions are met
for (const [nodeId, preconditions, blockedByFailure] of workflowRoot.nodes) {
// Start the call when preconditions are met
effect(() => {
@@ -418,11 +590,18 @@ for (const [nodeId, preconditions, blockedByFailure] of workflowRoot.nodes) {
const status = workflowRoot.statusMap.get(nodeId)!;
if (status.value === "idle" || status.value === "waiting") {
// All preconditions satisfied — start the call
status.value = "running";
const operationId = graph.getNodeAttributes(nodeId).name;
prm.call(operationId, getInput(nodeId), { parentRequestId: parentCallId })
.then(result => { status.value = "completed"; })
.catch(error => { status.value = "failed"; });
const requestId = crypto.randomUUID();
workflowRoot.nodeKeyToRequestId.set(nodeId, requestId);
// Append event to the log (the status projection updates automatically)
workflowRoot.append({
type: "call.requested",
requestId,
operationId,
input: getInput(nodeId),
timestamp: new Date().toISOString(),
});
}
}
});
@@ -438,11 +617,45 @@ for (const [nodeId, preconditions, blockedByFailure] of workflowRoot.nodes) {
}
});
}
// 3. When a call completes, append the result event
prm.call(operationId, input, { parentRequestId })
.then(result => {
workflowRoot.append({
type: "call.responded",
requestId,
output: result,
timestamp: new Date().toISOString(),
});
})
.catch(error => {
workflowRoot.append({
type: "call.error",
requestId,
error: { code: error.code, message: error.message },
timestamp: new Date().toISOString(),
});
});
```
Both effects are reactive. When a predecessor completes, the `preconditions` computed re-evaluates, potentially triggering the start effect. When a predecessor fails, the `blockedByFailure` computed re-evaluates, potentially triggering the abort effect.
The call's promise resolution updates the node's status signal, which triggers downstream preconditions and failure propagations to re-evaluate, which triggers their effects, and so on.
The call's promise resolution appends events to the log. The status projection derives state from events. There is no direct `status.value = "running"` or `status.value = "completed"` — the projection handles these transitions by scanning the event log.
### Event-to-Status Mapping
The status projection maps events to `NodeStatus` values:
| Last event for node | Derived NodeStatus |
|---------------------|--------------------|
| No events | `idle` (or `waiting` if predecessors are running) |
| `call.requested` | `running` |
| `call.responded` | `completed` |
| `call.error` | `failed` |
| `call.aborted` | `aborted` |
| `call.completed` | `completed` |
For retries, the projection scans for the most recent event per node. A node whose `call.error` is followed by a later `call.requested` (with a new `requestId`) is `running`, not `failed`.
### Effect disposal
@@ -558,11 +771,13 @@ The `WorkflowErrorBoundary` catches errors that escape the signal graph (e.g., a
## Constraints
- **Signals are in-memory** — `WorkflowReactiveRoot` state is not persisted. If the hub restarts, the reactive state is lost and must be reconstructed from call protocol events + template re-render.
- **Effect-driven execution is optional** — the hub coordinator can choose not to use `effect()` and instead poll `preconditions.value` and `blockedByFailure.value` manually. The reactive layer provides the building blocks; the coordinator decides how to use them.
- **Events are the source of truth** (ADR-005) — the hub coordinator appends call protocol events. Status, results, and call graph state are derived from the event log. The coordinator does NOT directly set signal values.
- **Event processing is idempotent** — processing the same event twice produces the same projected state. The status projection scans for the most recent event per node.
- **Signals are in-memory** — `WorkflowReactiveRoot` state is not persisted. If the hub restarts, the reactive state is reconstructed from call protocol events + template re-render. The event log itself can be reconstructed from the call protocol event stream.
- **Failure policy is configurable** — the `FailurePolicy` determines what happens to running nodes when a predecessor fails. Default is `continue-running` (only idle/waiting nodes abort). Alternative is `abort-dependents` (running dependents also abort).
- **Failure follows dependency edges, not structural scope** — a failed node causes only its downstream dependents (via DAG edges) to abort. Sibling branches in a `Parallel` group are independent and continue running. This enables partial success: one branch can fail while another completes.
- **Conditionals are error boundaries** — a `Conditional` whose test evaluates against a failed predecessor can redirect to an else branch, catching the failure. Without a `Conditional`, failures cascade uncaught through dependency edges.
- **Abort is immediate in signals, delayed in protocol** — setting `status.value = "aborted"` is instant, but `prm.abort(requestId)` takes time to propagate through the call protocol. The hub should invoke both.
- **Abort is immediate in signals, delayed in protocol** — transitioning a signal to `aborted` is instant, but `prm.abort(requestId)` takes time to propagate through the call protocol. The hub should invoke both.
- **`skipped` satisfies preconditions** — a `skipped` predecessor is treated as "completed for the purpose of preconditions." It means the branch was deliberately bypassed, not broken.
- **`failed` and `aborted` block preconditions** — a `failed` or `aborted` predecessor means the dependent's preconditions can never be met. The `blockedByFailure` effect transitions the dependent to `aborted`.
- **`NodeStatus` and `CallStatus` share call-level states** — `running`, `completed`, `failed`, `aborted` map directly (`running` is the only non-terminal one). `idle`, `waiting`, `ready`, `skipped` are workflow-specific additions.
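The idempotency and reconstruction constraints can be illustrated with a pure replay function. This is only a sketch under stated assumptions: events carry a `nodeId`, each `(requestId, type)` pair is assumed to occur at most once so it can serve as a deduplication key, and `StoredEvent` and `projectStatuses` are hypothetical names.

```typescript
type StoredEventType = "call.requested" | "call.responded" | "call.error" | "call.aborted";
type CallLevelStatus = "running" | "completed" | "failed" | "aborted";

// Assumed shape of a persisted call protocol event.
interface StoredEvent {
  type: StoredEventType;
  nodeId: string;
  requestId: string;
}

const STATUS_BY_EVENT: Record<StoredEventType, CallLevelStatus> = {
  "call.requested": "running",
  "call.responded": "completed",
  "call.error": "failed",
  "call.aborted": "aborted",
};

// Rebuild call-level statuses from scratch by replaying the stream.
function projectStatuses(events: readonly StoredEvent[]): Map<string, CallLevelStatus> {
  const seen = new Set<string>();
  const statuses = new Map<string, CallLevelStatus>();
  for (const e of events) {
    // Assumption: (requestId, type) identifies an event, so a redelivery is skipped.
    const key = e.requestId + ":" + e.type;
    if (seen.has(key)) continue;
    seen.add(key);
    statuses.set(e.nodeId, STATUS_BY_EVENT[e.type]);
  }
  return statuses;
}

const persisted: StoredEvent[] = [
  { type: "call.requested", nodeId: "A", requestId: "r1" },
  { type: "call.error",     nodeId: "A", requestId: "r1" },
  { type: "call.requested", nodeId: "A", requestId: "r2" },
  { type: "call.responded", nodeId: "A", requestId: "r2" },
  { type: "call.requested", nodeId: "B", requestId: "r3" },
];

// After a hub restart: replay the stream to recover state.
const rebuilt = projectStatuses(persisted);

// Redelivery of an old event must not change the projected state.
const firstEvent: StoredEvent = { type: "call.requested", nodeId: "A", requestId: "r1" };
const rebuiltWithDuplicate = projectStatuses(persisted.concat([firstEvent]));
```

Without the deduplication key, a pure last-event-wins scan would treat the redelivered `call.requested` as a new attempt, so some form of event identity is needed for the idempotency guarantee to hold.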
@@ -659,7 +874,7 @@ The `ReactiveContext` passed to `ReactiveHostConfig` includes a reference to `wo
1. **Should preconditions support OR logic?** Currently all predecessors must complete (AND logic). An `anyOf` predicate would allow "start this node as soon as any predecessor completes." This would require an edge attribute or node-level configuration.
2. **How are retries handled at the signal level?** If an operation fails and should be retried, the status would go `running → failed → ready → running`. This requires resetting the status back to `ready`, which the current state machine doesn't support (failed is terminal). A `retried` status or a separate `retryCount` attribute may be needed.
2. ~~**How are retries handled at the signal level?**~~ **Resolved by ADR-005**: Retries are natural append events. A retry creates a new `call.requested` with a new `requestId`. The status projection derives the current state by scanning for the most recent event per node. No `retried` status needed. See the Retry semantics section above.
3. **Should the reactive graph support partial re-rendering?** If a template changes mid-execution (e.g., a step is added), the ujsx reconciler could diff the old and new trees. But the ReactiveHost only supports mount rendering. Re-rendering would require reconciler support.
@@ -667,7 +882,7 @@ The `ReactiveContext` passed to `ReactiveHostConfig` includes a reference to `wo
5. **Should `blockedByFailure` be a separate `computed` or derived from `preconditions`?** Currently the design has two separate computeds — `preconditions` (all predecessors completed/skipped) and `blockedByFailure` (any predecessor failed/aborted). An alternative is a single `computed<NodeReadiness>` that returns `"ready" | "blocked" | "failed"` or similar. This reduces the number of effects but makes the readiness check less composable.
6. **What happens to running nodes when a predecessor fails?** The current spec transitions `idle` and `waiting` nodes to `aborted`. But what about a node that's already `running`? Should it be cancelled (set to `aborted` and call `prm.abort()`), or should it be allowed to complete? The answer depends on whether the running node's output is still needed — which the template author decides via `Conditional` error boundaries.
6. ~~**What happens to running nodes when a predecessor fails?**~~ **Resolved by ADR-005/OQ-10**: This is a `FailurePolicy` configuration of the projection. The default policy (`continue-running`) means running nodes continue. An alternative policy (`abort-dependents`) would abort running dependents. The event log makes both strategies expressible — only the projection logic changes.
## References


@@ -1,6 +1,6 @@
---
status: draft
last_updated: 2026-05-20
last_updated: 2026-05-21
---
# Schema
@@ -298,12 +298,26 @@ A union type used as the edge attribute type parameter for call graphs (`FlowGra
const TemplateEdgeAttrs = Type.Object({
edgeType: Type.Union([Type.Literal("sequential"), Type.Literal("conditional")]),
condition: Type.Optional(Type.Unknown()), // For conditional edges: the condition function or expression
dataFlow: Type.Optional(Type.Boolean({ default: false, description: "Whether this edge carries data (state transfer) or only ordering (temporal notification)" })),
});
type TemplateEdgeAttrs = Static<typeof TemplateEdgeAttrs>;
```
Template edges carry an `edgeType` to distinguish sequential flow from conditional branching. Conditional edges optionally store a `condition` that determines whether the target node executes.
**`dataFlow` attribute (ADR-005)**: Distinguishes temporal-only edges from state-transfer edges. This attribute is critical for type compatibility checking:
- **`dataFlow: false`** (default): The edge expresses temporal ordering only — the downstream node starts after the upstream node completes, but doesn't read the upstream node's output. No type compatibility check is needed.
- **`dataFlow: true`**: The edge carries data — the downstream node reads the upstream node's output via `Conditional.test`, `Map.over`, or `Operation.input`. Type compatibility checking (`typeCompat()`) should verify that the upstream output schema is compatible with the downstream input schema.
The `dataFlow` attribute is **inferred** by the `GraphologyHostConfig` during template rendering, not manually specified by template authors:
- A `Sequential` edge where the downstream node references `results["upstreamNode"]` in any expression gets `dataFlow: true`
- A `Sequential` edge where no such reference exists gets `dataFlow: false` (the default)
- A `Conditional` edge always gets `dataFlow: true` (the condition always reads a predecessor's result)
This resolves OQ-01 and OQ-02: `typeCompat()` only runs on edges where `dataFlow: true`. Temporal-only edges bypass type checking entirely, since no data flows between the connected nodes.
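The inference rules and the `typeCompat()` gating above can be sketched as follows. This is an illustrative approximation, not the `GraphologyHostConfig` implementation: it assumes the downstream node's expressions are available as source text and detects references by substring match, and `TemplateEdge`, `inferDataFlow`, and `edgesNeedingTypeCheck` are hypothetical names.

```typescript
type TemplateEdgeType = "sequential" | "conditional";

// Plain-TypeScript stand-in for an edge plus its TemplateEdgeAttrs.
interface TemplateEdge {
  source: string;
  target: string;
  edgeType: TemplateEdgeType;
  dataFlow?: boolean;
}

// Assumption: expressions are source strings; a real implementation
// would more likely inspect a parsed expression than match substrings.
function inferDataFlow(
  edgeType: TemplateEdgeType,
  upstreamKey: string,
  downstreamExpressions: readonly string[],
): boolean {
  // Conditional edges always read a predecessor's result.
  if (edgeType === "conditional") return true;
  const references = [`results["${upstreamKey}"]`, `results['${upstreamKey}']`];
  return downstreamExpressions.some((expr) =>
    references.some((ref) => expr.includes(ref)),
  );
}

// Only edges that carry data are candidates for typeCompat().
function edgesNeedingTypeCheck(edges: readonly TemplateEdge[]): TemplateEdge[] {
  return edges.filter((edge) => edge.dataFlow === true);
}
```

An edge with `dataFlow` left unset is treated the same as `dataFlow: false`, matching the schema default.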
**Note**: `TemplateEdgeAttrs.edgeType` uses a constrained union of `"sequential" | "conditional"` rather than the full `EdgeTypeEnum`. Template DAGs never have `triggered`, `depends_on`, or `typed` edges — those belong to call graphs and operation graphs respectively.
### TemplateNodeAttrs (Workflow Templates)

@@ -1,6 +1,6 @@
---
status: draft
last_updated: 2026-05-20
last_updated: 2026-05-21
---
# Workflow Templates
@@ -126,7 +126,7 @@ const Conditional: UComponent<{
}>;
```
When rendered to a graphology DAG, `Conditional` creates an edge with `edgeType: "conditional"` and `condition` attribute. When rendered to the reactive engine, the condition is evaluated as a `computed` that depends on the referenced step's status and output.
When rendered to a graphology DAG, `Conditional` creates an edge with `edgeType: "conditional"` and `dataFlow: true` (conditional edges always carry data — the test reads a predecessor's result). When rendered to the reactive engine, the condition is evaluated as a `computed` that depends on the result projection (from the event log per ADR-005).
If the test evaluates to `false` and no `else` branch is provided, the branch nodes transition to `skipped` in `NodeStatus`.
@@ -143,6 +143,7 @@ When the `else` prop is provided, the `Conditional` renders two subgraphs:
- When `test` evaluates to `true`: `then`-branch nodes become `ready` (preconditions met). `else`-branch nodes transition to `skipped`. Their `preconditions` are satisfied by the `skipped` state — downstream nodes see the `Conditional` as completed regardless of which branch was taken.
- When `test` evaluates to `false`: `else`-branch nodes become `ready`. `then`-branch nodes transition to `skipped`. Downstream nodes after the `Conditional` see all branches as resolved.
- When no `else` prop is provided: the `false` branch simply doesn't exist. Nodes after the `Conditional` that depend on it still see it as `completed` because the `Conditional` itself resolves regardless of which path is taken.
- The `test` function receives its data from the **result projection** (ADR-005). `results["nodeName"]` reads from `getResult("nodeName")`, which derives from the event log. This ensures retries are reflected — if a node is retried, its result updates when the retry's `call.responded` event arrives.
This means a `Conditional` with an `else` branch acts as a **complete error boundary** — downstream nodes are insulated from the branch choice. The `Conditional` is `completed` whether the `then` or `else` branch executed.
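To make the retry behaviour of the result projection concrete, here is a minimal sketch of a `getResult` that derives a node's result from an append-only log. The event shape is an assumption: events are shown already keyed by node name, whereas the design resolves them through a `requestId` mapping. Only the event names `call.requested` and `call.responded` come from the docs.

```typescript
// Hypothetical event shape, already keyed by node name for brevity.
interface CallEvent {
  type: "call.requested" | "call.responded";
  nodeKey: string;
  output?: unknown; // present on call.responded
}

// The result of a node is the output of its most recent call.responded event.
function getResult(log: readonly CallEvent[], nodeKey: string): unknown {
  // Scan a reversed copy so the log itself is never mutated.
  for (const event of [...log].reverse()) {
    if (event.nodeKey === nodeKey && event.type === "call.responded") {
      return event.output;
    }
  }
  return undefined; // no response recorded yet
}
```

With this shape, a retry that has been requested but has not yet responded leaves the previous result visible, and the result changes only once the retry's `call.responded` is appended.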
@@ -170,8 +171,8 @@ The `<Map>` component dynamically replicates its child template for each element
**Reactive rendering (ReactiveHostConfig)**:
- For each item in `over`, creates a `WorkflowNode` with its own `signal<NodeStatus>` and `computed` preconditions.
- All mapped nodes' preconditions are identical: the `Map`'s predecessor must be `completed` (same as `Parallel`).
- Each mapped node's `output` signal holds the result of its corresponding call.
- The `Map` result is available as an aggregated signal containing all mapped nodes' outputs.
- Each mapped node's result is available from the **result projection** (ADR-005). `getResult(nodeKey)` derives from the event log.
- The `Map` result is available as an aggregated computed containing all mapped nodes' results from the result projection.
**Example**:
@@ -239,9 +240,9 @@ The HostConfig maps ujsx component types to graphology operations:
### Edge creation rules
- **Sequential**: For children C1, C2, ..., Cn, edges C1→C2, C2→C3, ..., C(n-1)→Cn are added. Within a sequential group, children have implicit `depends_on` edges.
- **Sequential**: For children C1, C2, ..., Cn, edges C1→C2, C2→C3, ..., C(n-1)→Cn are added. Each edge carries `edgeType: "sequential"`. If the downstream node references the upstream node's result (via `Conditional.test`, `Map.over`, or `Operation.input`), the edge also carries `dataFlow: true`. Otherwise, `dataFlow: false` (temporal ordering only).
- **Parallel**: No edges between children. All children have the same preconditions as the parallel group itself.
- **Conditional**: Edge from the conditional node's prerequisite to the first child of the branch, with `edgeType: "conditional"` and `condition` attribute.
- **Conditional**: Edge from the conditional node's prerequisite to the first child of the branch, with `edgeType: "conditional"` and `dataFlow: true` (conditional edges always carry data — the condition reads a predecessor's result).
- **Nested**: A `Sequential` inside a `Parallel` has its own internal edges. A `Parallel` inside a `Sequential` creates a subgraph where all parallel children share the same predecessor.
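The Sequential rule above can be sketched as a small function that chains adjacent children and sets `dataFlow` per edge. The names `SequentialEdge`, `sequentialEdges`, and the `readsResultOf` callback are hypothetical; the callback stands in for whatever expression analysis the HostConfig performs.

```typescript
interface SequentialEdge {
  source: string;
  target: string;
  edgeType: "sequential";
  dataFlow: boolean;
}

// Builds C1→C2, C2→C3, ..., C(n-1)→Cn for the children of a Sequential.
function sequentialEdges(
  children: readonly string[],
  readsResultOf: (downstream: string, upstream: string) => boolean,
): SequentialEdge[] {
  const edges: SequentialEdge[] = [];
  let previous: string | undefined;
  for (const child of children) {
    if (previous !== undefined) {
      edges.push({
        source: previous,
        target: child,
        edgeType: "sequential",
        // true only when the downstream node reads the upstream result
        dataFlow: readsResultOf(child, previous),
      });
    }
    previous = child;
  }
  return edges;
}
```

A `Parallel` group would contribute no edges between its own children, so a corresponding helper would simply return an empty list.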
### Root node handling
@@ -375,13 +376,13 @@ Not all component combinations are valid. The following rules govern which compo
## Open Questions
1. **Should `Sequential` and `Parallel` be transparent in the graph?** Currently they produce edges, not nodes. An alternative is to create "virtual" grouping nodes (like a "parallel gateway" in BPMN). This would make the graph structure richer but adds complexity.
1. ~~**Should `Sequential` and `Parallel` be transparent in the graph?**~~ **Resolved (OQ-05)**: Containers stay transparent. No nodes for `Sequential`, `Parallel`, or `Conditional` in the DAG. Aggregate status for containers is computed as a projection from children's statuses. The `parentMap` and `siblingMap` in `ReactiveContext` provide the structural context for precondition computation.
2. ~~**Should templates support loops?**~~ **Resolved**: The `<Map>` component provides array iteration — one child per array element. It does NOT support general loops (while, do-while). For repeated execution with conditional exit, use `Conditional` inside a `Sequential` group. General-purpose loops with arbitrary termination conditions are not supported because they would require cycle-supporting templates, which conflicts with the DAG-only invariant.
3. **Should templates support `depends_on` edges explicitly?** Currently dependencies are inferred from structure (sequential implies dependency). An explicit `<DependsOn target="operation-name" />` component would make data dependencies visible in the template without relying on sequential ordering.
3. **Should templates support `depends_on` edges explicitly?** Currently dependencies are inferred from structure (sequential implies dependency). An explicit `<DependsOn target="operation-name" />` component would make data dependencies visible in the template without relying on sequential ordering. With ADR-005's `dataFlow` attribute, data dependencies are now inferable from template expressions — `Conditional.test` and `Map.over` that reference predecessor results set `dataFlow: true` on the corresponding edge. Explicit `depends_on` edges would add manual annotation capability, but the `dataFlow` inference may be sufficient for v1.
4. **How does template instantiation interact with the call protocol?** When a template is instantiated as a call graph, each `<Operation>` becomes a call. But the call protocol's `call.requested` events include `parentRequestId` — who is the parent? The template itself? The hub coordinator? This needs a clear answer.
4. ~~**How does template instantiation interact with the call protocol?**~~ **Resolved (ADR-005)**: The template bridges to the call protocol through the event log. The hub coordinator appends call protocol events; the reactive layer projects them. Each `<Operation>` node's `requestId` maps to call protocol events via the `nodeKeyToRequestId` map. No callback, no boomerang — the event log is the bridge.
## References