resolve all remaining open questions (OQ-03–OQ-29), add ADR-006

Resolve all 19 remaining open questions across the architecture. Every question now has a documented resolution with rationale:

- OQ-004/OQ-029: edgeType is a universal required attribute on all edges, single graph per FlowGraph instance (ADR-006)
- OQ-011: No OR preconditions for v1; preconditionMode as v2 extension
- OQ-012: maxConcurrency enforced via reactive counting semaphore
- OQ-014: Unknown operationId creates node with pending status
- OQ-017: Expose common graphology traversal methods on FlowGraph (80/20)
- OQ-020: condition as Type.Unknown() with string/function documentation
- OQ-022: Identity imported from @alkdev/operations peer dep
- All other questions resolved with documented rationale

Fix three critical issues found by architecture review:

1. edgeType serialization/validation gap: document two-step validation
2. CallEdgeAttrs runtime discrimination: edgeType as runtime discriminant, depends_on edges clarified as observability-only (not execution)
3. ADR-005 signal mutation inconsistency: explicitly distinguish call-level statuses (event-log-driven) from workflow-derived statuses (signal-mutation)

Additional clarifications:

- dataFlow inference uses conservative strategy (defaults false)
- Conditional.test string resolution: operationName → status === completed
- Add negated field to TemplateEdgeAttrs for else-branch conditions
- Document edge key priority convention for composite keys
- Add maxConcurrency semaphore design to reactive-execution.md
2026-05-21 09:25:55 +00:00
parent c76be7f689
commit f3e084d02f
9 changed files with 239 additions and 268 deletions
--- a/docs/architecture/schema.md
+++ b/docs/architecture/schema.md
@@ -1,6 +1,6 @@
 ---
 status: draft
-last_updated: 2026-05-21
+last_updated: 2026-05-22
 ---

 # Schema
@@ -216,7 +216,7 @@ const CallNodeAttrs = Type.Object({
    message: Type.String(),
    details: Type.Optional(Type.Unknown()),
  })),
-  identity: Type.Optional(Type.Object({       // Caller identity
+  identity: Type.Optional(Type.Object({       // Caller identity (OQ-022: imported from @alkdev/operations peer dep)
    id: Type.String(),
    scopes: Type.Array(Type.String()),
    resources: Type.Optional(Type.Record(Type.String(), Type.Array(Type.String()))),
@@ -235,6 +235,16 @@ The node key is `requestId`. This matches the call protocol's correlation mechan

 ## Edge Attribute Schemas

+### Universal `edgeType` Attribute
+
+**Important**: `edgeType` is a universal required attribute stored on every edge in graphology, alongside (not inside) the mode-specific attribute schemas. This means the stored edge attributes are always `{ edgeType, ...modeSpecificAttrs }`. The TypeBox schemas below define only the mode-specific attributes; `edgeType` is added separately during edge creation and validated separately during deserialization.
+
+When validating serialized graphs, the validation is a two-step process:
+1. Check that `edgeType` is present and matches the expected value for the graph mode
+2. Validate the remaining attributes against the mode-specific schema (`OperationEdgeAttrs`, `CallEdgeAttrs`, etc.)
+
+This separation keeps the mode-specific schemas clean (they define only what's unique to each mode) while ensuring `edgeType` is always present at the storage level.
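The two-step validation described above could be sketched roughly as follows. This is a minimal illustration, not the flowgraph implementation: `validateStoredEdge`, `EDGE_TYPES_BY_MODE`, `GraphMode`, and `AttrsCheck` are hypothetical names, and the `checkAttrs` predicate stands in for whatever TypeBox check is run against the mode-specific schema.

```typescript
// Hypothetical sketch: none of these names are part of the flowgraph API.
type GraphMode = "operation" | "call" | "template";

// Edge types permitted per graph mode, as listed in this document.
const EDGE_TYPES_BY_MODE: Record<GraphMode, readonly string[]> = {
  operation: ["typed"],
  call: ["triggered", "depends_on"],
  template: ["sequential", "conditional"],
};

type ValidationResult =
  | { ok: true; edgeType: string; attrs: Record<string, unknown> }
  | { ok: false; error: string };

// Stand-in for a schema check of the mode-specific attributes.
type AttrsCheck = (attrs: Record<string, unknown>) => boolean;

function validateStoredEdge(
  mode: GraphMode,
  stored: Record<string, unknown>,
  checkAttrs: AttrsCheck,
): ValidationResult {
  // Step 1: edgeType must be present and allowed for this graph mode.
  const { edgeType, ...attrs } = stored;
  if (typeof edgeType !== "string" || EDGE_TYPES_BY_MODE[mode].indexOf(edgeType) === -1) {
    return { ok: false, error: `invalid edgeType for ${mode} graph: ${String(edgeType)}` };
  }
  // Step 2: validate the remaining attributes against the mode-specific schema.
  if (!checkAttrs(attrs)) {
    return { ok: false, error: `attributes do not match the ${mode} edge schema` };
  }
  return { ok: true, edgeType, attrs };
}
```

Splitting `edgeType` off before step 2 means the mode-specific check never sees it, which mirrors the "alongside, not inside" storage rule.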
+
 ### OperationEdgeAttrs (Operation Graph)

 ```typescript
@@ -252,7 +262,7 @@ type OperationEdgeAttrs = Static<typeof OperationEdgeAttrs>;

 Type-compatibility edges carry a boolean `compatible` flag, an optional `detail` string, and optional structured `mismatches`. This allows the operation graph to include both compatible edges (green paths) and incompatible edges (red paths) for diagnostics. The `detail` field provides a human-readable summary, while `mismatches` provides machine-readable field-level diagnostics. The `TypeCompatResult` from `typeCompat()` populates both fields: `detail` for compatible edges and `mismatches` for incompatible ones.

-**Edge type storage**: Operation graph edges always have `edgeType: "typed"` stored on the edge as a separate attribute alongside `OperationEdgeAttrs`. Graphology edges carry both the `OperationEdgeAttrs` (compatible, detail, mismatches) and the `edgeType` field. The `edgeType` is not inside `OperationEdgeAttrs` because it's a universal edge classification that applies to all edge types across all graph modes (operation, call, template). The `OperationEdgeAttrs` schema only defines the mode-specific attributes.
+**Edge type storage (OQ-004)**: `edgeType` is a required universal attribute stored on every edge, regardless of graph mode. This applies uniformly: operation graph edges have `edgeType: "typed"`, call graph edges have `edgeType: "triggered"` or `"depends_on"`, and template edges have `edgeType: "sequential"` or `"conditional"`. The `edgeType` field is stored alongside mode-specific attributes in graphology, not inside the mode-specific attribute schemas (`OperationEdgeAttrs`, `TriggeredEdgeAttrs`, etc.). This ensures consistent serialization/deserialization, uniform graphology queries, and straightforward edge-type filtering. See ADR-006 for the full decision record.

 ```typescript
 // How operation graph edges are stored in graphology:
@@ -290,14 +300,23 @@ Data dependency edges also carry no additional attributes. Future extensions may
 type CallEdgeAttrs = TriggeredEdgeAttrs | DependencyEdgeAttrs;
 ```

-A union type used as the edge attribute type parameter for call graphs (`FlowGraph<CallNodeAttrs, CallEdgeAttrs>`). Call graph edges can be either `triggered` (parent-child) or `depends_on` (data dependency), distinguished by their edge type. The union type follows the `{GraphType}EdgeAttrs` naming pattern consistent with `OperationEdgeAttrs` and `TemplateEdgeAttrs`.
+A union type used as the edge attribute type parameter for call graphs (`FlowGraph<CallNodeAttrs, CallEdgeAttrs>`). Call graph edges can be either `triggered` (parent-child) or `depends_on` (data dependency), distinguished by their `edgeType` attribute.
+
+**Runtime discrimination**: Since `TriggeredEdgeAttrs` and `DependencyEdgeAttrs` are both empty objects, the union cannot be discriminated by TypeBox at the schema level. Instead, `edgeType` serves as the runtime discriminant. When validating serialized call graph edges, the two-step validation process applies:
+1. Read `edgeType` to determine which variant applies (`"triggered"` → `TriggeredEdgeAttrs`, `"depends_on"` → `DependencyEdgeAttrs`)
+2. Validate the remaining attributes against the corresponding schema
+
+At the code level, `edgeType` is used in a switch/if statement to determine which type of call edge is being processed. The `addCall` method automatically sets `edgeType: "triggered"` when creating a triggered edge, and `addDependency` sets `edgeType: "depends_on"`.
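As a rough illustration of the runtime discrimination described above, the sketch below switches on `edgeType` for a stored call edge. The type names `TriggeredEdge`, `DependencyEdge`, and `StoredCallEdge` and the function `describeCallEdge` are hypothetical; they model the stored shape (`edgeType` alongside empty mode-specific attributes), not the actual flowgraph types.

```typescript
// Hypothetical stored shapes: edgeType sits alongside the (empty) mode-specific attributes.
type TriggeredEdge = { edgeType: "triggered" };
type DependencyEdge = { edgeType: "depends_on" };
type StoredCallEdge = TriggeredEdge | DependencyEdge;

function describeCallEdge(edge: StoredCallEdge): string {
  switch (edge.edgeType) {
    case "triggered":
      return "triggered: parent-child relationship";
    case "depends_on":
      return "depends_on: data dependency";
    default: {
      // Exhaustiveness guard: compilation fails here if a new edgeType is added but not handled.
      const unreachable: never = edge;
      throw new Error(`unknown call edge: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

Because `edgeType` is a literal type on each variant, TypeScript narrows the union inside each `case` even though the mode-specific attribute objects are structurally identical.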
+
+**`depends_on` edge status (ADR-005)**: While `depends_on` edges are not auto-populated by the call protocol (ADR-005 resolves OQ-008: data dependencies flow through the result projection), they remain in the API for **observability and visualization**. A hub coordinator or external tool may add `depends_on` edges to annotate observed data flow between calls for debugging or monitoring purposes. They do NOT affect execution — the reactive engine derives data flow from the result projection, not from `depends_on` edges.

 ### TemplateEdgeAttrs (Workflow Templates)

 ```typescript
 const TemplateEdgeAttrs = Type.Object({
  edgeType: Type.Union([Type.Literal("sequential"), Type.Literal("conditional")]),
-  condition: Type.Optional(Type.Unknown()), // For conditional edges: the condition function or expression
+  condition: Type.Optional(Type.Unknown({ description: "For conditional edges: a function ((results: Record<string, CallResult>) => boolean) or a string referencing an operation name. Function values are not JSON-serializable; use string form for persistence." })),
+  negated: Type.Optional(Type.Boolean({ description: "True if this edge represents the negated condition of a Conditional's else branch" })),
  dataFlow: Type.Optional(Type.Boolean({ default: false, description: "Whether this edge carries data (state transfer) or only ordering (temporal notification)" })),
 });
 type TemplateEdgeAttrs = Static<typeof TemplateEdgeAttrs>;
@@ -305,11 +324,29 @@ type TemplateEdgeAttrs = Static<typeof TemplateEdgeAttrs>;

 Template edges carry an `edgeType` to distinguish sequential flow from conditional branching. Conditional edges optionally store a `condition` that determines whether the target node executes.

+**`condition` representation (OQ-020)**: The `condition` field uses `Type.Unknown()` at the schema level for maximum flexibility, with two runtime representations:
+
+1. **String form** (`string`): A serializable reference to an operation name whose result determines the branch. Example: `"fetch-data"` means "check the result of the operation named 'fetch-data'". String conditions survive JSON round-trips and are resolved by the HostConfig at render time using the operation registry.
+
+2. **Function form** (`(results: Record<string, CallResult>) => boolean`): A runtime-evaluated predicate that receives predecessor results and returns `true` (then-branch) or `false` (else-branch). Function conditions do NOT survive JSON serialization. They are evaluated by the reactive engine against the result projection (per ADR-005).
+
+The `Type.Unknown()` schema representation is intentional — it matches the reality that conditions can be either strings or functions, and neither TypeBox's `Type.String()` alone nor `Type.Function()` alone captures both forms. `@alkdev/typebox`'s `Type.Function()` defines input/output schemas for serializable function shapes, but the `Conditional.test` predicate is a runtime closure, not a serializable function schema. If a future need arises for schema-level condition descriptions (e.g., for template interchange), a dedicated `ConditionSchema` can be introduced — but for v1, `Type.Unknown()` with documentation is the pragmatic choice.
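A minimal sketch of how the two `condition` forms might be evaluated at runtime is shown below. It assumes, following the commit message, that a string condition resolves to "the named operation's status is `completed`"; the `CallResult` shape here is a simplified stand-in, and `evaluateCondition` is a hypothetical helper rather than part of the reactive engine.

```typescript
// Simplified stand-in for CallResult; the real type is defined elsewhere.
interface CallResult {
  status: string;
  output?: unknown;
}
type Results = Record<string, CallResult>;

// Hypothetical helper: evaluates a condition stored as Type.Unknown().
function evaluateCondition(
  condition: unknown,
  results: Results,
  negated: boolean = false,
): boolean {
  let outcome: boolean;
  if (typeof condition === "function") {
    // Function form: a runtime predicate over predecessor results.
    outcome = Boolean(condition(results));
  } else if (typeof condition === "string") {
    // String form (assumed resolution): the named operation completed.
    outcome = results[condition]?.status === "completed";
  } else {
    throw new TypeError("condition must be a string or a function");
  }
  // `negated` marks the else-branch edge of a Conditional.
  return negated ? !outcome : outcome;
}
```

The serialization difference is easy to check: `JSON.stringify({ condition: () => true })` yields `"{}"` because function-valued properties are dropped, whereas a string condition round-trips unchanged.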
+
 **`dataFlow` attribute (ADR-005)**: Distinguishes temporal-only edges from state-transfer edges. This attribute is critical for type compatibility checking:

 - **`dataFlow: false`** (default): The edge expresses temporal ordering only — the downstream node starts after the upstream node completes, but doesn't read the upstream node's output. No type compatibility check is needed.
 - **`dataFlow: true`**: The edge carries data — the downstream node reads the upstream node's output via `Conditional.test`, `Map.over`, or `Operation.input`. Type compatibility checking (`typeCompat()`) should verify that the upstream output schema is compatible with the downstream input schema.

+The `dataFlow` attribute is **inferred** by the `GraphologyHostConfig` during template rendering. For v1, the inference uses a **conservative strategy**: an edge gets `dataFlow: true` when any of the following conditions are detected, and `dataFlow: false` (the default) otherwise:
+
+1. A `Conditional` edge always gets `dataFlow: true` (conditions always read a predecessor's result).
+2. A `Sequential` edge where the downstream node's `input` function references `results[...]` gets `dataFlow: true`.
+3. A `Sequential` edge where a `Map.over` function references `results[...]` on the predecessor gets `dataFlow: true`.
+
+Edges where `dataFlow` cannot be determined (e.g., `Operation.input` is an opaque function that can't be statically analyzed) default to `dataFlow: false`. Template authors can override this by explicitly providing `dataFlow: true` as an edge attribute if they know the downstream node reads upstream output.
+
+Over-marking `dataFlow: true` is safe (it only causes an unnecessary type compatibility check). Under-marking carries some risk: it skips a check that would usually have passed, but it can let a type-incompatible connection through. The conservative strategy errs on the side of under-marking.
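The conservative inference rule can be summarized in a small decision function. This sketch assumes the static analysis has already run and reports its finding as an optional flag; `EdgeAnalysis` and `inferDataFlow` are hypothetical names, and the real `GraphologyHostConfig` logic may differ.

```typescript
// Hypothetical summary of what is known about an edge at render time.
interface EdgeAnalysis {
  edgeType: "sequential" | "conditional";
  // Set when the template author explicitly provides dataFlow: true.
  explicitDataFlow?: boolean;
  // true if the downstream input or Map.over references results[...];
  // undefined when the function is opaque and cannot be analyzed.
  referencesResults?: boolean;
}

function inferDataFlow(edge: EdgeAnalysis): boolean {
  // An explicit author override wins.
  if (edge.explicitDataFlow === true) return true;
  // Conditional edges always read a predecessor's result.
  if (edge.edgeType === "conditional") return true;
  // Sequential edges: only a positive detection counts; unknown defaults to false.
  return edge.referencesResults === true;
}
```

Treating `undefined` the same as `false` is what makes the strategy conservative: an opaque function never triggers a type compatibility check unless the author opts in.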
+
 The `dataFlow` attribute is **inferred** by the `GraphologyHostConfig` during template rendering, not manually specified by template authors:

 - A `Sequential` edge where the downstream node references `results["upstreamNode"]` in any expression gets `dataFlow: true`
@@ -408,6 +445,10 @@ For example, a `depends_on` edge in the call graph uses `"req_abc123->req_def456

 Since `multi: false`, there can be at most one edge per key. The composite key format ensures deterministic keys even when multiple edge types connect the same pair.

+**Key priority convention**: When multiple edge types exist between the same (source, target) pair, the "primary" edge type gets the simple `${source}->${target}` key format. For call graphs, `triggered` edges are primary (a parent always triggers its child before any data dependency is established), so `triggered` edges use the simple format. For operation graphs and template DAGs, there is only one edge type per (source, target) pair, so the simple format always applies.
+
+**`depends_on` edge key format**: `depends_on` edges always use the composite format `${source}->${target}:depends_on`, even if no `triggered` edge exists between the same pair. This ensures key consistency regardless of edge ordering.
+
 This is an exception to the simple `${source}->${target}` pattern, but it's necessary for the call graph's dual-edge-type scenario. If multi-edge support becomes more broadly needed, the constraint can be relaxed and a uniform composite key format adopted.
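The key convention above can be captured in a small helper. This is a sketch only: `edgeKey` and the `EdgeType` union are hypothetical names, chosen to match the edge types listed in this document.

```typescript
// Hypothetical helper; edge type names follow this document.
type EdgeType = "typed" | "triggered" | "depends_on" | "sequential" | "conditional";

function edgeKey(source: string, target: string, edgeType: EdgeType): string {
  // depends_on edges always use the composite format, even when no
  // triggered edge exists between the same pair.
  if (edgeType === "depends_on") {
    return `${source}->${target}:depends_on`;
  }
  // Every other edge type is the only (or primary) type for its pair
  // and uses the simple format.
  return `${source}->${target}`;
}
```

Because the key depends only on the endpoints and the edge type, a `triggered` and a `depends_on` edge between the same two calls never collide, regardless of insertion order.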

 ## Constraints
@@ -423,11 +464,11 @@ This is an exception to the simple `${source}->${target}` pattern, but it's nece

 ## Open Questions

-1. **Should `edgeType` be a required field on ALL edges, or only on call graph and template edges?** Operation graph edges are always `typed`, so requiring an explicit `edgeType` attribute there is redundant. Options: (a) make `edgeType` required on all edges, (b) have separate edge attribute types per graph mode, (c) use a union type on edge attributes and let the consumer tag the edge.
+1. ~~**Should `edgeType` be a required field on ALL edges, or only on call graph and template edges?**~~ **Resolved (OQ-004)**: `edgeType` is required on all edges, stored as a universal attribute alongside mode-specific attributes. The mode-specific attribute schemas (`OperationEdgeAttrs`, `TriggeredEdgeAttrs`, `DependencyEdgeAttrs`) do NOT include `edgeType` — it's stored separately in graphology at the same level as the mode-specific attributes. This ensures consistent serialization/deserialization, uniform graphology queries, and straightforward edge-type filtering across all graph modes. See ADR-006.

-2. **Should `CallNodeAttrs.identity` be a `Type.Record` or the structured `Identity` type from operations?** The structured type matches the call protocol and storage schema but creates a dependency on `@alkdev/operations` types. Options: (a) import `Identity` from operations (peer dep), (b) duplicate the type in flowgraph, (c) use `Type.Record` and accept weaker typing.
+2. ~~**Should `CallNodeAttrs.identity` be a `Type.Record` or the structured `Identity` type from operations?**~~ **Resolved (OQ-022)**: Import the `Identity` type structure from `@alkdev/operations` (peer dependency). Since `@alkdev/operations` is already a peer dependency (for `CallEventMapValue`), adding this type import creates minimal additional coupling. The `CallNodeAttrs.identity` field mirrors the `Identity` interface: `{ id, scopes, resources? }`. Version alignment is handled by semver ranges. The TypeBox schema for `identity` is defined inline in `CallNodeAttrs` to match the shape (not imported as a TypeBox schema, since `@alkdev/operations` defines `Identity` as a TypeScript interface), but the field semantics match exactly.

-3. **How should conditional edge conditions be represented?** `condition: Type.Unknown()` is maximally flexible but provides no type safety. Options: (a) `Type.Unknown()` with documentation, (b) `Type.Union([Type.String(), Type.Function(...)])` for expression strings and function references, (c) a dedicated `ConditionSchema` that flowgraph defines.
+3. ~~**How should conditional edge conditions be represented?**~~ **Resolved (OQ-020)**: `condition: Type.Optional(Type.Unknown())` with documentation describing the two runtime forms: string (serializable operation reference) and function (`(results) => boolean`, not serializable). `@alkdev/typebox`'s `Type.Function()` defines serializable function input/output schemas, but `Conditional.test` predicates are runtime closures — they can't be represented as serializable function schemas. `Type.Unknown()` is the pragmatic choice for v1, accepting that JSON serialization only preserves the string form. A dedicated `ConditionSchema` can be introduced in v2 if template interchange needs schema-level condition descriptions.

 ## References