add open questions tracker: compile all unresolved questions across architecture docs into one cross-referenced view organized by theme and priority
@@ -72,6 +72,10 @@ Flowgraph is in Phase 0/1 (exploration → architecture). No code exists yet. Th
| [003](decisions/003-storage-decoupled.md) | Storage is not flowgraph's concern: in-memory graph with export/import boundary |
| [004](decisions/004-no-schema-version.md) | No schema version field in serialized format; consumers wrap in their own versioned envelope |

### Open Questions

All unresolved design questions across the architecture are tracked in [open-questions.md](open-questions.md), organized by theme with cross-references between related questions.

## Consumer Context

### alkhub (hub-spoke coordinator)

395
docs/architecture/open-questions.md
Normal file
@@ -0,0 +1,395 @@
---
status: draft
last_updated: 2026-05-20
---

# Open Questions Tracker

Cross-cutting compilation of all unresolved questions across the flowgraph architecture documents, organized by theme. Questions that appear in multiple documents are unified here with cross-references.

## How to Use This Document

- Each question has an **ID** (e.g., OQ-01), a **status**, an **origin** (the source doc or docs), and a **priority** assessment
- **Cross-references** link related questions that may conflict or answer each other
- When a question is resolved, update its status to `resolved` and add a resolution note
- Once all questions in a theme are resolved, the theme section can be removed

## Theme 1: Edge Semantics and Type Compatibility

### OQ-01: Should `fromSpecs()` add ALL edges or only compatible ones?

- **Origin**: [operation-graph.md](operation-graph.md) Q1
- **Status**: open
- **Priority**: high (affects storage size, API surface, and diagnostic value)
- **Options**:
  - (a) Add both compatible and incompatible edges (current design). Pro: diagnostic information is visible in the graph. Con: the graph is larger.
  - (b) Only add compatible edges, with a `potentialEdges()` query computing incompatible connections on demand. Pro: smaller graph. Con: diagnostic information is no longer stored in the graph.
- **Notes**: This decision affects `buildTypeEdges()` in [analysis.md](analysis.md) and `OperationEdgeAttrs` in [schema.md](schema.md). The `compatible: false` attribute on edges only makes sense if option (a) is chosen.
- **Cross-references**: OQ-04
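
The two options can be sketched side by side. This is a minimal illustration, not the flowgraph implementation: `OpSpec`, `Edge`, `isCompatible`, and the rule that every ordered pair of distinct operations is a candidate edge are all assumptions made for the example. Only the name `potentialEdges()` comes from the question itself, and its signature here is invented.

```typescript
// Hypothetical, simplified spec: input/output types are reduced to plain strings.
interface OpSpec {
  id: string;
  input: string;
  output: string;
}

interface Edge {
  source: string;
  target: string;
  compatible: boolean;
}

// Stand-in for the real type-compatibility check (assumption).
const isCompatible = (from: OpSpec, to: OpSpec): boolean => from.output === to.input;

// Assumption: every ordered pair of distinct operations is a candidate edge.
function candidateEdges(specs: OpSpec[]): Edge[] {
  const edges: Edge[] = [];
  for (const from of specs) {
    for (const to of specs) {
      if (from.id === to.id) continue;
      edges.push({ source: from.id, target: to.id, compatible: isCompatible(from, to) });
    }
  }
  return edges;
}

// Option (a): store every candidate edge; incompatibility is an attribute.
const buildAllEdges = (specs: OpSpec[]): Edge[] => candidateEdges(specs);

// Option (b): store only compatible edges...
const buildCompatibleEdges = (specs: OpSpec[]): Edge[] =>
  candidateEdges(specs).filter((e) => e.compatible);

// ...and recompute the incompatible ones when a caller asks for them.
const potentialEdges = (specs: OpSpec[]): Edge[] =>
  candidateEdges(specs).filter((e) => !e.compatible);

const exampleSpecs: OpSpec[] = [
  { id: "fetch", input: "Url", output: "Html" },
  { id: "parse", input: "Html", output: "Doc" },
  { id: "store", input: "Doc", output: "Id" },
];
```

Under these assumptions, three operations yield six stored edges for option (a) and two for option (b), with the remaining four available through `potentialEdges(exampleSpecs)`.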

### OQ-02: How granular should type compatibility results be?

- **Origin**: [operation-graph.md](operation-graph.md) Q4, [analysis.md](analysis.md) Q1
- **Status**: open
- **Priority**: high (directly shapes the `typeCompat()` return type and `OperationEdgeAttrs`)
- **Question (merged)**: How deep should `typeCompat` check? Should it be fully recursive? And should the result be `{ compatible, detail? }` or `{ compatible, mismatches: TypeMismatch[] }`?
- **Current design**: The schema already defines `TypeMismatch` with `{ path, expected, actual }`, and `OperationEdgeAttrs` has an optional `mismatches` field. The analysis doc describes deep recursive structural comparison. There is a tension, though: full recursive checking is more thorough but may produce false negatives for schemas with dynamic structures.
- **Notes**: The schema doc already has `mismatches?: TypeMismatch[]` in `OperationEdgeAttrs`, and the analysis doc already defines `TypeCompatResult` with `mismatches`, which suggests the design has already converged toward structured mismatch reporting. What remains is confirming: (a) recursive depth limits, (b) handling of `Type.Unknown()` and complex types (unions, intersections), (c) whether the `detail` string field is still needed alongside `mismatches`.
- **Cross-references**: OQ-01 (incompatible edges need mismatch detail)
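
To make the open points concrete, here is a minimal sketch of structured mismatch reporting with a depth limit. The `Schema` model is a deliberately small stand-in for the `Type.*` schemas used in the architecture docs, the `maxDepth` parameter is hypothetical, and treating `unknown` as always compatible is an assumption rather than a decided behavior. `TypeMismatch` and `TypeCompatResult` follow the field names quoted above; their field types are guesses.

```typescript
// Hypothetical simplified schema model, standing in for the real schema types.
type Schema =
  | { kind: "string" | "number" | "boolean" | "unknown" }
  | { kind: "object"; props: Record<string, Schema> };

interface TypeMismatch {
  path: string;
  expected: string;
  actual: string;
}

interface TypeCompatResult {
  compatible: boolean;
  mismatches: TypeMismatch[];
}

function typeCompat(output: Schema, input: Schema, maxDepth: number = 8): TypeCompatResult {
  const mismatches: TypeMismatch[] = [];

  const walk = (out: Schema, inp: Schema, path: string, depth: number): void => {
    // Assumption: `unknown` on either side is accepted without further checks.
    if (out.kind === "unknown" || inp.kind === "unknown") return;
    if (out.kind !== inp.kind) {
      mismatches.push({ path, expected: inp.kind, actual: out.kind });
      return;
    }
    if (out.kind === "object" && inp.kind === "object") {
      if (depth >= maxDepth) return; // depth limit reached: deeper levels go unchecked
      for (const key of Object.keys(inp.props)) {
        const expected = inp.props[key];
        const actual: Schema | undefined = out.props[key];
        const childPath = path === "" ? key : `${path}.${key}`;
        if (actual === undefined) {
          mismatches.push({ path: childPath, expected: expected.kind, actual: "missing" });
        } else {
          walk(actual, expected, childPath, depth + 1);
        }
      }
    }
  };

  walk(output, input, "", 0);
  return { compatible: mismatches.length === 0, mismatches };
}

const produced: Schema = {
  kind: "object",
  props: {
    id: { kind: "number" },
    user: { kind: "object", props: { name: { kind: "string" } } },
  },
};
const required: Schema = {
  kind: "object",
  props: {
    id: { kind: "string" },
    user: { kind: "object", props: { name: { kind: "string" }, email: { kind: "string" } } },
  },
};
const report = typeCompat(produced, required);
```

Here `report.mismatches` holds two entries, at paths `id` and `user.email`. Calling `typeCompat(produced, required, 1)` stops before descending into `user` and reports only the `id` mismatch, which shows how a depth limit trades thoroughness for bounded work.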

### OQ-03: Should subscription operations be treated differently in type compatibility?

- **Origin**: [operation-graph.md](operation-graph.md) Q3
- **Status**: open
- **Priority**: medium (affects operation graph edge semantics for streaming operations)
- **Question**: A subscription produces a stream, not a single output. Its `outputSchema` describes a single stream element, but the data flow semantics are different. Should the type-compatibility check for subscriptions account for this?
- **Notes**: This has downstream implications for call-graph population (subscriptions produce multiple `call.responded` events) and template authoring (a subscription feeding into a mutation has different semantics than a query feeding into a mutation). The decision may be deferred to v2, but the current behavior (subscriptions are treated the same as queries/mutations) should at least be documented.

### OQ-04: Edge type consistency: should `edgeType` be required on ALL edges?

- **Origin**: [schema.md](schema.md) Q1
- **Status**: open
- **Priority**: medium (affects serialization format and edge handling across all graph types)
- **Options**:
  - (a) `edgeType` required on all edges. Pro: consistent, self-describing. Con: operation graph edges are always `typed`, making the field redundant there.
  - (b) Separate edge attribute types per graph mode (current implicit design: `CallEdgeAttrs` is a union, `OperationEdgeAttrs` doesn't include edge type). Con: graphology edges must carry attributes from a single schema.
  - (c) Union type on edge attributes, letting the consumer tag the edge. Pro: flexible. Con: runtime discrimination burden.
- **Notes**: The current schema already stores `edgeType` alongside the edge-specific attributes in graphology (see schema.md's "Edge type storage" section), which is effectively option (a) at the storage level. The question is really about the TypeScript type API: should `OperationEdgeAttrs` include `edgeType: "typed"` or should that be a separate concern?
- **Cross-references**: OQ-01 (if incompatible edges exist, they need tagging)
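
At the TypeScript level, option (a) amounts to a discriminated union in which every attribute type carries a literal `edgeType` tag. The sketch below is illustrative only: the interface names and their fields are hypothetical, and the three tag values are the edge types named elsewhere in this tracker (`sequential`, `conditional`, `typed`). It shows that the "runtime discrimination burden" of option (c) reduces to a `switch` when the tag is always present.

```typescript
interface TypeMismatch {
  path: string;
  expected: string;
  actual: string;
}

// Hypothetical attribute shapes; only the `edgeType` tag values come from the docs.
interface TypedEdgeAttrs {
  edgeType: "typed";
  compatible: boolean;
  mismatches?: TypeMismatch[];
}
interface SequentialEdgeAttrs {
  edgeType: "sequential";
}
interface ConditionalEdgeAttrs {
  edgeType: "conditional";
  condition: unknown;
}

type AnyEdgeAttrs = TypedEdgeAttrs | SequentialEdgeAttrs | ConditionalEdgeAttrs;

// The compiler narrows `attrs` in each branch because the tag is a literal type.
function describeEdge(attrs: AnyEdgeAttrs): string {
  switch (attrs.edgeType) {
    case "typed":
      return attrs.compatible
        ? "typed, compatible"
        : `typed, incompatible (${attrs.mismatches?.length ?? 0} mismatches)`;
    case "sequential":
      return "sequential";
    case "conditional":
      return "conditional";
  }
}
```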

---

## Theme 2: Structural Container Transparency

### OQ-05: Should `Sequential` and `Parallel` be transparent in the graph?

- **Origin**: [workflow-templates.md](workflow-templates.md) Q1, [host-configs.md](host-configs.md) Q1
- **Status**: open
- **Priority**: high (fundamental to how the DAG is structured and how the reactive engine computes preconditions)
- **Question (merged)**: Currently, structural containers (`Sequential`, `Parallel`, `Conditional`) produce edges but no nodes. The reactive engine then has to reconstruct structural context to compute preconditions. Should they create "virtual" nodes instead?
- **Options**:
  - (a) Transparent (current design): No nodes for containers. Edges carry the structure. Pro: smaller DAG, cleaner topology. Con: precondition computation needs structural context (parentStack, siblingMap).
  - (b) Virtual nodes: Containers create nodes with `signal<NodeStatus>`. Pro: every node has a status and preconditions, simpler reactive engine. Con: more nodes, containers with no call protocol equivalent, slightly more complex graph queries.
- **Notes**: The host-configs doc identifies this as a "known gap": `Sequential`, `Parallel`, `Conditional` are transparent in the DAG but create complexity for the reactive engine's "previous sibling" precondition logic. The reactive-execution doc's `WorkflowReactiveRoot.initializeSignals()` assumes it operates on the flattened DAG (all nodes are operations), which aligns with option (a). The question is whether the reactive engine's context maps (`parentMap`, `siblingMap`) are sufficient or if virtual nodes would simplify things.
- **Cross-references**: OQ-25 (partial re-rendering)

---

## Theme 3: Call Protocol Integration

### OQ-06: How does template instantiation interact with the call protocol?

- **Origin**: [workflow-templates.md](workflow-templates.md) Q4, [host-configs.md](host-configs.md) Q3
- **Status**: open
- **Priority**: high (this is a fundamental integration point between flowgraph and the call protocol)
- **Question (merged)**: When a template is instantiated as a call graph, each `<Operation>` becomes a call. But the call protocol's `call.requested` events include `parentRequestId`, so who is the parent? Is it the template instance? The hub coordinator? And how does the `ReactiveHostConfig` bridge to `registry.execute()` or `PendingRequestMap.call()`?
- **Notes**: The consumer-integration doc shows the coordinator calling `registry.execute()` inside an `effect()`, but doesn't specify the `parentRequestId` semantics. This is a consumer-side decision, but flowgraph needs to document: (a) whether the template has its own `requestId`, (b) how the reactive engine signals the coordinator to start a call, (c) whether `ReactiveHostConfig` has a callback prop for this.
- **Cross-references**: OQ-07, OQ-08

### OQ-07: Should the reactive engine own the call graph?

- **Origin**: [host-configs.md](host-configs.md) Q4
- **Status**: open
- **Priority**: high (affects the separation between flowgraph and the call protocol)
- **Question**: Currently the call graph (from call-graph.md) and the reactive engine (from reactive-execution.md) are separate concepts. But at runtime, every `<Operation>` in a template becomes a call graph node. Should the reactive engine populate the call graph as a side effect?
- **Options**:
  - (a) Separate: Call graph is populated by call protocol events. Reactive engine uses signals only. Coordinator bridges them.
  - (b) Unified: Reactive engine creates call graph nodes when nodes transition to `running`, updates them on completion. Call graph is derived from reactive state.
- **Notes**: Option (a) matches ADR-003 (flowgraph doesn't do storage/persistence) and the current design where the call graph is populated by `updateFromEvent()`. Option (b) would couple the reactive engine to the call protocol. The current design's separation is cleaner but requires the coordinator to maintain both reactive state and call graph state.

### OQ-08: Should `depends_on` edges be auto-populated from workflow templates?

- **Origin**: [call-graph.md](call-graph.md) Q2
- **Status**: open
- **Priority**: medium (affects how the call graph and template system relate)
- **Question**: When a call graph is instantiated from a workflow template, the template's sequential/parallel structure implies data dependencies. Should template instantiation automatically create `depends_on` edges in the call graph?
- **Notes**: Currently `depends_on` edges must be added explicitly. Auto-population would couple the call graph to the template system. The alternative is for the coordinator to add `depends_on` edges when it instantiates a template.
- **Cross-references**: OQ-06, OQ-21 (explicit `depends_on` in templates, workflow-templates Q3)

---

## Theme 4: Failure and Retry Semantics

### OQ-09: How are retries handled at the signal level?

- **Origin**: [reactive-execution.md](reactive-execution.md) Q2
- **Status**: open
- **Priority**: high (affects the core status state machine)
- **Question**: If an operation fails and should be retried, the status would need to go `running → failed → ready → running`. But the current state machine marks `failed` as terminal with no exit transitions. How should this work?
- **Options**:
  - (a) A `retried` status that allows re-entering `ready`. Con: adds another state to `NodeStatus`.
  - (b) A separate `retryCount` attribute. A node can reset its status from `failed` to `ready` if `retryCount < maxRetries`. Con: breaks the terminal-state invariant.
  - (c) Retry creates a new node (new `requestId`). The old node stays `failed`. Pro: preserves state machine integrity. Con: increases graph size.
- **Notes**: Option (c) aligns with the call protocol, where each retry is a new call with a new `requestId`. This is likely the right answer but needs confirmation.
- **Cross-references**: OQ-10
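
Option (c) can be illustrated with a small sketch. Everything here is hypothetical: the `AttemptNode` shape, the `retryOf` back-reference, the `retryAsNewNode` helper, and the choice to start the new attempt in `ready`. The `NodeStatus` union is assembled from the statuses mentioned in this tracker and may not match the real definition.

```typescript
// Assumed status union, collected from the statuses named in this tracker.
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "skipped" | "failed" | "aborted";

interface AttemptNode {
  requestId: string;
  operationId: string;
  status: NodeStatus;
  retryOf?: string; // hypothetical back-reference to the failed attempt
}

// Option (c): the failed node is never mutated; a retry appends a new node
// with a fresh requestId, so `failed` stays terminal.
function retryAsNewNode(
  nodes: Record<string, AttemptNode>,
  failedId: string,
  newRequestId: string,
): AttemptNode {
  const failed: AttemptNode | undefined = nodes[failedId];
  if (failed === undefined || failed.status !== "failed") {
    throw new Error(`cannot retry ${failedId}: not a failed node`);
  }
  const attempt: AttemptNode = {
    requestId: newRequestId,
    operationId: failed.operationId,
    status: "ready",
    retryOf: failedId,
  };
  nodes[newRequestId] = attempt;
  return attempt;
}

const attempts: Record<string, AttemptNode> = {
  "req-1": { requestId: "req-1", operationId: "task.classify", status: "failed" },
};
const secondAttempt = retryAsNewNode(attempts, "req-1", "req-2");
```

After the call, `attempts` holds two nodes: `req-1` is still `failed`, and `req-2` is a fresh attempt that records which node it retries. The graph grows by one node per retry, which is the cost named in option (c).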

### OQ-10: What happens to running nodes when a predecessor fails?

- **Origin**: [reactive-execution.md](reactive-execution.md) Q6
- **Status**: open
- **Priority**: high (affects failure propagation correctness)
- **Question**: The current spec transitions `idle` and `waiting` nodes to `aborted` when `blockedByFailure` becomes true. But what about a node that's already `running`? Should it be cancelled?
- **Options**:
  - (a) Running nodes are NOT affected. A predecessor's failure blocks dependents that haven't started, but running nodes continue. The coordinator can cancel them via `prm.abort()` if desired.
  - (b) Running nodes automatically transition to `aborted`. This requires the `effect()` to check for running nodes.
- **Notes**: Option (a) is consistent with "failure follows dependency edges, not structural scope": a running node has already passed its preconditions, so it should be allowed to complete. The coordinator can choose to abort it. Option (b) would be more aggressive. The reactive-execution doc's constraint says "abort is immediate in signals, delayed in protocol," suggesting option (a) is intended.
- **Cross-references**: OQ-09 (retries need to know if a running node can be restarted)
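
The behavior of option (a) can be sketched without any signal library, as a plain fixed-point pass over the dependency edges. The `propagateFailure` helper, the record-based graph representation, and the `NodeStatus` union are assumptions for illustration; the real engine is described as using `computed` values and `effect()`. In the example, `b` is running alongside the failed node `a` (no edge between them), `c` waits on both, and `d` waits on `c`.

```typescript
// Assumed status union, collected from the statuses named in this tracker.
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "skipped" | "failed" | "aborted";

// Option (a): only nodes that have not started (`idle`, `waiting`) are aborted;
// `running` nodes are left for the coordinator to cancel if it wants to.
function propagateFailure(
  statusByNode: Record<string, NodeStatus>,
  predecessors: Record<string, string[]>,
): string[] {
  const aborted: string[] = [];
  let changed = true;
  while (changed) {
    changed = false;
    for (const node of Object.keys(predecessors)) {
      const current = statusByNode[node];
      if (current !== "idle" && current !== "waiting") continue;
      const blockedByFailure = predecessors[node].some(
        (p) => statusByNode[p] === "failed" || statusByNode[p] === "aborted",
      );
      if (blockedByFailure) {
        statusByNode[node] = "aborted";
        aborted.push(node);
        changed = true;
      }
    }
  }
  return aborted;
}

const nodeStatus: Record<string, NodeStatus> = {
  a: "failed",
  b: "running",
  c: "waiting",
  d: "idle",
};
const abortedNodes = propagateFailure(nodeStatus, { c: ["a", "b"], d: ["c"] });
// abortedNodes is ["c", "d"]; nodeStatus.b is still "running".
```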

---

## Theme 5: Preconditions and Scheduling

### OQ-11: Should preconditions support OR logic?

- **Origin**: [reactive-execution.md](reactive-execution.md) Q1
- **Status**: open
- **Priority**: medium (affects the precondition computation model)
- **Question**: Currently all predecessors must complete (AND logic). An `anyOf` predicate would allow "start this node as soon as any predecessor completes."
- **Notes**: OR preconditions would require either: (a) an edge attribute indicating `allOf` vs `anyOf`, (b) a node-level configuration, or (c) a separate `anyOfPredecessors` computed per node. This is a semantic change that affects both the DAG structure and the reactive engine. It might be a v2 feature.
- **Cross-references**: OQ-12
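
The difference between the two modes is small in code, which is part of why the question is mostly about where the mode lives. The sketch below assumes a node-level setting (variant (b) in the notes); `PreconditionMode`, `preconditionsMet`, and the `NodeStatus` union are hypothetical names, and treating `skipped` as satisfied follows the description of `preconditions` given under OQ-13.

```typescript
// Assumed status union, collected from the statuses named in this tracker.
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "skipped" | "failed" | "aborted";

type PreconditionMode = "allOf" | "anyOf";

const isSatisfied = (s: NodeStatus): boolean => s === "completed" || s === "skipped";

// `allOf` is the current AND behavior; `anyOf` is the proposed OR behavior.
function preconditionsMet(
  predecessorStatuses: NodeStatus[],
  mode: PreconditionMode = "allOf",
): boolean {
  if (predecessorStatuses.length === 0) return true; // root nodes have nothing to wait for
  return mode === "allOf"
    ? predecessorStatuses.every(isSatisfied)
    : predecessorStatuses.some(isSatisfied);
}
```

With predecessors in states `completed` and `running`, `preconditionsMet(...)` is `false` under the default `allOf` and `true` under `anyOf`. What this sketch does not answer is how `anyOf` should interact with failure propagation when one predecessor fails and another completes.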

### OQ-12: How does `maxConcurrency` interact with preconditions?

- **Origin**: [reactive-execution.md](reactive-execution.md) Q4
- **Status**: open
- **Priority**: medium (a `Parallel` group with `maxConcurrency: 3` should only start 3 nodes at a time)
- **Notes**: `maxConcurrency` is a scheduling concern, not a structural one. The DAG doesn't encode it. Options: (a) a semaphore signal in the reactive layer, (b) coordinator-enforced throttling, (c) a `maxConcurrency` prop on `Parallel` that the reactive engine respects. The `<Parallel>` component already has `maxConcurrency` as an optional prop in its definition (workflow-templates.md).
- **Cross-references**: OQ-11, workflow-templates `Parallel` component
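
Whichever layer ends up owning the limit, the core decision is the same: of the nodes whose preconditions are met, how many may start now. A minimal sketch of that decision as a pure function follows; `selectToStart` and its parameters are hypothetical, and it takes no position on whether a semaphore signal or the coordinator would call it.

```typescript
// Given the nodes that are ready and the number already running in the group,
// return the nodes that may start without exceeding the limit.
function selectToStart(
  readyNodes: string[],
  runningCount: number,
  maxConcurrency?: number,
): string[] {
  if (maxConcurrency === undefined) return readyNodes.slice(); // no limit configured
  const freeSlots = Math.max(0, maxConcurrency - runningCount);
  return readyNodes.slice(0, freeSlots);
}
```

For example, with four ready nodes, one already running, and `maxConcurrency` of 3, two nodes are returned. The remaining nodes stay ready and would be picked up when a running node finishes, which means readiness and permission to start become two separate notions.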

### OQ-13: Should `blockedByFailure` be a separate `computed` or derived from `preconditions`?

- **Origin**: [reactive-execution.md](reactive-execution.md) Q5
- **Status**: open
- **Priority**: low (implementation detail, can be decided during implementation)
- **Question**: Currently there are two separate `computed` values: `preconditions` (all predecessors completed/skipped) and `blockedByFailure` (any predecessor failed/aborted). An alternative is a single `computed<NodeReadiness>` returning `"ready" | "blocked" | "failed"`.
- **Notes**: Two separate `computed` values are more composable (you can check preconditions independently of failure status) but require two effects per node. A single `computed` is simpler (one effect) but harder to query piecewise. This is largely an implementation choice that doesn't affect the public API. Can be deferred to implementation.
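
The two shapes can be compared as plain functions over predecessor statuses, leaving the signal library aside. This is a sketch under assumptions: the mapping of `NodeReadiness` values (`failed` when any predecessor failed or aborted, `ready` when all are completed or skipped, `blocked` otherwise) is one plausible reading of the question, and the `NodeStatus` union is assembled from statuses named in this tracker.

```typescript
// Assumed status union, collected from the statuses named in this tracker.
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "skipped" | "failed" | "aborted";

type NodeReadiness = "ready" | "blocked" | "failed";

// Current shape: two independent derivations.
const preconditions = (preds: NodeStatus[]): boolean =>
  preds.every((s) => s === "completed" || s === "skipped");

const blockedByFailure = (preds: NodeStatus[]): boolean =>
  preds.some((s) => s === "failed" || s === "aborted");

// Alternative shape: one derivation that folds both answers together.
function readiness(preds: NodeStatus[]): NodeReadiness {
  if (blockedByFailure(preds)) return "failed";
  return preconditions(preds) ? "ready" : "blocked";
}
```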

---

## Theme 6: Graph Construction and API Surface

### OQ-14: Should the call graph support unknown `operationId`?

- **Origin**: [call-graph.md](call-graph.md) Q1
- **Status**: open (with a proposed answer)
- **Priority**: medium (affects `fromCallEvents()` and `updateFromEvent()` behavior)
- **Proposed answer**: Yes. The call graph records what happened, not what should have happened. Nodes with unknown `operationId` get `status: "pending"` and may later transition to `"failed"` with an `OPERATION_NOT_FOUND` error code.
- **Notes**: The doc already has a proposed answer. This just needs confirmation and the behavior documented in the `fromCallEvents()` spec.

### OQ-15: Should the call graph support multiple graphs simultaneously?

- **Origin**: [call-graph.md](call-graph.md) Q3
- **Status**: open
- **Priority**: low (can be deferred to v2)
- **Question**: Currently one `FlowGraph` instance = one call graph. If the hub needs to track multiple concurrent workflows, it uses multiple instances. An alternative is a single graph with workflow-scoped subgraphs.
- **Notes**: The current design (multiple instances) is simpler and matches graphology's model. Subgraphs would require a scoping mechanism. This can be deferred unless early usage shows it's needed.

### OQ-16: Should `filterByStatus` use an index?

- **Origin**: [call-graph.md](call-graph.md) Q4
- **Status**: open
- **Priority**: low (premature optimization for small graphs)
- **Notes**: Call graphs at hub level are typically tens of nodes. O(n) filter is fast enough. An index can be added later if performance becomes an issue. Can be deferred.

### OQ-17: Should `FlowGraph` expose graphology's traversal methods directly?

- **Origin**: [flowgraph-api.md](flowgraph-api.md) Q1
- **Status**: open
- **Priority**: medium (affects the public API surface)
- **Question**: Currently the plan is convenience methods that delegate. But some consumers may find it inconvenient to go through `.graph.forEachNode()`.
- **Options**:
  - (a) Convenience methods only (current plan). Direct access via `.graph` for power users.
  - (b) Expose graphology's traversal methods directly on `FlowGraph` (e.g., `flowGraph.forEachNode()`).
  - (c) Expose only the most common traversal methods and let `.graph` handle the rest.
- **Notes**: This is a UX decision. Option (a) keeps the API surface small. Option (b) is more convenient but increases the delegation surface. Option (c) is a middle ground. The decision can be made during implementation based on actual consumer usage patterns.

### OQ-18: Should `addOperation` auto-populate type-compat edges?

- **Origin**: [flowgraph-api.md](flowgraph-api.md) Q2
- **Status**: open
- **Priority**: low (affects incremental construction behavior)
- **Question**: `fromSpecs()` calls `buildTypeEdges()` which adds all type-compatibility edges. Should `addOperation()` (incremental) also attempt auto-type-compat edge creation?
- **Notes**: This is only relevant for incremental construction (rare use case). The operation graph is typically built once via `fromSpecs()`. If incremental construction is needed, the consumer can call `buildTypeEdges()` manually after adding operations. Can be deferred.

### OQ-28: Should `FlowGraph` share analysis functions across instances?

- **Origin**: [flowgraph-api.md](flowgraph-api.md) Q3
- **Status**: open
- **Priority**: low (optimization concern, not blocking)
- **Question**: Currently each `FlowGraph` instance owns its own `DirectedGraph`. A future optimization could pool analysis functions across instances.
- **Notes**: Distinct from OQ-15 (multiple graphs per instance); this is about sharing analysis logic, not about graph scoping. Can be deferred.

### OQ-19: Should `parallelGroups` account for resource constraints?

- **Origin**: [analysis.md](analysis.md) Q4
- **Status**: open
- **Priority**: low (feature enhancement, not a core concern)
- **Question**: Currently `parallelGroups()` returns the theoretical maximum parallelism. An optional `maxConcurrency` parameter could limit group sizes for realistic scheduling.
- **Notes**: Can be added later as an optional parameter. Not blocking.
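
One way the optional parameter could work is to compute the theoretical groups as topological levels and then split any level that exceeds the limit. The sketch below is an assumption about both halves: the docs do not say how `parallelGroups()` is computed, and the record-of-predecessors input, the function signature, and the chunking strategy are all hypothetical. Every node, including roots, must appear as a key.

```typescript
// Group nodes into topological levels: each node in a level depends only on
// nodes from earlier levels. With `maxConcurrency`, oversized levels are split.
function parallelGroups(
  predecessors: Record<string, string[]>,
  maxConcurrency?: number,
): string[][] {
  if (maxConcurrency !== undefined && maxConcurrency < 1) {
    throw new Error("maxConcurrency must be at least 1");
  }
  const done: Record<string, boolean> = {};
  const groups: string[][] = [];
  let pending = Object.keys(predecessors);

  while (pending.length > 0) {
    const level = pending.filter((n) => predecessors[n].every((p) => done[p] === true));
    if (level.length === 0) throw new Error("cycle detected or unknown predecessor");
    for (const n of level) done[n] = true;
    pending = pending.filter((n) => done[n] !== true);

    if (maxConcurrency === undefined) {
      groups.push(level);
    } else {
      for (let i = 0; i < level.length; i += maxConcurrency) {
        groups.push(level.slice(i, i + maxConcurrency));
      }
    }
  }
  return groups;
}

const fanIn: Record<string, string[]> = { a: [], b: [], c: [], d: ["a", "b", "c"] };
const unlimited = parallelGroups(fanIn);  // [["a", "b", "c"], ["d"]]
const limited = parallelGroups(fanIn, 2); // [["a", "b"], ["c"], ["d"]]
```

Chunking a level is a conservative approximation: it forces `c` to wait for both `a` and `b`, whereas a real scheduler could start `c` as soon as either finishes. That gap is one reason the concurrency limit may belong to scheduling (see OQ-12) rather than to static analysis.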

### OQ-27: Should `validateTemplate` check runtime preconditions?

- **Origin**: [analysis.md](analysis.md) Q2
- **Status**: open (intentionally deferred)
- **Priority**: low (explicitly out of scope for static analysis)
- **Question**: Currently `validateTemplate` only checks structural validity and type compatibility. Runtime preconditions (e.g., "operation B requires an API key that operation A doesn't have access to") are beyond the scope of static analysis and belong to the access control layer.
- **Notes**: This is a deliberate scope boundary, not a design gap. Documented here to confirm that this is an intentional deferral, not an oversight.

---

## Theme 7: Conditional and Template Semantics

### OQ-29: Should GraphologyHostConfig produce a separate graph per edge type?

- **Origin**: [host-configs.md](host-configs.md) Q2
- **Status**: open
- **Priority**: medium (affects implementation of the GraphologyHostConfig)
- **Question**: Currently all edge types (`sequential`, `conditional`, `typed`) share the same graph. An alternative is a separate graph per edge type, enabling type-specific queries without filtering.
- **Notes**: Related to OQ-04 (edge type consistency at the schema level) but distinct: this is about the runtime graph structure, not the type design. Multiple graphs would make type-specific queries faster (no filtering) but increase complexity and memory usage.
- **Cross-references**: OQ-04

### OQ-20: How should conditional edge conditions be represented?

- **Origin**: [schema.md](schema.md) Q3
- **Status**: open
- **Priority**: medium (affects `TemplateEdgeAttrs.condition` type safety)
- **Options**:
  - (a) `Type.Unknown()` with documentation (current). Pro: maximally flexible. Con: no type safety.
  - (b) `Type.Union([Type.String(), Type.Function(...)])` for expression strings and function references. Pro: documents both forms. Con: functions don't serialize.
  - (c) A dedicated `ConditionSchema` that flowgraph defines. Pro: type safe, consistent. Con: may be overly prescriptive.
- **Notes**: The workflow-templates doc already specifies `Conditional.test` as `((results: Record<string, CallResult>) => boolean) | string`, and the host-configs doc notes that function props need runtime resolution. Option (b) seems like the pragmatic choice that matches the existing design, but the schema representation is what needs deciding.
- **Known Gap** (from [host-configs.md](host-configs.md)): "Conditional Test Evaluation": the `Conditional.test` function needs access to the `WorkflowContext`/`ReactiveContext` at runtime to evaluate against predecessor results. This is a concrete sub-problem of OQ-06 (how the reactive host config bridges to execution).
- **Cross-references**: OQ-05 (conditional branch behavior in reactive engine), OQ-06 (runtime resolution of function props)
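
A sketch of what runtime resolution for option (b) might look like is given below. The `Condition` type mirrors the `Conditional.test` signature quoted in the notes; everything else is an assumption: the `CallResult` shape is invented, and resolving string conditions through a registry of named predicates is just one possible strategy (an expression evaluator would be another).

```typescript
// Hypothetical result shape; the real `CallResult` is defined elsewhere.
interface CallResult {
  ok: boolean;
  value?: unknown;
}

type ConditionFn = (results: Record<string, CallResult>) => boolean;
type Condition = ConditionFn | string;

// Functions are called directly; strings are looked up in a registry of named
// predicates supplied by the consumer (assumed resolution strategy).
function evaluateCondition(
  condition: Condition,
  results: Record<string, CallResult>,
  namedConditions: Record<string, ConditionFn>,
): boolean {
  if (typeof condition === "function") return condition(results);
  const resolved: ConditionFn | undefined = namedConditions[condition];
  if (resolved === undefined) throw new Error(`unknown condition: ${condition}`);
  return resolved(results);
}

const predecessorResults: Record<string, CallResult> = { classify: { ok: true, value: "spam" } };
const registry: Record<string, ConditionFn> = {
  classifySucceeded: (r) => r["classify"] !== undefined && r["classify"].ok,
};

const viaFunction = evaluateCondition((r) => r["classify"].value === "spam", predecessorResults, registry);
const viaName = evaluateCondition("classifySucceeded", predecessorResults, registry);
```

The string form survives serialization while the function form does not, which is the con listed for option (b). A registry like this is one way to keep serialized templates executable, at the cost of requiring the consumer to register predicates before evaluation.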

### OQ-21: Should templates support explicit `depends_on` edges?

- **Origin**: [workflow-templates.md](workflow-templates.md) Q3
- **Status**: open
- **Priority**: medium (affects template composition expressiveness)
- **Question**: Currently dependencies are inferred from structure (sequential implies dependency). An explicit `<DependsOn target="operation-name" />` component would make data dependencies visible in the template without relying on sequential ordering.
- **Notes**: This would add expressiveness but also complexity. Implicit dependency from structure is simpler and covers the most common cases. Explicit `depends_on` would be needed when a node depends on a non-adjacent predecessor in a way that can't be expressed by a `Sequential` group. Can be deferred to v2.
- **Cross-references**: OQ-08 (call graph `depends_on` edges)

---

## Theme 8: Identity and Serialization

### OQ-22: Should `CallNodeAttrs.identity` be a structured type or `Type.Record`?

- **Origin**: [schema.md](schema.md) Q2
- **Status**: open
- **Priority**: medium (affects the `@alkdev/operations` peer dependency)
- **Options**:
  - (a) Import `Identity` from `@alkdev/operations` (peer dep). Pro: matches call protocol. Con: creates a direct type dependency.
  - (b) Duplicate the type in flowgraph. Pro: no dependency. Con: divergence risk.
  - (c) Use `Type.Record(Type.String(), Type.Array(Type.String()))` for the `resources` field. Pro: flexible. Con: weaker typing.
- **Notes**: Since `@alkdev/operations` is already a peer dependency for type imports, option (a) seems reasonable. The concern is version alignment, but semver ranges handle this. This could also be a `Type.Unknown()` with documentation, letting the consumer validate.

### OQ-23: Multiple graphs per `FlowGraph` instance?

- **Origin**: [call-graph.md](call-graph.md) Q3 (same as OQ-15)
- **Status**: open (duplicate of OQ-15; see above)

### OQ-24: Async analysis functions?

- **Origin**: [analysis.md](analysis.md) Q3
- **Status**: open
- **Priority**: low (premature for current scale)
- **Question**: Should analysis functions be async for large graphs? Current graphs are small (50-200 nodes), so synchronous analysis is fine.
- **Notes**: Can be deferred. If large graphs become common, async analysis can be added with an optional `async` variant.

---

## Theme 9: Reactive Execution Mechanics

### OQ-25: Should the reactive graph support partial re-rendering?

- **Origin**: [reactive-execution.md](reactive-execution.md) Q3
- **Status**: open (blocked on ujsx reconciler)
- **Priority**: low (blocked on ujsx reconciler implementation)
- **Question**: If a template changes mid-execution, the ujsx reconciler could diff and apply changes. Currently only mount rendering is supported.
- **Known Gap** (from [host-configs.md](host-configs.md)): "ujsx Reconciler Not Yet Available": the current `HostConfig` is mount-only, with no incremental template updates and no `prepareUpdate`/`commitUpdate` flow. This gap is broader than just re-rendering.
- **Notes**: This is entirely dependent on the ujsx reconciler, which is not yet implemented. The host-configs doc notes "currently mount-only." When the reconciler is available, flowgraph gets re-rendering "for free." This question should be revisited after the reconciler is implemented.
- **Cross-references**: OQ-05 (structural container handling during re-render), host-configs.md "Known Gaps"

---

## Theme 10: Version and Scale Concerns

### OQ-26: How to handle version conflicts?

- **Origin**: [operation-graph.md](operation-graph.md) Q2
- **Status**: open
- **Priority**: low (can be deferred to a versioning use case)
- **Question**: If two versions of the same operation exist in the registry, should they be separate nodes (`task.classify@1.0.0` vs `task.classify@2.0.0`) or should the latest version win?
- **Notes**: The current design uses `namespace.name` (no version) as the node key, meaning only one version per operation. This is intentional simplicity. Version conflicts are a niche concern that can be addressed when a concrete use case arises.
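
The two keying schemes differ only in whether the version takes part in the node key. The sketch below uses a hypothetical `OperationRef` shape and helper names; the key formats themselves (`namespace.name` and `namespace.name@version`) are taken from the question and notes above. With the unversioned key, two versions of one operation map to the same node, so which version the node represents depends on registration order or an explicit "latest wins" rule.

```typescript
// Hypothetical reference shape for an operation in the registry.
interface OperationRef {
  namespace: string;
  name: string;
  version: string;
}

// Current design: the version is not part of the node key.
const unversionedKey = (op: OperationRef): string => `${op.namespace}.${op.name}`;

// Alternative: one node per version.
const versionedKey = (op: OperationRef): string =>
  `${op.namespace}.${op.name}@${op.version}`;

const classifyV1: OperationRef = { namespace: "task", name: "classify", version: "1.0.0" };
const classifyV2: OperationRef = { namespace: "task", name: "classify", version: "2.0.0" };

// Both versions share the key "task.classify" under the current design,
// and get distinct keys under the alternative.
const sharesNode = unversionedKey(classifyV1) === unversionedKey(classifyV2);
const distinctNodes = versionedKey(classifyV1) !== versionedKey(classifyV2);
```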

---

## Summary Table

| ID | Question | Origin | Priority | Status |
|----|----------|--------|----------|--------|
| OQ-01 | All edges or only compatible edges? | operation-graph | high | open |
| OQ-02 | Type compatibility depth and granularity | operation-graph, analysis | high | open |
| OQ-03 | Subscription operations in type compat | operation-graph | medium | open |
| OQ-04 | `edgeType` on all edges? | schema | medium | open |
| OQ-05 | Structural container transparency | workflow-templates, host-configs | high | open |
| OQ-06 | Template ↔ call protocol interaction | workflow-templates, host-configs | high | open |
| OQ-07 | Should reactive engine own call graph? | host-configs | high | open |
| OQ-08 | Auto-populate `depends_on` from templates? | call-graph | medium | open |
| OQ-09 | Retries at signal level | reactive-execution | high | open |
| OQ-10 | Running nodes when predecessor fails | reactive-execution | high | open |
| OQ-11 | OR logic for preconditions | reactive-execution | medium | open |
| OQ-12 | `maxConcurrency` interaction with preconditions | reactive-execution | medium | open |
| OQ-13 | `blockedByFailure` vs single computed | reactive-execution | low | open |
| OQ-14 | Unknown `operationId` in call graph | call-graph | medium | open (proposed) |
| OQ-15 | Multiple graphs per instance | call-graph | low | open |
| OQ-16 | `filterByStatus` index | call-graph | low | open |
| OQ-17 | Expose graphology traversal directly? | flowgraph-api | medium | open |
| OQ-18 | Auto-populate type edges on `addOperation`? | flowgraph-api | low | open |
| OQ-19 | `parallelGroups` with resource constraints | analysis | low | open |
| OQ-20 | Conditional edge condition representation | schema | medium | open |
| OQ-21 | Explicit `depends_on` in templates | workflow-templates | medium | open |
| OQ-22 | `CallNodeAttrs.identity` type | schema | medium | open |
| OQ-24 | Async analysis functions | analysis | low | open |
| OQ-25 | Partial re-rendering | reactive-execution | low | open (blocked) |
| OQ-26 | Operation version conflicts | operation-graph | low | open |
| OQ-27 | Runtime preconditions in validateTemplate? | analysis | low | open (deferred) |
| OQ-28 | Share analysis functions across instances? | flowgraph-api | low | open |
| OQ-29 | Separate graph per edge type? | host-configs | medium | open |

### Priority Assessment

**High priority** (should resolve before implementation):
- OQ-01: All edges or only compatible (shapes the entire operation graph API)
- OQ-02: Type compatibility depth (shapes `typeCompat()` return type)
- OQ-05: Structural container transparency (fundamental to DAG and reactive engine)
- OQ-06: Template ↔ call protocol (fundamental integration point)
- OQ-07: Reactive engine owns call graph? (affects architecture boundaries)
- OQ-09: Retries (shapes the state machine)
- OQ-10: Running node failure handling (shapes failure propagation)

**Medium priority** (should resolve before v1 release):
- OQ-03, OQ-04, OQ-08, OQ-11, OQ-12, OQ-14, OQ-17, OQ-20, OQ-21, OQ-22, OQ-29

**Low priority** (can defer or decide during implementation):
- OQ-13, OQ-15, OQ-16, OQ-18, OQ-19, OQ-24, OQ-25, OQ-26, OQ-27, OQ-28

### Cross-Cutting Themes

These groups of questions interact with each other and should be resolved together:

1. **Edge semantics group** (OQ-01, OQ-02, OQ-04): All affect the operation graph's edge structure and the type compatibility API.

2. **Call protocol integration group** (OQ-06, OQ-07, OQ-08): All about how flowgraph connects to the live call protocol.

3. **Failure semantics group** (OQ-09, OQ-10): Both about how failure and retry propagate through the reactive engine. Resolving one may resolve or constrain the other.

4. **Scheduling group** (OQ-11, OQ-12): Both about how preconditions interact with scheduling constraints.

5. **Template expressiveness group** (OQ-05, OQ-20, OQ-21): All about what the template system can express and how it renders.

6. **Graph structure group** (OQ-04, OQ-29): Both about how edge types are represented in the graph: OQ-04 at the schema/type level, OQ-29 at the runtime graph structure level. Resolution of one constrains the other.

7. **Known gaps from host-configs.md**: Not all "known gaps" are "open questions" (the reconciler gap is a dependency, not a design question), but they should be tracked here for completeness.