add flowgraph architecture docs (Phase 1 SDD)

Draft architecture specification for @alkdev/flowgraph — a workflow graph library providing DAG-based orchestration over operations. Covers two graph types (operation graph, call graph), ujsx workflow templates, GraphologyHost and ReactiveHost configs, signal-driven execution, type-compatibility analysis, error hierarchy, and build/distribution. Includes 3 ADRs: ujsx as template IR, DAG-only enforcement, decoupled storage.
2026-05-19 09:36:22 +00:00
parent 333dcd5ac1
commit d2253099ee
13 changed files with 2863 additions and 0 deletions
--- a/docs/architecture/decisions/001-ujsx-as-template-ir.md
+++ b/docs/architecture/decisions/001-ujsx-as-template-ir.md
@@ -0,0 +1,69 @@
+# ADR-001: ujsx Trees as Workflow Template IR
+
+## Status
+
+Proposed
+
+## Context
+
+Flowgraph needs a way to define workflow templates — reusable sequences of operations with conditional branching and parallel execution. The templates must be:
+
+1. **Declarative** — describe *what* should happen, not *how*
+2. **Composable** — nest sequential, parallel, and conditional flows
+3. **Serializable** — store in JSON, transmit over APIs, version in git
+4. **Validatable** — check against an operation graph before execution
+5. **Renderable to multiple targets** — structural validation (DAG) and runtime execution (reactive)
+
+The obvious approach is a custom template format, an array of step objects with type discriminators:
+
+```typescript
+const template = [
+  { type: "operation", name: "architect" },
+  { type: "sequential", steps: [...] },
+];
+```
+
+This works but has limitations:
+- Custom format requires a custom parser, serializer, and validator
+- No composition primitives — `sequential` and `parallel` are just types in an array
+- No host switching — a separate compiler is needed for each target (DAG, execution engine)
+- No incremental updates — changing a step requires rebuilding the entire structure
+
+## Decision
+
+Use ujsx `UNode` trees as the workflow template intermediate representation. Workflow components (`Operation`, `Sequential`, `Parallel`, `Conditional`) are `UComponent` functions that produce `UElement` nodes. The template is rendered to different targets through ujsx `HostConfig` implementations.
+
+```typescript
+const template = h(Sequential, {},
+  h(Operation, { name: "architect" }),
+  h(Operation, { name: "reviewer" }),
+);
+```
+
+## Rationale
+
+1. **No new format** — ujsx already defines `UNode`, `UElement`, `URoot`, type guards, and serialization. We don't need to design, implement, and maintain a template format.
+
+2. **Composition is structural** — `<Sequential>` and `<Parallel>` compose naturally as parent-child structure in a tree. Array-of-objects requires custom merging logic.
+
+3. **Host target switching** — the same `UNode` tree renders to a graphology DAG (for validation) or a reactive engine (for execution) by swapping the `HostConfig`. No template-specific compiler needed.
+
+4. **Incremental updates** — when the ujsx reconciler is implemented, template changes (add/remove/reorder steps) can be applied incrementally without rebuilding the entire DAG. Array-of-objects requires full diffing and rebuilding.
+
+5. **Reactive props** — `@preact/signals-core` enables signal-driven prop updates. An `Operation` node's `name` could be a `Signal<string>` (created with `signal()`), enabling dynamic workflow modification at runtime.
+
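+6. **Serialization for free** — `UNode` trees whose props hold only JSON-safe values are plain JSON, so `JSON.stringify(template)` works and no custom serializer is needed. Function-valued props are the exception (see Consequences).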
+
+## Consequences
+
+- **Direct dependency on `@alkdev/ujsx`** — flowgraph imports `h`, `createRoot`, `HostConfig`, `ReactiveRoot`, and type definitions from ujsx. This is a direct dependency, not a peer dependency.
+- **Function props don't serialize** — `Conditional.test` can be a function `(results) => boolean`, which doesn't survive JSON round-trips. Templates with conditional branches need to provide `test` at render time or use expression strings.
+- **Template components must follow ujsx component contract** — `(props) => UNode`. This is a minimal contract but it means components are synchronous functions that return a tree.
+- **The template IS the tree** — there is no separate compilation step between the ujsx tree and the render target. The `HostConfig.render()` call IS the compilation.
+
+## References
+
+- ujsx architecture: `@alkdev/ujsx/docs/architecture/README.md`
+- ujsx HostConfig: `@alkdev/ujsx/docs/architecture/host-config.md`
+- Workflow templates: [workflow-templates.md](../workflow-templates.md)
+- Host configs: [host-configs.md](../host-configs.md)
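
Note on ADR-001: the host-switching argument in the Rationale can be illustrated with a small self-contained sketch. Everything here is a hypothetical stand-in (`UNodeSketch`, the string-tagged `h`, `toDag`, `operationNames`), not the `@alkdev/ujsx` API. It assumes string node types instead of component functions so that the tree stays JSON-safe. The same tree is folded into two different targets, and it survives a JSON round-trip.

```typescript
// Hypothetical stand-ins for illustration only; this is NOT the @alkdev/ujsx API.
type NodeType = "Operation" | "Sequential" | "Parallel";

interface UNodeSketch {
  type: NodeType;
  props: { name?: string };
  children: UNodeSketch[];
}

// Minimal hyperscript-style factory, loosely modelled on the h(...) calls in ADR-001.
function h(type: NodeType, props: { name?: string }, ...children: UNodeSketch[]): UNodeSketch {
  return { type, props, children };
}

// Target 1: fold the tree into DAG edges for structural validation.
interface Fragment {
  entries: string[]; // operations with no predecessor inside this fragment
  exits: string[]; // operations with no successor inside this fragment
  edges: Array<[string, string]>;
}

function toDag(node: UNodeSketch): Fragment {
  if (node.type === "Operation") {
    const name = node.props.name ?? "(unnamed)";
    return { entries: [name], exits: [name], edges: [] };
  }
  const parts = node.children.map(toDag);
  const edges = parts.flatMap((p) => p.edges);
  if (node.type === "Parallel") {
    return {
      entries: parts.flatMap((p) => p.entries),
      exits: parts.flatMap((p) => p.exits),
      edges,
    };
  }
  // Sequential: every exit of one child feeds every entry of the next child.
  for (let i = 0; i + 1 < parts.length; i++) {
    for (const from of parts[i]!.exits) {
      for (const to of parts[i + 1]!.entries) edges.push([from, to]);
    }
  }
  return {
    entries: parts[0]?.entries ?? [],
    exits: parts[parts.length - 1]?.exits ?? [],
    edges,
  };
}

// Target 2: list operation names, e.g. to check them against a registry.
function operationNames(node: UNodeSketch): string[] {
  return node.type === "Operation"
    ? [node.props.name ?? "(unnamed)"]
    : node.children.flatMap(operationNames);
}

const template = h("Sequential", {},
  h("Operation", { name: "architect" }),
  h("Parallel", {},
    h("Operation", { name: "reviewer" }),
    h("Operation", { name: "tester" }),
  ),
);

const dag = toDag(template);
// The tree holds only plain data, so it survives a JSON round-trip.
const roundTripped = JSON.parse(JSON.stringify(template)) as UNodeSketch;
```

In this sketch `dag.edges` connects `architect` to both `reviewer` and `tester`, and `operationNames(roundTripped)` still lists all three operations. A tree that used component functions as node types, or function-valued props such as `Conditional.test`, would not round-trip this way.
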
--- a/docs/architecture/decisions/002-dag-only-graph.md
+++ b/docs/architecture/decisions/002-dag-only-graph.md
@@ -0,0 +1,41 @@
+# ADR-002: Enforce DAG Invariants (No Cycles)
+
+## Status
+
+Proposed
+
+## Context
+
+Flowgraph represents two types of graphs: operation graphs (static type compatibility) and call graphs (dynamic call hierarchy). Both are directed acyclic graphs (DAGs) by nature, and rendered workflow templates must be DAGs as well:
+
+- **Operation graphs** — type flow is acyclic. An operation's output feeding back into its own input is a design error.
+- **Call graphs** — execution order is acyclic. A call being its own ancestor is physically impossible (you can't trigger yourself before you start).
+- **Workflow templates** — rendered templates must be DAGs. Cycles in a template mean infinite loops in execution.
+
+Taskgraph, the sibling package, allows cycles in its graph and detects them via `hasCycles()` and `findCycles()`. This makes sense because task dependencies can form cycles (e.g., iterative refinement where task A depends on task B which depends on task A's revised output).
+
+## Decision
+
+Flowgraph enforces acyclicity at construction time. Adding an edge that would create a cycle throws `CycleError`. `topologicalOrder()` can always produce a valid ordering without needing a cycle check first.
+
+This is a stricter invariant than taskgraph's approach. The rationale:
+
+1. **Cycles in operation graphs are design errors** — if operation A's output type is compatible with operation B's input, and B's output is compatible with A's input, that's circular type flow. It means infinite recursion is possible.
+2. **Cycles in call graphs are physically impossible** — a call cannot be its own ancestor. The call protocol ensures this via `parentRequestId` chains.
+3. **Cycles in templates are execution errors** — a cycle in a `<Sequential>` chain means infinite execution. This should be caught at template validation time, not at runtime.
+4. **DAG algorithms are simpler** — `topologicalOrder()` can always return a valid ordering. No need for `hasCycles()` + fallback path. `parallelGroups()` always produces a valid grouping. `reachableFrom()` never loops.
+
+## Consequences
+
+- **`addEdge()` validates before adding** — if adding the edge would create a cycle, it throws `CycleError` with the cycle paths.
+- **`fromSpecs()` and `fromCallEvents()` cannot produce cyclic graphs** — a cycle in the input data makes them throw instead of returning a graph.
+- **`topologicalOrder()` never throws on a graph built through these paths** — it can always produce a valid ordering because acyclicity was enforced at construction time.
+- **`hasCycles()` returns `false` for every graph built through `addEdge()`, `fromSpecs()`, or `fromCallEvents()`** — it is kept as a validation method for graphs loaded via `fromJSON()`, which does not enforce acyclicity during import, so it can return `true` there.
+- **This is different from taskgraph** — consumers familiar with taskgraph's `hasCycles()` → `findCycles()` → `topologicalOrder()` error-handling pattern need to adjust. In flowgraph, cycle prevention is at construction time, not query time.
+
+## References
+
+- Taskgraph cycle handling: `@alkdev/taskgraph_ts/docs/architecture/graph-model.md`
+- Operation graph: [operation-graph.md](../operation-graph.md)
+- Call graph: [call-graph.md](../call-graph.md)
+- Error handling: [error-handling.md](../error-handling.md)
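
Note on ADR-002: a minimal sketch of construction-time cycle rejection, assuming a plain adjacency map rather than graphology. `TinyDag` is a hypothetical class. The names `CycleError`, `addEdge`, and `topologicalOrder` are borrowed from the ADR, but the signatures and the shape of the error are assumptions. Before inserting `from -> to`, the sketch searches for an existing path from `to` back to `from`. If one exists, the edge is refused and the graph is left unchanged.

```typescript
// Hypothetical sketch; not flowgraph's actual implementation or error shape.
class CycleError extends Error {
  constructor(public readonly cycle: string[]) {
    super(`edge would create a cycle: ${cycle.join(" -> ")}`);
    this.name = "CycleError";
  }
}

class TinyDag {
  private readonly out = new Map<string, Set<string>>();

  addNode(id: string): void {
    if (!this.out.has(id)) this.out.set(id, new Set());
  }

  // Validate first, mutate second: a rejected edge never reaches the graph.
  addEdge(from: string, to: string): void {
    this.addNode(from);
    this.addNode(to);
    const back = this.findPath(to, from);
    if (back !== null) throw new CycleError([from, ...back]);
    this.out.get(from)!.add(to);
  }

  // Depth-first search that returns one path from start to goal, or null.
  private findPath(start: string, goal: string): string[] | null {
    const stack: string[][] = [[start]];
    const seen = new Set<string>();
    while (stack.length > 0) {
      const path = stack.pop()!;
      const node = path[path.length - 1]!;
      if (node === goal) return path;
      if (seen.has(node)) continue;
      seen.add(node);
      for (const next of Array.from(this.out.get(node) ?? [])) {
        stack.push([...path, next]);
      }
    }
    return null;
  }

  // Kahn's algorithm; it needs no cycle check because addEdge rejects cycles.
  topologicalOrder(): string[] {
    const indegree = new Map<string, number>();
    this.out.forEach((_targets, id) => indegree.set(id, 0));
    this.out.forEach((targets) => {
      targets.forEach((t) => indegree.set(t, (indegree.get(t) ?? 0) + 1));
    });
    const ready = Array.from(indegree.entries())
      .filter(([, degree]) => degree === 0)
      .map(([id]) => id);
    const order: string[] = [];
    while (ready.length > 0) {
      const id = ready.shift()!;
      order.push(id);
      this.out.get(id)?.forEach((t) => {
        const degree = (indegree.get(t) ?? 0) - 1;
        indegree.set(t, degree);
        if (degree === 0) ready.push(t);
      });
    }
    return order;
  }
}

const g = new TinyDag();
g.addEdge("a", "b");
g.addEdge("b", "c");

let rejected: CycleError | null = null;
try {
  g.addEdge("c", "a"); // would close the loop a -> b -> c -> a
} catch (err) {
  if (err instanceof CycleError) rejected = err;
}
```

Checking reachability on every insertion costs one traversal per edge. That is the price of never needing a `hasCycles()` guard on the query side for graphs built this way.
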
--- a/docs/architecture/decisions/003-storage-decoupled.md
+++ b/docs/architecture/decisions/003-storage-decoupled.md
@@ -0,0 +1,57 @@
+# ADR-003: Decoupled Storage — In-Memory Graph with Export/Import Boundary
+
+## Status
+
+Proposed
+
+## Context
+
+Call graphs need to persist across hub restarts. The alkhub storage schema (`call_graph_nodes` and `call_graph_edges` tables) stores call data in Postgres. The question is: should flowgraph handle its own persistence, or should it provide a serialization boundary and let the hub handle storage?
+
+Taskgraph takes the serialization boundary approach: `export()` returns a graphology JSON blob, `fromJSON()` restores it. The hub stores this data in whatever format it needs.
+
+The alkhub call graph storage schema has specific requirements (payload truncation, redaction, indexing) that are storage-layer concerns, not graph concerns.
+
+## Decision
+
+Flowgraph operates on in-memory graphology instances and provides `export()`/`fromJSON()` for serialization. Storage, persistence, and database operations are the hub's concern, not flowgraph's.
+
+```typescript
+// In-memory graph
+const graph = FlowGraph.fromCallEvents(events);
+
+// Export for persistence
+const data = graph.export();  // graphology native JSON
+
+// Hub stores this in Postgres (`db` stands for the hub's own storage client, not a flowgraph export)
+await db.saveCallGraph(data);
+
+// Restore from storage
+const restored = FlowGraph.fromJSON(await db.loadCallGraph());
+```
+
+## Rationale
+
+1. **Separation of concerns** — flowgraph is a graph library, not a database client. Mixing graph operations with SQL queries violates the single-responsibility principle.
+
+2. **Storage varies by consumer** — the hub uses Postgres, but other consumers might use SQLite, IndexedDB, or in-memory caches. Flowgraph shouldn't prescribe a storage backend.
+
+3. **The storage schema has concerns beyond the graph** — payload truncation (10 KB threshold), field redaction (stripping API keys), and indexing are storage-layer concerns. Flowgraph stores raw `input`/`output`/`error` fields; the hub handles truncation at the persistence boundary.
+
+4. **Taskgraph's pattern works** — the same approach has served taskgraph well. The hub loads graph data from DB, constructs a `TaskGraph` in memory, runs analysis, and saves changes back.
+
+5. **Platform-agnostic requirement** — flowgraph must work in Deno, Node, and Bun. Database clients vary by platform (native addons, connection pooling, etc.). Keeping flowgraph pure JS means no native dependencies.
+
+## Consequences
+
+- **`export()` and `fromJSON()` are the persistence boundary** — consumers that need persistence serialize the graph and handle storage themselves.
+- **No database imports in flowgraph** — `pg`, `better-sqlite3`, `mongodb`, etc. are not in flowgraph's dependency tree.
+- **Payload handling is the hub's concern** — flowgraph stores raw `input`/`output`/`error` on call nodes. Truncation and redaction happen when the hub writes to Postgres.
+- **`fromJSON()` validates the data structure** — using `Value.Check()` against the `FlowGraphSerialized` schema. Invalid data throws `InvalidInputError`. It does not validate business rules such as acyclicity; that is the job of `validateGraph()`.
+- **The hub must keep its storage schema in sync with flowgraph's `FlowGraphSerialized`** — if the storage column types change, the hub's mapping code needs updating, not flowgraph.
+
+## References
+
+- Taskgraph serialization: `@alkdev/taskgraph_ts/src/graph/construction.ts` (fromJSON, export)
+- Call graph storage: `@alkdev/alkhub_ts/docs/architecture/storage/call-graph.md`
+- Schema: [schema.md](../schema.md) — FlowGraphSerialized format
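
Note on ADR-003: a hedged sketch of what the hub-side persistence boundary might do before writing call-node payloads. All names (`CallNodeAttrs`, `prepareForStorage`, `REDACTED_KEYS`) and the list of redacted keys are hypothetical. The threshold mirrors the 10 KB figure from the Rationale, but it is measured in characters of the JSON text here as an approximation of bytes. The point is that this logic lives outside flowgraph and works on a copy, so the in-memory graph keeps its raw fields.

```typescript
// Hypothetical hub-side helper; none of these names come from flowgraph or alkhub.
interface CallNodeAttrs {
  input?: unknown;
  output?: unknown;
  error?: unknown;
}

const MAX_PAYLOAD_CHARS = 10 * 1024; // approximates the 10 KB threshold
const REDACTED_KEYS = new Set(["apiKey", "authorization"]); // assumed key names

// Recursively copy a value, masking any property whose key is on the redaction list.
function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    const copy: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      copy[key] = REDACTED_KEYS.has(key) ? "[redacted]" : redact(inner);
    }
    return copy;
  }
  return value;
}

// Replace oversized payloads with a small marker object that keeps a preview.
function truncate(value: unknown): unknown {
  const json = JSON.stringify(value) as string | undefined;
  if (json === undefined || json.length <= MAX_PAYLOAD_CHARS) return value;
  return { truncated: true, originalLength: json.length, preview: json.slice(0, 256) };
}

// Build the record to persist; the attributes passed in are not mutated.
function prepareForStorage(attrs: CallNodeAttrs): CallNodeAttrs {
  return {
    input: truncate(redact(attrs.input)),
    output: truncate(redact(attrs.output)),
    error: truncate(redact(attrs.error)),
  };
}

const raw: CallNodeAttrs = {
  input: { apiKey: "secret", prompt: "hi" },
  output: "x".repeat(20_000),
};
const stored = prepareForStorage(raw);
```

After this runs, `stored.input` carries a masked `apiKey`, `stored.output` is a truncation marker, and `raw` is untouched. That matches the split described above: raw payloads stay in the graph, and trimmed payloads go to the database.
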