flowgraph/docs/architecture/schema.md
glm-5.1 d2253099ee add flowgraph architecture docs (Phase 1 SDD)
Draft architecture specification for @alkdev/flowgraph — a workflow graph library providing DAG-based orchestration over operations. Covers two graph types (operation graph, call graph), ujsx workflow templates, GraphologyHost and ReactiveHost configs, signal-driven execution, type-compatibility analysis, error hierarchy, and build/distribution. Includes 3 ADRs: ujsx as template IR, DAG-only enforcement, decoupled storage.
2026-05-19 09:36:22 +00:00


status: draft
last_updated: 2026-05-19

Schema

TypeBox Module, TypeScript types, categorical enums, node/edge attribute schemas, and the design decisions behind them.

Overview

Flowgraph's schema layer follows the same pattern as taskgraph: TypeBox schemas are the single source of truth for both runtime validation and TypeScript type derivation. All data shapes are defined as TypeBox schemas, with Static<typeof Schema> producing the corresponding TypeScript types.

The schema is organized around two distinct graph types (operation graph and call graph) plus shared enums and the serialized graph factory.

Design Decision: TypeBox as Single Source of Truth

Identical to taskgraph's approach:

  1. Static TypeScript types via Static<typeof Schema> — every schema constant has a corresponding type X = Static<typeof X> alias
  2. Runtime validation via Value.Check() / Value.Errors() — structured field-level error reporting
  3. JSON Schema export for consumers that need schema-based contracts

No separate interface or type definitions outside of Static<typeof>. No Zod.

Naming Convention

| Category | Convention | Example |
| --- | --- | --- |
| Enum schema constant | PascalCase + Enum suffix | CallStatusEnum |
| Enum type alias | PascalCase, no suffix | type CallStatus = Static<typeof CallStatusEnum> |
| Object schema constant | PascalCase, no suffix | OperationNodeAttrs, CallNodeAttrs |
| Object type alias | Same name as schema constant | type OperationNodeAttrs = Static<typeof OperationNodeAttrs> |
| Graph attribute schemas | PascalCase + suffix | FlowGraphSerialized, OperationGraphSerialized |
| Factory function | PascalCase | SerializedGraph(NodeAttrs, EdgeAttrs, GraphAttrs) |

Nullable Helper

Same Nullable helper as taskgraph:

const Nullable = <T extends TSchema>(schema: T) => Type.Union([schema, Type.Null()]);

Used for fields that can be explicitly set to null (distinct from absent).
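To make the null-versus-absent distinction concrete, here is a minimal sketch in plain TypeScript, without TypeBox. The Annotated shape and the fieldState helper are hypothetical names used only for illustration; the interface mirrors the static type that a field declared as Type.Optional(Nullable(Type.String())) would produce.

```typescript
// Hypothetical shape: the static type of a field declared as
// Type.Optional(Nullable(Type.String())).
interface Annotated {
  note?: string | null;
}

// Hypothetical helper: classify the three states such a field can be in.
function fieldState(value: Annotated): "absent" | "null" | "set" {
  if (value.note === undefined) return "absent"; // key missing or undefined
  if (value.note === null) return "null"; // explicitly cleared
  return "set";
}

// The distinction survives JSON serialization: null is written, absent is omitted.
const cleared = JSON.stringify({ note: null });
const omitted = JSON.stringify({});
```

A consumer can therefore treat null as "deliberately empty" and a missing key as "never provided" without extra bookkeeping.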

Enums

CallStatus

The lifecycle states of a call invocation. Matches the call graph storage schema in @alkdev/alkhub_ts/docs/architecture/storage/call-graph.md.

const CallStatusEnum = Type.Union([
  Type.Literal("pending"),     // Call requested, not yet dispatched
  Type.Literal("running"),     // Handler executing
  Type.Literal("completed"),   // Successfully finished (call.responded + call.completed)
  Type.Literal("failed"),      // Handler threw or call.error emitted
  Type.Literal("aborted"),     // call.aborted emitted (parent cancelled, deadline exceeded)
]);
type CallStatus = Static<typeof CallStatusEnum>;

Transitions:

pending → running → completed
                  → failed
         → aborted
  • pending → running: Handler starts executing
  • running → completed: call.responded + call.completed received
  • running → failed: call.error received
  • pending → aborted: call.aborted received before handler started (e.g., deadline exceeded)
  • running → aborted: call.aborted received during execution (parent cancelled)

completed, failed, and aborted are terminal states — no further transitions.
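The transition rules above can be written down as a lookup table. The following is a minimal sketch, not part of the flowgraph API: CALL_TRANSITIONS, canTransition, and isTerminal are hypothetical names, and the CallStatus union is spelled out by hand here rather than derived from CallStatusEnum so the example has no dependencies.

```typescript
// Hand-written mirror of CallStatusEnum (the real type would come from Static<>).
type CallStatus = "pending" | "running" | "completed" | "failed" | "aborted";

// Allowed next states per current state, taken from the transition list.
const CALL_TRANSITIONS: Record<CallStatus, readonly CallStatus[]> = {
  pending: ["running", "aborted"],
  running: ["completed", "failed", "aborted"],
  completed: [],
  failed: [],
  aborted: [],
};

// Hypothetical helper: true when moving from `from` to `to` is permitted.
function canTransition(from: CallStatus, to: CallStatus): boolean {
  return CALL_TRANSITIONS[from].includes(to);
}

// Hypothetical helper: terminal states have no outgoing transitions.
function isTerminal(status: CallStatus): boolean {
  return CALL_TRANSITIONS[status].length === 0;
}
```

Keeping the table as data makes it easy to reject out-of-order call events, for example a call.completed that arrives for a call that is still pending.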

NodeStatus

A derived status for workflow template nodes. While CallStatus tracks individual call invocations, NodeStatus reflects the template-level view:

const NodeStatusEnum = Type.Union([
  Type.Literal("idle"),        // Not started, no call yet
  Type.Literal("waiting"),     // Preconditions not met, waiting for upstream
  Type.Literal("ready"),      // Preconditions met, eligible to start
  Type.Literal("running"),     // Call in progress
  Type.Literal("completed"),   // Call completed successfully
  Type.Literal("failed"),     // Call failed
  Type.Literal("skipped"),     // Conditional branch not taken
  Type.Literal("aborted"),     // Call aborted
]);
type NodeStatus = Static<typeof NodeStatusEnum>;

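NodeStatus shares running, completed, failed, and aborted with CallStatus and adds workflow-specific states (idle, waiting, ready, skipped) that have no call protocol equivalent. It has no pending value, so it is not a strict superset of CallStatus. A node that is waiting has no call yet because its preconditions haven't been met.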
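The overlap between the two enums can be checked mechanically. This sketch lists the literals from both enum definitions as plain arrays (CALL_STATUSES and NODE_STATUSES are hypothetical names) and computes which values are shared and which belong to only one side.

```typescript
// Literal values copied from CallStatusEnum and NodeStatusEnum.
const CALL_STATUSES = ["pending", "running", "completed", "failed", "aborted"] as const;
const NODE_STATUSES = [
  "idle", "waiting", "ready", "running", "completed", "failed", "skipped", "aborted",
] as const;

const callSet = new Set<string>(CALL_STATUSES);
const nodeSet = new Set<string>(NODE_STATUSES);

// Values present in both enums.
const shared = NODE_STATUSES.filter((s) => callSet.has(s));
// Workflow-only values with no call protocol equivalent.
const workflowOnly = NODE_STATUSES.filter((s) => !callSet.has(s));
// Call-only values that NodeStatus does not carry.
const callOnly = CALL_STATUSES.filter((s) => !nodeSet.has(s));
```

With the enums as defined in this document, callOnly contains only pending, so any mapping from a call's status to its node's status has to decide how pending is represented at the template level.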

EdgeType

The type of edge in a flowgraph. Matches the call graph storage schema's edgeType column:

const EdgeTypeEnum = Type.Union([
  Type.Literal("triggered"),    // Source caused target to execute (parent→child in call hierarchy)
  Type.Literal("depends_on"),   // Source requires target's result before it can complete (data dependency)
  Type.Literal("typed"),        // Type compatibility edge (output schema A → input schema B)
  Type.Literal("sequential"),   // Sequential flow edge (template: <Sequential> ordering)
  Type.Literal("conditional"),  // Conditional flow edge (template: <Conditional> branch)
]);
type EdgeType = Static<typeof EdgeTypeEnum>;

The first two (triggered, depends_on) match the call graph storage schema. typed is used only in the operation graph. The last two (sequential, conditional) are template-specific and only exist in workflow template DAGs.

| Edge Type | Graph Type | Meaning |
| --- | --- | --- |
| triggered | Call graph | Parent call triggered child call. Corresponds to parentRequestId. |
| depends_on | Call graph | Data dependency: source needs target's result. |
| typed | Operation graph | Type compatibility: source's output schema is compatible with target's input schema. |
| sequential | Template → DAG | Sequential ordering from <Sequential> component. |
| conditional | Template → DAG | Conditional branch from <Conditional> component. |

Node Attribute Schemas

OperationNodeAttrs

Attributes for nodes in the operation graph. Derived from OperationSpec but carrying only graph-relevant data:

const OperationNodeAttrs = Type.Object({
  name: Type.String(),                    // Operation name (e.g., "classify")
  namespace: Type.String(),               // Namespace (e.g., "task")
  version: Type.String(),                 // Semantic version
  type: OperationTypeEnum,                // "query" | "mutation" | "subscription"
  inputSchema: Type.Unknown(),            // JSON Schema for input (TypeBox schema)
  outputSchema: Type.Unknown(),           // JSON Schema for output (TypeBox schema)
  description: Type.Optional(Type.String()),
  tags: Type.Optional(Type.Array(Type.String())),
});
type OperationNodeAttrs = Static<typeof OperationNodeAttrs>;

The node key is namespace.name (e.g., "task.classify"), matching the operationId format used in the call protocol. The full OperationSpec is not stored on the graph — accessControl, errorSchemas, and handler belong to the registry, not the graph.

Why inputSchema and outputSchema on the graph: These are needed for type-compatibility edge construction. An edge from operation A to operation B exists if A's outputSchema is compatible with B's inputSchema. Storing the schemas on the node avoids a round-trip to the registry during graph queries.

CallNodeAttrs

Attributes for nodes in the call graph. Populated from call events:

const CallNodeAttrs = Type.Object({
  requestId: Type.String(),                  // Unique call identifier
  operationId: Type.String(),                // namespace.name of the operation
  status: CallStatusEnum,                    // Current call status
  parentRequestId: Type.Optional(Type.String()),  // Parent call (absent = top-level)
  input: Type.Unknown(),                     // Call input
  output: Type.Optional(Type.Unknown()),     // Call output (on completion)
  error: Type.Optional(Type.Object({         // Call error (on failure)
    code: Type.String(),
    message: Type.String(),
    details: Type.Optional(Type.Unknown()),
  })),
  identity: Type.Optional(Type.Object({       // Caller identity
    id: Type.String(),
    scopes: Type.Array(Type.String()),
    resources: Type.Optional(Type.Record(Type.String(), Type.Array(Type.String()))),
  })),
  startedAt: Type.Optional(Type.String()),    // ISO timestamp when call was dispatched
  completedAt: Type.Optional(Type.String()),  // ISO timestamp when call completed/failed/aborted
});
type CallNodeAttrs = Static<typeof CallNodeAttrs>;

The node key is requestId. This matches the call protocol's correlation mechanism and the call graph storage schema.

Why ISO timestamps as strings: Following the call protocol, timestamps are ISO 8601 strings rather than numbers. This makes the graph directly serializable to JSON without transformation and aligns with the storage schema's timestamp with tz columns.
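A short sketch of what string timestamps buy in practice: the values pass through JSON unchanged and can still be compared numerically when needed. The callDurationMs helper and the sample values are hypothetical; only the field names come from CallNodeAttrs.

```typescript
// Hypothetical helper: elapsed milliseconds between two ISO 8601 timestamps.
function callDurationMs(startedAt: string, completedAt: string): number {
  return Date.parse(completedAt) - Date.parse(startedAt);
}

// Example node fragment; field names from CallNodeAttrs, values made up.
const timing = {
  startedAt: "2026-05-19T09:36:22.000Z",
  completedAt: "2026-05-19T09:36:23.500Z",
};

// A JSON round-trip leaves the strings untouched, so no revival step is needed.
const roundTripped: typeof timing = JSON.parse(JSON.stringify(timing));
const elapsed = callDurationMs(roundTripped.startedAt, roundTripped.completedAt);
```

Had the fields been Date objects, the round-trip would have turned them into strings anyway, which is the transformation the schema avoids by storing strings from the start.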

Why parentRequestId is both a node attribute and an edge: Following the same denormalization pattern as the storage schema — parentRequestId on the node enables fast point lookups ("who is this call's parent?"), while triggered edges enable traversal queries. Both are kept consistent by construction.
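One way to keep the attribute and the edge consistent "by construction" is to derive triggered edges from the node attributes in a single place. The sketch below assumes that approach; triggeredEdgesFromNodes and the trimmed CallNode interface are hypothetical, and the edge key uses the source->target convention from this document.

```typescript
// Trimmed view of CallNodeAttrs: only the fields needed here.
interface CallNode {
  requestId: string;
  parentRequestId?: string;
}

interface TriggeredEdge {
  key: string;
  source: string; // parent call
  target: string; // child call
}

// Hypothetical helper: build one triggered edge per node that names a known parent.
function triggeredEdgesFromNodes(nodes: readonly CallNode[]): TriggeredEdge[] {
  const known = new Set(nodes.map((n) => n.requestId));
  const edges: TriggeredEdge[] = [];
  for (const node of nodes) {
    const parent = node.parentRequestId;
    // Top-level calls and calls whose parent is not in this graph get no edge.
    if (parent === undefined || !known.has(parent)) continue;
    edges.push({ key: `${parent}->${node.requestId}`, source: parent, target: node.requestId });
  }
  return edges;
}
```

Because the edges are a pure function of the node attributes, the two representations cannot drift apart as long as this is the only code path that adds triggered edges.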

Edge Attribute Schemas

TypedEdgeAttrs (Operation Graph)

const TypedEdgeAttrs = Type.Object({
  compatible: Type.Boolean({ description: "Whether the source output schema is compatible with the target input schema" }),
  compatibilityDetail: Type.Optional(Type.String({ description: "Human-readable description of compatibility or mismatch" })),
});
type TypedEdgeAttrs = Static<typeof TypedEdgeAttrs>;

Type-compatibility edges carry a boolean compatible flag and optional detail. This allows the operation graph to include both compatible edges (green paths) and incompatible edges (red paths) for diagnostics.
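As an illustration of how a TypedEdgeAttrs value might be produced, here is a deliberately naive compatibility check. It is a stand-in, not flowgraph's typeCompat analysis: the rule it applies (every field the target requires must exist in the source output with the same type) and the names naiveCompat and ObjectSchema are assumptions made for this example.

```typescript
interface TypedEdgeAttrs {
  compatible: boolean;
  compatibilityDetail?: string;
}

// Minimal slice of JSON Schema covering flat object schemas only.
interface ObjectSchema {
  type: "object";
  properties: Record<string, { type: string }>;
  required?: string[];
}

// Hypothetical rule: each field required by `input` must be produced by `output` with the same type.
function naiveCompat(output: ObjectSchema, input: ObjectSchema): TypedEdgeAttrs {
  const problems: string[] = [];
  for (const field of input.required ?? []) {
    const produced = output.properties[field];
    const expected = input.properties[field];
    if (produced === undefined) {
      problems.push(`missing field "${field}"`);
    } else if (expected !== undefined && produced.type !== expected.type) {
      problems.push(`field "${field}" is ${produced.type}, expected ${expected.type}`);
    }
  }
  return problems.length === 0
    ? { compatible: true }
    : { compatible: false, compatibilityDetail: problems.join("; ") };
}

// Made-up schemas for two operations.
const classifyOutput: ObjectSchema = {
  type: "object",
  properties: { label: { type: "string" }, score: { type: "number" } },
};
const enrichInput: ObjectSchema = {
  type: "object",
  properties: { label: { type: "string" } },
  required: ["label"],
};
const strictInput: ObjectSchema = {
  type: "object",
  properties: { label: { type: "number" }, id: { type: "string" } },
  required: ["label", "id"],
};
```

Returning a detail string for incompatible pairs is what lets the graph keep the "red path" edges for diagnostics instead of dropping them.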

TriggeredEdgeAttrs (Call Graph)

const TriggeredEdgeAttrs = Type.Object({});
type TriggeredEdgeAttrs = Static<typeof TriggeredEdgeAttrs>;

Parent-child edges in the call graph carry no additional attributes — the relationship is fully captured by the edge direction and type. This may be extended in the future with latency or metadata attributes.

DependencyEdgeAttrs (Call Graph)

const DependencyEdgeAttrs = Type.Object({});
type DependencyEdgeAttrs = Static<typeof DependencyEdgeAttrs>;

Data dependency edges also carry no additional attributes. Future extensions may include dataPath (which field of the output feeds which field of the input).

TemplateEdgeAttrs (Workflow Templates)

const TemplateEdgeAttrs = Type.Object({
  edgeType: EdgeTypeEnum,                   // "sequential" or "conditional"
  condition: Type.Optional(Type.Unknown()), // For conditional edges: the condition function or expression
});
type TemplateEdgeAttrs = Static<typeof TemplateEdgeAttrs>;

Template edges carry an edgeType to distinguish sequential flow from conditional branching. Conditional edges optionally store a condition that determines whether the target node executes.

SerializedGraph Factory

Following the taskgraph pattern, a generic factory produces schemas for graphology's native JSON format:

const SerializedGraph = <N extends TSchema, E extends TSchema, G extends TSchema>(
  NodeAttrs: N,
  EdgeAttrs: E,
  GraphAttrs: G,
) =>
  Type.Object({
    attributes: GraphAttrs,
    options: Type.Object({
      type: Type.Literal("directed"),
      multi: Type.Literal(false),
      allowSelfLoops: Type.Literal(false),
    }),
    nodes: Type.Array(Type.Object({
      key: Type.String(),
      attributes: NodeAttrs,
    })),
    edges: Type.Array(Type.Object({
      key: Type.String(),
      source: Type.String(),
      target: Type.String(),
      attributes: EdgeAttrs,
    })),
  });

multi: false: Flowgraph edges are unique per (source, target) pair. No parallel edges between the same node pair, regardless of edge type.

allowSelfLoops: false: Operations and calls cannot be their own prerequisite. Self-loops are rejected at construction time.

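type: "directed": All edges have direction. A → B generally means A is the prerequisite/source and B is the dependent/target. This matches the graphology convention and the call graph storage schema. As specified in the EdgeType section, depends_on is the exception: there the source requires the target's result.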
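For orientation, this is roughly what a serialized operation graph looks like, together with a structural check for the constraints above. It is a sketch in plain TypeScript: SerializedGraphShape hand-writes the shape that the factory describes, structuralProblems is a hypothetical helper, and node attributes are left empty for brevity.

```typescript
interface SerializedNode { key: string; attributes: Record<string, unknown> }
interface SerializedEdge { key: string; source: string; target: string; attributes: Record<string, unknown> }
interface SerializedGraphShape {
  attributes: Record<string, unknown>;
  options: { type: "directed"; multi: false; allowSelfLoops: false };
  nodes: SerializedNode[];
  edges: SerializedEdge[];
}

// Hypothetical helper: report unknown endpoints, self-loops, and parallel edges.
function structuralProblems(graph: SerializedGraphShape): string[] {
  const problems: string[] = [];
  const nodeKeys = new Set(graph.nodes.map((n) => n.key));
  const seenPairs = new Set<string>();
  for (const edge of graph.edges) {
    if (!nodeKeys.has(edge.source) || !nodeKeys.has(edge.target)) {
      problems.push(`edge ${edge.key} references an unknown node`);
    }
    if (edge.source === edge.target) problems.push(`edge ${edge.key} is a self-loop`);
    // JSON.stringify gives a collision-free pair identity even if keys contain "->".
    const pair = JSON.stringify([edge.source, edge.target]);
    if (seenPairs.has(pair)) problems.push(`edge ${edge.key} duplicates an existing pair`);
    seenPairs.add(pair);
  }
  return problems;
}

const exampleGraph: SerializedGraphShape = {
  attributes: {},
  options: { type: "directed", multi: false, allowSelfLoops: false },
  nodes: [
    { key: "task.classify", attributes: {} },
    { key: "task.enrich", attributes: {} },
  ],
  edges: [
    {
      key: "task.classify->task.enrich",
      source: "task.classify",
      target: "task.enrich",
      attributes: { compatible: true },
    },
  ],
};
```

Schema validation alone checks the shape of each node and edge; a pass like structuralProblems covers the cross-record constraints that a per-record schema cannot express.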

FlowGraphSerialized variants

Two specialized serialization types, one for each graph type:

const OperationGraphSerialized = SerializedGraph(
  OperationNodeAttrs,
  TypedEdgeAttrs,
  Type.Object({}),  // No graph-level attributes
);

const CallGraphSerialized = SerializedGraph(
  CallNodeAttrs,
  Type.Union([TriggeredEdgeAttrs, DependencyEdgeAttrs]),
  Type.Object({}),  // No graph-level attributes
);

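For call graphs, an edge is either triggered or depends_on. Both attribute schemas are currently empty objects, so the union cannot tell the two kinds apart by shape alone; telling them apart needs an explicit edgeType attribute on the edge (see Open Question 1).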

Edge Key Convention

Following taskgraph's ADR-006, edge keys are deterministic:

${source}->${target}

For the operation graph, this means keys like "task.classify->task.enrich". For the call graph, keys like "req_abc123->req_def456".
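The key formats are simple enough to capture in two small functions. operationNodeKey and edgeKey are hypothetical helper names; the formats themselves (namespace.name for operation nodes, source->target for edges) are the ones described in this document.

```typescript
// Operation node key: namespace.name, matching the operationId format.
function operationNodeKey(namespace: string, name: string): string {
  return `${namespace}.${name}`;
}

// Deterministic edge key: the same (source, target) pair always yields the same key.
function edgeKey(source: string, target: string): string {
  return `${source}->${target}`;
}

const operationEdge = edgeKey(
  operationNodeKey("task", "classify"),
  operationNodeKey("task", "enrich"),
);
const callEdge = edgeKey("req_abc123", "req_def456");
```

Because the key is a pure function of its endpoints, re-adding an existing edge can be detected by key lookup alone, without scanning the edge list.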

Since multi: false, there can be at most one edge between any (source, target) pair. When multiple edge types are needed between the same pair (e.g., both triggered and depends_on between two calls), the graph stores a single edge whose edgeType attribute captures the semantic relationship. This is a simplification from the storage schema, which allows multiple edges per (source, target, edgeType) triple — the in-memory graph collapses these into a single edge per (source, target) pair.

This is acceptable because:

  • Operation graphs only have typed edges, so no multi-edge concern.
  • Call graphs rarely have both triggered and depends_on between the same pair.
  • Template DAGs only have sequential or conditional edges.

If multi-edge support becomes necessary, the multi: false constraint can be relaxed and a composite key format (${source}->${target}:${edgeType}) adopted.
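To gauge whether multi-edge support is actually needed, one could scan stored edges for pairs that carry more than one edge type. The sketch below assumes a flat list of stored edges with source, target, and edgeType fields; StoredEdge, pairKey, compositeKey, and collidingPairs are hypothetical names.

```typescript
type EdgeType = "triggered" | "depends_on" | "typed" | "sequential" | "conditional";

interface StoredEdge {
  source: string;
  target: string;
  edgeType: EdgeType;
}

// Current key format: one edge per (source, target) pair.
const pairKey = (e: StoredEdge): string => `${e.source}->${e.target}`;
// Composite format: one edge per (source, target, edgeType) triple.
const compositeKey = (e: StoredEdge): string => `${e.source}->${e.target}:${e.edgeType}`;

// Returns only the pairs that carry two or more distinct edge types.
function collidingPairs(edges: readonly StoredEdge[]): Map<string, EdgeType[]> {
  const byPair = new Map<string, EdgeType[]>();
  for (const edge of edges) {
    const key = pairKey(edge);
    const types = byPair.get(key) ?? [];
    if (!types.includes(edge.edgeType)) types.push(edge.edgeType);
    byPair.set(key, types);
  }
  const collisions = new Map<string, EdgeType[]>();
  byPair.forEach((types, key) => {
    if (types.length > 1) collisions.set(key, types);
  });
  return collisions;
}
```

An empty result means the single-edge-per-pair simplification loses nothing for that data set; a non-empty one lists exactly the pairs where the composite key would preserve information.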

Constraints

  • TypeBox schemas are the single source of truth — no hand-written interface or type definitions for data shapes. All types are derived via Static<typeof Schema>.
  • Edge keys are deterministic: ${source}->${target} format, following ADR-006 in taskgraph.
  • No parallel edges: graphology is configured with multi: false, so there is at most one edge per (source, target) pair.
  • No self-loops: allowSelfLoops is false, so an operation cannot be its own prerequisite.
  • ISO timestamp strings — Call graph timestamps are ISO 8601 strings, matching the storage schema.
  • Nullable categorical fields — Following taskgraph's convention, Type.Optional(Nullable(Enum)) for optional fields that can be explicitly null.
  • inputSchema and outputSchema on operation nodes — These are TypeBox schemas (unknown at the graph level), stored for type-compatibility checking. The graph does not validate these schemas — it stores them and makes them available for the typeCompat analysis function.
  • No schema version field — Following taskgraph, the serialized format does not include a version field. It follows graphology's native JSON format and is not a persistence format with backward-compatibility guarantees. Consumers that need persistence wrap it in their own versioned envelope.

Open Questions

  1. Should edgeType be a required field on ALL edges, or only on call graph and template edges? Operation graph edges are always typed, so requiring an explicit edgeType attribute there is redundant. Options: (a) make edgeType required on all edges, (b) have separate edge attribute types per graph mode, (c) use a union type on edge attributes and let the consumer tag the edge.

  2. Should CallNodeAttrs.identity be a Type.Record or the structured Identity type from operations? The structured type matches the call protocol and storage schema but creates a dependency on @alkdev/operations types. Options: (a) import Identity from operations (peer dep), (b) duplicate the type in flowgraph, (c) use Type.Record and accept weaker typing.

  3. How should conditional edge conditions be represented? condition: Type.Unknown() is maximally flexible but provides no type safety. Options: (a) Type.Unknown() with documentation, (b) Type.Union([Type.String(), Type.Function(...)]) for expression strings and function references, (c) a dedicated ConditionSchema that flowgraph defines.

References

  • Taskgraph schema patterns: @alkdev/taskgraph_ts/docs/architecture/schemas.md
  • Call graph storage schema: @alkdev/alkhub_ts/docs/architecture/storage/call-graph.md
  • Call event types: @alkdev/operations/src/call.ts
  • Operation types: @alkdev/operations/src/types.ts
  • ujsx schema: @alkdev/ujsx/docs/architecture/schema.md