flowgraph/docs/architecture/schema.md

---
status: draft
last_updated: 2026-05-20
---

# Schema

TypeBox Module, TypeScript types, categorical enums, node/edge attribute schemas, and the design decisions behind them.

## Overview

Flowgraph's schema layer follows the same pattern as taskgraph: TypeBox schemas are the single source of truth for both runtime validation and TypeScript type derivation. All data shapes are defined as TypeBox schemas, with `Static<typeof Schema>` producing the corresponding TypeScript types.

The schema is organized around two distinct graph types (operation graph and call graph) plus shared enums and the serialized graph factory.

## Design Decision: TypeBox as Single Source of Truth

Identical to taskgraph's approach:

1. **Static TypeScript types** via `Static<typeof Schema>` — every schema constant has a corresponding `type X = Static<typeof X>` alias
2. **Runtime validation** via `Value.Check()` / `Value.Errors()` — structured field-level error reporting
3. **JSON Schema export** for consumers that need schema-based contracts

No separate `interface` or `type` definitions outside of `Static<typeof>`. No Zod.

### Naming Convention

| Category | Convention | Example |
|----------|-----------|---------|
| Enum schema constant | PascalCase + `Enum` suffix | `CallStatusEnum` |
| Enum type alias | PascalCase, no suffix | `type CallStatus = Static<typeof CallStatusEnum>` |
| Object schema constant | PascalCase, no suffix | `OperationNodeAttrs`, `CallNodeAttrs` |
| Object type alias | Same name as schema constant | `type OperationNodeAttrs = Static<typeof OperationNodeAttrs>` |
| Serialized graph schema | PascalCase + `Serialized` suffix | `FlowGraphSerialized`, `OperationGraphSerialized` |
| Factory function | PascalCase | `SerializedGraph(NodeAttrs, EdgeAttrs, GraphAttrs)` |

### Nullable Helper

Same `Nullable` helper as taskgraph:

```typescript
const Nullable = <T extends TSchema>(schema: T) => Type.Union([schema, Type.Null()]);
```

Used for fields that can be explicitly set to `null` (distinct from absent).

## Enums

### CallStatus

The lifecycle states of a call invocation. Matches the call graph storage schema in `@alkdev/alkhub_ts/docs/architecture/storage/call-graph.md`.

```typescript
const CallStatusEnum = Type.Union([
  Type.Literal("pending"),     // Call requested, not yet dispatched
  Type.Literal("running"),     // Handler executing
  Type.Literal("completed"),   // Successfully finished (call.responded + call.completed)
  Type.Literal("failed"),      // Handler threw or call.error emitted
  Type.Literal("aborted"),     // call.aborted emitted (parent cancelled, deadline exceeded)
]);
type CallStatus = Static<typeof CallStatusEnum>;
```

Transitions:

```
pending → running → completed
   │         ├────→ failed
   │         └────→ aborted
   └──────────────→ aborted
```

- `pending → running`: Handler starts executing
- `running → completed`: `call.responded` + `call.completed` received
- `running → failed`: `call.error` received
- `pending → aborted`: `call.aborted` received before handler started (e.g., deadline exceeded)
- `running → aborted`: `call.aborted` received during execution (parent cancelled)

`completed`, `failed`, and `aborted` are terminal states — no further transitions.
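
The transitions above can be written as a lookup table. The following is a minimal sketch, not part of flowgraph's API: `CALL_TRANSITIONS`, `canTransition`, and `isTerminal` are hypothetical names, and `CallStatus` is redeclared locally so the snippet stands alone.

```typescript
// Local stand-in for the CallStatus type derived from CallStatusEnum.
type CallStatus = "pending" | "running" | "completed" | "failed" | "aborted";

// Allowed next states for each status, exactly as listed in the bullets above.
const CALL_TRANSITIONS: Record<CallStatus, readonly CallStatus[]> = {
  pending: ["running", "aborted"],
  running: ["completed", "failed", "aborted"],
  completed: [],
  failed: [],
  aborted: [],
};

// Hypothetical helper: true if `from → to` is one of the listed transitions.
function canTransition(from: CallStatus, to: CallStatus): boolean {
  return CALL_TRANSITIONS[from].includes(to);
}

// Hypothetical helper: terminal states have no outgoing transitions.
function isTerminal(status: CallStatus): boolean {
  return CALL_TRANSITIONS[status].length === 0;
}
```

A guard like `canTransition(node.status, next)` could be used to reject out-of-order events, although whether flowgraph enforces transitions this way is not specified here.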

### NodeStatus

A derived status for workflow template nodes. While `CallStatus` tracks individual call invocations, `NodeStatus` reflects the template-level view:

```typescript
const NodeStatusEnum = Type.Union([
  Type.Literal("idle"),        // Not started, no call yet
  Type.Literal("waiting"),     // Preconditions not met, waiting for upstream
  Type.Literal("ready"),      // Preconditions met, eligible to start
  Type.Literal("running"),     // Call in progress
  Type.Literal("completed"),   // Call completed successfully
  Type.Literal("failed"),     // Call failed
  Type.Literal("skipped"),     // Conditional branch not taken
  Type.Literal("aborted"),     // Call aborted
]);
type NodeStatus = Static<typeof NodeStatusEnum>;
```

`NodeStatus` shares `running`, `completed`, `failed`, and `aborted` with `CallStatus` and adds workflow-specific states (`idle`, `waiting`, `ready`, `skipped`) that have no call protocol equivalent. It does not include `pending`. A node that is `waiting` has no call yet because its preconditions haven't been met.

**Precondition semantics**: A predecessor in `completed` or `skipped` status satisfies a dependent's preconditions. A predecessor in `failed` or `aborted` status does NOT satisfy preconditions — it blocks the dependent and triggers failure propagation (the dependent transitions to `aborted`). This enables partial success: independent parallel branches continue running even when one branch fails.
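
The precondition rule can be sketched as a pure function over predecessor statuses. This is an illustration under assumptions: `evaluatePreconditions` and `nextStatus` are hypothetical names, and treating a node with no predecessors as `ready` is an assumption, not something the rule above states.

```typescript
// Local stand-in for the NodeStatus type derived from NodeStatusEnum.
type NodeStatus =
  | "idle" | "waiting" | "ready" | "running"
  | "completed" | "failed" | "skipped" | "aborted";

type PreconditionState = "satisfied" | "blocked" | "unmet";

// Hypothetical helper: classify a dependent's preconditions from its predecessors.
function evaluatePreconditions(predecessors: readonly NodeStatus[]): PreconditionState {
  // A failed or aborted predecessor blocks the dependent.
  if (predecessors.some((s) => s === "failed" || s === "aborted")) return "blocked";
  // Every predecessor completed or skipped: preconditions are met.
  // Assumption: an empty predecessor list counts as satisfied.
  if (predecessors.every((s) => s === "completed" || s === "skipped")) return "satisfied";
  return "unmet";
}

// Hypothetical mapping from the evaluation to the dependent's status.
function nextStatus(predecessors: readonly NodeStatus[]): NodeStatus {
  const state = evaluatePreconditions(predecessors);
  if (state === "blocked") return "aborted"; // failure propagation
  return state === "satisfied" ? "ready" : "waiting";
}
```

Because the function only looks at a node's own predecessors, a failure in one parallel branch does not affect nodes in an independent branch, which is the partial-success behaviour described above.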

### CallResult

The result of a completed call, used by `Conditional.test` and `Map.over` to access predecessor outputs:

```typescript
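const CallResult = Type.Object({
  status: NodeStatusEnum,                    // Status of the call (completed, failed, aborted, skipped)
  output: Type.Unknown(),                    // Call output (if completed)
  error: Type.Optional(Type.Object({         // Call error (if failed)
    code: Type.String(),
    message: Type.String(),
    details: Type.Optional(Type.Unknown()),
  })),
});
type CallResult = Static<typeof CallResult>;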
```

`CallResult` is the value in the `results` map passed to `Conditional.test` and `Map.over` functions. It's derived from `CallNodeAttrs` but simplified for template use — it omits `requestId`, `operationId`, `identity`, and timestamps, preserving only what template logic needs.

### OperationType

The type of an operation, determining its call semantics:

```typescript
const OperationTypeEnum = Type.Union([
  Type.Literal("query"),        // Read-only, idempotent
  Type.Literal("mutation"),     // Side effects, not idempotent
  Type.Literal("subscription"), // Streaming, produces multiple results
]);
type OperationType = Static<typeof OperationTypeEnum>;
```

This enum is used in `OperationNodeAttrs.type` to classify operations by their call behavior.

### CallEventMapValue

`CallEventMapValue` is imported from `@alkdev/operations` (peer dependency). It represents a single call protocol event — the union type of all event types (`CallRequestedEvent | CallRespondedEvent | CallErrorEvent | CallAbortedEvent | CallCompletedEvent`). The full definition lives in `@alkdev/operations/src/call.ts`.

Flowgraph's `fromCallEvents()` and `updateFromEvent()` accept this type directly. The mapping from `CallEventMapValue` to `CallNodeAttrs` is:

| Event type | Action |
|------------|--------|
| `call.requested` | Add node with `status: "pending"`, add `triggered` edge if `parentRequestId` present |
| `call.responded` | Update node status to `completed`, set `output` and `completedAt` |
| `call.error` | Update node status to `failed`, set `error` and `completedAt` |
| `call.aborted` | Update node status to `aborted`, set `completedAt` |
| `call.completed` | Update node status to `completed`, set `completedAt` (if not already set) |
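
The mapping in the table can be sketched as a small reducer. This is not flowgraph's `updateFromEvent()`: the event and node shapes below (`SketchEvent`, `SketchNode`, `SketchGraph`) are simplified, hypothetical stand-ins, and the real `CallEventMapValue` field names in `@alkdev/operations` may differ. Two behaviours are assumptions: events for unknown calls are ignored, and `call.completed` does not overwrite a `failed` or `aborted` status because those are terminal.

```typescript
type CallStatus = "pending" | "running" | "completed" | "failed" | "aborted";
type CallError = { code: string; message: string; details?: unknown };

// Hypothetical, simplified event shape (field names are assumptions).
interface SketchEvent {
  type: "call.requested" | "call.responded" | "call.error" | "call.aborted" | "call.completed";
  requestId: string;
  timestamp: string; // ISO 8601
  operationId?: string;
  parentRequestId?: string;
  input?: unknown;
  output?: unknown;
  error?: CallError;
}

// Subset of CallNodeAttrs that the table touches.
interface SketchNode {
  requestId: string;
  operationId: string;
  status: CallStatus;
  parentRequestId?: string;
  input: unknown;
  output?: unknown;
  error?: CallError;
  completedAt?: string;
}

interface SketchGraph {
  nodes: Map<string, SketchNode>;
  triggered: Array<{ source: string; target: string }>;
}

function applyEvent(graph: SketchGraph, event: SketchEvent): void {
  if (event.type === "call.requested") {
    // Add the node as pending, plus a triggered edge when a parent is present.
    graph.nodes.set(event.requestId, {
      requestId: event.requestId,
      operationId: event.operationId ?? "",
      status: "pending",
      parentRequestId: event.parentRequestId,
      input: event.input,
    });
    if (event.parentRequestId !== undefined) {
      graph.triggered.push({ source: event.parentRequestId, target: event.requestId });
    }
    return;
  }
  const node = graph.nodes.get(event.requestId);
  if (!node) return; // Assumption: events for unknown calls are ignored.
  switch (event.type) {
    case "call.responded":
      node.status = "completed";
      node.output = event.output;
      node.completedAt = event.timestamp;
      break;
    case "call.error":
      node.status = "failed";
      node.error = event.error;
      node.completedAt = event.timestamp;
      break;
    case "call.aborted":
      node.status = "aborted";
      node.completedAt = event.timestamp;
      break;
    case "call.completed":
      // Assumption: failed/aborted are terminal and are not overwritten.
      if (node.status !== "failed" && node.status !== "aborted") node.status = "completed";
      node.completedAt ??= event.timestamp; // only if not already set
      break;
  }
}
```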

### EdgeType

The type of edge in a flowgraph. Matches the call graph storage schema's `edgeType` column. This is a universal enum that covers all graph modes (operation, call, template), but each graph mode uses only a subset:

```typescript
const EdgeTypeEnum = Type.Union([
  Type.Literal("triggered"),    // Source caused target to execute (parent→child in call hierarchy)
  Type.Literal("depends_on"),   // Source requires target's result before it can complete (data dependency)
  Type.Literal("typed"),        // Type compatibility edge (output schema A → input schema B)
  Type.Literal("sequential"),   // Sequential flow edge (template: <Sequential> ordering)
  Type.Literal("conditional"),  // Conditional flow edge (template: <Conditional> branch)
]);
type EdgeType = Static<typeof EdgeTypeEnum>;
```

| Edge Type | Graph Mode | Meaning |
|-----------|------------|---------|
| `triggered` | Call graph | Parent call triggered child call. Corresponds to `parentRequestId`. |
| `depends_on` | Call graph | Data dependency — source needs target's result. |
| `typed` | Operation graph | Type compatibility — source's output schema is compatible with target's input schema. |
| `sequential` | Template DAG | Sequential ordering from `<Sequential>` component. |
| `conditional` | Template DAG | Conditional branch from `<Conditional>` component. |

`EdgeTypeEnum` is the universal enumeration. Each graph mode constrains its edge types through its specific edge attribute schemas:

- **Operation graphs** only use `typed` edges (`OperationEdgeAttrs`)
- **Call graphs** use `triggered` and `depends_on` edges (`CallEdgeAttrs`)
- **Template DAGs** use `sequential` and `conditional` edges (`TemplateEdgeAttrs`)

## Node Attribute Schemas

### OperationNodeAttrs

Attributes for nodes in the operation graph. Derived from `OperationSpec` but carrying only graph-relevant data:

```typescript
const OperationNodeAttrs = Type.Object({
  name: Type.String(),                    // Operation name (e.g., "classify")
  namespace: Type.String(),               // Namespace (e.g., "task")
  version: Type.String(),                 // Semantic version
  type: OperationTypeEnum,                // "query" | "mutation" | "subscription"
  inputSchema: Type.Unknown(),            // JSON Schema for input (TypeBox schema)
  outputSchema: Type.Unknown(),           // JSON Schema for output (TypeBox schema)
  description: Type.Optional(Type.String()),
  tags: Type.Optional(Type.Array(Type.String())),
});
type OperationNodeAttrs = Static<typeof OperationNodeAttrs>;
```

The node key is `namespace.name` (e.g., `"task.classify"`), matching the `operationId` format used in the call protocol. The full `OperationSpec` is not stored on the graph — `accessControl`, `errorSchemas`, and `handler` belong to the registry, not the graph.

**Why `inputSchema` and `outputSchema` on the graph**: These are needed for type-compatibility edge construction. An edge from operation A to operation B exists if A's `outputSchema` is compatible with B's `inputSchema`. Storing the schemas on the node avoids a round-trip to the registry during graph queries.

### CallNodeAttrs

Attributes for nodes in the call graph. Populated from call events:

```typescript
const CallNodeAttrs = Type.Object({
  requestId: Type.String(),                  // Unique call identifier
  operationId: Type.String(),                // namespace.name of the operation
  status: CallStatusEnum,                    // Current call status
  parentRequestId: Type.Optional(Type.String()),  // Parent call (absent = top-level)
  input: Type.Unknown(),                     // Call input
  output: Type.Optional(Type.Unknown()),     // Call output (on completion)
  error: Type.Optional(Type.Object({         // Call error (on failure)
    code: Type.String(),
    message: Type.String(),
    details: Type.Optional(Type.Unknown()),
  })),
  identity: Type.Optional(Type.Object({       // Caller identity
    id: Type.String(),
    scopes: Type.Array(Type.String()),
    resources: Type.Optional(Type.Record(Type.String(), Type.Array(Type.String()))),
  })),
  startedAt: Type.Optional(Type.String()),    // ISO timestamp when call was dispatched
  completedAt: Type.Optional(Type.String()),  // ISO timestamp when call completed/failed/aborted
});
type CallNodeAttrs = Static<typeof CallNodeAttrs>;
```

The node key is `requestId`. This matches the call protocol's correlation mechanism and the call graph storage schema.

**Why ISO timestamps as strings**: Following the call protocol, timestamps are ISO 8601 strings rather than numbers. This makes the graph directly serializable to JSON without transformation and aligns with the storage schema's `timestamp with tz` columns.

**Why `parentRequestId` is both a node attribute and an edge**: Following the same denormalization pattern as the storage schema — `parentRequestId` on the node enables fast point lookups ("who is this call's parent?"), while `triggered` edges enable traversal queries. Both are kept consistent by construction.
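
The two access patterns can be illustrated over plain data. This is a sketch with hypothetical helper names (`parentOf`, `descendantsOf`, `isConsistent`) and trimmed-down node and edge shapes; it does not use graphology or flowgraph's API.

```typescript
interface CallNode { requestId: string; parentRequestId?: string }
interface TriggeredEdge { source: string; target: string }

// Point lookup: read the denormalized attribute directly.
function parentOf(nodes: ReadonlyMap<string, CallNode>, requestId: string): string | undefined {
  return nodes.get(requestId)?.parentRequestId;
}

// Traversal: follow triggered edges breadth-first to collect all descendants.
function descendantsOf(edges: readonly TriggeredEdge[], root: string): string[] {
  const seen = new Set<string>([root]);
  const queue: string[] = [root];
  const result: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const edge of edges) {
      if (edge.source === current && !seen.has(edge.target)) {
        seen.add(edge.target);
        result.push(edge.target);
        queue.push(edge.target);
      }
    }
  }
  return result;
}

// Consistency: every parentRequestId has a matching triggered edge, and no extra edges exist.
function isConsistent(nodes: ReadonlyMap<string, CallNode>, edges: readonly TriggeredEdge[]): boolean {
  const fromEdges = new Set(edges.map((e) => `${e.source}->${e.target}`));
  let expected = 0;
  for (const node of nodes.values()) {
    if (node.parentRequestId === undefined) continue;
    expected += 1;
    if (!fromEdges.has(`${node.parentRequestId}->${node.requestId}`)) return false;
  }
  return expected === fromEdges.size;
}
```

A check like `isConsistent` is the kind of invariant a test suite could assert after building a call graph from events.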

## Edge Attribute Schemas

### OperationEdgeAttrs (Operation Graph)

```typescript
const OperationEdgeAttrs = Type.Object({
  compatible: Type.Boolean({ description: "Whether the source output schema is compatible with the target input schema" }),
  detail: Type.Optional(Type.String({ description: "Human-readable description of compatibility or mismatch" })),
  mismatches: Type.Optional(Type.Array(Type.Object({  // Structured mismatch details (populated when compatible: false)
    path: Type.String(),
    expected: Type.String(),
    actual: Type.String(),
  }))),
});
type OperationEdgeAttrs = Static<typeof OperationEdgeAttrs>;
```

Type-compatibility edges carry a boolean `compatible` flag, an optional `detail` string, and optional structured `mismatches`. This allows the operation graph to include both compatible edges (green paths) and incompatible edges (red paths) for diagnostics. The `detail` field provides a human-readable summary, while `mismatches` provides machine-readable field-level diagnostics. The `TypeCompatResult` from `typeCompat()` populates both fields: `detail` for compatible edges and `mismatches` for incompatible ones.

**Edge type storage**: Operation graph edges always carry `edgeType: "typed"`, stored on the graphology edge as a separate attribute alongside the `OperationEdgeAttrs` fields (`compatible`, `detail`, `mismatches`). `edgeType` is not part of `OperationEdgeAttrs` because it is a universal classification that applies to edges in every graph mode (operation, call, template); `OperationEdgeAttrs` defines only the mode-specific attributes.

```typescript
// How operation graph edge attributes are stored in graphology
// (the variable name is illustrative only):
const storedEdgeAttributes = {
  edgeType: "typed",          // Universal classification (stored alongside attrs)
  compatible: true,           // OperationEdgeAttrs field
  detail: "classify.output → enrich.input",  // OperationEdgeAttrs field
  mismatches: [],             // Empty when compatible
};
```

**Naming note**: Previously named `TypedEdgeAttrs`. Renamed to follow the `{GraphType}EdgeAttrs` pattern used by `CallEdgeAttrs` and `TemplateEdgeAttrs`.

### TriggeredEdgeAttrs (Call Graph)

```typescript
const TriggeredEdgeAttrs = Type.Object({});
type TriggeredEdgeAttrs = Static<typeof TriggeredEdgeAttrs>;
```

Parent-child edges in the call graph carry no additional attributes — the relationship is fully captured by the edge direction and type. This may be extended in the future with `latency` or `metadata` attributes.

### DependencyEdgeAttrs (Call Graph)

```typescript
const DependencyEdgeAttrs = Type.Object({});
type DependencyEdgeAttrs = Static<typeof DependencyEdgeAttrs>;
```

Data dependency edges also carry no additional attributes. Future extensions may include `dataPath` (which field of the output feeds which field of the input).

### CallEdgeAttrs (Call Graph Union)

```typescript
const CallEdgeAttrs = Type.Union([TriggeredEdgeAttrs, DependencyEdgeAttrs]);
type CallEdgeAttrs = Static<typeof CallEdgeAttrs>;
```

A union schema. It is defined as a TypeBox constant so it can be passed to `SerializedGraph`, and its derived type is used as the edge attribute type parameter for call graphs (`FlowGraph<CallNodeAttrs, CallEdgeAttrs>`). Call graph edges can be either `triggered` (parent-child) or `depends_on` (data dependency), distinguished by their edge type. The name follows the `{GraphType}EdgeAttrs` pattern, consistent with `OperationEdgeAttrs` and `TemplateEdgeAttrs`.

### TemplateEdgeAttrs (Workflow Templates)

```typescript
const TemplateEdgeAttrs = Type.Object({
  edgeType: Type.Union([Type.Literal("sequential"), Type.Literal("conditional")]),
  condition: Type.Optional(Type.Unknown()), // For conditional edges: the condition function or expression
});
type TemplateEdgeAttrs = Static<typeof TemplateEdgeAttrs>;
```

Template edges carry an `edgeType` to distinguish sequential flow from conditional branching. Conditional edges optionally store a `condition` that determines whether the target node executes.

**Note**: `TemplateEdgeAttrs.edgeType` uses a constrained union of `"sequential" | "conditional"` rather than the full `EdgeTypeEnum`. Template DAGs never have `triggered`, `depends_on`, or `typed` edges — those belong to call graphs and operation graphs respectively.

### TemplateNodeAttrs (Workflow Templates)

Template DAGs use `OperationNodeAttrs` for their operation nodes — the template doesn't need a separate node type because every node in a template DAG corresponds to an operation invocation. The template's structural information (`Sequential`, `Parallel`, `Conditional`, `Map`) is expressed through edges, not through special node types.

```typescript
// Template DAGs use OperationNodeAttrs for operation nodes
type TemplateNodeAttrs = OperationNodeAttrs;
// This alias makes the intent explicit: a template node represents an operation invocation
```

`TemplateNodeAttrs` is a type alias of `OperationNodeAttrs`, kept for clarity. In the template context, each node carries the same attributes as an operation node (name, namespace, type, input/output schemas). What differs is the edges, which are template-specific (`sequential`, `conditional`) rather than type-compatibility edges (`typed`).

## SerializedGraph Factory

Following the taskgraph pattern, a generic factory for graphology native JSON format:

```typescript
const SerializedGraph = <N extends TSchema, E extends TSchema, G extends TSchema>(
  NodeAttrs: N,
  EdgeAttrs: E,
  GraphAttrs: G,
) =>
  Type.Object({
    attributes: GraphAttrs,
    options: Type.Object({
      type: Type.Literal("directed"),
      multi: Type.Literal(false),
      allowSelfLoops: Type.Literal(false),
    }),
    nodes: Type.Array(Type.Object({
      key: Type.String(),
      attributes: NodeAttrs,
    })),
    edges: Type.Array(Type.Object({
      key: Type.String(),
      source: Type.String(),
      target: Type.String(),
      attributes: EdgeAttrs,
    })),
  });
```

**`multi: false`**: Flowgraph edges are unique per (source, target, edgeType) triple. No parallel edges between the same node pair with the same type.

**`allowSelfLoops: false`**: Operations and calls cannot be their own prerequisite. Self-loops are rejected at construction time.

**`type: "directed"`**: All edges have direction. `A → B` means A is prerequisite/source, B is dependent/target. This matches the graphology convention and the call graph storage schema.

### FlowGraphSerialized variants

Two specialized serialization types, one for each graph type:

```typescript
const OperationGraphSerialized = SerializedGraph(
  OperationNodeAttrs,
  OperationEdgeAttrs,
  Type.Object({}),  // No graph-level attributes
);

const CallGraphSerialized = SerializedGraph(
  CallNodeAttrs,
  CallEdgeAttrs,
  Type.Object({}),  // No graph-level attributes
);
```

For call graphs, edges can be either `triggered` or `depends_on`. Both attribute schemas are currently empty, so the two kinds are distinguished by their edge type rather than by the shape of their attributes.
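
To make the serialized layout concrete, here is a hand-written example document for a two-node operation graph. It is a sketch: the interfaces are local stand-ins for the types that flowgraph derives via `Static<typeof ...>`, the operation data (`task.classify`, `task.enrich`, version `1.0.0`, the placeholder schemas) is invented for illustration, and `danglingEdges` is a hypothetical helper.

```typescript
// Local stand-ins; in flowgraph these shapes come from the TypeBox schemas.
interface OperationNode {
  name: string;
  namespace: string;
  version: string;
  type: "query" | "mutation" | "subscription";
  inputSchema: unknown;
  outputSchema: unknown;
}
interface OperationEdge { compatible: boolean; detail?: string }

interface Serialized<N, E> {
  attributes: Record<string, never>;
  options: { type: "directed"; multi: false; allowSelfLoops: false };
  nodes: Array<{ key: string; attributes: N }>;
  edges: Array<{ key: string; source: string; target: string; attributes: E }>;
}

const example: Serialized<OperationNode, OperationEdge> = {
  attributes: {},
  options: { type: "directed", multi: false, allowSelfLoops: false },
  nodes: [
    {
      key: "task.classify", // node key is namespace.name
      attributes: {
        name: "classify", namespace: "task", version: "1.0.0", type: "query",
        inputSchema: { type: "object" }, outputSchema: { type: "object" },
      },
    },
    {
      key: "task.enrich",
      attributes: {
        name: "enrich", namespace: "task", version: "1.0.0", type: "mutation",
        inputSchema: { type: "object" }, outputSchema: { type: "object" },
      },
    },
  ],
  edges: [
    {
      key: "task.classify->task.enrich", // deterministic edge key
      source: "task.classify",
      target: "task.enrich",
      attributes: { compatible: true, detail: "classify.output → enrich.input" },
    },
  ],
};

// Hypothetical helper: keys of edges whose endpoints are missing from the node list.
function danglingEdges<N, E>(graph: Serialized<N, E>): string[] {
  const keys = new Set(graph.nodes.map((n) => n.key));
  return graph.edges
    .filter((e) => !keys.has(e.source) || !keys.has(e.target))
    .map((e) => e.key);
}

// The document is plain JSON, so it survives a stringify/parse round trip.
const roundTripped: unknown = JSON.parse(JSON.stringify(example));
```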

## Edge Key Convention

Following taskgraph's ADR-006, edge keys are deterministic:

```
${source}->${target}
```

For the operation graph, this means keys like `"task.classify->task.enrich"`. For the call graph, keys like `"req_abc123->req_def456"`.

When multiple edge types exist between the same (source, target) pair (e.g., in the call graph where both `triggered` and `depends_on` edges can connect the same calls), a composite key format is used:

```
${source}->${target}:${edgeType}
```

For example, a `depends_on` edge in the call graph uses `"req_abc123->req_def456:depends_on"` while the `triggered` edge between the same pair uses `"req_abc123->req_def456"`.

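Edge keys are unique within a graph, so there is at most one edge per key, and the composite format keeps keys deterministic when multiple edge types connect the same pair. Note that graphology's `multi: false` is stricter than key uniqueness: it permits only one edge per directed (source, target) pair, so storing two edges of different types between the same pair in the same direction requires relaxing that option.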

This is an exception to the simple `${source}->${target}` pattern, but it's necessary for the call graph's dual-edge-type scenario. If multi-edge support becomes more broadly needed, the constraint can be relaxed and a uniform composite key format adopted.
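
Both key formats can be produced by one small function. This is a sketch with a hypothetical helper name (`edgeKey`); `EdgeType` is redeclared locally so the snippet stands alone.

```typescript
// Local stand-in for the EdgeType type derived from EdgeTypeEnum.
type EdgeType = "triggered" | "depends_on" | "typed" | "sequential" | "conditional";

// Hypothetical helper: the simple key by default, the composite key when an edge type is given.
function edgeKey(source: string, target: string, edgeType?: EdgeType): string {
  const base = `${source}->${target}`;
  return edgeType === undefined ? base : `${base}:${edgeType}`;
}

const simpleKey = edgeKey("task.classify", "task.enrich");
const compositeKey = edgeKey("req_abc123", "req_def456", "depends_on");
```

Here `simpleKey` is `"task.classify->task.enrich"` and `compositeKey` is `"req_abc123->req_def456:depends_on"`, matching the examples above.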

## Constraints

- **TypeBox schemas are the single source of truth** — no hand-written `interface` or `type` definitions for data shapes. All types are derived via `Static<typeof Schema>`.
- **Edge keys are deterministic** — `${source}->${target}` format, following ADR-006 in taskgraph.
- **No parallel edges** — `multi: false` in graphology. At most one edge per (source, target) pair.
- **No self-loops** — `allowSelfLoops: false`. An operation cannot be its own prerequisite.
- **ISO timestamp strings** — Call graph timestamps are ISO 8601 strings, matching the storage schema.
- **Nullable categorical fields** — Following taskgraph's convention, `Type.Optional(Nullable(Enum))` for optional fields that can be explicitly null.
- **`inputSchema` and `outputSchema` on operation nodes** — These are TypeBox schemas (unknown at the graph level), stored for type-compatibility checking. The graph does not validate these schemas — it stores them and makes them available for the `typeCompat` analysis function.
- **No schema version field** — Following taskgraph, the serialized format does not include a version field. It follows graphology's native JSON format and is not a persistence format with backward-compatibility guarantees. Consumers that need persistence wrap it in their own versioned envelope.

## Open Questions

1. **Should `edgeType` be a required field on ALL edges, or only on call graph and template edges?** Operation graph edges are always `typed`, so requiring an explicit `edgeType` attribute there is redundant. Options: (a) make `edgeType` required on all edges, (b) have separate edge attribute types per graph mode, (c) use a union type on edge attributes and let the consumer tag the edge.

2. **Should `CallNodeAttrs.identity` be a `Type.Record` or the structured `Identity` type from operations?** The structured type matches the call protocol and storage schema but creates a dependency on `@alkdev/operations` types. Options: (a) import `Identity` from operations (peer dep), (b) duplicate the type in flowgraph, (c) use `Type.Record` and accept weaker typing.

3. **How should conditional edge conditions be represented?** `condition: Type.Unknown()` is maximally flexible but provides no type safety. Options: (a) `Type.Unknown()` with documentation, (b) `Type.Union([Type.String(), Type.Function(...)])` for expression strings and function references, (c) a dedicated `ConditionSchema` that flowgraph defines.

## References

- Taskgraph schema patterns: `@alkdev/taskgraph_ts/docs/architecture/schemas.md`
- Call graph storage schema: `@alkdev/alkhub_ts/docs/architecture/storage/call-graph.md`
- Call event types: `@alkdev/operations/src/call.ts`
- Operation types: `@alkdev/operations/src/types.ts`
- ujsx schema: `@alkdev/ujsx/docs/architecture/schema.md`