Align storage & architecture specs with published npm libraries
Systematically compared @alkdev/taskgraph, @alkdev/operations, and
@alkdev/flowgraph against storage/arch specs and fixed all mismatches.
Key changes:
Tasks (storage/tasks.md + ADR-011):
- Rename TaskFrontmatter → TaskInput to match library export
- Fix dependsOn (was depends_on) in field mappings — library uses
camelCase; parseFrontmatter normalizes YAML snake_case on input
- Document DependencyEdge shape {from, to, qualityRetention?} and
DB↔library field mapping
- Document graph node vs DB column distinction (TaskGraphNodeAttrs
is a subset of TaskInput)
- Fix default risk fallback from low → medium (matches resolveDefaults)
- Fix cross-project guard column references (dependentTaskId, not taskId)
- Clarify @alkdev/taskgraph TS is source of truth; frontmatter is for
LLM output parsing and legacy imports, not Rust CLI
- Add complete library exports reference
Operations (storage/spokes.md + operations.md):
- Add version, title, _meta columns to operations table (required by
OperationSpec, were missing)
- Fix type casing: query/mutation/subscription (lowercase, matching
OperationType runtime values)
- Make outputSchema and accessControl NOT NULL (matching library)
- Document ErrorDefinition shape {code, description, schema, httpStatus?}
- Document _meta vs commonCols.metadata distinction
- Add registerAll, get, getHandler, getByName, list, subscribe methods
- Fix buildCallHandler signature ({ registry, callMap })
- Fix OperationType values (lowercase)
Call graph (storage/call-graph.md + call-graph.md):
- Change operationId to NOT NULL with RESTRICT FK (was nullable/SET NULL)
— matches flowgraph's required CallNodeAttrs.operationId
- Document sentinel __removed__ operation strategy for deletions
- Document ISO 8601 string ↔ timestamptz conversion requirement
- Rewrite CallEventMap to match actual library: flat dot-notation keys,
timestamp on all events, nested error structure, optional output on
completed event
- Remove call.running event (doesn't exist in library) — hub calls
updateStatus(running) directly on dispatch
- Fix buildCallHandler({ registry, callMap }) signature
- Fix PendingRequestMap constructor (positional EventTarget)
- Add updateCall/removeCall/graph methods to API summary
- Document abort cascade as hub logic, not flowgraph logic
- Add open questions for operation deletion and reactive vs call graph
semantics
Table reference (storage/table-reference.md):
- Update call_graph_nodes.operationId cascade to RESTRICT
- Update operations.type comment to lowercase
- Update status enum reference
@@ -1,6 +1,6 @@
|
||||
---
|
||||
status: draft
|
||||
last_updated: 2026-05-22
|
||||
last_updated: 2026-05-25
|
||||
---
|
||||
|
||||
# Call Protocol, Call Graph & Operation Graph
|
||||
@@ -44,54 +44,78 @@ This means `call` is semantically `subscribe().next()` — a subscription that c
|
||||
|
||||
## Call Event Types
|
||||
|
||||
All communication flows through typed events:
|
||||
|
||||
> The call event TypeBox schemas are defined in `@alkdev/operations` as `CallEventSchema`. The shape shown here is the current design; verify against the package source for any minor differences.
|
||||
All communication flows through typed events. The call event TypeBox schemas are defined in `@alkdev/operations` as `CallEventSchema` (flat dot-notation keys like `"call.requested"`). The shapes below match the library's actual event types.
|
||||
|
||||
```ts
|
||||
import { Type } from "@alkdev/typebox"
|
||||
|
||||
export const CallEventMap = {
|
||||
call: {
|
||||
requested: Type.Object({
|
||||
// CallEventSchema uses flat dot-notation keys (not nested objects):
|
||||
// "call.requested", "call.responded", "call.completed", "call.aborted", "call.error"
|
||||
// The shapes below show the event payload for each type.
|
||||
|
||||
// call.requested
|
||||
Type.Object({
|
||||
type: Type.Literal("call.requested"),
|
||||
requestId: Type.String(),
|
||||
operationId: Type.String(),
|
||||
input: Type.Unknown(),
|
||||
timestamp: Type.String(), // ISO 8601 — used as source for startedAt/completedAt
|
||||
parentRequestId: Type.Optional(Type.String()),
|
||||
deadline: Type.Optional(Type.Number()),
|
||||
identity: Type.Optional(Type.Object({
|
||||
id: Type.String(),
|
||||
scopes: Type.Array(Type.String()),
|
||||
resources: Type.Optional(Type.Record(Type.String(), Type.Array(Type.String())))
|
||||
}))
|
||||
}),
|
||||
responded: Type.Object({
|
||||
})),
|
||||
startedAt: Type.Optional(Type.String()), // ISO 8601 — if provided, overrides timestamp for startedAt
|
||||
})
|
||||
|
||||
// call.responded
|
||||
Type.Object({
|
||||
type: Type.Literal("call.responded"),
|
||||
requestId: Type.String(),
|
||||
output: Type.Unknown() // ResponseEnvelope from @alkdev/operations
|
||||
}),
|
||||
completed: Type.Object({
|
||||
requestId: Type.String()
|
||||
}),
|
||||
aborted: Type.Object({
|
||||
requestId: Type.String()
|
||||
}),
|
||||
error: Type.Object({
|
||||
output: Type.Unknown(), // ResponseEnvelope from @alkdev/operations
|
||||
timestamp: Type.String(), // ISO 8601
|
||||
})
|
||||
|
||||
// call.completed
|
||||
Type.Object({
|
||||
type: Type.Literal("call.completed"),
|
||||
requestId: Type.String(),
|
||||
output: Type.Optional(Type.Unknown()), // Optional final output
|
||||
timestamp: Type.String(), // ISO 8601
|
||||
})
|
||||
|
||||
// call.aborted
|
||||
Type.Object({
|
||||
type: Type.Literal("call.aborted"),
|
||||
requestId: Type.String(),
|
||||
timestamp: Type.String(), // ISO 8601
|
||||
})
|
||||
|
||||
// call.error
|
||||
Type.Object({
|
||||
type: Type.Literal("call.error"),
|
||||
requestId: Type.String(),
|
||||
error: Type.Object({ // Error is nested under "error" key
|
||||
code: Type.String(),
|
||||
message: Type.String(),
|
||||
details: Type.Optional(Type.Unknown())
|
||||
details: Type.Optional(Type.Unknown()),
|
||||
}),
|
||||
timestamp: Type.String(), // ISO 8601
|
||||
})
|
||||
}
|
||||
} as const
|
||||
```
|
||||
|
||||
**Note on `deadline`**: The `call.requested` event above does not include a `deadline` field. Deadlines are a `PendingRequestMap` concept (timeouts are applied at the protocol layer, not persisted in the call graph). If a call times out, the `PendingRequestMap` emits a `call.error` with code `TIMEOUT`.
|
||||
|
||||
**Note: no `call.running` event**: The library's `CallEventMapValue` union only has 5 event types. There is no `call.running` event. When the hub dispatches an operation handler, it calls `flowGraph.updateStatus(requestId, "running", { startedAt: now.toISOString() })` directly. This is a hub-initiated state transition, not an event. See [Write Path](#write-path) for details.
|
||||
|
||||
### Event Semantics
|
||||
|
||||
- **`call.requested`** — Initiates a call. Creates a call graph node (status: `pending`) and adds a `triggered` edge if `parentRequestId` is present.
|
||||
- **`call.responded`** — Carries the call result. For one-shot calls, this is the terminal event that resolves the `Promise<ResponseEnvelope>`. The `output` field contains a `ResponseEnvelope` (with `data` and `meta` fields) from `@alkdev/operations`.
|
||||
- **`call.completed`** — Terminal completion signal, idempotent if `call.responded` was already received. For subscriptions, fires after the last `call.responded` to signal stream end. For one-shot calls, the `PendingRequestMap` may emit `call.completed` as a separate event or as part of `call.responded` processing. In flowgraph, this event fills `completedAt` if it was not already set.
|
||||
- **`call.aborted`** — Call was cancelled. Sets status to `aborted` and cascades to children.
|
||||
- **`call.error`** — Call failed with an error. Sets status to `failed` and stores the error.
|
||||
- **`call.requested`** — Initiates a call. Creates a call graph node (status: `pending`) and adds a `triggered` edge if `parentRequestId` is present. If `startedAt` is provided in the event, it overrides `timestamp` for the node's `startedAt`.
|
||||
- **`call.responded`** — Carries the call result. For one-shot calls, this is the terminal event that resolves the `Promise<ResponseEnvelope>`. The `output` field contains a `ResponseEnvelope` (with `data` and `meta` fields) from `@alkdev/operations`. Sets `status: "completed"` and `completedAt: event.timestamp`.
|
||||
- **`call.completed`** — Terminal completion signal, idempotent if `call.responded` was already received. For subscriptions, fires after the last `call.responded` to signal stream end. May include optional `output`. Sets `completedAt: event.timestamp` if not already set.
|
||||
- **`call.aborted`** — Call was cancelled. Sets status to `aborted` and `completedAt: event.timestamp`. **Note**: flowgraph's `updateFromEvent()` does NOT cascade aborts to children — the hub's `CallHandler` is responsible for cascading `call.aborted` to descendant calls via the pubsub layer.
|
||||
- **`call.error`** — Call failed with an error. Sets status to `failed`, stores the error (nested under `error: { code, message, details? }`), and sets `completedAt: event.timestamp`.
|
||||
|
||||
**Note on `@alkdev/flowgraph`**: The `CallEventMapValue` type in `@alkdev/flowgraph/schema` defines the union of these event types. Flowgraph's `FlowGraph.fromCallEvents()` and `updateFromEvent()` consume these events directly to populate the call graph. The `CallStatus` enum in flowgraph (`pending`, `running`, `completed`, `failed`, `aborted`) aligns with the statuses in the call protocol events.
|
||||
|
||||
@@ -133,8 +157,8 @@ Manages in-flight requests and provides the `call()` interface:
|
||||
// From @alkdev/operations
|
||||
import { PendingRequestMap } from "@alkdev/operations"
|
||||
|
||||
// Construction — takes optional EventTarget for pluggable transport
|
||||
const prm = new PendingRequestMap({ eventTarget })
|
||||
// Construction — takes optional EventTarget for pluggable transport (positional, not options object)
|
||||
const prm = new PendingRequestMap(eventTarget) // eventTarget may be omitted
|
||||
|
||||
// Call protocol — call() returns Promise<ResponseEnvelope>
|
||||
const envelope = await prm.call(operationId, input, { deadline, identity })
|
||||
@@ -158,7 +182,7 @@ prm.abort(requestId)
|
||||
- `subscribe()` returns `AsyncIterable<ResponseEnvelope>`
|
||||
- `respond()` requires `isResponseEnvelope(output)`
|
||||
- Built-in deadline and idle timeout support
|
||||
- Constructor takes optional `EventTarget` for pluggable transport
|
||||
- Constructor takes optional `EventTarget` for pluggable transport (positional parameter, not options object)
|
||||
|
||||
## CallHandler
|
||||
|
||||
@@ -167,7 +191,8 @@ Bridges pubsub events to `OperationRegistry.execute()`. Performs access control
|
||||
```ts
|
||||
import { buildCallHandler } from "@alkdev/operations"
|
||||
|
||||
const handler = buildCallHandler({ registry, eventTarget })
|
||||
const handler = buildCallHandler({ registry, callMap })
|
||||
// callMap: PendingRequestMap instance (not raw EventTarget)
|
||||
// subscribes to call.requested events
|
||||
// checks access control (requiredScopes, resource permissions) against Identity
|
||||
// executes via registry, dispatches call.responded on success
|
||||
@@ -282,11 +307,11 @@ The call graph is populated by `FlowGraph.fromCallEvents(events)` or incremental
|
||||
| Event | Graph Mutation |
|
||||
|-------|---------------|
|
||||
| `call.requested` | `addCall(attrs)` — creates node (status: `pending`) + `triggered` edge if `parentRequestId` present |
|
||||
| `call.responded` | `updateCall(requestId, { status: "completed", output, completedAt })` |
|
||||
| `call.completed` | `updateCall(requestId, { completedAt })` — idempotent if already responded, sets `completedAt` if missing |
|
||||
| `call.error` | `updateCall(requestId, { status: "failed", error: { code, message, details? } })` |
|
||||
| `call.aborted` | `updateStatus(requestId, "aborted")` + cascade to children |
|
||||
| `call.running` | `updateStatus(requestId, "running")` — when the call starts executing (hub dispatches to handler) |
|
||||
| `call.responded` | `updateFromEvent()` sets `status: "completed"`, `output`, `completedAt: event.timestamp` (idempotent for terminal states) |
|
||||
| `call.completed` | Sets `completedAt: event.timestamp` if not already set. Idempotent — if already `completed`, no-op. May also set `output` if present in event. |
|
||||
| `call.error` | `updateFromEvent()` sets `status: "failed"`, `error: { code, message, details? }`, `completedAt: event.timestamp` |
|
||||
| `call.aborted` | `updateFromEvent()` sets `status: "aborted"`, `completedAt: event.timestamp`. **No cascade**: `updateFromEvent()` does not abort children — the hub's `CallHandler` handles cascading. |
|
||||
| _(hub-initiated)_ | `updateStatus(requestId, "running", { startedAt: now.toISOString() })` — **Not an event.** The hub's `CallHandler` calls `updateStatus()` directly when it dispatches the operation handler, transitioning `pending` → `running`. |
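
The event-to-mutation mapping in the table above can be sketched as a pure reducer over a single node. This is an illustration only, not flowgraph's `updateFromEvent()` implementation: `CallNode`, `CallUpdateEvent`, and `applyEvent` are hypothetical names, and the terminal-state guard on `call.error` and `call.aborted` is an assumption (the table only states idempotency for `call.responded` and `call.completed`).

```ts
type CallStatus = "pending" | "running" | "completed" | "failed" | "aborted"
type CallError = { code: string; message: string; details?: unknown }

// Hypothetical, reduced view of a call graph node
interface CallNode {
  requestId: string
  status: CallStatus
  output?: unknown
  error?: CallError
  completedAt?: string // ISO 8601
}

type CallUpdateEvent =
  | { type: "call.responded"; requestId: string; output: unknown; timestamp: string }
  | { type: "call.completed"; requestId: string; output?: unknown; timestamp: string }
  | { type: "call.error"; requestId: string; error: CallError; timestamp: string }
  | { type: "call.aborted"; requestId: string; timestamp: string }

const isTerminal = (status: CallStatus): boolean =>
  status === "completed" || status === "failed" || status === "aborted"

function applyEvent(node: CallNode, event: CallUpdateEvent): CallNode {
  switch (event.type) {
    case "call.responded":
      // Idempotent for terminal states
      if (isTerminal(node.status)) return node
      return { ...node, status: "completed", output: event.output, completedAt: event.timestamp }
    case "call.completed":
      // Only fills completedAt when it is missing
      return node.completedAt === undefined ? { ...node, completedAt: event.timestamp } : node
    case "call.error":
      if (isTerminal(node.status)) return node // assumption: terminal states are not overwritten
      return { ...node, status: "failed", error: event.error, completedAt: event.timestamp }
    case "call.aborted":
      if (isTerminal(node.status)) return node // assumption: terminal states are not overwritten
      return { ...node, status: "aborted", completedAt: event.timestamp }
  }
}
```

The reducer deliberately has no `running` case: that transition is hub-initiated and does not arrive as an event.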
|
||||
|
||||
### Call Status State Machine
|
||||
|
||||
@@ -339,24 +364,29 @@ const callGraph = FlowGraph.fromCallEvents(storedEvents)
|
||||
const callGraph = new FlowGraph(CallNodeAttrs, CallEdgeAttrs)
|
||||
|
||||
// Process events
|
||||
callGraph.updateFromEvent(event) // handles all call.* event types
|
||||
callGraph.updateFromEvent(event) // handles all 5 call.* event types
|
||||
|
||||
// Status management
|
||||
callGraph.updateStatus(requestId, "running") // validates state machine transition
|
||||
callGraph.updateStatus(requestId, "completed") // throws if not currently "running"
|
||||
callGraph.updateStatus(requestId, "running", { startedAt: now.toISOString() }) // hub-initiated, not event-driven
|
||||
callGraph.updateStatus(requestId, "completed") // validates state machine transition
|
||||
|
||||
// Edge management
|
||||
callGraph.addCall({ requestId, operationId, status: "pending", parentRequestId?, input?, identity? })
|
||||
// Call management
|
||||
callGraph.addCall({ requestId, operationId, status: "pending", parentRequestId?, input?, identity?, startedAt? })
|
||||
callGraph.updateCall(requestId, attrs) // partial update of any CallNodeAttrs
|
||||
callGraph.removeCall(requestId)
|
||||
callGraph.addDependency(sourceRequestId, targetRequestId) // depends_on edge
|
||||
|
||||
// Queries
|
||||
callGraph.children(requestId) // direct children via triggered edges
|
||||
callGraph.descendants(requestId) // all descendants
|
||||
callGraph.descendants(nodeId) // all descendants
|
||||
callGraph.lineage(requestId) // ancestor chain from root to this call
|
||||
callGraph.getRoots() // calls with no parentRequestId
|
||||
callGraph.filterByStatus("running") // all running calls
|
||||
callGraph.duration(requestId) // completedAt - startedAt in ms
|
||||
|
||||
// Escape hatch
|
||||
callGraph.graph // raw graphology DirectedGraph
|
||||
|
||||
// Serialization (for Postgres persistence)
|
||||
const data = callGraph.export() // -> CallGraphSerialized
|
||||
const restored = FlowGraph.fromJSON(data)
|
||||
@@ -417,11 +447,11 @@ The storage layer persists individual `call_graph_nodes` and `call_graph_edges`
|
||||
The hub's `CallHandler` is responsible for writing call graph data to Postgres. When a call protocol event arrives:
|
||||
|
||||
1. **`call.requested`**: The `CallHandler` creates a row in `call_graph_nodes` (status: `pending`) and, if `parentRequestId` is present, a `triggered` edge in `call_graph_edges`. This write happens **synchronously before dispatching** to ensure the call is tracked even if the handler fails immediately.
|
||||
2. **`call.responded`**: Updates the node's status to `completed`, sets `output` (unwrapped from the `ResponseEnvelope` — only `data` is stored, not `meta`), and sets `completedAt`.
|
||||
3. **`call.error`**: Updates status to `failed`, sets `error`, and sets `completedAt`.
|
||||
4. **`call.aborted`**: Updates status to `aborted` and sets `completedAt`. The hub then cascades the abort to child calls.
|
||||
5. **`call.completed`**: Sets `completedAt` if not already set. Idempotent — no-op if the call is already `completed`.
|
||||
6. **`call.running`**: Updates status from `pending` to `running` and sets `startedAt`.
|
||||
2. **Handler dispatch**: The `CallHandler` dispatches the operation handler. At this point it updates the node status to `running` and sets `startedAt` — this is **not** an event, but a hub-initiated `updateStatus()` call on both the in-memory flowgraph and the DB row.
|
||||
3. **`call.responded`**: Updates the node's status to `completed`, sets `output` (unwrapped from the `ResponseEnvelope` — only `data` is stored, not `meta`), and sets `completedAt`.
|
||||
4. **`call.error`**: Updates status to `failed`, sets `error`, and sets `completedAt`.
|
||||
5. **`call.aborted`**: Updates status to `aborted` and sets `completedAt`. The hub's `CallHandler` then cascades the abort to child calls (this is hub logic, not flowgraph's `updateFromEvent()`).
|
||||
6. **`call.completed`**: Sets `completedAt` if not already set. Idempotent — no-op if the call is already `completed`.
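
Since the abort cascade in the write path is hub logic rather than flowgraph logic, one way to picture it is a breadth-first walk over `triggered` edges that emits `call.aborted` for each descendant. This is a minimal sketch under assumptions: `cascadeAbort`, `childrenOf`, and `emit` are hypothetical names standing in for the hub's edge lookup and pubsub dispatch, and the sketch does not skip descendants that are already in a terminal state.

```ts
type AbortEvent = { type: "call.aborted"; requestId: string; timestamp: string }

// childrenOf: hypothetical lookup of direct children via `triggered` edges
// emit: hypothetical stand-in for dispatching on the pubsub layer
function cascadeAbort(
  rootRequestId: string,
  childrenOf: (requestId: string) => string[],
  emit: (event: AbortEvent) => void,
  now: () => Date = () => new Date(),
): string[] {
  const aborted: string[] = []
  const seen = new Set<string>([rootRequestId])
  const queue = [...childrenOf(rootRequestId)]
  while (queue.length > 0) {
    const requestId = queue.shift() as string
    if (seen.has(requestId)) continue // guard against revisiting a node
    seen.add(requestId)
    emit({ type: "call.aborted", requestId, timestamp: now().toISOString() })
    aborted.push(requestId)
    queue.push(...childrenOf(requestId))
  }
  return aborted
}
```

The root call itself is not re-emitted here; the sketch assumes its own `call.aborted` event has already been processed.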
|
||||
|
||||
Error handling: If a DB write fails, the call still proceeds (the handler has already been invoked). The hub logs the write failure and continues. Call graph data is best-effort — the in-memory flowgraph is the authoritative source for running calls; the DB is for persistence and observability.
|
||||
|
||||
@@ -440,16 +470,16 @@ When reconstructing a flowgraph from the database, the hub uses `requestId` as t
|
||||
|--------|------|-------|
|
||||
| commonCols | — | id, metadata, createdAt, updatedAt |
|
||||
| requestId | text NOT NULL UNIQUE | Protocol-level correlation key. Also the flowgraph node key. |
|
||||
| operationId | text | FK → operations.id. Nullable — survives operation removal. |
|
||||
| operationId | text NOT NULL | FK → operations.id (RESTRICT). NOT NULL — `CallNodeAttrs.operationId` is required in flowgraph. |
|
||||
| parentRequestId | text | Denormalized parent — fast point lookup. Redundant with `triggered` edge. |
|
||||
| identity | jsonb | Caller identity: `{ id, scopes, resources }` |
|
||||
| identity | jsonb | Caller identity: `{ id, scopes, resources? }` |
|
||||
| callerAccountId | text | FK → accounts.id (ON DELETE SET NULL). System calls are nullable. |
|
||||
| status | text NOT NULL | Matches `CallStatus` enum: `pending`, `running`, `completed`, `failed`, `aborted` |
|
||||
| input | jsonb | Call input (redacted, truncated — see storage/call-graph.md) |
|
||||
| output | jsonb | Call output (on success) |
|
||||
| error | jsonb | `{ code, message, details? }` (on failure) |
|
||||
| startedAt | timestamp with tz | When call was dispatched (maps to flowgraph `startedAt`) |
|
||||
| completedAt | timestamp with tz | When call completed/failed/aborted (maps to flowgraph `completedAt`) |
|
||||
| error | jsonb | `{ code, message, details? }` (on failure, nested under `error` key matching flowgraph) |
|
||||
| startedAt | timestamp with tz | When call was dispatched. **Type conversion**: flowgraph stores as ISO 8601 string; DB stores as `timestamptz`. |
|
||||
| completedAt | timestamp with tz | When call completed/failed/aborted. Same type conversion as `startedAt`. |
|
||||
|
||||
### `call_graph_edges` — Typed directed edges between calls
|
||||
|
||||
@@ -485,6 +515,12 @@ A `WebSocketEventTarget` implementing `TypedEventTarget` makes each spoke runner
|
||||
|
||||
The call protocol itself, `PendingRequestMap`, `CallHandler`, `buildEnv` dual-mode, call graph auto-tracking, and reactive workflow execution are **in the initial implementation**. They're not much code and they prevent the need to bolt on ad-hoc error handling and abort logic in every coordination operation.
|
||||
|
||||
## Open Questions
|
||||
|
||||
1. **Operation deletion and call graph referential integrity**: The `call_graph_nodes.operationId` column has a RESTRICT FK to `operations.id`. An operation cannot be deleted while any call records reference it. For v1, the strategy is to deny removal while call records exist. If operation removal becomes necessary (e.g., cleanup of old operations), the hub would need to either: (a) reassign all referencing call records to a sentinel `__removed__` operation (pre-seeded in migrations with `id='__removed__'`, `namespace='system'`), then delete the original operation, or (b) accept that historical call records reference operations that may have been provided by disconnected spokes — in which case, consider making `operationId` nullable in flowgraph's `CallNodeAttrs` so the hub can NULL the FK instead of requiring a sentinel row. This requires coordination with the `@alkdev/flowgraph` package.
|
||||
|
||||
2. **Reactive vs. call graph `requested` semantics**: In `FlowGraph`, `call.requested` creates a node in `pending` state. In `WorkflowReactiveRoot`, `call.requested` maps to `NodeStatus.running` (the reactive model assumes a template node starts executing when requested). This is a deliberate semantic difference — the reactive model tracks execution progress, while the call graph model tracks protocol state. The spec documents this, but implementers should be aware that feeding the same event to both models produces different initial statuses.
|
||||
|
||||
## Dependencies
|
||||
|
||||
```
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
---
|
||||
status: draft
|
||||
last_updated: 2026-05-18
|
||||
last_updated: 2026-05-25
|
||||
---
|
||||
|
||||
# Operations System
|
||||
@@ -15,25 +15,30 @@ The operations system is the universal abstraction for all work in the alk.dev p
|
||||
|
||||
### Core Types (`operations/types.ts`)
|
||||
|
||||
- `OperationType` — QUERY (read-only), MUTATION (write), SUBSCRIPTION (async generator)
|
||||
- `OperationSpec` — serializable, hashable subset (name, namespace, version, type, description, tags, inputSchema, outputSchema, errorSchemas, accessControl, \_meta)
|
||||
- `OperationType` — `QUERY = "query"`, `MUTATION = "mutation"`, `SUBSCRIPTION = "subscription"` (enum names uppercase, string values lowercase)
|
||||
- `OperationSpec` — serializable, hashable subset (name, namespace, version, type, description, title?, tags?, inputSchema, outputSchema, errorSchemas?, accessControl, _meta?)
|
||||
- `IOperationDefinition` — extends `OperationSpec` with runtime `handler`
|
||||
- `OperationContext` — metadata, requestId, parentRequestId, identity, env
|
||||
- `AccessControl` — requiredScopes (all match), requiredScopesAny (any match), resourceType, resourceAction. See below.
|
||||
- `ResponseEnvelope<T>` — universal result wrapper with source tracking (local/http/mcp). All `execute()` and `env` functions return `ResponseEnvelope<T>`.
|
||||
- `CallError` / `InfrastructureErrorCode` — structured error codes: `OPERATION_NOT_FOUND`, `ACCESS_DENIED`, `VALIDATION_ERROR`, `TIMEOUT`, `ABORTED`, `EXECUTION_ERROR`, `UNKNOWN_ERROR`.
|
||||
- `ErrorDefinition` — structured error schema declaration: `{ code: string, description: string, schema: unknown, httpStatus?: number }`
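
The casing rule for `OperationType` and the `ErrorDefinition` shape listed above can be mirrored locally as follows. This is a sketch, not the package source: the authoritative definitions live in `@alkdev/operations`, a const object stands in for the library's enum here, and `isOperationType` is a hypothetical guard for values read back from the `operations.type` column.

```ts
// Uppercase names, lowercase runtime values (as described above)
const OperationType = {
  QUERY: "query",
  MUTATION: "mutation",
  SUBSCRIPTION: "subscription",
} as const
type OperationType = (typeof OperationType)[keyof typeof OperationType]

// Shape taken from the ErrorDefinition bullet above
interface ErrorDefinition {
  code: string
  description: string
  schema: unknown
  httpStatus?: number
}

// Hypothetical guard: rejects the old uppercase spellings
function isOperationType(value: string): value is OperationType {
  return (Object.values(OperationType) as string[]).includes(value)
}

const notFound: ErrorDefinition = {
  code: "NOT_FOUND", // illustrative code, not taken from the library
  description: "The requested resource does not exist",
  schema: {},
  httpStatus: 404,
}
```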
|
||||
|
||||
### Registry (`operations/registry.ts`)
|
||||
|
||||
- Register by `{namespace}.{name}` key
|
||||
- `register()` now accepts `OperationSpec & { handler? }` (handler can be registered separately)
|
||||
- `register()` accepts `OperationSpec & { handler? }` (handler can be registered separately)
|
||||
- `registerSpec()` / `registerHandler()` — separate spec and handler registration
|
||||
- `registerAll(definitions)` — bulk registration
|
||||
- `execute()` returns `Promise<ResponseEnvelope<TOutput>>` (not `Promise<TOutput>`)
|
||||
- Constructor accepts optional `SchemaAdapter` for Zod/Valibot conversion
|
||||
- Access control is enforced in the registry (via `enforceAccess`)
|
||||
- Validate input before handler execution
|
||||
- Warn on output schema mismatch (don't throw)
|
||||
- `getSpec()` / `getAllSpecs()` for serializable specs
|
||||
- `get(name)` / `getByName(namespace, name)` — retrieve definitions
|
||||
- `getHandler(name)` — retrieve handler function
|
||||
- `list()` — list all registered operation names
|
||||
|
||||
### Scanner (`operations/scanner.ts`)
|
||||
|
||||
@@ -51,6 +56,7 @@ The operations system is the universal abstraction for all work in the alk.dev p
|
||||
- Sets `trusted: true` on nested context (bypasses access control for internal calls)
|
||||
- Env functions return `Promise<ResponseEnvelope>`, callers use `unwrap(envelope)` or `envelope.data`
|
||||
- Filters SUBSCRIPTION operations out of env
|
||||
- `subscribe(registry, operationId, input, context)` — standalone function for subscription operations
|
||||
|
||||
### FromSchema (`operations/from_schema.ts`)
|
||||
|
||||
@@ -119,7 +125,7 @@ Operations use `buildEnv()` which supports direct mode (see call-graph.md):
|
||||
|
||||
- **Direct mode**: `buildEnv({ registry, context })` → env functions call `registry.execute()`
|
||||
|
||||
The call protocol (PendingRequestMap, CallHandler) is part of `@alkdev/operations`. It provides call graph tracking, abort cascading, and structured error handling across all transports. See call-graph.md for the full spec.
|
||||
The call protocol (PendingRequestMap, CallHandler) is part of `@alkdev/operations`. It provides call graph tracking, abort cascading, and structured error handling across all transports. The `buildCallHandler({ registry, callMap })` creates a `CallHandler` that subscribes to `call.requested` events on the `callMap` (a `PendingRequestMap`), enforces access control, and dispatches via `registry.execute()`. See call-graph.md for the full spec.
|
||||
|
||||
## How It Connects to Everything Else
|
||||
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
---
|
||||
status: draft
|
||||
last_updated: 2026-05-22
|
||||
last_updated: 2026-05-25
|
||||
---
|
||||
|
||||
# Table Schemas: Call Graph
|
||||
@@ -15,23 +15,29 @@ Call graph entries for observability. Every operation invocation creates a node;
|
||||
|--------|------|-------|
|
||||
| commonCols | — | id, metadata, createdAt, updatedAt |
|
||||
| requestId | text NOT NULL UNIQUE | Protocol-level correlation key. Also serves as the flowgraph node key. |
|
||||
| operationId | text | FK → operations.id — The operation definition that was called. Nullable — if an operation definition is removed, the call record survives but the operation reference is nulled. Uses the `operations` table (post-remap namespace+name), not the pre-remap identifier. |
|
||||
| operationId | text NOT NULL | FK → operations.id (RESTRICT) — The operation definition that was called. **NOT NULL** — `@alkdev/flowgraph`'s `CallNodeAttrs.operationId` is a required string. The RESTRICT constraint means an operation cannot be deleted while any call records reference it. **Deletion strategy**: The hub should deny operation removal while any call records reference the operation. If removal is required (e.g., cleanup), the hub must first reassign those call records to a sentinel operation row (pre-seeded in migrations with id `__removed__`, name `removed`, namespace `system`), then delete the original operation. |
|
||||
| parentRequestId | text | Parent call's requestId (null = top-level call). Denormalized fast lookup — redundant with `triggered` edge in `call_graph_edges`. |
|
||||
| identity | jsonb | Caller identity at time of call (`{ id, scopes, resources }`), matching `@alkdev/flowgraph/schema`'s `CallNodeAttrs.identity`. |
|
||||
| identity | jsonb | Caller identity at time of call (`{ id, scopes, resources? }`), matching `@alkdev/flowgraph/schema`'s `CallNodeAttrs.identity`. |
|
||||
| callerAccountId | text | FK → accounts.id — The account that initiated this call. Nullable — system-initiated calls may not have an account. onDelete: SET NULL (calls survive account deletion for audit). This follows the D1 cascade policy — live session/call data uses nullable FK + SET NULL to preserve audit history. |
|
||||
| status | text NOT NULL | Matches `@alkdev/flowgraph/schema`'s `CallStatus` enum: `pending`, `running`, `completed`, `failed`, `aborted`. State transitions are enforced by the flowgraph state machine — `pending → running → completed/failed` and `pending/running → aborted`. |
|
||||
| input | jsonb | Call input (redacted before storage — see Payload Redaction). |
|
||||
| output | jsonb | Call output (on success). **Contains `ResponseEnvelope.data` only** — the hub unwraps the envelope before storing in the call graph. Maps to `CallNodeAttrs.output` in flowgraph. |
|
||||
| error | jsonb | `{ code, message, details? }` (on failure). Maps to `CallNodeAttrs.error` in flowgraph. |
|
||||
| startedAt | timestamp with tz | When call was dispatched. Maps to `CallNodeAttrs.startedAt` in flowgraph. |
|
||||
| completedAt | timestamp with tz | When call completed/failed/aborted. Maps to `CallNodeAttrs.completedAt` in flowgraph. |
|
||||
| startedAt | timestamp with tz | When call was dispatched. Maps to `CallNodeAttrs.startedAt` in flowgraph. **Type conversion**: flowgraph stores timestamps as ISO 8601 strings; storage layer must convert between `timestamptz` and ISO strings during read/write. |
|
||||
| completedAt | timestamp with tz | When call completed/failed/aborted. Maps to `CallNodeAttrs.completedAt` in flowgraph. **Type conversion**: same as `startedAt`. |
|
||||
|
||||
**identity boundaries**: Caller identity at time of call (account, scopes, resources). This is immutable after creation. **metadata boundaries**: Retention metadata and other system fields. User-facing data goes in `input`/`output`.
|
||||
|
||||
**Timestamp serialization**: `@alkdev/flowgraph`'s `CallNodeAttrs` stores `startedAt` and `completedAt` as **ISO 8601 strings** (`Type.Optional(Type.String())`), not native Date objects. The storage layer stores them as Postgres `timestamp with tz`. The hub must:
|
||||
- **On read (DB→flowgraph)**: Convert `timestamptz` → ISO string via `.toISOString()`
|
||||
- **On write (flowgraph→DB)**: Convert ISO string → `Date` or pass as parameterized timestamp
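
A minimal sketch of the two conversions, assuming the database driver hands back `timestamptz` columns as JavaScript `Date` objects (or `null`). The helper names `dbToFlowgraphTimestamp` and `flowgraphToDbTimestamp` are hypothetical; only the ISO 8601 string vs. `timestamptz` requirement comes from the spec.

```ts
// DB row value -> flowgraph attribute (ISO 8601 string, or undefined when unset)
function dbToFlowgraphTimestamp(value: Date | null): string | undefined {
  return value === null ? undefined : value.toISOString()
}

// flowgraph attribute -> value suitable for a parameterized timestamptz write
function flowgraphToDbTimestamp(value: string | undefined): Date | null {
  if (value === undefined) return null
  const parsed = new Date(value)
  if (Number.isNaN(parsed.getTime())) {
    throw new Error(`Invalid ISO 8601 timestamp: ${value}`)
  }
  return parsed
}
```

Rejecting unparseable strings at the boundary keeps a malformed timestamp from reaching the database as an invalid date.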
|
||||
|
||||
**Indexes**: `idx_call_graph_nodes_request_id` UNIQUE on `(requestId)`, `idx_call_graph_nodes_operation_id` on `(operationId)`, `idx_call_graph_nodes_status` on `(status)`, `idx_call_graph_nodes_caller_account_id` on `(callerAccountId)`, `idx_call_graph_nodes_created_at` on `(createdAt)` — time-range queries, `idx_call_graph_nodes_operation_created` on `(operationId, createdAt)` — operation + time queries, `idx_call_graph_nodes_started_at` on `(startedAt)` — p99 latency analysis.
|
||||
|
||||
**Call graph payload size**: The `input` and `output` JSONB columns can grow arbitrarily large. For observability, the full payload is valuable but can bloat storage. Strategy: truncate payloads larger than 10KB to `{ _truncated: true, size: number, preview: string }` at the application layer. Full payloads can optionally be stored in object storage (S3/MinIO) with a reference URL in the `metadata` column. This keeps the call graph table lean while preserving the ability to inspect large payloads when needed.
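
The truncation strategy could look roughly like the sketch below. The 10KB threshold and the `{ _truncated, size, preview }` shape come from the paragraph above; everything else is an assumption: `truncatePayload` is a hypothetical name, the preview length of 256 characters is not specified anywhere, and size is measured as the length of the serialized JSON string, which only approximates bytes for non-ASCII content.

```ts
const MAX_PAYLOAD_SIZE = 10 * 1024 // 10KB threshold from the spec
const PREVIEW_CHARS = 256 // assumption: preview length is not specified

type TruncatedPayload = { _truncated: true; size: number; preview: string }

function truncatePayload(payload: unknown): unknown {
  // JSON.stringify returns undefined for undefined input, so fall back to "null"
  const json: string = JSON.stringify(payload) ?? "null"
  // Approximation: counts UTF-16 code units; a real implementation may count encoded bytes
  const size = json.length
  if (size <= MAX_PAYLOAD_SIZE) return payload
  const truncated: TruncatedPayload = {
    _truncated: true,
    size,
    preview: json.slice(0, PREVIEW_CHARS),
  }
  return truncated
}
```

Storing the full payload in object storage and recording a reference in `metadata` would sit alongside this step and is not shown.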
|
||||
|
||||
**`call.running` and `startedAt`**: There is no `call.running` event in `@alkdev/flowgraph`'s `CallEventMapValue`. The `call.requested` event creates the node in `pending` state. The transition to `running` is performed by the hub's `CallHandler` calling `flowGraph.updateStatus(requestId, "running", { startedAt: now.toISOString() })` directly when it dispatches the operation handler. This is hub-initiated, not event-driven. See call-graph.md for the write path details.
|
||||
|
||||
**Mapping to `@alkdev/flowgraph`**: The `call_graph_nodes` columns map directly to `CallNodeAttrs` in `@alkdev/flowgraph/schema`. The in-memory flowgraph instance uses `requestId` as the node key. Storage reads populate a `FlowGraph.fromCallEvents()` call graph for observability queries, and storage writes persist each call protocol event incrementally.
|
||||
|
||||
### `call_graph_edges`
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
---
|
||||
status: draft
|
||||
last_updated: 2026-04-19
|
||||
last_updated: 2026-05-25
|
||||
---
|
||||
|
||||
# Table Schemas: Spokes & Operations
|
||||
@@ -44,18 +44,40 @@ Operation definitions — what an operation IS. These persist independently of s
|
||||
| commonCols | — | id, metadata, createdAt, updatedAt |
|
||||
| namespace | text NOT NULL | Post-remap identifier (e.g., `dev.{spokeId}.fs.read`) |
|
||||
| name | text NOT NULL | Operation name within namespace (e.g., `fs.read`, `call`) |
|
||||
| type | text NOT NULL | `QUERY`, `MUTATION`, `SUBSCRIPTION` |
|
||||
| version | text NOT NULL DEFAULT '1.0.0' | Semantic version of the operation definition. Required by `OperationSpec.version`. When a spoke re-registers with a different version, the hub updates this column. |
|
||||
| type | text NOT NULL | `query`, `mutation`, `subscription` (lowercase, matching `@alkdev/operations` `OperationType` enum runtime values) |
|
||||
| title | text | Display/UX title. Populated by `FromOpenAPI` (from OpenAPI `summary`) and MCP adapter (from MCP tool `description`). Nullable — native operations may not set this. Falls back to `name` for display. |
|
||||
| description | text | Human-readable description |
|
||||
| inputSchema | jsonb NOT NULL | TypeBox schema for input |
|
||||
| outputSchema | jsonb | TypeBox schema for output |
|
||||
| errorSchemas | jsonb | Array of error type schemas |
|
||||
| accessControl | jsonb | Access control definition |
|
||||
| outputSchema | jsonb NOT NULL | TypeBox schema for output. NOT NULL — `OperationSpec` requires this. Use `{}` (empty schema) for operations with no meaningful output. |
|
||||
| errorSchemas | jsonb | Array of `ErrorDefinition` objects (see [ErrorDefinition Shape](#errordefinition-shape)). Nullable — operations with no declared error schemas leave this null. |
|
||||
| accessControl | jsonb NOT NULL | `AccessControl` definition. NOT NULL — `OperationSpec` requires this. Use `{ requiredScopes: [] }` for operations with no access restrictions. |
|
||||
| tags | jsonb | String array for search/filter |
|
||||
| _meta | jsonb | Operation-specific extension metadata. Distinct from `commonCols.metadata` (which is generic row-level metadata). Used by adapters: `FromOpenAPI` stores `{ method, path, summary }`, MCP adapter stores MCP-specific metadata. Nullable — native operations may not set this. |
|
||||
|
||||
**Unique constraint**: `CREATE UNIQUE INDEX unq_operations_namespace_name ON operations (namespace, name)` — operation definitions are unique by namespace+name, regardless of how many providers register them.
|
||||
|
||||
**Indexes**: `idx_operations_namespace` on `(namespace)`, `idx_operations_type` on `(type)`.
|
||||
|
||||
**`type` column casing**: Values are lowercase (`query`, `mutation`, `subscription`), matching the `OperationType` enum runtime values in `@alkdev/operations`. The enum names are uppercase (`OperationType.QUERY`) but the string values are lowercase (`"query"`). SQL queries should use lowercase: `WHERE type = 'query'`.
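
The name/value distinction can be illustrated with a local mirror of the enum. This block does not import from `@alkdev/operations`; the enum below is an assumption reconstructed from the description above, and `parseOperationType` is a hypothetical helper for validating values read from the `type` column.

```typescript
// Local mirror of the documented shape (assumption), not the library's own export.
enum OperationType {
  QUERY = "query",
  MUTATION = "mutation",
  SUBSCRIPTION = "subscription",
}

/** Validate a raw `type` column value; uppercase legacy values are rejected. */
function parseOperationType(value: string): OperationType {
  switch (value) {
    case OperationType.QUERY:
    case OperationType.MUTATION:
    case OperationType.SUBSCRIPTION:
      return value as OperationType;
    default:
      throw new Error("Unknown operation type: " + value);
  }
}
```

A row written with the old uppercase casing (`'QUERY'`) fails this check, which is one way to surface rows that were not migrated.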
|
||||
|
||||
**`_meta` vs `commonCols.metadata`**: Both are JSONB but serve different purposes. `_meta` holds operation-specific adapter metadata (HTTP method/path for OpenAPI ops, protocol details for MCP ops). `metadata` holds generic row-level metadata (retention, audit, key versioning) with a namespacing convention (`_subsystem.key`). They are not interchangeable — `_meta` is set by the operation author/adapter, `metadata` is set by hub subsystems.
|
||||
|
||||
### ErrorDefinition Shape
|
||||
|
||||
The `errorSchemas` column stores an array of `ErrorDefinition` objects (from `@alkdev/operations`):
|
||||
|
||||
```ts
interface ErrorDefinition {
  code: string;        // e.g., "INVALID_INPUT", "NOT_FOUND"
  description: string; // Human-readable description
  schema: unknown;     // TypeBox schema for error detail shape
  httpStatus?: number; // Optional HTTP status code mapping
}
```
|
||||
|
||||
This is the structured error contract between an operation and its callers. An operation with no `errorSchemas` gets the safe default: errors are wrapped as `EXECUTION_ERROR` (see call-graph.md error model).
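
Because `errorSchemas` is stored as untyped JSONB, a reader may want to validate the column before trusting it. The following is a sketch under that assumption; `isErrorDefinition` and `parseErrorSchemas` are hypothetical helpers, and the interface is repeated locally so the block is self-contained.

```typescript
// Local copy of the documented shape so this block stands alone.
interface ErrorDefinition {
  code: string;
  description: string;
  schema: unknown;
  httpStatus?: number;
}

/** Structural check for a single entry of the errorSchemas array. */
function isErrorDefinition(value: unknown): value is ErrorDefinition {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["code"] === "string" &&
    typeof v["description"] === "string" &&
    "schema" in v &&
    (v["httpStatus"] === undefined || typeof v["httpStatus"] === "number")
  );
}

/** Treat a NULL column as "no declared errors"; reject malformed entries. */
function parseErrorSchemas(column: unknown): ErrorDefinition[] {
  if (column === null || column === undefined) return [];
  if (!Array.isArray(column)) throw new Error("errorSchemas must be an array or null");
  const defs: ErrorDefinition[] = [];
  for (let i = 0; i < column.length; i++) {
    const entry: unknown = column[i];
    if (!isErrorDefinition(entry)) throw new Error("Malformed ErrorDefinition at index " + i);
    defs.push(entry);
  }
  return defs;
}
```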
|
||||
|
||||
### `operation_registrations`
|
||||
|
||||
Provider registrations — which spoke/client PROVIDES an operation right now. Ephemeral data: these reflect the current runtime state of who can handle a call.
|
||||
@@ -90,3 +112,22 @@ When a spoke disconnects:
|
||||
When an admin deletes a spoke row (rare):
|
||||
1. `operation_registrations` with that `providerId` are CASCADE deleted (ephemeral data, follows D1 cascade policy for ephemeral config)
|
||||
2. If no other registrations exist for an operation, its definition may be cleaned up separately
|
||||
|
||||
### Polymorphic FK Enforcement for `providerId`
|
||||
|
||||
`operation_registrations.providerId` is a polymorphic FK: it references `spokes.id` when `providerType = 'spoke'` and `clients.id` when `providerType = 'client'`. Postgres does not support multi-target FK constraints natively. The current approach uses **application-layer enforcement**:
|
||||
|
||||
- No DB-level FK on `providerId` — referential integrity is enforced by the application at registration time
|
||||
- `onDelete` behavior is also application-managed: when a spoke disconnects, registrations are set to `inactive`; when an admin deletes a spoke, registrations are CASCADE-deleted by the application
|
||||
|
||||
This is a pragmatic trade-off: polymorphic FKs in a single column are awkward in Postgres (requiring triggers or check constraints with multiple nullable FK columns). The application layer already knows the provider type at registration time, making enforcement straightforward.
|
||||
|
||||
**Alternative approaches** (deferred):
|
||||
- Two nullable FK columns (`spokeId` and `clientId`) with a CHECK constraint ensuring exactly one is set
|
||||
- A trigger that validates `providerId` against the correct table based on `providerType`
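
The application-layer enforcement described in this section can be sketched as a check run at registration time. `ProviderLookup` and `assertProviderExists` are hypothetical names; in the hub the lookups would be database queries, whereas this sketch uses synchronous stand-ins.

```typescript
type ProviderType = "spoke" | "client";

// Hypothetical lookup surface (assumption): stands in for queries against spokes/clients.
interface ProviderLookup {
  spokeExists(id: string): boolean;
  clientExists(id: string): boolean;
}

/** Resolve providerId against the table selected by providerType. */
function assertProviderExists(
  lookup: ProviderLookup,
  providerType: ProviderType,
  providerId: string,
): void {
  const exists =
    providerType === "spoke"
      ? lookup.spokeExists(providerId)
      : lookup.clientExists(providerId);
  if (!exists) {
    throw new Error(`Unknown ${providerType} provider: ${providerId}`);
  }
}
```

The check is only as strong as the transaction around it: without a DB-level FK, a provider row deleted between the check and the insert would still leave a dangling `providerId`.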
|
||||
|
||||
## Open Questions
|
||||
|
||||
1. **Operation deletion and call graph integrity**: An operation row referenced by `call_graph_nodes.operationId` cannot be deleted while call records exist (RESTRICT FK). Two strategies: (a) deny the removal while any call records reference it, or (b) reassign call records to a sentinel `__removed__` operation row (pre-seeded in migrations) before deleting. Strategy (a) is simpler and recommended for v1. Strategy (b) requires the sentinel row to exist before any call records can reference it, and adds write overhead. The sentinel operation row (`__removed__`, namespace `system`) should be pre-seeded in migrations if strategy (b) is adopted.
|
||||
|
||||
2. **`providerId` FK enforcement**: Should `operation_registrations.providerId` use application-layer enforcement (current), triggers, or separate nullable FK columns? See Polymorphic FK Enforcement section above.
|
||||
@@ -1,6 +1,6 @@
|
||||
---
|
||||
status: draft
|
||||
last_updated: 2026-04-23
|
||||
last_updated: 2026-05-25
|
||||
---
|
||||
|
||||
# Storage: Table Schemas
|
||||
@@ -90,7 +90,7 @@ export const commonCols = {
|
||||
| task_dependencies.dependentTaskId → tasks.id | CASCADE | Dependent task deletion removes its incoming dependency edges |
|
||||
| call_graph_edges.sourceId → call_graph_nodes.id | CASCADE | Deleting a node removes its outgoing edges |
|
||||
| call_graph_edges.targetId → call_graph_nodes.id | CASCADE | Deleting a target node removes its incoming edges |
|
||||
| call_graph_nodes.operationId → operations.id | SET NULL | Operation definition deletion preserves call records but detaches them (nullable FK — call data retains audit value even if the operation is removed) |
|
||||
| call_graph_nodes.operationId → operations.id | RESTRICT | Call records must reference a valid operation. If an operation is being removed, the hub must reassign call records first (e.g., to a sentinel `__removed__` operation) or deny the removal. |
|
||||
| api_keys.rotatedToId → api_keys.id | SET NULL | Old key keeps its data; if new key is deleted, rotation link is broken but both keys remain |
|
||||
|
||||
## Index Reference
|
||||
@@ -134,7 +134,7 @@ export const commonCols = {
|
||||
| call_graph_edges | `unq_call_graph_edges_source_target_type` | UNIQUE (sourceId, targetId, edgeType) | Prevent duplicate edges from retries/reconnections |
|
||||
| operations | `unq_operations_namespace_name` | UNIQUE (namespace, name) | Operation definition uniqueness by namespace+name |
|
||||
| operations | `idx_operations_namespace` | B-tree | Filter by namespace |
|
||||
| operations | `idx_operations_type` | B-tree | Filter by operation type |
|
||||
| operations | `idx_operations_type` | B-tree | Filter by operation type (lowercase: query/mutation/subscription) |
|
||||
| operation_registrations | `unq_operation_registrations_active` | UNIQUE partial (WHERE status = 'active') | One active registration per provider per operation |
|
||||
| operation_registrations | `idx_operation_registrations_operation_id` | B-tree | Find registrations for an operation |
|
||||
| operation_registrations | `idx_operation_registrations_provider_id` | B-tree | Find registrations for a provider |
|
||||
@@ -194,7 +194,7 @@ Status enums across tables:
|
||||
| `sessions` | `idle`, `busy`, `retry`, `archived` | Session lifecycle |
|
||||
| `sessions.roleName` | text | Which behavioral role (e.g., "architect", "implementation-specialist"). Free-form string, not a FK constraint. See [agent-roles.md](../../agent-roles.md) and [ADR-012](../../../decisions/ADR-012-agent-vs-role-vs-account.md). |
|
||||
| `spokes` | `connected`, `disconnected` | WebSocket connection state |
|
||||
| `operations` | (no status column) | — Definitions are persistent |
|
||||
| `operations` | (no status column) | — Definitions are persistent. `type` column uses lowercase: `query`, `mutation`, `subscription` |
|
||||
| `operation_registrations` | `active`, `inactive` | Provider registration lifecycle |
|
||||
| `mappings` | `active`, `completed`, `aborted`, `failed` | Coordination workflow state |
|
||||
| `call_graph_nodes` | `pending`, `running`, `completed`, `failed`, `aborted` | Call protocol lifecycle |
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
---
|
||||
status: draft
|
||||
last_updated: 2026-05-18
|
||||
last_updated: 2026-05-25
|
||||
---
|
||||
|
||||
# Storage: Tasks & Task Dependencies
|
||||
@@ -71,7 +71,7 @@ These fields are written by the Decomposer/file sync. The `ON CONFLICT DO UPDATE
|
||||
| (body) | `body` |
|
||||
| created | `fileCreatedAt` |
|
||||
| modified | `fileModifiedAt` |
|
||||
| depends_on | `task_dependencies` table |
|
||||
| dependsOn | `task_dependencies` table |
|
||||
|
||||
**Note**: `projectId` is set from the project context during sync (the task file's location within a project's `tasks/` directory determines the project), not from taskgraph frontmatter. `commonCols` fields (`id`, `metadata`, `createdAt`, `updatedAt`) are DB-generated and not part of the sync conflict domain.
|
||||
|
||||
@@ -87,16 +87,18 @@ These fields are never overwritten by sync. They are only mutated by hub operati
|
||||
|
||||
> **Warning**: Sync must never write `status`, `startedAt`, or `completedAt` — these are owned by hub operations. The sync upsert uses `ON CONFLICT DO UPDATE SET` only for authored fields; runtime fields are excluded from the SET clause.
|
||||
|
||||
## Field Mapping: taskgraph Frontmatter → DB Columns
|
||||
## Field Mapping: taskgraph `TaskInput` → DB Columns
|
||||
|
||||
Every field in taskgraph's `TaskFrontmatter` struct maps to a dedicated DB column. No frontmatter fields are relegated to JSONB `metadata`.
|
||||
Every field in taskgraph's `TaskInput` type (the TypeScript equivalent of the Rust `TaskFrontmatter` struct) maps to a dedicated DB column. No `TaskInput` fields are relegated to JSONB `metadata`.
|
||||
|
||||
| taskgraph Field | DB Column | Type | Notes |
|
||||
> **Naming note**: The library exports `TaskInput`, not `TaskFrontmatter`. The JSDoc confirms it "matches the Rust `TaskFrontmatter` field set." The YAML key for dependencies is `dependsOn` in the library (camelCase); `parseFrontmatter()` normalizes `depends_on` → `dependsOn` on input, and `serializeFrontmatter()` outputs `dependsOn`. `@alkdev/taskgraph` (TypeScript) is the source of truth for the frontmatter format. The Rust CLI is not used going forward — frontmatter is used for LLM output parsing and importing legacy task files, with the DB as the authoritative runtime representation.
|
||||
|
||||
| taskgraph Field (`TaskInput`) | DB Column | Type | Notes |
|
||||
|---|---|---|---|
|
||||
| `id` | `slug` | text NOT NULL | Direct mapping. No transformation. `slug` is taskgraph-compatible, used in `depends_on` references. |
|
||||
| `id` | `slug` | text NOT NULL | Direct mapping. No transformation. `slug` is taskgraph-compatible, used in `dependsOn` references. |
|
||||
| `name` | `name` | text NOT NULL | Direct mapping |
|
||||
| `status` | `status` | text NOT NULL, enum | Direct mapping: `pending`, `in-progress`, `completed`, `failed`, `blocked`. Default: `pending`. |
|
||||
| `depends_on` | `task_dependencies` table | — | Each element creates a row: `depends_on[i]` → `dependsOnTaskId`, task → `dependentTaskId` |
|
||||
| `dependsOn` | `task_dependencies` table | — | Each element creates a row: `dependsOn[i]` → `dependsOnTaskId`, task → `dependentTaskId`. Library key is `dependsOn` (camelCase); YAML frontmatter may use `depends_on` which is normalized to `dependsOn` on parse. |
|
||||
| `scope` | `scope` | text, enum | `single`, `narrow`, `moderate`, `broad`, `system`. **Nullable** — NULL = not yet assessed. |
|
||||
| `risk` | `risk` | text, enum | `trivial`, `low`, `medium`, `high`, `critical`. **Nullable** — NULL = not yet assessed. |
|
||||
| `impact` | `impact` | text, enum | `isolated`, `component`, `phase`, `project`. **Nullable** — NULL = not yet assessed. |
|
||||
@@ -106,7 +108,7 @@ Every field in taskgraph's `TaskFrontmatter` struct maps to a dedicated DB colum
|
||||
| `assignee` | `assignee` | text | Assigned agent or person. Nullable. |
|
||||
| `due` | `dueAt` | timestamp with tz | Renamed from `due` for DB convention. Nullable. |
|
||||
| `created` | `fileCreatedAt` | timestamp with tz | Frontmatter `created` field. Separate from DB `createdAt` (row creation time). Nullable — frontmatter may not include it. |
|
||||
| `modified` | `fileModifiedAt` | timestamp with tz | Frontmatter `modified` field. Separate from DB `updatedAt` (row update time). Nullable. |
|
||||
| `modified` | `fileModifiedAt` | timestamp with tz | Frontmatter `modified` field. Separate from DB `updatedAt` (row update time). Nullable — frontmatter may not include it. |
|
||||
| (body) | `body` | text | Markdown content after frontmatter. Nullable — empty body is valid. |
|
||||
| (directory path) | `path` | text | Logical grouping prefix: `architecture`, `implementation/storage`. Nullable — tasks created via API with no file origin have no path. See [Path Semantics](#path-semantics). |
|
||||
| (project) | `projectId` | text NOT NULL | FK → projects.id |
|
||||
@@ -156,7 +158,7 @@ The decomposer template should consume these same enum definitions to ensure DB-
|
||||
|
||||
**Indexes**: `idx_tasks_project_id` on `(projectId)`, `idx_tasks_project_status` on `(projectId, status)` — composite for "find all pending tasks in project X", `idx_tasks_status` on `(status)`, `idx_tasks_active` partial on `(projectId)` WHERE `status IN ('pending', 'in-progress', 'blocked')` — efficiently find active tasks, `idx_tasks_path` on `(path)` **with `text_pattern_ops`** — locale-independent LIKE pattern matching for path prefix queries (e.g., `WHERE path LIKE 'implementation/%'`), `idx_tasks_priority` on `(priority)`, `idx_tasks_assignee` on `(assignee)`, `idx_tasks_due_at` on `(dueAt)`, `idx_tasks_tags` GIN on `(tags)` — for array-contains queries (`tags @> '{security}'`).
|
||||
|
||||
**`slug` semantics**: From taskgraph frontmatter `id` field. Kebab-case identifiers like `auth-setup`, `storage-tasks-table`. Appears in `depends_on` arrays.
|
||||
**`slug` semantics**: From taskgraph frontmatter `id` field. Kebab-case identifiers like `auth-setup`, `storage-tasks-table`. Appears in `dependsOn` arrays (library key; YAML: `depends_on`).
|
||||
|
||||
**`path` semantics**: Nullable — tasks created via API with no filesystem origin have no path. When set, captures the logical grouping derived from the `tasks/` directory structure. E.g., a file at `tasks/implementation/storage/tasks-table.md` gets `path: "implementation/storage"`. Enables `WHERE path LIKE 'implementation/%'` (scoped queries) without requiring a `parentId` FK. This replaces the previous `parentId` column — grouping is a path concern, not a tree relationship.
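
A sketch of how `path` could be derived from a task file location. `derivePath` is a hypothetical helper, and returning `null` for files sitting directly under `tasks/` is an assumption; the spec only states that tasks without a filesystem origin have no path.

```typescript
/**
 * Derive the logical grouping prefix from a task file path.
 * Hypothetical helper (assumption); the root directory name defaults to "tasks".
 */
function derivePath(filePath: string, tasksRoot: string = "tasks"): string | null {
  const segments = filePath
    .replace(/\\/g, "/") // tolerate Windows separators
    .split("/")
    .filter((s) => s.length > 0);
  const rootIndex = segments.indexOf(tasksRoot);
  if (rootIndex === -1) {
    throw new Error("File is not inside a " + tasksRoot + "/ directory: " + filePath);
  }
  // Everything between the tasks root and the file name is the grouping prefix.
  const dirs = segments.slice(rootIndex + 1, segments.length - 1);
  return dirs.length > 0 ? dirs.join("/") : null;
}
```

With the example from the paragraph above, `derivePath("tasks/implementation/storage/tasks-table.md")` yields `"implementation/storage"`.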
|
||||
|
||||
@@ -182,11 +184,11 @@ Dependency edges between tasks. Directed: a row means the dependent task depends
|
||||
|
||||
**Direction**: `dependentTaskId` is the task that has the dependency; `dependsOnTaskId` is its prerequisite. A row therefore reads "task `dependentTaskId` depends on task `dependsOnTaskId`". The graph edge runs the other way, from `dependsOnTaskId` to `dependentTaskId` (prerequisite → dependent), which gives the correct topological order: prerequisites before dependents.
|
||||
|
||||
**Cross-project dependency guard**: `taskId` and `dependsOnTaskId` MUST reference tasks within the same project. The application layer enforces this constraint — creating a dependency between tasks in different projects is rejected with a validation error. This is not enforced at the DB level (FK constraints allow cross-project references), so the application must check project consistency before insert.
|
||||
**Cross-project dependency guard**: `dependentTaskId` and `dependsOnTaskId` MUST reference tasks within the same project. The application layer enforces this constraint — creating a dependency between tasks in different projects is rejected with a validation error. This is not enforced at the DB level (FK constraints allow cross-project references), so the application must check project consistency before insert.
|
||||
|
||||
A future DB-level guard could use a trigger: `BEFORE INSERT ON task_dependencies` that checks `NEW.taskId` and `NEW.dependsOnTaskId` reference tasks in the same project. This is deferred to Phase 2 — the application-layer check is sufficient for now.
|
||||
A future DB-level guard could use a trigger: `BEFORE INSERT ON task_dependencies` that checks `NEW.dependentTaskId` and `NEW.dependsOnTaskId` reference tasks in the same project. This is deferred to Phase 2 — the application-layer check is sufficient for now.
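
The application-layer guard can be sketched as a check performed before the insert. `TaskRef` and `assertSameProject` are hypothetical names; the caller is assumed to have already loaded both task rows.

```typescript
// Hypothetical minimal row shape (assumption): only the columns the guard needs.
interface TaskRef {
  id: string;
  projectId: string;
}

/** Reject a dependency whose two endpoints live in different projects. */
function assertSameProject(
  dependent: TaskRef | undefined,
  prerequisite: TaskRef | undefined,
): void {
  if (!dependent || !prerequisite) {
    throw new Error("Both dependentTaskId and dependsOnTaskId must reference existing tasks");
  }
  if (dependent.projectId !== prerequisite.projectId) {
    throw new Error(
      `Cross-project dependency rejected: ${dependent.id} (${dependent.projectId}) ` +
        `-> ${prerequisite.id} (${prerequisite.projectId})`,
    );
  }
}
```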
|
||||
|
||||
**Sync source**: Dependency edges are authored in task file frontmatter (`depends_on: [other-task]`) and synced to this table during the file → DB sync operation. The sync clears and re-inserts all edges for a task on each run — dependencies are fully replaced by the sync, not merged or modified at runtime.
|
||||
**Sync source**: Dependency edges are authored in task file frontmatter (`dependsOn: [other-task]` in the library, `depends_on:` in YAML) and synced to this table during the file → DB sync operation. The sync clears and re-inserts all edges for a task on each run — dependencies are fully replaced by the sync, not merged or modified at runtime.
|
||||
|
||||
## Why ALL Frontmatter Fields Get Proper Columns
|
||||
|
||||
@@ -215,7 +217,7 @@ Taskgraph itself makes these fields `Option<TaskScope>`, `Option<TaskRisk>`, etc
|
||||
- Exclude it from cost-benefit analysis (you can't compute risk-path without risk values)
|
||||
- Suggest the Decomposer assess it
|
||||
|
||||
For @alkdev/taskgraph operations that need numeric weights, provide fallbacks at the application layer (e.g., treat NULL risk as `low` for topo sort, but warn).
|
||||
For @alkdev/taskgraph operations that need numeric weights, provide fallbacks at the application layer. The library's `resolveDefaults()` uses `medium` as the default risk, `narrow` as the default scope, and `isolated` as the default impact. These defaults are used when computing analysis metrics — they do NOT change the DB value (NULL remains NULL in the database).
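
To make the "computational fallback, not DB default" distinction concrete, here is a sketch of the same idea in isolation. `applyAnalysisDefaults` is a hypothetical stand-in, not the library's `resolveDefaults()`; it uses the default values named above and leaves the input row untouched.

```typescript
type TaskScope = "single" | "narrow" | "moderate" | "broad" | "system";
type TaskRisk = "trivial" | "low" | "medium" | "high" | "critical";
type TaskImpact = "isolated" | "component" | "phase" | "project";

// Nullable categorical columns as stored in the DB (NULL = not yet assessed).
interface AssessedFields {
  scope: TaskScope | null;
  risk: TaskRisk | null;
  impact: TaskImpact | null;
}

interface ResolvedFields {
  scope: TaskScope;
  risk: TaskRisk;
  impact: TaskImpact;
}

/** Hypothetical stand-in (assumption): returns a new object, never mutates the row. */
function applyAnalysisDefaults(row: AssessedFields): ResolvedFields {
  return {
    scope: row.scope ?? "narrow",
    risk: row.risk ?? "medium",
    impact: row.impact ?? "isolated",
  };
}
```

Because the function returns a copy, an unassessed task stays NULL in the database and remains detectable by queries such as `WHERE risk IS NULL`.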
|
||||
|
||||
## Path Semantics
|
||||
|
||||
@@ -349,19 +351,104 @@ Without them, you just get topological sort — useful, but not structurally ins
|
||||
|
||||
For runtime graph operations, the hub uses **`@alkdev/taskgraph`** — a TypeScript package that wraps graphology and provides a high-level `TaskGraph` class plus analysis functions. The CLI (`taskgraph`) is for offline authoring and analysis; the TS package is for runtime use.
|
||||
|
||||
### Construction
|
||||
|
||||
The approach:
|
||||
1. Load all `tasks` + `task_dependencies` rows for a project from the DB
|
||||
2. Build a `TaskGraph` via `TaskGraph.fromRecords(tasks, edges)`
|
||||
3. Run analysis functions as needed: `criticalPath()`, `parallelGroups()`, `bottlenecks()`, `riskPath()`, `shouldDecomposeTask()`, `workflowCost()`
|
||||
2. Transform DB rows into `TaskInput[]` and `DependencyEdge[]` shapes (see [Library ↔ DB Field Mapping](#library--db-field-mapping) below)
|
||||
3. Build a `TaskGraph` via `TaskGraph.fromRecords(taskInputs, edges)`
|
||||
4. Run analysis functions as needed
|
||||
|
||||
This works because realistic task graphs are small — typically 10–50 tasks, rarely exceeding 200 even on large projects. Building a graph from DB rows is instant at this scale (`TaskGraph.fromRecords` with 100 nodes reconstructs in <5ms).
|
||||
|
||||
`@alkdev/taskgraph` exports:
|
||||
- **`TaskGraph`** — construction (fromTasks, fromRecords, fromJSON), mutation (addTask, removeTask, addDependency, updateTask), queries (hasCycles, findCycles, topologicalOrder, dependencies, dependents, getTask), validation (validateSchema, validateGraph), export
|
||||
- **Analysis functions** — criticalPath, weightedCriticalPath, parallelGroups, bottlenecks, riskPath, riskDistribution, calculateTaskEv, workflowCost, shouldDecomposeTask
|
||||
- **Schema types** — TaskScope, TaskRisk, TaskImpact, TaskLevel, TaskPriority, TaskStatus enums with TypeBox schemas
|
||||
- **Frontmatter** — parseFrontmatter, serializeFrontmatter (YAML + markdown)
|
||||
- **Error classes** — TaskgraphError, CircularDependencyError, TaskNotFoundError, etc.
|
||||
### Library ↔ DB Field Mapping
|
||||
|
||||
**Task inputs**: `TaskGraph.fromRecords(tasks, edges)` takes `TaskInput[]` (frontmatter-shaped), not DB row shapes. The hub transforms DB rows → `TaskInput`:
|
||||
|
||||
| DB Column | TaskInput Field | Notes |
|-----------|----------------|-------|
| `slug` | `id` | Direct mapping |
| `name` | `name` | Direct mapping |
| `status` | `status` | Direct mapping |
| `scope` | `scope` | Direct mapping |
| `risk` | `risk` | Direct mapping |
| `impact` | `impact` | Direct mapping |
| `level` | `level` | Direct mapping |
| `priority` | `priority` | Direct mapping |
| `tags` | `tags` | Direct mapping |
|
||||
|
||||
**Dependency edges**: `DependencyEdge` uses `{ from, to, qualityRetention? }`, not DB column names:
|
||||
|
||||
| DB Column | DependencyEdge Field | Notes |
|-----------|---------------------|-------|
| `dependsOnTaskId` (prerequisite) | `from` | The prerequisite task that must complete first |
| `dependentTaskId` (dependent) | `to` | The dependent task that waits for the prerequisite |
| (no column) | `qualityRetention?` | Per-edge failure propagation weight (0–1, default 0.9). Used by `workflowCost` analysis. Not stored in DB — set at graph construction time. |
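
The row-to-library transformation can be sketched as below. All type and function names here are hypothetical local stand-ins rather than imports from `@alkdev/taskgraph`. One assumption deserves emphasis: because `slug` maps to `TaskInput.id`, this sketch translates the row ids in `task_dependencies` into slugs so that `from`/`to` match the node keys.

```typescript
// Hypothetical row and record shapes (assumption), trimmed to a few columns.
interface TaskRow { id: string; slug: string; name: string; status: string; }
interface DependencyRow { dependentTaskId: string; dependsOnTaskId: string; }
interface TaskInputLike { id: string; name: string; status: string; }
interface DependencyEdgeLike { from: string; to: string; qualityRetention?: number; }

function toGraphRecords(
  tasks: TaskRow[],
  deps: DependencyRow[],
): { tasks: TaskInputLike[]; edges: DependencyEdgeLike[] } {
  const slugById: Record<string, string> = {};
  const inputs: TaskInputLike[] = [];
  for (const t of tasks) {
    slugById[t.id] = t.slug;
    inputs.push({ id: t.slug, name: t.name, status: t.status }); // slug -> id
  }
  const has = (id: string) => Object.prototype.hasOwnProperty.call(slugById, id);
  const edges: DependencyEdgeLike[] = [];
  for (const d of deps) {
    if (!has(d.dependsOnTaskId) || !has(d.dependentTaskId)) {
      throw new Error("Dependency references a task outside the loaded set");
    }
    edges.push({
      from: slugById[d.dependsOnTaskId], // prerequisite
      to: slugById[d.dependentTaskId],   // dependent
    });
  }
  // The result would then be passed to TaskGraph.fromRecords(tasks, edges).
  return { tasks: inputs, edges };
}
```

`qualityRetention` is left unset here, so whatever default the library applies at graph construction time is used.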
|
||||
|
||||
### Graph Node vs DB Column Distinction
|
||||
|
||||
`TaskGraphNodeAttributes` (what the graph stores per node) is a **subset** of `TaskInput`. The graph intentionally drops fields that aren't relevant to graph algorithms:
|
||||
|
||||
| In `TaskInput` | In `TaskGraphNodeAttributes` | Reason |
|----------------|-------------------------------|--------|
| `id` | ✅ `id` | Node key |
| `name` | ✅ `name` | Display |
| `status` | ✅ `status` | State tracking |
| `scope` | ✅ `scope` | Analysis |
| `risk` | ✅ `risk` | Analysis |
| `impact` | ✅ `impact` | Analysis |
| `level` | ✅ `level` | Analysis |
| `priority` | ✅ `priority` | Analysis |
| `tags` | ❌ | Not used by graph algorithms — available in DB |
| `assignee` | ❌ | Not used by graph algorithms — available in DB |
| `due` | ❌ | Not used by graph algorithms — available in DB |
| `created` | ❌ | Not used by graph algorithms — available in DB |
| `modified` | ❌ | Not used by graph algorithms — available in DB |
|
||||
|
||||
Fields like `tags`, `assignee`, and `due` are fully queryable in the DB and don't need to be in the graph for analysis. If the coordinator needs to filter a graph by assignee, it should query the DB first and then construct a filtered subgraph using `taskGraph.subgraph(filter)`.
|
||||
|
||||
### @alkdev/taskgraph Exports
|
||||
|
||||
**Construction** — `TaskGraph` class:
|
||||
- `TaskGraph.fromTasks(tasks: TaskInput[])` — builds graph from tasks, inferring edges from `dependsOn` arrays
|
||||
- `TaskGraph.fromRecords(tasks: TaskInput[], edges: DependencyEdge[])` — builds from tasks + explicit edge list
|
||||
- `TaskGraph.fromJSON(data: TaskGraphSerialized)` — deserializes from graphology JSON
|
||||
- Mutation: `addTask(task)`, `removeTask(taskId)`, `addDependency(prerequisite, dependent, qualityRetention?)`, `updateTask(taskId, attrs)`
|
||||
- Queries: `hasCycles`, `findCycles`, `topologicalOrder`, `dependencies(taskId)`, `dependents(taskId)`, `getTask(taskId)`, `subgraph(filter)`
|
||||
- Validation: `validateSchema()`, `validateGraph()`
|
||||
- Export: `export()` → `TaskGraphSerialized`, `toJSON()` (alias)
|
||||
- Escape hatch: `get graph` → raw graphology `DirectedGraph`
|
||||
|
||||
**Analysis functions**:
|
||||
- `criticalPath(graph)` — longest path by edge count
|
||||
- `weightedCriticalPath(graph, weightFn)` — longest path with custom weight function
|
||||
- `parallelGroups(graph)` — groups of tasks that can run concurrently
|
||||
- `bottlenecks(graph)` — high-betweenness tasks. Returns `BottleneckResult[]`
|
||||
- `riskPath(graph)` — highest cumulative risk path. Returns `RiskPathResult { path, totalRisk }`
|
||||
- `riskDistribution(graph)` — risk distribution across graph. Returns `RiskDistributionResult`
|
||||
- `shouldDecomposeTask(task)` — decomposition recommendation. Returns `DecomposeResult { shouldDecompose, reasons }`
|
||||
- `calculateTaskEv(p, scopeCost, impactWeight, config?)` — expected value math. Returns `EvResult`
|
||||
- `workflowCost(graph, options?)` — total workflow cost with failure propagation. Returns `WorkflowCostResult`. Options: `WorkflowCostOptions`
|
||||
|
||||
**Categorical numeric methods** (map enum values → numbers for analysis):
|
||||
- `scopeCostEstimate(scope)` — numeric scope cost
|
||||
- `scopeTokenEstimate(scope)` — token-based scope estimate
|
||||
- `riskSuccessProbability(risk)` — probability of success (0–1)
|
||||
- `riskWeight(risk)` — weight for risk calculations
|
||||
- `impactWeight(impact)` — weight for impact calculations
|
||||
- `resolveDefaults(attrs)` — fills default values for unassessed fields and computes derived numeric values. Returns `ResolvedTaskAttributes`
|
||||
|
||||
**Schema types** — TypeBox schemas with `Enum` suffix:
|
||||
- `TaskStatusEnum`, `TaskScopeEnum`, `TaskRiskEnum`, `TaskImpactEnum`, `TaskLevelEnum`, `TaskPriorityEnum`
|
||||
- TypeScript types: `TaskStatus`, `TaskScope`, `TaskRisk`, `TaskImpact`, `TaskLevel`, `TaskPriority`
|
||||
|
||||
**Frontmatter**:
|
||||
- `parseFrontmatter(content)` — parses YAML + markdown, normalizes `depends_on` → `dependsOn`
|
||||
- `serializeFrontmatter(data)` — serializes to YAML + markdown, outputs `dependsOn`
|
||||
- `splitFrontmatter(content)` — lower-level helper that splits `---`-delimited YAML from markdown without validating
|
||||
|
||||
**Error classes**:
|
||||
- `TaskgraphError` (base), `CircularDependencyError`, `TaskNotFoundError`, `DuplicateNodeError`, `DuplicateEdgeError`, `ValidationError`, `GraphValidationError`
|
||||
|
||||
**Why not taskgraph NAPI for v1**: The Rust CLI (`taskgraph`) is for offline authoring and analysis. The TypeScript package (`@alkdev/taskgraph`) handles all runtime graph operations. Graphology is a transitive dependency through `@alkdev/taskgraph` and handles < 200 nodes trivially. NAPI is unnecessary at realistic scales.
|
||||
|
||||
@@ -416,7 +503,7 @@ This is a manual step — "I want to run analysis now" — not an automatic sync
|
||||
|-------|----------|
|
||||
| Invalid YAML frontmatter | Skip file, log warning with file path and parse error. Continue with remaining files. |
|
||||
| Missing required `id` or `name` field | Skip file, log warning. Task cannot be synced without these fields. |
|
||||
| `depends_on` references non-existent slug within project | Insert the dependency edge anyway (dangling reference). The coordinator detects and warns about unresolvable dependencies. `taskgraph validate` should be run before sync to catch these. |
|
||||
| `dependsOn` references non-existent slug within project | Insert the dependency edge anyway (dangling reference). The coordinator detects and warns about unresolvable dependencies. `taskgraph validate` should be run before sync to catch these. |
|
||||
| Duplicate `id` (slug) in same project | Fail the sync with a clear error. Slug uniqueness is enforced by the DB constraint `unq_tasks_project_slug`. |
|
||||
| File removed from filesystem | DELETE the DB row. FK cascade handles dependent rows. Git preserves history. |
|
||||
|
||||
@@ -437,7 +524,7 @@ This is a manual step — "I want to run analysis now" — not an automatic sync
|
||||
- Cost-benefit framework: taskgraph framework docs — why categorical estimates are structurally required
|
||||
- Workflow guide: taskgraph workflow docs — practical usage patterns
|
||||
- Task file format: @alkdev/taskgraph README — field definitions
|
||||
- TaskFrontmatter struct: @alkdev/taskgraph package source — canonical field types and defaults
|
||||
- TaskFrontmatter struct: @alkdev/taskgraph package source — `TaskInput` type (TypeScript equivalent of Rust `TaskFrontmatter`)
|
||||
- taskgraph architecture: taskgraph architecture docs
|
||||
- Storage pattern: [README.md](./README.md)
|
||||
- Table reference (cross-cutting): [table-reference.md](./table-reference.md)
|
||||
|
||||
@@ -38,9 +38,9 @@ We choose **Option 3: Database as source of truth, files as authoring surface**.
|
||||
|
||||
### Key Design Principles
|
||||
|
||||
1. **Every taskgraph frontmatter field is a proper DB column** — no fields relegated to JSONB `metadata`. `priority`, `assignee`, `dueAt`, `tags` get dedicated columns because they're queryable and filterable in coordinator workflows.
|
||||
1. **Every taskgraph frontmatter field is a proper DB column** — no fields relegated to JSONB `metadata`. `priority`, `assignee`, `dueAt`, `tags` get dedicated columns because they're queryable and filterable in coordinator workflows. The library type is `TaskInput` (TypeScript equivalent of the Rust `TaskFrontmatter`).
|
||||
|
||||
2. **Categorical fields are nullable, not NOT NULL with defaults** — `scope`, `risk`, `impact`, `level` are nullable (NULL = not yet assessed). This preserves the distinction between "deliberately assessed as low" and "nobody filled this in." Taskgraph itself uses `Option<TaskScope>` etc.
|
||||
2. **Categorical fields are nullable, not NOT NULL with defaults** — `scope`, `risk`, `impact`, `level` are nullable (NULL = not yet assessed). This preserves the distinction between "deliberately assessed as low" and "nobody filled this in." The library uses `Type.Optional(Nullable(Enum))` on `TaskInput`, matching this model. For analysis, `resolveDefaults()` uses `medium` as the default risk (replacing the old `low` default) and `narrow`/`isolated` for scope/impact — these are computational fallbacks, not DB defaults, making unassessed tasks more visible in analysis rather than appearing optimistically safe.
|
||||
|
||||
3. **No `parentId`** — Grouping is handled by `path` (a nullable text column for scoped queries like `WHERE path LIKE 'implementation/%'`). Dependencies are in `task_dependencies`. These are separate concepts.
|
||||
|
||||
@@ -73,4 +73,4 @@ We choose **Option 3: Database as source of truth, files as authoring surface**.
|
||||
- ADR-001: JSONB data columns vs individual columns (same principle — proper columns for queryable fields)
|
||||
- Cost-benefit framework: taskgraph framework docs
|
||||
- Task storage: `docs/architecture/storage/tasks.md`
|
||||
- taskgraph TaskFrontmatter: taskgraph source
|
||||
- taskgraph TaskInput: taskgraph source
|
||||