Align storage & architecture specs with published npm libraries

Systematically compared @alkdev/taskgraph, @alkdev/operations, and @alkdev/flowgraph against storage/arch specs and fixed all mismatches.

Key changes:

Tasks (storage/tasks.md + ADR-011):
- Rename TaskFrontmatter → TaskInput to match library export
- Fix dependsOn (was depends_on) in field mappings: library uses camelCase; parseFrontmatter normalizes YAML snake_case on input
- Document DependencyEdge shape {from, to, qualityRetention?} and DB↔library field mapping
- Document graph node vs DB column distinction (TaskGraphNodeAttrs is a subset of TaskInput)
- Fix default risk fallback from low → medium (matches resolveDefaults)
- Fix cross-project guard column references (dependentTaskId, not taskId)
- Clarify @alkdev/taskgraph TS is source of truth; frontmatter is for LLM output parsing and legacy imports, not Rust CLI
- Add complete library exports reference

Operations (storage/spokes.md + operations.md):
- Add version, title, _meta columns to operations table (required by OperationSpec, were missing)
- Fix type casing: query/mutation/subscription (lowercase, matching OperationType runtime values)
- Make outputSchema and accessControl NOT NULL (matching library)
- Document ErrorDefinition shape {code, description, schema, httpStatus?}
- Document _meta vs commonCols.metadata distinction
- Add registerAll, get, getHandler, getByName, list, subscribe methods
- Fix buildCallHandler signature ({ registry, callMap })
- Fix OperationType values (lowercase)

Call graph (storage/call-graph.md + call-graph.md):
- Change operationId to NOT NULL with RESTRICT FK (was nullable/SET NULL), matching flowgraph's required CallNodeAttrs.operationId
- Document sentinel __removed__ operation strategy for deletions
- Document ISO 8601 string ↔ timestamptz conversion requirement
- Rewrite CallEventMap to match actual library: flat dot-notation keys, timestamp on all events, nested error structure, optional output on completed event
- Remove call.running event (doesn't exist in library); hub calls updateStatus(running) directly on dispatch
- Fix buildCallHandler({ registry, callMap }) signature
- Fix PendingRequestMap constructor (positional EventTarget)
- Add updateCall/removeCall/graph methods to API summary
- Document abort cascade as hub logic, not flowgraph logic
- Add open questions for operation deletion and reactive vs call graph semantics

Table reference (storage/table-reference.md):
- Update call_graph_nodes.operationId cascade to RESTRICT
- Update operations.type comment to lowercase
- Update status enum reference
2026-05-25 11:46:42 +00:00
parent 2b63cda1c7
commit 93e2286343
7 changed files with 288 additions and 112 deletions
--- a/docs/architecture/storage/tasks.md
+++ b/docs/architecture/storage/tasks.md
@@ -1,6 +1,6 @@
 ---
 status: draft
-last_updated: 2026-05-18
+last_updated: 2026-05-25
 ---

 # Storage: Tasks & Task Dependencies
@@ -71,7 +71,7 @@ These fields are written by the Decomposer/file sync. The `ON CONFLICT DO UPDATE
 | (body) | `body` |
 | created | `fileCreatedAt` |
 | modified | `fileModifiedAt` |
-| depends_on | `task_dependencies` table |
+| dependsOn | `task_dependencies` table |

 **Note**: `projectId` is set from the project context during sync (the task file's location within a project's `tasks/` directory determines the project), not from taskgraph frontmatter. `commonCols` fields (`id`, `metadata`, `createdAt`, `updatedAt`) are DB-generated and not part of the sync conflict domain.

@@ -87,16 +87,18 @@ These fields are never overwritten by sync. They are only mutated by hub operati

 > **Warning**: Sync must never write `status`, `startedAt`, or `completedAt` — these are owned by hub operations. The sync upsert uses `ON CONFLICT DO UPDATE SET` only for authored fields; runtime fields are excluded from the SET clause.

-## Field Mapping: taskgraph Frontmatter → DB Columns
+## Field Mapping: taskgraph `TaskInput` → DB Columns

-Every field in taskgraph's `TaskFrontmatter` struct maps to a dedicated DB column. No frontmatter fields are relegated to JSONB `metadata`.
+Every field in taskgraph's `TaskInput` type (the TypeScript equivalent of the Rust `TaskFrontmatter` struct) maps to a dedicated DB column. No `TaskInput` fields are relegated to JSONB `metadata`.

-| taskgraph Field | DB Column | Type | Notes |
+> **Naming note**: The library exports `TaskInput`, not `TaskFrontmatter`. The JSDoc confirms it "matches the Rust `TaskFrontmatter` field set." The YAML key for dependencies is `dependsOn` in the library (camelCase); `parseFrontmatter()` normalizes `depends_on` → `dependsOn` on input, and `serializeFrontmatter()` outputs `dependsOn`. `@alkdev/taskgraph` (TypeScript) is the source of truth for the frontmatter format. The Rust CLI is not used going forward — frontmatter is used for LLM output parsing and importing legacy task files, with the DB as the authoritative runtime representation.
+
+| taskgraph Field (`TaskInput`) | DB Column | Type | Notes |
 |---|---|---|---|
-| `id` | `slug` | text NOT NULL | Direct mapping. No transformation. `slug` is taskgraph-compatible, used in `depends_on` references. |
+| `id` | `slug` | text NOT NULL | Direct mapping. No transformation. `slug` is taskgraph-compatible, used in `dependsOn` references. |
 | `name` | `name` | text NOT NULL | Direct mapping |
 | `status` | `status` | text NOT NULL, enum | Direct mapping: `pending`, `in-progress`, `completed`, `failed`, `blocked`. Default: `pending`. |
-| `depends_on` | `task_dependencies` table | — | Each element creates a row: `depends_on[i]` → `dependsOnTaskId`, task → `dependentTaskId` |
+| `dependsOn` | `task_dependencies` table | — | Each element creates a row: `dependsOn[i]` → `dependsOnTaskId`, task → `dependentTaskId`. Library key is `dependsOn` (camelCase); YAML frontmatter may use `depends_on` which is normalized to `dependsOn` on parse. |
 | `scope` | `scope` | text, enum | `single`, `narrow`, `moderate`, `broad`, `system`. **Nullable** — NULL = not yet assessed. |
 | `risk` | `risk` | text, enum | `trivial`, `low`, `medium`, `high`, `critical`. **Nullable** — NULL = not yet assessed. |
 | `impact` | `impact` | text, enum | `isolated`, `component`, `phase`, `project`. **Nullable** — NULL = not yet assessed. |
@@ -106,7 +108,7 @@ Every field in taskgraph's `TaskFrontmatter` struct maps to a dedicated DB colum
 | `assignee` | `assignee` | text | Assigned agent or person. Nullable. |
 | `due` | `dueAt` | timestamp with tz | Renamed from `due` for DB convention. Nullable. |
 | `created` | `fileCreatedAt` | timestamp with tz | Frontmatter `created` field. Separate from DB `createdAt` (row creation time). Nullable — frontmatter may not include it. |
-| `modified` | `fileModifiedAt` | timestamp with tz | Frontmatter `modified` field. Separate from DB `updatedAt` (row update time). Nullable. |
+| `modified` | `fileModifiedAt` | timestamp with tz | Frontmatter `modified` field. Separate from DB `updatedAt` (row update time). Nullable — frontmatter may not include it. |
 | (body) | `body` | text | Markdown content after frontmatter. Nullable — empty body is valid. |
 | (directory path) | `path` | text | Logical grouping prefix: `architecture`, `implementation/storage`. Nullable — tasks created via API with no file origin have no path. See [Path Semantics](#path-semantics). |
 | (project) | `projectId` | text NOT NULL | FK → projects.id |
@@ -156,7 +158,7 @@ The decomposer template should consume these same enum definitions to ensure DB-

 **Indexes**: `idx_tasks_project_id` on `(projectId)`, `idx_tasks_project_status` on `(projectId, status)` — composite for "find all pending tasks in project X", `idx_tasks_status` on `(status)`, `idx_tasks_active` partial on `(projectId)` WHERE `status IN ('pending', 'in-progress', 'blocked')` — efficiently find active tasks, `idx_tasks_path` on `(path)` **with `text_pattern_ops`** — locale-independent LIKE pattern matching for path prefix queries (e.g., `WHERE path LIKE 'implementation/%'`), `idx_tasks_priority` on `(priority)`, `idx_tasks_assignee` on `(assignee)`, `idx_tasks_due_at` on `(dueAt)`, `idx_tasks_tags` GIN on `(tags)` — for array-contains queries (`tags @> '{security}'`).

-**`slug` semantics**: From taskgraph frontmatter `id` field. Kebab-case identifiers like `auth-setup`, `storage-tasks-table`. Appears in `depends_on` arrays.
+**`slug` semantics**: From taskgraph frontmatter `id` field. Kebab-case identifiers like `auth-setup`, `storage-tasks-table`. Appears in `dependsOn` arrays (library key; YAML: `depends_on`).

 **`path` semantics**: Nullable — tasks created via API with no filesystem origin have no path. When set, captures the logical grouping derived from the `tasks/` directory structure. E.g., a file at `tasks/implementation/storage/tasks-table.md` gets `path: "implementation/storage"`. Enables `WHERE path LIKE 'implementation/%'` (scoped queries) without requiring a `parentId` FK. This replaces the previous `parentId` column — grouping is a path concern, not a tree relationship.

@@ -182,11 +184,11 @@ Dependency edges between tasks. Directed: a row means the dependent task depends

 **Direction**: `dependentTaskId` is the task that has the dependency. `dependsOnTaskId` is the prerequisite task. Together they form a directed edge: `dependentTaskId` → `dependsOnTaskId` meaning "task dependentTaskId depends on task dependsOnTaskId". In the graph, there's an edge from `dependsOnTaskId` → `dependentTaskId` (prerequisite → dependent). This gives correct topological order: prerequisites before dependents.

-**Cross-project dependency guard**: `taskId` and `dependsOnTaskId` MUST reference tasks within the same project. The application layer enforces this constraint — creating a dependency between tasks in different projects is rejected with a validation error. This is not enforced at the DB level (FK constraints allow cross-project references), so the application must check project consistency before insert.
+**Cross-project dependency guard**: `dependentTaskId` and `dependsOnTaskId` MUST reference tasks within the same project. The application layer enforces this constraint — creating a dependency between tasks in different projects is rejected with a validation error. This is not enforced at the DB level (FK constraints allow cross-project references), so the application must check project consistency before insert.

-A future DB-level guard could use a trigger: `BEFORE INSERT ON task_dependencies` that checks `NEW.taskId` and `NEW.dependsOnTaskId` reference tasks in the same project. This is deferred to Phase 2 — the application-layer check is sufficient for now.
+A future DB-level guard could use a trigger: `BEFORE INSERT ON task_dependencies` that checks `NEW.dependentTaskId` and `NEW.dependsOnTaskId` reference tasks in the same project. This is deferred to Phase 2 — the application-layer check is sufficient for now.

-**Sync source**: Dependency edges are authored in task file frontmatter (`depends_on: [other-task]`) and synced to this table during the file → DB sync operation. The sync clears and re-inserts all edges for a task on each run — dependencies are fully replaced by the sync, not merged or modified at runtime.
+**Sync source**: Dependency edges are authored in task file frontmatter (`dependsOn: [other-task]` in the library, `depends_on:` in YAML) and synced to this table during the file → DB sync operation. The sync clears and re-inserts all edges for a task on each run — dependencies are fully replaced by the sync, not merged or modified at runtime.

 ## Why ALL Frontmatter Fields Get Proper Columns

@@ -215,7 +217,7 @@ Taskgraph itself makes these fields `Option<TaskScope>`, `Option<TaskRisk>`, etc
 - Exclude it from cost-benefit analysis (you can't compute risk-path without risk values)
 - Suggest the Decomposer assess it

-For @alkdev/taskgraph operations that need numeric weights, provide fallbacks at the application layer (e.g., treat NULL risk as `low` for topo sort, but warn).
+For @alkdev/taskgraph operations that need numeric weights, provide fallbacks at the application layer. The library's `resolveDefaults()` uses `medium` as the default risk, `narrow` as the default scope, and `isolated` as the default impact. These defaults are used when computing analysis metrics — they do NOT change the DB value (NULL remains NULL in the database).

 ## Path Semantics

@@ -349,19 +351,104 @@ Without them, you just get topological sort — useful, but not structurally ins

 For runtime graph operations, the hub uses **`@alkdev/taskgraph`** — a TypeScript package that wraps graphology and provides a high-level `TaskGraph` class plus analysis functions. The CLI (`taskgraph`) is for offline authoring and analysis; the TS package is for runtime use.

+### Construction
+
 The approach:
 1. Load all `tasks` + `task_dependencies` rows for a project from the DB
-2. Build a `TaskGraph` via `TaskGraph.fromRecords(tasks, edges)`
-3. Run analysis functions as needed: `criticalPath()`, `parallelGroups()`, `bottlenecks()`, `riskPath()`, `shouldDecomposeTask()`, `workflowCost()`
+2. Transform DB rows into `TaskInput[]` and `DependencyEdge[]` shapes (see [Library ↔ DB Field Mapping](#library--db-field-mapping) below)
+3. Build a `TaskGraph` via `TaskGraph.fromRecords(taskInputs, edges)`
+4. Run analysis functions as needed

 This works because realistic task graphs are small — typically 10–50 tasks, rarely exceeding 200 even on large projects. Building a graph from DB rows is instant at this scale (`TaskGraph.fromRecords` with 100 nodes reconstructs in <5ms).

-`@alkdev/taskgraph` exports:
-- **`TaskGraph`** — construction (fromTasks, fromRecords, fromJSON), mutation (addTask, removeTask, addDependency, updateTask), queries (hasCycles, findCycles, topologicalOrder, dependencies, dependents, getTask), validation (validateSchema, validateGraph), export
-- **Analysis functions** — criticalPath, weightedCriticalPath, parallelGroups, bottlenecks, riskPath, riskDistribution, calculateTaskEv, workflowCost, shouldDecomposeTask
-- **Schema types** — TaskScope, TaskRisk, TaskImpact, TaskLevel, TaskPriority, TaskStatus enums with TypeBox schemas
-- **Frontmatter** — parseFrontmatter, serializeFrontmatter (YAML + markdown)
-- **Error classes** — TaskgraphError, CircularDependencyError, TaskNotFoundError, etc.
+### Library ↔ DB Field Mapping
+
+**Task inputs**: `TaskGraph.fromRecords(tasks, edges)` takes `TaskInput[]` (frontmatter-shaped), not DB row shapes. The hub transforms DB rows → `TaskInput`:
+
+| DB Column | TaskInput Field | Notes |
+|-----------|----------------|-------|
+| `slug` | `id` | Direct mapping |
+| `name` | `name` | Direct mapping |
+| `status` | `status` | Direct mapping |
+| `scope` | `scope` | Direct mapping |
+| `risk` | `risk` | Direct mapping |
+| `impact` | `impact` | Direct mapping |
+| `level` | `level` | Direct mapping |
+| `priority` | `priority` | Direct mapping |
+| `tags` | `tags` | Direct mapping |
+
+**Dependency edges**: `DependencyEdge` uses `{ from, to, qualityRetention? }`, not DB column names:
+
+| DB Column | DependencyEdge Field | Notes |
+|-----------|---------------------|-------|
+| `dependsOnTaskId` (prerequisite) | `from` | The prerequisite task that must complete first |
+| `dependentTaskId` (dependent) | `to` | The dependent task that waits for the prerequisite |
+| (no column) | `qualityRetention?` | Per-edge failure propagation weight (0–1, default 0.9). Used by `workflowCost` analysis. Not stored in DB — set at graph construction time. |
+
+### Graph Node vs DB Column Distinction
+
+`TaskGraphNodeAttributes` (what the graph stores per node) is a **subset** of `TaskInput`. The graph intentionally drops fields that aren't relevant to graph algorithms:
+
+| In `TaskInput` | In `TaskGraphNodeAttributes` | Reason |
+|----------------|-------------------------------|--------|
+| `id` | ✅ `id` | Node key |
+| `name` | ✅ `name` | Display |
+| `status` | ✅ `status` | State tracking |
+| `scope` | ✅ `scope` | Analysis |
+| `risk` | ✅ `risk` | Analysis |
+| `impact` | ✅ `impact` | Analysis |
+| `level` | ✅ `level` | Analysis |
+| `priority` | ✅ `priority` | Analysis |
+| `tags` | ❌ | Not used by graph algorithms — available in DB |
+| `assignee` | ❌ | Not used by graph algorithms — available in DB |
+| `due` | ❌ | Not used by graph algorithms — available in DB |
+| `created` | ❌ | Not used by graph algorithms — available in DB |
+| `modified` | ❌ | Not used by graph algorithms — available in DB |
+
+Fields like `tags`, `assignee`, and `due` are fully queryable in the DB and don't need to be in the graph for analysis. If the coordinator needs to filter a graph by assignee, it should query the DB first and then construct a filtered subgraph using `taskGraph.subgraph(filter)`.
+
+### @alkdev/taskgraph Exports
+
+**Construction** — `TaskGraph` class:
+- `TaskGraph.fromTasks(tasks: TaskInput[])` — builds graph from tasks, inferring edges from `dependsOn` arrays
+- `TaskGraph.fromRecords(tasks: TaskInput[], edges: DependencyEdge[])` — builds from tasks + explicit edge list
+- `TaskGraph.fromJSON(data: TaskGraphSerialized)` — deserializes from graphology JSON
+- Mutation: `addTask(task)`, `removeTask(taskId)`, `addDependency(prerequisite, dependent, qualityRetention?)`, `updateTask(taskId, attrs)`
+- Queries: `hasCycles`, `findCycles`, `topologicalOrder`, `dependencies(taskId)`, `dependents(taskId)`, `getTask(taskId)`, `subgraph(filter)`
+- Validation: `validateSchema()`, `validateGraph()`
+- Export: `export()` → `TaskGraphSerialized`, `toJSON()` (alias)
+- Escape hatch: `get graph` → raw graphology `DirectedGraph`
+
+**Analysis functions**:
+- `criticalPath(graph)` — longest path by edge count
+- `weightedCriticalPath(graph, weightFn)` — longest path with custom weight function
+- `parallelGroups(graph)` — groups of tasks that can run concurrently
+- `bottlenecks(graph)` — high-betweenness tasks. Returns `BottleneckResult[]`
+- `riskPath(graph)` — highest cumulative risk path. Returns `RiskPathResult { path, totalRisk }`
+- `riskDistribution(graph)` — risk distribution across graph. Returns `RiskDistributionResult`
+- `shouldDecomposeTask(task)` — decomposition recommendation. Returns `DecomposeResult { shouldDecompose, reasons }`
+- `calculateTaskEv(p, scopeCost, impactWeight, config?)` — expected value math. Returns `EvResult`
+- `workflowCost(graph, options?)` — total workflow cost with failure propagation. Returns `WorkflowCostResult`. Options: `WorkflowCostOptions`
+
+**Categorical numeric methods** (map enum values → numbers for analysis):
+- `scopeCostEstimate(scope)` — numeric scope cost
+- `scopeTokenEstimate(scope)` — token-based scope estimate
+- `riskSuccessProbability(risk)` — probability of success (0–1)
+- `riskWeight(risk)` — weight for risk calculations
+- `impactWeight(impact)` — weight for impact calculations
+- `resolveDefaults(attrs)` — fills default values for unassessed fields and computes derived numeric values. Returns `ResolvedTaskAttributes`
+
+**Schema types** — TypeBox schemas with `Enum` suffix:
+- `TaskStatusEnum`, `TaskScopeEnum`, `TaskRiskEnum`, `TaskImpactEnum`, `TaskLevelEnum`, `TaskPriorityEnum`
+- TypeScript types: `TaskStatus`, `TaskScope`, `TaskRisk`, `TaskImpact`, `TaskLevel`, `TaskPriority`
+
+**Frontmatter**:
+- `parseFrontmatter(content)` — parses YAML + markdown, normalizes `depends_on` → `dependsOn`
+- `serializeFrontmatter(data)` — serializes to YAML + markdown, outputs `dependsOn`
+- `splitFrontmatter(content)` — lower-level helper that splits `---`-delimited YAML from markdown without validating
+
+**Error classes**:
+- `TaskgraphError` (base), `CircularDependencyError`, `TaskNotFoundError`, `DuplicateNodeError`, `DuplicateEdgeError`, `ValidationError`, `GraphValidationError`

 **Why not taskgraph NAPI for v1**: The Rust CLI (`taskgraph`) is for offline authoring and analysis. The TypeScript package (`@alkdev/taskgraph`) handles all runtime graph operations. Graphology is a transitive dependency through `@alkdev/taskgraph` and handles < 200 nodes trivially. NAPI is unnecessary at realistic scales.

@@ -416,7 +503,7 @@ This is a manual step — "I want to run analysis now" — not an automatic sync
 |-------|----------|
 | Invalid YAML frontmatter | Skip file, log warning with file path and parse error. Continue with remaining files. |
 | Missing required `id` or `name` field | Skip file, log warning. Task cannot be synced without these fields. |
-| `depends_on` references non-existent slug within project | Insert the dependency edge anyway (dangling reference). The coordinator detects and warns about unresolvable dependencies. `taskgraph validate` should be run before sync to catch these. |
+| `dependsOn` references non-existent slug within project | Insert the dependency edge anyway (dangling reference). The coordinator detects and warns about unresolvable dependencies. `taskgraph validate` should be run before sync to catch these. |
 | Duplicate `id` (slug) in same project | Fail the sync with a clear error. Slug uniqueness is enforced by the DB constraint `unq_tasks_project_slug`. |
 | File removed from filesystem | DELETE the DB row. FK cascade handles dependent rows. Git preserves history. |

@@ -437,7 +524,7 @@ This is a manual step — "I want to run analysis now" — not an automatic sync
 - Cost-benefit framework: taskgraph framework docs — why categorical estimates are structurally required
 - Workflow guide: taskgraph workflow docs — practical usage patterns
 - Task file format: @alkdev/taskgraph README — field definitions
-- TaskFrontmatter struct: @alkdev/taskgraph package source — canonical field types and defaults
+- TaskFrontmatter struct: @alkdev/taskgraph package source — `TaskInput` type (TypeScript equivalent of Rust `TaskFrontmatter`)
 - taskgraph architecture: taskgraph architecture docs
 - Storage pattern: [README.md](./README.md)
 - Table reference (cross-cutting): [table-reference.md](./table-reference.md)
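
Review note on the "Library ↔ DB Field Mapping" section added above: the row-to-library transform it describes could look roughly like the TypeScript sketch below. This is an illustration under stated assumptions, not the hub's implementation. The `TaskRow`, `TaskDependencyRow`, `TaskInputLike`, and `DependencyEdgeLike` shapes and the helper names `rowToTaskInput` and `rowsToEdges` are hypothetical and declared locally, so nothing is imported from `@alkdev/taskgraph`. The sketch also assumes that `task_dependencies` columns hold `tasks.id` values while graph nodes are keyed by `slug`, and that NULL columns are passed as omitted fields so library defaults apply at analysis time.

```typescript
// Hypothetical local shapes mirroring the fields named in the spec.
// They are NOT the library's exported types.
interface TaskRow {
  id: string; // DB primary key (assumed target of the task_dependencies FKs)
  slug: string;
  name: string;
  status: string;
  scope: string | null;
  risk: string | null;
  impact: string | null;
  level: string | null;
  priority: string | null;
  tags: string[] | null;
}

interface TaskDependencyRow {
  dependentTaskId: string; // the task that waits
  dependsOnTaskId: string; // the prerequisite
}

interface TaskInputLike {
  id: string;
  name: string;
  status: string;
  scope?: string;
  risk?: string;
  impact?: string;
  level?: string;
  priority?: string;
  tags?: string[];
}

interface DependencyEdgeLike {
  from: string; // prerequisite
  to: string; // dependent
  qualityRetention?: number; // not stored in DB; omitted here
}

// slug -> id, everything else copied; NULL columns are left out (assumption).
function rowToTaskInput(row: TaskRow): TaskInputLike {
  const input: TaskInputLike = { id: row.slug, name: row.name, status: row.status };
  if (row.scope !== null) input.scope = row.scope;
  if (row.risk !== null) input.risk = row.risk;
  if (row.impact !== null) input.impact = row.impact;
  if (row.level !== null) input.level = row.level;
  if (row.priority !== null) input.priority = row.priority;
  if (row.tags !== null) input.tags = row.tags;
  return input;
}

// dependsOnTaskId -> from, dependentTaskId -> to, resolved to slugs.
// Rows whose endpoints are not in the loaded task set are reported, not dropped silently.
function rowsToEdges(
  tasks: TaskRow[],
  deps: TaskDependencyRow[],
): { edges: DependencyEdgeLike[]; unresolved: TaskDependencyRow[] } {
  const slugById = new Map<string, string>(
    tasks.map((t): [string, string] => [t.id, t.slug]),
  );
  const edges: DependencyEdgeLike[] = [];
  const unresolved: TaskDependencyRow[] = [];
  for (const dep of deps) {
    const from = slugById.get(dep.dependsOnTaskId);
    const to = slugById.get(dep.dependentTaskId);
    if (from === undefined || to === undefined) {
      unresolved.push(dep);
      continue;
    }
    edges.push({ from, to });
  }
  return { edges, unresolved };
}

// Example rows (made-up data for illustration).
const taskRows: TaskRow[] = [
  { id: "t1", slug: "auth-setup", name: "Auth setup", status: "pending",
    scope: "narrow", risk: null, impact: null, level: null, priority: null, tags: ["security"] },
  { id: "t2", slug: "storage-tasks-table", name: "Tasks table", status: "pending",
    scope: null, risk: "high", impact: null, level: null, priority: null, tags: null },
];
const depRows: TaskDependencyRow[] = [{ dependentTaskId: "t2", dependsOnTaskId: "t1" }];

const taskInputs = taskRows.map(rowToTaskInput);
const { edges, unresolved } = rowsToEdges(taskRows, depRows);
```

The resulting `taskInputs` and `edges` are what would be handed to `TaskGraph.fromRecords(taskInputs, edges)` as described in the Construction steps. Leaving `qualityRetention` unset keeps the per-edge weight a graph-construction concern, consistent with the note that it is not stored in the DB.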