hub/docs/architecture/storage/tasks.md
glm-5.1 93e2286343 Align storage & architecture specs with published npm libraries
Systematically compared @alkdev/taskgraph, @alkdev/operations, and
@alkdev/flowgraph against storage/arch specs and fixed all mismatches.

Key changes:

Tasks (storage/tasks.md + ADR-011):
- Rename TaskFrontmatter → TaskInput to match library export
- Fix dependsOn (was depends_on) in field mappings — library uses
  camelCase; parseFrontmatter normalizes YAML snake_case on input
- Document DependencyEdge shape {from, to, qualityRetention?} and
  DB↔library field mapping
- Document graph node vs DB column distinction (TaskGraphNodeAttrs
  is a subset of TaskInput)
- Fix default risk fallback from low → medium (matches resolveDefaults)
- Fix cross-project guard column references (dependentTaskId, not taskId)
- Clarify @alkdev/taskgraph TS is source of truth; frontmatter is for
  LLM output parsing and legacy imports, not Rust CLI
- Add complete library exports reference

Operations (storage/spokes.md + operations.md):
- Add version, title, _meta columns to operations table (required by
  OperationSpec, were missing)
- Fix type casing: query/mutation/subscription (lowercase, matching
  OperationType runtime values)
- Make outputSchema and accessControl NOT NULL (matching library)
- Document ErrorDefinition shape {code, description, schema, httpStatus?}
- Document _meta vs commonCols.metadata distinction
- Add registerAll, get, getHandler, getByName, list, subscribe methods
- Fix buildCallHandler signature ({ registry, callMap })
- Fix OperationType values (lowercase)

Call graph (storage/call-graph.md + call-graph.md):
- Change operationId to NOT NULL with RESTRICT FK (was nullable/SET NULL)
  — matches flowgraph's required CallNodeAttrs.operationId
- Document sentinel __removed__ operation strategy for deletions
- Document ISO 8601 string ↔ timestamptz conversion requirement
- Rewrite CallEventMap to match actual library: flat dot-notation keys,
  timestamp on all events, nested error structure, optional output on
  completed event
- Remove call.running event (doesn't exist in library) — hub calls
  updateStatus(running) directly on dispatch
- Fix buildCallHandler({ registry, callMap }) signature
- Fix PendingRequestMap constructor (positional EventTarget)
- Add updateCall/removeCall/graph methods to API summary
- Document abort cascade as hub logic, not flowgraph logic
- Add open questions for operation deletion and reactive vs call graph
  semantics

Table reference (storage/table-reference.md):
- Update call_graph_nodes.operationId cascade to RESTRICT
- Update operations.type comment to lowercase
- Update status enum reference
2026-05-25 11:46:42 +00:00


status: draft
last_updated: 2026-05-25

Storage: Tasks & Task Dependencies

Tasks are the unit of work in the Spec-Driven Development (SDD) process. The database is the source of truth for task data at runtime. Markdown files serve as the authoring surface for the Decomposer role and the taskgraph CLI — they are ingested into the DB via a sync operation and can be exported back for offline analysis.

For the overall storage pattern, see README.md. For cross-cutting table reference (common columns, cascade behavior, index reference, status enums, relations), see table-reference.md. For design decisions, see ../../decisions/.

Overview

Why Database as Source of Truth

Taskgraph's file-based model works well for single-agent, single-worktree workflows. In the hub's multi-agent, multi-worktree environment, files create problems:

  • Parallel worktrees: Agent A marks a task in-progress in their worktree's file. Agent B can't see this — the file lives in A's working directory. The coordinator can't get a consistent view.
  • Reliable coordination: The coordinator needs to query "which tasks are pending?" and "what's blocking task X?" at runtime without scanning filesystems across worktrees.
  • Atomic status updates: An agent calling hub.task.updateStatus gets an immediate, transactional state change visible to all other agents and the coordinator.

The database is the authoritative, queryable, concurrent-safe representation. Files are the authoring format.

Relationship to taskgraph CLI

The taskgraph CLI operates on markdown files. Its value is in offline analysis: topo, cycles, parallel, critical, bottleneck, risk-path, decompose. These commands depend on categorical fields (scope, risk, impact, level) being assessed.

The workflow is:

  1. Author — Decomposer creates/edits markdown files using taskgraph init and direct editing
  2. Sync — Files are ingested into the DB (files → DB)
  3. Execute — Coordinator and agents query and mutate the DB via hub operations
  4. Analyze — When needed, export from DB to files, run taskgraph risk-path etc.

The taskgraph CLI is not required at runtime. The hub uses @alkdev/taskgraph for runtime graph operations (topological sort, cycle detection, parallel groups, critical path, risk analysis) — see Graphology Integration.

Task Authority Model

| Aspect | Authority | Why |
|---|---|---|
| Task structure (all fields) | DB | Queryable, concurrent-safe, consistent |
| Task specification (body) | DB (body column) | Stored as markdown text; agents append notes during execution |
| Task authoring/creation | Files → sync → DB | Decomposer edits files; sync ingests them |
| Runtime status mutations | DB (hub operations) | hub.task.* operations, called by the coordinator and agents |
| Offline graph analysis | Files (taskgraph CLI) | Export from DB when needed for taskgraph risk-path etc. |

See Field Authority Split for the explicit list of authored vs runtime-managed fields.

Field Authority Split

Fields are split into two categories based on who writes them:

Authored Fields (upserted by file sync)

These fields are written by the Decomposer/file sync. The ON CONFLICT DO UPDATE SET clause in the sync upsert includes only these columns:

| Field | DB Column |
|---|---|
| id | slug |
| name | name |
| (project) | projectId |
| (directory path) | path |
| scope | scope |
| risk | risk |
| impact | impact |
| level | level |
| priority | priority |
| tags | tags |
| assignee | assignee |
| due | dueAt |
| (body) | body |
| created | fileCreatedAt |
| modified | fileModifiedAt |
| dependsOn | task_dependencies table |

Note: projectId is set from the project context during sync (the task file's location within a project's tasks/ directory determines the project), not from taskgraph frontmatter. commonCols fields (id, metadata, createdAt, updatedAt) are DB-generated and not part of the sync conflict domain.

Runtime-Managed Fields (mutated via hub.task.* operations only)

These fields are never overwritten by sync. They are only mutated by hub operations (hub.task.updateStatus, hub.task.addNote, etc.):

| Field | DB Column | Set By |
|---|---|---|
| status | status | hub.task.updateStatus |
| (started timestamp) | startedAt | hub.task.updateStatus (on in-progress) |
| (completed timestamp) | completedAt | hub.task.updateStatus (on completed) |

Warning: Sync must never write status, startedAt, or completedAt. These are owned by hub operations. The sync upsert uses ON CONFLICT DO UPDATE SET only for authored fields; runtime fields are excluded from the SET clause.
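
The authored/runtime split can be made mechanical by generating the upsert from an explicit column list. The following is a minimal sketch, not the hub's actual sync code: the function name `buildTaskUpsertSql`, the quoting style, and the positional parameters are assumptions for illustration. It relies on the DB defaults described in this document (status defaults to pending, commonCols are DB-generated) for newly inserted rows.

```typescript
// Authored columns (from the table above) that sync may overwrite on conflict.
// projectId and slug are the conflict key, so they are inserted but never SET.
const AUTHORED_COLUMNS = [
  "name", "path", "scope", "risk", "impact", "level", "priority",
  "tags", "assignee", "dueAt", "body", "fileCreatedAt", "fileModifiedAt",
] as const;

// Runtime-managed columns: must never appear anywhere in the sync upsert.
const RUNTIME_COLUMNS = ["status", "startedAt", "completedAt"] as const;

// Builds a parameterized upsert keyed on (projectId, slug).
// Hypothetical helper; a real implementation would likely use the ORM instead.
function buildTaskUpsertSql(): string {
  const insertCols: string[] = ["projectId", "slug", ...AUTHORED_COLUMNS];
  const colList = insertCols.map((c) => `"${c}"`).join(", ");
  const params = insertCols.map((_, i) => `$${i + 1}`).join(", ");
  const setClause = AUTHORED_COLUMNS
    .map((c) => `"${c}" = EXCLUDED."${c}"`)
    .join(", ");
  return (
    `INSERT INTO tasks (${colList}) VALUES (${params}) ` +
    `ON CONFLICT ("projectId", "slug") DO UPDATE SET ${setClause}`
  );
}
```

Keeping the runtime columns in a separate constant makes it cheap to assert, in a unit test, that none of them ever leaks into the generated statement.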

Field Mapping: taskgraph TaskInput → DB Columns

Every field in taskgraph's TaskInput type (the TypeScript equivalent of the Rust TaskFrontmatter struct) maps to a dedicated DB column. No TaskInput fields are relegated to JSONB metadata.

Naming note: The library exports TaskInput, not TaskFrontmatter. The JSDoc confirms it "matches the Rust TaskFrontmatter field set." The YAML key for dependencies is dependsOn in the library (camelCase); parseFrontmatter() normalizes depends_on → dependsOn on input, and serializeFrontmatter() outputs dependsOn. @alkdev/taskgraph (TypeScript) is the source of truth for the frontmatter format. The Rust CLI is not used going forward; frontmatter is used for LLM output parsing and importing legacy task files, with the DB as the authoritative runtime representation.

| taskgraph Field (TaskInput) | DB Column | Type | Notes |
|---|---|---|---|
| id | slug | text NOT NULL | Direct mapping. No transformation. slug is taskgraph-compatible, used in dependsOn references. |
| name | name | text NOT NULL | Direct mapping |
| status | status | text NOT NULL, enum | Direct mapping: pending, in-progress, completed, failed, blocked. Default: pending. |
| dependsOn | task_dependencies table | | Each element creates a row: dependsOn[i] → dependsOnTaskId, task → dependentTaskId. Library key is dependsOn (camelCase); YAML frontmatter may use depends_on, which is normalized to dependsOn on parse. |
| scope | scope | text, enum | single, narrow, moderate, broad, system. Nullable (NULL = not yet assessed). |
| risk | risk | text, enum | trivial, low, medium, high, critical. Nullable (NULL = not yet assessed). |
| impact | impact | text, enum | isolated, component, phase, project. Nullable (NULL = not yet assessed). |
| level | level | text, enum | planning, decomposition, implementation, review, research. Nullable (NULL = not yet assessed). |
| priority | priority | text, enum | low, medium, high, critical. Nullable. |
| tags | tags | text[] | String array. Default {}. |
| assignee | assignee | text | Assigned agent or person. Nullable. |
| due | dueAt | timestamp with tz | Renamed from due for DB convention. Nullable. |
| created | fileCreatedAt | timestamp with tz | Frontmatter created field. Separate from DB createdAt (row creation time). Nullable: frontmatter may not include it. |
| modified | fileModifiedAt | timestamp with tz | Frontmatter modified field. Separate from DB updatedAt (row update time). Nullable: frontmatter may not include it. |
| (body) | body | text | Markdown content after frontmatter. Nullable: empty body is valid. |
| (directory path) | path | text | Logical grouping prefix: architecture, implementation/storage. Nullable: tasks created via API with no file origin have no path. See Path Semantics. |
| (project) | projectId | text NOT NULL | FK → projects.id |
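
To make the mapping table concrete, here is a minimal sketch of the transformation sync would apply to one parsed file. The `TaskInputLike` and `AuthoredTaskRow` shapes and the `toAuthoredRow` helper are hypothetical stand-ins written for this example; the real TaskInput type comes from @alkdev/taskgraph and may differ in detail. The sketch also assumes date fields arrive as ISO 8601 strings.

```typescript
// Hypothetical local stand-in for the library's TaskInput (illustration only).
interface TaskInputLike {
  id: string;
  name: string;
  status?: string;       // runtime-managed: deliberately not copied to the row
  dependsOn?: string[];  // goes to task_dependencies, not to a tasks column
  scope?: string;
  risk?: string;
  impact?: string;
  level?: string;
  priority?: string;
  tags?: string[];
  assignee?: string;
  due?: string;          // assumed ISO 8601
  created?: string;      // assumed ISO 8601
  modified?: string;     // assumed ISO 8601
}

// Authored columns of the tasks table, as listed in the mapping above.
interface AuthoredTaskRow {
  projectId: string;
  slug: string;
  name: string;
  path: string | null;
  scope: string | null;
  risk: string | null;
  impact: string | null;
  level: string | null;
  priority: string | null;
  tags: string[];
  assignee: string | null;
  dueAt: Date | null;
  body: string | null;
  fileCreatedAt: Date | null;
  fileModifiedAt: Date | null;
}

const toDate = (value?: string): Date | null => (value ? new Date(value) : null);

// Maps one parsed task file to the authored columns. Unassessed fields stay NULL.
function toAuthoredRow(
  input: TaskInputLike,
  ctx: { projectId: string; path: string | null; body: string | null },
): AuthoredTaskRow {
  return {
    projectId: ctx.projectId,
    slug: input.id,
    name: input.name,
    path: ctx.path,
    scope: input.scope ?? null,
    risk: input.risk ?? null,
    impact: input.impact ?? null,
    level: input.level ?? null,
    priority: input.priority ?? null,
    tags: input.tags ?? [],
    assignee: input.assignee ?? null,
    dueAt: toDate(input.due),
    body: ctx.body,
    fileCreatedAt: toDate(input.created),
    fileModifiedAt: toDate(input.modified),
  };
}
```

Note that status and dependsOn are intentionally absent from the row type: the first is owned by hub operations, the second is written to task_dependencies.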

Table Schemas

tasks

SDD task definitions. The database is the source of truth for task data at runtime. Markdown files serve as the authoring surface for the Decomposer and taskgraph CLI; they are ingested into the DB via a sync operation. Every field in taskgraph's TaskInput type maps to a dedicated DB column (no frontmatter fields in metadata JSONB).

| Column | Type | Notes |
|---|---|---|
| commonCols | | id, metadata, createdAt, updatedAt |
| projectId | text NOT NULL | FK → projects.id (cascade). Tasks belong to a project. |
| slug | text NOT NULL | taskgraph id: kebab-case identifier used in dependsOn references. Unique within a project. |
| name | text NOT NULL | Human-readable task name (from frontmatter name) |
| path | text | Logical grouping prefix derived from filesystem location (e.g., architecture, implementation/storage). Nullable: tasks created via API with no file origin have no path. Enables WHERE path LIKE 'implementation/%' for scoped queries. |
| status | text NOT NULL | Enum: pending, in-progress, completed, failed, blocked. Default: pending. Status transitions go through hub operations, not file edits. |
| scope | text | Categorical scope: single, narrow, moderate, broad, system. Nullable (NULL = not yet assessed). See Why Categorical Fields Are Nullable. |
| risk | text | Categorical risk: trivial, low, medium, high, critical. Nullable (NULL = not yet assessed). |
| impact | text | Categorical impact: isolated, component, phase, project. Nullable (NULL = not yet assessed). |
| level | text | Task level: planning, decomposition, implementation, review, research. Nullable (NULL = not yet assessed). |
| priority | text | Priority: low, medium, high, critical. Nullable. |
| assignee | text | Assigned agent or person. Nullable. |
| dueAt | timestamp with tz | Due date (from frontmatter due). Nullable. |
| tags | text[] | Filtering tags. Default {}. GIN index for array-contains queries. |
| body | text | Markdown task specification (from file body after frontmatter). Nullable: empty body is valid. Agents may append notes during execution. |
| fileCreatedAt | timestamp with tz | Frontmatter created field: file creation time from the markdown. Separate from DB createdAt (row creation time). Nullable. |
| fileModifiedAt | timestamp with tz | Frontmatter modified field: file modification time from the markdown. Separate from DB updatedAt (row update time). Nullable. |
| startedAt | timestamp with tz | When status became in-progress. Set by hub operation, not by agent. |
| completedAt | timestamp with tz | When status became completed. Set by hub operation. |

Unique constraint: unq_tasks_project_slug UNIQUE on (projectId, slug) — task slugs are unique within a project.

pgEnum Definitions: The following enum columns use PostgreSQL pgEnum for type safety. Drizzle's pgEnum generates named PostgreSQL enums and provides TypeScript type inference. The enum values are aligned with taskgraph's categorical fields.

import { pgEnum } from "drizzle-orm/pg-core";

export const taskStatus = pgEnum("task_status", ["pending", "in-progress", "completed", "failed", "blocked"]);
export const taskScope = pgEnum("task_scope", ["single", "narrow", "moderate", "broad", "system"]);
export const taskRisk = pgEnum("task_risk", ["trivial", "low", "medium", "high", "critical"]);
export const taskImpact = pgEnum("task_impact", ["isolated", "component", "phase", "project"]);
export const taskLevel = pgEnum("task_level", ["planning", "decomposition", "implementation", "review", "research"]);
export const taskPriority = pgEnum("task_priority", ["low", "medium", "high", "critical"]);

The decomposer template should consume these same enum definitions to ensure DB-level constraints match the application-level typing.

Indexes:

  • idx_tasks_project_id on (projectId)
  • idx_tasks_project_status on (projectId, status): composite for "find all pending tasks in project X"
  • idx_tasks_status on (status)
  • idx_tasks_active partial on (projectId) WHERE status IN ('pending', 'in-progress', 'blocked'): efficiently find active tasks
  • idx_tasks_path on (path) with text_pattern_ops: locale-independent LIKE pattern matching for path prefix queries (e.g., WHERE path LIKE 'implementation/%')
  • idx_tasks_priority on (priority)
  • idx_tasks_assignee on (assignee)
  • idx_tasks_due_at on (dueAt)
  • idx_tasks_tags GIN on (tags): for array-contains queries (tags @> '{security}')

slug semantics: From taskgraph frontmatter id field. Kebab-case identifiers like auth-setup, storage-tasks-table. Appears in dependsOn arrays (library key; YAML: depends_on).

path semantics: Nullable — tasks created via API with no filesystem origin have no path. When set, captures the logical grouping derived from the tasks/ directory structure. E.g., a file at tasks/implementation/storage/tasks-table.md gets path: "implementation/storage". Enables WHERE path LIKE 'implementation/%' (scoped queries) without requiring a parentId FK. This replaces the previous parentId column — grouping is a path concern, not a tree relationship.

No parentId column: Grouping is handled by path, dependencies by task_dependencies. A "meta task" is just a regular task that depends on its sub-tasks — no special entity type needed.

No removedAt column: When a task file is removed, the sync operation DELETEs the DB row. Git history preserves the file-level history; the DB doesn't need to duplicate it with soft deletes. FK cascade handles cleanup.

metadata JSONB: Reserved for truly ad-hoc data not in the taskgraph schema. No taskgraph frontmatter fields are stored here — all have proper columns.

task_dependencies

Dependency edges between tasks. Directed: a row means the dependent task depends on the prerequisite task (prerequisite must complete before dependent can start). Mirrors the taskgraph dependsOn relationship (depends_on in YAML).

| Column | Type | Notes |
|---|---|---|
| commonCols | | id, metadata, createdAt, updatedAt |
| dependsOnTaskId | text NOT NULL | FK → tasks.id (cascade). The prerequisite task (must complete first). |
| dependentTaskId | text NOT NULL | FK → tasks.id (cascade). The dependent task (waits for prerequisite). |

Unique constraint: unq_task_dependencies_depends_on_task UNIQUE on (dependsOnTaskId, dependentTaskId) — no duplicate dependency edges.

Indexes: idx_task_dependencies_depends_on_task_id on (dependsOnTaskId) — "what depends on this task?", idx_task_dependencies_dependent_task_id on (dependentTaskId) — "what does this task depend on?".

Direction: dependentTaskId is the task that has the dependency. dependsOnTaskId is the prerequisite task. Together they form a directed relation dependentTaskId → dependsOnTaskId, meaning "task dependentTaskId depends on task dependsOnTaskId". In the graph, the edge runs the other way, dependsOnTaskId → dependentTaskId (prerequisite → dependent). This gives correct topological order: prerequisites before dependents.

Cross-project dependency guard: dependentTaskId and dependsOnTaskId MUST reference tasks within the same project. The application layer enforces this constraint — creating a dependency between tasks in different projects is rejected with a validation error. This is not enforced at the DB level (FK constraints allow cross-project references), so the application must check project consistency before insert.

A future DB-level guard could use a trigger: BEFORE INSERT ON task_dependencies that checks NEW.dependentTaskId and NEW.dependsOnTaskId reference tasks in the same project. This is deferred to Phase 2 — the application-layer check is sufficient for now.

Sync source: Dependency edges are authored in task file frontmatter (dependsOn: [other-task] in the library, depends_on: in YAML) and synced to this table during the file → DB sync operation. The sync clears and re-inserts all edges for a task on each run — dependencies are fully replaced by the sync, not merged or modified at runtime.
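
The same-project guard and the full-replacement semantics can be combined in one step that computes the rows to insert after the existing edges for a task have been deleted. This is a minimal sketch under stated assumptions: `TaskRef`, `DependencyRow`, and `buildDependencyRows` are hypothetical names, duplicate slugs in dependsOn are collapsed, and an unknown slug is treated as an error (the document does not specify how sync handles dangling references).

```typescript
// Minimal view of a tasks row needed to build edges (hypothetical shape).
interface TaskRef {
  id: string;        // tasks.id (DB-generated)
  projectId: string;
  slug: string;      // taskgraph id used in dependsOn arrays
}

interface DependencyRow {
  dependsOnTaskId: string; // prerequisite
  dependentTaskId: string; // task that waits
}

// Computes the complete edge set for one task from its dependsOn slugs.
// The caller deletes the task's existing edges first, then inserts these rows.
function buildDependencyRows(
  dependent: TaskRef,
  dependsOn: string[],
  tasksBySlug: Map<string, TaskRef>,
): DependencyRow[] {
  const rows: DependencyRow[] = [];
  // Collapse duplicates so the unique constraint is never violated.
  for (const slug of Array.from(new Set(dependsOn))) {
    const prerequisite = tasksBySlug.get(slug);
    if (prerequisite === undefined) {
      // Assumption: dangling references are rejected rather than skipped.
      throw new Error(`Unknown dependency slug: ${slug}`);
    }
    if (prerequisite.projectId !== dependent.projectId) {
      // Application-layer cross-project guard described above.
      throw new Error(`Cross-project dependency rejected: ${dependent.slug} -> ${slug}`);
    }
    rows.push({ dependsOnTaskId: prerequisite.id, dependentTaskId: dependent.id });
  }
  return rows;
}
```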

Why ALL Frontmatter Fields Get Proper Columns

ADR-001 establishes the pattern: "separate structured columns for high-query, high-filter fields." For tasks, every taskgraph frontmatter field is queryable and filterable in the coordinator's workflow:

  • priority — "show me high-priority pending tasks" (coordinator prioritization)
  • assignee — "which tasks are assigned to agent X?" (work assignment)
  • dueAt — "which tasks are due this week?" (deadline tracking)
  • tags — "filter by tag" (cross-cutting concerns)

Shoving these into metadata JSONB loses type safety, indexability, and SQL queryability — exactly the problems the database is meant to solve. The metadata JSONB column (from commonCols) is reserved for truly ad-hoc data that isn't in the taskgraph schema.

Why Categorical Fields Are Nullable (Not NOT NULL with Defaults)

The previous design made scope, risk, impact, and level NOT NULL with defaults (narrow, low, isolated, implementation). This conflated two states:

  • Assessed as low — the Decomposer evaluated this and determined the risk is low
  • Not assessed — nobody filled this in

Hiding the distinction with defaults means the coordinator can't distinguish a deliberate assessment from a gap. NULL is the correct signal for "not yet assessed."

Taskgraph itself makes these fields Option<TaskScope>, Option<TaskRisk>, etc. — nullable. The DB should match the source model.

Application-layer handling: When scope, risk, impact, or level is NULL, the coordinator should:

  • Warn that the task hasn't been assessed
  • Exclude it from cost-benefit analysis (you can't compute risk-path without risk values)
  • Suggest the Decomposer assess it

For @alkdev/taskgraph operations that need numeric weights, provide fallbacks at the application layer. The library's resolveDefaults() uses medium as the default risk, narrow as the default scope, and isolated as the default impact. These defaults are used when computing analysis metrics — they do NOT change the DB value (NULL remains NULL in the database).
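
A minimal sketch of this application-layer handling follows. It is a local illustration, not the library's resolveDefaults() implementation: the `forAnalysis` helper and its return shape are assumptions. It applies the fallback values named above for analysis only, reports which fields were unassessed so the coordinator can warn, and leaves the input row untouched.

```typescript
type TaskScope = "single" | "narrow" | "moderate" | "broad" | "system";
type TaskRisk = "trivial" | "low" | "medium" | "high" | "critical";
type TaskImpact = "isolated" | "component" | "phase" | "project";

// Nullable categorical columns as stored in the tasks table.
interface CategoricalColumns {
  scope: TaskScope | null;
  risk: TaskRisk | null;
  impact: TaskImpact | null;
}

// Values safe to feed into analysis, plus the list of fields that were NULL.
interface AnalysisView {
  scope: TaskScope;
  risk: TaskRisk;
  impact: TaskImpact;
  unassessed: string[];
}

// Hypothetical helper: fills fallbacks for analysis without mutating the row.
function forAnalysis(row: CategoricalColumns): AnalysisView {
  const unassessed: string[] = [];
  if (row.scope === null) unassessed.push("scope");
  if (row.risk === null) unassessed.push("risk");
  if (row.impact === null) unassessed.push("impact");
  return {
    scope: row.scope ?? "narrow",
    risk: row.risk ?? "medium",
    impact: row.impact ?? "isolated",
    unassessed,
  };
}
```

A non-empty `unassessed` list is the signal for the coordinator to warn and to suggest that the Decomposer assess the task.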

Path Semantics

The path column captures the logical grouping of tasks, derived from their location in the tasks/ directory hierarchy:

tasks/
├── architecture/
│   ├── auth-design.md          → path: "architecture"
│   └── storage-overview.md     → path: "architecture"
├── research/
│   └── embedding-approach.md   → path: "research"
└── implementation/
    ├── storage/
    │   ├── tasks-table.md      → path: "implementation/storage"
    │   └── relations.md        → path: "implementation/storage"
    └── auth/
        └── oauth-flow.md       → path: "implementation/auth"

path is nullable because tasks created at runtime via hub operations (not synced from files) have no filesystem origin.
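
The derivation shown in the tree can be sketched as a small pure function. `derivePath` is a hypothetical helper; it assumes POSIX-style separators and treats a file placed directly under tasks/ as having no grouping (null), a case the tree above does not cover.

```typescript
// Derives the logical grouping path from a task file's location.
// Hypothetical helper; assumes "/" separators and a single tasks root directory.
function derivePath(filePath: string, tasksRoot: string = "tasks"): string | null {
  const segments = filePath.split("/").filter((s) => s.length > 0);
  const rootIndex = segments.indexOf(tasksRoot);
  if (rootIndex === -1) {
    throw new Error(`File is not under ${tasksRoot}/: ${filePath}`);
  }
  // Drop the root directory and the file name; what remains is the grouping.
  const dirs = segments.slice(rootIndex + 1, -1);
  // Assumption: a file directly under tasks/ has no grouping prefix.
  return dirs.length > 0 ? dirs.join("/") : null;
}
```

For example, `derivePath("tasks/implementation/storage/tasks-table.md")` yields `"implementation/storage"`, matching the tree above.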

path enables scoped queries:

  • WHERE path = 'architecture' — all architecture tasks
  • WHERE path LIKE 'implementation/%' — all implementation tasks
  • WHERE path = 'implementation/storage' — storage implementation tasks

This is a prefix-based grouping mechanism. It replaces parentId (which was not in the taskgraph model and conflated organizational grouping with dependency ordering).

Locale sensitivity: The path column uses text type with the database's default collation. Under a non-C collation, a plain B-tree index on path cannot be used for LIKE prefix queries (WHERE path LIKE 'implementation/%'). Either apply COLLATE "C" to the column or index (or ensure the default collation is C/POSIX), or create the index with the text_pattern_ops operator class: CREATE INDEX idx_tasks_path ON tasks (path text_pattern_ops). With text_pattern_ops, the index supports LIKE and anchored ~ pattern matching regardless of collation.

Grouping vs Dependencies

There is no parentId column. Task grouping and dependency ordering are separate concepts:

  • Grouping → path column. "This task belongs to the implementation/storage group." Enables scoped queries. Derived from filesystem layout during sync.
  • Dependencies → task_dependencies table. "This task cannot start until that task completes." Enables topological sort, cycle detection, critical path. Derived from dependsOn frontmatter (depends_on in YAML).

A "meta task" (e.g., "implement storage") is simply a task whose dependsOn lists all its sub-tasks. There is no special entity type: it is a regular task plus dependency edges. The coordinator picks up the meta task as an assignment, and the implementation specialist works through sub-tasks in dependency order.

Why not parentId: parentId was invented in a previous doc revision but has no basis in the taskgraph data model. It created confusion:

  • Redundant with task_dependencies (a meta task's dependencies ARE its sub-tasks)
  • Required a fragile "inference from directory structure" during sync
  • Violated the invariant that the DB schema mirrors the taskgraph frontmatter model

Relationship to Existing Tables

mappings Table

The mappings table links sessions to coordinators, spokes, and worktrees. A taskId column references the task a mapping is assigned to:

taskId: text REFERENCES tasks(id)   // FK to tasks
task: text                           // denormalized display name (e.g., task.slug or task.name)

This preserves the quick-reference pattern (coordinators can list mappings with task names without a JOIN) while maintaining referential integrity.

projects Table

Tasks belong to a project via tasks.projectId. A project's tasks live in the project's tasks/ directory. Cross-project task dependencies are not supported — tasks can only depend on other tasks within the same project. This is enforced at the application level (see task_dependencies cross-project guard).

sessions Table

Sessions are linked to tasks indirectly through mappings. When the coordinator spawns a session for a meta task:

  1. The task row already exists in tasks (synced from file or created via API)
  2. Creates a sessions row for the implementation specialist
  3. Creates a mappings row with taskId pointing to the meta task

Task Status Lifecycle

pending → in-progress → completed
                      ↘ failed → in-progress (retry)
                      ↘ blocked → in-progress (unblocked)

| Status | Meaning |
|---|---|
| pending | Task exists, not yet started |
| in-progress | A session is actively working on this task |
| completed | Task finished successfully |
| failed | Task failed, may retry (Safe Exit protocol) |
| blocked | Task is blocked by an unmet dependency or external issue |

Status transitions go through hub operations (hub.task.updateStatus), not file edits. This ensures:

  • All agents see consistent state immediately
  • The coordinator can query "which tasks are pending?" reliably
  • No merge conflicts from parallel file edits

Timestamp columns startedAt and completedAt track when a task entered in-progress and completed states respectively. These are set by the hub operation, not by the agent.
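
The lifecycle can be expressed as a transition table that hub.task.updateStatus consults before writing. This is a minimal sketch of one reading of the diagram, not the hub's implementation: it assumes failed and blocked are reachable only from in-progress, that completed is terminal, and that startedAt is overwritten each time a task re-enters in-progress. None of these points is stated explicitly in this document, and `applyTransition` is a hypothetical name.

```typescript
type TaskStatus = "pending" | "in-progress" | "completed" | "failed" | "blocked";

// Allowed transitions, as read from the lifecycle diagram (assumption).
const ALLOWED_TRANSITIONS: Record<TaskStatus, readonly TaskStatus[]> = {
  "pending": ["in-progress"],
  "in-progress": ["completed", "failed", "blocked"],
  "completed": [],
  "failed": ["in-progress"],  // retry
  "blocked": ["in-progress"], // unblocked
};

interface StatusFields {
  status: TaskStatus;
  startedAt: Date | null;
  completedAt: Date | null;
}

// Returns the new runtime-managed fields, or throws on an illegal transition.
// Timestamps are set here (by the hub operation), never supplied by the agent.
function applyTransition(current: StatusFields, next: TaskStatus, now: Date): StatusFields {
  if (ALLOWED_TRANSITIONS[current.status].indexOf(next) === -1) {
    throw new Error(`Illegal transition: ${current.status} -> ${next}`);
  }
  return {
    status: next,
    startedAt: next === "in-progress" ? now : current.startedAt,
    completedAt: next === "completed" ? now : current.completedAt,
  };
}
```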

Task Notes (Append-Only)

Agents may need to add notes to a task during execution (observations, partial progress, blockers encountered). For v1, this is handled by appending markdown to the body column:

## Task Description (original)

Implement the tasks table with Drizzle-TypeBox pattern...

## Implementation Notes

- 2026-04-19: Started with table definition, commonCols pattern works
- 2026-04-19: Hit issue with text[] type for tags — need to check Drizzle support

The hub.task.addNote operation appends a timestamped note section to the end of body. This is simple, preserves the full context in one place, and requires no additional tables.

Concurrency model for hub.task.addNote: Notes are appended to the task body field using DB-level concatenation: UPDATE tasks SET body = COALESCE(body, '') || $note WHERE id = $taskId. This avoids read-modify-write cycles entirely — the append is atomic at the SQL level, eliminating race conditions between concurrent agents.
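
A minimal sketch of the append path follows. The SQL string is the statement quoted above with positional parameters; the note layout produced by `formatNote` (a heading carrying an ISO 8601 timestamp) is a hypothetical format chosen for illustration, and `appendNote` is only an in-memory mirror of the SQL expression for reasoning and testing.

```typescript
// Single-statement atomic append, as described in the concurrency model.
// $1 = formatted note text, $2 = task id.
const APPEND_NOTE_SQL =
  "UPDATE tasks SET body = COALESCE(body, '') || $1 WHERE id = $2";

// Hypothetical note layout: blank line, timestamped heading, then the note text.
function formatNote(text: string, at: Date): string {
  return `\n\n## Note (${at.toISOString()})\n\n${text.trim()}\n`;
}

// In-memory equivalent of COALESCE(body, '') || note.
function appendNote(body: string | null, note: string): string {
  return (body ?? "") + note;
}
```

Because each note is self-delimiting (it starts with its own blank line and heading), two concurrent appends may land in either order but cannot interleave or overwrite each other.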

As a fallback for scenarios where DB-level concatenation isn't feasible, optimistic locking via updatedAt can be used: read the current updatedAt, append the note, and UPDATE WHERE updatedAt = readValue. If the row was updated between read and write, the UPDATE affects 0 rows and the operation must be retried. This is sufficient for the expected low-contention scenario (one agent at a time writing notes to a task).

For high-contention scenarios (multiple agents writing simultaneously), consider a separate task_notes table with INSERT operations instead of UPDATE appends.

If structured, multi-agent notes become necessary later, a dedicated task_notes table can be added. The body append pattern doesn't preclude this — it's additive.

Why Categorical Estimates Matter

The scope, risk, impact, and level fields are not cosmetic metadata — they are what make taskgraph's analysis commands produce useful results. The cost-benefit framework (see taskgraph framework docs) demonstrates a structural property: upstream failures multiply downstream damage.

These fields power:

  • taskgraph decompose — flags tasks where risk > medium or scope > moderate
  • taskgraph risk-path — finds the highest cumulative risk path
  • taskgraph critical — finds completion blockers
  • taskgraph bottleneck — finds high-betweenness tasks

Without them, you just get topological sort — useful, but not structurally insightful. The DB columns for these fields are nullable (NULL = not assessed) rather than NOT NULL with defaults, because the distinction between "deliberately assessed as low" and "nobody filled this in" is itself valuable information for the coordinator.

Graphology Integration (Runtime Graph Ops)

For runtime graph operations, the hub uses @alkdev/taskgraph — a TypeScript package that wraps graphology and provides a high-level TaskGraph class plus analysis functions. The CLI (taskgraph) is for offline authoring and analysis; the TS package is for runtime use.

Construction

The approach:

  1. Load all tasks + task_dependencies rows for a project from the DB
  2. Transform DB rows into TaskInput[] and DependencyEdge[] shapes (see Library ↔ DB Field Mapping below)
  3. Build a TaskGraph via TaskGraph.fromRecords(taskInputs, edges)
  4. Run analysis functions as needed

This works because realistic task graphs are small: typically 10–50 tasks, rarely exceeding 200 even on large projects. Building a graph from DB rows is effectively instant at this scale (TaskGraph.fromRecords with 100 nodes reconstructs in <5ms).

Library ↔ DB Field Mapping

Task inputs: TaskGraph.fromRecords(tasks, edges) takes TaskInput[] (frontmatter-shaped), not DB row shapes. The hub transforms DB rows → TaskInput:

| DB Column | TaskInput Field | Notes |
|---|---|---|
| slug | id | Direct mapping |
| name | name | Direct mapping |
| status | status | Direct mapping |
| scope | scope | Direct mapping |
| risk | risk | Direct mapping |
| impact | impact | Direct mapping |
| level | level | Direct mapping |
| priority | priority | Direct mapping |
| tags | tags | Direct mapping |

Dependency edges: DependencyEdge uses { from, to, qualityRetention? }, not DB column names:

| DB Column | DependencyEdge Field | Notes |
|---|---|---|
| dependsOnTaskId (prerequisite) | from | The prerequisite task that must complete first |
| dependentTaskId (dependent) | to | The dependent task that waits for the prerequisite |
| (no column) | qualityRetention? | Per-edge failure propagation weight (0–1, default 0.9). Used by workflowCost analysis. Not stored in DB; set at graph construction time. |
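
The two mappings above can be sketched as one transformation that prepares the arguments for TaskGraph.fromRecords. The types below are local, hypothetical stand-ins for the library's TaskInput and DependencyEdge, and the sketch assumes edge endpoints must use the same key as the graph nodes (the slug, since slug maps to TaskInput.id), so DB row ids are translated to slugs. It also assumes NULL columns are passed as omitted (undefined) fields.

```typescript
// Subset of a tasks row needed for graph construction (hypothetical shape).
interface TaskRow {
  id: string;
  slug: string;
  name: string;
  status: string;
  scope: string | null;
  risk: string | null;
  impact: string | null;
  level: string | null;
  priority: string | null;
  tags: string[];
}

interface DependencyRow {
  dependsOnTaskId: string;
  dependentTaskId: string;
}

// Local stand-ins for the library's TaskInput and DependencyEdge.
interface TaskInputLike {
  id: string;
  name: string;
  status: string;
  scope?: string;
  risk?: string;
  impact?: string;
  level?: string;
  priority?: string;
  tags: string[];
}

interface DependencyEdgeLike {
  from: string; // prerequisite
  to: string;   // dependent
  qualityRetention?: number;
}

// Converts DB rows for one project into frontmatter-shaped records.
function toGraphRecords(
  tasks: TaskRow[],
  deps: DependencyRow[],
): { taskInputs: TaskInputLike[]; edges: DependencyEdgeLike[] } {
  const slugById = new Map<string, string>();
  for (const t of tasks) slugById.set(t.id, t.slug);

  const taskInputs: TaskInputLike[] = tasks.map((t) => ({
    id: t.slug,
    name: t.name,
    status: t.status,
    scope: t.scope ?? undefined,
    risk: t.risk ?? undefined,
    impact: t.impact ?? undefined,
    level: t.level ?? undefined,
    priority: t.priority ?? undefined,
    tags: t.tags,
  }));

  const edges: DependencyEdgeLike[] = deps.map((d) => {
    const from = slugById.get(d.dependsOnTaskId);
    const to = slugById.get(d.dependentTaskId);
    if (from === undefined || to === undefined) {
      throw new Error("Dependency references a task outside the loaded project");
    }
    return { from, to }; // qualityRetention omitted: not stored in the DB
  });

  return { taskInputs, edges };
}
```

The result would then be handed to TaskGraph.fromRecords(taskInputs, edges) as described in the construction steps.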

Graph Node vs DB Column Distinction

TaskGraphNodeAttributes (what the graph stores per node) is a subset of TaskInput. The graph intentionally drops fields that aren't relevant to graph algorithms:

| In TaskInput | In TaskGraphNodeAttributes | Reason |
|---|---|---|
| id | id | Node key |
| name | name | Display |
| status | status | State tracking |
| scope | scope | Analysis |
| risk | risk | Analysis |
| impact | impact | Analysis |
| level | level | Analysis |
| priority | priority | Analysis |
| tags | (not included) | Not used by graph algorithms; available in DB |
| assignee | (not included) | Not used by graph algorithms; available in DB |
| due | (not included) | Not used by graph algorithms; available in DB |
| created | (not included) | Not used by graph algorithms; available in DB |
| modified | (not included) | Not used by graph algorithms; available in DB |

Fields like tags, assignee, and due are fully queryable in the DB and don't need to be in the graph for analysis. If the coordinator needs to filter a graph by assignee, it should query the DB first and then construct a filtered subgraph using taskGraph.subgraph(filter).

@alkdev/taskgraph Exports

Construction (TaskGraph class):

  • TaskGraph.fromTasks(tasks: TaskInput[]) — builds graph from tasks, inferring edges from dependsOn arrays
  • TaskGraph.fromRecords(tasks: TaskInput[], edges: DependencyEdge[]) — builds from tasks + explicit edge list
  • TaskGraph.fromJSON(data: TaskGraphSerialized) — deserializes from graphology JSON
  • Mutation: addTask(task), removeTask(taskId), addDependency(prerequisite, dependent, qualityRetention?), updateTask(taskId, attrs)
  • Queries: hasCycles, findCycles, topologicalOrder, dependencies(taskId), dependents(taskId), getTask(taskId), subgraph(filter)
  • Validation: validateSchema(), validateGraph()
  • Export: export() → TaskGraphSerialized, toJSON() (alias)
  • Escape hatch: get graph → raw graphology DirectedGraph

Analysis functions:

  • criticalPath(graph) — longest path by edge count
  • weightedCriticalPath(graph, weightFn) — longest path with custom weight function
  • parallelGroups(graph) — groups of tasks that can run concurrently
  • bottlenecks(graph) — high-betweenness tasks. Returns BottleneckResult[]
  • riskPath(graph) — highest cumulative risk path. Returns RiskPathResult { path, totalRisk }
  • riskDistribution(graph) — risk distribution across graph. Returns RiskDistributionResult
  • shouldDecomposeTask(task) — decomposition recommendation. Returns DecomposeResult { shouldDecompose, reasons }
  • calculateTaskEv(p, scopeCost, impactWeight, config?) — expected value math. Returns EvResult
  • workflowCost(graph, options?) — total workflow cost with failure propagation. Returns WorkflowCostResult. Options: WorkflowCostOptions

Categorical numeric methods (map enum values → numbers for analysis):

  • scopeCostEstimate(scope) — numeric scope cost
  • scopeTokenEstimate(scope) — token-based scope estimate
  • riskSuccessProbability(risk) — probability of success (0–1)
  • riskWeight(risk) — weight for risk calculations
  • impactWeight(impact) — weight for impact calculations
  • resolveDefaults(attrs) — fills default values for unassessed fields and computes derived numeric values. Returns ResolvedTaskAttributes

Schema types — TypeBox schemas with Enum suffix:

  • TaskStatusEnum, TaskScopeEnum, TaskRiskEnum, TaskImpactEnum, TaskLevelEnum, TaskPriorityEnum
  • TypeScript types: TaskStatus, TaskScope, TaskRisk, TaskImpact, TaskLevel, TaskPriority

Frontmatter:

  • parseFrontmatter(content) — parses YAML + markdown, normalizes depends_ondependsOn
  • serializeFrontmatter(data) — serializes to YAML + markdown, outputs dependsOn
  • splitFrontmatter(content) — lower-level helper that splits YAML delimited by --- from the markdown body without validating
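
For readers unfamiliar with the pattern, the two behaviors described above (splitting on --- and normalizing the snake_case key) might look roughly like this. Both functions are hypothetical illustrations written for this document, not the library's code, and the key normalization operates on an object that a YAML parser has already produced.

```typescript
// Split "---"-delimited front matter from the markdown body without parsing YAML.
// Returns null when the content does not start with a front matter block.
function splitFrontMatterSketch(content: string): { yaml: string; body: string } | null {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(content);
  if (!match) return null;
  return { yaml: match[1] ?? "", body: match[2] ?? "" };
}

// Rename a snake_case depends_on key to camelCase dependsOn on a parsed object.
// An existing dependsOn key wins if both are present.
function normalizeDependsOn(data: Record<string, unknown>): Record<string, unknown> {
  const { depends_on, ...rest } = data;
  if (depends_on !== undefined && rest["dependsOn"] === undefined) {
    return { ...rest, dependsOn: depends_on };
  }
  return rest;
}

const sampleFile = "---\nid: auth\ndepends_on: [schema]\n---\n# Add auth\n";
console.log(splitFrontMatterSketch(sampleFile));
console.log(normalizeDependsOn({ id: "auth", depends_on: ["schema"] }));
```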

Error classes:

  • TaskgraphError (base), CircularDependencyError, TaskNotFoundError, DuplicateNodeError, DuplicateEdgeError, ValidationError, GraphValidationError

Why not taskgraph NAPI for v1: The Rust CLI (taskgraph) is for offline authoring and analysis. The TypeScript package (@alkdev/taskgraph) handles all runtime graph operations. Graphology is a transitive dependency through @alkdev/taskgraph and handles < 200 nodes trivially. NAPI is unnecessary at realistic scales.

Sync Flow

┌──────────────┐       ┌───────────────┐       ┌──────────────────┐
│ Decomposer   │       │ taskgraph CLI │       │ Hub DB            │
│ creates .md  │──────►│ validates     │──────►│ tasks table       │
│ files        │       │ analyzes      │       │ task_dependencies  │
└──────────────┘       └───────────────┘       └──────────────────┘
                                                       ▲
                                                       │
                                              ┌────────┴─────────┐
                                              │ Hub operations     │
                                              │ hub.task.*         │
                                              │ (status, notes)    │
                                              └────────────────────┘

Sync: Files → DB

The sync operation runs as a single database transaction:

  1. Begin transaction
  2. Scan tasks/ directory for markdown files
  3. Parse frontmatter (YAML) + body (markdown) from each file. @alkdev/taskgraph provides parseFrontmatter() and serializeFrontmatter() for YAML+markdown parsing. parseTaskFile() and parseTaskDirectory() are Node.js only (they use node:fs/promises); for Deno, use parseFrontmatter() with Deno file I/O.
  4. Upsert into tasks table (matches by (projectId, slug))
  5. For each task, DELETE FROM task_dependencies WHERE dependentTaskId = ? then INSERT the current edges — dependency edges are fully replaced, not merged, because the files own the dependency declarations
  6. Commit transaction

If any step fails, the entire sync rolls back — no partial updates.
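
The all-or-nothing behavior and the "replace, never merge" rule for edges can be sketched with an in-memory stand-in for the two tables. Everything here (ParsedTask, SketchDb, syncIntoDb) is hypothetical and only models the semantics; a real implementation would run the equivalent statements inside a database transaction.

```typescript
// Hypothetical parsed-file shape; the real column set is defined by the tasks schema.
interface ParsedTask {
  slug: string;
  name: string;
  dependsOn: string[];
}

// In-memory stand-in for tasks and task_dependencies within one project.
interface SketchDb {
  tasks: Map<string, { name: string }>;
  edges: Map<string, string[]>; // dependent slug -> prerequisite slugs
}

// Apply the sync to a copy and return it only on success. A throw leaves the
// original untouched, which mimics a transaction rollback.
function syncIntoDb(db: SketchDb, parsed: ParsedTask[]): SketchDb {
  const next: SketchDb = { tasks: new Map(db.tasks), edges: new Map(db.edges) };
  const seen = new Set<string>();
  for (const task of parsed) {
    if (seen.has(task.slug)) throw new Error(`duplicate slug: ${task.slug}`);
    seen.add(task.slug);
    next.tasks.set(task.slug, { name: task.name }); // upsert, matched by slug
    next.edges.set(task.slug, [...task.dependsOn]); // full replacement of edges
  }
  return next;
}

const startDb: SketchDb = {
  tasks: new Map([["auth", { name: "Add auth" }]]),
  edges: new Map([["auth", ["old-prereq"]]]),
};
const syncedDb = syncIntoDb(startDb, [{ slug: "auth", name: "Add auth", dependsOn: ["schema"] }]);
console.log(syncedDb.edges.get("auth")); // only "schema" remains; "old-prereq" is gone
```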

Concurrency: Only one sync should run at a time. The Decomposer triggers sync after creating or updating task files. No mechanism for coordinating concurrent syncs is needed for v1.

Deleted files: When a task file is removed from tasks/, the sync operation deletes the corresponding DB row. Git history preserves the full file-level history — the DB doesn't need to duplicate it with soft deletes. FK cascade handles cleanup (task_dependencies rows, mappings.taskId SET NULL).
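
Detecting removed files amounts to a set difference between the slugs in the DB and the slugs found on disk. A minimal sketch, with slugsToDelete as a hypothetical helper name:

```typescript
// Slugs present in the DB but no longer backed by a task file.
function slugsToDelete(dbSlugs: Iterable<string>, fileSlugs: Iterable<string>): string[] {
  const onDisk = new Set(Array.from(fileSlugs));
  return Array.from(dbSlugs).filter((slug) => !onDisk.has(slug));
}

console.log(slugsToDelete(["auth", "schema", "legacy"], ["auth", "schema"])); // only "legacy"
```

The resulting slugs would be deleted inside the same transaction as the upserts, so the FK actions described above apply atomically with the rest of the sync.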

DB → Files (Export)

When graph analysis is needed, export DB rows back to markdown files:

  1. Query tasks + task_dependencies for a project
  2. For each task, generate markdown with YAML frontmatter + body
  3. Write to tasks/ directory structure (using path to determine subdirectory)
  4. Run taskgraph validate, taskgraph risk-path, etc.

This is a manual step — "I want to run analysis now" — not an automatic sync.
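
Step 2 of the export could be sketched as below. The ExportRow shape and renderTaskFile are hypothetical and cover only the fields named in this document (id, name, dependsOn, body); in practice serializeFrontmatter(data) from @alkdev/taskgraph would do this job. The sketch leans on the fact that JSON double-quoted strings are also valid YAML scalars.

```typescript
// Hypothetical row shape joined from tasks + task_dependencies.
interface ExportRow {
  slug: string;
  name: string;
  dependsOn: string[];
  body: string;
}

// Render one task as markdown with YAML front matter.
// JSON.stringify produces double-quoted strings, which YAML accepts as-is.
function renderTaskFile(row: ExportRow): string {
  const lines = [
    "---",
    `id: ${JSON.stringify(row.slug)}`,
    `name: ${JSON.stringify(row.name)}`,
    `dependsOn: [${row.dependsOn.map((dep) => JSON.stringify(dep)).join(", ")}]`,
    "---",
    "",
    row.body,
  ];
  return lines.join("\n");
}

console.log(renderTaskFile({ slug: "auth", name: "Add auth", dependsOn: ["schema"], body: "Details." }));
```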

Sync Error Handling

| Error | Behavior |
|-------|----------|
| Invalid YAML frontmatter | Skip file, log warning with file path and parse error. Continue with remaining files. |
| Missing required id or name field | Skip file, log warning. Task cannot be synced without these fields. |
| dependsOn references non-existent slug within project | Insert the dependency edge anyway (dangling reference). The coordinator detects and warns about unresolvable dependencies. taskgraph validate should be run before sync to catch these. |
| Duplicate id (slug) in same project | Fail the sync with a clear error. Slug uniqueness is enforced by the DB constraint unq_tasks_project_slug. |
| File removed from filesystem | DELETE the DB row. FK cascade handles dependent rows. Git preserves history. |

Validation ordering: Run taskgraph validate before sync to catch structural errors (cycles, missing dependencies, duplicate IDs) at the CLI level. The DB sync then handles data-level integrity (unique constraints, FK checks).
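
The skip-versus-fail policy above can be sketched as a triage pass over parsed files. The Candidate and Triage types and the triageFiles function are hypothetical. The in-memory duplicate check stands in for the unq_tasks_project_slug constraint, which is what enforces uniqueness in the actual design.

```typescript
// data === null models a file whose YAML front matter failed to parse.
interface Candidate {
  path: string;
  data: Record<string, unknown> | null;
}

interface Triage {
  accepted: { path: string; id: string; name: string }[];
  warnings: string[];
}

// Skip unparseable or incomplete files with a warning; fail hard on duplicate ids.
function triageFiles(files: Candidate[]): Triage {
  const result: Triage = { accepted: [], warnings: [] };
  const firstSeenAt = new Map<string, string>();
  for (const file of files) {
    if (file.data === null) {
      result.warnings.push(`${file.path}: invalid YAML frontmatter, skipped`);
      continue;
    }
    const id = file.data["id"];
    const taskName = file.data["name"];
    if (typeof id !== "string" || typeof taskName !== "string") {
      result.warnings.push(`${file.path}: missing required id or name, skipped`);
      continue;
    }
    const earlier = firstSeenAt.get(id);
    if (earlier !== undefined) {
      throw new Error(`duplicate id "${id}" in ${earlier} and ${file.path}`);
    }
    firstSeenAt.set(id, file.path);
    result.accepted.push({ path: file.path, id, name: taskName });
  }
  return result;
}

const triaged = triageFiles([
  { path: "tasks/auth.md", data: { id: "auth", name: "Add auth" } },
  { path: "tasks/broken.md", data: null },
  { path: "tasks/unnamed.md", data: { id: "unnamed" } },
]);
console.log(triaged.accepted.length, triaged.warnings.length); // 1 accepted, 2 warnings
```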

Open Questions

  1. Embeddings: Task descriptions may benefit from vector embeddings for similarity search. Deferred — the metadata JSONB column can hold an embedding reference later, or a separate task_embeddings table can be added.

  2. Bulk status updates: When the coordinator completes a meta task (all sub-tasks done), should it automatically mark the meta task completed? Likely yes — this is an application-level operation, not a DB concern.

  3. Cross-project dependencies: Not supported. Tasks can only depend on other tasks within the same project. Application-layer validation rejects cross-project dependencies; a future DB-level trigger guard is deferred to Phase 2 (see task_dependencies cross-project guard).

  4. Task versioning: When a task's body is modified (e.g., notes appended), should we keep previous versions? For v1, no: the current body is sufficient. If an audit trail is needed, the updatedAt timestamp plus a revision count in metadata could suffice.

References

  • Cost-benefit framework: taskgraph framework docs — why categorical estimates are structurally required
  • Workflow guide: taskgraph workflow docs — practical usage patterns
  • Task file format: @alkdev/taskgraph README — field definitions
  • TaskInput type: @alkdev/taskgraph package source (TypeScript equivalent of the Rust TaskFrontmatter struct)
  • taskgraph architecture: taskgraph architecture docs
  • Storage pattern: README.md
  • Table reference (cross-cutting): table-reference.md
  • ADR-011: ../../decisions/ADR-011-dual-task-representation.md
  • @alkdev/taskgraph (runtime graph engine): @alkdev/taskgraph npm package