alkdev/hub

glm-5.1 2b63cda1c7 Setup repo: migrate architecture specs, code stubs, and tasks from alkhub_ts

Copy architecture docs, ADRs, storage domain specs, research, reviews,
and 56 storage architecture tasks from the alkhub_ts monorepo. Adapt for
standalone @alkdev/hub repo structure (src/ not packages/hub/).

Sanitize all sensitive information:
- Replace private IPs (10.0.0.1) with localhost defaults
- Remove internal server hostnames (dev1, ns528096)
- Replace /workspace/ private paths with npm package references
- Remove hardcoded credentials from examples
- Rewrite infrastructure.md without private network details

Add Deno project scaffolding: deno.json (pinned deps), .gitignore,
AGENTS.md, entry point. Migrate existing code stubs (crypto, config
types, logger) with updated import paths.

2026-05-25 10:56:32 +00:00

status	last_updated
draft	2026-05-18

Storage: Tasks & Task Dependencies

Tasks are the unit of work in the Spec-Driven Development (SDD) process. The database is the source of truth for task data at runtime. Markdown files serve as the authoring surface for the Decomposer role and the taskgraph CLI — they are ingested into the DB via a sync operation and can be exported back for offline analysis.

For the overall storage pattern, see README.md. For cross-cutting table reference (common columns, cascade behavior, index reference, status enums, relations), see table-reference.md. For design decisions, see ../../decisions/.

Overview

Why Database as Source of Truth

Taskgraph's file-based model works well for single-agent, single-worktree workflows. In the hub's multi-agent, multi-worktree environment, files create problems:

Parallel worktrees: Agent A marks a task in-progress in their worktree's file. Agent B can't see this — the file lives in A's working directory. The coordinator can't get a consistent view.
Reliable coordination: The coordinator needs to query "which tasks are pending?" and "what's blocking task X?" at runtime without scanning filesystems across worktrees.
Atomic status updates: An agent calling hub.task.updateStatus gets an immediate, transactional state change visible to all other agents and the coordinator.

The database is the authoritative, queryable, concurrent-safe representation. Files are the authoring format.

Relationship to taskgraph CLI

The taskgraph CLI operates on markdown files. Its value is in offline analysis — topo, cycles, parallel, critical, bottleneck, risk-path, decompose. These commands depend on categorical fields (scope, risk, impact, level) being assessed.

The workflow is:

Author — Decomposer creates/edits markdown files using taskgraph init and direct editing
Sync — Files are ingested into the DB (files → DB)
Execute — Coordinator and agents query and mutate the DB via hub operations
Analyze — When needed, export from DB to files, run taskgraph risk-path etc.

The taskgraph CLI is not required at runtime. The hub uses @alkdev/taskgraph for runtime graph operations (topological sort, cycle detection, parallel groups, critical path, risk analysis) — see Graphology Integration.

Task Authority Model

Aspect	Authority	Why
Task structure (all fields)	DB	Queryable, concurrent-safe, consistent
Task specification (body)	DB (`body` column)	Stored as markdown text; agents append notes during execution
Task authoring/creation	Files → sync → DB	Decomposer edits files; sync ingests them
Runtime status mutations	DB (hub operations)	`hub.task.*` operations — coordinator and agents call these
Offline graph analysis	Files (taskgraph CLI)	Export from DB when needed for `taskgraph risk-path` etc.

See Field Authority Split for the explicit list of authored vs runtime-managed fields.

Field Authority Split

Fields are split into two categories based on who writes them:

Authored Fields (upserted by file sync)

These fields are written by the Decomposer/file sync. The ON CONFLICT DO UPDATE SET clause in the sync upsert includes only these columns:

Field	DB Column
id	`slug`
name	`name`
(project)	`projectId`
(directory path)	`path`
scope	`scope`
risk	`risk`
impact	`impact`
level	`level`
priority	`priority`
tags	`tags`
assignee	`assignee`
due	`dueAt`
(body)	`body`
created	`fileCreatedAt`
modified	`fileModifiedAt`
depends_on	`task_dependencies` table

Note: projectId is set from the project context during sync (the task file's location within a project's tasks/ directory determines the project), not from taskgraph frontmatter. commonCols fields (id, metadata, createdAt, updatedAt) are DB-generated and not part of the sync conflict domain.

Runtime-Managed Fields (mutated via `hub.task.*` operations only)

These fields are never overwritten by sync. They are only mutated by hub operations (hub.task.updateStatus, hub.task.addNote, etc.):

Field	DB Column	Set By
status	`status`	`hub.task.updateStatus`
(started timestamp)	`startedAt`	`hub.task.updateStatus` (on `in-progress`)
(completed timestamp)	`completedAt`	`hub.task.updateStatus` (on `completed`)

Warning: Sync must never write status, startedAt, or completedAt. These are owned by hub operations. The sync upsert uses ON CONFLICT DO UPDATE SET only for authored fields; runtime fields are excluded from the SET clause.
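
The split above can be made concrete with a small sketch of how the upsert statement might be assembled. This is an illustration under stated assumptions, not the hub's actual sync code: the helper name `buildTaskUpsertSql`, the hand-built SQL string, and the quoted camelCase column names are all hypothetical.

```typescript
// Authored columns from the table above; sync is allowed to overwrite these.
const AUTHORED_COLUMNS = [
  "name", "path", "scope", "risk", "impact", "level", "priority",
  "tags", "assignee", "dueAt", "body", "fileCreatedAt", "fileModifiedAt",
] as const;

// Hypothetical helper: builds a parameterized upsert keyed on (projectId, slug).
function buildTaskUpsertSql(): string {
  const insertCols = ["projectId", "slug", ...AUTHORED_COLUMNS];
  const colList = insertCols.map((c) => `"${c}"`).join(", ");
  const params = insertCols.map((_, i) => `$${i + 1}`).join(", ");
  // Only authored columns appear in SET; status, startedAt and completedAt never do.
  const setList = AUTHORED_COLUMNS
    .map((c) => `"${c}" = EXCLUDED."${c}"`)
    .join(", ");
  return (
    `INSERT INTO tasks (${colList}) VALUES (${params}) ` +
    `ON CONFLICT ("projectId", "slug") DO UPDATE SET ${setList}`
  );
}
```

Because a newly inserted row never supplies `status`, it would pick up the column default (`pending`), while an existing row keeps whatever status the hub operations last wrote.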

Field Mapping: taskgraph Frontmatter → DB Columns

Every field in taskgraph's TaskFrontmatter struct maps to a dedicated DB column. No frontmatter fields are relegated to JSONB metadata.

taskgraph Field	DB Column	Type	Notes
`id`	`slug`	text NOT NULL	Direct mapping. No transformation. `slug` is taskgraph-compatible, used in `depends_on` references.
`name`	`name`	text NOT NULL	Direct mapping
`status`	`status`	text NOT NULL, enum	Direct mapping: `pending`, `in-progress`, `completed`, `failed`, `blocked`. Default: `pending`.
`depends_on`	`task_dependencies` table	—	Each element creates a row: `depends_on[i]` → `dependsOnTaskId`, task → `dependentTaskId`
`scope`	`scope`	text, enum	`single`, `narrow`, `moderate`, `broad`, `system`. Nullable — NULL = not yet assessed.
`risk`	`risk`	text, enum	`trivial`, `low`, `medium`, `high`, `critical`. Nullable — NULL = not yet assessed.
`impact`	`impact`	text, enum	`isolated`, `component`, `phase`, `project`. Nullable — NULL = not yet assessed.
`level`	`level`	text, enum	`planning`, `decomposition`, `implementation`, `review`, `research`. Nullable — NULL = not yet assessed.
`priority`	`priority`	text, enum	`low`, `medium`, `high`, `critical`. Nullable.
`tags`	`tags`	text[]	String array. Default `{}`.
`assignee`	`assignee`	text	Assigned agent or person. Nullable.
`due`	`dueAt`	timestamp with tz	Renamed from `due` for DB convention. Nullable.
`created`	`fileCreatedAt`	timestamp with tz	Frontmatter `created` field. Separate from DB `createdAt` (row creation time). Nullable — frontmatter may not include it.
`modified`	`fileModifiedAt`	timestamp with tz	Frontmatter `modified` field. Separate from DB `updatedAt` (row update time). Nullable.
(body)	`body`	text	Markdown content after frontmatter. Nullable — empty body is valid.
(directory path)	`path`	text	Logical grouping prefix: `architecture`, `implementation/storage`. Nullable — tasks created via API with no file origin have no path. See Path Semantics.
(project)	`projectId`	text NOT NULL	FK → projects.id
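
As a rough illustration of the mapping table, the sketch below converts parsed frontmatter into the authored part of a `tasks` row. The interface names, the `toAuthoredRow` helper, and the choice to store an empty body as NULL are assumptions made for this example; the real types live in @alkdev/taskgraph and the hub schema.

```typescript
// Parsed frontmatter; field names follow the taskgraph column of the table above.
interface TaskFrontmatter {
  id: string;
  name: string;
  status?: string;
  depends_on?: string[];
  scope?: string;
  risk?: string;
  impact?: string;
  level?: string;
  priority?: string;
  tags?: string[];
  assignee?: string;
  due?: string;
  created?: string;
  modified?: string;
}

// Authored columns only; status and depends_on are handled elsewhere.
interface AuthoredTaskRow {
  projectId: string;
  slug: string;
  name: string;
  path: string | null;
  scope: string | null;
  risk: string | null;
  impact: string | null;
  level: string | null;
  priority: string | null;
  tags: string[];
  assignee: string | null;
  dueAt: Date | null;
  body: string | null;
  fileCreatedAt: Date | null;
  fileModifiedAt: Date | null;
}

const toDate = (value?: string): Date | null => (value ? new Date(value) : null);

// Hypothetical helper: projectId and path come from the sync context, not frontmatter.
function toAuthoredRow(
  fm: TaskFrontmatter,
  body: string,
  projectId: string,
  path: string | null,
): AuthoredTaskRow {
  return {
    projectId,
    slug: fm.id,
    name: fm.name,
    path,
    scope: fm.scope ?? null,
    risk: fm.risk ?? null,
    impact: fm.impact ?? null,
    level: fm.level ?? null,
    priority: fm.priority ?? null,
    tags: fm.tags ?? [],
    assignee: fm.assignee ?? null,
    dueAt: toDate(fm.due),
    body: body.length > 0 ? body : null, // assumption: empty body is stored as NULL
    fileCreatedAt: toDate(fm.created),
    fileModifiedAt: toDate(fm.modified),
  };
}
```

The frontmatter `status` is deliberately not copied, since that column is runtime-managed, and `depends_on` is left for the `task_dependencies` sync step.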

Table Schemas

`tasks`

SDD task definitions. The database is the source of truth for task data at runtime. Markdown files serve as the authoring surface for the Decomposer and taskgraph CLI — they are ingested into the DB via a sync operation. Every field in taskgraph's TaskFrontmatter struct maps to a dedicated DB column (no frontmatter fields in metadata JSONB).

Column	Type	Notes
commonCols	—	id, metadata, createdAt, updatedAt
projectId	text NOT NULL	FK → projects.id (cascade) — tasks belong to a project
slug	text NOT NULL	taskgraph `id` — kebab-case identifier used in `depends_on` references. Unique within a project.
name	text NOT NULL	Human-readable task name (from frontmatter `name`)
path	text	Logical grouping prefix derived from filesystem location (e.g., `architecture`, `implementation/storage`). Nullable — tasks created via API with no file origin have no path. Enables `WHERE path LIKE 'implementation/%'` for scoped queries.
status	text NOT NULL	Enum: `pending`, `in-progress`, `completed`, `failed`, `blocked`. Default: `pending`. Status transitions go through hub operations, not file edits.
scope	text	Categorical scope: `single`, `narrow`, `moderate`, `broad`, `system`. Nullable — NULL = not yet assessed. See Why Categorical Fields Are Nullable.
risk	text	Categorical risk: `trivial`, `low`, `medium`, `high`, `critical`. Nullable — NULL = not yet assessed.
impact	text	Categorical impact: `isolated`, `component`, `phase`, `project`. Nullable — NULL = not yet assessed.
level	text	Task level: `planning`, `decomposition`, `implementation`, `review`, `research`. Nullable — NULL = not yet assessed.
priority	text	Priority: `low`, `medium`, `high`, `critical`. Nullable.
assignee	text	Assigned agent or person. Nullable.
dueAt	timestamp with tz	Due date (from frontmatter `due`). Nullable.
tags	text[]	Filtering tags. Default `{}`. GIN index for array-contains queries.
body	text	Markdown task specification (from file body after frontmatter). Nullable — empty body is valid. Agents may append notes during execution.
fileCreatedAt	timestamp with tz	Frontmatter `created` field — file creation time from the markdown. Separate from DB `createdAt` (row creation time). Nullable.
fileModifiedAt	timestamp with tz	Frontmatter `modified` field — file modification time from the markdown. Separate from DB `updatedAt` (row update time). Nullable.
startedAt	timestamp with tz	When status became `in-progress`. Set by hub operation, not by agent.
completedAt	timestamp with tz	When status became `completed`. Set by hub operation.

Unique constraint: unq_tasks_project_slug UNIQUE on (projectId, slug) — task slugs are unique within a project.

pgEnum Definitions: The following enum columns use PostgreSQL pgEnum for type safety. Drizzle's pgEnum generates named PostgreSQL enums and provides TypeScript type inference. The enum values are aligned with taskgraph's categorical fields.

import { pgEnum } from "drizzle-orm/pg-core";

export const taskStatus = pgEnum("task_status", ["pending", "in-progress", "completed", "failed", "blocked"]);
export const taskScope = pgEnum("task_scope", ["single", "narrow", "moderate", "broad", "system"]);
export const taskRisk = pgEnum("task_risk", ["trivial", "low", "medium", "high", "critical"]);
export const taskImpact = pgEnum("task_impact", ["isolated", "component", "phase", "project"]);
export const taskLevel = pgEnum("task_level", ["planning", "decomposition", "implementation", "review", "research"]);
export const taskPriority = pgEnum("task_priority", ["low", "medium", "high", "critical"]);

The decomposer template should consume these same enum definitions to ensure DB-level constraints match the application-level typing.
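
One way to keep application-level typing aligned with the DB enums is to define each value list once and derive both the type and a runtime check from it. The sketch below shows this for risk only and does not depend on Drizzle; the names `TASK_RISK_VALUES`, `isTaskRisk`, and `parseRisk` are hypothetical.

```typescript
// Single source of truth for the allowed values (same list as task_risk above).
const TASK_RISK_VALUES = ["trivial", "low", "medium", "high", "critical"] as const;

// Union type derived from the value list.
type TaskRisk = (typeof TASK_RISK_VALUES)[number];

// Runtime guard for values arriving from frontmatter or API input.
function isTaskRisk(value: unknown): value is TaskRisk {
  return (
    typeof value === "string" &&
    (TASK_RISK_VALUES as readonly string[]).includes(value)
  );
}

// Missing values map to null ("not yet assessed"); unknown values are rejected.
function parseRisk(value: unknown): TaskRisk | null {
  if (value === undefined || value === null) return null;
  if (!isTaskRisk(value)) {
    throw new Error(`invalid risk value: ${String(value)}`);
  }
  return value;
}
```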

Indexes: idx_tasks_project_id on (projectId), idx_tasks_project_status on (projectId, status) — composite for "find all pending tasks in project X", idx_tasks_status on (status), idx_tasks_active partial on (projectId) WHERE status IN ('pending', 'in-progress', 'blocked') — efficiently find active tasks, idx_tasks_path on (path) with text_pattern_ops — locale-independent LIKE pattern matching for path prefix queries (e.g., WHERE path LIKE 'implementation/%'), idx_tasks_priority on (priority), idx_tasks_assignee on (assignee), idx_tasks_due_at on (dueAt), idx_tasks_tags GIN on (tags) — for array-contains queries (tags @> '{security}').

slug semantics: From taskgraph frontmatter id field. Kebab-case identifiers like auth-setup, storage-tasks-table. Appears in depends_on arrays.

path semantics: Nullable — tasks created via API with no filesystem origin have no path. When set, captures the logical grouping derived from the tasks/ directory structure. E.g., a file at tasks/implementation/storage/tasks-table.md gets path: "implementation/storage". Enables WHERE path LIKE 'implementation/%' (scoped queries) without requiring a parentId FK. This replaces the previous parentId column — grouping is a path concern, not a tree relationship.

No parentId column: Grouping is handled by path, dependencies by task_dependencies. A "meta task" is just a regular task that depends on its sub-tasks — no special entity type needed.

No removedAt column: When a task file is removed, the sync operation DELETEs the DB row. Git history preserves the file-level history; the DB doesn't need to duplicate it with soft deletes. FK cascade handles cleanup.

metadata JSONB: Reserved for truly ad-hoc data not in the taskgraph schema. No taskgraph frontmatter fields are stored here — all have proper columns.

`task_dependencies`

Dependency edges between tasks. Directed: a row means the dependent task depends on the prerequisite task (prerequisite must complete before dependent can start). Mirrors the taskgraph depends_on relationship.

Column	Type	Notes
commonCols	—	id, metadata, createdAt, updatedAt
dependsOnTaskId	text NOT NULL	FK → tasks.id (cascade) — The prerequisite task (must complete first)
dependentTaskId	text NOT NULL	FK → tasks.id (cascade) — The dependent task (waits for prerequisite)

Unique constraint: unq_task_dependencies_depends_on_task UNIQUE on (dependsOnTaskId, dependentTaskId) — no duplicate dependency edges.

Indexes: idx_task_dependencies_depends_on_task_id on (dependsOnTaskId) — "what depends on this task?", idx_task_dependencies_dependent_task_id on (dependentTaskId) — "what does this task depend on?".

Direction: dependentTaskId is the task that has the dependency; dependsOnTaskId is its prerequisite. Each row reads "task dependentTaskId depends on task dependsOnTaskId". When the rows are loaded into a graph, the edge is drawn the other way, from dependsOnTaskId to dependentTaskId (prerequisite → dependent), so that a topological sort yields prerequisites before dependents.
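
To illustrate the edge direction, here is a dependency-free sketch that orders tasks from `task_dependencies` rows using Kahn's algorithm. It is not the @alkdev/taskgraph implementation; the `DependencyRow` shape and the decision to ignore rows whose endpoints are unknown are assumptions for this example.

```typescript
interface DependencyRow {
  dependsOnTaskId: string; // prerequisite
  dependentTaskId: string; // task that waits for the prerequisite
}

// Kahn's algorithm over prerequisite → dependent edges.
function topologicalOrder(taskIds: string[], rows: DependencyRow[]): string[] {
  const indegree = new Map<string, number>();
  const outgoing = new Map<string, string[]>();
  for (const id of taskIds) {
    indegree.set(id, 0);
    outgoing.set(id, []);
  }
  for (const row of rows) {
    // Assumption: rows pointing at unknown tasks are skipped in this sketch.
    if (!indegree.has(row.dependsOnTaskId) || !indegree.has(row.dependentTaskId)) continue;
    (outgoing.get(row.dependsOnTaskId) ?? []).push(row.dependentTaskId);
    indegree.set(row.dependentTaskId, (indegree.get(row.dependentTaskId) ?? 0) + 1);
  }
  // Tasks with no prerequisites are ready immediately.
  const ready = taskIds.filter((id) => indegree.get(id) === 0);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    order.push(id);
    for (const next of outgoing.get(id) ?? []) {
      const remaining = (indegree.get(next) ?? 0) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }
  if (order.length !== taskIds.length) throw new Error("cycle detected");
  return order;
}
```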

Cross-project dependency guard: dependentTaskId and dependsOnTaskId MUST reference tasks within the same project. The application layer enforces this constraint: creating a dependency between tasks in different projects is rejected with a validation error. This is not enforced at the DB level (FK constraints allow cross-project references), so the application must check project consistency before insert.

A future DB-level guard could use a trigger: BEFORE INSERT ON task_dependencies that checks NEW.dependentTaskId and NEW.dependsOnTaskId reference tasks in the same project. This is deferred to Phase 2; the application-layer check is sufficient for now.
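
The application-layer check could look roughly like the sketch below, run before inserting a dependency row. The `TaskRef` shape, the error class, and the function name are hypothetical; the hub's actual validation error type is not specified here.

```typescript
interface TaskRef {
  id: string;
  projectId: string;
}

// Hypothetical validation error for this sketch.
class CrossProjectDependencyError extends Error {
  constructor(dependent: TaskRef, prerequisite: TaskRef) {
    super(
      `task ${dependent.id} (project ${dependent.projectId}) cannot depend on ` +
        `task ${prerequisite.id} (project ${prerequisite.projectId})`,
    );
    this.name = "CrossProjectDependencyError";
  }
}

// Call with both task rows loaded, before INSERT INTO task_dependencies.
function assertSameProject(dependent: TaskRef, prerequisite: TaskRef): void {
  if (dependent.projectId !== prerequisite.projectId) {
    throw new CrossProjectDependencyError(dependent, prerequisite);
  }
}
```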

Sync source: Dependency edges are authored in task file frontmatter (depends_on: [other-task]) and synced to this table during the file → DB sync operation. The sync clears and re-inserts all edges for a task on each run — dependencies are fully replaced by the sync, not merged or modified at runtime.

Why ALL Frontmatter Fields Get Proper Columns

ADR-001 establishes the pattern: "separate structured columns for high-query, high-filter fields." For tasks, every taskgraph frontmatter field is queryable and filterable in the coordinator's workflow:

priority — "show me high-priority pending tasks" (coordinator prioritization)
assignee — "which tasks are assigned to agent X?" (work assignment)
dueAt — "which tasks are due this week?" (deadline tracking)
tags — "filter by tag" (cross-cutting concerns)

Shoving these into metadata JSONB loses type safety, indexability, and SQL queryability — exactly the problems the database is meant to solve. The metadata JSONB column (from commonCols) is reserved for truly ad-hoc data that isn't in the taskgraph schema.

Why Categorical Fields Are Nullable (Not NOT NULL with Defaults)

The previous design made scope, risk, impact, and level NOT NULL with defaults (narrow, low, isolated, implementation). This conflated two states:

Assessed as low — the Decomposer evaluated this and determined the risk is low
Not assessed — nobody filled this in

Hiding the distinction with defaults means the coordinator can't distinguish a deliberate assessment from a gap. NULL is the correct signal for "not yet assessed."

Taskgraph itself makes these fields Option<TaskScope>, Option<TaskRisk>, etc. — nullable. The DB should match the source model.

Application-layer handling: When scope, risk, impact, or level is NULL, the coordinator should:

Warn that the task hasn't been assessed
Exclude it from cost-benefit analysis (you can't compute risk-path without risk values)
Suggest the Decomposer assess it

For @alkdev/taskgraph operations that need numeric weights, provide fallbacks at the application layer (e.g., treat NULL risk as low for topo sort, but warn).
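
A minimal sketch of this handling, for the risk field only. The numeric weights are invented for illustration (the source does not define them), and `riskWeight` and `assessedOnly` are hypothetical helper names.

```typescript
type TaskRisk = "trivial" | "low" | "medium" | "high" | "critical";

// Assumed weights for illustration only; real values would come from the analysis layer.
const RISK_WEIGHT: Record<TaskRisk, number> = {
  trivial: 1,
  low: 2,
  medium: 3,
  high: 4,
  critical: 5,
};

interface AssessableTask {
  slug: string;
  risk: TaskRisk | null; // null = not yet assessed
}

// Returns a usable weight, recording a warning when it had to fall back.
function riskWeight(task: AssessableTask, warnings: string[]): number {
  if (task.risk === null) {
    warnings.push(`task "${task.slug}" has no risk assessment; falling back to "low"`);
    return RISK_WEIGHT.low;
  }
  return RISK_WEIGHT[task.risk];
}

// Cost-benefit style analyses should see only assessed tasks.
function assessedOnly(tasks: AssessableTask[]): AssessableTask[] {
  return tasks.filter((task) => task.risk !== null);
}
```

Collecting warnings in a list, rather than logging inside the helper, lets the coordinator decide how to surface unassessed tasks to the Decomposer.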

Path Semantics

The path column captures the logical grouping of tasks, derived from their location in the tasks/ directory hierarchy:

tasks/
├── architecture/
│   ├── auth-design.md          → path: "architecture"
│   └── storage-overview.md     → path: "architecture"
├── research/
│   └── embedding-approach.md   → path: "research"
└── implementation/
    ├── storage/
    │   ├── tasks-table.md      → path: "implementation/storage"
    │   └── relations.md        → path: "implementation/storage"
    └── auth/
        └── oauth-flow.md       → path: "implementation/auth"

path is nullable because tasks created at runtime via hub operations (not synced from files) have no filesystem origin.

path enables scoped queries:

WHERE path = 'architecture' — all architecture tasks
WHERE path LIKE 'implementation/%' — all implementation tasks
WHERE path = 'implementation/storage' — storage implementation tasks

This is a prefix-based grouping mechanism. It replaces parentId (which was not in the taskgraph model and conflated organizational grouping with dependency ordering).
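
The derivation shown in the directory tree can be sketched as a small pure function. `derivePath` is a hypothetical name, and returning NULL for files placed directly under tasks/ is an assumption, since the source does not cover that case.

```typescript
// Derives the `path` column from a file path that contains a tasks/ directory.
function derivePath(filePath: string): string | null {
  const segments = filePath.split("/").filter((segment) => segment.length > 0);
  const tasksIndex = segments.indexOf("tasks");
  if (tasksIndex === -1) {
    throw new Error(`not under tasks/: ${filePath}`);
  }
  // Drop the tasks/ root and the file name; what remains is the grouping prefix.
  const dirs = segments.slice(tasksIndex + 1, -1);
  // Assumption: a file directly under tasks/ has no grouping, so path is null.
  return dirs.length > 0 ? dirs.join("/") : null;
}
```

For example, `derivePath("tasks/implementation/storage/tasks-table.md")` yields `"implementation/storage"`, matching the tree above.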

Locale sensitivity: The path column uses text type with the database's default collation. In PostgreSQL, a plain B-tree index can serve LIKE prefix patterns (WHERE path LIKE 'implementation/%') only when the collation is C/POSIX. Because the default collation may differ, either declare the column with COLLATE "C" or, as done here, create the index with the text_pattern_ops operator class: CREATE INDEX idx_tasks_path ON tasks (path text_pattern_ops). That index supports LIKE and anchored regular-expression (~) prefix matching regardless of collation. Matching is case-sensitive, which suits task paths because they are lowercase.

Grouping vs Dependencies

There is no parentId column. Task grouping and dependency ordering are separate concepts:

Grouping — path column. "This task belongs to the implementation/storage group." Enables scoped queries. Derived from filesystem layout during sync.
Dependencies — task_dependencies table. "This task cannot start until that task completes." Enables topological sort, cycle detection, critical path. Derived from depends_on frontmatter.

A "meta task" (e.g., "implement storage") is simply a task that depends_on all its sub-tasks. There is no special entity type: it is a regular task plus dependency edges. The coordinator picks up the meta task as an assignment, and the implementation specialist works through sub-tasks in dependency order.

Why not parentId: parentId was invented in a previous doc revision but has no basis in the taskgraph data model. It created confusion:

Redundant with task_dependencies (a meta task's dependencies ARE its sub-tasks)
Required a fragile "inference from directory structure" during sync
Violated the invariant that the DB schema mirrors the taskgraph frontmatter model

Relationship to Existing Tables

`mappings` Table

The mappings table links sessions to coordinators, spokes, and worktrees. A taskId column references the task a mapping is assigned to:

taskId: text REFERENCES tasks(id)   // FK to tasks
task: text                           // denormalized display name (e.g., task.slug or task.name)

This preserves the quick-reference pattern (coordinators can list mappings with task names without a JOIN) while maintaining referential integrity.

`projects` Table

Tasks belong to a project via tasks.projectId. A project's tasks live in the project's tasks/ directory. Cross-project task dependencies are not supported — tasks can only depend on other tasks within the same project. This is enforced at the application level (see task_dependencies cross-project guard).

`sessions` Table

Sessions are linked to tasks indirectly through mappings. When the coordinator spawns a session for a meta task:

The task row already exists in tasks (synced from file or created via API)
Creates a sessions row for the implementation specialist
Creates a mappings row with taskId pointing to the meta task

Task Status Lifecycle

pending → in-progress → completed
                      ↘ failed → in-progress (retry)
                      ↘ blocked → in-progress (unblocked)

Status	Meaning
`pending`	Task exists, not yet started
`in-progress`	A session is actively working on this task
`completed`	Task finished successfully
`failed`	Task failed, may retry (Safe Exit protocol)
`blocked`	Task is blocked by an unmet dependency or external issue

Status transitions go through hub operations (hub.task.updateStatus), not file edits. This ensures:

All agents see consistent state immediately
The coordinator can query "which tasks are pending?" reliably
No merge conflicts from parallel file edits

Timestamp columns startedAt and completedAt track when a task entered in-progress and completed states respectively. These are set by the hub operation, not by the agent.
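
The lifecycle can be expressed as a transition table that a status-update operation consults before writing. This is a sketch under assumptions: it reads the diagram as `failed` and `blocked` both branching from `in-progress`, it overwrites `startedAt` on every re-entry into `in-progress`, and the names `ALLOWED_TRANSITIONS` and `applyStatus` are hypothetical.

```typescript
type TaskStatus = "pending" | "in-progress" | "completed" | "failed" | "blocked";

// Assumed reading of the lifecycle diagram above.
const ALLOWED_TRANSITIONS: Record<TaskStatus, TaskStatus[]> = {
  pending: ["in-progress"],
  "in-progress": ["completed", "failed", "blocked"],
  completed: [],
  failed: ["in-progress"], // retry
  blocked: ["in-progress"], // unblocked
};

interface TaskState {
  status: TaskStatus;
  startedAt: Date | null;
  completedAt: Date | null;
}

// Validates the transition and stamps timestamps on behalf of the caller.
function applyStatus(task: TaskState, next: TaskStatus, now: Date): TaskState {
  if (!ALLOWED_TRANSITIONS[task.status].includes(next)) {
    throw new Error(`invalid transition: ${task.status} -> ${next}`);
  }
  return {
    status: next,
    // Assumption: a retry overwrites the earlier startedAt.
    startedAt: next === "in-progress" ? now : task.startedAt,
    completedAt: next === "completed" ? now : task.completedAt,
  };
}
```

Passing `now` in from the operation, rather than reading the clock inside the function, keeps the timestamps under the hub's control and makes the logic easy to test.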

Task Notes (Append-Only)

Agents may need to add notes to a task during execution (observations, partial progress, blockers encountered). For v1, this is handled by appending markdown to the body column:

## Task Description (original)

Implement the tasks table with Drizzle-TypeBox pattern...

## Implementation Notes

- 2026-04-19: Started with table definition, commonCols pattern works
- 2026-04-19: Hit issue with text[] type for tags — need to check Drizzle support

The hub.task.addNote operation appends a timestamped note section to the end of body. This is simple, preserves the full context in one place, and requires no additional tables.

Concurrency model for hub.task.addNote: Notes are appended to the task body field using DB-level concatenation: UPDATE tasks SET body = COALESCE(body, '') || $note WHERE id = $taskId. This avoids read-modify-write cycles entirely — the append is atomic at the SQL level, eliminating race conditions between concurrent agents.

As a fallback for scenarios where DB-level concatenation isn't feasible, optimistic locking via updatedAt can be used: read the current updatedAt, append the note, and UPDATE WHERE updatedAt = readValue. If the row was updated between read and write, the UPDATE affects 0 rows and the operation must be retried. This is sufficient for the expected low-contention scenario (one agent at a time writing notes to a task).
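
The optimistic-locking fallback can be sketched against an in-memory stand-in for the `tasks` table. Everything here is illustrative: `InMemoryTasks`, `formatNote`, and `addNoteWithRetry` are hypothetical names, and a logical counter stands in for the `updatedAt` timestamp.

```typescript
interface TaskRow {
  id: string;
  body: string | null;
  updatedAt: number; // logical clock standing in for the updatedAt timestamp
}

// In-memory stand-in for the tasks table; a real implementation would issue SQL.
class InMemoryTasks {
  private rows = new Map<string, TaskRow>();
  private clock = 0;

  insert(id: string, body: string | null): void {
    this.rows.set(id, { id, body, updatedAt: ++this.clock });
  }

  read(id: string): TaskRow {
    const row = this.rows.get(id);
    if (!row) throw new Error(`task not found: ${id}`);
    return { ...row };
  }

  // Mirrors UPDATE ... WHERE id = ? AND updatedAt = ?; returns the affected row count.
  updateIfUnchanged(id: string, body: string, expectedUpdatedAt: number): number {
    const row = this.rows.get(id);
    if (!row || row.updatedAt !== expectedUpdatedAt) return 0;
    this.rows.set(id, { id, body, updatedAt: ++this.clock });
    return 1;
  }
}

// Formats a note in the bullet style used in the example body above.
function formatNote(date: string, text: string): string {
  return `\n- ${date}: ${text}`;
}

// Read, append, conditional write; retry when another writer got in first.
function addNoteWithRetry(store: InMemoryTasks, id: string, note: string, maxAttempts = 3): void {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const row = store.read(id);
    const nextBody = (row.body ?? "") + note;
    if (store.updateIfUnchanged(id, nextBody, row.updatedAt) === 1) return;
  }
  throw new Error(`could not append note to ${id} after ${maxAttempts} attempts`);
}
```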

For high-contention scenarios (multiple agents writing simultaneously), consider a separate task_notes table with INSERT operations instead of UPDATE appends.

If structured, multi-agent notes become necessary later, a dedicated task_notes table can be added. The body append pattern doesn't preclude this — it's additive.

Why Categorical Estimates Matter

The scope, risk, impact, and level fields are not cosmetic metadata — they are what make taskgraph's analysis commands produce useful results. The cost-benefit framework (see taskgraph framework docs) demonstrates a structural property: upstream failures multiply downstream damage.

These fields power:

taskgraph decompose — flags tasks where risk > medium or scope > moderate
taskgraph risk-path — finds the highest cumulative risk path
taskgraph critical — finds completion blockers
taskgraph bottleneck — finds high-betweenness tasks

Without them, you just get topological sort — useful, but not structurally insightful. The DB columns for these fields are nullable (NULL = not assessed) rather than NOT NULL with defaults, because the distinction between "deliberately assessed as low" and "nobody filled this in" is itself valuable information for the coordinator.

Graphology Integration (Runtime Graph Ops)

For runtime graph operations, the hub uses @alkdev/taskgraph — a TypeScript package that wraps graphology and provides a high-level TaskGraph class plus analysis functions. The CLI (taskgraph) is for offline authoring and analysis; the TS package is for runtime use.

The approach:

Load all tasks + task_dependencies rows for a project from the DB
Build a TaskGraph via TaskGraph.fromRecords(tasks, edges)
Run analysis functions as needed: criticalPath(), parallelGroups(), bottlenecks(), riskPath(), shouldDecomposeTask(), workflowCost()

This works because realistic task graphs are small — typically 10–50 tasks, rarely exceeding 200 even on large projects. Building a graph from DB rows is instant at this scale (TaskGraph.fromRecords with 100 nodes reconstructs in <5ms).
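
To give a feel for the kind of computation run on the loaded rows, here is a dependency-free sketch that groups tasks into waves that could execute in parallel. It does not call or reproduce @alkdev/taskgraph; `levelGroups` and the `Edge` shape are hypothetical, and the library's `parallelGroups()` may behave differently.

```typescript
interface Edge {
  dependsOnTaskId: string; // prerequisite
  dependentTaskId: string; // waits for the prerequisite
}

// Each group contains tasks whose prerequisites all sit in earlier groups.
function levelGroups(taskIds: string[], edges: Edge[]): string[][] {
  const prereqs = new Map<string, string[]>();
  for (const id of taskIds) prereqs.set(id, []);
  for (const edge of edges) {
    const list = prereqs.get(edge.dependentTaskId);
    if (list) list.push(edge.dependsOnTaskId);
  }
  const done = new Set<string>();
  const groups: string[][] = [];
  let remaining = taskIds.slice();
  while (remaining.length > 0) {
    const group = remaining.filter((id) =>
      (prereqs.get(id) ?? []).every((p) => done.has(p)),
    );
    if (group.length === 0) {
      throw new Error("cycle or unresolved dependency detected");
    }
    for (const id of group) done.add(id);
    groups.push(group);
    remaining = remaining.filter((id) => !done.has(id));
  }
  return groups;
}
```

The repeated filtering is quadratic in the number of tasks, which is acceptable at the graph sizes described above.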

@alkdev/taskgraph exports:

TaskGraph — construction (fromTasks, fromRecords, fromJSON), mutation (addTask, removeTask, addDependency, updateTask), queries (hasCycles, findCycles, topologicalOrder, dependencies, dependents, getTask), validation (validateSchema, validateGraph), export
Analysis functions — criticalPath, weightedCriticalPath, parallelGroups, bottlenecks, riskPath, riskDistribution, calculateTaskEv, workflowCost, shouldDecomposeTask
Schema types — TaskScope, TaskRisk, TaskImpact, TaskLevel, TaskPriority, TaskStatus enums with TypeBox schemas
Frontmatter — parseFrontmatter, serializeFrontmatter (YAML + markdown)
Error classes — TaskgraphError, CircularDependencyError, TaskNotFoundError, etc.

Why not taskgraph NAPI for v1: The Rust CLI (taskgraph) is for offline authoring and analysis. The TypeScript package (@alkdev/taskgraph) handles all runtime graph operations. Graphology is a transitive dependency through @alkdev/taskgraph and handles < 200 nodes trivially. NAPI is unnecessary at realistic scales.

Sync Flow

┌──────────────┐       ┌───────────────┐       ┌──────────────────┐
│ Decomposer   │       │ taskgraph CLI │       │ Hub DB            │
│ creates .md  │──────►│ validates     │──────►│ tasks table       │
│ files        │       │ analyzes      │       │ task_dependencies  │
└──────────────┘       └───────────────┘       └──────────────────┘
                                                       ▲
                                                       │
                                              ┌────────┴─────────┐
                                              │ Hub operations     │
                                              │ hub.task.*         │
                                              │ (status, notes)    │
                                              └────────────────────┘

Sync: Files → DB

The sync operation runs as a single database transaction:

Begin transaction
Scan tasks/ directory for markdown files
Parse frontmatter (YAML) + body (markdown) from each file. @alkdev/taskgraph provides parseFrontmatter() and serializeFrontmatter() for YAML+markdown parsing. parseTaskFile() and parseTaskDirectory() are Node.js only (use node:fs/promises); for Deno, use parseFrontmatter() with Deno file I/O.
Upsert into tasks table (matches by (projectId, slug))
For each task, DELETE FROM task_dependencies WHERE dependentTaskId = ? then INSERT the current edges — dependency edges are fully replaced, not merged, because the files own the dependency declarations
Commit transaction

If any step fails, the entire sync rolls back — no partial updates.

Concurrency: Only one sync should run at a time. The Decomposer triggers sync after creating/updating task files. No concurrent sync mechanism is needed for v1.

Deleted files: When a task file is removed from tasks/, the sync operation deletes the corresponding DB row. Git history preserves the full file-level history — the DB doesn't need to duplicate it with soft deletes. FK cascade handles cleanup (task_dependencies rows, mappings.taskId SET NULL).
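
The steps above can be sketched as a pure planning function whose output would then be executed inside the single transaction. The names `ParsedTask`, `SyncPlan`, and `planSync` are hypothetical, and the sketch works on slugs only, leaving out the column values.

```typescript
interface ParsedTask {
  slug: string;
  dependsOn: string[]; // slugs from depends_on frontmatter
}

interface SyncPlan {
  upsert: string[]; // slugs to upsert by (projectId, slug)
  remove: string[]; // slugs whose files are gone; rows are deleted
  edges: Array<{ dependent: string; dependsOn: string }>; // full replacement set
}

// Compares parsed files against the slugs currently in the DB for one project.
function planSync(files: ParsedTask[], existingSlugs: string[]): SyncPlan {
  const fileSlugs = new Set(files.map((file) => file.slug));
  const edges: SyncPlan["edges"] = [];
  for (const file of files) {
    for (const dependsOn of file.dependsOn) {
      // Edges are rebuilt from the files on every run, never merged.
      edges.push({ dependent: file.slug, dependsOn });
    }
  }
  return {
    upsert: files.map((file) => file.slug),
    remove: existingSlugs.filter((slug) => !fileSlugs.has(slug)),
    edges,
  };
}
```

Separating planning from execution keeps the transaction body short and makes the delete and replace rules testable without a database.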

DB → Files (Export)

When graph analysis is needed, export DB rows back to markdown files:

Query tasks + task_dependencies for a project
For each task, generate markdown with YAML frontmatter + body
Write to tasks/ directory structure (using path to determine subdirectory)
Run taskgraph validate, taskgraph risk-path, etc.

This is a manual step — "I want to run analysis now" — not an automatic sync.
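
A simplified sketch of the export step follows. In practice `serializeFrontmatter` from @alkdev/taskgraph would be the natural choice; this hand-rolled version exists only to show the shape of the output. It covers a subset of fields, assumes the file is named after the slug, and emits values as JSON literals, which are also valid YAML flow scalars and sequences. `ExportTask`, `exportFilePath`, and `toMarkdown` are hypothetical names.

```typescript
interface ExportTask {
  slug: string;
  name: string;
  status: string;
  path: string | null;
  dependsOn: string[]; // slugs resolved from task_dependencies
  risk: string | null;
  body: string | null;
}

// Assumption: the file name is the slug; `path` selects the subdirectory.
function exportFilePath(task: ExportTask): string {
  return task.path ? `tasks/${task.path}/${task.slug}.md` : `tasks/${task.slug}.md`;
}

// Renders YAML frontmatter followed by the markdown body.
function toMarkdown(task: ExportTask): string {
  const lines = [
    "---",
    `id: ${JSON.stringify(task.slug)}`,
    `name: ${JSON.stringify(task.name)}`,
    `status: ${JSON.stringify(task.status)}`,
    `depends_on: ${JSON.stringify(task.dependsOn)}`,
  ];
  // NULL columns are omitted so "not yet assessed" survives the round trip.
  if (task.risk !== null) lines.push(`risk: ${JSON.stringify(task.risk)}`);
  lines.push("---", "");
  return lines.join("\n") + (task.body ?? "");
}
```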

Sync Error Handling

Error	Behavior
Invalid YAML frontmatter	Skip file, log warning with file path and parse error. Continue with remaining files.
Missing required `id` or `name` field	Skip file, log warning. Task cannot be synced without these fields.
`depends_on` references non-existent slug within project	Insert the dependency edge anyway (dangling reference). The coordinator detects and warns about unresolvable dependencies. `taskgraph validate` should be run before sync to catch these.
Duplicate `id` (slug) in same project	Fail the sync with a clear error. Slug uniqueness is enforced by the DB constraint `unq_tasks_project_slug`.
File removed from filesystem	DELETE the DB row. FK cascade handles dependent rows. Git preserves history.

Validation ordering: Run taskgraph validate before sync to catch structural errors (cycles, missing dependencies, duplicate IDs) at the CLI level. The DB sync then handles data-level integrity (unique constraints, FK checks).
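
The skip-versus-fail rules from the table might be applied in a pre-pass like the sketch below, before any rows are written. `ParsedFile`, `TriageResult`, and `triageFiles` are hypothetical names, and a `null` frontmatter value stands in for a YAML parse failure.

```typescript
interface ParsedFile {
  filePath: string;
  // null stands for a YAML parse failure in this sketch.
  frontmatter: { id?: string; name?: string } | null;
}

interface TriageResult {
  valid: Array<{ filePath: string; slug: string; name: string }>;
  warnings: string[];
}

// Skips unparseable or incomplete files; fails the whole sync on duplicate slugs.
function triageFiles(files: ParsedFile[]): TriageResult {
  const result: TriageResult = { valid: [], warnings: [] };
  const seen = new Map<string, string>(); // slug → first file that declared it
  for (const file of files) {
    const fm = file.frontmatter;
    if (fm === null) {
      result.warnings.push(`${file.filePath}: invalid YAML frontmatter, skipped`);
      continue;
    }
    if (!fm.id || !fm.name) {
      result.warnings.push(`${file.filePath}: missing required id or name, skipped`);
      continue;
    }
    const previous = seen.get(fm.id);
    if (previous !== undefined) {
      throw new Error(`duplicate slug "${fm.id}" in ${previous} and ${file.filePath}`);
    }
    seen.set(fm.id, file.filePath);
    result.valid.push({ filePath: file.filePath, slug: fm.id, name: fm.name });
  }
  return result;
}
```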

Open Questions

Embeddings: Task descriptions may benefit from vector embeddings for similarity search. Deferred — the metadata JSONB column can hold an embedding reference later, or a separate task_embeddings table can be added.
Bulk status updates: When the coordinator completes a meta task (all sub-tasks done), should it automatically mark the meta task completed? Likely yes — this is an application-level operation, not a DB concern.
Cross-project dependencies: Not supported. Tasks can only depend on other tasks within the same project. Application-layer validation rejects cross-project dependencies; a future DB-level trigger guard is deferred to Phase 2 (see task_dependencies cross-project guard).
Task versioning: When a task's body is modified (e.g., notes appended), should we keep previous versions? For v1, no — the current body is sufficient. If audit trail is needed, updatedAt timestamp + metadata revision count could suffice.

References

Cost-benefit framework: taskgraph framework docs — why categorical estimates are structurally required
Workflow guide: taskgraph workflow docs — practical usage patterns
Task file format: @alkdev/taskgraph README — field definitions
TaskFrontmatter struct: @alkdev/taskgraph package source — canonical field types and defaults
taskgraph architecture: taskgraph architecture docs
Storage pattern: README.md
Table reference (cross-cutting): table-reference.md
ADR-011: ../../decisions/ADR-011-dual-task-representation.md
@alkdev/taskgraph (runtime graph engine): @alkdev/taskgraph npm package
