---
status: draft
last_updated: 2026-05-18
---

# Storage: Tasks & Task Dependencies

Tasks are the unit of work in the Spec-Driven Development (SDD) process. The **database is the source of truth** for task data at runtime. Markdown files serve as the **authoring surface** for the Decomposer role and the `taskgraph` CLI — they are ingested into the DB via a sync operation and can be exported back for offline analysis.

For the overall storage pattern, see [README.md](./README.md). For cross-cutting table reference (common columns, cascade behavior, index reference, status enums, relations), see [table-reference.md](./table-reference.md). For design decisions, see [../../decisions/](../../decisions/).

## Overview

### Why Database as Source of Truth

Taskgraph's file-based model works well for single-agent, single-worktree workflows. In the hub's multi-agent, multi-worktree environment, files create problems:

- **Parallel worktrees**: Agent A marks a task `in-progress` in their worktree's file. Agent B can't see this — the file lives in A's working directory. The coordinator can't get a consistent view.
- **Reliable coordination**: The coordinator needs to query "which tasks are pending?" and "what's blocking task X?" at runtime without scanning filesystems across worktrees.
- **Atomic status updates**: An agent calling `hub.task.updateStatus` gets an immediate, transactional state change visible to all other agents and the coordinator.

The database is the authoritative, queryable, concurrent-safe representation. Files are the authoring format.

### Relationship to taskgraph CLI

The `taskgraph` CLI operates on markdown files. Its value is in **offline analysis** — `topo`, `cycles`, `parallel`, `critical`, `bottleneck`, `risk-path`, `decompose`. These commands depend on categorical fields (`scope`, `risk`, `impact`, `level`) being assessed.

The workflow is:

1. **Author** — Decomposer creates/edits markdown files using `taskgraph init` and direct editing
2. **Sync** — Files are ingested into the DB (files → DB)
3. **Execute** — Coordinator and agents query and mutate the DB via hub operations
4. **Analyze** — When needed, export from DB to files, run `taskgraph risk-path` etc.

The taskgraph CLI is not required at runtime. The hub uses **@alkdev/taskgraph** for runtime graph operations (topological sort, cycle detection, parallel groups, critical path, risk analysis) — see [Graphology Integration](#graphology-integration-runtime-graph-ops).

## Task Authority Model

| Aspect | Authority | Why |
|--------|-----------|-----|
| Task structure (all fields) | **DB** | Queryable, concurrent-safe, consistent |
| Task specification (body) | **DB** (`body` column) | Stored as markdown text; agents append notes during execution |
| Task authoring/creation | **Files** → sync → DB | Decomposer edits files; sync ingests them |
| Runtime status mutations | **DB** (hub operations) | `hub.task.*` operations — coordinator and agents call these |
| Offline graph analysis | **Files** (taskgraph CLI) | Export from DB when needed for `taskgraph risk-path` etc. |

See [Field Authority Split](#field-authority-split) for the explicit list of authored vs runtime-managed fields.

## Field Authority Split

Fields are split into two categories based on who writes them:

### Authored Fields (upserted by file sync)

These fields are written by the Decomposer/file sync. The `ON CONFLICT DO UPDATE SET` clause in the sync upsert includes **only** these columns:

| Field | DB Column |
|-------|-----------|
| id | `slug` |
| name | `name` |
| (project) | `projectId` |
| (directory path) | `path` |
| scope | `scope` |
| risk | `risk` |
| impact | `impact` |
| level | `level` |
| priority | `priority` |
| tags | `tags` |
| assignee | `assignee` |
| due | `dueAt` |
| (body) | `body` |
| created | `fileCreatedAt` |
| modified | `fileModifiedAt` |
| depends_on | `task_dependencies` table |

**Note**: `projectId` is set from the project context during sync (the task file's location within a project's `tasks/` directory determines the project), not from taskgraph frontmatter. `commonCols` fields (`id`, `metadata`, `createdAt`, `updatedAt`) are DB-generated and not part of the sync conflict domain.

### Runtime-Managed Fields (mutated via `hub.task.*` operations only)

These fields are never overwritten by sync. They are only mutated by hub operations (`hub.task.updateStatus`, `hub.task.addNote`, etc.):

| Field | DB Column | Set By |
|-------|-----------|--------|
| status | `status` | `hub.task.updateStatus` |
| (started timestamp) | `startedAt` | `hub.task.updateStatus` (on `in-progress`) |
| (completed timestamp) | `completedAt` | `hub.task.updateStatus` (on `completed`) |

> **Warning**: Sync must never write `status`, `startedAt`, or `completedAt` — these are owned by hub operations. The sync upsert uses `ON CONFLICT DO UPDATE SET` only for authored fields; runtime fields are excluded from the SET clause.

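One way to keep this split enforceable is to derive the `SET` clause from an explicit list of authored columns. The following is a minimal sketch, not the hub's actual sync code: the helper name `buildTaskUpsertSql`, the quoted camelCase identifiers, and the positional-parameter layout are assumptions made for illustration.

```typescript
// Conflict target from `unq_tasks_project_slug`: inserted, never updated.
const CONFLICT_TARGET = ["projectId", "slug"] as const;

// Authored columns from the Authored Fields table (edges live in
// `task_dependencies` and are handled separately).
const AUTHORED_UPDATABLE = [
  "name", "path", "scope", "risk", "impact", "level", "priority",
  "tags", "assignee", "dueAt", "body", "fileCreatedAt", "fileModifiedAt",
] as const;

// Hypothetical helper: builds a parameterized PostgreSQL upsert whose
// SET clause contains authored columns only. `status`, `startedAt`, and
// `completedAt` never appear, so new rows get the column default and
// existing rows keep their runtime state.
function buildTaskUpsertSql(): string {
  const insertCols = [...CONFLICT_TARGET, ...AUTHORED_UPDATABLE];
  const columnList = insertCols.map((c) => `"${c}"`).join(", ");
  const placeholders = insertCols.map((_, i) => `$${i + 1}`).join(", ");
  const setClause = AUTHORED_UPDATABLE
    .map((c) => `"${c}" = EXCLUDED."${c}"`)
    .join(", ");
  return (
    `INSERT INTO tasks (${columnList}) VALUES (${placeholders}) ` +
    `ON CONFLICT ("projectId", "slug") DO UPDATE SET ${setClause}`
  );
}
```

Keeping the column list in one constant means a newly added runtime column stays out of sync's reach unless someone deliberately adds it to the authored list.
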
## Field Mapping: taskgraph Frontmatter → DB Columns

Every field in taskgraph's `TaskFrontmatter` struct maps to a dedicated DB column. No frontmatter fields are relegated to JSONB `metadata`.

| taskgraph Field | DB Column | Type | Notes |
|---|---|---|---|
| `id` | `slug` | text NOT NULL | Direct mapping. No transformation. `slug` is taskgraph-compatible, used in `depends_on` references. |
| `name` | `name` | text NOT NULL | Direct mapping |
| `status` | `status` | text NOT NULL, enum | Direct mapping: `pending`, `in-progress`, `completed`, `failed`, `blocked`. Default: `pending`. |
| `depends_on` | `task_dependencies` table | — | Each element creates a row: `depends_on[i]` → `dependsOnTaskId`, task → `dependentTaskId` |
| `scope` | `scope` | text, enum | `single`, `narrow`, `moderate`, `broad`, `system`. **Nullable** — NULL = not yet assessed. |
| `risk` | `risk` | text, enum | `trivial`, `low`, `medium`, `high`, `critical`. **Nullable** — NULL = not yet assessed. |
| `impact` | `impact` | text, enum | `isolated`, `component`, `phase`, `project`. **Nullable** — NULL = not yet assessed. |
| `level` | `level` | text, enum | `planning`, `decomposition`, `implementation`, `review`, `research`. **Nullable** — NULL = not yet assessed. |
| `priority` | `priority` | text, enum | `low`, `medium`, `high`, `critical`. Nullable. |
| `tags` | `tags` | text[] | String array. Default `{}`. |
| `assignee` | `assignee` | text | Assigned agent or person. Nullable. |
| `due` | `dueAt` | timestamp with tz | Renamed from `due` for DB convention. Nullable. |
| `created` | `fileCreatedAt` | timestamp with tz | Frontmatter `created` field. Separate from DB `createdAt` (row creation time). Nullable — frontmatter may not include it. |
| `modified` | `fileModifiedAt` | timestamp with tz | Frontmatter `modified` field. Separate from DB `updatedAt` (row update time). Nullable. |
| (body) | `body` | text | Markdown content after frontmatter. Nullable — empty body is valid. |
| (directory path) | `path` | text | Logical grouping prefix: `architecture`, `implementation/storage`. Nullable — tasks created via API with no file origin have no path. See [Path Semantics](#path-semantics). |
| (project) | `projectId` | text NOT NULL | FK → projects.id |

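The renames in the table above (`id` → `slug`, `due` → `dueAt`, `created` → `fileCreatedAt`, `modified` → `fileModifiedAt`) can be sketched as a small mapping function. This is an illustrative assumption rather than the hub's implementation: the `Frontmatter` and `AuthoredTaskRow` shapes are trimmed-down stand-ins, the categorical fields are omitted for brevity, and `toAuthoredRow` is a hypothetical name.

```typescript
// Subset of the parsed frontmatter fields (stand-in for TaskFrontmatter).
interface Frontmatter {
  id: string;
  name: string;
  due?: string;
  created?: string;
  modified?: string;
  tags?: string[];
  depends_on?: string[];
}

// Authored columns of one `tasks` row. Runtime-managed columns
// (status, startedAt, completedAt) are deliberately absent.
interface AuthoredTaskRow {
  projectId: string;
  slug: string;
  name: string;
  path: string | null;
  tags: string[];
  dueAt: Date | null;
  fileCreatedAt: Date | null;
  fileModifiedAt: Date | null;
  body: string;
}

// Missing timestamps stay NULL instead of being defaulted.
const toDate = (value?: string): Date | null => (value ? new Date(value) : null);

// Hypothetical mapper. `depends_on` is not copied onto the row because it
// becomes rows in `task_dependencies`; `projectId` and `path` come from the
// sync context, not from the frontmatter.
function toAuthoredRow(
  fm: Frontmatter,
  body: string,
  projectId: string,
  path: string | null,
): AuthoredTaskRow {
  return {
    projectId,
    slug: fm.id,
    name: fm.name,
    path,
    tags: fm.tags ?? [],
    dueAt: toDate(fm.due),
    fileCreatedAt: toDate(fm.created),
    fileModifiedAt: toDate(fm.modified),
    body,
  };
}
```
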
## Table Schemas

### `tasks`

SDD task definitions. Every field in taskgraph's `TaskFrontmatter` struct maps to a dedicated column in this table; no frontmatter fields are stored in the `metadata` JSONB.

| Column | Type | Notes |
|--------|------|-------|
| commonCols | — | id, metadata, createdAt, updatedAt |
| projectId | text NOT NULL | FK → projects.id (cascade) — tasks belong to a project |
| slug | text NOT NULL | taskgraph `id` — kebab-case identifier used in `depends_on` references. Unique within a project. |
| name | text NOT NULL | Human-readable task name (from frontmatter `name`) |
| path | text | Logical grouping prefix derived from filesystem location (e.g., `architecture`, `implementation/storage`). Nullable — tasks created via API with no file origin have no path. Enables `WHERE path LIKE 'implementation/%'` for scoped queries. |
| status | text NOT NULL | Enum: `pending`, `in-progress`, `completed`, `failed`, `blocked`. Default: `pending`. Status transitions go through hub operations, not file edits. |
| scope | text | Categorical scope: `single`, `narrow`, `moderate`, `broad`, `system`. **Nullable** — NULL = not yet assessed. See [Why Categorical Fields Are Nullable](#why-categorical-fields-are-nullable-not-not-null-with-defaults). |
| risk | text | Categorical risk: `trivial`, `low`, `medium`, `high`, `critical`. **Nullable** — NULL = not yet assessed. |
| impact | text | Categorical impact: `isolated`, `component`, `phase`, `project`. **Nullable** — NULL = not yet assessed. |
| level | text | Task level: `planning`, `decomposition`, `implementation`, `review`, `research`. **Nullable** — NULL = not yet assessed. |
| priority | text | Priority: `low`, `medium`, `high`, `critical`. Nullable. |
| assignee | text | Assigned agent or person. Nullable. |
| dueAt | timestamp with tz | Due date (from frontmatter `due`). Nullable. |
| tags | text[] | Filtering tags. Default `{}`. GIN index for array-contains queries. |
| body | text | Markdown task specification (from file body after frontmatter). Nullable — empty body is valid. Agents may append notes during execution. |
| fileCreatedAt | timestamp with tz | Frontmatter `created` field — file creation time from the markdown. Separate from DB `createdAt` (row creation time). Nullable. |
| fileModifiedAt | timestamp with tz | Frontmatter `modified` field — file modification time from the markdown. Separate from DB `updatedAt` (row update time). Nullable. |
| startedAt | timestamp with tz | When status became `in-progress`. Set by hub operation, not by agent. |
| completedAt | timestamp with tz | When status became `completed`. Set by hub operation. |

**Unique constraint**: `unq_tasks_project_slug` UNIQUE on `(projectId, slug)` — task slugs are unique within a project.

**pgEnum Definitions**: The following enum columns use PostgreSQL `pgEnum` for type safety. Drizzle's `pgEnum` generates named PostgreSQL enums and provides TypeScript type inference. The enum values are aligned with taskgraph's categorical fields.

```ts
import { pgEnum } from "drizzle-orm/pg-core";

// Runtime-managed lifecycle state
export const taskStatus = pgEnum("task_status", ["pending", "in-progress", "completed", "failed", "blocked"]);
// Categorical estimates (nullable on the column: NULL = not yet assessed)
export const taskScope = pgEnum("task_scope", ["single", "narrow", "moderate", "broad", "system"]);
export const taskRisk = pgEnum("task_risk", ["trivial", "low", "medium", "high", "critical"]);
export const taskImpact = pgEnum("task_impact", ["isolated", "component", "phase", "project"]);
export const taskLevel = pgEnum("task_level", ["planning", "decomposition", "implementation", "review", "research"]);
export const taskPriority = pgEnum("task_priority", ["low", "medium", "high", "critical"]);
```

The decomposer template should consume these same enum definitions to ensure DB-level constraints match the application-level typing.

**Indexes**:

- `idx_tasks_project_id` on `(projectId)`
- `idx_tasks_project_status` on `(projectId, status)` — composite for "find all pending tasks in project X"
- `idx_tasks_status` on `(status)`
- `idx_tasks_active` partial on `(projectId)` WHERE `status IN ('pending', 'in-progress', 'blocked')` — efficiently find active tasks
- `idx_tasks_path` on `(path)` **with `text_pattern_ops`** — locale-independent LIKE pattern matching for path prefix queries (e.g., `WHERE path LIKE 'implementation/%'`)
- `idx_tasks_priority` on `(priority)`
- `idx_tasks_assignee` on `(assignee)`
- `idx_tasks_due_at` on `(dueAt)`
- `idx_tasks_tags` GIN on `(tags)` — for array-contains queries (`tags @> '{security}'`)

**`slug` semantics**: From taskgraph frontmatter `id` field. Kebab-case identifiers like `auth-setup`, `storage-tasks-table`. Appears in `depends_on` arrays.

**`path` semantics**: Nullable — tasks created via API with no filesystem origin have no path. When set, captures the logical grouping derived from the `tasks/` directory structure. E.g., a file at `tasks/implementation/storage/tasks-table.md` gets `path: "implementation/storage"`. Enables `WHERE path LIKE 'implementation/%'` (scoped queries) without requiring a `parentId` FK. This replaces the previous `parentId` column — grouping is a path concern, not a tree relationship.

**No `parentId` column**: Grouping is handled by `path`, dependencies by `task_dependencies`. A "meta task" is just a regular task that depends on its sub-tasks — no special entity type needed.

**No `removedAt` column**: When a task file is removed, the sync operation DELETEs the DB row. Git history preserves the file-level history; the DB doesn't need to duplicate it with soft deletes. FK cascade handles cleanup.

**`metadata` JSONB**: Reserved for truly ad-hoc data not in the taskgraph schema. No taskgraph frontmatter fields are stored here — all have proper columns.

### `task_dependencies`

Dependency edges between tasks. Directed: a row means the dependent task depends on the prerequisite task (prerequisite must complete before dependent can start). Mirrors the taskgraph `depends_on` relationship.

| Column | Type | Notes |
|--------|------|-------|
| commonCols | — | id, metadata, createdAt, updatedAt |
| dependsOnTaskId | text NOT NULL | FK → tasks.id (cascade) — The prerequisite task (must complete first) |
| dependentTaskId | text NOT NULL | FK → tasks.id (cascade) — The dependent task (waits for prerequisite) |

**Unique constraint**: `unq_task_dependencies_depends_on_task` UNIQUE on `(dependsOnTaskId, dependentTaskId)` — no duplicate dependency edges.

**Indexes**: `idx_task_dependencies_depends_on_task_id` on `(dependsOnTaskId)` — "what depends on this task?", `idx_task_dependencies_dependent_task_id` on `(dependentTaskId)` — "what does this task depend on?".

**Direction**: `dependentTaskId` is the task that has the dependency; `dependsOnTaskId` is its prerequisite. Each row reads "task `dependentTaskId` depends on task `dependsOnTaskId`". When the rows are loaded into a graph, the edge runs the other way, from `dependsOnTaskId` to `dependentTaskId` (prerequisite → dependent), so a topological sort yields prerequisites before dependents.

**Cross-project dependency guard**: `dependentTaskId` and `dependsOnTaskId` MUST reference tasks within the same project. The application layer enforces this constraint: creating a dependency between tasks in different projects is rejected with a validation error. This is not enforced at the DB level (FK constraints allow cross-project references), so the application must check project consistency before insert.

A future DB-level guard could use a trigger: `BEFORE INSERT ON task_dependencies` that checks `NEW.dependentTaskId` and `NEW.dependsOnTaskId` reference tasks in the same project. This is deferred to Phase 2 — the application-layer check is sufficient for now.

**Sync source**: Dependency edges are authored in task file frontmatter (`depends_on: [other-task]`) and synced to this table during the file → DB sync operation. The sync clears and re-inserts all edges for a task on each run — dependencies are fully replaced by the sync, not merged or modified at runtime.

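The application-layer guard described above can be as small as a comparison of the two tasks' `projectId` values before the edge row is built. The sketch below is illustrative only: `TaskRef`, `assertSameProject`, and `toEdgeRow` are hypothetical names, and the error wording is an assumption.

```typescript
// Minimal view of a task row needed for the check.
interface TaskRef {
  id: string;
  projectId: string;
}

// Column names follow the `task_dependencies` table above.
interface DependencyEdgeRow {
  dependentTaskId: string;
  dependsOnTaskId: string;
}

// Hypothetical validation step: rejects edges that span two projects.
function assertSameProject(dependent: TaskRef, prerequisite: TaskRef): void {
  if (dependent.projectId !== prerequisite.projectId) {
    throw new Error(
      `Cross-project dependency rejected: ${dependent.id} (${dependent.projectId}) ` +
        `cannot depend on ${prerequisite.id} (${prerequisite.projectId})`,
    );
  }
}

// Builds the row only after the guard passes, so callers cannot skip it.
function toEdgeRow(dependent: TaskRef, prerequisite: TaskRef): DependencyEdgeRow {
  assertSameProject(dependent, prerequisite);
  return { dependentTaskId: dependent.id, dependsOnTaskId: prerequisite.id };
}
```
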
## Why ALL Frontmatter Fields Get Proper Columns

ADR-001 establishes the pattern: "separate structured columns for high-query, high-filter fields." For tasks, **every** taskgraph frontmatter field is queryable and filterable in the coordinator's workflow:

- `priority` — "show me high-priority pending tasks" (coordinator prioritization)
- `assignee` — "which tasks are assigned to agent X?" (work assignment)
- `dueAt` — "which tasks are due this week?" (deadline tracking)
- `tags` — "filter by tag" (cross-cutting concerns)

Shoving these into `metadata` JSONB loses type safety, indexability, and SQL queryability — exactly the problems the database is meant to solve. The `metadata` JSONB column (from `commonCols`) is reserved for truly ad-hoc data that isn't in the taskgraph schema.

### Why Categorical Fields Are Nullable (Not NOT NULL with Defaults)

The previous design made `scope`, `risk`, `impact`, and `level` NOT NULL with defaults (`narrow`, `low`, `isolated`, `implementation`). This conflated two states:

- **Assessed as `low`** — the Decomposer evaluated this and determined the risk is low
- **Not assessed** — nobody filled this in

Hiding the distinction with defaults means the coordinator can't distinguish a deliberate assessment from a gap. NULL is the correct signal for "not yet assessed."

Taskgraph itself makes these fields `Option<TaskScope>`, `Option<TaskRisk>`, etc. — nullable. The DB should match the source model.

**Application-layer handling**: When `scope`, `risk`, `impact`, or `level` is NULL, the coordinator should:

- Warn that the task hasn't been assessed
- Exclude it from cost-benefit analysis (you can't compute risk-path without risk values)
- Suggest the Decomposer assess it

For @alkdev/taskgraph operations that need numeric weights, provide fallbacks at the application layer (e.g., treat NULL risk as `low` for topo sort, but warn).

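A fallback of this kind might look like the sketch below. The numeric weights are placeholder assumptions (they are not taskgraph's values), and `riskWeight` and `unassessedSlugs` are hypothetical helper names; the point is only that NULL is surfaced as a warning while the computation still gets a number.

```typescript
type TaskRisk = "trivial" | "low" | "medium" | "high" | "critical";

interface AssessableTask {
  slug: string;
  risk: TaskRisk | null; // NULL in the DB = not yet assessed
}

// Placeholder weights for illustration only.
const RISK_WEIGHT: Record<TaskRisk, number> = {
  trivial: 0,
  low: 1,
  medium: 2,
  high: 3,
  critical: 4,
};

// Returns a usable weight and warns when the value is a fallback
// rather than a deliberate assessment.
function riskWeight(
  task: AssessableTask,
  warn: (message: string) => void = console.warn,
): number {
  if (task.risk === null) {
    warn(`Task ${task.slug} has no risk assessment; falling back to "low"`);
    return RISK_WEIGHT.low;
  }
  return RISK_WEIGHT[task.risk];
}

// Slugs the coordinator could hand back to the Decomposer for assessment.
function unassessedSlugs(tasks: AssessableTask[]): string[] {
  return tasks.filter((t) => t.risk === null).map((t) => t.slug);
}
```
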
## Path Semantics

The `path` column captures the logical grouping of tasks, derived from their location in the `tasks/` directory hierarchy:

```
tasks/
├── architecture/
│   ├── auth-design.md          → path: "architecture"
│   └── storage-overview.md     → path: "architecture"
├── research/
│   └── embedding-approach.md   → path: "research"
└── implementation/
    ├── storage/
    │   ├── tasks-table.md      → path: "implementation/storage"
    │   └── relations.md        → path: "implementation/storage"
    └── auth/
        └── oauth-flow.md       → path: "implementation/auth"
```

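The derivation shown in the tree amounts to dropping the leading `tasks/` segment and the file name. A minimal sketch follows, assuming forward-slash paths relative to the project root; `derivePath` is a hypothetical helper, and returning `null` for a file placed directly in `tasks/` is an assumption the text above does not settle.

```typescript
// Hypothetical helper: derives the `path` column value from a task file
// location such as "tasks/implementation/storage/tasks-table.md".
function derivePath(relativeFile: string): string | null {
  const segments = relativeFile.split("/").filter((s) => s.length > 0);
  if (segments[0] !== "tasks" || segments.length < 2) {
    throw new Error(`Not a task file under tasks/: ${relativeFile}`);
  }
  // Drop the leading "tasks" directory and the trailing file name.
  const directories = segments.slice(1, -1);
  // Assumption: a file directly in tasks/ has no grouping prefix.
  return directories.length > 0 ? directories.join("/") : null;
}
```
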
**`path` is nullable** because tasks created at runtime via hub operations (not synced from files) have no filesystem origin.

**`path` enables scoped queries**:

- `WHERE path = 'architecture'` — all architecture tasks
- `WHERE path LIKE 'implementation/%'` — all implementation tasks
- `WHERE path = 'implementation/storage'` — storage implementation tasks

This is a prefix-based grouping mechanism. It replaces `parentId` (which was not in the taskgraph model and conflated organizational grouping with dependency ordering).

**Locale sensitivity**: The `path` column uses `text` type with the database's default collation. `LIKE` itself is always case-sensitive in PostgreSQL, but a default B-tree index can serve left-anchored patterns such as `WHERE path LIKE 'implementation/%'` only when the collation is `C`/`POSIX`. `idx_tasks_path` therefore uses the `text_pattern_ops` operator class (`CREATE INDEX idx_tasks_path ON tasks (path text_pattern_ops)`), which compares values character by character and lets the planner use the index for left-anchored `LIKE` and `~` patterns regardless of the database collation. Declaring the column with `COLLATE "C"` is an alternative.

## Grouping vs Dependencies

**There is no `parentId` column.** Task grouping and dependency ordering are separate concepts:

- **Grouping** — `path` column. "This task belongs to the `implementation/storage` group." Enables scoped queries. Derived from filesystem layout during sync.
- **Dependencies** — `task_dependencies` table. "This task cannot start until that task completes." Enables topological sort, cycle detection, critical path. Derived from `depends_on` frontmatter.

A "meta task" (e.g., "implement storage") is simply a task that `depends_on` all its sub-tasks. There is no special entity type — it's a regular task plus dependency edges. The coordinator picks up the meta task as an assignment, and the implementation specialist works through sub-tasks in dependency order.

**Why not `parentId`**: `parentId` was invented in a previous doc revision but has no basis in the taskgraph data model. It created confusion:

- Redundant with `task_dependencies` (a meta task's dependencies ARE its sub-tasks)
- Required a fragile "inference from directory structure" during sync
- Violated the invariant that the DB schema mirrors the taskgraph frontmatter model

## Relationship to Existing Tables

### `mappings` Table

The `mappings` table links sessions to coordinators, spokes, and worktrees. A `taskId` column references the task a mapping is assigned to:

```ts
// Excerpt from the `mappings` table definition (Drizzle column builders)
taskId: text().references(() => tasks.id, { onDelete: "set null" }), // FK to tasks
task: text(), // denormalized display name (e.g., task.slug or task.name)
```

This preserves the quick-reference pattern (coordinators can list mappings with task names without a JOIN) while maintaining referential integrity.

### `projects` Table

Tasks belong to a project via `tasks.projectId`. A project's tasks live in the project's `tasks/` directory. Cross-project task dependencies are not supported — tasks can only depend on other tasks within the same project. This is enforced at the application level (see the `task_dependencies` cross-project guard).

### `sessions` Table

Sessions are linked to tasks indirectly through `mappings`. When the coordinator spawns a session for a meta task:

1. The task row already exists in `tasks` (synced from file or created via API)
2. The coordinator creates a `sessions` row for the implementation specialist
3. The coordinator creates a `mappings` row with `taskId` pointing to the meta task

## Task Status Lifecycle

```
pending → in-progress → completed
                      ↘ failed → in-progress (retry)
                      ↘ blocked → in-progress (unblocked)
```

| Status | Meaning |
|--------|---------|
| `pending` | Task exists, not yet started |
| `in-progress` | A session is actively working on this task |
| `completed` | Task finished successfully |
| `failed` | Task failed, may retry (Safe Exit protocol) |
| `blocked` | Task is blocked by an unmet dependency or external issue |

Status transitions go through **hub operations** (`hub.task.updateStatus`), not file edits. This ensures:

- All agents see consistent state immediately
- The coordinator can query "which tasks are pending?" reliably
- No merge conflicts from parallel file edits

Timestamp columns `startedAt` and `completedAt` track when a task entered `in-progress` and `completed` states respectively. These are set by the hub operation, not by the agent.

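The lifecycle can be expressed as a transition table plus a pure function that stamps the timestamps. This is a sketch under stated assumptions, not the behavior of `hub.task.updateStatus`: the `ALLOWED` map is one reading of the diagram (it treats `failed` and `blocked` as reachable from `in-progress` only, which the document does not spell out), and `applyTransition` is a hypothetical name.

```typescript
type TaskStatus = "pending" | "in-progress" | "completed" | "failed" | "blocked";

// Assumed reading of the lifecycle diagram.
const ALLOWED: Record<TaskStatus, TaskStatus[]> = {
  pending: ["in-progress"],
  "in-progress": ["completed", "failed", "blocked"],
  failed: ["in-progress"], // retry
  blocked: ["in-progress"], // unblocked
  completed: [],
};

interface StatusFields {
  status: TaskStatus;
  startedAt: Date | null;
  completedAt: Date | null;
}

// Validates the transition and sets the runtime-managed timestamps.
// The caller (the hub operation) supplies `now`; agents never do.
function applyTransition(current: StatusFields, next: TaskStatus, now: Date): StatusFields {
  if (!ALLOWED[current.status].includes(next)) {
    throw new Error(`Illegal status transition: ${current.status} -> ${next}`);
  }
  return {
    status: next,
    // Re-entering in-progress (retry/unblock) overwrites startedAt here;
    // keeping the first start time would be an equally valid choice.
    startedAt: next === "in-progress" ? now : current.startedAt,
    completedAt: next === "completed" ? now : current.completedAt,
  };
}
```
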
## Task Notes (Append-Only)

Agents may need to add notes to a task during execution (observations, partial progress, blockers encountered). For v1, this is handled by **appending markdown to the `body` column**:

```markdown
## Task Description (original)

Implement the tasks table with Drizzle-TypeBox pattern...

## Implementation Notes

- 2026-04-19: Started with table definition, commonCols pattern works
- 2026-04-19: Hit issue with text[] type for tags — need to check Drizzle support
```

The `hub.task.addNote` operation appends a timestamped note section to the end of `body`. This is simple, preserves the full context in one place, and requires no additional tables.

**Concurrency model for `hub.task.addNote`**: Notes are appended to the task `body` field using **DB-level concatenation**: `UPDATE tasks SET body = COALESCE(body, '') || $note WHERE id = $taskId`. This avoids read-modify-write cycles entirely — the append is atomic at the SQL level, eliminating race conditions between concurrent agents.

As a fallback for scenarios where DB-level concatenation isn't feasible, **optimistic locking via `updatedAt`** can be used: read the current `updatedAt`, append the note, and `UPDATE WHERE updatedAt = readValue`. If the row was updated between read and write, the UPDATE affects 0 rows and the operation must be retried. This is sufficient for the expected low-contention scenario (one agent at a time writing notes to a task).

For high-contention scenarios (multiple agents writing simultaneously), or if structured multi-agent notes become necessary later, a dedicated `task_notes` table with `INSERT` operations can replace the UPDATE appends. The `body` append pattern doesn't preclude this; the table would be additive.

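The DB-level concatenation can be wrapped in a small query builder. The sketch below is an assumption-laden illustration: the dated-bullet note format mirrors the example body above but is not a documented format, and `formatNote` and `buildAddNoteQuery` are hypothetical names. It uses positional parameters (`$1`, `$2`) in place of the named placeholders in the prose.

```typescript
// Hypothetical note format: a dated bullet on its own line (date in UTC).
function formatNote(note: string, at: Date): string {
  const day = at.toISOString().slice(0, 10); // YYYY-MM-DD
  return `\n- ${day}: ${note.trim()}\n`;
}

interface ParameterizedQuery {
  text: string;
  values: string[];
}

// Builds the atomic append. COALESCE covers tasks whose body is still NULL;
// the concatenation happens inside the UPDATE, so no prior read is needed.
function buildAddNoteQuery(taskId: string, note: string, at: Date): ParameterizedQuery {
  return {
    text: "UPDATE tasks SET body = COALESCE(body, '') || $1 WHERE id = $2",
    values: [formatNote(note, at), taskId],
  };
}
```

Because the note text travels as a bound parameter, markdown characters in the note cannot alter the SQL statement.
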
## Why Categorical Estimates Matter

The `scope`, `risk`, `impact`, and `level` fields are not cosmetic metadata — they are what make taskgraph's analysis commands produce useful results. The cost-benefit framework (see taskgraph framework docs) demonstrates a structural property: **upstream failures multiply downstream damage**.

These fields power:

- **`taskgraph decompose`** — flags tasks where `risk > medium` or `scope > moderate`
- **`taskgraph risk-path`** — finds the highest cumulative risk path
- **`taskgraph critical`** — finds completion blockers
- **`taskgraph bottleneck`** — finds high-betweenness tasks

Without them, you just get topological sort — useful, but not structurally insightful. The DB columns for these fields are **nullable** (NULL = not assessed) rather than NOT NULL with defaults, because the distinction between "deliberately assessed as `low`" and "nobody filled this in" is itself valuable information for the coordinator.

## Graphology Integration (Runtime Graph Ops)

For runtime graph operations, the hub uses **`@alkdev/taskgraph`** — a TypeScript package that wraps graphology and provides a high-level `TaskGraph` class plus analysis functions. The CLI (`taskgraph`) is for offline authoring and analysis; the TS package is for runtime use.

The approach:

1. Load all `tasks` + `task_dependencies` rows for a project from the DB
2. Build a `TaskGraph` via `TaskGraph.fromRecords(tasks, edges)`
3. Run analysis functions as needed: `criticalPath()`, `parallelGroups()`, `bottlenecks()`, `riskPath()`, `shouldDecomposeTask()`, `workflowCost()`

This works because realistic task graphs are small — typically 10–50 tasks, rarely exceeding 200 even on large projects. Building a graph from DB rows is instant at this scale (`TaskGraph.fromRecords` with 100 nodes reconstructs in <5ms).

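To show what "build a graph from rows, then analyze" involves without guessing at the exact `@alkdev/taskgraph` signatures, here is a standalone sketch that groups tasks into parallel "waves" straight from `task_dependencies` rows. It does not use the package; `parallelWaves` is a hypothetical function, and it assumes every edge references a task that was loaded.

```typescript
// Row shape follows the `task_dependencies` columns.
interface EdgeRow {
  dependsOnTaskId: string; // prerequisite
  dependentTaskId: string; // waits for the prerequisite
}

// Kahn's algorithm processed level by level: every task in a wave has all
// of its prerequisites in earlier waves, so tasks within a wave can run in
// parallel. Throws if a cycle prevents some tasks from being placed.
function parallelWaves(taskIds: string[], edges: EdgeRow[]): string[][] {
  const unmet = new Map<string, number>(); // unmet prerequisite count
  const dependents = new Map<string, string[]>(); // prerequisite -> dependents
  for (const id of taskIds) {
    unmet.set(id, 0);
    dependents.set(id, []);
  }
  for (const edge of edges) {
    unmet.set(edge.dependentTaskId, (unmet.get(edge.dependentTaskId) ?? 0) + 1);
    dependents.get(edge.dependsOnTaskId)?.push(edge.dependentTaskId);
  }

  const waves: string[][] = [];
  let wave = taskIds.filter((id) => unmet.get(id) === 0);
  let placed = 0;
  while (wave.length > 0) {
    waves.push(wave);
    placed += wave.length;
    const next: string[] = [];
    for (const id of wave) {
      for (const dependent of dependents.get(id) ?? []) {
        const left = (unmet.get(dependent) ?? 0) - 1;
        unmet.set(dependent, left);
        if (left === 0) next.push(dependent);
      }
    }
    wave = next;
  }
  if (placed !== taskIds.length) {
    throw new Error("Dependency cycle detected");
  }
  return waves;
}
```

At the scale quoted above (tens to a few hundred tasks), rebuilding this structure on every query is cheap, which is the same reasoning the document gives for reconstructing the `TaskGraph` from rows on demand.
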
`@alkdev/taskgraph` exports:

- **`TaskGraph`** — construction (fromTasks, fromRecords, fromJSON), mutation (addTask, removeTask, addDependency, updateTask), queries (hasCycles, findCycles, topologicalOrder, dependencies, dependents, getTask), validation (validateSchema, validateGraph), export
- **Analysis functions** — criticalPath, weightedCriticalPath, parallelGroups, bottlenecks, riskPath, riskDistribution, calculateTaskEv, workflowCost, shouldDecomposeTask
- **Schema types** — TaskScope, TaskRisk, TaskImpact, TaskLevel, TaskPriority, TaskStatus enums with TypeBox schemas
- **Frontmatter** — parseFrontmatter, serializeFrontmatter (YAML + markdown)
- **Error classes** — TaskgraphError, CircularDependencyError, TaskNotFoundError, etc.

**Why not taskgraph NAPI for v1**: The Rust CLI (`taskgraph`) is for offline authoring and analysis. The TypeScript package (`@alkdev/taskgraph`) handles all runtime graph operations. Graphology is a transitive dependency through `@alkdev/taskgraph` and handles < 200 nodes trivially. NAPI is unnecessary at realistic scales.

## Sync Flow

```
┌──────────────┐       ┌───────────────┐       ┌────────────────────┐
│  Decomposer  │       │ taskgraph CLI │       │       Hub DB       │
│ creates .md  │──────►│   validates   │──────►│    tasks table     │
│    files     │       │   analyzes    │       │ task_dependencies  │
└──────────────┘       └───────────────┘       └────────────────────┘
                                                         ▲
                                                         │
                                               ┌─────────┴──────────┐
                                               │   Hub operations   │
                                               │     hub.task.*     │
                                               │  (status, notes)   │
                                               └────────────────────┘
```

### Sync: Files → DB

The sync operation runs as a **single database transaction**:

1. **Begin transaction**
2. Scan `tasks/` directory for markdown files
3. Parse frontmatter (YAML) + body (markdown) from each file. `@alkdev/taskgraph` provides `parseFrontmatter()` and `serializeFrontmatter()` for YAML+markdown parsing. `parseTaskFile()` and `parseTaskDirectory()` are Node.js only (use `node:fs/promises`); for Deno, use `parseFrontmatter()` with Deno file I/O.
4. Upsert into `tasks` table (matches by `(projectId, slug)`)
5. For each task, `DELETE FROM task_dependencies WHERE dependentTaskId = ?` then `INSERT` the current edges — dependency edges are fully replaced, not merged, because the files own the dependency declarations
6. **Commit transaction**

If any step fails, the entire sync rolls back — no partial updates.

**Concurrency**: Only one sync should run at a time. The Decomposer triggers sync after creating/updating task files. No concurrent sync mechanism is needed for v1.

**Deleted files**: When a task file is removed from `tasks/`, the sync operation **deletes** the corresponding DB row. Git history preserves the full file-level history — the DB doesn't need to duplicate it with soft deletes. FK cascade handles cleanup (`task_dependencies` rows, `mappings.taskId` SET NULL).

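The decisions the transaction has to make (what to upsert, what to delete, which edges to re-insert) can be computed as a pure plan before any SQL runs. The following sketch assumes the files have already been parsed and the project's existing slugs have been loaded; `ParsedTask`, `SyncPlan`, and `planSync` are hypothetical names, and executing the plan inside one transaction is left to the DB layer.

```typescript
// One parsed task file, reduced to what the plan needs.
interface ParsedTask {
  slug: string; // frontmatter `id`
  dependsOn: string[]; // frontmatter `depends_on`
}

interface SyncPlan {
  upsertSlugs: string[]; // step 4: upsert by (projectId, slug)
  deleteSlugs: string[]; // rows whose file no longer exists
  edges: { dependentSlug: string; dependsOnSlug: string }[]; // step 5
}

// Computes the full-replacement plan: every file is upserted, every DB row
// without a file is deleted, and edges come only from the files.
function planSync(parsed: ParsedTask[], existingSlugs: string[]): SyncPlan {
  const incoming = new Set(parsed.map((task) => task.slug));
  return {
    upsertSlugs: Array.from(incoming),
    deleteSlugs: existingSlugs.filter((slug) => !incoming.has(slug)),
    edges: parsed.flatMap((task) =>
      task.dependsOn.map((dependsOnSlug) => ({
        dependentSlug: task.slug,
        dependsOnSlug,
      })),
    ),
  };
}
```

Separating planning from execution keeps the transaction short and makes the delete-on-missing-file rule easy to unit test.
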
### DB → Files (Export)

When graph analysis is needed, export DB rows back to markdown files:

1. Query `tasks` + `task_dependencies` for a project
2. For each task, generate markdown with YAML frontmatter + body
3. Write to `tasks/` directory structure (using `path` to determine subdirectory)
4. Run `taskgraph validate`, `taskgraph risk-path`, etc.

This is a manual step — "I want to run analysis now" — not an automatic sync.

### Sync Error Handling

| Error | Behavior |
|-------|----------|
| Invalid YAML frontmatter | Skip file, log warning with file path and parse error. Continue with remaining files. |
| Missing required `id` or `name` field | Skip file, log warning. Task cannot be synced without these fields. |
| `depends_on` references non-existent slug within project | Insert the dependency edge anyway (dangling reference). The coordinator detects and warns about unresolvable dependencies. `taskgraph validate` should be run before sync to catch these. |
| Duplicate `id` (slug) in same project | Fail the sync with a clear error. Slug uniqueness is enforced by the DB constraint `unq_tasks_project_slug`. |
| File removed from filesystem | DELETE the DB row. FK cascade handles dependent rows. Git preserves history. |

**Validation ordering**: Run `taskgraph validate` before sync to catch structural errors (cycles, missing dependencies, duplicate IDs) at the CLI level. The DB sync then handles data-level integrity (unique constraints, FK checks).

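Two of the rows above (skip on missing `id`/`name`, fail on duplicate slug) can be checked on the parsed batch before touching the database. This is a hedged sketch: `RawFile`, `ValidFile`, and `validateBatch` are hypothetical names, and detecting duplicates in application code is an assumption layered on top of the `unq_tasks_project_slug` constraint the document relies on.

```typescript
// Parsed but not yet validated frontmatter for one file.
interface RawFile {
  file: string;
  frontmatter: Record<string, unknown>;
}

interface ValidFile {
  file: string;
  id: string;
  name: string;
}

// Skips files missing required fields (with a warning) and fails the whole
// batch on a duplicate slug, mirroring the error-handling table.
function validateBatch(
  files: RawFile[],
  warn: (message: string) => void = console.warn,
): ValidFile[] {
  const valid: ValidFile[] = [];
  const firstSeenIn = new Map<string, string>(); // slug -> file that used it first
  for (const { file, frontmatter } of files) {
    const { id, name } = frontmatter;
    if (typeof id !== "string" || id === "" || typeof name !== "string" || name === "") {
      warn(`Skipping ${file}: missing required "id" or "name"`);
      continue;
    }
    const earlier = firstSeenIn.get(id);
    if (earlier !== undefined) {
      throw new Error(`Duplicate task id "${id}" in ${earlier} and ${file}`);
    }
    firstSeenIn.set(id, file);
    valid.push({ file, id, name });
  }
  return valid;
}
```
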
## Open Questions

1. **Embeddings**: Task descriptions may benefit from vector embeddings for similarity search. Deferred — the `metadata` JSONB column can hold an embedding reference later, or a separate `task_embeddings` table can be added.

2. **Bulk status updates**: When the coordinator completes a meta task (all sub-tasks done), should it automatically mark the meta task `completed`? Likely yes — this is an application-level operation, not a DB concern.

3. **Cross-project dependencies**: Not supported. Tasks can only depend on other tasks within the same project. Application-layer validation rejects cross-project dependencies; a future DB-level trigger guard is deferred to Phase 2 (see the `task_dependencies` cross-project guard).

4. **Task versioning**: When a task's body is modified (e.g., notes appended), should we keep previous versions? For v1, no — the current body is sufficient. If audit trail is needed, `updatedAt` timestamp + `metadata` revision count could suffice.

## References

- Cost-benefit framework: taskgraph framework docs — why categorical estimates are structurally required
- Workflow guide: taskgraph workflow docs — practical usage patterns
- Task file format: @alkdev/taskgraph README — field definitions
- TaskFrontmatter struct: @alkdev/taskgraph package source — canonical field types and defaults
- taskgraph architecture: taskgraph architecture docs
- Storage pattern: [README.md](./README.md)
- Table reference (cross-cutting): [table-reference.md](./table-reference.md)
- ADR-011: [../../decisions/ADR-011-dual-task-representation.md](../../decisions/ADR-011-dual-task-representation.md)
- @alkdev/taskgraph (runtime graph engine): `@alkdev/taskgraph` npm package