hub/docs/decisions/ADR-011-dual-task-representation.md

ADR-011: Database as source of truth for tasks

  • Status: Accepted
  • Date: 2026-04-19
  • Deciders: alkdev
  • Supersedes: Previous "dual representation" design where files were source of truth for content and DB for state

Context

The SDD process represents tasks as markdown files (compatible with the taskgraph CLI). The hub coordinator needs to query and mutate task state at runtime across multiple parallel worktrees. We need a storage model that serves both authoring and runtime coordination.

Taskgraph's file-based model works well for single-agent, single-worktree workflows. In the hub's multi-agent, multi-worktree environment, files create problems:

  • Parallel worktrees: Agent A marks a task in-progress in their worktree's file. Agent B can't see this — the file lives in A's working directory. The coordinator can't get a consistent view.
  • Merge conflicts: Two agents editing the same task file in different worktrees creates git conflicts on merge.
  • Reliable coordination: The coordinator needs to query "which tasks are pending?" without scanning filesystems across worktrees.
  • Atomic mutations: Status changes must be immediately visible to all agents, not delayed until file merges.

Three options were considered:

  1. Files only — The coordinator runs taskgraph CLI commands via bash to query status. Agents edit files directly.
  2. Database only — Tasks are stored exclusively in Postgres. No markdown files.
  3. Database as source of truth, files as authoring surface — The DB is the authoritative runtime representation. Markdown files serve as the Decomposer's authoring format, ingested to DB via sync. Taskgraph CLI used for offline analysis via DB export.

Decision

We choose Option 3: Database as source of truth, files as authoring surface.

Authority Model

| Aspect | Authority | Why |
| --- | --- | --- |
| All task fields (structure, categorical estimates, metadata) | DB | Every taskgraph frontmatter field maps to a dedicated DB column. Queryable, concurrent-safe, consistent. |
| Task specification (body) | DB (body column) | Stored as markdown text. Agents append notes during execution. |
| Task creation/authoring | Files → sync → DB | Decomposer edits markdown files; sync ingests them into DB. |
| Runtime status mutations | DB (hub operations) | hub.task.* operations ensure all agents see consistent state. |
| Offline graph analysis | Files (taskgraph CLI) | Export from DB when needed for taskgraph risk-path etc. |
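
To illustrate why routing status mutations through the DB gives every agent the same view, here is a minimal sketch of a compare-and-set claim. It uses an in-memory map as a stand-in for the tasks table; the `TaskStore` class, the `claim` method, and the `"completed"` status value are hypothetical and are not taken from the hub's actual hub.task.* operations.

```typescript
// Hypothetical in-memory stand-in for the tasks table (a real hub would use Postgres).
type TaskStatus = "pending" | "in-progress" | "completed"; // "completed" is an assumed value

interface TaskState {
  id: string;
  status: TaskStatus;
  assignee: string | null;
}

class TaskStore {
  private rows = new Map<string, TaskState>();

  insert(id: string): void {
    this.rows.set(id, { id, status: "pending", assignee: null });
  }

  // Compare-and-set: the claim succeeds only while the task is still pending.
  // In SQL this would roughly correspond to a conditional
  // UPDATE ... WHERE id = $1 AND status = 'pending'.
  claim(id: string, agent: string): boolean {
    const row = this.rows.get(id);
    if (!row || row.status !== "pending") return false;
    row.status = "in-progress";
    row.assignee = agent;
    return true;
  }

  // The coordinator's "which tasks are pending?" query, with no filesystem scan.
  pending(): string[] {
    return [...this.rows.values()]
      .filter((r) => r.status === "pending")
      .map((r) => r.id);
  }
}

const store = new TaskStore();
store.insert("storage/001");
store.insert("storage/002");

const firstClaim = store.claim("storage/001", "agent-a");
const secondClaim = store.claim("storage/001", "agent-b"); // rejected: already claimed
```

Because both agents go through the same store, the second claim is rejected immediately rather than surfacing later as a merge conflict.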

Key Design Principles

  1. Every taskgraph frontmatter field is a proper DB column — no fields relegated to JSONB metadata. priority, assignee, dueAt, tags get dedicated columns because they're queryable and filterable in coordinator workflows.

  2. Categorical fields are nullable, not NOT NULL with defaults: scope, risk, impact, and level are nullable (NULL = not yet assessed). This preserves the distinction between "deliberately assessed as low" and "nobody filled this in." Taskgraph itself uses Option<TaskScope> etc.

  3. No parentId — Grouping is handled by path (a nullable text column for scoped queries like WHERE path LIKE 'implementation/%'). Dependencies are in task_dependencies. These are separate concepts.

  4. No removedAt soft delete — When a task file is removed, the sync DELETEs the DB row. Git history preserves file-level history. No DB duplication needed.

  5. fileCreatedAt/fileModifiedAt — Dedicated columns for frontmatter timestamps, separate from DB createdAt/updatedAt (row lifecycle times).
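
Principles 2, 3, and 5 can be sketched as a row type plus two queries. This is an illustration under assumptions: the `TaskRow` shape is trimmed to a few columns, the categorical fields are typed as plain strings because the real value sets come from taskgraph's enums, and the helpers `underPath` and `needsRiskAssessment` are hypothetical names.

```typescript
// Trimmed, hypothetical view of a task row; the real table has more columns.
interface TaskRow {
  id: string;
  path: string | null;         // grouping, not a parent pointer
  scope: string | null;        // null = not yet assessed
  risk: string | null;
  impact: string | null;
  level: string | null;
  fileCreatedAt: Date | null;  // timestamp from frontmatter
  createdAt: Date;             // row lifecycle time
}

// Mirrors WHERE path LIKE 'implementation/%': rows with a NULL path never match.
function underPath(rows: TaskRow[], prefix: string): TaskRow[] {
  return rows.filter((r) => r.path !== null && r.path.startsWith(prefix));
}

// Mirrors WHERE risk IS NULL: finds tasks nobody has assessed yet.
function needsRiskAssessment(rows: TaskRow[]): TaskRow[] {
  return rows.filter((r) => r.risk === null);
}

const now = new Date();
const rows: TaskRow[] = [
  { id: "a", path: "implementation/storage", scope: null, risk: "low", impact: null, level: null, fileCreatedAt: null, createdAt: now },
  { id: "b", path: "implementation/api", scope: null, risk: null, impact: null, level: null, fileCreatedAt: null, createdAt: now },
  { id: "c", path: null, scope: null, risk: null, impact: null, level: null, fileCreatedAt: null, createdAt: now },
];

const implementationIds = underPath(rows, "implementation/").map((r) => r.id);
const unassessedIds = needsRiskAssessment(rows).map((r) => r.id);
```

Task `a` was deliberately assessed as low risk, so it is absent from `unassessedIds`. A NOT NULL column with a default of "low" would make `a` and `b` indistinguishable.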

Consequences

Positive:

  • Coordinator gets a reliable, consistent view of all task state across parallel worktrees.
  • No merge conflicts from agents editing the same file in different worktrees.
  • Status changes are atomic and immediately visible to all agents via hub operations.
  • All taskgraph fields are queryable with proper SQL types and indexes.
  • Taskgraph CLI still works for offline analysis via DB → file export.
  • Nullable categorical fields provide the "not yet assessed" signal that defaults hide.

Negative:

  • Two representations exist (files and DB), requiring a sync operation.
  • Files are no longer the source of truth — they're the authoring surface. This is a conceptual shift from taskgraph's default model.
  • DB → file export is needed for offline analysis (not automatic).
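
The manual export step might look roughly like the following sketch, which renders one DB row back into a markdown file with frontmatter. The `ExportableTask` shape, the `toMarkdown` helper, and the exact frontmatter layout are assumptions for illustration; the real field names and format are defined by taskgraph's TaskFrontmatter.

```typescript
// Hypothetical subset of a task row, as needed for export.
interface ExportableTask {
  id: string;
  status: string;
  scope: string | null;
  risk: string | null;
  body: string; // markdown text from the body column
}

// Render a row as markdown with a frontmatter header.
// NULL columns are omitted so "not yet assessed" survives the round trip.
function toMarkdown(task: ExportableTask): string {
  const fields: [string, string | null][] = [
    ["id", task.id],
    ["status", task.status],
    ["scope", task.scope],
    ["risk", task.risk],
  ];
  const frontmatter = fields
    .filter(([, value]) => value !== null)
    .map(([key, value]) => `${key}: ${value}`)
    .join("\n");
  return `---\n${frontmatter}\n---\n\n${task.body}\n`;
}

const exported = toMarkdown({
  id: "t1",
  status: "pending",
  scope: null,
  risk: "low",
  body: "Do the thing.",
});
```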

Mitigation for negatives:

  • Sync is idempotent and can be run at any time after authoring.
  • The DB is the authority; files are just one input method. Tasks can also be created via hub API.
  • Export for offline analysis is a manual step (run when needed), not a continuous sync.
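
A minimal sketch of what an idempotent sync could look like, using an in-memory map in place of Postgres. The `FileTask` and `DbTask` shapes and the `syncTasks` function are hypothetical. The sketch assumes every row originated from a file, and that sync updates authored fields while leaving runtime status untouched; the source does not spell out either rule.

```typescript
interface FileTask {
  id: string;
  priority: string | null; // an authored frontmatter field
}

interface DbTask {
  id: string;
  priority: string | null;
  status: string; // runtime state, owned by the DB
}

// Ingest file tasks into the DB and delete rows whose file is gone.
// Running it twice with the same input changes nothing the second time.
function syncTasks(
  files: FileTask[],
  db: Map<string, DbTask>,
): { changed: number; deleted: number } {
  const seen = new Set<string>();
  let changed = 0;
  for (const file of files) {
    seen.add(file.id);
    const existing = db.get(file.id);
    if (!existing) {
      db.set(file.id, { id: file.id, priority: file.priority, status: "pending" });
      changed++;
    } else if (existing.priority !== file.priority) {
      existing.priority = file.priority; // status is deliberately left alone
      changed++;
    }
  }
  let deleted = 0;
  for (const id of [...db.keys()]) {
    if (!seen.has(id)) {
      db.delete(id); // hard delete; git history keeps the file's past
      deleted++;
    }
  }
  return { changed, deleted };
}

const db = new Map<string, DbTask>([
  ["old", { id: "old", priority: null, status: "pending" }],
  ["a", { id: "a", priority: "low", status: "in-progress" }],
]);
const files: FileTask[] = [
  { id: "a", priority: "high" },
  { id: "b", priority: null },
];

const firstRun = syncTasks(files, db);
const secondRun = syncTasks(files, db);
```

The first run updates `a`, creates `b`, and deletes `old`. The second run reports no changes, which is what makes it safe to run at any time after authoring.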

References

  • ADR-001: JSONB data columns vs individual columns (same principle: proper columns for queryable fields)
  • Cost-benefit framework: taskgraph framework docs
  • Task storage: docs/architecture/storage/tasks.md
  • taskgraph TaskFrontmatter: taskgraph source