# ADR-011: Database as source of truth for tasks

- **Status**: Accepted
- **Date**: 2026-04-19
- **Deciders**: alkdev
- **Supersedes**: Previous "dual representation" design where files were source of truth for content and DB for state

## Context

The SDD process uses tasks as markdown files (compatible with the `taskgraph` CLI). The hub coordinator needs to query and mutate task state at runtime across multiple parallel worktrees. We need a storage model that serves both authoring and runtime coordination.

Taskgraph's file-based model works well for single-agent, single-worktree workflows. In the hub's multi-agent, multi-worktree environment, files create problems:

- **Parallel worktrees**: Agent A marks a task `in-progress` in their worktree's file. Agent B can't see this because the file lives in A's working directory. The coordinator can't get a consistent view.
- **Merge conflicts**: Two agents editing the same task file in different worktrees creates git conflicts on merge.
- **Reliable coordination**: The coordinator needs to query "which tasks are pending?" without scanning filesystems across worktrees.
- **Atomic mutations**: Status changes must be immediately visible to all agents, not delayed until file merges.

Three options were considered:

1. **Files only**: The coordinator runs `taskgraph` CLI commands via bash to query status. Agents edit files directly.
2. **Database only**: Tasks are stored exclusively in Postgres. No markdown files.
3. **Database as source of truth, files as authoring surface**: The DB is the authoritative runtime representation. Markdown files serve as the Decomposer's authoring format, ingested to DB via sync. Taskgraph CLI is used for offline analysis via DB export.

## Decision

We choose **Option 3: Database as source of truth, files as authoring surface**.
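To make the coordination argument concrete, here is a minimal sketch of what a single shared table gives the coordinator: one query for pending tasks, and a status change that either succeeds or fails atomically. It uses Python's built-in `sqlite3` in memory purely as a runnable stand-in for Postgres. The two-column `tasks` table, the task ids, and the helper names `claim_task` and `pending_tasks` are assumptions made for illustration, not the hub's actual schema or API.

```python
import sqlite3

# Hypothetical, heavily reduced table; SQLite stands in for Postgres here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO tasks (id, status) VALUES (?, ?)",
    [("auth-setup", "pending"), ("db-schema", "pending"), ("api-routes", "pending")],
)
conn.commit()


def claim_task(conn: sqlite3.Connection, task_id: str) -> bool:
    """Move a task from pending to in-progress; return True only if this call did it."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE tasks SET status = 'in-progress' "
            "WHERE id = ? AND status = 'pending'",
            (task_id,),
        )
    return cur.rowcount == 1


def pending_tasks(conn: sqlite3.Connection) -> list[str]:
    """Answer 'which tasks are pending?' without touching any worktree."""
    rows = conn.execute("SELECT id FROM tasks WHERE status = 'pending' ORDER BY id")
    return [row[0] for row in rows]


# Two agents race for the same task: only the first claim succeeds.
first = claim_task(conn, "auth-setup")
second = claim_task(conn, "auth-setup")
```

The conditional `WHERE ... AND status = 'pending'` is the part that file edits in separate worktrees cannot provide: the second claimant learns immediately that the task is taken, rather than at merge time.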
### Authority Model

| Aspect | Authority | Why |
|--------|-----------|-----|
| All task fields (structure, categorical estimates, metadata) | **DB** | Every taskgraph frontmatter field maps to a dedicated DB column. Queryable, concurrent-safe, consistent. |
| Task specification (body) | **DB** (`body` column) | Stored as markdown text. Agents append notes during execution. |
| Task creation/authoring | **Files** → sync → DB | Decomposer edits markdown files; sync ingests them into DB. |
| Runtime status mutations | **DB** (hub operations) | `hub.task.*` operations ensure all agents see consistent state. |
| Offline graph analysis | **Files** (taskgraph CLI) | Export from DB when needed for `taskgraph risk-path` etc. |

### Key Design Principles

1. **Every taskgraph frontmatter field is a proper DB column**: no fields are relegated to JSONB `metadata`. `priority`, `assignee`, `dueAt`, `tags` get dedicated columns because they're queryable and filterable in coordinator workflows.
2. **Categorical fields are nullable, not NOT NULL with defaults**: `scope`, `risk`, `impact`, `level` are nullable (NULL = not yet assessed). This preserves the distinction between "deliberately assessed as low" and "nobody filled this in." Taskgraph itself represents these fields with `Option` types.
3. **No `parentId`**: Grouping is handled by `path` (a nullable text column for scoped queries like `WHERE path LIKE 'implementation/%'`). Dependencies are in `task_dependencies`. These are separate concepts.
4. **No `removedAt` soft delete**: When a task file is removed, the sync DELETEs the DB row. Git history preserves file-level history, so no DB duplication is needed.
5. **`fileCreatedAt`/`fileModifiedAt`**: Dedicated columns for frontmatter timestamps, separate from DB `createdAt`/`updatedAt` (row lifecycle times).

## Consequences

**Positive**:
- Coordinator gets a reliable, consistent view of all task state across parallel worktrees.
- No merge conflicts from agents editing the same file in different worktrees.
- Status changes are atomic and immediately visible to all agents via hub operations.
- All taskgraph fields are queryable with proper SQL types and indexes.
- Taskgraph CLI still works for offline analysis via DB → file export.
- Nullable categorical fields provide the "not yet assessed" signal that defaults hide.

**Negative**:
- Two representations exist (files and DB), requiring a sync operation.
- Files are no longer the source of truth; they're the authoring surface. This is a conceptual shift from taskgraph's default model.
- DB → file export is needed for offline analysis (not automatic).

**Mitigation for negatives**:
- Sync is idempotent and can be run at any time after authoring.
- The DB is the authority; files are just one input method. Tasks can also be created via hub API.
- Export for offline analysis is a manual step (run when needed), not a continuous sync.

## Related

- ADR-001: JSONB data columns vs individual columns (same principle: proper columns for queryable fields)
- Cost-benefit framework: taskgraph framework docs
- Task storage: `docs/architecture/storage/tasks.md`
- taskgraph TaskFrontmatter: taskgraph source
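
## Appendix: illustrative sync sketch

The following is a rough sketch, not the hub's implementation, of how the design principles and the idempotent sync described in this ADR could fit together. It uses Python's built-in `sqlite3` in memory as a stand-in for Postgres. The reduced column set, the task ids, the `files` dictionary (standing in for parsed markdown frontmatter), and the `sync` function are all assumptions made for illustration. The sketch shows nullable categorical columns, `path`-based grouping, an upsert that leaves runtime `status` alone, and hard deletion when a file disappears.

```python
import sqlite3

# Hypothetical reduced schema; the real table has a column for every
# taskgraph frontmatter field.
SCHEMA = """
CREATE TABLE tasks (
    id     TEXT PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'pending',
    path   TEXT,   -- nullable grouping key used instead of a parentId
    scope  TEXT,   -- NULL means "not yet assessed"
    risk   TEXT    -- NULL means "not yet assessed"
)
"""


def sync(conn: sqlite3.Connection, files: dict[str, dict]) -> None:
    """Ingest authored tasks: upsert each one, delete rows whose file is gone."""
    with conn:  # one transaction for the whole sync
        for task_id, fm in files.items():
            conn.execute(
                """
                INSERT INTO tasks (id, path, scope, risk)
                VALUES (:id, :path, :scope, :risk)
                ON CONFLICT(id) DO UPDATE SET
                    path = excluded.path,
                    scope = excluded.scope,
                    risk = excluded.risk
                """,
                {
                    "id": task_id,
                    "path": fm.get("path"),
                    "scope": fm.get("scope"),
                    "risk": fm.get("risk"),
                },
            )
        # No soft delete: a row without a backing file is removed outright.
        existing = [row[0] for row in conn.execute("SELECT id FROM tasks")]
        for task_id in existing:
            if task_id not in files:
                conn.execute("DELETE FROM tasks WHERE id = ?", (task_id,))


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

files = {
    "auth-setup": {"path": "implementation/auth", "risk": "low"},
    "api-routes": {"path": "implementation/api"},  # risk not yet assessed
    "readme": {"path": "docs"},
}
sync(conn, files)
sync(conn, files)  # running it again changes nothing

# Scoped query over the path column.
scoped = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM tasks WHERE path LIKE 'implementation/%' ORDER BY id"
    )
]
```

Because the upsert never writes `status`, re-running the sync after agents have started work does not reset runtime state, which matches the split in the authority model between authoring (files) and status mutations (DB).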