hub/docs/decisions/ADR-011-dual-task-representation.md
# ADR-011: Database as source of truth for tasks
- **Status**: Accepted
- **Date**: 2026-04-19
- **Deciders**: alkdev
- **Supersedes**: Previous "dual representation" design where files were source of truth for content and DB for state
## Context
The SDD process represents tasks as markdown files (compatible with the `taskgraph` CLI). The hub coordinator needs to query and mutate task state at runtime across multiple parallel worktrees. We need a storage model that serves both authoring and runtime coordination.
Taskgraph's file-based model works well for single-agent, single-worktree workflows. In the hub's multi-agent, multi-worktree environment, files create problems:
- **Parallel worktrees**: Agent A marks a task `in-progress` in their worktree's file. Agent B can't see this — the file lives in A's working directory. The coordinator can't get a consistent view.
- **Merge conflicts**: Two agents editing the same task file in different worktrees creates git conflicts on merge.
- **Reliable coordination**: The coordinator needs to query "which tasks are pending?" without scanning filesystems across worktrees.
- **Atomic mutations**: Status changes must be immediately visible to all agents, not delayed until file merges.
Three options were considered:
1. **Files only** — The coordinator runs `taskgraph` CLI commands via bash to query status. Agents edit files directly.
2. **Database only** — Tasks are stored exclusively in Postgres. No markdown files.
3. **Database as source of truth, files as authoring surface** — The DB is the authoritative runtime representation. Markdown files serve as the Decomposer's authoring format, ingested to DB via sync. Taskgraph CLI used for offline analysis via DB export.
## Decision
We choose **Option 3: Database as source of truth, files as authoring surface**.
### Authority Model
| Aspect | Authority | Why |
|--------|-----------|-----|
| All task fields (structure, categorical estimates, metadata) | **DB** | Every taskgraph frontmatter field maps to a dedicated DB column. Queryable, concurrent-safe, consistent. |
| Task specification (body) | **DB** (`body` column) | Stored as markdown text. Agents append notes during execution. |
| Task creation/authoring | **Files** → sync → DB | Decomposer edits markdown files; sync ingests them into DB. |
| Runtime status mutations | **DB** (hub operations) | `hub.task.*` operations ensure all agents see consistent state. |
| Offline graph analysis | **Files** (taskgraph CLI) | Export from DB when needed for `taskgraph risk-path` etc. |
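
To illustrate why runtime status mutations go through a single authority, here is a minimal sketch of a compare-and-set claim. It uses an in-memory map as a stand-in for the tasks table; the `TaskStore` class, its method names, and the task IDs are hypothetical and are not the actual `hub.task.*` API.

```typescript
// Only the two status values mentioned in this ADR; the real set is likely larger.
type TaskStatus = "pending" | "in-progress";

interface TaskState {
  id: string;
  status: TaskStatus;
  assignee: string | null;
}

// In-memory stand-in for the tasks table (hypothetical). A Postgres-backed
// version could express the same check as a conditional UPDATE.
class TaskStore {
  private rows = new Map<string, TaskState>();

  insert(id: string): void {
    this.rows.set(id, { id, status: "pending", assignee: null });
  }

  // Compare-and-set: the claim succeeds only if the task is still pending.
  claim(id: string, agent: string): boolean {
    const row = this.rows.get(id);
    if (row === undefined || row.status !== "pending") return false;
    row.status = "in-progress";
    row.assignee = agent;
    return true;
  }

  // The coordinator's "which tasks are pending?" query.
  pending(): string[] {
    return Array.from(this.rows.values())
      .filter((r) => r.status === "pending")
      .map((r) => r.id);
  }
}

const store = new TaskStore();
store.insert("task-a");
store.insert("task-b");

const firstClaim = store.claim("task-a", "agent-1");  // succeeds
const secondClaim = store.claim("task-a", "agent-2"); // rejected: already claimed
const stillPending = store.pending();                 // only task-b remains
```

Because both agents talk to the same store, the second claim is rejected immediately rather than surfacing later as a merge conflict.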
### Key Design Principles
1. **Every taskgraph frontmatter field is a proper DB column** — no fields relegated to JSONB `metadata`. `priority`, `assignee`, `dueAt`, `tags` get dedicated columns because they're queryable and filterable in coordinator workflows.
2. **Categorical fields are nullable, not NOT NULL with defaults**: `scope`, `risk`, `impact`, `level` are nullable (NULL = not yet assessed). This preserves the distinction between "deliberately assessed as low" and "nobody filled this in." Taskgraph itself uses `Option<TaskScope>` etc.
3. **No `parentId`** — Grouping is handled by `path` (a nullable text column for scoped queries like `WHERE path LIKE 'implementation/%'`). Dependencies are in `task_dependencies`. These are separate concepts.
4. **No `removedAt` soft delete** — When a task file is removed, the sync DELETEs the DB row. Git history preserves file-level history. No DB duplication needed.
5. **`fileCreatedAt`/`fileModifiedAt`** — Dedicated columns for frontmatter timestamps, separate from DB `createdAt`/`updatedAt` (row lifecycle times).
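
Principles 2 and 3 can be sketched as a row type. This is an illustrative shape only: the `Rating` values, the reduced field set, and the helper names are assumptions, not the actual schema in `docs/architecture/storage/tasks.md`.

```typescript
// Hypothetical categorical values; the real taskgraph enums may differ.
type Rating = "low" | "medium" | "high";

// Reduced, illustrative row shape. Note there is no parentId field.
interface TaskRow {
  id: string;
  path: string | null;  // grouping only; dependencies live in task_dependencies
  risk: Rating | null;  // null means "not yet assessed", not "low"
}

// Keeps "assessed as low" distinct from "nobody filled this in".
function describeRisk(row: TaskRow): string {
  return row.risk === null ? "not yet assessed" : row.risk;
}

// Mirrors a scoped query such as WHERE path LIKE 'implementation/%'.
// Rows with a null path never match, as with LIKE on NULL in SQL.
function tasksUnderPath(rows: TaskRow[], prefix: string): TaskRow[] {
  return rows.filter((r) => r.path !== null && r.path.startsWith(prefix));
}

const assessed: TaskRow = { id: "t1", path: "implementation/storage", risk: "low" };
const unassessed: TaskRow = { id: "t2", path: "implementation/api", risk: null };
const ungrouped: TaskRow = { id: "t3", path: null, risk: "high" };

const scoped = tasksUnderPath([assessed, unassessed, ungrouped], "implementation/");
const scopedIds = scoped.map((r) => r.id);
```

With a `NOT NULL DEFAULT 'low'` column, `t1` and `t2` would be indistinguishable; the nullable type keeps them apart.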
## Consequences
**Positive**:
- Coordinator gets a reliable, consistent view of all task state across parallel worktrees.
- No merge conflicts from agents editing the same file in different worktrees.
- Status changes are atomic and immediately visible to all agents via hub operations.
- All taskgraph fields are queryable with proper SQL types and indexes.
- Taskgraph CLI still works for offline analysis via DB → file export.
- Nullable categorical fields provide the "not yet assessed" signal that defaults hide.
**Negative**:
- Two representations exist (files and DB), requiring a sync operation.
- Files are no longer the source of truth — they're the authoring surface. This is a conceptual shift from taskgraph's default model.
- DB → file export is needed for offline analysis (not automatic).
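
A rough sketch of what the DB → file export might look like follows. It assumes YAML-style frontmatter delimited by `---` and a reduced, hypothetical field set (`id`, `risk`, `impact`); the actual taskgraph frontmatter layout is not specified in this ADR.

```typescript
// Hypothetical subset of a task row used for export.
interface ExportTask {
  id: string;
  risk: string | null;
  impact: string | null;
  body: string;
}

// Renders one task as a markdown file. Null (unassessed) fields are omitted
// rather than written with a default, so the export does not invent assessments.
function toMarkdown(task: ExportTask): string {
  const frontmatter: string[] = [`id: ${task.id}`];
  if (task.risk !== null) frontmatter.push(`risk: ${task.risk}`);
  if (task.impact !== null) frontmatter.push(`impact: ${task.impact}`);
  return ["---", ...frontmatter, "---", "", task.body].join("\n");
}

const exported = toMarkdown({
  id: "t1",
  risk: "low",
  impact: null,
  body: "Do the thing.",
});
```

Omitting null fields keeps the round trip honest: a later sync of the exported file would still see `impact` as unassessed.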
**Mitigation for negatives**:
- Sync is idempotent and can be run at any time after authoring.
- The DB is the authority; files are just one input method. Tasks can also be created via hub API.
- Export for offline analysis is a manual step (run when needed), not a continuous sync.
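
The idempotency claim can be made concrete with a minimal sketch of the file → DB sync. It assumes every DB row originated from a file and compares only a hypothetical `title` field; a real sync would also need to leave API-created tasks and runtime notes appended to `body` untouched. The `FileTask` type and `syncTasks` function are illustrative names.

```typescript
// Hypothetical parsed representation of one task file.
interface FileTask {
  id: string;
  title: string;
}

interface SyncResult {
  upserted: number;
  deleted: number;
}

// Brings the DB (a Map stand-in) in line with the set of task files:
// upsert new or changed tasks, hard-delete rows whose file is gone.
function syncTasks(files: FileTask[], db: Map<string, FileTask>): SyncResult {
  let upserted = 0;
  let deleted = 0;
  const seen = new Set<string>();

  for (const file of files) {
    seen.add(file.id);
    const existing = db.get(file.id);
    if (existing === undefined || existing.title !== file.title) {
      db.set(file.id, { id: file.id, title: file.title });
      upserted++;
    }
  }

  // Snapshot the keys first so deletion does not interfere with iteration.
  for (const id of Array.from(db.keys())) {
    if (!seen.has(id)) {
      db.delete(id); // no soft delete; git history keeps the file history
      deleted++;
    }
  }
  return { upserted, deleted };
}

const db = new Map<string, FileTask>([
  ["stale", { id: "stale", title: "Removed from disk" }],
]);
const files: FileTask[] = [
  { id: "a", title: "Design schema" },
  { id: "b", title: "Write sync" },
];

const firstRun = syncTasks(files, db);  // applies two upserts and one delete
const secondRun = syncTasks(files, db); // nothing left to change
```

Running the sync a second time against unchanged files is a no-op, which is what makes it safe to run at any time after authoring.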
## Related
- ADR-001: JSONB data columns vs individual columns (same principle — proper columns for queryable fields)
- Cost-benefit framework: taskgraph framework docs
- Task storage: `docs/architecture/storage/tasks.md`
- taskgraph TaskFrontmatter: taskgraph source