ADR-011: Database as source of truth for tasks
- Status: Accepted
- Date: 2026-04-19
- Deciders: alkdev
- Supersedes: Previous "dual representation" design where files were source of truth for content and DB for state
Context
The SDD process uses tasks as markdown files (compatible with the taskgraph CLI). The hub coordinator needs to query and mutate task state at runtime across multiple parallel worktrees. We need a storage model that serves both authoring and runtime coordination.
Taskgraph's file-based model works well for single-agent, single-worktree workflows. In the hub's multi-agent, multi-worktree environment, files create problems:
- Parallel worktrees: Agent A marks a task `in-progress` in their worktree's file. Agent B can't see this because the file lives in A's working directory. The coordinator can't get a consistent view.
- Merge conflicts: Two agents editing the same task file in different worktrees create git conflicts on merge.
- Reliable coordination: The coordinator needs to query "which tasks are pending?" without scanning filesystems across worktrees.
- Atomic mutations: Status changes must be immediately visible to all agents, not delayed until file merges.
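The atomic-mutation requirement can be illustrated with a compare-and-set claim. This is a minimal sketch, not the hub's actual API: `TaskStore`, `claim`, and `pending` are hypothetical names, and an in-memory map stands in for Postgres. Against a real database the same idea would be a conditional update, assuming a `tasks` table with a `status` column.

```typescript
// Hypothetical in-memory stand-in for the task table (not the real hub API).
class TaskStore {
  private status = new Map<string, string>();

  // New tasks start as "pending" (assumed default for this sketch).
  add(id: string): void {
    this.status.set(id, "pending");
  }

  // Compare-and-set: the claim succeeds only if the task is still pending.
  // Assumed SQL equivalent:
  //   UPDATE tasks SET status = 'in-progress'
  //   WHERE id = $1 AND status = 'pending'
  claim(id: string): boolean {
    if (this.status.get(id) !== "pending") return false;
    this.status.set(id, "in-progress");
    return true;
  }

  // Answers "which tasks are pending?" from one place, with no filesystem scan.
  pending(): string[] {
    return Array.from(this.status.entries())
      .filter(([, s]) => s === "pending")
      .map(([id]) => id);
  }
}

const store = new TaskStore();
store.add("task-1");
store.add("task-2");
const agentA = store.claim("task-1"); // first claim wins
const agentB = store.claim("task-1"); // second claim is rejected
console.log(agentA, agentB, store.pending());
```

Because every agent reads and writes the same store, the second claim fails immediately instead of surfacing later as a merge conflict.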
Three options were considered:
1. Files only: The coordinator runs `taskgraph` CLI commands via bash to query status. Agents edit files directly.
2. Database only: Tasks are stored exclusively in Postgres. No markdown files.
3. Database as source of truth, files as authoring surface: The DB is the authoritative runtime representation. Markdown files serve as the Decomposer's authoring format, ingested to DB via sync. Taskgraph CLI is used for offline analysis via DB export.
Decision
We choose Option 3: Database as source of truth, files as authoring surface.
Authority Model
| Aspect | Authority | Why |
|---|---|---|
| All task fields (structure, categorical estimates, metadata) | DB | Every taskgraph frontmatter field maps to a dedicated DB column. Queryable, concurrent-safe, consistent. |
| Task specification (body) | DB (`body` column) | Stored as markdown text. Agents append notes during execution. |
| Task creation/authoring | Files → sync → DB | Decomposer edits markdown files; sync ingests them into DB. |
| Runtime status mutations | DB (hub operations) | hub.task.* operations ensure all agents see consistent state. |
| Offline graph analysis | Files (taskgraph CLI) | Export from DB when needed for taskgraph risk-path etc. |
Key Design Principles
- Every taskgraph frontmatter field is a proper DB column: no fields are relegated to JSONB `metadata`. `priority`, `assignee`, `dueAt`, and `tags` get dedicated columns because they're queryable and filterable in coordinator workflows.
- Categorical fields are nullable, not NOT NULL with defaults: `scope`, `risk`, `impact`, and `level` are nullable (NULL = not yet assessed). This preserves the distinction between "deliberately assessed as low" and "nobody filled this in." Taskgraph itself uses `Option<TaskScope>` etc.
- No `parentId`: Grouping is handled by `path` (a nullable text column for scoped queries like `WHERE path LIKE 'implementation/%'`). Dependencies are in `task_dependencies`. These are separate concepts.
- No `removedAt` soft delete: When a task file is removed, the sync DELETEs the DB row. Git history preserves file-level history. No DB duplication needed.
- `fileCreatedAt` / `fileModifiedAt`: Dedicated columns for frontmatter timestamps, separate from DB `createdAt` / `updatedAt` (row lifecycle times).
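The principles above could translate into a row type along these lines. This is a sketch under assumptions: the column names come from this ADR, but the value types (plain strings for categorical fields, `Date` for `dueAt`) and the helpers `unassessedFields` and `inPathScope` are hypothetical, not taken from the actual schema.

```typescript
// Hypothetical row shape: column names follow the ADR, value types are assumed.
interface TaskRow {
  id: string;
  path: string | null; // grouping only; there is no parentId
  scope: string | null; // null = not yet assessed
  risk: string | null;
  impact: string | null;
  level: string | null;
  priority: string | null;
  assignee: string | null;
  dueAt: Date | null;
  tags: string[];
}

const CATEGORICAL = ["scope", "risk", "impact", "level"] as const;

// Lists the categorical fields nobody has assessed yet.
// An explicit "low" is an assessment and is therefore not reported.
function unassessedFields(task: TaskRow): string[] {
  return CATEGORICAL.filter((field) => task[field] === null);
}

// In-memory equivalent of: WHERE path LIKE '<prefix>%'
function inPathScope(tasks: TaskRow[], prefix: string): TaskRow[] {
  return tasks.filter((t) => t.path !== null && t.path.startsWith(prefix));
}

const example: TaskRow = {
  id: "t1",
  path: "implementation/storage",
  scope: null,
  risk: "low",
  impact: null,
  level: null,
  priority: null,
  assignee: null,
  dueAt: null,
  tags: [],
};
console.log(unassessedFields(example));
console.log(inPathScope([example], "implementation/").length);
```

With NOT NULL defaults, `risk: "low"` and an unfilled `risk` would be indistinguishable; keeping the fields nullable is what lets `unassessedFields` report the gap.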
Consequences
Positive:
- Coordinator gets a reliable, consistent view of all task state across parallel worktrees.
- No merge conflicts from agents editing the same file in different worktrees.
- Status changes are atomic and immediately visible to all agents via hub operations.
- All taskgraph fields are queryable with proper SQL types and indexes.
- Taskgraph CLI still works for offline analysis via DB → file export.
- Nullable categorical fields provide the "not yet assessed" signal that defaults hide.
Negative:
- Two representations exist (files and DB), requiring a sync operation.
- Files are no longer the source of truth — they're the authoring surface. This is a conceptual shift from taskgraph's default model.
- DB → file export is needed for offline analysis (not automatic).
Mitigation for negatives:
- Sync is idempotent and can be run at any time after authoring.
- The DB is the authority; files are just one input method. Tasks can also be created via hub API.
- Export for offline analysis is a manual step (run when needed), not a continuous sync.
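An idempotent sync of the kind described above might look like the following. It is a sketch, not the real sync: `TaskFile`, `TaskRecord`, and `syncTasks` are hypothetical names, a `Map` stands in for the database, and it assumes that sync leaves the runtime `status` of existing rows untouched and that new rows start as `pending`. The ADR does not specify either of those behaviors.

```typescript
// Hypothetical shapes: what a parsed task file carries vs. what the DB stores.
interface TaskFile {
  id: string;
  name: string;
  body: string;
}

interface TaskRecord extends TaskFile {
  status: string; // runtime state, owned by the DB
}

// Upserts every authored file and deletes rows whose file no longer exists.
// Running it twice with the same input leaves the store unchanged.
function syncTasks(
  db: Map<string, TaskRecord>,
  files: TaskFile[],
): { upserted: number; deleted: number } {
  const seen = new Set<string>();
  for (const file of files) {
    seen.add(file.id);
    const existing = db.get(file.id);
    // Assumption: authored fields come from the file, status stays with the DB.
    db.set(file.id, { ...file, status: existing?.status ?? "pending" });
  }
  let deleted = 0;
  for (const id of Array.from(db.keys())) {
    if (!seen.has(id)) {
      db.delete(id); // hard delete: there is no removedAt column
      deleted++;
    }
  }
  return { upserted: files.length, deleted };
}

const db = new Map<string, TaskRecord>();
syncTasks(db, [
  { id: "a", name: "Task A", body: "spec" },
  { id: "b", name: "Task B", body: "spec" },
]);
const result = syncTasks(db, [{ id: "a", name: "Task A", body: "revised spec" }]);
console.log(result, db.size);
```

Keying on the task id is what makes a repeat run harmless: the second pass overwrites each row with identical values and finds nothing to delete.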
Related
- ADR-001: JSONB data columns vs individual columns (same principle — proper columns for queryable fields)
- Cost-benefit framework: taskgraph framework docs
- Task storage: `docs/architecture/storage/tasks.md`
- taskgraph TaskFrontmatter: taskgraph source