---
status: active
last_updated: 2026-04-21
review_date: 2026-04-21
reviewer: architect (with 5 subagent reviewers)
scope: docs/architecture/storage/* + docs/decisions/ADR-001 through ADR-012
resolution: pending
---

# Storage Architecture Review: 2026-04-21

Comprehensive review of the storage specification documents (`docs/architecture/storage/`) and related ADRs. Five parallel subagent reviews were conducted, each focused on a domain area. Their findings are consolidated here with deduplication, prioritization, and cross-references.

## Review Sessions (open-memory)

| # | Domain | Session ID |
|---|--------|------------|
| 1 | Identity & Auth | `ses_24f76141effegdhw2bxX2sOvYb` |
| 2 | Sessions & Messages | `ses_24f751efbffeyWo9wb6hAnnj0y` |
| 3 | Services, Spokes, Call Graph | `ses_24f746ebbffeG4jqN3MbK5i9yt` |
| 4 | Tasks & Coordination | `ses_24f7431baffeElbZ3qVHCYQOSv` |
| 5 | Cross-Cutting Concerns | `ses_24f735dbcffea1pN0JCgtPdbt2` |

## Documents Reviewed

- `docs/architecture/storage/README.md` — common pattern, package structure, open questions
- `docs/architecture/storage/table-reference.md` — cross-cutting reference (cascades, indexes, enums, relations)
- `docs/architecture/storage/identity.md` — accounts, organizations, organization_members, api_keys, audit_logs
- `docs/architecture/storage/projects.md` — projects, workspaces
- `docs/architecture/storage/sessions.md` — sessions, messages, parts
- `docs/architecture/storage/roles.md` — roles
- `docs/architecture/storage/services.md` — clients, client_secrets
- `docs/architecture/storage/spokes.md` — spokes, operation_specs
- `docs/architecture/storage/call-graph.md` — call_graph_nodes, call_graph_edges
- `docs/architecture/storage/coordination.md` — mappings, detections
- `docs/architecture/storage/tasks.md` — tasks, task_dependencies
- `docs/decisions/ADR-001` through `ADR-012`

## Summary Statistics

| Severity | Count |
|----------|-------|
| 🔴 Critical | 14 |
| 🟡 Warning | 22 |
| 💡 Suggestion | 17 |

---

## 🔴 Critical Issues

Issues that must be resolved before the storage spec is stabilized. Each represents a concrete inconsistency, data integrity risk, or ambiguity that would cause implementation divergence.

---

### C01. `NOT NULL` + `onDelete: SET NULL` — Contradictory Constraints

**Sessions**: 1, 2, 5
**Files**: `sessions.md:17`, `identity.md:112`, `table-reference.md:80-71`
**Open-memory**: `ses_24f76141effegdhw2bxX2sOvYb`, `ses_24f751efbffeyWo9wb6hAnnj0y`, `ses_24f735dbcffea1pN0JCgtPdbt2`

Two FK columns are declared `NOT NULL` but have `onDelete: SET NULL`. PostgreSQL will reject the DELETE because it cannot nullify a NOT NULL column:

1. **`sessions.accountId`** — `text NOT NULL` (`sessions.md:17`) with `onDelete: SET NULL` (`table-reference.md:80`). Deleting an account that owns sessions fails.
2. **`audit_logs.ownerId`** — `text NOT NULL` (`identity.md:112`) with `onDelete: SET NULL` (`table-reference.md:71`). Deleting an account that has audit entries fails.

**Recommendation**: For each, choose one:
- Make the column **nullable** (if detaching on delete is desired)
- Change cascade to **RESTRICT** (if the FK must always be populated — blocks account deletion)
- Change cascade to **CASCADE** (if deleting dependent records is acceptable)
- Add application-level logic that reassigns/destroys dependents before account deletion

For `audit_logs.ownerId`, RESTRICT may be correct — audit trails should prevent account deletion. For `sessions.accountId`, nullable is likely correct — orphaned sessions (account deleted) are still valuable data.

---

### C02. ADR-003 vs `sessions.md` on Message IDs

**Sessions**: 2, 5
**Files**: `ADR-003`, `sessions.md:42-46`, `table-reference.md:48`
**Open-memory**: `ses_24f751efbffeyWo9wb6hAnnj0y`, `ses_24f735dbcffea1pN0JCgtPdbt2`

ADR-003 explicitly states: *"Parts and messages tables use sortable timestamp-based IDs instead of commonCols.id."* However, `sessions.md` defines the `messages` table using `commonCols` (which provides UUIDv4 via `crypto.randomUUID()`). Only `parts` explicitly uses sortable IDs. `table-reference.md` only mentions parts for sortable IDs.

This is a three-way inconsistency: ADR says both tables, sessions.md does one, table-reference says one. Message ordering is semantically important (the composite index `idx_messages_session_id_created_at_id` on `(session_id, created_at, id)` relies on `created_at` for ordering, making UUIDv4 sortable IDs unnecessary — but this contradicts ADR-003's stated rationale).

**Recommendation**: Either:
- (A) Update `messages` table to use sortable IDs (consistent with ADR-003, eliminates dependency on `created_at` for ordering), **or**
- (B) Amend ADR-003 to state that only `parts` uses sortable IDs, and `messages` relies on the `(session_id, created_at, id)` composite index

---

### C03. Operation Specs: Delete vs. Soft-Deactivation Unresolved

**Sessions**: 3, 5
**Files**: `spokes.md:66`, `table-reference.md:67`, `README.md` Open Question #2
**Open-memory**: `ses_24f746ebbffeG4jqN3MbK5i9yt`, `ses_24f735dbcffea1pN0JCgtPdbt2`

The spoke disconnect lifecycle has three conflicting positions:
- `spokes.md:66`: "Removes the spoke's `operation_specs` rows **(or marks inactive)**" — ambiguous
- `table-reference.md:67`: `operation_specs.spokeId → spokes.id` with **CASCADE** delete
- `README.md` Open Question #2: "DELETE aligns with the ephemeral spoke model" for now

The `operation_specs` table has **no** `active` or `status` column to support soft-deactivation. Crucially, spoke rows are **never deleted** — they're only marked `status: "disconnected"`. This means the CASCADE FK never fires, and there's no mechanism to clean up operation_specs on disconnect. The operation_specs rows remain pointing to a disconnected spoke with no way to deprecate them.

**Recommendation**: Resolve decisively:
- **(A) Hard delete on disconnect**: Add explicit cleanup in the disconnect handler. Remove "or marks inactive" from spokes.md. CASCADE only applies to rare admin spoke-row deletion.
- **(B) Add active/status column to operation_specs**: Support soft-deactivation. Update cascade rationale. This preserves the operation registry for audit/reconnection but adds schema complexity.

Option A aligns with the ephemeral spoke model. Option B supports spoke reconnection. Choose one and update all documents.

---

### C04. `parts.sessionId` Denormalization: No Enforcement Mechanism

**Sessions**: 2
**Files**: `sessions.md:96`, `sessions.md:105`
**Open-memory**: `ses_24f751efbffeyWo9wb6hAnnj0y`

The stated invariant: *"when inserting a part, always set `sessionId` to the message's `sessionId`. Never update `messages.sessionId` without updating all child parts."* However:
- No DB trigger enforces this
- No application-level transaction pattern is documented
- No CHECK constraint exists
- If `messages.sessionId` could change, there's a race condition window

**Recommendation**: Document that `sessionId` on both `messages` and `parts` is **immutable after creation** (which eliminates the update problem). Define the application-level contract for part insertion: read the message's `sessionId` and set it on the part within the same transaction. Add an explicit "IMMUTABLE" note to the `sessionId` column in `sessions.md`.

---

### C05. `sessions.roleName` — No FK, No Validation Strategy Documented

**Sessions**: 2
**Files**: `sessions.md:26`, `table-reference.md:100-101`, `roles.md`
**Open-memory**: `ses_24f751efbffeyWo9wb6hAnnj0y`

`sessions.roleName` is bare `text` with no FK to `roles.name` and no documented reason why. Is this intentional (to support file-based roles in Phase 1)? What happens if the role name has a typo? What about sessions referencing a role that was deleted?

**Recommendation**: Either:
- (A) Add `FK → roles.name` with `onDelete: SET NULL` (role deletions detach sessions), **or**
- (B) Document why the FK is intentionally omitted: "role definitions may come from `.opencode/agents/*.md` files before DB sync; application-level validation checks against known role names at session creation time."

---

### C06. `mappings.task` Denormalized Column: No Sync Strategy

**Sessions**: 4
**Files**: `coordination.md:22`, `tasks.md:209`
**Open-memory**: `ses_24f7431baffeElbZ3qVHCYQOSv`

The `mappings` table has both `taskId` (FK → tasks.id) and `task` (denormalized display name). No mechanism keeps them in sync. If `taskId` points to a task whose `slug` or `name` changes, `mappings.task` becomes stale. When `taskId` is SET NULL (task deleted), what happens to `task`?

**Recommendation**: Document the invariant: "`mappings.task` is set to `tasks.slug` at insert time and is **not** automatically updated when the task's slug changes. When `taskId` is SET NULL (task deleted), `task` should also be SET NULL. This is a cache, not a source of truth." Alternatively, remove the denormalized column and use a VIEW that joins.

---

### C07. Sync vs. Runtime Field Conflict in Tasks

**Sessions**: 4
**Files**: `tasks.md:296-325`
**Open-memory**: `ses_24f7431baffeElbZ3qVHCYQOSv`

The task sync does a full upsert, but the Authority Model says runtime status mutations go through `hub.task.updateStatus`. If sync blindly writes frontmatter `status`, it can clobber runtime state. Example:
1. Agent sets `task.status = 'in-progress'` via `hub.task.updateStatus`
2. Decomposer edits the task file (still has `status: pending`)
3. Sync runs and upserts the task — overwrites `in-progress` back to `pending`

**Recommendation**: Define the sync field split explicitly: "Sync upserts **authored fields** (slug, name, path, scope, risk, impact, level, priority, tags, assignee, due, body, fileCreatedAt, fileModifiedAt, depends_on) and must **not overwrite runtime-managed fields** (status, startedAt, completedAt). Runtime fields are only mutated via `hub.task.*` operations." Update the sync flow specification in tasks.md.

---

### C08. Concurrent `task.body` Appends: No Collision Handling

**Sessions**: 4
**Files**: `tasks.md:249-266`
**Open-memory**: `ses_24f7431baffeElbZ3qVHCYQOSv`

`hub.task.addNote` appends a timestamped note section to `body`. In a multi-agent system, read-modify-write is a race condition: Agent A reads body, Agent B reads body, both append, Agent A writes, Agent B overwrites A's addition. The spec says "This is simple" — it is not simple under concurrency.

**Recommendation**: Specify the concurrency model: `hub.task.addNote` must use DB-level concatenation (`UPDATE tasks SET body = body || $note WHERE id = $taskId`), not a read-modify-write cycle. Or use optimistic locking with `updatedAt`. Document this explicitly in the `addNote` specification.

---

### C09. Cross-Project Dependency Constraint: No DB Enforcement

**Sessions**: 4
**Files**: `tasks.md:217`, `tasks.md:357`
**Open-memory**: `ses_24f7431baffeElbZ3qVHCYQOSv`

"Tasks can only depend on tasks within the same project" is declared but only "enforced at the application level." `task_dependencies` has FK columns with no `projectId` column or check constraint. Application-level enforcement is vulnerable to race conditions, direct SQL access, or bugs.

**Recommendation**: At minimum, add a DB-level guard. Options:
- (A) Add a trigger that checks `dependsOnTaskId` and `taskId` belong to the same project
- (B) Add a denormalized `projectId` column to `task_dependencies` with a composite FK
- (C) Document the risk explicitly and specify that the sync operation validates project scope within a transaction (SELECT FOR SHARE)

---

### C10. Call Graph Edges: Missing Indexes and Cascade Documentation

**Sessions**: 3, 5
**Files**: `call-graph.md:32-41`, `table-reference.md` (missing)
**Open-memory**: `ses_24f746ebbffeG4jqN3MbK5i9yt`, `ses_24f735dbcffea1pN0JCgtPdbt2`

`call_graph_edges` has **no indexes** and **no cascade entries** in `table-reference.md`. Both `sourceId` and `targetId` reference `call_graph_nodes.id` with CASCADE (implied by domain doc), but this is undocumented. Without indexes, graph traversal queries (find children, find parents) will require sequential scans.

Additionally, the relationship between `call_graph_nodes.parentRequestId` and `call_graph_edges` is ambiguous: do they store the same parent-child relationship redundantly, or serve different purposes?

**Recommendation**:
- Add indexes: `idx_call_graph_edges_source_id` on `(sourceId)`, `idx_call_graph_edges_target_id` on `(targetId)`. Consider unique on `(sourceId, targetId, edgeType)` to prevent duplicates.
- Add cascade entries to `table-reference.md` for both FKs (CASCADE).
- Clarify `parentRequestId` vs `call_graph_edges`: document whether `parentRequestId` is a convenience shortcut or redundant with edges.

---

### C11. Secret Key Rotation: Underspecified

**Sessions**: 3
**Files**: `services.md:94-97`, `ADR-008`
**Open-memory**: `ses_24f746ebbffeG4jqN3MbK5i9yt`

Lazy re-encryption is insufficiently specified for a security-critical operation:
1. **Multi-key storage**: `HUB_ENCRYPTION_KEY` (singular env var) — how are old and new keys stored simultaneously during rotation?
2. **Re-encryption transaction**: If the process crashes between decrypt and re-encrypt-update, is the secret left in the old key version?
3. **Old key unavailability**: What happens if a secret with `keyVersion=1` is accessed after the old key is removed? Permanent data loss with no documented handling.
4. **No background sweep**: Old-key-version secrets persist indefinitely until accessed. If the old key is compromised, those secrets remain vulnerable.

**Recommendation**:
- Specify multi-key storage: e.g., `HUB_ENCRYPTION_KEYS=v1:base64key,v2:base64key` or a key file
- Document the re-encryption transaction: decrypt → encrypt → UPDATE in a single DB transaction, with crash-safety note
- Add a warning about the vulnerability window (old-key secrets not yet re-encrypted)
- Specify whether a background re-encryption sweep is needed or deferred

---

### C12. Client Config Schema Validation: Timing and Evolution Ambiguous

**Sessions**: 3
**Files**: `services.md:19`, `ADR-007`, `README.md` Open Question #10
**Open-memory**: `ses_24f746ebbffeG4jqN3MbK5i9yt`

"Validated against the TypeBox schema for this type **on write**" is ambiguous:
1. Who validates? Drizzle insert schema? API handler? DB trigger? Direct SQL bypasses application validation.
2. Schema evolution: when code deployment changes a client type's TypeBox schema, existing DB rows may become invalid under the new schema.
3. No re-validation on read is documented.

**Recommendation**:
- Specify: "validate on write (API handler layer) + warn on read (start-up validation pass with logging, not blocking)"
- Document the schema evolution contract: new fields MUST be `Type.Optional()`; breaking changes MUST use a new client `type` string (e.g., `llm-provider-v2`)
- Consider a `configSchemaVersion` in `metadata` tracking which schema version validated the config

---

### C13. Dual Ownership Model for Organizations: Undefined

**Sessions**: 1
**Files**: `identity.md:44` (ownerId), `identity.md:58` (membershipLevel: "owner")
**Open-memory**: `ses_24f76141effegdhw2bxX2sOvYb`

Two competing ownership concepts with no documented relationship:
1. `organizations.ownerId` — a single FK to one account
2. `organization_members.membershipLevel: "owner"` — can exist on multiple rows

Can `ownerId` point to an account with `membershipLevel: "member"` (not "owner")? Can an org have zero members with `membershipLevel: "owner"` but a non-null `ownerId`? An implementer cannot determine which field is authoritative for ownership queries.

**Recommendation**: Document the invariant. E.g.: "`ownerId` is always a member with `membershipLevel: 'owner'` (enforced by app logic). If all owner-level members are removed, `ownerId` must be transferred first." Or: "`ownerId` is the creator; `membershipLevel: 'owner'` is a separate authorization concept."

---

### C14. Missing FK Cascade Entries in `table-reference.md`

**Sessions**: 5
**Files**: `table-reference.md:53-83`
**Open-memory**: `ses_24f735dbcffea1pN0JCgtPdbt2`

The following FK relationships are documented in per-domain docs but **absent** from the cascade reference table:

| Missing Relationship | Source Doc |
|---|---|
| `mappings.workspaceId → workspaces.id` | coordination.md:19 |
| `detections.sessionId → sessions.id` | coordination.md:36 |
| `call_graph_edges.sourceId → call_graph_nodes.id` | call-graph.md:39 |
| `call_graph_edges.targetId → call_graph_nodes.id` | call-graph.md:41 |
| `api_keys.rotatedToId → api_keys.id` | identity.md:80 |

Without documented cascade behavior, PostgreSQL defaults to `RESTRICT`, which may not be the intended behavior for all of these.

**Recommendation**: Add all missing FK entries to the cascade table with explicit `onDelete` behavior. For the `rotatedToId` FK specifically: SET NULL (old key keeps its data but rotation link is broken if new key is deleted).

---

## 🟡 Warnings

Issues that should be resolved if possible. They represent gaps in documentation, suboptimal designs, or inconsistencies that could cause confusion.

---

### W01. Dual JSONB Overlap: `commonCols.metadata` vs Per-Table `data`

**Sessions**: 1
**Files**: `identity.md:85-88`, `identity.md:23`, `README.md:73`
**Open-memory**: `ses_24f76141effegdhw2bxX2sOvYb`

Two overlapping JSONB columns exist on some tables with no documented boundary:
- `commonCols.metadata` — present on every table, `Record<string, unknown>`
- Per-table `data` columns — domain-specific data (e.g., `accounts.data`, `organizations.data`)

For `api_keys`, keypal stores `scopes`, `resources`, and `tags` **inside `commonCols.metadata`**. For `accounts`, both `data` ("preferences, avatar URL") and `metadata` (arbitrary) exist with overlapping purposes and no split documentation.

**Recommendation**: Document the boundary: "`data` holds structured domain-specific data with known TypeScript types. `metadata` holds opaque key-value pairs for subsystem use, with a namespacing convention (e.g., `metadata._keypal.scopes`). Never mix domain data into `metadata`."

---

### W02. No Account Deactivation Mechanism

**Sessions**: 1
**Files**: `identity.md` (accounts table)
**Open-memory**: `ses_24f76141effegdhw2bxX2sOvYb`

The `accounts` table has no `enabled`/`suspended` column. Combined with `organizations.ownerId → RESTRICT`, an org owner's account cannot be deleted. But there's also no way to deactivate it when an employee leaves.

**Recommendation**: Add an `enabled` boolean (consistent with `api_keys.enabled` and `clients.enabled`), or a `status` column (`active`/`suspended`/`deactivated`). Document the interaction with cascade constraints.

---

### W03. Missing Indexes Across Many Tables

**Sessions**: 1, 2, 3, 5
**Files**: `table-reference.md:87-145`, per-domain docs
**Open-memory**: All sessions (consensus finding)

Multiple tables have FK columns or common query patterns without supporting indexes:

| Table | Missing Index | Purpose |
|---|---|---|
| `sessions` | `unq_sessions_slug` in index ref | UNIQUE constraint not listed (unlike other UNIQUEs) |
| `sessions` | `idx_sessions_parent_id` on `(parentId)` | Find child sessions of coordinator |
| `projects` | `idx_projects_org_id` on `(orgId)` | Find projects for an org |
| `workspaces` | `idx_workspaces_project_id` on `(projectId)` | Find workspaces for a project |
| `spokes` | `idx_spokes_name` on `(name)` | Look up spoke by name |
| `detections` | `idx_detections_session_id` on `(sessionId)` | Find detections for a session (no indexes at all) |
| `call_graph_nodes` | `idx_call_graph_nodes_created_at` on `(createdAt)` | Time-range queries |
| `call_graph_nodes` | `idx_call_graph_nodes_operation_created` on `(operationId, createdAt)` | Operation + time queries |
| `call_graph_edges` | `idx_call_graph_edges_source_id` on `(sourceId)` | Graph traversal (children) |
| `call_graph_edges` | `idx_call_graph_edges_target_id` on `(targetId)` | Graph traversal (parents) |
| `mappings` | `idx_mappings_workspace_id` on `(workspaceId)` | Workspace-scoped mapping queries |

Also: `idx_api_keys_key_hash` (B-tree) is redundant with `unq_api_keys_key_hash` (UNIQUE). Postgres automatically creates an index for UNIQUE constraints.

**Recommendation**: Add all missing indexes to `table-reference.md` and relevant per-domain docs. Remove the redundant `idx_api_keys_key_hash`.

---

### W04. `operation_specs` Pre-Remap vs. Post-Remap Namespace Ambiguity

**Sessions**: 3
**Files**: `spokes.md:51-55`, `spoke-runner.md:62`
**Open-memory**: `ses_24f746ebbffeG4jqN3MbK5i9yt`

Do `operation_specs.namespace` and `operation_specs.name` store the original spoke identifiers (pre-remap, e.g., `dev.fs.read`) or the remapped hub identifiers (post-remap, e.g., `dev.{spokeId}.fs.read`)? The spoke-runner.md says the hub remaps spoke operations into a hub namespace, but the operation_specs storage format is never specified.

If pre-remap: two spokes registering `dev.fs.read` creates ambiguity without joining on `spokeId`.
If post-remap: the partial unique indexes may be over-constraining since the spoke-specific namespace prefix makes `spokeId` redundant for uniqueness.

**Recommendation**: Explicitly document which identifiers are stored. If pre-remap, document how callers resolve ambiguity. If post-remap, adjust the uniqueness rationale.

---

### W05. `call_graph_edges.edgeType` Semantics Undefined

**Sessions**: 3
**Files**: `call-graph.md:41`
**Open-memory**: `ses_24f746ebbffeG4jqN3MbK5i9yt`

Three edge types are listed (`triggered`, `depends_on`, `requested_by`) but none are explained. The call-graph architecture doc only discusses parent-child relationships (triggered). `depends_on` and `requested_by` are novel and undocumented. Are these exhaustive or extensible?

**Recommendation**: Document each edge type's semantics in `call-graph.md`, or state that `edgeType` is an extensible text field with these three initial values and define what each means.

---

### W06. `spokes.status` Missing `reconnecting` State

**Sessions**: 3
**Files**: `spokes.md:18`, `spoke-runner.md:130-136`
**Open-memory**: `ses_24f746ebbffeG4jqN3MbK5i9yt`

The spoke status enum is `connected`, `disconnected`. The spoke-runner.md describes a reconnection flow, but there's no intermediate state for "reconnecting." When a spoke's WebSocket drops, it shows `disconnected` — indistinguishable from a permanently offline spoke.

**Recommendation**: Add `reconnecting` to the spoke status enum, or document that reconnection is handled at the application layer (WebSocket reconnect timer) without a DB state change.

---

### W07. `client_secrets.keyVersion` Redundancy

**Sessions**: 3
**Files**: `services.md:71`, `services.md:82-86`
**Open-memory**: `ses_24f746ebbffeG4jqN3MbK5i9yt`

`client_secrets` has both a standalone `keyVersion` column (integer NOT NULL DEFAULT 1) AND `keyVersion` embedded in the `value` JSONB (`EncryptedData.keyVersion`). These can diverge with no documented invariant.

**Recommendation**: Either remove the standalone column (read from `value.keyVersion`), or document that the standalone column is authoritative and they must be kept in sync.

---

### W08. Call Graph Payload Security

**Sessions**: 3
**Files**: `call-graph.md:22-23`
**Open-memory**: `ses_24f746ebbffeG4jqN3MbK5i9yt`

The `input` and `output` JSONB columns store full call payloads. Operations like `hub.register` (which receives auth tokens) would store API keys and secrets in cleartext. The truncation strategy (10KB) addresses size, not sensitive data. No redaction is mentioned.

**Recommendation**: Add a section on sensitive data handling. Options:
- Operation handlers mark certain fields as redacted
- The call graph writer applies field-level redaction by convention (fields named `password`, `token`, `secret`, `key`)
- The truncation strategy is extended with a redaction pass

---

### W09. No Call Graph Retention Policy

**Sessions**: 3, 4
**Files**: `call-graph.md` (absent), `README.md` Open Question #5
**Open-memory**: `ses_24f746ebbffeG4jqN3MbK5i9yt`

Call graph data grows unboundedly. Every operation invocation creates a node and edges. CASCADE handles cleanup when a node is deleted, but nothing deletes old nodes. README.md acknowledges this as Open Question #5.

**Recommendation**: Specify the intended approach: TTL-based deletion, archival to cold storage, or aggregation + deletion. Even a "v1: manual cleanup, v2: automatic TTL" notation helps.

---

### W10. `sessions.version` Column: Unspecified

**Sessions**: 2
**Files**: `sessions.md:24`, `README.md` Open Question #1
**Open-memory**: `ses_24f751efbffeyWo9wb6hAnnj0y`

The `version` column is `text NOT NULL` with description "Schema version (opencode compat)" but:
- No valid values listed
- No default documented for hub-direct sessions vs opencode imports
- No versioning scheme defined
- README.md Open Question #1 asks whether to version `data` columns — this is unresolved

**Recommendation**: Define initial version value (e.g., `"1"`), document what `version` governs (the `data` JSONB shape? the message/parts schema? opencode compatibility only?), and specify the default for hub-direct sessions.

---

### W11. Overlapping Status Enums Without Cross-Table Disambiguation

**Sessions**: 4, 5
**Files**: `table-reference.md:147-164`, `coordination.md:23`, `tasks.md:84-86`
**Open-memory**: `ses_24f7431baffeElbZ3qVHCYQOSv`, `ses_24f735dbcffea1pN0JCgtPdbt2`

Three tables have `status` with overlapping values:

| Table | Shared Values | Unique Values |
|---|---|---|
| `mappings` | `completed`, `failed`, `aborted` | `active` |
| `call_graph_nodes` | `completed`, `failed`, `aborted` | `pending`, `running` |
| `tasks` | `completed`, `failed` | `pending`, `in-progress`, `blocked` |

`table-reference.md:164` only contrasts `mappings.active` vs `call_graph_nodes.pending/running`. It does NOT contrast `tasks` statuses with the others. `mappings.completed` and `tasks.completed` mean different things (mapping workflow completion vs task completion).

**Recommendation**: Add cross-table state mapping documentation. When a task goes `in-progress`, there should be an active mapping; when a task is `completed`, the mapping becomes `completed`. Document valid combinations.

---

### W12. Audit Logs Missing Session and Org Context

**Sessions**: 1
**Files**: `identity.md:103-117`
**Open-memory**: `ses_24f76141effegdhw2bxX2sOvYb`

`audit_logs` has `ownerId` and `keyId` but no `sessionId` or `orgId`. For LLM accounts that fill roles in sessions, the session correlation is a significant traceability gap. Multi-tenant auditing requires org filtering.

**Recommendation**: Add `sessionId` (nullable FK → sessions.id, SET NULL) and `orgId` (nullable FK → organizations.id, SET NULL). Expand `action` types to cover account, membership, and organization lifecycle events — or document the `action` enum as extensible.

---

### W13. API Key Hashing (SHA-256) Trade-Off Undocumented

**Sessions**: 1
**Files**: `identity.md:74`, `ADR-010`
**Open-memory**: `ses_24f76141effegdhw2bxX2sOvYb`

API keys are bearer tokens stored as SHA-256 hashes. SHA-256 is a fast hash, not a deliberately slow KDF (bcrypt/Argon2). If the database is compromised, SHA-256 hashes can be brute-forced orders of magnitude faster than slow hashes. However, API keys are high-entropy machine-generated strings (128-bit+), making brute-force infeasible even with a fast hash. No ADR documents this trade-off.

**Recommendation**: Add documentation: "API keys are high-entropy random strings (128-bit+), making brute-force infeasible even with a fast hash. SHA-256 was chosen for O(1) verification latency at high throughput. This is acceptable because API keys are machine-generated, unlike human-chosen passwords."

---

### W14. ADR Terminology Inconsistencies

**Sessions**: 1
**Files**: `ADR-009:13`, `ADR-012:55`, `agent-roles.md`
**Open-memory**: `ses_24f76141effegdhw2bxX2sOvYb`

- ADR-009 says "organization_members (membership with **roles**)" — contradicts ADR-012's rename to `membershipLevel`
- ADR-012 itself uses `accounts.role: "service"` in its rationale, despite mandating the rename to `accessLevel`
- `agent-roles.md` also uses `accounts.role: "service"`

**Recommendation**: Update ADR-009 to say "membership with levels." Update ADR-012:55 and agent-roles.md to use `accounts.accessLevel: "service"`.

---

### W15. Resolved Open Questions Still Listed as Open in README

**Sessions**: 5
**Files**: `README.md:197-225`
**Open-memory**: `ses_24f735dbcffea1pN0JCgtPdbt2`

Several open questions are resolved by per-domain docs or ADRs but remain listed as open:
- **Q2** (operation spec cleanup): Resolved — DELETE aligns with ephemeral spoke model (spokes.md, table-reference.md CASCADE)
- **Q4** (workspaces vs. directories): Marked as "Resolved" in the list but still present
- **Q14** (`accounts.role` → `accessLevel`): Renamed in identity.md, referenced in ADR-012

**Recommendation**: Move resolved items to a "Resolved Decisions" section with cross-references to the resolving documents.

---

### W16. `organizations.ownerId` RESTRICT: No Deletion/Transfer Workflow

**Sessions**: 1, 5
**Files**: `identity.md:44`, `table-reference.md:56`
**Open-memory**: `ses_24f76141effegdhw2bxX2sOvYb`, `ses_24f735dbcffea1pN0JCgtPdbt2`

RESTRICT prevents deletion of accounts that own organizations, but no ownership transfer mechanism is documented.

**Recommendation**: Add a note: "Before deleting an account, transfer all owned organizations via `org.transferOwnership` operation." Document the transfer pattern in identity.md or coordination.md.

---

### W17. Path LIKE Queries May Not Use B-Tree Indexes in PostgreSQL

**Sessions**: 4
**Files**: `tasks.md:83`, `tasks.md:101`
**Open-memory**: `ses_24f7431baffeElbZ3qVHCYQOSv`

`WHERE path LIKE 'implementation/%'` can use a B-tree index **only with the `C` locale or `text_pattern_ops`**. With the default locale, LIKE pattern matching may not use the index.

**Recommendation**: Specify that the `path` index should use `text_pattern_ops` (`CREATE INDEX idx_tasks_path ON tasks (path text_pattern_ops)`) or document the locale dependency.

---

### W18. Call Graph Payload Truncation Lacks Precision

**Sessions**: 3
**Files**: `call-graph.md:30`
**Open-memory**: `ses_24f746ebbffeG4jqN3MbK5i9yt`

The truncation strategy says "truncate payloads larger than 10KB" but doesn't specify: when truncation happens (on write? after call completes?), what `preview` contains (first N bytes? N characters?), whether 10KB is configurable, or how object storage reference URLs are structured.

**Recommendation**: Specify: (a) truncation happens on write to DB (in-flight calls have full payloads); (b) `preview` is the first 1024 bytes of the JSON-serialized payload; (c) make the threshold configurable per operation type or via hub config; (d) defer object storage details but add a placeholder section.

---

### W19. `call_graph_nodes.identity` Has No FK or Account Linkage

**Sessions**: 3
**Files**: `call-graph.md:20`
**Open-memory**: `ses_24f746ebbffeG4jqN3MbK5i9yt`

The `identity` JSONB column stores `{ id, scopes, resources }` as a snapshot, but there's no FK to `accounts.id`. Querying "all calls made by account X" requires JSONB containment, which is slow without a GIN index.

**Recommendation**: Add a `callerAccountId` text column with FK → accounts.id (SET NULL) for efficient querying, or add a GIN index on `identity` if JSONB queries are the intended access pattern.

---

### W20. `mappings` Table Overloaded — Three Distinct Relationship Types

**Sessions**: 4
**Files**: `coordination.md:10-27`
**Open-memory**: `ses_24f7431baffeElbZ3qVHCYQOSv`

The `mappings` table stores three conceptually different relationships in one table:
1. Session → Spoke (where is the session running?)
2. Session → Parent session (coordination hierarchy)
3. Session → Task (what work is the session doing?)

All nullable FKs allow any combination, including potentially invalid ones. The table name `mappings` doesn't convey what's being mapped.

**Recommendation**: Document the valid column combinations (e.g., `sessionId` always NOT NULL, `taskId` only for task-scoped mappings, `parentSessionId` only for coordinator children). This makes it a polymorphic association table with documented shapes.

---

### W21. `detections` Table Is Minimal — No Resolution or Deduplication

**Sessions**: 4
**Files**: `coordination.md:29-39`
**Open-memory**: `ses_24f7431baffeElbZ3qVHCYQOSv`

- No resolution tracking (resolved, acknowledged, false-positive)
- No deduplication (persistent `MODEL_DEGRADATION` creates a new row every check interval)
- No session end correlation
- `anomalyType` value set is unclear (open or closed enum?)

**Recommendation**: Add `resolvedAt` timestamp column. Add a UNIQUE constraint on `(sessionId, anomalyType)` with upsert semantics, or document that deduplication is handled at the application level. Specify whether `anomalyType` is extensible.

---

## 💡 Suggestions

Quality-of-life improvements that should be considered but won't block stabilization.

---

### S01. Document `accessLevel` Change Authorization

Who can change an `accounts.accessLevel`? Can a `user` self-promote? Document the assumed invariants even for application-level concerns.

---

### S02. Add Partial Indexes for Common Access Patterns

Several partial indexes would improve performance: active API keys (`WHERE revoked_at IS NULL AND enabled = true`), connected spokes (`WHERE status = 'connected'`), non-archived sessions, active tasks (`WHERE status IN ('pending', 'in-progress', 'blocked')`).

---

### S03. Reserve `@alk.dev` Email Domain for System Accounts

LLM accounts use fallback addresses like `glm-5.1@alk.dev`. Document that all `*@alk.dev` emails are reserved for system-generated accounts and humans must use other domains.

---

### S04. Consider `displayName` Index for User Search

`accounts.displayName` is not indexed. For UIs with user search/autocomplete, this would require full table scans.

---

### S05. Document API Key Expiration Behavior

Does an expired key return "key expired" or a generic "authentication failed"? Recommend generic response to avoid leaking key state to attackers.

---

### S06. Cross-Reference `sessions.accountId` in Identity Docs

`identity.md:12` lists FK targets but omits `sessions.accountId`. Add it for completeness.

---

### S07. Define `FilePartData` Type

`sessions.md:132` references `FilePartData[]` in ToolState but never defines it. Clarify whether it's the same as the `file` part type's data shape.

---

### S08. Complete AI SDK UIMessage Part Type Mapping

`sessions.md:145-152` maps 6 part types but omits `step-finish`, `patch`, `snapshot`, `compaction`, `agent`. Document that these are excluded from the UIMessage view, or add mappings.

---

### S09. Document `sessions.slug` Generation Strategy

Is it human-provided? Auto-generated? Random? This matters for API design and uniqueness enforcement.

---

### S10. Add `parts.type` Index for Part-Type Queries

A composite index `(session_id, type)` would support queries like "all tool-call parts in session X" without a full scan. At minimum, document that `type` queries rely on existing indexes + sequential scan.

---

### S11. Document Whether Parts Are Flat or Nested

The `agent` part type implies sub-agent delegation, which might need nesting. The current schema has no `parentId` on parts. Document whether parts are flat or whether nesting might be needed.

---

### W22. `parts` Table: Missing `$onUpdate` and `NOT NULL` on Timestamp Columns

**Sessions**: 5
**Files**: `sessions.md:99-107`, `README.md:69-82`
**Open-memory**: `ses_24f735dbcffea1pN0JCgtPdbt2`

The `parts` table defines its own `id`, `metadata`, `createdAt`, and `updatedAt` instead of using `commonCols`, but the spec only says "defaults to `now()`" without specifying `NOT NULL` or `$onUpdate`. If the Drizzle implementation omits `$onUpdate`, parts rows never have `updatedAt` updated on modification, silently breaking any optimistic concurrency or caching logic. If `createdAt`/`updatedAt` are not `NOT NULL`, they can become NULL.

**Recommendation**: The `parts` table spec must explicitly state that `createdAt` and `updatedAt` are `NOT NULL` and that `updatedAt` includes `$onUpdate(() => new Date())`. Either replicate these details from `commonCols` with an explicit override note for `id`, or reference `commonCols` with the `id` exception documented.

---

### S13. Add `projectId` to `mappings` for Direct Project-Scoped Queries

Finding all active mappings for a project's tasks requires a JOIN through `sessions.projectId` or `tasks.projectId`. A denormalized `projectId` would simplify this, or document that the JOIN pattern is acceptable.

---

### S14. Document `mappings.status` Lifecycle Transitions

Unlike `tasks.status` which has an explicit lifecycle diagram, `mappings.status` transitions are unspecified. Add a lifecycle diagram or state machine.

---

### S15. Specify Task Enum Values as Drizzle `pgEnum`

The categorical enum values (`scope`, `risk`, `impact`, `level`, `priority`, `status`) are documented as text strings but not referenced as Drizzle `pgEnum` types. Specify that these should be `pgEnum` for type safety, with the decomposer template consuming the same definitions.

---

### S16. Rename `taskId` to `dependentTaskId` in `task_dependencies`

The column name `taskId` is generic and could be confused as "this task" rather than "the dependent task." Renaming to `dependentTaskId` makes the direction unmistakable.

---

### S17. Add `call_graph_nodes.startedAt` Index for Latency Analysis

`startedAt` is crucial for p99 latency analysis. Consider an index alongside or instead of the suggested `createdAt` index.

---

### S18. Consider Unique Constraint on `call_graph_edges(sourceId, targetId, edgeType)`

Nothing prevents duplicate edges between the same two nodes with the same type. A unique constraint prevents silently duplicated edge events from retries/reconnections.

---

## ✅ What's Working Well

Strengths identified across all five reviews:

1. **Drizzle-TypeBox pattern** — Well-documented and consistently applied. The `createSelectSchema`/`createInsertSchema` workflow with JSONB overrides is clear and implementable.

2. **Cross-cutting reference pattern** — `table-reference.md` as a single source for cascades, indexes, enums, and relations is an excellent organizational pattern that prevents "hunt through every domain doc" problems.

3. **Nullable categorical fields (ADR-011)** — The "not yet assessed" signal via NULL (instead of defaults) is well-reasoned and matches taskgraph's own `Option<T>` model.

4. **Dual task representation** — DB as source of truth, files as authoring surface. The authority model table is excellent and provides clear guidance.

5. **ADR-012 terminology clarification** — The account/role/session distinction is clearly motivated and the rename guidance is actionable.

6. **Cascade behavior documentation** — Having all FK behaviors in one place with rationale per relationship prevents implementation errors.

7. **Operation specs as capabilities (ADR-006)** — Elegant decision. Avoids opaque JSONB blobs, makes capabilities fully typed and queryable. Nullable `spoke_id` allows hub-native operations to coexist.

8. **Config/secrets separation (ADR-007)** — Four-layer model (config schema, config instance, auth schema, auth instance) with different storage strategies is well-structured.

9. **Path semantics for tasks** — Replacing `parentId` with `path` column for group scoping is clean. Prefix-based queries are intuitive and well-explained.

10. **Partial unique index design** — The `operation_specs` partial indexes correctly handle PostgreSQL's NULL-in-unique-index behavior. The explanation prevents a common pitfall.

---

## Action Plan

### Before Stabilization (Must Fix)

| Priority | Issue | Action |
|---|---|---|
| 🔴1 | C01 | Resolve NOT NULL + SET NULL contradictions for `sessions.accountId` and `audit_logs.ownerId` |
| 🔴2 | C02 | Resolve ADR-003 vs sessions.md on message IDs — update one or the other |
| 🔴3 | C03 | Resolve operation_specs delete vs soft-deactivation — choose one, update all docs |
| 🔴4 | C04 | Document sessionId immutability invariant on messages/parts |
| 🔴5 | C05 | Document roleName validation strategy (FK or intentional omission) |
| 🔴6 | C14 | Add missing FK cascade entries to table-reference.md |

### Before Implementation (Should Fix)

| Priority | Issue | Action |
|---|---|---|
| 🟡1 | C06 | Document mappings.task denormalization invariant |
| 🟡2 | C07 | Define sync field split (authored vs. runtime fields) |
| 🟡3 | C08 | Specify DB-level concatenation for task.body appends |
| 🟡4 | C09 | Add DB-level guard for cross-project dependencies |
| 🟡5 | C10 | Add call_graph_edges indexes and cascade docs |
| 🟡6 | C11 | Specify key rotation multi-key format and transaction safety |
| 🟡7 | C14 | Add remaining missing cascade entries |
| 🟡8 | W03 | Add missing indexes across tables |
| 🟡9 | W11 | Add cross-table status disambiguation |
| 🟡10 | W14 | Fix ADR terminology inconsistencies |

### Before Production (Consider)

| Priority | Issue | Action |
|---|---|---|
| 💡1 | W02 | Add account deactivation mechanism |
| 💡2 | W08 | Add call graph payload redaction |
| 💡3 | W09 | Define call graph retention policy |
| 💡4 | W12 | Add sessionId and orgId to audit_logs |
| 💡5 | W21 | Add resolution tracking to detections |
| 💡6 | All S01-S18 | Quality-of-life improvements |