Copy architecture docs, ADRs, storage domain specs, research, reviews, and 56 storage architecture tasks from the alkhub_ts monorepo. Adapt for standalone @alkdev/hub repo structure (src/ not packages/hub/). Sanitize all sensitive information: - Replace private IPs (10.0.0.1) with localhost defaults - Remove internal server hostnames (dev1, ns528096) - Replace /workspace/ private paths with npm package references - Remove hardcoded credentials from examples - Rewrite infrastructure.md without private network details Add Deno project scaffolding: deno.json (pinned deps), .gitignore, AGENTS.md, entry point. Migrate existing code stubs (crypto, config types, logger) with updated import paths.
41 KiB
status, last_updated, review_date, reviewer, scope, resolution
| status | last_updated | review_date | reviewer | scope | resolution |
|---|---|---|---|---|---|
| active | 2026-04-21 | 2026-04-21 | architect (with 5 subagent reviewers) | docs/architecture/storage/* + docs/decisions/ADR-001 through ADR-012 | pending |
Storage Architecture Review: 2026-04-21
Comprehensive review of the storage specification documents (docs/architecture/storage/) and related ADRs. Five parallel subagent reviews were conducted, each focused on a domain area. Their findings are consolidated here with deduplication, prioritization, and cross-references.
Review Sessions (open-memory)
| # | Domain | Session ID |
|---|---|---|
| 1 | Identity & Auth | ses_24f76141effegdhw2bxX2sOvYb |
| 2 | Sessions & Messages | ses_24f751efbffeyWo9wb6hAnnj0y |
| 3 | Services, Spokes, Call Graph | ses_24f746ebbffeG4jqN3MbK5i9yt |
| 4 | Tasks & Coordination | ses_24f7431baffeElbZ3qVHCYQOSv |
| 5 | Cross-Cutting Concerns | ses_24f735dbcffea1pN0JCgtPdbt2 |
Documents Reviewed
docs/architecture/storage/README.md— common pattern, package structure, open questionsdocs/architecture/storage/table-reference.md— cross-cutting reference (cascades, indexes, enums, relations)docs/architecture/storage/identity.md— accounts, organizations, organization_members, api_keys, audit_logsdocs/architecture/storage/projects.md— projects, workspacesdocs/architecture/storage/sessions.md— sessions, messages, partsdocs/architecture/storage/roles.md— rolesdocs/architecture/storage/services.md— clients, client_secretsdocs/architecture/storage/spokes.md— spokes, operation_specsdocs/architecture/storage/call-graph.md— call_graph_nodes, call_graph_edgesdocs/architecture/storage/coordination.md— mappings, detectionsdocs/architecture/storage/tasks.md— tasks, task_dependenciesdocs/decisions/ADR-001throughADR-012
Summary Statistics
| Severity | Count |
|---|---|
| 🔴 Critical | 14 |
| 🟡 Warning | 22 |
| 💡 Suggestion | 17 |
🔴 Critical Issues
Issues that must be resolved before the storage spec is stabilized. Each represents a concrete inconsistency, data integrity risk, or ambiguity that would cause implementation divergence.
C01. NOT NULL + onDelete: SET NULL — Contradictory Constraints
Sessions: 1, 2, 5
Files: sessions.md:17, identity.md:112, table-reference.md:80-71
Open-memory: ses_24f76141effegdhw2bxX2sOvYb, ses_24f751efbffeyWo9wb6hAnnj0y, ses_24f735dbcffea1pN0JCgtPdbt2
Two FK columns are declared NOT NULL but have onDelete: SET NULL. PostgreSQL will reject the DELETE because it cannot nullify a NOT NULL column:
sessions.accountId—text NOT NULL(sessions.md:17) withonDelete: SET NULL(table-reference.md:80). Deleting an account that owns sessions fails.audit_logs.ownerId—text NOT NULL(identity.md:112) withonDelete: SET NULL(table-reference.md:71). Deleting an account that has audit entries fails.
Recommendation: For each, choose one:
- Make the column nullable (if detaching on delete is desired)
- Change cascade to RESTRICT (if the FK must always be populated — blocks account deletion)
- Change cascade to CASCADE (if deleting dependent records is acceptable)
- Add application-level logic that reassigns/destroys dependents before account deletion
For audit_logs.ownerId, RESTRICT may be correct — audit trails should prevent account deletion. For sessions.accountId, nullable is likely correct — orphaned sessions (account deleted) are still valuable data.
C02. ADR-003 vs sessions.md on Message IDs
Sessions: 2, 5
Files: ADR-003, sessions.md:42-46, table-reference.md:48
Open-memory: ses_24f751efbffeyWo9wb6hAnnj0y, ses_24f735dbcffea1pN0JCgtPdbt2
ADR-003 explicitly states: "Parts and messages tables use sortable timestamp-based IDs instead of commonCols.id." However, sessions.md defines the messages table using commonCols (which provides UUIDv4 via crypto.randomUUID()). Only parts explicitly uses sortable IDs. table-reference.md only mentions parts for sortable IDs.
This is a three-way inconsistency: ADR says both tables, sessions.md does one, table-reference says one. Message ordering is semantically important (the composite index idx_messages_session_id_created_at_id on (session_id, created_at, id) relies on created_at for ordering, making UUIDv4 sortable IDs unnecessary — but this contradicts ADR-003's stated rationale).
Recommendation: Either:
- (A) Update
messagestable to use sortable IDs (consistent with ADR-003, eliminates dependency oncreated_atfor ordering), or - (B) Amend ADR-003 to state that only
partsuses sortable IDs, andmessagesrelies on the(session_id, created_at, id)composite index
C03. Operation Specs: Delete vs. Soft-Deactivation Unresolved
Sessions: 3, 5
Files: spokes.md:66, table-reference.md:67, README.md Open Question #2
Open-memory: ses_24f746ebbffeG4jqN3MbK5i9yt, ses_24f735dbcffea1pN0JCgtPdbt2
The spoke disconnect lifecycle has three conflicting positions:
spokes.md:66: "Removes the spoke'soperation_specsrows (or marks inactive)" — ambiguoustable-reference.md:67:operation_specs.spokeId → spokes.idwith CASCADE deleteREADME.mdOpen Question #2: "DELETE aligns with the ephemeral spoke model" for now
The operation_specs table has no active or status column to support soft-deactivation. Crucially, spoke rows are never deleted — they're only marked status: "disconnected". This means the CASCADE FK never fires, and there's no mechanism to clean up operation_specs on disconnect. The operation_specs rows remain pointing to a disconnected spoke with no way to deprecate them.
Recommendation: Resolve decisively:
- (A) Hard delete on disconnect: Add explicit cleanup in the disconnect handler. Remove "or marks inactive" from spokes.md. CASCADE only applies to rare admin spoke-row deletion.
- (B) Add active/status column to operation_specs: Support soft-deactivation. Update cascade rationale. This preserves the operation registry for audit/reconnection but adds schema complexity.
Option A aligns with the ephemeral spoke model. Option B supports spoke reconnection. Choose one and update all documents.
C04. parts.sessionId Denormalization: No Enforcement Mechanism
Sessions: 2
Files: sessions.md:96, sessions.md:105
Open-memory: ses_24f751efbffeyWo9wb6hAnnj0y
The stated invariant: "when inserting a part, always set sessionId to the message's sessionId. Never update messages.sessionId without updating all child parts." However:
- No DB trigger enforces this
- No application-level transaction pattern is documented
- No CHECK constraint exists
- If
messages.sessionIdcould change, there's a race condition window
Recommendation: Document that sessionId on both messages and parts is immutable after creation (which eliminates the update problem). Define the application-level contract for part insertion: read the message's sessionId and set it on the part within the same transaction. Add an explicit "IMMUTABLE" note to the sessionId column in sessions.md.
C05. sessions.roleName — No FK, No Validation Strategy Documented
Sessions: 2
Files: sessions.md:26, table-reference.md:100-101, roles.md
Open-memory: ses_24f751efbffeyWo9wb6hAnnj0y
sessions.roleName is bare text with no FK to roles.name and no documented reason why. Is this intentional (to support file-based roles in Phase 1)? What happens if the role name has a typo? What about sessions referencing a role that was deleted?
Recommendation: Either:
- (A) Add
FK → roles.namewithonDelete: SET NULL(role deletions detach sessions), or - (B) Document why the FK is intentionally omitted: "role definitions may come from
.opencode/agents/*.mdfiles before DB sync; application-level validation checks against known role names at session creation time."
C06. mappings.task Denormalized Column: No Sync Strategy
Sessions: 4
Files: coordination.md:22, tasks.md:209
Open-memory: ses_24f7431baffeElbZ3qVHCYQOSv
The mappings table has both taskId (FK → tasks.id) and task (denormalized display name). No mechanism keeps them in sync. If taskId points to a task whose slug or name changes, mappings.task becomes stale. When taskId is SET NULL (task deleted), what happens to task?
Recommendation: Document the invariant: "mappings.task is set to tasks.slug at insert time and is not automatically updated when the task's slug changes. When taskId is SET NULL (task deleted), task should also be SET NULL. This is a cache, not a source of truth." Alternatively, remove the denormalized column and use a VIEW that joins.
C07. Sync vs. Runtime Field Conflict in Tasks
Sessions: 4
Files: tasks.md:296-325
Open-memory: ses_24f7431baffeElbZ3qVHCYQOSv
The task sync does a full upsert, but the Authority Model says runtime status mutations go through hub.task.updateStatus. If sync blindly writes frontmatter status, it can clobber runtime state. Example:
- Agent sets
task.status = 'in-progress'viahub.task.updateStatus - Decomposer edits the task file (still has
status: pending) - Sync runs and upserts the task — overwrites
in-progressback topending
Recommendation: Define the sync field split explicitly: "Sync upserts authored fields (slug, name, path, scope, risk, impact, level, priority, tags, assignee, due, body, fileCreatedAt, fileModifiedAt, depends_on) and must not overwrite runtime-managed fields (status, startedAt, completedAt). Runtime fields are only mutated via hub.task.* operations." Update the sync flow specification in tasks.md.
C08. Concurrent task.body Appends: No Collision Handling
Sessions: 4
Files: tasks.md:249-266
Open-memory: ses_24f7431baffeElbZ3qVHCYQOSv
hub.task.addNote appends a timestamped note section to body. In a multi-agent system, read-modify-write is a race condition: Agent A reads body, Agent B reads body, both append, Agent A writes, Agent B overwrites A's addition. The spec says "This is simple" — it is not simple under concurrency.
Recommendation: Specify the concurrency model: hub.task.addNote must use DB-level concatenation (UPDATE tasks SET body = body || $note WHERE id = $taskId), not a read-modify-write cycle. Or use optimistic locking with updatedAt. Document this explicitly in the addNote specification.
C09. Cross-Project Dependency Constraint: No DB Enforcement
Sessions: 4
Files: tasks.md:217, tasks.md:357
Open-memory: ses_24f7431baffeElbZ3qVHCYQOSv
"Tasks can only depend on tasks within the same project" is declared but only "enforced at the application level." task_dependencies has FK columns with no projectId column or check constraint. Application-level enforcement is vulnerable to race conditions, direct SQL access, or bugs.
Recommendation: At minimum, add a DB-level guard. Options:
- (A) Add a trigger that checks
dependsOnTaskIdandtaskIdbelong to the same project - (B) Add a denormalized
projectIdcolumn totask_dependencieswith a composite FK - (C) Document the risk explicitly and specify that the sync operation validates project scope within a transaction (SELECT FOR SHARE)
C10. Call Graph Edges: Missing Indexes and Cascade Documentation
Sessions: 3, 5
Files: call-graph.md:32-41, table-reference.md (missing)
Open-memory: ses_24f746ebbffeG4jqN3MbK5i9yt, ses_24f735dbcffea1pN0JCgtPdbt2
call_graph_edges has no indexes and no cascade entries in table-reference.md. Both sourceId and targetId reference call_graph_nodes.id with CASCADE (implied by domain doc), but this is undocumented. Without indexes, graph traversal queries (find children, find parents) will require sequential scans.
Additionally, the relationship between call_graph_nodes.parentRequestId and call_graph_edges is ambiguous: do they store the same parent-child relationship redundantly, or serve different purposes?
Recommendation:
- Add indexes:
idx_call_graph_edges_source_idon(sourceId),idx_call_graph_edges_target_idon(targetId). Consider unique on(sourceId, targetId, edgeType)to prevent duplicates. - Add cascade entries to
table-reference.mdfor both FKs (CASCADE). - Clarify
parentRequestIdvscall_graph_edges: document whetherparentRequestIdis a convenience shortcut or redundant with edges.
C11. Secret Key Rotation: Underspecified
Sessions: 3
Files: services.md:94-97, ADR-008
Open-memory: ses_24f746ebbffeG4jqN3MbK5i9yt
Lazy re-encryption is insufficiently specified for a security-critical operation:
- Multi-key storage:
HUB_ENCRYPTION_KEY(singular env var) — how are old and new keys stored simultaneously during rotation? - Re-encryption transaction: If the process crashes between decrypt and re-encrypt-update, is the secret left in the old key version?
- Old key unavailability: What happens if a secret with
keyVersion=1is accessed after the old key is removed? Permanent data loss with no documented handling. - No background sweep: Old-key-version secrets persist indefinitely until accessed. If the old key is compromised, those secrets remain vulnerable.
Recommendation:
- Specify multi-key storage: e.g.,
HUB_ENCRYPTION_KEYS=v1:base64key,v2:base64keyor a key file - Document the re-encryption transaction: decrypt → encrypt → UPDATE in a single DB transaction, with crash-safety note
- Add a warning about the vulnerability window (old-key secrets not yet re-encrypted)
- Specify whether a background re-encryption sweep is needed or deferred
C12. Client Config Schema Validation: Timing and Evolution Ambiguous
Sessions: 3
Files: services.md:19, ADR-007, README.md Open Question #10
Open-memory: ses_24f746ebbffeG4jqN3MbK5i9yt
"Validated against the TypeBox schema for this type on write" is ambiguous:
- Who validates? Drizzle insert schema? API handler? DB trigger? Direct SQL bypasses application validation.
- Schema evolution: when code deployment changes a client type's TypeBox schema, existing DB rows may become invalid under the new schema.
- No re-validation on read is documented.
Recommendation:
- Specify: "validate on write (API handler layer) + warn on read (start-up validation pass with logging, not blocking)"
- Document the schema evolution contract: new fields MUST be
Type.Optional(); breaking changes MUST use a new clienttypestring (e.g.,llm-provider-v2) - Consider a
configSchemaVersioninmetadatatracking which schema version validated the config
C13. Dual Ownership Model for Organizations: Undefined
Sessions: 1
Files: identity.md:44 (ownerId), identity.md:58 (membershipLevel: "owner")
Open-memory: ses_24f76141effegdhw2bxX2sOvYb
Two competing ownership concepts with no documented relationship:
organizations.ownerId— a single FK to one accountorganization_members.membershipLevel: "owner"— can exist on multiple rows
Can ownerId point to an account with membershipLevel: "member" (not "owner")? Can an org have zero members with membershipLevel: "owner" but a non-null ownerId? An implementer cannot determine which field is authoritative for ownership queries.
Recommendation: Document the invariant. E.g.: "ownerId is always a member with membershipLevel: 'owner' (enforced by app logic). If all owner-level members are removed, ownerId must be transferred first." Or: "ownerId is the creator; membershipLevel: 'owner' is a separate authorization concept."
C14. Missing FK Cascade Entries in table-reference.md
Sessions: 5
Files: table-reference.md:53-83
Open-memory: ses_24f735dbcffea1pN0JCgtPdbt2
The following FK relationships are documented in per-domain docs but absent from the cascade reference table:
| Missing Relationship | Source Doc |
|---|---|
mappings.workspaceId → workspaces.id |
coordination.md:19 |
detections.sessionId → sessions.id |
coordination.md:36 |
call_graph_edges.sourceId → call_graph_nodes.id |
call-graph.md:39 |
call_graph_edges.targetId → call_graph_nodes.id |
call-graph.md:41 |
api_keys.rotatedToId → api_keys.id |
identity.md:80 |
Without documented cascade behavior, PostgreSQL defaults to RESTRICT, which may not be the intended behavior for all of these.
Recommendation: Add all missing FK entries to the cascade table with explicit onDelete behavior. For the rotatedToId FK specifically: SET NULL (old key keeps its data but rotation link is broken if new key is deleted).
🟡 Warnings
Issues that should be resolved if possible. They represent gaps in documentation, suboptimal designs, or inconsistencies that could cause confusion.
W01. Dual JSONB Overlap: commonCols.metadata vs Per-Table data
Sessions: 1
Files: identity.md:85-88, identity.md:23, README.md:73
Open-memory: ses_24f76141effegdhw2bxX2sOvYb
Two overlapping JSONB columns exist on some tables with no documented boundary:
commonCols.metadata— present on every table,Record<string, unknown>- Per-table
datacolumns — domain-specific data (e.g.,accounts.data,organizations.data)
For api_keys, keypal stores scopes, resources, and tags inside commonCols.metadata. For accounts, both data ("preferences, avatar URL") and metadata (arbitrary) exist with overlapping purposes and no split documentation.
Recommendation: Document the boundary: "data holds structured domain-specific data with known TypeScript types. metadata holds opaque key-value pairs for subsystem use, with a namespacing convention (e.g., metadata._keypal.scopes). Never mix domain data into metadata."
W02. No Account Deactivation Mechanism
Sessions: 1
Files: identity.md (accounts table)
Open-memory: ses_24f76141effegdhw2bxX2sOvYb
The accounts table has no enabled/suspended column. Combined with organizations.ownerId → RESTRICT, an org owner's account cannot be deleted. But there's also no way to deactivate it when an employee leaves.
Recommendation: Add an enabled boolean (consistent with api_keys.enabled and clients.enabled), or a status column (active/suspended/deactivated). Document the interaction with cascade constraints.
W03. Missing Indexes Across Many Tables
Sessions: 1, 2, 3, 5
Files: table-reference.md:87-145, per-domain docs
Open-memory: All sessions (consensus finding)
Multiple tables have FK columns or common query patterns without supporting indexes:
| Table | Missing Index | Purpose |
|---|---|---|
sessions |
unq_sessions_slug in index ref |
UNIQUE constraint not listed (unlike other UNIQUEs) |
sessions |
idx_sessions_parent_id on (parentId) |
Find child sessions of coordinator |
projects |
idx_projects_org_id on (orgId) |
Find projects for an org |
workspaces |
idx_workspaces_project_id on (projectId) |
Find workspaces for a project |
spokes |
idx_spokes_name on (name) |
Look up spoke by name |
detections |
idx_detections_session_id on (sessionId) |
Find detections for a session (no indexes at all) |
call_graph_nodes |
idx_call_graph_nodes_created_at on (createdAt) |
Time-range queries |
call_graph_nodes |
idx_call_graph_nodes_operation_created on (operationId, createdAt) |
Operation + time queries |
call_graph_edges |
idx_call_graph_edges_source_id on (sourceId) |
Graph traversal (children) |
call_graph_edges |
idx_call_graph_edges_target_id on (targetId) |
Graph traversal (parents) |
mappings |
idx_mappings_workspace_id on (workspaceId) |
Workspace-scoped mapping queries |
Also: idx_api_keys_key_hash (B-tree) is redundant with unq_api_keys_key_hash (UNIQUE). Postgres automatically creates an index for UNIQUE constraints.
Recommendation: Add all missing indexes to table-reference.md and relevant per-domain docs. Remove the redundant idx_api_keys_key_hash.
W04. operation_specs Pre-Remap vs. Post-Remap Namespace Ambiguity
Sessions: 3
Files: spokes.md:51-55, spoke-runner.md:62
Open-memory: ses_24f746ebbffeG4jqN3MbK5i9yt
Do operation_specs.namespace and operation_specs.name store the original spoke identifiers (pre-remap, e.g., dev.fs.read) or the remapped hub identifiers (post-remap, e.g., dev.{spokeId}.fs.read)? The spoke-runner.md says the hub remaps spoke operations into a hub namespace, but the operation_specs storage format is never specified.
If pre-remap: two spokes registering dev.fs.read creates ambiguity without joining on spokeId.
If post-remap: the partial unique indexes may be over-constraining since the spoke-specific namespace prefix makes spokeId redundant for uniqueness.
Recommendation: Explicitly document which identifiers are stored. If pre-remap, document how callers resolve ambiguity. If post-remap, adjust the uniqueness rationale.
W05. call_graph_edges.edgeType Semantics Undefined
Sessions: 3
Files: call-graph.md:41
Open-memory: ses_24f746ebbffeG4jqN3MbK5i9yt
Three edge types are listed (triggered, depends_on, requested_by) but none are explained. The call-graph architecture doc only discusses parent-child relationships (triggered). depends_on and requested_by are novel and undocumented. Are these exhaustive or extensible?
Recommendation: Document each edge type's semantics in call-graph.md, or state that edgeType is an extensible text field with these three initial values and define what each means.
W06. spokes.status Missing reconnecting State
Sessions: 3
Files: spokes.md:18, spoke-runner.md:130-136
Open-memory: ses_24f746ebbffeG4jqN3MbK5i9yt
The spoke status enum is connected, disconnected. The spoke-runner.md describes a reconnection flow, but there's no intermediate state for "reconnecting." When a spoke's WebSocket drops, it shows disconnected — indistinguishable from a permanently offline spoke.
Recommendation: Add reconnecting to the spoke status enum, or document that reconnection is handled at the application layer (WebSocket reconnect timer) without a DB state change.
W07. client_secrets.keyVersion Redundancy
Sessions: 3
Files: services.md:71, services.md:82-86
Open-memory: ses_24f746ebbffeG4jqN3MbK5i9yt
client_secrets has both a standalone keyVersion column (integer NOT NULL DEFAULT 1) AND keyVersion embedded in the value JSONB (EncryptedData.keyVersion). These can diverge with no documented invariant.
Recommendation: Either remove the standalone column (read from value.keyVersion), or document that the standalone column is authoritative and they must be kept in sync.
W08. Call Graph Payload Security
Sessions: 3
Files: call-graph.md:22-23
Open-memory: ses_24f746ebbffeG4jqN3MbK5i9yt
The input and output JSONB columns store full call payloads. Operations like hub.register (which receives auth tokens) would store API keys and secrets in cleartext. The truncation strategy (10KB) addresses size, not sensitive data. No redaction is mentioned.
Recommendation: Add a section on sensitive data handling. Options:
- Operation handlers mark certain fields as redacted
- The call graph writer applies field-level redaction by convention (fields named
password,token,secret,key) - The truncation strategy is extended with a redaction pass
W09. No Call Graph Retention Policy
Sessions: 3, 4
Files: call-graph.md (absent), README.md Open Question #5
Open-memory: ses_24f746ebbffeG4jqN3MbK5i9yt
Call graph data grows unboundedly. Every operation invocation creates a node and edges. CASCADE handles cleanup when a node is deleted, but nothing deletes old nodes. README.md acknowledges this as Open Question #5.
Recommendation: Specify the intended approach: TTL-based deletion, archival to cold storage, or aggregation + deletion. Even a "v1: manual cleanup, v2: automatic TTL" notation helps.
W10. sessions.version Column: Unspecified
Sessions: 2
Files: sessions.md:24, README.md Open Question #1
Open-memory: ses_24f751efbffeyWo9wb6hAnnj0y
The version column is text NOT NULL with description "Schema version (opencode compat)" but:
- No valid values listed
- No default documented for hub-direct sessions vs opencode imports
- No versioning scheme defined
- README.md Open Question #1 asks whether to version
datacolumns — this is unresolved
Recommendation: Define initial version value (e.g., "1"), document what version governs (the data JSONB shape? the message/parts schema? opencode compatibility only?), and specify the default for hub-direct sessions.
W11. Overlapping Status Enums Without Cross-Table Disambiguation
Sessions: 4, 5
Files: table-reference.md:147-164, coordination.md:23, tasks.md:84-86
Open-memory: ses_24f7431baffeElbZ3qVHCYQOSv, ses_24f735dbcffea1pN0JCgtPdbt2
Three tables have status with overlapping values:
| Table | Shared Values | Unique Values |
|---|---|---|
mappings |
completed, failed, aborted |
active |
call_graph_nodes |
completed, failed, aborted |
pending, running |
tasks |
completed, failed |
pending, in-progress, blocked |
table-reference.md:164 only contrasts mappings.active vs call_graph_nodes.pending/running. It does NOT contrast tasks statuses with the others. mappings.completed and tasks.completed mean different things (mapping workflow completion vs task completion).
Recommendation: Add cross-table state mapping documentation. When a task goes in-progress, there should be an active mapping; when a task is completed, the mapping becomes completed. Document valid combinations.
W12. Audit Logs Missing Session and Org Context
Sessions: 1
Files: identity.md:103-117
Open-memory: ses_24f76141effegdhw2bxX2sOvYb
audit_logs has ownerId and keyId but no sessionId or orgId. For LLM accounts that fill roles in sessions, the session correlation is a significant traceability gap. Multi-tenant auditing requires org filtering.
Recommendation: Add sessionId (nullable FK → sessions.id, SET NULL) and orgId (nullable FK → organizations.id, SET NULL). Expand action types to cover account, membership, and organization lifecycle events — or document the action enum as extensible.
W13. API Key Hashing (SHA-256) Trade-Off Undocumented
Sessions: 1
Files: identity.md:74, ADR-010
Open-memory: ses_24f76141effegdhw2bxX2sOvYb
API keys are bearer tokens stored as SHA-256 hashes. SHA-256 is a fast hash, not a deliberately slow KDF (bcrypt/Argon2). If the database is compromised, SHA-256 hashes can be brute-forced orders of magnitude faster than slow hashes. However, API keys are high-entropy machine-generated strings (128-bit+), making brute-force infeasible even with a fast hash. No ADR documents this trade-off.
Recommendation: Add documentation: "API keys are high-entropy random strings (128-bit+), making brute-force infeasible even with a fast hash. SHA-256 was chosen for O(1) verification latency at high throughput. This is acceptable because API keys are machine-generated, unlike human-chosen passwords."
W14. ADR Terminology Inconsistencies
Sessions: 1
Files: ADR-009:13, ADR-012:55, agent-roles.md
Open-memory: ses_24f76141effegdhw2bxX2sOvYb
- ADR-009 says "organization_members (membership with roles)" — contradicts ADR-012's rename to
membershipLevel - ADR-012 itself uses
accounts.role: "service"in its rationale, despite mandating the rename toaccessLevel agent-roles.mdalso usesaccounts.role: "service"
Recommendation: Update ADR-009 to say "membership with levels." Update ADR-012:55 and agent-roles.md to use accounts.accessLevel: "service".
W15. Resolved Open Questions Still Listed as Open in README
Sessions: 5
Files: README.md:197-225
Open-memory: ses_24f735dbcffea1pN0JCgtPdbt2
Several open questions are resolved by per-domain docs or ADRs but remain listed as open:
- Q2 (operation spec cleanup): Resolved — DELETE aligns with ephemeral spoke model (spokes.md, table-reference.md CASCADE)
- Q4 (workspaces vs. directories): Marked as "Resolved" in the list but still present
- Q14 (
accounts.role→accessLevel): Renamed in identity.md, referenced in ADR-012
Recommendation: Move resolved items to a "Resolved Decisions" section with cross-references to the resolving documents.
W16. organizations.ownerId RESTRICT: No Deletion/Transfer Workflow
Sessions: 1, 5
Files: identity.md:44, table-reference.md:56
Open-memory: ses_24f76141effegdhw2bxX2sOvYb, ses_24f735dbcffea1pN0JCgtPdbt2
RESTRICT prevents deletion of accounts that own organizations, but no ownership transfer mechanism is documented.
Recommendation: Add a note: "Before deleting an account, transfer all owned organizations via org.transferOwnership operation." Document the transfer pattern in identity.md or coordination.md.
W17. Path LIKE Queries May Not Use B-Tree Indexes in PostgreSQL
Sessions: 4
Files: tasks.md:83, tasks.md:101
Open-memory: ses_24f7431baffeElbZ3qVHCYQOSv
WHERE path LIKE 'implementation/%' can use a B-tree index only with the C locale or text_pattern_ops. With the default locale, LIKE pattern matching may not use the index.
Recommendation: Specify that the path index should use text_pattern_ops (CREATE INDEX idx_tasks_path ON tasks (path text_pattern_ops)) or document the locale dependency.
W18. Call Graph Payload Truncation Lacks Precision
Sessions: 3
Files: call-graph.md:30
Open-memory: ses_24f746ebbffeG4jqN3MbK5i9yt
The truncation strategy says "truncate payloads larger than 10KB" but doesn't specify: when truncation happens (on write? after call completes?), what preview contains (first N bytes? N characters?), whether 10KB is configurable, or how object storage reference URLs are structured.
Recommendation: Specify: (a) truncation happens on write to DB (in-flight calls have full payloads); (b) preview is the first 1024 bytes of the JSON-serialized payload; (c) make the threshold configurable per operation type or via hub config; (d) defer object storage details but add a placeholder section.
W19. call_graph_nodes.identity Has No FK or Account Linkage
Sessions: 3
Files: call-graph.md:20
Open-memory: ses_24f746ebbffeG4jqN3MbK5i9yt
The identity JSONB column stores { id, scopes, resources } as a snapshot, but there's no FK to accounts.id. Querying "all calls made by account X" requires JSONB containment, which is slow without a GIN index.
Recommendation: Add a callerAccountId text column with FK → accounts.id (SET NULL) for efficient querying, or add a GIN index on identity if JSONB queries are the intended access pattern.
W20. mappings Table Overloaded — Three Distinct Relationship Types
Sessions: 4
Files: coordination.md:10-27
Open-memory: ses_24f7431baffeElbZ3qVHCYQOSv
The mappings table stores three conceptually different relationships in one table:
- Session → Spoke (where is the session running?)
- Session → Parent session (coordination hierarchy)
- Session → Task (what work is the session doing?)
All nullable FKs allow any combination, including potentially invalid ones. The table name mappings doesn't convey what's being mapped.
Recommendation: Document the valid column combinations (e.g., sessionId always NOT NULL, taskId only for task-scoped mappings, parentSessionId only for coordinator children). This makes it a polymorphic association table with documented shapes.
W21. detections Table Is Minimal — No Resolution or Deduplication
Sessions: 4
Files: coordination.md:29-39
Open-memory: ses_24f7431baffeElbZ3qVHCYQOSv
- No resolution tracking (resolved, acknowledged, false-positive)
- No deduplication (persistent
MODEL_DEGRADATIONcreates a new row every check interval) - No session end correlation
anomalyTypevalue set is unclear (open or closed enum?)
Recommendation: Add resolvedAt timestamp column. Add a UNIQUE constraint on (sessionId, anomalyType) with upsert semantics, or document that deduplication is handled at the application level. Specify whether anomalyType is extensible.
💡 Suggestions
Quality-of-life improvements that should be considered but won't block stabilization.
S01. Document accessLevel Change Authorization
Who can change an accounts.accessLevel? Can a user self-promote? Document the assumed invariants even for application-level concerns.
S02. Add Partial Indexes for Common Access Patterns
Several partial indexes would improve performance: active API keys (WHERE revoked_at IS NULL AND enabled = true), connected spokes (WHERE status = 'connected'), non-archived sessions, active tasks (WHERE status IN ('pending', 'in-progress', 'blocked')).
S03. Reserve @alk.dev Email Domain for System Accounts
LLM accounts use fallback addresses like glm-5.1@alk.dev. Document that all *@alk.dev emails are reserved for system-generated accounts and humans must use other domains.
S04. Consider displayName Index for User Search
accounts.displayName is not indexed. For UIs with user search/autocomplete, this would require full table scans.
S05. Document API Key Expiration Behavior
Does an expired key return "key expired" or a generic "authentication failed"? Recommend generic response to avoid leaking key state to attackers.
S06. Cross-Reference sessions.accountId in Identity Docs
identity.md:12 lists FK targets but omits sessions.accountId. Add it for completeness.
S07. Define FilePartData Type
sessions.md:132 references FilePartData[] in ToolState but never defines it. Clarify whether it's the same as the file part type's data shape.
S08. Complete AI SDK UIMessage Part Type Mapping
sessions.md:145-152 maps 6 part types but omits step-finish, patch, snapshot, compaction, agent. Document that these are excluded from the UIMessage view, or add mappings.
S09. Document sessions.slug Generation Strategy
Is it human-provided? Auto-generated? Random? This matters for API design and uniqueness enforcement.
S10. Add parts.type Index for Part-Type Queries
A composite index (session_id, type) would support queries like "all tool-call parts in session X" without a full scan. At minimum, document that type queries rely on existing indexes + sequential scan.
S11. Document Whether Parts Are Flat or Nested
The agent part type implies sub-agent delegation, which might need nesting. The current schema has no parentId on parts. Document whether parts are flat or whether nesting might be needed.
W22. parts Table: Missing $onUpdate and NOT NULL on Timestamp Columns
Sessions: 5
Files: sessions.md:99-107, README.md:69-82
Open-memory: ses_24f735dbcffea1pN0JCgtPdbt2
The parts table defines its own id, metadata, createdAt, and updatedAt instead of using commonCols, but the spec only says "defaults to now()" without specifying NOT NULL or $onUpdate. If the Drizzle implementation omits $onUpdate, parts rows never have updatedAt updated on modification, silently breaking any optimistic concurrency or caching logic. If createdAt/updatedAt are not NOT NULL, they can become NULL.
Recommendation: The parts table spec must explicitly state that createdAt and updatedAt are NOT NULL and that updatedAt includes $onUpdate(() => new Date()). Either replicate these details from commonCols with an explicit override note for id, or reference commonCols with the id exception documented.
S13. Add projectId to mappings for Direct Project-Scoped Queries
Finding all active mappings for a project's tasks requires a JOIN through sessions.projectId or tasks.projectId. A denormalized projectId would simplify this, or document that the JOIN pattern is acceptable.
S14. Document mappings.status Lifecycle Transitions
Unlike tasks.status which has an explicit lifecycle diagram, mappings.status transitions are unspecified. Add a lifecycle diagram or state machine.
S15. Specify Task Enum Values as Drizzle pgEnum
The categorical enum values (scope, risk, impact, level, priority, status) are documented as text strings but not referenced as Drizzle pgEnum types. Specify that these should be pgEnum for type safety, with the decomposer template consuming the same definitions.
S16. Rename taskId to dependentTaskId in task_dependencies
The column name taskId is generic and could be confused as "this task" rather than "the dependent task." Renaming to dependentTaskId makes the direction unmistakable.
S17. Add call_graph_nodes.startedAt Index for Latency Analysis
startedAt is crucial for p99 latency analysis. Consider an index alongside or instead of the suggested createdAt index.
S18. Consider Unique Constraint on call_graph_edges(sourceId, targetId, edgeType)
Nothing prevents duplicate edges between the same two nodes with the same type. A unique constraint prevents silently duplicated edge events from retries/reconnections.
✅ What's Working Well
Strengths identified across all five reviews:
-
Drizzle-TypeBox pattern — Well-documented and consistently applied. The
createSelectSchema/createInsertSchemaworkflow with JSONB overrides is clear and implementable. -
Cross-cutting reference pattern —
table-reference.mdas a single source for cascades, indexes, enums, and relations is an excellent organizational pattern that prevents "hunt through every domain doc" problems. -
Nullable categorical fields (ADR-011) — The "not yet assessed" signal via NULL (instead of defaults) is well-reasoned and matches taskgraph's own
Option<T>model. -
Dual task representation — DB as source of truth, files as authoring surface. The authority model table is excellent and provides clear guidance.
-
ADR-012 terminology clarification — The account/role/session distinction is clearly motivated and the rename guidance is actionable.
-
Cascade behavior documentation — Having all FK behaviors in one place with rationale per relationship prevents implementation errors.
-
Operation specs as capabilities (ADR-006) — Elegant decision. Avoids opaque JSONB blobs, makes capabilities fully typed and queryable. Nullable
spoke_idallows hub-native operations to coexist. -
Config/secrets separation (ADR-007) — Four-layer model (config schema, config instance, auth schema, auth instance) with different storage strategies is well-structured.
-
Path semantics for tasks — Replacing
parentIdwithpathcolumn for group scoping is clean. Prefix-based queries are intuitive and well-explained. -
Partial unique index design — The
operation_specspartial indexes correctly handle PostgreSQL's NULL-in-unique-index behavior. The explanation prevents a common pitfall.
Action Plan
Before Stabilization (Must Fix)
| Priority | Issue | Action |
|---|---|---|
| 🔴1 | C01 | Resolve NOT NULL + SET NULL contradictions for sessions.accountId and audit_logs.ownerId |
| 🔴2 | C02 | Resolve ADR-003 vs sessions.md on message IDs — update one or the other |
| 🔴3 | C03 | Resolve operation_specs delete vs soft-deactivation — choose one, update all docs |
| 🔴4 | C04 | Document sessionId immutability invariant on messages/parts |
| 🔴5 | C05 | Document roleName validation strategy (FK or intentional omission) |
| 🔴6 | C14 | Add missing FK cascade entries to table-reference.md |
Before Implementation (Should Fix)
| Priority | Issue | Action |
|---|---|---|
| 🟡1 | C06 | Document mappings.task denormalization invariant |
| 🟡2 | C07 | Define sync field split (authored vs. runtime fields) |
| 🟡3 | C08 | Specify DB-level concatenation for task.body appends |
| 🟡4 | C09 | Add DB-level guard for cross-project dependencies |
| 🟡5 | C10 | Add call_graph_edges indexes and cascade docs |
| 🟡6 | C11 | Specify key rotation multi-key format and transaction safety |
| 🟡7 | C14 | Add remaining missing cascade entries |
| 🟡8 | W03 | Add missing indexes across tables |
| 🟡9 | W11 | Add cross-table status disambiguation |
| 🟡10 | W14 | Fix ADR terminology inconsistencies |
Before Production (Consider)
| Priority | Issue | Action |
|---|---|---|
| 💡1 | W02 | Add account deactivation mechanism |
| 💡2 | W08 | Add call graph payload redaction |
| 💡3 | W09 | Define call graph retention policy |
| 💡4 | W12 | Add sessionId and orgId to audit_logs |
| 💡5 | W21 | Add resolution tracking to detections |
| 💡6 | All S01-S18 | Quality-of-life improvements |