| status | last_updated |
|---|---|
| draft | 2026-04-20 |
Table Schemas: Sessions, Messages & Parts
Agent conversation session tables. For cross-cutting reference (cascade behavior, index reference, status enums, relations), see table-reference.md. For design decisions, see ../../../decisions/. For the session architecture, see ../../agent-sessions.md.
sessions
Agent conversation sessions. Every session — whether the LLM runs directly in the hub or in a remote opencode container — stores its data here. The hub is the source of truth; spokes are execution environments.
| Column | Type | Notes |
|---|---|---|
| commonCols | — | id, metadata, createdAt, updatedAt |
| accountId | text | FK → accounts.id — Nullable — orphaned sessions preserve conversation history for audit and debugging. See D1 in storage-spec-phase1-resolutions.md. |
| projectId | text NOT NULL | FK → projects.id (cascade) |
| workspaceId | text | FK → workspaces.id |
| parentId | text | FK → sessions.id — Parent session (coordinator relationship). onDelete: SET NULL — deleting a parent session detaches children but preserves them. |
| slug | text NOT NULL UNIQUE | URL-friendly session identifier (unique across all sessions). slug is generated from the session title using URL-friendly slugification (lowercase, hyphens for spaces, alphanumeric only). Uniqueness is enforced by the UNIQUE constraint. If a collision occurs, append a short random suffix. |
| title | text NOT NULL | Session title |
| status | text NOT NULL | Enum: idle, busy, retry, archived. Default: idle |
| version | text NOT NULL | Schema version of the session's data column. Default: '1'. Incremented when the data format changes (e.g., new optional fields added). New fields should be optional in the schema, so version advances for breaking changes only. The hub uses this for migration-aware reads: version 1 sessions get default values for new fields. This field exists for forward compatibility — it allows the hub to interpret session data correctly as the schema evolves. It is NOT a concurrency version (for optimistic locking, use commonCols.updatedAt). |
| provider | text | Execution path: direct (hub AI SDK) or opencode (spoke) |
| roleName | text | Which role this session fills (e.g., "architect", "implementation-specialist"). Formerly agentName in OpenCode. See ADR-012 and agent-roles.md. roleName is a free-form string (not a FK constraint). Known role names are defined in the roles table, but sessions may use ad-hoc role names. Application code should validate against known roles when available but tolerate unknown values. |
| data | jsonb | Execution metadata (model, tokens, cost, finish reason, etc.) |
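The slug rule in the table above (lowercase, hyphens for spaces, alphanumeric only, short random suffix on collision) could be sketched roughly as follows. The helper names `slugify`, `randomSuffix`, and `uniqueSlug`, the `isTaken` callback, the four-character suffix, and the `"session"` fallback for empty titles are all assumptions made for illustration; the spec only fixes the rules, not an API.

```typescript
// Hypothetical helpers: the spec defines the slug rules but not these names.
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // keep only alphanumerics, whitespace, hyphens
    .trim()
    .replace(/[\s-]+/g, "-") // collapse whitespace/hyphen runs into one hyphen
    .replace(/^-+|-+$/g, ""); // strip leading and trailing hyphens
}

// Short random suffix. Math.random is fine here because the suffix only
// needs to avoid collisions, not to be unpredictable.
function randomSuffix(length = 4): string {
  const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
  let out = "";
  for (let i = 0; i < length; i++) {
    out += alphabet.charAt(Math.floor(Math.random() * alphabet.length));
  }
  return out;
}

// `isTaken` stands in for a lookup against the UNIQUE index (assumption).
function uniqueSlug(title: string, isTaken: (slug: string) => boolean): string {
  const base = slugify(title) || "session"; // fallback value is an assumption
  let candidate = base;
  while (isTaken(candidate)) {
    candidate = `${base}-${randomSuffix()}`;
  }
  return candidate;
}
```

A pre-check like `isTaken` can still race with a concurrent insert, so the UNIQUE constraint remains the actual guarantee; a caller would likely retry with a new suffix when the insert itself reports a conflict.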
data boundaries: Execution metadata goes in data (model, tokens, cost, finish reason, resolved permissions). Structured fields like status, provider, roleName are separate columns because they're queried, filtered, and constrained. If a field appears in WHERE clauses or JOINs, it should be a proper column, not buried in JSONB.
Session data shapes: The data JSONB column holds execution-path-specific metadata. For direct sessions: { model, tokens, cost, finish }. For opencode sessions: additional fields from opencode's session model (summary stats, etc.). The data column also holds the resolved permissions for the session (data.scope), which is computed from the intersection of role permissions, account scopes, and spoke type trust level. See agent-sessions.md and agent-roles.md for the full models.
Status lifecycle:
- idle: Session exists, not currently executing
- busy: Session is actively processing (LLM call in progress)
- retry: Last execution failed, session pending retry
- archived: Session is read-only, no further interaction
Indexes: unq_sessions_slug UNIQUE on (slug), idx_sessions_project_id on (projectId), idx_sessions_workspace_id on (workspaceId), idx_sessions_status on (status), idx_sessions_active partial on (id) WHERE status IN ('idle', 'busy', 'retry') — efficiently find active (non-archived) sessions, idx_sessions_account_id on (accountId), idx_sessions_role_name on (roleName), idx_sessions_parent_id on (parentId) — find child sessions of coordinator.
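As a small illustration of the status enum and the "active" predicate used by the idx_sessions_active partial index, application code might model the values like this. The names `SESSION_STATUSES`, `SessionStatus`, `isSessionStatus`, and `isActive` are hypothetical, not identifiers from the codebase.

```typescript
// The four status values listed in the sessions table.
const SESSION_STATUSES = ["idle", "busy", "retry", "archived"] as const;
type SessionStatus = (typeof SESSION_STATUSES)[number];

// Runtime guard for values read from the text column.
function isSessionStatus(value: string): value is SessionStatus {
  return (SESSION_STATUSES as readonly string[]).includes(value);
}

// Mirrors the partial index predicate: status IN ('idle', 'busy', 'retry').
function isActive(status: SessionStatus): boolean {
  return status !== "archived";
}
```

Keeping the predicate in one place makes it easier to keep query filters aligned with the partial index condition, which is what lets the planner use that index.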
messages
Messages within sessions. Content is stored separately in the parts table. This follows the opencode pattern: message metadata in one row, parts in separate rows. This enables streaming individual part updates, querying parts independently, and SSE events for message.part.updated.
| Column | Type | Notes |
|---|---|---|
| commonCols | — | id, metadata, createdAt, updatedAt |
| sessionId | text NOT NULL IMMUTABLE | FK → sessions.id (cascade) — Never updated after creation. |
| role | text NOT NULL | user, assistant, system |
| data | jsonb NOT NULL | Role-specific metadata |
Message IDs use UUIDv4 (via commonCols.id). Ordering is handled by the composite index idx_messages_session_id_created_at_id on (session_id, created_at, id). See ADR-003 for the rationale.
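Because UUIDv4 values carry no time information, the ordering implied by the composite index is creation time first, with the ID only breaking ties. An in-memory equivalent of that ordering might look like the sketch below; `MessageKey` and `compareMessages` are illustrative names, and the plain string comparison is an assumption that may differ from the database collation in edge cases.

```typescript
// Only the fields that take part in ordering.
type MessageKey = { id: string; createdAt: Date };

// Order by createdAt, then by id as a deterministic tie-breaker.
function compareMessages(a: MessageKey, b: MessageKey): number {
  const byTime = a.createdAt.getTime() - b.createdAt.getTime();
  if (byTime !== 0) return byTime;
  if (a.id < b.id) return -1;
  if (a.id > b.id) return 1;
  return 0;
}

// Usage: sort a copy of the rows for one session.
function sortMessages<T extends MessageKey>(rows: T[]): T[] {
  return [...rows].sort(compareMessages);
}
```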
Message data shapes (discriminated by role):
user messages:
{
time: { created: number }, // epoch ms
format?: "text" | "json_schema", // input format hint
summary?: { title?: string, body?: string, diffs?: FileDiff[] },
agent?: string, // target agent name
model?: { providerID: string, modelID: string },
tools?: Record<string, boolean>, // enabled tools for this turn
}
assistant messages:
{
time: { created: number, completed?: number },
parentID?: string, // FK to the user message that triggered this turn
modelID: string,
providerID: string,
agent?: string,
path?: { cwd: string, root: string },
cost?: number,
tokens?: { input: number, output: number, reasoning?: number, cache?: { read: number, write: number } },
finish?: string, // "stop", "tool-calls", "length", etc.
error?: { code: string, message: string }, // typed error if the turn failed
}
system messages:
{
time: { created: number },
content: string, // system prompt text
}
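Since the discriminator (role) lives in its own column while the shape lives in data, a reader of these rows might pair the two in a discriminated union so the compiler narrows data by role. The following is a sketch under that assumption: the type names and the `turnDurationMs` helper are hypothetical, and a few optional fields (summary, agent, path) are left out for brevity.

```typescript
type TokenUsage = {
  input: number;
  output: number;
  reasoning?: number;
  cache?: { read: number; write: number };
};

type UserData = {
  time: { created: number }; // epoch ms
  format?: "text" | "json_schema";
  model?: { providerID: string; modelID: string };
  tools?: Record<string, boolean>;
};

type AssistantData = {
  time: { created: number; completed?: number };
  parentID?: string;
  modelID: string;
  providerID: string;
  cost?: number;
  tokens?: TokenUsage;
  finish?: string;
  error?: { code: string; message: string };
};

type SystemData = { time: { created: number }; content: string };

// The role column selects which data shape applies.
type MessageRow =
  | { role: "user"; data: UserData }
  | { role: "assistant"; data: AssistantData }
  | { role: "system"; data: SystemData };

// Example of narrowing: only assistant messages can have a completion time.
function turnDurationMs(msg: MessageRow): number | undefined {
  if (msg.role !== "assistant") return undefined;
  const { created, completed } = msg.data.time;
  return completed === undefined ? undefined : completed - created;
}
```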
Compatibility with opencode: The data blob is a superset of opencode's InfoData. When importing an opencode session, the opencode-specific fields (parentID, path, modelID, providerID, cost, tokens, finish) map directly. When importing from a hub-direct AI SDK session, the AI SDK UIMessage fields are projected into the same shape.
Compatibility with AI SDK: The AI SDK's UIMessage format (role + parts array) is assembled from these tables via a JOIN query. Storage is normalized; the API presents the denormalized view. No format conversion needed.
parts
Message parts — the actual content of the conversation. Each part has a type discriminator and type-specific content in the data column. Parts are ordered by their id within a message, using sortable timestamp-based IDs (not commonCols.id).
Important: The id column for parts uses a sortable ID scheme (not UUIDv4 from commonCols). Opencode uses prefix-based sortable IDs like prt_{timestamp_hex}{random} that give chronological ordering. This enables ORDER BY id ASC within a message without needing a separate position column. The implementation should use a monotonic ID generator that produces lexicographically sortable IDs.
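A monotonic, lexicographically sortable generator along these lines could be sketched as below. The exact layout (12 hex characters of epoch milliseconds, a 4-hex-character per-millisecond counter, 8 hex characters of randomness) and the name `createPartIdGenerator` are assumptions chosen for illustration; this is not opencode's actual ID format, and monotonicity only holds within a single process.

```typescript
// Returns a function that produces strictly increasing, sortable part IDs.
// `now` is injectable so the clock can be controlled in tests.
function createPartIdGenerator(now: () => number = Date.now): () => string {
  let lastMs = 0;
  let counter = 0;

  return () => {
    let ms = now();
    if (ms <= lastMs) {
      // Same millisecond, or the clock moved backwards: stay on the last
      // timestamp and bump the counter so ordering is preserved.
      ms = lastMs;
      counter += 1;
      if (counter > 0xffff) {
        // Counter exhausted: borrow the next millisecond.
        ms += 1;
        counter = 0;
      }
    } else {
      counter = 0;
    }
    lastMs = ms;

    // Fixed-width lowercase hex keeps string order equal to numeric order.
    const time = ms.toString(16).padStart(12, "0");
    const seq = counter.toString(16).padStart(4, "0");
    const rand = Math.floor(Math.random() * 0x100000000)
      .toString(16)
      .padStart(8, "0");
    return `prt_${time}${seq}${rand}`;
  };
}
```

The counter sits before the random suffix so that two IDs minted in the same millisecond still compare in creation order; the random tail only reduces the chance of collisions between processes.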
The sessionId column on parts is a deliberate denormalization of message.sessionId — it allows direct queries like "all parts for a session" without joining through messages. sessionId on both messages and parts is IMMUTABLE after creation. It must never be updated. This is enforced by application logic, not a DB trigger. When inserting a part, read the message's sessionId and set it on the part within the same transaction. Direct SQL must not update sessionId on existing rows.
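The insert rule described above (read the message's sessionId and copy it onto the part inside the same transaction) might be wrapped in a helper like the following. The `Tx` interface is a hypothetical stand-in for whatever transaction handle the storage layer exposes, not a real driver or ORM API, and it is synchronous only to keep the sketch short.

```typescript
type PartRow = {
  id: string;
  messageId: string;
  sessionId: string;
  type: string;
  data: unknown;
};

// Hypothetical minimal transaction handle (assumption, not a library API).
interface Tx {
  getMessage(id: string): { id: string; sessionId: string } | undefined;
  insertPart(row: PartRow): void;
}

// Callers never supply sessionId: it is always copied from the parent message,
// so the denormalized column cannot drift from messages.sessionId.
function createPart(tx: Tx, input: Omit<PartRow, "sessionId">): PartRow {
  const message = tx.getMessage(input.messageId);
  if (!message) {
    throw new Error(`message ${input.messageId} not found`);
  }
  const row: PartRow = { ...input, sessionId: message.sessionId };
  tx.insertPart(row);
  return row;
}
```

Taking `sessionId` out of the input type is the design point here: since immutability is enforced by application logic rather than a trigger, the narrower the write path, the fewer places the invariant can be broken.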
| Column | Type | Notes |
|---|---|---|
| id | text PK NOT NULL | Sortable timestamp-based ID (not commonCols.id) |
| metadata | jsonb | defaults to {} |
| createdAt | timestamp with tz NOT NULL | defaults to now() |
| updatedAt | timestamp with tz NOT NULL | defaults to now(), $onUpdate(() => new Date()) |
| messageId | text NOT NULL | FK → messages.id (cascade) |
| sessionId | text NOT NULL IMMUTABLE | FK → sessions.id (cascade, denormalized for direct queries) — Never updated after creation. |
| type | text NOT NULL | Part type discriminator (see below) |
| data | jsonb NOT NULL | Type-specific content |
Parts are immutable after creation. updatedAt is set on creation and should never change afterwards: the $onUpdate hook on updatedAt only fires on UPDATE, so it never runs for insert-only rows. If a part needs correction, insert a new part (e.g., a correction or amendment) rather than updating an existing one. The id column uses a sortable ID scheme (not UUIDv4 from commonCols) because chronological ordering within a message is required; see the sortable ID note above.
Part types and their data shapes:
The type field determines the shape of data. Our part types are a subset of opencode's MessageV2.Part discriminated union, expanded with AI SDK compatibility types. The types we include are:
| type | Description | data shape |
|---|---|---|
| text | Main text content (user or assistant) | { text: string, synthetic?: boolean, ignored?: boolean, time?: { start: number, end: number }, metadata?: Record<string, unknown> } |
| reasoning | Chain-of-thought / extended thinking | { text: string, metadata?: Record<string, unknown>, time: { start: number, end: number } } |
| tool | Tool invocation with lifecycle state | { callID: string, tool: string, state: ToolState } (see below) |
| step-start | Beginning of an agentic step | { snapshot?: string } (git tree hash) |
| step-finish | End of an agentic step with cost accounting | { reason: string, snapshot?: string, cost?: number, tokens: { input: number, output: number, reasoning?: number, cache?: { read: number, write: number } } } |
| file | File attachment | { mime: string, filename?: string, url: string, source?: FileSource } |
| patch | Git patch applied during tool execution | { hash: string, files: string[] } |
| snapshot | Git tree hash reference | { snapshot: string } |
| agent | Sub-agent delegation (e.g., @reviewer) | { name: string, source?: { value: string, start: number, end: number } } |
| compaction | Context window compaction marker | { auto: boolean, overflow?: boolean } |
Tool state discriminated union (ToolState):
type ToolState =
| { status: "pending", input: Record<string, unknown>, raw: string }
| { status: "running", input: Record<string, unknown>, title?: string, metadata?: Record<string, unknown>, time: { start: number } }
| { status: "completed", input: Record<string, unknown>, output: string, title: string, metadata: Record<string, unknown>, time: { start: number, end: number, compacted?: boolean }, attachments?: FilePartData[] }
| { status: "error", input: Record<string, unknown>, error: string, metadata?: Record<string, unknown>, time: { start: number, end: number } }
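Because status is the discriminant, consumers can narrow a ToolState with an ordinary switch. As a rough illustration, the sketch below uses a trimmed copy of the union (only the fields it reads) and a hypothetical `toolDurationMs` helper; neither is taken from the codebase.

```typescript
// Trimmed copy of the union above: same discriminants, fewer fields.
type ToolState =
  | { status: "pending"; input: Record<string, unknown>; raw: string }
  | { status: "running"; input: Record<string, unknown>; time: { start: number } }
  | {
      status: "completed";
      input: Record<string, unknown>;
      output: string;
      time: { start: number; end: number };
    }
  | {
      status: "error";
      input: Record<string, unknown>;
      error: string;
      time: { start: number; end: number };
    };

// Only terminal states (completed, error) carry an end time.
function toolDurationMs(state: ToolState): number | undefined {
  switch (state.status) {
    case "completed":
    case "error":
      return state.time.end - state.time.start;
    case "pending":
    case "running":
      return undefined;
  }
}
```

Listing every status explicitly, rather than using a default branch, means the compiler flags this function if a new state is ever added to the union.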
File source types:
type FileSource =
| { type: "file", path: string, text: { value: string, start: number, end: number } }
| { type: "symbol", path: string, name: string, kind: number, range: LSPLikeRange, text: { value: string, start: number, end: number } }
| { type: "resource", clientName: string, uri: string, text: { value: string, start: number, end: number } }
type FilePartData = {
mime: string;
filename?: string;
url: string;
source?: FileSource;
};
AI SDK UIMessage compatibility: The API assembles UIMessage from messages + parts via JOIN. The mapping is:
- text (not ignored) → { type: "text", text }
- file (non-text, non-directory) → { type: "file", url, mediaType, filename }
- reasoning → { type: "reasoning", text }
- step-start → { type: "step-start" }
- tool (completed) → { type: "tool-{name}", state: "output-available", toolCallId, input, output }
- tool (error) → { type: "tool-{name}", state: "output-error", toolCallId, input, errorText }
Stored part types not mapped to the UIMessage view: step-finish, patch, snapshot, compaction, agent. These are either internal SDK events (step-finish, compaction), tool-execution metadata handled within the tool part's state lifecycle (patch, snapshot), or session-level delegation (agent, handled via sessions.parentId). They are stored in the parts table but excluded from the UIMessage assembly.
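The per-part mapping could be expressed as a single function, sketched below with trimmed-down stored shapes. Several details are assumptions rather than facts from this spec: the names `StoredPart`, `UIPart`, `isAttachableMime`, and `toUIPart`; the reading of "non-text, non-directory" as "MIME type does not start with text/ and is not application/x-directory"; and the choice to skip tool parts that are still pending or running, which the mapping list does not mention.

```typescript
type ToolState =
  | { status: "pending" | "running"; input: Record<string, unknown> }
  | { status: "completed"; input: Record<string, unknown>; output: string }
  | { status: "error"; input: Record<string, unknown>; error: string };

// Only the fields the mapping reads.
type StoredPart =
  | { type: "text"; data: { text: string; ignored?: boolean } }
  | { type: "reasoning"; data: { text: string } }
  | { type: "step-start"; data: { snapshot?: string } }
  | { type: "file"; data: { mime: string; filename?: string; url: string } }
  | { type: "tool"; data: { callID: string; tool: string; state: ToolState } }
  | { type: "step-finish" | "patch" | "snapshot" | "agent" | "compaction"; data: unknown };

type UIPart =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string }
  | { type: "step-start" }
  | { type: "file"; url: string; mediaType: string; filename?: string }
  | { type: `tool-${string}`; state: "output-available"; toolCallId: string; input: unknown; output: unknown }
  | { type: `tool-${string}`; state: "output-error"; toolCallId: string; input: unknown; errorText: string };

// Assumed interpretation of "non-text, non-directory".
function isAttachableMime(mime: string): boolean {
  return !mime.startsWith("text/") && mime !== "application/x-directory";
}

// Returns null for parts that are excluded from the UIMessage view.
function toUIPart(part: StoredPart): UIPart | null {
  switch (part.type) {
    case "text":
      return part.data.ignored ? null : { type: "text", text: part.data.text };
    case "reasoning":
      return { type: "reasoning", text: part.data.text };
    case "step-start":
      return { type: "step-start" };
    case "file": {
      const { mime, url, filename } = part.data;
      return isAttachableMime(mime) ? { type: "file", url, mediaType: mime, filename } : null;
    }
    case "tool": {
      const { callID, tool, state } = part.data;
      if (state.status === "completed") {
        return { type: `tool-${tool}`, state: "output-available", toolCallId: callID, input: state.input, output: state.output };
      }
      if (state.status === "error") {
        return { type: `tool-${tool}`, state: "output-error", toolCallId: callID, input: state.input, errorText: state.error };
      }
      return null; // pending/running: skipped here (assumption)
    }
    default:
      return null; // step-finish, patch, snapshot, agent, compaction
  }
}
```

Assembling a full message would then amount to loading its parts in id order, mapping each through `toUIPart`, and dropping the nulls.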
Why separate parts table: Streaming individual part updates, publishing message.part.updated SSE events, and querying parts independently (e.g., "find all tool calls in this session") all require parts to be their own rows, not embedded in a message JSON blob. This is the same pattern opencode uses and it works well at scale (100k+ parts across 24k+ messages in production).
Parts are flat — there is no parentId column on parts. Sub-agent delegation is handled at the session level (via sessions.parentId), not by nesting parts. If nesting becomes necessary in the future, it would require a schema change (adding parentId to parts).
Indexes: part_session_idx on (session_id), part_message_id_id_idx on (message_id, id) for efficient message loading, and idx_parts_session_id_type on (session_id, type) for queries like "all tool-call parts in session X".