hub/docs/architecture/storage/spokes.md
glm-5.1 93e2286343 Align storage & architecture specs with published npm libraries
Systematically compared @alkdev/taskgraph, @alkdev/operations, and
@alkdev/flowgraph against storage/arch specs and fixed all mismatches.

Key changes:

Tasks (storage/tasks.md + ADR-011):
- Rename TaskFrontmatter → TaskInput to match library export
- Fix dependsOn (was depends_on) in field mappings — library uses
  camelCase; parseFrontmatter normalizes YAML snake_case on input
- Document DependencyEdge shape {from, to, qualityRetention?} and
  DB↔library field mapping
- Document graph node vs DB column distinction (TaskGraphNodeAttrs
  is a subset of TaskInput)
- Fix default risk fallback from low → medium (matches resolveDefaults)
- Fix cross-project guard column references (dependentTaskId, not taskId)
- Clarify @alkdev/taskgraph TS is source of truth; frontmatter is for
  LLM output parsing and legacy imports, not Rust CLI
- Add complete library exports reference

Operations (storage/spokes.md + operations.md):
- Add version, title, _meta columns to operations table (required by
  OperationSpec, were missing)
- Fix type casing: query/mutation/subscription (lowercase, matching
  OperationType runtime values)
- Make outputSchema and accessControl NOT NULL (matching library)
- Document ErrorDefinition shape {code, description, schema, httpStatus?}
- Document _meta vs commonCols.metadata distinction
- Add registerAll, get, getHandler, getByName, list, subscribe methods
- Fix buildCallHandler signature ({ registry, callMap })
- Fix OperationType values (lowercase)

Call graph (storage/call-graph.md + call-graph.md):
- Change operationId to NOT NULL with RESTRICT FK (was nullable/SET NULL)
  — matches flowgraph's required CallNodeAttrs.operationId
- Document sentinel __removed__ operation strategy for deletions
- Document ISO 8601 string ↔ timestamptz conversion requirement
- Rewrite CallEventMap to match actual library: flat dot-notation keys,
  timestamp on all events, nested error structure, optional output on
  completed event
- Remove call.running event (doesn't exist in library) — hub calls
  updateStatus(running) directly on dispatch
- Fix buildCallHandler({ registry, callMap }) signature
- Fix PendingRequestMap constructor (positional EventTarget)
- Add updateCall/removeCall/graph methods to API summary
- Document abort cascade as hub logic, not flowgraph logic
- Add open questions for operation deletion and reactive vs call graph
  semantics

Table reference (storage/table-reference.md):
- Update call_graph_nodes.operationId cascade to RESTRICT
- Update operations.type comment to lowercase
- Update status enum reference
2026-05-25 11:46:42 +00:00


status: draft
last_updated: 2026-05-25

Table Schemas: Spokes & Operations

Spoke registration and operation specification tables. For cross-cutting reference (cascade behavior, index reference, status enums, relations), see table-reference.md. For design decisions, see ../../../decisions/. For spoke architecture, see ../../spoke-runner.md.

spokes

Spoke registrations. When a spoke connects to the hub via WebSocket, it calls hub.register with its details and operation list. The hub creates a spoke record and registers the operations. When the spoke disconnects, the record is updated with status: "disconnected".

| Column | Type | Notes |
| --- | --- | --- |
| commonCols | | id, metadata, createdAt, updatedAt |
| name | text NOT NULL | Spoke display name |
| status | text NOT NULL | Enum: connected, disconnected. Default: connected |
| spokeType | text NOT NULL | Spoke type: dev-env, client, compute |
| projectId | text | FK → projects.id (nullable; some spokes aren't project-scoped) |
| lastHeartbeat | timestamp with tz | Last heartbeat timestamp |
| hostInfo | jsonb | Host metadata ({ os, arch, nodeVersion, memory, cpu }) |
| connectedAt | timestamp with tz | When the spoke connected |
| disconnectedAt | timestamp with tz | When the spoke disconnected (null if still connected) |

Indexes: idx_spokes_project_id on (projectId), idx_spokes_status on (status), idx_spokes_name on (name) — look up spoke by name, idx_spokes_active partial on (id) WHERE status = 'connected' — efficiently find connected spokes.

No reconnecting status: Spoke reconnection is handled at the WebSocket layer, not in the database. When a spoke disconnects, its status becomes disconnected. When it reconnects, it's a new connection — the spoke row is updated back to connected with a new connectedAt. Transient reconnection attempts don't need a database state; they're a transport concern.

If monitoring of reconnection attempts is needed, use the call graph (a hub.register call from the spoke) or observability events (WebSocket reconnection logs), not a database status.
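The two-state model above can be sketched as a pair of pure transitions. This is a minimal illustration, not the hub's code: `SpokeConnectionState`, `markDisconnected`, and `markConnected` are hypothetical names, and timestamps are shown as ISO 8601 strings for simplicity.

```typescript
// Only two states exist; there is deliberately no "reconnecting" value.
type SpokeStatus = "connected" | "disconnected";

// Hypothetical subset of a spokes row, limited to the connection columns.
interface SpokeConnectionState {
  status: SpokeStatus;
  connectedAt: string | null;
  disconnectedAt: string | null;
}

// On disconnect: flip the status and stamp disconnectedAt.
function markDisconnected(s: SpokeConnectionState, now: Date): SpokeConnectionState {
  return { ...s, status: "disconnected", disconnectedAt: now.toISOString() };
}

// On reconnect: same row, back to connected, with a fresh connectedAt.
// disconnectedAt is cleared because the column is null while connected.
function markConnected(s: SpokeConnectionState, now: Date): SpokeConnectionState {
  return { ...s, status: "connected", connectedAt: now.toISOString(), disconnectedAt: null };
}
```

Transient retry attempts never pass through these functions; only a completed connect or disconnect changes the row.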

No capabilities column on spokes: A spoke's capabilities are its registered operations. Query operation_registrations filtered by providerId and status = 'active' to find what a connected spoke can do. The operations table holds the definitions. See ADR-006 in decisions/.
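As an illustration of the capability lookup described above, the following in-memory sketch filters registrations by provider and status, then resolves the matching definitions. The row interfaces are trimmed to the columns involved, and `capabilitiesOf` is a hypothetical helper name; a real implementation would express this as a join in SQL.

```typescript
// Trimmed, hypothetical row shapes; only the columns used by the lookup.
interface OperationDef {
  id: string;
  namespace: string;
  name: string;
}

interface Registration {
  operationId: string;
  providerType: "spoke" | "client";
  providerId: string;
  status: "active" | "inactive";
}

// Returns the operation definitions a spoke can currently serve:
// its active registrations, resolved against the operations table.
function capabilitiesOf(
  spokeId: string,
  registrations: Registration[],
  operations: OperationDef[],
): OperationDef[] {
  const activeIds = registrations
    .filter((r) => r.providerType === "spoke" && r.providerId === spokeId && r.status === "active")
    .map((r) => r.operationId);
  return operations.filter((op) => activeIds.indexOf(op.id) !== -1);
}
```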

Relationship to operations and registrations: When a spoke calls hub.register with an operations list, the hub creates or finds operations rows (definitions) for each operation, then creates operation_registrations rows linking the spoke to those definitions. When the spoke disconnects, registrations are set to inactive but definitions persist. See the operations and operation_registrations tables below.

Input mapping from hub.register: The hub.register operation (see spoke-runner.md) accepts { spokeId, operations[], spokeType, project, hardware }. This maps to the spokes table columns as: spokeId → id, spokeType → spokeType, project → projectId (looked up by project identifier), hardware → hostInfo. The name field may be derived from the spoke's configuration or provided separately. Each operation in operations[] maps to an operations row (definition, created or found by namespace+name) and an operation_registrations row (provider binding, linking the spoke to the definition).
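The field mapping can be sketched as a small pure function. Everything here is illustrative: the `RegisterInput` and `SpokeRow` types, the `resolveProjectId` callback (standing in for the lookup by project identifier), and the fallback of `name` to the spoke id are assumptions, not the hub's actual definitions.

```typescript
// Hypothetical input shape; field names follow the hub.register description.
interface RegisterInput {
  spokeId: string;
  spokeType: "dev-env" | "client" | "compute";
  project?: string; // assumed optional, since projectId is nullable
  hardware?: Record<string, unknown>;
  operations: { namespace: string; name: string }[];
}

// Hypothetical subset of the spokes row (commonCols other than id omitted).
interface SpokeRow {
  id: string;
  name: string;
  status: "connected" | "disconnected";
  spokeType: string;
  projectId: string | null;
  hostInfo: Record<string, unknown> | null;
  connectedAt: string;
  disconnectedAt: string | null;
}

function toSpokeRow(
  input: RegisterInput,
  resolveProjectId: (identifier: string) => string | null,
  now: Date,
  name?: string,
): SpokeRow {
  return {
    id: input.spokeId, // spokeId → id
    name: name ?? input.spokeId, // assumption: fall back to the id when no name is given
    status: "connected",
    spokeType: input.spokeType, // spokeType → spokeType
    projectId: input.project ? resolveProjectId(input.project) : null, // project → projectId
    hostInfo: input.hardware ?? null, // hardware → hostInfo
    connectedAt: now.toISOString(),
    disconnectedAt: null,
  };
}
```

The `operations[]` entries are not handled here; they feed the operations and operation_registrations tables rather than the spokes row.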

operations

Operation definitions — what an operation IS. These persist independently of spoke connections. Multiple providers can register the same operation (by namespace+name); they share the definition.

| Column | Type | Notes |
| --- | --- | --- |
| commonCols | | id, metadata, createdAt, updatedAt |
| namespace | text NOT NULL | Post-remap identifier (e.g., dev.{spokeId}.fs.read) |
| name | text NOT NULL | Operation name within namespace (e.g., fs.read, call) |
| version | text NOT NULL DEFAULT '1.0.0' | Semantic version of the operation definition. Required by OperationSpec.version. When a spoke re-registers with a different version, the hub updates this column. |
| type | text NOT NULL | query, mutation, subscription (lowercase, matching @alkdev/operations OperationType enum runtime values) |
| title | text | Display/UX title. Populated by FromOpenAPI (from OpenAPI summary) and MCP adapter (from MCP tool description). Nullable: native operations may not set this. Falls back to name for display. |
| description | text | Human-readable description |
| inputSchema | jsonb NOT NULL | TypeBox schema for input |
| outputSchema | jsonb NOT NULL | TypeBox schema for output. NOT NULL: OperationSpec requires this. Use {} (empty schema) for operations with no meaningful output. |
| errorSchemas | jsonb | Array of ErrorDefinition objects (see ErrorDefinition Shape). Nullable: operations with no declared error schemas leave this null. |
| accessControl | jsonb NOT NULL | AccessControl definition. NOT NULL: OperationSpec requires this. Use { requiredScopes: [] } for operations with no access restrictions. |
| tags | jsonb | String array for search/filter |
| _meta | jsonb | Operation-specific extension metadata. Distinct from commonCols.metadata (which is generic row-level metadata). Used by adapters: FromOpenAPI stores { method, path, summary }, MCP adapter stores MCP-specific metadata. Nullable: native operations may not set this. |
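The column defaults in the table can be illustrated with a small normalization step. This is a sketch under assumptions: `OperationInput`, `OperationRow`, `toOperationRow`, and `displayTitle` are hypothetical names, the input is deliberately looser than OperationSpec so the defaults are visible, and `AccessControl` is reduced to the one field the table mentions.

```typescript
// Minimal stand-in; the real AccessControl type may carry more fields.
interface AccessControl {
  requiredScopes: string[];
}

// Hypothetical loose input, so that the documented defaults have something to fill.
interface OperationInput {
  namespace: string;
  name: string;
  type: "query" | "mutation" | "subscription";
  inputSchema: object;
  version?: string;
  title?: string;
  outputSchema?: object;
  accessControl?: AccessControl;
}

// Hypothetical subset of the operations row.
interface OperationRow {
  namespace: string;
  name: string;
  type: string;
  version: string;
  title: string | null;
  inputSchema: object;
  outputSchema: object;
  accessControl: AccessControl;
}

function toOperationRow(input: OperationInput): OperationRow {
  return {
    namespace: input.namespace,
    name: input.name,
    type: input.type,
    version: input.version ?? "1.0.0", // column default
    title: input.title ?? null, // nullable
    inputSchema: input.inputSchema,
    outputSchema: input.outputSchema ?? {}, // empty schema: no meaningful output
    accessControl: input.accessControl ?? { requiredScopes: [] }, // no restrictions
  };
}

// Display falls back to name when title is not set.
function displayTitle(row: OperationRow): string {
  return row.title ?? row.name;
}
```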

Unique constraint: CREATE UNIQUE INDEX unq_operations_namespace_name ON operations (namespace, name) — operation definitions are unique by namespace+name, regardless of how many providers register them.

Indexes: idx_operations_namespace on (namespace), idx_operations_type on (type).

type column casing: Values are lowercase (query, mutation, subscription), matching the OperationType enum runtime values in @alkdev/operations. The enum names are uppercase (OperationType.QUERY) but the string values are lowercase ("query"). SQL queries should use lowercase: WHERE type = 'query'.
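The name/value split can be shown with a local stand-in enum. This block does not import @alkdev/operations; it mirrors the casing described above, and the member names other than QUERY, as well as the `isOperationType` guard, are assumptions made for illustration.

```typescript
// Local stand-in that mirrors the described casing: uppercase member
// names, lowercase string values. Not the library's actual export.
enum OperationType {
  QUERY = "query",
  MUTATION = "mutation",
  SUBSCRIPTION = "subscription",
}

const OPERATION_TYPE_VALUES: string[] = [
  OperationType.QUERY,
  OperationType.MUTATION,
  OperationType.SUBSCRIPTION,
];

// Hypothetical guard for values read from the type column:
// only the lowercase runtime values are accepted.
function isOperationType(value: string): value is OperationType {
  return OPERATION_TYPE_VALUES.indexOf(value) !== -1;
}
```

A guard like this would reject "QUERY" coming from a row written with the wrong casing, which is the mismatch the lowercase rule is meant to prevent.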

_meta vs commonCols.metadata: Both are JSONB but serve different purposes. _meta holds operation-specific adapter metadata (HTTP method/path for OpenAPI ops, protocol details for MCP ops). metadata holds generic row-level metadata (retention, audit, key versioning) with a namespacing convention (_subsystem.key). They are not interchangeable — _meta is set by the operation author/adapter, metadata is set by hub subsystems.

ErrorDefinition Shape

The errorSchemas column stores an array of ErrorDefinition objects (from @alkdev/operations):

interface ErrorDefinition {
  code: string;        // e.g., "INVALID_INPUT", "NOT_FOUND"
  description: string; // Human-readable description
  schema: unknown;     // TypeBox schema for error detail shape
  httpStatus?: number; // Optional HTTP status code mapping
}

This is the structured error contract between an operation and its callers. No errorSchemas = safe default with EXECUTION_ERROR wrapper (see call-graph.md error model).
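One way to read the fallback rule is sketched below: a raised error code is matched against the declared definitions, and anything undeclared is reported as EXECUTION_ERROR. The `ResolvedError` type and `resolveError` function are hypothetical, and the actual wrapper shape is defined in call-graph.md, not here.

```typescript
interface ErrorDefinition {
  code: string;
  description: string;
  schema: unknown;
  httpStatus?: number;
}

// Hypothetical result type for this sketch only.
interface ResolvedError {
  code: string;
  httpStatus: number | null;
  declared: boolean;
}

// errorSchemas is nullable in the operations table, so null is treated
// the same as an empty list of declared errors.
function resolveError(errorSchemas: ErrorDefinition[] | null, code: string): ResolvedError {
  const match = (errorSchemas ?? []).filter((d) => d.code === code)[0];
  if (match) {
    return { code: match.code, httpStatus: match.httpStatus ?? null, declared: true };
  }
  // Undeclared error: fall back to the generic wrapper code.
  return { code: "EXECUTION_ERROR", httpStatus: null, declared: false };
}
```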

operation_registrations

Provider registrations — which spoke/client PROVIDES an operation right now. Ephemeral data: these reflect the current runtime state of who can handle a call.

| Column | Type | Notes |
| --- | --- | --- |
| commonCols | | id, metadata, createdAt, updatedAt |
| operationId | text NOT NULL | FK → operations.id (CASCADE: deleting a definition removes all its registrations) |
| providerType | text NOT NULL | spoke or client (which provider type) |
| providerId | text NOT NULL | FK → spokes.id when providerType is spoke; FK → clients.id when providerType is client |
| preRemapNamespace | text | The original namespace before remapping (e.g., dev for dev.{spokeId}.fs.read). Stored for traceability. |
| preRemapName | text | The original name before remapping |
| status | text NOT NULL | active or inactive. Default: active. Set to inactive on disconnect, re-activated on reconnect. |
| metadata | jsonb | Provider-specific metadata (version, health, latency hints) |

Unique constraint: CREATE UNIQUE INDEX unq_operation_registrations_active ON operation_registrations (operationId, providerType, providerId) WHERE status = 'active' — only one active registration per provider per operation.
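The effect of the partial unique index can be mimicked in memory to show which rows it constrains. This is only an illustration of the rule; `Registration` is a trimmed row shape and `violatesActiveUniqueness` is a hypothetical helper, while the database index remains the actual enforcement.

```typescript
// Trimmed, hypothetical row shape with the indexed columns plus status.
interface Registration {
  operationId: string;
  providerType: "spoke" | "client";
  providerId: string;
  status: "active" | "inactive";
}

// True when inserting `candidate` would collide with an existing active row.
// Inactive rows are outside the index predicate, so they never collide.
function violatesActiveUniqueness(existing: Registration[], candidate: Registration): boolean {
  if (candidate.status !== "active") return false;
  return existing.some(
    (r) =>
      r.status === "active" &&
      r.operationId === candidate.operationId &&
      r.providerType === candidate.providerType &&
      r.providerId === candidate.providerId,
  );
}
```

Because inactive rows are exempt, historical registrations for the same provider and operation can accumulate without blocking a new active one.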

Indexes: idx_operation_registrations_operation_id on (operationId), idx_operation_registrations_provider_id on (providerId), idx_operation_registrations_status on (status).

Spoke registration lifecycle: When a spoke connects and registers, the hub:

  1. Creates/updates the spokes row
  2. For each operation the spoke provides:
    • Creates or finds the operations row (by namespace+name). If this is a new spoke instance providing a known operation, the definition already exists.
    • Creates an operation_registrations row linking the spoke to the operation definition, with status: 'active' and the pre-remap identifiers.
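The registration steps above can be sketched against an in-memory store. The `Store` shape, the id scheme, and the function names are hypothetical; a real implementation would run these steps in a database transaction and rely on the unique index on (namespace, name).

```typescript
interface OperationRow {
  id: string;
  namespace: string;
  name: string;
}

interface RegistrationRow {
  operationId: string;
  providerType: "spoke" | "client";
  providerId: string;
  preRemapNamespace: string;
  preRemapName: string;
  status: "active" | "inactive";
}

// Hypothetical in-memory stand-in for the two tables.
interface Store {
  operations: OperationRow[];
  registrations: RegistrationRow[];
}

interface IncomingOperation {
  namespace: string; // post-remap
  name: string; // post-remap
  preRemapNamespace: string;
  preRemapName: string;
}

// Finds the definition by namespace+name, creating it only if missing.
function findOrCreateOperation(store: Store, namespace: string, name: string): OperationRow {
  const found = store.operations.filter((o) => o.namespace === namespace && o.name === name)[0];
  if (found) return found;
  const created: OperationRow = { id: `op-${store.operations.length + 1}`, namespace, name };
  store.operations.push(created);
  return created;
}

// Adds one active registration per operation the spoke provides.
function registerSpokeOperations(store: Store, spokeId: string, ops: IncomingOperation[]): void {
  for (const op of ops) {
    const def = findOrCreateOperation(store, op.namespace, op.name);
    store.registrations.push({
      operationId: def.id,
      providerType: "spoke",
      providerId: spokeId,
      preRemapNamespace: op.preRemapNamespace,
      preRemapName: op.preRemapName,
      status: "active",
    });
  }
}
```

Two providers registering the same namespace+name end up with one definition and two registrations.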

When a spoke disconnects, the hub:

  1. Updates the spokes row to status: "disconnected"
  2. Sets all the spoke's operation_registrations rows to status: "inactive"
  3. Aborts in-flight calls via call protocol cascading

Operation definitions (in operations) are never deleted on disconnect. They persist for audit and potential reconnection.
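The registration side of the disconnect handling can be sketched as a pure update over registration rows. `RegistrationRow` is a trimmed, hypothetical shape and `deactivateProviderRegistrations` is an illustrative name; aborting in-flight calls belongs to the call protocol and is not modeled here.

```typescript
// Trimmed, hypothetical row shape.
interface RegistrationRow {
  operationId: string;
  providerType: "spoke" | "client";
  providerId: string;
  status: "active" | "inactive";
}

// Returns a copy of the rows with the given provider's active registrations
// set to inactive. No rows are removed, and operation definitions are not
// touched at all.
function deactivateProviderRegistrations(
  registrations: RegistrationRow[],
  providerType: "spoke" | "client",
  providerId: string,
): RegistrationRow[] {
  return registrations.map((r): RegistrationRow =>
    r.providerType === providerType && r.providerId === providerId && r.status === "active"
      ? { ...r, status: "inactive" }
      : r,
  );
}
```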

When an admin deletes a spoke row (rare):

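  1. operation_registrations with that providerId are deleted by the application, since providerId has no DB-level FK to cascade from (see Polymorphic FK Enforcement for providerId). This is ephemeral data and follows the D1 cascade policy for ephemeral config.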
  2. If no other registrations exist for an operation, its definition may be cleaned up separately
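The admin-delete path can be sketched as two small steps: remove the provider's registrations, then list definitions that no longer have any registration as cleanup candidates. Both function names and the trimmed `Registration` shape are hypothetical, and whether orphaned definitions are actually removed is left to a separate process, as stated above.

```typescript
// Trimmed, hypothetical row shape.
interface Registration {
  operationId: string;
  providerType: "spoke" | "client";
  providerId: string;
}

// Application-managed cascade: drop every registration owned by the provider.
function deleteProviderRegistrations(
  registrations: Registration[],
  providerType: "spoke" | "client",
  providerId: string,
): Registration[] {
  return registrations.filter(
    (r) => !(r.providerType === providerType && r.providerId === providerId),
  );
}

// Operation ids with no remaining registrations of any status.
// These are only candidates; this sketch never deletes a definition.
function orphanedOperationIds(operationIds: string[], registrations: Registration[]): string[] {
  return operationIds.filter((id) => !registrations.some((r) => r.operationId === id));
}
```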

Polymorphic FK Enforcement for providerId

operation_registrations.providerId is a polymorphic FK: it references spokes.id when providerType = 'spoke' and clients.id when providerType = 'client'. Postgres does not support multi-target FK constraints natively. The current approach uses application-layer enforcement:

  • No DB-level FK on providerId — referential integrity is enforced by the application at registration time
  • onDelete behavior is also application-managed: when a spoke disconnects, registrations are set to inactive; when an admin deletes a spoke, registrations are CASCADE-deleted by the application

This is a pragmatic trade-off: polymorphic FKs in a single column are awkward in Postgres (requiring triggers or check constraints with multiple nullable FK columns). The application layer already knows the provider type at registration time, making enforcement straightforward.
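Application-layer enforcement amounts to choosing the lookup target from providerType before inserting a registration. The sketch below uses in-memory id lists in place of the spokes and clients tables; `ProviderTables` and `assertProviderExists` are hypothetical names.

```typescript
type ProviderType = "spoke" | "client";

// Hypothetical stand-in for the two tables a providerId may point at.
interface ProviderTables {
  spokeIds: string[];
  clientIds: string[];
}

// Throws unless providerId exists in the table selected by providerType.
// This plays the role a DB-level FK would play for a single-target column.
function assertProviderExists(
  tables: ProviderTables,
  providerType: ProviderType,
  providerId: string,
): void {
  const ids = providerType === "spoke" ? tables.spokeIds : tables.clientIds;
  if (ids.indexOf(providerId) === -1) {
    throw new Error(`Unknown ${providerType} provider: ${providerId}`);
  }
}
```

Note that an id valid for one provider type is rejected for the other, which is the integrity property the missing FK would otherwise guarantee.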

Alternative approaches (deferred):

  • Two nullable FK columns (spokeId and clientId) with a CHECK constraint ensuring exactly one is set
  • A trigger that validates providerId against the correct table based on providerType

Open Questions

  1. Operation deletion and call graph integrity: An operations row referenced by call_graph_nodes.operationId cannot be deleted while call records exist (RESTRICT FK). Two strategies: (a) deny the removal while any call records reference the operation, or (b) reassign those call records to a sentinel __removed__ operation row (namespace system) before deleting. Strategy (a) is simpler and recommended for v1. Strategy (b) adds write overhead and requires the sentinel row to be pre-seeded in migrations, so that it exists before any call record is reassigned to it.

  2. providerId FK enforcement: Should operation_registrations.providerId use application-layer enforcement (current), triggers, or separate nullable FK columns? See Polymorphic FK Enforcement section above.
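For open question 1, strategy (a) can be sketched as a guard that mirrors what the RESTRICT FK does in the database. The `CallGraphNodeRef` type and `deleteOperation` function are hypothetical, and the in-memory arrays stand in for the operations and call_graph_nodes tables.

```typescript
// Hypothetical minimal view of a call_graph_nodes row.
interface CallGraphNodeRef {
  id: string;
  operationId: string;
}

// Strategy (a): refuse the deletion while any call record references the
// operation. Returns the remaining operation ids when deletion is allowed.
function deleteOperation(
  operationIds: string[],
  callNodes: CallGraphNodeRef[],
  operationId: string,
): string[] {
  const refs = callNodes.filter((n) => n.operationId === operationId).length;
  if (refs > 0) {
    throw new Error(`Operation ${operationId} is referenced by ${refs} call record(s)`);
  }
  return operationIds.filter((id) => id !== operationId);
}
```

Checking in the application before issuing the delete lets the hub return a descriptive error instead of surfacing a raw constraint violation.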