status	last_updated
draft	2026-05-22

Spoke: WebSocket-Connected Operation Provider

Overview

A "spoke" is any process connected to the hub via a persistent websocket that provides and/or consumes operations. The hub-spoke protocol is the same four operations that MCP agents use: list, search, schema, call. There is one contract — the spoke is just another client of the hub's operation interface, except it also provides operations to the hub's registry.

A spoke can be many things:

Dev env spoke — exposes local dev tools (bash, file ops, fs.read, fs.write) to the hub
Client spoke — a user's local machine, where the hub can call operations like notifications or local integrations back to the user
GPU compute spoke — a vast.ai instance exposing CUDA operations
Any future spoke — anything that connects, lists its ops, and responds to calls

Design Principles

One contract — the hub-spoke protocol is list/search/schema/call. Same operations, same event shapes, whether the consumer is an MCP agent, a browser client, or another spoke. No separate "runner management" protocol.
WebSocket is the transport — persistent bidirectional connection. The hub pushes call.requested; the spoke pushes call.responded or call.error. The call protocol is unchanged: WebSocketEventTarget (@alkdev/pubsub/event-target-websocket-client on the spoke, @alkdev/pubsub/event-target-websocket-server on the hub) serves as the TypedEventTarget implementation.
Bidirectional — the hub calls operations on the spoke (dispatch), and the spoke calls operations on the hub (e.g., publishing events, calling other spokes' operations through the hub). Same protocol in both directions.
Registration = list — when a spoke connects, it calls hub.register and includes its operation list. The hub now knows what that spoke can do. No separate registration protocol.
Filtered by identity — list and search return operations scoped to the caller's identity. An admin sees everything. A dev env spoke sees only the operations it's allowed to call. This prevents context bloat and enforces access control at the discovery layer.
Op remapping — a dev env spoke exposes fs.read, fs.write, bash.exec, etc. The hub maps these to its own dev.fs.read, dev.fs.write, dev.bash.exec (or similar namespaced form) so they don't collide with hub-native operations. When an LLM calls dev.fs.read, the hub routes to the right spoke. From the LLM's perspective it's just a call — it doesn't know or care which spoke executes it.
No persistent state — the spoke is ephemeral. All state lives in the hub's Postgres. PendingRequestMap and CallHandler come from @alkdev/operations.
Stateless on reconnect — if the websocket drops, the spoke reconnects. The hub aborts in-flight calls via call protocol cascading. On reconnect, hub.register re-establishes what the spoke can do.

Why WebSocket, Not Redis or HTTP

Redis Pub/Sub	HTTP Long-Poll	WebSocket
Spoke needs Redis access	Spoke is always a client	Spoke is always a client
Separate channels for dispatch vs results	Polling latency	Bidirectional, push-based
`spoke:{id}:dispatch` + `spoke:{id}:results`	POST result back after poll	Same connection, same protocol
Requires Redis on spoke's network	Works anywhere but slow	Works anywhere, fast
Hub mediates via Redis, not call protocol	Hub mediates via HTTP, not call protocol	Call protocol flows end-to-end

External compute (vast.ai, ubicloud) won't have Redis access. A user's laptop running a client spoke won't have Redis. WebSocket works from anywhere with just an internet connection, and gives us bidirectional push. The call protocol's TypedEventTarget abstraction means the hub's PendingRequestMap (from @alkdev/operations) doesn't care whether the event traverses Redis, in-process EventTarget, or a websocket.

The hub uses Redis internally for its own cross-process event routing (see pubsub-redis.md). Spokes don't need to know about Redis.

Spoke Types

Dev Env Spoke

Wraps local development tools. The spoke scans its local operation definitions (bash, filesystem, git) and registers them with the hub on connect. The hub remaps these into a namespace (e.g., dev.*) so an LLM agent working with this spoke gets dev.fs.read, dev.bash.exec, etc. in its list results.

This is what replaces the per-opencode-container MCP server model. Instead of each container running its own MCP server with open-websearch etc., the container runs a dev env spoke. The hub provides shared infrastructure operations (websearch, coordination); the spoke provides local dev tools.

Client Spoke

A user's local machine or browser. The hub can call operations on the client spoke — for example, sending a notification, triggering a local action, or providing a callback for a long-running agent task. The client spoke might expose only a few operations (client.notify, client.openUrl, client.confirm), but the bidirectional nature means the hub can push to the user proactively.

From the LLM's perspective, calling client.notify is just another call. It doesn't know the operation routes to the user's laptop.

GPU Compute Spoke

# On vast.ai instance
curl -fsSL https://alk.dev/install-spoke | sh
alk-spoke start --hub <hub-url> --token <token> --capability cuda

Same websocket, same hub.register with its operation list. The hub routes compute.train or compute.infer to it.

Container Spoke (deferred)

Extends the base spoke with Docker container lifecycle management and opencode integration. One variant is a dev server spoke that manages opencode containers on a compute server, wrapping container start/stop/restart as operations. A separate variant (without Docker) will target cloud compute instances. Both are just spokes with extra operations: they register like any other spoke, and the hub dispatches to them.

Prerequisite: Working hub + minimal base spoke first. The open-coordinator plugin's container/worktree patterns inform the design but are not a runtime dependency.

Identity-Filtered Discovery

The list and search operations return different results based on the caller's identity. This is access control at the discovery layer:

Identity	What `list`/`search` returns
Admin	All operations across all connected spokes + hub-native
Dev env spoke (authenticated)	Hub operations it's allowed to call + its own operations
Dev env spoke's LLM agent	Operations the LLM is allowed to call (dev tools, coordination, search)
Client spoke	Hub operations scoped to that user + any client-callable ops
Unauthenticated	Nothing (auth required)

This is why list/search/schema/call are operations, not just passive endpoints — they go through CallHandler which checks the operation's AccessControl (requiredScopes, resource permissions) against the caller's Identity. The hub can also filter based on the spoke type (dev env vs client vs compute) and the spoke's declared capabilities.
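
The filtering described above can be sketched as a scope check applied before list or search returns anything. This is a minimal illustration, not the @alkdev/operations implementation: the OperationSpecSketch and IdentitySketch shapes, the scope strings, and the "admin" bypass are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the real AccessControl and Identity
// types live in @alkdev/operations and may differ.
interface OperationSpecSketch {
  id: string;
  requiredScopes: string[];
}

interface IdentitySketch {
  subject: string;
  scopes: string[];
}

// Return only the operations whose required scopes are all granted to the caller.
// An undefined identity (unauthenticated caller) sees nothing.
function listForIdentity(
  ops: OperationSpecSketch[],
  identity: IdentitySketch | undefined,
): OperationSpecSketch[] {
  if (identity === undefined) return [];
  // Assumption: an "admin" scope bypasses per-operation checks.
  if (identity.scopes.includes("admin")) return [...ops];
  const granted = new Set(identity.scopes);
  return ops.filter((op) => op.requiredScopes.every((s) => granted.has(s)));
}

const registrySketch: OperationSpecSketch[] = [
  { id: "hub.search", requiredScopes: ["hub:read"] },
  { id: "dev.fs.read", requiredScopes: ["dev:fs"] },
  { id: "compute.train", requiredScopes: ["compute:run"] },
];

const devAgent: IdentitySketch = { subject: "agent-1", scopes: ["hub:read", "dev:fs"] };
const visibleToAgent = listForIdentity(registrySketch, devAgent).map((op) => op.id);
```

Filtering at discovery time does not replace the check at call time; it only keeps operations the caller cannot use out of its list results.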

Op remapping in practice: when a dev env spoke registers with fs.read, fs.write, bash.exec, the hub stores these as dev.{spokeId}.fs.read, dev.{spokeId}.fs.write, dev.{spokeId}.bash.exec. For LLM agents using this spoke, list can collapse the prefix to just dev.fs.read if only one dev env spoke is active for that session. If multiple dev env spokes are connected, the full dev.{spokeId}.* form disambiguates.
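
A small sketch of the remapping and prefix collapse, assuming operations are tracked per spoke in a map. The helper names remapOperationId and displayIds are hypothetical; only the dev.{spokeId}.* naming comes from the text.

```typescript
// Hypothetical helpers; the "dev" namespace follows the example in the text.

// Store a spoke-local operation id under the hub's namespaced form.
function remapOperationId(namespace: string, spokeId: string, localId: string): string {
  return `${namespace}.${spokeId}.${localId}`;
}

// Produce the ids shown to a session. With exactly one active spoke in the
// namespace the spokeId segment is dropped; otherwise the full form is kept.
function displayIds(namespace: string, opsBySpoke: Map<string, string[]>): string[] {
  const collapse = opsBySpoke.size === 1;
  const result: string[] = [];
  opsBySpoke.forEach((localIds, spokeId) => {
    for (const localId of localIds) {
      result.push(
        collapse
          ? `${namespace}.${localId}`
          : remapOperationId(namespace, spokeId, localId),
      );
    }
  });
  return result;
}

const oneSpoke = new Map<string, string[]>([["s1", ["fs.read", "bash.exec"]]]);
const twoSpokes = new Map<string, string[]>([
  ["s1", ["fs.read"]],
  ["s2", ["fs.read"]],
]);

const collapsedIds = displayIds("dev", oneSpoke);
const fullIds = displayIds("dev", twoSpokes);
```

Here collapsedIds holds the short dev.fs.read form, while fullIds keeps the spoke id so the two fs.read operations stay distinguishable.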

Registration Flow

Registration is a spoke calling hub.register — a regular operation call over the websocket:

Spoke connects (WS)
  │
  ├── Auth (token in first message or WS handshake)
  │
  ├── Spoke calls: hub.register { runnerId, operations[], spokeType, project, hardware }
  │   └── Hub's hub.register handler:
  │       ├── Stores spoke's websocket reference
  │       ├── Remaps spoke's operations into hub namespace
  │       ├── Adds to RunnerPool
  │       └── Returns { runnerId, status: "connected" }
  │
  └── Spoke is now registered. Hub can dispatch to it; it can call hub ops.

On reconnect: the spoke calls hub.register again. The hub refreshes the spoke's entry and its operation list. Any in-flight calls from the previous connection were already aborted by the call protocol on disconnect.

On disconnect: the hub detects the closed websocket, aborts in-flight calls via call protocol cascading, and marks the spoke disconnected. The spoke's remapped operations are removed from the hub's registry so list/search no longer return them.
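
The register, reconnect, and disconnect behavior above can be sketched with an in-memory pool. RunnerPoolSketch is a hypothetical stand-in for the hub's RunnerPool, and using spokeType as the namespace prefix is an assumption made for the example. The websocket reference and auth are left out.

```typescript
// Hypothetical in-memory stand-in for the hub's RunnerPool and registry.
interface RegisterInput {
  runnerId: string;
  operations: string[];
  spokeType: string;
}

interface PoolEntry {
  spokeType: string;
  remappedOps: string[];
}

class RunnerPoolSketch {
  private entries = new Map<string, PoolEntry>();

  // hub.register: overwrites any previous entry, so a reconnecting spoke
  // simply refreshes what the hub knows about it.
  register(input: RegisterInput): { runnerId: string; status: "connected" } {
    // Assumption: the spoke type doubles as the namespace prefix.
    const remappedOps = input.operations.map(
      (op) => `${input.spokeType}.${input.runnerId}.${op}`,
    );
    this.entries.set(input.runnerId, { spokeType: input.spokeType, remappedOps });
    return { runnerId: input.runnerId, status: "connected" };
  }

  // Called for hub.unregister or when the websocket closes.
  remove(runnerId: string): { status: "disconnected" } {
    this.entries.delete(runnerId);
    return { status: "disconnected" };
  }

  // Everything list/search could return before identity filtering.
  listOperations(): string[] {
    const all: string[] = [];
    this.entries.forEach((entry) => all.push(...entry.remappedOps));
    return all;
  }
}

const pool = new RunnerPoolSketch();
pool.register({ runnerId: "s1", operations: ["fs.read"], spokeType: "dev" });
// Reconnect with an updated operation list: the entry is replaced, not duplicated.
const reconnectResult = pool.register({
  runnerId: "s1",
  operations: ["fs.read", "fs.write"],
  spokeType: "dev",
});
const afterReconnect = pool.listOperations();
pool.remove("s1");
const afterDisconnect = pool.listOperations();
```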

Spoke Lifecycle

1. Start
   ├── Load config (hub WS URL, auth token)
   ├── Scan local operations (OperationRegistry.scan via `@alkdev/operations` with `ScannerFS` Deno adapter)
   ├── Open websocket to hub (wss://api.alk.dev/ws)
   ├── Call hub.register with runnerId + operation list + spokeType + hardware
   │   └── Hub stores spoke in RunnerPool, remaps operations
   └── Heartbeat via WS ping/pong

2. Running
   ├── Receive call.requested over WS (hub dispatching an operation to this spoke)
   │   ├── Execute via local OperationRegistry
   │   ├── Send call.responded (or call.error) back over WS
   │   └── Call graph tracked on hub side via parentRequestId
   ├── Receive call.aborted over WS
   │   └── Abort local execution (AbortController cascade)
   └── Send call.requested over WS to hub (spoke calling a hub operation)
       └── Hub responds with call.responded

3. Disconnect / Reconnect
   ├── WebSocket drops
   ├── Hub detects missed heartbeats
   │   └── Abort in-flight calls dispatched to spoke (call protocol cascading)
   ├── Spoke reconnects
   │   └── Call hub.register again → hub refreshes
   └── Or spoke shuts down gracefully
       └── Call hub.unregister before closing WS

Dispatch Flow

Hub                                      Spoke
 │                                          │
 │──── call.requested ─────────────────────→│  (hub → spoke: "execute this")
 │                                          ├── CallHandler validates
 │                                          ├── registry.execute(operationId, input)
 │←─── call.responded ────────────────────│  (spoke → hub: "here's the result")
 │                                          │
 │──── call.aborted ──────────────────────→│  (hub → spoke: "cancel this")
 │                                          ├── AbortController.abort()
 │←─── call.aborted ──────────────────────│  (spoke → hub: "confirmed")
 │                                          │
 │←─── call.requested ─────────────────────│  (spoke → hub: "call a hub op")
 │──── call.responded ────────────────────→│  (hub → spoke: "result")

The call protocol is fully bidirectional over the websocket. The hub dispatches operations to the spoke; the spoke calls hub operations. Same CallEventMap, same requestId correlation, same error model.
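
The spoke's half of the first exchange in the diagram might look like the sketch below. The event payload fields (operationId, input, output, code, message) are assumptions, since the real CallEventMap is defined in @alkdev/operations. Handlers are synchronous here to keep the example short, and EXECUTION_FAILED is a made-up error code.

```typescript
// Hypothetical event shapes; the real CallEventMap may use different fields.
type CallRequested = {
  type: "call.requested";
  requestId: string;
  operationId: string;
  input: unknown;
};
type CallResponded = { type: "call.responded"; requestId: string; output: unknown };
type CallError = { type: "call.error"; requestId: string; code: string; message: string };

type LocalHandler = (input: unknown) => unknown;

// Execute a dispatched call against the spoke's local handlers and build the
// reply event. The requestId is echoed back so the hub can correlate it.
function handleCallRequested(
  handlers: Map<string, LocalHandler>,
  request: CallRequested,
): CallResponded | CallError {
  const handler = handlers.get(request.operationId);
  if (handler === undefined) {
    return {
      type: "call.error",
      requestId: request.requestId,
      code: "OPERATION_NOT_FOUND",
      message: `unknown operation: ${request.operationId}`,
    };
  }
  try {
    const output = handler(request.input);
    return { type: "call.responded", requestId: request.requestId, output };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { type: "call.error", requestId: request.requestId, code: "EXECUTION_FAILED", message };
  }
}

const localHandlers = new Map<string, LocalHandler>([
  ["math.double", (input) => (input as number) * 2],
]);

const okReply = handleCallRequested(localHandlers, {
  type: "call.requested",
  requestId: "r1",
  operationId: "math.double",
  input: 21,
});
const missingReply = handleCallRequested(localHandlers, {
  type: "call.requested",
  requestId: "r2",
  operationId: "math.triple",
  input: 1,
});
```

Because the same event shapes flow in both directions, the hub could reuse a function of this form when a spoke calls a hub operation.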

WebSocketEventTarget

Available in @alkdev/pubsub:

Spoke side: @alkdev/pubsub/event-target-websocket-client — createWebSocketEventTarget(ws) wraps a WebSocket instance as a TypedEventTarget
Hub side: @alkdev/pubsub/event-target-websocket-server — creates a WebSocketEventTarget for each incoming spoke connection

Both implement the same TypedEventTarget interface as RedisEventTarget, using EventEnvelope for structured cross-process messaging.

On the hub side, each spoke's websocket connection gets a WebSocketEventTarget. The hub creates a PendingRequestMap (from @alkdev/operations) scoped to that spoke. When the hub needs to call an operation on a specific spoke, it uses that spoke's PendingRequestMap.call() — the event traverses the websocket, the spoke handles it, the response comes back, the Promise resolves.
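
To make the requestId correlation concrete, here is a rough sketch of a per-spoke call map. PendingCallsSketch is hypothetical and is not the PendingRequestMap API (the text names only its call() method); the send callback stands in for writing to that spoke's WebSocketEventTarget.

```typescript
// Hypothetical stand-in for a per-spoke call map; the real PendingRequestMap
// in @alkdev/operations may expose a different API.
type OutboundRequest = {
  type: "call.requested";
  requestId: string;
  operationId: string;
  input: unknown;
};

type Waiter = { resolve: (output: unknown) => void; reject: (error: Error) => void };

class PendingCallsSketch {
  private waiters = new Map<string, Waiter>();
  private counter = 0;
  private send: (event: OutboundRequest) => void;

  constructor(send: (event: OutboundRequest) => void) {
    this.send = send;
  }

  // Register a waiter under a fresh requestId, then push the request to the spoke.
  call(operationId: string, input: unknown): Promise<unknown> {
    const requestId = `req-${++this.counter}`;
    return new Promise<unknown>((resolve, reject) => {
      this.waiters.set(requestId, { resolve, reject });
      this.send({ type: "call.requested", requestId, operationId, input });
    });
  }

  // Handle call.responded: resolve the matching waiter. Returns false for unknown ids.
  settle(requestId: string, output: unknown): boolean {
    const waiter = this.waiters.get(requestId);
    if (waiter === undefined) return false;
    this.waiters.delete(requestId);
    waiter.resolve(output);
    return true;
  }

  // Handle a dropped websocket: reject every in-flight call and report how many.
  abortAll(reason: string): number {
    const count = this.waiters.size;
    this.waiters.forEach((waiter) => waiter.reject(new Error(reason)));
    this.waiters.clear();
    return count;
  }

  get inFlight(): number {
    return this.waiters.size;
  }
}

const sentIds: string[] = [];
const callsToSpoke = new PendingCallsSketch((event) => {
  sentIds.push(event.requestId);
});
// Attach handlers so neither promise is left unhandled.
callsToSpoke.call("dev.fs.read", { path: "README.md" }).then(() => {});
callsToSpoke.call("dev.bash.exec", { cmd: "ls" }).catch(() => {});
const settledFirst = callsToSpoke.settle("req-1", "file contents");
const abortedCount = callsToSpoke.abortAll("spoke disconnected");
```

Scoping one such map to each connection means a disconnect only aborts the calls dispatched to that spoke.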

Hub-Side WebSocket Handling (Architectural Task)

The hub needs a WebSocket server component that handles the other side of spoke connections. This is an architectural task that needs deeper design:

Hono WebSocket upgrade — app.get("/ws", upgradeWebSocket(...)) handler
Per-connection WebSocketEventTarget — create a WebSocketEventTarget for each incoming spoke connection
Per-connection PendingRequestMap — scoped callMap for dispatching to this specific spoke
Spoke lifecycle — on connect: hub.register → create event target + call map → add to RunnerPool; on disconnect: abort in-flight calls → remove from pool
Identity/authentication — verify token at upgrade or first message, attach to OperationContext.identity

This connects the pubsub system's WebSocketEventTarget (@alkdev/pubsub/event-target-websocket-client for spokes, @alkdev/pubsub/event-target-websocket-server for the hub) with the hub's PendingRequestMap and CallHandler (from @alkdev/operations). The full design needs to account for reconnection, heartbeat, and the interaction with the existing RedisEventTarget (@alkdev/pubsub) for cross-process event routing.

Hub-Side Operations

Spoke management and discovery are just operations in the hub's registry — the same ones the MCP interface exposes:

Operation	Input	Output	Description
`hub.register`	`{ runnerId, operations[], spokeType, project, hardware }`	`{ runnerId, status: "connected" }`	Register spoke, remap its operations
`hub.unregister`	`{ runnerId }`	`{ status: "disconnected" }`	Graceful disconnect, abort in-flight calls
`hub.list`	`{ namespace?, q? }`	`OperationSpec[]`	List available ops (filtered by caller identity)
`hub.search`	`{ q, namespace? }`	`{ tool, description }[]`	Search ops (filtered by caller identity)
`hub.schema`	`{ tool }`	`{ inputSchema, outputSchema }`	Get schemas for an operation
`hub.call`	`{ calls: [{ tool, input }] }`	`{ success, result/error }[]`	Execute operations (routes to correct spoke)

When an MCP agent calls search, it's calling hub.search. When a spoke calls hub.register, it's using the same interface. One contract.

Routing in hub.call:

Operation starts with hub.* → execute locally in hub's registry
Operation matches a spoke's remapped namespace → dispatch via that spoke's WebSocketEventTarget
Operation not found → OPERATION_NOT_FOUND error via call protocol
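
The three routing rules can be sketched as a pure function. The RouteDecision type and the lookup structures are hypothetical. Treating an unknown hub.* id as not found is an assumption the text does not spell out.

```typescript
// Hypothetical result type for illustration.
type RouteDecision =
  | { kind: "local"; operationId: string }
  | { kind: "spoke"; spokeId: string; operationId: string }
  | { kind: "error"; code: "OPERATION_NOT_FOUND" };

function routeCall(
  operationId: string,
  hubOps: Set<string>, // hub-native operation ids
  spokeOps: Map<string, string>, // remapped operation id -> spokeId
): RouteDecision {
  // Rule 1: hub.* operations execute in the hub's own registry.
  if (operationId.startsWith("hub.")) {
    return hubOps.has(operationId)
      ? { kind: "local", operationId }
      : { kind: "error", code: "OPERATION_NOT_FOUND" };
  }
  // Rule 2: a remapped id is dispatched to the spoke that registered it.
  const spokeId = spokeOps.get(operationId);
  if (spokeId !== undefined) {
    return { kind: "spoke", spokeId, operationId };
  }
  // Rule 3: nothing matched.
  return { kind: "error", code: "OPERATION_NOT_FOUND" };
}

const hubNative = new Set<string>(["hub.search", "hub.list"]);
const remappedToSpoke = new Map<string, string>([["dev.fs.read", "s1"]]);

const localRoute = routeCall("hub.search", hubNative, remappedToSpoke);
const spokeRoute = routeCall("dev.fs.read", hubNative, remappedToSpoke);
const unknownRoute = routeCall("compute.train", hubNative, remappedToSpoke);
```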

What a Spoke Does NOT Have

No Postgres connection
No Redis connection
No HTTP API server (it's a websocket client, not a server)
No UI of any kind
No session storage
No task graph
No call graph (the hub tracks the graph; the spoke just executes and responds)
No separate "spoke protocol" — same operation interface as everyone else

It is an operation provider/consumer connected to the hub by a single websocket.

Composability Note

MCP as an RPC protocol has a fundamental limitation: you can't get return types from MCP servers, so MCP tools aren't composable. This is fine for LLMs calling tools interactively, but it breaks programmatic composition — you can't chain MCP tools together or build higher-level operations from MCP tool outputs. That's what started the toolEnv POC research in the first place.

Our operations avoid this because every operation has typed inputSchema and outputSchema (TypeBox/JSON Schema). You can compose: the output of dev.fs.read can feed into the input of hub.search because schemas are known and type-checkable. MCP tools can't do this.

Schema Wire Format

Schemas travel over the wire as JSON Schema, not as TypeBox objects. TypeBox schemas are a superset of JSON Schema (they add [Kind] symbols for runtime type checking), so JSON.parse(JSON.stringify(typeboxSchema)) produces valid JSON Schema. On the receiving end, FromSchema() decorates plain JSON Schema with [Kind] symbols to create TypeBox TSchema objects suitable for Value.Check() validation.

This means:

TypeScript spokes using TypeBox: schemas serialize naturally (TypeBox schemas are already valid JSON Schema apart from the [Kind] symbols, which are dropped on serialization).
TypeScript spokes using Zod or Valibot: the scanner converts to TypeBox at registration time via @alkdev/operations/from-typemap (see ADR-013), then serializes the result as JSON Schema.
Non-TypeScript spokes (Python, Rust, etc.): send JSON Schema directly. Any language with a JSON Schema library and a WebSocket client can implement a spoke. No TypeBox dependency required.
The hub deserializes incoming JSON Schema via FromSchema() (from @alkdev/operations/from-schema) — same path used for MCP tools and OpenAPI specs (from @alkdev/operations/from-openapi).

This makes the hub-spoke protocol language-agnostic at the schema level. The hub's internal use of TypeBox for validation is an implementation detail, not a protocol requirement.
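
The serialization step relies on standard JSON.stringify behavior: symbol-keyed properties are skipped. The sketch below shows this with a local symbol standing in for TypeBox's [Kind], so it runs without importing TypeBox; KindSketch and the sample schema are illustrative only.

```typescript
// Stand-in for TypeBox's [Kind] symbol; this sketch deliberately avoids
// importing TypeBox itself.
const KindSketch = Symbol("Kind");

const schemaWithSymbol = {
  [KindSketch]: "Object",
  type: "object",
  properties: {
    path: { [KindSketch]: "String", type: "string" },
  },
  required: ["path"],
};

// JSON.stringify skips symbol-keyed properties, so the round trip yields
// plain JSON Schema that any language can consume.
const wireJson = JSON.stringify(schemaWithSymbol);
const wireSchema = JSON.parse(wireJson);

const symbolKeysBefore = Object.getOwnPropertySymbols(schemaWithSymbol).length;
const symbolKeysAfter = Object.getOwnPropertySymbols(wireSchema).length;
```

After the round trip the object carries only string keys, which is why the receiving side needs a step such as FromSchema() to restore the runtime markers.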

Wire Schema Constraints

Schemas sent over the wire must be self-contained JSON Schema — no external $refs, no $defs/definitions. The hub's FromSchema() converter handles the commonly-used JSON Schema subset (objects, arrays, primitives, allOf/anyOf/oneOf, enum, const, format annotations) but not features like patternProperties, if/then/else, or not (see ADR-013 for the full coverage table).

The hub enforces security constraints on inbound schemas:

Depth limit (suggested: 10 levels of nesting) — prevents stack overflow from deeply nested allOf/anyOf
Size limit (suggested: 64KB per schema) — prevents oversized payloads
No circular $refs — the hub rejects schemas with $ref or $defs/definitions, or pre-processes by inlining with cycle detection

Unsupported JSON Schema features silently degrade to Type.Unknown() (accepts any value — safe but unvalidated). The hub should log degradation warnings to help spoke authors fix their schemas.
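
A minimal sketch of the inbound checks, using the suggested limits from the list above. checkWireSchema is a hypothetical helper, not the hub's implementation. Depth is counted in nested JSON object levels, and size is measured in UTF-16 code units of the serialized schema (equal to bytes for ASCII-only schemas). Both are simplifying assumptions.

```typescript
const MAX_SCHEMA_DEPTH = 10; // suggested limit from the text
const MAX_SCHEMA_SIZE = 64 * 1024; // suggested limit from the text

type SchemaCheck = { ok: true } | { ok: false; reason: string };

function checkNode(node: unknown, depth: number): SchemaCheck {
  if (node === null || typeof node !== "object") return { ok: true };
  if (Array.isArray(node)) {
    // Arrays (e.g. the members of allOf) do not add a level in this sketch.
    for (const item of node) {
      const itemResult = checkNode(item, depth);
      if (!itemResult.ok) return itemResult;
    }
    return { ok: true };
  }
  if (depth > MAX_SCHEMA_DEPTH) return { ok: false, reason: "depth limit exceeded" };
  for (const [key, value] of Object.entries(node)) {
    // Limitation: keys are matched anywhere, so a property literally named
    // "definitions" would also be rejected. A real check would need to be
    // aware of keyword position.
    if (key === "$ref" || key === "$defs" || key === "definitions") {
      return { ok: false, reason: `unsupported keyword: ${key}` };
    }
    const childResult = checkNode(value, depth + 1);
    if (!childResult.ok) return childResult;
  }
  return { ok: true };
}

// Input is expected to come from JSON.parse, so it cannot contain object cycles.
function checkWireSchema(schema: unknown): SchemaCheck {
  if (JSON.stringify(schema).length > MAX_SCHEMA_SIZE) {
    return { ok: false, reason: "size limit exceeded" };
  }
  return checkNode(schema, 1);
}

// Build a schema with the given number of nested object levels.
function nestedSchema(levels: number): Record<string, unknown> {
  let current: Record<string, unknown> = { type: "string" };
  for (let i = 1; i < levels; i++) {
    current = { type: "array", items: current };
  }
  return current;
}

const simpleCheck = checkWireSchema({
  type: "object",
  properties: { path: { type: "string" } },
});
const refCheck = checkWireSchema({ $ref: "#/components/Foo" });
const deepCheck = checkWireSchema(nestedSchema(11));
```

Rejecting outright is the simpler of the two options mentioned above; inlining references with cycle detection would replace the keyword check rather than extend it.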

For "legacy" systems like opencode that only speak MCP, we expose an MCP endpoint as a thin adapter over the same hub.list/hub.search/hub.schema/hub.call operations. The MCP endpoint is a compatibility layer, not the primary interface.

Open Questions

How does a spoke receive its project context? — Does the hub tell it which git repo to clone, or does it come pre-configured?
Container lifecycle — See "Container Spoke (deferred)" above. Container lifecycle management will be handled by a container spoke that extends the base spoke.
Source sync for external compute — Does a GPU spoke clone from Gitea automatically, or does the hub push source?
WebSocket auth — Token in first message after connect, or token in query string / subprotocol header? (Related: hub-architecture.md API auth model)
Concurrent operations per spoke — Can a spoke handle multiple call.requested events concurrently? Concurrent handling is better for SUBSCRIPTION operations.
Operation list freshness — Does the spoke re-register on reconnect only, or does it push updates when its registry changes?
