docs: fix inconsistencies in architecture specs

- Replace hub/spoke with head/worker terminology in call-protocol.md,
  auth.md, open-questions.md, napi-and-pubsub.md
- Update operation paths from /{spoke}/{service}/{op} to
  /{node}/{service}/{op} throughout call-protocol.md
- Unify Identity struct: auth.md already had {id, scopes, resources};
  add a note clarifying that this is canonical (vs research/services.md,
  which used {node_id, fingerprint, scopes})
- Update the integration-plan.md inconsistencies section to track what
  has been fixed (hub/spoke, identity model) and expand service naming
  to include external services
- Update the last_updated date in auth.md and call-protocol.md

ADRs are intentionally left unchanged as historical records.
2026-06-07 07:50:00 +00:00
parent 69d232fda7
commit 6db1266672
5 changed files with 88 additions and 82 deletions
--- a/docs/architecture/auth.md
+++ b/docs/architecture/auth.md
@@ -1,6 +1,6 @@
 ---
 status: draft
-last_updated: 2026-06-04
+last_updated: 2026-06-07
 ---

 # Authentication & Identity
@@ -95,11 +95,19 @@ pub struct Identity {
 }
 ```

+> **Note on identity models**: Earlier research used `{node_id, fingerprint, scopes}`.
+> The unified model uses `{id, scopes, resources}` where `id` serves as both
+> fingerprint (for key-based auth from config) and account UUID (for
+> database-backed auth). The `resources` field provides resource-level
+> authorization beyond what scopes offer. This is the canonical definition
+> that all components should use.
+```
+
 **Default implementation**: `ConfigIdentityProvider` loads from
 `DynamicConfig.auth` (the `authorized_keys` set). Every authorized key gets a
 default scope set. No database required.

-**Hub implementation**: Backed by `@alkdev/storage`'s `peer_credentials` and
+**Head implementation**: Backed by `@alkdev/storage`'s `peer_credentials` and
 `accounts` tables plus the ACL graph. Resolves fingerprint → account →
 organization membership → effective scopes. Uses `ArcSwap` for hot reload.

--- a/docs/architecture/call-protocol.md
+++ b/docs/architecture/call-protocol.md
@@ -1,6 +1,6 @@
 ---
 status: draft
-last_updated: 2026-06-04
+last_updated: 2026-06-07
 ---

 # Call Protocol
@@ -11,15 +11,15 @@ A bidirectional, transport-agnostic call and event protocol that runs over
 authenticated pipes. It supports request/response calls, streaming
 subscriptions, and unidirectional events — all using the same wire format. The
 protocol is defined as a spec + handler + registry; downstream consumers (NAPI,
-Python, hub/spoke) register their own operations without modifying core.
+Python, head/worker) register their own operations without modifying core.

 ## Why

 The current control channel (ADR-018) is unidirectional (client → server) and
 provides fire-and-forget event dispatch without request/response semantics.
 The call protocol generalizes it to support bidirectional calls (ADR-024) and
-downstream service registration (ADR-025), enabling the hub/spoke model where
-spokes expose operations the hub invokes.
+downstream service registration (ADR-025), enabling the head/worker model where
+workers expose operations the head invokes.

 ## Architecture

@@ -28,10 +28,10 @@ spokes expose operations the hub invokes.
 Operation names use slash-based paths aligned with URL routing conventions:

 ```
-/{spoke}/{service}/{op}
+/{node}/{service}/{op}
 ```

-- **spoke** — identity prefix of the node that exposes the operation. The hub
+- **node** — identity prefix of the node that exposes the operation. The head
  uses this segment to route calls to the correct connected node.
 - **service** — the logical service namespace. Groups related operations
  under one handler prefix.
@@ -41,11 +41,11 @@ Examples:

 | Path | Meaning |
 |------|---------|
-| `/dev1/fs/readFile` | Spoke `dev1`, service `fs`, operation `readFile` |
-| `/dev1/bash/exec` | Spoke `dev1`, service `bash`, operation `exec` |
-| `/hub/agent/chat` | Hub's own `agent` service, operation `chat` |
-| `/hub/sessions/list` | Hub's own `sessions` service, operation `list` |
-| `/browser-1/notify/alert` | Browser spoke `browser-1`, `notify` service |
+| `/dev1/fs/readFile` | Node `dev1`, service `fs`, operation `readFile` |
+| `/dev1/bash/exec` | Node `dev1`, service `bash`, operation `exec` |
+| `/head/agent/chat` | Head's own `agent` service, operation `chat` |
+| `/head/sessions/list` | Head's own `sessions` service, operation `list` |
+| `/browser-1/notify/alert` | Worker `browser-1`, `notify` service |

 This three-level routing mirrors iroh's ALPN dispatch: the first segment
 routes to a connected node (like ALPN routes to a protocol handler), the
@@ -110,11 +110,11 @@ The `id` field carries the `requestId` for correlation.

 ### Bidirectional Calls and Routing

-Both sides of a connection can initiate calls. The hub routes calls to spokes
+Both sides of a connection can initiate calls. The head routes calls to workers
 using the first path segment:

 ```
-Hub (server)                              Spoke: "dev1" (client)
+Head (server)                              Worker: "dev1" (client)
     │                                           │
     │  call.requested                           │
     │  name: "/dev1/fs/readFile"                │
@@ -126,11 +126,11 @@ Hub (server)                              Spoke: "dev1" (client)
     │  payload: { content: "fn main()..." }     │
     │◀──────────────────────────────────────────│
     │                                           │
-     │          Spoke exposes /dev1/fs/*,        │
-     │          /dev1/bash/* to hub              │
+     │          Worker exposes /dev1/fs/*,       │
+     │          /dev1/bash/* to head             │
     │                                           │
     │◀─ call.requested ────────────────────────│
-     │  name: "/hub/agent/chat"                  │
+     │  name: "/head/agent/chat"                 │
     │  payload: { provider: "anthropic", ... }  │
     │                                           │
     │── call.responded ──────────────────────▶ │
@@ -138,54 +138,54 @@ Hub (server)                              Spoke: "dev1" (client)
     │  payload: { completion: "..." }            │
 ```

-The hub's registry includes:
-- **Hub-local operations** (`/hub/*`) — handled directly
-- **Remote operations** (`/{spoke}/*`) — forwarded to the spoke connection
+The head's registry includes:
+- **Head-local operations** (`/head/*`) — handled directly
+- **Remote operations** (`/{node}/*`) — forwarded to the worker connection

-When the hub routes `/dev1/fs/readFile` to spoke `dev1`, it strips the spoke
-prefix and delivers the call to the spoke's local registry as `/fs/readFile`.
-The spoke doesn't need to know its own alias.
+When the head routes `/dev1/fs/readFile` to worker `dev1`, it strips the node
+prefix and delivers the call to the worker's local registry as `/fs/readFile`.
+The worker doesn't need to know its own alias.

-### Hub/Spoke Architecture
+### Head/Worker Architecture

 ```
         ┌─────────────────────────────────┐
-         │              Hub                │
+         │           Head Node             │
         │                                 │
-         │  Hub-local services:            │
-         │  /hub/agent/chat   (LLM coord)  │
-         │  /hub/agent/complete            │
-         │  /hub/sessions/list             │
-         │  /hub/sessions/history          │
+         │  Head-local services:           │
+         │  /head/agent/chat  (LLM coord)  │
+         │  /head/agent/complete           │
+         │  /head/sessions/list            │
+         │  /head/sessions/history         │
         │                                 │
-         │  Spoke registry (discovered):   │
-         │  /dev1/fs/* → dev1 connection    │
-         │  /dev1/bash/* → dev1 connection  │
-         │  /dev2/fs/* → dev2 connection    │
-         │  /browser-1/notify/* → WT conn  │
+         │  Worker registry (discovered):  │
+         │  /dev1/fs/* → dev1 connection   │
+         │  /dev1/bash/* → dev1 connection │
+         │  /dev2/fs/* → dev2 connection   │
+         │  /browser-1/notify/* → WT conn  │
         └──────┬───────┬───────┬──────────┘
                │       │       │
      ┌─────────▼┐ ┌───▼────┐ ┌▼───────────┐
-      │  Dev Spoke│ │Dev Spk │ │Browser Spoke│
-      │  "dev1"   │ │"dev2"  │ │"browser-1"  │
-      │  /fs/*    │ │/fs/*   │ │/notify/*    │
-      │  /bash/*  │ │/bash/* │ │             │
-      │  /search/*│ │        │ │             │
-      └───────────┘ └────────┘ └─────────────┘
+      │  Worker  │ │Worker  │ │Browser Worker│
+      │  "dev1"  │ │"dev2"  │ │"browser-1"  │
+      │  /fs/*   │ │/fs/*   │ │/notify/*    │
+      │  /bash/* │ │/bash/* │ │             │
+      │  /search/*│ │       │ │             │
+      └──────────┘ └────────┘ └─────────────┘
 ```

-When a spoke connects, it registers its operations with the hub:
+When a worker connects, it registers its operations with the head:

 ```
-spoke → hub:  call.requested { name: "/hub/services/register", payload: {
-  spoke: "dev1",
+worker → head:  call.requested { name: "/head/services/register", payload: {
+  node: "dev1",
  operations: ["/fs/readFile", "/fs/writeFile", "/bash/exec", "/search/query"]
 }}
 ```

-The hub adds these to its routing table with the spoke prefix. Other spokes
+The head adds these to its routing table with the node prefix. Other workers
 and browser clients can then call `/dev1/fs/readFile` without knowing how
-the hub routes it internally.
+the head routes it internally.

 ### Operation Registry

@@ -223,7 +223,7 @@ pub struct AccessControl {
 registry.register(OperationSpec { name: "/services/list", ... }, list_services_handler);
 registry.register(OperationSpec { name: "/services/schema", ... }, schema_handler);

-// A dev env spoke registers its tools
+// A dev env worker registers its tools
 registry.register(OperationSpec { name: "/fs/readFile", ... }, fs_read_handler);
 registry.register(OperationSpec { name: "/bash/exec", ... }, bash_exec_handler);

@@ -231,10 +231,10 @@ registry.register(OperationSpec { name: "/bash/exec", ... }, bash_exec_handler);
 registry.register(OperationSpec { name: "/notify/alert", ... }, notify_handler);
 ```

-Core-provided operations use short paths without a spoke prefix
+Core-provided operations use short paths without a node prefix
 (`/services/list`, `/services/schema`). They live on whatever node the
-caller is connected to. Spoke-prefixed operations (`/dev1/fs/readFile`)
-are routed by the hub.
+caller is connected to. Node-prefixed operations (`/dev1/fs/readFile`)
+are routed by the head.

 ### ACL Per Operation Path

@@ -242,12 +242,12 @@ Access control maps to path prefixes using standard URL-like matching:

 | Pattern | Matches | Purpose |
 |---------|---------|---------|
-| `/dev1/*` | All operations on spoke `dev1` | Full access to a spoke |
-| `/*/fs/*` | `fs` service on any spoke | Read file access across dev envs |
-| `/*/bash/*` | `bash` service on any spoke | Shell access (higher risk) |
-| `/hub/agent/*` | Hub LLM agent | LLM calls |
-| `/hub/sessions/*` | Hub session management | Session history |
-| `/browser-1/notify/alert` | Specific operation on specific spoke | One UI notification |
+| `/dev1/*` | All operations on node `dev1` | Full access to a worker |
+| `/*/fs/*` | `fs` service on any node | Read file access across dev envs |
+| `/*/bash/*` | `bash` service on any node | Shell access (higher risk) |
+| `/head/agent/*` | Head LLM agent | LLM calls |
+| `/head/sessions/*` | Head session management | Session history |
+| `/browser-1/notify/alert` | Specific operation on specific node | One UI notification |

 Higher-risk operations (shell, filesystem write) can require tighter scopes
 than read-only operations. The ACL evaluates against the caller's
@@ -337,20 +337,20 @@ translation at the wire level.

 ### Agent Service Pattern

-The hub commonly runs an agent service that coordinates between LLM providers
+The head commonly runs an agent service that coordinates between LLM providers
 and tool calls. This service is just another set of registered operations —
 no special treatment:

-- `/hub/agent/chat` — send a message, get a completion. Routes to the
-  appropriate LLM provider based on available spokes and configuration.
-- `/hub/agent/complete` — streaming completion. Yields tokens as they arrive.
-- `/hub/sessions/list` — list session histories (backed by Honker or other
+- `/head/agent/chat` — send a message, get a completion. Routes to the
+  appropriate LLM provider based on available workers and configuration.
+- `/head/agent/complete` — streaming completion. Yields tokens as they arrive.
+- `/head/sessions/list` — list session histories (backed by Honker or other
  durable storage).
-- `/hub/sessions/history` — retrieve a specific session's message history.
+- `/head/sessions/history` — retrieve a specific session's message history.

-The agent service uses the same call protocol to invoke tools on spokes:
+The agent service uses the same call protocol to invoke tools on workers:
 `/dev1/fs/readFile` for file access, `/dev1/bash/exec` for shell commands. It
-stores session state via whatever mechanism the hub deployment provides — core
+stores session state via whatever mechanism the head deployment provides — core
 doesn't mandate Honker or any specific storage.

 ## Constraints
@@ -364,15 +364,15 @@ doesn't mandate Honker or any specific storage.
  admin operations are exposed through the call protocol itself.
 - Batch is not a protocol primitive. Multiple `call.requested` events with
  correlated `requestId`s provide equivalent semantics.
-- The spoke prefix in the operation path is a routing mechanism, not a security
+- The node prefix in the operation path is a routing mechanism, not a security
  boundary. ACL is enforced at the `AccessControl` level, not by path prefix
-  alone. A spoke that exposes `/dev1/bash/exec` can restrict access via
+  alone. A worker that exposes `/dev1/bash/exec` can restrict access via
  `required_scopes` — not every authenticated identity should have shell access.

 ## Open Questions

-- **OQ-20**: How does the hub track which spokes expose which operations when
-  spokes connect and disconnect? Registration on connect and cleanup on
+- **OQ-20**: How does the head track which workers expose which operations when
+  workers connect and disconnect? Registration on connect and cleanup on
  disconnect, or heartbeat-based discovery? See
  [open-questions.md](open-questions.md).

--- a/docs/architecture/napi-and-pubsub.md
+++ b/docs/architecture/napi-and-pubsub.md
@@ -14,7 +14,7 @@ Two integration layers that enable TypeScript/JavaScript consumers to use alknet

 ## Why

-The alknet Rust binary serves CLI users. But the broader ecosystem (pubsub, operations, agent spokes) is TypeScript-first. These integration layers let TypeScript code use alknet's transport without reimplementing SSH.
+The alknet Rust binary serves CLI users. But the broader ecosystem (pubsub, operations, agent workers) is TypeScript-first. These integration layers let TypeScript code use alknet's transport without reimplementing SSH.

 The NAPI surface is intentionally minimal — it exposes transport connections as duplex streams, not the full SSH protocol. The pubsub adapter wraps those streams with `EventEnvelope` serialization.

@@ -127,14 +127,11 @@ The alknet server uses a reserved `direct_tcpip` destination (`alknet-control:0`
 2. Instead of opening a TCP connection, it bridges the channel to its local pubsub event bus
 3. `EventEnvelope` JSON flows bidirectionally over the SSH channel

-Users who prefer not to use the control channel can alternatively run a pubsub hub on a specific port and use standard port forwarding: `alknet connect --forward 9736:hub:9736`. This is a deployment choice, not a separate implementation — alknet's port forwarding works normally for any TCP service.
+Users who prefer not to use the control channel can alternatively run a pubsub service on a specific port and use standard port forwarding: `alknet connect --forward 9736:head:9736`. This is a deployment choice, not a separate implementation — alknet's port forwarding works normally for any TCP service.

-### Direction Agnostic
+- **Worker connects to head**: `alknet connect --forward 9736:head:9736` then create WebSocket event target pointing at `ws://localhost:9736`

-Because alknet supports both local and remote port forwarding, the event target works in either direction:
-
-- **Worker connects to hub**: `alknet connect --forward 9736:hub:9736` then create WebSocket event target pointing at `ws://localhost:9736`
-- **Hub connects to worker**: `alknet connect --remote-forward 9736:worker:9736` — same result, opposite initiator
+- **Head connects to worker**: `alknet connect --remote-forward 9736:worker:9736` — same result, opposite initiator

 The pubsub adapter doesn't care which side initiated the SSH session. It just needs a byte stream.

--- a/docs/architecture/open-questions.md
+++ b/docs/architecture/open-questions.md
@@ -154,18 +154,18 @@ last_updated: 2026-06-04

 ## Call Protocol

-### OQ-20: Spoke registration and discovery on connect/disconnect
+### OQ-20: Worker registration and discovery on connect/disconnect
 - **Origin**: [call-protocol.md](call-protocol.md)
 - **Status**: open
 - **Priority**: medium
 - **Resolution**: (pending — registration on connect / cleanup on disconnect is the leading approach)
 - **Cross-references**: ADR-024, ADR-025

-### OQ-21: Routing calls to specific spokes with same-service operations
+### OQ-21: Routing calls to specific workers with same-service operations
 - **Origin**: [call-protocol.md](call-protocol.md)
 - **Status**: ~~resolved~~
 - **Priority**: ~~medium~~ —
-- **Resolution**: ADR-024, ADR-025 — Operation paths use `/{spoke}/{service}/{op}` format. The first path segment identifies the spoke and routes the call to the correct connected node. Multiple spokes exposing the same service (e.g., two dev envs both with `/fs/*`) are differentiated by the spoke prefix (`/dev1/fs/readFile` vs `/dev2/fs/readFile`). The hub maintains a routing table mapping spoke identity to connection. This mirrors iroh's ALPN dispatch: first segment = routing key.
+- **Resolution**: ADR-024, ADR-025 — Operation paths use `/{node}/{service}/{op}` format. The first path segment identifies the node and routes the call to the correct connected node. Multiple workers exposing the same service (e.g., two dev envs both with `/fs/*`) are differentiated by the node prefix (`/dev1/fs/readFile` vs `/dev2/fs/readFile`). The head maintains a routing table mapping node identity to connection. This mirrors iroh's ALPN dispatch: first segment = routing key.
 - **Cross-references**: [call-protocol.md](call-protocol.md), ADR-024, ADR-025

 ### OQ-22: Client streaming (streaming inputs) in the call protocol?
--- a/docs/research/integration-plan.md
+++ b/docs/research/integration-plan.md
@@ -628,17 +628,18 @@ These must have answers before implementation begins:

 The research documents have a few areas that need reconciliation:

-1. **Hub/spoke vs head/worker**: core.md and services.md use head/worker. call-protocol.md still uses hub/spoke in several places. All docs need to be updated consistently. ADR-034 formalizes this.
+1. **Hub/spoke vs head/worker**~~: core.md and services.md use head/worker. call-protocol.md still uses hub/spoke in several places. All docs need to be updated consistently. ADR-034 formalizes this.~~ **Fixed**: call-protocol.md, auth.md, open-questions.md, and napi-and-pubsub.md updated to head/worker terminology. ADRs are historical records and retain original terminology. ADR-034 still needed to formalize the decision.

 2. **DNS as transport vs interface**: core.md conflates "DNS as transport" (encoding bytes as DNS queries) with "DNS as naming/discovery" (TXT records). The three-layer model cleanly separates these: DNS transport is Layer 1, DNS naming is a separate concern (similar to DNS-SD or iroh-dns).

-3. **Service naming collision — irpc service vs call protocol operation**: The research uses "service" for both irpc protocol enums (AuthProtocol, SecretProtocol) and call protocol path-based handlers (`/head/auth/verify`, `/head/secrets/derive`). These are different concepts that compose through OperationEnv. The architecture should consistently use:
+3. **Service naming collision — irpc service vs call protocol operation vs external service**: The research uses "service" for both irpc protocol enums (AuthProtocol, SecretProtocol) and call protocol path-based handlers (`/head/auth/verify`, `/head/secrets/derive`). These are different concepts that compose through OperationEnv. The architecture should consistently use:
   - **irpc service** for in-cluster, Rust-to-Rust protocol enums dispatched by variant (AuthProtocol::VerifyPubkey)
   - **operation** for path-based call protocol handlers dispatched by namespace + name (`/head/auth/verify`)
+   - **external service** for any endpoint reachable via the call protocol from another node or over an interface — an HTTP endpoint, a vast.ai instance, another head node. These are "services" in the broadest sense but sit outside the cluster. They're reachable through OperationEnv's remote dispatch path.
   - An irpc service can back an operation — the OperationEnv routes to the right dispatch path automatically
   - Both are "services" in the broad sense, but the dispatch mechanism differs. OperationEnv unifies them.

-4. **Identity model divergence**: auth.md defines `Identity` with `{id, scopes, resources}`. services.md defines `Identity` with `{node_id, fingerprint, scopes}`. These need to be unified. Proposed: `{id, scopes, resources}` where `id` is a fingerprint (for key-based auth) or account UUID (for database-backed auth).
+4. **Identity model divergence**~~: auth.md defines `Identity` with `{id, scopes, resources}`. services.md defines `Identity` with `{node_id, fingerprint, scopes}`. These need to be unified. Proposed: `{id, scopes, resources}` where `id` is a fingerprint (for key-based auth) or account UUID (for database-backed auth).~~ **Fixed**: auth.md already has the correct unified definition `{id, scopes, resources}`. Added a note in auth.md calling out the unification. services.md (research) still uses the old form — will be corrected when the services spec is formally written.

 5. **OperationEnv is a universal composition mechanism, not an implementation detail**: services.md defines `OperationEnv` as `HashMap<String, HashMap<String, fn(Value, OperationContext) -> ResponseEnvelope>>`. This is not a TypeScript pattern to be "translated" to Rust as an irpc Client<S>. The OperationEnv composition model is what makes operations universally addressable across HTTP, MCP, DNS, call protocol, and irpc. The Rust implementation can use typed method dispatch or a registry behind the scenes, but the behavioral contract — namespace + operation name → invoke with input, return output — must match. Adapters (MCP, HTTP, DNS) map to this interface. Handlers compose through this interface. irpc is one dispatch backend for OperationEnv, not a replacement for it.