alknet/docs/architecture/services.md
glm-5.1 e7941da04a docs: clarify phase boundaries — Phase 1 vs downstream concerns
The architecture specs were implying that StorageIdentityProvider, irpc
service implementations, and application services (agent, Docker, etc.)
already exist. This commit makes the phasing explicit:

- services.md: deployment topology now clearly labels 'Current (Phase 1)'
  vs 'Future (Phase 2+)', notes that application services are downstream
- identity.md: StorageIdentityProvider labeled 'Future — Phase 2+',
  clarifying alknet-storage doesn't exist yet
- storage.md: adds phase note that the crate hasn't been built yet,
  StorageIdentityProvider is a future impl
- ADR-028: ConfigAuthService is Phase 1 path, StorageAuthService is
  Phase 2+ contract
- call-protocol.md: Agent Service Pattern section explicitly framed as
  a downstream application concern, not a core requirement
2026-06-07 10:29:52 +00:00

status: draft
last_updated: 2026-06-07

Services

Phase note: This spec defines the contracts for the service layer — the protocol enums, OperationEnv, and deployment topologies. Phase 1 ships ConfigIdentityProvider (ArcSwap-based) and ConfigServiceImpl (ArcSwap-based) as the only auth and config implementations. The irpc service protocols (AuthProtocol, SecretProtocol, etc.) and the production deployment topology (multi-node with StorageIdentityProvider) are contracted here but will be implemented in Phase 2+. Application services (DockerService, NodeService, agent services) are downstream concerns that build on top of the call protocol and OperationEnv — they are not core requirements.

What

The irpc service layer decomposes alknet's core responsibilities into independently testable, deployable, and replaceable components. Auth, Secret, Config, and Storage are irpc protocol enums that work both as in-process async boundaries (tokio channels) and cross-process/cross-network (QUIC streams via noq). OperationEnv is the universal composition mechanism that unifies local dispatch, irpc service dispatch, and remote call protocol dispatch.

Why

Without the service layer, auth verification, key derivation, and config reload are scattered across the codebase with no async boundary. For head nodes serving many users, in-memory key lookup doesn't scale — auth needs to query a database on demand. For secret management, the seed must be isolated in its own process boundary.

Without OperationEnv, handlers calling other operations would need to know whether the target is local, in-cluster, or on a remote node. OperationEnv abstracts this away: context.env.invoke("secrets", "derive", input) works regardless of dispatch path.

Architecture

Service Definition Pattern

Services are defined as irpc protocol enums:

```rust
#[rpc_requests(message = AuthMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum AuthProtocol {
    #[rpc(tx=oneshot::Sender<AuthResult>)]
    #[wrap(VerifyPubkey)]
    VerifyPubkey { fingerprint: String, key_data: Vec<u8> },
    // ...
}
```

The #[rpc_requests] macro generates two versions:

  • Serializable (Request): for remote communication (postcard encoding)
  • With channels (RequestWithChannels): for local communication (tokio channels)

Both use the same Client<S> type. The local/remote distinction is transparent at the call site.
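
The idea of one client type over two transports can be illustrated with a std-only sketch. It does not use irpc: the `Client` enum, `VerifyPubkey` struct, `encode`/`decode` helpers, and the `handle` rule are all hypothetical stand-ins, and the byte encoding here is a placeholder for postcard. It only shows why the call site can stay the same whether the request travels with a reply channel or as serialized bytes.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical request type; in alknet it would be generated from AuthProtocol.
#[derive(Debug, Clone, PartialEq)]
pub struct VerifyPubkey {
    pub fingerprint: String,
}

/// Placeholder wire encoding (the spec names postcard for the real remote path).
fn encode(req: &VerifyPubkey) -> Vec<u8> {
    req.fingerprint.as_bytes().to_vec()
}

fn decode(bytes: &[u8]) -> VerifyPubkey {
    VerifyPubkey { fingerprint: String::from_utf8_lossy(bytes).into_owned() }
}

/// Service logic is identical regardless of how the request arrived.
/// The "SHA256:" prefix rule is made up for the example.
fn handle(req: &VerifyPubkey) -> bool {
    req.fingerprint.starts_with("SHA256:")
}

pub enum Client {
    /// In-process: the request travels together with its reply channel.
    Local(mpsc::Sender<(VerifyPubkey, mpsc::Sender<bool>)>),
    /// Out-of-process: only serialized bytes cross the boundary.
    Remote(Box<dyn Fn(Vec<u8>) -> Vec<u8>>),
}

impl Client {
    /// Same call site for both variants.
    pub fn verify(&self, req: VerifyPubkey) -> Result<bool, String> {
        match self {
            Client::Local(tx) => {
                let (reply_tx, reply_rx) = mpsc::channel();
                tx.send((req, reply_tx)).map_err(|e| e.to_string())?;
                reply_rx.recv().map_err(|e| e.to_string())
            }
            Client::Remote(transport) => {
                let response = transport(encode(&req));
                Ok(response.first() == Some(&1))
            }
        }
    }
}

/// Runs the service on a background thread and returns a local client.
pub fn spawn_local() -> Client {
    let (tx, rx) = mpsc::channel::<(VerifyPubkey, mpsc::Sender<bool>)>();
    thread::spawn(move || {
        for (req, reply) in rx {
            let _ = reply.send(handle(&req));
        }
    });
    Client::Local(tx)
}

/// Simulates a remote peer by decoding, handling, and replying with one byte.
pub fn loopback_remote() -> Client {
    Client::Remote(Box::new(|bytes: Vec<u8>| vec![handle(&decode(&bytes)) as u8]))
}
```

Calling `spawn_local().verify(req)` and `loopback_remote().verify(req)` goes through the same method; only the variant chosen at assembly time differs.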

Core Services

| Service | Protocol | Purpose | Always Local? |
|---|---|---|---|
| Auth | AuthProtocol | Verify identities, check credentials | Can be remote |
| Secret | SecretProtocol | Derive keys, encrypt/decrypt | Local or remote |
| Config | ConfigProtocol | Dynamic config reload | Local |
| Storage | StorageProtocol | Graph CRUD, metagraph operations | Local or remote |

OperationContext

Every handler receives an OperationContext:

```rust
pub struct OperationContext {
    pub request_id: String,
    pub parent_request_id: Option<String>,
    pub identity: Option<Identity>,
    pub metadata: HashMap<String, Value>,
    pub env: OperationEnv,
    pub trusted: bool,  // set by buildEnv(), not by callers
}
```

  • identity: The authenticated identity making the call. Populated by IdentityProvider from the interface layer.
  • env: The operation environment — namespaced access to other operations.
  • trusted: When a handler calls another operation through env, the nested call is trusted (skips ACL checks).
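
The trusted flag can be illustrated with a reduced sketch of how a nested context might be derived. The `external`, `nested`, and `acl_allows` functions are hypothetical, the `env` field is omitted, and `metadata` uses `String` values to keep the block std-only. Whether a nested call inherits the caller's identity and metadata is an assumption made here, not something the spec states.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub fingerprint: String,
}

/// Reduced copy of OperationContext (no env field, String metadata values).
#[derive(Debug, Clone)]
pub struct OperationContext {
    pub request_id: String,
    pub parent_request_id: Option<String>,
    pub identity: Option<Identity>,
    pub metadata: HashMap<String, String>,
    pub trusted: bool,
}

impl OperationContext {
    /// Context for a call arriving from an interface: never trusted.
    pub fn external(request_id: &str, identity: Option<Identity>) -> Self {
        Self {
            request_id: request_id.to_string(),
            parent_request_id: None,
            identity,
            metadata: HashMap::new(),
            trusted: false,
        }
    }

    /// Context for a call a handler makes through env.
    /// Assumption: identity and metadata are inherited from the caller.
    pub fn nested(&self, request_id: &str) -> Self {
        Self {
            request_id: request_id.to_string(),
            parent_request_id: Some(self.request_id.clone()),
            identity: self.identity.clone(),
            metadata: self.metadata.clone(),
            trusted: true,
        }
    }
}

/// Hypothetical ACL gate: trusted calls skip the check entirely.
pub fn acl_allows(ctx: &OperationContext, allowed: &[&str]) -> bool {
    if ctx.trusted {
        return true;
    }
    match &ctx.identity {
        Some(id) => allowed.contains(&id.fingerprint.as_str()),
        None => false,
    }
}
```

Because only the environment constructs nested contexts, an external caller has no way to set `trusted` to true on its own request.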

OperationEnv — Universal Composition Mechanism

OperationEnv provides namespace + operation name → invoke with input, return output. The handler doesn't know or care whether the dispatch is local, irpc, or remote.

Three dispatch paths:

| Path | Mechanism | Serialization | Scope |
|---|---|---|---|
| Local | Direct function call through registry | None (in-process) | Same process |
| Service | irpc protocol enum dispatch | postcard (binary) | Same cluster |
| Remote | Call protocol EventEnvelope | JSON | Cross-node |

All three produce the same ResponseEnvelope.

Service assembly determines which path each operation uses:

// Minimal deployment (single node, all local)
let env = OperationEnv::local(local_registry);

// Production deployment (mix of local and remote)
let env = OperationEnv::new()
    .local("auth", auth_registry)
    .local("config", config_registry)
    .service("secrets", secret_irpc_client)
    .remote("worker-1", call_protocol_conn);
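
To make the routing concrete, here is a minimal std-only sketch of a namespace table with three dispatch variants. It is not the alknet implementation. Closures stand in for the registry, the irpc client, and the call protocol connection; the `Handler` alias, the `Dispatch` enum, and the two `ResponseEnvelope` fields are assumptions. The point is that `invoke` has one signature and one return type no matter which path serves the call.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the real ResponseEnvelope (fields are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct ResponseEnvelope {
    pub path: &'static str, // dispatch path that served the call
    pub body: String,
}

/// A handler takes (operation, input) and returns an output or an error.
type Handler = Box<dyn Fn(&str, &str) -> Result<String, String>>;

enum Dispatch {
    Local(Handler),
    Service(Handler),
    Remote(Handler),
}

#[derive(Default)]
pub struct OperationEnv {
    namespaces: HashMap<String, Dispatch>,
}

impl OperationEnv {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn local<F>(mut self, ns: &str, f: F) -> Self
    where
        F: Fn(&str, &str) -> Result<String, String> + 'static,
    {
        self.namespaces.insert(ns.to_string(), Dispatch::Local(Box::new(f)));
        self
    }

    pub fn service<F>(mut self, ns: &str, f: F) -> Self
    where
        F: Fn(&str, &str) -> Result<String, String> + 'static,
    {
        self.namespaces.insert(ns.to_string(), Dispatch::Service(Box::new(f)));
        self
    }

    pub fn remote<F>(mut self, ns: &str, f: F) -> Self
    where
        F: Fn(&str, &str) -> Result<String, String> + 'static,
    {
        self.namespaces.insert(ns.to_string(), Dispatch::Remote(Box::new(f)));
        self
    }

    /// Namespace + operation name in, envelope out. The caller never
    /// learns which dispatch path was taken unless it inspects `path`.
    pub fn invoke(&self, ns: &str, op: &str, input: &str) -> Result<ResponseEnvelope, String> {
        let (path, handler) = match self.namespaces.get(ns) {
            Some(Dispatch::Local(h)) => ("local", h),
            Some(Dispatch::Service(h)) => ("service", h),
            Some(Dispatch::Remote(h)) => ("remote", h),
            None => return Err(format!("unknown namespace: {}", ns)),
        };
        let body = handler(op, input)?;
        Ok(ResponseEnvelope { path, body })
    }
}

/// Example assembly mirroring the builder calls above, with toy handlers.
pub fn build_example_env() -> OperationEnv {
    OperationEnv::new()
        .local("config", |op, _input| Ok(format!("config.{}", op)))
        .service("secrets", |op, input| Ok(format!("secrets.{}({})", op, input)))
        .remote("worker-1", |op, _input| Err(format!("worker-1 unreachable for {}", op)))
}
```

With this assembly, `build_example_env().invoke("secrets", "derive", "ctx")` is served by the service variant, while the same call shape against `"config"` stays in-process.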

Service vs Call Protocol vs External Service

These are different concepts that compose through OperationEnv:

  • irpc service: In-cluster, Rust-to-Rust, type-safe, postcard serialization. Dispatched by enum variant. Example: AuthProtocol::VerifyPubkey.
  • Call protocol operation: Cross-node, cross-language, path-based, JSON EventEnvelope. Dispatched by namespace + name. Example: /head/auth/verify.
  • External service: Any endpoint reachable via the call protocol. Example: a vast.ai instance, an HTTP API, another head node.

An irpc service can back a call protocol operation. The OperationEnv routes to the appropriate dispatch path:

Call Protocol (Layer 3, external, JSON)
    └── irpc Service (Layer 3, internal, postcard)
            └── Honker Streams (Domain events, within service boundary)

Adapters

HTTP, MCP, DNS, and WebSocket adapters all resolve through OperationEnv:

  • HTTP: POST /v1/{namespace}/{op} → context.env.invoke(namespace, op, input)
  • MCP: tools/call with tool name → context.env.invoke(namespace, op, input)
  • DNS: {op}.{namespace}.alk.dev TXT → context.env.invoke(namespace, op, input)
  • Call protocol: call.requested with operationId → context.env.invoke(namespace, op, input)
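
As an illustration of the HTTP and DNS mappings, the sketch below parses both shapes into the same (namespace, operation) pair an adapter would pass to invoke. The `Target` struct and both function names are hypothetical, and rejecting extra path segments or labels is a choice made for the example. The MCP and call protocol mappings are left out because the spec does not say how a tool name or operationId splits into namespace and operation.

```rust
/// The (namespace, operation) pair every adapter resolves to.
#[derive(Debug, Clone, PartialEq)]
pub struct Target {
    pub namespace: String,
    pub op: String,
}

/// Parses `/v1/{namespace}/{op}`; anything else yields None.
pub fn from_http_path(path: &str) -> Option<Target> {
    let rest = path.strip_prefix("/v1/")?;
    let mut parts = rest.split('/');
    let namespace = parts.next()?;
    let op = parts.next()?;
    if parts.next().is_some() || namespace.is_empty() || op.is_empty() {
        return None;
    }
    Some(Target { namespace: namespace.to_string(), op: op.to_string() })
}

/// Parses `{op}.{namespace}.alk.dev`, tolerating a trailing root dot.
pub fn from_dns_name(name: &str) -> Option<Target> {
    let rest = name.trim_end_matches('.').strip_suffix(".alk.dev")?;
    let mut labels = rest.split('.');
    let op = labels.next()?;
    let namespace = labels.next()?;
    if labels.next().is_some() || namespace.is_empty() || op.is_empty() {
        return None;
    }
    Some(Target { namespace: namespace.to_string(), op: op.to_string() })
}
```

Both `from_http_path("/v1/secrets/derive")` and `from_dns_name("derive.secrets.alk.dev")` resolve to the same target, so the adapter layer adds no routing logic of its own.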

Deployment Topologies

Current (Phase 1, single node, CLI): This is what exists and ships today. Auth uses ConfigIdentityProvider backed by ArcSwap<DynamicConfig>. Config uses ConfigServiceImpl backed by ArcSwap<DynamicConfig>. There is no database dependency.

┌──────────────────────────────────────────────┐
│                 Single Process                │
│  ConfigIdentityProvider (ArcSwap)             │
│  ConfigServiceImpl (ArcSwap)                  │
│  alknet-core Server                           │
└──────────────────────────────────────────────┘
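
The swap-on-reload behaviour behind both Phase 1 implementations can be sketched without the arc-swap crate. The block below uses `RwLock<Arc<T>>` from std as a stand-in for `ArcSwap<DynamicConfig>`; the `ConfigStore` type, the `authorized_fingerprints` field, and `is_authorized` are hypothetical names chosen for the example.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical shape; the real DynamicConfig fields are not given in this spec.
#[derive(Debug, Clone, PartialEq)]
pub struct DynamicConfig {
    pub authorized_fingerprints: Vec<String>,
}

/// std-only stand-in for ArcSwap<DynamicConfig>.
pub struct ConfigStore {
    current: RwLock<Arc<DynamicConfig>>,
}

impl ConfigStore {
    pub fn new(initial: DynamicConfig) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    /// Readers take a cheap snapshot that stays valid across reloads.
    pub fn load(&self) -> Arc<DynamicConfig> {
        Arc::clone(&self.current.read().expect("config lock poisoned"))
    }

    /// A reload replaces the whole config in one step.
    pub fn store(&self, next: DynamicConfig) {
        *self.current.write().expect("config lock poisoned") = Arc::new(next);
    }
}

/// Identity check against whatever config is current at call time.
pub fn is_authorized(store: &ConfigStore, fingerprint: &str) -> bool {
    store
        .load()
        .authorized_fingerprints
        .iter()
        .any(|f| f == fingerprint)
}
```

A snapshot taken before a reload keeps the old values, while later calls to `is_authorized` see the new ones, which is the property that lets auth and config share one in-memory source without a database.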

The irpc service layer (AuthProtocol, SecretProtocol, ConfigProtocol, StorageProtocol) and the application services (DockerService, NodeService, WalletService, agent services) are downstream concerns that will be built in later phases. The architecture defines the contracts (IdentityProvider trait, OperationEnv, service protocol enums) so that implementations can plug in without modifying core, but the implementations don't exist yet.

Future (Phase 2+, multi-node, production): Auth and secrets run on dedicated nodes; workers access them remotely via irpc over QUIC. StorageIdentityProvider backed by SQLite replaces ConfigIdentityProvider for auth.

Auth Node (SQLite)           Secret Node (seed in RAM)
       ↑                              ↑
       │ QUIC (irpc)                  │ QUIC (irpc)
       │                              │
Head Node (Config, Storage, alknet-core Server)
       │
       │ SSH / iroh / TLS
       │
Worker Node (alknet-core Client)

This topology requires alknet-storage, alknet-secret, and the irpc service layer to be built — they are Phase 2+ concerns.

Constraints

  • Services are internal — they run within a node or cluster.
  • The call protocol is external — it's how nodes talk to each other.
  • Per ADR-032, domain events (Honker streams) stay within the owning service. irpc calls are synchronous request-response within a node or cluster. Call protocol EventEnvelope is the integration boundary between nodes.
  • OperationEnv is a hard constraint: the handler-facing API must match the behavioral contract from @alkdev/operations. Namespace + operation name → invoke with input, return output.
  • irpc is behind a feature flag in alknet-core. Nodes that only do SSH tunneling don't need the service layer overhead.

Open Questions

  • OQ-SVC-01: Should the secret service support multiple seed phrases (one per tenant)? Defer for now — one seed per node. Multi-seed can be added later by indexing the Unlock call with a tenant ID.

  • OQ-SVC-02: Should service protocols use postcard (binary) or JSON for remote calls? Postcard for irpc (Rust-to-Rust, efficient). JSON for call protocol (cross-language, universal). The irpc remote path naturally uses postcard.

Design Decisions

| ADR | Decision | Summary |
|---|---|---|
| 027 | Crate decomposition | Service crates are independent of core |
| 028 | Auth as irpc service | AuthProtocol behind feature flag |
| 032 | Event boundary | Domain events never cross service boundaries |
| 033 | OperationEnv | Universal composition mechanism with three dispatch paths |

References