Files
alknet/docs/architecture/crates/call/call-protocol.md
glm-5.2 2e34590522 docs(architecture): resolve review #003 — type/API surface completeness
Review #003 found 11 critical, 14 warning, and 6 suggestion findings
after reviews #001 (governance/security) and #002 (cross-document
consistency/two-way-door audit) were resolved. The theme: types and
APIs that were *referenced* but never *defined*, and stale ADR sketches
that didn't match the now-updated spec docs.

Critical fixes (11):

- C1: DerivedKey #[derive(Deserialize)] contradicted the custom
  Deserialize that rejects "[REDACTED]" — dropped the derive, added
  explicit manual Serialize/Deserialize impls (protocol.md).
- C2: encrypt prose said "derived at PATHS::ENCRYPTION" but the
  signature takes key_version — updated to encryption_path_for_version
  (service.md).
- C3: derive_encryption_key returned DerivedKey, derive_encryption_key
  _for_version returned EncryptionKey (same cache) — unified on
  DerivedKey, defined CachedKey (service.md).
- C4: tokio vs std::sync::RwLock contradiction — specified
  std::sync::RwLock, dropped tokio from vault deps (ADR-018, ADR-025,
  service.md).
- C5: Missing drift rows in vault README — added #9 (key_version
  ignored) and #10 (rotate not implemented).
- C6: ADR-022 build_root_context and invoke() sketches omitted
  abort_policy (9 fields vs 10) — added the field to both sketches.
- C7: Capabilities type referenced 20+ times, never defined — added
  struct definition to core-types.md with Clone+Send+Sync, Zeroize,
  sealed builder API, immutability guard.
- C8: SessionOverlaySource on CallAdapter but never defined, crate
  violation (alknet-call can't depend on alknet-agent) — defined the
  trait in alknet-call (call-protocol.md), matching the IdentityProvider
  pattern.
- C9: CompositeOperationEnv dispatch fall-through was "a two-way door"
  — added contains() to OperationEnv trait, made the composite probe
  before dispatching, eliminating the sentinel ambiguity.
- C10: No API for Layer 2 (connection overlay) registration, CallConnection
  undefined — defined CallConnection struct + register_imported() API
  (call-protocol.md).
- C11: with_local signature diverged between two examples (4 args vs 5)
  — added capabilities as the 5th arg, made both examples consistent.

Warning fixes (14):

- W1: invoke_with_policy restructured as required method, invoke gets a
  default impl delegating to it — eliminates duplication across impls.
- W2: CachedKey defined (service.md).
- W3: EncryptionKey constructor/glue specified, added to re-export list.
- W4: Secp256k1ExtendedPrivKey defined, derive_ethereum_key glue shown.
- W5: encryption_path_for_version rejects version < 2 (v1 is TS PBKDF2).
- W6: Wire payload schemas for all event types + ResponseEnvelope →
  EventEnvelope conversion table (call-protocol.md).
- W7: Timeout section — deadline on OperationContext, composed calls
  inherit parent's deadline, CallAdapter::with_timeout().
- W8: Request ID generation spec — UUID v4 for composed calls, wire ID
  vs internal ID relationship for abort cascade.
- W9: unlock_new already-unlocked behavior specified (returns
  AlreadyUnlocked).
- W10: KeyType Serialize/Deserialize justification corrected (stale
  irpc reference removed).
- W11: OperationProvenance and CompositionAuthority defined inline in
  operation-registry.md (were only in ADR-022).
- W12: encrypt/decrypt free functions marked pub(crate), relationship
  to VaultServiceHandle methods stated.
- W13: rotate signature removed from encryption.md (it's a
  VaultServiceHandle method, not a free function).
- W14: CallAdapter::new() + with_session_source() + with_timeout()
  constructors shown.

Suggestion fixes (6): Seed: Clone note, VaultServiceInner invariant,
ExtendedPrivKey accessor signatures, CURRENT_KEY_VERSION location, ADR-018
stale actor text, derivation helpers re-export note.
2026-06-23 10:56:05 +00:00

545 lines
36 KiB
Markdown

---
status: draft
last_updated: 2026-06-23
---
# Call Protocol
The wire protocol, stream model, framing, and adapter that alknet-call implements on ALPN `alknet/call`.
## What
The call protocol is a bidirectional, stream-agnostic RPC protocol that runs over QUIC bidirectional streams within a single `alknet/call` connection. It supports request/response calls, streaming subscriptions, batch operations, and service discovery — all using the same EventEnvelope wire format.
The `CallAdapter` implements `ProtocolHandler` for ALPN `alknet/call`. It receives a `Connection` from the endpoint, accepts bidirectional streams, and dispatches incoming `EventEnvelope` messages to the operation registry.
## Why
The call protocol is the primary programmatic interface to an alknet node. While SSH provides interactive shell access and HTTP provides REST APIs, the call protocol provides structured, discoverable RPC — the same interface that NAPI clients, MCP tools, and other automation consumers use.
The protocol must be:
- **Cross-language**: JSON wire format consumable from TypeScript, Python, any language
- **Bidirectional**: Both sides can initiate calls (server-to-client is as natural as client-to-server)
- **Stream-agnostic**: QUIC provides stream multiplexing; the protocol shouldn't impose additional constraints
- **Discoverable**: Clients can query what operations exist and their schemas
See ADR-005 for the decision to use irpc as the call protocol's foundation and ADR-012 for the stream model decision.
## Architecture
### CallAdapter
The `CallAdapter` implements `ProtocolHandler`:
```rust
pub struct CallAdapter {
/// Layer 0 — the curated operation registry. Immutable after startup.
registry: Arc<OperationRegistry>,
identity_provider: Arc<dyn IdentityProvider>,
/// Layer 1 — optional session-overlay source (agent crate supplies this;
/// None for non-agent deployments). See ADR-024, OQ-19.
session_source: Option<Arc<dyn SessionOverlaySource + Send + Sync>>,
/// Default timeout for wire calls (30s). Composed calls inherit the
/// parent's remaining deadline via `OperationContext.deadline`.
default_timeout: Duration,
}
impl CallAdapter {
/// Non-agent deployment: no session overlay, default timeout.
pub fn new(
registry: Arc<OperationRegistry>,
identity_provider: Arc<dyn IdentityProvider>,
) -> Self {
Self { registry, identity_provider, session_source: None,
default_timeout: Duration::from_secs(30) }
}
/// Agent deployment: supply a session-overlay source. The agent crate
/// implements `SessionOverlaySource`; alknet-call defines the trait.
pub fn with_session_source(mut self, source: Arc<dyn SessionOverlaySource + Send + Sync>) -> Self {
self.session_source = Some(source);
self
}
/// Override the default timeout.
pub fn with_timeout(mut self, timeout: Duration) -> Self {
self.default_timeout = timeout;
self
}
}
/// Session overlay integration point (ADR-024). Defined in alknet-call
/// because `CallAdapter` must name the type — alknet-call cannot depend on
/// alknet-agent (agent depends on call, not reverse). The agent crate
/// implements this trait; alknet-call defines it. This is the same pattern
/// as `IdentityProvider` (ADR-004: core defines the trait, handlers impl it).
///
/// The session overlay is an `OperationEnv` impl that wraps the curated base
/// (Layer 0). The `CallAdapter` composes it into the root
/// `OperationContext.env` per incoming call when a session is active. The
/// lookup mechanism (session ID in metadata, payload field, connection-bound
/// session state) belongs to the agent crate — this trait is the integration
/// point, not the lookup policy.
pub trait SessionOverlaySource: Send + Sync {
/// Returns the session overlay env for the given call, if a session is
/// active. `None` means no session is active for this call — the root
/// env is `curated base + connection overlay` (no session layer).
/// The agent crate determines how to map a call to its session.
fn overlay_for(&self, context: &OperationContext) -> Option<Arc<dyn OperationEnv + Send + Sync>>;
}
The `CallAdapter` holds the static curated registry and an optional
session-overlay source. Per-connection imported-ops overlays (Layer 2,
ADR-024) are held with the connection and composed into the root
`OperationContext.env` per incoming call. See ADR-024 for the layering
model and `compose_root_env` below.
### CallConnection
A `CallConnection` represents an established `alknet/call` connection,
regardless of which side opened it (ADR-017). It holds the connection's
imported-ops overlay (Layer 2, ADR-024) the set of `from_call`-imported
operations discovered when the connection was established.
```rust
/// An established alknet/call connection (either direction — accepted or
/// opened). Holds the connection's Layer 2 overlay (imported ops).
pub struct CallConnection {
/// The underlying QUIC connection (from endpoint.accept or CallClient.connect).
connection: Connection,
/// Layer 2 — this connection's imported-ops overlay. Populated by
/// `from_call` discovery when the connection is established. Each
/// imported op is a `HandlerRegistration` with `provenance: FromCall`.
/// This overlay is an `OperationEnv` impl that the `CallAdapter`
/// composes into the root `OperationContext.env` per incoming call.
imported_operations: Arc<RwLock<HashMap<String, HandlerRegistration>>>,
}
impl CallConnection {
/// Register an imported operation into this connection's overlay
/// (Layer 2, ADR-024). Called by `from_call` after discovery.
pub fn register_imported(&self, registration: HandlerRegistration) {
let name = registration.spec.name.clone();
self.imported_operations.write().insert(name, registration);
}
/// Register multiple imported operations (bulk variant for `from_call`).
pub fn register_imported_all(&self, registrations: Vec<HandlerRegistration>) {
let mut overlay = self.imported_operations.write();
for reg in registrations {
overlay.insert(reg.spec.name.clone(), reg);
}
}
/// Build an `OperationEnv` impl for this connection's overlay. Used by
/// the `CallAdapter` when composing the root `OperationContext.env`.
/// Returns an `OperationEnv` that dispatches to this connection's
/// imported ops (and reports `contains` only for ops in the overlay).
pub fn overlay_env(&self) -> Arc<dyn OperationEnv + Send + Sync>;
/// Call an operation on the remote peer (sends `call.requested`).
pub async fn call(&self, operation_id: &str, input: Value) -> ResponseEnvelope;
/// Subscribe to a streaming operation on the remote peer.
pub async fn subscribe(&self, operation_id: &str, input: Value) -> impl Stream<Item = ResponseEnvelope>;
/// Abort an in-flight request (sends `call.aborted`, cascades per ADR-016).
pub async fn abort(&self, request_id: &str);
}
```
**Layer 0 vs Layer 2 registration API** (ADR-024): `OperationRegistryBuilder`
builds Layer 0 (curated, immutable after startup) via `.with_local()` /
`.with_leaf()` / `.with()`. Layer 2 (per-connection) registration uses
`CallConnection::register_imported()` at runtime — the builder is
Layer-0-only; runtime overlay registration uses `CallConnection` methods.
When the connection drops, the overlay (and all imported ops) is dropped —
no explicit deregistration needed.
The adapter:
1. Accepts bidirectional streams on the connection
2. Reads length-prefixed JSON `EventEnvelope` frames from each stream
3. Resolves the peer's identity using `AuthContext` and `IdentityProvider`
4. Dispatches `call.requested` events to the operation registry
5. Writes response `EventEnvelope` frames back to the appropriate stream
6. Manages the `PendingRequestMap` for outgoing calls
### Stream Model
See ADR-012 for the full rationale.
The call protocol uses bidirectional QUIC streams with EventEnvelope framing. Key properties:
- **Either side can open streams**: The client opens a stream to call a server operation. The server opens a stream to call a client operation. Both use `open_bi()` and `accept_bi()`.
- **Correlation by request ID**: The `id` field in `EventEnvelope` correlates requests with responses. A response arriving on stream N can fulfill a request sent on stream M. The `PendingRequestMap` is keyed by ID, not by stream.
- **Stream usage is the client's choice**: A client may open one stream per operation, one stream for all operations, or any mix. The server processes EventEnvelopes regardless of stream origin.
- **One connection, full access**: A single `alknet/call` connection provides access to all operations (call, subscribe, batch, schema). No need for multiple connections or multiple ALPNs.
### Wire Format: EventEnvelope
Every message on the wire is a length-prefixed JSON `EventEnvelope`:
```rust
pub struct EventEnvelope {
pub r#type: String, // Event type
pub id: String, // Correlation key (request ID, subscription ID)
pub payload: Value, // serde_json::Value — schema depends on event type
}
// Frame: 4-byte big-endian length prefix + UTF-8 JSON body
```
The `Value` type is `serde_json::Value`. The envelope is JSON because it must be consumable from JavaScript, Python, and any language. The envelope itself stays JSON for cross-language compatibility.
Binary payloads (postcard, protobuf) are base64-encoded as a JSON string within the `payload` field. The convention is: if an operation's output schema specifies a binary field, the handler encodes it as a base64 string and the client decodes it. The `EventEnvelope` structure is not aware of this convention — it carries a `serde_json::Value` and does not interpret the payload. This is a handler-level concern, not a protocol-level concern.
This is the same framing used by irpc. The Rust implementation in alknet-call is canonical — the `@alkdev/pubsub` TypeScript adapters serve as a reference and browser adaptation, not a parallel implementation (see ADR-013).
### Event Types
Five event types carry request/response and subscription semantics:
| Event | Direction | Purpose |
|-------|-----------|---------|
| `call.requested` | Caller → Handler | Initiate a call or subscription |
| `call.responded` | Handler → Caller | Deliver a result (one for calls, many for subscriptions) |
| `call.completed` | Handler → Caller | Signal end of subscription stream |
| `call.aborted` | Either side | Cancel the call/subscription |
| `call.error` | Handler → Caller | Signal an error |
**A call is a subscribe that resolves after one event.** Both `call()` and `subscribe()` send the same `call.requested` event. The difference is consumption pattern:
- **call()**: Sends `call.requested`, resolves on first `call.responded`
- **subscribe()**: Sends `call.requested`, yields each `call.responded` until `call.completed` or `call.aborted`
The `id` field carries the `requestId` for correlation.
`call.completed` is sent only for subscriptions. A plain `call()` (request/response)
is complete after its single `call.responded`; no `call.completed` follows. The
`PendingRequestMap` entry for a `Call` is deleted on the first `call.responded`.
### `call.requested` Payload
The `payload` of a `call.requested` event has this shape:
```json
{
"operationId": "/fs/readFile",
"input": { ... },
"auth_token": "alk_..." // optional — see Identity Resolution below
}
```
- `operationId` — the operation to invoke, **with a leading slash** on the wire (e.g., `/fs/readFile`, `/agent/chat`, `/services/list`). This is the display form of the operation name. The registry stores names without the leading slash (`fs/readFile` — see [operation-registry.md](operation-registry.md#operationspec)); the wire format adds it. The `CallAdapter` strips the leading slash before registry lookup.
- `input` — the operation input, matching the operation's `input_schema` (JSON Schema). Always a `serde_json::Value`.
- `auth_token` — optional. If present, the `CallAdapter` resolves it via `IdentityProvider::resolve_from_token()` and the resulting `Identity` takes precedence over the connection-level identity for this request. See [Identity Resolution](#authcontext-and-identity-resolution) below.
The `call.requested` payload does **not** carry an abort policy field. The abort policy (`abort-dependents` vs `continue-running`, ADR-016) is set on `OperationContext` and propagated through `OperationEnv::invoke()` — the composing handler decides the child's policy, not the wire caller. See [Abort Cascade and Nested Calls](#abort-cascade-and-nested-calls) below.
**Leading-slash convention**: `operationId` on the wire always has a leading slash (`/fs/readFile`). `OperationSpec.name` in the registry and in `services/list` responses never has a leading slash (`fs/readFile`). `OperationSpec.path()` produces the wire form (`/fs/readFile`). This is a single rule applied consistently — do not mix the two forms.
### `call.error` Payload
```json
{
"code": "FILE_NOT_FOUND",
"message": "file not found: /etc/nonexistent",
"retryable": false,
"details": { "path": "/etc/nonexistent", "errno": 2 }
}
```
Error codes use an extensible string enum. The protocol defines the following **protocol-level codes** (emitted by the dispatch machinery, not by handlers):
- `NOT_FOUND` — operation not in registry (or Internal op called from wire)
- `FORBIDDEN` — access denied (insufficient scopes or unauthenticated)
- `INVALID_INPUT` — input doesn't match the operation's JSON Schema
- `INTERNAL` — handler error, panic, connection failure
- `TIMEOUT` — request timed out (retryable: true)
Operations may also declare **operation-level domain codes** in their `error_schemas` (ADR-023) — e.g., `FILE_NOT_FOUND`, `RATE_LIMITED`, `INSUFFICIENT_CREDITS`. These are emitted by handlers and carry a `details` payload conforming to the declared `ErrorDefinition.schema`. Protocol-level errors omit `details` or carry protocol-specific context (e.g., the operation name for `NOT_FOUND`).
Fields:
- `code` — the error code (protocol-level or operation-level)
- `message` — human-readable error message. For logging and debugging, not for programmatic handling. Clients should switch on `code`, not parse `message`.
- `retryable` — whether the caller should retry. `true` for transient failures, `false` for permanent ones.
- `details` — optional. When the code matches a declared `ErrorDefinition`, `details` conforms to that definition's schema. This is the typed error payload — it makes errors structured instead of string-matched. See ADR-023.
New error codes may be added in future versions. Clients should treat unknown error codes as `INTERNAL` with `retryable: false`.
### Wire Payload Schemas
The `payload` field of `EventEnvelope` has a different shape per event type:
| Event | `payload` shape |
|-------|----------------|
| `call.requested` | `{ "operationId": "/fs/readFile", "input": {...}, "auth_token": "alk_..." (optional) }` |
| `call.responded` | `{ "output": <Value> }` — the operation's output, matching `output_schema` |
| `call.completed` | `{}` — empty object (subscription stream end signal) |
| `call.aborted` | `{}` — empty object (cancellation signal; the `id` identifies which request) |
| `call.error` | `{ "code": "...", "message": "...", "retryable": bool, "details": {...} (optional) }` |
### `ResponseEnvelope` → `EventEnvelope` Conversion
Local dispatch produces `ResponseEnvelope { request_id, result: Result<Value, CallError> }`. The `CallAdapter` converts it to `EventEnvelope` for the wire:
| `ResponseEnvelope` | `EventEnvelope` |
|--------------------|-----------------|
| `Ok(value)` | `{ type: "call.responded", id: request_id, payload: { output: value } }` |
| `Err(call_error)` | `{ type: "call.error", id: request_id, payload: <serialized CallError> }` |
The `request_id` becomes the `id` field. For subscriptions, each `call.responded` is a separate `EventEnvelope` with the same `id`; `call.completed` is `{ type: "call.completed", id, payload: {} }`.
### Protocol Operations
The call protocol defines four top-level operations, expressed through event types and operation names:
| Operation | Event Pattern | Description |
|-----------|--------------|-------------|
| **call** | `call.requested``call.responded` or `call.error` | Request/response — one result |
| **subscribe** | `call.requested` → many `call.responded``call.completed` or `call.aborted` | Streaming — zero or more results |
| **batch** | multiple `call.requested` (different IDs) → multiple `call.responded` | Multiple operations in one round |
| **schema** | `call.requested` name `services/list` or `services/schema``call.responded` | Discover available operations |
Batch is not a separate event type — it's multiple `call.requested` events with different request IDs. The client sends them (on one or many streams) and correlates the responses by ID. See OQ-14.
### Bidirectional Calls
Both sides of the connection can initiate calls. The server can call operations on the client just as the client calls operations on the server.
```
Client Server
│ │
│── open_bi() → stream ─────────────────────────▶│
│── call.requested { id: "c1", ... } ────────────▶│ (client calls server)
│◀─ call.responded { id: "c1", ... } ───────────│
│ │
│◀─ open_bi() ← stream ──────────────────────────│
│◀─ call.requested { id: "s1", ... } ────────────│ (server calls client)
│── call.responded { id: "s1", ... } ───────────▶│
│ │
```
The server calls client operations using the same `PendingRequestMap` and the same `EventEnvelope` format. The operation registry on the client side dispatches `call.requested` events just like the server side.
This enables patterns where the server pushes notifications, requests configuration from the client, or orchestrates workflows that require the client to perform operations.
### Streaming Subscribe Example: LLM Chat
The subscribe operation pattern maps naturally to LLM streaming. An agent handler exposing `/agent/chat` as a subscription receives a `call.requested` event and streams `call.responded` events back as the LLM generates tokens. The output payloads use a normalized streaming UI format (e.g., Vercel AI SDK UI chunks — text-delta, tool-input-delta, etc.):
```
Client Server (agent handler)
│ │
│── open_bi() → stream ──────────────────────────────▶│
│── call.requested { id: "c1", │
│ operationId: "/agent/chat", │
│ input: { messages, model } } │
│ │ handler reads capabilities (API key)
│ │ handler makes HTTP request to LLM provider
│ │ handler normalizes provider SSE → UI chunks
│←─ call.responded { id: "c1", output: { type: "text-start", ... } } │
│←─ call.responded { id: "c1", output: { type: "text-delta", delta: "Hel" } }│
│←─ call.responded { id: "c1", output: { type: "text-delta", delta: "lo" } } │
│←─ call.responded { id: "c1", output: { type: "text-end", ... } } │
│←─ call.completed { id: "c1" } │
```
The API key used for the outbound LLM HTTP request comes from `OperationContext.capabilities`, not from the call protocol input and not from environment variables. See ADR-014 and [operation-registry.md → Capability Injection](operation-registry.md#capability-injection).
### PendingRequestMap
Manages in-flight calls and subscriptions. Correlates `call.responded` events back to the original `call.requested`:
```rust
pub struct PendingRequestMap {
pending: HashMap<String, PendingEntry>,
}
enum PendingEntry {
Call {
tx: oneshot::Sender<Result<Value, CallError>>,
timeout: Instant,
},
Subscribe {
tx: mpsc::Sender<Result<Value, CallError>>,
timeout: Option<Instant>,
},
}
```
When a `call.responded` event arrives:
- If `PendingEntry::Call` → resolve the oneshot, delete entry
- If `PendingEntry::Subscribe` → push to the mpsc channel, keep entry alive
When `call.completed` arrives on a subscription → close the mpsc channel, delete entry.
When `call.aborted` arrives → cancel/drop whichever side initiated it.
A `call.aborted` for an unknown `requestId` is silently discarded.
Timeouts prevent dangling entries. A background task sweeps expired entries periodically.
### CallAdapter Stream Handling
The `CallAdapter::handle()` method:
1. Spawns a task that continuously calls `connection.accept_bi()` to receive incoming streams
2. For each accepted stream, reads `EventEnvelope` frames using `FrameFramedReader`
3. Dispatches `call.requested` events to the operation registry
4. Writes response `EventEnvelope` frames using `FrameFramedWriter`
5. Manages `PendingRequestMap` for outgoing calls initiated by the server
For outgoing calls (server → client), the adapter:
1. Opens a bidirectional stream with `connection.open_bi()`
2. Sends `call.requested` on that stream
3. Adds the request ID to the `PendingRequestMap`
4. Reads responses from any stream, correlates by ID
### AuthContext and Identity Resolution
The `CallAdapter` receives an `AuthContext` from the endpoint. The call protocol resolves identity per-request, not per-connection:
**Resolution flow**:
1. The endpoint provides `AuthContext` with whatever identity it resolved at the TLS layer (e.g., client certificate fingerprint). This may be `None` — the `AuthContext.identity` field is `Option<Identity>`.
2. When a `call.requested` event arrives, the `CallAdapter` constructs an `OperationContext` with the connection-level `AuthContext.identity`.
3. If the `call.requested` payload includes an `auth_token` field, the `CallAdapter` resolves it using `IdentityProvider::resolve_from_token()`. If resolution succeeds, the resulting `Identity` replaces the connection-level identity in the `OperationContext`. If resolution fails, the request proceeds with the connection-level identity (which may be `None`).
4. The `OperationContext.identity` is passed to the `OperationRegistry` for ACL checking.
5. If `identity` is `None` and the operation's `AccessControl` has restrictions, the registry returns `FORBIDDEN` with message `"authentication required"`.
**Key point**: Identity is resolved per-request, not per-connection. This allows a single connection to upgrade authentication mid-session (e.g., after an `auth/login` operation returns a token), and allows different operations on the same connection to have different identity levels.
### Root OperationContext Construction
When a `call.requested` arrives from the wire, the `CallAdapter` constructs the root `OperationContext` — the entry point of the call tree. This is the counterpart to `OperationEnv::invoke()` (which constructs nested contexts with `internal: true`): the wire path sets `internal: false`, meaning ACL runs against the caller's `identity`, not a handler's composition authority (ADR-015, ADR-022).
```rust
// CallAdapter dispatch path — root context for an incoming wire request
fn build_root_context(
&self,
request_id: String,
operation_name: &str, // looked up in registry for the registration bundle
identity: Option<Identity>, // resolved per-request above (caller's identity)
) -> OperationContext {
let registration = self.registry.registration(operation_name);
OperationContext {
request_id,
parent_request_id: None, // wire request — top of the call tree
identity: identity.clone(), // caller's identity (inbound — gate credential)
// Composition authority from the registration bundle (ADR-022).
// None for leaves (FromOpenAPI/FromMCP/FromCall); Some for Local/Session.
// This is on the context for PROPAGATION to children via invoke(),
// not for the root's own ACL (which uses identity above).
handler_identity: registration.composition_authority.clone(),
capabilities: registration.capabilities.clone(), // from the registration bundle
metadata: HashMap::new(), // fresh per request
deadline: Some(Instant::now() + self.default_timeout), // root deadline (W7)
scoped_env: registration.scoped_env.clone()
.unwrap_or_else(ScopedOperationEnv::empty), // from the bundle, empty for leaves
// Per-call env composition (ADR-024): the root env is a composite
// of the curated base + this connection's imported-ops overlay +
// the active session overlay (if any). The CallAdapter builds this
// composite per incoming call — same shape as per-call identity
// resolution via IdentityProvider. Handlers call env.invoke();
// the composite routes to the right overlay.
env: self.compose_root_env(/* connection, session */),
abort_policy: AbortPolicy::default(), // abort-dependents (ADR-016 Decision 6)
internal: false, // external call — ACL against caller identity
}
}
```
The `internal: false` here is what makes a wire call a wire call — ACL checks against the caller's resolved `identity`. When a handler subsequently calls `context.env.invoke(...)`, the `OperationEnv::invoke()` path (see [operation-registry.md](operation-registry.md#operationenv)) constructs a nested `OperationContext` with `internal: true`, switching authority to `handler_identity`. The two construction paths — `CallAdapter` for wire-originated, `OperationEnv::invoke()` for composition-originated — are the only places `internal` is set. Handlers cannot set it themselves (the field is module-private for writes — see [operation-registry.md](operation-registry.md#operationcontext) and ADR-015).
The per-call `env` composition (ADR-024) is the operation-dispatch analogue of the per-call identity resolution the CallAdapter already does via `IdentityProvider`. Both are integration-point patterns: the trait object owns the routing, the CallAdapter supplies the right sources per call. A connection's imported-ops overlay is part of the root env only for calls arriving on that connection; a session overlay is part of the root env only when a session is active. See ADR-024.
### ResponseEnvelope
The universal return type from all operation invocations:
```rust
pub struct ResponseEnvelope {
pub request_id: String,
pub result: Result<Value, CallError>,
}
pub struct CallError {
pub code: String, // protocol-level (NOT_FOUND, FORBIDDEN, ...) or operation-level (ADR-023)
pub message: String, // human-readable, for logging — not for programmatic handling
pub retryable: bool,
pub details: Option<Value>, // typed error payload, conforms to ErrorDefinition.schema (ADR-023)
}
```
Local dispatch produces `ResponseEnvelope` with no serialization overhead. The `CallAdapter` converts `ResponseEnvelope` to `EventEnvelope` for the wire. When a handler returns a `CallError` whose `code` matches a declared `ErrorDefinition`, the `details` field carries the typed error payload. See ADR-023.
### Connection and Stream Lifecycle
**Connection drop**: When the QUIC connection closes, all pending requests in the `PendingRequestMap` are failed with `call.error` code `INTERNAL` and message `"connection closed"`. All subscription channels are closed. The `CallAdapter::handle()` method returns `Ok(())` (clean shutdown) or `Err(HandlerError::ConnectionClosed)` (unexpected).
**Stream reset**: When a QUIC stream is reset mid-operation, the `FrameFramedReader` returns an error. If the stream was carrying a subscription, the `PendingRequestMap` entry is removed and the mpsc channel is closed. If the stream was carrying a call, the oneshot is resolved with an error. No `call.aborted` is sent — the stream is gone.
**Timeouts**: Default timeout for wire calls is 30 seconds, configurable via
`CallAdapter::with_timeout()`. The `build_root_context` sets
`OperationContext.deadline` to `now + default_timeout`. Composed calls
inherit the parent's deadline (children do **not** get a fresh 30s — the
root call's deadline bounds the entire call tree, preventing a depth-5
composition from running 150s). A composed call that exceeds the deadline
is cancelled (future dropped, `Drop` guards release resources) and returns
`CallError { code: "TIMEOUT", retryable: true }`. Subscriptions default to
no deadline (`deadline: None` — unbounded); the client can specify a
timeout in the `call.requested` payload. The `PendingRequestMap` sweeper
runs every 10 seconds and removes expired wire entries.
**Error handling in `CallAdapter::handle()`**: If a handler panics, the stream is closed and the `PendingRequestMap` entry (if any) is cleaned up by the next sweeper pass. Other streams and the connection are unaffected.
### Abort Cascade and Nested Calls
When a handler composes other operations via `OperationEnv::invoke()`, it creates a call tree: a parent request (r1) spawns children (r1-a, r1-b), which may spawn their own children. The `parent_request_id` field on `OperationContext` records this tree — it is the agency chain (ADR-015).
When `call.aborted` arrives for a parent request, the protocol cascades the abort to all non-terminal descendants in the tree. The CallAdapter walks the tree (indexed by `parent_request_id` in `PendingRequestMap`) and sends `call.aborted` for each descendant. The default policy is **`abort-dependents`**: aborting a request aborts everything downstream, regardless of branch. This is the correct default because aborted parent work has no consumer waiting for results — continuing is wasted work at best and unwanted side effects at worst (e.g., a `bash/exec` that keeps running after the caller stopped caring).
An opt-in **`continue-running`** policy is available for cases where long-running work should survive a parent's abort (e.g., a subscription that should keep streaming). Under `continue-running`, descendants that have already started continue to completion; descendants that haven't started yet are aborted; no new descendants start.
The abort policy is set on `OperationContext` and propagated through `OperationEnv::invoke()` — the composing handler decides the child's policy, not the wire caller. The `call.requested` payload does not carry an abort policy field (the wire caller doesn't know the composition tree). The root context gets the default (`abort-dependents`); a handler can opt a child into `continue-running` at `invoke()` time. See ADR-016 Decision 6.
Handlers clean up resources when their call is cancelled (in Rust, the future is dropped and `Drop` guards release resources — HTTP streams, file handles, locks). This is a handler-level concern; the protocol's job is to cascade the abort. See ADR-016.
## Constraints
- The call protocol does not depend on any database. `PendingRequestMap` is in-memory. Durable session storage is a consumer concern.
- Operation specs use JSON Schema. The envelope is always JSON. Binary payloads may be base64-encoded in the `payload` field.
- Batch is not a protocol primitive — multiple `call.requested` events with correlated IDs provide equivalent semantics. See OQ-14.
- The call protocol is transport-agnostic at the envelope level. The `EventEnvelope` framing can run over QUIC streams, WebSocket frames, or Worker `postMessage`. The `CallAdapter` is the QUIC-specific implementation.
- `OperationEnv::invoke()` dispatches through the local registry. Remote dispatch (federation, head/worker routing) would be a separate mechanism at a different layer. See ADR-005 and OQ-13.
- **The call protocol carries no secret material.** Secret material (private keys, API keys, mnemonics, decrypted credentials, raw tokens) must not appear in `call.requested` payloads, `call.responded` payloads, or `OperationContext.metadata`. The wire format carries `serde_json::Value` and cannot enforce this at the type level — the constraint is architectural, enforced by the operation registry and by convention. Operations that need to share public key material use a dedicated operation that returns only the public component. See ADR-014.
- **Abort cascades to descendants.** `call.aborted` for a parent request cascades to all non-terminal descendants in the call tree. Default policy is `abort-dependents`; `continue-running` is an opt-in. See ADR-016.
## Design Decisions
| Decision | ADR | Summary |
|----------|-----|---------|
| irpc as call protocol foundation | [ADR-005](../../decisions/005-irpc-as-call-protocol-foundation.md) | irpc provides framing and service dispatch |
| Call protocol stream model | [ADR-012](../../decisions/012-call-protocol-stream-model.md) | Bidirectional streams, EventEnvelope, ID-based correlation |
| ALPN per connection | [ADR-006](../../decisions/006-alpn-convention-and-connection-model.md) | `alknet/call` is a distinct ALPN, one connection per ALPN |
| ProtocolHandler receives Connection | [ADR-007](../../decisions/007-bistream-type-definition.md) | CallAdapter gets Connection, can accept/open multiple streams |
| Vault integration point | [ADR-008](../../decisions/008-secret-service-integration.md) | Vault is a capability source, accessed at assembly time |
| Secret material flow | [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Call protocol carries no secret material; capabilities injected at assembly layer |
| Privilege model and authority context | [ADR-015](../../decisions/015-privilege-model-and-authority-context.md) | `internal` = authority switch not ACL skip; External/Internal visibility; handler identity + scoped env |
| Abort cascade for nested calls | [ADR-016](../../decisions/016-abort-cascade-for-nested-calls.md) | `call.aborted` cascades to descendants; default `abort-dependents`, `continue-running` opt-in |
| Call protocol client and adapter contract | [ADR-017](../../decisions/017-call-protocol-client-and-adapter-contract.md) | `CallClient` opens connections; `from_call` imports remote ops; connection direction independent of call direction |
| Handler registration, provenance, and composition authority | [ADR-022](../../decisions/022-handler-registration-provenance-and-composition-authority.md) | Registration bundle carries provenance, composition authority, scoped env, capabilities; dispatch path reads from bundle |
| Operation error schemas | [ADR-023](../../decisions/023-operation-error-schemas.md) | Operations declare domain errors; `call.error` carries typed `details` |
## Open Questions
See [open-questions.md](../../open-questions.md) for full details.
- **OQ-13** (resolved): Operation path format is `/{service}/{op}`. Remote dispatch is a separate mechanism, not a path prefix.
- **OQ-14** (resolved): Batch is a client-side pattern of correlated `call.requested` events, not a protocol primitive.
- **OQ-16** (resolved by ADR-014): No vault operations are exposed over the call protocol for now.
- **OQ-19** (resolved): Session-scoped operation registries — agent-written operations overlaid on global registry via `OperationEnv` trait layering. Protocol doesn't need changes; `OperationEnv` must remain a trait.
## References
- [operation-registry.md](operation-registry.md) — OperationSpec, Handler, AccessControl, service discovery
- ADR-005: irpc as call protocol foundation
- ADR-012: Call protocol stream model
- Reference implementation: `/workspace/@alkdev/alknet-main/crates/alknet-core/src/call/`