docs(arch): sync call-completion specs with implementation — Dispatcher/RemoteFilter, ClientError, OQ-29

Post-implementation spec sync after the call-completion batch landed
(commits e4a2594..a3825f5). The sub-agent review flagged no spec drift, but
comparing the implemented types against the spec sketches surfaced five
details the specs didn't name — filled in here so the spec matches what was
built:

- client-and-adapters.md: name the shared Dispatcher (protocol/dispatch.rs)
  + RemoteFilter mechanism that enforces ADR-028's default-deny at dispatch
  time (the load-bearing security gate — checks remote_safe before building
  context, before any capability material reaches the handler). Add
  ClientError/RemoteIdentity types, the spawn_dispatch lower-level API, and
  the services_list_handler_peer_scoped wiring (the assembly layer must
  register the peer-scoped services/list handler for a CallClient's registry,
  not the plain one). Record the v1 TLS client-auth gap (AcceptAnyServerCertVerifier,
  with_no_client_auth) as OQ-29.
- call-protocol.md: point the adapter dispatch-loop description at the shared
  Dispatcher (dispatch.rs) so readers find the mechanism ADR-017 §1 commits to.
- open-questions.md: OQ-29 — CallClient TLS client-auth + remote-identity
  verification is a two-way-door remainder; the no-env-vars invariant is
  unaffected (auth_token flows via call-protocol payload, not TLS).
- READMEs: current-state now reflects completion done + reviewed (207 lib +
  2 integration tests); OQ-29 added to both OQ summaries.
2026-06-26 13:42:42 +00:00
parent 2fe471ad4e
commit f9c0ab092b
5 changed files with 139 additions and 12 deletions


@@ -9,9 +9,9 @@ last_updated: 2026-06-26
**Pre-implementation.** The project has completed a pivot from a three-layer model to an ALPN-as-service model. The greenfield workspace contains only `alknet-vault` (stable — implementation complete and verified, local-only by construction per ADR-025, HD-derivation key model per ADR-026) and research/reference material. Foundational ADRs (001–028) are in place. ADR-024 resolves the registry mutability question and the `OperationContext.env` type identity crisis by layering the registry by trust boundary. ADR-025 drops irpc from the vault, making it local-only by construction. ADR-026 records the HD-derivation key model as a foundational decision. Review #003 (type/API surface completeness) resolved: `DerivedKey` derive contradiction, `encrypt` prose, return-type divergence, RwLock contradiction, drift table gaps, ADR-022 stale sketches, `Capabilities`/`SessionOverlaySource`/`CallConnection`/`CachedKey` definitions, `CompositeOperationEnv` dispatch contract, `with_local` signature, payload schemas, timeout propagation, and request ID generation. The alknet-core and alknet-call crate specs are in draft; the alknet-vault crate specs are stable.
The alknet-call server-side core (`CallAdapter`, `CallConnection` dispatch loop, wire framing, pending map, abort cascade, operation registry, service discovery) is implemented and tested (159 tests passing). The call-completion gap analysis (`docs/research/alknet-call-completion/gap-analysis.md`) identified the missing client/adapter surface specced in ADR-017 (`CallClient`, `from_call`, `from_jsonschema`, the `OperationAdapter` trait) plus four decisions (DC-1..4) needed before implementation. DC-1 (the one-way door: peer-scoped registry filtering) is resolved by ADR-028; DC-2/3/4 are two-way-door defaults recorded in `client-and-adapters.md` and tracked as OQ-25..28. The client/adapter surface is specced (`crates/call/client-and-adapters.md`); implementation is pending.
The alknet-call crate is **implemented and reviewed** — both the server-side core and the client/adapter surface. The server-side core (`CallAdapter`, `CallConnection` dispatch loop, wire framing, pending map, abort cascade, operation registry, service discovery) and the client/adapter surface (`CallClient`, `from_call`, `from_jsonschema`, `OperationAdapter` trait, shared `Dispatcher`) are implemented and tested (207 lib + 2 integration tests passing). The call-completion gap analysis (`docs/research/alknet-call-completion/gap-analysis.md`) identified the missing client/adapter surface specced in ADR-017 plus four decisions (DC-1..4); all are resolved — DC-1 by ADR-028 (peer-scoped default-deny filtering), DC-2/3/4 as two-way-door defaults in `client-and-adapters.md` (OQ-25..28). A post-implementation review (`tasks/call/review-completion.md`) confirmed spec conformance; the one spec-sketch omission (`connect()`'s `ClientError` return type) was the intended illustrative-sketch gap, now filled in. A TLS client-auth gap surfaced during implementation is tracked as OQ-29.
**Next step**: Implementation of the alknet-call client/adapter surface (priority order in `client-and-adapters.md`): `CallClient` → `from_call` → `OperationAdapter` trait → `from_jsonschema`. All one-way doors are resolved; remaining open questions (OQ-25..28) are two-way-door shape/defaults decided during implementation.
**Next step**: The alknet-call crate is ready for downstream consumers (alknet-http's `OperationAdapter` implementations, the container-service/runner pattern, alknet-agent, alknet-napi). The remaining open questions (OQ-25..29) are all two-way-door shape/defaults, not blockers. The next crate phase is alknet-http (Phase 0 findings in `docs/research/alknet-http/`).
## Architecture Documents
@@ -102,6 +102,7 @@ See [open-questions.md](open-questions.md) for the full tracker.
- **OQ-26**: `OperationAdapter` error type — `import()` returns `Result<_, AdapterError>`; variants decided in implementation
- **OQ-27**: `from_call` re-import trigger — v1 default auto-on-reconnect; explicit `refresh()` additive
- **OQ-28**: `from_call` namespace collision — v1 default error-on-collision (no prefix by default)
- **OQ-29**: `CallClient` TLS client-auth + remote-identity verification — v1 connects with `with_no_client_auth()` and `AcceptAnyServerCertVerifier`; wiring RawKey client-auth is additive (the no-env-vars invariant is unaffected — `auth_token` flows through the call-protocol payload, not TLS)
**Deferred (not active):**
- **OQ-09**: WASM target boundaries — design constraint, not deliverable


@@ -53,6 +53,7 @@ Structured RPC over QUIC: operations, request/response, streaming subscriptions,
| OQ-26 | OperationAdapter error type (AdapterError variants) | open (two-way) | `import()` returns `Result<_, AdapterError>`; variants decided in implementation |
| OQ-27 | from_call re-import trigger | open (two-way) | v1 default: auto-on-reconnect; explicit `refresh()` is additive |
| OQ-28 | from_call namespace collision behavior | open (two-way) | v1 default: error on collision (no prefix by default) |
| OQ-29 | CallClient TLS client-auth and remote-identity verification | open (two-way) | v1 connects with `with_no_client_auth()` + `AcceptAnyServerCertVerifier`; wiring RawKey client-auth and a real `ServerCertVerifier` is additive (no-env-vars invariant unaffected — `auth_token` flows via call-protocol payload, not TLS) |
## Key Design Principles


@@ -164,6 +164,15 @@ The adapter:
5. Writes response `EventEnvelope` frames back to the appropriate stream
6. Manages the `PendingRequestMap` for outgoing calls
The dispatch loop is **shared** with `CallClient` (ADR-017 §1): both
`CallAdapter::handle` (accept path) and `CallClient::connect` (connect path)
construct a `Dispatcher` (`protocol/dispatch.rs`) and call `run_loop` — the
dispatch half is one implementation, the connection-establishment half differs
(accept vs dial). The `Dispatcher` carries a `RemoteFilter` (ADR-028) that
gates dispatch by `remote_safe`; the accept path uses `RemoteFilter::trusted()`
by convention. See [client-and-adapters.md](client-and-adapters.md) for the
`Dispatcher`/`RemoteFilter` mechanism.
### Stream Model
See ADR-012 for the full rationale.
@@ -538,6 +547,7 @@ See [open-questions.md](../../open-questions.md) for full details.
- **OQ-16** (resolved by ADR-014): No vault operations are exposed over the call protocol for now.
- **OQ-19** (resolved): Session-scoped operation registries — agent-written operations overlaid on global registry via `OperationEnv` trait layering. Protocol doesn't need changes; `OperationEnv` must remain a trait.
- **OQ-25..28** (open, two-way): Call-completion remainders — `CallClient` remote-safe marking shape, `OperationAdapter` error type, `from_call` re-import trigger, `from_call` namespace collision. The `CallClient`/adapter surface itself is specced in [client-and-adapters.md](client-and-adapters.md); the one-way door among these (existence of default-deny filtering) is resolved by ADR-028.
- **OQ-29** (open, two-way): `CallClient` TLS client-auth + remote-identity verification — v1 connects with `with_no_client_auth()` and `AcceptAnyServerCertVerifier`; wiring RawKey client-auth and a real `ServerCertVerifier` is additive. See [client-and-adapters.md](client-and-adapters.md).
## References


@@ -92,15 +92,9 @@ pub struct CallClient {
}
impl CallClient {
    /// Open a QUIC connection to `addr` on ALPN `alknet/call`, perform
    /// credential handshake, and return a CallConnection running the shared
    /// dispatch loop. Credentials come from capabilities (ADR-014), not env
    /// vars — see "No-Env-Vars Invariant" below.
    pub async fn connect(
        &self,
        addr: SocketAddr,
        credentials: CallCredentials,
    ) -> Result<CallConnection>;
    /// Default-deny mode: only `remote_safe: true` ops dispatch/list to the
    /// remote peer (ADR-028).
    pub fn new(registry: Arc<OperationRegistry>, idp: Arc<dyn IdentityProvider>) -> Self;
    /// Trusted-peer mode: construct a CallClient that exposes all External
    /// ops from `registry` to the remote peer, ignoring the remote-safe
@@ -109,6 +103,18 @@ impl CallClient {
        registry: Arc<OperationRegistry>,
        identity_provider: Arc<dyn IdentityProvider>,
    ) -> Self;
    /// Open a QUIC connection to `addr` on ALPN `alknet/call`, perform
    /// credential handshake, and return a CallConnection running the shared
    /// dispatch loop. Credentials come from capabilities (ADR-014), not env
    /// vars — see "No-Env-Vars Invariant" below. The dispatch loop runs on a
    /// spawned task; the returned `CallConnection` is live until the remote
    /// closes the connection or the caller drops it.
    pub async fn connect(
        &self,
        addr: SocketAddr,
        credentials: CallCredentials,
    ) -> Result<CallConnection, ClientError>;
}
```
@@ -127,6 +133,63 @@ both a caller and a callee — it dispatches incoming calls from the remote
peer against its peer-scoped registry view, and it initiates outgoing calls
through the `CallConnection::call()` / `subscribe()` / `abort()` API.
#### Shared Dispatcher
The shared dispatch loop lives in `protocol/dispatch.rs` as the `Dispatcher`
struct. This is the architectural mechanism that keeps `CallClient` from
becoming a parallel protocol implementation (ADR-017 §1): both `CallAdapter`'s
accept path and `CallClient`'s connect path construct a `Dispatcher` and call
`run_loop` — the dispatch half is one implementation, the
connection-establishment half differs (accept vs dial).
```rust
/// Peer-scoped registry filter state (ADR-028). `trusted_peer: false`
/// (default-deny for a CallClient) hides ops whose
/// `HandlerRegistration.remote_safe` is false from both dispatch and
/// `services/list`. `trusted_peer: true` (explicit opt-in, also used by the
/// CallAdapter's local accept path) bypasses the filter.
pub struct RemoteFilter { pub trusted_peer: bool }
/// Shared dispatcher for an established CallConnection. Constructed by both
/// CallAdapter (accept path) and CallClient (connect path). Holds no
/// per-connection state; the CallConnection is passed into run_loop.
pub struct Dispatcher {
    pub registry: Arc<OperationRegistry>,
    pub identity_provider: Arc<dyn IdentityProvider>,
    pub session_source: Option<Arc<dyn SessionOverlaySource + Send + Sync>>,
    pub default_timeout: Duration,
    pub remote_filter: RemoteFilter,
}
```
The `remote_filter` is the dispatch-time gate that enforces ADR-028's
default-deny: `dispatch_requested` checks `remote_filter.allows(registration.remote_safe)`
**before** building the context or invoking the handler — a non-remote-safe op
returns `NOT_FOUND` before any capability material reaches the handler (the
security argument for default-deny, ADR-028 Context). The accept path
(`CallAdapter`) uses `RemoteFilter::trusted()` by convention — a direct QUIC
client is not a filtered `CallClient` peer in the ADR-028 sense.
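The gate described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code in `protocol/dispatch.rs`: `RemoteFilter`, `trusted_peer`, `trusted()`, `allows`, and `remote_safe` come from the spec text, while `default_deny()`, `Registration`, `Outcome`, and `gate` are hypothetical names introduced here, and the treatment of unknown ops is an assumption.

```rust
use std::collections::HashMap;

/// Peer-scoped filter state, mirroring the spec's `RemoteFilter`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RemoteFilter {
    pub trusted_peer: bool,
}

impl RemoteFilter {
    /// Trusted-peer mode: bypasses the filter (constructor named in the spec).
    pub fn trusted() -> Self {
        Self { trusted_peer: true }
    }

    /// Default-deny mode. Hypothetical constructor name, for illustration only.
    pub fn default_deny() -> Self {
        Self { trusted_peer: false }
    }

    /// An op may dispatch if the peer is trusted or the op is marked remote-safe.
    pub fn allows(&self, remote_safe: bool) -> bool {
        self.trusted_peer || remote_safe
    }
}

/// Hypothetical stand-in for `HandlerRegistration`: only the flag matters here.
pub struct Registration {
    pub remote_safe: bool,
}

/// Hypothetical gate result; the real dispatcher answers with protocol frames.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    NotFound,
    Dispatch,
}

/// Runs the filter check first, before any context would be built.
/// Assumption in this sketch: an unknown op and a filtered op both yield
/// `NotFound`, so the peer cannot tell them apart.
pub fn gate(filter: RemoteFilter, registry: &HashMap<&str, Registration>, op: &str) -> Outcome {
    match registry.get(op) {
        Some(reg) if filter.allows(reg.remote_safe) => Outcome::Dispatch,
        _ => Outcome::NotFound,
    }
}
```

Keeping the predicate a single boolean expression makes the two modes easy to audit: trusted mode short-circuits, and default-deny reduces to the op's own `remote_safe` flag.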
`CallClient::spawn_dispatch(connection)` is the lower-level API that takes a
pre-established `Connection`, constructs a `CallConnection`, builds a
`Dispatcher` with the appropriate `RemoteFilter`, spawns the dispatch task,
and returns the live `CallConnection`. `connect()` uses it after the QUIC dial
completes; tests use it to wire mock/loopback connections directly.
#### services/list peer-scoped serving
The `services/list` hide behavior (ADR-028 Assumption 2) is wired via a
separate handler factory: `services_list_handler_peer_scoped(registry,
trusted_peer)` in `registry/discovery.rs`, backed by
`OperationRegistry::list_operations_peer_scoped(trusted_peer)`. The assembly
layer constructs the `CallClient`'s registry with this peer-scoped handler
(not the plain `services_list_handler` used by the `CallAdapter`'s local
accept path) so that when the remote peer calls `services/list` on the
`CallClient`, the response hides non-remote-safe ops in default-deny mode.
The dispatch-path `RemoteFilter` (above) and the `services/list`-handler
filter are the two halves of the same default-deny posture — discovery and
dispatch filters agree.
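To illustrate why the discovery and dispatch filters must agree, here is a small sketch under stated assumptions. Only the method name `list_operations_peer_scoped` and the `trusted_peer` / `remote_safe` flags come from the spec; `OpEntry`, `Registry`, `dispatch_allowed`, and the op names are hypothetical, and the real `OperationRegistry` is certainly richer than this.

```rust
/// Hypothetical, trimmed-down registration entry: a name plus the remote-safe flag.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct OpEntry {
    pub name: String,
    pub remote_safe: bool,
}

/// Hypothetical stand-in for `OperationRegistry`.
pub struct Registry {
    pub ops: Vec<OpEntry>,
}

impl Registry {
    /// Discovery half: names visible to the peer. In default-deny mode
    /// (`trusted_peer == false`) ops that are not remote-safe are hidden.
    pub fn list_operations_peer_scoped(&self, trusted_peer: bool) -> Vec<&str> {
        self.ops
            .iter()
            .filter(|op| trusted_peer || op.remote_safe)
            .map(|op| op.name.as_str())
            .collect()
    }

    /// Dispatch half, using the same predicate so that every listed op is
    /// callable and every hidden op is refused.
    pub fn dispatch_allowed(&self, name: &str, trusted_peer: bool) -> bool {
        self.ops
            .iter()
            .any(|op| op.name == name && (trusted_peer || op.remote_safe))
    }
}
```

If the two halves used different predicates, a peer could either see an op it cannot call or call an op it was never shown; sharing one condition avoids both cases in this sketch.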
### Credential sources for connections
`CallClient::connect()` takes a `CallCredentials` bundle. Credentials come
@@ -139,6 +202,14 @@ pub struct CallCredentials {
    pub auth_token: Option<AuthToken>, // call-protocol-level token
    pub remote_identity: Option<RemoteIdentity>, // expected fingerprint/cert
}
/// Expected identity of the remote node (ADR-017 §7). v1 carries a
/// fingerprint string the assembly layer derives from `Capabilities`.
pub struct RemoteIdentity { pub fingerprint: String }
/// Errors produced by `CallClient::connect`.
#[non_exhaustive]
pub enum ClientError { Transport { .. }, TlsSetup { .. }, ConnectionClosed }
```
- **TLS identity** — the local node's Ed25519 raw key (RFC 7250) or X.509 cert,
@@ -154,6 +225,18 @@ invariant (below). The concrete shapes of `TlsIdentity`, `AuthToken`, and
`RemoteIdentity` are implementation-detail two-way doors; the one-way
constraints are that they come from `Capabilities`, not env vars (ADR-014).
**v1 TLS client-auth gap** (OQ-29): v1 `connect()` builds the quinn client
config with `with_no_client_auth()` and an `AcceptAnyServerCertVerifier` — the
client does not present its TLS identity as a client cert, and does not pin the
remote's expected identity from `credentials.remote_identity`. This is a
two-way-door remainder: wiring the local node's RawKey/X509 identity as a
rustls client-auth cert (for servers that verify client identity) and
plugging `credentials.remote_identity` into a real `ServerCertVerifier` is
additive. The one-way constraint (credentials from `Capabilities`, not env
vars, ADR-014) is unaffected — the `auth_token` dimension flows through the
call-protocol `auth_token` payload field, not TLS, so the no-env-vars
invariant holds independently of this gap.
### from_call
`from_call` discovers the remote peer's `External` operations and registers
@@ -511,6 +594,14 @@ See [open-questions.md](../../open-questions.md) for full details.
auto-on-reconnect; the explicit path is additive.
- **OQ-28** (open, two-way): `from_call` namespace collision behavior — error
on collision (v1 default, recorded here) vs last-wins.
- **OQ-29** (open, two-way): `CallClient` TLS client-auth + remote-identity
verification — v1 connects with `with_no_client_auth()` and
`AcceptAnyServerCertVerifier` (does not present a client cert, does not pin
the remote's expected identity from `credentials.remote_identity`). Wiring
the local node's RawKey/X509 identity as a rustls client-auth cert and
plugging `remote_identity` into a real `ServerCertVerifier` is additive.
The one-way constraint (credentials from `Capabilities`, ADR-014) is
unaffected — `auth_token` flows through the call-protocol payload, not TLS.
## References


@@ -409,3 +409,27 @@ revisited during implementation without a new ADR.
remote's op behind another's, which is the kind of surprise the
default-deny posture exists to avoid.
- **Cross-references**: ADR-015, ADR-017, ADR-028, [client-and-adapters.md](crates/call/client-and-adapters.md)
### OQ-29: CallClient TLS Client-Auth and Remote-Identity Verification
- **Origin**: [client-and-adapters.md](crates/call/client-and-adapters.md), ADR-017 §7
- **Status**: open
- **Door type**: Two-way
- **Priority**: medium
- **Resolution**: v1 `CallClient::connect()` builds the quinn client config
with `with_no_client_auth()` and an `AcceptAnyServerCertVerifier` — the
client does not present its TLS identity (`credentials.tls_identity`) as a
client cert, and does not pin the remote's expected identity from
`credentials.remote_identity`. The server-side
`AcceptAnyCertVerifier` (in alknet-core's endpoint) does not require or
verify client certs, so a client cert is not needed to establish a
connection in v1. Wiring the local node's RawKey/X509 identity as a rustls
client-auth cert (for servers that *do* verify client identity) and
plugging `credentials.remote_identity` into a real `ServerCertVerifier` is
additive — a two-way-door remainder surfaced during implementation.
**The one-way constraint (credentials from `Capabilities`, not env vars,
ADR-014) is unaffected**: the `auth_token` dimension flows through the
call-protocol `auth_token` payload field, not TLS, so the no-env-vars
invariant holds independently of this gap. To be decided in a future task that
wires RawKey client-auth; recorded here, not in a full ADR.
- **Cross-references**: ADR-014, ADR-017, ADR-027, [client-and-adapters.md](crates/call/client-and-adapters.md), [endpoint.md](crates/core/endpoint.md)