docs(arch): sync call-completion specs with implementation — Dispatcher/RemoteFilter, ClientError, OQ-29

Post-implementation spec sync after the call-completion batch landed (commits e4a2594..a3825f5). The sub-agent review flagged no spec drift, but comparing the implemented types against the spec sketches surfaced five details the specs didn't name — filled in here so the spec matches what was built:

- client-and-adapters.md: name the shared Dispatcher (protocol/dispatch.rs) + RemoteFilter mechanism that enforces ADR-028's default-deny at dispatch time (the load-bearing security gate — checks remote_safe before building context, before any capability material reaches the handler). Add ClientError/RemoteIdentity types, the spawn_dispatch lower-level API, and the services_list_handler_peer_scoped wiring (the assembly layer must register the peer-scoped services/list handler for a CallClient's registry, not the plain one). Record the v1 TLS client-auth gap (AcceptAnyServerCertVerifier, with_no_client_auth) as OQ-29.
- call-protocol.md: point the adapter dispatch-loop description at the shared Dispatcher (dispatch.rs) so readers find the mechanism ADR-017 §1 commits to.
- open-questions.md: OQ-29 — CallClient TLS client-auth + remote-identity verification is a two-way-door remainder; the no-env-vars invariant is unaffected (auth_token flows via call-protocol payload, not TLS).
- READMEs: current-state now reflects completion done + reviewed (207 lib + 2 integration tests); OQ-29 added to both OQ summaries.
@@ -9,9 +9,9 @@ last_updated: 2026-06-26
**Pre-implementation.** The project has completed a pivot from a three-layer model to an ALPN-as-service model. The greenfield workspace contains only `alknet-vault` (stable — implementation complete and verified, local-only by construction per ADR-025, HD-derivation key model per ADR-026) and research/reference material. Foundational ADRs (001–028) are in place. ADR-024 resolves the registry mutability question and the `OperationContext.env` type identity crisis by layering the registry by trust boundary. ADR-025 drops irpc from the vault, making it local-only by construction. ADR-026 records the HD-derivation key model as a foundational decision. Review #003 (type/API surface completeness) resolved: `DerivedKey` derive contradiction, `encrypt` prose, return-type divergence, RwLock contradiction, drift table gaps, ADR-022 stale sketches, `Capabilities`/`SessionOverlaySource`/`CallConnection`/`CachedKey` definitions, `CompositeOperationEnv` dispatch contract, `with_local` signature, payload schemas, timeout propagation, and request ID generation. The alknet-core and alknet-call crate specs are in draft; the alknet-vault crate specs are stable.

The alknet-call server-side core (`CallAdapter`, `CallConnection` dispatch loop, wire framing, pending map, abort cascade, operation registry, service discovery) is implemented and tested (159 tests passing). The call-completion gap analysis (`docs/research/alknet-call-completion/gap-analysis.md`) identified the missing client/adapter surface specced in ADR-017 — `CallClient`, `from_call`, `from_jsonschema`, the `OperationAdapter` trait — plus four decisions (DC-1..4) needed before implementation. DC-1 (the one-way door: peer-scoped registry filtering) is resolved by ADR-028; DC-2/3/4 are two-way-door defaults recorded in `client-and-adapters.md` and tracked as OQ-25..28. The client/adapter surface is specced (`crates/call/client-and-adapters.md`); implementation is pending.
The alknet-call crate is **implemented and reviewed** — both the server-side core and the client/adapter surface. The server-side core (`CallAdapter`, `CallConnection` dispatch loop, wire framing, pending map, abort cascade, operation registry, service discovery) and the client/adapter surface (`CallClient`, `from_call`, `from_jsonschema`, `OperationAdapter` trait, shared `Dispatcher`) are implemented and tested (207 lib + 2 integration tests passing). The call-completion gap analysis (`docs/research/alknet-call-completion/gap-analysis.md`) identified the missing client/adapter surface specced in ADR-017 plus four decisions (DC-1..4); all are resolved — DC-1 by ADR-028 (peer-scoped default-deny filtering), DC-2/3/4 as two-way-door defaults in `client-and-adapters.md` (OQ-25..28). A post-implementation review (`tasks/call/review-completion.md`) confirmed spec conformance; the one spec-sketch omission (`connect()`'s `ClientError` return type) was the intended illustrative-sketch gap, now filled in. A TLS client-auth gap surfaced during implementation is tracked as OQ-29.

**Next step**: Implementation of the alknet-call client/adapter surface (priority order in `client-and-adapters.md`): `CallClient` → `from_call` → `OperationAdapter` trait → `from_jsonschema`. All one-way doors are resolved; remaining open questions (OQ-25..28) are two-way-door shape/defaults decided during implementation.
**Next step**: The alknet-call crate is ready for downstream consumers (alknet-http's `OperationAdapter` implementations, the container-service/runner pattern, alknet-agent, alknet-napi). The remaining open questions (OQ-25..29) are all two-way-door shape/defaults, not blockers. The next crate phase is alknet-http (Phase 0 findings in `docs/research/alknet-http/`).

## Architecture Documents
@@ -102,6 +102,7 @@ See [open-questions.md](open-questions.md) for the full tracker.
- **OQ-26**: `OperationAdapter` error type — `import()` returns `Result<_, AdapterError>`; variants decided in implementation
- **OQ-27**: `from_call` re-import trigger — v1 default auto-on-reconnect; explicit `refresh()` additive
- **OQ-28**: `from_call` namespace collision — v1 default error-on-collision (no prefix by default)
- **OQ-29**: `CallClient` TLS client-auth + remote-identity verification — v1 connects with `with_no_client_auth()` and `AcceptAnyServerCertVerifier`; wiring RawKey client-auth is additive (the no-env-vars invariant is unaffected — `auth_token` flows through the call-protocol payload, not TLS)

**Deferred (not active):**
- **OQ-09**: WASM target boundaries — design constraint, not deliverable
@@ -53,6 +53,7 @@ Structured RPC over QUIC: operations, request/response, streaming subscriptions,
| OQ-26 | OperationAdapter error type (AdapterError variants) | open (two-way) | `import()` returns `Result<_, AdapterError>`; variants decided in implementation |
| OQ-27 | from_call re-import trigger | open (two-way) | v1 default: auto-on-reconnect; explicit `refresh()` is additive |
| OQ-28 | from_call namespace collision behavior | open (two-way) | v1 default: error on collision (no prefix by default) |
| OQ-29 | CallClient TLS client-auth and remote-identity verification | open (two-way) | v1 connects with `with_no_client_auth()` + `AcceptAnyServerCertVerifier`; wiring RawKey client-auth and a real `ServerCertVerifier` is additive (no-env-vars invariant unaffected — `auth_token` flows via call-protocol payload, not TLS) |

## Key Design Principles
@@ -164,6 +164,15 @@ The adapter:
5. Writes response `EventEnvelope` frames back to the appropriate stream
6. Manages the `PendingRequestMap` for outgoing calls

The dispatch loop is **shared** with `CallClient` (ADR-017 §1): both
`CallAdapter::handle` (accept path) and `CallClient::connect` (connect path)
construct a `Dispatcher` (`protocol/dispatch.rs`) and call `run_loop` — the
dispatch half is one implementation, the connection-establishment half differs
(accept vs dial). The `Dispatcher` carries a `RemoteFilter` (ADR-028) that
gates dispatch by `remote_safe`; the accept path uses `RemoteFilter::trusted()`
by convention. See [client-and-adapters.md](client-and-adapters.md) for the
`Dispatcher`/`RemoteFilter` mechanism.

### Stream Model

See ADR-012 for the full rationale.
@@ -538,6 +547,7 @@ See [open-questions.md](../../open-questions.md) for full details.
- **OQ-16** (resolved by ADR-014): No vault operations are exposed over the call protocol for now.
- **OQ-19** (resolved): Session-scoped operation registries — agent-written operations overlaid on global registry via `OperationEnv` trait layering. Protocol doesn't need changes; `OperationEnv` must remain a trait.
- **OQ-25..28** (open, two-way): Call-completion remainders — `CallClient` remote-safe marking shape, `OperationAdapter` error type, `from_call` re-import trigger, `from_call` namespace collision. The `CallClient`/adapter surface itself is specced in [client-and-adapters.md](client-and-adapters.md); the one-way door among these (existence of default-deny filtering) is resolved by ADR-028.
- **OQ-29** (open, two-way): `CallClient` TLS client-auth + remote-identity verification — v1 connects with `with_no_client_auth()` and `AcceptAnyServerCertVerifier`; wiring RawKey client-auth and a real `ServerCertVerifier` is additive. See [client-and-adapters.md](client-and-adapters.md).

## References
@@ -92,15 +92,9 @@ pub struct CallClient {
}

impl CallClient {
    /// Open a QUIC connection to `addr` on ALPN `alknet/call`, perform
    /// credential handshake, and return a CallConnection running the shared
    /// dispatch loop. Credentials come from capabilities (ADR-014), not env
    /// vars — see "No-Env-Vars Invariant" below.
    pub async fn connect(
        &self,
        addr: SocketAddr,
        credentials: CallCredentials,
    ) -> Result<CallConnection>;
    /// Default-deny mode: only `remote_safe: true` ops dispatch/list to the
    /// remote peer (ADR-028).
    pub fn new(registry: Arc<OperationRegistry>, idp: Arc<dyn IdentityProvider>) -> Self;

    /// Trusted-peer mode: construct a CallClient that exposes all External
    /// ops from `registry` to the remote peer, ignoring the remote-safe
@@ -109,6 +103,18 @@ impl CallClient {
        registry: Arc<OperationRegistry>,
        identity_provider: Arc<dyn IdentityProvider>,
    ) -> Self;

    /// Open a QUIC connection to `addr` on ALPN `alknet/call`, perform
    /// credential handshake, and return a CallConnection running the shared
    /// dispatch loop. Credentials come from capabilities (ADR-014), not env
    /// vars — see "No-Env-Vars Invariant" below. The dispatch loop runs on a
    /// spawned task; the returned `CallConnection` is live until the remote
    /// closes the connection or the caller drops it.
    pub async fn connect(
        &self,
        addr: SocketAddr,
        credentials: CallCredentials,
    ) -> Result<CallConnection, ClientError>;
}
```
@@ -127,6 +133,63 @@ both a caller and a callee — it dispatches incoming calls from the remote
peer against its peer-scoped registry view, and it initiates outgoing calls
through the `CallConnection::call()` / `subscribe()` / `abort()` API.

#### Shared Dispatcher

The shared dispatch loop lives in `protocol/dispatch.rs` as the `Dispatcher`
struct. This is the architectural mechanism that keeps `CallClient` from
becoming a parallel protocol implementation (ADR-017 §1): both `CallAdapter`'s
accept path and `CallClient`'s connect path construct a `Dispatcher` and call
`run_loop` — the dispatch half is one implementation, the
connection-establishment half differs (accept vs dial).

```rust
/// Peer-scoped registry filter state (ADR-028). `trusted_peer: false`
/// (default-deny for a CallClient) hides ops whose
/// `HandlerRegistration.remote_safe` is false from both dispatch and
/// `services/list`. `trusted_peer: true` (explicit opt-in, also used by the
/// CallAdapter's local accept path) bypasses the filter.
pub struct RemoteFilter { pub trusted_peer: bool }

/// Shared dispatcher for an established CallConnection. Constructed by both
/// CallAdapter (accept path) and CallClient (connect path). Holds no
/// per-connection state; the CallConnection is passed into run_loop.
pub struct Dispatcher {
    pub registry: Arc<OperationRegistry>,
    pub identity_provider: Arc<dyn IdentityProvider>,
    pub session_source: Option<Arc<dyn SessionOverlaySource + Send + Sync>>,
    pub default_timeout: Duration,
    pub remote_filter: RemoteFilter,
}
```

The `remote_filter` is the dispatch-time gate that enforces ADR-028's
default-deny: `dispatch_requested` checks `remote_filter.allows(registration.remote_safe)`
**before** building the context or invoking the handler — a non-remote-safe op
returns `NOT_FOUND` before any capability material reaches the handler (the
security argument for default-deny, ADR-028 Context). The accept path
(`CallAdapter`) uses `RemoteFilter::trusted()` by convention — a direct QUIC
client is not a filtered `CallClient` peer in the ADR-028 sense.
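
The gate ordering described above can be sketched with toy types. This is an illustration under stated assumptions, not the crate's code: the body of `allows`, the trimmed `HandlerRegistration`, the `DispatchError` enum, the `echo_handler` function, and the `dispatch_requested` signature are hypothetical. Only the names `RemoteFilter`, `trusted_peer`, `trusted()`, `allows`, `remote_safe`, and the `NOT_FOUND` outcome come from the spec.

```rust
use std::collections::HashMap;

/// Peer-scoped filter state, mirroring the spec's `RemoteFilter`.
#[derive(Clone, Copy)]
pub struct RemoteFilter {
    pub trusted_peer: bool,
}

impl RemoteFilter {
    /// Trusted peers bypass the filter (accept path, explicit opt-in).
    pub fn trusted() -> Self {
        Self { trusted_peer: true }
    }

    /// Assumed body: an op passes if the peer is trusted or the op is remote-safe.
    pub fn allows(&self, remote_safe: bool) -> bool {
        self.trusted_peer || remote_safe
    }
}

/// Hypothetical, trimmed-down registration: the flag the gate reads plus a handler.
pub struct HandlerRegistration {
    pub remote_safe: bool,
    pub handler: fn(&str) -> String,
}

/// Hypothetical error type standing in for the protocol's NOT_FOUND code.
#[derive(Debug, PartialEq)]
pub enum DispatchError {
    NotFound,
}

/// Hypothetical handler used only to make the sketch callable.
pub fn echo_handler(payload: &str) -> String {
    format!("echo:{payload}")
}

/// Gate first; build context and invoke the handler only if the gate passes.
pub fn dispatch_requested(
    registry: &HashMap<&str, HandlerRegistration>,
    filter: RemoteFilter,
    op: &str,
    payload: &str,
) -> Result<String, DispatchError> {
    let registration = registry.get(op).ok_or(DispatchError::NotFound)?;
    if !filter.allows(registration.remote_safe) {
        // Same answer as an unknown op: the peer cannot tell hidden from absent.
        return Err(DispatchError::NotFound);
    }
    // Context construction (capabilities, identity) would happen here,
    // after the gate, so a filtered op never sees capability material.
    Ok((registration.handler)(payload))
}
```

In this sketch a default-deny filter (`trusted_peer: false`) returns `NotFound` for an op registered with `remote_safe: false`, exactly as it does for an unregistered name, while `RemoteFilter::trusted()` dispatches both kinds of op.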

`CallClient::spawn_dispatch(connection)` is the lower-level API that takes a
pre-established `Connection`, constructs a `CallConnection`, builds a
`Dispatcher` with the appropriate `RemoteFilter`, spawns the dispatch task,
and returns the live `CallConnection`. `connect()` uses it after the QUIC dial
completes; tests use it to wire mock/loopback connections directly.

#### services/list peer-scoped serving

The `services/list` hide behavior (ADR-028 Assumption 2) is wired via a
separate handler factory: `services_list_handler_peer_scoped(registry,
trusted_peer)` in `registry/discovery.rs`, backed by
`OperationRegistry::list_operations_peer_scoped(trusted_peer)`. The assembly
layer constructs the `CallClient`'s registry with this peer-scoped handler
(not the plain `services_list_handler` used by the `CallAdapter`'s local
accept path) so that when the remote peer calls `services/list` on the
`CallClient`, the response hides non-remote-safe ops in default-deny mode.
The dispatch-path `RemoteFilter` (above) and the `services/list`-handler
filter are the two halves of the same default-deny posture — discovery and
dispatch filters agree.
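
The agreement between the two halves can be illustrated with a small model. This is a sketch under assumptions: `OperationInfo`, the `Vec`-backed registry, the `Vec<String>` return type, and `dispatch_allowed` are hypothetical stand-ins; only the method name `list_operations_peer_scoped` and its `trusted_peer` parameter are taken from the spec.

```rust
/// Hypothetical, minimal view of a registered operation.
#[derive(Clone, Debug, PartialEq)]
pub struct OperationInfo {
    pub name: String,
    pub remote_safe: bool,
}

/// Toy registry backed by a Vec; the real registry is richer.
pub struct OperationRegistry {
    ops: Vec<OperationInfo>,
}

impl OperationRegistry {
    pub fn new(ops: Vec<OperationInfo>) -> Self {
        Self { ops }
    }

    /// Discovery half: names visible to the remote peer via `services/list`.
    pub fn list_operations_peer_scoped(&self, trusted_peer: bool) -> Vec<String> {
        self.ops
            .iter()
            .filter(|op| trusted_peer || op.remote_safe)
            .map(|op| op.name.clone())
            .collect()
    }

    /// Dispatch half, reduced to a yes/no answer (hypothetical helper).
    pub fn dispatch_allowed(&self, trusted_peer: bool, name: &str) -> bool {
        self.ops
            .iter()
            .any(|op| op.name == name && (trusted_peer || op.remote_safe))
    }
}
```

Both methods apply the same predicate (`trusted_peer || remote_safe`), so in this model an op is listed exactly when it would be dispatched. If the assembly layer registered the plain handler instead, discovery would advertise ops that dispatch then rejects.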

### Credential sources for connections

`CallClient::connect()` takes a `CallCredentials` bundle. Credentials come
@@ -139,6 +202,14 @@ pub struct CallCredentials {
    pub auth_token: Option<AuthToken>,           // call-protocol-level token
    pub remote_identity: Option<RemoteIdentity>, // expected fingerprint/cert
}

/// Expected identity of the remote node (ADR-017 §7). v1 carries a
/// fingerprint string the assembly layer derives from `Capabilities`.
pub struct RemoteIdentity { pub fingerprint: String }

/// Errors produced by `CallClient::connect`.
#[non_exhaustive]
pub enum ClientError { Transport { .. }, TlsSetup { .. }, ConnectionClosed }
```

- **TLS identity** — the local node's Ed25519 raw key (RFC 7250) or X.509 cert,
@@ -154,6 +225,18 @@ invariant (below). The concrete shapes of `TlsIdentity`, `AuthToken`, and
`RemoteIdentity` are implementation-detail two-way doors; the one-way
constraints are that they come from `Capabilities`, not env vars (ADR-014).

**v1 TLS client-auth gap** (OQ-29): v1 `connect()` builds the quinn client
config with `with_no_client_auth()` and an `AcceptAnyServerCertVerifier` — the
client does not present its TLS identity as a client cert, and does not pin the
remote's expected identity from `credentials.remote_identity`. This is a
two-way-door remainder: wiring the local node's RawKey/X509 identity as a
rustls client-auth cert (for servers that verify client identity) and
plugging `credentials.remote_identity` into a real `ServerCertVerifier` is
additive. The one-way constraint (credentials from `Capabilities`, not env
vars, ADR-014) is unaffected — the `auth_token` dimension flows through the
call-protocol `auth_token` payload field, not TLS, so the no-env-vars
invariant holds independently of this gap.
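
To make the missing pinning step concrete, here is a sketch of the policy decision a real verifier could delegate to once `credentials.remote_identity` is wired in. It is deliberately independent of rustls and quinn: `check_remote_identity` and `PinError` are hypothetical names, and the sketch assumes the presented fingerprint has already been computed and normalized to the same string form as `RemoteIdentity::fingerprint`.

```rust
/// Expected identity of the remote node, as in the spec sketch.
pub struct RemoteIdentity {
    pub fingerprint: String,
}

/// Hypothetical error for a failed pin check.
#[derive(Debug, PartialEq)]
pub enum PinError {
    /// The peer presented a fingerprint other than the expected one.
    Mismatch { expected: String, presented: String },
}

/// Hypothetical policy check. With `expected == None` every peer is accepted;
/// v1 behaves as if `expected` were always `None`, whatever the credentials say.
pub fn check_remote_identity(
    expected: Option<&RemoteIdentity>,
    presented_fingerprint: &str,
) -> Result<(), PinError> {
    match expected {
        None => Ok(()),
        Some(id) if id.fingerprint == presented_fingerprint => Ok(()),
        Some(id) => Err(PinError::Mismatch {
            expected: id.fingerprint.clone(),
            presented: presented_fingerprint.to_string(),
        }),
    }
}
```

Because the check only tightens acceptance when an expected identity is supplied, adding it later does not change behavior for callers that pass no `remote_identity`, which is consistent with describing the work as additive.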

### from_call

`from_call` discovers the remote peer's `External` operations and registers
@@ -511,6 +594,14 @@ See [open-questions.md](../../open-questions.md) for full details.
  auto-on-reconnect; the explicit path is additive.
- **OQ-28** (open, two-way): `from_call` namespace collision behavior — error
  on collision (v1 default, recorded here) vs last-wins.
- **OQ-29** (open, two-way): `CallClient` TLS client-auth + remote-identity
  verification — v1 connects with `with_no_client_auth()` and
  `AcceptAnyServerCertVerifier` (does not present a client cert, does not pin
  the remote's expected identity from `credentials.remote_identity`). Wiring
  the local node's RawKey/X509 identity as a rustls client-auth cert and
  plugging `remote_identity` into a real `ServerCertVerifier` is additive.
  The one-way constraint (credentials from `Capabilities`, ADR-014) is
  unaffected — `auth_token` flows through the call-protocol payload, not TLS.

## References
@@ -409,3 +409,27 @@ revisited during implementation without a new ADR.
  remote's op behind another's, which is the kind of surprise the
  default-deny posture exists to avoid.
- **Cross-references**: ADR-015, ADR-017, ADR-028, [client-and-adapters.md](crates/call/client-and-adapters.md)
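
The v1 error-on-collision default for OQ-28 can be sketched as follows. Everything here is illustrative: the map-of-strings registry, `ImportError`, and `import_error_on_collision` are hypothetical, and the all-or-nothing behavior (nothing is registered if any name collides) is an assumption of this sketch, not something the spec states.

```rust
use std::collections::HashMap;

/// Hypothetical import error for the collision case.
#[derive(Debug, PartialEq)]
pub enum ImportError {
    Collision { name: String },
}

/// Error-on-collision import: refuse the whole batch if any remote op name
/// is already registered locally, so no existing op is silently replaced.
/// `local` maps op name to a label for where the op came from.
pub fn import_error_on_collision(
    local: &mut HashMap<String, String>,
    remote_ops: &[(&str, &str)],
) -> Result<usize, ImportError> {
    // First pass: detect a collision before mutating anything.
    if let Some((name, _)) = remote_ops
        .iter()
        .find(|(name, _)| local.contains_key(*name))
    {
        return Err(ImportError::Collision { name: name.to_string() });
    }
    // Second pass: register every remote op under its own name (no prefix).
    for (name, origin) in remote_ops {
        local.insert(name.to_string(), origin.to_string());
    }
    Ok(remote_ops.len())
}
```

A last-wins variant would skip the first pass and let `insert` overwrite the existing entry, which is the hiding behavior the entry above argues against.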

### OQ-29: CallClient TLS Client-Auth and Remote-Identity Verification

- **Origin**: [client-and-adapters.md](crates/call/client-and-adapters.md), ADR-017 §7
- **Status**: open
- **Door type**: Two-way
- **Priority**: medium
- **Resolution**: v1 `CallClient::connect()` builds the quinn client config
  with `with_no_client_auth()` and an `AcceptAnyServerCertVerifier` — the
  client does not present its TLS identity (`credentials.tls_identity`) as a
  client cert, and does not pin the remote's expected identity from
  `credentials.remote_identity`. The server-side
  `AcceptAnyCertVerifier` (in alknet-core's endpoint) does not require or
  verify client certs, so a client cert is not needed to establish a
  connection in v1. Wiring the local node's RawKey/X509 identity as a rustls
  client-auth cert (for servers that *do* verify client identity) and
  plugging `credentials.remote_identity` into a real `ServerCertVerifier` is
  additive — a two-way-door remainder surfaced during implementation.
  **The one-way constraint (credentials from `Capabilities`, not env vars,
  ADR-014) is unaffected**: the `auth_token` dimension flows through the
  call-protocol `auth_token` payload field, not TLS, so the no-env-vars
  invariant holds independently of this gap. Decided during a future task that
  wires RawKey client-auth; recorded here, not in a full ADR.
- **Cross-references**: ADR-014, ADR-017, ADR-027, [client-and-adapters.md](crates/call/client-and-adapters.md), [endpoint.md](crates/core/endpoint.md)