docs(arch): ADR-034 — outgoing-only X.509 and three peer roles, resolve OQ-37

Untangles the conflation of three distinct remote roles under 'X.509
endpoint': (1) public X.509 endpoint — a remote HTTPS/call-over-TLS
server the local node is a client of (no PeerEntry, no PeerId, not in
the peer graph; CA verification + bearer token); (2) transport relay —
iroh's DERP-equivalent, infrastructure, not an alknet peer; (3) hub /
hosting node — an alknet peer that also exposes a public domain + X.509
for browsers (mixed-fingerprint PeerEntry, already supported by
ADR-030).

The load-bearing one-way door is the client-side verifier selection
rule: known peer (PeerEntry present) → fingerprint pin; unknown X.509
remote → CA verification (WebPkiServerVerifier); unknown Ed25519
remote → fails closed. This closes the AcceptAnyServerCertVerifier
security hole OQ-29 flagged, with the peer-model criterion (PeerEntry
presence) made explicit. The 'make PeerEntry symmetric' instinct is
rejected — pure-client connections to public APIs have no stable
logical identity to pin.

Documents that CallCredentials.remote_identity: None is load-bearing
(None = public X.509 endpoint → CA path, not a missing field; Some =
known peer → fingerprint pin), closing a subtle gap where an
implementer could have defaulted to a placeholder or treated None as
skip-verify.

Records WebTransport relay-as-proxy (deferred with h3/WebTransport,
new OQ-HTTP-07) and on-chain/smart-contract peer discovery (fits the
OQ-36 repo/adapter pattern, no auth-model change) so they aren't lost.

Amends auth.md and client-and-adapters.md with the three-role naming,
the verifier selection rule, and the Option semantics; updates OQ-37
to resolved in open-questions.md, README.md, and both crate READMEs.
2026-06-28 10:47:49 +00:00
parent 3f011cbb82
commit 6cc8715ccf
8 changed files with 602 additions and 65 deletions


@@ -64,7 +64,7 @@ Structured RPC over QUIC: operations, request/response, streaming subscriptions,
| OQ-33 | PeerId — crypto identity vs stable logical id | **resolved** (ADR-030) | `PeerId = Identity.id = PeerEntry.peer_id` (stable across key rotation) |
| OQ-34 | Persistent peer registry | **resolved** (ADR-030+033) | Core trait + in-memory default; persistence adapters are separate crates |
| OQ-35 | ~~API key asymmetry~~ | **dissolved** | `PeerEntry` supports multiple credential paths; `ApiKeyEntry` is for tokens that ARE the identity |
| OQ-37 | X.509 outgoing-only case | open | Three auth types; how X.509 server identity fits the peer model. Not blocking. |
| OQ-37 | X.509 outgoing-only case | **resolved** (ADR-034) | Three remote roles (public X.509 endpoint, transport relay, hub); `PeerEntry` asymmetry correct; verifier by `PeerEntry` presence |
## Key Design Principles


@@ -1,6 +1,6 @@
---
status: draft
last_updated: 2026-06-27
last_updated: 2026-06-28
---
# alknet-call — Client and Adapters
@@ -205,16 +205,28 @@ credential dimensions (ADR-017 §7):
pub struct CallCredentials {
pub tls_identity: Option<TlsIdentity>, // RFC 7250 raw key or X.509
pub auth_token: Option<AuthToken>, // call-protocol-level token
pub remote_identity: Option<RemoteIdentity>, // expected fingerprint/cert
pub remote_identity: Option<RemoteIdentity>, // expected fingerprint/cert (None = CA path, see below)
}
/// Expected identity of the remote node (ADR-017 §7). v1 carries a
/// fingerprint string the assembly layer derives from `Capabilities`.
/// Expected identity of the remote node (ADR-017 §7, extended by
/// ADR-034 §2). Carries a fingerprint string the assembly layer
/// derives from `Capabilities` when the local node has a `PeerEntry`
/// for the remote (the known-peer case → fingerprint pin).
///
/// `remote_identity: None` is the **public X.509 endpoint** case: the
/// local node has no `PeerEntry` for the remote, so there is no
/// fingerprint to pin. Combined with an X.509 transport, `None`
/// selects CA verification (`WebPkiServerVerifier`) per the
/// verifier-selection rule in ADR-034 §3. Combined with an Ed25519
/// raw-key transport, `None` fails closed (raw-key remotes are always
/// known peers — no CA to fall back to).
///
/// The `Option` is therefore load-bearing, not cosmetic: `Some(fingerprint)`
/// means "pin this" (known peer), `None` means "trust the CA or fail"
/// (unknown remote). An implementer must not default `remote_identity`
/// to a placeholder value to "satisfy" the field — `None` is a real
/// state that drives verifier selection.
pub struct RemoteIdentity { pub fingerprint: String }
/// Errors produced by `CallClient::connect`.
#[non_exhaustive]
pub enum ClientError { Transport { .. }, TlsSetup { .. }, ConnectionClosed }
```
- **TLS identity** — the local node's Ed25519 raw key (RFC 7250) or X.509 cert,
@@ -222,7 +234,10 @@ pub enum ClientError { Transport { .. }, TlsSetup { .. }, ConnectionClosed }
- **Auth token** — an opaque call-protocol-level token, decrypted from the
vault or derived from a shared secret.
- **Remote identity verification** — the expected fingerprint/cert of the
remote node, stored as a capability.
remote node, stored as a capability. `Some` → fingerprint pin (known
peer with a `PeerEntry`); `None` → CA verification for X.509 remotes,
fail-closed for Ed25519 raw-key remotes (ADR-034 §2/§3). The `None`
case is the public-X.509-endpoint path, not a missing field.
These are populated by the assembly layer at `CallClient` construction time
from vault-derived `Capabilities`. The credential path is the no-env-vars
@@ -242,6 +257,22 @@ vars, ADR-014) is unaffected — the `auth_token` dimension flows through the
call-protocol `auth_token` payload field, not TLS, so the no-env-vars
invariant holds independently of this gap.
**Outgoing X.509 and the peer model** (ADR-034): the client-side
`ServerCertVerifier` is selected by whether the local node has a
`PeerEntry` for the remote, not by key type alone. A pure-client
connection to a **public X.509 endpoint** (no `PeerEntry` on the local
side — e.g., dialing `api.alk.dev` or a third-party API) uses
`WebPkiServerVerifier` (CA verification), gets **no `PeerId`** on the
client side, and is **not added to `PeerCompositeEnv`** — it is not in
the call-protocol peer graph (ADR-029). Ops discovered via `from_call`
on such a connection land in the connection's Layer 2 overlay
(ADR-024) and are invoked through the `CallConnection` handle directly,
not via `PeerRef::Specific`. A connection to a **hub** (a `PeerEntry`
with mixed Ed25519 + X.509 fingerprints) uses fingerprint pinning on
both cert paths and does enter the peer graph. See
[ADR-034](../../decisions/034-outgoing-only-x509-and-three-peer-roles.md)
for the verifier selection rule and the three-role naming.
### from_call
`from_call` discovers the remote peer's `External` operations and registers
@@ -591,6 +622,22 @@ Based on the gap analysis and the downstream unblock chain:
- **MCP stdio transport is not built.** Streamable HTTP is the only supported
MCP transport in alknet. stdio = spawn arbitrary executable = built-in RCE.
Recorded as an explicit security position, not a feature gap.
- **Pure-client X.509 connections are not in the peer graph on the client
side.** A `CallClient` connection to a public X.509 endpoint with no
local `PeerEntry` for the remote gets no `PeerId`, is not added to
`PeerCompositeEnv`, and is not addressable via `PeerRef::Specific`.
Ops discovered on it live in the connection's Layer 2 overlay and are
invoked through the `CallConnection` handle. The client-side
`ServerCertVerifier` uses CA verification (`WebPkiServerVerifier`) for
such remotes; known peers (hub with `PeerEntry`) use fingerprint
pinning. See [ADR-034](../../decisions/034-outgoing-only-x509-and-three-peer-roles.md).
- **`CallCredentials.remote_identity: None` is load-bearing.** `None`
means "no `PeerEntry` for this remote → use CA verification (X.509)
or fail closed (Ed25519 raw key)" per the ADR-034 §3 verifier rule.
The implementation must not default `remote_identity` to a placeholder
to satisfy the field, and must not treat `None` as "skip verification":
`None` + X.509 is CA verification, `None` + raw key is a hard
failure. `Some(fingerprint)` is the known-peer pin path.
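A minimal sketch of the `Option` semantics described above, under stated assumptions: `CertKind`, `PresentedCert`, `Reject`, and `check_remote` are hypothetical names introduced here for illustration, and `ca_chain_valid` is a stand-in boolean for whatever the real CA verifier reports. The point it illustrates is that no arm accepts unconditionally, so `None` can never mean "skip verification".

```rust
/// Mirrors the `RemoteIdentity` struct shown earlier in this file.
pub struct RemoteIdentity {
    pub fingerprint: String,
}

/// Hypothetical: the kind of certificate the remote presented.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CertKind {
    X509,
    Ed25519RawKey,
}

/// Hypothetical summary of what the TLS layer observed.
pub struct PresentedCert {
    pub kind: CertKind,
    /// Fingerprint of the presented cert or key, e.g. "SHA256:<hex>".
    pub fingerprint: String,
    /// Stand-in for the outcome of CA chain verification (X.509 only).
    pub ca_chain_valid: bool,
}

/// Hypothetical rejection reasons.
#[derive(Debug, PartialEq, Eq)]
pub enum Reject {
    FingerprintMismatch,
    CaVerificationFailed,
    UnknownRawKeyRemote,
}

/// `Some` pins the fingerprint; `None` requires a valid CA chain for
/// X.509 and fails closed for a raw key.
pub fn check_remote(
    expected: Option<&RemoteIdentity>,
    presented: &PresentedCert,
) -> Result<(), Reject> {
    match (expected, presented.kind) {
        // Known peer: only an exact fingerprint match is accepted.
        (Some(id), _) if id.fingerprint == presented.fingerprint => Ok(()),
        (Some(_), _) => Err(Reject::FingerprintMismatch),
        // Unknown X.509 remote: the CA chain must verify.
        (None, CertKind::X509) if presented.ca_chain_valid => Ok(()),
        (None, CertKind::X509) => Err(Reject::CaVerificationFailed),
        // Unknown raw-key remote: there is no CA to fall back to.
        (None, CertKind::Ed25519RawKey) => Err(Reject::UnknownRawKeyRemote),
    }
}
```

In this sketch a placeholder such as `Some(RemoteIdentity { fingerprint: String::new() })` would take the pin path and be rejected as a mismatch, which is why the field should stay `None` when there is no `PeerEntry`.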
## Design Decisions
@@ -609,6 +656,7 @@ Based on the gap analysis and the downstream unblock chain:
| Abort cascade for nested calls | [ADR-016](../../decisions/016-abort-cascade-for-nested-calls.md) | Cross-node abort through `from_call` forwarding handler's `parent_request_id` |
| Operation error schemas | [ADR-023](../../decisions/023-operation-error-schemas.md) | `error_schemas` mirrored by `from_call` from remote op's spec |
| TLS identity redesign | [ADR-027](../../decisions/027-tls-identity-redesign-acme-rawkey-decoupling.md) | RFC 7250 raw key / X.509 cert dimensions of `CallCredentials` |
| Outgoing-only X.509 and three peer roles | [ADR-034](../../decisions/034-outgoing-only-x509-and-three-peer-roles.md) | Public X.509 endpoint is not a `PeerEntry` on the client side (no `PeerId`, not in peer graph); client-side verifier by `PeerEntry` presence (CA vs fingerprint pin); hub = mixed-fingerprint `PeerEntry` |
| HD derivation for encryption keys | [ADR-020](../../decisions/020-hd-derivation-for-encryption-keys.md) | Vault-derived TLS identity material |
| Vault key model | [ADR-026](../../decisions/026-vault-key-model-hd-derivation.md) | Vault-derived TLS identity material |
| Vault local-only dispatch | [ADR-025](../../decisions/025-vault-local-only-dispatch.md) | Vault access at assembly layer only; the credential injection path's first hop |
@@ -662,9 +710,17 @@ See [open-questions.md](../../open-questions.md) for full details.
shapes — the repo/adapter pattern is committed (ADR-033); the in-memory
adapters ship with core; the persistence adapter shapes (SQLite, etc.)
are deferred for exploration. See OQ-36 in open-questions.md.
- **OQ-37** (open): X.509 outgoing-only case — the three auth types and
how X.509 server identity fits the peer model. Not blocking the
ADR-029 migration. See OQ-37 in open-questions.md.
- **OQ-37** (resolved by ADR-034): X.509 outgoing-only case — three
remote roles named (public X.509 endpoint, transport relay, hub).
`PeerEntry` asymmetry is correct: a pure-client connection to a public
X.509 endpoint is **not** in the call-protocol peer graph on the
client side — no `PeerEntry`, no `PeerId`, no `PeerRef::Specific`
routing. Ops discovered via `from_call`/`from_openapi`/`from_mcp`
land in the connection's Layer 2 overlay and are invoked through the
connection handle. The client-side `ServerCertVerifier` is selected
by `PeerEntry` presence: known peer → fingerprint pin; unknown X.509
remote → CA verification (`WebPkiServerVerifier`). See ADR-034 and
OQ-37 in open-questions.md.
## References


@@ -45,7 +45,7 @@ Core library for ALPN-based protocol dispatch. Every handler crate depends on al
| OQ-34 | Persistent peer registry (storage boundary) | resolved by ADR-030+031+033 | Core defines repo traits + in-memory defaults; persistence adapters are separate crates |
| OQ-35 | ~~API key asymmetry~~ | dissolved | `PeerEntry` supports multiple credential paths; `ApiKeyEntry` is for tokens that ARE the identity |
| OQ-36 | Concrete persistence adapter shapes | open (deferred for exploration) | The repo/adapter pattern is committed (ADR-033); in-memory adapters ship with core; persistence adapters deferred |
| OQ-37 | X.509 outgoing-only case | open | Three auth types; how X.509 server identity fits the peer model. Not blocking. |
| OQ-37 | X.509 outgoing-only case | resolved by ADR-034 | Three remote roles (public X.509 endpoint, transport relay, hub); `PeerEntry` asymmetry correct; client-side verifier by `PeerEntry` presence (CA vs fingerprint pin) |
## Key Design Principles


@@ -1,6 +1,6 @@
---
status: draft
last_updated: 2026-06-27
last_updated: 2026-06-28
---
# Authentication
@@ -132,6 +132,64 @@ Bearer tokens have two paths:
The distinction is whether the token needs a stable logical id across rotation (`PeerEntry`) or not (`ApiKeyEntry`). See ADR-030 §"Bearer tokens."
## Three Remote Roles (ADR-034)
The three credential types above describe how a *single* `PeerEntry` can
be authenticated. Separately, there are **three distinct remote roles**
that the architecture must not conflate (see [ADR-034](../../decisions/034-outgoing-only-x509-and-three-peer-roles.md)):
| Role | Identity | alknet peer? | `PeerEntry` on local side? |
|------|----------|--------------|----------------------------|
| **Public X.509 endpoint** | Domain + CA-issued X.509 | No (local node is a client) | No |
| **Transport relay** (iroh's DERP-equivalent) | iroh `NodeId` (Ed25519) | No (infrastructure) | No |
| **Hub / hosting node** | Ed25519 raw key **and/or** X.509 | Yes (full peer) | Yes |
(Transport path and examples per role are in ADR-034; this table is
auth-focused — identity, peer-graph membership, and `PeerEntry`
presence on the local side.)
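The table can be restated as a small type, purely as an illustration. The `RemoteRole` enum and its method names are hypothetical (nothing here is taken from the alknet crates); the return values simply mirror the "alknet peer?" and "`PeerEntry` on local side?" columns.

```rust
/// Hypothetical enum naming the three remote roles from the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RemoteRole {
    /// Remote server the local node is only a client of.
    PublicX509Endpoint,
    /// Transport infrastructure (iroh's DERP-equivalent).
    TransportRelay,
    /// Full alknet peer that may also expose a public domain.
    Hub,
}

impl RemoteRole {
    /// "alknet peer?" column: only the hub is in the peer graph.
    pub fn is_alknet_peer(self) -> bool {
        matches!(self, RemoteRole::Hub)
    }

    /// "`PeerEntry` on local side?" column: only the hub has one.
    pub fn has_local_peer_entry(self) -> bool {
        matches!(self, RemoteRole::Hub)
    }
}
```

Both methods agree for every role in the table, which matches the statement that `PeerEntry` is the model for peers in the peer graph.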
`PeerEntry` (and the `PeerId` it resolves to) is the model for peers in
the call-protocol peer graph (ADR-029) — peers that get a stable logical
identity, are addressable via `PeerRef::Specific`, and whose ops land in
the peer-keyed overlay. A pure-client connection to a public X.509
endpoint (e.g., `api.alk.dev`, a third-party API) is **not** in that
graph on the client side: the local node holds no `PeerEntry` for it,
the connection gets no `PeerId`, and ops discovered via
`from_call`/`from_openapi`/`from_mcp` are invoked through the
connection handle directly (Layer 2 overlay, ADR-024), not through
peer-keyed routing. The asymmetry is deliberate — a public domain's
operator can change hands, so there is no stable logical identity to
attach; the local node trusts the CA today and holds the connection
handle.
The **hub** case is an ordinary `PeerEntry` that happens to expose both
an Ed25519 fingerprint (P2P path) and an X.509 fingerprint
(`SHA256:<hex>`, WebTransport/HTTPS path) — already supported by
`PeerEntry.fingerprints: Vec<String>` (ADR-030). Browsers connecting to
a hub over WebTransport/HTTPS are *not* alknet peers on the hub's side
either — they're served by `alknet-http`, authenticate by bearer token,
and get no `PeerId`.
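A rough sketch of why a mixed-fingerprint entry needs no special handling, assuming a pared-down `PeerEntry` with only `peer_id` and `fingerprints`. The `resolve_peer_id` function is a hypothetical stand-in for the lookup, not the actual `IdentityProvider::resolve_from_fingerprint()` signature, and the fingerprint values are made up.

```rust
/// Pared-down, hypothetical `PeerEntry`: only the fields this sketch needs.
#[derive(Debug, Clone)]
pub struct PeerEntry {
    pub peer_id: String,
    /// Mixed formats are allowed, e.g. "ed25519:<hex>" and "SHA256:<hex>".
    pub fingerprints: Vec<String>,
}

/// Return the stable peer id of the first entry that lists the
/// presented fingerprint, or `None` if no entry does.
pub fn resolve_peer_id<'a>(peers: &'a [PeerEntry], presented: &str) -> Option<&'a str> {
    peers
        .iter()
        .find(|p| p.fingerprints.iter().any(|f| f.as_str() == presented))
        .map(|p| p.peer_id.as_str())
}

/// Illustrative hub entry with one fingerprint per path.
pub fn example_hub() -> PeerEntry {
    PeerEntry {
        peer_id: "hub-1".to_string(),
        fingerprints: vec![
            "ed25519:aa11".to_string(), // P2P path
            "SHA256:bb22".to_string(),  // WebTransport/HTTPS path
        ],
    }
}
```

Either fingerprint resolves to the same `peer_id`, so the hub keeps one logical identity across both paths.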
### Client-side verifier selection (outgoing connections)
The `CallClient` / `from_openapi` / `from_mcp` client-side
`ServerCertVerifier` is selected by **whether the local node has a
`PeerEntry` for the remote**, not by key type alone:
| Local has `PeerEntry` for remote? | Remote cert type | Client verifier |
|----------------------------------|------------------|-----------------|
| No (public X.509 endpoint) | X.509 | `WebPkiServerVerifier` (CA verification) |
| No | Ed25519 raw key | fails closed (no CA to fall back to — raw-key remotes are always known peers) |
| Yes (hub, Ed25519 path) | Ed25519 raw key | fingerprint match (`ed25519:<hex>`) |
| Yes (hub, X.509 path) | X.509 | fingerprint match (`SHA256:<hex>`) |
This is the key-type-aware verifier from OQ-29, with the peer-model
criterion (ADR-034) made explicit. `AcceptAnyServerCertVerifier` is a
security hole for X.509 and is only safe for raw-key fingerprint
extraction on the *server* side; the *client* side must use CA
verification for unknown X.509 remotes and fingerprint pinning for
known peers.
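The selection table above could be sketched as a single function. This is an illustration under assumptions, not the crate's API: `CertKind`, `VerifierChoice`, `FailClosed`, and `select_verifier` are hypothetical names, and filtering pins by the `ed25519:` / `SHA256:` prefix is one possible reading of the last two table rows.

```rust
/// Hypothetical: the kind of certificate the remote presents.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CertKind {
    X509,
    Ed25519RawKey,
}

/// Hypothetical outcome of verifier selection.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum VerifierChoice {
    /// CA verification (`WebPkiServerVerifier` in the table).
    CaVerification,
    /// Pin against these fingerprints; an empty list matches nothing.
    FingerprintPin(Vec<String>),
}

/// Hypothetical marker for the fail-closed row.
#[derive(Debug, PartialEq, Eq)]
pub struct FailClosed;

/// `known_fingerprints` is `Some` when the local node has a `PeerEntry`
/// for the remote, `None` otherwise.
pub fn select_verifier(
    known_fingerprints: Option<&[String]>,
    cert: CertKind,
) -> Result<VerifierChoice, FailClosed> {
    match (known_fingerprints, cert) {
        // Known peer: pin, keeping only fingerprints for this cert path.
        (Some(fps), kind) => {
            let prefix = match kind {
                CertKind::X509 => "SHA256:",
                CertKind::Ed25519RawKey => "ed25519:",
            };
            let pins = fps
                .iter()
                .filter(|f| f.starts_with(prefix))
                .cloned()
                .collect();
            Ok(VerifierChoice::FingerprintPin(pins))
        }
        // Unknown X.509 remote: fall back to the CA.
        (None, CertKind::X509) => Ok(VerifierChoice::CaVerification),
        // Unknown raw-key remote: no CA to fall back to.
        (None, CertKind::Ed25519RawKey) => Err(FailClosed),
    }
}
```

Note that there is no variant corresponding to `AcceptAnyServerCertVerifier`, so the client side has no way to select it in this sketch.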
## AuthToken
Opaque authentication token carried in protocol frames.
@@ -230,7 +288,7 @@ The verifier accepts any presented cert without CA verification because
alknet's identity model is fingerprint-based, not PKI-based — the
`AuthPolicy::peers` set is the trust anchor, not a root CA store. The
cert bytes are extracted at the TLS layer and hashed to a fingerprint
string; the fingerprint is then matched against the configured `PeerEntry.fingerprint`
string; the fingerprint is then matched against the configured `PeerEntry.fingerprints`
fields by `IdentityProvider::resolve_from_fingerprint()`.
## Resolution Flow
@@ -328,12 +386,13 @@ The endpoint's `AlknetEndpoint` also holds `Arc<dyn IdentityProvider>` for endpo
| PeerEntry and Identity.id decoupling | [ADR-030](../../decisions/030-peerentry-and-identity-id-decoupling.md) | `authorized_fingerprints` → `peers: Vec<PeerEntry>`; `Identity.id` = `peer_id` (stable), not fingerprint; key rotation changes fingerprint, not identity |
| CredentialStore repo trait | [ADR-031](../../decisions/031-credentialstore-repo-trait.md) | Second repo trait in core (alongside `IdentityProvider`); `InMemoryCredentialStore` default adapter |
| Storage boundary and repo/adapter pattern | [ADR-033](../../decisions/033-storage-boundary-and-repo-adapter-pattern.md) | Core defines traits + in-memory defaults; persistence adapters are separate crates |
| Three remote roles and outgoing-only X.509 | [ADR-034](../../decisions/034-outgoing-only-x509-and-three-peer-roles.md) | Public X.509 endpoint / transport relay / hub; `PeerEntry` asymmetry (pure-client X.509 is not a peer); client-side verifier by `PeerEntry` presence |
## Open Questions
- **OQ-29** (resolved): `CallClient` TLS client-auth — wire quinn client-auth (present Ed25519 key as raw public key client cert); key-type-aware server cert verification (raw key = fingerprint match, X.509 = CA verification); fingerprint normalization (`ed25519:` across quinn/iroh). See OQ-29 in open-questions.md.
- **OQ-35** (dissolved): the "API key asymmetry" framing was wrong; `PeerEntry` supports multiple credential paths (fingerprints + auth_token_hash), `ApiKeyEntry` is for tokens that ARE the identity. See OQ-35 in open-questions.md.
- **OQ-37** (open): X.509 outgoing-only case — the three auth types and how X.509 server identity fits the peer model. Not blocking the ADR-029 migration. See OQ-37 in open-questions.md.
- **OQ-37** (resolved): X.509 outgoing-only case — three remote roles named (public X.509 endpoint, transport relay, hub); `PeerEntry` asymmetry is correct (pure-client X.509 connections are not in the peer graph on the client side); client-side verifier selection by `PeerEntry` presence (CA verification for unknown X.509, fingerprint pin for known peers). See ADR-034 and OQ-37 in open-questions.md.
## Security Constraints