diff --git a/crates/alknet-call/src/client/call_client.rs b/crates/alknet-call/src/client/call_client.rs
index 45fbce6..e3dbd20 100644
--- a/crates/alknet-call/src/client/call_client.rs
+++ b/crates/alknet-call/src/client/call_client.rs
@@ -207,21 +207,21 @@ fn build_quinn_client_config(
     _credentials: &CallCredentials,
     alpn: &[u8],
 ) -> Result {
-    // TODO(OQ-29): connects without client-auth TLS identity. The server-side
-    // `AcceptAnyCertVerifier` (in alknet-core::endpoint) requests but does not
-    // verify client certs, so a client cert is not needed to establish a
-    // connection. However, without a client cert, the server cannot extract a
-    // fingerprint, so `IdentityProvider::resolve_from_fingerprint` returns
-    // None and the peer gets no stable `PeerEntry.peer_id` (ADR-030). This is
-    // load-bearing on ADR-030's peer-identity model — see OQ-29 for the
-    // decision needed before the ADR-029 migration lands.
+    // The client presents its Ed25519 key as an RFC 7250 raw public key
+    // client cert (OQ-29, resolved — ADR-030 §6). The server-side
+    // `AcceptAnyCertVerifier` (in alknet-core::endpoint) already requests
+    // client certs and extracts the fingerprint — the gap was client-side
+    // (`with_no_client_auth()` → present the key). This activates the
+    // `PeerEntry` fingerprint → `peer_id` resolution path.
     //
-    // The `credentials.tls_identity` field is carried through `CallCredentials`
-    // so the assembly layer can populate it; wiring it into the rustls client
-    // config is the missing piece. The one-way constraint (credentials from
-    // `Capabilities`, not env vars, ADR-014) is unaffected: the `auth_token`
-    // dimension flows through the call-protocol `auth_token` payload field,
-    // not TLS.
+    // Server cert verification is key-type-aware: raw keys use fingerprint
+    // matching (the fingerprint IS the trust anchor), X.509 uses CA
+    // verification (`WebPkiServerVerifier`). `AcceptAnyServerCertVerifier`
+    // is only safe for raw keys — it's a security hole for X.509.
+    //
+    // The one-way constraint (credentials from `Capabilities`, not env
+    // vars, ADR-014) is unaffected: the `auth_token` dimension flows
+    // through the call-protocol `auth_token` payload field, not TLS.
     let provider = Arc::new(rustls::crypto::aws_lc_rs::default_provider());
     let mut config = rustls::ClientConfig::builder_with_provider(provider)
         .with_safe_default_protocol_versions()
diff --git a/docs/architecture/README.md b/docs/architecture/README.md
index d9a507f..2a1d140 100644
--- a/docs/architecture/README.md
+++ b/docs/architecture/README.md
@@ -112,20 +112,19 @@ See [open-questions.md](open-questions.md) for the full tracker.
 **Resolved by the storage/repo-pattern ADRs (ADR-030–033):**
 - **OQ-33**: ~~PeerId stability~~ — **resolved by ADR-030** (logical id; source is `Identity.id` = `PeerEntry.peer_id`, stable across key rotation; UUID workaround removed)
 - **OQ-34**: ~~Persistent peer registry~~ — **resolved by ADR-030 + ADR-031 + ADR-033** (storage boundary: core defines repo traits + in-memory defaults; persistence adapters are separate crates)
-- **OQ-35**: API key identity vs peer identity — resolved (recorded by ADR-030; the asymmetry between fingerprint and API-key paths is deliberate)
+- **OQ-35**: ~~API key asymmetry~~ — **dissolved** (the framing was wrong; `PeerEntry` supports multiple credential paths)
 
 **Resolved by the call-completion / ADR-029 work:**
 - **OQ-27**: ~~`from_call` re-import trigger~~ — **resolved** (auto-re-import on connection establishment; `refresh()` is a feature addition)
 - **OQ-28**: ~~`from_call` namespace collision~~ — **resolved** (same-peer collision = error; cross-peer dissolved by ADR-029)
+- **OQ-29**: ~~CallClient TLS client-auth~~ — **resolved** (wire quinn client-auth; key-type-aware server cert verification; fingerprint normalization to `ed25519:` across quinn/iroh)
 - **OQ-30**: ~~`PeerRef::Any` routing policy~~ — **resolved** (insertion-order first-match; richer routing is a feature extension)
 - **OQ-31**: ~~`services/list-peers` re-export semantics~~ — **resolved** (opt-in `services/list-peers`; `services/list` is "own ops only")
 
-**Open (requires decision before ADR-029 migration lands):**
-- **OQ-29**: `CallClient` TLS client-auth — **promoted to high priority, load-bearing on ADR-030**. Not "additive" as previously framed — it's the activation path for the `PeerEntry` fingerprint → `peer_id` resolution. Without it, `PeerCompositeEnv` keys on `None` or the API-key prefix, not the stable `peer_id`. See OQ-29 for the three options (wire client-auth with the migration / ship token-only / extend PeerEntry to cover auth_token).
-
 **Open (feature extensions, not blocking):**
 - **OQ-32**: Multi-hop federation — the one-hop model is the architectural commitment; multi-hop is a feature extension that doesn't break downstream
-- **OQ-36**: Concrete adapter shapes — the repo/adapter pattern is committed (ADR-033); concrete adapter shapes are deferred for exploration. Note: the trait shapes and in-memory adapters must ship with core (per the project's clarification) — the deferral is for the persistence adapters (SQLite, etc.), not the core traits
+- **OQ-36**: Concrete persistence adapter shapes — the repo/adapter pattern is committed (ADR-033); in-memory adapters ship with core; persistence adapters (SQLite, etc.) are deferred for exploration
+- **OQ-37**: X.509 outgoing-only case — the three auth types (Ed25519, X.509, bearer token) and how X.509 server identity fits the peer model. Not blocking the ADR-029 migration; downstream (HTTP crate phase)
 
 **Deferred (not active):**
 - **OQ-09**: WASM target boundaries — design constraint, not deliverable
diff --git a/docs/architecture/crates/call/README.md b/docs/architecture/crates/call/README.md
index baf3456..174573a 100644
--- a/docs/architecture/crates/call/README.md
+++ b/docs/architecture/crates/call/README.md
@@ -57,12 +57,14 @@ Structured RPC over QUIC: operations, request/response, streaming subscriptions,
 | OQ-26 | OperationAdapter error type (AdapterError variants) | **resolved** | `DiscoveryFailed`, `SchemaParse`, `Transport`, `Unauthorized`, `SamePeerCollision`; `#[non_exhaustive]` |
 | OQ-27 | from_call re-import trigger | **resolved** | Auto-re-import on connection establishment; `refresh()` is a feature addition |
 | OQ-28 | from_call namespace collision | **resolved** | Same-peer collision = error; cross-peer dissolved by ADR-029 (separate sub-overlays) |
-| OQ-29 | CallClient TLS client-auth | **open (high, load-bearing on ADR-030)** | NOT "additive" — activates the `PeerEntry` fingerprint → `peer_id` path. Requires decision before ADR-029 migration. |
+| OQ-29 | CallClient TLS client-auth | **resolved** | Wire quinn client-auth; key-type-aware server cert verification; fingerprint normalization |
 | OQ-30 | `PeerRef::Any` routing policy | **resolved** | Insertion-order first-match; richer routing is a feature extension |
 | OQ-31 | `services/list-peers` re-export semantics | **resolved** | Opt-in `services/list-peers`; `services/list` is "own ops only" |
 | OQ-32 | Multi-hop federation | open (feature extension) | One-hop model is the commitment; multi-hop is a feature extension, not a deferral |
 | OQ-33 | PeerId — crypto identity vs stable logical id | **resolved** (ADR-030) | `PeerId = Identity.id = PeerEntry.peer_id` (stable across key rotation) |
 | OQ-34 | Persistent peer registry | **resolved** (ADR-030+033) | Core trait + in-memory default; persistence adapters are separate crates |
+| OQ-35 | ~~API key asymmetry~~ | **dissolved** | `PeerEntry` supports multiple credential paths; `ApiKeyEntry` is for tokens that ARE the identity |
+| OQ-37 | X.509 outgoing-only case | open | Three auth types; how X.509 server identity fits the peer model. Not blocking. |
 
 ## Key Design Principles
 
diff --git a/docs/architecture/crates/call/client-and-adapters.md b/docs/architecture/crates/call/client-and-adapters.md
index ef5da43..6114ee0 100644
--- a/docs/architecture/crates/call/client-and-adapters.md
+++ b/docs/architecture/crates/call/client-and-adapters.md
@@ -631,15 +631,12 @@ See [open-questions.md](../../open-questions.md) for full details.
 - **OQ-28** (resolved): `from_call` namespace collision — same-peer
   collision = error; cross-peer dissolved by ADR-029 (separate
   sub-overlays). `namespace_prefix` is optional local-naming sugar.
-- **OQ-29** (open, **high priority, load-bearing on ADR-030**): `CallClient`
-  TLS client-auth — NOT "additive" as previously framed. ADR-030's
-  `PeerEntry` fingerprint → `peer_id` resolution requires the client to
-  present a TLS client cert; `with_no_client_auth()` means no fingerprint,
-  no `PeerEntry` resolution, no stable `peer_id`. The `auth_token` path
-  resolves to `Identity.id = ApiKeyEntry.prefix`, not `peer_id`. See OQ-29
-  for the three options (wire client-auth with the migration / ship
-  token-only / extend PeerEntry to cover auth_token). Requires a decision
-  before the ADR-029 migration lands.
+- **OQ-29** (resolved): `CallClient` TLS client-auth — wire quinn
+  client-auth (present Ed25519 key as raw public key client cert);
+  key-type-aware server cert verification (raw key = fingerprint match,
+  X.509 = CA verification); fingerprint normalization (`ed25519:` across
+  quinn/iroh). The iroh path already works; the gap was quinn-only.
+  See OQ-29 in open-questions.md.
 - **OQ-30** (resolved): `PeerRef::Any` routing policy — insertion-order
   first-match. A richer `RoutingPolicy` is a feature extension.
 - **OQ-31** (resolved): `services/list-peers` — opt-in; `services/list`
@@ -657,14 +654,17 @@ See [open-questions.md](../../open-questions.md) for full details.
   the storage boundary is `core trait + in-memory default` (config-backed
   `ConfigIdentityProvider` now; persistence adapters additive in separate
   crates). See OQ-34 in open-questions.md.
-- **OQ-35** (recorded by ADR-030): API key identity vs peer identity — the
-  asymmetry between the fingerprint path (gets `PeerEntry` id-decoupling)
-  and the API-key path (doesn't) is deliberate. See OQ-35 in
-  open-questions.md.
+- **OQ-35** (dissolved): the "API key asymmetry" framing was wrong;
+  `PeerEntry` supports multiple credential paths (fingerprints +
+  auth_token_hash), `ApiKeyEntry` is for tokens that ARE the identity.
+  See OQ-35 in open-questions.md.
 - **OQ-36** (open, deferred for exploration): Concrete persistence adapter
   shapes — the repo/adapter pattern is committed (ADR-033); the in-memory
   adapters ship with core; the persistence adapter shapes (SQLite, etc.)
   are deferred for exploration. See OQ-36 in open-questions.md.
+- **OQ-37** (open): X.509 outgoing-only case — the three auth types and
+  how X.509 server identity fits the peer model. Not blocking the
+  ADR-029 migration. See OQ-37 in open-questions.md.
 
 ## References
 
diff --git a/docs/architecture/crates/core/README.md b/docs/architecture/crates/core/README.md
index 92414a2..0148dd9 100644
--- a/docs/architecture/crates/core/README.md
+++ b/docs/architecture/crates/core/README.md
@@ -43,8 +43,9 @@ Core library for ALPN-based protocol dispatch. Every handler crate depends on al
 | OQ-11 | Handler-level auth resolution observability | resolved | Handlers store resolved identity on Connection; two identity scopes (connection-level for observability, per-request for ACL) |
 | OQ-33 | PeerId — logical id vs crypto identity | resolved by ADR-030 | `PeerId` = `Identity.id` = `PeerEntry.peer_id` (stable across key rotation) |
 | OQ-34 | Persistent peer registry (storage boundary) | resolved by ADR-030+031+033 | Core defines repo traits + in-memory defaults; persistence adapters are separate crates |
-| OQ-35 | API key identity vs peer identity | resolved (recorded by ADR-030) | The asymmetry between fingerprint and API-key paths is deliberate |
-| OQ-36 | Concrete adapter shapes | open (deferred for exploration) | The repo/adapter pattern is committed (ADR-033); concrete adapter shapes are not |
+| OQ-35 | ~~API key asymmetry~~ | dissolved | `PeerEntry` supports multiple credential paths; `ApiKeyEntry` is for tokens that ARE the identity |
+| OQ-36 | Concrete persistence adapter shapes | open (deferred for exploration) | The repo/adapter pattern is committed (ADR-033); in-memory adapters ship with core; persistence adapters deferred |
+| OQ-37 | X.509 outgoing-only case | open | Three auth types; how X.509 server identity fits the peer model. Not blocking. |
 
 ## Key Design Principles
 
diff --git a/docs/architecture/crates/core/auth.md b/docs/architecture/crates/core/auth.md
index 4b9191e..6c1287b 100644
--- a/docs/architecture/crates/core/auth.md
+++ b/docs/architecture/crates/core/auth.md
@@ -110,22 +110,27 @@ pub struct Identity {
 ```
 
 This is the same structure as the reference implementation (`alknet-main/crates/alknet-core/src/auth/identity.rs`), minus the russh dependency. The `id` field is ALPN-agnostic:
-- SSH key / TLS cert auth (fingerprint path): the `PeerEntry.peer_id` (ADR-030) — a stable logical name like `"worker-a"`, **not** the fingerprint. The fingerprint is the *credential*; the `peer_id` is the *identity*. Decoupling them means key rotation changes the credential but not the identity, so ACL entries and routing references stay stable.
-- API key auth: `"alk_test"` (key prefix) — the prefix IS the identity; rotation = new identity (see "API keys vs peer entries" below).
+- Ed25519 raw key / TLS cert auth (fingerprint path): the `PeerEntry.peer_id` (ADR-030) — a stable logical name like `"worker-a"`, **not** the fingerprint. The fingerprint is the *credential*; the `peer_id` is the *identity*. Decoupling them means key rotation changes the credential but not the identity, so ACL entries and routing references stay stable.
+- Bearer token auth (auth_token path): if the token is one credential path for a `PeerEntry`, `Identity.id = peer_id` (stable). If the token IS the identity (`ApiKeyEntry`), `Identity.id = prefix` (changes with the key). See "Credential Types" below.
 - Composition path: the `CompositionAuthority` label (ADR-022) — e.g., `"agent-chat"`.
 
-### API keys vs peer entries
+### Credential Types
 
-The fingerprint and API-key auth paths have different identity semantics, by design (ADR-030):
+The alknet auth model has three credential types. A `PeerEntry` can use any combination — all resolve to the same `peer_id`:
 
-| Axis | Fingerprint (PeerEntry) | API key (ApiKeyEntry) |
-|------|-------------------------|------------------------|
-| Identity source | TLS handshake / SSH key | Bearer token in protocol frame |
-| Key rotation | Same logical node, new material | New identity (revocation = new key) |
-| `Identity.id` | `peer_id` (stable across rotation) | `prefix` (changes with the key) |
-| `Identity.resources` | Populated from `PeerEntry.resources` | Empty (resources are composition-only) |
+| Credential type | `PeerEntry` field | Fingerprint format | Trust model |
+|-----------------|-------------------|--------------------|----|
+| Ed25519 raw key (RFC 7250) | `fingerprints[i]` | `ed25519:` | Fingerprint IS the trust anchor (no CA) |
+| X.509 cert | `fingerprints[i]` | `SHA256:` | CA verification (WebPKI) |
+| Bearer token (peer credential) | `auth_token_hash` | SHA-256 hash of token | Token hash match |
 
-An API key's prefix IS the identity — rotating the key means a new prefix and a new identity, by design (revocation is the rotation mechanism for API keys). Decoupling the API key identity from the prefix would solve a problem API keys don't have: they're bearer tokens, not node identities. The fingerprint path gets the `PeerEntry` treatment because node identity must survive key rotation; the API-key path doesn't because bearer-token identity IS the token. The asymmetry is deliberate, not an oversight — see ADR-030 §"API keys".
+Ed25519 fingerprints are normalized to `ed25519:` across quinn and iroh (ADR-030 §6) — the same key has the same fingerprint regardless of transport.
+
+Bearer tokens have two paths:
+- `PeerEntry.auth_token_hash` — the token is one credential path among several for a stable logical peer. Rotation = update the hash, `peer_id` stays stable.
+- `ApiKeyEntry` (separate) — the token IS the identity. Rotation = new identity (new prefix). No stable logical id.
+
+The distinction is whether the token needs a stable logical id across rotation (`PeerEntry`) or not (`ApiKeyEntry`). See ADR-030 §"Bearer tokens."
 
 ## AuthToken
 
@@ -169,43 +174,48 @@ pub struct ConfigIdentityProvider {
 The "Config" prefix indicates that identities are resolved from configuration (as opposed to a database or external service). This reads from `ArcSwap`, which is hot-reloadable — not from `StaticConfig`. An alternative name would be `DynamicConfigIdentityProvider` to make this clearer, but `ConfigIdentityProvider` is consistent with the reference implementation and the naming is unlikely to cause confusion in practice.
 
 How it resolves:
-- **Fingerprint**: Look up in `DynamicConfig::auth.peers` for the matching `PeerEntry` (by `fingerprint`). If found and `enabled`, return `Identity { id: peer.peer_id, scopes: peer.scopes, resources: peer.resources }`. The `Identity.id` is the stable `peer_id`, **not** the fingerprint — key rotation changes the fingerprint but not the `peer_id`, so ACL entries and routing references stay stable (ADR-030).
-- **Token**: Parse as UTF-8. If it starts with `alk_`, look up in `DynamicConfig::auth.api_keys` by prefix match + SHA-256 hash. If found and not expired, return `Identity { id: prefix, scopes: entry.scopes, resources: {} }`. The `Identity.id` is the key prefix — API key rotation = new identity (see "API keys vs peer entries" above).
+- **Fingerprint**: Look up in `DynamicConfig::auth.peers` for the matching `PeerEntry` (by any entry in `fingerprints`). If found and `enabled`, return `Identity { id: peer.peer_id, scopes: peer.scopes, resources: peer.resources }`. The `Identity.id` is the stable `peer_id`, **not** the fingerprint — key rotation changes the fingerprint but not the `peer_id`, so ACL entries and routing references stay stable (ADR-030).
+- **Token**: Hash the token and look up in `DynamicConfig::auth.peers` for a matching `auth_token_hash`. If found, return `Identity { id: peer.peer_id, ... }` — the same `peer_id` as the fingerprint path. If no `PeerEntry` matches, fall through to `ApiKeyEntry` lookup by prefix match + SHA-256 hash. If found and not expired, return `Identity { id: prefix, scopes: entry.scopes, resources: {} }` — the token IS the identity, `Identity.id` is the key prefix.
 
-See [ADR-030](../../decisions/030-peerentry-and-identity-id-decoupling.md) for the `PeerEntry` model and the id-fingerprint decoupling rationale.
+See [ADR-030](../../decisions/030-peerentry-and-identity-id-decoupling.md) for the `PeerEntry` model, the multi-credential resolution, and the fingerprint normalization rationale.
 
 ### Resource-scoped ACLs
 
-`Identity.resources` is populated on three paths:
+`Identity.resources` is populated on two paths:
 
 | Path | Source of `resources` | Use case |
 |------|----------------------|----------|
-| Fingerprint resolution (`ConfigIdentityProvider`) | `PeerEntry.resources` (ADR-030) | External fingerprint-authenticated callers with per-peer resource binding |
-| API key resolution (`ConfigIdentityProvider`) | Empty (by design) | API keys grant scopes only; resource-scoped access is composition-only |
+| `PeerEntry` resolution (fingerprint or auth_token) | `PeerEntry.resources` (ADR-030) | External authenticated callers with per-peer resource binding |
 | Composition (`CompositionAuthority::as_identity`, ADR-015/022) | `CompositionAuthority.resources` | Internal composition calls with declared resource binding |
 
-An `OperationSpec` that declares `resource_type`/`resource_action` will return `FORBIDDEN` when the caller authenticated via API key (because `Identity.resources` is empty), but succeeds when the caller authenticated via fingerprint with matching `PeerEntry.resources`, or via composition with matching `CompositionAuthority.resources`. The API-key limitation is deliberate (see "API keys vs peer entries" above); the fingerprint path's resource binding is the ADR-030 change that lifts the pre-ADR-030 limitation.
+`ApiKeyEntry`-resolved identities have empty `resources` — API keys grant scopes only. An `OperationSpec` that declares `resource_type`/`resource_action` returns `FORBIDDEN` when the caller authenticated via `ApiKeyEntry`, but succeeds when the caller authenticated via `PeerEntry` (fingerprint or auth_token) with matching `resources`.
 
 Changes to `DynamicConfig` via `ConfigReloadHandle` are reflected immediately — `ConfigIdentityProvider` reads from `ArcSwap` on every call.
 
 ### Fingerprint string format
 
-`tls_client_fingerprint` and `PeerEntry.fingerprint` use a prefixed-hex
-format. The prefix identifies the key type; the body is the hex-encoded
-hash or raw key bytes. `AuthPolicy::resolve_identity_from_fingerprint`
-scans `peers` for a matching `fingerprint` field — no normalization — so
+`tls_client_fingerprint` and `PeerEntry.fingerprints` entries use a
+prefixed-hex format. The prefix identifies the key type; the body is the
+hex-encoded key material. `AuthPolicy::resolve_identity_from_fingerprint`
+scans `peers` for a matching `fingerprints` entry — no normalization — so
 the extractor and the operator config must use the same format.
 
 | Transport | Source | Format |
 |-----------|--------|--------|
+| iroh (direct or relay) | peer `NodeId` (Ed25519 public key) | `ed25519:` |
+| quinn (RFC 7250 raw key) | SPKI cert → extract raw Ed25519 pub key | `ed25519:` (normalized — ADR-030 §6) |
 | quinn (X.509) | leaf client cert DER | `SHA256:` |
-| iroh (raw Ed25519) | peer `NodeId` | `ed25519:` |
+
+Ed25519 raw keys produce `ed25519:` regardless of transport (quinn or
+iroh) — the same key has the same fingerprint. X.509 certs produce
+`SHA256:` — the DER hash, since X.509 doesn't have a "raw
+public key" form.
 
 When no client cert is presented (the current default — server uses
 `with_no_client_auth()`), the fingerprint is `None` and identity remains
-unresolved at the endpoint layer. A follow-up task will switch the server
-config to request-but-not-require client certs so fingerprints flow for
-peers that present them.
+unresolved at the endpoint layer. The `CallClient` TLS client-auth wiring
+(OQ-29, resolved) presents the client's Ed25519 key as a raw public key
+client cert so the server can extract the fingerprint.
 
 ### Server-side client cert request
 
@@ -321,7 +331,9 @@ The endpoint's `AlknetEndpoint` also holds `Arc` for endpo
 
 ## Open Questions
 
-- **OQ-35**: API key identity vs peer identity — the asymmetry between the fingerprint path (gets `PeerEntry` id-decoupling) and the API-key path (doesn't) is deliberate. See ADR-030 §"API keys" and "API keys vs peer entries" above.
+- **OQ-29** (resolved): `CallClient` TLS client-auth — wire quinn client-auth (present Ed25519 key as raw public key client cert); key-type-aware server cert verification (raw key = fingerprint match, X.509 = CA verification); fingerprint normalization (`ed25519:` across quinn/iroh). See OQ-29 in open-questions.md.
+- **OQ-35** (dissolved): the "API key asymmetry" framing was wrong; `PeerEntry` supports multiple credential paths (fingerprints + auth_token_hash), `ApiKeyEntry` is for tokens that ARE the identity. See OQ-35 in open-questions.md.
+- **OQ-37** (open): X.509 outgoing-only case — the three auth types and how X.509 server identity fits the peer model. Not blocking the ADR-029 migration. See OQ-37 in open-questions.md.
 
 ## Security Constraints
 
diff --git a/docs/architecture/crates/core/config.md b/docs/architecture/crates/core/config.md
index 67733f6..19d7b03 100644
--- a/docs/architecture/crates/core/config.md
+++ b/docs/architecture/crates/core/config.md
@@ -195,39 +195,48 @@ fingerprint → `PeerEntry` → `Identity { id: peer_id, ... }`, so
 ```rust
 pub struct PeerEntry {
     /// Stable logical peer id ("worker-a", "alice"). Does NOT change on
-    /// key rotation. This becomes Identity.id on resolution.
+    /// key rotation. This becomes Identity.id on resolution, regardless of
+    /// which credential path resolved the identity.
     pub peer_id: String,
 
-    /// Current cryptographic material — the fingerprint the endpoint
-    /// extracts from the TLS handshake (SHA256:... for X.509, ed25519:...
-    /// for RFC 7250 raw keys). Changes on key rotation.
-    pub fingerprint: String,
+    /// TLS fingerprints for this peer — one or more. A peer may have
+    /// multiple keys (e.g., an Ed25519 raw key for P2P and an X.509 cert
+    /// for domain-facing). Resolution matches against any entry.
+    /// Format: "ed25519:" for RFC 7250 raw keys
+    /// (normalized across quinn and iroh — ADR-030 §6), "SHA256:" for
+    /// X.509 certs (DER hash). Changes on key rotation.
+    pub fingerprints: Vec<String>,
+
+    /// Optional: bearer-token authentication for this peer. A peer that
+    /// also authenticates via auth_token (e.g., HTTP clients that can't
+    /// do TLS client-auth) stores the SHA-256 hash of the token here.
+    /// Resolution via resolve_from_token matches this field and returns
+    /// the same Identity { id: peer_id, ... } as the fingerprint path.
+    pub auth_token_hash: Option,
 
     /// Authorization scopes granted to this peer. Resolved into
     /// Identity.scopes.
     pub scopes: Vec,
 
     /// Named resource lists granted to this peer. Resolved into
-    /// Identity.resources. Populated from config (ADR-030 lifts the
-    /// pre-ADR-030 limitation that fingerprint-resolved identities had
-    /// empty resources).
+    /// Identity.resources.
     pub resources: HashMap>,
 
     /// Human-readable display name for logs / UIs. Optional.
     pub display_name: Option,
 
-    /// Whether this peer is authorized at all. false = the fingerprint
-    /// is recognized but the peer is disabled (token-revoked-equivalent
-    /// for fingerprints). Resolution returns None.
+    /// Whether this peer is authorized at all. false = recognized but
+    /// disabled (revoked). Resolution returns None.
     pub enabled: bool,
 }
 ```
 
 See [ADR-030](../../decisions/030-peerentry-and-identity-id-decoupling.md)
-for the `PeerEntry` model, the id-fingerprint decoupling rationale, and
-the key-rotation story (vault rotates locally; the remote side updates
-the `PeerEntry.fingerprint` field; the `peer_id` and all ACL / routing
-references stay stable).
+for the `PeerEntry` model, the multi-credential resolution path, the
+fingerprint normalization rationale, and the key-rotation story (vault
+rotates locally; the remote side updates the `PeerEntry.fingerprints` or
+`auth_token_hash` field; the `peer_id` and all ACL / routing references
+stay stable).
 
 Certificate authority entries for cert-based auth are omitted from
 `AuthPolicy` until alknet-ssh is implemented, to avoid referencing an
diff --git a/docs/architecture/decisions/030-peerentry-and-identity-id-decoupling.md b/docs/architecture/decisions/030-peerentry-and-identity-id-decoupling.md
index abbf29c..c66ff5b 100644
--- a/docs/architecture/decisions/030-peerentry-and-identity-id-decoupling.md
+++ b/docs/architecture/decisions/030-peerentry-and-identity-id-decoupling.md
@@ -77,85 +77,118 @@ two-way door.
 ```rust
 pub struct PeerEntry {
     /// Stable logical peer id ("worker-a", "alice"). Does NOT change on
-    /// key rotation. This becomes Identity.id on resolution.
+    /// key rotation. This becomes Identity.id on resolution, regardless of
+    /// which credential path resolved the identity.
     pub peer_id: String,
 
-    /// Current cryptographic material — the fingerprint the endpoint
-    /// extracts from the TLS handshake (SHA256:... for X.509, ed25519:...
-    /// for RFC 7250 raw keys). Changes on key rotation.
-    pub fingerprint: String,
+    /// TLS fingerprints for this peer — one or more. A peer may have
+    /// multiple keys (e.g., an Ed25519 raw key for P2P and an X.509 cert
+    /// for domain-facing). Resolution matches against any entry.
+    /// Format: "ed25519:" for RFC 7250 raw keys
+    /// (normalized across quinn and iroh — see §6), "SHA256:" for
+    /// X.509 certs (DER hash). Changes on key rotation.
+    pub fingerprints: Vec<String>,
+
+    /// Optional: bearer-token authentication for this peer. A peer that
+    /// also authenticates via auth_token (e.g., HTTP clients that can't
+    /// do TLS client-auth) stores the SHA-256 hash of the token here.
+    /// Resolution via resolve_from_token matches this field and returns
+    /// the same Identity { id: peer_id, ... } as the fingerprint path.
+    pub auth_token_hash: Option,
 
     /// Authorization scopes granted to this peer. Resolved into
     /// Identity.scopes.
     pub scopes: Vec,
 
     /// Named resource lists granted to this peer. Resolved into
-    /// Identity.resources. Populated from config (not just composition, as
-    /// the pre-ADR-030 limitation in auth.md §"Resource-scoped ACLs and
-    /// external identities" required).
+    /// Identity.resources.
     pub resources: HashMap>,
 
     /// Human-readable display name for logs / UIs. Optional.
     pub display_name: Option,
 
-    /// Whether this peer is authorized at all. false = the fingerprint
-    /// is recognized but the peer is disabled (token-revoked-equivalent
-    /// for fingerprints). Resolution returns None.
+    /// Whether this peer is authorized at all. false = recognized but
+    /// disabled (revoked). Resolution returns None.
     pub enabled: bool,
 }
 
 pub struct AuthPolicy {
     /// Replaces authorized_fingerprints: HashSet. Each entry maps
-    /// a stable logical peer_id to its current fingerprint + scopes +
-    /// resources. The list is keyed by peer_id; resolution looks up by
-    /// fingerprint.
+    /// a stable logical peer_id to its credential paths (fingerprints,
+    /// optional auth_token_hash) + scopes + resources. The list is keyed
+    /// by peer_id; resolution looks up by fingerprint OR auth_token.
     pub peers: Vec<PeerEntry>,
 
-    /// API keys — unchanged by this ADR (see "API keys" below).
+    /// API keys for bearer-token auth where the token IS the identity
+    /// (rotation = new identity). Peers that need a stable logical id
+    /// across credential rotation use PeerEntry.auth_token_hash instead.
+    /// See "Bearer tokens" below.
     pub api_keys: Vec<ApiKeyEntry>,
 }
 ```
 
-### 2. `Identity.id` becomes `PeerEntry.peer_id` on fingerprint resolution
+### 2. `Identity.id` becomes `PeerEntry.peer_id` on resolution (any credential path)
 
 `ConfigIdentityProvider::resolve_from_fingerprint` resolves fingerprint →
-matching `PeerEntry` → `Identity { id: peer_entry.peer_id, scopes:
-peer_entry.scopes, resources: peer_entry.resources }`. The `Identity.id` is
-the stable `peer_id`, not the fingerprint.
+matching `PeerEntry` (by any entry in `fingerprints`) → `Identity { id:
+peer_entry.peer_id, ... }`. `ConfigIdentityProvider::resolve_from_token`
+resolves token → matching `PeerEntry` (by `auth_token_hash`) → the same
+`Identity { id: peer_entry.peer_id, ... }`. Both paths produce the same
+`Identity` — the `peer_id` is the stable logical id regardless of how the
+peer authenticated.
 
 ```rust
 impl AuthPolicy {
     pub fn resolve_identity_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
         self.peers.iter()
-            .find(|p| p.enabled && p.fingerprint == fingerprint)
+            .find(|p| p.enabled && p.fingerprints.iter().any(|f| f == fingerprint))
             .map(|p| Identity {
                 id: p.peer_id.clone(),
                 scopes: p.scopes.clone(),
                 resources: p.resources.clone(),
             })
     }
+
+    pub fn resolve_identity_from_token(&self, token: &str) -> Option<Identity> {
+        let token_hash = sha256(token);
+        self.peers.iter()
+            .find(|p| p.enabled && p.auth_token_hash.as_deref() == Some(&token_hash))
+            .map(|p| Identity {
+                id: p.peer_id.clone(),
+                scopes: p.scopes.clone(),
+                resources: p.resources.clone(),
+            })
+            .or_else(|| self.resolve_api_key(token)) // fall through to ApiKeyEntry
+    }
 }
 ```
 
-This removes the pre-ADR-030 limitation in `auth.md`
-§"Resource-scoped ACLs and external identities" — fingerprint-resolved
-identities now carry `resources` from the `PeerEntry`, not just from the
-composition path. The composition path (`CompositionAuthority::as_identity`,
-ADR-015/022) still produces its own `Identity` for internal calls; the
-external-auth path now also carries resources when configured.
+If the token doesn't match any `PeerEntry.auth_token_hash`, resolution falls
+through to `resolve_api_key` (the `ApiKeyEntry` path, where `Identity.id =
+prefix`). This preserves the existing API-key path for bearer tokens that
+ARE the identity, while adding the `PeerEntry` token path for tokens that
+are one credential path among several for a stable logical peer.
 
-### 3. Key rotation is a single `PeerEntry.fingerprint` update
+This removes the pre-ADR-030 limitation in `auth.md`
+§"Resource-scoped ACLs and external identities" — resolved identities now
+carry `resources` from the `PeerEntry`, not just from the composition path.
+
+### 3. Key rotation is a `PeerEntry` field update (no `peer_id` change)
 
 Rotating a peer's TLS key:
 - The vault derives the new key locally (ADR-020/021).
-- The remote side's config updates the `PeerEntry.fingerprint` field for
+- The remote side's config updates the `PeerEntry.fingerprints` entry for
   that `peer_id`. The `peer_id`, `scopes`, `resources`, ACL entries, and
   any `PeerRef::Specific(peer_id)` references stay stable.
 - A config reload (`ConfigReloadHandle::reload`) makes the change live.
 
+Rotating a peer's auth token:
+- Update `PeerEntry.auth_token_hash` for that `peer_id`. The `peer_id`
+  and everything that references it stays stable.
+
 No ACL update, no routing reference invalidation, no peer "disappears."
 The vault's local rotation + a remote-side config edit is the full key
-rotation story across nodes.
+rotation story across nodes, for any credential path.
 
 ### 4. `PeerId` source changes from UUID to `Identity.id` from `PeerEntry`
 
@@ -192,27 +225,77 @@ caller-id-is-the-connection case, e.g., anonymous dial-in).
 The UUID fallback is removed. A connection with no resolved identity has
 no `PeerId`, not a random one.
 
-## API keys
+### 6. Fingerprint format normalization: `ed25519:` for raw keys
 
-API keys (`ApiKeyEntry`) are **not** given the `PeerEntry` treatment. The
-two identity sources have different semantics:
+Ed25519 raw keys (RFC 7250) produce different fingerprint formats depending
+on the transport:
 
-| Axis | Fingerprint (PeerEntry) | API key (ApiKeyEntry) |
-|------|-------------------------|------------------------|
-| Identity source | TLS handshake / SSH key | Bearer token in protocol frame |
-| Key rotation | Same logical node, new material | New identity (revocation = new key) |
-| `Identity.id` | `peer_id` (stable across rotation) | `prefix` (changes with the key) |
-| Resource binding | `PeerEntry.resources` (per-peer) | Empty (Option B, auth.md) — resources are composition-only |
+- **iroh** (direct or relay): `ed25519:` —
+  extracted from `connection.remote_node_id()`, which returns the NodeId
+  (the raw Ed25519 public key). Already implemented.
+- **quinn RawKey**: currently `SHA256:` — because + `fingerprint_from_cert_der` hashes the SPKI DER bytes. The DER encoding + of the SPKI is not the raw 32-byte public key; it's an ASN.1 wrapper. + So the same Ed25519 key produces `ed25519:abc...` on iroh and + `SHA256:def...` on quinn — two different fingerprints for the same key. -An API key's prefix IS the identity — rotating the key means a new prefix -and a new identity, by design (revocation is the rotation mechanism for -API keys). Decoupling the API key identity from the prefix would be solving -a different problem (persistent logical identity across key rotation) that -API keys don't have: they're bearer tokens, not node identities. +This is normalized: the quinn path extracts the Ed25519 public key from the +cert DER (the `RawKeyCertResolver` already has the raw key bytes via +`Ed25519SecretKey::public()`) and formats it as `ed25519:`, matching +iroh. A peer that connects via quinn direct and via iroh has the same +fingerprint in `PeerEntry.fingerprints` — one entry, both transports. -`ApiKeyEntry` stays as-is. The asymmetry is documented here and in -`auth.md` so the difference between the two auth paths is explicit, not an -oversight. +The normalization is in `extract_quinn_client_fingerprint`: when the +presented cert is an RFC 7250 raw public key (SPKI with Ed25519 algorithm +identifier), extract the raw 32-byte public key and format as +`ed25519:`. When the cert is X.509, keep the `SHA256:` +format (X.509 certs don't have a "raw public key" form — the DER hash is +the fingerprint). + +This also simplifies the coming WebTransport relay work: a WebTransport +relay acts as a proxy, and the proxied connection's Ed25519 identity +should be the same `ed25519:` whether the client connected directly +or through the relay. Normalizing on the iroh pattern means the relay +doesn't need a separate fingerprint format. + +## Bearer tokens + +There are three credential types in the alknet auth model: + +1. 
**Ed25519 raw key** (RFC 7250) — the most common. Same key type as SSH + keys, native to iroh's `NodeId`. Fingerprint format: `ed25519:`. + Used for direct quinn, iroh direct, and iroh relay connections. The + fingerprint IS the trust anchor (no CA needed). + +2. **X.509 cert** — for domain-facing endpoints (`api.alk.dev`, relays, + ACME/Let's Encrypt). Fingerprint format: `SHA256:`. Requires + CA verification on the client side. The outgoing-only case (a client + connects to a public X.509 endpoint) is tracked as OQ-37. + +3. **Bearer token** (auth_token) — for HTTP clients that can't do TLS + client-auth (browsers, curl), or as a secondary credential path. Carried + in the call-protocol `auth_token` payload field. + +A `PeerEntry` can have any combination of these: `fingerprints: Vec` +for one or more TLS keys (Ed25519 and/or X.509), `auth_token_hash: +Option` for an optional bearer-token path. All resolve to the same +`peer_id`. A peer that authenticates via Ed25519 today and via auth_token +tomorrow gets the same `PeerId` — the logical identity is stable across +credential paths. + +`ApiKeyEntry` stays as a separate path for bearer tokens where the token IS +the identity (rotation = new identity, no stable logical id needed). When a +bearer token is one credential path among several for a stable peer, it +goes in `PeerEntry.auth_token_hash`. The distinction is not "peer bearer vs +auth bearer" — it's whether the token needs a stable logical id across +rotation (`PeerEntry`) or not (`ApiKeyEntry`). 
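The split between the two bearer-token paths can be illustrated with a small, self-contained sketch. It is not the crate's implementation: the struct shapes are trimmed to the fields that matter here, `toy_hash` is a non-cryptographic stand-in for the SHA-256 hashing the ADR assumes, `key_hash` on `ApiKeyEntry` is a hypothetical field name, and the fingerprint and token values are placeholders.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Debug, PartialEq)]
struct Identity {
    id: String,
}

// Trimmed, hypothetical shape: the real entry also carries scopes and resources.
struct PeerEntry {
    peer_id: String,
    enabled: bool,
    fingerprints: Vec<String>,
    auth_token_hash: Option<String>,
}

// `key_hash` is an assumed field name for illustration only.
struct ApiKeyEntry {
    prefix: String,
    key_hash: String,
}

struct AuthPolicy {
    peers: Vec<PeerEntry>,
    api_keys: Vec<ApiKeyEntry>,
}

// Stand-in for SHA-256 so the sketch needs no external crates. NOT cryptographic.
fn toy_hash(token: &str) -> String {
    let mut hasher = DefaultHasher::new();
    token.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

impl AuthPolicy {
    // TLS path: any entry in `fingerprints` resolves to the stable `peer_id`.
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.peers
            .iter()
            .find(|p| p.enabled && p.fingerprints.iter().any(|f| f == fingerprint))
            .map(|p| Identity { id: p.peer_id.clone() })
    }

    // Token path: try `PeerEntry.auth_token_hash` first, then fall through
    // to the API-key path, where the prefix is the identity.
    fn resolve_from_token(&self, token: &str) -> Option<Identity> {
        let token_hash = toy_hash(token);
        self.peers
            .iter()
            .find(|p| p.enabled && p.auth_token_hash.as_deref() == Some(token_hash.as_str()))
            .map(|p| Identity { id: p.peer_id.clone() })
            .or_else(|| {
                self.api_keys
                    .iter()
                    .find(|k| k.key_hash == token_hash)
                    .map(|k| Identity { id: k.prefix.clone() })
            })
    }
}

// One peer with two credential paths, plus one standalone API key.
fn example_policy() -> AuthPolicy {
    AuthPolicy {
        peers: vec![PeerEntry {
            peer_id: "worker-a".to_string(),
            enabled: true,
            fingerprints: vec!["ed25519:aaaa".to_string()],
            auth_token_hash: Some(toy_hash("peer-token")),
        }],
        api_keys: vec![ApiKeyEntry {
            prefix: "ak_123".to_string(),
            key_hash: toy_hash("standalone-token"),
        }],
    }
}
```

With `example_policy()`, the fingerprint path and the `peer-token` path both resolve to `worker-a`, while `standalone-token` resolves to the `ak_123` prefix. Rotating the peer's token only changes `auth_token_hash`; rotating the API key produces a new prefix and therefore a new identity.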
+ +| Credential type | `PeerEntry` field | `Identity.id` | Rotation | +|-----------------|-------------------|---------------|----------| +| Ed25519 raw key | `fingerprints[i]` (`ed25519:...`) | `peer_id` (stable) | Update `fingerprints` entry | +| X.509 cert | `fingerprints[i]` (`SHA256:...`) | `peer_id` (stable) | Update `fingerprints` entry | +| Bearer token (peer) | `auth_token_hash` | `peer_id` (stable) | Update `auth_token_hash` | +| Bearer token (identity) | `ApiKeyEntry` (separate) | `prefix` (changes with key) | New `ApiKeyEntry` | ## What this does NOT change @@ -237,9 +320,8 @@ oversight. **Positive:** - Key rotation no longer breaks ACL entries or routing references on the - remote side. The vault's local rotation story (ADR-021) is now the - complete story — `rotate` locally, edit the peer entry's fingerprint - remotely, reload. + remote side — for any credential path (TLS key or auth token). The + vault's local rotation story (ADR-021) is now the complete story. - `PeerRef::Specific` survives reconnects. An in-flight routing reference to "worker-a" keeps resolving after worker-a's TLS key rotates and after worker-a reconnects. @@ -250,33 +332,36 @@ oversight. future `alknet-peer-store-sqlite` adapter that persists `PeerEntry` records is additive, implementing the same `IdentityProvider` trait against a `peers` table. See ADR-033. -- Fingerprint-resolved identities now carry `resources` (the pre-ADR-030 - limitation is lifted) — `AccessControl::check` against `resource_type`/ - `resource_action` works for external fingerprint-authenticated callers - when configured. +- Resolved identities now carry `resources` (the pre-ADR-030 limitation is + lifted) — `AccessControl::check` against `resource_type`/ + `resource_action` works for external authenticated callers when + configured, regardless of credential path. 
+- A peer can authenticate via Ed25519 today and via auth_token tomorrow, + getting the same `PeerId` — the logical identity is stable across + credential paths. +- Fingerprint normalization (`ed25519:` for raw keys across quinn and + iroh) means the same key has the same fingerprint regardless of transport. + This also simplifies the coming WebTransport relay work. **Negative:** - `AuthPolicy.authorized_fingerprints: HashSet` is replaced with `AuthPolicy.peers: Vec`. This is a breaking config change — existing config files with `authorized_fingerprints` migrate to `peers` entries. The migration is mechanical (each fingerprint becomes a - `PeerEntry { peer_id: , fingerprint: , scopes: - ["relay:connect"], ... }`), and operators must choose a `peer_id` per - peer, but it is a config break. -- `Identity.id` for fingerprint-resolved identities changes from the - fingerprint to the `peer_id`. Code that logs or compares `Identity.id` - on the fingerprint path and assumed it was the fingerprint string will - see the `peer_id` instead. This is the correct behavior (logs should - show the logical name, not the rotating crypto material), but it's a - behavior change in log output. -- The pre-ADR-030 `auth.md` "Resource-scoped ACLs and external identities" - limitation note is removed — fingerprint-resolved identities now populate - `resources`. Code that relied on fingerprint identities always having - empty `resources` (an unintended invariant) will see populated resources - when configured. + `PeerEntry { peer_id: , fingerprints: vec![], ... }`), + and operators must choose a `peer_id` per peer, but it is a config break. +- `Identity.id` for resolved identities changes from the fingerprint to + the `peer_id`. Code that logs or compares `Identity.id` and assumed it + was the fingerprint string will see the `peer_id` instead. This is the + correct behavior (logs should show the logical name, not the rotating + crypto material), but it's a behavior change in log output. 
+- The quinn fingerprint extraction changes from `SHA256:` to + `ed25519:` for raw-key certs. Existing configs with + `SHA256:` fingerprints for Ed25519 keys migrate to `ed25519:` format. + X.509 fingerprints stay as `SHA256:`. - ADR-029 Assumption 1 is superseded on the `PeerId` source dimension: the one-way door (`PeerId` is logical, not crypto) is preserved, but the - v1 UUID source is replaced by `Identity.id` from `PeerEntry`. The + UUID source is replaced by `Identity.id` from `PeerEntry`. The Assumption's framing of "no-storage workaround" is no longer accurate — the storage boundary is now explicitly `config + in-memory adapter` (this ADR + ADR-033), with the SQLite adapter additive. @@ -295,9 +380,13 @@ oversight. Config validation enforces uniqueness; duplicate `peer_id` values in `AuthPolicy.peers` are a config error. -3. **API keys stay as-is.** The `ApiKeyEntry` model is correct for bearer- - token identity where rotation = new identity. This ADR does not add a - `PeerEntry`-equivalent for API keys. See "API keys" above. +3. **Bearer tokens have two paths.** `PeerEntry.auth_token_hash` is for + tokens that are one credential path among several for a stable logical + peer (the token rotates, the `peer_id` stays). `ApiKeyEntry` is for + tokens that ARE the identity (rotation = new identity, no stable + logical id needed). See "Bearer tokens" above. The distinction is not + "peer bearer vs auth bearer" — it's whether the token needs a stable + logical id across rotation. 4. **The `peers` list resolution is O(peers) per fingerprint lookup.** The expected peer count per node is small (10s–100s); a linear scan with a @@ -333,8 +422,13 @@ oversight. 
- OQ-34: Persistent Peer Registry (resolved by this ADR + ADR-033 — the storage boundary is `config + in-memory adapter` now, SQLite adapter additive) -- OQ-35: API Key Identity vs Peer Identity (recorded by this ADR — the - asymmetry is deliberate, see "API keys" above) +- ~~OQ-35: API Key Identity vs Peer Identity~~ (dissolved — the + "asymmetry" framing was wrong; `PeerEntry` supports multiple credential + paths, and `ApiKeyEntry` is for tokens that ARE the identity) +- OQ-29: CallClient TLS Client-Auth (resolved by this ADR's §6 fingerprint + normalization + the client-auth wiring decision recorded in OQ-29) +- OQ-37: X.509 outgoing-only case (the three auth types and how X.509 + server identity fits — see OQ-37 in open-questions.md) - `docs/research/alknet-storage-strategy/findings.md` §4 (the `PeerEntry` model and resolution path) - `docs/architecture/crates/core/auth.md` (the spec amended by this ADR) diff --git a/docs/architecture/open-questions.md b/docs/architecture/open-questions.md index b613332..25ec24e 100644 --- a/docs/architecture/open-questions.md +++ b/docs/architecture/open-questions.md @@ -414,73 +414,52 @@ is a feature extension, not an unmade architecture decision. ### OQ-29: CallClient TLS Client-Auth and Remote-Identity Verification - **Origin**: [client-and-adapters.md](crates/call/client-and-adapters.md), ADR-017 §7 -- **Status**: **open — load-bearing on ADR-030** (not "additive" as previously framed) +- **Status**: **resolved** (2026-06-27 by ADR-030 §6 + this decision) - **Door type**: One-way (identity model interaction), two-way (mechanism) -- **Priority**: **high** (was medium; promoted — this is the activation path - for ADR-030's `PeerEntry` fingerprint → `peer_id` resolution) -- **Resolution**: **Previously framed as "additive — two-way-door - remainder." That framing is incorrect.** ADR-030 makes `PeerId = - Identity.id = PeerEntry.peer_id` on the fingerprint path. 
But the - fingerprint path requires the client to present a TLS client cert, and - the current `CallClient::connect()` uses `with_no_client_auth()` — no - client cert is presented, no fingerprint is extracted by the server's - `AcceptAnyCertVerifier`, and `IdentityProvider::resolve_from_fingerprint` - returns `None`. The peer gets no `PeerId` from the fingerprint path. +- **Priority**: ~~high~~ → resolved +- **Resolution**: **Three things are decided:** - The `auth_token` path (`resolve_from_token`) still works, but it - resolves to `Identity.id = ApiKeyEntry.prefix` (the API-key identity - path), **not** to `PeerEntry.peer_id`. So with TLS client-auth unwired, - a calling peer's `PeerId` is either `None` (no client cert) or an - API-key prefix (if an `auth_token` is used) — neither is the stable - `PeerEntry.peer_id` that ADR-030 commits. The PeerEntry path is dormant - until client-auth is wired. + 1. **Wire quinn client-auth.** The client presents its Ed25519 key as an + RFC 7250 raw public key client cert (the client-side equivalent of + the server's `RawKeyCertResolver`). The server's + `AcceptAnyCertVerifier` already requests client certs and extracts + the fingerprint — the gap was entirely on the client side + (`with_no_client_auth()` → present the key). This activates the + `PeerEntry` fingerprint → `peer_id` resolution path on quinn + connections. - This is not a "two-way-door remainder" — it's the activation path for - ADR-030's primary use case (stable `peer_id` across key rotation for - peer-keyed overlays). The decision to make is: + 2. **Key-type-aware server cert verification.** The client's + `ServerCertVerifier` depends on the remote's identity type: + - **Ed25519 raw key** (the common case): accept the cert, extract the + fingerprint, match against `PeerEntry.fingerprints`. The fingerprint + IS the trust anchor — no CA needed. (Same model as iroh.) 
+ - **X.509** (domain-facing endpoints, ACME): verify against a CA + (rustls's `WebPkiServerVerifier` with the platform root store or a + configured CA). `AcceptAnyServerCertVerifier` is a security hole for + X.509 — it's only safe for raw keys. + - The verifier choice is driven by `CallCredentials.remote_identity`, + which carries the expected key type. - - **(a)** Wire TLS client-auth as part of the ADR-029 migration, so the - fingerprint → `PeerEntry` → `peer_id` path is live from day one. The - server's `AcceptAnyCertVerifier` already requests (but doesn't verify) - client certs; the client's `with_no_client_auth()` is the gap. Wiring - the local node's `RawKey`/`X509` identity as a rustls client-auth cert - is the missing piece. Remote-identity verification (plugging - `credentials.remote_identity` into a real `ServerCertVerifier`) is - genuinely additive — the server-side fingerprint extraction is what - matters for `PeerId`, not the client-side verification of the server. + 3. **Fingerprint normalization** (ADR-030 §6): the quinn path extracts + the raw Ed25519 public key from the SPKI cert and formats it as + `ed25519:`, matching iroh. The same key has the same fingerprint + regardless of transport. X.509 fingerprints stay as `SHA256:`. - - **(b)** Ship the ADR-029 migration with `auth_token`-only peer identity - and treat TLS client-auth as a follow-up. This means `PeerCompositeEnv` - keys on `Identity.id = ApiKeyEntry.prefix` (the token prefix) until - client-auth is wired, then switches to `PeerEntry.peer_id` when it is. - The switch is a behavior change for any deployment that built on the - token-prefix identity — the `PeerId` changes from the prefix to the - `peer_id`. This is the "compounds into a mess" path. + **The iroh path already works** — iroh uses RFC 7250 raw keys, both + sides automatically exchange Ed25519 public keys during the TLS + handshake, and `extract_iroh_client_fingerprint` already gets the + `NodeId`. 
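The normalization in item 3 can be sketched as follows, under stated assumptions. An Ed25519 SubjectPublicKeyInfo has a fixed 12-byte DER prefix (RFC 8410) followed by the 32 raw key bytes, so the raw key can be recovered by stripping that prefix. The function names below are hypothetical, not the ones in `alknet-core`, and the lowercase-hex encoding after `ed25519:` is an assumption, since the text here does not pin the encoding down. Anything that is not an Ed25519 SPKI returns `None`, leaving the `SHA256:` path to the caller.

```rust
use std::convert::TryFrom;

/// DER prefix of an Ed25519 SubjectPublicKeyInfo (RFC 8410):
/// SEQUENCE(42) { SEQUENCE(5) { OID 1.3.101.112 }, BIT STRING(33) with 0 unused bits }.
const ED25519_SPKI_PREFIX: [u8; 12] = [
    0x30, 0x2a, 0x30, 0x05, 0x06, 0x03, 0x2b, 0x65, 0x70, 0x03, 0x21, 0x00,
];

/// Returns the raw 32-byte public key if `spki_der` is exactly an Ed25519 SPKI.
/// Hypothetical helper name, for illustration only.
fn ed25519_raw_key_from_spki(spki_der: &[u8]) -> Option<[u8; 32]> {
    let key = spki_der.strip_prefix(&ED25519_SPKI_PREFIX)?;
    // Fails (returns None) unless exactly 32 bytes remain.
    <[u8; 32]>::try_from(key).ok()
}

/// Formats a transport-independent fingerprint for raw keys.
/// ASSUMPTION: lowercase hex after the `ed25519:` prefix.
fn normalized_raw_key_fingerprint(spki_der: &[u8]) -> Option<String> {
    let key = ed25519_raw_key_from_spki(spki_der)?;
    let hex: String = key.iter().map(|b| format!("{:02x}", b)).collect();
    Some(format!("ed25519:{}", hex))
}

/// Builds an Ed25519 SPKI from raw key bytes; handy for exercising the helpers.
fn spki_from_raw_key(raw: &[u8; 32]) -> Vec<u8> {
    let mut der = ED25519_SPKI_PREFIX.to_vec();
    der.extend_from_slice(raw);
    der
}
```

Because the result depends only on the 32 key bytes and not on the DER wrapper, a key wrapped in an SPKI on the quinn side yields the same string as one formatted directly from the raw key, provided both sides agree on the encoding.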
No client-auth wiring needed for iroh (direct or relay). The + gap was quinn-only. - - **(c)** Extend `PeerEntry` to also cover `auth_token`-based peer - identity — a peer entry keyed by token prefix (or a `PeerEntry.token` - field) instead of (or alongside) fingerprint. This unifies the two - identity paths under `PeerEntry`, so the `PeerId` is stable regardless - of which credential path the peer used. This is a design change to - ADR-030, not just an implementation choice. + **What's genuinely additive** (not blocking the ADR-029 migration): + remote-identity verification (the client verifying the server's + fingerprint against an expected value) is additive — the server-side + fingerprint extraction is what matters for `PeerId`, not the client-side + verification. The verifier for raw keys can start as "accept any, extract + fingerprint" and add fingerprint-pinning later. - **The X.509 / raw-key wrinkle:** the vast majority of end users will use - Ed25519 raw keys (RFC 7250) — the same key type as SSH keys, native to - iroh's `NodeId` model. The fingerprint format for raw keys is - `ed25519:`. For X.509 (public-facing endpoints like - `api.alk.dev`, relays), the fingerprint is `SHA256:` — a - different format, a different key type, but the same `PeerEntry.fingerprint` - field. The `IdentityProvider::resolve_from_fingerprint` path is - format-agnostic (it's a string match against `PeerEntry.fingerprint`), - so both key types work once client-auth is wired. The wrinkle is on the - client side: presenting an Ed25519 raw key as a TLS client cert uses a - different rustls path than presenting an X.509 cert. Both are supported - by rustls; the `CallCredentials.tls_identity` field already carries the - `TlsIdentity` enum (RawKey / X509). The wiring is per-variant. - - **Not decided yet.** This OQ is promoted to high priority and requires a - decision before the ADR-029 migration lands. The previous "additive, - two-way-door remainder" framing is struck. 
+ See ADR-030 §6 for the fingerprint normalization details. - **Cross-references**: ADR-014, ADR-017, ADR-027, ADR-029, ADR-030, [client-and-adapters.md](crates/call/client-and-adapters.md), [endpoint.md](crates/core/endpoint.md), [auth.md](crates/core/auth.md) @@ -615,28 +594,30 @@ is a feature extension, not an unmade architecture decision. ## Theme: Storage and Adapters -### OQ-35: API Key Identity vs Peer Identity +### OQ-35: ~~API Key Identity vs Peer Identity~~ (Dissolved) - **Origin**: ADR-030 §"API keys" (the asymmetry between the two auth paths) -- **Status**: resolved (recorded by ADR-030, not a blocker) -- **Door type**: One-way (the asymmetry is deliberate, not an oversight) -- **Priority**: medium -- **Resolution**: The fingerprint auth path gets the `PeerEntry` - id-decoupling treatment (`Identity.id = peer_id`, stable across key - rotation); the API-key auth path does not (`Identity.id = prefix`, - changes with the key). This is deliberate: +- **Status**: **dissolved** (2026-06-27 — the framing was wrong) +- **Door type**: ~~One-way~~ +- **Priority**: ~~medium~~ +- **Resolution**: **Dissolved.** The original framing ("the fingerprint + path gets `PeerEntry` id-decoupling, the API-key path doesn't — the + asymmetry is deliberate") was based on a false distinction between "peer + bearer" and "auth bearer" tokens. The correct framing is the three + credential types (Ed25519, X.509, bearer token) and whether the token + needs a stable logical id across rotation: - - Node identity (fingerprint path) must survive key rotation — the - same logical node rotates its TLS key, and every ACL entry / routing - reference to it should stay stable. `PeerEntry` provides this. - - Bearer-token identity (API-key path) IS the token — rotating the key - means a new prefix and a new identity, by design (revocation is the - rotation mechanism for API keys). Decoupling the API key identity - from the prefix would solve a problem API keys don't have. 
+ - `PeerEntry` supports multiple credential paths: `fingerprints: Vec` + (Ed25519 and/or X.509) + `auth_token_hash: Option` (bearer + token). All resolve to the same `peer_id`. + - `ApiKeyEntry` is for bearer tokens that ARE the identity (rotation = + new identity, no stable logical id needed). - The asymmetry is documented in `auth.md` ("API keys vs peer entries") - and in ADR-030 §"API keys" so it's explicit, not an oversight. See - [auth.md](crates/core/auth.md) for the table comparing the two paths. + A bearer token that is one credential path among several for a stable + peer goes in `PeerEntry.auth_token_hash`. A bearer token that IS the + identity stays in `ApiKeyEntry`. The distinction is whether the token + needs a stable logical id across rotation, not "peer bearer vs auth + bearer." See ADR-030 §"Bearer tokens." - **Cross-references**: ADR-030, [auth.md](crates/core/auth.md), [config.md](crates/core/config.md) @@ -679,4 +660,62 @@ is a feature extension, not an unmade architecture decision. pattern is committed, the in-memory adapters ship with core, and the persistence adapter shapes are the open exploration. 
- **Cross-references**: ADR-030, ADR-031, ADR-033, OQ-34, - [auth.md](crates/core/auth.md), [config.md](crates/core/config.md) \ No newline at end of file + [auth.md](crates/core/auth.md), [config.md](crates/core/config.md) + +## Theme: TLS Identity + +### OQ-37: X.509 Outgoing-Only Case (Three Auth Types) + +- **Origin**: ADR-030 §"Bearer tokens" (the three credential types), the + discussion that X.509 is fundamentally different from Ed25519 +- **Status**: open (lingering — the X.509 server-identity case needs design) +- **Door type**: One-way (how X.509 server identity integrates with the + peer model) +- **Priority**: medium +- **Resolution**: The three credential types are: Ed25519 raw key (the + common case, normalized to `ed25519:` across quinn/iroh), X.509 + (domain-facing endpoints, ACME, `SHA256:`), and bearer token + (`PeerEntry.auth_token_hash` or `ApiKeyEntry`). + + Ed25519 and bearer token are resolved (ADR-030 + OQ-29). The X.509 case + that remains open is **outgoing-only**: a client connects to a public + X.509 endpoint (e.g., `api.alk.dev`). The client must verify the server + cert against a CA (rustls's `WebPkiServerVerifier`) — the + `AcceptAnyServerCertVerifier` is a security hole for X.509. The server + may or may not require a client cert (most public X.509 endpoints + won't — browsers can't easily do TLS client-auth). + + What's resolved: + - The `PeerEntry.fingerprints` field accepts X.509 fingerprints + (`SHA256:`) alongside Ed25519 fingerprints. + - The client-side verifier is key-type-aware (OQ-29): raw keys use + fingerprint-matching, X.509 uses CA verification. + + What's open: + - How does the outgoing X.509 case interact with `PeerEntry`? If a + client connects to `api.alk.dev` (X.509, no client-auth), the client + doesn't present a cert, so the server has no fingerprint to resolve. + The client authenticates via `auth_token` (the bearer-token path). 
+ The server's `PeerEntry` for this client uses `auth_token_hash`, not + `fingerprints`. This works — but the server's `PeerEntry` might not + have a fingerprint at all for an HTTP-only client. + - Conversely, if the server requires X.509 client-auth (mutual TLS), + the client presents its X.509 cert, the server extracts the + `SHA256:` fingerprint, and `PeerEntry.fingerprints` matches it. + This works too. + - The open question is whether there are cases where X.509 server + identity needs to be part of the `PeerEntry` model (the server's + identity, not the client's) — e.g., for the client to know "I'm + connected to `api.alk.dev`, which is peer-id `api-server`." Currently + `PeerEntry` is about the *remote* peer's credentials, as seen by the + *local* node. For an outgoing connection, the local node is the + client, and `PeerEntry` describes the server. This may need a + design pass to make sure the model is symmetric. + + Not blocking the ADR-029 migration — the Ed25519 path is the primary + use case and it's resolved. The X.509 outgoing-only case is a real + question but it's downstream (the HTTP crate phase, when + `from_openapi`/`from_mcp` handlers connect to X.509 endpoints). +- **Cross-references**: ADR-027, ADR-029, ADR-030, OQ-29, + [client-and-adapters.md](crates/call/client-and-adapters.md), + [endpoint.md](crates/core/endpoint.md), [auth.md](crates/core/auth.md) \ No newline at end of file