fix: resolve review #004 findings W1-W4 + close review gate
W1 (call/protocol/abort-cascade-wiring): wire AbortCascade into CallAdapter handle_stream for EVENT_ABORTED. Cascades with AbortPolicy::AbortDependents, aborts root, no descendant frames on wire (ADR-016 Decision 2). Two integration tests added. W2 (core/endpoint-client-fingerprint): extract TLS client cert fingerprint in dispatch_quinn (SHA256:<hex> of leaf cert DER via peer_identity) and dispatch_iroh (ed25519:<hex> of peer NodeId). Fingerprint format documented in auth.md. Server config change (with_no_client_auth → request-but-don't-require) deferred to new follow-up task core/endpoint-request-client-cert. W3 (vault/mnemonic-debug-redaction): replace Mnemonic derive(Debug) with manual redacting impl (phrase: "[REDACTED]"). Seed confirmed no Debug impl. Redaction test added. W4 (core/auth-apikey-resources): Option B — drop entry.resources from spec. External identities (token/fingerprint) grant scopes only; resource-scoped ACLs are composition-internal (ADR-015/022). auth.md corrected + limitation documented. Two tests confirm empty resources. review-post-impl-fixes: all 4 verified, workspace green (326 tests, 0 failures, 0 clippy warnings). Review #004 status → resolved. Graph: 34 tasks, 12 gens.
This commit is contained in:
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: call/protocol/abort-cascade-wiring
|
||||
name: Wire AbortCascade into CallAdapter inbound event path (ADR-016)
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on: [call/protocol/abort-cascade]
|
||||
scope: narrow
|
||||
risk: medium
|
||||
@@ -127,4 +127,14 @@ frame actually reaches `cascade_abort`.
|
||||
> criteria likewise omitted "inbound `call.aborted` triggers cascade."
|
||||
> This task closes that integration gap — all the hard logic already
|
||||
> exists and is tested; this task adds the ~30-line bolt and the one
|
||||
> integration test that would have caught the gap.
|
||||
> integration test that would have caught the gap.
|
||||
|
||||
## Summary
|
||||
|
||||
`handle_stream` now matches `EVENT_ABORTED` → invokes
|
||||
`AbortCascade::cascade_abort` with `AbortPolicy::AbortDependents`, then
|
||||
aborts the root. Non-requested/non-aborted events still log at `debug!`.
|
||||
No descendant `call.aborted` frames sent on the wire. Two integration
|
||||
tests: cascade removes parent + child from `PendingRequestMap`; unknown
|
||||
request_id is a no-op. `cargo test -p alknet-call` (161 tests) and
|
||||
clippy clean.
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: core/auth-apikey-resources
|
||||
name: Reconcile ApiKeyEntry.resources — add field to type and populate in resolve_api_key, or drop from spec
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on: []
|
||||
scope: narrow
|
||||
risk: low
|
||||
@@ -114,4 +114,17 @@ applied to handler-internal composition identities
|
||||
> decision first, then implement. If the decision is A and the
|
||||
> implementation is more than ~30 lines, split a follow-up
|
||||
> `level: implementation` task (`core/auth-apikey-resources-impl`)
|
||||
> depending on this one.
|
||||
> depending on this one.
|
||||
|
||||
## Summary
|
||||
|
||||
Decision: **Option B** — dropped `entry.resources` from the spec.
|
||||
Rationale: `Identity.resources` is populated only by
|
||||
`CompositionAuthority::as_identity` (the composition path, ADR-015/022).
|
||||
All architecture examples use scope-based ACLs for external identities
|
||||
(`fs:read`, `vastai:query`, `llm:call`). Adding a second
|
||||
resource-population path for API keys would muddy the external/internal
|
||||
separation without a demonstrated downstream need. `auth.md:153`
|
||||
corrected to `resources: {}`; documented limitation added. Two tests
|
||||
confirm both `resolve_api_key` and `resolve_identity_from_fingerprint`
|
||||
return empty resources. `cargo test -p alknet-core` and clippy clean.
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: core/endpoint-client-fingerprint
|
||||
name: Extract TLS client certificate fingerprint in endpoint dispatch (ADR-004)
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on: []
|
||||
scope: narrow
|
||||
risk: medium
|
||||
@@ -115,4 +115,17 @@ in the codebase before committing.
|
||||
> client certs. The extraction code is correct either way — it returns
|
||||
> `None` when no cert was presented, which is the current behavior, so
|
||||
> landing extraction first is a safe no-op until the server config
|
||||
> changes.
|
||||
> changes.
|
||||
|
||||
## Summary
|
||||
|
||||
Added `extract_quinn_client_fingerprint` (leaf cert DER → `SHA256:<hex>`
|
||||
via `peer_identity()` downcast) and `extract_iroh_client_fingerprint`
|
||||
(peer `NodeId` → `ed25519:<hex>`). Both dispatch functions now pass the
|
||||
extracted fingerprint to `build_auth_context`. Fingerprint format
|
||||
documented in `auth.md` (table: quinn X.509 vs iroh raw Ed25519).
|
||||
Server config still uses `with_no_client_auth()` — extraction is a safe
|
||||
no-op. Follow-up task `core/endpoint-request-client-cert` created for the
|
||||
server config change. Two unit tests cover fingerprint format +
|
||||
determinism. `cargo test -p alknet-core --all-features` (59 tests) and
|
||||
clippy clean.
|
||||
90
tasks/core/endpoint-request-client-cert.md
Normal file
90
tasks/core/endpoint-request-client-cert.md
Normal file
@@ -0,0 +1,90 @@
|
||||
---
|
||||
id: core/endpoint-request-client-cert
|
||||
name: Switch rustls ServerConfig from with_no_client_auth to request-but-don't-require client certs
|
||||
status: pending
|
||||
depends_on: [core/endpoint-client-fingerprint]
|
||||
scope: narrow
|
||||
risk: medium
|
||||
impact: component
|
||||
level: implementation
|
||||
---
|
||||
|
||||
## Description
|
||||
|
||||
`core/endpoint-client-fingerprint` landed the extraction logic: when a
|
||||
client certificate *is* presented, `dispatch_quinn` / `dispatch_iroh`
|
||||
extract the fingerprint and populate `AuthContext`. However, the server
|
||||
still builds `rustls::ServerConfig` with `with_no_client_auth()` in all
|
||||
three `TlsIdentity` branches (`endpoint.rs:477`, `490`, `501`), so the
|
||||
server never *requests* a client cert. Extraction is a safe no-op until
|
||||
this task changes the server-side TLS config.
|
||||
|
||||
This follow-up switches from `with_no_client_auth()` to a
|
||||
request-but-don't-require mode so that peers presenting a client cert
|
||||
(X.509 or RFC 7250 raw Ed25519 key) flow through the extraction path
|
||||
landed in the predecessor task, while peers without a cert still connect
|
||||
without regression.
|
||||
|
||||
### Design decision: how to request-but-not-require
|
||||
|
||||
rustls does not have a direct `with_optional_client_auth()` builder.
|
||||
The standard approach is:
|
||||
|
||||
1. Build the config with `.with_client_auth(verifier)` where `verifier`
|
||||
is a custom `ServerCertVerifier` that accepts any presentation (returns
|
||||
`Ok(Certified::yes())` when a cert is presented, `Ok(Certified::no())`
|
||||
when none is presented — rustls 0.23's `WebPkiServerVerifier` cannot
|
||||
be used directly for optional auth).
|
||||
2. Alternatively, use `rustls::server::WebPkiServerVerifier` with a
|
||||
`NoClientAuth` fallback — check the exact rustls API available in the
|
||||
pinned version before implementing.
|
||||
|
||||
Read the rustls API docs for the pinned version
|
||||
(`rustls::server::ServerConfig::builder_with_provider`) and confirm the
|
||||
correct verifier construction. The key property: a peer *may* present a
|
||||
cert, and if it does, `peer_identity()` returns it; if it doesn't, the
|
||||
connection still succeeds.
|
||||
|
||||
### iroh path
|
||||
|
||||
iroh's `Endpoint` builder uses its own TLS session internally. For the
|
||||
raw-key path (`TlsIdentity::RawKey`), iroh already advertises
|
||||
`only_raw_public_keys()` via `RawKeyCertResolver` — the server-side half
|
||||
of RFC 7250. The client-side presentation is set by the client's
|
||||
`rustls::ClientConfig`, not the server. So the iroh path may already
|
||||
receive peer identities when the client is an iroh node (the `NodeId` is
|
||||
always in the TLS cert). Verify: does `Connection::remote_node_id()`
|
||||
already work for iroh connections today, or does it require the server to
|
||||
request client certs? If iroh always presents a cert (raw-key mode), no
|
||||
server-side change is needed for the iroh path — only quinn/X.509 needs
|
||||
this task. Confirm before implementing.
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] `build_rustls_server_config` uses request-but-don't-require client auth (not `with_no_client_auth()`) for at least the X.509 path
|
||||
- [ ] Peer presenting a client cert: `peer_identity()` returns the cert chain → fingerprint extraction works end-to-end
|
||||
- [ ] Peer without a client cert: connection still succeeds, `tls_client_fingerprint` is `None` (no regression)
|
||||
- [ ] iroh path: confirm whether a server-side change is needed; if yes, apply it; if no, document why
|
||||
- [ ] Integration test: quinn endpoint with a client that presents a cert → `AuthContext.tls_client_fingerprint` is `Some(SHA256:...)`
|
||||
- [ ] Integration test: quinn endpoint with a client that presents no cert → `AuthContext.tls_client_fingerprint` is `None` and connection succeeds
|
||||
- [ ] `cargo test -p alknet-core --all-features` succeeds
|
||||
- [ ] `cargo clippy -p alknet-core --all-features --all-targets` succeeds with no warnings
|
||||
- [ ] `auth.md` updated: server-config decision documented (request-but-don't-require, not no-client-auth)
|
||||
|
||||
## References
|
||||
|
||||
- tasks/core/endpoint-client-fingerprint.md — predecessor task (landed extraction, deferred this config change)
|
||||
- crates/alknet-core/src/endpoint.rs:477, 490, 501 — the three `with_no_client_auth()` sites
|
||||
- crates/alknet-core/src/endpoint.rs — `extract_quinn_client_fingerprint` / `extract_iroh_client_fingerprint` (already landed, waiting for certs to flow)
|
||||
- docs/architecture/crates/core/auth.md — fingerprint format table and endpoint-level resolution flow
|
||||
- docs/architecture/decisions/004-auth-as-shared-core.md — ADR-004 (hybrid resolution)
|
||||
- docs/architecture/open-questions.md — OQ-12 (TLS identity provisioning)
|
||||
|
||||
## Notes
|
||||
|
||||
> Split from `core/endpoint-client-fingerprint` per the task's own
|
||||
> suggestion: extraction is correct either way (returns `None` when no
|
||||
> cert), so landing it first is a safe no-op. This task is the
|
||||
> behavioral change that makes fingerprints actually flow. The risk is
|
||||
> medium because it alters the TLS handshake for every connection —
|
||||
> ensure the no-cert-peer case has explicit test coverage.
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: review-post-impl-fixes
|
||||
name: Review the four post-implementation sanity-check #004 fixes for spec conformance
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on: [call/protocol/abort-cascade-wiring, core/endpoint-client-fingerprint, vault/mnemonic-debug-redaction, core/auth-apikey-resources]
|
||||
scope: moderate
|
||||
risk: low
|
||||
@@ -93,4 +93,23 @@ check.md`, does not introduce new spec drift, and is adequately tested.
|
||||
> at the end of a fix batch, with `scope: moderate`, `risk: low`,
|
||||
> `impact: phase`. It does not need to re-derive the findings — review
|
||||
> #004 already did that work. It only needs to confirm the fixes land
|
||||
> correctly and the workspace stays green.
|
||||
> correctly and the workspace stays green.
|
||||
|
||||
## Summary
|
||||
|
||||
All four fixes verified against acceptance criteria:
|
||||
- W1: `handle_stream` handles `EVENT_ABORTED`, cascades with
|
||||
`AbortDependents`, no descendant frames on wire, root removed, two
|
||||
integration tests pass.
|
||||
- W2: both dispatch paths extract fingerprints, format documented in
|
||||
`auth.md`, no-cert case returns `None` (no regression), server-config
|
||||
change deferred to `core/endpoint-request-client-cert`.
|
||||
- W3: `Mnemonic` has manual redacting `Debug`, `Seed` has no `Debug`,
|
||||
redaction test passes.
|
||||
- W4: Option B — spec corrected, limitation documented, both resolvers
|
||||
return empty resources, tests pass.
|
||||
|
||||
Workspace green: `cargo build --workspace --all-features` ✓, `cargo test
|
||||
--workspace --all-features` (326 tests, 0 failures) ✓, `cargo clippy
|
||||
--workspace --all-features --all-targets` (0 warnings) ✓. Review #004
|
||||
status updated to `resolved`.
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: vault/mnemonic-debug-redaction
|
||||
name: Replace Mnemonic derive(Debug) with redacting impl to prevent seed phrase leak
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on: []
|
||||
scope: single
|
||||
risk: low
|
||||
@@ -107,4 +107,12 @@ fn test_mnemonic_debug_redacts_phrase() {
|
||||
|
||||
> Small fix, but eliminates a latent root-of-trust leak. The same
|
||||
> pattern (custom redacting `Debug`) is already established in three
|
||||
> other places in this codebase — this task brings `Mnemonic` in line.
|
||||
> other places in this codebase — this task brings `Mnemonic` in line.
|
||||
|
||||
## Summary
|
||||
|
||||
Replaced `#[derive(Debug)]` on `Mnemonic` with a manual redacting impl
|
||||
(`phrase: "[REDACTED]"`). Added `test_mnemonic_debug_redacts_phrase`
|
||||
asserting no phrase word appears in `format!("{:?}", mnemonic)`.
|
||||
Confirmed `Seed` has no `Debug` impl (derives only `Clone, Zeroize`).
|
||||
`cargo test -p alknet-vault` and clippy clean.
|
||||
Reference in New Issue
Block a user