docs(architecture): resolve review #003 — type/API surface completeness

Review #003 found 11 critical, 14 warning, and 6 suggestion findings
after reviews #001 (governance/security) and #002 (cross-document
consistency/two-way-door audit) were resolved. The theme: types and
APIs that were *referenced* but never *defined*, and stale ADR sketches
that didn't match the now-updated spec docs.

Critical fixes (11):

- C1: DerivedKey #[derive(Deserialize)] contradicted the custom
  Deserialize that rejects "[REDACTED]" — dropped the derive, added
  explicit manual Serialize/Deserialize impls (protocol.md).
- C2: encrypt prose said "derived at PATHS::ENCRYPTION" but the
  signature takes key_version — updated to encryption_path_for_version
  (service.md).
- C3: derive_encryption_key returned DerivedKey, derive_encryption_key
  _for_version returned EncryptionKey (same cache) — unified on
  DerivedKey, defined CachedKey (service.md).
- C4: tokio vs std::sync::RwLock contradiction — specified
  std::sync::RwLock, dropped tokio from vault deps (ADR-018, ADR-025,
  service.md).
- C5: Missing drift rows in vault README — added #9 (key_version
  ignored) and #10 (rotate not implemented).
- C6: ADR-022 build_root_context and invoke() sketches omitted
  abort_policy (9 fields vs 10) — added the field to both sketches.
- C7: Capabilities type referenced 20+ times, never defined — added
  struct definition to core-types.md with Clone+Send+Sync, Zeroize,
  sealed builder API, immutability guard.
- C8: SessionOverlaySource on CallAdapter but never defined, crate
  violation (alknet-call can't depend on alknet-agent) — defined the
  trait in alknet-call (call-protocol.md), matching the IdentityProvider
  pattern.
- C9: CompositeOperationEnv dispatch fall-through was "a two-way door"
  — added contains() to OperationEnv trait, made the composite probe
  before dispatching, eliminating the sentinel ambiguity.
- C10: No API for Layer 2 (connection overlay) registration, CallConnection
  undefined — defined CallConnection struct + register_imported() API
  (call-protocol.md).
- C11: with_local signature diverged between two examples (4 args vs 5)
  — added capabilities as the 5th arg, made both examples consistent.

Warning fixes (14):

- W1: invoke_with_policy restructured as required method, invoke gets a
  default impl delegating to it — eliminates duplication across impls.
- W2: CachedKey defined (service.md).
- W3: EncryptionKey constructor/glue specified, added to re-export list.
- W4: Secp256k1ExtendedPrivKey defined, derive_ethereum_key glue shown.
- W5: encryption_path_for_version rejects version < 2 (v1 is TS PBKDF2).
- W6: Wire payload schemas for all event types + ResponseEnvelope →
  EventEnvelope conversion table (call-protocol.md).
- W7: Timeout section — deadline on OperationContext, composed calls
  inherit parent's deadline, CallAdapter::with_timeout().
- W8: Request ID generation spec — UUID v4 for composed calls, wire ID
  vs internal ID relationship for abort cascade.
- W9: unlock_new already-unlocked behavior specified (returns
  AlreadyUnlocked).
- W10: KeyType Serialize/Deserialize justification corrected (stale
  irpc reference removed).
- W11: OperationProvenance and CompositionAuthority defined inline in
  operation-registry.md (were only in ADR-022).
- W12: encrypt/decrypt free functions marked pub(crate), relationship
  to VaultServiceHandle methods stated.
- W13: rotate signature removed from encryption.md (it's a
  VaultServiceHandle method, not a free function).
- W14: CallAdapter::new() + with_session_source() + with_timeout()
  constructors shown.

Suggestion fixes (6): Seed: Clone note, VaultServiceInner invariant,
ExtendedPrivKey accessor signatures, CURRENT_KEY_VERSION location, ADR-018
stale actor text, derivation helpers re-export note.
This commit is contained in:
2026-06-23 10:56:05 +00:00
parent cb98f42cd4
commit 2e34590522
14 changed files with 1129 additions and 120 deletions

View File

@@ -0,0 +1,512 @@
---
status: resolved
last_updated: 2026-06-23
reviewed_documents:
- overview.md
- README.md
- open-questions.md
- crates/core/README.md
- crates/core/core-types.md
- crates/core/endpoint.md
- crates/core/auth.md
- crates/core/config.md
- crates/call/README.md
- crates/call/call-protocol.md
- crates/call/operation-registry.md
- crates/vault/README.md
- crates/vault/mnemonic-derivation.md
- crates/vault/encryption.md
- crates/vault/service.md
- crates/vault/protocol.md
- decisions/001 through 026 (all 26 ADRs)
reviewer: architect
---
# Architecture Gap Review #003
## Purpose
This is the third pre-implementation sanity check. Reviews #001 and #002
caught the registration-bundle tangle (C1C4), the missing error schemas (C5),
the governance reconciliation (Proposed ADRs treated as binding, Accepted ADRs
contradicting later refinements), the abort-policy wiring gap, and the vault
deferral audit. All of those are resolved.
This review goes deeper on a different axis: **type and API surface
completeness**. The prior reviews focused on cross-document *decisions*
(contradictions between ADRs, stale framing). This review asks: could two
implementers, working from these specs alone, produce compatible code? The
findings are mostly places where a type or API is *referenced* but never
*defined*, or where a code sketch in an ADR went stale after the spec docs
were updated.
The two themes:
1. **Stale ADR sketches**: ADR-022's `build_root_context` and `invoke()`
sketches were written before ADR-024 split `env` into `scoped_env` + `env`
and before ADR-016's abort policy was added to `OperationContext`. The spec
docs (call-protocol.md, operation-registry.md) were updated; the ADR
sketches were not. An implementer copying the ADR verbatim won't compile.
2. **Referenced-but-undefined types**: `Capabilities`, `SessionOverlaySource`,
`CachedKey`, `CallConnection`, `Secp256k1ExtendedPrivKey` — all referenced
in specs, none defined. These are exactly the boundary types where two
implementers must agree, and the spec leaves them as exercises.
## Severity Definitions
| Severity | Meaning |
|----------|---------|
| **Critical** | Must resolve before implementation — would cause divergent implementations or security issues |
| **Warning** | Should resolve — missing specifications that could cause confusion or rework |
| **Suggestion** | Consider — improvements for clarity or completeness |
---
## Critical Findings
### C1. `DerivedKey`: `#[derive(Deserialize)]` contradicts "custom Deserialize rejects `[REDACTED]`"
**Documents**: crates/vault/protocol.md (L29, L68),
decisions/025-vault-local-only-dispatch.md (L137)
**Problem**: `protocol.md:29` shows `#[derive(Zeroize, Deserialize)]` on
`DerivedKey`. But `protocol.md:68` and ADR-025:137 both require a **custom**
`Deserialize` that explicitly rejects `private_key == "[REDACTED]"` with an
error. These are mutually exclusive: `#[derive(Deserialize)]` generates a
default impl, and you cannot also write a manual `impl Deserialize` (conflicting
impls). With the derived impl, a redacted string would fail only incidentally
(serde type mismatch: string vs sequence), producing a generic error — not the
explicit redaction-rejection the spec describes.
**Resolution**: Dropped `Deserialize` from the derive list in protocol.md and
showed a separate manual `impl<'de> Deserialize<'de> for DerivedKey` that
checks for `"[REDACTED]"` and returns a clear error.
---
### C2. `encrypt` prose says "derived at `PATHS::ENCRYPTION`" but signature takes `key_version`
**Documents**: crates/vault/service.md (L175),
decisions/021-key-rotation-via-version-indexed-paths.md (L115122)
**Problem**: `service.md:172` signature: `encrypt(&self, plaintext, key_version)`.
`service.md:175` prose: "Encrypt plaintext using the encryption key derived at
`PATHS::ENCRYPTION`." `PATHS::ENCRYPTION` is the v2 path only. ADR-021 makes
`encrypt` derive at `encryption_path_for_version(key_version)`. An implementer
following the prose stamps `key_version: 3` but uses the v2 key — a rotation bug
producing mislabeled blobs that decrypt fine locally but are cryptographically
mislabeled.
**Resolution**: Updated service.md:175 to "Encrypt plaintext using the
encryption key derived at `encryption_path_for_version(key_version)` (ADR-021);
the same `key_version` is stamped on the resulting `EncryptedData`."
---
### C3. `derive_encryption_key(path)` returns `DerivedKey` but `derive_encryption_key_for_version(version)` returns `EncryptionKey`
**Documents**: crates/vault/service.md (L133, L143),
decisions/021-... (L104122)
**Problem**: Same cache, different return types. No conversion between them is
specified, and `CachedKey` (the cache value type) is never defined. An
implementer cannot reconcile the cache.
**Resolution**: Made `derive_encryption_key_for_version` return `DerivedKey`
(consistent with `derive_encryption_key`), and `encrypt`/`decrypt` extract the
`EncryptionKey` from the cached `DerivedKey` via `EncryptionKey::from_derived_bytes`.
Defined `CachedKey` as a `#[derive(Zeroize, ZeroizeOnDrop)]` struct wrapping a
`DerivedKey` with cache metadata.
---
### C4. `tokio` vs `std::sync::RwLock` contradiction
**Documents**: decisions/018 (L69, L155156), decisions/025 (L166171),
crates/vault/service.md (L2733)
**Problem**: ADR-018:155 says "`std::sync::RwLock` works"; ADR-018:69 and
ADR-025:166 list `tokio` as a retained dep "for RwLock sync primitives";
service.md shows `RwLock` without specifying which. All vault methods are sync
(no `async`), implying `std::sync::RwLock` — which means tokio isn't needed,
contradicting both ADR dependency lists. ADR-018:155 still references the
removed actor's `tokio::sync::mpsc`.
**Resolution**: Specified `std::sync::RwLock` in service.md. Dropped `tokio`
from retained-deps lists in ADR-018 and ADR-025. Removed the stale actor/mpsc
sentence in ADR-018.
---
### C5. Missing drift rows in vault README drift table
**Documents**: crates/vault/README.md (L121130),
decisions/021-... (L126128), crates/vault/encryption.md (L185188)
**Problem**: The README drift table (the stated "single source of truth") omits
two drift items: (a) `decrypt`/`encrypt` ignore `key_version` and always derive
at `PATHS::ENCRYPTION`; (b) `rotate` is not implemented. An implementer using
only the table would miss these.
**Resolution**: Added drift rows #9 (key_version ignored) and #10 (rotate not
implemented) to the README table.
---
### C6. ADR-022's `build_root_context` and `invoke()` omit `abort_policy`
**Documents**: decisions/022 (L344358, L382394),
crates/call/call-protocol.md (L296318),
crates/call/operation-registry.md (L309336)
**Problem**: ADR-022's sketches have 9 fields; the spec docs have 10 (including
`abort_policy`). An implementer copying ADR-022 verbatim won't compile against
the `OperationContext` struct.
**Resolution**: Added `abort_policy: AbortPolicy::default(),` to ADR-022's
`build_root_context` and `abort_policy: parent.abort_policy.clone(),` to its
`invoke()` sketch. Restated the `env` type as `Arc<dyn OperationEnv + Send +
Sync>` in both sketches.
---
### C7. `Capabilities` type referenced widely but defined nowhere
**Documents**: overview.md (L168), crates/call/operation-registry.md (L116,
L206, L455, L473, L507513), crates/call/call-protocol.md (L305)
**Problem**: `Capabilities` appears in `OperationContext`, `HandlerRegistration`,
`build_root_context`, and the capability injection section — but has no struct
definition, no trait bounds, no builder API, no module location anywhere. The
spec asserts `Clone`, "non-serializable", "zeroized", "immutable after
construction" in prose only. The core and call implementers will produce
incompatible types.
**Resolution**: Added a `Capabilities` struct definition to core-types.md with
`Clone + Send + Sync`, `Zeroize`/`ZeroizeOnDrop`, a sealed builder API
(`new()`, `with_api_key()`, `with_http_token()`), private fields (immutability
guard), and no `Serialize` derive. Cross-referenced from operation-registry.md.
---
### C8. `SessionOverlaySource` referenced on `CallAdapter` but never defined; crate violation
**Documents**: crates/call/call-protocol.md (L41),
decisions/024-... (L447456)
**Problem**: `CallAdapter` (alknet-call) has a field of type
`Arc<dyn SessionOverlaySource>`, but ADR-024 says the trait is "an agent-crate
concern." That's a crate-graph violation: alknet-call can't hold a field whose
type is defined in alknet-agent (agent depends on call, not reverse).
**Resolution**: Defined `SessionOverlaySource` as a trait in alknet-call
(alongside `OperationEnv`), since the `CallAdapter` must name it. alknet-agent
implements it; alknet-call defines the integration point. This matches the
`IdentityProvider` pattern (ADR-004: core defines the trait, handlers implement
it).
---
### C9. `CompositeOperationEnv` dispatch fall-through semantics unspecified
**Documents**: crates/call/operation-registry.md (L348387)
**Problem**: The `OperationEnv::invoke` signature returns `ResponseEnvelope`
no "not in this overlay" sentinel. The sketch says "returns a sentinel or the
composite continues" and "the exact mechanism is a two-way door" — but this
affects cross-impl interop. Two implementers picking different mechanisms
(sentinel vs `contains` check) will produce `OperationEnv` impls that don't
compose.
**Resolution**: Added a `contains(&self, name: &str) -> bool` method to the
`OperationEnv` trait (with a default impl that returns `true` for backward
compat). The composite env probes `contains()` before dispatching, eliminating
the sentinel ambiguity. `CompositeOperationEnv::invoke` now checks
`session.contains()``connection.contains()` → base, dispatching to the first
overlay that contains the op.
---
### C10. No API for Layer 2 (connection overlay) registration; `CallConnection` undefined
**Documents**: crates/call/operation-registry.md (L214232, L456480),
decisions/024-... (L252263)
**Problem**: `OperationRegistryBuilder` only builds Layer 0. `from_call` imports
go into the connection's overlay at runtime, but there's no registration API, no
`CallConnection` struct definition, and no spec for how the overlay becomes part
of `CompositeOperationEnv.connection`.
**Resolution**: Added a `CallConnection` struct definition (overlay-related
fields) and a `register_imported(registration)` / `imported_operations` API to
call-protocol.md. Clarified that `OperationRegistryBuilder` is Layer-0-only;
runtime overlay registration uses `CallConnection` methods.
---
### C11. `with_local` signature diverges between two examples in operation-registry.md
**Documents**: crates/call/operation-registry.md (L217230, L458478)
**Problem**: The first example shows `with_local(spec, handler, authority,
scoped_env)` — 4 args, no capabilities. The second uses
`.with(HandlerRegistration { ... capabilities })`. `HandlerRegistration.capabilities`
is required (not `Option`). The first example silently drops capabilities for
the agent handler.
**Resolution**: Added `capabilities` as the 5th arg to `with_local` in the first
example, making both examples consistent. Documented the `with_local` and
`with_leaf` signatures explicitly.
---
## Warning Findings
### W1. `invoke_with_policy` has no default impl and can't delegate to `invoke()`
**Documents**: crates/call/operation-registry.md (L265273, L340342)
**Problem**: `invoke_with_policy` is a required trait method with no default
impl. `invoke()` hardcodes `abort_policy: parent.abort_policy.clone()`, so
`invoke_with_policy` can't delegate to it — every `OperationEnv` impl must
duplicate the full body.
**Resolution**: Restructured the trait: `invoke_with_policy` is the required
method; `invoke` has a default impl that calls `invoke_with_policy` with
`parent.abort_policy.clone()`. Impls only write `invoke_with_policy`. Added the
`CompositeOperationEnv::invoke_with_policy` body.
---
### W2. `CachedKey` referenced 7+ times but never defined
**Documents**: crates/vault/service.md (L216, L231, L330336, L351352),
crates/vault/protocol.md (L114), crates/vault/README.md (L99, L128)
**Resolution**: Added a `CachedKey` struct definition to service.md's Cache
section, showing fields (`key: DerivedKey`, `cached_at: Instant`,
`last_accessed: Instant`) and `Zeroize`/`ZeroizeOnDrop` derives.
---
### W3. `EncryptionKey` constructor + derivation glue unspecified; missing from re-export
**Documents**: crates/vault/encryption.md (L6078),
crates/vault/service.md (L140155), crates/vault/README.md (L144)
**Resolution**: Added `from_derived_bytes` pseudocode wiring SLIP-0010 output to
`EncryptionKey`. Added `EncryptionKey` to README re-export list. Specified
trait derives (custom redacting `Debug`, no `Clone`, `Zeroize`/`ZeroizeOnDrop`).
---
### W4. `Secp256k1ExtendedPrivKey` never defined; `derive_ethereum_key` glue unspecified
**Documents**: crates/vault/service.md (L157165),
crates/vault/mnemonic-derivation.md (L152164)
**Resolution**: Added `Secp256k1ExtendedPrivKey` struct definition to
mnemonic-derivation.md (parallel to `ExtendedPrivKey`). Documented
`derive_ethereum_key``derive_secp256k1_path` → wrap into `DerivedKey` glue.
---
### W5. `encryption_path_for_version(v<2)` collides onto v2 path
**Documents**: decisions/021-... (L7884),
crates/vault/mnemonic-derivation.md (L203), crates/vault/service.md (L147148)
**Problem**: `saturating_sub(2)` maps v0/v1/v2 to the same path. v1 is TS PBKDF2
(undecryptable by design), but `derive_encryption_key_for_version(1)` silently
returns the v2 key.
**Resolution**: Added a guard: `encryption_path_for_version` and
`derive_encryption_key_for_version` return `VaultServiceError::InvalidPath` for
`version < 2`. Documented in mnemonic-derivation.md and service.md.
---
### W6. `ResponseEnvelope` → `EventEnvelope` conversion unspecified; payload schemas missing
**Documents**: crates/call/call-protocol.md (L8085, L331341)
**Resolution**: Added a "Wire Payload Schemas" section to call-protocol.md
showing the `payload` shape for `call.responded`, `call.completed`, and
`call.aborted`. Documented the `ResponseEnvelope``EventEnvelope` mapping.
---
### W7. Timeout location and composition propagation unspecified
**Documents**: crates/call/call-protocol.md (L352)
**Resolution**: Added a timeout section to call-protocol.md specifying: timeout
duration is on `CallAdapter` (constructor param); composed calls inherit the
parent's remaining deadline via a `deadline: Option<Instant>` on
`OperationContext`; composed calls that exceed the deadline are cancelled (future
dropped). Added a cross-reference from `OperationContext`.
---
### W8. `generate_request_id()` unspecified; wire ID vs internal ID relationship undefined
**Documents**: crates/call/operation-registry.md (L315, L167),
crates/call/call-protocol.md (L291)
**Resolution**: Added a "Request ID Generation" subsection to
operation-registry.md specifying UUID v4 generation for composed calls, that wire
`call.requested.id` becomes the root `request_id`, and that composed child IDs
are internal (appear in `PendingRequestMap` for abort indexing but are not sent
as `call.requested` unless the child is a `from_call` op).
---
### W9. `unlock_new` already-unlocked behavior unspecified
**Documents**: crates/vault/service.md (L7493)
**Resolution**: Added: "`unlock_new` returns `AlreadyUnlocked` if the vault is
already unlocked, matching `unlock`'s behavior."
---
### W10. `KeyType`'s `Serialize`/`Deserialize` justification stale post-ADR-025
**Documents**: crates/vault/protocol.md (L115)
**Resolution**: Removed the stale "part of the irpc protocol" justification.
`KeyType` derives `Serialize`/`Deserialize` for `EncryptedData` interop (future
use) and `Clone` because it's a non-secret tag; the derives are retained but the
justification is corrected.
---
### W11. `OperationProvenance` and `CompositionAuthority` defined only in ADR-022
**Documents**: crates/call/operation-registry.md (L202, L209, L468),
decisions/022-... (L116131, L165181)
**Resolution**: Added inline definitions of `OperationProvenance` and
`CompositionAuthority` to operation-registry.md, with "See ADR-022 for
rationale" cross-references.
---
### W12. `encrypt`/`decrypt` free functions vs `VaultServiceHandle` methods relationship unstated
**Documents**: crates/vault/encryption.md (L132146),
crates/vault/service.md (L169190)
**Resolution**: Added a note to encryption.md that the free `encrypt`/`decrypt`
functions are module-internal crypto helpers; `VaultServiceHandle::encrypt`/
`decrypt` are the public API and call through to them.
---
### W13. `rotate` placement in encryption.md (method signature in crypto doc)
**Documents**: crates/vault/encryption.md (L177179)
**Resolution**: Replaced the `rotate` signature block in encryption.md with a
prose pointer to service.md and ADR-021 (rotate is a `VaultServiceHandle`
method, not a free function).
---
### W14. `CallAdapter::new()` missing session_source param
**Documents**: crates/call/operation-registry.md (L480),
crates/call/call-protocol.md (L3542)
**Resolution**: Added `CallAdapter::new(registry, identity_provider)` (defaults
`session_source` to `None`) and `CallAdapter::with_session_source(source)`
builder method to call-protocol.md.
---
## Suggestions
### S1. `Seed: Clone` duplicates root of trust
**Documents**: crates/vault/mnemonic-derivation.md (L8488)
**Resolution**: Kept `Clone` on `Seed` (derivation functions take `&[u8]`, so
`Clone` is not strictly needed, but the `to_seed` path and cache rebuild may
benefit). Added a note that callers should prefer `&Seed` and avoid cloning.
---
### S2. `VaultServiceInner.unlocked: bool` redundant with `seed: Option<Seed>`
**Documents**: crates/vault/service.md (L3541)
**Resolution**: Added an invariant note: `unlocked` is `true` iff
`seed.is_some()`; `is_unlocked()` reads `unlocked` for cheap checks, but the
ground truth is `seed.is_some()`.
---
### S3. `ExtendedPrivKey`/`Mnemonic` accessor signatures incomplete
**Documents**: crates/vault/mnemonic-derivation.md (L5863, L138150)
**Resolution**: Added accessor signatures for `ExtendedPrivKey`
(`private_key()`, `public_key()`, `chain_code()`, `path()`).
---
### S4. `CURRENT_KEY_VERSION` module location unspecified
**Documents**: decisions/021-... (L159), crates/vault/encryption.md (L154)
**Resolution**: Added: `CURRENT_KEY_VERSION` lives in `encryption.rs` and is
re-exported from the crate root.
---
### S5. ADR-018 stale actor reference
**Documents**: decisions/018 (L155156)
**Resolution**: Removed the stale actor/mpsc sentence (addressed in C4).
---
### S6. `device_path`/`encryption_path_for_version`/`derive_path_from_seed` not in re-export list
**Documents**: crates/vault/README.md (L136154)
**Resolution**: Added a note that derivation helpers are accessible as
`alknet_vault::derivation::*`.
---
## Summary Statistics
| Severity | Count | Status |
|----------|-------|--------|
| Critical | 11 | All resolved (C1C11) |
| Warning | 14 | All resolved (W1W14) |
| Suggestion | 6 | All resolved (S1S6) |
## Recommended Resolution Order
All findings resolved. Resolution summary:
1. **Vault type/reconciliation pass** (C1C5, W2W5, W9W13, S1S4): Fixed
`DerivedKey` derive contradiction, `encrypt` prose, return-type divergence,
RwLock contradiction, drift table, `CachedKey`/`EncryptionKey`/
`Secp256k1ExtendedPrivKey` definitions, `encryption_path_for_version`
guard, `unlock_new` behavior, `KeyType` justification, free-vs-method
relationship, `rotate` placement.
2. **Core+call type/API pass** (C6C11, W1, W6W8, W14, S5S6): Fixed
ADR-022 stale sketches, defined `Capabilities`, `SessionOverlaySource`,
`CallConnection`, Layer 2 registration API, `CompositeOperationEnv`
dispatch, `with_local` signature, `invoke_with_policy` default impl,
payload schemas, timeout, request ID generation, `CallAdapter` constructor.
3. **Inline definitions** (W11): Added `OperationProvenance` and
`CompositionAuthority` inline in operation-registry.md.
Review #003 is complete. All 11 critical, 14 warning, and 6 suggestion findings
resolved.