From 7dda6eec687aa8e9f6ae47465b7afd254a3d0911 Mon Sep 17 00:00:00 2001
From: "glm-5.2"
Date: Mon, 22 Jun 2026 14:53:52 +0000
Subject: [PATCH] =?UTF-8?q?docs(architecture):=20add=20ADR-025=20=E2=80=94?=
 =?UTF-8?q?=20vault=20local-only=20dispatch,=20drop=20irpc?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Drops irpc from alknet-vault entirely. The vault's dispatch is now direct method calls on VaultServiceHandle — no VaultProtocol enum, no VaultMessage, no VaultServiceActor, no mpsc channel, no Service trait, no RemoteService trait, no postcard serialization. The vault is local-only by construction.

The core security argument: irpc made the vault remote-capable by default (RemoteService generated unless no_rpc is passed). The IrohProtocol handler forwards all messages without auth. The docs framed 'register an ALPN' as a server-setup change. This is the default-insecure anti-pattern — security should be opt-in, not opt-out. ADR-025 inverts the default: local-only is the only mode, and remote access requires building a separate vault-server crate (a visible architectural act, not a flag flip).

The actor path was already dead code — service.md said 'prefer VaultServiceHandle directly — no channel, no serialization.' The actor existed only to make irpc's Service trait work, which existed only to make RemoteService work, which was the footgun. VaultServiceHandle's Arc provides concurrent reads and exclusive writes — better throughput than the actor's sequential processing.

DerivedKey serialization simplifies: always redact on serialize (for logging safety), reject '[REDACTED]' on deserialize with an error. No 'postcard preserves bytes' path. This resolves review #002 W8 (silent corruption on JSON-deserialized DerivedKey).

Resolves:
- OQ-21: remote vault access — resolved (not deferred). Not a vault crate feature; if needed, a separate vault-server crate with its own ADR.
- C7: vault-server-crate question decided — not created now, not precluded.
- C8: operation access policy table dissolved — all operations local-only by default; if a vault-server crate exposes some remotely, that crate defines the policy.
- W8: DerivedKey JSON deserialization — resolved (reject redacted payloads).

Amends ADR-005 (irpc remains for alknet-call, not for alknet-vault), ADR-018 (vault is even more standalone — zero RPC framework deps), ADR-019 (vault is the only layer, not just the only direct-caller layer), ADR-008 (vault integration point unchanged, but now local-only by construction).
---
 docs/architecture/README.md | 9 +-
 .../crates/call/operation-registry.md | 5 +-
 docs/architecture/crates/vault/README.md | 70 ++--
 docs/architecture/crates/vault/protocol.md | 277 +++++----------
 docs/architecture/crates/vault/service.md | 121 +++----
 .../decisions/003-crate-decomposition.md | 2 +-
 .../005-irpc-as-call-protocol-foundation.md | 8 +-
 .../008-secret-service-integration.md | 10 +-
 .../decisions/018-vault-standalone-crate.md | 30 +-
 .../019-vault-assembly-layer-only.md | 10 +-
 .../025-vault-local-only-dispatch.md | 314 ++++++++++++++++++
 docs/architecture/open-questions.md | 34 +-
 docs/architecture/overview.md | 5 +-
 13 files changed, 527 insertions(+), 368 deletions(-)
 create mode 100644 docs/architecture/decisions/025-vault-local-only-dispatch.md

diff --git a/docs/architecture/README.md b/docs/architecture/README.md
index 759fe2a..9616165 100644
--- a/docs/architecture/README.md
+++ b/docs/architecture/README.md
@@ -7,9 +7,9 @@ last_updated: 2026-06-22-20
 
 ## Current State
 
-**Pre-implementation.** The project has completed a pivot from a three-layer model to an ALPN-as-service model. The greenfield workspace contains only `alknet-vault` (stable — implementation exists) and research/reference material. 
Foundational ADRs (001–024) are in place, including the BiStream type definition (ADR-007), vault integration (ADR-008), ALPN router/endpoint (ADR-010), AuthContext structure (ADR-011), call protocol stream model (ADR-012), Rust as canonical implementation language (ADR-013), secret material flow with capability injection (ADR-014), privilege model with authority context (ADR-015), abort cascade for nested calls (ADR-016), call protocol client and adapter contract (ADR-017), vault standalone crate (ADR-018), vault assembly-layer-only access (ADR-019), HD derivation for encryption keys (ADR-020), key rotation via version-indexed paths (ADR-021), handler registration, provenance, and composition authority (ADR-022), operation error schemas (ADR-023), and operation registry layering (ADR-024). ADR-024 resolves the registry mutability question that ADR-022/017 surfaced (`from_call` imports require a runtime-mutable home) and the `OperationContext.env` type identity crisis (review #002 C6), by layering the registry by trust boundary (curated static + session/connection dynamic overlays) and making `OperationEnv` a trait-object integration point. The alknet-core, alknet-call, and alknet-vault crate specs are in draft. +**Pre-implementation.** The project has completed a pivot from a three-layer model to an ALPN-as-service model. The greenfield workspace contains only `alknet-vault` (stable — implementation exists, pending ADR-025 refactor to drop irpc) and research/reference material. 
Foundational ADRs (001–025) are in place, including the BiStream type definition (ADR-007), vault integration (ADR-008), ALPN router/endpoint (ADR-010), AuthContext structure (ADR-011), call protocol stream model (ADR-012), Rust as canonical implementation language (ADR-013), secret material flow with capability injection (ADR-014), privilege model with authority context (ADR-015), abort cascade for nested calls (ADR-016), call protocol client and adapter contract (ADR-017), vault standalone crate (ADR-018), vault assembly-layer-only access (ADR-019), HD derivation for encryption keys (ADR-020), key rotation via version-indexed paths (ADR-021), handler registration, provenance, and composition authority (ADR-022), operation error schemas (ADR-023), operation registry layering (ADR-024), and vault local-only dispatch (ADR-025). ADR-024 resolves the registry mutability question (`from_call` imports require a runtime-mutable home) and the `OperationContext.env` type identity crisis (review #002 C6), by layering the registry by trust boundary and making `OperationEnv` a trait-object integration point. ADR-025 drops irpc from the vault, making it local-only by construction (inverting the security default from remote-capable-by-default to local-only-by-default) and resolving OQ-21, C7, C8, and W8. The alknet-core, alknet-call, and alknet-vault crate specs are in draft. -**Next step**: Continue working through review #002's remaining Tier 4 findings (vault security decisions, guard clauses, ADR-writing exercises, smaller spec decisions). All open questions for the core and call crates are resolved; the vault crate has one deferred OQ (OQ-21, remote vault administration) that does not block implementation. +**Next step**: Continue working through review #002's remaining Tier 4 findings (vault security decisions, guard clauses, ADR-writing exercises, smaller spec decisions). 
All open questions for the core and call crates are resolved; the vault crate's OQ-21 (remote vault) is now resolved (ADR-025 — vault is local-only by construction). ## Architecture Documents @@ -29,7 +29,7 @@ last_updated: 2026-06-22-20 | [crates/vault/mnemonic-derivation.md](crates/vault/mnemonic-derivation.md) | draft | BIP39, SLIP-0010, BIP-0032, derivation paths, key types | | [crates/vault/encryption.md](crates/vault/encryption.md) | draft | AES-256-GCM, EncryptedData, key versioning, salt (Phase B reserved) | | [crates/vault/service.md](crates/vault/service.md) | draft | VaultServiceHandle lifecycle, actor dispatch, cache, error model | -| [crates/vault/protocol.md](crates/vault/protocol.md) | draft | VaultProtocol irpc messages, DerivedKey redaction, serialization | +| [crates/vault/protocol.md](crates/vault/protocol.md) | draft | DerivedKey redaction, KeyType, serialization behavior | ## ADR Table @@ -59,6 +59,7 @@ last_updated: 2026-06-22-20 | [022](decisions/022-handler-registration-provenance-and-composition-authority.md) | Handler Registration, Provenance, and Composition Authority | Accepted | | [023](decisions/023-operation-error-schemas.md) | Operation Error Schemas | Accepted | | [024](decisions/024-operation-registry-layering.md) | Operation Registry Layering | Accepted | +| [025](decisions/025-vault-local-only-dispatch.md) | Vault Local-Only Dispatch | Accepted | ## Open Questions @@ -85,6 +86,7 @@ See [open-questions.md](open-questions.md) for the full tracker. - **OQ-14**: Batch operation semantics — multiple correlated `call.requested` events is the correct protocol design, not a simplification - **OQ-19**: Session-scoped registries — agent-written operations via `OperationEnv` trait layering; protocol doesn't need changes; `OperationEnv` must remain a trait. Generalized by ADR-024 to cover connection-scoped overlays as well. 
- **OQ-20**: Encryption key derivation — HD derivation from BIP39 seed, not PBKDF2; salt field unused in v2 (wire-format compat) (ADR-020) +- **OQ-21**: Remote vault access — resolved (ADR-025): vault is local-only by construction; remote access requires a separate vault-server crate with its own ADR - **OQ-22**: Key rotation — version-indexed derivation paths; `rotate` method re-encrypts (ADR-021) - **OQ-23**: Handler identity registration path — registration bundle with provenance, composition authority, scoped env, capabilities (ADR-022) - **OQ-24**: Operation error schemas — declared domain errors with typed `details` payload; adapter fidelity for `from_openapi`/`to_openapi` (ADR-023) @@ -92,7 +94,6 @@ See [open-questions.md](open-questions.md) for the full tracker. **Deferred (not active):** - **OQ-09**: WASM target boundaries — design constraint, not deliverable - **OQ-10**: Git adapter scope — start with smart protocol, add ERC721 later -- **OQ-21**: Remote vault access — protocol is remote-capable by construction (irpc `RemoteService`); enabling is a server-setup change with an auth-wrapping handler in the assembly layer; `Unlock`/`Lock` are local-only ## Document Lifecycle diff --git a/docs/architecture/crates/call/operation-registry.md b/docs/architecture/crates/call/operation-registry.md index 4c30d85..58b9e0a 100644 --- a/docs/architecture/crates/call/operation-registry.md +++ b/docs/architecture/crates/call/operation-registry.md @@ -436,10 +436,9 @@ irpc and the operation registry serve different scopes: | Layer | Mechanism | Serialization | Scope | |-------|-----------|---------------|-------| | Call protocol (external) | `EventEnvelope` over QUIC streams | JSON | Cross-language, cross-node | -| irpc services (internal) | `VaultProtocol` derive macro, `Service` trait | postcard (binary) | Rust-to-Rust, in-process or in-cluster | -| Local dispatch (in-process) | Direct function call through `OperationRegistry` | None | Same process | +| irpc services 
(internal) | `#[rpc_requests]` derive macro, `Service` trait | postcard (binary) | Rust-to-Rust, in-process or in-cluster | -irpc services are an internal dispatch mechanism — they are not directly exposed on the call protocol. The vault's `VaultProtocol` uses irpc for in-process, type-safe dispatch via `VaultServiceHandle` (postcard serialization for in-cluster, direct calls for in-process). The vault is accessed by the assembly layer (CLI binary) at startup, not by handlers at call time. See ADR-008 and ADR-014. +irpc services are an internal dispatch mechanism — they are not directly exposed on the call protocol. alknet-call itself uses irpc for its call-protocol framing (ADR-005); the vault no longer uses irpc (ADR-025 — direct method calls on `VaultServiceHandle`). The vault is accessed by the assembly layer (CLI binary) at startup, not by handlers at call time. See ADR-008 and ADR-014. If a handler internally uses an irpc-based service, the handler bridges the two: it receives JSON input from the call protocol, calls the irpc service in-process (postcard, type-safe), and serializes the result back to JSON for the call protocol response. This layering preserves irpc's type safety for internal calls while keeping the external interface cross-language. diff --git a/docs/architecture/crates/vault/README.md b/docs/architecture/crates/vault/README.md index 4455d80..11d4988 100644 --- a/docs/architecture/crates/vault/README.md +++ b/docs/architecture/crates/vault/README.md @@ -1,6 +1,6 @@ --- status: draft -last_updated: 2026-06-22-19 +last_updated: 2026-06-22-25 --- # alknet-vault @@ -13,16 +13,18 @@ and encrypted credentials in the alknet system. ## What This Crate Is alknet-vault is a **standalone crate** with zero alknet crate dependencies -(ADR-018). It provides the cryptographic primitives and runtime API for -managing the root of trust. 
The CLI binary (the `alknet` crate) is the sole -component that talks to the vault directly (ADR-019) — handlers receive -derived/decrypted material through capabilities, never through a vault -reference. +(ADR-018) and zero RPC framework dependencies (ADR-025). It provides the +cryptographic primitives and runtime API for managing the root of trust. +The CLI binary (the `alknet` crate) is the sole component that talks to the +vault directly (ADR-019) — handlers receive derived/decrypted material +through capabilities, never through a vault reference. The vault is **not a network service**. It has no ALPN, no -`ProtocolHandler` implementation, and no operations registered in the call -protocol (ADR-008, ADR-014). The master seed and derived private keys never -cross the network. +`ProtocolHandler` implementation, no operations registered in the call +protocol (ADR-008, ADR-014), and no remote dispatch capability (ADR-025). +The vault is **local-only by construction** — direct method calls on +`VaultServiceHandle`, no actor, no message enum, no wire format. The master +seed and derived private keys never cross the network. ## Documents @@ -30,16 +32,14 @@ cross the network. 
|----------|--------|-------------| | [mnemonic-derivation.md](mnemonic-derivation.md) | draft | BIP39, SLIP-0010, BIP-0032, derivation paths, key types | | [encryption.md](encryption.md) | draft | AES-256-GCM, EncryptedData, key versioning, HD derivation (ADR-020) | -| [service.md](service.md) | draft | VaultServiceHandle lifecycle, actor dispatch, cache, error model | -| [protocol.md](protocol.md) | draft | VaultProtocol irpc messages, DerivedKey redaction, serialization | +| [service.md](service.md) | draft | VaultServiceHandle lifecycle, direct dispatch, cache, error model | +| [protocol.md](protocol.md) | draft | DerivedKey redaction, KeyType, serialization behavior | ## Applicable ADRs | ADR | Title | Relevance | |-----|-------|-----------| | [003](../../decisions/003-crate-decomposition.md) | Crate Decomposition | alknet-vault's standalone position | -| [006](../../decisions/006-alpn-convention-and-connection-model.md) | ALPN String Convention | ALPN versioning pattern for potential `alknet/vault/v2` | -| [005](../../decisions/005-irpc-as-call-protocol-foundation.md) | irpc as Call Protocol Foundation | VaultProtocol uses irpc directly | | [008](../../decisions/008-secret-service-integration.md) | Vault Integration Point | CLI-embedded, capability source | | [010](../../decisions/010-alpn-router-and-endpoint.md) | ALPN Router and Endpoint | Ed25519 as default curve for TLS raw key identity | | [014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Secret Material Flow and Capability Injection | Capabilities carry vault-derived material | @@ -47,33 +47,38 @@ cross the network. 
| [019](../../decisions/019-vault-assembly-layer-only.md) | Vault Assembly-Layer-Only Access | The assembly layer is the sole caller | | [020](../../decisions/020-hd-derivation-for-encryption-keys.md) | HD Derivation for Encryption Keys | SLIP-0010 derivation, not PBKDF2; salt unused in v2 | | [021](../../decisions/021-key-rotation-via-version-indexed-paths.md) | Key Rotation via Version-Indexed Paths | Version-indexed paths; `rotate` re-encrypts | +| [025](../../decisions/025-vault-local-only-dispatch.md) | Vault Local-Only Dispatch | Dropped irpc; direct method calls; local-only by construction | ## Relevant Open Questions | OQ | Title | Status | Relevance | |----|-------|--------|-----------| | OQ-20 | Encryption key derivation | resolved (ADR-020) | HD derivation from seed; salt field unused in v2 | -| OQ-21 | Remote vault access | deferred | Protocol is remote-capable by construction; enabling = server-setup change with auth-wrapping handler; Unlock/Lock local-only | +| OQ-21 | Remote vault access | resolved (ADR-025) | Vault is local-only by construction; remote access requires a separate vault-server crate with its own ADR | | OQ-22 | Key rotation mechanism | resolved (ADR-021) | Version-indexed paths; `rotate` method | ## Key Design Principles -1. **Standalone**: The vault depends on no alknet crate. It defines its own - types, errors, and protocol. External crates depend on the vault; the - vault depends on nothing in alknet. +1. **Standalone**: The vault depends on no alknet crate and no RPC framework. + It defines its own types and errors. External crates depend on the vault; + the vault depends on nothing in alknet. 2. **Assembly-layer only**: The vault's API is consumed by the CLI binary, not by handlers. Handlers receive material through capabilities (ADR-014). The vault is not on the wire. -3. **Zeroize everything sensitive**: The mnemonic, seed, derived private +3. **Local-only by construction**: The vault has no remote dispatch + capability. 
Direct method calls on `VaultServiceHandle` — no actor, no + message enum, no wire format (ADR-025). Remote access, if ever needed, + requires a separate crate with its own ADR. +4. **Zeroize everything sensitive**: The mnemonic, seed, derived private keys, encryption keys, and cached keys all implement `Zeroize` and `ZeroizeOnDrop`. Secret material does not linger in freed heap memory. -4. **Deterministic derivation**: The same mnemonic + passphrase + path +5. **Deterministic derivation**: The same mnemonic + passphrase + path always produces the same key. Derivation is reproducible across runs and across nodes. -5. **OsRng for nonces**: AES-GCM IVs and any cryptographic nonces use +6. **OsRng for nonces**: AES-GCM IVs and any cryptographic nonces use `OsRng` (or equivalent CSPRNG), never `rand::random()`. IV reuse under the same key is catastrophic for GCM. -6. **No `unwrap()` or `expect()` outside tests**: vault operations +7. **No `unwrap()` or `expect()` outside tests**: vault operations propagate errors. A poisoned lock is recovered with `unwrap_or_else(|e| e.into_inner())`, not `unwrap()`. A panic in one vault operation must not brick the vault for all other operations. @@ -97,11 +102,13 @@ the full list. `unwrap_or_else(|e| e.into_inner())` or explicit error propagation. The current source uses `unwrap()` in `VaultServiceHandle` methods — this is a known drift and must be corrected. -- **DerivedKey redaction in JSON**: `DerivedKey` serializes the - `private_key` as `"[REDACTED]"` in human-readable formats (JSON) and as - raw bytes in binary formats (postcard). The redaction is a defense-in- - depth measure, not the primary control — the primary control is that - `DerivedKey` never crosses the call protocol wire (ADR-014). +- **DerivedKey redaction in serialization**: `DerivedKey` serializes the + `private_key` as `"[REDACTED]"` in all formats (ADR-025 dropped the + postcard/remote path that previously preserved bytes in binary formats). 
+ Deserialization rejects `"[REDACTED]"` with an error (resolves review + #002 W8). The redaction is a defense-in-depth measure for logging safety, + not the primary control — the primary control is that `DerivedKey` never + crosses the call protocol wire (ADR-014). ## Known Source Drift @@ -115,8 +122,9 @@ truth for drift tracking — if an item is fixed in source, update this table. | 1 | IV generation | `rand::random()` | `OsRng` (CSPRNG) | `encryption.rs` L133 | [encryption.md → Security Constraints](encryption.md#security-constraints), [service.md → Security Constraints](service.md#security-constraints) | | 2 | RwLock `unwrap()` | `unwrap()` on every `RwLock` acquisition (L142, 161, 182, 191, 196, 227, 264, 307, 340, 367) | `unwrap_or_else(\|e\| e.into_inner())` for poisoned lock recovery | `service.rs` (see line numbers) | [service.md → Security Constraints](service.md#security-constraints) | | 3 | `CURRENT_KEY_VERSION` | `1` (HD-derived, but v1 is reserved for TS PBKDF2 legacy per ADR-020) | `2` (HD-derived, per ADR-020) | `encryption.rs` | [encryption.md → Key Versioning](encryption.md#key-versioning), [ADR-020](../../decisions/020-hd-derivation-for-encryption-keys.md) | -| 4 | `spawn()` return value | Returns a fresh, unspawned `VaultServiceActor` as the second tuple element (the spawned actor is consumed by `run`) | Either drop the second return value (return only `Client`) or restructure so the returned actor is the one that was spawned | `service.rs` `VaultServiceActor::spawn()` | [service.md → Actor Dispatch](service.md#actor-dispatch) | -| 5 | `HashMap::clear` zeroization | `KeyCache::clear()` removes entries and relies on `CachedKey`'s `Drop` impl for zeroization | Verify `HashMap::clear()` actually drops values (it does, but worth a test) | `cache.rs` | [service.md → Security Constraints](service.md#security-constraints) | +| 4 | irpc dependency | `VaultProtocol` enum with `#[rpc_requests]`, `VaultServiceActor`, `Client`, irpc/postcard deps | Remove 
entirely — direct method calls on `VaultServiceHandle` (ADR-025) | `protocol.rs`, `service.rs`, `Cargo.toml` | [ADR-025](../../decisions/025-vault-local-only-dispatch.md) | +| 5 | `DerivedKey` dual serialization | JSON redacts, postcard preserves bytes | Always redact on serialize; reject `"[REDACTED]"` on deserialize with error (ADR-025, resolves W8) | `protocol.rs` | [protocol.md → Serialization Redaction](protocol.md#serialization-redaction), [ADR-025](../../decisions/025-vault-local-only-dispatch.md) | +| 6 | `HashMap::clear` zeroization | `KeyCache::clear()` removes entries and relies on `CachedKey`'s `Drop` impl for zeroization | Verify `HashMap::clear()` actually drops values (it does, but worth a test) | `cache.rs` | [service.md → Security Constraints](service.md#security-constraints) | ## Public API @@ -132,11 +140,11 @@ pub use derivation::{DerivationError, ExtendedPrivKey, PATHS}; // Encryption pub use encryption::{EncryptedData, EncryptionError}; -// Protocol (irpc messages) -pub use protocol::{DerivedKey, KeyType, VaultMessage, VaultProtocol}; +// Key types (DerivedKey, KeyType) +pub use protocol::{DerivedKey, KeyType}; // Service (runtime) -pub use service::{VaultService, VaultServiceActor, VaultServiceError, VaultServiceHandle}; +pub use service::{VaultServiceError, VaultServiceHandle}; // Cache pub use cache::CacheConfig; diff --git a/docs/architecture/crates/vault/protocol.md b/docs/architecture/crates/vault/protocol.md index 2cb62ff..44a1509 100644 --- a/docs/architecture/crates/vault/protocol.md +++ b/docs/architecture/crates/vault/protocol.md @@ -1,58 +1,24 @@ --- status: draft -last_updated: 2026-06-19 +last_updated: 2026-06-22-25 --- # Protocol -The `VaultProtocol` irpc message enum, `DerivedKey` type, and serialization -behavior. +The `DerivedKey` type, `KeyType` enum, and serialization behavior. The +vault's "protocol" is the `VaultServiceHandle` method API (ADR-025) — there +is no message enum, no irpc dispatch, and no wire format. 
## What -The protocol layer defines the message enum that the irpc dispatch -infrastructure uses (ADR-005) and the `DerivedKey` type that derivation -methods return. This is the vault's internal dispatch protocol — not the -alknet call protocol (the vault has no ALPN, ADR-008). +The vault's dispatch is direct method calls on `VaultServiceHandle` +(ADR-025). The types defined here — `DerivedKey`, `KeyType` — are the +return types from those methods. There is no `VaultProtocol` enum, no +`VaultMessage`, no `VaultServiceActor`, and no remote dispatch capability. -## VaultProtocol - -The irpc message enum. The `#[rpc_requests]` macro generates the -`VaultMessage` enum (with `WithChannels` wrappers), `Channels` impls, -`From` impls, and `Service`/`RemoteService` traits for remote dispatch. - -```rust -#[rpc_requests(message = VaultMessage, no_spans)] -#[derive(Debug, Serialize, Deserialize)] -pub enum VaultProtocol { - DeriveEd25519 { path: String }, - DeriveEncryptionKey { path: String }, - DeriveEthereumKey { path: String }, - DerivePassword { path: String, length: usize }, - Encrypt { plaintext: String, key_version: u32 }, - Decrypt { encrypted: EncryptedData }, - Lock, - Unlock { mnemonic: String, passphrase: Option }, -} -``` - -Each variant is a vault operation. The `tx` channel type for each variant -is `oneshot::Sender>`, where `T` is the -operation's return type (`DerivedKey`, `Vec`, `EncryptedData`, `String`, -or `()`). - -### State requirements - -All operations except `Unlock` require the vault to be **unlocked**. -Calling derive/encrypt/decrypt on a locked vault returns -`VaultServiceError::VaultLocked` (not a panic, not a channel close). - -### Dispatch - -The `VaultServiceActor` (see [service.md](service.md)) processes -`VaultMessage` variants and dispatches to `VaultServiceHandle` methods. -For local in-process use, prefer `VaultServiceHandle` directly — no -channel overhead. +The vault is **local-only by construction**. 
If remote vault access is ever +needed, it requires a separate crate that wraps the vault and adds remote +transport + auth (ADR-025, OQ-21). ## DerivedKey @@ -92,19 +58,23 @@ and zeroized. ### Serialization redaction -`DerivedKey` has a custom `Serialize` impl that redacts the private key in -human-readable formats: +`DerivedKey` has a custom `Serialize` impl that **always** redacts the +private key, regardless of format: -- **JSON** (human-readable): `private_key` serializes as `"[REDACTED]"`. - This is defense-in-depth — if a `DerivedKey` accidentally ends up in a - log or a JSON config, the private key is not exposed. -- **postcard** (binary, used by irpc): `private_key` serializes as the - actual bytes. This is required for in-cluster irpc dispatch to work — - the remote side needs the actual key bytes. -- **Deserialization**: always reads the full bytes, regardless of format. - A JSON-deserialized `DerivedKey` will have `"[REDACTED]"` as its - `private_key` string — this is expected; JSON round-tripping a - `DerivedKey` is not a supported use case (the private key is gone). +- **JSON** (and all human-readable formats): `private_key` serializes as + `"[REDACTED]"`. This is defense-in-depth — if a `DerivedKey` accidentally + ends up in a log, a JSON config, or debug output, the private key is not + exposed. +- **Deserialization**: rejects `private_key == "[REDACTED]"` with an error. + A JSON-deserialized `DerivedKey` with a redacted private key is invalid + and produces a deserialization error, not a corrupted key. This resolves + review #002 W8 (silent corruption on JSON-deserialized `DerivedKey`). +- **No binary-format preservation path.** ADR-025 dropped the postcard/remote + dispatch path that previously preserved private key bytes in binary + formats. `DerivedKey` is always used in-process (ADR-014: never appears + in call protocol payloads). 
If a future remote-vault crate needs to send + `DerivedKey` over the wire, it defines its own serialization for that + context — the vault's `DerivedKey` stays redact-always. The redaction is **not the primary control** for keeping private keys off the wire. The primary control is architectural: `DerivedKey` never appears @@ -147,169 +117,74 @@ Tags `DerivedKey` and `CachedKey` so consumers know what they received. ## Wire Format -For local (in-process) calls, the protocol uses tokio channels directly — -no serialization. For remote (in-cluster) calls, the protocol is serialized -with postcard (binary, compact). For cross-node (call protocol) exposure, -the vault is wrapped in an operation that serializes to JSON — but **no -vault operations are exposed over the call protocol** (ADR-014). The JSON -serialization path exists only for the `DerivedKey` redaction safety net. +The vault has no wire format (ADR-025). Dispatch is direct method calls on +`VaultServiceHandle` — no serialization, no channels, no network. The +`DerivedKey` custom `Serialize`/`Deserialize` impls exist solely for +logging safety (redaction) and defense-in-depth, not for wire transport. -## Remote Capability +`EncryptedData` has a stable wire format (shared with `alknet-storage` and +the TypeScript consumer by type-level agreement — see +[encryption.md](encryption.md) and ADR-018). That format is for *stored +encrypted data*, not for vault dispatch — the vault's `encrypt`/`decrypt` +methods operate on `EncryptedData` as a value type, not as a wire message. -The `VaultProtocol` is a remote-capable irpc service **by construction**. -The `#[rpc_requests]` macro generates both `Service` (local) and -`RemoteService` (remote) trait implementations. The `VaultServiceActor` -processes `VaultMessage` variants identically regardless of transport — -the only difference between local and remote use is the `Client` -construction and the server-side listener setup. 
+## Local-Only by Construction -This was a purposeful design decision: irpc's "zero-overhead local, -transparent remote" architecture means the same protocol definition and -actor code work for both in-process and cross-network dispatch. Enabling -remote vault access is a server-setup change, not a protocol change. +The vault is **local-only by construction** (ADR-025). There is no +`RemoteService` trait, no remote handler, no wire format for vault +messages. The vault's API is `VaultServiceHandle` — direct method calls, +nothing else. -### What's already in place +If remote vault access is ever needed (e.g., the machine→worker pattern +where a long-lived node exposes a restricted vault API to ephemeral +workers), it requires a **separate vault-server crate** that: -- **Protocol**: `VaultProtocol` is already a `RemoteService`. No code - changes needed in the protocol definition. -- **Serialization**: `DerivedKey`'s dual serialization (JSON redacts private - key for safety; postcard preserves bytes for remote dispatch) was - designed for this use case. -- **Actor**: `VaultServiceActor` already processes all message types. The - actor is transport-agnostic — it doesn't know whether a message arrived - via a local mpsc channel or a remote QUIC stream. -- **Auth transport**: irpc over iroh uses iroh's QUIC connections, which - authenticate via NodeId (Ed25519, RFC 7250 raw keys) — the same identity - model as the rest of alknet (ADR-010). The connection-level identity - ("which NodeId is calling") is available before any vault operation is - dispatched. +1. Depends on both alknet-core (for `IdentityProvider`, scopes, + auth-wrapping) and alknet-vault (for `VaultServiceHandle`). +2. Defines its own threat model, access policy, and operation filtering + (`Unlock`/`Lock` must be local-only; other operations may be + remote-capable depending on the policy). +3. 
Adds the remote transport (iroh/QUIC or similar) and an auth-wrapping + handler that checks caller identity before forwarding to the vault. +4. Requires its own ADR (matching ADR-019's language: "requires its own + ADR") defining the threat model and access policy. -### What's not in place (the gap) +This is a deliberate addition, not a flag flip on a default that was +already loaded. The pre-ADR-025 design made the vault remote-capable *by +construction* (irpc generated `RemoteService` by default), which was the +default-insecure anti-pattern. ADR-025 inverts the default: local-only is +the only mode, and remote access requires building something new. -The `IrohProtocol` handler that irpc provides forwards **all** message -types to the actor without auth checks. For local use this is correct -(the assembly layer is trusted). For remote use, the listener needs: - -1. **NodeId allowlist**: only known worker NodeIds may connect. -2. **Message filtering**: reject `Unlock` and `Lock` from remote callers - (see "Operation access policy" below). -3. **Then** forward to the actor. - -This auth-wrapping handler cannot live in the vault crate — the vault is -standalone (ADR-018) and depends on no alknet crate. The auth model -(`IdentityProvider`, `Identity`, scopes) lives in alknet-core. The -auth-wrapping listener lives in the **assembly layer** (the CLI binary) -or a dedicated vault-server crate that depends on both alknet-core and -alknet-vault. This is the same pattern as ADR-019: the vault is a -library, the assembly layer is the integrator. 
- -``` -alknet-vault (standalone, no deps) - - VaultProtocol (RemoteService by construction) - - VaultServiceActor (processes all message types, no auth) - - VaultServiceHandle (direct API) - -assembly layer / vault-server (depends on alknet-core + alknet-vault) - - AuthWrappingHandler: checks NodeId, filters message types, forwards - - IrohProtocol::new(auth_wrapping_handler) - - Router::builder(endpoint).accept(b"alknet/vault", protocol).spawn() -``` - -### Operation access policy - -Not all `VaultProtocol` operations are safe to expose remotely. The vault -spec defines the policy; the assembly-layer listener enforces it. - -| Operation | Local (assembly layer) | Remote (workers) | Why | -|-----------|----------------------|-------------------|-----| -| `Unlock` | ✅ | ❌ | Sends the mnemonic (root of trust) over the wire. Even with NodeId auth, the mnemonic in transit is a different threat model — it's in memory on the receiving end, potentially in logs/traces. Local-only. | -| `Lock` | ✅ | ❌ | Locking the vault bricks the machine node for all workers. A compromised or buggy worker could DoS the entire machine node. Local-only. | -| `DeriveEd25519` | ✅ | ✅ | Workers need derived keys for signing, identity. The derivation path is the access control — the worker can only derive at paths the assembly layer declares. | -| `DeriveEncryptionKey` | ✅ | ✅ | Workers need encryption keys for credential encryption. Same path-based access control. | -| `DeriveEthereumKey` | ✅ | ✅ | Same as DeriveEd25519, for Ethereum signing. | -| `DerivePassword` | ✅ | ✅ | Workers need deterministic passwords for service credentials. | -| `Encrypt` | ✅ | ✅ | Workers encrypt external credentials (API keys) for storage. | -| `Decrypt` | ✅ | ✅ | Workers decrypt stored credentials at call time. 
| - -The policy is: **`Unlock` and `Lock` are local-only; all other operations -are remote-capable.** The assembly-layer listener filters `Unlock` and -`Lock` messages from remote connections and returns an error. - -### Use case: machine node → workers - -The primary use case is a **machine node** (long-lived, holds the mnemonic, -manages container services) exposing a restricted vault API to its -**workers** (ephemeral, containerized, no mnemonic): - -``` -Machine Node (head, vault unlocked locally) -├── exposes alknet/vault ALPN to workers -├── NodeId allowlist: only known worker NodeIds may connect -├── message filter: rejects Unlock/Lock from remote callers -│ -├── Worker A (no mnemonic) -│ └── calls DeriveEd25519, Encrypt, Decrypt on machine node's vault -│ -└── Worker B (also a head for its own sub-workers) - ├── gets its own credentials from machine node's vault - └── can expose its own restricted vault API to sub-workers -``` - -Workers don't hold mnemonics. They get static credentials injected at -construction (the common case) and call the machine node's vault for -dynamic derivation or decryption when needed. This is the -defense-in-depth (Russian doll) model: the seed is the innermost layer, -the machine node's vault is the next, iroh's NodeId auth is the outer, -and workers are outside that — calling in through authenticated channels. - -### Per-machine-node vaults, not shared - -Each machine node has its own vault and mnemonic. Machine nodes do not -share vaults with each other. Compromising one machine node exposes only -that node's workers, not all nodes. This is compartmentalization — the -blast radius of a vault compromise is one machine node, not the entire -fleet. - -The remote vault capability is for the **machine→worker** relationship, -not for cross-machine-node sharing. Machine nodes don't expose their -vaults to peer machine nodes — only to their own workers, authenticated -by NodeId. - -### What's breaking vs. 
non-breaking - -| Change | Breaking? | Why | -|--------|-----------|-----| -| Enabling remote vault access | **No** | Server-setup change — register `IrohProtocol` with an ALPN. The protocol is already a `RemoteService`. | -| Restricting which operations are remote-capable | **No** | Policy in the assembly-layer handler, not a protocol change. | -| Adding NodeId auth checks | **No** | Implementation in the assembly-layer handler. The vault crate doesn't change. | -| Adding new `VaultProtocol` variants | **Yes (wire break)** | Inherent to irpc — versioning is a non-goal. Would need ALPN versioning (`alknet/vault/v2`) if the protocol evolves. Same constraint as any irpc service. | -| Changing `DerivedKey` serialization | **No** | Dual serialization is already in place — postcard preserves bytes for remote, JSON redacts for safety. | - -The only breaking change is evolving the `VaultProtocol` enum itself, and -that's manageable with ALPN versioning (`alknet/vault`, then -`alknet/vault/v2` if needed) — the same pattern alknet uses for all ALPN -protocols (ADR-006). +**Per-node vaults are the recommended pattern for multi-node deployments.** +Each node has its own vault and mnemonic. Credentials are encrypted *for* +the receiving node's public key or derived at a shared path the receiving +node can derive locally. This is end-to-end encryption between nodes, not +a centralized decryption oracle. It matches ADR-008's "capability source" +model — credentials are injected at the assembly layer, not fetched over +the network at call time. 
## Design Decisions | Decision | ADR | Summary | |----------|-----|---------| -| irpc for vault dispatch | [ADR-005](../../decisions/005-irpc-as-call-protocol-foundation.md) | In-process type-safe dispatch; remote-capable by construction | +| Vault is standalone | [ADR-018](../../decisions/018-vault-standalone-crate.md) | Zero alknet crate dependencies | +| Vault is local-only | [ADR-025](../../decisions/025-vault-local-only-dispatch.md) | Direct method calls, no irpc, no remote dispatch capability | +| HD derivation (not stored keys) | — | One seed, many keys, no key storage | | `DerivedKey` is move-only | [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Prevents accidental duplication of secret material | -| JSON redacts private key | [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Defense-in-depth for logging accidents | -| postcard preserves private key | — | Required for in-cluster irpc dispatch | +| JSON redacts private key (always) | [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Defense-in-depth for logging accidents | | No vault operations on call protocol | [ADR-008](../../decisions/008-secret-service-integration.md), [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Master seed never crosses the network | -| Unlock/Lock are local-only | OQ-21 (deferred) | Mnemonic and lock control must not be remotely accessible | -| Auth wrapping lives in assembly layer | [ADR-018](../../decisions/018-vault-standalone-crate.md), [ADR-019](../../decisions/019-vault-assembly-layer-only.md) | Vault is standalone; can't import alknet-core's auth model | +| No remote dispatch in vault crate | [ADR-025](../../decisions/025-vault-local-only-dispatch.md) | Remote access requires a separate vault-server crate with its own ADR | ## Open Questions -None active for this document. +None active for this document. 
OQ-21 (remote vault) is resolved — see +ADR-025 and [open-questions.md](../../open-questions.md). ## References -- Implementation: `crates/alknet-vault/src/protocol.rs` +- Implementation: `crates/alknet-vault/src/protocol.rs` (to be updated + per ADR-025 — remove `VaultProtocol` enum and irpc usage) - Tests: `crates/alknet-vault/src/protocol.rs` (unit tests for redaction - and zeroize behavior) -- [service.md](service.md) — how the actor dispatches `VaultMessage` + and zeroize behavior; postcard tests to be removed) +- [service.md](service.md) — `VaultServiceHandle` runtime API - [mnemonic-derivation.md](mnemonic-derivation.md) — what `KeyType` means \ No newline at end of file diff --git a/docs/architecture/crates/vault/service.md b/docs/architecture/crates/vault/service.md index aa294b9..a9f9e01 100644 --- a/docs/architecture/crates/vault/service.md +++ b/docs/architecture/crates/vault/service.md @@ -1,12 +1,13 @@ --- status: draft -last_updated: 2026-06-20 +last_updated: 2026-06-22-25 --- # Service The `VaultServiceHandle` runtime API: unlock/lock lifecycle, key -derivation, encryption, caching, and the actor dispatch path. +derivation, encryption, caching, and the direct method-call dispatch +path. ## What @@ -16,7 +17,9 @@ stateful runtime with a clear lifecycle. It holds the master seed in lifecycle, key derivation, and encryption/decryption. This is the API the assembly layer (CLI binary) calls. No other component -calls these methods directly (ADR-019). +calls these methods directly (ADR-019). The vault is local-only by +construction (ADR-025) — direct method calls, no actor, no message enum, +no remote dispatch. ## VaultServiceHandle @@ -254,91 +257,54 @@ pub struct CacheConfig { bytes, not a keypair that's reused. Caching it would grow the cache with unique paths (one per site hash) for no reuse benefit. 
-## Actor Dispatch +## Dispatch -The `VaultServiceActor` processes `VaultMessage` variants from an mpsc -channel and dispatches to `VaultServiceHandle` methods. This is the irpc -dispatch mechanism (ADR-005) — the in-process actor pattern that irpc -services use. +The vault uses **direct method calls** on `VaultServiceHandle` — no actor, +no message enum, no channels, no serialization (ADR-025). The handle is +`Arc<RwLock<…>>` — clone it, share it, call methods +directly. The RwLock provides concurrent reads (derive operations) and +exclusive writes (unlock/lock). -```rust -pub struct VaultServiceActor { - handle: VaultServiceHandle, -} - -impl VaultServiceActor { - pub fn new(handle: VaultServiceHandle) -> Self; - pub async fn run(mut self, mut rx: mpsc::Receiver); - pub fn spawn(handle: VaultServiceHandle) -> (Client, VaultServiceActor); -} +``` +Assembly layer (CLI binary): + 1. Create VaultServiceHandle + 2. Unlock with mnemonic (local, from secure prompt or file) + 3. Call derive/encrypt/decrypt methods directly + 4. Extract bytes, construct alknet-core types at the assembly boundary + 5. Inject into handler capabilities (ADR-014) ``` -- `run(rx)`: Message loop. Each `VaultMessage` variant is dispatched to the - corresponding handle method, and the response is sent through the oneshot - channel embedded in the message. Consumes `self`. -- `spawn(handle)`: Spawn the actor as a `tokio::task` and return a - `Client` for sending messages. **Source bug: the current - `spawn` implementation returns a fresh, unspawned `VaultServiceActor` as - the second tuple element (the spawned actor is consumed by `run`). The - returned actor has no channel and is non-functional. This should be - corrected during implementation sync — either drop the second return - value (return only `Client`) or restructure the API so - the returned actor is the one that was spawned.** +There is no `VaultProtocol` enum, no `VaultServiceActor`, no `Client`, +and no remote dispatch capability. 
The vault is local-only by +construction (ADR-025). If remote vault access is ever needed, it requires +a separate vault-server crate with its own ADR (OQ-21, ADR-025). -The actor pattern is the irpc dispatch mechanism (ADR-005). For local -in-process use, prefer `VaultServiceHandle` directly — no channel, no -serialization. The actor exists for irpc service dispatch, which is an -in-process pattern (the actor and the handle share state via `Arc`). - -### Dispatch paths - -| Path | Type | Serialization | Use case | -|------|------|---------------|----------| -| Direct (in-process) | `VaultServiceHandle` method calls | None | CLI binary at startup (the supported path) | -| Actor (in-process) | `VaultMessage` over mpsc | None (channel) | irpc service dispatch (in-process) | - -Remote vault dispatch — where the vault is exposed over irpc/iroh to -workers or other processes — is **deferred** (OQ-21). The `VaultProtocol` -is already a `RemoteService` by construction (irpc's `#[rpc_requests]` -generates it), and `DerivedKey`'s dual serialization was designed for this. -Enabling remote access is a server-setup change (register `IrohProtocol` -with an ALPN), not a protocol change. - -However, the `IrohProtocol` handler that irpc provides forwards all -message types without auth checks. Remote use needs an **auth-wrapping -handler** in the assembly layer (not the vault crate — the vault is -standalone, ADR-018, and can't import alknet-core's auth model) that: -1. Checks the caller's NodeId against an allowlist -2. Filters `Unlock` and `Lock` messages from remote callers (local-only) -3. Forwards remaining messages to the actor - -See [protocol.md → Remote Capability](protocol.md#remote-capability) for -the full design, operation access policy, use case (machine node → -workers), and breaking-vs-non-breaking analysis. - -The assembly layer (CLI binary) uses the direct path. The actor path -exists for in-process irpc dispatch. 
Neither path is on the alknet call -protocol (ADR-008, ADR-014) — the vault has no ALPN until a future -deployment explicitly registers one with an auth-wrapping handler. +The pre-ADR-025 design had an actor path (mpsc channel + oneshot +backchannels, using irpc's `Service` trait) that was described as +"secondary" to direct calls. ADR-025 removed it — the actor existed only +to make irpc's dispatch work, and the direct path was always preferred. +The RwLock-based concurrency model is both simpler and better for +throughput (concurrent reads vs. sequential processing). ## Errors ```rust -#[derive(Debug, thiserror::Error, Serialize, Deserialize)] +#[derive(Debug, thiserror::Error)] pub enum VaultServiceError { VaultLocked, // called derive/encrypt/decrypt while locked AlreadyUnlocked, // called unlock while already unlocked Mnemonic(String), // mnemonic generation/validation failed Derivation(String), // HD derivation failed (bad path, HMAC error) - Encryption(String), // AES-GCM encrypt/decrypt failed - InvalidPath(String), // derivation path is malformed + Encryption(String), // AES-GCM encrypt/decrypt failed + InvalidPath(String), // derivation path is malformed UnsupportedKeyType, // secp256k1 called without the feature } ``` -`VaultServiceError` is `Serialize`/`Deserialize` (for irpc dispatch) and -wraps sub-errors as strings. It does not implement `From` for alknet-core -error types — the CLI binary converts at the assembly boundary (ADR-018). +`VaultServiceError` is a plain `thiserror::Error` enum (ADR-025 dropped +the `Serialize`/`Deserialize` derives that were needed for irpc dispatch). +It wraps sub-errors as strings. The CLI binary converts vault errors to +alknet-core error types at the assembly boundary (ADR-018). ## Design Decisions @@ -349,18 +315,19 @@ error types — the CLI binary converts at the assembly boundary (ADR-018). 
| Version-indexed paths for rotation | [ADR-021](../../decisions/021-key-rotation-via-version-indexed-paths.md) | `decrypt` selects key by version; `rotate` re-encrypts | | RwLock for thread safety | — | Multiple readers (derive), exclusive writer (unlock/lock) | | TTL + LRU cache | — | Bounded memory, fresh keys, zeroized eviction | -| Actor for in-process irpc dispatch | [ADR-005](../../decisions/005-irpc-as-call-protocol-foundation.md) | irpc message dispatch; not on the call protocol | +| Direct method calls (no actor) | [ADR-025](../../decisions/025-vault-local-only-dispatch.md) | No irpc, no message enum, no remote dispatch capability | | `derive_password` not cached | — | One-shot; caching grows cache with no reuse | ## Open Questions See [open-questions.md](../../open-questions.md) for full details. -- **OQ-21** (deferred): Remote vault access — the `VaultProtocol` is - remote-capable by construction (irpc `RemoteService`). Enabling remote - access is a server-setup change with an auth-wrapping handler in the - assembly layer. `Unlock`/`Lock` are local-only; other operations are - remote-capable. See [protocol.md → Remote Capability](protocol.md#remote-capability). +- **OQ-21** (resolved by ADR-025): Remote vault access is not a feature + of the vault crate. The vault is local-only by construction — direct + method calls on `VaultServiceHandle`, no remote dispatch capability. + If remote access is ever needed, it requires a separate vault-server + crate with its own ADR. See [protocol.md → Local-Only by + Construction](protocol.md#local-only-by-construction). ## Security Constraints @@ -405,5 +372,5 @@ don't miss them. 
- Tests: `crates/alknet-vault/tests/service_tests.rs`, `crates/alknet-vault/src/service.rs` (unit tests), `crates/alknet-vault/src/cache.rs` (unit tests) -- [protocol.md](protocol.md) — `VaultMessage` and `DerivedKey` +- [protocol.md](protocol.md) — `DerivedKey` and `KeyType` - [encryption.md](encryption.md) — `encrypt` / `decrypt` cryptographic details \ No newline at end of file diff --git a/docs/architecture/decisions/003-crate-decomposition.md b/docs/architecture/decisions/003-crate-decomposition.md index d3cd683..84711dc 100644 --- a/docs/architecture/decisions/003-crate-decomposition.md +++ b/docs/architecture/decisions/003-crate-decomposition.md @@ -25,7 +25,7 @@ The workspace decomposes into the following crates: | Crate | Responsibility | Depends on | |-------|---------------|------------| | `alknet-core` | ProtocolHandler trait, ALPN router, endpoint, BiStream, AuthContext, IdentityProvider, config, ArcSwap dynamic config | tokio, quinn, rustls, irpc, iroh (feature-gated, added by ADR-010) | -| `alknet-vault` | Local key vault: BIP39/SLIP-0010/AES-GCM key derivation, encryption, VaultProtocol dispatch | (standalone, no alknet-core) | +| `alknet-vault` | Local key vault: BIP39/SLIP-0010/AES-GCM key derivation, encryption | (standalone, no alknet-core) | | `alknet-ssh` | SshAdapter (russh, SOCKS5, port forwarding) | alknet-core, russh | | `alknet-call` | CallAdapter (JSON-RPC via irpc, operation registry, pub/sub, access control, call protocol client, adapter traits) | alknet-core, irpc | | `alknet-agent` | Agent service: LLM execution loop (forked aisdk), tool dispatch via call protocol, provider key retrieval via vault | alknet-call | diff --git a/docs/architecture/decisions/005-irpc-as-call-protocol-foundation.md b/docs/architecture/decisions/005-irpc-as-call-protocol-foundation.md index 7975f33..fe4be98 100644 --- a/docs/architecture/decisions/005-irpc-as-call-protocol-foundation.md +++ 
b/docs/architecture/decisions/005-irpc-as-call-protocol-foundation.md @@ -30,7 +30,12 @@ This means: - The TypeScript operation and pub/sub patterns that can import OpenAPI schemas, wrap MCP servers, and expose operations as endpoints are supported at the protocol level — the adapter contract (from_*, to_*) is defined in Rust (see ADR-013) - Future NAPI and WASM clients speak the same wire format — alknet-napi projects the Rust call protocol client to Node.js; a browser SDK can be adapted from the existing TypeScript code -The `VaultProtocol` in alknet-vault also uses irpc as its service protocol. This is consistent — alknet-vault's irpc service is an independent service that happens to use the same framing, not a dependency on alknet-call. +The `VaultProtocol` in alknet-vault previously used irpc as its service +protocol. ADR-025 dropped irpc from the vault — the vault uses direct method +calls on `VaultServiceHandle`, not irpc dispatch. irpc remains the +foundation for alknet-*call* (the call protocol), not for alknet-*vault*. +See ADR-025 for the rationale (security default inversion: the vault is +local-only by construction, not remote-capable by default). ## Consequences @@ -39,7 +44,6 @@ The `VaultProtocol` in alknet-vault also uses irpc as its service protocol. 
This - JSON Schema compatible — OpenAPI import, MCP tool exposure, cross-language client generation - No need to design a custom RPC wire format — irpc's is already battle-tested - The call protocol inherits irpc's streaming and subscription patterns -- Consistency with alknet-vault's service model — both use irpc **Negative:** - alknet-call depends on irpc — if irpc has limitations or bugs, we're affected (mitigated: irpc is lightweight and we can fork if needed) diff --git a/docs/architecture/decisions/008-secret-service-integration.md b/docs/architecture/decisions/008-secret-service-integration.md index 849ee6b..ef5ec68 100644 --- a/docs/architecture/decisions/008-secret-service-integration.md +++ b/docs/architecture/decisions/008-secret-service-integration.md @@ -6,11 +6,11 @@ Accepted ## Context -alknet-vault (formerly alknet-secret) is a standalone crate with zero alknet crate dependencies. It provides BIP39 mnemonic generation, SLIP-0010 Ed25519 HD key derivation, AES-256-GCM encryption, and an irpc-based `VaultProtocol` for message dispatch. It is already implemented and stable. +alknet-vault (formerly alknet-secret) is a standalone crate with zero alknet crate dependencies and zero RPC framework dependencies (ADR-025). It provides BIP39 mnemonic generation, SLIP-0010 Ed25519 HD key derivation, AES-256-GCM encryption, and a direct-method-call API (`VaultServiceHandle`). It is already implemented and stable, pending the ADR-025 refactor to drop irpc. The question (OQ-08) was: how does the rest of the alknet system access alknet-vault's capabilities? The options were: -1. **irpc service over `alknet/call`**: Other services call vault operations through the call protocol. +1. **Call protocol exposure**: Other services call vault operations through the call protocol. 2. **ALPN handler on `alknet/secret`**: alknet-vault implements ProtocolHandler and gets its own ALPN. 3. 
**Direct library dependency**: alknet-core or handler crates depend on alknet-vault directly, breaking its independence. 4. **CLI-embedded with call protocol exposure**: The CLI binary instantiates VaultServiceHandle locally and registers vault operations in the call protocol's registry. @@ -64,9 +64,9 @@ This is analogous to the reverse-proxy admin key pattern (ADR-028 in the reverse ## References - ADR-003: Crate decomposition (alknet-vault is standalone) -- ADR-005: irpc as call protocol foundation +- ADR-005: irpc as call protocol foundation (for alknet-call; the vault no longer uses irpc — see ADR-025) - ADR-009: One-way door decision framework - ADR-014: Secret material flow and capability injection (specifies the mechanism this ADR described in prose) +- ADR-025: Vault local-only dispatch (dropped irpc from the vault; direct method calls only) - OQ-08: Secret service integration point (resolved by this ADR, refined by ADR-014) -- alknet-vault implementation: `crates/alknet-vault/` -- Reverse-proxy ADR-028: Admin HTTP API (analogous key management pattern) \ No newline at end of file +- alknet-vault implementation: `crates/alknet-vault/` \ No newline at end of file diff --git a/docs/architecture/decisions/018-vault-standalone-crate.md b/docs/architecture/decisions/018-vault-standalone-crate.md index a7cd8cf..b1dab6a 100644 --- a/docs/architecture/decisions/018-vault-standalone-crate.md +++ b/docs/architecture/decisions/018-vault-standalone-crate.md @@ -17,7 +17,7 @@ The question is: what does alknet-vault depend on? The candidates: pulls QUIC, quinn, iroh, rustls, and tokio runtime dependencies into the vault's dependency tree. 2. **Stand alone** — zero alknet crate dependencies. The vault defines its own - types, its own error enum, its own irpc protocol. Other crates depend on + types, its own error enum. Other crates depend on the vault; the vault depends on nothing in alknet. This is a one-way door. 
Once the vault depends on alknet-core, reversing it @@ -51,22 +51,24 @@ The vault defines its own types and traits: derived key material - `DerivedKey`, `KeyType` — protocol-level key representation - `EncryptedData`, `EncryptionKey` — AES-256-GCM blobs -- `VaultServiceHandle`, `VaultServiceActor` — runtime API -- `VaultProtocol` — irpc message enum (in-process dispatch) +- `VaultServiceHandle` — runtime API (direct method calls; no actor, no + message enum — see ADR-025) - `VaultServiceError` — its own error enum (string-wrapped sub-errors; the vault doesn't share an error type with alknet-core) -The `VaultProtocol` uses irpc directly (see ADR-005), not through alknet-call. -This is consistent: irpc is a lightweight framing library, not an alknet -crate. The vault's irpc usage is an in-process dispatch mechanism, not a -network-exposed service. +The vault uses direct method calls on `VaultServiceHandle`, not irpc +dispatch (ADR-025). The vault is local-only by construction — no remote +dispatch capability, no `RemoteService` trait, no wire format for vault +messages. If remote vault access is ever needed, it's a separate crate that +wraps the vault (see ADR-025, OQ-21). ## Decision **alknet-vault has zero alknet crate dependencies.** It depends only on external crates (`bip39`, `ed25519-bip32`, `aes-gcm`, `sha2`, `hmac`, -`secp256k1`, `irpc`, `tokio` for the actor's sync primitives, `serde`, -`zeroize`, `thiserror`, `base64`, `rand`). +`secp256k1`, `tokio` for `RwLock` sync primitives, `serde`, +`zeroize`, `thiserror`, `base64`, `rand`). ADR-025 dropped `irpc`, +`irpc-derive`, and `postcard` — the vault no longer uses irpc dispatch. The vault does not depend on: - `alknet-core` — no shared types, no `Identity`, no `AuthContext` @@ -90,8 +92,9 @@ CLI binary is the sole integration point (ADR-008, ADR-019). The vault defines its own types and does not share types with alknet-core: -- `VaultServiceError` is the vault's error enum. 
It is `Serialize`/`Deserialize` - (for irpc dispatch) and wraps sub-errors as strings. It does not implement +- `VaultServiceError` is the vault's error enum. It is a plain + `thiserror::Error` (ADR-025 dropped irpc, so vault errors no longer need + `Serialize`/`Deserialize` for wire dispatch). It does not implement `From` for alknet-core error types — the CLI binary converts at the assembly boundary. - `DerivedKey` is the vault's key representation. It is not shared with @@ -154,7 +157,10 @@ The vault defines its own types and does not share types with alknet-core: ## References - ADR-003: Crate decomposition (alknet-vault is standalone) -- ADR-005: irpc as call protocol foundation (vault uses irpc directly) +- ADR-005: irpc as call protocol foundation (irpc remains the foundation + for alknet-*call*; the vault no longer uses irpc — see ADR-025) +- ADR-025: Vault local-only dispatch (dropped irpc from the vault; the + vault uses direct method calls, no actor, no remote capability) - ADR-008: Vault integration point (CLI-embedded, assembly-layer only) - ADR-014: Secret material flow and capability injection - ADR-019: Vault assembly-layer-only access diff --git a/docs/architecture/decisions/019-vault-assembly-layer-only.md b/docs/architecture/decisions/019-vault-assembly-layer-only.md index c833924..4a50d43 100644 --- a/docs/architecture/decisions/019-vault-assembly-layer-only.md +++ b/docs/architecture/decisions/019-vault-assembly-layer-only.md @@ -131,9 +131,13 @@ that door; it simply does not open it. assembly layer, not just registering an operation. This is a feature: it forces an explicit decision about what secret material a handler needs. - Remote vault administration (unlock a running node's vault over the - network) is not supported. If needed in the future, it requires a - separate, heavily restricted mechanism (admin scope, mTLS-only, never - expose the mnemonic over an unauthenticated channel) and its own ADR. + network) is not supported. 
The vault is local-only by construction + (ADR-025) — no remote dispatch capability exists in the vault crate. If + remote vault access is needed in the future, it requires a separate + vault-server crate that depends on both alknet-core (for auth) and + alknet-vault (for the handle), with a heavily restricted mechanism + (admin scope, mTLS-only, never expose the mnemonic over an + unauthenticated channel) and its own ADR. ## Assumptions diff --git a/docs/architecture/decisions/025-vault-local-only-dispatch.md b/docs/architecture/decisions/025-vault-local-only-dispatch.md new file mode 100644 index 0000000..5433519 --- /dev/null +++ b/docs/architecture/decisions/025-vault-local-only-dispatch.md @@ -0,0 +1,314 @@ +# ADR-025: Vault Local-Only Dispatch + +## Status + +Accepted + +## Context + +alknet-vault uses irpc for its internal dispatch. The `VaultProtocol` enum is +annotated with `#[rpc_requests(message = VaultMessage, no_spans)]`, which +generates a `Service` trait impl (for in-process mpsc dispatch) and a +`RemoteService` trait impl (for remote QUIC dispatch). The vault's +`VaultServiceActor` processes `VaultMessage` variants from an mpsc channel. +This was adopted from irpc's actor pattern (ADR-005). + +### What irpc gives the vault + +Separating irpc into its constituent parts and asking which the vault +actually needs: + +| irpc component | What it does | Does the vault need it? 
| +|---|---|---| +| `#[rpc_requests]` macro | Generates message enum, `Channels` impls, `From` conversions | Marginally — it's convenient boilerplate, but the vault's protocol is 8 variants | +| `Service` trait | Local in-process dispatch via mpsc + oneshot | No — `VaultServiceHandle` direct calls are already preferred (service.md: "For local in-process use, prefer `VaultServiceHandle` directly — no channel, no serialization") | +| `RemoteService` trait | Remote dispatch via QUIC + postcard | No — this is the footgun | +| `Client` | Wraps either local mpsc or remote QUIC | No — the assembly layer uses the handle directly | +| `IrohProtocol` handler | Forwards all messages without auth | No — this is the default-insecure handler | +| postcard serialization | Binary serialization for remote dispatch | No — not needed without remote dispatch | +| `DerivedKey` dual serialization | JSON redacts, postcard preserves | Only needed *because* remote dispatch exists | + +The vault uses irpc for the actor pattern (in-process mpsc dispatch), but +the actor pattern is the *secondary* dispatch path. The primary path — direct +method calls on `VaultServiceHandle` — doesn't use irpc at all. And the thing +that makes irpc attractive for the actor pattern (the macro-generated +boilerplate) is a convenience, not a structural need. The vault's protocol +is small enough that the boilerplate is manageable by hand, or simply +unnecessary when the actor is removed. + +### The security problem: default-insecure + +The core problem is not that remote vault access is *possible* in principle +— it's that irpc makes it possible *by default*, with the unsafe path being +the easy path. + +The `#[rpc_requests]` macro generates `RemoteService` unless you pass +`no_rpc`. The `IrohProtocol` handler forwards all message types without auth +checks. The docs frame "register an ALPN" as a server-setup change +(OQ-21: "Enabling remote access is a server-setup change"). 
The result is +an architecture where: + +1. The vault is remote-capable by construction (the footgun is loaded). +2. Enabling remote access is easy — one line: `Router::builder(endpoint) + .accept(b"alknet/vault", protocol).spawn()`. +3. The default handler has no auth (the safety is off). +4. Making it safe requires an auth-wrapping handler *outside the vault + crate* (the safety is a separate part you have to remember to install). + +This is the **default-insecure anti-pattern**. Security should be opt-in, not +opt-out. The vault should be local-only by default, and remote access should +require *adding* something, not *removing* a default. + +### The use cases don't justify the default + +**Single node, local vault (the designed path):** The CLI binary unlocks the +vault at startup, derives/decrypts credentials, injects them into handler +capabilities. The vault is accessed only at the assembly layer (ADR-019). No +network. This is the path every deployment starts with, and it needs only +direct in-process method calls on `VaultServiceHandle`. irpc adds nothing. + +**Many nodes encrypt/decrypt the same data:** The most likely network-vault +use case, but a stretch. The better pattern is per-node vaults: the head +encrypts credentials *for* the worker using the worker's public key or a +shared derivation path the worker can derive locally. The worker decrypts +locally. This is end-to-end encryption between nodes, not a centralized +decryption oracle. It matches ADR-008's "capability source" model — +credentials are injected at the assembly layer, not fetched over the network +at call time. + +**Machine node → workers (OQ-21's use case):** A long-lived machine node +holds the mnemonic and exposes a restricted vault API to ephemeral workers. +This is the use case the vault docs actually spec. But `from_call`'s trust +model already flags the risk: "a compromised remote node can do anything its +operations are declared to do" (operation-registry.md). 
If the machine node +is compromised, every worker that calls it is compromised. That's inherent +to remote vault access and not a reason to forbid it, but it *is* a reason +to make the exposure a deliberate, hard-to-accidentally-enable act — not the +default state of the crate. + +None of these use cases justify making the vault remote-capable *by +construction*. The first needs no remote. The second has a better pattern +(per-node vaults). The third is real but should be an explicit addition, not +a default that's already loaded. + +### The actor path is dead code + +service.md says "For local in-process use, prefer `VaultServiceHandle` +directly — no channel, no serialization." The actor exists *for* irpc, and +the direct path is preferred. So the vault has two dispatch paths, and the +one irpc provides (actor) is the secondary one. The primary path (direct +method calls) doesn't use irpc at all. The actor is dead code for the +designed use case — it exists only to make irpc's `Service` trait work, +which exists only to make `RemoteService` work, which is the footgun. + +## Decision + +### 1. alknet-vault drops irpc entirely + +The vault's dispatch is direct method calls on `VaultServiceHandle`. No +`VaultProtocol` enum, no `VaultMessage`, no `VaultServiceActor`, no mpsc +channel, no `Service` trait, no `RemoteService` trait, no `Client`, no +`IrohProtocol` handler, no postcard serialization. + +The vault's public API is `VaultServiceHandle` (and the types it returns: +`DerivedKey`, `KeyType`, `EncryptedData`, `EncryptionKey`). That's it. An +implementer reading the vault crate sees one way to use it, not two ways +with a note saying "prefer the first." + +### 2. The vault is local-only by construction + +The vault crate has no remote dispatch capability. There is no +`RemoteService` trait, no remote handler, no wire format for vault messages. 
+Enabling remote vault access is not a flag flip or a server-setup change — +it requires *building a separate crate* that depends on both alknet-core +(for auth) and alknet-vault (for the handle) and adds the remote transport ++ auth-wrapping handler. That is a visible architectural act that shows up +in code review, not a runtime config flip on a macro that was already +generating the remote code. + +This inverts the security default: local-only is the only mode. Remote +access requires adding something, not removing a default. + +### 3. `DerivedKey` serialization simplifies + +Without the postcard/remote-dispatch path, `DerivedKey`'s custom +`Serialize` always redacts the private key (for logging safety) — there is +no "postcard preserves bytes" path. The custom `Deserialize` rejects +`private_key == "[REDACTED]"` with an error rather than producing a +corrupted key (this resolves review #002 finding W8). + +The redaction is purely for defense-in-depth against logging accidents. +The architectural control — `DerivedKey` never appears in call protocol +payloads (ADR-014) — is unchanged and remains the primary control. The +serialization redaction is the safety net, not the primary mechanism. + +`VaultServiceError` no longer needs `Serialize`/`Deserialize` (which it had +for irpc dispatch). It can be a plain `thiserror::Error` enum. If a future +remote-vault crate needs to serialize errors across the wire, *that crate* +defines the wire representation. + +### 4. If remote vault access is ever needed, it's a separate crate + +The vault-server-crate question (review #002 C7) is decided: *if* remote +vault access is ever needed, it is a separate crate that depends on both +alknet-core (for `IdentityProvider`, scopes, auth-wrapping) and +alknet-vault (for `VaultServiceHandle`). The vault crate itself remains +local-only. This is a decision not to create the crate now, and not to +preclude it. 
It is the path of least commitment, and it matches ADR-018's +standalone-vault principle. + +The remote vault crate would need its own ADR (matching ADR-019's language: +"requires its own ADR") defining the threat model, the access policy, the +auth-wrapping handler, and the operation filtering (Unlock/Lock local-only). + +### 5. The vault's dependency footprint shrinks + +The vault drops: `irpc`, `irpc-derive`, `postcard` (for remote), `noq` +(via irpc), `iroh` (via irpc-iroh). It retains: `bip39`, `ed25519-bip32`, +`aes-gcm`, `sha2`, `hmac`, `secp256k1` (feature-gated), `tokio` (for +`RwLock` sync primitives, not for channels), `serde` (for `DerivedKey` +redaction and `EncryptedData` wire format), `zeroize`, `thiserror`, `base64`, +`rand`. + +ADR-018's "zero alknet crate dependencies" becomes "zero alknet crate +dependencies and zero RPC framework dependencies." This is the cleanest +version of ADR-018's intent. + +## Consequences + +**Positive:** + +- The security default is inverted. Local-only is the only mode. Remote + access requires building a separate crate — a visible, deliberate act. + This matches the principle that security should be opt-in, not opt-out. +- The vault's API is honest. `VaultServiceHandle` is the API. No secondary + dispatch path that exists for a feature (remote) that isn't enabled. An + implementer sees one way to use the vault, not two with a note saying + "prefer the first." +- Dead code is removed. The actor path, which service.md says is secondary + to direct calls, is gone. The `VaultProtocol` enum, `VaultMessage`, + `VaultServiceActor`, and the mpsc dispatch loop are gone. The vault is a + pure library with a thread-safe handle. +- `DerivedKey` serialization simplifies. The dual serialization (JSON + redacts, postcard preserves) is replaced by always-redact-on-serialize, + reject-on-deserialize. No "postcard preserves bytes" path to test or + document. 
This resolves review #002 W8 (silent corruption on + JSON-deserialized `DerivedKey`) — the custom `Deserialize` rejects + redacted payloads with an error. +- The dependency footprint shrinks. No irpc, no postcard-for-remote, no + noq, no iroh via irpc. The vault is truly standalone (ADR-018's intent, + strengthened). Supply-chain surface is reduced. +- The vault's concurrency model is honest. `VaultServiceHandle` is + `Arc<RwLock<_>>` — the RwLock provides concurrent reads (derive) and + exclusive writes (unlock/lock). The actor's sequential processing was + actually *worse* for throughput than the RwLock. Removing the actor + makes the concurrency model visible and correct. + +**Negative:** + +- The vault's `VaultProtocol` enum and `VaultServiceActor` are removed. + This is a breaking change to the vault crate's public API (`VaultProtocol`, + `VaultMessage`, `VaultServiceActor`, `Client` are removed + from the public exports). Since no implementation consumer exists outside + the vault crate itself (ADR-019: the assembly layer uses + `VaultServiceHandle` directly), this is a spec edit, not a migration. +- If a future use case needs the actor pattern (e.g., for a remote-vault + crate that wants in-process mpsc dispatch before forwarding over the + wire), it must be re-added in *that crate*, not in the vault. This is + additive — the vault's direct-handle API is unchanged. +- The `DerivedKey` postcard round-trip tests in `protocol.rs` are removed. + The JSON-redaction tests remain. If a future remote-vault crate needs + postcard serialization, it defines and tests its own serialization path + for the types it sends over the wire. +- `VaultServiceError` loses `Serialize`/`Deserialize`. Any code that + serialized vault errors (only the irpc dispatch path, which is removed) + must adapt. The assembly layer converts vault errors to alknet-core + errors at the boundary (ADR-018), and that conversion is string-based + already. 
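The concurrency claim in the consequences above (shared reads for derive, exclusive writes for unlock/lock) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `VaultHandle`, `VaultState`, `VaultError`, and the XOR "derivation" are hypothetical stand-ins, not the vault crate's real types or cryptography, and the sketch uses `std::sync::RwLock` where the ADR names tokio's `RwLock`.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical error type: a plain enum with no Serialize/Deserialize.
#[derive(Debug, PartialEq)]
pub enum VaultError {
    Locked,
}

/// Hypothetical inner state: `Some(seed)` when unlocked, `None` when locked.
struct VaultState {
    seed: Option<Vec<u8>>,
}

/// Cloneable handle; every clone shares one state behind the RwLock.
#[derive(Clone)]
pub struct VaultHandle {
    inner: Arc<RwLock<VaultState>>,
}

impl VaultHandle {
    pub fn new() -> Self {
        Self {
            inner: Arc::new(RwLock::new(VaultState { seed: None })),
        }
    }

    /// Exclusive write: blocks until all readers have released the lock.
    pub fn unlock(&self, seed: &[u8]) {
        let mut state = self.inner.write().expect("lock poisoned");
        state.seed = Some(seed.to_vec());
    }

    /// Exclusive write: drops the seed (a real vault would also zeroize it).
    pub fn lock(&self) {
        let mut state = self.inner.write().expect("lock poisoned");
        state.seed = None;
    }

    /// Shared read: many callers may derive at once.
    /// The XOR below is a toy placeholder, NOT a key derivation function.
    pub fn derive(&self, index: u8) -> Result<Vec<u8>, VaultError> {
        let state = self.inner.read().expect("lock poisoned");
        let seed = state.seed.as_ref().ok_or(VaultError::Locked)?;
        Ok(seed.iter().map(|b| b ^ index).collect())
    }
}
```

Under these assumptions, two threads calling `derive` hold the read lock concurrently, whereas an actor loop would process the same two requests one after the other.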
+ +**On review #002 findings resolved by this ADR:** + +- **C7 (OQ-21 remote vault)**: resolved. OQ-21 moves from "deferred" to + "resolved: remote vault access is not a feature of the vault crate; if + needed, a separate vault-server crate wraps the vault and adds remote + transport + auth, requiring its own ADR." The vault-server-crate question + is decided: not created now, not precluded. The crate-decomposition + one-way door (ADR-003 territory) is decided by *not* creating the crate + now. +- **W8 (`DerivedKey` JSON deserialization silently corrupts)**: resolved. + Without the postcard path, the custom `Deserialize` rejects + `private_key == "[REDACTED]"` with an error. There is no + "postcard preserves bytes" path to complicate the serialization story. + The redaction is purely for logging safety; deserialization of a redacted + payload is always an error. +- **C8 (operation access policy table incomplete)**: dissolved. Without + `VaultProtocol`'s remote capability, there is no operation access policy + table to complete — all operations are local-only by default. The table + in protocol.md goes away. If a future vault-server crate exposes some + operations remotely, *that crate* defines the access policy in its own + ADR. + +## Assumptions + +1. **The vault's designed use case is local-only.** ADR-019 says the + assembly layer is the sole direct caller. ADR-008 says the vault is a + capability source accessed at assembly time. ADR-014 says handlers + receive credentials through `OperationContext.capabilities`, not by + calling vault operations. The vault was always designed to be local — + irpc's remote capability was an accident of adoption, not a designed + feature. + +2. **Per-node vaults are the right pattern for multi-node deployments.** + Each node has its own vault and mnemonic. Credentials are encrypted *for* + the receiving node's public key, not decrypted centrally. This is + end-to-end encryption, not a centralized decryption oracle. 
If this + assumption is wrong (a use case truly requires centralized vault + access), a remote-vault crate is the answer — not making the vault + remote-capable by default. + +3. **The actor pattern's sequential processing is not needed.** + `VaultServiceHandle`'s `Arc<RwLock<_>>` provides concurrent reads + (derive operations) and exclusive writes (unlock/lock). The actor's + sequential processing was a constraint, not a feature — it serialized + all operations including independent reads. The RwLock is the better + concurrency model for this workload. + +4. **The vault's protocol is small enough that macro-generated boilerplate + is not a maintenance burden.** With 8 operations, the + `VaultServiceHandle` method signatures *are* the protocol. There is no + need for a separate protocol enum when the handle's methods are the + API. If the vault grew to dozens of operations (unlikely given its + scope), a protocol enum could be re-introduced — but it would be a + local enum, not an irpc-generated one. + +5. **`DerivedKey` never needs to cross a wire format that preserves + private key bytes.** The architectural control (ADR-014: + `DerivedKey` never appears in call protocol payloads) means + `DerivedKey` is always used in-process. The redacting `Serialize` impl + is for logging safety (defense-in-depth), not for wire transport. If a + future remote-vault crate needs to send `DerivedKey` over the wire, it + defines its own serialization for that context — the vault's + `DerivedKey` stays redact-always. 
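The redact-always, reject-on-read behavior described for `DerivedKey` can be sketched as follows. This is an illustration under stated assumptions, not the crate's implementation: the ADR describes custom serde `Serialize`/`Deserialize` impls, while this std-only sketch substitutes `Display` and `FromStr`; the `path` field, the `path:hex` text layout, and `ParseError` are all hypothetical.

```rust
use std::fmt;
use std::str::FromStr;

const REDACTED: &str = "[REDACTED]";

/// Hypothetical shape; the real `DerivedKey` fields are not shown in the ADR.
pub struct DerivedKey {
    pub path: String,
    private_key: Vec<u8>,
}

#[derive(Debug, PartialEq)]
pub enum ParseError {
    /// The payload came from the redacting writer and holds no key material.
    Redacted,
    Malformed,
}

impl DerivedKey {
    pub fn new(path: &str, private_key: Vec<u8>) -> Self {
        Self { path: path.to_string(), private_key }
    }

    pub fn private_key(&self) -> &[u8] {
        &self.private_key
    }
}

/// Writing always redacts: the private key never reaches the output.
impl fmt::Display for DerivedKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}:{}", self.path, REDACTED)
    }
}

/// Debug delegates to Display so `{:?}` in a log line is redacted too.
impl fmt::Debug for DerivedKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Display::fmt(self, f)
    }
}

/// Reading rejects a redacted payload instead of building a corrupted key.
impl FromStr for DerivedKey {
    type Err = ParseError;

    fn from_str(s: &str) -> Result<Self, ParseError> {
        let (path, key) = s.rsplit_once(':').ok_or(ParseError::Malformed)?;
        if key == REDACTED {
            return Err(ParseError::Redacted);
        }
        let bytes = decode_hex(key).ok_or(ParseError::Malformed)?;
        Ok(DerivedKey::new(path, bytes))
    }
}

/// Decodes an even-length string of ASCII hex digits into bytes.
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    if s.is_empty() || s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}
```

The point of the sketch is the asymmetry: output is always safe to log, and feeding that output back in is an explicit error rather than a key whose bytes are the literal string `[REDACTED]`.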
+ +## References + +- ADR-005: irpc as call protocol foundation (this ADR amends the vault + reference in ADR-005's Decision and Consequences; irpc remains the + foundation for alknet-*call*, just not for alknet-*vault*) +- ADR-008: Vault integration point (the vault is a capability source + accessed at assembly time — this ADR makes that the *only* mode) +- ADR-014: Secret material flow and capability injection (`DerivedKey` + never appears in call protocol payloads — the redacting `Serialize` + is defense-in-depth for logging, not for wire transport) +- ADR-018: Vault as standalone crate (this ADR strengthens the + standalone principle: zero alknet crate dependencies *and* zero RPC + framework dependencies) +- ADR-019: Vault assembly-layer-only access (this ADR makes the vault + local-only, not just assembly-layer-only-for-direct-calls) +- OQ-21: Remote vault administration (resolved by this ADR — not a vault + crate feature; if needed, a separate crate with its own ADR) +- docs/reviews/002-pre-implementation-architecture-sanity-check.md + (findings C7, C8, W8 — resolved or dissolved by this ADR) +- irpc design patterns: `docs/research/references/iroh/irpc/09-design-patterns-and-examples.md` + (Pattern 3: `no_rpc` flag — this ADR goes further by dropping irpc + entirely, since the actor pattern is also unnecessary) \ No newline at end of file diff --git a/docs/architecture/open-questions.md b/docs/architecture/open-questions.md index c596dcf..27b4621 100644 --- a/docs/architecture/open-questions.md +++ b/docs/architecture/open-questions.md @@ -267,37 +267,17 @@ These questions are acknowledged but not active. 
They will be promoted to open w ### OQ-21: Remote Vault Administration - **Origin**: [service.md](crates/vault/service.md), [protocol.md](crates/vault/protocol.md), ADR-019 -- **Status**: deferred -- **Door type**: One-way (if implemented — wire format exposure), two-way (enabling is non-breaking) +- **Status**: resolved +- **Door type**: One-way (vault crate is local-only by construction) - **Priority**: medium -- **Resolution**: The `VaultProtocol` is a remote-capable irpc service by construction — the `#[rpc_requests]` macro generates both `Service` (local) and `RemoteService` (remote) trait implementations. `DerivedKey`'s dual serialization (JSON redacts private key for safety; postcard preserves bytes for remote dispatch) was designed for this. Enabling remote vault access is a server-setup change (register `IrohProtocol` with an ALPN), not a protocol change. +- **Resolution**: Remote vault access is **not a feature of the vault crate**. ADR-025 dropped irpc from the vault, making the vault local-only by construction — no `RemoteService` trait, no wire format for vault messages, no default-insecure remote handler. The vault's API is `VaultServiceHandle` (direct method calls), nothing else. - **What's already in place:** - - Protocol: `VaultProtocol` is already a `RemoteService` - - Serialization: `DerivedKey` redacts in JSON, preserves in postcard - - Actor: `VaultServiceActor` processes all message types, transport-agnostic - - Auth transport: irpc over iroh uses iroh's QUIC connections (NodeId auth, RFC 7250 raw keys) + If remote vault access is ever needed (e.g., the machine→worker pattern), it requires a **separate vault-server crate** that depends on both alknet-core (for `IdentityProvider`, scopes, auth-wrapping) and alknet-vault (for `VaultServiceHandle`). That crate would define its own threat model, access policy, operation filtering (Unlock/Lock local-only), and wire format — and requires its own ADR. 
This is a deliberate addition, not a flag flip on a default that was already loaded. - **What's not in place (the gap):** - - The `IrohProtocol` handler forwards all message types without auth checks - - Remote use needs: (1) NodeId allowlist, (2) message filtering (reject `Unlock`/`Lock` from remote callers), (3) forwarding to the actor - - This auth-wrapping handler cannot live in the vault crate (standalone, ADR-018) — it needs alknet-core's auth model (`IdentityProvider`, scopes). It lives in the assembly layer or a dedicated vault-server crate that depends on both alknet-core and alknet-vault. + The pre-ADR-025 deferral framed remote access as "non-breaking" (the wire format was additive). That framing was misleading: once workers build dependencies on the remote vault API, disabling it breaks them — the door is operationally one-way even if the wire format is additive. ADR-025 inverts the default: the vault is local-only by construction, and remote access requires building something new, not removing a default. - **Operation access policy:** - - `Unlock` and `Lock` are local-only (mnemonic and lock control must not be remotely accessible) - - All other operations (`DeriveEd25519`, `DeriveEncryptionKey`, `DeriveEthereumKey`, `DerivePassword`, `Encrypt`, `Decrypt`) are remote-capable - - The policy is documented in the vault spec; the assembly-layer listener enforces it - - **Use case:** machine node (head, holds mnemonic) exposes restricted vault API to workers (ephemeral, no mnemonic) over irpc/iroh. Per-machine-node vaults, not shared — compartmentalization limits blast radius. - - **What's breaking vs. 
non-breaking:** - - Enabling remote access: non-breaking (server-setup change) - - Restricting operations / adding auth: non-breaking (handler policy) - - Adding new `VaultProtocol` variants: wire break (inherent to irpc; manageable via ALPN versioning `alknet/vault/v2`) - - Changing `DerivedKey` serialization: non-breaking (dual serialization already in place) - - **Why deferred:** the capability is available and the use cases are clear (machine→worker credential access), but no current deployment needs it. The door is left open intentionally — irpc's remote support was chosen for this reason. When a use case materializes, the assembly-layer auth-wrapping handler is the implementation task, not a protocol change. The vault spec documents the policy (which operations are remote-capable) so the future implementer has clear guidance. -- **Cross-references**: ADR-005, ADR-008, ADR-014, ADR-018, ADR-019, [protocol.md](crates/vault/protocol.md), [service.md](crates/vault/service.md) + Per-node vaults are the recommended pattern for multi-node deployments: each node has its own vault and mnemonic; credentials are encrypted *for* the receiving node's public key, not decrypted centrally. This is end-to-end encryption between nodes, matching ADR-008's "capability source" model. +- **Cross-references**: ADR-005, ADR-008, ADR-014, ADR-018, ADR-019, ADR-025, [protocol.md](crates/vault/protocol.md), [service.md](crates/vault/service.md) ### OQ-22: Key Rotation Mechanism diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md index feadba3..3e3c71f 100644 --- a/docs/architecture/overview.md +++ b/docs/architecture/overview.md @@ -106,7 +106,7 @@ See [ADR-002](decisions/002-protocol-handler-trait.md) and [ADR-007](decisions/0 > **Note**: `alknet/agent` is not in the ALPN registry. The agent service is a future consumer that builds on top of `alknet-call` (it depends on `alknet-call`, not `alknet-core` directly — see ADR-003). 
It uses the call protocol for tool dispatch and exposes agent operations (e.g., `/agent/chat`) as call-protocol operations in the `OperationRegistry`, not as a separate ALPN. The agent is a mental model that informed the core architecture (capabilities, scoped env, abort cascade) but is not specced yet — its design will change as it's built out against the implemented core crates. -> **Note**: `alknet/vault` is not in the ALPN registry. alknet-vault is a standalone local key vault with no alknet-core dependency. The CLI binary embeds it and accesses it at the assembly layer — unlocking the vault at startup, deriving and decrypting credentials, and injecting them into handler capabilities. The vault is not exposed over the call protocol. No vault operations are registered in the operation registry. See ADR-008 and ADR-014. +> **Note**: `alknet/vault` is not in the ALPN registry. alknet-vault is a standalone local key vault with no alknet-core dependency and no remote dispatch capability (ADR-025). The CLI binary embeds it and accesses it at the assembly layer — unlocking the vault at startup, deriving and decrypting credentials, and injecting them into handler capabilities. The vault is not exposed over the call protocol. No vault operations are registered in the operation registry. See ADR-008, ADR-014, and ADR-025. ## Authentication @@ -215,6 +215,7 @@ All design decisions are documented as ADRs in [decisions/](decisions/). 
| [022](decisions/022-handler-registration-provenance-and-composition-authority.md) | Handler Registration, Provenance, and Composition Authority | Registration bundle carries provenance, composition authority, scoped env, capabilities; dispatch path reads from bundle | | [023](decisions/023-operation-error-schemas.md) | Operation Error Schemas | Operations declare domain errors; `call.error` carries typed `details`; adapter fidelity for `from_openapi`/`to_openapi` | | [024](decisions/024-operation-registry-layering.md) | Operation Registry Layering | Curated (static) + session/connection overlays (dynamic); `OperationEnv` as trait-object integration point | +| [025](decisions/025-vault-local-only-dispatch.md) | Vault Local-Only Dispatch | Dropped irpc from vault; direct method calls; local-only by construction | ## Open Questions @@ -227,7 +228,7 @@ Open questions are tracked in [open-questions.md](open-questions.md). Key questi - **OQ-08**: Vault integration point (resolved: CLI-embedded, assembly-layer only — see ADR-008, ADR-014, ADR-018, ADR-019) - **OQ-16**: Safe vault operations for call protocol exposure (resolved: none for now — see ADR-014) - **OQ-20**: Encryption key derivation (resolved: HD derivation, not PBKDF2 — see ADR-020) -- **OQ-21**: Remote vault access (deferred: protocol is remote-capable; enabling = server-setup + auth-wrapping handler; Unlock/Lock local-only — see [protocol.md](crates/vault/protocol.md#remote-capability)) +- **OQ-21**: Remote vault access (resolved: vault is local-only by construction — see ADR-025; remote access requires a separate vault-server crate with its own ADR) - **OQ-22**: Key rotation (resolved: version-indexed paths, `rotate` method — see ADR-021) ## Failure Modes