docs(architecture): add ADR-025 — vault local-only dispatch, drop irpc

Drops irpc from alknet-vault entirely. The vault's dispatch is now direct
method calls on VaultServiceHandle — no VaultProtocol enum, no
VaultMessage, no VaultServiceActor, no mpsc channel, no Service trait, no
RemoteService trait, no postcard serialization. The vault is local-only by
construction.

The core security argument: irpc made the vault remote-capable by default
(RemoteService generated unless no_rpc is passed). The IrohProtocol handler
forwards all messages without auth. The docs framed 'register an ALPN' as a
server-setup change. This is the default-insecure anti-pattern — security
should be opt-in, not opt-out. ADR-025 inverts the default: local-only is
the only mode, and remote access requires building a separate vault-server
crate (a visible architectural act, not a flag flip).

The actor path was already dead code — service.md said 'prefer
VaultServiceHandle directly — no channel, no serialization.' The actor
existed only to make irpc's Service trait work, which existed only to make
RemoteService work, which was the footgun. VaultServiceHandle's
Arc<RwLock> provides concurrent reads and exclusive writes — better
throughput than the actor's sequential processing.

DerivedKey serialization simplifies: always redact on serialize (for
logging safety), reject '[REDACTED]' on deserialize with an error. No
'postcard preserves bytes' path. This resolves review #002 W8 (silent
corruption on JSON-deserialized DerivedKey).

Resolves:
- OQ-21: remote vault access — resolved (not deferred). Not a vault crate
  feature; if needed, a separate vault-server crate with its own ADR.
- C7: vault-server-crate question decided — not created now, not precluded.
- C8: operation access policy table dissolved — all operations local-only
  by default; if a vault-server crate exposes some remotely, that crate
  defines the policy.
- W8: DerivedKey JSON deserialization — resolved (reject redacted payloads).

Amends ADR-005 (irpc remains for alknet-call, not for alknet-vault),
ADR-018 (vault is even more standalone — zero RPC framework deps),
ADR-019 (vault is the only layer, not just the only direct-caller layer),
ADR-008 (vault integration point unchanged, but now local-only by
construction).
2026-06-22 14:53:52 +00:00
parent cdf340bec7
commit 7dda6eec68
13 changed files with 527 additions and 368 deletions


@@ -1,58 +1,24 @@
---
status: draft
last_updated: 2026-06-19
last_updated: 2026-06-22-25
---
# Protocol
The `VaultProtocol` irpc message enum, `DerivedKey` type, and serialization
behavior.
The `DerivedKey` type, `KeyType` enum, and serialization behavior. The
vault's "protocol" is the `VaultServiceHandle` method API (ADR-025) — there
is no message enum, no irpc dispatch, and no wire format.
## What
The protocol layer defines the message enum that the irpc dispatch
infrastructure uses (ADR-005) and the `DerivedKey` type that derivation
methods return. This is the vault's internal dispatch protocol — not the
alknet call protocol (the vault has no ALPN, ADR-008).
The vault's dispatch is direct method calls on `VaultServiceHandle`
(ADR-025). The types defined here — `DerivedKey`, `KeyType` — are the
return types from those methods. There is no `VaultProtocol` enum, no
`VaultMessage`, no `VaultServiceActor`, and no remote dispatch capability.
## VaultProtocol
The irpc message enum. The `#[rpc_requests]` macro generates the
`VaultMessage` enum (with `WithChannels` wrappers), `Channels` impls,
`From` impls, and `Service`/`RemoteService` traits for remote dispatch.
```rust
#[rpc_requests(message = VaultMessage, no_spans)]
#[derive(Debug, Serialize, Deserialize)]
pub enum VaultProtocol {
DeriveEd25519 { path: String },
DeriveEncryptionKey { path: String },
DeriveEthereumKey { path: String },
DerivePassword { path: String, length: usize },
Encrypt { plaintext: String, key_version: u32 },
Decrypt { encrypted: EncryptedData },
Lock,
Unlock { mnemonic: String, passphrase: Option<String> },
}
```
Each variant is a vault operation. The `tx` channel type for each variant
is `oneshot::Sender<Result<T, VaultServiceError>>`, where `T` is the
operation's return type (`DerivedKey`, `Vec<u8>`, `EncryptedData`, `String`,
or `()`).
### State requirements
All operations except `Unlock` require the vault to be **unlocked**.
Calling derive/encrypt/decrypt on a locked vault returns
`VaultServiceError::VaultLocked` (not a panic, not a channel close).
### Dispatch
The `VaultServiceActor` (see [service.md](service.md)) processes
`VaultMessage` variants and dispatches to `VaultServiceHandle` methods.
For local in-process use, prefer `VaultServiceHandle` directly — no
channel overhead.
The vault is **local-only by construction**. If remote vault access is ever
needed, it requires a separate crate that wraps the vault and adds remote
transport + auth (ADR-025, OQ-21).
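As a rough illustration of "direct method calls on `VaultServiceHandle`", the sketch below shows a cloneable handle whose state sits behind `Arc<RwLock<_>>`. It is a minimal sketch under stated assumptions, not the crate's implementation. Only the names `VaultServiceHandle`, `DerivedKey`, and `VaultServiceError::VaultLocked` come from this document. The method names, the `VaultState` struct, the struct fields, and the XOR "derivation" are hypothetical placeholders.

```rust
use std::sync::{Arc, RwLock};

/// Only the `VaultLocked` variant name is taken from the spec.
#[derive(Debug, PartialEq)]
pub enum VaultServiceError {
    VaultLocked,
}

/// Illustrative stand-in for the real `DerivedKey` (fields are assumptions).
#[derive(Debug, PartialEq)]
pub struct DerivedKey {
    pub path: String,
    pub private_key: Vec<u8>,
}

/// Hypothetical inner state: `None` while locked, seed bytes once unlocked.
#[derive(Default)]
struct VaultState {
    seed: Option<Vec<u8>>,
}

/// Cloneable handle; every clone shares the same state.
#[derive(Clone, Default)]
pub struct VaultServiceHandle {
    state: Arc<RwLock<VaultState>>,
}

impl VaultServiceHandle {
    /// Takes the write lock: exclusive access while the seed is installed.
    pub fn unlock(&self, seed: Vec<u8>) {
        self.state.write().unwrap().seed = Some(seed);
    }

    /// Takes the write lock: drops the seed.
    pub fn lock(&self) {
        self.state.write().unwrap().seed = None;
    }

    /// Takes the read lock, so several derivations can run concurrently.
    pub fn derive_ed25519(&self, path: &str) -> Result<DerivedKey, VaultServiceError> {
        let state = self.state.read().unwrap();
        let seed = state.seed.as_ref().ok_or(VaultServiceError::VaultLocked)?;
        // Placeholder only, NOT a real key derivation: XOR seed bytes with path bytes.
        let private_key: Vec<u8> = seed
            .iter()
            .zip(path.bytes().cycle())
            .map(|(s, p)| s ^ p)
            .collect();
        Ok(DerivedKey { path: path.to_string(), private_key })
    }
}
```

In this sketch a caller holds a `VaultServiceHandle` clone and calls methods on it directly. There is no message enum to construct and nothing to serialize, and a call on a locked vault returns `VaultServiceError::VaultLocked` as an ordinary `Err` value.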
## DerivedKey
@@ -92,19 +58,23 @@ and zeroized.
### Serialization redaction
`DerivedKey` has a custom `Serialize` impl that redacts the private key in
human-readable formats:
`DerivedKey` has a custom `Serialize` impl that **always** redacts the
private key, regardless of format:
- **JSON** (human-readable): `private_key` serializes as `"[REDACTED]"`.
This is defense-in-depth — if a `DerivedKey` accidentally ends up in a
log or a JSON config, the private key is not exposed.
- **postcard** (binary, used by irpc): `private_key` serializes as the
actual bytes. This is required for in-cluster irpc dispatch to work —
the remote side needs the actual key bytes.
- **Deserialization**: always reads the full bytes, regardless of format.
A JSON-deserialized `DerivedKey` will have `"[REDACTED]"` as its
`private_key` string — this is expected; JSON round-tripping a
`DerivedKey` is not a supported use case (the private key is gone).
- **JSON** (and all human-readable formats): `private_key` serializes as
`"[REDACTED]"`. This is defense-in-depth — if a `DerivedKey` accidentally
ends up in a log, a JSON config, or debug output, the private key is not
exposed.
- **Deserialization**: rejects `private_key == "[REDACTED]"` with an error.
A JSON-deserialized `DerivedKey` with a redacted private key is invalid
and produces a deserialization error, not a corrupted key. This resolves
review #002 W8 (silent corruption on JSON-deserialized `DerivedKey`).
- **No binary-format preservation path.** ADR-025 dropped the postcard/remote
dispatch path that previously preserved private key bytes in binary
formats. `DerivedKey` is always used in-process (ADR-014: never appears
in call protocol payloads). If a future remote-vault crate needs to send
`DerivedKey` over the wire, it defines its own serialization for that
context — the vault's `DerivedKey` stays redact-always.
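The redact-always and reject-on-deserialize rules above can be sketched without any dependencies. This is an illustration under assumptions, not the crate's code: the real impls are serde `Serialize`/`Deserialize`, whereas here two plain functions (`to_fields`, `from_fields`) stand in for them. The `DeserializeError` type and the two-field layout of `DerivedKey` are also hypothetical.

```rust
/// Marker written in place of the private key (the literal comes from the spec).
const REDACTED: &str = "[REDACTED]";

/// Hypothetical stand-in for `DerivedKey`; the real type has its own fields.
#[derive(Debug, PartialEq)]
pub struct DerivedKey {
    pub public_key: String,
    pub private_key: String,
}

#[derive(Debug, PartialEq)]
pub enum DeserializeError {
    /// The payload carried the redaction marker instead of key material.
    RedactedPrivateKey,
}

impl DerivedKey {
    /// Stand-in for `Serialize`: emits (field, value) pairs and always
    /// replaces the private key with the marker, whatever the output format.
    pub fn to_fields(&self) -> Vec<(&'static str, String)> {
        vec![
            ("public_key", self.public_key.clone()),
            ("private_key", REDACTED.to_string()),
        ]
    }

    /// Stand-in for `Deserialize`: a redacted payload is an error, so a
    /// caller never ends up holding a key whose bytes are the marker text.
    pub fn from_fields(public_key: &str, private_key: &str) -> Result<Self, DeserializeError> {
        if private_key == REDACTED {
            return Err(DeserializeError::RedactedPrivateKey);
        }
        Ok(DerivedKey {
            public_key: public_key.to_string(),
            private_key: private_key.to_string(),
        })
    }
}
```

The point of the sketch is the asymmetry: output is always safe to log, and input that was produced by that output is refused rather than silently accepted as key material.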
The redaction is **not the primary control** for keeping private keys off
the wire. The primary control is architectural: `DerivedKey` never appears
@@ -147,169 +117,74 @@ Tags `DerivedKey` and `CachedKey` so consumers know what they received.
## Wire Format
For local (in-process) calls, the protocol uses tokio channels directly —
no serialization. For remote (in-cluster) calls, the protocol is serialized
with postcard (binary, compact). For cross-node (call protocol) exposure,
the vault is wrapped in an operation that serializes to JSON — but **no
vault operations are exposed over the call protocol** (ADR-014). The JSON
serialization path exists only for the `DerivedKey` redaction safety net.
The vault has no wire format (ADR-025). Dispatch is direct method calls on
`VaultServiceHandle` — no serialization, no channels, no network. The
`DerivedKey` custom `Serialize`/`Deserialize` impls exist solely for
logging safety (redaction) and defense-in-depth, not for wire transport.
## Remote Capability
`EncryptedData` has a stable wire format (shared with `alknet-storage` and
the TypeScript consumer by type-level agreement — see
[encryption.md](encryption.md) and ADR-018). That format is for *stored
encrypted data*, not for vault dispatch — the vault's `encrypt`/`decrypt`
methods operate on `EncryptedData` as a value type, not as a wire message.
The `VaultProtocol` is a remote-capable irpc service **by construction**.
The `#[rpc_requests]` macro generates both `Service` (local) and
`RemoteService` (remote) trait implementations. The `VaultServiceActor`
processes `VaultMessage` variants identically regardless of transport —
the only difference between local and remote use is the `Client<VaultProtocol>`
construction and the server-side listener setup.
## Local-Only by Construction
This was a purposeful design decision: irpc's "zero-overhead local,
transparent remote" architecture means the same protocol definition and
actor code work for both in-process and cross-network dispatch. Enabling
remote vault access is a server-setup change, not a protocol change.
The vault is **local-only by construction** (ADR-025). There is no
`RemoteService` trait, no remote handler, no wire format for vault
messages. The vault's API is `VaultServiceHandle` — direct method calls,
nothing else.
### What's already in place
If remote vault access is ever needed (e.g., the machine→worker pattern
where a long-lived node exposes a restricted vault API to ephemeral
workers), it requires a **separate vault-server crate** that:
- **Protocol**: `VaultProtocol` is already a `RemoteService`. No code
changes needed in the protocol definition.
- **Serialization**: `DerivedKey`'s dual serialization (JSON redacts private
key for safety; postcard preserves bytes for remote dispatch) was
designed for this use case.
- **Actor**: `VaultServiceActor` already processes all message types. The
actor is transport-agnostic — it doesn't know whether a message arrived
via a local mpsc channel or a remote QUIC stream.
- **Auth transport**: irpc over iroh uses iroh's QUIC connections, which
authenticate via NodeId (Ed25519, RFC 7250 raw keys) — the same identity
model as the rest of alknet (ADR-010). The connection-level identity
("which NodeId is calling") is available before any vault operation is
dispatched.
1. Depends on both alknet-core (for `IdentityProvider`, scopes,
auth-wrapping) and alknet-vault (for `VaultServiceHandle`).
2. Defines its own threat model, access policy, and operation filtering
(`Unlock`/`Lock` must be local-only; other operations may be
remote-capable depending on the policy).
3. Adds the remote transport (iroh/QUIC or similar) and an auth-wrapping
handler that checks caller identity before forwarding to the vault.
4. Requires its own ADR (matching ADR-019's language: "requires its own
ADR") defining the threat model and access policy.
### What's not in place (the gap)
This is a deliberate addition, not a flag flip on a default that was
already loaded. The pre-ADR-025 design made the vault remote-capable *by
construction* (irpc generated `RemoteService` by default), which was the
default-insecure anti-pattern. ADR-025 inverts the default: local-only is
the only mode, and remote access requires building something new.
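To make the shape of such a vault-server crate more concrete, here is a minimal sketch of the auth-wrapping check it might perform before forwarding a request to `VaultServiceHandle`. Everything in it is hypothetical: the `RemoteOp` enum, the `Rejection` type, the `AuthWrappingHandler` name, and the string caller IDs are illustrative, and the actual policy would be defined by that crate's own ADR. Transport and the forwarding step are omitted.

```rust
use std::collections::HashSet;

/// Operations a hypothetical vault-server could receive from a remote caller.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RemoteOp {
    Unlock,
    Lock,
    DeriveEd25519,
    Encrypt,
    Decrypt,
}

#[derive(Debug, PartialEq)]
pub enum Rejection {
    UnknownCaller,
    LocalOnlyOperation,
}

/// Hypothetical wrapper: caller allowlist first, then operation filter.
pub struct AuthWrappingHandler {
    allowed_callers: HashSet<String>,
}

impl AuthWrappingHandler {
    pub fn new<I: IntoIterator<Item = String>>(callers: I) -> Self {
        Self { allowed_callers: callers.into_iter().collect() }
    }

    /// Returns `Ok(())` only when the request may be forwarded to the vault.
    pub fn authorize(&self, caller_id: &str, op: RemoteOp) -> Result<(), Rejection> {
        if !self.allowed_callers.contains(caller_id) {
            return Err(Rejection::UnknownCaller);
        }
        // No wildcard arm: a new variant will not compile until someone
        // decides explicitly whether it is remote-capable.
        match op {
            RemoteOp::Unlock | RemoteOp::Lock => Err(Rejection::LocalOnlyOperation),
            RemoteOp::DeriveEd25519 | RemoteOp::Encrypt | RemoteOp::Decrypt => Ok(()),
        }
    }
}
```

The exhaustive `match` is a deliberate choice in this sketch: it keeps the opt-in default at the type level, so adding an operation forces a policy decision instead of inheriting remote access.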
The `IrohProtocol` handler that irpc provides forwards **all** message
types to the actor without auth checks. For local use this is correct
(the assembly layer is trusted). For remote use, the listener needs:
1. **NodeId allowlist**: only known worker NodeIds may connect.
2. **Message filtering**: reject `Unlock` and `Lock` from remote callers
(see "Operation access policy" below).
3. **Then** forward to the actor.
This auth-wrapping handler cannot live in the vault crate — the vault is
standalone (ADR-018) and depends on no alknet crate. The auth model
(`IdentityProvider`, `Identity`, scopes) lives in alknet-core. The
auth-wrapping listener lives in the **assembly layer** (the CLI binary)
or a dedicated vault-server crate that depends on both alknet-core and
alknet-vault. This is the same pattern as ADR-019: the vault is a
library, the assembly layer is the integrator.
```
alknet-vault (standalone, no deps)
- VaultProtocol (RemoteService by construction)
- VaultServiceActor (processes all message types, no auth)
- VaultServiceHandle (direct API)
assembly layer / vault-server (depends on alknet-core + alknet-vault)
- AuthWrappingHandler: checks NodeId, filters message types, forwards
- IrohProtocol::new(auth_wrapping_handler)
- Router::builder(endpoint).accept(b"alknet/vault", protocol).spawn()
```
### Operation access policy
Not all `VaultProtocol` operations are safe to expose remotely. The vault
spec defines the policy; the assembly-layer listener enforces it.
| Operation | Local (assembly layer) | Remote (workers) | Why |
|-----------|----------------------|-------------------|-----|
| `Unlock` | ✅ | ❌ | Sends the mnemonic (root of trust) over the wire. Even with NodeId auth, the mnemonic in transit is a different threat model — it's in memory on the receiving end, potentially in logs/traces. Local-only. |
| `Lock` | ✅ | ❌ | Locking the vault bricks the machine node for all workers. A compromised or buggy worker could DoS the entire machine node. Local-only. |
| `DeriveEd25519` | ✅ | ✅ | Workers need derived keys for signing, identity. The derivation path is the access control — the worker can only derive at paths the assembly layer declares. |
| `DeriveEncryptionKey` | ✅ | ✅ | Workers need encryption keys for credential encryption. Same path-based access control. |
| `DeriveEthereumKey` | ✅ | ✅ | Same as DeriveEd25519, for Ethereum signing. |
| `DerivePassword` | ✅ | ✅ | Workers need deterministic passwords for service credentials. |
| `Encrypt` | ✅ | ✅ | Workers encrypt external credentials (API keys) for storage. |
| `Decrypt` | ✅ | ✅ | Workers decrypt stored credentials at call time. |
The policy is: **`Unlock` and `Lock` are local-only; all other operations
are remote-capable.** The assembly-layer listener filters `Unlock` and
`Lock` messages from remote connections and returns an error.
### Use case: machine node → workers
The primary use case is a **machine node** (long-lived, holds the mnemonic,
manages container services) exposing a restricted vault API to its
**workers** (ephemeral, containerized, no mnemonic):
```
Machine Node (head, vault unlocked locally)
├── exposes alknet/vault ALPN to workers
├── NodeId allowlist: only known worker NodeIds may connect
├── message filter: rejects Unlock/Lock from remote callers
├── Worker A (no mnemonic)
│ └── calls DeriveEd25519, Encrypt, Decrypt on machine node's vault
└── Worker B (also a head for its own sub-workers)
├── gets its own credentials from machine node's vault
└── can expose its own restricted vault API to sub-workers
```
Workers don't hold mnemonics. They get static credentials injected at
construction (the common case) and call the machine node's vault for
dynamic derivation or decryption when needed. This is the
defense-in-depth (Russian doll) model: the seed is the innermost layer,
the machine node's vault is the next, iroh's NodeId auth is the outer,
and workers are outside that — calling in through authenticated channels.
### Per-machine-node vaults, not shared
Each machine node has its own vault and mnemonic. Machine nodes do not
share vaults with each other. Compromising one machine node exposes only
that node's workers, not all nodes. This is compartmentalization — the
blast radius of a vault compromise is one machine node, not the entire
fleet.
The remote vault capability is for the **machine→worker** relationship,
not for cross-machine-node sharing. Machine nodes don't expose their
vaults to peer machine nodes — only to their own workers, authenticated
by NodeId.
### What's breaking vs. non-breaking
| Change | Breaking? | Why |
|--------|-----------|-----|
| Enabling remote vault access | **No** | Server-setup change — register `IrohProtocol` with an ALPN. The protocol is already a `RemoteService`. |
| Restricting which operations are remote-capable | **No** | Policy in the assembly-layer handler, not a protocol change. |
| Adding NodeId auth checks | **No** | Implementation in the assembly-layer handler. The vault crate doesn't change. |
| Adding new `VaultProtocol` variants | **Yes (wire break)** | Inherent to irpc — versioning is a non-goal. Would need ALPN versioning (`alknet/vault/v2`) if the protocol evolves. Same constraint as any irpc service. |
| Changing `DerivedKey` serialization | **No** | Dual serialization is already in place — postcard preserves bytes for remote, JSON redacts for safety. |
The only breaking change is evolving the `VaultProtocol` enum itself, and
that's manageable with ALPN versioning (`alknet/vault`, then
`alknet/vault/v2` if needed) — the same pattern alknet uses for all ALPN
protocols (ADR-006).
**Per-node vaults are the recommended pattern for multi-node deployments.**
Each node has its own vault and mnemonic. Credentials are encrypted *for*
the receiving node's public key or derived at a shared path the receiving
node can derive locally. This is end-to-end encryption between nodes, not
a centralized decryption oracle. It matches ADR-008's "capability source"
model — credentials are injected at the assembly layer, not fetched over
the network at call time.
## Design Decisions
| Decision | ADR | Summary |
|----------|-----|---------|
| irpc for vault dispatch | [ADR-005](../../decisions/005-irpc-as-call-protocol-foundation.md) | In-process type-safe dispatch; remote-capable by construction |
| Vault is standalone | [ADR-018](../../decisions/018-vault-standalone-crate.md) | Zero alknet crate dependencies |
| Vault is local-only | [ADR-025](../../decisions/025-vault-local-only-dispatch.md) | Direct method calls, no irpc, no remote dispatch capability |
| HD derivation (not stored keys) | — | One seed, many keys, no key storage |
| `DerivedKey` is move-only | [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Prevents accidental duplication of secret material |
| JSON redacts private key | [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Defense-in-depth for logging accidents |
| postcard preserves private key | — | Required for in-cluster irpc dispatch |
| JSON redacts private key (always) | [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Defense-in-depth for logging accidents |
| No vault operations on call protocol | [ADR-008](../../decisions/008-secret-service-integration.md), [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Master seed never crosses the network |
| Unlock/Lock are local-only | OQ-21 (deferred) | Mnemonic and lock control must not be remotely accessible |
| Auth wrapping lives in assembly layer | [ADR-018](../../decisions/018-vault-standalone-crate.md), [ADR-019](../../decisions/019-vault-assembly-layer-only.md) | Vault is standalone; can't import alknet-core's auth model |
| No remote dispatch in vault crate | [ADR-025](../../decisions/025-vault-local-only-dispatch.md) | Remote access requires a separate vault-server crate with its own ADR |
## Open Questions
None active for this document.
None active for this document. OQ-21 (remote vault) is resolved — see
ADR-025 and [open-questions.md](../../open-questions.md).
## References
- Implementation: `crates/alknet-vault/src/protocol.rs`
- Implementation: `crates/alknet-vault/src/protocol.rs` (to be updated
per ADR-025 — remove `VaultProtocol` enum and irpc usage)
- Tests: `crates/alknet-vault/src/protocol.rs` (unit tests for redaction
and zeroize behavior)
- [service.md](service.md) — how the actor dispatches `VaultMessage`
and zeroize behavior; postcard tests to be removed)
- [service.md](service.md) — `VaultServiceHandle` runtime API
- [mnemonic-derivation.md](mnemonic-derivation.md) — what `KeyType` means