docs(architecture): add ADR-025 — vault local-only dispatch, drop irpc
Drops irpc from alknet-vault entirely. The vault's dispatch is now direct method calls on VaultServiceHandle: no VaultProtocol enum, no VaultMessage, no VaultServiceActor, no mpsc channel, no Service trait, no RemoteService trait, no postcard serialization. The vault is local-only by construction.

The core security argument: irpc made the vault remote-capable by default (RemoteService generated unless no_rpc is passed). The IrohProtocol handler forwards all messages without auth. The docs framed 'register an ALPN' as a server-setup change. This is the default-insecure anti-pattern: security should be opt-in, not opt-out. ADR-025 inverts the default: local-only is the only mode, and remote access requires building a separate vault-server crate (a visible architectural act, not a flag flip).

The actor path was already dead code: service.md said 'prefer VaultServiceHandle directly: no channel, no serialization.' The actor existed only to make irpc's Service trait work, which existed only to make RemoteService work, which was the footgun. VaultServiceHandle's Arc<RwLock> provides concurrent reads and exclusive writes, which gives better throughput than the actor's sequential processing.

DerivedKey serialization simplifies: always redact on serialize (for logging safety), reject '[REDACTED]' on deserialize with an error. No 'postcard preserves bytes' path. This resolves review #002 W8 (silent corruption on JSON-deserialized DerivedKey).

Resolves:
- OQ-21: remote vault access, resolved (not deferred). Not a vault crate feature; if needed, a separate vault-server crate with its own ADR.
- C7: vault-server-crate question decided; not created now, not precluded.
- C8: operation access policy table dissolved. All operations are local-only by default; if a vault-server crate exposes some remotely, that crate defines the policy.
- W8: DerivedKey JSON deserialization, resolved (reject redacted payloads).

Amends ADR-005 (irpc remains for alknet-call, not for alknet-vault), ADR-018 (vault is even more standalone: zero RPC framework deps), ADR-019 (vault is the only layer, not just the only direct-caller layer), ADR-008 (vault integration point unchanged, but now local-only by construction).
@@ -1,58 +1,24 @@
---
status: draft
last_updated: 2026-06-19
last_updated: 2026-06-22-25
---

# Protocol

The `VaultProtocol` irpc message enum, `DerivedKey` type, and serialization
behavior.
The `DerivedKey` type, `KeyType` enum, and serialization behavior. The
vault's "protocol" is the `VaultServiceHandle` method API (ADR-025): there
is no message enum, no irpc dispatch, and no wire format.

## What

The protocol layer defines the message enum that the irpc dispatch
infrastructure uses (ADR-005) and the `DerivedKey` type that derivation
methods return. This is the vault's internal dispatch protocol, not the
alknet call protocol (the vault has no ALPN, ADR-008).
The vault's dispatch is direct method calls on `VaultServiceHandle`
(ADR-025). The types defined here (`DerivedKey`, `KeyType`) are the
return types from those methods. There is no `VaultProtocol` enum, no
`VaultMessage`, no `VaultServiceActor`, and no remote dispatch capability.

## VaultProtocol

The irpc message enum. The `#[rpc_requests]` macro generates the
`VaultMessage` enum (with `WithChannels` wrappers), `Channels` impls,
`From` impls, and `Service`/`RemoteService` traits for remote dispatch.

```rust
#[rpc_requests(message = VaultMessage, no_spans)]
#[derive(Debug, Serialize, Deserialize)]
pub enum VaultProtocol {
    DeriveEd25519 { path: String },
    DeriveEncryptionKey { path: String },
    DeriveEthereumKey { path: String },
    DerivePassword { path: String, length: usize },
    Encrypt { plaintext: String, key_version: u32 },
    Decrypt { encrypted: EncryptedData },
    Lock,
    Unlock { mnemonic: String, passphrase: Option<String> },
}
```

Each variant is a vault operation. The `tx` channel type for each variant
is `oneshot::Sender<Result<T, VaultServiceError>>`, where `T` is the
operation's return type (`DerivedKey`, `Vec<u8>`, `EncryptedData`, `String`,
or `()`).

### State requirements

All operations except `Unlock` require the vault to be **unlocked**.
Calling derive/encrypt/decrypt on a locked vault returns
`VaultServiceError::VaultLocked` (not a panic, not a channel close).

### Dispatch

The `VaultServiceActor` (see [service.md](service.md)) processes
`VaultMessage` variants and dispatches to `VaultServiceHandle` methods.
For local in-process use, prefer `VaultServiceHandle` directly: no
channel overhead.
The vault is **local-only by construction**. If remote vault access is ever
needed, it requires a separate crate that wraps the vault and adds remote
transport + auth (ADR-025, OQ-021).
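
As a rough illustration of what direct dispatch looks like, the sketch below models a handle whose methods take a lock on shared state and return `VaultServiceError::VaultLocked` when the vault is locked. Only `VaultServiceHandle`, `VaultServiceError::VaultLocked`, and the `Arc<RwLock>` choice come from the text and the commit message. The `VaultState` struct, the method names, and the non-cryptographic `derive_placeholder` body are assumptions made for this example.

```rust
use std::sync::{Arc, RwLock};

#[derive(Debug, PartialEq)]
pub enum VaultServiceError {
    VaultLocked,
}

/// Hypothetical inner state: `Some(seed)` while unlocked, `None` while locked.
struct VaultState {
    seed: Option<Vec<u8>>,
}

/// Cheap-to-clone handle; every clone shares the same state.
#[derive(Clone)]
pub struct VaultServiceHandle {
    state: Arc<RwLock<VaultState>>,
}

impl VaultServiceHandle {
    pub fn new() -> Self {
        Self { state: Arc::new(RwLock::new(VaultState { seed: None })) }
    }

    /// Exclusive write. Real code would derive a seed from the mnemonic;
    /// this placeholder just stores its bytes.
    pub fn unlock(&self, mnemonic: &str) {
        self.state.write().expect("lock poisoned").seed = Some(mnemonic.as_bytes().to_vec());
    }

    /// Exclusive write. Fails if the vault is already locked.
    pub fn lock(&self) -> Result<(), VaultServiceError> {
        let mut state = self.state.write().expect("lock poisoned");
        state.seed.take().map(|_| ()).ok_or(VaultServiceError::VaultLocked)
    }

    /// Shared read: many callers may hold the read lock at once.
    /// NOT real key derivation, only a stand-in checksum over seed and path.
    pub fn derive_placeholder(&self, path: &str) -> Result<u64, VaultServiceError> {
        let state = self.state.read().expect("lock poisoned");
        let seed = state.seed.as_ref().ok_or(VaultServiceError::VaultLocked)?;
        Ok(seed
            .iter()
            .copied()
            .chain(path.bytes())
            .fold(0u64, |acc, b| acc.wrapping_mul(31).wrapping_add(u64::from(b))))
    }
}
```

A caller holds a clone of the handle and invokes methods on it directly; there is no message type to construct and nothing to serialize. Zeroizing the seed on lock is omitted here for brevity.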

## DerivedKey

@@ -92,19 +58,23 @@ and zeroized.

### Serialization redaction

`DerivedKey` has a custom `Serialize` impl that redacts the private key in
human-readable formats:
`DerivedKey` has a custom `Serialize` impl that **always** redacts the
private key, regardless of format:

- **JSON** (human-readable): `private_key` serializes as `"[REDACTED]"`.
  This is defense-in-depth: if a `DerivedKey` accidentally ends up in a
  log or a JSON config, the private key is not exposed.
- **postcard** (binary, used by irpc): `private_key` serializes as the
  actual bytes. This is required for in-cluster irpc dispatch to work;
  the remote side needs the actual key bytes.
- **Deserialization**: always reads the full bytes, regardless of format.
  A JSON-deserialized `DerivedKey` will have `"[REDACTED]"` as its
  `private_key` string. This is expected; JSON round-tripping a
  `DerivedKey` is not a supported use case (the private key is gone).
- **JSON** (and all human-readable formats): `private_key` serializes as
  `"[REDACTED]"`. This is defense-in-depth: if a `DerivedKey` accidentally
  ends up in a log, a JSON config, or debug output, the private key is not
  exposed.
- **Deserialization**: rejects `private_key == "[REDACTED]"` with an error.
  A JSON-deserialized `DerivedKey` with a redacted private key is invalid
  and produces a deserialization error, not a corrupted key. This resolves
  review #002 W8 (silent corruption on JSON-deserialized `DerivedKey`).
- **No binary-format preservation path.** ADR-025 dropped the postcard/remote
  dispatch path that previously preserved private key bytes in binary
  formats. `DerivedKey` is always used in-process (ADR-014: never appears
  in call protocol payloads). If a future remote-vault crate needs to send
  `DerivedKey` over the wire, it defines its own serialization for that
  context; the vault's `DerivedKey` stays redact-always.
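
The redact-on-write, reject-on-read rule can be sketched without any serialization library. The block below is a dependency-free stand-in, not the crate's serde impls: `to_fields` plays the role of `Serialize` and `from_fields` plays the role of `Deserialize`. The field layout, the hex encoding of non-redacted input, and the `DeserializeError` type are all assumptions made for illustration.

```rust
const REDACTED: &str = "[REDACTED]";

#[derive(Debug, PartialEq)]
pub enum DeserializeError {
    RedactedPrivateKey,
    InvalidEncoding,
}

/// Hypothetical field layout; the real `DerivedKey` may differ.
pub struct DerivedKey {
    pub path: String,
    pub private_key: Vec<u8>,
}

impl DerivedKey {
    /// Stand-in for `Serialize`: the private key is never written out.
    pub fn to_fields(&self) -> Vec<(&'static str, String)> {
        vec![
            ("path", self.path.clone()),
            ("private_key", REDACTED.to_string()),
        ]
    }

    /// Stand-in for `Deserialize`: a redacted payload is an error, so it can
    /// never be mistaken for key material.
    pub fn from_fields(path: &str, private_key: &str) -> Result<Self, DeserializeError> {
        if private_key == REDACTED {
            return Err(DeserializeError::RedactedPrivateKey);
        }
        let bytes = decode_hex(private_key).ok_or(DeserializeError::InvalidEncoding)?;
        Ok(Self { path: path.to_string(), private_key: bytes })
    }
}

/// Decodes an even-length ASCII hex string; returns `None` on anything else.
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 || !s.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}
```

The point of the explicit `RedactedPrivateKey` variant is that a round-tripped log line fails loudly instead of yielding a key whose bytes spell out the placeholder string.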

The redaction is **not the primary control** for keeping private keys off
the wire. The primary control is architectural: `DerivedKey` never appears
@@ -147,169 +117,74 @@ Tags `DerivedKey` and `CachedKey` so consumers know what they received.

## Wire Format

For local (in-process) calls, the protocol uses tokio channels directly:
no serialization. For remote (in-cluster) calls, the protocol is serialized
with postcard (binary, compact). For cross-node (call protocol) exposure,
the vault is wrapped in an operation that serializes to JSON, but **no
vault operations are exposed over the call protocol** (ADR-014). The JSON
serialization path exists only for the `DerivedKey` redaction safety net.
The vault has no wire format (ADR-025). Dispatch is direct method calls on
`VaultServiceHandle`: no serialization, no channels, no network. The
`DerivedKey` custom `Serialize`/`Deserialize` impls exist solely for
logging safety (redaction) and defense-in-depth, not for wire transport.

## Remote Capability
`EncryptedData` has a stable wire format (shared with `alknet-storage` and
the TypeScript consumer by type-level agreement; see
[encryption.md](encryption.md) and ADR-018). That format is for *stored
encrypted data*, not for vault dispatch: the vault's `encrypt`/`decrypt`
methods operate on `EncryptedData` as a value type, not as a wire message.

The `VaultProtocol` is a remote-capable irpc service **by construction**.
The `#[rpc_requests]` macro generates both `Service` (local) and
`RemoteService` (remote) trait implementations. The `VaultServiceActor`
processes `VaultMessage` variants identically regardless of transport;
the only difference between local and remote use is the `Client<VaultProtocol>`
construction and the server-side listener setup.
## Local-Only by Construction

This was a purposeful design decision: irpc's "zero-overhead local,
transparent remote" architecture means the same protocol definition and
actor code work for both in-process and cross-network dispatch. Enabling
remote vault access is a server-setup change, not a protocol change.
The vault is **local-only by construction** (ADR-025). There is no
`RemoteService` trait, no remote handler, no wire format for vault
messages. The vault's API is `VaultServiceHandle`: direct method calls,
nothing else.

### What's already in place
If remote vault access is ever needed (e.g., the machine→worker pattern
where a long-lived node exposes a restricted vault API to ephemeral
workers), it requires a **separate vault-server crate** that:

- **Protocol**: `VaultProtocol` is already a `RemoteService`. No code
  changes needed in the protocol definition.
- **Serialization**: `DerivedKey`'s dual serialization (JSON redacts private
  key for safety; postcard preserves bytes for remote dispatch) was
  designed for this use case.
- **Actor**: `VaultServiceActor` already processes all message types. The
  actor is transport-agnostic: it doesn't know whether a message arrived
  via a local mpsc channel or a remote QUIC stream.
- **Auth transport**: irpc over iroh uses iroh's QUIC connections, which
  authenticate via NodeId (Ed25519, RFC 7250 raw keys), the same identity
  model as the rest of alknet (ADR-010). The connection-level identity
  ("which NodeId is calling") is available before any vault operation is
  dispatched.
1. Depends on both alknet-core (for `IdentityProvider`, scopes,
   auth-wrapping) and alknet-vault (for `VaultServiceHandle`).
2. Defines its own threat model, access policy, and operation filtering
   (`Unlock`/`Lock` must be local-only; other operations may be
   remote-capable depending on the policy).
3. Adds the remote transport (iroh/QUIC or similar) and an auth-wrapping
   handler that checks caller identity before forwarding to the vault.
4. Requires its own ADR (matching ADR-019's language: "requires its own
   ADR") defining the threat model and access policy.
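
The handler described in items 2 and 3 might look roughly like the following. Everything in this sketch is hypothetical, since no vault-server crate exists: the `VaultOp` enum, `RemoteError`, the 32-byte `NodeId` alias, and the closure standing in for a call into `VaultServiceHandle` are assumptions for illustration. A real crate would take its policy from its own ADR.

```rust
use std::collections::HashSet;

/// Hypothetical caller identity; stands in for an iroh NodeId.
pub type NodeId = [u8; 32];

/// Hypothetical request type owned by the vault-server crate, not the vault.
#[derive(Debug)]
pub enum VaultOp {
    Unlock { mnemonic: String },
    Lock,
    DeriveEd25519 { path: String },
}

#[derive(Debug, PartialEq)]
pub enum RemoteError {
    UnknownCaller,
    LocalOnlyOperation,
}

/// Wraps a forwarding function with an identity check and an operation filter.
pub struct AuthWrappingHandler<F> {
    allowlist: HashSet<NodeId>,
    forward: F,
}

impl<F> AuthWrappingHandler<F>
where
    F: Fn(&VaultOp) -> String,
{
    pub fn new(allowlist: HashSet<NodeId>, forward: F) -> Self {
        Self { allowlist, forward }
    }

    pub fn dispatch(&self, caller: &NodeId, op: &VaultOp) -> Result<String, RemoteError> {
        // 1. Identity first: unknown callers never reach the vault.
        if !self.allowlist.contains(caller) {
            return Err(RemoteError::UnknownCaller);
        }
        // 2. Operation filter: Unlock and Lock stay local-only.
        if matches!(op, VaultOp::Unlock { .. } | VaultOp::Lock) {
            return Err(RemoteError::LocalOnlyOperation);
        }
        // 3. Only then forward to the vault.
        Ok((self.forward)(op))
    }
}
```

Because the check order is fixed inside `dispatch`, a caller outside the allowlist gets `UnknownCaller` even when asking for a local-only operation, so the response does not reveal which operations exist.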

### What's not in place (the gap)
This is a deliberate addition, not a flag flip on a default that was
already loaded. The pre-ADR-025 design made the vault remote-capable *by
construction* (irpc generated `RemoteService` by default), which was the
default-insecure anti-pattern. ADR-025 inverts the default: local-only is
the only mode, and remote access requires building something new.

The `IrohProtocol` handler that irpc provides forwards **all** message
types to the actor without auth checks. For local use this is correct
(the assembly layer is trusted). For remote use, the listener needs:

1. **NodeId allowlist**: only known worker NodeIds may connect.
2. **Message filtering**: reject `Unlock` and `Lock` from remote callers
   (see "Operation access policy" below).
3. **Then** forward to the actor.

This auth-wrapping handler cannot live in the vault crate: the vault is
standalone (ADR-018) and depends on no alknet crate. The auth model
(`IdentityProvider`, `Identity`, scopes) lives in alknet-core. The
auth-wrapping listener lives in the **assembly layer** (the CLI binary)
or a dedicated vault-server crate that depends on both alknet-core and
alknet-vault. This is the same pattern as ADR-019: the vault is a
library, the assembly layer is the integrator.

```
alknet-vault (standalone, no deps)
  - VaultProtocol (RemoteService by construction)
  - VaultServiceActor (processes all message types, no auth)
  - VaultServiceHandle (direct API)

assembly layer / vault-server (depends on alknet-core + alknet-vault)
  - AuthWrappingHandler: checks NodeId, filters message types, forwards
  - IrohProtocol::new(auth_wrapping_handler)
  - Router::builder(endpoint).accept(b"alknet/vault", protocol).spawn()
```

### Operation access policy

Not all `VaultProtocol` operations are safe to expose remotely. The vault
spec defines the policy; the assembly-layer listener enforces it.

| Operation | Local (assembly layer) | Remote (workers) | Why |
|-----------|----------------------|-------------------|-----|
| `Unlock` | ✅ | ❌ | Sends the mnemonic (root of trust) over the wire. Even with NodeId auth, the mnemonic in transit is a different threat model: it's in memory on the receiving end, potentially in logs/traces. Local-only. |
| `Lock` | ✅ | ❌ | Locking the vault bricks the machine node for all workers. A compromised or buggy worker could DoS the entire machine node. Local-only. |
| `DeriveEd25519` | ✅ | ✅ | Workers need derived keys for signing, identity. The derivation path is the access control: the worker can only derive at paths the assembly layer declares. |
| `DeriveEncryptionKey` | ✅ | ✅ | Workers need encryption keys for credential encryption. Same path-based access control. |
| `DeriveEthereumKey` | ✅ | ✅ | Same as DeriveEd25519, for Ethereum signing. |
| `DerivePassword` | ✅ | ✅ | Workers need deterministic passwords for service credentials. |
| `Encrypt` | ✅ | ✅ | Workers encrypt external credentials (API keys) for storage. |
| `Decrypt` | ✅ | ✅ | Workers decrypt stored credentials at call time. |

The policy is: **`Unlock` and `Lock` are local-only; all other operations
are remote-capable.** The assembly-layer listener filters `Unlock` and
`Lock` messages from remote connections and returns an error.

### Use case: machine node → workers

The primary use case is a **machine node** (long-lived, holds the mnemonic,
manages container services) exposing a restricted vault API to its
**workers** (ephemeral, containerized, no mnemonic):

```
Machine Node (head, vault unlocked locally)
├── exposes alknet/vault ALPN to workers
├── NodeId allowlist: only known worker NodeIds may connect
├── message filter: rejects Unlock/Lock from remote callers
│
├── Worker A (no mnemonic)
│   └── calls DeriveEd25519, Encrypt, Decrypt on machine node's vault
│
└── Worker B (also a head for its own sub-workers)
    ├── gets its own credentials from machine node's vault
    └── can expose its own restricted vault API to sub-workers
```

Workers don't hold mnemonics. They get static credentials injected at
construction (the common case) and call the machine node's vault for
dynamic derivation or decryption when needed. This is the
defense-in-depth (Russian doll) model: the seed is the innermost layer,
the machine node's vault is the next, iroh's NodeId auth is the outer,
and workers are outside that, calling in through authenticated channels.

### Per-machine-node vaults, not shared

Each machine node has its own vault and mnemonic. Machine nodes do not
share vaults with each other. Compromising one machine node exposes only
that node's workers, not all nodes. This is compartmentalization: the
blast radius of a vault compromise is one machine node, not the entire
fleet.

The remote vault capability is for the **machine→worker** relationship,
not for cross-machine-node sharing. Machine nodes don't expose their
vaults to peer machine nodes, only to their own workers, authenticated
by NodeId.

### What's breaking vs. non-breaking

| Change | Breaking? | Why |
|--------|-----------|-----|
| Enabling remote vault access | **No** | Server-setup change: register `IrohProtocol` with an ALPN. The protocol is already a `RemoteService`. |
| Restricting which operations are remote-capable | **No** | Policy in the assembly-layer handler, not a protocol change. |
| Adding NodeId auth checks | **No** | Implementation in the assembly-layer handler. The vault crate doesn't change. |
| Adding new `VaultProtocol` variants | **Yes (wire break)** | Inherent to irpc: versioning is a non-goal. Would need ALPN versioning (`alknet/vault/v2`) if the protocol evolves. Same constraint as any irpc service. |
| Changing `DerivedKey` serialization | **No** | Dual serialization is already in place: postcard preserves bytes for remote, JSON redacts for safety. |

The only breaking change is evolving the `VaultProtocol` enum itself, and
that's manageable with ALPN versioning (`alknet/vault`, then
`alknet/vault/v2` if needed), the same pattern alknet uses for all ALPN
protocols (ADR-006).
**Per-node vaults are the recommended pattern for multi-node deployments.**
Each node has its own vault and mnemonic. Credentials are encrypted *for*
the receiving node's public key or derived at a shared path the receiving
node can derive locally. This is end-to-end encryption between nodes, not
a centralized decryption oracle. It matches ADR-008's "capability source"
model: credentials are injected at the assembly layer, not fetched over
the network at call time.

## Design Decisions

| Decision | ADR | Summary |
|----------|-----|---------|
| irpc for vault dispatch | [ADR-005](../../decisions/005-irpc-as-call-protocol-foundation.md) | In-process type-safe dispatch; remote-capable by construction |
| Vault is standalone | [ADR-018](../../decisions/018-vault-standalone-crate.md) | Zero alknet crate dependencies |
| Vault is local-only | [ADR-025](../../decisions/025-vault-local-only-dispatch.md) | Direct method calls, no irpc, no remote dispatch capability |
| HD derivation (not stored keys) | - | One seed, many keys, no key storage |
| `DerivedKey` is move-only | [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Prevents accidental duplication of secret material |
| JSON redacts private key | [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Defense-in-depth for logging accidents |
| postcard preserves private key | - | Required for in-cluster irpc dispatch |
| JSON redacts private key (always) | [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Defense-in-depth for logging accidents |
| No vault operations on call protocol | [ADR-008](../../decisions/008-secret-service-integration.md), [ADR-014](../../decisions/014-secret-material-flow-and-capability-injection.md) | Master seed never crosses the network |
| Unlock/Lock are local-only | OQ-21 (deferred) | Mnemonic and lock control must not be remotely accessible |
| Auth wrapping lives in assembly layer | [ADR-018](../../decisions/018-vault-standalone-crate.md), [ADR-019](../../decisions/019-vault-assembly-layer-only.md) | Vault is standalone; can't import alknet-core's auth model |
| No remote dispatch in vault crate | [ADR-025](../../decisions/025-vault-local-only-dispatch.md) | Remote access requires a separate vault-server crate with its own ADR |

## Open Questions

None active for this document.
None active for this document. OQ-21 (remote vault) is resolved; see
ADR-025 and [open-questions.md](../../open-questions.md).

## References

- Implementation: `crates/alknet-vault/src/protocol.rs`
- Implementation: `crates/alknet-vault/src/protocol.rs` (to be updated
  per ADR-025: remove `VaultProtocol` enum and irpc usage)
- Tests: `crates/alknet-vault/src/protocol.rs` (unit tests for redaction
  and zeroize behavior)
- [service.md](service.md): how the actor dispatches `VaultMessage`
  and zeroize behavior; postcard tests to be removed)
- [service.md](service.md): `VaultServiceHandle` runtime API
- [mnemonic-derivation.md](mnemonic-derivation.md): what `KeyType` means