docs(architecture): add alknet-vault spec, ADR-018, ADR-019, OQ-20/21/22

Spec the vault crate from its existing implementation. The vault is
stable (implementation exists); this spec documents what IS so the
implementation-sync agent can reconcile source drift.

New spec documents (crates/vault/):
- README.md — crate index, security constraints, public API
- mnemonic-derivation.md — BIP39, SLIP-0010, BIP-0032, derivation paths
- encryption.md — AES-256-GCM, EncryptedData, key versioning, salt
- service.md — VaultServiceHandle lifecycle, actor dispatch, cache
- protocol.md — VaultProtocol irpc messages, DerivedKey redaction

New ADRs:
- ADR-018: Vault as standalone crate (zero alknet deps; own types/errors)
- ADR-019: Vault assembly-layer-only access (CLI is sole caller)

New open questions:
- OQ-20: Salt/KDF Phase B (open, low priority — salt field reserved)
- OQ-21: Remote vault administration (deferred — needs ADR if ever needed)
- OQ-22: Key rotation mechanism (open, low priority — workflow not specced)

Spec-vs-source drift explicitly flagged (for the sync agent):
- rand::random() used for IVs instead of OsRng (security-critical)
- unwrap() on every RwLock acquisition (must use unwrap_or_else)
- ADR-038 / OQ-SVC-03 references in source comments are stale (old numbering)
- VaultServiceActor::spawn returns a non-functional second actor (source bug)
- KeyVersionMismatch error variant is defined but unused in v1
2026-06-19 09:23:47 +00:00
parent 40f6468e18
commit dd1ca1de70
10 changed files with 1564 additions and 8 deletions


@@ -0,0 +1,162 @@
# ADR-018: Vault as Standalone Crate
## Status
Accepted
## Context
alknet-vault provides BIP39 mnemonic generation, SLIP-0010 Ed25519 HD key
derivation, BIP-0032 secp256k1 derivation (feature-gated), and AES-256-GCM
encryption. It holds the master seed — the root of trust for all derived keys
and encrypted credentials in the alknet system.
The question is: what does alknet-vault depend on? The candidates:
1. **Depend on alknet-core** for shared types (errors, maybe Identity). This
pulls QUIC, quinn, iroh, rustls, and tokio runtime dependencies into the
vault's dependency tree.
2. **Stand alone** — zero alknet crate dependencies. The vault defines its own
types, its own error enum, its own irpc protocol. Other crates depend on
the vault; the vault depends on nothing in alknet.
This is a one-way door. Once the vault depends on alknet-core, reversing it
requires removing that dependency from every type, error conversion, and
test — and the longer it stays, the more entangled it becomes.
### Why standalone matters
The vault is used in contexts where QUIC networking does not exist:
- **CLI tools**: a key-derivation utility that derives an identity key from a
mnemonic without starting a network endpoint.
- **Test harnesses**: integration tests in other crates derive test keys
without spinning up a QUIC endpoint.
- **WASM key derivation**: a future WASM target that derives keys in a browser
(the BiStream trait in ADR-007 preserves this door at the transport layer;
the vault's independence preserves it at the secret layer).
- **Embedded assembly**: a binary that only needs the vault to decrypt a
config file at startup, with no networking at all.
If the vault depends on alknet-core, all of these contexts pull in quinn,
iroh, rustls, and a tokio runtime, none of which they need. The vault's job is
cryptographic derivation and encryption. It has no networking concern.
### What the vault provides without alknet-core
The vault defines its own types and traits:
- `Mnemonic`, `Seed` — BIP39 root material
- `ExtendedPrivKey` (Ed25519), `Secp256k1ExtendedPrivKey` (Ethereum) —
derived key material
- `DerivedKey`, `KeyType` — protocol-level key representation
- `EncryptedData`, `EncryptionKey` — AES-256-GCM blobs
- `VaultServiceHandle`, `VaultServiceActor` — runtime API
- `VaultProtocol` — irpc message enum (in-process dispatch)
- `VaultServiceError` — its own error enum (string-wrapped sub-errors; the
vault doesn't share an error type with alknet-core)
The `VaultProtocol` uses irpc directly (see ADR-005), not through alknet-call.
This is consistent: irpc is a lightweight framing library, not an alknet
crate. The vault's irpc usage is an in-process dispatch mechanism, not a
network-exposed service.
## Decision
**alknet-vault has zero alknet crate dependencies.** It depends only on
external crates (`bip39`, `ed25519-bip32`, `aes-gcm`, `sha2`, `hmac`,
`secp256k1`, `irpc`, `tokio` for the actor's sync primitives, `serde`,
`zeroize`, `thiserror`, `base64`, `rand`).
The vault does not depend on:
- `alknet-core` — no shared types, no `Identity`, no `AuthContext`
- `alknet-call` — no `OperationSpec`, no `OperationContext`, no call protocol
The vault also does not implement `ProtocolHandler`, since it has no ALPN (see
ADR-019).
Dependency flow is strictly one-directional:
```
alknet-vault (standalone)
    ↑
alknet (CLI binary) — the only crate that depends on alknet-vault
```
No handler crate depends on alknet-vault directly. Handlers receive derived
material through capabilities injected by the assembly layer (ADR-014). The
CLI binary is the sole integration point (ADR-008, ADR-019).
### Type independence
The vault defines its own types and does not share types with alknet-core:
- `VaultServiceError` is the vault's error enum. It is `Serialize`/`Deserialize`
(for irpc dispatch) and wraps sub-errors as strings. It does not implement
`From` for alknet-core error types — the CLI binary converts at the
assembly boundary.
- `DerivedKey` is the vault's key representation. It is not shared with
alknet-core's `Identity` type. The CLI binary extracts the bytes it needs
(private key for signing, public key for TLS identity) and constructs the
alknet-core types at the assembly layer.
- `EncryptedData` is the vault's encrypted blob format. It is shared with
`alknet-storage` (a future crate) by type-level agreement, not by a crate
dependency — both crates must agree on the serialization format (see
[encryption.md](../crates/vault/encryption.md)).
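
The boundary conversion described above might look roughly like the following. This is a minimal sketch, not the crate's code: the variant names on `VaultServiceError`, the fields on `DerivedKey` and `Identity`, and the `StartupError`, `to_startup_error`, and `load_identity` names are all assumptions made for illustration.

```rust
// Hypothetical stand-in for the vault's error enum. Sub-errors are carried
// as strings, so the type needs nothing from any other alknet crate.
#[derive(Debug, Clone, PartialEq)]
pub enum VaultServiceError {
    Locked,
    Derivation(String),
    Decryption(String),
}

// Hypothetical assembly-layer error, owned by the CLI binary.
#[derive(Debug, PartialEq)]
pub enum StartupError {
    VaultUnavailable,
    Credential(String),
}

// Hypothetical vault-side key representation (deliberately not Debug).
pub struct DerivedKey {
    pub private_key: [u8; 32],
    pub public_key: [u8; 32],
}

// Hypothetical core-side identity type: holds public material only.
#[derive(Debug, PartialEq)]
pub struct Identity {
    pub public_key: [u8; 32],
}

// The mapping is a plain function in the CLI crate, so neither the vault
// nor the core crate has to name the other's types.
pub fn to_startup_error(err: VaultServiceError) -> StartupError {
    match err {
        VaultServiceError::Locked => StartupError::VaultUnavailable,
        VaultServiceError::Derivation(msg) => {
            StartupError::Credential(format!("derivation failed: {msg}"))
        }
        VaultServiceError::Decryption(msg) => {
            StartupError::Credential(format!("decryption failed: {msg}"))
        }
    }
}

// Extracts only the bytes the core type needs and converts errors at the
// assembly boundary.
pub fn load_identity(
    derived: Result<DerivedKey, VaultServiceError>,
) -> Result<Identity, StartupError> {
    let key = derived.map_err(to_startup_error)?;
    Ok(Identity { public_key: key.public_key })
}
```

A free function is used here instead of a `From` impl because the CLI could not implement `From` between two types it does not own; it also keeps the mapping visible at the call site.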
## Consequences
**Positive:**
- The vault compiles and runs without QUIC, quinn, iroh, rustls, or a tokio
runtime (the `VaultServiceHandle` works with just `std::sync::RwLock`; the
actor uses `tokio::sync::mpsc` but that's a lightweight dependency).
- CLI tools, test harnesses, and future WASM targets can use the vault for key
derivation without pulling in networking crates.
- The vault's API surface is stable — changes to alknet-core types don't
force a vault recompile, and changes to vault types don't force a
handler recompile (the CLI is the only consumer).
- No circular dependency risk. The dependency graph is a strict DAG.
- The vault can be published and used independently of alknet — it's a
general-purpose local key vault, not an alknet-specific component.
**Negative:**
- The vault cannot share types with alknet-core. If a type wants to be shared
(e.g., a future `Fingerprint` type), it must live in alknet-core and the
vault must define its own equivalent, or a new shared crate must be
created. This is a feature, not a bug — it forces explicit boundaries.
- The CLI binary must convert between vault types and alknet-core types at
the assembly boundary. This is a small amount of glue code (extract bytes
from `DerivedKey`, construct alknet-core types). See ADR-019.
- The vault's `VaultServiceError` is separate from alknet-core's
`HandlerError`. The CLI binary maps vault errors to handler errors or
startup failures. This is expected — the vault is a library, not a
handler.
## Assumptions
1. **The vault's API is consumed by one component (the CLI binary) in the
alknet system.** If a future use case requires multiple crates to depend
on the vault directly, the dependency flow still holds — they depend on
the vault, the vault depends on nothing. The standalone property is
preserved.
2. **Shared types between the vault and other crates are agreed by type-level
compatibility, not by a crate dependency.** `EncryptedData` is the example:
both the vault and `alknet-storage` (future) must agree on the
serialization format. This is documented in the type's spec, not enforced
by the type system across crates.
3. **The vault's error type does not need to integrate with alknet-core's
error handling.** The vault returns `VaultServiceError`; the CLI binary
handles it at the assembly boundary. If a future use case requires
propagating vault errors through alknet-core's error types, the CLI
converts at the boundary.
## References
- ADR-003: Crate decomposition (alknet-vault is standalone)
- ADR-005: irpc as call protocol foundation (vault uses irpc directly)
- ADR-008: Vault integration point (CLI-embedded, assembly-layer only)
- ADR-014: Secret material flow and capability injection
- ADR-019: Vault assembly-layer-only access
- [crates/vault/README.md](../crates/vault/README.md)
- Implementation: `crates/alknet-vault/`


@@ -0,0 +1,165 @@
# ADR-019: Vault Assembly-Layer-Only Access
## Status
Accepted
## Context
ADR-008 established that the vault is a **capability source** — the CLI
binary unlocks it at startup, derives and decrypts the credentials each
handler needs, and injects the results into handler capabilities. ADR-014
specified the injection mechanism (`Capabilities` on `OperationContext`) and
locked the constraint that no vault operations are registered in the call
protocol.
These ADRs answer *how the vault integrates with the rest of alknet*. This
ADR answers a narrower question that the vault's own spec needs to be
explicit about: **what is the vault's access model from its own
perspective?**
The vault provides a `VaultServiceHandle` with `unlock`, `lock`,
`derive_ed25519`, `derive_encryption_key`, `derive_ethereum_key`,
`derive_password`, `encrypt`, and `decrypt` methods. Who is allowed to call
these, and through what path?
The candidates:
1. **Handlers call the vault directly** — each handler holds a
`VaultServiceHandle` and derives keys at call time. This was the
pre-ADR-008 model and is rejected: it exposes the vault to every handler,
requires the vault to enforce per-handler path restrictions itself, and
means the master seed is reachable from every call path.
2. **The call protocol exposes vault operations**: `vault/derive`,
`vault/decrypt`, `vault/unlock` registered as operations. This was the
contradiction ADR-014 resolved: the master seed and mnemonics would cross
the wire.
3. **The assembly layer is the sole caller** — the CLI binary (or an
embedded assembly layer) holds the `VaultServiceHandle`, calls vault
methods at startup and (rarely) at call time through scoped capabilities,
and injects results into handlers. Handlers never hold a vault reference.
## Decision
**The assembly layer is the sole direct caller of the vault.** This
restates ADR-008/ADR-014 from the vault's perspective and makes the access
model explicit in the vault's own spec.
### What the assembly layer does
At startup:
1. Constructs `VaultServiceHandle::new()`
2. Unlocks with a mnemonic (from a secure prompt, a file, or a hardware
token) and optional passphrase
3. Derives the keys each handler needs (identity, SSH host, TLS identity,
signing keys)
4. Decrypts the credentials each handler needs (LLM provider API keys,
OAuth tokens)
5. Constructs handlers with the derived/decrypted material injected into
their `Capabilities`
6. Registers the handlers in the `OperationRegistry`
7. Starts the endpoint
After startup, the vault is typically not called again. The common case is
construction-time injection — a handler holds a static decrypted API key for
its lifetime.
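
The startup sequence above can be sketched as follows. Every type here is a hypothetical stand-in (`VaultHandle`, `Capabilities`, `Handler`, `OperationRegistry`, and `assemble` are illustrative names with assumed signatures), the derivation is a placeholder rather than real cryptography, and the decrypt and endpoint-start steps are omitted for brevity.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `VaultServiceHandle`; method names and
/// signatures are assumptions, not the crate's real API.
pub struct VaultHandle {
    unlocked: bool,
}

impl VaultHandle {
    pub fn new() -> Self {
        Self { unlocked: false }
    }

    pub fn unlock(&mut self, mnemonic: &str) -> Result<(), String> {
        if mnemonic.trim().is_empty() {
            return Err("empty mnemonic".to_string());
        }
        self.unlocked = true;
        Ok(())
    }

    /// Placeholder only: reverses the path bytes. This is NOT key derivation.
    pub fn derive(&self, path: &str) -> Result<Vec<u8>, String> {
        if !self.unlocked {
            return Err("vault is locked".to_string());
        }
        Ok(path.bytes().rev().collect())
    }
}

/// Hypothetical capability bag populated by the assembly layer.
pub struct Capabilities {
    secrets: HashMap<String, Vec<u8>>,
}

/// A handler owns its injected capabilities and never sees the vault.
pub struct Handler {
    pub name: String,
    capabilities: Capabilities,
}

impl Handler {
    pub fn secret(&self, key: &str) -> Option<&[u8]> {
        self.capabilities.secrets.get(key).map(|v| v.as_slice())
    }
}

pub struct OperationRegistry {
    pub handlers: Vec<Handler>,
}

/// `plan` pairs a handler name with the derivation path it needs.
pub fn assemble(mnemonic: &str, plan: &[(&str, &str)]) -> Result<OperationRegistry, String> {
    let mut vault = VaultHandle::new(); // step 1: construct
    vault.unlock(mnemonic)?; // step 2: unlock
    let mut registry = OperationRegistry { handlers: Vec::new() };
    for (handler_name, path) in plan {
        let key = vault.derive(path)?; // step 3: derive
        let mut secrets = HashMap::new();
        secrets.insert("key".to_string(), key);
        // steps 5 and 6: construct the handler with injected material, register it
        registry.handlers.push(Handler {
            name: handler_name.to_string(),
            capabilities: Capabilities { secrets },
        });
    }
    // The vault handle stays local to this function; handlers hold only
    // the material derived for them.
    Ok(registry)
}
```

In this sketch the vault handle is dropped when `assemble` returns, which is a simplification; a real assembly layer may keep it for the scoped-capability case.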
### What handlers do NOT do
Handlers never:
- Hold a `VaultServiceHandle` reference
- Call `derive_*`, `encrypt`, or `decrypt` directly
- Receive the master seed or mnemonic
- Import `alknet_vault` as a dependency
Handlers receive secret material through `OperationContext.capabilities`
(ADR-014). The `Capabilities` type holds non-serializable, zeroized secret
material that the assembly layer populated at construction time.
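
One way to picture non-serializable, zeroized capability material is the sketch below. It is an illustration under assumptions: `SecretBytes`, the `api_key` field, and `handle_call` are hypothetical names, and the manual wipe in `Drop` is only best-effort (the source lists the `zeroize` crate as a vault dependency, which is the more robust choice because the optimizer may elide plain writes).

```rust
use std::fmt;

/// Hypothetical secret wrapper. It is deliberately not `Clone`, and nothing
/// here implements a serialization trait, so it cannot be put on the wire
/// by accident.
pub struct SecretBytes(Vec<u8>);

impl SecretBytes {
    pub fn new(bytes: Vec<u8>) -> Self {
        Self(bytes)
    }

    /// Explicit accessor so every read of the secret is easy to audit.
    pub fn expose(&self) -> &[u8] {
        &self.0
    }
}

// Debug output never prints the secret itself, only its length.
impl fmt::Debug for SecretBytes {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "SecretBytes(<redacted, {} bytes>)", self.0.len())
    }
}

impl Drop for SecretBytes {
    fn drop(&mut self) {
        // Best-effort wipe; see the caveat about `zeroize` in the lead-in.
        for byte in self.0.iter_mut() {
            *byte = 0;
        }
    }
}

/// Hypothetical capability bag populated by the assembly layer.
#[derive(Debug)]
pub struct Capabilities {
    pub api_key: SecretBytes,
}

/// Hypothetical, reduced form of the call context.
pub struct OperationContext {
    pub capabilities: Capabilities,
}

/// A handler reads the injected secret from the context; it has no vault.
pub fn handle_call(ctx: &OperationContext, body: &str) -> String {
    let key = ctx.capabilities.api_key.expose();
    format!("authorized {} bytes with a {}-byte key", body.len(), key.len())
}
```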
### The scoped-capability exception
The narrow exception is a handler that needs a child key at an
unpredictable path determined by call input (e.g., signing for a specific
GitHub repo). This handler receives a **scoped capability** — a restricted
handle that performs a specific derivation at a restricted path set and
returns the result in-process. The handler never sees the master seed and
never holds a full `VaultServiceHandle`.
The scoped capability is still a capability (it lives on
`OperationContext.capabilities`), not a vault reference. Whether it is a
distinct type or a pre-derived key injected at construction is a two-way
door for the alknet-call and alknet-agent crate specs (ADR-014).
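
If the scoped capability is modeled as a distinct type, it could look something like this. This is a sketch of one side of the two-way door, not a decision: `ScopedDeriver`, `ScopeError`, and the prefix-based path check are assumptions, and the injected closure stands in for whatever derivation the assembly layer actually wires up.

```rust
/// Hypothetical scoped capability. The assembly layer builds it around a
/// closure that owns the vault access; the handler can only call `derive`
/// and cannot reach whatever the closure captured.
pub struct ScopedDeriver {
    allowed_prefix: String,
    derive_fn: Box<dyn Fn(&str) -> Vec<u8> + Send + Sync>,
}

#[derive(Debug, PartialEq)]
pub enum ScopeError {
    PathNotAllowed(String),
}

impl ScopedDeriver {
    pub fn new(
        allowed_prefix: &str,
        derive_fn: impl Fn(&str) -> Vec<u8> + Send + Sync + 'static,
    ) -> Self {
        Self {
            allowed_prefix: allowed_prefix.to_string(),
            derive_fn: Box::new(derive_fn),
        }
    }

    /// Derives only at paths under the allowed prefix. The remainder must
    /// be empty or start at a component boundary, so "m/44'/0'x" does not
    /// pass as a child of "m/44'/0'".
    pub fn derive(&self, path: &str) -> Result<Vec<u8>, ScopeError> {
        let inside = path
            .strip_prefix(self.allowed_prefix.as_str())
            .map_or(false, |rest| rest.is_empty() || rest.starts_with('/'));
        if !inside {
            return Err(ScopeError::PathNotAllowed(path.to_string()));
        }
        Ok((self.derive_fn)(path))
    }
}
```

The restriction is enforced by construction: the handler has no code path to the master seed, only to the one operation the assembly layer chose to expose.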
### No vault operations on the wire
The vault has no ALPN (ADR-003, ADR-008). No vault operation is registered
in the call protocol's `OperationRegistry` (ADR-014). The master seed,
mnemonics, and derived private keys never appear in `call.requested`
payloads, `call.responded` payloads, or `OperationContext.metadata`
(ADR-014).
If a future use case requires exposing a vault operation over the call
protocol (e.g., a restricted `vault/public-key` operation that returns only
public key material for identity verification), it requires its own ADR
with an explicit threat model justification. This decision does not close
that door; it simply does not open it.
## Consequences
**Positive:**
- The master seed is reachable from exactly one place: the assembly layer.
The attack surface for the root of trust is a single process boundary, not
a distributed set of handlers.
- Handlers don't need to enforce path restrictions — they don't have the
vault. The scoped-capability mechanism enforces restrictions by
construction.
- The vault's API is consumed by one caller. This simplifies the vault's
threat model: it doesn't need per-caller authentication, rate limiting, or
path-based access control. The assembly layer is trusted.
- The vault can be tested in isolation: `VaultServiceHandle::new()` →
`unlock_new(24)` → `derive_*` is the test pattern, with no networking or
handler mocking.
**Negative:**
- The assembly layer has more construction-time responsibility: it must
know which handlers need which credentials and wire them. This is expected
— the CLI assembles everything (ADR-008).
- Adding a new handler that needs a new credential requires updating the
assembly layer, not just registering an operation. This is a feature:
it forces an explicit decision about what secret material a handler needs.
- Remote vault administration (unlock a running node's vault over the
network) is not supported. If needed in the future, it requires a
separate, heavily restricted mechanism (admin scope, mTLS-only, never
expose the mnemonic over an unauthenticated channel) and its own ADR.
## Assumptions
1. **The assembly layer is trusted.** The CLI binary holds the vault handle
and is the trust boundary. If the assembly layer is compromised, all
handlers' capabilities are compromised. This is the same trust boundary
as ADR-008 and ADR-014.
2. **Handlers need credentials that are known at construction time, not
credentials discovered dynamically at call time.** If a handler needs to
derive a key at an unpredictable path determined by call input, the
scoped-capability model covers it (the handler holds a scoped capability,
not a vault reference), but the surface area is larger. The assumption is
that this case is rare.
3. **No legitimate use case requires returning a private key over the
wire.** Public key sharing (identity verification, encryption to a
recipient) is the only cross-node key material flow. If a use case for
returning a private key emerges (e.g., a key-escrow service), it needs
its own ADR and a very different threat model.
## References
- ADR-003: Crate decomposition (alknet-vault is standalone)
- ADR-008: Vault integration point (CLI-embedded, capability source)
- ADR-014: Secret material flow and capability injection (the injection
mechanism this ADR relies on)
- ADR-018: Vault as standalone crate (the independence this ADR preserves)
- [crates/vault/service.md](../crates/vault/service.md)
- [crates/vault/README.md](../crates/vault/README.md)