alknet/docs/architecture/decisions/021-key-rotation-via-version-indexed-paths.md
glm-5.2 c62a6adc7b docs(architecture): resolve review #002 Tiers 1-3 — mechanical and consistency fixes
2026-06-22 05:46:37 +00:00

ADR-021: Key Rotation via Version-Indexed Derivation Paths

Status

Accepted

Context

ADR-020 established that the vault derives the AES-256-GCM encryption key from the BIP39 seed via SLIP-0010 HD derivation at path m/74'/2'/0'/0'. The EncryptedData.key_version field exists for rotation tracking, but the current implementation always derives at the same path regardless of version — key_version is metadata, not a functional selector.

OQ-22 asked: how does key rotation work? The key versioning is in place, but the rotation mechanism — how a new key is derived, how existing data is re-encrypted, and how the vault selects the right key for decryption — is not specified.

Why rotation matters

Key rotation is a fundamental security hygiene practice. The scenarios that require it:

  1. Suspected key compromise: the encryption key may have leaked (memory dump, process compromise, log accident). All data encrypted with that key must be re-encrypted with a new key.
  2. Periodic rotation: security policy mandates key rotation every N months. The vault must support this without re-deriving from a new mnemonic (which would require re-deploying all nodes).
  3. Version transition: moving from TS PBKDF2 data (v1) to vault HD data (v2, per ADR-020) is itself a rotation. The mechanism should generalize — it's the same operation.

What "rotation" means concretely

Rotating from key version N to N+1:

  1. Derive a new encryption key at a new derivation path
  2. For each existing EncryptedData blob with key_version: N:
    • Decrypt with the v-N key
    • Re-encrypt the plaintext with the v-(N+1) key
    • Replace the blob in storage with key_version: N+1
  3. New encryptions use key_version: N+1
  4. Old keys remain available for decrypting any data that hasn't been rotated yet (partial rotation is safe)

The question is: how is the new key derived? The options:

  • Option A: New derivation path per version. m/74'/2'/0'/0' for v2, m/74'/2'/0'/1' for v3, etc. Each version gets its own HD key. No new seed needed.
  • Option B: New mnemonic (new seed). Generate a new mnemonic, unlock with it, re-encrypt everything. This is heavy — it changes all derived keys (identity, SSH host, etc.), not just the encryption key.
  • Option C: KDF from the existing key. Use HKDF or PBKDF2 with the existing derived key + the salt as input. This is the salt field's potential use (OQ-20 mentioned this), but it adds KDF complexity and the salt becomes load-bearing.

Decision

1. Version-indexed derivation paths

Each key version maps to a unique derivation path. The last hardened index in the encryption path is the key version minus 2 (v1 is the legacy PBKDF2 format and has no HD path):

v2: m/74'/2'/0'/0'    ← PATHS::ENCRYPTION (current)
v3: m/74'/2'/0'/1'
v4: m/74'/2'/0'/2'
...

The encryption_path_for_version(version) function constructs the path:

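pub fn encryption_path_for_version(version: u32) -> String {
    // v1 is the TS PBKDF2 legacy format, not an HD path. The vault starts at v2.
    // v2 → m/74'/2'/0'/0', v3 → m/74'/2'/0'/1', etc.
    // Note: saturating_sub also maps versions 0 and 1 to index 0 (the v2 path),
    // so callers must reject versions below 2 before calling this function.
    let index = version.saturating_sub(2);
    format!("m/74'/2'/0'/{}'", index)
}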

PATHS::ENCRYPTION remains m/74'/2'/0'/0' — it's the v2 path, and v2 is the current version. When the vault is rotated to v3, encryption_path_for_version(3) produces m/74'/2'/0'/1'.

This means:

  • No new mnemonic needed — rotation uses the same seed, different path
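  • Each version's key is cryptographically independent: every path component is hardened, so a leaked version key reveals neither the parent node nor any sibling version's key
  • The derivation path is self-documenting once the "version minus 2" offset is known (m/74'/2'/0'/1' is "encryption key, version 3")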
  • Old keys are always derivable (the seed doesn't change), so partial rotation is safe — the vault can decrypt any version
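
One caveat with the path function above is that saturating_sub(2) gives versions 0 and 1 the same index as v2. A minimal sketch of a checked variant follows. The name `checked_encryption_path_for_version`, the `Option` return type, and both constants are assumptions made for illustration and are not part of the spec. The upper bound reflects that a hardened index written as i' must satisfy i < 2^31.

```rust
/// First key version that maps to an HD path (v1 is the legacy PBKDF2 format).
const FIRST_HD_VERSION: u32 = 2;

/// Hardened indices occupy the upper half of the u32 range, so the
/// index written before the apostrophe must be below 2^31.
const HARDENED_LIMIT: u32 = 1 << 31;

/// Hypothetical checked variant: returns None for versions that have no
/// HD path instead of silently aliasing them to the v2 path.
pub fn checked_encryption_path_for_version(version: u32) -> Option<String> {
    // checked_sub yields None for versions 0 and 1.
    let index = version.checked_sub(FIRST_HD_VERSION)?;
    if index >= HARDENED_LIMIT {
        return None;
    }
    Some(format!("m/74'/2'/0'/{}'", index))
}
```

With this shape, a blob stamped with key_version 1 yields None rather than the v2 path, so the caller can surface an explicit error.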

2. Version-aware encrypt and decrypt methods

The VaultServiceHandle gains version-aware key derivation:

impl VaultServiceHandle {
    /// Derive the encryption key for the given version. Cached.
    fn derive_encryption_key_for_version(
        &self,
        version: u32,
    ) -> Result<EncryptionKey, VaultServiceError> {
        let path = encryption_path_for_version(version);
        // ... derive at path, cache by path ...
    }

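    /// Encrypt with the key derived for `key_version` and stamp that version
    /// on the result. Pass CURRENT_KEY_VERSION for ordinary writes.
    pub fn encrypt(&self, plaintext: &str, key_version: u32) -> Result<EncryptedData, VaultServiceError> {
        let key = self.derive_encryption_key_for_version(key_version)?;
        // ... encrypt plaintext with key, set key_version on the result ...
    }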

    /// Decrypt by deriving the key at the version indicated by the blob.
    pub fn decrypt(&self, encrypted: &EncryptedData) -> Result<String, VaultServiceError> {
        let key = self.derive_encryption_key_for_version(encrypted.key_version)?;
        encryption::decrypt(encrypted, &key)
    }
}

decrypt now derives the key at the path indicated by encrypted.key_version — not always at PATHS::ENCRYPTION. This corrects a source drift: the current source ignores key_version for key selection; the spec now makes it functional.

3. rotate method

impl VaultServiceHandle {
    /// Re-encrypt an EncryptedData blob from one key version to another.
    ///
    /// Decrypts with the key at the blob's current key_version,
    /// re-encrypts with the key at `to_version`. Returns the new
    /// EncryptedData. Does not update storage — the caller replaces the
    /// blob in storage.
    pub fn rotate(
        &self,
        encrypted: &EncryptedData,
        to_version: u32,
    ) -> Result<EncryptedData, VaultServiceError> {
        let plaintext = self.decrypt(encrypted)?;
        self.encrypt(&plaintext, to_version)
    }
}

rotate is a vault method, not a storage operation. It decrypts and re-encrypts; the caller (the assembly layer or a migration tool) handles replacing the blob in storage. This keeps the vault focused on crypto and the storage system focused on storage.

4. CURRENT_KEY_VERSION and rotation policy

pub const CURRENT_KEY_VERSION: u32 = 2;

encrypt() stamps the key_version it is given onto new EncryptedData blobs. Callers pass CURRENT_KEY_VERSION for ordinary writes and a different version only during rotation. The assembly layer decides when to rotate:

  • Manual rotation: an operator triggers rotation (e.g., a CLI command alknet vault rotate --to v3 that loads all blobs, calls rotate on each, and writes them back to storage).
  • No automatic rotation: the vault does not self-rotate. Rotation is an operational action, not a runtime behavior. The vault provides the mechanism; the policy is external.
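
The manual rotation flow above could be sketched as follows. This is an illustration under stated assumptions, not the tool's actual implementation: `Rotator`, `rotate_all`, `FakeVault`, the in-memory `HashMap` store, and the trimmed `EncryptedData` struct are hypothetical stand-ins, and `FakeVault` performs no cryptography.

```rust
use std::collections::HashMap;

/// Trimmed stand-in for the spec's EncryptedData (only what the loop needs).
#[derive(Clone, Debug, PartialEq)]
pub struct EncryptedData {
    pub key_version: u32,
    pub ciphertext: Vec<u8>,
}

/// The single vault capability the rotation tool relies on (hypothetical trait).
pub trait Rotator {
    type Error;
    fn rotate(&self, blob: &EncryptedData, to_version: u32)
        -> Result<EncryptedData, Self::Error>;
}

/// Rotate every blob below `to_version`, writing each result back as soon
/// as it is produced. An interrupted run leaves a mixed-version store,
/// which stays readable because old keys remain derivable.
/// Returns the number of blobs rewritten.
pub fn rotate_all<R: Rotator>(
    vault: &R,
    store: &mut HashMap<String, EncryptedData>,
    to_version: u32,
) -> Result<usize, R::Error> {
    let mut rotated = 0;
    for blob in store.values_mut() {
        if blob.key_version >= to_version {
            continue; // already at the target version; re-runs are idempotent
        }
        let updated = vault.rotate(&*blob, to_version)?;
        *blob = updated;
        rotated += 1;
    }
    Ok(rotated)
}

/// Test double with NO real cryptography: it only restamps the version.
pub struct FakeVault;

impl Rotator for FakeVault {
    type Error = String;
    fn rotate(&self, blob: &EncryptedData, to_version: u32)
        -> Result<EncryptedData, String> {
        Ok(EncryptedData { key_version: to_version, ciphertext: blob.ciphertext.clone() })
    }
}
```

Skipping blobs already at the target version lets an operator re-run the tool after a failure without touching data that was already rotated.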

5. Cache implications

The KeyCache is keyed by derivation path. Since each version has a distinct path, the cache naturally holds multiple versions simultaneously. This is correct — during a rotation, the vault may need to decrypt old blobs (v2) and encrypt new blobs (v3), and both keys should be cached.

The cache's TTL and LRU eviction still apply. If the cache evicts an old version's key during a long rotation, the next decrypt of an old blob re-derives it (the seed hasn't changed). This is correct but slightly slower — the rotation tool should be aware that cache misses on old keys are expected.
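
The cache behaviour described above can be illustrated with a small path-keyed map. This is a sketch under assumptions rather than the real KeyCache: `PathKeyCache`, `toy_derive`, and the `derivations` counter are hypothetical, TTL and LRU logic are omitted, and `toy_derive` is a non-cryptographic placeholder for SLIP-0010 derivation from the seed.

```rust
use std::collections::HashMap;

/// Placeholder for HD derivation, NOT cryptographic: copies the path
/// bytes into the key so that distinct paths give distinct keys.
pub fn toy_derive(path: &str) -> [u8; 32] {
    let mut key = [0u8; 32];
    for (slot, byte) in key.iter_mut().zip(path.bytes()) {
        *slot = byte;
    }
    key
}

/// Hypothetical cache keyed by derivation path.
pub struct PathKeyCache {
    derive: fn(&str) -> [u8; 32],
    keys: HashMap<String, [u8; 32]>,
    /// Counts cache misses, i.e. how often a key was derived from the seed.
    pub derivations: usize,
}

impl PathKeyCache {
    pub fn new(derive: fn(&str) -> [u8; 32]) -> Self {
        PathKeyCache { derive, keys: HashMap::new(), derivations: 0 }
    }

    /// Return the key for `path`, deriving it only on a cache miss.
    pub fn get(&mut self, path: &str) -> [u8; 32] {
        if let Some(key) = self.keys.get(path) {
            return *key;
        }
        let key = (self.derive)(path);
        self.derivations += 1;
        self.keys.insert(path.to_string(), key);
        key
    }

    /// Simulate eviction of a single entry.
    pub fn evict(&mut self, path: &str) {
        self.keys.remove(path);
    }
}
```

Because the v2 and v3 paths differ, both keys sit in the map side by side, and an evicted entry is simply derived again on the next lookup.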

Consequences

Positive:

  • Key rotation is a vault method (rotate), not a storage operation or a full mnemonic change. It's cheap (HD derivation) and local.
  • Partial rotation is safe. Old and new keys coexist — the vault can decrypt any version. This means a rotation can be performed incrementally (rotate some blobs, verify, rotate the rest).
  • No new mnemonic needed. The same seed produces all version keys. A backup node with the same mnemonic can decrypt any version.
  • The derivation path is self-documenting. m/74'/2'/0'/1' is "encryption key, version 3" (index = version minus 2).
  • The salt field remains unused — no KDF complexity. Rotation is pure HD path indexing.
  • The mechanism generalizes the TS→vault migration (v1→v2 is a rotation, though v1 requires the TS PBKDF2 decrypt, not the vault's decrypt).

Negative:

  • decrypt now derives the key at the version-indicated path, which means a cache miss on an old version re-derives from the seed. This is a few HMAC operations — negligible, but the path construction and cache lookup add a small amount of complexity over the current "always use PATHS::ENCRYPTION" approach.
  • The rotation tool (CLI command or migration script) must iterate all stored blobs and call rotate on each. This is an operational concern, not a vault concern — but the vault spec should document the expected usage pattern so the tool implementer knows the contract.
  • Old version keys are always derivable (the seed doesn't change). This is a feature (partial rotation is safe), but it also means a compromised seed exposes every version, so rotation does not help in that case. This is inherent to HD derivation and not specific to this design.

Assumptions

  1. The seed is not compromised. If the seed is compromised, rotating the encryption key path doesn't help — the attacker can derive all version keys. Seed compromise requires a full mnemonic change (new seed, re-derive everything, re-deploy). This ADR covers encryption key rotation, not seed rotation. Seed rotation is an operational procedure (generate new mnemonic, unlock with it, re-encrypt all data) that is outside the vault's API.

  2. Rotation is infrequent. The vault does not optimize for frequent rotation (e.g., per-request key derivation). Rotation is an operational event triggered by policy or incident. The cache and path-indexed approach are efficient for this usage pattern.

  3. The storage system tracks which blobs to rotate. The vault's rotate method handles one blob at a time. Iterating all stored EncryptedData blobs is the storage system's job (or the CLI's). The vault doesn't know what's in storage — it only knows how to rotate a blob it's given.

  4. v1 (TS PBKDF2) data is not rotated through the vault. v1 data is decrypted by the TS decrypt() function (PBKDF2), not the vault's decrypt() (which uses HD derivation). The v1→v2 migration is a separate tool that has access to both. Once data is at v2, future rotations (v2→v3, etc.) use the vault's rotate method.
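
The split between v1 and later versions can be made explicit as a routing step in a migration tool. The sketch below is illustrative only: `DecryptRoute` and `route_for_version` are hypothetical names, and treating version 0 as invalid is an assumption, since this ADR only defines v1 and the HD versions from v2 onward.

```rust
/// Which decryptor a blob's key_version selects (hypothetical enum).
#[derive(Debug, PartialEq, Eq)]
pub enum DecryptRoute {
    /// v1: TS PBKDF2 format, handled by the separate migration tool.
    LegacyPbkdf2,
    /// v2 and later: vault HD derivation, carrying the last hardened index.
    VaultHd { index: u32 },
    /// Assumption: version 0 is not a valid key version.
    Unknown,
}

/// Map a key_version to the component that must decrypt the blob.
pub fn route_for_version(key_version: u32) -> DecryptRoute {
    match key_version {
        0 => DecryptRoute::Unknown,
        1 => DecryptRoute::LegacyPbkdf2,
        // Only reached for v >= 2, so the subtraction cannot underflow.
        v => DecryptRoute::VaultHd { index: v - 2 },
    }
}
```

Routing before derivation keeps a v1 blob from ever reaching the vault's HD path construction.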

References

  • ADR-020: HD derivation for encryption keys (this ADR extends its fixed encryption path into a version-indexed scheme)
  • OQ-22: Key rotation mechanism (resolved by this ADR)
  • encryption.md — AES-256-GCM, EncryptedData
  • service.md — encrypt, decrypt, rotate methods
  • mnemonic-derivation.md — derivation paths, PATHS::ENCRYPTION