docs(arch): ADR-030..033 — repo/adapter pattern, PeerEntry, CredentialStore, forwarded-for

Land the storage and auth strategy research (findings.md) as four
accepted ADRs and amend the core and call specs to match:

- ADR-030: PeerEntry and Identity.id decoupling. Replaces
  authorized_fingerprints with peers: Vec<PeerEntry>; Identity.id becomes
  the stable peer_id, decoupled from the rotating fingerprint. Supersedes
  ADR-029 Assumption 1's UUID source (one-way door preserved, source
  changes). Resolves OQ-33 and the storage-boundary half of OQ-34. Records
  the API-key asymmetry as deliberate (OQ-35).

- ADR-031: CredentialStore repo trait + InMemoryCredentialStore default
  adapter in core. Second repo trait alongside IdentityProvider. Vault
  encrypts; the store persists the EncryptedData blob; assembly layer
  loads into Capabilities. EncryptedData core mirror includes salt for
  wire-format compat.

- ADR-032: Forwarded-for identity. forwarded_for field on call.requested
  and OperationContext — metadata only, never read by AccessControl::check
  (enforced structurally via the check signature). The from_call handler
  populates it. Wire-format one-way door, folded into the ADR-029
  migration window.

- ADR-033: Storage boundary and repo/adapter pattern. Core defines repo
  traits + in-memory defaults; persistence adapters are separate crates;
  assembly layer wires. Resolves OQ-34. Concrete adapter shapes deferred
  for exploration (OQ-36).

Amends auth.md, config.md, operation-registry.md, client-and-adapters.md,
open-questions.md, README.md, crates/core/README.md. Marks ADR-029
Accepted (Assumption 1 carries the ADR-030 superseded note). Marks the
research findings doc reviewed.
2026-06-27 12:12:25 +00:00
parent 347bff257c
commit f224ea998c
13 changed files with 1307 additions and 144 deletions


@@ -2,7 +2,8 @@
## Status
Proposed (supersedes ADR-028)
Accepted (supersedes ADR-028; Assumption 1's `PeerId` source is superseded
by ADR-030 on the source dimension — the one-way door is preserved)
## Context
@@ -243,6 +244,14 @@ with attribution, filtered by the calling peer's authorization).
The one-way door: `PeerId` is logical, not crypto — this determines the
`PeerCompositeEnv` key type and `PeerRef::Specific` payload. See OQ-33.
> **Superseded by ADR-030 on the `PeerId` source dimension.** The
> one-way door (`PeerId` is logical, not crypto) is preserved. The v1
> UUID source is replaced by `Identity.id` from `PeerEntry.peer_id`
> (stable across key rotation). The "no-storage workaround" framing is
> no longer accurate — the storage boundary is now `config + in-memory
> adapter` (ADR-030 + ADR-033), with persistence adapters additive. See
> ADR-030 and OQ-33 (resolved).
2. **`PeerRef::Any` = insertion-order first-match.** Deterministic but
order-dependent (worker A connects before worker B → `Any` routes to A
until A disconnects). This is the simplest routing policy and is correct for


@@ -0,0 +1,341 @@
# ADR-030: PeerEntry and Identity.id Decoupling
## Status
Accepted (supersedes the "v1 UUID" source in ADR-029 Assumption 1; resolves
the "real solution" half of OQ-33 and the storage-boundary half of OQ-34)
## Context
`Identity.id` is the string that keys authorization decisions across the
alknet crate graph. Today it is **coupled to the cryptographic material**:
```rust
// crates/alknet-core/src/config.rs — current implementation
pub struct AuthPolicy {
    pub authorized_fingerprints: HashSet<String>, // just strings, no stable id
    pub api_keys: Vec<ApiKeyEntry>,
}

impl AuthPolicy {
    pub fn resolve_identity_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        if self.authorized_fingerprints.contains(fingerprint) {
            Some(Identity {
                id: fingerprint.to_string(), // ← identity IS the crypto material
                scopes: vec!["relay:connect".to_string()],
                ...
            })
        } else {
            None // unrecognized fingerprint: no identity
        }
    }
}
```
This coupling is a latent bug for any cross-node authorization decision:
- A TLS fingerprint or raw-key identity changes when the node rotates its key.
- When it changes, every ACL entry that references the old fingerprint stops
matching — the peer "disappears" from the authorization system even though
it is the same logical node.
- `PeerRef::Specific(PeerId)` (ADR-029) routes by `Identity.id`; a key
rotation would break in-flight routing references the same way.
- The hub's `authorized_fingerprints` set has to be manually updated on every
rotation on the *remote* side, which is exactly the operational pain the
vault's local key rotation (ADR-021) was meant to remove.
ADR-029 §1 set `PeerId = Identity.id` and made `PeerId` a logical identifier
"NOT `Identity.id` (the fingerprint)" — but left the *source* of that logical
identifier as a connection-assigned UUID (OQ-33's v1 workaround). The UUID
is ephemeral: it survives only for the connection's lifetime, changes on
reconnect, and cannot persist across restarts or key rotations. It is a
no-storage workaround, not a real identity.
The research at `docs/research/alknet-storage-strategy/findings.md` §4
established the real fix: introduce a `PeerEntry` config model that maps a
**stable logical peer id** to its current cryptographic material and
authorization scopes, and have `ConfigIdentityProvider` resolve
fingerprint → `PeerEntry` → `Identity { id: peer_entry.peer_id, scopes:
peer_entry.scopes, ... }`. The `Identity.id` becomes the stable `peer_id`,
decoupled from the fingerprint. Key rotation is a single field update in the
peer entry; the `peer_id` and every ACL / routing reference to it stay
stable.
This is the storage-boundary question OQ-34 tracks. With ADR-033 (the
repo/adapter pattern) establishing that core defines repo traits and the
default in-memory adapter lives alongside the trait, the answer is: core
gets the `PeerEntry` config model and the
`ConfigIdentityProvider::resolve_from_fingerprint → Identity { id: peer_id
}` resolution path now, with no SQLite dependency in core. A future
`alknet-peer-store-sqlite` adapter that persists `PeerEntry` records is
additive — it implements the same `IdentityProvider` trait against a `peers`
table instead of config. The trait is the one-way door; the adapter is the
two-way door.
## Decision
### 1. Add `PeerEntry` to `AuthPolicy`, replacing `authorized_fingerprints`
```rust
pub struct PeerEntry {
    /// Stable logical peer id ("worker-a", "alice"). Does NOT change on
    /// key rotation. This becomes Identity.id on resolution.
    pub peer_id: String,

    /// Current cryptographic material: the fingerprint the endpoint
    /// extracts from the TLS handshake (SHA256:... for X.509, ed25519:...
    /// for RFC 7250 raw keys). Changes on key rotation.
    pub fingerprint: String,

    /// Authorization scopes granted to this peer. Resolved into
    /// Identity.scopes.
    pub scopes: Vec<String>,

    /// Named resource lists granted to this peer. Resolved into
    /// Identity.resources. Populated from config (not just composition, as
    /// the pre-ADR-030 limitation in auth.md §"Resource-scoped ACLs and
    /// external identities" required).
    pub resources: HashMap<String, Vec<String>>,

    /// Human-readable display name for logs / UIs. Optional.
    pub display_name: Option<String>,

    /// Whether this peer is authorized at all. false = the fingerprint
    /// is recognized but the peer is disabled (token-revoked-equivalent
    /// for fingerprints). Resolution returns None.
    pub enabled: bool,
}

pub struct AuthPolicy {
    /// Replaces authorized_fingerprints: HashSet<String>. Each entry maps
    /// a stable logical peer_id to its current fingerprint + scopes +
    /// resources. The list is keyed by peer_id; resolution looks up by
    /// fingerprint.
    pub peers: Vec<PeerEntry>,

    /// API keys: unchanged by this ADR (see "API keys" below).
    pub api_keys: Vec<ApiKeyEntry>,
}
```
### 2. `Identity.id` becomes `PeerEntry.peer_id` on fingerprint resolution
`ConfigIdentityProvider::resolve_from_fingerprint` resolves fingerprint →
matching `PeerEntry` → `Identity { id: peer_entry.peer_id, scopes:
peer_entry.scopes, resources: peer_entry.resources }`. The `Identity.id` is
the stable `peer_id`, not the fingerprint.
```rust
impl AuthPolicy {
    pub fn resolve_identity_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.peers.iter()
            .find(|p| p.enabled && p.fingerprint == fingerprint)
            .map(|p| Identity {
                id: p.peer_id.clone(),
                scopes: p.scopes.clone(),
                resources: p.resources.clone(),
            })
    }
}
```
This removes the pre-ADR-030 limitation in `auth.md`
§"Resource-scoped ACLs and external identities" — fingerprint-resolved
identities now carry `resources` from the `PeerEntry`, not just from the
composition path. The composition path (`CompositionAuthority::as_identity`,
ADR-015/022) still produces its own `Identity` for internal calls; the
external-auth path now also carries resources when configured.
### 3. Key rotation is a single `PeerEntry.fingerprint` update
Rotating a peer's TLS key:
- The vault derives the new key locally (ADR-020/021).
- The remote side's config updates the `PeerEntry.fingerprint` field for
that `peer_id`. The `peer_id`, `scopes`, `resources`, ACL entries, and
any `PeerRef::Specific(peer_id)` references stay stable.
- A config reload (`ConfigReloadHandle::reload`) makes the change live.
No ACL update, no routing reference invalidation, no peer "disappears."
The vault's local rotation + a remote-side config edit is the full key
rotation story across nodes.
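
The rotation story above can be sketched in a few lines. This is a minimal illustration, not the crate's code: `display_name` and `api_keys` are omitted for brevity, and `rotate_fingerprint` is a hypothetical helper standing in for "edit the peer entry in config, then reload".

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

#[derive(Debug, Clone)]
pub struct PeerEntry {
    pub peer_id: String,
    pub fingerprint: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
    pub enabled: bool,
}

#[derive(Debug, Default)]
pub struct AuthPolicy {
    pub peers: Vec<PeerEntry>,
}

impl AuthPolicy {
    /// Same resolution as section 2: first enabled entry with a matching fingerprint.
    pub fn resolve_identity_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.peers
            .iter()
            .find(|p| p.enabled && p.fingerprint == fingerprint)
            .map(|p| Identity {
                id: p.peer_id.clone(),
                scopes: p.scopes.clone(),
                resources: p.resources.clone(),
            })
    }

    /// Hypothetical helper (not in the ADR): replace the fingerprint stored
    /// for `peer_id`. Returns false when no entry has that peer_id.
    pub fn rotate_fingerprint(&mut self, peer_id: &str, new_fingerprint: &str) -> bool {
        match self.peers.iter_mut().find(|p| p.peer_id == peer_id) {
            Some(entry) => {
                entry.fingerprint = new_fingerprint.to_string();
                true
            }
            None => false,
        }
    }
}
```

After a rotation, the old fingerprint no longer resolves, while the new one resolves to an `Identity` with the same `id`, so anything keyed on `peer_id` is untouched.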
### 4. `PeerId` source changes from UUID to `Identity.id` from `PeerEntry`
ADR-029 Assumption 1 said `PeerId` is a connection-assigned UUID (v4). With
`Identity.id` now stable (`peer_id`), the UUID workaround is no longer
needed: `PeerId = Identity.id` from `IdentityProvider` resolution. This is
the one-way-door tightening — `PeerId` was always specified as logical-not-
crypto (ADR-029), the UUID was the *source*; the source now becomes the
auth system.
```rust
// ADR-029 §1, updated by this ADR:
pub type PeerId = String; // = Identity.id from IdentityProvider resolution
                          // = PeerEntry.peer_id (stable, not crypto material)
```
ADR-029 §2's `invoke_peer` / `PeerRef::Specific(PeerId)` signatures are
unchanged. The `PeerId` payload is now stable across reconnects and key
rotations, instead of ephemeral. An in-flight `PeerRef::Specific` that
survives a reconnect now keeps resolving (the `peer_id` is unchanged), which
is the property the UUID workaround could not provide.
### 5. The `PeerId` for a connection comes from `IdentityProvider` resolution
The dispatch path that builds a `CallConnection` and assigns a `PeerId` to
the peer-keyed overlay (`PeerCompositeEnv::attach_peer`) reads
`connection.identity().id` — the resolved `Identity.id` from the
`IdentityProvider`. If identity resolution returns `None` (no client cert,
unrecognized fingerprint), the peer has no `PeerId` and the connection
cannot be added to the peer-keyed overlay. The handler either rejects the
connection or falls back to a connection-without-peer-identity path (the
caller-id-is-the-connection case, e.g., anonymous dial-in).
The UUID fallback is removed. A connection with no resolved identity has no
`PeerId`, not a random one.
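
A rough sketch of the "no resolved identity, no `PeerId`" rule follows. `Connection`, `PeerOverlay`, and `AttachError` are hypothetical stand-ins for `CallConnection`, `PeerCompositeEnv`, and whatever error the dispatch path actually uses; modelling `identity()` as returning an `Option` is also an assumption made for the illustration.

```rust
pub type PeerId = String;

#[derive(Debug, Clone)]
pub struct Identity {
    pub id: String,
}

/// Hypothetical stand-in for a connection after IdentityProvider resolution.
pub struct Connection {
    identity: Option<Identity>,
}

impl Connection {
    pub fn new(identity: Option<Identity>) -> Self {
        Self { identity }
    }

    pub fn identity(&self) -> Option<&Identity> {
        self.identity.as_ref()
    }
}

#[derive(Debug, PartialEq)]
pub enum AttachError {
    /// Identity resolution returned None, so there is no PeerId to key on.
    NoResolvedIdentity,
}

/// Hypothetical stand-in for the peer-keyed overlay; insertion order is kept.
#[derive(Default)]
pub struct PeerOverlay {
    peers: Vec<PeerId>,
}

impl PeerOverlay {
    /// The PeerId is taken from the resolved identity. There is no UUID
    /// fallback: an unresolved connection is refused, not given a random id.
    pub fn attach_peer(&mut self, conn: &Connection) -> Result<PeerId, AttachError> {
        let identity = conn.identity().ok_or(AttachError::NoResolvedIdentity)?;
        let peer_id = identity.id.clone();
        if !self.peers.contains(&peer_id) {
            self.peers.push(peer_id.clone());
        }
        Ok(peer_id)
    }

    pub fn contains(&self, peer_id: &str) -> bool {
        self.peers.iter().any(|p| p == peer_id)
    }
}
```

What the caller does with `NoResolvedIdentity` (reject, or take the no-peer-identity path) is left to the handler, as described above.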
## API keys
API keys (`ApiKeyEntry`) are **not** given the `PeerEntry` treatment. The
two identity sources have different semantics:
| Axis | Fingerprint (PeerEntry) | API key (ApiKeyEntry) |
|------|-------------------------|------------------------|
| Identity source | TLS handshake / SSH key | Bearer token in protocol frame |
| Key rotation | Same logical node, new material | New identity (revocation = new key) |
| `Identity.id` | `peer_id` (stable across rotation) | `prefix` (changes with the key) |
| Resource binding | `PeerEntry.resources` (per-peer) | Empty (Option B, auth.md) — resources are composition-only |
An API key's prefix IS the identity — rotating the key means a new prefix
and a new identity, by design (revocation is the rotation mechanism for
API keys). Decoupling the API key identity from the prefix would be solving
a different problem (persistent logical identity across key rotation) that
API keys don't have: they're bearer tokens, not node identities.
`ApiKeyEntry` stays as-is. The asymmetry is documented here and in
`auth.md` so the difference between the two auth paths is explicit, not an
oversight.
## What this does NOT change
- **`Identity` struct shape** — `id: String`, `scopes: Vec<String>`,
`resources: HashMap<String, Vec<String>>` are unchanged. Only the
*meaning* of `id` on the fingerprint path changes (fingerprint →
peer_id).
- **`IdentityProvider` trait** — unchanged. The adapter's resolution
semantics change, not the trait.
- **`AccessControl::check`** — unchanged. Still a flat scope/resource match
against `Identity`. The `Identity` it checks now has a stable `id` on the
fingerprint path, but `check` doesn't key on `id` (it checks scopes and
resources).
- **`AuthToken`, `AuthContext`** — unchanged.
- **`PeerRef::Specific(PeerId)` signature** — unchanged. The payload is now
stable.
- **`CompositeOperationEnv` → `PeerCompositeEnv` migration**: unchanged.
This ADR provides the stable `PeerId` source; ADR-029 still owns the
overlay-keying model.
## Consequences
**Positive:**
- Key rotation no longer breaks ACL entries or routing references on the
remote side. The vault's local rotation story (ADR-021) is now the
complete story — `rotate` locally, edit the peer entry's fingerprint
remotely, reload.
- `PeerRef::Specific` survives reconnects. An in-flight routing reference
to "worker-a" keeps resolving after worker-a's TLS key rotates and after
worker-a reconnects.
- OQ-33's UUID workaround is removed — the stable logical id is the real
thing, not an ephemeral stand-in.
- OQ-34's storage-boundary question is resolved: core has the config model
(`PeerEntry`) + the in-memory adapter (`ConfigIdentityProvider`); a
future `alknet-peer-store-sqlite` adapter that persists `PeerEntry`
records is additive, implementing the same `IdentityProvider` trait
against a `peers` table. See ADR-033.
- Fingerprint-resolved identities now carry `resources` (the pre-ADR-030
limitation is lifted) — `AccessControl::check` against `resource_type`/
`resource_action` works for external fingerprint-authenticated callers
when configured.
**Negative:**
- `AuthPolicy.authorized_fingerprints: HashSet<String>` is replaced with
`AuthPolicy.peers: Vec<PeerEntry>`. This is a breaking config change —
existing config files with `authorized_fingerprints` migrate to `peers`
entries. The migration is mechanical (each fingerprint becomes a
`PeerEntry { peer_id: <chosen name>, fingerprint: <old value>, scopes:
["relay:connect"], ... }`), and operators must choose a `peer_id` per
peer, but it is a config break.
- `Identity.id` for fingerprint-resolved identities changes from the
fingerprint to the `peer_id`. Code that logs or compares `Identity.id`
on the fingerprint path and assumed it was the fingerprint string will
see the `peer_id` instead. This is the correct behavior (logs should
show the logical name, not the rotating crypto material), but it's a
behavior change in log output.
- The pre-ADR-030 `auth.md` "Resource-scoped ACLs and external identities"
limitation note is removed — fingerprint-resolved identities now populate
`resources`. Code that relied on fingerprint identities always having
empty `resources` (an unintended invariant) will see populated resources
when configured.
- ADR-029 Assumption 1 is superseded on the `PeerId` source dimension:
the one-way door (`PeerId` is logical, not crypto) is preserved, but the
v1 UUID source is replaced by `Identity.id` from `PeerEntry`. The
Assumption's framing of "no-storage workaround" is no longer accurate —
the storage boundary is now explicitly `config + in-memory adapter`
(this ADR + ADR-033), with the SQLite adapter additive.
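
The mechanical config migration described in the first bullet above might look roughly like this. It is a sketch under assumptions: `migrate_fingerprints` and the `names` map (fingerprint to operator-chosen `peer_id`) are hypothetical, and the ADR does not prescribe a migration tool.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, PartialEq)]
pub struct PeerEntry {
    pub peer_id: String,
    pub fingerprint: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
    pub display_name: Option<String>,
    pub enabled: bool,
}

/// Hypothetical migration helper: turn the old fingerprint set into `peers`
/// entries. `names` maps each old fingerprint to the peer_id the operator
/// chose for it; a fingerprint without a chosen name is an error.
pub fn migrate_fingerprints(
    old: &HashSet<String>,
    names: &HashMap<String, String>,
) -> Result<Vec<PeerEntry>, String> {
    // Sort so the output order does not depend on HashSet iteration order.
    let mut fingerprints: Vec<&String> = old.iter().collect();
    fingerprints.sort();

    let mut peers = Vec::with_capacity(fingerprints.len());
    for fp in fingerprints {
        let peer_id = names
            .get(fp)
            .ok_or_else(|| format!("no peer_id chosen for fingerprint {}", fp))?;
        peers.push(PeerEntry {
            peer_id: peer_id.clone(),
            fingerprint: fp.clone(),
            // The scope every fingerprint identity received before this ADR.
            scopes: vec!["relay:connect".to_string()],
            resources: HashMap::new(),
            display_name: None,
            enabled: true,
        });
    }
    Ok(peers)
}
```

The only non-mechanical input is the `names` map, which matches the note that operators must choose a `peer_id` per peer.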
## Assumptions
1. **The dispatch path can require identity resolution for peer-keyed
overlay membership.** A connection that fails `IdentityProvider`
resolution has no `PeerId` and is not added to `PeerCompositeEnv`. The
caller either authenticates with a recognized fingerprint (and gets a
`peer_id`) or is rejected / falls back to a no-peer-identity path. The
v1 UUID fallback is removed deliberately — anonymous dial-in to a
peer-keyed composition env is a contradiction.
2. **`PeerEntry.peer_id` is operator-chosen and unique within a config.**
Config validation enforces uniqueness; duplicate `peer_id` values in
`AuthPolicy.peers` are a config error.
3. **API keys stay as-is.** The `ApiKeyEntry` model is correct for bearer-
token identity where rotation = new identity. This ADR does not add a
`PeerEntry`-equivalent for API keys. See "API keys" above.
4. **The `peers` list resolution is O(peers) per fingerprint lookup.** The
expected peer count per node is small (10s to 100s), so a linear scan is
fine. A `HashMap<fingerprint, &PeerEntry>` side index is an
implementation-detail two-way door.
5. **Adapter crates that persist `PeerEntry` records are additive and not
specified here.** ADR-033 establishes the pattern (core trait + in-memory
default; persistence adapters are separate crates); the concrete adapter
shapes are deferred for exploration per the user's note. This ADR's
commitment is to the `PeerEntry` config model + the resolution
semantics + the `PeerId` source, not to any specific backend.
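
Assumptions 2 and 4 can be illustrated together. The function names below (`validate_unique_peer_ids`, `build_fingerprint_index`) are hypothetical, the `PeerEntry` is trimmed to the fields involved, and the error type is a plain `String` purely for the sketch.

```rust
use std::collections::{HashMap, HashSet};

pub struct PeerEntry {
    pub peer_id: String,
    pub fingerprint: String,
    pub enabled: bool,
}

/// Assumption 2: duplicate peer_id values are a config error.
pub fn validate_unique_peer_ids(peers: &[PeerEntry]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for peer in peers {
        // `insert` returns false when the value was already present.
        if !seen.insert(peer.peer_id.as_str()) {
            return Err(format!("duplicate peer_id: {}", peer.peer_id));
        }
    }
    Ok(())
}

/// Assumption 4: an optional side index from fingerprint to entry. Disabled
/// peers are left out, and the first entry wins on a duplicate fingerprint,
/// matching the first-match behaviour of the linear scan.
pub fn build_fingerprint_index(peers: &[PeerEntry]) -> HashMap<&str, &PeerEntry> {
    let mut index = HashMap::new();
    for peer in peers.iter().filter(|p| p.enabled) {
        index.entry(peer.fingerprint.as_str()).or_insert(peer);
    }
    index
}
```

Because the index borrows from the `peers` slice, it would need rebuilding after every config reload; that is one reason to treat it as an optimization rather than part of the model.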
## References
- ADR-004: Auth as Shared Core (`IdentityProvider` in core)
- ADR-015: Privilege Model and Authority Context (`AccessControl::check`
against `Identity`)
- ADR-021: Key Rotation via Version-Indexed Paths (the local rotation half
this ADR completes across nodes)
- ADR-022: Handler Registration, Provenance, and Composition Authority
(the registration bundle's `composition_authority` path produces its own
`Identity`; this ADR's `PeerEntry.resources` populates the external-auth
path's `Identity.resources`)
- ADR-029: Peer-Graph Routing Model (the `PeerId = Identity.id` model;
Assumption 1's UUID source is superseded by this ADR's `PeerEntry.peer_id`
source — the one-way door is preserved)
- ADR-033: Storage Boundary and Repo/Adapter Pattern (the overarching pattern
this ADR's `PeerEntry` + `ConfigIdentityProvider` follows)
- OQ-33: PeerId — Cryptographic Identity vs Stable Logical Identifier
(resolved by this ADR — the "real solution" half, replacing the UUID
workaround)
- OQ-34: Persistent Peer Registry (resolved by this ADR + ADR-033 — the
storage boundary is `config + in-memory adapter` now, SQLite adapter
additive)
- OQ-35: API Key Identity vs Peer Identity (recorded by this ADR — the
asymmetry is deliberate, see "API keys" above)
- `docs/research/alknet-storage-strategy/findings.md` §4 (the `PeerEntry`
model and resolution path)
- `docs/architecture/crates/core/auth.md` (the spec amended by this ADR)
- `docs/architecture/crates/core/config.md` (the `AuthPolicy` change)


@@ -0,0 +1,213 @@
# ADR-031: CredentialStore Repo Trait
## Status
Accepted (establishes the second repo-trait in core, alongside
`IdentityProvider`; resolves the credential-persistence dimension of
OQ-34)
## Context
`alknet-http`'s `from_openapi` / `from_mcp` handlers need provider
credentials (API keys, OAuth tokens) to call outbound services. ADR-014
established the no-env-vars invariant: credentials come from
`Capabilities`, populated by the assembly layer from the vault at startup.
The vault (ADR-018/019/020/025/026) handles encryption/decryption; the
master seed and derived private keys never cross the network.
What's missing is the **persistence layer** for the encrypted credential
blobs. Today the in-memory `Capabilities` path works for the
vault-at-startup deployment (the assembly layer decrypts everything the
handlers need at boot, injects into `Capabilities`), but there is no
shared, trait-bound abstraction for *where the encrypted blobs live* before
the assembly layer decrypts them, and no way for a runtime process to
`put`/`get`/`delete` encrypted credentials without re-implementing the
storage shape in every consumer.
The research at `docs/research/alknet-storage-strategy/findings.md` §4
identified this as the second application of the repo/adapter pattern (the
first being `IdentityProvider` for peer identity). The vault encrypts; a
`CredentialStore` persists the `EncryptedData` blob; the assembly layer
loads them into `Capabilities` at registration time. The trait boundary
that matters for cross-crate sharing is the store trait, not the storage
backend — exactly mirroring `IdentityProvider`.
The keypal reference (`/workspace/keypal`) demonstrates the same pattern in
TypeScript: a `Storage` interface with adapters for Redis, Drizzle, Prisma,
Kysely, Convex, and in-memory. The core logic is backend-agnostic; storage
is a trait; the consumer picks the adapter at wiring time. The alknet
equivalent: core defines the repo trait, the default in-memory adapter
lives alongside it, and a future persistence adapter is a separate crate
(ADR-033).
## Decision
### 1. Add `CredentialStore` trait to alknet-core
```rust
pub trait CredentialStore: Send + Sync {
    fn get(&self, provider: &str) -> Option<EncryptedData>;
    fn put(&self, provider: &str, data: &EncryptedData) -> Result<(), CredentialStoreError>;
    fn delete(&self, provider: &str) -> Result<(), CredentialStoreError>;
}
```
- `provider: &str` — the provider identifier (`"openai"`, `"anthropic"`,
`"github"`, etc.). The key the assembly layer uses to look up a
credential when populating `Capabilities`.
- `EncryptedData` — the vault's encrypted-blob type (ADR-020, defined in
`alknet-vault`). The store persists the blob as-is; it does not decrypt.
Decryption is the vault's job (ADR-025, local-only by construction).
- `CredentialStoreError` — a crate-level error enum for store failures
(backend unreachable, serialization, etc.). `#[non_exhaustive]` so
adapter crates can extend without breaking match arms.
The trait returns `Option<EncryptedData>` from `get` (not `Result`): a
missing credential is the common case (the provider isn't configured),
not an error. `put` and `delete` are mutations and return `Result` since
the backend may be unwritable (a read-only deployment, a corrupted store,
etc.).
### 2. Add `InMemoryCredentialStore` default adapter to alknet-core
```rust
pub struct InMemoryCredentialStore {
    entries: RwLock<HashMap<String, EncryptedData>>,
}

impl InMemoryCredentialStore {
    pub fn new() -> Self;
    pub fn with_entries(entries: HashMap<String, EncryptedData>) -> Self;
}

impl CredentialStore for InMemoryCredentialStore { ... }
```
The default adapter covers tests and config-loaded deployments where
credentials are decrypted from the vault at startup and held in memory for
the process lifetime. This is the same posture as
`ConfigIdentityProvider` — no persistence, no backend dependency, no env
vars. The assembly layer constructs it from vault-decrypted entries at
boot.
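
One way the elided `impl CredentialStore for InMemoryCredentialStore` could be filled in is sketched below. The `EncryptedData` field types, the `Poisoned` error variant, and the choice to treat deleting a missing provider as success are all assumptions made for the example; the ADR fixes only the trait signatures.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Field names follow the ADR; the field types are assumed for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct EncryptedData {
    pub key_version: u32,
    pub salt: Vec<u8>,
    pub iv: Vec<u8>,
    pub data: Vec<u8>,
}

#[derive(Debug)]
#[non_exhaustive]
pub enum CredentialStoreError {
    /// Hypothetical variant: the in-memory lock was poisoned by a panic.
    Poisoned,
}

pub trait CredentialStore: Send + Sync {
    fn get(&self, provider: &str) -> Option<EncryptedData>;
    fn put(&self, provider: &str, data: &EncryptedData) -> Result<(), CredentialStoreError>;
    fn delete(&self, provider: &str) -> Result<(), CredentialStoreError>;
}

#[derive(Default)]
pub struct InMemoryCredentialStore {
    entries: RwLock<HashMap<String, EncryptedData>>,
}

impl InMemoryCredentialStore {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn with_entries(entries: HashMap<String, EncryptedData>) -> Self {
        Self { entries: RwLock::new(entries) }
    }
}

impl CredentialStore for InMemoryCredentialStore {
    fn get(&self, provider: &str) -> Option<EncryptedData> {
        // `get` has no error channel, so a poisoned lock reads as "missing".
        self.entries.read().ok()?.get(provider).cloned()
    }

    fn put(&self, provider: &str, data: &EncryptedData) -> Result<(), CredentialStoreError> {
        let mut entries = self.entries.write().map_err(|_| CredentialStoreError::Poisoned)?;
        entries.insert(provider.to_string(), data.clone());
        Ok(())
    }

    fn delete(&self, provider: &str) -> Result<(), CredentialStoreError> {
        let mut entries = self.entries.write().map_err(|_| CredentialStoreError::Poisoned)?;
        // Deleting an absent provider is treated as success in this sketch.
        entries.remove(provider);
        Ok(())
    }
}
```

The store only clones opaque blobs in and out; nothing in it touches plaintext.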
### 3. `EncryptedData` re-export shape
The store trait references `EncryptedData`, which is defined in
`alknet-vault`. To keep alknet-core lean (no vault dependency; ADR-018
keeps the vault standalone with zero alknet-crate dependencies), the
trait's `EncryptedData` parameter is a **core-owned serializable type**:
the vault produces it; the store persists it as a serializable blob; the
vault consumes it back. The core trait carries the wire shape without a
vault dependency.
The exact shape of `EncryptedData` in core is a thin serializable struct
mirroring the vault's type: `{ key_version, salt, iv, data }` (the fields
the vault's `EncryptedData` carries, per ADR-020 and
`crates/alknet-vault/src/encryption.rs`). The `salt` field is kept for
wire-format compatibility with the TS predecessor (OQ-20) — a core mirror
that omitted it could not round-trip the vault's `EncryptedData`. This is a
one-way door — it pins the credential-blob wire shape — and it's
intentionally minimal (the vault's HD-derivation path is the vault's
concern, ADR-020). ADR-020 already defines this shape; this ADR's
commitment is that the store trait carries it as a serializable value type,
not a vault-bound reference.
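
To make the "thin mirror" idea concrete, here is a sketch of the round-trip. Everything beyond the four field names is assumed: the field types are guesses, `vault_standin` is a local stand-in for the real `alknet-vault` type, and `to_core` / `to_vault` are hypothetical assembly-layer helpers. They are written as free functions because neither crate depends on the other, so a `From` impl would have nowhere to live.

```rust
/// Hypothetical stand-in for the vault crate's type; the real definition
/// lives in `alknet-vault` and its field types are not shown in this ADR.
pub mod vault_standin {
    #[derive(Debug, Clone, PartialEq)]
    pub struct EncryptedData {
        pub key_version: u32,
        pub salt: Vec<u8>,
        pub iv: Vec<u8>,
        pub data: Vec<u8>,
    }
}

/// Core-owned mirror: same fields, no vault dependency.
#[derive(Debug, Clone, PartialEq)]
pub struct EncryptedData {
    pub key_version: u32,
    pub salt: Vec<u8>,
    pub iv: Vec<u8>,
    pub data: Vec<u8>,
}

/// Assembly-layer helper (hypothetical): vault blob to core mirror.
pub fn to_core(v: vault_standin::EncryptedData) -> EncryptedData {
    EncryptedData {
        key_version: v.key_version,
        salt: v.salt,
        iv: v.iv,
        data: v.data,
    }
}

/// Assembly-layer helper (hypothetical): core mirror back to vault blob.
pub fn to_vault(c: EncryptedData) -> vault_standin::EncryptedData {
    vault_standin::EncryptedData {
        key_version: c.key_version,
        salt: c.salt,
        iv: c.iv,
        data: c.data,
    }
}
```

Dropping `salt` from the mirror would make `to_vault` impossible to write losslessly, which is the round-trip concern described above.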
### 4. No `list` method
The trait is `get` / `put` / `delete` — no `list`. The research (§11 OQ-3)
flagged `list` as a two-way-door remainder: a management UI or a startup-
enumeration use case might want to list all stored providers, but no
current consumer needs it. Adding `list` is non-breaking (a new method
with a default-impl, or a `list_providers(&self) -> Vec<String>` that
returns `vec![]` from the in-memory adapter until overridden).
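
The non-breaking claim rests on Rust's default trait methods. A small sketch, with the blob type simplified to `Vec<u8>` and both store types (`LegacyStore`, `ListingStore`) invented for the illustration:

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Simplified blob type for the sketch; the real trait uses EncryptedData.
pub type Blob = Vec<u8>;

pub trait CredentialStore: Send + Sync {
    fn get(&self, provider: &str) -> Option<Blob>;

    /// Hypothetical additive method. The default body means existing
    /// implementors keep compiling without changes.
    fn list_providers(&self) -> Vec<String> {
        Vec::new()
    }
}

/// An implementor written before `list_providers` existed: no override.
pub struct LegacyStore;

impl CredentialStore for LegacyStore {
    fn get(&self, _provider: &str) -> Option<Blob> {
        None
    }
}

/// An implementor that opts in by overriding the default.
pub struct ListingStore {
    entries: RwLock<HashMap<String, Blob>>,
}

impl ListingStore {
    pub fn new(entries: HashMap<String, Blob>) -> Self {
        Self { entries: RwLock::new(entries) }
    }
}

impl CredentialStore for ListingStore {
    fn get(&self, provider: &str) -> Option<Blob> {
        self.entries.read().ok()?.get(provider).cloned()
    }

    fn list_providers(&self) -> Vec<String> {
        let mut names: Vec<String> = match self.entries.read() {
            Ok(entries) => entries.keys().cloned().collect(),
            Err(_) => Vec::new(),
        };
        names.sort(); // stable output regardless of HashMap iteration order
        names
    }
}
```

`LegacyStore` compiles unchanged and reports an empty list, while `ListingStore` returns its sorted provider names.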
## Consequences
**Positive:**
- A second repo trait in core establishes the pattern concretely:
`IdentityProvider` for identity resolution, `CredentialStore` for
encrypted-credential persistence. Both follow the same shape (core trait
+ in-memory default; persistence adapters additive in separate crates,
ADR-033).
- The vault stays local-only by construction (ADR-025). The store
persists `EncryptedData` blobs; the vault decrypts them. The store
never sees plaintext credentials, never sees the master seed, never
holds derived keys. The encryption boundary is preserved.
- The no-env-vars invariant (ADR-014) gets a persistence-layer
counterpart: encrypted credentials persist in a `CredentialStore`, the
assembly layer loads them into `Capabilities` at registration time, the
handlers read from `Capabilities` per-request. No `std::env::var` path
exists at any layer.
- `alknet-http`'s `from_openapi` / `from_mcp` handlers consume the trait
via `Capabilities` (the assembly layer wires the
`CredentialStore` → `Capabilities` mapping at registration). The
handlers don't know whether the credential came from an in-memory map
or a SQLite file.
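
The store-to-`Capabilities` wiring mentioned in these bullets could be sketched as below. This is illustrative only: the `Capabilities` shape, `MapStore`, and `load_capabilities` are hypothetical, the trait is trimmed to `get`, and the `decrypt` closure stands in for the vault call, which this ADR does not specify.

```rust
use std::collections::HashMap;

/// Simplified blob for the sketch; only the ciphertext field is modelled.
#[derive(Debug, Clone)]
pub struct EncryptedData {
    pub data: Vec<u8>,
}

/// Trimmed to the one method the assembly layer needs here.
pub trait CredentialStore {
    fn get(&self, provider: &str) -> Option<EncryptedData>;
}

/// Hypothetical store backed by a plain map.
pub struct MapStore(pub HashMap<String, EncryptedData>);

impl CredentialStore for MapStore {
    fn get(&self, provider: &str) -> Option<EncryptedData> {
        self.0.get(provider).cloned()
    }
}

/// Hypothetical shape: decrypted credentials keyed by provider.
#[derive(Debug, Default)]
pub struct Capabilities {
    credentials: HashMap<String, String>,
}

impl Capabilities {
    pub fn credential(&self, provider: &str) -> Option<&str> {
        self.credentials.get(provider).map(|s| s.as_str())
    }
}

/// Assembly-layer wiring: fetch each blob, hand it to the vault-supplied
/// `decrypt` closure, and keep the plaintext in Capabilities. Providers
/// that are missing or fail to decrypt are skipped.
pub fn load_capabilities<D>(
    store: &dyn CredentialStore,
    providers: &[&str],
    decrypt: D,
) -> Capabilities
where
    D: Fn(&EncryptedData) -> Option<String>,
{
    let mut caps = Capabilities::default();
    for provider in providers {
        if let Some(plain) = store.get(provider).and_then(|blob| decrypt(&blob)) {
            caps.credentials.insert(provider.to_string(), plain);
        }
    }
    caps
}
```

Handlers only ever see `Capabilities`; which `CredentialStore` adapter sat behind `load_capabilities` is invisible to them.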
**Negative:**
- alknet-core gains a second trait and a default adapter. The dependency
surface grows by one trait + one struct + one error enum — small, but
non-zero. The trade is that downstream crates (alknet-http, future
credential-management UIs) get a shared abstraction instead of each
rolling their own store shape.
- The `EncryptedData` type is re-stated in core (a thin serializable
shape mirroring the vault's type). If the vault's `EncryptedData` shape
changes (a new key version, an additional field), the core shape must
be kept in sync. The shape is small and stable (ADR-020 locked it), so
the sync cost is low.
- A future persistence adapter (`alknet-credential-store-sqlite` or
similar) is additive and not specified here. The trait shape is the
one-way door; the adapter is a two-way door (ADR-033). Concrete adapter
shapes are deferred for exploration per the project's note that the
repo pattern is a tool to reach for, not a one-size-fits-all mold.
## Assumptions
1. **The vault remains the sole encryption boundary.** `CredentialStore`
persists `EncryptedData` blobs and never decrypts. Decryption is the
vault's job, local-only (ADR-025). This ADR does not introduce a remote
decryption path.
2. **`provider: &str` is the key.** Credentials are keyed by provider
name (`"openai"`, `"anthropic"`, etc.). Multi-credential-per-provider
(e.g., separate keys for org-A vs org-B under the same provider) is
not in the trait shape; if needed, an additive `get_scoped(provider,
scope)` method is the extension path — not a signature change to the
existing `get` (which is a one-way-door break on the trait).
3. **No `list` method.** The trait is `get` / `put` / `delete`. Adding
`list` is non-breaking (a default-impl method). See "No `list` method"
above.
4. **Adapter crates that persist credentials are additive and not
specified here.** ADR-033 establishes the pattern; the concrete adapter
shapes are deferred for exploration. This ADR's commitment is to the
trait shape + the in-memory default, not to any specific backend.
5. **`EncryptedData` in core is a thin serializable mirror of the vault's
type.** The vault owns the encryption logic and the HD-derivation path
(ADR-020); core carries only the wire shape. This keeps the vault
standalone (ADR-018) while letting the store trait reference a concrete
type.
## References
- ADR-014: Secret Material Flow and Capability Injection (the no-env-vars
invariant this trait supports)
- ADR-018: Vault as Standalone Crate (the vault has zero alknet-crate
dependencies; core's `EncryptedData` is a thin mirror, not a vault
reference)
- ADR-019: Vault Assembly-Layer-Only Access (the assembly layer bridges
vault → `CredentialStore` → `Capabilities`)
- ADR-020: HD Derivation for Encryption Keys (the `EncryptedData` shape)
- ADR-025: Vault Local-Only Dispatch (the store never decrypts; the vault
is the sole decryption boundary)
- ADR-033: Storage Boundary and Repo/Adapter Pattern (the overarching
pattern this ADR follows)
- OQ-34: Persistent Peer Registry (resolved by this ADR + ADR-030 + ADR-033
— the storage boundary is `config + in-memory adapter` now, persistence
adapters additive)
- `docs/research/alknet-storage-strategy/findings.md` §4 (the
`CredentialStore` trait and adapter pattern)
- `/workspace/keypal` — TypeScript repo-pattern reference (Storage
interface + adapters; the pattern alknet's `CredentialStore` follows)


@@ -0,0 +1,221 @@
# ADR-032: Forwarded-For Identity (Metadata, Not Authority)
## Status
Accepted (adds a wire-format field and an `OperationContext` field;
included with the ADR-029 migration or a companion task immediately after,
since `OperationContext` and the `from_call` handler are being rewritten)
## Context
When a hub forwards a call to a spoke (via `from_call`, ADR-017), the spoke
authenticates the hub (resolves the hub's identity from the connection)
and checks its ACL: "is the hub authorized to call this operation?" The
spoke's ACL answers yes/no based on the hub's identity. This is per-node
ACL (ADR-029 §3) — the correct authorization model, no "trusted" bypass.
But the spoke is **blind to the originator**. It knows "the hub called me"
but not "alice asked the hub to call me." The hub's `OperationContext.identity`
holds alice's identity (the hub authenticated alice), but the `from_call`
forwarding handler authenticates as the hub (its own `auth_token`), so the
spoke sees the hub's identity, not alice's. The originator information is
at the hub, not at the spoke.
This matters for three use cases the research at
`docs/research/alknet-storage-strategy/findings.md` §6 identified:
1. **Audit trail.** A cross-node call chain is untraceable at the leaf
without the originator. The spoke logs "the hub called `/docker/start`"
but can't log "alice asked the hub to call `/docker/start`." For
debugging, billing, and abuse investigation, the originator matters.
2. **Per-user rate limiting at the leaf.** If the spoke wants to rate-limit
per-user (not per-hub), or apply per-user quotas, it can't — it only
sees the hub. The hub would have to proxy and track everything, which
defeats the point of direct service composition.
3. **Handler context.** A handler may want the originator's identity for
application logic (per-user views, per-user data isolation, attribution
in logs).
The question is whether to include the originator's identity in the
forwarded call. The wire format is the constraint: a field is either in the
`call.requested` payload or it isn't — it can't be bolted on later without
a protocol change. This is a wire-format one-way door.
## Decision
### 1. Add `forwarded_for` to the `call.requested` payload
```json
{
  "operationId": "/docker/start",
  "input": { ... },
  "auth_token": "alk_...",   // the direct caller's token (the hub's)
  "forwarded_for": {         // the original caller (the end user's)
    "id": "alice",
    "scopes": ["fs:read", "docker:start"],
    "resources": {}
  }
}
```
`forwarded_for` is optional (`None` when the call is not forwarded, or
when the forwarder chooses not to propagate it). It carries a serialized
`Identity` (id, scopes, resources) — the originator's resolved identity at
the forwarding node.
### 2. Add `forwarded_for` to `OperationContext`
```rust
pub struct OperationContext {
    // ... existing fields ...

    /// The original caller when this call was forwarded (ADR-032).
    /// Metadata only: NOT used by `AccessControl::check`. The dispatch
    /// path populates it from the `call.requested.forwarded_for` field;
    /// the `from_call` handler sets it when constructing the forwarded
    /// payload (see §3). Handlers may read it for logging, auditing,
    /// per-user rate limiting, or application context. The ACL check
    /// always runs against `identity` (the direct caller), never against
    /// `forwarded_for`.
    pub forwarded_for: Option<Identity>,
}
```
`identity` is the direct caller (authorized by ACL). `forwarded_for` is
the original caller (metadata only). The ACL check call is unchanged:
`AccessControl::check(identity.as_ref())`. `forwarded_for` is a
**separate** field on the context, not a parameter to `check`.
### 3. The `from_call` handler populates `forwarded_for`
The hub's `from_call` forwarding handler constructs the `call.requested`
payload to send to the spoke. It populates `forwarded_for` with the end
user's identity — read from the hub's `OperationContext.identity` (the
caller the hub authenticated) when the hub forwards the call. The hub
authenticates as itself (its own `auth_token`); the `forwarded_for` field
carries the originator's identity as context.
This is the hub's responsibility, not the protocol's. The protocol carries
the field; the `from_call` handler chooses to populate it. A forwarder that
doesn't want to disclose the originator can set `forwarded_for: None` (the
spoke sees only the hub). A forwarder that wants to propagate it sets it.
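The forwarder's choice described above can be sketched as follows. This is a minimal illustration, not the actual `from_call` handler: the `CallRequested` struct, the `Propagation` enum, the `build_forwarded_call` function, and the concrete field types of `Identity` are all assumptions introduced here, and the `input` field is left out for brevity.

```rust
use std::collections::BTreeMap;

/// Mirror of the `Identity` shape named in this ADR (id, scopes, resources).
/// The concrete field types are assumptions.
#[derive(Clone, Debug, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: BTreeMap<String, String>,
}

/// Hypothetical in-memory form of the `call.requested` payload.
#[derive(Debug)]
pub struct CallRequested {
    pub operation_id: String,
    /// Always the forwarder's own token, never the originator's.
    pub auth_token: String,
    /// The originator, if the forwarder chooses to disclose it.
    pub forwarded_for: Option<Identity>,
}

/// Hypothetical forwarder policy: disclose the originator or withhold it.
pub enum Propagation {
    Disclose,
    Withhold,
}

/// Builds the payload the hub sends to the spoke. `caller` is the identity
/// the hub resolved for its own direct caller (`OperationContext.identity`).
pub fn build_forwarded_call(
    operation_id: &str,
    hub_token: &str,
    caller: Option<&Identity>,
    policy: Propagation,
) -> CallRequested {
    let forwarded_for = match policy {
        Propagation::Disclose => caller.cloned(),
        Propagation::Withhold => None,
    };
    CallRequested {
        operation_id: operation_id.to_string(),
        auth_token: hub_token.to_string(),
        forwarded_for,
    }
}
```

In both branches the hub authenticates with its own token; only the metadata field differs.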
### 4. `AccessControl::check` never reads `forwarded_for`
The security property: `forwarded_for` is metadata, not authority. The
spoke's dispatch path makes it available on `OperationContext` for handlers,
but `AccessControl::check(identity.as_ref())` — the ACL check — always
authorizes the **direct caller's** identity, never the forwarded-for
identity. There is no path through which `forwarded_for` becomes an
authorization input.
This is enforced structurally, not by convention: `AccessControl::check`
takes `Option<&Identity>` (the direct caller's identity). The
`forwarded_for` field is `Option<Identity>` on `OperationContext`, but
the check signature doesn't accept it. If someone wants to ACL on the
forwarded-for identity, they would have to change the
`AccessControl::check` signature — a visible, reviewable change, not a
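quiet flag flip. The type system prevents accidental misuse in the ACL
path; handlers that read `forwarded_for` directly must still treat it as a
claim (see Consequences).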
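A minimal sketch of the structural separation, under stated assumptions: `AccessControl` is modelled here as a struct with a single required scope and `check` as a method on it, which is a simplification and not the real ACL model. The `authorize` and `originator_for_logging` helpers are hypothetical. The point is only that `check` has no parameter through which `forwarded_for` could arrive.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

pub struct OperationContext {
    /// The direct caller: the only identity the ACL sees.
    pub identity: Option<Identity>,
    /// The claimed originator: metadata for handlers.
    pub forwarded_for: Option<Identity>,
}

/// Simplified stand-in for the ACL: a single required scope (assumption).
pub struct AccessControl {
    pub required_scope: String,
}

impl AccessControl {
    /// Takes only the direct caller's identity, as in this ADR.
    pub fn check(&self, identity: Option<&Identity>) -> bool {
        identity.map_or(false, |id| {
            id.scopes.iter().any(|s| s == &self.required_scope)
        })
    }
}

/// Hypothetical dispatch step: authorization reads `identity` only.
pub fn authorize(acl: &AccessControl, ctx: &OperationContext) -> bool {
    acl.check(ctx.identity.as_ref())
}

/// Hypothetical handler-side read: metadata for logs, not authority.
pub fn originator_for_logging(ctx: &OperationContext) -> Option<&str> {
    ctx.forwarded_for.as_ref().map(|f| f.id.as_str())
}
```

Swapping the two identities in the context flips the authorization result, which shows that scopes carried in `forwarded_for` grant nothing.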
## Why include it now
The window is the ADR-029 migration. The `from_call` handler is being
rewritten (peer-keyed overlays, `AccessControl`-based peer authorization,
removal of `remote_safe`/`trusted_peer`), and `OperationContext` is being
touched (the `PeerCompositeEnv` aggregation changes how the context is
built). Adding a field to the `call.requested` payload and to
`OperationContext` now is the cheapest point — the structures are already
under edit. After the protocol ships without it, adding it is a coordinated
wire-format change (every client and server that wants to use the field
must learn it) and an `OperationContext` break (every site that constructs
the struct, or destructures it exhaustively, must update).
## Consequences
**Positive:**
- The spoke can audit cross-node call chains. The leaf knows who actually
initiated the call, not just who forwarded it.
- Per-user rate limiting at the leaf becomes possible. The spoke can key
rate-limit state on `forwarded_for.id` instead of only on the hub's
identity.
- Handler application logic can use the originator's identity for per-user
views, per-user data isolation, or attribution.
- The security model is unchanged: the spoke authorizes the hub (its
direct caller), not the end user. The `forwarded_for` field is metadata,
not authority. The type-system separation (`check` takes `identity`, not
`forwarded_for`) prevents misuse.
- The forwarder decides. A hub that doesn't want to disclose the
originator (e.g., for privacy, or because the originator's identity is
not meaningful to the spoke) sets `forwarded_for: None`. The field is
opt-in by the forwarder, not mandatory.
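As an illustration of the per-user rate-limiting consequence, a spoke-side limiter might key its state on the pair (direct caller, claimed originator). The sketch below is hypothetical: `PerUserLimiter` is not part of the spec, and it uses a fixed budget with no time window to stay short. Because `forwarded_for` is the hub's claim, this kind of limit is only meaningful when the hub is trusted; keeping the direct caller in the key means a forged originator can only subdivide the hub's own buckets.

```rust
use std::collections::HashMap;

pub struct Identity {
    pub id: String,
}

/// Hypothetical fixed-budget limiter (no time window, for brevity).
pub struct PerUserLimiter {
    limit: u32,
    /// Key: (direct caller id, claimed originator id if any).
    counts: HashMap<(String, Option<String>), u32>,
}

impl PerUserLimiter {
    pub fn new(limit: u32) -> Self {
        Self { limit, counts: HashMap::new() }
    }

    /// Returns true while the (direct caller, originator) pair is within
    /// budget, and records the call when it is allowed.
    pub fn allow(&mut self, direct: &Identity, forwarded_for: Option<&Identity>) -> bool {
        let key = (direct.id.clone(), forwarded_for.map(|f| f.id.clone()));
        let count = self.counts.entry(key).or_insert(0);
        if *count >= self.limit {
            return false;
        }
        *count += 1;
        true
    }
}
```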
**Negative:**
- The `call.requested` payload gains a field. This is an additive
wire-format change: old servers that don't recognize `forwarded_for`
ignore it (JSON deserialization into a struct without the field drops it,
provided the deserializer is not configured to reject unknown fields); old
clients that don't send it produce `forwarded_for: None` on the server. The
addition is compatible in both directions, but a server that wants to
*use* `forwarded_for` must be new enough to deserialize it.
- `OperationContext` gains a field. Handlers that construct
`OperationContext` literals (tests, custom dispatch paths) must add the
field. The composition path (`OperationEnv::invoke`) sets it to `None`
for composed children — `forwarded_for` is a wire-ingress field, not a
composition-ingress field.
- The `Identity` in `forwarded_for` is a serialized value on the wire,
not a server-resolved identity. The spoke receives the hub's *claim*
about the originator's identity. A malicious hub could lie — set
`forwarded_for` to a fake identity. The spoke must not treat
`forwarded_for` as authoritative for anything security-relevant; it's
the hub's assertion, useful for audit/attribution when the hub is
trusted, but not a verified identity. This is the inherent property of
forwarded-for metadata (same as HTTP `X-Forwarded-For` — it's a claim by
the forwarder, not a verified value).
- One more field for the `from_call` handler to populate correctly. The
handler must read the hub's `OperationContext.identity` and decide
whether to propagate it. This is a small implementation cost, but it's a
handler-responsibility increase.
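One way a spoke handler could record the originator while keeping its status as a claim visible is sketched below. The `audit_line` function and its output format are hypothetical, not part of the spec.

```rust
pub struct Identity {
    pub id: String,
}

/// Formats an audit entry. The direct caller is the authenticated party;
/// the originator, when present, is labelled as a claim by that caller.
pub fn audit_line(
    operation_id: &str,
    direct: &Identity,
    forwarded_for: Option<&Identity>,
) -> String {
    match forwarded_for {
        Some(origin) => format!(
            "{} called {} (claimed originator: {})",
            direct.id, operation_id, origin.id
        ),
        None => format!("{} called {}", direct.id, operation_id),
    }
}
```

Labelling the value as claimed in the log line keeps the distinction between authenticated and asserted identities intact for whoever reads the audit trail later.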
## Assumptions
1. **`forwarded_for` is a claim by the forwarder, not a verified
identity.** The spoke receives the hub's assertion about the
originator. A malicious hub can lie. The spoke must not use
`forwarded_for` as authoritative for security decisions — only for
audit, logging, and application-context purposes when the hub is
trusted. This is the same property as HTTP `X-Forwarded-For`.
2. **`AccessControl::check` never reads `forwarded_for`.** The security
property is structural (the check signature doesn't accept it), not
conventional. Adding `forwarded_for` to the ACL path would require a
signature change to `AccessControl::check` — a visible, reviewable
change.
3. **`forwarded_for` is wire-ingress only.** Composed children (calls via
`OperationEnv::invoke`) do not inherit `forwarded_for` — they get
`None`. On ingress, the dispatch path populates the context field from
`call.requested.forwarded_for`; on egress, the `from_call` forwarding
handler sets the payload field when constructing the forwarded call.
Composition-propagation of `forwarded_for` would be a separate decision
(not in this ADR).
4. **The `Identity` shape in `forwarded_for` is the same as `Identity`
on `OperationContext`.** Both carry `id`, `scopes`, `resources`. The
`forwarded_for` value is a serialized `Identity` from the forwarding
node's resolution — the same `Identity` the hub resolved for the end
user, just passed along as metadata.
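Assumption 3 can be sketched as a context-derivation step. The `child_context` helper below is hypothetical and merely stands in for whatever `OperationEnv::invoke` does internally. Copying the parent's `identity` into the child is an illustrative simplification; how authority is chosen for composed calls is governed by ADR-015 and is not modelled here.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Identity {
    pub id: String,
}

#[derive(Clone, Debug)]
pub struct OperationContext {
    pub identity: Option<Identity>,
    pub forwarded_for: Option<Identity>,
}

/// Hypothetical stand-in for the context built for a composed child call.
/// `forwarded_for` is wire-ingress only, so the child always gets `None`.
pub fn child_context(parent: &OperationContext) -> OperationContext {
    OperationContext {
        // Illustrative only: authority selection is out of scope here.
        identity: parent.identity.clone(),
        forwarded_for: None,
    }
}
```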
## References
- ADR-014: Secret Material Flow and Capability Injection (`forwarded_for`
carries an `Identity` with scopes/resources, not secret material — the
no-secret-material-on-the-wire invariant is preserved)
- ADR-015: Privilege Model and Authority Context (the authority-switch
model — `forwarded_for` does not participate; the direct caller's
identity is the authority)
- ADR-017: Call Protocol Client and Adapter Contract (the `from_call`
forwarding handler that populates `forwarded_for`)
- ADR-029: Peer-Graph Routing Model (the migration window —
`OperationContext` and the `from_call` handler are being rewritten)
- `docs/research/alknet-storage-strategy/findings.md` §6 (the
forwarded-for identity decision and rationale)


@@ -0,0 +1,223 @@
# ADR-033: Storage Boundary and Repo/Adapter Pattern
## Status
Accepted (resolves the storage-boundary dimension of OQ-34; establishes the
pattern that ADR-030 and ADR-031 follow)
## Context
OQ-34 tracked the storage-boundary question: do the core crates (alknet-core,
alknet-call, alknet-vault) know about persistence at all, or does persistence
live entirely outside the crate graph? The question was left open because the
project deliberately kept the core crates DB-free — smaller, fewer
dependencies, simpler testing. That posture served the local-only crates
(vault, registry) well: vault key rotation is version-indexed derivation
paths (ADR-021), no DB needed.
Then peer identity surfaced as the first cross-node state that wants
persistence: a stable logical peer identity mapped to its current
cryptographic material, surviving restarts and key rotations. OQ-33's v1
UUID workaround was a no-storage stand-in. The research at
`docs/research/alknet-storage-strategy/findings.md` identified the answer:
core defines repo traits (the abstraction), adapters implement them against
specific backends (the implementation), the assembly layer wires the
adapter. This is the same pattern `IdentityProvider` already establishes —
we're making it explicit and extending it to every storage-shaped concern.
The research also established that `IdentityProvider` is the right shape
*for the trait boundary*, not for the implementation: the trait is in core;
the implementations are adapters. The pre-ADR-030 framing ("core is
storage-free, persistence is entirely outside the crate graph") was too
narrow — it conflated "core has no DB dependency" (true and preserved) with
"core has no storage abstraction" (the question). The answer is: **core has
the trait and the in-memory default; persistence adapters are separate
crates; the assembly layer wires the adapter.**
This is a one-way door. If core gains a repo trait, downstream crates depend
on the trait shape and it becomes a contract. If core stays storage-free,
the registry lives in a service crate and core never knows about
persistence. Reversing either direction breaks downstream consumers. The
research has made the decision; this ADR records it.
## Decision
### 1. Core defines repo traits; the in-memory default adapter lives alongside the trait
The core crates own the **trait boundary** for storage-shaped concerns and
the **in-memory default adapter**. They do NOT own the persistence backends.
```rust
// alknet-core — the pattern, applied to two concerns:
pub trait IdentityProvider: Send + Sync + 'static { // ADR-004
fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}
pub struct ConfigIdentityProvider { /* fields elided */ } // in-memory default (ADR-030)
pub trait CredentialStore: Send + Sync { // ADR-031
fn get(&self, provider: &str) -> Option<EncryptedData>;
fn put(&self, provider: &str, data: &EncryptedData) -> Result<(), CredentialStoreError>;
fn delete(&self, provider: &str) -> Result<(), CredentialStoreError>;
}
pub struct InMemoryCredentialStore { /* fields elided */ } // in-memory default (ADR-031)
```
The trait is the one-way door — once downstream crates depend on it, the
shape is a contract. The in-memory default adapter is a reference
implementation that covers tests and config-loaded deployments; it carries
no persistence backend dependency.
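To make the "trait plus in-memory default" shape concrete, here is a minimal sketch of an in-memory adapter for the `CredentialStore` trait as declared above. Several details are assumptions made for illustration: the fields of `EncryptedData`, the variants of `CredentialStoreError`, the choice of `RwLock<HashMap<..>>` for interior mutability, and the decision that deleting a missing entry returns an error.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Assumed shape: an opaque encrypted blob plus its salt.
#[derive(Clone, Debug, PartialEq)]
pub struct EncryptedData {
    pub salt: Vec<u8>,
    pub ciphertext: Vec<u8>,
}

/// Hypothetical error variants.
#[derive(Debug, PartialEq)]
pub enum CredentialStoreError {
    NotFound,
    LockPoisoned,
}

pub trait CredentialStore: Send + Sync {
    fn get(&self, provider: &str) -> Option<EncryptedData>;
    fn put(&self, provider: &str, data: &EncryptedData) -> Result<(), CredentialStoreError>;
    fn delete(&self, provider: &str) -> Result<(), CredentialStoreError>;
}

/// In-memory adapter: stores blobs as-is and never decrypts them.
#[derive(Default)]
pub struct InMemoryCredentialStore {
    entries: RwLock<HashMap<String, EncryptedData>>,
}

impl CredentialStore for InMemoryCredentialStore {
    fn get(&self, provider: &str) -> Option<EncryptedData> {
        let entries = self.entries.read().ok()?;
        entries.get(provider).cloned()
    }

    fn put(&self, provider: &str, data: &EncryptedData) -> Result<(), CredentialStoreError> {
        let mut entries = self
            .entries
            .write()
            .map_err(|_| CredentialStoreError::LockPoisoned)?;
        entries.insert(provider.to_string(), data.clone());
        Ok(())
    }

    fn delete(&self, provider: &str) -> Result<(), CredentialStoreError> {
        let mut entries = self
            .entries
            .write()
            .map_err(|_| CredentialStoreError::LockPoisoned)?;
        // Assumption: deleting a missing entry is reported as an error.
        entries
            .remove(provider)
            .map(|_| ())
            .ok_or(CredentialStoreError::NotFound)
    }
}
```

A persistence adapter would implement the same three methods against its backend, leaving callers that hold `&dyn CredentialStore` unchanged.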
### 2. Persistence adapters are separate crates, built when a concrete use case forces them
A persistence adapter (e.g., `alknet-peer-store-sqlite`,
`alknet-credential-store-sqlite`) is a **separate crate** that implements a
core repo trait against a specific backend. The adapter:
- Depends on alknet-core (for the trait and the types it implements
against).
- Owns its backend dependency (rusqlite + honker, a key-value store, a
remote service — the backend choice is the adapter's concern).
- Is wired by the assembly layer at deployment time, replacing the
in-memory default when persistence is needed.
The pattern:
```
alknet-core (lean — no SQLite, no honker, no backend deps)
├── IdentityProvider trait (the auth repo trait — ADR-004)
├── ConfigIdentityProvider (in-memory default — ADR-030)
├── CredentialStore trait (the credential repo trait — ADR-031)
└── InMemoryCredentialStore (in-memory default — ADR-031)
Persistence adapters (separate crates, built when needed)
├── peer-store adapter (implements IdentityProvider against a backend)
└── credential-store adapter (implements CredentialStore against a backend)
alknet-call (lean — no SQLite, no honker, no storage traits)
├── Uses IdentityProvider (the trait, not the adapter)
└── AccessControl::check(identity) for per-node ACL
```
The decomposition principle: **the trait lives where the types live
(alknet-core); the adapter implementation lives where its backend
dependency lives (a separate crate).** This mirrors the adapter location
principle in `client-and-adapters.md`: `OperationAdapter` lives in
`alknet-call` (where the types live); `from_openapi`/`from_mcp` live in
`alknet-http` (where the HTTP dependency lives).
### 3. The assembly layer wires the adapter
The CLI binary (the only crate that depends on all handler crates and the
vault, ADR-003) constructs the adapter at startup. For a deployment that
needs persistence, the assembly layer constructs a persistence adapter
(e.g., a SQLite-backed one) instead of the in-memory default and passes it
where the trait is consumed.
This is the same wiring pattern as `IdentityProvider` today: the CLI
constructs `ConfigIdentityProvider` (or, with this ADR, a persistence
adapter) and passes `Arc<dyn IdentityProvider>` to every handler that needs
it.
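The wiring can be sketched as below. Only the `IdentityProvider` trait signature is taken from this ADR; the fields of `ConfigIdentityProvider`, the `AuthToken` newtype, the `CallHandler` consumer, and the `assemble` function are hypothetical stand-ins. A persistence adapter would be passed to `assemble` in exactly the same way as the in-memory default.

```rust
use std::collections::HashMap;
use std::sync::Arc;

#[derive(Clone, Debug, PartialEq)]
pub struct Identity {
    pub id: String,
}

/// Assumed newtype for the token.
pub struct AuthToken(pub String);

pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}

/// Stand-in for the in-memory default: lookup tables loaded from config.
#[derive(Default)]
pub struct ConfigIdentityProvider {
    pub by_fingerprint: HashMap<String, Identity>,
    pub by_token: HashMap<String, Identity>,
}

impl IdentityProvider for ConfigIdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.by_fingerprint.get(fingerprint).cloned()
    }

    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity> {
        self.by_token.get(&token.0).cloned()
    }
}

/// Hypothetical handler: depends on the trait object, never on an adapter.
pub struct CallHandler {
    identities: Arc<dyn IdentityProvider>,
}

impl CallHandler {
    pub fn new(identities: Arc<dyn IdentityProvider>) -> Self {
        Self { identities }
    }

    pub fn who_is(&self, fingerprint: &str) -> Option<Identity> {
        self.identities.resolve_from_fingerprint(fingerprint)
    }
}

/// Assembly-layer wiring: pick one adapter, share it with every handler.
pub fn assemble(provider: impl IdentityProvider + 'static) -> (CallHandler, CallHandler) {
    let shared: Arc<dyn IdentityProvider> = Arc::new(provider);
    (CallHandler::new(Arc::clone(&shared)), CallHandler::new(shared))
}
```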
### 4. What this does NOT do
- **Does not add a SQLite dependency to alknet-core.** Core carries the
trait and the in-memory default. The SQLite dependency lives in the
adapter crate.
- **Does not specify concrete adapter shapes.** The trait shape is the
one-way door. The concrete adapter shapes (table schemas, backend
choice, indexing, caching) are deferred for exploration: the repo pattern
is a tool to reach for when a storage concern is concrete, not a
one-size-fits-all mold to apply speculatively. The pattern is committed;
the adapters are not.
- **Does not change the no-DB posture of the core crates.** Core remains
DB-free in the sense that it has no backend dependency — only a trait
boundary. The in-memory adapter carries no persistence. The persistence
adapters are additive crates.
- **Does not introduce a generic "Storage" trait.** Each storage-shaped
concern gets its own trait (`IdentityProvider`, `CredentialStore`). A
generic `Storage<T>` trait would be over-abstraction — the concerns are
different enough (identity resolution vs. encrypted-blob persistence)
that a single trait would force a least-common-denominator shape.
## Consequences
**Positive:**
- OQ-34 is resolved. The storage boundary is: core defines the repo trait
+ the in-memory default; persistence adapters are separate crates; the
assembly layer wires. The no-DB posture is preserved in the sense that
matters (core has no backend dependency) while the abstraction is in
place for the cross-node state that wants persistence.
- The pattern is reusable. When a future storage-shaped concern surfaces
(e.g., ACL delegation graph, filesystem path tree), it follows the same
shape: trait in core, in-memory default, persistence adapter additive.
The research identified this as the right tool to reach for, and this
ADR commits the pattern.
- Downstream crates that don't use the call protocol (alknet-http, a
service with no protocol at all) still resolve identities and check ACL
via the same trait. The auth layer is not owned by alknet-call — it's
owned by core, consumed everywhere.
- The door to distributed auth adapters (automerge sync, Redis, a remote
identity service) is open without being designed. The trait doesn't care
which backend is wired.
**Negative:**
- alknet-core gains repo traits. Each trait is a contract downstream
crates depend on. Getting the trait shape right matters — a wrong shape
breaks every consumer when it's fixed. ADR-030 and ADR-031 commit the
first two trait shapes; future traits follow the same review bar.
- The in-memory default adapter is a reference implementation, not a
production persistence layer. Deployments that need persistence must
wire a persistence adapter — the in-memory default loses state on
restart. This is documented, not hidden.
- Concrete adapter shapes are not specified. This is deliberate (the
project is iterating on adapter simplification), but it means the
persistence-adapter build order is open. The trait shape is the
commitment; the adapter build is the two-way door.
## Assumptions
1. **The trait shape is the one-way door; the adapter shape is the
two-way door.** Getting the trait right is the review bar; getting the
adapter right is an implementation detail that can iterate.
2. **Each storage-shaped concern gets its own trait.** No generic
`Storage<T>`. The concerns are different enough that a single trait
would over-abstract.
3. **The in-memory default adapter is the reference implementation.** It
covers tests and config-loaded deployments. It is not a production
persistence layer.
4. **Persistence adapters are additive crates, built when a concrete use
case forces them.** Not built speculatively. The pattern is committed;
the adapters are not.
5. **Concrete adapter shapes are deferred for exploration.** The project
is iterating on adapter simplification; the trait shapes in this ADR
and ADR-030/031 are the commitment, not the adapter table schemas or
backend choices.
## References
- ADR-003: Crate Decomposition (the dependency rules this ADR follows —
core is lean, adapters are separate crates, the assembly layer wires)
- ADR-004: Auth as Shared Core (`IdentityProvider` — the first instance of
the pattern this ADR makes explicit)
- ADR-018: Vault as Standalone Crate (the vault has zero alknet-crate
dependencies; the repo pattern doesn't change that)
- ADR-025: Vault Local-Only Dispatch (the vault is the sole decryption
boundary; `CredentialStore` persists encrypted blobs, never decrypts)
- ADR-030: PeerEntry and Identity.id Decoupling (the first application of
this pattern to peer identity — `PeerEntry` config model +
`ConfigIdentityProvider` in-memory default)
- ADR-031: CredentialStore Repo Trait (the second application —
`CredentialStore` trait + `InMemoryCredentialStore` default)
- OQ-34: Persistent Peer Registry (resolved by this ADR — the storage
boundary is `core trait + in-memory default`, persistence adapters
additive)
- OQ-36: Concrete Adapter Shapes (tracked by this ADR — deferred for
exploration; the trait shapes are committed, the adapter shapes are not)
- `docs/research/alknet-storage-strategy/findings.md` §3-4 (the
SQLite+honker foundation and the repo/adapter pattern)
- `/workspace/keypal` — TypeScript repo-pattern reference (the Storage
interface + adapters pattern alknet follows)