docs(arch): resolve OQ-26 (AdapterError variants) + OQ-33 (PeerId = logical id) + OQ-34 (persistent peer registry)

OQ-26 (resolved): AdapterError variants decided — DiscoveryFailed,
SchemaParse, Transport, Unauthorized, SamePeerCollision (replaces flat
Conflict per ADR-029 §5). #[non_exhaustive] for downstream extension.
Two-way door; the initial set is the code's return type.
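
A minimal sketch of the variant set above, for illustration only. The
`{ message: String }` payloads follow the OQ-26 text in this commit; the
`Display` strings and the `is_retryable` helper are hypothetical, and
`Display`/`Error` are hand-written here to keep the sketch std-only (the
OQ-26 text says the real enum derives `thiserror::Error`).

```rust
use std::error::Error;
use std::fmt;

/// Sketch of the v1 variants named in OQ-26 (not the shipped definition).
#[non_exhaustive]
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AdapterError {
    DiscoveryFailed { message: String },
    SchemaParse { message: String },
    Transport { message: String },
    Unauthorized { message: String },
    /// Namespace collision within a single peer (replaces flat `Conflict`).
    SamePeerCollision { message: String },
}

impl fmt::Display for AdapterError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The message prefixes are assumptions made for this sketch.
        match self {
            AdapterError::DiscoveryFailed { message } => write!(f, "discovery failed: {message}"),
            AdapterError::SchemaParse { message } => write!(f, "schema parse error: {message}"),
            AdapterError::Transport { message } => write!(f, "transport error: {message}"),
            AdapterError::Unauthorized { message } => write!(f, "unauthorized: {message}"),
            AdapterError::SamePeerCollision { message } => write!(f, "same-peer collision: {message}"),
        }
    }
}

impl Error for AdapterError {}

/// Hypothetical caller-side helper: only connectivity failures are retried.
pub fn is_retryable(err: &AdapterError) -> bool {
    matches!(
        err,
        AdapterError::DiscoveryFailed { .. } | AdapterError::Transport { .. }
    )
}
```

Because the enum is `#[non_exhaustive]`, a `match` written in another crate
needs a wildcard arm, which is what lets variants be added later without
breaking downstream code.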

OQ-33 (resolved): PeerId is a logical identifier, NOT Identity.id. The
research's v1 default (PeerId = fingerprint) is overridden: coupling PeerId
to crypto material breaks every in-flight PeerRef::Specific and every ACL
entry on key rotation. v1 source is a connection-assigned UUID — a
no-storage workaround that works for the immediate use case (head→workers,
reconnect produces fresh PeerRef, in-flight gets NOT_FOUND which is correct).
The one-way door: PeerId is logical, not crypto — this determines
PeerCompositeEnv key type and PeerRef::Specific payload. The id source
(UUID vs configured name vs peer registry) is the two-way-door remainder.
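
The reconnect behaviour described above can be sketched as follows. This is
an illustration under stated assumptions, not the crate's code: `PeerTable`
is a hypothetical stand-in for `PeerCompositeEnv`, each peer is reduced to
the list of operation names it serves, and a counter stands in for the
connection-assigned UUID so the sketch stays std-only.

```rust
use std::collections::HashMap;

/// Logical id, not crypto material. v1 would use a connection-assigned UUID.
pub type PeerId = String;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PeerRef {
    Specific(PeerId),
    Any,
}

/// Hypothetical stand-in for the peer-keyed overlay.
#[derive(Default)]
pub struct PeerTable {
    connections: HashMap<PeerId, Vec<String>>,
    connection_order: Vec<PeerId>,
    next_id: u64,
}

impl PeerTable {
    /// Every connection gets a fresh logical id (counter used in place of a UUID).
    pub fn connect(&mut self, ops: &[&str]) -> PeerId {
        self.next_id += 1;
        let id = format!("peer-{}", self.next_id);
        let served = ops.iter().map(|s| s.to_string()).collect();
        self.connections.insert(id.clone(), served);
        self.connection_order.push(id.clone());
        id
    }

    pub fn disconnect(&mut self, id: &str) {
        self.connections.remove(id);
        self.connection_order.retain(|p| p.as_str() != id);
    }

    /// Resolve a selector to the peer serving `op`; `None` models NOT_FOUND.
    pub fn resolve(&self, peer: &PeerRef, op: &str) -> Option<&PeerId> {
        let serves = |id: &PeerId| {
            self.connections
                .get(id)
                .map_or(false, |ops| ops.iter().any(|o| o == op))
        };
        match peer {
            PeerRef::Specific(id) => self
                .connection_order
                .iter()
                .find(|p| *p == id && serves(*p)),
            // Insertion-order first match.
            PeerRef::Any => self.connection_order.iter().find(|p| serves(*p)),
        }
    }
}
```

A `PeerRef::Specific` captured before a disconnect resolves to `None` after
the peer reconnects under a new id, which is the NOT_FOUND outcome the
message calls correct.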

OQ-34 (new): the storage dimension OQ-33 surfaced. The core crates are
deliberately DB-free (smaller, fewer deps, simpler testing) — this served
local-only state (vault, registry) well, but peer identity is the first
cross-node state that wants persistence. The real solution (a persistent
peer registry mapping stable logical name → current crypto material,
surviving key rotation) is not a v1 blocker (UUID works), but tracked so the
no-DB posture's limit is deliberate, not accidental. The storage boundary
(core gets a PeerRegistry trait vs stays storage-free) is the one-way door;
the backend choice is two-way. Key-rotation/ACL note: decoupling PeerId from
crypto keeps the door open for ACL entries that persist across key rotation
— when the peer registry is built, ACLs key on the logical name and key
rotation becomes vault-only with no remote-side ACL update.
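
To make the key-rotation/ACL note concrete, here is one possible shape,
purely illustrative. OQ-34 is explicitly not designed, so everything below
is an assumption: the `PeerRegistry` method names, the in-memory backend,
and the `is_allowed` helper are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical trait: maps crypto material to a stable logical name.
pub trait PeerRegistry {
    fn resolve_name(&self, fingerprint: &str) -> Option<&str>;
    fn set_fingerprint(&mut self, name: &str, fingerprint: &str);
}

/// In-memory backend for the sketch; a real backend would persist this map.
#[derive(Default)]
pub struct InMemoryPeerRegistry {
    by_name: HashMap<String, String>, // logical name -> current fingerprint
}

impl PeerRegistry for InMemoryPeerRegistry {
    fn resolve_name(&self, fingerprint: &str) -> Option<&str> {
        self.by_name
            .iter()
            .find(|(_, fp)| fp.as_str() == fingerprint)
            .map(|(name, _)| name.as_str())
    }

    fn set_fingerprint(&mut self, name: &str, fingerprint: &str) {
        // Key rotation overwrites the fingerprint; the logical name is unchanged.
        self.by_name.insert(name.to_string(), fingerprint.to_string());
    }
}

/// The ACL keys on logical names, so it needs no update when a key rotates.
pub fn is_allowed(
    acl: &HashSet<String>,
    registry: &dyn PeerRegistry,
    presented_fingerprint: &str,
) -> bool {
    registry
        .resolve_name(presented_fingerprint)
        .map_or(false, |name| acl.contains(name))
}
```

In this sketch, rotating a key is a single `set_fingerprint` call; the ACL
set is never touched, and the old fingerprint stops resolving.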
2026-06-27 06:34:35 +00:00
parent 77eb35a8a5
commit 99c6dd9483
5 changed files with 167 additions and 38 deletions

@@ -100,13 +100,15 @@ See [open-questions.md](open-questions.md) for the full tracker.
 **Open (two-way-door remainders from alknet-call completion + peer-graph routing):**
 - **OQ-25**: ~~Remote-safe marking shape~~ **dissolved by ADR-029** (no marking; peer authorization is `AccessControl::check(peer_identity)`)
-- **OQ-26**: `OperationAdapter` error type: `import()` returns `Result<_, AdapterError>`; variants decided in implementation
+- **OQ-26**: ~~`OperationAdapter` error type~~ **resolved** (`AdapterError` variants: `DiscoveryFailed`, `SchemaParse`, `Transport`, `Unauthorized`, `SamePeerCollision`; `#[non_exhaustive]`)
 - **OQ-27**: `from_call` re-import trigger — v1 default auto-on-reconnect; explicit `refresh()` additive
 - **OQ-28**: `from_call` namespace collision — cross-peer **dissolved by ADR-029** (separate sub-overlays); same-peer stays error
 - **OQ-29**: `CallClient` TLS client-auth — v1 `with_no_client_auth()` + `AcceptAnyServerCertVerifier`; wiring RawKey client-auth is additive
 - **OQ-30**: `PeerRef::Any` routing policy — v1 insertion-order first-match; round-robin/least-loaded is future (ADR-029)
 - **OQ-31**: `services/list-peers` re-export semantics — v1 "own ops only"; `services/list-peers` is opt-in (ADR-029)
 - **OQ-32**: Multi-hop federation — v1 one-hop; peer-keyed model extends without redesign; petgraph candidate (ADR-029)
+- **OQ-33**: ~~PeerId stability~~ **resolved** (logical id, not `Identity.id`; v1 UUID, decoupled from crypto material for key-rotation-safe ACLs)
+- **OQ-34**: Persistent peer registry — the storage dimension OQ-33 surfaced; not a v1 blocker (UUID works); tracked so the no-DB posture's limit is deliberate
 **Deferred (not active):**
 - **OQ-09**: WASM target boundaries — design constraint, not deliverable

@@ -51,13 +51,15 @@ Structured RPC over QUIC: operations, request/response, streaming subscriptions,
 | OQ-16 | Safe vault operations for call protocol exposure | resolved (ADR-014) | None exposed for now |
 | OQ-19 | Session-scoped operation registries | resolved | Agent-written operations overlaid on curated registry via `OperationEnv` trait layering. Protocol doesn't need changes; `OperationEnv` must remain a trait. Generalized by ADR-024 to cover connection-scoped overlays. |
 | OQ-25 | ~~Remote-safe marking shape~~ | **dissolved** (ADR-029) | `remote_safe`/`trusted_peer` retired; peer authorization is `AccessControl::check(peer_identity)` |
-| OQ-26 | OperationAdapter error type (AdapterError variants) | open (two-way) | `import()` returns `Result<_, AdapterError>`; variants decided in implementation |
+| OQ-26 | OperationAdapter error type (AdapterError variants) | **resolved** | `DiscoveryFailed`, `SchemaParse`, `Transport`, `Unauthorized`, `SamePeerCollision`; `#[non_exhaustive]` |
 | OQ-27 | from_call re-import trigger | open (two-way) | v1 default: auto-on-reconnect; explicit `refresh()` additive |
 | OQ-28 | from_call namespace collision | cross-peer **dissolved** (ADR-029) / same-peer stays | Cross-peer: separate sub-overlays, no collision. Same-peer: error. `namespace_prefix` is local-naming sugar |
 | OQ-29 | CallClient TLS client-auth and remote-identity verification | open (two-way) | v1 `with_no_client_auth()` + `AcceptAnyServerCertVerifier`; wiring RawKey client-auth is additive (orthogonal to ADR-029) |
 | OQ-30 | `PeerRef::Any` routing policy | open (two-way) | v1 insertion-order first-match; round-robin/least-loaded is future (ADR-029) |
 | OQ-31 | `services/list-peers` re-export semantics | open (two-way) | v1 "own ops only"; `services/list-peers` is opt-in (ADR-029) |
 | OQ-32 | Multi-hop federation | open | v1 one-hop; peer-keyed model extends without redesign; petgraph candidate (ADR-029) |
+| OQ-33 | PeerId — crypto identity vs stable logical id | **resolved** | Logical id (UUID v1), not `Identity.id`; decoupled from crypto for key-rotation-safe ACLs |
+| OQ-34 | Persistent peer registry (cross-node state storage) | open | Not a v1 blocker (UUID works); the no-DB posture's limit, tracked for deliberate future decision |
 ## Key Design Principles

@@ -173,7 +173,7 @@ pub struct PeerCompositeEnv {
     pub connections: HashMap<PeerId, Arc<dyn OperationEnv + Send + Sync>>, // Layer 2, peer-keyed
     connection_order: Vec<PeerId>, // insertion order for PeerRef::Any first-match
 }
-pub type PeerId = String; // = Identity.id
+pub type PeerId = String; // logical id (UUID v1), NOT Identity.id — see OQ-33
 ```
 `OperationEnv` gains a peer-routing method with a `PeerRef` selector
@@ -608,10 +608,9 @@ See [open-questions.md](../../open-questions.md) for full details.
 - **OQ-25** (dissolved by ADR-029): `remote_safe` marking shape — moot.
   `remote_safe`/`trusted_peer` are retired; peer authorization is
   `AccessControl::check(peer_identity)`. No marking to shape.
-- **OQ-26** (open, two-way): `AdapterError` enum variants (DC-4). The
-  *presence* of an error type is recorded here; the variants are
-  implementation-detail. A `SamePeerCollision` variant may replace the flat
-  `Conflict` variant (ADR-029 §5).
+- **OQ-26** (resolved): `AdapterError` variants: `DiscoveryFailed`,
+  `SchemaParse`, `Transport`, `Unauthorized`, `SamePeerCollision`
+  (replaces flat `Conflict`). `#[non_exhaustive]`.
 - **OQ-27** (open, two-way): `from_call` re-import trigger — auto-on-reconnect
   (v1 default, recorded here) vs explicit `CallConnection::refresh()`. v1 is
   auto-on-reconnect; the explicit path is additive. The overlay is now
@@ -632,6 +631,12 @@ See [open-questions.md](../../open-questions.md) for full details.
 - **OQ-32** (open): Multi-hop federation — v1 is one-hop; the peer-keyed
   overlay model extends to multi-hop without redesign; petgraph is the
   candidate if path-finding becomes real (ADR-029 §3.7).
+- **OQ-33** (resolved): `PeerId` is a logical id (connection-assigned UUID),
+  not `Identity.id` — decoupling from crypto material keeps the door open for
+  key-rotation-safe ACLs. See OQ-33 in open-questions.md.
+- **OQ-34** (open): Persistent peer registry — the storage dimension OQ-33
+  surfaced; not a v1 blocker (UUID works), tracked so the no-DB posture's
+  limit is deliberate. See OQ-34 in open-questions.md.
 ## References

@@ -79,7 +79,7 @@ pub enum PeerRef {
     Specific(PeerId), // route to this peer; NOT_FOUND if it doesn't serve the op
     Any, // first peer (insertion order) that serves it
 }
-pub type PeerId = String; // = Identity.id
+pub type PeerId = String; // logical id, NOT Identity.id — see OQ-33
 async fn invoke_peer(&self, peer: &PeerRef, namespace: &str, operation: &str,
     input: Value, parent: &OperationContext, policy: AbortPolicy) -> ResponseEnvelope {
@@ -221,22 +221,27 @@ with attribution, filtered by the calling peer's authorization).
   (the `CallClient`, `Dispatcher`, `HandlerRegistration`, `discovery.rs`)
   changes. This is the cost of fixing a one-way-door miss — the previous model
   shipped and was reviewed before the structural gap was caught.
-- `PeerId = Identity.id` (the fingerprint) is not stable across key rotation.
-  A peer that rotates its TLS key gets a new `PeerId`; in-flight
-  `PeerRef::Specific(old_id)` gets `NOT_FOUND` after reconnect. For the
-  immediate use case (head→workers where the operator controls key rotation),
-  this is acceptable. A stable logical node name decoupled from cryptographic
-  identity is the cleaner long-term shape (assumption 1).
+- `PeerId` is a logical identifier, **not** `Identity.id` (the fingerprint or
+  API-key prefix). Coupling `PeerId` to the crypto material would break every
+  in-flight `PeerRef::Specific` and every ACL entry referencing that peer on
+  key rotation. v1 uses a connection-assigned UUID; a configured node name is
+  the future shape. See OQ-33 for the full decision and the key-rotation/ACL
+  rationale.
 ## Assumptions
-1. **`PeerId = Identity.id` (the fingerprint).** Reconnects with a rotated key
-   change the `PeerId`; the peer-keyed overlay drops the old `PeerId`'s
-   sub-overlay and creates a new one. An in-flight `PeerRef::Specific(old_id)`
-   gets `NOT_FOUND`. This is acceptable for v1 (operator-controlled key
-   rotation in the head→workers pattern). A stable logical node name separate
-   from the cryptographic identity is a future question; the peer-keyed overlay
-   model accommodates it by changing what `PeerId` aliases, not by redesign.
+1. **`PeerId` is a logical identifier, not `Identity.id`.** v1 source is a
+   connection-assigned UUID (v4) — stable for the connection's lifetime,
+   changes on reconnect. This is a no-storage workaround: the core crates are
+   deliberately DB-free (smaller, fewer deps), which works for local-only
+   state but not for cross-node peer identity that wants to persist across
+   restarts and key rotations. An in-flight `PeerRef::Specific(stale_uuid)`
+   gets `NOT_FOUND` on reconnect — the correct failure mode (the peer is
+   gone); re-`from_call` produces a fresh `PeerRef`. The real solution (a
+   persistent peer registry that maps a stable logical name to current crypto
+   material, surviving key rotation) is tracked as OQ-34, not a v1 blocker.
+   The one-way door: `PeerId` is logical, not crypto — this determines the
+   `PeerCompositeEnv` key type and `PeerRef::Specific` payload. See OQ-33.
 2. **`PeerRef::Any` = insertion-order first-match.** Deterministic but
    order-dependent (worker A connects before worker B → `Any` routes to A
@@ -278,8 +283,8 @@ with attribution, filtered by the calling peer's authorization).
 - ADR-028: Peer-Scoped Registry Filtering for CallClient Inbound Dispatch
   (superseded)
 - OQ-25: dissolved (no `remote_safe` marking — `AccessControl` is the policy)
-- OQ-26: stays (`AdapterError`: a `SamePeerCollision` variant may replace
-  the flat `Conflict` variant)
+- OQ-26: resolved (`AdapterError` variants: `SamePeerCollision` replaces
+  the flat `Conflict` variant; `#[non_exhaustive]`)
 - OQ-27: stays (re-import trigger — unchanged; the overlay is now peer-scoped)
 - OQ-28: dissolved cross-peer (same name on different peers is fine); stays
   same-peer
@@ -287,6 +292,10 @@ with attribution, filtered by the calling peer's authorization).
 - OQ-30: `PeerRef::Any` routing policy (new — round-robin/least-loaded)
 - OQ-31: `services/list-peers` re-export semantics (new)
 - OQ-32: Multi-hop federation (new — petgraph candidate)
+- OQ-33: resolved — `PeerId` is a logical id (UUID v1), not `Identity.id`;
+  decoupling from crypto material keeps the door open for key-rotation-safe ACLs
+- OQ-34: persistent peer registry (new — the storage dimension OQ-33 surfaced;
+  not a v1 blocker, tracked so the no-DB posture's limit is deliberate)
 - Research: `docs/research/alknet-call-peer-routing/findings.md`
 - Prior art: Ray.io actors (`ActorHandle` = `PeerRef::Specific`), Dapr service
   invocation (app-ID routing = `PeerRef::Specific`, access-control allowlist =

@@ -349,22 +349,26 @@ revisited during implementation without a new ADR.
 ### OQ-26: OperationAdapter Error Type (AdapterError Variants)
-- **Origin**: [client-and-adapters.md](crates/call/client-and-adapters.md), ADR-017 §5
-- **Status**: open
+- **Origin**: [client-and-adapters.md](crates/call/client-and-adapters.md), ADR-017 §5, [ADR-029](decisions/029-peer-graph-routing-model.md) §5
+- **Status**: **resolved** (2026-06-27)
 - **Door type**: Two-way
 - **Priority**: medium
-- **Resolution**: ADR-017 §5 showed `async fn import(&self) ->
-  Vec<HandlerRegistration>` with no error type. The trait returns
-  `Result<Vec<HandlerRegistration>, AdapterError>` where `AdapterError` is a
-  crate-level enum. The *presence* of an error type is recorded in
-  [client-and-adapters.md](crates/call/client-and-adapters.md); the exact
-  variants are the two-way-door remainder. The failure modes real
-  implementations hit: discovery transport failure (`from_call` remote
-  unreachable), schema parse failure (`from_openapi`, `from_jsonschema`),
-  unauthorized (HTTP 401 for `from_openapi`, `from_mcp`). Likely variants:
-  `DiscoveryFailed`, `SchemaParse`, `Transport`, `Unauthorized`. Decided
-  during implementation; recorded here, not in a full ADR.
-- **Cross-references**: ADR-017, [client-and-adapters.md](crates/call/client-and-adapters.md)
+- **Resolution**: The `AdapterError` enum is `#[non_exhaustive]` +
+  `thiserror::Error`, with these v1 variants:
+  - `DiscoveryFailed { message: String }`: `from_call` remote unreachable / `services/list` failed
+  - `SchemaParse { message: String }`: `from_openapi` / `from_jsonschema` couldn't parse the spec
+  - `Transport { message: String }`: underlying transport error (QUIC for `from_call`, HTTP for `from_openapi`/`from_mcp`)
+  - `Unauthorized { message: String }`: HTTP 401 for `from_openapi`/`from_mcp`, auth rejected for `from_call`
+  - `SamePeerCollision { message: String }`: namespace collision *within a single peer* (ADR-029 §5: cross-peer collision dissolves; same-peer collision stays an error). Replaces the flat `Conflict` variant from the pre-ADR-029 implementation.
+
+  `#[non_exhaustive]` lets `alknet-http`'s adapters extend without breaking
+  match arms. The variant payloads are `String` messages — kept simple and
+  `Send + Sync` by construction. This matches the shipped implementation
+  (`crates/alknet-call/src/client/mod.rs`) except `Conflict` →
+  `SamePeerCollision` (the ADR-029 migration renames it). Two-way door:
+  adding variants later is non-breaking; renaming a variant is a match-arm
+  update but not an architectural change.
+- **Cross-references**: ADR-017, ADR-029, [client-and-adapters.md](crates/call/client-and-adapters.md)
 ### OQ-27: from_call Re-Import Trigger
@@ -485,4 +489,111 @@ revisited during implementation without a new ADR.
   suffices. Whether multi-hop federation becomes a real use case is a future
   decision; the peer-keyed model does not foreclose it. Not designed; tracked
   here so the v1 model's extendability is recorded.
 - **Cross-references**: ADR-029, [client-and-adapters.md](crates/call/client-and-adapters.md)
+
+### OQ-33: PeerId — Cryptographic Identity vs Stable Logical Identifier
+
+- **Origin**: [ADR-029](decisions/029-peer-graph-routing-model.md) Assumption 1, `docs/research/alknet-call-peer-routing/findings.md` §6.1
+- **Status**: **resolved** (2026-06-27)
+- **Door type**: One-way (composition semantics), two-way (id source)
+- **Priority**: high
+- **Resolution**: `PeerId` is a **logical identifier, decoupled from the
+  cryptographic identity**. It is *not* `Identity.id` (the TLS fingerprint or
+  API-key prefix) — those change on key rotation, which would break every
+  in-flight `PeerRef::Specific` and every ACL entry referencing that peer.
+
+  **v1 source**: connection-assigned UUID (v4) at `connect()`/`accept()` time.
+  Stable for the connection's lifetime; changes on reconnect. This is a
+  **no-storage workaround** — the project has deliberately avoided a DB
+  backend for the core crates (smaller, fewer deps, simpler testing), which
+  has served the local-only crates (vault, registry) well. But peer identity
+  is the first *cross-node* state that wants persistence: what we actually
+  want is a persistent mapping from a logical peer identity to its current
+  cryptographic material, updated on key rotation, surviving restarts.
+  Without a DB, the UUID is the least-bad ephemeral option — the failure
+  mode (in-flight `PeerRef::Specific` gets `NOT_FOUND` on reconnect) is
+  acceptable for v1, and the re-`from_call` produces a fresh `PeerRef`.
+
+  **The real solution (future, tracked as OQ-34):** a persistent peer
+  registry — a mapping from a stable logical peer identity (configured node
+  name or registered identity) to its current cryptographic material,
+  persisted across restarts and key rotations. This is what makes the
+  ACL-stability concern below work correctly: the ACL entry keys on the
+  logical name, the peer registry tracks the current crypto identity for
+  that name, and key rotation becomes a vault-only operation with no ACL
+  update on the remote side. The no-DB posture of the core crates means
+  this registry lives outside the core — likely in a service crate or an
+  assembly-layer store — not in alknet-call itself. See OQ-34.
+
+  **Key-rotation / ACL note (context for the future, not a v1 decision):**
+  if `PeerId` were the fingerprint, rotating a node's TLS key would change
+  its `PeerId`, invalidating every ACL entry that references that peer. The
+  vault makes local key rotation easy (derive a new key, re-encrypt,
+  ADR-021); the problem is the *remote* side's ACL — the hub's
+  `authorized_fingerprints` / `AccessControl` entries that reference the old
+  fingerprint. Decoupling `PeerId` from the crypto material means the ACL
+  entry *can* persist across key rotation — but only if there's a store that
+  maps the logical name to the new crypto identity after rotation. That
+  store is OQ-34. The v1 decision (logical id, not crypto; UUID source)
+  keeps the door open for it without requiring it now.
+
+  **The one-way door:** `PeerId` is a logical id, not `Identity.id`. This
+  determines the `PeerCompositeEnv` key type, the `PeerRef::Specific`
+  payload type, and the `ScopedPeerEnv.peer_pinned` entry shape. Reversing
+  it (switching to `Identity.id`) would break the peer-keyed overlay, the
+  routing selector, and the reachability set simultaneously. The *source* of
+  the logical id (UUID now, peer registry later) is the two-way-door
+  remainder — switching from UUID to a persistent registry changes the
+  id-generation path, not the composition model.
+- **Cross-references**: ADR-009, ADR-014, ADR-015, ADR-017, ADR-021, ADR-027,
+  ADR-029, OQ-34, [client-and-adapters.md](crates/call/client-and-adapters.md),
+  [operation-registry.md](crates/call/operation-registry.md),
+  [auth.md](crates/core/auth.md)
+
+### OQ-34: Persistent Peer Registry (Cross-Node State Storage)
+
+- **Origin**: OQ-33 (the storage dimension it surfaced), the no-DB posture of ADR-008/018/025
+- **Status**: open
+- **Door type**: One-way (storage boundary), two-way (backend choice)
+- **Priority**: medium (not a v1 blocker — UUID works for v1; becomes real
+  when key rotation across nodes or peer-attribution persistence matters)
+- **Resolution**: The core crates (alknet-core, alknet-call, alknet-vault)
+  are deliberately storage-free — no DB, no persistence layer, in-memory
+  state only. This has kept the core small and testable, and it works for
+  local-only state (vault key rotation is version-indexed paths, no DB
+  needed, ADR-021). **Peer identity is the first cross-node state that
+  wants persistence**: a stable logical peer identity mapped to its current
+  cryptographic material, surviving restarts and key rotations. The v1
+  workaround (OQ-33: connection-assigned UUID) is ephemeral — it works for
+  the immediate use case (head→workers, operator-controlled, reconnects
+  produce a fresh UUID) but doesn't support ACL entries that persist across
+  key rotation, because there's nowhere to store "worker-a's current crypto
+  identity is X."
+
+  **What this OQ tracks (not designed, not a v1 decision):**
+  - Whether a persistent peer registry belongs in a service crate (e.g., an
+    `alknet-registry` or `alknet-peer-store`), in the assembly layer (a
+    SQLite file the binary owns), or as a new alknet-core abstraction
+    (a `PeerRegistry` trait with no built-in impl, like `IdentityProvider`).
+  - Whether the no-DB posture extends to "core has a trait, service has the
+    impl" (the `IdentityProvider` pattern) or stays "core is storage-free,
+    persistence is entirely outside the crate graph."
+  - The backend choice (SQLite, a key-value store, a config file) is the
+    two-way-door remainder; the *storage boundary* (does core know about
+    persistence at all?) is the one-way door.
+
+  **Why this is a one-way door on the storage boundary, not a two-way door:**
+  if core gains a `PeerRegistry` trait, downstream crates depend on it and
+  the trait shape becomes a contract. If core stays storage-free, the
+  registry lives in a service crate and core never knows about persistence.
+  Reversing either direction breaks downstream consumers. The decision
+  should be made when a concrete use case (key rotation across nodes,
+  durable peer attribution, multi-hop federation with OQ-32) forces it —
+  not before.
+
+  **Not a v1 blocker.** The UUID works for v1; this OQ exists so the
+  no-DB posture's limit is tracked and the decision is made deliberately
+  when it's needed, not accidentally when someone bolts a SQLite file onto
+  the assembly layer and it becomes load-bearing.
+- **Cross-references**: ADR-008, ADR-018, ADR-021, ADR-025, ADR-029, OQ-33,
+  [auth.md](crates/core/auth.md), [config.md](crates/core/config.md)