The 'WebTransport proxy' concept was conflating two distinct things;
this pass separates them:
1. In-process ALPN-stream-proxy (ADR-040, in alknet-http): the h3 handler
hands a WebTransport stream to another ALPN handler (SshAdapter,
GitAdapter, etc.) as a Connection, so a browser with a WASM parser
can reach any ALPN service via WebTransport. Path-based routing
(the CONNECT path declares the target: /alknet/ssh -> SshAdapter).
HttpAdapter gains Arc<HandlerRegistry> for the lookup. The browser's
WASM parser implements BiStream (ADR-007) over the WebTransport
stream. SSH-over-WebTransport is HTTPS-shaped at the network layer
(anti-censorship: the 'VPN-like without being a VPN' use case on a
clean foundation). russh-sftp demonstrates WASM targeting is
feasible; SSH is the next target.
2. Standalone relay service (OQ-38, future alknet-relay crate): a full
relay - fork of iroh-relay - with WebTransport proxy fallback for
NAT traversal. This is infrastructure, not a mode of the h3 handler.
OQ-38 reframed to be the standalone-relay scope question (distinct
from the in-process proxy now resolved by ADR-040).
webtransport.md updated: three stream destinations (call protocol,
ALPN-handler proxy, other sub-protocols) with path-based routing; new
'ALPN-stream-proxy' section covering the WASM client side, auth model
(bearer token gates the session; protocol's own auth gates the
protocol session), and the HandlerRegistry reference.
README/overview ADR tables and OQ summaries updated for ADR-040.
Commits the concrete adapter shape deferred by ADR-033: read-sync /
write-async split with honker NOTIFY/LISTEN for no-restart cache
invalidation, against SQLite, in a separate alknet-store-sqlite crate.
Two constraints drive the design: (1) the hot-path read trait
(IdentityProvider::resolve_from_fingerprint, CredentialStore::get) is
sync — called in the accept loop, no .await — so a SQLite-backed
adapter must cache in memory and serve sync reads from the cache; (2)
auth changes must take effect without a restart (an early issue the
project already fixed for ConfigIdentityProvider via ArcSwap config
reload). honker's SQLite NOTIFY/LISTEN (single-digit-ms wake, no
polling) is the cache-invalidation mechanism that makes both hold:
write commits to SQLite + emits NOTIFY, the running process's LISTEN
wakes, the in-memory index reloads and atomically swaps, the next
read sees the new state. Same ArcSwap-reload pattern as config,
generalized from 'config file is source of truth' to 'SQLite is
source of truth, honker signals when it changed.'
New async IdentityStore write trait (put_peer / update_peer /
remove_peer) extends the sync IdentityProvider read trait for peer
mutations. ConfigIdentityProvider does NOT implement it (config
reload is its write path — a posture enforced by the absence of a
backend, not a type-system constraint); SqliteIdentityProvider
implements both. CredentialStore::put/delete refined to async (within
ADR-031's one-way door — the contract was get/put/delete keyed by
provider persisting EncryptedData never decrypting; sync-vs-async was
unspecified). CredentialStoreError renamed to shared StoreError
covering both traits.
alknet-store-sqlite is one crate implementing both IdentityStore and
CredentialStore with shared SQLite connection + honker LISTEN infra
(splitting later is a two-way door). Schema shape committed (one row
per PeerEntry with JSON columns for fingerprints/scopes/resources;
one row per EncryptedData blob keyed by provider); exact DDL is an
implementation-detail two-way door in the adapter crate. The keypal
adapter-factory pattern is intentionally not ported to Rust (runtime
column-mapping is a TS affordance; in Rust each adapter is a concrete
type, cross-cutting concerns are a shared helper module).
Amends ADR-031 (put/delete async refinement, StoreError rename),
ADR-033 (concrete adapter shape now specified, two-crate framing
collapsed to one), ADR-034 (OQ-36 now resolved), auth.md (IdentityStore
section, cache-invalidation summary, OQ-36 reference), config.md (two
write paths note), and the OQ-36/OQ-34 entries in open-questions.md.
Review fixed 4 criticals (error-type name divergence, duplicate
IdentityProvider sketch, upsert/Duplicate ambiguity, 'shape unchanged'
contradiction), 7 warnings, 5 suggestions.
Untangles the conflation of three distinct remote roles under 'X.509
endpoint': (1) public X.509 endpoint — a remote HTTPS/call-over-TLS
server the local node is a client of (no PeerEntry, no PeerId, not in
the peer graph; CA verification + bearer token); (2) transport relay —
iroh's DERP-equivalent, infrastructure, not an alknet peer; (3) hub /
hosting node — an alknet peer that also exposes a public domain + X.509
for browsers (mixed-fingerprint PeerEntry, already supported by
ADR-030).
The load-bearing one-way door is the client-side verifier selection
rule: known peer (PeerEntry present) → fingerprint pin; unknown X.509
remote → CA verification (WebPkiServerVerifier); unknown Ed25519
remote → fails closed. This closes the AcceptAnyServerCertVerifier
security hole OQ-29 flagged, with the peer-model criterion (PeerEntry
presence) made explicit. The 'make PeerEntry symmetric' instinct is
rejected — pure-client connections to public APIs have no stable
logical identity to pin.
Documents that CallCredentials.remote_identity: None is load-bearing
(None = public X.509 endpoint → CA path, not a missing field; Some =
known peer → fingerprint pin), closing a subtle gap where an
implementer could have defaulted to a placeholder or treated None as
skip-verify.
Records WebTransport relay-as-proxy (deferred with h3/WebTransport,
new OQ-HTTP-07) and on-chain/smart-contract peer discovery (fits the
OQ-36 repo/adapter pattern, no auth-model change) so they aren't lost.
Amends auth.md and client-and-adapters.md with the three-role naming,
the verifier selection rule, and the Option semantics; updates OQ-37
to resolved in open-questions.md, README.md, and both crate READMEs.
ADR-009, open-questions.md, and the architect agent spec all had the same
conflation: 'two-way door' was phrased as 'can be decided during
implementation,' which reads as 'defer the decision.' That's not what it
means. A two-way door is a decision you make now and can revert later if
wrong — it's about reversal cost, not urgency.
ADR-009: add §'What this framework is NOT' — explicitly separates door
type (reversal cost) from deferral (scope management). State that
architecture decisions are the architect's regardless of door type.
Reword the two-way-door process from 'can be decided during
implementation' to 'pick the simplest option that works, implement it,
revert if needed.'
open-questions.md: reword the header to clarify door type describes
reversal cost, not urgency. Add 'Door type is separate from whether a
decision is made.'
architect.md: add Key Principle #8 (decisions are made, not deferred),
a new 'Door Types and Decision Urgency' section, and two new anti-patterns
(#8: door type as deferral, #9: hedging language in resolved decisions).
Amend ADR-030 with three changes from the auth-type analysis:
1. PeerEntry is now multi-credential: fingerprints: Vec<String> (Ed25519
and/or X.509) + auth_token_hash: Option<String> (bearer token). All
resolve to the same peer_id. A peer that authenticates via Ed25519
today and via auth_token tomorrow gets the same PeerId. The 'peer
bearer vs auth bearer' distinction was wrong — the correct framing is
the three credential types (Ed25519, X.509, bearer token) and whether
the token needs a stable logical id across rotation (PeerEntry) or not
(ApiKeyEntry).
2. Fingerprint normalization (§6): quinn extracts the raw Ed25519 public
key from the SPKI cert and formats as ed25519:<hex>, matching iroh.
The same key has the same fingerprint regardless of transport. X.509
fingerprints stay as SHA256:<hex of DER>. This also simplifies the
coming WebTransport relay work.
3. The 'API keys' section is replaced with 'Bearer tokens' — correctly
framing the three auth types and the two bearer-token paths
(PeerEntry.auth_token_hash vs ApiKeyEntry).
Resolve OQ-29 (CallClient TLS client-auth): wire quinn client-auth (present
Ed25519 key as raw public key client cert — the server-side extraction
already works); key-type-aware server cert verification (raw key =
fingerprint match, X.509 = CA verification via WebPkiServerVerifier —
AcceptAnyServerCertVerifier is only safe for raw keys); fingerprint
normalization. The iroh path already works (RFC 7250 raw keys, both sides
exchange automatically); the gap was quinn-only.
Dissolve OQ-35: the 'API key asymmetry' framing was wrong. PeerEntry
supports multiple credential paths; ApiKeyEntry is for tokens that ARE the
identity.
Add OQ-37: X.509 outgoing-only case — the three auth types and how X.509
server identity fits the peer model. Not blocking the ADR-029 migration;
downstream (HTTP crate phase).
Update auth.md, config.md, client-and-adapters.md, call/README.md,
core/README.md, open-questions.md, README.md, and call_client.rs source
comment.
Workspace green: 326 tests pass, build clean.
Land the storage and auth strategy research (findings.md) as four
accepted ADRs and amend the core and call specs to match:
- ADR-030: PeerEntry and Identity.id decoupling. Replaces
authorized_fingerprints with peers: Vec<PeerEntry>; Identity.id becomes
the stable peer_id, decoupled from the rotating fingerprint. Supersedes
ADR-029 Assumption 1's UUID source (one-way door preserved, source
changes). Resolves OQ-33 and the storage-boundary half of OQ-34. Records
the API-key asymmetry as deliberate (OQ-35).
- ADR-031: CredentialStore repo trait + InMemoryCredentialStore default
adapter in core. Second repo trait alongside IdentityProvider. Vault
encrypts; the store persists the EncryptedData blob; assembly layer
loads into Capabilities. EncryptedData core mirror includes salt for
wire-format compat.
- ADR-032: Forwarded-for identity. forwarded_for field on call.requested
and OperationContext — metadata only, never read by AccessControl::check
(enforced structurally via the check signature). The from_call handler
populates it. Wire-format one-way door, folded into the ADR-029
migration window.
- ADR-033: Storage boundary and repo/adapter pattern. Core defines repo
traits + in-memory defaults; persistence adapters are separate crates;
assembly layer wires. Resolves OQ-34. Concrete adapter shapes deferred
for exploration (OQ-36).
Amends auth.md, config.md, operation-registry.md, client-and-adapters.md,
open-questions.md, README.md, crates/core/README.md. Marks ADR-029
Accepted (Assumption 1 carries the ADR-030 superseded note). Marks the
research findings doc reviewed.
OQ-26 (resolved): AdapterError variants decided — DiscoveryFailed,
SchemaParse, Transport, Unauthorized, SamePeerCollision (replaces flat
Conflict per ADR-029 §5). #[non_exhaustive] for downstream extension.
Two-way door; the initial set is the code's return type.
OQ-33 (resolved): PeerId is a logical identifier, NOT Identity.id. The
research's v1 default (PeerId = fingerprint) is overridden: coupling PeerId
to crypto material breaks every in-flight PeerRef::Specific and every ACL
entry on key rotation. v1 source is a connection-assigned UUID — a
no-storage workaround that works for the immediate use case (head→workers,
reconnect produces fresh PeerRef, in-flight gets NOT_FOUND which is correct).
The one-way door: PeerId is logical, not crypto — this determines
PeerCompositeEnv key type and PeerRef::Specific payload. The id source
(UUID vs configured name vs peer registry) is the two-way-door remainder.
OQ-34 (new): the storage dimension OQ-33 surfaced. The core crates are
deliberately DB-free (smaller, fewer deps, simpler testing) — this served
local-only state (vault, registry) well, but peer identity is the first
cross-node state that wants persistence. The real solution (a persistent
peer registry mapping stable logical name → current crypto material,
surviving key rotation) is not a v1 blocker (UUID works), but tracked so the
no-DB posture's limit is deliberate, not accidental. The storage boundary
(core gets a PeerRegistry trait vs stays storage-free) is the one-way door;
the backend choice is two-way. Key-rotation/ACL note: decoupling PeerId from
crypto keeps the door open for ACL entries that persist across key rotation
— when the peer registry is built, ACLs key on the logical name and key
rotation becomes vault-only with no remote-side ACL update.
ADR-028's remote_safe/trusted_peer was a parallel, weaker authorization system
that duplicated the existing AccessControl/Identity machinery and couldn't
express the head→N-workers pattern (the primary use case). The flat-namespace
single-peer overlay model (one connection layer in CompositeOperationEnv)
structurally breaks the moment a head has two workers both exposing
/container/exec.
ADR-029 replaces it with:
- Peer-keyed overlays: PeerCompositeEnv { connections: HashMap<PeerId, ...> }
replaces CompositeOperationEnv's singular connection layer. A head node
routes invoke_peer() to the right peer via PeerRef::Specific / PeerRef::Any.
- AccessControl-based peer authorization: the existing AccessControl::check
(peer_identity) gates peer calls — the same mechanism that gates every other
call. remote_safe/trusted_peer/RemoteFilter/list_operations_peer_scoped/
services_list_handler_peer_scoped are retired. The op's AccessControl IS the
peer-authorization policy; no parallel system.
- ScopedPeerEnv: peer-qualified reachability (peer-pinned allowlist) replaces
from_call's namespace_prefix as the disambiguation mechanism. Cross-peer
collision dissolves (separate sub-overlays); same-peer collision stays error.
- services/list-peers opt-in for peer-attributed re-export listing.
POC-validated against real types (scratch module written, type-checked,
removed; build clean, 207 tests pass). Petgraph not needed for v1 (one-hop,
shallow); nested HashMap suffices; extends to multi-hop without redesign (OQ-32).
OQ impact: OQ-25 dissolved (no marking); OQ-28 cross-peer dissolved / same-peer
stays; OQ-26/27/29 stay; new OQ-30 (Any routing policy), OQ-31 (list-peers
semantics), OQ-32 (multi-hop federation).
Research: docs/research/alknet-call-peer-routing/findings.md (POC shapes,
prior art — Ray.io actors, Dapr service invocation, full ADR draft).
ADR-028 marked Superseded; ADR-017 DC-1 amendment updated to point at ADR-029.
Resolves the four gap-analysis decisions (DC-1..4) blocking the alknet-call
client/adapter surface specced in ADR-017:
- ADR-028 (new): locks the one-way door for DC-1 — CallClient registry is
default-deny (remote_safe: bool on HandlerRegistration, default false across
all provenance); share-global is an explicit trusted-peer opt-in; filtering
is a dispatch-time read over the single Layer-0 registry, not a copy.
- client-and-adapters.md (new spec): operationally fills the gap ADR-017 left
to implementation — CallClient, from_call, from_jsonschema, OperationAdapter
trait, adapter location map, no-env-vars invariant, exchange-of-operations
pattern. Keeps call-protocol.md and operation-registry.md under the
700-line split threshold.
- ADR-017 amended: records DC-2/3/4 v1 defaults (auto-on-reconnect,
error-on-collision, Result error type) and points DC-1 at ADR-028.
- OQ-25..28 (new): two-way-door remainders (remote_safe shape, AdapterError
variants, re-import trigger, namespace collision) with v1 defaults recorded.
- Index/cross-ref updates across READMEs and the two existing call specs.
Tasks: 6 task files under tasks/call/ decomposing the completion work along
the gap-analysis priority order — remote-safe-marking (one-way door, first)
→ call-client (phase-risk) → from-call → operation-adapter-trait →
from-jsonschema (parallel with call-client) → review-completion. Graph
validated with taskgraph; parallelism designed in (from-jsonschema runs
concurrent with call-client/from-call once the trait lands).
ADR-027 resolves the architectural gap surfaced when ACME integration
became a concrete target:
1. TlsIdentity::Acme variant — static config data (domains, cache_dir,
directory, contact) with async AcmeState constructed at endpoint
setup via two-phase TlsSetup (not stuffed into the Clone-able enum).
2. TlsIdentity::RawKey decoupled from the iroh feature — uses
Ed25519SecretKey (alknet-core-owned wrapper over ed25519_dalek)
instead of iroh::SecretKey. Raw-key TLS identity (RFC 7250, the
default for most alknet nodes) now works in quinn-only builds.
iroh transport converts via SecretKey::from_bytes.
3. ACME feature-gated behind new acme feature (rustls-acme optional
dep). Non-ACME builds don't compile it.
4. dispatch_quinn guard for acme-tls/1 challenge connections — TLS-ALPN-01
is handled at the rustls cert resolver layer during the handshake;
the guard closes challenge connections gracefully instead of logging
a misleading "no handler" warning.
Research confirmed QUIC (quinn) handles ACME challenges differently than
TCP (reverse-proxy): quinn gives no ClientHello peek hook, but the
challenge is fully answered at the cert resolution step before the
connection surfaces to the application. No handler registration needed.
Spec updates: config.md, endpoint.md, open-questions.md (OQ-12),
overview.md + README.md (ADR index), ADR-010 (cross-ref).
Tasks: core/rawkey-decouple-from-iroh (gen 1, no deps),
core/acme-integration (gen 2, depends on rawkey). Graph: 36 tasks.
Review #003 found 11 critical, 14 warning, and 6 suggestion findings
after reviews #001 (governance/security) and #002 (cross-document
consistency/two-way-door audit) were resolved. The theme: types and
APIs that were *referenced* but never *defined*, and stale ADR sketches
that didn't match the now-updated spec docs.
Critical fixes (11):
- C1: DerivedKey #[derive(Deserialize)] contradicted the custom
Deserialize that rejects "[REDACTED]" — dropped the derive, added
explicit manual Serialize/Deserialize impls (protocol.md).
- C2: encrypt prose said "derived at PATHS::ENCRYPTION" but the
signature takes key_version — updated to encryption_path_for_version
(service.md).
- C3: derive_encryption_key returned DerivedKey, derive_encryption_key
_for_version returned EncryptionKey (same cache) — unified on
DerivedKey, defined CachedKey (service.md).
- C4: tokio vs std::sync::RwLock contradiction — specified
std::sync::RwLock, dropped tokio from vault deps (ADR-018, ADR-025,
service.md).
- C5: Missing drift rows in vault README — added #9 (key_version
ignored) and #10 (rotate not implemented).
- C6: ADR-022 build_root_context and invoke() sketches omitted
abort_policy (9 fields vs 10) — added the field to both sketches.
- C7: Capabilities type referenced 20+ times, never defined — added
struct definition to core-types.md with Clone+Send+Sync, Zeroize,
sealed builder API, immutability guard.
- C8: SessionOverlaySource on CallAdapter but never defined, crate
violation (alknet-call can't depend on alknet-agent) — defined the
trait in alknet-call (call-protocol.md), matching the IdentityProvider
pattern.
- C9: CompositeOperationEnv dispatch fall-through was "a two-way door"
— added contains() to OperationEnv trait, made the composite probe
before dispatching, eliminating the sentinel ambiguity.
- C10: No API for Layer 2 (connection overlay) registration, CallConnection
undefined — defined CallConnection struct + register_imported() API
(call-protocol.md).
- C11: with_local signature diverged between two examples (4 args vs 5)
— added capabilities as the 5th arg, made both examples consistent.
Warning fixes (14):
- W1: invoke_with_policy restructured as required method, invoke gets a
default impl delegating to it — eliminates duplication across impls.
- W2: CachedKey defined (service.md).
- W3: EncryptionKey constructor/glue specified, added to re-export list.
- W4: Secp256k1ExtendedPrivKey defined, derive_ethereum_key glue shown.
- W5: encryption_path_for_version rejects version < 2 (v1 is TS PBKDF2).
- W6: Wire payload schemas for all event types + ResponseEnvelope →
EventEnvelope conversion table (call-protocol.md).
- W7: Timeout section — deadline on OperationContext, composed calls
inherit parent's deadline, CallAdapter::with_timeout().
- W8: Request ID generation spec — UUID v4 for composed calls, wire ID
vs internal ID relationship for abort cascade.
- W9: unlock_new already-unlocked behavior specified (returns
AlreadyUnlocked).
- W10: KeyType Serialize/Deserialize justification corrected (stale
irpc reference removed).
- W11: OperationProvenance and CompositionAuthority defined inline in
operation-registry.md (were only in ADR-022).
- W12: encrypt/decrypt free functions marked pub(crate), relationship
to VaultServiceHandle methods stated.
- W13: rotate signature removed from encryption.md (it's a
VaultServiceHandle method, not a free function).
- W14: CallAdapter::new() + with_session_source() + with_timeout()
constructors shown.
Suggestion fixes (6): Seed: Clone note, VaultServiceInner invariant,
ExtendedPrivKey accessor signatures, CURRENT_KEY_VERSION location, ADR-018
stale actor text, derivation helpers re-export note.
Add ADR-026 (vault key model — HD derivation) recording the foundational
HD-derivation decision, 74' coin type reservation, SLIP-0010/Ed25519
default, secp256k1 feature-gating, and AES-256-GCM cipher choice. These
were previously inline rationale with no ADR (W9).
Extend ADR-018 with an explicit EncryptedData wire format lock — fields,
encoding, and semantics are frozen; no removal without a format-version
migration (W10).
Resolve the remaining guard clauses and spec decisions:
- W2: Capabilities must be immutable after construction (no interior
mutability). Makes the Arc vs deep-copy clone semantics genuinely
two-way.
- W5: Published to_* specs are compatibility contracts — best-effort
mappings are two-way before first publication, one-way after. Version
generated specs.
- W6: Salt field clarification — v2 salt is permanently unused; a future
KDF is a different derivation family, not a version-indexed path; the
field saves a wire-format change only.
- W7: unlock_new returns Zeroizing<String> — the mnemonic is the root of
trust and must not linger in freed memory.
- W17: OQ-09 WASM — server-side dispatch door is honestly closed
(Connection is concrete, tokio-bound), not implicitly preserved.
- W18: OQ-10 git — composability fork (raw smart protocol vs call-protocol
projection) is a separate decision from ERC721 scope.
- W20: from_openapi must prefix imported error codes (HTTP_404) to avoid
collision with protocol-level codes (NOT_FOUND). Normative rule, not
naming convention.
- W21: ScopedOperationEnv field is private — construction via new()/
empty(), query via allows(). Makes the future subgraph refactor
non-breaking.
- C13: Connection::set_identity — the endpoint does not read identity()
after handle() returns (Connection is moved into the spawned task).
Observability is handler-side logging. Simplest honest answer.
- W1: OperationAdapter trait is async, returns Vec<HandlerRegistration>.
from_call requires async discovery; ADR-022 changed the return type.
- W11: CompositionAuthority::as_identity() defined — constructs a
synthetic Identity (label as id, scopes, resources) not resolvable via
IdentityProvider. Second Identity construction path, acknowledged.
- W14: SecretKey is iroh::SecretKey (Ed25519) — consistent with the
endpoint's iroh dependency.
- W19: Grandchild abort propagation is inherit-by-default (option a) —
invoke() with no explicit policy inherits parent's policy. ContinueRunning
auto-propagates to grandchildren unless explicitly overridden.
The password-manager pattern (deterministic per-site passwords from HD
derivation) is not relevant to an RPC system's vault. Handlers call APIs
(using API keys, OAuth tokens, mTLS), not websites with passwords. The
vault is for cryptographic key derivation and credential encryption.
Removes:
- derive_password, derive_password_string from service.md
- site_password_path from mnemonic-derivation.md
- m/74'/1'/0'/{hash}' path from PATHS module and path semantics table
- derive_password row from the cache table
Resolves review #002 C9 (site_password_path hash mapping underspecified)
by removing the feature rather than specifying the non-standard
string→u32 mapping and Ed25519-as-password-entropy construction.
If deterministic password generation is ever needed (browser-automation
edge case), it can be re-added — the cost is near-zero. Removing it now
eliminates permanent API surface inherited from a prior project's
password-manager pattern.
Drops irpc from alknet-vault entirely. The vault's dispatch is now direct
method calls on VaultServiceHandle — no VaultProtocol enum, no
VaultMessage, no VaultServiceActor, no mpsc channel, no Service trait, no
RemoteService trait, no postcard serialization. The vault is local-only by
construction.
The core security argument: irpc made the vault remote-capable by default
(RemoteService generated unless no_rpc is passed). The IrohProtocol handler
forwards all messages without auth. The docs framed 'register an ALPN' as a
server-setup change. This is the default-insecure anti-pattern — security
should be opt-in, not opt-out. ADR-025 inverts the default: local-only is
the only mode, and remote access requires building a separate vault-server
crate (a visible architectural act, not a flag flip).
The actor path was already dead code — service.md said 'prefer
VaultServiceHandle directly — no channel, no serialization.' The actor
existed only to make irpc's Service trait work, which existed only to make
RemoteService work, which was the footgun. VaultServiceHandle's
Arc<RwLock> provides concurrent reads and exclusive writes — better
throughput than the actor's sequential processing.
DerivedKey serialization simplifies: always redact on serialize (for
logging safety), reject '[REDACTED]' on deserialize with an error. No
'postcard preserves bytes' path. This resolves review #002 W8 (silent
corruption on JSON-deserialized DerivedKey).
Resolves:
- OQ-21: remote vault access — resolved (not deferred). Not a vault crate
feature; if needed, a separate vault-server crate with its own ADR.
- C7: vault-server-crate question decided — not created now, not precluded.
- C8: operation access policy table dissolved — all operations local-only
by default; if a vault-server crate exposes some remotely, that crate
defines the policy.
- W8: DerivedKey JSON deserialization — resolved (reject redacted payloads).
Amends ADR-005 (irpc remains for alknet-call, not for alknet-vault),
ADR-018 (vault is even more standalone — zero RPC framework deps),
ADR-019 (vault is the only layer, not just the only direct-caller layer),
ADR-008 (vault integration point unchanged, but now local-only by
construction).
Diagnoses a conflation in the pre-ADR-024 spec: the OperationRegistry
inherited immutability by analogy from ADR-010's HandlerRegistry (ALPN-level),
but the TLS-config argument that justifies HandlerRegistry immutability does
not apply to the operation registry, which lives behind a single ALPN
(alknet/call). This made from_call (which discovers ops over a live connection
at runtime) structurally incompatible with the blanket immutability claim.
ADR-024 layers the operation registry by trust boundary: curated (Local) ops
are static and immutable — the startup trust boundary is where their
composition authority is granted; session (Session) and imported (FromCall
etc.) ops are dynamic at their respective scopes (per-session, per-connection)
— their trust boundaries are per-scope, not per-startup. The principle:
immutability follows the trust boundary. Immutability is the security control
for composing ops (can escalate privilege); provenance + composition authority
are the controls for non-composing ops (can't escalate).
The OperationEnv trait becomes the integration point (Arc<dyn OperationEnv>),
following the IdentityProvider precedent (ADR-004): the CallAdapter composes
the root OperationContext.env per incoming call from the active layers
(curated base + connection overlay + session overlay). Children inherit the
parent's composite env by Arc::clone — overlay composition happens once at
the root and propagates through the composition tree.
Resolves review #002 C6 (OperationContext.env type identity crisis): the
field is split into scoped_env: ScopedOperationEnv (reachability data, from
the registration bundle) and env: Arc<dyn OperationEnv + Send + Sync>
(dispatch trait object). One field was being used as two different types
(reachability set with .allows() and dispatch trait with .invoke());
Localizes W4 (hot-swap ↔ registry mutability coupling) to the connection
scope: no global mutable registry to hot-swap; overlays replace naturally
with connect/disconnect and session start/end. Schema-drift on reconnect is
a per-connection overlay-rebuild concern, not a global hot-swap protocol.
Partially addresses W3 (CallClient registry security): the registry-shape
sub-question is resolved by the overlay model; the capability-exposure
sub-question (what capabilities a remote peer can trigger) remains for
ADR-017 — ADR-024 does not overclaim resolution there.
Amends OQ-04 to scope its immutability claim to the HandlerRegistry and
cross-reference ADR-024 for the operation registry. Generalizes OQ-19's
session-overlay mechanism to also cover connection-scoped remote imports —
both are per-scope dynamic overlays on the static curated base, using the
same trait-layering mechanism.
Governance (Tier 2):
- Advance ADR-022 and ADR-023 from Proposed to Accepted (specs already
depend on their types as source of truth)
- Amend ADR-015: mark Decision 3 and Assumption 6 as superseded by ADR-022;
update handler_identity type to CompositionAuthority
- Amend ADR-002: note handle() signature revised by ADR-007 (BiStream → Connection)
- Amend ADR-004: note 'enrich/replace' AuthContext language superseded by
ADR-011's immutability model; update to describe set_identity on Connection
- Update main README ADR table to show ADR-022/023 as Accepted
Spec-ADR consistency (Tier 3):
- Add abort_policy: AbortPolicy field to OperationContext struct (ADR-016
Decision 6 mandated this but the spec omitted it)
- Define AbortPolicy enum (AbortDependents | ContinueRunning) with Default impl
- Add abort_policy to build_root_context and LocalOperationEnv::invoke()
- Define the OperationEnv trait explicitly with invoke() and
invoke_with_policy() methods (was referenced as 'must remain a trait'
but never defined)
- Specify From<StreamError> for HandlerError impl with exact variant mapping
- Add Connection::from_quinn() / from_iroh() constructors (was referenced
as Connection::new() but never defined)
- Remove undefined CertAuthorityEntry placeholder from AuthPolicy v1 (will
be added additively when alknet-ssh lands)
- Fix config.md key-differences table: rate limits are in DynamicConfig,
not StaticConfig
Mechanical fixes (Tier 1):
- overview.md: 'closes the QUIC stream' → 'closes the connection' (stale
from pre-ADR-007 model)
- overview.md: OQ-04 entry updated from stale 'defer to implementation'
to 'resolved: static at startup'
- mnemonic-derivation.md: remove duplicate helper functions block (incomplete
first copy, complete second copy)
- ADR-003: add iroh (feature-gated) to alknet-core dependency list, added
by ADR-010
- ADR-021: fix ambiguous 'W1 drift issue from the vault review' cross-reference
- ADR-022: rephrase FromCall 'leaf locally' to 'leaf in the local registry'
- ADR-017: add error_schemas to from_call mirror list and services/schema
step (inconsistency with ADR-023)
- ADR-016: fix self-referential citation ('ADR-016 Assumption 5' → 'Assumption 5')
- Add ScopedOperationEnv::empty(), allows(), new() and
CompositionAuthority::none(), new() impl blocks (referenced but undefined)
- Add call.completed clarification for non-subscription calls
- Add services/schema leading-slash normalization note
- Crate README ADR tables: add missing ADR-013 (call), ADR-015 (core),
ADR-006 + ADR-010 (vault)
- Vault README: add consolidated 'Known Source Drift' table tracking all
four drift items (OsRng, unwrap, CURRENT_KEY_VERSION, spawn bug) in one
place, including the two previously missing from README
ADR-015 L171 said the scoped env API was 'a two-way door for
implementation.' ADR-022 has now resolved it: ScopedOperationEnv with
operation-level granularity (HashSet<String>), not namespace-level.
Update the stale text to point to the resolution.
ADR-016 Decision 6 specifies that the abort policy (abort-dependents vs
continue-running) is set on OperationContext and propagated through
OperationEnv::invoke() — the composing handler decides the child's
policy, not the wire caller. The call.requested payload does not carry
an abort policy field. This resolves the TBD that was masquerading as a
two-way door: two of the three options ADR-016 floated (wire payload,
per-operation declaration) were inconsistent with the ADR's own
assumptions.
Also marks review #001 as resolved — all 5 critical, 4 warning, and 4
suggestion findings are now addressed.
ADR-023 adds error_schemas to OperationSpec so operations can declare
their domain-level failure modes (FILE_NOT_FOUND, RATE_LIMITED, etc.)
distinct from protocol-level codes (NOT_FOUND, FORBIDDEN, etc.). The
call.error payload gains an optional 'details' field carrying the typed
error payload conforming to the declared schema. from_openapi/to_openapi
map OpenAPI response status codes to/from ErrorDefinitions, making the
adapter contract from ADR-017 faithful on the error axis.
Also fixes W2 (KeyVersionMismatch stale comment in encryption.md —
ADR-021 implements rotation without this variant) and W4
(derive_encryption_key_for_version missing from service.md method list).
Spec updates: operation-registry.md (OperationSpec, ErrorDefinition,
Handler error mapping, services/schema), call-protocol.md (call.error
payload, CallError, ResponseEnvelope), README.md, overview.md,
open-questions.md (OQ-24), call/README.md, encryption.md, service.md.
ADR-022 wires the three controls ADR-015 specified but left without
registration paths (C1-C4 from review #001): composition authority,
scoped env, and capabilities now enter through a HandlerRegistration
bundle. Provenance (Local, FromOpenAPI, FromMCP, FromCall, Session)
determines which ops can compose — leaves don't get composition
authority. CompositionAuthority replaces handler_identity: Identity
(it's a declared authority bundle, not a peer identity). Capabilities
are per-request from the bundle (resolves closure-capture vs context
ambiguity). Kernel/user analogy: user's authority checked at External
gate; handler's composition authority used inside; scoped env bounds
reachability.
Also fixes W1 (stale ADR-020 path example) and W3 (from_mcp missing
from adapter lists in operation-registry.md).
Spec updates: operation-registry.md (OperationRegistry,
HandlerRegistration, OperationContext, OperationEnv, registration
example, capability injection), call-protocol.md (build_root_context),
README.md, overview.md, open-questions.md (OQ-23), call/README.md.
Key rotation uses version-indexed derivation paths: each key version maps
to a distinct SLIP-0010 path (m/74'/2'/0'/{version-2}'). v2 is at index 0
(PATHS::ENCRYPTION), v3 at index 1, etc.
Mechanism:
- encryption_path_for_version(version) constructs the path
- decrypt derives the key at the version-indicated path (not always
PATHS::ENCRYPTION)
- rotate(blob, to_version) decrypts with old key, re-encrypts with new
- No new mnemonic needed — same seed, different path
- Partial rotation is safe — old keys remain derivable
- The vault does not self-rotate; the assembly layer iterates blobs
Source drift flagged:
- decrypt currently ignores key_version for path selection (always uses
PATHS::ENCRYPTION) — must use version-indexed paths
- rotate method does not exist in source — must be added
- CURRENT_KEY_VERSION must bump from 1 to 2 (per ADR-020, reinforced here)
OQ-22 resolved. Only OQ-21 (remote vault admin, deferred) remains.
The vault uses SLIP-0010 HD derivation from the BIP39 seed for the
AES-256-GCM encryption key, not PBKDF2. This replaces the TypeScript
predecessor's (@alkdev/storage/src/graphs/crypto.ts) PBKDF2-based
approach.
Key decisions:
- HD derivation at m/74'/2'/0'/0' produces the encryption key
- PBKDF2 is not implemented in the vault; no password-based derivation
- salt field is unused in v2 (wire-format compat only)
- key_version=1 reserved for TS PBKDF2 data; key_version=2 for vault HD
- TS-encrypted data requires one-time migration to v2
- CURRENT_KEY_VERSION changes from 1 to 2 (source drift flagged)
OQ-20 resolved: the encryption key derivation method is locked. OQ-22
(key rotation workflow) remains open but does not block implementation.
Spec the vault crate from its existing implementation. The vault is
stable (implementation exists); this spec documents what IS so the
implementation-sync agent can reconcile source drift.
New spec documents (crates/vault/):
- README.md — crate index, security constraints, public API
- mnemonic-derivation.md — BIP39, SLIP-0010, BIP-0032, derivation paths
- encryption.md — AES-256-GCM, EncryptedData, key versioning, salt
- service.md — VaultServiceHandle lifecycle, actor dispatch, cache
- protocol.md — VaultProtocol irpc messages, DerivedKey redaction
New ADRs:
- ADR-018: Vault as standalone crate (zero alknet deps; own types/errors)
- ADR-019: Vault assembly-layer-only access (CLI is sole caller)
New open questions:
- OQ-20: Salt/KDF Phase B (open, low priority — salt field reserved)
- OQ-21: Remote vault administration (deferred — needs ADR if ever needed)
- OQ-22: Key rotation mechanism (open, low priority — workflow not specced)
Spec-vs-source drift explicitly flagged (for the sync agent):
- rand::random() used for IVs instead of OsRng (security-critical)
- unwrap() on every RwLock acquisition (must use unwrap_or_else)
- ADR-038 / OQ-SVC-03 references in source comments are stale (old numbering)
- VaultServiceActor::spawn returns a non-functional second actor (source bug)
- KeyVersionMismatch error variant is defined but unused in v1
Critical:
- operation-registry: remove stale duplicate OperationEnv impl that
propagated parent.metadata through composition (violated ADR-014);
collapse to one canonical block with metadata: HashMap::new()
- operation-registry: fix request_id collision — format!("env-{name}")
produced identical IDs across concurrent invocations, corrupting
PendingRequestMap correlation and the abort-cascade tree (ADR-016)
- operation-registry + ADR-015: fix OperationContext.internal visibility —
pub field let handlers mark their own call internal (privilege
escalation per ADR-015); change to pub(crate) with pub fn is_internal
Warnings:
- core-types: add Connection::set_identity/identity (OQ-11) to the
Connection type spec — was specified in auth.md but missing from the
type definition
- operation-registry: add Capabilities: Clone design note — invoke()
clones capabilities through composition; explicit security implication
- call-protocol: add CallAdapter root OperationContext construction
example showing internal: false wire path, complementing
OperationEnv::invoke() internal: true composition path
- overview: remove alknet/agent from ALPN registry — agent is a future
consumer of alknet-call (call-protocol operations), not a separate ALPN
- call-protocol: clarify call.requested payload schema and the
leading-slash convention (wire operationId has slash, registry name
does not)
Suggestions:
- operation-registry: cross-reference ResponseEnvelope definition
- core-types: add StreamError to HandlerError mapping table
ADR-017 locks the client/adapter architecture:
- CallClient opens QUIC connections, shares dispatch loop with CallAdapter
- Connection direction independent of call direction (both sides can call)
- from_call adapter: discovers remote ops via services/list + services/schema,
registers with forwarding handlers (same pattern as from_openapi/from_mcp)
- to_openapi/to_mcp: project local ops to external protocols
- OperationAdapter trait: produces (OperationSpec, Handler) pairs
- Cross-node call tree: abort cascade propagates through from_call handlers
- Credentials from capabilities (ADR-014), adapter ops Internal by default (ADR-015)
The dispatch POC at /workspace/@alkdev/dispatch demonstrated head/worker over
SSH+axum; under the call protocol it's cross-node composition via from_call.
Connection topology (who advertises, who opens) is independent of call
direction — runner pattern, dispatch pattern, and P2P all work.
The 'trusted' flag on OperationContext was the wrong word — it implies a
trust decision was made, but what actually happens is the call originated
internally (from composition) not externally (from the wire). Renamed to
'internal' with clarified semantics: internal calls switch authority
context to the handler's identity, not skip ACL. This prevents the
privilege escalation vector where composition with 'trusted: true' bypassed
all access control (buggy handler + parameterized dispatch).
- Rename trusted -> internal across operation-registry.md, ADR-014
- Update OperationContext field description and LocalOperationEnv code
- Add OQ-17: abort cascade for nested calls (call.aborted cascades to
descendants, default abort-dependents, continue-running opt-in). One-way
door on the protocol event schema; mechanism is a two-way door.
- Add OQ-18: privilege model and authority context (internal = authority
switch not ACL skip, External/Internal operation visibility, scoped
composition env + handler identity). Needs agent crate in view.
- Add abort cascade section and constraint to call-protocol.md
- Update crates/call/README.md with OQ-17, OQ-18, and two new design principles
- Update architecture README.md with OQ-17, OQ-18
Resolve the contradiction between ADR-008's "capability source" model
and operation-registry.md showing vault operations on the wire. ADR-014
establishes: vault is assembly-layer only, capabilities carry outbound
credentials (distinct from inbound identity), call protocol carries no
secret material, adapters take credential sources not static tokens.
- Add ADR-014 (Secret Material Flow and Capability Injection)
- Remove vault/derive, vault/unlock, vault/decrypt from call protocol
registration examples and all spec examples
- Add Capabilities field to OperationContext, propagate through
LocalOperationEnv nested calls
- Add Capability Injection section to operation-registry.md
- Add no-secret-material wire constraint to call-protocol.md
- Add streaming subscribe example (LLM chat with Vercel UI chunks)
- Add Security Model section to overview.md (identity vs capabilities)
- Trim WASM treatment from ~20 lines to a design-constraint note
- Add OQ-16 (resolved: no vault ops on wire), update OQ-08, OQ-15
- Update ADR-003, ADR-008, ADR-013 to remove stale "via call protocol"
vault references
- Rewrite OQ-12: separate two distinct TLS identity use cases (RFC 7250
raw keys as default for P2P, X.509 for domain-hosted/browsers) instead
of conflating them as 'file paths now, ACME later'. ACME is a proven
pattern from the reverse-proxy project, not speculative future work.
- Resolve OQ-13 and OQ-14: remove 'Phase 1' framing from core crate
specs. /{service}/{op} is the correct design for alknet-call, not a
simplification. Batch as correlated call.requested events is the correct
protocol design. Core crates need to be done right from the start.
- Add ADR-013: Rust as canonical implementation language. TypeScript
@alkdev/operations is a reference that informed the design, not a
parallel implementation. The only JS use case is browser SDK adaptation.
Five reasons: memory safety, LLM competence, supply chain attacks,
performance, browser-only JS.
- Add alknet-agent crate to the crate graph (depends on alknet-call, not
alknet-core). Agent service uses call protocol client for tool dispatch
and vault/derive for provider keys — no env vars for secrets. ALPN
alknet/agent added to the registry.
- Add OQ-15: call protocol client and adapter contract. alknet-call needs
both server (CallAdapter) and client (remote invocation over QUIC), plus
the adapter traits (from_*, to_*) that enable composition.
- Clarify alknet-napi as thin NAPI projection layer, not business logic.
- Fix bugs: ProtocolController → ProtocolHandler typo, OperationEnv
invoke() path format inconsistency, RateLimitConfig comment confusion.
- Update endpoint.md TLS section: comprehensive identity model comparison
table, RFC 7250 as default mode, ACME as proven pattern.
Add architecture specs for the alknet-call crate:
- call-protocol.md: CallAdapter, EventEnvelope wire format, bidirectional
stream model with ID-based correlation, PendingRequestMap, protocol
operations (call/subscribe/batch/schema), per-request identity resolution,
connection/stream lifecycle, error codes
- operation-registry.md: OperationSpec, async Handler type, OperationRegistry,
AccessControl with trusted call bypass, OperationEnv with context
propagation (parent_request_id, identity inheritance), service discovery,
irpc integration layering, naming convention (no leading slash in names)
- ADR-012: Call protocol uses bidirectional QUIC streams with EventEnvelope
framing and ID-based correlation. Protocol is stream-agnostic and symmetric.
Resolves OQ-07.
Key design decisions:
- Handler type is async (Fn returning Pin<Box<dyn Future>>)
- OperationEnv::invoke propagates parent context (identity, metadata,
parent_request_id)
- Identity resolution is per-request, not per-connection
- Operation names without leading slash (fs/readFile, not /fs/readFile)
- Batch is a client-side pattern, not a protocol primitive (OQ-14)
- Phase 1 uses service/op paths, node prefix added later (OQ-13)
Also: promote ADR-010 and ADR-011 from Proposed to Accepted, add OQ-13
and OQ-14 to open-questions.md.
iroh uses RFC 7250 raw Ed25519 public keys for TLS instead of X.509
certificates. rustls already supports this. This means the quinn
endpoint can also use raw public keys — same key-based identity model
as iroh, but with direct QUIC over UDP. X.509 is optional, needed
only for domain-facing identity (browser/WebTransport clients).
Update StaticConfig with TlsIdentity enum (X509, RawKey, SelfSigned)
and add iroh_relay field. Remove 'iroh deferred' language — iroh is
a first-class connectivity mode.
iroh's Endpoint natively supports ALPN negotiation and set_alpns(). Our
HandlerRegistry dispatches exactly like iroh's own ProtocolMap/Router
pattern, but shared across both quinn and iroh connection sources. We
use iroh::Endpoint directly (not iroh::Router) because our HandlerRegistry
and AuthContext are shared across sources.
Correct the conflation of quinn/TLS/iroh as interchangeable transports.
They are complementary connectivity modes serving different deployment
contexts: quinn (public IP + TLS), iroh (NAT traversal via relay), TCP
(handler-specific, not core). Clarify that TLS cert = network identity,
not auth identity. Map stealth mode to HTTP handler on standard ALPNs
instead of byte-peeking. Resolve OQ-05 as one-way door. SendStream/
RecvStream now use internal enum dispatch for both quinn and iroh
streams.
Rename the crate from alknet-secret to alknet-vault to better reflect its
purpose as a local key vault (seed management, key derivation, encryption)
rather than a network service.
Symbol renames:
- SecretService → VaultService
- SecretServiceHandle → VaultServiceHandle
- SecretServiceActor → VaultServiceActor
- SecretServiceError → VaultServiceError
- SecretProtocol → VaultProtocol
- SecretMessage → VaultMessage
- ServiceLocked → VaultLocked
- alknet_secret → alknet_vault (crate name)
Update ADR-008 with vault access pattern: the vault is a capability source,
not a service endpoint. The CLI injects derived/decrypted material into
operation contexts — handlers never hold vault references.
Create the alknet-secret crate with BIP39 mnemonic generation, SLIP-0010
Ed25519 HD key derivation, AES-256-GCM encryption, and SecretProtocol
irpc service definition. This is Phase 3.1 from the integration plan.
Architecture changes:
- Promote secret-service.md to reviewed status with full spec format
(crate structure, public API, security model, phase progression,
ADR/OQ cross-references, wire format compatibility section)
- Add ADR-038 (seed lifecycle and memory security): zeroize for v1,
mlock deferred to Phase B
- Add OQ-SEC-01 (mlock/VirtualLock for seed RAM) to open-questions.md
- Update README.md with ADR-038 and secret-service status
Crate structure:
- src/mnemonic.rs: BIP39 phrase generation, validation, seed derivation
- src/derivation.rs: SLIP-0010 HD key derivation, path constants (74')
- src/encryption.rs: AES-256-GCM encrypt/decrypt, EncryptedData type
- src/protocol.rs: SecretProtocol irpc enum, DerivedKey, KeyType
- src/service.rs: SecretServiceHandle with Unlock/Lock lifecycle
- 40 passing tests (unit + integration + doc)
Update four existing specs (overview, server, napi-and-pubsub, call-protocol) to
reflect Phase 0 decisions: three-layer model, IdentityProvider, ForwardingPolicy,
OperationEnv, static/dynamic config split. Review all 9 Phase 0a ADRs (026-034)
for consistency. Fix 4 critical issues from architecture review: missing OQ-SVC-05
in open-questions.md, deprecated hub terminology, undefined AuthService and noq
terms. Replace inline OQ text with cross-references per format rules. Add
ConfigServiceImpl definition to configuration.md. Port absolute workspace paths
to project-relative links by copying referenced docs (feasibility, certbot,
fail2ban, event_source_types) into docs/research/.
The architecture specs were implying that StorageIdentityProvider, irpc
service implementations, and application services (agent, Docker, etc.)
already exist. This commit makes the phasing explicit:
- services.md: deployment topology now clearly labels 'Current (Phase 1)'
vs 'Future (Phase 2+)', notes that application services are downstream
- identity.md: StorageIdentityProvider labeled 'Future — Phase 2+',
clarifying alknet-storage doesn't exist yet
- storage.md: adds phase note that the crate hasn't been built yet,
StorageIdentityProvider is a future impl
- ADR-028: ConfigAuthService is Phase 1 path, StorageAuthService is
Phase 2+ contract
- call-protocol.md: Agent Service Pattern section explicitly framed as
a downstream application concern, not a core requirement
Unified authentication (ADR-023): SSH and WebTransport auth share the same
Ed25519 key material. Token auth uses signed timestamps verified against the
same authorized_keys set. IdentityProvider trait decouples core from identity
storage.
Bidirectional call protocol (ADR-024): Generalizes control channel (ADR-018)
to support hub→spoke and spoke→hub calls. Operation paths use /{spoke}/{service}/{op}
format for three-level routing. EventEnvelope wire format, five call events,
PendingRequestMap for correlation.
Handler/spec separation (ADR-025): Downstream consumers register operations
without modifying core. OperationRegistry maps paths to specs + handlers.
Service discovery via /services/list and /services/schema.
Resolves OQ-17 (transport-aware auth), OQ-21 (spoke routing), OQ-CFG-04 and
OQ-CFG-06 (WebTransport auth and transport-aware auth layer). Adds OQ-18
through OQ-22 for remaining open questions.
Address 5 critical and 7 warning issues from architecture review:
- Fix duplicate sentence in napi-and-pubsub.md server side section
- Add wraith- namespace reservation to server.md constraints (ADR-018)
- Document stealth mode TLS-only requirement in server.md
- Create ADR-019 for --proxy dual semantics (client vs server)
- Clarify NAPI connect() vs CLI wraith connect distinction
- Add SOCKS5h default as privacy design decision in client.md
- Expand reconnection section (always-on, re-register port forwards)
- Add graceful shutdown sections to client.md and server.md
- Specify OpenSSH key format for path-or-buffer inputs across all docs
- Resolve pubsub alternative approach ambiguity (ADR-018 is primary)
- Replace server.md handler impl block with behavioral description
- Standardize iroh endpoint ID terminology (base58-encoded)
- Remove iroh API implementation details from transport.md/server.md
- Add error handling pattern as cross-cutting concern in overview.md
- Update all document statuses from draft to reviewed