Reframes the SSH scope around the channel multiplexer as the decomposition
point. Each feature (forwarding, SOCKS5, SFTP) is a channel type or a consumer
of channel types, stacking on the core — each layer functional when built,
none shipped broken. Dissolves the 'massive v1' framing that produced hedging
language proposing non-functional or half-built versions.
Three developments since the initial 2026-06-25 research changed the framing:
(1) WebTransport landed as ADRs 038/040/043, grounding SSH-over-WebTransport
as a constraint (the handler must be source-agnostic about its Connection);
(2) russh's runtime abstraction (russh-util swaps tokio::spawn for
wasm_bindgen_futures on wasm32) means the SSH *client* runs in WASM when fed a
WebTransport BiStream — the browser case is real, not speculative;
(3) the http crate intersection (ALPN-stream-proxy depends on SSH handlers
being source-agnostic) is now visible and specified.
The layered build order (1-4 stream+connection+channels+exec, then 5
forwarding, then 6 SOCKS5, then 7 SFTP) doubles as the configuration surface:
each layer beyond the core is an opt-in channel type, gating on the
default-deny ACL baseline inherited from russh.
A consistency review of the alknet-http specs found two classes of
issues: internal contradictions from the mid-spec pivot (the to_openapi
gateway pattern landed in prose but not in cross-references), and a
systematic client→server assumption that only holds for the OpenAPI/MCP
case leaking into the WebTransport architecture.
Class 1 (internal contradictions):
- C1: to_openapi was half-refactored — body described the ADR-042
gateway pattern but the decisions table and ADR-036 still said
'paths mirror /{service}/{op}'. ADR-036's to_openapi clause is now
amended as superseded by ADR-042; the stale decisions row and README
Principle 2 are fixed.
- C2: the axum Router route list didn't include the 5 gateway endpoints
(/search, /schema, /call, /batch, /subscribe). Added them; clarified
/openapi.json as the gateway description doc; added gateway paths to
the decoy exclusion list.
- C3: ADR-034 §5 still talked about the 'h3/WebTransport deferral
bucket' that ADR-038 eliminated. Amended §5/Consequences/References
to drop the deferral framing (the auth-model decision stands; only
the 'when' wording was stale).
Class 2 (one-way direction assumption):
- C4/C5/C6: the WebTransport specs framed the session as browser→hub
one-way, when the call protocol is bidirectional and WebTransport is
a general ALPN transport substrate. New ADR-043 reframes WebTransport
as a bidirectional ALPN transport substrate (call protocol is the
first/canonical target; needs no WASM parser), names the call
protocol's bidirectionality over WebTransport sessions, and states
the inbound no-PeerId connection-local overlay as the mirror of
ADR-034 §2. webtransport.md is updated to reflect this framing;
ADR-040 is repositioned (not superseded) as the substrate's non-call-
ALPN mechanism.
- C7: the HTTP/1.1+HTTP/2 surface's one-directionality is now named as
a lossy consequence of HTTP request/response; WebTransport is named
as the surface that restores the bidirectional call model.
- C8: overview.md acknowledges the from/to direction model is
OpenAPI/MCP-specific, not a call-protocol property.
A review subagent pass on ADR-043 + webtransport.md found no critical
issues; warnings W1-W3 (residual browser-as-subject framing, ADR-009
rationale in spec, opening abstract tone) and suggestions S2/S4/S5
were addressed.
The to_openapi spec was describing one OpenAPI path per alknet operation
— the inverse of from_openapi. That inverse is genuinely messy: the call
protocol's input is a flat JSON object, and generating a traditional
OpenAPI path entry (POST /fs/{path} with path param, body, query params)
requires reverse-engineering which fields are path/query/body — metadata
the call protocol doesn't carry. The three options (leaky HTTP metadata
on OperationSpec, fragile heuristics, manual annotation) are all messy.
ADR-042 replaces this with the gateway pattern (same as ADR-041 for
to_mcp): to_openapi generates 5 fixed endpoints (search, schema, call,
batch, subscribe) that gate access to the full operation registry. The
input is always a flat JSON body — no path/query/body split to
reverse-engineer. JSON Schema is already in the OperationSpec.
The per-caller API surface is the key advantage: /search is
AccessControl-filtered, so the client sees only what it can call. The
Gitea failure mode (dumping admin ops to every caller in a static
OpenAPI doc) is structurally impossible — the per-caller surface is the
default, not an afterthought. OpenAPI has no per-caller filtering
concept; the gateway pattern provides it through /search.
Gateway endpoint set:
- /search -> services/list (AccessControl-filtered, names + descriptions)
- /schema -> services/schema (full OperationSpec)
- /call -> call.requested (Query/Mutation, flat JSON body)
- /batch -> multiple call.requested (correlated IDs)
- /subscribe -> call.requested (Subscription, SSE) — the one endpoint
the MCP gateway excludes (MCP is request/response; OpenAPI/SSE
supports streaming)
A traditional per-operation-paths projection is additive (a deployment
that wants the nice Swagger UI builds it with HTTP-specific metadata),
not a replacement. The gateway is the default.
http-adapters.md to_openapi section rewritten: the gateway endpoint
set, per-caller filtering, error fidelity on the /call endpoint, and
the additive traditional projection. The 'Why' section adds the
flat->structured and per-caller-surface rationale.
README/overview ADR tables and the top-level README current-state note
updated for ADR-042.
The to_mcp spec was describing one MCP tool per alknet operation — the
tool-bloat problem. An LLM connecting to a node with 200 operations gets
200 MCP tools dumped into its context, degrading reasoning and wasting
context budget.
ADR-041 replaces this with the tool-gateway pattern (same pattern as
opencode's memory and worktree tools): to_mcp exposes 4 fixed meta-tools
(search, schema, call, batch) that gate access to the full operation
registry. The LLM has a few tools in context, discovers operations on
demand through search + schema, then calls. Same principle as Linux's
man command — don't preload all documentation; query on demand.
Gateway tool set:
- search -> services/list (names + descriptions, AccessControl-filtered)
- schema -> services/schema (full OperationSpec for a specific op)
- call -> call.requested (Query/Mutation only, request/response)
- batch -> multiple call.requested (correlated IDs, OQ-14)
Subscription operations are excluded — MCP tool calls are
request/response by protocol design (the client blocks until
CallToolResult returns); streaming subscriptions don't fit. Subscriptions
are filtered out of search results and cannot be invoked via call.
http-mcp.md to_mcp section rewritten: the gateway tool set, Subscription
exclusion, and the service behavior (tools/list returns 4 fixed tools,
tools/call dispatches through the gateway). The 'Why' section adds the
tool-bloat rationale and the memory/worktree tool pattern that informed
the design.
README/overview ADR tables and the top-level README current-state note
updated for ADR-041.
The 'WebTransport proxy' concept was conflating two distinct things;
this pass separates them:
1. In-process ALPN-stream-proxy (ADR-040, in alknet-http): the h3 handler
hands a WebTransport stream to another ALPN handler (SshAdapter,
GitAdapter, etc.) as a Connection, so a browser with a WASM parser
can reach any ALPN service via WebTransport. Path-based routing
(the CONNECT path declares the target: /alknet/ssh -> SshAdapter).
HttpAdapter gains Arc<HandlerRegistry> for the lookup. The browser's
WASM parser implements BiStream (ADR-007) over the WebTransport
stream. SSH-over-WebTransport is HTTPS-shaped at the network layer
(anti-censorship: the 'VPN-like without being a VPN' use case on a
clean foundation). russh-sftp demonstrates WASM targeting is
feasible; SSH is the next target.
2. Standalone relay service (OQ-38, future alknet-relay crate): a full
relay - fork of iroh-relay - with WebTransport proxy fallback for
NAT traversal. This is infrastructure, not a mode of the h3 handler.
OQ-38 reframed to be the standalone-relay scope question (distinct
from the in-process proxy now resolved by ADR-040).
webtransport.md updated: three stream destinations (call protocol,
ALPN-handler proxy, other sub-protocols) with path-based routing; new
'ALPN-stream-proxy' section covering the WASM client side, auth model
(bearer token gates the session; protocol's own auth gates the
protocol session), and the HandlerRegistry reference.
README/overview ADR tables and OQ summaries updated for ADR-040.
Replace AcceptAnyServerCertVerifier (a security hole for X.509) with
verifier selection by PeerEntry presence (ADR-034 §3, OQ-29):
- build_client_auth presents the Ed25519 key as an RFC 7250 raw public
key client cert (replaces with_no_client_auth), activating the
PeerEntry fingerprint -> peer_id resolution path on quinn.
- select_server_verifier: Some(fingerprint) -> FingerprintPinVerifier
(fingerprint match for known peers); None -> WebPkiServerVerifier
(CA verification for public X.509 endpoints). None + Ed25519 raw key
fails closed at handshake (no CA to fall back to).
- FingerprintPinVerifier matches ed25519:<hex> (raw key extraction) and
SHA256:<hex> (DER hash); verifies handshake signatures via
verify_tls13_signature_with_raw_key / verify_tls12/13_signature.
- Extract shared fingerprint logic into alknet_core::fingerprint (pub
module) reused by endpoint (server-side) and call_client (client-side).
- remote_identity: None is load-bearing (not defaulted to placeholder).
- Integration tests updated to pin the self-signed server cert
fingerprint (the known-peer path).
Commits the concrete adapter shape deferred by ADR-033: read-sync /
write-async split with honker NOTIFY/LISTEN for no-restart cache
invalidation, against SQLite, in a separate alknet-store-sqlite crate.
Two constraints drive the design: (1) the hot-path read trait
(IdentityProvider::resolve_from_fingerprint, CredentialStore::get) is
sync — called in the accept loop, no .await — so a SQLite-backed
adapter must cache in memory and serve sync reads from the cache; (2)
auth changes must take effect without a restart (an early issue the
project already fixed for ConfigIdentityProvider via ArcSwap config
reload). honker's SQLite NOTIFY/LISTEN (single-digit-ms wake, no
polling) is the cache-invalidation mechanism that makes both hold:
write commits to SQLite + emits NOTIFY, the running process's LISTEN
wakes, the in-memory index reloads and atomically swaps, the next
read sees the new state. Same ArcSwap-reload pattern as config,
generalized from 'config file is source of truth' to 'SQLite is
source of truth, honker signals when it changed.'
New async IdentityStore write trait (put_peer / update_peer /
remove_peer) extends the sync IdentityProvider read trait for peer
mutations. ConfigIdentityProvider does NOT implement it (config
reload is its write path — a posture enforced by the absence of a
backend, not a type-system constraint); SqliteIdentityProvider
implements both. CredentialStore::put/delete refined to async (within
ADR-031's one-way door — the contract was get/put/delete keyed by
provider persisting EncryptedData never decrypting; sync-vs-async was
unspecified). CredentialStoreError renamed to shared StoreError
covering both traits.
alknet-store-sqlite is one crate implementing both IdentityStore and
CredentialStore with shared SQLite connection + honker LISTEN infra
(splitting later is a two-way door). Schema shape committed (one row
per PeerEntry with JSON columns for fingerprints/scopes/resources;
one row per EncryptedData blob keyed by provider); exact DDL is an
implementation-detail two-way door in the adapter crate. The keypal
adapter-factory pattern is intentionally not ported to Rust (runtime
column-mapping is a TS affordance; in Rust each adapter is a concrete
type, cross-cutting concerns are a shared helper module).
Amends ADR-031 (put/delete async refinement, StoreError rename),
ADR-033 (concrete adapter shape now specified, two-crate framing
collapsed to one), ADR-034 (OQ-36 now resolved), auth.md (IdentityStore
section, cache-invalidation summary, OQ-36 reference), config.md (two
write paths note), and the OQ-36/OQ-34 entries in open-questions.md.
Review fixed 4 criticals (error-type name divergence, duplicate
IdentityProvider sketch, upsert/Duplicate ambiguity, 'shape unchanged'
contradiction), 7 warnings, 5 suggestions.
Untangles the conflation of three distinct remote roles under 'X.509
endpoint': (1) public X.509 endpoint — a remote HTTPS/call-over-TLS
server the local node is a client of (no PeerEntry, no PeerId, not in
the peer graph; CA verification + bearer token); (2) transport relay —
iroh's DERP-equivalent, infrastructure, not an alknet peer; (3) hub /
hosting node — an alknet peer that also exposes a public domain + X.509
for browsers (mixed-fingerprint PeerEntry, already supported by
ADR-030).
The load-bearing one-way door is the client-side verifier selection
rule: known peer (PeerEntry present) → fingerprint pin; unknown X.509
remote → CA verification (WebPkiServerVerifier); unknown Ed25519
remote → fails closed. This closes the AcceptAnyServerCertVerifier
security hole OQ-29 flagged, with the peer-model criterion (PeerEntry
presence) made explicit. The 'make PeerEntry symmetric' instinct is
rejected — pure-client connections to public APIs have no stable
logical identity to pin.
Documents that CallCredentials.remote_identity: None is load-bearing
(None = public X.509 endpoint → CA path, not a missing field; Some =
known peer → fingerprint pin), closing a subtle gap where an
implementer could have defaulted to a placeholder or treated None as
skip-verify.
Records WebTransport relay-as-proxy (deferred with h3/WebTransport,
new OQ-HTTP-07) and on-chain/smart-contract peer discovery (fits the
OQ-36 repo/adapter pattern, no auth-model change) so they aren't lost.
Amends auth.md and client-and-adapters.md with the three-role naming,
the verifier selection rule, and the Option semantics; updates OQ-37
to resolved in open-questions.md, README.md, and both crate READMEs.
ADR-009, open-questions.md, and the architect agent spec all had the same
conflation: 'two-way door' was phrased as 'can be decided during
implementation,' which reads as 'defer the decision.' That's not what it
means. A two-way door is a decision you make now and can revert later if
wrong — it's about reversal cost, not urgency.
ADR-009: add §'What this framework is NOT' — explicitly separates door
type (reversal cost) from deferral (scope management). State that
architecture decisions are the architect's regardless of door type.
Reword the two-way-door process from 'can be decided during
implementation' to 'pick the simplest option that works, implement it,
revert if needed.'
open-questions.md: reword the header to clarify door type describes
reversal cost, not urgency. Add 'Door type is separate from whether a
decision is made.'
architect.md: add Key Principle #8 (decisions are made, not deferred),
a new 'Door Types and Decision Urgency' section, and two new anti-patterns
(#8: door type as deferral, #9: hedging language in resolved decisions).
Amend ADR-030 with three changes from the auth-type analysis:
1. PeerEntry is now multi-credential: fingerprints: Vec<String> (Ed25519
and/or X.509) + auth_token_hash: Option<String> (bearer token). All
resolve to the same peer_id. A peer that authenticates via Ed25519
today and via auth_token tomorrow gets the same PeerId. The 'peer
bearer vs auth bearer' distinction was wrong — the correct framing is
the three credential types (Ed25519, X.509, bearer token) and whether
the token needs a stable logical id across rotation (PeerEntry) or not
(ApiKeyEntry).
2. Fingerprint normalization (§6): quinn extracts the raw Ed25519 public
key from the SPKI cert and formats as ed25519:<hex>, matching iroh.
The same key has the same fingerprint regardless of transport. X.509
fingerprints stay as SHA256:<hex of DER>. This also simplifies the
coming WebTransport relay work.
3. The 'API keys' section is replaced with 'Bearer tokens' — correctly
framing the three auth types and the two bearer-token paths
(PeerEntry.auth_token_hash vs ApiKeyEntry).
Resolve OQ-29 (CallClient TLS client-auth): wire quinn client-auth (present
Ed25519 key as raw public key client cert — the server-side extraction
already works); key-type-aware server cert verification (raw key =
fingerprint match, X.509 = CA verification via WebPkiServerVerifier —
AcceptAnyServerCertVerifier is only safe for raw keys); fingerprint
normalization. The iroh path already works (RFC 7250 raw keys, both sides
exchange automatically); the gap was quinn-only.
Dissolve OQ-35: the 'API key asymmetry' framing was wrong. PeerEntry
supports multiple credential paths; ApiKeyEntry is for tokens that ARE the
identity.
Add OQ-37: X.509 outgoing-only case — the three auth types and how X.509
server identity fits the peer model. Not blocking the ADR-029 migration;
downstream (HTTP crate phase).
Update auth.md, config.md, client-and-adapters.md, call/README.md,
core/README.md, open-questions.md, README.md, and call_client.rs source
comment.
Workspace green: 326 tests pass, build clean.
Resolve the call-crate open questions where the decision is made —
OQ-27 (auto-re-import), OQ-28 (same-peer collision = error), OQ-30
(PeerRef::Any insertion-order first-match), OQ-31 (services/list-peers
opt-in). These were previously marked 'open' with 'v1' hedging language
despite having a decided default. What remains (refresh(), richer routing,
services/list-peers the op) is genuine feature addition, not unmade
architecture.
Reframe OQ-32 (multi-hop) as a feature extension rather than a 'v1'
deferral — the one-hop model is the architectural commitment; extending
to multi-hop doesn't break downstream.
Promote OQ-29 (CallClient TLS client-auth) from medium to high priority
and surface its real interaction with ADR-030. Previously framed as
'additive — two-way-door remainder,' but ADR-030's PeerEntry fingerprint
→ peer_id resolution requires the client to present a TLS client cert.
With with_no_client_auth(), no fingerprint is extracted, the PeerEntry
path is dormant, and PeerCompositeEnv keys on None or the API-key prefix
instead of the stable peer_id. This is the activation path for ADR-030's
primary use case, not an additive feature. Three options laid out: (a)
wire client-auth with the ADR-029 migration, (b) ship token-only and
switch later (the 'compounds into a mess' path), (c) extend PeerEntry
to cover auth_token-based identity. Requires a decision before the
migration lands.
Clarify OQ-36 (concrete adapter shapes): the trait shapes and in-memory
adapters ship with core — the deferral is only for the persistence
adapters (SQLite, etc.). The in-memory adapters are real implementations
of a full repo pattern, not stubs.
Update call_client.rs source comment to reference OQ-29 instead of the
'v1' / 'two-way-door remainder' framing.
Workspace green: 326 tests pass, build clean.
Land the storage and auth strategy research (findings.md) as four
accepted ADRs and amend the core and call specs to match:
- ADR-030: PeerEntry and Identity.id decoupling. Replaces
authorized_fingerprints with peers: Vec<PeerEntry>; Identity.id becomes
the stable peer_id, decoupled from the rotating fingerprint. Supersedes
ADR-029 Assumption 1's UUID source (one-way door preserved, source
changes). Resolves OQ-33 and the storage-boundary half of OQ-34. Records
the API-key asymmetry as deliberate (OQ-35).
- ADR-031: CredentialStore repo trait + InMemoryCredentialStore default
adapter in core. Second repo trait alongside IdentityProvider. Vault
encrypts; the store persists the EncryptedData blob; assembly layer
loads into Capabilities. EncryptedData core mirror includes salt for
wire-format compat.
- ADR-032: Forwarded-for identity. forwarded_for field on call.requested
and OperationContext — metadata only, never read by AccessControl::check
(enforced structurally via the check signature). The from_call handler
populates it. Wire-format one-way door, folded into the ADR-029
migration window.
- ADR-033: Storage boundary and repo/adapter pattern. Core defines repo
traits + in-memory defaults; persistence adapters are separate crates;
assembly layer wires. Resolves OQ-34. Concrete adapter shapes deferred
for exploration (OQ-36).
Amends auth.md, config.md, operation-registry.md, client-and-adapters.md,
open-questions.md, README.md, crates/core/README.md. Marks ADR-029
Accepted (Assumption 1 carries the ADR-030 superseded note). Marks the
research findings doc reviewed.
Reworks the storage strategy doc to commit to concrete design, replacing
the 'when storage arrives' / 'future' / 'later' framing that was putting off
important work.
Key changes from the previous draft:
- §4 (Repo/Adapter Pattern): now an explicit design with the trait contracts
(IdentityProvider, CredentialStore), the adapter contracts
(ConfigIdentityProvider with PeerEntry update, SqliteIdentityProvider,
InMemoryCredentialStore, SqliteCredentialStore), and the concrete table
schemas. Not a pattern description — a design commitment.
- §4: PeerEntry config model — AuthPolicy gains peers: Vec<PeerEntry>
replacing authorized_fingerprints: HashSet<String>. This is the
id-fingerprint decoupling (OQ-33) done as a config change, not a storage
change. ConfigIdentityProvider resolves fingerprint → PeerEntry →
Identity { id: peer_id } (stable, not the fingerprint).
- §7 (Decomposition): the 'what goes where' table now has a Status column
(exists / needs adding / needs building / needs PeerEntry update) instead
of 'future'. The crate graph is a concrete build plan.
- §10 (Build Order): replaces 'What This Means for the Immediate Path' (which
had 'when storage arrives' framing) with a 4-tier dependency-driven build
order. Tier 1 = core repo traits + PeerEntry config model. Tier 2 = SQLite
adapters. Tier 3 = ADR-029 migration + forwarded_for. Tier 4 = alknet-graphs
(built when a graph-shaped problem exists, not speculatively).
- §10: explicit 'What does NOT get built (dropped, not deferred)' section —
multi-tenant, accounts/orgs, secrets module, single storage crate are
dropped, not deferred.
- All 'future' / 'when X arrives' / 'v1' / 'phase n' language removed for
things that are needed. The only 'when X is needed' language remaining is
for genuinely non-existent problems (ACL delegation, workflows, taskgraph)
— those are built when the problem exists, not speculatively.
Synthesizes the multi-thread discussion that surfaced during the peer-graph
routing research (ADR-029) and OQ-33/34 resolution. Three separate threads
(peer identity, filesystem POC, old storage spec) converged on the same
question: where does persistent state live in the alknet crate graph, and
what's the shared infrastructure for it.
Key commitments documented:
- SQLite + honker is the foundation (pattern, not a crate — ~20 lines per
consumer). The metagraph is one tool built on it, for graph-shaped
problems. Direct tables are another tool, for table-shaped problems.
- IdentityProvider is the auth repo trait (already exists in core, make the
pattern explicit). Adapters implement it (Config, SQLite, future
Redis/remote/automerge). PeerStore is adapter-internal, not core.
- Per-node ACL, no 'trusted' flag. Each node authorizes its direct callers
via AccessControl::check(identity). No global ACL, no replication. The
hub authorizes the user; the spoke authorizes the hub. Same mechanism.
- Forwarded-for identity as metadata, not authority. The from_call handler
includes the original caller's identity in the call payload; the spoke's
ACL authorizes the hub (direct caller), never the forwarded_for. The ACL
check signature prevents misuse.
- The ACL check stays table-shaped (flat scope match); the delegation graph
(future) produces effective scopes at resolution time. They compose at the
IdentityProvider boundary.
- The hub proxy tangle: ACL (authorize), bucket routing (operation input),
peer routing (PeerRef) are three separate layers. Bucket-level
authorization is handler logic, not protocol logic.
What the old spec had that's dropped: multi-tenant (each tenant gets own
setup), secrets module (replaced by vault), metagraph-as-foundation (demoted
to tool), single storage crate (split by concern), accounts/orgs (deferred —
v1 is a peers table).
Reference: kepal (/workspace/keypal) — TypeScript repo-pattern example
(Storage interface + adapters) that alknet's IdentityProvider follows.
OQ-26 (resolved): AdapterError variants decided — DiscoveryFailed,
SchemaParse, Transport, Unauthorized, SamePeerCollision (replaces flat
Conflict per ADR-029 §5). #[non_exhaustive] for downstream extension.
Two-way door; the initial set is the code's return type.
OQ-33 (resolved): PeerId is a logical identifier, NOT Identity.id. The
research's v1 default (PeerId = fingerprint) is overridden: coupling PeerId
to crypto material breaks every in-flight PeerRef::Specific and every ACL
entry on key rotation. v1 source is a connection-assigned UUID — a
no-storage workaround that works for the immediate use case (head→workers,
reconnect produces fresh PeerRef, in-flight gets NOT_FOUND which is correct).
The one-way door: PeerId is logical, not crypto — this determines
PeerCompositeEnv key type and PeerRef::Specific payload. The id source
(UUID vs configured name vs peer registry) is the two-way-door remainder.
OQ-34 (new): the storage dimension OQ-33 surfaced. The core crates are
deliberately DB-free (smaller, fewer deps, simpler testing) — this served
local-only state (vault, registry) well, but peer identity is the first
cross-node state that wants persistence. The real solution (a persistent
peer registry mapping stable logical name → current crypto material,
surviving key rotation) is not a v1 blocker (UUID works), but tracked so the
no-DB posture's limit is deliberate, not accidental. The storage boundary
(core gets a PeerRegistry trait vs stays storage-free) is the one-way door;
the backend choice is two-way. Key-rotation/ACL note: decoupling PeerId from
crypto keeps the door open for ACL entries that persist across key rotation
— when the peer registry is built, ACLs key on the logical name and key
rotation becomes vault-only with no remote-side ACL update.
ADR-028's remote_safe/trusted_peer was a parallel, weaker authorization system
that duplicated the existing AccessControl/Identity machinery and couldn't
express the head→N-workers pattern (the primary use case). The flat-namespace
single-peer overlay model (one connection layer in CompositeOperationEnv)
structurally breaks the moment a head has two workers both exposing
/container/exec.
ADR-029 replaces it with:
- Peer-keyed overlays: PeerCompositeEnv { connections: HashMap<PeerId, ...> }
replaces CompositeOperationEnv's singular connection layer. A head node
routes invoke_peer() to the right peer via PeerRef::Specific / PeerRef::Any.
- AccessControl-based peer authorization: the existing AccessControl::check
(peer_identity) gates peer calls — the same mechanism that gates every other
call. remote_safe/trusted_peer/RemoteFilter/list_operations_peer_scoped/
services_list_handler_peer_scoped are retired. The op's AccessControl IS the
peer-authorization policy; no parallel system.
- ScopedPeerEnv: peer-qualified reachability (peer-pinned allowlist) replaces
from_call's namespace_prefix as the disambiguation mechanism. Cross-peer
collision dissolves (separate sub-overlays); same-peer collision stays error.
- services/list-peers opt-in for peer-attributed re-export listing.
POC-validated against real types (scratch module written, type-checked,
removed; build clean, 207 tests pass). Petgraph not needed for v1 (one-hop,
shallow); nested HashMap suffices; extends to multi-hop without redesign (OQ-32).
OQ impact: OQ-25 dissolved (no marking); OQ-28 cross-peer dissolved / same-peer
stays; OQ-26/27/29 stay; new OQ-30 (Any routing policy), OQ-31 (list-peers
semantics), OQ-32 (multi-hop federation).
Research: docs/research/alknet-call-peer-routing/findings.md (POC shapes,
prior art — Ray.io actors, Dapr service invocation, full ADR draft).
ADR-028 marked Superseded; ADR-017 DC-1 amendment updated to point at ADR-029.
Post-implementation spec sync after the call-completion batch landed
(commits e4a2594..a3825f5). The sub-agent review flagged no spec drift, but
comparing the implemented types against the spec sketches surfaced five
details the specs didn't name — filled in here so the spec matches what was
built:
- client-and-adapters.md: name the shared Dispatcher (protocol/dispatch.rs)
+ RemoteFilter mechanism that enforces ADR-028's default-deny at dispatch
time (the load-bearing security gate — checks remote_safe before building
context, before any capability material reaches the handler). Add
ClientError/RemoteIdentity types, the spawn_dispatch lower-level API, and
the services_list_handler_peer_scoped wiring (the assembly layer must
register the peer-scoped services/list handler for a CallClient's registry,
not the plain one). Record the v1 TLS client-auth gap (AcceptAnyServerCertVerifier,
with_no_client_auth) as OQ-29.
- call-protocol.md: point the adapter dispatch-loop description at the shared
Dispatcher (dispatch.rs) so readers find the mechanism ADR-017 §1 commits to.
- open-questions.md: OQ-29 — CallClient TLS client-auth + remote-identity
verification is a two-way-door remainder; the no-env-vars invariant is
unaffected (auth_token flows via call-protocol payload, not TLS).
- READMEs: current-state now reflects completion done + reviewed (207 lib +
2 integration tests); OQ-29 added to both OQ summaries.