Commit Graph

53 Commits

Author SHA1 Message Date
7ecc11610a docs(arch): ADR-049 — streaming handler for subscription operations
The call protocol spec describes streaming (call.responded*N +
call.completed, PendingRequestMap::Subscribe, CallConnection::subscribe),
but the server-side Handler type returned a single ResponseEnvelope —
a Subscription op had no way to produce a stream. The TS predecessor
(@alkdev/operations) had separate OperationHandler / SubscriptionHandler
types; the Rust port collapsed them, losing the streaming path. This
restores it end-to-end: StreamingHandler type, HandlerKind on
HandlerRegistration validated against op_type, invoke_streaming() on
OperationRegistry, server-side dispatch branches on op_type, new
INVALID_OPERATION_TYPE protocol code for wrong-dispatch-path misuse,
GatewayDispatch::invoke_streaming() for /subscribe SSE, from_call stream
forwarding via CallConnection::subscribe(), from_openapi SSE forwarding.
OperationEnv::invoke() stays request/response-only (stream composition is
handler-level, not protocol-level). Amends ADR-023's protocol-code list
(five → six). Tracks the stream-operators library as OQ-41 (feature
extension, not an unmade decision).
2026-07-02 07:43:01 +00:00
2a6e4c371a docs(http): resolve OQ-39; add ADRs 045-047; record pubsub prior art for WS path
OQ-39 (to_openapi published-spec versioning) resolved by ADR-045:
info.version semver tracks the gateway endpoint contract, not the
operation set — per-caller operations discovered via /search do not
bump the version. The gateway pattern (ADR-042) dissolved most of the
original churn concern.

ADR-046: assembly-layer custom HTTP routes on HttpAdapter. The HTTP
router had no documented extension point for deployment-specific
endpoints (e.g., an OAI-compatible proxy at /v1/chat/completions). Adds
extra_routes: Option<Router> at construction; raw HTTP, not operations;
default surface takes precedence on collision. The mechanism is the
one-way door; specific routes are two-way.

ADR-047: remove the direct-call POST /{service}/{op} HTTP surface. The
gateway /call is the sole invoke path — the simplified contract is a
few fixed endpoints, not a per-operation REST tree. The direct-call
surface re-introduced the 'dump the full API regardless of privs'
failure mode at the HTTP level that the gateway /search was built to
escape. ADR-036's routing decision is superseded; its non-routing
clauses (SSE, Bearer auth, /healthz, stealth, error mapping) survive.
A deployment wanting a REST-like per-operation surface builds it as a
custom route projection (ADR-046).

ADR-044 updated with the tradeoff framing (WSS is the right tool for
the call-protocol-from-browser case; WebTransport is the right tool for
the generalized ALPN-stream-proxy case we don't have yet — coexist, not
migrate) and the @alkdev/pubsub concrete prior art (the EventEnvelope
{type,id,payload} the call protocol was derived from already has a
working WebSocket client/server; the sync is a small adjustment, not a
from-scratch build).

call-protocol.md references the pubsub lineage for the
transport-agnosticism claim.
2026-06-30 09:49:25 +00:00
3327d585da docs(http): resolve OQ-40 reqwest client config — ClientWithMiddleware + retry/retry-after middleware stack
OQ-40 resolved: alknet-http owns a shared reqwest_middleware::ClientWithMiddleware
(not a bare reqwest::Client) with a two-layer middleware stack —
RetryTransientMiddleware (reqwest-retry, exponential backoff on transient
failures) + inlined RetryAfterMiddleware (from melotic/reqwest-retry-after, MIT,
~50 lines, inlined to bound the upstream's unbounded HashMap storage). The two
are complementary: reqwest-retry's default strategy does not honor Retry-After.

Hot-reload is rebuild-and-swap via ArcSwap (same pattern as
ConfigIdentityProvider, ADR-035); a rebuild drops the connection pool, which
is acceptable since a config change wanting a fresh pool is the trigger. The
three one-way constraints stand unchanged: alknet-http owns its client (no
env-var config, no shared global), credentials inject per-request from
OperationContext.capabilities, outbound TLS uses the system trust store.

Records the downstream layering boundary: the agent crate's provider SSE
normalization (the solid part of aisdk's pattern — Vercel-UI-message
normalization) sits on top of this client, consuming the reqwest::Response
stream; it does not replace the client. The aisdk core/client.rs reference for
client construction is dropped (env-var config + hand-rolled retry are the
anti-patterns discarded); the from_openapi.ts SSE normalization reference in
the forwarding-handler section is kept (separate, solid pattern).

No ADR — the decision is internal to alknet-http: the client type does not
cross crate boundaries (alknet-call never sees reqwest), the library choice is
reversible, and it does not touch the system's structure, constraints, or
cross-crate API surface.

Updates: http-adapters.md (HTTP client section rewritten, references updated,
constraints/OQ bullets updated), http-mcp.md (OQ-40 status flip), open-
questions.md (OQ-40 resolved with full config-shape table), README.md (OQ-40
folded into the existing two-way-doors bucket), and three secondary docs
(crates/http/README.md, overview.md, http-server.md) that carried stale 'open'
OQ-40 references.
2026-06-30 08:02:30 +00:00
125cb49cc4 docs(http): defer h3/WebTransport (ADR-044); browsers use WebSocket for v1
Working through the WebTransport implementation path surfaced a scope
question distinct from the hedging-as-deferral anti-pattern ADR-038 was
written to correct. Three findings drove the re-evaluation:

1. The browser bidirectional call-protocol path doesn't require
   WebTransport — WebSocket is full-duplex, EventEnvelope fits a WS
   binary message boundary cleanly, and the Dispatcher is stream-
   agnostic (ADR-012). What WebTransport gives over WebSocket (native
   multi-stream multiplexing, the ALPN-as-stream substrate) benefits the
   proxy use case, not the call protocol.
2. WebTransport is a draft standard (-07, not RFC) on an experimental
   Rust dependency stack (wtransport/h3 both self-describe as not
   production-ready). Either choice puts a draft protocol on the
   security surface of the first release.
3. The ALPN-stream-proxy (ADR-040) is speculative — its WASM parser
   consumers (browser SSH/SFTP/git clients) don't exist yet, and the
   downstream crates WebTransport deferral blocks (SSH, git, SFTP)
   expose their ALPNs natively over QUIC regardless.

This is a scope decision (per ADR-009: a decision that 'genuinely
doesn't need to be made yet because the use case isn't concrete'), not
hedging. The reversal trigger is concrete: a real deployment needing
the ALPN-stream-proxy.

ADR-038 is superseded (its anti-pattern correction stands; its specific
'h3 in scope now' decision is reversed). ADR-040 and ADR-043 are
parked, not superseded — their designs revive unchanged when WebTransport
revives, with §2 (bidirectionality) and §3 (no-PeerId overlay) of ADR-043
transferring to WebSocket for v1.

ADR-044 §5 also states the 'browser is not a peer' rationale that
ADR-034 §4 closed without arguing: peer = addressable node in the
call-protocol peer graph (stable PeerId, PeerRef::Specific-reachable,
identity stable across reconnects), not 'any endpoint that exchanges
calls during a live session.' A browser is the second but not the first
(no stable crypto identity of its own, ephemeral, not addressable from
other nodes). ADR-034 §4 and Assumption 2 are amended by reference.

The wtransport-vs-hyperium dependency question is recorded (not
resolved — WebTransport is deferred) in ADR-044 §'Research note' and
webtransport.md so the revival doesn't re-derive it: wtransport probably
isn't the right choice (axum-bridge friction — it owns its own HTTP
serving path); the hyperium stack (h3 + h3-quinn + h3-webtransport) fits
the axum integration better but its server-side WebTransport API needs
verification before commitment.

Reviewed by architecture-review subagent; all critical cross-reference
issues (ADR-034 §5 stale 'in scope' assertion, ADR-036 Context listing
h3 as implemented, webtransport.md Design Decisions table) resolved.
2026-06-30 05:55:55 +00:00
398e3d512d docs(http): add ADR-040 WebTransport ALPN-stream-proxy and reframe OQ-38
The 'WebTransport proxy' concept was conflating two distinct things;
this pass separates them:

1. In-process ALPN-stream-proxy (ADR-040, in alknet-http): the h3 handler
   hands a WebTransport stream to another ALPN handler (SshAdapter,
   GitAdapter, etc.) as a Connection, so a browser with a WASM parser
   can reach any ALPN service via WebTransport. Path-based routing
   (the CONNECT path declares the target: /alknet/ssh -> SshAdapter).
   HttpAdapter gains Arc<HandlerRegistry> for the lookup. The browser's
   WASM parser implements BiStream (ADR-007) over the WebTransport
   stream. SSH-over-WebTransport is HTTPS-shaped at the network layer
   (anti-censorship: the 'VPN-like without being a VPN' use case on a
   clean foundation). russh-sftp demonstrates WASM targeting is
   feasible; SSH is the next target.

2. Standalone relay service (OQ-38, future alknet-relay crate): a full
   relay - fork of iroh-relay - with WebTransport proxy fallback for
   NAT traversal. This is infrastructure, not a mode of the h3 handler.
   OQ-38 reframed to be the standalone-relay scope question (distinct
   from the in-process proxy now resolved by ADR-040).

webtransport.md updated: three stream destinations (call protocol,
ALPN-handler proxy, other sub-protocols) with path-based routing; new
'ALPN-stream-proxy' section covering the WASM client side, auth model
(bearer token gates the session; protocol's own auth gates the
protocol session), and the HandlerRegistry reference.

README/overview ADR tables and OQ summaries updated for ADR-040.
2026-06-29 07:56:35 +00:00
ab47dac4ad docs(http): draft alknet-http architecture specs and ADRs 036-039
First speccing pass for alknet-http (HTTP interface crate: h2/http1.1/h3
server + from_openapi/to_openapi/from_mcp/to_mcp adapters).

Specs (crates/http/):
- README.md, overview.md — crate index, two-roles-in-one-crate framing,
  adapter location map, feature gates (h3, mcp), no-env-vars invariant
- http-server.md — HttpAdapter for h2/http1.1, axum over QUIC stream,
  Bearer auth, SSE projection for subscriptions, /healthz, stealth decoy
- http-adapters.md — from_openapi (reqwest) and to_openapi (projection),
  error fidelity (HTTP_<status> per ADR-023), type definitions
- http-mcp.md — from_mcp/to_mcp (feature-gated), streamable-HTTP-only
- webtransport.md — h3/WebTransport handler, browser streaming path,
  HTTP/3 request vs WebTransport session distinguished at framing layer

ADRs:
- ADR-036 HTTP-to-Call Operation Mapping (Proposed) — direct path
  mapping; to_openapi is projection, not router (the load-bearing one-way
  door from Phase 0 DH-3)
- ADR-037 MCP Stdio Transport Exclusion (Proposed) — streamable HTTP
  only; stdio is not built (RCE-vector security position)
- ADR-038 HTTP/3 and WebTransport as First-Class HTTP Transports
  (Proposed) — corrects the Phase 0 DH-2 deferral framing; h3 is in
  scope, not deferred, per ADR-009 §'What this framework is NOT'
- ADR-039 HTTP Server and Client Host Colocated in alknet-http
  (Proposed) — one crate for server + client host (shared HTTP deps,
  shared operation-spec->HTTP mapping)
- ADR-003 Amendment 1 — clarifies alknet-call is a protocol-foundation
  crate (the alknet-http -> alknet-call dependency edge)

Open questions (OQ-38, OQ-39, OQ-40 added under 'Theme: alknet-http'):
- OQ-38 WebTransport relay-as-proxy scope (genuine scope question, not
  a deferral — the decision is made when the use case becomes concrete)
- OQ-39 to_openapi published-spec versioning (one-way after first
  publication)
- OQ-40 reqwest client config and connection pooling (two-way-door)

Architecture README and overview updated with doc table, ADR table
(036-039), current-state note, and crate graph (alknet-http ->
alknet-call edge).

Reviewed by architecture-reviewer subagent: 3 critical, 4 warning, 5
suggestion issues found and fixed (missing ADR-039, WebTransport stream
routing conflation, undefined types, stale OQ-37 deferral language,
README OQ table completeness, Bearer-only attribution, cross-references,
ADR-038 ALPN quote, feature-gate placeholder, MCP temporal language).
2026-06-29 05:53:38 +00:00
0de2cebb1d docs(arch): ADR-035 — concrete persistence adapter shapes, resolve OQ-36
Commits the concrete adapter shape deferred by ADR-033: read-sync /
write-async split with honker NOTIFY/LISTEN for no-restart cache
invalidation, against SQLite, in a separate alknet-store-sqlite crate.

Two constraints drive the design: (1) the hot-path read trait
(IdentityProvider::resolve_from_fingerprint, CredentialStore::get) is
sync — called in the accept loop, no .await — so a SQLite-backed
adapter must cache in memory and serve sync reads from the cache; (2)
auth changes must take effect without a restart (an early issue the
project already fixed for ConfigIdentityProvider via ArcSwap config
reload). honker's SQLite NOTIFY/LISTEN (single-digit-ms wake, no
polling) is the cache-invalidation mechanism that makes both hold:
write commits to SQLite + emits NOTIFY, the running process's LISTEN
wakes, the in-memory index reloads and atomically swaps, the next
read sees the new state. Same ArcSwap-reload pattern as config,
generalized from 'config file is source of truth' to 'SQLite is
source of truth, honker signals when it changed.'

New async IdentityStore write trait (put_peer / update_peer /
remove_peer) extends the sync IdentityProvider read trait for peer
mutations. ConfigIdentityProvider does NOT implement it (config
reload is its write path — a posture enforced by the absence of a
backend, not a type-system constraint); SqliteIdentityProvider
implements both. CredentialStore::put/delete refined to async (within
ADR-031's one-way door — the contract was get/put/delete keyed by
provider persisting EncryptedData never decrypting; sync-vs-async was
unspecified). CredentialStoreError renamed to shared StoreError
covering both traits.

alknet-store-sqlite is one crate implementing both IdentityStore and
CredentialStore with shared SQLite connection + honker LISTEN infra
(splitting later is a two-way door). Schema shape committed (one row
per PeerEntry with JSON columns for fingerprints/scopes/resources;
one row per EncryptedData blob keyed by provider); exact DDL is an
implementation-detail two-way door in the adapter crate. The keypal
adapter-factory pattern is intentionally not ported to Rust (runtime
column-mapping is a TS affordance; in Rust each adapter is a concrete
type, cross-cutting concerns are a shared helper module).

Amends ADR-031 (put/delete async refinement, StoreError rename),
ADR-033 (concrete adapter shape now specified, two-crate framing
collapsed to one), ADR-034 (OQ-36 now resolved), auth.md (IdentityStore
section, cache-invalidation summary, OQ-36 reference), config.md (two
write paths note), and the OQ-36/OQ-34 entries in open-questions.md.
Review fixed 4 criticals (error-type name divergence, duplicate
IdentityProvider sketch, upsert/Duplicate ambiguity, 'shape unchanged'
contradiction), 7 warnings, 5 suggestions.
2026-06-28 11:10:31 +00:00
6cc8715ccf docs(arch): ADR-034 — outgoing-only X.509 and three peer roles, resolve OQ-37
Untangles the conflation of three distinct remote roles under 'X.509
endpoint': (1) public X.509 endpoint — a remote HTTPS/call-over-TLS
server the local node is a client of (no PeerEntry, no PeerId, not in
the peer graph; CA verification + bearer token); (2) transport relay —
iroh's DERP-equivalent, infrastructure, not an alknet peer; (3) hub /
hosting node — an alknet peer that also exposes a public domain + X.509
for browsers (mixed-fingerprint PeerEntry, already supported by
ADR-030).

The load-bearing one-way door is the client-side verifier selection
rule: known peer (PeerEntry present) → fingerprint pin; unknown X.509
remote → CA verification (WebPkiServerVerifier); unknown Ed25519
remote → fails closed. This closes the AcceptAnyServerCertVerifier
security hole OQ-29 flagged, with the peer-model criterion (PeerEntry
presence) made explicit. The 'make PeerEntry symmetric' instinct is
rejected — pure-client connections to public APIs have no stable
logical identity to pin.

Documents that CallCredentials.remote_identity: None is load-bearing
(None = public X.509 endpoint → CA path, not a missing field; Some =
known peer → fingerprint pin), closing a subtle gap where an
implementer could have defaulted to a placeholder or treated None as
skip-verify.

Records WebTransport relay-as-proxy (deferred with h3/WebTransport,
new OQ-HTTP-07) and on-chain/smart-contract peer discovery (fits the
OQ-36 repo/adapter pattern, no auth-model change) so they aren't lost.

Amends auth.md and client-and-adapters.md with the three-role naming,
the verifier selection rule, and the Option semantics; updates OQ-37
to resolved in open-questions.md, README.md, and both crate READMEs.
2026-06-28 10:47:49 +00:00
3f011cbb82 docs(arch): tighten door-type framing — reversal cost, not deferral
ADR-009, open-questions.md, and the architect agent spec all had the same
conflation: 'two-way door' was phrased as 'can be decided during
implementation,' which reads as 'defer the decision.' That's not what it
means. A two-way door is a decision you make now and can revert later if
wrong — it's about reversal cost, not urgency.

ADR-009: add §'What this framework is NOT' — explicitly separates door
type (reversal cost) from deferral (scope management). State that
architecture decisions are the architect's regardless of door type.
Reword the two-way-door process from 'can be decided during
implementation' to 'pick the simplest option that works, implement it,
revert if needed.'

open-questions.md: reword the header to clarify door type describes
reversal cost, not urgency. Add 'Door type is separate from whether a
decision is made.'

architect.md: add Key Principle #8 (decisions are made, not deferred),
a new 'Door Types and Decision Urgency' section, and two new anti-patterns
(#8: door type as deferral, #9: hedging language in resolved decisions).
2026-06-28 09:19:10 +00:00
7d812af8f4 docs(arch): multi-credential PeerEntry, resolve OQ-29, dissolve OQ-35, add OQ-37
Amend ADR-030 with three changes from the auth-type analysis:

1. PeerEntry is now multi-credential: fingerprints: Vec<String> (Ed25519
   and/or X.509) + auth_token_hash: Option<String> (bearer token). All
   resolve to the same peer_id. A peer that authenticates via Ed25519
   today and via auth_token tomorrow gets the same PeerId. The 'peer
   bearer vs auth bearer' distinction was wrong — the correct framing is
   the three credential types (Ed25519, X.509, bearer token) and whether
   the token needs a stable logical id across rotation (PeerEntry) or not
   (ApiKeyEntry).

2. Fingerprint normalization (§6): quinn extracts the raw Ed25519 public
   key from the SPKI cert and formats as ed25519:<hex>, matching iroh.
   The same key has the same fingerprint regardless of transport. X.509
   fingerprints stay as SHA256:<hex of DER>. This also simplifies the
   coming WebTransport relay work.

3. The 'API keys' section is replaced with 'Bearer tokens' — correctly
   framing the three auth types and the two bearer-token paths
   (PeerEntry.auth_token_hash vs ApiKeyEntry).

Resolve OQ-29 (CallClient TLS client-auth): wire quinn client-auth (present
Ed25519 key as raw public key client cert — the server-side extraction
already works); key-type-aware server cert verification (raw key =
fingerprint match, X.509 = CA verification via WebPkiServerVerifier —
AcceptAnyServerCertVerifier is only safe for raw keys); fingerprint
normalization. The iroh path already works (RFC 7250 raw keys, both sides
exchange automatically); the gap was quinn-only.

Dissolve OQ-35: the 'API key asymmetry' framing was wrong. PeerEntry
supports multiple credential paths; ApiKeyEntry is for tokens that ARE the
identity.

Add OQ-37: X.509 outgoing-only case — the three auth types and how X.509
server identity fits the peer model. Not blocking the ADR-029 migration;
downstream (HTTP crate phase).

Update auth.md, config.md, client-and-adapters.md, call/README.md,
core/README.md, open-questions.md, README.md, and call_client.rs source
comment.

Workspace green: 326 tests pass, build clean.
2026-06-28 08:49:36 +00:00
1d94aaea51 docs(arch): resolve call-crate OQs, promote OQ-29 to load-bearing on ADR-030
Resolve the call-crate open questions where the decision is made —
OQ-27 (auto-re-import), OQ-28 (same-peer collision = error), OQ-30
(PeerRef::Any insertion-order first-match), OQ-31 (services/list-peers
opt-in). These were previously marked 'open' with 'v1' hedging language
despite having a decided default. What remains (refresh(), richer routing,
services/list-peers the op) is genuine feature addition, not unmade
architecture.

Reframe OQ-32 (multi-hop) as a feature extension rather than a 'v1'
deferral — the one-hop model is the architectural commitment; extending
to multi-hop doesn't break downstream.

Promote OQ-29 (CallClient TLS client-auth) from medium to high priority
and surface its real interaction with ADR-030. Previously framed as
'additive — two-way-door remainder,' but ADR-030's PeerEntry fingerprint
→ peer_id resolution requires the client to present a TLS client cert.
With with_no_client_auth(), no fingerprint is extracted, the PeerEntry
path is dormant, and PeerCompositeEnv keys on None or the API-key prefix
instead of the stable peer_id. This is the activation path for ADR-030's
primary use case, not an additive feature. Three options laid out: (a)
wire client-auth with the ADR-029 migration, (b) ship token-only and
switch later (the 'compounds into a mess' path), (c) extend PeerEntry
to cover auth_token-based identity. Requires a decision before the
migration lands.

Clarify OQ-36 (concrete adapter shapes): the trait shapes and in-memory
adapters ship with core — the deferral is only for the persistence
adapters (SQLite, etc.). The in-memory adapters are real implementations
of a full repo pattern, not stubs.

Update call_client.rs source comment to reference OQ-29 instead of the
'v1' / 'two-way-door remainder' framing.

Workspace green: 326 tests pass, build clean.
2026-06-28 05:35:52 +00:00
f224ea998c docs(arch): ADR-030..033 — repo/adapter pattern, PeerEntry, CredentialStore, forwarded-for
Land the storage and auth strategy research (findings.md) as four
accepted ADRs and amend the core and call specs to match:

- ADR-030: PeerEntry and Identity.id decoupling. Replaces
  authorized_fingerprints with peers: Vec<PeerEntry>; Identity.id becomes
  the stable peer_id, decoupled from the rotating fingerprint. Supersedes
  ADR-029 Assumption 1's UUID source (one-way door preserved, source
  changes). Resolves OQ-33 and the storage-boundary half of OQ-34. Records
  the API-key asymmetry as deliberate (OQ-35).

- ADR-031: CredentialStore repo trait + InMemoryCredentialStore default
  adapter in core. Second repo trait alongside IdentityProvider. Vault
  encrypts; the store persists the EncryptedData blob; assembly layer
  loads into Capabilities. EncryptedData core mirror includes salt for
  wire-format compat.

- ADR-032: Forwarded-for identity. forwarded_for field on call.requested
  and OperationContext — metadata only, never read by AccessControl::check
  (enforced structurally via the check signature). The from_call handler
  populates it. Wire-format one-way door, folded into the ADR-029
  migration window.

- ADR-033: Storage boundary and repo/adapter pattern. Core defines repo
  traits + in-memory defaults; persistence adapters are separate crates;
  assembly layer wires. Resolves OQ-34. Concrete adapter shapes deferred
  for exploration (OQ-36).

Amends auth.md, config.md, operation-registry.md, client-and-adapters.md,
open-questions.md, README.md, crates/core/README.md. Marks ADR-029
Accepted (Assumption 1 carries the ADR-030 superseded note). Marks the
research findings doc reviewed.
2026-06-27 12:12:25 +00:00
99c6dd9483 docs(arch): resolve OQ-26 (AdapterError variants) + OQ-33 (PeerId = logical id) + OQ-34 (persistent peer registry)
OQ-26 (resolved): AdapterError variants decided — DiscoveryFailed,
SchemaParse, Transport, Unauthorized, SamePeerCollision (replaces flat
Conflict per ADR-029 §5). #[non_exhaustive] for downstream extension.
Two-way door; the initial set is the code's return type.

OQ-33 (resolved): PeerId is a logical identifier, NOT Identity.id. The
research's v1 default (PeerId = fingerprint) is overridden: coupling PeerId
to crypto material breaks every in-flight PeerRef::Specific and every ACL
entry on key rotation. v1 source is a connection-assigned UUID — a
no-storage workaround that works for the immediate use case (head→workers,
reconnect produces fresh PeerRef, in-flight gets NOT_FOUND which is correct).
The one-way door: PeerId is logical, not crypto — this determines
PeerCompositeEnv key type and PeerRef::Specific payload. The id source
(UUID vs configured name vs peer registry) is the two-way-door remainder.

OQ-34 (new): the storage dimension OQ-33 surfaced. The core crates are
deliberately DB-free (smaller, fewer deps, simpler testing) — this served
local-only state (vault, registry) well, but peer identity is the first
cross-node state that wants persistence. The real solution (a persistent
peer registry mapping stable logical name → current crypto material,
surviving key rotation) is not a v1 blocker (UUID works), but tracked so the
no-DB posture's limit is deliberate, not accidental. The storage boundary
(core gets a PeerRegistry trait vs stays storage-free) is the one-way door;
the backend choice is two-way. Key-rotation/ACL note: decoupling PeerId from
crypto keeps the door open for ACL entries that persist across key rotation
— when the peer registry is built, ACLs key on the logical name and key
rotation becomes vault-only with no remote-side ACL update.
2026-06-27 06:34:35 +00:00
77eb35a8a5 docs(arch): ADR-029 peer-graph routing model — supersedes ADR-028
ADR-028's remote_safe/trusted_peer was a parallel, weaker authorization system
that duplicated the existing AccessControl/Identity machinery and couldn't
express the head→N-workers pattern (the primary use case). The flat-namespace
single-peer overlay model (one connection layer in CompositeOperationEnv)
structurally breaks the moment a head has two workers both exposing
/container/exec.

ADR-029 replaces it with:
- Peer-keyed overlays: PeerCompositeEnv { connections: HashMap<PeerId, ...> }
  replaces CompositeOperationEnv's singular connection layer. A head node
  routes invoke_peer() to the right peer via PeerRef::Specific / PeerRef::Any.
- AccessControl-based peer authorization: the existing AccessControl::check
  (peer_identity) gates peer calls — the same mechanism that gates every other
  call. remote_safe/trusted_peer/RemoteFilter/list_operations_peer_scoped/
  services_list_handler_peer_scoped are retired. The op's AccessControl IS the
  peer-authorization policy; no parallel system.
- ScopedPeerEnv: peer-qualified reachability (peer-pinned allowlist) replaces
  from_call's namespace_prefix as the disambiguation mechanism. Cross-peer
  collision dissolves (separate sub-overlays); same-peer collision stays error.
- services/list-peers opt-in for peer-attributed re-export listing.

POC-validated against real types (scratch module written, type-checked,
removed; build clean, 207 tests pass). Petgraph not needed for v1 (one-hop,
shallow); nested HashMap suffices; extends to multi-hop without redesign (OQ-32).

OQ impact: OQ-25 dissolved (no marking); OQ-28 cross-peer dissolved / same-peer
stays; OQ-26/27/29 stay; new OQ-30 (Any routing policy), OQ-31 (list-peers
semantics), OQ-32 (multi-hop federation).

Research: docs/research/alknet-call-peer-routing/findings.md (POC shapes,
prior art — Ray.io actors, Dapr service invocation, full ADR draft).
ADR-028 marked Superseded; ADR-017 DC-1 amendment updated to point at ADR-029.
2026-06-27 06:04:19 +00:00
f9c0ab092b docs(arch): sync call-completion specs with implementation — Dispatcher/RemoteFilter, ClientError, OQ-29
Post-implementation spec sync after the call-completion batch landed
(commits e4a2594..a3825f5). The sub-agent review flagged no spec drift, but
comparing the implemented types against the spec sketches surfaced five
details the specs didn't name — filled in here so the spec matches what was
built:

- client-and-adapters.md: name the shared Dispatcher (protocol/dispatch.rs)
  + RemoteFilter mechanism that enforces ADR-028's default-deny at dispatch
  time (the load-bearing security gate — checks remote_safe before building
  context, before any capability material reaches the handler). Add
  ClientError/RemoteIdentity types, the spawn_dispatch lower-level API, and
  the services_list_handler_peer_scoped wiring (the assembly layer must
  register the peer-scoped services/list handler for a CallClient's registry,
  not the plain one). Record the v1 TLS client-auth gap (AcceptAnyServerCertVerifier,
  with_no_client_auth) as OQ-29.
- call-protocol.md: point the adapter dispatch-loop description at the shared
  Dispatcher (dispatch.rs) so readers find the mechanism ADR-017 §1 commits to.
- open-questions.md: OQ-29 — CallClient TLS client-auth + remote-identity
  verification is a two-way-door remainder; the no-env-vars invariant is
  unaffected (auth_token flows via call-protocol payload, not TLS).
- READMEs: current-state now reflects completion done + reviewed (207 lib +
  2 integration tests); OQ-29 added to both OQ summaries.
2026-06-26 13:42:42 +00:00
2649e068e5 docs(arch): call-completion — ADR-028 peer-scoped filtering + client-and-adapters spec + tasks
Resolves the four gap-analysis decisions (DC-1..4) blocking the alknet-call
client/adapter surface specced in ADR-017:

- ADR-028 (new): locks the one-way door for DC-1 — CallClient registry is
  default-deny (remote_safe: bool on HandlerRegistration, default false across
  all provenance); share-global is an explicit trusted-peer opt-in; filtering
  is a dispatch-time read over the single Layer-0 registry, not a copy.
- client-and-adapters.md (new spec): operationally fills the gap ADR-017 left
  to implementation — CallClient, from_call, from_jsonschema, OperationAdapter
  trait, adapter location map, no-env-vars invariant, exchange-of-operations
  pattern. Keeps call-protocol.md and operation-registry.md under the
  700-line split threshold.
- ADR-017 amended: records DC-2/3/4 v1 defaults (auto-on-reconnect,
  error-on-collision, Result error type) and points DC-1 at ADR-028.
- OQ-25..28 (new): two-way-door remainders (remote_safe shape, AdapterError
  variants, re-import trigger, namespace collision) with v1 defaults recorded.
- Index/cross-ref updates across READMEs and the two existing call specs.

Tasks: 6 task files under tasks/call/ decomposing the completion work along
the gap-analysis priority order — remote-safe-marking (one-way door, first)
→ call-client (phase-risk) → from-call → operation-adapter-trait →
from-jsonschema (parallel with call-client) → review-completion. Graph
validated with taskgraph; parallelism designed in (from-jsonschema runs
concurrent with call-client/from-call once the trait lands).
2026-06-26 12:25:13 +00:00
d94d7a132a docs(adr-027): TLS identity redesign — ACME + RawKey decoupling
ADR-027 resolves the architectural gap surfaced when ACME integration
became a concrete target:

1. TlsIdentity::Acme variant — static config data (domains, cache_dir,
   directory, contact) with async AcmeState constructed at endpoint
   setup via two-phase TlsSetup (not stuffed into the Clone-able enum).

2. TlsIdentity::RawKey decoupled from the iroh feature — uses
   Ed25519SecretKey (alknet-core-owned wrapper over ed25519_dalek)
   instead of iroh::SecretKey. Raw-key TLS identity (RFC 7250, the
   default for most alknet nodes) now works in quinn-only builds.
   iroh transport converts via SecretKey::from_bytes.

3. ACME feature-gated behind new acme feature (rustls-acme optional
   dep). Non-ACME builds don't compile it.

4. dispatch_quinn guard for acme-tls/1 challenge connections — TLS-ALPN-01
   is handled at the rustls cert resolver layer during the handshake;
   the guard closes challenge connections gracefully instead of logging
   a misleading "no handler" warning.

Research confirmed QUIC (quinn) handles ACME challenges differently than
TCP (reverse-proxy): quinn gives no ClientHello peek hook, but the
challenge is fully answered at the cert resolution step before the
connection surfaces to the application. No handler registration needed.

Spec updates: config.md, endpoint.md, open-questions.md (OQ-12),
overview.md + README.md (ADR index), ADR-010 (cross-ref).

Tasks: core/rawkey-decouple-from-iroh (gen 1, no deps),
core/acme-integration (gen 2, depends on rawkey). Graph: 36 tasks.
2026-06-24 12:29:24 +00:00
cb98f42cd4 docs(architecture): resolve review #002 remaining Tier 4 findings
Add ADR-026 (vault key model — HD derivation) recording the foundational
HD-derivation decision, 74' coin type reservation, SLIP-0010/Ed25519
default, secp256k1 feature-gating, and AES-256-GCM cipher choice. These
were previously inline rationale with no ADR (W9).

Extend ADR-018 with an explicit EncryptedData wire format lock — fields,
encoding, and semantics are frozen; no removal without a format-version
migration (W10).

Resolve the remaining guard clauses and spec decisions:

- W2: Capabilities must be immutable after construction (no interior
  mutability). Makes the Arc vs deep-copy clone semantics genuinely
  two-way.
- W5: Published to_* specs are compatibility contracts — best-effort
  mappings are two-way before first publication, one-way after. Version
  generated specs.
- W6: Salt field clarification — v2 salt is permanently unused; a future
  KDF is a different derivation family, not a version-indexed path; the
  field saves a wire-format change only.
- W7: unlock_new returns Zeroizing<String> — the mnemonic is the root of
  trust and must not linger in freed memory.
- W17: OQ-09 WASM — server-side dispatch door is honestly closed
  (Connection is concrete, tokio-bound), not implicitly preserved.
- W18: OQ-10 git — composability fork (raw smart protocol vs call-protocol
  projection) is a separate decision from ERC721 scope.
- W20: from_openapi must prefix imported error codes (HTTP_404) to avoid
  collision with protocol-level codes (NOT_FOUND). Normative rule, not
  naming convention.
- W21: ScopedOperationEnv field is private — construction via new()/
  empty(), query via allows(). Makes the future subgraph refactor
  non-breaking.
- C13: Connection::set_identity — the endpoint does not read identity()
  after handle() returns (Connection is moved into the spawned task).
  Observability is handler-side logging. Simplest honest answer.
- W1: OperationAdapter trait is async, returns Vec<HandlerRegistration>.
  from_call requires async discovery; ADR-022 changed the return type.
- W11: CompositionAuthority::as_identity() defined — constructs a
  synthetic Identity (label as id, scopes, resources) not resolvable via
  IdentityProvider. Second Identity construction path, acknowledged.
- W14: SecretKey is iroh::SecretKey (Ed25519) — consistent with the
  endpoint's iroh dependency.
- W19: Grandchild abort propagation is inherit-by-default (option a) —
  invoke() with no explicit policy inherits parent's policy. ContinueRunning
  auto-propagates to grandchildren unless explicitly overridden.
2026-06-23 08:20:27 +00:00
7dda6eec68 docs(architecture): add ADR-025 — vault local-only dispatch, drop irpc
Drops irpc from alknet-vault entirely. The vault's dispatch is now direct
method calls on VaultServiceHandle — no VaultProtocol enum, no
VaultMessage, no VaultServiceActor, no mpsc channel, no Service trait, no
RemoteService trait, no postcard serialization. The vault is local-only by
construction.

The core security argument: irpc made the vault remote-capable by default
(RemoteService generated unless no_rpc is passed). The IrohProtocol handler
forwards all messages without auth. The docs framed 'register an ALPN' as a
server-setup change. This is the default-insecure anti-pattern — security
should be opt-in, not opt-out. ADR-025 inverts the default: local-only is
the only mode, and remote access requires building a separate vault-server
crate (a visible architectural act, not a flag flip).

The actor path was already dead code — service.md said 'prefer
VaultServiceHandle directly — no channel, no serialization.' The actor
existed only to make irpc's Service trait work, which existed only to make
RemoteService work, which was the footgun. VaultServiceHandle's
Arc<RwLock> provides concurrent reads and exclusive writes — better
throughput than the actor's sequential processing.

DerivedKey serialization simplifies: always redact on serialize (for
logging safety), reject '[REDACTED]' on deserialize with an error. No
'postcard preserves bytes' path. This resolves review #002 W8 (silent
corruption on JSON-deserialized DerivedKey).

Resolves:
- OQ-21: remote vault access — resolved (not deferred). Not a vault crate
  feature; if needed, a separate vault-server crate with its own ADR.
- C7: vault-server-crate question decided — not created now, not precluded.
- C8: operation access policy table dissolved — all operations local-only
  by default; if a vault-server crate exposes some remotely, that crate
  defines the policy.
- W8: DerivedKey JSON deserialization — resolved (reject redacted payloads).

Amends ADR-005 (irpc remains for alknet-call, not for alknet-vault),
ADR-018 (vault is even more standalone — zero RPC framework deps),
ADR-019 (vault is the only layer, not just the only direct-caller layer),
ADR-008 (vault integration point unchanged, but now local-only by
construction).
2026-06-22 14:53:52 +00:00
cdf340bec7 docs(architecture): add ADR-024 — operation registry layering, resolve C6
Diagnoses a conflation in the pre-ADR-024 spec: the OperationRegistry
inherited immutability by analogy from ADR-010's HandlerRegistry (ALPN-level),
but the TLS-config argument that justifies HandlerRegistry immutability does
not apply to the operation registry, which lives behind a single ALPN
(alknet/call). This made from_call (which discovers ops over a live connection
at runtime) structurally incompatible with the blanket immutability claim.

ADR-024 layers the operation registry by trust boundary: curated (Local) ops
are static and immutable — the startup trust boundary is where their
composition authority is granted; session (Session) and imported (FromCall
etc.) ops are dynamic at their respective scopes (per-session, per-connection)
— their trust boundaries are per-scope, not per-startup. The principle:
immutability follows the trust boundary. Immutability is the security control
for composing ops (can escalate privilege); provenance + composition authority
are the controls for non-composing ops (can't escalate).

The OperationEnv trait becomes the integration point (Arc<dyn OperationEnv>),
following the IdentityProvider precedent (ADR-004): the CallAdapter composes
the root OperationContext.env per incoming call from the active layers
(curated base + connection overlay + session overlay). Children inherit the
parent's composite env by Arc::clone — overlay composition happens once at
the root and propagates through the composition tree.

Resolves review #002 C6 (OperationContext.env type identity crisis): the
field is split into scoped_env: ScopedOperationEnv (reachability data, from
the registration bundle) and env: Arc<dyn OperationEnv + Send + Sync>
(dispatch trait object). One field was being used as two different types
(reachability set with .allows() and dispatch trait with .invoke());

Localizes W4 (hot-swap ↔ registry mutability coupling) to the connection
scope: no global mutable registry to hot-swap; overlays replace naturally
with connect/disconnect and session start/end. Schema-drift on reconnect is
a per-connection overlay-rebuild concern, not a global hot-swap protocol.

Partially addresses W3 (CallClient registry security): the registry-shape
sub-question is resolved by the overlay model; the capability-exposure
sub-question (what capabilities a remote peer can trigger) remains for
ADR-017 — ADR-024 does not overclaim resolution there.

Amends OQ-04 to scope its immutability claim to the HandlerRegistry and
cross-reference ADR-024 for the operation registry. Generalizes OQ-19's
session-overlay mechanism to also cover connection-scoped remote imports —
both are per-scope dynamic overlays on the static curated base, using the
same trait-layering mechanism.
2026-06-22 13:44:58 +00:00
3e238a471b docs(architecture): add ADR-023, resolve OQ-24 — operation error schemas
ADR-023 adds error_schemas to OperationSpec so operations can declare
their domain-level failure modes (FILE_NOT_FOUND, RATE_LIMITED, etc.)
distinct from protocol-level codes (NOT_FOUND, FORBIDDEN, etc.). The
call.error payload gains an optional 'details' field carrying the typed
error payload conforming to the declared schema. from_openapi/to_openapi
map OpenAPI response status codes to/from ErrorDefinitions, making the
adapter contract from ADR-017 faithful on the error axis.

Also fixes W2 (KeyVersionMismatch stale comment in encryption.md —
ADR-021 implements rotation without this variant) and W4
(derive_encryption_key_for_version missing from service.md method list).

Spec updates: operation-registry.md (OperationSpec, ErrorDefinition,
Handler error mapping, services/schema), call-protocol.md (call.error
payload, CallError, ResponseEnvelope), README.md, overview.md,
open-questions.md (OQ-24), call/README.md, encryption.md, service.md.
2026-06-21 10:26:18 +00:00
1cedc4eeba docs(architecture): add ADR-022, resolve OQ-23 — handler registration, provenance, and composition authority
ADR-022 wires the three controls ADR-015 specified but left without
registration paths (C1-C4 from review #001): composition authority,
scoped env, and capabilities now enter through a HandlerRegistration
bundle. Provenance (Local, FromOpenAPI, FromMCP, FromCall, Session)
determines which ops can compose — leaves don't get composition
authority. CompositionAuthority replaces handler_identity: Identity
(it's a declared authority bundle, not a peer identity). Capabilities
are per-request from the bundle (resolves closure-capture vs context
ambiguity). Kernel/user analogy: user's authority checked at External
gate; handler's composition authority used inside; scoped env bounds
reachability.

Also fixes W1 (stale ADR-020 path example) and W3 (from_mcp missing
from adapter lists in operation-registry.md).

Spec updates: operation-registry.md (OperationRegistry,
HandlerRegistration, OperationContext, OperationEnv, registration
example, capability injection), call-protocol.md (build_root_context),
README.md, overview.md, open-questions.md (OQ-23), call/README.md.
2026-06-21 09:09:47 +00:00
9087f0579f docs(architecture): document vault remote capability, enrich OQ-21
The VaultProtocol is a remote-capable irpc service by construction —
#[rpc_requests] generates both Service (local) and RemoteService (remote)
trait impls. DerivedKey's dual serialization (JSON redacts, postcard
preserves) was designed for this. Enabling remote vault access is a
server-setup change, not a protocol change.

OQ-21 enriched with full context:
- What's already in place (protocol, serialization, actor, auth transport)
- What's not in place (IrohProtocol handler forwards all messages without
  auth checks; needs NodeId allowlist + message filtering in assembly layer)
- Operation access policy: Unlock/Lock local-only; Derive/Encrypt/Decrypt
  remote-capable
- Use case: machine node → workers (workers don't hold mnemonics)
- Per-machine-node vaults, not shared (compartmentalization)
- Breaking vs non-breaking analysis (enabling = non-breaking; protocol
  evolution = wire break, manageable via ALPN versioning)

The auth-wrapping handler lives in the assembly layer (or a dedicated
vault-server crate depending on both alknet-core and alknet-vault), not in
the vault crate itself — the vault is standalone (ADR-018) and can't
import alknet-core's auth model.

OQ-21 remains deferred — no commitment to implement, but the door is open
and the design space is mapped.
2026-06-20 06:48:23 +00:00
dc27753680 docs(architecture): add ADR-021, resolve OQ-22 — key rotation via version-indexed paths
Key rotation uses version-indexed derivation paths: each key version maps
to a distinct SLIP-0010 path (m/74'/2'/0'/{version-2}'). v2 is at index 0
(PATHS::ENCRYPTION), v3 at index 1, etc.

Mechanism:
- encryption_path_for_version(version) constructs the path
- decrypt derives the key at the version-indicated path (not always
  PATHS::ENCRYPTION)
- rotate(blob, to_version) decrypts with old key, re-encrypts with new
- No new mnemonic needed — same seed, different path
- Partial rotation is safe — old keys remain derivable
- The vault does not self-rotate; the assembly layer iterates blobs

Source drift flagged:
- decrypt currently ignores key_version for path selection (always uses
  PATHS::ENCRYPTION) — must use version-indexed paths
- rotate method does not exist in source — must be added
- CURRENT_KEY_VERSION must bump from 1 to 2 (per ADR-020, reinforced here)

OQ-22 resolved. Only OQ-21 (remote vault admin, deferred) remains.
2026-06-19 10:09:20 +00:00
6e9414bc81 docs(architecture): add ADR-020, resolve OQ-20 — HD derivation for encryption keys
The vault uses SLIP-0010 HD derivation from the BIP39 seed for the
AES-256-GCM encryption key, not PBKDF2. This replaces the TypeScript
predecessor's (@alkdev/storage/src/graphs/crypto.ts) PBKDF2-based
approach.

Key decisions:
- HD derivation at m/74'/2'/0'/0' produces the encryption key
- PBKDF2 is not implemented in the vault; no password-based derivation
- salt field is unused in v2 (wire-format compat only)
- key_version=1 reserved for TS PBKDF2 data; key_version=2 for vault HD
- TS-encrypted data requires one-time migration to v2
- CURRENT_KEY_VERSION changes from 1 to 2 (source drift flagged)

OQ-20 resolved: the encryption key derivation method is locked. OQ-22
(key rotation workflow) remains open but does not block implementation.
2026-06-19 09:49:06 +00:00
dd1ca1de70 docs(architecture): add alknet-vault spec, ADR-018, ADR-019, OQ-20/21/22
Spec the vault crate from its existing implementation. The vault is
stable (implementation exists); this spec documents what IS so the
implementation-sync agent can reconcile source drift.

New spec documents (crates/vault/):
- README.md — crate index, security constraints, public API
- mnemonic-derivation.md — BIP39, SLIP-0010, BIP-0032, derivation paths
- encryption.md — AES-256-GCM, EncryptedData, key versioning, salt
- service.md — VaultServiceHandle lifecycle, actor dispatch, cache
- protocol.md — VaultProtocol irpc messages, DerivedKey redaction

New ADRs:
- ADR-018: Vault as standalone crate (zero alknet deps; own types/errors)
- ADR-019: Vault assembly-layer-only access (CLI is sole caller)

New open questions:
- OQ-20: Salt/KDF Phase B (open, low priority — salt field reserved)
- OQ-21: Remote vault administration (deferred — needs ADR if ever needed)
- OQ-22: Key rotation mechanism (open, low priority — workflow not specced)

Spec-vs-source drift explicitly flagged (for the sync agent):
- rand::random() used for IVs instead of OsRng (security-critical)
- unwrap() on every RwLock acquisition (must use unwrap_or_else)
- ADR-038 / OQ-SVC-03 references in source comments are stale (old numbering)
- VaultServiceActor::spawn returns a non-functional second actor (source bug)
- KeyVersionMismatch error variant is defined but unused in v1
2026-06-19 09:23:47 +00:00
c0a322ac29 docs(architecture): resolve OQ-11 and OQ-19 — all open questions resolved
OQ-11 (handler-level auth observability): Option B — handlers store
resolved identity on Connection via set_identity. Two identity scopes:
connection-level (observability, write-once-read-many) and per-request
(ACL, on OperationContext). Per-request takes precedence for ACL;
connection-level is for logging/audit only.

OQ-19 (session-scoped registries): Protocol doesn't need changes.
OperationEnv must remain a trait (not concrete) to enable session-overlay
pattern. Three-tier registry: core (static, External+Internal), session
(dynamic, Internal-only), promotion (curated review). Documented as
implementation guard in operation-registry.md.

All 19 open questions are now resolved. No open one-way or two-way doors
remain. The architecture is ready for review and implementation.
2026-06-19 06:05:04 +00:00
8f19eb8861 docs(architecture): add ADR-017 call protocol client and adapter contract, resolve OQ-15
ADR-017 locks the client/adapter architecture:
- CallClient opens QUIC connections, shares dispatch loop with CallAdapter
- Connection direction independent of call direction (both sides can call)
- from_call adapter: discovers remote ops via services/list + services/schema,
  registers with forwarding handlers (same pattern as from_openapi/from_mcp)
- to_openapi/to_mcp: project local ops to external protocols
- OperationAdapter trait: produces (OperationSpec, Handler) pairs
- Cross-node call tree: abort cascade propagates through from_call handlers
- Credentials from capabilities (ADR-014), adapter ops Internal by default (ADR-015)

The dispatch POC at /workspace/@alkdev/dispatch demonstrated head/worker over
SSH+axum; under the call protocol it's cross-node composition via from_call.
Connection topology (who advertises, who opens) is independent of call
direction — runner pattern, dispatch pattern, and P2P all work.
2026-06-18 10:57:29 +00:00
e2730869ca docs(architecture): add ADR-016 abort cascade for nested calls, resolve OQ-17
ADR-016 locks the abort cascade model:
- call.aborted cascades to all non-terminal descendants via parent_request_id
- Default policy: abort-dependents (abort everything downstream)
- Opt-in: continue-running (started descendants continue, pending ones abort)
- Server (CallAdapter) discovers descendants and propagates; client sends one abort
- Handlers clean up via Rust async drop semantics (Drop guards)
- parent_indexed map suffices for tree walking; flowgraph is optional prior art

Spec updates:
- call-protocol.md abort cascade section references ADR-016
- OQ-17 resolved, ADR-016 referenced across all call crate specs
- README.md updated: ADRs 001-016, OQ-17 moved to resolved
2026-06-18 09:37:19 +00:00
6285779c30 docs(architecture): add ADR-015 privilege model and authority context, resolve OQ-18
ADR-015 locks the call protocol's security model:
- internal flag switches authority context to handler identity, not skip ACL
- Operations have External/Internal visibility (Internal returns NOT_FOUND from wire, excluded from services/list)
- OperationContext carries both identity (caller/principal) and handler_identity (handler/agent)
- Scoped composition env bounds reachability (handler can only invoke declared operations)
- Three controls together: visibility (wire boundary) + handler identity (authority) + scoped env (reachability) = least privilege

Spec updates:
- OperationSpec gains Visibility field (External/Internal)
- OperationContext gains handler_identity field
- AccessControl section: ACL runs against caller identity for external, handler identity for internal
- LocalOperationEnv propagates handler_identity
- services/list only returns External operations
- Adapter-registered operations are Internal by default
- OQ-18 resolved, ADR-015 referenced across all call crate specs
2026-06-18 08:55:34 +00:00
b4aadc6b93 docs(architecture): add OQ-19 session-scoped registries and agent-written operations
Document the three-tier registry model (core/session/promotion) and the
self-improving agent workflow where agents write their own operations in
a quickjs sandbox. The POC at /workspace/toolEnv demonstrated the sandbox
mechanism (quickjs in Deno web workers, proxy-based env bridge via
postMessage) but exposed the full registry to the sandbox — the security
gap that OQ-18's scoped composition env addresses.

The call protocol doesn't need changes: the OperationEnv trait is the
composition point, and a session-scoped env wraps the global env (session
registry first, fall through to global). The one-way door this OQ guards
against: making OperationEnv concrete instead of a trait, or hardcoding
the global registry into the dispatch path, would close the session-overlay
pattern. Session-scoped operations are always Internal, run under the
handler's identity, and are ephemeral. Promotion to core requires curation
review (architect role with promote scope).
2026-06-18 08:31:46 +00:00
f27d717ac8 docs(architecture): reframe OQ-17 and OQ-18 as protocol-level concerns, not agent-specific
The abort cascade and privilege model are call protocol semantics that
every consumer inherits — NAPI adapter, Python adapter, agent service, and
any future service speaking the EventEnvelope wire format. Framing them as
'needs agent crate in view' let a single consumer's timeline gate a
protocol-level decision. The agent use case is a useful test case for edge
cases, but the decisions belong to the call protocol.
2026-06-18 07:47:57 +00:00
fab2c88444 docs(architecture): rename trusted to internal, add OQ-17 abort cascade and OQ-18 privilege model
The 'trusted' flag on OperationContext was the wrong word — it implies a
trust decision was made, but what actually happens is the call originated
internally (from composition) not externally (from the wire). Renamed to
'internal' with clarified semantics: internal calls switch authority
context to the handler's identity, not skip ACL. This prevents the
privilege escalation vector where composition with 'trusted: true' bypassed
all access control (buggy handler + parameterized dispatch).

- Rename trusted -> internal across operation-registry.md, ADR-014
- Update OperationContext field description and LocalOperationEnv code
- Add OQ-17: abort cascade for nested calls (call.aborted cascades to
  descendants, default abort-dependents, continue-running opt-in). One-way
  door on the protocol event schema; mechanism is a two-way door.
- Add OQ-18: privilege model and authority context (internal = authority
  switch not ACL skip, External/Internal operation visibility, scoped
  composition env + handler identity). Needs agent crate in view.
- Add abort cascade section and constraint to call-protocol.md
- Update crates/call/README.md with OQ-17, OQ-18, and two new design principles
- Update architecture README.md with OQ-17, OQ-18
2026-06-18 07:38:33 +00:00
6a7d4b9755 docs(architecture): add ADR-014 secret material flow, remove vault ops from call protocol
Resolve the contradiction between ADR-008's "capability source" model
and operation-registry.md showing vault operations on the wire. ADR-014
establishes: vault is assembly-layer only, capabilities carry outbound
credentials (distinct from inbound identity), call protocol carries no
secret material, adapters take credential sources not static tokens.

- Add ADR-014 (Secret Material Flow and Capability Injection)
- Remove vault/derive, vault/unlock, vault/decrypt from call protocol
  registration examples and all spec examples
- Add Capabilities field to OperationContext, propagate through
  LocalOperationEnv nested calls
- Add Capability Injection section to operation-registry.md
- Add no-secret-material wire constraint to call-protocol.md
- Add streaming subscribe example (LLM chat with Vercel UI chunks)
- Add Security Model section to overview.md (identity vs capabilities)
- Trim WASM treatment from ~20 lines to a design-constraint note
- Add OQ-16 (resolved: no vault ops on wire), update OQ-08, OQ-15
- Update ADR-003, ADR-008, ADR-013 to remove stale "via call protocol"
  vault references
2026-06-18 03:16:45 +00:00
6219a323b6 docs(architecture): untangle TLS identity use cases, remove phase framing, add ADR-013 Rust canonical + agent crate
- Rewrite OQ-12: separate two distinct TLS identity use cases (RFC 7250
  raw keys as default for P2P, X.509 for domain-hosted/browsers) instead
  of conflating them as 'file paths now, ACME later'. ACME is a proven
  pattern from the reverse-proxy project, not speculative future work.

- Resolve OQ-13 and OQ-14: remove 'Phase 1' framing from core crate
  specs. /{service}/{op} is the correct design for alknet-call, not a
  simplification. Batch as correlated call.requested events is the correct
  protocol design. Core crates need to be done right from the start.

- Add ADR-013: Rust as canonical implementation language. TypeScript
  @alkdev/operations is a reference that informed the design, not a
  parallel implementation. The only JS use case is browser SDK adaptation.
  Five reasons: memory safety, LLM competence, supply chain attacks,
  performance, browser-only JS.

- Add alknet-agent crate to the crate graph (depends on alknet-call, not
  alknet-core). Agent service uses call protocol client for tool dispatch
  and vault/derive for provider keys — no env vars for secrets. ALPN
  alknet/agent added to the registry.

- Add OQ-15: call protocol client and adapter contract. alknet-call needs
  both server (CallAdapter) and client (remote invocation over QUIC), plus
  the adapter traits (from_*, to_*) that enable composition.

- Clarify alknet-napi as thin NAPI projection layer, not business logic.

- Fix bugs: ProtocolController → ProtocolHandler typo, OperationEnv
  invoke() path format inconsistency, RateLimitConfig comment confusion.

- Update endpoint.md TLS section: comprehensive identity model comparison
  table, RFC 7250 as default mode, ACME as proven pattern.
2026-06-17 09:32:44 +00:00
a596f0d188 docs(architecture): add alknet-call crate spec, ADR-012, resolve OQ-07
Add architecture specs for the alknet-call crate:

- call-protocol.md: CallAdapter, EventEnvelope wire format, bidirectional
  stream model with ID-based correlation, PendingRequestMap, protocol
  operations (call/subscribe/batch/schema), per-request identity resolution,
  connection/stream lifecycle, error codes

- operation-registry.md: OperationSpec, async Handler type, OperationRegistry,
  AccessControl with trusted call bypass, OperationEnv with context
  propagation (parent_request_id, identity inheritance), service discovery,
  irpc integration layering, naming convention (no leading slash in names)

- ADR-012: Call protocol uses bidirectional QUIC streams with EventEnvelope
  framing and ID-based correlation. Protocol is stream-agnostic and symmetric.
  Resolves OQ-07.

Key design decisions:
- Handler type is async (Fn returning Pin<Box<dyn Future>>)
- OperationEnv::invoke propagates parent context (identity, metadata,
  parent_request_id)
- Identity resolution is per-request, not per-connection
- Operation names without leading slash (fs/readFile, not /fs/readFile)
- Batch is a client-side pattern, not a protocol primitive (OQ-14)
- Phase 1 uses service/op paths, node prefix added later (OQ-13)

Also: promote ADR-010 and ADR-011 from Proposed to Accepted, add OQ-13
and OQ-14 to open-questions.md.
2026-06-16 14:22:20 +00:00
5c8448ff86 docs(architecture): fix OQ-05 — multi-connectivity endpoint, not multi-transport
Correct the conflation of quinn/TLS/iroh as interchangeable transports.
They are complementary connectivity modes serving different deployment
contexts: quinn (public IP + TLS), iroh (NAT traversal via relay), TCP
(handler-specific, not core). Clarify that TLS cert = network identity,
not auth identity. Map stealth mode to HTTP handler on standard ALPNs
instead of byte-peeking. Resolve OQ-05 as one-way door. SendStream/
RecvStream now use internal enum dispatch for both quinn and iroh
streams.
2026-06-16 12:41:03 +00:00
90d5f4eaf9 docs(architecture): spec alknet-core with per-crate subdocs, ADR-010/011
Add alknet-core architecture specs in docs/architecture/crates/core/ with
focused subdocuments for core types, endpoint, auth, and config. Write
ADR-010 (ALPN Router and Endpoint) defining AlknetEndpoint, HandlerRegistry,
accept loop, and graceful shutdown. Write ADR-011 (AuthContext Structure)
defining AuthContext fields, immutability in handle(), and IdentityProvider
injection pattern. Resolve OQ-04 (static registration), OQ-12 (file paths
only for v1). Add OQ-11 (auth observability). Fix remaining alknet-secret
references to alknet-vault across ADRs 003/004/005/009.
2026-06-16 12:07:17 +00:00
80128a56e5 refactor: rename alknet-secret to alknet-vault
Rename the crate from alknet-secret to alknet-vault to better reflect its
purpose as a local key vault (seed management, key derivation, encryption)
rather than a network service.

Symbol renames:
- SecretService → VaultService
- SecretServiceHandle → VaultServiceHandle
- SecretServiceActor → VaultServiceActor
- SecretServiceError → VaultServiceError
- SecretProtocol → VaultProtocol
- SecretMessage → VaultMessage
- ServiceLocked → VaultLocked
- alknet_secret → alknet_vault (crate name)

Update ADR-008 with vault access pattern: the vault is a capability source,
not a service endpoint. The CLI injects derived/decrypted material into
operation contexts — handlers never hold vault references.
2026-06-16 11:10:07 +00:00
b47a6fe70b docs(architecture): resolve one-way doors, clean up Phase 0 specs
Resolve blocking one-way door decisions:
- ADR-007: BiStream is a trait, handlers receive Connection not BiStream
- ADR-008: Secret service is CLI-embedded, exposed via call protocol
- ADR-009: One-way door decision framework (classify by reversal cost)

Update existing documents:
- overview.md: add design principles, revise ProtocolHandler signature,
  update shared types, add WASM as design constraint
- open-questions.md: add door-type classifications, resolve OQ-01/OQ-08,
  move OQ-09/OQ-10 to deferred section, mark two-way doors as impl-deferred
- README.md: reflect resolved questions, remove crate spec stubs from index
- ADR-002: cross-reference ADR-007 for signature revision

Clean up premature artifacts:
- Remove 11 empty crate spec stubs (16-28 lines each, no unique content)
- Specs will be created when each crate enters Phase 1
2026-06-16 10:43:31 +00:00
f77b515968 docs(architecture): add Phase 0 architecture specs for ALPN-as-service model
Foundational architecture documents following the SDD process:

ADRs:
- 001: ALPN-based protocol dispatch (one endpoint, ALPN negotiation)
- 002: ProtocolHandler trait (replaces StreamInterface/MessageInterface)
- 003: Crate decomposition (one crate per handler, core provides shared infra)
- 004: Auth as shared core (IdentityProvider, hybrid resolution model)
- 005: irpc as call protocol foundation
- 006: ALPN string convention and connection model (alknet/ prefix, one ALPN per connection)

Docs:
- overview.md: crate graph, shared types, ALPN registry, failure modes
- README.md: index with doc table, ADR table, lifecycle definitions
- open-questions.md: 10 OQs across 7 themes (3 resolved, 7 open)

Crate spec stubs for all 11 planned crates (alknet-core through alknet CLI).

Key decisions resolved during self-review:
- AuthContext resolution is hybrid: endpoint resolves TLS-level auth,
  handlers resolve protocol-level auth (resolves OQ-02)
- ALPN is per-connection not per-stream, corrected ADR-001 (resolves OQ-06)
- ALPN naming uses alknet/ prefix without versions (resolves OQ-03)
- HandlerError return type on ProtocolHandler trait
- alknet/secret removed from ALPN registry until OQ-08 resolved
2026-06-15 22:14:58 +00:00
b5a4600d74 greenfield: clean slate for ALPN-as-service pivot
Delete old source crates (alknet-core, alknet, alknet-napi), old
architecture docs (ADRs, specs, open questions), old research docs
(phase2, event-sourcing, feasibility, etc.), old tasks, and obsolete
reference material (gitserver/MPL, honker, nats, rustfs, polyglot,
keystone, distributed-identity).

Keep: alknet-secret (standalone, compiles), pivot docs, iroh and ssh
references, rudolfs reference (MIT/Apache, fork candidate), ops docs,
sdd_process.md, and licenses.

Previous implementation preserved at /workspace/@alkdev/alknet-main/
for reference during porting.

Workspace compiles: cargo check + 14 tests pass for alknet-secret.
2026-06-15 12:08:08 +00:00
04e969982e feat(secret): add alknet-secret crate and architecture spec for Phase 3
Create the alknet-secret crate with BIP39 mnemonic generation, SLIP-0010
Ed25519 HD key derivation, AES-256-GCM encryption, and SecretProtocol
irpc service definition. This is Phase 3.1 from the integration plan.

Architecture changes:
- Promote secret-service.md to reviewed status with full spec format
  (crate structure, public API, security model, phase progression,
   ADR/OQ cross-references, wire format compatibility section)
- Add ADR-038 (seed lifecycle and memory security): zeroize for v1,
  mlock deferred to Phase B
- Add OQ-SEC-01 (mlock/VirtualLock for seed RAM) to open-questions.md
- Update README.md with ADR-038 and secret-service status

Crate structure:
- src/mnemonic.rs: BIP39 phrase generation, validation, seed derivation
- src/derivation.rs: SLIP-0010 HD key derivation, path constants (74')
- src/encryption.rs: AES-256-GCM encrypt/decrypt, EncryptedData type
- src/protocol.rs: SecretProtocol irpc enum, DerivedKey, KeyType
- src/service.rs: SecretServiceHandle with Unlock/Lock lifecycle
- 40 passing tests (unit + integration + doc)
2026-06-09 13:49:53 +00:00
cfc44008d3 Sync architecture specs with Phase 2 research findings
- Add definitions.md: normative terminology disambiguation (Interface, Service,
  Transport, Token, Identity, Domain, Scope, CredentialProvider, etc.)
- Add credentials.md: CredentialProvider trait and CredentialSet enum for
  outbound auth, mirroring IdentityProvider pattern for inbound auth
- Rewrite interface.md: StreamInterface/MessageInterface split (ADR-035),
  InterfaceRequest/InterfaceResponse, HttpInterface/DnsInterface stubs,
  ListenerConfig with Stream/Http/Dns variants, credential presentation table
- Update auth.md: API keys in DynamicConfig (ADR-037), credential presentation
  per (Transport, Interface) pair, ApiKeyEntry struct in AuthPolicy
- Update configuration.md: API keys, ListenerConfig with Http/Dns variants,
  expanded TOML config examples
- Update call-protocol.md: resolve OQ-IF-01 (InterfaceEvent carries
  EventEnvelope + Identity), add MessageInterface awareness to protocol
  adapter layer
- Update overview.md: three-layer model now includes StreamInterface/
  MessageInterface, CredentialProvider/CredentialSet exports, definitions.md
  reference, ADRs 035-037
- Update open-questions.md: resolve OQ-IF-01, OQ-IF-02, add OQ-P2-01
  through OQ-P2-04, add OQ-CP-01 through OQ-CP-04, add OQ-DEF-01,
  OQ-DEF-03, OQ-DEF-08
- Update README.md: add definitions.md, credentials.md, ADRs 035-037,
  phase2 research docs, current state description

Key architectural decisions:
- ADR-035: StreamInterface/MessageInterface split (two Layer 2 traits)
- ADR-036: CredentialProvider as core type (outbound auth, alknet_core::credentials)
- ADR-037: API keys as DynamicConfig auth (hash-verified bearer tokens)
2026-06-09 08:09:45 +00:00
d3633b7839 docs: complete Phase 0 architecture — spec updates, review fixes, and link portability
Update four existing specs (overview, server, napi-and-pubsub, call-protocol) to
reflect Phase 0 decisions: three-layer model, IdentityProvider, ForwardingPolicy,
OperationEnv, static/dynamic config split. Review all 9 Phase 0a ADRs (026-034)
for consistency. Fix 4 critical issues from architecture review: missing OQ-SVC-05
in open-questions.md, deprecated hub terminology, undefined AuthService and noq
terms. Replace inline OQ text with cross-references per format rules. Add
ConfigServiceImpl definition to configuration.md. Port absolute workspace paths
to project-relative links by copying referenced docs (feasibility, certbot,
fail2ban, event_source_types) into docs/research/.
2026-06-07 11:27:52 +00:00
19b3d3a078 docs: write Phase 0 architecture foundation — ADRs 026-034, spec docs, and task updates
Phase 0a — ADRs (9 new):
- ADR-026: Transport/interface separation (three-layer model)
- ADR-027: Crate decomposition (core, secret, storage, flowgraph, napi, CLI)
- ADR-028: Auth as irpc service (AuthProtocol behind feature flag)
- ADR-029: Identity as core type (Identity + IdentityProvider in alknet-core)
- ADR-030: Static/dynamic config split (ArcSwap, ConfigReloadHandle)
- ADR-031: Forwarding policy (rule-based allow/deny, TransportKind-aware)
- ADR-032: Event boundary discipline (domain, irpc, call protocol boundaries)
- ADR-033: OperationEnv universal composition (three dispatch paths)
- ADR-034: Head/worker terminology (replace hub/spoke)

Phase 0b — New spec documents (7):
- identity.md, services.md, interface.md, configuration.md,
  storage.md, flowgraph.md, secret-service.md

Updated existing docs:
- auth.md: reference identity.md for canonical definitions, add AuthProtocol
- open-questions.md: resolve OQ-12, OQ-16, OQ-18, OQ-22, OQ-23-25
- README.md: add all new docs, ADRs 026-034

Marked 19 architecture tasks as completed.
2026-06-07 09:32:58 +00:00
6db1266672 docs: fix inconsistencies in architecture specs
- Replace hub/spoke with head/worker terminology in call-protocol.md,
  auth.md, open-questions.md, napi-and-pubsub.md
- Update operation paths from /{spoke}/{service}/{op} to
  /{node}/{service}/{op} throughout call-protocol.md
- Unify Identity struct: auth.md already had {id, scopes, resources},
  add note clarifying this is canonical (vs research/services.md which
  used {node_id, fingerprint, scopes})
- Update integration-plan.md inconsistencies section to track what's
  been fixed (hub/spoke, identity model) and expand service naming
  to include external services
- Update call-protocol.md last_updated date

ADRs are intentionally left unchanged as historical records.
2026-06-07 07:50:00 +00:00
596c89ce24 refactor!: rebrand wraith to alknet
Rename all crates, CLI commands, constants, type names, doc comments,
and documentation from wraith to alknet. Includes wire-protocol changes:
ALPN wraith-ssh -> alknet-ssh, reserved destination prefix wraith- ->
alknet-, SSH auth username wraith -> alknet.
2026-06-05 10:04:32 +00:00
af7f4d0006 docs: add auth, call protocol architecture specs and ADRs 023-025
Unified authentication (ADR-023): SSH and WebTransport auth share the same
Ed25519 key material. Token auth uses signed timestamps verified against the
same authorized_keys set. IdentityProvider trait decouples core from identity
storage.

Bidirectional call protocol (ADR-024): Generalizes control channel (ADR-018)
to support hub→spoke and spoke→hub calls. Operation paths use /{spoke}/{service}/{op}
format for three-level routing. EventEnvelope wire format, five call events,
PendingRequestMap for correlation.

Handler/spec separation (ADR-025): Downstream consumers register operations
without modifying core. OperationRegistry maps paths to specs + handlers.
Service discovery via /services/list and /services/schema.

Resolves OQ-17 (transport-aware auth), OQ-21 (spoke routing), OQ-CFG-04 and
OQ-CFG-06 (WebTransport auth and transport-aware auth layer). Adds OQ-18
through OQ-22 for remaining open questions.
2026-06-05 08:19:41 +00:00
41062d810e docs: add configuration architecture research
Explore static/dynamic config split, hot-reloadable auth via ArcSwap,
forwarding policy, multi-transport listeners, and config file format.
Documents three problems: no auth hot-reload, no forwarding access control,
no structured config beyond CLI flags.

Key findings:
- Static config (transport, TLS, host key) loaded once at startup
- Dynamic config (auth, forwarding, rate limits) reloadable via ArcSwap
- ForwardingPolicy with rule-based allow/deny, first-match evaluation
- Multi-transport: Server spawns Vec<ListenerConfig> sharing auth config
- WebTransport out of scope for now (requires separate auth model)
- Proposes ADR-020 (static/dynamic split), ADR-021 (forwarding policy),
  ADR-022 (multi-transport listeners)

Adds OQ-12 through OQ-17 to open-questions.md.
2026-06-04 09:40:58 +00:00