Commit Graph

14 Commits

Author SHA1 Message Date
cdf340bec7 docs(architecture): add ADR-024 — operation registry layering, resolve C6
Diagnoses a conflation in the pre-ADR-024 spec: the OperationRegistry
inherited immutability by analogy from ADR-010's HandlerRegistry (ALPN-level),
but the TLS-config argument that justifies HandlerRegistry immutability does
not apply to the operation registry, which lives behind a single ALPN
(alknet/call). This made from_call (which discovers ops over a live connection
at runtime) structurally incompatible with the blanket immutability claim.

ADR-024 layers the operation registry by trust boundary: curated (Local) ops
are static and immutable — the startup trust boundary is where their
composition authority is granted; session (Session) and imported (FromCall
etc.) ops are dynamic at their respective scopes (per-session, per-connection)
— their trust boundaries are per-scope, not per-startup. The principle:
immutability follows the trust boundary. Immutability is the security control
for composing ops (can escalate privilege); provenance + composition authority
are the controls for non-composing ops (can't escalate).

The OperationEnv trait becomes the integration point (Arc<dyn OperationEnv>),
following the IdentityProvider precedent (ADR-004): the CallAdapter composes
the root OperationContext.env per incoming call from the active layers
(curated base + connection overlay + session overlay). Children inherit the
parent's composite env by Arc::clone — overlay composition happens once at
the root and propagates through the composition tree.

Resolves review #002 C6 (OperationContext.env type identity crisis): the
field is split into scoped_env: ScopedOperationEnv (reachability data, from
the registration bundle) and env: Arc<dyn OperationEnv + Send + Sync>
(dispatch trait object). One field was being used as two different types
(reachability set with .allows() and dispatch trait with .invoke());

Localizes W4 (hot-swap ↔ registry mutability coupling) to the connection
scope: no global mutable registry to hot-swap; overlays replace naturally
with connect/disconnect and session start/end. Schema-drift on reconnect is
a per-connection overlay-rebuild concern, not a global hot-swap protocol.

Partially addresses W3 (CallClient registry security): the registry-shape
sub-question is resolved by the overlay model; the capability-exposure
sub-question (what capabilities a remote peer can trigger) remains for
ADR-017 — ADR-024 does not overclaim resolution there.

Amends OQ-04 to scope its immutability claim to the HandlerRegistry and
cross-reference ADR-024 for the operation registry. Generalizes OQ-19's
session-overlay mechanism to also cover connection-scoped remote imports —
both are per-scope dynamic overlays on the static curated base, using the
same trait-layering mechanism.
2026-06-22 13:44:58 +00:00
c62a6adc7b docs(architecture): resolve review #002 Tiers 1-3 — mechanical and consistency fixes
Governance (Tier 2):
- Advance ADR-022 and ADR-023 from Proposed to Accepted (specs already
  depend on their types as source of truth)
- Amend ADR-015: mark Decision 3 and Assumption 6 as superseded by ADR-022;
  update handler_identity type to CompositionAuthority
- Amend ADR-002: note handle() signature revised by ADR-007 (BiStream → Connection)
- Amend ADR-004: note 'enrich/replace' AuthContext language superseded by
  ADR-011's immutability model; update to describe set_identity on Connection
- Update main README ADR table to show ADR-022/023 as Accepted

Spec-ADR consistency (Tier 3):
- Add abort_policy: AbortPolicy field to OperationContext struct (ADR-016
  Decision 6 mandated this but the spec omitted it)
- Define AbortPolicy enum (AbortDependents | ContinueRunning) with Default impl
- Add abort_policy to build_root_context and LocalOperationEnv::invoke()
- Define the OperationEnv trait explicitly with invoke() and
  invoke_with_policy() methods (was referenced as 'must remain a trait'
  but never defined)
- Specify From<StreamError> for HandlerError impl with exact variant mapping
- Add Connection::from_quinn() / from_iroh() constructors (was referenced
  as Connection::new() but never defined)
- Remove undefined CertAuthorityEntry placeholder from AuthPolicy v1 (will
  be added additively when alknet-ssh lands)
- Fix config.md key-differences table: rate limits are in DynamicConfig,
  not StaticConfig

Mechanical fixes (Tier 1):
- overview.md: 'closes the QUIC stream' → 'closes the connection' (stale
  from pre-ADR-007 model)
- overview.md: OQ-04 entry updated from stale 'defer to implementation'
  to 'resolved: static at startup'
- mnemonic-derivation.md: remove duplicate helper functions block (incomplete
  first copy, complete second copy)
- ADR-003: add iroh (feature-gated) to alknet-core dependency list, added
  by ADR-010
- ADR-021: fix ambiguous 'W1 drift issue from the vault review' cross-reference
- ADR-022: rephrase FromCall 'leaf locally' to 'leaf in the local registry'
- ADR-017: add error_schemas to from_call mirror list and services/schema
  step (inconsistency with ADR-023)
- ADR-016: fix self-referential citation ('ADR-016 Assumption 5' → 'Assumption 5')
- Add ScopedOperationEnv::empty(), allows(), new() and
  CompositionAuthority::none(), new() impl blocks (referenced but undefined)
- Add call.completed clarification for non-subscription calls
- Add services/schema leading-slash normalization note
- Crate README ADR tables: add missing ADR-013 (call), ADR-015 (core),
  ADR-006 + ADR-010 (vault)
- Vault README: add consolidated 'Known Source Drift' table tracking all
  four drift items (OsRng, unwrap, CURRENT_KEY_VERSION, spawn bug) in one
  place, including the two previously missing from README
2026-06-22 05:46:37 +00:00
3e238a471b docs(architecture): add ADR-023, resolve OQ-24 — operation error schemas
ADR-023 adds error_schemas to OperationSpec so operations can declare
their domain-level failure modes (FILE_NOT_FOUND, RATE_LIMITED, etc.)
distinct from protocol-level codes (NOT_FOUND, FORBIDDEN, etc.). The
call.error payload gains an optional 'details' field carrying the typed
error payload conforming to the declared schema. from_openapi/to_openapi
map OpenAPI response status codes to/from ErrorDefinitions, making the
adapter contract from ADR-017 faithful on the error axis.

Also fixes W2 (KeyVersionMismatch stale comment in encryption.md —
ADR-021 implements rotation without this variant) and W4
(derive_encryption_key_for_version missing from service.md method list).

Spec updates: operation-registry.md (OperationSpec, ErrorDefinition,
Handler error mapping, services/schema), call-protocol.md (call.error
payload, CallError, ResponseEnvelope), README.md, overview.md,
open-questions.md (OQ-24), call/README.md, encryption.md, service.md.
2026-06-21 10:26:18 +00:00
1cedc4eeba docs(architecture): add ADR-022, resolve OQ-23 — handler registration, provenance, and composition authority
ADR-022 wires the three controls ADR-015 specified but left without
registration paths (C1-C4 from review #001): composition authority,
scoped env, and capabilities now enter through a HandlerRegistration
bundle. Provenance (Local, FromOpenAPI, FromMCP, FromCall, Session)
determines which ops can compose — leaves don't get composition
authority. CompositionAuthority replaces handler_identity: Identity
(it's a declared authority bundle, not a peer identity). Capabilities
are per-request from the bundle (resolves closure-capture vs context
ambiguity). Kernel/user analogy: user's authority checked at External
gate; handler's composition authority used inside; scoped env bounds
reachability.

Also fixes W1 (stale ADR-020 path example) and W3 (from_mcp missing
from adapter lists in operation-registry.md).

Spec updates: operation-registry.md (OperationRegistry,
HandlerRegistration, OperationContext, OperationEnv, registration
example, capability injection), call-protocol.md (build_root_context),
README.md, overview.md, open-questions.md (OQ-23), call/README.md.
2026-06-21 09:09:47 +00:00
c0a322ac29 docs(architecture): resolve OQ-11 and OQ-19 — all open questions resolved
OQ-11 (handler-level auth observability): Option B — handlers store
resolved identity on Connection via set_identity. Two identity scopes:
connection-level (observability, write-once-read-many) and per-request
(ACL, on OperationContext). Per-request takes precedence for ACL;
connection-level is for logging/audit only.

OQ-19 (session-scoped registries): Protocol doesn't need changes.
OperationEnv must remain a trait (not concrete) to enable session-overlay
pattern. Three-tier registry: core (static, External+Internal), session
(dynamic, Internal-only), promotion (curated review). Documented as
implementation guard in operation-registry.md.

All 19 open questions are now resolved. No open one-way or two-way doors
remain. The architecture is ready for review and implementation.
2026-06-19 06:05:04 +00:00
8f19eb8861 docs(architecture): add ADR-017 call protocol client and adapter contract, resolve OQ-15
ADR-017 locks the client/adapter architecture:
- CallClient opens QUIC connections, shares dispatch loop with CallAdapter
- Connection direction independent of call direction (both sides can call)
- from_call adapter: discovers remote ops via services/list + services/schema,
  registers with forwarding handlers (same pattern as from_openapi/from_mcp)
- to_openapi/to_mcp: project local ops to external protocols
- OperationAdapter trait: produces (OperationSpec, Handler) pairs
- Cross-node call tree: abort cascade propagates through from_call handlers
- Credentials from capabilities (ADR-014), adapter ops Internal by default (ADR-015)

The dispatch POC at /workspace/@alkdev/dispatch demonstrated head/worker over
SSH+axum; under the call protocol it's cross-node composition via from_call.
Connection topology (who advertises, who opens) is independent of call
direction — runner pattern, dispatch pattern, and P2P all work.
2026-06-18 10:57:29 +00:00
e2730869ca docs(architecture): add ADR-016 abort cascade for nested calls, resolve OQ-17
ADR-016 locks the abort cascade model:
- call.aborted cascades to all non-terminal descendants via parent_request_id
- Default policy: abort-dependents (abort everything downstream)
- Opt-in: continue-running (started descendants continue, pending ones abort)
- Server (CallAdapter) discovers descendants and propagates; client sends one abort
- Handlers clean up via Rust async drop semantics (Drop guards)
- parent_indexed map suffices for tree walking; flowgraph is optional prior art

Spec updates:
- call-protocol.md abort cascade section references ADR-016
- OQ-17 resolved, ADR-016 referenced across all call crate specs
- README.md updated: ADRs 001-016, OQ-17 moved to resolved
2026-06-18 09:37:19 +00:00
6285779c30 docs(architecture): add ADR-015 privilege model and authority context, resolve OQ-18
ADR-015 locks the call protocol's security model:
- internal flag switches authority context to handler identity, not skip ACL
- Operations have External/Internal visibility (Internal returns NOT_FOUND from wire, excluded from services/list)
- OperationContext carries both identity (caller/principal) and handler_identity (handler/agent)
- Scoped composition env bounds reachability (handler can only invoke declared operations)
- Three controls together: visibility (wire boundary) + handler identity (authority) + scoped env (reachability) = least privilege

Spec updates:
- OperationSpec gains Visibility field (External/Internal)
- OperationContext gains handler_identity field
- AccessControl section: ACL runs against caller identity for external, handler identity for internal
- LocalOperationEnv propagates handler_identity
- services/list only returns External operations
- Adapter-registered operations are Internal by default
- OQ-18 resolved, ADR-015 referenced across all call crate specs
2026-06-18 08:55:34 +00:00
b4aadc6b93 docs(architecture): add OQ-19 session-scoped registries and agent-written operations
Document the three-tier registry model (core/session/promotion) and the
self-improving agent workflow where agents write their own operations in
a quickjs sandbox. The POC at /workspace/toolEnv demonstrated the sandbox
mechanism (quickjs in Deno web workers, proxy-based env bridge via
postMessage) but exposed the full registry to the sandbox — the security
gap that OQ-18's scoped composition env addresses.

The call protocol doesn't need changes: the OperationEnv trait is the
composition point, and a session-scoped env wraps the global env (session
registry first, fall through to global). The one-way door this OQ guards
against: making OperationEnv concrete instead of a trait, or hardcoding
the global registry into the dispatch path, would close the session-overlay
pattern. Session-scoped operations are always Internal, run under the
handler's identity, and are ephemeral. Promotion to core requires curation
review (architect role with promote scope).
2026-06-18 08:31:46 +00:00
f27d717ac8 docs(architecture): reframe OQ-17 and OQ-18 as protocol-level concerns, not agent-specific
The abort cascade and privilege model are call protocol semantics that
every consumer inherits — NAPI adapter, Python adapter, agent service, and
any future service speaking the EventEnvelope wire format. Framing them as
'needs agent crate in view' let a single consumer's timeline gate a
protocol-level decision. The agent use case is a useful test case for edge
cases, but the decisions belong to the call protocol.
2026-06-18 07:47:57 +00:00
fab2c88444 docs(architecture): rename trusted to internal, add OQ-17 abort cascade and OQ-18 privilege model
The 'trusted' flag on OperationContext was the wrong word — it implies a
trust decision was made, but what actually happens is the call originated
internally (from composition) not externally (from the wire). Renamed to
'internal' with clarified semantics: internal calls switch authority
context to the handler's identity, not skip ACL. This prevents the
privilege escalation vector where composition with 'trusted: true' bypassed
all access control (buggy handler + parameterized dispatch).

- Rename trusted -> internal across operation-registry.md, ADR-014
- Update OperationContext field description and LocalOperationEnv code
- Add OQ-17: abort cascade for nested calls (call.aborted cascades to
  descendants, default abort-dependents, continue-running opt-in). One-way
  door on the protocol event schema; mechanism is a two-way door.
- Add OQ-18: privilege model and authority context (internal = authority
  switch not ACL skip, External/Internal operation visibility, scoped
  composition env + handler identity). Needs agent crate in view.
- Add abort cascade section and constraint to call-protocol.md
- Update crates/call/README.md with OQ-17, OQ-18, and two new design principles
- Update architecture README.md with OQ-17, OQ-18
2026-06-18 07:38:33 +00:00
6a7d4b9755 docs(architecture): add ADR-014 secret material flow, remove vault ops from call protocol
Resolve the contradiction between ADR-008's "capability source" model
and operation-registry.md showing vault operations on the wire. ADR-014
establishes: vault is assembly-layer only, capabilities carry outbound
credentials (distinct from inbound identity), call protocol carries no
secret material, adapters take credential sources not static tokens.

- Add ADR-014 (Secret Material Flow and Capability Injection)
- Remove vault/derive, vault/unlock, vault/decrypt from call protocol
  registration examples and all spec examples
- Add Capabilities field to OperationContext, propagate through
  LocalOperationEnv nested calls
- Add Capability Injection section to operation-registry.md
- Add no-secret-material wire constraint to call-protocol.md
- Add streaming subscribe example (LLM chat with Vercel UI chunks)
- Add Security Model section to overview.md (identity vs capabilities)
- Trim WASM treatment from ~20 lines to a design-constraint note
- Add OQ-16 (resolved: no vault ops on wire), update OQ-08, OQ-15
- Update ADR-003, ADR-008, ADR-013 to remove stale "via call protocol"
  vault references
2026-06-18 03:16:45 +00:00
6219a323b6 docs(architecture): untangle TLS identity use cases, remove phase framing, add ADR-013 Rust canonical + agent crate
- Rewrite OQ-12: separate two distinct TLS identity use cases (RFC 7250
  raw keys as default for P2P, X.509 for domain-hosted/browsers) instead
  of conflating them as 'file paths now, ACME later'. ACME is a proven
  pattern from the reverse-proxy project, not speculative future work.

- Resolve OQ-13 and OQ-14: remove 'Phase 1' framing from core crate
  specs. /{service}/{op} is the correct design for alknet-call, not a
  simplification. Batch as correlated call.requested events is the correct
  protocol design. Core crates need to be done right from the start.

- Add ADR-013: Rust as canonical implementation language. TypeScript
  @alkdev/operations is a reference that informed the design, not a
  parallel implementation. The only JS use case is browser SDK adaptation.
  Five reasons: memory safety, LLM competence, supply chain attacks,
  performance, browser-only JS.

- Add alknet-agent crate to the crate graph (depends on alknet-call, not
  alknet-core). Agent service uses call protocol client for tool dispatch
  and vault/derive for provider keys — no env vars for secrets. ALPN
  alknet/agent added to the registry.

- Add OQ-15: call protocol client and adapter contract. alknet-call needs
  both server (CallAdapter) and client (remote invocation over QUIC), plus
  the adapter traits (from_*, to_*) that enable composition.

- Clarify alknet-napi as thin NAPI projection layer, not business logic.

- Fix bugs: ProtocolController → ProtocolHandler typo, OperationEnv
  invoke() path format inconsistency, RateLimitConfig comment confusion.

- Update endpoint.md TLS section: comprehensive identity model comparison
  table, RFC 7250 as default mode, ACME as proven pattern.
2026-06-17 09:32:44 +00:00
a596f0d188 docs(architecture): add alknet-call crate spec, ADR-012, resolve OQ-07
Add architecture specs for the alknet-call crate:

- call-protocol.md: CallAdapter, EventEnvelope wire format, bidirectional
  stream model with ID-based correlation, PendingRequestMap, protocol
  operations (call/subscribe/batch/schema), per-request identity resolution,
  connection/stream lifecycle, error codes

- operation-registry.md: OperationSpec, async Handler type, OperationRegistry,
  AccessControl with trusted call bypass, OperationEnv with context
  propagation (parent_request_id, identity inheritance), service discovery,
  irpc integration layering, naming convention (no leading slash in names)

- ADR-012: Call protocol uses bidirectional QUIC streams with EventEnvelope
  framing and ID-based correlation. Protocol is stream-agnostic and symmetric.
  Resolves OQ-07.

Key design decisions:
- Handler type is async (Fn returning Pin<Box<dyn Future>>)
- OperationEnv::invoke propagates parent context (identity, metadata,
  parent_request_id)
- Identity resolution is per-request, not per-connection
- Operation names without leading slash (fs/readFile, not /fs/readFile)
- Batch is a client-side pattern, not a protocol primitive (OQ-14)
- Phase 1 uses service/op paths, node prefix added later (OQ-13)

Also: promote ADR-010 and ADR-011 from Proposed to Accepted, add OQ-13
and OQ-14 to open-questions.md.
2026-06-16 14:22:20 +00:00