Diagnoses a conflation in the pre-ADR-024 spec: the OperationRegistry inherited immutability by analogy from ADR-010's HandlerRegistry (ALPN-level), but the TLS-config argument that justifies HandlerRegistry immutability does not apply to the operation registry, which lives behind a single ALPN (alknet/call). This made from_call (which discovers ops over a live connection at runtime) structurally incompatible with the blanket immutability claim. ADR-024 layers the operation registry by trust boundary: curated (Local) ops are static and immutable — the startup trust boundary is where their composition authority is granted; session (Session) and imported (FromCall etc.) ops are dynamic at their respective scopes (per-session, per-connection) — their trust boundaries are per-scope, not per-startup. The principle: immutability follows the trust boundary. Immutability is the security control for composing ops (can escalate privilege); provenance + composition authority are the controls for non-composing ops (can't escalate). The OperationEnv trait becomes the integration point (Arc<dyn OperationEnv>), following the IdentityProvider precedent (ADR-004): the CallAdapter composes the root OperationContext.env per incoming call from the active layers (curated base + connection overlay + session overlay). Children inherit the parent's composite env by Arc::clone — overlay composition happens once at the root and propagates through the composition tree. Resolves review #002 C6 (OperationContext.env type identity crisis): the field is split into scoped_env: ScopedOperationEnv (reachability data, from the registration bundle) and env: Arc<dyn OperationEnv + Send + Sync> (dispatch trait object). 
One field was being used as two different types (reachability set with .allows() and dispatch trait with .invoke()). Localizes W4 (hot-swap ↔ registry mutability coupling) to the connection scope: no global mutable registry to hot-swap; overlays replace naturally with connect/disconnect and session start/end. Schema-drift on reconnect is a per-connection overlay-rebuild concern, not a global hot-swap protocol. Partially addresses W3 (CallClient registry security): the registry-shape sub-question is resolved by the overlay model; the capability-exposure sub-question (what capabilities a remote peer can trigger) remains for ADR-017, and ADR-024 does not overclaim resolution there. Amends OQ-04 to scope its immutability claim to the HandlerRegistry and cross-reference ADR-024 for the operation registry. Generalizes OQ-19's session-overlay mechanism to also cover connection-scoped remote imports: both are per-scope dynamic overlays on the static curated base, using the same trait-layering mechanism.
ADR-024: Operation Registry Layering
Status
Accepted
Context
The architecture has two registries that the spec documents previously treated as sharing one immutability argument:
- The endpoint's HandlerRegistry (ALPN string → ProtocolHandler). This is what ADR-010 and OQ-04 are about. Its immutability is load-bearing: ALPN strings are baked into the TLS ServerConfig at startup, so adding a protocol handler at runtime requires rebuilding the TLS config. This is a genuine one-way door and the rationale is correct.
- The call protocol's OperationRegistry (operation name → HandlerRegistration). This lives inside the CallAdapter, which is one ProtocolHandler behind the single ALPN alknet/call. Adding an operation to the OperationRegistry does not touch the TLS ServerConfig: the ALPN is already alknet/call, registered once at startup.
operation-registry.md stated the operation registry "is immutable after
construction… consistent with OQ-04 and ADR-010." That inheritance was by
analogy, not by shared rationale. The TLS argument that justifies
HandlerRegistry immutability does not apply to the OperationRegistry. The
operation registry's mutability profile is a separate question, and it has been
answered incorrectly by inheriting a constraint that belongs to a different
registry.
Why from_call breaks the inherited constraint
The import adapters have different lifecycle requirements:
- from_openapi / from_mcp can run at startup: the assembly layer reads a static spec file or queries a known service before the registry is frozen. Static import, fits immutability.
- from_call requires a live connection to discover operations (services/list + services/schema). Connections happen at runtime. Workers join and leave dynamically in the machine→worker topology. You cannot pre-freeze a set you discover over a connection you haven't opened yet.
So from_call is structurally incompatible with "frozen at startup, never
touched again." The pre-ADR-024 spec held two contradictory positions: the
registry is immutable (operation-registry.md), and from_call imports remote
operations at connection time (ADR-017). An implementer would have to resolve
the contradiction by guessing — likely by either forcing all from_call
imports to happen at startup (awkward, doesn't fit worker topologies) or
quietly making the registry mutable (undermining the stated constraint without
acknowledging it).
Why immutability is not the load-bearing security control for imported ops
Imported operations (FromOpenAPI, FromMCP, FromCall) are leaves — they
cannot compose (ADR-022 Assumption 5). They have no composition authority, no
scoped env, Internal visibility by default, and their trust model is "the
remote endpoint is trusted as much as my own handlers" (ADR-017). Their
reachability from a composing handler is bounded by the parent handler's
scoped env, not by their registration timing.
The security controls on imported ops are provenance and composition authority — both set at registration, both checked at dispatch. Immutability is redundant here. An imported op registered at runtime is no more or less privileged than one registered at startup; it's a forwarding stub either way, and its capacity to do harm is bounded by what the composing parent's authority and scoped env permit.
Immutability is load-bearing for curated operations — the Local ops
the assembly layer writes at startup, which can compose and therefore can
escalate privilege under their own authority. For those, the trust boundary is
"the assembly layer declared them at startup," and immutability is what locks
that declaration. But that's a constraint on Local provenance specifically,
not on the registry as a whole.
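The argument above can be sketched as a dispatch-time check. This is a minimal illustration, not the spec's API: the `Provenance` enum, the `Registration` struct, its `composition_authority` field, and `check_compose` are all hypothetical names, and modelling composition authority as a list of operation names is an assumption made only to keep the example small.

```rust
/// Provenance labels follow the ADR; the enum itself is illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provenance {
    Local,
    Session,
    FromCall,
    FromOpenApi,
    FromMcp,
}

/// Hypothetical stand-in for a registration bundle.
#[derive(Debug, Clone)]
pub struct Registration {
    pub name: String,
    pub provenance: Provenance,
    /// Operations this op may compose; `None` means no composition authority.
    pub composition_authority: Option<Vec<String>>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ComposeError {
    LeafCannotCompose,
    NotAuthorized,
}

/// Checked at dispatch: may `caller` compose `target`?
/// Registration timing never enters the decision.
pub fn check_compose(caller: &Registration, target: &str) -> Result<(), ComposeError> {
    // Imported ops are leaves whether they were registered at startup or at runtime.
    if matches!(
        caller.provenance,
        Provenance::FromCall | Provenance::FromOpenApi | Provenance::FromMcp
    ) {
        return Err(ComposeError::LeafCannotCompose);
    }
    match &caller.composition_authority {
        Some(allowed) if allowed.iter().any(|n| n == target) => Ok(()),
        _ => Err(ComposeError::NotAuthorized),
    }
}
```

Under these assumptions, an imported op is rejected by its provenance alone, which is why freezing the registry adds nothing to its containment.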
The trust-boundary principle
The right axis is not visibility (Internal vs External) or wire-vs-local —
it is provenance combined with import timing, which maps to where each
operation's trust decision is made:
| Provenance | Import timing | Trust boundary | Layer | Lifetime |
|---|---|---|---|---|
| Local | Startup | Assembly layer at startup | 0 (curated) | Process, immutable |
| Session | Sandbox creation | Composing handler at sandbox creation | 1 (session) | Session, dynamic |
| FromCall | Connection (runtime) | Remote node at connection time | 2 (connection) | Connection, dynamic |
| FromOpenAPI / FromMCP | Startup | External endpoint, discovered at startup | 0 (curated) | Process, immutable |
| FromOpenAPI / FromMCP | Runtime (rare) | External endpoint, discovered at runtime | 2 (discovery) | Discovery-scoped, dynamic |
FromOpenAPI / FromMCP provenance is layer-polymorphic: the same
provenance lands in Layer 0 (immutable) or Layer 2 (dynamic) depending on
when the import happens. The common case is startup import into Layer 0
(Decision 6); runtime import into Layer 2 is permitted but rare.
Immutability follows the trust boundary. Operations are mutable at the
scope where their trust decision is made. Local ops (and startup-imported
FromOpenAPI/FromMCP) are trusted at startup → immutable. Session ops
are trusted at sandbox creation → session-scoped dynamic. FromCall ops
(and runtime-imported FromOpenAPI/FromMCP) are trusted at
connection/discovery time → connection/runtime dynamic.
Session ops are the edge case that proves the rule: they are Internal
visibility and can compose, but their trust boundary is per-session (the
parent handler grants them restricted authority at sandbox creation, per
ADR-022 Assumption 6), not per-startup. Visibility alone would misclassify
them; provenance correctly identifies them as dynamic.
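The table and the principle can be read as a total function from (provenance, import timing) to a layer. The sketch below encodes that reading; `ImportTiming`, `Layer`, and `layer_for` are hypothetical names introduced for illustration, and returning `None` for combinations the table does not list is an assumption about how unsupported combinations would be treated.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provenance {
    Local,
    Session,
    FromCall,
    FromOpenApi,
    FromMcp,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ImportTiming {
    Startup,
    SandboxCreation,
    Runtime,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Layer {
    Curated,  // Layer 0
    Session,  // Layer 1
    Imported, // Layer 2
}

impl Layer {
    /// Only the curated layer is frozen for the process lifetime.
    pub fn is_immutable(self) -> bool {
        self == Layer::Curated
    }
}

/// Maps a row of the table to its layer; `None` for unlisted combinations
/// (for example a `Local` op registered at runtime).
pub fn layer_for(provenance: Provenance, timing: ImportTiming) -> Option<Layer> {
    match (provenance, timing) {
        (Provenance::Local, ImportTiming::Startup) => Some(Layer::Curated),
        (Provenance::Session, ImportTiming::SandboxCreation) => Some(Layer::Session),
        (Provenance::FromCall, ImportTiming::Runtime) => Some(Layer::Imported),
        (Provenance::FromOpenApi, ImportTiming::Startup)
        | (Provenance::FromMcp, ImportTiming::Startup) => Some(Layer::Curated),
        (Provenance::FromOpenApi, ImportTiming::Runtime)
        | (Provenance::FromMcp, ImportTiming::Runtime) => Some(Layer::Imported),
        _ => None,
    }
}
```

Note that visibility does not appear as an input: the layer is decided by provenance and timing alone, which is what keeps Session ops from being misclassified.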
The precedent: IdentityProvider
The structural problem — N consumers need to resolve something from M
sources, don't globalize the sources into one pot, don't make each consumer
know about all sources — is the same problem IdentityProvider solves for
auth (ADR-004). An IdentityProvider is a trait (Arc<dyn IdentityProvider>)
that centralizes resolution policy behind a stable interface; source
composition is an impl detail. Handlers consume the result; the trait owns the
routing.
OperationEnv is the same problem one layer over: N handlers need to
dispatch to operations, operations come from M sources (curated local, this
session, this peer connection, that peer connection), don't globalize all
sources into one mutable pot, don't make each handler know about all sources
and pick the right registry. The solution is the same shape: a trait —
Arc<dyn OperationEnv> — that centralizes dispatch routing behind a stable
interface, with overlay composition as an impl detail.
The alternative — a single global ArcSwap<OperationRegistry> into which all
imported ops merge with namespace prefixes — is the registry equivalent of
"every handler reads identity from a global env var." It works at one
connection. At many connections it produces: an unbounded pot, namespace
collisions scaling with connection count, disconnect cleanup requiring a
reverse index (op → owning connection), zero source isolation, and
routing-by-naming-convention instead of routing-by-structure. That is the
failure mode the IdentityProvider pattern exists to prevent.
Decision
1. The operation registry is layered by trust boundary
The OperationRegistry is not a single flat map. It is a layered structure
where each layer corresponds to a trust boundary:
Layer 0 — Curated (static, immutable, startup trust boundary)
Local provenance operations from the assembly layer.
Registered once at startup, never mutated for the process lifetime.
This is where immutability is load-bearing: these ops can compose,
therefore can escalate privilege under their own authority. The
startup trust boundary + immutability is the security control.
Layer 1 — Session (dynamic, per-session, sandbox-creation trust boundary)
Session provenance operations, agent-written, sandboxed.
Created and destroyed with each session.
Already specified by OQ-19 as an overlay on Layer 0.
Layer 2 — Imported (dynamic, per-connection, peer trust boundary)
FromCall operations discovered when a peer connects.
FromOpenAPI / FromMCP operations when imported at runtime (rare;
usually at startup into Layer 0, but runtime import is permitted).
Created and destroyed with the connection / discovery event.
Layers 1 and 2 are the same shape: per-scope dynamic overlays on the static
curated base. The scope is "session" for Layer 1 and "connection" (or
"discovery event") for Layer 2. OQ-19 already specified the overlay mechanism
for Layer 1 (session env wraps global env via OperationEnv trait layering).
This ADR generalizes the same mechanism to Layer 2.
2. The OperationEnv trait is the integration point
OperationContext.env is Arc<dyn OperationEnv + Send + Sync> — a trait
object, not a concrete struct. This is required by the overlay model: a
composite env (curated base + connection overlay + session overlay) is built
by composing OperationEnv impls, not by merging registries.
This resolves review #002 finding C6 (OperationContext.env type identity
crisis). The pre-ADR-024 spec had env: OperationEnv (a trait, which can't
be a field without dyn) and used the same field as both a reachability set
(parent.env.allows()) and a dispatch trait (context.env.invoke()). One
field cannot be both. The split:
- scoped_env: ScopedOperationEnv (reachability data). Populated from the registration bundle's scoped_env (ADR-022). The reachability check in invoke() consults parent.scoped_env.allows(&name).
- env: Arc<dyn OperationEnv + Send + Sync> (dispatch trait). The handler calls context.env.invoke(...); the trait impl routes to the right overlay.
This is the IdentityProvider-shaped integration point: handlers consume
the trait; source composition is an impl detail.
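A minimal sketch of the split, under stated assumptions: the trait is shown synchronous and string-typed for brevity (the real one is presumably async with structured payloads), the `CallError` variants and the `NotReachable` outcome are hypothetical, and `EchoEnv` is a toy environment that exists only for the example.

```rust
use std::collections::HashSet;
use std::sync::Arc;

#[derive(Debug, PartialEq, Eq)]
pub enum CallError {
    NotFound,
    NotReachable,
}

/// Dispatch trait: routes a name to whichever source holds it.
pub trait OperationEnv {
    fn invoke(&self, name: &str, input: &str) -> Result<String, CallError>;
}

/// Reachability data: the set of names a handler is allowed to compose.
pub struct ScopedOperationEnv {
    allowed: HashSet<String>,
}

impl ScopedOperationEnv {
    pub fn new(names: &[&str]) -> Self {
        Self { allowed: names.iter().map(|n| n.to_string()).collect() }
    }

    pub fn allows(&self, name: &str) -> bool {
        self.allowed.contains(name)
    }
}

/// The two roles live in two fields with two types.
pub struct OperationContext {
    pub scoped_env: ScopedOperationEnv,
    pub env: Arc<dyn OperationEnv + Send + Sync>,
}

impl OperationContext {
    /// Reachability is checked on data, then dispatch goes through the trait object.
    pub fn invoke(&self, name: &str, input: &str) -> Result<String, CallError> {
        if !self.scoped_env.allows(name) {
            return Err(CallError::NotReachable);
        }
        self.env.invoke(name, input)
    }
}

/// Toy environment that knows exactly one operation.
pub struct EchoEnv;

impl OperationEnv for EchoEnv {
    fn invoke(&self, name: &str, input: &str) -> Result<String, CallError> {
        if name == "util/echo" {
            Ok(input.to_string())
        } else {
            Err(CallError::NotFound)
        }
    }
}
```

The two failure modes stay distinct in this shape: a name outside the scoped env is refused before dispatch, while a name that is allowed but absent from every source fails inside the trait impl.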
3. The CallAdapter composes the root env per incoming call
When a call.requested arrives over connection C, the CallAdapter does
not look up the operation in a single global registry. It composes the root
OperationContext.env from the layers active for this call:
root env = CompositeOperationEnv {
    base:       curated_registry_env,     // Layer 0: static
    connection: C.imported_operations,    // Layer 2: this connection's overlay
    session:    active_session_overlay,   // Layer 1: if a session is active
}
The composite impl checks overlays in order (session first, then connection,
then curated base) and dispatches to the first match. This is structural
source binding: a handler composing worker/exec reaches it via the
connection overlay that contains it, not via a naming convention in a
global pot.
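The lookup order described above can be sketched as an `OperationEnv` impl over three optional maps. This is an illustration under assumptions: `Overlay`, `Handler`, and the `tagged` helper are hypothetical, handlers are synchronous string functions, and the field names mirror the pseudocode rather than any confirmed struct definition.

```rust
use std::collections::HashMap;
use std::sync::Arc;

pub type Handler = Arc<dyn Fn(&str) -> String + Send + Sync>;

#[derive(Debug, PartialEq, Eq)]
pub enum CallError {
    NotFound,
}

pub trait OperationEnv {
    fn invoke(&self, name: &str, input: &str) -> Result<String, CallError>;
}

/// One layer: a plain name-to-handler map.
#[derive(Default)]
pub struct Overlay {
    ops: HashMap<String, Handler>,
}

impl Overlay {
    pub fn with(mut self, name: &str, handler: Handler) -> Self {
        self.ops.insert(name.to_string(), handler);
        self
    }

    fn get(&self, name: &str) -> Option<&Handler> {
        self.ops.get(name)
    }
}

pub struct CompositeOperationEnv {
    pub base: Arc<Overlay>,               // Layer 0
    pub connection: Option<Arc<Overlay>>, // Layer 2
    pub session: Option<Arc<Overlay>>,    // Layer 1
}

impl OperationEnv for CompositeOperationEnv {
    fn invoke(&self, name: &str, input: &str) -> Result<String, CallError> {
        // Session first, then connection, then the curated base; first match wins.
        let hit = self
            .session
            .as_deref()
            .and_then(|o| o.get(name))
            .or_else(|| self.connection.as_deref().and_then(|o| o.get(name)))
            .or_else(|| self.base.get(name));
        match hit {
            Some(handler) => Ok((**handler)(input)),
            None => Err(CallError::NotFound),
        }
    }
}

/// Test helper: a handler that tags its output with the layer it came from.
pub fn tagged(tag: &'static str) -> Handler {
    Arc::new(move |input: &str| format!("{}:{}", tag, input))
}
```

With `worker/exec` present in both the base and a connection overlay, the connection's entry wins; a name present in neither yields `NotFound`.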
Env inheritance through composition: the child's env is
parent.env.clone() — an Arc::clone, not a re-composition. Overlay
composition happens once at the root (in build_root_context) and
propagates by Arc through the composition tree. A child handler sees the
same active overlays its parent saw. This is deliberate: re-composing per
invoke() would re-resolve overlays on every dispatch and would break the
session-overlay case (a session that was active when the parent ran must
still be active for the child, even if the session ended mid-composition —
the child is part of the same call tree the parent started). The root env
is composed per incoming call; nested calls inherit it by Arc::clone.
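The inheritance rule can be shown in a few lines. In this sketch the `describe` method, the `depth` field, and the `session_active` parameter are hypothetical devices for making the sharing observable; only `build_root_context`, `OperationContext.env`, and the `Arc::clone` rule come from the text above.

```rust
use std::sync::Arc;

pub trait OperationEnv {
    /// Illustrative only: lists the layers this env was composed from.
    fn describe(&self) -> String;
}

pub struct CompositeOperationEnv {
    pub layers: Vec<&'static str>,
}

impl OperationEnv for CompositeOperationEnv {
    fn describe(&self) -> String {
        self.layers.join("+")
    }
}

pub struct OperationContext {
    pub depth: usize,
    pub env: Arc<dyn OperationEnv + Send + Sync>,
}

/// Composition happens here, once per incoming call.
pub fn build_root_context(session_active: bool) -> OperationContext {
    let mut layers = vec!["curated", "connection"];
    if session_active {
        layers.push("session");
    }
    OperationContext { depth: 0, env: Arc::new(CompositeOperationEnv { layers }) }
}

impl OperationContext {
    /// A nested call shares the parent's env: a reference-count bump,
    /// not a fresh resolution of which overlays are active.
    pub fn child(&self) -> OperationContext {
        OperationContext { depth: self.depth + 1, env: Arc::clone(&self.env) }
    }
}
```

Because every descendant holds the same allocation, a session that ends after the root was built still appears in the env the whole call tree sees, while the next incoming call composes a fresh root without it.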
When connection C disconnects, its overlay is dropped. Operations imported
from C vanish from the reachable set with no global mutation and no reverse
index. Handlers that try to compose a now-gone op receive NOT_FOUND (if
the overlay was already dropped when invoke() runs the reachability
check) or a connection error with code INTERNAL (if the call was
dispatched to the forwarding handler and the connection drops mid-flight).
Both cases are clean failures — no stale-handler-binds-to-dead-connection
hazard.
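The "no global mutation, no reverse index" claim can be illustrated with overlays keyed by connection. This is a simplified sketch: `Layers`, `Source`, the `u64` connection id, and the callbacks are hypothetical, handlers are reduced to names, and the session layer is omitted. Only the `NOT_FOUND` outcome is modelled; the mid-flight `INTERNAL` case is not.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq, Eq)]
pub enum CallError {
    NotFound,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Source {
    Connection,
    Curated,
}

pub struct Layers {
    curated: HashSet<String>,
    /// One overlay per live connection, owned under that connection's id.
    connections: HashMap<u64, HashSet<String>>,
}

impl Layers {
    pub fn new(curated: &[&str]) -> Self {
        Self {
            curated: curated.iter().map(|n| n.to_string()).collect(),
            connections: HashMap::new(),
        }
    }

    pub fn on_connect(&mut self, conn_id: u64, imported: &[&str]) {
        self.connections
            .insert(conn_id, imported.iter().map(|n| n.to_string()).collect());
    }

    /// Dropping the overlay is the whole cleanup: no per-op bookkeeping.
    pub fn on_disconnect(&mut self, conn_id: u64) {
        self.connections.remove(&conn_id);
    }

    /// Resolves `name` for a call arriving on `conn_id`.
    pub fn resolve(&self, conn_id: u64, name: &str) -> Result<Source, CallError> {
        if self.connections.get(&conn_id).map_or(false, |o| o.contains(name)) {
            return Ok(Source::Connection);
        }
        if self.curated.contains(name) {
            return Ok(Source::Curated);
        }
        Err(CallError::NotFound)
    }
}
```

The same structure gives source isolation for free: a call arriving on a different connection never sees the first connection's imports.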
4. Curated operations remain immutable; imported and session ops are dynamic
The blanket immutability claim in operation-registry.md is replaced by:
- Layer 0 (curated, Local): immutable after startup. The OperationRegistry holding curated ops is constructed once by the assembly layer and never mutated. This is where the security argument for immutability applies: composing ops are privileged, the startup trust boundary is where that privilege is granted, immutability locks it.
- Layer 1 (session, Session): dynamic, per-session. Created at sandbox creation, destroyed at session end. Already specified by OQ-19.
- Layer 2 (imported, FromCall etc.): dynamic, per-connection. Created when a peer connection completes from_call discovery, destroyed when the connection closes.
Adding a Local op at runtime is not supported — it would require re-entering
the startup trust boundary, which is a deployment (restart), not a runtime
operation. This preserves the security property ADR-010/OQ-04 were concerned
with, scoped to where it actually applies.
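One way to make Layer 0 immutability a property of the types rather than a convention is a builder that is consumed on freeze. The sketch below assumes that approach; `CuratedRegistryBuilder`, `CuratedRegistry`, `freeze`, and the `fn(&str) -> String` handler type are hypothetical and not taken from the spec.

```rust
use std::collections::HashMap;
use std::sync::Arc;

pub type Handler = fn(&str) -> String;

/// Mutable only while the assembly layer is wiring things up.
#[derive(Default)]
pub struct CuratedRegistryBuilder {
    ops: HashMap<String, Handler>,
}

impl CuratedRegistryBuilder {
    pub fn register(&mut self, name: &str, handler: Handler) -> Result<(), String> {
        if self.ops.contains_key(name) {
            return Err(format!("duplicate operation: {}", name));
        }
        self.ops.insert(name.to_string(), handler);
        Ok(())
    }

    /// Consumes the builder, so nothing can register after this point.
    pub fn freeze(self) -> Arc<CuratedRegistry> {
        Arc::new(CuratedRegistry { ops: self.ops })
    }
}

/// Read-only view: the type exposes no `&mut self` methods.
pub struct CuratedRegistry {
    ops: HashMap<String, Handler>,
}

impl CuratedRegistry {
    pub fn get(&self, name: &str) -> Option<Handler> {
        self.ops.get(name).copied()
    }
}

/// Example curated handler.
pub fn echo(input: &str) -> String {
    input.to_string()
}
```

After `freeze`, adding a `Local` op would require constructing a new builder, which in this model only happens at process start.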
5. from_call imports into the connection's overlay, not the global registry
The from_call adapter (ADR-017) discovers operations on a remote peer and
produces HandlerRegistration bundles. Under ADR-024, those bundles are
registered into the connection's overlay, not a global mutable registry.
// On CallConnection establishment:
let imported = from_call(&connection, config).await;
connection.imported_operations.extend(imported);
// The connection's env now includes these ops.
The handler closures produced by from_call capture the CallConnection —
when the connection drops, the handlers become unreachable (their env is
dropped), and any in-flight calls to them return connection errors. This is
the natural lifecycle; no explicit deregistration is needed.
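The capture-the-connection lifecycle can be sketched with closures that hold an `Arc` to the connection. Everything here is a stand-in: this `from_call` is synchronous and takes the discovered names as an argument instead of running services/list, `CallConnection` is reduced to a peer name and an open flag, and `forward` fakes the wire call.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

#[derive(Debug, PartialEq, Eq)]
pub enum CallError {
    Internal(String),
}

pub struct CallConnection {
    peer: String,
    open: AtomicBool,
}

impl CallConnection {
    pub fn new(peer: &str) -> Arc<Self> {
        Arc::new(Self { peer: peer.to_string(), open: AtomicBool::new(true) })
    }

    pub fn close(&self) {
        self.open.store(false, Ordering::SeqCst);
    }

    /// Stand-in for sending a call over the wire.
    fn forward(&self, op: &str, input: &str) -> Result<String, CallError> {
        if !self.open.load(Ordering::SeqCst) {
            return Err(CallError::Internal(format!("connection to {} closed", self.peer)));
        }
        Ok(format!("{}@{}({})", op, self.peer, input))
    }
}

pub type Handler = Box<dyn Fn(&str) -> Result<String, CallError> + Send + Sync>;

/// Hypothetical stand-in for the import adapter: one forwarding stub per discovered op.
pub fn from_call(conn: &Arc<CallConnection>, discovered: &[&str]) -> Vec<(String, Handler)> {
    discovered
        .iter()
        .map(|op| {
            let conn = Arc::clone(conn); // each stub is bound to its source connection
            let op_name = op.to_string();
            let key = op_name.clone();
            let handler: Handler = Box::new(move |input: &str| conn.forward(&op_name, input));
            (key, handler)
        })
        .collect()
}
```

A stub built this way can only ever talk to the connection it was created from, so a call made after the connection closes fails with a connection error instead of reaching some other peer.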
6. from_openapi and from_mcp default to startup import into Layer 0
For the common case — the assembly layer imports a static OpenAPI spec or
connects to a known MCP server at startup — from_openapi / from_mcp
register into the curated (Layer 0) registry, which is then frozen. This
preserves the pre-ADR-024 behavior for the case where it was correct.
Runtime from_openapi / from_mcp import (e.g., discovering an MCP server
at connection time) is permitted and follows the Layer 2 model — the imported
ops live in a connection/discovery-scoped overlay. This is additive and
does not affect the startup-import path.
7. OQ-04 scope clarification and OQ-19 generalization
This ADR amends OQ-04 to scope its immutability claim to the
HandlerRegistry (ALPN-level, ADR-010). The OperationRegistry's
mutability profile is now governed by this ADR: curated (Layer 0) is
immutable; session and imported layers are dynamic at their trust-boundary
scopes. See the OQ-04 amendment in open-questions.md.
This ADR generalizes OQ-19's session-overlay mechanism to also cover
connection-scoped remote imports. Both are per-scope dynamic overlays on the
static curated base, composed into the per-call OperationContext.env by
the CallAdapter. OperationEnv being a trait object is what enables
both. See the OQ-19 resolution update in open-questions.md.
Consequences
Positive:
- from_call has a coherent home. Imported ops live with the connection that produced them, appear when the connection is established, and disappear when it closes. No contradiction with immutability, no awkward "import everything at startup" workaround.
- The immutability argument is now correctly scoped. Layer 0 (curated, composing ops) is immutable because that's where the security control applies. Layers 1 and 2 are dynamic because their trust boundaries are per-scope. An implementer reading the spec sees the right constraint in the right place, instead of a blanket claim that doesn't fit all cases.
- The OperationEnv-as-trait constraint (OQ-19) is now required by the overlay model, not just by the session-overlay pattern. The same mechanism (trait layering) supports both session overlays and connection overlays: one pattern, two scopes. This makes C6's resolution (env: Arc<dyn OperationEnv>) structurally motivated, not just a type-system cleanup.
- Disconnect handling is structural. A connection drops → its overlay drops → its ops vanish from the reachable set. No ArcSwap coordination, no reverse index from op to owning connection, no stale handlers bound to a dead connection. This is the same lifecycle property session overlays already have (session ends → session overlay drops).
- Source isolation is structural. Imported ops from peer X are only reachable from handlers whose OperationEnv is wired to X's overlay. They are not globally callable. A handler that shouldn't be able to reach peer X's ops simply doesn't have X's overlay in its env. This is better hygiene than a global registry with namespace prefixes, where every handler sees every imported op and isolation is a naming convention.
- The IdentityProvider precedent makes the design legible. A future reader sees "trait-object integration point, source composition as impl detail" and recognizes the pattern; they don't have to re-derive why trait-composed overlays were chosen over a global mutable registry.
Negative:
- The dispatch path is a composite lookup (session → connection → curated) rather than a single HashMap lookup. This is a small constant cost (three hash lookups in the worst case instead of one), and the overlays are small (a session's ops, a connection's imported ops). The common case (composing a curated op) hits Layer 0 after two empty-overlay misses, which is a predictable and cache-friendly path. The cost is justified by the source isolation and lifecycle properties it buys.
- OperationContext.env is now Arc<dyn OperationEnv + Send + Sync>, which is a trait object with dynamic dispatch. This is the same cost as Arc<dyn IdentityProvider>: a vtable call per invoke(). Negligible relative to the work an operation does, and the same pattern the codebase already uses for auth.
- The CallAdapter has more responsibility: it composes the root env per call from the active layers, rather than handing every call the same global registry. This is expected: the CallAdapter is the integration point for the call protocol, and per-call env composition is the same shape as per-call identity resolution (which the CallAdapter already does via IdentityProvider).
- Naming across overlays: if two connections import ops with the same name (e.g., both peers expose worker/exec), the composite env dispatches to the first overlay that contains the name. This is the same ambiguity FromCallConfig's namespace prefix (ADR-017) was designed to address: the caller disambiguates with a prefix at import time. ADR-024 does not change this; it makes the disambiguation structural (which overlay is in the env) rather than nominal (which prefix is in the name).
- The blanket immutability claim in operation-registry.md and the cross-references that inherit it (the "Two-way door — ArcSwap<OperationRegistry> can be added later" note, OQ-04's framing) must be updated. This is a spec edit, not a migration; no implementation exists yet.
On review #002 findings resolved by this ADR:
- C6 (OperationContext.env type identity crisis): resolved by Decision 2. The field is split into scoped_env (reachability data) and env (dispatch trait object). The split is structurally motivated by the overlay model, not just a type-system cleanup.
- W4 (hot-swap ↔ registry mutability coupling): localized to the connection scope. There is no global mutable registry to hot-swap. Overlays are per-scope and replace naturally with connect/disconnect and session start/end. The schema-drift hazard (a peer re-runs services/list on reconnect and re-imports with a changed schema) moves from global to per-connection; it does not vanish. A handler mid-composition whose peer reconnects with a changed schema sees the old schema until the overlay is rebuilt. This is a per-connection concern, not a global one; the guard clause the review asked for becomes a note on overlay rebuild semantics rather than a global hot-swap protocol.
- W3 (CallClient registry security dimension): partially addressed. The registry-shape sub-question is resolved by the overlay model: a CallClient's incoming-call dispatch uses the same overlay composition, and sharing the curated base with a remote peer is fine (curated ops are trusted). The capability-exposure sub-question (a remote peer calling /llm/generate uses the local node's API key) is not resolved by this ADR; it is a separate concern about what capabilities a remote peer can trigger, and it is unaffected by the registry shape. That sub-question remains open for ADR-017 (a guard-clause note: a peer-scoped subset must filter by capability remote-safety, not just operation name). ADR-024 resolves the dispatch shape; ADR-017 retains the capability-exposure decision.
Assumptions
- Provenance is knowable at registration time and stable for the registration's lifetime. A Local op does not become FromCall later; a FromCall op does not become Local. If a remote-imported op is later "promoted" to curated, that's a re-registration at the next startup (deployment), not a runtime mutation. Inherited from ADR-022 Assumption 2.
- Layer 0 immutability is the security control for composing ops. The pre-ADR-024 blanket immutability claim was overbroad but not wrong about Local ops. Curated composing ops must be immutable because the startup trust boundary is where their authority is granted. This ADR narrows the claim; it does not remove it.
- Imported and session ops do not need immutability as a security control for privilege escalation. Their security against privilege escalation is bounded by provenance (no composition authority → no privilege escalation) and by the parent handler's scoped env (reachability control). This is the central argument; if it's wrong (if a from_call op can escalate in some way provenance + scoped env don't bound), the model needs revisiting. Immutability is not the control for non-escalation threats (availability, schema drift): availability is bounded by per-handler timeouts (ADR-016) and the connection's overlay being drop-on-disconnect; schema drift on reconnect is a per-connection overlay-rebuild concern (see W4 in Consequences), not a global-registry-mutation concern. The point of scoping immutability to Layer 0 is that immutability is the right control for composing ops and the wrong control for non-composing ops; it is not a claim that non-composing ops face no threats.
- A connection's overlay is the right scope for from_call imports. Operations discovered from peer X are reachable from handlers whose env includes X's overlay. If a use case requires imported ops to be globally reachable (every handler sees every peer's ops), the composite env can be built to include all active connection overlays, but the default is per-connection scoping for isolation.
- Disconnect → overlay drop → op vanishes is acceptable behavior. A handler composing an op whose peer has disconnected receives NOT_FOUND (or a connection error if the in-flight call was mid-dispatch). This is the same behavior as a peer that never exposed the op. If a use case requires disconnected-peer ops to remain reachable (e.g., cached results), that's a handler-level caching concern, not a registry concern.
- The root env is composed per incoming call, not cached per connection. The active session overlay can change during a connection's lifetime (a session starts or ends mid-connection), so the env cannot be composed once at connection establishment and reused. build_root_context runs per call.requested and composes the env from the layers active at that moment. The cost (constructing an Arc<CompositeOperationEnv> per call) is negligible: it's three Arc::clones, not three registry traversals.
- Session-overlay attachment is an agent-crate concern. ADR-024 generalizes OQ-19's session overlay to also cover connection overlays, but the mechanism by which a session overlay attaches to a given wire call (session ID in metadata, payload field, connection-bound session state, etc.) is not specified here. The CallAdapter is wired with an optional session-overlay source by the assembly layer; the lookup mechanism belongs to the agent crate spec (OQ-19: "the agent-specific mechanism belongs to the agent crate spec"). If a wire call has no active session, the root env is curated base + connection overlay (no session layer).
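The per-call composition assumption can be made concrete with a snapshot of shared layers. In this sketch the `CallAdapter` fields, the `RwLock`-held session slot, `build_root_env`, and the `Vec`-of-names overlay are all hypothetical simplifications; how a session overlay is actually attached to a call is left to the agent crate, as stated above.

```rust
use std::sync::{Arc, RwLock};

/// Overlay contents reduced to operation names for brevity.
pub type Overlay = Vec<&'static str>;

pub struct CompositeOperationEnv {
    pub base: Arc<Overlay>,
    pub connection: Arc<Overlay>,
    pub session: Option<Arc<Overlay>>,
}

pub struct CallAdapter {
    pub base: Arc<Overlay>,
    pub connection: Arc<Overlay>,
    /// May change while the connection stays open (session start or end).
    pub session: RwLock<Option<Arc<Overlay>>>,
}

impl CallAdapter {
    /// Runs once per incoming call: clones handles to the layers active right now.
    /// No registry contents are copied or traversed.
    pub fn build_root_env(&self) -> Arc<CompositeOperationEnv> {
        let session = self.session.read().expect("session lock poisoned").clone();
        Arc::new(CompositeOperationEnv {
            base: Arc::clone(&self.base),
            connection: Arc::clone(&self.connection),
            session,
        })
    }
}
```

A call that arrived before a session started keeps its session-free env, and a call that arrives afterwards picks the session up, even though both share the same underlying base and connection overlays.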
References
- ADR-010: ALPN router and endpoint (the HandlerRegistry immutability argument; this ADR clarifies that it applies to the ALPN registry, not the operation registry)
- ADR-014: Secret material flow and capability injection (capabilities are per-HandlerRegistration bundle, not per-registry; the overlay model doesn't change how capabilities flow; an imported op's capabilities come from its bundle, which for from_call is whatever the assembly layer granted the import)
- ADR-017: Call protocol client and adapter contract (from_call adapter; the FromCallConfig namespace prefix is the disambiguation mechanism this ADR's overlay model uses structurally)
- ADR-022: Handler registration, provenance, and composition authority (provenance is the axis this ADR's layering is based on; the HandlerRegistration bundle shape is unchanged)
- ADR-004: Auth as shared core (IdentityProvider, the precedent for the trait-object integration point pattern this ADR applies to OperationEnv)
- OQ-04: Dynamic handler registration (this ADR amends OQ-04 to scope it to the HandlerRegistry; the operation registry's mutability is now governed by ADR-024)
- OQ-19: Session-scoped operation registries (this ADR generalizes the session-overlay mechanism to connection overlays: same pattern, two scopes)
- docs/reviews/002-pre-implementation-architecture-sanity-check.md (findings C6, W3, W4: C6 is resolved by this ADR, W4 is localized to the connection scope, and W3 is partially addressed)