docs(architecture): add ADR-024 — operation registry layering, resolve C6
Diagnoses a conflation in the pre-ADR-024 spec: the OperationRegistry inherited immutability by analogy from ADR-010's HandlerRegistry (ALPN-level), but the TLS-config argument that justifies HandlerRegistry immutability does not apply to the operation registry, which lives behind a single ALPN (alknet/call). This made from_call (which discovers ops over a live connection at runtime) structurally incompatible with the blanket immutability claim.

ADR-024 layers the operation registry by trust boundary: curated (Local) ops are static and immutable — the startup trust boundary is where their composition authority is granted; session (Session) and imported (FromCall etc.) ops are dynamic at their respective scopes (per-session, per-connection) — their trust boundaries are per-scope, not per-startup. The principle: immutability follows the trust boundary. Immutability is the security control for composing ops (can escalate privilege); provenance + composition authority are the controls for non-composing ops (can't escalate).

The OperationEnv trait becomes the integration point (Arc<dyn OperationEnv>), following the IdentityProvider precedent (ADR-004): the CallAdapter composes the root OperationContext.env per incoming call from the active layers (curated base + connection overlay + session overlay). Children inherit the parent's composite env by Arc::clone — overlay composition happens once at the root and propagates through the composition tree.

Resolves review #002 C6 (OperationContext.env type identity crisis): the field is split into scoped_env: ScopedOperationEnv (reachability data, from the registration bundle) and env: Arc<dyn OperationEnv + Send + Sync> (dispatch trait object). One field was being used as two different types (reachability set with .allows() and dispatch trait with .invoke()).

Localizes W4 (hot-swap ↔ registry mutability coupling) to the connection scope: no global mutable registry to hot-swap; overlays replace naturally with connect/disconnect and session start/end. Schema-drift on reconnect is a per-connection overlay-rebuild concern, not a global hot-swap protocol.

Partially addresses W3 (CallClient registry security): the registry-shape sub-question is resolved by the overlay model; the capability-exposure sub-question (what capabilities a remote peer can trigger) remains for ADR-017 — ADR-024 does not overclaim resolution there.

Amends OQ-04 to scope its immutability claim to the HandlerRegistry and cross-reference ADR-024 for the operation registry.

Generalizes OQ-19's session-overlay mechanism to also cover connection-scoped remote imports — both are per-scope dynamic overlays on the static curated base, using the same trait-layering mechanism.
@@ -132,7 +132,9 @@ pub struct OperationContext {
     pub handler_identity: Option<CompositionAuthority>, // Handler's composition authority
     pub capabilities: Capabilities,
     pub metadata: HashMap<String, Value>,
-    pub env: OperationEnv,
+    // env/scoped_env split by ADR-024:
+    pub scoped_env: ScopedOperationEnv, // Reachability data (ADR-022, ADR-024)
+    pub env: Arc<dyn OperationEnv + Send + Sync>, // Dispatch trait (ADR-024)
     /// Module-private for writes; read via `is_internal()`. Set only by
     /// `OperationEnv::invoke()` (true) or `CallAdapter` dispatch (false).
     pub(crate) internal: bool,
@@ -319,8 +319,12 @@ fn build_root_context(
         handler_identity: registration.composition_authority, // C1: from bundle, None for leaves
         capabilities: registration.capabilities.clone(), // C3: from bundle
         metadata: HashMap::new(),
-        env: registration.scoped_env.clone()
+        // env/scoped_env split by ADR-024: scoped_env is the reachability
+        // data (from the bundle), env is the dispatch trait object (composed
+        // per-call by the CallAdapter from active overlays).
+        scoped_env: registration.scoped_env.clone()
             .unwrap_or_else(ScopedOperationEnv::empty), // C2: from bundle, empty for leaves
+        env: /* CallAdapter.compose_root_env(...) — see ADR-024 */,
         internal: false, // wire call — ACL against caller identity
     }
 }
@@ -339,7 +343,9 @@ async fn invoke(&self, namespace: &str, operation: &str, input: Value,
     // Reachability check (C2): is this op in the parent's scoped env?
     // If not, return NOT_FOUND. This is the reachability control.
-    if !parent.env.allows(&name) {
+    // (ADR-024: the reachability check consults parent.scoped_env, not
+    // parent.env — env is now the dispatch trait, scoped_env is the data.)
+    if !parent.scoped_env.allows(&name) {
         return ResponseEnvelope::not_found(name);
     }
@@ -351,8 +357,10 @@ async fn invoke(&self, namespace: &str, operation: &str, input: Value,
         handler_identity: registration.composition_authority.clone(), // C1: child's own authority
         capabilities: parent.capabilities.clone(), // C3: propagate through composition
         metadata: HashMap::new(), // fresh — does NOT propagate (ADR-014)
-        env: registration.scoped_env.clone()
+        // env/scoped_env split by ADR-024:
+        scoped_env: registration.scoped_env.clone()
             .unwrap_or_else(ScopedOperationEnv::empty), // C2: child's own scoped env
+        env: parent.env.clone(), // child inherits parent's composite env (Arc::clone)
         internal: true, // composition — ACL against handler_identity
     };
     self.registry.invoke(&name, input, context).await
@@ -580,16 +588,27 @@ the fuzzer validates the implementation against the spec.
   cascade tree; `parent_request_id` indexes it)
 - ADR-017: Call protocol client and adapter contract (adapter-registered
   ops are `Internal` by default; this ADR's provenance makes that explicit)
+- ADR-024: Operation registry layering (amends this ADR's Decision 5: the
+  `env` field shown in `build_root_context` and `invoke()` is split into
+  `scoped_env: ScopedOperationEnv` (reachability data, populated from the
+  bundle's `scoped_env`) and `env: Arc<dyn OperationEnv + Send + Sync>`
+  (dispatch trait object). The split is required by ADR-024's overlay model
+  — the trait-object design is what enables connection and session overlays
+  to compose. The `HandlerRegistration` bundle shape, provenance model,
+  composition authority, and capability injection specified by this ADR
+  are unchanged.)
 - ADR-008: Vault integration point (assembly layer is the trust boundary)
 - OQ-19: Session-scoped operation registries (session ops are `Session`
   provenance, always `Internal`, compose under restricted authority)
 - docs/reviews/001-pre-implementation-architecture-sanity-check.md (findings
   C1–C4, which this ADR resolves)
+- docs/reviews/002-pre-implementation-architecture-sanity-check.md (finding
+  C6, resolved by ADR-024's `env`/`scoped_env` split)
 - `/workspace/@alkdev/flowgraph/README.md` — operation graph, call graph, and
   scoped subgraph concepts (the graph model this ADR uses as framing)
 - `/workspace/@alkdev/alknet-main/docs/architecture/flowgraph.md` — prior
   Rust speccing of flowgraph (incomplete; this ADR uses the model, not the
   crate)
 - Kernel/user mode analogy: `getaddrinfo` runs under kernel authority, not
-  the caller's `CAP_NET_RAW`; the curated entry point exists to do things the
-  user can't, on the user's behalf
+  the caller's `CAP_NET_RAW`; the curated entry point exists to do things
+  the user can't, on the user's behalf
483
docs/architecture/decisions/024-operation-registry-layering.md
Normal file
@@ -0,0 +1,483 @@
# ADR-024: Operation Registry Layering

## Status

Accepted

## Context

The architecture has two registries that the spec documents previously treated
as sharing one immutability argument:

1. **The endpoint's `HandlerRegistry`** (ALPN string → `ProtocolHandler`).
   This is what ADR-010 and OQ-04 are about. Its immutability is load-bearing:
   ALPN strings are baked into the TLS `ServerConfig` at startup, so adding a
   protocol handler at runtime requires rebuilding the TLS config. This is a
   genuine one-way door and the rationale is correct.

2. **The call protocol's `OperationRegistry`** (operation name →
   `HandlerRegistration`). This lives *inside* the `CallAdapter`, which is one
   `ProtocolHandler` behind the single ALPN `alknet/call`. Adding an operation
   to the `OperationRegistry` does **not** touch the TLS `ServerConfig` — the
   ALPN is already `alknet/call`, registered once at startup.

`operation-registry.md` stated the operation registry "is immutable after
construction… consistent with OQ-04 and ADR-010." That inheritance was by
analogy, not by shared rationale. The TLS argument that justifies
`HandlerRegistry` immutability does not apply to the `OperationRegistry`. The
operation registry's mutability profile is a separate question, and it has been
answered incorrectly by inheriting a constraint that belongs to a different
registry.

### Why `from_call` breaks the inherited constraint

The import adapters have different lifecycle requirements:

- **`from_openapi` / `from_mcp`** can run at startup — the assembly layer reads
  a static spec file or queries a known service before the registry is frozen.
  Static import, fits immutability.
- **`from_call`** requires a **live connection** to discover operations
  (`services/list` + `services/schema`). Connections happen at runtime.
  Workers join and leave dynamically in the machine→worker topology. You
  cannot pre-freeze a set you discover over a connection you haven't opened
  yet.

So `from_call` is structurally incompatible with "frozen at startup, never
touched again." The pre-ADR-024 spec held two contradictory positions: the
registry is immutable (operation-registry.md), and `from_call` imports remote
operations at connection time (ADR-017). An implementer would have to resolve
the contradiction by guessing — likely by either forcing all `from_call`
imports to happen at startup (awkward, doesn't fit worker topologies) or
quietly making the registry mutable (undermining the stated constraint without
acknowledging it).

### Why immutability is not the load-bearing security control for imported ops

Imported operations (`FromOpenAPI`, `FromMCP`, `FromCall`) are leaves — they
cannot compose (ADR-022 Assumption 5). They have no composition authority, no
scoped env, `Internal` visibility by default, and their trust model is "the
remote endpoint is trusted as much as my own handlers" (ADR-017). Their
reachability from a composing handler is bounded by the *parent handler's*
scoped env, not by their registration timing.

The security controls on imported ops are **provenance** and **composition
authority** — both set at registration, both checked at dispatch. Immutability
is redundant here. An imported op registered at runtime is no more or less
privileged than one registered at startup; it's a forwarding stub either way,
and its capacity to do harm is bounded by what the *composing parent*'s
authority and scoped env permit.

Immutability *is* load-bearing for **curated** operations — the `Local` ops
the assembly layer writes at startup, which *can* compose and therefore *can*
escalate privilege under their own authority. For those, the trust boundary is
"the assembly layer declared them at startup," and immutability is what locks
that declaration. But that's a constraint on `Local` provenance specifically,
not on the registry as a whole.

### The trust-boundary principle

The right axis is not visibility (`Internal` vs `External`) or wire-vs-local —
it is **provenance combined with import timing**, which maps to where each
operation's trust decision is made:

| Provenance | Import timing | Trust boundary | Layer | Lifetime |
|-----------|---------------|----------------|-------|----------|
| `Local` | Startup | Assembly layer at startup | 0 (curated) | Process — immutable |
| `Session` | Sandbox creation | Composing handler at sandbox creation | 1 (session) | Session — dynamic |
| `FromCall` | Connection (runtime) | Remote node at connection time | 2 (connection) | Connection — dynamic |
| `FromOpenAPI` / `FromMCP` | Startup | External endpoint, discovered at startup | 0 (curated) | Process — immutable |
| `FromOpenAPI` / `FromMCP` | Runtime (rare) | External endpoint, discovered at runtime | 2 (discovery) | Discovery-scoped — dynamic |

`FromOpenAPI` / `FromMCP` provenance is **layer-polymorphic**: the same
provenance lands in Layer 0 (immutable) or Layer 2 (dynamic) depending on
when the import happens. The common case is startup import into Layer 0
(Decision 6); runtime import into Layer 2 is permitted but rare.

**Immutability follows the trust boundary.** Operations are mutable at the
scope where their trust decision is made. `Local` ops (and startup-imported
`FromOpenAPI`/`FromMCP`) are trusted at startup → immutable. Session ops
are trusted at sandbox creation → session-scoped dynamic. `FromCall` ops
(and runtime-imported `FromOpenAPI`/`FromMCP`) are trusted at
connection/discovery time → connection/runtime dynamic.

Session ops are the edge case that proves the rule: they are `Internal`
visibility and can compose, but their trust boundary is per-session (the
parent handler grants them restricted authority at sandbox creation, per
ADR-022 Assumption 6), not per-startup. Visibility alone would misclassify
them; provenance correctly identifies them as dynamic.

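The table above can be read as a total function from (provenance, import timing) to layer. A minimal sketch of that mapping follows; the `ImportTiming` and `Layer` enums and the `layer_for` and `is_immutable` functions are hypothetical names introduced for illustration and are not part of the spec.

```rust
/// Provenance labels as used in the ADR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provenance {
    Local,
    Session,
    FromCall,
    FromOpenApi,
    FromMcp,
}

/// When the registration happened (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ImportTiming {
    Startup,
    SandboxCreation,
    Runtime,
}

/// The three layers named in the table (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Layer {
    Curated,  // Layer 0
    Session,  // Layer 1
    Imported, // Layer 2
}

/// Maps a (provenance, timing) pair to its layer. Returns `None` for
/// combinations the table does not list (for example `Local` at runtime).
pub fn layer_for(provenance: Provenance, timing: ImportTiming) -> Option<Layer> {
    use ImportTiming::*;
    use Provenance::*;
    match (provenance, timing) {
        (Local, Startup) => Some(Layer::Curated),
        (Session, SandboxCreation) => Some(Layer::Session),
        (FromCall, Runtime) => Some(Layer::Imported),
        // Layer-polymorphic provenances: timing decides the layer.
        (FromOpenApi | FromMcp, Startup) => Some(Layer::Curated),
        (FromOpenApi | FromMcp, Runtime) => Some(Layer::Imported),
        _ => None,
    }
}

/// Immutability follows the trust boundary: only Layer 0 is frozen.
pub fn is_immutable(layer: Layer) -> bool {
    layer == Layer::Curated
}
```

Returning `None` for unlisted pairs keeps the sketch honest about what the table covers; how a real implementation treats such pairs is not stated in the ADR.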
### The precedent: `IdentityProvider`

The structural problem — *N consumers need to resolve something from M
sources, don't globalize the sources into one pot, don't make each consumer
know about all sources* — is the same problem `IdentityProvider` solves for
auth (ADR-004). An `IdentityProvider` is a trait (`Arc<dyn IdentityProvider>`)
that centralizes resolution policy behind a stable interface; source
composition is an impl detail. Handlers consume the result; the trait owns the
routing.

`OperationEnv` is the same problem one layer over: *N handlers need to
dispatch to operations, operations come from M sources (curated local, this
session, this peer connection, that peer connection), don't globalize all
sources into one mutable pot, don't make each handler know about all sources
and pick the right registry.* The solution is the same shape: a trait —
`Arc<dyn OperationEnv>` — that centralizes dispatch routing behind a stable
interface, with overlay composition as an impl detail.

The alternative — a single global `ArcSwap<OperationRegistry>` into which all
imported ops merge with namespace prefixes — is the registry equivalent of
"every handler reads identity from a global env var." It works at one
connection. At many connections it produces: an unbounded pot, namespace
collisions scaling with connection count, disconnect cleanup requiring a
reverse index (op → owning connection), zero source isolation, and
routing-by-naming-convention instead of routing-by-structure. That is the
failure mode the `IdentityProvider` pattern exists to prevent.

## Decision

### 1. The operation registry is layered by trust boundary

The `OperationRegistry` is not a single flat map. It is a layered structure
where each layer corresponds to a trust boundary:

```
Layer 0 — Curated (static, immutable, startup trust boundary)
  Local provenance operations from the assembly layer.
  Registered once at startup, never mutated for the process lifetime.
  This is where immutability is load-bearing: these ops can compose,
  therefore can escalate privilege under their own authority. The
  startup trust boundary + immutability is the security control.

Layer 1 — Session (dynamic, per-session, sandbox-creation trust boundary)
  Session provenance operations, agent-written, sandboxed.
  Created and destroyed with each session.
  Already specified by OQ-19 as an overlay on Layer 0.

Layer 2 — Imported (dynamic, per-connection, peer trust boundary)
  FromCall operations discovered when a peer connects.
  FromOpenAPI / FromMCP operations when imported at runtime (rare;
  usually at startup into Layer 0, but runtime import is permitted).
  Created and destroyed with the connection / discovery event.
```

Layers 1 and 2 are the same shape: **per-scope dynamic overlays on the static
curated base.** The scope is "session" for Layer 1 and "connection" (or
"discovery event") for Layer 2. OQ-19 already specified the overlay mechanism
for Layer 1 (session env wraps global env via `OperationEnv` trait layering).
This ADR generalizes the same mechanism to Layer 2.

### 2. The `OperationEnv` trait is the integration point

`OperationContext.env` is `Arc<dyn OperationEnv + Send + Sync>` — a trait
object, not a concrete struct. This is required by the overlay model: a
composite env (curated base + connection overlay + session overlay) is built
by composing `OperationEnv` impls, not by merging registries.

This resolves review #002 finding C6 (`OperationContext.env` type identity
crisis). The pre-ADR-024 spec had `env: OperationEnv` (a bare trait, which
cannot be a field type without `dyn` behind a pointer such as `Arc`) and used
the same field as both a reachability set (`parent.env.allows()`) and a
dispatch trait (`context.env.invoke()`). One field cannot be both. The split:

- `scoped_env: ScopedOperationEnv` — reachability data. Populated from the
  registration bundle's `scoped_env` (ADR-022). The reachability check in
  `invoke()` consults `parent.scoped_env.allows(&name)`.
- `env: Arc<dyn OperationEnv + Send + Sync>` — dispatch trait. The handler
  calls `context.env.invoke(...)`; the trait impl routes to the right
  overlay.

This is the `IdentityProvider`-shaped integration point: handlers consume
the trait; source composition is an impl detail.

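A minimal sketch of the split described in Decision 2, under stated assumptions: the trait here is synchronous and string-typed so that it runs without an async executor, whereas the spec's `invoke` is async and takes a `Value`. The `compose` helper and `EchoEnv` are hypothetical names used only to exercise the two fields.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Reachability data: the operation names a handler may compose.
#[derive(Clone, Default)]
pub struct ScopedOperationEnv {
    allowed: HashSet<String>,
}

impl ScopedOperationEnv {
    pub fn new<I: IntoIterator<Item = &'static str>>(names: I) -> Self {
        Self { allowed: names.into_iter().map(String::from).collect() }
    }

    pub fn allows(&self, name: &str) -> bool {
        self.allowed.contains(name)
    }
}

/// Dispatch trait (simplified: sync, string in / string out).
pub trait OperationEnv {
    fn invoke(&self, name: &str, input: &str) -> Result<String, String>;
}

/// Only the two fields relevant to the split are shown.
pub struct OperationContext {
    pub scoped_env: ScopedOperationEnv,           // reachability data
    pub env: Arc<dyn OperationEnv + Send + Sync>, // dispatch trait object
}

/// Reachability is checked against `scoped_env`; dispatch goes through `env`.
pub fn compose(parent: &OperationContext, name: &str, input: &str) -> Result<String, String> {
    if !parent.scoped_env.allows(name) {
        return Err(format!("NOT_FOUND: {name}"));
    }
    parent.env.invoke(name, input)
}

/// Trivial env that echoes the call; exists only for the sketch.
pub struct EchoEnv;

impl OperationEnv for EchoEnv {
    fn invoke(&self, name: &str, input: &str) -> Result<String, String> {
        Ok(format!("{name}({input})"))
    }
}
```

With the fields separated, the reachability check never needs to know how dispatch is routed, and the dispatch trait never needs to expose a membership test.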
### 3. The `CallAdapter` composes the root env per incoming call

When a `call.requested` arrives over connection C, the `CallAdapter` does
not look up the operation in a single global registry. It composes the root
`OperationContext.env` from the layers active for this call:

```
root env = CompositeOperationEnv {
    base: curated_registry_env,         // Layer 0 — static
    connection: C.imported_operations,  // Layer 2 — this connection's overlay
    session: active_session_overlay,    // Layer 1 — if a session is active
}
```

The composite impl checks overlays in order (session first, then connection,
then curated base) and dispatches to the first match. This is structural
source binding: a handler composing `worker/exec` reaches it via the
connection overlay that contains it, not via a naming convention in a
global pot.

**Env inheritance through composition**: the child's `env` is
`parent.env.clone()` — an `Arc::clone`, not a re-composition. Overlay
composition happens once at the root (in `build_root_context`) and
propagates by `Arc` through the composition tree. A child handler sees the
same active overlays its parent saw. This is deliberate: re-composing per
`invoke()` would re-resolve overlays on every dispatch and would break the
session-overlay case (a session that was active when the parent ran must
still be active for the child, even if the session ended mid-composition —
the child is part of the same call tree the parent started). The root env
is composed per incoming call; nested calls inherit it by `Arc::clone`.

When connection C disconnects, its overlay is dropped. Operations imported
from C vanish from the reachable set with no global mutation and no reverse
index. Handlers that try to compose a now-gone op receive `NOT_FOUND` (if
the overlay was already dropped when `invoke()` runs the reachability
check) or a connection error with code `INTERNAL` (if the call was
dispatched to the forwarding handler and the connection drops mid-flight).
Both cases are clean failures — no stale-handler-binds-to-dead-connection
hazard.

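The lookup order and the `Arc::clone` inheritance described in Decision 3 can be sketched as follows. This is an illustration under assumptions, not the spec's implementation: `MapEnv`, `DynEnv`, `build_root_env`, and the two sample handlers are hypothetical, the trait is synchronous, and a `None` return stands in for "this layer does not contain the op".

```rust
use std::collections::HashMap;
use std::sync::Arc;

pub trait OperationEnv {
    /// Returns `None` when this env does not contain `name`.
    fn invoke(&self, name: &str, input: &str) -> Option<String>;
}

pub type DynEnv = Arc<dyn OperationEnv + Send + Sync>;

/// One layer: a plain name -> handler map.
#[derive(Default)]
pub struct MapEnv {
    ops: HashMap<String, fn(&str) -> String>,
}

impl MapEnv {
    pub fn with(mut self, name: &str, handler: fn(&str) -> String) -> Self {
        self.ops.insert(name.to_string(), handler);
        self
    }
}

impl OperationEnv for MapEnv {
    fn invoke(&self, name: &str, input: &str) -> Option<String> {
        self.ops.get(name).map(|handler| handler(input))
    }
}

/// Composite of the layers active for one incoming call.
pub struct CompositeOperationEnv {
    pub session: Option<DynEnv>,    // Layer 1, if a session is active
    pub connection: Option<DynEnv>, // Layer 2, this connection's overlay
    pub base: DynEnv,               // Layer 0, curated
}

impl OperationEnv for CompositeOperationEnv {
    fn invoke(&self, name: &str, input: &str) -> Option<String> {
        // Session first, then connection, then curated base; first match wins.
        self.session
            .iter()
            .chain(self.connection.iter())
            .chain(std::iter::once(&self.base))
            .find_map(|layer| layer.invoke(name, input))
    }
}

pub fn curated_add(input: &str) -> String {
    format!("curated:{input}")
}

pub fn worker_exec(input: &str) -> String {
    format!("worker:{input}")
}

/// Composes a root env once per incoming call (no session in this sketch).
pub fn build_root_env(connection: Option<DynEnv>) -> DynEnv {
    let base: DynEnv = Arc::new(MapEnv::default().with("math/add", curated_add));
    Arc::new(CompositeOperationEnv { session: None, connection, base })
}
```

A child context would hold `Arc::clone(&root)` rather than calling `build_root_env` again, so it sees exactly the overlays its parent saw. A root env built after the connection is gone (`build_root_env(None)`) simply has no layer that answers for `worker/exec`.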
### 4. Curated operations remain immutable; imported and session ops are dynamic

The blanket immutability claim in `operation-registry.md` is replaced by:

- **Layer 0 (curated, `Local`)**: immutable after startup. The
  `OperationRegistry` holding curated ops is constructed once by the
  assembly layer and never mutated. This is where the security argument for
  immutability applies: composing ops are privileged, the startup trust
  boundary is where that privilege is granted, immutability locks it.
- **Layer 1 (session, `Session`)**: dynamic, per-session. Created at sandbox
  creation, destroyed at session end. Already specified by OQ-19.
- **Layer 2 (imported, `FromCall` etc.)**: dynamic, per-connection. Created
  when a peer connection completes `from_call` discovery, destroyed when the
  connection closes.

Adding a `Local` op at runtime is not supported — it would require re-entering
the startup trust boundary, which is a deployment (restart), not a runtime
operation. This preserves the security property ADR-010/OQ-04 were concerned
with, scoped to where it actually applies.

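One way to make "constructed once, never mutated" hold by construction is a consuming builder: the mutable type exists only during assembly, and the frozen type exposes no mutation API. The sketch below assumes this approach; `CuratedRegistryBuilder`, `CuratedRegistry`, `freeze`, and the `health` handler are hypothetical names, and the ADR does not prescribe this shape.

```rust
use std::collections::HashMap;
use std::sync::Arc;

pub type Handler = fn(&str) -> String;

/// Mutable only while the assembly layer is wiring things up at startup.
#[derive(Default)]
pub struct CuratedRegistryBuilder {
    ops: HashMap<String, Handler>,
}

impl CuratedRegistryBuilder {
    pub fn register(mut self, name: &str, handler: Handler) -> Self {
        self.ops.insert(name.to_string(), handler);
        self
    }

    /// Consumes the builder, so nothing can be registered afterwards.
    pub fn freeze(self) -> Arc<CuratedRegistry> {
        Arc::new(CuratedRegistry { ops: self.ops })
    }
}

/// Frozen Layer 0: no `&mut self` methods and no interior mutability.
pub struct CuratedRegistry {
    ops: HashMap<String, Handler>,
}

impl CuratedRegistry {
    pub fn get(&self, name: &str) -> Option<Handler> {
        self.ops.get(name).copied()
    }
}

/// Sample curated handler used only by this sketch.
pub fn health(_input: &str) -> String {
    "ok".to_string()
}
```

Because `freeze` takes `self` by value, a later `register` call on the same builder is a compile error rather than a runtime check.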
### 5. `from_call` imports into the connection's overlay, not the global registry

The `from_call` adapter (ADR-017) discovers operations on a remote peer and
produces `HandlerRegistration` bundles. Under ADR-024, those bundles are
registered into the **connection's overlay**, not a global mutable registry.

```rust
// On CallConnection establishment:
let imported = from_call(&connection, config).await;
connection.imported_operations.extend(imported);
// The connection's env now includes these ops.
```

The handler closures produced by `from_call` capture the `CallConnection` —
when the connection drops, the handlers become unreachable (their env is
dropped), and any in-flight calls to them return connection errors. This is
the natural lifecycle; no explicit deregistration is needed.

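The lifecycle in Decision 5 can be modelled with ownership alone. In the sketch below the connection owns its transport and its overlay, and each forwarding stub holds only a `Weak` reference to the transport, so an in-flight handler that outlives the connection fails cleanly instead of binding to a dead connection. All names except `CallConnection` and `imported_operations` are hypothetical, and the use of `Weak` is an assumption of this sketch rather than something the ADR specifies.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Weak};

/// Stand-in for the transport half of a connection.
pub struct Transport {
    pub peer: String,
}

impl Transport {
    fn forward(&self, op: &str, input: &str) -> String {
        format!("{}:{}({})", self.peer, op, input)
    }
}

/// Forwarding stub of the kind a `from_call` import would produce.
pub struct ForwardingHandler {
    op: String,
    transport: Weak<Transport>,
}

impl ForwardingHandler {
    pub fn call(&self, input: &str) -> Result<String, &'static str> {
        match self.transport.upgrade() {
            Some(transport) => Ok(transport.forward(&self.op, input)),
            // The connection is gone: report a connection error.
            None => Err("INTERNAL: connection closed"),
        }
    }
}

pub struct CallConnection {
    transport: Arc<Transport>,
    /// The connection's overlay (Layer 2), dropped with the connection.
    pub imported_operations: HashMap<String, Arc<ForwardingHandler>>,
}

impl CallConnection {
    /// `discovered` stands in for the result of remote discovery.
    pub fn establish(peer: &str, discovered: &[&str]) -> Self {
        let transport = Arc::new(Transport { peer: peer.to_string() });
        let imported_operations = discovered
            .iter()
            .map(|op| {
                let handler = ForwardingHandler {
                    op: op.to_string(),
                    transport: Arc::downgrade(&transport),
                };
                (op.to_string(), Arc::new(handler))
            })
            .collect();
        Self { transport, imported_operations }
    }

    pub fn peer(&self) -> &str {
        &self.transport.peer
    }
}
```

Dropping the `CallConnection` drops the overlay map and the only strong reference to the transport in one step, which is the "no explicit deregistration" property the section describes.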
### 6. `from_openapi` and `from_mcp` default to startup import into Layer 0

For the common case — the assembly layer imports a static OpenAPI spec or
connects to a known MCP server at startup — `from_openapi` / `from_mcp`
register into the curated (Layer 0) registry, which is then frozen. This
preserves the pre-ADR-024 behavior for the case where it was correct.

Runtime `from_openapi` / `from_mcp` import (e.g., discovering an MCP server
at connection time) is permitted and follows the Layer 2 model — the imported
ops live in a connection/discovery-scoped overlay. This is additive and
does not affect the startup-import path.

### 7. OQ-04 scope clarification and OQ-19 generalization

This ADR amends OQ-04 to scope its immutability claim to the
**`HandlerRegistry`** (ALPN-level, ADR-010). The `OperationRegistry`'s
mutability profile is now governed by this ADR: curated (Layer 0) is
immutable; session and imported layers are dynamic at their trust-boundary
scopes. See the OQ-04 amendment in `open-questions.md`.

This ADR generalizes OQ-19's session-overlay mechanism to also cover
connection-scoped remote imports. Both are per-scope dynamic overlays on the
static curated base, composed into the per-call `OperationContext.env` by
the `CallAdapter`. `OperationEnv` being a trait object is what enables
both. See the OQ-19 resolution update in `open-questions.md`.

## Consequences

**Positive:**

- `from_call` has a coherent home. Imported ops live with the connection
  that produced them, appear when the connection is established, and
  disappear when it closes. No contradiction with immutability, no awkward
  "import everything at startup" workaround.
- The immutability argument is now correctly scoped. Layer 0 (curated,
  composing ops) is immutable because that's where the security control
  applies. Layers 1 and 2 are dynamic because their trust boundaries are
  per-scope. An implementer reading the spec sees the right constraint in
  the right place, instead of a blanket claim that doesn't fit all cases.
- The `OperationEnv`-as-trait constraint (OQ-19) is now required by the
  overlay model, not just by the session-overlay pattern. The same
  mechanism (trait layering) supports both session overlays and connection
  overlays — one pattern, two scopes. This makes C6's resolution
  (`env: Arc<dyn OperationEnv>`) structurally motivated, not just a
  type-system cleanup.
- Disconnect handling is structural. A connection drops → its overlay drops
  → its ops vanish from the reachable set. No `ArcSwap` coordination, no
  reverse index from op to owning connection, no stale handlers bound to a
  dead connection. This is the same lifecycle property session overlays
  already have (session ends → session overlay drops).
- Source isolation is structural. Imported ops from peer X are only
  reachable from handlers whose `OperationEnv` is wired to X's overlay.
  They are not globally callable. A handler that shouldn't be able to
  reach peer X's ops simply doesn't have X's overlay in its env. This is
  better hygiene than a global registry with namespace prefixes, where
  every handler sees every imported op and isolation is a naming
  convention.
- The `IdentityProvider` precedent makes the design legible. A future
  reader sees "trait-object integration point, source composition as impl
  detail" and recognizes the pattern; they don't have to re-derive why
  trait-composed overlays were chosen over a global mutable registry.

**Negative:**

- The dispatch path is a composite lookup (session → connection → curated)
  rather than a single `HashMap` lookup. This is a small constant cost —
  three hash lookups in the worst case instead of one — and the overlays are
  small (a session's ops, a connection's imported ops). The common case
  (composing a curated op) hits Layer 0 after two empty-overlay misses, which
  is a predictable and cache-friendly path. The cost is justified by the
  source isolation and lifecycle properties it buys.
- `OperationContext.env` is now `Arc<dyn OperationEnv + Send + Sync>`, which
  is a trait object with dynamic dispatch. This is the same cost as
  `Arc<dyn IdentityProvider>` — a vtable call per `invoke()`. Negligible
  relative to the work an operation does, and the same pattern the codebase
  already uses for auth.
- The `CallAdapter` has more responsibility: it composes the root env per
  call from the active layers, rather than handing every call the same
  global registry. This is expected — the CallAdapter is the integration
  point for the call protocol, and per-call env composition is the same
  shape as per-call identity resolution (which the CallAdapter already does
  via `IdentityProvider`).
- Naming across overlays: if two connections import ops with the same name
  (e.g., both peers expose `worker/exec`), the composite env dispatches to
  the first overlay that contains the name. This is the same ambiguity
  `FromCallConfig`'s namespace prefix (ADR-017) was designed to address —
  the caller disambiguates with a prefix at import time. ADR-024 does not
  change this; it makes the disambiguation structural (which overlay is in
  the env) rather than nominal (which prefix is in the name).
- The blanket immutability claim in `operation-registry.md` and the
  cross-references that inherit it (the "Two-way door —
  `ArcSwap<OperationRegistry>` can be added later" note, OQ-04's framing)
  must be updated. This is a spec edit, not a migration — no implementation
  exists yet.

**On review #002 findings resolved by this ADR:**

- **C6** (`OperationContext.env` type identity crisis): resolved by Decision 2.
  The field is split into `scoped_env` (reachability data) and `env` (dispatch
  trait object). The split is structurally motivated by the overlay model,
  not just a type-system cleanup.
- **W4** (hot-swap ↔ registry mutability coupling): localized to the
  connection scope. There is no global mutable registry to hot-swap.
  Overlays are per-scope and replace naturally with connect/disconnect and
  session start/end. The schema-drift hazard (a peer re-runs
  `services/list` on reconnect and re-imports with a changed schema) moves
  from global to per-connection — it does not vanish. A handler
  mid-composition whose peer reconnects with a changed schema sees the old
  schema until the overlay is rebuilt. This is a per-connection concern,
  not a global one; the guard clause the review asked for becomes a note on
  overlay rebuild semantics rather than a global hot-swap protocol.
- **W3** (CallClient registry security dimension): partially addressed. The
  *registry-shape* sub-question is resolved by the overlay model — a
  `CallClient`'s incoming-call dispatch uses the same overlay composition,
  and sharing the curated base with a remote peer is fine (curated ops are
  trusted). The *capability-exposure* sub-question (a remote peer calling
  `/llm/generate` uses the local node's API key) is **not resolved by this
  ADR** — it is a separate concern about what capabilities a remote peer
  can trigger, and it is unaffected by the registry shape. That sub-question
  remains open for ADR-017 (a guard-clause note: a peer-scoped subset must
  filter by capability remote-safety, not just operation name). ADR-024
  resolves the dispatch shape; ADR-017 retains the capability-exposure
  decision.

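The W4 note leaves overlay rebuild semantics open. One possible reading, sketched here purely as an assumption, is snapshot semantics: each root env takes an `Arc` snapshot of the connection's overlay, a rebuild swaps in a new `Arc`, and call trees already in flight keep the snapshot they started with. `ConnectionSlot`, `Overlay`, and the schema-version integers are hypothetical and do not appear in the spec.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Operation name -> schema version (a stand-in for real schema data).
pub type Overlay = HashMap<String, u32>;

/// Holds the current overlay for one connection.
pub struct ConnectionSlot {
    current: Mutex<Arc<Overlay>>,
}

impl ConnectionSlot {
    pub fn new(overlay: Overlay) -> Self {
        Self { current: Mutex::new(Arc::new(overlay)) }
    }

    /// Taken once when a root env is composed for an incoming call.
    pub fn snapshot(&self) -> Arc<Overlay> {
        self.current.lock().unwrap().clone()
    }

    /// Called after a reconnect has re-run discovery.
    pub fn rebuild(&self, overlay: Overlay) {
        *self.current.lock().unwrap() = Arc::new(overlay);
    }
}
```

Under this reading the drift stays inside one connection's slot: a call tree that began before the rebuild finishes against the old schema, and the next incoming call sees the new one.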
## Assumptions
|
||||
|
||||
1. **Provenance is knowable at registration time and stable for the
registration's lifetime.** A `Local` op does not become `FromCall` later;
a `FromCall` op does not become `Local`. If a remote-imported op is later
"promoted" to curated, that's a re-registration at the next startup
(deployment), not a runtime mutation. Inherited from ADR-022 Assumption 2.

2. **Layer 0 immutability is the security control for composing ops.** The
pre-ADR-024 blanket immutability claim was overbroad but not wrong about
`Local` ops. Curated composing ops must be immutable because the startup
trust boundary is where their authority is granted. This ADR narrows the
claim; it does not remove it.

3. **Imported and session ops do not need immutability as a security
control for privilege escalation.** Privilege escalation through these
ops is bounded by provenance (no composition authority → no
privilege escalation) and by the parent handler's scoped env
(reachability control). This is the central argument: if it is wrong,
that is, if a `from_call` op can escalate in some way that provenance and
scoped env don't bound, the model needs revisiting. **Immutability is not
the control for non-escalation threats** (availability, schema drift):
availability is bounded by per-handler timeouts (ADR-016) and by the
connection's overlay being dropped on disconnect; schema drift on
reconnect is a per-connection overlay-rebuild concern (see W4 in
Consequences), not a global-registry-mutation concern. The point of
scoping immutability to Layer 0 is that immutability is the right
control *for composing ops* and the wrong control *for non-composing
ops*; it is not a claim that non-composing ops face no threats.

4. **A connection's overlay is the right scope for `from_call` imports.**
Operations discovered from peer X are reachable from handlers whose env
includes X's overlay. If a use case requires imported ops to be globally
reachable (every handler sees every peer's ops), the composite env can be
built to include all active connection overlays, but the default is
per-connection scoping for isolation.

5. **Disconnect → overlay drop → op vanishes is acceptable behavior.** A
handler composing an op whose peer has disconnected receives `NOT_FOUND`
(or a connection error if the call was already in flight). `NOT_FOUND` is
the same result a handler gets for a peer that never exposed the op. If a
use case requires disconnected-peer ops to remain reachable (e.g., cached
results), that's a handler-level caching concern, not a registry concern.

6. **The root env is composed per incoming call, not cached per
connection.** The active session overlay can change during a connection's
lifetime (a session starts or ends mid-connection), so the env cannot be
composed once at connection establishment and reused. `build_root_context`
runs per `call.requested` and composes the env from the layers active at
that moment. The cost (constructing an `Arc<CompositeOperationEnv>` per
call) is negligible: it is at most three `Arc::clone`s (one per active
layer), not three registry traversals.

7. **Session-overlay attachment is an agent-crate concern.** ADR-024
generalizes OQ-19's session overlay to also cover connection overlays,
but the mechanism by which a session overlay attaches to a given wire
call (session ID in metadata, payload field, connection-bound session
state, etc.) is not specified here. The `CallAdapter` is wired with an
optional session-overlay source by the assembly layer; the lookup
mechanism belongs to the agent crate spec (OQ-19: "the agent-specific
mechanism belongs to the agent crate spec"). If a wire call has no
active session, the root env is `curated base + connection overlay`
(no session layer).
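Assumptions 1 to 3 can be read as a small type-level rule: provenance is fixed when a registration is created, and both the layer an op lives in and its composition authority follow from it. The sketch below illustrates that reading only. `Provenance`, `Layer`, `Registration`, `may_compose`, and `layer` are hypothetical names for this example (the real `HandlerRegistration` bundle is defined by ADR-022), and the rule that only `Local` ops may compose is this sketch's interpretation of "no composition authority" for imported and session ops.

```rust
/// Where a registration came from. Variant names follow the ADR's terms;
/// the enum itself is a hypothetical shape for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provenance {
    Local,
    Session,
    FromCall,
}

/// The layer a registration lives in under the overlay model.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Layer {
    CuratedBase,       // Layer 0: static, immutable after startup
    ConnectionOverlay, // dynamic, dropped on disconnect
    SessionOverlay,    // dynamic, dropped at session end
}

impl Provenance {
    /// Assumption of this sketch: only curated ops hold composition authority.
    pub fn may_compose(self) -> bool {
        matches!(self, Provenance::Local)
    }

    /// Immutability follows the trust boundary: provenance picks the layer.
    pub fn layer(self) -> Layer {
        match self {
            Provenance::Local => Layer::CuratedBase,
            Provenance::FromCall => Layer::ConnectionOverlay,
            Provenance::Session => Layer::SessionOverlay,
        }
    }
}

/// Hypothetical registration record. There is deliberately no setter for
/// `provenance`: changing it means building a new registration.
pub struct Registration {
    name: String,
    provenance: Provenance,
}

impl Registration {
    pub fn new(name: impl Into<String>, provenance: Provenance) -> Self {
        Self { name: name.into(), provenance }
    }

    pub fn name(&self) -> &str {
        &self.name
    }

    pub fn provenance(&self) -> Provenance {
        self.provenance
    }
}
```

Keeping `provenance` private with no mutator is one way to make "promotion is a re-registration, not a runtime mutation" hold by construction rather than by convention.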
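The per-call composition described in Assumptions 4 to 7 can be sketched as below. This is a simplified illustration, not the specified API: the `OperationEnv` trait here has a reduced hypothetical signature, `LayerEnv`, `CallError`, `build_root_env`, `echo`, and `upper` are invented for the example, and the lookup order (curated base first, then connection overlay, then session overlay) is an assumption chosen so that an overlay cannot shadow a curated op.

```rust
use std::collections::HashMap;
use std::sync::Arc;

#[derive(Debug, PartialEq, Eq)]
pub enum CallError {
    NotFound,
}

/// Hypothetical, reduced dispatch trait standing in for the spec's `OperationEnv`.
pub trait OperationEnv: Send + Sync {
    fn invoke(&self, op: &str, input: &str) -> Result<String, CallError>;
}

/// One layer (curated base or a single overlay): op name -> handler.
pub struct LayerEnv {
    ops: HashMap<String, fn(&str) -> String>,
}

impl LayerEnv {
    pub fn new() -> Self {
        Self { ops: HashMap::new() }
    }

    pub fn with(mut self, name: &str, handler: fn(&str) -> String) -> Self {
        self.ops.insert(name.to_string(), handler);
        self
    }
}

impl OperationEnv for LayerEnv {
    fn invoke(&self, op: &str, input: &str) -> Result<String, CallError> {
        self.ops.get(op).map(|h| h(input)).ok_or(CallError::NotFound)
    }
}

/// Composite of the layers active for one call.
pub struct CompositeOperationEnv {
    curated: Arc<dyn OperationEnv>,
    connection: Option<Arc<dyn OperationEnv>>,
    session: Option<Arc<dyn OperationEnv>>,
}

impl OperationEnv for CompositeOperationEnv {
    fn invoke(&self, op: &str, input: &str) -> Result<String, CallError> {
        // Assumed order: curated first, then overlays. Absent layers are skipped.
        let overlays = self.connection.iter().chain(self.session.iter());
        for layer in std::iter::once(&self.curated).chain(overlays) {
            match layer.invoke(op, input) {
                Err(CallError::NotFound) => continue,
                found => return found,
            }
        }
        Err(CallError::NotFound)
    }
}

/// Runs once per incoming call: clones at most three `Arc`s, walks no registry.
pub fn build_root_env(
    curated: &Arc<dyn OperationEnv>,
    connection: Option<&Arc<dyn OperationEnv>>,
    session: Option<&Arc<dyn OperationEnv>>,
) -> Arc<dyn OperationEnv> {
    Arc::new(CompositeOperationEnv {
        curated: Arc::clone(curated),
        connection: connection.cloned(),
        session: session.cloned(),
    })
}

/// Hypothetical handlers used only by the example.
pub fn echo(s: &str) -> String {
    s.to_string()
}

pub fn upper(s: &str) -> String {
    s.to_uppercase()
}
```

In this sketch a child context would inherit the env with `Arc::clone(&root)`, so composition happens once at the root. A root built while peer X is connected resolves `peer-x/upper`; a root built after the overlay is dropped returns `CallError::NotFound` for the same name, which matches the behavior Assumption 5 accepts.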
## References

- ADR-010: ALPN router and endpoint (the `HandlerRegistry` immutability
argument; this ADR clarifies that it applies to the ALPN registry, not
the operation registry)
- ADR-014: Secret material flow and capability injection (capabilities are
per-`HandlerRegistration` bundle, not per-registry, so the overlay model
doesn't change how capabilities flow; an imported op's capabilities come
from its bundle, which for `from_call` is whatever the assembly layer
granted the import)
- ADR-017: Call protocol client and adapter contract (`from_call` adapter;
the `FromCallConfig` namespace prefix is the disambiguation mechanism this
ADR's overlay model uses structurally)
- ADR-022: Handler registration, provenance, and composition authority
(provenance is the axis this ADR's layering is based on; the
`HandlerRegistration` bundle shape is unchanged)
- ADR-004: Auth as shared core (`IdentityProvider`, the precedent for the
trait-object integration point pattern this ADR applies to `OperationEnv`)
- OQ-04: Dynamic handler registration (this ADR amends OQ-04 to scope it to
the `HandlerRegistry`; the operation registry's mutability is now governed
by ADR-024)
- OQ-19: Session-scoped operation registries (this ADR generalizes the
session-overlay mechanism to connection overlays: same pattern, two
scopes)
- docs/reviews/002-pre-implementation-architecture-sanity-check.md
(findings C6, W3, W4: C6 is resolved by this ADR, W4 is localized to the
connection scope, and W3 is partially addressed)