Critical:
- operation-registry: remove stale duplicate OperationEnv impl that
propagated parent.metadata through composition (violated ADR-014);
collapse to one canonical block with metadata: HashMap::new()
- operation-registry: fix request_id collision — format!("env-{name}")
produced identical IDs across concurrent invocations, corrupting
PendingRequestMap correlation and the abort-cascade tree (ADR-016)
- operation-registry + ADR-015: fix OperationContext.internal visibility —
pub field let handlers mark their own call internal (privilege
escalation per ADR-015); change to pub(crate) with pub fn is_internal
Warnings:
- core-types: add Connection::set_identity/identity (OQ-11) to the
Connection type spec — was specified in auth.md but missing from the
type definition
- operation-registry: add Capabilities: Clone design note — invoke()
clones capabilities through composition; explicit security implication
- call-protocol: add CallAdapter root OperationContext construction
example showing internal: false wire path, complementing
OperationEnv::invoke() internal: true composition path
- overview: remove alknet/agent from ALPN registry — agent is a future
consumer of alknet-call (call-protocol operations), not a separate ALPN
- call-protocol: clarify call.requested payload schema and the
leading-slash convention (wire operationId has slash, registry name
does not)
Suggestions:
- operation-registry: cross-reference ResponseEnvelope definition
- core-types: add StreamError to HandlerError mapping table
ADR-015: Privilege Model and Authority Context
Status
Accepted
Context
The call protocol allows handlers to compose other operations through
OperationEnv::invoke(). This creates a call tree: a parent request spawns
children, which may spawn their own children. The parent_request_id field
records this tree.
The previous design had a trusted: bool flag on OperationContext. When a
handler invoked another operation through OperationEnv, the nested call was
marked trusted: true and all ACL checks were skipped. The intent was to
avoid double-checking: if /agent/chat is allowed and it internally calls
/auth/verify, the auth check is "trusted" because the caller already passed
ACL on /agent/chat.
This is a privilege escalation vector. Two concrete attacks:
Buggy handler: a handler accidentally calls an operation it shouldn't. With
trusted: true, ACL is skipped entirely. A handler with read scope that
accidentally calls an operation requiring admin succeeds — the caller's read
scope effectively triggered an admin operation.
Parameterized dispatch: a handler takes caller input that determines which
internal operation to call. This is the core agent use case — an LLM picks which
tool to invoke based on the user's prompt. With trusted: true, the LLM (and
therefore the user) can invoke any registered operation without ACL checks,
regardless of the caller's scopes. A caller with chat scope can invoke
operations requiring admin by choosing the right tool name.
The call protocol is a general-purpose cross-boundary RPC mechanism. Every consumer — NAPI adapter, Python adapter, agent service, future services — inherits whatever privilege model the protocol defines. The privilege boundary between external and internal calls, and the authority context switch for composition, are core protocol semantics. This is not a feature of any single consumer; it is the protocol's security model.
The agent service is a useful test case because it exercises every edge case (parameterized dispatch, deep composition, dynamic operations, role-based escalation), but the decision belongs to the call protocol.
Mental Models
Three analogies clarify the model:
Kernel/user mode: external operations are syscalls — curated entry points
where an unprivileged caller can enter the kernel. Internal operations are
kernel functions — callable only from composition, not from userspace. The
internal flag means "this call is in kernel mode." Kernel mode has access
controls — it runs under a different principal, not with no principal.
Domain/integration events: external operations are integration events —
they cross a boundary and are visible to external systems. Internal operations
are domain events — they stay within the bounded context. services/list is
the integration contract; it only exposes integration events.
Principal/agent (legal contracting): the caller is the principal; the
handler is the agent. The principal delegates scoped authority to the agent.
The agent acts under its own identity (for attribution) but with the principal's
delegated authority (for scope). Liabilities flow upstream (traceable through
parent_request_id); privileges flow downstream (the agent gets a subset of the
principal's authority). Role-based escalation: a lower-privileged role can
escalate through a chain of command (agent requests promotion, architect
performs it), not through direct authority.
Decision
1. The internal flag switches authority context; it does not skip ACL
The internal flag on OperationContext marks calls that originated from
composition (a handler calling another operation via OperationEnv), as opposed
to external calls that arrived as call.requested from a wire client.
When internal: true:
- The ACL check runs against the handler's identity (set at registration by the assembly layer), not the caller's identity and not as a blanket skip.
- The handler's identity carries only the scopes its composition needs require (least privilege). It is neither blanket root nor a copy of the caller's scopes.
When internal: false (external call from the wire):
- The ACL check runs against the caller's identity (from
AuthContext, resolved per-request).
The internal flag is set by OperationEnv, not by callers. A handler cannot
mark its own call as internal. The field is crate-private (pub(crate)), so code
outside the crate can neither construct nor modify it; only
pub fn is_internal(&self) -> bool is exposed for reads.
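A minimal sketch of the authority switch described above. The Identity struct with a set of scope strings, the acl_principal and check_acl functions, and the AclError type are illustrative assumptions, not the registry's actual API. The only rule taken from this ADR is that the internal flag selects which identity the ACL check runs against.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the document's `Identity` type.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub name: String,
    pub scopes: HashSet<String>,
}

impl Identity {
    pub fn new(name: &str, scopes: &[&str]) -> Self {
        Identity {
            name: name.to_string(),
            scopes: scopes.iter().map(|s| s.to_string()).collect(),
        }
    }
}

#[derive(Debug, PartialEq)]
pub enum AclError {
    Forbidden,
}

/// Selects the identity the ACL check runs against:
/// the handler's identity for internal calls, the caller's for external calls.
pub fn acl_principal<'a>(
    internal: bool,
    caller: Option<&'a Identity>,
    handler: Option<&'a Identity>,
) -> Option<&'a Identity> {
    if internal {
        handler
    } else {
        caller
    }
}

/// Runs the ACL check against the selected identity. A missing identity is
/// treated as a failed check (an assumption of this sketch).
pub fn check_acl(
    internal: bool,
    caller: Option<&Identity>,
    handler: Option<&Identity>,
    required_scope: &str,
) -> Result<(), AclError> {
    match acl_principal(internal, caller, handler) {
        Some(id) if id.scopes.contains(required_scope) => Ok(()),
        _ => Err(AclError::Forbidden),
    }
}
```

With a caller holding only chat and a handler holding only auth:verify, an internal call requiring admin is refused, while an internal call requiring auth:verify passes even though the caller lacks that scope. Neither branch skips the check.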
2. Operations have External/Internal visibility
OperationSpec has a visibility: Visibility field:
pub enum Visibility {
External, // Callable from the wire (call.requested from a client)
Internal, // Composition-only (env.invoke from a handler)
}
The assembly layer declares visibility when registering operations.
When a call.requested arrives from a wire client:
- An Internal operation returns call.error with code NOT_FOUND (not FORBIDDEN). This does not leak that the operation exists.
- An External operation proceeds to ACL checking.
services/list only returns External operations to remote callers. Internal
operations are not part of the wire-facing API surface. A remote client cannot
enumerate the internal call tree.
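The wire-side behaviour can be sketched as follows. The Registry type and the names resolve_from_wire, services_list and WireError are hypothetical and exist only for illustration. The sketch assumes the registry can be modelled as a map from operation name to visibility.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Visibility {
    External,
    Internal,
}

#[derive(Debug, PartialEq)]
pub enum WireError {
    NotFound,
}

/// Hypothetical registry: operation name -> declared visibility.
pub struct Registry {
    ops: BTreeMap<String, Visibility>,
}

impl Registry {
    pub fn new() -> Self {
        Registry { ops: BTreeMap::new() }
    }

    pub fn register(&mut self, name: &str, visibility: Visibility) {
        self.ops.insert(name.to_string(), visibility);
    }

    /// Gate for calls arriving from the wire. Internal operations and unknown
    /// operations produce the same error, so a remote caller cannot tell them
    /// apart. ACL checking would follow for the Ok case.
    pub fn resolve_from_wire(&self, name: &str) -> Result<(), WireError> {
        match self.ops.get(name) {
            Some(Visibility::External) => Ok(()),
            Some(Visibility::Internal) | None => Err(WireError::NotFound),
        }
    }

    /// What a remote caller sees from services/list: External operations only.
    pub fn services_list(&self) -> Vec<String> {
        self.ops
            .iter()
            .filter(|(_, v)| **v == Visibility::External)
            .map(|(name, _)| name.clone())
            .collect()
    }
}
```

Returning the same NotFound value for both the internal and the unregistered case is what keeps existence from leaking in this sketch.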
3. Handler identity is carried on OperationContext
OperationContext carries both the caller's identity (who invoked me) and the
handler's identity (who am I acting as):
pub struct OperationContext {
pub request_id: String,
pub parent_request_id: Option<String>,
pub identity: Option<Identity>, // Caller's identity (inbound)
pub handler_identity: Option<Identity>, // Handler's identity (composition authority)
pub capabilities: Capabilities,
pub metadata: HashMap<String, Value>,
pub env: OperationEnv,
/// Crate-private (`pub(crate)`): code outside the crate cannot write it and
/// reads it via `is_internal()`. Set only by `OperationEnv::invoke()` (true)
/// or `CallAdapter` dispatch (false).
pub(crate) internal: bool,
}
impl OperationContext {
pub fn is_internal(&self) -> bool { self.internal }
}
- identity: the authenticated caller (from AuthContext). For external calls, this is who sent the call.requested. For internal calls, this is the parent handler's identity (propagated through OperationEnv::invoke()).
- handler_identity: the identity of the handler processing this call. Set at registration by the assembly layer. For external calls, this is the handler's own identity. For internal calls, the ACL check runs against this identity.
The distinction is the principal/agent model: identity is the principal (who
delegated), handler_identity is the agent (who is acting). Attribution traces
through both — any action can be attributed to the handler that performed it and
the caller that initiated the chain.
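One way to picture how a child context is derived during composition is sketched below. Ctx is a reduced, hypothetical version of OperationContext (no env or capabilities), and root and child are invented names standing in for CallAdapter dispatch and OperationEnv::invoke(). The sketch reads the field descriptions above as: the child's identity is the parent's handler identity, and the child's handler_identity is the callee's registered identity. The process-wide counter that makes request_id unique and the fresh metadata map are assumptions about how such a derivation might look, not the actual implementation.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub name: String,
}

/// Reduced, hypothetical stand-in for `OperationContext`.
#[derive(Debug)]
pub struct Ctx {
    pub request_id: String,
    pub parent_request_id: Option<String>,
    pub identity: Option<Identity>,
    pub handler_identity: Option<Identity>,
    pub metadata: HashMap<String, String>,
    internal: bool,
}

impl Ctx {
    pub fn is_internal(&self) -> bool {
        self.internal
    }

    /// Root context for a call that arrived from the wire (internal: false).
    pub fn root(request_id: &str, caller: Option<Identity>, handler: Option<Identity>) -> Self {
        Ctx {
            request_id: request_id.to_string(),
            parent_request_id: None,
            identity: caller,
            handler_identity: handler,
            metadata: HashMap::new(),
            internal: false,
        }
    }

    /// Child context for a composed call (internal: true).
    pub fn child(&self, op_name: &str, callee_identity: Option<Identity>) -> Ctx {
        // Assumption: a monotonically increasing counter keeps IDs distinct
        // across concurrent invocations of the same operation.
        static NEXT: AtomicU64 = AtomicU64::new(0);
        let n = NEXT.fetch_add(1, Ordering::Relaxed);
        Ctx {
            request_id: format!("env-{op_name}-{n}"),
            parent_request_id: Some(self.request_id.clone()),
            // The parent handler becomes the principal of the child call.
            identity: self.handler_identity.clone(),
            handler_identity: callee_identity,
            // Parent metadata is not propagated into the child.
            metadata: HashMap::new(),
            internal: true,
        }
    }
}
```

Because internal is a private field here, the only ways to obtain a context are root (always false) and child (always true), which mirrors the rule that a handler cannot mark its own call as internal.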
4. Scoped composition env
The OperationEnv given to a handler is scoped — it can only invoke a declared
set of operations. This bounds the parameterized-dispatch attack surface: a
caller (or an LLM) picking which operation to invoke picks from the declared
set, not from the entire registry.
Scoping happens at two levels:
Static scoping at registration: the assembly layer declares which operations
a handler may compose. The OperationEnv given to that handler is pre-filtered
— invoke("fs", "readFile", ...) works, invoke("admin", "deleteUser", ...)
returns NOT_FOUND. This is the reachability control.
Dynamic scoping at sandbox creation: when a handler spawns a sandbox
(quickjs), it passes a further scoped env to the sandbox — a subset of what
the handler itself can reach. The handler might have fs:read and bash:exec,
but it only gives the sandbox fs:read (not bash:exec), because the sandbox
runs untrusted LLM-generated code. This is the "privileges flow downstream"
principle: the principal delegates a subset.
The specific API for declaring the scoped operation set (allowed-operations
list, allowed-namespaces, or a trait-based filter) is a two-way door for
implementation. The TypeScript @alkdev/operations buildEnv() used an
allowedNamespaces filter; the Rust implementation may be finer-grained
(operation-level, not just namespace-level) to be safe.
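Since the concrete API is left open, the following is only one possible shape: an operation-level allow list with a narrowing step for sandboxes. ScopedEnv, check, narrow and EnvError are hypothetical names, and the namespace/operation key format is an assumption made for this sketch.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
pub enum EnvError {
    NotFound,
}

/// Hypothetical scoped env: the set of operations a handler may compose,
/// keyed as "namespace/operation".
#[derive(Debug, Clone)]
pub struct ScopedEnv {
    allowed: BTreeSet<String>,
}

impl ScopedEnv {
    /// Static scoping: the assembly layer declares the reachable operations.
    pub fn new(allowed: &[&str]) -> Self {
        ScopedEnv {
            allowed: allowed.iter().map(|s| s.to_string()).collect(),
        }
    }

    /// Reachability check performed before any dispatch. Operations outside
    /// the declared set look as if they do not exist.
    pub fn check(&self, namespace: &str, operation: &str) -> Result<(), EnvError> {
        let key = format!("{namespace}/{operation}");
        if self.allowed.contains(&key) {
            Ok(())
        } else {
            Err(EnvError::NotFound)
        }
    }

    /// Dynamic scoping: derive an env for a sandbox. The result is the
    /// intersection with the current set, so narrowing can never add reach.
    pub fn narrow(&self, subset: &[&str]) -> ScopedEnv {
        let allowed = subset
            .iter()
            .filter(|name| self.allowed.contains(**name))
            .map(|name| name.to_string())
            .collect();
        ScopedEnv { allowed }
    }
}
```

Building narrow as an intersection is the design choice that encodes "privileges flow downstream": a sandbox that asks for admin/deleteUser still cannot reach it if the handler itself could not.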
5. The three controls together
The three controls are independent and all are needed:
| Control | What it gates | Without it |
|---|---|---|
| Operation visibility | Whether an operation is callable from the wire | Internal operations exposed to external callers |
| Handler identity | What authority composition runs under | ACL skipped or caller's scopes propagated (escalation) |
| Scoped composition env | What operations a handler can reach | Handler can call anything in the registry |
- Visibility alone: internal operations are hidden from the wire, but composition skips ACL (escalation through buggy handler).
- Handler identity alone: ACL checks against handler scopes, but the handler can reach any operation (parameterized dispatch unbounded).
- Scoped env alone: handler can only reach declared operations, but ACL is skipped (if a declared operation requires a scope the handler doesn't have, it still runs).
All three together: the handler can only reach declared operations (scoped env), those operations are ACL-checked against the handler's scoped identity (handler identity), and internal operations are never exposed to the wire (visibility). Principle of least privilege.
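The interaction of the three controls can be sketched as a single authorization function. Everything here is illustrative: OpSpec with a single required_scope, the Origin enum, the authorize function and the order of the checks are assumptions made for this sketch rather than the protocol's defined API.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Visibility {
    External,
    Internal,
}

#[derive(Debug, PartialEq)]
pub enum CallError {
    NotFound,
    Forbidden,
}

/// Hypothetical registration record: visibility plus one required scope.
pub struct OpSpec {
    pub visibility: Visibility,
    pub required_scope: &'static str,
}

/// Where a call came from, with the authority that applies to it.
pub enum Origin<'a> {
    /// call.requested from a wire client, checked against the caller's scopes.
    Wire { caller_scopes: &'a HashSet<&'static str> },
    /// env.invoke from a handler, checked against its scoped env and its own scopes.
    Composition {
        env: &'a HashSet<&'static str>,
        handler_scopes: &'a HashSet<&'static str>,
    },
}

pub fn authorize(
    registry: &HashMap<&'static str, OpSpec>,
    name: &str,
    origin: &Origin<'_>,
) -> Result<(), CallError> {
    let spec = registry.get(name).ok_or(CallError::NotFound)?;
    let scopes = match origin {
        Origin::Wire { caller_scopes } => {
            // Control 1: visibility gates the wire path.
            if spec.visibility == Visibility::Internal {
                return Err(CallError::NotFound);
            }
            *caller_scopes
        }
        Origin::Composition { env, handler_scopes } => {
            // Control 3: the scoped env gates reachability.
            if !env.contains(name) {
                return Err(CallError::NotFound);
            }
            // Control 2: composition runs under the handler's authority.
            *handler_scopes
        }
    };
    if scopes.contains(spec.required_scope) {
        Ok(())
    } else {
        Err(CallError::Forbidden)
    }
}
```

In this sketch, removing any one branch reproduces the corresponding failure mode described above: without the visibility branch the wire can reach internal operations, without the env check a handler can reach the whole registry, and without the final scope check composition runs unchecked.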
Consequences
Positive:
- No privilege escalation through composition. A handler can only compose operations its own identity is authorized for, and only from its declared scope.
- Parameterized dispatch is safe. The agent/LLM tool selection case is bounded by the scoped env — the LLM picks from the declared tool set, not from the entire registry. The ACL checks against the handler's identity, not the caller's.
- Buggy handlers can't accidentally escalate. A handler that tries to call an operation outside its scoped env gets NOT_FOUND; one that calls an operation its identity lacks scopes for gets FORBIDDEN.
- Attribution is complete. Every call carries both the caller's identity (who initiated the chain) and the handler's identity (who is acting). The parent_request_id chain traces the full agency chain. This supports the gitea-per-agent pattern where each agent (human or LLM) has its own account.
- Session-scoped operations (OQ-19) are safe by construction. They're always Internal, run under the handler's identity, through the scoped env, in a locked-down sandbox. The self-improving workflow (agents writing tools) is bounded.
- Role-based escalation is explicit. An agent requesting promotion (session → core) is a lower-privileged role asking a higher-privileged role (architect with promote scope) to perform an action. The escalation goes through the chain of command, not through direct authority.
Negative:
- OperationContext has two identity fields (identity and handler_identity), which is more complex than a single identity. This is necessary: the principal/agent distinction is real and both are needed for attribution and ACL.
- The assembly layer has more responsibility: it must declare each handler's identity (scopes), its scoped composition env (which operations it may compose), and operation visibility. This is expected: the assembly layer assembles everything (ADR-008), and forcing explicit declaration of privilege is a feature, not a bug.
- Adding a new composition to a handler requires updating the assembly layer (declare the new operation in the scoped env), not just the handler code. This prevents accidental composition of unauthorized operations.
- The scoped env API is not fully specified here. The one-way constraint (scoped env exists, is declared at registration, can be further scoped at runtime) is fixed; the concrete API is a two-way door for implementation.
Assumptions
- Internal calls should run under a different authority than external calls, not skip ACL entirely. If internal calls should skip ACL (the old trusted model), this entire ADR is wrong. The assumption is that the escalation vectors (buggy handler, parameterized dispatch) are real and must be prevented.
- Handler identity is set at registration by the assembly layer. The assembly layer is the trust boundary (ADR-008, ADR-014). If the assembly layer is compromised, all handler identities are compromised. This is the same trust boundary as capabilities.
- The scoped env is declared at registration (static) and can be further scoped at runtime (dynamic, for sandbox creation). The static scoping is the reachability control; the dynamic scoping is the sandbox boundary. If a use case requires fully dynamic scoping (handler discovers at call time what it can compose), the model needs extension; the assumption is that composition reachability is knowable at registration time.
- services/list hides internal operations. If internal operations should be discoverable by remote callers (e.g., for debugging), the visibility model needs a third state. The assumption is that internal operations are implementation details, not part of the external API surface.
- Internal operations return NOT_FOUND, not FORBIDDEN. This prevents existence leakage. If a use case requires distinguishing "you can't call this" from "this doesn't exist" (e.g., for debugging), the error model needs refinement. The assumption is that not leaking internal operation existence is more important than debuggability from the wire.
- The handler identity is a full Identity (with scopes), not a special principal type. This reuses the existing Identity type and IdentityProvider infrastructure (ADR-004). If handler identities need different resolution semantics (e.g., not resolvable through IdentityProvider), a separate type may be needed. The assumption is that the existing identity infrastructure suffices.
References
- ADR-004: Auth as shared core (IdentityProvider, Identity)
- ADR-008: Vault integration (assembly layer is the trust boundary)
- ADR-014: Secret material flow and capability injection (capabilities are orthogonal; both are set at registration by the assembly layer)
- OQ-15: Call protocol client and adapter contract (adapters produce scoped envs)
- OQ-17: Abort cascade (the call tree is the agency chain; parent_request_id traces principal → agent)
- OQ-19: Session-scoped registries (session operations are always Internal)
- operation-registry.md
- call-protocol.md
- TypeScript @alkdev/operations buildEnv() with allowedNamespaces: prior art for scoped composition env
- POC at /workspace/toolEnv: demonstrated the sandbox-to-registry bridge with the full-registry exposure gap