diff --git a/docs/architecture/crates/call/operation-registry.md b/docs/architecture/crates/call/operation-registry.md index ff5f05c..53ae95b 100644 --- a/docs/architecture/crates/call/operation-registry.md +++ b/docs/architecture/crates/call/operation-registry.md @@ -1,6 +1,6 @@ --- status: draft -last_updated: 2026-07-02 +last_updated: 2026-07-05 --- # Operation Registry @@ -39,6 +39,15 @@ pub struct OperationSpec { pub output_schema: Value, // JSON Schema for output pub error_schemas: Vec, // Declared domain errors (ADR-023) pub access_control: AccessControl, + /// JSON pointer into the input for the resource ID, when + /// `access_control.resource_type` is set and the operation targets a + /// specific runtime-spawned resource (ADR-050). e.g., `"$.containerId"` + /// for `docker/container/exec`. Absent for no-specific-resource + /// operations (the `list` case — scope-gate + result-filter). The + /// dispatcher extracts the resource ID from the input using this path + /// and passes it to `AccessControl::check`. `None` for operations + /// with no `resource_type` or with static resource sets. + pub resource_id_path: Option, } pub enum OperationType { @@ -73,22 +82,76 @@ Visibility (ADR-015) controls whether an operation is callable from the wire. `E pub struct AccessControl { pub required_scopes: Vec, // AND-checked: caller must have ALL pub required_scopes_any: Option>, // OR-checked: caller must have at LEAST ONE - pub resource_type: Option, // e.g., "service" - pub resource_action: Option, // e.g., "read" + pub resource_type: Option, // e.g., "service", "container" + pub resource_action: Option, // e.g., "read", "exec" } ``` +`AccessControl::check` consults an ownership provider for runtime-spawned +resources (ADR-050). 
The signature: + +```rust +impl AccessControl { + /// `ownership` is None when the operation has no `resource_type` + /// (pure scope check) or when no ownership provider is wired + /// (the static `Identity.resources` path — backward compatible). + /// `resource_id` is None for the `list` case (resource_type set, + /// `resource_id_path` absent — scope-gate + result-filter, ADR-050 §4a). + pub fn check( + &self, + identity: Option<&Identity>, + resource_id: Option<&str>, + ownership: Option<&dyn OwnershipProvider>, + ) -> bool { + // 1. Scope check (unchanged): identity.scopes ⊇ required_scopes. + // If identity is None and scopes are required, deny here. + // 2. Resource check (only if self.resource_type is Some): + // a. resource_id Some + ownership Some: + // → p.owns(identity?, resource_type, resource_id, resource_action) + // b. resource_id None + ownership Some (the `list` case): + // → p.owns_any(identity?, resource_type) [scope-gate] + // c. ownership None → fall back to static + // identity.resources[resource_type] ∋ resource_action + // (backward compat for non-runtime resources) + } +} +``` + +The `OwnershipProvider` trait (read side, sync — called on the dispatch hot +path) and the `OwnershipStore` trait (write side, async — called by +handlers that manage resource lifecycles) are defined in `alknet-core` per +ADR-050's storage decision (fourth instance of the repo/adapter pattern, +ADR-033). See [auth.md](../core/auth.md) §"Ownership Provider and Store" +for the trait shapes and the in-memory default adapter. + +**The ownership provider is carried on `OperationContext`** (or threaded +by the dispatcher), populated by the dispatch path from the registry's +wiring. When `ownership` is `None`, `check` falls back to the static +`Identity.resources` path — operations with static resource sets work +unchanged. The ownership provider is an additional check, not a +replacement. 
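The three resource-check branches described for `check` can be sketched as follows. This is a minimal illustration, not the crate's implementation: the `Identity` shape (a scope set plus a `resource_type -> actions` map), the omission of `required_scopes_any`, the deny-on-missing-action rule, and the `SingleOwner` provider are all assumptions made so the example stands alone.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified stand-in for the real `Identity`.
pub struct Identity {
    pub id: String,
    pub scopes: HashSet<String>,
    /// Static path: resource_type -> allowed actions.
    pub resources: HashMap<String, HashSet<String>>,
}

/// Read-side trait, reduced to the two methods `check` needs.
pub trait OwnershipProvider {
    fn owns(&self, identity: &Identity, resource_type: &str, resource_id: &str, action: &str) -> bool;
    fn owns_any(&self, identity: &Identity, resource_type: &str) -> bool;
}

pub struct AccessControl {
    pub required_scopes: Vec<String>,
    pub resource_type: Option<String>,
    pub resource_action: Option<String>,
}

impl AccessControl {
    pub fn check(
        &self,
        identity: Option<&Identity>,
        resource_id: Option<&str>,
        ownership: Option<&dyn OwnershipProvider>,
    ) -> bool {
        // 1. Scope check: every required scope must be present.
        if !self.required_scopes.is_empty() {
            match identity {
                Some(id) if self.required_scopes.iter().all(|s| id.scopes.contains(s)) => {}
                _ => return false,
            }
        }
        // 2. Resource check, only when a resource_type is declared.
        let resource_type = match self.resource_type.as_deref() {
            Some(t) => t,
            None => return true,
        };
        let identity = match identity {
            Some(id) => id,
            None => return false, // resource checks need an identity
        };
        // Assumption: a resource_type without an action is denied.
        let action = match self.resource_action.as_deref() {
            Some(a) => a,
            None => return false,
        };
        match (ownership, resource_id) {
            // a. specific runtime-spawned resource
            (Some(p), Some(rid)) => p.owns(identity, resource_type, rid, action),
            // b. the `list` case: scope-gate only
            (Some(p), None) => p.owns_any(identity, resource_type),
            // c. no provider wired: static Identity.resources fallback
            (None, _) => identity
                .resources
                .get(resource_type)
                .map_or(false, |actions| actions.contains(action)),
        }
    }
}

/// Hypothetical provider that knows about exactly one owned resource.
pub struct SingleOwner {
    pub owner_id: String,
    pub resource_type: String,
    pub resource_id: String,
}

impl OwnershipProvider for SingleOwner {
    fn owns(&self, identity: &Identity, resource_type: &str, resource_id: &str, _action: &str) -> bool {
        identity.id == self.owner_id
            && resource_type == self.resource_type
            && resource_id == self.resource_id
    }
    fn owns_any(&self, identity: &Identity, resource_type: &str) -> bool {
        identity.id == self.owner_id && resource_type == self.resource_type
    }
}
```

In this sketch the scope check always runs first, so an owner without the required scope is still denied, and an operation with an empty `AccessControl` passes even when `identity` is `None`.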
+ +**The `resource_id` parameter** is extracted by the dispatcher from the +operation input using `OperationSpec.resource_id_path` (ADR-050 §2a). +When the spec has no `resource_id_path` (the `list` case), the dispatcher +passes `resource_id: None`, and `check` takes the scope-gate path. The +handler is separately responsible for result-filtering via +`OwnershipProvider::owned_resources` (ADR-050 §4a). + When a `call.requested` event arrives: 1. The `CallAdapter` resolves the caller's `Identity` from `AuthContext` (and possibly an `AuthToken` in the payload) 2. The registry checks operation **visibility** — if the operation is `Internal`, returns `call.error` with code `NOT_FOUND` (does not leak existence) -3. The registry checks `access_control.check(identity)` — for external calls (`internal: false`), ACL runs against the **caller's identity**; for internal calls (`internal: true`), ACL runs against the **handler's identity** (ADR-015) -4. If access is denied, the adapter returns `call.error` with code `FORBIDDEN` -5. If the relevant identity is `None` and the operation has restrictions, the adapter returns `call.error` with code `FORBIDDEN` and message `"authentication required"` +3. The dispatcher extracts `resource_id` from the input via `spec.resource_id_path` (if present) +4. The registry checks `access_control.check(identity, resource_id, ownership)` — for external calls (`internal: false`), ACL runs against the **caller's identity**; for internal calls (`internal: true`), ACL runs against the **handler's identity** (ADR-015) +5. If access is denied, the adapter returns `call.error` with code `FORBIDDEN` +6. If the relevant identity is `None` and the operation has restrictions, the adapter returns `call.error` with code `FORBIDDEN` and message `"authentication required"` Operations with empty `AccessControl` (no required scopes, no resource checks) are accessible to all callers, including unauthenticated ones. 
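Step 3 of the flow above (extracting `resource_id` via `resource_id_path`) might look like the sketch below. It is illustrative only: the `Value` enum is a hypothetical two-variant stand-in for a real JSON type, the path syntax handled is just the dotted `$.key` form used in the `"$.containerId"` example, and returning an error when a declared path does not resolve (rather than silently taking the `list` path) is an assumption, since the diff does not say how the dispatcher treats that case.

```rust
use std::collections::HashMap;

/// Hypothetical minimal JSON value, enough for the example.
pub enum Value {
    String(String),
    Object(HashMap<String, Value>),
}

/// A path was declared on the spec but did not resolve to a string.
#[derive(Debug, PartialEq)]
pub struct UnresolvedResourceId;

/// Returns `Ok(None)` when the spec declares no path (the `list` case),
/// `Ok(Some(id))` when the path resolves to a string in the input, and
/// `Err` when a declared path cannot be resolved.
pub fn extract_resource_id<'a>(
    input: &'a Value,
    path: Option<&str>,
) -> Result<Option<&'a str>, UnresolvedResourceId> {
    let path = match path {
        Some(p) => p,
        None => return Ok(None),
    };
    // Only the `$.a.b` form from the document's example is handled here.
    let keys = path.strip_prefix("$.").ok_or(UnresolvedResourceId)?;
    let mut current = input;
    for key in keys.split('.') {
        current = match current {
            Value::Object(map) => map.get(key).ok_or(UnresolvedResourceId)?,
            Value::String(_) => return Err(UnresolvedResourceId),
        };
    }
    match current {
        Value::String(id) => Ok(Some(id.as_str())),
        Value::Object(_) => Err(UnresolvedResourceId),
    }
}
```

Keeping "no path declared" and "path declared but unresolved" as distinct outcomes means malformed input cannot downgrade a specific-resource check to the weaker scope-gate path.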
**Internal calls and authority context**: When a handler invokes another operation through `OperationEnv`, the nested call is marked `internal: true`, meaning it originated from composition (not from a wire request). The `internal` flag switches the authority context: the ACL check runs against the composing handler's `handler_identity` (set at registration), not the caller's identity and not as a blanket skip. This prevents privilege escalation through composition — a handler can only compose operations its own identity is authorized for. See ADR-015. +**Composition and dynamic ownership (ADR-050 §4d)**: When a handler composes an operation that targets a runtime-spawned resource (e.g., a coordinator composing `docker/container/exec` against a specific container), two checks must pass: (a) the coordinator's `CompositionAuthority` has the `container:exec` scope (static, ADR-015/022 unchanged), and (b) the coordinator owns this specific container (dynamic, ownership provider). The composition authority stays static — it doesn't grow a dynamic path. The ownership store handles the dynamic resource-level check. Both must pass; they're orthogonal. ADR-015 and ADR-022 are unchanged. + ### Handler There are two handler types, one per dispatch shape — mirroring the @@ -841,6 +904,7 @@ The `Capabilities` type holds non-serializable, zeroized secret material. 
It doe | Forwarded-for identity | [ADR-032](../../decisions/032-forwarded-for-identity.md) | `forwarded_for` field on `OperationContext` and `call.requested`; metadata only — `AccessControl::check` never reads it; the `from_call` handler populates it | | ~~Peer-scoped registry filtering~~ (superseded) | ~~[ADR-028](../../decisions/028-callclient-peer-scoped-registry-filtering.md)~~ | ~~`remote_safe` marking on `HandlerRegistration`~~ — superseded by ADR-029 | | Streaming handler for subscriptions | [ADR-049](../../decisions/049-streaming-handler-for-subscriptions.md) | `StreamingHandler` type alongside `Handler`; `HandlerKind` enum on `HandlerRegistration` validated against `op_type`; `invoke_streaming()` on `OperationRegistry`; `invoke()` and `OperationEnv::invoke()` error with `INVALID_OPERATION_TYPE` on `Subscription` ops; composition stays request/response-only, stream composition is handler-level | +| Dynamic resource ownership for runtime-spawned resources | [ADR-050](../../decisions/050-dynamic-resource-ownership-for-runtime-spawned-resources.md) | `AccessControl::check` consults an `OwnershipProvider` (sync read trait, ADR-033 repo/adapter pattern); `OperationSpec` gains `resource_id_path` (JSON pointer into the input); proxy-only access pattern (spawner owns, proxy to share, teardown revokes); `list` = scope-gate + result-filter; teardown = automatic, handler-driven; composition = two orthogonal checks, ADR-015/022 unchanged | ## Open Questions @@ -875,6 +939,12 @@ See [open-questions.md](../../open-questions.md) for full details. registry) resolved by ADR-030+033; OQ-35 (API key asymmetry) dissolved; OQ-36 (concrete persistence adapter shapes) resolved by ADR-035; OQ-37 (X.509 outgoing-only) resolved by ADR-034. 
+- **OQ-42** (resolved by ADR-050): Dynamic resource ownership for + runtime-spawned resources — `AccessControl::check` consults an + `OwnershipProvider`; `OperationSpec` gains `resource_id_path`; proxy-only + access pattern; four edge specifics pinned (`list`, teardown, fleet, + composition). See [auth.md](../core/auth.md) §"Ownership Provider and + Store" for the trait shapes. ## References @@ -888,4 +958,5 @@ See [open-questions.md](../../open-questions.md) for full details. - ADR-030: PeerEntry and Identity.id decoupling (`PeerId` source = `Identity.id` = `PeerEntry.peer_id`) - ADR-032: Forwarded-for identity (`forwarded_for` on `OperationContext` and `call.requested`; metadata only) - ADR-049: Streaming handler for subscriptions (`StreamingHandler`, `HandlerKind`, `invoke_streaming()`, `INVALID_OPERATION_TYPE`) +- ADR-050: Dynamic resource ownership for runtime-spawned resources (`OwnershipProvider` consulted by `AccessControl::check`; `OperationSpec.resource_id_path`; proxy-only access pattern; composition = two orthogonal checks, ADR-015/022 unchanged) - Reference implementation: `/workspace/@alkdev/alknet-main/crates/alknet-core/src/call/` \ No newline at end of file diff --git a/docs/architecture/crates/core/auth.md b/docs/architecture/crates/core/auth.md index b1948bc..3c853ae 100644 --- a/docs/architecture/crates/core/auth.md +++ b/docs/architecture/crates/core/auth.md @@ -1,6 +1,6 @@ --- status: draft -last_updated: 2026-06-28 +last_updated: 2026-07-05 --- # Authentication @@ -294,18 +294,160 @@ schema shape, the `StoreError` type, the writer's-own-process cache coherence details, and why honker is a hard dependency of the SQLite adapter rather than an option). +## Ownership Provider and Store (ADR-050) + +Runtime-spawned resources (containers, TTYs, workspace processes) have +derived ownership: whoever spawned the resource owns it. 
The static +`Identity.resources` model (populated from `PeerEntry` or +`CompositionAuthority` at connection/registration time) can't represent +this — the resource didn't exist when the identity was resolved. ADR-050 +resolves this with a fourth instance of the repo/adapter pattern +(ADR-033): an `OwnershipProvider` read trait (sync, consulted by +`AccessControl::check` on the dispatch hot path) and an `OwnershipStore` +write trait (async, called by handlers that manage resource lifecycles), +with an in-memory default adapter. + +```rust +/// Read side: consulted by AccessControl::check on the dispatch hot path. +/// Sync — called in the dispatch loop, no .await. Fourth instance of the +/// repo/adapter pattern (ADR-033), alongside IdentityProvider (ADR-004), +/// IdentityStore (ADR-035), and CredentialStore (ADR-031). +pub trait OwnershipProvider: Send + Sync + 'static { + /// Does `identity` own `resource_type/resource_id` with `action`? + /// Called when AccessControl has resource_type + resource_action set + /// and the dispatcher has extracted resource_id from the input via + /// OperationSpec.resource_id_path (ADR-050 §2a). + fn owns( + &self, + identity: &Identity, + resource_type: &str, + resource_id: &str, + action: &str, + ) -> bool; + + /// What resources of `resource_type` does `identity` own? + /// Called for the `list` case (resource_type set, resource_id_path + /// absent) — the result-filter path (ADR-050 §4a). Returns the set of + /// resource IDs the caller owns, for the handler to filter against. + fn owned_resources( + &self, + identity: &Identity, + resource_type: &str, + ) -> Vec; + + /// Does `identity` own *any* resource of `resource_type`? + /// Called for the `list` case — the scope-gate path (ADR-050 §4a). + /// Cheap boolean for the "allow if scoped" default. + fn owns_any( + &self, + identity: &Identity, + resource_type: &str, + ) -> bool; +} + +/// Write side: called by the handler that manages the resource lifecycle. 
+/// Async — not on the dispatch hot path. The handler calls `record` on +/// spawn and `revoke` on teardown (ADR-050 §4b — handler-driven, not a +/// reaper). +#[async_trait] +pub trait OwnershipStore: Send + Sync + 'static { + /// Record that `identity` spawned `resource_type/resource_id`. + /// Called by the docker handler after `docker/container/create` + /// succeeds. + async fn record( + &mut self, + identity: &Identity, + resource_type: &str, + resource_id: &str, + ) -> Result<(), OwnershipError>; + + /// Revoke ownership of `resource_type/resource_id`. + /// Called by the docker handler on container exit / removal + /// (ADR-050 §4b — handler-driven teardown). + async fn revoke( + &mut self, + resource_type: &str, + resource_id: &str, + ) -> Result<(), OwnershipError>; +} + +/// In-memory default adapter. Carries the docker/runner cases with no +/// backend dependency — ownership is runtime state, meaningless across +/// restarts (a container ID from a previous process doesn't exist). +/// A persistence adapter (sqlite/honker-backed, for a hub that wants +/// fleet ownership to survive restarts) is separable and built when a +/// concrete use case forces it — same pattern as `alknet-store-sqlite` +/// (ADR-035). +pub struct InMemoryOwnershipStore { /* HashMap<...> */ } +``` + +The read/write split mirrors ADR-035: `OwnershipProvider` (read, sync) is +the trait the dispatch path depends on; `OwnershipStore` (write, async) is +the trait the handler lifecycle calls. The in-memory default implements +both. A persistence adapter would implement both with an in-memory read +cache backed by SQLite, same as `SqliteIdentityProvider` implements +`IdentityProvider` (sync, cached) + `IdentityStore` (async write). + +### How it integrates with `AccessControl::check` + +`AccessControl::check` grows a parameter for the ownership provider (or +reads one carried on `OperationContext`). 
When `ownership` is `None`, +`check` falls back to the static `Identity.resources` path — operations +with static resource sets work unchanged. The ownership provider is an +additional check, not a replacement. See +[operation-registry.md](../call/operation-registry.md) §"AccessControl" +for the updated `check` signature and the dispatch flow. + +### Access pattern: proxy-only (ADR-050 §3) + +The base model is **"spawner owns, proxy to share, teardown revokes"** — +no grant/transfer mechanism in the core ownership store. A coordinator +that spawns a container re-exports the docker operations it wants to +expose via `from_call` (ADR-017) or composes them in its own handlers; +the coordinator is the direct caller to the docker endpoint; docker's +ownership store sees the coordinator as owner and caller; the check +passes. The end user's identity rides as `forwarded_for` metadata +(ADR-032), and the coordinator handles its own end-user-level ACL. + +"Poking holes" (the grant pattern — giving an end user direct +call-protocol access to a spawned resource) is a downstream-app concern, +not a core-model concern. A future grant mechanism is additive (a new +method on the ownership store trait), stated as reversal-cost +classification, not deferral. + +### Per-node ownership (ADR-050 §4c) + +The ownership store is **per-node** — each node records its local +ownership. There is no cross-node ownership propagation in the base model. +The hub's "who is this for" mapping is app state, not core ownership +state. The proxy pattern keeps ownership local: the spoke sees the hub as +the owner, and the hub's end-user ACL is its own layer. 
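The "spawner owns, teardown revokes" lifecycle on a single node can be sketched with a small in-memory map. This is an assumption-laden illustration, not the `InMemoryOwnershipStore` from the diff: it is synchronous, uses `&self` with an `RwLock` instead of the async `&mut self` write trait, keys owners by a plain identity ID string, and ignores the `action` argument. The type name `InMemoryOwnership` is hypothetical.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical in-memory ownership map for one node.
#[derive(Default)]
pub struct InMemoryOwnership {
    /// (resource_type, resource_id) -> owner identity ID
    owners: RwLock<HashMap<(String, String), String>>,
}

impl InMemoryOwnership {
    /// Called after a spawn succeeds: the spawner becomes the owner.
    pub fn record(&self, owner_id: &str, resource_type: &str, resource_id: &str) {
        self.owners.write().unwrap().insert(
            (resource_type.to_string(), resource_id.to_string()),
            owner_id.to_string(),
        );
    }

    /// Called on teardown: ownership disappears with the resource.
    pub fn revoke(&self, resource_type: &str, resource_id: &str) {
        self.owners
            .write()
            .unwrap()
            .remove(&(resource_type.to_string(), resource_id.to_string()));
    }

    /// Specific-resource check.
    pub fn owns(&self, owner_id: &str, resource_type: &str, resource_id: &str) -> bool {
        self.owners
            .read()
            .unwrap()
            .get(&(resource_type.to_string(), resource_id.to_string()))
            .map_or(false, |owner| owner == owner_id)
    }

    /// Result-filter path for `list`: IDs owned by this identity, sorted.
    pub fn owned_resources(&self, owner_id: &str, resource_type: &str) -> Vec<String> {
        let mut ids: Vec<String> = self
            .owners
            .read()
            .unwrap()
            .iter()
            .filter(|(key, owner)| key.0 == resource_type && owner.as_str() == owner_id)
            .map(|(key, _)| key.1.clone())
            .collect();
        ids.sort();
        ids
    }

    /// Scope-gate path for `list`: does this identity own anything of the type?
    pub fn owns_any(&self, owner_id: &str, resource_type: &str) -> bool {
        self.owners
            .read()
            .unwrap()
            .iter()
            .any(|(key, owner)| key.0 == resource_type && owner.as_str() == owner_id)
    }
}
```

Because the map lives only in process memory, a restart clears it, which matches the document's point that a container ID from a previous process no longer refers to anything.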
+ ### Resource-scoped ACLs -`Identity.resources` is populated on two paths: +`Identity.resources` is populated on two paths (static), plus a third +path (dynamic, ADR-050) that consults the ownership provider at check +time: | Path | Source of `resources` | Use case | |------|----------------------|----------| | `PeerEntry` resolution (fingerprint or auth_token) | `PeerEntry.resources` (ADR-030) | External authenticated callers with per-peer resource binding | | Composition (`CompositionAuthority::as_identity`, ADR-015/022) | `CompositionAuthority.resources` | Internal composition calls with declared resource binding | +| Dynamic ownership (ADR-050) | `OwnershipProvider::owns` / `owned_resources` | Runtime-spawned resources with derived ownership (containers, TTYs, workspace processes) — not carried on `Identity.resources`; consulted by `AccessControl::check` at check time via the ownership provider | -`ApiKeyEntry`-resolved identities have empty `resources` — API keys grant scopes only. An `OperationSpec` that declares `resource_type`/`resource_action` returns `FORBIDDEN` when the caller authenticated via `ApiKeyEntry`, but succeeds when the caller authenticated via `PeerEntry` (fingerprint or auth_token) with matching `resources`. +The static paths (`PeerEntry`, `CompositionAuthority`) populate +`Identity.resources` at resolution time. The dynamic path (ADR-050) does +**not** populate `Identity.resources` — it consults the ownership provider +at `AccessControl::check` time. When `AccessControl::check`'s `ownership` +parameter is `None` (no ownership provider wired, or the operation has no +`resource_type`), `check` falls back to the static `Identity.resources` +path. When `ownership` is `Some`, `check` consults the provider for +runtime-spawned resources. See [operation-registry.md](../call/operation-registry.md) +§"AccessControl" for the `check` signature and dispatch flow. 
-Changes to `DynamicConfig` via `ConfigReloadHandle` are reflected immediately — `ConfigIdentityProvider` reads from `ArcSwap` on every call. +`ApiKeyEntry`-resolved identities have empty `resources` — API keys grant scopes only. An `OperationSpec` that declares `resource_type`/`resource_action` returns `FORBIDDEN` when the caller authenticated via `ApiKeyEntry` and no ownership provider is wired, but succeeds when the caller authenticated via `PeerEntry` (fingerprint or auth_token) with matching `resources`, or when the ownership provider confirms ownership (ADR-050 dynamic path). + +Changes to `DynamicConfig` via `ConfigReloadHandle` are reflected immediately — `ConfigIdentityProvider` reads from `ArcSwap` on every call. Ownership state changes (record/revoke) are reflected immediately by the in-memory `OwnershipStore`; a persistence adapter would use honker `NOTIFY` for cross-process cache invalidation (same pattern as `ConfigIdentityProvider`, ADR-035). ### Fingerprint string format @@ -445,6 +587,7 @@ The endpoint's `AlknetEndpoint` also holds `Arc` for endpo | Storage boundary and repo/adapter pattern | [ADR-033](../../decisions/033-storage-boundary-and-repo-adapter-pattern.md) | Core defines traits + in-memory defaults; persistence adapters are separate crates | | Three remote roles and outgoing-only X.509 | [ADR-034](../../decisions/034-outgoing-only-x509-and-three-peer-roles.md) | Public X.509 endpoint / transport relay / hub; `PeerEntry` asymmetry (pure-client X.509 is not a peer); client-side verifier by `PeerEntry` presence | | Concrete persistence adapter shapes | [ADR-035](../../decisions/035-concrete-persistence-adapter-shapes.md) | Read-sync / write-async split (`IdentityStore` async write trait); SQLite adapter caches in memory, honker NOTIFY for no-restart cache invalidation; `StoreError` type | +| Dynamic resource ownership for runtime-spawned resources | [ADR-050](../../decisions/050-dynamic-resource-ownership-for-runtime-spawned-resources.md) | 
Fourth repo/adapter trait: `OwnershipProvider` (sync read, consulted by `AccessControl::check`) + `OwnershipStore` (async write, handler-driven lifecycle); proxy-only access pattern (spawner owns, proxy to share, teardown revokes); per-node ownership; `OperationSpec.resource_id_path` | ## Open Questions @@ -452,6 +595,7 @@ The endpoint's `AlknetEndpoint` also holds `Arc` for endpo - **OQ-35** (dissolved): the "API key asymmetry" framing was wrong; `PeerEntry` supports multiple credential paths (fingerprints + auth_token_hash), `ApiKeyEntry` is for tokens that ARE the identity. See OQ-35 in open-questions.md. - **OQ-37** (resolved): X.509 outgoing-only case — three remote roles named (public X.509 endpoint, transport relay, hub); `PeerEntry` asymmetry is correct (pure-client X.509 connections are not in the peer graph on the client side); client-side verifier selection by `PeerEntry` presence (CA verification for unknown X.509, fingerprint pin for known peers). See ADR-034 and OQ-37 in open-questions.md. - **OQ-36** (resolved): Concrete persistence adapter shapes — read-sync / write-async split (`IdentityStore` async write trait extends the sync `IdentityProvider` read trait); SQLite adapter caches in memory and uses honker NOTIFY/LISTEN for no-restart cache invalidation; `alknet-store-sqlite` crate implements both `IdentityStore` and `CredentialStore`. See ADR-035 and OQ-36 in open-questions.md. +- **OQ-42** (resolved by ADR-050): Dynamic resource ownership for runtime-spawned resources — `OwnershipProvider` (sync read) + `OwnershipStore` (async write) as the fourth repo/adapter trait; `AccessControl::check` consults the ownership provider; `OperationSpec` gains `resource_id_path`; proxy-only access pattern; per-node ownership; composition = two orthogonal checks, ADR-015/022 unchanged. See ADR-050 and OQ-42 in open-questions.md. ## Security Constraints