tasks: decompose vault, core, call crates into 28 atomic implementation tasks

Break down the three initial crates (alknet-vault, alknet-core, alknet-call) into dependency-ordered task files for implementation agents.

Structure:
- tasks/vault/ (10 tasks): drift fixes from ADR-025/026 refactor, review, spec sync. Vault is independent and can run fully in parallel with core/call.
- tasks/core/ (6 tasks): crate init, core types, config, auth, endpoint, review. Core is foundational; call depends on it.
- tasks/call/ (12 tasks): split into registry/ and protocol/ topic subdirs reflecting the two subsystems. CallAdapter is the merge point.

Key decisions:
- Drifts 3+9+10 grouped as one task (key-versioning-rotation): the complete ADR-021 rotation feature that doesn't compile in pieces
- Reviews injected at end of each crate phase (vault, core, call)
- Vault spec-sync task removes the drift table and bumps doc status to stable
- ACME deferred in core/endpoint (noted as TODO; X509 manual certs for now)
- OperationEnv kept as a trait (load-bearing for ADR-024 layering)

Validated: 28 tasks, no cycles, 11 generations of parallel work. Critical path runs through call (11 tasks). Vault completes by generation 4. 6 high-risk tasks identified (21%): irpc-removal, endpoint, operation-context, operation-env, call-adapter, abort-cascade.
2026-06-23 12:41:47 +00:00
parent 2e34590522
commit 098fd8b9b9
28 changed files with 4271 additions and 0 deletions
--- a/tasks/call/crate-init.md
+++ b/tasks/call/crate-init.md
@@ -0,0 +1,103 @@
 ---
 id: call/crate-init
 name: Initialize alknet-call crate with Cargo.toml, dependencies, and module skeleton
 status: pending
 depends_on: [core/core-types]
 scope: moderate
 risk: low
 impact: project
 level: implementation
 ---
 ## Description
 Initialize the `alknet-call` crate from scratch. This crate implements the call
 protocol (structured RPC over QUIC) on ALPN `alknet/call`. It depends on
 alknet-core (for ProtocolHandler, Connection, AuthContext, Capabilities,
 IdentityProvider) and irpc (for framing).
 ### Crate setup
 Create `crates/alknet-call/` with:
 - `Cargo.toml` — package metadata, dependencies
 - `src/lib.rs` — crate root with module declarations and re-exports
 - Module skeleton files for:
  - `src/registry/mod.rs` — registry module root
  - `src/registry/spec.rs` — OperationSpec, OperationType, Visibility, ErrorDefinition, AccessControl
  - `src/registry/context.rs` — OperationContext, AbortPolicy, CompositionAuthority, ScopedOperationEnv
  - `src/registry/registration.rs` — Handler, HandlerRegistration, OperationProvenance, OperationRegistry, OperationRegistryBuilder
  - `src/registry/env.rs` — OperationEnv trait, LocalOperationEnv, CompositeOperationEnv
  - `src/registry/discovery.rs` — services/list, services/schema handlers
  - `src/protocol/mod.rs` — protocol module root
  - `src/protocol/wire.rs` — EventEnvelope, ResponseEnvelope, CallError, framing
  - `src/protocol/pending.rs` — PendingRequestMap, PendingEntry
  - `src/protocol/connection.rs` — CallConnection
  - `src/protocol/adapter.rs` — CallAdapter (ProtocolHandler impl)
  - `src/protocol/abort.rs` — abort cascade logic
 ### Dependencies
 | Crate | Purpose |
 |-------|---------|
 | `alknet-core` | ProtocolHandler, Connection, AuthContext, Capabilities, IdentityProvider, Identity, HandlerError (workspace path) |
 | `irpc` | Framing, service dispatch (workspace dep) |
 | `tokio` 1 (full) | Async runtime, sync primitives (oneshot, mpsc, watch) |
 | `serde` 1 | Serialization for wire types |
 | `serde_json` 1 | JSON wire format, JSON Schema values |
 | `async-trait` 0.1 | OperationEnv trait (async fn in trait) |
 | `tracing` 0.1 | Structured logging |
 | `thiserror` 2 | Error enums |
 | `uuid` 1 | Request ID generation (UUID v4) |
 | `futures` | Stream trait for subscribe |
 ### Workspace Cargo.toml
 Add `crates/alknet-call` to the workspace `members` list in the root
 `Cargo.toml`.
 ### Module skeleton
 ```rust
 // src/lib.rs
 //! alknet-call: Structured RPC over QUIC — operations, streaming, service discovery.
 //! Implements ProtocolHandler on ALPN `alknet/call`.
 pub mod registry;
 pub mod protocol;
 // Re-exports (filled in by subsequent tasks)
 ```
 Each module file gets a doc comment and a `// TODO: implement` marker.
 ## Acceptance Criteria
 - [ ] `crates/alknet-call/Cargo.toml` exists with all dependencies
 - [ ] `crates/alknet-call/src/lib.rs` exists with module declarations
 - [ ] All module skeleton files exist (registry/*, protocol/*)
 - [ ] Root `Cargo.toml` `members` list includes `crates/alknet-call`
 - [ ] `cargo check -p alknet-call` succeeds
 - [ ] `cargo clippy -p alknet-call` succeeds with no warnings
 - [ ] Dual licensing: `MIT OR Apache-2.0` (workspace-inherited)
 - [ ] alknet-core dependency uses workspace path (`path = "../alknet-core"`)
 ## References
 - docs/architecture/crates/call/README.md — crate index
 - docs/architecture/crates/call/call-protocol.md — CallAdapter, wire format
 - docs/architecture/crates/call/operation-registry.md — registry, OperationEnv
 - docs/architecture/decisions/003-crate-decomposition.md — ADR-003
 - docs/architecture/decisions/005-irpc-as-call-protocol-foundation.md — ADR-005
 ## Notes
 > alknet-call depends on alknet-core (for ProtocolHandler, Connection,
 > AuthContext, Capabilities, IdentityProvider) and irpc (for framing). The
 > crate has two subsystems: registry (operation specs, context, dispatch) and
 > protocol (wire format, streams, adapter). The module structure reflects
 > this split.
 ## Summary
 > To be filled on completion
--- a/tasks/call/protocol/abort-cascade.md
+++ b/tasks/call/protocol/abort-cascade.md
@@ -0,0 +1,193 @@
 ---
 id: call/protocol/abort-cascade
 name: Implement abort cascade logic for nested calls (ADR-016)
 status: pending
 depends_on: [call/protocol/call-adapter]
 scope: moderate
 risk: high
 impact: component
 level: implementation
 ---
 ## Description
 Implement the abort cascade logic in `src/protocol/abort.rs`. When a handler
 composes other operations via `OperationEnv::invoke()`, it creates a call tree:
 a parent request (r1) spawns children (r1-a, r1-b), which may spawn their own
 children. When `call.aborted` arrives for a parent, the protocol cascades the
 abort to all non-terminal descendants.
 **Read ADR-016 before starting this task.**
 ### Call tree
 The call tree is indexed by `parent_request_id` in the `PendingRequestMap`. The
 root request has `parent_request_id: None`. Each composed call has
 `parent_request_id: Some(parent.request_id)`.
 ```
 r1 (root, wire call)
 ├── r1-a (composed by r1's handler)
 │   ├── r1-a-1 (composed by r1-a's handler)
 │   └── r1-a-2
 └── r1-b
    └── r1-b-1
 ```
 ### Abort cascade
 When `call.aborted` arrives for a parent request:
 1. Find all non-terminal descendants in the tree (walk by `parent_request_id`)
 2. Send `call.aborted` for each descendant
 3. Cancel each descendant's future (Drop releases resources)
 The CallAdapter walks the tree indexed by `parent_request_id` in
 `PendingRequestMap` and sends `call.aborted` for each descendant.
 ### AbortPolicy
 The abort policy is set on `OperationContext` and propagated through
 `OperationEnv::invoke()` — the composing handler decides the child's policy,
 not the wire caller.
 **`AbortDependents` (default)**: aborting a request aborts everything
 downstream, regardless of branch. This is the correct default because aborted
 parent work has no consumer waiting for results — continuing is wasted work at
 best and unwanted side effects at worst (e.g., a `bash/exec` that keeps running
 after the caller stopped caring).
 **`ContinueRunning` (opt-in)**: descendants that have already started continue
 to completion; descendants that haven't started yet are aborted; no new
 descendants start. Use for long-running work that should survive a parent's
 abort (e.g., a subscription that should keep streaming).
 ### Wire visibility
 Composed child `request_id`s are **internal** — they appear in
 `PendingRequestMap` for abort-cascade indexing but are not sent as
 `call.requested` to any peer. The client only sees `call.aborted` for the root
 ID it sent; the server cascades internally to descendants.
 The exception is `from_call` ops, which generate their own wire ID when
 forwarding to the remote node (the remote node's `PendingRequestMap` indexes
 it).
 ### Implementation
 The abort cascade needs access to the `PendingRequestMap` to walk the tree.
 The `CallAdapter` holds the `PendingRequestMap` (or a reference to it). The
 cascade logic:
 ```rust
 pub struct AbortCascade {
    // Access to PendingRequestMap for tree walking
    // The map indexes entries by request_id, and each entry knows its parent_request_id
    // (from OperationContext, stored when the entry was registered)
 }
 impl AbortCascade {
    /// Cascade an abort from the given request ID to all non-terminal descendants.
    /// Returns the list of request IDs that were aborted (for logging/auditing).
    pub fn cascade_abort(&self, root_request_id: &str, policy: AbortPolicy) -> Vec<String>;
    /// Find all descendants of a request ID in the call tree.
    fn find_descendants(&self, parent_id: &str) -> Vec<String>;
 }
 ```
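The tree walk behind `find_descendants` could look roughly like the sketch below. It assumes a reduced, hypothetical `ParentIndex` (request ID mapped to its `parent_request_id`) in place of the real `PendingRequestMap`, and it assumes the parent links form a tree with no cycles. The result is sorted only to make the output deterministic.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical stand-in for the parent links kept in `PendingRequestMap`:
/// each request ID maps to its `parent_request_id` (None for a wire root).
pub type ParentIndex = HashMap<String, Option<String>>;

/// Collect every descendant of `root_id` breadth-first; the root is excluded.
/// Assumes the parent links form a tree (no cycles).
pub fn find_descendants(index: &ParentIndex, root_id: &str) -> Vec<String> {
    // Invert child -> parent links once so each level is a direct lookup.
    let mut children: HashMap<&str, Vec<&str>> = HashMap::new();
    for (id, parent) in index {
        if let Some(p) = parent {
            children.entry(p.as_str()).or_default().push(id.as_str());
        }
    }

    let mut out = Vec::new();
    let mut queue = VecDeque::from([root_id]);
    while let Some(current) = queue.pop_front() {
        if let Some(kids) = children.get(current) {
            for &kid in kids {
                out.push(kid.to_string());
                queue.push_back(kid);
            }
        }
    }
    // HashMap iteration order is unspecified, so sort for stable output.
    out.sort();
    out
}
```

Building the inverted index on every cascade is linear in the number of pending entries; if aborts turn out to be frequent, the map could maintain a children index incrementally instead.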
 ### Storing parent_request_id in PendingRequestMap
 The `PendingRequestMap` needs to know the `parent_request_id` for each entry to
 walk the tree. This means `PendingEntry` needs to store the parent ID (or the
 full `OperationContext`):
 ```rust
 enum PendingEntry {
    Call {
        tx: oneshot::Sender<Result<Value, CallError>>,
        timeout: Instant,
        parent_request_id: Option<String>,  // for abort cascade tree
    },
    Subscribe {
        tx: mpsc::Sender<Result<Value, CallError>>,
        timeout: Option<Instant>,
        parent_request_id: Option<String>,  // for abort cascade tree
    },
 }
 ```
 Update the `PendingRequestMap` (from the pending-request-map task) to store
 `parent_request_id` when registering entries. The `register_call` and
 `register_subscribe` methods take an optional `parent_request_id` parameter.
 ### AbortPolicy propagation
 The abort policy is propagated through `OperationEnv::invoke()`:
 - `invoke()` uses the default impl, which delegates to `invoke_with_policy()`
  with `parent.abort_policy.clone()`
 - `invoke_with_policy()` takes an explicit policy — use
  `AbortPolicy::ContinueRunning` for long-running work
 When cascading:
 - `AbortDependents`: abort ALL descendants (started and unstarted)
 - `ContinueRunning`: abort only unstarted descendants; started ones continue to
  completion; no new descendants start
 Determining "started" vs "unstarted" is tricky. A practical approach:
 - A descendant is "started" if its handler has begun executing (the future has
  been polled at least once)
 - A descendant is "unstarted" if it's queued but not yet dispatched
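 This may require tracking dispatch state in `PendingEntry`. A simpler
 approximation: under `ContinueRunning`, abort all descendants that haven't sent
 a `call.responded` yet (they're still pending). This is conservative: it can
 also abort descendants whose handlers have started but not yet responded,
 which is safe but stricter than the policy described above.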
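If dispatch state is tracked explicitly, selecting which descendants to abort becomes a small filter. The sketch below is one possible shape, not the task's required design: `DispatchState` and `select_abort_targets` are hypothetical names, and `AbortPolicy` is redeclared locally so the snippet stands alone.

```rust
/// Local mirror of the task's `AbortPolicy`, redeclared so the sketch compiles alone.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AbortPolicy {
    AbortDependents,
    ContinueRunning,
}

/// Hypothetical marker a `PendingEntry` could carry to record dispatch progress.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DispatchState {
    /// Registered but the handler future has not been polled yet.
    Queued,
    /// The handler has begun executing.
    Started,
}

/// Pick the descendants that should receive `call.aborted` under `policy`.
pub fn select_abort_targets(
    policy: AbortPolicy,
    descendants: &[(String, DispatchState)],
) -> Vec<String> {
    descendants
        .iter()
        .filter(|(_, state)| match policy {
            // Everything downstream is aborted, started or not.
            AbortPolicy::AbortDependents => true,
            // Started work runs to completion; only queued work is aborted.
            AbortPolicy::ContinueRunning => *state == DispatchState::Queued,
        })
        .map(|(id, _)| id.clone())
        .collect()
}
```

Preventing new descendants from starting under `ContinueRunning` is not covered here; that would need a separate check at `invoke()` time.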
 ### Handler cleanup
 Handlers clean up resources when their call is cancelled. In Rust, the future
 is dropped and `Drop` guards release resources (HTTP streams, file handles,
 locks). This is a handler-level concern; the protocol's job is to cascade the
 abort. See ADR-016.
 ## Acceptance Criteria
 - [ ] `PendingEntry` stores `parent_request_id` (Call and Subscribe variants)
 - [ ] `register_call` and `register_subscribe` accept optional `parent_request_id`
 - [ ] `AbortCascade` struct with `cascade_abort()` method
 - [ ] `cascade_abort` walks the tree by `parent_request_id`
 - [ ] `AbortDependents`: aborts ALL descendants (started and unstarted)
 - [ ] `ContinueRunning`: aborts unstarted descendants, started ones continue
 - [ ] `cascade_abort` returns list of aborted request IDs
 - [ ] `call.aborted` for unknown request_id is silently discarded
 - [ ] Composed child request_ids are internal (not sent as call.requested to peer)
 - [ ] Client only sees call.aborted for the root ID it sent
 - [ ] AbortPolicy propagated through OperationEnv::invoke()
 - [ ] Unit test: cascade aborts all descendants under AbortDependents
 - [ ] Unit test: cascade aborts only unstarted under ContinueRunning
 - [ ] Unit test: unknown request_id → no-op (silently discarded)
 - [ ] Unit test: tree with depth 3, abort root → all descendants aborted
 - [ ] `cargo test -p alknet-call` succeeds
 - [ ] `cargo clippy -p alknet-call` succeeds with no warnings
 ## References
 - docs/architecture/decisions/016-abort-cascade-for-nested-calls.md — ADR-016 (full rationale)
 - docs/architecture/crates/call/call-protocol.md — Abort Cascade and Nested Calls section
 - docs/architecture/crates/call/operation-registry.md — AbortPolicy, OperationContext.abort_policy
 ## Notes
 > **Read ADR-016 before starting.** The abort cascade walks the call tree
 > indexed by parent_request_id in PendingRequestMap. The default policy
 > (AbortDependents) aborts everything downstream — this is correct because
 > aborted parent work has no consumer. ContinueRunning is the opt-in for
 > long-running work. Composed child request_ids are internal — the client only
 > sees call.aborted for the root ID. The PendingRequestMap needs to store
 > parent_request_id for tree walking — update the pending-request-map task's
 > output if needed.
 ## Summary
 > To be filled on completion
--- a/tasks/call/protocol/call-adapter.md
+++ b/tasks/call/protocol/call-adapter.md
@@ -0,0 +1,260 @@
 ---
 id: call/protocol/call-adapter
 name: Implement CallAdapter (ProtocolHandler for alknet/call) with stream handling, identity resolution, and root context construction
 status: pending
 depends_on: [call/protocol/call-connection, call/registry/operation-env, call/registry/service-discovery, core/endpoint]
 scope: broad
 risk: high
 impact: component
 level: implementation
 ---
 ## Description
 Implement `CallAdapter` in `src/protocol/adapter.rs`. This is the
 `ProtocolHandler` implementation for ALPN `alknet/call` — the merge point of the
 registry and protocol strands. It ties everything together: stream handling,
 identity resolution, root context construction, env composition, dispatch.
 ### CallAdapter struct
 ```rust
 pub struct CallAdapter {
    registry: Arc<OperationRegistry>,           // Layer 0 — curated, immutable
    identity_provider: Arc<dyn IdentityProvider>,
    session_source: Option<Arc<dyn SessionOverlaySource + Send + Sync>>,  // Layer 1
    default_timeout: Duration,                   // 30s default
 }
 impl CallAdapter {
    pub fn new(registry: Arc<OperationRegistry>, identity_provider: Arc<dyn IdentityProvider>) -> Self {
        Self { registry, identity_provider, session_source: None,
               default_timeout: Duration::from_secs(30) }
    }
    pub fn with_session_source(mut self, source: Arc<dyn SessionOverlaySource + Send + Sync>) -> Self {
        self.session_source = Some(source);
        self
    }
    pub fn with_timeout(mut self, timeout: Duration) -> Self {
        self.default_timeout = timeout;
        self
    }
 }
 ```
 ### SessionOverlaySource trait
 ```rust
 pub trait SessionOverlaySource: Send + Sync {
    fn overlay_for(&self, context: &OperationContext) -> Option<Arc<dyn OperationEnv + Send + Sync>>;
 }
 ```
 Defined in alknet-call because CallAdapter must name the type — alknet-call
 cannot depend on alknet-agent (agent depends on call, not reverse). The agent
 crate implements this trait; alknet-call defines it. Same pattern as
 IdentityProvider (ADR-004).
 ### ProtocolHandler impl
 ```rust
 #[async_trait]
 impl ProtocolHandler for CallAdapter {
    fn alpn(&self) -> &'static [u8] { b"alknet/call" }
    async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError> {
        // 1. Create CallConnection from the Connection
        // 2. Spawn a task that continuously calls connection.accept_bi()
        // 3. For each accepted stream, read EventEnvelope frames (FrameFramedReader)
        // 4. Dispatch call.requested events to the operation registry
        // 5. Write response EventEnvelope frames (FrameFramedWriter)
        // 6. Manage PendingRequestMap for outgoing calls
        // 7. On connection close: fail all pending, return Ok or Err(ConnectionClosed)
    }
 }
 ```
 ### Stream handling
 The adapter:
 1. Spawns a task that continuously calls `connection.accept_bi()` to receive
   incoming streams
 2. For each accepted stream, reads `EventEnvelope` frames using
   `FrameFramedReader`
 3. Dispatches `call.requested` events to the operation registry
 4. Writes response `EventEnvelope` frames using `FrameFramedWriter`
 5. Manages `PendingRequestMap` for outgoing calls initiated by the server
 For outgoing calls (server → client), the adapter:
 1. Opens a bidirectional stream with `connection.open_bi()`
 2. Sends `call.requested` on that stream
 3. Adds the request ID to the `PendingRequestMap`
 4. Reads responses from any stream, correlates by ID
 ### Identity resolution (per-request)
 The CallAdapter resolves identity per-request, not per-connection:
 1. The endpoint provides `AuthContext` with whatever identity it resolved at
   the TLS layer (may be `None`)
 2. When a `call.requested` event arrives, the CallAdapter constructs an
   `OperationContext` with the connection-level `AuthContext.identity`
 3. If the `call.requested` payload includes an `auth_token` field, the
   CallAdapter resolves it using `IdentityProvider::resolve_from_token()`. If
   resolution succeeds, the resulting `Identity` replaces the connection-level
   identity in the `OperationContext`. If resolution fails, the request
   proceeds with the connection-level identity (which may be `None`)
 4. The `OperationContext.identity` is passed to the `OperationRegistry` for
   ACL checking
 5. If `identity` is `None` and the operation's `AccessControl` has
   restrictions, the registry returns `FORBIDDEN` with message
   `"authentication required"`
 **Key point**: Identity is resolved per-request. This allows a single
 connection to upgrade authentication mid-session and allows different operations
 on the same connection to have different identity levels.
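The override rule in steps 2 and 3 reduces to "token identity if it resolves, otherwise the connection identity". A minimal sketch follows, assuming a synchronous, hypothetical `TokenResolver` trait in place of `IdentityProvider::resolve_from_token()` (whose real signature is not shown here) and a placeholder `Identity` type. `StaticResolver` exists only for illustration.

```rust
/// Placeholder for alknet-core's `Identity`; the real type is richer.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Identity(pub String);

/// Hypothetical synchronous stand-in for `IdentityProvider::resolve_from_token()`.
pub trait TokenResolver {
    fn resolve_from_token(&self, token: &str) -> Option<Identity>;
}

/// Illustration-only resolver that accepts exactly one token.
pub struct StaticResolver {
    pub token: String,
    pub identity: Identity,
}

impl TokenResolver for StaticResolver {
    fn resolve_from_token(&self, token: &str) -> Option<Identity> {
        (token == self.token).then(|| self.identity.clone())
    }
}

/// Per-request identity: a resolvable `auth_token` replaces the connection-level
/// identity; a missing or unresolvable token falls back to it (possibly `None`).
pub fn effective_identity(
    connection_identity: Option<Identity>,
    auth_token: Option<&str>,
    resolver: &dyn TokenResolver,
) -> Option<Identity> {
    auth_token
        .and_then(|token| resolver.resolve_from_token(token))
        .or(connection_identity)
}
```

Note that a failed token never downgrades the request below the connection-level identity; whether a failed token should instead be rejected outright is a policy question the task text answers with "proceed".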
 ### Root OperationContext construction
 When a `call.requested` arrives from the wire, the CallAdapter constructs the
 root `OperationContext` — the entry point of the call tree. This sets
 `internal: false`, meaning ACL runs against the caller's `identity`, not a
 handler's composition authority (ADR-015, ADR-022).
 ```rust
 fn build_root_context(
    &self,
    request_id: String,
    operation_name: &str,
    identity: Option<Identity>,
    /* connection, session */
 ) -> OperationContext {
    let registration = self.registry.registration(operation_name);
    OperationContext {
        request_id,
        parent_request_id: None,        // wire request — top of call tree
        identity: identity.clone(),     // caller's identity (inbound)
        handler_identity: registration.composition_authority.clone(),
        capabilities: registration.capabilities.clone(),
        metadata: HashMap::new(),
        deadline: Some(Instant::now() + self.default_timeout),
        scoped_env: registration.scoped_env.clone()
            .unwrap_or_else(ScopedOperationEnv::empty),
        env: self.compose_root_env(/* connection, session */),
        abort_policy: AbortPolicy::default(),  // abort-dependents
        internal: false,                 // external call — ACL against caller identity
    }
 }
 ```
 ### compose_root_env
 The per-call `env` composition (ADR-024) builds a `CompositeOperationEnv` from:
 - Layer 0: `LocalOperationEnv` (curated registry)
 - Layer 1: session overlay (if active, from `session_source.overlay_for()`)
 - Layer 2: connection overlay (from `CallConnection.overlay_env()`)
 ```rust
 fn compose_root_env(&self, connection: &CallConnection, context: &OperationContext) -> Arc<dyn OperationEnv + Send + Sync> {
    let base = Arc::new(LocalOperationEnv { registry: self.registry.clone() });
    let session = self.session_source.as_ref()
        .and_then(|s| s.overlay_for(context));
    let connection_overlay = connection.overlay_env();
    Arc::new(CompositeOperationEnv { session, connection: Some(connection_overlay), base })
 }
 ```
 ### operationId normalization
 The `call.requested` payload's `operationId` has a leading slash (`/fs/readFile`).
 The CallAdapter strips it before registry lookup (`fs/readFile`). This is a
 single rule applied consistently — the registry stores names without leading
 slash, the wire format adds it.
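The normalization rule can be sketched as a single helper. The function name `normalize_operation_id` is hypothetical, and passing through an ID that arrives without a leading slash is an assumption; the text does not say whether such input should be rejected instead.

```rust
/// Strip the single leading slash the wire format adds to `operationId`.
/// Assumption: IDs without a leading slash are passed through unchanged.
pub fn normalize_operation_id(wire_id: &str) -> &str {
    wire_id.strip_prefix('/').unwrap_or(wire_id)
}
```

Only one slash is removed, so a malformed `//fs/readFile` would still fail registry lookup rather than silently matching.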
 ### ResponseEnvelope → EventEnvelope
 The CallAdapter converts `ResponseEnvelope` (from local dispatch) to
 `EventEnvelope` for the wire:
 | `ResponseEnvelope` | `EventEnvelope` |
 |--------------------|-----------------|
 | `Ok(value)` | `{ type: "call.responded", id: request_id, payload: { output: value } }` |
 | `Err(call_error)` | `{ type: "call.error", id: request_id, payload: <serialized CallError> }` |
 For subscriptions, each `call.responded` is a separate `EventEnvelope` with the
 same `id`; `call.completed` is `{ type: "call.completed", id, payload: {} }`.
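The mapping in the table above might be expressed as follows. This is a simplified sketch: `Payload` is a hypothetical enum standing in for the JSON payload, a `String` stands in for the JSON output value, and the `CallError` fields (`code`, `message`, `retryable`) are inferred from how errors are described in this task rather than taken from the wire-types definition.

```rust
/// Simplified stand-in for the task's `CallError`; fields are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CallError {
    pub code: String,
    pub message: String,
    pub retryable: bool,
}

/// Hypothetical payload shape; the real envelope carries JSON values.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Payload {
    Output(String),
    Error(CallError),
    Empty,
}

/// Reduced `EventEnvelope` with only the fields the table mentions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EventEnvelope {
    pub event_type: &'static str,
    pub id: String,
    pub payload: Payload,
}

/// Ok -> `call.responded`, Err -> `call.error`, both tagged with the request ID.
pub fn response_to_event(request_id: &str, response: Result<String, CallError>) -> EventEnvelope {
    let (event_type, payload) = match response {
        Ok(value) => ("call.responded", Payload::Output(value)),
        Err(error) => ("call.error", Payload::Error(error)),
    };
    EventEnvelope { event_type, id: request_id.to_string(), payload }
}

/// Terminal event for a subscription: same ID, empty payload.
pub fn completed_event(request_id: &str) -> EventEnvelope {
    EventEnvelope { event_type: "call.completed", id: request_id.to_string(), payload: Payload::Empty }
}
```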
 ### Timeout handling
 - Default timeout for wire calls is 30 seconds (`default_timeout`)
 - `build_root_context` sets `OperationContext.deadline` to `now + default_timeout`
 - Composed calls inherit the parent's deadline (children do NOT get a fresh 30s)
 - A composed call that exceeds the deadline is cancelled and returns
  `CallError { code: "TIMEOUT", retryable: true }`
 - Subscriptions default to no deadline (`deadline: None` — unbounded); the
  client can specify a timeout in the `call.requested` payload
 - The `PendingRequestMap` sweeper runs every 10 seconds and removes expired
  wire entries
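The deadline rules in this list can be captured in a few pure functions. The names below are hypothetical helpers, not part of the specified API, and the sketch treats a deadline as expired once `now` reaches it, which is an assumption about the boundary case.

```rust
use std::time::{Duration, Instant};

/// Root wire calls get `now + default_timeout`.
pub fn root_call_deadline(now: Instant, default_timeout: Duration) -> Option<Instant> {
    Some(now + default_timeout)
}

/// Subscriptions are unbounded unless the client supplied a timeout.
pub fn root_subscription_deadline(now: Instant, client_timeout: Option<Duration>) -> Option<Instant> {
    client_timeout.map(|timeout| now + timeout)
}

/// Composed calls inherit the parent's deadline unchanged (no fresh 30s).
pub fn child_deadline(parent_deadline: Option<Instant>) -> Option<Instant> {
    parent_deadline
}

/// A `None` deadline never expires.
pub fn is_expired(deadline: Option<Instant>, now: Instant) -> bool {
    matches!(deadline, Some(d) if now >= d)
}
```

Passing `now` in explicitly, rather than calling `Instant::now()` inside, keeps the rules testable without sleeping.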
 ### Error handling in handle()
 - If a handler panics, the stream is closed and the PendingRequestMap entry is
  cleaned up by the next sweeper pass. Other streams and the connection are
  unaffected.
 - Connection drop: all pending requests are failed with `call.error` code
  `INTERNAL` and message `"connection closed"`, and all subscription channels
  are closed. `handle()` returns `Ok(())` (clean) or `Err(ConnectionClosed)`.
 - Stream reset: `FrameFramedReader` returns an error. For a subscription,
  remove the PendingRequestMap entry and close the mpsc channel. For a call,
  resolve the oneshot with an error. No `call.aborted` is sent because the
  stream is gone.
 ## Acceptance Criteria
 - [ ] `CallAdapter` struct with registry, identity_provider, session_source, default_timeout
 - [ ] `CallAdapter::new()`, `with_session_source()`, `with_timeout()` constructors
 - [ ] `SessionOverlaySource` trait defined with `overlay_for()` method
 - [ ] `ProtocolHandler::alpn()` returns `b"alknet/call"`
 - [ ] `handle()` accepts streams, reads EventEnvelope frames, dispatches
 - [ ] `handle()` spawns task for continuous `accept_bi()`
 - [ ] Outgoing calls: open_bi, send call.requested, add to PendingRequestMap
 - [ ] Identity resolution: AuthContext.identity used, auth_token overrides per-request
 - [ ] auth_token resolution failure → proceed with connection-level identity
 - [ ] `build_root_context` sets internal: false, deadline, capabilities from registration
 - [ ] `compose_root_env` builds CompositeOperationEnv (base + session + connection)
 - [ ] operationId leading slash stripped before registry lookup
 - [ ] ResponseEnvelope → EventEnvelope conversion (Ok → responded, Err → error)
 - [ ] Subscriptions: multiple call.responded with same id, then call.completed
 - [ ] Timeout: 30s default, composed calls inherit parent deadline
 - [ ] Handler panic: stream closed, PendingRequestMap cleaned up, others unaffected
 - [ ] Connection drop: fail all pending with INTERNAL, return Ok or Err
 - [ ] Unit test: CallAdapter alpn returns b"alknet/call"
 - [ ] Integration test: call.requested → dispatch → call.responded round-trip
 - [ ] Integration test: auth_token overrides connection-level identity
 - [ ] Integration test: Internal op called from wire → NOT_FOUND
 - [ ] Integration test: ACL denied → FORBIDDEN
 - [ ] `cargo test -p alknet-call` succeeds
 - [ ] `cargo clippy -p alknet-call` succeeds with no warnings
 ## References
 - docs/architecture/crates/call/call-protocol.md — CallAdapter, stream handling, root context
 - docs/architecture/crates/call/operation-registry.md — OperationContext construction
 - docs/architecture/decisions/015-privilege-model-and-authority-context.md — ADR-015 (internal: false for wire)
 - docs/architecture/decisions/024-operation-registry-layering.md — ADR-024 (env composition)
 - docs/architecture/decisions/012-call-protocol-stream-model.md — ADR-012
 ## Notes
 > This is the merge point of the registry and protocol strands and the
 > highest-risk task in the call crate. It ties together stream handling, identity
 > resolution, root context construction, env composition, and dispatch. The
 > per-request identity resolution (auth_token overrides connection-level) is
 > important — a single connection can upgrade auth mid-session. The
 > compose_root_env builds the CompositeOperationEnv per call from the active
 > layers. operationId on the wire has a leading slash; strip it before lookup.
 ## Summary
 > To be filled on completion
--- a/tasks/call/protocol/call-connection.md
+++ b/tasks/call/protocol/call-connection.md
@@ -0,0 +1,158 @@
 ---
 id: call/protocol/call-connection
 name: Implement CallConnection with imported-ops overlay (Layer 2) and call/subscribe/abort methods
 status: pending
 depends_on: [call/protocol/pending-request-map, call/registry/operation-env]
 scope: moderate
 risk: medium
 impact: component
 level: implementation
 ---
 ## Description
 Implement `CallConnection` in `src/protocol/connection.rs`. This represents an
 established `alknet/call` connection, regardless of which side opened it
 (ADR-017). It holds the connection's imported-ops overlay (Layer 2, ADR-024).
 ### CallConnection
 ```rust
 pub struct CallConnection {
    connection: Connection,
    imported_operations: Arc<RwLock<HashMap<String, HandlerRegistration>>>,
 }
 ```
 An established alknet/call connection (either direction — accepted or opened).
 Holds the Layer 2 overlay (imported ops from `from_call` discovery).
 ### Layer 2 registration API
 ```rust
 impl CallConnection {
    /// Register an imported operation into this connection's overlay (Layer 2, ADR-024).
    /// Called by from_call after discovery.
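    pub fn register_imported(&self, registration: HandlerRegistration) {
        let name = registration.spec.name.clone();
        // Assumes a lock whose `write()` returns the guard directly (e.g. parking_lot);
        // with `std::sync::RwLock`, handle the returned `LockResult` first.
        self.imported_operations.write().insert(name, registration);
    }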
    /// Register multiple imported operations (bulk variant for from_call).
    pub fn register_imported_all(&self, registrations: Vec<HandlerRegistration>) {
        let mut overlay = self.imported_operations.write();
        for reg in registrations {
            overlay.insert(reg.spec.name.clone(), reg);
        }
    }
 }
 ```
 Layer 0 (curated) is built via `OperationRegistryBuilder` at startup. Layer 2
 (per-connection) registration uses `CallConnection::register_imported()` at
 runtime. When the connection drops, the overlay (and all imported ops) is
 dropped — no explicit deregistration needed.
 ### Overlay env
 ```rust
 impl CallConnection {
    /// Build an OperationEnv impl for this connection's overlay.
    /// Used by the CallAdapter when composing the root OperationContext.env.
    /// Returns an OperationEnv that dispatches to this connection's imported ops
    /// (and reports contains only for ops in the overlay).
    pub fn overlay_env(&self) -> Arc<dyn OperationEnv + Send + Sync>;
 }
 ```
 This is an `OperationEnv` impl that dispatches to the connection's imported ops.
 The `contains()` method returns true only for ops in the overlay. The
 `invoke_with_policy()` method looks up the op in the overlay and dispatches to
 its handler.
 This env is composed into the `CompositeOperationEnv` by the CallAdapter as the
 `connection` layer (Layer 2).
 ### Call methods (outgoing)
 ```rust
 impl CallConnection {
    /// Call an operation on the remote peer (sends call.requested).
    pub async fn call(&self, operation_id: &str, input: Value) -> ResponseEnvelope;
    /// Subscribe to a streaming operation on the remote peer.
    pub async fn subscribe(&self, operation_id: &str, input: Value) -> impl Stream<Item = ResponseEnvelope>;
    /// Abort an in-flight request (sends call.aborted, cascades per ADR-016).
    pub async fn abort(&self, request_id: &str);
 }
 ```
 These methods:
 1. Open a bidirectional stream with `connection.open_bi()`
 2. Send `call.requested` on that stream (via FrameFramedWriter)
 3. Add the request ID to the PendingRequestMap
 4. Read responses from any stream, correlate by ID (via PendingRequestMap)
 `call()` resolves on the first `call.responded`. `subscribe()` yields each
 `call.responded` until `call.completed` or `call.aborted`.
 `abort()` sends `call.aborted` for the given request ID. The abort cascade
 (ADR-016) is handled by the abort-cascade task.
 ### Connection direction independence
 Per ADR-017, connection direction is independent of call direction. Both
 sides can call each other once connected. The `CallConnection` type is the same
 whether the connection was accepted (server side) or opened (client side via
 `CallClient`). The `call`/`subscribe`/`abort` methods work the same way.
 ### from_call integration
 The `from_call` adapter (ADR-017) discovers operations on a remote call
 protocol endpoint via `services/list` and `services/schema`, then registers
 them with `register_imported()` / `register_imported_all()`. This makes
 cross-node composition transparent — a handler calling
 `env.invoke("worker", "exec", ...)` doesn't know whether the operation is
 local or remote.
 The `from_call` adapter itself is not implemented in this task — it's a future
 task. This task implements the `CallConnection` infrastructure that `from_call`
 will use.
 ## Acceptance Criteria
 - [ ] `CallConnection` struct with connection and imported_operations fields
 - [ ] `register_imported()` adds to the Layer 2 overlay
 - [ ] `register_imported_all()` bulk adds to the overlay
 - [ ] `overlay_env()` returns an OperationEnv dispatching to imported ops
 - [ ] `overlay_env().contains()` returns true only for ops in the overlay
 - [ ] `call()` sends call.requested, resolves on first call.responded
 - [ ] `subscribe()` sends call.requested, yields call.responded until completed/aborted
 - [ ] `abort()` sends call.aborted for the request ID
 - [ ] Outgoing calls open a stream, send request, add to PendingRequestMap
 - [ ] Connection drop drops the overlay (no explicit deregistration)
 - [ ] Unit test: register_imported adds to overlay, contains returns true
 - [ ] Unit test: overlay_env dispatches to imported op
 - [ ] Unit test: overlay_env contains returns false for non-imported op
 - [ ] `cargo test -p alknet-call` succeeds
 - [ ] `cargo clippy -p alknet-call` succeeds with no warnings
 ## References
 - docs/architecture/crates/call/call-protocol.md — CallConnection section
 - docs/architecture/decisions/017-call-protocol-client-and-adapter-contract.md — ADR-017
 - docs/architecture/decisions/024-operation-registry-layering.md — ADR-024 (Layer 2)
 ## Notes
 > Connection direction is independent of call direction (ADR-017) — both sides
 > can call each other. The Layer 2 overlay is per-connection: when the
 > connection drops, the overlay drops (no deregistration needed). The
 > overlay_env() is composed into CompositeOperationEnv by the CallAdapter as
 > the connection layer. The from_call adapter itself is a future task — this
 > implements the infrastructure it will use.
 ## Summary
 > To be filled on completion
--- a/tasks/call/protocol/pending-request-map.md
+++ b/tasks/call/protocol/pending-request-map.md
@@ -0,0 +1,164 @@
 ---
 id: call/protocol/pending-request-map
 name: Implement PendingRequestMap for correlating call.requested and call.responded events
 status: pending
 depends_on: [call/protocol/wire-types]
 scope: moderate
 risk: medium
 impact: component
 level: implementation
 ---
 ## Description
 Implement `PendingRequestMap` in `src/protocol/pending.rs`. This manages
 in-flight calls and subscriptions, correlating `call.responded` events back to
 the original `call.requested` by request ID.
 ### PendingRequestMap
 ```rust
 pub struct PendingRequestMap {
    pending: HashMap<String, PendingEntry>,
 }
 enum PendingEntry {
    Call {
        tx: oneshot::Sender<Result<Value, CallError>>,
        timeout: Instant,
    },
    Subscribe {
        tx: mpsc::Sender<Result<Value, CallError>>,
        timeout: Option<Instant>,
    },
 }
 ```
 ### Behavior
 When a `call.responded` event arrives:
 - If `PendingEntry::Call` → resolve the oneshot, delete entry
 - If `PendingEntry::Subscribe` → push to the mpsc channel, keep entry alive
 When `call.completed` arrives on a subscription → close the mpsc channel, delete entry.
 When `call.aborted` arrives → cancel the pending request and delete the entry,
 whichever side initiated the abort. A `call.aborted` for an unknown
 `requestId` is silently discarded.
 When `call.error` arrives → resolve the oneshot (Call) or push to channel
 (Subscribe) with the error, delete entry.
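 The dispatch rules above can be sketched with standard-library channels. This
 is a simplified, synchronous stand-in rather than the task's implementation:
 `std::sync::mpsc` replaces the async oneshot/mpsc channels, `String` replaces
 both `Value` and `CallError`, and the names `PendingSketch`, `Entry`, and
 `Reply` are hypothetical.
 ```rust
 use std::collections::HashMap;
 use std::sync::mpsc::{channel, Receiver, Sender};
 /// Stand-in for `Result<Value, CallError>` (assumption: plain strings).
 pub type Reply = Result<String, String>;
 enum Entry {
     Call(Sender<Reply>),      // resolved once, then removed
     Subscribe(Sender<Reply>), // receives many items until completed
 }
 #[derive(Default)]
 pub struct PendingSketch {
     pending: HashMap<String, Entry>,
 }
 impl PendingSketch {
     pub fn register_call(&mut self, id: &str) -> Receiver<Reply> {
         let (tx, rx) = channel();
         self.pending.insert(id.to_string(), Entry::Call(tx));
         rx
     }
     pub fn register_subscribe(&mut self, id: &str) -> Receiver<Reply> {
         let (tx, rx) = channel();
         self.pending.insert(id.to_string(), Entry::Subscribe(tx));
         rx
     }
     /// call.responded: a Call entry is resolved and removed; a Subscribe
     /// entry receives the item and stays registered.
     pub fn handle_responded(&mut self, id: &str, output: String) -> bool {
         match self.pending.remove(id) {
             Some(Entry::Call(tx)) => {
                 let _ = tx.send(Ok(output));
                 true
             }
             Some(Entry::Subscribe(tx)) => {
                 let _ = tx.send(Ok(output));
                 self.pending.insert(id.to_string(), Entry::Subscribe(tx));
                 true
             }
             None => false, // unknown ID: silently discarded
         }
     }
     /// call.completed: removing the entry drops the sender, which closes
     /// the stream. Only subscriptions are affected.
     pub fn handle_completed(&mut self, id: &str) -> bool {
         let is_subscription = matches!(self.pending.get(id), Some(Entry::Subscribe(_)));
         is_subscription && self.pending.remove(id).is_some()
     }
     /// call.error: deliver the error, then remove the entry.
     pub fn handle_error(&mut self, id: &str, error: String) -> bool {
         match self.pending.remove(id) {
             Some(Entry::Call(tx)) | Some(Entry::Subscribe(tx)) => {
                 let _ = tx.send(Err(error));
                 true
             }
             None => false,
         }
     }
     pub fn contains(&self, id: &str) -> bool {
         self.pending.contains_key(id)
     }
 }
 ```
 Because the map is keyed by request ID alone, none of these handlers needs to
 know which stream delivered the event.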
 ### Timeouts
 Timeouts prevent dangling entries. A background task sweeps expired entries
 periodically (every 10 seconds per call-protocol.md).
 - `Call` entries have a timeout (default 30s from CallAdapter.default_timeout)
 - `Subscribe` entries may have `timeout: None` (unbounded — long-running
  subscriptions)
 When the sweeper finds an expired entry:
 - `Call`: resolve oneshot with `CallError { code: "TIMEOUT", retryable: true }`, delete
 - `Subscribe`: push a `TIMEOUT` error to the mpsc channel, close it, delete
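 A minimal sketch of the sweep itself, assuming the map is reduced to request
 ID → optional deadline (the channel handling is omitted). Passing `now` as a
 parameter is a choice made here for testability, not something the task
 specifies.
 ```rust
 use std::collections::HashMap;
 use std::time::Instant;
 /// Remove every entry whose deadline has passed and return the evicted IDs.
 /// `None` means "no timeout" (unbounded subscription) and is never evicted.
 pub fn evict_expired(
     pending: &mut HashMap<String, Option<Instant>>,
     now: Instant,
 ) -> Vec<String> {
     let mut evicted = Vec::new();
     pending.retain(|id, deadline| match deadline {
         Some(d) if *d <= now => {
             evicted.push(id.clone());
             false // drop the expired entry
         }
         _ => true, // keep unexpired and unbounded entries
     });
     evicted.sort(); // HashMap iteration order is unspecified
     evicted
 }
 ```
 In the real map, each evicted entry would also have its channel resolved with
 the `TIMEOUT` error before being dropped.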
 ### Methods
 ```rust
 impl PendingRequestMap {
    pub fn new() -> Self;
    /// Register a pending call. Returns a oneshot receiver for the result.
    pub fn register_call(&mut self, request_id: String, timeout: Instant) -> oneshot::Receiver<Result<Value, CallError>>;
    /// Register a pending subscription. Returns an mpsc receiver for the stream.
    pub fn register_subscribe(&mut self, request_id: String, timeout: Option<Instant>) -> mpsc::Receiver<Result<Value, CallError>>;
    /// Handle an incoming call.responded event.
    /// Returns true if the entry was found and handled.
    pub fn handle_responded(&mut self, request_id: &str, output: Value) -> bool;
    /// Handle an incoming call.completed event (subscriptions only).
    /// Closes the mpsc channel, deletes entry.
    pub fn handle_completed(&mut self, request_id: &str) -> bool;
    /// Handle an incoming call.aborted event.
    /// Cancels the pending request, deletes entry.
    pub fn handle_aborted(&mut self, request_id: &str) -> bool;
    /// Handle an incoming call.error event.
    /// Resolves with the error, deletes entry.
    pub fn handle_error(&mut self, request_id: &str, error: CallError) -> bool;
    /// Sweep expired entries. Called periodically by a background task.
    pub fn evict_expired(&mut self) -> Vec<String>;  // returns evicted request IDs
    /// Fail all pending requests (connection closed). Returns the request IDs that were failed.
    pub fn fail_all(&mut self, error: CallError) -> Vec<String>;
    /// Check if a request ID is pending.
    pub fn contains(&self, request_id: &str) -> bool;
    /// Number of pending entries.
    pub fn len(&self) -> usize;
 }
 ```
 ### Connection drop handling
 When the QUIC connection closes, all pending requests are failed with
 `call.error` code `INTERNAL` and message `"connection closed"`. All
 subscription channels are closed. This is `fail_all()`.
 ### Stream reset handling
 When a QUIC stream is reset mid-operation, the `FrameFramedReader` returns an
 error. If the stream was carrying a subscription, the PendingRequestMap entry
 is removed and the mpsc channel is closed. If the stream was carrying a call,
 the oneshot is resolved with an error. No `call.aborted` is sent — the stream
 is gone.
 ### Correlation is by ID, not by stream
 A response arriving on stream N can fulfill a request sent on stream M. The
 `PendingRequestMap` is keyed by ID, not by stream. This is the stream-agnostic
 correlation property from ADR-012.
 ## Acceptance Criteria
 - [ ] `PendingRequestMap` struct with pending HashMap
 - [ ] `PendingEntry::Call` with oneshot::Sender and timeout
 - [ ] `PendingEntry::Subscribe` with mpsc::Sender and optional timeout
 - [ ] `register_call` returns oneshot::Receiver
 - [ ] `register_subscribe` returns mpsc::Receiver
 - [ ] `handle_responded` resolves Call oneshot, pushes to Subscribe channel
 - [ ] `handle_completed` closes Subscribe mpsc, deletes entry
 - [ ] `handle_aborted` cancels pending, deletes entry
 - [ ] `handle_error` resolves with error, deletes entry
 - [ ] Unknown request_id in handle_* is silently discarded (returns false)
 - [ ] `evict_expired` removes timed-out entries, resolves with TIMEOUT error
 - [ ] `fail_all` fails all pending with given error (connection close)
 - [ ] Correlation is by request ID, not by stream
 - [ ] Unit test: register call, handle_responded → oneshot resolves
 - [ ] Unit test: register subscribe, handle multiple responded, handle_completed → stream ends
 - [ ] Unit test: expired call → evict_expired resolves with TIMEOUT
 - [ ] Unit test: fail_all resolves all pending with INTERNAL error
 - [ ] Unit test: unknown request_id handle_responded → false (silently discarded)
 - [ ] `cargo test -p alknet-call` succeeds
 - [ ] `cargo clippy -p alknet-call` succeeds with no warnings
 ## References
 - docs/architecture/crates/call/call-protocol.md — PendingRequestMap section
 - docs/architecture/decisions/012-call-protocol-stream-model.md — ADR-012 (ID-based correlation)
 ## Notes
 > Correlation is by request ID, not by stream — a response on stream N can
 > fulfill a request sent on stream M. This is the stream-agnostic property from
 > ADR-012. The sweeper runs every 10 seconds to evict expired entries. Unknown
 > request IDs in handle_* are silently discarded (not an error — the entry may
 > have already been resolved/cleaned up).
 ## Summary
 > To be filled on completion
--- a/tasks/call/protocol/wire-types.md
+++ b/tasks/call/protocol/wire-types.md
@@ -0,0 +1,219 @@
 ---
 id: call/protocol/wire-types
 name: Implement EventEnvelope, ResponseEnvelope, CallError, and length-prefixed JSON framing
 status: pending
 depends_on: [call/crate-init]
 scope: moderate
 risk: medium
 impact: component
 level: implementation
 ---
 ## Description
 Implement the wire protocol types and framing in `src/protocol/wire.rs`. Every
 message on the wire is a length-prefixed JSON `EventEnvelope`.
 ### EventEnvelope
 ```rust
 pub struct EventEnvelope {
    pub r#type: String,    // Event type
    pub id: String,        // Correlation key (request ID, subscription ID)
    pub payload: Value,    // serde_json::Value — schema depends on event type
 }
 // Frame: 4-byte big-endian length prefix + UTF-8 JSON body
 ```
 The envelope is JSON because it must be consumable from JavaScript, Python, and
 any other language. The `Value` type is `serde_json::Value`.
 Binary payloads (postcard, protobuf) are base64-encoded as a JSON string within
 the `payload` field. The envelope itself does not interpret the payload — this
 is a handler-level concern, not a protocol-level concern.
 ### Event Types
 Five event types:
 | Event | Direction | Purpose |
 |-------|-----------|---------|
 | `call.requested` | Caller → Handler | Initiate a call or subscription |
 | `call.responded` | Handler → Caller | Deliver a result (one for calls, many for subscriptions) |
 | `call.completed` | Handler → Caller | Signal end of subscription stream |
 | `call.aborted` | Either side | Cancel the call/subscription |
 | `call.error` | Handler → Caller | Signal an error |
 ### Wire Payload Schemas
 | Event | `payload` shape |
 |-------|----------------|
 | `call.requested` | `{ "operationId": "/fs/readFile", "input": {...}, "auth_token": "alk_..." (optional) }` |
 | `call.responded` | `{ "output": <Value> }` |
 | `call.completed` | `{}` — empty object |
 | `call.aborted` | `{}` — empty object |
 | `call.error` | `{ "code": "...", "message": "...", "retryable": bool, "details": {...} (optional) }` |
 ### call.requested payload
 ```json
 {
  "operationId": "/fs/readFile",
  "input": { ... },
  "auth_token": "alk_..."    // optional
 }
 ```
 - `operationId` — the operation to invoke, **with a leading slash** on the wire.
  The registry stores names without the leading slash; the wire format adds it.
  The CallAdapter strips the leading slash before registry lookup.
 - `input` — the operation input, matching the operation's `input_schema`.
 - `auth_token` — optional. If present, CallAdapter resolves via
  `IdentityProvider::resolve_from_token()`. Resulting Identity takes precedence
  over connection-level identity for this request.
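 The leading-slash convention for `operationId` can be sketched as two small
 helpers. Both function names are hypothetical, and passing through an ID that
 lacks the slash unchanged is an assumption; the task does not say how a
 missing slash is handled.
 ```rust
 /// Map a wire `operationId` (leading slash) to the registry name (no slash).
 /// Assumption: an ID without a leading slash is returned unchanged.
 pub fn registry_name(operation_id: &str) -> &str {
     operation_id.strip_prefix('/').unwrap_or(operation_id)
 }
 /// Map a registry name back to its wire form by adding the leading slash.
 pub fn wire_operation_id(name: &str) -> String {
     format!("/{}", name)
 }
 ```
 Only the first slash is removed, so nested names such as `fs/readFile` keep
 their inner separators.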
 The `call.requested` payload does **not** carry an abort policy field. The abort
 policy is set on `OperationContext` and propagated through
 `OperationEnv::invoke()` — the composing handler decides, not the wire caller.
 ### call.error payload
 ```json
 {
  "code": "FILE_NOT_FOUND",
  "message": "file not found: /etc/nonexistent",
  "retryable": false,
  "details": { "path": "/etc/nonexistent", "errno": 2 }
 }
 ```
 Protocol-level codes (emitted by dispatch machinery):
 - `NOT_FOUND` — operation not in registry (or Internal op called from wire)
 - `FORBIDDEN` — access denied
 - `INVALID_INPUT` — input doesn't match JSON Schema
 - `INTERNAL` — handler error, panic, connection failure
 - `TIMEOUT` — request timed out (retryable: true)
 Operation-level domain codes (emitted by handlers, ADR-023): e.g.,
 `FILE_NOT_FOUND`, `RATE_LIMITED`. These carry a `details` payload conforming to
 the declared `ErrorDefinition.schema`.
 New error codes may be added in future. Clients should treat unknown codes as
 `INTERNAL` with `retryable: false`.
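 On the client side, the unknown-code rule could look like the sketch below.
 The function name and the `declared` parameter (the domain codes a client
 knows for the operation it called) are assumptions made for illustration.
 ```rust
 /// Protocol-level codes listed in this task.
 const PROTOCOL_CODES: [&str; 5] =
     ["NOT_FOUND", "FORBIDDEN", "INVALID_INPUT", "INTERNAL", "TIMEOUT"];
 /// Return the (code, retryable) pair a client should act on.
 /// Unknown codes collapse to ("INTERNAL", false).
 pub fn effective_error<'a>(
     code: &'a str,
     retryable: bool,
     declared: &[&str], // hypothetical: domain codes known to this client
 ) -> (&'a str, bool) {
     if PROTOCOL_CODES.contains(&code) || declared.contains(&code) {
         (code, retryable)
     } else {
         ("INTERNAL", false)
     }
 }
 ```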
 ### ResponseEnvelope
 ```rust
 pub struct ResponseEnvelope {
    pub request_id: String,
    pub result: Result<Value, CallError>,
 }
 pub struct CallError {
    pub code: String,
    pub message: String,
    pub retryable: bool,
    pub details: Option<Value>,
 }
 ```
 Local dispatch produces `ResponseEnvelope` with no serialization overhead. The
 CallAdapter converts it to `EventEnvelope` for the wire.
 ### ResponseEnvelope → EventEnvelope conversion
 | `ResponseEnvelope` | `EventEnvelope` |
 |--------------------|-----------------|
 | `Ok(value)` | `{ type: "call.responded", id: request_id, payload: { output: value } }` |
 | `Err(call_error)` | `{ type: "call.error", id: request_id, payload: <serialized CallError> }` |
 For subscriptions, each `call.responded` is a separate `EventEnvelope` with the
 same `id`; `call.completed` is `{ type: "call.completed", id, payload: {} }`.
 ### Framing
 Length-prefixed JSON: 4-byte big-endian length prefix + UTF-8 JSON body.
 Implement:
 - `FrameFramedReader` — reads length-prefixed frames from an async reader
  (RecvStream)
 - `FrameFramedWriter` — writes length-prefixed frames to an async writer
  (SendStream)
 ```rust
 pub struct FrameFramedReader<R: AsyncRead + Unpin> { /* ... */ }
 impl<R: AsyncRead + Unpin> FrameFramedReader<R> {
    pub fn new(reader: R) -> Self;
    pub async fn read_frame(&mut self) -> Result<EventEnvelope, FrameError>;
 }
 pub struct FrameFramedWriter<W: AsyncWrite + Unpin> { /* ... */ }
 impl<W: AsyncWrite + Unpin> FrameFramedWriter<W> {
    pub fn new(writer: W) -> Self;
    pub async fn write_frame(&mut self, envelope: &EventEnvelope) -> Result<(), FrameError>;
 }
 ```
 This is the same framing used by irpc. The Rust implementation in alknet-call is
 canonical (ADR-005, ADR-013).
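 The byte layout can be illustrated with a synchronous sketch over
 `std::io::Read`/`Write`. This is not the async `FrameFramedReader`/
 `FrameFramedWriter` API; it works on raw body bytes instead of
 `EventEnvelope`, and the function names are hypothetical.
 ```rust
 use std::io::{self, Read, Write};
 /// Write one frame: 4-byte big-endian length, then the body bytes.
 pub fn write_frame<W: Write>(writer: &mut W, body: &[u8]) -> io::Result<()> {
     if body.len() > u32::MAX as usize {
         return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
     }
     writer.write_all(&(body.len() as u32).to_be_bytes())?;
     writer.write_all(body)
 }
 /// Read one frame. A stream that ends before the full frame arrives
 /// surfaces as `UnexpectedEof` from `read_exact`.
 pub fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
     let mut prefix = [0u8; 4];
     reader.read_exact(&mut prefix)?;
     let len = u32::from_be_bytes(prefix) as usize;
     let mut body = vec![0u8; len];
     reader.read_exact(&mut body)?;
     Ok(body)
 }
 ```
 The sketch allocates whatever length the prefix announces. A production
 reader would probably cap that length before allocating; the task does not
 specify a limit, so none is shown.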
 ### ResponseEnvelope helper methods
 ```rust
 impl ResponseEnvelope {
    pub fn ok(request_id: String, output: Value) -> Self;
    pub fn error(request_id: String, error: CallError) -> Self;
    pub fn not_found(request_id: String, op_name: &str) -> Self;
    pub fn forbidden(request_id: String, message: &str) -> Self;
 }
 ```
 ### FrameError
 ```rust
 pub enum FrameError {
    Io(io::Error),
    Json(serde_json::Error),
    ConnectionClosed,
    InvalidFrame,
 }
 ```
 ## Acceptance Criteria
 - [ ] `EventEnvelope` struct with type, id, payload fields
 - [ ] `ResponseEnvelope` struct with request_id, result fields
 - [ ] `CallError` struct with code, message, retryable, details fields
 - [ ] `FrameError` enum with Io, Json, ConnectionClosed, InvalidFrame
 - [ ] `FrameFramedReader` reads length-prefixed JSON frames
 - [ ] `FrameFramedWriter` writes length-prefixed JSON frames
 - [ ] 4-byte big-endian length prefix + UTF-8 JSON body
 - [ ] `ResponseEnvelope::ok()`, `error()`, `not_found()`, `forbidden()` helpers
 - [ ] `ResponseEnvelope` → `EventEnvelope` conversion (Ok → call.responded, Err → call.error)
 - [ ] Unit test: write frame, read frame, round-trip EventEnvelope
 - [ ] Unit test: ResponseEnvelope::ok produces correct EventEnvelope
 - [ ] Unit test: ResponseEnvelope::error produces correct call.error EventEnvelope
 - [ ] Unit test: framing handles large payloads
 - [ ] Unit test: framing detects truncated frames (ConnectionClosed error)
 - [ ] `cargo test -p alknet-call` succeeds
 - [ ] `cargo clippy -p alknet-call` succeeds with no warnings
 ## References
 - docs/architecture/crates/call/call-protocol.md — EventEnvelope, wire format, event types
 - docs/architecture/decisions/005-irpc-as-call-protocol-foundation.md — ADR-005
 - docs/architecture/decisions/012-call-protocol-stream-model.md — ADR-012
 - docs/architecture/decisions/023-operation-error-schemas.md — ADR-023 (CallError, details)
 ## Notes
 > The envelope is always JSON for cross-language compatibility. Binary
 > payloads are base64-encoded within the payload field (handler concern, not
 > protocol concern). The 4-byte big-endian length prefix is the same framing
 > irpc uses. operationId on the wire has a leading slash; the registry stores
 > names without it — the CallAdapter strips it before lookup.
 ## Summary
 > To be filled on completion
--- a/tasks/call/registry/handler-registration.md
+++ b/tasks/call/registry/handler-registration.md
@@ -0,0 +1,202 @@
 ---
 id: call/registry/handler-registration
 name: Implement Handler, HandlerRegistration, OperationProvenance, OperationRegistry, and OperationRegistryBuilder
 status: pending
 depends_on: [call/registry/operation-context]
 scope: broad
 risk: medium
 impact: component
 level: implementation
 ---
 ## Description
 Implement the handler registration types and the operation registry in
 `src/registry/registration.rs`. The registry maps operation names to
 registration bundles and provides the dispatch entry point.
 ### Handler
 ```rust
 pub type Handler = Arc<
    dyn Fn(Value, OperationContext) -> Pin<Box<dyn Future<Output = ResponseEnvelope> + Send>>
        + Send + Sync
 >;
 ```
 Handlers are async. They receive:
 - `input: Value` — deserialized payload from `call.requested` (always `serde_json::Value`)
 - `context: OperationContext` — request ID, identity, metadata, env
 They return `ResponseEnvelope` (defined in the protocol/wire task; use a
 forward reference or define a minimal version here, with the full
 implementation in the wire task).
 ### HandlerRegistration
 ```rust
 pub struct HandlerRegistration {
    pub spec: OperationSpec,
    pub handler: Handler,
    pub provenance: OperationProvenance,
    pub composition_authority: Option<CompositionAuthority>, // None for leaves
    pub scoped_env: Option<ScopedOperationEnv>,               // None for leaves
    pub capabilities: Capabilities,
 }
 ```
 The registration bundle carries everything the dispatch path needs to
 construct an `OperationContext`. See ADR-022.
 ### OperationProvenance
 ```rust
 pub enum OperationProvenance {
    Local,           // Assembly-written, trusted, can compose
    FromOpenAPI,     // HTTP forwarding stub, leaf
    FromMCP,         // MCP forwarding stub, leaf
    FromCall,        // QUIC forwarding stub, leaf locally
    FromJsonSchema,  // JSON Schema definition, no handler — schema only
    Session,         // Agent-written, sandboxed, can compose within sandbox
 }
 ```
 | Provenance | Can compose? | Has composition authority? | Default visibility |
 |-----------|-------------|---------------------------|-------------------|
 | `Local` | Yes | Yes | External or Internal (assembly declares) |
 | `FromOpenAPI` | No (leaf) | No | Internal |
 | `FromMCP` | No (leaf) | No | Internal |
 | `FromCall` | No (leaf in local registry) | No | Internal |
 | `FromJsonSchema` | N/A (no handler) | No | N/A |
 | `Session` | Yes (within sandbox) | Yes | Internal always |
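 The first two columns of the table could be encoded as methods on the enum. A
 sketch follows; `can_compose` and `has_handler` are hypothetical names, not
 part of the task's required API.
 ```rust
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 pub enum OperationProvenance {
     Local,
     FromOpenAPI,
     FromMCP,
     FromCall,
     FromJsonSchema,
     Session,
 }
 impl OperationProvenance {
     /// True for provenances that may compose children and therefore
     /// carry a composition authority.
     pub fn can_compose(self) -> bool {
         matches!(self, Self::Local | Self::Session)
     }
     /// False only for `FromJsonSchema`, which is schema-only.
     pub fn has_handler(self) -> bool {
         !matches!(self, Self::FromJsonSchema)
     }
 }
 ```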
 ### OperationRegistry
 ```rust
 pub struct OperationRegistry {
    operations: HashMap<String, HandlerRegistration>,
 }
 ```
 The curated layer (Layer 0) is a `HashMap<String, HandlerRegistration>`. Session
 and connection overlays (Layers 1 and 2) are separate maps composed into the
 per-call `OperationContext.env` by the CallAdapter (ADR-024).
 Methods:
 - `register(registration)`: add to curated layer at startup
 - `registration(name)`: find by operation name (checks active overlays first,
  then curated base — ADR-024). Returns spec, handler, provenance, composition
  authority, scoped env, capabilities.
 - `invoke(name, input, context)`: look up, check ACL, invoke handler, return result
 - `list_operations()`: return all registered specs (for `/services/list` —
  returns curated + active overlay ops, External only)
 ### OperationRegistryBuilder
 Fluent API with convenience methods:
 ```rust
 pub struct OperationRegistryBuilder {
    operations: HashMap<String, HandlerRegistration>,
 }
 impl OperationRegistryBuilder {
    pub fn new() -> Self;
    // with_local: Local provenance, full bundle — all 5 args required
    pub fn with_local(
        mut self,
        spec: OperationSpec,
        handler: Handler,
        composition_authority: Option<CompositionAuthority>,
        scoped_env: Option<ScopedOperationEnv>,
        capabilities: Capabilities,
    ) -> Self;
    // with_leaf: leaf provenance (FromOpenAPI/FromMCP/FromCall), no authority, no scoped env
    pub fn with_leaf(
        mut self,
        spec: OperationSpec,
        handler: Handler,
        capabilities: Capabilities,
    ) -> Self;
    // with: full manual registration (any provenance)
    pub fn with(mut self, registration: HandlerRegistration) -> Self;
    pub fn build(self) -> OperationRegistry;
 }
 ```
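 `with_local` sets `provenance: Local`. `with_leaf` sets a leaf provenance
 (`FromOpenAPI` by default; the signature above takes no provenance argument,
 so selecting `FromMCP` or `FromCall` needs either an added parameter or
 `with`), plus `composition_authority: None` and `scoped_env: None`. `with`
 takes the full bundle for any provenance.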
 ### Registry invoke flow
 ```rust
 impl OperationRegistry {
    pub async fn invoke(&self, name: &str, input: Value, context: OperationContext) -> ResponseEnvelope {
        // 1. Look up registration by name
        // 2. Check visibility: if Internal and context is external (internal: false), return NOT_FOUND
        // 3. Check ACL: access_control.check(identity or handler_identity depending on internal flag)
        // 4. If denied: return FORBIDDEN
        // 5. Invoke handler: (handler)(input, context).await
        // 6. Return ResponseEnvelope
    }
 }
 ```
 The ACL authority depends on `context.internal`:
 - `internal: false` (wire call): check against `context.identity` (caller)
 - `internal: true` (composition): check against `context.handler_identity.as_identity()`
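 A reduced sketch of steps 2 to 4 of the invoke flow, showing the visibility
 check and the authority switch together. The `Identity` shape, the
 single-scope ACL rule, and all names here are assumptions; the real check
 goes through `AccessControl`, and `handler_identity` is a
 `CompositionAuthority` converted via `as_identity()`.
 ```rust
 #[derive(Debug, Clone, PartialEq)]
 pub struct Identity {
     pub id: String,
     pub scopes: Vec<String>,
 }
 /// Reduced context: only the fields the check needs.
 pub struct Ctx {
     pub identity: Option<Identity>,         // caller (wire)
     pub handler_identity: Option<Identity>, // handler authority, already converted
     pub internal: bool,
 }
 #[derive(Debug, PartialEq)]
 pub enum Denied {
     NotFound,
     Forbidden,
 }
 pub fn authorize(ctx: &Ctx, op_is_internal: bool, required_scope: &str) -> Result<(), Denied> {
     // Internal ops are invisible from the wire: NOT_FOUND, not FORBIDDEN.
     if op_is_internal && !ctx.internal {
         return Err(Denied::NotFound);
     }
     // Authority switch: composition checks the handler's authority,
     // wire calls check the caller.
     let subject = if ctx.internal {
         ctx.handler_identity.as_ref()
     } else {
         ctx.identity.as_ref()
     };
     match subject {
         Some(who) if who.scopes.iter().any(|s| s == required_scope) => Ok(()),
         _ => Err(Denied::Forbidden),
     }
 }
 ```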
 ### Layer 0 immutability
 The curated layer (Layer 0 — `Local` provenance ops) is immutable after
 construction. Adding a `Local` op requires restarting the process. Session and
 imported overlays are dynamic at their respective scopes (ADR-024). The
 `OperationRegistryBuilder` is Layer-0-only; runtime overlay registration uses
 `CallConnection::register_imported()` (in the protocol/connection task).
 ## Acceptance Criteria
 - [ ] `Handler` type alias (async closure returning ResponseEnvelope)
 - [ ] `HandlerRegistration` struct with all 6 fields
 - [ ] `OperationProvenance` enum with all 6 variants
 - [ ] `OperationRegistry` struct with operations HashMap
 - [ ] `OperationRegistry::register()` adds to curated layer
 - [ ] `OperationRegistry::registration()` looks up by name
 - [ ] `OperationRegistry::invoke()` checks visibility, ACL, invokes handler
 - [ ] `OperationRegistry::list_operations()` returns External specs only
 - [ ] `OperationRegistryBuilder` with `new()`, `with_local()`, `with_leaf()`, `with()`, `build()`
 - [ ] `with_local` sets provenance Local, requires all 5 args
 - [ ] `with_leaf` sets provenance leaf, composition_authority None, scoped_env None
 - [ ] invoke: Internal op called externally → NOT_FOUND (not FORBIDDEN)
 - [ ] invoke: ACL denied → FORBIDDEN
 - [ ] invoke: internal: true → ACL against handler_identity, not identity
 - [ ] invoke: internal: false → ACL against identity
 - [ ] Unit test: register and invoke a simple operation
 - [ ] Unit test: Internal op returns NOT_FOUND from external call
 - [ ] Unit test: ACL check with sufficient scopes → Allowed
 - [ ] Unit test: ACL check with insufficient scopes → Forbidden
 - [ ] Unit test: builder with_local and with_leaf produce correct provenance
 - [ ] `cargo test -p alknet-call` succeeds
 - [ ] `cargo clippy -p alknet-call` succeeds with no warnings
 ## References
 - docs/architecture/crates/call/operation-registry.md — Handler, HandlerRegistration, OperationRegistry, builder
 - docs/architecture/decisions/022-handler-registration-provenance-and-composition-authority.md — ADR-022
 - docs/architecture/decisions/024-operation-registry-layering.md — ADR-024 (layering, immutability)
 ## Notes
 > The registry is the dispatch core. The ACL authority switch (internal: true
 > → handler_identity, internal: false → identity) is the ADR-015 privilege
 > model — get this right. Internal ops return NOT_FOUND from the wire (don't
 > leak existence), not FORBIDDEN. The builder is Layer-0-only; runtime overlay
 > registration is via CallConnection (protocol task).
 ## Summary
 > To be filled on completion
--- a/tasks/call/registry/operation-context.md
+++ b/tasks/call/registry/operation-context.md
@@ -0,0 +1,204 @@
 ---
 id: call/registry/operation-context
 name: Implement OperationContext, AbortPolicy, CompositionAuthority, and ScopedOperationEnv
 status: pending
 depends_on: [call/registry/operation-spec, core/core-types]
 scope: broad
 risk: high
 impact: component
 level: implementation
 ---
 ## Description
 Implement the operation context types in `src/registry/context.rs`. This is
 the highest-density task in the call crate: `OperationContext` has 11 fields,
 each tied to an ADR. The authority-switch semantics (`internal: true` → ACL
 against `handler_identity`, not `identity`) is where ADR-015, ADR-022, and
 ADR-024 converge.
 **Read ADR-015, ADR-022, and ADR-024 before starting this task.**
 ### OperationContext
 ```rust
 pub struct OperationContext {
    pub request_id: String,
    pub parent_request_id: Option<String>,
    pub identity: Option<Identity>,                       // Caller's identity (inbound)
    pub handler_identity: Option<CompositionAuthority>,    // Handler's composition authority (ADR-022)
    pub capabilities: Capabilities,
    pub metadata: HashMap<String, Value>,
    pub scoped_env: ScopedOperationEnv,                   // Reachability set (data, ADR-022)
    pub env: Arc<dyn OperationEnv + Send + Sync>,          // Composition dispatch trait (ADR-024)
    pub abort_policy: AbortPolicy,                         // ADR-016 Decision 6
    pub deadline: Option<Instant>,
    pub(crate) internal: bool,                             // Crate-private for writes (ADR-015)
 }
 ```
 Field-by-field:
 - `request_id`: correlates with `call.requested` event's `id` field. For wire
  calls, this is the client-generated ID. For composed calls, generated by
  `OperationEnv::invoke()` via `generate_request_id()` (UUID v4 or
  `parent_id + "-" + counter`). **Deterministic IDs must not be used** — they
  collide across concurrent invocations, corrupting PendingRequestMap and the
  abort-cascade tree.
 - `parent_request_id`: set when this call was initiated by another operation
  (via OperationEnv). Records the agency chain — the call tree is the
  principal→agent chain (ADR-015).
 - `identity`: the authenticated caller (from IdentityProvider) — inbound auth
  (who is calling me). For external calls, who sent `call.requested`. For
  internal calls, the parent handler's `handler_identity` (propagated through
  `OperationEnv::invoke()`).
 - `handler_identity`: the composition authority of the handler processing this
  call. `None` for leaves (FromOpenAPI/FromMCP/FromCall) — they don't compose.
  `Some(...)` for Local/Session ops. For internal calls (`internal: true`), ACL
  checks against this authority (ADR-015, ADR-022). This is NOT a peer Identity
  — it's a declared authority bundle set at registration.
 - `capabilities`: outbound credentials the handler may use (decrypted API keys,
  scoped vault access). From the registration bundle (ADR-022).
 - `metadata`: request-scoped context (tracing IDs, connection info). **Must not
  hold secret material** (ADR-014). **Does not propagate through
  `OperationEnv::invoke()`** — nested calls get fresh metadata. The tracing
  link is `parent_request_id`, not metadata propagation.
 - `scoped_env`: the reachability set — operations this handler may compose.
  Populated from the registration bundle (ADR-022). This is *data* (a struct),
  not a dispatch trait. Empty for leaves, whose registration carries
  `scoped_env: None`.
 - `env`: the composition dispatch trait (`Arc<dyn OperationEnv + Send + Sync>`).
  A handler calls `context.env.invoke(...)` to compose children. This is a
  trait object, not a concrete struct — enables registry layering (ADR-024).
 - `abort_policy`: for this call's descendants (ADR-016 Decision 6). Default
  `AbortDependents`. `ContinueRunning` is opt-in for long-running work. Set by
  the composing handler via `invoke()`, not by the wire caller.
 - `deadline`: for this call and all descendants. Set by `build_root_context`
  to `now + CallAdapter.default_timeout` (default 30s). Composed calls inherit
  the parent's deadline (children do NOT get a fresh 30s). `None` = unbounded
  (long-running subscriptions).
 - `internal`: when `true`, this call originated from composition (a handler
  calling another operation via OperationEnv), not from a wire request. This
  switches the authority context: ACL runs against `handler_identity`, not
  `identity`. Crate-private for writes (`pub(crate)`); read via
  `is_internal()`. Only set by `OperationEnv::invoke()` (true) or the
  `CallAdapter` dispatch path (false).
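 The propagation rules in the field notes above can be summarized in one
 constructor. This sketch uses a reduced context and a single hypothetical
 `Authority` type for both `identity` and `handler_identity`; the real fields
 are `Identity` and `CompositionAuthority`, and `child_context` is not a
 function the task defines.
 ```rust
 use std::collections::HashMap;
 use std::time::Instant;
 #[derive(Debug, Clone, PartialEq)]
 pub struct Authority {
     pub label: String,
 }
 pub struct Ctx {
     pub request_id: String,
     pub parent_request_id: Option<String>,
     pub identity: Option<Authority>,
     pub handler_identity: Option<Authority>,
     pub metadata: HashMap<String, String>,
     pub deadline: Option<Instant>,
     pub internal: bool,
 }
 /// Build the context for a composed child call from its parent.
 pub fn child_context(
     parent: &Ctx,
     request_id: String,
     child_authority: Option<Authority>, // from the child's registration
 ) -> Ctx {
     Ctx {
         request_id,
         parent_request_id: Some(parent.request_id.clone()), // agency chain
         identity: parent.handler_identity.clone(), // parent's authority is the caller
         handler_identity: child_authority,
         metadata: HashMap::new(),  // metadata does not propagate
         deadline: parent.deadline, // inherited, not refreshed
         internal: true,            // composition, not a wire request
     }
 }
 ```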
 ### AbortPolicy
 ```rust
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 pub enum AbortPolicy {
    AbortDependents,   // default — abort cascades to all non-terminal descendants
    ContinueRunning,   // opt-in — started descendants continue, unstarted aborted
 }
 impl Default for AbortPolicy {
    fn default() -> Self { Self::AbortDependents }
 }
 ```
 ### CompositionAuthority
 ```rust
 pub struct CompositionAuthority {
    pub label: String,                          // e.g., "agent-chat" — not a peer id
    pub scopes: Vec<String>,                     // e.g., ["llm:call", "fs:read"]
    pub resources: HashMap<String, Vec<String>>,  // e.g., {"service": ["vastai"]}
 }
 impl CompositionAuthority {
    pub fn none() -> Option<Self> { None }  // Convenience for leaves
    pub fn new(label: &str, scopes: impl IntoIterator<Item = String>) -> Self { ... }
    pub fn as_identity(&self) -> Option<Identity> { ... }  // Synthetic Identity for ACL
 }
 ```
 The declared authority the handler operates under when composing children.
 `None` for leaves. This replaces ADR-015's `handler_identity: Identity` — it's
 not a peer identity, it's a declared authority bundle. See ADR-022.
 `as_identity()` produces a synthetic `Identity` from the authority (label as
 id, scopes, resources) for ACL checking against `AccessControl`.
 ### ScopedOperationEnv
 ```rust
 pub struct ScopedOperationEnv {
    allowed: HashSet<String>,  // operation names this handler may reach
 }
 impl ScopedOperationEnv {
    pub fn empty() -> Self;
    pub fn new(ops: impl IntoIterator<Item = impl Into<String>>) -> Self;
    pub fn allows(&self, name: &str) -> bool;  // is this op in the reachability set?
 }
 ```
 The reachability set — the operations this handler may reach via `env.invoke()`.
 Populated from the registration bundle (ADR-022). This is *data*, not a dispatch
 trait. The reachability check in `OperationEnv::invoke()` consults
 `scoped_env.allows(&name)`. Empty for leaves, whose registration carries
 `scoped_env: None`.
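 One straightforward way to fill in the three methods, using only the
 standard library. The derives are an assumption; the task lists only the
 field and the method signatures.
 ```rust
 use std::collections::HashSet;
 #[derive(Debug, Clone, Default)]
 pub struct ScopedOperationEnv {
     allowed: HashSet<String>, // operation names this handler may reach
 }
 impl ScopedOperationEnv {
     /// No reachable operations (the leaf case).
     pub fn empty() -> Self {
         Self::default()
     }
     /// Build the reachability set from any iterable of names.
     pub fn new(ops: impl IntoIterator<Item = impl Into<String>>) -> Self {
         Self {
             allowed: ops.into_iter().map(|op| op.into()).collect(),
         }
     }
     /// Is this operation in the reachability set?
     pub fn allows(&self, name: &str) -> bool {
         self.allowed.contains(name)
     }
 }
 ```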
 ### OperationContext methods
 ```rust
 impl OperationContext {
    pub fn is_internal(&self) -> bool { self.internal }
 }
 ```
 The `internal` field is `pub(crate)` — only `OperationEnv::invoke()` and the
 `CallAdapter` dispatch path can set it. Handlers read via `is_internal()`.
 ### generate_request_id
 ```rust
 pub(crate) fn generate_request_id() -> String {
    // UUID v4 — must be unique across concurrent invocations
    // Deterministic IDs (e.g., format!("env-{name}")) MUST NOT be used
 }
 ```
 Use the `uuid` crate (already a dependency). This is crate-internal
 (`pub(crate)`), called by `OperationEnv::invoke()` for composed calls.
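 For comparison, the `parent_id + "-" + counter` form mentioned in the field
 notes can be sketched with the standard library alone. This is not the UUID
 v4 implementation the acceptance criteria ask for; `child_request_id` and
 `NEXT_CHILD` are hypothetical names, and the IDs are unique only within a
 single process.
 ```rust
 use std::sync::atomic::{AtomicU64, Ordering};
 /// Process-wide counter, so concurrent invocations never share a suffix.
 static NEXT_CHILD: AtomicU64 = AtomicU64::new(0);
 /// Derive a child request ID as `parent_id + "-" + counter`.
 pub fn child_request_id(parent_id: &str) -> String {
     let n = NEXT_CHILD.fetch_add(1, Ordering::Relaxed);
     format!("{}-{}", parent_id, n)
 }
 ```
 Unlike a fixed string such as `env-{name}`, two concurrent calls to the same
 operation receive different IDs here because the counter advances atomically.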
 ## Acceptance Criteria
 - [ ] `OperationContext` struct with all 11 fields
 - [ ] `internal` field is `pub(crate)` (crate-private for writes)
 - [ ] `is_internal()` method exposes read access
 - [ ] `AbortPolicy` enum with AbortDependents, ContinueRunning
 - [ ] `Default for AbortPolicy` returns `AbortDependents`
 - [ ] `CompositionAuthority` struct with label, scopes, resources
 - [ ] `CompositionAuthority::none()` returns `None`
 - [ ] `CompositionAuthority::new(label, scopes)` constructor
 - [ ] `CompositionAuthority::as_identity()` produces synthetic Identity for ACL
 - [ ] `ScopedOperationEnv` struct with allowed set
 - [ ] `ScopedOperationEnv::empty()`, `new()`, `allows()` methods
 - [ ] `generate_request_id()` produces UUID v4 (unique, non-deterministic)
 - [ ] Unit test: ScopedOperationEnv::allows (in set → true, not in set → false)
 - [ ] Unit test: CompositionAuthority::as_identity produces correct Identity
 - [ ] Unit test: AbortPolicy default is AbortDependents
 - [ ] `cargo test -p alknet-call` succeeds
 - [ ] `cargo clippy -p alknet-call` succeeds with no warnings
 ## References
 - docs/architecture/crates/call/operation-registry.md — OperationContext, AbortPolicy, CompositionAuthority, ScopedOperationEnv
 - docs/architecture/decisions/015-privilege-model-and-authority-context.md — ADR-015 (internal flag, authority switch)
 - docs/architecture/decisions/016-abort-cascade-for-nested-calls.md — ADR-016 (AbortPolicy)
 - docs/architecture/decisions/022-handler-registration-provenance-and-composition-authority.md — ADR-022 (CompositionAuthority, ScopedOperationEnv)
 - docs/architecture/decisions/024-operation-registry-layering.md — ADR-024 (env as trait object)
 ## Notes
 > **Read ADR-015, ADR-022, and ADR-024 before starting.** This is the
 > highest-density task in the call crate. OperationContext has 11 fields, each
 > tied to an ADR. The authority-switch semantics (internal: true → ACL against
 > handler_identity, not identity) is where three ADRs converge. The `internal`
 > field is crate-private for writes: only OperationEnv::invoke() and the
 > CallAdapter dispatch path set it. Metadata does NOT propagate through
 > composition (security constraint, ADR-014). Request IDs must be unique
 > (UUID v4) — deterministic IDs corrupt PendingRequestMap and abort-cascade tree.
 ## Summary
 > To be filled on completion
--- a/tasks/call/registry/operation-env.md
+++ b/tasks/call/registry/operation-env.md
@@ -0,0 +1,225 @@
 ---
 id: call/registry/operation-env
 name: Implement OperationEnv trait, LocalOperationEnv, and CompositeOperationEnv
 status: pending
 depends_on: [call/registry/handler-registration]
 scope: broad
 risk: high
 impact: component
 level: implementation
 ---
 ## Description
 Implement the `OperationEnv` trait and its implementations in
 `src/registry/env.rs`. This is the universal composition mechanism — a handler
 calls `context.env.invoke(...)` to compose child operations. The trait-object
 design is what enables registry layering (ADR-024).
 **Read ADR-024 before starting this task.** The trait-object pattern is
 load-bearing — making `OperationEnv` concrete would close the session-overlay
 and connection-overlay patterns.
 ### OperationEnv trait
 ```rust
 #[async_trait]
 pub trait OperationEnv: Send + Sync {
    /// Compose a child operation. The child's OperationContext is constructed
    /// with internal: true, inheriting the parent's composition authority as
    /// the child's caller identity. Abort policy defaults to parent's.
    async fn invoke(
        &self,
        namespace: &str,
        operation: &str,
        input: Value,
        parent: &OperationContext,
    ) -> ResponseEnvelope {
        self.invoke_with_policy(namespace, operation, input, parent, parent.abort_policy.clone()).await
    }
    /// Compose with explicit abort policy (ADR-016 Decision 6).
    /// This is the required method — invoke() delegates to it.
    async fn invoke_with_policy(
        &self,
        namespace: &str,
        operation: &str,
        input: Value,
        parent: &OperationContext,
        policy: AbortPolicy,
    ) -> ResponseEnvelope;
    /// Does this env contain the named operation? Used by CompositeOperationEnv
    /// to probe overlays before dispatching (ADR-024).
    fn contains(&self, _name: &str) -> bool { true }
 }
 ```
 `invoke()` has a default impl that delegates to `invoke_with_policy()` with
 the parent's abort policy. Implementations only need to implement
 `invoke_with_policy()`.
 ### LocalOperationEnv (Layer 0)
 ```rust
 pub struct LocalOperationEnv {
    registry: Arc<OperationRegistry>,
 }
 #[async_trait]
 impl OperationEnv for LocalOperationEnv {
    async fn invoke_with_policy(&self, namespace: &str, operation: &str, input: Value, parent: &OperationContext, policy: AbortPolicy) -> ResponseEnvelope {
        let name = format!("{namespace}/{operation}");
        // 1. Reachability check (ADR-015, ADR-022): is this op in parent's scoped env?
        if !parent.scoped_env.allows(&name) {
            return ResponseEnvelope::not_found(name);
        }
        // 2. Look up registration
        let registration = self.registry.registration(&name);
        // 3. Construct child OperationContext
        let context = OperationContext {
            request_id: generate_request_id(),  // UUID v4 — NOT deterministic
            parent_request_id: Some(parent.request_id.clone()),
            identity: parent.handler_identity.as_identity(),  // authority switch
            handler_identity: registration.composition_authority.clone(),
            capabilities: parent.capabilities.clone(),  // inherit
            metadata: HashMap::new(),  // fresh — does NOT propagate parent metadata (ADR-014)
            abort_policy: policy,
            deadline: parent.deadline,  // inherit — children don't get fresh 30s
            scoped_env: registration.scoped_env.clone().unwrap_or_else(ScopedOperationEnv::empty),
            env: parent.env.clone(),  // inherit the same composite env
            internal: true,  // nested calls use handler authority
        };
        // 4. Dispatch
        self.registry.invoke(&name, input, context).await
    }
    // contains() uses default (returns true — curated registry contains everything it can dispatch)
 }
 ```
 Key points:
 - **Reachability check first**: if op not in parent's scoped_env, NOT_FOUND.
  This bounds the parameterized-dispatch attack surface.
 - **Authority propagation**: child's `identity` = parent's `handler_identity`
  (the parent's composition authority becomes the caller). This is the
  authority switch from ADR-015.
 - **Fresh metadata**: `HashMap::new()`, NOT parent's metadata. Security
  constraint (ADR-014) — prevents secret leakage through composition.
 - **Inherited deadline**: children don't get a fresh 30s — the root call's
  deadline bounds the entire call tree.
 - **Inherited env**: child gets `parent.env.clone()` (the same composite of
  curated base + active overlays).
 - **internal: true**: this is the flag that switches ACL authority.
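 The key points above can be sketched in isolation from the registry. This is a
 minimal, synchronous sketch under stated assumptions: `Authority`, `Identity`,
 `Context`, and `derive_child` are simplified stand-ins invented for
 illustration (the real `OperationContext` has more fields), and the request ID
 is passed in rather than generated.
 ```rust
 use std::collections::HashMap;
 use std::time::Instant;
 /// Stand-in for CompositionAuthority (hypothetical shape).
 #[derive(Clone, Debug, PartialEq)]
 pub struct Authority {
    pub label: String,
    pub scopes: Vec<String>,
 }
 /// Stand-in for Identity (hypothetical shape).
 #[derive(Clone, Debug, PartialEq)]
 pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
 }
 impl Authority {
    /// Assumption: the authority label becomes the identity id.
    pub fn as_identity(&self) -> Identity {
        Identity { id: self.label.clone(), scopes: self.scopes.clone() }
    }
 }
 /// Reduced stand-in for OperationContext.
 pub struct Context {
    pub request_id: String,
    pub parent_request_id: Option<String>,
    pub identity: Identity,
    pub handler_identity: Authority,
    pub metadata: HashMap<String, String>,
    pub deadline: Instant,
    pub internal: bool,
 }
 /// Derive a child context from its parent, following the rules listed above.
 pub fn derive_child(parent: &Context, callee_authority: Authority, request_id: String) -> Context {
    Context {
        request_id,
        parent_request_id: Some(parent.request_id.clone()),
        identity: parent.handler_identity.as_identity(), // authority switch
        handler_identity: callee_authority,
        metadata: HashMap::new(),  // fresh, never the parent's
        deadline: parent.deadline, // inherited, no fresh timeout
        internal: true,
    }
 }
 ```
 A unit test for the real `LocalOperationEnv` would likely assert the same four
 properties: `internal` is set, the child's caller is the parent's handler
 authority, metadata is empty, and the deadline equals the parent's.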
 ### CompositeOperationEnv (per-call, ADR-024)
 ```rust
 pub struct CompositeOperationEnv {
    session: Option<Arc<dyn OperationEnv + Send + Sync>>,    // Layer 1
    connection: Option<Arc<dyn OperationEnv + Send + Sync>>, // Layer 2
    base: Arc<dyn OperationEnv + Send + Sync>,               // Layer 0 (LocalOperationEnv)
 }
 #[async_trait]
 impl OperationEnv for CompositeOperationEnv {
    async fn invoke_with_policy(&self, namespace: &str, operation: &str, input: Value, parent: &OperationContext, policy: AbortPolicy) -> ResponseEnvelope {
        let name = format!("{namespace}/{operation}");
        // Reachability check (same as LocalOperationEnv)
        if !parent.scoped_env.allows(&name) {
            return ResponseEnvelope::not_found(name);
        }
        // Dispatch in overlay order: session → connection → curated base
        // First overlay that *contains* the op wins
        if let Some(session) = &self.session {
            if session.contains(&name) {
                return session.invoke_with_policy(namespace, operation, input, parent, policy).await;
            }
        }
        if let Some(connection) = &self.connection {
            if connection.contains(&name) {
                return connection.invoke_with_policy(namespace, operation, input, parent, policy).await;
            }
        }
        self.base.invoke_with_policy(namespace, operation, input, parent, policy).await
    }
    fn contains(&self, name: &str) -> bool {
        self.session.as_ref().is_some_and(|s| s.contains(name))
            || self.connection.as_ref().is_some_and(|c| c.contains(name))
            || self.base.contains(name)
    }
 }
 ```
 The `contains()` method (review #003 C9) is the overlay-dispatch contract. It
 replaces the previous ambiguous "sentinel or contains check" framing. The
 structural decision (composite trait object, overlay order, Arc::clone
 inheritance) is locked by ADR-024; the dispatch contract (contains probe before
 invoke_with_policy) is locked too.
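 The contains-probe contract can be exercised without async or the real
 registry. The sketch below is a simplified, synchronous model: `Env`, `Layer`,
 `Base`, and `Composite` are hypothetical stand-ins, and `invoke` returns a
 string naming the layer that handled the call instead of a `ResponseEnvelope`.
 ```rust
 use std::collections::HashSet;
 use std::sync::Arc;
 /// Simplified stand-in for OperationEnv.
 pub trait Env: Send + Sync {
    fn invoke(&self, name: &str) -> String;
    /// Default mirrors the curated base: it claims every name.
    fn contains(&self, _name: &str) -> bool { true }
 }
 /// An overlay that only claims the operations it was given.
 pub struct Layer {
    pub label: &'static str,
    pub ops: HashSet<String>,
 }
 impl Env for Layer {
    fn invoke(&self, name: &str) -> String { format!("{}:{}", self.label, name) }
    fn contains(&self, name: &str) -> bool { self.ops.contains(name) }
 }
 /// Layer 0 stand-in; uses the default contains().
 pub struct Base;
 impl Env for Base {
    fn invoke(&self, name: &str) -> String { format!("base:{}", name) }
 }
 pub struct Composite {
    pub session: Option<Arc<dyn Env>>,
    pub connection: Option<Arc<dyn Env>>,
    pub base: Arc<dyn Env>,
 }
 impl Env for Composite {
    fn invoke(&self, name: &str) -> String {
        // Probe overlays in order; the first one that contains the op wins.
        if let Some(session) = &self.session {
            if session.contains(name) { return session.invoke(name); }
        }
        if let Some(connection) = &self.connection {
            if connection.contains(name) { return connection.invoke(name); }
        }
        self.base.invoke(name)
    }
    fn contains(&self, name: &str) -> bool {
        self.session.as_ref().is_some_and(|s| s.contains(name))
            || self.connection.as_ref().is_some_and(|c| c.contains(name))
            || self.base.contains(name)
    }
 }
 ```
 An overlay that kept the default `contains()` would shadow every layer below
 it, so overlay implementations presumably need to override it.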
 ### Why OperationEnv must remain a trait
 The trait-based design enables registry layering (ADR-024):
 - The CallAdapter composes the root env per call from curated base + active
  connection/session overlays
 - Overlays wrap the base via trait layering
 - Session-scoped registries (OQ-19) and connection-scoped remote imports
  (ADR-017 `from_call`) are both overlays on the same base
 Making `OperationEnv` concrete or hardcoding the global registry into the
 dispatch path would close both patterns. This is the same integration-point
 pattern as `IdentityProvider` (ADR-004).
 ## Acceptance Criteria
 - [ ] `OperationEnv` trait with `invoke()`, `invoke_with_policy()`, `contains()`
 - [ ] `invoke()` has default impl delegating to `invoke_with_policy()` with parent's policy
 - [ ] `contains()` has default impl returning `true`
 - [ ] `LocalOperationEnv` struct holding `Arc<OperationRegistry>`
 - [ ] `LocalOperationEnv::invoke_with_policy` checks reachability (scoped_env.allows)
 - [ ] `LocalOperationEnv` constructs child context with internal: true, authority switch
 - [ ] `LocalOperationEnv` fresh metadata (HashMap::new(), not parent's)
 - [ ] `LocalOperationEnv` inherited deadline (parent.deadline, not fresh 30s)
 - [ ] `LocalOperationEnv` inherited env (parent.env.clone())
 - [ ] `CompositeOperationEnv` with session, connection, base fields
 - [ ] `CompositeOperationEnv::invoke_with_policy` dispatches in overlay order (session → connection → base)
 - [ ] `CompositeOperationEnv` uses `contains()` probe before dispatching to overlay
 - [ ] `CompositeOperationEnv::contains` returns true if any layer contains the op
 - [ ] Reachability check returns NOT_FOUND if op not in scoped_env
 - [ ] Unit test: LocalOperationEnv invoke with allowed op → dispatches
 - [ ] Unit test: LocalOperationEnv invoke with disallowed op → NOT_FOUND
 - [ ] Unit test: child context has internal: true
 - [ ] Unit test: child context identity = parent's handler_identity
 - [ ] Unit test: child metadata is fresh (empty), not parent's
 - [ ] Unit test: CompositeOperationEnv dispatches to session overlay if contains
 - [ ] Unit test: CompositeOperationEnv falls through to base if no overlay contains
 - [ ] `cargo test -p alknet-call` succeeds
 - [ ] `cargo clippy -p alknet-call` succeeds with no warnings
 ## References
 - docs/architecture/crates/call/operation-registry.md — OperationEnv, LocalOperationEnv, CompositeOperationEnv
 - docs/architecture/decisions/015-privilege-model-and-authority-context.md — ADR-015 (authority switch)
 - docs/architecture/decisions/016-abort-cascade-for-nested-calls.md — ADR-016 (abort policy propagation)
 - docs/architecture/decisions/024-operation-registry-layering.md — ADR-024 (layering, contains contract)
 ## Notes
 > **Read ADR-024 before starting.** The trait-object design is load-bearing —
 > OperationEnv MUST remain a trait, not a concrete type. The authority switch
 > (child identity = parent handler_identity) is the ADR-015 privilege model.
 > Metadata does NOT propagate (ADR-014 security constraint). Deadline
 > inherits (children don't get fresh 30s). The `contains()` probe is the
 > overlay-dispatch contract from review #003 C9 — any OperationEnv impl that
 > correctly reports contains works with the composite.
 ## Summary
 > To be filled on completion
--- a/tasks/call/registry/operation-spec.md
+++ b/tasks/call/registry/operation-spec.md
@@ -0,0 +1,168 @@
 ---
 id: call/registry/operation-spec
 name: Implement OperationSpec, OperationType, Visibility, ErrorDefinition, and AccessControl
 status: pending
 depends_on: [call/crate-init]
 scope: moderate
 risk: medium
 impact: component
 level: implementation
 ---
 ## Description
 Implement the operation specification types in `src/registry/spec.rs`. These
 types declare what an operation is, its schemas, and its access control policy.
 ### OperationSpec
 ```rust
 pub struct OperationSpec {
    pub name: String,              // e.g., "fs/readFile", "agent/chat" (no leading slash)
    pub namespace: String,         // e.g., "fs", "agent"
    pub op_type: OperationType,    // Query, Mutation, Subscription
    pub visibility: Visibility,    // External (wire-callable) or Internal (composition-only)
    pub input_schema: Value,       // JSON Schema for input
    pub output_schema: Value,      // JSON Schema for output
    pub error_schemas: Vec<ErrorDefinition>,  // Declared domain errors (ADR-023)
    pub access_control: AccessControl,
 }
 ```
 Operation names use slash-based paths **without a leading slash**, aligned with
 URL path conventions: `fs/readFile`, `agent/chat`, `services/list`. The leading
 slash is added for display (`spec.path()` returns `/fs/readFile`) and wire
 format. The registry stores names without the leading slash.
 The `namespace` field is derived from the name: for `fs/readFile` it's `fs`,
 for `agent/chat` it's `agent`. It's a convenience accessor for ACL matching and
 service grouping.
 Implement `OperationSpec::path(&self) -> String` that returns `/{name}` (the
 wire/display form with leading slash).
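 The naming rule can be sketched as follows. This is a reduced stand-in for
 `OperationSpec` with only the two fields involved; the `new` constructor is a
 hypothetical helper for illustration, not part of the spec.
 ```rust
 /// Reduced stand-in for OperationSpec.
 pub struct OperationSpec {
    pub name: String,      // registry form, no leading slash
    pub namespace: String, // derived from `name`
 }
 impl OperationSpec {
    /// Hypothetical constructor: namespace is the first path segment of the name.
    pub fn new(name: &str) -> Self {
        let namespace = name.split('/').next().unwrap_or("").to_string();
        Self { name: name.to_string(), namespace }
    }
    /// Wire/display form: the registry name with a leading slash.
    pub fn path(&self) -> String {
        format!("/{}", self.name)
    }
 }
 ```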
 ### OperationType
 ```rust
 pub enum OperationType {
    Query,         // Read-only, idempotent (e.g., "fs/readFile", "services/list")
    Mutation,      // Side effects (e.g., "bash/exec", "github/authenticate")
    Subscription,  // Streaming (e.g., "agent/chat", "events/subscribe")
 }
 ```
 ### Visibility
 ```rust
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]  // PartialEq: callers compare with `==`
 pub enum Visibility {
    External,  // Callable from the wire (call.requested from a client)
    Internal,  // Composition-only (env.invoke from a handler)
 }
 ```
 `External` operations appear in `services/list` and accept `call.requested`.
 `Internal` operations return `NOT_FOUND` when called from the wire and do not
 appear in `services/list`. The assembly layer declares visibility at
 registration. All import adapters register operations as `Internal` by default
 (they're composition material); the handler that composes them is `External`.
 ### ErrorDefinition
 ```rust
 pub struct ErrorDefinition {
    pub code: String,           // e.g., "FILE_NOT_FOUND", "RATE_LIMITED"
    pub description: String,    // Human-readable description
    pub schema: Value,           // JSON Schema for the error detail payload
    pub http_status: Option<u16>,  // HTTP status for adapter projection (from_openapi/to_openapi)
 }
 ```
 A declared operation-level error (ADR-023). When a handler returns a `CallError`
 whose `code` matches a declared `ErrorDefinition`, the `call.error` event
 carries that code and the error's detail payload. If it doesn't match, the
 `call.error` carries `INTERNAL`.
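 A minimal sketch of that matching rule, assuming the comparison is a plain
 string match on `code`. The helper name `wire_error_code` is hypothetical, and
 `ErrorDefinition` is reduced to the one field the rule needs.
 ```rust
 /// Reduced stand-in for ErrorDefinition.
 pub struct ErrorDefinition {
    pub code: String,
 }
 /// Pick the code to put on the wire: the handler's code if the operation
 /// declared it, otherwise the generic INTERNAL code.
 pub fn wire_error_code<'a>(handler_code: &'a str, declared: &[ErrorDefinition]) -> &'a str {
    if declared.iter().any(|d| d.code == handler_code) {
        handler_code
    } else {
        "INTERNAL"
    }
 }
 ```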
 ### AccessControl
 ```rust
 pub struct AccessControl {
    pub required_scopes: Vec<String>,          // AND-checked: caller must have ALL
    pub required_scopes_any: Option<Vec<String>>, // OR-checked: caller must have at LEAST ONE
    pub resource_type: Option<String>,          // e.g., "service"
    pub resource_action: Option<String>,        // e.g., "read"
 }
 ```
 ### ACL check flow
 When a `call.requested` event arrives:
 1. Registry checks **visibility** — if `Internal`, returns `NOT_FOUND` (does
   not leak existence)
 2. Registry checks `access_control.check(identity)`:
   - For external calls (`internal: false`): ACL against the **caller's identity**
   - For internal calls (`internal: true`): ACL against the **handler's
     composition authority** (ADR-015)
 3. If denied: `FORBIDDEN`
 4. If identity is `None` and operation has restrictions: `FORBIDDEN` with
   message `"authentication required"`
 Operations with empty `AccessControl` (no required scopes, no resource checks)
 are accessible to all callers, including unauthenticated ones.
 ### Implement AccessControl::check
 ```rust
 impl AccessControl {
    pub fn check(&self, identity: Option<&Identity>) -> AccessResult;
 }
 pub enum AccessResult {
    Allowed,
    Forbidden(String),  // reason
 }
 ```
 The check logic:
 - `required_scopes`: caller must have ALL (subset check)
 - `required_scopes_any`: caller must have at LEAST ONE (if present)
 - `resource_type` / `resource_action`: check against `identity.resources`
 - If `identity` is `None` and any scope/resource is required: `Forbidden("authentication required")`
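 One possible shape for the check, as a sketch rather than the required
 implementation. Assumptions: `Identity` is a stand-in for the alknet-core
 type, `identity.resources` maps a resource type to its permitted actions, a
 `resource_action` without a `resource_type` is ignored, and the `Forbidden`
 reason strings other than `"authentication required"` are invented here.
 ```rust
 use std::collections::HashMap;
 /// Stand-in for alknet-core's Identity.
 pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
 }
 #[derive(Default)]
 pub struct AccessControl {
    pub required_scopes: Vec<String>,
    pub required_scopes_any: Option<Vec<String>>,
    pub resource_type: Option<String>,
    pub resource_action: Option<String>,
 }
 #[derive(Debug, PartialEq)]
 pub enum AccessResult {
    Allowed,
    Forbidden(String),
 }
 impl AccessControl {
    pub fn check(&self, identity: Option<&Identity>) -> AccessResult {
        let any_of: &[String] = self.required_scopes_any.as_deref().unwrap_or(&[]);
        // Empty AccessControl: open to everyone, including unauthenticated callers.
        if self.required_scopes.is_empty() && any_of.is_empty() && self.resource_type.is_none() {
            return AccessResult::Allowed;
        }
        let identity = match identity {
            Some(identity) => identity,
            None => return AccessResult::Forbidden("authentication required".into()),
        };
        // AND check: every required scope must be present.
        if let Some(missing) = self.required_scopes.iter().find(|s| !identity.scopes.contains(*s)) {
            return AccessResult::Forbidden(format!("missing scope: {}", missing));
        }
        // OR check: at least one alternative scope must be present.
        if !any_of.is_empty() && !any_of.iter().any(|s| identity.scopes.contains(s)) {
            return AccessResult::Forbidden("no alternative scope present".into());
        }
        // Resource check (assumed semantics, see lead-in).
        if let Some(resource_type) = &self.resource_type {
            let permitted = match (identity.resources.get(resource_type), &self.resource_action) {
                (Some(actions), Some(action)) => actions.contains(action),
                (Some(_), None) => true,
                (None, _) => false,
            };
            if !permitted {
                return AccessResult::Forbidden(format!("resource access denied: {}", resource_type));
            }
        }
        AccessResult::Allowed
    }
 }
 ```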
 ## Acceptance Criteria
 - [ ] `OperationSpec` struct with all 8 fields
 - [ ] `OperationSpec::path()` returns `/{name}` (leading slash for wire/display)
 - [ ] `OperationSpec::namespace` derived from name (split on `/`)
 - [ ] `OperationType` enum with Query, Mutation, Subscription
 - [ ] `Visibility` enum with External, Internal
 - [ ] `ErrorDefinition` struct with all 4 fields
 - [ ] `AccessControl` struct with all 4 fields
 - [ ] `AccessControl::check(identity)` returns `AccessResult`
 - [ ] `required_scopes` is AND-checked (caller must have all)
 - [ ] `required_scopes_any` is OR-checked (caller must have at least one)
 - [ ] `None` identity with restrictions → `Forbidden("authentication required")`
 - [ ] Empty AccessControl → `Allowed` for all callers
 - [ ] Unit tests for AccessControl::check (all combinations)
 - [ ] Unit test: OperationSpec::path() produces leading slash
 - [ ] Unit test: namespace derived correctly from name
 - [ ] `cargo test -p alknet-call` succeeds
 - [ ] `cargo clippy -p alknet-call` succeeds with no warnings
 ## References
 - docs/architecture/crates/call/operation-registry.md — OperationSpec, AccessControl, Visibility
 - docs/architecture/decisions/015-privilege-model-and-authority-context.md — ADR-015 (visibility, ACL)
 - docs/architecture/decisions/023-operation-error-schemas.md — ADR-023 (ErrorDefinition)
 ## Notes
 > Operation names have NO leading slash in the registry (`fs/readFile`). The
 > leading slash is added for wire format and display (`/fs/readFile`). This is
 > a single rule applied consistently — do not mix the two forms. Visibility
 > controls wire-callability: Internal ops return NOT_FOUND from the wire (don't
 > leak existence). AccessControl.check is the ACL gate — read it carefully
 > against ADR-015 for the internal vs external authority distinction.
 ## Summary
 > To be filled on completion
--- a/tasks/call/registry/service-discovery.md
+++ b/tasks/call/registry/service-discovery.md
@@ -0,0 +1,181 @@
 ---
 id: call/registry/service-discovery
 name: Implement services/list and services/schema built-in operations
 status: pending
 depends_on: [call/registry/handler-registration]
 scope: narrow
 risk: low
 impact: isolated
 level: implementation
 ---
 ## Description
 Implement the two built-in service discovery operations in
 `src/registry/discovery.rs`. These are read-only operations that expose what
 the node offers.
 ### Operations
 | Operation name | Display path | Type | Description |
 |---------------|-------------|------|-------------|
 | `services/list` | `/services/list` | Query | List registered operation names and metadata |
 | `services/schema` | `/services/schema` | Query | Get the OperationSpec for a specific operation |
 ### services/list
 Returns `External` operations only. `Internal` operations are not part of the
 wire-facing API surface — they're implementation details of composition. A
 remote client cannot enumerate the internal call tree (ADR-015).
 ```json
 {
  "operations": [
    { "name": "fs/readFile", "namespace": "fs", "op_type": "query" },
    { "name": "agent/chat", "namespace": "agent", "op_type": "subscription" },
    { "name": "events/subscribe", "namespace": "events", "op_type": "subscription" }
  ]
 }
 ```
 The handler queries the registry's `list_operations()` (which returns External
 specs only) and serializes to the above format.
 ### services/schema
 Accepts `{ "name": "fs/readFile" }` (no leading slash — registry form, same as
 `OperationSpec.name`) and returns the full `OperationSpec` including
 input/output JSON Schemas and declared `error_schemas` (ADR-023).
 The CallAdapter strips the leading slash from wire `operationId`s before
 lookup. The `services/schema` handler applies the same normalization to its
 `name` input, so it accepts both `fs/readFile` and `/fs/readFile`.
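 A minimal sketch of that normalization, assuming it amounts to stripping a
 single leading slash. The function name `normalize_operation_name` is
 hypothetical.
 ```rust
 /// Convert a wire/display name to registry form by dropping one leading slash.
 /// Names already in registry form pass through unchanged.
 pub fn normalize_operation_name(raw: &str) -> &str {
    raw.strip_prefix('/').unwrap_or(raw)
 }
 ```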
 This enables client code generation: a client reading the schema can produce
 typed error enums instead of generic error handling.
 ### Registration
 These are registered as `Local` provenance with empty composition authority,
 empty scoped env, and empty capabilities (they don't compose, don't need
 credentials):
 ```rust
 .with_local(services_list_spec(), Arc::new(services_list_handler),
            CompositionAuthority::none(), ScopedOperationEnv::empty(), Capabilities::new())
 .with_local(services_schema_spec(), Arc::new(schema_handler),
            CompositionAuthority::none(), ScopedOperationEnv::empty(), Capabilities::new())
 ```
 ### Specs
 ```rust
 fn services_list_spec() -> OperationSpec {
    OperationSpec {
        name: "services/list".into(),
        namespace: "services".into(),
        op_type: OperationType::Query,
        visibility: Visibility::External,
        input_schema: json!({}),  // no input
        output_schema: json!({
            "type": "object",
            "properties": {
                "operations": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": { "type": "string" },
                            "namespace": { "type": "string" },
                            "op_type": { "type": "string", "enum": ["query", "mutation", "subscription"] }
                        }
                    }
                }
            }
        }),
        error_schemas: vec![],
        access_control: AccessControl::default(),  // no restrictions — callable by all
    }
 }
 fn services_schema_spec() -> OperationSpec {
    OperationSpec {
        name: "services/schema".into(),
        namespace: "services".into(),
        op_type: OperationType::Query,
        visibility: Visibility::External,
        input_schema: json!({
            "type": "object",
            "properties": { "name": { "type": "string" } },
            "required": ["name"]
        }),
        output_schema: json!({ /* full OperationSpec schema */ }),
        error_schemas: vec![],
        access_control: AccessControl::default(),
    }
 }
 ```
 ### Handlers
 The handlers need access to the registry. Since handlers are `Arc<dyn Fn>`,
 the registry reference is captured in the closure. Use `Arc<OperationRegistry>`
 cloned into the closure.
 ```rust
 fn services_list_handler(registry: Arc<OperationRegistry>) -> Handler {
    Arc::new(move |_input: Value, ctx: OperationContext| {
        let registry = registry.clone();
        Box::pin(async move {
            let ops: Vec<_> = registry.list_operations()
                .into_iter()
                .filter(|s| s.visibility == Visibility::External)
                .map(|s| json!({
                    "name": s.name,
                    "namespace": s.namespace,
                    "op_type": match s.op_type {
                        OperationType::Query => "query",
                        OperationType::Mutation => "mutation",
                        OperationType::Subscription => "subscription",
                    }
                }))
                .collect();
            ResponseEnvelope::ok(ctx.request_id, json!({ "operations": ops }))
        })
    })
 }
 ```
 ## Acceptance Criteria
 - [ ] `services/list` spec with correct fields (Query, External, no input, output schema)
 - [ ] `services/schema` spec with correct fields (Query, External, name input, full spec output)
 - [ ] `services/list` handler returns External operations only (Internal excluded)
 - [ ] `services/list` output format matches spec (operations array with name, namespace, op_type)
 - [ ] `services/schema` handler accepts name with or without leading slash
 - [ ] `services/schema` returns full OperationSpec (input_schema, output_schema, error_schemas)
 - [ ] `services/schema` returns NOT_FOUND for unknown operation name
 - [ ] Both registered as Local provenance, empty authority/env/caps
 - [ ] Both have empty AccessControl (callable by all, including unauthenticated)
 - [ ] Unit test: services/list returns only External ops
 - [ ] Unit test: services/schema returns spec for known op
 - [ ] Unit test: services/schema returns NOT_FOUND for unknown op
 - [ ] Unit test: services/schema accepts both "fs/readFile" and "/fs/readFile"
 - [ ] `cargo test -p alknet-call` succeeds
 - [ ] `cargo clippy -p alknet-call` succeeds with no warnings
 ## References
 - docs/architecture/crates/call/operation-registry.md — Service Discovery section
 - docs/architecture/decisions/015-privilege-model-and-authority-context.md — ADR-015 (Internal not in services/list)
 ## Notes
 > services/list returns External ops only — Internal ops are implementation
 > details of composition and must not be enumerable from the wire. The
 > CallAdapter normalizes leading slashes, so services/schema accepts both
 > forms. These are the only built-in operations; no admin operations are
 > exposed through the call protocol itself.
 ## Summary
 > To be filled on completion
--- a/tasks/call/review-call.md
+++ b/tasks/call/review-call.md
@@ -0,0 +1,137 @@
 ---
 id: call/review-call
 name: Review alknet-call implementation for spec conformance and pattern consistency
 status: pending
 depends_on: [call/protocol/abort-cascade]
 scope: broad
 risk: low
 impact: phase
 level: review
 ---
 ## Description
 Review the alknet-call implementation for spec conformance, pattern
 consistency, and correctness. This is the quality checkpoint at the end of the
 call phase — the most complex crate in this batch.
 ### Review Checklist
 1. **Registry conformance** (operation-registry.md):
   - `OperationSpec` has all 8 fields, `path()` adds leading slash
   - `OperationType` (Query/Mutation/Subscription), `Visibility` (External/Internal)
   - `ErrorDefinition` with code, description, schema, http_status (ADR-023)
   - `AccessControl` with required_scopes (AND), required_scopes_any (OR), resource checks
   - `AccessControl::check` returns Allowed/Forbidden, None identity with restrictions → Forbidden
   - `OperationContext` has all 10 fields, `internal` is pub(crate), `is_internal()` reads
   - `AbortPolicy` (AbortDependents default, ContinueRunning opt-in)
   - `CompositionAuthority` with label, scopes, resources, `as_identity()`
   - `ScopedOperationEnv` with `allows()` reachability check
   - `Handler` type (async closure → ResponseEnvelope)
   - `HandlerRegistration` with all 6 fields (spec, handler, provenance, authority, scoped_env, caps)
   - `OperationProvenance` with all 6 variants
   - `OperationRegistry` with register, registration, invoke, list_operations
   - `OperationRegistryBuilder` with with_local, with_leaf, with, build
   - `OperationEnv` trait with invoke, invoke_with_policy, contains
   - `LocalOperationEnv` reachability check, authority switch, fresh metadata, inherited deadline
   - `CompositeOperationEnv` overlay dispatch (session → connection → base), contains probe
   - `services/list` returns External only, `services/schema` returns full spec
 2. **Protocol conformance** (call-protocol.md):
   - `EventEnvelope` with type, id, payload (JSON, length-prefixed framing)
   - `ResponseEnvelope` with request_id, result
   - `CallError` with code, message, retryable, details
   - 5 event types: call.requested, call.responded, call.completed, call.aborted, call.error
   - Wire payload schemas match spec table
   - `call.requested` has operationId (leading slash), input, optional auth_token
   - `call.error` has protocol-level codes (NOT_FOUND, FORBIDDEN, INVALID_INPUT, INTERNAL, TIMEOUT)
   - `PendingRequestMap` correlates by ID (not stream), handles all event types
   - `CallConnection` with Layer 2 overlay, register_imported, overlay_env, call/subscribe/abort
   - `CallAdapter` implements ProtocolHandler for alknet/call
   - CallAdapter stream handling (accept_bi loop, FrameFramedReader/Writer)
   - Per-request identity resolution (auth_token overrides connection-level)
   - `build_root_context` sets internal: false, deadline, capabilities from registration
   - `compose_root_env` builds CompositeOperationEnv (base + session + connection)
   - operationId leading slash stripped before lookup
   - ResponseEnvelope → EventEnvelope conversion
   - Timeout: 30s default, composed calls inherit parent deadline
   - Abort cascade: walks tree by parent_request_id, AbortDependents/ContinueRunning
 3. **ADR conformance**:
   - ADR-005: irpc framing used
   - ADR-012: bidirectional streams, ID-based correlation
   - ADR-014: no secret material on wire, Capabilities non-serializable
   - ADR-015: internal flag switches authority (handler_identity vs identity), Visibility
   - ADR-016: abort cascade, AbortPolicy, default AbortDependents
   - ADR-017: connection direction independent of call direction
   - ADR-022: registration bundle (provenance, authority, scoped_env, capabilities)
   - ADR-023: ErrorDefinition, typed details in call.error
   - ADR-024: registry layering (curated + session + connection), OperationEnv as trait
 4. **Security constraints**:
   - Capabilities non-serializable (no Serialize derive)
   - Capabilities zeroized, immutable after construction
   - Metadata does not propagate through composition (fresh HashMap::new())
   - Call protocol carries no secret material
   - Internal ops return NOT_FOUND from wire (don't leak existence)
   - Reachability check (scoped_env.allows) bounds composition attack surface
   - Request IDs are UUID v4 (non-deterministic, no collisions)
 5. **Pattern consistency**:
   - OperationEnv is a trait (not concrete) — enables layering
   - CompositeOperationEnv uses contains() probe before dispatch
   - Authority switch in invoke_with_policy (child identity = parent handler_identity)
   - Deadline inheritance (children don't get fresh 30s)
   - ArcSwap not used in call (that's core's pattern)
 6. **Test coverage**:
   - Unit tests for AccessControl::check (all combinations)
   - Unit tests for OperationContext construction
   - Unit tests for OperationEnv (LocalOperationEnv, CompositeOperationEnv)
   - Unit tests for PendingRequestMap (all event types, timeouts, fail_all)
   - Unit tests for framing (round-trip, truncation)
   - Unit tests for abort cascade (both policies, tree walking)
   - Integration test: call.requested → dispatch → call.responded
   - Integration test: auth_token overrides identity
   - Integration test: Internal op → NOT_FOUND from wire
   - Integration test: ACL denied → FORBIDDEN
   - Integration test: subscription streaming (multiple responded, completed)
 ## Acceptance Criteria
 - [ ] All registry types match operation-registry.md
 - [ ] All protocol types match call-protocol.md
 - [ ] All ADRs conformed to (005, 012, 014, 015, 016, 017, 022, 023, 024)
 - [ ] Capabilities non-serializable, zeroized, immutable
 - [ ] Metadata does not propagate through composition
 - [ ] Internal ops return NOT_FOUND from wire
 - [ ] Reachability check bounds composition
 - [ ] Request IDs are UUID v4
 - [ ] OperationEnv is a trait (not concrete)
 - [ ] CompositeOperationEnv uses contains() probe
 - [ ] Authority switch correct (internal: true → handler_identity)
 - [ ] Deadline inheritance correct (children inherit parent deadline)
 - [ ] Test coverage adequate for all functionality
 - [ ] `cargo fmt --check -p alknet-call` passes
 - [ ] `cargo clippy -p alknet-call` passes with no warnings
 - [ ] All tests pass
 ## References
 - docs/architecture/crates/call/README.md
 - docs/architecture/crates/call/call-protocol.md
 - docs/architecture/crates/call/operation-registry.md
 - docs/architecture/decisions/ (relevant ADRs: 005, 012, 014-017, 022-024)
 ## Notes
 > This is the most complex crate in this batch. The review should verify that
 > the registry layering (ADR-024), authority switch (ADR-015), abort cascade
 > (ADR-016), and composition model (ADR-022) all work correctly together. The
 > OperationEnv trait-object design is load-bearing — verify it's a trait, not
 > concrete. If deviations are found, document and fix before considering the
 > call crate complete.
 ## Summary
 > To be filled on completion
--- a/tasks/core/auth.md
+++ b/tasks/core/auth.md
@@ -0,0 +1,162 @@
 ---
 id: core/auth
 name: Implement AuthContext, Identity, AuthToken, IdentityProvider trait, and ConfigIdentityProvider
 status: pending
 depends_on: [core/core-types]
 scope: moderate
 risk: medium
 impact: component
 level: implementation
 ---
 ## Description
 Implement the authentication types in `src/auth.rs`. Auth is hybrid: the
 endpoint resolves what it can (TLS-level), handlers resolve what they need
 (protocol-level). AuthContext may be partial — handlers complete auth inside
 `handle()`.
 ### AuthContext
 ```rust
 #[derive(Clone)]
 pub struct AuthContext {
    pub identity: Option<Identity>,
    pub alpn: Vec<u8>,
    pub remote_addr: Option<SocketAddr>,
    pub tls_client_fingerprint: Option<String>,
 }
 ```
 Created by the endpoint for each incoming connection. Passed to
 `ProtocolHandler::handle()` as an immutable reference.
 - `identity`: peer's authenticated identity, if resolved by the endpoint. None
  means the endpoint has no identity info for this connection.
 - `alpn`: negotiated ALPN — always present after TLS handshake.
 - `remote_addr`: peer's address, if available (may be None for iroh).
 - `tls_client_fingerprint`: SHA-256 fingerprint of TLS client cert, if presented.
 `AuthContext` is `Clone` (handlers clone for per-stream contexts) and immutable
 in `handle()` (handlers create local variables for resolved identity, they
 don't mutate the shared context).
 ### Identity
 ```rust
 #[derive(Debug, Clone, PartialEq)]
 pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
 }
 ```
 The authenticated peer identity. `id` is ALPN-agnostic:
 - SSH key auth: `"SHA256:abc123..."` (key fingerprint)
 - API key auth: `"alk_test"` (key prefix)
 - Certificate auth: `"username"` (principal name)
 ### AuthToken
 ```rust
 #[derive(Debug, Clone)]
 pub struct AuthToken {
    pub raw: Vec<u8>,
 }
 ```
 Opaque authentication token carried in protocol frames. The handler that
 extracted it knows its encoding.
 ### IdentityProvider trait
 ```rust
 pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
 }
 ```
 - `resolve_from_fingerprint()`: used by endpoint (TLS client cert) and SSH (key fingerprint)
 - `resolve_from_token()`: used by call protocol (AuthToken in first frame) and HTTP (Bearer header)
 - Both return `Option<Identity>` — None means credential not recognized
 ### ConfigIdentityProvider
 ```rust
 pub struct ConfigIdentityProvider {
    dynamic: Arc<ArcSwap<DynamicConfig>>,
 }
 ```
 The default implementation. Resolves identities from `DynamicConfig` (reads
 from ArcSwap on every call — hot-reloadable).
 Resolution logic:
 - **Fingerprint**: look up in `DynamicConfig::auth::authorized_fingerprints`.
  If found, return `Identity { id: fingerprint, scopes: ["relay:connect"], resources: {} }`.
 - **Token**: parse as UTF-8. If starts with `alk_`, look up in
  `DynamicConfig::auth::api_keys` by prefix match + SHA-256 hash. If found and
  not expired, return `Identity { id: prefix, scopes: entry.scopes, resources: entry.resources }`.
 Changes to DynamicConfig via ConfigReloadHandle are reflected immediately.
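 The resolution logic above can be sketched with the standard library alone. This is a minimal sketch, not the spec's implementation: the free functions `resolve_fingerprint` and `resolve_token`, the injected `hash_hex` and `now_secs` parameters, and the trimmed `ApiKeyEntry` are hypothetical. It assumes `expires_at` is Unix seconds, treats a key as expired once `expires_at <= now_secs`, and leaves `resources` empty because `ApiKeyEntry` in core/config lists no such field.
 ```rust
 use std::collections::{HashMap, HashSet};
 #[derive(Debug, Clone, PartialEq)]
 pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
 }
 // Trimmed copy of ApiKeyEntry (description omitted for brevity).
 pub struct ApiKeyEntry {
    pub prefix: String,
    pub hash: String,
    pub scopes: Vec<String>,
    pub expires_at: Option<u64>, // assumption: Unix seconds
 }
 pub struct AuthPolicy {
    pub authorized_fingerprints: HashSet<String>,
    pub api_keys: Vec<ApiKeyEntry>,
 }
 /// Known fingerprint -> identity with the fixed `relay:connect` scope.
 pub fn resolve_fingerprint(policy: &AuthPolicy, fingerprint: &str) -> Option<Identity> {
    if !policy.authorized_fingerprints.contains(fingerprint) {
        return None;
    }
    Some(Identity {
        id: fingerprint.to_string(),
        scopes: vec!["relay:connect".to_string()],
        resources: HashMap::new(),
    })
 }
 /// `hash_hex` stands in for a hex-encoded SHA-256 (hypothetical parameter so
 /// the sketch needs no hashing crate); `now_secs` is the current Unix time.
 pub fn resolve_token(
    policy: &AuthPolicy,
    raw: &[u8],
    now_secs: u64,
    hash_hex: impl Fn(&[u8]) -> String,
 ) -> Option<Identity> {
    let token = std::str::from_utf8(raw).ok()?;
    if !token.starts_with("alk_") {
        return None;
    }
    // The first 8 chars are the non-secret lookup prefix.
    let prefix = token.get(..8)?;
    let digest = hash_hex(raw);
    let entry = policy
        .api_keys
        .iter()
        .find(|e| e.prefix == prefix && e.hash == digest)?;
    // Reject expired keys; `None` means the key never expires.
    if matches!(entry.expires_at, Some(exp) if exp <= now_secs) {
        return None;
    }
    Some(Identity {
        id: entry.prefix.clone(),
        scopes: entry.scopes.clone(),
        resources: HashMap::new(), // assumption: no per-key resources
    })
 }
 ```
 In `ConfigIdentityProvider` these would run against the `AuthPolicy` loaded from the `ArcSwap` on each call, which is what makes reloads visible immediately.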
 ### Two Identity Scopes
 There are two distinct identity scopes that must not be conflated:
 | Scope | Where set | Where stored | Represents | Used for |
 |-------|-----------|--------------|------------|----------|
 | Connection-level | Handler in `handle()` | `Connection` (via `set_identity`) | Who opened the QUIC connection | Observability, logging |
 | Per-request | CallAdapter per `call.requested` | `OperationContext.identity` | Who makes this specific call | ACL (ADR-015) |
 The connection-level identity is stable (set once). The per-request identity
 is dynamic (resolved per call, potentially different across requests). The
 per-request identity takes precedence for ACL.
 ### Security constraints
 - **Token entropy**: generated `alk_` tokens must have ≥128 bits of entropy.
  The prefix (first 8 chars) is for O(1) lookup and is not secret — it appears
  in logs by design. Anyone who obtains the stored SHA-256 can test candidate
  tokens offline, so storing an unsalted hash is safe only if the full token is
  high-entropy.
 - **Config reload must be authenticated**: a reload that adds an authorized
  fingerprint or API key grants access immediately. The reload trigger must be
  local-only or admin-scoped.
 - **Connection-level identity is for observability only**: per-request identity
  takes precedence for ACL.
 ## Acceptance Criteria
 - [ ] `AuthContext` struct with all 4 fields, derives `Clone`
 - [ ] `Identity` struct with `id`, `scopes`, `resources`, derives `Clone`, `PartialEq`
 - [ ] `AuthToken` struct with `raw` field, derives `Clone`
 - [ ] `IdentityProvider` trait with both methods
 - [ ] `ConfigIdentityProvider` struct holding `Arc<ArcSwap<DynamicConfig>>`
 - [ ] `ConfigIdentityProvider::resolve_from_fingerprint` looks up in authorized_fingerprints
 - [ ] `ConfigIdentityProvider::resolve_from_token` parses `alk_` prefix, matches by hash, checks expiry
 - [ ] ConfigIdentityProvider reads from ArcSwap on every call (hot-reloadable)
 - [ ] Unit test: fingerprint resolution (known fingerprint → Some, unknown → None)
 - [ ] Unit test: token resolution (valid non-expired → Some, expired → None, unknown → None)
 - [ ] Unit test: config reload changes resolution results immediately
 - [ ] `cargo test -p alknet-core` succeeds
 - [ ] `cargo clippy -p alknet-core` succeeds with no warnings
 ## References
 - docs/architecture/crates/core/auth.md — all type definitions, resolution flow
 - docs/architecture/decisions/004-auth-as-shared-core.md — ADR-004
 - docs/architecture/decisions/011-authcontext-structure.md — ADR-011
 ## Notes
 > Auth is hybrid: endpoint resolves TLS-level, handler resolves protocol-level.
 > AuthContext may be partial (identity = None). The two identity scopes
 > (connection-level for observability, per-request for ACL) must not be
 > conflated. ConfigIdentityProvider reads from ArcSwap on every call so config
 > reloads take effect immediately.
 ## Summary
 > To be filled on completion
--- a/tasks/core/config.md
+++ b/tasks/core/config.md
@@ -0,0 +1,190 @@
 ---
 id: core/config
 name: Implement StaticConfig, DynamicConfig, AuthPolicy, ApiKeyEntry, ConfigReloadHandle, TlsIdentity
 status: pending
 depends_on: [core/core-types]
 scope: moderate
 risk: low
 impact: component
 level: implementation
 ---
 ## Description
 Implement the configuration types in `src/config.rs`. These are the config
 structures consumed by the endpoint and the CLI binary. StaticConfig is
 immutable at startup; DynamicConfig is hot-reloadable via ArcSwap.
 ### StaticConfig
 ```rust
 pub struct StaticConfig {
    pub listen_addr: Option<SocketAddr>,
    pub tls_identity: Option<TlsIdentity>,
    pub iroh_relay: Option<RelayUrl>,
    pub drain_timeout: Duration,
 }
 ```
 Immutable configuration resolved at startup. `listen_addr` is None for
 iroh-only nodes. `tls_identity` is required if `listen_addr` is Some.
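 The `listen_addr` / `tls_identity` invariant can be checked once at startup. A minimal sketch, assuming a hypothetical `validate()` method and assuming the violation maps to `ConfigError::IncompatibleOptions`; the types are trimmed stand-ins (`iroh_relay` and the `RawKey` variant are left out because they need iroh types):
 ```rust
 use std::net::SocketAddr;
 use std::path::PathBuf;
 use std::time::Duration;
 // Stand-in for TlsIdentity without the iroh-dependent RawKey variant.
 pub enum TlsIdentity {
    X509 { cert: PathBuf, key: PathBuf },
    SelfSigned,
 }
 // Stand-in for StaticConfig without `iroh_relay`.
 pub struct StaticConfig {
    pub listen_addr: Option<SocketAddr>,
    pub tls_identity: Option<TlsIdentity>,
    pub drain_timeout: Duration,
 }
 #[derive(Debug, PartialEq)]
 pub enum ConfigError {
    IncompatibleOptions,
 }
 impl StaticConfig {
    /// Hypothetical startup check: a public listener needs a TLS identity,
    /// while an iroh-only node (no `listen_addr`) may omit it.
    pub fn validate(&self) -> Result<(), ConfigError> {
        if self.listen_addr.is_some() && self.tls_identity.is_none() {
            return Err(ConfigError::IncompatibleOptions);
        }
        Ok(())
    }
 }
 ```
 Calling `validate()` before binding keeps the failure a `ConfigError` at startup rather than a TLS error on the first connection.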
 ### TlsIdentity
 ```rust
 pub enum TlsIdentity {
    X509 { cert: PathBuf, key: PathBuf },
    RawKey(iroh::SecretKey),
    SelfSigned,
 }
 ```
 Three modes (OQ-12):
 - `X509`: domain certificate for browser/WebTransport clients
 - `RawKey`: RFC 7250 raw Ed25519 public key — default for P2P, no domain/CA
 - `SelfSigned`: development only
 `RawKey` uses `iroh::SecretKey` (Ed25519) — re-exported from iroh, which
 alknet-core depends on (feature-gated). The key can be derived from
 alknet-vault at the assembly layer or generated fresh.
 ### DynamicConfig
 ```rust
 #[derive(Debug, Clone)]
 pub struct DynamicConfig {
    pub auth: AuthPolicy,
    pub rate_limits: RateLimitConfig,
 }
 ```
 Runtime-reloadable via ArcSwap.
 ### AuthPolicy
 ```rust
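 // Debug + Clone are needed because DynamicConfig derives them; Default yields the empty policy.
 #[derive(Debug, Clone, Default)]
 pub struct AuthPolicy {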
    pub authorized_fingerprints: HashSet<String>,
    pub api_keys: Vec<ApiKeyEntry>,
 }
 ```
 Fingerprints stored as strings (no russh dependency in core — ADR-003).
 Certificate authority entries deferred to alknet-ssh (omitted from v1 to avoid
 referencing an undefined type; adding back is additive).
 ### ApiKeyEntry
 ```rust
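 // Debug + Clone are needed because AuthPolicy (inside DynamicConfig) must be Debug + Clone.
 #[derive(Debug, Clone)]
 pub struct ApiKeyEntry {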
    pub prefix: String,
    pub hash: String,
    pub scopes: Vec<String>,
    pub description: String,
    pub expires_at: Option<u64>,
 }
 ```
 Carries forward from reference implementation. Prefix (first 8 chars) for O(1)
 lookup, SHA-256 hash for verification.
 ### RateLimitConfig
 ```rust
 // Debug + Clone are needed because DynamicConfig derives them.
 #[derive(Debug, Clone)]
 pub struct RateLimitConfig {
    pub max_connections_per_ip: usize,
    pub max_auth_attempts: usize,
 }
 ```
 ### ArcSwap pattern
 ```rust
 let dynamic = Arc::new(ArcSwap::new(Arc::new(DynamicConfig::default())));
 ```
 - Reads: `dynamic.load()` returns a `Guard<Arc<DynamicConfig>>` (lock-free); use `dynamic.load_full()` when an owned `Arc<DynamicConfig>` is needed
 - Writes: `dynamic.store(Arc::new(new_config))` — atomic swap
 - No locks: ArcSwap uses atomic operations
 ### ConfigReloadHandle
 ```rust
 pub struct ConfigReloadHandle {
    dynamic: Arc<ArcSwap<DynamicConfig>>,
 }
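 impl ConfigReloadHandle {
    /// Atomically replace the dynamic config; readers see it on their next load.
    pub fn reload(&self, new_config: DynamicConfig) {
        self.dynamic.store(Arc::new(new_config));
    }
    /// Owned snapshot of the current config.
    pub fn dynamic(&self) -> Arc<DynamicConfig> {
        self.dynamic.load_full()
    }
 }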
 ```
 - `reload()`: atomically replaces the dynamic config
 - `dynamic()`: returns current config as `Arc<DynamicConfig>`
 **Config reload is a privilege-escalation path.** A reload that adds an
 authorized fingerprint or API key grants access immediately. The reload
 trigger must be authenticated/local-only (SIGHUP, file watch, or admin call
 protocol operation). The implementation must not ship a reload endpoint with
 no auth "for convenience."
 ### ConfigError
 ```rust
 pub enum ConfigError {
    InvalidFlag { name: String },
    KeyFileNotFound { path: String },
    BindFailed(io::Error),
    TlsConfig(io::Error),
    IncompatibleOptions,
 }
 ```
 ### Defaults
 - `drain_timeout`: 2 seconds
 - `max_connections_per_ip`: implementation default (reference uses a reasonable value)
 - `max_auth_attempts`: implementation default
 - `DynamicConfig::default()`: empty auth policy, default rate limits
 ### What NOT to include
 Per the spec, StaticConfig does NOT include: `host_key`, `host_key_algorithm`,
 `proxy_config`, `stealth`, `transport_mode`, `listeners`. These are removed in
 the new model (ALPN dispatch replaces them — see config.md Key Differences).
 ## Acceptance Criteria
 - [ ] `StaticConfig` struct with all fields per config.md
 - [ ] `TlsIdentity` enum with X509, RawKey, SelfSigned variants
 - [ ] `DynamicConfig` struct with `auth` and `rate_limits` fields
 - [ ] `AuthPolicy` struct with `authorized_fingerprints` and `api_keys`
 - [ ] `ApiKeyEntry` struct with all 5 fields
 - [ ] `RateLimitConfig` struct with both fields
 - [ ] `ConfigReloadHandle` with `reload()` and `dynamic()` methods
 - [ ] `ConfigError` enum with all variants
 - [ ] `DynamicConfig` derives `Clone`, `Debug` (for ArcSwap)
 - [ ] Default values match config.md (drain_timeout = 2s, etc.)
 - [ ] No russh dependency (fingerprints as strings)
 - [ ] Unit tests for Default impls
 - [ ] Unit test: ConfigReloadHandle reload swaps config atomically
 - [ ] `cargo test -p alknet-core` succeeds
 - [ ] `cargo clippy -p alknet-core` succeeds with no warnings
 ## References
 - docs/architecture/crates/core/config.md — all type definitions
 - docs/architecture/decisions/003-crate-decomposition.md — ADR-003 (no russh in core)
 - docs/architecture/decisions/010-alpn-router-and-endpoint.md — ADR-010 (no ListenerConfig)
 ## Notes
 > Config reload is a privilege-escalation path — do not ship an unauthenticated
 > reload endpoint. The ArcSwap pattern carries forward from the reference
 > implementation. StaticConfig removes all SSH-centric fields (host_key,
 > stealth, transport_mode, listeners) — those are handler concerns now.
 ## Summary
 > To be filled on completion
--- a/tasks/core/core-types.md
+++ b/tasks/core/core-types.md
@@ -0,0 +1,224 @@
 ---
 id: core/core-types
 name: "Implement core types: ProtocolHandler, Connection, BiStream, SendStream, RecvStream, StreamError, HandlerError, Capabilities"
 status: pending
 depends_on: [core/crate-init]
 scope: broad
 risk: medium
 impact: component
 level: implementation
 ---
 ## Description
 Implement the core types in `src/types.rs`. These are the foundational
 abstractions that every handler crate depends on. This is the most
 cross-crate-boundary task in core — `Capabilities` in particular is used
 heavily by alknet-call's operation registry and composition model.
 ### ProtocolHandler trait
 ```rust
 #[async_trait]
 pub trait ProtocolHandler: Send + Sync + 'static {
    fn alpn(&self) -> &'static [u8];
    async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError>;
 }
 ```
 - `alpn()` returns the handler's ALPN identifier as a static byte string
 - `handle()` receives a `Connection` (not a single BiStream) and an `AuthContext`
 - Handlers that need a single stream call `connection.accept_bi()` once
 - Handlers that multiplex (SSH, call) open/accept streams as needed
 See ADR-002, ADR-007.
 ### HandlerError
 ```rust
 pub enum HandlerError {
    ConnectionClosed,
    StreamError(io::Error),
    AuthRequired,
    Internal(Box<dyn std::error::Error + Send + Sync>),
 }
 ```
 Non-fatal errors within `handle()`. The endpoint catches these, logs them,
 closes the connection. Other connections are unaffected. Handler panics are
 caught by tokio's task isolation.
 ### Connection
 ```rust
 pub struct Connection {
    // Private: wraps the underlying QUIC connection or test mock
    identity: OnceLock<Identity>,
 }
 impl Connection {
    #[cfg(feature = "quinn")]
    pub fn from_quinn(conn: quinn::Connection) -> Self;
    #[cfg(feature = "iroh")]
    pub fn from_iroh(conn: iroh::Connection) -> Self;
    pub async fn accept_bi(&self) -> Result<(SendStream, RecvStream), StreamError>;
    pub async fn open_bi(&self) -> Result<(SendStream, RecvStream), StreamError>;
    pub fn remote_alpn(&self) -> &[u8];
    pub fn remote_addr(&self) -> Option<SocketAddr>;
    pub fn close(&self, code: u32, reason: &str);
    pub fn set_identity(&self, identity: Identity) -> Result<(), IdentityAlreadySet>;
    pub fn identity(&self) -> Option<&Identity>;
 }
 ```
 - Opaque type wrapping a QUIC connection (quinn or iroh, feature-gated)
 - `set_identity` is write-once-read-many via `OnceLock` (OQ-11) — handlers
  store resolved identity for observability; the endpoint does NOT read it
  after `handle()` returns (the Connection is moved into the spawned task)
 - Internal enum dispatch for quinn vs iroh vs test mock
 - `Connection` does not expose quinn types in its public API
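 The write-once behaviour of `set_identity` follows directly from `std::sync::OnceLock`. A minimal sketch, assuming a `Connection` reduced to its identity slot and an `Identity` reduced to `id` (both trimmed for illustration; the error type drops the `thiserror` derive):
 ```rust
 use std::sync::OnceLock;
 #[derive(Debug, Clone, PartialEq)]
 pub struct Identity {
    pub id: String,
 }
 #[derive(Debug, PartialEq)]
 pub enum IdentityAlreadySet {
    AlreadySet,
 }
 // Only the identity slot is modelled; the QUIC connection is omitted.
 #[derive(Default)]
 pub struct Connection {
    identity: OnceLock<Identity>,
 }
 impl Connection {
    /// The first call stores the identity; later calls are rejected and the
    /// stored value is left untouched.
    pub fn set_identity(&self, identity: Identity) -> Result<(), IdentityAlreadySet> {
        self.identity
            .set(identity)
            .map_err(|_rejected| IdentityAlreadySet::AlreadySet)
    }
    pub fn identity(&self) -> Option<&Identity> {
        self.identity.get()
    }
 }
 ```
 Because `OnceLock::set` takes `&self`, handlers can record the identity without needing `&mut Connection` or an extra lock.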
 ### BiStream trait
 ```rust
 pub trait BiStream: AsyncRead + AsyncWrite + Send + Unpin {}
 ```
 A convenience trait for client-side code, test mocks, and future transport
 abstractions (WebTransport, raw TCP). Handlers that need a single stream
 obtain one via `connection.accept_bi()` and treat the pair as a BiStream.
 ### SendStream and RecvStream
 ```rust
 pub struct SendStream { /* wraps quinn::SendStream or iroh::SendStream or test mock */ }
 pub struct RecvStream { /* wraps quinn::RecvStream or iroh::RecvStream or test mock */ }
 impl AsyncWrite for SendStream { ... }
 impl AsyncRead for RecvStream { ... }
 ```
 Concrete wrapper types using internal enum dispatch to delegate to the
 appropriate QUIC stream type (quinn or iroh) in production, and to test mocks
 in tests.
 ### StreamError
 ```rust
 pub enum StreamError {
    ConnectionClosed,
    StreamClosed,
    Timeout,
    Internal(io::Error),
 }
 ```
 Returned by `accept_bi()`, `open_bi()`, and stream read/write operations.
 Maps from `quinn::ConnectionError` / `quinn::StreamError` and iroh equivalents.
 ### From<StreamError> for HandlerError
 ```rust
 impl From<StreamError> for HandlerError {
    fn from(e: StreamError) -> Self {
        match e {
            StreamError::ConnectionClosed => HandlerError::ConnectionClosed,
            StreamError::StreamClosed => HandlerError::StreamError(
                io::Error::new(io::ErrorKind::ConnectionReset, "stream closed")),
            StreamError::Timeout => HandlerError::StreamError(
                io::Error::new(io::ErrorKind::TimedOut, "stream timed out")),
            StreamError::Internal(e) => HandlerError::StreamError(e),
        }
    }
 }
 ```
 This `From` impl is the canonical conversion — handlers use `?` on
 `accept_bi()` / `open_bi()`.
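 A short sketch of how the conversion is exercised by `?`. The two enums and the `From` impl repeat the definitions above so the block is self-contained; `accept_bi_stub` and `handle_one` are hypothetical stand-ins for `connection.accept_bi()` and a handler body:
 ```rust
 use std::io;
 #[derive(Debug)]
 pub enum StreamError {
    ConnectionClosed,
    StreamClosed,
    Timeout,
    Internal(io::Error),
 }
 #[derive(Debug)]
 pub enum HandlerError {
    ConnectionClosed,
    StreamError(io::Error),
    AuthRequired,
    Internal(Box<dyn std::error::Error + Send + Sync>),
 }
 impl From<StreamError> for HandlerError {
    fn from(e: StreamError) -> Self {
        match e {
            StreamError::ConnectionClosed => HandlerError::ConnectionClosed,
            StreamError::StreamClosed => HandlerError::StreamError(
                io::Error::new(io::ErrorKind::ConnectionReset, "stream closed")),
            StreamError::Timeout => HandlerError::StreamError(
                io::Error::new(io::ErrorKind::TimedOut, "stream timed out")),
            StreamError::Internal(e) => HandlerError::StreamError(e),
        }
    }
 }
 // Hypothetical stand-in for `connection.accept_bi()`: fails when asked to.
 pub fn accept_bi_stub(fail_with: Option<StreamError>) -> Result<u32, StreamError> {
    match fail_with {
        Some(err) => Err(err),
        None => Ok(0), // pretend stream id
    }
 }
 // Handler-shaped function: `?` converts StreamError into HandlerError.
 pub fn handle_one(fail_with: Option<StreamError>) -> Result<u32, HandlerError> {
    let stream_id = accept_bi_stub(fail_with)?;
    Ok(stream_id)
 }
 ```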
 ### Capabilities
 ```rust
 #[derive(Clone, Zeroize, ZeroizeOnDrop)]
 pub struct Capabilities {
    entries: HashMap<String, Secret<String>>,
 }
 impl Capabilities {
    pub fn new() -> Self;
    pub fn with_api_key(mut self, service: &str, key: String) -> Self;
    pub fn with_http_token(mut self, service: &str, token: String) -> Self;
    pub fn get(&self, service: &str) -> Option<&Secret<String>>;
 }
 ```
 Critical constraints (ADR-014, ADR-022, review #002 W2):
 - **Non-serializable**: does NOT derive `Serialize`. Cannot appear in
  `EventEnvelope` payloads even by accident.
 - **Zeroized**: derives `Zeroize` and `ZeroizeOnDrop`. Secret material does
  not linger in freed heap memory.
 - **Clone + Send + Sync**: required by the composition model —
  `OperationEnv::invoke()` clones the parent's capabilities for each child.
 - **Immutable after construction**: no `set`, no `insert`, no `mut` accessors.
  This is the guard from review #002 W2: it keeps the clone strategy a
  genuinely two-way (reversible) choice, because Arc-based sharing and deep
  copy are behaviorally identical when neither supports mutation.
 - **Private fields**: the builder API (`new`, `with_*`) is the only
  construction path.
 Use `secrecy::Secret<String>` (from the `secrecy` crate) or a similar wrapper
 for the secret values. Add `secrecy` to dependencies if needed, or implement
 a simple `Secret` wrapper that zeroizes on drop and redacts in Debug.
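 A minimal sketch of the "simple `Secret` wrapper" route, using only the standard library. The `Secret` type here is a hypothetical stand-in for `secrecy`/`zeroize` (it wipes only the string's current length and has no compiler fence), and `with_http_token` is omitted. One thing worth checking before relying on the derive shown above: to the best of this sketch's knowledge the `zeroize` crate provides no `Zeroize` impl for `HashMap`, so the wipe may have to happen per value as it does here.
 ```rust
 use std::collections::HashMap;
 use std::fmt;
 /// Hypothetical stand-in for `secrecy::Secret<String>`.
 #[derive(Clone)]
 pub struct Secret(String);
 impl Secret {
    pub fn expose(&self) -> &str {
        &self.0
    }
 }
 impl fmt::Debug for Secret {
    // Never print the secret, even through `{:?}`.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Secret([REDACTED])")
    }
 }
 impl Drop for Secret {
    fn drop(&mut self) {
        // SAFETY: overwriting with zero bytes keeps the buffer valid UTF-8.
        for byte in unsafe { self.0.as_bytes_mut() } {
            // Volatile write so the wipe is not optimized away.
            unsafe { std::ptr::write_volatile(byte, 0) };
        }
    }
 }
 /// Builder-only construction, private field, no mutable accessors.
 #[derive(Clone, Default)]
 pub struct Capabilities {
    entries: HashMap<String, Secret>,
 }
 impl Capabilities {
    pub fn new() -> Self {
        Self::default()
    }
    pub fn with_api_key(mut self, service: &str, key: String) -> Self {
        self.entries.insert(service.to_string(), Secret(key));
        self
    }
    pub fn get(&self, service: &str) -> Option<&Secret> {
        self.entries.get(service)
    }
 }
 ```
 Dropping a `Capabilities` drops the map, which drops each `Secret` and wipes it; a clone owns independent copies, so a child's capabilities stay valid after the parent's are dropped.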
 ### IdentityAlreadySet error
 ```rust
 #[derive(Debug, thiserror::Error)]
 pub enum IdentityAlreadySet {
    #[error("connection identity already set")]
    AlreadySet,
 }
 ```
 Returned by `Connection::set_identity()` if called a second time.
 ## Acceptance Criteria
 - [ ] `ProtocolHandler` trait defined with `alpn()` and `handle()` (async)
 - [ ] `HandlerError` enum with all 4 variants
 - [ ] `Connection` struct with all methods (from_quinn/from_iroh feature-gated)
 - [ ] `Connection::set_identity` write-once via `OnceLock`, returns `IdentityAlreadySet` on second call
 - [ ] `BiStream` trait defined (AsyncRead + AsyncWrite + Send + Unpin)
 - [ ] `SendStream` implements `AsyncWrite`
 - [ ] `RecvStream` implements `AsyncRead`
 - [ ] `StreamError` enum with all 4 variants
 - [ ] `From<StreamError> for HandlerError` impl
 - [ ] `Capabilities` struct with `new()`, `with_api_key()`, `with_http_token()`, `get()`
 - [ ] `Capabilities` derives `Clone`, `Zeroize`, `ZeroizeOnDrop` — NOT `Serialize`
 - [ ] `Capabilities` fields are private (builder API only, no mut accessors)
 - [ ] `IdentityAlreadySet` error type
 - [ ] Unit tests for Capabilities (build, get, clone, zeroize)
 - [ ] Unit test: `Connection::set_identity` once succeeds, twice returns error
 - [ ] `cargo test -p alknet-core` succeeds
 - [ ] `cargo clippy -p alknet-core` succeeds with no warnings
 ## References
 - docs/architecture/crates/core/core-types.md — all type definitions
 - docs/architecture/decisions/002-protocol-handler-trait.md — ADR-002
 - docs/architecture/decisions/007-bistream-type-definition.md — ADR-007
 - docs/architecture/decisions/014-secret-material-flow-and-capability-injection.md — ADR-014 (Capabilities)
 - docs/architecture/decisions/022-handler-registration-provenance-and-composition-authority.md — ADR-022
 ## Notes
 > This is the most cross-crate-boundary task in core. `Capabilities` is used
 > heavily by alknet-call's operation registry and composition model — it must
 > be right the first time. The immutability guard (no mut accessors) is the
 > security control from review #002 W2 that makes clone semantics safe. The
 > `Connection` type uses internal enum dispatch for quinn/iroh/test — do not
 > expose quinn types in the public API.
 ## Summary
 > To be filled on completion
--- a/tasks/core/crate-init.md
+++ b/tasks/core/crate-init.md
@@ -0,0 +1,116 @@
 ---
 id: core/crate-init
 name: Initialize alknet-core crate with Cargo.toml, dependencies, and module skeleton
 status: pending
 depends_on: []
 scope: moderate
 risk: low
 impact: project
 level: implementation
 ---
 ## Description
 Initialize the `alknet-core` crate from scratch. The workspace currently has
 only `alknet-vault`. This task creates the crate directory, `Cargo.toml`,
 `lib.rs`, and the module skeleton that subsequent core tasks will fill in.
 ### Crate setup
 Create `crates/alknet-core/` with:
 - `Cargo.toml` — package metadata, dependencies, feature flags
 - `src/lib.rs` — crate root with module declarations and re-exports
 - Module skeleton files (empty or with `// TODO` markers) for:
  - `src/types.rs` — ProtocolHandler, HandlerError, Connection, BiStream, SendStream, RecvStream, StreamError, Capabilities
  - `src/auth.rs` — AuthContext, Identity, IdentityProvider, AuthToken, ConfigIdentityProvider
  - `src/config.rs` — StaticConfig, DynamicConfig, AuthPolicy, ApiKeyEntry, RateLimitConfig, ConfigReloadHandle, ConfigError, TlsIdentity
  - `src/endpoint.rs` — AlknetEndpoint, HandlerRegistry, EndpointError
 ### Dependencies
 Per the architecture specs (overview.md, core/README.md, endpoint.md):
 | Crate | Purpose |
 |-------|---------|
 | `tokio` 1 (full) | Async runtime, watch channel for shutdown |
 | `quinn` | QUIC endpoint (feature-gated) |
 | `iroh` | P2P relay-assisted endpoint (feature-gated) |
 | `rustls` | TLS implementation |
 | `rustls-pki-types` | TLS types (CertificateDer, PrivateKeyDer) |
 | `serde` 1 | Serialization for config types |
 | `serde_json` 1 | JSON for config, JSON Schema values |
 | `toml` 0.8 | Config file format |
 | `arc-swap` 1 | Atomic config swap for DynamicConfig |
 | `async-trait` 0.1 | ProtocolHandler trait (async fn in trait) |
 | `tracing` 0.1 | Structured logging |
 | `thiserror` 2 | Error enums |
 | `zeroize` 1 | Capabilities zeroization |
 | `bytes` 1 | Byte buffer types for streams |
 | `futures` | AsyncRead/AsyncWrite for BiStream trait |
 ### Feature flags
 ```toml
 [features]
 default = ["quinn"]
 quinn = ["dep:quinn"]
 iroh = ["dep:iroh"]
 ```
 Both quinn and iroh are optional, and both can be active simultaneously (ADR-010).
 `quinn` is default-on for the common case; `iroh` is opt-in.
 ### Workspace Cargo.toml
 Add `crates/alknet-core` to the workspace `members` list in the root
 `Cargo.toml`.
 ### Module skeleton
 ```rust
 // src/lib.rs
 //! alknet-core: Core library for ALPN-based protocol dispatch.
 pub mod types;
 pub mod auth;
 pub mod config;
 pub mod endpoint;
 // Re-exports (filled in by subsequent tasks)
 ```
 Each module file gets a doc comment and `// TODO: implement` marker. The
 subsequent tasks (core-types, config, auth, endpoint) fill these in.
 ## Acceptance Criteria
 - [ ] `crates/alknet-core/Cargo.toml` exists with all dependencies and feature flags
 - [ ] `crates/alknet-core/src/lib.rs` exists with module declarations
 - [ ] Module skeleton files exist: `types.rs`, `auth.rs`, `config.rs`, `endpoint.rs`
 - [ ] Root `Cargo.toml` `members` list includes `crates/alknet-core`
 - [ ] `cargo check -p alknet-core` succeeds
 - [ ] `cargo clippy -p alknet-core` succeeds with no warnings
 - [ ] Dual licensing: `MIT OR Apache-2.0` (workspace-inherited)
 ## References
 - docs/architecture/overview.md — crate graph, shared types
 - docs/architecture/crates/core/README.md — crate index
 - docs/architecture/crates/core/core-types.md — types to implement
 - docs/architecture/crates/core/endpoint.md — endpoint, features (quinn + iroh)
 - docs/architecture/crates/core/config.md — config types
 - docs/architecture/crates/core/auth.md — auth types
 - docs/architecture/decisions/003-crate-decomposition.md — ADR-003
 - docs/architecture/decisions/010-alpn-router-and-endpoint.md — ADR-010 (feature-gating)
 ## Notes
 > This is the foundational setup task for alknet-core. All subsequent core
 > tasks depend on this one. The crate has no alknet dependencies (vault is
 > standalone; core doesn't depend on vault). The feature flags for quinn/iroh
 > are important — both are optional and can be active simultaneously.
 ## Summary
 > To be filled on completion
--- a/tasks/core/endpoint.md
+++ b/tasks/core/endpoint.md
@@ -0,0 +1,249 @@
 ---
 id: core/endpoint
 name: Implement AlknetEndpoint, HandlerRegistry, accept loops (quinn + iroh), TLS identity, and graceful shutdown
 status: pending
 depends_on: [core/core-types, core/config, core/auth]
 scope: broad
 risk: high
 impact: component
 level: implementation
 ---
 ## Description
 Implement the ALPN router and endpoint in `src/endpoint.rs`. This is the
 integration point of alknet-core — it ties together the core types, config,
 and auth into the central runtime that accepts connections and dispatches to
 handlers by ALPN string.
 ### AlknetEndpoint
 ```rust
 pub struct AlknetEndpoint {
    #[cfg(feature = "quinn")]
    quinn: Option<quinn::Endpoint>,
    #[cfg(feature = "iroh")]
    iroh: Option<iroh::Endpoint>,
    handlers: Arc<HandlerRegistry>,
    dynamic: Arc<ArcSwap<DynamicConfig>>,
    identity_provider: Arc<dyn IdentityProvider>,
    shutdown: watch::Receiver<bool>,
 }
 ```
 Manages one or more QUIC connection sources, each feeding into the same ALPN
 router. Both quinn and iroh are optional (feature-gated), and both can be
 active simultaneously (ADR-010).
 ### HandlerRegistry
 ```rust
 pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
 }
 impl HandlerRegistry {
    pub fn new() -> Self;
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>);
    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>>;
    pub fn alpn_strings(&self) -> Vec<Vec<u8>>;
 }
 ```
 - `register()`: insert a handler. Panics if ALPN already registered.
 - `get()`: look up by ALPN string.
 - `alpn_strings()`: all registered ALPNs — used to build TLS ServerConfig
  (quinn) and ALPN list (iroh).
 - Registration is **static at startup** (OQ-04, ADR-010). The CLI builds the
  registry, inserts all handlers, passes to `AlknetEndpoint::new()`.
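 A minimal sketch of the registry methods, assuming a simplified `ProtocolHandler` that keeps only `alpn()` (the real trait also has the async `handle()`); `Echo` and its ALPN string are hypothetical:
 ```rust
 use std::collections::HashMap;
 use std::sync::Arc;
 // Simplified stand-in: only the part the registry needs.
 pub trait ProtocolHandler: Send + Sync + 'static {
    fn alpn(&self) -> &'static [u8];
 }
 #[derive(Default)]
 pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
 }
 impl HandlerRegistry {
    pub fn new() -> Self {
        Self::default()
    }
    /// Panics on a duplicate ALPN: registration is static at startup, so a
    /// collision is a wiring bug rather than a runtime condition.
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>) {
        let alpn = handler.alpn();
        let previous = self.handlers.insert(alpn, handler);
        assert!(
            previous.is_none(),
            "ALPN {:?} already registered",
            String::from_utf8_lossy(alpn)
        );
    }
    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>> {
        self.handlers.get(alpn)
    }
    /// Owned copies, suitable for a TLS ALPN list.
    pub fn alpn_strings(&self) -> Vec<Vec<u8>> {
        self.handlers.keys().map(|k| k.to_vec()).collect()
    }
 }
 // Hypothetical handler used only for illustration.
 pub struct Echo;
 impl ProtocolHandler for Echo {
    fn alpn(&self) -> &'static [u8] {
        b"alknet/echo"
    }
 }
 ```
 Keying the map by `&'static [u8]` lets `get()` take a borrowed ALPN from the handshake without allocating.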
 ### Accept loops
 Each active connection source runs its own accept loop. Both dispatch through
 the same `HandlerRegistry`.
 **Quinn accept loop** (public QUIC+TLS):
 ```
 loop {
    tokio::select! {
        Some(incoming) = quinn_endpoint.accept() => { // accept() yields None once the endpoint is closed
            let connection = incoming.await;
            match connection {
                Ok(conn) => dispatch(conn),
                Err(e) => { /* log TLS handshake failure, continue */ }
            }
        }
        _ = shutdown.changed() => break,
    }
 }
 ```
 **iroh accept loop** (P2P relay-assisted):
 ```
 loop {
    tokio::select! {
        Some(incoming) = iroh_endpoint.accept() => { // accept() yields None once the endpoint is closed
            let accepting = incoming.accept();
            let alpn = accepting.alpn().await;
            match alpn {
                Ok(alpn) => dispatch(alpn, accepting),
                Err(e) => { /* log handshake failure, continue */ }
            }
        }
        _ = shutdown.changed() => break,
    }
 }
 ```
 Use `iroh::Endpoint` directly (not iroh's `Router`) because our HandlerRegistry
 is shared between quinn and iroh, and our AuthContext construction differs per
 source. See iroh's `protocol.rs` for the reference pattern.
 ### Dispatch function (shared)
 ```
 fn dispatch(connection) {
    let alpn = connection.alpn();
    match handlers.get(alpn) {
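        Some(handler) => {
            let handler = Arc::clone(handler); // own the handler so the spawned task does not borrow the registry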
            let auth = AuthContext::from_connection(&connection);
            let conn = Connection::from_quinn(connection); // or from_iroh
            tokio::spawn(async move {
                if let Err(e) = handler.handle(conn, &auth).await {
                    // log error, connection closes
                }
            });
        }
        None => connection.close(0u32, "no handler"),
    }
 }
 ```
 ### AuthContext construction
 The endpoint constructs `AuthContext` from the QUIC connection:
 1. `alpn`: from `connection.alpn()` — always present
 2. `remote_addr`: from `connection.remote_addr()` — may be None for iroh
 3. `tls_client_fingerprint`: extracted from TLS session's client cert, if presented
 4. `identity`: if fingerprint available, call `IdentityProvider::resolve_from_fingerprint()`.
   If resolves, `identity = Some(resolved)`. If not, `identity = None`.
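 The four steps can be sketched as a pure function over values the endpoint has already pulled from the connection. This is a sketch under assumptions: `build_auth_context` and the `OneKey` provider are hypothetical, the trait is trimmed to the fingerprint method, and extracting the fingerprint from the TLS session is out of scope here.
 ```rust
 use std::collections::HashMap;
 use std::net::SocketAddr;
 #[derive(Debug, Clone, PartialEq)]
 pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
 }
 #[derive(Clone)]
 pub struct AuthContext {
    pub identity: Option<Identity>,
    pub alpn: Vec<u8>,
    pub remote_addr: Option<SocketAddr>,
    pub tls_client_fingerprint: Option<String>,
 }
 // Trimmed to the method the endpoint uses.
 pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
 }
 /// Hypothetical helper: identity is resolved only when a client cert
 /// fingerprint is present; otherwise the context stays partial.
 pub fn build_auth_context(
    alpn: &[u8],
    remote_addr: Option<SocketAddr>,
    tls_client_fingerprint: Option<String>,
    provider: &dyn IdentityProvider,
 ) -> AuthContext {
    let identity = tls_client_fingerprint
        .as_deref()
        .and_then(|fp| provider.resolve_from_fingerprint(fp));
    AuthContext {
        identity,
        alpn: alpn.to_vec(),
        remote_addr,
        tls_client_fingerprint,
    }
 }
 // Hypothetical provider that recognizes exactly one fingerprint.
 pub struct OneKey(pub &'static str);
 impl IdentityProvider for OneKey {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        if fingerprint != self.0 {
            return None;
        }
        Some(Identity {
            id: fingerprint.to_string(),
            scopes: vec!["relay:connect".to_string()],
            resources: HashMap::new(),
        })
    }
 }
 ```
 An unrecognized fingerprint still ends up in `tls_client_fingerprint`, so a handler can log it or attempt protocol-level auth with `identity` left as `None`.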
 ### TLS Identity
 Three modes per `TlsIdentity` (OQ-12):
 **RawKey (RFC 7250, default for P2P)**:
 - Build `rustls::ServerConfig` with a `ResolvesServerCert` whose `only_raw_public_keys()` returns `true`
 - The resolver generates the cert on-the-fly from the Ed25519 key
 - ~100 lines — see `iroh/iroh/src/tls/resolver.rs` for the reference pattern
 - Works natively with SSH auth and git; browsers do NOT support RFC 7250
 **X509 (domain-hosted)**:
 - Load cert/key from file paths
 - Standard `rustls::ServerConfig`
 - For browser/WebTransport clients and public domain services
 **SelfSigned (dev only)**:
 - Generate self-signed cert on startup
 - External clients will not trust it
 **ACME (future, not in this task)**:
 - The reverse-proxy project demonstrates the complete ACME pattern. It will be
  adapted as an additional `TlsIdentity` variant or `ResolvesServerCert` impl.
  For now, X509 with manual certs is the domain path. Note this as a TODO.
 The quinn endpoint's `rustls::ServerConfig` ALPN list is set from
 `registry.alpn_strings()` at construction time. The iroh endpoint's ALPN list
 is similarly derived. Both advertise the same set of ALPNs.
 ### Graceful shutdown
 ```rust
 impl AlknetEndpoint {
    pub fn shutdown_sender(&self) -> watch::Sender<bool>;
    pub async fn shutdown(&self) -> Result<(), EndpointError>;
 }
 ```
 - `shutdown_sender()`: clone of shutdown channel sender. `send(true)` signals shutdown.
 - `shutdown()`: signals all accept loops to stop, waits for in-flight connections
  with drain timeout (default 2s from StaticConfig), then forcefully closes remaining.
 - SIGTERM/SIGINT wired to shutdown channel by the CLI binary (not core's concern).
 ### EndpointError
 ```rust
 pub enum EndpointError {
    BindFailed(io::Error),
    TlsConfig(io::Error),
    HandlerNotFound(Vec<u8>),
 }
 ```
 Fatal errors that prevent the endpoint from starting or continuing.
 ### Accept loop error handling
 - **TLS handshake failure**: log and continue. Client may have offered no
  compatible ALPN, or cert may be untrusted.
 - **Handler panic**: caught by tokio's task isolation. Connection dropped,
  others continue.
 - **Connection-level errors** (quinn/iroh ConnectionError): log and continue.
  Accept loop keeps running.
 ### What the accept loops do NOT do
 - No byte-peeking (ALPN handles protocol detection)
 - No per-handler accept loops (ALPN unifies)
 - No SSH-specific logic (accept loop is ALPN-agnostic)
 ### TCP is NOT an endpoint concern
 Bare TCP (SSH over port 22) does not use QUIC or ALPN. TCP access is handled by
 individual handlers (the SSH handler can listen on TCP independently). This is
 handler-specific, not core endpoint.
 ## Acceptance Criteria
 - [ ] `AlknetEndpoint` struct with quinn/iroh (both Option, both feature-gated)
 - [ ] `HandlerRegistry` with new/register/get/alpn_strings
 - [ ] `register()` panics on duplicate ALPN
 - [ ] Quinn accept loop runs, dispatches by ALPN, respects shutdown
 - [ ] iroh accept loop runs, dispatches by ALPN, respects shutdown
 - [ ] Dispatch function spawns handler task via `tokio::spawn`
 - [ ] AuthContext constructed from connection (alpn, remote_addr, fingerprint, identity)
 - [ ] TLS RawKey mode: rustls ServerConfig with `only_raw_public_keys()`, on-the-fly cert
 - [ ] TLS X509 mode: load cert/key from files, standard ServerConfig
 - [ ] TLS SelfSigned mode: generate self-signed cert on startup
 - [ ] ALPN list in TLS ServerConfig set from `registry.alpn_strings()`
 - [ ] Graceful shutdown: signal accept loops to stop, drain timeout, force close
 - [ ] `EndpointError` enum with all variants
 - [ ] Accept loop errors logged, loop continues (no crash on handshake failure)
 - [ ] Handler panics caught by tokio task isolation (connection dropped, others continue)
 - [ ] No byte-peeking, no per-handler accept loops, no SSH-specific logic
 - [ ] Unit test: HandlerRegistry register/get/alpn_strings
 - [ ] Unit test: HandlerRegistry register panics on duplicate ALPN
 - [ ] Integration test: endpoint with mock handler, verify dispatch by ALPN
 - [ ] `cargo test -p alknet-core` succeeds
 - [ ] `cargo clippy -p alknet-core` succeeds with no warnings
 ## References
 - docs/architecture/crates/core/endpoint.md — full endpoint spec
 - docs/architecture/decisions/001-alpn-protocol-dispatch.md — ADR-001
 - docs/architecture/decisions/010-alpn-router-and-endpoint.md — ADR-010
 - docs/architecture/decisions/006-alpn-convention-and-connection-model.md — ADR-006
 - docs/architecture/decisions/007-bistream-type-definition.md — ADR-007
 - iroh reference: `/workspace/iroh/iroh/src/protocol.rs` (accept loop pattern)
 - iroh reference: `/workspace/iroh/iroh/src/tls/resolver.rs` (RFC 7250 raw key)
 ## Notes
 > This is the integration point of alknet-core — it ties together types, config,
 > and auth. The highest-risk task in core because it involves QUIC connection
 > handling, TLS identity (3 modes), and graceful shutdown. The RFC 7250 raw key
 > path is ~100 lines (iroh has a reference implementation). ACME is deferred —
 > note as TODO, use X509 manual certs for the domain path for now. TCP is NOT
 > an endpoint concern — it's handler-specific.
 ## Summary
 > To be filled on completion
--- a/tasks/core/review-core.md
+++ b/tasks/core/review-core.md
@@ -0,0 +1,122 @@
 ---
 id: core/review-core
 name: Review alknet-core implementation for spec conformance and pattern consistency
 status: pending
 depends_on: [core/endpoint]
 scope: moderate
 risk: low
 impact: phase
 level: review
 ---
 ## Description
 Review the alknet-core implementation for spec conformance, pattern
 consistency, and correctness before alknet-call (which depends on core)
 begins implementation. This is the quality checkpoint at the end of the core
 phase.
 ### Review Checklist
 1. **Core types conformance** (core-types.md):
   - `ProtocolHandler` trait signature matches spec (alpn, handle)
   - `HandlerError` has all 4 variants (ConnectionClosed, StreamError, AuthRequired, Internal)
   - `Connection` has all methods, from_quinn/from_iroh feature-gated
   - `Connection::set_identity` is write-once via OnceLock
   - `BiStream` is a trait (AsyncRead + AsyncWrite + Send + Unpin)
   - `SendStream` implements AsyncWrite, `RecvStream` implements AsyncRead
   - `StreamError` has all 4 variants
   - `From<StreamError> for HandlerError` impl matches spec mapping table
   - `Capabilities` is non-serializable, zeroized, immutable, Clone+Send+Sync
   - `Capabilities` has builder API (new, with_api_key, with_http_token, get), private fields
 2. **Config conformance** (config.md):
   - `StaticConfig` fields match (listen_addr, tls_identity, iroh_relay, drain_timeout)
   - `TlsIdentity` has X509, RawKey, SelfSigned
   - `DynamicConfig` has auth and rate_limits
   - `AuthPolicy` has authorized_fingerprints (HashSet<String>), api_keys (Vec<ApiKeyEntry>)
   - `ApiKeyEntry` has all 5 fields (prefix, hash, scopes, description, expires_at)
   - `ConfigReloadHandle` has reload() and dynamic()
   - No russh dependency (fingerprints as strings)
   - No removed fields (host_key, stealth, transport_mode, listeners)
 3. **Auth conformance** (auth.md):
   - `AuthContext` has all 4 fields, derives Clone
   - `Identity` has id, scopes, resources
   - `AuthToken` has raw field
   - `IdentityProvider` trait with both methods
   - `ConfigIdentityProvider` reads from ArcSwap on every call
   - Fingerprint resolution looks up in authorized_fingerprints
   - Token resolution: alk_ prefix, hash match, expiry check
   - Two identity scopes documented (connection-level vs per-request)
 4. **Endpoint conformance** (endpoint.md):
   - `AlknetEndpoint` has quinn/iroh (both Option, both feature-gated)
   - `HandlerRegistry` register/get/alpn_strings, panics on duplicate
   - Quinn accept loop: select on accept + shutdown, dispatch by ALPN
   - iroh accept loop: select on accept + shutdown, dispatch by ALPN
   - Dispatch spawns handler task via tokio::spawn
   - AuthContext constructed from connection (alpn, remote_addr, fingerprint, identity)
   - TLS RawKey: only_raw_public_keys(), on-the-fly cert from Ed25519
   - TLS X509: load from files
   - TLS SelfSigned: generate on startup
   - ALPN list in ServerConfig from registry.alpn_strings()
   - Graceful shutdown: drain timeout, force close
   - EndpointError has all 3 variants
   - No byte-peeking, no per-handler loops, no SSH-specific logic
 5. **Pattern consistency**:
   - ArcSwap used consistently for DynamicConfig
   - Feature flags (quinn, iroh) gate transport code correctly
   - Error handling patterns consistent (thiserror, Result propagation)
   - No quinn/iroh types in public API (Connection wraps them)
 6. **Security constraints**:
   - Capabilities non-serializable (no Serialize derive)
   - Capabilities zeroized (Zeroize, ZeroizeOnDrop)
   - Capabilities immutable (no mut accessors)
   - Config reload is treated as a privileged operation (no unauthenticated reload endpoint)
   - Token entropy requirement documented
 7. **Test coverage**:
   - Unit tests for Capabilities (build, get, clone, zeroize)
   - Unit tests for config types and reload
   - Unit tests for auth resolution (fingerprint, token, expiry)
   - Unit tests for HandlerRegistry
   - Integration test: endpoint dispatch by ALPN
 ## Acceptance Criteria
 - [ ] All core types match core-types.md
 - [ ] All config types match config.md
 - [ ] All auth types match auth.md
 - [ ] Endpoint matches endpoint.md (accept loops, TLS modes, shutdown)
 - [ ] Capabilities security constraints satisfied (non-serializable, zeroized, immutable)
 - [ ] No russh dependency in core
 - [ ] No quinn/iroh types in public API
 - [ ] ArcSwap pattern consistent
 - [ ] Feature flags gate transport code correctly
 - [ ] Test coverage adequate for all functionality
 - [ ] `cargo fmt --check -p alknet-core` passes
 - [ ] `cargo clippy -p alknet-core` passes with no warnings
 - [ ] All tests pass
 ## References
 - docs/architecture/crates/core/README.md
 - docs/architecture/crates/core/core-types.md
 - docs/architecture/crates/core/config.md
 - docs/architecture/crates/core/auth.md
 - docs/architecture/crates/core/endpoint.md
 - docs/architecture/decisions/ (relevant ADRs: 001-011, 014, 015, 022)
 ## Notes
 > This review verifies core is spec-conformant before alknet-call begins.
 > alknet-call depends heavily on core types (ProtocolHandler, Connection,
 > AuthContext, Capabilities, IdentityProvider) — any issues here propagate to
 > call. If deviations are found, document and fix before proceeding.
 ## Summary
 > To be filled on completion
--- a/tasks/vault/cache-zeroization-test.md
+++ b/tasks/vault/cache-zeroization-test.md
@@ -0,0 +1,85 @@
 ---
 id: vault/cache-zeroization-test
 name: Verify and test that HashMap::clear() drops CachedKey values triggering zeroization
 status: pending
 depends_on: []
 scope: single
 risk: low
 impact: isolated
 level: implementation
 ---
 ## Description
 Fix drift item #6: `KeyCache::clear()` removes entries and relies on
 `CachedKey`'s `Drop` impl for zeroization. The spec says to verify that
 `HashMap::clear()` actually drops the values (it does, but this is worth a
 test). This task adds a test that proves zeroization happens on cache eviction
 and clear.
 ### Background
 `CachedKey` derives `Zeroize` and `ZeroizeOnDrop` (via the `DerivedKey` it
 holds, which is `#[zeroize(drop)]`). When the cache evicts an entry (LRU or TTL)
 or `clear()` is called, the `CachedKey` is dropped, which triggers
 `ZeroizeOnDrop` — the private key bytes are zeroized before deallocation.
 `HashMap::clear()` drops all values, which triggers their `Drop` impls. This
 is standard Rust behavior, but the security-critical nature of key material
 warrants an explicit test.
 ### What to add
 A test in `cache.rs` (or `tests/`) that:
 1. Inserts a `CachedKey` with a known private key into the cache
 2. Verifies the key is present
 3. Calls `clear()` (or evicts via LRU/TTL)
 4. Verifies the `CachedKey` was dropped and zeroized
 Testing zeroization directly is tricky because the memory is freed — you can't
 easily inspect it after drop. A practical approach:
 - **Option A**: Use a custom type with a `Drop` impl that sets a flag (e.g., an
  `Arc<AtomicBool>`) when it is dropped. Insert it into the cache, clear, verify
  the flag is set. This tests the drop path, not the zeroize path directly, but
  confirms `clear()` drops values.
 - **Option B**: Test the LRU eviction path — fill the cache to `max_entries`,
  insert one more, verify the LRU entry was evicted (dropped).
 - **Option C**: Test that `lock()` calls `cache.clear()` and the cache is empty
  afterward (integration test via `VaultServiceHandle`).
 At minimum, implement Option B and C. Option A is a bonus if feasible without
 over-engineering the test type.
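 Option A can be sketched with the standard library only. `DropProbe` and
 `clear_runs_drop` are hypothetical names, and a plain `HashMap` stands in for
 `KeyCache`; the sketch confirms that `clear()` runs `Drop`, not that
 `ZeroizeOnDrop` wipes the bytes.
 ```rust
 use std::collections::HashMap;
 use std::sync::atomic::{AtomicBool, Ordering};
 use std::sync::Arc;
 /// Hypothetical stand-in for `CachedKey`: records that `Drop` ran.
 struct DropProbe {
    secret: [u8; 32],
    dropped: Arc<AtomicBool>,
 }
 impl Drop for DropProbe {
    fn drop(&mut self) {
        // The real type relies on `ZeroizeOnDrop`; this only mimics the wipe.
        self.secret.fill(0);
        self.dropped.store(true, Ordering::SeqCst);
    }
 }
 /// Inserts a probe, clears the map, and reports whether `Drop` ran.
 fn clear_runs_drop() -> bool {
    let dropped = Arc::new(AtomicBool::new(false));
    let mut cache: HashMap<String, DropProbe> = HashMap::new();
    cache.insert(
        "m/74'/2'/0'/0'".to_string(),
        DropProbe { secret: [0xAB; 32], dropped: Arc::clone(&dropped) },
    );
    // The value is still alive while the map owns it.
    assert!(!dropped.load(Ordering::SeqCst));
    cache.clear();
    cache.is_empty() && dropped.load(Ordering::SeqCst)
 }
 ```
 The same probe type can be reused for the LRU and TTL paths by checking the
 flag after the eviction instead of after `clear()`.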
 ### Scope
 This task touches `cache.rs` (test additions) and possibly `tests/`. It does
 not depend on the irpc removal task (drift #4) because `cache.rs` is a separate
 file. It can run in parallel with drift #4.
 ## Acceptance Criteria
 - [ ] Test: LRU eviction drops the evicted `CachedKey` (cache exceeds `max_entries`, oldest evicted)
 - [ ] Test: `lock()` clears the cache (verify cache is empty after lock)
 - [ ] Test: TTL expiry evicts entries (set short TTL, wait, verify entry gone)
 - [ ] Test: `clear()` removes all entries (verify empty after clear)
 - [ ] `cargo test` succeeds
 - [ ] `cargo clippy` succeeds with no warnings
 ## References
 - docs/architecture/crates/vault/README.md — Known Source Drift table item #6
 - docs/architecture/crates/vault/service.md — Cache section, Security Constraints
 - docs/architecture/crates/vault/encryption.md — Security Constraints
 ## Notes
 > `HashMap::clear()` does drop values, triggering their `Drop` impls. This is
 > standard Rust behavior, but key material is security-critical enough to
 > warrant an explicit test. This task touches only `cache.rs` and can run in
 > parallel with the irpc removal task (drift #4).
 ## Summary
 > To be filled on completion
--- a/tasks/vault/derivedkey-serialization.md
+++ b/tasks/vault/derivedkey-serialization.md
@@ -0,0 +1,140 @@
 ---
 id: vault/derivedkey-serialization
 name: Implement always-redact DerivedKey serialization and reject redacted payloads on deserialize
 status: pending
 depends_on: [vault/irpc-removal]
 scope: narrow
 risk: medium
 impact: component
 level: implementation
 ---
 ## Description
 Fix drift item #5: `DerivedKey` currently has dual serialization behavior — JSON
 redacts the private key, but postcard (the binary format used by irpc) preserves
 the raw bytes. ADR-025 dropped the postcard/remote path, so `DerivedKey` should
 **always** redact on serialize and reject `"[REDACTED]"` on deserialize with an
 explicit error.
 ### Current state
 `protocol.rs` has `DerivedKey` with `#[derive(Serialize, Deserialize)]` (or
 similar) whose output redacts the private key in JSON but preserves the raw
 bytes in postcard. The postcard tests in the test suite verify the binary
 round-trip.
 ### Target state
 Per `docs/architecture/crates/vault/protocol.md` → Serialization Redaction:
 `DerivedKey` must **not** derive `Serialize` or `Deserialize` via `#[derive]`.
 It needs custom `Serialize` and `Deserialize` impls:
 **Custom Serialize** — always redacts `private_key`:
 ```rust
 impl serde::Serialize for DerivedKey {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where S: serde::Serializer {
        use serde::ser::SerializeStruct;
        let mut s = serializer.serialize_struct("DerivedKey", 3)?;
        s.serialize_field("key_type", &self.key_type)?;
        s.serialize_field("private_key", "[REDACTED]")?;
        s.serialize_field("public_key", &self.public_key)?;
        s.end()
    }
 }
 ```
 **Custom Deserialize** — rejects `"[REDACTED]"` with an explicit error:
 ```rust
 impl<'de> serde::Deserialize<'de> for DerivedKey {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where D: serde::Deserializer<'de> {
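        // A redacted payload carries `private_key` as the string "[REDACTED]",
        // a genuine one as a byte sequence. Accept both shapes so the redacted
        // case reaches the explicit error instead of a generic type mismatch.
        #[derive(serde::Deserialize)]
        #[serde(untagged)]
        enum PrivateKeyField {
            Bytes(Vec<u8>),
            Text(String),
        }
        #[derive(serde::Deserialize)]
        struct DerivedKeyHelper {
            key_type: KeyType,
            private_key: PrivateKeyField,
            public_key: Vec<u8>,
        }
        let helper = DerivedKeyHelper::deserialize(deserializer)?;
        let private_key = match helper.private_key {
            PrivateKeyField::Bytes(bytes) if bytes != b"[REDACTED]" => bytes,
            _ => {
                return Err(serde::de::Error::custom(
                    "DerivedKey.private_key is \"[REDACTED]\": redacted payloads \
                     cannot be deserialized. JSON round-tripping a DerivedKey is \
                     not supported (the private key is gone)."
                ));
            }
        };
        Ok(DerivedKey {
            key_type: helper.key_type,
            private_key,
            public_key: helper.public_key,
        })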
    }
 }
 ```
 **Debug impl** — also redacts:
 ```rust
 impl fmt::Debug for DerivedKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("DerivedKey")
            .field("key_type", &self.key_type)
            .field("private_key", &"[REDACTED]")
            .field("public_key", &self.public_key)
            .finish()
    }
 }
 ```
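 The `{:?}` criterion can be checked with the standard library alone. A minimal
 sketch, assuming a stand-in `KeyType` with a single hypothetical `Ed25519`
 variant and a hypothetical `debug_output` helper; the `Debug` impl repeats the
 one shown above.
 ```rust
 use std::fmt;
 /// Stand-in for the real `KeyType`; the variant name is an assumption.
 #[derive(Debug)]
 enum KeyType {
    Ed25519,
 }
 /// Stand-in for the real `DerivedKey` with the same three fields.
 struct DerivedKey {
    key_type: KeyType,
    private_key: Vec<u8>,
    public_key: Vec<u8>,
 }
 impl fmt::Debug for DerivedKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("DerivedKey")
            .field("key_type", &self.key_type)
            .field("private_key", &"[REDACTED]")
            .field("public_key", &self.public_key)
            .finish()
    }
 }
 /// Formats a key holding `private_key` so a test can inspect the output.
 fn debug_output(private_key: Vec<u8>) -> String {
    let key = DerivedKey {
        key_type: KeyType::Ed25519,
        private_key,
        public_key: vec![1, 2, 3],
    };
    format!("{:?}", key)
 }
 ```
 A test then asserts that the output contains `[REDACTED]` and none of the
 private key's byte values.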
 ### Remove postcard tests
 The postcard round-trip tests (which verified binary format preserved private
 key bytes) are removed — ADR-025 dropped that path. The `postcard`
 dev-dependency was removed in the irpc removal task (drift #4).
 ### Why custom impls instead of derives
 A derived `Deserialize` would generate a default impl that conflicts with the
 manual one, and would only fail incidentally (serde type mismatch: string vs
 sequence), not with the explicit redaction-rejection error the spec requires.
 The custom impl is required for the explicit error message.
 ### Scope
 This task touches `protocol.rs` (the `DerivedKey` type, its serde impls, Debug
 impl) and test files (remove postcard tests, add redaction tests). It depends on
 the irpc removal task (drift #4) because both modify `protocol.rs`.
 ## Acceptance Criteria
 - [ ] `DerivedKey` does not derive `Serialize` or `Deserialize` via `#[derive]`
 - [ ] Custom `Serialize` impl always redacts `private_key` as `"[REDACTED]"`
 - [ ] Custom `Deserialize` impl rejects `private_key == b"[REDACTED]"` with explicit error
 - [ ] Custom `Debug` impl redacts `private_key` as `"[REDACTED]"`
 - [ ] Postcard round-trip tests removed
 - [ ] Unit test: JSON serialize produces `"[REDACTED]"` for `private_key`
 - [ ] Unit test: JSON deserialize of a redacted payload returns an error (not a corrupted key)
 - [ ] Unit test: `{:?}` on `DerivedKey` does not contain private key bytes
 - [ ] `cargo test` succeeds
 - [ ] `cargo clippy` succeeds with no warnings
 ## References
 - docs/architecture/crates/vault/README.md — Known Source Drift table item #5
 - docs/architecture/crates/vault/protocol.md — Serialization Redaction, Debug redaction
 - docs/architecture/decisions/025-vault-local-only-dispatch.md — ADR-025 (resolves W8)
 - docs/architecture/decisions/014-secret-material-flow-and-capability-injection.md — ADR-014
 ## Notes
 > The redaction is defense-in-depth for logging safety, not the primary control
 > — the primary control is that `DerivedKey` never crosses the call protocol
 > wire (ADR-014). ADR-025 dropped the postcard/remote path that previously
 > preserved bytes in binary formats. The custom Deserialize impl is required
 > because `#[derive(Deserialize)]` would conflict and not produce the explicit
 > redaction-rejection error. Depends on irpc removal because both modify
 > `protocol.rs`.
 ## Summary
 > To be filled on completion
--- a/tasks/vault/irpc-removal.md
+++ b/tasks/vault/irpc-removal.md
@@ -0,0 +1,106 @@
 ---
 id: vault/irpc-removal
 name: Remove irpc dependency and actor dispatch from vault, convert to direct method calls on VaultServiceHandle
 status: pending
 depends_on: []
 scope: broad
 risk: high
 impact: component
 level: implementation
 ---
 ## Description
 Remove the irpc-based actor dispatch from the vault crate and convert to direct
 method calls on `VaultServiceHandle`. This is drift item #4 from the vault README
 drift table and the foundational ADR-025 refactor — it restructures `service.rs`
 and `protocol.rs` fundamentally, which is why most other vault drift tasks depend
 on this one.
 ### What to remove
 - `VaultProtocol` enum with `#[rpc_requests]` derive in `protocol.rs`
 - `VaultServiceActor` in `service.rs`
 - `Client<VaultProtocol>` usage
 - `irpc` and `irpc-derive` dependencies from `Cargo.toml`
 - `postcard` from dev-dependencies (was only needed for the irpc binary path)
 - `tokio` dependency from `Cargo.toml` (the vault uses `std::sync::RwLock`, not
  `tokio::sync::RwLock` — ADR-025)
 - `VaultMessage` / `VaultProtocol` re-exports from `lib.rs`
 ### What to keep / change
 - `VaultServiceHandle` stays — it becomes the sole API. It is already
  `Arc<std::sync::RwLock<VaultServiceInner>>` with synchronous methods. The actor
  path (`mpsc` channel + oneshot backchannels via irpc's `Service` trait) is
  removed entirely.
 - `VaultServiceError` drops `Serialize`/`Deserialize` derives (were needed for
  irpc dispatch — ADR-025 removed that path). It becomes a plain `thiserror::Error`
  enum.
 - `DerivedKey` and `KeyType` stay in `protocol.rs`. With the `VaultProtocol`
  enum removed, the file is effectively the types module, but the filename
  stays `protocol.rs` for continuity.
 - `lib.rs` re-exports are updated to remove `VaultMessage`, `VaultProtocol`,
  `VaultServiceActor` and reflect the new public API per the vault README's
  Public API section.
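 With the actor path gone, the handle reduces to a cloneable wrapper around a
 shared lock with plain synchronous methods. A rough sketch of that shape, under
 stated assumptions: the `seed` field, `seed_len`, and the `Locked` variant are
 hypothetical, and the real fields and methods are defined in service.md.
 ```rust
 use std::sync::{Arc, RwLock};
 /// Hypothetical inner state; the real `VaultServiceInner` also owns the cache.
 #[derive(Default)]
 struct VaultServiceInner {
    seed: Option<Vec<u8>>,
 }
 /// Hypothetical, reduced error type (no serde derives).
 #[derive(Debug, PartialEq)]
 enum VaultServiceError {
    Locked,
 }
 /// Cloning the handle shares the same inner state.
 #[derive(Clone, Default)]
 struct VaultServiceHandle {
    inner: Arc<RwLock<VaultServiceInner>>,
 }
 impl VaultServiceHandle {
    /// Stores the seed; synchronous, no channel and no `.await`.
    fn unlock(&self, seed: Vec<u8>) {
        let mut inner = self.inner.write().unwrap_or_else(|e| e.into_inner());
        inner.seed = Some(seed);
    }
    /// Drops the seed.
    fn lock(&self) {
        let mut inner = self.inner.write().unwrap_or_else(|e| e.into_inner());
        inner.seed = None;
    }
    /// Illustrative read-only call that fails while the vault is locked.
    fn seed_len(&self) -> Result<usize, VaultServiceError> {
        let inner = self.inner.read().unwrap_or_else(|e| e.into_inner());
        inner.seed.as_ref().map(|s| s.len()).ok_or(VaultServiceError::Locked)
    }
 }
 ```
 Callers hold a `VaultServiceHandle` clone and call methods directly; nothing in
 this shape needs an async runtime.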
 ### Public API after this task
 Per `docs/architecture/crates/vault/README.md` → Public API:
 ```rust
 pub use mnemonic::{Language, Mnemonic, Seed};
 pub use derivation::{DerivationError, ExtendedPrivKey, PATHS};
 pub use encryption::{EncryptedData, EncryptionError, EncryptionKey};
 pub use encryption::CURRENT_KEY_VERSION;
 pub use protocol::{DerivedKey, KeyType};
 pub use service::{VaultServiceError, VaultServiceHandle};
 pub use cache::CacheConfig;
 ```
 ### Cargo.toml changes
 Remove from `[dependencies]`:
 - `irpc = { workspace = true }`
 - `irpc-derive = { workspace = true }`
 - `tokio = { version = "1", features = ["sync", "rt", "macros"] }`
 Remove from `[dev-dependencies]`:
 - `postcard = { version = "1", features = ["alloc"] }`
 The vault should have **zero** async runtime dependency after this task.
 ## Acceptance Criteria
 - [ ] `VaultProtocol` enum and `#[rpc_requests]` derive removed from `protocol.rs`
 - [ ] `VaultServiceActor` removed from `service.rs`
 - [ ] `Client<VaultProtocol>` usage removed
 - [ ] `irpc`, `irpc-derive`, `tokio` removed from `[dependencies]` in `Cargo.toml`
 - [ ] `postcard` removed from `[dev-dependencies]` in `Cargo.toml`
 - [ ] `VaultServiceError` no longer derives `Serialize`/`Deserialize`
 - [ ] `lib.rs` re-exports match the Public API section of vault README (no `VaultMessage`, `VaultProtocol`, `VaultServiceActor`)
 - [ ] `VaultServiceHandle` methods are all synchronous (no `async`, no `.await`)
 - [ ] `cargo check` succeeds
 - [ ] `cargo clippy` succeeds with no warnings
 - [ ] `cargo test` succeeds (existing tests updated to remove irpc/postcard usage)
 - [ ] No `tokio` dependency remains in the vault `Cargo.toml`
 ## References
 - docs/architecture/crates/vault/README.md — Known Source Drift table item #4, Public API section
 - docs/architecture/crates/vault/service.md — Dispatch section, VaultServiceHandle
 - docs/architecture/crates/vault/protocol.md — Local-Only by Construction
 - docs/architecture/decisions/025-vault-local-only-dispatch.md — ADR-025
 ## Notes
 > This is the foundational vault refactor. It restructures `service.rs` and
 > `protocol.rs` — most other vault drift tasks touch these same files and must
 > follow this one to avoid merge conflicts. The `VaultServiceHandle` struct
 > already uses `std::sync::RwLock` with synchronous methods; the actor path is
 > the dead code to remove. After this task, the vault has no async runtime
 > dependency and no RPC framework dependency — it is local-only by construction.
 ## Summary
 > To be filled on completion
--- a/tasks/vault/key-versioning-rotation.md
+++ b/tasks/vault/key-versioning-rotation.md
@@ -0,0 +1,127 @@
 ---
 id: vault/key-versioning-rotation
 name: Implement version-indexed encryption key paths, bump CURRENT_KEY_VERSION to 2, and add rotate method
 status: pending
 depends_on: [vault/irpc-removal]
 scope: moderate
 risk: medium
 impact: component
 level: implementation
 ---
 ## Description
 Fix drift items #3, #9, and #10 as one coherent feature: the version-indexed
 key rotation mechanism from ADR-021. These three drifts are tightly coupled —
 `CURRENT_KEY_VERSION = 2` (drift #3), version-aware `encrypt`/`decrypt` via
 `encryption_path_for_version` (drift #9), and the `rotate` method (drift #10)
 form the complete key rotation feature. Splitting them would produce tasks that
 don't compile independently.
 ### Drift #3: Bump CURRENT_KEY_VERSION
 Current: `CURRENT_KEY_VERSION = 1` (but the key is HD-derived, and v1 is
 reserved for the TypeScript PBKDF2 legacy per ADR-020).
 Target: `CURRENT_KEY_VERSION = 2` (HD-derived, per ADR-020).
 Version semantics:
 - v1: TypeScript predecessor's PBKDF2-encrypted data — the vault **cannot**
  decrypt it (different key derivation). Migration is a one-time re-encryption.
 - v2: HD-derived at `m/74'/2'/0'/0'` (PATHS::ENCRYPTION) — current.
 - v3+: `m/74'/2'/0'/1'`, `m/74'/2'/0'/2'`, etc. — future rotation versions.
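 The version-to-path mapping above amounts to `index = version - 2` under the
 `m/74'/2'/0'` branch. A minimal sketch of the expected behavior follows; the
 `DerivationError::InvalidPath(String)` shape is an assumption, and the real
 function in `derivation.rs` is authoritative.
 ```rust
 /// Assumed error shape; the real `DerivationError` may carry other data.
 #[derive(Debug, PartialEq)]
 enum DerivationError {
    InvalidPath(String),
 }
 /// Maps a key version to its HD derivation path.
 /// v0 and v1 have no HD path (v1 is the TypeScript PBKDF2 legacy).
 fn encryption_path_for_version(version: u32) -> Result<String, DerivationError> {
    if version < 2 {
        return Err(DerivationError::InvalidPath(format!(
            "no HD derivation path for key version {}",
            version
        )));
    }
    // v2 is index 0, v3 is index 1, and so on.
    Ok(format!("m/74'/2'/0'/{}'", version - 2))
 }
 ```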
 ### Drift #9: Version-aware encrypt/decrypt
 Current: `encrypt`/`decrypt` always derive at `PATHS::ENCRYPTION` regardless of
 the `key_version` parameter.
 Target:
 - `encrypt(plaintext, key_version)`: derive the encryption key at
  `encryption_path_for_version(key_version)`, stamp the same `key_version` on
  the resulting `EncryptedData`.
 - `decrypt(encrypted)`: derive the key at
  `encryption_path_for_version(encrypted.key_version)` — the blob carries its
  own version, and each version maps to a distinct derivation path.
 This requires:
 1. `encryption_path_for_version(version: u32) -> Result<String, DerivationError>`
   already exists in `derivation.rs` — verify it returns `InvalidPath` for
   `version < 2` (v1 is TS legacy, v0 is meaningless).
 2. `derive_encryption_key_for_version(version: u32) -> Result<DerivedKey, VaultServiceError>`
   — a new method on `VaultServiceHandle` that maps version → path → derive.
   Cached by path (same cache as `derive_encryption_key`).
 3. `encrypt` and `decrypt` use `derive_encryption_key_for_version` instead of
   deriving at the fixed `PATHS::ENCRYPTION` path.
 ### Drift #10: Implement rotate
 Current: no `rotate` method exists.
 Target:
 ```rust
 pub fn rotate(&self, encrypted: &EncryptedData, to_version: u32) -> Result<EncryptedData, VaultServiceError>;
 ```
 Decrypts with the old version's key (from `encrypted.key_version`), re-encrypts
 with the new version's key (`to_version`). Returns the new `EncryptedData` —
 the caller replaces the blob in storage. No new mnemonic needed; the same seed
 produces all version keys via different derivation paths (ADR-021).
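 The decrypt-then-re-encrypt flow can be sketched as follows. `ToyVault`, its
 one-byte `key_for_version`, and the XOR transform are placeholders invented for
 illustration, standing in for HD derivation and AES-256-GCM; the trimmed
 `EncryptedData` and single-variant `VaultServiceError` are likewise
 assumptions. Only the control flow of `rotate` mirrors the target.
 ```rust
 /// Trimmed stand-in for `EncryptedData`: only what the sketch needs.
 #[derive(Debug, Clone, PartialEq)]
 struct EncryptedData {
    key_version: u32,
    ciphertext: Vec<u8>,
 }
 /// Reduced stand-in for the real error enum.
 #[derive(Debug, PartialEq)]
 enum VaultServiceError {
    InvalidPath,
 }
 /// Placeholder vault. NOT real cryptography.
 struct ToyVault;
 impl ToyVault {
    /// Placeholder for per-version HD derivation: one distinct byte per version.
    fn key_for_version(&self, version: u32) -> Result<u8, VaultServiceError> {
        if version < 2 {
            return Err(VaultServiceError::InvalidPath);
        }
        Ok((version as u8) ^ 0x5A)
    }
    /// Encrypts with the key for `key_version` and stamps that version.
    fn encrypt(&self, plaintext: &[u8], key_version: u32) -> Result<EncryptedData, VaultServiceError> {
        let key = self.key_for_version(key_version)?;
        let ciphertext = plaintext.iter().map(|b| b ^ key).collect();
        Ok(EncryptedData { key_version, ciphertext })
    }
    /// Decrypts with the key for the version the blob carries.
    fn decrypt(&self, encrypted: &EncryptedData) -> Result<Vec<u8>, VaultServiceError> {
        let key = self.key_for_version(encrypted.key_version)?;
        Ok(encrypted.ciphertext.iter().map(|b| b ^ key).collect())
    }
    /// Decrypts under the old version, re-encrypts under `to_version`.
    fn rotate(&self, encrypted: &EncryptedData, to_version: u32) -> Result<EncryptedData, VaultServiceError> {
        let plaintext = self.decrypt(encrypted)?;
        self.encrypt(&plaintext, to_version)
    }
 }
 ```
 The input blob is left untouched, so a caller that has not yet replaced it in
 storage can still decrypt it with the old version's key.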
 ### Implementation notes
 - `derive_encryption_key(path)` (the path-based API) remains as-is for deriving
  at arbitrary paths. `derive_encryption_key_for_version(version)` is the
  version-aware API used by `encrypt`/`decrypt`. Both share the same cache
  (keyed by derivation path).
 - `encrypt` and `decrypt` extract the `EncryptionKey` from the `DerivedKey` via
  `EncryptionKey::from_derived_bytes` (see encryption.md).
 - `encryption_path_for_version` returns `InvalidPath` for `version < 2`.
  `derive_encryption_key_for_version` propagates this as
  `VaultServiceError::InvalidPath`.
 ### Scope
 This task touches `encryption.rs` (CURRENT_KEY_VERSION), `service.rs` (encrypt,
 decrypt, rotate, derive_encryption_key_for_version), and possibly `derivation.rs`
 (verify `encryption_path_for_version`). It depends on the irpc removal task
 (drift #4) because both modify `service.rs`.
 ## Acceptance Criteria
 - [ ] `CURRENT_KEY_VERSION` is `2` in `encryption.rs`
 - [ ] `derive_encryption_key_for_version(version)` method added to `VaultServiceHandle`
 - [ ] `derive_encryption_key_for_version` returns `InvalidPath` for `version < 2`
 - [ ] `encrypt(plaintext, key_version)` derives at `encryption_path_for_version(key_version)`
 - [ ] `encrypt` stamps the passed `key_version` on the resulting `EncryptedData`
 - [ ] `decrypt(encrypted)` derives at `encryption_path_for_version(encrypted.key_version)`
 - [ ] `rotate(encrypted, to_version)` method implemented: decrypt old, re-encrypt new
 - [ ] `rotate` returns `EncryptedData` with `key_version = to_version`
 - [ ] Unit test: encrypt at v2, decrypt at v2 — round-trip succeeds
 - [ ] Unit test: encrypt at v2, rotate to v3, decrypt at v3 — round-trip succeeds
 - [ ] Unit test: decrypt v2 blob after rotation — old key still derivable (partial rotation safe)
 - [ ] Unit test: `derive_encryption_key_for_version(1)` returns `InvalidPath`
 - [ ] Unit test: `derive_encryption_key_for_version(0)` returns `InvalidPath`
 - [ ] `cargo test` succeeds
 - [ ] `cargo clippy` succeeds with no warnings
 ## References
 - docs/architecture/crates/vault/README.md — Known Source Drift table items #3, #9, #10
 - docs/architecture/crates/vault/encryption.md — Key Versioning, Rotation, EncryptionKey
 - docs/architecture/crates/vault/service.md — encrypt, decrypt, rotate, derive_encryption_key_for_version
 - docs/architecture/crates/vault/mnemonic-derivation.md — encryption_path_for_version, PATHS
 - docs/architecture/decisions/020-hd-derivation-for-encryption-keys.md — ADR-020
 - docs/architecture/decisions/021-key-rotation-via-version-indexed-paths.md — ADR-021
 ## Notes
 > These three drifts are one feature: version-indexed key rotation (ADR-021).
 > Splitting them would produce tasks that don't compile independently —
 > bumping the version without version-aware encrypt/decrypt would make v2
 > blobs undecryptable, and rotate without version-aware encrypt/decrypt has no
 > keys to work with. Depends on irpc removal because both modify `service.rs`.
 ## Summary
 > To be filled on completion
--- a/tasks/vault/osrng-iv-generation.md
+++ b/tasks/vault/osrng-iv-generation.md
@@ -0,0 +1,83 @@
 ---
 id: vault/osrng-iv-generation
 name: Replace rand::random() IV generation with OsRng in AES-GCM encryption
 status: pending
 depends_on: []
 scope: single
 risk: medium
 impact: isolated
 level: implementation
 ---
 ## Description
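 Fix drift item #1: the AES-256-GCM IV (nonce) generation in `encryption.rs`
 currently uses `rand::random()`, which draws from the thread-local generator
 (`ThreadRng`), a userspace generator that is only periodically reseeded from
 the OS. The spec requires IVs to come directly from the OS entropy source.
 Replace with `OsRng` (or equivalent CSPRNG).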
 This is a security-critical fix. IV reuse under the same AES-GCM key is
 catastrophic — it breaks authenticity and creates a two-time-pad on the
 plaintext. `OsRng` reads from the operating system's entropy source and is the
 correct choice for cryptographic nonces.
 ### Current state
 `encryption.rs` line ~133: IV generation uses `rand::random()` to produce the
 12-byte GCM nonce.
 ### Target state
 Use `rand::rngs::OsRng` (from the `rand` crate, which is already a dependency)
 to generate the 12-byte IV. The `aes-gcm` crate's `Aes256Gcm` encrypt path takes
 a `Nonce` — construct it from `OsRng`-generated bytes.
 ```rust
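 use aes_gcm::Nonce;
 use rand::rngs::OsRng;
 use rand::RngCore;
 // Fill a fresh 12-byte GCM nonce from the OS entropy source on every call.
 let mut iv_bytes = [0u8; 12];
 OsRng.fill_bytes(&mut iv_bytes);
 // Borrow the bytes as the nonce type the cipher's encrypt call expects.
 let nonce = Nonce::from_slice(&iv_bytes);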
 ```
 The IV is generated fresh for each `encrypt()` call. The salt (32 bytes, unused
 in v2 for key derivation but kept for wire-format compat) should also use `OsRng`
 for consistency — it's stored in the `EncryptedData` blob and doesn't need to be
 deterministic.
 ### Scope
 This task touches only `encryption.rs`. It does not depend on the irpc removal
 (drift #4) because `encryption.rs` is a separate file from `service.rs` /
 `protocol.rs`. It can run in parallel with drift #4.
 ## Acceptance Criteria
 - [ ] `encryption::encrypt()` uses `OsRng` for IV generation, not `rand::random()`
 - [ ] Salt generation uses `OsRng` (or equivalent CSPRNG)
 - [ ] No `rand::random()` calls remain in `encryption.rs`
 - [ ] IV is 12 bytes (standard GCM nonce size)
 - [ ] Salt is 32 bytes (wire-format compat, unused in key derivation)
 - [ ] Unit test: verify IV is fresh on each encrypt call (encrypt twice, different IVs)
 - [ ] Unit test: verify decrypt round-trip still works after the change
 - [ ] `cargo test` succeeds
 - [ ] `cargo clippy` succeeds with no warnings
 ## References
 - docs/architecture/crates/vault/README.md — Known Source Drift table item #1
 - docs/architecture/crates/vault/encryption.md — Security Constraints: OsRng for IVs
 - docs/architecture/crates/vault/service.md — Security Constraints: OsRng for IVs
 - docs/architecture/decisions/020-hd-derivation-for-encryption-keys.md — ADR-020
 ## Notes
 > This is a security-critical fix. IV reuse under the same AES-GCM key breaks
 > authenticity and creates a two-time-pad on the plaintext. `rand::random()`
 > draws from the thread-local userspace generator (`ThreadRng`); `OsRng`
 > reads directly from the operating system's entropy source, as the spec
 > requires. This task touches only `encryption.rs` and can run in parallel
 > with the irpc removal task (drift #4).
 ## Summary
 > To be filled on completion
--- a/tasks/vault/poisoned-lock-recovery.md
+++ b/tasks/vault/poisoned-lock-recovery.md
@@ -0,0 +1,86 @@
 ---
 id: vault/poisoned-lock-recovery
 name: Replace unwrap() on RwLock acquisition with poisoned-lock recovery via unwrap_or_else
 status: pending
 depends_on: [vault/irpc-removal]
 scope: narrow
 risk: low
 impact: component
 level: implementation
 ---
 ## Description
 Fix drift item #2: `VaultServiceHandle` methods use `unwrap()` on every
 `RwLock` acquisition (read and write locks). A poisoned lock (caused by a panic
 while the lock was held) would brick the vault for all subsequent operations.
 Replace with `unwrap_or_else(|e| e.into_inner())` to recover the inner data from
 a poisoned lock, or explicit error propagation where appropriate.
 ### Current state
 `service.rs` uses `.unwrap()` on `RwLock` read and write acquisitions at
 approximately lines 142, 161, 182, 191, 196, 227, 264, 307, 340, 367 (line
 numbers may shift after the irpc removal task — match by pattern: every
 `.read().unwrap()` and `.write().unwrap()` call in `VaultServiceHandle` method
 bodies).
 ### Target state
 For read locks:
 ```rust
 let inner = self.inner.read().unwrap_or_else(|e| e.into_inner());
 ```
 For write locks:
 ```rust
 let mut inner = self.inner.write().unwrap_or_else(|e| e.into_inner());
 ```
 The rationale: a poisoned lock means a panic occurred while the lock was held.
 The data may be in an inconsistent state, but bricking the vault (panicking on
 every subsequent call) is worse than attempting to continue. The vault's
 operations are idempotent reads (derive) and state transitions (lock/unlock) —
 recovering the inner data and continuing is the pragmatic choice. If the data
 is truly corrupted, the next operation will fail with a normal error, not a
 panic.
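 The poisoned-lock unit test could follow this shape: a helper thread panics
 while holding the write guard, and the recovering read still returns the data.
 A minimal sketch using a bare `RwLock<u32>` in place of `VaultServiceInner`;
 `poison` and `read_recovering` are hypothetical helper names.
 ```rust
 use std::sync::{Arc, RwLock};
 use std::thread;
 /// Poisons `lock` by panicking in another thread while the write guard is held.
 fn poison(lock: &Arc<RwLock<u32>>) {
    let clone = Arc::clone(lock);
    // `join` returns Err because the thread panicked; that is expected here.
    let _ = thread::spawn(move || {
        let _guard = clone.write().unwrap();
        panic!("simulated panic while holding the vault lock");
    })
    .join();
 }
 /// Reads through a possibly poisoned lock by taking the guard out of the error.
 fn read_recovering(lock: &RwLock<u32>) -> u32 {
    *lock.read().unwrap_or_else(|e| e.into_inner())
 }
 ```
 The same pattern applies to the write path; in the vault test the panic would
 be triggered inside a `VaultServiceHandle` call rather than on a bare lock.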
 ### No unwrap() or expect() outside tests
 This is a general constraint for the vault: no `unwrap()` or `expect()` outside
 test code. After fixing the RwLock acquisitions, audit the rest of `service.rs`
 for any remaining `unwrap()`/`expect()` calls and replace them with proper error
 propagation (`?` operator, explicit `Result` returns, or
 `unwrap_or_else(|e| e.into_inner())` for lock recovery).
 ### Scope
 This task touches `service.rs` only. It depends on the irpc removal task (drift
 #4) because that task restructures `service.rs` — doing this first would cause
 merge conflicts.
 ## Acceptance Criteria
 - [ ] All `.read().unwrap()` calls in `VaultServiceHandle` methods replaced with `.read().unwrap_or_else(|e| e.into_inner())`
 - [ ] All `.write().unwrap()` calls in `VaultServiceHandle` methods replaced with `.write().unwrap_or_else(|e| e.into_inner())`
 - [ ] No `unwrap()` or `expect()` calls remain in `service.rs` outside of test code
 - [ ] Unit test: vault remains usable after a simulated panic (poison the lock, verify next call recovers)
 - [ ] `cargo test` succeeds
 - [ ] `cargo clippy` succeeds with no warnings
 ## References
 - docs/architecture/crates/vault/README.md — Known Source Drift table item #2
 - docs/architecture/crates/vault/service.md — Security Constraints: No unwrap() outside tests
 - docs/architecture/decisions/025-vault-local-only-dispatch.md — ADR-025
 ## Notes
 > A panic in one vault operation must not brick the vault for all other
 > operations. The poisoned-lock recovery via `unwrap_or_else(|e| e.into_inner())`
 > is the standard Rust pattern for this. This task depends on the irpc removal
 > task because both modify `service.rs` heavily.
 ## Summary
 > To be filled on completion
--- a/tasks/vault/remove-password-derivation.md
+++ b/tasks/vault/remove-password-derivation.md
@@ -0,0 +1,69 @@
 ---
 id: vault/remove-password-derivation
 name: Remove derive_password and site_password_path methods (password-manager pattern not relevant)
 status: pending
 depends_on: [vault/irpc-removal]
 scope: single
 risk: trivial
 impact: isolated
 level: implementation
 ---
 ## Description
 Fix drift item #7: the vault currently has `derive_password`,
 `derive_password_string`, and `site_password_path` methods. These implement a
 password-manager pattern (deriving site-specific passwords from the seed) that
 is not relevant to an RPC system's vault. Remove them entirely per ADR-025
 (resolves review #002 C9).
 ### What to remove
 - `derive_password` method from `VaultServiceHandle` (in `service.rs`)
 - `derive_password_string` method from `VaultServiceHandle` (in `service.rs`)
 - `site_password_path` function (in `derivation.rs` or `mnemonic.rs`,
  wherever it's defined)
 - Any associated path constants for password derivation
 - Any tests for these methods
 - Any references in `lib.rs` re-exports
 ### Why
 The vault's purpose in alknet is to derive cryptographic keys (Ed25519 for
 identity, AES-256-GCM for encryption) and encrypt/decrypt external credentials.
 Site-specific password derivation is a password-manager feature that doesn't
 belong in a networking toolkit's vault. Keeping it expands the attack surface
 and API surface for no benefit.
 ### Scope
 This task touches `service.rs` and possibly `derivation.rs` /
 `mnemonic.rs`. It depends on the irpc removal task (drift #4) because
 both modify `service.rs`.
 ## Acceptance Criteria
 - [ ] `derive_password` method removed from `VaultServiceHandle`
 - [ ] `derive_password_string` method removed from `VaultServiceHandle`
 - [ ] `site_password_path` function removed
 - [ ] Any password-derivation path constants removed
 - [ ] Tests for password derivation removed
 - [ ] No references to password derivation remain in `lib.rs` re-exports
 - [ ] `cargo check` succeeds (no dangling references)
 - [ ] `cargo test` succeeds
 - [ ] `cargo clippy` succeeds with no warnings
 ## References
 - docs/architecture/crates/vault/README.md — Known Source Drift table item #7
 - docs/architecture/decisions/025-vault-local-only-dispatch.md — ADR-025 (resolves C9)
 ## Notes
 > Straightforward removal. The password-manager pattern was inherited from the
 > POC and is not relevant to alknet's vault use case. Depends on irpc removal
 > because both modify `service.rs`.
 ## Summary
 > To be filled on completion
--- a/tasks/vault/review-vault-sync.md
+++ b/tasks/vault/review-vault-sync.md
@@ -0,0 +1,112 @@
 ---
 id: vault/review-vault-sync
 name: Review vault implementation against specs after all drift fixes
 status: pending
 depends_on: [vault/irpc-removal, vault/osrng-iv-generation, vault/poisoned-lock-recovery, vault/remove-password-derivation, vault/unlock-new-zeroizing-return, vault/key-versioning-rotation, vault/derivedkey-serialization, vault/cache-zeroization-test]
 scope: moderate
 risk: low
 impact: phase
 level: review
 ---
 ## Description
 Review the vault crate implementation against the architecture specs after all
 drift fixes are complete. This is the quality checkpoint before the spec-sync
 task — verify that the implementation matches the specs and that no drift
 items were missed or incompletely fixed.
 ### Review Checklist
 1. **Drift table verification** — every item in the vault README's Known Source
   Drift table is resolved:
   - #1: OsRng for IVs (encryption.rs)
   - #2: No unwrap() on RwLock (service.rs)
   - #3: CURRENT_KEY_VERSION = 2 (encryption.rs)
   - #4: irpc removed, direct method calls (protocol.rs, service.rs, Cargo.toml)
   - #5: DerivedKey always-redact serialization (protocol.rs)
   - #6: Cache zeroization tested (cache.rs)
   - #7: derive_password removed (service.rs, derivation)
   - #8: unlock_new returns Zeroizing<String> (service.rs)
   - #9: encrypt/decrypt version-aware (service.rs)
   - #10: rotate implemented (service.rs)
 2. **Spec conformance** — implementation matches the spec docs:
   - `VaultServiceHandle` API matches service.md (all methods, signatures, semantics)
   - `DerivedKey` / `KeyType` match protocol.md (serialization, redaction, move-only)
   - `EncryptedData` / `EncryptionKey` match encryption.md (fields, key versioning)
   - `Mnemonic` / `Seed` / `ExtendedPrivKey` match mnemonic-derivation.md
   - `KeyCache` / `CachedKey` / `CacheConfig` match service.md Cache section
   - PATHS constants match mnemonic-derivation.md (IDENTITY, DEVICE_PREFIX, SSH_HOST, ENCRYPTION, ETHEREUM)
   - `encryption_path_for_version` matches its spec (returns `InvalidPath` for version < 2)
 3. **Security constraints** (from service.md, encryption.md, README.md):
   - OsRng for IVs and salt (no `rand::random()`)
   - Zeroized drop on Seed, Mnemonic, ExtendedPrivKey, EncryptionKey, CachedKey, DerivedKey
   - No `unwrap()` or `expect()` outside tests
   - DerivedKey is move-only (no Clone)
   - DerivedKey Debug impl redacts private key
   - Cache eviction zeroizes (tested)
   - No tokio dependency (local-only, std::sync::RwLock)
 4. **Public API** — `lib.rs` re-exports match the vault README's Public API section:
   - `Mnemonic`, `Seed`, `Language` from mnemonic
   - `DerivationError`, `ExtendedPrivKey`, `PATHS` from derivation
   - `EncryptedData`, `EncryptionError`, `EncryptionKey`, `CURRENT_KEY_VERSION` from encryption
   - `DerivedKey`, `KeyType` from protocol
   - `VaultServiceError`, `VaultServiceHandle` from service
   - `CacheConfig` from cache
   - No `VaultMessage`, `VaultProtocol`, `VaultServiceActor` (removed)
 5. **Test coverage**:
   - Derivation test vectors (BIP39 "abandon...about" vector)
   - Encryption round-trip tests
   - Service lifecycle tests (unlock, lock, derive, encrypt, decrypt, rotate)
   - Cache tests (LRU, TTL, clear, zeroization)
   - Serialization redaction tests (JSON redact, reject redacted deserialize)
 6. **Code quality**:
   - `cargo fmt --check` passes
   - `cargo clippy` passes with no warnings
   - No dead code (removed irpc/actor/password paths fully gone)
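 For the move-only and Debug-redaction items in the security constraints, the
 reviewer is looking for roughly the following shape. This is a simplified,
 hypothetical sketch: the field names, the constructor, and the `[REDACTED]`
 placeholder are assumptions, and protocol.md remains the authority on the real
 `DerivedKey`.
 ```rust
 use std::fmt;
 // Hypothetical, simplified shape; the real DerivedKey is defined by protocol.md.
 // Deliberately no #[derive(Clone)] (move-only) and no #[derive(Debug)] (would leak the key).
 pub struct DerivedKey {
     pub path: String,
     private_key: Vec<u8>,
 }
 impl DerivedKey {
     pub fn new(path: &str, private_key: Vec<u8>) -> Self {
         Self { path: path.to_string(), private_key }
     }
     // Borrowing accessor: callers read the key without copying it.
     pub fn private_key(&self) -> &[u8] {
         &self.private_key
     }
 }
 // Manual Debug impl that prints a placeholder instead of the key bytes.
 impl fmt::Debug for DerivedKey {
     fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
         f.debug_struct("DerivedKey")
             .field("path", &self.path)
             .field("private_key", &"[REDACTED]")
             .finish()
     }
 }
 ```
 A redaction test can format the value with `{:?}` and assert that the output
 contains the placeholder and none of the key bytes.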
 ## Acceptance Criteria
 - [ ] All 10 drift items verified resolved
 - [ ] VaultServiceHandle API matches service.md
 - [ ] DerivedKey / KeyType match protocol.md
 - [ ] EncryptedData / EncryptionKey match encryption.md
 - [ ] Mnemonic / Seed / ExtendedPrivKey match mnemonic-derivation.md
 - [ ] KeyCache / CachedKey / CacheConfig match service.md
 - [ ] PATHS constants match mnemonic-derivation.md
 - [ ] All security constraints satisfied (OsRng, zeroize, no unwrap, move-only, redaction)
 - [ ] Public API (lib.rs re-exports) matches vault README
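 - [ ] Test coverage includes every category in Review Checklist item 5 (derivation vectors, encryption round-trip, service lifecycle, cache, serialization redaction)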
 - [ ] `cargo fmt --check` passes
 - [ ] `cargo clippy` passes with no warnings
 - [ ] All tests pass
 - [ ] No dead code from removed features (irpc, actor, password derivation)
 ## References
 - docs/architecture/crates/vault/README.md — drift table, public API, security constraints
 - docs/architecture/crates/vault/service.md
 - docs/architecture/crates/vault/encryption.md
 - docs/architecture/crates/vault/protocol.md
 - docs/architecture/crates/vault/mnemonic-derivation.md
 - docs/architecture/decisions/018-vault-standalone-crate.md
 - docs/architecture/decisions/020-hd-derivation-for-encryption-keys.md
 - docs/architecture/decisions/021-key-rotation-via-version-indexed-paths.md
 - docs/architecture/decisions/025-vault-local-only-dispatch.md
 - docs/architecture/decisions/026-vault-key-model-hd-derivation.md
 ## Notes
 > This review verifies the vault is spec-conformant after all drift fixes. If
 > deviations are found, document them and create fix tasks before proceeding
 > to the spec-sync task. This is the last checkpoint before the vault docs are
 > updated to remove the drift table and bump status.
 ## Summary
 > To be filled on completion
--- a/tasks/vault/spec-sync-remove-drift.md
+++ b/tasks/vault/spec-sync-remove-drift.md
@@ -0,0 +1,107 @@
 ---
 id: vault/spec-sync-remove-drift
 name: Update vault specs to remove drift table and security-constraint drift prose, bump doc status
 status: pending
 depends_on: [vault/review-vault-sync]
 scope: narrow
 risk: low
 impact: component
 level: implementation
 ---
 ## Description
 After the vault review confirms all drift is resolved, update the vault
 architecture docs to remove the drift tracking artifacts and reflect the
 completed state. The drift table and the "known drift" prose in the security
 constraints sections were tracking tools during the spec-to-implementation
 sync — now that the sync is complete, they should be cleaned up.
 ### What to update
 1. **vault/README.md**:
   - Remove the "Known Source Drift" section (the entire table and its intro
     paragraph). The drift is resolved; the table is no longer needed.
   - In "Security Constraints", remove only the drift sentences of the form
     "The current source uses `rand::random()` — this is a known drift".
     Keep the constraint statements themselves (OsRng for IVs, zeroized drop,
     no unwrap, etc.); those are permanent implementation requirements.
   - Bump `status: draft` → `status: stable` in the frontmatter (per the
     Document Lifecycle in the architecture README: stable = implementation
     complete and verified).
 2. **vault/encryption.md**:
   - In Security Constraints, remove the "The current source uses
     `rand::random()` for IV generation (`encryption.rs` line 133) — this is a
     known drift from the spec and must be corrected during implementation
     sync." sentence. Keep the "OsRng for IVs" constraint.
   - In Key Versioning, remove the "The current source uses
     `CURRENT_KEY_VERSION = 1` with HD derivation and does not implement
     version-indexed paths or `rotate`. These are drift items to be corrected
     during implementation sync." paragraph.
   - Bump `status: draft` → `status: stable`.
 3. **vault/service.md**:
   - In Security Constraints, remove the drift prose about `rand::random()`,
     `unwrap()` on RwLock, and `KeyCache::clear()` verification. Keep the
     constraint statements.
   - Bump `status: draft` → `status: stable`.
 4. **vault/protocol.md**:
   - Remove the "to be updated per ADR-025 — remove `VaultProtocol` enum and
     irpc usage" note in References.
   - Remove the "postcard tests to be removed" note in References.
   - Bump `status: draft` → `status: stable`.
 5. **vault/mnemonic-derivation.md**:
   - Bump `status: draft` → `status: stable` (no drift prose to remove here,
     but the doc should reflect stable status).
 6. **architecture/README.md**:
   - Update the vault crate doc status entries in the Architecture Documents
     table from `draft` to `stable`.
   - Update the Current State paragraph to reflect vault implementation is
     complete (remove "pending ADR-025/026 refactor" language).
 ### What NOT to change
 - Do not remove the Security Constraints sections themselves — they are
  permanent implementation requirements, not drift tracking.
 - Do not change the ADRs — they record decisions, not implementation status.
 - Do not remove the Public API section — it's a living reference.
 ### Scope
 This task touches only documentation files — no source code changes. It
 depends on the review task (which depends on all drift fixes).
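 One way to check that no drift-tracking language remains is a small scan over
 each doc's text. The sketch below is illustrative only: `drift_lines` is a
 hypothetical helper and the marker phrases are assumptions drawn from the
 wording quoted in this task, so adjust them to what the docs actually contain.
 ```rust
 // Assumed marker phrases; adjust to the wording actually used in the vault docs.
 const DRIFT_MARKERS: [&str; 3] = ["known drift", "Known Source Drift", "status: draft"];
 // Returns the 1-based line numbers of lines that still contain a drift marker.
 pub fn drift_lines(doc: &str) -> Vec<usize> {
     doc.lines()
         .enumerate()
         .filter(|(_, line)| DRIFT_MARKERS.iter().any(|m| line.contains(*m)))
         .map(|(i, _)| i + 1)
         .collect()
 }
 ```
 An empty result for every vault doc is consistent with the acceptance
 criteria; a non-empty result points at the lines still to clean up.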
 ## Acceptance Criteria
 - [ ] "Known Source Drift" table removed from vault/README.md
 - [ ] Drift prose removed from Security Constraints sections (constraint statements kept)
 - [ ] All vault doc frontmatter bumped from `status: draft` to `status: stable`
 - [ ] architecture/README.md vault doc statuses updated to `stable`
 - [ ] architecture/README.md Current State updated (no "pending refactor" language)
 - [ ] No drift-tracking language remains anywhere in vault docs
 - [ ] Security constraint statements (OsRng, zeroize, no unwrap, etc.) preserved
 - [ ] Public API section preserved in vault/README.md
 ## References
 - docs/architecture/crates/vault/README.md — Known Source Drift, Security Constraints, Public API
 - docs/architecture/crates/vault/encryption.md — Security Constraints, Key Versioning
 - docs/architecture/crates/vault/service.md — Security Constraints
 - docs/architecture/crates/vault/protocol.md — References
 - docs/architecture/README.md — Document Lifecycle, Architecture Documents table, Current State
 ## Notes
 > This is the doc cleanup that closes out the vault phase. The drift table and
 > "known drift" prose were tracking tools during spec-to-implementation sync;
 > now that the sync is complete, they're noise. Keep the permanent constraint
 > statements — they guide future implementation agents who touch the vault.
 ## Summary
 > To be filled on completion
--- a/tasks/vault/unlock-new-zeroizing-return.md
+++ b/tasks/vault/unlock-new-zeroizing-return.md
@@ -0,0 +1,79 @@
 ---
 id: vault/unlock-new-zeroizing-return
 name: Change unlock_new return type from String to Zeroizing<String>
 status: pending
 depends_on: [vault/irpc-removal]
 scope: single
 risk: low
 impact: isolated
 level: implementation
 ---
 ## Description
 Fix drift item #8: `unlock_new` currently returns `String`, which is not
 zeroized on drop. The mnemonic phrase is the root of trust — it must not linger
 in freed heap memory. Change the return type to `Zeroizing<String>` (from the
 `zeroize` crate, already a dependency).
 ### Current state
 ```rust
 pub fn unlock_new(&self, word_count: usize) -> Result<String, VaultServiceError>;
 ```
 ### Target state
 ```rust
 pub fn unlock_new(&self, word_count: usize) -> Result<Zeroizing<String>, VaultServiceError>;
 ```
 Per `docs/architecture/crates/vault/service.md` → unlock_new:
 > The returned phrase is the root of trust — it is heap-allocated and zeroized
 > on drop, so it does not linger in freed memory. The caller should extract the
 > phrase for secure storage (write down, display to user) and let the
 > `Zeroizing<String>` drop when done. Do not clone the returned value or store
 > it in a non-zeroizing container.
 ### Caller adaptation
 The assembly layer (CLI binary, not yet implemented) will call `unlock_new` and
 extract the phrase. The `Zeroizing<String>` wrapper derefs to `String`, so
 `&*result` or `result.as_str()` works for reading. The caller must not clone the
 inner `String` into a non-zeroizing container.
 Existing tests that call `unlock_new` need updating to handle the new return
 type — use `&*phrase` or `phrase.as_str()` to read the string.
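 The read-through-deref pattern looks roughly like this on the caller side. To
 keep the sketch free of external crates, `Wiped` is a hypothetical std-only
 stand-in for `zeroize::Zeroizing<String>`, and `unlock_new_stub` stands in for
 the real `unlock_new`; the actual task should use the `zeroize` type, which
 wipes with volatile writes rather than the best-effort loop shown here.
 ```rust
 use std::ops::Deref;
 // Hypothetical std-only stand-in for zeroize::Zeroizing<String>.
 pub struct Wiped(String);
 impl Deref for Wiped {
     type Target = String;
     fn deref(&self) -> &String {
         &self.0
     }
 }
 impl Drop for Wiped {
     fn drop(&mut self) {
         // Best-effort wipe of the heap buffer; an optimizer may elide plain writes,
         // which is why the real crate should be used in the vault.
         let mut bytes = std::mem::take(&mut self.0).into_bytes();
         bytes.iter_mut().for_each(|b| *b = 0);
     }
 }
 // Hypothetical stand-in for unlock_new; returns the public BIP39 test phrase.
 pub fn unlock_new_stub() -> Wiped {
     Wiped("abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about".to_string())
 }
 // Caller-side read: borrow through Deref, never clone the inner String.
 pub fn word_count(phrase: &Wiped) -> usize {
     phrase.as_str().split_whitespace().count()
 }
 ```
 Tests can follow the same shape: borrow with `phrase.as_str()` or `&*phrase`
 for assertions, then let the wrapper drop at the end of scope.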
 ### Scope
 This task touches `service.rs` (the method signature and body) and test files.
 It depends on the irpc removal task (drift #4) because both modify `service.rs`.
 ## Acceptance Criteria
 - [ ] `unlock_new` return type changed from `Result<String, ...>` to `Result<Zeroizing<String>, ...>`
 - [ ] Method body constructs `Zeroizing<String>` from the generated phrase
 - [ ] Existing tests updated to handle `Zeroizing<String>` return type
 - [ ] No `clone()` of the returned value in non-test code
 - [ ] `cargo check` succeeds
 - [ ] `cargo test` succeeds
 - [ ] `cargo clippy` succeeds with no warnings
 ## References
 - docs/architecture/crates/vault/README.md — Known Source Drift table item #8
 - docs/architecture/crates/vault/service.md — unlock_new section
 - docs/architecture/decisions/025-vault-local-only-dispatch.md — ADR-025 (resolves W7)
 ## Notes
 > The mnemonic is the root of trust. Returning a plain `String` means the phrase
 > lingers in freed heap memory after the caller drops it. `Zeroizing<String>`
 > zeroizes the bytes on drop. This resolves review #002 W7. Depends on irpc
 > removal because both modify `service.rs`.
 ## Summary
 > To be filled on completion