docs: complete Phase 0 architecture — spec updates, review fixes, and link portability

Update four existing specs (overview, server, napi-and-pubsub, call-protocol) to reflect Phase 0 decisions: three-layer model, IdentityProvider, ForwardingPolicy, OperationEnv, static/dynamic config split. Review all 9 Phase 0a ADRs (026-034) for consistency. Fix 4 critical issues from architecture review: missing OQ-SVC-05 in open-questions.md, deprecated hub terminology, undefined AuthService and noq terms. Replace inline OQ text with cross-references per format rules. Add ConfigServiceImpl definition to configuration.md. Port absolute workspace paths to project-relative links by copying referenced docs (feasibility, certbot, fail2ban, event_source_types) into docs/research/.
2026-06-07 11:27:52 +00:00
parent 835724d087
commit d3633b7839
22 changed files with 1508 additions and 115 deletions
--- a/docs/architecture/README.md
+++ b/docs/architecture/README.md
@@ -7,25 +7,26 @@ last_updated: 2026-06-07

 ## Current State

-Architecture specification in active development. Phase 0 foundation ADRs
-completed (026–034). New spec documents created for identity, services,
-interface, configuration, storage, flowgraph, and secret service. Existing
-specs updated for the three-layer model, crate decomposition, and unified
-identity. See [open-questions.md](open-questions.md) for remaining open
-questions.
+Architecture specification in active development. Phase 0 foundation complete:
+all listed ADRs (001–034) accepted, new spec documents created for all components, existing
+specs updated for the three-layer model, crate decomposition, unified identity,
+OperationEnv, and forwarding policy. Remaining open questions: OQ-15 (QUIC
+coexistence), OQ-19 (WebTransport TLS), OQ-20 (worker registration), OQ-IF-01
+(Interface session/EventEnvelope), OQ-IF-02 (ForwardingPolicy placement). See
+[open-questions.md](open-questions.md).

 ## Architecture Documents

 | Document | Status | Description |
 |----------|--------|-------------|
-| [overview.md](overview.md) | reviewed | Package purpose, exports, dependencies |
+| [overview.md](overview.md) | reviewed | Package purpose, crate structure, three-layer model, exports, dependencies |
 | [transport.md](transport.md) | reviewed | Transport abstraction: TCP, TLS, iroh |
 | [auth.md](auth.md) | draft | Unified auth: SSH + token, IdentityProvider trait |
-| [call-protocol.md](call-protocol.md) | draft | Bidirectional call/event protocol, operation registry |
+| [call-protocol.md](call-protocol.md) | draft | Bidirectional call/event protocol, OperationEnv, three dispatch paths |
 | [client.md](client.md) | reviewed | Client connection, SOCKS5, port forwarding |
-| [server.md](server.md) | reviewed | Server acceptance, channel handling, proxy |
+| [server.md](server.md) | reviewed | Server acceptance, IdentityProvider, ForwardingPolicy, channel handling |
 | [tun-shim.md](tun-shim.md) | deprecated | TUN interface wrapper — **deferred**, use tun2proxy |
-| [napi-and-pubsub.md](napi-and-pubsub.md) | reviewed | NAPI wrapper and pubsub event target adapter |
+| [napi-and-pubsub.md](napi-and-pubsub.md) | reviewed | NAPI wrapper, reload API, pubsub event target adapter |
 | [identity.md](identity.md) | draft | Identity type, IdentityProvider trait, auth flows |
 | [services.md](services.md) | draft | irpc service layer, OperationEnv, three dispatch paths |
 | [interface.md](interface.md) | draft | Layer 2: Interface trait, SshInterface, RawFramingInterface |
@@ -44,6 +45,9 @@ questions.
 | [storage.md](../research/storage.md) | draft | Metagraph, identity, ACL, secrets, honker |
 | [flow.md](../research/flow.md) | draft | FlowGraph, operation graph, call graph, petgraph mapping |
 | [integration-plan.md](../research/integration-plan.md) | draft | Phased integration plan for services, pubsub, and operations |
+| [feasibility/](../research/feasibility/) | — | SSH tunnel feasibility assessment and related analyses |
+| [event-sourcing/](../research/event-sourcing/) | — | Event sourcing patterns and event-driven architecture reference |
+| [ops/](../research/ops/) | — | Production ops reference: certbot, fail2ban |

 ## ADR Table

@@ -81,6 +85,9 @@ questions.
 | [033](decisions/033-operationenv-irpc-call-protocol.md) | OperationEnv as universal composition mechanism | Accepted |
 | [034](decisions/034-head-worker-terminology.md) | Head/worker terminology replacing hub/spoke | Accepted |

+> ADR numbers 020–022 were allocated to proposals that were withdrawn before
+> acceptance and are not listed.
+
 ## Open Questions

 See [open-questions.md](open-questions.md) for all open and resolved questions.
--- a/docs/architecture/call-protocol.md
+++ b/docs/architecture/call-protocol.md
@@ -13,6 +13,11 @@ subscriptions, and unidirectional events — all using the same wire format. The
 protocol is defined as a spec + handler + registry; downstream consumers (NAPI,
 Python, head/worker) register their own operations without modifying core.

+OperationEnv extends the call protocol with a universal composition mechanism
+that unifies local dispatch, irpc service dispatch, and remote dispatch. A
+handler receives `context.env.invoke(namespace, op, input)` and doesn't know
+whether the operation runs locally, in-cluster, or on a remote node.
+
 ## Why

 The current control channel (ADR-018) is unidirectional (client → server) and
@@ -21,6 +26,10 @@ The call protocol generalizes it to support bidirectional calls (ADR-024) and
 downstream service registration (ADR-025), enabling the head/worker model where
 workers expose operations the head invokes.

+Without OperationEnv, handlers calling other operations would need to know
+whether the target is local, in-cluster, or on a remote node. OperationEnv
+abstracts this away — one handler-facing API, three dispatch backends (ADR-033).
+
 ## Architecture

 ### Operation Paths
@@ -316,6 +325,101 @@ that carries `EventEnvelope` frames:
 The framing is always: 4-byte BE length prefix + JSON. The envelope shape is
 the same regardless of transport.

+### OperationEnv — Universal Composition Mechanism
+
+OperationEnv provides the handler-facing API for composing operations. A handler
+receives `context.env.invoke(namespace, operation, input)` and gets back a
+`ResponseEnvelope` — regardless of which dispatch path the operation takes
+(ADR-033).
+
+Three dispatch paths, one API:
+
+| Path | Mechanism | Serialization | Scope |
+|------|-----------|---------------|-------|
+| **Local** | Direct function call through registry | None (in-process) | Same process |
+| **Service** | irpc protocol enum dispatch | postcard (binary) | Same cluster |
+| **Remote** | Call protocol `EventEnvelope` | JSON | Cross-node |
+
+All three produce the same `ResponseEnvelope`. Service assembly determines
+which path each operation uses:
+
+```rust
+// Minimal deployment (Phase 1: single node, all local)
+let env = OperationEnv::local(local_registry);
+
+// Production deployment (Phase 2+: mix of local and remote)
+let env = OperationEnv::new()
+    .local("auth", auth_registry)
+    .local("config", config_registry)
+    .service("secrets", secret_irpc_client)
+    .remote("worker-1", call_protocol_conn);
+```
+
+**Phase boundary**: Phase 1 ships with local dispatch only (direct function
+calls through the operation registry). The irpc service dispatch and remote
+dispatch paths are contracted here but not built yet. irpc service protocols
+(`AuthProtocol`, `SecretProtocol`, etc.) are defined in the specs but the
+implementations are Phase 2+ work.
+
+**irpc is one dispatch backend for OperationEnv, not a replacement for the
+call protocol or for OperationEnv.** A call protocol handler can call an irpc
+service internally (e.g., `/head/auth/verify` calls
+`AuthProtocol::VerifyPubkey`) — the layers compose. irpc is behind a feature
+flag in alknet-core. See [services.md](services.md) for full OperationEnv and
+irpc service details.
+
+### OperationContext
+
+Every handler receives an `OperationContext`:
+
+```rust
+pub struct OperationContext {
+    pub request_id: String,
+    pub parent_request_id: Option<String>,
+    pub identity: Option<Identity>,
+    pub metadata: HashMap<String, Value>,
+    pub env: OperationEnv,
+    pub trusted: bool,  // set by buildEnv(), not by callers
+}
+```
+
+- **`identity`**: The authenticated identity making the call. Populated by
+  `IdentityProvider` from the interface layer ([identity.md](identity.md)).
+- **`env`**: The operation environment — namespaced access to other operations.
+- **`trusted`**: When a handler calls another operation through `env`, the
+  nested call is marked `trusted` and skips ACL checks. This avoids checking
+  twice: if `/head/agent/chat` is allowed and it internally calls
+  `/head/auth/verify`, the nested call is not ACL-checked again.
+
+Handler signature:
+
+```rust
+fn handle(input: Value, context: OperationContext) -> ResponseEnvelope;
+```
+
+### ResponseEnvelope
+
+The universal return type from all three dispatch paths:
+
+```rust
+pub struct ResponseEnvelope {
+    pub request_id: String,
+    pub result: Result<Value, CallError>,
+}
+
+pub struct CallError {
+    pub code: String,
+    pub message: String,
+    pub retryable: bool,
+}
+```
+
+Local dispatch produces `ResponseEnvelope` with no serialization. irpc service
+dispatch produces postcard-encoded results that are decoded into
+`ResponseEnvelope`. Remote dispatch receives `call.responded` EventEnvelope
+frames and maps them to `ResponseEnvelope`. The handler always gets the same
+type back.
+
 ### Relationship to @alkdev/pubsub and @alkdev/operations

 The call protocol in core is a Rust reimplementation of the same protocol
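Reviewer sketch for the hunk above (not part of the commit): a minimal, std-only illustration of the local dispatch path, assuming a namespace-to-operation map stands in for the operation registry. The `register` helper, the `request_id` argument on `invoke`, the `"not_found"` error code, and the use of `String` in place of a JSON `Value` are all assumptions made for illustration; the spec only fixes the `invoke(namespace, operation, input)` shape and the `ResponseEnvelope` / `CallError` fields.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Stand-in for the spec's JSON `Value`; a plain String keeps the sketch std-only.
type Value = String;

#[derive(Debug, Clone, PartialEq)]
pub struct CallError {
    pub code: String,
    pub message: String,
    pub retryable: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ResponseEnvelope {
    pub request_id: String,
    pub result: Result<Value, CallError>,
}

// Hypothetical handler shape: the spec passes a full OperationContext,
// which is reduced here to the env so nested calls can be shown.
type Handler = Arc<dyn Fn(Value, &OperationEnv) -> Result<Value, CallError> + Send + Sync>;

#[derive(Default, Clone)]
pub struct OperationEnv {
    // namespace -> operation name -> handler (local dispatch path only)
    local: HashMap<String, HashMap<String, Handler>>,
}

impl OperationEnv {
    pub fn new() -> Self {
        Self::default()
    }

    // Hypothetical registration helper; the spec's builder takes whole registries.
    pub fn register<F>(mut self, namespace: &str, operation: &str, handler: F) -> Self
    where
        F: Fn(Value, &OperationEnv) -> Result<Value, CallError> + Send + Sync + 'static,
    {
        self.local
            .entry(namespace.to_string())
            .or_default()
            .insert(operation.to_string(), Arc::new(handler));
        self
    }

    // Look up the handler and call it in-process; no serialization is involved.
    pub fn invoke(
        &self,
        request_id: &str,
        namespace: &str,
        operation: &str,
        input: Value,
    ) -> ResponseEnvelope {
        let result = match self.local.get(namespace).and_then(|ops| ops.get(operation)) {
            Some(handler) => handler(input, self),
            None => Err(CallError {
                code: "not_found".to_string(), // assumed error code
                message: format!("no operation {namespace}/{operation}"),
                retryable: false,
            }),
        };
        ResponseEnvelope {
            request_id: request_id.to_string(),
            result,
        }
    }
}
```

In this sketch a handler registered under `agent/chat` can call `env.invoke(..., "auth", "verify", ...)` without knowing how the target is dispatched. A service or remote backend would slot in behind the same `invoke` signature.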
@@ -335,11 +439,11 @@ through core, out over SSH channel, into a JavaScript pubsub adapter, and
 be dispatched through `@alkdev/operations`'s call handler** — with zero
 translation at the wire level.

-### Agent Service Pattern (Future)
+### Agent Service Pattern (Downstream Application Concern)

 An agent service — coordinating between LLM providers and tool calls — is a
-primary use case for the call protocol. It would be just another set of
-registered operations with no special treatment:
+primary downstream use case for the call protocol. It would be just another set
+of registered operations with no special treatment:

 - `/head/agent/chat` — send a message, get a completion. Routes to the
  appropriate LLM provider based on available workers and configuration.
@@ -348,12 +452,10 @@ registered operations with no special treatment:
  durable storage).
 - `/head/sessions/history` — retrieve a specific session's message history.

-The agent service would use the same call protocol to invoke tools on workers
-(e.g., `/dev1/fs/readFile` for file access, `/dev1/bash/exec` for shell
-commands). This is a **downstream application concern**, not a core
-requirement. The call protocol enables it by providing the universal composition
-mechanism (OperationEnv, ADR-033), but the agent service itself is built on
-top, not into the core.
+The agent service uses OperationEnv to invoke tools on workers. **This is a
+downstream application concern, not a core requirement.** The call protocol
+enables it by providing the universal composition mechanism (ADR-033), but the
+agent service itself is built on top, not into the core.

 ## Constraints

@@ -370,6 +472,16 @@ top, not into the core.
  boundary. ACL is enforced at the `AccessControl` level, not by path prefix
  alone. A worker that exposes `/dev1/bash/exec` can restrict access via
  `required_scopes` — not every authenticated identity should have shell access.
+- **OperationEnv composition model matches the `@alkdev/operations` behavioral
+  contract**: namespace + operation name → invoke with input, return output.
+  The Rust implementation may differ in structure but must preserve this
+  contract (ADR-033).
+- **irpc is explicitly positioned as one dispatch backend for OperationEnv**
+  (ADR-033, ADR-028). It is not a replacement for the call protocol or for
+  OperationEnv.
+- **Phase 1 is local dispatch only.** irpc service dispatch and remote dispatch
+  are contracted in this spec but not built yet. The `OperationEnv::local()`
+  path is the Phase 1 implementation.

 ## Open Questions

@@ -378,9 +490,13 @@ top, not into the core.
  disconnect, or heartbeat-based discovery? See
  [open-questions.md](open-questions.md).

- **OQ-22**: Should the call protocol support streaming inputs (client streaming
-  in gRPC terms), or is client→server always a single request payload with
-  streaming only server→client? See [open-questions.md](open-questions.md).
+- **OQ-22**: ~~Should the call protocol support streaming inputs (client streaming
+  in gRPC terms)?~~ Resolved — deferred. Current model covers all identified use
+  cases. See [open-questions.md](open-questions.md).
+
+- **OQ-IF-01**: How does the `Interface` session type relate to the call
+  protocol's `EventEnvelope` stream? This needs design during Phase 1.8
+  implementation. See [open-questions.md](open-questions.md).

 ## Design Decisions

@@ -389,6 +505,8 @@ top, not into the core.
 | [018](decisions/018-control-channel-for-pubsub.md) | Control channel for pubsub | Reserved destination for event bus |
 | [024](decisions/024-bidirectional-call-protocol.md) | Bidirectional call protocol | Generalizes ADR-018, both sides can call |
 | [025](decisions/025-handler-spec-separation.md) | Handler/spec separation | Downstream registers operations without modifying core |
+| [028](decisions/028-auth-irpc-service.md) | Auth as irpc service | irpc is one dispatch backend for OperationEnv |
+| [033](decisions/033-operationenv-irpc-call-protocol.md) | OperationEnv | Universal composition with three dispatch paths |

 ## References

@@ -396,7 +514,10 @@ top, not into the core.
 - [napi-and-pubsub.md](napi-and-pubsub.md) — NAPI wrapper and pubsub adapter
 - [server.md](server.md) — Channel handling and control channel routing
 - [transport.md](transport.md) — Transport abstraction
- [configuration.md](../research/configuration.md) — ForwardingPolicy, service metadata
+- [identity.md](identity.md) — Identity struct, IdentityProvider trait
+- [interface.md](interface.md) — Interface layer, EventEnvelope stream from interfaces
+- [configuration.md](configuration.md) — ForwardingPolicy, service metadata
+- [services.md](services.md) — OperationEnv, OperationContext, irpc service layer
 - `@alkdev/pubsub` — TypeScript event target adapters and `EventEnvelope`
 - `@alkdev/operations` — TypeScript call protocol, `OperationSpec`, registry
 - `@alkdev/storage` — `peer_credentials` table, ACL graph, `Identity`
--- a/docs/architecture/configuration.md
+++ b/docs/architecture/configuration.md
@@ -69,6 +69,39 @@ impl ConfigReloadHandle {

 Obtained from `Server::run()`. Passed to NAPI or CLI for explicit reload.

+### ConfigServiceImpl
+
+The Phase 1 implementation of config service logic, backed by
+`ArcSwap<DynamicConfig>`. Where `ConfigIdentityProvider` wraps the auth section
+of `DynamicConfig`, `ConfigServiceImpl` wraps the forwarding and rate-limit
+sections. Both are ArcSwap-backed and share the same `DynamicConfig` instance.
+
+```rust
+pub struct ConfigServiceImpl {
+    dynamic: Arc<ArcSwap<DynamicConfig>>,
+}
+
+impl ConfigServiceImpl {
+    pub fn forwarding_policy(&self) -> Arc<ForwardingPolicy> {
+        self.dynamic.load().forwarding.clone()
+    }
+
+    pub fn rate_limits(&self) -> Arc<RateLimitConfig> {
+        self.dynamic.load().rate_limits.clone()
+    }
+
+    pub fn reload(&self, new_config: DynamicConfig) {
+        self.dynamic.store(Arc::new(new_config));
+    }
+}
+```
+
+Phase 1 deploys `ConfigServiceImpl` directly — no irpc service boundary. The
+`ConfigProtocol` irpc service (behind a feature flag) wraps `ConfigServiceImpl`
+for production deployments that use the service layer. This mirrors the
+`ConfigIdentityProvider` / `AuthProtocol` pattern from [identity.md](identity.md)
+and ADR-028.
+
 ### ConfigService irpc Service

 ```rust
@@ -155,7 +188,7 @@ iroh_relay = "https://relay.alk.dev"
 | Interface | Static config | Dynamic config | Reload mechanism |
 |-----------|--------------|----------------|------------------|
 | CLI | Flags + optional `--config` file | Loaded at startup from `--authorized-keys` | None (restart to change) |
-| Core Rust | `StaticConfig` struct | `AuthService` (irpc) or `ArcSwap<DynamicConfig>` (minimal) | `ConfigService::reload()` or `ConfigReloadHandle::reload()` |
+| Core Rust | `StaticConfig` struct | `AuthProtocol` (irpc) or `ConfigIdentityProvider` (ArcSwap) | `ConfigProtocol::ReloadDynamicConfig` or `ConfigReloadHandle::reload()` |
 | NAPI | `serve()` options | Same | `server.reloadAuth()`, `server.reloadForwarding()` |

 ## Constraints
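Reviewer sketch for the `ConfigServiceImpl` hunk above (not part of the commit): the snapshot-and-swap behaviour can be illustrated with the standard library alone. This uses `RwLock<Arc<DynamicConfig>>` as a stand-in for `Arc<ArcSwap<DynamicConfig>>`, so it shows the semantics rather than the lock-free implementation. The type name `ConfigServiceSketch` and the fields `default_allow` and `max_connections_per_ip` are hypothetical.

```rust
use std::sync::{Arc, RwLock};

#[derive(Debug, Clone, PartialEq)]
pub struct ForwardingPolicy {
    pub default_allow: bool, // hypothetical field
}

#[derive(Debug, Clone, PartialEq)]
pub struct RateLimitConfig {
    pub max_connections_per_ip: u32, // hypothetical field
}

pub struct DynamicConfig {
    pub forwarding: Arc<ForwardingPolicy>,
    pub rate_limits: Arc<RateLimitConfig>,
}

pub struct ConfigServiceSketch {
    // std-only stand-in for Arc<ArcSwap<DynamicConfig>>
    dynamic: RwLock<Arc<DynamicConfig>>,
}

impl ConfigServiceSketch {
    pub fn new(initial: DynamicConfig) -> Self {
        Self {
            dynamic: RwLock::new(Arc::new(initial)),
        }
    }

    // Take a cheap snapshot of the current config; the read lock is held
    // only for the duration of the Arc clone.
    fn snapshot(&self) -> Arc<DynamicConfig> {
        let guard = self.dynamic.read().expect("config lock poisoned");
        Arc::clone(&*guard)
    }

    pub fn forwarding_policy(&self) -> Arc<ForwardingPolicy> {
        Arc::clone(&self.snapshot().forwarding)
    }

    pub fn rate_limits(&self) -> Arc<RateLimitConfig> {
        Arc::clone(&self.snapshot().rate_limits)
    }

    // Replace the whole config at once; snapshots taken earlier stay valid
    // and keep pointing at the previous values.
    pub fn reload(&self, new_config: DynamicConfig) {
        *self.dynamic.write().expect("config lock poisoned") = Arc::new(new_config);
    }
}
```

The property worth noting is that a caller holding an `Arc<ForwardingPolicy>` from before a reload keeps a consistent view, while new calls observe the new config.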
--- a/docs/architecture/decisions/001-pluggable-transport.md
+++ b/docs/architecture/decisions/001-pluggable-transport.md
@@ -23,4 +23,4 @@ This makes adding a new transport (e.g., WebSocket, QUIC directly) a matter of i

 ## References
 - [transport.md](../transport.md)
- [Feasibility assessment §3](../../../../conversations/research/ssh-tunnel-vpn-alternative-feasibility.md)
+- [Feasibility assessment §3](../../research/feasibility/ssh-tunnel-vpn-alternative-feasibility.md)
--- a/docs/architecture/decisions/003-iroh-stream-join.md
+++ b/docs/architecture/decisions/003-iroh-stream-join.md
@@ -28,4 +28,4 @@ Option 3 was rejected because it would require modifying russh to understand iro

 ## References
 - [transport.md](../transport.md)
- [Feasibility assessment §11](../../../../conversations/research/ssh-tunnel-vpn-alternative-feasibility.md)
+- [Feasibility assessment §11](../../research/feasibility/ssh-tunnel-vpn-alternative-feasibility.md)
--- a/docs/architecture/decisions/004-ssh-over-transport.md
+++ b/docs/architecture/decisions/004-ssh-over-transport.md
@@ -25,4 +25,4 @@ This is directly enabled by russh's `connect_stream()` and `run_stream()` APIs,

 ## References
 - [transport.md](../transport.md)
- [Feasibility assessment §3.4](../../../../conversations/research/ssh-tunnel-vpn-alternative-feasibility.md)
+- [Feasibility assessment §3.4](../../research/feasibility/ssh-tunnel-vpn-alternative-feasibility.md)
--- a/docs/architecture/decisions/008-acme-lets-encrypt.md
+++ b/docs/architecture/decisions/008-acme-lets-encrypt.md
@@ -4,7 +4,7 @@
 Accepted

 ## Context
-TLS transport mode requires certificates. Manual certificate management is error-prone — users need to obtain, install, and renew certificates. Our production setup uses certbot with Let's Encrypt (documented in `/workspace/system/dev1/certbot.md`), which automates this via the ACME protocol.
+TLS transport mode requires certificates. Manual certificate management is error-prone — users need to obtain, install, and renew certificates. Our production setup uses certbot with Let's Encrypt (documented in [certbot.md](../../research/ops/certbot.md)), which automates this via the ACME protocol.

 There are two ACME flows:
 1. **Domain-based**: Standard flow with DNS-01 or HTTP-01 challenge. Certificate is tied to a domain name, auto-renews via certbot/systemd timer. Requires port 80 or DNS access for challenges.
@@ -35,4 +35,4 @@ The implementation should use the `rustls-acme` crate (or similar pure-Rust ACME
 - [server.md](../server.md)
 - [OQ-01](../open-questions.md) — resolved by this ADR
 - [OQ-07](../open-questions.md) — resolved by this ADR
- Production certbot setup: `/workspace/system/dev1/certbot.md`
+- Production certbot setup: [certbot.md](../../research/ops/certbot.md)
--- a/docs/architecture/decisions/013-fail2ban-friendly-logging.md
+++ b/docs/architecture/decisions/013-fail2ban-friendly-logging.md
@@ -4,7 +4,7 @@
 Accepted

 ## Context
-The server needs to handle abuse on public-facing deployments. Our production infrastructure uses fail2ban on Linux (documented in `/workspace/system/dev1/fail2ban.md`) with nftables and systemd journal backend. fail2ban needs structured, parseable logs to identify abusive IP addresses.
+The server needs to handle abuse on public-facing deployments. Our production infrastructure uses fail2ban on Linux (documented in [fail2ban.md](../../research/ops/fail2ban.md)) with nftables and systemd journal backend. fail2ban needs structured, parseable logs to identify abusive IP addresses.

 However, fail2ban is Linux-specific. On other platforms (macOS, Windows, BSD), users need a different approach to reject abusive connections. The server should provide enough logging for fail2ban on Linux and enough built-in protection for other platforms.

@@ -36,4 +36,4 @@ This ensures that even without fail2ban, the server rejects obviously abusive co
 ## References
 - [server.md](../server.md)
 - [OQ-08](../open-questions.md) — resolved by this ADR
- Production fail2ban setup: `/workspace/system/dev1/fail2ban.md`
+- Production fail2ban setup: [fail2ban.md](../../research/ops/fail2ban.md)
--- a/docs/architecture/decisions/027-crate-decomposition.md
+++ b/docs/architecture/decisions/027-crate-decomposition.md
@@ -64,17 +64,30 @@ format, but not as a crate dependency.
 ### Dependency Graph

 ```
-                  alknet-secret
-                 /             \
-                /               \
-alknet-core ←────                ←── alknet-storage
-     ↑               \           /
-     │                alknet-flowgraph
-     │
-alknet-napi
-alknet (CLI binary — assembles everything)
+alknet-secret       alknet-storage      alknet-flowgraph
+   (standalone)        (standalone)        (standalone)
+        │                   │                  │
+        │  (feature flags   │   (trait impl    │  (type compat
+        │   in CLI binary)  │    via CLI wire)  │   via JSON)
+        ▼                   ▼                  ▼
+                 ┌─────────────────────┐
+                 │    alknet-core       │
+                 │  (transport, SSH,     │
+                 │   call protocol,     │
+                 │   Identity, Config)  │
+                 └─────────┬───────────┘
+                           │
+              ┌────────────┼────────────┐
+              ▼            ▼            ▼
+        alknet-napi    alknet (CLI binary — assembles everything)
 ```

+All four library crates (core, secret, storage, flowgraph) are independent of
+each other. Dependencies flow **upward** only. The CLI binary sits at the top
+and wires concrete implementations together. alknet-storage implements
+alknet-core's `IdentityProvider` trait without a crate dependency — the CLI
+binary provides the bridge.
+
 ### Narrow Interface Points

 Three types serve as the narrow interface points between crates:
@@ -147,4 +160,5 @@ alknet-storage does NOT depend on alknet-secret as a crate. Instead:
 - [research/services.md](../../research/services.md) — Service protocols
 - [research/storage.md](../../research/storage.md) — alknet-storage contents
 - [research/flow.md](../../research/flow.md) — alknet-flowgraph contents
+- [ADR-028](028-auth-irpc-service.md) — Auth as irpc service (service protocol enabled by decomposition)
 - [ADR-029](029-identity-core-type.md) — Identity as core type (narrow interface point)
--- a/docs/architecture/decisions/032-event-boundary-discipline.md
+++ b/docs/architecture/decisions/032-event-boundary-discipline.md
@@ -93,4 +93,4 @@ propagate beyond the service boundary without projection.
 - [research/services.md](../../research/services.md) — Event boundary discipline section
 - [research/storage.md](../../research/storage.md) — Honker integration, event boundaries
 - [research/integration-plan.md](../../research/integration-plan.md) — ADR 032 entry
- [event_source_types.md](/workspace/research/event_sourcing/event_source_types.md) — Event-driven architecture patterns
+- [event_source_types.md](../../research/event-sourcing/event_source_types.md) — Event-driven architecture patterns
--- a/docs/architecture/decisions/033-operationenv-irpc-call-protocol.md
+++ b/docs/architecture/decisions/033-operationenv-irpc-call-protocol.md
@@ -125,6 +125,8 @@ operations universally composable across all interfaces.

 - [research/services.md](../../research/services.md) — OperationContext, OperationEnv
 - [research/integration-plan.md](../../research/integration-plan.md) — Phase 1.5, OperationEnv wiring
+- [ADR-026](026-transport-interface-separation.md) — Three-layer model (OperationEnv is Layer 3)
+- [ADR-028](028-auth-irpc-service.md) — Auth as irpc service (one dispatch backend)
 - [ADR-032](032-event-boundary-discipline.md) — Event boundary discipline
 - [ADR-024](024-bidirectional-call-protocol.md) — Bidirectional call protocol
 - [ADR-025](025-handler-spec-separation.md) — Handler/spec separation
--- a/docs/architecture/napi-and-pubsub.md
+++ b/docs/architecture/napi-and-pubsub.md
@@ -1,6 +1,6 @@
 ---
 status: reviewed
-last_updated: 2026-06-02
+last_updated: 2026-06-07
 ---

 # NAPI Wrapper & PubSub Event Target
@@ -71,11 +71,36 @@ function serve(options: AlknetServeOptions): Promise<AlknetServer>;
 interface AlknetServer {
  close(): Promise<void>;
  onConnection(callback: (stream: Duplex, info: ConnectionInfo) => void): void;
+  // Dynamic config reload (ADR-030)
+  reloadAuth(auth: { authorizedKeys?: Buffer, certAuthority?: Buffer }): void;
+  reloadForwarding(policy: ForwardingPolicyConfig): void;
+  reloadAll(config: DynamicConfig): void;
+}
+
+interface ForwardingPolicyConfig {
+  default: 'allow' | 'deny';
+  rules: ForwardingRuleConfig[];
+}
+
+interface ForwardingRuleConfig {
+  target: string;        // "localhost:*", "10.0.0.0/8:80", "alknet-*"
+  action: 'allow' | 'deny';
+  principals?: string[]; // default ["*"]
 }
 ```

 The NAPI layer is **transport-agnostic** — it doesn't know about pubsub's `EventEnvelope`. The pubsub adapter wraps the `Duplex` stream to implement `TypedEventTarget`. This separation ensures the NAPI wrapper is reusable for any stream-based protocol, not tied specifically to pubsub.

+### NAPI Call Protocol Integration
+
+NAPI consumers can register operation handlers to participate in the call protocol. The `Duplex` stream from `connect()` or `serve()` carries `EventEnvelope` frames (4-byte BE length prefix + JSON). A TypeScript consumer can implement a call protocol handler that reads these frames and dispatches to registered operations — the same wire protocol used by `@alkdev/operations`.
+
+See [call-protocol.md](call-protocol.md) for the call protocol spec and [services.md](services.md) for OperationEnv and dispatch paths.
+
+### NAPI irpc Service Creation
+
+Behind the `irpc` feature flag, NAPI consumers can create irpc service instances for in-cluster communication. This is a Phase 2+ capability — Phase 1 uses `ConfigIdentityProvider` and direct `ConfigReloadHandle` calls. See [services.md](services.md) for the irpc service layer and ADR-027 for crate decomposition.
+
 ### NAPI `connect()` vs CLI `alknet connect`

 The NAPI `connect()` function and the CLI `alknet connect` command are fundamentally different operations despite sharing the same name:
@@ -154,4 +179,11 @@ None — all resolved.
 | [011](decisions/011-no-ssh-config-programmatic-api.md) | Programmatic-first API | No file-based config; options are structs or env vars |
 | [015](decisions/015-napi-rs-for-ffi-bridge.md) | napi-rs for FFI | Standard Node.js native addon tooling |
 | [016](decisions/016-napi-expose-connect-and-serve.md) | Both connect() and serve() | NAPI exposes client and server sides from the start |
-| [018](decisions/018-control-channel-for-pubsub.md) | Control channel for pubsub | Reserved `alknet-control` destination for event bus |
+| [018](decisions/018-control-channel-for-pubsub.md) | Control channel for pubsub | Reserved `alknet-control` destination for event bus |
+| [030](decisions/030-static-dynamic-config-split.md) | Static/dynamic config split | NAPI reload methods for auth, forwarding, and all dynamic config |
+
+## References
+
+- [configuration.md](configuration.md) — DynamicConfig, ForwardingPolicy, reload mechanism
+- [services.md](services.md) — OperationEnv, irpc service layer
+- [call-protocol.md](call-protocol.md) — Call protocol wire format and operation registry
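Reviewer sketch for the NAPI call protocol hunk above (not part of the commit): the framing described there, a 4-byte big-endian length prefix followed by JSON, can be written against any `Read`/`Write` stream. The function names and the `MAX_FRAME_LEN` cap are assumptions; the spec excerpt does not state a maximum frame size, and the payload is treated as opaque bytes rather than parsed JSON.

```rust
use std::io::{self, Read, Write};

// Hypothetical upper bound to avoid allocating on a corrupt or hostile length prefix.
const MAX_FRAME_LEN: usize = 16 * 1024 * 1024;

// Write one frame: 4-byte big-endian length, then the JSON bytes.
pub fn write_frame<W: Write>(writer: &mut W, json: &[u8]) -> io::Result<()> {
    if json.len() > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    let len = json.len() as u32; // safe: bounded by MAX_FRAME_LEN above
    writer.write_all(&len.to_be_bytes())?;
    writer.write_all(json)
}

// Read one frame: parse the length prefix, then read exactly that many bytes.
pub fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    reader.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds cap"));
    }
    let mut payload = vec![0u8; len];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

A consumer would pass the returned bytes to a JSON parser to obtain the `EventEnvelope`. A truncated stream surfaces as an `UnexpectedEof` error from `read_exact` rather than as a short frame.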
--- a/docs/architecture/open-questions.md
+++ b/docs/architecture/open-questions.md
@@ -105,7 +105,7 @@ last_updated: 2026-06-07
 - **Origin**: [research/configuration.md](../research/configuration.md)
 - **Status**: resolved
 - **Priority**: low
- **Resolution**: No file watching. CLI loads once at startup; NAPI/hub reload explicitly. File watching is a potential attack vector and unnecessary complexity for a security tool.
+- **Resolution**: No file watching. CLI loads once at startup; NAPI/head reload explicitly. File watching is a potential attack vector and unnecessary complexity for a security tool.
 - **Cross-references**: configuration.md

 ### OQ-14: ArcSwap vs RwLock for dynamic config
@@ -221,11 +221,18 @@ last_updated: 2026-06-07

 ### OQ-SVC-04: Should workers cache derived keys locally?
 - **Origin**: [secret-service.md](secret-service.md)
- **Status**: open
- **Priority**: low
+- **Status**: resolved
+- **Priority**: low
 - **Resolution**: Yes, with a TTL (default: 1 hour). The head can revoke by invalidating the session.
 - **Cross-references**: [secret-service.md](secret-service.md)

+### OQ-SVC-05: How does the NFT-based ACL smart contract interact with the secret service?
+- **Origin**: [storage.md](storage.md)
+- **Status**: open
+- **Priority**: low
+- **Resolution**: The Ethereum signing key (`m/44'/60'/0'/0/0`) is derived from the same seed as the secret service. The smart contract is a separate concern — it reads on-chain ACL state, it doesn't call the secret service.
+- **Cross-references**: [storage.md](storage.md), [secret-service.md](secret-service.md)
+
 ## Interface

 ### OQ-IF-01: How does the Interface session type relate to the call protocol's EventEnvelope stream?
--- a/docs/architecture/overview.md
+++ b/docs/architecture/overview.md
@@ -1,6 +1,6 @@
 ---
 status: reviewed
-last_updated: 2026-06-02
+last_updated: 2026-06-07
 ---

 # Alknet Overview
@@ -16,6 +16,64 @@ Alknet is a self-hostable SSH-based tunnel tool that provides VPN-like functiona

 The core insight: SSH tunnels work because SSH is fundamental infrastructure. Blocking it breaks the internet. Alknet makes SSH tunneling accessible through a simple CLI with pluggable transports.

+## Crate Structure
+
+Alknet is decomposed into six crates with a strict acyclic dependency graph (ADR-027):
+
+| Crate | Purpose | Exists Now? |
+|-------|---------|-------------|
+| **alknet-core** | Transport, SSH, call protocol, config, auth types, `OperationSpec`, `Interface` trait | Yes |
+| **alknet-napi** | Node.js native addon via napi-rs | Yes |
+| **alknet-secret** | BIP39, SLIP-0010 HD key derivation, AES-256-GCM, `SecretProtocol` irpc service | Phase 2+ |
+| **alknet-storage** | SQLite-backed metagraph, identity tables, ACL graph, honker, `StorageProtocol` | Phase 2+ |
+| **alknet-flowgraph** | `FlowGraph<N,E>` over petgraph, operation graph, call graph | Phase 2+ |
+| **alknet** (CLI) | Binary that assembles everything with feature flags | Yes |
+
+The four library crates (core, secret, storage, flowgraph) are independent of each other. Dependencies flow upward only: the CLI binary sits at the top and wires concrete implementations together. alknet-storage implements alknet-core's `IdentityProvider` trait without a crate dependency — the CLI binary provides the bridge.
+
+irpc is behind a feature flag in alknet-core. Nodes that only do SSH tunneling don't need the service layer overhead.
+
+## Three-Layer Model
+
+Alknet uses a three-layer model (ADR-026):
+
+| Layer | Responsibility | Examples |
+|-------|---------------|----------|
+| **Layer 1: Transport** | Produces byte streams (`AsyncRead + AsyncWrite + Unpin + Send`) | TCP, TLS, iroh, DNS (future), WebTransport (future) |
+| **Layer 2: Interface** | Consumes a transport stream and produces call protocol sessions | SSH (handshake + auth + channel multiplexing), raw framing (length-prefix + JSON) |
+| **Layer 3: Protocol** | Carries semantics — operation registry, service calls, events | Call protocol, OperationEnv, operation dispatch |
+
+SSH is an interface, not a transport. The three-layer model enables DNS control channels (DNS transport + raw framing), local service mesh (TCP + raw framing), and browser direct call protocol (WebTransport + raw framing) without wrapping SSH inside those transports.
+
+A connection is always a (Transport, Interface) pair. The protocol layer is agnostic to both.
+
+## Service Layer
+
+The irpc service layer decomposes alknet's core responsibilities into independently testable, deployable, and replaceable components (ADR-033, [services.md](services.md)):
+
+- **Auth** (`AuthProtocol`) — verify identities, check credentials
+- **Secret** (`SecretProtocol`) — derive keys, encrypt/decrypt
+- **Config** (`ConfigProtocol`) — dynamic config reload
+- **Storage** (`StorageProtocol`) — graph CRUD, metagraph operations
+
+**OperationEnv** is the universal composition mechanism. A handler calls `context.env.invoke("secrets", "derive", input)` without knowing whether the dispatch is local (direct function call), in-cluster (irpc service), or cross-node (call protocol `EventEnvelope`). Three dispatch paths, one handler-facing API.
+
+**Phase boundary**: Phase 1 ships `ConfigIdentityProvider` (ArcSwap-backed) and `ConfigServiceImpl` (ArcSwap-backed) as the only auth and config implementations. The irpc service protocols (`AuthProtocol`, `SecretProtocol`, etc.) and the production deployment topology (multi-node with `StorageIdentityProvider`) are specified as contracts but will not be implemented until Phase 2+. Application services (DockerService, NodeService, agent services) are downstream concerns that build on top of the call protocol and OperationEnv.
+
+## Identity
+
+`Identity` struct and `IdentityProvider` trait are core types in alknet-core (ADR-029, [identity.md](identity.md)):
+
+```rust
+pub struct Identity {
+    pub id: String,          // Fingerprint (config auth) or account UUID (database auth)
+    pub scopes: Vec<String>, // Authorization scope strings
+    pub resources: HashMap<String, Vec<String>>, // Resource-level authorization
+}
+```
+
+`IdentityProvider` decouples alknet-core from identity storage. Phase 1 ships `ConfigIdentityProvider` (reads from `ArcSwap<DynamicConfig.auth>`). `StorageIdentityProvider` (Phase 2+, backed by SQLite) replaces it for production deployments. Both produce the same `Identity` result.
+
 ## Exports

 ### Binary: `alknet`
@@ -35,24 +93,40 @@ The `alknet-core` crate exports the pluggable components for embedding or progra
 - `TcpTransport` — direct TCP connection
 - `TlsTransport` — TCP + tokio-rustls TLS
 - `IrohTransport` — iroh QUIC P2P connection
+- `Interface` trait — consumes transport stream, produces call protocol session
 - `Socks5Server` — local SOCKS5 proxy that forwards through SSH channels
 - `PortForwarder` — manages local/remote port forwards
 - `ServerHandler` — russh server handler with configurable auth and channel policies
- `ConnectOptions` / `ServeOptions` — programmatic configuration structs (no file parsing)
+- `Identity` / `IdentityProvider` — core identity types (ADR-029)
+- `OperationSpec` — operation registration for call protocol (ADR-025)
+- `ConnectOptions` / `ServeOptions` — programmatic configuration structs
+- `StaticConfig` / `DynamicConfig` — static/immutable vs. hot-reloadable config (ADR-030)
+- `ConfigReloadHandle` — programmatic reload of dynamic config

 ## Dependencies

-| Dependency | Purpose | Feature-gated |
-|------------|---------|---------------|
-| `russh` | SSH client & server | No (core) |
-| `tokio` | Async runtime | No (core) |
-| `tokio-rustls` | TLS wrapping | Yes (`tls`) |
-| `rustls` | TLS implementation | Yes (`tls`) |
-| `rustls-acme` | ACME/Let's Encrypt auto-cert | Yes (`acme`) |
-| `iroh` | P2P QUIC transport | Yes (`iroh`) |
-| `clap` | CLI argument parsing | No (core) |
-| `tracing` | Structured logging | No (core) |
-| `anyhow` / `thiserror` | Error handling | No (core) |
+| Dependency | Purpose | Crate | Feature-gated |
+|------------|---------|-------|---------------|
+| `russh` | SSH client & server | core | No (core) |
+| `tokio` | Async runtime | core | No (core) |
+| `tokio-rustls` | TLS wrapping | core | Yes (`tls`) |
+| `rustls` | TLS implementation | core | Yes (`tls`) |
+| `rustls-acme` | ACME/Let's Encrypt auto-cert | core | Yes (`acme`) |
+| `iroh` | P2P QUIC transport | core | Yes (`iroh`) |
+| `irpc` | Streaming RPC service layer | core | Yes (`irpc`) |
+| `arc-swap` | Lock-free dynamic config | core | No (core) |
+| `serde` | Serialization | core | No (core) |
+| `clap` | CLI argument parsing | CLI | No (CLI) |
+| `toml` | TOML config file | CLI | No (CLI) |
+| `tracing` | Structured logging | core | No (core) |
+| `anyhow` / `thiserror` | Error handling | core | No (core) |
+| `bip39` | Mnemonic generation | secret | No (secret) |
+| `ed25519-bip32` | HD key derivation | secret | No (secret) |
+| `aes-gcm` | AES-256-GCM encryption | secret | No (secret) |
+| `rusqlite` | SQLite (via honker) | storage | No (storage) |
+| `honker` | Event-sourced storage | storage | No (storage) |
+| `petgraph` | Graph data structure | storage, flowgraph | No |
+| `jsonschema` | JSON Schema validation | storage, flowgraph | No |

 > Note: `tun-rs` is no longer a dependency. TUN support is deferred in favor of the external `tun2proxy` tool (ADR-014).

@@ -60,19 +134,29 @@ The `alknet-core` crate exports the pluggable components for embedding or progra

 1. **SSH runs over transport, not alongside** — The transport layer produces a single `AsyncRead+AsyncWrite+Unpin+Send` stream. SSH runs over that stream via `russh::client::connect_stream()` / `russh::server::run_stream()`. The SSH layer never knows what transport it's on. (ADR-001, ADR-004)

-2. **SOCKS5 is the primary client interface** — Port forwarding is built on top of SOCKS5-like channel management. For VPN-like "route all traffic" behavior, users run `tun2proxy` alongside alknet's SOCKS5 proxy. TUN is not in the project scope. (ADR-005, ADR-014)
+2. **Three-layer model: Transport, Interface, Protocol** — SSH is an interface (Layer 2), not a transport (Layer 1). A connection is always a (Transport, Interface) pair. The call protocol (Layer 3) is agnostic to both. This enables DNS control channels, raw framing, and WebTransport direct call protocol without wrapping SSH inside those transports. (ADR-026)

-3. **No logging of tunnel destinations** — The server logs auth attempts and connections (for fail2ban) but does not log `channel_open_direct_tcpip` destinations, DNS lookups, or bytes transferred. (ADR-006, ADR-013)
+3. **SOCKS5 is the primary client interface** — Port forwarding is built on top of SOCKS5-like channel management. For VPN-like "route all traffic" behavior, users run `tun2proxy` alongside alknet's SOCKS5 proxy. TUN is not in the project scope. (ADR-005, ADR-014)

-4. **Programmatic-first API** — Configuration via CLI flags, library API structs (`ConnectOptions`, `ServeOptions`), and environment variables. No `~/.ssh/config` parsing, no custom config files. (ADR-011)
+4. **No logging of tunnel destinations** — The server logs auth attempts and connections (for fail2ban) but does not log `channel_open_direct_tcpip` destinations, DNS lookups, or bytes transferred. (ADR-006, ADR-013)

-5. **Feature flags control transport inclusion** — `tls`, `iroh`, `acme` are feature-gated so the base install is lean. Users opt in to heavier dependencies.
+5. **Programmatic-first API** — Configuration via CLI flags, library API structs (`ConnectOptions`, `ServeOptions`), and environment variables. No `~/.ssh/config` parsing. Optional `--config` TOML file for reproducible deployments. (ADR-011, ADR-030)

-6. **Authentication is key-based** — Ed25519 public key (default) and OpenSSH certificate authority. No password authentication over SSH. (ADR-012)
+6. **Feature flags control transport inclusion** — `tls`, `iroh`, `acme`, `irpc` are feature-gated so the base install is lean. Users opt in to heavier dependencies.

-7. **NAPI exposes both connect() and serve()** — The napi-rs wrapper provides client and server functionality, using napi-rs as the FFI bridge. The NAPI layer is transport-agnostic and not tied to pubsub. (ADR-015, ADR-016)
+7. **Authentication is key-based and unified** — Ed25519 public key (default) and OpenSSH certificate authority. Same key material for SSH and token auth. Identity resolves through `IdentityProvider` trait, decoupling core from identity storage. (ADR-012, ADR-023, ADR-029)

-8. **Error handling follows a consistent layered pattern** — Transport and auth errors cause reconnection (client, with exponential backoff) or connection rejection (server). Channel-level errors (target unreachable, proxy failure) close the individual channel without killing the session. Library API errors propagate via `anyhow::Result` / `thiserror` types. CLI reports errors to stderr with appropriate exit codes. NAPI errors are marshalled as JavaScript exceptions.
+8. **NAPI exposes both connect() and serve()** — The napi-rs wrapper provides client and server functionality, using napi-rs as the FFI bridge. The NAPI layer is transport-agnostic and not tied to pubsub. (ADR-015, ADR-016)
+
+9. **Static/dynamic config split** — Transport-level settings (listen address, TLS certs) are immutable after startup. Auth, forwarding policy, and rate limits are hot-reloadable via `ArcSwap<DynamicConfig>`. (ADR-030)
+
+10. **Forwarding policy enforced before proxy spawn** — Each `channel_open_direct_tcpip` is checked against `ForwardingPolicy` before a TCP connection is made. Default-allow preserves current behavior. (ADR-031)
+
+11. **OperationEnv as universal composition mechanism** — Handlers call `context.env.invoke(namespace, op, input)` regardless of dispatch path (local, irpc service, remote call protocol). (ADR-033)
+
+12. **Event boundary discipline** — Domain events (Honker streams) stay within the owning service. irpc calls are synchronous and in-cluster. Call protocol `EventEnvelope` is the only thing that crosses node boundaries. (ADR-032)
+
+13. **Error handling follows a consistent layered pattern** — Transport and auth errors cause reconnection (client, with exponential backoff) or connection rejection (server). Channel-level errors (target unreachable, proxy failure) close the individual channel without killing the session. Library API errors propagate via `anyhow::Result` / `thiserror` types. CLI reports errors to stderr with appropriate exit codes. NAPI errors are marshalled as JavaScript exceptions.

 ## Design Decisions

@@ -88,7 +172,7 @@ The `alknet-core` crate exports the pluggable components for embedding or progra
 | [008](decisions/008-acme-lets-encrypt.md) | ACME/Let's Encrypt | Auto-provision TLS certs, domain and IP paths |
 | [009](decisions/009-default-iroh-relay.md) | Default iroh relay | n0 relay by default, `--iroh-relay` override |
 | [010](decisions/010-transport-chaining-cli.md) | Transport chaining | `--proxy` works with all transports natively |
-| [011](decisions/011-no-ssh-config-programmatic-api.md) | Programmatic-first | No file-based config; options are structs, env vars, CLI flags |
+| [011](decisions/011-no-ssh-config-programmatic-api.md) | Programmatic-first | No SSH config files; options are structs, env vars, CLI flags (amended by ADR-030 for optional TOML) |
 | [012](decisions/012-auth-ed25519-and-cert-authority.md) | Key + cert-authority | Ed25519 keys + OpenSSH CA; no password auth |
 | [013](decisions/013-fail2ban-friendly-logging.md) | Fail2ban-friendly | Structured auth logs + built-in rate limiting |
 | [014](decisions/014-defer-tun-recommend-socks5-proxy.md) | Defer TUN | Use tun2proxy for VPN-like behavior; no alknet-tun binary |
@@ -97,17 +181,46 @@ The `alknet-core` crate exports the pluggable components for embedding or progra
 | [017](decisions/017-stealth-mode-protocol-multiplexing.md) | Stealth mode | Protocol multiplexing on port 443 |
 | [018](decisions/018-control-channel-for-pubsub.md) | Control channel | Reserved `alknet-control` destination for pubsub |
 | [019](decisions/019-proxy-dual-semantics.md) | Proxy dual semantics | `--proxy` routes transport on client, data on server |
+| [023](decisions/023-unified-auth-shared-key-material.md) | Unified auth | Same key material for SSH and token auth |
+| [024](decisions/024-bidirectional-call-protocol.md) | Bidirectional call protocol | Both sides can initiate calls |
+| [025](decisions/025-handler-spec-separation.md) | Handler/spec separation | Downstream registers operations without modifying core |
+| [026](decisions/026-transport-interface-separation.md) | Three-layer model | SSH is Layer 2, not Layer 1 |
+| [027](decisions/027-crate-decomposition.md) | Crate decomposition | Six crates, acyclic deps, feature-gated irpc |
+| [028](decisions/028-auth-irpc-service.md) | Auth as irpc service | IdentityProvider is the contract, irpc is one backend |
+| [029](decisions/029-identity-core-type.md) | Identity as core type | `Identity` and `IdentityProvider` in alknet-core |
+| [030](decisions/030-static-dynamic-config-split.md) | Static/dynamic config | ArcSwap for hot-reloadable auth and forwarding |
+| [031](decisions/031-forwarding-policy.md) | Forwarding policy | Per-identity, per-destination, per-transport rules |
+| [032](decisions/032-event-boundary-discipline.md) | Event boundary | Domain events never cross service boundaries |
+| [033](decisions/033-operationenv-irpc-call-protocol.md) | OperationEnv | Universal composition, three dispatch paths |
+| [034](decisions/034-head-worker-terminology.md) | Head/worker | Replaces hub/spoke terminology |

 ## Open Questions

-All open questions have been resolved. See [open-questions.md](open-questions.md) for resolution details.
+See [open-questions.md](open-questions.md) for all open and resolved questions.
+Key open questions: OQ-15 (QUIC coexistence), OQ-19 (WebTransport TLS),
+OQ-20 (worker registration), OQ-IF-01 (Interface session / EventEnvelope
+relationship).

 ## References

- [Feasibility Assessment](../../../conversations/research/ssh-tunnel-vpn-alternative-feasibility.md)
+- [transport.md](transport.md) — Transport abstraction (Layer 1)
+- [interface.md](interface.md) — Interface layer (Layer 2)
+- [call-protocol.md](call-protocol.md) — Call protocol (Layer 3)
+- [auth.md](auth.md) — Unified authentication
+- [identity.md](identity.md) — Identity and IdentityProvider
+- [configuration.md](configuration.md) — StaticConfig, DynamicConfig, ForwardingPolicy
+- [services.md](services.md) — irpc service layer, OperationEnv
+- [server.md](server.md) — Server acceptance, channel handling
+- [client.md](client.md) — Client connection, SOCKS5, port forwarding
+- [napi-and-pubsub.md](napi-and-pubsub.md) — NAPI wrapper and pubsub adapter
+- [storage.md](storage.md) — alknet-storage: metagraph, identity, ACL
+- [flowgraph.md](flowgraph.md) — alknet-flowgraph: call graph, operation graph
+- [secret-service.md](secret-service.md) — alknet-secret: BIP39, SLIP-0010, AES-GCM
+- [Feasibility Assessment](../research/feasibility/ssh-tunnel-vpn-alternative-feasibility.md)
 - [russh API](/workspace/russh) — SSH client/server library
 - [Dispatch](/workspace/@alkdev/dispatch) — Reference implementation of russh port forwarding
 - [iroh](/workspace/iroh) — P2P QUIC connections
 - [tun2proxy](https://github.com/tun2proxy/tun2proxy) — Recommended external TUN-to-SOCKS5 tool
- [Production certbot setup](/workspace/system/dev1/certbot.md) — Let's Encrypt on our infrastructure
- [Production fail2ban setup](/workspace/system/dev1/fail2ban.md) — fail2ban with nftables on our infrastructure
+- [irpc](/workspace/irpc) — iroh streaming RPC
+- [Production certbot setup](../research/ops/certbot.md) — Let's Encrypt on our infrastructure
+- [Production fail2ban setup](../research/ops/fail2ban.md) — fail2ban with nftables on our infrastructure
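
Reviewer sketch for the overview.md hunks above, not part of the patch. It illustrates how an `IdentityProvider` could return the same `Identity` for the SSH fingerprint path and the token path. Only the `Identity` fields and the two method names come from the spec text. The trait signatures, the `ListIdentityProvider` type (a plain list standing in for the ArcSwap-backed `ConfigIdentityProvider`), and the `fp:` token prefix are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Mirrors the `Identity` struct shown in the overview.md hunk.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

/// Hypothetical trait shape: the method names appear in the spec,
/// the signatures are assumed for this sketch.
pub trait IdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &str) -> Option<Identity>;
}

/// Hypothetical stand-in for `ConfigIdentityProvider`. It holds a fixed
/// list of authorized fingerprints instead of reading `ArcSwap<DynamicConfig>`.
pub struct ListIdentityProvider {
    default_scopes: Vec<String>,
    authorized: Vec<String>,
}

impl ListIdentityProvider {
    pub fn new(default_scopes: &[&str], authorized: &[&str]) -> Self {
        Self {
            default_scopes: default_scopes.iter().map(|s| s.to_string()).collect(),
            authorized: authorized.iter().map(|s| s.to_string()).collect(),
        }
    }
}

impl IdentityProvider for ListIdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        // Unknown keys resolve to no identity at all.
        if !self.authorized.iter().any(|fp| fp == fingerprint) {
            return None;
        }
        // Every authorized key gets the default scope set.
        Some(Identity {
            id: fingerprint.to_string(),
            scopes: self.default_scopes.clone(),
            resources: HashMap::new(),
        })
    }

    fn resolve_from_token(&self, token: &str) -> Option<Identity> {
        // Assumption: the token names its key as "fp:<fingerprint>".
        // Real token parsing and signature verification are not modeled.
        let fingerprint = token.strip_prefix("fp:")?;
        self.resolve_from_fingerprint(fingerprint)
    }
}
```

Because both paths end in the same lookup, a caller that holds only `&dyn IdentityProvider` cannot tell which presentation path produced the identity, which is the decoupling the spec describes.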
--- a/docs/architecture/secret-service.md
+++ b/docs/architecture/secret-service.md
@@ -166,20 +166,16 @@ never leaves the secret service node.

 ## Open Questions

- **OQ-SVC-01**: Should the secret service support multiple seed phrases (one per
-  tenant)? See [open-questions.md](open-questions.md).
+- **OQ-SVC-01**: Should the secret service support multiple seed phrases (one
+  per tenant)? See [open-questions.md](open-questions.md).

 - **OQ-SVC-02**: Should service protocols use postcard (binary) or JSON for
-  remote calls? Postcard for irpc (Rust-to-Rust), JSON for call protocol
-  (cross-language). See [open-questions.md](open-questions.md).
+  remote calls? See [open-questions.md](open-questions.md).

 - **OQ-SVC-03**: How does the secret service integrate with the existing
-  `EncryptedDataSchema` from `@alkdev/storage`? The Rust implementation replaces
-  PBKDF2 password-based encryption with derived AES-256-GCM keys. The
-  `EncryptedData` format is a superset.
+  `EncryptedDataSchema` from `@alkdev/storage`? See [open-questions.md](open-questions.md).

- **OQ-SVC-04**: Should workers cache derived keys locally? Yes, with a TTL
-  (default: 1 hour). The head can revoke by invalidating the session.
+- **OQ-SVC-04**: Should workers cache derived keys locally? See [open-questions.md](open-questions.md).

 ## Design Decisions

--- a/docs/architecture/server.md
+++ b/docs/architecture/server.md
@@ -1,6 +1,6 @@
 ---
 status: reviewed
-last_updated: 2026-06-02
+last_updated: 2026-06-07
 ---

 # Server
@@ -51,21 +51,30 @@ The server is the tunnel endpoint. It receives SSH channels requesting TCP conne

 ### Authentication

-The server supports Ed25519 public key authentication (default) and OpenSSH certificate authority authentication (ADR-012):
+The server authenticates connections through the `IdentityProvider` trait (ADR-029, [identity.md](identity.md)). `IdentityProvider` decouples the server from any specific identity storage — the server resolves an identity, it doesn't manage keys.

-**Ed25519 public key** (default):
-1. Load authorized keys from a specified path or in-memory data
-2. `auth_publickey()` checks the presented key against the authorized set
-3. Uses constant-time comparison to prevent timing attacks
+**Phase 1 implementation**: `ConfigIdentityProvider` (in alknet-core) reads from `ArcSwap<DynamicConfig.auth>` (ADR-030). Every authorized key gets a default scope set. No database required. This is the default for CLI and single-node deployments.

-**OpenSSH certificate authority** (ADR-012):
-1. Load a trusted CA public key (`--cert-authority <path>`)
-2. `auth_publickey()` validates the presented certificate: checks CA signature, expiry, and principal restrictions
-3. Supports certificate options: `permit-port-forwarding`, `no-pty`, `source-address`
+**Future implementation**: `StorageIdentityProvider` (in alknet-storage, not yet built) backed by SQLite `peer_credentials` and `api_keys` tables plus the ACL graph. The server doesn't need to know which implementation is active — it goes through the trait.

-This enables multi-user deployments where adding one CA line to `authorized_keys` is simpler than managing individual keys for every user.
+The server supports two auth presentation paths (ADR-023, [auth.md](auth.md)):

-**No password authentication over SSH.** Keys and certificates are sufficient and more secure. If a local SOCKS5 proxy needs its own auth layer, that's a separate concern.
+**SSH public key auth** (SSH transports):
+1. `auth_publickey()` callback receives the presented key
+2. Delegates to `IdentityProvider::resolve_from_fingerprint()` with the key fingerprint
+3. Returns `Accept` (with `Identity` attached) or `Reject`
+
+**OpenSSH certificate authority** (fallback within the SSH path, ADR-012):
+1. If no direct key match, validate the presented certificate against trusted cert-authorities
+2. Check CA signature, expiry, and principal restrictions
+3. Certificate options: `permit-port-forwarding`, `no-pty`, `source-address`
+
+**Token auth** (non-SSH transports, WebTransport):
+1. Extract token from URL path or `Authorization` header
+2. Delegate to `IdentityProvider::resolve_from_token()`
+3. Same verification: same authorized keys set, same `Identity` result (ADR-023)
+
+**No password authentication over SSH channels.** Keys and certificates are sufficient and more secure. If a local SOCKS5 proxy needs its own auth layer, that's a separate concern.

 ### Key Material Format

@@ -87,7 +96,9 @@ When a client opens a `channel_open_direct_tcpip(host, port, originator_addr, or

 **Reserved destination** — If `host` starts with `alknet-` (e.g., `alknet-control`), the server routes the channel internally instead of connecting to a TCP target. The primary reserved destination is `alknet-control:0`, which bridges the channel to the local pubsub event bus (ADR-018).

-**Regular destination** — For all other targets:
+**Forwarding policy check** — Before the proxy task is spawned for any non-reserved destination, the server evaluates `ForwardingPolicy` against the authenticated `Identity` (ADR-031, [configuration.md](configuration.md)). The policy check uses `Identity.id` and `Identity.scopes` from the identity resolved during auth. If the policy denies the destination, the channel open is rejected — no TCP connection is attempted. The default policy (`ForwardingPolicy::allow_all()`) preserves current behavior.
+
+**Regular destination** — For targets that pass the forwarding policy check:

 1. **Connection** — connect to `host:port`, either directly or via the configured outbound proxy
 2. **Outbound connection** — connect to the target, either directly or via the configured outbound proxy
@@ -122,17 +133,23 @@ This makes the server appear as an ordinary web server to port scanners and DPI
 The server handler implements `russh::server::Handler` with two primary responsibilities:

 **Authentication (`auth_publickey`)**:
- Check the presented key against the configured `authorized_keys` set (constant-time comparison)
- If no direct match, check whether the key is a certificate signed by a trusted cert-authority
- Validate certificate signature, expiry, and principal restrictions (e.g., `permit-port-forwarding`, `no-pty`, `source-address`)
+- Delegate to `IdentityProvider::resolve_from_fingerprint()` with the presented key fingerprint
+- If identity resolved, return `Accept` with the `Identity` attached to the session
+- If no identity, check certificate authority: validate CA signature, expiry, principals
 - Return `Accept` or `Reject`

 **Channel handling (`channel_open_direct_tcpip`)**:
 - If the destination host starts with `alknet-`, route internally (control channel, ADR-018)
- Otherwise, connect to `host:port` (directly or via the configured outbound proxy)
+- Otherwise, evaluate `ForwardingPolicy` against the session's `Identity` (ADR-031)
+- If denied, reject the channel open
+- If allowed, connect to `host:port` (directly or via the configured outbound proxy)
 - Spawn a bidirectional proxy task between the SSH channel and the outbound TCP stream
 - Return the channel for data flow

+### Interface Abstraction
+
+SSH is one interface at Layer 2 in the three-layer model (ADR-026, [interface.md](interface.md)). The current `ServerHandler` will be refactored into `SshInterface` — it manages SSH session concerns (handshake, auth delegation, channel multiplexing). Forwarding policy, operation routing, and call protocol handling are Layer 3 concerns that live outside the interface. This refactoring is the most invasive code change in Phase 1 (integration-plan, Phase 1.8).
+
 ### Logging and Rate Limiting

 **Logging** (for fail2ban integration on Linux):
@@ -159,6 +176,25 @@ These provide abuse protection on platforms without fail2ban (macOS, Windows, BS

 ### CLI Interface

+Configuration sources (in priority order): CLI flags, environment variables, optional `--config` TOML file (ADR-030). The TOML config file is a convenience input for reproducible deployments; it does not replace `ServeOptions` (ADR-011).
+
+Multi-transport listeners use `[[listeners]]` in the TOML config (ADR-030):
+
+```toml
+[[listeners]]
+transport = "tls"
+listen = "0.0.0.0:443"
+
+[listeners.tls]
+cert = "/etc/alknet/tls/cert.pem"
+key = "/etc/alknet/tls/key.pem"
+
+[[listeners]]
+transport = "iroh"
+```
+
+Currently, the server binds to a single transport at a time. Multi-transport listening via `[[listeners]]` is specified in ADR-030 but not yet implemented.
+
 ```bash
 # Basic server (SSH on port 22)
 alknet serve --key ~/.ssh/ssh_host_ed25519_key
@@ -230,7 +266,9 @@ No listening port is needed. The server connects outbound to the iroh relay (def
 - The server does not log tunnel destinations (ADR-006). Auth events and connection events are logged for fail2ban integration (ADR-013).
 - Destination strings beginning with `alknet-` are reserved for internal use (ADR-018). The server must not attempt TCP connections to `alknet-*` destinations — these are intercepted for control channel routing.
 - One `ServerHandler` instance per connection. Handler state is not shared between connections (unless explicitly configured via `Arc` shared state for things like connection limits).
- The server binds to a single transport at a time. Running multiple transports (e.g., TCP + iroh) simultaneously requires separate processes or a future multiplexing feature.
+- The server currently binds to a single transport at a time. Multi-transport listening via `[[listeners]]` is specified in ADR-030 but not yet implemented.
+- Forwarding policy is evaluated before every channel proxy spawn. Denied channels are rejected immediately (ADR-031).
+- Auth resolves through `IdentityProvider` (ADR-029). Phase 1 uses `ConfigIdentityProvider` backed by `ArcSwap<DynamicConfig>` (ADR-030). `StorageIdentityProvider` (Phase 2+) replaces it for production deployments with SQLite.
 - ACME support requires the `acme` feature flag. Without it, only manual TLS certs are supported.
 - No password authentication over SSH channels. Key-based and cert-authority only (ADR-012).
 - Stealth mode (`--stealth`) requires TLS transport. It has no effect on TCP or iroh transports (ADR-017).
@@ -272,4 +310,16 @@ None — all resolved.
 | [013](decisions/013-fail2ban-friendly-logging.md) | Fail2ban-friendly logging | Structured auth logs + built-in rate limiting |
 | [017](decisions/017-stealth-mode-protocol-multiplexing.md) | Stealth mode | Protocol multiplexing on port 443 |
 | [018](decisions/018-control-channel-for-pubsub.md) | Control channel | Reserved `alknet-control` destination for pubsub |
-| [019](decisions/019-proxy-dual-semantics.md) | Proxy dual semantics | `--proxy` routes transport on client, data on server |
+| [019](decisions/019-proxy-dual-semantics.md) | Proxy dual semantics | `--proxy` routes transport on client, data on server |
+| [026](decisions/026-transport-interface-separation.md) | Three-layer model | SSH is Layer 2 interface, ServerHandler → SshInterface |
+| [028](decisions/028-auth-irpc-service.md) | Auth as irpc service | IdentityProvider is the contract; irpc service is one backend |
+| [029](decisions/029-identity-core-type.md) | Identity as core type | IdentityProvider trait in alknet-core |
+| [030](decisions/030-static-dynamic-config-split.md) | Static/dynamic config split | ArcSwap for dynamic config, ConfigReloadHandle |
+| [031](decisions/031-forwarding-policy.md) | Forwarding policy | Evaluated before channel proxy spawn |
+
+## References
+
+- [configuration.md](configuration.md) — DynamicConfig, ForwardingPolicy, ConfigReloadHandle
+- [identity.md](identity.md) — IdentityProvider trait, Identity struct
+- [auth.md](auth.md) — Unified auth, AuthPolicy, token auth
+- [interface.md](interface.md) — Interface trait, SshInterface, three-layer model
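
Reviewer sketch for the server.md hunks above, not part of the patch. It shows the ordering the spec describes for `channel_open_direct_tcpip`: reserved `alknet-` destinations are routed internally, every other destination is checked against the forwarding policy, and only permitted destinations reach the proxy step. `ForwardingPolicy::allow_all()` is named in the spec. The `Rules` variant, `AllowRule`, `permits`, `ChannelDecision`, `decide_channel`, and the trimmed `Identity` are hypothetical names chosen for this illustration.

```rust
/// Trimmed identity: only the fields the policy check reads.
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

/// Hypothetical rule: holders of `scope` may reach `host` on `port`
/// (`None` means any port).
pub struct AllowRule {
    pub scope: String,
    pub host: String,
    pub port: Option<u16>,
}

/// Hypothetical policy shape with a default-allow variant.
pub enum ForwardingPolicy {
    AllowAll,
    Rules(Vec<AllowRule>),
}

impl ForwardingPolicy {
    /// Default policy that preserves the pre-policy behavior.
    pub fn allow_all() -> Self {
        ForwardingPolicy::AllowAll
    }

    /// True if at least one rule matches the identity's scopes and the target.
    pub fn permits(&self, identity: &Identity, host: &str, port: u16) -> bool {
        match self {
            ForwardingPolicy::AllowAll => true,
            ForwardingPolicy::Rules(rules) => rules.iter().any(|rule| {
                identity.scopes.iter().any(|s| s == &rule.scope)
                    && rule.host == host
                    && rule.port.map_or(true, |p| p == port)
            }),
        }
    }
}

#[derive(Debug, PartialEq)]
pub enum ChannelDecision {
    /// Reserved `alknet-*` destination, handled inside the server.
    RouteInternal,
    /// Policy denied the destination, no TCP connection is attempted.
    Reject,
    /// Policy allowed the destination, the proxy task may be spawned.
    Proxy,
}

pub fn decide_channel(
    policy: &ForwardingPolicy,
    identity: &Identity,
    host: &str,
    port: u16,
) -> ChannelDecision {
    if host.starts_with("alknet-") {
        return ChannelDecision::RouteInternal;
    }
    if !policy.permits(identity, host, port) {
        return ChannelDecision::Reject;
    }
    ChannelDecision::Proxy
}
```

Returning a decision value instead of connecting inside the check keeps the policy evaluation free of I/O, so the "no TCP connection before the policy passes" invariant can be tested without a network.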
--- a/docs/architecture/services.md
+++ b/docs/architecture/services.md
@@ -20,8 +20,8 @@ last_updated: 2026-06-07
 The irpc service layer decomposes alknet's core responsibilities into
 independently testable, deployable, and replaceable components. Auth, Secret,
 Config, and Storage are irpc protocol enums that work both as in-process async
-boundaries (tokio channels) and cross-process/cross-network (QUIC streams via
-noq). OperationEnv is the universal composition mechanism that unifies local
+boundaries (tokio channels) and cross-process/cross-network (irpc over iroh
+QUIC streams). OperationEnv is the universal composition mechanism that unifies local
 dispatch, irpc service dispatch, and remote call protocol dispatch.

 ## Why
@@ -209,13 +209,10 @@ layer to be built — they are Phase 2+ concerns.
 ## Open Questions

 - **OQ-SVC-01**: Should the secret service support multiple seed phrases (one
-  per tenant)? Defer for now — one seed per node. Multi-seed can be added
-  later by indexing the `Unlock` call with a tenant ID.
+  per tenant)? See [open-questions.md](open-questions.md).

 - **OQ-SVC-02**: Should service protocols use postcard (binary) or JSON for
-  remote calls? Postcard for irpc (Rust-to-Rust, efficient). JSON for call
-  protocol (cross-language, universal). The irpc remote path naturally uses
-  postcard.
+  remote calls? See [open-questions.md](open-questions.md).

 ## Design Decisions

--- a/docs/architecture/storage.md
+++ b/docs/architecture/storage.md
@@ -197,17 +197,12 @@ dependency.
 ## Open Questions

 - **OQ-SVC-03**: How does the secret service integrate with the existing
-  `EncryptedDataSchema` from `@alkdev/storage`? The Rust implementation replaces
-  PBKDF2 password-based encryption with derived AES-256-GCM keys. The
-  `EncryptedData` format is a superset — old format can be migrated by
-  re-encrypting with the new key.
+  `EncryptedDataSchema` from `@alkdev/storage`? See [open-questions.md](open-questions.md).

- **OQ-SVC-04**: Should workers cache derived keys locally? Yes, with a TTL
-  (default: 1 hour). The head can revoke by invalidating the session.
+- **OQ-SVC-04**: Should workers cache derived keys locally? See [open-questions.md](open-questions.md).

- **OQ-SVC-05**: How does the smart contract (NFT-based ACL) interact with the
-  secret service? The Ethereum signing key (`m/44'/60'/0'/0/0`) is derived from
-  the same seed. The smart contract is a separate concern.
+- **OQ-SVC-05**: How does the NFT-based ACL smart contract interact with the
+  secret service? See [open-questions.md](open-questions.md).

 ## Design Decisions