docs: write Phase 0 architecture foundation — ADRs 026-034, spec docs, and task updates

Phase 0a — ADRs (9 new):
- ADR-026: Transport/interface separation (three-layer model)
- ADR-027: Crate decomposition (core, secret, storage, flowgraph, napi, CLI)
- ADR-028: Auth as irpc service (AuthProtocol behind feature flag)
- ADR-029: Identity as core type (Identity + IdentityProvider in alknet-core)
- ADR-030: Static/dynamic config split (ArcSwap, ConfigReloadHandle)
- ADR-031: Forwarding policy (rule-based allow/deny, TransportKind-aware)
- ADR-032: Event boundary discipline (domain, irpc, call protocol boundaries)
- ADR-033: OperationEnv universal composition (three dispatch paths)
- ADR-034: Head/worker terminology (replace hub/spoke)

Phase 0b — New spec documents (7):
- identity.md, services.md, interface.md, configuration.md, storage.md, flowgraph.md, secret-service.md

Updated existing docs:
- auth.md: reference identity.md for canonical definitions, add AuthProtocol
- open-questions.md: resolve OQ-12, OQ-16, OQ-18, OQ-22, OQ-23-25
- README.md: add all new docs, ADRs 026-034

Marked 19 architecture tasks as completed.
2026-06-07 09:32:58 +00:00
parent 84f16d66e7
commit 19b3d3a078
38 changed files with 2750 additions and 101 deletions
--- a/docs/architecture/README.md
+++ b/docs/architecture/README.md
@@ -1,16 +1,18 @@
 ---
 status: draft
-last_updated: 2026-06-04
+last_updated: 2026-06-07
 ---

 # Alknet Architecture

 ## Current State

-Architecture specification in active development. 22 ADRs accepted. Unified
-auth and call protocol architecture being specified — see [auth.md](auth.md)
-and [call-protocol.md](call-protocol.md). Configuration architecture under
-exploration — see [research/configuration.md](../research/configuration.md).
+Architecture specification in active development. Phase 0 foundation ADRs
+completed (026–034). New spec documents created for identity, services,
+interface, configuration, storage, flowgraph, and secret service. Existing
+specs updated for the three-layer model, crate decomposition, and unified
+identity. See [open-questions.md](open-questions.md) for remaining open
+questions.

 ## Architecture Documents

@@ -24,12 +26,24 @@ exploration — see [research/configuration.md](../research/configuration.md).
 | [server.md](server.md) | reviewed | Server acceptance, channel handling, proxy |
 | [tun-shim.md](tun-shim.md) | deprecated | TUN interface wrapper — **deferred**, use tun2proxy |
 | [napi-and-pubsub.md](napi-and-pubsub.md) | reviewed | NAPI wrapper and pubsub event target adapter |
+| [identity.md](identity.md) | draft | Identity type, IdentityProvider trait, auth flows |
+| [services.md](services.md) | draft | irpc service layer, OperationEnv, three dispatch paths |
+| [interface.md](interface.md) | draft | Layer 2: Interface trait, SshInterface, RawFramingInterface |
+| [configuration.md](configuration.md) | draft | StaticConfig, DynamicConfig, forwarding policy, reload |
+| [storage.md](storage.md) | draft | alknet-storage: metagraph, identity, ACL, honker |
+| [flowgraph.md](flowgraph.md) | draft | alknet-flowgraph: call graph, operation graph, petgraph |
+| [secret-service.md](secret-service.md) | draft | alknet-secret: BIP39, SLIP-0010, AES-GCM, SecretProtocol |

 ## Research Documents

 | Document | Status | Description |
 |----------|--------|-------------|
-| [configuration.md](../research/configuration.md) | draft | Configuration architecture: static/dynamic split, hot reload, forwarding policy |
+| [configuration.md](../research/configuration.md) | draft | Configuration architecture (source for promoted spec) |
+| [core.md](../research/core.md) | draft | Core overview, transport, call protocol, DNS |
+| [services.md](../research/services.md) | draft | irpc service protocols, OperationContext, application services |
+| [storage.md](../research/storage.md) | draft | Metagraph, identity, ACL, secrets, honker |
+| [flow.md](../research/flow.md) | draft | FlowGraph, operation graph, call graph, petgraph mapping |
+| [integration-plan.md](../research/integration-plan.md) | draft | Phased integration plan for services, pubsub, and operations |

 ## ADR Table

@@ -57,12 +71,24 @@ exploration — see [research/configuration.md](../research/configuration.md).
 | [023](decisions/023-unified-auth-shared-key-material.md) | Unified auth with shared key material + token auth | Accepted |
 | [024](decisions/024-bidirectional-call-protocol.md) | Bidirectional call protocol (EventEnvelope) | Accepted |
 | [025](decisions/025-handler-spec-separation.md) | Handler/spec separation for downstream service registration | Accepted |
+| [026](decisions/026-transport-interface-separation.md) | Transport/interface separation (three-layer model) | Accepted |
+| [027](decisions/027-crate-decomposition.md) | Crate decomposition (core, secret, storage, flowgraph) | Accepted |
+| [028](decisions/028-auth-irpc-service.md) | Auth as irpc service behind feature flag | Accepted |
+| [029](decisions/029-identity-core-type.md) | Identity as core type in alknet-core | Accepted |
+| [030](decisions/030-static-dynamic-config-split.md) | Static/dynamic config split with ArcSwap | Accepted |
+| [031](decisions/031-forwarding-policy.md) | Forwarding policy with rule-based allow/deny | Accepted |
+| [032](decisions/032-event-boundary-discipline.md) | Event boundary discipline (domain, irpc, call protocol) | Accepted |
+| [033](decisions/033-operationenv-irpc-call-protocol.md) | OperationEnv as universal composition mechanism | Accepted |
+| [034](decisions/034-head-worker-terminology.md) | Head/worker terminology replacing hub/spoke | Accepted |

 ## Open Questions

-Most open questions have been resolved. Open questions remain for
-configuration, auth, and call protocol — see
-[open-questions.md](open-questions.md) for details.
+See [open-questions.md](open-questions.md) for all open and resolved questions.
+Key resolved questions from Phase 0: OQ-12, OQ-16, OQ-18 (forwarding policy
+and identity scopes), OQ-17 (transport-aware auth), OQ-23 (irpc feature flag),
+OQ-24 (DNS control channel scope), OQ-25 (crate irpc dependencies). Key open
+questions: OQ-15 (QUIC coexistence), OQ-19 (WebTransport TLS), OQ-20 (worker
+registration).

 ## Lifecycle Definitions

--- a/docs/architecture/auth.md
+++ b/docs/architecture/auth.md
@@ -3,15 +3,15 @@ status: draft
 last_updated: 2026-06-07
 ---

-# Authentication & Identity
+# Authentication

 ## What

-A unified authentication and identity layer that works across all transports —
-SSH-over-any-transport and WebTransport (non-SSH HTTP-level transports). The
-same key material (Ed25519 authorized keys and certificate authorities) is
-shared across both auth paths. Identity resolution produces a transport-agnostic
-`Identity` that carries scopes and resources for downstream authorization.
+A unified authentication layer that works across all transports:
+SSH-over-any-transport and WebTransport (non-SSH HTTP-level transports). The
+same key material (Ed25519 authorized keys and certificate authorities) is
+shared across both auth paths. Identity resolution produces a transport-agnostic
+`Identity` that carries scopes and resources for downstream authorization.

 ## Why

@@ -21,8 +21,27 @@ need a different auth presentation that shares the same key material. The
 unified auth layer ensures one key set, one identity, one rotation mechanism
 across all transports. See ADR-023 for the decision context.

+The canonical definitions of `Identity` and `IdentityProvider` are in
+[identity.md](identity.md). This document covers auth-specific behavior:
+auth presentation per transport, `AuthPolicy` structure, and the auth service
+relationship.
+
 ## Architecture

+### Identity and IdentityProvider
+
+See [identity.md](identity.md) for the canonical definitions of:
+- `Identity` struct (`{ id, scopes, resources }`)
+- `IdentityProvider` trait (`resolve_from_fingerprint()`, `resolve_from_token()`)
+- `ConfigIdentityProvider` (default, ArcSwap-backed)
+- `StorageIdentityProvider` (production, SQLite-backed, in alknet-storage)
+- `AuthProtocol` irpc service (behind `irpc` feature flag)
+
+The key relationship: `IdentityProvider` is the contract. `ConfigIdentityProvider`
+is the default implementation (reads from `DynamicConfig.auth`). The
+`AuthProtocol` irpc service is one way to satisfy the trait, behind a feature
+flag. Both paths produce the same `Identity` result. See ADR-028 and ADR-029.
+
 ### Auth Presentation Per Transport

 | Transport | Auth presentation | Verification |
@@ -72,44 +91,23 @@ V1 uses timestamp-only (±300s window, no server state). The replay trade-offs
 and future zero-replay options (nonce challenge-response) are documented in
 ADR-023.
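The timestamp-only check can be sketched as a stateless comparison. This is a minimal sketch rather than the alknet implementation: the function name `within_replay_window`, the use of Unix seconds, and the inclusive boundary are assumptions.

```rust
/// Half-width of the acceptance window in seconds (the ±300s from the text).
const REPLAY_WINDOW_SECS: u64 = 300;

/// Returns true when the token timestamp lies within ±REPLAY_WINDOW_SECS of
/// the server clock. Both values are Unix seconds. Treating the boundary as
/// inclusive is an assumption of this sketch.
pub fn within_replay_window(token_ts: u64, server_now: u64) -> bool {
    // abs_diff avoids underflow regardless of which clock is ahead.
    token_ts.abs_diff(server_now) <= REPLAY_WINDOW_SECS
}
```

Because the check keeps no server state, a captured token stays replayable until it ages out of the window, which is the trade-off ADR-023 documents.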

-### IdentityProvider Trait
+### IdentityProvider and Auth Service Relationship

-The `IdentityProvider` trait decouples alknet-core from any specific identity
-storage. It resolves a key fingerprint or auth token to an `Identity` with
-scopes and resources.
+The `IdentityProvider` trait (defined in [identity.md](identity.md)) decouples
+alknet-core from any specific identity storage. Two implementations exist:

-```rust
-pub trait IdentityProvider: Send + Sync + 'static {
-    /// Resolve an SSH public key fingerprint to an identity.
-    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
+- **ConfigIdentityProvider** (in alknet-core) — reads from
+  `ArcSwap<DynamicConfig.auth>`. Every authorized key gets a default scope set.
+  No database required. This is the default for minimal deployments.

-    /// Resolve an auth token to an identity.
-    /// Returns None if the token is invalid, expired, or the key is not authorized.
-    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
-}
+- **StorageIdentityProvider** (in alknet-storage) — backed by SQLite
+  `peer_credentials` and `api_keys` tables plus the ACL graph. Resolves
+  fingerprint → account → organization membership → effective scopes.

-pub struct Identity {
-    pub id: String,                              // Unique identifier — fingerprint (config) or account UUID (database)
-    pub scopes: Vec<String>,                     // e.g., ["relay:connect", "service:gitea:read"]
-    pub resources: HashMap<String, Vec<String>>,  // e.g., {"service": ["gitea", "registry"]}
-}
-```
-
-> **Note on identity models**: Earlier research used `{node_id, fingerprint, scopes}`.
-> The unified model uses `{id, scopes, resources}` where `id` serves as both
-> fingerprint (for key-based auth from config) and account UUID (for
-> database-backed auth). The `resources` field provides resource-level
-> authorization beyond what scopes offer. This is the canonical definition
-> that all components should use.
-```
-
-**Default implementation**: `ConfigIdentityProvider` loads from
-`DynamicConfig.auth` (the `authorized_keys` set). Every authorized key gets a
-default scope set. No database required.
-
-**Head implementation**: Backed by `@alkdev/storage`'s `peer_credentials` and
-`accounts` tables plus the ACL graph. Resolves fingerprint → account →
-organization membership → effective scopes. Uses `ArcSwap` for hot reload.
+The `AuthProtocol` irpc service (behind the `irpc` feature flag, per ADR-028)
+provides an async boundary for auth verification. It is one way to satisfy the
+`IdentityProvider` trait, not a replacement for it. Both the trait path and the
+irpc path produce the same `Identity` result.

 The trait is the contract. The backing store is pluggable. Alknet-core never
 depends on Honker, SQLite, or any specific database.
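The "trait is the contract" relationship can be sketched with a config-backed provider. The trait methods and the `Identity` fields follow the definitions referenced above; the `AuthToken` fields, the constructor, and the `scopes_for` helper are hypothetical, and the `ArcSwap` backing and token verification are omitted.

```rust
use std::collections::{HashMap, HashSet};

/// Mirrors the `{ id, scopes, resources }` shape described in the text.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

/// Hypothetical placeholder for the real token type.
pub struct AuthToken {
    pub fingerprint: String,
}

pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}

/// Config-backed provider: every authorized key gets the same default scopes.
pub struct ConfigIdentityProvider {
    authorized: HashSet<String>,
    default_scopes: Vec<String>,
}

impl ConfigIdentityProvider {
    pub fn new(authorized: &[&str], default_scopes: &[&str]) -> ConfigIdentityProvider {
        ConfigIdentityProvider {
            authorized: authorized.iter().map(|k| k.to_string()).collect(),
            default_scopes: default_scopes.iter().map(|s| s.to_string()).collect(),
        }
    }
}

impl IdentityProvider for ConfigIdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        if !self.authorized.contains(fingerprint) {
            return None;
        }
        Some(Identity {
            id: fingerprint.to_string(),
            scopes: self.default_scopes.clone(),
            resources: HashMap::new(),
        })
    }

    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity> {
        // Signature and timestamp verification are omitted in this sketch.
        self.resolve_from_fingerprint(&token.fingerprint)
    }
}

/// Callers depend only on the trait, so the backing store stays pluggable.
pub fn scopes_for(provider: &dyn IdentityProvider, fingerprint: &str) -> Vec<String> {
    provider
        .resolve_from_fingerprint(fingerprint)
        .map(|identity| identity.scopes)
        .unwrap_or_default()
}
```

A storage-backed provider would implement the same two methods, so `scopes_for` and any other caller would not change.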
@@ -240,13 +238,13 @@ security consideration:

 ## Open Questions

-- **OQ-18**: Should `Identity.scopes` be populated from `ForwardingPolicy`
-  rules, from an external `IdentityProvider`, or from both? See
-  [open-questions.md](open-questions.md).
+- **OQ-18**: ~~Source of Identity.scopes~~ Resolved per ADR-029 and ADR-031.
+  `IdentityProvider` owns scopes, `ForwardingPolicy` uses scopes from `Identity`.
+  See [open-questions.md](open-questions.md).

 - **OQ-19**: Should the WebTransport listener require its own TLS identity
  (separate from the SSH-over-TLS listener), or can they share the same
-  certificate? See [open-questions.md](open-questions.md).
+  certificate? Deferred to Phase 4. See [open-questions.md](open-questions.md).

 ## Design Decisions

@@ -254,16 +252,16 @@ security consideration:
 |-----|----------|---------|
 | [012](decisions/012-auth-ed25519-and-cert-authority.md) | Ed25519 + cert-authority | Key-based auth, no passwords |
 | [023](decisions/023-unified-auth-shared-key-material.md) | Unified auth, shared key material | Same keys for SSH and token auth |
+| [028](decisions/028-auth-irpc-service.md) | Auth as irpc service | AuthProtocol behind feature flag; IdentityProvider is the contract |
+| [029](decisions/029-identity-core-type.md) | Identity as core type | `Identity` and `IdentityProvider` in alknet-core |

 ## References

+- [identity.md](identity.md) — Canonical Identity and IdentityProvider definitions
 - [server.md](server.md) — Current SSH auth handler
 - [transport.md](transport.md) — Transport abstraction
-- [configuration.md](../research/configuration.md) — DynamicConfig, AuthPolicy structure
-- [open-questions.md](open-questions.md) — OQ-17 (resolved), OQ-18, OQ-19
-- `server/handler.rs` — Current `auth_publickey()` callback
-- `auth/server_auth.rs` — Current `ServerAuthConfig` struct
-- `auth/keys.rs` — `KeySource` and key loading
+- [configuration.md](configuration.md) — DynamicConfig, AuthPolicy, ConfigReloadHandle
+- [services.md](services.md) — AuthProtocol irpc service
+- [open-questions.md](open-questions.md) — OQ-17 (resolved), OQ-18 (resolved), OQ-19
 - [wtransport](https://github.com/BiagioFesta/wtransport) — Rust WebTransport library
-- [WebTransport W3C Spec](https://www.w3.org/TR/webtransport/) — Browser API
-- [@alkdev/storage](/workspace/@alkdev/storage) — `peer_credentials` table, ACL graph
+- [WebTransport W3C Spec](https://www.w3.org/TR/webtransport/) — Browser API
--- a/docs/architecture/configuration.md
+++ b/docs/architecture/configuration.md
@@ -0,0 +1,192 @@
+---
+status: draft
+last_updated: 2026-06-07
+---
+
+# Configuration
+
+## What
+
+Alknet's configuration is split into `StaticConfig` (immutable after startup) and
+`DynamicConfig` (hot-reloadable at runtime), with `ArcSwap` providing lock-free
+reads on the hot path. `ConfigService` wraps reloads behind an irpc protocol
+for production deployments.
+
+## Why
+
+Three specific failures motivated the split (ADR-030):
+
+1. No hot reload of authentication credentials — adding a key requires a restart.
+2. No port forwarding access control — any authenticated client has unrestricted
+   access (ADR-031).
+3. No structured configuration beyond CLI flags — operators need config files
+   and the NAPI layer needs programmatic reload.
+
+The split is clean: anything that affects SSH handshake or socket binding is
+static; anything checked per-connection or per-channel is dynamic.
+
+## Architecture
+
+### StaticConfig
+
+Immutable after startup. Constructed from `ServeOptions` (the builder pattern
+is preserved per ADR-011). Contains:
+
+- Transport mode, listen address
+- TLS config (cert, key)
+- iroh config (relay URL)
+- Stealth mode flag
+- Host key, host key algorithm
+- Max auth attempts, max connections per IP
+- Proxy config
+
+Changing any of these requires a restart.
+
+### DynamicConfig
+
+Hot-reloadable at runtime via `ArcSwap<DynamicConfig>`. Contains:
+
+- `AuthPolicy` — authorized keys, certificate authorities, token config
+- `ForwardingPolicy` — allow/deny rules for channel targets (ADR-031)
+- `RateLimitConfig` — rate limiting parameters
+
+`ArcSwap` provides lock-free reads. Every `auth_publickey()` and
+`channel_open_direct_tcpip()` call performs one lock-free `load()` followed by
+an `Arc` dereference, so reads on the hot path never block. Writes are atomic:
+`store()` swaps the pointer.
+
+### ConfigReloadHandle
+
+```rust
+pub struct ConfigReloadHandle {
+    dynamic: Arc<ArcSwap<DynamicConfig>>,
+}
+
+impl ConfigReloadHandle {
+    pub fn reload(&self, new_config: DynamicConfig) { ... }
+}
+```
+
+Obtained from `Server::run()`. Passed to NAPI or CLI for explicit reload.
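The snapshot semantics of a reload can be sketched with the standard library alone. Loud assumption: `SwapCell` below is a hypothetical stand-in built on `RwLock<Arc<_>>`, which takes a brief lock, whereas the design uses the lock-free `ArcSwap`. The single-field `DynamicConfig` and the `new` constructors are also simplifications for illustration.

```rust
use std::sync::{Arc, RwLock};

/// Simplified stand-in: the real DynamicConfig carries AuthPolicy,
/// ForwardingPolicy, and RateLimitConfig.
#[derive(Debug, Clone, PartialEq)]
pub struct DynamicConfig {
    pub authorized_keys: Vec<String>,
}

/// std-only substitute for `ArcSwap<DynamicConfig>` (assumption of this sketch).
pub struct SwapCell {
    current: RwLock<Arc<DynamicConfig>>,
}

impl SwapCell {
    pub fn new(config: DynamicConfig) -> SwapCell {
        SwapCell { current: RwLock::new(Arc::new(config)) }
    }

    /// Returns a snapshot by cloning the Arc pointer, not the config itself.
    /// Panics if a writer panicked while holding the lock.
    pub fn load(&self) -> Arc<DynamicConfig> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Replaces the pointer; snapshots handed out earlier are unaffected.
    pub fn store(&self, config: DynamicConfig) {
        *self.current.write().unwrap() = Arc::new(config);
    }
}

pub struct ConfigReloadHandle {
    dynamic: Arc<SwapCell>,
}

impl ConfigReloadHandle {
    pub fn new(dynamic: Arc<SwapCell>) -> ConfigReloadHandle {
        ConfigReloadHandle { dynamic }
    }

    pub fn reload(&self, new_config: DynamicConfig) {
        self.dynamic.store(new_config);
    }
}
```

A connection that took a snapshot before `reload()` keeps reading the old config, while later `load()` calls observe the new one.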
+
+### ConfigService irpc Service
+
+```rust
+enum ConfigProtocol {
+    GetForwardingPolicy,
+    GetRateLimits,
+    ReloadForwarding { policy: ForwardingPolicy },
+    ReloadRateLimits { limits: RateLimitConfig },
+}
+```
+
+Behind the `irpc` feature flag. For production deployments that use the service
+layer. For minimal deployments, direct `ConfigReloadHandle::reload()` is
+sufficient.
+
+### ForwardingPolicy
+
+Part of `DynamicConfig` (ADR-031). Evaluated on every channel open and matched
+against the authenticated `Identity`. Rules are evaluated in order; the first
+match wins. If no rule matches, `default` determines the action.
+
+```rust
+pub struct ForwardingPolicy {
+    pub default: ForwardingAction,
+    pub rules: Vec<ForwardingRule>,
+}
+```
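First-match-wins evaluation with a default fallback can be sketched as follows. The `ForwardingRule` shape and the `host:port` wildcard matching are hypothetical, modeled on the `target`/`action` keys in the TOML example. Matching against `Identity` scopes and `TransportKind` is omitted.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ForwardingAction {
    Allow,
    Deny,
}

/// Hypothetical rule shape for this sketch.
pub struct ForwardingRule {
    /// "host:port" pattern; either side may be the wildcard "*".
    pub target: String,
    pub action: ForwardingAction,
}

pub struct ForwardingPolicy {
    pub default: ForwardingAction,
    pub rules: Vec<ForwardingRule>,
}

/// Returns true when `pattern` ("host:port", "*" allowed on either side)
/// matches the requested target.
fn target_matches(pattern: &str, host: &str, port: u16) -> bool {
    match pattern.rsplit_once(':') {
        Some((pat_host, pat_port)) => {
            let host_ok = pat_host == "*" || pat_host.eq_ignore_ascii_case(host);
            let port_ok = pat_port == "*" || pat_port.parse::<u16>().ok() == Some(port);
            host_ok && port_ok
        }
        // A pattern without a port separator never matches in this sketch.
        None => false,
    }
}

impl ForwardingPolicy {
    /// Walks the rules in order and returns the first matching action,
    /// falling back to `default` when nothing matches.
    pub fn evaluate(&self, host: &str, port: u16) -> ForwardingAction {
        self.rules
            .iter()
            .find(|rule| target_matches(&rule.target, host, port))
            .map(|rule| rule.action)
            .unwrap_or(self.default)
    }
}
```

Because evaluation stops at the first match, a broad rule placed before a narrow one shadows it, so rule order is part of the policy.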
+
+### TOML Config File
+
+Optional convenience input format (amends ADR-011, does not replace
+programmatic API). Covers static config plus initial auth/forwarding paths.
+
+```toml
+[server]
+transport = "tls"
+listen = "0.0.0.0:443"
+
+[auth]
+host_key = "/etc/alknet/ssh/host_key"
+
+[forwarding]
+default = "deny"
+
+[[forwarding.rules]]
+target = "localhost:*"
+action = "allow"
+```
+
+### NAPI Reload API
+
+```typescript
+interface AlknetServer {
+  reloadAuth(auth: { authorizedKeys?: Buffer, certAuthority?: Buffer }): void;
+  reloadForwarding(policy: ForwardingPolicyConfig): void;
+  reloadAll(config: DynamicConfig): void;
+}
+```
+
+### Multi-Transport Listeners
+
+A head node may accept connections on multiple transports simultaneously. The
+architecture supports `Vec<ListenerConfig>` instead of a single
+`ServeTransportMode`. `Server::run()` spawns one accept loop per listener,
+sharing `DynamicConfig`, `ConnectionRateLimiter`, sessions, and shutdown signal.
+
+```toml
+[[listeners]]
+transport = "tls"
+listen = "0.0.0.0:443"
+stealth = true
+
+[[listeners]]
+transport = "tcp"
+listen = "0.0.0.0:22"
+
+[[listeners]]
+transport = "iroh"
+iroh_relay = "https://relay.alk.dev"
+```
+
+### CLI vs Programmatic Behavior
+
+| Interface | Static config | Dynamic config | Reload mechanism |
+|-----------|--------------|----------------|------------------|
+| CLI | Flags + optional `--config` file | Loaded at startup from `--authorized-keys` | None (restart to change) |
+| Core Rust | `StaticConfig` struct | `AuthService` (irpc) or `ArcSwap<DynamicConfig>` (minimal) | `ConfigService::reload()` or `ConfigReloadHandle::reload()` |
+| NAPI | `serve()` options | Same | `server.reloadAuth()`, `server.reloadForwarding()` |
+
+## Constraints
+
+- `StaticConfig` cannot be changed after startup. Changing transport mode,
+  listen address, TLS config, or host key requires a restart.
+- `DynamicConfig` is reloaded atomically via `ArcSwap`. Existing connections
+  continue with their current config; new connections get the new config.
+- Config file is optional. `ServeOptions` builder pattern remains the primary
+  API (amends ADR-011, does not supersede it).
+- No file watching (OQ-13 resolved: potential attack vector, unnecessary
+  complexity).
+- Client configuration stays as `ConnectOptions` — no `ArcSwap` needed.
+
+## Open Questions
+
+- None. All configuration-related questions are resolved per ADR-030, ADR-031,
+  and the resolved OQs in [open-questions.md](open-questions.md).
+
+## Design Decisions
+
+| ADR | Decision | Summary |
+|-----|----------|---------|
+| [030](decisions/030-static-dynamic-config-split.md) | Static/dynamic config split | Immutable transport vs. reloadable auth/forwarding |
+| [011](decisions/011-no-ssh-config-programmatic-api.md) | Programmatic-first API | Amended, not superseded — TOML is convenience layer |
+| [031](decisions/031-forwarding-policy.md) | Forwarding policy | Rule-based allow/deny, TransportKind-aware |
+| [029](decisions/029-identity-core-type.md) | Identity as core type | DynamicConfig.auth consumed by IdentityProvider |
+| [028](decisions/028-auth-irpc-service.md) | Auth as irpc service | ConfigService wraps DynamicConfig reloads |
+
+## References
+
+- [research/configuration.md](../research/configuration.md) — Full analysis and proposed solution
+- [identity.md](identity.md) — IdentityProvider trait, DynamicConfig.auth
+- [ADR-013](decisions/013-fail2ban-friendly-logging.md) — Rate limiting parameters
--- a/docs/architecture/decisions/026-transport-interface-separation.md
+++ b/docs/architecture/decisions/026-transport-interface-separation.md
@@ -0,0 +1,162 @@
+# ADR-026: Transport/Interface Separation (Three-Layer Model)
+
+## Status
+
+Accepted
+
+## Context
+
+In the current architecture, SSH is deeply embedded in the server handler. The
+`ServerHandler` owns auth, channel management, and proxy logic — all mixed
+together. This makes it impossible to run the call protocol over any transport
+that doesn't speak SSH, such as:
+
+- **DNS** — encoding call protocol frames as DNS TXT queries/responses for
+  censorship resistance
+- **Raw framing** — 4-byte length prefix + JSON `EventEnvelope` without SSH
+  wrapping, for local service mesh or browser-to-head direct communication
+- **WebTransport** — running call protocol over QUIC streams (browsers can't do
+  SSH key exchange)
+
+The DNS control channel concept from research (`core.md`) currently conflates
+"DNS as a transport that moves bytes" with "SSH sessions over those bytes." But
+SSH is not a transport — it's a protocol layer that sits *on top of* a
+transport. Separating them enables the DNS control channel to carry call
+protocol events directly, without wrapping SSH inside DNS queries.
+
+The same separation enables raw framing (no SSH overhead) for trusted local
+networks, and WebTransport direct call protocol for browser clients.
+
+## Decision
+
+**Establish a three-layer model:**
+
+### Layer 1: Transport
+
+Produces byte streams. A `Transport` still produces
+`AsyncRead + AsyncWrite + Unpin + Send`. This layer is unchanged from ADR-001.
+
+```rust
+#[async_trait]
+pub trait Transport: Send + Sync + 'static {
+    type Stream: AsyncRead + AsyncWrite + Unpin + Send + 'static;
+    async fn connect(&self) -> Result<Self::Stream>;
+    fn describe(&self) -> String;
+}
+```
+
+Transports: TCP, TLS, iroh, DNS (as byte carrier), WebTransport (future).
+
+### Layer 2: Interface
+
+Consumes a `Transport::Stream` and produces call protocol sessions. An
+interface is what SSH currently does: wrap a byte stream in session semantics.
+
+```rust
+#[async_trait]
+pub trait Interface: Send + Sync + 'static {
+    type Session;
+    async fn accept(&self, stream: TransportStream, config: &InterfaceConfig) -> Result<Self::Session>;
+}
+```
+
+Interfaces:
+
+- **SSH interface** — wraps existing `ServerHandler` logic. SSH handshake, auth,
+  channel multiplexing. The call protocol runs over a reserved SSH channel
+  (`alknet-control:0`).
+- **Raw framing interface** — 4-byte big-endian length prefix + JSON
+  `EventEnvelope`. No SSH overhead. Direct call protocol over the transport
+  stream.
+- **DNS control channel** — a (DNS transport, raw framing interface) pair that
+  encodes/decodes `EventEnvelope` frames as DNS query/response pairs.
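The raw framing described above (4-byte big-endian length prefix followed by the JSON body) can be sketched over any blocking `Read`/`Write` stream. This is a synchronous, std-only sketch; the function names and the `MAX_FRAME_LEN` limit are assumptions, and the payload is treated as opaque bytes rather than a parsed `EventEnvelope`.

```rust
use std::io::{self, Read, Write};

/// Upper bound on a frame body; a hypothetical limit for this sketch.
const MAX_FRAME_LEN: usize = 1024 * 1024;

/// Writes one frame: 4-byte big-endian length, then the payload bytes.
pub fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    let len = payload.len() as u32; // safe: bounded by MAX_FRAME_LEN above
    writer.write_all(&len.to_be_bytes())?;
    writer.write_all(payload)
}

/// Reads one frame and returns its payload bytes.
pub fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut header = [0u8; 4];
    reader.read_exact(&mut header)?;
    let len = u32::from_be_bytes(header) as usize;
    // Reject oversized lengths before allocating the buffer.
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking the length before allocating keeps a malformed or hostile prefix from forcing a multi-gigabyte allocation.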
+
+### Layer 3: Protocol
+
+Carries semantics. Call protocol events, operation registry, service calls.
+The protocol is agnostic to both the transport and the interface below it. It
+receives `EventEnvelope` frames from whatever interface produced them.
+
+### Connection Model
+
+A **connection** is always a (Transport, Interface) pair. The valid combinations are enumerated:
+
+| Transport | Interface | Use case |
+|-----------|-----------|----------|
+| TLS | SSH | Standard alknet tunnel |
+| TCP | SSH | Plain SSH tunnel |
+| iroh | SSH | P2P SSH tunnel |
+| DNS | raw framing | DNS control channel |
+| WebTransport | SSH | Browser SSH tunnel (future) |
+| WebTransport | raw framing | Browser call protocol (future) |
+| TCP | raw framing | Direct call protocol, local mesh |
+
+**The DNS control channel carries call protocol frames directly — it does NOT
+wrap SSH inside DNS.** This is explicit because the research originally
+conflated "SSH tunneling over DNS" with "DNS as a transport for call protocol."
+The (DNS, raw framing) pair sends `EventEnvelope` frames as DNS TXT
+queries/responses — no SSH involved.
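The enumerated (Transport, Interface) pairs can be expressed as a validation function. The payload-free tag enums and the name `is_valid_connection` are hypothetical, and treating every pair absent from the table as invalid is an assumption of this sketch.

```rust
/// Payload-free tags, simplified from `TransportKind` for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TransportTag {
    Tcp,
    Tls,
    Iroh,
    Dns,
    WebTransport,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum InterfaceTag {
    Ssh,
    RawFraming,
}

/// Returns true only for the pairs listed in the connection model table.
pub fn is_valid_connection(transport: TransportTag, interface: InterfaceTag) -> bool {
    match (transport, interface) {
        (TransportTag::Tls, InterfaceTag::Ssh) => true,
        (TransportTag::Tcp, InterfaceTag::Ssh) => true,
        (TransportTag::Iroh, InterfaceTag::Ssh) => true,
        (TransportTag::Dns, InterfaceTag::RawFraming) => true,
        (TransportTag::WebTransport, InterfaceTag::Ssh) => true,
        (TransportTag::WebTransport, InterfaceTag::RawFraming) => true,
        (TransportTag::Tcp, InterfaceTag::RawFraming) => true,
        // Everything else, including SSH wrapped inside DNS, is rejected.
        _ => false,
    }
}
```

Rejecting `(Dns, Ssh)` at configuration time makes the "no SSH inside DNS" rule a checked invariant rather than a convention.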
+
+### `TransportKind` Enum
+
+The `TransportKind` enum (currently `Tcp | Tls | Iroh`) gains `Dns` and
+`WebTransport` variants. Initially these are tags only — no acceptor
+implementation. The full DNS and WebTransport implementations are Phase 4 work
+per the integration plan.
+
+```rust
+pub enum TransportKind {
+    Tcp,
+    Tls { server_name: Option<String> },
+    Iroh { endpoint_id: String },
+    Dns { domain: String },
+    WebTransport { host: String },
+}
+```
+
+### ServerHandler Refactor
+
+The existing `ServerHandler` is refactored into `SshInterface`. The interface
+abstraction means the server's accept loop becomes:
+
+```rust
+// Pseudocode
+let (transport, interface) = listener_config;
+let stream = transport.accept().await?;
+let session = interface.accept(stream, &config).await?;
+// session produces call protocol events
+```
+
+The call protocol handler is interface-agnostic — it receives `EventEnvelope`
+frames from any interface. Auth, forwarding policy, and operation routing happen
+at Layer 3, not inside the SSH handler.
+
+## Consequences
+
+- **Positive**: Enables DNS control channel without SSH wrapping. The (DNS,
+  raw framing) pair is a clean (Transport, Interface) combination.
+- **Positive**: Enables raw framing for local service mesh. No SSH overhead for
+  trusted networks.
+- **Positive**: SSH becomes pluggable. The same call protocol handler works with
+  any interface.
+- **Positive**: `ServerHandler` is refactored into `SshInterface` — a smaller,
+  more focused component that only handles SSH session management.
+- **Positive**: Future WebTransport and WebSocket interfaces are additive — they
+  implement the `Interface` trait without touching SSH code.
+- **Negative**: This is the most invasive code change in Phase 1
+  (integration-plan, Phase 1.8). SSH auth, channel management, and proxy logic
+  are currently tangled in `ServerHandler`. Extracting them requires careful
+  refactoring to maintain existing behavior.
+- **Negative**: The `Interface` trait is new and untested. The design must
+  accommodate both SSH's channel multiplexing and raw framing's single-stream
+  model through the same abstraction.
+
+## References
+
+- [research/core.md](../../research/core.md) — Transport layer, DNS transport section
+- [research/integration-plan.md](../../research/integration-plan.md) — Phase 1.8, three-layer model
+- [transport.md](../transport.md) — Current Transport trait (unchanged at Layer 1)
+- [server.md](../server.md) — Current ServerHandler (will become SshInterface)
+- [ADR-001](001-pluggable-transport.md) — Transport trait produces stream (unchanged)
+- [ADR-004](004-ssh-over-transport.md) — SSH runs over transport (reinforced by Layer 2)
+- [ADR-024](024-bidirectional-call-protocol.md) — Bidirectional call protocol (Layer 3)
--- a/docs/architecture/decisions/027-crate-decomposition.md
+++ b/docs/architecture/decisions/027-crate-decomposition.md
@@ -0,0 +1,150 @@
+# ADR-027: Crate Decomposition
+
+## Status
+
+Accepted
+
+## Context
+
+alknet-core currently contains everything: transport, SSH, auth, config, the
+call protocol handler, and the server accept loop. As the project grows to
+include SQLite-backed identity, HD key derivation, and metagraph storage, core
+would need to depend on rusqlite, bip39, petgraph, and other heavy dependencies
+— unacceptable for a library crate that CLI users embed.
+
+Different deployment topologies need different subsets:
+- A minimal CLI tunnel only needs core, transport, and auth types
+- A head node needs SQLite-backed identity and the secret service
+- A flowgraph visualization tool only needs petgraph operations
+
+Circular dependencies must be avoided. alknet-storage implements
+alknet-core's `IdentityProvider` trait, so alknet-core cannot depend on
+alknet-storage. alknet-storage references alknet-secret's `EncryptedData` wire
+format, but not as a crate dependency.
+
+## Decision
+
+**Decompose the project into six crates with a strict acyclic dependency graph.**
+
+### Crate Structure
+
+1. **alknet-core** — Transport, SSH, call protocol, config, auth types, identity,
+   `OperationSpec`, `Interface` trait. The foundational crate whose types
+   everything else builds on (in some cases through type compatibility rather
+   than a crate dependency).
+   - *Depends on*: russh, tokio, irpc (feature-gated), serde, arc-swap
+   - *Does NOT depend on*: alknet-secret, alknet-storage, alknet-flowgraph
+
+2. **alknet-secret** — BIP39 mnemonic generation, SLIP-0010 Ed25519 HD key
+   derivation, AES-256-GCM encryption, `SecretProtocol` irpc service.
+   - *Depends on*: bip39, ed25519-bip32 (or rust-bip32-ed25519), aes-gcm, sha2,
+     irpc
+   - *Does NOT depend on*: alknet-core, alknet-storage
+
+3. **alknet-storage** — SQLite-backed metagraph, identity tables, ACL graph,
+   honker integration, `StorageProtocol` irpc service.
+   - *Depends on*: rusqlite (via honker), honker, petgraph, jsonschema, irpc
+   - *Does NOT depend on alknet-core* (but provides an implementation of
+     alknet-core's `IdentityProvider` trait through optional interop rather
+     than a hard crate dep; see below)
+   - *Does NOT depend on alknet-secret* (but references `EncryptedData` type
+     format for wire compatibility)
+
+4. **alknet-flowgraph** — `FlowGraph<N,E>` over petgraph, operation graph, call
+   graph, type compatibility checking.
+   - *Depends on*: petgraph, serde, jsonschema, thiserror
+   - *Does NOT depend on*: alknet-core, alknet-storage, alknet-secret
+
+5. **alknet-napi** — Node.js native addon. Exposes alknet-core to Node.js.
+   - *Depends on*: alknet-core
+   - *Does NOT depend on*: alknet-secret, alknet-storage, alknet-flowgraph
+
+6. **alknet** (CLI binary) — Assembles everything.
+   - *Depends on*: alknet-core, alknet-secret (feature), alknet-storage (feature),
+     alknet-flowgraph (feature), toml
+
+### Dependency Graph
+
+```
+                  alknet-secret
+                 /             \
+                /               \
+alknet-core ←────                ←── alknet-storage
+     ↑               \           /
+     │                alknet-flowgraph
+     │
+alknet-napi
+alknet (CLI binary — assembles everything)
+```
+
+### Narrow Interface Points
+
+Three types serve as the narrow interface points between crates:
+
+1. **`Identity`** — Defined in `alknet_core::auth`. Used by auth handler,
+   forwarding policy, and call protocol. alknet-storage implements
+   `IdentityProvider` to produce instances.
+
+2. **`IdentityProvider`** — Trait defined in `alknet_core::auth`. Implemented by
+   `ConfigIdentityProvider` (in core) and `StorageIdentityProvider` (in
+   alknet-storage). The CLI/NAPI layer wires the concrete implementation.
+
+3. **`OperationSpec`** — Defined in `alknet_core::call`. Used by the operation
+   registry and by alknet-flowgraph for type compatibility checking. The bridge
+   is serialization — flowgraph serializes to JSON, storage persists it.
+
+### irpc Feature Flag
+
+irpc is a feature flag in alknet-core. When disabled, auth and config go through
+`IdentityProvider` and `ConfigReloadHandle` directly — no irpc overhead. Nodes
+that only do SSH tunneling don't need the service layer.
+
+In alknet-secret and alknet-storage, irpc is an independent dependency, not
+feature-gated. These crates always define irpc service protocols because they
+are used in production deployments where the service layer is active.
+
+### alknet-storage's Relationship to alknet-core
+
+alknet-storage does NOT depend on alknet-core as a crate. Instead:
+
+- alknet-storage defines its own `IdentityProvider` impl that matches
+  alknet-core's trait signature. The trait is re-exported or defined locally
+  with `#[cfg(feature = "alknet-core")]` interop.
+- In practice, the CLI binary crate depends on both and wires them together.
+  alknet-storage provides `StorageIdentityProvider`; alknet-core takes
+  `impl IdentityProvider`.
+
+### alknet-storage's Relationship to alknet-secret
+
+alknet-storage does NOT depend on alknet-secret as a crate. Instead:
+
+- alknet-storage and alknet-secret share the `EncryptedData` wire format (key
+  version, salt, IV, ciphertext). This is a type-level compatibility, not a
+  crate dependency.
+- alknet-secret encrypts; alknet-storage stores the encrypted blob in a
+  `SecretNode` in the metagraph. The bridge is serialization.
+
+## Consequences
+
+- **Positive**: Core is lean. No database, no crypto, no petgraph. CLI users
+  get a small binary.
+- **Positive**: Services are pluggable. alknet-secret and alknet-storage can be
+  swapped for alternative implementations.
+- **Positive**: No circular dependencies. The dependency graph is a DAG.
+- **Positive**: Deployment topology determines which crates to include. A CLI
+  tunnel uses only alknet-core. A head node uses everything.
+- **Positive**: irpc is feature-gated in core. Minimal deployments don't pay for
+  service layer overhead.
+- **Negative**: `IdentityProvider` trait interop between alknet-core and
+  alknet-storage requires careful versioning. If the trait signature changes,
+  both crates must update.
+- **Negative**: `EncryptedData` wire format compatibility between alknet-secret
+  and alknet-storage is implicit (not enforced by the type system). A shared
+  types crate could be extracted if needed, but adds another crate dependency.
+
+## References
+
+- [research/integration-plan.md](../../research/integration-plan.md) — Phase 2, dependency graph
+- [research/core.md](../../research/core.md) — alknet-core contents
+- [research/services.md](../../research/services.md) — Service protocols
+- [research/storage.md](../../research/storage.md) — alknet-storage contents
+- [research/flow.md](../../research/flow.md) — alknet-flowgraph contents
+- [ADR-029](029-identity-core-type.md) — Identity as core type (narrow interface point)
--- a/docs/architecture/decisions/028-auth-irpc-service.md
+++ b/docs/architecture/decisions/028-auth-irpc-service.md
@@ -0,0 +1,146 @@
+# ADR-028: Auth as irpc Service
+
+## Status
+
+Accepted
+
+## Context
+
+For head nodes serving many users, in-memory key lookup via `ArcSwap<DynamicConfig>`
+doesn't scale. Loading all authorized keys into RAM and atomic-swapping the
+entire set on each reload works for small deployments but requires holding every
+key in memory. For production deployments with hundreds or thousands of users,
+auth verification should query a database on demand rather than holding all keys
+in memory.
+
+The current `ArcSwap<DynamicConfig>` approach works for CLI and single-node
+setups. What's needed is an async boundary that allows auth verification to go
+through a service — locally via channels for minimal deployments, or via irpc
+for production deployments where auth runs on a separate process or node.
+
+The critical design point: callers go through the `IdentityProvider` trait
+(ADR-029). The irpc service is one way to satisfy the trait. Both paths produce
+the same result — an `Identity` or rejection. The trait is the contract; the
+service is an implementation path.
+
+## Decision
+
+**Auth verification is provided via an irpc service protocol, with
+`IdentityProvider` as the interface contract and `ConfigIdentityProvider`
+(ArcSwap-backed) as the default implementation.**
+
+### IdentityProvider Trait (ADR-029) — The Contract
+
+Callers depend on `IdentityProvider`, not on any concrete implementation:
+
+```rust
+pub trait IdentityProvider: Send + Sync + 'static {
+    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
+    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
+}
+```
+
+### ConfigIdentityProvider — Default Implementation
+
+Reads from the `auth` section of `ArcSwap<DynamicConfig>`. No database needed.
+Every authorized key gets a default scope set. This is the default for CLI and
+single-node deployments.
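+
+A minimal sketch of a config-backed provider, assuming a fixed in-memory key
+list in place of the `ArcSwap` snapshot. `StaticKeyProvider`, its fields, and
+the `key_fingerprint` field on `AuthToken` are hypothetical names used only for
+illustration, and token signature checks are left out:
+
+```rust
+use std::collections::HashMap;
+
+// Simplified stand-in for the `Identity` type described in ADR-029.
+#[derive(Debug, Clone, PartialEq)]
+pub struct Identity {
+    pub id: String,
+    pub scopes: Vec<String>,
+    pub resources: HashMap<String, Vec<String>>,
+}
+
+// Hypothetical token shape; this ADR does not define `AuthToken`.
+pub struct AuthToken {
+    pub key_fingerprint: String,
+}
+
+pub trait IdentityProvider: Send + Sync + 'static {
+    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
+    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
+}
+
+// Hypothetical provider: a fixed key list plus one default scope set.
+pub struct StaticKeyProvider {
+    pub authorized: Vec<String>,
+    pub default_scopes: Vec<String>,
+}
+
+impl IdentityProvider for StaticKeyProvider {
+    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
+        if !self.authorized.iter().any(|k| k == fingerprint) {
+            return None; // unknown key: reject
+        }
+        Some(Identity {
+            id: fingerprint.to_string(),
+            scopes: self.default_scopes.clone(),
+            resources: HashMap::new(),
+        })
+    }
+
+    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity> {
+        // Assumption: a token resolves through the fingerprint of the key it names.
+        self.resolve_from_fingerprint(&token.key_fingerprint)
+    }
+}
+```
+
+Callers hold only `impl IdentityProvider`, so replacing this with a
+database-backed implementation would not change handler code.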
+
+### AuthProtocol irpc Service — Behind Feature Flag
+
+```rust
+#[rpc_requests(message = AuthMessage)]
+#[derive(Debug, Serialize, Deserialize)]
+enum AuthProtocol {
+    #[rpc(tx=oneshot::Sender<AuthResult>)]
+    #[wrap(VerifyPubkey)]
+    VerifyPubkey { fingerprint: String, key_data: Vec<u8> },
+
+    #[rpc(tx=oneshot::Sender<AuthResult>)]
+    #[wrap(VerifyToken)]
+    VerifyToken { token_bytes: Vec<u8>, timestamp: u64 },
+
+    #[rpc(tx=oneshot::Sender<()>)]
+    #[wrap(ReloadKeys)]
+    ReloadKeys,
+
+    #[rpc(tx=oneshot::Sender<bool>)]
+    #[wrap(CheckAccess)]
+    CheckAccess { identity: Identity, operation: String },
+}
+
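+// Sent back over the wire, so it needs the same serde derives as the protocol enum.
+#[derive(Debug, Serialize, Deserialize)]
+enum AuthResult {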
+    Ok(Identity),
+    Denied(String),
+}
+```
+
+The `AuthProtocol` is behind the `irpc` feature flag in alknet-core. Nodes
+that only do SSH tunneling don't need the service layer overhead. When the
+feature is disabled, auth goes through `IdentityProvider` directly.
+
+### AuthServiceImpl
+
+Two implementations exist:
+
+- **ConfigAuthService** — backed by `ConfigIdentityProvider` (ArcSwap path).
+  Wraps the trait in an irpc service for deployments that use the service layer
+  but don't have SQLite.
+- **StorageAuthService** — backed by SQLite `peer_credentials` and `api_keys`
+  tables (in alknet-storage). Queries on demand. Can maintain an LRU cache for
+  hot fingerprints. This is the production implementation.
+
+Both produce the same `AuthResult` — an `Identity` or a denial. Callers don't
+know or care which backend is running.
+
+### Integration with IdentityProvider
+
+The irpc service and the trait compose. A caller goes through `IdentityProvider`,
+which may internally delegate to the irpc service, or may satisfy the request
+locally via `ConfigIdentityProvider`. The deployment topology determines the
+path:
+
+- **Minimal (CLI, single-node)**: `ConfigIdentityProvider` reads from
+  `ArcSwap<DynamicConfig>`. No irpc overhead.
+- **Production with local auth**: `AuthServiceImpl` wraps
+  `StorageIdentityProvider` locally. The handler calls `IdentityProvider` which
+  routes to the local irpc service.
+- **Distributed auth**: Handler on a worker node calls `IdentityProvider` which
+  routes to a remote auth irpc service over QUIC.
+
+### ConfigService Integration
+
+`AuthProtocol::ReloadKeys` triggers reload of the dynamic config's auth section.
+For the `ConfigIdentityProvider` path, this is equivalent to
+`ConfigReloadHandle::reload()`. For the `StorageIdentityProvider` path, this
+refreshes the LRU cache. Both update atomically — ongoing connections are
+unaffected, new connections pick up changes.
+
+## Consequences
+
+- **Positive**: Minimal deployments use `ArcSwap` without irpc overhead. No
+  database dependency for CLI users.
+- **Positive**: Production deployments wire `StorageIdentityProvider` behind the
+  irpc service. Auth scales to thousands of users without loading all keys into
+  memory.
+- **Positive**: The `IdentityProvider` trait is the only contract callers depend
+  on. This keeps alknet-core lean and testable.
+- **Positive**: Feature flag (`irpc`) keeps core lean for deployments that don't
+  need the service layer.
+- **Positive**: Both paths return the same `Identity` type, so callers handle
+  one result shape regardless of which backend resolved it.
+- **Negative**: Two implementations must be kept in sync. `ConfigIdentityProvider`
+  and `StorageIdentityProvider` must produce the same `Identity` for the same
+  input. Integration tests should verify this.
+- **Negative**: The `irpc` feature flag adds conditional compilation complexity.
+  The core must compile and work without it, and the service layer must work
+  with it enabled.
+
+## References
+
+- [research/services.md](../../research/services.md) — AuthService, AuthProtocol definition
+- [auth.md](../auth.md) — IdentityProvider trait, Identity struct
+- [research/configuration.md](../../research/configuration.md) — Auth service approach
+- [research/integration-plan.md](../../research/integration-plan.md) — Phase 1.4
+- [ADR-029](029-identity-core-type.md) — Identity as core type
+- [ADR-027](027-crate-decomposition.md) — Crate decomposition
--- a/docs/architecture/decisions/029-identity-core-type.md
+++ b/docs/architecture/decisions/029-identity-core-type.md
@@ -0,0 +1,107 @@
+# ADR-029: Identity as Core Type
+
+## Status
+
+Accepted
+
+## Context
+
+The `Identity` struct and `IdentityProvider` trait are needed by auth,
+forwarding policy, and call protocol — three different subsystems in
+alknet-core. Without placing them in core, these subsystems would each define
+their own identity type, leading to duplication and conversion boilerplate.
+
+The constraint: alknet-core must not depend on alknet-storage or any database.
+The `IdentityProvider` trait must be in core so that the handler can resolve
+identities without knowing whether the backing store is a config file or a
+SQLite database. External crates provide implementations.
+
+Earlier research defined `Identity` inconsistently: `{node_id, fingerprint,
+scopes}` in services.md and `{id, scopes, resources}` in auth.md. The unified
+model uses `{id, scopes, resources}` where `id` serves as both fingerprint (for
+key-based auth from config) and account UUID (for database-backed auth).
+
+## Decision
+
+**`Identity` struct and `IdentityProvider` trait live in `alknet_core::auth`.**
+
+### Identity Struct
+
+```rust
+pub struct Identity {
+    pub id: String,                               // Fingerprint (config auth) or account UUID (database auth)
+    pub scopes: Vec<String>,                      // e.g., ["relay:connect", "service:gitea:read"]
+    pub resources: HashMap<String, Vec<String>>,   // e.g., {"service": ["gitea", "registry"]}
+}
+```
+
+The `id` field serves dual purpose: when using config-based authentication
+(`ConfigIdentityProvider`), it holds the Ed25519 key fingerprint. When using
+database-backed authentication (`StorageIdentityProvider`), it holds the account
+UUID from the `accounts` table. This keeps the type simple while accommodating
+both auth paths.
+
+The `scopes` field provides authorization scope strings used by
+`ForwardingPolicy` and `AccessControl` in `OperationSpec`. The `resources`
+field provides resource-level authorization beyond what scopes offer (e.g., which
+services this identity can access).
+
+### IdentityProvider Trait
+
+```rust
+pub trait IdentityProvider: Send + Sync + 'static {
+    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
+    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
+}
+```
+
+The trait is the contract. Callers (auth handler, forwarding policy, call
+protocol) depend on `IdentityProvider` — not on any concrete implementation.
+
+### Default and Production Implementations
+
+- **`ConfigIdentityProvider`** (in alknet-core) — reads from
+  the `auth` section of `ArcSwap<DynamicConfig>`. Every authorized key gets a
+  default scope set.
+  No database needed. This is the default for minimal deployments.
+- **`StorageIdentityProvider`** (in alknet-storage) — backed by SQLite
+  `peer_credentials` and `api_keys` tables plus the ACL graph. Resolves
+  fingerprint → account → organization membership → effective scopes. This is
+  the production implementation for head nodes.
+
+alknet-core never depends on alknet-storage. The trait relationship is:
+alknet-core *defines* the trait, alknet-storage *implements* it. The CLI or
+NAPI assembly layer wires the concrete implementation.
+
+### Why Not in alknet-storage?
+
+If `Identity` lived in alknet-storage, alknet-core would need to depend on
+alknet-storage to use the type — creating a circular dependency (since
+alknet-storage implements alknet-core's `IdentityProvider` trait). Placing the
+type and trait in core breaks the cycle.
+
+## Consequences
+
+- **Positive**: alknet-core has no database dependency. Auth, forwarding, and
+  call protocol all use the same `Identity` type.
+- **Positive**: alknet-storage implements the core trait. The CLI/NAPI layer
+  wires the concrete implementation. Deployment topology determines which impl
+  to use.
+- **Positive**: The `id` field serves dual purpose (fingerprint or UUID),
+  avoiding separate types for config-based and database-based auth.
+- **Positive**: `ForwardingPolicy` and `AccessControl` can reference scopes from
+  `Identity` without knowing where they came from.
+- **Negative**: Two implementations of `IdentityProvider` exist — `Config` and
+  `Storage`. Both must produce identical `Identity` results for the same input.
+  Tests should verify behavioral parity.
+- **Negative**: The trait abstraction adds a level of indirection for the
+  minimal (config-only) deployment path. The cost is negligible — the
+  `ConfigIdentityProvider` is a simple `ArcSwap` dereference.
+
+## References
+
+- [auth.md](../auth.md) — IdentityProvider trait, Identity struct, unified auth
+- [research/services.md](../../research/services.md) — AuthService, Identity section
+- [research/integration-plan.md](../../research/integration-plan.md) — Phase 1.2
+- [ADR-023](023-unified-auth-shared-key-material.md) — Unified auth with shared key material
+- [ADR-028](028-auth-irpc-service.md) — Auth as irpc service
+- [OQ-18](../open-questions.md) — IdentityProvider owns scopes
--- a/docs/architecture/decisions/030-static-dynamic-config-split.md
+++ b/docs/architecture/decisions/030-static-dynamic-config-split.md
@@ -0,0 +1,159 @@
+# ADR-030: Static/Dynamic Configuration Split
+
+## Status
+
+Accepted
+
+## Context
+
+Alknet's configuration is loaded once at startup and never changes. This causes
+three specific failures:
+
+1. **No hot reload of authentication credentials.** Adding or removing an
+   authorized key requires restarting the server process. In head/worker
+   deployments where keys are managed via a database, the process must be
+   restarted every time a key is added, revoked, or rotated. This is
+   operationally unacceptable.
+
+2. **No port forwarding access control.** Any authenticated client can open a
+   `direct-tcpip` channel to any destination. There is no policy governing
+   which hosts, ports, or alknet control channels a client may access. A
+   compromised key grants unrestricted network access through the tunnel.
+
+3. **No structured configuration beyond CLI flags.** ADR-011 chose
+   programmatic-first configuration for the alpha — correct at the time. But as
+   alknet moves toward publishable releases, operators need config files for
+   reproducible deployments, and the NAPI layer needs programmatic reload
+   capability that `ServeOptions` doesn't currently support.
+
+Not all configuration should be reloadable. Transport-level settings (listen
+address, TLS certificates, host key) require socket/TLS renegotiation to change
+at runtime — effectively a restart. Auth and forwarding policy can change
+atomically without disrupting existing connections.
+
+## Decision
+
+**Split configuration into `StaticConfig` and `DynamicConfig`.**
+
+### StaticConfig
+
+Immutable after startup. Constructed from `ServeOptions` (the builder pattern is
+preserved). Contains everything that affects socket binding, TLS handshakes, or
+SSH session negotiation:
+
+- Transport mode, listen address
+- TLS config (cert, key)
+- iroh config (relay URL)
+- Stealth mode flag
+- Host key, host key algorithm
+- Max auth attempts, max connections per IP
+- Proxy config
+
+Changing any of these requires a restart.
+
+### DynamicConfig
+
+Hot-reloadable at runtime via `ArcSwap<DynamicConfig>`. Contains everything
+checked per-connection or per-channel:
+
+- `AuthPolicy` — authorized keys, certificate authorities, token config
+- `ForwardingPolicy` — allow/deny rules for channel targets (ADR-031)
+- `RateLimitConfig` — rate limiting parameters
+
+`ArcSwap` provides lock-free reads on the hot path: every `auth_publickey()` and
+every `channel_open_direct_tcpip()` call loads the current `Arc` snapshot, which
+adds negligible overhead compared to the current approach. Writes are atomic:
+`store()` swaps the pointer. Existing connections finish with their current
+config; new connections get the new config.
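+
+The snapshot behavior can be sketched with the standard library alone. This is
+an illustration, not the real implementation: `SharedConfig` is a hypothetical
+stand-in that uses `RwLock<Arc<_>>`, so unlike `ArcSwap` its reads take a brief
+lock, and `DynamicConfig` is reduced to a single assumed field:
+
+```rust
+use std::sync::{Arc, RwLock};
+
+// Reduced config; the real `DynamicConfig` carries auth, forwarding, and rate limits.
+#[derive(Debug, Clone, PartialEq)]
+pub struct DynamicConfig {
+    pub authorized_keys: Vec<String>,
+}
+
+// Hypothetical std-only stand-in for `ArcSwap<DynamicConfig>`.
+pub struct SharedConfig {
+    current: RwLock<Arc<DynamicConfig>>,
+}
+
+impl SharedConfig {
+    pub fn new(initial: DynamicConfig) -> Self {
+        Self { current: RwLock::new(Arc::new(initial)) }
+    }
+
+    // Hand out the current snapshot; the caller keeps it alive via the `Arc`.
+    pub fn load(&self) -> Arc<DynamicConfig> {
+        let guard = self.current.read().unwrap();
+        Arc::clone(&*guard)
+    }
+
+    // Replace the snapshot; earlier snapshots stay valid for their holders.
+    pub fn store(&self, next: DynamicConfig) {
+        *self.current.write().unwrap() = Arc::new(next);
+    }
+}
+```
+
+A handler that called `load()` before a `store()` keeps reading the old
+snapshot, while any later `load()` returns the new one.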
+
+### ConfigReloadHandle
+
+```rust
+pub struct ConfigReloadHandle {
+    dynamic: Arc<ArcSwap<DynamicConfig>>,
+}
+
+impl ConfigReloadHandle {
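+    pub fn reload(&self, new_config: DynamicConfig) {
+        // Atomically publish the new snapshot; readers see it on their next load.
+        self.dynamic.store(Arc::new(new_config));
+    }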
+}
+```
+
+The handle is obtained from `Server::run()` and passed to NAPI or the CLI.
+
+### ConfigService
+
+The `ConfigService` wraps `ArcSwap<DynamicConfig>` reloads behind an irpc
+protocol (behind the `irpc` feature flag) for production deployments that use
+the service layer. For minimal deployments (CLI, single-node), direct
+`ConfigReloadHandle::reload()` is sufficient.
+
+### TOML Config File
+
+An optional TOML config file covers static config plus initial auth/forwarding
+paths. This **amends** ADR-011 (does not supersede it) — the programmatic-first
+API remains primary. The config file is a convenience input format:
+
+```toml
+[server]
+transport = "tls"
+listen = "0.0.0.0:443"
+stealth = false
+max_connections_per_ip = 5
+max_auth_attempts = 3
+
+[server.tls]
+cert = "/etc/alknet/tls/cert.pem"
+key = "/etc/alknet/tls/key.pem"
+
+[auth]
+host_key = "/etc/alknet/ssh/host_key"
+
+[forwarding]
+default = "deny"
+```
+
+### NAPI Reload API
+
+```typescript
+interface AlknetServer {
+  reloadAuth(auth: { authorizedKeys?: Buffer, certAuthority?: Buffer }): void;
+  reloadForwarding(policy: ForwardingPolicyConfig): void;
+  reloadAll(config: DynamicConfig): void;
+}
+```
+
+The NAPI layer parses key data and constructs a new `DynamicConfig`, then calls
+`ConfigReloadHandle::reload()`.
+
+### Client Configuration
+
+Client configuration stays as `ConnectOptions` — no `ArcSwap` needed. Client
+config is almost entirely static (which server to connect to, which key to use).
+
+## Consequences
+
+- **Positive**: Auth credentials and forwarding policy can be reloaded without
+  restarting the server. Adding a key via `reloadAuth()` takes effect on the
+  next connection attempt.
+- **Positive**: ADR-011's programmatic-first intent is preserved. The TOML
+  config file is an optional convenience layer, not a replacement for
+  `ServeOptions`.
+- **Positive**: `ArcSwap` keeps reads on the hot path lock-free. Every auth
+  check and every channel open loads the current `Arc` snapshot without taking
+  a lock.
+- **Positive**: The `ConfigService` irpc protocol (behind feature flag) allows
+  production deployments to integrate config reload into their service mesh
+  without taking a direct dependency on `DynamicConfig` internals.
+- **Positive**: Forwarding policy is now part of `DynamicConfig` — operators can
+  restrict access per identity, per destination, per transport (ADR-031).
+- **Negative**: Two config structs where there was one. The split is clean
+  (transport vs. policy) but adds surface area.
+- **Negative**: Config file introduces `toml` as a dependency in the CLI crate.
+  This is acceptable for a CLI binary.
+
+## References
+
+- [research/configuration.md](../../research/configuration.md) — Full analysis
+- [ADR-011](011-no-ssh-config-programmatic-api.md) — Programmatic-first API (amended, not superseded)
+- [ADR-031](031-forwarding-policy.md) — Forwarding policy (part of DynamicConfig)
+- [ADR-029](029-identity-core-type.md) — Identity as core type (DynamicConfig.auth uses IdentityProvider)
+- [integration-plan.md](../../research/integration-plan.md) — Phase 1.1
--- a/docs/architecture/decisions/031-forwarding-policy.md
+++ b/docs/architecture/decisions/031-forwarding-policy.md
@@ -0,0 +1,138 @@
+# ADR-031: Forwarding Policy
+
+## Status
+
+Accepted
+
+## Context
+
+Currently, any authenticated client can open a `direct-tcpip` SSH channel to
+any destination. The only gate is authentication — once authenticated, a client
+has unrestricted network access through the tunnel. This is a security gap: a
+compromised key grants unrestricted access.
+
+Operators need the ability to:
+- Restrict which hosts and ports authenticated clients can access
+- Apply different rules to different principals (key fingerprints, accounts)
+- Restrict WebTransport clients to alknet control channels only
+- Set a default policy (allow-all for migration compatibility, deny-all for
+  production)
+
+## Decision
+
+**Add `ForwardingPolicy` as part of `DynamicConfig` (reloadable without
+restart).**
+
+### Type Definitions
+
+```rust
+pub struct ForwardingPolicy {
+    pub default: ForwardingAction,
+    pub rules: Vec<ForwardingRule>,
+}
+
+pub struct ForwardingRule {
+    pub target: TargetPattern,
+    pub action: ForwardingAction,
+    pub principals: Vec<String>,   // Empty = matches all
+    pub transports: Vec<TransportKind>,  // Empty = matches all
+}
+
+pub enum ForwardingAction {
+    Allow,
+    Deny,
+}
+
+pub enum TargetPattern {
+    Any,
+    Host(String),          // "localhost", "*.example.com"
+    Cidr(IpNetwork),       // "10.0.0.0/8"
+    PortRange(String, RangeInclusive<u16>),  // "localhost", ports 8080-8090 inclusive
+    AlknetPrefix,          // Matches alknet-* control channels
+}
+```
+
+### Rule Evaluation
+
+Rules are evaluated in order. First match wins. If no rule matches, the default
+applies. This supports both allowlist and blocklist semantics:
+
+- **Allowlist**: `default: Deny`, then explicit Allow rules for permitted
+  destinations.
+- **Blocklist**: `default: Allow`, then explicit Deny rules for blocked
+  destinations.
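+
+First-match-wins evaluation can be sketched as follows. This is a reduced
+illustration under stated assumptions: only `Any` and exact `Host` targets are
+modeled, transports are omitted, and the `evaluate` method name is hypothetical:
+
+```rust
+#[derive(Debug, Clone, Copy, PartialEq)]
+pub enum ForwardingAction {
+    Allow,
+    Deny,
+}
+
+// Reduced pattern set: no CIDR, port range, or wildcard matching here.
+pub enum TargetPattern {
+    Any,
+    Host(String),
+}
+
+pub struct ForwardingRule {
+    pub target: TargetPattern,
+    pub action: ForwardingAction,
+    pub principals: Vec<String>, // empty = matches all
+}
+
+pub struct ForwardingPolicy {
+    pub default: ForwardingAction,
+    pub rules: Vec<ForwardingRule>,
+}
+
+impl ForwardingPolicy {
+    // Walk the rules in order and return the action of the first match,
+    // falling back to the default when nothing matches.
+    pub fn evaluate(&self, principal: &str, host: &str) -> ForwardingAction {
+        self.rules
+            .iter()
+            .find(|rule| {
+                let target_ok = match &rule.target {
+                    TargetPattern::Any => true,
+                    TargetPattern::Host(h) => h.as_str() == host,
+                };
+                let principal_ok = rule.principals.is_empty()
+                    || rule.principals.iter().any(|p| p.as_str() == principal);
+                target_ok && principal_ok
+            })
+            .map(|rule| rule.action)
+            .unwrap_or(self.default)
+    }
+}
+```
+
+Because evaluation stops at the first match, a narrow Deny rule placed ahead of
+a broad Allow rule takes precedence over it.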
+
+### Principals
+
+Each rule can specify which principals it applies to. A principal is an
+`Identity.id` (fingerprint or UUID) or a scope from `Identity.scopes`. When the
+rule's `principals` field is empty, it matches all identities.
+
+This connects to the `IdentityProvider` trait (ADR-029): when a client
+authenticates, the `Identity` is resolved, and the forwarding policy checks
+rules against `Identity.id` and `Identity.scopes`.
+
+### TransportKind-Aware Rules
+
+Each rule can specify which `TransportKind` it applies to. This enables
+transport-specific restrictions — for example, WebTransport clients can be
+restricted to `alknet-*` control channels only:
+
+```rust
+ForwardingRule {
+    target: TargetPattern::AlknetPrefix,
+    action: ForwardingAction::Allow,
+    principals: vec![],
+    transports: vec![TransportKind::WebTransport { host: "*".into() }],
+}
+```
+
+### Where the Policy Check Happens
+
+The forwarding policy check occurs in `channel_open_direct_tcpip` before the
+proxy task is spawned. The current behavior (no check) is equivalent to
+`ForwardingPolicy::allow_all()` — default Allow, no rules. This preserves
+backward compatibility during migration.
+
+### DynamicConfig Integration
+
+`ForwardingPolicy` is part of `DynamicConfig` and reloadable via
+`ConfigReloadHandle::reload()` or NAPI's `reloadForwarding()`. Changes take
+effect on the next channel open. Channels that are already open are not
+re-evaluated.
+
+### OQ Resolutions
+
+- **OQ-12** (Per-user forwarding scope vs global rules): Resolved. Start with
+  global rules plus principal matching against `Identity.scopes`. Per-user
+  scopes come from `peer_credentials.metadata.scopes` via `IdentityProvider`.
+- **OQ-16** (Transport-specific forwarding): Resolved. Add `TransportKind`
+  match in `ForwardingRule`. WebTransport clients can be restricted.
+- **OQ-18** (Source of Identity.scopes): Resolved by ADR-029 and this ADR.
+  `IdentityProvider` owns scopes. `ForwardingPolicy` consumes them.
+
+## Consequences
+
+- **Positive**: Operators can restrict access per identity, per destination, per
+  transport. A compromised key no longer grants unrestricted network access.
+- **Positive**: Default-allow preserves current behavior during migration. Switch
+  to default-deny for production deployments.
+- **Positive**: Policy is reloadable without restart. Adding a rule via
+  `reloadForwarding()` takes effect on the next channel open.
+- **Positive**: `TransportKind`-aware rules enable transport-specific
+  restrictions (e.g., WebTransport clients restricted to alknet-* channels).
+- **Negative**: Another check in the hot path (every `channel_open_direct_tcpip`
+  call). The cost is a linear scan of rules — acceptable for small rule sets.
+  Large rule sets should use compiled matchers (future optimization).
+- **Negative**: `TargetPattern` host matching operates on strings, which is
+  easy to get wrong. Host patterns like `*.example.com` require careful
+  implementation to prevent bypasses. A glob-matching crate such as `glob` or
+  `globset` is a candidate for this.
+
+## References
+
+- [research/configuration.md](../../research/configuration.md) — ForwardingPolicy section
+- [auth.md](../auth.md) — Identity.scopes and IdentityProvider
+- [open-questions.md](../open-questions.md) — OQ-12, OQ-16, OQ-18
+- [ADR-029](029-identity-core-type.md) — Identity as core type
+- [ADR-030](030-static-dynamic-config-split.md) — DynamicConfig (ForwardingPolicy is part of it)
+- [integration-plan.md](../../research/integration-plan.md) — Phase 1.3
--- a/docs/architecture/decisions/032-event-boundary-discipline.md
+++ b/docs/architecture/decisions/032-event-boundary-discipline.md
@@ -0,0 +1,96 @@
+# ADR-032: Event Boundary Discipline
+
+## Status
+
+Accepted
+
+## Context
+
+The research identified three distinct communication patterns in the system, and
+conflating them is a known anti-pattern in event-driven architectures:
+
+1. **Domain events** (Honker streams) — Internal to the service that owns that
+   data. Used for state reconstruction within the service's own boundaries.
+   Examples: `nodes:created`, `edges:deleted`, `accounts:updated`.
+
+2. **irpc service calls** — Synchronous request-response within a node or
+   cluster. Internal to the system. Examples: `AuthProtocol::VerifyPubkey`,
+   `SecretProtocol::DeriveEd25519`, `ConfigProtocol::ReloadForwarding`.
+
+3. **Call protocol events** (`EventEnvelope`) — Asynchronous integration events
+   that cross node boundaries. External to the system. Examples:
+   `call.requested`, `call.responded`, `call.completed`, `call.aborted`.
+
+Without a hard constraint, it's tempting to have one service subscribe directly
+to another service's Honker streams. This leads to:
+
+- **Leaky event store**: Service A reads Service B's domain events directly,
+  coupling A to B's internal state representation. When B changes its schema, A
+  breaks.
+- **Boomerang coupling**: An integration event is too thin, causing the
+  consumer to call back to the source service synchronously to get details. This
+  negates the benefit of async communication.
+- **Fat notification trap**: A notification event carries full entity state,
+  when it should use state transfer instead.
+
+## Decision
+
+**Event boundary discipline is a hard architectural constraint, not a
+suggestion.**
+
+1. **Domain events stay within the owning service.** A Honker stream published
+   by the storage service (`nodes:created`) is for the storage service's own
+   state reconstruction. No other service reads these stream events directly.
+
+2. **irpc service calls are synchronous and internal.** They stay within a
+   node or cluster and never cross the system's external boundary. They are
+   request-response, not events. They should not be used as a substitute for
+   integration events.
+
+3. **Call protocol events are the only events that cross node boundaries.**
+   `EventEnvelope` frames are the integration boundary. When a domain event
+   needs to be communicated to another node, it must be projected into a call
+   protocol event.
+
+4. **Projection from domain events to integration events is required when
+   crossing boundaries.** A service that owns a Honker stream must project
+   relevant state changes into `EventEnvelope` frames before they leave the
+   node. The projection strips internal details and produces a versioned,
+   stable integration event.
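+
+A projection step might look like the sketch below. All names and fields here
+are hypothetical: `NodeCreated` stands in for a domain event, and
+`IntegrationEvent` stands in for the payload carried by an `EventEnvelope`,
+whose real layout is defined elsewhere:
+
+```rust
+// Hypothetical domain event as the owning service stores it.
+pub struct NodeCreated {
+    pub node_id: String,
+    pub row_version: u64,      // internal detail, must not leave the service
+    pub storage_shard: String, // internal detail, must not leave the service
+}
+
+// Hypothetical integration payload with an explicit schema version.
+#[derive(Debug, PartialEq)]
+pub struct IntegrationEvent {
+    pub event_type: String,
+    pub schema_version: u32,
+    pub subject_id: String,
+}
+
+// Copy only the fields that belong to the public contract.
+pub fn project(event: &NodeCreated) -> IntegrationEvent {
+    IntegrationEvent {
+        event_type: "node.created".to_string(),
+        schema_version: 1,
+        subject_id: event.node_id.clone(),
+    }
+}
+```
+
+The internal fields never appear in the projected value, so the owning service
+can rename or drop them without affecting consumers on other nodes.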
+
+This discipline applies at three levels:
+
+```
+Call Protocol (Layer 3, external, JSON)
+    └── irpc Service (Layer 3, internal, postcard)
+            └── Honker Streams (Domain events, within service boundary)
+```
+
+A call protocol handler MAY call an irpc service internally (e.g.,
+`/head/auth/verify` calls `AuthProtocol::VerifyPubkey`). The irpc service MAY
+use Honker streams for its own state management. But domain events never
+propagate beyond the service boundary without projection.
+
+## Consequences
+
+- **Positive**: Prevents leaky event stores. Services are independently
+  deployable and their internal schemas can evolve without breaking consumers.
+- **Positive**: Honker and irpc are implementation details, not cross-boundary
+  contracts. The call protocol's `EventEnvelope` is the only stable, versioned
+  contract that other nodes depend on.
+- **Positive**: Clear ownership. Each service owns its Honker streams and can
+  change them freely. Integration events are a deliberate, reviewed contract.
+- **Positive**: Makes testing easier. Services can be tested in isolation with
+  mock domain events. Integration events are tested against the `EventEnvelope`
+  schema.
+- **Negative**: Projection code is required. Every domain event that needs to
+  cross a boundary must be explicitly projected. This is deliberate — the
+  overhead ensures the integration contract is intentional.
+- **Negative**: Developers must resist the temptation to subscribe directly to
+  Honker streams across services. Code review should catch this pattern.
+
+## References
+
+- [research/services.md](../../research/services.md) — Event boundary discipline section
+- [research/storage.md](../../research/storage.md) — Honker integration, event boundaries
+- [research/integration-plan.md](../../research/integration-plan.md) — ADR 032 entry
+- [event_source_types.md](/workspace/research/event_sourcing/event_source_types.md) — Event-driven architecture patterns
--- a/docs/architecture/decisions/033-operationenv-irpc-call-protocol.md
+++ b/docs/architecture/decisions/033-operationenv-irpc-call-protocol.md
@@ -0,0 +1,130 @@
+# ADR-033: OperationEnv as Universal Composition Mechanism
+
+## Status
+
+Accepted
+
+## Context
+
+The `@alkdev/operations` TypeScript package defines `OperationEnv` as a
+universal composition mechanism. A handler receives `context.env[namespace][op](input)`
+and can invoke any registered operation regardless of whether it runs locally, in
+an irpc service on the same cluster, or on a remote node via call protocol.
+
+The research documents define three dispatch paths:
+1. **Local dispatch** — direct function call through the operation registry
+2. **Service dispatch** — irpc protocol call to a service backend
+3. **Remote dispatch** — call protocol `EventEnvelope` to a remote node
+
+Without a formal decision, irpc services could be seen as a replacement for
+OperationEnv or for the call protocol. They are not — irpc is one dispatch
+backend for OperationEnv, not a replacement for anything. The call protocol is
+another dispatch backend. OperationEnv unifies them from the handler's
+perspective.
+
+The three communication patterns in the system (ADR-032) are:
+- Domain events (Honker streams) — internal to the owning service
+- irpc service calls — synchronous, in-cluster
+- Call protocol events — asynchronous, cross-node
+
+irpc services and call protocol operations serve different scopes but must
+compose cleanly through OperationEnv.
+
+## Decision
+
+**OperationEnv is the universal composition mechanism that all operation
+handlers receive. It provides namespace + operation name → invoke with input,
+return output, regardless of dispatch path.**
+
+### OperationEnv Behavioral Contract
+
+```rust
+// The behavioral contract: given a namespace and operation name, invoke the
+// operation with the given input and return the output. The handler neither
+// knows nor cares whether the dispatch is local, via irpc, or via call protocol.
+pub trait OperationEnv: Send + Sync {
+    fn invoke(&self, namespace: &str, operation: &str, input: Value) -> ResponseEnvelope;
+}
+```
+
+The Rust implementation may use typed method dispatch or a registry behind the
+scenes, but the handler-facing API must preserve this contract.
+
+### Three Dispatch Paths
+
+OperationEnv resolves each call to one of three dispatch backends:
+
+| Path | Mechanism | Serialization | Scope |
+|------|-----------|---------------|-------|
+| Local | Direct function call through registry | None (in-process) | Same process |
+| Service | irpc protocol enum dispatch | postcard (binary) | Same cluster |
+| Remote | Call protocol `EventEnvelope` | JSON | Cross-node |
+
+All three produce the same `ResponseEnvelope`. The handler always calls
+`context.env.invoke("secrets", "derive", input)` and gets a `ResponseEnvelope`
+back.
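+
+The routing above can be sketched with the standard library alone. This is an
+illustration under assumptions, not the alknet-core implementation: `Backend`,
+`RoutedEnv`, `Handler`, and `example_env` are hypothetical names, plain
+`String` values stand in for `Value` and `ResponseEnvelope`, and the service
+and remote arms are stubs that only tag the result where a real irpc or call
+protocol round trip would happen.
+
+```rust
+use std::collections::HashMap;
+
+// Hypothetical stand-ins: the real contract uses `Value` and `ResponseEnvelope`.
+type Input = String;
+type Output = Result<String, String>;
+type Handler = Box<dyn Fn(Input) -> Output + Send + Sync>;
+
+/// One dispatch backend per namespace. Only `Local` does real work here;
+/// the other arms mark where irpc or the call protocol would be used.
+enum Backend {
+    Local(HashMap<String, Handler>),
+    Service { name: String },
+    Remote { node: String },
+}
+
+/// Hypothetical router: namespace -> backend.
+#[derive(Default)]
+struct RoutedEnv {
+    backends: HashMap<String, Backend>,
+}
+
+impl RoutedEnv {
+    fn register(mut self, namespace: &str, backend: Backend) -> Self {
+        self.backends.insert(namespace.to_string(), backend);
+        self
+    }
+
+    /// Same call shape for every path: the caller never sees which arm ran.
+    fn invoke(&self, namespace: &str, operation: &str, input: Input) -> Output {
+        match self.backends.get(namespace) {
+            None => Err(format!("unknown namespace: {}", namespace)),
+            Some(Backend::Local(ops)) => match ops.get(operation) {
+                Some(handler) => handler(input),
+                None => Err(format!("unknown operation: {}/{}", namespace, operation)),
+            },
+            Some(Backend::Service { name }) => {
+                Ok(format!("irpc[{}] {}({})", name, operation, input))
+            }
+            Some(Backend::Remote { node }) => {
+                Ok(format!("call[{}] {}({})", node, operation, input))
+            }
+        }
+    }
+}
+
+/// Builds a small mixed topology: one local namespace, one service namespace.
+fn example_env() -> RoutedEnv {
+    let mut config_ops: HashMap<String, Handler> = HashMap::new();
+    config_ops.insert(
+        "get".to_string(),
+        Box::new(|key: Input| -> Output { Ok(format!("value-of-{}", key)) }),
+    );
+    RoutedEnv::default()
+        .register("config", Backend::Local(config_ops))
+        .register("secrets", Backend::Service { name: "secret-service".to_string() })
+}
+```
+
+With `example_env()`, a call into the `config` namespace runs the local
+closure and a call into `secrets` takes the stubbed service arm, yet both
+return the same `Output` type to the caller.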
+
+### Service Assembly
+
+The deployment topology determines which dispatch path each operation uses:
+
+```rust
+// Minimal deployment (single node, all local)
+let env = OperationEnv::local(local_registry);
+
+// Production deployment (mix of local and remote)
+let env = OperationEnv::new()
+    .local("auth", auth_registry)           // Auth runs locally
+    .local("config", config_registry)       // Config runs locally
+    .service("secrets", secret_irpc_client) // Secret service via irpc
+    .remote("worker-1", call_protocol_conn); // Worker-1 operations via call protocol
+```
+
+### irpc Services Are One Dispatch Backend
+
+irpc services (`AuthProtocol`, `SecretProtocol`, `ConfigProtocol`) define the
+wire format for in-cluster communication. They are Rust-to-Rust, type-safe,
+and efficient. But they are not a replacement for OperationEnv or for the call
+protocol. They are one dispatch backend.
+
+An irpc service can be exposed as a call protocol operation:
+`/head/auth/verify` receives a call protocol event and internally calls
+`AuthProtocol::VerifyPubkey` via irpc. The layers compose:
+
+```
+Call Protocol (Layer 3, external, JSON)
+    └── irpc Service (Layer 3, internal, postcard)
+            └── Honker Streams (Domain events, within service boundary)
+```
+
+### Adapters Map to OperationEnv
+
+HTTP (`POST /v1/{namespace}/{op}`), MCP (`tools/call`), DNS
+(`{op}.{namespace}.alk.dev TXT?`), and call protocol
+(`/call.requested`) all resolve through OperationEnv. This is what makes
+operations universally composable across all interfaces.
+
+## Consequences
+
+- **Positive**: Handlers compose through a single interface. Adding a new
+  dispatch path (e.g., a new irpc service) doesn't change handler code.
+- **Positive**: irpc and call protocol coexist naturally. The handler doesn't
+  know which path was taken.
+- **Positive**: Adapters (MCP, HTTP, DNS) map to operations through the same
+  OperationEnv interface. One handler, multiple entry points.
+- **Positive**: Deployment topology determines dispatch, not code. Same handler
+  works locally, in-cluster, or cross-node.
+- **Negative**: OperationEnv is a new abstraction that must coexist with the
+  existing call protocol handler pattern. The registry currently maps paths to
+  handlers; OperationEnv adds namespace-aware composition on top.
+- **Negative**: The `@alkdev/operations` TypeScript model is a nested
+  namespace-to-operation map (conceptually
+  `HashMap<String, HashMap<String, fn>>`) and needs an idiomatic Rust
+  translation. The behavioral contract must match, but the implementation can
+  differ.
+
+## References
+
+- [research/services.md](../../research/services.md) — OperationContext, OperationEnv
+- [research/integration-plan.md](../../research/integration-plan.md) — Phase 1.5, OperationEnv wiring
+- [ADR-032](032-event-boundary-discipline.md) — Event boundary discipline
+- [ADR-024](024-bidirectional-call-protocol.md) — Bidirectional call protocol
+- [ADR-025](025-handler-spec-separation.md) — Handler/spec separation
--- a/docs/architecture/decisions/034-head-worker-terminology.md
+++ b/docs/architecture/decisions/034-head-worker-terminology.md
@@ -0,0 +1,55 @@
+# ADR-034: Head/Worker Terminology
+
+## Status
+
+Accepted
+
+## Context
+
+The project previously used hub/spoke terminology for describing node
+relationships: a hub node that coordinates connections and spokes that connect to
+it. This terminology implies a strict star topology where the hub is
+fundamentally different from spokes.
+
+In practice, a coordinating node can also execute operations (run services,
+forward traffic). Any node can become a coordinator. The architecture supports
+mesh topologies where nodes coordinate in a peer-to-peer fashion.
+
+The research documents (`core.md`, `services.md`) and updated architecture
+specs (`call-protocol.md`, `auth.md`, `napi-and-pubsub.md`, `open-questions.md`)
+already use head/worker consistently. Existing ADRs (024, 025) retain their
+original hub/spoke language because ADRs are historical records.
+
+## Decision
+
+**Use head/worker terminology throughout the project.**
+
+- **Head node**: A node that coordinates — accepts connections, routes
+  operations, manages cluster state. A head is also a worker (it can execute
+  operations).
+- **Worker node**: A node that connects to a head, registers its services, and
+  executes operations. Any worker can become a head.
+- **Node**: Any participant in the network. Every node has an Ed25519 identity.
+
+The terms hub and spoke are deprecated in all new specs, code, and
+documentation. Existing ADRs retain their original language as historical
+records — ADRs document what was decided at the time, not what the current
+terminology is.
+
+## Consequences
+
+- **Positive**: Natural mesh formation. A head that is also a worker enables
+  multi-hop routing, redundancy, and distributed topologies without a
+  centralized authority.
+- **Positive**: Consistency with integration plan and research documents.
+- **Positive**: The terminology better reflects the architecture — there is no
+  single "hub" that's fundamentally different from "spokes."
+- **Neutral**: Existing ADRs (024, 025) retain hub/spoke in their text. This is
+  intentional — ADRs are historical records.
+
+## References
+
+- [research/integration-plan.md](../../research/integration-plan.md) — Phase 0 ADR 034 entry, inconsistencies section
+- [ADR-024](024-bidirectional-call-protocol.md) — Uses hub/spoke historically
+- [ADR-025](025-handler-spec-separation.md) — Uses hub/spoke historically
+- [research/core.md](../../research/core.md) — Head/worker terminology
--- a/docs/architecture/flowgraph.md
+++ b/docs/architecture/flowgraph.md
@@ -0,0 +1,186 @@
+---
+status: draft
+last_updated: 2026-06-07
+---
+
+# FlowGraph
+
+## What
+
+The `alknet-flowgraph` crate provides graph data structures and operations,
+mapping the TypeScript `@alkdev/flowgraph` package's call-graph and
+operation-graph concepts to `petgraph::DiGraph`.
+
+## Why
+
+Call graphs and operation graphs are core observability and type-safety
+constructs. Call graphs track request flow across services; operation graphs
+validate type compatibility between composed operations. The crate is pure
+computation (no I/O, no external state), making it safe to include in any
+deployment topology.
+
+## Architecture
+
+### Core Abstraction
+
+`petgraph::DiGraph` replaces graphology. The mapping is nearly 1:1 for the
+operations used:
+
+| TypeScript (graphology) | Rust (petgraph) |
+|------------------------|-----------------|
+| `graph.addNode(key, attrs)` | `graph.add_node(attrs)`, then `key_to_index.insert(key, index)` |
+| `graph.addEdge(source, target, attrs)` | `graph.add_edge(source, target, attrs)` |
+| `hasCycle()` | `is_cyclic_directed(&graph)` |
+| `topologicalSort()` | `toposort(&graph, None)` |
+
+A `HashMap<String, NodeIndex>` provides node-key-to-index lookups, mirroring
+the `key` column in the SQLite `nodes` table.
+
+### FlowGraph<N, E>
+
+```rust
+pub struct FlowGraph<N, E>
+where
+    N: NodeAttributes,
+    E: EdgeAttributes,
+{
+    graph: DiGraph<N, E>,
+    key_to_index: HashMap<String, NodeIndex>,
+}
+
+pub trait NodeAttributes: Clone + Serialize + DeserializeOwned + Debug + Send + Sync {
+    fn key(&self) -> &str;
+    fn set_key(&mut self, key: String);
+}
+
+pub trait EdgeAttributes: Clone + Serialize + DeserializeOwned + Debug + Send + Sync {
+    fn edge_type(&self) -> &str;
+}
+```
+
+### Operation Graph (Static)
+
+Built from `OperationSpec`s at startup. Answers structural questions: type
+compatibility, cycle detection, reachability.
+
+```rust
+pub struct OperationNodeAttrs {
+    pub name: String,
+    pub namespace: String,
+    pub op_type: OperationType,
+    pub input_schema: Value,
+    pub output_schema: Value,
+}
+
+pub enum OperationType { Query, Mutation, Subscription }
+```
+
+Type compatibility compares `output_schema` (source) against `input_schema`
+(target) using `jsonschema::validate()`. Exact match or subtype = compatible
+edge. Structural mismatch = incompatible edge.
+
+### Call Graph (Dynamic)
+
+Populated at runtime from call protocol events. Every `call.requested` adds a
+node; `call.responded`/`call.error`/`call.aborted` update status.
+
+```rust
+pub struct CallNodeAttrs {
+    pub request_id: String,
+    pub operation_id: String,
+    pub status: CallStatus,
+    pub parent_request_id: Option<String>,
+    pub input: Value,
+    pub output: Option<Value>,
+    pub error: Option<CallErrorInfo>,
+    pub identity: Option<Identity>,
+    pub started_at: Option<String>,
+    pub completed_at: Option<String>,
+}
+
+pub enum CallStatus { Pending, Running, Completed, Failed, Aborted }
+```
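+
+A minimal sketch of how the four event types could drive `CallStatus`, using
+only the standard library. The names `CallEvent`, `CallNode`, and
+`CallGraphSketch` are hypothetical, and the events carry only the fields this
+sketch needs. The text above does not say which event moves a call to
+`Running`, so this sketch never sets it.
+
+```rust
+use std::collections::HashMap;
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+enum CallStatus { Pending, Running, Completed, Failed, Aborted }
+
+/// Hypothetical, trimmed-down event type for illustration.
+enum CallEvent {
+    Requested { request_id: String, parent_request_id: Option<String> },
+    Responded { request_id: String },
+    Error { request_id: String },
+    Aborted { request_id: String },
+}
+
+struct CallNode {
+    status: CallStatus,
+    parent_request_id: Option<String>,
+}
+
+#[derive(Default)]
+struct CallGraphSketch {
+    nodes: HashMap<String, CallNode>,
+}
+
+impl CallGraphSketch {
+    /// Applies one event. Returns false when a terminal event names a
+    /// request that was never seen.
+    fn update_from_event(&mut self, event: &CallEvent) -> bool {
+        let (id, status) = match event {
+            CallEvent::Requested { request_id, parent_request_id } => {
+                // `call.requested` adds a node in the Pending state.
+                self.nodes.insert(request_id.clone(), CallNode {
+                    status: CallStatus::Pending,
+                    parent_request_id: parent_request_id.clone(),
+                });
+                return true;
+            }
+            CallEvent::Responded { request_id } => (request_id, CallStatus::Completed),
+            CallEvent::Error { request_id } => (request_id, CallStatus::Failed),
+            CallEvent::Aborted { request_id } => (request_id, CallStatus::Aborted),
+        };
+        match self.nodes.get_mut(id) {
+            Some(node) => { node.status = status; true }
+            None => false,
+        }
+    }
+
+    /// Keys of all nodes with the given status, sorted for stable output.
+    fn filter_by_status(&self, status: CallStatus) -> Vec<String> {
+        let mut keys: Vec<String> = self.nodes.iter()
+            .filter(|(_, node)| node.status == status)
+            .map(|(key, _)| key.clone())
+            .collect();
+        keys.sort();
+        keys
+    }
+}
+```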
+
+### Key Operations
+
+| Query | Method | Returns |
+|-------|--------|---------|
+| Topological order | `topological_order()` | `Result<Vec<String>, CycleError>` |
+| Cycle detection | `has_cycles()` | `bool` |
+| Ancestors/descendants | `ancestors()`, `descendants()` | `Vec<String>` |
+| Status filtering | `filter_by_status()` | Keys with matching status |
+| Duration | `duration()` | `completed_at - started_at` |
+
+### DAG Invariants
+
+- **Operation graph**: DAG-only, enforced at construction. A cycle is
+  reported as `CycleError`.
+- **Call graph**: DAG by design. `parent_request_id` cannot create ancestor
+  cycles.
+- **No parallel edges**: `multi: false`.
+- **No self-loops**: `allow_self_loops: false`.
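+
+The invariants above can be checked at insertion time. petgraph's `DiGraph`
+accepts parallel edges and self-loops, so a wrapper has to reject them itself.
+The sketch below shows one way to do that with the standard library only;
+`DagSketch` and `EdgeError` are hypothetical names, and a real `FlowGraph`
+would run the same checks against its `DiGraph`.
+
+```rust
+use std::collections::{BTreeMap, BTreeSet};
+
+#[derive(Debug, PartialEq)]
+enum EdgeError { UnknownNode, SelfLoop, ParallelEdge, Cycle }
+
+#[derive(Default)]
+struct DagSketch {
+    // node key -> set of direct successors
+    successors: BTreeMap<String, BTreeSet<String>>,
+}
+
+impl DagSketch {
+    fn add_node(&mut self, key: &str) {
+        self.successors.entry(key.to_string()).or_default();
+    }
+
+    /// True if `to` can be reached from `from` by following edges.
+    fn reaches(&self, from: &str, to: &str) -> bool {
+        let mut stack = vec![from];
+        let mut seen = BTreeSet::new();
+        while let Some(current) = stack.pop() {
+            if current == to {
+                return true;
+            }
+            if !seen.insert(current) {
+                continue; // already expanded
+            }
+            if let Some(next) = self.successors.get(current) {
+                stack.extend(next.iter().map(String::as_str));
+            }
+        }
+        false
+    }
+
+    fn add_edge(&mut self, source: &str, target: &str) -> Result<(), EdgeError> {
+        if !self.successors.contains_key(source) || !self.successors.contains_key(target) {
+            return Err(EdgeError::UnknownNode);
+        }
+        if source == target {
+            return Err(EdgeError::SelfLoop);
+        }
+        if self.successors[source].contains(target) {
+            return Err(EdgeError::ParallelEdge);
+        }
+        // source -> target closes a cycle iff source is already reachable from target.
+        if self.reaches(target, source) {
+            return Err(EdgeError::Cycle);
+        }
+        self.successors.get_mut(source).unwrap().insert(target.to_string());
+        Ok(())
+    }
+}
+```
+
+Rejecting the offending edge keeps the graph acyclic at every step, so a later
+topological sort cannot fail on data that passed through `add_edge`.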
+
+### Integration with alknet-storage
+
+Call graphs and operation graphs are stored as metagraph instances in
+alknet-storage. The bridge is serialization: `FlowGraph` serializes to
+`serde_json::Value`, which storage persists in the `nodes.attributes` and
+`edges.attributes` columns.
+
+### Integration with alknet-core (Call Protocol)
+
+The call protocol's `EventEnvelope` drives call graph construction:
+
+```rust
+call_map.on_requested(|event| {
+    call_graph.update_from_event(&CallEvent::Requested(event));
+});
+```
+
+### Crate Dependencies
+
+```toml
+[dependencies]
+petgraph = "0.x"
+serde = { version = "1", features = ["derive"] }
+serde_json = "1"
+jsonschema = "0.x"
+thiserror = "1"
+uuid = { version = "1", features = ["v4"] }
+chrono = { version = "0.x", features = ["serde"] }
+```
+
+Does NOT depend on alknet-core, alknet-storage, or alknet-secret.
+
+### Interface Back to Core
+
+`OperationSpec` and `CallNodeAttrs` types must match alknet-core's definitions.
+The bridge is serialization — flowgraph serializes to JSON, storage persists it.
+alknet-flowgraph does not depend on alknet-core as a crate; it conforms to the
+`OperationSpec` schema independently.
+
+## Constraints
+
+- Pure computation crate — no I/O, no database, no external state.
+- No dependency on alknet-core, alknet-storage, or alknet-secret.
+- Type compatibility with alknet-core's `OperationSpec` is via serialization
+  conformance, not a crate dependency.
+
+## Open Questions
+
+- None specific to this spec. See [open-questions.md](open-questions.md) for
+  general questions.
+
+## Design Decisions
+
+| ADR | Decision | Summary |
+|-----|----------|---------|
+| [027](decisions/027-crate-decomposition.md) | Crate decomposition | alknet-flowgraph is independent of core, storage, secret |
+
+## References
+
+- [research/flow.md](../research/flow.md) — Full FlowGraph, operation graph, call graph design
+- [research/integration-plan.md](../research/integration-plan.md) — Phase 2.3
+- [call-protocol.md](call-protocol.md) — EventEnvelope, PendingRequestMap
+- `@alkdev/flowgraph` — TypeScript call-graph and operation-graph implementation
+- `@alkdev/operations` — OperationSpec, CallHandler, registry
--- a/docs/architecture/identity.md
+++ b/docs/architecture/identity.md
@@ -0,0 +1,189 @@
+---
+status: draft
+last_updated: 2026-06-07
+---
+
+# Identity
+
+## What
+
+The `Identity` type and `IdentityProvider` trait are the core abstractions for
+authentication and authorization in alknet. `Identity` is the unified result of
+auth verification — whether via SSH public key, signed timestamp token, or
+database lookup. `IdentityProvider` is the trait that resolves credentials to an
+`Identity`, decoupling alknet-core from any specific identity storage.
+
+## Why
+
+Auth, forwarding policy, and call protocol all need to know who is making a
+request and what they are authorized to do. Without `Identity` in core, each
+subsystem would define its own identity type, leading to duplication and
+conversion boilerplate. Without `IdentityProvider` as a trait, alknet-core
+would either hardcode config-file-based auth or take a database dependency —
+neither acceptable for a library crate.
+
+The `IdentityProvider` trait exists because the same auth verification concept
+needs two implementations: `ConfigIdentityProvider` for minimal deployments (all
+keys in memory via ArcSwap) and `StorageIdentityProvider` for production (SQLite
+lookup via `peer_credentials` and ACL graph). The trait is the contract; the
+backing store is pluggable.
+
+## Architecture
+
+### Identity Struct
+
+```rust
+pub struct Identity {
+    pub id: String,                               // Fingerprint or account UUID
+    pub scopes: Vec<String>,                      // e.g., ["relay:connect", "service:gitea:read"]
+    pub resources: HashMap<String, Vec<String>>,   // e.g., {"service": ["gitea", "registry"]}
+}
+```
+
+The `id` field serves dual purpose:
+- **Config-based auth** (`ConfigIdentityProvider`): holds the Ed25519 key
+  fingerprint (e.g., `SHA256:abc123...`)
+- **Database-backed auth** (`StorageIdentityProvider`): holds the account UUID
+  from the `accounts` table
+
+This keeps the type simple while accommodating both auth paths. Downstream
+consumers (forwarding policy, call protocol ACL checks) use `scopes` and
+`resources` without knowing whether the identity came from a config file or a
+database.
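+
+To illustrate that consumers only look at `scopes` and `resources`, here is a
+small sketch. The `Identity` struct is copied from above; `has_scope`,
+`can_access`, and `sample` are hypothetical helpers, and exact string matching
+is an assumption (this spec does not define wildcard or hierarchy semantics
+for scopes).
+
+```rust
+use std::collections::HashMap;
+
+pub struct Identity {
+    pub id: String,
+    pub scopes: Vec<String>,
+    pub resources: HashMap<String, Vec<String>>,
+}
+
+/// Hypothetical helper: exact-match scope check.
+fn has_scope(identity: &Identity, scope: &str) -> bool {
+    identity.scopes.iter().any(|s| s == scope)
+}
+
+/// Hypothetical helper: is `name` listed under the resource `kind`?
+fn can_access(identity: &Identity, kind: &str, name: &str) -> bool {
+    identity
+        .resources
+        .get(kind)
+        .map_or(false, |names| names.iter().any(|n| n == name))
+}
+
+/// Builds an identity with fixed scopes; only the `id` differs per auth path.
+fn sample(id: &str) -> Identity {
+    let mut resources = HashMap::new();
+    resources.insert(
+        "service".to_string(),
+        vec!["gitea".to_string(), "registry".to_string()],
+    );
+    Identity {
+        id: id.to_string(),
+        scopes: vec!["relay:connect".to_string()],
+        resources,
+    }
+}
+```
+
+An identity whose `id` is a fingerprint and one whose `id` is an account UUID
+give the same answers from both helpers, because neither helper reads `id`.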
+
+### IdentityProvider Trait
+
+```rust
+pub trait IdentityProvider: Send + Sync + 'static {
+    /// Resolve an SSH public key fingerprint to an identity.
+    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
+
+    /// Resolve an auth token to an identity.
+    /// Returns None if the token is invalid, expired, or the key is not authorized.
+    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
+}
+```
+
+Both SSH key auth and token auth resolve to the same `Identity` type. The trait
+lives in `alknet_core::auth`.
+
+### ConfigIdentityProvider (Default)
+
+Reads the `auth` section of `ArcSwap<DynamicConfig>` per ADR-030. Every
+authorized key gets a default scope set. No database dependency. This is the
+default for CLI and single-node deployments.
+
+```rust
+pub struct ConfigIdentityProvider {
+    auth_config: Arc<ArcSwap<DynamicConfig>>,
+}
+
+impl IdentityProvider for ConfigIdentityProvider {
+    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
+        let config = self.auth_config.load();
+        config.auth.ssh.authorized_keys.get(fingerprint)
+            .map(|key_entry| Identity {
+                id: fingerprint.to_string(),
+                scopes: key_entry.scopes.clone(),
+                resources: key_entry.resources.clone(),
+            })
+    }
+
+    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity> {
+        // Verify Ed25519 signature against the same authorized_keys set
+        // Resolve to the same Identity as SSH auth would produce
+        todo!("token verification is elided in this spec")
+    }
+}
+```
+
+### StorageIdentityProvider (Production)
+
+Implemented in `alknet-storage` (not in alknet-core). Backed by SQLite
+`peer_credentials` and `api_keys` tables plus the ACL graph. Resolves
+fingerprint → account → organization membership → effective scopes. It
+implements the `IdentityProvider` trait defined in alknet-core, so core never
+depends on storage.
+
+### AuthProtocol irpc Service
+
+The `AuthProtocol` irpc service (behind the `irpc` feature flag per ADR-028)
+provides an async boundary for auth verification. It is one way to satisfy the
+`IdentityProvider` trait, not a replacement for it:
+
+```rust
+enum AuthProtocol {
+    VerifyPubkey { fingerprint: String, key_data: Vec<u8> },
+    VerifyToken { token_bytes: Vec<u8>, timestamp: u64 },
+    ReloadKeys,
+    CheckAccess { identity: Identity, operation: String },
+}
+
+enum AuthResult {
+    Ok(Identity),
+    Denied(String),
+}
+```
+
+The relationship:
+- **Trait-based path**: Handler calls `identity_provider.resolve_from_fingerprint()`
+  directly. Zero overhead. Used when irpc is disabled or when the
+  implementation is local.
+- **irpc path**: Handler calls `identity_provider.resolve_from_fingerprint()`,
+  which internally delegates to `AuthProtocol::VerifyPubkey` via an irpc client.
+  Used in production deployments with SQLite-backed auth.
+
+Both paths produce the same `Identity` result.
+
+### Auth Flows
+
+**SSH key auth** (existing, unchanged):
+```
+Client connects → SSH handshake → auth_publickey() callback
+  → IdentityProvider::resolve_from_fingerprint(fingerprint)
+  → Some(Identity) or None
+```
+
+**Token auth** (new, for non-SSH transports):
+```
+Browser connects → WebTransport CONNECT request
+  → Extract token from URL path or Authorization header
+  → IdentityProvider::resolve_from_token(token)
+  → Some(Identity) or None
+```
+
+Both paths produce an `Identity`. The `Identity` is attached to the connection
+and used by `ForwardingPolicy` and call protocol for authorization decisions.
+
+## Constraints
+
+- `Identity` and `IdentityProvider` live in `alknet_core::auth`. No database
+  dependency at the core level (ADR-029).
+- alknet-storage implements the core trait — the dependency goes from storage
+  to core, not the other way.
+- The `id` field in `Identity` serves dual purpose (fingerprint or UUID). This
+  is a deliberate simplification — downstream consumers don't need to know the
+  source.
+- Certificate authority tokens are not supported for token auth in v1 (ADR-023).
+- The irpc feature flag means nodes that only do SSH tunneling don't need the
+  service layer overhead.
+
+## Open Questions
+
+- None specific to this spec. See [open-questions.md](open-questions.md) for
+  general auth questions (OQ-15, OQ-19).
+
+## Design Decisions
+
+| ADR | Decision | Summary |
+|-----|----------|---------|
+| [029](decisions/029-identity-core-type.md) | Identity as core type | `Identity` and `IdentityProvider` live in alknet-core, not storage |
+| [028](decisions/028-auth-irpc-service.md) | Auth as irpc service | `AuthProtocol` behind feature flag; `IdentityProvider` is the contract |
+| [023](decisions/023-unified-auth-shared-key-material.md) | Unified auth | Same key material for SSH and token auth; same `Identity` result |
+
+## References
+
+- [auth.md](auth.md) — Token authentication, AuthPolicy, WebTransport session handling
+- [research/services.md](../research/services.md) — AuthService, AuthProtocol definition
+- [research/integration-plan.md](../research/integration-plan.md) — Phase 1.2
+- [ADR-030](decisions/030-static-dynamic-config-split.md) — DynamicConfig (ConfigIdentityProvider reads from it)
+- [ADR-031](decisions/031-forwarding-policy.md) — ForwardingPolicy consumes Identity.scopes
--- a/docs/architecture/interface.md
+++ b/docs/architecture/interface.md
@@ -0,0 +1,221 @@
+---
+status: draft
+last_updated: 2026-06-07
+---
+
+# Interface (Layer 2)
+
+## What
+
+The Interface layer sits between Transport (Layer 1) and Protocol (Layer 3).
+An Interface consumes a `Transport::Stream` and produces call protocol sessions.
+SSH is an interface, not a transport — it wraps a byte stream in session
+semantics. Raw framing (4-byte length prefix + JSON `EventEnvelope`) is another
+interface, one without SSH overhead.
+
+## Why
+
+In the current architecture, SSH is deeply embedded in `ServerHandler`. This
+tangling of transport, interface, and protocol makes it impossible to:
+
+- Run the call protocol over DNS queries without wrapping SSH inside DNS
+- Use raw framing for local service mesh (no SSH overhead)
+- Support WebTransport direct call protocol for browsers
+- Separate auth mechanics from channel management
+
+The three-layer model (ADR-026) cleanly separates these concerns. Transport
+produces bytes. Interface parses bytes into sessions. Protocol carries
+semantics. A connection is always a (Transport, Interface) pair.
+
+## Architecture
+
+### Three-Layer Model
+
+```
+Layer 3: Protocol    (Call protocol, Operations, OperationEnv)
+Layer 2: Interface   (SSH, raw framing, HTTP/WS, DNS control channel)
+Layer 1: Transport   (TCP, TLS, iroh, DNS, WebTransport)
+```
+
+- **Layer 1: Transport** — produces byte streams (`AsyncRead + AsyncWrite + Unpin
+  + Send`). Unchanged per ADR-001.
+- **Layer 2: Interface** — consumes a `Transport::Stream` and produces call
+  protocol sessions. SSH does handshake + auth + channel multiplexing. Raw
+  framing does length-prefix parsing.
+- **Layer 3: Protocol** — carries semantics. Call protocol events, operation
+  registry, service calls. Agnostic to both Transport and Interface below it.
+
+### Interface Trait
+
+```rust
+#[async_trait]
+pub trait Interface: Send + Sync + 'static {
+    type Session;
+    async fn accept(&self, stream: TransportStream, config: &InterfaceConfig) -> Result<Self::Session>;
+}
+```
+
+The session produced by an interface is consumed by the call protocol handler.
+Different interfaces produce different session types, but the call protocol
+handler receives `EventEnvelope` frames from any interface.
+
+### SshInterface
+
+Wraps the existing `ServerHandler` logic. This is the most complex interface
+because SSH provides channel multiplexing, auth negotiation, and proxy
+management within a single session.
+
+What stays in SshInterface (Layer 2):
+- SSH handshake and session management
+- Auth delegation to `IdentityProvider` (via `auth_publickey()` callback)
+- Channel multiplexing (multiple channels per session)
+- `alknet-control:0` channel routing to call protocol
+
+What moves to Layer 3 (call protocol handler):
+- Operation registry and dispatch
+- Forwarding policy checks (per ADR-031)
+- Operation context construction (Identity, scopes)
+
+What moves to per-connection state:
+- Port forwarding proxy logic
+
+### RawFramingInterface
+
+Reads 4-byte big-endian length prefix + JSON `EventEnvelope` frames directly
+from the transport stream. No SSH wrapping. No channel multiplexing — the
+entire stream is a single call protocol channel.
+
+```rust
+pub struct RawFramingInterface;
+
+impl Interface for RawFramingInterface {
+    type Session = RawFramingSession;
+    // Reads length-prefixed EventEnvelope frames from the stream
+}
+```
+
+Used for:
+- DNS control channel (DNS transport + raw framing)
+- Local service mesh (TCP + raw framing, no SSH overhead)
+- Browser direct call protocol (WebTransport + raw framing, future)
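+
+The framing itself is small enough to sketch with `std::io`. This is a
+synchronous illustration under assumptions, not the interface implementation:
+`write_frame`, `read_frame`, and `MAX_FRAME_LEN` are hypothetical, the payload
+is treated as opaque bytes (the JSON `EventEnvelope` is not parsed), and the
+size cap is an invented safeguard that this spec does not define.
+
+```rust
+use std::io::{self, Read, Write};
+
+/// Hypothetical cap so a hostile length prefix cannot force a huge allocation.
+const MAX_FRAME_LEN: usize = 1024 * 1024;
+
+/// Writes a 4-byte big-endian length prefix followed by the payload bytes.
+fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
+    if payload.len() > MAX_FRAME_LEN {
+        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
+    }
+    let len = payload.len() as u32; // safe: bounded by MAX_FRAME_LEN above
+    writer.write_all(&len.to_be_bytes())?;
+    writer.write_all(payload)
+}
+
+/// Reads one frame: the length prefix first, then exactly that many bytes.
+fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
+    let mut prefix = [0u8; 4];
+    reader.read_exact(&mut prefix)?;
+    let len = u32::from_be_bytes(prefix) as usize;
+    if len > MAX_FRAME_LEN {
+        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds limit"));
+    }
+    let mut payload = vec![0u8; len];
+    reader.read_exact(&mut payload)?;
+    Ok(payload)
+}
+```
+
+Because the whole stream is a single call protocol channel, a reader can call
+`read_frame` in a loop until the stream ends. An async version would follow
+the same two-step read over `AsyncRead`.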
+
+### DNS Control Channel
+
+A (DNS transport, raw framing interface) pair. The DNS transport encodes
+`EventEnvelope` frames as DNS query/response pairs. The raw framing interface
+parses them directly — **NOT** SSH inside DNS.
+
+```
+Client: Encode EventEnvelope as base32 DNS query labels
+  → DNS Transport → DNS Server → Raw Framing Interface → Call Protocol Handler
+
+Server: Return EventEnvelope as DNS TXT record response
+  ← Raw Framing Interface ← DNS Transport ← Call Protocol Handler
+```
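+
+The client-side encoding step can be sketched as follows. This is an
+assumption-heavy illustration: the spec only says "base32 DNS query labels",
+so the lowercase unpadded RFC 4648 alphabet, the label layout, and the names
+`base32_nopad` and `to_query_name` are all hypothetical. The 63-byte label
+limit comes from DNS itself. Splitting an envelope across several queries is
+not covered here.
+
+```rust
+const ALPHABET: &[u8; 32] = b"abcdefghijklmnopqrstuvwxyz234567";
+
+/// RFC 4648 base32 without padding, lowercase (DNS names are case-insensitive).
+fn base32_nopad(data: &[u8]) -> String {
+    let mut out = String::new();
+    let mut buffer: u32 = 0; // bit accumulator; stale high bits shift out
+    let mut bits: u32 = 0;   // number of not-yet-emitted bits in `buffer`
+    for &byte in data {
+        buffer = (buffer << 8) | u32::from(byte);
+        bits += 8;
+        while bits >= 5 {
+            bits -= 5;
+            out.push(ALPHABET[((buffer >> bits) & 0x1f) as usize] as char);
+        }
+    }
+    if bits > 0 {
+        // Left-align the remaining bits into a final 5-bit group.
+        out.push(ALPHABET[((buffer << (5 - bits)) & 0x1f) as usize] as char);
+    }
+    out
+}
+
+/// Splits the encoded payload into labels of at most 63 bytes and appends
+/// the zone. Does not check the overall DNS name length limit.
+fn to_query_name(payload: &[u8], zone: &str) -> String {
+    let encoded = base32_nopad(payload);
+    let mut labels: Vec<&str> = encoded
+        .as_bytes()
+        .chunks(63)
+        .map(|chunk| std::str::from_utf8(chunk).expect("base32 output is ASCII"))
+        .collect();
+    labels.push(zone);
+    labels.join(".")
+}
+```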
+
+### Valid (Transport, Interface) Pairs
+
+| Transport | Interface | Use case |
+|-----------|-----------|----------|
+| TLS | SSH | Standard alknet tunnel |
+| TCP | SSH | Plain SSH tunnel |
+| iroh | SSH | P2P SSH tunnel |
+| DNS | raw framing | DNS control channel |
+| WebTransport | SSH | Browser SSH tunnel (future) |
+| WebTransport | raw framing | Browser call protocol (future) |
+| TCP | raw framing | Direct call protocol, local mesh |
+
+### InterfaceConfig
+
+Different interfaces require different configuration:
+
+```rust
+pub enum InterfaceConfig {
+    Ssh(SshInterfaceConfig),
+    RawFraming(RawFramingConfig),
+}
+
+pub struct SshInterfaceConfig {
+    pub auth: Arc<dyn IdentityProvider>,
+    pub forwarding: Arc<ArcSwap<DynamicConfig>>, // for ForwardingPolicy
+    pub host_key: Arc<PrivateKey>,
+}
+
+pub struct RawFramingConfig {
+    // No SSH-specific config needed
+    // Auth is handled by the transport layer (e.g., token auth for WebTransport)
+    // or by the call protocol layer
+}
+```
+
+### Auth Across Interfaces
+
+- **SshInterface**: Auth happens during SSH handshake via
+  `IdentityProvider::resolve_from_fingerprint()`. The authenticated `Identity`
+  is attached to the session.
+- **RawFramingInterface**: Auth is handled by the transport (e.g., token auth
+  for WebTransport via `IdentityProvider::resolve_from_token()`) or by the call
+  protocol layer (operation-level ACL).
+
+Both paths produce the same `Identity` type (ADR-029).
+
+### Server Accept Loop
+
+With the Interface trait, the accept loop becomes:
+
+```rust
+for listener in listeners {
+    let (transport, interface) = listener;
+    tokio::spawn(async move {
+        loop {
+            let stream = transport.accept().await?;
+            let session = interface.accept(stream, &config).await?;
+            // session produces call protocol events
+            // call protocol handler is interface-agnostic
+        }
+    });
+}
+```
+
+## Constraints
+
+- The Interface trait must accommodate both SSH's channel multiplexing and raw
+  framing's single-stream model through the same abstraction.
+- `SshInterface` is the most invasive refactoring in Phase 1. The existing
+  `ServerHandler` owns auth, channel management, and proxy logic — extracting
+  these cleanly requires careful design (integration-plan, Phase 1.8).
+- DNS transport implementation is Phase 4 work. The `TransportKind::Dns` variant
+  and `RawFramingInterface` are defined now; implementation is deferred.
+- WebTransport is Phase 4 work. The `TransportKind::WebTransport` variant is a
+  tag only for now.
+
+## Open Questions
+
+- **OQ-IF-01**: How does the `Interface` session type relate to the call
+  protocol's `EventEnvelope` stream? Does every session implement
+  `Stream<Item=EventEnvelope>`? This needs design during Phase 1.8.
+
+- **OQ-IF-02**: Should `SshInterface` own the `ForwardingPolicy` check for
+  `channel_open_direct_tcpip`, or should that move to Layer 3? Current thinking:
+  the forwarding check is a Layer 3 concern (it's policy, not session mechanics),
+  but the channel open/close lifecycle is Layer 2. The Interface reports channel
+  open requests to Layer 3; Layer 3 applies `ForwardingPolicy` and tells
+  Layer 2 whether to proxy.
+
+## Design Decisions
+
+| ADR | Decision | Summary |
+|-----|----------|---------|
+| [026](decisions/026-transport-interface-separation.md) | Three-layer model | SSH is Layer 2, not Layer 1 |
+| [033](decisions/033-operationenv-irpc-call-protocol.md) | OperationEnv | Protocol is interface-agnostic |
+| [029](decisions/029-identity-core-type.md) | Identity as core type | Auth resolution across interfaces |
+| [031](decisions/031-forwarding-policy.md) | Forwarding policy | Layer 3 policy applied to Layer 2 channel requests |
+
+## References
+
+- [research/integration-plan.md](../research/integration-plan.md) — Phase 1.8, valid (Transport, Interface) pairs
+- [research/core.md](../research/core.md) — DNS transport, three-layer model
+- [ADR-026](decisions/026-transport-interface-separation.md) — Transport/interface separation
+- [transport.md](transport.md) — Transport trait (unchanged at Layer 1)
+- [server.md](server.md) — Current ServerHandler (will become SshInterface)
+- [identity.md](identity.md) — IdentityProvider, auth across interfaces
--- a/docs/architecture/open-questions.md
+++ b/docs/architecture/open-questions.md
@@ -1,6 +1,6 @@
 ---
 status: draft
-last_updated: 2026-06-04
+last_updated: 2026-06-07
 ---

 # Open Questions
@@ -96,10 +96,10 @@ last_updated: 2026-06-04

 ### OQ-12: Per-user forwarding scope vs global rules
 - **Origin**: [research/configuration.md](../research/configuration.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending)
- **Cross-references**: configuration.md
+- **Status**: ~~resolved~~
+- **Priority**: ~~medium~~ —
+- **Resolution**: ADR-031 — Start with global rules + principal matching from `Identity.scopes`. Per-user scope from `peer_credentials.metadata.scopes` via `IdentityProvider`. The `ForwardingPolicy` evaluates rules against `Identity.id` and `Identity.scopes` from the authenticated identity.
+- **Cross-references**: [ADR-031](decisions/031-forwarding-policy.md), [configuration.md](configuration.md)

 ### OQ-13: Config file auto-reload via file watching
 - **Origin**: [research/configuration.md](../research/configuration.md)
@@ -119,38 +119,59 @@ last_updated: 2026-06-04
 - **Origin**: [research/configuration.md](../research/configuration.md)
 - **Status**: open
 - **Priority**: medium
- **Resolution**: (pending — needs R&D in WebTransport transport session)
- **Cross-references**: [auth.md](auth.md), OQ-19
+- **Resolution**: (deferred to Phase 4 — needs R&D in WebTransport transport session)
+- **Cross-references**: [auth.md](auth.md), OQ-19, [interface.md](interface.md)

 ### OQ-16: Transport-specific forwarding policy (e.g., WebTransport clients restricted to alknet-* channels)
 - **Origin**: [research/configuration.md](../research/configuration.md)
- **Status**: open
- **Priority**: low
- **Resolution**: (pending — defer to forwarding policy design)
- **Cross-references**: configuration.md
+- **Status**: ~~resolved~~
+- **Priority**: ~~low~~ —
+- **Resolution**: ADR-031 — Add `TransportKind` match in `ForwardingRule`. WebTransport clients can be restricted to `alknet-*` channels via `TargetPattern::AlknetPrefix` combined with a `TransportKind::WebTransport` filter.
+- **Cross-references**: [ADR-031](decisions/031-forwarding-policy.md), [configuration.md](configuration.md)

 ### OQ-17: Transport-aware auth layer (SSH keys vs API keys for non-SSH transports)
 - **Origin**: [research/configuration.md](../research/configuration.md)
 - **Status**: ~~resolved~~
 - **Priority**: ~~medium~~ —
 - **Resolution**: ADR-023 — Unified auth with shared key material. SSH transports use SSH pubkey auth. Non-SSH transports (WebTransport) use Ed25519-signed timestamp tokens. Both verify against the same `authorized_keys` set. The presentation differs per transport, but the identity is unified. `AuthPolicy` holds both `SshAuthConfig` and `TokenAuthConfig`, with `TokenKeySource::Shared` as the default (same keys for both paths). `IdentityProvider` trait decouples alknet-core from identity storage.
- **Cross-references**: [ADR-023](decisions/023-unified-auth-shared-key-material.md), [auth.md](auth.md), OQ-15
+- **Cross-references**: [ADR-023](decisions/023-unified-auth-shared-key-material.md), [identity.md](identity.md), OQ-15
+
+### OQ-23: irpc dependency — always or behind feature flag?
+- **Origin**: [research/integration-plan.md](../research/integration-plan.md)
+- **Status**: ~~resolved~~
+- **Priority**: ~~medium~~ —
+- **Resolution**: ADR-027 — Feature flag. Nodes that only do SSH tunneling don't need the service layer. irpc is behind a feature flag in alknet-core and an independent dependency in alknet-secret and alknet-storage.
+- **Cross-references**: [ADR-027](decisions/027-crate-decomposition.md)
+
+### OQ-24: DNS control channel scope for initial implementation?
+- **Origin**: [research/integration-plan.md](../research/integration-plan.md)
+- **Status**: ~~resolved~~
+- **Priority**: ~~medium~~ —
+- **Resolution**: ADR-026 — DNS control channel carries call protocol frames only (no SSH tunneling over DNS). The (DNS transport, raw framing interface) pair sends `EventEnvelope` directly. SSH-over-DNS is a future possibility but out of scope.
+- **Cross-references**: [ADR-026](decisions/026-transport-interface-separation.md), [interface.md](interface.md)
+
+### OQ-25: alknet-storage and alknet-secret irpc dependency
+- **Origin**: [research/integration-plan.md](../research/integration-plan.md)
+- **Status**: ~~resolved~~
+- **Priority**: ~~low~~ —
+- **Resolution**: ADR-027 — Each crate depends on irpc independently. They are separate crates, and irpc is a shared library that each declares as its own dependency.
+- **Cross-references**: [ADR-027](decisions/027-crate-decomposition.md)

 ## Auth

 ### OQ-18: Source of Identity.scopes — ForwardingPolicy, IdentityProvider, or both?
 - **Origin**: [auth.md](auth.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending)
- **Cross-references**: ADR-023, [call-protocol.md](call-protocol.md)
+- **Status**: ~~resolved~~
+- **Priority**: ~~medium~~ —
+- **Resolution**: ADR-029 and ADR-031 — `IdentityProvider` owns scopes. The `Identity` struct includes `scopes` and `resources` fields populated by the `IdentityProvider` implementation (config-based or database-backed). `ForwardingPolicy` uses scopes from `Identity` — it consumes them, it doesn't produce them.
+- **Cross-references**: [ADR-029](decisions/029-identity-core-type.md), [ADR-031](decisions/031-forwarding-policy.md), [identity.md](identity.md)

 ### OQ-19: Separate TLS identity for WebTransport vs shared with SSH-over-TLS?
 - **Origin**: [auth.md](auth.md)
 - **Status**: open
 - **Priority**: low
- **Resolution**: (pending)
- **Cross-references**: OQ-15
+- **Resolution**: (deferred to Phase 4 — QUIC runs over UDP and TLS-over-TCP runs over TCP, so both can share port 443 without conflict)
+- **Cross-references**: OQ-15, [interface.md](interface.md)

 ## Call Protocol

@@ -158,19 +179,65 @@ last_updated: 2026-06-04
 - **Origin**: [call-protocol.md](call-protocol.md)
 - **Status**: open
 - **Priority**: medium
- **Resolution**: (pending — registration on connect / cleanup on disconnect is the leading approach)
+- **Resolution**: (pending — registration on connect / cleanup on disconnect is the leading approach but needs spec in call-protocol.md)
 - **Cross-references**: ADR-024, ADR-025

 ### OQ-21: Routing calls to specific workers with same-service operations
 - **Origin**: [call-protocol.md](call-protocol.md)
 - **Status**: ~~resolved~~
 - **Priority**: ~~medium~~ —
- **Resolution**: ADR-024, ADR-025 — Operation paths use `/{node}/{service}/{op}` format. The first path segment identifies the node and routes the call to the correct connected node. Multiple workers exposing the same service (e.g., two dev envs both with `/fs/*`) are differentiated by the node prefix (`/dev1/fs/readFile` vs `/dev2/fs/readFile`). The head maintains a routing table mapping node identity to connection. This mirrors iroh's ALPN dispatch: first segment = routing key.
+- **Resolution**: ADR-024, ADR-025 — Operation paths use `/{node}/{service}/{op}` format. The first path segment identifies the node and routes the call to the correct connected node. Multiple workers exposing the same service are differentiated by the node prefix (`/dev1/fs/readFile` vs `/dev2/fs/readFile`). The head maintains a routing table mapping node identity to connection.
 - **Cross-references**: [call-protocol.md](call-protocol.md), ADR-024, ADR-025
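+
+The node-prefix routing described in OQ-21 can be sketched as follows. This assumes paths have exactly three segments and that the routing table maps a node name to an opaque connection id; `OperationPath`, `RoutingTable`, and the `u64` connection id are hypothetical names for illustration.
+
+```rust
+use std::collections::HashMap;
+
+/// Parsed form of an operation path such as `/dev1/fs/readFile`.
+#[derive(Debug, PartialEq, Eq)]
+struct OperationPath<'a> {
+    node: &'a str,
+    service: &'a str,
+    op: &'a str,
+}
+
+/// Splits `/{node}/{service}/{op}`; rejects missing, empty, or extra segments.
+fn parse_operation_path(path: &str) -> Option<OperationPath<'_>> {
+    let mut parts = path.strip_prefix('/')?.split('/');
+    let (node, service, op) = (parts.next()?, parts.next()?, parts.next()?);
+    if parts.next().is_some() || node.is_empty() || service.is_empty() || op.is_empty() {
+        return None;
+    }
+    Some(OperationPath { node, service, op })
+}
+
+/// Hypothetical head-side routing table: node name -> connection id.
+struct RoutingTable {
+    connections: HashMap<String, u64>,
+}
+
+impl RoutingTable {
+    /// Uses the first path segment as the routing key.
+    fn route<'a>(&self, path: &'a str) -> Option<(u64, OperationPath<'a>)> {
+        let parsed = parse_operation_path(path)?;
+        let conn = *self.connections.get(parsed.node)?;
+        Some((conn, parsed))
+    }
+}
+```
+
+Two workers that both expose `/fs/*` stay distinguishable because `/dev1/fs/readFile` and `/dev2/fs/readFile` resolve to different connection ids.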

 ### OQ-22: Client streaming (streaming inputs) in the call protocol?
 - **Origin**: [call-protocol.md](call-protocol.md)
+- **Status**: ~~resolved~~
+- **Priority**: ~~low~~ —
+- **Resolution**: Deferred. Current model (single request, optional streaming response) covers all identified use cases. Client streaming can be added later if needed.
+- **Cross-references**: ADR-024
+
+## Services
+
+### OQ-SVC-01: Should the secret service support multiple seed phrases (one per tenant)?
+- **Origin**: [secret-service.md](secret-service.md)
 - **Status**: open
 - **Priority**: low
- **Resolution**: (pending)
- **Cross-references**: ADR-024
+- **Resolution**: (deferred — one seed per node is simplest; multi-seed can be added later by indexing `Unlock` with a tenant ID)
+- **Cross-references**: [secret-service.md](secret-service.md)
+
+### OQ-SVC-02: Should service protocols use postcard (binary) or JSON for remote calls?
+- **Origin**: [research/services.md](../research/services.md)
+- **Status**: ~~resolved~~
+- **Priority**: ~~low~~ —
+- **Resolution**: Postcard for irpc (Rust-to-Rust, efficient). JSON for call protocol (cross-language, universal). The irpc remote path naturally uses postcard.
+- **Cross-references**: [services.md](services.md)
+
+### OQ-SVC-03: How does the secret service integrate with the existing EncryptedDataSchema from @alkdev/storage?
+- **Origin**: [secret-service.md](secret-service.md)
+- **Status**: open
+- **Priority**: medium
+- **Resolution**: (pending — Rust implementation replaces PBKDF2 password-based encryption with derived AES-256-GCM keys; EncryptedData format is a superset; migration by re-encrypting)
+- **Cross-references**: [secret-service.md](secret-service.md), [storage.md](storage.md)
+
+### OQ-SVC-04: Should workers cache derived keys locally?
+- **Origin**: [secret-service.md](secret-service.md)
+- **Status**: open
+- **Priority**: low
+- **Resolution**: (pending; leading approach is to cache with a TTL, default 1 hour, with the head able to revoke by invalidating the session)
+- **Cross-references**: [secret-service.md](secret-service.md)
+
+## Interface
+
+### OQ-IF-01: How does the Interface session type relate to the call protocol's EventEnvelope stream?
+- **Origin**: [interface.md](interface.md)
+- **Status**: open
+- **Priority**: high
+- **Resolution**: (pending — needs design during Phase 1.8 implementation)
+- **Cross-references**: [interface.md](interface.md), [ADR-026](decisions/026-transport-interface-separation.md)
+
+### OQ-IF-02: Should SshInterface own ForwardingPolicy checks or should they move to Layer 3?
+- **Origin**: [interface.md](interface.md)
+- **Status**: open
+- **Priority**: medium
+- **Resolution**: (pending — current thinking: forwarding check is Layer 3 policy, but channel open/close lifecycle is Layer 2. The Interface reports channel open requests to Layer 3; Layer 3 applies ForwardingPolicy.)
+- **Cross-references**: [interface.md](interface.md), [ADR-031](decisions/031-forwarding-policy.md)
--- a/docs/architecture/secret-service.md
+++ b/docs/architecture/secret-service.md
@@ -0,0 +1,197 @@
+---
+status: draft
+last_updated: 2026-06-07
+---
+
+# Secret Service
+
+## What
+
+The `alknet-secret` crate provides BIP39 mnemonic generation, SLIP-0010 Ed25519
+HD key derivation, AES-256-GCM encryption for external credentials, and the
+`SecretProtocol` irpc service. It is the only component that holds the master
+seed phrase.
+
+## Why
+
+Operations like SSH key generation, API key storage, and Ethereum transaction
+signing all need deterministic key derivation from a single root of trust. The
+seed phrase is the single recovery mechanism — from it, all self-generated
+secrets can be derived on demand. External credentials (third-party API keys,
+OAuth tokens) cannot be derived and must be stored encrypted, with the
+encryption key itself derived from the seed.
+
+The secret service isolates this responsibility: no other crate sees the seed,
+and derived keys are provided on demand through an irpc service interface.
+
+## Architecture
+
+### Security Model
+
+| State | What's in memory | What's on disk |
+|-------|-----------------|---------------|
+| Locked | Nothing | Encrypted database, derivation path metadata |
+| Unlocked | Master seed in RAM | Same (seed is never persisted) |
+| After use | Derived keys cached in RAM | Derivation paths only |
+
+The seed phrase is entered once (at node startup or via `Unlock` call), held
+only in RAM, and never written to disk. The `Lock` call purges the seed and all
+cached derived keys from memory.
+
+### SecretProtocol irpc Service
+
+```rust
+#[rpc_requests(message = SecretMessage)]
+#[derive(Debug, Serialize, Deserialize)]
+enum SecretProtocol {
+    #[rpc(tx=oneshot::Sender<DerivedKey>)]
+    #[wrap(DeriveEd25519)]
+    DeriveEd25519 { path: String },
+
+    #[rpc(tx=oneshot::Sender<DerivedKey>)]
+    #[wrap(DeriveEncryptionKey)]
+    DeriveEncryptionKey { path: String },
+
+    #[rpc(tx=oneshot::Sender<DerivedKey>)]
+    #[wrap(DeriveEthereumKey)]
+    DeriveEthereumKey { path: String },
+
+    #[rpc(tx=oneshot::Sender<Vec<u8>>)]
+    #[wrap(DerivePassword)]
+    DerivePassword { path: String, length: usize },
+
+    #[rpc(tx=oneshot::Sender<EncryptedData>)]
+    #[wrap(Encrypt)]
+    Encrypt { plaintext: String, key_version: u32 },
+
+    #[rpc(tx=oneshot::Sender<String>)]
+    #[wrap(Decrypt)]
+    Decrypt { encrypted: EncryptedData },
+
+    #[rpc(tx=oneshot::Sender<()>)]
+    #[wrap(Lock)]
+    Lock,
+
+    #[rpc(tx=oneshot::Sender<()>)]
+    #[wrap(Unlock)]
+    Unlock { passphrase: String },
+}
+
+#[derive(Debug, Serialize, Deserialize)]
+struct DerivedKey {
+    key_type: KeyType,
+    private_key: Vec<u8>,
+    public_key: Vec<u8>,
+}
+
+#[derive(Debug, Serialize, Deserialize)]
+enum KeyType {
+    Ed25519,
+    Aes256Gcm,
+    Secp256k1,
+}
+
+#[derive(Debug, Serialize, Deserialize)]
+struct EncryptedData {
+    key_version: u32,
+    salt: String,   // Base64-encoded
+    iv: String,     // Base64-encoded
+    data: String,   // Base64-encoded
+}
+```
+
+### BIP39 Mnemonic and Seed Derivation
+
+```rust
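+// Illustrative sketch: type and function names are placeholders, not a pinned crate API.
+// BIP39: mnemonic phrase (plus optional passphrase) -> seed, then seed -> master key.
+let mnemonic = Mnemonic::from_phrase(&phrase, Language::English)?;
+let seed = mnemonic.to_seed(Some(&passphrase));
+let master_key = ExtendedPrivKey::new_master(Network::Alknet, &seed)?;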
+```
+
+### SLIP-0010 Ed25519 HD Key Derivation
+
+The `74'` coin type is unallocated per SLIP-0044 and reserved for alknet.
+
+### Derivation Path Constants
+
+| Path | Purpose | Curve/Algorithm |
+|------|---------|----------------|
+| `m/74'/0'/0'/0'` | Primary identity keypair | Ed25519 (alknet auth) |
+| `m/74'/0'/0'/{n}'` | Worker/device identity | Ed25519 |
+| `m/74'/0'/1'/0'` | SSH host key | Ed25519 |
+| `m/74'/1'/0'/{hash}'` | Site-specific password | Deterministic |
+| `m/74'/2'/0'/0'` | Encryption key for external credentials | AES-256-GCM |
+| `m/44'/60'/0'/0/0` | Ethereum signing key | secp256k1 |
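+
+A minimal sketch of how the path strings in the table could be turned into child indexes, assuming the usual BIP32 notation where a trailing `'` marks a hardened index. `parse_path` and `is_valid_ed25519_path` are hypothetical helper names. SLIP-0010 Ed25519 supports hardened derivation only, which is why every Ed25519 path above is fully hardened while the secp256k1 Ethereum path is not.
+
+```rust
+/// Bit that marks a hardened child index (BIP32 / SLIP-0010).
+const HARDENED: u32 = 0x8000_0000;
+
+/// Parses a path such as `m/74'/0'/1'/0'` into child indexes.
+/// Hardened segments get the high bit set.
+fn parse_path(path: &str) -> Result<Vec<u32>, String> {
+    let mut segments = path.split('/');
+    if segments.next() != Some("m") {
+        return Err("path must start with 'm'".to_string());
+    }
+    segments
+        .map(|seg| {
+            let (digits, hardened) = match seg.strip_suffix('\'') {
+                Some(d) => (d, true),
+                None => (seg, false),
+            };
+            let index: u32 = digits
+                .parse()
+                .map_err(|_| format!("invalid segment: {}", seg))?;
+            if index >= HARDENED {
+                return Err(format!("index out of range: {}", seg));
+            }
+            Ok(if hardened { index | HARDENED } else { index })
+        })
+        .collect()
+}
+
+/// SLIP-0010 Ed25519 allows hardened children only.
+fn is_valid_ed25519_path(indexes: &[u32]) -> bool {
+    indexes.iter().all(|i| i & HARDENED != 0)
+}
+```
+
+Placeholder segments such as `{n}` and `{hash}` must be substituted with concrete values below 2^31 before parsing.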
+
+### AES-256-GCM Encryption for External Credentials
+
+External credentials (API keys, OAuth tokens) that cannot be derived are
+encrypted using a key derived from the seed at path `m/74'/2'/0'/0'`. The
+`EncryptedData` type stores the key version, salt, IV, and ciphertext. This
+format is compatible with the existing `@alkdev/storage` `EncryptedDataSchema`.
+
+1. The secret service derives an AES-256-GCM key via path `m/74'/2'/0'/0'`
+2. External credentials are encrypted with this key
+3. The encrypted data is stored as a `SecretNode` in the metagraph
+4. Only the derivation path and key version are stored in plain attributes
+5. The seed phrase (or derived encryption key) is held only by the secret
+   service — never in the database
+
+### Deployment Topologies
+
+**Minimal (single node, CLI)**: Secret service runs in the same process. Seed
+phrase entered at startup. All keys derived locally. No irpc overhead.
+
+**Production (head node)**: Secret service runs on a dedicated node or as a
+local irpc service. Workers request derived keys via irpc over QUIC. The seed
+never leaves the secret service node.
+
+## Constraints
+
+- The seed phrase is never persisted to disk. It is entered at startup or via
+  `Unlock` and held only in RAM.
+- `Lock` purges the seed and all cached derived keys from memory.
+- alknet-secret does not depend on alknet-core or alknet-storage. It is fully
+  independent.
+- The `EncryptedData` wire format (key_version, salt, iv, data) is shared with
+  alknet-storage for compatibility, but this is type-level compatibility — not a
+  crate dependency.
+- Per ADR-032, the secret service's Honker streams (key derivation notifications)
+  stay within the service boundary. External consumers use irpc calls or call
+  protocol operations that project to integration events.
+- The irpc service defines the wire format for in-cluster communication
+  (postcard serialization). For call protocol exposure (e.g.,
+  `/head/secrets/derive`), the service is wrapped in an operation that serializes
+  to JSON.
+
+## Open Questions
+
+- **OQ-SVC-01**: Should the secret service support multiple seed phrases (one per
+  tenant)? See [open-questions.md](open-questions.md).
+
+- **OQ-SVC-02**: Should service protocols use postcard (binary) or JSON for
+  remote calls? Postcard for irpc (Rust-to-Rust), JSON for call protocol
+  (cross-language). See [open-questions.md](open-questions.md).
+
+- **OQ-SVC-03**: How does the secret service integrate with the existing
+  `EncryptedDataSchema` from `@alkdev/storage`? The Rust implementation replaces
+  PBKDF2 password-based encryption with derived AES-256-GCM keys. The
+  `EncryptedData` format is a superset.
+
+- **OQ-SVC-04**: Should workers cache derived keys locally? Yes, with a TTL
+  (default: 1 hour). The head can revoke by invalidating the session.
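+
+If the TTL approach in OQ-SVC-04 is adopted, a worker-side cache could look roughly like this. It is a sketch under stated assumptions: `DerivedKeyCache` and its methods are hypothetical, the cache is keyed by derivation path, and the current time is passed in explicitly so expiry is easy to test. A real implementation would also zero key bytes on eviction, which is not shown here.
+
+```rust
+use std::collections::HashMap;
+use std::time::{Duration, Instant};
+
+struct CachedKey {
+    key: Vec<u8>,
+    fetched_at: Instant,
+}
+
+/// Hypothetical worker-side cache keyed by derivation path.
+struct DerivedKeyCache {
+    ttl: Duration,
+    entries: HashMap<String, CachedKey>,
+}
+
+impl DerivedKeyCache {
+    fn new(ttl: Duration) -> Self {
+        DerivedKeyCache { ttl, entries: HashMap::new() }
+    }
+
+    fn insert(&mut self, path: &str, key: Vec<u8>, now: Instant) {
+        self.entries
+            .insert(path.to_string(), CachedKey { key, fetched_at: now });
+    }
+
+    /// Returns the cached key, or evicts it and returns `None` once the TTL has elapsed.
+    fn get(&mut self, path: &str, now: Instant) -> Option<&[u8]> {
+        let expired = match self.entries.get(path) {
+            Some(entry) => now.duration_since(entry.fetched_at) >= self.ttl,
+            None => return None,
+        };
+        if expired {
+            self.entries.remove(path);
+            return None;
+        }
+        self.entries.get(path).map(|e| e.key.as_slice())
+    }
+
+    /// Drops everything, e.g. when the head invalidates the session.
+    fn clear(&mut self) {
+        self.entries.clear();
+    }
+}
+```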
+
+## Design Decisions
+
+| ADR | Decision | Summary |
+|-----|----------|---------|
+| [027](decisions/027-crate-decomposition.md) | Crate decomposition | alknet-secret is independent of core and storage |
+| [032](decisions/032-event-boundary-discipline.md) | Event boundary | Secret service domain events stay internal |
+
+## References
+
+- [research/services.md](../research/services.md) — SecretProtocol definition, DerivedKey, KeyType
+- [research/storage.md](../research/storage.md) — Secrets section, derivation paths, EncryptedData
+- [research/integration-plan.md](../research/integration-plan.md) — Phase 2.1
+- SLIP-0010 — https://github.com/satoshilabs/slips/blob/master/slip-0010.md
+- BIP39 — https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki
--- a/docs/architecture/services.md
+++ b/docs/architecture/services.md
@@ -0,0 +1,211 @@
+---
+status: draft
+last_updated: 2026-06-07
+---
+
+# Services
+
+## What
+
+The irpc service layer decomposes alknet's core responsibilities into
+independently testable, deployable, and replaceable components. Auth, Secret,
+Config, and Storage are irpc protocol enums that work both as in-process async
+boundaries (tokio channels) and cross-process/cross-network (QUIC streams via
+noq). OperationEnv is the universal composition mechanism that unifies local
+dispatch, irpc service dispatch, and remote call protocol dispatch.
+
+## Why
+
+Without the service layer, auth verification, key derivation, and config reload
+are scattered across the codebase with no async boundary. For head nodes serving
+many users, in-memory key lookup doesn't scale — auth needs to query a database
+on demand. For secret management, the seed must be isolated in its own process
+boundary.
+
+Without OperationEnv, handlers calling other operations would need to know
+whether the target is local, in-cluster, or on a remote node. OperationEnv
+abstracts this away: `context.env.invoke("secrets", "derive", input)` works
+regardless of dispatch path.
+
+## Architecture
+
+### Service Definition Pattern
+
+Services are defined as irpc protocol enums:
+
+```rust
+#[rpc_requests(message = AuthMessage)]
+#[derive(Debug, Serialize, Deserialize)]
+enum AuthProtocol {
+    #[rpc(tx=oneshot::Sender<AuthResult>)]
+    #[wrap(VerifyPubkey)]
+    VerifyPubkey { fingerprint: String, key_data: Vec<u8> },
+    // ...
+}
+```
+
+The `#[rpc_requests]` macro generates two versions:
+- **Serializable** (`Request`): for remote communication (postcard encoding)
+- **With channels** (`RequestWithChannels`): for local communication (tokio channels)
+
+Both use the same `Client<S>` type. The local/remote distinction is transparent
+at the call site.
+
+### Core Services
+
+| Service | Protocol | Purpose | Placement |
+|---------|----------|---------|-----------|
+| **Auth** | `AuthProtocol` | Verify identities, check credentials | Local or remote |
+| **Secret** | `SecretProtocol` | Derive keys, encrypt/decrypt | Local or remote |
+| **Config** | `ConfigProtocol` | Dynamic config reload | Local |
+| **Storage** | `StorageProtocol` | Graph CRUD, metagraph operations | Local or remote |
+
+### OperationContext
+
+Every handler receives an `OperationContext`:
+
+```rust
+pub struct OperationContext {
+    pub request_id: String,
+    pub parent_request_id: Option<String>,
+    pub identity: Option<Identity>,
+    pub metadata: HashMap<String, Value>,
+    pub env: OperationEnv,
+    pub trusted: bool,  // set by buildEnv(), not by callers
+}
+```
+
+- **`identity`**: The authenticated identity making the call. Populated by
+  `IdentityProvider` from the interface layer.
+- **`env`**: The operation environment — namespaced access to other operations.
+- **`trusted`**: When a handler calls another operation through `env`, the
+  nested call is `trusted` (skips ACL checks).
+
+### OperationEnv — Universal Composition Mechanism
+
+OperationEnv maps a namespace and operation name to an invocation: the handler
+passes input and receives output. It doesn't know or care whether the dispatch
+is local, irpc, or remote.
+
+Three dispatch paths:
+
+| Path | Mechanism | Serialization | Scope |
+|------|-----------|---------------|-------|
+| **Local** | Direct function call through registry | None (in-process) | Same process |
+| **Service** | irpc protocol enum dispatch | postcard (binary) | Same cluster |
+| **Remote** | Call protocol `EventEnvelope` | JSON | Cross-node |
+
+All three produce the same `ResponseEnvelope`.
+
+Service assembly determines which path each operation uses:
+
+```rust
+// Minimal deployment (single node, all local)
+let env = OperationEnv::local(local_registry);
+
+// Production deployment (mix of local and remote)
+let env = OperationEnv::new()
+    .local("auth", auth_registry)
+    .local("config", config_registry)
+    .service("secrets", secret_irpc_client)
+    .remote("worker-1", call_protocol_conn);
+```
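+
+To make the builder above concrete, here is a simplified, synchronous sketch of namespace-based dispatch. It is not the actual alknet-core API: `Handler`, `Dispatch`, and the string-based input and output are assumptions for illustration, and each dispatch path wraps a closure where a real implementation would hold a registry, an irpc client, or a call protocol connection.
+
+```rust
+use std::collections::HashMap;
+
+/// Hypothetical handler shape: (operation name, JSON input) -> JSON output.
+type Handler = Box<dyn Fn(&str, &str) -> Result<String, String>>;
+
+/// One variant per dispatch path.
+enum Dispatch {
+    Local(Handler),
+    Service(Handler),
+    Remote(Handler),
+}
+
+struct OperationEnv {
+    namespaces: HashMap<String, Dispatch>,
+}
+
+impl OperationEnv {
+    fn new() -> Self {
+        OperationEnv { namespaces: HashMap::new() }
+    }
+
+    fn local(mut self, ns: &str, h: Handler) -> Self {
+        self.namespaces.insert(ns.to_string(), Dispatch::Local(h));
+        self
+    }
+
+    fn service(mut self, ns: &str, h: Handler) -> Self {
+        self.namespaces.insert(ns.to_string(), Dispatch::Service(h));
+        self
+    }
+
+    fn remote(mut self, ns: &str, h: Handler) -> Self {
+        self.namespaces.insert(ns.to_string(), Dispatch::Remote(h));
+        self
+    }
+
+    /// The caller names a namespace and operation; the path taken is hidden.
+    fn invoke(&self, ns: &str, op: &str, input: &str) -> Result<String, String> {
+        match self.namespaces.get(ns) {
+            Some(Dispatch::Local(h))
+            | Some(Dispatch::Service(h))
+            | Some(Dispatch::Remote(h)) => h(op, input),
+            None => Err(format!("unknown namespace: {}", ns)),
+        }
+    }
+}
+```
+
+The point of the design is that `invoke` has one signature regardless of which variant backs the namespace, so moving a service from local to remote changes assembly code only, not handlers.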
+
+### Service vs Call Protocol vs External Service
+
+These are different concepts that compose through OperationEnv:
+
+- **irpc service**: In-cluster, Rust-to-Rust, type-safe, postcard serialization.
+  Dispatched by enum variant. Example: `AuthProtocol::VerifyPubkey`.
+- **Call protocol operation**: Cross-node, cross-language, path-based, JSON
+  `EventEnvelope`. Dispatched by namespace + name. Example:
+  `/head/auth/verify`.
+- **External service**: Any endpoint reachable via the call protocol.
+  Example: a vast.ai instance, an HTTP API, another head node.
+
+An irpc service can back a call protocol operation. The OperationEnv routes to
+the appropriate dispatch path:
+
+```
+Call Protocol (Layer 3, external, JSON)
+    └── irpc Service (Layer 3, internal, postcard)
+            └── Honker Streams (Domain events, within service boundary)
+```
+
+### Adapters
+
+Adapters for HTTP, MCP, DNS, and the call protocol all resolve through OperationEnv:
+
+- HTTP: `POST /v1/{namespace}/{op}` → `context.env.invoke(namespace, op, input)`
+- MCP: `tools/call` with tool name → `context.env.invoke(namespace, op, input)`
+- DNS: `{op}.{namespace}.alk.dev TXT?` → `context.env.invoke(namespace, op, input)`
+- Call protocol: `call.requested` with `operationId` → `context.env.invoke(namespace, op, input)`
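+
+As an illustration of the mappings listed above, the HTTP and DNS forms can each be reduced to a `(namespace, op)` pair before calling `invoke`. The function names are hypothetical, and the sketch assumes single-segment namespaces and operation names and ignores DNS case-insensitivity.
+
+```rust
+/// `POST /v1/{namespace}/{op}` -> (namespace, op)
+fn from_http_path(path: &str) -> Option<(&str, &str)> {
+    let rest = path.strip_prefix("/v1/")?;
+    let (ns, op) = rest.split_once('/')?;
+    if ns.is_empty() || op.is_empty() || op.contains('/') {
+        return None;
+    }
+    Some((ns, op))
+}
+
+/// `{op}.{namespace}.alk.dev` -> (namespace, op); a trailing root dot is accepted.
+fn from_dns_name(name: &str) -> Option<(&str, &str)> {
+    let rest = name.trim_end_matches('.').strip_suffix(".alk.dev")?;
+    let (op, ns) = rest.split_once('.')?;
+    if ns.is_empty() || op.is_empty() || ns.contains('.') {
+        return None;
+    }
+    Some((ns, op))
+}
+```
+
+Both adapters end up with the same pair, so `/v1/secrets/derive` and `derive.secrets.alk.dev` reach the same operation.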
+
+### Deployment Topologies
+
+**Minimal (single node, CLI)**: All services run locally via tokio channels.
+
+```
+┌──────────────────────────────────────────────┐
+│                 Single Process                │
+│  Auth (ArcSwap) | Secret (seed in RAM) |     │
+│  Config (ArcSwap) | alknet-core Server        │
+└──────────────────────────────────────────────┘
+```
+
+**Production (multi-node)**: Auth and secrets on dedicated nodes; workers
+access them remotely.
+
+```
+Auth Node (SQLite)           Secret Node (seed in RAM)
+       ↑                              ↑
+       │ QUIC (irpc)                  │ QUIC (irpc)
+       │                              │
+Head Node (Config, Storage, alknet-core Server)
+       │
+       │ SSH / iroh / TLS
+       │
+Worker Node (alknet-core Client)
+```
+
+## Constraints
+
+- Services are **internal** — they run within a node or cluster.
+- The call protocol is **external** — it's how nodes talk to each other.
+- Per ADR-032, domain events (Honker streams) stay within the owning service.
+  irpc calls are synchronous request-response within a node or cluster. Call
+  protocol `EventEnvelope` is the integration boundary between nodes.
+- OperationEnv is a hard constraint: the handler-facing API must match the
+  behavioral contract from `@alkdev/operations`. Namespace + operation name →
+  invoke with input, return output.
+- irpc is behind a feature flag in alknet-core. Nodes that only do SSH tunneling
+  don't need the service layer overhead.
+
+## Open Questions
+
+- **OQ-SVC-01**: Should the secret service support multiple seed phrases (one
+  per tenant)? Defer for now — one seed per node. Multi-seed can be added
+  later by indexing the `Unlock` call with a tenant ID.
+
+- **OQ-SVC-02**: Should service protocols use postcard (binary) or JSON for
+  remote calls? Postcard for irpc (Rust-to-Rust, efficient). JSON for call
+  protocol (cross-language, universal). The irpc remote path naturally uses
+  postcard.
+
+## Design Decisions
+
+| ADR | Decision | Summary |
+|-----|----------|---------|
+| [027](decisions/027-crate-decomposition.md) | Crate decomposition | Service crates are independent of core |
+| [028](decisions/028-auth-irpc-service.md) | Auth as irpc service | AuthProtocol behind feature flag |
+| [032](decisions/032-event-boundary-discipline.md) | Event boundary | Domain events never cross service boundaries |
+| [033](decisions/033-operationenv-irpc-call-protocol.md) | OperationEnv | Universal composition mechanism with three dispatch paths |
+
+## References
+
+- [research/services.md](../research/services.md) — Service protocol definitions, OperationContext, deployment topologies
+- [research/integration-plan.md](../research/integration-plan.md) — OperationEnv, three dispatch paths, adapter patterns
+- [secret-service.md](secret-service.md) — SecretProtocol definition
+- [identity.md](identity.md) — IdentityProvider, AuthProtocol
+- [configuration.md](configuration.md) — ConfigProtocol, DynamicConfig reload
+- [interface.md](interface.md) — Interface layer, auth across interfaces
--- a/docs/architecture/storage.md
+++ b/docs/architecture/storage.md
@@ -0,0 +1,219 @@
+---
+status: draft
+last_updated: 2026-06-07
+---
+
+# Storage
+
+## What
+
+The `alknet-storage` crate provides SQLite-backed graph storage, identity
+management, access control, and reactivity via honker. It mirrors the
+TypeScript `@alkdev/storage` package's design while leveraging Rust's type
+system and honker's built-in pub/sub.
+
+## Why
+
+alknet-core needs persistent identity data (authorized keys, accounts, ACLs)
+and a way to store and query graph-structured data (call graphs, operation
+graphs, metagraph). But alknet-core cannot take a database dependency. The
+solution: alknet-storage implements alknet-core's `IdentityProvider` trait,
+providing SQLite-backed identity resolution without core knowing about SQLite.
+
+The metagraph is a three-level type system (GraphType → NodeType → EdgeType)
+whose instances are graphs, nodes, and edges. It is the foundation for ACL,
+flowgraph persistence, and any future graph-structured data.
+
+## Architecture
+
+### Crate Structure
+
+```
+alknet-storage/
+├── metagraph/     — GraphType, NodeType, EdgeType persistence
+├── identity/      — accounts, organizations, peer_credentials, api_keys, audit_logs
+├── acl/           — PrincipalNode, DelegatesEdge, access control graph
+├── secrets/       — Encrypted node type, encrypt/decrypt bridge
+├── honker/        — honker integration: notify, stream, queue
+├── graph/         — GraphInstance, Node, Edge CRUD with schema validation
+└── schema/        — JSON Schema definitions (serde + jsonschema)
+```
+
+### Metagraph Data Model
+
+Three-level type system:
+
+1. **GraphType** — A class of graphs (e.g., "call-graph", "acl",
+   "task-dependencies"). Defines structural constraints.
+2. **NodeType** — A category of node within a graph type. Each has a JSON Schema
+   for attribute validation.
+3. **EdgeType** — A category of edge within a graph type. Each has a JSON Schema
+   and optional source/target constraints.
+
+Graph instances belong to a graph type and contain nodes and edges conforming
+to those type definitions.
+
+### SQLite Table Schema
+
+Common columns: `id TEXT PK`, `metadata TEXT JSON DEFAULT '{}'`,
+`created_at INTEGER TIMESTAMP`, `updated_at INTEGER TIMESTAMP`.
+
+| Table | Key columns |
+|-------|------------|
+| `graph_types` | id, name (UNIQUE), config JSON, version, scope |
+| `node_types` | id, graph_type_id FK, name, schema JSON |
+| `edge_types` | id, graph_type_id FK, name, schema JSON, allowed_source/target types |
+| `graphs` | id, graph_type_id FK, name, description, status, owner_id, project_id |
+| `nodes` | id, graph_id FK, key (UNIQUE per graph), attributes JSON |
+| `edges` | id, graph_id FK, key, source_node_key, target_node_key, attributes JSON, undirected |
+
+No FK constraints across database files. Referential integrity is enforced at
+the application layer.
+
+### System DB vs Tenant DB
+
+- **System DB** (`system.db`): Identity tables (accounts, organizations,
+  peer_credentials, api_keys, audit_logs) + system-scoped graph types.
+- **Tenant DB** (`tenant-{orgId}.db`): Metagraph tables + tenant-scoped graph
+  types.
+
+### Identity Tables
+
+| Table | Key columns |
+|-------|------------|
+| `accounts` | email (UNIQUE), display_name, access_level (admin/user/service), status |
+| `organizations` | name (UNIQUE), slug (UNIQUE), owner_id FK → accounts |
+| `organization_members` | org_id FK, account_id FK, membership_level (owner/admin/member) |
+| `api_keys` | owner_id FK, key_hash (UNIQUE), name, enabled, expires_at, revoked_at |
+| `peer_credentials` | owner_id FK, credential_type (ssh_key/cert_authority), fingerprint (UNIQUE), public_key_data |
+| `audit_logs` | action, owner_id FK, credential_id, org_id FK, details JSON |
+
+### ACL as Metagraph
+
+The ACL graph is a directed, non-multi metagraph:
+
+- **PrincipalNode**: IdentityType (Account, Org, Service, Role) + identity_id + scopes + resources
+- **ResourceNode**: The thing being accessed
+- **Edges**: can_read, can_write, can_execute, belongs_to, delegates
+
+Delegation edges carry `narrowed_scopes` — the delegate can only exercise scopes
+that are a subset of the delegator's.
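+
+The subset rule for delegation can be sketched with set operations. The function names and the scope strings used in the usage note are hypothetical; the sketch assumes scopes are plain strings compared by equality.
+
+```rust
+use std::collections::HashSet;
+
+/// Scopes the delegate may actually exercise: anything requested that the
+/// delegator does not hold is dropped.
+fn effective_scopes(
+    delegator: &HashSet<String>,
+    narrowed: &HashSet<String>,
+) -> HashSet<String> {
+    delegator.intersection(narrowed).cloned().collect()
+}
+
+/// A delegation edge is well-formed only if it asks for a subset of the
+/// delegator's scopes.
+fn is_valid_narrowing(delegator: &HashSet<String>, narrowed: &HashSet<String>) -> bool {
+    narrowed.is_subset(delegator)
+}
+```
+
+For example, a delegator holding `fs:read` and `fs:write` can delegate `fs:read`, but an edge that also requests `admin` is either rejected by `is_valid_narrowing` or reduced to `fs:read` by `effective_scopes`, depending on which policy the ACL layer chooses.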
+
+### StorageIdentityProvider
+
+Implements alknet-core's `IdentityProvider` trait (ADR-029). Queries
+`peer_credentials` (for SSH key resolution) and `api_keys` (for token auth), then
+traverses the ACL graph to compute effective scopes and resources.
+
+```rust
+impl IdentityProvider for StorageIdentityProvider {
+    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
+        // 1. Find peer_credentials row by fingerprint
+        // 2. Resolve to account → organization membership → effective scopes
+        // 3. Return Identity { id: account_uuid, scopes, resources }
+    }
+
+    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity> {
+        // 1. Verify Ed25519 signature against api_keys or peer_credentials
+        // 2. Resolve to account → effective scopes
+        // 3. Return Identity { id: account_uuid, scopes, resources }
+    }
+}
+```
+
+### StorageProtocol irpc Service
+
+```rust
+#[rpc_requests(message = StorageMessage)]
+#[derive(Debug, Serialize, Deserialize)]
+enum StorageProtocol {
+    #[rpc(tx=oneshot::Sender<Graph>)]
+    #[wrap(CreateGraph)]
+    CreateGraph { graph_type_id: String, name: String },
+
+    #[rpc(tx=oneshot::Sender<Node>)]
+    #[wrap(AddNode)]
+    AddNode { graph_id: String, key: String, attributes: Value },
+
+    // ... (full protocol in research/services.md)
+}
+```
+
+### Honker Integration
+
+| Feature | Use case |
+|---------|----------|
+| `stream_publish` / `subscribe` | Durable pub/sub for node/edge/membership changes |
+| `notify` / `listen` | Ephemeral pub/sub for real-time control channel events |
+| `queue` / `claim` / `ack` | Task queue for async operations |
+
+Per ADR-032, honker streams are domain events internal to the storage service.
+They are projected to call protocol `EventEnvelope` events when crossing service
+boundaries.
+
+### Encrypted Data
+
+alknet-storage references alknet-secret's `EncryptedData` wire format for
+storing encrypted nodes (API keys, OAuth tokens). The format (key_version,
+salt, iv, data) is shared by type-level compatibility, not a crate
+dependency. alknet-secret encrypts; alknet-storage stores the blob.
+
+### Crate Dependencies
+
+```toml
+[dependencies]
+honker = "0.x"
+rusqlite = { version = "0.x", features = ["bundled"] }
+serde = { version = "1", features = ["derive"] }
+serde_json = "1"
+jsonschema = "0.x"
+petgraph = "0.x"
+irpc = "0.x"
+```
+
+Does NOT depend on alknet-core or alknet-secret. Implements alknet-core's
+`IdentityProvider` trait by conforming to its signature, not by direct crate
+dependency.
+
+## Constraints
+
+- alknet-storage does NOT depend on alknet-core as a crate. It implements the
+  `IdentityProvider` trait by conforming to the signature. The CLI binary
+  wires them together.
+- alknet-storage does NOT depend on alknet-secret. They share the `EncryptedData`
+  wire format by type-level compatibility, not a crate dependency.
+- WAL mode for concurrent reads during writes. Single writer per `.db` file.
+- JSON Schema validation uses the `jsonschema` crate at runtime (replaces
+  TypeBox from TypeScript).
+- Per ADR-032, honker stream events never cross service boundaries without
+  projection to `EventEnvelope`.
+
+## Open Questions
+
+- **OQ-SVC-03**: How does the secret service integrate with the existing
+  `EncryptedDataSchema` from `@alkdev/storage`? The Rust implementation replaces
+  PBKDF2 password-based encryption with derived AES-256-GCM keys. The
+  `EncryptedData` format is a superset — old format can be migrated by
+  re-encrypting with the new key.
+
+- **OQ-SVC-04**: Should workers cache derived keys locally? Yes, with a TTL
+  (default: 1 hour). The head can revoke by invalidating the session.
+
+- **OQ-SVC-05**: How does the smart contract (NFT-based ACL) interact with the
+  secret service? The Ethereum signing key (`m/44'/60'/0'/0/0`) is derived from
+  the same seed. The smart contract is a separate concern.
+
+## Design Decisions
+
+| ADR | Decision | Summary |
+|-----|----------|---------|
+| [027](decisions/027-crate-decomposition.md) | Crate decomposition | alknet-storage is independent of core and secret |
+| [029](decisions/029-identity-core-type.md) | Identity as core type | alknet-storage implements IdentityProvider trait |
+| [032](decisions/032-event-boundary-discipline.md) | Event boundary | Honker streams stay internal; projection to EventEnvelope at boundaries |
+
+## References
+
+- [research/storage.md](../research/storage.md) — Full metagraph, identity, ACL, honker definitions
+- [research/services.md](../research/services.md) — StorageProtocol, StorageIdentityProvider
+- [research/integration-plan.md](../research/integration-plan.md) — Phase 2.2
+- [identity.md](identity.md) — IdentityProvider trait, Identity struct
+- [secret-service.md](secret-service.md) — EncryptedData format, derivation paths
--- a/tasks/architecture/adr-026-transport-interface-separation.md
+++ b/tasks/architecture/adr-026-transport-interface-separation.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/adr-026-transport-interface-separation
 name: Write ADR-026 — Transport/interface separation (three-layer model)
-status: pending
+status: completed
 depends_on: []
 scope: moderate
 risk: high
--- a/tasks/architecture/adr-027-crate-decomposition.md
+++ b/tasks/architecture/adr-027-crate-decomposition.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/adr-027-crate-decomposition
 name: Write ADR-027 — Crate decomposition
-status: pending
+status: completed
 depends_on:
  - architecture/adr-029-identity-core-type
 scope: moderate
--- a/tasks/architecture/adr-028-auth-irpc-service.md
+++ b/tasks/architecture/adr-028-auth-irpc-service.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/adr-028-auth-irpc-service
 name: Write ADR-028 — Auth as irpc service
-status: pending
+status: completed
 depends_on:
  - architecture/adr-029-identity-core-type
 scope: narrow
--- a/tasks/architecture/adr-029-identity-core-type.md
+++ b/tasks/architecture/adr-029-identity-core-type.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/adr-029-identity-core-type
 name: Write ADR-029 — Identity as core type
-status: pending
+status: completed
 depends_on: []
 scope: single
 risk: low
--- a/tasks/architecture/adr-030-static-dynamic-config-split.md
+++ b/tasks/architecture/adr-030-static-dynamic-config-split.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/adr-030-static-dynamic-config-split
 name: Write ADR-030 — Static/dynamic config split
-status: pending
+status: completed
 depends_on: []
 scope: narrow
 risk: low
--- a/tasks/architecture/adr-031-forwarding-policy.md
+++ b/tasks/architecture/adr-031-forwarding-policy.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/adr-031-forwarding-policy
 name: Write ADR-031 — Forwarding policy
-status: pending
+status: completed
 depends_on: []
 scope: narrow
 risk: low
--- a/tasks/architecture/adr-032-event-boundary-discipline.md
+++ b/tasks/architecture/adr-032-event-boundary-discipline.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/adr-032-event-boundary-discipline
 name: Write ADR-032 — Event boundary discipline
-status: pending
+status: completed
 depends_on: []
 scope: single
 risk: low
--- a/tasks/architecture/adr-033-operationenv-irpc-call-protocol.md
+++ b/tasks/architecture/adr-033-operationenv-irpc-call-protocol.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/adr-033-operationenv-irpc-call-protocol
 name: Write ADR-033 — OperationEnv, irpc, and call protocol relationship
-status: pending
+status: completed
 depends_on:
  - architecture/adr-028-auth-irpc-service
  - architecture/adr-027-crate-decomposition
--- a/tasks/architecture/adr-034-head-worker-terminology.md
+++ b/tasks/architecture/adr-034-head-worker-terminology.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/adr-034-head-worker-terminology
 name: Write ADR-034 — Head/worker terminology
-status: pending
+status: completed
 depends_on: []
 scope: single
 risk: trivial
--- a/tasks/architecture/spec-configuration.md
+++ b/tasks/architecture/spec-configuration.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/spec-configuration
 name: Promote configuration.md from research to architecture spec
-status: pending
+status: completed
 depends_on:
  - architecture/adr-030-static-dynamic-config-split
  - architecture/adr-031-forwarding-policy
--- a/tasks/architecture/spec-flowgraph.md
+++ b/tasks/architecture/spec-flowgraph.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/spec-flowgraph
 name: Create flowgraph.md architecture spec (or stub referencing crate docs)
-status: pending
+status: completed
 depends_on:
  - architecture/adr-027-crate-decomposition
 scope: narrow
--- a/tasks/architecture/spec-identity.md
+++ b/tasks/architecture/spec-identity.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/spec-identity
 name: Create identity.md architecture spec
-status: pending
+status: completed
 depends_on:
  - architecture/adr-029-identity-core-type
  - architecture/adr-028-auth-irpc-service
--- a/tasks/architecture/spec-interface.md
+++ b/tasks/architecture/spec-interface.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/spec-interface
 name: Create interface.md architecture spec (Layer 2)
-status: pending
+status: completed
 depends_on:
  - architecture/adr-026-transport-interface-separation
  - architecture/adr-033-operationenv-irpc-call-protocol
--- a/tasks/architecture/spec-secret-service.md
+++ b/tasks/architecture/spec-secret-service.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/spec-secret-service
 name: Create secret-service.md architecture spec
-status: pending
+status: completed
 depends_on:
  - architecture/adr-027-crate-decomposition
  - architecture/adr-032-event-boundary-discipline
--- a/tasks/architecture/spec-services.md
+++ b/tasks/architecture/spec-services.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/spec-services
 name: Create services.md architecture spec (irpc service layer + OperationEnv)
-status: pending
+status: completed
 depends_on:
  - architecture/adr-033-operationenv-irpc-call-protocol
  - architecture/adr-027-crate-decomposition
--- a/tasks/architecture/spec-storage.md
+++ b/tasks/architecture/spec-storage.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/spec-storage
 name: Create storage.md architecture spec (or stub referencing crate docs)
-status: pending
+status: completed
 depends_on:
  - architecture/adr-027-crate-decomposition
  - architecture/adr-029-identity-core-type
--- a/tasks/architecture/spec-update-auth.md
+++ b/tasks/architecture/spec-update-auth.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/spec-update-auth
 name: Update auth.md — add IdentityProvider vs AuthService relationship
-status: pending
+status: completed
 depends_on:
  - architecture/spec-identity
  - architecture/adr-028-auth-irpc-service
--- a/tasks/architecture/spec-update-open-questions.md
+++ b/tasks/architecture/spec-update-open-questions.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/spec-update-open-questions
 name: Update open-questions.md — resolve questions per ADR decisions
-status: pending
+status: completed
 depends_on:
  - architecture/adr-031-forwarding-policy
  - architecture/adr-029-identity-core-type
--- a/tasks/architecture/spec-update-readme.md
+++ b/tasks/architecture/spec-update-readme.md
@@ -1,7 +1,7 @@
 ---
 id: architecture/spec-update-readme
 name: Update architecture README.md — add new docs and ADRs to tables
-status: pending
+status: completed
 depends_on:
  - architecture/spec-configuration
  - architecture/spec-identity