docs(research): storage and auth strategy — repo pattern, per-node ACL, SQLite+honker, metagraph-as-tool

Synthesizes the multi-thread discussion that surfaced during the peer-graph routing research (ADR-029) and OQ-33/34 resolution. Three separate threads (peer identity, filesystem POC, old storage spec) converged on the same question: where does persistent state live in the alknet crate graph, and what's the shared infrastructure for it.

Key commitments documented:

- SQLite + honker is the foundation (pattern, not a crate; ~20 lines per consumer). The metagraph is one tool built on it, for graph-shaped problems. Direct tables are another tool, for table-shaped problems.
- IdentityProvider is the auth repo trait (already exists in core, make the pattern explicit). Adapters implement it (Config, SQLite, future Redis/remote/automerge). PeerStore is adapter-internal, not core.
- Per-node ACL, no 'trusted' flag. Each node authorizes its direct callers via AccessControl::check(identity). No global ACL, no replication. The hub authorizes the user; the spoke authorizes the hub. Same mechanism.
- Forwarded-for identity as metadata, not authority. The from_call handler includes the original caller's identity in the call payload; the spoke's ACL authorizes the hub (direct caller), never the forwarded_for. The ACL check signature prevents misuse.
- The ACL check stays table-shaped (flat scope match); the delegation graph (future) produces effective scopes at resolution time. They compose at the IdentityProvider boundary.
- The hub proxy tangle: ACL (authorize), bucket routing (operation input), peer routing (PeerRef) are three separate layers. Bucket-level authorization is handler logic, not protocol logic.

What the old spec had that's dropped: multi-tenant (each tenant gets own setup), secrets module (replaced by vault), metagraph-as-foundation (demoted to tool), single storage crate (split by concern), accounts/orgs (deferred; v1 is a peers table).

Reference: keypal (/workspace/keypal), a TypeScript repo-pattern example (Storage interface + adapters) that alknet's IdentityProvider follows.
2026-06-27 10:02:26 +00:00
parent 99c6dd9483
commit 19d010cf73
1 changed files with 620 additions and 0 deletions
--- a/docs/research/alknet-storage-strategy/findings.md
+++ b/docs/research/alknet-storage-strategy/findings.md
@@ -0,0 +1,620 @@
+---
+status: draft
+last_updated: 2026-06-27
+---
+
+# Storage and Auth Strategy
+
+**Status**: Draft for iteration
+**Date**: 2026-06-27
+**Scope**: Cross-cutting — storage decomposition, auth/ACL model, repo pattern,
+SQLite+honker as foundation, metagraph as tool. Synthesizes the discussion
+that surfaced during the peer-graph routing research (ADR-029) and OQ-33/34
+resolution.
+
+This document consolidates a multi-thread discussion into an architectural
+strategy for storage and auth in the alknet crate graph. It is not an ADR —
+it's the research that will inform ADRs and spec amendments. The
+implementation-relevant pieces (the `forwarded_for` field, the
+`IdentityProvider`-as-repo framing) get folded into specs after review.
+
+---
+
+## 1. The Problem
+
+Three separate threads converged on the same question: where does persistent
+state live in the alknet crate graph, and what's the shared infrastructure
+for it?
+
+1. **Peer identity (OQ-33/OQ-34)** — a head node needs to persist the mapping
+   from a stable logical peer identity to its current cryptographic material,
+   surviving key rotation and restarts. The UUID workaround is ephemeral; the
+   real solution is a store.
+2. **Filesystem (POC-validated)** — SQLite + honker + iroh-blobs as the
+   three-layer stack for path-tree metadata, content-addressed blobs, and
+   transactional notify-on-commit. 24 tests across two POC crates.
+3. **The old `alknet-storage` spec (alknet-main)** — a single crate doing
+   metagraph, identity, ACL, secrets, and honker integration. Designed before
+   the vault existed, before ADR-029, before the filesystem POC. Has residual
+   issues: multi-tenant complexity, secrets module that's now the vault,
+   metagraph-as-foundation rather than metagraph-as-tool.
+
+The common thread: **SQLite via honker is the right local persistence layer
+for all three**, and the metagraph model is the right shape for *some* of the
+data. The question is how to decompose this so the core crates stay lean
+while the storage-dependent crates get what they need — without forcing
+everything through the same abstraction.
+
+---
+
+## 2. The Principle: Right Tool for the Right Shape
+
+The metagraph (GraphType → NodeType → EdgeType → Graph → Node → Edge) is a
+generalized graph store. It's the right tool for genuinely graph-shaped
+problems: ACL delegation chains, workflows, task dependency DAGs, call
+composition trees. It is the *wrong* tool for things that aren't graph-shaped:
+
+| Data | Shape | Right tool |
+|------|-------|------------|
+| Peer identity → crypto material + scopes | Key-value (flat table) | `peers` table with typed columns |
+| Filesystem path tree | Tree (degenerate graph) | Specialized path-tree tables (recursive CTE, proven by POC) |
+| Provider credentials (encrypted blobs) | Key-value | `credentials` table |
+| ACL delegation chains | Graph (traversal, narrowing) | Metagraph |
+| Workflows / flowgraph | Graph (DAG, type compatibility) | Metagraph |
+| Taskgraph | Graph (dependency DAG) | Metagraph |
+| Operation specs | Flat records with typed fields | Table (or in-memory registry, as today) |
+
+Forcing table-shaped data through the metagraph adds overhead (JSON Schema
+validation on every node, graph traversal for what should be an indexed
+lookup) without benefit. The filesystem POC proved this empirically: the
+path tree uses specialized tables with a recursive CTE, and it's sub-
+millisecond. The same data in a metagraph would be a graph traversal per
+resolve — slower, more complex, no upside.
+
+**The principle: SQLite + honker is the foundation. The metagraph is one
+tool built on it, for graph-shaped problems. Direct tables are another tool,
+for table-shaped problems. Each consumer picks the right tool.**
+
+---
+
+## 3. SQLite + Honker as Foundation (Pattern, Not Crate)
+
+The filesystem POC established the integration pattern:
+
+```rust
+honker_core::apply_default_pragmas(conn)?;      // WAL, synchronous=NORMAL
+honker_core::attach_notify(conn)?;              // notify() SQL function
+honker_core::attach_honker_functions(conn)?;    // enqueue, claim, lock, stream, cron
+honker_core::bootstrap_honker_schema(conn)?;   // queue/stream/scheduler tables
+```
+
+This is ~20 lines of setup per consumer. Each consumer that wants its own
+tables does this on its own rusqlite connection. The critical property: the
+honker functions live on *the same connection* as the data tables, so writes
+and notifications are atomic in one transaction (the transactional-outbox
+pattern, built in). This is `honker-core` (attach to your connection), not
+`honker` (manages its own connection) — the POC documented this distinction.
+
+**This is a pattern, not a crate.** Packaging ~20 lines of setup as a shared
+crate adds a dependency boundary for no gain. Each consumer opens its own
+SQLite file, attaches honker, defines its schema. A `setup_honker(conn)`
+helper function (in a shared utility, or just copy-pasted) is enough.
+
+### Why SQLite, not a "real database"
+
+SQLite is an [application file format](https://sqlite.org/appfileformat.html),
+not just a database. The filesystem POC surfaced three insights: BLOBs under
+100KB are faster inline in SQLite than as filesystem files; transactions
+over metadata are atomic independently of content; and the schema is the
+documentation. Each consumer gets a local, crash-safe, queryable file, not
+a database server to operate.
+
+The core crates (alknet-core, alknet-call) stay DB-free. The storage-
+consuming crates (filesystem, peer registry, graphs) each own their SQLite
+file. The assembly layer wires them together.
+
+### What honker adds
+
+| Feature | Use case |
+|---------|---------|
+| `notify` / `listen` | Ephemeral pub/sub — "ACL entry changed, invalidate cache" |
+| `stream_publish` / `subscribe` | Durable pub/sub — "peer identity updated, propagate" |
+| `queue` / `claim` / `ack` | Task queue — "orphaned write session cleanup" |
+| `lock_acquire` / `lock_release` | Named locks — "writer coordination on a path" |
+| `scheduler` | Periodic tasks — "session cleanup, audit log pruning" |
+
+The key integration: every mutation is atomic with its notification. A
+`peers` table update + `notify("peers:changed", peer_id)` commit together.
+A downstream consumer (e.g., the call protocol's `IdentityProvider` cache)
+wakes on commit, not on poll.
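
The atomicity property can be illustrated without SQLite. The following is
a minimal std-only sketch of the transactional-outbox idea, not honker's
implementation: a transaction stages a row write and its notification
together, and both become visible only on commit. `Store`, `Tx`,
`put_peer`, and `notify` are hypothetical names invented for this
illustration.

```rust
use std::collections::HashMap;

/// In-memory stand-in for one consumer's SQLite file (hypothetical model).
#[derive(Default)]
pub struct Store {
    peers: HashMap<String, String>,   // peer_id -> fingerprint
    delivered: Vec<(String, String)>, // (channel, payload) seen by listeners
}

/// A transaction stages row writes and notifications side by side.
pub struct Tx<'a> {
    store: &'a mut Store,
    rows: Vec<(String, String)>,
    notes: Vec<(String, String)>,
}

impl Store {
    pub fn begin(&mut self) -> Tx<'_> {
        Tx { store: self, rows: Vec::new(), notes: Vec::new() }
    }

    pub fn fingerprint(&self, peer_id: &str) -> Option<&str> {
        self.peers.get(peer_id).map(String::as_str)
    }

    pub fn delivered(&self) -> &[(String, String)] {
        &self.delivered
    }
}

impl Tx<'_> {
    pub fn put_peer(&mut self, peer_id: &str, fingerprint: &str) {
        self.rows.push((peer_id.to_string(), fingerprint.to_string()));
    }

    pub fn notify(&mut self, channel: &str, payload: &str) {
        self.notes.push((channel.to_string(), payload.to_string()));
    }

    /// Commit applies the rows and releases the notifications together.
    /// Dropping the transaction without calling this discards both.
    pub fn commit(self) {
        let Tx { store, rows, notes } = self;
        store.peers.extend(rows);
        store.delivered.extend(notes);
    }
}
```

Dropping a `Tx` without calling `commit` models a rollback: neither the
`peers` row nor the `peers:changed` notification is visible afterwards. In
the real stack that guarantee would come from SQLite's own transaction,
since the honker functions run on the same connection as the data tables.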
+
+---
+
+## 4. The Repo Pattern for Auth
+
+### The existing pattern (make it explicit)
+
+`alknet-core` already has the repo pattern: `IdentityProvider` is a trait
+with two methods (`resolve_from_fingerprint`, `resolve_from_token`), one
+adapter (`ConfigIdentityProvider`, backed by `ArcSwap<DynamicConfig>`), and
+one consumer (the call protocol's `Dispatcher`). This is a repo trait — it
+abstracts the *what* (resolve an identity from a credential) from the *how*
+(in-memory config, SQLite, Redis, remote service).
+
+**Make this explicit.** `IdentityProvider` is the auth repo trait in core.
+Adapters implement it. The assembly layer wires the adapter. Downstream
+crates consume the trait, not the adapter.
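
As an illustration of the repo framing, here is a simplified sketch. The
two method names come from the text above, but the signatures, the reduced
`Identity` type, the `InMemoryProvider` adapter, and the `authenticate`
consumer (including its fingerprint-first fallback order) are assumptions
made for this example, not the actual alknet-core definitions.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical identity type for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

/// The repo trait: states what is resolved, not how it is stored.
pub trait IdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &str) -> Option<Identity>;
}

/// In-memory adapter, standing in for a config-backed provider.
#[derive(Default)]
pub struct InMemoryProvider {
    by_fingerprint: HashMap<String, Identity>,
    by_token: HashMap<String, Identity>,
}

impl InMemoryProvider {
    pub fn with_fingerprint(mut self, fingerprint: &str, identity: Identity) -> Self {
        self.by_fingerprint.insert(fingerprint.to_string(), identity);
        self
    }

    pub fn with_token(mut self, token: &str, identity: Identity) -> Self {
        self.by_token.insert(token.to_string(), identity);
        self
    }
}

impl IdentityProvider for InMemoryProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.by_fingerprint.get(fingerprint).cloned()
    }

    fn resolve_from_token(&self, token: &str) -> Option<Identity> {
        self.by_token.get(token).cloned()
    }
}

/// Consumer: depends on the trait only, so any adapter can be wired in.
pub fn authenticate(
    provider: &dyn IdentityProvider,
    fingerprint: Option<&str>,
    token: Option<&str>,
) -> Option<Identity> {
    fingerprint
        .and_then(|fp| provider.resolve_from_fingerprint(fp))
        .or_else(|| token.and_then(|t| provider.resolve_from_token(t)))
}
```

Because `authenticate` takes `&dyn IdentityProvider`, swapping the
in-memory adapter for a SQLite-backed one would be a wiring change in the
assembly layer, with no change to the consumer.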
+
+### Why this matters beyond the call crate
+
+Downstream crates that don't use the call protocol still need auth. A crate
+that exposes operations over HTTP (alknet-http) or a service with no protocol
+at all still needs to resolve identities and check ACL. If the auth layer is
+a repo trait in core, those crates use the same trait, the same adapters, and
+potentially the same backing store — without depending on alknet-call. The
+call crate is one consumer of auth, not the owner of it.
+
+### The distributed-auth door
+
+If the repo trait is clean, someone can wire an adapter that syncs via
+automerge (like the filesystem POC's path-tree CRDT), a Redis adapter, or a
+remote-service adapter. The trait doesn't care. Auth data that holds no
+sensitive details (or holds them only in encrypted form) could be
+distributed via the same patterns the filesystem uses for its path tree.
+This isn't designed here; it's a door the repo pattern leaves open.
+
+### Reference: keypal
+
+The TypeScript project [keypal](/workspace/keypal) is a clean example of this
+pattern. It abstracts API key management (hashing, validation, scopes,
+expiration, caching) with a `Storage` interface and adapters for Redis,
+Drizzle, Prisma, Kysely, Convex, and in-memory. The core logic
+(`Manager`) is backend-agnostic; the storage is an interface; the consumer
+picks the adapter at wiring time. An `AdapterFactory` provides column-mapping /
+schema-config so the same adapter works against different table schemas.
+
+The alknet equivalent: `IdentityProvider` is the trait (like keypal's
+`Storage`), `ConfigIdentityProvider` is the in-memory adapter (like keypal's
+`MemoryStore`), the SQLite peer registry is the real adapter (like keypal's
+`RedisStore`/`DrizzleStore`), and the assembly layer wires the adapter (like
+keypal's `Manager` constructor). The shapes map cleanly.
+
+### PeerStore: adapter-internal, not core
+
+A `PeerStore` trait (save/find/update/delete peer records) is an
+*adapter-internal* detail, not a core trait. The core trait is
+`IdentityProvider`. The SQLite adapter implements `IdentityProvider` by
+delegating to a `PeerStore` internally. The trait boundary that matters for
+cross-crate sharing is `IdentityProvider`, not `PeerStore`.
+
+This keeps core lean: one auth trait (`IdentityProvider`), not two. The
+store trait lives in the adapter crate (or the assembly layer), where it's
+an implementation detail. If a future adapter (Redis, remote service) needs
+a different internal store shape, it's free to define one — the core contract
+is `IdentityProvider`, not the store.
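
One way to picture this boundary is the sketch below. It is an assumption-
laden illustration, not the planned adapter: `PeerRecord`, `PeerStore`,
`PeerRegistry`, and `upsert` are hypothetical names, a `HashMap` stands in
for the `peers` table, and the trait is cut down to a single method. Only
`PeerRegistry` and the trait are public; the store types stay private to
the adapter.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical core types for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

pub trait IdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
}

// Adapter-internal from here on: not `pub`, so core never sees these types.
struct PeerRecord {
    fingerprint: String,
    scopes: Vec<String>,
}

#[derive(Default)]
struct PeerStore {
    rows: HashMap<String, PeerRecord>, // keyed by stable logical peer id
}

impl PeerStore {
    fn save(&mut self, peer_id: &str, record: PeerRecord) {
        self.rows.insert(peer_id.to_string(), record);
    }

    fn find_by_fingerprint(&self, fingerprint: &str) -> Option<(&String, &PeerRecord)> {
        self.rows.iter().find(|(_, r)| r.fingerprint == fingerprint)
    }
}

/// The adapter's public surface: an `IdentityProvider` implementation.
#[derive(Default)]
pub struct PeerRegistry {
    store: PeerStore,
}

impl PeerRegistry {
    /// Insert a peer, or replace its key material on rotation.
    pub fn upsert(&mut self, peer_id: &str, fingerprint: &str, scopes: &[&str]) {
        let record = PeerRecord {
            fingerprint: fingerprint.to_string(),
            scopes: scopes.iter().map(|s| s.to_string()).collect(),
        };
        self.store.save(peer_id, record);
    }
}

impl IdentityProvider for PeerRegistry {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.store
            .find_by_fingerprint(fingerprint)
            .map(|(peer_id, record)| Identity {
                id: peer_id.clone(),
                scopes: record.scopes.clone(),
            })
    }
}
```

Calling `upsert` twice for the same peer id with a new fingerprint models
key rotation: the logical id stays stable while the old fingerprint stops
resolving.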
+
+---
+
+## 5. Per-Node ACL, No "Trusted" Flag
+
+### The model
+
+Each node has its own ACL. A node's ACL answers one question: **is this
+caller authorized to call this operation?** The caller is whoever
+authenticated to the connection — resolved by `IdentityProvider` from the
+TLS fingerprint or `auth_token`, checked by `AccessControl::check(identity)`.
+No "trusted" flag, no bypass, no special mode.
+
+This is the existing mechanism, restated for the cross-node case. The call
+protocol's dispatch path (`registration.rs:128-140`) already runs
+`AccessControl::check` against the caller's `Identity`. For a remote peer's
+call, the caller's `Identity` is the peer's resolved identity. Same check,
+same mechanism, no new concept.
+
+### Why no "trusted=true"
+
+A generic "trusted" flag is a blanket authorization bypass — the exact
+anti-pattern that ADR-015 was written to kill (it replaced `trusted: true`
+with the authority-switch model). There is no circumstance where a generic
+"skip the security check" flag is the right answer in a reasonably secure
+system. If a caller is authorized, the ACL says so. If the ACL doesn't say
+so, the caller isn't authorized. There's no third state.
+
+### The cross-node case
+
+When a hub forwards to a spoke (via `from_call`), the spoke authenticates
+the hub (resolves the hub's identity from the connection), and checks its
+ACL: "is this identity authorized to call this operation?" The answer is
+yes or no, based on the hub's identity and the op's `AccessControl`. Same
+mechanism, same check, no special-casing.
+
+```
+End user ──calls──> Hub ──forwards as hub──> Spoke (docker service)
+           │                    │
+     hub's ACL             spoke's ACL
+     (user → hub ops)       (hub → spoke ops)
+```
+
+The hub's ACL checked the end user. The spoke's ACL checked the hub. Two
+independent authorization decisions, same mechanism, no replication. The hub
+isn't "trusted" by the spoke — the hub is *authorized* by the spoke's ACL,
+the same way any caller is authorized.
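
The two independent decisions can be sketched as follows. This is a toy
model under stated assumptions: `Node`, `allow`, `authorize`, and
`call_via_hub` are hypothetical names, and caller ids are plain strings
rather than resolved `Identity` values.

```rust
use std::collections::{HashMap, HashSet};

/// One node with its own local ACL (hypothetical model):
/// operation path -> ids of callers allowed to invoke it directly.
pub struct Node {
    pub name: String,
    acl: HashMap<String, HashSet<String>>,
}

impl Node {
    pub fn new(name: &str) -> Self {
        Node { name: name.to_string(), acl: HashMap::new() }
    }

    pub fn allow(mut self, op: &str, caller: &str) -> Self {
        self.acl.entry(op.to_string()).or_default().insert(caller.to_string());
        self
    }

    /// Authorizes the direct caller only; there is no bypass flag.
    pub fn authorize(&self, caller: &str, op: &str) -> bool {
        self.acl.get(op).map_or(false, |callers| callers.contains(caller))
    }
}

/// Proxied topology: the hub checks the end user, then calls the spoke
/// as itself, and the spoke checks the hub.
pub fn call_via_hub(hub: &Node, spoke: &Node, user: &str, op: &str) -> Result<(), String> {
    if !hub.authorize(user, op) {
        return Err(format!("{} denied caller {}", hub.name, user));
    }
    if !spoke.authorize(&hub.name, op) {
        return Err(format!("{} denied caller {}", spoke.name, hub.name));
    }
    Ok(())
}
```

With a hub that allows `alice` and a spoke that allows only `hub`, a call
from `alice` through the hub succeeds, while `alice` calling the spoke
directly is refused: the spoke's ACL has no entry for her and does not
need one.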
+
+### The service-to-service pattern
+
+This is the same principle by which a database server authorizes the
+application server without needing to know about every end user the app
+server authenticated. The application server is the authorization boundary.
+In alknet, each node is an authorization boundary for its direct callers.
+
+The docker service example: the service exposes `/docker/start`. It's
+reachable directly (end users connect and call it) or through a hub (the
+hub imports via `from_call`, re-exposes, forwards). The docker service's
+ACL lists the principals that call it directly — either end users (direct
+topology) or the hub (proxied topology). It doesn't need to know about the
+hub's end users. The hub's ACL handles end-user authorization.
+
+### No global ACL, no replication
+
+Each node's ACL is local: in its own SQLite file (when storage arrives), in
+its own `peers` table, checked by its own `AccessControl`. There is no
+global ACL, no cross-service ACL replication. When a user's key rotates, the
+hub's `peers` table updates that user's fingerprint. The spoke's `peers`
+table is unchanged, because it only knows about the hub. When the hub's key
+rotates, the spoke's `peers` table updates the hub's fingerprint: a single
+entry update, not a full ACL replication.
+
+### The "many DBs" concern
+
+Having many SQLite files (one per node, one per concern) looks like the
+microservices ACL-replication mess. It isn't, because the trust model is
+per-node: each node only authorizes its direct callers. The DBs don't
+overlap. The mess only happens if you try end-to-end identity propagation
+(the spoke needs to know about every end user). That's the anti-pattern,
+and the repo pattern plus per-node ACL avoid it.
+
+---
+
+## 6. Forwarded-For Identity (Metadata, Not Authority)
+
+### The question
+
+When a hub forwards a call to a spoke, should the spoke know *who initiated
+the call* (the end user), or just *who called it* (the hub)?
+
+**Without forwarded-for** (what the implementation does today): the spoke
+sees the hub as the caller. It authorizes the hub. It logs "the hub called
+`/docker/start`." If the spoke needs to audit "who actually initiated this,"
+it can't — that information is at the hub.
+
+**With forwarded-for**: the hub includes the original caller's identity in
+the `call.requested` payload. The spoke can log it, use it for per-user
+quotas, or pass it to the operation handler for context. But the spoke's ACL
+still authorizes the *hub*, not the end user — the forwarded-for identity is
+informational, not authoritative.
+
+### The recommendation: add it, as metadata
+
+The forwarded-for identity should be added as a protocol-level field, not
+as an afterthought. Reasoning:
+
+1. **Audit trail.** Without it, a cross-node call chain is untraceable at
+   the leaf. The spoke knows "the hub called me" but not "alice asked the
+   hub to call me." For debugging, billing, and abuse investigation, the
+   originator matters.
+
+2. **It's metadata, not authority.** The forwarded-for identity goes in the
+   call's metadata (or a dedicated `forwarded_for` field), not as the
+   `auth_token`. The spoke's dispatch path makes it available on
+   `OperationContext` but `AccessControl::check` *never* uses it — it
+   always authorizes the direct caller's identity. This keeps it from
+   becoming an authorization bypass.
+
+3. **The ACL check signature prevents misuse.** `AccessControl::check` takes
+   `Option<&Identity>` (the direct caller's identity). `forwarded_for` is a
+   *separate* field on `OperationContext` (`Option<Identity>`). The ACL
+   check signature doesn't accept it. If someone wants to ACL on the
+   forwarded-for identity, they'd have to change the `AccessControl::check`
+   signature — a visible, reviewable change, not a quiet flag flip.
+
+4. **Without it, the leaf service is blind to the originator.** If the spoke
+   needs to rate-limit per-user (not per-hub), or log who triggered a
+   container start, it can't. The hub would have to proxy and track
+   everything, which defeats the point of direct service composition.
+
+### Protocol shape
+
+The `call.requested` payload gains an optional `forwarded_for` field:
+
+```jsonc
+{
+  "operationId": "/docker/start",
+  "input": { ... },
+  "auth_token": "alk_...",           // the direct caller's token (the hub's)
+  "forwarded_for": {                 // the original caller (the end user's)
+    "id": "alice-fingerprint",
+    "scopes": ["fs:read", "docker:start"]
+  }
+}
+```
+
+The dispatch path populates `OperationContext`:
+```rust
+pub struct OperationContext {
+    // ... existing fields ...
+    pub identity: Option<Identity>,              // the direct caller (authorized by ACL)
+    pub forwarded_for: Option<Identity>,         // the original caller (metadata only)
+}
+```
+
+`AccessControl::check(identity.as_ref())` — unchanged. The `forwarded_for`
+field is available to handlers for logging, auditing, rate-limiting, but
+never to the ACL.
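
The separation can be sketched as below. The types are cut-down,
hypothetical versions of the real ones: `AccessControl` carries only
`required_scopes`, the `dispatch` function is invented for this example,
and the rule that an anonymous caller passes only when no scopes are
required is an assumption.

```rust
/// Simplified, hypothetical identity type for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

pub struct AccessControl {
    pub required_scopes: Vec<String>,
}

impl AccessControl {
    /// Accepts only the direct caller; there is no parameter through which
    /// a forwarded identity could reach the decision.
    pub fn check(&self, identity: Option<&Identity>) -> bool {
        match identity {
            Some(id) => self.required_scopes.iter().all(|s| id.scopes.contains(s)),
            // Assumption: anonymous callers pass only if nothing is required.
            None => self.required_scopes.is_empty(),
        }
    }
}

pub struct OperationContext {
    pub identity: Option<Identity>,      // the direct caller (authorized by ACL)
    pub forwarded_for: Option<Identity>, // the original caller (metadata only)
}

/// Dispatch: authorize on `identity`, then expose `forwarded_for` to the
/// handler as context (here it only ends up in the result string).
pub fn dispatch(acl: &AccessControl, ctx: &OperationContext) -> Result<String, String> {
    if !acl.check(ctx.identity.as_ref()) {
        return Err("forbidden".to_string());
    }
    let caller = ctx.identity.as_ref().map_or("anonymous", |i| i.id.as_str());
    let origin = ctx.forwarded_for.as_ref().map_or(caller, |i| i.id.as_str());
    Ok(format!("started by {} for {}", caller, origin))
}
```

In this model a hub without the `docker:start` scope is refused even when
the forwarded identity carries that scope, which is the property the
signature is meant to protect.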
+
+### The `from_call` handler's responsibility
+
+The hub's `from_call` forwarding handler populates `forwarded_for` with the
+end user's identity (from the hub's `OperationContext.identity`) when it
+constructs the `call.requested` payload to send to the spoke. The hub
+authenticates as itself (its own `auth_token`); the `forwarded_for` field
+carries the originator's identity as context.
+
+This is a protocol addition — a field on the `call.requested` payload and
+on `OperationContext`. It's in or it's out; it can't be bolted on later
+without a protocol change. The recommendation is to include it from the
+start.
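
A rough sketch of that responsibility follows. `CallRequested` and
`build_forwarded_call` are hypothetical names, the payload is modeled as a
plain Rust struct rather than the wire format, and `input` is kept as an
opaque string to avoid assuming a JSON library.

```rust
/// Simplified, hypothetical identity type for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

/// The hub-side context of the inbound call (simplified).
pub struct OperationContext {
    pub identity: Option<Identity>,
}

/// Outbound `call.requested` payload (hypothetical Rust shape).
#[derive(Debug, PartialEq)]
pub struct CallRequested {
    pub operation_id: String,
    pub input: String,
    pub auth_token: String,              // always the hub's own credential
    pub forwarded_for: Option<Identity>, // the originator, as context only
}

/// Build the payload the hub sends to the spoke: authenticate as the hub,
/// carry the inbound caller along as metadata.
pub fn build_forwarded_call(
    ctx: &OperationContext,
    hub_token: &str,
    operation_id: &str,
    input: &str,
) -> CallRequested {
    CallRequested {
        operation_id: operation_id.to_string(),
        input: input.to_string(),
        auth_token: hub_token.to_string(),
        forwarded_for: ctx.identity.clone(),
    }
}
```

The end user's own token never appears in the outbound payload; only the
resolved identity is copied into `forwarded_for`.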
+
+---
+
+## 7. The Decomposition
+
+### Crate boundaries
+
+```
+alknet-core (lean — no SQLite, no honker)
+├── IdentityProvider trait          (the auth repo trait — already exists)
+├── Identity, AuthToken, AuthContext (the auth types — already exist)
+├── AccessControl, AccessResult      (the ACL check — already exists)
+└── (no PeerStore trait — adapter-internal, not core)
+
+Storage-consuming crates (each owns its SQLite + honker):
+├── alknet-filesystem     — path-tree tables (tree, not graph; POC-proven)
+├── peer registry         — peers table (KV; implements IdentityProvider)
+├── provider credentials  — credentials table (KV; encrypted by vault)
+└── alknet-graphs (future) — metagraph tables (graph-shaped problems)
+
+alknet-call (lean — no SQLite, no honker, no storage traits)
+├── Uses IdentityProvider (the trait, not the adapter)
+├── PeerCompositeEnv keyed by PeerId (= Identity.id from IdentityProvider)
+├── AccessControl::check(identity) for per-node ACL
+└── from_call handler authenticates as the hub, forwards-for as metadata
+```
+
+### What goes where
+
+| Concern | Where it lives | Shape |
+|---------|---------------|-------|
+| Auth repo trait (`IdentityProvider`) | alknet-core | Trait (already exists) |
+| Auth adapters (Config, SQLite, future Redis/remote) | Adapter crates or assembly layer | Implements `IdentityProvider` |
+| Per-node ACL check (`AccessControl::check`) | alknet-core (already exists) | Table-shaped: scope/resource match |
+| Peer identity storage (PeerStore) | Adapter crate (adapter-internal) | `peers` table |
+| Filesystem path tree + bucket ACL | alknet-filesystem | Specialized tables (POC-proven) |
+| Provider credentials (encrypted) | Adapter crate or assembly layer | `credentials` table (vault encrypts) |
+| ACL delegation graph (future) | alknet-graphs (metagraph) | Graph (traversal, scope narrowing) |
+| Workflows / flowgraph (future) | alknet-graphs (metagraph) | Graph (DAG) |
+| Taskgraph (future) | alknet-graphs (metagraph) | Graph (dependency DAG) |
+| Forwarded-for identity | alknet-call (protocol field) | Metadata on `call.requested` + `OperationContext` |
+
+### What the old spec had that we're dropping
+
+| Old spec | Status | Why |
+|----------|--------|-----|
+| Multi-tenant (system.db + tenant.db) | Dropped | Each tenant gets its own complete setup (own ACL, ops, DB). Simpler, no cross-tenant complexity. |
+| `secrets/` module (HD derivation, secret service) | Replaced by alknet-vault | The vault already handles encryption/decryption (ADR-018/019/020/025/026). Storage just stores the `EncryptedData` blob. |
+| Metagraph as the foundation | Demoted to tool | SQLite+honker is the foundation. Metagraph is one tool on it, for graph-shaped problems. Tables are another tool, for table-shaped problems. |
+| `alknet-storage` as one crate | Split | The storage-consuming concerns are separate (filesystem, peer registry, graphs). No single "storage" crate. |
+| Accounts/organizations/multi-tenant identity | Deferred | The v1 need is a `peers` table (PeerId → fingerprint + scopes). The full account/org model is a future adapter. |
+| `alknet-flowgraph` as a separate crate | Folded into alknet-graphs | The metagraph + petgraph interop are one crate for graph-shaped problems. |
+
+---
+
+## 8. The ACL Split: Check Stays Table, Delegation Is Graph
+
+### The current ACL is table-shaped
+
+`AccessControl` on `OperationSpec` is `required_scopes` (AND-gate),
+`required_scopes_any` (OR-gate), `resource_type`/`resource_action`. `Identity`
+has `scopes: Vec<String>` and `resources: HashMap<String, Vec<String>>`. The
+check is `AccessControl::check(identity)` — a flat scope-match, not a graph
+traversal. This is fast, indexable, and correct for the current model (no
+delegation).
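
For illustration, a flat check along these lines might look like the
sketch below. The field names follow the text; everything else is an
assumption: the exact struct layout, the reading of `resources` as
"resource type to allowed actions", and the handling of anonymous callers
are not taken from alknet-core.

```rust
use std::collections::HashMap;

pub struct Identity {
    pub scopes: Vec<String>,
    // Assumed meaning: resource type -> actions this identity may perform.
    pub resources: HashMap<String, Vec<String>>,
}

#[derive(Default)]
pub struct AccessControl {
    pub required_scopes: Vec<String>,     // AND-gate: all must be present
    pub required_scopes_any: Vec<String>, // OR-gate: at least one, if non-empty
    pub resource_type: Option<String>,
    pub resource_action: Option<String>,
}

impl AccessControl {
    /// Flat match: a few membership tests, no traversal.
    pub fn check(&self, identity: Option<&Identity>) -> bool {
        let has_scope = |s: &String| identity.map_or(false, |id| id.scopes.contains(s));

        let all_ok = self.required_scopes.iter().all(|s| has_scope(s));
        let any_ok = self.required_scopes_any.is_empty()
            || self.required_scopes_any.iter().any(|s| has_scope(s));
        let resource_ok = match (&self.resource_type, &self.resource_action) {
            (Some(kind), Some(action)) => identity.map_or(false, |id| {
                id.resources
                    .get(kind)
                    .map_or(false, |actions| actions.contains(action))
            }),
            // Assumption: no resource requirement unless both parts are set.
            _ => true,
        };

        all_ok && any_ok && resource_ok
    }
}
```

Every branch is a membership test against data already on the `Identity`,
which is what keeps the check cheap and leaves room for an indexed lookup
once the data lives in a table.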
+
+### Delegation is graph-shaped (future)
+
+When delegation is needed ("A delegates to B with narrowed scopes, B
+delegates to C with further narrowing"), the delegation chain is a graph
+traversal — you walk the chain computing the effective scope set. This is
+where the metagraph pays off (PrincipalNode, DelegatesEdge, scope narrowing).
+
+But the *check* stays table-shaped even with delegation: the delegation
+graph produces the effective `Identity.scopes` (the graph's output); the ACL
+check is still "does the effective scope set satisfy the op's requirements?"
+(a flat match). The graph and the table compose: the graph produces the
+scopes, the table checks them.
+
+### Don't force the check through the graph
+
+The temptation is to make `AccessControl::check` traverse the delegation
+graph. Don't. The check is a flat scope-match — keep it that way. The
+delegation graph is a separate concern (producing effective scopes), and it
+lives in `alknet-graphs` (metagraph). The check lives in core (table). They
+compose at the `IdentityProvider` boundary: the adapter resolves the identity
+(possibly by traversing the delegation graph to compute effective scopes),
+returns an `Identity` with the effective scopes, and the check is a flat
+match against that `Identity`.
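
The composition can be sketched as follows. This is not the metagraph: a
`HashMap` of hypothetical `Grant` records stands in for the future
delegation graph, and narrowing is modeled as set intersection along the
chain, which is one plausible reading of "narrowed scopes". `resolve` and
`check` are invented names for the two sides of the boundary.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// One delegation edge: who granted it and which scopes were passed on.
pub struct Grant {
    pub delegator: Option<String>, // None marks a root principal
    pub scopes: BTreeSet<String>,
}

/// Stand-in for the future delegation graph (hypothetical shape).
#[derive(Default)]
pub struct DelegationGraph {
    grants: HashMap<String, Grant>,
}

impl DelegationGraph {
    pub fn grant(mut self, principal: &str, delegator: Option<&str>, scopes: &[&str]) -> Self {
        let grant = Grant {
            delegator: delegator.map(str::to_string),
            scopes: scopes.iter().map(|s| s.to_string()).collect(),
        };
        self.grants.insert(principal.to_string(), grant);
        self
    }

    /// Walk up the chain, intersecting scopes so each hop can only narrow.
    /// Unknown principals, dangling delegators, and cycles yield no scopes.
    pub fn effective_scopes(&self, principal: &str) -> BTreeSet<String> {
        let mut visited: HashSet<&str> = HashSet::new();
        let mut current = principal;
        let mut effective: Option<BTreeSet<String>> = None;
        while let Some(grant) = self.grants.get(current) {
            if !visited.insert(current) {
                return BTreeSet::new();
            }
            let narrowed = match effective {
                None => grant.scopes.clone(),
                Some(acc) => acc.intersection(&grant.scopes).cloned().collect(),
            };
            match &grant.delegator {
                Some(parent) => {
                    effective = Some(narrowed);
                    current = parent.as_str();
                }
                None => return narrowed,
            }
        }
        BTreeSet::new()
    }
}

pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

/// Resolution time: traverse once and hand back a flat identity.
pub fn resolve(graph: &DelegationGraph, principal: &str) -> Identity {
    Identity {
        id: principal.to_string(),
        scopes: graph.effective_scopes(principal).into_iter().collect(),
    }
}

/// Check time: flat match only, no traversal.
pub fn check(identity: &Identity, required: &[&str]) -> bool {
    required.iter().all(|r| identity.scopes.iter().any(|s| s == r))
}
```

Given a chain where `a` delegates `fs:read` and `fs:write` to `b`, and `b`
lists `fs:read` and `docker:start` for `c`, resolving `c` yields only
`fs:read`: `b` never held `docker:start`, so it cannot pass it on.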
+
+This matches the "don't use a screwdriver to hammer a nail" principle: the
+check is table-shaped, the delegation is graph-shaped, and forcing either
+through the other's shape is worse.
+
+---
+
+## 9. The Hub Proxy Tangle (Resolved)
+
+### The tangle
+
+A hub can "have a filesystem" two ways:
+1. **In-process** — the hub's binary loads `alknet-filesystem`. The
+   filesystem's SQLite is local. The hub's call protocol dispatches
+   `/fs/readFile` directly to the filesystem handler. No network.
+2. **Proxied** — the filesystem runs on a spoke. The hub imports the spoke's
+   ops via `from_call`. The hub's `from_call` handler forwards over QUIC.
+   The spoke's call protocol dispatches to its own filesystem handler.
+
+These are different deployment topologies for the same libraries. The
+libraries don't change; the assembly does.
+
+### The three concerns that got conflated
+
+1. **ACL** — who can call the operation? The hub's ACL authorizes the user.
+   The spoke's ACL authorizes the hub. (Per-node ACL, same mechanism.)
+2. **Bucket routing** — which bucket is the operation targeting? The bucket
+   is a *parameter* in the operation input (`{ "bucket": "alice-files",
+   "path": "hello.txt" }`). It's not an ACL concern — it's operation input.
+3. **Peer routing** — which spoke *hosts* the operation? This is
+   `PeerRef::Specific` (ADR-029) — the hub's composition env routes to the
+   right peer.
+
+These are three separate decisions at three separate layers:
+
+```
+User calls hub's /fs/readFile with { bucket: "alice-files", path: "hello.txt" }
+  → hub's ACL: is this user authorized to call /fs/readFile? (AccessControl::check)
+  → hub's composition env: which peer serves /fs/readFile? (PeerRef routing)
+  → hub's from_call handler: forward { bucket, path } to that peer
+  → spoke's ACL: is the hub authorized to call /fs/readFile? (AccessControl::check)
+  → spoke's filesystem handler: read path from bucket (operation logic + bucket ACL)
+```
+
+### Bucket-level authorization
+
+The call protocol's ACL is coarse: "can this identity call `/fs/readFile`?"
+It doesn't know about buckets. The bucket is in the operation input. The
+**handler** checks bucket-level authorization — the filesystem handler reads
+`ctx.identity`, reads the input's `bucket` field, and checks its own bucket
+ACL (a `bucket_acl` table in the filesystem's SQLite: "is this identity
+authorized for this bucket?"). This is application logic — the filesystem
+owns its bucket authorization. The call protocol's ACL is the coarse gate;
+the handler is the fine gate.
+
+This keeps the call protocol's ACL simple and fast (a scope/resource check),
+and lets each service define its own fine-grained authorization against its
+own storage. The ACL doesn't inspect operation input; the handler does.
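
The coarse-gate and fine-gate split can be sketched as follows. All names
here (`BucketAcl`, `coarse_gate`, `read_file`, `ReadFileInput`) are
hypothetical, a `HashSet` of pairs stands in for the `bucket_acl` table,
and the handler returns a placeholder string instead of reading content.

```rust
use std::collections::HashSet;

pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

/// Operation input: the bucket is a parameter, not an ACL concept.
pub struct ReadFileInput {
    pub bucket: String,
    pub path: String,
}

/// Stand-in for a `bucket_acl` table: (identity id, bucket) rows.
#[derive(Default)]
pub struct BucketAcl {
    rows: HashSet<(String, String)>,
}

impl BucketAcl {
    pub fn allow(mut self, identity_id: &str, bucket: &str) -> Self {
        self.rows.insert((identity_id.to_string(), bucket.to_string()));
        self
    }

    pub fn permits(&self, identity_id: &str, bucket: &str) -> bool {
        self.rows.contains(&(identity_id.to_string(), bucket.to_string()))
    }
}

/// Coarse gate (protocol layer): looks at the identity, never at the input.
pub fn coarse_gate(identity: &Identity, required_scope: &str) -> bool {
    identity.scopes.iter().any(|s| s == required_scope)
}

/// Fine gate (handler layer): reads the bucket from the input and checks
/// the service's own table before doing any work.
pub fn read_file(
    identity: &Identity,
    input: &ReadFileInput,
    acl: &BucketAcl,
) -> Result<String, String> {
    if !acl.permits(&identity.id, &input.bucket) {
        return Err(format!("{} may not access bucket {}", identity.id, input.bucket));
    }
    Ok(format!("contents of {}/{}", input.bucket, input.path))
}
```

An identity holding `fs:read` passes the coarse gate for any bucket, but
the handler still refuses a bucket that has no matching row. In the
proxied topology the identity the spoke's handler sees is the hub's, so
the rows in the spoke's table would name the hub.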
+
+---
+
+## 10. What This Means for the Immediate Path
+
+### ADR-029 migration (now)
+
+The peer-graph routing migration uses the UUID workaround (no storage). This
+document doesn't change that. But it establishes the pattern for when
+storage arrives:
+
+1. **ADR-029 migration** (now) — UUID PeerId, no storage, in-memory peer
+   overlays. `IdentityProvider` is `ConfigIdentityProvider` (in-memory).
+2. **Peer registry** (when key rotation / durable peer attribution is
+   needed) — `peers` table + honker, implements `IdentityProvider`, replaces
+   `ConfigIdentityProvider`. The call protocol's `Dispatcher` uses
+   `IdentityProvider` as today — no change. The `PeerCompositeEnv` uses
+   `PeerId` (= `Identity.id` from the adapter) — no change to routing.
+3. **alknet-graphs** (when ACL delegation / workflows / taskgraph are
+   needed) — metagraph crate, built on the same SQLite+honker pattern. For
+   graph-shaped problems only.
+
+Each step is independent. The migration doesn't wait for storage. Storage
+doesn't wait for the metagraph. The metagraph doesn't wait for the filesystem
+(which already has its own tables).
+
+### What goes into specs next (after this doc is reviewed)
+
+1. **`IdentityProvider` as the auth repo trait** — make the repo framing
+   explicit in `auth.md` and the `IdentityProvider` doc. No trait change;
+   just documenting the pattern.
+2. **`forwarded_for` field** — add to `call-protocol.md` (the
+   `call.requested` payload schema) and `operation-registry.md`
+   (`OperationContext`). `AccessControl::check` signature unchanged.
+3. **Per-node ACL framing** — add to `client-and-adapters.md` and
+   `operation-registry.md` as the cross-node extension of the existing
+   `AccessControl` model. No "trusted" flag.
+4. **OQ-34 update** — record the repo-pattern framing and the decomposition
+   (SQLite+honker as pattern, metagraph as tool, `IdentityProvider` as the
+   core trait).
+
+### What does NOT go into specs (stays in this research doc)
+
+- The metagraph schema (GraphType/NodeType/EdgeType) — that's a future
+  `alknet-graphs` spec, not relevant to the current crates
+- The filesystem's path-tree schema — that's the filesystem crate's spec
+- The full account/org identity model — deferred; the v1 need is a `peers`
+  table
+- The distributed-auth adapter (automerge/Redis) — a door the repo pattern
+  opens; not designed
+
+---
+
+## 11. Open Questions
+
+1. **When does the `forwarded_for` field get added?** It's a protocol
+   addition (a field on `call.requested` and `OperationContext`). Either it
+   goes into the ADR-029 migration or it becomes a separate protocol-change
+   task. The recommendation is to include it in the migration: the
+   `from_call` handler is being rewritten anyway, and the `OperationContext`
+   struct is being touched. Adding the field now is cheaper than a separate
+   protocol change later.
+
+2. **Does the peer registry adapter live in its own crate or in the assembly
+   layer?** The `ConfigIdentityProvider` lives in alknet-core (a simple
+   impl). The SQLite adapter could live in an `alknet-peer-store-sqlite`
+   crate, or it could be in the assembly layer's binary (like a wiring
+   detail). The keypal pattern suggests a separate crate (the adapter is
+   reusable across deployments). This is a two-way door: the trait is in
+   core either way; the adapter's location is a packaging choice.
+
+3. **Does the ACL delegation graph (future) produce `Identity.scopes` at
+   resolution time or at check time?** The recommendation in §8 is at
+   resolution time (the `IdentityProvider` adapter traverses the delegation
+   graph to compute effective scopes, returns an `Identity` with them, and
+   the check is flat). But an alternative is lazy computation (the check
+   triggers the traversal). This is a future question, not a v1 decision —
+   the current model has no delegation.
+
+---
+
+## References
+
+- ADR-014: Secret Material Flow and Capability Injection (the no-env-vars
+  invariant)
+- ADR-015: Privilege Model and Authority Context (the authority-switch model
+  that replaced `trusted: true`)
+- ADR-017: Call Protocol Client and Adapter Contract (the `from_call`
+  forwarding handler)
+- ADR-018/019/020/025/026: The vault crate (handles encryption/decryption;
+  storage stores the `EncryptedData` blob)
+- ADR-029: Peer-Graph Routing Model (peer-keyed overlays, `PeerRef` routing,
+  `AccessControl`-based peer authorization)
+- OQ-33: PeerId — logical id, not crypto identity
+- OQ-34: Persistent peer registry (the storage dimension)
+- `docs/research/alknet-call-peer-routing/findings.md` — the peer-graph
+  routing research that surfaced the storage question
+- `docs/research/alknet-filesystem/poc-summary.md` — the filesystem POC that
+  validated SQLite + honker + iroh-blobs
+- `/workspace/@alkdev/alknet-main/docs/architecture/storage.md` — the old
+  storage spec (residual issues documented in §7)
+- `/workspace/@alkdev/alknet-main/docs/research/storage.md` — the old storage
+  research (metagraph, identity, ACL, honker integration)
+- `/workspace/keypal` — TypeScript repo-pattern reference for API key
+  management (Storage interface + adapters, the pattern alknet's
+  `IdentityProvider` follows)
+- `/workspace/honker` — SQLite extension with pub/sub, streams, queues,
+  locks, scheduler (`honker-core` for the attach-to-your-connection pattern)
+- https://sqlite.org/appfileformat.html — SQLite as an application file format