docs(research): storage and auth strategy — repo pattern, per-node ACL, SQLite+honker, metagraph-as-tool

Synthesizes the multi-thread discussion that surfaced during the peer-graph
routing research (ADR-029) and OQ-33/34 resolution. Three separate threads
(peer identity, filesystem POC, old storage spec) converged on the same
question: where does persistent state live in the alknet crate graph, and
what's the shared infrastructure for it?

Key commitments documented:
- SQLite + honker is the foundation (pattern, not a crate — ~20 lines per
  consumer). The metagraph is one tool built on it, for graph-shaped
  problems. Direct tables are another tool, for table-shaped problems.
- IdentityProvider is the auth repo trait (already exists in core, make the
  pattern explicit). Adapters implement it (Config, SQLite, future
  Redis/remote/automerge). PeerStore is adapter-internal, not core.
- Per-node ACL, no 'trusted' flag. Each node authorizes its direct callers
  via AccessControl::check(identity). No global ACL, no replication. The
  hub authorizes the user; the spoke authorizes the hub. Same mechanism.
- Forwarded-for identity as metadata, not authority. The from_call handler
  includes the original caller's identity in the call payload; the spoke's
  ACL authorizes the hub (direct caller), never the forwarded_for. The ACL
  check signature prevents misuse.
- The ACL check stays table-shaped (flat scope match); the delegation graph
  (future) produces effective scopes at resolution time. They compose at the
  IdentityProvider boundary.
- The hub proxy tangle: ACL (authorize), bucket routing (operation input),
  peer routing (PeerRef) are three separate layers. Bucket-level
  authorization is handler logic, not protocol logic.

What the old spec had that's dropped: multi-tenant (each tenant gets own
setup), secrets module (replaced by vault), metagraph-as-foundation (demoted
to tool), single storage crate (split by concern), accounts/orgs (deferred —
v1 is a peers table).

Reference: keypal (/workspace/keypal) — TypeScript repo-pattern example
(Storage interface + adapters) that alknet's IdentityProvider follows.
2026-06-27 10:02:26 +00:00
parent 99c6dd9483
commit 19d010cf73


@@ -0,0 +1,620 @@
---
status: draft
last_updated: 2026-06-27
---
# Storage and Auth Strategy
**Status**: Draft for iteration
**Date**: 2026-06-27
**Scope**: Cross-cutting — storage decomposition, auth/ACL model, repo pattern,
SQLite+honker as foundation, metagraph as tool. Synthesizes the discussion
that surfaced during the peer-graph routing research (ADR-029) and OQ-33/34
resolution.
This document consolidates a multi-thread discussion into an architectural
strategy for storage and auth in the alknet crate graph. It is not an ADR —
it's the research that will inform ADRs and spec amendments. The
implementation-relevant pieces (the `forwarded_for` field, the
`IdentityProvider`-as-repo framing) get folded into specs after review.
---
## 1. The Problem
Three separate threads converged on the same question: where does persistent
state live in the alknet crate graph, and what's the shared infrastructure
for it?
1. **Peer identity (OQ-33/OQ-34)** — a head node needs to persist the mapping
from a stable logical peer identity to its current cryptographic material,
surviving key rotation and restarts. The UUID workaround is ephemeral; the
real solution is a store.
2. **Filesystem (POC-validated)** — SQLite + honker + iroh-blobs as the
three-layer stack for path-tree metadata, content-addressed blobs, and
transactional notify-on-commit. 24 tests across two POC crates.
3. **The old `alknet-storage` spec (alknet-main)** — a single crate doing
metagraph, identity, ACL, secrets, and honker integration. Designed before
the vault existed, before ADR-029, before the filesystem POC. Has residual
issues: multi-tenant complexity, secrets module that's now the vault,
metagraph-as-foundation rather than metagraph-as-tool.
The common thread: **SQLite via honker is the right local persistence layer
for all three**, and the metagraph model is the right shape for *some* of the
data. The question is how to decompose this so the core crates stay lean
while the storage-dependent crates get what they need — without forcing
everything through the same abstraction.
---
## 2. The Principle: Right Tool for the Right Shape
The metagraph (GraphType → NodeType → EdgeType → Graph → Node → Edge) is a
generalized graph store. It's the right tool for genuinely graph-shaped
problems: ACL delegation chains, workflows, task dependency DAGs, call
composition trees. It is the *wrong* tool for things that aren't graph-shaped:
| Data | Shape | Right tool |
|------|-------|------------|
| Peer identity → crypto material + scopes | Key-value (flat table) | `peers` table with typed columns |
| Filesystem path tree | Tree (degenerate graph) | Specialized path-tree tables (recursive CTE, proven by POC) |
| Provider credentials (encrypted blobs) | Key-value | `credentials` table |
| ACL delegation chains | Graph (traversal, narrowing) | Metagraph |
| Workflows / flowgraph | Graph (DAG, type compatibility) | Metagraph |
| Taskgraph | Graph (dependency DAG) | Metagraph |
| Operation specs | Flat records with typed fields | Table (or in-memory registry, as today) |
Forcing table-shaped data through the metagraph adds overhead (JSON Schema
validation on every node, graph traversal for what should be an indexed
lookup) without benefit. The filesystem POC proved this empirically: the
path tree uses specialized tables with a recursive CTE, and resolves are
sub-millisecond. The same data in a metagraph would be a graph traversal per
resolve: slower, more complex, no upside.
**The principle: SQLite + honker is the foundation. The metagraph is one
tool built on it, for graph-shaped problems. Direct tables are another tool,
for table-shaped problems. Each consumer picks the right tool.**
---
## 3. SQLite + Honker as Foundation (Pattern, Not Crate)
The filesystem POC established the integration pattern:
```rust
honker_core::apply_default_pragmas(conn)?; // WAL, synchronous=NORMAL
honker_core::attach_notify(conn)?; // notify() SQL function
honker_core::attach_honker_functions(conn)?; // enqueue, claim, lock, stream, cron
honker_core::bootstrap_honker_schema(conn)?; // queue/stream/scheduler tables
```
This is ~20 lines of setup per consumer. Each consumer that wants its own
tables does this on its own rusqlite connection. The critical property: the
honker functions live on *the same connection* as the data tables, so writes
and notifications are atomic in one transaction (the transactional-outbox
pattern, built in). This is `honker-core` (attach to your connection), not
`honker` (manages its own connection) — the POC documented this distinction.
**This is a pattern, not a crate.** Packaging ~20 lines of setup as a shared
crate adds a dependency boundary for no gain. Each consumer opens its own
SQLite file, attaches honker, defines its schema. A `setup_honker(conn)`
helper function (in a shared utility, or just copy-pasted) is enough.
### Why SQLite, not a "real database"
SQLite is an [application file format](https://sqlite.org/appfileformat.html),
not just a database. The filesystem POC's insights: BLOBs under 100KB are
faster stored inline in SQLite than as separate filesystem files; metadata
transactions are atomic and independent of content storage; and the schema
doubles as the documentation. Each consumer gets a local, crash-safe,
queryable file rather than a database server to operate.
The core crates (alknet-core, alknet-call) stay DB-free. The
storage-consuming crates (filesystem, peer registry, graphs) each own their
SQLite file. The assembly layer wires them together.
### What honker adds
| Feature | Use case |
|---------|---------|
| `notify` / `listen` | Ephemeral pub/sub — "ACL entry changed, invalidate cache" |
| `stream_publish` / `subscribe` | Durable pub/sub — "peer identity updated, propagate" |
| `queue` / `claim` / `ack` | Task queue — "orphaned write session cleanup" |
| `lock_acquire` / `lock_release` | Named locks — "writer coordination on a path" |
| `scheduler` | Periodic tasks — "session cleanup, audit log pruning" |
The key integration: every mutation is atomic with its notification. A
`peers` table update + `notify("peers:changed", peer_id)` commit together.
A downstream consumer (e.g., the call protocol's `IdentityProvider` cache)
wakes on commit, not on poll.
---
## 4. The Repo Pattern for Auth
### The existing pattern (make it explicit)
`alknet-core` already has the repo pattern: `IdentityProvider` is a trait
with two methods (`resolve_from_fingerprint`, `resolve_from_token`), one
adapter (`ConfigIdentityProvider`, backed by `ArcSwap<DynamicConfig>`), and
one consumer (the call protocol's `Dispatcher`). This is a repo trait — it
abstracts the *what* (resolve an identity from a credential) from the *how*
(in-memory config, SQLite, Redis, remote service).
**Make this explicit.** `IdentityProvider` is the auth repo trait in core.
Adapters implement it. The assembly layer wires the adapter. Downstream
crates consume the trait, not the adapter.
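The repo framing can be sketched in a few lines of std-only Rust. This is an illustration, not the code in `alknet-core`: the two method names come from this document, but their signatures, the simplified `Identity` shape, the `InMemoryIdentityProvider` adapter, the `Consumer` type, and the fingerprint-before-token order are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the core `Identity` type (assumed shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

/// The repo trait: it says *what* is resolved, not *how* it is stored.
/// Method names follow this document; the signatures are assumptions.
pub trait IdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &str) -> Option<Identity>;
}

/// Hypothetical in-memory adapter, playing the role of `ConfigIdentityProvider`.
#[derive(Default)]
pub struct InMemoryIdentityProvider {
    by_fingerprint: HashMap<String, Identity>,
    by_token: HashMap<String, Identity>,
}

impl InMemoryIdentityProvider {
    pub fn insert(&mut self, fingerprint: &str, token: &str, identity: Identity) {
        self.by_fingerprint.insert(fingerprint.to_string(), identity.clone());
        self.by_token.insert(token.to_string(), identity);
    }
}

impl IdentityProvider for InMemoryIdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.by_fingerprint.get(fingerprint).cloned()
    }
    fn resolve_from_token(&self, token: &str) -> Option<Identity> {
        self.by_token.get(token).cloned()
    }
}

/// Hypothetical consumer (think `Dispatcher`): it holds the trait, never the adapter.
pub struct Consumer<'a> {
    provider: &'a dyn IdentityProvider,
}

impl<'a> Consumer<'a> {
    pub fn new(provider: &'a dyn IdentityProvider) -> Self {
        Self { provider }
    }

    /// Assumed order for this sketch: try the fingerprint first, then the token.
    pub fn authenticate(&self, fingerprint: Option<&str>, token: Option<&str>) -> Option<Identity> {
        fingerprint
            .and_then(|fp| self.provider.resolve_from_fingerprint(fp))
            .or_else(|| token.and_then(|t| self.provider.resolve_from_token(t)))
    }
}
```

Swapping the in-memory adapter for a SQLite-backed one would change only the value passed to `Consumer::new`; the consumer code stays the same.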
### Why this matters beyond the call crate
Downstream crates that don't use the call protocol still need auth. A crate
that exposes operations over HTTP (alknet-http) or a service with no protocol
at all still needs to resolve identities and check ACL. If the auth layer is
a repo trait in core, those crates use the same trait, the same adapters, and
potentially the same backing store — without depending on alknet-call. The
call crate is one consumer of auth, not the owner of it.
### The distributed-auth door
If the repo trait is clean, someone can wire an adapter that syncs via
automerge (like the filesystem POC's path-tree CRDT), a Redis adapter, or a
remote-service adapter. The trait doesn't care. Auth data that contains no
sensitive details (or only encrypted ones) could be distributed via the same
patterns the filesystem uses for its path tree. This isn't designed here;
it's a door the repo pattern leaves open.
### Reference: keypal
The TypeScript project [keypal](/workspace/keypal) is a clean example of this
pattern. It abstracts API key management (hashing, validation, scopes,
expiration, caching) with a `Storage` interface and adapters for Redis,
Drizzle, Prisma, Kysely, Convex, and in-memory. The core logic
(`Manager`) is backend-agnostic; the storage is a trait; the consumer picks
the adapter at wiring time. An `AdapterFactory` provides column-mapping /
schema-config so the same adapter works against different table schemas.
The alknet equivalent: `IdentityProvider` is the trait (like keypal's
`Storage`), `ConfigIdentityProvider` is the in-memory adapter (like keypal's
`MemoryStore`), the SQLite peer registry is the real adapter (like keypal's
`RedisStore`/`DrizzleStore`), and the assembly layer wires the adapter (like
keypal's `Manager` constructor). The shapes map cleanly.
### PeerStore: adapter-internal, not core
A `PeerStore` trait (save/find/update/delete peer records) is an
*adapter-internal* detail, not a core trait. The core trait is
`IdentityProvider`. The SQLite adapter implements `IdentityProvider` by
delegating to a `PeerStore` internally. The trait boundary that matters for
cross-crate sharing is `IdentityProvider`, not `PeerStore`.
This keeps core lean: one auth trait (`IdentityProvider`), not two. The
store trait lives in the adapter crate (or the assembly layer), where it's
an implementation detail. If a future adapter (Redis, remote service) needs
a different internal store shape, it's free to define one — the core contract
is `IdentityProvider`, not the store.
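A minimal sketch of that layering, assuming hypothetical names throughout: `PeerRecord`, `MemoryPeerStore`, and `PeerRegistry` are invented for the example, and a `HashMap` stands in for the SQLite `peers` table. Only `IdentityProvider` and `Identity` are public; the store trait stays private to the adapter. Key rotation changes the fingerprint while the logical peer id stays stable.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

/// The only trait the rest of the system sees (signatures assumed).
pub trait IdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &str) -> Option<Identity>;
}

/// Hypothetical row of the `peers` table.
#[derive(Debug, Clone)]
struct PeerRecord {
    peer_id: String,
    fingerprint: String,
    scopes: Vec<String>,
}

/// Adapter-internal store trait: private, so it never becomes a core contract.
trait PeerStore {
    fn save(&mut self, record: PeerRecord);
    fn find_by_fingerprint(&self, fingerprint: &str) -> Option<PeerRecord>;
    fn update_fingerprint(&mut self, peer_id: &str, fingerprint: &str) -> bool;
    fn delete(&mut self, peer_id: &str) -> bool;
}

/// In-memory stand-in for the SQLite table, keyed by the stable peer id.
#[derive(Default)]
struct MemoryPeerStore {
    rows: HashMap<String, PeerRecord>,
}

impl PeerStore for MemoryPeerStore {
    fn save(&mut self, record: PeerRecord) {
        self.rows.insert(record.peer_id.clone(), record);
    }
    fn find_by_fingerprint(&self, fingerprint: &str) -> Option<PeerRecord> {
        self.rows.values().find(|r| r.fingerprint == fingerprint).cloned()
    }
    fn update_fingerprint(&mut self, peer_id: &str, fingerprint: &str) -> bool {
        match self.rows.get_mut(peer_id) {
            Some(row) => {
                row.fingerprint = fingerprint.to_string();
                true
            }
            None => false,
        }
    }
    fn delete(&mut self, peer_id: &str) -> bool {
        self.rows.remove(peer_id).is_some()
    }
}

/// The adapter: implements `IdentityProvider` by delegating to its store.
pub struct PeerRegistry {
    store: Box<dyn PeerStore>,
}

impl PeerRegistry {
    pub fn in_memory() -> Self {
        Self { store: Box::new(MemoryPeerStore::default()) }
    }
    pub fn register(&mut self, peer_id: &str, fingerprint: &str, scopes: &[&str]) {
        self.store.save(PeerRecord {
            peer_id: peer_id.to_string(),
            fingerprint: fingerprint.to_string(),
            scopes: scopes.iter().map(|s| s.to_string()).collect(),
        });
    }
    pub fn rotate_key(&mut self, peer_id: &str, new_fingerprint: &str) -> bool {
        self.store.update_fingerprint(peer_id, new_fingerprint)
    }
    pub fn remove(&mut self, peer_id: &str) -> bool {
        self.store.delete(peer_id)
    }
}

impl IdentityProvider for PeerRegistry {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.store
            .find_by_fingerprint(fingerprint)
            .map(|r| Identity { id: r.peer_id, scopes: r.scopes })
    }
    fn resolve_from_token(&self, _token: &str) -> Option<Identity> {
        None // this sketch stores no tokens
    }
}
```

A Redis or remote-service adapter could drop `PeerStore` entirely and still satisfy the same public contract.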
---
## 5. Per-Node ACL, No "Trusted" Flag
### The model
Each node has its own ACL. A node's ACL answers one question: **is this
caller authorized to call this operation?** The caller is whoever
authenticated to the connection — resolved by `IdentityProvider` from the
TLS fingerprint or `auth_token`, checked by `AccessControl::check(identity)`.
No "trusted" flag, no bypass, no special mode.
This is the existing mechanism, restated for the cross-node case. The call
protocol's dispatch path (`registration.rs:128-140`) already runs
`AccessControl::check` against the caller's `Identity`. For a remote peer's
call, the caller's `Identity` is the peer's resolved identity. Same check,
same mechanism, no new concept.
### Why no "trusted=true"
A generic "trusted" flag is a blanket authorization bypass — the exact
anti-pattern that ADR-015 was written to kill (it replaced `trusted: true`
with the authority-switch model). There is no circumstance where a generic
"skip the security check" flag is the right answer in a reasonably secure
system. If a caller is authorized, the ACL says so. If the ACL doesn't say
so, the caller isn't authorized. There's no third state.
### The cross-node case
When a hub forwards to a spoke (via `from_call`), the spoke authenticates
the hub (resolves the hub's identity from the connection), and checks its
ACL: "is this identity authorized to call this operation?" The answer is
yes or no, based on the hub's identity and the op's `AccessControl`. Same
mechanism, same check, no special-casing.
```
End user ──calls──> Hub ──forwards as hub──> Spoke (docker service)
                     │                         │
                 hub's ACL                 spoke's ACL
             (user → hub ops)          (hub → spoke ops)
```
The hub's ACL checked the end user. The spoke's ACL checked the hub. Two
independent authorization decisions, same mechanism, no replication. The hub
isn't "trusted" by the spoke — the hub is *authorized* by the spoke's ACL,
the same way any caller is authorized.
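The two independent decisions can be sketched as two instances of the same type. This is illustrative only: the `Node` struct, its fingerprint-keyed peer map, and the AND-only scope check are assumptions for the example, not the dispatch code in `registration.rs`.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum AccessResult {
    Allowed,
    Denied,
}

/// Simplified ACL: every required scope must be present (assumed semantics).
pub struct AccessControl {
    pub required_scopes: Vec<String>,
}

impl AccessControl {
    pub fn check(&self, identity: Option<&Identity>) -> AccessResult {
        match identity {
            Some(id) if self.required_scopes.iter().all(|s| id.scopes.contains(s)) => {
                AccessResult::Allowed
            }
            _ => AccessResult::Denied,
        }
    }
}

/// Hypothetical node: it knows only its own direct callers.
pub struct Node {
    peers: HashMap<String, Identity>, // fingerprint -> identity
    acl: AccessControl,
}

impl Node {
    pub fn new(required_scopes: &[&str]) -> Self {
        Self {
            peers: HashMap::new(),
            acl: AccessControl {
                required_scopes: required_scopes.iter().map(|s| s.to_string()).collect(),
            },
        }
    }

    pub fn add_peer(&mut self, fingerprint: &str, id: &str, scopes: &[&str]) {
        self.peers.insert(
            fingerprint.to_string(),
            Identity {
                id: id.to_string(),
                scopes: scopes.iter().map(|s| s.to_string()).collect(),
            },
        );
    }

    /// Resolve the direct caller, then run the same check every node runs.
    pub fn authorize(&self, caller_fingerprint: &str) -> AccessResult {
        self.acl.check(self.peers.get(caller_fingerprint))
    }
}
```

With a hub that lists alice and a spoke that lists only the hub, alice is allowed at the hub, the hub is allowed at the spoke, and alice connecting straight to the spoke is denied because the spoke has no entry for her.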
### The service-to-service pattern
This follows the same principle as a database server that authorizes the
application server: it doesn't need to know about every end user the app
server authenticated. The application server is the authorization boundary.
In alknet, each node is an authorization boundary for its direct callers.
The docker service example: the service exposes `/docker/start`. It's
reachable directly (end users connect and call it) or through a hub (the
hub imports via `from_call`, re-exposes, forwards). The docker service's
ACL lists the principals that call it directly — either end users (direct
topology) or the hub (proxied topology). It doesn't need to know about the
hub's end users. The hub's ACL handles end-user authorization.
### No global ACL, no replication
Each node's ACL is local — in its own SQLite file (when storage arrives), in
its own `peers` table, checked by its own `AccessControl`. There is no
global ACL, no cross-service ACL replication. When a user's key rotates, the
hub's `peers` table updates that user's fingerprint. The spoke's `peers`
table is unchanged, because it only knows about the hub. When the hub's key
rotates, the spoke's `peers` table updates the hub's fingerprint: a single
entry update, not a full ACL replication.
### The "many DBs" concern
Having many SQLite files (one per node, one per concern) looks like the
microservices ACL-replication mess. It isn't, because the trust model is
per-node: each node only authorizes its direct callers. The DBs don't
overlap. The mess only happens if you try end-to-end identity propagation
(the spoke needs to know about every end user) — that's the anti-pattern,
and the repo pattern + per-node ACL avoids it.
---
## 6. Forwarded-For Identity (Metadata, Not Authority)
### The question
When a hub forwards a call to a spoke, should the spoke know *who initiated
the call* (the end user), or just *who called it* (the hub)?
**Without forwarded-for** (what the implementation does today): the spoke
sees the hub as the caller. It authorizes the hub. It logs "the hub called
`/docker/start`." If the spoke needs to audit "who actually initiated this,"
it can't — that information is at the hub.
**With forwarded-for**: the hub includes the original caller's identity in
the `call.requested` payload. The spoke can log it, use it for per-user
quotas, or pass it to the operation handler for context. But the spoke's ACL
still authorizes the *hub*, not the end user — the forwarded-for identity is
informational, not authoritative.
### The recommendation: add it, as metadata
The forwarded-for identity should be added as a protocol-level field, not
as an afterthought. Reasoning:
1. **Audit trail.** Without it, a cross-node call chain is untraceable at
the leaf. The spoke knows "the hub called me" but not "alice asked the
hub to call me." For debugging, billing, and abuse investigation, the
originator matters.
2. **It's metadata, not authority.** The forwarded-for identity goes in the
call's metadata (or a dedicated `forwarded_for` field), not as the
`auth_token`. The spoke's dispatch path makes it available on
`OperationContext` but `AccessControl::check` *never* uses it — it
always authorizes the direct caller's identity. This keeps it from
becoming an authorization bypass.
3. **The ACL check signature prevents misuse.** `AccessControl::check` takes
`Option<&Identity>` (the direct caller's identity). `forwarded_for` is a
*separate* field on `OperationContext` (`Option<Identity>`). The ACL
check signature doesn't accept it. If someone wants to ACL on the
forwarded-for identity, they'd have to change the `AccessControl::check`
signature — a visible, reviewable change, not a quiet flag flip.
4. **Without it, the leaf service is blind to the originator.** If the spoke
needs to rate-limit per-user (not per-hub), or log who triggered a
container start, it can't. The hub would have to proxy and track
everything, which defeats the point of direct service composition.
### Protocol shape
The `call.requested` payload gains an optional `forwarded_for` field:
```json
{
  "operationId": "/docker/start",
  "input": { ... },
  "auth_token": "alk_...",   // the direct caller's token (the hub's)
  "forwarded_for": {         // the original caller (the end user's)
    "id": "alice-fingerprint",
    "scopes": ["fs:read", "docker:start"]
  }
}
```
The dispatch path populates `OperationContext`:
```rust
pub struct OperationContext {
    // ... existing fields ...
    pub identity: Option<Identity>,      // the direct caller (authorized by ACL)
    pub forwarded_for: Option<Identity>, // the original caller (metadata only)
}
```
`AccessControl::check(identity.as_ref())` — unchanged. The `forwarded_for`
field is available to handlers for logging, auditing, rate-limiting, but
never to the ACL.
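The separation can be made concrete with a small sketch. The `dispatch` and `audit_line` functions, the error string, and the simplified AND-only check are hypothetical; the point being illustrated is only that `check` receives the direct caller and nothing else, so scopes carried in `forwarded_for` cannot grant access.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum AccessResult {
    Allowed,
    Denied,
}

pub struct AccessControl {
    pub required_scopes: Vec<String>,
}

impl AccessControl {
    /// Takes the direct caller only; there is no parameter for `forwarded_for`.
    pub fn check(&self, identity: Option<&Identity>) -> AccessResult {
        match identity {
            Some(id) if self.required_scopes.iter().all(|s| id.scopes.contains(s)) => {
                AccessResult::Allowed
            }
            _ => AccessResult::Denied,
        }
    }
}

pub struct OperationContext {
    pub identity: Option<Identity>,      // the direct caller (authorized by ACL)
    pub forwarded_for: Option<Identity>, // the original caller (metadata only)
}

/// Hypothetical dispatch step: authorize the direct caller, then run the handler.
pub fn dispatch<F>(acl: &AccessControl, ctx: &OperationContext, handler: F) -> Result<String, String>
where
    F: Fn(&OperationContext) -> String,
{
    match acl.check(ctx.identity.as_ref()) {
        AccessResult::Allowed => Ok(handler(ctx)),
        AccessResult::Denied => Err("access denied".to_string()),
    }
}

/// Example handler-side use of the metadata: an audit log line.
pub fn audit_line(ctx: &OperationContext) -> String {
    let caller = ctx.identity.as_ref().map(|i| i.id.as_str()).unwrap_or("anonymous");
    match &ctx.forwarded_for {
        Some(origin) => format!("caller={} on behalf of {}", caller, origin.id),
        None => format!("caller={}", caller),
    }
}
```

In this sketch a call from an authorized hub on behalf of a scope-less end user succeeds, while a call from an unauthorized direct caller fails even if its `forwarded_for` claims every scope.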
### The `from_call` handler's responsibility
The hub's `from_call` forwarding handler populates `forwarded_for` with the
end user's identity (from the hub's `OperationContext.identity`) when it
constructs the `call.requested` payload to send to the spoke. The hub
authenticates as itself (its own `auth_token`); the `forwarded_for` field
carries the originator's identity as context.
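As a rough sketch of that responsibility: the `CallRequested` struct below mirrors the JSON payload shown earlier, but its Rust field names, the string-typed `input`, and the `build_forwarded_call` function are assumptions for illustration. The hub's own token goes in `auth_token`; the caller it authorized goes in `forwarded_for`.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

/// The hub-side context for the incoming call (only the field used here).
pub struct OperationContext {
    pub identity: Option<Identity>, // the end user, as authenticated by the hub
}

/// Hypothetical Rust mirror of the `call.requested` payload.
#[derive(Debug, Clone, PartialEq)]
pub struct CallRequested {
    pub operation_id: String,
    pub input: String, // serialized operation input, kept opaque in this sketch
    pub auth_token: String,
    pub forwarded_for: Option<Identity>,
}

/// Hypothetical helper for the hub's `from_call` forwarding handler.
pub fn build_forwarded_call(
    hub_token: &str,
    ctx: &OperationContext,
    operation_id: &str,
    input: &str,
) -> CallRequested {
    CallRequested {
        operation_id: operation_id.to_string(),
        input: input.to_string(),
        // The hub authenticates as itself, never with the end user's credential.
        auth_token: hub_token.to_string(),
        // The originator travels along as context only.
        forwarded_for: ctx.identity.clone(),
    }
}
```

An anonymous incoming call produces `forwarded_for: None`, which the spoke sees the same way as a direct call from the hub.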
This is a protocol addition — a field on the `call.requested` payload and
on `OperationContext`. It's in or it's out; it can't be bolted on later
without a protocol change. The recommendation is to include it from the
start.
---
## 7. The Decomposition
### Crate boundaries
```
alknet-core (lean — no SQLite, no honker)
├── IdentityProvider trait (the auth repo trait — already exists)
├── Identity, AuthToken, AuthContext (the auth types — already exist)
├── AccessControl, AccessResult (the ACL check — already exists)
└── (no PeerStore trait — adapter-internal, not core)
Storage-consuming crates (each owns its SQLite + honker):
├── alknet-filesystem — path-tree tables (tree, not graph; POC-proven)
├── peer registry — peers table (KV; implements IdentityProvider)
├── provider credentials — credentials table (KV; encrypted by vault)
└── alknet-graphs (future) — metagraph tables (graph-shaped problems)
alknet-call (lean — no SQLite, no honker, no storage traits)
├── Uses IdentityProvider (the trait, not the adapter)
├── PeerCompositeEnv keyed by PeerId (= Identity.id from IdentityProvider)
├── AccessControl::check(identity) for per-node ACL
└── from_call handler authenticates as the hub, forwards-for as metadata
```
### What goes where
| Concern | Where it lives | Shape |
|---------|---------------|-------|
| Auth repo trait (`IdentityProvider`) | alknet-core | Trait (already exists) |
| Auth adapters (Config, SQLite, future Redis/remote) | Adapter crates or assembly layer | Implements `IdentityProvider` |
| Per-node ACL check (`AccessControl::check`) | alknet-core (already exists) | Table-shaped: scope/resource match |
| Peer identity storage (PeerStore) | Adapter crate (adapter-internal) | `peers` table |
| Filesystem path tree + bucket ACL | alknet-filesystem | Specialized tables (POC-proven) |
| Provider credentials (encrypted) | Adapter crate or assembly layer | `credentials` table (vault encrypts) |
| ACL delegation graph (future) | alknet-graphs (metagraph) | Graph (traversal, scope narrowing) |
| Workflows / flowgraph (future) | alknet-graphs (metagraph) | Graph (DAG) |
| Taskgraph (future) | alknet-graphs (metagraph) | Graph (dependency DAG) |
| Forwarded-for identity | alknet-call (protocol field) | Metadata on `call.requested` + `OperationContext` |
### What the old spec had that we're dropping
| Old spec | Status | Why |
|----------|--------|-----|
| Multi-tenant (system.db + tenant.db) | Dropped | Each tenant gets its own complete setup (own ACL, ops, DB). Simpler, no cross-tenant complexity. |
| `secrets/` module (HD derivation, secret service) | Replaced by alknet-vault | The vault already handles encryption/decryption (ADR-018/019/020/025/026). Storage just stores the `EncryptedData` blob. |
| Metagraph as the foundation | Demoted to tool | SQLite+honker is the foundation. Metagraph is one tool on it, for graph-shaped problems. Tables are another tool, for table-shaped problems. |
| `alknet-storage` as one crate | Split | The storage-consuming concerns are separate (filesystem, peer registry, graphs). No single "storage" crate. |
| Accounts/organizations/multi-tenant identity | Deferred | The v1 need is a `peers` table (PeerId → fingerprint + scopes). The full account/org model is a future adapter. |
| `alknet-flowgraph` as a separate crate | Folded into alknet-graphs | The metagraph + petgraph interop are one crate for graph-shaped problems. |
---
## 8. The ACL Split: Check Stays Table, Delegation Is Graph
### The current ACL is table-shaped
`AccessControl` on `OperationSpec` is `required_scopes` (AND-gate),
`required_scopes_any` (OR-gate), `resource_type`/`resource_action`. `Identity`
has `scopes: Vec<String>` and `resources: HashMap<String, Vec<String>>`. The
check is `AccessControl::check(identity)` — a flat scope-match, not a graph
traversal. This is fast, indexable, and correct for the current model (no
delegation).
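A flat check along these lines might look as follows. The field names come from the paragraph above, but everything about how they combine is an assumption for this sketch: all three gates must pass, an empty OR-gate passes, `resources` is read as resource type to allowed actions, and an anonymous caller is treated as holding no scopes. The real `AccessControl::check` may differ on any of these points.

```rust
use std::collections::HashMap;

pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>, // assumed: resource type -> actions
}

#[derive(Debug, PartialEq)]
pub enum AccessResult {
    Allowed,
    Denied,
}

#[derive(Default)]
pub struct AccessControl {
    pub required_scopes: Vec<String>,     // AND-gate
    pub required_scopes_any: Vec<String>, // OR-gate
    pub resource_type: Option<String>,
    pub resource_action: Option<String>,
}

impl AccessControl {
    /// Flat match: no traversal, just membership tests on the identity.
    pub fn check(&self, identity: Option<&Identity>) -> AccessResult {
        let no_scopes: Vec<String> = Vec::new();
        let scopes = identity.map(|i| &i.scopes).unwrap_or(&no_scopes);

        // AND-gate: every listed scope must be held.
        let all_ok = self.required_scopes.iter().all(|s| scopes.contains(s));

        // OR-gate: at least one listed scope must be held (empty list passes).
        let any_ok = self.required_scopes_any.is_empty()
            || self.required_scopes_any.iter().any(|s| scopes.contains(s));

        // Resource gate: only applies when both type and action are set.
        let resource_ok = match (&self.resource_type, &self.resource_action) {
            (Some(rt), Some(ra)) => identity
                .and_then(|i| i.resources.get(rt))
                .map_or(false, |actions| actions.contains(ra)),
            _ => true,
        };

        if all_ok && any_ok && resource_ok {
            AccessResult::Allowed
        } else {
            AccessResult::Denied
        }
    }
}
```

Each gate is a membership test over small in-memory collections, which is what makes the check cheap enough to run on every dispatch.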
### Delegation is graph-shaped (future)
When delegation is needed ("A delegates to B with narrowed scopes, B
delegates to C with further narrowing"), the delegation chain is a graph
traversal — you walk the chain computing the effective scope set. This is
where the metagraph pays off (PrincipalNode, DelegatesEdge, scope narrowing).
But the *check* stays table-shaped even with delegation: the delegation
graph produces the effective `Identity.scopes` (the graph's output); the ACL
check is still "does the effective scope set satisfy the op's requirements?"
(a flat join). The graph and the table compose — the graph produces the
scopes, the table checks them.
### Don't force the check through the graph
The temptation is to make `AccessControl::check` traverse the delegation
graph. Don't. The check is a flat scope-match — keep it that way. The
delegation graph is a separate concern (producing effective scopes), and it
lives in `alknet-graphs` (metagraph). The check lives in core (table). They
compose at the `IdentityProvider` boundary: the adapter resolves the identity
(possibly by traversing the delegation graph to compute effective scopes),
returns an `Identity` with the effective scopes, and the check is a flat
match against that `Identity`.
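The composition can be sketched with a toy delegation chain. Nothing here reflects the future `alknet-graphs` schema: `DelegationGraph`, its single-incoming-edge model, and the intersection rule for narrowing are assumptions chosen to keep the example small. The traversal runs once, at resolution time, and its output is a plain scope set that a flat check can consume.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical delegation edge: the delegate receives at most `scopes`.
pub struct Delegation {
    pub from: String,
    pub scopes: HashSet<String>,
}

#[derive(Default)]
pub struct DelegationGraph {
    roots: HashMap<String, HashSet<String>>,  // principals with direct grants
    delegations: HashMap<String, Delegation>, // delegate -> its incoming edge
}

impl DelegationGraph {
    pub fn grant(&mut self, principal: &str, scopes: &[&str]) {
        let set = scopes.iter().map(|s| s.to_string()).collect();
        self.roots.insert(principal.to_string(), set);
    }

    pub fn delegate(&mut self, from: &str, to: &str, scopes: &[&str]) {
        let edge = Delegation {
            from: from.to_string(),
            scopes: scopes.iter().map(|s| s.to_string()).collect(),
        };
        self.delegations.insert(to.to_string(), edge);
    }

    /// Walk from the principal up to a root, intersecting scopes along the way.
    pub fn effective_scopes(&self, principal: &str) -> HashSet<String> {
        let mut narrowing: Vec<&HashSet<String>> = Vec::new();
        let mut current = principal;
        let mut hops = 0;
        loop {
            if let Some(root_scopes) = self.roots.get(current) {
                let mut effective = root_scopes.clone();
                for allowed in narrowing {
                    effective.retain(|s| allowed.contains(s));
                }
                return effective;
            }
            match self.delegations.get(current) {
                // The hop limit guards against cycles in malformed data.
                Some(edge) if hops < self.delegations.len() => {
                    narrowing.push(&edge.scopes);
                    current = &edge.from;
                    hops += 1;
                }
                _ => return HashSet::new(),
            }
        }
    }
}

/// The flat check: unchanged by delegation, it only sees the resulting set.
pub fn satisfies(effective: &HashSet<String>, required: &[&str]) -> bool {
    required.iter().all(|s| effective.contains(*s))
}
```

If A holds `fs:read`, `fs:write`, and `docker:start`, delegates the two `fs` scopes to B, and B delegates `fs:read` and `docker:start` to C, then C's effective set is just `fs:read`: B never held `docker:start`, so it cannot pass it on.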
This matches the "don't use a screwdriver to hammer a nail" principle: the
check is table-shaped, the delegation is graph-shaped, and forcing either
through the other's shape is worse.
---
## 9. The Hub Proxy Tangle (Resolved)
### The tangle
A hub can "have a filesystem" two ways:
1. **In-process** — the hub's binary loads `alknet-filesystem`. The
filesystem's SQLite is local. The hub's call protocol dispatches
`/fs/readFile` directly to the filesystem handler. No network.
2. **Proxied** — the filesystem runs on a spoke. The hub imports the spoke's
ops via `from_call`. The hub's `from_call` handler forwards over QUIC.
The spoke's call protocol dispatches to its own filesystem handler.
These are different deployment topologies for the same libraries. The
libraries don't change; the assembly does.
### The three concerns that got conflated
1. **ACL** — who can call the operation? The hub's ACL authorizes the user.
The spoke's ACL authorizes the hub. (Per-node ACL, same mechanism.)
2. **Bucket routing** — which bucket is the operation targeting? The bucket
is a *parameter* in the operation input (`{ "bucket": "alice-files",
"path": "hello.txt" }`). It's not an ACL concern — it's operation input.
3. **Peer routing** — which spoke *hosts* the operation? This is
`PeerRef::Specific` (ADR-029) — the hub's composition env routes to the
right peer.
These are three separate decisions at three separate layers:
```
User calls hub's /fs/readFile with { bucket: "alice-files", path: "hello.txt" }
→ hub's ACL: is this user authorized to call /fs/readFile? (AccessControl::check)
→ hub's composition env: which peer serves /fs/readFile? (PeerRef routing)
→ hub's from_call handler: forward { bucket, path } to that peer
→ spoke's ACL: is the hub authorized to call /fs/readFile? (AccessControl::check)
→ spoke's filesystem handler: read path from bucket (operation logic + bucket ACL)
```
### Bucket-level authorization
The call protocol's ACL is coarse: "can this identity call `/fs/readFile`?"
It doesn't know about buckets. The bucket is in the operation input. The
**handler** checks bucket-level authorization — the filesystem handler reads
`ctx.identity`, reads the input's `bucket` field, and checks its own bucket
ACL (a `bucket_acl` table in the filesystem's SQLite: "is this identity
authorized for this bucket?"). This is application logic — the filesystem
owns its bucket authorization. The call protocol's ACL is the coarse gate;
the handler is the fine gate.
This keeps the call protocol's ACL simple and fast (a scope/resource check),
and lets each service define its own fine-grained authorization against its
own storage. The ACL doesn't inspect operation input; the handler does.
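A minimal sketch of the fine gate, assuming the coarse gate has already passed. The `BucketAcl` type (a set of identity and bucket pairs standing in for the `bucket_acl` table), the `ReadFileInput` struct, and the `read_file` handler with its error strings are all hypothetical.

```rust
use std::collections::HashSet;

pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

pub struct OperationContext {
    pub identity: Option<Identity>, // the direct caller, already past the coarse gate
}

/// Operation input: the bucket is a parameter, not an ACL concept.
pub struct ReadFileInput {
    pub bucket: String,
    pub path: String,
}

/// Stand-in for the filesystem's `bucket_acl` table.
#[derive(Default)]
pub struct BucketAcl {
    grants: HashSet<(String, String)>, // (identity id, bucket)
}

impl BucketAcl {
    pub fn grant(&mut self, identity_id: &str, bucket: &str) {
        self.grants.insert((identity_id.to_string(), bucket.to_string()));
    }

    pub fn allows(&self, identity_id: &str, bucket: &str) -> bool {
        self.grants.contains(&(identity_id.to_string(), bucket.to_string()))
    }
}

/// Hypothetical `/fs/readFile` handler: the fine gate lives in application logic.
pub fn read_file(
    ctx: &OperationContext,
    input: &ReadFileInput,
    acl: &BucketAcl,
) -> Result<String, String> {
    let identity = ctx.identity.as_ref().ok_or("unauthenticated")?;
    if !acl.allows(&identity.id, &input.bucket) {
        return Err(format!("{} may not access bucket {}", identity.id, input.bucket));
    }
    // A real handler would read from storage; this sketch just echoes the target.
    Ok(format!("contents of {}/{}", input.bucket, input.path))
}
```

In the proxied topology `ctx.identity` is the hub, so the spoke's bucket grants are expressed against the hub's identity, consistent with the per-node model.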
---
## 10. What This Means for the Immediate Path
### ADR-029 migration (now)
The peer-graph routing migration uses the UUID workaround (no storage). This
document doesn't change that. But it establishes the pattern for when
storage arrives:
1. **ADR-029 migration** (now) — UUID PeerId, no storage, in-memory peer
overlays. `IdentityProvider` is `ConfigIdentityProvider` (in-memory).
2. **Peer registry** (when key rotation / durable peer attribution is
needed) — `peers` table + honker, implements `IdentityProvider`, replaces
`ConfigIdentityProvider`. The call protocol's `Dispatcher` uses
`IdentityProvider` as today — no change. The `PeerCompositeEnv` uses
`PeerId` (= `Identity.id` from the adapter) — no change to routing.
3. **alknet-graphs** (when ACL delegation / workflows / taskgraph are
needed) — metagraph crate, built on the same SQLite+honker pattern. For
graph-shaped problems only.
Each step is independent. The migration doesn't wait for storage. Storage
doesn't wait for the metagraph. The metagraph doesn't wait for the filesystem
(which already has its own tables).
### What goes into specs next (after this doc is reviewed)
1. **`IdentityProvider` as the auth repo trait** — make the repo framing
explicit in `auth.md` and the `IdentityProvider` doc. No trait change;
just documenting the pattern.
2. **`forwarded_for` field** — add to `call-protocol.md` (the
`call.requested` payload schema) and `operation-registry.md`
(`OperationContext`). `AccessControl::check` signature unchanged.
3. **Per-node ACL framing** — add to `client-and-adapters.md` and
`operation-registry.md` as the cross-node extension of the existing
`AccessControl` model. No "trusted" flag.
4. **OQ-34 update** — record the repo-pattern framing and the decomposition
(SQLite+honker as pattern, metagraph as tool, `IdentityProvider` as the
core trait).
### What does NOT go into specs (stays in this research doc)
- The metagraph schema (GraphType/NodeType/EdgeType) — that's a future
`alknet-graphs` spec, not relevant to the current crates
- The filesystem's path-tree schema — that's the filesystem crate's spec
- The full account/org identity model — deferred; the v1 need is a `peers`
table
- The distributed-auth adapter (automerge/Redis) — a door the repo pattern
opens; not designed
---
## 11. Open Questions
1. **When does the `forwarded_for` field get added?** It's a protocol
addition (a field on `call.requested` and `OperationContext`). It either
lands in the ADR-029 migration or becomes a separate protocol-change task. The
recommendation is to include it in the migration — the `from_call`
handler is being rewritten anyway, and the `OperationContext` struct is
being touched. Adding the field now is cheaper than a separate protocol
change later.
2. **Does the peer registry adapter live in its own crate or in the assembly
layer?** The `ConfigIdentityProvider` lives in alknet-core (a simple
impl). The SQLite adapter could live in an `alknet-peer-store-sqlite`
crate, or it could be in the assembly layer's binary (like a wiring
detail). The keypal pattern suggests a separate crate (the adapter is
reusable across deployments). This is a two-way door — the trait is in
core either way; the adapter's location is a packaging choice.
3. **Does the ACL delegation graph (future) produce `Identity.scopes` at
resolution time or at check time?** The recommendation in §8 is at
resolution time (the `IdentityProvider` adapter traverses the delegation
graph to compute effective scopes, returns an `Identity` with them, and
the check is flat). But an alternative is lazy computation (the check
triggers the traversal). This is a future question, not a v1 decision —
the current model has no delegation.
---
## References
- ADR-014: Secret Material Flow and Capability Injection (the no-env-vars
invariant)
- ADR-015: Privilege Model and Authority Context (the authority-switch model
that replaced `trusted: true`)
- ADR-017: Call Protocol Client and Adapter Contract (the `from_call`
forwarding handler)
- ADR-018/019/020/025/026: The vault crate (handles encryption/decryption;
storage stores the `EncryptedData` blob)
- ADR-029: Peer-Graph Routing Model (peer-keyed overlays, `PeerRef` routing,
`AccessControl`-based peer authorization)
- OQ-33: PeerId — logical id, not crypto identity
- OQ-34: Persistent peer registry (the storage dimension)
- `docs/research/alknet-call-peer-routing/findings.md` — the peer-graph
routing research that surfaced the storage question
- `docs/research/alknet-filesystem/poc-summary.md` — the filesystem POC that
validated SQLite + honker + iroh-blobs
- `/workspace/@alkdev/alknet-main/docs/architecture/storage.md` — the old
storage spec (residual issues documented in §7)
- `/workspace/@alkdev/alknet-main/docs/research/storage.md` — the old storage
research (metagraph, identity, ACL, honker integration)
- `/workspace/keypal` — TypeScript repo-pattern reference for API key
management (Storage interface + adapters, the pattern alknet's
`IdentityProvider` follows)
- `/workspace/honker` — SQLite extension with pub/sub, streams, queues,
locks, scheduler (`honker-core` for the attach-to-your-connection pattern)
- https://sqlite.org/appfileformat.html — SQLite as an application file format