docs: refactor hub/spoke to head/worker, add service layer and HD key derivation

- Replace hub/spoke terminology with head/worker throughout all research docs
- Add irpc service layer architecture (AuthProtocol, SecretProtocol,
  ConfigProtocol, StorageProtocol)
- Add BIP39/SLIP-0010 HD key derivation for secrets management
- Add event boundary discipline (domain events vs integration events)
- Add application services layer (Docker, Node, Wallet, Proxy, Compute)
- New docs/research/services.md defining irpc service protocols
- Update core.md with service layer section and head/worker model
- Update configuration.md to delegate auth to AuthService (irpc)
- Update storage.md with secrets/key derivation and event boundaries
- Update flow.md with event boundary decision and cross-references
2026-06-06 15:33:35 +00:00
parent 2315a211ff
commit d291a485f0
5 changed files with 1007 additions and 49 deletions


@@ -6,13 +6,25 @@ phase: exploration
# Configuration Architecture
## Terminology Change: Head/Worker
This document previously used **hub/spoke** terminology. It has been updated to **head/worker**:
- **Head node**: The coordinating node (formerly "hub"). A head can also be a worker.
- **Worker node**: A node that connects to a head and registers services (formerly "spoke").
- **Node**: Any participant in the network. Every node has an identity.
This better reflects that a head is also a worker, enabling mesh topologies.
## Problem
## Problem
Alknet's configuration is loaded once at startup and never changes. This has
three specific failures:
1. **No hot reload of authentication credentials.** Adding or removing an
authorized key requires restarting the server process. In a hub/spoke
authorized key requires restarting the server process. In a head/worker
deployment where keys are managed via a database (see
`@alkdev/storage`'s `peer_credentials` table), the alknet process must be
restarted every time a key is added, revoked, or rotated. This is
@@ -38,7 +50,7 @@ three specific failures:
data sources plug in from outside.
- This does not propose file-watching (potential attack vector, unnecessary
complexity). CLI usage loads config once at startup. Programmatic usage
(NAPI, hub) calls reload explicitly.
(NAPI, head node) calls reload explicitly.
- This does not replace the existing `ServeOptions` builder pattern. It
generalizes it.
@@ -62,6 +74,19 @@ atomically without disrupting existing connections.
The split is clean: anything that affects the SSH handshake or socket binding
is static. Anything that's checked per-connection or per-channel is dynamic.
### Auth Reload: Service Approach
The original design held all authorized keys in memory via `ArcSwap<DynamicConfig>`. For small deployments this works, but for nodes serving many users it requires loading every key into RAM and atomic-swapping the entire set on each reload.
The improved approach is to make auth an **irpc service** (see [core.md](core.md) and [services.md](services.md)). Auth verification becomes a service call: `VerifyPubkey { fingerprint, key_data }` → `oneshot::Sender<AuthResult>`. The service can:
- Query SQLite on demand (no need to hold all keys in memory)
- Maintain an LRU cache for hot keys
- Subscribe to honker streams for key invalidation
- Run locally (in-process mpsc) or remotely (QUIC stream)
`ArcSwap<DynamicConfig>` remains as a fallback for minimal deployments (CLI usage, single-node setups) where SQLite overhead isn't warranted. The service approach is the primary path for production deployments.
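The on-demand lookup with a hot-key cache described above can be sketched as follows. This is a minimal illustration using only the standard library, not the actual auth service: `KeyStore`, `MemoryStore`, and `CachedAuth` are hypothetical names, the in-memory map stands in for the SQLite query, and `invalidate` stands in for whatever a key-invalidation notification would trigger.

```rust
use std::cell::Cell;
use std::collections::{HashMap, VecDeque};

/// Hypothetical backing store. In the design above this would be a SQLite query.
trait KeyStore {
    fn lookup(&self, fingerprint: &str) -> Option<Vec<u8>>;
}

/// In-memory stand-in for the database that counts how often it is queried.
struct MemoryStore {
    keys: HashMap<String, Vec<u8>>,
    lookups: Cell<usize>,
}

impl KeyStore for MemoryStore {
    fn lookup(&self, fingerprint: &str) -> Option<Vec<u8>> {
        self.lookups.set(self.lookups.get() + 1);
        self.keys.get(fingerprint).cloned()
    }
}

/// Small LRU cache in front of the store. Recency updates are O(n), which is
/// fine for a sketch but not for a large cache.
struct CachedAuth<S: KeyStore> {
    store: S,
    capacity: usize,
    cache: HashMap<String, Vec<u8>>,
    order: VecDeque<String>, // front = least recently used
}

impl<S: KeyStore> CachedAuth<S> {
    fn new(store: S, capacity: usize) -> Self {
        Self { store, capacity: capacity.max(1), cache: HashMap::new(), order: VecDeque::new() }
    }

    /// Marks a fingerprint as most recently used.
    fn touch(&mut self, fingerprint: &str) {
        self.order.retain(|f| f != fingerprint);
        self.order.push_back(fingerprint.to_string());
    }

    /// Returns true when the presented key matches the stored key.
    fn verify_pubkey(&mut self, fingerprint: &str, key_data: &[u8]) -> bool {
        if !self.cache.contains_key(fingerprint) {
            // Cache miss: query the store; unknown fingerprints are not cached.
            let Some(stored) = self.store.lookup(fingerprint) else {
                return false;
            };
            if self.cache.len() >= self.capacity {
                if let Some(evicted) = self.order.pop_front() {
                    self.cache.remove(&evicted);
                }
            }
            self.cache.insert(fingerprint.to_string(), stored);
        }
        self.touch(fingerprint);
        self.cache.get(fingerprint).map_or(false, |key| key.as_slice() == key_data)
    }

    /// Drops a cached entry, e.g. when a revocation notification arrives.
    fn invalidate(&mut self, fingerprint: &str) {
        self.cache.remove(fingerprint);
        self.order.retain(|f| f != fingerprint);
    }
}
```

Repeated verifications of the same fingerprint hit the store once; after `invalidate`, the next verification queries the store again, so a revoked key stops being served from the cache.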
### Current Architecture
```
@@ -83,7 +108,7 @@ path to update it.
### Proposed Architecture
Replace `Arc<ServerAuthConfig>` with a reloadable provider:
Replace `Arc<ServerAuthConfig>` with a service-based approach:
```
StaticConfig (Arc, loaded once)
@@ -92,15 +117,24 @@ StaticConfig (Arc, loaded once)
├─ host key
└─ max_auth_attempts, max_connections_per_ip
AuthService (irpc service, local or remote)
├─ VerifyPubkey(fingerprint, key_data) → AuthResult
├─ VerifyToken(token_bytes) → AuthResult
└─ ReloadKeys() → ()
Backed by: SQLite (peer_credentials, api_keys)
Optional: ArcSwap<DynamicConfig> for minimal deployments
ConfigService (irpc service, always local)
├─ ReloadDynamicConfig(DynamicConfig)
└─ GetForwardingPolicy() → ForwardingPolicy
DynamicConfig (Arc<ArcSwap<DynamicConfig>>, reloadable)
├─ auth: ServerAuthConfig
├─ forwarding: ForwardingPolicy
└─ rate_limits: RateLimitConfig
ConfigReloadHandle (exposed to NAPI)
└─ reload(DynamicConfig)
```
For production: auth verification goes through the auth service, which queries SQLite. The `DynamicConfig` only holds forwarding policy and rate limits — not the full key set. For minimal deployments: auth falls back to `ArcSwap<DynamicConfig>` with all keys in memory, wrapped by the same service interface.
`ArcSwap` provides lock-free reads on the hot path. Every `auth_publickey()`
and `channel_open_direct_tcpip()` call does an `Arc` dereference — zero cost
compared to the current approach. Writes are atomic: `store()` swaps the
@@ -138,7 +172,7 @@ pub enum TargetPattern {
Rule evaluation: first match wins, default applies if no rule matches. This
model maps to OpenSSH's `AllowTcpForwarding` + `PermitOpen` but is more
expressive. It also maps to `peer_credentials.metadata.scopes` in `@alkdev/storage`
— the hub can generate forwarding rules from stored scopes.
— the head node can generate forwarding rules from stored scopes.
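A minimal sketch of first-match-wins evaluation, assuming a simplified pattern type. `Pattern`, `Rule`, `Policy`, and `Action` here are hypothetical stand-ins for illustration and are not the document's actual `TargetPattern` or `ForwardingPolicy` definitions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    Allow,
    Deny,
}

/// Hypothetical, simplified target pattern used only for this sketch.
enum Pattern {
    Any,
    Host(String),          // exact host, any port
    HostPort(String, u16), // exact host and port
}

struct Rule {
    pattern: Pattern,
    action: Action,
}

struct Policy {
    rules: Vec<Rule>,
    default: Action,
}

impl Pattern {
    fn matches(&self, host: &str, port: u16) -> bool {
        match self {
            Pattern::Any => true,
            Pattern::Host(h) => h.as_str() == host,
            Pattern::HostPort(h, p) => h.as_str() == host && *p == port,
        }
    }
}

impl Policy {
    /// Returns the action of the first matching rule, or the default
    /// when no rule matches.
    fn evaluate(&self, host: &str, port: u16) -> Action {
        self.rules
            .iter()
            .find(|rule| rule.pattern.matches(host, port))
            .map(|rule| rule.action)
            .unwrap_or(self.default)
    }
}
```

Because evaluation stops at the first match, placing a specific `Deny` rule ahead of a catch-all `Allow` rule blocks that one target and permits everything else.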
Rule ordering matters. A deny-then-allow pattern gives blocklist semantics. An
allow-then-deny pattern gives allowlist semantics. Both are useful. The
@@ -220,7 +254,7 @@ interface ForwardingRuleConfig {
}
```
The hub calls `server.reloadAuth(...)` after writing to `peer_credentials`.
The head node calls `server.reloadAuth(...)` after writing to `peer_credentials`.
The NAPI layer parses the key data and constructs a new `DynamicConfig`, then
calls the `ConfigReloadHandle`.
@@ -235,7 +269,7 @@ A config file for client connections could define named profiles:
```toml
[profiles.production]
server = "hub.alk.dev:443"
server = "head.alk.dev:443"
transport = "tls"
identity = "/home/user/.ssh/id_ed25519"
@@ -252,16 +286,17 @@ This is a convenience layer on top of `ConnectOptions`, not a replacement.
| Interface | Static config | Dynamic config | Reload mechanism |
|---|---|---|---|
| CLI | Flags + optional `--config` file | Loaded at startup from `--authorized-keys` | None (restart to change) |
| Core Rust | `StaticConfig` struct | `ArcSwap<DynamicConfig>` | `ConfigReloadHandle::reload()` |
| NAPI | `serve()` options | Same `ArcSwap` | `server.reloadAuth()`, `server.reloadForwarding()` |
| Core Rust | `StaticConfig` struct | `AuthService` (irpc) or `ArcSwap<DynamicConfig>` (minimal) | `ConfigService::reload()` or `ConfigReloadHandle::reload()` |
| NAPI | `serve()` options | Same | `server.reloadAuth()`, `server.reloadForwarding()` |
The CLI doesn't need a reload mechanism. When you're running alknet from the
command line, restarting is fine. The reload mechanism exists for programmatic
consumers that manage credentials in a database.
consumers and for the auth service pattern where keys are queried on demand from
a database.
### Multi-Transport Listeners
A host may want to accept connections on multiple transports simultaneously:
A head node may want to accept connections on multiple transports simultaneously:
- TCP on port 22 (simple, direct SSH)
- TLS on port 443 (stealth mode, corporate firewalls)
@@ -458,7 +493,7 @@ compat via accepting both `transport: string` (single) and
Global rules with principal matching is simpler and covers most cases. Per-user
scope derived from certificates is more granular but requires the server to
maintain a mapping from key fingerprint to scope. This mapping comes from the
hub's database, not from the SSH protocol. Phase 2 starts with global rules;
head node's database, not from the SSH protocol. Phase 2 starts with global rules;
per-user scope can be added as an extension.
- **OQ-CFG-02**: Should the config file watch for changes and auto-reload?
@@ -553,15 +588,34 @@ compat via accepting both `transport: string` (single) and
presents an Ed25519-signed timestamp token. Verification produces the same
`Identity` type via the `IdentityProvider` trait. One `reloadAuth()` call
updates both. See [auth.md](../architecture/auth.md) and
[ADR-023](../architecture/decisions/023-unified-auth-shared-key-material.md).
[ADR-023](../architecture/decisions/023-unified-auth-shared-key-material.md).
- **OQ-CFG-07**: Should auth and secret services share a single irpc endpoint
or be separate services?
Separate services are better. Auth (verify credentials) and Secret (derive/store
keys) have different security boundaries. The secret service holds the master
seed; the auth service only needs public key fingerprints. They may run on
different machines. See [services.md](services.md) for protocol definitions.
- **OQ-CFG-08**: How do external credentials (API keys, OAuth tokens) relate
to the secret service's HD key derivation?
HD-derived keys (from SLIP-0010/BIP39) cover self-generated secrets (identity
keys, encryption keys, SSH keys). External credentials (third-party API keys,
OAuth tokens) can't be derived — they must be stored encrypted. The secret
service handles both: derived keys are regenerated on demand; stored secrets
are encrypted with a key that is itself derived from the seed. See
[services.md](services.md) for the `SecretProtocol` definition.
## Decisions Required
These decisions will be extracted into ADRs when the architecture is finalized:
1. **ADR-020**: Static/dynamic config split, `ArcSwap<DynamicConfig>` for
hot-reloadable auth and forwarding policy. Supersedes ADR-011's "no config
file" — adds optional config file while preserving programmatic-first API.
1. **ADR-020**: Static/dynamic config split. Auth delegated to `AuthService` (irpc)
for production; `ArcSwap<DynamicConfig>` for minimal deployments. Supersedes
ADR-011's "no config file" — adds optional config file while preserving
programmatic-first API.
2. **ADR-021**: Forwarding policy with rule-based allow/deny. Default-allow
preserves current behavior during migration; default-deny for production
@@ -571,6 +625,13 @@ These decisions will be extracted into ADRs when the architecture is finalized:
loops sharing auth config, session state, and shutdown. Replaces single
`ServeTransportMode` with `Vec<ListenerConfig>`.
4. **ADR-026**: Head/worker terminology. Replace hub/spoke with head/worker
throughout all documentation and APIs. A head is also a worker.
5. **ADR-028**: Auth as service. Auth verification via irpc `AuthProtocol`
service, not in-memory key set. Enables SQLite-backed auth for production,
`ArcSwap` fallback for minimal deployments.
## References
- [ADR-011](../architecture/decisions/011-no-ssh-config-programmatic-api.md) — Programmatic-first API (superseded by ADR-020)
@@ -585,4 +646,6 @@ These decisions will be extracted into ADRs when the architecture is finalized:
- [arc-swap crate](https://docs.rs/arc-swap) — Lock-free read, atomic write for shared state
- [ADR-023](../architecture/decisions/023-unified-auth-shared-key-material.md) — Unified auth with shared key material
- [auth.md](../architecture/auth.md) — Unified auth architecture spec
- [call-protocol.md](../architecture/call-protocol.md) — Bidirectional call protocol spec
- [call-protocol.md](../architecture/call-protocol.md) — Bidirectional call protocol spec
- [services.md](services.md) — Service layer architecture (irpc services)
- [core.md](core.md) — Core overview, head/worker terminology, service layer


@@ -1,11 +1,22 @@
# Alknet Core: Transport, Call Protocol, Auth, and DNS
# Alknet Core: Transport, Call Protocol, Auth, Services, and DNS
> Status: Research / Draft
> Last updated: 2026-06-05
> Last updated: 2026-06-06
## Overview
`alknet-core` is the foundational crate providing pluggable transports, the bidirectional call protocol, Ed25519 authentication, and (future) DNS transport + naming. Everything else (storage, flowgraph, relay) builds on top of this.
`alknet-core` is the foundational crate providing pluggable transports, the bidirectional call protocol, Ed25519 authentication, a service layer (via irpc), and (future) DNS transport + naming. Everything else (storage, flowgraph, relay) builds on top of this.
### Terminology: Nodes, Heads, and Workers
Alknet uses a **head/worker** model instead of hub/spoke:
- **Node**: Any participant in the network. Every node has an Ed25519 identity.
- **Head node**: A node that coordinates — accepts connections, routes operations, manages cluster state. A head is also a worker (it can execute operations).
- **Worker node**: A node that connects to a head, registers its services, and executes operations. Any worker can become a head.
- **Service**: A named collection of operations exposed by a node (e.g., `fs`, `bash`, `compute`, `agent`). Services register via the call protocol.
This model allows natural mesh formation: a head can also be a worker for another head, enabling multi-hop routing, redundancy, and distributed topologies without a centralized authority.
## Transport Layer
@@ -102,10 +113,10 @@ A call is just a subscribe that resolves after one event. Both `call()` and `sub
### Operation Paths
```
/{spoke}/{service}/{op}
/{node}/{service}/{op}
```
- **spoke** — identity prefix of the node that exposes the operation
- **node** — identity prefix of the node that exposes the operation
- **service** — logical service namespace (e.g., `fs`, `bash`, `agent`)
- **op** — specific operation (e.g., `readFile`, `exec`, `chat`)
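The three-segment path format can be parsed with a few lines of standard-library Rust. This is a sketch under the assumption that a fully qualified path has exactly three non-empty segments; `OperationPath` and `parse_operation_path` are hypothetical names, not part of the documented API.

```rust
/// The three components of a fully qualified operation path.
#[derive(Debug, PartialEq, Eq)]
struct OperationPath<'a> {
    node: &'a str,
    service: &'a str,
    op: &'a str,
}

/// Parses `/{node}/{service}/{op}`. Returns `None` for anything that is not
/// exactly three non-empty segments behind a leading slash.
fn parse_operation_path(path: &str) -> Option<OperationPath<'_>> {
    let rest = path.strip_prefix('/')?;
    let mut parts = rest.split('/');
    let (node, service, op) = (parts.next()?, parts.next()?, parts.next()?);
    if parts.next().is_some() || node.is_empty() || service.is_empty() || op.is_empty() {
        return None;
    }
    Some(OperationPath { node, service, op })
}
```

Node-relative names such as `/fs/readFile` have only two segments, so this parser rejects them; in this sketch they would need the node prefix added before parsing.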
@@ -113,9 +124,9 @@ Examples:
| Path | Meaning |
|------|---------|
| `/dev1/fs/readFile` | Spoke `dev1`, service `fs`, op `readFile` |
| `/hub/agent/chat` | Hub's own `agent` service, op `chat` |
| `/hub/sessions/list` | Hub's `sessions` service, op `list` |
| `/dev1/fs/readFile` | Node `dev1`, service `fs`, op `readFile` |
| `/head/agent/chat` | Head's own `agent` service, op `chat` |
| `/head/sessions/list` | Head's `sessions` service, op `list` |
### PendingRequestMap
@@ -176,39 +187,41 @@ registry.register(OperationSpec { name: "/fs/readFile", ... }, fs_read_handler);
| Worker | `postMessage` | Bidirectional over structured clone |
| DNS | Query TXT records (client) / serve TXT records (server) | Request/response over DNS |
### Hub/Spoke Architecture
### Head/Worker Architecture
```
┌─────────────────────────────────┐
Hub
Head Node
│ │
│ Hub-local services:
│ /hub/agent/chat
│ /hub/agent/complete
│ /hub/sessions/list
│ Head-local services: │
│ /head/agent/chat │
│ /head/agent/complete │
│ /head/sessions/list │
│ │
Spoke registry:
Worker registry: │
│ /dev1/fs/* → dev1 connection │
│ /browser-1/notify/* → WT conn │
└──────┬───────┬──────────────────┘
│ │
┌─────────▼┐ ┌───▼────────────┐
Spoke │ │Browser Spoke
Worker │ │Browser Worker
│ "dev1" │ │"browser-1" │
│ /fs/* │ │/notify/* │
└───────────┘ └────────────────┘
```
Spokes register operations on connect:
A head node is also a worker. Any worker can become a head. This enables mesh topologies where nodes coordinate in a peer-to-peer fashion rather than through a single centralized authority.
Workers register operations on connect:
```json
{
"type": "call.requested",
"id": "uuid-123",
"payload": {
"operationId": "/hub/services/register",
"operationId": "/head/services/register",
"input": {
"spoke": "dev1",
"node": "dev1",
"operations": ["/fs/readFile", "/bash/exec"]
}
}
@@ -217,10 +230,84 @@ Spokes register operations on connect:
## Authentication
Ed25519 keys for SSH authentication. A separate authentication mechanism for browsers where they sign a token using the same Ed25519 keys. Hot key rotation without server restart (mechanism in core for programmatic key updates).
Ed25519 keys for SSH authentication. A separate authentication mechanism for browsers where they sign a token using the same Ed25519 keys.
Authentication is provided by the **auth service**, an irpc-based service that verifies credentials on demand rather than holding all keys in memory. This replaces the earlier `ArcSwap<DynamicConfig>` approach as the primary path (`ArcSwap` remains only as a fallback for minimal deployments) and scales to large user populations without requiring full key set reloads.
Peer credentials are stored in the `peer_credentials` table (fingerprint-based lookup). Account credentials are stored in the `api_keys` table (SHA-256 hash for high-entropy keys).
See [services.md](services.md) for the auth service protocol definition.
## Service Layer
### Architecture
Alknet uses an **irpc-based service layer** to decompose core responsibilities into independently testable, deployable, and replaceable components. irpc provides lightweight RPC that works both as an in-process async boundary (tokio channels) and cross-process/cross-network (QUIC streams via noq).
A **service** is an irpc protocol enum that defines the operations a component supports. Services run as async actors — locally they communicate via `mpsc` channels, remotely via QUIC streams. The `Client<S>` abstracts over both.
### Core Services
| Service | irpc Protocol | Purpose | Always Local? |
|---------|--------------|---------|---------------|
| **Auth** | `AuthProtocol` | Verify identities, check credentials, issue tokens | Can be remote for large-scale auth |
| **Secret** | `SecretProtocol` | Derive keys from seed, encrypt/decrypt stored secrets, key versioning | Local in single-node, remote in clustered |
| **Config** | `ConfigProtocol` | Dynamic config reload (auth keys, forwarding policy) | Local |
| **Storage** | `StorageProtocol` | Graph CRUD, metagraph operations, honker event bridge | Local or remote |
### Service Definition Pattern
Services are defined as irpc protocol enums:
```rust
use irpc::{rpc_requests, channel::{mpsc, oneshot}};
#[rpc_requests(message = AuthMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum AuthProtocol {
#[rpc(tx=oneshot::Sender<AuthResult>)]
#[wrap(VerifyPubkey)]
VerifyPubkey { fingerprint: String, key_data: Vec<u8> },
#[rpc(tx=oneshot::Sender<AuthResult>)]
#[wrap(VerifyToken)]
VerifyToken { token: Vec<u8> },
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(ReloadKeys)]
ReloadKeys,
}
```
### Local vs Remote
```rust
enum AuthClient {
// In-process: zero-copy tokio channels
Local(Client<AuthProtocol>),
// Cross-process/cross-network: QUIC stream
Remote(irpc::rpc::Client<AuthProtocol>),
}
```
A node that runs all services locally uses `Client::local(mpsc::channel)`. A node that delegates auth to a separate service uses `Client::remote(quinn::Connection)`. The call sites are identical — the client abstracts over both.
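To make the local case concrete, the following sketch shows the underlying actor pattern: one message enum per service, with each request carrying its own reply channel. It deliberately uses `std::sync::mpsc` and a thread instead of irpc and tokio, so none of the names below (`spawn_auth_actor`, `verify`, the simplified `AuthResult`) are irpc APIs; they are assumptions made for illustration.

```rust
use std::sync::mpsc;
use std::thread;

/// Simplified result type for this sketch (the documented one carries an Identity).
#[derive(Debug, PartialEq)]
enum AuthResult {
    Ok(String),
    Denied(String),
}

/// One focused message type per service; each request carries a reply channel.
enum AuthMessage {
    VerifyPubkey { fingerprint: String, reply: mpsc::Sender<AuthResult> },
    Shutdown,
}

/// Spawns the service actor and returns the sending half used by callers.
fn spawn_auth_actor(allowed: Vec<String>) -> (mpsc::Sender<AuthMessage>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        for msg in rx {
            match msg {
                AuthMessage::VerifyPubkey { fingerprint, reply } => {
                    let result = if allowed.contains(&fingerprint) {
                        AuthResult::Ok(fingerprint)
                    } else {
                        AuthResult::Denied("unknown fingerprint".to_string())
                    };
                    // The caller may have gone away; ignore a failed reply.
                    let _ = reply.send(result);
                }
                AuthMessage::Shutdown => break,
            }
        }
    });
    (tx, handle)
}

/// Call-site helper: send one request and wait for its single reply.
fn verify(client: &mpsc::Sender<AuthMessage>, fingerprint: &str) -> Option<AuthResult> {
    let (reply, response) = mpsc::channel();
    client
        .send(AuthMessage::VerifyPubkey { fingerprint: fingerprint.to_string(), reply })
        .ok()?;
    response.recv().ok()
}
```

A remote client would serialize the same request and carry the reply over a network stream; the point of the pattern is that `verify` keeps the same shape for the caller either way.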
### Relationship to Call Protocol
Services are **internal** to a node or cluster. The call protocol is **external** — it's how nodes talk to each other over SSH/WebSocket/QUIC/DNS transports. Services handle concerns like auth and secrets that should not be part of the wire protocol but are needed by every node.
A service can also be exposed as a call protocol operation. For example, the secret service's `DeriveKey` could be exposed as `/head/secrets/derive` for remote workers that need key derivation but shouldn't hold the master seed.
### Event Boundary Discipline
Following the event sourcing patterns in [event_source_types.md](/workspace/research/event_sourcing/event_source_types.md):
- **Honker streams** (`stream_publish`/`subscribe`) are **internal event sourcing** for the service that owns that data. They are domain events, not integration events.
- **Call protocol `EventEnvelope`** is the **integration boundary** between nodes. Cross-node notifications are projected from domain events, not published directly.
- **irpc service calls** are **synchronous request-response** within a node or cluster. They are not events and should not be used as such.
This prevents the conflation of internal state management (event sourcing), cross-service notification (integration events), and service calls (request-response).
## DNS Transport (Planned)
### Two DNS Concepts
@@ -322,6 +409,9 @@ alknet's DNS transport should support all of these. DoH (port 443, looks like HT
| 023 | Unified auth | Shared Ed25519 key material across auth mechanisms |
| 024 | Bidirectional call protocol | Both sides can call, generalized from ADR-018 |
| 025 | Handler/spec separation | Downstream registers operations without modifying core |
| 026 | Head/worker terminology | Replace hub/spoke with head/worker; any node can be a head |
| 027 | Service layer via irpc | Core responsibilities decomposed into irpc service protocols |
| 028 | Auth as service | Auth verification via irpc service, not in-memory key set |
## References
@@ -329,6 +419,8 @@ alknet's DNS transport should support all of these. DoH (port 443, looks like HT
- `@alkdev/operations` — TypeScript call protocol, `OperationSpec`, registry
- `@alkdev/flowgraph` — TypeScript operation graph and call graph (planned Rust port)
- `@alkdev/storage` — TypeScript metagraph, identity, ACL (planned Rust port as `alknet-storage`)
- `@alkdev/dispatch` — Instance management service (head+worker architecture reference)
- iroh-dns — DNS resolver and endpoint info (naming/discovery)
- iroh-live-relay — WebTransport relay (planned transport reference)
- irpc — iroh streaming RPC (postcard-only, Rust-to-Rust)
- irpc — iroh streaming RPC (service layer, async boundaries)
- [event_source_types.md](/workspace/research/event_sourcing/event_source_types.md) — Event-driven architecture patterns and anti-patterns


@@ -1,7 +1,7 @@
# Alknet Flowgraph: Operation Graph, Call Graph, and Graph Operations
> Status: Research / Draft
> Last updated: 2026-06-05
> Last updated: 2026-06-06
## Overview
@@ -457,6 +457,7 @@ tokio = { version = "1", features = ["full"] }
| `NodeAttributes` / `EdgeAttributes` traits | Generic over attribute types, matching flowgraph's type parameter pattern |
| DAG enforcement at construction | Matches TypeScript flowgraph: `fromSpecs()` throws `CycleError` |
| `filter_by_status` is O(n) | Matches TypeScript: small graphs (tens to hundreds of nodes), no index needed |
| Call protocol as integration boundary | Call protocol `EventEnvelope` is the cross-node integration boundary; domain events stay within services |
## References
@@ -466,4 +467,6 @@ tokio = { version = "1", features = ["full"] }
- `/workspace/jsonschema` — JSON Schema validation crate
- `/workspace/@alkdev/storage/docs/architecture/metagraph-module.md` — TypeBox Module pattern
- `/workspace/@alkdev/storage/docs/architecture/sqlite-host.md` — SQLite table definitions
- `/workspace/@alkdev/storage/docs/architecture/acl.md` — ACL as metagraph
- `/workspace/@alkdev/storage/docs/architecture/acl.md` — ACL as metagraph
- [services.md](services.md) — Service layer architecture (irpc protocols)
- [core.md](core.md) — Core overview, head/worker terminology

688
docs/research/services.md Normal file

@@ -0,0 +1,688 @@
# Alknet Services: irpc Service Architecture
> Status: Research / Draft
> Last updated: 2026-06-06
## Overview
Alknet uses an **irpc-based service layer** to decompose core responsibilities into independently testable, deployable, and replaceable components. Services communicate via irpc protocol enums that work both as in-process async boundaries (tokio channels) and cross-process/cross-network (QUIC streams via noq).
This document defines the service protocols and their relationships, following the head/worker terminology established in [core.md](core.md).
## Design Principles
### 1. Services are protocol enums
An irpc service is defined as a Rust enum annotated with `#[rpc_requests]`. The macro generates two versions:
- **Serializable** (`Request`): safe to encode with postcard, for remote communication
- **With channels** (`RequestWithChannels`): includes `oneshot::Sender` and `mpsc` channels, for local communication
Both versions use the same `Client<S>` type — the local/remote distinction is transparent at the call site.
### 2. Services are the async boundary
Instead of one giant `mpsc` message enum (the common anti-pattern described in the irpc documentation), each service has its own focused protocol. This keeps responsibilities clear and prevents the "god enum" problem.
### 3. Local-first, remote-capable
Every service can run locally (mpsc channels, zero serialization overhead) or remotely (QUIC streams, postcard serialization). The deployment choice doesn't affect the call sites. A single-node setup runs everything locally. A distributed setup runs auth and secrets on dedicated nodes.
### 4. Event boundary discipline
Per [event_source_types.md](/workspace/research/event_sourcing/event_source_types.md):
- **Honker streams** = domain events (internal to the owning service, for state reconstruction)
- **irpc service calls** = request-response between services (synchronous boundary within a node)
- **Call protocol EventEnvelope** = integration events (cross-node asynchronous boundary)
Domain events are projected to integration events when crossing service or node boundaries. Never publish domain events directly to other services.
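The projection rule can be illustrated with a small function that maps internal events to a narrower external shape. This is a sketch only: `DomainEvent`, `IntegrationEvent`, the variant names, and the topic strings are all hypothetical and do not reflect the actual honker stream or `EventEnvelope` schemas.

```rust
/// Internal events owned by one service (hypothetical variants).
#[allow(dead_code)] // some fields exist only to show what must not leak
enum DomainEvent {
    KeyAdded { fingerprint: String, key_data: Vec<u8> },
    KeyRevoked { fingerprint: String, reason: String },
    CacheWarmed { entries: usize },
}

/// What other services or nodes are allowed to see (hypothetical shape).
#[derive(Debug, PartialEq)]
struct IntegrationEvent {
    topic: String,
    subject: String,
}

/// Projects a domain event to an integration event. Internal details such as
/// raw key material are dropped, and purely internal events are not published.
fn project(event: &DomainEvent) -> Option<IntegrationEvent> {
    match event {
        DomainEvent::KeyAdded { fingerprint, .. } => Some(IntegrationEvent {
            topic: "auth.key_added".to_string(),
            subject: fingerprint.clone(),
        }),
        DomainEvent::KeyRevoked { fingerprint, .. } => Some(IntegrationEvent {
            topic: "auth.key_revoked".to_string(),
            subject: fingerprint.clone(),
        }),
        DomainEvent::CacheWarmed { .. } => None,
    }
}
```

Keeping the projection in one function gives the owning service a single place to decide which internal facts cross the boundary and in what form.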
## Service Definitions
### AuthService
Verifies identities without holding all keys in memory.
```rust
use irpc::{rpc_requests, channel::{mpsc, oneshot}};
use serde::{Serialize, Deserialize};
#[rpc_requests(message = AuthMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum AuthProtocol {
#[rpc(tx=oneshot::Sender<AuthResult>)]
#[wrap(VerifyPubkey)]
VerifyPubkey {
fingerprint: String,
key_data: Vec<u8>,
},
#[rpc(tx=oneshot::Sender<AuthResult>)]
#[wrap(VerifyToken)]
VerifyToken {
token_bytes: Vec<u8>,
timestamp: u64,
},
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(ReloadKeys)]
ReloadKeys,
#[rpc(tx=oneshot::Sender<bool>)]
#[wrap(CheckAccess)]
CheckAccess {
identity: Identity,
operation: String,
},
}
#[derive(Debug, Serialize, Deserialize)]
enum AuthResult {
Ok(Identity),
Denied(String),
}
#[derive(Debug, Serialize, Deserialize)]
struct Identity {
node_id: String,
fingerprint: String,
scopes: Vec<String>,
}
```
**Backends:**
| Mode | Backend | When to use |
|------|---------|-------------|
| Minimal | `ArcSwap<DynamicConfig>` with all keys in memory | CLI, single-node, few users |
| SQLite | Query `peer_credentials` / `api_keys` on demand | Production, multi-user head nodes |
| Remote | Forward to dedicated auth service | Multi-head clusters, auth federation |
**Why this solves the scaling problem:** Instead of loading all keys into memory and swapping them atomically, the auth service queries SQLite per request. An LRU cache on hot fingerprints avoids repeated DB hits. Key revocations are propagated via honker stream notifications.
### SecretService
Derives keys from a master seed, encrypts/decrypts external credentials. The **only** component that holds the master seed phrase.
```rust
#[rpc_requests(message = SecretMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum SecretProtocol {
#[rpc(tx=oneshot::Sender<DerivedKey>)]
#[wrap(DeriveEd25519)]
DeriveEd25519 {
path: String, // e.g. "m/74'/0'/0'/0'"
},
#[rpc(tx=oneshot::Sender<DerivedKey>)]
#[wrap(DeriveEncryptionKey)]
DeriveEncryptionKey {
path: String, // e.g. "m/74'/2'/0'/0'"
},
#[rpc(tx=oneshot::Sender<DerivedKey>)]
#[wrap(DeriveEthereumKey)]
DeriveEthereumKey {
path: String, // e.g. "m/44'/60'/0'/0/0"
},
#[rpc(tx=oneshot::Sender<Vec<u8>>)]
#[wrap(DerivePassword)]
DerivePassword {
path: String,
length: usize,
},
#[rpc(tx=oneshot::Sender<EncryptedData>)]
#[wrap(Encrypt)]
Encrypt {
plaintext: String,
key_version: u32,
},
#[rpc(tx=oneshot::Sender<String>)]
#[wrap(Decrypt)]
Decrypt {
encrypted: EncryptedData,
},
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(Lock)]
Lock,
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(Unlock)]
Unlock {
passphrase: String,
},
}
#[derive(Debug, Serialize, Deserialize)]
struct DerivedKey {
key_type: KeyType,
private_key: Vec<u8>,
public_key: Vec<u8>,
}
#[derive(Debug, Serialize, Deserialize)]
enum KeyType {
Ed25519,
Aes256Gcm,
Secp256k1,
}
#[derive(Debug, Serialize, Deserialize)]
struct EncryptedData {
key_version: u32,
salt: String, // Base64-encoded
iv: String, // Base64-encoded
data: String, // Base64-encoded
}
```
**Security model:**
| State | What's in memory | What's on disk |
|-------|-----------------|---------------|
| Locked | Nothing | Encrypted database, derivation path metadata |
| Unlocked | Master seed in RAM | Same (seed is never persisted) |
| After use | Derived keys cached in RAM | Derivation paths only |
The seed phrase is entered once (at node startup or via an `Unlock` call), held in memory, and never written to disk. Derived keys are computed on demand and cached in memory. The `Lock` call purges the seed and all cached derived keys from memory.
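The locked/unlocked lifecycle can be sketched as a small state holder. This is an illustration under several assumptions: `SecretState` is a hypothetical name, `unlock` takes raw seed bytes rather than a passphrase, the actual derivation is passed in as a closure instead of being implemented here, and the manual zeroing is best effort (production code would more likely rely on a dedicated crate such as `zeroize`).

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum SecretError {
    Locked,
}

/// Holds the seed and derived-key cache only while unlocked.
struct SecretState {
    seed: Option<Vec<u8>>,
    cache: HashMap<String, Vec<u8>>,
}

impl SecretState {
    fn new() -> Self {
        Self { seed: None, cache: HashMap::new() }
    }

    /// Assumption: the caller has already turned the passphrase into seed bytes.
    fn unlock(&mut self, seed: Vec<u8>) {
        self.seed = Some(seed);
    }

    fn is_locked(&self) -> bool {
        self.seed.is_none()
    }

    /// Derives (or returns the cached) key for `path`. The derivation itself
    /// is supplied by the caller; this sketch does no cryptography.
    fn derive<F>(&mut self, path: &str, derive_fn: F) -> Result<Vec<u8>, SecretError>
    where
        F: FnOnce(&[u8], &str) -> Vec<u8>,
    {
        let seed = self.seed.as_deref().ok_or(SecretError::Locked)?;
        if let Some(key) = self.cache.get(path) {
            return Ok(key.clone());
        }
        let key = derive_fn(seed, path);
        self.cache.insert(path.to_string(), key.clone());
        Ok(key)
    }

    /// Overwrites and drops the seed and every cached key.
    fn lock(&mut self) {
        if let Some(seed) = self.seed.as_mut() {
            seed.iter_mut().for_each(|b| *b = 0);
        }
        for key in self.cache.values_mut() {
            key.iter_mut().for_each(|b| *b = 0);
        }
        self.seed = None;
        self.cache.clear();
    }
}
```

Any derivation request made while locked fails with `SecretError::Locked`, which mirrors the "Locked: nothing in memory" row of the table above.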
**Derived key patterns (see [storage.md](storage.md) for derivation path conventions):**
- Identity keys: SLIP-0010 `m/74'/0'/0'/0'` → Ed25519 keypair for alknet authentication
- Encryption keys: SLIP-0010 `m/74'/2'/0'/0'` → AES-256-GCM key for external credential encryption
- Ethereum keys: BIP32 `m/44'/60'/0'/0/0` → secp256k1 keypair for smart contract signing
- Site passwords: BIP32 `m/74'/1'/0'/{hash}'` → deterministic password derivation (orbit-db-wallet pattern)
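The path strings above follow the usual BIP32 notation, where an apostrophe marks a hardened index (the index with its top bit set). A minimal parser sketch, assuming only that notation; `parse_path`, `is_fully_hardened`, and `PathError` are hypothetical names. The hardened check matters because SLIP-0010 supports only hardened child derivation for Ed25519.

```rust
/// Offset that marks a hardened child index (top bit set).
const HARDENED: u32 = 0x8000_0000;

#[derive(Debug, PartialEq)]
enum PathError {
    MissingRoot,
    BadIndex(String),
}

/// Parses a path such as "m/74'/0'/0'/0'" into raw child indices.
fn parse_path(path: &str) -> Result<Vec<u32>, PathError> {
    let mut segments = path.split('/');
    if segments.next() != Some("m") {
        return Err(PathError::MissingRoot);
    }
    segments
        .map(|seg| {
            // A trailing apostrophe marks the segment as hardened.
            let (digits, hardened) = match seg.strip_suffix('\'') {
                Some(d) => (d, true),
                None => (seg, false),
            };
            let index: u32 = digits
                .parse()
                .map_err(|_| PathError::BadIndex(seg.to_string()))?;
            if index >= HARDENED {
                return Err(PathError::BadIndex(seg.to_string()));
            }
            Ok(if hardened { index | HARDENED } else { index })
        })
        .collect()
}

/// True when every index is hardened, as Ed25519 paths under SLIP-0010 must be.
fn is_fully_hardened(indices: &[u32]) -> bool {
    indices.iter().all(|i| (i & HARDENED) != 0)
}
```

With this check, the identity and encryption paths pass as fully hardened, while the Ethereum path `m/44'/60'/0'/0/0` does not, which is expected since it is a secp256k1 path and not an Ed25519 one.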
### ConfigService
Dynamic configuration reload. Wraps `ArcSwap<DynamicConfig>` for minimal deployments, or delegates to SQLite-backed storage for production.
```rust
#[rpc_requests(message = ConfigMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum ConfigProtocol {
#[rpc(tx=oneshot::Sender<ForwardingPolicy>)]
#[wrap(GetForwardingPolicy)]
GetForwardingPolicy,
#[rpc(tx=oneshot::Sender<RateLimitConfig>)]
#[wrap(GetRateLimits)]
GetRateLimits,
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(ReloadForwarding)]
ReloadForwarding {
policy: ForwardingPolicy,
},
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(ReloadRateLimits)]
ReloadRateLimits {
limits: RateLimitConfig,
},
}
```
### StorageService
Graph CRUD operations, metagraph management, and honker event bridge. Wraps the `alknet-storage` crate.
```rust
#[rpc_requests(message = StorageMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum StorageProtocol {
#[rpc(tx=oneshot::Sender<Graph>)]
#[wrap(CreateGraph)]
CreateGraph {
graph_type_id: String,
name: String,
},
#[rpc(tx=oneshot::Sender<Node>)]
#[wrap(AddNode)]
AddNode {
graph_id: String,
key: String,
attributes: serde_json::Value,
},
#[rpc(tx=oneshot::Sender<Node>)]
#[wrap(GetNode)]
GetNode {
graph_id: String,
key: String,
},
#[rpc(tx=mpsc::Sender<StorageEvent>)]
#[wrap(Subscribe)]
Subscribe {
stream_name: String,
},
}
```
The `Subscribe` variant uses server-streaming irpc: the client sends one request, and the service streams multiple `StorageEvent` messages back through the `mpsc::Sender`, which the client reads from the receiving half. These are honker stream events projected into integration events.
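The one-request, many-responses shape can be sketched with standard-library channels. This is not irpc code: it swaps in `std::sync::mpsc` and a thread, and `spawn_storage_actor`, `subscribe`, and the `StorageEvent` fields are hypothetical. The actor here simply emits a fixed number of events per subscription and then closes the stream.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical event shape for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct StorageEvent {
    stream_name: String,
    sequence: u64,
}

/// A subscription request carries the sending half of the event stream.
struct Subscribe {
    stream_name: String,
    events: mpsc::Sender<StorageEvent>,
}

/// Spawns an actor that answers each subscription with `backlog` events.
fn spawn_storage_actor(backlog: u64) -> mpsc::Sender<Subscribe> {
    let (tx, rx) = mpsc::channel::<Subscribe>();
    thread::spawn(move || {
        for request in rx {
            for sequence in 0..backlog {
                let event = StorageEvent {
                    stream_name: request.stream_name.clone(),
                    sequence,
                };
                // Stop early if the subscriber dropped its receiver.
                if request.events.send(event).is_err() {
                    break;
                }
            }
            // `request` is dropped here, which closes the stream for the subscriber.
        }
    });
    tx
}

/// Client side: one request out, a receiver of many events back.
fn subscribe(
    client: &mpsc::Sender<Subscribe>,
    stream_name: &str,
) -> Option<mpsc::Receiver<StorageEvent>> {
    let (events, receiver) = mpsc::channel();
    client
        .send(Subscribe { stream_name: stream_name.to_string(), events })
        .ok()?;
    Some(receiver)
}
```

The subscriber iterates over the receiver until the service drops its sender, so end-of-stream needs no separate message in this sketch.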
## Service Composition
### Minimal Deployment (Single Node, CLI)
All services run locally as tokio actors:
```
┌──────────────────────────────────────────────┐
│ Single Process │
│ │
│ ┌─────────┐ ┌─────────┐ ┌──────────────┐ │
│ │ Auth │ │ Secret │ │ Config │ │
│ │ Service │ │ Service │ │ Service │ │
│ │ (mpsc) │ │ (mpsc) │ │ (mpsc) │ │
│ └────┬─────┘ └────┬────┘ └──────┬───────┘ │
│ │ │ │ │
│ ┌────▼─────────────▼───────────────▼───────┐ │
│ │ alknet-core Server │ │
│ │ (SSH auth, call protocol, forwarding) │ │
│ └──────────────────────────────────────────┘ │
└──────────────────────────────────────────────┘
```
- Auth service uses `ArcSwap<DynamicConfig>` (all keys in memory)
- Secret service runs unlocked (seed in memory, no external access)
- Config service uses `ArcSwap<DynamicConfig>` directly
### Production Deployment (Multi-Node)
Auth and secrets run on dedicated nodes; workers access them remotely:
```
┌────────────────────┐     ┌─────────────────────┐
│     Auth Node      │     │     Secret Node     │
│                    │     │                     │
│  AuthProtocol      │     │  SecretProtocol     │
│  (SQLite-backed)   │     │  (seed in RAM)      │
│                    │     │                     │
└────────┬───────────┘     └──────────┬──────────┘
         │ QUIC (irpc)                 │ QUIC (irpc)
         │                            │
┌────────▼────────────────────────────▼─────────┐
│                   Head Node                   │
│                                               │
│  ┌──────────┐  ┌──────────┐  ┌─────────────┐  │
│  │ Config   │  │ Storage  │  │ alknet-core │  │
│  │ Service  │  │ Service  │  │ Server      │  │
│  │ (local)  │  │ (local)  │  │             │  │
│  └──────────┘  └──────────┘  └─────────────┘  │
└───────────────────────────────────────────────┘
         │ SSH / iroh / TLS
┌────────▼──────────────────────────────────────┐
│                  Worker Node                  │
│                                               │
│  ┌──────────┐  ┌──────────────┐               │
│  │ Storage  │  │ alknet-core  │               │
│  │ Client   │  │ Client       │               │
│  │ (remote) │  │              │               │
│  └──────────┘  └──────────────┘               │
└───────────────────────────────────────────────┘
```
Workers don't hold the seed or the auth database. They request derived keys and auth verification via irpc over QUIC.
## Service and Call Protocol Relationship
Services are **internal** — they run within a node or cluster. The call protocol is **external** — it's how nodes communicate with each other over SSH/QUIC/WebSocket/DNS transports.
A service can be exposed as a call protocol operation:
| Internal Service | Call Protocol Path | Direction |
|-----------------|-------------------|-----------|
| AuthProtocol::VerifyPubkey | `/head/auth/verify` | Worker → Head |
| SecretProtocol::DeriveEd25519 | `/head/secrets/derive` | Worker → Head (restricted) |
| StorageProtocol::Subscribe | `/{node}/storage/watch` | Any → Any |
| ConfigProtocol::GetForwardingPolicy | `/head/config/forwarding` | Worker → Head |
External workers call these through the call protocol, which routes to the service on the head node:
```
Worker                          Head
  │                               │
  │  call.requested               │
  │  operation: /head/auth/verify │
  │  payload: { fingerprint, key }│
  │ ─────────────────────────────►│
  │                               │  ┌─ AuthProtocol::VerifyPubkey ─┐
  │                               │  │ (irpc, local mpsc channel)   │
  │                               │  └─ Result: AuthResult ─────────┘
  │                               │
  │  call.responded               │
  │  payload: { status: "ok" }    │
  │ ◄─────────────────────────────│
```
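The path-to-service mapping in the table above can be sketched as a small router. This is a hedged illustration: `ServiceTarget` and `route` are hypothetical names, and the real `OperationRegistry` may resolve paths differently.

```rust
/// Hypothetical internal targets for call protocol operations, following the table above.
#[derive(Debug, PartialEq)]
pub enum ServiceTarget {
    AuthVerifyPubkey,
    SecretDeriveEd25519,
    ConfigGetForwardingPolicy,
    /// Matches `/{node}/storage/watch` and carries the addressed node name.
    StorageSubscribe { node: String },
}

/// Resolve a call protocol path to the internal service operation it exposes.
pub fn route(path: &str) -> Option<ServiceTarget> {
    // Fixed head-node paths first.
    match path {
        "/head/auth/verify" => return Some(ServiceTarget::AuthVerifyPubkey),
        "/head/secrets/derive" => return Some(ServiceTarget::SecretDeriveEd25519),
        "/head/config/forwarding" => return Some(ServiceTarget::ConfigGetForwardingPolicy),
        _ => {}
    }
    // Then the parameterised pattern `/{node}/storage/watch`.
    let segments: Vec<&str> = path.split('/').collect();
    match segments.as_slice() {
        ["", node, "storage", "watch"] if !node.is_empty() => {
            Some(ServiceTarget::StorageSubscribe { node: node.to_string() })
        }
        _ => None,
    }
}
```

Access restrictions (for example the "restricted" note on `/head/secrets/derive`) would be enforced separately, before the call reaches the service.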
## Service Integration Example
A head/worker deployment demonstrates service integration end-to-end:
- **Head node**: runs Auth, Secret, and Config services locally
- **Worker node**: connects to head via alknet call protocol
The worker-to-head protocol maps to call protocol operations:
| Worker Message | Call Protocol Path | Service |
|----------------|-------------------|---------|
| Auth | `/head/auth/verify` | AuthProtocol |
| Heartbeat | `/worker/heartbeat` (subscription) | ConfigProtocol |
| Task result | `/worker/task/submit` | StorageProtocol (persistence) |
| Task assignment | `/head/task/template` (subscription) | StorageProtocol |
Worker keys are derived from the seed by the secret service. The head node's API credentials are stored encrypted and decrypted on demand by the secret service.
## Derived Key Conventions
Standardized SLIP-0010/BIP32 paths (see [storage.md](storage.md) for full table):
| Path | Purpose | Curve/Algorithm |
|------|---------|----------------|
| `m/74'/0'/0'/0'` | Primary identity keypair | Ed25519 (alknet auth) |
| `m/74'/0'/0'/{n}'` | Worker/device identity | Ed25519 |
| `m/74'/0'/1'/0'` | SSH host key | Ed25519 |
| `m/74'/1'/0'/{hash}'` | Site-specific password | Deterministic (like orbit-db-wallet) |
| `m/74'/2'/0'/0'` | Encryption key for external credentials | AES-256-GCM |
| `m/44'/60'/0'/0/0` | Ethereum signing key | secp256k1 (smart contract) |
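In these paths `74'` sits at the first (purpose) level rather than at the BIP44 coin-type level, so SLIP-0044 does not allocate it directly. It is a working convention for alknet; SLIP-0044 index 74 already appears to be assigned to another project, so the number should be re-checked before it is treated as reserved.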
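A path string can be checked mechanically before it is handed to a derivation routine. The sketch below assumes nothing beyond the path syntax in the table; `parse_path` and `is_valid_for_ed25519` are hypothetical helper names. It relies on one fact from SLIP-0010: Ed25519 supports hardened child derivation only, so the Ethereum path with its non-hardened tail cannot be used with an Ed25519 key.

```rust
/// Offset that marks a hardened child index in BIP32/SLIP-0010.
pub const HARDENED: u32 = 0x8000_0000;

/// Parse a path such as `m/74'/0'/0'/0'` into raw child indices
/// (hardened components have the top bit set).
pub fn parse_path(path: &str) -> Result<Vec<u32>, String> {
    let mut parts = path.split('/');
    if parts.next() != Some("m") {
        return Err("path must start with 'm'".to_string());
    }
    parts
        .map(|part| -> Result<u32, String> {
            let (digits, hardened) = match part.strip_suffix('\'') {
                Some(d) => (d, true),
                None => (part, false),
            };
            let index = digits
                .parse::<u32>()
                .map_err(|_| format!("invalid component: {part:?}"))?;
            if index >= HARDENED {
                return Err(format!("index out of range: {part:?}"));
            }
            Ok(if hardened { index | HARDENED } else { index })
        })
        .collect()
}

/// True only if the path parses and every component is hardened.
pub fn is_valid_for_ed25519(path: &str) -> bool {
    parse_path(path)
        .map(|indices| indices.iter().all(|i| (i & HARDENED) != 0))
        .unwrap_or(false)
}
```

Template rows such as `m/74'/1'/0'/{hash}'` fail to parse until the placeholder is replaced with a concrete index.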
## Application Services
Core services (auth, secret, config, storage) are infrastructure that every node needs. Application services are domain-specific and pluggable — they expose operations via the call protocol and are registered dynamically by the node operator.
### Service Tiers
```
┌─────────────────────────────────────────────────────────┐
│                    Application Layer                    │
│ DockerService · NodeService · WalletService · GitService│
│ ProxyService · ComputeService · AgentService · ...      │
├─────────────────────────────────────────────────────────┤
│                      Core Services                      │
│ AuthService · SecretService · ConfigService             │
│ StorageService                                          │
├─────────────────────────────────────────────────────────┤
│                       alknet-core                       │
│ Transport · Call Protocol · SSH · irpc                  │
└─────────────────────────────────────────────────────────┘
```
### DockerService
Container lifecycle management on a node. Wraps the Docker Engine API (via the `bollard` crate, already used in dispatch) and exposes it through the call protocol.
```rust
#[rpc_requests(message = DockerMessage)]
enum DockerProtocol {
    #[rpc(tx=oneshot::Sender<ContainerInfo>)]
    #[wrap(CreateContainer)]
    CreateContainer {
        image: String,
        name: Option<String>,
        env: Vec<(String, String)>,
        ports: Vec<(u16, u16)>,
    },
    #[rpc(tx=oneshot::Sender<ContainerInfo>)]
    #[wrap(InspectContainer)]
    InspectContainer { id: String },
    #[rpc(tx=oneshot::Sender<Vec<ContainerInfo>>)]
    #[wrap(ListContainers)]
    ListContainers { all: bool },
    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(StopContainer)]
    StopContainer { id: String, timeout: u64 },
    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(RemoveContainer)]
    RemoveContainer { id: String, force: bool },
    #[rpc(tx=mpsc::Sender<ContainerEvent>)]
    #[wrap(StreamEvents)]
    StreamEvents { filters: Vec<String> },
}
```
This makes container management a first-class alknet operation that can be called from any connected node, not just SSH. The dispatch project's `InstanceProvider` trait pattern maps directly here.
**Self-hosting use case**: An operator deploys a "server in a box" by connecting a worker node with DockerService registered. A head node (or another authorized node) can then deploy containers remotely via call protocol: `/node/docker/create`, `/node/docker/list`, etc. This replaces manual SSH + docker-compose with automated, auditable, policy-governed deployment.
### NodeService
System health, metrics, and tiered observability. Exposes system metrics and supports tiered escalation from small models to larger models to humans.
```rust
#[rpc_requests(message = NodeMessage)]
enum NodeProtocol {
    #[rpc(tx=oneshot::Sender<SystemMetrics>)]
    #[wrap(GetMetrics)]
    GetMetrics { categories: Vec<MetricCategory> },
    #[rpc(tx=oneshot::Sender<HealthStatus>)]
    #[wrap(HealthCheck)]
    HealthCheck,
    #[rpc(tx=mpsc::Sender<SystemEvent>)]
    #[wrap(SubscribeMetrics)]
    SubscribeMetrics { interval_ms: u64, categories: Vec<MetricCategory> },
    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(Escalate)]
    Escalate { severity: Severity, message: String, context: serde_json::Value },
}

#[derive(Serialize, Deserialize)]
enum MetricCategory { Cpu, Memory, Disk, Network, Docker, Uptime }

#[derive(Serialize, Deserialize)]
enum Severity { Info, Warning, Critical }
```
**Tiered escalation pattern**: A small model (fast, cheap) subscribes to `/node/metrics/stream` and evaluates simple rules (disk > 90%, memory > 95%, container crashed). When a rule triggers, it calls `/node/alert/escalate` with context. The head node decides whether to notify a larger model or a human.
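The first tier of this pattern is plain rule evaluation over a metrics sample. The sketch below uses the thresholds named above (disk > 90%, memory > 95%, container crashed); the `MetricsSample` layout, the severity assigned to each rule, and the `evaluate` and `highest` helpers are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Info,
    Warning,
    Critical,
}

/// Hypothetical snapshot; the real `SystemMetrics` layout is not specified here.
pub struct MetricsSample {
    pub disk_used_pct: f64,
    pub memory_used_pct: f64,
    pub crashed_containers: u32,
}

/// Evaluate the simple rules and return one finding per triggered rule.
pub fn evaluate(sample: &MetricsSample) -> Vec<(Severity, String)> {
    let mut findings = Vec::new();
    if sample.disk_used_pct > 90.0 {
        findings.push((Severity::Warning, format!("disk at {:.0}%", sample.disk_used_pct)));
    }
    if sample.memory_used_pct > 95.0 {
        findings.push((Severity::Critical, format!("memory at {:.0}%", sample.memory_used_pct)));
    }
    if sample.crashed_containers > 0 {
        findings.push((
            Severity::Critical,
            format!("{} container(s) crashed", sample.crashed_containers),
        ));
    }
    findings
}

/// Highest severity among the findings; this is what would be passed to `Escalate`.
pub fn highest(findings: &[(Severity, String)]) -> Option<Severity> {
    findings.iter().map(|(severity, _)| *severity).max()
}
```

An empty result means nothing is escalated; otherwise the highest severity and the collected messages would form the `Escalate` request.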
### WalletService
Multichain wallet operations using an HD derivation library (e.g., wagyu). Derives keys from the same master seed via the secret service, signs transactions, and manages addresses.
```rust
#[rpc_requests(message = WalletMessage)]
enum WalletProtocol {
    #[rpc(tx=oneshot::Sender<AddressInfo>)]
    #[wrap(GetAddress)]
    GetAddress { chain: Chain, path: String },
    #[rpc(tx=oneshot::Sender<BalanceInfo>)]
    #[wrap(GetBalance)]
    GetBalance { chain: Chain, address: String },
    #[rpc(tx=oneshot::Sender<SignedTransaction>)]
    #[wrap(SignTransaction)]
    SignTransaction { chain: Chain, path: String, tx_params: serde_json::Value },
    #[rpc(tx=oneshot::Sender<String>)]
    #[wrap(VerifyAddress)]
    VerifyAddress { chain: Chain, address: String },
}

#[derive(Serialize, Deserialize)]
enum Chain { Bitcoin, Ethereum, Monero, Zcash }
```
The WalletService delegates key derivation to the SecretService via irpc. It never sees the master seed — only derived keypairs for specific paths. This means wallet operations are available to authorized nodes without exposing the full key hierarchy.
### ProxyService
Reverse proxy and TLS certificate management. Automates nginx/certbot configuration for services deployed via DockerService.
```rust
#[rpc_requests(message = ProxyMessage)]
enum ProxyProtocol {
    #[rpc(tx=oneshot::Sender<ProxyConfig>)]
    #[wrap(GetConfig)]
    GetConfig,
    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(AddRoute)]
    AddRoute { domain: String, upstream: String, tls: bool },
    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(RemoveRoute)]
    RemoveRoute { domain: String },
    #[rpc(tx=oneshot::Sender<CertificateInfo>)]
    #[wrap(ProvisionCert)]
    ProvisionCert { domain: String },
    #[rpc(tx=oneshot::Sender<Vec<CertificateInfo>>)]
    #[wrap(ListCerts)]
    ListCerts,
}
```
### ComputeService
Abstracts compute provider APIs (starting with dispatch's `InstanceProvider` pattern). Manages remote instances across providers.
```rust
#[rpc_requests(message = ComputeMessage)]
enum ComputeProtocol {
    #[rpc(tx=oneshot::Sender<InstanceInfo>)]
    #[wrap(CreateInstance)]
    CreateInstance { provider: String, spec: InstanceSpec },
    #[rpc(tx=oneshot::Sender<Vec<InstanceInfo>>)]
    #[wrap(ListInstances)]
    ListInstances { provider: Option<String> },
    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(DestroyInstance)]
    DestroyInstance { id: String },
    #[rpc(tx=oneshot::Sender<InstanceInfo>)]
    #[wrap(GetInstance)]
    GetInstance { id: String },
}
```
### Registration Pattern
Application services register with the call protocol's `OperationRegistry` at startup:
```rust
registry.register(
    OperationSpec { name: "/node/docker/create", namespace: "docker", ... },
    docker_service.create_container_handler,
);
registry.register(
    OperationSpec { name: "/node/metrics/stream", namespace: "node", ... },
    node_service.subscribe_metrics_handler,
);
```
A worker node that exposes Docker and Node services registers those operations when it connects to the head. The head can then route calls from any node to the appropriate worker via the call protocol.
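To make the registration and routing idea concrete, here is a minimal, self-contained registry. It is a sketch under stated assumptions, not the alknet `OperationRegistry`: the `Handler` signature (string payload in, string result out) and the duplicate-path policy are hypothetical choices for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical handler signature: payload text in, response text or error out.
pub type Handler = Box<dyn Fn(&str) -> Result<String, String> + Send + Sync>;

/// Maps call protocol paths to handlers.
#[derive(Default)]
pub struct OperationRegistry {
    handlers: HashMap<String, Handler>,
}

impl OperationRegistry {
    /// Register a handler; refuses to overwrite an existing path.
    pub fn register(&mut self, path: &str, handler: Handler) -> Result<(), String> {
        if self.handlers.contains_key(path) {
            return Err(format!("operation already registered: {path}"));
        }
        self.handlers.insert(path.to_string(), handler);
        Ok(())
    }

    /// Dispatch a call to the handler registered for `path`.
    pub fn call(&self, path: &str, payload: &str) -> Result<String, String> {
        match self.handlers.get(path) {
            Some(handler) => handler(payload),
            None => Err(format!("unknown operation: {path}")),
        }
    }
}
```

Rejecting duplicate paths is one possible policy; a head that routes to several workers would more likely key handlers by node as well as by path.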
### Self-Hosting Stack Example
A minimal self-hosted server with all services:
```
┌─────────────────────────────────────────────────────────┐
│                        Head Node                        │
│                                                         │
│  Core: Auth · Secret · Config · Storage                 │
│  App:  Docker · Node · Proxy · Git · Wallet · Compute   │
│                                                         │
│  Call protocol paths:                                   │
│    /head/auth/*                                         │
│    /head/docker/*                                       │
│    /head/proxy/*                                        │
│    /head/wallet/*                                       │
│    /head/compute/*                                      │
│    /head/node/metrics/*                                 │
└─────────────────────────────────────────────────────────┘
```
An operator deploys this by:
1. Running `alknet serve --config stack.toml`
2. Entering their seed phrase once (unlocks the secret service)
3. All services come online with keys derived from the seed
4. Docker containers for Gitea, Postgres, Redis, etc. are managed via DockerService
5. Reverse proxy and TLS are automated via ProxyService
6. Wallet keys are derived on demand via WalletService
No manual SSH, no hardcoded credentials, no separate secret management. The seed phrase is the single root of trust.
## Crate Structure
```
alknet-core/
├── transport/ — Transport trait, TCP, TLS, iroh, DNS
├── call/ — Call protocol, PendingRequestMap, OperationRegistry
├── auth/ — AuthService protocol, identity types
├── secrets/ — SecretService protocol, BIP39, SLIP-0010, AES-GCM
├── config/ — ConfigService protocol, StaticConfig, DynamicConfig
├── handler/ — ServerHandler, SSH authentication hooks
└── serve.rs — Server::run(), multi-transport listeners
alknet-storage/
├── metagraph/ — GraphType, NodeType, EdgeType persistence
├── identity/ — accounts, organizations, peer_credentials, api_keys
├── acl/ — PrincipalNode, DelegatesEdge, access control graph
├── secrets/ — Encrypted node type, encrypt/decrypt, key derivation bridge
├── honker/ — honker integration: notify, stream, queue
├── graph/ — GraphInstance, Node, Edge CRUD with schema validation
└── schema/ — JSON Schema definitions (serde + jsonschema)
```
## Security Considerations
1. **Seed phrase is never persisted** — it's entered at startup or via `Unlock` call and held only in RAM
2. **Derived keys are cached in memory** — cleared on `Lock`
3. **External credentials are encrypted at rest** — the encryption key is itself derived from the seed
4. **Auth service never sees the seed** — it only sees public key fingerprints and verification results
5. **irpc remote communication is over QUIC** — encrypted in transit; irpc doesn't add its own encryption layer (assumes the transport provides it)
6. **Lock wipes all secrets** — a locked secret service returns errors for all requests until unlocked
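Points 1 and 6 amount to a two-state service: locked (no seed, every request fails) and unlocked (seed in RAM). A minimal sketch of that state machine follows. The type and method names are hypothetical, and the manual byte overwrite is only a best-effort stand-in; a real implementation would more likely rely on a dedicated zeroing crate such as `zeroize`.

```rust
#[derive(Debug, PartialEq)]
pub enum SecretError {
    Locked,
}

/// Holds the seed only while unlocked; nothing is written to disk.
pub struct SecretService {
    seed: Option<Vec<u8>>,
}

impl SecretService {
    /// A new service starts locked.
    pub fn new() -> Self {
        SecretService { seed: None }
    }

    /// Store the seed in memory, wiping any previous one first.
    pub fn unlock(&mut self, seed: Vec<u8>) {
        self.lock();
        self.seed = Some(seed);
    }

    /// Overwrite the seed bytes (best effort) and return to the locked state.
    pub fn lock(&mut self) {
        if let Some(seed) = self.seed.as_mut() {
            for byte in seed.iter_mut() {
                *byte = 0;
            }
        }
        self.seed = None;
    }

    /// Run `f` with access to the seed, or fail if the service is locked.
    pub fn with_seed<T>(&self, f: impl FnOnce(&[u8]) -> T) -> Result<T, SecretError> {
        match &self.seed {
            Some(seed) => Ok(f(seed.as_slice())),
            None => Err(SecretError::Locked),
        }
    }
}
```

Routing every use of the seed through `with_seed` keeps the "locked means error" rule in one place.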
## Open Questions
- **OQ-SVC-01**: Should the secret service support multiple seed phrases (one per tenant or identity)?
The simplest approach is one seed per node. Multi-seed support (e.g., one per tenant in a multi-tenant system) can be added later by indexing the `Unlock` call with a tenant ID. Defer for now.
- **OQ-SVC-02**: Should service protocols use postcard (binary) or JSON for remote calls?
irpc defaults to postcard for efficiency. However, the call protocol uses JSON `EventEnvelope` for cross-language compatibility. Service-to-service calls should use postcard (Rust-to-Rust), while node-to-node calls use JSON (call protocol). The irpc remote path naturally uses postcard.
- **OQ-SVC-03**: How does the secret service integrate with the existing `EncryptedDataSchema` from `@alkdev/storage`?
The TypeScript `encrypt()`/`decrypt()` functions use PBKDF2 with a password. In Rust, the secret service replaces the password with a derived AES-256-GCM key. The `EncryptedData` schema (key_version, salt, iv, data) stays the same, but key derivation changes from PBKDF2(password) to SLIP-0010(seed, path). This is a superset — the old format can be migrated by re-encrypting with the new key.
- **OQ-SVC-04**: Should workers cache derived keys locally?
Yes, with a TTL. A worker that holds a derived Ed25519 keypair for its session can re-authenticate without calling the secret service every time. The TTL should be configurable (default: 1 hour). The head can revoke by invalidating the session, not by expiring the key.
- **OQ-SVC-05**: How does the smart contract (NFT-based ACL) interact with the secret service?
The Ethereum signing key (`m/44'/60'/0'/0/0`) is derived from the same seed. The secret service can sign transactions on behalf of the node. The smart contract is a separate concern — it's the external source of truth for identity registration. The local ACL graph (in `alknet-storage`) is a cache that's synced from the contract, not the other way around.
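For OQ-SVC-04, a worker-side TTL cache could look roughly like the sketch below. It is an assumption-level illustration: `KeyCache` and its methods are hypothetical, keys are plain byte vectors, and the clock is passed in explicitly so expiry is easy to reason about.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Caches derived key material per derivation path for a bounded time.
pub struct KeyCache {
    ttl: Duration,
    entries: HashMap<String, (Vec<u8>, Instant)>,
}

impl KeyCache {
    pub fn new(ttl: Duration) -> Self {
        KeyCache { ttl, entries: HashMap::new() }
    }

    /// Store `key` for `path`, stamped with the caller-supplied time.
    pub fn insert_at(&mut self, path: &str, key: Vec<u8>, now: Instant) {
        self.entries.insert(path.to_string(), (key, now));
    }

    /// Return the cached key if it is younger than the TTL; evict it otherwise.
    pub fn get_at(&mut self, path: &str, now: Instant) -> Option<Vec<u8>> {
        let expired = match self.entries.get(path) {
            Some((_, stored_at)) => now.duration_since(*stored_at) >= self.ttl,
            None => return None,
        };
        if expired {
            self.entries.remove(path);
            return None;
        }
        self.entries.get(path).map(|(key, _)| key.clone())
    }

    /// Drop everything, for example when the head invalidates the session.
    pub fn clear(&mut self) {
        self.entries.clear();
    }
}
```

With the suggested default of one hour, the cache would be built as `KeyCache::new(Duration::from_secs(3600))`; on a miss the worker falls back to the secret service.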
## References
- [core.md](core.md) — Core overview, transport, call protocol, head/worker model
- [configuration.md](configuration.md) — Config architecture, auth service, DynamicConfig
- [storage.md](storage.md) — Metagraph, identity, ACL, secrets, event boundaries
- [flow.md](flow.md) — Operation graph, call graph, petgraph mapping
- `/workspace/@alkdev/storage/docs/architecture/encrypted-data.md` — Original encrypted data design (TypeScript)
- `/workspace/research/event_sourcing/event_source_types.md` — Event-driven architecture patterns
- irpc crate — https://docs.rs/irpc — Service protocol definitions, local/remote abstraction
- SLIP-0010 — https://github.com/satoshilabs/slips/blob/master/slip-0010.md — HD key derivation for Ed25519
- BIP39 — https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki — Mnemonic code for generating deterministic keys (widely used beyond cryptocurrency)
- `ed25519-bip32` crate — https://docs.rs/ed25519-bip32 — BIP32-Ed25519 (Cardano/IOHK approach)
- `bip39` crate — https://docs.rs/bip39 — Mnemonic generation and seed derivation


@@ -1,11 +1,18 @@
# Alknet Storage: Metagraph, Identity, ACL, and Honker Integration
# Alknet Storage: Metagraph, Identity, ACL, Secrets, and Honker Integration
> Status: Research / Draft
> Last updated: 2026-06-05
> Last updated: 2026-06-06
## Overview
`alknet-storage` is a Rust crate providing SQLite-backed graph storage, identity management, access control, and reactivity via honker. It mirrors the TypeScript `@alkdev/storage` package's design (`sqlite-host.md`, `metagraph-module.md`, `acl.md`) while leveraging Rust's type system and petgraph's performance.
`alknet-storage` is a Rust crate providing SQLite-backed graph storage, identity management, access control, secrets management, and reactivity via honker. It mirrors the TypeScript `@alkdev/storage` package's design (`sqlite-host.md`, `metagraph-module.md`, `acl.md`) while leveraging Rust's type system and petgraph's performance.
## Terminology
This document uses **head/worker** terminology instead of hub/spoke:
- **Head node**: Coordinating node that can also be a worker
- **Worker node**: Node that connects to a head and registers services
- **Node**: Any participant in the network
## Crate Decomposition
@@ -14,6 +21,7 @@ alknet-storage
├── metagraph/ — GraphType, NodeType, EdgeType definitions and persistence
├── identity/ — accounts, organizations, peer_credentials, api_keys, audit_logs
├── acl/ — PrincipalNode, DelegatesEdge, access control graph
├── secrets/ — HD key derivation (BIP39/SLIP-0010), encrypted data, secret service bridge
├── honker/ — honker integration: notify, stream, queue, event bridge
├── graph/ — GraphInstance, Node, Edge CRUD with schema validation
└── schema/ — JSON Schema definitions (serde + jsonschema for runtime validation)
@@ -199,7 +207,7 @@ pub struct DelegatesEdgeAttrs {
- **Account** nodes represent individual users
- **Org** nodes represent organizations
- **Service** nodes represent automated agents (LLM workers, spoke credentials)
- **Service** nodes represent automated agents (LLM workers, node credentials)
- **Role** nodes represent named permission sets
Delegation edges (`delegates`) carry `narrowed_scopes` — the delegate can only exercise scopes that are a subset of the delegator's. Liability flows upward; permissions flow downward with narrowing.
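The narrowing rule is a subset check: a delegation is valid only if every requested scope is already held by the delegator. The sketch below illustrates that check; `narrow_scopes`, `scopes`, and the string form of scopes are hypothetical, not the crate's API.

```rust
use std::collections::BTreeSet;

/// Grant `requested` only if it is a subset of the delegator's scopes;
/// otherwise report the scopes that would widen access.
pub fn narrow_scopes(
    delegator: &BTreeSet<String>,
    requested: &BTreeSet<String>,
) -> Result<BTreeSet<String>, Vec<String>> {
    let excess: Vec<String> = requested.difference(delegator).cloned().collect();
    if excess.is_empty() {
        Ok(requested.clone())
    } else {
        Err(excess)
    }
}

/// Convenience constructor used in the examples.
pub fn scopes(items: &[&str]) -> BTreeSet<String> {
    items.iter().map(|s| s.to_string()).collect()
}
```

Applying the same check at each hop of a delegation chain keeps permissions from widening as they flow downward.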
@@ -318,6 +326,103 @@ For the distributed use case (later):
Replication mindset from the start: **every write is atomic with a notification**. The honker stream event is the replication unit. A future replicator reads `_honker_stream_*` tables and propagates changes to subscribed relays.
### Event Boundary Discipline
Following [event_source_types.md](/workspace/research/event_sourcing/event_source_types.md), honker streams serve different roles in different contexts. Preventing conflation is critical:
| Event Type | Source | Consumer | Boundary |
|-----------|--------|----------|----------|
| **Domain events** (Event Sourcing) | Service that owns the data | Same service, for state reconstruction | Internal — never published directly to other services |
| **Integration events** (State Transfer) | Projected from domain events | Other services/nodes, for cache updates | Cross-service — simple, versioned, stripped of internals |
| **Notifications** (Thin Events) | Service that owns the data | Any subscriber, for triggering workflows | Cross-node — just entity ID + action, consumer fetches details |
Conflation anti-patterns to avoid:
- **Leaky event store**: Don't let other services read honker stream events directly to drive business logic. Project domain events into integration events first.
- **Boomerang coupling**: If a consumer of an integration event must call back to the source service synchronously, the event payload is too thin. Upgrade to a fat event.
- **Fat notification trap**: If a notification event carries the full entity state, use state transfer instead.
The call protocol's `EventEnvelope` is the **integration boundary** between nodes. Domain events in honker streams stay within the service that owns them.
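Projecting a domain event into an integration event can be as small as the function below. The event shapes are hypothetical (the real honker stream payloads are not specified here); the point is that internal fields such as row IDs and full attribute blobs stay behind the boundary, and the outgoing event carries a schema version.

```rust
/// Hypothetical domain event as the storage service might record it internally.
#[derive(Debug, Clone)]
pub enum DomainEvent {
    NodeAdded { graph_id: String, key: String, attributes_json: String, row_id: i64 },
    NodeRemoved { graph_id: String, key: String, row_id: i64 },
}

/// Hypothetical integration event: versioned and stripped of internals.
#[derive(Debug, Clone, PartialEq)]
pub struct IntegrationEvent {
    pub schema_version: u32,
    pub kind: &'static str,
    pub graph_id: String,
    pub key: String,
}

/// Project an internal event into the form other services are allowed to see.
pub fn project(event: &DomainEvent) -> IntegrationEvent {
    match event {
        DomainEvent::NodeAdded { graph_id, key, .. } => IntegrationEvent {
            schema_version: 1,
            kind: "node.added",
            graph_id: graph_id.clone(),
            key: key.clone(),
        },
        DomainEvent::NodeRemoved { graph_id, key, .. } => IntegrationEvent {
            schema_version: 1,
            kind: "node.removed",
            graph_id: graph_id.clone(),
            key: key.clone(),
        },
    }
}
```

Other services subscribe to the projected form only, which is what prevents the "leaky event store" anti-pattern described above.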
## Secrets and HD Key Derivation
### Key Categories
Different categories of secrets require different storage and derivation strategies:
| Category | Example | Derived from seed? | Storage |
|-----------|---------|-------------------|---------|
| **Identity keys** | Ed25519 keypair for alknet auth | Yes — SLIP-0010 `m/74'/0'/0'/0'` | Only derivation path in DB |
| **Encryption keys** | AES-256-GCM key for encrypted nodes | Yes — SLIP-0010 `m/74'/2'/0'/0'` | Only derivation path in DB |
| **External credentials** | OpenAI API key, OAuth token | No — third-party issued | Encrypted in DB with derived key |
| **On-chain identity** | Ethereum key for contract signing | Yes — SLIP-0010 `m/44'/60'/0'/0/0` | Only derivation path in DB |
| **Service registration** | NFT token ID, replicator endpoint | No — on-chain data | Plain in DB or on-chain |
### BIP39 Seed Phrase as Root of Trust
The master seed phrase (BIP39 mnemonic) is the single recovery mechanism for the entire system. From one seed phrase, all self-generated secrets can be derived on demand:
```rust
// Seed phrase → master seed (BIP39)
let mnemonic = Mnemonic::from_phrase(&phrase, Language::English)?;
let seed = mnemonic.to_seed(Some(&passphrase));
// Master seed → SLIP-0010 Ed25519 master key
let master_key = ExtendedPrivKey::new_master(Network::Alknet, &seed)?;
// Derive identity keypair
let identity_key = master_key.derive_path("m/74'/0'/0'/0'")?;
// Derive encryption key material (use first 32 bytes of derived key as AES-256 key)
let encryption_key = master_key.derive_path("m/74'/2'/0'/0'")?;
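// Derive Ethereum signing key (for smart contract interactions).
// SLIP-0010 defines only hardened children for Ed25519, and this path ends in
// non-hardened components, so it must come from a secp256k1 (BIP32) master
// built from the same seed rather than from `master_key` above.
// `Secp256k1ExtendedPrivKey` is an illustrative name, not a specific crate type.
let eth_master = Secp256k1ExtendedPrivKey::new_master(&seed)?;
let eth_key = eth_master.derive_path("m/44'/60'/0'/0/0")?;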
```
### External Credentials: Encryption with Derived Keys
For external credentials (API keys, OAuth tokens) that can't be derived, the existing `EncryptedDataSchema` pattern from `@alkdev/storage` applies — but the encryption key is itself derived from the seed:
1. The secret service derives an AES-256-GCM key via SLIP-0010 path `m/74'/2'/0'/0'`
2. External credentials are encrypted with this derived key using the existing encrypt/decrypt functions
3. The encrypted data is stored as a `SecretNode` in the metagraph
4. Only the derivation path and key version are stored in plain attributes
5. The seed phrase (or derived encryption key) is held only by the secret service — never in the database
### Secret Service
The secret service is an irpc service (see [services.md](services.md)) that:
- Holds the master seed phrase in memory (never persisted to disk in plain text)
- Derives keys on demand via SLIP-0010/BIP39
- Encrypts/decrypts external credentials using derived keys
- Is the **only** component that ever sees the master seed
Workers request derived keys through the secret service's irpc protocol. They never see the seed or the encryption key.
### Derivation Path Conventions
| Path | Purpose |
|------|---------|
| `m/74'/0'/0'/0'` | Primary Ed25519 identity keypair (alknet auth) |
| `m/74'/0'/0'/1'` | Secondary identity keypair (device key) |
| `m/74'/0'/1'/0'` | SSH host key (for server identity) |
| `m/74'/1'/0'/{site_hash}'` | Site-specific password derivation |
| `m/74'/2'/0'/0'` | AES-256-GCM encryption key (for external credentials) |
| `m/44'/60'/0'/0/0` | Ethereum signing key (for smart contract interactions) |
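As a working convention, alknet uses `74'` at the first (purpose) level of these paths. That level is not a BIP44 coin-type slot, so SLIP-0044 does not allocate it directly, and SLIP-0044 index 74 already appears to be assigned to another project; the number should be re-checked before it is registered or treated as reserved. The `0'`/`1'`/`2'` account levels divide identity, password, and encryption purposes.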
### Rust Crates Required
| Crate | Purpose |
|-------|---------|
| `bip39` | Mnemonic generation and seed derivation |
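| `ed25519-bip32` (IOHK) or `rust-bip32-ed25519` (BitBoxSwiss) | Ed25519 HD key derivation. Note that `ed25519-bip32` implements BIP32-Ed25519 (the Cardano/IOHK scheme), which yields different keys from SLIP-0010; if the paths above are meant to follow SLIP-0010, a SLIP-0010 implementation (possibly `ed25519-dalek-bip32`) should be evaluated instead |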
| `aes-gcm` | AES-256-GCM encryption for external credentials |
| `sha2` | SHA-256 for key hashing |
| `irpc` | Service protocol definitions |
## Design Decisions (mapped from TypeScript ADRs)
| Original ADR | Decision | Rust adaptation |
@@ -335,14 +440,21 @@ Replication mindset from the start: **every write is atomic with a notification*
| 047 | Honker event target | honker stream/notify as pub/sub mechanism |
| 049 | Identity schema restructuring | Separate credential tables, no Gitea columns |
| 050 | SHA-256 for API key hashing | Fast hash for high-entropy machine keys |
| 051 | BIP39/SLIP-0010 for HD key derivation | Seed phrase as root of trust for identity, encryption, and signing keys |
| 052 | Secrets as irpc service | Secret service holds seed, derives keys, encrypts/decrypts external creds |
| 053 | Event boundary discipline | Honker streams are domain events; call protocol is integration boundary |
## References
- `@alkdev/storage` — TypeScript metagraph, identity, ACL implementation
- `@alkdev/storage` — TypeScript metagraph, identity, ACL, encrypted data implementation
- `@alkdev/flowgraph` — TypeScript call-graph and operation-graph (maps to petgraph in Rust)
- `@alkdev/operations` — TypeScript OperationSpec, CallHandler, registry
- `/workspace/honker` — SQLite extension with pub/sub, streams, queues
- `/workspace/polyglot` — SQL transpiler (future: schema migration validation)
- `/workspace/petgraph` — Graph data structure library (used in alknet-flowgraph)
- `/workspace/jsonschema` — JSON Schema validation (Rust, replaces TypeBox at runtime)
- `/workspace/iroh/iroh-dns` — DNS resolver and endpoint info
- `/workspace/@alkdev/storage/docs/architecture/encrypted-data.md` — Original encrypted data design (TypeScript)
- `/workspace/research/event_sourcing/event_source_types.md` — Event-driven architecture patterns
- [services.md](services.md) — Service layer architecture (irpc protocols)
- [core.md](core.md) — Core overview, head/worker terminology