glm-5.1 d291a485f0 docs: refactor hub/spoke to head/worker, add service layer and HD key derivation

- Replace hub/spoke terminology with head/worker throughout all research docs
- Add irpc service layer architecture (AuthProtocol, SecretProtocol,
  ConfigProtocol, StorageProtocol)
- Add BIP39/SLIP-0010 HD key derivation for secrets management
- Add event boundary discipline (domain events vs integration events)
- Add application services layer (Docker, Node, Wallet, Proxy, Compute)
- New docs/research/services.md defining irpc service protocols
- Update core.md with service layer section and head/worker model
- Update configuration.md to delegate auth to AuthService (irpc)
- Update storage.md with secrets/key derivation and event boundaries
- Update flow.md with event boundary decision and cross-references

2026-06-06 15:33:35 +00:00

Alknet Storage: Metagraph, Identity, ACL, Secrets, and Honker Integration

Status: Research / Draft
Last updated: 2026-06-06

Overview

alknet-storage is a Rust crate providing SQLite-backed graph storage, identity management, access control, secrets management, and reactivity via honker. It mirrors the TypeScript @alkdev/storage package's design (sqlite-host.md, metagraph-module.md, acl.md) while leveraging Rust's type system and petgraph's performance.

Terminology

This document uses head/worker terminology instead of hub/spoke:

Head node: Coordinating node that can also be a worker
Worker node: Node that connects to a head and registers services
Node: Any participant in the network

Crate Decomposition

alknet-storage
├── metagraph/     — GraphType, NodeType, EdgeType definitions and persistence
├── identity/      — accounts, organizations, peer_credentials, api_keys, audit_logs
├── acl/           — PrincipalNode, DelegatesEdge, access control graph
├── secrets/       — HD key derivation (BIP39/SLIP-0010), encrypted data, secret service bridge
├── honker/        — honker integration: notify, stream, queue, event bridge
├── graph/         — GraphInstance, Node, Edge CRUD with schema validation
└── schema/        — JSON Schema definitions (serde + jsonschema for runtime validation)

Metagraph Data Model

The metagraph is a three-level type system (mirrors @alkdev/storage exactly):

GraphType — A class of graphs (e.g., "call-graph", "acl", "task-dependencies"). Defines structural constraints (directed/undirected/mixed, allows self-loops, multi-edges).
NodeType — A category of node within a graph type (e.g., "call", "account", "task"). Each node type has a JSON Schema that validates the attributes of nodes belonging to that type.
EdgeType — A category of edge within a graph type (e.g., "triggered", "can_read", "depends_on"). Each edge type has a JSON Schema for its attributes. Optionally constrains which source/target node types are valid.

Graph instances belong to a graph type and contain Nodes and Edges conforming to those type definitions.

Rust Types

pub struct GraphType {
    pub id: String,
    pub name: String,                          // "call-graph", "acl"
    pub description: String,
    pub config: GraphConfig,                   // directed/undirected/mixed, multi, self-loops
    pub version: u32,
    pub scope: Scope,                          // System, Tenant, User
    pub metadata: serde_json::Value,
}

pub struct GraphConfig {
    pub graph_type: GraphDirection,             // Directed, Undirected, Mixed
    pub multi: bool,
    pub allow_self_loops: bool,
}

pub enum Scope {
    System,
    Tenant,
    User,
}

pub struct NodeType {
    pub id: String,
    pub graph_type_id: String,
    pub name: String,                           // "call", "account"
    pub description: String,
    pub schema: serde_json::Value,             // JSON Schema for node attributes
}

pub struct EdgeType {
    pub id: String,
    pub graph_type_id: String,
    pub name: String,                           // "triggered", "can_read"
    pub description: String,
    pub schema: serde_json::Value,             // JSON Schema for edge attributes
    pub allowed_source_types: Vec<String>,      // [] = no restriction
    pub allowed_target_types: Vec<String>,
}

pub struct Graph {
    pub id: String,
    pub graph_type_id: String,
    pub name: String,
    pub description: String,
    pub status: GraphStatus,                    // Active, Archived, Draft
    pub owner_id: Option<String>,
    pub project_id: Option<String>,
    pub metadata: serde_json::Value,
}

pub enum GraphStatus {
    Active,
    Archived,
    Draft,
}

pub struct Node {
    pub id: String,
    pub graph_id: String,
    pub key: String,                            // Consumer-defined identity within the graph
    pub attributes: serde_json::Value,          // Validated by node type schema
    pub metadata: serde_json::Value,
}

pub struct Edge {
    pub id: String,
    pub graph_id: String,
    pub key: Option<String>,                    // None for anonymous edges
    pub source_node_key: String,
    pub target_node_key: String,
    pub attributes: serde_json::Value,          // Validated by edge type schema
    pub undirected: bool,
    pub metadata: serde_json::Value,
}
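The structural rules carried by GraphConfig and EdgeType can be sketched as one pre-insert check. This is a minimal illustration, not the crate's API: `EdgeRules`, `Endpoint`, `EdgeCheckError`, and `check_edge` are hypothetical names, and because nodes carry no type column (ADR-020) the caller is assumed to have already resolved each endpoint's node type name.

```rust
/// Trimmed-down stand-in for the GraphConfig / EdgeType fields used here.
pub struct EdgeRules<'a> {
    pub allow_self_loops: bool,
    pub allowed_source_types: &'a [&'a str], // empty = no restriction
    pub allowed_target_types: &'a [&'a str], // empty = no restriction
}

/// Endpoint as resolved by the caller: node key plus node type name.
pub struct Endpoint<'a> {
    pub key: &'a str,
    pub node_type: &'a str,
}

#[derive(Debug, PartialEq)]
pub enum EdgeCheckError {
    SelfLoop,
    SourceTypeNotAllowed,
    TargetTypeNotAllowed,
}

/// An empty allow-list means "any node type is accepted".
fn permitted(allowed: &[&str], node_type: &str) -> bool {
    allowed.is_empty() || allowed.iter().any(|a| *a == node_type)
}

pub fn check_edge(
    rules: &EdgeRules,
    source: &Endpoint,
    target: &Endpoint,
) -> Result<(), EdgeCheckError> {
    if !rules.allow_self_loops && source.key == target.key {
        return Err(EdgeCheckError::SelfLoop);
    }
    if !permitted(rules.allowed_source_types, source.node_type) {
        return Err(EdgeCheckError::SourceTypeNotAllowed);
    }
    if !permitted(rules.allowed_target_types, target.node_type) {
        return Err(EdgeCheckError::TargetTypeNotAllowed);
    }
    Ok(())
}
```

Attribute validation against the edge type's JSON Schema would run as a separate step after this structural check.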

SQLite Tables (mirrors `sqlite-host.md`)

Common columns on all tables: id TEXT PK, metadata TEXT JSON DEFAULT '{}', created_at INTEGER TIMESTAMP DEFAULT (strftime('%s','now')), updated_at INTEGER TIMESTAMP DEFAULT (strftime('%s','now')).

graph_types: id, name TEXT UNIQUE, description TEXT DEFAULT '', config TEXT JSON NOT NULL, version INTEGER NOT NULL DEFAULT 1, scope TEXT NOT NULL DEFAULT 'system'

node_types: id, graph_type_id TEXT FK → graph_types.id CASCADE, name TEXT NOT NULL, description TEXT DEFAULT '', schema TEXT JSON NOT NULL. Unique constraint: (graph_type_id, name).

edge_types: id, graph_type_id TEXT FK → graph_types.id CASCADE, name TEXT NOT NULL, description TEXT DEFAULT '', schema TEXT JSON NOT NULL, allowed_source_types TEXT JSON DEFAULT '[]', allowed_target_types TEXT JSON DEFAULT '[]'. Unique constraint: (graph_type_id, name).

graphs: id, graph_type_id TEXT FK → graph_types.id SET NULL, name TEXT NOT NULL, description TEXT DEFAULT '', status TEXT NOT NULL DEFAULT 'draft', owner_id TEXT, project_id TEXT. Indexes on (owner_id), (project_id), (owner_id, project_id).

nodes: id, graph_id TEXT FK → graphs.id CASCADE, key TEXT NOT NULL, attributes TEXT JSON NOT NULL DEFAULT '{}'. Unique constraint: (graph_id, key). No node_type_id column (ADR-020).

edges: id, graph_id TEXT FK → graphs.id CASCADE, key TEXT, source_node_key TEXT NOT NULL, target_node_key TEXT NOT NULL, attributes TEXT JSON NOT NULL DEFAULT '{}', undirected INTEGER DEFAULT 0. Unique constraint: (graph_id, key). FK: source_node_key, target_node_key reference (nodes.graph_id, nodes.key) with CASCADE delete (ADR-022).
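The composite foreign keys on edges (ADR-022) may be easier to follow as DDL. The constant below is one possible rendering of the nodes and edges definitions above, held as a Rust string for use by a migration step; the names `NODES_EDGES_DDL` and `ddl_statements` are hypothetical, and the exact DDL in sqlite-host.md may differ. SQLite only enforces foreign keys when `PRAGMA foreign_keys = ON` has been set on the connection.

```rust
/// Illustrative DDL for the `nodes` and `edges` tables described above.
pub const NODES_EDGES_DDL: &str = r#"
PRAGMA foreign_keys = ON;

CREATE TABLE IF NOT EXISTS nodes (
    id         TEXT PRIMARY KEY,
    graph_id   TEXT REFERENCES graphs(id) ON DELETE CASCADE,
    key        TEXT NOT NULL,
    attributes TEXT NOT NULL DEFAULT '{}',
    metadata   TEXT DEFAULT '{}',
    created_at INTEGER DEFAULT (strftime('%s','now')),
    updated_at INTEGER DEFAULT (strftime('%s','now')),
    UNIQUE (graph_id, key)
);

CREATE TABLE IF NOT EXISTS edges (
    id              TEXT PRIMARY KEY,
    graph_id        TEXT REFERENCES graphs(id) ON DELETE CASCADE,
    key             TEXT,
    source_node_key TEXT NOT NULL,
    target_node_key TEXT NOT NULL,
    attributes      TEXT NOT NULL DEFAULT '{}',
    undirected      INTEGER DEFAULT 0,
    metadata        TEXT DEFAULT '{}',
    created_at      INTEGER DEFAULT (strftime('%s','now')),
    updated_at      INTEGER DEFAULT (strftime('%s','now')),
    UNIQUE (graph_id, key),
    FOREIGN KEY (graph_id, source_node_key)
        REFERENCES nodes (graph_id, key) ON DELETE CASCADE,
    FOREIGN KEY (graph_id, target_node_key)
        REFERENCES nodes (graph_id, key) ON DELETE CASCADE
);
"#;

/// Split the script into individual statements for one-by-one execution.
pub fn ddl_statements() -> Vec<&'static str> {
    NODES_EDGES_DDL
        .split(';')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .collect()
}
```

The composite keys point at the `UNIQUE (graph_id, key)` constraint on nodes, so deleting a node removes every edge that names it as source or target within the same graph.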

System DB vs Tenant DB (ADR-040)

System DB (system.db): Identity tables (accounts, organizations, peer_credentials, api_keys, audit_logs) + system-scoped graph types.
Tenant DB (tenant-{orgId}.db): Metagraph tables (graph_types, node_types, edge_types, graphs, nodes, edges) + tenant-scoped graph types.

No FK constraints across database files. Consumer enforces referential integrity at application layer.
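A small sketch of how the two file names could be resolved. `DbTarget` and `db_path` are hypothetical names, and rejecting organization ids that contain anything other than ASCII letters, digits, `-`, or `_` is an assumption added here to keep ids safe to embed in a file name.

```rust
use std::path::{Path, PathBuf};

/// Which database file a piece of data lives in (ADR-040).
pub enum DbTarget<'a> {
    System,
    Tenant { org_id: &'a str },
}

/// Resolve the on-disk path for a target, or None if the org id is unsafe.
pub fn db_path(base_dir: &Path, target: &DbTarget) -> Option<PathBuf> {
    match target {
        DbTarget::System => Some(base_dir.join("system.db")),
        DbTarget::Tenant { org_id } => {
            let safe = !org_id.is_empty()
                && org_id
                    .chars()
                    .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
            if safe {
                Some(base_dir.join(format!("tenant-{}.db", org_id)))
            } else {
                None
            }
        }
    }
}
```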

Identity Tables

Mirrors sqlite-host.md identity tables with the same column definitions and FK cascades:

accounts: email TEXT UNIQUE NOT NULL, display_name TEXT, access_level TEXT NOT NULL DEFAULT 'user' (admin/user/service), status TEXT NOT NULL DEFAULT 'active' (active/suspended/deactivated).

organizations: name TEXT UNIQUE NOT NULL, slug TEXT UNIQUE NOT NULL, owner_id TEXT FK → accounts.id RESTRICT.

organization_members: org_id TEXT FK → organizations.id CASCADE, account_id TEXT FK → accounts.id CASCADE, membership_level TEXT NOT NULL (owner/admin/member). Unique constraint: (org_id, account_id).

api_keys: owner_id TEXT FK → accounts.id CASCADE, key_hash TEXT UNIQUE NOT NULL, name TEXT, enabled INTEGER NOT NULL DEFAULT 1, expires_at INTEGER TIMESTAMP, revoked_at INTEGER TIMESTAMP, rotated_to_id TEXT, last_used_at INTEGER TIMESTAMP.

peer_credentials: owner_id TEXT FK → accounts.id CASCADE, credential_type TEXT NOT NULL (ssh_key/cert_authority), fingerprint TEXT UNIQUE NOT NULL, public_key_data TEXT NOT NULL, name TEXT, enabled INTEGER NOT NULL DEFAULT 1, expires_at INTEGER TIMESTAMP, revoked_at INTEGER TIMESTAMP.

audit_logs: action TEXT NOT NULL, owner_id TEXT FK → accounts.id RESTRICT, credential_id TEXT, credential_type TEXT, org_id TEXT FK → organizations.id SET NULL, details TEXT JSON.
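The `enabled`, `expires_at`, and `revoked_at` columns shared by api_keys and peer_credentials imply a usability check along these lines. This is a sketch under assumptions: `CredentialState` and `is_usable` are hypothetical names, timestamps are Unix seconds as in the column defaults, and treating the expiry instant itself as already expired is a choice made here, not something the schema dictates.

```rust
/// Subset of the credential columns that decide whether it may be used.
pub struct CredentialState {
    pub enabled: bool,
    pub expires_at: Option<i64>, // Unix seconds; None = never expires
    pub revoked_at: Option<i64>, // Unix seconds; None = not revoked
}

/// A credential is usable when it is enabled, not yet expired, and not revoked.
pub fn is_usable(cred: &CredentialState, now: i64) -> bool {
    let expired = cred.expires_at.map_or(false, |t| now >= t);
    let revoked = cred.revoked_at.map_or(false, |t| now >= t);
    cred.enabled && !expired && !revoked
}
```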

Access Control (ACL) as Metagraph

Mirrors @alkdev/storage acl.md:

AclGraph Module

// Graph config: directed, multi=false, allowSelfLoops=false
pub const ACL_GRAPH_CONFIG: GraphConfig = GraphConfig {
    graph_type: GraphDirection::Directed,
    multi: false,
    allow_self_loops: false,
};

// Node types
pub const PRINCIPAL_NODE: &str = "principal";
pub const RESOURCE_NODE: &str = "resource";

// Edge types
pub const CAN_READ_EDGE: &str = "can_read";
pub const CAN_WRITE_EDGE: &str = "can_write";
pub const CAN_EXECUTE_EDGE: &str = "can_execute";
pub const BELONGS_TO_EDGE: &str = "belongs_to";
pub const DELEGATES_EDGE: &str = "delegates";

// PrincipalNode attributes
pub struct PrincipalNodeAttrs {
    pub identity_type: IdentityType,    // Account, Org, Service, Role
    pub identity_id: String,            // FK to accounts.id or organizations.id
    pub scopes: Vec<String>,
    pub resources: Option<HashMap<String, Vec<String>>>,
}

pub enum IdentityType {
    Account,
    Org,
    Service,
    Role,
}

// DelegatesEdge attributes
pub struct DelegatesEdgeAttrs {
    pub narrowed_scopes: Vec<String>,     // Subset of delegator's scopes
    pub narrowable: bool,                  // Can the delegate further narrow?
}

Principal-Agent Hierarchy

Account nodes represent individual users
Org nodes represent organizations
Service nodes represent automated agents (LLM workers, node credentials)
Role nodes represent named permission sets

Delegation edges (delegates) carry narrowed_scopes — the delegate can only exercise scopes that are a subset of the delegator's. Liability flows upward; permissions flow downward with narrowing.
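Scope narrowing along a delegation chain amounts to repeated set intersection. The sketch below assumes scopes are plain strings and that each hop contributes its `narrowed_scopes` list; `effective_scopes` is a hypothetical helper, not part of the ACL module.

```rust
use std::collections::BTreeSet;

/// Compute the scopes available at the end of a delegation chain.
/// `root_scopes` belong to the original delegator; each entry in `chain`
/// is the `narrowed_scopes` list of one `delegates` edge, in order.
pub fn effective_scopes(root_scopes: &[&str], chain: &[Vec<&str>]) -> BTreeSet<String> {
    let mut current: BTreeSet<String> =
        root_scopes.iter().map(|s| s.to_string()).collect();
    for narrowed in chain {
        // Keep only scopes the delegator holds AND the edge passes on.
        current.retain(|s| narrowed.iter().any(|n| *n == s.as_str()));
    }
    current
}
```

A scope listed on an edge but absent from the delegator's own set is dropped, so a delegate can never end up with more than its delegator holds.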

BelongsToEdge (Derived from org_members)

ADR-045: The organization_members SQL table is the authoritative source. When membership changes, the consumer writes the SQL row first, then creates or removes the ACL belongs_to edge. The edge is derived, not the source of truth.
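Because the edge is derived, the belongs_to edges can always be recomputed from the membership rows. One way to sketch that reconciliation, assuming both sides are reduced to (org_id, account_id) pairs; `Member`, `EdgePlan`, and `plan_belongs_to` are hypothetical names.

```rust
use std::collections::BTreeSet;

/// (org_id, account_id) pair, used for both SQL rows and derived edges.
pub type Member = (String, String);

/// Edge changes needed to bring the ACL graph in line with the SQL table.
#[derive(Debug, PartialEq)]
pub struct EdgePlan {
    pub to_add: Vec<Member>,
    pub to_remove: Vec<Member>,
}

/// The SQL rows are authoritative: add edges for rows without one and
/// remove edges that no longer have a backing row.
pub fn plan_belongs_to(rows: &BTreeSet<Member>, edges: &BTreeSet<Member>) -> EdgePlan {
    EdgePlan {
        to_add: rows.difference(edges).cloned().collect(),
        to_remove: edges.difference(rows).cloned().collect(),
    }
}
```

A periodic pass like this would also repair the graph if a process stopped between writing the SQL row and updating the edge.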

Operation-Level ACL

OperationSpec.access_control maps to ACL graph traversal at runtime:

pub fn check_access(
    acl_graph: &Graph,
    principal_key: &str,
    operation_spec: &OperationSpec,
) -> bool {
    // Traverse from PrincipalNode to ResourceNode
    // Check if any path satisfies required_scopes (AND) and required_scopes_any (OR)
    // Honor delegation chains with scope narrowing
    todo!("ACL graph traversal is not yet specified in this draft")
}
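The scope test applied to each candidate path can be sketched separately from the traversal. The field names `required_scopes` and `required_scopes_any` come from the comment above; the `AccessRequirement` struct, the `scopes_satisfy` function, and the rule that an empty `required_scopes_any` imposes no constraint are assumptions for illustration.

```rust
use std::collections::HashSet;

/// Scope requirements taken from an operation's access control section.
pub struct AccessRequirement<'a> {
    pub required_scopes: &'a [&'a str],     // every one must be held (AND)
    pub required_scopes_any: &'a [&'a str], // at least one must be held (OR)
}

/// `held` is the effective scope set reached along one path in the ACL graph.
pub fn scopes_satisfy(held: &HashSet<&str>, req: &AccessRequirement) -> bool {
    let all_ok = req.required_scopes.iter().all(|s| held.contains(*s));
    // An empty OR-list is treated as "no OR constraint".
    let any_ok = req.required_scopes_any.is_empty()
        || req.required_scopes_any.iter().any(|s| held.contains(*s));
    all_ok && any_ok
}
```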

Honker Integration

Reactivity Pattern (ADR-047)

Every mutation is atomic with a notification:

// Insert a node and notify in one transaction
tx.execute(
    "INSERT INTO nodes (id, graph_id, key, attributes) VALUES (?, ?, ?, ?)",
    &[&node_id, &graph_id, &key, &attrs_json],
)?;
tx.stream_publish("nodes:created", &node_attrs_json)?;

This mirrors the TypeScript pattern from sqlite-host.md but in Rust, using honker's SQLite extension functions:

use honker::Database;

let db = Database::open("tenant.db")?;

// Transactional: business write + event stream publish commit together
let mut tx = db.transaction()?;
tx.execute("INSERT INTO nodes (id, graph_id, key, attributes) VALUES (?, ?, ?, ?)", ...)?;
tx.stream_publish("nodes:created", &attrs)?;
tx.commit()?;

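// Subscribe to changes. Assumes `subscribe` returns an async stream of events
// and that `futures::StreamExt` is in scope for `.next()`.
let stream = db.stream("nodes:created");
let mut events = stream.subscribe("alknet-node-watcher");
while let Some(event) = events.next().await {
    // event is a serde_json::Value
}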

Honker Features Used

Feature	Use case
`stream_publish` / `subscribe`	Durable pub/sub for node/edge/membership changes with per-consumer offsets
`notify` / `listen`	Ephemeral pub/sub for real-time control channel events
`queue` / `claim` / `ack`	Task queue for async operations (key rotation, ACL evaluation)
`scheduler`	Periodic tasks (session cleanup, audit log pruning)
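To make "durable pub/sub with per-consumer offsets" concrete, here is an in-memory model of the semantics. It is not honker's implementation or API; `DurableStream`, `publish`, `poll`, and `ack` are hypothetical names used only to show that each named consumer advances through the same append-only log independently.

```rust
use std::collections::HashMap;

/// In-memory model of a durable stream with per-consumer read offsets.
#[derive(Default)]
pub struct DurableStream {
    events: Vec<String>,             // append-only log; index = offset
    offsets: HashMap<String, usize>, // consumer name -> next offset to read
}

impl DurableStream {
    pub fn publish(&mut self, payload: &str) {
        self.events.push(payload.to_string());
    }

    /// Events the consumer has not acknowledged yet; does not advance the offset.
    pub fn poll(&self, consumer: &str) -> &[String] {
        let start = self.offsets.get(consumer).copied().unwrap_or(0);
        &self.events[start..]
    }

    /// Mark `count` events as processed for this consumer only.
    pub fn ack(&mut self, consumer: &str, count: usize) {
        let len = self.events.len();
        let next = self.offsets.entry(consumer.to_string()).or_insert(0);
        *next = (*next + count).min(len);
    }
}
```

A consumer that restarts resumes from its stored offset, which is what separates the stream from the ephemeral `notify` / `listen` channel.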

Database Concurrency

WAL mode (default) for concurrent reads during writes
Single writer per .db file
busy_timeout=5000 default
PRAGMA data_version polling for cross-process wake (honker pattern)
max_readers=4 concurrent read connections in the reader pool
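The cross-process wake relies on `PRAGMA data_version` returning a different value on a connection once some other connection has committed a change. A sketch of the comparison step only, with the actual query left to the caller; `DataVersionWatcher` is a hypothetical name.

```rust
/// Tracks the last `PRAGMA data_version` value seen on one connection.
#[derive(Default)]
pub struct DataVersionWatcher {
    last: Option<i64>,
}

impl DataVersionWatcher {
    /// Feed the latest value read from the connection. Returns true when it
    /// differs from the previous observation, meaning another connection
    /// committed in between. The first observation only sets the baseline.
    pub fn changed(&mut self, current: i64) -> bool {
        let changed = matches!(self.last, Some(prev) if prev != current);
        self.last = Some(current);
        changed
    }
}
```

A poll loop would read the pragma at a short interval and wake stream subscribers whenever `changed` returns true.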

JSON Schema Validation

TypeBox from TypeScript maps to serde_json::Value + jsonschema in Rust:

TypeScript	Rust
`Type.Object({...})`	`serde_json::json!({...})` as JSON Schema
`Value.Check(schema, data)`	`jsonschema::validate(&schema, &data)`
`Type.Module({...})`	JSON Schema with `$defs` stored in DB
`Type.Composite([A, B])`	Merge + intersect via `serde_json` merge logic

The jsonschema crate provides runtime validation analogous to TypeBox's Value.Check(). Schema definitions are stored as serde_json::Value in the schema column of node_types and edge_types tables.

Crate Dependency Map

[dependencies]
honker = "0.x"                    # SQLite extension with pub/sub/queue
serde = { version = "1", features = ["derive"] }
serde_json = "1"
jsonschema = "0.x"                # JSON Schema validation (runtime)
petgraph = "0.x"                  # Graph data structure (shared with alknet-flowgraph)
rusqlite = { version = "0.x", features = ["bundled"] }  # SQLite access (via honker)
uuid = { version = "1", features = ["v4"] }
chrono = "0.x"
thiserror = "1"
tokio = { version = "1", features = ["full"] }

Multi-Tenant Replication Path

For the private use case: single .db files, honker for reactivity, no cross-database FK constraints.

For the distributed use case (later):

Smart contracts (Base L2) own namespace identity → owner_id column on the graphs table
alknet-relay gossips namespace availability via iroh-gossip or call protocol subscriptions
ACL inference — Contract collaborators → ACL graph DelegatesEdge entries
Honker streams — stream_subscribe("nodes:modified") carries mutations to relay subscribers

Replication mindset from the start: every write is atomic with a notification. The honker stream event is the replication unit. A future replicator reads _honker_stream_* tables and propagates changes to subscribed relays.

Event Boundary Discipline

Following event_source_types.md, honker streams serve different roles in different contexts. Preventing conflation is critical:

Event Type	Source	Consumer	Boundary
Domain events (Event Sourcing)	Service that owns the data	Same service, for state reconstruction	Internal — never published directly to other services
Integration events (State Transfer)	Projected from domain events	Other services/nodes, for cache updates	Cross-service — simple, versioned, stripped of internals
Notifications (Thin Events)	Service that owns the data	Any subscriber, for triggering workflows	Cross-node — just entity ID + action, consumer fetches details

Conflation anti-patterns to avoid:

Leaky event store: Don't let other services read honker stream events directly to drive business logic. Project domain events into integration events first.
Boomerang coupling: If a consumer of an integration event must call back to the source service synchronously, the event payload is too thin. Upgrade to a fat event.
Fat notification trap: If a notification event carries the full entity state, use state transfer instead.

The call protocol's EventEnvelope is the integration boundary between nodes. Domain events in honker streams stay within the service that owns them.
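The three event roles can be sketched as distinct types with explicit projection functions between them. All names here (`DomainEvent`, `IntegrationEvent`, `Notification`, `to_integration`, `to_notification`) and the `internal_rev` field are hypothetical; the point is only that the internal event never crosses the boundary unprojected.

```rust
/// Domain event: full detail, stays inside the owning service.
pub enum DomainEvent {
    NodeCreated { node_id: String, graph_id: String, attributes_json: String, internal_rev: u64 },
    NodeDeleted { node_id: String, graph_id: String, internal_rev: u64 },
}

/// Integration event: versioned and stripped of internals.
#[derive(Debug, PartialEq)]
pub struct IntegrationEvent {
    pub schema_version: u32,
    pub kind: &'static str,
    pub node_id: String,
    pub graph_id: String,
    pub attributes_json: Option<String>,
}

/// Thin notification: entity id plus action; the consumer fetches details.
#[derive(Debug, PartialEq)]
pub struct Notification {
    pub entity_id: String,
    pub action: &'static str,
}

pub fn to_integration(event: &DomainEvent) -> IntegrationEvent {
    match event {
        DomainEvent::NodeCreated { node_id, graph_id, attributes_json, .. } => IntegrationEvent {
            schema_version: 1,
            kind: "node.created",
            node_id: node_id.clone(),
            graph_id: graph_id.clone(),
            attributes_json: Some(attributes_json.clone()),
        },
        DomainEvent::NodeDeleted { node_id, graph_id, .. } => IntegrationEvent {
            schema_version: 1,
            kind: "node.deleted",
            node_id: node_id.clone(),
            graph_id: graph_id.clone(),
            attributes_json: None,
        },
    }
}

pub fn to_notification(event: &DomainEvent) -> Notification {
    match event {
        DomainEvent::NodeCreated { node_id, .. } => {
            Notification { entity_id: node_id.clone(), action: "created" }
        }
        DomainEvent::NodeDeleted { node_id, .. } => {
            Notification { entity_id: node_id.clone(), action: "deleted" }
        }
    }
}
```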

Secrets and HD Key Derivation

Key Categories

Different categories of secrets require different storage and derivation strategies:

Category	Example	Derived from seed?	Storage
Identity keys	Ed25519 keypair for alknet auth	Yes, SLIP-0010 (Ed25519) `m/74'/0'/0'/0'`	Only derivation path in DB
Encryption keys	AES-256-GCM key for encrypted nodes	Yes, SLIP-0010 (Ed25519) `m/74'/2'/0'/0'`	Only derivation path in DB
External credentials	OpenAI API key, OAuth token	No, third-party issued	Encrypted in DB with derived key
On-chain identity	Ethereum key for contract signing	Yes, BIP32/BIP44 (secp256k1) `m/44'/60'/0'/0/0`	Only derivation path in DB
Service registration	NFT token ID, replicator endpoint	No, on-chain data	Plain in DB or on-chain

BIP39 Seed Phrase as Root of Trust

The master seed phrase (BIP39 mnemonic) is the single recovery mechanism for the entire system. From one seed phrase, all self-generated secrets can be derived on demand:

// Seed phrase → master seed (BIP39)
let mnemonic = Mnemonic::from_phrase(&phrase, Language::English)?;
let seed = mnemonic.to_seed(Some(&passphrase));

// Master seed → SLIP-0010 Ed25519 master key
let master_key = ExtendedPrivKey::new_master(Network::Alknet, &seed)?;

// Derive identity keypair
let identity_key = master_key.derive_path("m/74'/0'/0'/0'")?;

// Derive encryption key material (use first 32 bytes of derived key as AES-256 key)
let encryption_key = master_key.derive_path("m/74'/2'/0'/0'")?;

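// Derive Ethereum signing key (for smart contract interactions).
// Ethereum uses secp256k1 and this path has non-hardened components, so the key
// must come from a secp256k1 (BIP32) master key built from the same seed, not from
// the Ed25519 master key above. `Secp256k1ExtendedPrivKey` is a placeholder name.
let eth_master_key = Secp256k1ExtendedPrivKey::new_master(&seed)?;
let eth_key = eth_master_key.derive_path("m/44'/60'/0'/0/0")?;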

External Credentials: Encryption with Derived Keys

For external credentials (API keys, OAuth tokens) that can't be derived, the existing EncryptedDataSchema pattern from @alkdev/storage applies — but the encryption key is itself derived from the seed:

The secret service derives an AES-256-GCM key via SLIP-0010 path m/74'/2'/0'/0'
External credentials are encrypted with this derived key using the existing encrypt/decrypt functions
The encrypted data is stored as a SecretNode in the metagraph
Only the derivation path and key version are stored in plain attributes
The seed phrase (or derived encryption key) is held only by the secret service — never in the database

Secret Service

The secret service is an irpc service (see services.md) that:

Holds the master seed phrase in memory (never persisted to disk in plain text)
Derives keys on demand via SLIP-0010/BIP39
Encrypts/decrypts external credentials using derived keys
Is the only component that ever sees the master seed

Workers request derived keys through the secret service's irpc protocol. They never see the seed or the encryption key.

Derivation Path Conventions

Path	Purpose
`m/74'/0'/0'/0'`	Primary Ed25519 identity keypair (alknet auth)
`m/74'/0'/0'/1'`	Secondary identity keypair (device key)
`m/74'/0'/1'/0'`	SSH host key (for server identity)
`m/74'/1'/0'/{site_hash}'`	Site-specific password derivation
`m/74'/2'/0'/0'`	AES-256-GCM encryption key (for external credentials)
`m/44'/60'/0'/0/0`	Ethereum signing key (for smart contract interactions)

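The 74' value sits at the first (purpose) level of these paths rather than at the SLIP-0044 coin-type level, so whether it is free for alknet should be checked against the SLIP-0044 and BIP-43 registries before it is fixed. The second-level values 0'/1'/2' separate identity, password, and encryption purposes. The Ethereum path follows BIP44 on secp256k1 and is the only entry with non-hardened components; SLIP-0010 derivation for Ed25519 supports hardened indexes only.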
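A derivation path in the notation above can be parsed into 32-bit child indexes, with the hardened marker `'` setting the top bit. The sketch below is standalone and does not depend on any HD-wallet crate; `parse_path`, `is_fully_hardened`, and `PathError` are hypothetical names. The hardened check matters because Ed25519 derivation under SLIP-0010 accepts hardened indexes only.

```rust
/// Top bit marks a hardened child index (BIP32 / SLIP-0010 convention).
pub const HARDENED: u32 = 0x8000_0000;

#[derive(Debug, PartialEq)]
pub enum PathError {
    MissingRoot,
    BadIndex(String),
}

/// Parse a path such as "m/74'/0'/0'/0'" into raw child indexes.
pub fn parse_path(path: &str) -> Result<Vec<u32>, PathError> {
    let mut parts = path.split('/');
    if parts.next() != Some("m") {
        return Err(PathError::MissingRoot);
    }
    parts
        .map(|p| -> Result<u32, PathError> {
            let (digits, hardened) = match p.strip_suffix('\'') {
                Some(d) => (d, true),
                None => (p, false),
            };
            let n: u32 = digits
                .parse()
                .map_err(|_| PathError::BadIndex(p.to_string()))?;
            if n >= HARDENED {
                return Err(PathError::BadIndex(p.to_string()));
            }
            Ok(if hardened { n | HARDENED } else { n })
        })
        .collect()
}

/// True when every level is hardened, as Ed25519 derivation requires.
pub fn is_fully_hardened(indexes: &[u32]) -> bool {
    indexes.iter().all(|i| (*i & HARDENED) != 0)
}
```

The templated `{site_hash}'` entry is not a literal path; the caller would substitute a numeric index below 2^31 before parsing.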

Rust Crates Required

Crate	Purpose
`bip39`	Mnemonic generation and seed derivation
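`ed25519-bip32` (IOHK) or `rust-bip32-ed25519` (BitBoxSwiss)	Candidate crates for Ed25519 HD key derivation; both names reference BIP32-Ed25519, so SLIP-0010 compatibility needs to be confirmed before adoption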
`aes-gcm`	AES-256-GCM encryption for external credentials
`sha2`	SHA-256 for key hashing
`irpc`	Service protocol definitions

Design Decisions (mapped from TypeScript ADRs)

Original ADR	Decision	Rust adaptation
002	Metagraph over domain tables	Same 6-table schema, same graph type/node type/edge type model
008	Common columns pattern	`id`, `metadata`, `created_at`, `updated_at` on all tables
019	JSON text for schema columns	`serde_json::Value` stored as TEXT in SQLite
020	No nodeTypeId on nodes	Node type enforced at application layer
022	Composite FKs for node refs	`source_node_key` + `target_node_key` with cascade
034	ACL as metagraph	AclGraph is a metagraph instance
038	SQLite-first, PG removed	SQLite only via honker
040	System DB + tenant DB	Two `.db` files
041	Identity tables in storage	Same tables, same constraints
045	org_members authoritative	SQL table is source of truth, BelongsToEdge is derived
047	Honker event target	honker stream/notify as pub/sub mechanism
049	Identity schema restructuring	Separate credential tables, no Gitea columns
050	SHA-256 for API key hashing	Fast hash for high-entropy machine keys
051	BIP39/SLIP-0010 for HD key derivation	Seed phrase as root of trust for identity, encryption, and signing keys
052	Secrets as irpc service	Secret service holds seed, derives keys, encrypts/decrypts external creds
053	Event boundary discipline	Honker streams are domain events; call protocol is integration boundary

References

@alkdev/storage — TypeScript metagraph, identity, ACL, encrypted data implementation
@alkdev/flowgraph — TypeScript call-graph and operation-graph (maps to petgraph in Rust)
@alkdev/operations — TypeScript OperationSpec, CallHandler, registry
/workspace/honker — SQLite extension with pub/sub, streams, queues
/workspace/polyglot — SQL transpiler (future: schema migration validation)
/workspace/petgraph — Graph data structure library (used in alknet-flowgraph)
/workspace/jsonschema — JSON Schema validation (Rust, replaces TypeBox at runtime)
/workspace/iroh/iroh-dns — DNS resolver and endpoint info
/workspace/@alkdev/storage/docs/architecture/encrypted-data.md — Original encrypted data design (TypeScript)
/workspace/research/event_sourcing/event_source_types.md — Event-driven architecture patterns
services.md — Service layer architecture (irpc protocols)
core.md — Core overview, head/worker terminology
