docs: write Phase 0 architecture foundation — ADRs 026-034, spec docs, and task updates

Phase 0a — ADRs (9 new):
- ADR-026: Transport/interface separation (three-layer model)
- ADR-027: Crate decomposition (core, secret, storage, flowgraph, napi, CLI)
- ADR-028: Auth as irpc service (AuthProtocol behind feature flag)
- ADR-029: Identity as core type (Identity + IdentityProvider in alknet-core)
- ADR-030: Static/dynamic config split (ArcSwap, ConfigReloadHandle)
- ADR-031: Forwarding policy (rule-based allow/deny, TransportKind-aware)
- ADR-032: Event boundary discipline (domain, irpc, call protocol boundaries)
- ADR-033: OperationEnv universal composition (three dispatch paths)
- ADR-034: Head/worker terminology (replace hub/spoke)

Phase 0b — New spec documents (7):
- identity.md, services.md, interface.md, configuration.md, storage.md,
  flowgraph.md, secret-service.md

Updated existing docs:
- auth.md: reference identity.md for canonical definitions, add AuthProtocol
- open-questions.md: resolve OQ-12, OQ-16, OQ-18, OQ-22, OQ-23-25
- README.md: add all new docs, ADRs 026-034

Marked 19 architecture tasks as completed.
@@ -1,16 +1,18 @@
---
status: draft
last_updated: 2026-06-04
last_updated: 2026-06-07
---

# Alknet Architecture

## Current State

Architecture specification in active development. 22 ADRs accepted. Unified
auth and call protocol architecture being specified — see [auth.md](auth.md)
and [call-protocol.md](call-protocol.md). Configuration architecture under
exploration — see [research/configuration.md](../research/configuration.md).
Architecture specification in active development. Phase 0 foundation ADRs
completed (026–034). New spec documents created for identity, services,
interface, configuration, storage, flowgraph, and secret service. Existing
specs updated for the three-layer model, crate decomposition, and unified
identity. See [open-questions.md](open-questions.md) for remaining open
questions.

## Architecture Documents

@@ -24,12 +26,24 @@ exploration — see [research/configuration.md](../research/configuration.md).
| [server.md](server.md) | reviewed | Server acceptance, channel handling, proxy |
| [tun-shim.md](tun-shim.md) | deprecated | TUN interface wrapper — **deferred**, use tun2proxy |
| [napi-and-pubsub.md](napi-and-pubsub.md) | reviewed | NAPI wrapper and pubsub event target adapter |
| [identity.md](identity.md) | draft | Identity type, IdentityProvider trait, auth flows |
| [services.md](services.md) | draft | irpc service layer, OperationEnv, three dispatch paths |
| [interface.md](interface.md) | draft | Layer 2: Interface trait, SshInterface, RawFramingInterface |
| [configuration.md](configuration.md) | draft | StaticConfig, DynamicConfig, forwarding policy, reload |
| [storage.md](storage.md) | draft | alknet-storage: metagraph, identity, ACL, honker |
| [flowgraph.md](flowgraph.md) | draft | alknet-flowgraph: call graph, operation graph, petgraph |
| [secret-service.md](secret-service.md) | draft | alknet-secret: BIP39, SLIP-0010, AES-GCM, SecretProtocol |

## Research Documents

| Document | Status | Description |
|----------|--------|-------------|
| [configuration.md](../research/configuration.md) | draft | Configuration architecture: static/dynamic split, hot reload, forwarding policy |
| [configuration.md](../research/configuration.md) | draft | Configuration architecture (source for promoted spec) |
| [core.md](../research/core.md) | draft | Core overview, transport, call protocol, DNS |
| [services.md](../research/services.md) | draft | irpc service protocols, OperationContext, application services |
| [storage.md](../research/storage.md) | draft | Metagraph, identity, ACL, secrets, honker |
| [flow.md](../research/flow.md) | draft | FlowGraph, operation graph, call graph, petgraph mapping |
| [integration-plan.md](../research/integration-plan.md) | draft | Phased integration plan for services, pubsub, and operations |

## ADR Table

@@ -57,12 +71,24 @@ exploration — see [research/configuration.md](../research/configuration.md).
| [023](decisions/023-unified-auth-shared-key-material.md) | Unified auth with shared key material + token auth | Accepted |
| [024](decisions/024-bidirectional-call-protocol.md) | Bidirectional call protocol (EventEnvelope) | Accepted |
| [025](decisions/025-handler-spec-separation.md) | Handler/spec separation for downstream service registration | Accepted |
| [026](decisions/026-transport-interface-separation.md) | Transport/interface separation (three-layer model) | Accepted |
| [027](decisions/027-crate-decomposition.md) | Crate decomposition (core, secret, storage, flowgraph) | Accepted |
| [028](decisions/028-auth-irpc-service.md) | Auth as irpc service behind feature flag | Accepted |
| [029](decisions/029-identity-core-type.md) | Identity as core type in alknet-core | Accepted |
| [030](decisions/030-static-dynamic-config-split.md) | Static/dynamic config split with ArcSwap | Accepted |
| [031](decisions/031-forwarding-policy.md) | Forwarding policy with rule-based allow/deny | Accepted |
| [032](decisions/032-event-boundary-discipline.md) | Event boundary discipline (domain, irpc, call protocol) | Accepted |
| [033](decisions/033-operationenv-irpc-call-protocol.md) | OperationEnv as universal composition mechanism | Accepted |
| [034](decisions/034-head-worker-terminology.md) | Head/worker terminology replacing hub/spoke | Accepted |

## Open Questions

Most open questions have been resolved. Open questions remain for
configuration, auth, and call protocol — see
[open-questions.md](open-questions.md) for details.
See [open-questions.md](open-questions.md) for all open and resolved questions.
Key resolved questions from Phase 0: OQ-12, OQ-16, OQ-18 (forwarding policy
and identity scopes), OQ-17 (transport-aware auth), OQ-23 (irpc feature flag),
OQ-24 (DNS control channel scope), OQ-25 (crate irpc dependencies). Key open
questions: OQ-15 (QUIC coexistence), OQ-19 (WebTransport TLS), OQ-20 (worker
registration).

## Lifecycle Definitions

@@ -3,15 +3,15 @@ status: draft
last_updated: 2026-06-07
---

# Authentication & Identity
# Authentication

## What

A unified authentication and identity layer that works across all transports —
SSH-over-any-transport and WebTransport (non-SSH HTTP-level transports). The
same key material (Ed25519 authorized keys and certificate authorities) is
shared across both auth paths. Identity resolution produces a transport-agnostic
`Identity` that carries scopes and resources for downstream authorization.
A unified authentication layer that works across all transports — SSH-over-any-
transport and WebTransport (non-SSH HTTP-level transports). The same key
material (Ed25519 authorized keys and certificate authorities) is shared across
both auth paths. Identity resolution produces a transport-agnostic `Identity`
that carries scopes and resources for downstream authorization.

## Why

@@ -21,8 +21,27 @@ need a different auth presentation that shares the same key material. The
unified auth layer ensures one key set, one identity, one rotation mechanism
across all transports. See ADR-023 for the decision context.

The canonical definitions of `Identity` and `IdentityProvider` are in
[identity.md](identity.md). This document covers auth-specific behavior:
auth presentation per transport, `AuthPolicy` structure, and the auth service
relationship.

## Architecture

### Identity and IdentityProvider

See [identity.md](identity.md) for the canonical definitions of:
- `Identity` struct (`{ id, scopes, resources }`)
- `IdentityProvider` trait (`resolve_from_fingerprint()`, `resolve_from_token()`)
- `ConfigIdentityProvider` (default, ArcSwap-backed)
- `StorageIdentityProvider` (production, SQLite-backed, in alknet-storage)
- `AuthProtocol` irpc service (behind `irpc` feature flag)

The key relationship: `IdentityProvider` is the contract. `ConfigIdentityProvider`
is the default implementation (reads from `DynamicConfig.auth`). `AuthProtocol`
irpc service is one way to satisfy the trait, behind a feature flag. Both paths
produce the same `Identity` result. See ADR-028 and ADR-029.
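
The trait-as-contract relationship described above can be sketched as follows. This is a minimal illustration, not the project's implementation: it covers only the fingerprint path, and it assumes the config-backed provider holds a plain in-memory set of fingerprints instead of reading `DynamicConfig.auth` through `ArcSwap`. The constructor and the field names `authorized` and `default_scopes` are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// Transport-agnostic identity, mirroring the `{ id, scopes, resources }` shape.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

/// Reduced form of the trait: only the fingerprint path is sketched here.
pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
}

/// Hypothetical config-backed provider: a set of authorized fingerprints plus
/// one default scope set shared by every authorized key.
pub struct ConfigIdentityProvider {
    authorized: HashSet<String>,
    default_scopes: Vec<String>,
}

impl ConfigIdentityProvider {
    pub fn new(authorized: &[&str], default_scopes: &[&str]) -> Self {
        Self {
            authorized: authorized.iter().map(|s| s.to_string()).collect(),
            default_scopes: default_scopes.iter().map(|s| s.to_string()).collect(),
        }
    }
}

impl IdentityProvider for ConfigIdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        if !self.authorized.contains(fingerprint) {
            return None; // unknown key: no identity
        }
        Some(Identity {
            id: fingerprint.to_string(), // config-backed ids are the fingerprint itself
            scopes: self.default_scopes.clone(),
            resources: HashMap::new(),
        })
    }
}
```

Callers depend only on `IdentityProvider`, so a storage-backed or irpc-backed implementation could be swapped in without changing the call sites.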

### Auth Presentation Per Transport

| Transport | Auth presentation | Verification |
@@ -72,44 +91,23 @@ V1 uses timestamp-only (±300s window, no server state). The replay trade-offs
and future zero-replay options (nonce challenge-response) are documented in
ADR-023.

### IdentityProvider Trait
### IdentityProvider and Auth Service Relationship

The `IdentityProvider` trait decouples alknet-core from any specific identity
storage. It resolves a key fingerprint or auth token to an `Identity` with
scopes and resources.
The `IdentityProvider` trait (defined in [identity.md](identity.md)) decouples
alknet-core from any specific identity storage. Two implementations exist:

```rust
pub trait IdentityProvider: Send + Sync + 'static {
    /// Resolve an SSH public key fingerprint to an identity.
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
- **ConfigIdentityProvider** (in alknet-core) — reads from
`ArcSwap<DynamicConfig.auth>`. Every authorized key gets a default scope set.
No database required. This is the default for minimal deployments.

    /// Resolve an auth token to an identity.
    /// Returns None if the token is invalid, expired, or the key is not authorized.
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}
- **StorageIdentityProvider** (in alknet-storage) — backed by SQLite
`peer_credentials` and `api_keys` tables plus the ACL graph. Resolves
fingerprint → account → organization membership → effective scopes.

pub struct Identity {
    pub id: String, // Unique identifier — fingerprint (config) or account UUID (database)
    pub scopes: Vec<String>, // e.g., ["relay:connect", "service:gitea:read"]
    pub resources: HashMap<String, Vec<String>>, // e.g., {"service": ["gitea", "registry"]}
}
```

> **Note on identity models**: Earlier research used `{node_id, fingerprint, scopes}`.
> The unified model uses `{id, scopes, resources}` where `id` serves as both
> fingerprint (for key-based auth from config) and account UUID (for
> database-backed auth). The `resources` field provides resource-level
> authorization beyond what scopes offer. This is the canonical definition
> that all components should use.
```

**Default implementation**: `ConfigIdentityProvider` loads from
`DynamicConfig.auth` (the `authorized_keys` set). Every authorized key gets a
default scope set. No database required.

**Head implementation**: Backed by `@alkdev/storage`'s `peer_credentials` and
`accounts` tables plus the ACL graph. Resolves fingerprint → account →
organization membership → effective scopes. Uses `ArcSwap` for hot reload.
The `AuthProtocol` irpc service (behind the `irpc` feature flag, per ADR-028)
provides an async boundary for auth verification. It is one way to satisfy the
`IdentityProvider` trait, not a replacement for it. Both the trait path and the
irpc path produce the same `Identity` result.

The trait is the contract. The backing store is pluggable. Alknet-core never
depends on Honker, SQLite, or any specific database.
@@ -240,13 +238,13 @@ security consideration:

## Open Questions

- **OQ-18**: Should `Identity.scopes` be populated from `ForwardingPolicy`
rules, from an external `IdentityProvider`, or from both? See
[open-questions.md](open-questions.md).
- **OQ-18**: ~~Source of Identity.scopes~~ Resolved per ADR-029 and ADR-031.
`IdentityProvider` owns scopes, `ForwardingPolicy` uses scopes from `Identity`.
See [open-questions.md](open-questions.md).

- **OQ-19**: Should the WebTransport listener require its own TLS identity
(separate from the SSH-over-TLS listener), or can they share the same
certificate? See [open-questions.md](open-questions.md).
certificate? Deferred to Phase 4. See [open-questions.md](open-questions.md).

## Design Decisions

@@ -254,16 +252,16 @@ security consideration:
|-----|----------|---------|
| [012](decisions/012-auth-ed25519-and-cert-authority.md) | Ed25519 + cert-authority | Key-based auth, no passwords |
| [023](decisions/023-unified-auth-shared-key-material.md) | Unified auth, shared key material | Same keys for SSH and token auth |
| [028](decisions/028-auth-irpc-service.md) | Auth as irpc service | AuthProtocol behind feature flag; IdentityProvider is the contract |
| [029](decisions/029-identity-core-type.md) | Identity as core type | `Identity` and `IdentityProvider` in alknet-core |

## References

- [identity.md](identity.md) — Canonical Identity and IdentityProvider definitions
- [server.md](server.md) — Current SSH auth handler
- [transport.md](transport.md) — Transport abstraction
- [configuration.md](../research/configuration.md) — DynamicConfig, AuthPolicy structure
- [open-questions.md](open-questions.md) — OQ-17 (resolved), OQ-18, OQ-19
- `server/handler.rs` — Current `auth_publickey()` callback
- `auth/server_auth.rs` — Current `ServerAuthConfig` struct
- `auth/keys.rs` — `KeySource` and key loading
- [configuration.md](configuration.md) — DynamicConfig, AuthPolicy, ConfigReloadHandle
- [services.md](services.md) — AuthProtocol irpc service
- [open-questions.md](open-questions.md) — OQ-17 (resolved), OQ-18 (resolved), OQ-19
- [wtransport](https://github.com/BiagioFesta/wtransport) — Rust WebTransport library
- [WebTransport W3C Spec](https://www.w3.org/TR/webtransport/) — Browser API
- [@alkdev/storage](/workspace/@alkdev/storage) — `peer_credentials` table, ACL graph
192
docs/architecture/configuration.md
Normal file
@@ -0,0 +1,192 @@
---
status: draft
last_updated: 2026-06-07
---

# Configuration

## What

Alknet's configuration is split into `StaticConfig` (immutable after startup) and
`DynamicConfig` (hot-reloadable at runtime), with `ArcSwap` providing lock-free
reads on the hot path. `ConfigService` wraps reloads behind an irpc protocol
for production deployments.

## Why

Three specific failures motivated the split (ADR-030):

1. No hot reload of authentication credentials — adding a key requires a restart.
2. No port forwarding access control — any authenticated client has unrestricted
access (ADR-031).
3. No structured configuration beyond CLI flags — operators need config files
and the NAPI layer needs programmatic reload.

The split is clean: anything that affects SSH handshake or socket binding is
static; anything checked per-connection or per-channel is dynamic.

## Architecture

### StaticConfig

Immutable after startup. Constructed from `ServeOptions` (the builder pattern
is preserved per ADR-011). Contains:

- Transport mode, listen address
- TLS config (cert, key)
- iroh config (relay URL)
- Stealth mode flag
- Host key, host key algorithm
- Max auth attempts, max connections per IP
- Proxy config

Changing any of these requires a restart.

### DynamicConfig

Hot-reloadable at runtime via `ArcSwap<DynamicConfig>`. Contains:

- `AuthPolicy` — authorized keys, certificate authorities, token config
- `ForwardingPolicy` — allow/deny rules for channel targets (ADR-031)
- `RateLimitConfig` — rate limiting parameters

`ArcSwap` provides lock-free reads. Every `auth_publickey()` and
`channel_open_direct_tcpip()` call does a single `Arc` dereference — zero cost
compared to the current approach. Writes are atomic: `store()` swaps the
pointer.

### ConfigReloadHandle

```rust
pub struct ConfigReloadHandle {
    dynamic: Arc<ArcSwap<DynamicConfig>>,
}

impl ConfigReloadHandle {
    pub fn reload(&self, new_config: DynamicConfig) { ... }
}
```

Obtained from `Server::run()`. Passed to NAPI or CLI for explicit reload.
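
The body of `reload()` is elided above. A minimal sketch of the snapshot-and-swap idea follows, with loud assumptions: it uses `RwLock<Arc<DynamicConfig>>` from the standard library as a stand-in for `ArcSwap` (so reads briefly take a lock, which `ArcSwap` avoids), the `DynamicConfig` here is a trimmed hypothetical with a single `authorized_keys` field, and `new()` and `load()` are illustrative helpers that do not appear in the spec.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical, trimmed-down dynamic config; the spec's version carries
/// AuthPolicy, ForwardingPolicy, and RateLimitConfig.
#[derive(Debug, Clone, PartialEq)]
pub struct DynamicConfig {
    pub authorized_keys: Vec<String>,
}

/// Std-only stand-in for `Arc<ArcSwap<DynamicConfig>>`.
pub struct ConfigReloadHandle {
    dynamic: Arc<RwLock<Arc<DynamicConfig>>>,
}

impl ConfigReloadHandle {
    pub fn new(initial: DynamicConfig) -> Self {
        Self { dynamic: Arc::new(RwLock::new(Arc::new(initial))) }
    }

    /// Take a snapshot; the caller keeps using it even if a reload happens later.
    pub fn load(&self) -> Arc<DynamicConfig> {
        let guard = self.dynamic.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Replace the shared pointer; snapshots taken earlier are left untouched.
    pub fn reload(&self, new_config: DynamicConfig) {
        *self.dynamic.write().unwrap() = Arc::new(new_config);
    }
}
```

A connection that called `load()` before a reload keeps its old snapshot, while the next `load()` sees the new config, which matches the constraint that existing connections continue with their current config.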

### ConfigService irpc Service

```rust
enum ConfigProtocol {
    GetForwardingPolicy,
    GetRateLimits,
    ReloadForwarding { policy: ForwardingPolicy },
    ReloadRateLimits { limits: RateLimitConfig },
}
```

Behind the `irpc` feature flag. For production deployments that use the service
layer. For minimal deployments, direct `ConfigReloadHandle::reload()` is
sufficient.

### ForwardingPolicy

Part of DynamicConfig (ADR-031). Evaluated per-channel-open, matched against
the authenticated `Identity`. Rules are evaluated in order; first match wins.
Default determines fallback.

```rust
pub struct ForwardingPolicy {
    pub default: ForwardingAction,
    pub rules: Vec<ForwardingRule>,
}
```
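
The "first match wins, default as fallback" evaluation can be sketched as below. The shape of `ForwardingRule` is an assumption: the spec does not define it here, so this sketch uses a hypothetical `host:port` pattern in which either side may be `*`, and it leaves out the `Identity` and `TransportKind` matching that the spec mentions.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ForwardingAction {
    Allow,
    Deny,
}

/// Hypothetical rule shape: a `host:port` pattern where either part may be `*`.
pub struct ForwardingRule {
    pub target: String,
    pub action: ForwardingAction,
}

pub struct ForwardingPolicy {
    pub default: ForwardingAction,
    pub rules: Vec<ForwardingRule>,
}

/// Match one pattern component: `*` matches anything, otherwise exact match.
fn part_matches(pattern: &str, value: &str) -> bool {
    pattern == "*" || pattern == value
}

impl ForwardingRule {
    fn matches(&self, host: &str, port: u16) -> bool {
        // Split at the last ':' so the port is always the final component.
        match self.target.rsplit_once(':') {
            Some((h, p)) => part_matches(h, host) && part_matches(p, &port.to_string()),
            None => false, // a malformed pattern never matches
        }
    }
}

impl ForwardingPolicy {
    /// Rules are checked in order; the first match wins, else the default applies.
    pub fn evaluate(&self, host: &str, port: u16) -> ForwardingAction {
        self.rules
            .iter()
            .find(|rule| rule.matches(host, port))
            .map(|rule| rule.action)
            .unwrap_or(self.default)
    }
}
```

Because evaluation stops at the first match, a specific deny rule must be listed before a broader allow rule for the same host.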

### TOML Config File

Optional convenience input format (amends ADR-011, does not replace
programmatic API). Covers static config plus initial auth/forwarding paths.

```toml
[server]
transport = "tls"
listen = "0.0.0.0:443"

[auth]
host_key = "/etc/alknet/ssh/host_key"

[forwarding]
default = "deny"

[[forwarding.rules]]
target = "localhost:*"
action = "allow"
```

### NAPI Reload API

```typescript
interface AlknetServer {
  reloadAuth(auth: { authorizedKeys?: Buffer, certAuthority?: Buffer }): void;
  reloadForwarding(policy: ForwardingPolicyConfig): void;
  reloadAll(config: DynamicConfig): void;
}
```

### Multi-Transport Listeners

A head node may accept connections on multiple transports simultaneously. The
architecture supports `Vec<ListenerConfig>` instead of a single
`ServeTransportMode`. `Server::run()` spawns one accept loop per listener,
sharing `DynamicConfig`, `ConnectionRateLimiter`, sessions, and shutdown signal.

```toml
[[listeners]]
transport = "tls"
listen = "0.0.0.0:443"
stealth = true

[[listeners]]
transport = "tcp"
listen = "0.0.0.0:22"

[[listeners]]
transport = "iroh"
iroh_relay = "https://relay.alk.dev"
```

### CLI vs Programmatic Behavior

| Interface | Static config | Dynamic config | Reload mechanism |
|-----------|--------------|----------------|------------------|
| CLI | Flags + optional `--config` file | Loaded at startup from `--authorized-keys` | None (restart to change) |
| Core Rust | `StaticConfig` struct | `AuthService` (irpc) or `ArcSwap<DynamicConfig>` (minimal) | `ConfigService::reload()` or `ConfigReloadHandle::reload()` |
| NAPI | `serve()` options | Same | `server.reloadAuth()`, `server.reloadForwarding()` |

## Constraints

- `StaticConfig` cannot be changed after startup. Changing transport mode,
listen address, TLS config, or host key requires a restart.
- `DynamicConfig` is reloaded atomically via `ArcSwap`. Existing connections
continue with their current config; new connections get the new config.
- Config file is optional. `ServeOptions` builder pattern remains the primary
API (amends ADR-011, does not supersede it).
- No file watching (OQ-13 resolved: potential attack vector, unnecessary
complexity).
- Client configuration stays as `ConnectOptions` — no `ArcSwap` needed.

## Open Questions

- None. All configuration-related questions are resolved per ADR-030, ADR-031,
and the resolved OQs in [open-questions.md](open-questions.md).

## Design Decisions

| ADR | Decision | Summary |
|-----|----------|---------|
| [030](decisions/030-static-dynamic-config-split.md) | Static/dynamic config split | Immutable transport vs. reloadable auth/forwarding |
| [011](decisions/011-no-ssh-config-programmatic-api.md) | Programmatic-first API | Amended, not superseded — TOML is convenience layer |
| [031](decisions/031-forwarding-policy.md) | Forwarding policy | Rule-based allow/deny, TransportKind-aware |
| [029](decisions/029-identity-core-type.md) | Identity as core type | DynamicConfig.auth consumed by IdentityProvider |
| [028](decisions/028-auth-irpc-service.md) | Auth as irpc service | ConfigService wraps DynamicConfig reloads |

## References

- [research/configuration.md](../research/configuration.md) — Full analysis and proposed solution
- [identity.md](identity.md) — IdentityProvider trait, DynamicConfig.auth
- [ADR-013](decisions/013-fail2ban-friendly-logging.md) — Rate limiting parameters
@@ -0,0 +1,162 @@
# ADR-026: Transport/Interface Separation (Three-Layer Model)

## Status

Accepted

## Context

In the current architecture, SSH is deeply embedded in the server handler. The
`ServerHandler` owns auth, channel management, and proxy logic — all mixed
together. This makes it impossible to run the call protocol over any transport
that doesn't speak SSH, such as:

- **DNS** — encoding call protocol frames as DNS TXT queries/responses for
censorship resistance
- **Raw framing** — 4-byte length prefix + JSON `EventEnvelope` without SSH
wrapping, for local service mesh or browser-to-head direct communication
- **WebTransport** — running call protocol over QUIC streams (browsers can't do
SSH key exchange)

The DNS control channel concept from research (`core.md`) currently conflates
"DNS as a transport that moves bytes" with "SSH sessions over those bytes." But
SSH is not a transport — it's a protocol layer that sits *on top of* a
transport. Separating them enables the DNS control channel to carry call
protocol events directly, without wrapping SSH inside DNS queries.

The same separation enables raw framing (no SSH overhead) for trusted local
networks, and WebTransport direct call protocol for browser clients.

## Decision

**Establish a three-layer model:**

### Layer 1: Transport

Produces byte streams. A `Transport` still produces
`AsyncRead + AsyncWrite + Unpin + Send`. This layer is unchanged from ADR-001.

```rust
#[async_trait]
pub trait Transport: Send + Sync + 'static {
    type Stream: AsyncRead + AsyncWrite + Unpin + Send + 'static;
    async fn connect(&self) -> Result<Self::Stream>;
    fn describe(&self) -> String;
}
```

Transports: TCP, TLS, iroh, DNS (as byte carrier), WebTransport (future).

### Layer 2: Interface

Consumes a `Transport::Stream` and produces call protocol sessions. An
interface is what SSH currently does: wrap a byte stream in session semantics.

```rust
#[async_trait]
pub trait Interface: Send + Sync + 'static {
    type Session;
    async fn accept(stream: TransportStream, config: &InterfaceConfig) -> Result<Self::Session>;
}
```

Interfaces:

- **SSH interface** — wraps existing `ServerHandler` logic. SSH handshake, auth,
channel multiplexing. The call protocol runs over a reserved SSH channel
(`alknet-control:0`).
- **Raw framing interface** — 4-byte big-endian length prefix + JSON
`EventEnvelope`. No SSH overhead. Direct call protocol over the transport
stream.
- **DNS control channel** — a (DNS transport, raw framing interface) pair that
encodes/decodes `EventEnvelope` frames as DNS query/response pairs.
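
The raw framing described in the list above (a 4-byte big-endian length prefix followed by the payload) can be sketched with the standard library alone. The function names `encode_frame` and `decode_frame` are illustrative, not names from the codebase, and the payload is treated as opaque bytes, so JSON serialization of `EventEnvelope` is assumed to happen elsewhere.

```rust
/// Prefix a payload (for example a JSON-serialized `EventEnvelope`) with its
/// length as a 4-byte big-endian integer.
/// Returns None if the payload does not fit in a u32 length.
pub fn encode_frame(payload: &[u8]) -> Option<Vec<u8>> {
    if payload.len() > u32::MAX as usize {
        return None;
    }
    let len = payload.len() as u32;
    let mut frame = Vec::with_capacity(4 + payload.len());
    frame.extend_from_slice(&len.to_be_bytes());
    frame.extend_from_slice(payload);
    Some(frame)
}

/// Try to split one complete frame off the front of `buf`.
/// Returns (payload, remaining bytes), or None if more data is needed.
pub fn decode_frame(buf: &[u8]) -> Option<(&[u8], &[u8])> {
    if buf.len() < 4 {
        return None; // length prefix not fully received yet
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let rest = &buf[4..];
    if rest.len() < len {
        return None; // payload not fully received yet
    }
    Some(rest.split_at(len))
}
```

A real interface would likely also cap the accepted frame length before allocating, since the prefix is attacker-controlled on untrusted transports. That limit is not specified in the ADR and is left out here.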

### Layer 3: Protocol

Carries semantics. Call protocol events, operation registry, service calls.
The protocol is agnostic to both the transport and the interface below it. It
receives `EventEnvelope` frames from whatever interface produced them.

### Connection Model

A **connection** is always a (Transport, Interface) pair. The valid combinations are enumerated:

| Transport | Interface | Use case |
|-----------|-----------|----------|
| TLS | SSH | Standard alknet tunnel |
| TCP | SSH | Plain SSH tunnel |
| iroh | SSH | P2P SSH tunnel |
| DNS | raw framing | DNS control channel |
| WebTransport | SSH | Browser SSH tunnel (future) |
| WebTransport | raw framing | Browser call protocol (future) |
| TCP | raw framing | Direct call protocol, local mesh |

**The DNS control channel carries call protocol frames directly — it does NOT
wrap SSH inside DNS.** This is explicit because the research originally
conflated "SSH tunneling over DNS" with "DNS as a transport for call protocol."
The (DNS, raw framing) pair sends `EventEnvelope` frames as DNS TXT
queries/responses — no SSH involved.
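
One way to make the enumerated combinations checkable in code is a small validation function over the table above. This is only a sketch: `TransportTag` and `InterfaceKind` are hypothetical payload-free enums introduced for illustration (the ADR's own `TransportKind` carries per-variant data), and the ADR does not say that such a check exists.

```rust
/// Hypothetical payload-free transport tag, one variant per table row family.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TransportTag {
    Tcp,
    Tls,
    Iroh,
    Dns,
    WebTransport,
}

/// Hypothetical tag for the two interfaces named in this ADR.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum InterfaceKind {
    Ssh,
    RawFraming,
}

/// True only for the (Transport, Interface) pairs listed in the table.
pub fn is_valid_connection(transport: TransportTag, interface: InterfaceKind) -> bool {
    use InterfaceKind::*;
    use TransportTag::*;
    matches!(
        (transport, interface),
        (Tls, Ssh)
            | (Tcp, Ssh)
            | (Iroh, Ssh)
            | (WebTransport, Ssh)
            | (Dns, RawFraming)
            | (WebTransport, RawFraming)
            | (Tcp, RawFraming)
    )
}
```

Under this reading, `(Dns, Ssh)` is rejected, which encodes the rule that the DNS control channel never wraps SSH.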

### `TransportKind` Enum

The `TransportKind` enum (currently `Tcp | Tls | Iroh`) gains `Dns` and
`WebTransport` variants. Initially these are tags only — no acceptor
implementation. The full DNS and WebTransport implementations are Phase 4 work
per the integration plan.

```rust
pub enum TransportKind {
    Tcp,
    Tls { server_name: Option<String> },
    Iroh { endpoint_id: String },
    Dns { domain: String },
    WebTransport { host: String },
}
```

### ServerHandler Refactor

The existing `ServerHandler` is refactored into `SshInterface`. The interface
abstraction means the server's accept loop becomes:

```rust
// Pseudocode
let (transport, interface) = listener_config;
let stream = transport.accept().await?;
let session = interface.accept(stream, &config).await?;
// session produces call protocol events
```

The call protocol handler is interface-agnostic — it receives `EventEnvelope`
frames from any interface. Auth, forwarding policy, and operation routing happen
at Layer 3, not inside the SSH handler.

## Consequences

- **Positive**: Enables DNS control channel without SSH wrapping. The (DNS,
raw framing) pair is a clean (Transport, Interface) combination.
- **Positive**: Enables raw framing for local service mesh. No SSH overhead for
trusted networks.
- **Positive**: SSH becomes pluggable. The same call protocol handler works with
any interface.
- **Positive**: `ServerHandler` is refactored into `SshInterface` — a smaller,
more focused component that only handles SSH session management.
- **Positive**: Future WebTransport and WebSocket interfaces are additive — they
implement the `Interface` trait without touching SSH code.
- **Negative**: This is the most invasive code change in Phase 1
(integration-plan, Phase 1.8). SSH auth, channel management, and proxy logic
are currently tangled in `ServerHandler`. Extracting them requires careful
refactoring to maintain existing behavior.
- **Negative**: The `Interface` trait is new and untested. The design must
accommodate both SSH's channel multiplexing and raw framing's single-stream
model through the same abstraction.

## References

- [research/core.md](../../research/core.md) — Transport layer, DNS transport section
- [research/integration-plan.md](../../research/integration-plan.md) — Phase 1.8, three-layer model
- [transport.md](../transport.md) — Current Transport trait (unchanged at Layer 1)
- [server.md](../server.md) — Current ServerHandler (will become SshInterface)
- [ADR-001](001-pluggable-transport.md) — Transport trait produces stream (unchanged)
- [ADR-004](004-ssh-over-transport.md) — SSH runs over transport (reinforced by Layer 2)
- [ADR-024](024-bidirectional-call-protocol.md) — Bidirectional call protocol (Layer 3)
150
docs/architecture/decisions/027-crate-decomposition.md
Normal file
@@ -0,0 +1,150 @@
# ADR-027: Crate Decomposition

## Status

Accepted

## Context

alknet-core currently contains everything: transport, SSH, auth, config, the
call protocol handler, and the server accept loop. As the project grows to
include SQLite-backed identity, HD key derivation, and metagraph storage, core
would need to depend on rusqlite, bip39, petgraph, and other heavy dependencies
— unacceptable for a library crate that CLI users embed.

Different deployment topologies need different subsets:
- A minimal CLI tunnel only needs core, transport, and auth types
- A head node needs SQLite-backed identity and the secret service
- A flowgraph visualization tool only needs petgraph operations

Circular dependencies must be avoided. alknet-storage implements
alknet-core's `IdentityProvider` trait, so alknet-core cannot depend on
alknet-storage. alknet-storage references alknet-secret's `EncryptedData` wire
format, but not as a crate dependency.

## Decision

**Decompose the project into six crates with a strict acyclic dependency graph.**

### Crate Structure

1. **alknet-core** — Transport, SSH, call protocol, config, auth types, identity,
   `OperationSpec`, `Interface` trait. The foundational crate that everything
   else depends on (by type, not by crate dep in some cases).
   - *Depends on*: russh, tokio, irpc (feature-gated), serde, arc-swap
   - *Does NOT depend on*: alknet-secret, alknet-storage, alknet-flowgraph

2. **alknet-secret** — BIP39 mnemonic generation, SLIP-0010 Ed25519 HD key
   derivation, AES-256-GCM encryption, `SecretProtocol` irpc service.
   - *Depends on*: bip39, ed25519-bip32 (or rust-bip32-ed25519), aes-gcm, sha2,
     irpc
   - *Does NOT depend on*: alknet-core, alknet-storage

3. **alknet-storage** — SQLite-backed metagraph, identity tables, ACL graph,
   honker integration, `StorageProtocol` irpc service.
   - *Depends on*: rusqlite (via honker), honker, petgraph, jsonschema, irpc
   - *Does NOT depend on alknet-core* (but implements alknet-core's
     `IdentityProvider` trait via the trait, not a crate dep)
   - *Does NOT depend on alknet-secret* (but references `EncryptedData` type
     format for wire compatibility)

4. **alknet-flowgraph** — `FlowGraph<N,E>` over petgraph, operation graph, call
   graph, type compatibility checking.
   - *Depends on*: petgraph, serde, jsonschema, thiserror
   - *Does NOT depend on*: alknet-core, alknet-storage, alknet-secret

5. **alknet-napi** — Node.js native addon. Exposes alknet-core to Node.js.
   - *Depends on*: alknet-core
   - *Does NOT depend on*: alknet-secret, alknet-storage, alknet-flowgraph

6. **alknet** (CLI binary) — Assembles everything.
   - *Depends on*: alknet-core, alknet-secret (feature), alknet-storage (feature),
     alknet-flowgraph (feature), toml
|
||||
|
||||
### Dependency Graph

```
             alknet-secret
              /         \
             /           \
alknet-core ←──── ←── alknet-storage
  ↑          \           /
  │        alknet-flowgraph
  │
alknet-napi
alknet (CLI binary — assembles everything)
```

### Narrow Interface Points

Three types serve as the narrow interface points between crates:

1. **`Identity`** — Defined in `alknet_core::auth`. Used by auth handler,
   forwarding policy, and call protocol. alknet-storage implements
   `IdentityProvider` to produce instances.

2. **`IdentityProvider`** — Trait defined in `alknet_core::auth`. Implemented by
   `ConfigIdentityProvider` (in core) and `StorageIdentityProvider` (in
   alknet-storage). The CLI/NAPI layer wires the concrete implementation.

3. **`OperationSpec`** — Defined in `alknet_core::call`. Used by the operation
   registry and by alknet-flowgraph for type compatibility checking. The bridge
   is serialization — flowgraph serializes to JSON, storage persists it.

### irpc Feature Flag

irpc is a feature flag in alknet-core. When disabled, auth and config go through
`IdentityProvider` and `ConfigReloadHandle` directly — no irpc overhead. Nodes
that only do SSH tunneling don't need the service layer.

In alknet-secret and alknet-storage, irpc is an independent dependency, not
feature-gated. These crates always define irpc service protocols because they
are used in production deployments where the service layer is active.

### alknet-storage's Relationship to alknet-core

alknet-storage does NOT depend on alknet-core as a crate. Instead:

- alknet-storage defines its own `IdentityProvider` impl that matches
  alknet-core's trait signature. The trait is re-exported or defined locally
  with `#[cfg(feature = "alknet-core")]` interop.
- In practice, the CLI binary crate depends on both and wires them together.
  alknet-storage provides `StorageIdentityProvider`; alknet-core takes
  `impl IdentityProvider`.

### alknet-storage's Relationship to alknet-secret

alknet-storage does NOT depend on alknet-secret as a crate. Instead:

- alknet-storage and alknet-secret share the `EncryptedData` wire format (key
  version, salt, IV, ciphertext). This is a type-level compatibility, not a
  crate dependency.
- alknet-secret encrypts; alknet-storage stores the encrypted blob in a
  `SecretNode` in the metagraph. The bridge is serialization.
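
The shared wire format described above could be sketched as follows. This is a minimal illustration under stated assumptions: the ADR names only the fields (key version, salt, IV, ciphertext), so the field widths, the big-endian length prefix, and the `encode`/`decode` method names are hypothetical and not the format alknet-secret actually uses.

```rust
use std::convert::TryInto;

/// Hypothetical mirror of the `EncryptedData` fields named in this ADR.
/// Each crate could keep its own copy; only the byte layout is shared.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EncryptedData {
    pub key_version: u32,
    pub salt: Vec<u8>,
    pub iv: [u8; 12], // 96-bit nonce, the usual size for AES-256-GCM
    pub ciphertext: Vec<u8>,
}

impl EncryptedData {
    /// Assumed layout:
    /// key_version (u32 BE) | salt_len (u16 BE) | salt | iv (12 bytes) | ciphertext.
    /// Salts longer than u16::MAX bytes are not supported by this sketch.
    pub fn encode(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(18 + self.salt.len() + self.ciphertext.len());
        out.extend_from_slice(&self.key_version.to_be_bytes());
        out.extend_from_slice(&(self.salt.len() as u16).to_be_bytes());
        out.extend_from_slice(&self.salt);
        out.extend_from_slice(&self.iv);
        out.extend_from_slice(&self.ciphertext);
        out
    }

    /// Returns `None` when the buffer is too short for the declared layout.
    pub fn decode(bytes: &[u8]) -> Option<Self> {
        let key_version = u32::from_be_bytes(bytes.get(0..4)?.try_into().ok()?);
        let salt_len = u16::from_be_bytes(bytes.get(4..6)?.try_into().ok()?) as usize;
        let salt_end = 6 + salt_len;
        let salt = bytes.get(6..salt_end)?.to_vec();
        let iv: [u8; 12] = bytes.get(salt_end..salt_end + 12)?.try_into().ok()?;
        let ciphertext = bytes.get(salt_end + 12..)?.to_vec();
        Some(Self { key_version, salt, iv, ciphertext })
    }
}
```

A round-trip test of this kind (encode in one crate, decode in the other) is one way to catch the implicit-compatibility risk listed under Consequences.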

## Consequences

- **Positive**: Core is lean. No database, no crypto, no petgraph. CLI users
  get a small binary.
- **Positive**: Services are pluggable. alknet-secret and alknet-storage can be
  swapped for alternative implementations.
- **Positive**: No circular dependencies. The dependency graph is a DAG.
- **Positive**: Deployment topology determines which crates to include. A CLI
  tunnel uses only alknet-core. A head node uses everything.
- **Positive**: irpc is feature-gated in core. Minimal deployments don't pay for
  service layer overhead.
- **Negative**: `IdentityProvider` trait interop between alknet-core and
  alknet-storage requires careful versioning. If the trait signature changes,
  both crates must update.
- **Negative**: `EncryptedData` wire format compatibility between alknet-secret
  and alknet-storage is implicit (not enforced by the type system). A shared
  types crate could be extracted if needed, but adds another crate dependency.

## References

- [research/integration-plan.md](../../research/integration-plan.md) — Phase 2, dependency graph
- [research/core.md](../../research/core.md) — alknet-core contents
- [research/services.md](../../research/services.md) — Service protocols
- [research/storage.md](../../research/storage.md) — alknet-storage contents
- [research/flow.md](../../research/flow.md) — alknet-flowgraph contents
- [ADR-029](029-identity-core-type.md) — Identity as core type (narrow interface point)
146
docs/architecture/decisions/028-auth-irpc-service.md
Normal file
@@ -0,0 +1,146 @@
# ADR-028: Auth as irpc Service

## Status

Accepted

## Context

For head nodes serving many users, in-memory key lookup via `ArcSwap<DynamicConfig>`
doesn't scale. Loading all authorized keys into RAM and atomic-swapping the
entire set on each reload works for small deployments but requires holding every
key in memory. For production deployments with hundreds or thousands of users,
auth verification should query a database on demand rather than holding all keys
in memory.

The current `ArcSwap<DynamicConfig>` approach works for CLI and single-node
setups. What's needed is an async boundary that allows auth verification to go
through a service — locally via channels for minimal deployments, or via irpc
for production deployments where auth runs on a separate process or node.

The critical design point: callers go through the `IdentityProvider` trait
(ADR-029). The irpc service is one way to satisfy the trait. Both paths produce
the same result — an `Identity` or rejection. The trait is the contract; the
service is an implementation path.

## Decision

**Auth verification is provided via an irpc service protocol, with
`IdentityProvider` as the interface contract and `ConfigIdentityProvider`
(ArcSwap-backed) as the default implementation.**

### IdentityProvider Trait (ADR-029) — The Contract

Callers depend on `IdentityProvider`, not on any concrete implementation:

```rust
pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}
```

### ConfigIdentityProvider — Default Implementation

Reads the `auth` section of `ArcSwap<DynamicConfig>`. No database needed. Every
authorized key gets a default scope set. This is the default for CLI and
single-node deployments.
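
The default path can be sketched as below. This is an illustration under stated assumptions, not the alknet-core implementation: `AuthPolicy`'s fields, the `AuthToken` placeholder, and the constructor are hypothetical, and a std `RwLock<Arc<_>>` stands in for `ArcSwap` so the sketch compiles without external crates.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

/// Placeholder: the real token type is not specified in this ADR.
pub struct AuthToken(pub Vec<u8>);

pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}

/// Assumed shape of the auth section of `DynamicConfig`.
#[derive(Debug, Clone, Default)]
pub struct AuthPolicy {
    pub authorized_fingerprints: Vec<String>,
    pub default_scopes: Vec<String>,
}

/// Config-backed provider. `RwLock<Arc<_>>` is a std-only stand-in for `ArcSwap`.
pub struct ConfigIdentityProvider {
    auth: RwLock<Arc<AuthPolicy>>,
}

impl ConfigIdentityProvider {
    pub fn new(auth: AuthPolicy) -> Self {
        Self { auth: RwLock::new(Arc::new(auth)) }
    }
}

impl IdentityProvider for ConfigIdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        // Take a snapshot of the current policy, then release the lock.
        let auth: Arc<AuthPolicy> = self.auth.read().ok()?.clone();
        if auth.authorized_fingerprints.iter().any(|f| f == fingerprint) {
            Some(Identity {
                id: fingerprint.to_string(),
                scopes: auth.default_scopes.clone(), // every key gets the default set
                resources: HashMap::new(),
            })
        } else {
            None
        }
    }

    fn resolve_from_token(&self, _token: &AuthToken) -> Option<Identity> {
        // Token handling is out of scope for this sketch.
        None
    }
}
```

Callers hold the provider as `Arc<dyn IdentityProvider>` or `impl IdentityProvider`, so swapping in a database-backed implementation does not change the call sites.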

### AuthProtocol irpc Service — Behind Feature Flag

```rust
#[rpc_requests(message = AuthMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum AuthProtocol {
    #[rpc(tx=oneshot::Sender<AuthResult>)]
    #[wrap(VerifyPubkey)]
    VerifyPubkey { fingerprint: String, key_data: Vec<u8> },

    #[rpc(tx=oneshot::Sender<AuthResult>)]
    #[wrap(VerifyToken)]
    VerifyToken { token_bytes: Vec<u8>, timestamp: u64 },

    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(ReloadKeys)]
    ReloadKeys,

    #[rpc(tx=oneshot::Sender<bool>)]
    #[wrap(CheckAccess)]
    CheckAccess { identity: Identity, operation: String },
}

enum AuthResult {
    Ok(Identity),
    Denied(String),
}
```

The `AuthProtocol` is behind the `irpc` feature flag in alknet-core. Nodes
that only do SSH tunneling don't need the service layer overhead. When the
feature is disabled, auth goes through `IdentityProvider` directly.

### AuthServiceImpl

Two implementations exist:

- **ConfigAuthService** — backed by `ConfigIdentityProvider` (ArcSwap path).
  Wraps the trait in an irpc service for deployments that use the service layer
  but don't have SQLite.
- **StorageAuthService** — backed by SQLite `peer_credentials` and `api_keys`
  tables (in alknet-storage). Queries on demand. Can maintain an LRU cache for
  hot fingerprints. This is the production implementation.

Both produce the same `AuthResult` — an `Identity` or a denial. Callers don't
know or care which backend is running.

### Integration with IdentityProvider

The irpc service and the trait compose. A caller goes through `IdentityProvider`,
which may internally delegate to the irpc service, or may satisfy the request
locally via `ConfigIdentityProvider`. The deployment topology determines the
path:

- **Minimal (CLI, single-node)**: `ConfigIdentityProvider` reads from
  `ArcSwap<DynamicConfig>`. No irpc overhead.
- **Production with local auth**: `AuthServiceImpl` wraps
  `StorageIdentityProvider` locally. The handler calls `IdentityProvider` which
  routes to the local irpc service.
- **Distributed auth**: Handler on a worker node calls `IdentityProvider` which
  routes to a remote auth irpc service over QUIC.
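
The composition of trait and service can be illustrated with a provider that forwards each lookup over a channel. This is a sketch under stated assumptions: `std::sync::mpsc` and a plain thread stand in for irpc, whose actual API is not used here, and `ServiceIdentityProvider`, `spawn_auth_service`, and the reduced `Identity` are hypothetical names for illustration.

```rust
use std::sync::{mpsc, Mutex};
use std::thread;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

pub enum AuthResult {
    Ok(Identity),
    Denied(String),
}

pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
}

/// One request message: the fingerprint plus a reply channel.
struct VerifyPubkey {
    fingerprint: String,
    reply: mpsc::Sender<AuthResult>,
}

/// Provider that forwards every lookup to a service thread.
pub struct ServiceIdentityProvider {
    requests: Mutex<mpsc::Sender<VerifyPubkey>>,
}

impl IdentityProvider for ServiceIdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        let (reply, response) = mpsc::channel();
        let request = VerifyPubkey { fingerprint: fingerprint.to_string(), reply };
        self.requests.lock().ok()?.send(request).ok()?;
        // Callers only see `Option<Identity>`; the denial reason stays in the service.
        match response.recv().ok()? {
            AuthResult::Ok(identity) => Some(identity),
            AuthResult::Denied(_reason) => None,
        }
    }
}

/// Spawns a thread that plays the role of the auth service.
pub fn spawn_auth_service(known: Vec<String>) -> ServiceIdentityProvider {
    let (requests, inbox) = mpsc::channel::<VerifyPubkey>();
    thread::spawn(move || {
        // The loop ends once every sender has been dropped.
        for request in inbox {
            let result = if known.contains(&request.fingerprint) {
                AuthResult::Ok(Identity {
                    id: request.fingerprint.clone(),
                    scopes: vec!["relay:connect".to_string()],
                })
            } else {
                AuthResult::Denied("unknown key".to_string())
            };
            let _ = request.reply.send(result);
        }
    });
    ServiceIdentityProvider { requests: Mutex::new(requests) }
}
```

The handler code is identical in all three topologies; only the value behind the trait changes.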

### ConfigService Integration

`AuthProtocol::ReloadKeys` triggers reload of the dynamic config's auth section.
For the `ConfigIdentityProvider` path, this is equivalent to
`ConfigReloadHandle::reload()`. For the `StorageIdentityProvider` path, this
refreshes the LRU cache. Both update atomically — ongoing connections are
unaffected, new connections pick up changes.

## Consequences

- **Positive**: Minimal deployments use `ArcSwap` without irpc overhead. No
  database dependency for CLI users.
- **Positive**: Production deployments wire `StorageIdentityProvider` behind the
  irpc service. Auth scales to thousands of users without loading all keys into
  memory.
- **Positive**: The `IdentityProvider` trait is the only contract callers depend
  on. This keeps alknet-core lean and testable.
- **Positive**: Feature flag (`irpc`) keeps core lean for deployments that don't
  need the service layer.
- **Positive**: Both paths produce identical `Identity` results. Behavioral
  parity is enforced by the shared `Identity` type.
- **Negative**: Two implementations must be kept in sync. `ConfigIdentityProvider`
  and `StorageIdentityProvider` must produce the same `Identity` for the same
  input. Integration tests should verify this.
- **Negative**: The `irpc` feature flag adds conditional compilation complexity.
  The core must compile and work without it, and the service layer must work
  with it enabled.

## References

- [research/services.md](../../research/services.md) — AuthService, AuthProtocol definition
- [auth.md](../auth.md) — IdentityProvider trait, Identity struct
- [research/configuration.md](../../research/configuration.md) — Auth service approach
- [research/integration-plan.md](../../research/integration-plan.md) — Phase 1.4
- [ADR-029](029-identity-core-type.md) — Identity as core type
- [ADR-027](027-crate-decomposition.md) — Crate decomposition
107
docs/architecture/decisions/029-identity-core-type.md
Normal file
@@ -0,0 +1,107 @@
# ADR-029: Identity as Core Type

## Status

Accepted

## Context

The `Identity` struct and `IdentityProvider` trait are needed by auth,
forwarding policy, and call protocol — three different subsystems in
alknet-core. Without placing them in core, these subsystems would each define
their own identity type, leading to duplication and conversion boilerplate.

The constraint: alknet-core must not depend on alknet-storage or any database.
The `IdentityProvider` trait must be in core so that the handler can resolve
identities without knowing whether the backing store is a config file or a
SQLite database. External crates provide implementations.

Earlier research defined `Identity` inconsistently: `{node_id, fingerprint,
scopes}` in services.md and `{id, scopes, resources}` in auth.md. The unified
model uses `{id, scopes, resources}` where `id` serves as both fingerprint (for
key-based auth from config) and account UUID (for database-backed auth).

## Decision

**`Identity` struct and `IdentityProvider` trait live in `alknet_core::auth`.**

### Identity Struct

```rust
use std::collections::HashMap;

pub struct Identity {
    pub id: String,                              // Fingerprint (config auth) or account UUID (database auth)
    pub scopes: Vec<String>,                     // e.g., ["relay:connect", "service:gitea:read"]
    pub resources: HashMap<String, Vec<String>>, // e.g., {"service": ["gitea", "registry"]}
}
```

The `id` field serves dual purpose: when using config-based authentication
(`ConfigIdentityProvider`), it holds the Ed25519 key fingerprint. When using
database-backed authentication (`StorageIdentityProvider`), it holds the account
UUID from the `accounts` table. This keeps the type simple while accommodating
both auth paths.

The `scopes` field provides authorization scope strings used by
`ForwardingPolicy` and `AccessControl` in `OperationSpec`. The `resources`
field provides resource-level authorization beyond what scopes offer (e.g., which
services this identity can access).
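
How consumers might query the two authorization fields can be sketched as follows. The `has_scope` and `can_access` helpers and `example_identity` are hypothetical additions for illustration, and exact string matching is an assumption; the ADR does not say whether scopes are hierarchical or support wildcards.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

impl Identity {
    /// True if the identity carries exactly this scope string (assumed semantics).
    pub fn has_scope(&self, scope: &str) -> bool {
        self.scopes.iter().any(|s| s == scope)
    }

    /// True if `name` is listed under the given resource kind.
    pub fn can_access(&self, kind: &str, name: &str) -> bool {
        self.resources
            .get(kind)
            .map_or(false, |names| names.iter().any(|n| n == name))
    }
}

/// Builds an identity using the example values from the struct comments.
pub fn example_identity() -> Identity {
    let mut resources = HashMap::new();
    resources.insert(
        "service".to_string(),
        vec!["gitea".to_string(), "registry".to_string()],
    );
    Identity {
        id: "SHA256:abc".to_string(),
        scopes: vec!["relay:connect".to_string(), "service:gitea:read".to_string()],
        resources,
    }
}
```

With this shape, a scope check answers "may this identity perform this kind of action", while a resource check narrows it to a named target.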

### IdentityProvider Trait

```rust
pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}
```

The trait is the contract. Callers (auth handler, forwarding policy, call
protocol) depend on `IdentityProvider` — not on any concrete implementation.

### Default and Production Implementations

- **`ConfigIdentityProvider`** (in alknet-core) — reads the `auth` section of
  `ArcSwap<DynamicConfig>`. Every authorized key gets a default scope set.
  No database needed. This is the default for minimal deployments.
- **`StorageIdentityProvider`** (in alknet-storage) — backed by SQLite
  `peer_credentials` and `api_keys` tables plus the ACL graph. Resolves
  fingerprint → account → organization membership → effective scopes. This is
  the production implementation for head nodes.

alknet-core never depends on alknet-storage. The trait relationship is:
alknet-core *defines* the trait, alknet-storage *implements* it. The CLI or
NAPI assembly layer wires the concrete implementation.

### Why Not in alknet-storage?

If `Identity` lived in alknet-storage, alknet-core would need to depend on
alknet-storage to use the type — creating a circular dependency (since
alknet-storage implements alknet-core's `IdentityProvider` trait). Placing the
type and trait in core breaks the cycle.

## Consequences

- **Positive**: alknet-core has no database dependency. Auth, forwarding, and
  call protocol all use the same `Identity` type.
- **Positive**: alknet-storage implements the core trait. The CLI/NAPI layer
  wires the concrete implementation. Deployment topology determines which impl
  to use.
- **Positive**: The `id` field serves dual purpose (fingerprint or UUID),
  avoiding separate types for config-based and database-based auth.
- **Positive**: `ForwardingPolicy` and `AccessControl` can reference scopes from
  `Identity` without knowing where they came from.
- **Negative**: Two implementations of `IdentityProvider` exist — `Config` and
  `Storage`. Both must produce identical `Identity` results for the same input.
  Tests should verify behavioral parity.
- **Negative**: The trait abstraction adds a level of indirection for the
  minimal (config-only) deployment path. The cost is negligible — the
  `ConfigIdentityProvider` is a simple `ArcSwap` dereference.

## References

- [auth.md](../auth.md) — IdentityProvider trait, Identity struct, unified auth
- [research/services.md](../../research/services.md) — AuthService, Identity section
- [research/integration-plan.md](../../research/integration-plan.md) — Phase 1.2
- [ADR-023](023-unified-auth-shared-key-material.md) — Unified auth with shared key material
- [ADR-028](028-auth-irpc-service.md) — Auth as irpc service
- [OQ-18](../open-questions.md) — IdentityProvider owns scopes
159
docs/architecture/decisions/030-static-dynamic-config-split.md
Normal file
@@ -0,0 +1,159 @@
# ADR-030: Static/Dynamic Configuration Split

## Status

Accepted

## Context

Alknet's configuration is loaded once at startup and never changes. This causes
three specific failures:

1. **No hot reload of authentication credentials.** Adding or removing an
   authorized key requires restarting the server process. In head/worker
   deployments where keys are managed via a database, the process must be
   restarted every time a key is added, revoked, or rotated. This is
   operationally unacceptable.

2. **No port forwarding access control.** Any authenticated client can open a
   `direct-tcpip` channel to any destination. There is no policy governing
   which hosts, ports, or alknet control channels a client may access. A
   compromised key grants unrestricted network access through the tunnel.

3. **No structured configuration beyond CLI flags.** ADR-011 chose
   programmatic-first configuration for the alpha — correct at the time. But as
   alknet moves toward publishable releases, operators need config files for
   reproducible deployments, and the NAPI layer needs programmatic reload
   capability that `ServeOptions` doesn't currently support.

Not all configuration should be reloadable. Transport-level settings (listen
address, TLS certificates, host key) require socket/TLS renegotiation to change
at runtime — effectively a restart. Auth and forwarding policy can change
atomically without disrupting existing connections.

## Decision

**Split configuration into `StaticConfig` and `DynamicConfig`.**

### StaticConfig

Immutable after startup. Constructed from `ServeOptions` (the builder pattern is
preserved). Contains everything that affects socket binding, TLS handshakes, or
SSH session negotiation:

- Transport mode, listen address
- TLS config (cert, key)
- iroh config (relay URL)
- Stealth mode flag
- Host key, host key algorithm
- Max auth attempts, max connections per IP
- Proxy config

Changing any of these requires a restart.

### DynamicConfig

Hot-reloadable at runtime via `ArcSwap<DynamicConfig>`. Contains everything
checked per-connection or per-channel:

- `AuthPolicy` — authorized keys, certificate authorities, token config
- `ForwardingPolicy` — allow/deny rules for channel targets (ADR-031)
- `RateLimitConfig` — rate limiting parameters

`ArcSwap` provides lock-free reads on the hot path (every `auth_publickey()` and
every `channel_open_direct_tcpip()` call does an `Arc` dereference — negligible
cost compared to the current approach). Writes are atomic: `store()` swaps the
pointer. Existing connections finish with their current config; new connections
get the new config.

### ConfigReloadHandle

```rust
pub struct ConfigReloadHandle {
    dynamic: Arc<ArcSwap<DynamicConfig>>,
}

impl ConfigReloadHandle {
    pub fn reload(&self, new_config: DynamicConfig) { ... }
}
```

The handle is obtained from `Server::run()` and passed to NAPI or the CLI.
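
The snapshot semantics of a reload can be sketched with the standard library alone. This is an illustration under stated assumptions: `RwLock<Arc<_>>` stands in for `ArcSwap` (it is not lock-free, unlike the real thing), and the `load` method, the constructor, and the single-field `DynamicConfig` are hypothetical.

```rust
use std::sync::{Arc, RwLock};

/// Reduced config for the sketch; the real `DynamicConfig` carries more.
#[derive(Debug, Clone, PartialEq)]
pub struct DynamicConfig {
    pub authorized_keys: Vec<String>,
}

#[derive(Clone)]
pub struct ConfigReloadHandle {
    // Std-only stand-in for `Arc<ArcSwap<DynamicConfig>>`.
    dynamic: Arc<RwLock<Arc<DynamicConfig>>>,
}

impl ConfigReloadHandle {
    pub fn new(initial: DynamicConfig) -> Self {
        Self { dynamic: Arc::new(RwLock::new(Arc::new(initial))) }
    }

    /// Snapshot for one connection: clones the inner `Arc`, not the config.
    pub fn load(&self) -> Arc<DynamicConfig> {
        let guard = self.dynamic.read().unwrap_or_else(|e| e.into_inner());
        Arc::clone(&*guard)
    }

    /// Replaces the whole config; snapshots taken earlier stay unchanged.
    pub fn reload(&self, new_config: DynamicConfig) {
        let mut guard = self.dynamic.write().unwrap_or_else(|e| e.into_inner());
        *guard = Arc::new(new_config);
    }
}
```

A connection that called `load()` before a reload keeps its old snapshot until it finishes, which matches the "existing connections finish with their current config" behavior described above.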

### ConfigService

The `ConfigService` wraps `ArcSwap<DynamicConfig>` reloads behind an irpc
protocol (behind the `irpc` feature flag) for production deployments that use
the service layer. For minimal deployments (CLI, single-node), direct
`ConfigReloadHandle::reload()` is sufficient.

### TOML Config File

An optional TOML config file covers static config plus initial auth/forwarding
paths. This **amends** ADR-011 (does not supersede it) — the programmatic-first
API remains primary. The config file is a convenience input format:

```toml
[server]
transport = "tls"
listen = "0.0.0.0:443"
stealth = false
max_connections_per_ip = 5
max_auth_attempts = 3

[server.tls]
cert = "/etc/alknet/tls/cert.pem"
key = "/etc/alknet/tls/key.pem"

[auth]
host_key = "/etc/alknet/ssh/host_key"

[forwarding]
default = "deny"
```

### NAPI Reload API

```typescript
interface AlknetServer {
  reloadAuth(auth: { authorizedKeys?: Buffer, certAuthority?: Buffer }): void;
  reloadForwarding(policy: ForwardingPolicyConfig): void;
  reloadAll(config: DynamicConfig): void;
}
```

The NAPI layer parses key data and constructs a new `DynamicConfig`, then calls
`ConfigReloadHandle::reload()`.

### Client Configuration

Client configuration stays as `ConnectOptions` — no `ArcSwap` needed. Client
config is almost entirely static (which server to connect to, which key to use).

## Consequences

- **Positive**: Auth credentials and forwarding policy can be reloaded without
  restarting the server. Adding a key via `reloadAuth()` takes effect on the
  next connection attempt.
- **Positive**: ADR-011's programmatic-first intent is preserved. The TOML
  config file is an optional convenience layer, not a replacement for
  `ServeOptions`.
- **Positive**: `ArcSwap` provides near-zero-cost reads on the hot path. Every
  auth check and every channel open is a single `Arc` dereference.
- **Positive**: The `ConfigService` irpc protocol (behind feature flag) allows
  production deployments to integrate config reload into their service mesh
  without taking a direct dependency on `DynamicConfig` internals.
- **Positive**: Forwarding policy is now part of `DynamicConfig` — operators can
  restrict access per identity, per destination, per transport (ADR-031).
- **Negative**: Two config structs where there was one. The split is clean
  (transport vs. policy) but adds surface area.
- **Negative**: Config file introduces `toml` as a dependency in the CLI crate.
  This is acceptable for a CLI binary.

## References

- [research/configuration.md](../../research/configuration.md) — Full analysis
- [ADR-011](011-no-ssh-config-programmatic-api.md) — Programmatic-first API (amended, not superseded)
- [ADR-031](031-forwarding-policy.md) — Forwarding policy (part of DynamicConfig)
- [ADR-029](029-identity-core-type.md) — Identity as core type (DynamicConfig.auth uses IdentityProvider)
- [integration-plan.md](../../research/integration-plan.md) — Phase 1.1
138
docs/architecture/decisions/031-forwarding-policy.md
Normal file
@@ -0,0 +1,138 @@
# ADR-031: Forwarding Policy

## Status

Accepted

## Context

Currently, any authenticated client can open a `direct-tcpip` SSH channel to
any destination. The only gate is authentication — once authenticated, a client
has unrestricted network access through the tunnel. This is a security gap: a
compromised key grants unrestricted access.

Operators need the ability to:
- Restrict which hosts and ports authenticated clients can access
- Apply different rules to different principals (key fingerprints, accounts)
- Restrict WebTransport clients to alknet control channels only
- Set a default policy (allow-all for migration compatibility, deny-all for
  production)

## Decision

**Add `ForwardingPolicy` as part of `DynamicConfig` (reloadable without
restart).**

### Type Definitions

```rust
pub struct ForwardingPolicy {
    pub default: ForwardingAction,
    pub rules: Vec<ForwardingRule>,
}

pub struct ForwardingRule {
    pub target: TargetPattern,
    pub action: ForwardingAction,
    pub principals: Vec<String>,        // Empty = matches all
    pub transports: Vec<TransportKind>, // Empty = matches all
}

pub enum ForwardingAction {
    Allow,
    Deny,
}

pub enum TargetPattern {
    Any,
    Host(String),                  // "localhost", "*.example.com"
    Cidr(IpNetwork),               // "10.0.0.0/8"
    PortRange(String, Range<u16>), // "localhost", ports 8080-8090
    AlknetPrefix,                  // Matches alknet-* control channels
}
```

### Rule Evaluation

Rules are evaluated in order. First match wins. If no rule matches, the default
applies. This supports both allowlist and blocklist semantics:

- **Allowlist**: `default: Deny`, then explicit Allow rules for permitted
  destinations.
- **Blocklist**: `default: Allow`, then explicit Deny rules for blocked
  destinations.
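
The ordered, first-match-wins evaluation described above can be sketched as follows. This is a reduced illustration under stated assumptions: `TargetPattern` omits `Cidr`, `AlknetPrefix`, and wildcard hosts, transports are left out, principals are compared against a single string, and the `evaluate` and `matches` methods are hypothetical names.

```rust
use std::ops::Range;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ForwardingAction {
    Allow,
    Deny,
}

/// Reduced `TargetPattern`: exact host names only, no CIDR or wildcards.
#[derive(Debug, Clone)]
pub enum TargetPattern {
    Any,
    Host(String),
    PortRange(String, Range<u16>),
}

pub struct ForwardingRule {
    pub target: TargetPattern,
    pub action: ForwardingAction,
    pub principals: Vec<String>, // Empty = matches all
}

pub struct ForwardingPolicy {
    pub default: ForwardingAction,
    pub rules: Vec<ForwardingRule>,
}

impl TargetPattern {
    fn matches(&self, host: &str, port: u16) -> bool {
        match self {
            TargetPattern::Any => true,
            TargetPattern::Host(h) => h == host,
            TargetPattern::PortRange(h, ports) => h == host && ports.contains(&port),
        }
    }
}

impl ForwardingPolicy {
    /// Returns the action of the first rule matching both principal and
    /// target, or the policy default when no rule matches.
    pub fn evaluate(&self, principal: &str, host: &str, port: u16) -> ForwardingAction {
        self.rules
            .iter()
            .find(|rule| {
                let principal_ok = rule.principals.is_empty()
                    || rule.principals.iter().any(|p| p == principal);
                principal_ok && rule.target.matches(host, port)
            })
            .map_or(self.default, |rule| rule.action)
    }
}
```

Because order decides the outcome, a narrow Allow rule must appear before a broader Deny rule for the same host, and vice versa.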

### Principals

Each rule can specify which principals it applies to. A principal is an
`Identity.id` (fingerprint or UUID) or a scope from `Identity.scopes`. When the
rule's `principals` field is empty, it matches all identities.

This connects to the `IdentityProvider` trait (ADR-029): when a client
authenticates, the `Identity` is resolved, and the forwarding policy checks
rules against `Identity.id` and `Identity.scopes`.

### TransportKind-Aware Rules

Each rule can specify which `TransportKind` it applies to. This enables
transport-specific restrictions — for example, WebTransport clients can be
restricted to `alknet-*` control channels only:

```rust
ForwardingRule {
    target: TargetPattern::AlknetPrefix,
    action: ForwardingAction::Allow,
    principals: vec![],
    transports: vec![TransportKind::WebTransport { host: "*".into() }],
}
```

### Where the Policy Check Happens

The forwarding policy check occurs in `channel_open_direct_tcpip` before the
proxy task is spawned. The current behavior (no check) is equivalent to
`ForwardingPolicy::allow_all()` — default Allow, no rules. This preserves
backward compatibility during migration.

### DynamicConfig Integration

`ForwardingPolicy` is part of `DynamicConfig` and reloadable via
`ConfigReloadHandle::reload()` or NAPI's `reloadForwarding()`. Changes take
effect on the next channel open — existing connections continue with their
current policy.

### OQ Resolutions

- **OQ-12** (Per-user forwarding scope vs global rules): Resolved. Start with
  global rules + principal matching from `Identity.scopes`. Per-user scope
  from `peer_credentials.metadata.scopes` via `IdentityProvider`.
- **OQ-16** (Transport-specific forwarding): Resolved. Add `TransportKind`
  match in `ForwardingRule`. WebTransport clients can be restricted.
- **OQ-18** (Source of Identity.scopes): Resolved by ADR-029 and this ADR.
  `IdentityProvider` owns scopes. `ForwardingPolicy` consumes them.

## Consequences

- **Positive**: Operators can restrict access per identity, per destination, per
  transport. A compromised key no longer grants unrestricted network access.
- **Positive**: Default-allow preserves current behavior during migration. Switch
  to default-deny for production deployments.
- **Positive**: Policy is reloadable without restart. Adding a rule via
  `reloadForwarding()` takes effect on the next channel open.
- **Positive**: `TransportKind`-aware rules enable transport-specific
  restrictions (e.g., WebTransport clients restricted to alknet-* channels).
- **Negative**: Another check in the hot path (every `channel_open_direct_tcpip`
  call). The cost is a linear scan of rules — acceptable for small rule sets.
  Large rule sets should use compiled matchers (future optimization).
- **Negative**: `TargetPattern` string matching is lenient. Host patterns like
  `*.example.com` require careful implementation to prevent bypasses. The
  `glob` or `globset` crate can handle this correctly.
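
The bypass risk for host patterns can be made concrete with a small matcher. This is a hand-rolled sketch for illustration, assuming a leading `*.` is the only wildcard form and that it should match any depth of subdomain; `host_matches` is a hypothetical helper, not part of alknet, and it does not use the `glob` or `globset` crates.

```rust
/// Matches `host` against `pattern`, where a leading `*.` means
/// "any subdomain". Comparison is ASCII case-insensitive and ignores
/// a trailing dot on the host.
pub fn host_matches(pattern: &str, host: &str) -> bool {
    let pattern = pattern.to_ascii_lowercase();
    let host = host.trim_end_matches('.').to_ascii_lowercase();
    match pattern.strip_prefix("*.") {
        // Require a non-empty label followed by a dot before the suffix, so
        // "evilexample.com" and the bare "example.com" do not match
        // "*.example.com".
        Some(suffix) => host
            .strip_suffix(suffix)
            .and_then(|rest| rest.strip_suffix('.'))
            .map_or(false, |label| !label.is_empty()),
        // No wildcard: exact match only.
        None => host == pattern,
    }
}
```

A naive `ends_with("example.com")` check would accept `evilexample.com`; checking the label boundary is what closes that gap.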

## References

- [research/configuration.md](../../research/configuration.md) — ForwardingPolicy section
- [auth.md](../auth.md) — Identity.scopes and IdentityProvider
- [open-questions.md](../open-questions.md) — OQ-12, OQ-16, OQ-18
- [ADR-029](029-identity-core-type.md) — Identity as core type
- [ADR-030](030-static-dynamic-config-split.md) — DynamicConfig (ForwardingPolicy is part of it)
- [integration-plan.md](../../research/integration-plan.md) — Phase 1.3
96
docs/architecture/decisions/032-event-boundary-discipline.md
Normal file
@@ -0,0 +1,96 @@
# ADR-032: Event Boundary Discipline

## Status

Accepted

## Context

The research identified three distinct communication patterns in the system, and
conflating them is a known anti-pattern in event-driven architectures:

1. **Domain events** (Honker streams) — Internal to the service that owns that
   data. Used for state reconstruction within the service's own boundaries.
   Examples: `nodes:created`, `edges:deleted`, `accounts:updated`.

2. **irpc service calls** — Synchronous request-response within a node or
   cluster. Internal to the system. Examples: `AuthProtocol::VerifyPubkey`,
   `SecretProtocol::DeriveEd25519`, `ConfigProtocol::ReloadForwarding`.

3. **Call protocol events** (`EventEnvelope`) — Asynchronous integration events
   that cross node boundaries. External to the system. Examples:
   `call.requested`, `call.responded`, `call.completed`, `call.aborted`.

Without a hard constraint, it's tempting to have one service subscribe directly
to another service's Honker streams. This leads to:

- **Leaky event store**: Service A reads Service B's domain events directly,
  coupling A to B's internal state representation. When B changes its schema, A
  breaks.
- **Boomerang coupling**: An integration event is too thin, causing the
  consumer to call back to the source service synchronously to get details. This
  negates the benefit of async communication.
- **Fat notification trap**: A notification event carries full entity state,
  when it should use state transfer instead.

## Decision

**Event boundary discipline is a hard architectural constraint, not a
suggestion.**

1. **Domain events stay within the owning service.** A Honker stream published
   by the storage service (`nodes:created`) is for the storage service's own
   state reconstruction. No other service reads these stream events directly.

2. **irpc service calls are synchronous and internal.** They never cross node
   boundaries. They are request-response, not events. They should not be used
   as a substitute for integration events.

3. **Call protocol events are the only events that cross node boundaries.**
|
||||
`EventEnvelope` frames are the integration boundary. When a domain event
|
||||
needs to be communicated to another node, it must be projected into a call
|
||||
protocol event.
|
||||
|
||||
4. **Projection from domain events to integration events is required when
|
||||
crossing boundaries.** A service that owns a Honker stream must project
|
||||
relevant state changes into `EventEnvelope` frames before they leave the
|
||||
node. The projection strips internal details and produces a versioned,
|
||||
stable integration event.
|
||||
|
||||
This discipline applies at three levels:
|
||||
|
||||
```
Call Protocol (Layer 3, external, JSON)
  └── irpc Service (Layer 3, internal, postcard)
        └── Honker Streams (Domain events, within service boundary)
```
|
||||
|
||||
A call protocol handler MAY call an irpc service internally (e.g.,
|
||||
`/head/auth/verify` calls `AuthProtocol::VerifyPubkey`). The irpc service MAY
|
||||
use Honker streams for its own state management. But domain events never
|
||||
propagate beyond the service boundary without projection.
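
As a rough illustration of rule 4, the projection step might look like the sketch below. `NodeCreated`, `IntegrationEvent`, and the `storage.node.created` type name are hypothetical stand-ins; the actual `EventEnvelope` layout is defined by the call protocol spec and is not reproduced here.

```rust
/// Hypothetical domain event, private to the storage service.
#[derive(Debug, Clone)]
pub struct NodeCreated {
    pub row_id: i64,             // internal row id, must not leave the service
    pub key: String,
    pub attributes_json: String,
    pub internal_shard: u8,      // internal placement detail, must not leave the service
}

/// Hypothetical integration event, standing in for an `EventEnvelope` payload.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IntegrationEvent {
    pub event_type: String,
    pub schema_version: u32,
    pub payload: Vec<(String, String)>,
}

/// Project a domain event into a versioned integration event.
/// Only fields that are part of the reviewed contract are copied.
pub fn project(event: &NodeCreated) -> IntegrationEvent {
    IntegrationEvent {
        event_type: "storage.node.created".to_string(),
        schema_version: 1,
        payload: vec![
            ("key".to_string(), event.key.clone()),
            ("attributes".to_string(), event.attributes_json.clone()),
        ],
    }
}
```

The payload carries enough state that a consumer does not need to call back for details, while internal fields such as `row_id` stay inside the owning service.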
|
||||
|
||||
## Consequences
|
||||
|
||||
- **Positive**: Prevents leaky event stores. Services are independently
|
||||
deployable and their internal schemas can evolve without breaking consumers.
|
||||
- **Positive**: Honker and irpc are implementation details, not cross-boundary
|
||||
contracts. The call protocol's `EventEnvelope` is the only stable, versioned
|
||||
contract that other nodes depend on.
|
||||
- **Positive**: Clear ownership. Each service owns its Honker streams and can
|
||||
change them freely. Integration events are a deliberate, reviewed contract.
|
||||
- **Positive**: Makes testing easier. Services can be tested in isolation with
|
||||
mock domain events. Integration events are tested against the `EventEnvelope`
|
||||
schema.
|
||||
- **Negative**: Projection code is required. Every domain event that needs to
|
||||
cross a boundary must be explicitly projected. This is deliberate — the
|
||||
overhead ensures the integration contract is intentional.
|
||||
- **Negative**: Developers must resist the temptation to subscribe directly to
|
||||
Honker streams across services. Code review should catch this pattern.
|
||||
|
||||
## References
|
||||
|
||||
- [research/services.md](../../research/services.md) — Event boundary discipline section
|
||||
- [research/storage.md](../../research/storage.md) — Honker integration, event boundaries
|
||||
- [research/integration-plan.md](../../research/integration-plan.md) — ADR 032 entry
|
||||
- [event_source_types.md](/workspace/research/event_sourcing/event_source_types.md) — Event-driven architecture patterns
|
||||
@@ -0,0 +1,130 @@
|
||||
# ADR-033: OperationEnv as Universal Composition Mechanism
|
||||
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
## Context
|
||||
|
||||
The `@alkdev/operations` TypeScript package defines `OperationEnv` as a
|
||||
universal composition mechanism. A handler receives `context.env[namespace][op](input)`
|
||||
and can invoke any registered operation regardless of whether it runs locally, in
|
||||
an irpc service on the same cluster, or on a remote node via call protocol.
|
||||
|
||||
The research documents define three dispatch paths:
|
||||
1. **Local dispatch** — direct function call through the operation registry
|
||||
2. **Service dispatch** — irpc protocol call to a service backend
|
||||
3. **Remote dispatch** — call protocol `EventEnvelope` to a remote node
|
||||
|
||||
Without a formal decision, irpc services could be seen as a replacement for
|
||||
OperationEnv or for the call protocol. They are not — irpc is one dispatch
|
||||
backend for OperationEnv, not a replacement for anything. The call protocol is
|
||||
another dispatch backend. OperationEnv unifies them from the handler's
|
||||
perspective.
|
||||
|
||||
The three communication patterns in the system (ADR-032) are:
|
||||
- Domain events (Honker streams) — internal to the owning service
|
||||
- irpc service calls — synchronous, in-cluster
|
||||
- Call protocol events — asynchronous, cross-node
|
||||
|
||||
irpc services and call protocol operations serve different scopes but must
|
||||
compose cleanly through OperationEnv.
|
||||
|
||||
## Decision
|
||||
|
||||
**OperationEnv is the universal composition mechanism that all operation
|
||||
handlers receive. It provides namespace + operation name → invoke with input,
|
||||
return output, regardless of dispatch path.**
|
||||
|
||||
### OperationEnv Behavioral Contract
|
||||
|
||||
```rust
// The behavioral contract: given a namespace and operation name, invoke the
// operation with the given input and return the output. The handler neither
// knows nor cares whether the dispatch is local, via irpc, or via call protocol.
pub trait OperationEnv: Send + Sync {
    fn invoke(&self, namespace: &str, operation: &str, input: Value) -> ResponseEnvelope;
}
```
|
||||
|
||||
The Rust implementation may use typed method dispatch or a registry behind the
|
||||
scenes, but the handler-facing API must preserve this contract.
|
||||
|
||||
### Three Dispatch Paths
|
||||
|
||||
OperationEnv resolves each call to one of three dispatch backends:
|
||||
|
||||
| Path | Mechanism | Serialization | Scope |
|------|-----------|---------------|-------|
| Local | Direct function call through registry | None (in-process) | Same process |
| Service | irpc protocol enum dispatch | postcard (binary) | Same cluster |
| Remote | Call protocol `EventEnvelope` | JSON | Cross-node |
|
||||
|
||||
All three produce the same `ResponseEnvelope`. The handler always calls
|
||||
`context.env.invoke("secrets", "derive", input)` and gets a `ResponseEnvelope`
|
||||
back.
|
||||
|
||||
### Service Assembly
|
||||
|
||||
The deployment topology determines which dispatch path each operation uses:
|
||||
|
||||
```rust
// Minimal deployment (single node, all local)
let env = OperationEnv::local(local_registry);

// Production deployment (mix of local and remote)
let env = OperationEnv::new()
    .local("auth", auth_registry)             // Auth runs locally
    .local("config", config_registry)         // Config runs locally
    .service("secrets", secret_irpc_client)   // Secret service via irpc
    .remote("worker-1", call_protocol_conn);  // Worker-1 operations via call protocol
```
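
One way to read the assembly above is as a namespace-to-backend routing table. The sketch below is a std-only illustration under stated assumptions: `Value` and `ResponseEnvelope` are reduced to simple aliases, and `Backend`, `LocalBackend`, and `RoutedEnv` are hypothetical names, not types from alknet-core.

```rust
use std::collections::HashMap;

/// Stand-ins for the spec's `Value` and `ResponseEnvelope` (assumed shapes).
pub type Value = String;
pub type ResponseEnvelope = Result<Value, String>;

pub trait OperationEnv: Send + Sync {
    fn invoke(&self, namespace: &str, operation: &str, input: Value) -> ResponseEnvelope;
}

/// One dispatch backend: a local registry, an irpc client, or a call protocol connection.
pub trait Backend: Send + Sync {
    fn dispatch(&self, operation: &str, input: Value) -> ResponseEnvelope;
}

/// Local backend: a direct, in-process function call.
pub struct LocalBackend<F>(pub F);

impl<F> Backend for LocalBackend<F>
where
    F: Fn(&str, Value) -> ResponseEnvelope + Send + Sync,
{
    fn dispatch(&self, operation: &str, input: Value) -> ResponseEnvelope {
        (self.0)(operation, input)
    }
}

/// Routes each namespace to exactly one backend.
#[derive(Default)]
pub struct RoutedEnv {
    routes: HashMap<String, Box<dyn Backend>>,
}

impl RoutedEnv {
    pub fn new() -> Self {
        Self::default()
    }

    /// Builder-style registration, mirroring `.local(..)`, `.service(..)`, `.remote(..)`.
    pub fn route(mut self, namespace: &str, backend: Box<dyn Backend>) -> Self {
        self.routes.insert(namespace.to_string(), backend);
        self
    }
}

impl OperationEnv for RoutedEnv {
    fn invoke(&self, namespace: &str, operation: &str, input: Value) -> ResponseEnvelope {
        match self.routes.get(namespace) {
            Some(backend) => backend.dispatch(operation, input),
            None => Err(format!("unknown namespace: {namespace}")),
        }
    }
}
```

Because the handler only sees `OperationEnv::invoke`, swapping a `LocalBackend` for an irpc-backed or call-protocol-backed implementation of `Backend` would not change handler code, which is the property this ADR is after.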
|
||||
|
||||
### irpc Services Are One Dispatch Backend
|
||||
|
||||
irpc services (`AuthProtocol`, `SecretProtocol`, `ConfigProtocol`) define the
|
||||
wire format for in-cluster communication. They are Rust-to-Rust, type-safe,
|
||||
and efficient. But they are not a replacement for OperationEnv or for the call
|
||||
protocol. They are one dispatch backend.
|
||||
|
||||
An irpc service can be exposed as a call protocol operation:
|
||||
`/head/auth/verify` receives a call protocol event and internally calls
|
||||
`AuthProtocol::VerifyPubkey` via irpc. The layers compose:
|
||||
|
||||
```
Call Protocol (Layer 3, external, JSON)
  └── irpc Service (Layer 3, internal, postcard)
        └── Honker Streams (Domain events, within service boundary)
```
|
||||
|
||||
### Adapters Map to OperationEnv
|
||||
|
||||
HTTP (`POST /v1/{namespace}/{op}`), MCP (`tools/call`), DNS
|
||||
(`{op}.{namespace}.alk.dev TXT?`), and call protocol
|
||||
(`/call.requested`) all resolve through OperationEnv. This is what makes
|
||||
operations universally composable across all interfaces.
|
||||
|
||||
## Consequences
|
||||
|
||||
- **Positive**: Handlers compose through a single interface. Adding a new
|
||||
dispatch path (e.g., a new irpc service) doesn't change handler code.
|
||||
- **Positive**: irpc and call protocol coexist naturally. The handler doesn't
|
||||
know which path was taken.
|
||||
- **Positive**: Adapters (MCP, HTTP, DNS) map to operations through the same
|
||||
OperationEnv interface. One handler, multiple dispatch paths.
|
||||
- **Positive**: Deployment topology determines dispatch, not code. Same handler
|
||||
works locally, in-cluster, or cross-node.
|
||||
- **Negative**: OperationEnv is a new abstraction that must coexist with the
|
||||
existing call protocol handler pattern. The registry currently maps paths to
|
||||
handlers; OperationEnv adds namespace-aware composition on top.
|
||||
- **Negative**: The `@alkdev/operations` TypeScript nested-map model (namespace →
  operation → function, roughly `HashMap<String, HashMap<String, fn>>` in Rust
  terms) needs idiomatic Rust translation. The behavioral contract must match,
  but the implementation can differ.
|
||||
|
||||
## References
|
||||
|
||||
- [research/services.md](../../research/services.md) — OperationContext, OperationEnv
|
||||
- [research/integration-plan.md](../../research/integration-plan.md) — Phase 1.5, OperationEnv wiring
|
||||
- [ADR-032](032-event-boundary-discipline.md) — Event boundary discipline
|
||||
- [ADR-024](024-bidirectional-call-protocol.md) — Bidirectional call protocol
|
||||
- [ADR-025](025-handler-spec-separation.md) — Handler/spec separation
|
||||
55
docs/architecture/decisions/034-head-worker-terminology.md
Normal file
@@ -0,0 +1,55 @@
|
||||
# ADR-034: Head/Worker Terminology
|
||||
|
||||
## Status
|
||||
|
||||
Accepted
|
||||
|
||||
## Context
|
||||
|
||||
The project previously used hub/spoke terminology for describing node
|
||||
relationships: a hub node that coordinates connections and spokes that connect to
|
||||
it. This terminology implies a strict star topology where the hub is
|
||||
fundamentally different from spokes.
|
||||
|
||||
In practice, a coordinating node can also execute operations (run services,
|
||||
forward traffic). Any node can become a coordinator. The architecture supports
|
||||
mesh topologies where nodes coordinate in a peer-to-peer fashion.
|
||||
|
||||
The research documents (`core.md`, `services.md`) and updated architecture
|
||||
specs (`call-protocol.md`, `auth.md`, `napi-and-pubsub.md`, `open-questions.md`)
|
||||
already use head/worker consistently. Existing ADRs (024, 025) retain their
|
||||
original hub/spoke language because ADRs are historical records.
|
||||
|
||||
## Decision
|
||||
|
||||
**Use head/worker terminology throughout the project.**
|
||||
|
||||
- **Head node**: A node that coordinates — accepts connections, routes
|
||||
operations, manages cluster state. A head is also a worker (it can execute
|
||||
operations).
|
||||
- **Worker node**: A node that connects to a head, registers its services, and
|
||||
executes operations. Any worker can become a head.
|
||||
- **Node**: Any participant in the network. Every node has an Ed25519 identity.
|
||||
|
||||
The terms hub and spoke are deprecated in all new specs, code, and
|
||||
documentation. Existing ADRs retain their original language as historical
|
||||
records — ADRs document what was decided at the time, not what the current
|
||||
terminology is.
|
||||
|
||||
## Consequences
|
||||
|
||||
- **Positive**: Natural mesh formation. A head that is also a worker enables
|
||||
multi-hop routing, redundancy, and distributed topologies without a
|
||||
centralized authority.
|
||||
- **Positive**: Consistency with integration plan and research documents.
|
||||
- **Positive**: The terminology better reflects the architecture — there is no
|
||||
single "hub" that's fundamentally different from "spokes."
|
||||
- **Neutral**: Existing ADRs (024, 025) retain hub/spoke in their text. This is
|
||||
intentional — ADRs are historical records.
|
||||
|
||||
## References
|
||||
|
||||
- [research/integration-plan.md](../../research/integration-plan.md) — Phase 0 ADR 034 entry, inconsistencies section
|
||||
- [ADR-024](024-bidirectional-call-protocol.md) — Uses hub/spoke historically
|
||||
- [ADR-025](025-handler-spec-separation.md) — Uses hub/spoke historically
|
||||
- [research/core.md](../../research/core.md) — Head/worker terminology
|
||||
186
docs/architecture/flowgraph.md
Normal file
@@ -0,0 +1,186 @@
|
||||
---
|
||||
status: draft
|
||||
last_updated: 2026-06-07
|
||||
---
|
||||
|
||||
# FlowGraph
|
||||
|
||||
## What
|
||||
|
||||
The `alknet-flowgraph` crate provides graph data structures and operations,
|
||||
mapping the TypeScript `@alkdev/flowgraph` package's call-graph and
|
||||
operation-graph concepts to `petgraph::DiGraph`.
|
||||
|
||||
## Why
|
||||
|
||||
Call graphs and operation graphs are core observability and type-safety
|
||||
constructs. Call graphs track request flow across services; operation graphs
|
||||
validate type compatibility between composed operations. The crate is pure
|
||||
computation (no I/O, no external state), making it safe to include in any
|
||||
deployment topology.
|
||||
|
||||
## Architecture
|
||||
|
||||
### Core Abstraction
|
||||
|
||||
`petgraph::DiGraph` replaces graphology. The mapping is nearly 1:1 for the
|
||||
operations used:
|
||||
|
||||
| TypeScript (graphology) | Rust (petgraph) |
|------------------------|-----------------|
| `graph.addNode(key, attrs)` | `graph.add_node(attrs)` + key_to_index |
| `graph.addEdge(source, target, attrs)` | `graph.add_edge(source, target, attrs)` |
| `hasCycle()` | `is_cyclic_directed(&graph)` |
| `topologicalSort()` | `toposort(&graph, None)` |
|
||||
|
||||
A `HashMap<String, NodeIndex>` provides node-key-to-index lookups, mirroring
|
||||
the `key` column in the SQLite `nodes` table.
|
||||
|
||||
### FlowGraph<N, E>
|
||||
|
||||
```rust
pub struct FlowGraph<N, E>
where
    N: NodeAttributes,
    E: EdgeAttributes,
{
    graph: DiGraph<N, E>,
    key_to_index: HashMap<String, NodeIndex>,
}

pub trait NodeAttributes: Clone + Serialize + DeserializeOwned + Debug + Send + Sync {
    fn key(&self) -> &str;
    fn set_key(&mut self, key: String);
}

pub trait EdgeAttributes: Clone + Serialize + DeserializeOwned + Debug + Send + Sync {
    fn edge_type(&self) -> &str;
}
```
|
||||
|
||||
### Operation Graph (Static)
|
||||
|
||||
Built from `OperationSpec`s at startup. Answers structural questions: type
|
||||
compatibility, cycle detection, reachability.
|
||||
|
||||
```rust
pub struct OperationNodeAttrs {
    pub name: String,
    pub namespace: String,
    pub op_type: OperationType,
    pub input_schema: Value,
    pub output_schema: Value,
}

pub enum OperationType { Query, Mutation, Subscription }
```
|
||||
|
||||
Type compatibility compares `output_schema` (source) against `input_schema`
|
||||
(target) using `jsonschema::validate()`. Exact match or subtype = compatible
|
||||
edge. Structural mismatch = incompatible edge.
|
||||
|
||||
### Call Graph (Dynamic)
|
||||
|
||||
Populated at runtime from call protocol events. Every `call.requested` adds a
|
||||
node; `call.responded`/`call.error`/`call.aborted` update status.
|
||||
|
||||
```rust
pub struct CallNodeAttrs {
    pub request_id: String,
    pub operation_id: String,
    pub status: CallStatus,
    pub parent_request_id: Option<String>,
    pub input: Value,
    pub output: Option<Value>,
    pub error: Option<CallErrorInfo>,
    pub identity: Option<Identity>,
    pub started_at: Option<String>,
    pub completed_at: Option<String>,
}

pub enum CallStatus { Pending, Running, Completed, Failed, Aborted }
```
|
||||
|
||||
### Key Operations
|
||||
|
||||
| Query | Method | Returns |
|-------|--------|---------|
| Topological order | `topological_order()` | `Result<Vec<String>, CycleError>` |
| Cycle detection | `has_cycles()` | `bool` |
| Ancestors/descendants | `ancestors()`, `descendants()` | `Vec<String>` |
| Status filtering | `filter_by_status()` | Keys with matching status |
| Duration | `duration()` | `completed_at - started_at` |
|
||||
|
||||
### DAG Invariants
|
||||
|
||||
- **Operation graph**: DAG-only, enforced at construction. A cycle is rejected
  with `CycleError`.
|
||||
- **Call graph**: DAG by design. `parent_request_id` cannot create ancestor
|
||||
cycles.
|
||||
- **No parallel edges**: `multi: false`.
|
||||
- **No self-loops**: `allow_self_loops: false`.
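
The invariants above have to be enforced by the wrapper itself, since `petgraph`'s `DiGraph` accepts parallel edges and self-loops on `add_edge`. The following is a std-only sketch of the checks, using a plain adjacency map in place of `petgraph`; `DagSketch` and `EdgeError` are hypothetical names, and the real crate would presumably surface `CycleError` instead.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq, Eq)]
pub enum EdgeError {
    UnknownNode,
    SelfLoop,
    ParallelEdge,
    Cycle,
}

/// Std-only stand-in for a `DiGraph` wrapper, keyed by node key.
#[derive(Default)]
pub struct DagSketch {
    successors: HashMap<String, HashSet<String>>,
}

impl DagSketch {
    pub fn add_node(&mut self, key: &str) {
        self.successors.entry(key.to_string()).or_default();
    }

    /// True if `to` can be reached from `from` by following edges (iterative DFS).
    fn reaches(&self, from: &str, to: &str) -> bool {
        let mut stack = vec![from];
        let mut seen = HashSet::new();
        while let Some(current) = stack.pop() {
            if current == to {
                return true;
            }
            if seen.insert(current) {
                if let Some(next) = self.successors.get(current) {
                    stack.extend(next.iter().map(String::as_str));
                }
            }
        }
        false
    }

    /// Insert an edge only if it keeps the graph a simple DAG.
    pub fn add_edge(&mut self, source: &str, target: &str) -> Result<(), EdgeError> {
        if !self.successors.contains_key(source) || !self.successors.contains_key(target) {
            return Err(EdgeError::UnknownNode);
        }
        if source == target {
            return Err(EdgeError::SelfLoop);
        }
        if self.successors[source].contains(target) {
            return Err(EdgeError::ParallelEdge);
        }
        // source -> target closes a cycle exactly when source is already reachable from target.
        if self.reaches(target, source) {
            return Err(EdgeError::Cycle);
        }
        self.successors
            .get_mut(source)
            .expect("source presence checked above")
            .insert(target.to_string());
        Ok(())
    }
}
```

Checking reachability on every insertion costs a traversal per edge; for graphs built once at startup that is likely acceptable, though a batch build followed by a single `toposort` would also work.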
|
||||
|
||||
### Integration with alknet-storage
|
||||
|
||||
Call graphs and operation graphs are stored as metagraph instances in
|
||||
alknet-storage. The bridge is serialization: `FlowGraph` serializes to
|
||||
`serde_json::Value`, which storage persists in the `nodes.attributes` and
|
||||
`edges.attributes` columns.
|
||||
|
||||
### Integration with alknet-core (Call Protocol)
|
||||
|
||||
The call protocol's `EventEnvelope` drives call graph construction:
|
||||
|
||||
```rust
call_map.on_requested(|event| {
    call_graph.update_from_event(&CallEvent::Requested(event));
});
```
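
What `update_from_event` does with each event can be sketched as a small status state machine. This is an illustrative, std-only reduction: the `CallEvent` variants carry only the fields needed for the transition, `CallGraphSketch` is a hypothetical name, and ignoring late or duplicate terminal events is an assumption rather than specified behavior.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallStatus {
    Pending,
    Running,
    Completed,
    Failed,
    Aborted,
}

/// Assumed event shape, reduced to what the status transition needs.
pub enum CallEvent {
    Requested { request_id: String, parent_request_id: Option<String> },
    Responded { request_id: String },
    Error { request_id: String },
    Aborted { request_id: String },
}

#[derive(Default)]
pub struct CallGraphSketch {
    status: HashMap<String, CallStatus>,
    parent: HashMap<String, String>,
}

impl CallGraphSketch {
    pub fn update_from_event(&mut self, event: &CallEvent) {
        match event {
            // `call.requested` adds a node and records the parent link, if any.
            CallEvent::Requested { request_id, parent_request_id } => {
                self.status.insert(request_id.clone(), CallStatus::Pending);
                if let Some(parent) = parent_request_id {
                    self.parent.insert(request_id.clone(), parent.clone());
                }
            }
            CallEvent::Responded { request_id } => self.finish(request_id, CallStatus::Completed),
            CallEvent::Error { request_id } => self.finish(request_id, CallStatus::Failed),
            CallEvent::Aborted { request_id } => self.finish(request_id, CallStatus::Aborted),
        }
    }

    /// Only known, non-terminal calls transition; other events are ignored.
    fn finish(&mut self, request_id: &str, terminal: CallStatus) {
        if let Some(status) = self.status.get_mut(request_id) {
            if matches!(*status, CallStatus::Pending | CallStatus::Running) {
                *status = terminal;
            }
        }
    }

    pub fn status_of(&self, request_id: &str) -> Option<CallStatus> {
        self.status.get(request_id).copied()
    }

    pub fn parent_of(&self, request_id: &str) -> Option<&str> {
        self.parent.get(request_id).map(String::as_str)
    }
}
```

Since a node's parent is fixed when the node is created and never rewritten, a later event cannot introduce an ancestor cycle, which matches the call graph invariant stated earlier.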
|
||||
|
||||
### Crate Dependencies
|
||||
|
||||
```toml
[dependencies]
petgraph = "0.x"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
jsonschema = "0.x"
thiserror = "1"
uuid = { version = "1", features = ["v4"] }
chrono = { version = "0.x", features = ["serde"] }
```
|
||||
|
||||
Does NOT depend on alknet-core, alknet-storage, or alknet-secret.
|
||||
|
||||
### Interface Back to Core
|
||||
|
||||
`OperationSpec` and `CallNodeAttrs` types must match alknet-core's definitions.
|
||||
The bridge is serialization — flowgraph serializes to JSON, storage persists it.
|
||||
alknet-flowgraph does not depend on alknet-core as a crate; it conforms to the
|
||||
`OperationSpec` schema independently.
|
||||
|
||||
## Constraints
|
||||
|
||||
- Pure computation crate — no I/O, no database, no external state.
|
||||
- No dependency on alknet-core, alknet-storage, or alknet-secret.
|
||||
- Type compatibility with alknet-core's `OperationSpec` is via serialization
|
||||
conformance, not a crate dependency.
|
||||
|
||||
## Open Questions
|
||||
|
||||
- None specific to this spec. See [open-questions.md](open-questions.md) for
|
||||
general questions.
|
||||
|
||||
## Design Decisions
|
||||
|
||||
| ADR | Decision | Summary |
|-----|----------|---------|
| [027](decisions/027-crate-decomposition.md) | Crate decomposition | alknet-flowgraph is independent of core, storage, secret |
|
||||
|
||||
## References
|
||||
|
||||
- [research/flow.md](../research/flow.md) — Full FlowGraph, operation graph, call graph design
|
||||
- [research/integration-plan.md](../research/integration-plan.md) — Phase 2.3
|
||||
- [call-protocol.md](call-protocol.md) — EventEnvelope, PendingRequestMap
|
||||
- `@alkdev/flowgraph` — TypeScript call-graph and operation-graph implementation
|
||||
- `@alkdev/operations` — OperationSpec, CallHandler, registry
|
||||
189
docs/architecture/identity.md
Normal file
@@ -0,0 +1,189 @@
|
||||
---
|
||||
status: draft
|
||||
last_updated: 2026-06-07
|
||||
---
|
||||
|
||||
# Identity
|
||||
|
||||
## What
|
||||
|
||||
The `Identity` type and `IdentityProvider` trait are the core abstractions for
|
||||
authentication and authorization in alknet. `Identity` is the unified result of
|
||||
auth verification — whether via SSH public key, signed timestamp token, or
|
||||
database lookup. `IdentityProvider` is the trait that resolves credentials to an
|
||||
`Identity`, decoupling alknet-core from any specific identity storage.
|
||||
|
||||
## Why
|
||||
|
||||
Auth, forwarding policy, and call protocol all need to know who is making a
|
||||
request and what they are authorized to do. Without `Identity` in core, each
|
||||
subsystem would define its own identity type, leading to duplication and
|
||||
conversion boilerplate. Without `IdentityProvider` as a trait, alknet-core
|
||||
would either hardcode config-file-based auth or take a database dependency —
|
||||
neither acceptable for a library crate.
|
||||
|
||||
The `IdentityProvider` trait exists because the same auth verification concept
|
||||
needs two implementations: `ConfigIdentityProvider` for minimal deployments (all
|
||||
keys in memory via ArcSwap) and `StorageIdentityProvider` for production (SQLite
|
||||
lookup via `peer_credentials` and ACL graph). The trait is the contract; the
|
||||
backing store is pluggable.
|
||||
|
||||
## Architecture
|
||||
|
||||
### Identity Struct
|
||||
|
||||
```rust
pub struct Identity {
    pub id: String,                              // Fingerprint or account UUID
    pub scopes: Vec<String>,                     // e.g., ["relay:connect", "service:gitea:read"]
    pub resources: HashMap<String, Vec<String>>, // e.g., {"service": ["gitea", "registry"]}
}
```
|
||||
|
||||
The `id` field serves dual purpose:
|
||||
- **Config-based auth** (`ConfigIdentityProvider`): holds the Ed25519 key
|
||||
fingerprint (e.g., `SHA256:abc123...`)
|
||||
- **Database-backed auth** (`StorageIdentityProvider`): holds the account UUID
|
||||
from the `accounts` table
|
||||
|
||||
This keeps the type simple while accommodating both auth paths. Downstream
|
||||
consumers (forwarding policy, call protocol ACL checks) use `scopes` and
|
||||
`resources` without knowing whether the identity came from a config file or a
|
||||
database.
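
A downstream consumer might check authorization along the following lines. This is a sketch only: `has_scope` and `can_access` are hypothetical helpers, not methods defined by this spec, and exact string matching is assumed because scope wildcard semantics are not specified here.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

impl Identity {
    /// Exact-match scope check (hypothetical helper; no wildcard semantics assumed).
    pub fn has_scope(&self, scope: &str) -> bool {
        self.scopes.iter().any(|s| s == scope)
    }

    /// True if `name` is listed under the given resource `kind`.
    pub fn can_access(&self, kind: &str, name: &str) -> bool {
        self.resources
            .get(kind)
            .map_or(false, |names| names.iter().any(|n| n == name))
    }
}
```

Neither helper inspects `id`, so the same check applies whether the identity carries a key fingerprint or an account UUID.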
|
||||
|
||||
### IdentityProvider Trait
|
||||
|
||||
```rust
pub trait IdentityProvider: Send + Sync + 'static {
    /// Resolve an SSH public key fingerprint to an identity.
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;

    /// Resolve an auth token to an identity.
    /// Returns None if the token is invalid, expired, or the key is not authorized.
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}
```
|
||||
|
||||
Both SSH key auth and token auth resolve to the same `Identity` type. The trait
|
||||
lives in `alknet_core::auth`.
|
||||
|
||||
### ConfigIdentityProvider (Default)
|
||||
|
||||
Reads the `auth` section of `ArcSwap<DynamicConfig>` per ADR-030. Every authorized key gets
|
||||
a default scope set. No database dependency. This is the default for CLI and
|
||||
single-node deployments.
|
||||
|
||||
```rust
pub struct ConfigIdentityProvider {
    auth_config: Arc<ArcSwap<DynamicConfig>>,
}

impl IdentityProvider for ConfigIdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        let config = self.auth_config.load();
        config.auth.ssh.authorized_keys.get(fingerprint)
            .map(|key_entry| Identity {
                id: fingerprint.to_string(),
                scopes: key_entry.scopes.clone(),
                resources: key_entry.resources.clone(),
            })
    }

    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity> {
        // Verify Ed25519 signature against the same authorized_keys set
        // Resolve to the same Identity as SSH auth would produce
        todo!("token verification is elided in this spec")
    }
}
```
|
||||
|
||||
### StorageIdentityProvider (Production)
|
||||
|
||||
Implemented in `alknet-storage` (not in alknet-core). Backed by SQLite
|
||||
`peer_credentials` and `api_keys` tables plus the ACL graph. Resolves
|
||||
fingerprint → account → organization membership → effective scopes. It implements
the `IdentityProvider` trait defined in alknet-core, so the dependency runs from
storage to core.
|
||||
|
||||
### AuthProtocol irpc Service
|
||||
|
||||
The `AuthProtocol` irpc service (behind the `irpc` feature flag per ADR-028)
|
||||
provides an async boundary for auth verification. It is one way to satisfy the
|
||||
`IdentityProvider` trait, not a replacement for it:
|
||||
|
||||
```rust
enum AuthProtocol {
    VerifyPubkey { fingerprint: String, key_data: Vec<u8> },
    VerifyToken { token_bytes: Vec<u8>, timestamp: u64 },
    ReloadKeys,
    CheckAccess { identity: Identity, operation: String },
}

enum AuthResult {
    Ok(Identity),
    Denied(String),
}
```
|
||||
|
||||
The relationship:
|
||||
- **Trait-based path**: Handler calls `identity_provider.resolve_from_fingerprint()`
  directly, with no serialization or service hop. Used when irpc is disabled or
  when the implementation is local.
|
||||
- **irpc path**: Handler calls `identity_provider.resolve_from_fingerprint()`,
|
||||
which internally delegates to `AuthProtocol::VerifyPubkey` via an irpc client.
|
||||
Used in production deployments with SQLite-backed auth.
|
||||
|
||||
Both paths produce the same `Identity` result.
|
||||
|
||||
### Auth Flows
|
||||
|
||||
**SSH key auth** (existing, unchanged):
|
||||
```
Client connects → SSH handshake → auth_publickey() callback
  → IdentityProvider::resolve_from_fingerprint(fingerprint)
  → Some(Identity) or None
```
|
||||
|
||||
**Token auth** (new, for non-SSH transports):
|
||||
```
Browser connects → WebTransport CONNECT request
  → Extract token from URL path or Authorization header
  → IdentityProvider::resolve_from_token(token)
  → Some(Identity) or None
```
|
||||
|
||||
Both paths produce an `Identity`. The `Identity` is attached to the connection
|
||||
and used by `ForwardingPolicy` and call protocol for authorization decisions.
|
||||
|
||||
## Constraints
|
||||
|
||||
- `Identity` and `IdentityProvider` live in `alknet_core::auth`. No database
|
||||
dependency at the core level (ADR-029).
|
||||
- alknet-storage implements the core trait — the dependency goes from storage
|
||||
to core, not the other way.
|
||||
- The `id` field in `Identity` serves dual purpose (fingerprint or UUID). This
|
||||
is a deliberate simplification — downstream consumers don't need to know the
|
||||
source.
|
||||
- Certificate authority tokens are not supported for token auth in v1 (ADR-023).
|
||||
- The irpc feature flag means nodes that only do SSH tunneling don't need the
|
||||
service layer overhead.
|
||||
|
||||
## Open Questions
|
||||
|
||||
- None specific to this spec. See [open-questions.md](open-questions.md) for
|
||||
general auth questions (OQ-15, OQ-19).
|
||||
|
||||
## Design Decisions
|
||||
|
||||
| ADR | Decision | Summary |
|-----|----------|---------|
| [029](decisions/029-identity-core-type.md) | Identity as core type | `Identity` and `IdentityProvider` live in alknet-core, not storage |
| [028](decisions/028-auth-irpc-service.md) | Auth as irpc service | `AuthProtocol` behind feature flag; `IdentityProvider` is the contract |
| [023](decisions/023-unified-auth-shared-key-material.md) | Unified auth | Same key material for SSH and token auth; same `Identity` result |
|
||||
|
||||
## References
|
||||
|
||||
- [auth.md](auth.md) — Token authentication, AuthPolicy, WebTransport session handling
|
||||
- [research/services.md](../research/services.md) — AuthService, AuthProtocol definition
|
||||
- [research/integration-plan.md](../research/integration-plan.md) — Phase 1.2
|
||||
- [ADR-030](decisions/030-static-dynamic-config-split.md) — DynamicConfig (ConfigIdentityProvider reads from it)
|
||||
- [ADR-031](decisions/031-forwarding-policy.md) — ForwardingPolicy consumes Identity.scopes
|
||||
221
docs/architecture/interface.md
Normal file
@@ -0,0 +1,221 @@
|
||||
---
|
||||
status: draft
|
||||
last_updated: 2026-06-07
|
||||
---
|
||||
|
||||
# Interface (Layer 2)
|
||||
|
||||
## What
|
||||
|
||||
The Interface layer sits between Transport (Layer 1) and Protocol (Layer 3).
|
||||
An Interface consumes a `Transport::Stream` and produces call protocol sessions.
|
||||
SSH is an interface, not a transport — it wraps a byte stream in session
|
||||
semantics. Raw framing (4-byte length prefix + JSON `EventEnvelope`) is another
|
||||
interface, one without SSH overhead.
|
||||
|
||||
## Why
|
||||
|
||||
In the current architecture, SSH is deeply embedded in `ServerHandler`. This
|
||||
tangling of transport, interface, and protocol makes it impossible to:
|
||||
|
||||
- Run the call protocol over DNS queries without wrapping SSH inside DNS
|
||||
- Use raw framing for local service mesh (no SSH overhead)
|
||||
- Support WebTransport direct call protocol for browsers
|
||||
- Separate auth mechanics from channel management
|
||||
|
||||
The three-layer model (ADR-026) cleanly separates these concerns. Transport
|
||||
produces bytes. Interface parses bytes into sessions. Protocol carries
|
||||
semantics. A connection is always a (Transport, Interface) pair.
|
||||
|
||||
## Architecture
|
||||
|
||||
### Three-Layer Model
|
||||
|
||||
```
Layer 3: Protocol   (Call protocol, Operations, OperationEnv)
Layer 2: Interface  (SSH, raw framing, HTTP/WS, DNS control channel)
Layer 1: Transport  (TCP, TLS, iroh, DNS, WebTransport)
```
|
||||
|
||||
- **Layer 1: Transport** — produces byte streams
  (`AsyncRead + AsyncWrite + Unpin + Send`). Unchanged per ADR-001.
|
||||
- **Layer 2: Interface** — consumes a `Transport::Stream` and produces call
|
||||
protocol sessions. SSH does handshake + auth + channel multiplexing. Raw
|
||||
framing does length-prefix parsing.
|
||||
- **Layer 3: Protocol** — carries semantics. Call protocol events, operation
|
||||
registry, service calls. Agnostic to both Transport and Interface below it.
|
||||
|
||||
### Interface Trait
|
||||
|
||||
```rust
#[async_trait]
pub trait Interface: Send + Sync + 'static {
    type Session;
    async fn accept(stream: TransportStream, config: &InterfaceConfig) -> Result<Self::Session>;
}
```
|
||||
|
||||
The session produced by an interface is consumed by the call protocol handler.
|
||||
Different interfaces produce different session types, but the call protocol
|
||||
handler receives `EventEnvelope` frames from any interface.
|
||||
|
||||
### SshInterface
|
||||
|
||||
Wraps the existing `ServerHandler` logic. This is the most complex interface
|
||||
because SSH provides channel multiplexing, auth negotiation, and proxy
|
||||
management within a single session.
|
||||
|
||||
What stays in SshInterface (Layer 2):
|
||||
- SSH handshake and session management
|
||||
- Auth delegation to `IdentityProvider` (via `auth_publickey()` callback)
|
||||
- Channel multiplexing (multiple channels per session)
|
||||
- `alknet-control:0` channel routing to call protocol
|
||||
|
||||
What moves to Layer 3 (call protocol handler):
|
||||
- Operation registry and dispatch
|
||||
- Forwarding policy checks (per ADR-031)
|
||||
- Operation context construction (Identity, scopes)
|
||||
|
||||
What moves to per-connection state:
|
||||
- Port forwarding proxy logic
|
||||
|
||||
### RawFramingInterface

Reads 4-byte big-endian length prefix + JSON `EventEnvelope` frames directly
from the transport stream. No SSH wrapping. No channel multiplexing — the
entire stream is a single call protocol channel.

```rust
pub struct RawFramingInterface;

impl Interface for RawFramingInterface {
    type Session = RawFramingSession;
    // Reads length-prefixed EventEnvelope frames from the stream
}
```

Used for:
- DNS control channel (DNS transport + raw framing)
- Local service mesh (TCP + raw framing, no SSH overhead)
- Browser direct call protocol (WebTransport + raw framing, future)

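The framing described for `RawFramingInterface` (4-byte big-endian length prefix, then the JSON payload) can be sketched with blocking `std::io` traits. This is only an illustration, not the planned async implementation: `write_frame`, `read_frame`, and the `MAX_FRAME_LEN` cap are hypothetical names and values, and the payload is treated as opaque bytes rather than a parsed `EventEnvelope`.

```rust
use std::io::{self, ErrorKind, Read, Write};

/// Assumed upper bound on one frame body; the spec does not state a limit.
const MAX_FRAME_LEN: usize = 1 << 20;

/// Writes one frame: 4-byte big-endian length, then the payload bytes.
pub fn write_frame<W: Write>(writer: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN {
        return Err(io::Error::new(ErrorKind::InvalidInput, "frame too large"));
    }
    let len = payload.len() as u32;
    writer.write_all(&len.to_be_bytes())?;
    writer.write_all(payload)
}

/// Reads one frame. Returns `Ok(None)` when the stream ends cleanly
/// before the next length prefix starts.
pub fn read_frame<R: Read>(reader: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut prefix = [0u8; 4];
    let mut filled = 0;
    while filled < prefix.len() {
        match reader.read(&mut prefix[filled..]) {
            Ok(0) if filled == 0 => return Ok(None),
            Ok(0) => {
                return Err(io::Error::new(
                    ErrorKind::UnexpectedEof,
                    "truncated length prefix",
                ))
            }
            Ok(n) => filled += n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    let len = u32::from_be_bytes(prefix) as usize;
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(ErrorKind::InvalidData, "frame too large"));
    }
    let mut body = vec![0u8; len];
    reader.read_exact(&mut body)?;
    Ok(Some(body))
}
```

Rejecting oversized lengths before allocating keeps a malformed or hostile prefix from forcing a multi-gigabyte buffer, which matters more here than under SSH because no outer protocol bounds the frame size.
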
### DNS Control Channel

A (DNS transport, raw framing interface) pair. The DNS transport encodes
`EventEnvelope` frames as DNS query/response pairs. The raw framing interface
parses them directly — **NOT** SSH inside DNS.

```
Client: Encode EventEnvelope as base32 DNS query labels
    → DNS Transport → DNS Server → Raw Framing Interface → Call Protocol Handler

Server: Return EventEnvelope as DNS TXT record response
    ← Raw Framing Interface ← DNS Transport ← Call Protocol Handler
```

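The client-side step "encode `EventEnvelope` as base32 DNS query labels" might look like the sketch below. It uses the RFC 4648 base32 alphabet without padding and splits the result into labels of at most 63 characters (the RFC 1035 label limit). The function names and the zone argument are hypothetical, and the sketch ignores the overall name length limit, so a real transport would presumably have to fragment larger frames across several queries.

```rust
/// RFC 4648 base32 alphabet. DNS names are case-insensitive, so case is not significant.
const BASE32_ALPHABET: &[u8; 32] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

/// Maximum length of a single DNS label (RFC 1035).
const MAX_LABEL_LEN: usize = 63;

/// Encodes bytes as unpadded base32.
pub fn base32_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() * 8 + 4) / 5);
    let mut buffer: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of valid bits in the accumulator
    for &byte in input {
        buffer = (buffer << 8) | u32::from(byte);
        bits += 8;
        while bits >= 5 {
            bits -= 5;
            let index = ((buffer >> bits) & 0x1f) as usize;
            out.push(BASE32_ALPHABET[index] as char);
        }
        // Keep only the bits not yet emitted so the accumulator cannot overflow.
        buffer &= (1 << bits) - 1;
    }
    if bits > 0 {
        // Left-align the remaining bits in a final 5-bit group.
        let index = ((buffer << (5 - bits)) & 0x1f) as usize;
        out.push(BASE32_ALPHABET[index] as char);
    }
    out
}

/// Builds a query name: base32 labels of at most 63 characters, then the zone.
pub fn to_query_name(frame: &[u8], zone: &str) -> String {
    let encoded = base32_encode(frame);
    let mut labels: Vec<&str> = encoded
        .as_bytes()
        .chunks(MAX_LABEL_LEN)
        .map(|chunk| std::str::from_utf8(chunk).expect("base32 output is ASCII"))
        .collect();
    labels.push(zone);
    labels.join(".")
}
```

Base32 is a natural fit for query labels because it survives case folding by resolvers, which base64 does not.
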
### Valid (Transport, Interface) Pairs

| Transport | Interface | Use case |
|-----------|-----------|----------|
| TLS | SSH | Standard alknet tunnel |
| TCP | SSH | Plain SSH tunnel |
| iroh | SSH | P2P SSH tunnel |
| DNS | raw framing | DNS control channel |
| WebTransport | SSH | Browser SSH tunnel (future) |
| WebTransport | raw framing | Browser call protocol (future) |
| TCP | raw framing | Direct call protocol, local mesh |

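One way to enforce the table of valid pairs at listener setup is a small lookup function. In this sketch the `TransportKind` variants other than `Dns` and `WebTransport` are inferred from the table, `InterfaceKind` is a hypothetical enum, and rejecting every unlisted pair is an assumption rather than something the spec states.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TransportKind {
    Tcp,
    Tls,
    Iroh,
    Dns,
    WebTransport,
}

/// Hypothetical tag for the Layer 2 interface in use.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InterfaceKind {
    Ssh,
    RawFraming,
}

/// Returns true only for the (Transport, Interface) pairs listed in the table.
pub fn is_valid_pair(transport: TransportKind, interface: InterfaceKind) -> bool {
    use InterfaceKind::*;
    use TransportKind::*;
    matches!(
        (transport, interface),
        (Tls, Ssh)
            | (Tcp, Ssh)
            | (Iroh, Ssh)
            | (Dns, RawFraming)
            | (WebTransport, Ssh)
            | (WebTransport, RawFraming)
            | (Tcp, RawFraming)
    )
}
```

With this shape, a configuration loader could reject a combination such as (DNS, SSH) before any listener is started.
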
### InterfaceConfig

Different interfaces require different configuration:

```rust
pub enum InterfaceConfig {
    Ssh(SshInterfaceConfig),
    RawFraming(RawFramingConfig),
}

pub struct SshInterfaceConfig {
    pub auth: Arc<dyn IdentityProvider>,
    pub forwarding: Arc<ArcSwap<DynamicConfig>>, // for ForwardingPolicy
    pub host_key: Arc<PrivateKey>,
}

pub struct RawFramingConfig {
    // No SSH-specific config needed
    // Auth is handled by the transport layer (e.g., token auth for WebTransport)
    // or by the call protocol layer
}
```

### Auth Across Interfaces

- **SshInterface**: Auth happens during SSH handshake via
  `IdentityProvider::resolve_from_fingerprint()`. The authenticated `Identity`
  is attached to the session.
- **RawFramingInterface**: Auth is handled by the transport (e.g., token auth
  for WebTransport via `IdentityProvider::resolve_from_token()`) or by the call
  protocol layer (operation-level ACL).

Both paths produce the same `Identity` type (ADR-029).

### Server Accept Loop

With the Interface trait, the accept loop becomes:

```rust
for listener in listeners {
    let (transport, interface) = listener;
    tokio::spawn(async move {
        loop {
            let stream = transport.accept().await?;
            let session = interface.accept(stream, &config).await?;
            // session produces call protocol events
            // call protocol handler is interface-agnostic
        }
    });
}
```

## Constraints

- The Interface trait must accommodate both SSH's channel multiplexing and raw
  framing's single-stream model through the same abstraction.
- `SshInterface` is the most invasive refactoring in Phase 1. The existing
  `ServerHandler` owns auth, channel management, and proxy logic — extracting
  these cleanly requires careful design (integration-plan, Phase 1.8).
- DNS transport implementation is Phase 4 work. The `TransportKind::Dns` variant
  and `RawFramingInterface` are defined now; implementation is deferred.
- WebTransport is Phase 4 work. The `TransportKind::WebTransport` variant is a
  tag only for now.

## Open Questions

- **OQ-IF-01**: How does the `Interface` session type relate to the call
  protocol's `EventEnvelope` stream? Does every session implement
  `Stream<Item=EventEnvelope>`? This needs design during Phase 1.8.

- **OQ-IF-02**: Should `SshInterface` own the `ForwardingPolicy` check for
  `channel_open_direct_tcpip`, or should that move to Layer 3? Current thinking:
  the forwarding check is a Layer 3 concern (it's policy, not session mechanics),
  but the channel open/close lifecycle is Layer 2. The Interface reports channel
  open requests to Layer 3; Layer 3 applies `ForwardingPolicy` and tells
  Layer 2 whether to proxy.

## Design Decisions

| ADR | Decision | Summary |
|-----|----------|---------|
| [026](decisions/026-transport-interface-separation.md) | Three-layer model | SSH is Layer 2, not Layer 1 |
| [033](decisions/033-operationenv-irpc-call-protocol.md) | OperationEnv | Protocol is interface-agnostic |
| [029](decisions/029-identity-core-type.md) | Identity as core type | Auth resolution across interfaces |
| [031](decisions/031-forwarding-policy.md) | Forwarding policy | Layer 3 policy applied to Layer 2 channel requests |

## References

- [research/integration-plan.md](../research/integration-plan.md) — Phase 1.8, valid (Transport, Interface) pairs
- [research/core.md](../research/core.md) — DNS transport, three-layer model
- [ADR-026](decisions/026-transport-interface-separation.md) — Transport/interface separation
- [transport.md](transport.md) — Transport trait (unchanged at Layer 1)
- [server.md](server.md) — Current ServerHandler (will become SshInterface)
- [identity.md](identity.md) — IdentityProvider, auth across interfaces
@@ -1,6 +1,6 @@
---
status: draft
last_updated: 2026-06-04
last_updated: 2026-06-07
---

# Open Questions
@@ -96,10 +96,10 @@ last_updated: 2026-06-04

### OQ-12: Per-user forwarding scope vs global rules
- **Origin**: [research/configuration.md](../research/configuration.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending)
- **Cross-references**: configuration.md
- **Status**: ~~resolved~~
- **Priority**: ~~medium~~ —
- **Resolution**: ADR-031 — Start with global rules + principal matching from `Identity.scopes`. Per-user scope from `peer_credentials.metadata.scopes` via `IdentityProvider`. The `ForwardingPolicy` evaluates rules against `Identity.id` and `Identity.scopes` from the authenticated identity.
- **Cross-references**: [ADR-031](decisions/031-forwarding-policy.md), [configuration.md](configuration.md)

### OQ-13: Config file auto-reload via file watching
- **Origin**: [research/configuration.md](../research/configuration.md)
@@ -119,38 +119,59 @@ last_updated: 2026-06-04
- **Origin**: [research/configuration.md](../research/configuration.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending — needs R&D in WebTransport transport session)
- **Cross-references**: [auth.md](auth.md), OQ-19
- **Resolution**: (deferred to Phase 4 — needs R&D in WebTransport transport session)
- **Cross-references**: [auth.md](auth.md), OQ-19, [interface.md](interface.md)

### OQ-16: Transport-specific forwarding policy (e.g., WebTransport clients restricted to alknet-* channels)
- **Origin**: [research/configuration.md](../research/configuration.md)
- **Status**: open
- **Priority**: low
- **Resolution**: (pending — defer to forwarding policy design)
- **Cross-references**: configuration.md
- **Status**: ~~resolved~~
- **Priority**: ~~low~~ —
- **Resolution**: ADR-031 — Add `TransportKind` match in `ForwardingRule`. WebTransport clients can be restricted to `alknet-*` channels via `TargetPattern::AlknetPrefix` combined with a `TransportKind::WebTransport` filter.
- **Cross-references**: [ADR-031](decisions/031-forwarding-policy.md), [configuration.md](configuration.md)

### OQ-17: Transport-aware auth layer (SSH keys vs API keys for non-SSH transports)
- **Origin**: [research/configuration.md](../research/configuration.md)
- **Status**: ~~resolved~~
- **Priority**: ~~medium~~ —
- **Resolution**: ADR-023 — Unified auth with shared key material. SSH transports use SSH pubkey auth. Non-SSH transports (WebTransport) use Ed25519-signed timestamp tokens. Both verify against the same `authorized_keys` set. The presentation differs per transport, but the identity is unified. `AuthPolicy` holds both `SshAuthConfig` and `TokenAuthConfig`, with `TokenKeySource::Shared` as the default (same keys for both paths). `IdentityProvider` trait decouples alknet-core from identity storage.
- **Cross-references**: [ADR-023](decisions/023-unified-auth-shared-key-material.md), [auth.md](auth.md), OQ-15
- **Cross-references**: [ADR-023](decisions/023-unified-auth-shared-key-material.md), [identity.md](identity.md), OQ-15

### OQ-23: irpc dependency — always or behind feature flag?
- **Origin**: [research/integration-plan.md](../research/integration-plan.md)
- **Status**: ~~resolved~~
- **Priority**: medium —
- **Resolution**: ADR-027 — Feature flag. Nodes that only do SSH tunneling don't need the service layer. irpc is behind a feature flag in alknet-core and an independent dependency in alknet-secret and alknet-storage.
- **Cross-references**: [ADR-027](decisions/027-crate-decomposition.md)

### OQ-24: DNS control channel scope for initial implementation?
- **Origin**: [research/integration-plan.md](../research/integration-plan.md)
- **Status**: ~~resolved~~
- **Priority**: medium —
- **Resolution**: ADR-026 — DNS control channel carries call protocol frames only (no SSH tunneling over DNS). The (DNS transport, raw framing interface) pair sends `EventEnvelope` directly. SSH-over-DNS is a future possibility but out of scope.
- **Cross-references**: [ADR-026](decisions/026-transport-interface-separation.md), [interface.md](interface.md)

### OQ-25: alknet-storage and alknet-secret irpc dependency
- **Origin**: [research/integration-plan.md](../research/integration-plan.md)
- **Status**: ~~resolved~~
- **Priority**: low —
- **Resolution**: ADR-027 — Independently. They're separate crates. irpc is a shared library they both use as an independent dependency.
- **Cross-references**: [ADR-027](decisions/027-crate-decomposition.md)

## Auth

### OQ-18: Source of Identity.scopes — ForwardingPolicy, IdentityProvider, or both?
- **Origin**: [auth.md](auth.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending)
- **Cross-references**: ADR-023, [call-protocol.md](call-protocol.md)
- **Status**: ~~resolved~~
- **Priority**: ~~medium~~ —
- **Resolution**: ADR-029 and ADR-031 — `IdentityProvider` owns scopes. The `Identity` struct includes `scopes` and `resources` fields populated by the `IdentityProvider` implementation (config-based or database-backed). `ForwardingPolicy` uses scopes from `Identity` — it consumes them, it doesn't produce them.
- **Cross-references**: [ADR-029](decisions/029-identity-core-type.md), [ADR-031](decisions/031-forwarding-policy.md), [identity.md](identity.md)

### OQ-19: Separate TLS identity for WebTransport vs shared with SSH-over-TLS?
- **Origin**: [auth.md](auth.md)
- **Status**: open
- **Priority**: low
- **Resolution**: (pending)
- **Cross-references**: OQ-15
- **Resolution**: (deferred to Phase 4 — QUIC is UDP, TLS-over-TCP is TCP, they can share port 443 without conflict)
- **Cross-references**: OQ-15, [interface.md](interface.md)

## Call Protocol

@@ -158,19 +179,65 @@ last_updated: 2026-06-04
- **Origin**: [call-protocol.md](call-protocol.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending — registration on connect / cleanup on disconnect is the leading approach)
- **Resolution**: (pending — registration on connect / cleanup on disconnect is the leading approach but needs spec in call-protocol.md)
- **Cross-references**: ADR-024, ADR-025

### OQ-21: Routing calls to specific workers with same-service operations
- **Origin**: [call-protocol.md](call-protocol.md)
- **Status**: ~~resolved~~
- **Priority**: ~~medium~~ —
- **Resolution**: ADR-024, ADR-025 — Operation paths use `/{node}/{service}/{op}` format. The first path segment identifies the node and routes the call to the correct connected node. Multiple workers exposing the same service (e.g., two dev envs both with `/fs/*`) are differentiated by the node prefix (`/dev1/fs/readFile` vs `/dev2/fs/readFile`). The head maintains a routing table mapping node identity to connection. This mirrors iroh's ALPN dispatch: first segment = routing key.
- **Resolution**: ADR-024, ADR-025 — Operation paths use `/{node}/{service}/{op}` format. The first path segment identifies the node and routes the call to the correct connected node. Multiple workers exposing the same service are differentiated by the node prefix (`/dev1/fs/readFile` vs `/dev2/fs/readFile`). The head maintains a routing table mapping node identity to connection.
- **Cross-references**: [call-protocol.md](call-protocol.md), ADR-024, ADR-025

### OQ-22: Client streaming (streaming inputs) in the call protocol?
- **Origin**: [call-protocol.md](call-protocol.md)
- **Status**: ~~resolved~~
- **Priority**: ~~low~~ —
- **Resolution**: Deferred. Current model (single request, optional streaming response) covers all identified use cases. Client streaming can be added later if needed.
- **Cross-references**: ADR-024

## Services

### OQ-SVC-01: Should the secret service support multiple seed phrases (one per tenant)?
- **Origin**: [secret-service.md](secret-service.md)
- **Status**: open
- **Priority**: low
- **Resolution**: (pending)
- **Cross-references**: ADR-024
- **Resolution**: (deferred — one seed per node is simplest; multi-seed can be added later by indexing `Unlock` with a tenant ID)
- **Cross-references**: [secret-service.md](secret-service.md)

### OQ-SVC-02: Should service protocols use postcard (binary) or JSON for remote calls?
- **Origin**: [research/services.md](../research/services.md)
- **Status**: ~~resolved~~
- **Priority**: low —
- **Resolution**: Postcard for irpc (Rust-to-Rust, efficient). JSON for call protocol (cross-language, universal). The irpc remote path naturally uses postcard.
- **Cross-references**: [services.md](services.md)

### OQ-SVC-03: How does the secret service integrate with the existing EncryptedDataSchema from @alkdev/storage?
- **Origin**: [secret-service.md](secret-service.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending — Rust implementation replaces PBKDF2 password-based encryption with derived AES-256-GCM keys; EncryptedData format is a superset; migration by re-encrypting)
- **Cross-references**: [secret-service.md](secret-service.md), [storage.md](storage.md)

### OQ-SVC-04: Should workers cache derived keys locally?
- **Origin**: [secret-service.md](secret-service.md)
- **Status**: open
- **Priority**: low
- **Resolution**: Yes, with a TTL (default: 1 hour). The head can revoke by invalidating the session.
- **Cross-references**: [secret-service.md](secret-service.md)

## Interface

### OQ-IF-01: How does the Interface session type relate to the call protocol's EventEnvelope stream?
- **Origin**: [interface.md](interface.md)
- **Status**: open
- **Priority**: high
- **Resolution**: (pending — needs design during Phase 1.8 implementation)
- **Cross-references**: [interface.md](interface.md), [ADR-026](decisions/026-transport-interface-separation.md)

### OQ-IF-02: Should SshInterface own ForwardingPolicy checks or should they move to Layer 3?
- **Origin**: [interface.md](interface.md)
- **Status**: open
- **Priority**: medium
- **Resolution**: (pending — current thinking: forwarding check is Layer 3 policy, but channel open/close lifecycle is Layer 2. The Interface reports channel open requests to Layer 3; Layer 3 applies ForwardingPolicy.)
- **Cross-references**: [interface.md](interface.md), [ADR-031](decisions/031-forwarding-policy.md)
197
docs/architecture/secret-service.md
Normal file
@@ -0,0 +1,197 @@
---
status: draft
last_updated: 2026-06-07
---

# Secret Service

## What

The `alknet-secret` crate provides BIP39 mnemonic generation, SLIP-0010 Ed25519
HD key derivation, AES-256-GCM encryption for external credentials, and the
`SecretProtocol` irpc service. It is the only component that holds the master
seed phrase.

## Why

Operations like SSH key generation, API key storage, and Ethereum transaction
signing all need deterministic key derivation from a single root of trust. The
seed phrase is the single recovery mechanism — from it, all self-generated
secrets can be derived on demand. External credentials (third-party API keys,
OAuth tokens) cannot be derived and must be stored encrypted, with the
encryption key itself derived from the seed.

The secret service isolates this responsibility: no other crate sees the seed,
and derived keys are provided on demand through an irpc service interface.

## Architecture

### Security Model

| State | What's in memory | What's on disk |
|-------|-----------------|---------------|
| Locked | Nothing | Encrypted database, derivation path metadata |
| Unlocked | Master seed in RAM | Same (seed is never persisted) |
| After use | Derived keys cached in RAM | Derivation paths only |

The seed phrase is entered once (at node startup or via `Unlock` call), held
only in RAM, and never written to disk. The `Lock` call purges the seed and all
cached derived keys from memory.

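The Locked and Unlocked states in the table can be modelled as a small in-memory holder. This is a sketch only: `SeedVault` and its methods are hypothetical names, not part of `alknet-secret`, and the plain byte overwrite in `lock` is an assumption about intent. A production implementation would likely need a dedicated zeroization mechanism, since a compiler is free to elide writes to memory that is about to be dropped.

```rust
use std::collections::HashMap;

/// Hypothetical holder for the master seed and the derived-key cache.
pub struct SeedVault {
    seed: Option<Vec<u8>>,
    derived: HashMap<String, Vec<u8>>,
}

impl SeedVault {
    /// Starts in the Locked state: nothing sensitive is in memory.
    pub fn locked() -> Self {
        SeedVault { seed: None, derived: HashMap::new() }
    }

    pub fn is_unlocked(&self) -> bool {
        self.seed.is_some()
    }

    /// Locked -> Unlocked. The seed lives in RAM only and is never persisted.
    pub fn unlock(&mut self, seed: Vec<u8>) {
        self.lock(); // purge any previous state first
        self.seed = Some(seed);
    }

    /// Caches a derived key under its derivation path; refused while locked.
    pub fn cache_derived(&mut self, path: &str, key: Vec<u8>) -> Result<(), &'static str> {
        if !self.is_unlocked() {
            return Err("vault is locked");
        }
        self.derived.insert(path.to_string(), key);
        Ok(())
    }

    /// Looks up a cached derived key by derivation path.
    pub fn cached(&self, path: &str) -> Option<&[u8]> {
        self.derived.get(path).map(|key| key.as_slice())
    }

    /// Unlocked -> Locked: overwrite, then drop, the seed and every cached key.
    pub fn lock(&mut self) {
        if let Some(seed) = self.seed.as_mut() {
            seed.iter_mut().for_each(|byte| *byte = 0);
        }
        self.seed = None;
        for key in self.derived.values_mut() {
            key.iter_mut().for_each(|byte| *byte = 0);
        }
        self.derived.clear();
    }
}
```

Keeping the cache inside the same structure as the seed makes the `Lock` guarantee (seed and cached keys purged together) a single code path rather than a convention.
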
### SecretProtocol irpc Service

```rust
#[rpc_requests(message = SecretMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum SecretProtocol {
    #[rpc(tx=oneshot::Sender<DerivedKey>)]
    #[wrap(DeriveEd25519)]
    DeriveEd25519 { path: String },

    #[rpc(tx=oneshot::Sender<DerivedKey>)]
    #[wrap(DeriveEncryptionKey)]
    DeriveEncryptionKey { path: String },

    #[rpc(tx=oneshot::Sender<DerivedKey>)]
    #[wrap(DeriveEthereumKey)]
    DeriveEthereumKey { path: String },

    #[rpc(tx=oneshot::Sender<Vec<u8>>)]
    #[wrap(DerivePassword)]
    DerivePassword { path: String, length: usize },

    #[rpc(tx=oneshot::Sender<EncryptedData>)]
    #[wrap(Encrypt)]
    Encrypt { plaintext: String, key_version: u32 },

    #[rpc(tx=oneshot::Sender<String>)]
    #[wrap(Decrypt)]
    Decrypt { encrypted: EncryptedData },

    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(Lock)]
    Lock,

    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(Unlock)]
    Unlock { passphrase: String },
}

#[derive(Debug, Serialize, Deserialize)]
struct DerivedKey {
    key_type: KeyType,
    private_key: Vec<u8>,
    public_key: Vec<u8>,
}

#[derive(Debug, Serialize, Deserialize)]
enum KeyType {
    Ed25519,
    Aes256Gcm,
    Secp256k1,
}

#[derive(Debug, Serialize, Deserialize)]
struct EncryptedData {
    key_version: u32,
    salt: String, // Base64-encoded
    iv: String,   // Base64-encoded
    data: String, // Base64-encoded
}
```

### BIP39 Mnemonic and Seed Derivation

```rust
let mnemonic = Mnemonic::from_phrase(&phrase, Language::English)?;
let seed = mnemonic.to_seed(Some(&passphrase));
let master_key = ExtendedPrivKey::new_master(Network::Alknet, &seed)?;
```

### SLIP-0010 Ed25519 HD Key Derivation

The `74'` coin type is unallocated per SLIP-0044 and reserved for alknet.

### Derivation Path Constants

| Path | Purpose | Curve/Algorithm |
|------|---------|----------------|
| `m/74'/0'/0'/0'` | Primary identity keypair | Ed25519 (alknet auth) |
| `m/74'/0'/0'/{n}'` | Worker/device identity | Ed25519 |
| `m/74'/0'/1'/0'` | SSH host key | Ed25519 |
| `m/74'/1'/0'/{hash}'` | Site-specific password | Deterministic |
| `m/74'/2'/0'/0'` | Encryption key for external credentials | AES-256-GCM |
| `m/44'/60'/0'/0/0` | Ethereum signing key | secp256k1 |

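The path strings in the table can be turned into child indices before derivation. The sketch below parses a path and applies the hardened bit (`0x8000_0000`) to segments that end in `'`. SLIP-0010 defines only hardened derivation for Ed25519, so the check at the end accepts the `m/74'/...` paths and flags the Ethereum path, which ends in non-hardened segments and belongs to secp256k1. `parse_path`, `PathError`, and `is_ed25519_compatible` are hypothetical names, not the crate's API.

```rust
/// Bit that marks a hardened child index.
const HARDENED: u32 = 0x8000_0000;

#[derive(Debug, PartialEq, Eq)]
pub enum PathError {
    MissingRoot,
    EmptySegment,
    InvalidIndex(String),
    IndexTooLarge(u32),
}

/// Parses a path such as "m/74'/0'/0'/0'" into child indices,
/// with the hardened bit set on segments that end in an apostrophe.
pub fn parse_path(path: &str) -> Result<Vec<u32>, PathError> {
    let mut segments = path.split('/');
    if segments.next() != Some("m") {
        return Err(PathError::MissingRoot);
    }
    let mut indices = Vec::new();
    for segment in segments {
        let (digits, hardened) = match segment.strip_suffix('\'') {
            Some(rest) => (rest, true),
            None => (segment, false),
        };
        if digits.is_empty() {
            return Err(PathError::EmptySegment);
        }
        let index: u32 = digits
            .parse()
            .map_err(|_| PathError::InvalidIndex(segment.to_string()))?;
        if index >= HARDENED {
            return Err(PathError::IndexTooLarge(index));
        }
        indices.push(if hardened { index | HARDENED } else { index });
    }
    Ok(indices)
}

/// True when every index is hardened, as SLIP-0010 requires for Ed25519.
pub fn is_ed25519_compatible(indices: &[u32]) -> bool {
    indices.iter().all(|index| (*index & HARDENED) != 0)
}
```

Placeholders such as `{n}` and `{hash}` would need to be substituted with concrete values below 2^31 before parsing.
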
### AES-256-GCM Encryption for External Credentials

External credentials (API keys, OAuth tokens) that cannot be derived are
encrypted using a key derived from the seed at path `m/74'/2'/0'/0'`. The
`EncryptedData` type stores the key version, salt, IV, and ciphertext. This
format is compatible with the existing `@alkdev/storage` `EncryptedDataSchema`.

1. The secret service derives an AES-256-GCM key via path `m/74'/2'/0'/0'`
2. External credentials are encrypted with this key
3. The encrypted data is stored as a `SecretNode` in the metagraph
4. Only the derivation path and key version are stored in plain attributes
5. The seed phrase (or derived encryption key) is held only by the secret
   service — never in the database

### Deployment Topologies

**Minimal (single node, CLI)**: Secret service runs in the same process. Seed
phrase entered at startup. All keys derived locally. No irpc overhead.

**Production (head node)**: Secret service runs on a dedicated node or as a
local irpc service. Workers request derived keys via irpc over QUIC. The seed
never leaves the secret service node.

## Constraints

- The seed phrase is never persisted to disk. It is entered at startup or via
  `Unlock` and held only in RAM.
- `Lock` purges the seed and all cached derived keys from memory.
- alknet-secret does not depend on alknet-core or alknet-storage. It is fully
  independent.
- The `EncryptedData` wire format (key_version, salt, iv, data) is shared with
  alknet-storage for compatibility, but this is type-level compatibility — not a
  crate dependency.
- Per ADR-032, the secret service's Honker streams (key derivation notifications)
  stay within the service boundary. External consumers use irpc calls or call
  protocol operations that project to integration events.
- The irpc service defines the wire format for in-cluster communication
  (postcard serialization). For call protocol exposure (e.g.,
  `/head/secrets/derive`), the service is wrapped in an operation that serializes
  to JSON.

## Open Questions

- **OQ-SVC-01**: Should the secret service support multiple seed phrases (one per
  tenant)? See [open-questions.md](open-questions.md).

- **OQ-SVC-02**: Should service protocols use postcard (binary) or JSON for
  remote calls? Postcard for irpc (Rust-to-Rust), JSON for call protocol
  (cross-language). See [open-questions.md](open-questions.md).

- **OQ-SVC-03**: How does the secret service integrate with the existing
  `EncryptedDataSchema` from `@alkdev/storage`? The Rust implementation replaces
  PBKDF2 password-based encryption with derived AES-256-GCM keys. The
  `EncryptedData` format is a superset.

- **OQ-SVC-04**: Should workers cache derived keys locally? Yes, with a TTL
  (default: 1 hour). The head can revoke by invalidating the session.

## Design Decisions

| ADR | Decision | Summary |
|-----|----------|---------|
| [027](decisions/027-crate-decomposition.md) | Crate decomposition | alknet-secret is independent of core and storage |
| [032](decisions/032-event-boundary-discipline.md) | Event boundary | Secret service domain events stay internal |

## References

- [research/services.md](../research/services.md) — SecretProtocol definition, DerivedKey, KeyType
- [research/storage.md](../research/storage.md) — Secrets section, derivation paths, EncryptedData
- [research/integration-plan.md](../research/integration-plan.md) — Phase 2.1
- SLIP-0010 — https://github.com/satoshilabs/slips/blob/master/slip-0010.md
- BIP39 — https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki
211
docs/architecture/services.md
Normal file
@@ -0,0 +1,211 @@
---
status: draft
last_updated: 2026-06-07
---

# Services

## What

The irpc service layer decomposes alknet's core responsibilities into
independently testable, deployable, and replaceable components. Auth, Secret,
Config, and Storage are irpc protocol enums that work both as in-process async
boundaries (tokio channels) and cross-process/cross-network (QUIC streams via
noq). OperationEnv is the universal composition mechanism that unifies local
dispatch, irpc service dispatch, and remote call protocol dispatch.

## Why

Without the service layer, auth verification, key derivation, and config reload
are scattered across the codebase with no async boundary. For head nodes serving
many users, in-memory key lookup doesn't scale — auth needs to query a database
on demand. For secret management, the seed must be isolated in its own process
boundary.

Without OperationEnv, handlers calling other operations would need to know
whether the target is local, in-cluster, or on a remote node. OperationEnv
abstracts this away: `context.env.invoke("secrets", "derive", input)` works
regardless of dispatch path.

## Architecture

### Service Definition Pattern

Services are defined as irpc protocol enums:

```rust
#[rpc_requests(message = AuthMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum AuthProtocol {
    #[rpc(tx=oneshot::Sender<AuthResult>)]
    #[wrap(VerifyPubkey)]
    VerifyPubkey { fingerprint: String, key_data: Vec<u8> },
    // ...
}
```

The `#[rpc_requests]` macro generates two versions:
- **Serializable** (`Request`): for remote communication (postcard encoding)
- **With channels** (`RequestWithChannels`): for local communication (tokio channels)

Both use the same `Client<S>` type. The local/remote distinction is transparent
at the call site.

### Core Services

| Service | Protocol | Purpose | Always Local? |
|---------|----------|---------|---------------|
| **Auth** | `AuthProtocol` | Verify identities, check credentials | Can be remote |
| **Secret** | `SecretProtocol` | Derive keys, encrypt/decrypt | Local or remote |
| **Config** | `ConfigProtocol` | Dynamic config reload | Local |
| **Storage** | `StorageProtocol` | Graph CRUD, metagraph operations | Local or remote |

### OperationContext

Every handler receives an `OperationContext`:

```rust
pub struct OperationContext {
    pub request_id: String,
    pub parent_request_id: Option<String>,
    pub identity: Option<Identity>,
    pub metadata: HashMap<String, Value>,
    pub env: OperationEnv,
    pub trusted: bool, // set by buildEnv(), not by callers
}
```

- **`identity`**: The authenticated identity making the call. Populated by
  `IdentityProvider` from the interface layer.
- **`env`**: The operation environment — namespaced access to other operations.
- **`trusted`**: When a handler calls another operation through `env`, the
  nested call is `trusted` (skips ACL checks).

### OperationEnv — Universal Composition Mechanism

OperationEnv provides namespace + operation name → invoke with input, return
output. The handler doesn't know or care whether the dispatch is local, irpc,
or remote.

Three dispatch paths:

| Path | Mechanism | Serialization | Scope |
|------|-----------|---------------|-------|
| **Local** | Direct function call through registry | None (in-process) | Same process |
| **Service** | irpc protocol enum dispatch | postcard (binary) | Same cluster |
| **Remote** | Call protocol `EventEnvelope` | JSON | Cross-node |

All three produce the same `ResponseEnvelope`.

Service assembly determines which path each operation uses:

```rust
// Minimal deployment (single node, all local)
let env = OperationEnv::local(local_registry);

// Production deployment (mix of local and remote)
let env = OperationEnv::new()
    .local("auth", auth_registry)
    .local("config", config_registry)
    .service("secrets", secret_irpc_client)
    .remote("worker-1", call_protocol_conn);
```

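To make the three dispatch paths concrete, here is a deliberately simplified, synchronous toy model of namespace-based routing. It is not the `OperationEnv` API: `ToyEnv`, `Route`, and `echo_upper` are hypothetical, inputs and outputs are plain strings instead of a `ResponseEnvelope`, and the service and remote paths are stand-in closures rather than an irpc client or a call protocol connection.

```rust
use std::collections::HashMap;

/// Which dispatch path a namespace is bound to.
pub enum Route {
    /// In-process handlers keyed by operation name.
    Local(HashMap<String, fn(&str) -> String>),
    /// Stand-in for an irpc client (postcard in the real design).
    Service(Box<dyn Fn(&str, &str) -> String>),
    /// Stand-in for a call protocol connection (JSON in the real design).
    Remote(Box<dyn Fn(&str, &str) -> String>),
}

/// Toy environment: maps a namespace to one of the three routes.
pub struct ToyEnv {
    routes: HashMap<String, Route>,
}

impl ToyEnv {
    pub fn new() -> Self {
        ToyEnv { routes: HashMap::new() }
    }

    /// Binds a namespace to a route, builder style.
    pub fn bind(mut self, namespace: &str, route: Route) -> Self {
        self.routes.insert(namespace.to_string(), route);
        self
    }

    /// The caller supplies namespace, operation, and input; the route stays hidden.
    pub fn invoke(&self, namespace: &str, op: &str, input: &str) -> Result<String, String> {
        match self.routes.get(namespace) {
            None => Err(format!("unknown namespace: {}", namespace)),
            Some(Route::Local(handlers)) => handlers
                .get(op)
                .map(|handler| handler(input))
                .ok_or_else(|| format!("unknown operation: {}/{}", namespace, op)),
            Some(Route::Service(client)) | Some(Route::Remote(client)) => Ok(client(op, input)),
        }
    }
}

/// Example local handler used to exercise the toy environment.
pub fn echo_upper(input: &str) -> String {
    input.to_uppercase()
}
```

The point of the shape is that `invoke` has one signature regardless of route, so moving a namespace from local to remote is a change in assembly code only, never in handlers.
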
### Service vs Call Protocol vs External Service
|
||||
|
||||
These are different concepts that compose through OperationEnv:
|
||||
|
||||
- **irpc service**: In-cluster, Rust-to-Rust, type-safe, postcard serialization.
|
||||
Dispatched by enum variant. Example: `AuthProtocol::VerifyPubkey`.
|
||||
- **Call protocol operation**: Cross-node, cross-language, path-based, JSON
|
||||
`EventEnvelope`. Dispatched by namespace + name. Example:
|
||||
`/head/auth/verify`.
|
||||
- **External service**: Any endpoint reachable via the call protocol.
|
||||
Example: a vast.ai instance, an HTTP API, another head node.
|
||||
|
||||
An irpc service can back a call protocol operation. The OperationEnv routes to
|
||||
the appropriate dispatch path:
|
||||
|
||||
```
|
||||
Call Protocol (Layer 3, external, JSON)
|
||||
└── irpc Service (Layer 3, internal, postcard)
|
||||
└── Honker Streams (Domain events, within service boundary)
|
||||
```
|
||||
|
||||
### Adapters
|
||||
|
||||
HTTP, MCP, DNS, and call protocol adapters all resolve through OperationEnv:
|
||||
|
||||
- HTTP: `POST /v1/{namespace}/{op}` → `context.env.invoke(namespace, op, input)`
|
||||
- MCP: `tools/call` with tool name → `context.env.invoke(namespace, op, input)`
|
||||
- DNS: `{op}.{namespace}.alk.dev TXT?` → `context.env.invoke(namespace, op, input)`
|
||||
- Call protocol: `call.requested` with `operationId` → `context.env.invoke(namespace, op, input)`
|
||||
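As a sketch of the first step each adapter performs, the functions below turn an HTTP path and a DNS name into the `(namespace, op)` pair passed to `invoke`. The function names and the validation rules (no empty segments, no extra path or label components) are assumptions, not part of the spec. The MCP tool-name mapping is left out because its format is not given here.

```rust
/// Parse `/v1/{namespace}/{op}` into `(namespace, op)`.
pub fn parse_http_path(path: &str) -> Option<(&str, &str)> {
    let rest = path.strip_prefix("/v1/")?;
    let (namespace, op) = rest.split_once('/')?;
    if namespace.is_empty() || op.is_empty() || op.contains('/') {
        return None;
    }
    Some((namespace, op))
}

/// Parse `{op}.{namespace}.alk.dev` into `(namespace, op)`.
pub fn parse_dns_name(name: &str) -> Option<(&str, &str)> {
    // Tolerate a trailing root dot, as in fully qualified names.
    let name = name.strip_suffix('.').unwrap_or(name);
    let rest = name.strip_suffix(".alk.dev")?;
    let (op, namespace) = rest.split_once('.')?;
    if op.is_empty() || namespace.is_empty() || namespace.contains('.') {
        return None;
    }
    Some((namespace, op))
}
```

Note that the DNS form puts the operation first, so the two parsers return the pair in the same order despite reading it in opposite directions.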
|
||||
### Deployment Topologies
|
||||
|
||||
**Minimal (single node, CLI)**: All services run locally via tokio channels.
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────────────┐
|
||||
│ Single Process │
|
||||
│ Auth (ArcSwap) | Secret (seed in RAM) | │
|
||||
│ Config (ArcSwap) | alknet-core Server │
|
||||
└──────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
**Production (multi-node)**: Auth and secrets on dedicated nodes; workers
|
||||
access them remotely.
|
||||
|
||||
```
|
||||
Auth Node (SQLite) Secret Node (seed in RAM)
|
||||
↑ ↑
|
||||
│ QUIC (irpc) │ QUIC (irpc)
|
||||
│ │
|
||||
Head Node (Config, Storage, alknet-core Server)
|
||||
│
|
||||
│ SSH / iroh / TLS
|
||||
│
|
||||
Worker Node (alknet-core Client)
|
||||
```
|
||||
|
||||
## Constraints
|
||||
|
||||
- Services are **internal** — they run within a node or cluster.
|
||||
- The call protocol is **external** — it's how nodes talk to each other.
|
||||
- Per ADR-032, domain events (Honker streams) stay within the owning service.
|
||||
irpc calls are synchronous request-response within a node or cluster. Call protocol
|
||||
`EventEnvelope` is the integration boundary between nodes.
|
||||
- OperationEnv is a hard constraint: the handler-facing API must match the
|
||||
behavioral contract from `@alkdev/operations`. Namespace + operation name →
|
||||
invoke with input, return output.
|
||||
- irpc is behind a feature flag in alknet-core. Nodes that only do SSH tunneling
|
||||
don't need the service layer overhead.
|
||||
|
||||
## Open Questions
|
||||
|
||||
- **OQ-SVC-01**: Should the secret service support multiple seed phrases (one
|
||||
per tenant)? Defer for now — one seed per node. Multi-seed can be added
|
||||
later by indexing the `Unlock` call with a tenant ID.
|
||||
|
||||
- **OQ-SVC-02**: Should service protocols use postcard (binary) or JSON for
|
||||
remote calls? Postcard for irpc (Rust-to-Rust, efficient). JSON for call
|
||||
protocol (cross-language, universal). The irpc remote path naturally uses
|
||||
postcard.
|
||||
|
||||
## Design Decisions
|
||||
|
||||
| ADR | Decision | Summary |
|
||||
|-----|----------|---------|
|
||||
| [027](decisions/027-crate-decomposition.md) | Crate decomposition | Service crates are independent of core |
|
||||
| [028](decisions/028-auth-irpc-service.md) | Auth as irpc service | AuthProtocol behind feature flag |
|
||||
| [032](decisions/032-event-boundary-discipline.md) | Event boundary | Domain events never cross service boundaries |
|
||||
| [033](decisions/033-operationenv-irpc-call-protocol.md) | OperationEnv | Universal composition mechanism with three dispatch paths |
|
||||
|
||||
## References
|
||||
|
||||
- [research/services.md](../research/services.md) — Service protocol definitions, OperationContext, deployment topologies
|
||||
- [research/integration-plan.md](../research/integration-plan.md) — OperationEnv, three dispatch paths, adapter patterns
|
||||
- [secret-service.md](secret-service.md) — SecretProtocol definition
|
||||
- [identity.md](identity.md) — IdentityProvider, AuthProtocol
|
||||
- [configuration.md](configuration.md) — ConfigProtocol, DynamicConfig reload
|
||||
- [interface.md](interface.md) — Interface layer, auth across interfaces
|
||||
219
docs/architecture/storage.md
Normal file
@@ -0,0 +1,219 @@
|
||||
---
|
||||
status: draft
|
||||
last_updated: 2026-06-07
|
||||
---
|
||||
|
||||
# Storage
|
||||
|
||||
## What
|
||||
|
||||
The `alknet-storage` crate provides SQLite-backed graph storage, identity
|
||||
management, access control, and reactivity via honker. It mirrors the
|
||||
TypeScript `@alkdev/storage` package's design while leveraging Rust's type
|
||||
system and honker's built-in pub/sub.
|
||||
|
||||
## Why
|
||||
|
||||
alknet-core needs persistent identity data (authorized keys, accounts, ACLs)
|
||||
and a way to store and query graph-structured data (call graphs, operation
|
||||
graphs, metagraph). But alknet-core cannot take a database dependency. The
|
||||
solution: alknet-storage implements alknet-core's `IdentityProvider` trait,
|
||||
providing SQLite-backed identity resolution without core knowing about SQLite.
|
||||
|
||||
The metagraph (three-level type system: GraphType → NodeType → EdgeType → Graph
|
||||
→ Node → Edge) is the foundation for ACL, flowgraph persistence, and any
|
||||
future graph-structured data.
|
||||
|
||||
## Architecture
|
||||
|
||||
### Crate Structure
|
||||
|
||||
```
|
||||
alknet-storage/
|
||||
├── metagraph/ — GraphType, NodeType, EdgeType persistence
|
||||
├── identity/ — accounts, organizations, peer_credentials, api_keys, audit_logs
|
||||
├── acl/ — PrincipalNode, DelegatesEdge, access control graph
|
||||
├── secrets/ — Encrypted node type, encrypt/decrypt bridge
|
||||
├── honker/ — honker integration: notify, stream, queue
|
||||
├── graph/ — GraphInstance, Node, Edge CRUD with schema validation
|
||||
└── schema/ — JSON Schema definitions (serde + jsonschema)
|
||||
```
|
||||
|
||||
### Metagraph Data Model
|
||||
|
||||
Three-level type system:
|
||||
|
||||
1. **GraphType** — A class of graphs (e.g., "call-graph", "acl",
|
||||
"task-dependencies"). Defines structural constraints.
|
||||
2. **NodeType** — A category of node within a graph type. Each has a JSON Schema
|
||||
for attribute validation.
|
||||
3. **EdgeType** — A category of edge within a graph type. Each has a JSON Schema
|
||||
and optional source/target constraints.
|
||||
|
||||
Graph instances belong to a graph type and contain nodes and edges conforming
|
||||
to those type definitions.
|
||||
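The type levels can be sketched as plain Rust structs with an edge check against the source/target constraints. This is a minimal model under stated assumptions: the field names, `EdgeError`, and `validate_edge` are hypothetical, node types are reduced to names, and JSON Schema validation of attributes is omitted.

```rust
use std::collections::HashMap;

/// An edge category with optional source/target node-type constraints.
pub struct EdgeType {
    pub name: String,
    /// `None` means any node type is allowed.
    pub allowed_source: Option<Vec<String>>,
    pub allowed_target: Option<Vec<String>>,
}

/// A class of graphs: the node and edge types it permits.
pub struct GraphType {
    pub name: String,
    pub node_types: Vec<String>,
    pub edge_types: HashMap<String, EdgeType>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum EdgeError {
    UnknownEdgeType,
    UnknownNodeType,
    SourceNotAllowed,
    TargetNotAllowed,
}

fn permits(allowed: &Option<Vec<String>>, node_type: &str) -> bool {
    match allowed {
        None => true,
        Some(list) => list.iter().any(|t| t == node_type),
    }
}

impl GraphType {
    /// Check an edge against the type definitions before it is stored.
    pub fn validate_edge(
        &self,
        edge_type: &str,
        source_type: &str,
        target_type: &str,
    ) -> Result<(), EdgeError> {
        let et = self
            .edge_types
            .get(edge_type)
            .ok_or(EdgeError::UnknownEdgeType)?;
        let known = |t: &str| self.node_types.iter().any(|n| n == t);
        if !known(source_type) || !known(target_type) {
            return Err(EdgeError::UnknownNodeType);
        }
        if !permits(&et.allowed_source, source_type) {
            return Err(EdgeError::SourceNotAllowed);
        }
        if !permits(&et.allowed_target, target_type) {
            return Err(EdgeError::TargetNotAllowed);
        }
        Ok(())
    }
}
```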
|
||||
### SQLite Table Schema
|
||||
|
||||
Common columns: `id TEXT PK`, `metadata TEXT JSON DEFAULT '{}'`,
|
||||
`created_at INTEGER TIMESTAMP`, `updated_at INTEGER TIMESTAMP`.
|
||||
|
||||
| Table | Key columns |
|
||||
|-------|------------|
|
||||
| `graph_types` | id, name (UNIQUE), config JSON, version, scope |
|
||||
| `node_types` | id, graph_type_id FK, name, schema JSON |
|
||||
| `edge_types` | id, graph_type_id FK, name, schema JSON, allowed_source/target types |
|
||||
| `graphs` | id, graph_type_id FK, name, description, status, owner_id, project_id |
|
||||
| `nodes` | id, graph_id FK, key (UNIQUE per graph), attributes JSON |
|
||||
| `edges` | id, graph_id FK, key, source_node_key, target_node_key, attributes JSON, undirected |
|
||||
|
||||
No FK constraints across database files. Referential integrity is enforced at
|
||||
the application layer.
|
||||
|
||||
### System DB vs Tenant DB
|
||||
|
||||
- **System DB** (`system.db`): Identity tables (accounts, organizations,
|
||||
peer_credentials, api_keys, audit_logs) + system-scoped graph types.
|
||||
- **Tenant DB** (`tenant-{orgId}.db`): Metagraph tables + tenant-scoped graph
|
||||
types.
|
||||
|
||||
### Identity Tables
|
||||
|
||||
| Table | Key columns |
|
||||
|-------|------------|
|
||||
| `accounts` | email (UNIQUE), display_name, access_level (admin/user/service), status |
|
||||
| `organizations` | name (UNIQUE), slug (UNIQUE), owner_id FK → accounts |
|
||||
| `organization_members` | org_id FK, account_id FK, membership_level (owner/admin/member) |
|
||||
| `api_keys` | owner_id FK, key_hash (UNIQUE), name, enabled, expires_at, revoked_at |
|
||||
| `peer_credentials` | owner_id FK, credential_type (ssh_key/cert_authority), fingerprint (UNIQUE), public_key_data |
|
||||
| `audit_logs` | action, owner_id FK, credential_id, org_id FK, details JSON |
|
||||
|
||||
### ACL as Metagraph
|
||||
|
||||
The ACL graph is a directed, non-multi metagraph:
|
||||
|
||||
- **PrincipalNode**: IdentityType (Account, Org, Service, Role) + identity_id + scopes + resources
|
||||
- **ResourceNode**: The thing being accessed
|
||||
- **Edges**: can_read, can_write, can_execute, belongs_to, delegates
|
||||
|
||||
Delegation edges carry `narrowed_scopes` — the delegate can only exercise scopes
|
||||
that are a subset of the delegator's.
|
||||
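The subset rule for delegation can be sketched with set operations. The function names are hypothetical, and scopes are modelled as plain strings. Whether an over-broad `narrowed_scopes` should be rejected or silently clipped is not stated above, so both variants are shown.

```rust
use std::collections::BTreeSet;

/// Helper for building a scope set from string literals.
pub fn scope_set(items: &[&str]) -> BTreeSet<String> {
    items.iter().map(|s| s.to_string()).collect()
}

/// Clipping variant: the delegate gets the part of `narrowed_scopes`
/// that the delegator actually holds. Anything else is dropped.
pub fn effective_scopes(
    delegator: &BTreeSet<String>,
    narrowed_scopes: &BTreeSet<String>,
) -> BTreeSet<String> {
    delegator.intersection(narrowed_scopes).cloned().collect()
}

/// Strict variant: a delegation edge is valid only if its scopes
/// are a subset of the delegator's.
pub fn is_valid_delegation(
    delegator: &BTreeSet<String>,
    narrowed_scopes: &BTreeSet<String>,
) -> bool {
    narrowed_scopes.is_subset(delegator)
}
```

Either way, a delegate can never end up with a scope the delegator lacks, which is the invariant the edge attribute exists to protect.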
|
||||
### StorageIdentityProvider
|
||||
|
||||
Implements alknet-core's `IdentityProvider` trait (ADR-029). Queries
|
||||
`peer_credentials` (for SSH key resolution) and `api_keys` (for token auth), then
|
||||
traverses the ACL graph to compute effective scopes and resources.
|
||||
|
||||
```rust
|
||||
impl IdentityProvider for StorageIdentityProvider {
|
||||
fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
|
||||
// 1. Find peer_credentials row by fingerprint
|
||||
// 2. Resolve to account → organization membership → effective scopes
|
||||
// 3. Return Identity { id: account_uuid, scopes, resources }
todo!("spec sketch: implement steps 1-3")
|
||||
}
|
||||
|
||||
fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity> {
|
||||
// 1. Verify Ed25519 signature against api_keys or peer_credentials
|
||||
// 2. Resolve to account → effective scopes
|
||||
// 3. Return Identity { id: account_uuid, scopes, resources }
todo!("spec sketch: implement steps 1-3")
|
||||
}
|
||||
}
|
||||
```
|
||||
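The fingerprint path can be made concrete with an in-memory stand-in for the two lookups. This is a sketch only: `InMemoryIdentityStore`, its two maps, and the simplified `Identity` are assumptions, and the real provider queries SQLite and traverses the ACL graph instead of reading a precomputed map. Resolving an account with no grants to empty scopes, rather than `None`, is also a choice made here, not something the spec states.

```rust
use std::collections::HashMap;

/// Simplified stand-in for alknet-core's `Identity`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: Vec<String>,
}

/// In-memory stand-in for the `peer_credentials` and ACL lookups.
#[derive(Default)]
pub struct InMemoryIdentityStore {
    /// fingerprint -> owning account id (the `peer_credentials` row).
    pub credentials: HashMap<String, String>,
    /// account id -> (scopes, resources), as the ACL traversal would compute.
    pub grants: HashMap<String, (Vec<String>, Vec<String>)>,
}

impl InMemoryIdentityStore {
    pub fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        // 1. Find the credential row by fingerprint.
        let account_id = self.credentials.get(fingerprint)?;
        // 2. Resolve the account to its effective scopes and resources.
        //    An account with no grants still resolves, with empty sets.
        let (scopes, resources) = self.grants.get(account_id).cloned().unwrap_or_default();
        // 3. Return the identity.
        Some(Identity {
            id: account_id.clone(),
            scopes,
            resources,
        })
    }
}
```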
|
||||
### StorageProtocol irpc Service
|
||||
|
||||
```rust
|
||||
#[rpc_requests(message = StorageMessage)]
|
||||
enum StorageProtocol {
|
||||
#[rpc(tx=oneshot::Sender<Graph>)]
|
||||
#[wrap(CreateGraph)]
|
||||
CreateGraph { graph_type_id: String, name: String },
|
||||
|
||||
#[rpc(tx=oneshot::Sender<Node>)]
|
||||
#[wrap(AddNode)]
|
||||
AddNode { graph_id: String, key: String, attributes: Value },
|
||||
|
||||
// ... (full protocol in research/services.md)
|
||||
}
|
||||
```
|
||||
|
||||
### Honker Integration
|
||||
|
||||
| Feature | Use case |
|
||||
|---------|----------|
|
||||
| `stream_publish` / `subscribe` | Durable pub/sub for node/edge/membership changes |
|
||||
| `notify` / `listen` | Ephemeral pub/sub for real-time control channel events |
|
||||
| `queue` / `claim` / `ack` | Task queue for async operations |
|
||||
|
||||
Per ADR-032, honker streams are domain events internal to the storage service.
|
||||
They are projected to call protocol `EventEnvelope` events when crossing service
|
||||
boundaries.
|
||||
|
||||
### Encrypted Data
|
||||
|
||||
alknet-storage references alknet-secret's `EncryptedData` wire format for
|
||||
storing encrypted nodes (API keys, OAuth tokens). The format (key_version,
|
||||
salt, iv, ciphertext) is shared by type-level compatibility, not a crate
|
||||
dependency. alknet-secret encrypts; alknet-storage stores the blob.
|
||||
|
||||
### Crate Dependencies
|
||||
|
||||
```toml
|
||||
[dependencies]
|
||||
honker = "0.x"
|
||||
rusqlite = { version = "0.x", features = ["bundled"] }
|
||||
serde = { version = "1", features = ["derive"] }
|
||||
serde_json = "1"
|
||||
jsonschema = "0.x"
|
||||
petgraph = "0.x"
|
||||
irpc = "0.x"
|
||||
```
|
||||
|
||||
Does NOT depend on alknet-core or alknet-secret. Implements alknet-core's
|
||||
`IdentityProvider` trait by conforming to its signature, not by direct crate
|
||||
dependency.
|
||||
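Because a Rust `impl Trait for Type` has to name the trait, one way to read this constraint is that alknet-storage exposes inherent methods with the same shape as the trait, and the binary that depends on both crates supplies a thin adapter. The sketch below models the three crates as modules in one file. The module layout, `ResolvedAccount`, and the newtype adapter are assumptions made for illustration, not the confirmed design.

```rust
/// Stands in for the alknet-core crate: owns the trait and the type.
pub mod alknet_core {
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Identity {
        pub id: String,
    }

    pub trait IdentityProvider {
        fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    }
}

/// Stands in for the alknet-storage crate: no reference to `alknet_core`.
pub mod alknet_storage {
    pub struct ResolvedAccount {
        pub account_id: String,
    }

    pub struct Storage {
        pub known_fingerprint: String,
        pub account_id: String,
    }

    impl Storage {
        /// Same shape as the trait method, but an inherent method.
        pub fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<ResolvedAccount> {
            if fingerprint == self.known_fingerprint {
                Some(ResolvedAccount {
                    account_id: self.account_id.clone(),
                })
            } else {
                None
            }
        }
    }
}

/// Stands in for the CLI binary: the only place that sees both crates.
pub mod cli {
    use super::alknet_core::{Identity, IdentityProvider};
    use super::alknet_storage::Storage;

    /// Newtype adapter that carries the actual trait impl.
    pub struct StorageIdentityProvider(pub Storage);

    impl IdentityProvider for StorageIdentityProvider {
        fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
            self.0
                .resolve_from_fingerprint(fingerprint)
                .map(|acct| Identity { id: acct.account_id })
        }
    }
}
```

With this layout the dependency arrows point only from the binary to the two libraries, which keeps the storage crate buildable without core.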
|
||||
## Constraints
|
||||
|
||||
- alknet-storage does NOT depend on alknet-core as a crate. It implements the
|
||||
`IdentityProvider` trait by conforming to the signature. The CLI binary
|
||||
wires them together.
|
||||
- alknet-storage does NOT depend on alknet-secret. They share the `EncryptedData`
|
||||
wire format by type-level compatibility, not a crate dependency.
|
||||
- WAL mode for concurrent reads during writes. Single writer per `.db` file.
|
||||
- JSON Schema validation uses the `jsonschema` crate at runtime (replaces
|
||||
TypeBox from TypeScript).
|
||||
- Per ADR-032, honker stream events never cross service boundaries without
|
||||
projection to `EventEnvelope`.
|
||||
|
||||
## Open Questions
|
||||
|
||||
- **OQ-SVC-03**: How does the secret service integrate with the existing
|
||||
`EncryptedDataSchema` from `@alkdev/storage`? The Rust implementation replaces
|
||||
PBKDF2 password-based encryption with derived AES-256-GCM keys. The
|
||||
`EncryptedData` format is a superset; the old format can be migrated by
|
||||
re-encrypting with the new key.
|
||||
|
||||
- **OQ-SVC-04**: Should workers cache derived keys locally? Yes, with a TTL
|
||||
(default: 1 hour). The head can revoke by invalidating the session.
|
||||
|
||||
- **OQ-SVC-05**: How does the smart contract (NFT-based ACL) interact with the
|
||||
secret service? The Ethereum signing key (`m/44'/60'/0'/0/0`) is derived from
|
||||
the same seed. The smart contract is a separate concern.
|
||||
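The TTL cache proposed in OQ-SVC-04 could look roughly like the following. It is a sketch under assumptions: `DerivedKeyCache` and its methods are hypothetical, keys are indexed by derivation path, and the caller passes `now` explicitly so expiry is testable. How the head signals session invalidation to a worker is not specified, so `invalidate_all` is simply a method the worker would call on that signal.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Worker-side cache of derived keys with a time-to-live.
pub struct DerivedKeyCache {
    ttl: Duration,
    entries: HashMap<String, (Vec<u8>, Instant)>,
}

impl DerivedKeyCache {
    pub fn new(ttl: Duration) -> Self {
        Self {
            ttl,
            entries: HashMap::new(),
        }
    }

    pub fn insert(&mut self, path: &str, key: Vec<u8>, now: Instant) {
        self.entries.insert(path.to_string(), (key, now));
    }

    /// Return the key if present and younger than the TTL; evict it otherwise.
    pub fn get(&mut self, path: &str, now: Instant) -> Option<&[u8]> {
        let expired = match self.entries.get(path) {
            Some((_, stored_at)) => now.duration_since(*stored_at) >= self.ttl,
            None => return None,
        };
        if expired {
            self.entries.remove(path);
            return None;
        }
        self.entries.get(path).map(|(key, _)| key.as_slice())
    }

    /// Drop everything, e.g. when the head invalidates the session.
    pub fn invalidate_all(&mut self) {
        self.entries.clear();
    }
}
```

A production version would likely also zero key material on eviction, which this sketch does not attempt.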
|
||||
## Design Decisions
|
||||
|
||||
| ADR | Decision | Summary |
|
||||
|-----|----------|---------|
|
||||
| [027](decisions/027-crate-decomposition.md) | Crate decomposition | alknet-storage is independent of core and secret |
|
||||
| [029](decisions/029-identity-core-type.md) | Identity as core type | alknet-storage implements IdentityProvider trait |
|
||||
| [032](decisions/032-event-boundary-discipline.md) | Event boundary | Honker streams stay internal; projection to EventEnvelope at boundaries |
|
||||
|
||||
## References
|
||||
|
||||
- [research/storage.md](../research/storage.md) — Full metagraph, identity, ACL, honker definitions
|
||||
- [research/services.md](../research/services.md) — StorageProtocol, StorageIdentityProvider
|
||||
- [research/integration-plan.md](../research/integration-plan.md) — Phase 2.2
|
||||
- [identity.md](identity.md) — IdentityProvider trait, Identity struct
|
||||
- [secret-service.md](secret-service.md) — EncryptedData format, derivation paths
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/adr-026-transport-interface-separation
|
||||
name: Write ADR-026 — Transport/interface separation (three-layer model)
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on: []
|
||||
scope: moderate
|
||||
risk: high
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/adr-027-crate-decomposition
|
||||
name: Write ADR-027 — Crate decomposition
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on:
|
||||
- architecture/adr-029-identity-core-type
|
||||
scope: moderate
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/adr-028-auth-irpc-service
|
||||
name: Write ADR-028 — Auth as irpc service
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on:
|
||||
- architecture/adr-029-identity-core-type
|
||||
scope: narrow
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/adr-029-identity-core-type
|
||||
name: Write ADR-029 — Identity as core type
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on: []
|
||||
scope: single
|
||||
risk: low
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/adr-030-static-dynamic-config-split
|
||||
name: Write ADR-030 — Static/dynamic config split
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on: []
|
||||
scope: narrow
|
||||
risk: low
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/adr-031-forwarding-policy
|
||||
name: Write ADR-031 — Forwarding policy
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on: []
|
||||
scope: narrow
|
||||
risk: low
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/adr-032-event-boundary-discipline
|
||||
name: Write ADR-032 — Event boundary discipline
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on: []
|
||||
scope: single
|
||||
risk: low
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/adr-033-operationenv-irpc-call-protocol
|
||||
name: Write ADR-033 — OperationEnv, irpc, and call protocol relationship
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on:
|
||||
- architecture/adr-028-auth-irpc-service
|
||||
- architecture/adr-027-crate-decomposition
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/adr-034-head-worker-terminology
|
||||
name: Write ADR-034 — Head/worker terminology
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on: []
|
||||
scope: single
|
||||
risk: trivial
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/spec-configuration
|
||||
name: Promote configuration.md from research to architecture spec
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on:
|
||||
- architecture/adr-030-static-dynamic-config-split
|
||||
- architecture/adr-031-forwarding-policy
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/spec-flowgraph
|
||||
name: Create flowgraph.md architecture spec (or stub referencing crate docs)
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on:
|
||||
- architecture/adr-027-crate-decomposition
|
||||
scope: narrow
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/spec-identity
|
||||
name: Create identity.md architecture spec
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on:
|
||||
- architecture/adr-029-identity-core-type
|
||||
- architecture/adr-028-auth-irpc-service
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/spec-interface
|
||||
name: Create interface.md architecture spec (Layer 2)
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on:
|
||||
- architecture/adr-026-transport-interface-separation
|
||||
- architecture/adr-033-operationenv-irpc-call-protocol
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/spec-secret-service
|
||||
name: Create secret-service.md architecture spec
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on:
|
||||
- architecture/adr-027-crate-decomposition
|
||||
- architecture/adr-032-event-boundary-discipline
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/spec-services
|
||||
name: Create services.md architecture spec (irpc service layer + OperationEnv)
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on:
|
||||
- architecture/adr-033-operationenv-irpc-call-protocol
|
||||
- architecture/adr-027-crate-decomposition
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/spec-storage
|
||||
name: Create storage.md architecture spec (or stub referencing crate docs)
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on:
|
||||
- architecture/adr-027-crate-decomposition
|
||||
- architecture/adr-029-identity-core-type
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/spec-update-auth
|
||||
name: Update auth.md — add IdentityProvider vs AuthService relationship
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on:
|
||||
- architecture/spec-identity
|
||||
- architecture/adr-028-auth-irpc-service
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/spec-update-open-questions
|
||||
name: Update open-questions.md — resolve questions per ADR decisions
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on:
|
||||
- architecture/adr-031-forwarding-policy
|
||||
- architecture/adr-029-identity-core-type
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
---
|
||||
id: architecture/spec-update-readme
|
||||
name: Update architecture README.md — add new docs and ADRs to tables
|
||||
status: pending
|
||||
status: completed
|
||||
depends_on:
|
||||
- architecture/spec-configuration
|
||||
- architecture/spec-identity
|
||||
|
||||