From cfc44008d3d415f36843705c072f81aa518ad101 Mon Sep 17 00:00:00 2001 From: "glm-5.1" Date: Tue, 9 Jun 2026 08:09:45 +0000 Subject: [PATCH] Sync architecture specs with Phase 2 research findings - Add definitions.md: normative terminology disambiguation (Interface, Service, Transport, Token, Identity, Domain, Scope, CredentialProvider, etc.) - Add credentials.md: CredentialProvider trait and CredentialSet enum for outbound auth, mirroring IdentityProvider pattern for inbound auth - Rewrite interface.md: StreamInterface/MessageInterface split (ADR-035), InterfaceRequest/InterfaceResponse, HttpInterface/DnsInterface stubs, ListenerConfig with Stream/Http/Dns variants, credential presentation table - Update auth.md: API keys in DynamicConfig (ADR-037), credential presentation per (Transport, Interface) pair, ApiKeyEntry struct in AuthPolicy - Update configuration.md: API keys, ListenerConfig with Http/Dns variants, expanded TOML config examples - Update call-protocol.md: resolve OQ-IF-01 (InterfaceEvent carries EventEnvelope + Identity), add MessageInterface awareness to protocol adapter layer - Update overview.md: three-layer model now includes StreamInterface/ MessageInterface, CredentialProvider/CredentialSet exports, definitions.md reference, ADRs 035-037 - Update open-questions.md: resolve OQ-IF-01, OQ-IF-02, add OQ-P2-01 through OQ-P2-04, add OQ-CP-01 through OQ-CP-04, add OQ-DEF-01, OQ-DEF-03, OQ-DEF-08 - Update README.md: add definitions.md, credentials.md, ADRs 035-037, phase2 research docs, current state description Key architectural decisions: - ADR-035: StreamInterface/MessageInterface split (two Layer 2 traits) - ADR-036: CredentialProvider as core type (outbound auth, alknet_core::credentials) - ADR-037: API keys as DynamicConfig auth (hash-verified bearer tokens) --- docs/architecture/README.md | 41 +- docs/architecture/auth.md | 83 +++- docs/architecture/call-protocol.md | 28 +- docs/architecture/configuration.md | 104 ++++- 
docs/architecture/credentials.md | 263 +++++++++++++ ...-streaminterface-messageinterface-split.md | 65 ++++ .../036-credentialprovider-core-type.md | 82 ++++ .../decisions/037-api-keys-dynamic-config.md | 83 ++++ docs/architecture/definitions.md | 226 +++++++++++ docs/architecture/interface.md | 357 +++++++++++++----- docs/architecture/open-questions.md | 95 ++++- docs/architecture/overview.md | 38 +- 12 files changed, 1314 insertions(+), 151 deletions(-) create mode 100644 docs/architecture/credentials.md create mode 100644 docs/architecture/decisions/035-streaminterface-messageinterface-split.md create mode 100644 docs/architecture/decisions/036-credentialprovider-core-type.md create mode 100644 docs/architecture/decisions/037-api-keys-dynamic-config.md create mode 100644 docs/architecture/definitions.md diff --git a/docs/architecture/README.md b/docs/architecture/README.md index dbe7e03..5285c8d 100644 --- a/docs/architecture/README.md +++ b/docs/architecture/README.md @@ -1,18 +1,21 @@ --- status: draft -last_updated: 2026-06-07 +last_updated: 2026-06-09 --- # Alknet Architecture ## Current State -Architecture specification in active development. Phase 0 foundation complete: -ADRs 001–034 accepted, new spec documents created for all components, existing -specs updated for the three-layer model, crate decomposition, unified identity, -OperationEnv, and forwarding policy. Remaining open questions: OQ-15 (QUIC -coexistence), OQ-19 (WebTransport TLS), OQ-20 (worker registration), OQ-IF-01 -(Interface session/EventEnvelope), OQ-IF-02 (ForwardingPolicy placement). See +Architecture spec sync in progress. Phase 0 foundation complete (ADRs 001–037). +Phase 1 core modifications partially implemented (interface trait, config split, +identity provider, forwarding policy). Phase 2 core bridge research complete; +spec documents updated to reflect StreamInterface/MessageInterface split, +CredentialProvider as core type, and API keys in DynamicConfig. 
+ +Remaining open questions: OQ-15 (QUIC coexistence), OQ-19 (WebTransport TLS), +OQ-20 (worker registration), OQ-CP-01 (per-identity credentials), OQ-CP-02 +(OIDC provider location), OQ-CP-03 (credential rotation). See [open-questions.md](open-questions.md). ## Architecture Documents @@ -21,7 +24,7 @@ coexistence), OQ-19 (WebTransport TLS), OQ-20 (worker registration), OQ-IF-01 |----------|--------|-------------| | [overview.md](overview.md) | reviewed | Package purpose, crate structure, three-layer model, exports, dependencies | | [transport.md](transport.md) | reviewed | Transport abstraction: TCP, TLS, iroh | -| [auth.md](auth.md) | draft | Unified auth: SSH + token, IdentityProvider trait | +| [auth.md](auth.md) | draft | Unified auth: SSH + token + API keys, credential presentation per interface | | [call-protocol.md](call-protocol.md) | draft | Bidirectional call/event protocol, OperationEnv, three dispatch paths | | [client.md](client.md) | reviewed | Client connection, SOCKS5, port forwarding | | [server.md](server.md) | reviewed | Server acceptance, IdentityProvider, ForwardingPolicy, channel handling | @@ -29,11 +32,13 @@ coexistence), OQ-19 (WebTransport TLS), OQ-20 (worker registration), OQ-IF-01 | [napi-and-pubsub.md](napi-and-pubsub.md) | reviewed | NAPI wrapper, reload API, pubsub event target adapter | | [identity.md](identity.md) | draft | Identity type, IdentityProvider trait, auth flows | | [services.md](services.md) | draft | irpc service layer, OperationEnv, three dispatch paths | -| [interface.md](interface.md) | draft | Layer 2: Interface trait, SshInterface, RawFramingInterface | -| [configuration.md](configuration.md) | draft | StaticConfig, DynamicConfig, forwarding policy, reload | +| [interface.md](interface.md) | draft | StreamInterface, MessageInterface, credential presentation, ListenerConfig | +| [configuration.md](configuration.md) | draft | StaticConfig, DynamicConfig, API keys, forwarding policy, reload | | 
[storage.md](storage.md) | draft | alknet-storage: metagraph, identity, ACL, honker | | [flowgraph.md](flowgraph.md) | draft | alknet-flowgraph: call graph, operation graph, petgraph | | [secret-service.md](secret-service.md) | draft | alknet-secret: BIP39, SLIP-0010, AES-GCM, SecretProtocol | +| [credentials.md](credentials.md) | draft | CredentialProvider, CredentialSet (outbound auth) | +| [definitions.md](definitions.md) | draft | Terminology disambiguation and concept mapping | ## Research Documents @@ -48,6 +53,10 @@ coexistence), OQ-19 (WebTransport TLS), OQ-20 (worker registration), OQ-IF-01 | [feasibility/](../research/feasibility/) | — | SSH tunnel feasibility assessment and related analyses | | [event-sourcing/](../research/event-sourcing/) | — | Event sourcing patterns and event-driven architecture reference | | [ops/](../research/ops/) | — | Production ops reference: certbot, fail2ban | +| [phase2/definitions.md](../research/phase2/definitions.md) | draft | Terminology disambiguation (promoted to architecture/definitions.md) | +| [phase2/interface-model.md](../research/phase2/interface-model.md) | draft | StreamInterface/MessageInterface analysis (promoted to interface.md) | +| [phase2/credential-provider.md](../research/phase2/credential-provider.md) | draft | CredentialProvider research (promoted to credentials.md) | +| [phase2/tls-transport.md](../research/phase2/tls-transport.md) | draft | HTTP interface, stealth handoff, ListenerConfig (promoted to interface.md, auth.md) | ## ADR Table @@ -84,6 +93,9 @@ coexistence), OQ-19 (WebTransport TLS), OQ-20 (worker registration), OQ-IF-01 | [032](decisions/032-event-boundary-discipline.md) | Event boundary discipline (domain, irpc, call protocol) | Accepted | | [033](decisions/033-operationenv-irpc-call-protocol.md) | OperationEnv as universal composition mechanism | Accepted | | [034](decisions/034-head-worker-terminology.md) | Head/worker terminology replacing hub/spoke | Accepted | +| 
[035](decisions/035-streaminterface-messageinterface-split.md) | StreamInterface / MessageInterface split | Accepted | +| [036](decisions/036-credentialprovider-core-type.md) | CredentialProvider as core type (outbound auth) | Accepted | +| [037](decisions/037-api-keys-dynamic-config.md) | API keys as DynamicConfig auth | Accepted | > ADR numbers 020–022 were allocated to proposals that were withdrawn before > acceptance and are not listed. @@ -93,15 +105,16 @@ coexistence), OQ-19 (WebTransport TLS), OQ-20 (worker registration), OQ-IF-01 See [open-questions.md](open-questions.md) for all open and resolved questions. Key resolved questions from Phase 0: OQ-12, OQ-16, OQ-18 (forwarding policy and identity scopes), OQ-17 (transport-aware auth), OQ-23 (irpc feature flag), -OQ-24 (DNS control channel scope), OQ-25 (crate irpc dependencies). Key open -questions: OQ-15 (QUIC coexistence), OQ-19 (WebTransport TLS), OQ-20 (worker -registration). +OQ-24 (DNS control channel scope), OQ-25 (crate irpc dependencies), OQ-IF-01 +(Interface session / EventEnvelope relationship), OQ-IF-02 (ForwardingPolicy +placement). Key open questions: OQ-15 (QUIC coexistence), OQ-19 (WebTransport +TLS), OQ-20 (worker registration). ## Lifecycle Definitions | Status | Meaning | Transitions | |--------|---------|-------------| | `draft` | Under active development. May change significantly. | → `reviewed` when open questions resolved | -| `reviewed` | Architecture final. Implementation may begin. Changes require review. | → `stable` when implementation verified | +| `reviewed` | Architecture final. Implementation may begin. Changes require review. | → `stable` when implementation is complete and verified | | `stable` | Locked. Changes require review and may warrant an ADR. | → `deprecated` when superseded | | `deprecated` | Superseded. Kept for reference. 
| Removed when no longer referenced | \ No newline at end of file diff --git a/docs/architecture/auth.md b/docs/architecture/auth.md index 048bafa..58c224e 100644 --- a/docs/architecture/auth.md +++ b/docs/architecture/auth.md @@ -42,16 +42,30 @@ is the default implementation (reads from `DynamicConfig.auth`). `AuthProtocol` irpc service is one way to satisfy the trait, behind a feature flag. Both paths produce the same `Identity` result. See ADR-028 and ADR-029. -### Auth Presentation Per Transport +### Credential Presentation Per Interface -| Transport | Auth presentation | Verification | -|-----------|-------------------|-------------| -| SSH (TCP, TLS, iroh) | SSH public key auth in the SSH handshake | `ServerAuthConfig::authenticate_publickey()` — key lookup in authorized set | -| WebTransport (HTTP/3) | Signed timestamp token in CONNECT request | Token auth — same authorized set verifies the Ed25519 signature | -| Future (WebSocket, etc.) | Signed timestamp token in headers/query | Same token verification | +Each (Transport, Interface) pair presents credentials differently, but all +resolve to the same `Identity` through `IdentityProvider`. See +[definitions.md](definitions.md) for the full terminology rules. -The **key material is shared**. The **presentation differs per transport**. The -**verification result is the same**: an authenticated identity with scopes. 
+| (Transport, Interface) | Credential presentation | Resolves via | +|------------------------|------------------------|-------------| +| (TLS, SshInterface) | SSH public key handshake | `resolve_from_fingerprint()` | +| (TCP, SshInterface) | SSH public key handshake | `resolve_from_fingerprint()` | +| (iroh, SshInterface) | SSH public key handshake | `resolve_from_fingerprint()` | +| (TLS, RawFramingInterface) | AuthToken in frame header | `resolve_from_token()` | +| (TCP, RawFramingInterface) | AuthToken in frame header | `resolve_from_token()` | +| (WebTransport, RawFramingInterface) | AuthToken in CONNECT request | `resolve_from_token()` | +| (—, HttpInterface) | `Authorization: Bearer` header | `resolve_from_token()` | +| (—, DnsInterface) | AuthToken in query labels | `resolve_from_token()` | + +The **key material is shared**. The **credential presentation** differs per +(Transport, Interface) pair. The **verification result is the same**: an +authenticated `Identity` with scopes. + +`resolve_from_token()` handles both AuthTokens (Ed25519-signed) and API keys +(hash-verified bearer tokens). The implementation discriminates by prefix or +format — see ADR-037. ### Token Authentication @@ -112,14 +126,46 @@ irpc path produce the same `Identity` result. The trait is the contract. The backing store is pluggable. Alknet-core never depends on Honker, SQLite, or any specific database. +### API Keys + +For service accounts, automation, and HTTP interface auth, Ed25519 AuthTokens +are inconvenient — they require client-side key generation and signing. API keys +provide a simpler bearer token format (ADR-037): + +``` +API key: "alk_dGhlX3NlY3JldA" (~20 chars, configurable prefix) +Storage: SHA-256 hash of the full key +Lookup: prefix match → hash verification → Identity +``` + +API keys are configured in `DynamicConfig.auth.api_keys`: + +```toml +[[auth.api_keys]] +prefix = "alk_" +hash = "sha256:abc..." 
+scopes = ["relay:connect"] +description = "dashboard service account" +ttl = "30d" # optional +``` + +Both AuthTokens and API keys go through `IdentityProvider::resolve_from_token()`. +The implementation discriminates by prefix (default `alk_`): if the token starts +with the API key prefix, it's verified by SHA-256 hash lookup; otherwise, it's +verified as an Ed25519 AuthToken. Both paths produce the same `Identity`. + +See [configuration.md](configuration.md) for the full `DynamicConfig.auth` +structure and ADR-037 for the decision context. + ### AuthPolicy Structure -`AuthPolicy` in `DynamicConfig` holds both auth paths, sharing key material: +`AuthPolicy` in `DynamicConfig` holds all auth paths, sharing key material: ```rust pub struct AuthPolicy { pub ssh: SshAuthConfig, pub token: TokenAuthConfig, + pub api_keys: Vec<ApiKeyEntry>, } pub struct SshAuthConfig { @@ -142,6 +188,14 @@ pub enum TokenKeySource { /// For deployments that want distinct access control per transport. Separate(HashSet), } + +pub struct ApiKeyEntry { + pub prefix: String, // e.g., "alk_" + pub hash: String, // e.g., "sha256:abc..." + pub scopes: Vec<String>, // e.g., ["relay:connect", "secrets:derive"] + pub description: Option<String>, // e.g., "dashboard service account" + pub expires_at: Option, // Unix timestamp, optional TTL +} ``` When `TokenKeySource::Shared` (the default), adding a key to @@ -220,6 +274,13 @@ dependencies needed. - Token auth is only available on transports that carry HTTP metadata (URL path, headers). SSH-over-TCP/TLS/iroh continues to use SSH native auth exclusively. +- API keys are bearer tokens — anyone who obtains the key has the associated + permissions. The hash storage and optional TTL mitigate but do not eliminate + this risk. Ed25519 AuthTokens remain the preferred auth method for interactive + clients. See ADR-037. +- API keys are verified by SHA-256 hash lookup in `DynamicConfig.auth.api_keys` + (or the `api_keys` database table in production).
The full key is provided to + the client exactly once at creation time. ### Security Considerations @@ -254,6 +315,8 @@ security consideration: | [023](decisions/023-unified-auth-shared-key-material.md) | Unified auth, shared key material | Same keys for SSH and token auth | | [028](decisions/028-auth-irpc-service.md) | Auth as irpc service | AuthProtocol behind feature flag; IdentityProvider is the contract | | [029](decisions/029-identity-core-type.md) | Identity as core type | `Identity` and `IdentityProvider` in alknet-core | +| [035](decisions/035-streaminterface-messageinterface-split.md) | StreamInterface/MessageInterface | Credential presentation differs per (Transport, Interface) pair | +| [037](decisions/037-api-keys-dynamic-config.md) | API keys in DynamicConfig | Hash-verified bearer tokens for service accounts | ## References @@ -261,6 +324,8 @@ security consideration: - [server.md](server.md) — Current SSH auth handler - [transport.md](transport.md) — Transport abstraction - [configuration.md](configuration.md) — DynamicConfig, AuthPolicy, ConfigReloadHandle +- [interface.md](interface.md) — Credential presentation per (Transport, Interface) pair +- [definitions.md](definitions.md) — Terminology disambiguation (IdentityProvider vs CredentialProvider, AuthToken vs API key) - [services.md](services.md) — AuthProtocol irpc service - [open-questions.md](open-questions.md) — OQ-17 (resolved), OQ-18 (resolved), OQ-19 - [wtransport](https://github.com/BiagioFesta/wtransport) — Rust WebTransport library diff --git a/docs/architecture/call-protocol.md b/docs/architecture/call-protocol.md index 849ce4f..c339f24 100644 --- a/docs/architecture/call-protocol.md +++ b/docs/architecture/call-protocol.md @@ -311,8 +311,18 @@ periodically. ### Protocol Adapter Layer -The call protocol is transport-agnostic by design. It maps to any transport -that carries `EventEnvelope` frames: +The call protocol is transport-agnostic and interface-agnostic by design. 
It +receives input from two interface categories (ADR-035): + +**StreamInterface** produces `InterfaceEvent` frames from a continuous byte +stream (SSH channel, raw framing). The call protocol handler calls `recv()` +on the session to get events. + +**MessageInterface** handles individual `InterfaceRequest` → `InterfaceResponse` +pairs (HTTP, DNS). The call protocol handler constructs an `OperationContext` +from the request and invokes the registry directly. + +Both paths resolve to the same `OperationRegistry` and `OperationEnv`: | Transport | Channel mechanism | Direction | |-----------|-------------------|-----------| @@ -494,9 +504,16 @@ agent service itself is built on top, not into the core. in gRPC terms)?~~ Resolved — deferred. Current model covers all identified use cases. See [open-questions.md](open-questions.md). -- **OQ-IF-01**: How does the `Interface` session type relate to the call - protocol's `EventEnvelope` stream? This needs design during Phase 1.8 - implementation. See [open-questions.md](open-questions.md). +- **~~OQ-IF-01~~**: ~~How does the `Interface` session type relate to the call + protocol's `EventEnvelope` stream?~~ Resolved — `InterfaceSession::recv()` + returns `Option<InterfaceEvent>` where `InterfaceEvent` carries + `EventEnvelope` + `Identity`. `InterfaceSession::send()` accepts `EventEnvelope`. + The `SshSession` bridge implements this over the `alknet-control:0` channel. + For `MessageInterface`, `InterfaceRequest`/`InterfaceResponse` normalize + request/response pairs. See [interface.md](interface.md) and ADR-035. + +- **OQ-P2-01**: Should `MessageInterface` and `StreamInterface` share a common + trait? See [interface.md](interface.md) and [open-questions.md](open-questions.md). ## Design Decisions @@ -507,6 +524,7 @@ agent service itself is built on top, not into the core.
| [025](decisions/025-handler-spec-separation.md) | Handler/spec separation | Downstream registers operations without modifying core | | [028](decisions/028-auth-irpc-service.md) | Auth as irpc service | irpc is one dispatch backend for OperationEnv | | [033](decisions/033-operationenv-irpc-call-protocol.md) | OperationEnv | Universal composition with three dispatch paths | +| [035](decisions/035-streaminterface-messageinterface-split.md) | StreamInterface/MessageInterface | Call protocol accepts events from both interface categories | ## References diff --git a/docs/architecture/configuration.md b/docs/architecture/configuration.md index 5311585..ac8dfef 100644 --- a/docs/architecture/configuration.md +++ b/docs/architecture/configuration.md @@ -55,6 +55,25 @@ Hot-reloadable at runtime via `ArcSwap`. Contains: compared to the current approach. Writes are atomic: `store()` swaps the pointer. +### API Keys + +`DynamicConfig.auth` also includes API keys for service accounts and HTTP +interface auth (ADR-037): + +```toml +[[auth.api_keys]] +prefix = "alk_" +hash = "sha256:abc..." +scopes = ["relay:connect"] +description = "dashboard service account" +ttl = "30d" # optional +``` + +API keys are verified by `ConfigIdentityProvider::resolve_from_token()` — if +the token starts with the configured prefix, it's treated as an API key and +verified by SHA-256 hash lookup. Otherwise, it's treated as an Ed25519 AuthToken. +Both paths produce the same `Identity` result. + ### ConfigReloadHandle ```rust @@ -137,12 +156,67 @@ programmatic API). Covers static config plus initial auth/forwarding paths. 
```toml [server] +# Stream-based listener: TLS + SSH on port 443 +[[listeners]] +type = "stream" transport = "tls" +interface = "ssh" listen = "0.0.0.0:443" +[server.tls] +cert = "/etc/alknet/tls/cert.pem" +key = "/etc/alknet/tls/key.pem" + +# Stream-based listener: TCP + SSH on port 22 +[[listeners]] +type = "stream" +transport = "tcp" +interface = "ssh" +listen = "0.0.0.0:22" + +# Stream-based listener: iroh P2P +[[listeners]] +type = "stream" +transport = "iroh" +iroh_relay = "https://relay.alk.dev" + +# Message-based listener: HTTP on port 443 (with stealth) +[[listeners]] +type = "http" +listen = "0.0.0.0:443" +tls = true +stealth = true + +# Message-based listener: HTTP on port 8080 (separate, no stealth) +# [[listeners]] +# type = "http" +# listen = "0.0.0.0:8080" +# tls = false +# stealth = false + +# Message-based listener: DNS on port 53 +# [[listeners]] +# type = "dns" +# listen = "0.0.0.0:53" +# tls = false + [auth] host_key = "/etc/alknet/ssh/host_key" +[auth.ssh] +authorized_keys = [...] + +[auth.token] +enabled = true +max_token_age = "5m" + +[[auth.api_keys]] +prefix = "alk_" +hash = "sha256:abc..." +scopes = ["relay:connect"] +description = "dashboard service account" +ttl = "30d" + [forwarding] default = "deny" @@ -163,10 +237,32 @@ interface AlknetServer { ### Multi-Transport Listeners -A head node may accept connections on multiple transports simultaneously. The -architecture supports `Vec` instead of a single -`ServeTransportMode`. `Server::run()` spawns one accept loop per listener, -sharing `DynamicConfig`, `ConnectionRateLimiter`, sessions, and shutdown signal. +A head node may accept connections on multiple transports and interfaces simultaneously. +Listeners come in two categories: stream-based (Transport + StreamInterface pairs) and +message-based (self-contained HTTP or DNS servers). 
+ +```rust +pub enum ListenerConfig { + Stream { + transport: TransportKind, + interface: StreamInterfaceKind, + }, + Http { + bind_addr: SocketAddr, + tls: bool, + stealth: bool, // byte-peek protocol detection on shared port + }, + Dns { + bind_addr: SocketAddr, + tls: bool, + }, +} +``` + +For stream-based listeners, `Server::run()` spawns one accept loop per listener. +For HTTP listeners, it spawns an axum server. For DNS listeners, it spawns a DNS +server. All share `DynamicConfig`, `ConnectionRateLimiter`, sessions, and +shutdown signal. ```toml [[listeners]] diff --git a/docs/architecture/credentials.md b/docs/architecture/credentials.md new file mode 100644 index 0000000..1e978ab --- /dev/null +++ b/docs/architecture/credentials.md @@ -0,0 +1,263 @@ +--- +status: draft +last_updated: 2026-06-09 +--- + +# Credentials (Outbound Auth) + +## What + +The `CredentialProvider` trait and `CredentialSet` enum handle **outbound** +authentication: how alknet authenticates _to_ external and self-hosted services. +This is the complement to `IdentityProvider`, which handles **inbound** +authentication (who is calling alknet). + +## Why + +Without `CredentialProvider`, each service wrapper would independently solve +credential retrieval, caching, and lifecycle management. Cloud API integrations +(vast.ai, runpod) need API keys. Self-hosted services (rustfs, gitea) need +S3 access keys or OIDC tokens. The secret service can store these at rest, but +the wiring between "decrypt a credential from storage" and "use it in an HTTP +request" doesn't exist yet. + +`CredentialProvider` provides a unified abstraction — just as `IdentityProvider` +unifies inbound auth, `CredentialProvider` unifies outbound auth. Handlers +access credentials through `OperationEnv`, not by reaching into storage directly. 
+ +## Architecture + +### Direction: Inbound vs Outbound + +| | IdentityProvider | CredentialProvider | +|---|---|---| +| **Direction** | Inbound (who is calling alknet) | Outbound (how alknet calls others) | +| **Resolves** | Fingerprint/token → `Identity` | Service name → `CredentialSet` | +| **Storage** | `peer_credentials`, `api_keys` | Encrypted nodes in metagraph | +| **Lifecycle** | Stateless lookup | May need refresh (OIDC tokens, S3 sessions) | +| **Location** | `alknet_core::auth` | `alknet_core::credentials` | + +Both live at the same architectural layer. A handler receives an +`OperationContext` with `identity` (who called us) and can access credentials +through `context.env` (how we call out). + +### CredentialProvider Trait + +```rust +pub trait CredentialProvider: Send + Sync + 'static { + fn get_credentials(&self, service: &str) -> Option<CredentialSet>; + fn refresh_credentials(&self, service: &str) -> Option<CredentialSet>; +} +``` + +The trait is intentionally narrow. It returns credentials for a named service. +It does not abstract the auth mechanism — that stays with the service wrapper +that knows the protocol (S3 signing, OAuth2 refresh, etc.). + +### CredentialSet + +```rust +pub enum CredentialSet { + ApiKey { + header_name: String, + token: String, + }, + Basic { + username: String, + password: String, + }, + Bearer { + token: String, + }, + S3AccessKey { + access_key: String, + secret_key: String, + session_token: Option<String>, + }, + OidcToken { + access_token: String, + refresh_token: Option<String>, + expires_at: Option, + }, + Custom { + scheme: String, + params: HashMap, + }, +} +``` + +Each variant carries the data needed for a specific auth mechanism. The service +wrapper that requested the credentials knows what variant it expects and how to +use it.
+ +### CredentialProvider vs IdentityProvider + +These are opposite-direction abstractions that compose through `OperationEnv`: + +``` + Incoming Request + │ + ▼ + IdentityProvider (credential → Identity) + │ + ├── SSH fingerprint → Identity.id, .scopes, .resources + ├── Bearer AuthToken → Identity.id, .scopes, .resources + └── API key → Identity.id, .scopes, .resources + │ + ▼ + OperationContext { identity, env, ... } + │ + ├── context.env.invoke("git", "push", input) + │ └── GitService handler + │ └── CredentialProvider (outbound) + │ └── get_credentials("rustfs") + │ └── S3AccessKey { access_key, secret_key } + │ + └── context.env.invoke("secrets", "derive", input) + └── local dispatch to SecretProtocol + + Two directions: Inbound (who is calling us) + Outbound (how we call others) +``` + +### SecretStoreCredentialProvider (Phase 1 Default) + +The default `CredentialProvider` implementation. Decrypts credentials via +`SecretProtocol::Decrypt` and holds them in RAM: + +```rust +pub struct SecretStoreCredentialProvider { + credentials: ArcSwap>, +} +``` + +At startup, the CLI or NAPI assembly loads credentials from the secret service +and populates the `ArcSwap`. The `refresh_credentials()` method re-decrypts +after a `Lock`/`Unlock` cycle on the secret service. + +### ManagedCredentialProvider (Phase C Future) + +For self-hosted services that need active lifecycle management (S3 session +token rotation, OIDC token refresh). 
Wraps `SecretStoreCredentialProvider` +with per-service `CredentialManager` instances: + +```rust +pub struct ManagedCredentialProvider { + base: SecretStoreCredentialProvider, + managers: HashMap>, +} + +pub trait CredentialManager: Send + Sync + 'static { + fn refresh(&self, current: &CredentialSet) -> Option<CredentialSet>; + fn is_expired(&self, current: &CredentialSet) -> bool; + fn provision(&self, identity: &Identity) -> Option<CredentialSet>; +} +``` + +- `refresh`: OIDC token refresh, S3 session token rotation +- `is_expired`: Check TTL before use +- `provision`: Create credentials on a self-hosted service for a given identity + +This is a Phase C concept. The spec defines the extension point but defers +implementation. + +### Integration with OperationEnv + +Handlers access credentials through `OperationEnv`: + +```rust +// Handler needs outbound credentials for a service +let creds = context.env.get_credentials("rustfs"); +``` + +This is analogous to how `context.env.invoke(namespace, op, input)` works for +operation dispatch — the handler doesn't know whether the credential comes from +config, the secret service, or a managed provider. + +### Integration with SecretProtocol + +Credentials are stored encrypted in the metagraph via `SecretProtocol`: + +1. Operator configures credentials: `alknet credential add vast-ai --type bearer --token-file ./key.txt` +2. CLI encrypts via `SecretProtocol::Encrypt` (AES-256-GCM, key at path `m/74'/2'/0'/0'`) +3. Encrypted credential stored as `EncryptedData` node in metagraph, tagged with service name +4. At startup, `SecretStoreCredentialProvider` calls `SecretProtocol::Decrypt` for each configured service +5. Decrypted credentials held in RAM with same lifecycle as the seed (purged on `Lock`) + +The `EncryptedData` wire format is shared with alknet-storage by type-level +compatibility, not a crate dependency.
+ +### Identity-Bound Credentials (Phase B+ Future) + +For multi-tenant setups where different alknet users have different access levels +on the same external service: + +```rust +// Service-level credential (all users share one key): +credential_provider.get_credentials("rustfs") + +// Identity-bound credential (per-user key): +credential_provider.get_credentials_for("rustfs", &identity.id) +``` + +The trait-level method is service-level. The identity-bound method is an +extension in alknet-storage that uses `Identity.id` (the account UUID in +database-backed deployments) as the lookup key. No separate `account_id` field +needed — `Identity.id` IS the account identifier. + +## Constraints + +- `CredentialProvider` and `CredentialSet` live in `alknet_core::credentials`. + No database dependency at the core level. +- `CredentialProvider` does not depend on `IdentityProvider`. They compose + through `OperationEnv`, not through dependency. +- `ManagedCredentialProvider` and `CredentialManager` are Phase C concepts. + They are defined as extension points but not implemented yet. +- Identity-bound credentials use `Identity.id` as the account key. In + config-backed deployments, this is the fingerprint or key prefix. In + database-backed deployments, this is the account UUID. +- `SecretStoreCredentialProvider` depends on `SecretProtocol::Decrypt`, which + requires the alknet-secret crate. A stub impl that reads from config is + sufficient for Phase 2 when alknet-secret isn't available. +- The `CredentialSet` variants cover all identified credential types (Phases + A–C). Phase D (alknet as OIDC provider) is additive. + +## Phase Progression + +| Phase | CredentialProvider Scope | Notes | +|-------|-------------------------|-------| +| Phase 2 (now) | Trait + `CredentialSet` in core. `SecretStoreCredentialProvider` stub reads from config. | Enables Phase 2 HTTP auth | +| Phase A | `SecretStoreCredentialProvider` backed by `SecretProtocol::Decrypt`. 
CLI command for credential management. | Full secret service integration | +| Phase B | `FromOpenAPI` integration. `CredentialProvider` populates `HttpServiceConfig.auth`. | Auto-registration of external services | +| Phase C | `ManagedCredentialProvider` + `CredentialManager`. S3 signing, OIDC refresh, identity-bound credentials. | Production self-hosted services | +| Phase D | Alknet as OIDC provider. Eliminates stored credentials for OIDC-compatible services. | Long-term goal | + +## Open Questions + +- **OQ-CP-01**: Should `CredentialProvider` support per-identity credentials + (`get_credentials(service, identity)`)? See [open-questions.md](open-questions.md). + +- **OQ-CP-02**: Where should OIDC provider operations live if alknet becomes + an OIDC provider (Phase D)? See [open-questions.md](open-questions.md). + +- **OQ-CP-03**: How do credential rotations propagate across a cluster? See + [open-questions.md](open-questions.md). + +- **OQ-CP-04**: Should `CredentialSet` include request-signing capability? + See [open-questions.md](open-questions.md). 
+ +## Design Decisions + +| ADR | Decision | Summary | +|-----|----------|---------| +| [036](decisions/036-credentialprovider-core-type.md) | CredentialProvider as core type | Outbound credentials in `alknet_core::credentials`, parallel to IdentityProvider | +| [029](decisions/029-identity-core-type.md) | Identity as core type | Inbound auth — the opposite direction | +| [032](decisions/032-event-boundary-discipline.md) | Event boundary | Secret service domain events stay internal | + +## References + +- [identity.md](identity.md) — IdentityProvider (inbound auth, opposite direction) +- [secret-service.md](secret-service.md) — SecretProtocol, EncryptedData +- [services.md](services.md) — OperationEnv, OperationContext +- [definitions.md](definitions.md) — IdentityProvider vs CredentialProvider disambiguation +- [research/phase2/credential-provider.md](../research/phase2/credential-provider.md) — Full analysis with rustfs/gitea integration \ No newline at end of file diff --git a/docs/architecture/decisions/035-streaminterface-messageinterface-split.md b/docs/architecture/decisions/035-streaminterface-messageinterface-split.md new file mode 100644 index 0000000..d42591b --- /dev/null +++ b/docs/architecture/decisions/035-streaminterface-messageinterface-split.md @@ -0,0 +1,65 @@ +# ADR-035: StreamInterface and MessageInterface Split + +## Status +Accepted + +## Context + +The `Interface` trait (ADR-026) assumes a persistent byte stream from a `Transport`. It produces a `Session` that yields `InterfaceEvent` frames. This works for SSH and raw framing — both run over duplex streams. + +However, HTTP and DNS do not fit this model. They handle individual request/response pairs, not persistent sessions. HTTP runs over a TLS connection after byte-peek protocol detection (extending the existing stealth mode pattern). DNS runs its own server on port 53. Both are stateless per-request, not session-oriented. 
+ +The three-layer model (Transport, Interface, Protocol) remains correct. The issue is that Layer 2 has two distinct patterns: stream-based (SSH, raw framing) where the transport provides a continuous byte stream, and message-based (HTTP, DNS) where the interface manages its own transport and handles discrete requests. + +## Decision + +Split the `Interface` trait into two independent traits: + +1. **`StreamInterface`** — consumes a `TransportStream`, produces a long-lived `Session` that yields `InterfaceEvent` frames. Existing `SshInterface` and `RawFramingInterface` become `StreamInterface` implementations. + +2. **`MessageInterface`** — handles individual `InterfaceRequest` → `InterfaceResponse` pairs. Manages its own transport (HTTP server, DNS server). `HttpInterface` and `DnsInterface` are `MessageInterface` implementations. + +The traits are independent. They have different signatures (`accept(stream)` vs `handle_request(req)`), different lifecycles (long-lived session vs stateless per-request), and different transport ownership (provided by caller vs self-managed). + +`ListenerConfig` gains variants for both: + +```rust +pub enum ListenerConfig { + Stream { + transport: TransportKind, + interface: StreamInterfaceKind, + }, + Http { + bind_addr: SocketAddr, + tls: bool, + stealth: bool, + }, + Dns { + bind_addr: SocketAddr, + tls: bool, + }, +} +``` + +`TransportKind::Dns` is removed. DNS is a `MessageInterface` that manages its own transport (UDP/TCP port 53), not a transport variant. + +The call protocol handler (Layer 3) is interface-agnostic: it processes `InterfaceEvent` frames from `StreamInterface` sessions and `InterfaceRequest` → `InterfaceResponse` from `MessageInterface` handlers. The dispatch logic is the same — only the framing differs. + +## Consequences + +**Positive**: HTTP and DNS are first-class interfaces with proper type signatures. No forcing stateless protocols into a session model. 
The existing stealth mode byte-peek pattern naturally extends to `HttpInterface`. The `InterfaceRequest` / `InterfaceResponse` types normalize calls across message-based interfaces. + +**Positive**: Removing `TransportKind::Dns` prevents a breaking change later — code should never depend on DNS as a transport variant. + +**Positive**: `ListenerConfig` correctly models the server's accept loop: stream listeners spawn one accept loop per (transport, interface) pair, while HTTP and DNS listeners each manage their own server. + +**Negative**: Two traits where there was one. But they serve fundamentally different purposes. A common super-trait would add complexity (`accept_stream` + `handle_request` + `transport_kind`) without practical benefit — implementations satisfy one trait or the other, never both. + +**Negative**: The current `Interface` trait needs to be renamed to `StreamInterface`. This is a rename of the trait, not a semantic change: `SshInterface` and `RawFramingInterface` implementations become `StreamInterface` implementations with the same `accept()` logic. + +## References + +- ADR-026 (transport/interface separation — updated by this ADR) +- [interface.md](../interface.md) — Interface layer spec +- [research/phase2/interface-model.md](../../research/phase2/interface-model.md) — Full analysis +- [research/phase2/tls-transport.md](../../research/phase2/tls-transport.md) — HTTP interface, ListenerConfig \ No newline at end of file diff --git a/docs/architecture/decisions/036-credentialprovider-core-type.md b/docs/architecture/decisions/036-credentialprovider-core-type.md new file mode 100644 index 0000000..2f040b0 --- /dev/null +++ b/docs/architecture/decisions/036-credentialprovider-core-type.md @@ -0,0 +1,82 @@ +# ADR-036: CredentialProvider as Core Type + +## Status +Accepted + +## Context + +Alknet's `IdentityProvider` resolves **inbound** authentication: given a +credential (fingerprint or token), produce an `Identity`.
But there is no +corresponding abstraction for **outbound** credentials: how does alknet +authenticate _to_ external services (vast.ai, rustfs, gitea)? + +Without `CredentialProvider`, each service wrapper would independently solve +credential retrieval, caching, and lifecycle management. This leads to +duplicated effort and inconsistent security practices across service wrappers. + +The pattern mirrors the existing `IdentityProvider` pattern: trait in core, +default impl using simple storage, production impl using the secret service +and database. + +## Decision + +Define `CredentialProvider` trait and `CredentialSet` enum in +`alknet_core::credentials`. + +```rust +pub trait CredentialProvider: Send + Sync + 'static { + fn get_credentials(&self, service: &str) -> Option<CredentialSet>; + fn refresh_credentials(&self, service: &str) -> Option<CredentialSet>; +} + +pub enum CredentialSet { + ApiKey { header_name: String, token: String }, + Basic { username: String, password: String }, + Bearer { token: String }, + S3AccessKey { access_key: String, secret_key: String, session_token: Option }, + OidcToken { access_token: String, refresh_token: Option, expires_at: Option }, + Custom { scheme: String, params: HashMap }, +} +``` + +The trait is intentionally narrow. It returns credentials for a named service. +It does not try to abstract the auth mechanism itself — that stays with the +service wrapper that knows the protocol (S3 signing, OAuth2 refresh, etc.). + +Phase 1 provides `SecretStoreCredentialProvider` (reads from +`SecretProtocol::Decrypt`, holds in RAM). Phase 2+ adds +`ManagedCredentialProvider` (with `CredentialManager` for lifecycle management: +refresh, expiration, provisioning). + +`CredentialProvider` does not depend on `IdentityProvider`, though +`ManagedCredentialProvider` may use `Identity.id` for identity-bound credential +lookups. + +## Consequences + +**Positive**: Outbound auth has a unified abstraction, just as inbound auth +has `IdentityProvider`.
Service wrappers retrieve credentials through one +interface. `OperationEnv` can expose credentials through `context.env`. + +**Positive**: The `CredentialSet` enum covers all identified credential types +(API keys, bearer tokens, S3 access keys, OIDC tokens, basic auth, custom). +This is sufficient for Phases A-C. Phase D (alknet as OIDC provider) is additive. + +**Positive**: The trait in core, impl in service crate pattern is consistent +with `IdentityProvider` (trait in core, `ConfigIdentityProvider` in core, +`StorageIdentityProvider` in alknet-storage). + +**Negative**: Adds a new core type and a new module (`credentials`). But this +is the same pattern as `IdentityProvider` and `auth` — a small, narrow trait +with a clear contract. + +**Negative**: `ManagedCredentialProvider` and `CredentialManager` are Phase C +concepts. The spec should define them as future extensions, not implement them +now. + +## References + +- ADR-029 (Identity as core type — same pattern) +- [credentials.md](../credentials.md) — CredentialProvider spec +- [research/phase2/credential-provider.md](../../research/phase2/credential-provider.md) — Full analysis +- [identity.md](../identity.md) — IdentityProvider (inbound, opposite direction) \ No newline at end of file diff --git a/docs/architecture/decisions/037-api-keys-dynamic-config.md b/docs/architecture/decisions/037-api-keys-dynamic-config.md new file mode 100644 index 0000000..2b6f30c --- /dev/null +++ b/docs/architecture/decisions/037-api-keys-dynamic-config.md @@ -0,0 +1,83 @@ +# ADR-037: API Keys as DynamicConfig Auth + +## Status +Accepted + +## Context + +Alknet's token auth uses Ed25519-signed `AuthToken`s — the same key material +used for SSH auth. This is appropriate for interactive clients (browsers, CLI) +that can generate and sign Ed25519 key pairs. + +But for service accounts, automation, and simple integrations, Ed25519 key +pairs are inconvenient. 
A dashboard backend, a CI/CD pipeline, or a monitoring +script needs a simple bearer token that can be stored in an environment variable +or config file without managing cryptographic key pairs. + +The HTTP interface (Phase 2+) requires bearer token auth for `Authorization: +Bearer ` headers. `AuthToken` works but requires client-side Ed25519 +signing. API keys offer a simpler alternative: short bearer tokens verified by +SHA-256 hash lookup, with optional scope restrictions and TTL. + +## Decision + +Add `[[auth.api_keys]]` section to `DynamicConfig`: + +```toml +[[auth.api_keys]] +prefix = "alk_" +hash = "sha256:abc..." +scopes = ["relay:connect", "secrets:derive"] +description = "dashboard service account" +ttl = "30d" # optional +``` + +`ConfigIdentityProvider::resolve_from_token()` handles both token types: +- If the input starts with the configured prefix (default `alk_`), treat it as + an API key: hash it with SHA-256 and look up the hash in the `api_keys` table. +- Otherwise, treat it as an `AuthToken`: decode, verify Ed25519 signature, + check timestamp, resolve from `authorized_keys`. + +Both paths produce the same `Identity` result. In database-backed deployments, +both resolve to the same account UUID. + +API keys are stored as SHA-256 hashes (like password hashing — the cleartext +key is never stored, only its hash). The prefix enables O(1) routing between +AuthToken and API key verification without trying both paths. + +The full key is provided to the client exactly once (at creation time). Subsequent +verifications only compare hashes. + +## Consequences + +**Positive**: Simple bearer token auth for HTTP and other non-SSH interfaces. +No cryptographic key management for service accounts. Consistent with industry +practice (Stripe, GitHub, AWS all use prefixed API keys). + +**Positive**: Both AuthTokens and API keys go through `resolve_from_token()`. +The caller doesn't need to know which type they're using. 
This keeps the +authentication layer unified. + +**Positive**: Scoped API keys enable fine-grained access control for service +accounts. A monitoring tool gets `["monitoring:read"]`, not full access. + +**Negative**: API keys are bearer tokens — anyone who obtains the key has the +associated permissions. The hash storage and optional TTL mitigate but do not +eliminate this risk. Ed25519 AuthTokens remain the preferred auth method for +interactive clients. + +**Negative**: API key rotation requires updating `DynamicConfig` (or the +`api_keys` database table). The `ConfigReloadHandle` / `ConfigService` reload +mechanism handles this, but it's a deliberate operation, not automatic. + +**Negative**: No rate limiting on API key verification is built into this ADR. +Rate limiting on the HTTP interface is a separate concern. + +## References + +- ADR-023 (unified auth, shared key material) +- ADR-029 (Identity as core type) +- ADR-030 (static/dynamic config split) +- [auth.md](../auth.md) — Token auth, AuthPolicy, API keys +- [configuration.md](../configuration.md) — DynamicConfig, AuthPolicy +- [research/phase2/interface-model.md](../../research/phase2/interface-model.md) — API keys in config \ No newline at end of file diff --git a/docs/architecture/definitions.md b/docs/architecture/definitions.md new file mode 100644 index 0000000..298ee20 --- /dev/null +++ b/docs/architecture/definitions.md @@ -0,0 +1,226 @@ +--- +status: draft +last_updated: 2026-06-09 +--- + +# Definitions: Terminology and Concept Disambiguation + +## Purpose + +Several terms are overloaded across alknet's architecture. This document defines +each term precisely and states the rule for using it in architecture specs. When +ambiguity is possible, specs must use the full qualifier. + +This is a normative reference — other architecture documents link here rather +than repeating definitions inline. 
+ +## Term Definitions + +### Interface (Layer 2) + +An **Interface** consumes a Transport stream (Layer 1) or manages its own +transport, and produces call protocol sessions or handles discrete requests. +It is a _protocol parser_, not a network service. + +Two subtypes: + +| Subtype | Trait | Lifecycle | Transport ownership | Examples | +|---------|-------|-----------|---------------------|----------| +| `StreamInterface` | `accept(stream) → Session` | Long-lived session | Provided by caller | SshInterface, RawFramingInterface | +| `MessageInterface` | `handle_request(req) → Response` | Stateless per-request | Self-managed | HttpInterface, DnsInterface | + +**Rule**: In alknet architecture docs, "Interface" (capitalized) refers to +Layer 2. Rust trait definitions use "trait" or "contract." Network URLs use +"endpoint." When discussing auth mechanisms per transport/interface pair, use +"credential presentation" (not "auth interface"). + +See: [interface.md](interface.md), ADR-035. + +### Transport (Layer 1) + +A **Transport** produces a byte stream (`AsyncRead + AsyncWrite + Unpin + Send`). +It is a _wire mechanism_, not a protocol. `TransportKind` enumerates: +`Tcp`, `Tls`, `Iroh`, `WebTransport`. + +DNS is **not** a transport — it is a `MessageInterface` that manages its own +transport (UDP/TCP port 53). + +**Rule**: Never use "transport" to refer to HTTP, DNS, or any protocol that +doesn't produce a `TransportStream`. Use "MessageInterface" instead. + +See: [transport.md](transport.md), ADR-026, ADR-035. + +### Service (irpc service) + +An **irpc service** is an in-cluster, Rust-to-Rust service defined by an irpc +protocol enum. Dispatched by enum variant with postcard serialization. Examples: +`AuthProtocol`, `SecretProtocol`, `ConfigProtocol`. + +**Rule**: Always qualify: "irpc service" (in-cluster, enum-dispatched), +"application service" (operation-registered handler), or "external service" +(third-party endpoint). 
Never use bare "service" in architecture docs. + +See: [services.md](services.md), ADR-028, ADR-033. + +### Operation (call protocol) + +An **operation** is a path-based handler registered in `OperationRegistry`, +dispatched by `namespace + name`. Cross-node, cross-language, JSON +`EventEnvelope` framing. + +**Rule**: Use "operation" for call protocol handlers. Use "irpc service method" +for enum-dispatched calls. These are different dispatch mechanisms unified by +OperationEnv. + +See: [call-protocol.md](call-protocol.md), ADR-033. + +### Identity (core type) + +The `Identity` struct `{ id, scopes, resources }` represents an authenticated +principal. Produced by `IdentityProvider` (inbound auth resolution). + +| Identity field | Config-backed auth | Database-backed auth | +|---------------|-------------------|---------------------| +| `id` | SSH key fingerprint | Account UUID | +| `scopes` | From authorized_keys entry | From peer_credentials + ACL | +| `resources` | From authorized_keys entry | From organization membership | + +**Rule**: "Identity" (capitalized, code font) = the alknet struct. "identity +service" = a full identity management system (Keystone, etc.). Never conflate +the two. + +See: [identity.md](identity.md), ADR-029. + +### IdentityProvider (inbound auth) + +`IdentityProvider` resolves **inbound** authentication: given a credential +(fingerprint or token), produce an `Identity`. + +**Direction**: Inbound (who is calling alknet). + +**Rule**: Never use "IdentityProvider" to describe outbound auth. That is +`CredentialProvider`. + +See: [identity.md](identity.md), ADR-029. + +### CredentialProvider (outbound auth) + +`CredentialProvider` resolves **outbound** credentials: given a service name, +produce a `CredentialSet` for authenticating _to_ that service. + +**Direction**: Outbound (how alknet calls others). + +**Rule**: Never use "CredentialProvider" for inbound auth. That is +`IdentityProvider`. 
+ +See: [credentials.md](credentials.md), ADR-036. + +### AuthToken + +`AuthToken = base64url(key_id || timestamp || signature)` — an Ed25519-signed +timestamp token used for non-SSH auth. Self-signed by the client, verified +server-side. + +**Rule**: Use "AuthToken" (capitalized) for this specific format. Use "API key" +for hash-verified bearer tokens. Never use bare "token" in architecture docs. + +See: [auth.md](auth.md), ADR-023. + +### API Key + +A hash-verified bearer token with a prefix like `alk_...`. Simpler than +AuthToken (no Ed25519 key pair needed). Stored as SHA-256 hash in +`DynamicConfig.auth.api_keys` or `api_keys` table. + +**Rule**: Always "API key" (two words) for hash-verified bearer tokens. +"AuthToken" for Ed25519-signed tokens. + +See: [auth.md](auth.md), ADR-037. + +### Domain Event vs Integration Event + +| Type | Scope | Serialization | Example | +|------|-------|---------------|---------| +| Domain event | Within a service boundary | Any format (Honker streams) | `KeyRotated`, `InventoryAdjusted` | +| Integration event | Across service or node boundaries | JSON `EventEnvelope` | `call.requested`, `UserCreated` | + +irpc service calls are synchronous request-response, not events. + +**Rule**: "Domain event" for internal Honker streams. "Integration event" for +call protocol `EventEnvelope`. "irpc call" for synchronous in-cluster calls. +Per ADR-032, domain events never cross service boundaries without projection. + +See: ADR-032, [services.md](services.md). + +### Scope + +A permission string attached to an `Identity`. Flat strings like +`"relay:connect"`, `"secrets:derive"`. Used by `ForwardingPolicy` and +operation-level ACL. + +**Rule**: Use "scope" for `Identity.scopes` flat strings. Use "resource" for +`Identity.resources` entries. Do not conflate with hierarchical role models +unless explicitly noting a comparison to Keystone. + +See: [identity.md](identity.md), ADR-031. 
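
Two rules from the definitions above, prefix routing between API keys and AuthTokens and flat scope matching, can be sketched as below. This is an illustration only: `TokenKind`, `classify_token`, and `has_scope` are hypothetical names, the field types of `Identity` are assumed to be plain strings, and the actual SHA-256 lookup and Ed25519 verification are left out.

```rust
/// Which verification path a presented bearer value should take.
#[derive(Debug, PartialEq)]
pub enum TokenKind {
    /// Hash-verified API key (prefixed, e.g. "alk_").
    ApiKey,
    /// Ed25519-signed AuthToken (everything else).
    AuthToken,
}

/// Route by prefix so only one verification path is attempted.
pub fn classify_token(token: &str, api_key_prefix: &str) -> TokenKind {
    // An empty prefix would match every token, so treat it as "no API keys".
    if !api_key_prefix.is_empty() && token.starts_with(api_key_prefix) {
        TokenKind::ApiKey
    } else {
        TokenKind::AuthToken
    }
}

/// Assumed shape of the core struct; field types are not given in the spec.
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: Vec<String>,
}

impl Identity {
    /// Flat scopes: an exact string match, with no hierarchy or implied roles.
    pub fn has_scope(&self, required: &str) -> bool {
        self.scopes.iter().any(|s| s == required)
    }
}
```

Because scopes are flat, `"relay"` does not imply `"relay:connect"`; any implied-scope resolution would be layered on top, as the recommendation for OQ-DEF-03 suggests.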
+ +### OperationRegistry + +The central registry mapping `(namespace, operation_name)` to handlers and +specs. All interfaces resolve to the same registry. + +**Rule**: "OperationRegistry" for this specific data structure. "Service +catalog" only when explicitly comparing to Keystone or similar external systems. + +See: [call-protocol.md](call-protocol.md), ADR-025. + +### Credential Presentation + +The mechanism by which credentials are presented on each (Transport, Interface) +pair: + +| (Transport, Interface) | Credential presentation | Resolves via | +|----------------------|----------------------|-------------| +| (TLS, SSH) | SSH key handshake | `resolve_from_fingerprint()` | +| (TCP, SSH) | SSH key handshake | `resolve_from_fingerprint()` | +| (iroh, SSH) | SSH key handshake | `resolve_from_fingerprint()` | +| (TLS, raw framing) | AuthToken in frame header | `resolve_from_token()` | +| (TCP, raw framing) | AuthToken in frame header | `resolve_from_token()` | +| (WebTransport, raw framing) | AuthToken in CONNECT request | `resolve_from_token()` | +| (—, HTTP) | `Authorization: Bearer` header | `resolve_from_token()` | +| (—, DNS) | AuthToken in query labels | `resolve_from_token()` | + +**Rule**: Use "credential presentation" for the mechanism of presenting +credentials on a specific (Transport, Interface) pair. Not "auth interface" +(which overloads "Interface"). + +See: [auth.md](auth.md), [interface.md](interface.md). + +## Cross-cutting Open Questions + +These questions affect multiple specs and need resolution before or during +Phase 2 implementation: + +- **OQ-DEF-03**: Should `Identity.scopes` be hierarchical (Keystone implied roles) + or stay flat? Recommendation: Stay flat. Add implied scope resolution in + alknet-storage when multi-tenant deployment requires it. + +- **OQ-DEF-07**: Should the on-chain `IdentityProvider` be a separate impl or a + `CredentialProvider` extension? 
Recommendation: Separate `IdentityProvider` + impl (`OnChainIdentityProvider`). `IdentityProvider` resolves inbound auth, + not outbound credentials. + +- **OQ-DEF-08**: Should "credential presentation" replace overloaded "interface" in + auth contexts? Recommendation: Yes. Adopted in this document. + +See: [open-questions.md](open-questions.md) for tracking. + +## References + +- [interface.md](interface.md) — StreamInterface / MessageInterface +- [auth.md](auth.md) — AuthToken, credential presentation per interface +- [identity.md](identity.md) — Identity, IdentityProvider +- [credentials.md](credentials.md) — CredentialProvider, CredentialSet +- [services.md](services.md) — irpc services vs application services +- [call-protocol.md](call-protocol.md) — Operations, OperationEnv +- [research/phase2/definitions.md](../research/phase2/definitions.md) — Full research with cross-domain mappings \ No newline at end of file diff --git a/docs/architecture/interface.md b/docs/architecture/interface.md index 0cde1d1..7d0a7c1 100644 --- a/docs/architecture/interface.md +++ b/docs/architecture/interface.md @@ -1,6 +1,6 @@ --- status: draft -last_updated: 2026-06-07 +last_updated: 2026-06-09 --- # Interface (Layer 2) @@ -8,24 +8,33 @@ last_updated: 2026-06-07 ## What The Interface layer sits between Transport (Layer 1) and Protocol (Layer 3). -An Interface consumes a `Transport::Stream` and produces call protocol sessions. -SSH is an interface, not a transport — it wraps a byte stream in session -semantics. Raw framing (4-byte length prefix + JSON `EventEnvelope`) is another -interface, one without SSH overhead. +Interfaces consume byte streams from Transports or manage their own transports, +and produce call protocol sessions or handle discrete requests. SSH is an +interface, not a transport — it wraps a byte stream in session semantics. Raw +framing (4-byte length prefix + JSON `EventEnvelope`) is another interface. 
+HTTP and DNS are message-based interfaces that handle individual request/response +pairs without persistent sessions. ## Why -In the current architecture, SSH is deeply embedded in `ServerHandler`. This -tangling of transport, interface, and protocol makes it impossible to: +In the original architecture, SSH was deeply embedded in `ServerHandler`. This +tangling of transport, interface, and protocol made it impossible to: - Run the call protocol over DNS queries without wrapping SSH inside DNS - Use raw framing for local service mesh (no SSH overhead) - Support WebTransport direct call protocol for browsers - Separate auth mechanics from channel management +- Accept HTTP requests and map them to call protocol operations The three-layer model (ADR-026) cleanly separates these concerns. Transport -produces bytes. Interface parses bytes into sessions. Protocol carries -semantics. A connection is always a (Transport, Interface) pair. +produces bytes. Interface parses bytes into sessions or handles requests. +Protocol carries semantics. A connection is always a (Transport, Interface) +pair for stream-based interfaces, or a standalone message-based interface. + +Phase 2 research identified that HTTP and DNS don't fit the persistent session +model — they're stateless per-request. This led to the StreamInterface / +MessageInterface split (ADR-035), which gives each interface category its own +trait with the right lifecycle and ownership model. ## Architecture @@ -33,37 +42,103 @@ semantics. A connection is always a (Transport, Interface) pair. ``` Layer 3: Protocol (Call protocol, Operations, OperationEnv) -Layer 2: Interface (SSH, raw framing, HTTP/WS, DNS control channel) -Layer 1: Transport (TCP, TLS, iroh, DNS, WebTransport) +Layer 2: Interface (StreamInterface: SSH, raw framing | MessageInterface: HTTP, DNS) +Layer 1: Transport (TCP, TLS, iroh, WebTransport) ``` - **Layer 1: Transport** — produces byte streams (`AsyncRead + AsyncWrite + Unpin - + Send`). 
Unchanged per ADR-001. -- **Layer 2: Interface** — consumes a `Transport::Stream` and produces call - protocol sessions. SSH does handshake + auth + channel multiplexing. Raw - framing does length-prefix parsing. + + Send`). Unchanged per ADR-001. DNS is NOT a transport. +- **Layer 2: Interface** — two categories: + - **StreamInterface**: consumes a `TransportStream` and produces a long-lived + session that yields `InterfaceEvent` frames. + - **MessageInterface**: handles individual `InterfaceRequest` → + `InterfaceResponse` pairs. Manages its own transport. - **Layer 3: Protocol** — carries semantics. Call protocol events, operation registry, service calls. Agnostic to both Transport and Interface below it. -### Interface Trait +### StreamInterface Trait ```rust #[async_trait] -pub trait Interface: Send + Sync + 'static { - type Session; - async fn accept(stream: TransportStream, config: &InterfaceConfig) -> Result; +pub trait StreamInterface: Send + Sync + 'static { + type Session: InterfaceSession; + + async fn accept( + &self, + stream: Box, + config: &InterfaceConfig, + ) -> Result; } ``` -The session produced by an interface is consumed by the call protocol handler. -Different interfaces produce different session types, but the call protocol -handler receives `EventEnvelope` frames from any interface. +The session produced by a `StreamInterface` is consumed by the call protocol +handler. Different stream interfaces produce different session types, but the +call protocol handler receives `InterfaceEvent` frames from any stream +interface. -### SshInterface +### MessageInterface Trait -Wraps the existing `ServerHandler` logic. This is the most complex interface -because SSH provides channel multiplexing, auth negotiation, and proxy -management within a single session. 
+```rust +#[async_trait] +pub trait MessageInterface: Send + Sync + 'static { + async fn handle_request(&self, request: InterfaceRequest) -> Result; +} +``` + +Message-based interfaces handle individual requests without persistent sessions. +They manage their own transport (HTTP server, DNS server) and normalize requests +into `InterfaceRequest` / `InterfaceResponse`. + +### InterfaceRequest / InterfaceResponse + +```rust +pub struct InterfaceRequest { + pub operation_path: String, // e.g., "/head/auth/verify" + pub input: Value, // JSON input payload + pub auth_token: Option, // Extracted from wire format + pub metadata: HashMap, +} + +pub struct InterfaceResponse { + pub result: Result, + pub status: u16, // HTTP status, DNS result code, etc. + pub headers: HashMap, +} +``` + +The call protocol handler processes `InterfaceRequest` the same way it processes +`InterfaceEvent` frames — both resolve to operation invocations through +`OperationEnv`. The difference is framing: stream interfaces produce `InterfaceEvent` +frames from a continuous byte stream, message interfaces construct `InterfaceRequest` +from their wire format. + +### InterfaceSession + +Every stream interface session implements `InterfaceSession`: + +```rust +pub struct InterfaceEvent { + pub envelope: EventEnvelope, + pub identity: Option, +} + +#[async_trait] +pub trait InterfaceSession: Send { + async fn recv(&mut self) -> Option; + async fn send(&mut self, envelope: EventEnvelope) -> Result<()>; +} +``` + +`InterfaceEvent` carries an `EventEnvelope` and the authenticated `Identity`. +The call protocol handler (Layer 3) receives `InterfaceEvent` frames and +processes them uniformly, regardless of whether they arrived over SSH or raw +framing. + +### SshInterface (StreamInterface) + +Wraps the existing `ServerHandler` logic. This is the most complex stream +interface because SSH provides channel multiplexing, auth negotiation, and +proxy management within a single session. 
What stays in SshInterface (Layer 2): - SSH handshake and session management @@ -79,7 +154,11 @@ What moves to Layer 3 (call protocol handler): What moves to per-connection state: - Port forwarding proxy logic -### RawFramingInterface +**Current implementation note**: `SshSession::recv()` and `SshSession::send()` +are stubs. The bridge from SSH channels to `InterfaceEvent` frames is +scheduled for Phase 2 implementation (see integration-plan.md Phase 2.1). + +### RawFramingInterface (StreamInterface) Reads 4-byte big-endian length prefix + JSON `EventEnvelope` frames directly from the transport stream. No SSH wrapping. No channel multiplexing — the @@ -88,134 +167,210 @@ entire stream is a single call protocol channel. ```rust pub struct RawFramingInterface; -impl Interface for RawFramingInterface { +impl StreamInterface for RawFramingInterface { type Session = RawFramingSession; // Reads length-prefixed EventEnvelope frames from the stream } ``` Used for: -- DNS control channel (DNS transport + raw framing) - Local service mesh (TCP + raw framing, no SSH overhead) -- Browser direct call protocol (WebTransport + raw framing, future) +- Secure mesh (TLS + raw framing) +- WebTransport direct call protocol (future: WebTransport + raw framing) -### DNS Control Channel +Auth for raw framing: `AuthToken` in frame header, resolved via +`IdentityProvider::resolve_from_token()`. -A (DNS transport, raw framing interface) pair. The DNS transport encodes -`EventEnvelope` frames as DNS query/response pairs. The raw framing interface -parses them directly — **NOT** SSH inside DNS. +**Current implementation note**: `RawFramingInterface::accept()` returns an +error. Frame reading/writing is scheduled for Phase 2 implementation (see +integration-plan.md Phase 2.2). 
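
The frame layout described for `RawFramingInterface` (a 4-byte big-endian length prefix followed by a JSON `EventEnvelope`) can be sketched with blocking `std::io` as below. This is a minimal illustration, not the alknet implementation: `read_frame`, `write_frame`, and the `MAX_FRAME_LEN` cap are hypothetical, the payload is treated as opaque bytes rather than a parsed `EventEnvelope`, and the real interface would be async.

```rust
use std::io::{self, ErrorKind, Read, Write};

/// Hypothetical upper bound on one frame; the spec does not define a limit.
const MAX_FRAME_LEN: usize = 1 << 20;

/// Write one frame: 4-byte big-endian length, then the payload bytes.
pub fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN {
        return Err(io::Error::new(ErrorKind::InvalidInput, "frame too large"));
    }
    let len = payload.len() as u32;
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)
}

/// Read one frame. Returns Ok(None) when the stream ends cleanly between frames.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut len_buf = [0u8; 4];

    // Read the first prefix byte on its own so a clean end of stream
    // can be told apart from a truncated length prefix.
    loop {
        match r.read(&mut len_buf[..1]) {
            Ok(0) => return Ok(None),
            Ok(_) => break,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    r.read_exact(&mut len_buf[1..])?;

    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_FRAME_LEN {
        // Refuse to allocate for an oversized or corrupt length prefix.
        return Err(io::Error::new(ErrorKind::InvalidData, "frame too large"));
    }

    let mut payload = vec![0u8; len];
    r.read_exact(&mut payload)?;
    Ok(Some(payload))
}
```

Bounding the length before allocating is the main design point: the prefix comes from the peer, so an unchecked value would let a single frame request up to 4 GiB of memory.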
+ +### HttpInterface (MessageInterface) + +Accepts standard HTTP requests and maps them to call protocol operations: ``` -Client: Encode EventEnvelope as base32 DNS query labels - → DNS Transport → DNS Server → Raw Framing Interface → Call Protocol Handler - -Server: Return EventEnvelope as DNS TXT record response - ← Raw Framing Interface ← DNS Transport ← Call Protocol Handler +POST /v1/{namespace}/{op} → registry.invoke(namespace, op, input) (mutation) +GET /v1/{namespace}/{op} → registry.invoke(namespace, op, input) (query) +GET /v1/{namespace}/{op} SSE → registry.subscribe(namespace, op, input) (subscription) +GET /v1/schema → registry.list_operations() ``` -### Valid (Transport, Interface) Pairs +Auth: `Authorization: Bearer ` header, resolved via +`IdentityProvider::resolve_from_token()`. Both AuthTokens and API keys are +accepted. -| Transport | Interface | Use case | -|-----------|-----------|----------| -| TLS | SSH | Standard alknet tunnel | -| TCP | SSH | Plain SSH tunnel | -| iroh | SSH | P2P SSH tunnel | -| DNS | raw framing | DNS control channel | -| WebTransport | SSH | Browser SSH tunnel (future) | -| WebTransport | raw framing | Browser call protocol (future) | -| TCP | raw framing | Direct call protocol, local mesh | +The HTTP interface runs inside the existing stealth mode byte-peek architecture: +after a TLS handshake, the server peeks at the first bytes. If they're +`SSH-2.0-`, the stream goes to `SshInterface`. Otherwise, the stream goes to +the axum HTTP router. -### InterfaceConfig +**Phase 2 scope**: Auth middleware, stealth handoff, and default 404 handler +only. Specific operation routes and path conventions are Phase 5+. The +`ListenerConfig::Http` variant spawns an axum router that reaches auth context; +routing inside axum is a later concern. -Different interfaces require different configuration: +### DnsInterface (MessageInterface) + +A DNS server that encodes/decodes `EventEnvelope` frames as DNS query/response +pairs. 
AuthToken is embedded in DNS query labels. Resolution via +`IdentityProvider::resolve_from_token()`. + +This is a `MessageInterface` — it manages its own transport (UDP/TCP port 53) +and handles individual DNS queries as request/response pairs. DNS is NOT a +transport. + +**Phase**: DNS interface implementation is Phase 5+. The `ListenerConfig::Dns` +variant and `DnsInterface` stub are defined now; implementation is deferred. + +### Stream-Based Interface Pairs + +| Transport | StreamInterface | Credential Presentation | Use case | +|-----------|---------------|------------------------|----------| +| TLS | SshInterface | SSH key handshake | Standard alknet tunnel | +| TCP | SshInterface | SSH key handshake | Plain SSH tunnel | +| iroh | SshInterface | SSH key handshake | P2P SSH tunnel | +| TCP | RawFramingInterface | AuthToken in frame header | Local service mesh | +| TLS | RawFramingInterface | AuthToken in frame header | Secure mesh | +| WebTransport | RawFramingInterface | AuthToken in CONNECT request | Browser call protocol (future) | + +### Message-Based Interface Pairs + +| MessageInterface | Credential Presentation | Owns transport? | Use case | +|-----------------|------------------------|----------------|----------| +| HttpInterface | `Authorization: Bearer` header | Yes (axum) | REST API, dashboard, integrations | +| DnsInterface | AuthToken in query labels | Yes (DNS server) | Censorship-resistant control channel | +| WebSocketInterface | AuthToken in handshake | Yes (WS server) | Browser persistent connection (future) | + +Message-based interfaces manage their own transport. They don't need a +`Transport` from Layer 1 — they ARE the transport+interface combined. 
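Editorial sketch (not part of the patch): the stealth-mode byte-peek described under HttpInterface, where a stream whose first bytes are `SSH-2.0-` goes to `SshInterface` and anything else goes to the HTTP router, can be illustrated as a pure classification function. The names `Handoff` and `classify_peek` are hypothetical, and the `NeedMoreBytes` case for a short peek is an assumption of this sketch; the spec only states the two-way routing.

```rust
/// Where a freshly accepted (post-TLS) stream should be handed off.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Handoff {
    /// First bytes are the SSH identification prefix: route to SshInterface.
    Ssh,
    /// Anything else: route to the HTTP router.
    Http,
    /// Too few bytes peeked to decide (assumption: the caller peeks again).
    NeedMoreBytes,
}

const SSH_BANNER_PREFIX: &[u8] = b"SSH-2.0-";

/// Classify a stream from the bytes peeked so far, without consuming them.
pub fn classify_peek(peeked: &[u8]) -> Handoff {
    if peeked.len() >= SSH_BANNER_PREFIX.len() {
        if peeked.starts_with(SSH_BANNER_PREFIX) {
            Handoff::Ssh
        } else {
            Handoff::Http
        }
    } else if SSH_BANNER_PREFIX.starts_with(peeked) {
        // Everything seen so far matches the banner, but it is incomplete.
        Handoff::NeedMoreBytes
    } else {
        // Already diverged from the banner, so it cannot be SSH.
        Handoff::Http
    }
}
```

Keeping the decision free of I/O makes it easy to test independently of the accept loop; the caller would obtain `peeked` from whatever non-consuming read the TLS stream wrapper provides.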
+ +### ListenerConfig + +The server's accept loop configuration covers both stream and message interfaces: ```rust -pub enum InterfaceConfig { - Ssh(SshInterfaceConfig), - RawFraming(RawFramingConfig), +pub enum ListenerConfig { + Stream { + transport: TransportKind, + interface: StreamInterfaceKind, + }, + Http { + bind_addr: SocketAddr, + tls: bool, + stealth: bool, // byte-peek protocol detection on shared port + }, + Dns { + bind_addr: SocketAddr, + tls: bool, + }, } -pub struct SshInterfaceConfig { - pub auth: Arc, - pub forwarding: Arc>, // for ForwardingPolicy - pub host_key: Arc, +pub enum StreamInterfaceKind { + Ssh, + RawFraming, } -pub struct RawFramingConfig { - // No SSH-specific config needed - // Auth is handled by the transport layer (e.g., token auth for WebTransport) - // or by the call protocol layer +pub enum TransportKind { + Tcp, + Tls { server_name: Option }, + Iroh { endpoint_id: String }, + WebTransport, // Phase 5+: tag only, no acceptor yet } ``` -### Auth Across Interfaces +Note: `TransportKind::Dns` does NOT exist. DNS is a `MessageInterface`, not a +transport. The `ListenerConfig::Dns` variant handles DNS listener configuration +directly. -- **SshInterface**: Auth happens during SSH handshake via - `IdentityProvider::resolve_from_fingerprint()`. The authenticated `Identity` - is attached to the session. -- **RawFramingInterface**: Auth is handled by the transport (e.g., token auth - for WebTransport via `IdentityProvider::resolve_from_token()`) or by the call - protocol layer (operation-level ACL). +### Credential Presentation Across Interfaces -Both paths produce the same `Identity` type (ADR-029). 
+Every interface resolves to the same `Identity` through `IdentityProvider`: + +``` +SSH fingerprint → IdentityProvider::resolve_from_fingerprint → Identity +AuthToken (Bearer) → IdentityProvider::resolve_from_token → Identity +API key (Bearer) → IdentityProvider::resolve_from_token → Identity +DNS embedded token → IdentityProvider::resolve_from_token → Identity +``` + +The credential presentation differs per (Transport, Interface) pair, but the +resolution result is always an `Identity`. See [definitions.md](definitions.md) +for the full table and terminology rules. ### Server Accept Loop -With the Interface trait, the accept loop becomes: +With both stream and message interfaces, the accept loop becomes: ```rust for listener in listeners { - let (transport, interface) = listener; - tokio::spawn(async move { - loop { - let stream = transport.accept().await?; - let session = interface.accept(stream, &config).await?; - // session produces call protocol events - // call protocol handler is interface-agnostic + match listener { + ListenerConfig::Stream { transport, interface } => { + // Spawn accept loop: transport.accept() → interface.accept(stream) } - }); + ListenerConfig::Http { bind_addr, tls, stealth } => { + // Spawn axum HTTP server on bind_addr + // If stealth: byte-peek after TLS, route SSH vs HTTP + } + ListenerConfig::Dns { bind_addr, tls } => { + // Spawn DNS server on bind_addr + } + } } ``` ## Constraints -- The Interface trait must accommodate both SSH's channel multiplexing and raw - framing's single-stream model through the same abstraction. -- `SshInterface` is the most invasive refactoring in Phase 1. The existing - `ServerHandler` owns auth, channel management, and proxy logic — extracting - these cleanly requires careful design (integration-plan, Phase 1.8). -- DNS transport implementation is Phase 4 work. The `TransportKind::Dns` variant - and `RawFramingInterface` are defined now; implementation is deferred. -- WebTransport is Phase 4 work. 
The `TransportKind::WebTransport` variant is a - tag only for now. +- `StreamInterface` and `MessageInterface` are independent traits with different + signatures, lifecycles, and transport ownership. No common super-trait (ADR-035). +- `SshInterface` is the most invasive refactoring. The existing `SshHandler` + owns auth, channel management, and proxy logic — extracting these cleanly + requires careful design (integration-plan Phase 1.8, completed in Phase 1). +- DNS interface implementation is Phase 5+ work. `DnsInterface` is defined as a + `MessageInterface` stub; implementation is deferred. +- HTTP interface Phase 2 scope is limited to auth middleware and stealth handoff. + Specific operation routes are Phase 5+. +- WebTransport is Phase 5+ work. `TransportKind::WebTransport` is a tag only + for now. +- `TransportKind::Dns` does not exist. DNS is a `MessageInterface`, not a + transport. This was `TransportKind` enum pollution from an earlier design. +- The `Interface` trait (singular) in the current codebase needs to be renamed + to `StreamInterface`. This is a rename, not a semantic change. ## Open Questions -- **OQ-IF-01**: How does the `Interface` session type relate to the call - protocol's `EventEnvelope` stream? Does every session implement - `Stream`? This needs design during Phase 1.8. +- **OQ-IF-02**: ~~Should `SshInterface` own the `ForwardingPolicy` check for + `channel_open_direct_tcpip`, or should that move to Layer 3?~~ **Resolved**: + ForwardingPolicy is Layer 3, but channel open/close lifecycle is Layer 2. + SshInterface reports channel requests to Layer 3; Layer 3 applies policy. -- **OQ-IF-02**: Should `SshInterface` own the `ForwardingPolicy` check for - `channel_open_direct_tcpip`, or should that move to Layer 3? Current thinking: - the forwarding check is a Layer 3 concern (it's policy, not session mechanics), - but the channel open/close lifecycle is Layer 2.
The Interface reports channel - open requests to Layer 3; Layer 3 applies `ForwardingPolicy` and tells - Layer 2 whether to proxy. +- **OQ-P2-01**: Should `MessageInterface` and `StreamInterface` share a common + trait? **Recommendation**: No. Independent traits with different signatures, + lifecycles, and transport ownership. A common super-trait adds complexity + without clear benefit. (See ADR-035.) + +- **OQ-P2-02**: Should the HTTP interface share a port with the SSH listener? + **Recommendation**: Start with separate ports. ALPN multiplexing on port 443 + is a future optimization that doesn't change the interface abstraction. + Stealth mode byte-peek already handles shared-port detection for the common + case. ## Design Decisions | ADR | Decision | Summary | |-----|----------|---------| | [026](decisions/026-transport-interface-separation.md) | Three-layer model | SSH is Layer 2, not Layer 1 | +| [035](decisions/035-streaminterface-messageinterface-split.md) | StreamInterface / MessageInterface | Two trait categories at Layer 2 | | [033](decisions/033-operationenv-irpc-call-protocol.md) | OperationEnv | Protocol is interface-agnostic | | [029](decisions/029-identity-core-type.md) | Identity as core type | Auth resolution across interfaces | | [031](decisions/031-forwarding-policy.md) | Forwarding policy | Layer 3 policy applied to Layer 2 channel requests | ## References -- [research/integration-plan.md](../research/integration-plan.md) — Phase 1.8, valid (Transport, Interface) pairs -- [research/core.md](../research/core.md) — DNS transport, three-layer model -- [ADR-026](decisions/026-transport-interface-separation.md) — Transport/interface separation +- [definitions.md](definitions.md) — Terminology disambiguation, credential presentation +- [research/phase2/interface-model.md](../research/phase2/interface-model.md) — Full StreamInterface/MessageInterface analysis +- [research/phase2/tls-transport.md](../research/phase2/tls-transport.md) — HTTP interface, 
stealth handoff, ListenerConfig +- [research/integration-plan.md](../research/integration-plan.md) — Phase 1.8, Phase 2.1-2.7 - [transport.md](transport.md) — Transport trait (unchanged at Layer 1) -- [server.md](server.md) — Current ServerHandler (will become SshInterface) +- [auth.md](auth.md) — Credential presentation per (Transport, Interface) pair - [identity.md](identity.md) — IdentityProvider, auth across interfaces \ No newline at end of file diff --git a/docs/architecture/open-questions.md b/docs/architecture/open-questions.md index 458fa29..7fb5cc8 100644 --- a/docs/architecture/open-questions.md +++ b/docs/architecture/open-questions.md @@ -237,14 +237,95 @@ last_updated: 2026-06-07 ### OQ-IF-01: How does the Interface session type relate to the call protocol's EventEnvelope stream? - **Origin**: [interface.md](interface.md) -- **Status**: open -- **Priority**: high -- **Resolution**: (pending — needs design during Phase 1.8 implementation) -- **Cross-references**: [interface.md](interface.md), [ADR-026](decisions/026-transport-interface-separation.md) +- **Status**: resolved +- **Priority**: ~~high~~ — +- **Resolution**: `InterfaceSession::recv()` returns `Option<InterfaceEvent>` where `InterfaceEvent` carries `EventEnvelope` + `Identity`. `InterfaceSession::send()` accepts `EventEnvelope`. The `SshSession` bridge implements this over the `alknet-control:0` channel. For `MessageInterface`, `InterfaceRequest`/`InterfaceResponse` normalize request/response pairs. See [interface.md](interface.md) and ADR-035. +- **Cross-references**: [ADR-035](decisions/035-streaminterface-messageinterface-split.md), [interface.md](interface.md) ### OQ-IF-02: Should SshInterface own ForwardingPolicy checks or should they move to Layer 3? - **Origin**: [interface.md](interface.md) -- **Status**: open +- **Status**: resolved +- **Priority**: ~~medium~~ — +- **Resolution**: ForwardingPolicy is Layer 3 (it's policy, not session mechanics). Channel open/close lifecycle is Layer 2.
The Interface reports channel open requests to Layer 3; Layer 3 applies ForwardingPolicy. The current `SshHandler` implementation checks policy in `channel_open_direct_tcpip`, which already delegates to `Identity.scopes` from the authenticated identity — this is consistent with the resolution. +- **Cross-references**: [ADR-031](decisions/031-forwarding-policy.md), [interface.md](interface.md) + +### OQ-P2-01: Should MessageInterface and StreamInterface share a common trait? +- **Origin**: [research/phase2/interface-model.md](../research/phase2/interface-model.md) +- **Status**: resolved - **Priority**: medium -- **Resolution**: (pending — current thinking: forwarding check is Layer 3 policy, but channel open/close lifecycle is Layer 2. The Interface reports channel open requests to Layer 3; Layer 3 applies ForwardingPolicy.) -- **Cross-references**: [interface.md](interface.md), [ADR-031](decisions/031-forwarding-policy.md) \ No newline at end of file +- **Resolution**: Independent traits. Different signatures (`handle_request` vs `accept` + session lifecycle), different transport ownership (self-managed vs provided), different lifecycles (stateless per-request vs long-lived session). A common super-trait adds complexity without benefit. See ADR-035. +- **Cross-references**: [ADR-035](decisions/035-streaminterface-messageinterface-split.md), [interface.md](interface.md) + +### OQ-P2-02: Should the HTTP interface share a port with the SSH listener? +- **Origin**: [research/phase2/interface-model.md](../research/phase2/interface-model.md) +- **Status**: resolved +- **Priority**: low +- **Resolution**: Start with separate ports. Stealth mode byte-peek on a shared port is already implemented for SSH vs HTTP detection. `ListenerConfig::Http { stealth: true }` enables the existing peek pattern. ALPN multiplexing on port 443 is a future optimization that doesn't change the interface abstraction. 
+- **Cross-references**: [interface.md](interface.md), [research/phase2/tls-transport.md](../research/phase2/tls-transport.md) + +### OQ-P2-03: Should the HTTP interface auto-generate OpenAPI specs from OperationRegistry? +- **Origin**: [research/phase2/interface-model.md](../research/phase2/interface-model.md) +- **Status**: resolved +- **Priority**: low +- **Resolution**: Yes, but Phase 5+. The HTTP interface needs to exist first (Phase 5.3 in the integration plan). `GET /v1/schema` producing an OpenAPI spec from registered `OperationSpec`s is the natural end state. This creates symmetry with `FromOpenAPI` (inbound spec consumption). +- **Cross-references**: [call-protocol.md](call-protocol.md), [interface.md](interface.md) + +### OQ-P2-04: How do self-hosted services authenticate via alknet? +- **Origin**: [research/phase2/credential-provider.md](../research/phase2/credential-provider.md), [research/phase2/definitions.md](../research/phase2/definitions.md) +- **Status**: resolved +- **Priority**: medium +- **Resolution**: Three-phase approach. Phase A: shared secret (`CredentialSet::Bearer` or `S3AccessKey`). Phase C: identity-bound credentials via `ManagedCredentialProvider`. Phase D: alknet as OIDC provider. The `CredentialProvider` trait in core enables Phase A immediately; Phases C and D are additive. +- **Cross-references**: [ADR-036](decisions/036-credentialprovider-core-type.md), [credentials.md](credentials.md) + +## Credentials + +### OQ-CP-01: Should CredentialProvider support per-identity credentials? +- **Origin**: [credentials.md](credentials.md) +- **Status**: open +- **Priority**: low +- **Resolution**: Start with service-level credentials (`get_credentials(service)`). Add identity-level resolution (`get_credentials_for(service, identity_id)`) when the need is concrete. `Identity.id` already serves as the account UUID in database-backed mode. 
+- **Cross-references**: [credentials.md](credentials.md), [ADR-036](decisions/036-credentialprovider-core-type.md) + +### OQ-CP-02: Where should OIDC provider operations live? +- **Origin**: [credentials.md](credentials.md) +- **Status**: open +- **Priority**: low +- **Resolution**: Application service (Phase D). OIDC is an application concern, not a core concern. The call protocol and OperationRegistry provide the transport; OIDC is just another set of operations. +- **Cross-references**: [credentials.md](credentials.md) + +### OQ-CP-03: How do credential rotations propagate across a cluster? +- **Origin**: [credentials.md](credentials.md) +- **Status**: open +- **Priority**: low +- **Resolution**: TBD. Likely TTL-based caching with a refresh threshold. Workers call `CredentialProvider::get_credentials()` which checks `is_expired()` and calls `refresh_credentials()` if needed. +- **Cross-references**: [credentials.md](credentials.md) + +### OQ-CP-04: Should CredentialSet include request-signing capability? +- **Origin**: [credentials.md](credentials.md) +- **Status**: resolved +- **Priority**: low +- **Resolution**: No. `CredentialSet` is pure data. Request signing (e.g., AWS Signature V4) is a separate utility function in the service wrapper or a shared `alknet-s3` crate. Credentials are data; signing is protocol behavior. +- **Cross-references**: [credentials.md](credentials.md) + +## Definitions + +### OQ-DEF-01: Should alknet adopt a "Service Catalog" concept like Keystone? +- **Origin**: [research/phase2/definitions.md](../research/phase2/definitions.md) +- **Status**: resolved +- **Priority**: low +- **Resolution**: Keep `OperationRegistry` global, check scope at invocation time. Add scope-filtered discovery (`GET /v1/schema?scope=...`) when multi-tenant deployment requires it. The unfiltered registry is sufficient for current needs. 
+- **Cross-references**: [call-protocol.md](call-protocol.md) + +### OQ-DEF-03: Should Identity.scopes be hierarchical or stay flat? +- **Origin**: [research/phase2/definitions.md](../research/phase2/definitions.md) +- **Status**: resolved +- **Priority**: low +- **Resolution**: Stay flat. Add implied scope resolution in alknet-storage when multi-tenant deployment requires it. A full policy language (like Rustfs IAM JSON policies) is Phase D territory. +- **Cross-references**: [identity.md](identity.md) + +### OQ-DEF-08: Should "credential presentation" replace "auth interface" in terminology? +- **Origin**: [research/phase2/definitions.md](../research/phase2/definitions.md) +- **Status**: resolved +- **Priority**: medium +- **Resolution**: Yes. Adopted in [definitions.md](definitions.md). Use "credential presentation" for the mechanism of presenting credentials on a (Transport, Interface) pair. Never use "auth interface" (overloads "Interface"). +- **Cross-references**: [definitions.md](definitions.md), [auth.md](auth.md) \ No newline at end of file diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md index 618d516..62d3538 100644 --- a/docs/architecture/overview.md +++ b/docs/architecture/overview.md @@ -35,17 +35,22 @@ irpc is behind a feature flag in alknet-core. 
Nodes that only do SSH tunneling d ## Three-Layer Model -Alknet uses a three-layer model (ADR-026): +Alknet uses a three-layer model (ADR-026, ADR-035): | Layer | Responsibility | Examples | |-------|---------------|----------| -| **Layer 1: Transport** | Produces byte streams (`AsyncRead + AsyncWrite + Unpin + Send`) | TCP, TLS, iroh, DNS (future), WebTransport (future) | -| **Layer 2: Interface** | Consumes a transport stream and produces call protocol sessions | SSH (handshake + auth + channel multiplexing), raw framing (length-prefix + JSON) | +| **Layer 1: Transport** | Produces byte streams (`AsyncRead + AsyncWrite + Unpin + Send`) | TCP, TLS, iroh, WebTransport (future) | +| **Layer 2: Interface** | Two categories: StreamInterface (consumes transport stream, produces session) and MessageInterface (handles discrete requests, manages own transport) | Stream: SSH, raw framing. Message: HTTP, DNS | | **Layer 3: Protocol** | Carries semantics — operation registry, service calls, events | Call protocol, OperationEnv, operation dispatch | -SSH is an interface, not a transport. The three-layer model enables DNS control channels (DNS transport + raw framing), local service mesh (TCP + raw framing), and browser direct call protocol (WebTransport + raw framing) without wrapping SSH inside those transports. +SSH is an interface, not a transport. DNS is a message interface, not a transport. +The three-layer model enables HTTP interfaces (stealth mode byte-peek), +DNS control channels, and local service mesh (raw framing) without wrapping SSH +inside those transports. -A connection is always a (Transport, Interface) pair. The protocol layer is agnostic to both. +A stream-based connection is always a (Transport, StreamInterface) pair. +Message-based interfaces manage their own transport. The protocol layer is +agnostic to both. 
## Service Layer @@ -93,15 +98,21 @@ The `alknet-core` crate exports the pluggable components for embedding or progra - `TcpTransport` — direct TCP connection - `TlsTransport` — TCP + tokio-rustls TLS - `IrohTransport` — iroh QUIC P2P connection -- `Interface` trait — consumes transport stream, produces call protocol session +- `Interface` trait → `StreamInterface` trait and `MessageInterface` trait (ADR-035) +- `InterfaceSession` trait — `recv()`/`send()` producing/consuming `InterfaceEvent` frames +- `InterfaceRequest` / `InterfaceResponse` — normalized request/response for message interfaces - `Socks5Server` — local SOCKS5 proxy that forwards through SSH channels - `PortForwarder` — manages local/remote port forwards -- `ServerHandler` — russh server handler with configurable auth and channel policies +- `ServerHandler` → `SshInterface` — russh server handler with configurable auth and channel policies - `Identity` / `IdentityProvider` — core identity types (ADR-029) +- `CredentialProvider` / `CredentialSet` — outbound credential types (ADR-036) - `OperationSpec` — operation registration for call protocol (ADR-025) +- `OperationEnv` / `OperationContext` — universal composition and operation context - `ConnectOptions` / `ServeOptions` — programmatic configuration structs -- `StaticConfig` / `DynamicConfig` — static/immutable vs. hot-reloadable config (ADR-030) +- `StaticConfig` / `DynamicConfig` — static/immutable vs. hot-reloadable config (ADR-030) - `ConfigReloadHandle` — programmatic reload of dynamic config +- `ForwardingPolicy` — rule-based allow/deny for channel targets (ADR-031) +- `ListenerConfig` — stream and message listener configuration ## Dependencies @@ -134,7 +145,7 @@ The `alknet-core` crate exports the pluggable components for embedding or progra 1. **SSH runs over transport, not alongside** — The transport layer produces a single `AsyncRead+AsyncWrite+Unpin+Send` stream.
SSH runs over that stream via `russh::client::connect_stream()` / `russh::server::run_stream()`. The SSH layer never knows what transport it's on. (ADR-001, ADR-004) -2. **Three-layer model: Transport, Interface, Protocol** — SSH is an interface (Layer 2), not a transport (Layer 1). A connection is always a (Transport, Interface) pair. The call protocol (Layer 3) is agnostic to both. This enables DNS control channels, raw framing, and WebTransport direct call protocol without wrapping SSH inside those transports. (ADR-026) +2. **Three-layer model: Transport, Interface, Protocol** — SSH is a StreamInterface (Layer 2), not a transport (Layer 1). HTTP and DNS are MessageInterfaces (Layer 2). A connection is always a (Transport, StreamInterface) pair for stream-based interfaces, or a standalone MessageInterface for message-based ones. The call protocol (Layer 3) is agnostic to both. This enables HTTP interfaces, DNS control channels, and local service mesh without wrapping SSH. (ADR-026, ADR-035) 3. **SOCKS5 is the primary client interface** — Port forwarding is built on top of SOCKS5-like channel management. For VPN-like "route all traffic" behavior, users run `tun2proxy` alongside alknet's SOCKS5 proxy. TUN is not in the project scope. 
(ADR-005, ADR-014) @@ -193,6 +204,9 @@ The `alknet-core` crate exports the pluggable components for embedding or progra | [032](decisions/032-event-boundary-discipline.md) | Event boundary | Domain events never cross service boundaries | | [033](decisions/033-operationenv-irpc-call-protocol.md) | OperationEnv | Universal composition, three dispatch paths | | [034](decisions/034-head-worker-terminology.md) | Head/worker | Replaces hub/spoke terminology | +| [035](decisions/035-streaminterface-messageinterface-split.md) | StreamInterface/MessageInterface | Two Layer 2 trait categories for stream vs message | +| [036](decisions/036-credentialprovider-core-type.md) | CredentialProvider as core type | Outbound credentials in `alknet_core::credentials` | +| [037](decisions/037-api-keys-dynamic-config.md) | API keys in DynamicConfig | Hash-verified bearer tokens for service accounts | ## Open Questions @@ -204,10 +218,12 @@ relationship). ## References - [transport.md](transport.md) — Transport abstraction (Layer 1) -- [interface.md](interface.md) — Interface layer (Layer 2) +- [interface.md](interface.md) — StreamInterface and MessageInterface (Layer 2) - [call-protocol.md](call-protocol.md) — Call protocol (Layer 3) -- [auth.md](auth.md) — Unified authentication +- [auth.md](auth.md) — Unified authentication, API keys, credential presentation - [identity.md](identity.md) — Identity and IdentityProvider +- [credentials.md](credentials.md) — CredentialProvider and CredentialSet (outbound auth) +- [definitions.md](definitions.md) — Terminology disambiguation - [configuration.md](configuration.md) — StaticConfig, DynamicConfig, ForwardingPolicy - [services.md](services.md) — irpc service layer, OperationEnv - [server.md](server.md) — Server acceptance, channel handling