greenfield: clean slate for ALPN-as-service pivot

Delete old source crates (alknet-core, alknet, alknet-napi), old
architecture docs (ADRs, specs, open questions), old research docs
(phase2, event-sourcing, feasibility, etc.), old tasks, and obsolete
reference material (gitserver/MPL, honker, nats, rustfs, polyglot,
keystone, distributed-identity).

Keep: alknet-secret (standalone, compiles), pivot docs, iroh and ssh
references, rudolfs reference (MIT/Apache, fork candidate), ops docs,
sdd_process.md, and licenses.

Previous implementation preserved at /workspace/@alkdev/alknet-main/
for reference during porting.

Workspace compiles: cargo check + 14 tests pass for alknet-secret.
2026-06-15 12:08:08 +00:00
parent d003a4f4ec
commit b5a4600d74
261 changed files with 138 additions and 53794 deletions

@@ -1,651 +0,0 @@
---
status: draft
last_updated: 2026-06-04
phase: exploration
---
# Configuration Architecture
## Terminology Change: Head/Worker
This document previously used **hub/spoke** terminology. It has been updated to **head/worker**:
- **Head node**: The coordinating node (formerly "hub"). A head can also be a worker.
- **Worker node**: A node that connects to a head and registers services (formerly "spoke").
- **Node**: Any participant in the network. Every node has an identity.
This better reflects that a head is also a worker, enabling mesh topologies.
## Problem
Alknet's configuration is loaded once at startup and never changes. This has
three specific failures:
1. **No hot reload of authentication credentials.** Adding or removing an
authorized key requires restarting the server process. In a head/worker
deployment where keys are managed via a database (see
`@alkdev/storage`'s `peer_credentials` table), the alknet process must be
restarted every time a key is added, revoked, or rotated. This is
operationally unacceptable for a production service.
2. **No port forwarding access control.** Any authenticated client can open a
`direct-tcpip` channel to any destination. There is no policy governing
which hosts, ports, or `alknet-*` control channels a client may access. This
is a security gap — a compromised key grants unrestricted network access
through the tunnel.
3. **No structured configuration beyond CLI flags.** ADR-011 chose
programmatic-first configuration for the alpha. This was correct — it
avoided cross-platform path issues and kept the API surface small. But as
alknet moves toward publishable releases, operators need config files for
reproducible deployments, and the NAPI layer needs programmatic reload
capability that the current `ServeOptions` builder pattern doesn't support.
### What's Not The Problem
- This does not propose depending on Honker, SQLite, or any specific data
source at the `alknet-core` level. The core provides a reload mechanism;
data sources plug in from outside.
- This does not propose file-watching (potential attack vector, unnecessary
complexity). CLI usage loads config once at startup. Programmatic usage
(NAPI, head node) calls reload explicitly.
- This does not replace the existing `ServeOptions` builder pattern. It
generalizes it.
## Analysis
### Static vs Dynamic Configuration
Not all configuration should be reloadable. Transport-level settings (listen
address, TLS certificates, host key) require socket/TLS renegotiation to change
at runtime — effectively a restart. Auth and forwarding policy can change
atomically without disrupting existing connections.
| Category | Examples | Reloadable? |
|---|---|---|
| Transport | listen addr, TLS cert/key, iroh relay, stealth mode | No — requires bind change |
| Identity | host key, host key algorithm | No — requires SSH re-negotiation |
| Auth | authorized keys, cert authorities | **Yes** — next auth check picks up changes |
| Forwarding | allowed destinations, per-principal rules | **Yes** — next channel open picks up changes |
| Rate limits | max connections per IP, max auth attempts | **Yes** — next check picks up changes |
The split is clean: anything that affects the SSH handshake or socket binding
is static. Anything that's checked per-connection or per-channel is dynamic.
### Auth Reload: Service Approach
The original design held all authorized keys in memory via `ArcSwap<DynamicConfig>`. For small deployments this works, but for nodes serving many users it requires loading every key into RAM and atomic-swapping the entire set on each reload.
The improved approach is to make auth an **irpc service** (see [core.md](core.md) and [services.md](services.md)). Auth verification becomes a service call: `VerifyPubkey { fingerprint, key_data }` → `oneshot::Sender<AuthResult>`. The service can:
- Query SQLite on demand (no need to hold all keys in memory)
- Maintain an LRU cache for hot keys
- Subscribe to honker streams for key invalidation
- Run locally (in-process mpsc) or remotely (QUIC stream)
`ArcSwap<DynamicConfig>` remains as a fallback for minimal deployments (CLI usage, single-node setups) where SQLite overhead isn't warranted. The service approach is the primary path for production deployments.
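The request/reply shape described above can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the irpc API: `std::sync::mpsc` stands in for the irpc channel and for `oneshot::Sender<AuthResult>`, an in-memory `HashMap` stands in for the SQLite lookup, `key_data` is omitted, and the names `AuthRequest`, `spawn_auth_service`, and `verify_pubkey` are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

#[derive(Debug, Clone, PartialEq)]
pub enum AuthResult {
    Accepted { principal: String },
    Rejected,
}

/// Request message; `reply` plays the role of `oneshot::Sender<AuthResult>`.
pub enum AuthRequest {
    VerifyPubkey { fingerprint: String, reply: mpsc::Sender<AuthResult> },
}

/// Spawns the service loop and returns the sender that callers use.
/// `keys` maps fingerprint -> principal and stands in for the SQLite lookup.
pub fn spawn_auth_service(keys: HashMap<String, String>) -> mpsc::Sender<AuthRequest> {
    let (tx, rx) = mpsc::channel::<AuthRequest>();
    thread::spawn(move || {
        // The loop ends once every request sender has been dropped.
        for request in rx {
            match request {
                AuthRequest::VerifyPubkey { fingerprint, reply } => {
                    let result = match keys.get(&fingerprint) {
                        Some(principal) => AuthResult::Accepted { principal: principal.clone() },
                        None => AuthResult::Rejected,
                    };
                    // The caller may have gone away; ignore send failures.
                    let _ = reply.send(result);
                }
            }
        }
    });
    tx
}

/// Client-side helper: send one request and wait for its single reply.
pub fn verify_pubkey(service: &mpsc::Sender<AuthRequest>, fingerprint: &str) -> AuthResult {
    let (reply, response) = mpsc::channel();
    let request = AuthRequest::VerifyPubkey { fingerprint: fingerprint.to_string(), reply };
    if service.send(request).is_err() {
        return AuthResult::Rejected; // service is down: fail closed
    }
    response.recv().unwrap_or(AuthResult::Rejected)
}
```

Because callers only hold a sender, the same call site would work whether the service runs in-process or behind a remote stream, which is the property the service approach relies on.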
### Current Architecture
```
ServeOptions (builder) → Server::new()
├─ Arc<server::Config> (russh config, immutable)
├─ Arc<ServerAuthConfig> (keys + CAs, immutable after load)
├─ Arc<ConnectionRateLimiter> (mutable but not reloadable)
└─ ServerHandler::new(auth_config, ...)
ServerHandler
├─ auth_config: Arc<ServerAuthConfig> ← shared, immutable
├─ connection_limiter: Arc<ConnectionRateLimiter>
├─ outbound_proxy: Option<ProxyConfig>
└─ (no forwarding policy field)
```
`auth_publickey()` reads from `self.auth_config` via `Arc` dereference. No
path to update it.
### Proposed Architecture
Replace `Arc<ServerAuthConfig>` with a service-based approach:
```
StaticConfig (Arc, loaded once)
├─ transport mode, listen addr, TLS config, iroh config
├─ stealth, proxy
├─ host key
└─ max_auth_attempts, max_connections_per_ip
AuthService (irpc service, local or remote)
├─ VerifyPubkey(fingerprint, key_data) → AuthResult
├─ VerifyToken(token_bytes) → AuthResult
└─ ReloadKeys() → ()
Backed by: SQLite (peer_credentials, api_keys)
Optional: ArcSwap<DynamicConfig> for minimal deployments
ConfigService (irpc service, always local)
├─ ReloadDynamicConfig(DynamicConfig)
└─ GetForwardingPolicy() → ForwardingPolicy
DynamicConfig (Arc<ArcSwap<DynamicConfig>>, reloadable)
├─ forwarding: ForwardingPolicy
└─ rate_limits: RateLimitConfig
```
For production: auth verification goes through the auth service, which queries SQLite. The `DynamicConfig` only holds forwarding policy and rate limits — not the full key set. For minimal deployments: auth falls back to `ArcSwap<DynamicConfig>` with all keys in memory, wrapped by the same service interface.
`ArcSwap` provides lock-free reads on the hot path. Every `auth_publickey()`
and `channel_open_direct_tcpip()` call does a single atomic `load()`, which
costs roughly the same as the current `Arc` dereference. Writes are atomic:
`store()` swaps the pointer. Checks already in flight finish with the snapshot
they loaded; later checks see the new config.
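The swap-the-pointer behaviour can be illustrated without the `arc-swap` crate. The sketch below deliberately substitutes `RwLock<Arc<_>>` for `ArcSwap` so that it stays std-only; it shows the same snapshot semantics but takes a brief read lock, which `ArcSwap` avoids. The `DynamicConfig` fields and the `load`/`reload` method names are assumptions made for illustration.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical, trimmed-down dynamic config (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct DynamicConfig {
    pub default_allow: bool,
    pub max_connections_per_ip: u32,
}

/// Stand-in for `Arc<ArcSwap<DynamicConfig>>` built on std only.
#[derive(Clone)]
pub struct ConfigReloadHandle {
    current: Arc<RwLock<Arc<DynamicConfig>>>,
}

impl ConfigReloadHandle {
    pub fn new(initial: DynamicConfig) -> Self {
        Self { current: Arc::new(RwLock::new(Arc::new(initial))) }
    }

    /// Hot path: clone the pointer while holding the read lock briefly.
    pub fn load(&self) -> Arc<DynamicConfig> {
        let guard = self.current.read().expect("config lock poisoned");
        Arc::clone(&*guard)
    }

    /// Reload: replace the pointer; snapshots taken earlier remain valid.
    pub fn reload(&self, next: DynamicConfig) {
        *self.current.write().expect("config lock poisoned") = Arc::new(next);
    }
}
```

A handler that called `load()` before a reload keeps reading its old snapshot until it drops it, while any later `load()` returns the new value.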
### Forwarding Policy
Currently, `channel_open_direct_tcpip` in `handler.rs` spawns a proxy task for
any destination. The only gate is authentication. A forwarding policy adds a
check before the proxy spawn:
```rust
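use std::ops::Range;

// Assumption: `IpNetwork` comes from the `ipnetwork` crate; the original
// does not name its source.
use ipnetwork::IpNetwork;

/// Ordered rule list plus the action taken when no rule matches.
pub struct ForwardingPolicy {
    default: ForwardingAction,
    rules: Vec<ForwardingRule>,
}

/// One rule: a target pattern, an action, and the principals it covers.
pub struct ForwardingRule {
    target: TargetPattern,
    action: ForwardingAction,
    principals: Vec<String>,
}

pub enum ForwardingAction { Allow, Deny }

pub enum TargetPattern {
    Any,                           // matches every destination
    Host(String),                  // exact host name
    Cidr(IpNetwork),               // destination IP inside a network
    PortRange(String, Range<u16>), // host plus a range of ports
    AlknetPrefix,                  // reserved `alknet-*` control channels
}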
```
Rule evaluation: first match wins, default applies if no rule matches. This
model maps to OpenSSH's `AllowTcpForwarding` + `PermitOpen` but is more
expressive. It also maps to `peer_credentials.metadata.scopes` in `@alkdev/storage`
— the head node can generate forwarding rules from stored scopes.
Rule ordering matters. Deny rules followed by a default of allow give
blocklist semantics. Allow rules followed by a default of deny give
allowlist semantics. Both are useful. When rules overlap, the more specific
rule must come first, because the first match wins.
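First-match-wins evaluation can be sketched as follows. This is an illustrative, std-only version: `TargetPattern` is simplified (a hypothetical `Port` variant replaces `Cidr` and `PortRange`), the `evaluate` method name is an assumption, and `"*"` is assumed to mean "every principal".

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ForwardingAction { Allow, Deny }

/// Simplified patterns; CIDR and port ranges are omitted to stay std-only.
#[derive(Debug, Clone)]
pub enum TargetPattern {
    Any,
    Host(String),
    Port(u16),
    AlknetPrefix,
}

pub struct ForwardingRule {
    pub target: TargetPattern,
    pub action: ForwardingAction,
    pub principals: Vec<String>, // "*" matches every principal
}

pub struct ForwardingPolicy {
    pub default: ForwardingAction,
    pub rules: Vec<ForwardingRule>,
}

impl TargetPattern {
    fn matches(&self, host: &str, port: u16) -> bool {
        match self {
            TargetPattern::Any => true,
            TargetPattern::Host(h) => h == host,
            TargetPattern::Port(p) => *p == port,
            TargetPattern::AlknetPrefix => host.starts_with("alknet-"),
        }
    }
}

impl ForwardingPolicy {
    /// First matching rule wins; the default applies when nothing matches.
    pub fn evaluate(&self, principal: &str, host: &str, port: u16) -> ForwardingAction {
        self.rules
            .iter()
            .find(|r| {
                r.target.matches(host, port)
                    && r.principals.iter().any(|p| p == "*" || p == principal)
            })
            .map(|r| r.action)
            .unwrap_or(self.default)
    }
}
```

With a port-22 deny rule placed before a `localhost` allow rule, `localhost:22` is denied and `localhost:8080` is allowed; swapping the two rules would allow both.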
### Configuration File Format
ADR-011 chose "programmatic-first, no config file." This was correct for alpha.
For publishable releases, a config file enables:
- Reproducible deployments (version-controlled config)
- Less verbose CLI invocations
- Separate files for static and dynamic config (only static needs to be in the
config file; dynamic comes from the reload mechanism)
TOML is the idiomatic Rust choice. The config file covers static config only —
the same fields as `ServeOptions`. Dynamic config (auth, forwarding) comes from
the reload mechanism, not from the file. This preserves ADR-011's intent: the
core doesn't know about the data source for auth keys, it just provides a place
to put them.
```toml
[server]
transport = "tls"
listen = "0.0.0.0:443"
stealth = false
max_connections_per_ip = 5
max_auth_attempts = 3
[server.tls]
cert = "/etc/alknet/tls/cert.pem"
key = "/etc/alknet/tls/key.pem"
[server.iroh]
relay = "https://relay.alk.dev"
[auth]
host_key = "/etc/alknet/ssh/host_key"
[forwarding]
default = "deny"
[[forwarding.rules]]
target = "localhost:*"
action = "allow"
[[forwarding.rules]]
target = "alknet-*"
action = "allow"
[[forwarding.rules]]
target = "*:22"
action = "deny"
```
The `[[forwarding.rules]]` array syntax is TOML's array-of-tables pattern.
Rules are evaluated in order; first match wins.
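The `target` strings in the example need to be turned into patterns before evaluation. The parser below is a hypothetical sketch of that step: `ParsedTarget` and `parse_target` are illustrative names, only the `host:port`, wildcard, and `alknet-*` forms are handled, and a CIDR host such as `10.0.0.0/8` is kept as an opaque string rather than parsed.

```rust
/// Hypothetical parsed form of a rule target string.
#[derive(Debug, PartialEq)]
pub enum ParsedTarget {
    AlknetPrefix,
    /// `None` means wildcard (`*`) for that component.
    HostPort { host: Option<String>, port: Option<u16> },
}

pub fn parse_target(s: &str) -> Result<ParsedTarget, String> {
    if s == "alknet-*" {
        return Ok(ParsedTarget::AlknetPrefix);
    }
    // Split on the last ':' so the port is always the final component.
    let (host, port) = s
        .rsplit_once(':')
        .ok_or_else(|| format!("expected host:port, got {s:?}"))?;
    let host = if host == "*" { None } else { Some(host.to_string()) };
    let port = if port == "*" {
        None
    } else {
        let parsed = port
            .parse::<u16>()
            .map_err(|e| format!("bad port {port:?}: {e}"))?;
        Some(parsed)
    };
    Ok(ParsedTarget::HostPort { host, port })
}
```

Rejecting malformed targets at load time, rather than at channel-open time, keeps a typo in the config from silently falling through to the default action.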
### NAPI Reload API
The NAPI layer exposes the reload handle:
```typescript
interface AlknetServer {
reloadAuth(auth: { authorizedKeys?: Buffer, certAuthority?: Buffer }): void;
reloadForwarding(policy: ForwardingPolicyConfig): void;
reloadAll(config: DynamicConfig): void;
}
interface ForwardingPolicyConfig {
default: 'allow' | 'deny';
rules: ForwardingRuleConfig[];
}
interface ForwardingRuleConfig {
target: string; // "localhost:*", "10.0.0.0/8:80", "alknet-*"
action: 'allow' | 'deny';
principals?: string[]; // default ["*"]
}
```
The head node calls `server.reloadAuth(...)` after writing to `peer_credentials`.
The NAPI layer parses the key data and constructs a new `DynamicConfig`, then
calls the `ConfigReloadHandle`.
### Client Configuration
Client configuration is almost entirely static (which server to connect to,
which key to use). The only potential dynamic config is key rotation, which is
less urgent because clients don't serve. For now, client configuration stays
as `ConnectOptions` — no `ArcSwap` needed.
A config file for client connections could define named profiles:
```toml
[profiles.production]
server = "head.alk.dev:443"
transport = "tls"
identity = "/home/user/.ssh/id_ed25519"
[profiles.staging]
server = "staging.alk.dev:22"
transport = "tcp"
identity = "/home/user/.ssh/staging_key"
```
This is a convenience layer on top of `ConnectOptions`, not a replacement.
### CLI vs Programmatic Behavior
| Interface | Static config | Dynamic config | Reload mechanism |
|---|---|---|---|
| CLI | Flags + optional `--config` file | Loaded at startup from `--authorized-keys` | None (restart to change) |
| Core Rust | `StaticConfig` struct | `AuthService` (irpc) or `ArcSwap<DynamicConfig>` (minimal) | `ConfigService::reload()` or `ConfigReloadHandle::reload()` |
| NAPI | `serve()` options | Same | `server.reloadAuth()`, `server.reloadForwarding()` |
The CLI doesn't need a reload mechanism. When you're running alknet from the
command line, restarting is fine. The reload mechanism exists for programmatic
consumers and for the auth service pattern where keys are queried on demand from
a database.
### Multi-Transport Listeners
A head node may want to accept connections on multiple transports simultaneously:
- TCP on port 22 (simple, direct SSH)
- TLS on port 443 (stealth mode, corporate firewalls)
- iroh QUIC (P2P, no port forwarding needed)
- WebTransport on port 443 (browser clients, shares the HTTP/3 listener)
Currently `ServeTransportMode` is a single enum and `Server::run()` takes one
acceptor. To serve multiple transports, the architecture needs to change.
**Option A: `Server` manages multiple listeners internally.**
```rust
pub struct Server {
// Shared state (one copy, shared across all listeners)
config: Arc<server::Config>,
dynamic_config: Arc<ArcSwap<DynamicConfig>>,
connection_limiter: Arc<ConnectionRateLimiter>,
outbound_proxy: Option<ProxyConfig>,
sessions: Arc<tokio::sync::Mutex<Vec<ActiveSession>>>,
shutdown_tx: tokio::sync::watch::Sender<bool>,
shutdown_rx: tokio::sync::watch::Receiver<bool>,
// Per-listener state
listeners: Vec<ListenerConfig>,
}
pub struct ListenerConfig {
transport: ServeTransportMode,
listen_addr: SocketAddr,
stealth: bool,
// Transport-specific config (TLS cert, iroh relay, etc.)
tls: Option<TlsConfig>,
iroh: Option<IrohConfig>,
}
```
`Server::run()` spawns one accept loop per `ListenerConfig`. Each loop
constructs its own acceptor and `ServerHandler` (with the appropriate
`TransportKind` tag), but shares the auth config, connection limiter, and
session list. Shutdown signal goes to all loops.
**Option B: Caller manages multiple `Server` instances.**
The caller creates N `Server` objects, each with its own transport. They share
`Arc<ArcSwap<DynamicConfig>>` and `Arc<ConnectionRateLimiter>` explicitly.
Option A is better because: shared shutdown, shared session tracking, single
point for config reload. Option B puts coordination burden on the caller and
makes graceful shutdown harder (N independent shutdown channels).
**The TLS + WebTransport coexistence question.** Both TLS and WebTransport
use port 443. WebTransport is HTTP/3 (QUIC), TLS on port 443 is typically
TCP+TLS. They can share the port because they're different protocols — QUIC
is UDP, TLS-over-TCP is TCP. The kernel routes by protocol. But if both are
on 443, the stealth mode protocol detector needs to handle HTTP/3 as well:
```
Port 443:
TCP connection → TLS handshake → SSH (existing)
UDP "connection" → QUIC handshake → WebTransport → stream proxy
```
This is similar to how iroh-live-relay works: HTTP/3 listener accepts
WebTransport sessions, each session opens bidirectional streams that map to
internal services.
**Config file for multi-transport:**
```toml
[[listeners]]
transport = "tls"
listen = "0.0.0.0:443"
stealth = true
[listeners.tls]
cert = "/etc/alknet/tls/cert.pem"
key = "/etc/alknet/tls/key.pem"
[[listeners]]
transport = "tcp"
listen = "0.0.0.0:22"
[[listeners]]
transport = "iroh"
iroh_relay = "https://relay.alk.dev"
[[listeners]]
transport = "webtransport"
listen = "0.0.0.0:443"
# WebTransport shares port 443 with TLS because QUIC is UDP, TLS is TCP
[listeners.webtransport]
cert = "/etc/alknet/tls/cert.pem"
key = "/etc/alknet/tls/key.pem"
```
The `[[listeners]]` array-of-tables pattern means each listener is an
independent config block. The `[auth]`, `[forwarding]`, and `[server]`
sections at the top level are shared — they apply to all listeners.
**NAPI multi-transport:**
```typescript
const server = await serve({
listeners: [
{ transport: 'tls', listen: '0.0.0.0:443', stealth: true, tlsCert: '...', tlsKey: '...' },
{ transport: 'tcp', listen: '0.0.0.0:22' },
{ transport: 'iroh', irohRelay: 'https://relay.alk.dev' },
],
hostKey: hostKeyBuffer,
authorizedKeys: keysBuffer,
});
```
Single `AlknetServer` object, single `reloadAuth()` call affects all
listeners.
### Transport Kind and WebTransport
The `TransportKind` enum (currently `Tcp | Tls | Iroh`) tags each connection
so the handler can behave differently per transport. Adding `WebTransport` to
this enum is straightforward — WebTransport connections are identifiable at
accept time. The handler behavior is the same (port forwarding only), but
the tag enables transport-specific logging and future policy differences
(e.g., WebTransport clients can only access `alknet-*` control channels).
## Proposed Solution
### Phase 1: Static/Dynamic Split
1. Introduce `StaticConfig` and `DynamicConfig` structs
2. Replace `Arc<ServerAuthConfig>` in `ServerHandler` with
`Arc<ArcSwap<DynamicConfig>>`
3. Add `ConfigReloadHandle` with `reload(DynamicConfig)` method
4. Expose `reloadAuth()` on the NAPI `AlknetServer` object
**Scope**: `alknet-core` auth module + `alknet-napi` serve module
**Risk**: Low — internal refactor, no protocol changes
### Phase 2: Forwarding Policy
1. Add `ForwardingPolicy` to `DynamicConfig`
2. Add policy check to `channel_open_direct_tcpip` before proxy spawn
3. Expose `reloadForwarding()` on NAPI `AlknetServer`
**Scope**: `alknet-core` handler + `alknet-napi`
**Risk**: Low — new check, default-allow preserves current behavior
### Phase 3: Config File
1. Add `--config <path>` CLI flag parsing TOML
2. CLI flags override config file values (same precedence as cargo)
3. Config file only covers static config + initial auth config path
4. Add `serde` derive to `StaticConfig`
**Scope**: `alknet-cli` (new binary crate) + `alknet-core` config module
**Risk**: Medium — new dependency (`toml` crate), new CLI surface to validate
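The precedence in step 2 (CLI flags over config file over defaults) can be expressed as a layered merge of optional values. This is a sketch only: `PartialStaticConfig`, its fields, and the default values shown are assumptions, and TOML parsing is left out.

```rust
/// Hypothetical partially specified static settings; `None` means "not set".
#[derive(Debug, Clone, Default, PartialEq)]
pub struct PartialStaticConfig {
    pub listen: Option<String>,
    pub transport: Option<String>,
    pub max_auth_attempts: Option<u32>,
}

impl PartialStaticConfig {
    /// Values in `self` take precedence; gaps are filled from `fallback`.
    pub fn or(self, fallback: PartialStaticConfig) -> PartialStaticConfig {
        PartialStaticConfig {
            listen: self.listen.or(fallback.listen),
            transport: self.transport.or(fallback.transport),
            max_auth_attempts: self.max_auth_attempts.or(fallback.max_auth_attempts),
        }
    }
}

/// CLI flags override the config file, which overrides built-in defaults.
pub fn resolve(cli: PartialStaticConfig, file: PartialStaticConfig) -> PartialStaticConfig {
    // Assumed defaults, chosen only for this example.
    let defaults = PartialStaticConfig {
        listen: Some("0.0.0.0:22".to_string()),
        transport: Some("tcp".to_string()),
        max_auth_attempts: Some(3),
    };
    cli.or(file).or(defaults)
}
```

Keeping every layer as `Option` fields means a flag that was not passed cannot accidentally overwrite a value from the file.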
### Phase 4: Client Profiles
1. Add `[profiles]` section to client config file
2. `--profile production` loads named profile
3. CLI flags override profile values
**Scope**: `alknet-cli`
**Risk**: Low — convenience layer only
### Phase 5: Multi-Transport Listeners
1. Change `ServeTransportMode` from single enum to `Vec<ListenerConfig>`
2. `Server::run()` spawns one accept loop per listener, sharing `DynamicConfig`
3. Single shutdown signal drains all listeners
4. Add `[[listeners]]` to config file format
5. NAPI `serve()` accepts `listeners` array instead of single `transport`
6. Add `WebTransport` to `TransportKind` enum (initially as a tag only;
actual WebTransport acceptor is a separate R&D phase)
**Scope**: `alknet-core` serve.rs + `alknet-napi` + `alknet-cli`
**Risk**: Medium — changes the primary API surface of `serve()`. Backwards
compat via accepting both `transport: string` (single) and
`listeners: array` (multi) in NAPI.
## Open Questions
- **OQ-CFG-01**: Should forwarding rules support per-user scope derived from
the authenticated key's metadata (e.g., `peer_credentials.metadata.scopes`)?
Or is a global rules table with principal matching sufficient?
Global rules with principal matching is simpler and covers most cases. Per-user
scope derived from certificates is more granular but requires the server to
maintain a mapping from key fingerprint to scope. This mapping comes from the
head node's database, not from the SSH protocol. Phase 2 starts with global rules;
per-user scope can be added as an extension.
- **OQ-CFG-02**: Should the config file watch for changes and auto-reload?
No. File watching is a potential attack vector (symlink races, inotify
limitations on network filesystems). The CLI loads once at startup. The NAPI
layer reloads explicitly. This is the right model for a security-sensitive
tool.
- **OQ-CFG-03**: Should `ArcSwap` be the reload primitive, or is `RwLock`
sufficient?
`ArcSwap` is the standard pattern for this in Rust network services
(`arc-swap` crate). It provides lock-free reads (the hot path) and atomic
writes. `RwLock` would also work but adds lock contention on reads. The
`arc-swap` dependency is small (~500 lines) and well-maintained. Prefer it.
- **OQ-CFG-04**: Should TLS and WebTransport on the same port share a single
QUIC listener (like iroh Router's ALPN dispatch), or run as separate
listeners on the same port?
They can't conflict because QUIC is UDP and TLS-over-TCP is TCP: the
kernel demultiplexes by protocol as well as port number, so TCP/443 and
UDP/443 are distinct sockets. They're naturally separate
listeners even on the same port. However, if iroh is also running on the
same host, the iroh endpoint already owns a QUIC listener. The WebTransport
listener needs its own. Options: (a) share the iroh endpoint's QUIC listener
with ALPN dispatch (reuses `from_endpoint` pattern), (b) separate QUIC
listeners on different ports, (c) bind both to 443/UDP — possible if
`SO_REUSEPORT` is used. Needs R&D; defer to WebTransport transport design
session.
~~**Update**: WebTransport is out of scope for the current configuration
work. It requires a fundamentally different authentication model (HTTP-level
API keys/session tokens vs SSH key-based auth). The `ServerHandler` only
knows SSH `auth_publickey`. WebTransport auth would need its own handler
path. This connects to the broader question of whether `DynamicConfig.auth`
should be transport-aware (see OQ-CFG-06). WebTransport transport design
is a separate R&D session.~~
**Update 2**: Auth concern is resolved by ADR-023. The same authorized_keys
set verifies both SSH pubkey auth and token auth (Ed25519-signed timestamp
for WebTransport). One key material, two presentations. The remaining
question is purely about QUIC listener coexistence — which is a transport
implementation detail, not an auth question. See [auth.md](../architecture/auth.md)
and [ADR-023](../architecture/decisions/023-unified-auth-shared-key-material.md).
- **OQ-CFG-05**: Does `TransportKind::WebTransport` need any handler behavior
different from other transports?
Initially no — all transports get the same port-forwarding-only handler.
But WebTransport connections come from browsers, which have different trust
assumptions. A future forwarding policy might restrict WebTransport clients
to `alknet-*` control channels only (no arbitrary host:port forwarding).
This is a policy question, not a transport question. The `TransportKind` tag
on the handler enables transport-aware policy rules in `ForwardingPolicy`
without changing the handler. Defer to Phase 2 (forwarding policy design).
- **OQ-CFG-06**: Should the auth layer be transport-aware?
Currently `DynamicConfig.auth` is `ServerAuthConfig` — SSH keys and CAs
only. This works for SSH over any transport (TCP, TLS, iroh) because SSH
carries its own auth protocol. But non-SSH transports (WebTransport,
WebSocket) use HTTP-level authentication (API keys, session tokens in
headers/query params). The auth question is: does the same `DynamicConfig`
serve both models, or does each transport carry its own auth config?
~~Option A: `AuthPolicy` contains both SSH auth and API key auth:
```rust
pub struct AuthPolicy {
ssh: SshAuthConfig, // for SSH-over-any-transport
api_keys: Option<ApiKeysConfig>, // for non-SSH transports
}
```
Option B: Auth is per-listener. Each `ListenerConfig` carries its own auth
config appropriate to its transport.
Option A is simpler for the initial implementation — the SSH auth path is
unchanged, and API key auth is additive. Option B is more flexible but
duplicates the shared auth state (keys should be reloadable once, not per
listener).
For now, the config architecture should accommodate Option A as a future
extension. Phase 1 implements `DynamicConfig` with SSH auth only. API key
auth is added when a non-SSH transport is implemented.~~
**Resolved by ADR-023**: The auth layer is transport-aware in its
*presentation*, not its *material*. `AuthPolicy` holds `SshAuthConfig` and
`TokenAuthConfig`, where `TokenAuthConfig.key_source` defaults to
`Shared` (same `authorized_keys` set as SSH auth). The same Ed25519 keys
serve both paths: SSH presents the public key in the handshake; WebTransport
presents an Ed25519-signed timestamp token. Verification produces the same
`Identity` type via the `IdentityProvider` trait. One `reloadAuth()` call
updates both. See [auth.md](../architecture/auth.md) and
[ADR-023](../architecture/decisions/023-unified-auth-shared-key-material.md).
- **OQ-CFG-07**: Should auth and secret services share a single irpc endpoint
or be separate services?
Separate services are better. Auth (verify credentials) and Secret (derive/store
keys) have different security boundaries. The secret service holds the master
seed; the auth service only needs public key fingerprints. They may run on
different machines. See [services.md](services.md) for protocol definitions.
- **OQ-CFG-08**: How do external credentials (API keys, OAuth tokens) relate
to the secret service's HD key derivation?
HD-derived keys (from SLIP-0010/BIP39) cover self-generated secrets (identity
keys, encryption keys, SSH keys). External credentials (third-party API keys,
OAuth tokens) can't be derived — they must be stored encrypted. The secret
service handles both: derived keys are regenerated on demand; stored secrets
are encrypted with a key that is itself derived from the seed. See
[services.md](services.md) for the `SecretProtocol` definition.
## Decisions Required
These decisions will be extracted into ADRs when the architecture is finalized:
1. **ADR-020**: Static/dynamic config split. Auth delegated to `AuthService` (irpc)
for production; `ArcSwap<DynamicConfig>` for minimal deployments. Supersedes
ADR-011's "no config file" — adds optional config file while preserving
programmatic-first API.
2. **ADR-021**: Forwarding policy with rule-based allow/deny. Default-allow
preserves current behavior during migration; default-deny for production
deployments.
3. **ADR-022**: Multi-transport listeners. `Server` spawns multiple accept
loops sharing auth config, session state, and shutdown. Replaces single
`ServeTransportMode` with `Vec<ListenerConfig>`.
4. **ADR-026**: Head/worker terminology. Replace hub/spoke with head/worker
throughout all documentation and APIs. A head is also a worker.
5. **ADR-028**: Auth as service. Auth verification via irpc `AuthProtocol`
service, not in-memory key set. Enables SQLite-backed auth for production,
`ArcSwap` fallback for minimal deployments.
## References
- [ADR-011](../architecture/decisions/011-no-ssh-config-programmatic-api.md): Programmatic-first API (superseded by ADR-020)
- [ADR-012](../architecture/decisions/012-auth-ed25519-and-cert-authority.md): Auth key format
- [ADR-018](../architecture/decisions/018-control-channel-for-pubsub.md): Control channel routing
- `server/handler.rs`: Current `Arc<ServerAuthConfig>` usage
- `server/serve.rs`: Current single-transport `Server::run()` accept loop
- `auth/server_auth.rs`: `ServerAuthConfig` struct
- `auth/keys.rs`: `KeySource` and key loading
- `@alkdev/storage/docs/architecture/sqlite-host.md`: `peer_credentials` table schema
- [wtransport](https://github.com/BiagioFesta/wtransport): Rust WebTransport library (in `/workspace/wtransport`)
- [arc-swap crate](https://docs.rs/arc-swap): Lock-free read, atomic write for shared state
- [ADR-023](../architecture/decisions/023-unified-auth-shared-key-material.md): Unified auth with shared key material
- [auth.md](../architecture/auth.md): Unified auth architecture spec
- [call-protocol.md](../architecture/call-protocol.md): Bidirectional call protocol spec
- [services.md](services.md): Service layer architecture (irpc services)
- [core.md](core.md): Core overview, head/worker terminology, service layer

@@ -1,426 +0,0 @@
# Alknet Core: Transport, Call Protocol, Auth, Services, and DNS
> Status: Research / Draft
> Last updated: 2026-06-06
## Overview
`alknet-core` is the foundational crate providing pluggable transports, the bidirectional call protocol, Ed25519 authentication, a service layer (via irpc), and (future) DNS transport + naming. Everything else (storage, flowgraph, relay) builds on top of this.
### Terminology: Nodes, Heads, and Workers
Alknet uses a **head/worker** model instead of hub/spoke:
- **Node**: Any participant in the network. Every node has an Ed25519 identity.
- **Head node**: A node that coordinates — accepts connections, routes operations, manages cluster state. A head is also a worker (it can execute operations).
- **Worker node**: A node that connects to a head, registers its services, and executes operations. Any worker can become a head.
- **Service**: A named collection of operations exposed by a node (e.g., `fs`, `bash`, `compute`, `agent`). Services register via the call protocol.
This model allows natural mesh formation: a head can also be a worker for another head, enabling multi-hop routing, redundancy, and distributed topologies without a centralized authority.
## Transport Layer
### Architecture
The transport layer produces a duplex byte stream (`AsyncRead + AsyncWrite + Unpin + Send`) that the SSH layer consumes via `russh::client::connect_stream()` or `russh::server::run_stream()`. SSH is completely unaware of what transport it runs over.
### Transport Trait
```rust
#[async_trait]
pub trait Transport: Send + Sync + 'static {
type Stream: AsyncRead + AsyncWrite + Unpin + Send + 'static;
async fn connect(&self) -> Result<Self::Stream>;
fn describe(&self) -> String;
}
#[async_trait]
pub trait TransportAcceptor: Send + Sync + 'static {
type Stream: AsyncRead + AsyncWrite + Unpin + Send + 'static;
async fn accept(&self) -> Result<(Self::Stream, TransportInfo)>;
}
#[derive(Debug, Clone)]
pub struct TransportInfo {
pub remote_addr: Option<SocketAddr>,
pub transport_kind: TransportKind,
}
#[derive(Debug, Clone)]
pub enum TransportKind {
Tcp,
Tls { server_name: Option<String> },
Iroh { endpoint_id: String },
Dns { domain: String }, // NEW
WebTransport { host: String }, // NEW (planned)
}
```
### Existing Transports
| Transport | Client | Server | Stream Type |
|-----------|--------|--------|-------------|
| TcpTransport | `TcpStream::connect(addr)` | `TcpListener::accept()` | `TcpStream` |
| TlsTransport | `TlsStream<TcpStream>` | `TlsStream<TcpStream>` | tokio_rustls |
| IrohTransport | `endpoint.connect(peer, alpn)` then `conn.open_bi()` then `join(recv, send)` | `endpoint.accept()` then `conn.accept_bi()` then `join(recv, send)` | `tokio::io::Join<RecvStream, SendStream>` |
| AcmeTlsAcceptor | Auto-provision via Let's Encrypt | ACME cert provision + TLS accept | TlsStream |
### Transport Chaining
```bash
alknet connect --transport iroh --proxy socks5://127.0.0.1:1080
alknet connect --transport tls --proxy socks5://127.0.0.1:1080
```
`--proxy` routes outbound connections through the given proxy. On the client it applies to the transport connection itself; on the server it applies to the TCP targets of data channels.
### Stealth Mode
When `--stealth` is enabled with the TLS transport on port 443, the server peeks at the first bytes after the TLS handshake. If they start with `SSH-2.0-`, it runs SSH. Otherwise it returns `HTTP/1.1 404 Not Found\r\nServer: nginx\r\n\r\n` and closes the connection, so that a casual probe sees what looks like an ordinary HTTPS site.
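The detection step reduces to a prefix check on the peeked bytes. The sketch below assumes the bytes have already been read from the decrypted stream; `StealthDecision` and `classify` are hypothetical names, and how many bytes to wait for before deciding is left open.

```rust
/// What the acceptor should do with a freshly decrypted TLS stream.
#[derive(Debug, PartialEq)]
pub enum StealthDecision {
    RunSsh,
    ServeDecoy,
}

/// SSH identification strings start with this prefix (RFC 4253, section 4.2).
const SSH_BANNER_PREFIX: &[u8] = b"SSH-2.0-";

/// The decoy response described in this section.
pub const DECOY_RESPONSE: &[u8] = b"HTTP/1.1 404 Not Found\r\nServer: nginx\r\n\r\n";

/// Classifies the first bytes peeked from the stream.
/// Fewer than eight bytes is treated as "not SSH" here; a real acceptor
/// would probably wait (with a timeout) for more data before deciding.
pub fn classify(peeked: &[u8]) -> StealthDecision {
    if peeked.starts_with(SSH_BANNER_PREFIX) {
        StealthDecision::RunSsh
    } else {
        StealthDecision::ServeDecoy
    }
}
```

Treating anything that is not a complete SSH prefix as decoy traffic errs on the side of never revealing the SSH service to a probe.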
## Call Protocol
### Wire Format
Every message is a length-prefixed JSON `EventEnvelope`:
```rust
pub struct EventEnvelope {
pub r#type: String, // "call.requested", "call.responded", etc.
pub id: String, // Correlation key (requestId, topic, or "" for broadcasts)
pub payload: Value, // JSON payload — schema depends on event type
}
// Frame: 4-byte big-endian length prefix + UTF-8 JSON body
```
This is the same format used by `@alkdev/pubsub` adapters. The envelope is transport-agnostic — it runs over SSH channels, WebTransport streams, iroh bidirectional streams, WebSocket, Worker postMessage, or DNS queries.
Binary payloads are base64-encoded in the `payload` field. The envelope itself stays JSON for cross-language compatibility.
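The framing rule (4-byte big-endian length prefix followed by a UTF-8 JSON body) can be sketched like this. The function names are hypothetical, the JSON body is treated as an opaque string so the example needs no JSON library, and a production decoder would presumably also enforce a maximum frame size.

```rust
use std::convert::TryFrom;

/// Prepends a 4-byte big-endian length to a UTF-8 JSON body.
pub fn encode_frame(json_body: &str) -> Result<Vec<u8>, String> {
    let len = u32::try_from(json_body.len())
        .map_err(|_| "body exceeds u32::MAX bytes".to_string())?;
    let mut frame = Vec::with_capacity(4 + json_body.len());
    frame.extend_from_slice(&len.to_be_bytes());
    frame.extend_from_slice(json_body.as_bytes());
    Ok(frame)
}

/// Tries to decode one frame from the front of `buf`.
/// Returns `Ok(None)` when more bytes are needed, otherwise the body and
/// the number of bytes consumed.
pub fn decode_frame(buf: &[u8]) -> Result<Option<(String, usize)>, String> {
    if buf.len() < 4 {
        return Ok(None);
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let end = 4 + len;
    if buf.len() < end {
        return Ok(None);
    }
    let body = std::str::from_utf8(&buf[4..end])
        .map_err(|e| format!("invalid UTF-8: {e}"))?;
    Ok(Some((body.to_string(), end)))
}
```

Returning the consumed byte count lets a caller keep a single receive buffer and drain complete frames from its front as they arrive.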
### Call Protocol Events
| Event | Direction | Purpose |
|-------|-----------|---------|
| `call.requested` | Caller → Handler | Initiate a call or subscription |
| `call.responded` | Handler → Caller | Deliver a result (one for calls, many for subscriptions) |
| `call.completed` | Handler → Caller | Signal end of subscription stream |
| `call.aborted` | Either side | Cancel the call/subscription |
| `call.error` | Handler → Caller | Signal an error |
A call is just a subscribe that resolves after one event. Both `call()` and `subscribe()` send the same `call.requested` event.
### Operation Paths
```
/{node}/{service}/{op}
```
- **node** — identity prefix of the node that exposes the operation
- **service** — logical service namespace (e.g., `fs`, `bash`, `agent`)
- **op** — specific operation (e.g., `readFile`, `exec`, `chat`)
Examples:
| Path | Meaning |
|------|---------|
| `/dev1/fs/readFile` | Node `dev1`, service `fs`, op `readFile` |
| `/head/agent/chat` | Head's own `agent` service, op `chat` |
| `/head/sessions/list` | Head's `sessions` service, op `list` |
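Splitting a path into its three components might look like the following. `OperationPath` and `parse_operation_path` are hypothetical names, and the choice to reject empty or extra segments is an assumption, since the original does not say how malformed paths are handled.

```rust
#[derive(Debug, PartialEq)]
pub struct OperationPath<'a> {
    pub node: &'a str,
    pub service: &'a str,
    pub op: &'a str,
}

/// Parses `/{node}/{service}/{op}`; returns `None` for any other shape.
pub fn parse_operation_path(path: &str) -> Option<OperationPath<'_>> {
    let mut parts = path.strip_prefix('/')?.split('/');
    let (node, service, op) = (parts.next()?, parts.next()?, parts.next()?);
    // Reject trailing segments and empty components such as "/dev1//readFile".
    if parts.next().is_some() || node.is_empty() || service.is_empty() || op.is_empty() {
        return None;
    }
    Some(OperationPath { node, service, op })
}
```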
### PendingRequestMap
Manages in-flight calls and subscriptions. Correlates `call.responded` events back to the original `call.requested`:
```rust
pub struct PendingRequestMap {
pending: HashMap<String, PendingEntry>,
}
enum PendingEntry {
Call { tx: oneshot::Sender<Result<Value>>, timeout: Instant },
Subscribe { tx: mpsc::Sender<Result<Value>>, timeout: Option<Instant> },
}
```
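The correlation logic can be illustrated with a standard-library-only sketch. It deliberately swaps the tokio `oneshot`/`mpsc` channels for `std::sync::mpsc`, uses `String` as a stand-in for `Value`, and omits timeouts; the type and method names are assumptions for illustration.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Stand-in for a JSON value in this sketch.
type Payload = String;

enum PendingEntry {
    Call(Sender<Payload>),
    Subscribe(Sender<Payload>),
}

#[derive(Default)]
pub struct PendingRequests {
    pending: HashMap<String, PendingEntry>,
}

impl PendingRequests {
    /// Register a one-shot call under its correlation id.
    pub fn register_call(&mut self, id: &str) -> Receiver<Payload> {
        let (tx, rx) = channel();
        self.pending.insert(id.to_string(), PendingEntry::Call(tx));
        rx
    }

    /// Register a subscription that may receive many responses.
    pub fn register_subscription(&mut self, id: &str) -> Receiver<Payload> {
        let (tx, rx) = channel();
        self.pending.insert(id.to_string(), PendingEntry::Subscribe(tx));
        rx
    }

    /// Route a `call.responded` payload. Returns false for an unknown id.
    pub fn on_responded(&mut self, id: &str, payload: Payload) -> bool {
        match self.pending.remove(id) {
            // A call resolves after one response, so the entry stays removed.
            Some(PendingEntry::Call(tx)) => {
                let _ = tx.send(payload);
                true
            }
            // A subscription stays registered until completed or aborted.
            Some(PendingEntry::Subscribe(tx)) => {
                let _ = tx.send(payload);
                self.pending.insert(id.to_string(), PendingEntry::Subscribe(tx));
                true
            }
            None => false,
        }
    }

    /// Handle `call.completed` or `call.aborted`: drop the entry and its sender.
    pub fn on_completed(&mut self, id: &str) -> bool {
        self.pending.remove(id).is_some()
    }
}
```

Dropping the sender on completion closes the receiver side, so a subscriber's receive loop ends naturally.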
### Operation Registry
```rust
pub struct OperationSpec {
pub name: String, // "/fs/readFile", "/agent/chat"
pub namespace: String, // "fs", "agent"
pub op_type: OperationType, // Query, Mutation, Subscription
pub input_schema: Value, // JSON Schema for input
pub output_schema: Value, // JSON Schema for output
pub access_control: AccessControl, // Required scopes/resources
}
pub enum OperationType {
Query, // Read-only, idempotent
Mutation, // Side effects
Subscription, // Streaming
}
pub struct AccessControl {
pub required_scopes: Vec<String>,
pub required_scopes_any: Option<Vec<String>>,
pub resource_type: Option<String>,
pub resource_action: Option<String>,
}
```
Specs and handlers are separated — downstream consumers register both without modifying core:
```rust
registry.register(OperationSpec { name: "/services/list", ... }, list_services_handler);
registry.register(OperationSpec { name: "/fs/readFile", ... }, fs_read_handler);
```
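How the `AccessControl` scope fields might be evaluated at dispatch time can be sketched as follows. The semantics are an assumption, since the document does not define them: every entry in `required_scopes` must be granted, and at least one entry in `required_scopes_any` must be granted when that list is present and non-empty. The struct is a trimmed copy without the resource fields, and `is_authorized` is a hypothetical helper.

```rust
/// Trimmed copy of the spec's access-control block (resource fields omitted).
pub struct AccessControl {
    pub required_scopes: Vec<String>,
    pub required_scopes_any: Option<Vec<String>>,
}

/// Check a caller's granted scopes against an operation's requirements.
/// Assumed semantics: ALL of `required_scopes`, plus ANY ONE of
/// `required_scopes_any` when that list is present and non-empty.
pub fn is_authorized(ac: &AccessControl, granted: &[&str]) -> bool {
    let has = |scope: &String| granted.contains(&scope.as_str());
    let all_ok = ac.required_scopes.iter().all(|s| has(s));
    let any_ok = match &ac.required_scopes_any {
        Some(list) if !list.is_empty() => list.iter().any(|s| has(s)),
        _ => true,
    };
    all_ok && any_ok
}
```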
### Protocol Adapter Layer
| Transport | Channel mechanism | Direction |
|-----------|-------------------|-----------|
| SSH | Reserved `direct_tcpip` destination `alknet-control:0` | Bidirectional over SSH channel |
| WebTransport | Bidirectional stream after CONNECT | Bidirectional over WT stream |
| iroh QUIC | `open_bi()` / `accept_bi()` | Bidirectional over QUIC stream |
| WebSocket | Single WS connection | Bidirectional over WS frames |
| Worker | `postMessage` | Bidirectional over structured clone |
| DNS | Query TXT records (client) / serve TXT records (server) | Request/response over DNS |
### Head/Worker Architecture
```
┌─────────────────────────────────┐
│ Head Node │
│ │
│ Head-local services: │
│ /head/agent/chat │
│ /head/agent/complete │
│ /head/sessions/list │
│ │
│ Worker registry: │
│ /dev1/fs/* → dev1 connection │
│ /browser-1/notify/* → WT conn │
└──────┬───────┬──────────────────┘
│ │
┌─────────▼┐ ┌───▼────────────┐
│ Worker │ │Browser Worker │
│ "dev1" │ │"browser-1" │
│ /fs/* │ │/notify/* │
└───────────┘ └────────────────┘
```
A head node is also a worker. Any worker can become a head. This enables mesh topologies where nodes coordinate in a peer-to-peer fashion rather than through a single centralized authority.
Workers register operations on connect:
```json
{
"type": "call.requested",
"id": "uuid-123",
"payload": {
"operationId": "/head/services/register",
"input": {
"node": "dev1",
"operations": ["/fs/readFile", "/bash/exec"]
}
}
}
```
## Authentication
SSH authentication uses Ed25519 keys. Browsers use a separate mechanism in which they sign a token with the same Ed25519 key material.
Authentication is provided by the **auth service** — an irpc-based service that verifies credentials on demand rather than holding all keys in memory. This replaces the earlier `ArcSwap<DynamicConfig>` approach and scales to large user populations without requiring full key set reloads.
Peer credentials are stored in the `peer_credentials` table (fingerprint-based lookup). Account credentials are stored in the `api_keys` table (SHA-256 hash for high-entropy keys).
See [services.md](services.md) for the auth service protocol definition.
## Service Layer
### Architecture
Alknet uses an **irpc-based service layer** to decompose core responsibilities into independently testable, deployable, and replaceable components. irpc provides lightweight RPC that works both as an in-process async boundary (tokio channels) and cross-process/cross-network (QUIC streams via noq).
A **service** is an irpc protocol enum that defines the operations a component supports. Services run as async actors — locally they communicate via `mpsc` channels, remotely via QUIC streams. The `Client<S>` abstracts over both.
### Core Services
| Service | irpc Protocol | Purpose | Always Local? |
|---------|--------------|---------|---------------|
| **Auth** | `AuthProtocol` | Verify identities, check credentials, issue tokens | Can be remote for large-scale auth |
| **Secret** | `SecretProtocol` | Derive keys from seed, encrypt/decrypt stored secrets, key versioning | Local in single-node, remote in clustered |
| **Config** | `ConfigProtocol` | Dynamic config reload (auth keys, forwarding policy) | Local |
| **Storage** | `StorageProtocol` | Graph CRUD, metagraph operations, honker event bridge | Local or remote |
### Service Definition Pattern
Services are defined as irpc protocol enums:
```rust
use irpc::{rpc_requests, channel::{mpsc, oneshot}};
#[rpc_requests(message = AuthMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum AuthProtocol {
#[rpc(tx=oneshot::Sender<AuthResult>)]
#[wrap(VerifyPubkey)]
VerifyPubkey { fingerprint: String, key_data: Vec<u8> },
#[rpc(tx=oneshot::Sender<AuthResult>)]
#[wrap(VerifyToken)]
VerifyToken { token: Vec<u8> },
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(ReloadKeys)]
ReloadKeys,
}
```
### Local vs Remote
```rust
enum AuthClient {
// In-process: zero-copy tokio channels
Local(Client<AuthProtocol>),
// Cross-process/cross-network: QUIC stream
Remote(irpc::rpc::Client<AuthProtocol>),
}
```
A node that runs all services locally uses `Client::local(mpsc::channel)`. A node that delegates auth to a separate service uses `Client::remote(quinn::Connection)`. The call sites are identical — the client abstracts over both.
### Relationship to Call Protocol
Services are **internal** to a node or cluster. The call protocol is **external** — it's how nodes talk to each other over SSH/WebSocket/QUIC/DNS transports. Services handle concerns like auth and secrets that should not be part of the wire protocol but are needed by every node.
A service can also be exposed as a call protocol operation. For example, the secret service's `DeriveKey` could be exposed as `/head/secrets/derive` for remote workers that need key derivation but shouldn't hold the master seed.
### Event Boundary Discipline
Following the event sourcing patterns in [event_source_types.md](/workspace/research/event_sourcing/event_source_types.md):
- **Honker streams** (`stream_publish`/`subscribe`) are **internal event sourcing** for the service that owns that data. They are domain events, not integration events.
- **Call protocol `EventEnvelope`** is the **integration boundary** between nodes. Cross-node notifications are projected from domain events, not published directly.
- **irpc service calls** are **synchronous request-response** within a node or cluster. They are not events and should not be used as such.
This prevents the conflation of internal state management (event sourcing), cross-service notification (integration events), and service calls (request-response).
## DNS Transport (Planned)
### Two DNS Concepts
1. **DNS as Transport** — Encode `EventEnvelope` frames as DNS queries/responses. Censorship resistance. Request/response maps to `call.requested`/`call.responded` naturally.
2. **DNS as Naming/Discovery** — Publish/resolve endpoint information via DNS TXT records (iroh-dns style). Smart contract provides on-chain `name → namespaceId + relays`. DNS transport carries the data flow when other transports are blocked.
### DNS as Call Protocol Transport
The call protocol is transport-agnostic. DNS becomes another adapter:
```
Transport Layer:
SSH channel → EventEnvelope frames → CallHandler
WebTransport → EventEnvelope frames → CallHandler
iroh QUIC stream → EventEnvelope frames → CallHandler
DNS query/response → EventEnvelope frames → CallHandler ← NEW
```
**Upstream (client → server)**: Encode `EventEnvelope` JSON as base32 DNS query labels.
**Downstream (server → client)**: Return `EventEnvelope` JSON in TXT record responses.
**Polling**: For `call.responded` after `call.requested`, client polls `requestId.alk.dev TXT?`.
The `DnsTransportAdapter` implements the same adapter pattern as `@alkdev/pubsub`'s event targets, making DNS a first-class transport for control channel operations.
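The upstream encoding step can be sketched with a hand-rolled, unpadded base32 encoder (RFC 4648 alphabet) that splits its output into DNS labels of at most 63 characters. The function names are hypothetical, and this sketch does not enforce the overall 253-character limit on a DNS name, which a real adapter would have to respect by splitting large envelopes across queries.

```rust
const BASE32_ALPHABET: &[u8; 32] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
/// A single DNS label may hold at most 63 octets.
const MAX_LABEL_LEN: usize = 63;

/// Unpadded base32 encoding of arbitrary bytes.
pub fn base32_encode(data: &[u8]) -> String {
    let mut out = String::new();
    let mut buffer: u32 = 0;
    let mut bits: u32 = 0;
    for &byte in data {
        buffer = (buffer << 8) | u32::from(byte);
        bits += 8;
        // Emit one character for every complete 5-bit group.
        while bits >= 5 {
            bits -= 5;
            out.push(BASE32_ALPHABET[((buffer >> bits) & 0x1f) as usize] as char);
        }
    }
    if bits > 0 {
        // Left-align the remaining bits in a final 5-bit group.
        out.push(BASE32_ALPHABET[((buffer << (5 - bits)) & 0x1f) as usize] as char);
    }
    out
}

/// Build a query name: base32 payload split into labels, then the tunnel domain.
pub fn to_query_name(payload: &[u8], domain: &str) -> String {
    let encoded = base32_encode(payload);
    let labels: Vec<&str> = encoded
        .as_bytes()
        .chunks(MAX_LABEL_LEN)
        .map(|chunk| std::str::from_utf8(chunk).expect("base32 output is ASCII"))
        .collect();
    format!("{}.{}", labels.join("."), domain)
}
```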
### DNS as Full Transport (SSH Tunneling)
Full-duplex SSH tunneling over DNS requires a framing protocol:
- Chunk SSH data into fixed-size frames (e.g., 220-byte frames with 4-byte header for seq/ack)
- Encode upstream in base32 subdomain labels
- Encode downstream in TXT records or CNAME targets
- Handle resequencing and retransmission
This is high latency and low throughput (~1-50 KB/s), but it works when all other transports are blocked. That is adequate for interactive SSH. Log a warning at connect time.
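The chunking step can be sketched as follows. The header layout is an assumption, since the document only says "4-byte header for seq/ack": here it is a 16-bit sequence number followed by a 16-bit acknowledgement number, both big-endian. The final frame of a burst may be shorter than 220 bytes in this sketch.

```rust
const FRAME_SIZE: usize = 220;
const HEADER_SIZE: usize = 4;
const MAX_PAYLOAD: usize = FRAME_SIZE - HEADER_SIZE;

/// Split `data` into frames of at most FRAME_SIZE bytes.
/// Assumed header layout: u16 seq, u16 ack, both big-endian.
pub fn chunk_into_frames(data: &[u8], first_seq: u16, ack: u16) -> Vec<Vec<u8>> {
    data.chunks(MAX_PAYLOAD)
        .enumerate()
        .map(|(i, payload)| {
            // Sequence numbers wrap around at 65535.
            let seq = first_seq.wrapping_add(i as u16);
            let mut frame = Vec::with_capacity(HEADER_SIZE + payload.len());
            frame.extend_from_slice(&seq.to_be_bytes());
            frame.extend_from_slice(&ack.to_be_bytes());
            frame.extend_from_slice(payload);
            frame
        })
        .collect()
}

/// Split a received frame back into (seq, ack, payload).
pub fn parse_frame(frame: &[u8]) -> Option<(u16, u16, &[u8])> {
    if frame.len() < HEADER_SIZE || frame.len() > FRAME_SIZE {
        return None;
    }
    let seq = u16::from_be_bytes([frame[0], frame[1]]);
    let ack = u16::from_be_bytes([frame[2], frame[3]]);
    Some((seq, ack, &frame[HEADER_SIZE..]))
}
```

Resequencing and retransmission would sit on top of this: the receiver orders frames by `seq` and the sender resends anything not covered by the peer's `ack`.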
### iroh-dns Relationship
iroh-dns publishes `EndpointInfo` via `_iroh.<z32-endpoint-id>.<origin> TXT` records. alknet can extend this:
- Add `tunnel=dnst.example.com` attribute to indicate DNS transport availability
- Use iroh-dns `DnsResolver` for endpoint discovery
- When a client sees the `tunnel` attribute and QUIC is blocked, fall back to DNS transport
### DnsTransport Implementation Sketch
```rust
#[cfg(feature = "dns")]
mod dns;
pub struct DnsTransport {
domain: String, // e.g. "t.alk.dev"
resolver_addr: SocketAddr,
protocol: DnsProtocol, // Udp, Tcp, Tls, Https
auth_token: Option<String>,
}
pub struct DnsAcceptor {
domain: String,
listen_addr: SocketAddr,
protocol: DnsProtocol,
}
// DnsStream: virtual duplex backed by DNS poll/push
// Uses tokio::io::duplex() internally with a background task that:
// - Chunks outgoing bytes into DNS queries (client) or response records (server)
// - Reassembles incoming DNS payloads into the read buffer
// - Handles ACK/NACK for reliability
```
### DnsProtocol in iroh-dns
iroh-dns already supports multiple DNS protocols:
```rust
pub enum DnsProtocol {
Udp, // Classic DNS
Tcp, // DNS over TCP
Tls, // DNS over TLS (DoT) — RFC 7858
Https, // DNS over HTTPS (DoH) — RFC 8484
}
```
alknet's DNS transport should support all of these. DoH (port 443, looks like HTTPS) is particularly valuable for censorship resistance since it's indistinguishable from normal web traffic.
## Design Decisions
| ADR | Decision | Summary |
|-----|----------|---------|
| 001 | Pluggable transport | Transport trait produces stream, SSH consumes it |
| 003 | iroh stream join | `tokio::io::join` combines QUIC halves |
| 004 | SSH over transport | SSH never touches TCP/iroh/TLS directly |
| 008 | ACME/Let's Encrypt | Auto-provision TLS certs |
| 009 | Default iroh relay | n0 relay by default, `--iroh-relay` override |
| 010 | Transport chaining | `--proxy` works with all transports natively |
| 017 | Stealth mode | Peek first bytes, return 404 for non-SSH on port 443 |
| 018 | Control channel for pubsub | Reserved destination for event bus |
| 019 | Proxy dual semantics | `--proxy` routes transport on client, data on server |
| 023 | Unified auth | Shared Ed25519 key material across auth mechanisms |
| 024 | Bidirectional call protocol | Both sides can call, generalized from ADR-018 |
| 025 | Handler/spec separation | Downstream registers operations without modifying core |
| 026 | Head/worker terminology | Replace hub/spoke with head/worker; any node can be a head |
| 027 | Service layer via irpc | Core responsibilities decomposed into irpc service protocols |
| 028 | Auth as service | Auth verification via irpc service, not in-memory key set |
## References
- `@alkdev/pubsub` — TypeScript event target adapters and `EventEnvelope`
- `@alkdev/operations` — TypeScript call protocol, `OperationSpec`, registry
- `@alkdev/flowgraph` — TypeScript operation graph and call graph (planned Rust port)
- `@alkdev/storage` — TypeScript metagraph, identity, ACL (planned Rust port as `alknet-storage`)
- `@alkdev/dispatch` — Instance management service (head+worker architecture reference)
- iroh-dns — DNS resolver and endpoint info (naming/discovery)
- iroh-live-relay — WebTransport relay (planned transport reference)
- irpc — iroh streaming RPC (service layer, async boundaries)
- [event_source_types.md](/workspace/research/event_sourcing/event_source_types.md) — Event-driven architecture patterns and anti-patterns


@@ -1,91 +0,0 @@
Here is an article tailored specifically to untangle these concepts. It is structured not just as a conceptual guide, but as a **diagnostic tool**—perfect for feeding into an AI coding CLI to sniff out architectural smells and "spaghetti concepts" in a codebase.
***
# Deconstructing Event-Driven Architecture: Untangling "Spaghetti Concepts"
In modern software architecture, the term "Event" has fallen victim to *semantic diffusion*—a concept popularized by Martin Fowler where a term becomes so widely used that it loses its original, specific meaning. When developers use the same word to describe state persistence, data distribution, and asynchronous notifications, the result is "Spaghetti Concepts."
Just like spaghetti code, spaghetti concepts lead to tight coupling, brittle systems, and unpredictable side effects. To fix an Event-Driven Architecture (EDA), we must draw hard boundaries around what an "event" is actually doing in any given context.
This guide breaks down the distinct types of events, their proper use cases, and the structural anti-patterns (Conflation Points) that occur when they are mixed up.
---
## 1. Event Sourcing (State Persistence)
**The Concept:** Event Sourcing is a method of persisting state. Instead of saving the *current* state of an entity (e.g., `Quantity: 27`) in a database row, you save the *history of facts* that led to that state (e.g., `Received 30`, `Shipped 5`, `Adjusted +2`). The current state is derived by replaying these facts.
**The Golden Rule:** Event Sourcing is an **internal implementation detail** of a specific service or Aggregate. It is highly specific to the domain logic.
**How to Identify It:**
* Uses a specialized stream database (like EventStoreDB).
* Events are named in the past tense representing highly specific domain actions (`InventoryAdjusted`, `OrderPlaced`).
* The system reads a stream of these events to reconstruct an object in memory before applying new business rules.
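The inventory example above can be written out as a fold over the event history. This is an illustrative sketch in Rust; the `InventoryEvent` type and `current_quantity` function are hypothetical names, not part of any event store API.

```rust
/// Past-tense domain facts, as in the example above.
#[derive(Debug, Clone, PartialEq)]
pub enum InventoryEvent {
    Received(u32),
    Shipped(u32),
    Adjusted(i32),
}

/// Derive the current quantity by replaying the full history from zero.
pub fn current_quantity(history: &[InventoryEvent]) -> i64 {
    history.iter().fold(0i64, |qty, event| match event {
        InventoryEvent::Received(n) => qty + i64::from(*n),
        InventoryEvent::Shipped(n) => qty - i64::from(*n),
        InventoryEvent::Adjusted(delta) => qty + i64::from(*delta),
    })
}
```

Replaying `Received 30`, `Shipped 5`, `Adjusted +2` yields the stored-state equivalent of `Quantity: 27` without that number ever being persisted.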
### 🚨 Conflation Point: Leaking the Event Store (The Database Reach-In)
**The Smell:** Service B connects directly to Service A's event store to read its events and react to them.
**Why it's bad:** Because Event Sourcing events are internal state, exposing them externally completely shatters Service A's encapsulation. If Service A refactors how it calculates inventory, Service B breaks.
**The Fix:** Service A should project its internal Event Sourcing events into generalized **Integration Events** (see below) and publish those to a message broker (like RabbitMQ or Kafka) for Service B to consume.
---
## 2. Event-Carried State Transfer (Data Distribution)
**The Concept:** Also known as "Fat Events," this pattern is used to distribute data across services to avoid synchronous API calls (temporal coupling). If Service B needs to know about a Product's price to calculate a shopping cart total, Service A publishes an event containing the *entire* current state of that product. Service B listens to this event and builds a local, read-only cache (a projection).
**The Golden Rule:** These events exist to answer the question, *"What does the data look like now?"* without requiring a synchronous HTTP callback.
**How to Identify It:**
* Events often have generic CRUD-like names (`ProductUpdated`, `CustomerCreated`).
* Payloads are "fat"—they contain a lot of data (ID, Name, Price, Category, etc.).
* Often implemented using Change Data Capture (CDC) tools like Debezium reading from a primary database and publishing to Kafka.
### 🚨 Conflation Point: Event Sourcing vs. State Transfer
**The Smell:** Using a state transfer tool (like Debezium publishing `RowUpdated` events) as a makeshift Event Sourcing log to derive business logic.
**Why it's bad:** A database row update doesn't tell you *why* the data changed. Was a user's address updated because they moved, or because there was a typo? Business intent is lost.
**The Fix:** Keep CDC and state transfer events strictly for updating local read-caches in downstream services. Do not use them to drive complex business workflows that rely on "intent."
---
## 3. Notification Events (Behavioral Triggers)
**The Concept:** Also known as "Thin Events," these are lean messages broadcasted to notify the system that a business milestone has occurred. They usually contain minimal data—often just an Entity ID and an action.
**The Golden Rule:** They act as an asynchronous "tap on the shoulder" to tell downstream services to trigger their own workflows (Choreography).
**How to Identify It:**
* Payloads are "thin" (e.g., `{ "Event": "OrderShipped", "OrderId": "123" }`).
* Used heavily in integrations (e.g., triggering an email via AWS SES, or notifying a shipping warehouse).
### 🚨 Conflation Point: The Synchronous Callback Trap (Boomerang Coupling)
**The Smell:** Service A publishes a thin `OrderPlaced` event. Service B receives it, but to do its job, it must immediately make a synchronous HTTP REST call back to Service A to fetch the order details.
**Why it's bad:** If Service A goes down, Service B fails. You have successfully implemented Event-Driven Architecture, but kept the exact synchronous temporal coupling you were trying to eliminate. Furthermore, a flood of events can cause a DDoS attack on your own service.
**The Fix:** If downstream services *always* need the data to process the event, upgrade the Notification Event to an Event-Carried State Transfer ("Fat Event") by including the required data in the payload.
---
## 4. Domain Events vs. Integration Events (The Boundary Rule)
*Own Insight / DDD Integration*
A massive source of spaghetti concepts is failing to differentiate between events meant for *inside* the house and events meant for *outside* the house.
* **Domain Events:** Fired and consumed *within the same service boundary*. They can contain rich, complex, internal domain models because the producer and consumer share the same codebase/ubiquitous language.
* **Integration Events:** Fired *across service boundaries*. They should be simple, generalized, and stripped of internal jargon or complex objects.
### 🚨 Conflation Point: The Leaky Domain Model
**The Smell:** A microservice publishes an event to a global Kafka topic, and the payload contains internal database IDs, complex nested objects, or serialized language-specific data types (like Java/C# specific enums).
**Why it's bad:** Downstream services are now strictly coupled to the internal data structure of the upstream service.
**The Fix:** Implement an Anti-Corruption Layer. The producing service should catch its own Domain Event, map the data to a simplified, standardized, versioned Integration Event, and publish *that* to the wider system.
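A minimal sketch of such a mapping, written in Rust for illustration: the internal `OrderPlaced` event carries a database key and line-item detail, while the published `OrderPlacedV1` carries only a public identifier, a total, and a schema version. All type and field names here are hypothetical.

```rust
/// Internal domain event: rich, and free to change with the service.
pub struct OrderPlaced {
    /// Internal database key; must not leak across the boundary.
    pub order_row_id: u64,
    pub public_order_id: String,
    /// (sku, quantity, unit price in cents)
    pub line_items: Vec<(String, u32, u64)>,
}

/// Versioned integration event published to other services.
#[derive(Debug, PartialEq)]
pub struct OrderPlacedV1 {
    pub schema_version: u32,
    pub order_id: String,
    pub total_cents: u64,
}

/// Anti-corruption mapping: keep only what consumers need, in stable terms.
pub fn to_integration_event(event: &OrderPlaced) -> OrderPlacedV1 {
    let total_cents = event
        .line_items
        .iter()
        .map(|(_, qty, price)| u64::from(*qty) * *price)
        .sum::<u64>();
    OrderPlacedV1 {
        schema_version: 1,
        order_id: event.public_order_id.clone(),
        total_cents,
    }
}
```

Because consumers only ever see `OrderPlacedV1`, the producing service can rename or restructure `OrderPlaced` without breaking them.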
---
## Code Review CLI Prompt: "The Conflation Detector"
*(Note: Feed the following heuristics to your AI CLI alongside this article to review your codebase).*
**AI CLI Instructions:** Scan the provided codebase for Event-Driven Architecture anti-patterns. Flag code that violates the conceptual boundaries described in the article. Look specifically for:
1. **Shared Event Stores:** Are multiple distinct microservices connecting to the same EventStoreDB or reading the exact same raw Event Sourcing stream?
2. **Boomerang Callbacks:** Is an event consumer receiving a message from a broker (RabbitMQ/Kafka/Azure Service Bus), extracting an ID, and immediately making an HTTP request to the service that originated the event?
3. **Leaky Domain Models:** Are internal entity objects (e.g., classes mapped directly to ORMs like Entity Framework or Hibernate) being serialized directly into event payloads sent to external message brokers?
4. **Misused CDC:** Are Debezium/database-trigger events being used to trigger business logic workflows, rather than simply updating read-models/caches?
5. **Fat Notification Trap:** Are Notification events carrying massive payloads just to trigger an email, when a thin event would suffice? Or conversely, are thin events starving consumers of necessary data?


@@ -1,773 +0,0 @@
# SSH Tunnel VPN Alternative — Feasibility Assessment
**Date**: 2026-06-01
**Status**: Feasibility assessment / architecture sketch
**Updated**: 2026-06-01 — Added iroh transport analysis (§11)
## 1. Problem Statement
Countries in the "developed west" (UK, CA, etc.) are increasingly banning or restricting VPNs at the protocol level. The valid use case of a VPN — a *virtual private network* for securing traffic on hostile networks, accessing private infrastructure, and tunneling between trusted endpoints — gets caught in the crossfire when VPNs are treated primarily as location-spoofing tools.
SSH-based tunnels cover the same functional ground without being a VPN protocol. Blocking SSH would break the internet in critical ways (infrastructure management, CI/CD, development workflows). The goal is to build a dead-simple, self-hostable Rust client/server that provides VPN-like functionality over SSH, with optional TLS wrapping for traffic obfuscation.
## 2. Reference Codebase Analysis
### 2.1 Dispatch (`/workspace/@alkdev/dispatch`)
Dispatch shows that the russh usage needed here is well within scope. Key takeaways:
- **Pure SSH client** — `client::Handler` is a zero-sized type, auto-accepts server keys. Minimal boilerplate.
- **Arc-wrapped Handle pattern** — `Arc<client::Handle<Client>>` enables sharing across concurrent tasks (port forwarding, SFTP, exec).
- **Port forwarding via `channel_open_direct_tcpip`** — Already implemented. Local TCP listener → `direct-tcpip` SSH channel → `tokio::io::copy_bidirectional`. This is the standard SSH `-L` pattern, implemented programmatically.
- **Channel-per-operation model** — Each operation opens its own SSH channel on a shared session. Multiplexing is handled by russh internally.
- **Channel.into_stream()** — Converts SSH channels to `AsyncRead + AsyncWrite` streams, enabling use with any tokio I/O combinator.
The dispatch codebase is clean and demonstrates that the core SSH mechanics are straightforward. The new project would need both client **and** server sides, but russh's server API mirrors the client API closely.
### 2.2 russh (`/workspace/russh`)
Critical capabilities confirmed:
| Feature | API | Status |
|---------|-----|--------|
| Local port forwarding (client → server → remote) | `Handle::channel_open_direct_tcpip()` | Available, no feature flag |
| Remote port forwarding (server listens, client gets channels) | `Handle::tcpip_forward()` / Handler callback `server_channel_open_forwarded_tcpip()` | Available, no feature flag |
| Unix socket forwarding | `Handle::channel_open_direct_streamlocal()` / `Handle::streamlocal_forward()` | Available, no feature flag |
| Server-side reverse forwarding | `server::Handler::tcpip_forward()` / `server::Handle::forward_tcpip()` | Available, no feature flag |
| Arbitrary stream transport | `client::connect_stream()` / `server::run_stream()` | **Both accept `AsyncRead+AsyncWrite+Unpin+Send`** |
| Channel as bidirectional stream | `Channel::into_stream()` / `split()` | Available |
**The `connect_stream()` and `run_stream()` APIs are the key enabler for TLS wrapping.** They accept any async byte stream, meaning we can layer TLS (via `tokio-rustls`) underneath russh without modifying russh itself. The SSH session runs over a TLS stream, which looks like HTTPS to DPI.
## 3. Architecture Sketch
### 3.1 Components
```
┌─────────────────────────────────┐ ┌─────────────────────────────────┐
│ CLIENT │ │ SERVER │
│ │ │ │
│ ┌──────────┐ ┌───────────┐ │ │ ┌───────────┐ ┌──────────┐ │
│ │ TUN │ │ SSH │ │ SSH │ │ SSH │ │ Proxy │ │
│ │ Interface│───▶│ Client │──┼─ over ──▶│ Server │───▶│ Handler │ │
│ │ (tun-rs)│◀───│ (russh) │ │ TLS │ (russh) │◀───│ │ │
│ └──────────┘ └─────┬─────┘ │ opt. │ └─────┬─────┘ └────┬─────┘ │
│ │ │ │ │ │ │
│ ┌─────▼─────┐ │ │ ┌─────▼─────┐ ┌────▼─────┐ │
│ │ TLS Layer │ │ │ │ TLS Layer │ │ Outbound │ │
│ │(tokio- │ │ │ │(tokio- │ │ Proxy │ │
│ │ rustls) │ │ │ │ rustls) │ │(SOCKS5/ │ │
│ └─────┬─────┘ │ │ └─────┬─────┘ │ HTTP) │ │
│ │ │ │ │ └────┬─────┘ │
│ ┌─────▼─────┐ │ │ ┌─────▼─────┐ │ │
│ │ TCP │ │ │ │ TCP │ ┌────▼─────┐ │
│ │ Connect │◀─┼────────▶│ │ Listener │ │ Direct │ │
│ └───────────┘ │ │ └───────────┘ │ Forward │ │
│ │ │ └────┬─────┘ │
└─────────────────────────────────┘ └─────────────────────────────────┘
│ │
Proxy Mode Direct Mode
(outbound via (outbound
SOCKS5/HTTP) direct TCP)
```
### 3.2 Data Flow — Client TUN Mode
1. **TUN interface** (created via `tun-rs`) captures IP packets from the OS routing table
2. **Client reads IP packets** from the TUN device, determines destination IP:port
3. **Client opens `direct-tcpip` SSH channel** to destination via `handle.channel_open_direct_tcpip(dest_ip, dest_port, ...)`
4. **Client writes packet payload** to the SSH channel, reads response
5. **Client writes response** back to TUN interface
This is essentially what tun2proxy does, except instead of SOCKS5 upstream, it's an SSH channel.
### 3.3 Data Flow — TLS Obfuscation Mode
When `--tls` or `--https` is specified:
1. **Client establishes TLS connection** to `server:443` using `tokio-rustls::TlsStream`
2. **SSH session runs over the TLS stream** via `client::connect_stream(Arc::new(config), tls_stream, handler)`
3. **Server accepts TLS connection**, then runs `server::run_stream(server_config, tls_stream, handler)`
4. **To DPI, the traffic looks like HTTPS** — standard TLS handshake, then encrypted application data
5. Optional: Server can present a legitimate-looking certificate and serve a fake nginx 404 to non-SSH probes (similar to https_proxy's stealth approach)
### 3.4 Data Flow — Server-Side Proxy Mode
When `--proxy` is specified on the server:
1. Client requests `channel_open_direct_tcpip(target_host, target_port, ...)`
2. Server's `channel_open_direct_tcpip` handler checks ACLs
3. Instead of connecting directly, server routes through a local SOCKS5/HTTP proxy
4. This provides an additional hop for privacy — the SSH server's IP isn't exposed to the destination
### 3.5 CLI Interface Sketch
```bash
# Server — simplest mode (SSH only, port 22)
ghost serve --key /etc/ssh/ssh_host_ed25519_key
# Server — with TLS on port 443
ghost serve --key /etc/ssh/ssh_host_ed25519_key --tls --tls-cert /etc/ssl/cert.pem --tls-key /etc/ssl/key.pem
# Server — with TLS + outbound proxy
ghost serve --key /etc/ssh/ssh_host_ed25519_key --tls --tls-cert /etc/ssl/cert.pem --tls-key /etc/ssl/key.pem --proxy socks5://127.0.0.1:9050
# Client — TUN mode (routes all traffic through SSH tunnel)
ghost connect --server example.com:443 --tls --identity ~/.ssh/id_ed25519 --tun
# Client — Single port forward (like SSH -L)
ghost connect --server example.com:443 --tls --identity ~/.ssh/id_ed25519 --forward 5432:db.internal:5432
# Client — SOCKS5 proxy mode (local SOCKS5 that tunnels through SSH)
ghost connect --server example.com:443 --tls --identity ~/.ssh/id_ed25519 --socks5 1080
```
**Working name: `ghost`** (as in "ghost in the shell" — it's SSH, it's stealthy, it passes through walls). Or `shade`, `wraith`, `spectre`. Pick anything.
## 4. Key Technical Decisions & Unknowns Analysis
### 4.1 TUN Interface — SOLVED
**Library: `tun-rs` (v2, formerly `tun` crate)**
- Supports Linux, macOS, Windows (via wintun.dll), FreeBSD, OpenBSD, NetBSD, Android, iOS
- Async API with `tokio` feature: `DeviceBuilder::new().build_async()`
- Clean `recv()` / `send()` API — read IP packets, write IP packets
- Already used in production by tun2proxy and similar projects
- Supports hardware offload (TSO/GSO) on Linux for performance
- No `CAP_NET_ADMIN` needed on some platforms when using `--unshare` namespace approach (tun2proxy pattern)
**This is a solved problem.** The `tun-rs` crate is mature, cross-platform, and async-native with tokio. The implementation is straightforward:
```rust
let dev = DeviceBuilder::new()
.ipv4("10.0.0.1", 24, None)
.mtu(1400)
.build_async()?;
let mut buf = vec![0u8; 65536];
loop {
let len = dev.recv(&mut buf).await?;
// Parse IP header, determine destination
// Open SSH channel to destination
// Write response back to TUN
}
```
**Key consideration**: On Linux, creating the TUN device requires `CAP_NET_ADMIN` or root. The tun2proxy approach of using network namespaces (`--unshare`) is worth adopting for unprivileged operation.
### 4.2 SSH over TLS — SOLVED (architecturally)
**Approach: Layer TLS beneath SSH using russh's `connect_stream` / `run_stream`**
This is the critical insight. russh already decouples transport from protocol:
- `client::connect_stream(config, stream, handler)` — accepts any `AsyncRead + AsyncWrite + Unpin + Send`
- `server::run_stream(config, stream, handler)` — same for server
This means:
```rust
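// Client side: TCP, then TLS, then SSH over the TLS stream.
// `tls_connector` is a tokio_rustls::TlsConnector; `server_domain` is a rustls ServerName.
let tcp_stream = TcpStream::connect((server_addr, server_port)).await?;
let tls_stream = tls_connector.connect(server_domain, tcp_stream).await?;
let handle = client::connect_stream(config, tls_stream, handler).await?;
// Server side: accept TCP, complete the TLS handshake, then run SSH on top.
// `tls_acceptor` is a tokio_rustls::TlsAcceptor.
let (tcp_stream, addr) = tcp_listener.accept().await?;
let tls_stream = tls_acceptor.accept(tcp_stream).await?;
server::run_stream(config, tls_stream, handler).await?;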
```
**No modification to russh is needed.** This is a clean layering.
**For HTTPS stealth**: The server can:
1. Accept connections on port 443
2. Present a valid TLS certificate (self-signed or Let's Encrypt via ACME)
3. Non-SSH clients making HTTP requests get a normal-looking 404 response
4. SSH clients speak SSH protocol directly after TLS handshake
5. DPI sees standard HTTPS traffic since the TLS handshake is normal
The https_proxy project demonstrates this pattern well — stealth proxy returning fake nginx 404s to probes.
### 4.3 IP Packet Handling — NEEDS DESIGN
When using TUN mode, we're receiving raw IP packets. We need to:
1. **Parse IP headers** to determine destination IP and port
2. **Track connection state** — map `(src_ip, src_port, dst_ip, dst_port)` to SSH channels
3. **TCP reassembly** — handle segmentation, retransmission, etc.
4. **ICMP handling** — respond to pings, handle unreachable destinations
5. **DNS interception** — handle DNS queries that arrive at the TUN interface
This is the most complex part. Options:
**Option A: Use a userspace TCP/IP stack (smoltcp)**
- Parse packets, but let a userspace stack handle TCP
- Heavier dependency, but proven approach (what tun2proxy does with its own stack)
- `smoltcp` is well-maintained, used in embedded and networking projects
**Option B: Raw packet forwarding with NAT**
- Simpler conceptually — just NAT the packets, forward them through the SSH channel
- Requires handling TCP state at the IP level (seq/ack manipulation, checksum recalculation)
- More error-prone
**Option C: SOCKS5 proxy mode only (no TUN)**
- Simplest to implement — just a local SOCKS5 server that forwards through SSH
- Browsers, curl, and most apps can use SOCKS5
- No root/CAP_NET_ADMIN needed
- But: doesn't capture all traffic (UDP, DNS leaks, etc.)
**Recommendation**: Start with Option C (SOCKS5 proxy mode) as the minimal viable product. Add TUN mode (Option A with smoltcp) as an advanced feature. This matches how tun2proxy structures their project and is the pragmatic path.
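Whichever option is chosen for TUN mode, the first two steps (parsing the IP header and deriving a connection key) look roughly the same. The following is a minimal sketch for IPv4 and TCP only, using the standard library; `FlowKey` and `parse_tcp_flow` are hypothetical names, and fragments, IPv6, and UDP are ignored.

```rust
use std::net::Ipv4Addr;

/// Connection-tracking key: the classic 4-tuple.
#[derive(Debug, PartialEq, Eq, Hash, Clone, Copy)]
pub struct FlowKey {
    pub src: (Ipv4Addr, u16),
    pub dst: (Ipv4Addr, u16),
}

/// Extract the TCP 4-tuple from a raw IPv4 packet read from the TUN device.
/// Returns `None` for non-IPv4, non-TCP, or truncated packets.
pub fn parse_tcp_flow(packet: &[u8]) -> Option<FlowKey> {
    let first = *packet.first()?;
    if first >> 4 != 4 {
        return None; // not IPv4
    }
    // IHL is the header length in 32-bit words.
    let ihl = usize::from(first & 0x0f) * 4;
    if ihl < 20 || packet.len() < ihl + 4 {
        return None; // malformed header or no room for the TCP ports
    }
    if packet[9] != 6 {
        return None; // IP protocol 6 is TCP
    }
    let src_ip = Ipv4Addr::new(packet[12], packet[13], packet[14], packet[15]);
    let dst_ip = Ipv4Addr::new(packet[16], packet[17], packet[18], packet[19]);
    // The TCP header starts right after the IP header: source port, then destination port.
    let src_port = u16::from_be_bytes([packet[ihl], packet[ihl + 1]]);
    let dst_port = u16::from_be_bytes([packet[ihl + 2], packet[ihl + 3]]);
    Some(FlowKey { src: (src_ip, src_port), dst: (dst_ip, dst_port) })
}
```

A `HashMap<FlowKey, _>` keyed this way is one plausible shape for the mapping from connections to SSH channels described in step 2.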
### 4.4 SSH Server Authentication — STRAIGHTFORWARD
The server implementation needs:
- **Public key authentication** — primary method, matching standard SSH practices
- **`authorized_keys` file support** — read `~/.ssh/authorized_keys` or a custom path
- **Optional password authentication** — for convenience, but not recommended for production
russh's `server::Handler` trait provides `auth_publickey` and `auth_password` callbacks. Implementation is trivial:
```rust
async fn auth_publickey(&mut self, user: &str, public_key: &PublicKey) -> Result<Auth, Self::Error> {
    // Accept only keys present in the configured authorized_keys set.
    if self.authorized_keys.iter().any(|k| k == public_key) {
        Ok(Auth::Accept)
    } else {
        Ok(Auth::Reject { proceed_with_methods: None, partial_success: false })
    }
}
```
### 4.5 DNS Handling — DESIGN DECISION NEEDED
In TUN mode, DNS queries need to be routed through the tunnel. Options:
1. **Virtual DNS** (tun2proxy approach) — intercept DNS packets, map query names to fake IPs from a reserved range (198.18.0.0/15), resolve via the SSH tunnel
2. **DNS-over-TCP** — Force DNS through the SSH tunnel
3. **Direct DNS** — Don't handle DNS in the tunnel, rely on system resolver
4. **SOCKS5 mode** — SOCKS5 supports DOMAIN names natively (SOCKS5h), so DNS resolution happens server-side
**Recommendation**: SOCKS5 mode handles DNS naturally via SOCKS5h. For TUN mode, adopt the virtual DNS approach from tun2proxy (their `ip-stack` crate handles this).
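To make the virtual DNS idea concrete, here is a small std-only sketch of the name-to-fake-IP table it relies on. It is not tun2proxy's implementation: `VirtualDns`, `allocate`, and `resolve` are hypothetical names, addresses are handed out sequentially from 198.18.0.0/15, and there is no eviction or TTL handling.

```rust
use std::collections::HashMap;
use std::net::Ipv4Addr;

const POOL_BASE: u32 = 0xC612_0000; // 198.18.0.0
const POOL_SIZE: u32 = 1 << 17; // a /15 holds 131072 addresses

/// Maps hostnames to fake IPs in 198.18.0.0/15 and back (hypothetical type).
pub struct VirtualDns {
    by_name: HashMap<String, Ipv4Addr>,
    by_ip: HashMap<Ipv4Addr, String>,
    next_offset: u32,
}

impl VirtualDns {
    pub fn new() -> Self {
        // Start at offset 1 so 198.18.0.0 itself is never handed out.
        VirtualDns { by_name: HashMap::new(), by_ip: HashMap::new(), next_offset: 1 }
    }

    /// Return the fake IP for `name`, allocating one on first use.
    /// Returns `None` once the pool is exhausted (no eviction in this sketch).
    pub fn allocate(&mut self, name: &str) -> Option<Ipv4Addr> {
        let key = name.trim_end_matches('.').to_ascii_lowercase();
        if let Some(ip) = self.by_name.get(&key) {
            return Some(*ip);
        }
        if self.next_offset >= POOL_SIZE {
            return None;
        }
        let ip = Ipv4Addr::from(POOL_BASE + self.next_offset);
        self.next_offset += 1;
        self.by_name.insert(key.clone(), ip);
        self.by_ip.insert(ip, key);
        Some(ip)
    }

    /// Reverse lookup used when a TCP connection to a fake IP arrives:
    /// the original hostname is what gets sent through the tunnel.
    pub fn resolve(&self, ip: Ipv4Addr) -> Option<&str> {
        self.by_ip.get(&ip).map(String::as_str)
    }
}
```

The intercepted DNS query is answered with the result of `allocate`, and the later connection to that address is translated back with `resolve` before opening the SSH channel.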
### 4.6 Connection Multiplexing — ALREADY SOLVED
russh multiplexes channels over a single SSH connection. No need to manage multiple TCP connections per tunnel. One SSH connection, many channels. This is exactly what we want.
### 4.7 Keep-Alive and Reconnection — NEEDS DESIGN
- **SSH keepalive**: russh `Config` has `keepalive_interval` and `keepalive_max`
- **Auto-reconnect**: Client should detect disconnection (`is_closed()`) and reconnect with exponential backoff
- **TUN continuity**: When SSH reconnects, existing TCP connections through the tunnel will fail, but new ones will work. This is acceptable behavior (same as any VPN).
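The exponential backoff mentioned for auto-reconnect can be sketched as a pure function. This is an illustration under assumptions: `backoff_delay` is a hypothetical helper, the delay doubles per attempt up to a cap, and jitter is deliberately left out.

```rust
use std::time::Duration;

/// Delay before reconnect attempt number `attempt` (0-based):
/// `base * 2^attempt`, capped at `max`. No jitter in this sketch.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // checked_pow / checked_mul keep large attempt counts from overflowing;
    // any overflow simply means "use the cap".
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}
```

A reconnect loop would sleep for `backoff_delay(n, base, max)` after the n-th failed attempt and reset `n` to zero once a session is established.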
### 4.8 Server-Side Proxy (Outbound) — STRAIGHTFORWARD
When `--proxy` is specified, the server's `channel_open_direct_tcpip` handler forwards through a local proxy:
```rust
async fn channel_open_direct_tcpip(
    &mut self,
    host: &str,
    port: u32,
    ...
) -> Result<Channel<Msg>, Self::Error> {
    // Pick exactly ONE of the options below, depending on the `--proxy` setting.
    // Option 1: Connect directly
    let stream = TcpStream::connect((host, port as u16)).await?;
    // Option 2: Connect through SOCKS5 proxy
    let stream = connect_socks5(proxy_addr, host, port).await?;
    // Option 3: Connect through HTTP CONNECT proxy
    let stream = connect_http_proxy(proxy_addr, host, port).await?;
    // Then bidirectional copy between SSH channel and stream
    Ok(channel)
}
```
SOCKS5 client implementation is simple (a 3-byte greeting and a 2-byte reply when no authentication is used, followed by a variable-length CONNECT request). HTTP CONNECT is also straightforward. Both can be implemented in a few hundred lines.
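As an illustration of how small the SOCKS5 client side is, the sketch below encodes the two messages a client sends per RFC 1928: the no-authentication greeting and a CONNECT request addressed by domain name. `SOCKS5_GREETING` and `socks5_connect_request` are hypothetical names, and reading the server's replies is left out.

```rust
/// Client greeting offering only "no authentication":
/// VER = 5, NMETHODS = 1, METHODS = [0x00].
pub const SOCKS5_GREETING: [u8; 3] = [0x05, 0x01, 0x00];

/// Build a SOCKS5 CONNECT request addressed by domain name (ATYP = 0x03),
/// so name resolution happens on the proxy side.
/// Returns `None` if the host is empty or longer than 255 bytes.
pub fn socks5_connect_request(host: &str, port: u16) -> Option<Vec<u8>> {
    if host.is_empty() || host.len() > 255 {
        return None;
    }
    let mut req = Vec::with_capacity(7 + host.len());
    // VER, CMD = CONNECT, RSV, ATYP = domain name, name length
    req.extend_from_slice(&[0x05, 0x01, 0x00, 0x03, host.len() as u8]);
    req.extend_from_slice(host.as_bytes());
    req.extend_from_slice(&port.to_be_bytes()); // port in network byte order
    Some(req)
}
```

A `connect_socks5` helper would write the greeting, check the 2-byte method reply, write this request, and then check the reply status before handing the socket to the bidirectional copy.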
## 5. Dependency Assessment
| Dependency | Purpose | Maturity | Risk |
|------------|---------|----------|------|
| `russh` | SSH client & server | High (used in dispatch, well-maintained) | Low — already proven |
| `tun-rs` (v2) | TUN/TAP interface | High (cross-platform, prod-tested, bench'd at 70Gbps) | Low — well-maintained |
| `tokio-rustls` | TLS layer | High (standard Rust TLS) | Low — widely used |
| `rustls` | TLS implementation | High | Low — no ring dependency needed with aws-lc-rs |
| `smoltcp` | Userspace TCP/IP stack (TUN mode) | Medium-High | Medium — complex but well-proven |
| `clap` | CLI args | High | None |
| `tracing` | Structured logging | High | None |
| `anyhow/thiserror` | Error handling | High | None |
| `tokio` | Async runtime | High | None |
**No immature or risky dependencies.** Every crate is well-established with active maintenance.
## 6. Risk Assessment
### 6.1 Technical Risks
| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|------------|
| TUN mode complexity (TCP state, IP parsing) | Medium | Medium | Start with SOCKS5 mode; TUN is advanced feature |
| Cross-platform TUN differences | Medium | Medium | tun-rs handles most; `--unshare` for Linux privilege separation |
| TLS + SSH interaction edge cases | Low | Low | Both are well-tested; russh's `connect_stream` / `run_stream` abstracts transport |
| Performance under load | Low | Medium | russh multiplexes channels; tun-rs has benchmarked 35+ Gbps async |
| DPI detecting SSH banner over TLS | Medium | High | After TLS, the SSH banner ("SSH-2.0-...") is encrypted. But SNI reveals domain. Use `Config { anonymous: true }` to minimize fingerprint, or configure `client_id` to look like a web server. |
### 6.2 Protocol-Level Risks
| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|------------|
| SSH protocol fingerprinting (packet sizes, timing) | Medium | Medium | Pad messages, add random delays. russh doesn't do this natively — would need custom channel wrapping. |
| SNI leaks domain in TLS handshake | High | Low | Use an innocuous domain. Could also explore ECH (Encrypted Client Hello) in rustls if available. |
| Deep packet inspection identifying SSH patterns even over TLS | Low-Medium | Medium | The TLS layer prevents payload inspection. Only traffic analysis (sizes, timing) is possible. Padding and traffic shaping could help. |
| Countries blocking SSH traffic on port 22 | Already happening | N/A | That's the whole point — we run SSH over TLS on port 443 |
### 6.3 Usability Risks
| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|------------|
| Requires self-hosted server | By design | Medium | Document simple deployment. Provide Docker image. Consider one-command install script. |
| Root/CAP_NET_ADMIN needed for TUN on Linux | High | Medium | Provide `--unshare` mode. SOCKS5 mode needs no privileges. |
| Certificate management for TLS mode | Medium | Low | Support self-signed certs, ACME (Let's Encrypt), or manual cert paths. |
## 7. Implementation Plan
### Phase 1: MVP (2-3 days)
**SOCKS5 proxy mode only. No TUN. Client + server.**
1. **Server binary** (`ghost serve`)
- russh server implementation with public key auth
- `channel_open_direct_tcpip` handler: connect to target directly or via outbound proxy
- Optional TLS wrapping via `tokio-rustls` + `server::run_stream`
- Config: listen address, host key path, authorized keys, TLS options, proxy options
2. **Client binary** (`ghost connect`)
- russh client with public key auth
- Local SOCKS5 server that forwards connections through SSH `channel_open_direct_tcpip`
- Optional TLS wrapping via `tokio-rustls` + `client::connect_stream`
- Config: server address, identity key, TLS options, SOCKS5 listen address
3. **Testing**
- Integration test: client → server → HTTP target
- Test with: `curl --socks5-hostname 127.0.0.1:1080 https://example.com`
- Test TLS mode against DPI-like inspection
### Phase 2: Port Forwarding (1 day)
4. **Client: explicit port forwards** (`--forward local:remote:port`)
- Direct reimplementation of SSH `-L` and `-R`
- Uses `channel_open_direct_tcpip` for local forwards
- Uses `tcpip_forward` / handler callback for remote forwards
5. **Client: SOCKS5 with DNS** (SOCKS5h)
- Domain names resolved server-side, not client-side
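The `--forward` value from step 4 needs to be parsed before a forward can be set up. The sketch below assumes the value follows the `ssh -L` shape `LOCAL_PORT:REMOTE_HOST:REMOTE_PORT`; that interpretation, along with the `ForwardSpec` type and `parse_forward` function, is an assumption for illustration rather than a decided CLI format.

```rust
/// Parsed form of a `LOCAL_PORT:REMOTE_HOST:REMOTE_PORT` forward (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct ForwardSpec {
    pub local_port: u16,
    pub remote_host: String,
    pub remote_port: u16,
}

pub fn parse_forward(spec: &str) -> Result<ForwardSpec, String> {
    // The local port is everything before the first ':' ...
    let (local, rest) = spec
        .split_once(':')
        .ok_or_else(|| format!("expected LOCAL:HOST:PORT, got {:?}", spec))?;
    // ... and the remote port is everything after the last ':', so a
    // bracketed IPv6 host such as "[::1]" survives in the middle.
    let (host, remote) = rest
        .rsplit_once(':')
        .ok_or_else(|| format!("expected LOCAL:HOST:PORT, got {:?}", spec))?;
    if host.is_empty() {
        return Err("remote host is empty".to_string());
    }
    let local_port = local
        .parse::<u16>()
        .map_err(|e| format!("bad local port {:?}: {}", local, e))?;
    let remote_port = remote
        .parse::<u16>()
        .map_err(|e| format!("bad remote port {:?}: {}", remote, e))?;
    Ok(ForwardSpec { local_port, remote_host: host.to_string(), remote_port })
}
```

Each parsed spec would then bind `local_port` and open a `channel_open_direct_tcpip` channel to `remote_host:remote_port` per accepted connection.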
### Phase 3: TUN Mode (2-3 days)
6. **Client: TUN interface mode** (`--tun`)
- Create TUN device via `tun-rs`
- IP packet routing through SSH channels
- Either: raw packet forwarding (simpler, but fragile) or smoltcp integration (robust, but more code)
- Recommend: use tun2proxy's `ip-stack` crate or similar for TCP reconstruction
- Virtual DNS for TUN mode
7. **Privilege separation**
- `--unshare` mode for Linux (create network namespace, unshare)
- Document CAP_NET_ADMIN requirement
### Phase 4: Hardening & Polish (1-2 days)
8. **Obfuscation improvements**
- SSH banner customization (`client_id` config)
- Random padding in channel data
- Traffic shaping / constant-rate padding (optional, advanced)
9. **Server stealth**
- Non-SSH connection detection: serve fake nginx 404 on TLS port
- Dual-protocol listener: HTTPS for browsers, SSH for ghost clients
10. **Auto-reconnect**
- Exponential backoff reconnect on SSH session drop
- TUN interface survives reconnect (new connections work, in-flight connections fail gracefully)
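The non-SSH connection detection in step 9 could start from a peek at the first decrypted bytes of a connection. The sketch below is one possible classifier, not a committed design: `FirstBytes` and `classify_first_bytes` are hypothetical names, and it relies only on the facts that an SSH client opens with an `SSH-2.0-` identification string and an HTTP/1.x request opens with a method token followed by a space.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum FirstBytes {
    Ssh,      // hand the stream to the SSH server
    Http,     // serve the decoy web response
    NeedMore, // too few bytes so far to decide
    Unknown,  // neither protocol; close or serve the decoy
}

const SSH_PREFIX: &[u8] = b"SSH-2.0-";
const HTTP_METHODS: [&[u8]; 8] = [
    b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ", b"OPTIONS ", b"CONNECT ", b"PATCH ",
];

/// Classify a connection from the bytes read so far (after TLS termination).
pub fn classify_first_bytes(buf: &[u8]) -> FirstBytes {
    if buf.starts_with(SSH_PREFIX) {
        return FirstBytes::Ssh;
    }
    if HTTP_METHODS.iter().any(|m| buf.starts_with(m)) {
        return FirstBytes::Http;
    }
    // A short read may still be the beginning of something we recognise.
    let could_grow = SSH_PREFIX.starts_with(buf)
        || HTTP_METHODS.iter().any(|m| m.starts_with(buf));
    if could_grow { FirstBytes::NeedMore } else { FirstBytes::Unknown }
}
```

The listener would keep reading while the result is `NeedMore`, then replay the buffered bytes into whichever handler is chosen.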
### Phase 5: Distribution (1 day)
11. **Build & packaging**
- Static musl binary for Linux
- Docker image
- systemd unit file
- One-line install script
## 8. Estimated Timeline
| Phase | Duration | Cumulative |
|-------|----------|------------|
| Phase 1: SOCKS5 MVP | 2-3 days | 2-3 days |
| Phase 2: Port Forwarding | 1 day | 3-4 days |
| Phase 3: TUN Mode | 2-3 days | 5-7 days |
| Phase 4: Hardening & Polish | 1-2 days | 6-9 days |
| Phase 5: Distribution | 1 day | 7-10 days |
With LLM-assisted development, the MVP (Phase 1) could realistically be done in 1-2 focused sessions, and the full feature set in under a week.
## 9. Open Questions
1. **Project name**: `ghost`, `wraith`, `shade`, `spectre`, something else? Needs to be catchy, not conflict with existing Rust crates, and suggest stealth/mobility.
2. **TUN vs smoltcp** — Should TUN mode integrate smoltcp for a userspace TCP stack, or try the simpler "just forward packets and let the OS handle TCP" approach? Smoltcp is more work but more robust. tun2proxy's approach (which uses their own `ip-stack`) suggests userspace TCP is the way to go for reliability.
3. **TLS certificate story** — Should the server support ACME/Let's Encrypt auto-provisioning (like https_proxy does), or is manual cert management sufficient? Auto-provisioning is more user-friendly but adds significant complexity and a dependency on the ACME protocol.
4. **Mobile support** — Should we target iOS/Android eventually? tun-rs supports both via platform APIs, but mobile is a much bigger scope. Probably Phase 6+.
5. **Multi-user server** — Should the server support multiple simultaneous clients? russh's server model handles this naturally (each connection gets its own Handler instance), but access control (per-user ACLs, bandwidth limits) would add complexity.
6. **Crates structure** — Single binary with subcommands (`ghost serve`, `ghost connect`), or separate binaries? Single crate with `#[tokio::main]` dispatch seems cleanest for MVP.
## 10. Conclusion
**This is feasible and straightforward.** The core mechanics — SSH tunnel via russh, TLS wrapping via tokio-rustls, TUN interface via tun-rs — are all solved problems with mature Rust libraries. The dispatch codebase proves russh is production-ready for this kind of work. The `connect_stream` / `run_stream` API in russh makes TLS wrapping a clean layering, not a hack.
The biggest design decision is TUN mode approach (raw packets vs. userspace TCP), and the recommendation is to start with SOCKS5 mode and add TUN later. This gives a working tool in 2-3 days that covers the primary use case (private tunneling that doesn't look like VPN traffic).
The project is well-scoped, the risk profile is low, and the existing tooling (russh, tun-rs, tokio-rustls) handles the hard parts. This is a "few days of focused work" estimate, not a "few weeks."
## 11. iroh Transport — Feasibility Addendum
### 11.1 The Insight
russh's `connect_stream()` and `server::run_stream()` accept **any** `AsyncRead + AsyncWrite + Unpin + Send` stream. The iroh project provides exactly such a stream — a QUIC bidirectional stream (`open_bi()` / `accept_bi()`) where both `SendStream` and `RecvStream` implement `tokio::io::AsyncWrite` and `tokio::io::AsyncRead` respectively.
This means **iroh can serve as a transport layer beneath SSH**, the same way TLS can. The architecture becomes:
```
┌──────────────────────────────────────────────────┐
│ APPLICATION │
│ (SOCKS5 / TUN / port-forward) │
├──────────────────────────────────────────────────┤
│ SSH (russh) │
│ channel_open_direct_tcpip/etc. │
├──────────────────────────────────────────────────┤
│ Transport Layer (SWAPPABLE) │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────────┐ │
│ │ TCP │ │ TLS │ │ iroh │ │
│ │(direct) │ │(obfusc) │ │ (P2P QUIC) │ │
│ └──────────┘ └──────────┘ └──────────────┘ │
└──────────────────────────────────────────────────┘
```
### 11.2 Why iroh is Compelling
iroh solves the **biggest deployment problem** with SSH tunnels: the server needs a public IP and open port.
With iroh as transport:
1. **No public IP needed** — Server and client both connect outbound to iroh's relay servers. Hole-punching attempts direct UDP in the background.
2. **No open firewall ports** — The server only needs outbound HTTPS to the relay. No inbound 22 or 443 required.
3. **NAT traversal for free** — iroh's relay + hole-punching means peers behind CGNAT or strict firewalls can still connect.
4. **Ed25519-based addressing** — Peers are identified by public key (EndpointId), no DNS or IP addresses needed.
5. **Built-in address discovery** — pkarr DNS records let you find a peer knowing only their public key.
6. **Still SSH underneath** — All the channel multiplexing, port forwarding, SOCKS5 logic still works. iroh is just the wire.
The use cases multiply:
- **Home server behind NAT**: No reverse proxy, no dynamic DNS, no port forwarding. Just run the server, share the EndpointId.
- **Temporary infrastructure**: Spin up a server anywhere (even behind corporate NAT), connect by public key.
- **Internal services**: Expose Postgres/Redis etc. over an SSH connection that traverses any NAT, no VPN required.
- **Censorship circumvention**: SSH over iroh QUIC to a relay that uses standard HTTPS. The deep packet inspector sees HTTPS traffic to a relay server, not SSH.
### 11.3 How It Works — The Code
The integration is trivially clean because both primitives implement the right traits:
**Client side:**
```rust
// Create iroh endpoint
let endpoint = Endpoint::builder(presets::N0)
    .alpns(vec![b"ghost-ssh/1".to_vec()])
    .bind()
    .await?;
// Connect to peer (no IP needed — just public key)
let addr = EndpointAddr::from_bytes(peer_id_bytes);
let conn = endpoint.connect(addr, b"ghost-ssh/1").await?;
// Open a bidirectional QUIC stream
let (send_stream, recv_stream) = conn.open_bi().await?;
// Combine into a single AsyncRead+AsyncWrite
let iroh_stream = tokio::io::join(recv_stream, send_stream);
// OR use a custom wrapper that implements AsyncRead+AsyncWrite
// Run SSH client over the iroh stream
let handle = client::connect_stream(
    Arc::new(client_config),
    iroh_stream,
    client_handler
).await?;
```
**Server side:**
```rust
// Create iroh endpoint
let endpoint = Endpoint::builder(presets::N0)
    .alpns(vec![b"ghost-ssh/1".to_vec()])
    .bind()
    .await?;
// Accept incoming connections
while let Some(incoming) = endpoint.accept().await {
    let conn = incoming.await?;
    // For each connection, accept a bidirectional stream
    let (send_stream, recv_stream) = conn.accept_bi().await?;
    let iroh_stream = tokio::io::join(recv_stream, send_stream);
    // Run SSH server over the iroh stream
    server::run_stream(
        Arc::new(server_config),
        iroh_stream,
        server_handler
    ).await?;
}
```
**Or using iroh's Router + ProtocolHandler pattern:**
```rust
struct GhostSshProtocol;
impl ProtocolHandler for GhostSshProtocol {
    async fn accept(&self, connection: Connection) -> Result<(), AcceptError> {
        // iroh already handled connection acceptance
        // We can accept bi streams on the connection directly
        // Or: each SSH session could be a new bi stream on the same connection
        let (send, recv) = connection.accept_bi().await
            .map_err(AcceptError::from_err)?;
        let stream = join_streams(recv, send);
        server::run_stream(server_config, stream, GhostHandler).await
            .map_err(AcceptError::from_err)
    }
}
let endpoint = Endpoint::builder(presets::N0).bind().await?;
let router = Router::builder(endpoint)
    .accept(b"ghost-ssh/1", GhostSshProtocol)
    .spawn();
```
### 11.4 Design Decision: One Stream per Session vs. One Connection with Multiple Streams
There are two ways to layer SSH over iroh:
**Option A: One QUIC bi-stream per SSH session**
- Each SSH session opens a new `open_bi()` stream under a single iroh `Connection`
- The iroh Connection itself persists (one QUIC connection per peer pair)
- Simpler: `open_bi()` gives you a stream, you feed it to `connect_stream()`
- Pro: Connection setup cost amortized. If SSH disconnects, `open_bi()` again is cheap.
- Con: Need to combine `RecvStream` + `SendStream` into a single `AsyncRead+AsyncWrite`
**Option B: One iroh Connection per SSH session (new QUIC connection each time)**
- Each SSH session = one `endpoint.connect()` + the whole connection
- Wasteful: QUIC handshake + iroh relay discovery each time
- Not recommended
**Recommendation: Option A.** One iroh `Connection` per peer pair, one `open_bi()` stream per SSH session. The connection is long-lived; SSH sessions can be re-established cheaply on the same QUIC connection.
### 11.5 Combining `RecvStream + SendStream` into `AsyncRead + AsyncWrite`
QUIC splits streams into separate send and receive halves. russh needs a single duplex stream. Two approaches:
**Approach 1: `tokio::io::join()` (simplest)**
```rust
use tokio::io;
fn join_iroh_stream(
    recv: iroh::endpoint::RecvStream,
    send: iroh::endpoint::SendStream,
) -> impl AsyncRead + AsyncWrite + Unpin + Send {
    io::join(recv, send)
}
```
`tokio::io::join` returns a `Join<A, B>` that implements both `AsyncRead` (from the first) and `AsyncWrite` (from the second). Since `RecvStream: AsyncRead` and `SendStream: AsyncWrite`, this works directly.
**Approach 2: Custom wrapper (more control)**
```rust
struct IrohStream {
    recv: iroh::endpoint::RecvStream,
    send: iroh::endpoint::SendStream,
}
impl AsyncRead for IrohStream { /* delegate to recv */ }
impl AsyncWrite for IrohStream { /* delegate to send */ }
```
**Recommendation: Start with `tokio::io::join`.** It's one line and has the right trait implementations. Only switch to a custom wrapper if profiling shows overhead (unlikely).
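The delegation pattern behind both approaches can be shown with the standard library's blocking traits. The sketch below is a synchronous analogue only: `Duplex` is a hypothetical type built on `std::io::Read` and `std::io::Write`, whereas a real `IrohStream` would implement tokio's `AsyncRead` and `AsyncWrite` through their poll methods.

```rust
use std::io::{self, Read, Write};

/// Glue a read half and a write half into one duplex value (hypothetical type).
pub struct Duplex<R, W> {
    recv: R,
    send: W,
}

impl<R, W> Duplex<R, W> {
    pub fn new(recv: R, send: W) -> Self {
        Duplex { recv, send }
    }

    /// Split back into the two halves.
    pub fn into_parts(self) -> (R, W) {
        (self.recv, self.send)
    }
}

// Reads are served entirely by the receive half ...
impl<R: Read, W> Read for Duplex<R, W> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.recv.read(buf)
    }
}

// ... and writes (including flush) entirely by the send half.
impl<R, W: Write> Write for Duplex<R, W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.send.write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.send.flush()
    }
}
```

The two halves never share state, which is why a generic joiner is enough and a custom wrapper is only worth writing when extra behaviour (metrics, shutdown handling) is needed.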
### 11.6 Relay Considerations
iroh provides two relay options:
1. **Default n0 relay servers** (`https://use1-1.relay.n0.iroh.network.`) — free, operated by n0. Good for getting started and testing.
2. **Self-hosted relay** (`iroh-relay` crate) — The relay server is part of the iroh project. Can be self-hosted for complete independence.
For this project:
- **Development/quick start**: Use n0 relays (they're free and reliable)
- **Production/privacy**: Self-host the relay server. It's a single binary (`iroh-relay`) that can run on any VPS. The relay sees only encrypted QUIC packets — it cannot read SSH traffic.
- **Paranoid**: Disable relay entirely. Both peers must have direct network connectivity. No third-party dependency.
The `RelayMode` enum handles this:
```rust
// Default n0 relays
let endpoint = Endpoint::builder(presets::N0).bind().await?;
// Self-hosted relay
let relay_map = RelayMap::from([(relay_url, Some(direct_addr))]);
let endpoint = Endpoint::builder(presets::Custom(relay_map)).bind().await?;
// No relay (direct only)
let endpoint = Endpoint::builder(presets::RelayDisabled).bind().await?;
```
### 11.7 Updated Architecture with iroh Transport
```
┌───────────────────────────────────────────────────────────┐
│ CLIENT │
│ │
│ ┌──────────┐ ┌───────────┐ ┌────────────────────┐ │
│ │ TUN / │ │ SSH │ │ Transport │ │
│ │ SOCKS5 / │───▶│ Client │───▶│ (selectable) │ │
│ │ Port- │ │ (russh) │ │ │ │
│ │ Forward │ │ │ │ ┌────────────────┐ │ │
│ └──────────┘ └───────────┘ │ │ TCP direct │ │ │
│ │ │ TLS (rustls) │ │ │
│ │ │ iroh (QUIC) │ │ │
│ │ └────────────────┘ │ │
│ └────────────────────┘ │
└───────────────────────────────────────────────────────────┘
┌───────────────────────────────────────────────────────────┐
│ SERVER │
│ │
│ ┌──────────┐ ┌───────────┐ ┌────────────────────┐ │
│ │ Outbound │ │ SSH │ │ Transport │ │
│ │ Proxy / │◀───│ Server │◀───│ (selectable) │ │
│ │ Direct │ │ (russh) │ │ │ │
│ │ Forward │ │ │ │ ┌────────────────┐ │ │
│ └──────────┘ └───────────┘ │ │ TCP listener │ │ │
│ │ │ TLS (rustls) │ │ │
│ │ │ iroh (QUIC) │ │ │
│ │ └────────────────┘ │ │
│ └────────────────────┘ │
└───────────────────────────────────────────────────────────┘
┌──────────────┐
│ iroh Relay │ (optional, for NAT)
│ (self-host │
│ or n0) │
└──────────────┘
Transport modes:
--transport tcp Direct TCP (default, simplest)
--transport tls TCP + TLS (obfuscation)
--transport iroh iroh QUIC (NAT traversal, no public IP)
--transport iroh+tls iroh QUIC + TLS (NAT traversal + obfuscation)
```
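The four `--transport` values listed in the diagram map naturally onto an enum. The following sketch shows one way to parse them; `TransportMode` and its `uses_iroh` helper are hypothetical names, and only the four strings come from the list above.

```rust
use std::str::FromStr;

/// Transport selected on the command line (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TransportMode {
    Tcp,
    Tls,
    Iroh,
    IrohTls,
}

impl FromStr for TransportMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "tcp" => Ok(TransportMode::Tcp),
            "tls" => Ok(TransportMode::Tls),
            "iroh" => Ok(TransportMode::Iroh),
            "iroh+tls" => Ok(TransportMode::IrohTls),
            other => Err(format!(
                "unknown transport {:?} (expected tcp, tls, iroh, or iroh+tls)",
                other
            )),
        }
    }
}

impl TransportMode {
    /// True for the modes that go through an iroh endpoint and therefore
    /// take a `--peer` endpoint ID instead of a `--server` address.
    pub fn uses_iroh(self) -> bool {
        matches!(self, TransportMode::Iroh | TransportMode::IrohTls)
    }
}
```

With `FromStr` in place, a CLI layer can call `value.parse::<TransportMode>()` and report the error string directly to the user.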
### 11.8 iroh Transport — Risk Assessment
| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|------------|
| iroh API instability (it's v0.x) | Medium | Medium | Pin version; iroh's core stream API is stable (it's just QUIC) |
| Relay dependency for initial connectivity | Low | Low | Self-host relay; or direct-only mode for LAN |
| QUIC stream vs TCP semantics differences | Low | Medium | QUIC streams are reliable ordered byte streams, same semantics as TCP. russh won't know the difference. |
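| Performance overhead of QUIC + SSH | Low | Low | QUIC is fast, but each SSH session here rides a single QUIC stream, which is still delivered in order, so QUIC's per-stream independence gives no head-of-line-blocking benefit inside one session. Expect throughput comparable to SSH over TCP; benchmark before claiming a speedup. |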
| iroh crate size / compile time | Low | Low | iroh pulls in quinn + rustls + lots of networking. But we already need rustls for TLS mode. The incremental cost is the QUIC stack. |
**Key observation**: QUIC streams have identical reliability and ordering guarantees to TCP. russh's `connect_stream()` / `run_stream()` will work correctly over iroh QUIC streams with no modifications.
### 11.9 Updated CLI Sketch with iroh
```bash
# Server — iroh mode (no public IP needed!)
ghost serve --key ~/.ssh/id_ed25519 --transport iroh
# Prints endpoint ID: e.g., "abc123..."
# Clients connect using this ID
# Server — iroh mode with self-hosted relay
ghost serve --key ~/.ssh/id_ed25519 --transport iroh \
--iroh-relay https://my-relay.example.com
# Client — connect via iroh (no IP needed!)
ghost connect --peer abc123def456... --transport iroh --socks5 1080
# Client — connect via iroh with TUN
ghost connect --peer abc123def456... --transport iroh --tun
# Client — traditional TCP mode (still works)
ghost connect --server 1.2.3.4:443 --transport tls --socks5 1080
```
### 11.10 Implementation Impact
Adding iroh as a transport option is **incremental** — it doesn't change the SSH layer at all:
1. **Transport trait**: Define a `Transport` trait that produces `Box<dyn AsyncRead + AsyncWrite + Unpin + Send>`:
```rust
trait Transport {
    async fn connect(&self) -> Result<Box<dyn AsyncRead + AsyncWrite + Unpin + Send>>;
}
```
2. **Three implementations**:
- `TcpTransport` — plain TCP
- `TlsTransport` — TCP + tokio-rustls
- `IrohTransport` — iroh endpoint + `open_bi()` + `tokio::io::join(recv, send)`
3. **Server side**: Same trait, different direction:
```rust
trait TransportAcceptor {
    async fn accept(&self) -> Result<Box<dyn AsyncRead + AsyncWrite + Unpin + Send>>;
}
```
4. **The SSH layer never changes.** russh's `connect_stream()` / `run_stream()` takes the transport stream, and everything else stays the same.
### 11.11 Dependency Impact
| Dependency | Added? | Size concern |
|------------|--------|-------------|
| `iroh` (includes iroh-base) | Yes, feature-gated | Yes — pulls in QUIC stack, DNS, relay client |
| `n0-error` | Yes (small) | No |
| `tokio` | Already present | No |
| `rustls` | Already present (for TLS mode) | No |
**Recommendation**: Make iroh a feature flag (`--features iroh`) so the base install stays lean. Users who want P2P capability opt in:
```toml
[features]
default = ["tls"]
tls = ["tokio-rustls", "rustls-pemfile"]
iroh = ["dep:iroh"]
tun = ["dep:tun-rs", "dep:smoltcp"]
```
### 11.12 The Compelling Narrative
With iroh as a transport option, this tool becomes something genuinely new:
- **Not just a VPN alternative** — it's a VPN alternative that doesn't need port forwarding, public IPs, or DNS records.
- **Not just SSH tunneling** — it's SSH tunneling that works between any two machines on the internet, regardless of NAT configuration.
- **Not just for censorship circumvention** — it's how you securely expose internal services (Postgres, Redis, admin panels) from machines behind corporate firewalls or home networks.
The "ghetto VPN" becomes a **zero-config mesh VPN**. Spin up `ghost serve` on any machine, share the public key, connect from anywhere. The relay server is optional (self-host or n0's free tier). And underneath it's just SSH, doing what SSH does best.
This isn't theoretical: the API compatibility is exact. iroh's `RecvStream + SendStream` implement `AsyncRead + AsyncWrite`, and russh's `connect_stream` / `run_stream` accept `AsyncRead + AsyncWrite`. A single call to `tokio::io::join(recv, send)` gives you a transport stream that russh can use.

@@ -1,472 +0,0 @@
# Alknet Flowgraph: Operation Graph, Call Graph, and Graph Operations
> Status: Research / Draft
> Last updated: 2026-06-06
## Overview
`alknet-flowgraph` is a Rust crate providing graph data structures and operations, mapping the TypeScript `@alkdev/flowgraph` package's call-graph and operation-graph concepts to `petgraph::DiGraph`. It works with `alknet-storage` for persistence and `alknet-core` for call protocol event processing.
## Core Abstraction
`petgraph::DiGraph` replaces graphology. The mapping is nearly 1:1 for the operations used:
| TypeScript (graphology) | Rust (petgraph) |
|------------------------|-----------------|
| `graph.addNode(key, attrs)` | `graph.add_node(attrs)` returns `NodeIndex` |
| `graph.addEdge(source, target, attrs)` | `graph.add_edge(source, target, attrs)` returns `EdgeIndex` |
| `graph.getAttribute(key)` | `graph[node]` |
| `graph.forEachNode()` | `graph.node_indices().for_each()` |
| `graph.inNeighbors(node)` | `graph.neighbors_directed(node, Direction::Incoming)` |
| `graph.outNeighbors(node)` | `graph.neighbors_directed(node, Direction::Outgoing)` |
| `graph.hasCycle()` | `petgraph::algo::is_cyclic_directed(&graph)` |
| `graph.topologicalSort()` | `petgraph::algo::toposort(&graph, None)` returns `Result<Vec<NodeIndex>, Cycle<NodeIndex>>` |
| `graph.export()` | serde serialization |
| `FlowGraph.fromJSON(data)` | serde deserialization |
### Key Difference: Node Keys
graphology uses string node keys (`"call-001"`). petgraph uses `NodeIndex` (u32). We maintain a `HashMap<String, NodeIndex>` for node-key-to-index lookups, mirroring the `key` column in the `nodes` SQLite table.
```rust
pub struct FlowGraph<N, E>
where
    N: NodeAttributes,
    E: EdgeAttributes,
{
    graph: DiGraph<N, E>,
    key_to_index: HashMap<String, NodeIndex>,
}
pub trait NodeAttributes: Clone + Serialize + DeserializeOwned + Debug + Send + Sync {
    fn key(&self) -> &str;
    fn set_key(&mut self, key: String);
}
pub trait EdgeAttributes: Clone + Serialize + DeserializeOwned + Debug + Send + Sync {
    fn edge_type(&self) -> &str;
}
```
## Operation Graph (Static)
Built from `OperationSpec`s at startup. Answers structural questions about the operation space: type compatibility, cycle detection, reachability.
### Construction
```rust
impl FlowGraph<OperationNodeAttrs, OperationEdgeAttrs> {
    pub fn from_specs(specs: &[OperationSpec]) -> Result<Self, CycleError> {
        let mut graph = Self::new();
        for spec in specs {
            graph.add_operation(spec.clone());
        }
        for (source, target) in graph.compute_type_edges(specs) {
            graph.add_typed_edge(&source, &target, TypeCompat::compatible(/*...*/))?;
        }
        if graph.has_cycles() {
            return Err(CycleError);
        }
        Ok(graph)
    }
}
```
### Type Compatibility
Compare `output_schema` (source) against `input_schema` (target) using `jsonschema`:
```rust
pub fn type_compat(
    output_schema: &Value,
    input_schema: &Value,
) -> TypeCompatResult {
    // 1. Exact match → compatible
    // 2. Subtype match (output has extra fields) → compatible
    // 3. Unknown on either side → skip (no edge)
    // 4. Structural mismatch → incompatible edge (added with compatible: false)
    todo!("schema comparison is not specified in this draft")
}
```
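The four rules can be illustrated on a deliberately simplified schema model. In the sketch below a schema is just a map from field name to type name, standing in for real JSON Schema; `Fields`, `TypeCompat`, and this `type_compat` signature are hypothetical, and every input field is treated as required.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a JSON Schema object: field name -> type name.
pub type Fields = BTreeMap<&'static str, &'static str>;

#[derive(Debug, PartialEq, Eq)]
pub enum TypeCompat {
    Compatible,
    Incompatible(String), // carries a human-readable reason
    Unknown,              // no edge is added
}

/// `None` models a missing or unknown schema on that side.
pub fn type_compat(output: Option<&Fields>, input: Option<&Fields>) -> TypeCompat {
    let (output, input) = match (output, input) {
        (Some(o), Some(i)) => (o, i),
        _ => return TypeCompat::Unknown, // rule 3
    };
    for (name, wanted) in input {
        match output.get(name) {
            Some(found) if found == wanted => {}
            Some(found) => {
                // rule 4: same field, different type
                return TypeCompat::Incompatible(format!(
                    "field `{}`: expected {}, found {}",
                    name, wanted, found
                ));
            }
            // rule 4: the target needs a field the source never produces
            None => return TypeCompat::Incompatible(format!("missing field `{}`", name)),
        }
    }
    // Rules 1 and 2: every input field is satisfied; extra output fields are fine.
    TypeCompat::Compatible
}
```

The reason string corresponds to the `detail` field on the edge, so an incompatible edge can explain itself when inspected.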
### Node Attributes
```rust
#[derive(Clone, Serialize, Deserialize, Debug)]
pub struct OperationNodeAttrs {
    pub name: String,
    pub namespace: String,
    pub op_type: OperationType,
    pub input_schema: Value,
    pub output_schema: Value,
}
#[derive(Clone, Serialize, Deserialize, Debug)]
pub enum OperationType {
    Query,
    Mutation,
    Subscription,
}
#[derive(Clone, Serialize, Deserialize, Debug)]
pub struct OperationEdgeAttrs {
    pub edge_type: String, // "typed"
    pub compatible: bool,
    pub detail: Option<String>,
}
```
### Queries
```rust
// petgraph delegations
pub fn topological_order(&self) -> Result<Vec<String>, CycleError>
pub fn has_cycles(&self) -> bool
pub fn find_cycles(&self) -> Vec<Vec<String>>
pub fn ancestors(&self, node_key: &str) -> Vec<String>
pub fn descendants(&self, node_key: &str) -> Vec<String>
pub fn predecessors(&self, node_key: &str) -> Vec<String>
pub fn successors(&self, node_key: &str) -> Vec<String>
pub fn reachable_from(&self, node_keys: &[String]) -> HashSet<String>
```
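To show what `topological_order` has to produce, here is a std-only version of Kahn's algorithm over string keys. It is a sketch that stands in for the petgraph delegation, not the crate's code: the free function signature and this local `CycleError` are assumptions, and ties are broken alphabetically so the output is deterministic.

```rust
use std::collections::{BTreeMap, VecDeque};

#[derive(Debug, PartialEq, Eq)]
pub struct CycleError;

/// Kahn's algorithm: repeatedly emit a node with no remaining incoming edges.
/// `edges` are `(source, target)` pairs; nodes that only appear in `edges`
/// are registered automatically.
pub fn topological_order(
    nodes: &[&str],
    edges: &[(&str, &str)],
) -> Result<Vec<String>, CycleError> {
    let mut in_degree: BTreeMap<&str, usize> = nodes.iter().map(|n| (*n, 0)).collect();
    let mut successors: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(src, dst) in edges {
        in_degree.entry(src).or_insert(0);
        *in_degree.entry(dst).or_insert(0) += 1;
        successors.entry(src).or_default().push(dst);
    }

    // Seed the queue with every node that has no prerequisites.
    let mut ready: VecDeque<&str> = in_degree
        .iter()
        .filter(|(_, degree)| **degree == 0)
        .map(|(node, _)| *node)
        .collect();

    let mut order = Vec::with_capacity(in_degree.len());
    while let Some(node) = ready.pop_front() {
        order.push(node.to_string());
        if let Some(nexts) = successors.get(node) {
            for &next in nexts {
                let degree = in_degree.get_mut(next).expect("edge target registered above");
                *degree -= 1;
                if *degree == 0 {
                    ready.push_back(next);
                }
            }
        }
    }

    // Any node never emitted is part of (or downstream of) a cycle.
    if order.len() == in_degree.len() {
        Ok(order)
    } else {
        Err(CycleError)
    }
}
```

The same pass doubles as `has_cycles`: an `Err` result means at least one cycle exists.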
## Call Graph (Dynamic)
Populated at runtime from call protocol events. Every `call.requested` adds a node, every `call.responded`/`call.error`/`call.aborted` updates its status.
### Construction from Events
```rust
impl FlowGraph<CallNodeAttrs, CallEdgeAttrs> {
    pub fn from_call_events(events: &[CallEventMapValue]) -> Self {
        let mut graph = Self::new();
        for event in events {
            graph.update_from_event(event);
        }
        graph
    }
    pub fn update_from_event(&mut self, event: &CallEventMapValue) {
        match event {
            CallEvent::Requested(e) => {
                self.add_call(CallNodeAttrs {
                    request_id: e.request_id.clone(),
                    operation_id: e.operation_id.clone(),
                    status: CallStatus::Pending,
                    parent_request_id: e.parent_request_id.clone(),
                    input: e.input.clone(),
                    ..Default::default()
                });
            }
            CallEvent::Responded(e) => {
                self.update_status(&e.request_id, CallStatus::Completed, None);
            }
            CallEvent::Error(e) => {
                self.update_status(&e.request_id, CallStatus::Failed, Some(e.clone()));
            }
            CallEvent::Aborted(e) => {
                self.update_status(&e.request_id, CallStatus::Aborted, None);
            }
            CallEvent::Completed(e) => {
                self.update_status(&e.request_id, CallStatus::Completed, None);
            }
        }
    }
}
```
### Real-time Population
```rust
// Subscribe to call protocol events for live graph construction
let call_graph = FlowGraph::<CallNodeAttrs, CallEdgeAttrs>::new();
pubsub.subscribe("call.requested", |event| {
    call_graph.update_from_event(&event);
});
pubsub.subscribe("call.responded", |event| {
    call_graph.update_from_event(&event);
});
// ... etc for all call event types
```
### Node Attributes
```rust
#[derive(Clone, Serialize, Deserialize, Debug, Default)]
pub struct CallNodeAttrs {
    pub request_id: String,
    pub operation_id: String,
    pub status: CallStatus,
    pub parent_request_id: Option<String>,
    pub input: Value,
    pub output: Option<Value>,
    pub error: Option<CallErrorInfo>,
    pub identity: Option<Identity>,
    pub started_at: Option<String>,
    pub completed_at: Option<String>,
}
// `Default` is required here because `CallNodeAttrs` derives `Default`.
#[derive(Clone, Serialize, Deserialize, Debug, PartialEq, Default)]
pub enum CallStatus {
    #[default]
    Pending,
    Running,
    Completed,
    Failed,
    Aborted,
}
#[derive(Clone, Serialize, Deserialize, Debug)]
pub struct CallEdgeAttrs {
    pub edge_type: EdgeType,
}
#[derive(Clone, Serialize, Deserialize, Debug)]
pub enum EdgeType {
    Triggered,
    DependsOn,
}
```
### Status Lifecycle
```
call.requested
      │
      ▼
 ┌─────────┐
 │ pending │
 └────┬────┘
      │ handler starts
      ▼
 ┌─────────┐
 │ running │
 └────┬────┘
      │
      ├─ call.aborted ───▶ aborted
      ├─ call.responded ─▶ completed
      ├─ call.completed ─▶ completed
      └─ call.error ─────▶ failed
```
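The lifecycle can also be written down as an explicit transition function. The sketch below is stricter than `update_from_event`, which overwrites the status unconditionally: it rejects transitions out of terminal states and, as an assumption, accepts an abort from either `Pending` or `Running`. The `Transition` enum and `next_status` function are hypothetical, and `CallStatus` is redeclared so the example stands alone.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallStatus {
    Pending,
    Running,
    Completed,
    Failed,
    Aborted,
}

/// Things that can happen to a call (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Transition {
    HandlerStarted,
    Responded,
    Completed,
    Error,
    Aborted,
}

/// Return the status after `t`, or `None` if the transition is not allowed.
pub fn next_status(current: CallStatus, t: Transition) -> Option<CallStatus> {
    match (current, t) {
        (CallStatus::Pending, Transition::HandlerStarted) => Some(CallStatus::Running),
        // Assumption: a call may be aborted before or after its handler starts.
        (CallStatus::Pending, Transition::Aborted)
        | (CallStatus::Running, Transition::Aborted) => Some(CallStatus::Aborted),
        (CallStatus::Running, Transition::Responded)
        | (CallStatus::Running, Transition::Completed) => Some(CallStatus::Completed),
        (CallStatus::Running, Transition::Error) => Some(CallStatus::Failed),
        // Completed, Failed, and Aborted are terminal; everything else is invalid.
        _ => None,
    }
}
```

Rejecting invalid transitions makes late or duplicated events visible instead of silently rewriting a finished call's status.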
### Abort Cascading
```rust
// Abort cascade: get all descendants of a call
let descendants = call_graph.descendants(&request_id);
// The protocol handler aborts each descendant via PendingRequestMap::abort()
```
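What `descendants` computes for the abort cascade is a breadth-first walk over the `triggered` edges. The sketch below works on a plain map from request ID to optional parent ID instead of the real `FlowGraph`; that representation and the free function signature are assumptions made to keep the example std-only.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Collect every call transitively triggered by `root`, in breadth-first order.
/// `parent_of` maps each request ID to its parent request ID, if any.
pub fn descendants(parent_of: &HashMap<&str, Option<&str>>, root: &str) -> Vec<String> {
    // Invert the parent links into a parent -> children index.
    let mut children: HashMap<&str, Vec<&str>> = HashMap::new();
    for (&child, &parent) in parent_of {
        if let Some(p) = parent {
            children.entry(p).or_default().push(child);
        }
    }
    // HashMap iteration order is arbitrary, so sort for a stable result.
    for list in children.values_mut() {
        list.sort_unstable();
    }

    let mut out = Vec::new();
    let mut seen: HashSet<&str> = HashSet::new();
    let mut queue = VecDeque::new();
    queue.push_back(root);
    while let Some(current) = queue.pop_front() {
        if let Some(kids) = children.get(current) {
            for &kid in kids {
                // `seen` guards against malformed (cyclic) input.
                if seen.insert(kid) {
                    out.push(kid.to_string());
                    queue.push_back(kid);
                }
            }
        }
    }
    out
}
```

Breadth-first order means direct children are aborted before their own children, which matches how the cascade spreads outward from the aborted call.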
### Observability Queries
| Query | Method | Returns |
|-------|--------|---------|
| Get running calls | `filter_by_status(CallStatus::Running)` | Node keys with running status |
| Get failed calls | `filter_by_status(CallStatus::Failed)` | Node keys with failed status |
| Get top-level calls | `get_roots()` | Nodes with no `parent_request_id` |
| Get children of call | `children(&request_id)` | Direct children via `triggered` edges |
| Get call duration | `duration(&request_id)` | `completed_at - started_at` |
| Get call lineage | `lineage(&request_id)` | Ancestor chain from root to this call |
### Serialization and Persistence
```rust
// Serialize via serde
let json = serde_json::to_value(&call_graph)?;
let restored: FlowGraph<CallNodeAttrs, CallEdgeAttrs> = serde_json::from_value(json)?;
// Persist via alknet-storage
storage.insert_call_graph("session-abc", &call_graph)?;
storage.load_call_graph("session-abc")?;
```
## Graph Operations (petgraph mapping)
All graph operations used in `@alkdev/flowgraph` map directly to petgraph:
| Flowgraph method | petgraph function |
|------------------|-------------------|
| `addNode(key, attrs)` | `add_node(attrs)` + `key_to_index.insert(key, idx)` |
| `addEdge(source, target, attrs)` | `add_edge(source_idx, target_idx, attrs)` |
| `addDirectedEdge(source, target, attrs)` | `add_edge(source_idx, target_idx, attrs)` |
| `getNodeAttributes(key)` | `graph[NodeIndex]` |
| `getEdgeAttributes(key)` | `graph[EdgeIndex]` |
| `getSource(key)` / `getTarget(key)` | `graph.edge_endpoints(EdgeIndex)` |
| `inDegree(key)` | `graph.neighbors_directed(idx, Incoming).count()` |
| `outDegree(key)` | `graph.neighbors_directed(idx, Outgoing).count()` |
| `inNeighbors(key)` | `graph.neighbors_directed(idx, Incoming)` |
| `outNeighbors(key)` | `graph.neighbors_directed(idx, Outgoing)` |
| `hasEdge(source, target)` | `graph.contains_edge(source_idx, target_idx)` |
| `forEachNode(callback)` | `graph.node_indices().for_each()` |
| `forEachEdge(callback)` | `graph.edge_indices().for_each()` |
| `findCycle()` | `is_cyclic_directed(&graph)` (reports only whether a cycle exists, not the cycle itself) |
| `topologicalSort()` | `toposort(&graph, None)` |
| `export()` / `toJSON()` | `serde_json::to_value(&graph)` |
| `fromJSON()` | `serde_json::from_value(json)` |
### Cycle Detection
The operation graph rejects cycles at construction time. The call graph is acyclic as well: its parent-child hierarchy is built only from `parent_request_id` links, and because a call cannot be its own ancestor, `add_call` rejects any link that would close a loop.
```rust
pub fn add_call(&mut self, attrs: CallNodeAttrs) -> Result<(), CycleError> {
if let Some(parent) = &attrs.parent_request_id {
// Check if adding triggered edge would create a cycle
if self.would_create_cycle(parent, &attrs.request_id) {
return Err(CycleError);
}
}
// ...
}
```
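
One way `would_create_cycle` could work for the call graph is to walk the ancestor chain of the proposed parent and check whether the child already appears in it. The sketch below assumes a child-to-parent map; `CallTree` and its field are hypothetical names, and the real implementation would query the petgraph structure instead.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub struct CycleError;

/// Child request id -> parent request id.
#[derive(Default)]
pub struct CallTree {
    parent_of: HashMap<String, String>,
}

impl CallTree {
    /// True when `child` is `parent` itself or already an ancestor of `parent`,
    /// so that linking `parent -> child` would close a loop.
    pub fn would_create_cycle(&self, parent: &str, child: &str) -> bool {
        let mut cursor = Some(parent);
        while let Some(node) = cursor {
            if node == child {
                return true;
            }
            // Move one step up the ancestor chain.
            cursor = self.parent_of.get(node).map(String::as_str);
        }
        false
    }

    /// Registers a call, rejecting parent links that would create a cycle.
    pub fn add_call(&mut self, request_id: &str, parent: Option<&str>) -> Result<(), CycleError> {
        if let Some(parent) = parent {
            if self.would_create_cycle(parent, request_id) {
                return Err(CycleError);
            }
            self.parent_of.insert(request_id.to_string(), parent.to_string());
        }
        Ok(())
    }
}
```

Since every insertion is checked, the stored chain stays acyclic and the ancestor walk always terminates.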
### DAG Invariants
- **Operation graph**: DAG-only, enforced at construction. `add_typed_edge` returns `CycleError` if a cycle would result.
- **Call graph**: DAG-only by design (a call cannot be its own ancestor). `add_call` with a `parent_request_id` that would create a cycle returns `CycleError`.
- **No parallel edges**: `multi: false` — at most one edge per (source, target) pair.
- **No self-loops**: `allow_self_loops: false` — an operation cannot depend on its own output.
## Integration with alknet-storage
Call graphs and operation graphs are stored as metagraph instances:
```rust
// Create call-graph type definition
let call_graph_type = GraphType {
name: "call-graph".to_string(),
config: GraphConfig { graph_type: Directed, multi: false, allow_self_loops: false },
scope: Scope::System,
..Default::default()
};
// Store a call graph instance
let graph = storage.create_graph("call-graph", "session-abc")?;
// Add call nodes
storage.add_node(graph.id, "call-001", &call_attrs)?;
// Query via petgraph
let pg: FlowGraph<CallNodeAttrs, CallEdgeAttrs> = storage.load_call_graph("session-abc")?;
let running = pg.filter_by_status(CallStatus::Running);
```
The `alknet-storage` crate handles persistence (SQLite via honker). The `alknet-flowgraph` crate handles in-memory graph operations (petgraph). The bridge is serialization: `FlowGraph` serializes to/from `serde_json::Value`, which `alknet-storage` stores in the `nodes.attributes` and `edges.attributes` columns.
## Integration with alknet-core (Call Protocol)
```rust
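// The call protocol's EventEnvelope drives call graph construction
use std::sync::{Arc, Mutex};
use alknet_core::call::PendingRequestMap;
use alknet_flowgraph::FlowGraph;
// Every callback mutates the graph, so it is shared behind Arc<Mutex<..>>
// instead of being captured by `&mut` in several closures at once.
let call_graph = Arc::new(Mutex::new(FlowGraph::<CallNodeAttrs, CallEdgeAttrs>::new()));
// Wire up call protocol events to graph updates
let graph = Arc::clone(&call_graph);
call_map.on_requested(move |event| {
    graph.lock().unwrap().update_from_event(&CallEvent::Requested(event));
});
let graph = Arc::clone(&call_graph);
call_map.on_responded(move |event| {
    // Assumes `request_id` is cloneable; it is needed after `event` is moved.
    let request_id = event.request_id.clone();
    let mut graph = graph.lock().unwrap();
    graph.update_from_event(&CallEvent::Responded(event));
    // Persist incrementally to storage. The callback returns (), so the
    // error is reported here rather than propagated with `?`.
    if let Err(err) = storage.update_node(request_id, &*graph) {
        eprintln!("failed to persist call graph: {err:?}");
    }
});
let graph = Arc::clone(&call_graph);
call_map.on_error(move |event| {
    graph.lock().unwrap().update_from_event(&CallEvent::Error(event));
});
let graph = Arc::clone(&call_graph);
call_map.on_completed(move |event| {
    graph.lock().unwrap().update_from_event(&CallEvent::Completed(event));
});
// Abort cascading: abort every descendant, then record the abort itself
let graph = Arc::clone(&call_graph);
call_map.on_aborted(move |event| {
    let mut graph = graph.lock().unwrap();
    for desc in graph.descendants(&event.request_id) {
        call_map.abort(&desc);
    }
    graph.update_from_event(&CallEvent::Aborted(event));
});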
```
## Type Compatibility Between TS and Rust
| TypeScript (flowgraph) | Rust (alknet-flowgraph) |
|------------------------|-------------------------|
| `graphology.DirectedGraph` | `petgraph::DiGraph<N, E>` |
| `CallNodeAttrs` (TypeBox) | `CallNodeAttrs` (serde struct) |
| `CallEdgeAttrs` (TypeBox) | `CallEdgeAttrs` (serde struct) |
| `CallStatus` (enum) | `CallStatus` (Rust enum) |
| `EdgeType` (enum) | `EdgeType` (Rust enum) |
| `OperationNodeAttrs` | `OperationNodeAttrs` (serde struct) |
| `OperationEdgeAttrs` | `OperationEdgeAttrs` (serde struct) |
| `OperationType` (enum) | `OperationType` (Rust enum) |
| `Identity` | `Identity` (serde struct) |
| `AccessControl` | `AccessControl` (serde struct) |
| `typeCompat()` | `type_compat()` using jsonschema |
| `Value.Check()` (TypeBox) | `jsonschema::validate()` |
| `addCall()` | `add_call()` |
| `updateStatus()` with state machine | `update_status()` with `is_valid_transition()` |
| `addDependency()` | `add_dependency()` |
| `descendants()` | petgraph DFS |
## Crate Dependency Map
```toml
[dependencies]
petgraph = "0.x" # Core graph data structure
serde = { version = "1", features = ["derive"] }
serde_json = "1"
jsonschema = "0.x" # Type compatibility checks
thiserror = "1"
uuid = { version = "1", features = ["v4"] }
chrono = { version = "0.x", features = ["serde"] }
[dev-dependencies]
tokio = { version = "1", features = ["full"] }
```
## Design Decisions
| Decision | Rationale |
|----------|-----------|
| petgraph over custom graph | Nearly 1:1 mapping to graphology operations; well-maintained; fast |
| `HashMap<String, NodeIndex>` for key lookups | Matches SQLite `key` column pattern; O(1) lookup by string key |
| serde_json for attributes | Matches SQLite `attributes TEXT JSON` column; dynamic validation via jsonschema |
| Separate crates for flowgraph and storage | Flowgraph is pure in-memory graph ops; storage is SQLite persistence; different dependency sets |
| `NodeAttributes` / `EdgeAttributes` traits | Generic over attribute types, matching flowgraph's type parameter pattern |
| DAG enforcement at construction | Matches TypeScript flowgraph: `fromSpecs()` throws `CycleError` |
| `filter_by_status` is O(n) | Matches TypeScript: small graphs (tens to hundreds of nodes), no index needed |
| Call protocol as integration boundary | Call protocol `EventEnvelope` is the cross-node integration boundary; domain events stay within services |
## References
- `@alkdev/flowgraph` — TypeScript implementation (call-graph, operation-graph)
- `@alkdev/operations`: `OperationSpec`, `CallHandler`, `PendingRequestMap`
- `/workspace/petgraph` — Graph data structure crate
- `/workspace/jsonschema` — JSON Schema validation crate
- `/workspace/@alkdev/storage/docs/architecture/metagraph-module.md` — TypeBox Module pattern
- `/workspace/@alkdev/storage/docs/architecture/sqlite-host.md` — SQLite table definitions
- `/workspace/@alkdev/storage/docs/architecture/acl.md` — ACL as metagraph
- [services.md](services.md) — Service layer architecture (irpc protocols)
- [core.md](core.md) — Core overview, head/worker terminology

@@ -1,850 +0,0 @@
# Integration Plan: Services, PubSub, and Operations
> Status: Research / Draft
> Last updated: 2026-06-09
## Purpose
This document organizes the findings from the research phase (core.md, services.md, configuration.md, storage.md, flow.md) into an actionable integration plan. It identifies what requires changes to the core, what becomes new crates, what can be carried over from existing research specs, and what needs further specification before implementation.
The plan is organized into phases because not everything can be front-loaded. Earlier phases change the core architecture; later phases build on top. Things learned during implementation may adjust later phases.
## Key Clarifications
### Transport / Interface / Protocol — Three Layers
Carrying forward the distinction raised during review, the architecture has three distinct layers:
```
Layer 3: Application Protocol (Call Protocol, Operations, Service Calls)
Layer 2: Interface (SSH, raw EventEnvelope framing, HTTP/WS, DNS control channel)
Layer 1: Transport (TCP, TLS, iroh, WebTransport, DNS)
```
A **connection** is always a (Transport, Interface) pair. The call protocol runs at Layer 3 and is agnostic to both layers below it.
This means:
| Combination | What it does | Example |
|---|---|---|
| (TLS, SSH) | Standard alknet tunnel | `alknet connect --transport tls` |
| (TCP, SSH) | Plain SSH tunnel | `alknet connect --transport tcp` |
| (iroh, SSH) | P2P SSH tunnel | `alknet connect --transport iroh` |
| (DNS, raw framing) | DNS control channel | Call protocol frames as DNS TXT queries |
| (WebTransport, SSH) | Browser SSH tunnel | Future: browser client |
| (WebTransport, raw framing) | Browser call protocol | Future: browser-to-head direct |
| (TCP, raw framing) | Direct call protocol | Local service mesh, no SSH overhead |
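
The table can be encoded as a small lookup so that a connection is only ever constructed from a listed pair. This is a sketch under one assumption: combinations that do not appear in the table are treated as unsupported, which the source does not state explicitly. The type and function names are hypothetical.

```rust
/// Layer 1: how bytes move.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Transport { Tcp, Tls, Iroh, WebTransport, Dns }

/// Layer 2: how the byte stream is parsed into a session.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Interface { Ssh, RawFraming }

/// True for the (Transport, Interface) pairs listed in the table above.
pub fn is_listed_pair(transport: Transport, interface: Interface) -> bool {
    matches!(
        (transport, interface),
        (Transport::Tls, Interface::Ssh)
            | (Transport::Tcp, Interface::Ssh)
            | (Transport::Iroh, Interface::Ssh)
            | (Transport::WebTransport, Interface::Ssh)
            | (Transport::Dns, Interface::RawFraming)
            | (Transport::WebTransport, Interface::RawFraming)
            | (Transport::Tcp, Interface::RawFraming)
    )
}

/// A connection is always a (Transport, Interface) pair.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Connection {
    pub transport: Transport,
    pub interface: Interface,
}

impl Connection {
    /// Returns `None` for pairs that are not in the table, e.g. SSH over DNS.
    pub fn new(transport: Transport, interface: Interface) -> Option<Self> {
        if is_listed_pair(transport, interface) {
            Some(Connection { transport, interface })
        } else {
            None
        }
    }
}
```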
"Raw framing" means the 4-byte length prefix + JSON EventEnvelope format without SSH wrapping. The DNS "control channel" concept from the research is a (DNS transport, raw framing interface) pair. It carries call protocol events directly — it does NOT wrap SSH inside DNS.
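
A rough sketch of the raw framing codec follows. The text fixes the prefix at 4 bytes but does not state the byte order or a size limit, so big-endian order and the `MAX_FRAME_LEN` cap are assumptions, as are the function and type names. The payload is treated as opaque bytes that would hold the JSON `EventEnvelope`.

```rust
/// Upper bound on a single frame body (assumed value for this sketch).
pub const MAX_FRAME_LEN: usize = 1 << 20;

/// Prefixes `payload` with its length as 4 bytes.
/// Assumption: big-endian byte order.
pub fn encode_frame(payload: &[u8]) -> Option<Vec<u8>> {
    if payload.len() > MAX_FRAME_LEN {
        return None;
    }
    let mut frame = Vec::with_capacity(4 + payload.len());
    frame.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    frame.extend_from_slice(payload);
    Some(frame)
}

#[derive(Debug, PartialEq, Eq)]
pub enum Decoded<'a> {
    /// A whole frame was available; `consumed` bytes can be dropped from the buffer.
    Frame { payload: &'a [u8], consumed: usize },
    /// Not enough bytes yet; the caller should read more from the stream.
    Incomplete,
    /// The declared length exceeds `MAX_FRAME_LEN`.
    TooLarge,
}

/// Tries to decode one frame from the front of `buf` without copying.
pub fn decode_frame(buf: &[u8]) -> Decoded<'_> {
    if buf.len() < 4 {
        return Decoded::Incomplete;
    }
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    if len > MAX_FRAME_LEN {
        return Decoded::TooLarge;
    }
    if buf.len() < 4 + len {
        return Decoded::Incomplete;
    }
    Decoded::Frame { payload: &buf[4..4 + len], consumed: 4 + len }
}
```

Checking the declared length before waiting for more bytes lets a receiver drop a peer that announces an oversized frame instead of buffering indefinitely.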
### Services vs Call Protocol — Two Different Layers
From services.md:
> Services are internal — they run within a node or cluster. The call protocol is external — it's how nodes communicate with each other over SSH/QUIC/WebSocket/DNS transports.
- **irpc service calls**: Internal, synchronous request-response. Rust-to-Rust, postcard serialization, over tokio channels (local) or QUIC streams (remote). Domain-level.
- **Call protocol events**: External, cross-node, cross-language. JSON EventEnvelope frames, over any (Transport, Interface) pair. Integration-level.
A call protocol handler MAY call an irpc service internally. For example, `/head/auth/verify` receives a call protocol `call.requested` event, then calls the local `AuthProtocol::VerifyPubkey` irpc service to actually perform the check. The layers compose:
```
Call Protocol (Layer 3, external, JSON)
└── irpc Service (Layer 3, internal, postcard)
└── Honker Streams (Domain events, within service boundary)
```
Future work on binary encoding (replacing JSON with postcard or similar for Rust-to-Rust cross-node communication) is possible but deferred — JSON works well across platforms and the performance characteristics are acceptable for control-plane traffic.
### OperationEnv — The Universal Composition Mechanism
The `OperationEnv` pattern from `@alkdev/operations` is not a TypeScript implementation detail. It is the **universal composition mechanism** that all operation handlers receive. It maps identically across every modern boundary:
- HTTP: `POST /v1/{namespace}/{op}` → `context.env[namespace][op](input)`
- MCP: `tools/call` with tool name `{namespace}_{op}` → `context.env[namespace][op](input)`
- DNS: `{op}.{namespace}.alk.dev TXT?` → `context.env[namespace][op](input)`
- Call protocol: `call.requested` with `operationId: "/{node}/{namespace}/{op}"` → `context.env[namespace][op](input)`
- irpc: service enum dispatch → wraps the same handler → `context.env[namespace][op](input)`
The handler always sees the same interface: given a namespace and operation name, invoke it with input. The OperationEnv implements the routing. The three dispatch paths are:
```
OperationEnv (handler-facing composition)
├── Local dispatch (in-process, direct function call through registry)
├── Service dispatch (in-cluster, irpc protocol enum to service backend)
└── Remote dispatch (cross-node, call protocol EventEnvelope to head)
```
All three resolve the same way from the handler's perspective. A handler calling `context.env.secrets.derive(input)` doesn't know or care whether it becomes a local function call, an irpc protocol message, or a cross-node call protocol event. The OperationEnv chooses the routing based on where the operation is registered.
This means:
- **irpc services are one dispatch backend for OperationEnv**, not a replacement for it.
- **irpc protocol enums** (`AuthProtocol::VerifyPubkey`, `SecretProtocol::DeriveEd25519`) define the wire format for in-cluster communication. They're the Rust-to-Rust optimization path.
- **Call protocol operations** define the cross-node, cross-language wire format. They use path-based routing (`/head/auth/verify`).
- **An irpc service can be exposed as a call protocol operation** — the registry maps the path to a handler that internally calls the irpc service.
- **Both coexist** and both are needed. irpc gives you type-safe, efficient in-cluster calls. Call protocol gives you universal, cross-language, cross-node calls. OperationEnv unifies them from the handler's perspective.
The Rust implementation of OperationEnv doesn't have to be a literal `HashMap<String, HashMap<String, fn(...)>>` — it can be a struct with typed method dispatch or a registry that resolves to irpc clients — but the **behavioral contract** must match: namespace + operation name → invoke with input, return output. Handlers compose through this interface. Adapters (MCP, OpenAPI, HTTP, DNS) map to operations through this interface.
This is a hard constraint: the OperationEnv composition model must survive the Rust port intact. It's what makes operations universally composable across all interfaces.
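
The behavioral contract (namespace + operation name, invoke with input, return output) can be sketched as a registry whose entries carry their dispatch path. This is an illustration under simplifying assumptions: inputs and outputs are plain strings, and the service and remote backends are stubbed with closures where a real backend would send an irpc message or a call protocol `EventEnvelope`. All names other than `OperationEnv` are hypothetical.

```rust
use std::collections::HashMap;

/// Handler signature for this sketch: string in, string out.
pub type Handler = Box<dyn Fn(&str) -> String + Send + Sync>;

/// Where an operation is registered decides how it is routed.
pub enum Dispatch {
    /// In-process function call through the registry.
    Local(Handler),
    /// In-cluster call (stub for an irpc protocol message).
    Service(Handler),
    /// Cross-node call (stub for a call protocol EventEnvelope).
    Remote(Handler),
}

#[derive(Default)]
pub struct OperationEnv {
    ops: HashMap<(String, String), Dispatch>,
}

impl OperationEnv {
    pub fn register(&mut self, namespace: &str, op: &str, dispatch: Dispatch) {
        self.ops.insert((namespace.to_string(), op.to_string()), dispatch);
    }

    /// Names the dispatch path; callers of `call` never need this.
    pub fn route(&self, namespace: &str, op: &str) -> Option<&'static str> {
        match self.ops.get(&(namespace.to_string(), op.to_string()))? {
            Dispatch::Local(_) => Some("local"),
            Dispatch::Service(_) => Some("service"),
            Dispatch::Remote(_) => Some("remote"),
        }
    }

    /// Namespace + operation name -> invoke with input, return output.
    pub fn call(&self, namespace: &str, op: &str, input: &str) -> Result<String, String> {
        match self.ops.get(&(namespace.to_string(), op.to_string())) {
            Some(Dispatch::Local(h)) | Some(Dispatch::Service(h)) | Some(Dispatch::Remote(h)) => {
                Ok(h(input))
            }
            None => Err(format!("unknown operation {}/{}", namespace, op)),
        }
    }
}
```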
---
## What Exists Already
### Existing Architecture Specs (reviewed/stable)
| Doc | Status | Carries Over? |
|---|---|---|
| overview.md | reviewed | Yes — needs updates for expanded scope (services, identity, interface layer) |
| transport.md | reviewed | Yes — transport trait is unchanged |
| client.md | reviewed | Yes — client behavior unchanged |
| server.md | reviewed | Yes — server handler needs minor updates for DynamicConfig/AuthService |
| tun-shim.md | deprecated | No — remains deprecated |
| napi-and-pubsub.md | reviewed | Yes — NAPI layer needs call protocol additions |
### Existing Architecture Specs (draft)
| Doc | Status | Needs |
|---|---|---|
| auth.md | draft | Promote Identity to a first-class concern. Add IdentityProvider vs AuthService relationship. |
| call-protocol.md | draft | Add OperationEnv as universal composition mechanism. Update hub/spoke → head/worker. Clarify Layer 3 position. Show three dispatch paths (local, irpc, remote). |
### Research Documents (source material)
| Doc | Content | Spec Readiness |
|---|---|---|
| core.md | Transport, call protocol, auth, services, DNS | High for most parts. DNS section needs rewrite for transport/interface separation. |
| services.md | irpc service protocols, operation context, application services | High for core services. Application services are sketches — defer to phase 4+. |
| configuration.md | Static/dynamic split, forwarding policy, multi-transport | High — this was nearly spec-ready already. Needs ADR extraction. |
| storage.md | Metagraph, identity, ACL, secrets, honker | High for data model. Integration points with core need spec work. |
| flow.md | FlowGraph, petgraph mapping, call/operation graphs | High — straightforward port of TypeScript design. |
### Existing ADRs (25 accepted)
ADR-001 through ADR-025 are accepted. Several new ADRs are needed (see Phase 0). Existing ADRs to update:
- ADR-018 (control channel for pubsub) — superseded/extended by bidirectional call protocol (ADR-024) and the Layer 2/3 model
- ADR-024, ADR-025 — update terminology from hub/spoke to head/worker
---
## Phase 0: Architecture Foundation
**Goal**: Establish the structural decisions that everything else depends on. Write ADRs, create new spec documents, adjust existing specs for the three-layer model and crate decomposition.
**Why first**: Every subsequent phase depends on knowing where types live, what the layer boundaries are, and which crates depend on which. These decisions are architectural and cheap to change now but expensive to change later.
### ADRs to Write
| ADR | Title | Key Decision |
|---|---|---|
| 026 | Transport-interface separation | Three-layer model: Transport (Layer 1) produces byte streams, Interface (Layer 2) parses them into sessions, Protocol (Layer 3) carries semantics. Valid (Transport, Interface) pairs are enumerated. SSH is an interface, not a transport. DNS control channel is a (DNS transport, raw framing interface) pair. |
| 027 | Crate decomposition | alknet-core (transport, SSH, call protocol, config, auth types, identity), alknet-secret (BIP39, SLIP-0010, AES-GCM), alknet-storage (SQLite, honker, metagraph, ACL, identity tables), alknet-flowgraph (petgraph, type compatibility). Core depends on no heavy service crates. |
| 028 | Auth as irpc service | Auth verification via IdentityProvider trait (in core). Default impl: `ArcSwap<DynamicConfig>`. Production impl: irpc AuthService backed by SQLite. Callers don't know the difference. |
| 029 | Identity as core type | `Identity` struct (id, scopes, resources) and `IdentityProvider` trait live in alknet-core. Derivation and storage are external concerns. |
| 030 | Static/dynamic config split | StaticConfig (transport binding, TLS, host key) vs DynamicConfig (auth, forwarding, rate limits). ArcSwap for hot reload. ConfigService wraps reloads. Promoted from research/configuration.md. |
| 031 | Forwarding policy | Rule-based allow/deny for channel_open_direct_tcpip. Default-allow for migration, default-deny for production. TransportKind-aware rules. |
| 032 | Event boundary discipline | Domain events (honker streams) stay within the owning service. Integration events (call protocol EventEnvelope) cross node boundaries. Service calls (irpc) are synchronous and internal. Never conflate the three. |
| 033 | Call protocol / irpc relationship / OperationEnv | OperationEnv is the universal composition mechanism. irpc services are one dispatch backend for OperationEnv (in-cluster, postcard). Call protocol operations are another backend (cross-node, JSON). Handlers compose through `context.env[namespace][op](input)` regardless of dispatch path. Both are Layer 3, at different scope boundaries. |
| 034 | Head/worker terminology | Replace hub/spoke with head/worker throughout. A head is also a worker. Mesh topologies are natural. |
### Spec Documents to Create or Update
| Document | Action | Source |
|---|---|---|
| `interface.md` | **Create new** | Defines Layer 2. SSH as interface. Raw framing as interface. DNS control channel as (DNS transport, raw framing interface). |
| `services.md` | **Create new** | Defines irpc service layer. Auth, Secret, Config, Storage service protocols. How irpc services relate to call protocol operations and OperationEnv. Carries from research/services.md and research/core.md service layer section. |
| `identity.md` | **Create new** | `Identity` type, `IdentityProvider` trait, auth flow for SSH and token. Carries from architecture/auth.md + research/services.md Identity section. |
| `configuration.md` | **Promote from research** | StaticConfig, DynamicConfig, ConfigService, forwarding policy, auth service relationship. Needs cleanup: remove duplicate "Problem" heading, resolve open questions per ADRs. |
| `secret-service.md` | **Create new** | Carries from research/services.md SecretProtocol definition. BIP39/SLIP-0010, key derivation paths, encryption model, lock/unlock lifecycle. |
| `storage.md` | **Create new** (or reference alknet-storage's own docs) | Metagraph data model, identity tables, ACL graph, honker integration. Carries from research/storage.md. |
| `flowgraph.md` | **Create new** (or reference alknet-flowgraph's own docs) | `FlowGraph<N, E>`, operation graph, call graph, petgraph mapping. Carries from research/flow.md. |
| `overview.md` | **Update** | Add crate structure, Layer 3 description, service layer concept, updated dependency list. |
| `auth.md` | **Update** | Add IdentityProvider vs AuthService relationship. Update for irpc AuthProtocol. Note: this is mostly a rename/reorg since the current auth.md already defines IdentityProvider. |
| `call-protocol.md` | **Update** | Add OperationEnv as universal composition mechanism with three dispatch paths (local, irpc service, remote). Update hub/spoke → head/worker. Show how irpc is one backend for OperationEnv, not a replacement for it. |
| `README.md` | **Update** | Add new docs and ADRs to the tables. |
### Review Checklist (Phase 0)
After writing specs and ADRs:
1. **No inline decision rationale** — all "why" decisions are in ADRs, specs reference ADR numbers
2. **No inline open questions** — all OQs are in open-questions.md, specs reference OQ numbers
3. **Terminology is consistent** — head/worker everywhere (no hub/spoke remaining)
4. **Layer boundaries are clear** — every component belongs to exactly one layer
5. **Crate dependencies are acyclic** — core doesn't depend on secret, storage, or flowgraph
6. **Every spec has YAML frontmatter** with status and last_updated
---
## Phase 1: Core Modifications
**Goal**: Modify alknet-core to support the architectural changes. This is the "adjust the foundation" phase.
**Why second**: The core changes (config split, auth service, identity type, forwarding policy) are prerequisites for the service layer and the external crates. Implementation can begin after Phase 0 ADRs and specs are reviewed and stable.
### 1.1 Configuration: Static/Dynamic Split
**Source**: research/configuration.md (nearly spec-ready)
**Changes to alknet-core**:
- Introduce `StaticConfig` struct (transport mode, listen addr, TLS config, iroh config, host key, stealth, max_auth_attempts, max_connections_per_ip)
- Introduce `DynamicConfig` struct (auth policy, forwarding policy, rate limits)
- Replace `Arc<ServerAuthConfig>` with `Arc<ArcSwap<DynamicConfig>>` in ServerHandler
- Add `ConfigReloadHandle` with `reload(DynamicConfig)` method
- Expose `reloadAuth()` / `reloadForwarding()` on the NAPI AlknetServer object
**What stays the same**: `ServeOptions` builder pattern is preserved. `StaticConfig` is constructed from `ServeOptions`. `DynamicConfig` starts with what was in `ServerAuthConfig` and gains `ForwardingPolicy`.
**New crate**: None. This is all in alknet-core.
**ADR**: 030 (static/dynamic split)
**Risk**: Low — internal refactor, no protocol changes. Default-allow forwarding preserves current behavior.
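
The hot-reload mechanics can be illustrated without the `arc-swap` crate. The sketch below uses `RwLock<Arc<DynamicConfig>>` from the standard library as a stand-in for `ArcSwap<DynamicConfig>`; the field names on `DynamicConfig` and the `load` method are assumptions made for the example.

```rust
use std::sync::{Arc, RwLock};

/// Reloadable part of the configuration (fields are illustrative).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct DynamicConfig {
    pub authorized_keys: Vec<String>,
    pub max_requests_per_sec: u32,
}

/// Std-only stand-in for `Arc<ArcSwap<DynamicConfig>>`.
#[derive(Clone)]
pub struct ConfigReloadHandle {
    current: Arc<RwLock<Arc<DynamicConfig>>>,
}

impl ConfigReloadHandle {
    pub fn new(initial: DynamicConfig) -> Self {
        ConfigReloadHandle { current: Arc::new(RwLock::new(Arc::new(initial))) }
    }

    /// Cheap snapshot: a reader keeps its copy even if a reload happens later.
    pub fn load(&self) -> Arc<DynamicConfig> {
        Arc::clone(&self.current.read().unwrap())
    }

    /// Atomically replaces the configuration for all future `load` calls.
    pub fn reload(&self, next: DynamicConfig) {
        *self.current.write().unwrap() = Arc::new(next);
    }
}
```

In-flight authentication attempts finish against the snapshot they loaded, while new connections observe the reloaded configuration, which is the property that removes the need for a restart.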
### 1.2 Identity Type and IdentityProvider Trait
**Source**: architecture/auth.md (already defines IdentityProvider), research/services.md (Identity struct)
**Changes to alknet-core**:
- Define `Identity` struct in `alknet_core::auth` (id, scopes, resources)
- Define `IdentityProvider` trait in `alknet_core::auth`
- Implement `ConfigIdentityProvider` (reads from DynamicConfig's authorized_keys)
- Wire `IdentityProvider` into `ServerHandler::auth_publickey()` — currently reads from `ServerAuthConfig`, now goes through trait
- Wire `IdentityProvider` into token auth (WebTransport path) when that lands
**What stays the same**: SSH key verification logic. The `auth_publickey()` callback just delegates to the trait instead of reading directly.
**New crate**: None. Identity is core.
**ADR**: 029 (identity as core type)
**Risk**: Low — adding a trait abstraction over existing behavior.
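
A minimal sketch of the trait abstraction described in this subsection is shown below. The `Identity` fields follow the text (id, scopes, resources), but their concrete types, the method name `resolve_from_pubkey`, and the fingerprint-based lookup are assumptions for illustration only.

```rust
/// Who the caller is and what they may touch (field types are assumed).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: Vec<String>,
}

/// The contract the SSH auth path delegates to.
pub trait IdentityProvider: Send + Sync {
    /// Hypothetical method name; `None` means the key is not authorized.
    fn resolve_from_pubkey(&self, fingerprint: &str) -> Option<Identity>;
}

/// Default provider backed by the authorized keys held in configuration.
pub struct ConfigIdentityProvider {
    pub authorized: Vec<(String, Identity)>,
}

impl IdentityProvider for ConfigIdentityProvider {
    fn resolve_from_pubkey(&self, fingerprint: &str) -> Option<Identity> {
        self.authorized
            .iter()
            .find(|(known, _)| known == fingerprint)
            .map(|(_, identity)| identity.clone())
    }
}

/// What an `auth_publickey()` callback could reduce to once it delegates.
pub fn auth_publickey(provider: &dyn IdentityProvider, fingerprint: &str) -> bool {
    provider.resolve_from_pubkey(fingerprint).is_some()
}
```

Because the handler only sees `&dyn IdentityProvider`, a SQLite-backed or irpc-backed provider can be swapped in later without touching the SSH code path.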
### 1.3 Forwarding Policy
**Source**: research/configuration.md (ForwardingPolicy section)
**Changes to alknet-core**:
- Define `ForwardingPolicy`, `ForwardingRule`, `TargetPattern` structs
- Add policy check in `channel_open_direct_tcpip` before proxy spawn
- Default: `ForwardingPolicy::allow_all()` (preserves current behavior)
- Policy is part of `DynamicConfig` and reloadable
**New crate**: None. This is in alknet-core.
**ADR**: 031 (forwarding policy)
**Risk**: Low — new check, default-allow preserves current behavior.
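
As a rough illustration of the policy check, the sketch below evaluates rules in order with first-match-wins semantics and falls back to a default action. The struct names come from the list above, but their fields, the `Action` enum, the three pattern variants, and the first-match rule are assumptions; the real design may also match on `TransportKind`.

```rust
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum TargetPattern {
    Any,
    Host(String),
    HostPort(String, u16),
}

impl TargetPattern {
    fn matches(&self, host: &str, port: u16) -> bool {
        match self {
            TargetPattern::Any => true,
            TargetPattern::Host(h) => h == host,
            TargetPattern::HostPort(h, p) => h == host && *p == port,
        }
    }
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Action { Allow, Deny }

#[derive(Clone, Debug)]
pub struct ForwardingRule {
    pub target: TargetPattern,
    pub action: Action,
}

#[derive(Clone, Debug)]
pub struct ForwardingPolicy {
    pub rules: Vec<ForwardingRule>,
    pub default_action: Action,
}

impl ForwardingPolicy {
    /// Preserves current behavior: every target is allowed.
    pub fn allow_all() -> Self {
        ForwardingPolicy { rules: Vec::new(), default_action: Action::Allow }
    }

    /// Evaluated before the proxy is spawned; the first matching rule wins.
    pub fn check(&self, host: &str, port: u16) -> Action {
        self.rules
            .iter()
            .find(|rule| rule.target.matches(host, port))
            .map(|rule| rule.action)
            .unwrap_or(self.default_action)
    }
}
```

A production deployment would flip `default_action` to `Deny` and list the permitted targets explicitly, matching the default-deny recommendation in ADR 031.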
### 1.4 Auth Service (irpc Protocol)
**Source**: research/services.md (AuthProtocol definition), research/configuration.md (auth service approach)
**Changes to alknet-core**:
- Define `AuthProtocol` enum with `#[rpc_requests]` (behind `irpc` feature flag)
- Define `AuthResult` and `Identity` types shared between SSH auth path and irpc auth path
- Implement `AuthServiceImpl` backed by `ConfigIdentityProvider` (ArcSwap path) — the default for minimal deployments
- Future: `AuthServiceImpl` backed by SQLite (in alknet-storage) — not in this phase
**What stays the same**: The `IdentityProvider` trait is the contract. Default impl uses ArcSwap. SQL impl is additive.
**New crate**: None. Auth service types live in alknet-core.
**Feature flag**: `irpc` feature in alknet-core. When disabled, auth goes through `IdentityProvider` directly (no irpc overhead).
**ADR**: 028 (auth as irpc service), 029 (identity as core type)
**Risk**: Medium — introduces irpc dependency behind feature flag. Needs careful API design so the trait-based path and the irpc path produce identical results.
### 1.5 OperationEnv and OperationRegistry
**Source**: research/services.md (OperationContext, OperationEnv), existing call-protocol.md (OperationSpec, OperationRegistry)
**Changes to alknet-core**:
- Define `OperationContext` struct (request_id, parent_request_id, identity, metadata, env, trusted)
- Define `OperationEnv` — the universal composition mechanism with three dispatch backends:
- **Local dispatch**: Direct function call through the operation registry
- **Service dispatch**: irpc protocol call to a service backend
- **Remote dispatch**: Call protocol EventEnvelope to a remote node
- Extend the existing `OperationRegistry` to support all three dispatch paths
- Define `ResponseEnvelope` as the universal return type (matching `@alkdev/operations`)
- Operation handlers receive `(input: Value, context: OperationContext) -> ResponseEnvelope`
- The `env` field on `OperationContext` allows handlers to call other operations without knowing the dispatch path
**Hard constraint**: The OperationEnv composition model must match the behavioral contract from `@alkdev/operations`. Namespace + operation name → invoke with input, return output. This is what makes operations universally composable across HTTP, MCP, DNS, call protocol, and irpc. The Rust implementation can differ in its internal dispatch mechanism, but the handler-facing API must preserve this contract.
**New crate**: None. OperationEnv, OperationContext, and OperationRegistry are core concepts in `alknet_core::call`.
**ADR**: 033 (call protocol / irpc relationship)
**Risk**: Medium — OperationEnv is a new abstraction that must coexist with the existing call protocol handler pattern. The registry currently maps paths to handlers; OperationEnv adds namespace-aware composition on top. Need to ensure the two models compose cleanly.
### 1.6 Config Service (irpc Protocol)
**Source**: research/configuration.md, research/services.md (ConfigProtocol definition)
**Changes to alknet-core**:
- Define `ConfigProtocol` enum with `#[rpc_requests]` (behind `irpc` feature flag)
- Implement `ConfigServiceImpl` backed by `ArcSwap<DynamicConfig>`
- Expose reload methods through the service
**New crate**: None. Config is core.
**Feature flag**: `irpc` feature.
**ADR**: 030 (static/dynamic split)
**Risk**: Low — thin wrapper over ArcSwap.
### 1.7 Multi-Transport Listeners
**Source**: research/configuration.md (multi-transport section)
**Changes to alknet-core**:
- Change `ServeTransportMode` from single enum to `Vec<ListenerConfig>`
- `Server::run()` spawns one accept loop per listener, sharing `DynamicConfig`, `ConnectionRateLimiter`, sessions, and shutdown signal
- Add `TransportKind::WebTransport` and `TransportKind::Dns` variants (initially tags only — no acceptor implementation)
- TOML config file support: `[[listeners]]` array-of-tables syntax
**New crate**: None. This is alknet-core server logic.
**ADR**: 026 (transport-interface separation) — TransportKind enum includes all Layer 1 types
**Risk**: Medium — changes the primary API surface of `serve()`. Backwards compat via accepting both single `transport` and `listeners` array.
### 1.8 Interface Abstraction
**Source**: New concept from review (not in research docs explicitly)
**Changes to alknet-core**:
- Define `Interface` trait that consumes a `Transport::Stream` and produces call protocol events
- `SshInterface` — wraps existing russh handler, produces SSH channels + control channel
- `RawFramingInterface` — reads length-prefixed JSON EventEnvelope frames, produces call protocol events directly (no SSH)
- The call protocol is interface-agnostic — it receives `EventEnvelope` frames from any interface
This is the most architecturally significant change in Phase 1. Currently, SSH is deeply embedded in the server handler. Extracting it into an Interface trait means:
```rust
#[async_trait]
pub trait Interface: Send + Sync + 'static {
type Session;
async fn accept(stream: TransportStream, config: &InterfaceConfig) -> Result<Self::Session>;
// The session produces call protocol events and handles responses
}
```
The existing `ServerHandler` logic (auth, channel open, proxy) becomes `SshInterface`. The raw framing interface becomes a simple length-prefix reader. DNS control channel becomes (DNS transport + raw framing interface).
**This requires careful design review**. The SSH handler currently owns auth, channel management, and proxy logic. Much of that moves to Layer 3 (call protocol) or stays in the interface. The split needs to be clean.
**ADR**: 026 (transport-interface separation)
**Risk**: High — refactoring the core server handler. This is the most invasive change in Phase 1. May need to be split into sub-phases or deferred partially.
---
## Phase 2: Core Bridge
**Goal**: Complete the interface-to-protocol bridge and add the core types that external crates and HTTP interfaces depend on. Phase 1 established the interface trait and SSH extraction but left the call protocol bridge (SshSession recv/send) as stubs and deferred key interface model refinements. Phase 2 closes those gaps so that Phase 3 crates can reference a stable, functional core.
**Why before external crates**: The external crates (alknet-secret, alknet-storage) depend on a core where the Layer 2→3 bridge actually works. Without `SshSession::recv()`/`send()` producing and consuming `InterfaceEvent` frames, the call protocol is inert for SSH sessions. Without `RawFramingInterface` implemented, there's no non-SSH path either. And without `StreamInterface`/`MessageInterface` split and `CredentialProvider`, the phase 2 research docs (interface-model, credential-provider, tls-transport) describe a target architecture that doesn't exist in code yet. These must exist before crates can wire against them.
### 2.1 SshSession Call Protocol Bridge
**Source**: interface.md (OQ-IF-01, resolved), ssh-interface-extraction task, control_channel.rs
**Current state**: `SshSession::recv()` always returns `None` and `SshSession::send()` silently discards. The `ControlChannelRouter` exists but has no handler wired. The `alknet-control:0` SSH channel is detected in `channel_open_direct_tcpip` but not bridged to `InterfaceEvent` frames.
**Changes to alknet-core**:
- Implement `SshSession::recv()` — read `EventEnvelope` frames from the `alknet-control:0` channel stream, wrap in `InterfaceEvent` with the session's `Identity`
- Implement `SshSession::send()` — write `EventEnvelope` frames to the `alknet-control:0` channel stream
- Wire `ControlChannelRouter` to bridge SSH channel data to the call protocol handler
- The session's `Identity` (from SSH auth) is attached to every `InterfaceEvent`
**Prerequisites**: Verify that `call::frame::{encode, decode}` exists and produces/consumes frames compatible with the SSH channel data stream. The `ControlChannelRouter` in `control_channel.rs` needs a handler wired — check its current API for how to register a call protocol handler.
**Why this is Phase 2, not Phase 4**: This is the plumbing that connects Layer 2 (interface) to Layer 3 (protocol). Without it, SSH sessions can only forward ports; they cannot invoke call protocol operations. This is core functionality, not an advanced feature.
**New crate**: None. This is alknet-core.
**Risk**: Medium — the SSH channel → call protocol bridge needs careful framing (4-byte length prefix over the SSH channel data stream, matching `RawFramingInterface`'s wire format). The `SshHandler` already detects `alknet-*` destinations; the bridge is connecting that detection to the channel stream.
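The framing this bridge relies on can be sketched with the standard library alone. This is a minimal illustration, not the contents of `call::frame`: the big-endian byte order, the `MAX_FRAME_LEN` cap, and the function names `write_frame`/`read_frame` are assumptions made for the example.

```rust
use std::io::{self, Read, Write};

/// Upper bound on one frame body. Hypothetical limit, not taken from the plan.
pub const MAX_FRAME_LEN: usize = 1 << 20;

/// Write one frame: 4-byte length prefix, then the payload bytes.
/// Big-endian byte order is an assumption; the plan only says "4-byte length prefix".
pub fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    w.write_all(&(payload.len() as u32).to_be_bytes())?;
    w.write_all(payload)
}

/// Read one frame. Returns `Ok(None)` on a clean EOF at a frame boundary.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Option<Vec<u8>>> {
    let mut header = [0u8; 4];
    let mut filled = 0;
    while filled < header.len() {
        match r.read(&mut header[filled..])? {
            0 if filled == 0 => return Ok(None), // peer closed between frames
            0 => return Err(io::ErrorKind::UnexpectedEof.into()), // closed mid-header
            n => filled += n,
        }
    }
    let len = u32::from_be_bytes(header) as usize;
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut body = vec![0u8; len];
    r.read_exact(&mut body)?;
    Ok(Some(body))
}
```

Rejecting oversized lengths before allocating keeps a malformed or hostile prefix from forcing a large allocation, which matters when the same reader sits behind both SSH channels and raw framing.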
### 2.2 RawFramingInterface Implementation
**Source**: interface.md, integration-plan Phase 1.8
**Current state**: `RawFramingInterface` and `RawFramingSession` are stub types. `accept()` returns an error, `recv()` returns `None`, `send()` returns an error.
**Changes to alknet-core**:
- Implement `RawFramingInterface::accept()` — read the 4-byte length prefix + JSON `EventEnvelope` frame from the transport stream, return a `RawFramingSession` that wraps the stream
- Implement `RawFramingSession::recv()` — read length-prefixed `EventEnvelope` frames from the stream, produce `InterfaceEvent`
- Implement `RawFramingSession::send()` — write length-prefixed `EventEnvelope` frames to the stream
- Auth for raw framing: first frame on the session is an auth event carrying token data, resolved via `IdentityProvider::resolve_from_token()`. After auth succeeds, subsequent frames are call protocol `EventEnvelope` data. The `RawFramingSession` is not considered authenticated until the auth frame is processed.
**Auth design decision**: Raw framing sessions use a first-frame auth pattern. The first `InterfaceEvent` on a `RawFramingSession` carries an auth token (in the `InterfaceEvent.identity` field or a dedicated auth event type). After authentication, all subsequent frames are call protocol events. This is simpler and more secure than per-frame auth — the session has a clear auth state transition, and the token is only transmitted once. For sessions that fail auth, the session is terminated immediately.
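The auth state transition described above can be sketched as a small state machine. This is an illustration under assumptions: `Identity` here is a stand-in for the core type, the first frame is treated as raw token bytes, and `FirstFrameAuth`, `Inbound`, and the `resolve` callback (standing in for `IdentityProvider::resolve_from_token()`) are hypothetical names.

```rust
/// Hypothetical stand-in for alknet-core's `Identity`.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

/// What the session hands upward for each inbound frame.
#[derive(Debug, PartialEq)]
pub enum Inbound {
    /// The auth frame was accepted; the session is now authenticated.
    Authenticated(Identity),
    /// A call protocol frame, tagged with the session identity.
    Event { identity: Identity, payload: Vec<u8> },
    /// Auth failed or the session was already closed; drop the connection.
    Terminated,
}

enum State {
    AwaitingAuth,
    Authenticated(Identity),
    Closed,
}

/// First-frame auth: exactly one token frame, then only call protocol frames.
pub struct FirstFrameAuth<F> {
    state: State,
    resolve: F, // assumption: wraps the identity provider's token lookup
}

impl<F: Fn(&[u8]) -> Option<Identity>> FirstFrameAuth<F> {
    pub fn new(resolve: F) -> Self {
        Self { state: State::AwaitingAuth, resolve }
    }

    pub fn on_frame(&mut self, frame: Vec<u8>) -> Inbound {
        // Take the state out, defaulting to Closed so a failure stays terminal.
        match std::mem::replace(&mut self.state, State::Closed) {
            State::AwaitingAuth => match (self.resolve)(&frame) {
                Some(id) => {
                    self.state = State::Authenticated(id.clone());
                    Inbound::Authenticated(id)
                }
                None => Inbound::Terminated,
            },
            State::Authenticated(id) => {
                self.state = State::Authenticated(id.clone());
                Inbound::Event { identity: id, payload: frame }
            }
            State::Closed => Inbound::Terminated,
        }
    }
}
```

Because a failed first frame leaves the machine in `Closed`, a client cannot retry tokens on the same session, which matches the "terminated immediately" rule.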
**Why this is Phase 2**: Raw framing is the simplest interface and the foundation for all non-SSH paths (TCP mesh, WebTransport, DNS). Without it, no `MessageInterface` or `StreamInterface` other than SSH can carry call protocol traffic. The full HTTP interface (Phase 5.3) builds on the framing logic established here.
**New crate**: None. This is alknet-core.
**Risk**: Low — straightforward length-prefixed frame reader/writer. The frame format already exists in `call::frame::{encode, decode}`. The auth design (first-frame auth) is simple and matches the `InterfaceEvent` model where `identity: Option<Identity>` is set on auth and carried forward.
### 2.3 StreamInterface / MessageInterface Split
**Source**: research/phase2/interface-model.md
**Current state**: The `Interface` trait has one form (`accept(stream) → Session`). Phase 2 research identifies that HTTP and DNS are not stream-based — they're message-based (individual request/response pairs, no persistent session). The research proposes splitting into `StreamInterface` and `MessageInterface`.
**Changes to alknet-core**:
- Rename `Interface` → `StreamInterface` (the current trait becomes the stream-specific variant)
- Rename `InterfaceSession` → `StreamInterfaceSession` (or keep it as `InterfaceSession`, since it is already specific to stream sessions)
- Add `MessageInterface` trait: `handle_request(&self, request: InterfaceRequest) -> Result<InterfaceResponse>`
- Add `InterfaceRequest` and `InterfaceResponse` types
- Add `HttpInterface` stub (struct and impl signature, axum not wired yet)
- Add `DnsInterface` stub (struct definition only)
- Restructure `InterfaceConfig` enum: current `InterfaceConfig::Ssh(SshInterfaceConfig)` and `InterfaceConfig::RawFraming(RawFramingConfig)` become `StreamInterfaceConfig::Ssh` and `StreamInterfaceConfig::RawFraming`. Add `MessageInterfaceConfig` variants for HTTP and DNS.
- Update `ListenerConfig` to include `Stream`, `Http`, and `Dns` variants (per ADR-035 and updated interface.md)
- Add `TransportKind::WebTransport` as a tag-only variant (no acceptor implementation) — this was planned for Phase 1 but never added. It's a trivial addition that prevents a breaking change later.
- Note: `TransportKind::Dns` was never added to the code, so no removal is needed. The updated specs correctly show DNS as a `MessageInterface` with its own `ListenerConfig::Dns` variant, not a transport.
**Why this is Phase 2**: This is a type-system change that affects how all future interfaces are implemented. If we build HTTP on top of `Interface` (singular) and then need to split later, we'd refactor HTTP, DNS, WebSocket, and any other interface added in Phases 4+. Doing the split now is cheap — it's a rename + new trait + two stubs — and prevents a larger refactor later.
**New crate**: None. This is alknet-core.
**ADR**: 035 (StreamInterface/MessageInterface split — supersedes the Layer 2 aspects of ADR-026)
**Risk**: Low — rename and new trait. Existing `SshInterface` and `RawFramingInterface` become `StreamInterface` implementations. No behavior change for stream-based interfaces. The `InterfaceConfig` enum restructuring and `TransportKind::WebTransport` addition are mechanical changes.
**Scheduling note**: This task should be done early in Phase 2 because the other Phase 2 tasks (2.1, 2.2, 2.4, 2.5, 2.6, 2.7) reference the new trait names. It can be done in parallel with 2.1 and 2.2 since they're mostly additive.
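To make the shape of the split concrete, here is a synchronous, std-only sketch of the two traits. It is illustrative only: the real traits are presumably async and carry `InterfaceEvent`/`EventEnvelope` rather than raw bytes, and `ByteStream`, `PassThrough`, `Echo`, and the fields of `InterfaceRequest`/`InterfaceResponse` are assumptions.

```rust
use std::io::{self, Read, Write};

/// Any bidirectional byte stream handed over by a transport (hypothetical helper).
pub trait ByteStream: Read + Write {}
impl<T: Read + Write> ByteStream for T {}

/// Assumed minimal request/response shapes; the real types are not specified here.
pub struct InterfaceRequest { pub path: String, pub body: Vec<u8> }
pub struct InterfaceResponse { pub status: u16, pub body: Vec<u8> }

/// Stream side: a session lives as long as the underlying stream.
pub trait StreamInterfaceSession {
    fn recv(&mut self) -> Option<Vec<u8>>;
    fn send(&mut self, event: &[u8]) -> io::Result<()>;
}

pub trait StreamInterface {
    type Session: StreamInterfaceSession;
    fn accept(&self, stream: Box<dyn ByteStream>) -> io::Result<Self::Session>;
}

/// Message side: one request in, one response out, no session.
pub trait MessageInterface {
    fn handle_request(&self, request: InterfaceRequest) -> io::Result<InterfaceResponse>;
}

/// Toy stream interface: each read from the stream becomes one event.
pub struct PassThrough;
pub struct PassThroughSession { stream: Box<dyn ByteStream> }

impl StreamInterface for PassThrough {
    type Session = PassThroughSession;
    fn accept(&self, stream: Box<dyn ByteStream>) -> io::Result<PassThroughSession> {
        Ok(PassThroughSession { stream })
    }
}

impl StreamInterfaceSession for PassThroughSession {
    fn recv(&mut self) -> Option<Vec<u8>> {
        let mut buf = [0u8; 64];
        match self.stream.read(&mut buf) {
            Ok(n) if n > 0 => Some(buf[..n].to_vec()),
            _ => None, // EOF or error ends the session
        }
    }
    fn send(&mut self, event: &[u8]) -> io::Result<()> {
        self.stream.write_all(event)
    }
}

/// Toy message interface: echoes the request body back.
pub struct Echo;
impl MessageInterface for Echo {
    fn handle_request(&self, request: InterfaceRequest) -> io::Result<InterfaceResponse> {
        Ok(InterfaceResponse { status: 200, body: request.body })
    }
}
```

The two traits share no methods, which is the practical argument for keeping them independent rather than introducing a common super-trait.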
### 2.4 CredentialProvider Trait and CredentialSet
**Source**: research/phase2/credential-provider.md
**Current state**: No outbound credential resolution exists. Each service wrapper would need to independently retrieve and manage credentials.
**Changes to alknet-core**:
- Define `CredentialProvider` trait in `alknet_core::credentials`
- Define `CredentialSet` enum: `ApiKey`, `Basic`, `Bearer`, `S3AccessKey`, `OidcToken`, `Custom`
- Implement `ConfigCredentialProvider` — a config-backed stub that reads API keys and static credentials from `DynamicConfig`. This is the Phase 2 default: simple, no secret service dependency, sufficient for testing and single-node deployments.
- Wire into `OperationEnv` so handlers can access credentials through `context.env` (or a separate `CredentialProvider` field on `OperationContext` — implementation detail)
- Define the `SecretStoreCredentialProvider` type and its interface (reads from `SecretProtocol::Decrypt`, holds in RAM) but **do not implement the body** — leave it as a stub that returns `None`. Full implementation requires alknet-secret (Phase 3).
**Why this is Phase 2**: The secret crate (Phase 3) needs `CredentialProvider` as a consumer of `SecretProtocol::Decrypt`. The trait and enum must exist in core before the secret crate can wire against them. This is the same pattern as `IdentityProvider` — trait in core, default impl uses simple storage, production impl uses the secret service.
**New crate**: None. Trait and enum in alknet-core.
**Risk**: Low — new trait and enum, no existing code changes. `ConfigCredentialProvider` is a simple config-backed lookup. `SecretStoreCredentialProvider` stub returns `None` until Phase 3 provides the secret service dependency.
**Split note**: This task is naturally split into:
- **2.4a** (this phase): Define `CredentialProvider` trait, `CredentialSet` enum, `ConfigCredentialProvider` impl, wire into `OperationEnv`/`OperationContext`. This is self-contained and testable.
- **2.4b** (Phase 3, after alknet-secret exists): Implement `SecretStoreCredentialProvider` backed by `SecretProtocol::Decrypt`. This requires alknet-secret as a dependency.
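A rough sketch of what 2.4a could look like. The variant names come from the list above; the variant fields, the `resolve(service)` method signature, the `with` builder, and the `resolve_first` helper are assumptions for illustration, and the config-backed provider uses a plain `HashMap` in place of `DynamicConfig`.

```rust
use std::collections::HashMap;

/// Outbound credentials. Variant names follow the plan; fields are assumed.
#[derive(Debug, Clone, PartialEq)]
pub enum CredentialSet {
    ApiKey(String),
    Basic { username: String, password: String },
    Bearer(String),
    S3AccessKey { access_key_id: String, secret_access_key: String },
    OidcToken(String),
    Custom(HashMap<String, String>),
}

/// Hypothetical trait shape: look up credentials by target service name.
pub trait CredentialProvider {
    fn resolve(&self, service: &str) -> Option<CredentialSet>;
}

/// Config-backed provider; a HashMap stands in for `DynamicConfig` here.
#[derive(Default)]
pub struct ConfigCredentialProvider {
    entries: HashMap<String, CredentialSet>,
}

impl ConfigCredentialProvider {
    pub fn with(mut self, service: &str, creds: CredentialSet) -> Self {
        self.entries.insert(service.to_string(), creds);
        self
    }
}

impl CredentialProvider for ConfigCredentialProvider {
    fn resolve(&self, service: &str) -> Option<CredentialSet> {
        self.entries.get(service).cloned()
    }
}

/// Phase 2 stub: always `None` until alknet-secret exists (2.4b).
pub struct SecretStoreCredentialProvider;

impl CredentialProvider for SecretStoreCredentialProvider {
    fn resolve(&self, _service: &str) -> Option<CredentialSet> {
        None
    }
}

/// Try providers in order and return the first hit (illustrative helper).
pub fn resolve_first(providers: &[&dyn CredentialProvider], service: &str) -> Option<CredentialSet> {
    providers.iter().find_map(|p| p.resolve(service))
}
```

Chaining the stub ahead of the config provider shows why returning `None` is a safe placeholder: lookups fall through to configuration until the secret-backed implementation lands.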
### 2.5 ListenerConfig Update and HTTP Listener Stub
**Source**: research/phase2/tls-transport.md
**Current state**: Phase 1 added `ListenerConfig` with a `Stream` variant (transport + interface pair). Phase 2 research adds `Http` and `Dns` listener variants for message-based interfaces. The previously planned removal of `TransportKind::Dns` is a no-op: the variant was never added to the code (DNS is a `MessageInterface`, not a transport).
**Changes to alknet-core**:
- `TransportKind::Dns` removal: **No-op**. `TransportKind` in the current code has `Tcp`, `Tls`, and `Iroh` only. `Dns` was never added to the enum. The updated specs correctly show DNS as a `MessageInterface` with its own `ListenerConfig::Dns` variant (per ADR-035), not as a transport variant.
- Add `ListenerConfig::Http` variant: `{ bind_addr, tls, stealth }`
- Add `ListenerConfig::Dns` variant: `{ bind_addr, tls }` (DNS as a MessageInterface with its own listener)
- Extend the server accept loop to handle `ListenerConfig::Http` by spawning an axum router when `stealth` mode detects HTTP traffic (replacing `send_fake_nginx_404`)
- `HttpInterface` stub defined in 2.3 gets its structural types but no route implementations yet
**Why this is Phase 2**: The `ListenerConfig` is the server's primary configuration type. Adding HTTP and DNS listener variants now means Phase 3+ crates and the full HTTP implementation (Phase 5.3) can reference the right type from the start. Settling that DNS is a `MessageInterface` rather than a `TransportKind` variant before any code depends on it prevents a breaking change later.
**New crate**: None. This is alknet-core. New dependency: `axum` (behind `http` feature flag).
**Risk**: Low — type changes and a stub axum router. The `send_fake_nginx_404` → axum handoff is a small change to the existing stealth detection code. Full HTTP route implementations are Phase 5 (5.3).
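One possible shape for the updated enum, as a sketch. The variant names and the `{ bind_addr, tls, stealth }` fields follow the bullets above; representing `tls` as a `bool`, giving `Stream` a `bind_addr`, and the `StreamInterfaceKind` and `accept_path` names are assumptions made to keep the example self-contained.

```rust
use std::net::SocketAddr;

/// Transport tags; `WebTransport` is tag-only (no acceptor yet).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TransportKind { Tcp, Tls, Iroh, WebTransport }

/// Hypothetical tag for which stream interface runs over the transport.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StreamInterfaceKind { Ssh, RawFraming }

#[derive(Debug, Clone, PartialEq)]
pub enum ListenerConfig {
    /// Transport + stream interface pair.
    Stream { bind_addr: SocketAddr, transport: TransportKind, interface: StreamInterfaceKind },
    /// Message-based HTTP listener; `tls` is a bool placeholder for a real TLS config.
    Http { bind_addr: SocketAddr, tls: bool, stealth: bool },
    /// Message-based DNS listener.
    Dns { bind_addr: SocketAddr, tls: bool },
}

/// Which accept path the server loop would take for a listener.
pub fn accept_path(listener: &ListenerConfig) -> &'static str {
    match listener {
        ListenerConfig::Stream { .. } => "stream",
        ListenerConfig::Http { stealth: true, .. } => "http (stealth handoff)",
        ListenerConfig::Http { .. } => "http",
        ListenerConfig::Dns { .. } => "dns",
    }
}
```

Keeping DNS out of `TransportKind` means the accept loop never has to pair it with a stream interface; it is dispatched on the listener variant alone.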
### 2.6 API Keys in DynamicConfig
**Source**: research/phase2/interface-model.md (Config section), research/phase2/credential-provider.md
**Current state**: `DynamicConfig.auth` has `authorized_keys` for SSH auth and `token` settings but no simple bearer API keys for service accounts or automation.
**Changes to alknet-core**:
- Add `[[auth.api_keys]]` section to `DynamicConfig`: prefix, hash (SHA-256), scopes, description, optional TTL
- Extend `ConfigIdentityProvider::resolve_from_token()` to verify API keys in addition to AuthTokens
- API keys are shorter and simpler than AuthTokens — no Ed25519 key pair needed, just a hash-verified bearer string
- `SecretStoreCredentialProvider` can also resolve API keys when database-backed storage is available
**Why this is Phase 2**: The HTTP interface (scaffolded in 2.7, fully implemented in Phase 5.3) needs bearer token auth, and the simplest path is API keys that already work with `IdentityProvider::resolve_from_token()`. Without this, HTTP auth has no config-based mechanism.
**New crate**: None. This is alknet-core.
**Risk**: Low — additive config section and an additional lookup path in an existing trait method.
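The additional lookup path might look roughly like this. It is a sketch under assumptions: the standard library has no SHA-256, so the hash function is injected (in practice it would be SHA-256, for example from the `sha2` crate already listed as a dependency of alknet-secret); the TTL is modelled as an absolute `expires_at` timestamp; and `ApiKeyEntry` and `verify_api_key` are hypothetical names.

```rust
/// One `[[auth.api_keys]]` entry; field types are assumed.
pub struct ApiKeyEntry {
    pub prefix: String,
    pub hash: Vec<u8>,            // digest of the full key (SHA-256 in the plan)
    pub scopes: Vec<String>,
    pub expires_at: Option<u64>,  // assumption: TTL stored as a Unix timestamp
}

/// Compare digests without short-circuiting on the first differing byte.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    a.len() == b.len() && a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

/// Return the matching, unexpired entry for a presented bearer string.
/// `hash` is injected so the sketch stays std-only.
pub fn verify_api_key<'a>(
    keys: &'a [ApiKeyEntry],
    token: &str,
    now: u64,
    hash: impl Fn(&[u8]) -> Vec<u8>,
) -> Option<&'a ApiKeyEntry> {
    let digest = hash(token.as_bytes());
    keys.iter().find(|k| {
        token.starts_with(k.prefix.as_str())
            && constant_time_eq(&k.hash, &digest)
            && k.expires_at.map_or(true, |t| now < t)
    })
}
```

Storing only the digest means a leaked config file does not reveal usable keys, and the prefix check narrows the candidates before any digest comparison.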
### 2.7 Axum HTTP Router Scaffold
**Source**: research/phase2/tls-transport.md
**Changes to alknet-core** (behind `http` feature flag):
- Add `axum` dependency (behind feature flag)
- Create `alknet_core::http` module with an axum `Router` scaffold:
- Auth middleware that extracts `Authorization: Bearer <token>` and calls `IdentityProvider::resolve_from_token()`, attaching the resolved `Identity` to the request extensions
- Stealth handoff: replace `send_fake_nginx_404` with axum router serving the `BufReader<TlsStream>`
- A default 404 handler for any unmatched routes (no hardcoded operation paths)
- No operational routes yet — the question of how HTTP paths map to operation invocations depends on the from_openapi / spec-generation work and is deferred to Phase 5. Custom routes (git, S3, OpenAI proxy) will register directly with the axum router at their own paths, sharing the auth middleware but with their own routing logic.
- The `ListenerConfig::Http` variant and stealth mode handoff are established here so that HTTP traffic reaches axum with auth context. Routing *inside* axum is a later concern.
**Why this is Phase 2**: The auth middleware and stealth handoff are prerequisites for any HTTP endpoint. Without this, the only way to reach call protocol operations is via SSH. The scaffold gets HTTP traffic to axum with identity — the specific routes and path conventions are intentionally not specified here.
**New crate**: None. In alknet-core behind `http` feature flag.
**Risk**: Low — structural scaffold with auth middleware and stealth handoff only. No operational routes or path conventions.
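Independent of axum, the core of the auth middleware is small. The sketch below shows only the header parsing and the identity lookup; `bearer_token` and `authenticate` are hypothetical helper names, the `resolve` callback stands in for `IdentityProvider::resolve_from_token()`, and mapping failure to a bare `401` status is an assumption.

```rust
/// Extract the token from an `Authorization: Bearer <token>` header value.
/// The scheme is matched case-insensitively; tokens with inner whitespace are rejected.
pub fn bearer_token(header_value: &str) -> Option<&str> {
    let (scheme, rest) = header_value.trim().split_once(' ')?;
    if !scheme.eq_ignore_ascii_case("bearer") {
        return None;
    }
    let token = rest.trim();
    if token.is_empty() || token.contains(char::is_whitespace) {
        None
    } else {
        Some(token)
    }
}

/// Resolve an identity from an optional header, or fail with HTTP 401.
/// `resolve` is an assumed stand-in for the identity provider's token lookup.
pub fn authenticate<I>(
    authorization: Option<&str>,
    resolve: impl Fn(&str) -> Option<I>,
) -> Result<I, u16> {
    authorization
        .and_then(bearer_token)
        .and_then(|token| resolve(token))
        .ok_or(401)
}
```

In the real scaffold the resolved identity would be attached to the request extensions; returning it from a plain function keeps the logic testable without a running router.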
**Open question**: How should external HTTP paths map to alknet operations? The internal path convention (`/{namespace}/{op}` over call protocol channels) is one design; external HTTP paths are determined by the API being exposed (OpenAI `/v1/chat/completions`, S3 `/{bucket}/{key}`, git `/{repo}.git/info/refs`). The inverse of `from_openapi` — generating an OpenAPI spec from registered operations and mapping those to HTTP routes — will determine the answer. This is deferred to Phase 5.
---
## Phase 3: External Crates
**Goal**: Create the new crates that core depends on by type but not by implementation.
**Why after Phase 2**: The core types and bridges must be stable before building crates that reference them. Phase 2 ensures that the `InterfaceSession` bridge works, `CredentialProvider` exists, and `ListenerConfig` has its final shape. The external crates can then wire against a functional core.
### 3.1 alknet-secret
**Source**: research/services.md (SecretProtocol), research/storage.md (secrets section, key derivation)
**Contents**:
- BIP39 mnemonic generation and seed derivation
- SLIP-0010 Ed25519 HD key derivation (SLIP-0044 coin type 74')
- AES-256-GCM encryption/decryption for external credentials
- `SecretProtocol` irpc service implementation (Unlock, Lock, DeriveEd25519, DeriveEncryptionKey, Encrypt, Decrypt)
- `EncryptedData` type (key_version, salt, iv, ciphertext)
- Derivation path constants
**Dependencies**: bip39, ed25519-bip32 (or rust-bip32-ed25519), aes-gcm, sha2, irpc
**Does NOT depend on**: alknet-core, alknet-storage
**Interface back to core**: alknet-secret types (EncryptedData, derivation paths) are referenced by alknet-storage when storing encrypted nodes. The wire format is stable; core never sees the seed or derived keys.
**ADR**: 027 (crate decomposition)
**Risk**: Low — new crate, no existing code to refactor. Crypto dependencies are well-understood.
### 3.2 alknet-storage
**Source**: research/storage.md (entire document)
**Contents**:
- SQLite-backed metagraph (GraphType, NodeType, EdgeType, Graph, Node, Edge)
- Identity tables (accounts, organizations, peer_credentials, api_keys, audit_logs)
- ACL as metagraph (PrincipalNode, DelegatesEdge, access control graph)
- Encrypted node type (bridges to alknet-secret's EncryptedData format)
- Honker integration (stream_publish/subscribe, notify/listen, queue/claim)
- System DB vs Tenant DB separation
- `StorageProtocol` irpc service
**Dependencies**: rusqlite (via honker or direct), honker, serde_json, jsonschema, petgraph, irpc
**Does NOT depend on**: alknet-core, alknet-secret (but references EncryptedData type format)
**Interface back to core**:
- `StorageIdentityProvider` implements alknet-core's `IdentityProvider` trait (queries peer_credentials + ACL graph)
- `StorageProtocol` is called via irpc from alknet-core's service layer
**ADR**: 027 (crate decomposition), 032 (event boundary discipline)
**Risk**: Medium — honker integration is new. SQLite schema needs to match the TypeScript version for compatibility.
### 3.3 alknet-flowgraph
**Source**: research/flow.md (entire document)
**Contents**:
- `FlowGraph<N, E>` generic graph over `petgraph::DiGraph`
- `NodeAttributes` / `EdgeAttributes` traits
- Operation graph construction from `OperationSpec`s
- Call graph population from `EventEnvelope` events
- Type compatibility checking (jsonschema)
- Cycle detection, topological sort, reachability queries
- Serde serialization/deserialization
**Dependencies**: petgraph, serde, serde_json, jsonschema, thiserror
**Does NOT depend on**: alknet-core, alknet-storage, alknet-secret
**Interface back to core**: `OperationSpec` and `CallNodeAttrs` types must match alknet-core's definitions. Bridge is serialization — flowgraph serializes to JSON, storage persists it.
**ADR**: 027 (crate decomposition)
**Risk**: Low — pure computation crate, no I/O, no external state. Straight port of TypeScript design.
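Cycle detection and topological sort can be combined in one pass. The crate itself is planned on top of `petgraph`; the sketch below deliberately swaps that for a std-only Kahn's algorithm so it stays self-contained, and `topo_sort` with its string-based node names is a hypothetical signature.

```rust
use std::collections::{HashMap, VecDeque};

/// Kahn's algorithm over named nodes.
/// Returns `None` if the graph has a cycle or an edge names an unknown node.
pub fn topo_sort(nodes: &[&str], edges: &[(&str, &str)]) -> Option<Vec<String>> {
    let index: HashMap<&str, usize> =
        nodes.iter().enumerate().map(|(i, n)| (*n, i)).collect();
    let mut indegree = vec![0usize; nodes.len()];
    let mut outgoing: Vec<Vec<usize>> = vec![Vec::new(); nodes.len()];

    for (from, to) in edges {
        let (f, t) = (*index.get(from)?, *index.get(to)?);
        outgoing[f].push(t);
        indegree[t] += 1;
    }

    // Start from every node with no incoming edges, in input order.
    let mut queue: VecDeque<usize> =
        (0..nodes.len()).filter(|i| indegree[*i] == 0).collect();
    let mut order = Vec::with_capacity(nodes.len());

    while let Some(n) = queue.pop_front() {
        order.push(nodes[n].to_string());
        for &m in &outgoing[n] {
            indegree[m] -= 1;
            if indegree[m] == 0 {
                queue.push_back(m);
            }
        }
    }

    // Nodes left with nonzero indegree are on a cycle.
    if order.len() == nodes.len() { Some(order) } else { None }
}
```

With `petgraph`, the equivalent check would go through its own algorithms module; the point here is only that ordering and cycle detection fall out of the same traversal.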
---
## Phase 4: Integration and Wiring
**Goal**: Wire the crates together. The CLI binary and NAPI layer assemble everything.
**Why after Phase 3**: Integration requires all pieces to exist. Phase 1 defines the interfaces; Phase 2 completes the core bridge; Phase 3 builds the crate implementations; Phase 4 connects them.
### 4.1 CLI Binary (alknet crate)
**Source**: research/configuration.md (CLI config, --config flag)
**Contents**:
- `alknet serve` — parse TOML config, assemble StaticConfig + initial DynamicConfig, create services, run multi-transport server
- `alknet connect` — parse CLI flags or TOML profile, create ConnectOptions, run client
- Service assembly: for minimal deployments, use ArcSwap-backed services. For production, wire in SQLite-backed services.
- TOML config file parsing (`alknet serve --config stack.toml`)
**New dependency**: `toml` crate (for config file parsing)
### 4.2 Service Assembly
The CLI or NAPI layer is responsible for wiring services together:
```rust
// Minimal deployment (single-node, CLI)
let auth = ConfigIdentityProvider::new(dynamic_config.clone());
let config = ConfigServiceImpl::new(dynamic_config.clone());
let secret = None; // No secret service in minimal mode
// Production deployment (head node)
let auth = StorageIdentityProvider::new(storage_db);
let config = ConfigServiceImpl::new(dynamic_config.clone());
let secret = SecretServiceImpl::new(storage_db); // Holds seed in memory
```
Core doesn't know about this assembly — it receives `IdentityProvider` and `DynamicConfig` through its public API.
### 4.3 OperationEnv Wiring — Three Dispatch Paths
The OperationEnv is the universal composition mechanism. When a handler calls `context.env.secrets.derive(input)`, the runtime resolves which dispatch path to take:
**Local dispatch** (in-process):
```
handler calls context.env[namespace][op](input)
→ OperationEnv resolves the handler function from the local registry
→ Direct function call, zero serialization
→ Returns ResponseEnvelope
```
**Service dispatch** (in-cluster, irpc):
```
handler calls context.env[namespace][op](input)
→ OperationEnv resolves that this operation is backed by an irpc service
→ Serializes input via postcard, sends to AuthProtocol::VerifyPubkey via mpsc channel (local) or QUIC stream (remote)
→ Receives AuthResult, wraps in ResponseEnvelope
```
**Remote dispatch** (cross-node, call protocol):
```
handler calls context.env[namespace][op](input)
→ OperationEnv resolves that this operation lives on a remote node
→ Sends call.requested EventEnvelope via the interface (SSH channel, raw framing, DNS, etc.)
→ Receives call.responded EventEnvelope, deserializes payload
```
All three paths produce the same `ResponseEnvelope`. The handler neither knows nor cares which path was taken. The OperationEnv is wired at startup based on deployment topology:
```rust
// Minimal deployment (single node, all local)
let env = OperationEnv::local(local_registry);
// Production deployment (mix of local and remote)
let env = OperationEnv::new()
.local("auth", auth_registry) // Auth runs locally
.local("config", config_registry) // Config runs locally
.service("secrets", secret_irpc_client) // Secret service via irpc
.remote("worker-1", call_protocol_conn) // Worker-1 operations via call protocol
;
```
The irpc service layer is thus **one dispatch backend** for OperationEnv — the path chosen when an operation is registered as backed by an in-cluster service. It is not a replacement for OperationEnv or for the call protocol.
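The "same `ResponseEnvelope` regardless of path" property can be illustrated with a toy dispatcher. Everything here is a simplification under assumptions: payloads are strings, the irpc client and call protocol connection are modelled as boxed closures, and `Backend`, `with`, and `invoke` are hypothetical names rather than the planned `OperationEnv` API.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub struct ResponseEnvelope {
    pub payload: String,
}

type LocalHandler = Box<dyn Fn(&str) -> String>;
/// Stand-in for an irpc client or a call protocol connection: (op, input) -> reply.
type Channel = Box<dyn Fn(&str, &str) -> Option<String>>;

pub enum Backend {
    /// In-process handlers keyed by operation name.
    Local(HashMap<String, LocalHandler>),
    /// In-cluster irpc service (modelled as a closure here).
    Service(Channel),
    /// Remote node via the call protocol (modelled as a closure here).
    Remote(Channel),
}

#[derive(Default)]
pub struct OperationEnv {
    namespaces: HashMap<String, Backend>,
}

impl OperationEnv {
    pub fn with(mut self, namespace: &str, backend: Backend) -> Self {
        self.namespaces.insert(namespace.to_string(), backend);
        self
    }

    /// The caller sees one signature; the backend decides the dispatch path.
    pub fn invoke(&self, namespace: &str, op: &str, input: &str) -> Option<ResponseEnvelope> {
        let payload = match self.namespaces.get(namespace)? {
            Backend::Local(ops) => ops.get(op).map(|handler| handler(input)),
            Backend::Service(chan) | Backend::Remote(chan) => chan(op, input),
        }?;
        Some(ResponseEnvelope { payload })
    }
}
```

A handler calling `invoke("secrets", ...)` cannot tell whether the namespace is local, irpc-backed, or remote, which is the property the three-path design depends on.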
### 4.4 NAPI Layer Updates
**Changes to alknet-napi**:
- Expose `reloadAuth()`, `reloadForwarding()`, `reloadAll()` on the AlknetServer object
- Call protocol integration: expose operation registry for NAPI consumers to register handlers
- Service layer: expose irpc service creation for NAPI consumers
### 4.5 Architecture Doc Sync
After Phase 2 core bridge changes are implemented and before Phase 3 crate development begins, the architecture docs should be updated to reflect the implementation state. The first round of doc sync has already been completed (commit `cfc4400`) based on Phase 2 research findings — this covered:
- StreamInterface/MessageInterface split in interface.md
- CredentialProvider/CredentialSet in credentials.md
- API keys in auth.md and configuration.md
- ListenerConfig variants for HTTP and DNS
- Resolved open questions (OQ-IF-01, OQ-IF-02, etc.)
- New ADRs (035, 036, 037)
A **second doc sync** will be needed after Phase 2 implementation is complete to capture any deviations between the spec and the actual implementation (e.g., if `InterfaceConfig` was restructured differently, or if the raw framing auth design differs from the first-frame approach specified here). This second sync should be done before Phase 3 crate development begins.
---
## Phase 5: Application Services and Advanced Features
**Goal**: Build services that register with the operation registry but don't change core.
**Why last**: These are pluggable. They depend on the core being stable (Phases 1-4) but don't affect core's architecture.
### 5.1 DNS Transport + Control Channel Interface
**Source**: research/core.md (DNS transport section)
**Scope**:
- `DnsInterface` (already defined as a `MessageInterface` stub in Phase 2) gets full implementation
- DNS server that encodes/decodes `EventEnvelope` frames as DNS TXT query/response pairs
- Call protocol over DNS (not SSH over DNS — that's a separate, future goal)
- AuthToken embedded in DNS query labels
**Crate**: `alknet-core` (behind `dns` feature flag)
**ADR**: 026 (transport-interface separation) — DNS is a `MessageInterface`, not a (DNS transport, raw framing) pair
**Risk**: Medium — DNS protocol implementation is non-trivial. Framing, chunking, and retransmission need R&D.
### 5.2 WebTransport Transport
**Source**: architecture/auth.md (WebTransport section), research/phase2/tls-transport.md
**Scope**:
- `WebTransportAcceptor` implements `TransportAcceptor` trait
- Token auth for WebTransport sessions (AuthToken in CONNECT URL, `IdentityProvider::resolve_from_token()`)
- `TransportKind::WebTransport` variant
- QUIC listener coexistence with iroh on UDP 443
**Crate**: `alknet-core` (behind `webtransport` feature flag)
**Risk**: Medium — requires wtransport crate dependency, QUIC listener coexistence questions (OQ-15).
### 5.3 Full HTTP Interface Implementation
**Source**: research/phase2/tls-transport.md
**Scope**:
- Replace stub handlers in the Phase 2 axum scaffold with actual operation dispatch
- `POST /v1/{namespace}/{op}` → `registry.invoke(namespace, op, input)` (mutation)
- `GET /v1/{namespace}/{op}` → `registry.invoke(namespace, op, input)` (query, params as input)
- `GET /v1/{namespace}/{op}` SSE → `registry.subscribe(namespace, op, input)` (subscription)
- `GET /v1/schema` → `registry.list_operations()`
- OpenAPI spec generation from `OperationRegistry`
- WebSocket upgrade handler for persistent browser connections
**Crate**: `alknet-core` (behind `http` feature flag)
**Risk**: Medium — full HTTP routing, SSE streaming, auth middleware integration with OperationEnv.
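The route table in the scope list can be expressed as a pure function, which makes the mapping easy to test before any axum wiring exists. This is a sketch: the `Route` enum and `route` function are hypothetical, and deciding "SSE" from a boolean (for example derived from an `Accept: text/event-stream` header) is an assumption about how subscriptions are distinguished from queries.

```rust
/// Which registry call an HTTP request maps to.
#[derive(Debug, PartialEq)]
pub enum Route<'a> {
    Mutation { namespace: &'a str, op: &'a str },  // POST -> registry.invoke
    Query { namespace: &'a str, op: &'a str },     // GET  -> registry.invoke
    Subscribe { namespace: &'a str, op: &'a str }, // GET + SSE -> registry.subscribe
    Schema,                                        // GET /v1/schema -> registry.list_operations
}

/// Map (method, path) to a route. `accepts_sse` is an assumed signal for
/// subscription requests; the plan does not say how SSE is negotiated.
pub fn route<'a>(method: &str, path: &'a str, accepts_sse: bool) -> Option<Route<'a>> {
    let rest = path.strip_prefix("/v1/")?;
    if method == "GET" && rest == "schema" {
        return Some(Route::Schema);
    }
    let (namespace, op) = rest.split_once('/')?;
    if namespace.is_empty() || op.is_empty() || op.contains('/') {
        return None;
    }
    match (method, accepts_sse) {
        ("POST", _) => Some(Route::Mutation { namespace, op }),
        ("GET", true) => Some(Route::Subscribe { namespace, op }),
        ("GET", false) => Some(Route::Query { namespace, op }),
        _ => None,
    }
}
```

Anything that returns `None` would fall through to the default 404 handler from the Phase 2 scaffold.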
### 5.4 Docker Service, Node Service, Git Service, etc.
**Source**: research/services.md (application services section), research/references/gitserver/
These are all pluggable services that register operations with the core's `OperationRegistry`. They don't require core changes. They're candidates for an `alknet-services` crate or individual crates.
**Git Service** path (see research/references/gitserver/ and research/references/gitlfs/):
- Use `gitserver-core` as the git protocol engine (transport-agnostic, library-first design)
- `gitserver-http` nested in alknet's axum router for HTTPS git
- `rudolfs` (or a fork) as the LFS layer, backed by rustfs S3 storage
- Auth via `IdentityProvider` → gitserver's `AuthConfig`
- Operations: `git.clone`, `git.push`, `git.pull` registered in OperationRegistry
**Crate**: New crate(s) per service, or a consolidated `alknet-services` crate
**Risk**: Low — purely additive, no core changes needed.
### 5.5 Flow Graph Real-time Construction
**Source**: research/flow.md
Wire call protocol events (call.requested, call.responded, etc.) to `FlowGraph::update_from_event()`. This is application-level wiring, not a core concern.
**Crate**: Application code in the `alknet` binary or an `alknet-head` crate.
**Risk**: Low — event subscription pattern is well-established.
---
## Phase Summary
| Phase | What | Core Changes? | New Crates? | ADR Dependency |
|---|---|---|---|---|
| 0 | Architecture: ADRs, specs, review | No | No | Write all |
| 1 | Core: config split, identity, forwarding, auth service, OperationEnv, interface abstraction | Yes | No | 026-034 |
| 2 | Core bridge: SshSession recv/send, RawFramingInterface, StreamInterface/MessageInterface split, CredentialProvider (trait+stub), HTTP listener stub, API keys | Yes | No | 035, 036, 037, phase2 research |
| 3 | External crates: secret, storage, flowgraph | No | Yes (3) | 027 |
| 4 | Integration: CLI assembly, NAPI, service wiring, doc sync | Minor (exports) | No | 027 |
| 5 | Advanced: DNS, WebTransport, full HTTP, application services | Minimal (feature flags) | Maybe | 026 |
## Dependency Graph
```
                   alknet-secret
                  /             \
                 /               \
alknet-core ←────                 ←── alknet-storage
    ↑            \               /
    │             alknet-flowgraph
alknet-napi
alknet (CLI binary — assembles everything)
```
alknet-core depends on: russh, tokio, irpc (feature flag), serde, axum (feature flag)
alknet-secret depends on: bip39, ed25519-bip32, aes-gcm, sha2, irpc
alknet-storage depends on: honker, rusqlite, petgraph, jsonschema, irpc
alknet-flowgraph depends on: petgraph, serde, jsonschema
alknet-napi depends on: alknet-core
alknet (CLI) depends on: alknet-core, alknet-secret (feature), alknet-storage (feature), alknet-flowgraph (feature), toml
No crate depends on alknet-core's internal types through a circular path. The `Identity` type, `IdentityProvider` trait, and `OperationSpec` are the narrow interface points.
---
## Open Questions to Resolve Before Phase 2
These must have answers before Phase 2 implementation begins. Phase 0/1 questions are resolved.
| OQ | Question | Proposed Resolution | Phase | ADR |
|---|---|---|---|---|
| ~~OQ-12~~ | Per-user forwarding scope vs global rules | **Resolved**: Start with global rules + principal matching. Per-user scope from peer_credentials.metadata.scopes via IdentityProvider. | 1 | 031 |
| ~~OQ-16~~ | Transport-specific forwarding policy | **Resolved**: Add `TransportKind` match in ForwardingRule. | 1 | 031 |
| ~~OQ-18~~ | Source of Identity.scopes | **Resolved**: IdentityProvider owns scopes. ForwardingPolicy uses scopes from Identity. | 1 | 029 |
| ~~OQ-22~~ | Client streaming in call protocol | **Resolved**: Defer. Single request + optional streaming response covers all identified use cases. | — | — |
| ~~OQ-IF-01~~ | How does InterfaceSession relate to EventEnvelope? | **Resolved**: `InterfaceSession::recv()` returns `Option<InterfaceEvent>` where `InterfaceEvent` carries `EventEnvelope` + `Identity`. `send()` accepts `EventEnvelope`. The SshSession bridge implements this over `alknet-control:0`. For `MessageInterface`, `InterfaceRequest`/`InterfaceResponse` normalize request/response pairs. See interface.md, ADR-035. | 2 | 035 |
| ~~OQ-IF-02~~ | Should SshInterface own ForwardingPolicy checks? | **Resolved**: ForwardingPolicy is Layer 3 (policy), channel open/close lifecycle is Layer 2. SshInterface reports channel requests to Layer 3; Layer 3 applies policy. Current implementation already does this. | 2 | 031 |
| OQ-15 | TLS + WebTransport + iroh QUIC coexistence | Defer WebTransport to Phase 5. TLS and iroh already coexist (TCP vs UDP). | 5 | — |
| OQ-19 | Separate TLS identity for WebTransport vs shared | Share certificates. QUIC is UDP, TLS is TCP, same port works. Different subject alt names possible but not required. | 5 | — |
| OQ-20 | Worker registration and discovery on connect/disconnect | Register on connect, cleanup on disconnect. Heartbeat for liveness. Spec in call-protocol.md. | 2+ | — |
| OQ-P2-01 | Should MessageInterface and StreamInterface share a common trait? | **Resolved**: Independent traits. Different signatures (`handle_request` vs `accept` + session lifecycle), different transport ownership (self-managed vs provided). A common super-trait adds complexity without benefit. ADR-035 accepted. | 2 | 035 |
| OQ-P2-02 | Should HTTP share a port with the SSH listener? | **Resolved**: Start with separate ports. Stealth mode byte-peek on shared port 443 already detects SSH vs HTTP. ALPN multiplexing is a future optimization that doesn't change the interface abstraction. | 2 | — |
| OQ-P2-03 | Should the HTTP interface auto-generate OpenAPI specs from OperationRegistry? | **Resolved**: Yes, but Phase 5+. The HTTP interface needs to exist first (Phase 5.3). | 5 | — |
| OQ-P2-04 | How do self-hosted services authenticate via alknet? | **Resolved**: Three-phase approach. Phase A: shared secret (`CredentialSet::Bearer` or `S3AccessKey`). Phase C: identity-bound credentials via `ManagedCredentialProvider`. Phase D: alknet as OIDC provider. `CredentialProvider` trait in core enables Phase A immediately. ADR-036 accepted. | 2-5 | 036 |
---
## Inconsistencies and Conflations to Clean Up
The research documents have a few areas that need reconciliation:
1. **Hub/spoke vs head/worker**~~: core.md and services.md use head/worker. call-protocol.md still uses hub/spoke in several places. All docs need to be updated consistently. ADR-034 formalizes this.~~ **Fixed**: call-protocol.md, auth.md, open-questions.md, and napi-and-pubsub.md updated to head/worker terminology. ADRs are historical records and retain original terminology. ADR-034 still needed to formalize the decision.
2. **DNS as transport vs interface**: core.md conflates "DNS as transport" (encoding bytes as DNS queries) with "DNS as naming/discovery" (TXT records). The three-layer model cleanly separates these: DNS is a `MessageInterface`, not a transport. `TransportKind::Dns` was never added to the code, so **Phase 2 only adds `ListenerConfig::Dns`**; no removal is needed.
3. **Service naming collision — irpc service vs call protocol operation vs external service**: The research uses "service" for both irpc protocol enums and call protocol path-based handlers. See research/phase2/definitions.md for full disambiguation. The architecture should consistently use: **irpc service** (in-cluster, Rust-to-Rust), **operation** (path-based call protocol handler), **external service** (third-party endpoint), and **application service** (handler registered in OperationRegistry).
4. **Identity model divergence**~~: auth.md defines `Identity` with `{id, scopes, resources}`. services.md defines `Identity` with `{node_id, fingerprint, scopes}`.~~ **Fixed**: auth.md has the correct unified definition `{id, scopes, resources}`.
5. **OperationEnv is a universal composition mechanism, not an implementation detail**~~: services.md defines `OperationEnv` as `HashMap<String, HashMap<String, fn(...)>>`.~~ **Acknowledged**: The behavioral contract (namespace + operation name → invoke) must match. The Rust implementation can use typed dispatch behind the scenes.
6. **Event boundary discipline needs to be a hard constraint, not a suggestion**~~: storage.md and services.md both call this out, but it's presented as a pattern rather than a rule.~~ **Formalized**: ADR-032 makes it a hard architectural constraint. See also research/phase2/definitions.md (Domain Events vs Integration Events).
7. **Config file vs programmatic API**: configuration.md proposes TOML config files. ADR-011 says "no config file, programmatic-first." **Proposed**: TOML is an optional convenience layer that builds `StaticConfig`/`DynamicConfig`. `ServeOptions` builder pattern remains the primary API. ADR-011 is amended, not superseded.
8. **Interface model needs StreamInterface/MessageInterface split**: The current `Interface` trait assumes persistent byte streams. HTTP and DNS don't fit (they handle individual requests, not sessions). **Phase 2 addresses this**: rename `Interface` → `StreamInterface`, add `MessageInterface`, add `HttpInterface` stub. See research/phase2/interface-model.md.
9. **SshSession recv/send stubs are core, not "Phase 4"**: The Phase 1 implementation left `SshSession::recv()` and `SshSession::send()` as stubs returning `None` / silently discarding. This makes the interface model inert for call protocol operations. The bridge between SSH channels and `InterfaceEvent`/`EventEnvelope` frames is a **Phase 2** concern, not a future feature. See Phase 2.1.
10. **CredentialProvider is missing from core**: Outbound auth (how alknet authenticates to external services) has no trait or implementation. This is needed before any HTTP API integration work. **Phase 2.4** adds the trait and enum to core; Phase 3 (alknet-secret) provides the storage-backed implementation. See research/phase2/credential-provider.md.
11. **Architecture docs need sync after Phase 2**: The current architecture docs (interface.md, auth.md, services.md, call-protocol.md) reflect the pre-Phase-0/1 state. After Phase 2 core bridge changes land, these must be updated to reflect StreamInterface/MessageInterface, CredentialProvider, HTTP listener, and the functional call protocol bridge. **Phase 4.5** is the doc sync point.


@@ -1,466 +0,0 @@
# Credential Provider: Outbound Service Authentication
> Status: Research / Draft
> Last updated: 2026-06-08
> Part of: Phase 2 planning
## Overview
Alknet's `IdentityProvider` resolves **inbound** authentication: who is making a request _to_ alknet. The `CredentialProvider` resolves **outbound** authentication: how alknet authenticates _to_ external and self-hosted services. This is a distinct and currently unaddressed concern that affects nearly every application service — from cloud API integrations (vast.ai, runpod, ubicloud) to self-hosted infrastructure (rustfs, gitea, postgres).
## Problem Statement
### External API credentials
Cloud providers use simple auth patterns: API keys, bearer tokens, basic auth. The existing `SecretProtocol` (encrypt/decrypt via derived AES-256-GCM keys, defined in [secret-service.md](../../architecture/secret-service.md)) can store and retrieve these at rest. But the wiring between "decrypt a credential from storage" and "use it in an HTTP request" doesn't exist yet. As things stand, each service wrapper would have to solve credential retrieval, caching, and lifecycle independently.
### Self-hosted service auth
Self-hosted services use more complex auth mechanisms that go beyond static tokens:
- **rustfs** uses S3-style access key + secret key pairs with AWS Signature V4 request signing. It also supports OIDC (OpenID Connect with PKCE). The access key and secret key aren't sent as a bearer header; they're inputs to a per-request HMAC-SHA256 signature computation.
- **gitea** supports OAuth2, OIDC, and reverse proxy authentication (SSO via headers). Its internal user/token system is separate from alknet's identity model.
- Other self-hosted services (postgres, redis) may use their own auth schemes.
These services are **inside the operational domain** — their credential lifecycle (provisioning, rotation, revocation, token refresh) is part of running the stack, not a one-time configuration step.
### The gap
Currently:
```
User → alknet → IdentityProvider (resolves who the user is) ✅ exists
alknet → external service → ??? (resolves how alknet authenticates) ❌ missing
```
Without `CredentialProvider`, each service wrapper would:
1. Independently retrieve and decrypt credentials from the secret service
2. Independently implement auth mechanism specifics (bearer, S3 signing, OIDC refresh)
3. Have no shared infrastructure for credential lifecycle management
This leads to duplicated effort and inconsistent security practices across service wrappers.
## Design
### CredentialProvider Trait
```rust
pub trait CredentialProvider: Send + Sync + 'static {
    fn get_credentials(&self, service: &str) -> Option<CredentialSet>;
    fn refresh_credentials(&self, service: &str) -> Option<CredentialSet>;
}
```
This is intentionally narrow. It returns credentials for a named service. It does not try to abstract the auth mechanism itself — that stays with the service wrapper that knows the protocol.
### CredentialSet
```rust
use std::collections::HashMap;

pub enum CredentialSet {
    ApiKey {
        header_name: String,
        token: String,
    },
    Basic {
        username: String,
        password: String,
    },
    Bearer {
        token: String,
    },
    S3AccessKey {
        access_key: String,
        secret_key: String,
        session_token: Option<String>,
    },
    OidcToken {
        access_token: String,
        refresh_token: Option<String>,
        expires_at: Option<u64>,
    },
    Custom {
        scheme: String,
        params: HashMap<String, String>,
    },
}
```
Each variant carries the data needed for a specific auth mechanism. The service wrapper that requested the credentials knows what variant it expects and how to use it — the `OpenAPIServiceRegistry` knows it needs a `Bearer` or `ApiKey`, the rustfs S3 wrapper knows it needs `S3AccessKey` for request signing.
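To make the "wrapper knows its variant" idea concrete, here is a minimal sketch of how a wrapper might turn a `CredentialSet` into a static HTTP header. The helper `static_auth_header` is hypothetical and not part of any alknet crate. Treating an OIDC access token as a bearer token is an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Mirror of the `CredentialSet` enum defined above.
#[derive(Debug, Clone, PartialEq)]
pub enum CredentialSet {
    ApiKey { header_name: String, token: String },
    Basic { username: String, password: String },
    Bearer { token: String },
    S3AccessKey { access_key: String, secret_key: String, session_token: Option<String> },
    OidcToken { access_token: String, refresh_token: Option<String>, expires_at: Option<u64> },
    Custom { scheme: String, params: HashMap<String, String> },
}

/// Hypothetical helper: returns the (header name, header value) pair for
/// variants that map directly onto a single static header.
pub fn static_auth_header(creds: &CredentialSet) -> Option<(String, String)> {
    match creds {
        CredentialSet::Bearer { token } => {
            Some(("Authorization".to_string(), format!("Bearer {token}")))
        }
        // Assumption: the OIDC access token is presented as a bearer token.
        CredentialSet::OidcToken { access_token, .. } => {
            Some(("Authorization".to_string(), format!("Bearer {access_token}")))
        }
        CredentialSet::ApiKey { header_name, token } => {
            Some((header_name.clone(), token.clone()))
        }
        // Basic needs base64 encoding, S3AccessKey needs per-request signing,
        // and Custom is wrapper-specific, so none reduce to a static header here.
        CredentialSet::Basic { .. }
        | CredentialSet::S3AccessKey { .. }
        | CredentialSet::Custom { .. } => None,
    }
}
```

Returning `None` for `S3AccessKey` reflects the point made earlier: the key pair is an input to request signing, not a header value.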
### CredentialProvider vs IdentityProvider
These are opposite-direction abstractions:
| | IdentityProvider | CredentialProvider |
|---|---|---|
| Direction | Inbound (who is calling alknet) | Outbound (how alknet calls others) |
| Resolves | Fingerprint/token → Identity | Service name → CredentialSet |
| Storage | `peer_credentials`, `api_keys` tables (alknet-storage) | Encrypted nodes in metagraph (via SecretProtocol) |
| Lifecycle | Stateless lookup | May need refresh (OIDC tokens, S3 sessions) |
| Location | `alknet_core::auth` | `alknet_core::credentials` |
Both live at the same architectural layer. A service handler receives an `OperationContext` with `identity` (who called us) and access to credentials through `context.env`. The handler doesn't interact with `CredentialProvider` directly in the common case — the service initialization code does, when setting up the HTTP client or SDK wrapper.
### Accounts: Storage-Layer Concern, Not Core
The `Identity` struct in core (`{ id, scopes, resources }`) does not need an explicit `account_id` field. In config-based auth (`ConfigIdentityProvider`), `id` is the SSH key fingerprint. In database-backed auth (`StorageIdentityProvider`), `id` is the account UUID. The account concept is an implementation detail of `StorageIdentityProvider` — it resolves `peer_credentials.fingerprint → account_id → Identity { id: account_uuid, ... }`. The same person authenticating via SSH key or bearer token gets the same `Identity { id: account_uuid, ... }` because both credential presentations map to the same account UUID in storage.
This means identity-bound credential lookups (e.g., "Alice's rustfs access key") use `Identity.id` (which is the account UUID in database-backed deployments) as the key — not a separate field. The call pattern is:
```rust
// Service-level credential (no identity needed):
credential_provider.get_credentials("rustfs") // shared admin key
// Identity-bound credential (uses id as account identifier):
credential_provider.get_credentials_for("rustfs", &identity.id) // per-user key
```
The `CredentialProvider` trait at core only needs the service-level method. Identity-bound lookups are an extension in alknet-storage that uses the same `Identity.id`.
### Interaction with SecretProtocol
Credentials are stored encrypted in the metagraph via the existing `SecretProtocol`:
1. At setup time, an operator configures credentials for a service (e.g., `alknet credential add vast-ai --type bearer --token-file ./key.txt`)
2. The CLI encrypts the credential via `SecretProtocol::Encrypt` (using the derived encryption key at `m/74'/2'/0'/0'`)
3. The encrypted credential is stored as an `EncryptedData` node in the metagraph, tagged with the service name
4. At startup, `SecretStoreCredentialProvider` (the default `CredentialProvider` impl) calls `SecretProtocol::Decrypt` for each configured service
5. The decrypted credentials are held in RAM with the same lifecycle as the secret service (purged on `Lock`)
```rust
pub struct SecretStoreCredentialProvider {
    credentials: ArcSwap<HashMap<String, CredentialSet>>,
    secret_client: Client<SecretProtocol>,
}

impl CredentialProvider for SecretStoreCredentialProvider {
    fn get_credentials(&self, service: &str) -> Option<CredentialSet> {
        let cache = self.credentials.load();
        cache.get(service).cloned()
    }

    fn refresh_credentials(&self, service: &str) -> Option<CredentialSet> {
        // Re-decrypt from storage — used after Lock/Unlock cycle
        // Calls secret_client.decrypt() and updates cache
        None // simplified
    }
}
```
### Interaction with OpenAPIServiceRegistry
The TypeScript `@alkdev/operations` `from_openapi.ts` defines `HTTPServiceConfig.auth`:
```typescript
auth?: {
  type: "bearer" | "apiKey" | "basic";
  token?: string;
  headerName?: string;
  prefix?: string;
};
```
The Rust port would populate this from `CredentialProvider`:
```rust
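// Pseudocode: `AuthConfig`, `HttpServiceConfig`, and `FromOpenAPI` are the
// planned Rust ports of the TypeScript types; their final shapes are not settled.
let creds = credential_provider.get_credentials("vast-ai");
let auth = match creds {
    Some(CredentialSet::Bearer { token }) => Some(AuthConfig::Bearer { token }),
    Some(CredentialSet::ApiKey { header_name, token }) => {
        Some(AuthConfig::ApiKey { header_name, token })
    }
    Some(CredentialSet::Basic { username, password }) => {
        Some(AuthConfig::Basic { username, password })
    }
    // Missing credentials or a variant the HTTP adapter cannot express: no auth config.
    _ => None,
};
let config = HttpServiceConfig {
    namespace: "vast-ai",
    base_url: "https://cloud.vast.ai/api/v1",
    auth,
    .. // remaining fields elided
};
let ops = FromOpenAPI(spec, config);
registry.register_all(ops);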
```
### Self-Hosted Services: ManagedCredentialProvider
For self-hosted services, credentials may need active lifecycle management:
**rustfs (S3)**:
- Access key + secret key are created inside rustfs IAM
- The alknet rustfs service wrapper holds the `S3AccessKey` credential set
- Each S3 request is signed using AWS Signature V4 (computed from access_key + secret_key + request details)
- Session tokens from STS-style calls have a TTL and need rotation
- Provisioning: alknet could create the rustfs access key via the rustfs admin API at first setup, then store the resulting credentials
**rustfs (OIDC)**:
- rustfs supports OIDC providers — alknet's identity system _could_ act as an OIDC provider
- This would allow alknet identities to authenticate directly to rustfs without stored credentials
- Requires: alknet running an OIDC authorization server endpoint (potentially exposed via the call protocol)
**gitea (OAuth2/OIDC)**:
- Similar to rustfs OIDC — alknet could act as the OAuth2/OIDC provider
- Gitea supports reverse proxy auth (SSO via headers) — if alknet sits in front as a reverse proxy, it can inject auth headers
- Gitea also has its own API token system — simpler case, just store the token
**ManagedCredentialProvider** wraps these cases:
```rust
pub struct ManagedCredentialProvider {
    base: SecretStoreCredentialProvider,
    managers: HashMap<String, Arc<dyn CredentialManager>>,
}

pub trait CredentialManager: Send + Sync + 'static {
    fn refresh(&self, current: &CredentialSet) -> Option<CredentialSet>;
    fn is_expired(&self, current: &CredentialSet) -> bool;
    fn provision(&self, identity: &Identity) -> Option<CredentialSet>;
}
```
- `refresh`: For OIDC token refresh, S3 session token rotation
- `is_expired`: Check TTL on tokens before use
- `provision`: Create credentials on a self-hosted service for a given alknet identity (e.g., create a rustfs access key for a new user)
### Identity-Bound Credentials
For self-hosted services where alknet manages the user accounts, there's a higher-order pattern:
1. An alknet `Identity` (resolved by `IdentityProvider`) needs access to a self-hosted service
2. `ManagedCredentialProvider::provision(identity)` creates the corresponding account on the external service
3. The resulting credentials are stored and associated with the alknet identity in the metagraph
4. When the identity makes a call through the operation registry, the handler can resolve their service-specific credentials using `Identity.id` as the account key
This bridges `IdentityProvider` and `CredentialProvider`:
```
IdentityProvider: who is this user? → Identity
CredentialProvider: how do we talk to service X? → CredentialSet
Identity-bound: how does THIS user talk to service X? → CredentialSet (scoped to Identity.id)
```
The identity-bound case is important for multi-tenant self-hosted setups where different alknet users should have different access levels on rustfs or gitea. It can be deferred initially — Phase A only needs service-level credentials.
## Architectural Position
### Where CredentialProvider lives
`CredentialProvider` and `CredentialSet` are core types, analogous to `IdentityProvider` and `Identity`. They live in `alknet_core::credentials`.
Like `IdentityProvider`:
- The trait is in alknet-core
- The default impl (`SecretStoreCredentialProvider`) uses the secret service + metagraph
- Production impls (`ManagedCredentialProvider`) may live in alknet-storage or application crates
- The CLI/NAPI assembly wires the concrete impl
- Core does not depend on any storage system
### Dependencies
```
alknet-core (CredentialProvider trait, CredentialSet enum)
alknet-secret (SecretStoreCredentialProvider reads from SecretProtocol::Decrypt)
Application crates (rustfs wrapper, gitea wrapper, etc.)
```
`CredentialProvider` does not depend on `IdentityProvider`, but `ManagedCredentialProvider` may use `Identity.id` to resolve identity-bound credentials.
### Relationship to existing specs
| Existing concept | Relationship |
|---|---|
| `IdentityProvider` | Opposite direction. Identity is inbound auth. Credential is outbound auth. |
| `SecretProtocol` | Stores and retrieves encrypted credentials. `SecretStoreCredentialProvider` is a consumer of `SecretProtocol::Decrypt`. |
| `OperationEnv` | Service init code uses `CredentialProvider` to configure `HTTPServiceConfig.auth`. Handlers call operations through `env`. |
| `OpenAPIServiceRegistry` | Consumer of `CredentialProvider` — populates `auth` config from credential lookup. |
| `EncryptedData` | Wire format for stored credentials. Compatible with existing `EncryptedDataSchema` from `@alkdev/storage`. |
| `Identity.id` | In database-backed deployments, serves as the account UUID for identity-bound credential lookups. No separate `account_id` field needed — `id` IS the account identifier. |
### Account management is storage-layer, not core
The `AccountService` irpc protocol (CRUD for accounts and credential associations) lives in alknet-storage, not core. This follows the same pattern as `ConfigService`:
- Core has the read trait (`IdentityProvider`, `CredentialProvider`)
- Storage has the management service (`AccountProtocol`, `CredentialProtocol`)
- The CLI/NAPI assembly wires them together
The storage model for accounts:
```
accounts
├── id (UUID, primary key)
├── display_name
├── status (active, disabled)
└── default_scopes (JSON)

peer_credentials (inbound — SSH keys)
├── account_id → accounts.id
├── fingerprint (SHA-256 of public key)
├── public_key_data
└── scopes_override (JSON, null = use account default)

api_keys (inbound — bearer tokens)
├── account_id → accounts.id
├── key_prefix (first 8 chars, for lookup)
├── key_hash (SHA-256 of full key)
├── scopes (JSON)
└── expires_at

service_credentials (outbound — for external services)
├── id (UUID)
├── account_id → accounts.id (NULL = shared/service-level)
├── service_name
├── credential_type
├── encrypted_data → EncryptedData
├── metadata (JSON)
└── expires_at
```
`StorageIdentityProvider` queries `peer_credentials` → `accounts` and `api_keys` → `accounts` to resolve any inbound credential to the same `Identity { id: account_uuid, ... }`. `StorageCredentialProvider` queries `service_credentials` and decrypts via `SecretProtocol` to resolve outbound credentials.
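The two-hop resolution can be sketched in memory, with hash maps standing in for the tables. Everything here is illustrative: `InMemoryIdentityStore`, the `Vec<String>` field types, and the rule that disabled accounts resolve to nothing are assumptions of this sketch, and `scopes_override` is ignored for brevity.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String, // account UUID in database-backed mode
    pub scopes: Vec<String>,
    pub resources: Vec<String>,
}

#[derive(Debug, Clone)]
pub struct Account {
    pub default_scopes: Vec<String>,
    pub active: bool,
}

/// Hypothetical stand-in for the `accounts`, `peer_credentials`, and `api_keys` tables.
#[derive(Debug, Default)]
pub struct InMemoryIdentityStore {
    pub accounts: HashMap<String, Account>,        // account id -> account
    pub peer_credentials: HashMap<String, String>, // fingerprint -> account id
    pub api_keys: HashMap<String, String>,         // key hash -> account id
}

impl InMemoryIdentityStore {
    /// Second hop: account id -> Identity. Assumes disabled accounts do not resolve.
    fn identity_for(&self, account_id: &str) -> Option<Identity> {
        let account = self.accounts.get(account_id)?;
        if !account.active {
            return None;
        }
        Some(Identity {
            id: account_id.to_string(),
            scopes: account.default_scopes.clone(),
            resources: Vec::new(),
        })
    }

    /// First hop for SSH keys: fingerprint -> account id.
    pub fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        let account_id = self.peer_credentials.get(fingerprint)?;
        self.identity_for(account_id)
    }

    /// First hop for API keys: key hash -> account id.
    pub fn resolve_from_key_hash(&self, key_hash: &str) -> Option<Identity> {
        let account_id = self.api_keys.get(key_hash)?;
        self.identity_for(account_id)
    }
}
```

Because both entry points funnel through `identity_for`, an SSH key and an API key attached to the same account produce equal `Identity` values.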
## Implementation Phases
### Phase A: Core types and simple credential storage
Define the trait and enum in alknet-core. Implement `SecretStoreCredentialProvider` that decrypts stored credentials at startup. Wire into the service assembly (CLI). This enables static API key / bearer token patterns — sufficient for cloud API integrations.
Deliverables:
- `CredentialProvider` trait + `CredentialSet` enum in `alknet_core::credentials`
- `SecretStoreCredentialProvider` impl (reads from `SecretProtocol::Decrypt`)
- CLI command: `alknet credential add <service> --type bearer --token-file <path>`
- Credential storage in metagraph as encrypted nodes tagged by service name
Depends on: Phase 1 (OperationEnv, SecretProtocol) + alknet-secret crate existing
### Phase B: OpenAPI/JSON Schema auto-registration
Port `FromOpenAPI` and `OpenAPIServiceRegistry` from the TypeScript `@alkdev/operations` to Rust. Integrate with `CredentialProvider` for auth config. This enables any OpenAPI-described service to be auto-registered as a set of operations.
Deliverables:
- `alknet-openapi` or `alknet-operations-adapter` crate with `from_openapi` module
- `FromOpenAPI(spec, config) -> Vec<(OperationSpec, Handler)>`
- `HttpServiceConfig` with auth populated from `CredentialProvider`
- `OpenAPIServiceRegistry::register_all(registry)` port
Depends on: Phase A + existing `OperationRegistry`
### Phase C: Managed credentials and self-hosted auth
Add `ManagedCredentialProvider` with `CredentialManager` trait. Implement S3 signing for rustfs. Implement OIDC token refresh. Enable identity-bound credential provisioning.
Deliverables:
- `CredentialManager` trait
- `ManagedCredentialProvider` impl
- S3CredentialManager (request signing, session token rotation)
- OidcCredentialManager (token refresh, PKCE flow)
- Identity-bound credential resolution (uses `Identity.id` as account key)
Depends on: Phase A + alknet-storage + application-specific knowledge
### Phase D: Alknet as OIDC/OAuth2 provider
Alknet's identity system could expose an OIDC authorization server endpoint. Self-hosted services (rustfs, gitea) would be configured to use alknet as their OIDC provider. This eliminates stored credential management entirely for the OIDC path — users authenticate directly through alknet's existing identity.
This is the most complex but also the most elegant path for self-hosted services. It makes alknet the identity backbone of the entire self-hosted stack.
Deliverables:
- OIDC authorization server operations (authorize, token, userinfo, jwks)
- Exposed via call protocol and/or HTTP adapter
- Configuration for rustfs/gitea to use alknet as OIDC provider
- Identity mapping: alknet Identity scopes → rustfs/gitea policies
Depends on: Phase C + call protocol HTTP or web adapter + significant R&D
## Analysis of Self-Hosted Auth Mechanisms
### rustfs
**S3 access key/secret key**:
- rustfs IAM manages users, groups, policies, and service accounts
- Credentials are `access_key` + `secret_key` pairs (S3 standard)
- Auth uses AWS Signature V4: HMAC-SHA256 of request details using the secret key
- Session tokens (from STS AssumeRole-style flows) are JWTs with claims including policy
- Access keys are created via the rustfs admin API or UI
**OIDC**:
- Full OpenID Connect support with PKCE
- Uses the `openidconnect` Rust crate for standards compliance
- Supports discovery, token exchange, ID token verification
- OIDC users are mapped to rustfs policies via claims
**Integration path**:
- Minimal: Store access key + secret key as `CredentialSet::S3AccessKey`, use for request signing
- Better: alknet as OIDC provider → no stored credentials, direct identity mapping
- Best: Phase D path where rustfs trusts alknet as its identity provider
### gitea
**Auth options**:
- OAuth2 provider (gitea can act as OAuth2 provider for other apps)
- OIDC client (gitea can delegate auth to an external OIDC provider — alknet in Phase D)
- Reverse proxy auth (SSO via HTTP headers — alknet injects `X-WebAuth-User` as a reverse proxy)
- API tokens (personal access tokens, scoped, with TTL)
- SSH keys (for git operations, separate from API auth)
**Integration path**:
- Minimal: Store gitea API token as `CredentialSet::Bearer`
- Intermediate: If alknet runs as a reverse proxy in front of gitea, inject auth headers
- Best: alknet as OIDC provider for gitea
### General pattern
For both rustfs and gitea, the auth integration follows the same progression:
1. **Static credentials** (Phase A): Store API keys/tokens, decrypt at startup. Simple, works for single-user or admin-only access.
2. **Dynamic credentials** (Phase C): Managed credential lifecycle — token refresh, session rotation. Needed for production.
3. **Identity federation** (Phase D): Alknet acts as the identity provider. No stored service credentials. Users authenticate through alknet and their identity (scopes, resources) maps to the external service's policy model. Most secure, most complex.
Phase D is not required to start building service wrappers. Phases A and C are sufficient for functional integrations. Phase D is a quality-of-life and security improvement that becomes important in multi-user self-hosted deployments.
## Open Questions
### OQ-CP-01: Should CredentialProvider support per-identity credentials?
That is, should the trait be `get_credentials(service, identity)` instead of `get_credentials(service)`?
Pro: Enables multi-tenant self-hosted services where different alknet users have different access.
Con: More complex, and the identity resolution can be done by the service wrapper itself by looking up identity-bound credentials from the metagraph.
Recommendation: Start with service-level credentials. Add identity-level resolution as a second method (`get_credentials_for(service, account_id)`) when the need is concrete. Since `Identity.id` already serves as the account UUID in database-backed mode, there's no need for a separate `account_id` field.
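A minimal sketch of how the two methods could coexist, assuming rows are keyed by service name plus an optional account id (mirroring `service_credentials.account_id`, where NULL means shared). The `ServiceCredentials` type is hypothetical, and falling back to the shared credential when no per-account row exists is an assumption the source does not decide.

```rust
use std::collections::HashMap;

/// Trimmed to one variant to keep the sketch short.
#[derive(Debug, Clone, PartialEq)]
pub enum CredentialSet {
    Bearer { token: String },
}

/// Hypothetical store keyed by (service name, account id).
/// A `None` account id marks the shared, service-level row.
#[derive(Debug, Default)]
pub struct ServiceCredentials {
    pub rows: HashMap<(String, Option<String>), CredentialSet>,
}

impl ServiceCredentials {
    /// Service-level lookup: the only method the core trait needs.
    pub fn get_credentials(&self, service: &str) -> Option<CredentialSet> {
        self.rows.get(&(service.to_string(), None)).cloned()
    }

    /// Identity-bound lookup, keyed by `Identity.id`.
    /// Assumption: fall back to the shared credential if no per-account row exists.
    pub fn get_credentials_for(&self, service: &str, account_id: &str) -> Option<CredentialSet> {
        self.rows
            .get(&(service.to_string(), Some(account_id.to_string())))
            .cloned()
            .or_else(|| self.get_credentials(service))
    }
}
```

Whether the fallback is desirable depends on the deployment; a strict multi-tenant setup might prefer returning `None` instead of the shared key.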
### OQ-CP-02: Where should the OIDC provider operations live?
If alknet becomes an OIDC provider (Phase D), the authorization server endpoints need to live somewhere. Options:
1. In alknet-core behind a feature flag (like auth service)
2. In a new `alknet-oidc` crate
3. As an application service registered in the operation registry
Recommendation: Application service (option 3). OIDC is an application concern, not a core concern. The call protocol and `OperationRegistry` provide the transport; OIDC is just another set of operations.
### OQ-CP-03: How do credential rotations propagate across a cluster?
If a credential is rotated (e.g., S3 session token refreshed on the head node), how do worker nodes get the updated credential? Options:
1. Workers request fresh credentials on each use (always current, more secret service calls)
2. Push notification via honker stream (efficient, but adds cross-service event coupling)
3. Workers cache with TTL (simple, may briefly use stale credentials)
Recommendation: TTL-based caching with a refresh threshold. Workers call `CredentialProvider::get_credentials()` which checks `is_expired()` and calls `refresh_credentials()` if needed. The TTL is per-credential-type (e.g., 1 hour for S3 session tokens, no TTL for static API keys).
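The refresh decision itself is small. The sketch below assumes `expires_at` and `now` are Unix timestamps in seconds (the source does not state the unit) and that credentials without an expiry are never refreshed; `needs_refresh` is a hypothetical helper name.

```rust
/// Hypothetical refresh decision for a cached credential.
/// `expires_at` and `now` are assumed to be Unix timestamps in seconds.
pub fn needs_refresh(expires_at: Option<u64>, now: u64, refresh_threshold_secs: u64) -> bool {
    match expires_at {
        // Static credentials (e.g. plain API keys) carry no expiry.
        None => false,
        // Refresh once we are inside the threshold window before expiry,
        // or already past it. `saturating_add` avoids overflow near u64::MAX.
        Some(exp) => now.saturating_add(refresh_threshold_secs) >= exp,
    }
}
```

A worker would call this before each use and invoke `refresh_credentials()` when it returns `true`, so a token is renewed slightly before it expires, not after a request has already failed.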
### OQ-CP-04: Should CredentialSet include request-signing capability?
For S3 auth, the credential set contains `access_key + secret_key`, but the actual HTTP request signing (AWS Signature V4) is a separate computation. Should `CredentialSet::S3AccessKey` include a signing method?
Recommendation: Keep `CredentialSet` as pure data. Add a separate `s3_sign(credential: &S3AccessKey, request: &HttpRequest) -> SignedRequest` utility function in the service wrapper or a shared `alknet-s3` utility crate. The `OpenAPIServiceRegistry` pattern already separates credentials from HTTP client behavior; signing is client behavior.
### OQ-CP-05: How does this relate to the HTTP service / AI SDK port?
The AI SDK port provides HTTP infrastructure (streaming, retries, SSE parsing, error handling). The `CredentialProvider` provides the auth config that the HTTP client consumes. They're separate concerns that compose: the HTTP service uses `CredentialProvider` to populate `auth` headers/tokens on outgoing requests, just as `OpenAPIServiceRegistry` does. The AI SDK's provider codegen (which would be replaced with the operation pattern) currently hardcodes auth per provider; `CredentialProvider` makes it dynamic and centrally managed instead.
## References
- [identity.md](../../architecture/identity.md) — IdentityProvider trait, Identity struct
- [secret-service.md](../../architecture/secret-service.md) — SecretProtocol, EncryptedData
- [services.md](../../architecture/services.md) — OperationEnv, OperationRegistry, service composition
- [call-protocol.md](../../architecture/call-protocol.md) — OperationEnv three dispatch paths
- [integration-plan.md](../integration-plan.md) — Phase structure, OperationEnv wiring
- [@alkdev/operations/src/from_openapi.ts](../../../@alkdev/operations/src/from_openapi.ts) — OpenAPIServiceRegistry, HTTPServiceConfig.auth


@@ -1,549 +0,0 @@
# Definitions: Terminology Disambiguation and Concept Mapping
> Status: Research / Draft
> Last updated: 2026-06-08
> Part of: Phase 2 planning
## Purpose
Multiple terms are overloaded across alknet's architecture, OpenStack's identity model, and the distributed systems/git space. This document disambiguates each term, maps equivalent concepts across domains, and identifies open questions that need resolution before updating architecture specs.
The architecture docs (interface.md, auth.md, services.md) reflect a pre-Phase-0/1 state. This document exists to untangle conceptual knots before editing those specs.
---
## Term Definitions
### Interface (alknet Layer 2)
**Definition**: An interface consumes a byte stream from a Transport (Layer 1) and produces call protocol sessions or handles discrete requests. It is a _protocol parser_, not a network service.
**Subtypes**:
| Subtype | Trait | Lifecycle | Transport ownership | Examples |
|---|---|---|---|---|
| `StreamInterface` | `StreamInterface::accept(stream) -> Session` | Long-lived session | Provided by caller | SshInterface, RawFramingInterface |
| `MessageInterface` | `MessageInterface::handle_request(req) -> Response` | Stateless per-request | Self-managed | HttpInterface, DnsInterface, WebSocketInterface |
**Not to be confused with**: A "service interface" (API surface of a service), a Rust trait (also called an interface generically), or an "interface" in the OpenStack sense (a network endpoint).
**Source**: [interface-model.md](interface-model.md)
---
### Transport (alknet Layer 1)
**Definition**: A transport produces a byte stream (`AsyncRead + AsyncWrite + Unpin + Send`) or a datagram channel. It is a _wire mechanism_, not a protocol. Transports are listed in `TransportKind`: TCP, TLS, iroh (QUIC), WebTransport.
**Not to be confused with**: The HTTP transport (which is a transport+interface combined in a `MessageInterface`), or the DNS "transport" (which was removed from `TransportKind` because DNS is a `MessageInterface`).
**Key constraint**: A connection is always a (Transport, StreamInterface) pair for stream-based connections. `MessageInterface` implementations manage their own transport internally.
**Source**: [tls-transport.md](tls-transport.md), [interface-model.md](interface-model.md)
---
### Service (irpc service)
**Definition**: An in-cluster Rust-to-Rust service defined by an irpc protocol enum. Services are dispatched by enum variant and use postcard serialization. They run within a node or cluster and are synchronous request-response.
**Examples**: `AuthProtocol`, `SecretProtocol`, `ConfigProtocol`, `StorageProtocol`.
**Not to be confused with**: A call protocol operation (path-based, JSON, cross-node), an external service (a third-party endpoint reachable via HTTP/call protocol), or an application service (DockerService, GitService — an operation-registered handler).
**Architecture position**: irpc services are _one dispatch backend_ for OperationEnv, not a replacement for it.
**Source**: [integration-plan.md](../integration-plan.md), Inconsistencies section item 3.
---
### Operation (call protocol)
**Definition**: A path-based handler registered in the `OperationRegistry`, dispatched by namespace + name (e.g., `/head/auth/verify`). Operations are cross-node, cross-language, and use JSON `EventEnvelope` frames.
**Not to be confused with**: An irpc service method (which is dispatched by enum variant, not path), or an OpenStack operation (which is a REST API verb).
**Architecture position**: Operations are the universal composition unit. All interfaces (SSH, HTTP, DNS, WebSocket, MCP) resolve to the same operation invocations through `OperationEnv`.
**Source**: [integration-plan.md](../integration-plan.md), ADR-033.
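A minimal sketch of namespace + name dispatch, using a nested map as a stand-in for the registry. The `Handler` signature (string in, string out in place of JSON payloads) and the `register`/`invoke` method names are assumptions for illustration, not the actual `OperationRegistry` API; splitting a path such as `/head/auth/verify` into its parts is not shown.

```rust
use std::collections::HashMap;

/// Assumed handler shape: input payload -> output payload (strings stand in for JSON).
type Handler = fn(&str) -> String;

/// Hypothetical registry: namespace -> operation name -> handler.
#[derive(Default)]
pub struct OperationRegistry {
    namespaces: HashMap<String, HashMap<String, Handler>>,
}

impl OperationRegistry {
    pub fn register(&mut self, namespace: &str, name: &str, handler: Handler) {
        self.namespaces
            .entry(namespace.to_string())
            .or_default()
            .insert(name.to_string(), handler);
    }

    /// Returns `None` when the namespace or the operation is unknown.
    pub fn invoke(&self, namespace: &str, name: &str, input: &str) -> Option<String> {
        let handler = self.namespaces.get(namespace)?.get(name)?;
        Some(handler(input))
    }
}

/// Example handler used only for illustration.
pub fn verify(input: &str) -> String {
    format!("verified:{input}")
}
```

Every interface would resolve to the same `invoke` call, which is what makes operations the shared composition unit regardless of how the request arrived.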
---
### External Service
**Definition**: Any endpoint reachable via the call protocol from another node or over an interface — an HTTP API (vast.ai), another alknet head node, rustfs, gitea. External services are _consumed_ by alknet, not part of it.
**Examples**: vast.ai cloud API, runpod API, any OpenAPI-described endpoint consumed by `OpenAPIServiceRegistry`.
**Not to be confused with**: An irpc service (internal), or an application service (handler within alknet).
---
### Application Service
**Definition**: A handler registered with the `OperationRegistry` that provides application-level functionality. Application services are pluggable, don't change core, and register operations like any other handler.
**Examples**: DockerService, NodeService, GitService, RustfsService.
**Not to be confused with**: An irpc service (which is a dispatch mechanism, not a handler), or an external service (outside the cluster).
---
### Identity (alknet core type)
**Definition**: A struct `{ id, scopes, resources }` that represents an authenticated principal. Produced by `IdentityProvider::resolve_from_fingerprint()` or `IdentityProvider::resolve_from_token()`. The same person connecting via SSH key or API token resolves to the same `Identity` (same `id` in database-backed deployments).
**Mapping to other domains**:
| alknet Concept | OpenStack Keystone | Distributed Git |
|---|---|---|
| `Identity.id` (fingerprint or UUID) | User ID | Radicle DID / on-chain address |
| `Identity.scopes` | Role assignments on a project/domain | Repository ACL entries |
| `Identity.resources` | Service catalog endpoints | Repositories accessible |
| `IdentityProvider` | Keystone identity service | On-chain registry + local cache |
**Not to be confused with**: A "user" (which is an account concept in storage), a "principal" (similar but not identical — an Identity can represent a service account or API key).
**Source**: [identity.md](../../architecture/identity.md), [auth.md](../../architecture/auth.md), ADR-029.
---
### IdentityProvider (alknet trait)
**Definition**: A trait in `alknet_core::auth` with two methods: `resolve_from_fingerprint()` (SSH key auth) and `resolve_from_token()` (bearer token auth). It resolves an inbound credential to an `Identity`.
**Implementations**: `ConfigIdentityProvider` (ArcSwap-backed, minimal), `StorageIdentityProvider` (SQLite-backed, production). Future possibility: `OnChainIdentityProvider` (smart contract + local cache).
**Direction**: Inbound (who is calling alknet).
**Not to be confused with**: `CredentialProvider` (outbound — how alknet authenticates TO external services), or an OpenStack Keystone "identity provider" which is a federation concept.
---
### CredentialProvider (alknet trait)
**Definition**: A trait in `alknet_core::credentials` that resolves outbound credentials. `get_credentials(service) -> Option<CredentialSet>`. It answers: "how does alknet authenticate to service X?"
**Direction**: Outbound (how alknet calls external services).
**Mapping**: Rustfs credentials (S3AccessKey), gitea tokens (Bearer), OIDC tokens (OidcToken), API keys (ApiKey).
**Not to be confused with**: `IdentityProvider` (inbound auth resolution).
**Source**: [credential-provider.md](credential-provider.md)
---
### AuthToken (alknet wire format)
**Definition**: `base64url(key_id || timestamp || signature)` — an Ed25519-signed timestamp token used for non-SSH auth (HTTP, DNS, WebTransport, WebSocket).
**Mapping to other domains**:
| alknet | OpenStack Keystone | Description |
|---|---|---|
| AuthToken | Keystone token (X-Auth-Token) | Proof of identity carried in a request |
| AuthToken (Ed25519 signed) | Keystone token (scoped, with catalog) | Keystone tokens carry more metadata (catalog, scope); alknet tokens are minimal |
| API key (`alk_...`) | Application Credential | Password-less auth with restricted scope |
| `resolve_from_token()` | Token validation endpoint | Verify token → resolve identity |
**Key difference**: Keystone tokens are server-issued and carry scope/catalog. alknet AuthTokens are self-signed (client-generated) and carry only key_id + timestamp — scope is resolved server-side by `IdentityProvider`. This is intentional: alknet doesn't need a token issuance endpoint because tokens are self-proving.
**Source**: [auth.md](../../architecture/auth.md), ADR-023.
---
### Domain Event vs Integration Event
**Definition** (from event-sourcing/event_source_types.md):
| Type | Scope | Consumers | Serialization | Example |
|---|---|---|---|---|
| Domain Event | Within a single service boundary | Internal handlers only | Can be rich, domain-specific | `InventoryAdjusted`, `KeyRotated` |
| Integration Event | Across service boundaries | External services, other nodes | Simple, versioned, stripped of internals | `call.requested` (EventEnvelope), `UserCreated` (projected) |
**alknet mapping**:
| Boundary | Mechanism | Serialization | Scope |
|---|---|---|---|
| Within a service (e.g., AuthProtocol) | Honker streams (domain events) | Internal | Same service |
| Between services in a cluster | irpc protocol enum | postcard (binary) | Same cluster |
| Between nodes or over interfaces | Call protocol EventEnvelope | JSON | Cross-node |
**Hard constraint** (ADR-032): Domain events never cross service boundaries without projection. Integration events are the boundary contract.
**Not to be confused with**: A "call protocol event" (which IS an integration event), or a "service call" (which is synchronous, not event-based).
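
The projection rule can be illustrated with a small sketch. `AuthDomainEvent`, `IntegrationEvent`, and the event type string `auth.key_rotated` are all hypothetical; the real `EventEnvelope` has its own shape. The sketch only shows the discipline: a rich domain event is either mapped to a stripped, versioned integration event or not published at all.

```rust
/// Rich, internal event: carries details that must not leave the service.
#[derive(Debug, Clone)]
pub enum AuthDomainEvent {
    KeyRotated {
        key_id: String,
        old_fingerprint: String,
        new_fingerprint: String,
        internal_reason: String,
    },
    CacheWarmed { entries: usize },
}

/// Simplified stand-in for a boundary-crossing event (not the real EventEnvelope).
#[derive(Debug, Clone, PartialEq)]
pub struct IntegrationEvent {
    pub event_type: String,
    pub version: u32,
    pub payload: Vec<(String, String)>,
}

/// Project a domain event to an integration event, or return None when the
/// event is purely internal and should never cross the boundary.
pub fn project(event: &AuthDomainEvent) -> Option<IntegrationEvent> {
    match event {
        AuthDomainEvent::KeyRotated { key_id, new_fingerprint, .. } => Some(IntegrationEvent {
            event_type: "auth.key_rotated".to_string(),
            version: 1,
            // Only the fields external consumers need; internals are dropped.
            payload: vec![
                ("key_id".to_string(), key_id.clone()),
                ("fingerprint".to_string(), new_fingerprint.clone()),
            ],
        }),
        AuthDomainEvent::CacheWarmed { .. } => None,
    }
}
```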
---
### Scope (alknet)
**Definition**: A permission or claim attached to an `Identity`. Used by `ForwardingPolicy` and operation-level ACL. Defined as part of the `Identity` struct.
**Mapping to other domains**:
| alknet `Scope` | OpenStack Keystone | Distributed Git |
|---|---|---|
| `scopes: ["relay:connect", "secrets:derive"]` | Role assignments on a project ("member", "admin") | Write/push access to repository X |
| `resources: [...]` | Project/domain scope targets | Which repositories are accessible |
**Open question**: Should alknet adopt a richer scope model (hierarchical, like Keystone's implied roles), or keep the flat string model? See OQ-DEF-03.
---
### OperationRegistry (alknet)
**Definition**: The central registry that maps `(namespace, operation_name)` to handlers. All interfaces resolve to the same registry. The HTTP interface maps `POST /v1/{namespace}/{op}` to `registry.invoke()`. The call protocol maps `call.requested` with `operationId` to `registry.invoke()`.
**Mapping to other domains**:
| alknet | OpenStack Keystone | Description |
|---|---|---|
| OperationRegistry | Service Catalog | Both map names to endpoints; registry is programmatic, catalog is runtime-discovered |
| `FromOpenAPI` | — | Consumes an external API spec and registers operations |
| `GET /v1/schema` (proposed) | `GET /v3/auth/catalog` | Produces a spec of available operations |
**Key difference**: Keystone's catalog is per-token (scoped to the user's project). alknet's OperationRegistry is global — scope checking happens at invocation time, not discovery time.
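
A minimal sketch of the "global registry, scope checked at invocation" behaviour follows. It assumes one required scope per operation, string inputs and outputs, and a synchronous handler; the real registry presumably works with JSON values, async handlers, and an `OperationContext`. All names other than `OperationRegistry` and `invoke` are hypothetical.

```rust
use std::collections::HashMap;

pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

type Handler = Box<dyn Fn(&str) -> String + Send + Sync>;

struct Operation {
    required_scope: String,
    handler: Handler,
}

#[derive(Debug, PartialEq)]
pub enum InvokeError {
    NotFound,
    Forbidden,
}

/// Maps (namespace, operation_name) to a handler. Every caller sees the
/// same map; access is decided only when an operation is invoked.
#[derive(Default)]
pub struct OperationRegistry {
    ops: HashMap<(String, String), Operation>,
}

impl OperationRegistry {
    pub fn register<F>(&mut self, namespace: &str, op: &str, required_scope: &str, handler: F)
    where
        F: Fn(&str) -> String + Send + Sync + 'static,
    {
        self.ops.insert(
            (namespace.to_string(), op.to_string()),
            Operation { required_scope: required_scope.to_string(), handler: Box::new(handler) },
        );
    }

    pub fn invoke(
        &self,
        identity: &Identity,
        namespace: &str,
        op: &str,
        input: &str,
    ) -> Result<String, InvokeError> {
        let key = (namespace.to_string(), op.to_string());
        let operation = self.ops.get(&key).ok_or(InvokeError::NotFound)?;
        // Scope check happens here, at invocation time, not at discovery time.
        if !identity.scopes.iter().any(|s| *s == operation.required_scope) {
            return Err(InvokeError::Forbidden);
        }
        Ok((operation.handler)(input))
    }
}
```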
---
### Call Protocol (alknet Layer 3)
**Definition**: The application-level protocol that carries operations, events, and responses between nodes. Uses JSON `EventEnvelope` frames. Interface-agnostic: runs over any (Transport, StreamInterface) pair or any `MessageInterface`.
**Not to be confused with**: irpc service calls (synchronous, in-cluster, postcard), or HTTP (which is an interface that maps to call protocol operations).
---
## Concept Mapping Table
### Alknet ↔ OpenStack Keystone
| Alknet Concept | Keystone Equivalent | Notes |
|---|---|---|
| `Identity` | User + Role Assignment + Project scope | alknet is simpler; Keystone separates user/role/project |
| `Identity.id` | User ID | In storage-backed: UUID. In config-backed: key fingerprint |
| `Identity.scopes` | Role assignments | alknet uses flat strings; Keystone uses hierarchical roles |
| `Identity.resources` | Project scope + Service Catalog | Both limit what a token can access |
| `IdentityProvider` | Keystone identity service | Both resolve credentials → identity + scope |
| `AuthToken` | Keystone token (X-Auth-Token) | alknet tokens are self-signed (no issuance endpoint); Keystone tokens are server-issued |
| API key (`alk_...`) | Application Credential | Nearly identical pattern |
| `CredentialProvider` | — (no direct equivalent) | Keystone doesn't authenticate outbound; each service manages its own credentials |
| `OperationRegistry` | Service Catalog | Registry is programmatic; catalog is runtime-discovered per-token scope |
| `CredentialSet::S3AccessKey` | S3 credential (access key + secret) | Directly maps to rustfs IAM model |
| `CredentialSet::OidcToken` | Federated token | alknet Phase D: becomes OIDC provider |
| Domain events (Honker) | — | Internal event bus, no Keystone equivalent |
| Integration events (call protocol) | Keystone notifications | Both are cross-boundary, but call protocol is request/response, not pub/sub |
| Token scoping | Unscoped → scoped token flow | alknet resolves scope server-side; Keystone requires explicit scope request |
### Alknet ↔ Distributed Git / Smart Contracts
| Alknet Concept | Distributed Git Equivalent | Notes |
|---|---|---|
| `Identity.id` (Ed25519 fingerprint) | Radicle DID (Ed25519 pubkey hash) | Both use Ed25519; alknet uses SLIP-0010 derivation |
| `Identity.scopes` | Repository ACL entries | Smart contract: NFT ownership → write permission |
| `IdentityProvider` | On-chain identity registry | alknet: local/DB lookup. Distributed: on-chain verification + local cache |
| `CredentialSet` | Git push credentials | ssh-key for SSH git, token for HTTPS git |
| Call protocol (integration events) | Gossip protocol (Radicle) | Both are cross-node; call protocol is point-to-point, gossip is epidemic |
| `OperationRegistry` | Replicator registry (on-chain) | Both map names to endpoints/operations |
| Domain events (Honker) | Git ref updates (internal) | Internal to the git service boundary |
| Seed derivation (BIP39) | Ethereum private key | Both derive multiple keys from one seed; different curves (Ed25519 vs secp256k1) |
| SecretProtocol key paths | — | alknet's `m/74'/0'/0'/0'` for Ed25519 identity; `m/44'/60'/0'/0/0` for Ethereum signing |
### Alknet ↔ Rustfs Auth Integration
| Alknet Concept | Rustfs Equivalent | Integration Path |
|---|---|---|
| `IdentityProvider` (inbound) | Rustfs IAM / Keystone auth | Phase D: alknet as OIDC provider → rustfs accepts alknet tokens |
| `CredentialSet::S3AccessKey` | Rustfs access key + secret key | Phase A: static credentials; Phase C: per-identity provisioned keys |
| `CredentialProvider` (outbound) | Rustfs admin API (key provisioning) | Phase C: `ManagedCredentialProvider` provisions rustfs keys |
| `Identity.scopes` | Rustfs IAM policy | Phase D: scope → OIDC claim → policy mapping |
| HTTP MessageInterface | Rustfs S3 API (port 9000) | Rustfs sits behind alknet's HTTP router or sidecar |
| OperationRegistry | — | Git service maps `git.clone`, `git.push`, etc. to operations |
---
## Overloaded Terms: Disambiguation
### "Service" — Three Meanings
| Context | Meaning | Example | Architecture Layer |
|---|---|---|---|
| alknet irpc service | In-cluster Rust-to-Rust protocol enum | `AuthProtocol`, `SecretProtocol` | Layer 3 (internal) |
| alknet application service | Operation-registered handler | `GitService`, `RustfsService` | Layer 3 (handler) |
| External service | Third-party endpoint consumed by alknet | `vast.ai`, `rustfs` instance | Outside alknet (consumed via OperationEnv) |
**Rule**: When ambiguity is possible, use the full qualifier: "irpc service", "application service", or "external service". The bare word "service" should be avoided in architecture docs.
### "Interface" — Three Meanings
| Context | Meaning | Example |
|---|---|---|
| alknet Layer 2 | A protocol parser that consumes Transport streams or handles discrete requests | `SshInterface`, `HttpInterface`, `DnsInterface` |
| Rust/generic | A trait definition | `IdentityProvider`, `CredentialProvider` |
| OpenStack/generic | A network endpoint (URL) for a service | Keystone's public/internal/admin interfaces |
**Rule**: In alknet architecture docs, "Interface" (capitalized) refers to Layer 2. "trait" or "contract" should be used for Rust trait definitions. "endpoint" should be used for network URLs.
### "Token" — Three Meanings
| Context | Meaning | Structure |
|---|---|---|
| AuthToken (alknet) | Self-signed Ed25519 timestamp | `base64url(key_id \|\| timestamp \|\| sig)` |
| API key (alknet) | Hash-verified bearer string | `alk_...` prefix, SHA-256 hash verification |
| Keystone token | Server-issued scoped token | UUID or JWT, carries catalog and scope |
**Rule**: "AuthToken" refers to alknet's self-signed token. "API key" refers to the hash-verified bearer format. "Keystone token" when referring to OpenStack. Never use bare "token" in architecture docs.
### "Identity" — Two Meanings
| Context | Meaning |
|---|---|
| alknet `Identity` struct | `{ id, scopes, resources }` — the authenticated principal |
| OpenStack Identity (Keystone) | The entire identity management SERVICE, including users, projects, roles, tokens, catalog |
**Rule**: "Identity" (capitalized, code font) = alknet struct. "Keystone" or "identity service" = OpenStack concept.
### "Domain" — Two Meanings in Event Sourcing
| Context | Meaning |
|---|---|
| Domain Event | An event within a single service boundary (e.g., `KeyRotated` within AuthProtocol) |
| DNS Domain | A domain name in DNS queries/records |
These are unrelated. "Domain event" is from DDD. "DNS domain" is from networking. Context should always make it clear, but if there's any chance of confusion, use "bounded-context event" instead of "domain event".
---
## Architectural Patterns: Cross-Domain Comparison
### Pattern: Inbound Auth → Outbound Credentials
```
┌────────────────────────────────────────────────────────────────┐
│  Incoming Request                                              │
│       │                                                        │
│       ▼                                                        │
│  IdentityProvider                                              │
│  (credential → Identity)                                       │
│       │                                                        │
│       ├── SSH fingerprint  → Identity.id, .scopes, .resources  │
│       ├── Bearer AuthToken → Identity.id, .scopes, .resources  │
│       └── API key          → Identity.id, .scopes, .resources  │
│       │                                                        │
│       ▼                                                        │
│  OperationContext { identity, env, ... }                       │
│       │                                                        │
│       ├── context.env.invoke("git", "push", input)             │
│       │     └── GitService handler                             │
│       │           └── CredentialProvider                       │
│       │                 └── get_credentials("rustfs")          │
│       │                       └── S3AccessKey { access_key,    │
│       │                                         secret_key }   │
│       │                                                        │
│       └── context.env.invoke("secrets", "derive", input)       │
│             └── local dispatch to SecretProtocol               │
│                                                                │
│  Two directions: Inbound (who is calling us)                   │
│                  Outbound (how we call others)                 │
└────────────────────────────────────────────────────────────────┘
```
### Pattern: Scope Resolution Across Systems
| System | Scope Source | Scope Shape | Scope Check Location |
|---|---|---|---|
| alknet (current) | `IdentityProvider` | Flat strings `["relay:connect"]` | Handler invocation |
| Keystone | Role assignment on project | Hierarchical roles with implied roles | Policy engine per service |
| Rustfs IAM | Policy document attached to user | JSON policy with actions/resources | Request evaluation |
| Smart contract ACL | NFT ownership + on-chain mapping | Address → repo → permission level | On-chain verification + local cache |
| Radicle | Local config | Pubkey → repo → permission | Pre-receive hook |
**Open question**: Should alknet adopt hierarchical implied roles (Keystone pattern) or stay with flat scopes and let individual services interpret them?
### Pattern: Token Self-Proving vs Server-Issued
| Property | alknet AuthToken | Keystone Token | API Key |
|---|---|---|---|
| Issued by | Client (self-signed) | Server (Keystone) | Admin (config or DB) |
| Carries | key_id + timestamp + signature | User ID, scope, catalog, expiry | Prefix + hash |
| Verified by | Ed25519 signature check | Server lookup (database or JWT) | SHA-256 hash check |
| Revocation | Key removal from `authorized_keys` | Token revocation list or JWT `jti` | DB deletion |
| Scope resolution | Server-side (IdentityProvider) | Embedded in token | Server-side (DB lookup) |
| Replay protection | Timestamp window (±300s) | Token TTL + server validation | N/A (stateless) |
alknet's self-proving model avoids the need for a token issuance endpoint. This is a deliberate trade-off: simpler at the cost of no server-side session state. For replay protection beyond the timestamp window, future work could add nonce challenge-response (ADR-023).
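
The server-side handling of a decoded AuthToken can be sketched as below. The field widths are assumptions: this document only gives the order `key_id || timestamp || signature`, so the 32-byte `key_id` and the 8-byte big-endian Unix-seconds timestamp are hypothetical (the 64-byte length is that of an Ed25519 signature). Base64url decoding and the signature check itself are omitted; only the split and the ±300s window from the table are shown.

```rust
const KEY_ID_LEN: usize = 32; // assumption, not specified in this document
const TIMESTAMP_LEN: usize = 8; // assumption: big-endian u64, Unix seconds
const SIGNATURE_LEN: usize = 64; // Ed25519 signatures are 64 bytes
const REPLAY_WINDOW_SECS: u64 = 300;

#[derive(Debug, PartialEq)]
pub struct RawAuthToken<'a> {
    pub key_id: &'a [u8],
    pub timestamp: u64,
    pub signature: &'a [u8],
}

#[derive(Debug, PartialEq)]
pub enum TokenError {
    BadLength,
    OutsideWindow,
}

/// Split already-decoded token bytes into key_id, timestamp, and signature.
pub fn split_token(bytes: &[u8]) -> Result<RawAuthToken<'_>, TokenError> {
    if bytes.len() != KEY_ID_LEN + TIMESTAMP_LEN + SIGNATURE_LEN {
        return Err(TokenError::BadLength);
    }
    let (key_id, rest) = bytes.split_at(KEY_ID_LEN);
    let (ts, signature) = rest.split_at(TIMESTAMP_LEN);
    let mut buf = [0u8; TIMESTAMP_LEN];
    buf.copy_from_slice(ts);
    Ok(RawAuthToken { key_id, timestamp: u64::from_be_bytes(buf), signature })
}

/// Accept the token only if its timestamp is within the replay window of
/// the server's clock, in either direction.
pub fn check_window(token_ts: u64, now: u64) -> Result<(), TokenError> {
    if now.abs_diff(token_ts) <= REPLAY_WINDOW_SECS {
        Ok(())
    } else {
        Err(TokenError::OutsideWindow)
    }
}
```

Passing `now` as a parameter rather than reading the clock inside `check_window` keeps the window logic testable.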
---
## Service Classification
Services within alknet's ecosystem are classified by their relationship to the core:
### Core Services (irpc, always present when feature flag enabled)
| Service | Protocol | Location | Purpose |
|---|---|---|---|
| Auth | `AuthProtocol` | alknet-core (`irpc` feature) | Identity resolution, credential verification |
| Config | `ConfigProtocol` | alknet-core (`irpc` feature) | Dynamic config reload |
| Secret | `SecretProtocol` | alknet-secret | Key derivation, encryption, decryption |
| Storage | `StorageProtocol` | alknet-storage | Metagraph CRUD, ACL, accounts |
### Application Services (operation-registered, pluggable)
| Service | Interface | Core dependency | Purpose |
|---|---|---|---|
| GitService | HTTP (MessageInterface) + SSH (StreamInterface) | IdentityProvider, CredentialProvider | Git clone/push/pull over HTTPS and SSH |
| RustfsService | HTTP (MessageInterface) | CredentialProvider | S3-compatible object storage proxy |
| DockerService | HTTP (MessageInterface) | CredentialProvider | Container management |
| NodeService | HTTP (MessageInterface) | IdentityProvider | Node management |
### External Services (consumed, not hosted)
| Service | Integration | Auth |
|---|---|---|
| vast.ai | `OpenAPIServiceRegistry` + `CredentialProvider` | API key |
| runpod | `OpenAPIServiceRegistry` + `CredentialProvider` | API key |
| ubicloud | `OpenAPIServiceRegistry` + `CredentialProvider` | API key |
**Key distinction**: Rustfs and gitea are "self-hosted external services" — they run inside the same deployment boundary but are managed independently. alknet acts as a gateway (identity provider, credential provider) and reverse proxy (HTTP interface) for them, but they are NOT part of alknet-core.
---
## Open Questions
### OQ-DEF-01: Should alknet adopt a "Service Catalog" concept like Keystone?
Keystone's service catalog lets a token carry information about which services and endpoints are available to the authenticated user. alknet's `OperationRegistry` is global — every authenticated identity sees the same operations. Should there be a scope-filtered operation discovery mechanism?
**Options**:
1. Keep `OperationRegistry` global, check scope at invocation time (current design)
2. Add `GET /v1/catalog` or `GET /v1/schema?scope=<scope>` that returns only operations the identity can invoke
3. Add a "service catalog" field to `Identity.resources` that lists available namespaces
**Recommendation**: Start with option 1 (current design). Add option 2 when multi-tenant deployment requires it. The `GET /v1/schema` endpoint (from tls-transport.md) already provides operation discovery — adding scope filtering is additive.
---
### OQ-DEF-02: Should "application service" and "irpc service" be renamed to avoid "service" overloading?
The word "service" has three meanings in the architecture (irpc, application, external). Should we adopt different terms?
**Options**:
1. Keep current names, always qualify with "irpc service", "application service", "external service"
2. Rename: "irpc service" → "irpc protocol" or "backend handler"; "application service" → "adapter" or "integration"
3. Adopt the call-protocol terminology exclusively: everything that registers in `OperationRegistry` is an "operation handler", and "service" refers only to external endpoints
**Recommendation**: Option 1 for now. The qualifiers are sufficient, and renaming would require changing ADRs and multiple specs. Revisit if confusion persists in practice.
---
### OQ-DEF-03: Should `Identity.scopes` be hierarchical (like Keystone implied roles) or stay flat?
Current design: `scopes: Vec<String>` with flat strings like `"relay:connect"`, `"secrets:derive"`.
Keystone pattern: Roles can imply other roles (admin implies member). Policies are per-service, not global strings.
**Options**:
1. Keep flat scopes, let individual services interpret them (current)
2. Add implied scope resolution: `"admin"` → `["relay:connect", "secrets:derive", ...]`
3. Adopt a policy language (JSON policy documents like Rustfs IAM)
**Recommendation**: Start with option 1. Add implied scope resolution in alknet-storage when multi-tenant deployment requires it. A full policy language is Phase D territory and should follow what Rustfs already uses (MinIO-style JSON policies) rather than inventing something new.
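
For reference, option 2 could be as small as a transitive expansion over an "implies" table. The function name `expand_scopes` and the table shape are hypothetical; where such a table would live (for example in alknet-storage) is left open here.

```rust
use std::collections::{BTreeSet, HashMap};

/// Expand granted scopes with everything they imply, transitively.
/// Scopes with no entry in `implied` are kept as-is, so a purely flat
/// scope list passes through unchanged.
pub fn expand_scopes(granted: &[&str], implied: &HashMap<&str, Vec<&str>>) -> BTreeSet<String> {
    let mut resolved = BTreeSet::new();
    let mut pending: Vec<&str> = granted.to_vec();
    while let Some(scope) = pending.pop() {
        // `insert` returns false for scopes already seen, which also makes
        // the loop terminate if the table contains a cycle.
        if resolved.insert(scope.to_string()) {
            if let Some(children) = implied.get(scope) {
                pending.extend(children.iter().copied());
            }
        }
    }
    resolved
}
```

Because the result is still a flat set of strings, handlers that check for a single required scope would not need to change.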
---
### OQ-DEF-04: How should the GitService adapter work across HTTP and SSH?
gitserver provides `gitserver-core` (transport-agnostic git protocol logic) and `gitserver-http` (Axum HTTP layer). alknet's architecture allows three paths:
**Path A — HTTP MessageInterface**: Git operations over HTTPS, with alknet's HTTP interface authenticating the request and passing Identity to the GitService handler. The GitService handler uses `Identity` to determine repo access and calls `gitserver-core` directly.
**Path B — SSH StreamInterface**: Git operations over SSH, where the SSH interface already authenticates the user. Git commands are dispatched through SSH channels (similar to how `channel_open_direct_tcpip` works for port forwarding, but with a `git-upload-pack` / `git-receive-pack` channel type).
**Path C — Both**: `gitserver-core` as the protocol engine, `gitserver-http` for HTTPS, and a custom `SshGitInterface` for SSH-git channels.
**Recommendation**: Phase 1 — Path A (HTTP only). Phase 2 — Path C (both). The Git smart HTTP protocol is well understood, and `gitserver-http` can be nested into alknet's Axum router. SSH git requires designing a new channel type in `SshInterface`.
---
### OQ-DEF-05: Should alknet act as an OIDC provider (Phase D of credential-provider.md)?
This is the most cross-cutting question. If alknet becomes an OIDC provider, it becomes the identity backbone for all self-hosted services (rustfs, gitea, etc.). This maps to OpenStack Keystone's role but with a different scope model.
**Benefits**:
- Eliminates stored credential management for OIDC-compatible services
- Users authenticate once via alknet (SSH key or token) and get scoped access to all services
- Maps directly to `Identity.scopes → OIDC claims → service policies`
**Complexity**:
- Requires OIDC authorization server endpoints (authorize, token, userinfo, jwks)
- Requires PKCE flow for browser-based auth
- Requires claim → policy mapping per service
- alknet is not currently designed to be an OIDC server
**Recommendation**: Phase D (long-term). Phases A-C use static credentials and managed credentials, which are sufficient for most deployments. OIDC provider is a quality-of-life improvement that becomes important in multi-user self-hosted setups.
---
### OQ-DEF-06: How does Domain Event vs Integration Event discipline apply to self-hosted services?
ADR-032 says domain events stay within the service boundary and integration events cross it. Rustfs and gitea are outside alknet's boundary but inside the deployment boundary. Where do their events fall?
**Options**:
1. Self-hosted services are external: their events are integration events, consumed via callback/webhook
2. Self-hosted services are part of alknet's boundary: use Honker streams internally, project to integration events for cross-node
3. Hybrid: alknet projects rustfs/gitea state changes into `EventEnvelope` integration events, but rustfs/gitea internal events stay in their own boundary
**Recommendation**: Option 3. Self-hosted services have their own internal event systems. alknet projects state changes (bucket created, repo pushed) into `EventEnvelope` integration events for cross-node communication. Honker streams are for events within alknet-core services only.
---
### OQ-DEF-07: How should the smart-contract / on-chain identity model relate to alknet's IdentityProvider?
The distributed git concept (NFT-based org/repo tokens) introduces a third `IdentityProvider` implementation that validates identity on-chain. How does this relate to the existing two implementations?
**Option 1 — OnChainIdentityProvider**: A new implementation of the `IdentityProvider` trait that checks on-chain ownership. Slow path: on-chain verification (0.5-5s on L2). Fast path: local ACL metagraph cache validated against on-chain state periodically.
**Option 2 — Separate verification layer**: On-chain verification is a separate step, not an IdentityProvider. After normal auth (SSH key or token), a second check verifies on-chain ownership for specific operations (e.g., write to a distributed repo).
**Option 3 — CredentialProvider extension**: On-chain verification is outbound — alknet authenticates TO the smart contract to verify repo permissions. This would be a new `CredentialSet` variant.
**Recommendation**: Option 1 for the long term. The `IdentityProvider` trait is designed to be pluggable. An `OnChainIdentityProvider` with local cache is additive. It resolves on-chain identity to an `Identity` struct just like `ConfigIdentityProvider` and `StorageIdentityProvider`. The seed derivation path `m/44'/60'/0'/0/0` (Ethereum) alongside `m/74'/0'/0'/0'` (Ed25519 identity) provides a cryptographic link between the two key types.
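
The fast-path/slow-path split in Option 1 can be sketched as a cache in front of an expensive lookup. Everything here is hypothetical: `CachedResolver`, the closure standing in for on-chain verification, and the simple time-to-live policy (the text above describes periodic validation against on-chain state, which a real implementation might do in the background instead).

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

/// Wraps a slow lookup (standing in for on-chain verification) with a
/// local cache whose entries expire after `ttl_secs`.
pub struct CachedResolver<F: Fn(&str) -> Option<Identity>> {
    slow_lookup: F,
    ttl_secs: u64,
    cache: HashMap<String, (Identity, u64)>,
}

impl<F: Fn(&str) -> Option<Identity>> CachedResolver<F> {
    pub fn new(slow_lookup: F, ttl_secs: u64) -> Self {
        Self { slow_lookup, ttl_secs, cache: HashMap::new() }
    }

    /// `now` is passed in (Unix seconds) so the expiry logic is testable.
    pub fn resolve(&mut self, address: &str, now: u64) -> Option<Identity> {
        // Fast path: a cached entry that is still fresh.
        if let Some((identity, fetched_at)) = self.cache.get(address) {
            if now.saturating_sub(*fetched_at) < self.ttl_secs {
                return Some(identity.clone());
            }
        }
        // Slow path: re-verify and refresh the cache entry.
        let identity = (self.slow_lookup)(address)?;
        self.cache.insert(address.to_string(), (identity.clone(), now));
        Some(identity)
    }
}
```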
---
### OQ-DEF-08: Should the "interface" concept in auth.md (which distinguishes auth "presentation" per transport/interface pair) be renamed to avoid confusion with Layer 2 "Interface"?
In auth.md, "auth presentation" is the mechanism by which credentials are presented on each interface:
- SSH: key handshake
- HTTP: Bearer header
- DNS: token in query labels
- WebTransport: token in CONNECT request
This is NOT the same as "Interface" (Layer 2), but uses the same word. Should we adopt a distinct term?
**Options**:
1. Keep "auth presentation" — it's already distinct from "Interface" (Layer 2)
2. Rename to "auth mechanism" or "credential presentation" to be more precise
3. Use the term from the interface-model.md table: "(Transport, Interface) → Auth mechanism"
**Recommendation**: Option 2. "Credential presentation" is precise and doesn't overload "interface". Update auth.md to use "credential presentation per (Transport, Interface) pair" consistently.
---
## References
- [interface-model.md](interface-model.md) — StreamInterface / MessageInterface trait design
- [credential-provider.md](credential-provider.md) — CredentialProvider, CredentialSet (outbound auth)
- [tls-transport.md](tls-transport.md) — Unified multi-interface architecture
- [integration-plan.md](../integration-plan.md) — Phase structure, OperationEnv, event boundary discipline
- [identity.md](../../architecture/identity.md) — Identity struct, IdentityProvider trait
- [auth.md](../../architecture/auth.md) — Unified auth, AuthToken format
- [services.md](../../architecture/services.md) — irpc services, OperationEnv
- [event_source_types.md](../event-sourcing/event_source_types.md) — Domain events vs integration events
- [ADR-032](../../architecture/decisions/032-event-boundary-discipline.md) — Event boundary rule
- [ADR-033](../../architecture/decisions/033-operationenv-irpc-call-protocol.md) — OperationEnv, three dispatch paths
- [references/rustfs/](../references/rustfs/) — Rustfs research and reference
- [references/gitserver/](../references/gitserver/) — Gitserver research and reference
- [references/openstack-keystone/](../references/openstack-keystone/) — OpenStack Keystone concepts
- [references/distributed-identity/](../references/distributed-identity/) — Distributed identity and smart contract ACL

@@ -1,367 +0,0 @@
# Interface Model: Stream and Message Interfaces
> Status: Research / Draft
> Last updated: 2026-06-08
> Part of: Phase 2 planning
## Overview
The current three-layer model (ADR-026, [interface.md](../../architecture/interface.md)) defines Transport (Layer 1), Interface (Layer 2), and Protocol (Layer 3). The `Interface` trait assumes a persistent byte stream from a `Transport`, which works for SSH and raw framing. However, two important interface types — HTTP and DNS — don't fit this model: they handle individual requests, not persistent sessions. This document proposes splitting the interface model into `StreamInterface` and `MessageInterface`, adding HTTP as a first-class interface, and reclassifying DNS from a transport to a message-based interface.
## Problem Statement
### DNS is not a transport
The current `TransportKind` enum includes `Dns { domain: String }` alongside `Tcp`, `Tls`, and `Iroh`. But DNS doesn't produce an `AsyncRead + AsyncWrite + Unpin + Send` byte stream. It's a request/response protocol. Listing it as a transport conflates two different abstractions. DNS encodes/decodes `EventEnvelope` frames as DNS query/response pairs — that's an interface behavior, not a transport behavior.
### HTTP is missing as an interface
The current valid (Transport, Interface) pairs are all stream-based:
| Transport | Interface |
|---|---|
| TLS | SSH |
| TCP | SSH |
| iroh | SSH |
| DNS | raw framing |
| WebTransport | SSH |
| WebTransport | raw framing |
| TCP | raw framing |
But there's no HTTP interface — the (TCP/TLS, HTTP) pair that accepts standard HTTP requests and maps them to call protocol operations. This is the **server-side** equivalent of `OpenAPIServiceRegistry` (which does client-side: consuming OpenAPI specs to make outbound HTTP calls). Without it, external clients (browsers, curl, monitoring) can only reach alknet through SSH.
### Auth across all interfaces
Different interfaces authenticate differently, but all resolve to the same `Identity` through `IdentityProvider`:
| (Transport, Interface) | Auth mechanism | Resolves via |
|---|---|---|
| (TLS, SSH) | SSH public key handshake | `IdentityProvider::resolve_from_fingerprint()` |
| (TCP, SSH) | SSH public key handshake | `IdentityProvider::resolve_from_fingerprint()` |
| (iroh, SSH) | SSH public key handshake | `IdentityProvider::resolve_from_fingerprint()` |
| (TLS, raw framing) | Token in frame header | `IdentityProvider::resolve_from_token()` |
| (TCP, raw framing) | Token in frame header | `IdentityProvider::resolve_from_token()` |
| (WebTransport, raw framing) | Token in CONNECT request | `IdentityProvider::resolve_from_token()` |
| (TLS, HTTP) | HTTP Authorization header | `IdentityProvider::resolve_from_token()` |
| (—, DNS) | Token embedded in DNS query | `IdentityProvider::resolve_from_token()` |
All token-based paths use the same `AuthToken` format (Ed25519-signed timestamp, defined in [auth.md](../../architecture/auth.md)). The `IdentityProvider` trait doesn't change — `resolve_from_token()` already covers all of these. The difference is just how the token gets extracted from the wire format.
## Design
### StreamInterface and MessageInterface
The current `Interface` trait has this signature:
```rust
#[async_trait]
pub trait Interface: Send + Sync + 'static {
    type Session;
    async fn accept(stream: TransportStream, config: &InterfaceConfig) -> Result<Self::Session>;
}
```
This works for SSH and raw framing — both run over a duplex stream. But HTTP and DNS are **message-based**: they receive isolated requests, not persistent sessions. The interface model needs to accommodate both patterns.
**Rename `Interface` to `StreamInterface`** for stream-based connections:
```rust
#[async_trait]
pub trait StreamInterface: Send + Sync + 'static {
    type Session;
    async fn accept(stream: TransportStream, config: &InterfaceConfig) -> Result<Self::Session>;
}
```
**Add `MessageInterface`** for message-based request/response interfaces:
```rust
#[async_trait]
pub trait MessageInterface: Send + Sync + 'static {
    async fn handle_request(&self, request: InterfaceRequest) -> Result<InterfaceResponse>;
}
```
Why separate traits instead of one:
- Different signatures: `StreamInterface` produces a session from a stream. `MessageInterface` handles an individual request.
- Different lifecycles: Stream sessions are long-lived (SSH channels persist). Message handlers are stateless per-request (each HTTP request is independent).
- Different transport ownership: `StreamInterface` receives a `TransportStream` from elsewhere. `MessageInterface` manages its own transport (HTTP server, DNS server).
### InterfaceRequest / InterfaceResponse
```rust
pub struct InterfaceRequest {
    pub operation_path: String,        // e.g., "/head/auth/verify"
    pub input: Value,                  // JSON input payload
    pub auth_token: Option<AuthToken>, // Extracted from wire format
    pub metadata: HashMap<String, String>,
}

pub struct InterfaceResponse {
    pub result: Result<Value, CallError>,
    pub status: u16,                   // HTTP status, DNS result code, etc.
    pub headers: HashMap<String, String>,
}
```
This is a normalized interface-agnostic request/response. The `MessageInterface` implementation extracts the operation path, input, and auth token from its wire format (HTTP, DNS, etc.) and constructs an `InterfaceRequest`. The call protocol handler processes it and returns an `InterfaceResponse` that the implementation serializes back to its wire format.
### HTTP Interface
The HTTP interface accepts standard HTTP requests and maps them to call protocol operations:
```
POST /v1/{namespace}/{op} → registry.invoke(namespace, op, input) (mutation)
GET /v1/{namespace}/{op} → registry.invoke(namespace, op, input) (query, params as input)
GET /v1/{namespace}/{op} SSE → registry.subscribe(namespace, op, input) (subscription)
```
This is how external clients invoke alknet operations without SSH. Use cases:
- Dashboard UI calling operations via fetch()
- Third-party service integration via REST API
- Health checks and monitoring endpoints
- Other alknet nodes using `OpenAPIServiceRegistry` to register against this API
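
The route mapping above reduces to extracting `(namespace, op)` from the request path. A minimal sketch, assuming exactly two path segments after `/v1/` and no percent-decoding; `parse_operation_path` is a hypothetical helper, and in practice an HTTP framework's router would do this work.

```rust
/// Extract (namespace, op) from a path of the form `/v1/{namespace}/{op}`.
/// Returns None for any other shape, including empty or extra segments.
pub fn parse_operation_path(path: &str) -> Option<(&str, &str)> {
    let rest = path.strip_prefix("/v1/")?;
    let mut parts = rest.split('/');
    let namespace = parts.next()?;
    let op = parts.next()?;
    if parts.next().is_some() || namespace.is_empty() || op.is_empty() {
        return None;
    }
    Some((namespace, op))
}
```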
```rust
pub struct HttpInterface {
    identity_provider: Arc<dyn IdentityProvider>,
    registry: Arc<OperationRegistry>,
    env: OperationEnv,
}
```
Auth: Extract `Authorization: Bearer <token>` header, pass to `IdentityProvider::resolve_from_token()`. The token is the same `AuthToken` format used by WebTransport and raw framing.
The HTTP interface manages its own transport layer (hyper/axum/actix). It doesn't need a `Transport` from Layer 1 — HTTP IS the transport. This is the same pattern as the DNS interface.
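
Credential extraction for this interface is a small, self-contained step. The sketch below pulls the token out of an `Authorization` header value before it would be handed to `IdentityProvider::resolve_from_token()`; `bearer_token` is a hypothetical helper name. The scheme comparison is case-insensitive, as HTTP authentication scheme names are.

```rust
/// Return the token from an `Authorization: Bearer <token>` header value,
/// or None if the scheme is not Bearer or the token is missing.
pub fn bearer_token(header_value: &str) -> Option<&str> {
    let (scheme, token) = header_value.trim().split_once(' ')?;
    if !scheme.eq_ignore_ascii_case("Bearer") {
        return None;
    }
    let token = token.trim();
    if token.is_empty() {
        None
    } else {
        Some(token)
    }
}
```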
### DNS Interface
DNS is not a transport. It's a **message-based interface** that encodes `EventEnvelope` frames as DNS query/response pairs:
```
DNS query: "_alknet.request.{base64url(payload)}.alk.dev TXT?"
→ decoded as EventEnvelope (call.requested)
→ call protocol handler processes it
→ encoded as EventEnvelope (call.responded)
→ returned as DNS TXT record response
```
```rust
pub struct DnsInterface {
    domain: String,
    identity_provider: Arc<dyn IdentityProvider>,
    registry: Arc<OperationRegistry>,
    env: OperationEnv,
}
```
Auth: Token embedded in the DNS query. Same `AuthToken` format.
The DNS interface runs its own DNS server. It doesn't need a separate `Transport` — DNS is both the transport and the interface combined.
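
The query-name format shown above can be sketched end to end with a hand-rolled, unpadded base64url encoder. Splitting the encoded payload into labels of at most 63 bytes is an assumption added here because DNS limits each label to 63 octets; the format in this document does not say how long payloads are split. The sketch does not enforce the 255-octet limit on the full name and does not handle an empty payload.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Unpadded base64url encoding of `data`.
pub fn base64url(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() * 4 + 2) / 3);
    for chunk in data.chunks(3) {
        let b = [chunk[0], *chunk.get(1).unwrap_or(&0), *chunk.get(2).unwrap_or(&0)];
        let n = (u32::from(b[0]) << 16) | (u32::from(b[1]) << 8) | u32::from(b[2]);
        // A chunk of k input bytes yields k + 1 output characters.
        for i in 0..=chunk.len() {
            let idx = (n >> (18 - 6 * i)) & 0x3f;
            out.push(ALPHABET[idx as usize] as char);
        }
    }
    out
}

/// Build `_alknet.request.{base64url(payload)}.{domain}`, splitting the
/// encoded payload into labels of at most 63 bytes (assumed policy).
pub fn query_name(payload: &[u8], domain: &str) -> String {
    let encoded = base64url(payload);
    let labels: Vec<&str> = encoded
        .as_bytes()
        .chunks(63)
        // The encoder only emits ASCII, so every chunk is valid UTF-8.
        .map(|c| std::str::from_utf8(c).expect("base64url output is ASCII"))
        .collect();
    format!("_alknet.request.{}.{}", labels.join("."), domain)
}
```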
### Remove TransportKind::Dns
Since DNS is a `MessageInterface` (not a transport), `TransportKind::Dns` should be removed from the enum. The `ListenerConfig` enum should be updated to cover both stream and message listeners:
```rust
pub enum ListenerConfig {
    Stream {
        transport: TransportKind,
        interface: StreamInterfaceKind,
    },
    Message {
        interface: MessageInterfaceKind,
        bind_addr: SocketAddr,
    },
}
```
This cleanly separates "listen for byte streams" from "listen for messages."
### Revised Interface Pairs
**Stream-based connections** (persistent session, `StreamInterface`):
| Transport | StreamInterface | Auth | Use case |
|---|---|---|---|
| TLS | SshInterface | SSH pubkey handshake | Standard alknet tunnel |
| TCP | SshInterface | SSH pubkey handshake | Plain SSH tunnel |
| iroh | SshInterface | SSH pubkey handshake | P2P SSH tunnel |
| TCP | RawFramingInterface | Token in frame header | Local service mesh |
| TLS | RawFramingInterface | Token in frame header | Secure mesh |
| WebTransport | SshInterface | SSH pubkey handshake | Browser SSH tunnel (future) |
| WebTransport | RawFramingInterface | Token in CONNECT request | Browser call protocol (future) |
**Message-based interfaces** (stateless per-request, `MessageInterface`):
| MessageInterface | Auth | Owns transport? | Use case |
|---|---|---|---|
| HttpInterface | Authorization header (Bearer token) | Yes (hyper/axum) | REST API, dashboard, integrations |
| DnsInterface | Token embedded in query labels | Yes (DNS server) | Censorship-resistant control channel |
| WebSocketInterface | Token in handshake or first message | Yes (WS server) | Browser persistent connection (future) |
The `MessageInterface` implementations manage their own transport. They don't need the `Transport` trait because they're not wrapping a generic byte stream — they ARE the transport+interface combined.
### Unified auth across all interfaces
Every interface resolves to the same `Identity` through `IdentityProvider`:
```
SSH fingerprint → IdentityProvider::resolve_from_fingerprint → Identity
Bearer token → IdentityProvider::resolve_from_token → Identity
HTTP Authorization → IdentityProvider::resolve_from_token → Identity
DNS embedded token → IdentityProvider::resolve_from_token → Identity
WebSocket token → IdentityProvider::resolve_from_token → Identity
```
The token format is the same `AuthToken = base64url(key_id || timestamp || signature)` defined in [auth.md](../../architecture/auth.md). The interface just extracts the credential from its wire format. `IdentityProvider` resolves it to an `Identity`. The call protocol handler receives `OperationContext` with that identity.
In database-backed deployments (`StorageIdentityProvider`), `Identity.id` is the account UUID — so the same person connecting via SSH, HTTP, or DNS resolves to the same identity. No separate `account_id` field needed.
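A minimal sketch of that resolution path, under stated assumptions: the real `IdentityProvider` trait and `Identity` struct are defined in identity.md and may differ, so the signatures, the `Credential` enum, and `InMemoryProvider` here are hypothetical stand-ins that only illustrate how two wire formats can land on the same identity.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the core `Identity` type (fields are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

/// What an interface extracts from its wire format (hypothetical type).
pub enum Credential<'a> {
    SshFingerprint(&'a str),
    Token(&'a str),
}

/// Simplified, synchronous version of the provider contract.
pub trait IdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &str) -> Option<Identity>;
}

/// Toy provider backed by two maps; a real one would verify signatures.
pub struct InMemoryProvider {
    by_fingerprint: HashMap<String, Identity>,
    by_token: HashMap<String, Identity>,
}

impl InMemoryProvider {
    pub fn new() -> Self {
        Self { by_fingerprint: HashMap::new(), by_token: HashMap::new() }
    }
    pub fn add_fingerprint(&mut self, fingerprint: &str, identity: Identity) {
        self.by_fingerprint.insert(fingerprint.to_string(), identity);
    }
    pub fn add_token(&mut self, token: &str, identity: Identity) {
        self.by_token.insert(token.to_string(), identity);
    }
}

impl IdentityProvider for InMemoryProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.by_fingerprint.get(fingerprint).cloned()
    }
    fn resolve_from_token(&self, token: &str) -> Option<Identity> {
        self.by_token.get(token).cloned()
    }
}

/// Every interface funnels its credential through this one function.
pub fn resolve(provider: &dyn IdentityProvider, cred: Credential<'_>) -> Option<Identity> {
    match cred {
        Credential::SshFingerprint(fp) => provider.resolve_from_fingerprint(fp),
        Credential::Token(token) => provider.resolve_from_token(token),
    }
}
```

Registering the same `Identity` under both an SSH fingerprint and a token models the database-backed case, where both credentials map to one account.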
### ConfigIdentityProvider: Token auth without a database
The config-based (minimal) deployment gains API key / bearer token support through `DynamicConfig.auth`:
```toml
[auth.ssh]
authorized_keys = [...]
[auth.token]
enabled = true
max_token_age = "5m"
# key_source = "shared" (default: same keys as SSH)
[[auth.api_keys]]
prefix = "alk_"
hash = "sha256:xyz..."
scopes = ["relay:connect"]
description = "dashboard service account"
```
`ConfigIdentityProvider::resolve_from_token()` already exists in the current spec. It verifies the `AuthToken` format (Ed25519 signed timestamp) against the same `authorized_keys` set used for SSH. The `api_keys` section adds an alternative: simple bearer tokens (hash-verified, with optional TTL) that don't require Ed25519 key pairs. This is useful for service accounts and automation.
Both token types produce the same `Identity`. Config-based `Identity.id` is the key fingerprint (for `AuthToken`) or the key prefix (for simple bearer tokens). In database-backed deployments, both resolve to the account UUID.
## Service Decomposition
### AuthService (existing — ADR-028)
Resolves **inbound** credentials to an `Identity`. Already defined. Works across all interfaces — SSH interface calls `resolve_from_fingerprint()`, HTTP/DNS interfaces call `resolve_from_token()`. No changes needed.
### CredentialService (new — see credential-provider.md)
Resolves **outbound** credentials for external service access. Defined in [credential-provider.md](credential-provider.md).
### AccountService (new — storage layer)
Manages accounts and credential associations. This is a storage-layer irpc service, not a core concern:
- `AccountProtocol::CreateAccount { display_name, default_scopes }`
- `AccountProtocol::GetAccount { account_id }`
- `AccountProtocol::AddCredential { account_id, credential }` (SSH key, API key)
- `AccountProtocol::RemoveCredential { account_id, credential_id }`
- `AccountProtocol::ListCredentials { account_id }`
This is the CRUD layer. `StorageIdentityProvider` uses it internally. External management (admin UI) goes through `AccountService`. Analogous to how `ConfigService` provides `ConfigReloadHandle` — core has the read trait, storage has the management service.
Core doesn't need `AccountService` for operation. `IdentityProvider` is the read-only contract. Account management is additive.
## Impact on Existing Specs
### interface.md
Needs revision:
1. **Rename `Interface` to `StreamInterface`** — the current trait becomes the stream-specific variant.
2. **Add `MessageInterface` trait** — for HTTP, DNS, WebSocket.
3. **Add `HttpInterface`** as a `MessageInterface` implementation.
4. **Clarify DNS** — DNS is a `MessageInterface`, not a (DNS transport, raw framing) pair. Remove `TransportKind::Dns` from the transport enum.
5. **Add valid message-based interface pairs** table alongside the stream-based pairs table.
6. **Add `InterfaceRequest` / `InterfaceResponse`** types that normalize calls across message interfaces.
### auth.md
Needs revision:
1. **Add HTTP interface auth**: `Authorization: Bearer <token>` extraction.
2. **Add DNS interface auth** — token embedded in DNS query labels.
3. **Add auth presentation table** showing all interface/auth combos.
4. **Add simple API keys** — bearer tokens (hash-verified, with optional TTL) for service accounts. Not all token auth needs Ed25519 key pairs.
### transport.md
Minor: **Remove `TransportKind::Dns`** from the enum. Add note that DNS is handled as a `MessageInterface`.
### call-protocol.md
Minor update: the call protocol handler should accept `EventEnvelope` frames from both `StreamInterface::Session` and `MessageInterface::handle_request()`. The dispatch logic is the same — only the framing differs.
### ADR-026
Needs update: the three-layer model is correct, but the (Transport, Interface) pair enumeration in ADR-026 lists DNS as a transport. This should be revised to show `StreamInterface` and `MessageInterface` as two interface categories at Layer 2.
## Phasing Considerations
| Work | Suggested Phase | Notes |
|---|---|---|
| Rename `Interface` → `StreamInterface` | Phase 1 (now) | Rename only, no behavior change. Existing code already implements the stream pattern. |
| Define `MessageInterface` trait | Phase 1 (now) | Cheap, forward-compatible. Define the trait and `InterfaceRequest`/`InterfaceResponse` types. |
| Define `HttpInterface` stub | Phase 1 (now) | Define the struct and impl signature. Full HTTP server wiring can wait. |
| `TransportKind::Dns` removal | Phase 1 (now) | Clean up the enum before code depends on `TransportKind::Dns`. |
| `ListenerConfig` with Stream/Message variants | Phase 1 (now) | Update the server accept loop to support both interface types. |
| `HttpInterface` implementation | Phase 2 | Full HTTP server with router, auth middleware, SSE. Depends on core being stable. |
| `DnsInterface` implementation | Phase 3+ | DNS protocol is non-trivial. Deferring is fine. |
| `AccountService` irpc protocol | Phase 2 | CRUD for accounts. Lives in alknet-storage. |
| `ApiKeys` in `DynamicConfig.auth` | Phase 1 (now) | Enable bearer token auth in config-based deployments. |
The key observation: defining the traits (`MessageInterface`, `InterfaceRequest`, `HttpInterface` stub) now is cheap and prevents refactoring later. The actual HTTP server implementation can wait for Phase 2. But the trait surface needs to exist in Phase 1 so downstream code can target it.
## Open Questions
### OQ-IF-03: Should `MessageInterface` and `StreamInterface` share a common trait?
Recommendation: Independent traits. Different signatures (`handle_request` vs `accept` + `next_event/send_event`), different lifecycles (stateless vs session-stateful), different transport ownership (self-managed vs provided). A common super-trait adds complexity without clear benefit.
### OQ-IF-04: Should `TransportKind::Dns` be removed from the enum?
Recommendation: Yes. DNS doesn't produce byte streams. Remove it and add `ListenerConfig::Message` variant. This is a cleanup, not a breaking change — `TransportKind::Dns` is currently a tag with no acceptor implementation.
### OQ-IF-05: Should the HTTP interface share a port with the SSH listener?
In production, alknet might run SSH on port 22 and HTTP on port 443. Or both on 443 (TLS with ALPN). The `HttpInterface` could share a TLS listener with `SshInterface` if ALPN negotiation selects SSH vs. HTTP.
Recommendation: Start simple — separate ports. HTTP on its own port (default 8080 or configured via `[[listeners]]`). ALPN multiplexing is a future optimization that doesn't change the interface abstraction.
### OQ-IF-06: Should the HTTP interface auto-generate OpenAPI specs from the OperationRegistry?
If alknet exposes operations as `POST /v1/{namespace}/{op}`, the HTTP interface could auto-generate an OpenAPI spec from the registered `OperationSpec`s. This would provide:
- Interactive API documentation
- Automatic client SDK generation
- Compatibility with `OpenAPIServiceRegistry` (another alknet node's `FromOpenAPI` could register against this spec)
This is the reverse of `OpenAPIServiceRegistry` — instead of consuming an OpenAPI spec to register operations, it produces an OpenAPI spec from registered operations. The `OperationSpec` already has `input_schema`, `output_schema`, `description`, and `tags`.
Recommendation: Yes, but Phase 4+. The HTTP interface needs to exist first.
### OQ-IF-07: How do self-hosted services (rustfs, gitea) authenticate requests from alknet users?
When alknet sits in front of rustfs or gitea (e.g., as a reverse proxy or HTTP interface gateway), how does it map alknet identities to external service identities?
Options:
1. **Shared secret / API key**: Alknet holds a service-level credential. All proxied requests use it. Simple but loses per-user identity on the external service.
2. **Identity-bound credentials**: Each alknet account has a corresponding rustfs/gitea credential, looked up via `Identity.id`. Per-user ACL on the external service.
3. **Alknet as OIDC provider**: Rustfs/gitea trust alknet as their identity provider. No stored credentials — users authenticate directly via OIDC.
Recommendation: Start with Option 1. Add Option 2 when multi-tenant access is needed. Option 3 is the long-term goal (Phase D in [credential-provider.md](credential-provider.md)).
## References
- [interface.md](../../architecture/interface.md) — Current Interface layer spec (needs update for `StreamInterface`/`MessageInterface`)
- [auth.md](../../architecture/auth.md) — Unified auth, IdentityProvider, AuthToken format
- [identity.md](../../architecture/identity.md) — Identity struct, IdentityProvider trait
- [call-protocol.md](../../architecture/call-protocol.md) — Call protocol, OperationEnv
- [services.md](../../architecture/services.md) — irpc service definitions
- [credential-provider.md](credential-provider.md) — CredentialProvider, CredentialSet (Phase 2)
- [ADR-026](../../architecture/decisions/026-transport-interface-separation.md) — Three-layer model (needs update for `MessageInterface`)
- [ADR-023](../../architecture/decisions/023-unified-auth-shared-key-material.md) — Unified auth with shared key material
- [ADR-029](../../architecture/decisions/029-identity-core-type.md) — Identity as core type


@@ -1,401 +0,0 @@
# TLS Transport: Unified Multi-Interface Architecture
> Status: Research / Draft
> Last updated: 2026-06-08
> Part of: Phase 2 planning
## Overview
Alknet's existing stealth mode already does protocol detection: after a TLS handshake, the server peeks at the first bytes and routes SSH connections one way and HTTP connections another. This document extends that pattern into a unified architecture where a single TLS port supports SSH, REST, WebSocket, SSE, and gRPC — all routed by the first bytes after the TLS handshake. Alongside this, QUIC (UDP) supports WebTransport and iroh P2P, and DNS runs on its own port. Every interface resolves to the same call protocol operations through the `OperationRegistry`.
This replaces the earlier `(Transport, Interface)` pair model for TCP/TLS connections with a clearer distinction: persistent stream interfaces go through the peek-based router, message-based interfaces manage their own transports, and axum serves as the multiplexer for everything HTTP.
## Current State
The stealth mode implementation in `crates/alknet-core/src/server/stealth.rs` does byte-peeking after TLS handshake:
```rust
pub enum ProtocolDetection {
    Ssh,
    Http,
}
pub async fn detect_protocol<S>(stream: S) -> (ProtocolDetection, BufReader<S>) {
    // Peek first bytes: "SSH-2.0-" → Ssh, anything else → Http
}
pub async fn send_fake_nginx_404<S>(reader: &mut BufReader<S>) {
    // Currently: non-SSH gets a fake 404 and connection closed
}
```
```
This is almost exactly what we need. The `Http` detection currently sends a fake nginx 404. Instead, it should route to a real HTTP server.
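The peek itself can be illustrated with a synchronous, std-only sketch. This is not the code in `stealth.rs` (which is async and not shown here); it is a simplified stand-in that assumes the first read returns at least the 8-byte banner prefix. On a real socket a short first read is possible, so a robust version would keep filling until 8 bytes are buffered or the stream ends.

```rust
use std::io::{BufRead, BufReader, Read};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProtocolDetection {
    Ssh,
    Http,
}

/// Every SSH-2 client opens with this identification prefix.
const SSH_PREFIX: &[u8] = b"SSH-2.0-";

/// Inspect the first buffered bytes without consuming them, then hand the
/// reader back so the chosen handler still sees the complete stream.
pub fn detect_protocol<S: Read>(
    stream: S,
) -> std::io::Result<(ProtocolDetection, BufReader<S>)> {
    let mut reader = BufReader::new(stream);
    // `fill_buf` reads into the internal buffer but does not advance it.
    let peeked = reader.fill_buf()?;
    let detection = if peeked.starts_with(SSH_PREFIX) {
        ProtocolDetection::Ssh
    } else {
        ProtocolDetection::Http
    };
    Ok((detection, reader))
}
```

Because nothing is consumed, the SSH handler still receives the banner and the HTTP handler still receives the request line.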
## New Architecture
### TCP TLS Port 443: Peek-Based Routing
```
Client connects to port 443
TLS handshake completes
Peek first bytes
├─ "SSH-2.0-" → SshInterface (russh, existing path)
└─ (anything else) → axum HTTP router
   ├─ POST /v1/{namespace}/{op} → registry.invoke()
   ├─ GET /v1/{namespace}/{op} → registry.invoke()
   ├─ GET /v1/{namespace}/{op} (SSE) → registry.subscribe()
   ├─ POST /v1/batch → batch invoke
   ├─ GET /v1/schema → registry.list_operations()
   ├─ WebSocket upgrade /ws → WebSocketInterface
   ├─ gRPC via tonic routes → tonic services
   ├─ GET /.well-known/alknet/schema → OpenAPI spec generation
   └─ (anything else) → 404
```
The peek happens after TLS, so the client sees a valid HTTPS server. The `send_fake_nginx_404` function becomes `hand_to_axum(stream)`. axum handles everything that isn't SSH.
### UDP Port 443: QUIC with ALPN Routing
```
Client sends QUIC Initial to port 443 UDP
TLS 1.3 handshake with ALPN negotiation
├─ ALPN "h3" (WebTransport) → wtransport → RawFramingInterface
│  └─ SessionRequest → validate AuthToken from URL path or headers
│     → OperationContext → call protocol
└─ ALPN "alknet" (iroh P2P) → iroh endpoint → RawFramingInterface
   └─ existing iroh accept loop
      → SshInterface or RawFramingInterface
```
wtransport and iroh both listen on UDP 443. Quinn supports multiple ALPN protocols — the QUIC handshake negotiates which handler gets the connection.
### DNS Port 53: MessageInterface
```
DNS query arrives on port 53 (UDP or TCP)
├─ UDP query → DnsInterface (MessageInterface)
└─ TCP query → DnsInterface over DoT (TLS on port 853)
   └─ Encode EventEnvelope as DNS TXT query
      Decode response from DNS TXT record
      AuthToken embedded in query labels
      → IdentityProvider::resolve_from_token()
      → OperationContext → call protocol
```
DNS is a `MessageInterface` — it manages its own transport and handles individual request/response pairs. It doesn't sit on top of the TLS peek router.
### Revised Routing Table
| Protocol | Transport | Detection | Interface | Auth |
|---|---|---|---|---|
| SSH | TCP/TLS | Byte peek: `SSH-2.0-` prefix | SshInterface | SSH key fingerprint |
| HTTP REST | TCP/TLS | Byte peek: not SSH → axum | axum handler → registry | `Authorization: Bearer <AuthToken>` |
| WebSocket | TCP/TLS | Axum upgrade: `Upgrade: websocket` | axum upgrade handler | AuthToken in handshake |
| SSE | TCP/TLS | Axum route: `Accept: text/event-stream` | axum handler → registry.subscribe() | AuthToken in header |
| gRPC | TCP/TLS | Axum route: `content-type: application/grpc` | tonic via axum router | AuthToken in header/metadata |
| WebTransport | QUIC (UDP) | ALPN `h3` | wtransport → RawFramingInterface | AuthToken in CONNECT URL |
| iroh P2P | QUIC (UDP) | ALPN `alknet` | iroh → RawFramingInterface | iroh's existing auth |
| DNS | UDP/TCP | Own listener | DnsInterface (MessageInterface) | AuthToken in query labels |
## Implementation
### Extending ProtocolDetection
The current `ProtocolDetection` enum keeps its two existing variants; it does not gain variants for HTTP sub-protocols:
```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProtocolDetection {
    Ssh,
    Http, // Any HTTP — axum handles sub-routing
}
```
This stays simple. SSH vs. not-SSH is the only peek-level decision. Everything else is HTTP-content routing inside axum. We don't need to detect WebSocket, SSE, or gRPC at the byte level — axum routes those by HTTP headers and paths.
The accept loop becomes:
```rust
// After TLS handshake and peek:
match detect_protocol(tls_stream).await {
    (ProtocolDetection::Ssh, reader) => {
        // Existing SSH path: hand to SshInterface
        handle_ssh(reader, config).await;
    }
    (ProtocolDetection::Http, reader) => {
        // Hand to axum HTTP server
        handle_http(reader, config).await;
    }
}
```
### Axum Integration
The axum server is an HTTP `Service` that receives the TLS stream after the peek. Since the TLS handshake is already complete, the stream wrapper handles encryption and axum only ever sees decrypted HTTP bytes:
```rust
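use axum::{middleware, routing::{get, post}, Router};
use hyper_util::{rt::TokioIo, service::TowerToHyperService};

async fn handle_http(stream: BufReader<TlsStream>, config: ServerConfig) {
    let app = Router::new()
        .route("/v1/{namespace}/{op}", post(invoke_operation))
        .route("/v1/{namespace}/{op}", get(invoke_operation))
        .route("/v1/batch", post(invoke_batch))
        .route("/v1/schema", get(list_operations))
        .route("/ws", get(websocket_upgrade))
        // gRPC via tonic::Routes merged into axum router
        .layer(ExtractorLayer::new(config.identity_provider, config.registry))
        .layer(middleware::from_fn(auth_middleware));
    // Serve the axum app on the already-accepted TLS stream.
    // `serve_connection` takes a per-connection hyper service, not a
    // make-service, so the tower-based router is adapted first.
    let result = hyper::server::conn::http1::Builder::new()
        .serve_connection(TokioIo::new(stream), TowerToHyperService::new(app))
        .with_upgrades() // Enables WebSocket upgrades
        .await;
    if let Err(err) = result {
        // Connection-level errors are routine on a public port; log and move on.
        eprintln!("http connection error: {err}");
    }
}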
```
The auth middleware extracts the `Authorization: Bearer <token>` header and calls `IdentityProvider::resolve_from_token()`. The operation handler constructs an `OperationContext` and calls `registry.invoke(namespace, op, input)`.
### WebTransport (QUIC/UDP)
WebTransport runs on UDP alongside iroh. The routing is by ALPN during the QUIC handshake:
```rust
// Quinn server config with two ALPN protocols. ALPN is a TLS-level setting,
// so it is set on the rustls config that Quinn wraps:
tls_config.alpn_protocols = vec![
    WEBTRANSPORT_ALPN.to_vec(), // b"h3"
    IROH_ALPN.to_vec(),         // existing iroh ALPN
];
let server_config = quinn::ServerConfig::with_crypto(Arc::new(tls_config));
// Accept loop (pseudocode: `incoming.alpn()` stands in for reading the
// negotiated protocol, which is only known once the handshake has progressed):
loop {
    let incoming = quic_endpoint.accept().await;
    match incoming.alpn() {
        b"h3" => {
            // Hand to wtransport
            let session_request = IncomingSession::with_quic_incoming(incoming).await;
            // Validate AuthToken from URL path/headers
            // Create OperationContext
            // Route to call protocol via RawFramingInterface or HTTP-like handler
        }
        b"alknet" | IROH_ALPN => {
            // Hand to existing iroh accept loop
            handle_iroh(incoming).await;
        }
        _ => { /* reject unknown ALPN */ }
    }
}
```
wtransport's `with_quic_incoming()` escape hatch allows integrating with an externally managed Quinn endpoint, so alknet owns the Quinn `Endpoint` and routes WebTransport sessions to wtransport.
### Auth: Single Token Mechanism
Every interface except SSH uses the same `AuthToken` format defined in auth.md:
```
AuthToken = base64url(key_id || timestamp || signature)
key_id = SHA-256 fingerprint of the Ed25519 public key (32 bytes)
timestamp = Unix seconds, big-endian u64 (8 bytes)
signature = Ed25519 sign(key_id || timestamp_bytes, private_key)
```
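As a rough illustration of that layout, the decoded token bytes can be split and age-checked as follows. This sketch makes several assumptions not stated above: the token has already been base64url-decoded, the Ed25519 signature occupies the standard 64 bytes (giving 104 bytes in total), and `AuthTokenParts`, `TokenError`, `parse_token`, `signed_message`, and `check_age` are hypothetical names. Signature verification is deliberately left out because it needs an Ed25519 library.

```rust
/// Decoded token fields (hypothetical struct mirroring the layout above).
#[derive(Debug, PartialEq)]
pub struct AuthTokenParts {
    pub key_id: [u8; 32],
    pub timestamp: u64,
    pub signature: [u8; 64],
}

#[derive(Debug, PartialEq)]
pub enum TokenError {
    WrongLength(usize),
    Expired,
    FromFuture,
}

/// 32-byte key_id + 8-byte timestamp + 64-byte Ed25519 signature.
pub const TOKEN_LEN: usize = 32 + 8 + 64;

/// Split already-decoded token bytes into their three fields.
pub fn parse_token(raw: &[u8]) -> Result<AuthTokenParts, TokenError> {
    if raw.len() != TOKEN_LEN {
        return Err(TokenError::WrongLength(raw.len()));
    }
    let mut key_id = [0u8; 32];
    key_id.copy_from_slice(&raw[..32]);
    let mut ts = [0u8; 8];
    ts.copy_from_slice(&raw[32..40]);
    let mut signature = [0u8; 64];
    signature.copy_from_slice(&raw[40..]);
    Ok(AuthTokenParts {
        key_id,
        timestamp: u64::from_be_bytes(ts), // big-endian Unix seconds
        signature,
    })
}

/// The bytes covered by the signature: key_id || timestamp.
pub fn signed_message(raw: &[u8]) -> Option<&[u8]> {
    raw.get(..40)
}

/// Reject tokens older than `max_age_secs` or stamped in the future.
pub fn check_age(timestamp: u64, now: u64, max_age_secs: u64) -> Result<(), TokenError> {
    if timestamp > now {
        return Err(TokenError::FromFuture);
    }
    if now - timestamp > max_age_secs {
        return Err(TokenError::Expired);
    }
    Ok(())
}
```

A production check would likely tolerate a few seconds of clock skew instead of rejecting every future timestamp outright; that tolerance is a policy decision this sketch does not make.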
| Interface | Auth mechanism | Token location |
|---|---|---|
| SSH | SSH key handshake | In SSH protocol (not a token) |
| HTTP REST | `Authorization: Bearer <AuthToken>` | HTTP header |
| WebSocket | AuthToken in first message or query param | After upgrade |
| SSE | `Authorization: Bearer <AuthToken>` | HTTP header |
| gRPC | `Authorization: Bearer <AuthToken>` | HTTP/2 metadata |
| WebTransport | AuthToken in CONNECT URL or header | WebTransport session request |
| DNS | AuthToken embedded in DNS query labels | Encoded in domain name |
All token-based paths call `IdentityProvider::resolve_from_token()`. The `resolve_from_token()` implementation handles Ed25519 signature verification (for AuthTokens) and will also handle hash-verified API keys (shorter tokens for simpler integrations).
For services and automation where Ed25519 key pairs are inconvenient, short API keys work:
```
API key: "alk_dGhlX3NlY3JldA" (~20 chars)
Storage: SHA-256 hash of the full key
Lookup: prefix match → hash verification → Identity
```
API keys are specified in `DynamicConfig.auth` or stored in `api_keys` tables (database-backed). Both AuthTokens and API keys go through the same `resolve_from_token()` method — the implementation discriminates by prefix or format.
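One way the prefix discrimination could look is sketched below. The names `bearer_token`, `classify`, and `TokenKind` are hypothetical, and the real `resolve_from_token()` may discriminate differently. Since `alk_` consists of valid base64url characters, an `AuthToken` could in principle begin with the same four characters, so a production check would probably also compare token lengths.

```rust
/// The two credential shapes that share the bearer-token path.
#[derive(Debug, PartialEq)]
pub enum TokenKind<'a> {
    /// Short API key: configured prefix plus a secret that is hash-verified.
    ApiKey { prefix: &'a str, secret: &'a str },
    /// Anything else is treated as a base64url-encoded signed AuthToken.
    AuthToken(&'a str),
}

/// Extract the credential from an `Authorization` header value.
pub fn bearer_token(header_value: &str) -> Option<&str> {
    header_value.strip_prefix("Bearer ")
}

/// Decide which verification path a bearer credential should take.
pub fn classify<'a>(bearer: &'a str, api_key_prefix: &str) -> TokenKind<'a> {
    match bearer.strip_prefix(api_key_prefix) {
        Some(secret) if !secret.is_empty() => TokenKind::ApiKey {
            prefix: &bearer[..api_key_prefix.len()],
            secret,
        },
        _ => TokenKind::AuthToken(bearer),
    }
}
```

The API key branch would then hash the full key and compare it with the stored SHA-256 value, while the AuthToken branch would decode the token and verify its signature.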
### Contract Pattern: call / batch / schema / subscribe
Every interface exposes the same four primitive operations through `OperationRegistry`:
| Primitive | HTTP | MCP | DNS | Call protocol |
|---|---|---|---|---|
| `call(namespace, op, input)` | `POST /v1/{ns}/{op}` | `tools/call` | `{op}.{ns}.alk.dev TXT?` | `call.requested` |
| `batch([{ns, op, input}, ...])` | `POST /v1/batch` | (multiple `tools/call`) | (multiple queries) | (multiple `call.requested`) |
| `schema(namespace?)` | `GET /v1/schema` | `tools/list` | (not typically) | `call.requested` with special op |
| `subscribe(namespace, op, input)` | `GET /v1/{ns}/{op} SSE` | (future) | (not applicable) | `call.requested` with stream flag |
MCP's four core operations map directly:
- `tools/list` → `schema()`
- `tools/call` → `call()`
- `prompts/list` → `schema("prompts")`
- `prompts/get` → `call("prompts", "get", input)`
The `memory` tool pattern (one namespace gate dispatching to many operations behind it) is exactly `OperationRegistry` with `OperationSpec.access_control`:
```
memory({tool:"help"}) → registry.invoke("memory", "help", {})
memory({tool:"search"}) → registry.invoke("memory", "search", {query: "..."})
memory({tool:"store"}) → registry.invoke("memory", "store", {key: "...", value: "..."})
```
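The call / batch / schema primitives can be sketched as a toy registry. This is an illustration only: the real `OperationRegistry` works with `OperationSpec`, JSON values, `OperationContext`, and access control, whereas this stand-in uses plain strings for input and output, omits `subscribe`, and all method signatures are assumptions.

```rust
use std::collections::BTreeMap;

/// Handlers take and return plain strings here; the real registry uses JSON.
type Handler = Box<dyn Fn(&str) -> Result<String, String>>;

/// Toy registry keyed by (namespace, operation).
pub struct OperationRegistry {
    ops: BTreeMap<(String, String), Handler>,
}

impl OperationRegistry {
    pub fn new() -> Self {
        Self { ops: BTreeMap::new() }
    }

    pub fn register(
        &mut self,
        ns: &str,
        op: &str,
        handler: impl Fn(&str) -> Result<String, String> + 'static,
    ) {
        self.ops.insert((ns.to_string(), op.to_string()), Box::new(handler));
    }

    /// `call`: dispatch a single operation.
    pub fn invoke(&self, ns: &str, op: &str, input: &str) -> Result<String, String> {
        match self.ops.get(&(ns.to_string(), op.to_string())) {
            Some(handler) => handler(input),
            None => Err(format!("unknown operation {ns}/{op}")),
        }
    }

    /// `batch`: run several calls and keep each result separate.
    pub fn batch(&self, calls: &[(&str, &str, &str)]) -> Vec<Result<String, String>> {
        calls.iter().map(|(ns, op, input)| self.invoke(ns, op, input)).collect()
    }

    /// `schema`: list operation paths, optionally filtered by namespace.
    pub fn schema(&self, ns: Option<&str>) -> Vec<String> {
        self.ops
            .keys()
            .filter(|(n, _)| ns.map_or(true, |want| want == n.as_str()))
            .map(|(n, o)| format!("{n}/{o}"))
            .collect()
    }
}
```

In this model an HTTP `POST /v1/memory/search` and an MCP `tools/call` for `memory` both reduce to `invoke("memory", "search", input)`, which is the point of keeping the primitives interface-agnostic.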
### Reverse: OpenAPI Spec Generation
The HTTP interface's `GET /v1/schema` endpoint (or `GET /.well-known/alknet/schema`) auto-generates an OpenAPI spec from the registered `OperationSpec`s. This creates a symmetry with `FromOpenAPI`:
```
Inbound: HTTP request → axum handler → registry.invoke(namespace, op, input) → ResponseEnvelope → HTTP response
Outbound: OpenAPI spec → FromOpenAPI(spec, config) → registry.register_all(operations) → HTTP client → external service
```
Node A's HTTP interface produces an OpenAPI spec. Node B's `FromOpenAPI` consumes it. Alknet nodes can discover each other's capabilities via the schema endpoint.
## Relationship to StreamInterface / MessageInterface
The earlier `interface-model.md` research defined `StreamInterface` and `MessageInterface` traits. This doc refines the architecture:
**StreamInterface** — persistent byte stream, used for SSH and raw framing:
- `SshInterface`: (TLS, SSH) — existing path, unchanged
- `RawFramingInterface`: (TCP/TLS, raw framing) — for local mesh
- `RawFramingInterface`: (iroh/QUIC, raw framing) — for P2P mesh
**MessageInterface** — manages its own transport, handles individual requests:
- `DnsInterface`: Runs its own DNS server on port 53
**The HTTP case** is special. The axum router is not a `MessageInterface` in the same sense as DNS. It receives a stream (the TLS connection after peek), but it handles individual requests within that stream. It's better modeled as:
- A `StreamInterface` that internally routes to axum
- Axum is the implementation detail, not a trait boundary
- The call protocol handler receives `InterfaceRequest` and returns `InterfaceResponse` regardless of whether the request came from HTTP, DNS, SSH, or raw framing
The `InterfaceRequest` / `InterfaceResponse` types from `interface-model.md` still make sense as the normalized interface-agnostic request/response that all interfaces produce:
```rust
pub struct InterfaceRequest {
    pub operation_path: String,         // e.g., "/head/auth/verify"
    pub input: Value,                   // JSON input payload
    pub auth_token: Option<AuthToken>,  // Extracted from wire format
    pub metadata: HashMap<String, String>,
}
pub struct InterfaceResponse {
    pub result: Result<Value, CallError>,
    pub status: u16,                    // HTTP status, DNS result code, etc.
    pub headers: HashMap<String, String>,
}
```
But the HTTP implementation doesn't need to construct `InterfaceRequest` explicitly — it constructs `OperationContext` directly from the axum request and calls `registry.invoke()`. The `InterfaceRequest` abstraction is more useful for DNS where there's no framework doing routing for you.
## ListenerConfig Update
The `ListenerConfig` enum from the integration plan gains `Http` and `Dns` variants alongside the existing `Stream` variant:
```rust
pub enum ListenerConfig {
    Stream {
        transport: TransportKind,
        interface: StreamInterfaceKind,
    },
    Http {
        bind_addr: SocketAddr,
        tls: bool,     // true = TLS, false = plain TCP
        stealth: bool, // true = byte-peek protocol detection
    },
    Dns {
        bind_addr: SocketAddr,
        tls: bool, // true = DoT, false = plain DNS
    },
}
pub enum StreamInterfaceKind {
    Ssh,
    RawFraming,
}
pub enum TransportKind {
    Tcp,
    Tls { server_name: Option<String> },
    Iroh { endpoint_id: String },
    // NO Dns variant — DNS is a MessageInterface, not a Transport
}
```
For the common production deployment on port 443:
```toml
[[listeners]]
type = "stream"
transport = { tls = {} }
interface = "ssh"
bind = "0.0.0.0:443"
[[listeners]]
type = "http"
bind = "0.0.0.0:443"
tls = true
stealth = true
# If separate ports are preferred:
[[listeners]]
type = "http"
bind = "0.0.0.0:8080"
tls = false
stealth = false
```
When `stealth = true` on an HTTP listener sharing a port with an SSH listener, the accept loop uses the byte-peek pattern to route connections to the correct handler.
When the HTTP listener is on its own port, no peeking is needed — everything is HTTP.
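The port-sharing rule in the two sentences above can be expressed as a small predicate. This is a sketch with a deliberately simplified `Listener` enum (a hypothetical type that carries only the bind address and the stealth flag); the real `ListenerConfig` has more fields, and the accept loop may make this decision differently.

```rust
use std::net::SocketAddr;

/// Simplified listener description (hypothetical; see `ListenerConfig`).
pub enum Listener {
    SshStream { bind: SocketAddr },
    Http { bind: SocketAddr, stealth: bool },
}

/// Byte-peeking is only needed when an SSH listener and a stealth-enabled
/// HTTP listener are bound to the same address.
pub fn needs_peek(listeners: &[Listener], addr: SocketAddr) -> bool {
    let has_ssh = listeners
        .iter()
        .any(|l| matches!(l, Listener::SshStream { bind } if *bind == addr));
    let has_stealth_http = listeners
        .iter()
        .any(|l| matches!(l, Listener::Http { bind, stealth: true } if *bind == addr));
    has_ssh && has_stealth_http
}
```

With the example configuration above, `0.0.0.0:443` would require peeking while `0.0.0.0:8080` would be served as plain HTTP.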
## Phasing
| Work | Phase | Notes |
|---|---|---|
| Extend `ProtocolDetection` to route `Http` to axum | Phase 1 (now) | Replace `send_fake_nginx_404` with axum handoff |
| Axum HTTP server with `/v1/{ns}/{op}` routes | Phase 1 (now) | Core REST API for call protocol operations |
| Auth middleware (`Authorization: Bearer`) | Phase 1 (now) | Uses existing `IdentityProvider::resolve_from_token()` |
| `ListenerConfig::Http` variant | Phase 1 (now) | Define alongside existing `Stream` variant |
| Remove `TransportKind::Dns` | Phase 1 (now) | Cleanup before code depends on it |
| WebSocket upgrade handler | Phase 2 | axum `.with_upgrades()` is already available |
| SSE streaming handler | Phase 2 | axum + `axum-streams` or `tokio-stream` |
| gRPC via tonic integration | Phase 3 | `tonic::Routes` merges into axum router |
| WebTransport (QUIC/UDP) | Phase 3 | wtransport integration, ALPN routing |
| DNS interface | Phase 3+ | Uses `MessageInterface` trait, own listener |
| OpenAPI spec generation from registry | Phase 3+ | `GET /v1/schema` or `GET /.well-known/alknet/schema` |
| ALPN multiplexing on UDP 443 | Phase 3+ | Quinn ALPN routing between iroh and wtransport |
## References
- [stealth.rs](../../../crates/alknet-core/src/server/stealth.rs) — Current protocol detection implementation
- [auth.md](../../architecture/auth.md) — AuthToken format, IdentityProvider, unified auth
- [interface-model.md](interface-model.md) — StreamInterface / MessageInterface trait design
- [credential-provider.md](credential-provider.md) — CredentialProvider, outbound auth
- [call-protocol.md](../../architecture/call-protocol.md) — OperationRegistry, OperationEnv
- [services.md](../../architecture/services.md) — irpc service definitions, OperationContext
- [ADR-026](../../architecture/decisions/026-transport-interface-separation.md) — Three-layer model
- [wtransport](/workspace/wtransport/) — WebTransport server implementation (QUIC/HTTP3, ALPN h3)
- [iroh-relay](/workspace/iroh/iroh-relay/) — HTTP + WebSocket relay (hyper, MaybeTlsStream)
- [hickory-dns](/workspace/hickory-dns/) — DNS server with DoT/DoH/DoQ/DoH3
- [tonic](/workspace/tonic/) — gRPC framework (axum + hyper integration, ALPN h2)


@@ -154,52 +154,70 @@ These docs describe concepts that carry forward but need updating to reflect the
---
## Phase 4: Clean Up Code
## Phase 4: Greenfield Workspace
Not a rewrite — just remove dead weight so agents don't pattern-match to it.
**Decision: Greenfield rather than in-place migration.** The old codebase is preserved at `/workspace/@alkdev/alknet-main/` as a reference implementation. The new workspace starts clean with only `alknet-secret` carried over (it's standalone with no alknet-core dependency).
### Delete from `alknet-core`
### What was deleted
These modules/files implement concepts that the pivot replaces entirely. They'll be re-implemented in new crates:
| What | Reason |
|------|--------|
| `crates/alknet-core/` | Replaced by new `alknet-core` v2 with ALPN router |
| `crates/alknet/` | CLI will be rebuilt for new model |
| `crates/alknet-napi/` | NAPI will be rebuilt as call protocol client |
| `docs/architecture/` | Old model specs — will be replaced by SDD process |
| `docs/research/core.md` | Three-layer model — superseded |
| `docs/research/services.md` | irpc service layer — superseded |
| `docs/research/storage.md` | Metagraph — deferred |
| `docs/research/flow.md` | FlowGraph — deferred |
| `docs/research/configuration.md` | Promoted to architecture already |
| `docs/research/integration-plan.md` | Old model integration — superseded |
| `docs/research/phase2/` | StreamInterface/MessageInterface, CredentialProvider — superseded |
| `docs/research/event-sourcing/` | Not currently needed |
| `docs/research/references/gitserver/` | MPL-2.0 licensed — licensing risk |
| `docs/research/references/gitlfs/` | MIT/Apache — kept as fork candidate, moved to references |
| `docs/research/references/honker/` | Biased toward old irpc model |
| `docs/research/references/nats.rs/` | Not directly used |
| `docs/research/references/distributed-identity/` | Deferred |
| `docs/research/references/openstack-keystone/` | Not directly used |
| `docs/research/references/polyglot/` | Not directly used |
| `docs/research/references/rustfs/` | Not directly used (may return for alknet-fs) |
| `docs/references/` | Stray duplicate directory |
| `tasks/` | Old task graph — will be regenerated by SDD process |
| What | Lines | Reason |
|------|-------|--------|
| `src/interface/mod.rs` | 140 | `StreamInterface` / `MessageInterface` — replaced by `ProtocolHandler` |
| `src/interface/pairs.rs` | 122 | Transport/interface validation — no longer needed |
| `src/interface/config.rs` | 270 | `ListenerConfig` variants — replaced by ALPN advertisement |
| `src/interface/session.rs` | 62 | `InterfaceSession` / `InterfaceEvent` — old model |
| `src/interface/http.rs` | 66 | Old HTTP interface — becomes `alknet-http` handler |
| `src/interface/dns.rs` | 47 | Old DNS interface — becomes `alknet-dns` handler |
| `src/interface/raw_framing.rs` | 399 | Stealth mode byte-peek — replaced by ALPN negotiation |
| `src/server/stealth.rs` | 316 | Stealth mode — replaced by ALPN negotiation |
| `src/server/control_channel.rs` | 196 | SSH control channel for pubsub — old model |
### What was kept
**Keep as-is (port later):**
| What | Reason |
|------|--------|
| `crates/alknet-secret/` | Standalone crate, no alknet-core dependency, fully working |
| `docs/research/pivot/` | The pivot proposal and this cleanup plan |
| `docs/research/references/iroh/` | ALPN dispatch, QUIC endpoints — directly relevant |
| `docs/research/references/ssh/` | russh, russh-sftp — directly relevant for alknet-ssh |
| `docs/research/ops/` | fail2ban, certbot — production reference |
| `docs/sdd_process.md` | The development process we follow |
| `Cargo.toml` (workspace) | Updated to only include alknet-secret |
| `Cargo.lock` | Preserved for alknet-secret dependencies |
| `LICENSE-MIT`, `LICENSE-APACHE` | License files |
| `README.md` | Updated for greenfield state |
| What | Lines | Destination |
|------|-------|-------------|
| `src/interface/ssh.rs` | 982 | → `alknet-ssh` (largest single extraction) |
| `src/server/handler.rs` | 974 | → `alknet-ssh` (SSH server handler) |
| `src/server/channel_proxy.rs` | 555 | → `alknet-ssh` (port forwarding proxy) |
| `src/server/serve.rs` | 1526 | → rewrite as ALPN router (keep for reference, rewrite later) |
| `src/call/*` | ~1200 | → `alknet-call` (relatively clean extraction) |
| `src/auth/*` | ~1450 | → `alknet-core` (shared auth/identity) |
| `src/config/*` | ~950 | → `alknet-core` (static/dynamic config) |
| `src/transport/*` | ~1500 | → `alknet-core` (endpoint acceptors) |
| `src/client/*` | ~1900 | → `alknet-ssh` (client session, SOCKS5, forwarding) |
| `src/socks5/*` | ~800 | → `alknet-ssh` (SOCKS5 server) |
| `src/credentials/*` | ~250 | → simplify into `alknet-core` auth |
| `src/http/*` | ~340 | → `alknet-http` |
| `src/error.rs` | ~240 | → `alknet-core` |
| `src/testutil.rs` | ~140 | → `alknet-core` test utilities |
### Reference implementation
### Delete entire crate
The previous codebase is preserved at `/workspace/@alkdev/alknet-main/`. When spec'ing and implementing new crates, the architect and implementation specialists can reference the old code to understand what worked and what didn't. Key modules to port:
| Crate | Reason |
|-------|--------|
| (none yet — `alknet-storage` and `alknet-flowgraph` don't exist as crates) |
| Old module | Lines | Port destination |
|------------|-------|-----------------|
| `src/interface/ssh.rs` | 982 | → `alknet-ssh` |
| `src/server/handler.rs` | 974 | → `alknet-ssh` |
| `src/server/channel_proxy.rs` | 555 | → `alknet-ssh` |
| `src/server/serve.rs` | 1526 | → reference for ALPN router rewrite |
| `src/call/*` | ~1200 | → `alknet-call` |
| `src/auth/*` | ~1450 | → `alknet-core` |
| `src/config/*` | ~950 | → `alknet-core` |
| `src/transport/*` | ~1500 | → `alknet-core` |
| `src/client/*` | ~1900 | → `alknet-ssh` |
| `src/socks5/*` | ~800 | → `alknet-ssh` |
The current workspace only has `alknet-core`, `alknet-secret`, `alknet-napi`, and `alknet` (CLI). No storage or flowgraph crates exist to delete.
**The old code is reference, not constraint.** Agents should understand what it did and why, then implement against the new ProtocolHandler trait and ALPN router — not copy-paste the old architecture.
---
@@ -243,17 +261,20 @@ Key architecture docs the architect will need to produce or rewrite:
## Execution Order
1. **Create `docs/_archived/` directory** and move files there (preserves git history)
2. **Mark superseded ADRs** with `Superseded` status and pivot reference
3. **Move obsolete research docs** to `docs/_archived/research/`
4. **Annotate stale-but-keeping architecture docs** with `status: needs-update` frontmatter and pivot reference note
5. **Delete replaced code modules** from `alknet-core` (interface layer, stealth, control channel)
6. **Fix compilation** — removing modules will break imports. Fix them minimally (comment out, stub, or remove call sites) so the project compiles. This is temporary scaffolding, not the refactor.
7. **Architect produces proper SDD architecture specs** per Phase 1 of the SDD process
1. ~~Create `docs/_archived/` directory~~ **Greenfield instead.** Old code preserved at `/workspace/@alkdev/alknet-main/`.
2. ~~Mark superseded ADRs~~ **Deleted.** Old architecture docs removed entirely. New ADRs will be created by the architect per SDD process.
3. ~~Move obsolete research docs~~ **Deleted.** Only kept directly relevant references (iroh, ssh, ops, pivot).
4. ~~Annotate stale-but-keeping architecture docs~~ **Deleted.** No stale docs remain. Architect will produce fresh specs.
5. **Delete old source crates** (alknet-core, alknet, alknet-napi) — done
6. **Update workspace Cargo.toml** to only include alknet-secret — done
7. **Update README.md** for greenfield state — done
8. **Verify compilation**: `cargo check` and `cargo test -p alknet-secret` both pass — done
9. **Architect produces proper SDD architecture specs** per Phase 1 of the SDD process
After this cleanup, the repo should:
- Compile (possibly with reduced functionality)
- Have no references to `StreamInterface`, `MessageInterface`, `ListenerConfig`, or stealth mode in active docs
- Have superseded ADRs clearly marked so agents don't implement the old model
- Have all obsolete material in `docs/_archived/` where it won't bias agents
- Be ready for the architect role to produce proper Phase 1 architecture specs following the SDD process
After this cleanup, the repo:
- Compiles cleanly (alknet-secret passes all 14 tests)
- Has no old architecture docs, ADRs, or task graph
- Has only directly relevant reference material (iroh, ssh, ops)
- Has the pivot proposal and cleanup plan as the starting point
- Has a clean workspace ready for the architect to produce Phase 1 specs
- Has the reference implementation at `/workspace/@alkdev/alknet-main/`


@@ -1,771 +0,0 @@
# Research: Distributed Identity, Smart Contract ACL, and Decentralized Git
> Status: Research Reference
> Created: 2026-06-08
> Scope: Decentralized git hosting, distributed identity, smart contract-based access control, and their relevance to alknet
## Table of Contents
1. [Executive Summary](#1-executive-summary)
2. [Source Concept: NFT-Based Decentralized Git](#2-source-concept-nft-based-decentralized-git)
3. [Existing Projects](#3-existing-projects)
4. [Identity on the Blockchain](#4-identity-on-the-blockchain)
5. [Access Control Models for Distributed Git](#5-access-control-models-for-distributed-git)
6. [Cryptographic Identity Mapping](#6-cryptographic-identity-mapping)
7. [Gossip Protocols for Repo Synchronization](#7-gossip-protocols-for-repo-synchronization)
8. [Relevance to Alknet](#8-relevance-to-alknet)
9. [References](#9-references)
---
## 1. Executive Summary
This document researches distributed identity systems, smart contract-based access control, and decentralized git platforms to inform alknet's architecture. The source concept — a decentralized, censorship-resistant git hosting platform using NFTs (ERC-721) for identity and smart contracts for ACL — directly inspired some of alknet's cryptographic identity and key derivation ideas. The research reveals several key findings:
**Key Findings:**
1. **Radicle is the most mature decentralized git system** and provides the closest production reference for alknet's architecture, particularly in Ed25519 identity, gossip-based replication, and self-certifying repositories. However, Radicle lacks the smart contract/on-chain ACL layer that the source concept envisions.
2. **Smart contract ACL is feasible but introduces latency trade-offs.** On-chain identity verification takes 0.5-5 seconds per lookup on L2s, making it unsuitable as a hot path. The correct pattern is on-chain registration + local cache, which aligns with alknet's `StorageIdentityProvider` approach.
3. **alknet's BIP39/SLIP-0010 key derivation already spans both worlds.** The `m/74'/0'/0'/0'` path for Ed25519 identity and `m/44'/60'/0'/0/0` for Ethereum signing means the same seed phrase that governs alknet authentication can also sign on-chain transactions — no separate wallet needed.
4. **The Identity + IdentityProvider model maps directly to decentralized identity.** `ConfigIdentityProvider` is the local-only mode (Radicle-like); `StorageIdentityProvider` is the cached mode (on-chain ACL mirrored to SQLite); a future `OnChainIdentityProvider` could verify against smart contracts.
5. **The distinction between domain events and integration events (from alknet's event sourcing research) is the correct pattern** for synchronizing on-chain state to local nodes. On-chain events are the source of truth; honker streams carry the projected local state.
---
## 2. Source Concept: NFT-Based Decentralized Git
The originating concept for this research is a decentralized, censorship-resistant git hosting platform built on the following principles:
### 2.1 Core Architecture
| Component | Mechanism | Purpose |
|-----------|----------|---------|
| **Org/User Identity** | Transferable ERC-721 tokens | Organizations and users are NFTs; ownership is on-chain and transferable |
| **Repository Identity** | ERC-721 tokens owned by org/user tokens | Repos are NFTs with a `mapping(address => Role)` ACL |
| **Replicators** | User/org nodes listing replicated repos + public endpoints | Decentralized hosting; replicators choose what to mirror |
| **Gossip Protocol** | Push/pull notifications about repo updates | Replicators learn about new commits from tracked repos |
| **Push Authorization** | Identity's on-chain ACL verified by replicator | No central authority can ban; replicators individually verify write privileges |
| **Funding Model** | After-the-fact Patreon-like contributions | Replicators receive donations; no paywall for access |
### 2.2 Key Design Properties
- **No central authority**: No single entity can ban an org, user, or repo
- **Individual replicator choice**: Each replicator independently decides what to replicate and whose pushes to accept
- **Transferable identity**: Selling the org NFT transfers all repos and access permissions
- **Self-certifying data**: Git content addresses + on-chain identity = verifiable data provenance
### 2.3 Critical Gaps in the Source Concept
| Gap | Issue | Solution Pattern |
|-----|-------|-----------------|
| **Hot path latency** | On-chain ACL look-up per push is too slow | Cache ACL locally; sync from chain events |
| **Key rotation** | If the private key controlling the NFT is lost, the identity is lost | Multi-delegate thresholds (like Radicle) + social recovery |
| **Fork/namespace collisions** | Multiple repos with same name under different orgs | Use on-chain IDs (token IDs) not human-readable names as the authoritative identifier |
| **Gas costs** | Every ACL change costs gas | Batch updates; use L2s (Base, Arbitrum); delegate to replicator-level local ACL |
| **Revocation propagation** | Revoking write access must propagate to all replicators | Event-driven: on-chain Revoked event → gossip notification → local ACL update |
---
## 3. Existing Projects
### 3.1 Radicle (radicle.xyz)
**Overview**: Radicle is an open-source, peer-to-peer code collaboration stack built on Git. It is the most mature decentralized git system currently in production (v1.x, Heartwood release).
#### Identity System
| Feature | Implementation |
|---------|---------------|
| **Node ID (NID)** | Ed25519 public key encoded as a DID (`did:key:z6Mk...`) |
| **Key format** | Ed25519 (same curve as alknet) |
| **Storage** | SSH-format key files; `MemorySigner` holds decrypted key in RAM |
| **Multi-device** | Currently one key per device (per RIP-0002); multi-device via threshold delegates is in development |
| **Identity Document** | JSON document stored in Git, listing delegates (DIDs) and a threshold for canonical updates |
**Relevance to alknet**: Radicle's NID system is architecturally very close to alknet's Ed25519-based identity. Both use:
- Ed25519 as the primary key type
- A single seed/identity as the root of trust
- DID-like identifiers for inter-node communication
- Cryptographic signatures for data verification
**Key difference**: Radicle uses pure Ed25519 keypairs directly (no hierarchical derivation), while alknet derives Ed25519 keys from a BIP39 seed phrase via SLIP-0010. This gives alknet the ability to derive multiple keys from a single root and to derive Ethereum signing keys from the same seed.
#### Gossip Protocol
Radicle uses a custom gossip protocol with three message types:
| Message Type | Purpose | Content |
|-------------|---------|---------|
| **Node Announcement** | Peer discovery | Node ID, alias, addresses, capabilities, timestamp |
| **Inventory Announcement** | Repo discovery | List of RepoIDs being seeded, timestamp |
| **Reference Announcement** | Repo update notification | RepoID + updated signed refs, timestamp |
Each announcement includes a cryptographic signature and timestamp, enabling verification before relay. Messages are dropped on re-encounter (epidemic-style deduplication). Bootstrap nodes seed peer discovery.
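The drop-on-re-encounter rule can be illustrated with a small sketch. This is not Radicle's implementation: the `Announcement` fields and the `RelayFilter` type are hypothetical names chosen for illustration, and signature verification is assumed to have happened before the filter is consulted.

```rust
use std::collections::HashSet;

/// Hypothetical announcement shape; the field names are illustrative,
/// not Radicle's wire format.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Announcement {
    pub node_id: String,
    pub timestamp: u64,
    pub payload: String,
}

/// Tracks announcements already seen so each one is relayed at most once.
#[derive(Default)]
pub struct RelayFilter {
    seen: HashSet<Announcement>,
}

impl RelayFilter {
    /// Returns true if the announcement is new and should be relayed,
    /// false if it was seen before and should be dropped.
    /// Assumes the signature was already verified by the caller.
    pub fn should_relay(&mut self, ann: &Announcement) -> bool {
        // HashSet::insert returns false when the value was already present.
        self.seen.insert(ann.clone())
    }
}
```

A production node would also need to bound the size of the seen-set (for example by expiring old timestamps); that is omitted here.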
**Comparison with alknet's call protocol**: Radicle's gossip is metadata-only; actual data transfer uses Git protocol. alknet's approach uses a call protocol (`EventEnvelope`) for both metadata and operation invocation. The gossip pattern could be layered on top of alknet's call protocol as a subscription-based integration event mechanism.
#### Self-Certifying Repositories
Radicle repositories are **self-certifying**:
- The Repository ID (RID) is derived from the initial identity document hash
- All actions (commits, issue comments, patches) are cryptographically signed
- **Delegates** are public keys authorized to update the identity document
- A **threshold** defines how many delegates must sign for an update to be canonical
- Canonical branches are established dynamically based on signature thresholds
This eliminates the need for a central authority to determine "which version is correct."
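As a rough sketch of the threshold rule, the check below counts distinct delegates among the signers of an update. The function name `is_canonical` and the use of plain strings for keys are assumptions made for illustration; signature verification is assumed to have been done already.

```rust
use std::collections::HashSet;

/// Returns true when at least `threshold` distinct delegates appear among
/// the signers of an update. `signers` is assumed to contain only keys
/// whose signatures have already been verified.
pub fn is_canonical(delegates: &HashSet<&str>, signers: &[&str], threshold: usize) -> bool {
    // Deduplicate signers and keep only those listed as delegates.
    let valid: HashSet<&str> = signers
        .iter()
        .copied()
        .filter(|s| delegates.contains(*s))
        .collect();
    // A threshold of zero is treated as invalid rather than always satisfied.
    threshold > 0 && valid.len() >= threshold
}
```

Deduplicating before counting matters: a single delegate signing twice must not satisfy a threshold of two.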
**Relevance**: alknet's on-chain ACL concept (from the source) can use this threshold model. Instead of a single NFT owner dictating the canonical branch, a threshold of delegates can be required — this mirrors the `narrowed_scopes` / `DelegatesEdge` model in alknet's ACL graph.
#### Collaborative Objects (COBs)
COBs are Radicle's mechanism for distributed social artifacts (issues, patches, code review):
- Stored as Git objects in `refs/cobs/<type>/<object-id>` namespace
- Use CRDT DAG (Directed Acyclic Graph) for conflict-free merging
- All operations are Ed25519-signed by their author
- SQLite cache (`cobs.db`) provides indexed queries without traversing Git history
**Relevance**: COBs demonstrate that complex social data can be stored in Git with CRDT semantics. alknet's `alknet-storage` metagraph + honker streams could serve a similar role for distributed state, with the key difference being that alknet's state store is SQLite-backed rather than Git-backed, making it more efficient for real-time operations.
#### Summary Assessment
| Dimension | Radicle | alknet (proposed) |
|-----------|---------|-------------------|
| **Identity** | Ed25519 keypair (DID) | Ed25519 from SLIP-0010 + Ethereum key from same seed |
| **Naming** | No global naming; NID is identifier | On-chain NFT ID + human-readable name (via ENS or custom) |
| **Access Control** | Threshold delegates in identity doc | Smart contract ACL + local graph cache |
| **Replication** | Gossip for metadata, Git for data | Call protocol + (future) gossip subscriptions |
| **Data Storage** | Git objects + SQLite cache | SQLite (metagraph/honker) + Git-compatible |
| **Censorship Resistance** | P2P, no authority | P2P + on-chain identity (uncensorable registration) |
| **Funding Model** | Community-funded seed nodes | After-the-fact contributions (replicators) |
### 3.2 ForgeFed (Forgejo Federation)
**Overview**: ForgeFed is an ActivityPub-based federation protocol for software forges. It enables Gitea/Forgejo instances to interoperate — users on one instance can open issues and submit PRs on another without creating separate accounts.
| Feature | Details |
|---------|---------|
| **Protocol** | ActivityPub (same as Mastodon, PeerTube) |
| **Identity** | Web-based (user@example.com format, like email) |
| **ACL** | Per-instance ACL; no on-chain verification |
| **Censorship Resistance** | Limited; instances can block each other |
| **Status** | Forgejo implementing; Vervis is reference implementation |
**Relevance to alknet**: ForgeFed shows how federation works without blockchain. It uses ActivityPub for cross-instance communication, which is analogous to alknet's call protocol for cross-node communication. However, ForgeFed relies on instance-level trust (each Forgejo admin controls their instance), while alknet's concept uses on-chain identity for trust.
**Key takeaway**: ForgeFed's federation model is complementary, not competitive, with blockchain identity. An alknet node could expose a ForgeFed-compatible interface for interop with existing forges while using on-chain identity for internal trust decisions.
### 3.3 Git-Based Smart Contract Projects
| Project | Chain | Approach | Status |
|---------|-------|----------|--------|
| **GitBross** | Solana/Arbitrum + IPFS | Repos backed up to IPFS; smart contracts for metadata | Active |
| **GitLike** | Ethereum + IPFS | Browser-based decentralized VCS | Experimental |
| **Statik** | IPFS | Version control on IPFS with content-addressed storage | Experimental |
| **PineSU** | Ethereum | Git repos + blockchain for integrity/timestamping | Research paper |
**Common patterns**:
- IPFS for content-addressed storage of git objects
- Smart contracts for metadata (ownership, ACL, provenance)
- Ethereum or L2 for on-chain verification
- Git bridge tools that push to both IPFS and traditional remotes
**Key insight**: None of these projects have achieved widespread adoption. The main challenges are:
1. **Performance**: IPFS retrieval is slower than centralized git hosting
2. **UX**: Browser-based git clients lack feature parity with CLI tools
3. **Incentives**: No sustainable funding model for replicators
alknet's approach of using traditional git remotes with a smart contract ACL overlay avoids the IPFS performance trap while still providing censorship resistance.
### 3.4 NFT-Based Access Control Systems
Several projects use NFTs (ERC-721) for access gating:
| Pattern | Mechanism | Example |
|---------|-----------|---------|
| **Token-gated content** | Wallet verification proves NFT ownership before granting access | NFT-gated websites, Discord roles |
| **Role-based ACL via NFT** | NFTs represent roles; smart contract checks `balanceOf(address) > 0` | Token-gated DAOs, access-controlled channels |
| **Namespace NFTs** | Each NFT represents a namespace/org; sub-rights derive from ownership | ENS domains, NFT-based guild systems |
**Solidity Pattern for Repository ACL**:
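```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import "@openzeppelin/contracts/token/ERC721/ERC721.sol";

// Simplified example: NFT-based org/repo with on-chain ACL
contract OrgToken is ERC721 {
    // Ordered so that >= expresses "at least this level"
    enum Role { None, Member, Admin }
    enum Permission { None, Read, Write }

    struct Org {
        address owner;
        mapping(address => Role) members; // ACL mapping
    }

    struct Repo {
        uint256 orgTokenId; // Owning org
        mapping(address => Permission) collaborators;
    }

    mapping(uint256 => Org) private orgs;   // keyed by org token ID
    mapping(uint256 => Repo) private repos; // keyed by repo ID

    constructor() ERC721("OrgToken", "ORG") {}

    function canPush(uint256 repoId, address user) external view returns (bool) {
        Repo storage repo = repos[repoId];
        // Check direct permission
        if (repo.collaborators[user] >= Permission.Write) return true;
        // Check org membership
        Org storage org = orgs[repo.orgTokenId];
        if (org.members[user] >= Role.Member) return true;
        return false;
    }
}
```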
**Performance considerations**: A `canPush()` check on L2 (Base, Arbitrum) costs ~0.001-0.01 USD and takes 0.5-2 seconds. This is acceptable for occasional operations (repo creation, ACL changes) but not for per-push verification. Caching is essential.
**Relevance to alknet**: The mapping from on-chain ACL to alknet's local ACL graph is direct:
- ERC-721 token ID → `PrincipalNode` in alknet's ACL metagraph
- `collaborators` mapping → `DelegatesEdge` with `narrowed_scopes`
- `canPush()` → alknet's `check_access()` function
---
## 4. Identity on the Blockchain
### 4.1 ERC-721 as Identity/Namespace Tokens
**How it works**: Each unique identity (org, user, namespace) is an ERC-721 NFT. The token ID is the on-chain identifier; metadata (display name, avatar, public key) is stored off-chain (IPFS or DNS).
**Advantages**:
- Inherent transferability (sell/gift an org identity)
- On-chain ownership verification
- Metadata can include cryptographic public keys for off-chain verification
- Composable with other on-chain protocols (DAO governance, treasury)
**Disadvantages**:
- Gas costs for every state change
- Key rotation requires a transaction (can't just change a local file)
- Metadata availability depends on off-chain storage
- Privacy: all ACL changes are public on-chain
**Resolution pattern**: Use on-chain registration as the root of trust, but resolve identity locally via cached data. This is exactly how DNS works — the zone file is authoritative, but resolvers cache it.
### 4.2 ENS (Ethereum Name Service) as a Naming Layer
**Overview**: ENS maps human-readable names (e.g., `alice.eth`) to machine-readable identifiers (Ethereum addresses, content hashes, text records).
| Feature | Implementation |
|---------|---------------|
| **Name resolution** | `alice.eth` → Ethereum address (NFT owner) |
| **Text records** | Store arbitrary key-value data (avatar, email, public key, SSH key) |
| **Subdomains** | `git.alice.eth` can point to a replicator endpoint |
| **Resolver** | Smart contract that returns records for a name |
| **Off-chain look-up** | CCIP-read (EIP-3668) allows resolving names via external data |
**Relevance to alknet**: ENS text records can store alknet node identifiers:
- `alk.id` text record → alknet Node ID (Ed25519 public key fingerprint)
- `alk.pubkey` text record → Ed25519 public key (for SSH authentication)
- `alk.replicator` text record → endpoint URL (for repo discovery)
This creates a human-friendly naming overlay on top of alknet's cryptographic identifiers. Combined with DNS TXT records (alknet's planned DNS naming layer), it provides multiple resolution paths.
**Limitation**: ENS resolution requires an Ethereum RPC call, which adds latency. For production use, ENS data should be cached locally and refreshed periodically, similar to DNS TTLs.
### 4.3 Smart Contracts as ACL/Naming Services
**Pattern**: A smart contract stores the ACL mapping and provides a view function for verification. This is the "source of truth" that local caches sync from.
```
On-chain ACL contract (source of truth)
│ events: RoleGranted, RoleRevoked, RepoCreated, etc.
alknet-storage (local cache)
├── ACL metagraph (PrincipalNode + DelegatesEdge)
├── Synced from on-chain events
└── Used for hot-path access checks
```
**Event-driven sync pattern** (critical for alknet):
1. Smart contract emits `RoleGranted(address, repoId, role)` event
2. alknet head node listens to these events (via Ethereum log subscription)
3. Event is projected into the ACL metagraph as a `DelegatesEdge` with `narrowed_scopes`
4. Local access checks use the metagraph (fast, SQLite)
5. Periodic consistency check ensures local cache matches on-chain state
This maps directly to alknet's **event boundary discipline**:
- On-chain events = external source of truth (like domain events from another service)
- ACL metagraph = local projection (like an integration event or read model)
- Honker stream `acl:updated` = notification that the local cache changed (integration event)
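The projection step can be sketched as a small state machine. Everything here is a stand-in: `ChainEvent` mirrors the event names used above, and `AclCache` is a simplified substitute for the ACL metagraph (a flat map instead of `PrincipalNode`/`DelegatesEdge`). The boolean returned by `apply` marks the point where an `acl:updated` notification would be published.

```rust
use std::collections::HashMap;

/// Hypothetical mirror of the contract events named above.
#[derive(Debug, Clone, PartialEq)]
pub enum ChainEvent {
    RoleGranted { address: String, repo_id: u64, role: String },
    RoleRevoked { address: String, repo_id: u64 },
}

/// Simplified stand-in for the ACL metagraph: (address, repo) -> role.
#[derive(Default)]
pub struct AclCache {
    edges: HashMap<(String, u64), String>,
}

impl AclCache {
    /// Applies one on-chain event to the local projection.
    /// Returns true only if the cached state actually changed.
    pub fn apply(&mut self, event: ChainEvent) -> bool {
        match event {
            ChainEvent::RoleGranted { address, repo_id, role } => {
                // insert returns the previous role, if any.
                self.edges.insert((address, repo_id), role.clone()) != Some(role)
            }
            ChainEvent::RoleRevoked { address, repo_id } => {
                self.edges.remove(&(address, repo_id)).is_some()
            }
        }
    }

    /// Hot-path lookup against the local projection.
    pub fn role(&self, address: &str, repo_id: u64) -> Option<&str> {
        self.edges.get(&(address.to_string(), repo_id)).map(String::as_str)
    }
}
```

Because `apply` is idempotent, replaying the same event during a consistency check does not trigger a spurious notification.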
### 4.4 Decentralized Identity Standards
#### W3C DIDs (Decentralized Identifiers)
**Overview**: DIDs are a W3C standard for verifiable, self-sovereign digital identifiers. A DID is a URI that resolves to a DID Document describing how to interact with the identity holder.
| DID Method | Resolution | Key Type | Use Case |
|-----------|-----------|----------|----------|
| `did:key` | Static (no registry) | Ed25519, secp256k1, etc. | Radicle uses this; self-certifying |
| `did:ethr` | Ethereum registry | secp256k1 | Blockchain-verifiable identity |
| `did:web` | DNS/web server | Any | Traditional web PKI bridge |
| `did:ion` | Bitcoin Sidetree | secp256k1 | Microsoft's DID system |
**Relevance**: Radicle uses `did:key` with Ed25519 keys. alknet could use `did:key` for local identity (same key type!) and extend to `did:ethr` for on-chain identity, using the same seed phrase to derive both keys.
#### Verifiable Credentials (VCs)
**Overview**: VCs are tamper-evident, cryptographically secure attestations issued by a trusted authority. Think of them as digital certificates (driver's license, degree) that the holder presents to a verifier.
**Application to git access**: A VC could attest that "this Ed25519 public key has write access to repo X." The issuer is the org's NFT contract (or a delegate). VCs can be verified off-chain, reducing on-chain transaction costs.
**alknet mapping**: VCs are analogous to alknet's `Identity` struct with `scopes` and `resources`. A VC issuance maps to the creation of a `DelegatesEdge` in the ACL graph. The key difference is that VCs are bearer tokens (anyone who holds one can present it), while alknet's ACL is graph-based (the principal must be connected to the resource via edges).
---
## 5. Access Control Models for Distributed Git
### 5.1 Git's Own ACL Model
Git has limited built-in ACL. Access control is typically enforced at the transport layer:
| Mechanism | Layer | Scope |
|-----------|-------|-------|
| **`pre-receive` hook** | Server-side | Reject pushes based on branch, author, file patterns |
| **`update` hook** | Server-side | Per-ref checks (branch-level protection) |
| **`post-receive` hook** | Server-side | Post-push actions (notifications, CI triggers) |
| **SSH key mapping** | Transport | `authorized_keys` → system user → filesystem permissions |
| **HTTP basic auth** | Transport | Username/password → Git smart HTTP |
| **Gitolite** | Server-side | Config-file-based ACL mapping SSH keys to repos and permissions |
**Gitolite pattern** (most relevant for distributed git):
- `~/.ssh/authorized_keys` maps SSH keys to Gitolite users
- `~/.gitolite/conf/gitolite.conf` defines repos and permissions
- Permission levels: `R` (read), `RW` (read+write), `RW+` (read+write+force-push)
- Wildcard repos: `CREATOR/..*` — users can create repos matching patterns
**alknet mapping**: Gitolite's config file is the analog of alknet's ACL metagraph. The key difference is that Gitolite is centralized (one config file), while alknet's ACL can be distributed (synced from on-chain events).
### 5.2 Decentralized Write Permission Without Central Authority
In a truly decentralized system, no single node controls access. Several patterns exist:
#### Pattern 1: Self-Certifying Repositories (Radicle)
- The repo creator defines an identity document listing delegates
- Delegates are Ed25519 public keys with a threshold
- Only delegate signatures on refs are considered canonical
- Replicators accept any push but only replicate refs signed by sufficient delegates
**Trade-off**: Simple, no on-chain costs, but no mechanism for human-readable names or transferable ownership.
#### Pattern 2: On-Chain ACL (Source Concept)
- Smart contract stores `mapping(address => Role)` for each repo
- Replicators verify pusher's address against the contract before accepting
- Ownership is transferable (the NFT can be sold)
- Gas costs for setup and ACL changes
**Trade-off**: Transferable ownership and verifiable ACL, but requires Ethereum interaction and introduces latency.
#### Pattern 3: Hybrid — On-Chain Root + Local Cache
- On-chain contract defines who owns each org/repo NFT
- Local ACL graph caches on-chain state and adds local rules
- Hot-path checks use local cache (SQLite, fast)
- Cold-path operations (ACL changes, ownership transfers) go on-chain
- Local cache is periodically verified against on-chain state
**This is the recommended pattern for alknet.** It combines:
- On-chain censorship resistance (no single authority can revoke identity)
- Local performance (ACL checks are SQLite-fast)
- Transferable ownership (NFT can be sold/transferred on-chain)
- Graceful degradation (local ACL still works when chain is unavailable)
### 5.3 Radicle's Approach to Identity and Verification
Radicle's identity model has specific properties worth detailed comparison:
| Property | Radicle | alknet (proposed) |
|----------|---------|-------------------|
| **Identity root** | Ed25519 keypair (generated locally) | BIP39 seed phrase → SLIP-0010 derivation |
| **Identity document** | JSON in Git, signed by delegates | On-chain NFT + local ACL metagraph |
| **Delegate model** | Threshold of N public keys | Threshold of N delegates (on-chain or local) |
| **Key rotation** | Add/remove delegates via identity doc update | Transfer NFT to new address; update local keys |
| **Multi-device** | One key per device (RIP-0002) | One key per device derived from same seed (`m/74'/0'/0'/{n}'`) |
| **Namespace collision** | RID is content-hash, collision-free | NFT token ID is unique; human names via ENS |
| **Revocation** | Remove delegate from identity doc | On-chain ACL change + local cache update |
| **Verification** | Signature verification against delegate list | Signature verification + on-chain ACL check |
**alknet advantage**: Deriving multiple keys from one seed means:
- Multi-device support is built-in (derive a key per device)
- No "one key per identity" limitation
- The same seed provides identity keys, encryption keys, SSH keys, and Ethereum signing keys
- Key rotation for a single device means deriving a new key at the next index and updating the local configuration to use it
**alknet challenge**: If the seed phrase is lost, all derived keys are lost. Mitigation strategies:
- Social recovery (N-of-M threshold: trusted contacts hold shards)
- Hardware security module (HSM) protection for the seed
- Multi-sig on key operations (require threshold of devices to authorize)
---
## 6. Cryptographic Identity Mapping
### 6.1 Ed25519 Keys (alknet's Key Type)
alknet uses Ed25519 as the primary key type for:
- SSH authentication (fingerprint-based verification)
- Node identity (Node IDs are Ed25519 public keys)
- Channel signing (call protocol event signatures)
**Relevant properties of Ed25519**:
- 32-byte public key, 64-byte private key (or 32-byte seed + 32-byte public key)
- Deterministic signatures (same message, same key → same signature)
- Fast verification (~3x faster than secp256k1)
- Used in SSH (since OpenSSH 6.5), Tor onion services, Signal
**SLIP-0010 derivation** (what alknet uses):
- SLIP-0010 generalizes BIP-32 to non-secp256k1 curves
- Ed25519 derivation uses **hardened keys only** (cannot derive child public keys from parent public key)
- This means: the master seed must be available to derive any child key
- alknet's secret service holds the seed in RAM and derives keys on demand
### 6.2 Blockchain Private Keys vs SSH Keys
The key question for mapping blockchain identity to git access is: **how does an Ed25519 SSH key relate to a secp256k1 Ethereum key?**
| Key Type | Curve | Use Case | alknet Derivation Path |
|----------|-------|----------|----------------------|
| Identity key | Ed25519 | SSH auth, node identity | `m/74'/0'/0'/0'` |
| Device key | Ed25519 | Per-device identity | `m/74'/0'/0'/{n}'` |
| SSH host key | Ed25519 | Server identity | `m/74'/0'/1'/0'` |
| Encryption key | AES-256-GCM | External credential encryption | `m/74'/2'/0'/0'` |
| Ethereum key | secp256k1 | Smart contract signing | `m/44'/60'/0'/0/0` |
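The hardened-only constraint on Ed25519 paths can be checked mechanically. The sketch below parses the path strings from the table into raw child indices; the function names are hypothetical and no key material is derived, only the path structure is validated.

```rust
/// Offset added to an index to mark it as hardened (BIP-32 / SLIP-0010).
const HARDENED: u32 = 0x8000_0000;

/// Parses a path such as "m/74'/0'/0'/0'" into raw child indices, with the
/// hardened bit set for components that end in an apostrophe.
pub fn parse_path(path: &str) -> Option<Vec<u32>> {
    let mut parts = path.split('/');
    if parts.next()? != "m" {
        return None;
    }
    parts
        .map(|part| match part.strip_suffix('\'') {
            Some(n) => n.parse::<u32>().ok().filter(|i| *i < HARDENED).map(|i| i | HARDENED),
            None => part.parse::<u32>().ok().filter(|i| *i < HARDENED),
        })
        .collect()
}

/// SLIP-0010 Ed25519 derivation only supports hardened children.
pub fn valid_for_ed25519(path: &str) -> bool {
    matches!(parse_path(path), Some(indices) if indices.iter().all(|i| (i & HARDENED) != 0))
}

/// Per-device identity path following the `m/74'/0'/0'/{n}'` pattern.
pub fn device_path(n: u32) -> String {
    format!("m/74'/0'/0'/{}'", n)
}
```

Under this check the Ethereum path `m/44'/60'/0'/0/0` is rejected for Ed25519 because its last two components are not hardened, which is consistent with it being used only on the secp256k1 side.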
**The bridge**: Both keys derive from the **same BIP39 seed phrase**. The secret service can sign an Ethereum transaction using the secp256k1 key and also authenticate SSH using the Ed25519 key. This creates a cryptographically linked identity pair:
- On-chain identity (Ethereum address derived from `m/44'/60'/0'/0/0`)
- Off-chain identity (Ed25519 key derived from `m/74'/0'/0'/0'`)
**Binding them**: To prove that the Ed25519 key and the Ethereum key belong to the same entity:
1. Sign a message with the Ed25519 key: `"I, <Ed25519-pubkey>, attest that my on-chain identity is <Ethereum-address>"`
2. Store this attestation on-chain (in the org/user NFT metadata)
3. Anyone can verify: the on-chain address owns the NFT, and the attestation links the SSH key to that address
This is the **key binding mechanism** that connects alknet's SSH-based authentication to on-chain identity.
### 6.3 Deriving Repository Access from On-Chain Identity
The complete flow for a push operation in a decentralized git system with on-chain ACL:
```
1. Client connects to replicator via SSH
2. SSH auth succeeds (Ed25519 key verified by alknet IdentityProvider)
3. Client pushes to repo X
4. Replicator checks:
a. Local ACL metagraph: does this Ed25519 key have write access to repo X?
b. If local ACL is stale, re-verify against on-chain contract
5. If authorized: accept push, gossip update to other replicators
6. If not: reject with "access denied"
```
**Optimization**: Step 4b is rarely needed if the local ACL cache is kept fresh via event subscriptions. The on-chain contract emits events on ACL changes, and the head node's sync process projects these into the local ACL metagraph.
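Steps 4a and 4b can be sketched as a cache lookup with a freshness bound: a fresh entry answers directly, and a missing or stale entry defers to on-chain verification. The types below (`AclCache`, `Decision`) and the use of a fixed maximum age are assumptions made for illustration; the document does not specify how staleness is detected.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny,
    /// Entry missing or too old: fall through to step 4b.
    VerifyOnChain,
}

struct Entry {
    can_write: bool,
    refreshed_at: Instant,
}

pub struct AclCache {
    max_age: Duration,
    // Keyed by (key fingerprint, repo id).
    entries: HashMap<(String, String), Entry>,
}

impl AclCache {
    pub fn new(max_age: Duration) -> Self {
        Self { max_age, entries: HashMap::new() }
    }

    /// Records a result, for example from a projected on-chain event
    /// or from an explicit re-verification.
    pub fn record(&mut self, fingerprint: &str, repo: &str, can_write: bool, now: Instant) {
        self.entries.insert(
            (fingerprint.to_string(), repo.to_string()),
            Entry { can_write, refreshed_at: now },
        );
    }

    /// Step 4a: answer from the cache only while the entry is fresh.
    pub fn check_write(&self, fingerprint: &str, repo: &str, now: Instant) -> Decision {
        match self.entries.get(&(fingerprint.to_string(), repo.to_string())) {
            Some(e) if now.duration_since(e.refreshed_at) <= self.max_age => {
                if e.can_write { Decision::Allow } else { Decision::Deny }
            }
            _ => Decision::VerifyOnChain,
        }
    }
}
```

Passing `now` explicitly keeps the logic testable; with event subscriptions keeping entries fresh, `VerifyOnChain` should be the rare path.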
**alknet's existing support for this flow**:
| Component | Role |
|-----------|------|
| `IdentityProvider` trait | Resolves Ed25519 fingerprint → `Identity` with scopes/resources |
| `ConfigIdentityProvider` | Local-only: reads from `authorized_keys` config |
| `StorageIdentityProvider` | SQLite-backed: queries `peer_credentials` + ACL metagraph |
| `OnChainIdentityProvider` (future) | Verifies against on-chain ACL, falls back to local cache |
| `AuthProtocol` (irpc) | `VerifyPubkey` → `Identity` resolution |
| `CheckAccess` (irpc) | `Identity` + operation → access verification using ACL graph |
| `OperationSpec.access_control` | Declarative access requirements per operation |
---
## 7. Gossip Protocols for Repo Synchronization
### 7.1 Epidemic/Gossip Protocol Fundamentals
Gossip protocols are decentralized dissemination mechanisms inspired by how rumors spread in social networks. Key properties:
- **Eventual consistency**: All nodes eventually receive all updates
- **Fault tolerance**: Works even when nodes join/leave randomly
- **Scalability**: O(log N) gossip rounds to reach all nodes in a network of N nodes
- **No single point of failure**: No coordinator node
### 7.2 Radicle's Gossip Protocol
Radicle uses three message types (detailed in Section 3.1):
- **Node Announcements**: Peer discovery (who's online, where to reach them)
- **Inventory Announcements**: Repo discovery (what repos each node seeds)
- **Reference Announcements**: Update notifications (new commits, new COB operations)
**Anti-entropy mechanism**: Nodes periodically exchange state summaries to ensure they haven't missed any updates. This is similar to Merkle tree-based reconciliation in distributed databases.
**Relevance to alknet**: alknet's call protocol subscription model (`call.requested` with `OperationType::Subscription`) can serve as the transport for gossip messages. The key difference is that alknet's call protocol is request-response oriented, while gossip is push-based. A gossip layer on top of the call protocol would work as follows:
```
alknet gossip layer:
1. Subscribe to `/{node}/gossip/announce` on known peers
2. Receive NodeAnnouncement, InventoryAnnouncement, RefAnnouncement events
3. Forward announcements to other connected peers (with deduplication)
4. For RefAnnouncements of tracked repos, trigger git fetch
```
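Steps 3 and 4 of the layer above hinge on deduplication: an announcement is forwarded and acted on only the first time it is seen. A minimal sketch under that assumption follows. `GossipRelay`, `Announcement`, and `Actions` are hypothetical types, and the unbounded `seen` set would need expiry in a real relay.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Announcement {
    Node { node_id: String },
    Inventory { node_id: String, repo_ids: Vec<String> },
    Ref { repo_id: String, commit: String },
}

/// What the relay decides to do with one incoming announcement.
#[derive(Debug, PartialEq, Eq)]
pub struct Actions {
    pub forward_to: Vec<String>,
    pub fetch_repo: Option<String>,
}

pub struct GossipRelay {
    peers: Vec<String>,
    tracked_repos: HashSet<String>,
    seen: HashSet<Announcement>,
}

impl GossipRelay {
    pub fn new(peers: Vec<String>, tracked_repos: Vec<String>) -> Self {
        Self {
            peers,
            tracked_repos: tracked_repos.into_iter().collect(),
            seen: HashSet::new(),
        }
    }

    pub fn on_announcement(&mut self, from: &str, msg: Announcement) -> Actions {
        // `insert` returns false for a duplicate: drop it silently.
        if !self.seen.insert(msg.clone()) {
            return Actions { forward_to: Vec::new(), fetch_repo: None };
        }
        // Step 3: forward to every connected peer except the sender.
        let forward_to = self.peers.iter().filter(|p| p.as_str() != from).cloned().collect();
        // Step 4: only ref announcements for tracked repos trigger a fetch.
        let fetch_repo = match &msg {
            Announcement::Ref { repo_id, .. } if self.tracked_repos.contains(repo_id) => {
                Some(repo_id.clone())
            }
            _ => None,
        };
        Actions { forward_to, fetch_repo }
    }
}
```

Returning the decisions as data rather than performing I/O keeps the relay logic independent of the call protocol transport.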
### 7.3 Alternative: CRDT-Based Sync
Instead of gossip + git fetch, some systems use CRDTs for repository synchronization:
- **Advantages**: No merge conflicts, automatic convergence
- **Disadvantages**: Large metadata overhead, complex implementation, doesn't map directly to git's object model
**Recommendation for alknet**: Start with gossip + git fetch (as Radicle does) and consider CRDT-based sync for specific metadata (e.g., ACL state, org metadata) while keeping git data as-is. The ACL metagraph changes can propagate via honker streams (which are effectively a form of CRDT merge).
---
## 8. Relevance to Alknet
### 8.1 Identity + IdentityProvider Model
alknet's existing `Identity` struct and `IdentityProvider` trait are already designed for this use case:
```rust
pub struct Identity {
pub id: String, // Fingerprint or UUID
pub scopes: Vec<String>, // Permission scopes
pub resources: Option<HashMap<String, Vec<String>>>, // Resource-level access
}
```
The `id` field serves a dual purpose:
- **Config-based auth**: SSH fingerprint (e.g., `SHA256:abc123...`)
- **Storage-based auth**: Account UUID (e.g., `acc_0123456789`)
**Extended for on-chain identity**, the `id` field could also be:
- **On-chain auth**: Ethereum address (e.g., `0x1234...`) or NFT token ID (e.g., `token_42`)
The `IdentityProvider` trait naturally extends:
```rust
trait IdentityProvider: Send + Sync {
fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
fn resolve_from_token(&self, token: &[u8]) -> Option<Identity>;
}
// Future extension:
// OnChainIdentityProvider resolves Ethereum address + Ed25519 binding
// from on-chain ACL contract, with local metagraph cache
```
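One way the "falls back to local cache" behaviour could be composed is a provider that tries several backends in order. The sketch below repeats the `Identity` struct and the trait from this section so that it compiles on its own; `StaticProvider` and `FallbackProvider` are hypothetical stand-ins, not existing alknet types.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: Option<HashMap<String, Vec<String>>>,
}

pub trait IdentityProvider: Send + Sync {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &[u8]) -> Option<Identity>;
}

/// Hypothetical in-memory provider standing in for a real backend.
pub struct StaticProvider {
    pub by_fingerprint: HashMap<String, Identity>,
}

impl IdentityProvider for StaticProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.by_fingerprint.get(fingerprint).cloned()
    }
    fn resolve_from_token(&self, _token: &[u8]) -> Option<Identity> {
        None // This stand-in does not support tokens.
    }
}

/// Tries each provider in order and returns the first match.
pub struct FallbackProvider {
    pub providers: Vec<Box<dyn IdentityProvider>>,
}

impl IdentityProvider for FallbackProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.providers.iter().find_map(|p| p.resolve_from_fingerprint(fingerprint))
    }
    fn resolve_from_token(&self, token: &[u8]) -> Option<Identity> {
        self.providers.iter().find_map(|p| p.resolve_from_token(token))
    }
}
```

Because `FallbackProvider` itself implements the trait, callers do not need to know how many backends sit behind it.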
### 8.2 OperationRegistry Extension with On-Chain Verification
alknet's `OperationSpec` includes `access_control` fields:
```rust
pub struct AccessControl {
pub required_scopes: Vec<String>,
pub required_scopes_any: Option<Vec<String>>,
pub resource_type: Option<String>,
pub resource_action: Option<String>,
}
```
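To make the scope fields concrete, here is a sketch of how they might be evaluated against an identity's granted scopes. It assumes that `required_scopes` means "all of" and `required_scopes_any` means "at least one of"; the document does not spell out these semantics, and `scopes_satisfied` is a hypothetical helper.

```rust
pub struct AccessControl {
    pub required_scopes: Vec<String>,
    pub required_scopes_any: Option<Vec<String>>,
    pub resource_type: Option<String>,
    pub resource_action: Option<String>,
}

/// Returns true when `granted` holds every scope in `required_scopes` and,
/// if `required_scopes_any` is present and non-empty, at least one of those.
/// Resource-level checks are deliberately left out of this sketch.
pub fn scopes_satisfied(granted: &[String], ac: &AccessControl) -> bool {
    let all_ok = ac.required_scopes.iter().all(|s| granted.contains(s));
    let any_ok = match &ac.required_scopes_any {
        Some(list) if !list.is_empty() => list.iter().any(|s| granted.contains(s)),
        _ => true,
    };
    all_ok && any_ok
}
```

The scope strings used with it (for example `git:read`) are placeholders; the actual scope vocabulary is not defined in this section.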
For on-chain verification, a new `access_control` mode could be added:
```rust
pub enum AccessControlMode {
Local, // Check against local ACL metagraph (current)
OnChain, // Verify against on-chain contract (future)
CachedOnChain, // Check local cache first, verify on-chain on miss/stale (recommended)
}
```
The `AccessControl` struct gains a `mode` field defaulting to `Local`. This is additive and doesn't change existing behavior.
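The additive claim can be illustrated with Rust's `Default` derive: if `Local` is marked as the default variant, any `AccessControl` built without naming `mode` keeps the current behaviour. This is a sketch of one possible shape, not a committed design.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum AccessControlMode {
    /// Check against the local ACL metagraph (current behaviour).
    #[default]
    Local,
    OnChain,
    CachedOnChain,
}

#[derive(Debug, Clone, Default, PartialEq)]
pub struct AccessControl {
    pub required_scopes: Vec<String>,
    pub required_scopes_any: Option<Vec<String>>,
    pub resource_type: Option<String>,
    pub resource_action: Option<String>,
    /// New field; existing call sites that use `..Default::default()`
    /// pick up `Local` without changes.
    pub mode: AccessControlMode,
}
```

Call sites that construct the struct with every field listed explicitly would still need a one-line update, so "additive" holds fully only where struct-update syntax or a builder is used.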
### 8.3 Git Service Adapter for Decentralized Replication
alknet's application service pattern (from services.md) can accommodate a `GitService`:
```rust
#[rpc_requests(message = GitMessage)]
enum GitProtocol {
#[rpc(tx=oneshot::Sender<RepoInfo>)]
#[wrap(GetRepo)]
GetRepo { repo_id: String },
#[rpc(tx=oneshot::Sender<Vec<RepoInfo>>)]
#[wrap(ListRepos)]
ListRepos { org: Option<String> },
#[rpc(tx=oneshot::Sender<bool>)]
#[wrap(CanPush)]
CanPush { repo_id: String, identity: Identity },
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(UpdateMirror)]
UpdateMirror { repo_id: String, refs: Vec<RefUpdate> },
#[rpc(tx=mpsc::Sender<RefAnnouncement>)]
#[wrap(SubscribeRefs)]
SubscribeRefs { repo_ids: Vec<String> },
}
```
This service:
- **Registers with the call protocol** as `/head/git/*`
- **Uses `StorageIdentityProvider`** for `CanPush` checks (with ACL metagraph)
- **Manages git mirrors** (git bare repos on the local filesystem)
- **Propagates updates** via `SubscribeRefs` (which maps to honker stream subscriptions → call protocol integration events)
### 8.4 CredentialProvider Role
The existing `CredentialProvider` pattern in alknet (used for outbound authentication TO external services) maps to:
| Use Case | CredentialProvider Implementation |
|----------|----------------------------------|
| Push to GitHub/GitLab | SSH key from alknet identity, or OAuth token from external source |
| Push to on-chain repo | Ed25519 key derived from seed (signs the push) + Ethereum key (signs on-chain attestation) |
| Authenticate to replicator | Ed25519 key (SSH auth via `IdentityProvider`) |
| Decrypt stored credentials | AES-256-GCM key derived from seed via `SecretProtocol` |
### 8.5 Domain Events vs. Integration Events (Distributed Git Context)
alknet's event boundary discipline (from event sourcing research and ADR-032) is critical for the distributed git scenario:
| Event Type | Source | Consumer | Boundary | Git Analog |
|-----------|--------|----------|----------|------------|
| **Domain events** (honker) | Local service | Same service | Internal | Git object creation/update in local repo |
| **Integration events** (call protocol) | Projected from domain events | Other nodes/services | Cross-node | Push notification, gossip announcement |
| **On-chain events** (smart contract) | Ethereum log | Head node sync process | External source | ACL change on blockchain |
| **Notifications** (honker) | Service | Any subscriber | Cross-service | "Repo X was updated" (thin, ID-only) |
**The flow for a decentralized git push**:
```
1. Client pushes to replicator
2. Replicator's GitService receives push
3. GitService publishes domain event: "repo:refs-updated" (honker stream)
4. Integration event projected: "call.responded" with repo update (call protocol)
5. Replicator gossips "RefAnnouncement" to tracked peers (call protocol subscription)
6. On-chain: if this push creates a new branch, optionally emit on-chain attestation
7. Peer replicators fetch updated refs (git protocol) and update their mirrors
```
**The flow for an ACL change**:
```
1. Org admin calls smart contract: grantWrite(repoId, newUserAddress)
2. Smart contract emits RoleGranted event
3. Head node's sync process detects the event (Ethereum log subscription)
4. Sync process calls StorageService: add DelegatesEdge to ACL metagraph
5. StorageService publishes domain event: "acl:updated" (honker stream)
6. Integration event projected: notify replicators of ACL change (call protocol)
7. Replicators update their local ACL cache
```
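Steps 3 to 5 of the ACL flow amount to projecting contract events into a local view and publishing a domain event only when that view actually changes. A minimal sketch follows; `ChainEvent` and `AclProjection` are hypothetical, and `RoleRevoked` is an assumed counterpart to the `RoleGranted` event named in the flow.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ChainEvent {
    RoleGranted { repo_id: String, address: String },
    /// Assumed counterpart; not named in the text.
    RoleRevoked { repo_id: String, address: String },
}

/// Local projection: repo id -> set of addresses with write access.
#[derive(Default)]
pub struct AclProjection {
    writers: HashMap<String, HashSet<String>>,
}

impl AclProjection {
    /// Applies one event. Returns true if the projection changed, which is
    /// the signal to publish an "acl:updated" domain event. Replaying the
    /// same event returns false, so log re-delivery is harmless.
    pub fn apply(&mut self, event: &ChainEvent) -> bool {
        match event {
            ChainEvent::RoleGranted { repo_id, address } => self
                .writers
                .entry(repo_id.clone())
                .or_default()
                .insert(address.clone()),
            ChainEvent::RoleRevoked { repo_id, address } => self
                .writers
                .get_mut(repo_id)
                .map_or(false, |set| set.remove(address)),
        }
    }

    pub fn can_write(&self, repo_id: &str, address: &str) -> bool {
        self.writers.get(repo_id).map_or(false, |set| set.contains(address))
    }
}
```

Idempotent application matters here because log subscriptions can redeliver events after a reconnect.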
This cleanly separates:
- **On-chain events** (smart contract logs) = external source of truth
- **Local projections** (ACL metagraph) = cached view for fast access checks
- **Integration events** (call protocol) = cross-node notification mechanism
- **Domain events** (honker streams) = internal state management
### 8.6 Practical Integration Path
For alknet to support the decentralized git concept, the integration path is:
#### Phase 1: Foundation (Current Architecture)
- `IdentityProvider` trait supports multiple backends ✓
- `StorageIdentityProvider` queries `peer_credentials` + ACL graph ✓
- `SecretProtocol` derives Ed25519 and secp256k1 keys from same seed ✓
- `OperationSpec.access_control` supports scope-based checks ✓
#### Phase 2: Git Service (Additive)
- Add `GitProtocol` irpc service for repo management
- Implement `GitService` as an application service (like DockerService, NodeService)
- Map `CanPush` to ACL metagraph traversal
- Implement `pre-receive` hook that calls alknet's `CheckAccess` irpc
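For the `pre-receive` item, Git feeds the hook one line per updated ref on standard input, in the form `<old-oid> <new-oid> <ref-name>`. The sketch below parses such a line into a struct that could then be passed to an access check. The `RefUpdate` fields shown here are an assumption; the document does not define the type used in `GitProtocol`.

```rust
/// One ref update as presented to a pre-receive hook.
#[derive(Debug, PartialEq, Eq)]
pub struct RefUpdate {
    pub old_oid: String,
    pub new_oid: String,
    pub ref_name: String,
}

/// Parses `<old-oid> SP <new-oid> SP <ref-name>`; returns `None` if malformed.
pub fn parse_pre_receive_line(line: &str) -> Option<RefUpdate> {
    let mut parts = line.trim_end_matches('\n').splitn(3, ' ');
    let old_oid = parts.next()?;
    let new_oid = parts.next()?;
    let ref_name = parts.next()?;
    // Accept SHA-1 (40 hex chars) and SHA-256 (64 hex chars) object ids.
    let is_oid = |s: &str| {
        (s.len() == 40 || s.len() == 64) && s.bytes().all(|b| b.is_ascii_hexdigit())
    };
    if !is_oid(old_oid) || !is_oid(new_oid) || ref_name.is_empty() {
        return None;
    }
    Some(RefUpdate {
        old_oid: old_oid.to_string(),
        new_oid: new_oid.to_string(),
        ref_name: ref_name.to_string(),
    })
}
```

A hook built on this would reject the whole push by exiting non-zero if any parsed update fails the `CheckAccess` call.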
#### Phase 3: On-Chain ACL (Additive, Requires External Dependencies)
- Add `OnChainIdentityProvider` that:
1. Resolves Ed25519 fingerprint → Ethereum address (via attestation stored in NFT metadata)
2. Checks on-chain ACL contract for access rights
3. Caches results in local ACL metagraph
4. Subscribes to on-chain events for ACL changes
- Add `AccessControlMode::CachedOnChain` to `OperationSpec`
- Add `WalletProtocol` irpc service for signing on-chain transactions
#### Phase 4: Gossip and Replication (Additive)
- Add gossip message types to call protocol (`NodeAnnouncement`, `InventoryAnnouncement`, `RefAnnouncement`)
- Implement `SubscribeRefs` streaming operation for repo update subscriptions
- Add replicator service that seeds repos and responds to gossip
Each phase is additive and doesn't require changes to earlier phases. The architecture supports this incremental extension because:
1. `IdentityProvider` is a trait — new implementations are additive
2. `OperationSpec.access_control` is a struct — new fields are additive
3. Application services register with the call protocol — new services don't change core
4. Honker streams are internal — new streams are additive
---
## 9. References
### Decentralized Git Platforms
- **Radicle Protocol Guide**: https://radicle.dev/guides/protocol — Comprehensive documentation of Radicle's identity system, gossip protocol, replication, and self-certifying repositories
- **Radicle Heartwood (source)**: https://github.com/radicle-dev/heartwood — Reference implementation in Rust
- **RIP-0002 Identity**: Radicle Improvement Proposal for identity documents and delegate thresholds
- **radicle-crypto crate**: Ed25519 key types, SSH encoding, keystore (DeepWiki analysis: https://deepwiki.com/radicle-dev/heartwood/7.1-radicle-crypto)
- **ForgeFed**: https://forgefed.org/ — ActivityPub-based federation protocol for forges (Forgejo, Gitea integration)
- **GitLike**: https://gitlike.dev/ — Browser-based decentralized VCS using IPFS and Ethereum
- **GitBross**: https://gitbross.com/ — Decentralized Git platform using Solana, Arbitrum, and IPFS
- **PineSU**: IEEE paper on Git + Ethereum integration for trusted information sharing
### Blockchain Identity and Naming
- **ERC-721 Standard**: https://ethereum.org/developers/docs/standards/tokens/erc-721 — Non-fungible token standard
- **ENS (Ethereum Name Service)**: https://docs.ens.domains/ — Decentralized naming on Ethereum
- **W3C DID Primer**: https://w3c-ccg.github.io/did-primer/ — Decentralized Identifiers overview
- **W3C Verifiable Credentials**: https://www.w3.org/TR/vc-data-model/ — VC specification
- **EIP-3668 (CCIP-Read)**: Off-chain data lookup for ENS, enabling smart contracts to verify off-chain data
### Access Control
- **Git Hooks**: https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks — Server-side hooks for git access control
- **Gitolite**: Config-file-based SSH key → repo permission mapping
- **Token-Gated Access Control**: https://chainscorelabs.com/guides/ — Patterns for ERC-721/ERC-1155 token-gated access
- **ChainGuard**: Blockchain-based authentication and access control (academic paper)
### Cryptographic Key Management
- **SLIP-0010**: https://slips.readthedocs.io/en/latest/slip-0010/ — Universal private key derivation from master private key (Ed25519, secp256k1, NIST P-256)
- **BIP-0032**: Hierarchical Deterministic Wallets
- **BIP-0039**: Mnemonic code for generating deterministic keys
- **SLIP-0044**: Registered coin types for BIP-0044 (alknet uses unallocated `74'`)
- **Ed25519**: Bernstein's Edwards-curve Digital Signature Algorithm
### Gossip Protocols
- **Gossip Protocol Fundamentals**: https://www.geeksforgeeks.org/distributed-systems/gossip-protocol-in-disrtibuted-systems/ — Epidemic-style information dissemination
- **libgossip**: C++17 implementation for decentralized node discovery and metadata propagation
- **Bitcoin Gossip**: Used in Bitcoin for transaction and block propagation
- **Secure Scuttlebutt (SSB)**: Inspiration for Radicle's gossip model
### Alknet Architecture Documents (Internal)
- **core.md**: Transport, call protocol, auth, services, DNS
- **services.md**: irpc service architecture, OperationEnv, Identity, auth/secret/config protocols
- **storage.md**: Metagraph data model, ACL as metagraph, identity tables, honker integration
- **integration-plan.md**: Phase 0-4 integration plan, ADRs 026-034
- **ADR-029**: Identity as core type (`Identity { id, scopes, resources }` + `IdentityProvider` trait)
- **ADR-032**: Event boundary discipline (domain events vs. integration events vs. service calls)
### Radicle-Specific Documentation
- **Radicle COBs (Collaborative Objects)**: CRDT-based distributed issues/patches stored as Git objects — https://deepwiki.com/radicle-dev/heartwood/6.1-collaborative-objects-(cobs)
- **Radicle Identity Documents**: Delegates, thresholds, and self-certifying repo identity — RIP-0002
- **Radicle Signed Refs**: Vulnerability disclosure (2026-03) on replay attacks in signed references

@@ -1,716 +0,0 @@
# Gitserver Reference Document
> **Source**: <https://github.com/WJQSERVER/gitserver> (cloned at `/workspace/gitserver/`)
> **Version**: 0.0.3 (workspace Cargo.toml)
> **License**: MPL-2.0 (primary); upstream portions MIT (preserved in UPSTREAM-LICENSE)
> **Upstream origin**: <https://github.com/ggueret/git-server>
> **Date researched**: 2026-06-08
> **Purpose**: Evaluate gitserver as a basis for a git service adapter within alknet
---
## 1. Architecture Overview
### 1.1 What is gitserver?
Gitserver is a **Rust-native Git Smart HTTP server** that does not require an installed `git` binary at runtime. All Git operations (ref advertisement, pack generation, receive-pack) are implemented via the [gitoxide](https://github.com/GitoxideLabs/gitoxide) (`gix`) crate. It supports both Git protocol v1 and v2, including shallow clones and multi-ack negotiation.
The project follows a **library-first design**: `gitserver-core` and `gitserver-http` are reusable libraries, while the `gitserver` binary is a thin CLI wrapper for standalone deployment.
### 1.2 Crate Structure
```
crates/
├── gitserver-core/ # Git protocol operations (no HTTP dependency)
│ ├── backend.rs # GitBackend: unified interface for refs/pack/receive-pack
│ ├── discovery.rs # RepoStore: filesystem-based repo discovery
│ ├── dynamic_registry.rs # DynamicRepoRegistry, RepoResolver, MutableRepoRegistry traits
│ ├── error.rs # Error types (RepoNotFound, PathTraversal, Protocol, Git, Io)
│ ├── pack.rs # UploadPackRequest parsing, pack generation with side-band-64k
│ ├── path.rs # Path safety: resolve_repo_path (normalize + canonicalize)
│ ├── pktline.rs # pkt-line encoding/decoding utilities
│ ├── protocol_v2.rs # Git protocol v2: ls-refs, fetch, shallow, stateless-rpc
│ ├── receive_pack.rs # receive-pack: ref advertisement, pack reception, fast-forward validation
│ └── refs.rs # Protocol v1 ref advertisement
├── gitserver-http/ # Axum HTTP layer
│ ├── error.rs # AppError enum → HTTP status codes
│ ├── handlers.rs # Route handlers: info_refs, upload_pack, receive_pack, healthz, list
│ ├── lib.rs # router() function + public re-exports
│ └── state.rs # SharedState (RepoMode, AuthConfig, ServicePolicy, draining flag)
├── gitserver/ # CLI binary (thin wrapper)
│ └── main.rs # CLI args, RepoStore discovery, Axum server, graceful shutdown
└── gitserver-bench/ # Performance benchmarks (not published)
```
### 1.3 Key Dependencies
| Dependency | Version | Purpose |
|---|---|---|
| `gix` | 0.80.0 | Native Git repository operations (open refs, object store, rev-walk) |
| `gix-pack` | 0.67.0 | Pack file writing (receive-pack) |
| `axum` | 0.8.8 | HTTP routing and handlers |
| `tokio` | 1.50.0 | Async runtime, channels, IO |
| `miniz_oxide` | 0.8 | Zlib compression for pack objects |
| `sha1` | 0.10 | Pack checksum |
| `flate2` | 1 | Gzip response compression |
| `zstd` | 0.13 | Zstd response compression |
| `base64` | 0.22 | HTTP Basic auth decoding |
| `subtle` | 2 | Constant-time comparison (auth) |
| `clap` | 4.6.0 | CLI argument parsing |
### 1.4 Request Flow
#### Clone/Fetch (Protocol v1)
```
Client → GET /{repo}/info/refs?service=git-upload-pack
→ Server: resolve repo, verify auth, advertise_refs()
← Ref advertisement response
Client → POST /{repo}/git-upload-pack
→ Server: parse UploadPackRequest, generate_pack()
← Streamed side-band-64k pack response
```
#### Clone/Fetch (Protocol v2)
```
Client → GET /{repo}/info/refs (git-protocol: version=2)
← Capabilities advertisement
Client → POST /{repo}/git-upload-pack (git-protocol: version=2)
→ Server: parse_command_request() → ls-refs or fetch
← ls-refs result or streamed packfile
```
#### Push (receive-pack, must be enabled)
```
Client → GET /{repo}/info/refs?service=git-receive-pack
← Ref advertisement
Client → POST /{repo}/git-receive-pack
→ Server: parse commands, write pack, validate fast-forward, update refs
← Status report (ok/ng per ref)
```
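Every exchange in the flows above is framed as pkt-lines: four hex digits giving the total length (including those four bytes), followed by the payload, with `0000` acting as a flush packet. The following sketch of that framing is for orientation only; `encode_pkt_line`, `decode_pkt_line`, and `Pkt` are illustrative names, not the API of gitserver-core's `pktline.rs`.

```rust
/// Flush packet: marks the end of a section of the stream.
pub const FLUSH_PKT: &[u8] = b"0000";

#[derive(Debug, PartialEq, Eq)]
pub enum Pkt<'a> {
    Flush,
    Data(&'a [u8]),
}

/// Encodes one data pkt-line. Git caps a pkt-line at 65520 bytes in total.
pub fn encode_pkt_line(payload: &[u8]) -> Option<Vec<u8>> {
    let total = payload.len() + 4;
    if payload.is_empty() || total > 65520 {
        return None;
    }
    let mut out = format!("{:04x}", total).into_bytes();
    out.extend_from_slice(payload);
    Some(out)
}

/// Decodes the first pkt-line and returns it with the remaining bytes.
/// Lengths 1 to 3 (special packets in protocol v2) are rejected here.
pub fn decode_pkt_line(input: &[u8]) -> Option<(Pkt<'_>, &[u8])> {
    let header = input.get(..4)?;
    if !header.iter().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let len = usize::from_str_radix(std::str::from_utf8(header).ok()?, 16).ok()?;
    match len {
        0 => Some((Pkt::Flush, &input[4..])),
        1..=3 => None,
        _ => {
            let data = input.get(4..len)?;
            Some((Pkt::Data(data), &input[len..]))
        }
    }
}
```

For example, the first line of a v1 ref advertisement, `# service=git-upload-pack\n`, is 26 bytes and is therefore sent with the prefix `001e`.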
---
## 2. Protocol Support
### 2.1 Smart HTTP Git Protocol
Gitserver implements the **Git Smart HTTP protocol**. It is not an IETF RFC; it is specified in Git's own protocol documentation and is the de facto standard used by `git clone http://...`, `git fetch`, and `git push` over HTTP.
**Supported endpoints:**
| Method | Endpoint | Protocol Version | Description |
|---|---|---|---|
| GET | `/healthz` | — | Health check (no auth) |
| GET | `/` | — | JSON repository listing (auth required if configured) |
| GET | `/{repo}/info/refs?service=git-upload-pack` | v1 | Ref advertisement for clone/fetch |
| GET | `/{repo}/info/refs?service=git-receive-pack` | v1 | Ref advertisement for push (disabled by default) |
| POST | `/{repo}/git-upload-pack` | v1 | Pack negotiation and transfer |
| POST | `/{repo}/git-receive-pack` | v1 | Push operations (disabled by default) |
| GET | `/{repo}/info/refs` with `git-protocol: version=2` | v2 | Capabilities advertisement |
| POST | `/{repo}/git-upload-pack` with `git-protocol: version=2` | v2 | `ls-refs` and `fetch` commands |
### 2.2 Git Operations
| Operation | Supported | Notes |
|---|---|---|
| `git clone` | ✓ | Both v1 and v2 |
| `git fetch` | ✓ | `multi_ack` and `multi_ack_detailed` negotiation |
| `git push` | ✓ (opt-in) | Via `--enable-receive-pack` or `ServicePolicy.receive_pack: true` |
| Shallow clone | ✓ | Protocol v2 `fetch` with `deepen` |
| OFS_DELTA | ✓ | Offset delta compression in packs |
| Side-band-64k | ✓ | Multiplexed progress/pack data |
| Response compression | ✓ | Gzip and Zstd on ref advertisement |
### 2.3 Push Restrictions
When receive-pack is enabled, the following restrictions apply:
- **Fast-forward only**: Branch updates under `refs/heads/*` must be fast-forward (old commit is ancestor of new)
- **No ref deletion**: New OID cannot be the zero OID
- **No tag overwrite**: Updating an existing tag is rejected
- **Commits only**: Branch tips must point to commit objects
- **Timeouts**: 300s total, 30s idle
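The first four restrictions can be expressed as a pure validation function over one ref command. This is a sketch of the rules as listed, not gitserver's actual `receive_pack.rs` code; the object-type and ancestry checks are passed in as closures because they need repository access, and timeouts are left out.

```rust
pub const ZERO_OID: &str = "0000000000000000000000000000000000000000";

#[derive(Debug, PartialEq, Eq)]
pub enum Reject {
    Deletion,
    TagOverwrite,
    NotACommit,
    NonFastForward,
}

pub struct RefCommand<'a> {
    pub old_oid: &'a str,
    pub new_oid: &'a str,
    pub ref_name: &'a str,
}

/// `is_commit(oid)` and `is_ancestor(old, new)` are supplied by the caller.
pub fn validate_command(
    cmd: &RefCommand<'_>,
    is_commit: impl Fn(&str) -> bool,
    is_ancestor: impl Fn(&str, &str) -> bool,
) -> Result<(), Reject> {
    // No ref deletion: the new OID must not be the zero OID.
    if cmd.new_oid == ZERO_OID {
        return Err(Reject::Deletion);
    }
    let creating = cmd.old_oid == ZERO_OID;
    // No tag overwrite: tags may be created but not updated.
    if cmd.ref_name.starts_with("refs/tags/") && !creating {
        return Err(Reject::TagOverwrite);
    }
    if cmd.ref_name.starts_with("refs/heads/") {
        // Commits only: branch tips must point to commit objects.
        if !is_commit(cmd.new_oid) {
            return Err(Reject::NotACommit);
        }
        // Fast-forward only: the old tip must be an ancestor of the new tip.
        if !creating && !is_ancestor(cmd.old_oid, cmd.new_oid) {
            return Err(Reject::NonFastForward);
        }
    }
    Ok(())
}
```

Keeping the rules free of I/O makes the policy easy to unit-test and to extend with per-identity checks later.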
### 2.4 SSH Git Protocol
Gitserver does **not** support Git over SSH. It is HTTP-only. SSH git access would require a separate implementation or integration layer (see Section 6).
---
## 3. Interface Pattern Analysis
### 3.1 HTTP Handler Architecture
Gitserver's HTTP layer follows a clean handler pattern:
```rust
// gitserver-http/src/lib.rs
pub fn router(state: SharedState) -> Router {
Router::new()
.route("/healthz", get(handlers::healthz))
.route("/", get(handlers::list_repos))
.route("/{*path}", get(handlers::info_refs_dispatch))
.route("/{*path}", post(handlers::rpc_dispatch))
.with_state(state)
}
```
The `SharedState` is an Axum state object containing:
- `RepoMode` — either `Discovered(Arc<RwLock<RepoStore>>)` or `Dynamic { resolver, registry }`
- `AuthConfig` — optional Basic and/or Bearer authentication
- `ServicePolicy` — toggle for upload_pack, upload_pack_v2, receive_pack
- `draining: Arc<AtomicBool>` — graceful shutdown flag
Each handler follows this pattern:
1. Check `draining` flag → 503 if shutting down
2. Check `ServicePolicy` → 404 if service disabled
3. Authenticate request via `require_auth()` → 401 if credentials missing/invalid
4. Resolve repository via `SharedState::resolve()` → 404 if not found
5. Execute git operation via `GitBackend`
6. Return streaming or buffered response
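The ordering of checks 1 to 4 matters: each gate short-circuits with its own status code before the next is evaluated. A compact sketch of that ordering, with the inputs reduced to booleans, is shown below; `RequestContext` and `gate` are hypothetical and stand in for the real `SharedState` lookups.

```rust
/// Pre-computed facts about one request (stand-ins for real lookups).
pub struct RequestContext {
    pub draining: bool,
    pub service_enabled: bool,
    pub credentials_valid: bool,
    pub repo_exists: bool,
}

/// Runs gates 1 to 4 in order and returns the HTTP status of the first
/// failure, or `Ok(())` when the git operation may proceed.
pub fn gate(ctx: &RequestContext) -> Result<(), u16> {
    if ctx.draining {
        return Err(503); // shutting down
    }
    if !ctx.service_enabled {
        return Err(404); // service disabled by ServicePolicy
    }
    if !ctx.credentials_valid {
        return Err(401); // credentials missing or invalid
    }
    if !ctx.repo_exists {
        return Err(404); // repository not found
    }
    Ok(())
}
```

One consequence of this order is that an unauthenticated caller learns whether a service is enabled but not whether a repository exists.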
### 3.2 Mapping to alknet's MessageInterface
Gitserver's `SharedState` + handler pattern maps closely to alknet's proposed `MessageInterface` trait:
```rust
// alknet's proposed MessageInterface
async fn handle_request(&self, request: InterfaceRequest) -> Result<InterfaceResponse>;
```
Gitserver's handler flow is essentially:
1. Receive HTTP request (analogous to `InterfaceRequest`)
2. Extract operation path, auth, and body
3. Dispatch to the appropriate Git operation
4. Return HTTP response (analogous to `InterfaceResponse`)
### 3.3 Low-Level Handler API
Gitserver also exposes handler functions that can be called directly without going through the Axum router:
```rust
use gitserver_http::handlers::{info_refs_endpoint, ServiceKind};
let response = info_refs_endpoint(
&state,
"my-project.git",
ServiceKind::UploadPack,
HeaderMap::new(),
).await?;
```
This is significant for alknet integration — it means the git logic can be invoked programmatically without HTTP routing.
---
## 4. Authentication
### 4.1 Current Auth Model
Gitserver supports two HTTP authentication mechanisms, both optional:
```rust
pub struct AuthConfig {
pub basic: Option<BasicAuthConfig>,
pub bearer_token: Option<String>,
}
pub struct BasicAuthConfig {
pub username: String,
pub password: String,
}
```
**Key characteristics:**
- Both can be configured simultaneously; **either one passing is sufficient**
- Basic auth uses **constant-time comparison** (`subtle` crate) to prevent timing attacks
- Bearer token is compared directly (suitable for generated tokens)
- Failed auth returns `401 Unauthorized` with `WWW-Authenticate: Basic realm="gitserver", Bearer`
- `GET /healthz` is **unauthenticated** (always accessible)
- Auth is **global** (same credentials for all repositories) — no per-repo or per-user ACL
### 4.2 Auth Flow in Handlers
```rust
fn require_auth(store: &SharedState, headers: &HeaderMap) -> Result<(), AppError> {
let auth = store.auth();
if auth.basic.is_none() && auth.bearer_token.is_none() {
return Ok(()); // No auth configured → allow all
}
let value = headers.get(AUTHORIZATION)...;
// Try Bearer first, then Basic
// Constant-time comparison for Basic
}
```
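The two building blocks referred to in the comments above, splitting the `Authorization` value by scheme and comparing secrets without an early exit, can be sketched with the standard library alone. This is not gitserver's code: `ct_eq` is a simplified stand-in for the `subtle` crate (which should be preferred in practice, since a compiler may optimise a hand-written loop), and Basic credentials are left base64-encoded.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Credentials<'a> {
    Bearer(&'a str),
    /// Still base64-encoded `username:password`.
    Basic(&'a str),
}

/// Splits an `Authorization` header value into scheme and credentials.
/// Scheme names are matched case-insensitively.
pub fn parse_authorization(value: &str) -> Option<Credentials<'_>> {
    let (scheme, rest) = value.split_once(' ')?;
    let rest = rest.trim();
    if rest.is_empty() {
        return None;
    }
    if scheme.eq_ignore_ascii_case("bearer") {
        Some(Credentials::Bearer(rest))
    } else if scheme.eq_ignore_ascii_case("basic") {
        Some(Credentials::Basic(rest))
    } else {
        None
    }
}

/// Compares two byte strings without returning at the first mismatch.
/// Only the length check exits early, so lengths are not hidden.
pub fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y;
    }
    diff == 0
}
```

The same comparison could be applied to bearer tokens as well, which would remove the asymmetry between the two schemes noted in the key characteristics.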
### 4.3 Mapping to alknet Identity
alknet's `IdentityProvider` resolves credentials to an `Identity`. The mapping would be:
| gitserver auth | alknet equivalent | Resolution path |
|---|---|---|
| No auth | `Identity::anonymous()` or reject | Configurable policy |
| Basic auth (username/password) | `IdentityProvider::resolve_from_token()` | Map to AuthToken or direct lookup |
| Bearer token | `IdentityProvider::resolve_from_token()` | Token is already in the right format |
The key gap is that gitserver's auth is **single-credential, global**, while alknet needs **per-identity, per-repository** access control. Integration would require:
1. Replacing `AuthConfig` with alknet's `IdentityProvider`
2. Extracting identity from the `Authorization` header
3. Checking per-repo ACL based on resolved `Identity`
---
## 5. Storage
### 5.1 Filesystem-Based Storage
Gitserver currently stores repositories as **bare Git repositories on the local filesystem**. The storage model is:
```
ROOT/
├── project-a.git/ # bare repository
│ ├── HEAD
│ ├── objects/
│ ├── refs/
│ └── description
├── org/
│ └── project-b.git/ # nested repository (up to max_depth)
└── ...
```
The `RepoStore::discover(root, max_depth)` function:
1. Canonicalizes the root path
2. Recursively walks subdirectories up to `max_depth`
3. Attempts `gix::open(path)` on each directory
4. If `repo.is_bare()`, adds it as a `RepoInfo`
5. Path traversal protection via lexical normalization + `canonicalize()` double-check
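The lexical half of step 5 can be illustrated as follows. This sketch is stricter than normalisation: it rejects any `..`, root, or prefix component outright instead of resolving it, and it does not perform the `canonicalize()` double-check, which needs a real filesystem. `normalize_repo_path` is a hypothetical name, not the signature of gitserver's `resolve_repo_path`.

```rust
use std::path::{Component, Path, PathBuf};

/// Returns a cleaned relative path made only of normal components,
/// or `None` if the request tries to escape the repository root.
pub fn normalize_repo_path(requested: &str) -> Option<PathBuf> {
    let mut clean = PathBuf::new();
    for component in Path::new(requested).components() {
        match component {
            Component::Normal(part) => clean.push(part),
            Component::CurDir => {} // "." carries no information
            // ParentDir, RootDir and Prefix could all leave the root.
            _ => return None,
        }
    }
    if clean.as_os_str().is_empty() {
        None
    } else {
        Some(clean)
    }
}
```

A caller would join the result onto the canonical root and then canonicalize the joined path to catch symlinks, which a purely lexical check cannot see.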
The `DynamicRepoRegistry` allows programmatic registration/unregistration of repos at runtime, validated by `gix::open()` confirming the path is a bare repo.
### 5.2 Storage Abstraction Points
The key storage interaction points in the codebase are:
| Component | Storage Pattern |
|---|---|
| `RepoStore::discover()` | Filesystem scan (local directory tree) |
| `DynamicRepoRegistry` | In-memory registry with filesystem-backed paths |
| `GitBackend::new(repo_path)` | Opens a local bare repo via `gix::open()` |
| `receive_pack::write_pack()` | Writes pack to `objects/pack/` via `gix_pack::Bundle::write_to_directory()` |
| `path::resolve_repo_path()` | Canonical path resolution + traversal protection |
**All storage operations assume a local filesystem path.** There is no abstraction for remote or object storage backends.
### 5.3 Rustfs (S3-Compatible) Integration Feasibility
Git operations fundamentally require **a local filesystem**: `gix::open()` expects a directory with the standard `.git` layout (objects, refs, HEAD, etc.). Rustfs (S3-compatible) cannot serve as a **direct** storage backend for gitoxide's repository operations because:
1. `gix::open()` requires a local path — it reads `HEAD`, refs, and object packs from the filesystem
2. Pack generation (`generate_pack()`) streams objects from the local ODB
3. Receive-pack writes pack files to the local `objects/pack/` directory
4. Reference updates use `gix::Repository::edit_references()` which operates on the local refstore
However, rustfs **could** be used in several supporting roles:
| Integration Approach | Description | Feasibility |
|---|---|---|
| **Repo sync backend** | Store bare repo tarballs in rustfs; sync to local disk on demand | High — sync from S3 to local FS before serving |
| **Backup/archive** | Push repo backups to rustfs buckets | High — out-of-band backup |
| **Git LFS storage** | Store large file objects in rustfs via Git LFS | Medium — requires LFS server implementation |
| **Object store proxy** | Cache layer: serve from local FS, sync to/from rustfs | Medium — needs repo lifecycle management |
| **Direct S3 repo** | Custom `gix` object backend reading from S3 | Low — would require deep gitoxide customization |
The most practical approach: **use rustfs as a backing store for repository synchronization**. Gitserver would always operate on local filesystem paths, but a separate component would manage syncing repos to/from rustfs buckets.
---
## 6. SSH Support
### 6.1 Current State
Gitserver has **no SSH transport capability**. It only implements the Git Smart HTTP protocol. Adding SSH support would require implementing Git over SSH, which carries the same pkt-line framing over a different transport with different service discovery and authentication:
| Aspect | Smart HTTP | SSH |
|---|---|---|
| Transport | HTTP (request/response) | Persistent SSH channel |
| Service discovery | `GET /info/refs?service=git-upload-pack` | SSH exec request: `git-upload-pack '<repo>'` |
| Protocol framing | pkt-line over HTTP | pkt-line over SSH channel |
| Authentication | HTTP Authorization header | SSH key-based |
| Multiplexing | HTTP/2 or separate connections | Multiple SSH channels |
### 6.2 How Git over SSH Works
The Git SSH protocol uses SSH as a transport for the same `git-upload-pack` and `git-receive-pack` commands:
```
Client connects via SSH → server executes git-upload-pack or git-receive-pack
Client ← SSH channel → Server (bidirectional pkt-line stream)
```
### 6.3 Integration with alknet's SSH Interface
alknet's SSH interface (`SshInterface`) is a `StreamInterface` — it accepts a persistent byte stream and multiplexes it into channels. This maps naturally to Git over SSH:
**Approach: Git as an alknet operation over SSH**
```
alknet SSH session
  ├─ Channel: call protocol (operations)
  └─ Channel: git-upload-pack OR git-receive-pack
       └─ gitserver-core protocol logic
          (ref advertisement, pack generation, receive-pack)
```
This would work by:
1. The SSH interface receives a connection with a request like `git-upload-pack '/repos/project.git'`
2. alknet resolves the identity from the SSH key fingerprint
3. Checks ACL: does this identity have read/write access to this repo?
4. Invokes `gitserver-core` functions directly (no HTTP needed):
- `refs::advertise_refs()` → send over SSH channel
- `pack::generate_pack()` → stream over SSH channel
- `receive_pack::receive_pack()` → read/write over SSH channel
**Key advantage**: Since `gitserver-core` has no HTTP dependency, it can be used directly over SSH channels without the HTTP overhead. The `GitBackend` API is transport-agnostic.
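Step 1 requires turning the SSH exec string into a service and a repository path before any ACL check can run. A minimal sketch of that parsing follows; `GitService` and `parse_exec_request` are hypothetical names, and escaped quotes inside the path are rejected rather than handled.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GitService {
    /// Clone/fetch: needs read access.
    UploadPack,
    /// Push: needs write access.
    ReceivePack,
}

/// Parses an exec request such as `git-upload-pack '/repos/project.git'`.
/// Anything other than the two git services is refused.
pub fn parse_exec_request(command: &str) -> Option<(GitService, &str)> {
    let (name, arg) = command.trim().split_once(' ')?;
    let service = match name {
        "git-upload-pack" => GitService::UploadPack,
        "git-receive-pack" => GitService::ReceivePack,
        _ => return None,
    };
    let arg = arg.trim();
    // Clients normally single-quote the path; accept it unquoted too.
    let path = arg
        .strip_prefix('\'')
        .and_then(|s| s.strip_suffix('\''))
        .unwrap_or(arg);
    if path.is_empty() || path.contains('\'') {
        return None;
    }
    Some((service, path))
}
```

Refusing every other command at this point means the SSH channel can never be used as a general shell, whatever the key's privileges.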
### 6.4 Alternative: Dedicated Git SSH Adapter
A simpler approach that doesn't require modifying the SSH channel multiplexing:
```
alknet SSH session → call protocol → operation "git/upload-pack"
  → GitAdapter::upload_pack(repo, wants, haves) → streaming response
```
This treats Git operations as alknet call operations, where the SSH interface is the transport but Git operations are invoked via the call protocol rather than raw SSH channels. This is more aligned with alknet's architecture but requires adapting the Git protocol to the call protocol's request/response model (potentially with streaming).
---
## 7. Relevance to alknet
### 7.1 Mapping to alknet's Interface Model
Gitserver is a textbook **`MessageInterface`** implementation:
| alknet MessageInterface | Gitserver Equivalent |
|---|---|
| `handle_request(InterfaceRequest)` | `info_refs_dispatch()` / `rpc_dispatch()` |
| `InterfaceRequest.operation_path` | URL path (`/{repo}/info/refs`, `/{repo}/git-upload-pack`) |
| `InterfaceRequest.auth_token` | `Authorization` header → `require_auth()` |
| `InterfaceRequest.input` | Request body (pack negotiation data) |
| `InterfaceResponse.result` | HTTP response body (ref advertisement, pack data) |
| `InterfaceResponse.status` | HTTP status code |
| `InterfaceResponse.headers` | Content-Type, Cache-Control, etc. |
Gitserver also **manages its own transport** (the Axum HTTP server), which is exactly the `MessageInterface` pattern described in alknet's interface model: "MessageInterface implementations manage their own transport. They don't need the Transport trait because they're not wrapping a generic byte stream — they ARE the transport+interface combined."
### 7.2 Git as an alknet Operation
Git operations could be mapped to alknet's call protocol namespace:
```
Namespace: "git"
Operations:
- git/list → List available repositories
- git/info-refs → Get ref advertisement for a repo
- git/upload-pack → Clone/fetch (streaming response)
- git/receive-pack → Push (streaming request+response)
- git/ls-refs → Protocol v2 ls-refs
- git/fetch → Protocol v2 fetch
```
**Challenge**: Git operations are **streaming and bidirectional** (especially fetch negotiation and receive-pack), while alknet's call protocol is currently defined as request→response. This needs design consideration:
| Operation | Direction | Stream Duration | alknet Fit |
|---|---|---|---|
| `git/list` | Request → Response | Short | Direct fit |
| `git/info-refs` | Request → Response | Short | Direct fit |
| `git/upload-pack` | Request → Streaming Response | Long | Needs streaming response support |
| `git/receive-pack` | Streaming Request → Streaming Response | Long | Needs bidirectional streaming |
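The fit table above can be sketched as a small dispatch type. This is a minimal illustration, not alknet's API: `GitOp`, `StreamMode`, and both method names are hypothetical, and classifying `git/fetch` as a streamed response is an assumption made by analogy with `git/upload-pack`.

```rust
/// How an operation uses the call protocol's request and response bodies.
/// Hypothetical type, introduced only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StreamMode {
    /// Single request, single response.
    Unary,
    /// Single request, streamed response (clone/fetch).
    ServerStream,
    /// Streamed request and streamed response (push).
    Bidirectional,
}

/// The operations listed under the "git" namespace.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GitOp {
    List,
    InfoRefs,
    UploadPack,
    ReceivePack,
    LsRefs,
    Fetch,
}

impl GitOp {
    /// Parse a namespaced operation path such as "git/upload-pack".
    pub fn parse(path: &str) -> Option<GitOp> {
        let name = path.strip_prefix("git/")?;
        match name {
            "list" => Some(GitOp::List),
            "info-refs" => Some(GitOp::InfoRefs),
            "upload-pack" => Some(GitOp::UploadPack),
            "receive-pack" => Some(GitOp::ReceivePack),
            "ls-refs" => Some(GitOp::LsRefs),
            "fetch" => Some(GitOp::Fetch),
            _ => None,
        }
    }

    /// Streaming requirement per operation.
    /// Assumption: v2 fetch streams its response like upload-pack does.
    pub fn stream_mode(self) -> StreamMode {
        match self {
            GitOp::List | GitOp::InfoRefs | GitOp::LsRefs => StreamMode::Unary,
            GitOp::UploadPack | GitOp::Fetch => StreamMode::ServerStream,
            GitOp::ReceivePack => StreamMode::Bidirectional,
        }
    }
}
```

A registry could use `stream_mode()` to reject streaming operations on interfaces that only support request→response, which makes the protocol gap explicit instead of failing mid-transfer.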
### 7.3 Proposed GitAdapter Architecture
```
┌─────────────────────────────────────────────────────────┐
│ alknet node │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ HttpInterface│ │ SshInterface │ │ DNS/other │ │
│ │ (Message) │ │ (Stream) │ │ (Message) │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────────────────────────────────────────────┐ │
│ │ OperationRegistry │ │
│ │ "git/list" → GitAdapter::list_repos() │ │
│ │ "git/upload-pack" → GitAdapter::upload_pack() │ │
│ │ "git/receive-pack" → GitAdapter::receive_pack() │ │
│ └──────────────┬───────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────┐ │
│ │ GitAdapter │ │
│ │ - SharedState (repos, auth) │ │
│ │ - GitBackend (protocol ops) │ │
│ │ - IdentityProvider (auth) │ │
│ │ - RepoResolver (filesystem) │ │
│ └──────────────┬───────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────┐ ┌────────────────┐ │
│ │ Local filesystem │ │ Rustfs sync │ │
│ │ (bare git repos) │ │ (S3 backend) │ │
│ └──────────────────────────────┘ └────────────────┘ │
└─────────────────────────────────────────────────────────┘
```
### 7.4 Auth Integration: alknet Identity → Gitserver Auth
**Current gitserver auth** (single global credential):
```rust
AuthConfig {
basic: Option<BasicAuthConfig>, // one username/password
bearer_token: Option<String>, // one token
}
```
**Proposed alknet integration** (per-identity, per-repo):
```rust
struct GitAdapter {
identity_provider: Arc<dyn IdentityProvider>,
repo_resolver: Arc<dyn RepoResolver>,
backend_factory: Arc<dyn GitBackendFactory>,
acl: Arc<dyn GitAcl>,
}
impl GitAdapter {
async fn handle_request(
&self,
request: InterfaceRequest,
) -> Result<InterfaceResponse> {
// 1. Resolve identity from auth token
let identity = self.identity_provider
.resolve_from_token(request.auth_token)?;
// 2. Parse git operation from path
let operation = parse_git_operation(&request.operation_path)?;
// 3. Check ACL
self.acl.check_access(&identity, &operation.repo, operation.access_type)?;
// 4. Dispatch to gitserver-core logic
// ...
}
}
```
**ACL design** (per-repo, per-operation):
```rust
enum GitAccess {
Read, // clone, fetch
Write, // push
}
trait GitAcl: Send + Sync {
fn check_access(
&self,
identity: &Identity,
repo: &str,
access: GitAccess,
) -> Result<()>;
}
```
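One way the `GitAcl` trait might be implemented is an in-memory grant table with default deny. The sketch below is self-contained, so it redefines the trait with a concrete error type; `Identity`, `AccessDenied`, and `StaticAcl` are hypothetical stand-ins, and treating `Write` as implying `Read` is an assumption rather than something the design above specifies.

```rust
use std::collections::HashMap;

/// Variant order matters: the derived ordering gives Read < Write.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum GitAccess {
    Read,  // clone, fetch
    Write, // push
}

/// Hypothetical stand-in for alknet's identity type.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Identity(pub String);

/// Hypothetical error type for a denied request.
#[derive(Debug, PartialEq, Eq)]
pub struct AccessDenied {
    pub repo: String,
    pub needed: GitAccess,
}

pub trait GitAcl: Send + Sync {
    fn check_access(
        &self,
        identity: &Identity,
        repo: &str,
        access: GitAccess,
    ) -> Result<(), AccessDenied>;
}

/// In-memory ACL: one grant level per (identity, repo) pair.
#[derive(Default)]
pub struct StaticAcl {
    grants: HashMap<(Identity, String), GitAccess>,
}

impl StaticAcl {
    pub fn grant(&mut self, identity: &Identity, repo: &str, access: GitAccess) {
        self.grants.insert((identity.clone(), repo.to_string()), access);
    }
}

impl GitAcl for StaticAcl {
    fn check_access(
        &self,
        identity: &Identity,
        repo: &str,
        access: GitAccess,
    ) -> Result<(), AccessDenied> {
        let key = (identity.clone(), repo.to_string());
        match self.grants.get(&key) {
            // Assumption: a Write grant also satisfies a Read check.
            Some(granted) if *granted >= access => Ok(()),
            // No grant, or an insufficient one: deny by default.
            _ => Err(AccessDenied { repo: repo.to_string(), needed: access }),
        }
    }
}
```

Default deny keeps an unlisted repository private, which is the safer failure mode given that gitserver itself has no per-repo control.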
### 7.5 Storage Integration with Rustfs
**Recommended approach**: Rustfs as a sync backend:
```rust
trait RepoStorage: Send + Sync {
/// Ensure a local working copy exists for the given repo.
/// May involve syncing from S3 (rustfs) to local disk.
async fn ensure_local(&self, repo: &str) -> Result<PathBuf>;
/// Sync local changes back to S3 (rustfs) after a push.
async fn sync_to_remote(&self, repo: &str) -> Result<()>;
/// List available repos (may consult S3 bucket listing).
async fn list_repos(&self) -> Result<Vec<RepoInfo>>;
}
```
The flow would be:
1. `GitAdapter` receives a request for repo `X`
2. `RepoStorage::ensure_local("X")` checks if the repo exists on local disk; if not, syncs from rustfs
3. Git operations run on the local filesystem (using `gitserver-core` directly)
4. After push operations, `RepoStorage::sync_to_remote("X")` pushes updates to rustfs
This maintains gitserver's requirement for a local filesystem while leveraging rustfs for durability and distribution.
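The four-step flow can be sketched with a synchronous simplification of `RepoStorage`. Everything here is illustrative: `serve_repo` and `RecordingStorage` are hypothetical names, the `run_git` closure stands in for the `gitserver-core` call, and the real trait is async.

```rust
use std::path::{Path, PathBuf};

/// Synchronous simplification of the `RepoStorage` trait (assumption: the
/// real trait is async and returns richer errors than String).
pub trait RepoStorage {
    fn ensure_local(&mut self, repo: &str) -> Result<PathBuf, String>;
    fn sync_to_remote(&mut self, repo: &str) -> Result<(), String>;
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GitAccess {
    Read,
    Write,
}

/// Run one Git operation through the flow described above.
/// `run_git` is a placeholder for the gitserver-core call on the local path.
pub fn serve_repo<S, F>(
    storage: &mut S,
    repo: &str,
    access: GitAccess,
    run_git: F,
) -> Result<(), String>
where
    S: RepoStorage,
    F: FnOnce(&Path) -> Result<(), String>,
{
    // Step 2: make sure a local copy exists, syncing from rustfs if needed.
    let local = storage.ensure_local(repo)?;
    // Step 3: run the Git operation against the local filesystem.
    run_git(local.as_path())?;
    // Step 4: only pushes change the repo, so only pushes sync back.
    if access == GitAccess::Write {
        storage.sync_to_remote(repo)?;
    }
    Ok(())
}

/// Test double that records which storage calls were made.
#[derive(Default)]
pub struct RecordingStorage {
    pub calls: Vec<String>,
}

impl RepoStorage for RecordingStorage {
    fn ensure_local(&mut self, repo: &str) -> Result<PathBuf, String> {
        self.calls.push(format!("ensure_local:{}", repo));
        Ok(PathBuf::from("/tmp/repos").join(repo))
    }

    fn sync_to_remote(&mut self, repo: &str) -> Result<(), String> {
        self.calls.push(format!("sync_to_remote:{}", repo));
        Ok(())
    }
}
```

Because the sync step runs only after `run_git` succeeds, a rejected push never uploads a half-updated repository in this sketch.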
### 7.6 Operation Mapping
| Git Operation | alknet Namespace | alknet Op | Input | Output | Stream? |
|---|---|---|---|---|---|
| List repos | `git` | `list` | `{}` | `[RepoInfo]` | No |
| Ref advertisement (v1) | `git` | `info-refs` | `{repo, service: "upload-pack" \| "receive-pack"}` | Binary ref advertisement | No |
| Ref capabilities (v2) | `git` | `capabilities` | `{repo}` | Binary capabilities | No |
| Ls-refs (v2) | `git` | `ls-refs` | `{repo, peel, symrefs, ref_prefixes}` | Binary ref listing | No |
| Clone/Fetch | `git` | `upload-pack` | `{repo, wants, haves, done, ...}` | Streamed pack data | Yes (response) |
| Push | `git` | `receive-pack` | `{repo, commands, pack_data}` | Status report | Yes (both) |
### 7.7 What gitserver-core Provides Directly
The most valuable integration point is `gitserver-core` — the HTTP-free protocol library:
```rust
// Direct usage without HTTP
use gitserver_core::backend::GitBackend;
use gitserver_core::discovery::RepoStore;
use gitserver_core::pack::{UploadPackRequest, UploadPackCapabilities, ShallowRequest};
use gitserver_core::protocol_v2;
// Repository discovery
let store = RepoStore::discover("./repos".into(), 3)?;
let repo = store.resolve("my-project.git")?;
// Protocol v1 ref advertisement
let backend = GitBackend::new(repo.absolute_path.clone());
let refs = backend.advertise_refs()?;
// Pack generation (streaming)
let request = UploadPackRequest { wants, haves, done, ... };
let pack_stream = backend.upload_pack(&request).await?;
// Receive-pack (push)
let result = backend.receive_pack(request_stream).await?;
// Protocol v2
let capabilities = protocol_v2::advertise_capabilities();
let ls_refs_output = protocol_v2::ls_refs(&repo_path, &ls_refs_request)?;
let fetch_output = backend.upload_pack(&fetch_request.upload_request).await?;
```
These functions can be called from any async context — SSH channel handler, alknet operation handler, HTTP handler — without going through the Axum HTTP layer.
---
## 8. Integration Recommendations
### 8.1 Recommended Integration Strategy
**Phase 1: HTTP Gateway (MessageInterface)**
Embed gitserver-http's Axum router into alknet's HTTP interface. This provides immediate Git-over-HTTP capability:
```rust
// In alknet's HttpInterface::handle_request()
// Route: /git/* → gitserver router
let git_app = gitserver_http::router(git_state);
let app = Router::new()
.nest("/git", git_app) // Mount git under /git
.route("/v1/{namespace}/{op}", post(operation_handler));
```
This works because gitserver is designed to be nested into existing Axum apps. Auth integration would replace `AuthConfig` with alknet's `IdentityProvider`.
**Phase 2: SSH Git Adapter (StreamInterface)**
Use `gitserver-core` directly within alknet's SSH interface for Git-over-SSH:
```rust
// In alknet's SshInterface channel handler
// SSH channel request: "git-upload-pack '/repos/project.git'"
let backend = GitBackend::new(repo_path);
let refs = backend.advertise_refs()?;
// Send refs over SSH channel
// Stream pack data over SSH channel
```
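Before a backend can be created, the SSH exec request has to be split into a service and a repository path. The following is a minimal sketch under stated assumptions: `GitService` and `parse_git_ssh_command` are hypothetical names, quote escaping inside the path is not handled, and rejecting `..` segments is a precaution added here rather than something gitserver prescribes.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GitService {
    UploadPack,
    ReceivePack,
}

/// Parse an SSH exec request such as `git-upload-pack '/repos/project.git'`.
/// Returns None for unknown services, a missing path, or a path that
/// contains a `..` segment. Escaped quotes inside the path are not handled.
pub fn parse_git_ssh_command(command: &str) -> Option<(GitService, String)> {
    let (service, rest) = command.trim().split_once(' ')?;
    let service = match service {
        "git-upload-pack" => GitService::UploadPack,
        "git-receive-pack" => GitService::ReceivePack,
        _ => return None,
    };

    let rest = rest.trim();
    // Git clients single-quote the path; accept a bare path as well.
    let path = rest
        .strip_prefix('\'')
        .and_then(|p| p.strip_suffix('\''))
        .unwrap_or(rest);

    if path.is_empty() || path.split('/').any(|segment| segment == "..") {
        return None;
    }
    Some((service, path.to_string()))
}
```

The returned path still needs to go through repository resolution and the ACL check before it reaches the filesystem.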
**Phase 3: Call Protocol Operations (OperationRegistry)**
Register Git operations in the operation registry for access via any interface:
```rust
registry.register(GitListRepos::new(adapter.clone()));
registry.register(GitUploadPack::new(adapter.clone()));
registry.register(GitReceivePack::new(adapter.clone()));
```
### 8.2 Key Modifications Needed
1. **Auth replacement**: Replace `AuthConfig` with `IdentityProvider`-based auth in `handlers.rs`'s `require_auth()` function
2. **ACL addition**: Add per-repo, per-identity access control (gitserver currently has none)
3. **RepoResolver abstraction**: Replace `RepoStore`/`DynamicRepoRegistry` with alknet's `RepoResolver` that integrates with rustfs sync
4. **Streaming response support**: Adapt alknet's call protocol for streaming (large pack files)
5. **Bidirectional streaming**: For receive-pack, the call protocol needs to support bidirectional streaming
### 8.3 Risks and Mitigations
| Risk | Mitigation |
|---|---|
| gitserver requires local filesystem | Use rustfs as sync backend; maintain local working copies |
| Auth is global (single credential) | Fork/modify `require_auth()` to use `IdentityProvider` |
| No per-repo ACL | Add `GitAcl` trait in the adapter layer |
| MPL-2.0 license requires modifications to be under MPL-2.0 | Acceptable for alknet (MPL-2.0 is file-level copyleft) |
| Large pack files may not fit alknet's message size limits | Implement streaming response in the call protocol |
| gitoxide version coupling | Pin `gix = "0.80.0"` as gitserver does |
### 8.4 License Considerations
- **Primary license**: MPL-2.0 (file-level copyleft)
- **Upstream portions**: MIT (preserved in UPSTREAM-LICENSE)
- **Implication**: Modifications to gitserver's `.rs` files must remain under MPL-2.0. Linking from alknet code is unrestricted.
- **Recommendation**: Use gitserver as a library dependency. If alknet-specific auth/ACL modifications are needed, contribute them upstream or maintain them as separate files under MPL-2.0.
---
## 9. Summary
### 9.1 Key Findings
1. **gitserver is a well-structured, library-first Rust Git Smart HTTP server** with clean separation between protocol logic (`gitserver-core`) and HTTP transport (`gitserver-http`).
2. **Protocol support is comprehensive**: Git Smart HTTP v1 and v2, clone, fetch, push (opt-in), shallow clones, delta compression, streaming pack generation.
3. **No SSH support exists**, but `gitserver-core` is transport-agnostic and can serve Git operations over any channel.
4. **Auth is simple but limited**: single global Basic/Bearer credential, no per-repo or per-user ACL.
5. **Storage is local-filesystem only**: `gix::open()` requires a local path. S3/rustfs integration requires a sync-to-local approach.
6. **The library design enables direct integration**: `GitBackend` and protocol functions can be called without HTTP.
### 9.2 Recommendation
**Use `gitserver-core` as alknet's Git protocol engine.** The core crate provides all Git protocol operations (ref advertisement, pack generation, receive-pack, protocol v2) without any HTTP dependency. This allows alknet to expose Git services through any interface (HTTP, SSH, call protocol) while maintaining a single protocol implementation.
**Use `gitserver-http` as alknet's Git HTTP interface** by nesting its Axum router under alknet's HTTP interface, with auth replaced by `IdentityProvider`.
**Design a `GitAdapter`** that wraps `gitserver-core` and integrates with alknet's `OperationRegistry`, `IdentityProvider`, and rustfs-backed storage.
### 9.3 Next Steps
1. Fork or vendor `gitserver-core` and `gitserver-http` into alknet's dependency tree
2. Design the `GitAdapter` trait with `IdentityProvider` auth and `GitAcl` access control
3. Implement Phase 1: HTTP gateway with nested Axum router and `IdentityProvider` auth
4. Implement `RepoStorage` trait with rustfs sync-to-local strategy
5. Design streaming extensions to alknet's call protocol for pack file transfer
6. Evaluate Phase 2: SSH Git adapter using `gitserver-core` directly over SSH channels
---
## References
- [gitserver README](https://github.com/WJQSERVER/gitserver) — project overview, quick start, CLI usage
- [gitserver Architecture docs](docs/en/architecture.md) — crate responsibilities, request flows
- [gitserver Library docs](docs/en/library.md) — embedding, dynamic registration, auth config
- [gitserver API Reference](docs/en/api.md) — REST endpoints, protocol details, error codes
- [alknet Interface Model](../../phase2/interface-model.md) — StreamInterface/MessageInterface design
- [gitoxide](https://github.com/GitoxideLabs/gitoxide) — underlying Git implementation library


@@ -1,857 +0,0 @@
# Research: Honker — SQLite Pub/Sub, Queue, and Notification Extension
## Key Findings
- **Honker is a Rust-based SQLite extension** that adds Postgres-style NOTIFY/LISTEN semantics plus durable pub/sub, task queues, and event streams entirely within SQLite. It eliminates the need for a separate message broker (Redis, Kafka) when SQLite is the primary datastore.
- **Three core primitives**: `notify/listen` (ephemeral pub/sub), `stream` (durable pub/sub with per-consumer offsets), and `queue` (at-least-once work queue with retries, priority, delayed jobs, and dead-letter handling). All three are SQL INSERTs inside your transaction — business write and side-effect commit or roll back together.
- **Wake mechanism**: Uses `PRAGMA data_version` polling at 1ms granularity to detect commits, achieving ~1-2ms median cross-process wake latency without requiring a daemon or broker. A single thread per database fans out to N subscribers via bounded channels.
- **Single-machine, single-writer model**: Designed for self-hosted deployments. Not distributed — no multi-node replication. This maps perfectly to alknet's per-node architecture where domain events are internal to a service boundary (ADR-032).
- **Comprehensive SQL API**: 30+ SQL scalar functions (`honker_enqueue`, `honker_claim_batch`, `honker_ack_batch`, `honker_stream_publish`, `honker_stream_read_since`, `honker_stream_save_offset`, `notify`, `honker_lock_acquire`, `honker_rate_limit_try`, `honker_scheduler_register`, etc.) registered as a loadable SQLite extension. Any language that can `SELECT load_extension('honker')` gets the same features.
- **Rust core (`honker-core`)**: All SQL implementations live in a shared Rust crate consumed by the loadable extension, PyO3 Python binding, napi-rs Node binding, and other language wrappers. One source of truth for the SQL — no behavioral drift across bindings.
- **License**: Apache 2.0 / MIT dual-license. Fully permissive for integration.
**Recommendation**: Adopt honker's patterns directly in `alknet-storage`. The `honker` crate (or `honker-core` for a Rust-native integration) should be a dependency of `alknet-storage`. Honker's single-node model aligns with alknet's event boundary discipline — domain events stay within the service boundary, and cross-node events go through the call protocol. For production deployments that use Postgres instead of SQLite, the same patterns (queue/claim, stream/subscribe, notify/listen) can be replicated using Postgres features, but honker's built-in retry, visibility timeout, and scheduling would need to be reimplemented.
---
## 1. Architecture
### What Is Honker?
Honker is a **SQLite extension + language bindings** that adds Postgres-style `NOTIFY`/`LISTEN` semantics to SQLite, with built-in durable pub/sub, task queues, and event streams — without requiring a client-polling loop, a daemon, or a separate broker.
**Core idea**: If SQLite is your primary datastore, your queue should live in the same file. `INSERT INTO orders` and `queue.enqueue(...)` commit in the same transaction. Rollback drops both.
**Implementation language**: Rust. The shared engine is `honker-core`, a plain Rust `rlib` crate. Language bindings (Python via PyO3, Node via napi-rs, Go via CGo, Ruby via C extension, .NET via P/Invoke, JVM via JNI, Kotlin wrapper, Elixir via NIF, C++ via header-only wrapper) are thin wrappers around the loadable extension's SQL functions.
**How it works as a SQLite extension**: The `honker-extension` crate compiles to `libhonker_ext.{so,dylib,dll}`. Any SQLite 3.9+ client loads it:
```sql
.load ./libhonker_ext
SELECT honker_bootstrap();
```
This creates the schema tables (`_honker_live`, `_honker_dead`, `_honker_notifications`, `_honker_stream`, `_honker_stream_consumers`, `_honker_locks`, `_honker_rate_limits`, `_honker_scheduler_tasks`, `_honker_results`) and registers all SQL scalar functions. The extension and Python/binding tables are shared, so a Python worker can claim jobs any other language pushed via the extension.
### Crate Structure
```
honker-core/ # Rust rlib shared across all bindings (published on crates.io)
honker-extension/ # SQLite loadable extension (cdylib, published on crates.io)
packages/
honker/ # Python package (PyO3 cdylib + Queue/Stream/Outbox/Scheduler)
honker-node/ # napi-rs Node.js binding
honker-rs/ # Ergonomic Rust wrapper
honker-go/ # Go binding
honker-ruby/ # Ruby binding
honker-bun/ # Bun binding
honker-ex/ # Elixir binding
honker-cpp/ # C++ binding
honker-dotnet/ # .NET / C# binding
honker-jvm/ # JVM / Java-compatible binding
honker-kotlin/ # Kotlin convenience wrapper
```
### Wake Path Architecture
The fundamental challenge for any SQLite-based pub/sub system: SQLite has no wire protocol or server-push. Consumers must initiate reads. Honker solves this with a **single-digit-microsecond `PRAGMA data_version` read**:
1. **One PRAGMA-poll thread per `Database`** queries `data_version` every 1ms
2. Counter change → fan out a tick to each subscriber's bounded channel (capacity 1 — coalesces redundant wakes)
3. Each subscriber runs `SELECT … WHERE id > last_seen` against a partial index, yields rows, returns to wait
4. 100 subscribers = 1 poll thread. Idle listeners run zero SQL queries.
Idle cost: ~3.5µs per `PRAGMA data_version` query, ~3.5ms/sec total at 1kHz. A 5-second paranoia poll exists as a fallback only if the update watcher cannot fire.
**Three backend options** (controlled by `WatcherBackend` enum):
- **Polling** (default, stable): `PRAGMA data_version` every 1ms. Correct on all platforms.
- **Kernel** (experimental, `kernel-watcher` Cargo feature): Uses `notify-rs` filesystem events. Fires on every filesystem write. May produce spurious/missed wakes. Dead-man's switch for file replacement.
- **SHM fast path** (experimental, `shm-fast-path` Cargo feature): Memory-maps the `-shm` WAL index file and reads `iChange` at offset 8 at ~100µs cadence. WAL-mode only. Dead-man's switch for file replacement.
**Dead-man's switch**: All backends check file identity `(dev, ino)` / `(volume_serial, file_index)` every ~100ms. If the database file is replaced (atomic rename, litestream restore, volume remount), the watcher panics with a clear error message. Subscribers see an error from `update_events()` instead of hanging silently.
### SharedUpdateWatcher
```rust
pub struct SharedUpdateWatcher {
watcher: Mutex<Option<UpdateWatcher>>, // background poll thread
senders: Arc<Mutex<HashMap<u64, SyncSender<()>>>>, // fan-out channels
next_id: AtomicU64,
}
```
- `subscribe()` → `(u64, Receiver<()>)` — register a channel; capacity 1
- `unsubscribe(id)` — remove channel; receiver sees `Err(RecvError)`
- `close()` — join the poll thread, clear all subscribers
- Wakes are idempotent "go re-read state" signals. Dropped redundant wakes never lose data.
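The coalescing behaviour of a capacity-1 channel can be shown with the standard library alone. This is a simplified sketch, not honker's code: `FanOut` and `wake_all` are hypothetical names, and the background poll thread is left out so that only the fan-out logic remains.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::sync::Mutex;

/// Fan-out half of a shared watcher (hypothetical, poll thread omitted).
#[derive(Default)]
pub struct FanOut {
    senders: Mutex<HashMap<u64, SyncSender<()>>>,
    next_id: AtomicU64,
}

impl FanOut {
    /// Register a subscriber with a bounded channel of capacity 1.
    pub fn subscribe(&self) -> (u64, Receiver<()>) {
        let (tx, rx) = sync_channel(1);
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        self.senders.lock().unwrap().insert(id, tx);
        (id, rx)
    }

    /// Drop the sender; the receiver then sees a disconnect error.
    pub fn unsubscribe(&self, id: u64) {
        self.senders.lock().unwrap().remove(&id);
    }

    /// What a poll thread would call when `data_version` changes.
    pub fn wake_all(&self) {
        let senders = self.senders.lock().unwrap();
        for tx in senders.values() {
            // A full channel means a wake is already pending, so this one
            // can be dropped: the subscriber re-reads state either way.
            let _ = tx.try_send(());
        }
    }
}
```

Three commits in quick succession therefore leave a single tick in each channel, and the subscriber's next `SELECT … WHERE id > last_seen` picks up all three changes.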
---
## 2. Core Capabilities
### 2.1 Notify/Listen — Ephemeral Pub/Sub
**What it is**: Fire-and-forget notifications to channel subscribers. Like `pg_notify` but with table-backed persistence until explicitly pruned.
**How it works**:
- `notify(channel, payload)` is a SQL scalar function that INSERTs into `_honker_notifications` and returns the row id. Runs inside the caller's open transaction — rollbacks drop the notification.
- `db.listen(channel)` or `db.updateEvents()` in Node — registers a subscriber that wakes on any database commit, then filters by channel in the `SELECT` path.
- Listeners attach at current `MAX(id)`; **history is not replayed**. This is the key distinction from streams.
**Schema**:
```sql
CREATE TABLE _honker_notifications (
id INTEGER PRIMARY KEY AUTOINCREMENT,
channel TEXT NOT NULL,
payload TEXT NOT NULL,
created_at INTEGER NOT NULL DEFAULT (unixepoch())
);
CREATE INDEX _honker_notifications_recent ON _honker_notifications(channel, id);
```
**Key characteristics**:
- Not auto-pruned. Call `db.prune_notifications(older_than_s=…, max_keep=…)` from a scheduled task.
- Over-triggering is by design: a `data_version` change wakes every subscriber on that database, not just the matching channel. Each wasted wake = one indexed SELECT (microseconds). A missed wake = a silent correctness bug.
- Payload must be valid JSON for cross-language compatibility.
### 2.2 Queue — At-Least-Once Work Queue
**What it is**: Durable, at-least-once delivery work queue with retries, priority, delayed jobs, task expiration, dead-letter handling, named locks, and rate-limiting.
**Schema (single-table hybrid)**:
```sql
CREATE TABLE _honker_live (
id INTEGER PRIMARY KEY AUTOINCREMENT,
queue TEXT NOT NULL,
payload TEXT NOT NULL,
state TEXT NOT NULL DEFAULT 'pending', -- 'pending' | 'processing'
priority INTEGER NOT NULL DEFAULT 0,
run_at INTEGER NOT NULL DEFAULT (unixepoch()), -- for delayed jobs
worker_id TEXT,
claim_expires_at INTEGER, -- visibility timeout
attempts INTEGER NOT NULL DEFAULT 0,
max_attempts INTEGER NOT NULL DEFAULT 3,
created_at INTEGER NOT NULL DEFAULT (unixepoch()),
expires_at INTEGER -- job expiration
);
CREATE INDEX _honker_live_claim
ON _honker_live(queue, priority DESC, run_at, id)
WHERE state IN ('pending', 'processing');
CREATE TABLE _honker_dead (
id INTEGER PRIMARY KEY,
queue TEXT NOT NULL,
payload TEXT NOT NULL,
priority INTEGER NOT NULL DEFAULT 0,
run_at INTEGER NOT NULL DEFAULT 0,
attempts INTEGER NOT NULL DEFAULT 0,
max_attempts INTEGER NOT NULL DEFAULT 0,
last_error TEXT,
created_at INTEGER NOT NULL DEFAULT (unixepoch()),
died_at INTEGER NOT NULL DEFAULT (unixepoch())
);
```
**Claim/ack/nack model**:
| Operation | SQL | Notes |
|-----------|-----|-------|
| Enqueue | `INSERT INTO _honker_live (queue, payload, run_at, priority, max_attempts, expires_at) VALUES (…)` | Returns auto-increment id |
| Claim | `UPDATE _honker_live SET state='processing', worker_id=?, claim_expires_at=unixepoch()+?, attempts=attempts+1 WHERE id IN (SELECT id FROM _honker_live WHERE queue=? AND state IN ('pending','processing') AND (expires_at IS NULL OR expires_at > unixepoch()) AND ((state='pending' AND run_at <= unixepoch()) OR (state='processing' AND claim_expires_at < unixepoch())) ORDER BY priority DESC, run_at ASC, id ASC LIMIT ?) RETURNING …` | One `UPDATE … RETURNING` via partial index |
| Ack | `DELETE FROM _honker_live WHERE id=? AND worker_id=? AND claim_expires_at >= unixepoch() RETURNING id` | Returns 1 if claim still valid, 0 if expired |
| Retry | `UPDATE _honker_live SET state='pending', run_at=unixepoch()+?, worker_id=NULL, claim_expires_at=NULL WHERE id=?` + notify on queue channel | If `attempts >= max_attempts`, DELETE from `_honker_live` and INSERT into `_honker_dead` |
| Fail | `DELETE FROM _honker_live WHERE id=? AND worker_id=? AND claim_expires_at >= unixepoch() RETURNING …` + `INSERT INTO _honker_dead` | Unconditionally move to dead letter |
| Heartbeat | `UPDATE _honker_live SET claim_expires_at=unixepoch()+? WHERE id=? AND worker_id=? AND state='processing'` | Extend claim for long-running handlers |
| Cancel | `DELETE FROM _honker_live WHERE id=? AND state IN ('pending', 'processing')` | Idempotent |
**Visibility timeout**: Default 300 seconds (`claim_expires_at = unixepoch() + 300`). If a worker crashes mid-job, the claim expires and another worker reclaims. `attempts` increments. After `max_attempts` (default 3), the row moves to `_honker_dead`.
**Priority**: Higher `priority` value = claimed first. The partial index on `(queue, priority DESC, run_at, id)` ensures claim path is bounded by working-set size, not history size.
**Delayed jobs**: Set `run_at` to a future timestamp. Workers only claim rows where `run_at <= unixepoch()`. The `run_at` deadline also wakes sleeping workers through `honker_queue_next_claim_at()`.
**Task expiration**: Set `expires_at` on enqueue. Expired jobs are filtered from the claim path. Call `queue.sweep_expired()` to move them to `_honker_dead` with `last_error='expired'`.
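The claim predicate and ordering can be modelled in memory to make the rules concrete. This sketch mirrors the `WHERE` and `ORDER BY` clauses of the claim statement in the table above; the `Job` struct, `claimable`, and `claim_batch` are hypothetical, and the move to `_honker_dead` after `max_attempts` is not modelled.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Pending,
    Processing,
}

/// In-memory stand-in for a row of `_honker_live` (subset of columns).
#[derive(Debug, Clone)]
pub struct Job {
    pub id: i64,
    pub state: State,
    pub priority: i64,
    pub run_at: i64,
    pub claim_expires_at: Option<i64>,
    pub expires_at: Option<i64>,
    pub attempts: u32,
}

impl Job {
    pub fn pending(id: i64, priority: i64, run_at: i64) -> Job {
        Job {
            id,
            state: State::Pending,
            priority,
            run_at,
            claim_expires_at: None,
            expires_at: None,
            attempts: 0,
        }
    }
}

/// Same conditions as the claim statement's WHERE clause.
fn claimable(job: &Job, now: i64) -> bool {
    let not_expired = job.expires_at.map_or(true, |t| t > now);
    let due = match job.state {
        // Delayed jobs become claimable once run_at has passed.
        State::Pending => job.run_at <= now,
        // A processing job is reclaimable after its visibility timeout.
        State::Processing => job.claim_expires_at.map_or(false, |t| t < now),
    };
    not_expired && due
}

/// Claim up to `limit` jobs ordered by priority DESC, run_at ASC, id ASC.
pub fn claim_batch(jobs: &mut [Job], now: i64, timeout_s: i64, limit: usize) -> Vec<i64> {
    let mut candidates: Vec<usize> = (0..jobs.len())
        .filter(|&i| claimable(&jobs[i], now))
        .collect();
    candidates.sort_by_key(|&i| (std::cmp::Reverse(jobs[i].priority), jobs[i].run_at, jobs[i].id));
    candidates.truncate(limit);
    for &i in &candidates {
        jobs[i].state = State::Processing;
        jobs[i].claim_expires_at = Some(now + timeout_s);
        jobs[i].attempts += 1;
    }
    candidates.iter().map(|&i| jobs[i].id).collect()
}
```

Calling `claim_batch` again after the timeout returns the same jobs with `attempts` incremented, which is the crash-recovery path described under visibility timeout.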
**Named locks**: `honker_lock_acquire(name, owner, ttl_s)` → 1 (got it) or 0 (held). `honker_lock_release(name, owner)` → 1 (released) or 0 (not yours). Uses `_honker_locks` table with TTL-based expiration. Primary use case: cron tasks that shouldn't overlap (leader election).
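The TTL semantics of a named lock can be illustrated with a small in-memory model. This is a sketch only: the `Locks` type is hypothetical, time is passed in explicitly instead of read from `unixepoch()`, and refusing a second acquire by the current holder is an assumption about behaviour the description above does not spell out.

```rust
use std::collections::HashMap;

/// In-memory model of a TTL lock table (hypothetical, not honker's code).
#[derive(Default)]
pub struct Locks {
    // name -> (owner, expires_at)
    held: HashMap<String, (String, i64)>,
}

impl Locks {
    /// Returns true if `owner` got the lock, false if it is still held.
    /// Assumption: an unexpired lock blocks every caller, the holder included.
    pub fn acquire(&mut self, name: &str, owner: &str, ttl_s: i64, now: i64) -> bool {
        let free = self.held.get(name).map_or(true, |entry| entry.1 <= now);
        if free {
            self.held
                .insert(name.to_string(), (owner.to_string(), now + ttl_s));
        }
        free
    }

    /// Returns true if the lock was released, false if `owner` does not hold it.
    pub fn release(&mut self, name: &str, owner: &str) -> bool {
        let is_owner = self.held.get(name).map_or(false, |entry| entry.0 == owner);
        if is_owner {
            self.held.remove(name);
        }
        is_owner
    }
}
```

The expiry check is what lets a second scheduler take over when the first one crashes without releasing.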
**Rate limiting**: `honker_rate_limit_try(name, limit, per)` → 1 (under limit) or 0 (at limit). Fixed-window counter. Rejected calls don't inflate the count.
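A fixed-window counter is short enough to sketch in full. The `RateLimiter` type below is hypothetical, and aligning windows to multiples of `per_s` is an assumption; the description above only states that the window is fixed and that rejected calls are not counted.

```rust
use std::collections::HashMap;

/// In-memory fixed-window rate limiter (hypothetical model).
#[derive(Default)]
pub struct RateLimiter {
    // name -> (window_start, count)
    windows: HashMap<String, (i64, u32)>,
}

impl RateLimiter {
    /// Returns true if the call is under the limit. `per_s` must be > 0.
    pub fn try_acquire(&mut self, name: &str, limit: u32, per_s: i64, now: i64) -> bool {
        // Assumption: windows start at multiples of `per_s`.
        let window_start = now - now.rem_euclid(per_s);
        let entry = self
            .windows
            .entry(name.to_string())
            .or_insert((window_start, 0));
        if entry.0 != window_start {
            // A new window has begun: reset the counter.
            *entry = (window_start, 0);
        }
        if entry.1 < limit {
            entry.1 += 1;
            true
        } else {
            // Rejected calls do not inflate the count.
            false
        }
    }
}
```

Fixed windows allow a burst of up to twice the limit across a window boundary, which is the usual trade-off against a sliding window's extra bookkeeping.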
**Batch operations**: `honker_claim_batch(queue, worker_id, n, timeout_s)` returns a JSON array of claimed jobs. `honker_ack_batch('[1,2,3]', worker_id)` acks multiple jobs. Ack is per-transaction for batch — honest bool return.
**Task result storage**: `honker_enqueue()` returns the job id. Workers can persist return values via `honker_result_save(id, value, ttl_s)`. Callers await results with `queue.wait_result(id, timeout)`. Opt-in (default `save_result=False`).
**Claim iterator pattern**:
```python
async for job in q.claim("worker-1"):
try:
send(job.payload)
job.ack()
except Exception as e:
job.retry(delay_s=60, error=str(e))
```
Each iteration is `claim_batch(worker_id, 1)`. Wakes on database update from any process, or when the next `run_at` / reclaim deadline arrives. 5-second paranoia poll is the only fallback.
**Queue notifications**: Each enqueue also fires a notification on `honker:<queue>` channel so workers wake immediately without waiting for the next poll cycle.
### 2.3 Stream — Durable Pub/Sub with Per-Consumer Offsets
**What it is**: Durable event stream where each named consumer tracks its own offset. Events persist until explicitly pruned. At-least-once delivery with configurable offset flush cadence.
**Schema**:
```sql
CREATE TABLE _honker_stream (
offset INTEGER PRIMARY KEY AUTOINCREMENT,
topic TEXT NOT NULL,
key TEXT,
payload TEXT NOT NULL,
created_at INTEGER NOT NULL DEFAULT (unixepoch())
);
CREATE INDEX _honker_stream_topic
ON _honker_stream(topic, offset);
CREATE TABLE _honker_stream_consumers (
name TEXT NOT NULL,
topic TEXT NOT NULL,
offset INTEGER NOT NULL DEFAULT 0,
PRIMARY KEY (name, topic)
);
```
**API**:
| Function | Returns | Notes |
|----------|---------|-------|
| `honker_stream_publish(topic, key_or_null, payload_json)` | `offset` | INSERTs into `_honker_stream` + fires notification on `honker:stream:<topic>` |
| `honker_stream_read_since(topic, offset, limit)` | JSON array | Reads rows where `offset > ?` ordered by offset |
| `honker_stream_save_offset(consumer, topic, offset)` | 1 or 0 | Monotonic upsert — never rewinds. 1 = advanced, 0 = existing offset ≥ new |
| `honker_stream_get_offset(consumer, topic)` | offset or 0 | Returns saved offset for consumer/topic pair |
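The monotonic upsert in `honker_stream_save_offset` is the part most worth pinning down, since it is what makes a late or duplicate save harmless. A minimal in-memory model follows; `ConsumerOffsets` is a hypothetical type and the booleans stand in for the 1/0 return values.

```rust
use std::collections::HashMap;

/// In-memory model of `_honker_stream_consumers` (hypothetical).
#[derive(Default)]
pub struct ConsumerOffsets {
    // (consumer, topic) -> saved offset
    offsets: HashMap<(String, String), i64>,
}

impl ConsumerOffsets {
    /// Monotonic save: true if the offset advanced, false if the stored
    /// offset was already greater than or equal to the new one.
    pub fn save_offset(&mut self, consumer: &str, topic: &str, offset: i64) -> bool {
        let key = (consumer.to_string(), topic.to_string());
        let current = self.offsets.entry(key).or_insert(0);
        if offset > *current {
            *current = offset;
            true
        } else {
            false
        }
    }

    /// Saved offset for the pair, or 0 when nothing has been saved.
    pub fn get_offset(&self, consumer: &str, topic: &str) -> i64 {
        let key = (consumer.to_string(), topic.to_string());
        self.offsets.get(&key).copied().unwrap_or(0)
    }
}
```

Because a save can never rewind, two instances of the same consumer that flush out of order cannot cause already-acknowledged events to be replayed from an older position.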
**Python binding**:
```python
stream = db.stream("user-events")
stream.publish({"user_id": uid, "change": "name"}, tx=tx)
async for event in stream.subscribe(consumer="dashboard"):
await push_to_browser(event)
```
**Subscribe behavior**:
1. Replay rows where `offset > saved_offset` in batches (default 1000 rows)
2. Transition to live delivery on commit wake
3. Auto-save offset at most every 1000 events or every 1 second (whichever first)
4. At-least-once: a crash re-delivers in-flight events up to the last flushed offset
5. Override auto-save with `save_every_n=` / `save_every_s=`; set both to 0 for manual control
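The auto-save rule in steps 3 and 5 can be written as a small policy object. This is a sketch under assumptions: `OffsetFlusher` is a hypothetical name, the decision is evaluated only when an event is delivered, and time is supplied by the caller in whole seconds.

```rust
/// Decides when a consumer's offset should be persisted (hypothetical).
pub struct OffsetFlusher {
    save_every_n: u64,
    save_every_s: u64,
    pending: u64,
    last_flush_at: u64,
}

impl OffsetFlusher {
    /// Setting both thresholds to 0 disables auto-save (manual control).
    pub fn new(save_every_n: u64, save_every_s: u64, now: u64) -> Self {
        OffsetFlusher {
            save_every_n,
            save_every_s,
            pending: 0,
            last_flush_at: now,
        }
    }

    /// Record one delivered event; returns true when the offset should be
    /// saved now, whichever threshold is reached first.
    pub fn on_event(&mut self, now: u64) -> bool {
        self.pending += 1;
        let by_count = self.save_every_n > 0 && self.pending >= self.save_every_n;
        let by_time = self.save_every_s > 0
            && now.saturating_sub(self.last_flush_at) >= self.save_every_s;
        if by_count || by_time {
            self.pending = 0;
            self.last_flush_at = now;
            true
        } else {
            false
        }
    }
}
```

The events counted in `pending` are exactly the ones a crash would re-deliver, so the two thresholds bound the size of the replay window.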
**Transaction coupling**: `stream.publish(payload, tx=tx)` inserts into `_honker_stream` inside the caller's transaction. Rollback drops the event. This is the transactional outbox pattern without a separate dispatch table.
### 2.4 Scheduler — Time-Triggered Cron Tasks
**Schema**:
```sql
CREATE TABLE _honker_scheduler_tasks (
name TEXT PRIMARY KEY,
queue TEXT NOT NULL,
cron_expr TEXT NOT NULL,
payload TEXT NOT NULL,
priority INTEGER NOT NULL DEFAULT 0,
expires_s INTEGER,
next_fire_at INTEGER NOT NULL,
enabled INTEGER NOT NULL DEFAULT 1
);
```
**API**:
```sql
SELECT honker_scheduler_register('nightly', 'backups', '0 3 * * *', '"go"', 0, NULL);
SELECT honker_scheduler_tick(unixepoch()); -- JSON: fires due
SELECT honker_scheduler_soonest(); -- min next_fire_at
SELECT honker_scheduler_unregister('nightly'); -- 1 = deleted
SELECT honker_scheduler_pause('nightly'); -- 1 = paused
SELECT honker_scheduler_resume('nightly'); -- 1 = resumed
SELECT honker_scheduler_list(); -- JSON array of all schedules
SELECT honker_scheduler_update('nightly', '0 4 * * *', NULL, NULL, NULL, 0);
```
Supports: 5-field cron, 6-field cron (with seconds), `@every <n><unit>` interval expressions.
**Leader election via named lock**: `db.lock('honker-scheduler', ttl=60)`. Two scheduler processes can't both fire. The lock is heartbeat-refreshed every 30s.
**Missed-fire catch-up**: If the scheduler was down for 4 hours with an hourly schedule, the first iteration fires all 4 missed boundaries (with `expires=` to drop stale ones).
**Fires = enqueue**: The scheduler never runs handlers. It enqueues into the task queue. Regular workers consume.
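Missed-fire catch-up for an `@every` style interval reduces to a loop over boundaries. The sketch below is illustrative only: `due_fires` and `drop_stale` are hypothetical helpers, cron expressions are not handled, and filtering stale fires before enqueueing is a simplification of whatever `expires=` does internally.

```rust
/// Returns every fire time that is due at `now`, plus the next future
/// fire time to store back as `next_fire_at`. `interval_s` must be > 0.
pub fn due_fires(next_fire_at: i64, interval_s: i64, now: i64) -> (Vec<i64>, i64) {
    assert!(interval_s > 0, "interval must be positive");
    let mut fires = Vec::new();
    let mut t = next_fire_at;
    while t <= now {
        fires.push(t);
        t += interval_s;
    }
    (fires, t)
}

/// Keep only fires younger than `expires_s`, so that catch-up does not
/// enqueue work that is already stale (assumed simplification).
pub fn drop_stale(fires: &[i64], expires_s: i64, now: i64) -> Vec<i64> {
    fires
        .iter()
        .copied()
        .filter(|&t| now - t < expires_s)
        .collect()
}
```

With an hourly schedule whose `next_fire_at` is 3600 and a tick at 14410, `due_fires` yields the four missed boundaries from the example above and advances `next_fire_at` to 18000.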
### 2.5 Outbox Pattern
The `outbox` is a convenience wrapper around the `Queue` primitive:
```python
db.outbox("emails", delivery=send_email)
db.outbox("emails").enqueue({"to": "alice@example.com"}, tx=tx)
db.outbox("emails").run_worker("worker-1")
```
Failures retry with exponential backoff (`base_backoff_s * 2^(attempts-1)`) up to `max_attempts`, then land in `_honker_dead`.
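The backoff formula is easy to get wrong by one attempt, so a worked sketch may help. `backoff_delay_s`, `on_failure`, and `Next` are hypothetical names; the only thing taken from the text is the formula `base_backoff_s * 2^(attempts-1)` and the dead-letter cutoff at `max_attempts`.

```rust
/// Delay before the next retry: base_backoff_s * 2^(attempts - 1).
/// `attempts` is the number of claims so far (1 after the first failure).
pub fn backoff_delay_s(base_backoff_s: u64, attempts: u32) -> u64 {
    // Clamp the exponent and saturate so large values cannot overflow.
    let exponent = attempts.saturating_sub(1).min(63);
    base_backoff_s.saturating_mul(1u64 << exponent)
}

#[derive(Debug, PartialEq, Eq)]
pub enum Next {
    /// Re-enqueue with `run_at = now + delay`.
    RetryIn(u64),
    /// Move the job to `_honker_dead`.
    DeadLetter,
}

/// Decide what happens to a job whose delivery just failed.
pub fn on_failure(base_backoff_s: u64, attempts: u32, max_attempts: u32) -> Next {
    if attempts >= max_attempts {
        Next::DeadLetter
    } else {
        Next::RetryIn(backoff_delay_s(base_backoff_s, attempts))
    }
}
```

With a 30-second base and the default of 3 attempts, a job retries after 30 s and 60 s and is dead-lettered on the third failure.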
---
## 3. Persistence and Reliability
### Durability Guarantees
- **Atomic commit**: Business write + side-effect enqueue/event/notify commit together or roll back together. This is SQLite ACID — the transactional outbox pattern is built into the primitives, not bolted on.
- **SIGKILL safety**: Verified in `tests/test_crash_recovery.py`. Subprocess killed pre-COMMIT → `PRAGMA integrity_check == 'ok'`, zero in-flight rows, no stale write lock, queue round-trip works post-crash.
- **Worker crash recovery**: If a worker crashes mid-job, the claim expires after `visibility_timeout_s` (default 300s) and another worker reclaims. `attempts` increments on each claim. After `max_attempts` (default 3), the row moves to `_honker_dead`.
- **Stream at-least-once**: Offsets auto-flush every 1000 events or 1 second. A crash re-delivers in-flight events up to the last flushed offset. The crash window is bounded by the flush thresholds.
- **Notify has no replay**: Listeners attach at `MAX(id)`. Pruned events are gone. For durable replay, use streams.
### WAL Mode
Recommended default (`journal_mode = WAL`). Gives concurrent readers with one writer and efficient fsync batching (`wal_autocheckpoint = 10000`). Other journal modes work but lose WAL's concurrent-read-while-writing property. Wake detection (`PRAGMA data_version`) works in all journal modes.
### What Happens on Crash
| Scenario | Result |
|----------|--------|
| Process SIGKILL mid-TRANSACTION | SQLite atomic-commit rollback. In-flight write did not land. Fresh process can acquire write lock immediately. |
| Worker process crash mid-job | Claim expires after visibility_timeout. Another worker reclaims. `attempts` increments. |
| Stream consumer crash | Resumes from last auto-saved offset (at-least-once). Pending offset is lost. |
| Database file replaced (litestream restore) | Watcher panics with clear error message. All subscribers see error from update_events(). Must reopen database. |
### What Honker Does NOT Provide
- **Multi-writer replication**: SQLite's locking is for single-host. Two servers writing one `.db` over NFS will corrupt it. Shard by file or switch to Postgres.
- **In-memory database support**: `:memory:` creates a separate database per connection, splitting writer/readers/watchers. Use temp file-backed `.db` for tests.
- **Cross-node distribution**: Honker is single-machine. No built-in mechanism for distributing events across nodes. (This is intentional — see alknet relevance below.)
- **Task pipelines/chains/groups/chords**. Deliberately not built.
- **Workflow orchestration with DAGs**. Deliberately not built.
- **Ordering guarantees across queues**. Each queue is independent.
- **Exactly-once delivery**. Honker provides at-least-once. Idempotent handlers are the user's responsibility.
---
## 4. Performance
### Benchmarks (M-series, release build, median of 3)
| Operation | Throughput |
|-----------|-----------|
| enqueue (1/tx) | ~8,000/sec |
| enqueue (100/tx) | ~110,000/sec |
| claim + ack (individual) | ~4,500/sec |
| claim_batch + ack_batch (32) | ~75,000/sec |
| claim_batch + ack_batch (128) | ~110,000/sec |
| async iter end-to-end | ~6,500/sec |
| stream replay | ~1,000,000/sec |
| stream live e2e p50 | 0.24ms |
| stream live e2e p99 | 8ms |
### Cross-Process Wake Latency
Median ~1-2ms on M-series, bounded by the 1ms `PRAGMA data_version` poll cadence. 600-second soak test under sustained ~75 commits/sec showed zero missed wakes, zero drift, `PRAGMA integrity_check = ok`.
### Claim Performance at Scale
With 100,000 dead rows in `_honker_dead`:
| Rows in `_honker_dead` | claim + ack throughput |
|-----------|-----------|
| 0 dead rows (fresh DB) | ~4,000/sec |
| 100k dead rows | ~3,500/sec |
The partial index `(queue, priority DESC, run_at, id) WHERE state IN ('pending','processing')` keeps the claim hot path bounded by working-set size, not history size.
### How It Compares to Polling
Prior to honker's wake mechanism, the alternative would be application-level polling (e.g., `SELECT … WHERE id > last_seen` every N seconds). Honker replaces this with a single-digit-microsecond PRAGMA read. 100 subscribers still = 1 poll thread. The over-triggering trade-off (waking all subscribers on any commit) is explicitly chosen over potentially missing a wake.
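The "100 subscribers, 1 poll thread" shape can be sketched as follows. This is an illustration of the fan-out idea only, not the `SharedUpdateWatcher` code: the class name `SharedWatcher` and its methods are assumptions, and the poll step is exposed as a plain method (`poll_once`) that a single background thread would call on a short interval.

```python
import queue
import sqlite3


class SharedWatcher:
    """One data_version poller that fans wake signals out to many subscribers."""

    def __init__(self, path):
        self._conn = sqlite3.connect(path)  # dedicated connection for polling
        self._last = self._version()
        self._subs = {}
        self._next_id = 0

    def _version(self):
        return self._conn.execute("PRAGMA data_version").fetchone()[0]

    def subscribe(self):
        """Register a subscriber; returns (sub_id, queue of wake signals)."""
        sub_id, q = self._next_id, queue.Queue()
        self._next_id += 1
        self._subs[sub_id] = q
        return sub_id, q

    def unsubscribe(self, sub_id):
        self._subs.pop(sub_id, None)

    def poll_once(self):
        """Wake every subscriber if another connection committed since last poll."""
        current = self._version()
        if current == self._last:
            return False
        self._last = current
        for q in self._subs.values():
            q.put(None)  # the signal carries no data; subscribers re-read SQLite
        return True

    def close(self):
        self._conn.close()
```

Every subscriber is woken on any commit, whether or not the commit concerns it. That matches the over-triggering trade-off described above: a spurious wake costs one cheap re-read, while a missed wake would stall a consumer.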
---
## 5. SQLite Integration
### Loading the Extension
```sql
-- sqlite3 CLI syntax; other SQLite 3.9+ clients can call load_extension() instead
.load ./libhonker_ext
SELECT honker_bootstrap();
```
`honker_bootstrap()` is idempotent — it runs `CREATE TABLE IF NOT EXISTS` and `CREATE INDEX IF NOT EXISTS` for all schema tables.
### Compile/Load Flags
For Rust integration via `rusqlite`:
```toml
[dependencies]
honker-core = "0.2.3"
rusqlite = { version = "0.39.0", features = ["functions", "hooks"] }
```
Then in Rust:
```rust
use honker_core::{attach_notify, attach_honker_functions, bootstrap_honker_schema, open_conn};
let conn = open_conn("app.db", true)?; // true = install notify
attach_honker_functions(&conn)?;
bootstrap_honker_schema(&conn)?;
```
For the loadable extension:
```bash
cargo build --release -p honker-extension
# Produces: target/release/libhonker_ext.so (or .dylib, .dll)
```
### Rust Crate Usage
```rust
use std::sync::mpsc::RecvTimeoutError;
use std::time::Duration;

use honker_core::SharedUpdateWatcher;

let watcher = SharedUpdateWatcher::new(db_path.clone());
let (sub_id, rx) = watcher.subscribe();
// In a loop:
match rx.recv_timeout(Duration::from_secs(5)) {
    Ok(()) => { /* re-read state from SQLite */ }
    Err(RecvTimeoutError::Timeout) => { /* paranoia poll */ }
    Err(RecvTimeoutError::Disconnected) => { /* watcher died, reopen */ }
}
watcher.unsubscribe(sub_id);
watcher.close()?;
```
### Using with ORM Connections
Load `libhonker_ext` on the ORM's connection, run `honker_bootstrap()` once per connection, and call `honker_enqueue()` inside the ORM's transaction so the job commits atomically with the business write:
```python
# SQLAlchemy; `engine` and the `Order` model are defined elsewhere.
import honker
from sqlalchemy import event, text
from sqlalchemy.orm import Session

@event.listens_for(engine, "connect")
def _load_honker(conn, _):
    honker.load_extension(conn)
    conn.execute("SELECT honker_bootstrap()")

with Session(engine) as s, s.begin():
    s.add(Order(user_id=42))
    s.execute(text("SELECT honker_enqueue(:q, :p, NULL, NULL, 0, 3, NULL)"),
              {"q": "emails", "p": '{"to":"alice"}'})
```
### PRAGMA Defaults
Applied on every connection opened via `open_conn`:
```sql
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL; -- fsync WAL at checkpoint, not every commit
PRAGMA busy_timeout = 5000; -- wait up to 5s for writer lock
PRAGMA foreign_keys = ON;
PRAGMA cache_size = -32000; -- 32MB page cache (default was 2MB)
PRAGMA temp_store = MEMORY; -- temp B-trees in RAM
PRAGMA wal_autocheckpoint = 10000; -- checkpoint every 10k WAL pages
```
---
## 6. Complete API Surface
### Notification Functions
| Function | Signature | Returns | Notes |
|----------|-----------|---------|-------|
| `notify(channel, payload)` | Scalar, 2 args | `rowid` | INSERTs into `_honker_notifications`, returns auto-generated id |
### Queue Functions
| Function | Signature | Returns | Notes |
|----------|-----------|---------|-------|
| `honker_bootstrap()` | 0 args | `1` | Creates all schema tables/indexes. Idempotent. |
| `honker_enqueue(queue, payload, run_at_or_null, delay_or_null, priority, max_attempts, expires_or_null)` | 7 args | `id` | INSERTs job. Delay overrides run_at. |
| `honker_claim_batch(queue, worker_id, n, timeout_s)` | 4 args | JSON array | Claims up to `n` jobs. Each gets `claim_expires_at = now + timeout_s`. |
| `honker_ack_batch(ids_json, worker_id)` | 2 args | `count` | ACKs (DELETEs) claimed jobs. `ids_json` is `[1,2,3]`. |
| `honker_ack(job_id, worker_id)` | 2 args | `1` or `0` | Single-job ack. Returns 0 if claim expired. |
| `honker_retry(job_id, worker_id, delay_s, error)` | 4 args | `1` or `0` | Retries (flips back to pending) or fails to dead if `attempts >= max_attempts`. |
| `honker_fail(job_id, worker_id, error)` | 3 args | `1` or `0` | Unconditionally moves to `_honker_dead`. |
| `honker_heartbeat(job_id, worker_id, extend_s)` | 3 args | `1` or `0` | Extends claim for long-running handlers. |
| `honker_cancel(job_id)` | 1 arg | `1` or `0` | Removes pending/processing row. Idempotent. |
| `honker_get_job(job_id)` | 1 arg | JSON or `""` | Read job state. Pure read. |
| `honker_sweep_expired(queue)` | 1 arg | `count` | Moves expired pending jobs to `_honker_dead`. |
| `honker_queue_next_claim_at(queue)` | 1 arg | `unix_ts` or `0` | Earliest future deadline (run_at or claim_expires_at + 1). |
### Stream Functions
| Function | Signature | Returns | Notes |
|----------|-----------|---------|-------|
| `honker_stream_publish(topic, key_or_null, payload_json)` | 3 args | `offset` | INSERTs event + fires notification |
| `honker_stream_read_since(topic, offset, limit)` | 3 args | JSON array | Reads events with `offset > ?` |
| `honker_stream_save_offset(consumer, topic, offset)` | 3 args | `1` or `0` | Monotonic upsert. 0 = existing offset ≥ new |
| `honker_stream_get_offset(consumer, topic)` | 2 args | `offset` or `0` | Returns saved offset |
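
The "monotonic upsert" in `honker_stream_save_offset` can be pictured as an `INSERT ... ON CONFLICT DO UPDATE` whose update only fires when the new offset is larger. The sketch below assumes a hypothetical `stream_consumers` table and Python helper names; it is not the extension's actual SQL, and it needs SQLite 3.24 or newer for upsert support.

```python
import sqlite3

# Hypothetical table for illustration; honker's real one is `_honker_stream_consumers`.
SCHEMA = """
CREATE TABLE stream_consumers (
    consumer TEXT NOT NULL,
    topic TEXT NOT NULL,
    last_offset INTEGER NOT NULL,
    PRIMARY KEY (consumer, topic)
);
"""


def save_offset(conn, consumer, topic, offset):
    """Advance the saved offset; return 1 if it moved forward, else 0."""
    with conn:
        cur = conn.execute(
            "INSERT INTO stream_consumers (consumer, topic, last_offset) "
            "VALUES (?, ?, ?) "
            "ON CONFLICT (consumer, topic) DO UPDATE "
            "SET last_offset = excluded.last_offset "
            "WHERE excluded.last_offset > stream_consumers.last_offset",
            (consumer, topic, offset),
        )
        return 1 if cur.rowcount > 0 else 0


def get_offset(conn, consumer, topic):
    """Return the saved offset, or 0 if the consumer has never saved one."""
    row = conn.execute(
        "SELECT last_offset FROM stream_consumers WHERE consumer = ? AND topic = ?",
        (consumer, topic),
    ).fetchone()
    return row[0] if row else 0
```

Because the offset can never move backwards, a slow or duplicated save from an older batch cannot cause already-processed events to be replayed.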
### Lock Functions
| Function | Signature | Returns | Notes |
|----------|-----------|---------|-------|
| `honker_lock_acquire(name, owner, ttl_s)` | 3 args | `1` or `0` | 1 = acquired, 0 = held |
| `honker_lock_release(name, owner)` | 2 args | `1` or `0` | 1 = released, 0 = not yours |
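
A TTL-based named lock of this kind can be sketched in a few statements. The table name `locks`, the helper names, and the explicit `now` parameter are assumptions made for illustration; the real functions take no clock argument and their internals may differ.

```python
import sqlite3

# Hypothetical table for illustration; honker's real one is `_honker_locks`.
SCHEMA = """
CREATE TABLE locks (
    name TEXT PRIMARY KEY,
    owner TEXT NOT NULL,
    expires_at INTEGER NOT NULL
);
"""


def lock_acquire(conn, name, owner, ttl_s, now):
    """Return 1 if the lock was acquired, 0 if someone else still holds it."""
    with conn:
        # Drop an expired holder first so a crashed owner cannot block forever.
        conn.execute(
            "DELETE FROM locks WHERE name = ? AND expires_at <= ?", (name, now)
        )
        cur = conn.execute(
            "INSERT OR IGNORE INTO locks (name, owner, expires_at) VALUES (?, ?, ?)",
            (name, owner, now + ttl_s),
        )
        return cur.rowcount


def lock_release(conn, name, owner):
    """Return 1 if released, 0 if the lock is not held by this owner."""
    with conn:
        cur = conn.execute(
            "DELETE FROM locks WHERE name = ? AND owner = ?", (name, owner)
        )
        return cur.rowcount
```

The primary key on `name` is what enforces mutual exclusion; the TTL only bounds how long a dead owner can keep the lock.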
### Rate Limit Functions
| Function | Signature | Returns | Notes |
|----------|-----------|---------|-------|
| `honker_rate_limit_try(name, limit, per)` | 3 args | `1` or `0` | 1 = under limit, 0 = at limit |
| `honker_rate_limit_sweep(older_than_s)` | 1 arg | `count` | Prunes expired windows |
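
A fixed-window limiter keeps one counter per `(name, window)` pair and rejects once the counter reaches the limit. The following sketch assumes a hypothetical `rate_limits` table, a `hits` column, and an explicit `now` argument; it illustrates the technique rather than honker's exact implementation.

```python
import sqlite3

# Hypothetical table for illustration; honker's real one is `_honker_rate_limits`.
SCHEMA = """
CREATE TABLE rate_limits (
    name TEXT NOT NULL,
    window_start INTEGER NOT NULL,
    hits INTEGER NOT NULL,
    PRIMARY KEY (name, window_start)
);
"""


def rate_limit_try(conn, name, limit, per, now):
    """Return 1 if this call fits in the current window, 0 if the limit is hit."""
    window = now - (now % per)  # start of the fixed window containing `now`
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO rate_limits (name, window_start, hits) "
            "VALUES (?, ?, 0)",
            (name, window),
        )
        cur = conn.execute(
            "UPDATE rate_limits SET hits = hits + 1 "
            "WHERE name = ? AND window_start = ? AND hits < ?",
            (name, window, limit),
        )
        return cur.rowcount


def rate_limit_sweep(conn, older_than_s, now):
    """Delete windows that started more than older_than_s ago; return the count."""
    with conn:
        cur = conn.execute(
            "DELETE FROM rate_limits WHERE window_start < ?", (now - older_than_s,)
        )
        return cur.rowcount
```

Fixed windows are simple but allow a burst of up to twice the limit across a window boundary, which is the usual trade-off against sliding-window schemes.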
### Scheduler Functions
| Function | Signature | Returns | Notes |
|----------|-----------|---------|-------|
| `honker_scheduler_register(name, queue, cron_expr, payload, priority, expires_s_or_null)` | 6 args | `1` | Upserts task. Computes next_fire_at. |
| `honker_scheduler_unregister(name)` | 1 arg | `0` or `1` | Deletes task. |
| `honker_scheduler_tick(now_unix)` | 1 arg | JSON array | Fires due tasks, enqueues payloads, advances next_fire_at. |
| `honker_scheduler_soonest()` | 0 args | `unix_ts` or `0` | Earliest next_fire_at for sleep duration calculation. |
| `honker_scheduler_pause(name)` | 1 arg | `0` or `1` | Sets `enabled = 0`. |
| `honker_scheduler_resume(name)` | 1 arg | `0` or `1` | Sets `enabled = 1`. |
| `honker_scheduler_list()` | 0 args | JSON array | Returns all schedules with state. |
| `honker_scheduler_update(name, cron_expr_or_null, payload_or_null, priority_or_null, expires_s_or_null, touch_expires)` | 6 args | `0` or `1` | Mutates schedule fields. Recomputes next_fire_at if cron_expr changed. |
| `honker_cron_next_after(expr, from_unix)` | 2 args | `unix_ts` | Pure deterministic function. 5-field, 6-field, or `@every <n><unit>`. |
### Result Functions
| Function | Signature | Returns | Notes |
|----------|-----------|---------|-------|
| `honker_result_save(job_id, value_json, ttl_s)` | 3 args | `1` | UPSERTs result. `ttl_s=0` = no expiration. |
| `honker_result_get(job_id)` | 1 arg | `value` or `NULL` | Returns result or NULL if expired/missing. |
| `honker_result_sweep()` | 0 args | `count` | Prunes expired result rows. |
### Watcher Functions (Extension ABI)
| Function | Signature | Returns | Notes |
|----------|-----------|---------|-------|
| `honker_update_watcher_open(db_path, backend)` | 2 SQL args | `id` | Opens a watcher handle. For Elixir and extension consumers. |
| `honker_update_watcher_wait(id, timeout_ms)` | 2 SQL args | `1`/`0`/`-1` | 1 = update observed, 0 = timeout, -1 = disconnected |
| `honker_update_watcher_close(id)` | 1 SQL arg | `1` | Closes watcher handle. |
C ABI (for Go, .NET, C++, Ruby bindings that route through the extension):
| Function | Signature | Returns | Notes |
|----------|-----------|---------|-------|
| `honker_watcher_open(db_path, backend, err_buf, err_buf_len)` | C ABI | `*mut HonkerWatcherHandle` | Opens a core-backed update watcher. |
| `honker_watcher_wait(handle, timeout_ms)` | C ABI | `1`/`0`/`-1`/`-2` | 1 = update, 0 = timeout, -1 = closed, -2 = panic |
| `honker_watcher_close(handle)` | C ABI | void | Closes and frees the handle. |
### Tables
| Table | Purpose |
|-------|---------|
| `_honker_live` | Pending + processing jobs. Partial index for fast claims. |
| `_honker_dead` | Terminal jobs (retry-exhausted or explicitly failed). Never scanned by claim path. |
| `_honker_notifications` | Ephemeral notify/listen messages. Not auto-pruned. |
| `_honker_stream` | Durable stream events with auto-incrementing offsets. |
| `_honker_stream_consumers` | Per-consumer stream offsets. Monotonic upsert. |
| `_honker_locks` | Named advisory locks with TTL expiration. |
| `_honker_rate_limits` | Fixed-window rate limit counters. |
| `_honker_scheduler_tasks` | Cron/schedule task definitions with next_fire_at. |
| `_honker_results` | Task result storage with TTL expiration. |
---
## 7. Comparison to Postgres (pg_notify)
| Feature | Honker | pg_notify |
|---------|--------|-----------|
| **Delivery model** | Table-backed `INSERT` in transaction | In-memory NOTIFY with LISTEN callback |
| **Persistence** | Rows survive restart. Not auto-pruned. | Ephemeral — lost on restart, not replayed. |
| **Transactional coupling** | `notify(channel, payload)` inside `BEGIN IMMEDIATE; INSERT; COMMIT` — atomic with business write | NOTIFY fires at COMMIT inside the same transaction. Atomic with business write. |
| **Retry / visibility timeout** | Queue has `claim_expires_at`, `attempts`, `max_attempts`, dead-letter. | No retry. No visibility timeout. |
| **Delayed delivery** | `run_at` for scheduled delivery. Jobs only claimable after deadline. | No scheduling. |
| **Cross-process wake** | `PRAGMA data_version` polling at ~1ms cadence. SharedUpdateWatcher fans out to N subscribers. | Postgres notifies listeners via its inter-process communication. |
| **Priority** | Queue priority via partial index `(queue, priority DESC, run_at, id)`. | No priority. |
| **Rate limiting** | Built-in fixed-window `rate_limit_try`. | No rate limiting. |
| **Named locks** | TTL-based advisory locks in `_honker_locks`. | `pg_advisory_lock` (similar concept, different implementation). |
| **Cron scheduling** | Built-in scheduler with 5-field/6-field cron + `@every` intervals. | Needs pg-boss/Oban/cron extension. |
| **Stream offsets** | Per-consumer tracked offsets with monotonic upsert. | No built-in stream offsets. |
| **Multi-process** | Single-machine, single-writer. | Multi-process, multi-writer natively. |
| **Durability** | SQLite ACID. WAL mode for concurrent readers. | Postgres ACID. Full write-ahead logging. |
**What honker gives you that pg_notify alone doesn't**:
1. **Retry with exponential backoff** — automatic re-delivery on failure
2. **Visibility timeout** — crashed workers don't permanently lose messages
3. **Dead-letter queue** — exhausted retries land in `_honker_dead` for inspection
4. **Delayed jobs** — `run_at` for future delivery
5. **Prioritization** — `priority` column in claim index
6. **Transactional outbox** — business write + enqueue/event in one transaction, without adding Redis/Celery
7. **Task result storage** — workers can persist return values; callers can await results
8. **Durable streams** — per-consumer offsets with at-least-once delivery
9. **Cron scheduling** — built-in periodic tasks with leader election
10. **Named locks and rate limiting** — built-in coordination primitives
**What you'd need to add if you used Postgres instead**: pg-boss, Oban, or similar packages provide many of these features, but they require Postgres as the database. Honker exists for the case where SQLite is already the primary datastore.
---
## 8. Comparison to Other Message Systems
| Feature | Honker | Redis Pub/Sub | NATS | Kafka |
|---------|--------|--------------|------|-------|
| **Persistence** | SQLite tables (disk) | None (Pub/Sub messages are never persisted, even with RDB/AOF) | In-memory (JetStream adds persistence) | Persistent log |
| **Transactional coupling** | Business write + enqueue in one tx | Not atomic with business data | Not atomic with business data | Not atomic with business data |
| **Delivery guarantee** | At-least-once | At-most-once (fire-and-forget) | At-most-once (core); at-least-once (JetStream) | At-least-once with consumer offsets |
| **Retry/visibility** | Built-in (claim timeout, retry, dead-letter) | None (messages disappear if no consumer) | None (core); redelivery (JetStream) | Consumer group offsets |
| **Priority** | Yes (partial index) | No | No | No |
| **Delayed delivery** | Yes (`run_at`) | No (requires sorted sets hack) | No | No (requires time-based logic) |
| **Single-node complexity** | Zero — just a `.db` file | Requires Redis server | Requires NATS server | Requires Kafka cluster |
| **Cross-process wake latency** | 1-2ms | ~0.1ms | ~0.1ms | ~1-5ms |
| **Cross-node distribution** | None (single machine) | Pub/Sub is fan-out to connected clients | JetStream supports clustering | Built for distributed |
| **Dependency** | SQLite (already in your stack) | Additional server | Additional server | Additional cluster |
| **Schema coupling** | Same file as business data — dual-write impossible | Separate system — dual-write risk | Separate system — dual-write risk | Separate system — dual-write risk |
| **Language support** | Python, Node, Rust, Go, Ruby, Bun, Elixir, C++, .NET, JVM, Kotlin | Many (but protocol, not SQL) | 40+ client libraries | Many client libraries |
| **Dead-letter queue** | Built-in `_honker_dead` | None | JetStream has DLQ | DLQ via configuration |
**When honker is the right choice**: SQLite is already your primary datastore, and you need pub/sub + queue + scheduling without introducing Redis/Celery/NATS. The dual-write problem between your business tables and the queue disappears.
**When honker is NOT the right choice**: Multi-node deployments, multi-writer sharding, cross-datacenter replication, or workloads that exceed single-machine throughput.
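
The dual-write point can be made concrete with a small sketch: when the business row and the job row live in the same SQLite file, a single transaction covers both, so a failure before commit leaves neither behind. The `orders` and `jobs` tables and the `place_order` helper are hypothetical stand-ins, not honker's schema or API.

```python
import json
import sqlite3

# Hypothetical tables standing in for business data and a job queue.
SCHEMA = """
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL);
CREATE TABLE jobs (id INTEGER PRIMARY KEY, queue TEXT NOT NULL, payload TEXT NOT NULL);
"""


def place_order(conn, user_id, fail=False):
    """Insert the order and its follow-up job in one transaction."""
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO orders (user_id) VALUES (?)", (user_id,))
        conn.execute(
            "INSERT INTO jobs (queue, payload) VALUES ('emails', ?)",
            (json.dumps({"user_id": user_id}),),
        )
        if fail:
            raise RuntimeError("simulated failure before commit")


def counts(conn):
    """Return (number of orders, number of jobs)."""
    orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    jobs = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
    return orders, jobs
```

With a separate broker, the equivalent code would need an outbox table or a compensating action to cover the window between the database commit and the broker publish.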
---
## 9. Relevance to Alknet
### 9.1 Alignment with Event Boundary Discipline (ADR-032)
ADR-032 defines three communication layers:
```
Call Protocol (Layer 3, external, JSON)
  └── irpc Service (Layer 3, internal, postcard)
        └── Honker Streams (Domain events, within service boundary)
```
**Honker's single-machine model is exactly right for the bottom layer.** Domain events in alknet are internal to the service that owns that data — `nodes:created`, `edges:deleted`, `accounts:updated`. These never cross the service boundary without projection into a call protocol `EventEnvelope`.
The integration plan (Phase 2.2) explicitly lists honker integration patterns for alknet-storage:
| Feature | Use Case |
|---------|----------|
| `stream_publish` / `subscribe` | Durable pub/sub for node/edge/membership changes |
| `notify` / `listen` | Ephemeral pub/sub for real-time control channel events |
| `queue` / `claim` / `ack` | Task queue for async operations |
### 9.2 Patterns from Honker for alknet-storage Adoption
**Map honker's primitives to alknet-storage's internal events**:
| Alknet Domain Event | Honker Primitive | Stream Name |
|---------------------|------------------|-------------|
| Node created | `stream.publish("nodes:created", ...)` | `nodes:created` |
| Node updated | `stream.publish("nodes:updated", ...)` | `nodes:updated` |
| Node deleted | `stream.publish("nodes:deleted", ...)` | `nodes:deleted` |
| Edge created | `stream.publish("edges:created", ...)` | `edges:created` |
| Account updated | `stream.publish("accounts:updated", ...)` | `accounts:updated` |
| ACL rule changed | `stream.publish("acl:changed", ...)` | `acl:changed` |
**Map honker's task queue to alknet's async operations**:
| Alknet Async Task | Honker Queue |
|-------------------|-------------|
| Key rotation | `queue("key-rotation")` |
| Certificate renewal | `queue("cert-renewal")` |
| Audit log archival | `queue("audit-archival")` |
| Node encryption/decryption | `queue("node-crypto")` |
**Map honker's notify/listen to real-time events**:
| Alknet Real-Time Event | Honker Channel |
|------------------------|---------------|
| SSH connection opened | `notify("ssh:connected", ...)` |
| Config reload triggered | `notify("config:reload", ...)` |
| Forwarding rule activated | `notify("forwarding:activated", ...)` |
### 9.3 Replicating Honker Patterns with Postgres for Production
If alknet-storage is backed by Postgres in production deployments (the storage spec mentions `rusqlite` but leaves room for alternative backends), the following Postgres equivalents would be needed:
| Honker Primitive | Postgres Equivalent | What's Lost |
|-----------------|---------------------|-------------|
| `notify/listen` | `pg_notify` + `LISTEN` | Postgres NOTIFY is ephemeral (lost on restart). Honker's table-backed notifications persist. Need to add a `_notifications` table and polling. |
| `stream_publish/subscribe` | `pg_notify` + consumer offset table | No built-in per-consumer offset tracking. Would need a `_stream_consumers` table and polling/cursor logic. |
| `queue/claim/ack` | pg-boss / Oban | These exist and are production-quality. Honker's simplicity (one table, partial index) is lost. Need a dependency on Oban or pg-boss. |
| `run_at` (delayed jobs) | Oban's `scheduled_at` / pg-boss's `startAfter` | Available in both. |
| `claim_expires_at` (visibility timeout) | Oban's `attempted_at` + `max_attempts` | Available in both. |
| `honker_lock_acquire/release` | `pg_advisory_lock` | Built-in, similar concept. |
| `honker_rate_limit_try` | Custom table or Redis | Postgres has no built-in rate limiting. |
| Transactional coupling | Same tx | Naturally available: `INSERT INTO orders ...; INSERT INTO _honker_live ...;` both in the same Postgres tx. |
| Scheduler | pg-boss `schedule()` or Oban's `Oban.insert(CronWorker, ...)` | Available in both. |
**What would be lost switching to Postgres + pg-boss/Oban**:
- **Schema simplicity**: Honker uses 2 tables for 90% of queue operations. pg-boss uses more tables. Oban uses per-queue tables.
- **Zero-dependency**: Honker is a SQLite extension. No Redis, no Celery, no broker. pg-boss requires Postgres. Oban requires Postgres + Elixir.
- **Cross-language transparency**: Any SQLite client can `SELECT load_extension('honker')` and get the same features. Postgres requires language-specific client libraries.
- **File-based deployment**: Copy the `.db` file. Done. Postgres requires a server.
**Recommendation for alknet-storage**: Start with honker on SQLite for self-hosted/edge deployments. For production Postgres deployments, create an abstraction layer in `alknet-storage` that implements the same `EventStream`, `TaskQueue`, and `NotificationChannel` traits against both backends. The honker-on-SQLite implementation is the reference; the Postgres implementation uses `pg_notify` + offset tables + Oban/pg-boss.
### 9.4 Honker's Queue/Claim Model and alknet's Call Protocol
The call protocol's `EventEnvelope` frames are the integration boundary (ADR-033). When a domain event needs to cross node boundaries, it must be projected:
```
Honker stream event (internal)
→ Projection function
→ EventEnvelope frame (external, call protocol)
→ Transported over SSH/QUIC/DNS
→ Received by remote node
→ May trigger local Honker stream event on remote node
```
The **queue/claim model maps to async call protocol operations**:
1. **call.requested** → Honker `queue.enqueue({"operation": "/head/auth/verify", "input": {...}})`
2. **Worker claims the job** → Like a worker process picking up a call request
3. **job.ack()** → call.responded with the result
4. **job.retry()** → Call timeout / retry logic (but this is at the transport layer, not the queue)
5. **job fails → _honker_dead** → Dead letter equivalent for failed call protocol operations
The **key difference**: alknet's call protocol is synchronous request-response at the transport layer, while honker's queue is async at-least-once. They serve different purposes:
- **Call protocol**: "I need you to verify this pubkey NOW" (synchronous, cross-node)
- **Honker queue**: "Process this key rotation in the background" (asynchronous, within-node)
For **cross-node task distribution**, honker's queue should NOT be the transport. Instead:
1. A domain event (honker stream) in the storage service triggers a projection
2. The projection creates an `EventEnvelope` frame
3. The call protocol delivers it to remote nodes
4. Remote nodes may enqueue it into their own honker queues for local processing
### 9.5 Cross-Node Event Distribution
**Honker is single-node by design.** This is correct for alknet's architecture because:
1. **Domain events stay within the service boundary** (ADR-032). Honker streams are for internal state reconstruction, not cross-node distribution.
2. **Integration events cross boundaries via the call protocol.** When a domain event in the storage service needs to be communicated to another node, it's projected into an `EventEnvelope` frame and sent over the wire.
3. **Each node has its own `.db` file** with its own honker streams. This is a feature, not a limitation — it enforces the event boundary discipline.
The bridge pattern:
```
Node A (storage service):
  1. Business write INSERTs into SQLite
  2. stream.publish("nodes:created", {node_id: 42}) in same tx
  3. A local subscriber detects the event
  4. Projects it into EventEnvelope {operation: "/head/nodes/created", data: {node_id: 42}}
  5. Sends via call protocol over SSH/QUIC/DNS to Node B

Node B (receiver):
  1. Receives EventEnvelope via call protocol
  2. Enqueues locally: queue("incoming-events").enqueue({source: "node-A", event: ...})
  3. Or publishes locally: stream.publish("remote:nodes:created", {node_id: 42})
```
This preserves the three-layer model while respecting honker's single-machine design.
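The projection step on Node A might look roughly like the sketch below. The `EventEnvelope` shape is taken from the example above (`operation` plus `data`); the real frame type is defined by the call protocol and may carry more fields. The `PROJECTIONS` table and the `project` function are hypothetical names used only to show the boundary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventEnvelope:
    """Assumed minimal shape of an integration event frame."""
    operation: str
    data: dict


# Hypothetical allow-list: only these domain topics may leave the service.
PROJECTIONS = {
    "nodes:created": "/head/nodes/created",
}


def project(topic, payload):
    """Map an internal stream event to an envelope, or None if it stays internal."""
    operation = PROJECTIONS.get(topic)
    if operation is None:
        return None  # domain event never crosses the service boundary
    # Copy only the fields that are part of the public contract.
    return EventEnvelope(operation=operation, data={"node_id": payload["node_id"]})
```

Keeping the mapping explicit means internal payload fields cannot leak to other nodes by accident, and adding a new cross-node event is a deliberate one-line change.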
### 9.6 Honker Patterns and Integration Plan Mapping
The integration plan (Phase 2.2, alknet-storage) references these honker patterns. Here's the direct mapping:
| Plan Reference | Honker Primitive | Implementation Notes |
|---------------|-----------------|---------------------|
| `stream_publish/subscribe` | `db.stream("topic").publish(data, tx=tx)` + `async for event in stream.subscribe(consumer="name")` | Used for domain events within alknet-storage. Each metagraph change (node/edge created/updated/deleted) publishes to a stream. Consumers (local reactive logic, SSE endpoints) subscribe. |
| `notify/listen` | `tx.notify("channel", data)` + `async for n in db.listen("channel")` | Used for ephemeral real-time signals. SSH connection events, config reload triggers, forwarding rule activation. No persistence needed. |
| `queue/claim` | `queue.enqueue(data, tx=tx)` + `async for job in queue.claim(worker_id)` | Used for background tasks. Key rotation, certificate renewal, audit log archival, batch operations. The `tx=tx` parameter ensures atomicity with business writes. |
**Implementation approach for alknet-storage (Rust)**:
Use `honker-core` directly (not the Python binding). The Rust crate exposes:
- `open_conn(path, install_notify)` — open a connection with PRAGMA defaults
- `attach_honker_functions(&conn)` — register all SQL functions
- `bootstrap_honker_schema(&conn)` — create tables
- `SharedUpdateWatcher::new(db_path)` — the wake listener
Or load the extension via `rusqlite`:
```rust
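use rusqlite::Connection;
let conn = Connection::open("alknet.db")?;
// Needs rusqlite's `load_extension` feature; loading native code is unsafe.
unsafe {
    conn.load_extension_enable()?;
    conn.load_extension("libhonker_ext", None::<&str>)?;
}
conn.execute_batch("SELECT honker_bootstrap()")?;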
```
### 9.7 Rust Integration: honker-core vs honker-rs
Two options for Rust integration in alknet-storage:
**Option A: Use `honker-core` directly**
The `honker-core` crate provides:
- `attach_notify(&conn)` — `_honker_notifications` table + `notify()` SQL function
- `attach_honker_functions(&conn)` — all `honker_*` SQL functions
- `bootstrap_honker_schema(&conn)` — all table/index creation
- `SharedUpdateWatcher` — the wake mechanism
- `open_conn(path, install_notify)` — connection factory with PRAGMA defaults
This gives you raw SQL access. You call `conn.query_row("SELECT honker_enqueue(…)")` etc. Maximum control, minimum abstraction.
**Option B: Use `honker-rs` ergonomic wrapper**
The `packages/honker-rs` crate provides:
- `Database::open(path)` — opens the database file at `path` (e.g. `system.db`)
- `db.queue("name")` — `Queue` handle with `.enqueue()`, `.claim_batch()`, `.ack_batch()`
- `db.stream("name")` — `Stream` handle with `.publish()`, `.subscribe()`
- `db.listen("channel")` — async listener
- `db.outbox("name", delivery_fn)` — outbox pattern
- `db.lock("name", owner, ttl)` — named lock
- `db.scheduler()` — cron scheduler
**Recommendation**: Start with `honker-core` + direct SQL. The schema and functions are stable and well-tested. Wrap in application-level methods as needed. `honker-rs` may not expose all features (e.g., the scheduler pause/resume/list/update functions added in Phase Mantle). Using `honker-core` gives maximum flexibility while maintaining a single source of truth for SQL behavior.
---
## 10. Open Questions for Alknet
1. **Should alknet-storage bundle honker as a Rust crate dependency, or load the extension at runtime?**
- Bundling `honker-core` gives compile-time verification. Loading the extension requires shipping `libhonker_ext.so/.dylib/.dll` alongside the binary.
- Recommendation: Bundle `honker-core` as a crate dependency for the Rust implementation. Extension loading is for language bindings that can't link Rust code directly.
2. **Should the `alknet-storage` crate depend on `honker` (the Python package) or `honker-core` (the Rust rlib)?**
- `honker-core` (Rust rlib) — correct choice for a Rust crate. `honker` is the Python binding.
- The crate dependency in storage.md currently lists `honker = "0.x"`; it should be `honker-core = "0.2"`.
3. **How does the Rust `SharedUpdateWatcher` integrate with tokio?**
- `SharedUpdateWatcher::subscribe()` returns a `std::sync::mpsc::Receiver<()>`, which is blocking. For tokio integration, wrap in `tokio::task::spawn_blocking` or use `tokio::sync::mpsc` as a bridge.
- Alternatively, use `UpdateWatcher::spawn()` directly and convert ticks to tokio notifications.
4. **Should alknet-storage abstract over honker-specific table names?**
- Honker prefixes all internal tables with `_honker_` (e.g., `_honker_live`, `_honker_stream`). Alknet-storage should treat these as honker's internal schema and not directly query them for application logic.
- Application-level tables (like `nodes`, `edges`, `accounts`) should use their own namespacing convention. Honker's tables coexist in the same `.db` file.
5. **Multi-tenant support**: Honker queues and streams are identified by name strings (e.g., `"emails"`, `"user-events"`). For alknet's multi-tenant model (system DB vs tenant DB), each tenant gets its own `.db` file with its own honker tables. Cross-tenant events must go through the call protocol — never by direct honker stream subscription across database files.
6. **Database file management**: Alknet-storage's system DB (`system.db`) and tenant DBs (`tenant-{orgId}.db`) should each have their own honker instance. The `SharedUpdateWatcher` is per-database, so 100 active tenants = 100 poll threads. This is fine for the expected alknet deployment size, but worth monitoring thread count in large deployments.
---
## 11. License and Maturity
- **License**: Apache 2.0 OR MIT (dual-licensed). Fully permissive for integration.
- **Maturity**: Alpha software (noted in README). Better than experimental but not beta-quality yet.
- **Status**: Active development. Regular commits. Cross-language interop tests. 180+ Python tests, 12+ Rust tests. Crash recovery verified. 600-second soak test under sustained writes.
- **Breaking changes risk**: The project is pre-1.0. Some table names still reference "joblite" and "litenotify" in the CHANGELOG (historical names). Current names use `_honker_` prefix. The API surface is stabilizing but may change.
- **Recommendation**: Pin to a specific `honker-core` version in `alknet-storage`'s `Cargo.toml`. The schema migration path (seen in `bootstrap_honker_schema`'s ALTER TABLE for `enabled` column) shows the project handles migrations.
---
## References
- [Honker GitHub Repository](https://github.com/russellromney/honker) — Primary source for all code and documentation
- [Honker README](https://github.com/russellromney/honker/blob/main/README.md) — Feature overview, quick start, architecture, performance
- [Honker BINDINGS.md](https://github.com/russellromney/honker/blob/main/BINDINGS.md) — Language binding support matrix
- [Honker ROADMAP.md](https://github.com/russellromney/honker/blob/main/ROADMAP.md) — Future work phases, planned features (singleton/dedup, state events, queue stats, per-queue config)
- [Honker CHANGELOG.md](https://github.com/russellromney/honker/blob/main/CHANGELOG.md) — Detailed history of all changes, performance passes, and architecture decisions
- [Honker honker-core/src/lib.rs](https://github.com/russellromney/honker/blob/main/honker-core/src/lib.rs) — Core Rust implementation: Writer, Readers, UpdateWatcher, SharedUpdateWatcher, schema, PRAGMAs
- [Honker honker-core/src/honker_ops.rs](https://github.com/russellromney/honker/blob/main/honker-core/src/honker_ops.rs) — All SQL function implementations: enqueue, claim, ack, retry, stream, lock, rate limit, scheduler
- [Honker honker-extension/src/lib.rs](https://github.com/russellromney/honker/blob/main/honker-extension/src/lib.rs) — Loadable extension entry point and C ABI for watcher
- [alknet ADR-032: Event Boundary Discipline](../../architecture/decisions/032-event-boundary-discipline.md) — Domain events stay within service boundary
- [alknet Integration Plan](../../research/integration-plan.md) — Phase 2.2: alknet-storage honker integration
- [alknet Storage Spec](../../architecture/storage.md) — alknet-storage crate design and honker integration table

@@ -1,170 +0,0 @@
# async-nats: Overview & Architecture
**Crate**: `async-nats`
**Version**: 0.49.1
**Repository**: https://github.com/nats-io/nats.rs
**License**: Apache-2.0
**Rust Edition**: 2021
**MSRV**: 1.88.0
**Async Runtime**: Tokio
## What is async-nats?
`async-nats` is the official async Rust client for the [NATS messaging system](https://nats.io). It provides a Tokio-based asynchronous interface to NATS server features including:
- **Core NATS** — publish/subscribe, request/reply, queue groups
- **JetStream** — persistent stream-based messaging with at-least-once and exactly-once semantics
- **Key-Value Store** — KV abstraction built on JetStream streams
- **Object Store** — large-object storage built on JetStream streams
- **Service API** — microservice request/reply pattern with built-in PING/INFO/STATS verbs
The crate is positioned as the **core client** in the NATS Rust ecosystem. A separate project, [Orbit](https://github.com/synadia-io/orbit.rs), provides higher-level opinionated abstractions on top.
```
┌──────────────────────────────────────────────────────┐
│                   Application code                   │
└──────────────┬───────────────────────────┬───────────┘
               │                           │
               ▼                           ▼
     ┌───────────────────┐       ┌───────────────────┐
     │ Orbit crates      │ uses  │ async-nats (core) │
     │ (opinionated,     │──────▶│ (parity, stable,  │
     │ per-crate semver) │       │ protocol-level)   │
     └───────────────────┘       └─────────┬─────────┘
                                    ┌─────────────┐
                                    │ nats-server │
                                    └─────────────┘
```
## Feature Flags
Cargo feature flags control which subsystems are compiled:
| Feature | Default | Description |
|---------|---------|-------------|
| `jetstream` | ✅ | JetStream API (streams, consumers, publish) |
| `kv` | ✅ | Key-Value store (depends on `jetstream`) |
| `object-store` | ✅ | Object store (depends on `jetstream` + `crypto`) |
| `service` | ✅ | Service API (microservice pattern) |
| `nkeys` | ✅ | NKey/JWT authentication |
| `nuid` | ✅ | NUID-based unique ID generation |
| `crypto` | ✅ | Cryptographic primitives (SHA-256 for object store) |
| `websockets` | ✅ | WebSocket transport (`ws://`/`wss://`) |
| `ring` | ✅ | Use `ring` as TLS crypto backend |
| `aws-lc-rs` | ❌ | Use `aws-lc-rs` as TLS crypto backend |
| `fips` | ❌ | FIPS 140-2 compliant via `aws-lc-rs` |
| `chrono` | ❌ | Use `chrono` instead of `time` for datetime types |
| `server_2_10` | ✅ | Server 2.10+ features |
| `server_2_11` | ✅ | Server 2.11+ features |
| `server_2_12` | ✅ | Server 2.12+ features |
| `server_2_14` | ✅ | Server 2.14+ features |
| `experimental` | ❌ | Experimental features |
## Source Structure
```
async-nats/src/
├── lib.rs              # Entry point: connect(), ServerInfo, Command, ClientOp, ServerOp,
│                         ConnectionHandler, Subscriber, Event, ServerAddr, ConnectInfo
├── client.rs           # Client struct, publish/subscribe/request/drain/flush APIs,
│                         Request builder, Statistics, trait definitions
├── connection.rs       # Framed connection: NATS protocol parser/serializer,
│                         read/write buffer management, WebSocket adapter
├── connector.rs        # Server pool, reconnection logic, TLS setup, DNS resolution,
│                         authentication handshake
├── options.rs          # ConnectOptions builder, auth methods, TLS config, callbacks
├── auth.rs             # Auth struct (username, password, token, JWT, nkey, signature)
├── auth_utils.rs       # Credentials file parsing (JWT + NKey seed)
├── message.rs          # Message (inbound), OutboundMessage (outbound)
├── header.rs           # HeaderMap, HeaderName, HeaderValue (NATS headers)
├── subject.rs          # Subject type, ToSubject trait, SubjectError
├── status.rs           # StatusCode enum (NATS status codes)
├── error.rs            # Generic Error<K> type used throughout
├── datetime.rs         # DateTime type (time or chrono backend)
├── id_generator.rs     # Unique ID generation (NUID or rand fallback)
├── tls.rs              # TLS configuration helper
├── crypto.rs           # SHA-256 for object store integrity
├── jetstream/
│   ├── mod.rs          # Module entry: new(), with_domain(), with_prefix()
│   ├── context.rs      # Context: JetStream API (streams, consumers, KV, OS, publish)
│   ├── stream.rs       # Stream handle, Config, Info, purge/delete/message ops
│   ├── consumer/
│   │   ├── mod.rs      # Consumer trait, Info, Config base
│   │   ├── pull.rs     # PullConsumer: batch fetch, sequence, messages stream
│   │   └── push.rs     # PushConsumer: Ordered push consumer with auto-recreate
│   ├── publish.rs      # PublishAck, PublishAckFuture, PublishMessage builder
│   ├── message.rs      # JetStream Message (with ack methods), AckKind
│   ├── response.rs     # Response<T> (Ok/Err) for JetStream API calls
│   ├── errors.rs       # ErrorCode, Error for JetStream
│   ├── account.rs      # Account info
│   ├── kv/
│   │   ├── mod.rs      # Store: put/get/delete/purge/watch/history/keys
│   │   └── bucket.rs   # Bucket Status
│   └── object_store/
│       └── mod.rs      # ObjectStore: put/get/delete/watch/list/seal, Object (AsyncRead)
└── service/
    ├── mod.rs          # Service, ServiceBuilder, Group, EndpointBuilder, Request
    └── endpoint.rs     # Endpoint stream, Stats, Info
```
## Architecture: Core Connection Model
The client uses a **single-connection, actor-model** design:
```
                  ┌────────────────────────────────────────┐
Client (clone) ──▶│ mpsc::Sender<Command>                  │
(many handles)    │ (bounded channel)                      │
                  └────────────┬───────────────────────────┘
                  ┌────────────────────────────────────────┐
                  │ ConnectionHandler (tokio::task)        │
                  │ - Receives Command from channel        │
                  │ - Converts to ClientOp                 │
                  │ - Manages subscriptions map            │
                  │ - Manages multiplexer (request/reply)  │
                  │ - Pings server on interval             │
                  │ - Handles reconnection                 │
                  └────────────┬───────────────────────────┘
                  ┌────────────────────────────────────────┐
                  │ Connection (framed TCP/TLS/WS)         │
                  │ - Protocol parser (try_read_op)        │
                  │ - Write buffer (VecDeque<Bytes>)       │
                  │ - Vectored I/O support                 │
                  │ - Read buffer (BytesMut)               │
                  └────────────┬───────────────────────────┘
                          nats-server
```
### Key Design Decisions
1. **Cloneable Client**: `Client` is `Clone` (via `mpsc::Sender` clone), enabling shared use across tasks
2. **Single TCP connection**: All traffic (Core NATS, JetStream API, etc.) multiplexes over one connection
3. **Background task**: `ConnectionHandler` runs as a spawned Tokio task, bridging the mpsc channel to the TCP stream
4. **Automatic reconnection**: On disconnect, `Connector` retries servers from the pool with exponential backoff
5. **Subscription rehydration**: On reconnect, all active subscriptions are re-subscribed with adjusted `max` counts
6. **Multiplexer for request/reply**: A single wildcard subscription (`_INBOX.<id>.*`) multiplexes all pending request/reply correlations
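The cloneable-handle and background-task decisions above can be sketched with the standard library alone. This is an illustration under stated assumptions, not async-nats code: it substitutes `std::sync::mpsc::sync_channel` and an OS thread for the Tokio channel and task, and `Command`, `Handle`, and `spawn_handler` are hypothetical names.

```rust
use std::sync::mpsc::{self, SyncSender};
use std::thread::{self, JoinHandle};

/// Hypothetical command type, standing in for the crate's internal `Command`.
enum Command {
    Publish { subject: String, payload: Vec<u8> },
    Close,
}

/// Hypothetical client handle: cloning it only clones the channel sender.
#[derive(Clone)]
struct Handle {
    sender: SyncSender<Command>,
}

impl Handle {
    fn publish(&self, subject: &str, payload: &[u8]) {
        // Ignore the error if the handler has already exited.
        let _ = self.sender.send(Command::Publish {
            subject: subject.to_owned(),
            payload: payload.to_vec(),
        });
    }

    fn close(&self) {
        let _ = self.sender.send(Command::Close);
    }
}

/// Spawns the single background handler. It owns the (simulated) wire and
/// returns every control line it wrote once it is told to close.
fn spawn_handler(capacity: usize) -> (Handle, JoinHandle<Vec<String>>) {
    // Bounded channel: senders block when `capacity` commands are queued.
    let (sender, receiver) = mpsc::sync_channel(capacity);
    let join = thread::spawn(move || {
        let mut wire = Vec::new();
        for command in receiver {
            match command {
                Command::Publish { subject, payload } => {
                    wire.push(format!("PUB {} {}", subject, payload.len()));
                }
                Command::Close => break,
            }
        }
        wire
    });
    (Handle { sender }, join)
}
```

Every clone of `Handle` feeds the same handler, so all traffic is serialized onto one simulated connection regardless of how many handles exist.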
## Dependencies (Key)
| Crate | Purpose |
|-------|---------|
| `tokio` | Async runtime, TCP, time, sync, io-util |
| `bytes` | Efficient byte buffer (`Bytes`, `BytesMut`) |
| `tokio-rustls` | TLS via rustls |
| `rustls-native-certs` | Load system root certificates |
| `serde` / `serde_json` | JSON serialization for JetStream API |
| `futures-util` | Stream trait, Sink trait, StreamExt |
| `tracing` | Structured logging |
| `thiserror` | Error derive macros |
| `memchr` | Fast substring search for protocol parsing |
| `portable-atomic` | Atomic types, with a fallback on targets that lack native 64-bit atomics |
| `tokio-util` | `PollSender` for Sink implementation |
| `tokio-stream` | `ReceiverStream` adapter |

View File

@@ -1,404 +0,0 @@
# async-nats: Key Types & Traits
## Core Types
### `Client`
The primary handle to a NATS connection. Cheaply cloneable (wraps `mpsc::Sender<Command>`).
```rust
#[derive(Clone, Debug)]
pub struct Client {
    info: tokio::sync::watch::Receiver<Option<ServerInfo>>,
    state: tokio::sync::watch::Receiver<State>,
    sender: mpsc::Sender<Command>,
    poll_sender: PollSender<Command>,
    next_subscription_id: Arc<AtomicU64>,
    subscription_capacity: usize,
    inbox_prefix: Arc<str>,
    request_timeout: Option<Duration>,
    max_payload: Arc<AtomicUsize>,
    connection_stats: Arc<Statistics>,
    skip_subject_validation: bool,
}
```
**Key methods**:
- `publish(subject, payload)` — fire-and-forget publish
- `publish_with_headers(subject, headers, payload)` — publish with NATS headers
- `publish_with_reply(subject, reply, payload)` — publish with reply-to subject
- `subscribe(subject)` → `Subscriber` — subscribe to a subject
- `queue_subscribe(subject, queue_group)` → `Subscriber` — queue group subscription
- `request(subject, payload)` → `Message` — request/reply with default timeout
- `send_request(subject, request)` → `Message` — request with custom `Request` builder
- `flush()` — wait until all buffered writes are flushed to the server
- `drain()` — drain all subscriptions, flush, then close
- `force_reconnect()` — force a reconnection (e.g., to re-trigger auth)
- `new_inbox()` — generate a unique inbox subject (`_INBOX.<id>`)
- `server_info()` → `ServerInfo` — last known server info
- `connection_state()` → `State`: `Pending`/`Connected`/`Disconnected`
- `statistics()` → `Arc<Statistics>` — connection statistics (bytes, messages, connects)
- `max_payload()` → `usize` — server's max payload size
- `set_server_pool(addrs)` — replace the server pool for reconnection
- `server_pool()` — snapshot of current server pool
### `Subscriber`
A `Stream` yielding `Message` values from a subscription.
```rust
#[derive(Debug)]
pub struct Subscriber {
    sid: u64,
    receiver: mpsc::Receiver<Message>,
    sender: mpsc::Sender<Command>,
}
```
Implements `futures_util::Stream<Item = Message>`. Methods:
- `unsubscribe()` — immediately unsubscribe
- `unsubscribe_after(n)` — unsubscribe after `n` total delivered messages
- `drain()` — unsubscribe after in-flight messages are delivered
**Drop behavior**: When a `Subscriber` is dropped, it spawns a task to send `Command::Unsubscribe` to the connection handler, ensuring the server is always notified.
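The drop behavior can be illustrated with a small std-only sketch. It is an approximation: the real crate spawns a task because its channel send is async, whereas here a synchronous channel send stands in for that, and `Subscription`, `Command`, and `drop_and_collect` are hypothetical names.

```rust
use std::sync::mpsc::{self, Receiver, Sender};

/// Hypothetical stand-in for the crate's internal `Command`.
#[derive(Debug, PartialEq)]
enum Command {
    Unsubscribe { sid: u64 },
}

/// Hypothetical subscriber that notifies the handler when dropped.
struct Subscription {
    sid: u64,
    sender: Sender<Command>,
}

impl Drop for Subscription {
    fn drop(&mut self) {
        // Best effort: if the handler is gone there is nobody left to notify.
        let _ = self.sender.send(Command::Unsubscribe { sid: self.sid });
    }
}

/// Creates a subscription, drops it, and returns what the handler side received.
fn drop_and_collect(sid: u64) -> Vec<Command> {
    let (sender, receiver): (Sender<Command>, Receiver<Command>) = mpsc::channel();
    {
        let _subscription = Subscription { sid, sender };
        // `_subscription` goes out of scope here, which runs `Drop`.
    }
    receiver.try_iter().collect()
}
```

Putting the notification in `Drop` means callers cannot forget to unsubscribe, at the cost of doing channel work during destruction.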
### `Message`
An inbound NATS message:
```rust
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct Message {
    pub subject: Subject,
    pub reply: Option<Subject>,
    pub payload: Bytes,
    pub headers: Option<HeaderMap>,
    pub status: Option<StatusCode>,
    pub description: Option<String>,
    pub length: usize,
}
```
### `OutboundMessage`
An outbound message for publishing (no status/description):
```rust
#[derive(Clone, Debug)]
pub struct OutboundMessage {
    pub subject: Subject,
    pub reply: Option<Subject>,
    pub payload: Bytes,
    pub headers: Option<HeaderMap>,
}
```
### `Request`
Builder for request/reply calls:
```rust
#[derive(Default)]
pub struct Request {
    pub payload: Option<Bytes>,
    pub headers: Option<HeaderMap>,
    pub timeout: Option<Option<Duration>>,
    pub inbox: Option<String>,
}
```
Builder methods: `payload()`, `headers()`, `timeout()`, `inbox()`. The `inbox` field, when set, bypasses the multiplexer and uses a dedicated subscription instead.
### `ServerInfo`
Server metadata received during connection handshake:
```rust
#[derive(Debug, Deserialize, Default, Clone, Eq, PartialEq)]
pub struct ServerInfo {
    pub server_id: String,
    pub server_name: String,
    pub host: String,
    pub port: u16,
    pub version: String,
    pub auth_required: bool,
    pub tls_required: bool,
    pub max_payload: usize,
    pub proto: i8,
    pub client_id: u64,
    pub go: String,
    pub nonce: String,
    pub connect_urls: Vec<String>,
    pub client_ip: String,
    pub headers: bool,
    pub lame_duck_mode: bool,
    pub cluster: Option<String>,
    pub domain: Option<String>,
    pub jetstream: bool,
}
```
### `ConnectInfo`
Client → server `CONNECT` message payload:
```rust
#[derive(Clone, Debug, Serialize)]
pub struct ConnectInfo {
    pub verbose: bool,
    pub pedantic: bool,
    pub user_jwt: Option<String>,
    pub nkey: Option<String>,
    pub signature: Option<String>,
    pub name: Option<String>,
    pub echo: bool,
    pub lang: String,
    pub version: String,
    pub protocol: Protocol, // Original(0) or Dynamic(1)
    pub tls_required: bool,
    pub user: Option<String>,
    pub pass: Option<String>,
    pub auth_token: Option<String>,
    pub headers: bool,
    pub no_responders: bool,
}
```
The client always sets: `verbose=false`, `pedantic=false`, `lang="rust"`, `protocol=Dynamic`, `headers=true`, `no_responders=true`.
### `Statistics`
Atomic connection statistics (shared via `Arc`):
```rust
#[derive(Default, Debug)]
pub struct Statistics {
    pub in_bytes: AtomicU64,
    pub out_bytes: AtomicU64,
    pub in_messages: AtomicU64,
    pub out_messages: AtomicU64,
    pub connects: AtomicU64,
}
```
## Subject Types
### `Subject`
A validated NATS subject string (newtype over `String`):
```rust
// Usage:
let subject: Subject = "foo.bar.baz".into();
```
### `ToSubject` trait
Conversion trait for subjects:
```rust
pub trait ToSubject {
    fn to_subject(self) -> Result<Subject, SubjectError>;
}
```
Implemented for `&str`, `String`, `Subject` directly.
### `SubjectError`
```rust
pub enum SubjectError {
    InvalidFormat,
}
```
## Header Types
### `HeaderMap`
A multimap of header name → values:
```rust
pub struct HeaderMap {
    inner: VecMap<HeaderName, Vec<HeaderValue>>,
}
```
Methods: `insert()`, `append()`, `get()`, `len()`, `is_empty()`, `iter()`, `to_bytes()`.
### `HeaderName`
Case-insensitive header name. Created via `FromStr`:
```rust
let name: HeaderName = "Nats-Expected-Last-Subject-Sequence".parse()?;
```
### `HeaderValue`
Header value string. Created via `FromStr` or `From<u64>`:
```rust
let val: HeaderValue = "some value".parse()?;
let val: HeaderValue = HeaderValue::from(42u64);
```
## Server Address Types
### `ServerAddr`
Wraps a `url::Url` with NATS-specific validation. Supports schemes: `nats://`, `tls://`, `ws://`, `wss://`. Default port is `4222`.
```rust
let addr: ServerAddr = "demo.nats.io".parse()?;
let addr: ServerAddr = "nats://demo.nats.io:4222".parse()?;
let addr: ServerAddr = "tls://demo.nats.io".parse()?;
```
### `ToServerAddrs` trait
Flexible server address input (single URL, `Vec`, slice, etc.):
```rust
pub trait ToServerAddrs {
    type Iter: Iterator<Item = ServerAddr>;
    fn to_server_addrs(&self) -> io::Result<Self::Iter>;
}
```
### `Server`
Metadata about a server in the pool:
```rust
pub struct Server {
    pub addr: ServerAddr,
    pub failed_attempts: usize,
    pub did_connect: bool,
    pub is_discovered: bool,
    pub last_error: Option<String>,
}
```
## Event & State Types
### `Event`
Asynchronous notifications from the connection:
```rust
pub enum Event {
    Connected,
    Disconnected,
    LameDuckMode,
    Draining,
    Closed,
    SlowConsumer(u64), // subscription sid
    ServerError(ServerError),
    ClientError(ClientError),
}
```
Received via `ConnectOptions::event_callback()`.
### `State`
Connection state observable via `watch::Receiver`:
```rust
pub enum State {
    Pending,
    Connected,
    Disconnected,
}
```
### `StatusCode`
NATS protocol status codes (e.g., `NO_RESPONDERS = 404`, `TIMEOUT = 408`).
## Error Types
All error types follow the pattern `Error<Kind>` from `crate::error`:
| Error Type | Kind | Used By |
|------------|------|---------|
| `ConnectError` | `ConnectErrorKind` | Connection establishment |
| `PublishError` | `PublishErrorKind` | Publish operations |
| `RequestError` | `RequestErrorKind` | Request/reply |
| `SubscribeError` | `SubscribeErrorKind` | Subscribe |
| `FlushError` | `FlushErrorKind` | Flush |
| `DrainError` | — | Drain |
### `ConnectErrorKind`
```rust
pub enum ConnectErrorKind {
    ServerParse,            // URL parsing failed
    Dns,                    // DNS resolution failed
    Authentication,         // Auth signing failed
    AuthorizationViolation, // Server rejected auth
    TimedOut,               // Connection handshake timeout
    Tls,                    // TLS error
    Io,                     // Other I/O error
    MaxReconnects,          // Exceeded max reconnect attempts
}
```
## Trait Definitions
The `client::traits` module defines abstract interfaces:
```rust
// Signatures are abbreviated: parameter types and generic bounds are omitted.
pub trait Publisher {
    fn publish_with_reply(&self, subject, reply, payload) -> Future<Output = Result<(), PublishError>>;
    fn publish_message(&self, msg: OutboundMessage) -> Future<Output = Result<(), PublishError>>;
}

pub trait Subscriber {
    fn subscribe(&self, subject) -> Future<Output = Result<crate::Subscriber, SubscribeError>>;
}

pub trait Requester {
    fn send_request(&self, subject, request: Request) -> Future<Output = Result<Message, RequestError>>;
}

pub trait TimeoutProvider {
    fn timeout(&self) -> Option<Duration>;
}
```
`Client` implements all of these. The JetStream `Context` also implements them via delegation.
## Authentication Types
### `Auth`
Container for all authentication methods:
```rust
pub struct Auth {
    pub jwt: Option<String>,
    pub nkey: Option<String>,
    pub signature_callback: Option<CallbackArg1<String, Result<String, AuthError>>>,
    pub signature: Option<Vec<u8>>,
    pub username: Option<String>,
    pub password: Option<String>,
    pub token: Option<String>,
}
```
### `AuthError`
Simple string error for auth callback failures.
### `ReconnectToServer`
Returned by `reconnect_to_server_callback` to select a server and delay:
```rust
pub struct ReconnectToServer {
    pub addr: ServerAddr,
    pub delay: Option<Duration>,
}
```

@@ -1,278 +0,0 @@
# async-nats: NATS Protocol & Wire Format
## Protocol Overview
NATS uses a simple, text-based protocol over TCP. Control lines are terminated with `\r\n`. Both directions use the same line-oriented framing, but the client and the server each send their own set of operations.
### Client → Server Operations (`ClientOp`)
```rust
pub(crate) enum ClientOp {
    Publish { subject, payload, respond, headers },
    Subscribe { sid, subject, queue_group },
    Unsubscribe { sid, max },
    Ping,
    Pong,
    Connect(ConnectInfo),
}
```
### Server → Client Operations (`ServerOp`)
```rust
pub(crate) enum ServerOp {
    Ok,
    Info(Box<ServerInfo>),
    Ping,
    Pong,
    Error(ServerError),
    Message { sid, subject, reply, payload, headers, status, description, length },
}
```
## Wire Format: Client Operations
### CONNECT
Sent immediately after receiving the first `INFO` from the server:
```
CONNECT {"verbose":false,"pedantic":false,...}\r\n
```
The JSON payload is `ConnectInfo` serialized inline on the same line.
### PUB (Publish without headers)
```
PUB <subject> [reply-to] <payload-size>\r\n
<payload>\r\n
```
Example:
```
PUB events.data INBOX.67 11\r\n
Hello World\r\n
```
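The `PUB` framing above can be reproduced with a few lines of byte concatenation. This is a minimal sketch, not the crate's serializer; `encode_pub` is a hypothetical helper.

```rust
/// Encodes a PUB frame: control line, payload, and the trailing CRLF.
fn encode_pub(subject: &str, reply: Option<&str>, payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::new();
    frame.extend_from_slice(b"PUB ");
    frame.extend_from_slice(subject.as_bytes());
    frame.push(b' ');
    // The reply-to subject is optional and sits before the size.
    if let Some(reply) = reply {
        frame.extend_from_slice(reply.as_bytes());
        frame.push(b' ');
    }
    // The size is the payload length in bytes, written as decimal text.
    frame.extend_from_slice(payload.len().to_string().as_bytes());
    frame.extend_from_slice(b"\r\n");
    frame.extend_from_slice(payload);
    frame.extend_from_slice(b"\r\n");
    frame
}
```

Calling `encode_pub("events.data", Some("INBOX.67"), b"Hello World")` yields the same bytes as the example frame shown above.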
### HPUB (Publish with headers)
When headers are present and non-empty:
```
HPUB <subject> [reply-to] <header-size> <total-size>\r\n
<headers>\r\n
<payload>\r\n
```
The `<total-size>` = `<header-size>` + `<payload-size>`.
Header block format:
```
NATS/1.0\r\n
Header-Name: Header-Value\r\n
Another-Header: Another-Value\r\n
\r\n
```
The version line (`NATS/1.0`) may include a status code and description:
```
NATS/1.0 404 No Messages\r\n
\r\n
```
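The size arithmetic for `HPUB` is easy to get wrong, so a worked sketch may help. It assumes the header size counts the whole header block, including the version line and the terminating empty line. `encode_hpub` is a hypothetical helper, and the reply-to subject is left out for brevity.

```rust
/// Encodes an HPUB frame. `headers` is a list of (name, value) pairs.
fn encode_hpub(subject: &str, headers: &[(&str, &str)], payload: &[u8]) -> Vec<u8> {
    // Header block: version line, one line per header, then an empty line.
    let mut block = String::from("NATS/1.0\r\n");
    for (name, value) in headers {
        block.push_str(name);
        block.push_str(": ");
        block.push_str(value);
        block.push_str("\r\n");
    }
    block.push_str("\r\n");

    // Assumption: header size covers the entire block built above.
    let header_size = block.len();
    let total_size = header_size + payload.len();

    let mut frame = format!("HPUB {subject} {header_size} {total_size}\r\n").into_bytes();
    frame.extend_from_slice(block.as_bytes());
    frame.extend_from_slice(payload);
    frame.extend_from_slice(b"\r\n");
    frame
}
```

For one header `Bar: Baz` and the payload `Hello NATS!`, the header block is 10 + 10 + 2 = 22 bytes and the total is 22 + 11 = 33, so the control line reads `HPUB FOO 22 33`.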
### SUB (Subscribe)
```
SUB <subject> [queue-group] <sid>\r\n
```
The `sid` (subscription ID) is a client-assigned u64, unique per connection.
### UNSUB (Unsubscribe)
```
UNSUB <sid> [max]\r\n
```
The optional `max` tells the server to auto-unsubscribe after `max` messages are delivered.
### PING / PONG
```
PING\r\n
PONG\r\n
```
The client sends PING periodically (default every 60s). If more than 2 pings are pending without a PONG, the connection is considered dead.
## Wire Format: Server Operations
### INFO
First message sent by the server on connection:
```
INFO {"server_id":"NATSxxx","version":"2.10"...}\r\n
```
Also sent asynchronously when cluster topology changes.
### MSG (Message without headers)
```
MSG <subject> <sid> [reply-to] <payload-size>\r\n
<payload>\r\n
```
### HMSG (Message with headers)
```
HMSG <subject> <sid> [reply-to] <header-size> <total-size>\r\n
<headers + payload>\r\n
```
### +OK / -ERR
```
+OK\r\n
-ERR <description>\r\n
```
Sent only when `verbose=true` in `CONNECT`. The client always sets `verbose=false`, so `+OK` is not expected.
## Protocol Parser
The `Connection` struct handles all protocol parsing and serialization:
### Read Path (`try_read_op`)
1. Search for `\r\n` in `read_buf` using `memchr::memmem::find`
2. Inspect the first bytes to determine the operation type:
   - `+OK` → `ServerOp::Ok`
   - `PING` → `ServerOp::Ping`
   - `PONG` → `ServerOp::Pong`
   - `-ERR` → `ServerOp::Error(...)` (description is `trim_matches('\'')`)
   - `INFO ` → `ServerOp::Info(...)` (serde_json deserialization)
   - `MSG ` → Parse subject/sid/reply/size, then read payload
   - `HMSG ` → Parse subject/sid/reply/header_len/total_len, then read headers + payload
3. For `MSG`/`HMSG`: if the full message body hasn't been read yet, return `None` (wait for more data)
4. For `HMSG`: parse the header block — extract version line (`NATS/1.0[ <status>[ <description>]]`), then key-value pairs (supports folded/multi-line header values)
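The `MSG` branch of the read path can be sketched as follows. This is a simplified illustration rather than the crate's parser: it uses a plain byte search instead of `memchr`, handles only `MSG`, and returns `None` both for incomplete data and for malformed input. `MsgHeader`, `parse_msg_line`, and `try_read_msg` are hypothetical names.

```rust
/// Parsed `MSG` control line (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
struct MsgHeader {
    subject: String,
    sid: u64,
    reply: Option<String>,
    payload_len: usize,
}

/// Parses `MSG <subject> <sid> [reply-to] <payload-size>` (without the CRLF).
fn parse_msg_line(line: &str) -> Option<MsgHeader> {
    let mut parts = line.strip_prefix("MSG ")?.split_ascii_whitespace();
    let subject = parts.next()?.to_owned();
    let sid = parts.next()?.parse().ok()?;
    let third = parts.next()?;
    match parts.next() {
        // Four fields: the third one is the reply subject.
        Some(size) => Some(MsgHeader {
            subject,
            sid,
            reply: Some(third.to_owned()),
            payload_len: size.parse().ok()?,
        }),
        // Three fields: there is no reply subject.
        None => Some(MsgHeader {
            subject,
            sid,
            reply: None,
            payload_len: third.parse().ok()?,
        }),
    }
}

/// Returns the header and payload if a whole message is already buffered.
fn try_read_msg(buf: &[u8]) -> Option<(MsgHeader, &[u8])> {
    let line_end = buf.windows(2).position(|w| w == b"\r\n")?;
    let header = parse_msg_line(std::str::from_utf8(&buf[..line_end]).ok()?)?;
    let start = line_end + 2;
    let end = start + header.payload_len;
    // The payload and its trailing CRLF must both be present; otherwise wait.
    if buf.len() < end + 2 {
        return None;
    }
    Some((header, &buf[start..end]))
}
```

A real parser would distinguish "need more data" from "protocol error"; collapsing both into `None` keeps the sketch short.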
### Write Path (`enqueue_write_op`)
Writes are buffered in one of two ways, depending on size:
- **Small writes** (< 4096 bytes): flattened into `flattened_writes: BytesMut`
- **Large writes** (≥ 4096 bytes): appended as separate `Bytes` chunks in `write_buf: VecDeque<Bytes>`
This enables efficient vectored I/O when the underlying stream supports it.
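A rough std-only sketch of this two-tier buffering follows. It is an assumption-laden illustration: `Vec<u8>` replaces `Bytes`/`BytesMut`, `WriteBuffer` is a hypothetical type, and the rule that pending small writes are sealed into a chunk before a large write (to keep ordering) is this sketch's own choice rather than something the source confirms.

```rust
use std::collections::VecDeque;

/// Size threshold taken from the text above.
const FLATTEN_THRESHOLD: usize = 4096;

/// Hypothetical write buffer with a flattened area for small writes.
#[derive(Default)]
struct WriteBuffer {
    chunks: VecDeque<Vec<u8>>,
    flattened: Vec<u8>,
}

impl WriteBuffer {
    fn enqueue(&mut self, data: &[u8]) {
        if data.len() < FLATTEN_THRESHOLD {
            // Small write: coalesce with neighbouring small writes.
            self.flattened.extend_from_slice(data);
        } else {
            // Large write: seal pending small writes first so order is kept,
            // then queue the large buffer as its own chunk.
            self.seal();
            self.chunks.push_back(data.to_vec());
        }
    }

    /// Moves the flattened bytes into the chunk queue, if there are any.
    fn seal(&mut self) {
        if !self.flattened.is_empty() {
            self.chunks.push_back(std::mem::take(&mut self.flattened));
        }
    }

    /// Lengths of the chunks that a vectored write would be handed.
    fn chunk_lens(&mut self) -> Vec<usize> {
        self.seal();
        self.chunks.iter().map(Vec::len).collect()
    }
}
```

Two small writes followed by a large one and another small one end up as three chunks, which is the shape a vectored write can consume directly.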
### Write Flush Strategy
The `should_flush()` method returns:
- `Yes` — buffers empty but haven't flushed yet
- `May` — buffers not empty and haven't flushed
- `No` — already flushed or nothing to flush
The `ConnectionHandler` calls `poll_flush()` after processing commands, ensuring data is actually sent to the server.
## Vectored I/O
When `stream.is_write_vectored()` returns true, the connection uses `poll_write_vectored()` to write up to 64 `IoSlice`s at once. For bursty publish patterns this lets many queued chunks go out in a single write call instead of one call per chunk.
```rust
const WRITE_VECTORED_CHUNKS: usize = 64;
```
## WebSocket Transport
When the `websockets` feature is enabled, `WebSocketAdapter<T>` wraps `tokio_websockets::WebSocketStream<T>` to implement `AsyncRead + AsyncWrite`, making WebSocket connections transparent to the protocol layer.
```rust
#[cfg(feature = "websockets")]
pub(crate) struct WebSocketAdapter<T> {
    pub(crate) inner: WebSocketStream<T>,
    pub(crate) read_buf: BytesMut,
}
```
WebSocket connections use `ws://` or `wss://` scheme in the server URL. TLS for `wss://` is handled by the WebSocket library's built-in TLS support.
## Connection Lifecycle
### Initial Connection Flow
```
Client                                  Server
  │                                        │
  │──── TCP connect ─────────────────────▶ │
  │◀──── INFO {server_id, nonce, ...} ──── │
  │──── CONNECT {auth, ...} ─────────────▶ │
  │──── PING ────────────────────────────▶ │
  │◀──── PONG (or -ERR) ────────────────── │
  │                                        │
  │  [connected, ConnectionHandler runs]   │
```
If `tls_first` is enabled, TLS is established before reading INFO:
```
Client                                  Server
  │                                        │
  │──── TCP connect ─────────────────────▶ │
  │──── TLS handshake ───────────────────▶ │
  │◀──── TLS handshake ─────────────────── │
  │◀──── INFO {...} ────────────────────── │
  │──── CONNECT + PING ──────────────────▶ │
  │◀──── PONG ──────────────────────────── │
```
### Ping/Pong Keepalive
- Client sends PING every `ping_interval` (default 60s)
- Server responds with PONG
- If `pending_pings > MAX_PENDING_PINGS (2)`, connection is considered dead
- Any server operation resets the ping interval timer
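The pending-ping rule can be captured in a tiny state machine. This is a sketch of the counting logic only, under the assumption that the counter is incremented on every interval tick and decremented on every PONG. `Keepalive` and `Tick` are hypothetical names, and the timer reset on other server traffic is not modelled.

```rust
/// Threshold named in the text above.
const MAX_PENDING_PINGS: usize = 2;

/// What the handler should do when the ping interval elapses.
#[derive(Debug, PartialEq)]
enum Tick {
    SendPing,
    Disconnect,
}

/// Hypothetical keepalive tracker.
#[derive(Default)]
struct Keepalive {
    pending_pings: usize,
}

impl Keepalive {
    /// Called each time the ping interval elapses.
    fn on_interval(&mut self) -> Tick {
        self.pending_pings += 1;
        if self.pending_pings > MAX_PENDING_PINGS {
            Tick::Disconnect
        } else {
            Tick::SendPing
        }
    }

    /// Called when a PONG arrives from the server.
    fn on_pong(&mut self) {
        self.pending_pings = self.pending_pings.saturating_sub(1);
    }
}
```

With a silent server, the first two ticks send a PING and the third tick reports a dead connection; a server that answers every PING never trips the threshold.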
### Reconnection Flow
On disconnect:
1. `handle_disconnect()` sends `Event::Disconnected` and sets state to `Disconnected`
2. `handle_reconnect()` calls `connector.connect()` which:
- Shuffles servers (unless `retain_servers_order`)
- Sorts by `failed_attempts` (ascending)
- Iterates through servers with exponential backoff delay
- On each server: DNS resolve → TCP connect → INFO → TLS (if needed) → CONNECT+PING → PONG
3. On success:
- Sends `Event::Connected`, sets state to `Connected`
- Removes closed subscriptions
- Re-subscribes all active subscriptions (with adjusted `max = max - delivered`)
- Re-subscribes the multiplexer (if active)
4. On failure with `MaxReconnects` reached, the handler loop exits
### Default Reconnect Delay
Exponential backoff capped at 4 seconds:
```rust
fn reconnect_delay_callback_default(attempts: usize) -> Duration {
    if attempts <= 1 {
        Duration::from_millis(0)
    } else {
        let exp: u32 = (attempts - 1).try_into().unwrap_or(u32::MAX);
        cmp::min(Duration::from_millis(2_u64.saturating_pow(exp)), Duration::from_secs(4))
    }
}
```
| Attempt | Delay |
|---------|-------|
| 1 | 0ms |
| 2 | 2ms |
| 3 | 4ms |
| 4 | 8ms |
| 5 | 16ms |
| 6 | 32ms |
| 7 | 64ms |
| 8 | 128ms |
| 9 | 256ms |
| 10 | 512ms |
| 11 | 1024ms |
| 12 | 2048ms |
| 13+ | 4000ms (cap) |

@@ -1,221 +0,0 @@
# async-nats: Connection Management & Configuration
## ConnectOptions Builder
`ConnectOptions` provides a builder for all connection configuration:
```rust
let client = ConnectOptions::new()
    .require_tls(true)
    .ping_interval(Duration::from_secs(10))
    .name("my-service")
    .connect("demo.nats.io")
    .await?;
```
### Authentication Methods
| Method | Description |
|--------|-------------|
| `with_token(token)` | Token-based auth |
| `with_user_and_password(user, pass)` | Username/password auth |
| `with_nkey(seed)` | NKey auth (requires `nkeys` feature) |
| `with_jwt(jwt, sign_cb)` | JWT + signing callback (requires `nkeys`) |
| `with_credentials_file(path)` | Load from `.creds` file (requires `nkeys`) |
| `with_credentials(creds_str)` | Parse credentials string (requires `nkeys`) |
| `with_auth_callback(cb)` | Dynamic auth callback receiving nonce, returning `Auth` |
The auth callback is the most flexible — it receives the server nonce and can return any combination of auth fields:
```rust
ConnectOptions::with_auth_callback(move |nonce| async move {
    let mut auth = Auth::new();
    auth.username = Some("user".to_string());
    auth.password = Some("pass".to_string());
    Ok(auth)
})
```
### TLS Configuration
| Option | Description |
|-------|-------------|
| `require_tls(bool)` | Require TLS for the connection |
| `tls_first()` | Establish TLS before INFO (requires server `handshake_first`) |
| `add_root_certificates(path)` | Load root CA certificates from PEM file |
| `add_client_certificate(cert, key)` | Load client certificate for mTLS |
| `tls_client_config(config)` | Pass a custom `rustls::ClientConfig` |
Two TLS crypto backends: `ring` (default) or `aws-lc-rs` (via feature flags). FIPS mode available via `aws-lc-rs` + `fips` features.
### Connection Behavior
| Option | Default | Description |
|--------|---------|-------------|
| `connection_timeout` | 5s | Timeout for full connection establishment |
| `request_timeout` | 10s | Default timeout for `Client::request` |
| `ping_interval` | 60s | How often client sends PING |
| `retry_on_initial_connect` | false | Return client immediately, connect in background |
| `max_reconnects` | None (unlimited) | Max consecutive reconnect attempts |
| `ignore_discovered_servers` | false | Ignore servers advertised in INFO |
| `retain_servers_order` | false | Don't shuffle server list on reconnect |
| `skip_subject_validation` | false | Skip whitespace validation on publish subjects |
| `subscription_capacity` | 65536 | mpsc channel capacity per subscription |
| `client_capacity` | 2048 | mpsc channel capacity for command sender |
| `custom_inbox_prefix` | `_INBOX` | Custom prefix for inbox subjects |
| `read_buffer_capacity` | 65535 | Initial size of the protocol read buffer |
| `local_address` | None | Local socket address to bind to |
| `no_echo` | false | Don't deliver messages published by this connection |
### Reconnection Callbacks
**`reconnect_delay_callback`**: Custom backoff strategy:
```rust
.reconnect_delay_callback(|attempts| {
    Duration::from_millis(std::cmp::min((attempts * 100) as u64, 8000))
})
```
**`reconnect_to_server_callback`**: Select which server to connect to on each reconnect attempt:
```rust
.reconnect_to_server_callback(|servers, _info| async move {
    servers.first().map(|s| ReconnectToServer {
        addr: s.addr.clone(),
        delay: Some(Duration::ZERO),
    })
})
```
Receives `(Vec<Server>, ServerInfo)`, returns `Option<ReconnectToServer>`. If the returned server isn't in the pool, falls back to default selection.
**`event_callback`**: Receive async notifications:
```rust
.event_callback(|event| async move {
    match event {
        Event::Disconnected => println!("disconnected"),
        Event::Connected => println!("connected"),
        Event::SlowConsumer(sid) => eprintln!("slow consumer: {sid}"),
        _ => {}
    }
})
```
## Connection Handler Internals
### ProcessFut — The Core Event Loop
The `ConnectionHandler::process()` method creates a custom `Future` (`ProcessFut`) that drives the connection forward. Each `poll()` call:
1. **Check ping interval** — if timer ticked, send PING; if too many pending pings, disconnect
2. **Read server operations** — drain all available `ServerOp`s from `Connection::poll_read_op()`
3. **Process drain completions** — remove subscriptions that finished draining
4. **Handle commands** — receive up to 16 `Command`s from the mpsc channel and process them
5. **Write to socket** — flush the write buffer via `Connection::poll_write()`
6. **Flush** — call `poll_flush()` on the underlying stream when needed
7. **Check reconnect flag** — if `should_reconnect` is set, shut down and reconnect
```rust
const RECV_CHUNK_SIZE: usize = 16;
```
### Exit Reasons
The event loop exits with one of:
| Reason | Action |
|--------|--------|
| `Disconnected(Option<io::Error>)` | Attempt reconnection |
| `ReconnectRequested` | Shut down stream, attempt reconnection |
| `Closed` | Send `Event::Closed`, exit loop |
### Handle Disconnect & Reconnect
```rust
async fn handle_disconnect(&mut self) -> Result<(), ConnectError> {
    self.pending_pings = 0;
    self.connector.events_tx.try_send(Event::Disconnected).ok();
    self.connector.state_tx.send(State::Disconnected).ok();
    self.handle_reconnect().await
}

async fn handle_reconnect(&mut self) -> Result<(), ConnectError> {
    let (info, connection) = self.connector.connect().await?;
    self.connection = connection;
    let _ = self.info_sender.send(Some(info));

    // Remove closed subscriptions
    self.subscriptions.retain(|_, sub| !sub.sender.is_closed());

    // Re-subscribe all active subscriptions
    for (sid, subscription) in &self.subscriptions {
        self.connection.enqueue_write_op(&ClientOp::Subscribe {
            sid: *sid,
            subject: subscription.subject.to_owned(),
            queue_group: subscription.queue_group.to_owned(),
        });
        if let Some(max) = subscription.max {
            self.connection.enqueue_write_op(&ClientOp::Unsubscribe {
                sid: *sid,
                max: Some(max.saturating_sub(subscription.delivered)),
            });
        }
    }

    // Re-subscribe multiplexer if active
    if let Some(multiplexer) = &self.multiplexer {
        self.connection.enqueue_write_op(&ClientOp::Subscribe {
            sid: MULTIPLEXER_SID,
            subject: multiplexer.subject.to_owned(),
            queue_group: None,
        });
    }
    Ok(())
}
```
## Request/Reply Multiplexer
The client uses a **multiplexer** pattern for request/reply to avoid creating a separate subscription per request:
1. A single wildcard subscription is created on first request: `_INBOX.<random_id>.*`
2. Each request gets a unique token appended to the inbox: `_INBOX.<random_id>.<token>`
3. When a response arrives, the token is extracted from the subject and used to look up the `oneshot::Sender` in `multiplexer.senders`
4. The response is forwarded through the oneshot channel to the waiting `send_request()` future
```rust
struct Multiplexer {
    subject: Subject,                                   // _INBOX.<id>.*
    prefix: Subject,                                    // _INBOX.<id>.
    senders: HashMap<String, oneshot::Sender<Message>>, // token → sender
}
```
The multiplexer subscription uses `sid = 0` (`MULTIPLEXER_SID`), which is separate from regular subscription IDs (which start at 1).
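The token routing described above can be sketched without any async machinery. This is an illustration under assumptions: `std::sync::mpsc` channels stand in for `oneshot`, tokens come from a counter instead of random IDs, payloads are plain `Vec<u8>`, and `register` and `dispatch` are hypothetical method names.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Receiver, Sender};

/// Hypothetical multiplexer keyed by the last token of the reply subject.
struct Multiplexer {
    prefix: String, // "_INBOX.<id>."
    senders: HashMap<String, Sender<Vec<u8>>>,
    next_token: u64,
}

impl Multiplexer {
    fn new(id: &str) -> Self {
        Multiplexer {
            prefix: format!("_INBOX.{id}."),
            senders: HashMap::new(),
            next_token: 0,
        }
    }

    /// Wildcard subject for the single shared subscription.
    fn subject(&self) -> String {
        format!("{}*", self.prefix)
    }

    /// Registers a pending request; returns its reply subject and receiver.
    fn register(&mut self) -> (String, Receiver<Vec<u8>>) {
        let token = self.next_token.to_string();
        self.next_token += 1;
        let (tx, rx) = mpsc::channel();
        let reply = format!("{}{}", self.prefix, token);
        self.senders.insert(token, tx);
        (reply, rx)
    }

    /// Routes a response by its token. Returns whether a waiter received it.
    fn dispatch(&mut self, subject: &str, payload: Vec<u8>) -> bool {
        let Some(token) = subject.strip_prefix(self.prefix.as_str()) else {
            return false;
        };
        match self.senders.remove(token) {
            Some(tx) => tx.send(payload).is_ok(),
            None => false,
        }
    }
}
```

Removing the sender on dispatch gives each request exactly one response; a late or duplicate reply finds no entry and is dropped.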
### Custom Inbox Bypass
If a `Request` has a custom `inbox` set, the multiplexer is bypassed — a dedicated subscription is created for that specific request, and the timeout/response logic is handled locally within `send_request()`.
## Server Pool Management
The `Connector` maintains a `Vec<Server>` pool. Servers can come from:
1. **Explicit URLs** — provided by the user at connect time
2. **Discovered servers** — advertised in `INFO.connect_urls` (unless `ignore_discovered_servers` is set)
On reconnection:
- Servers are shuffled (unless `retain_servers_order`)
- Sorted by `failed_attempts` (ascending) — prefer servers that haven't failed recently
- Each server is tried with exponential backoff delay
- On success: `failed_attempts` reset to 0, `did_connect` set to true
- On failure: `failed_attempts` incremented, `last_error` updated
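The ordering and bookkeeping rules above can be sketched as two small functions. This is a simplified model: `Server` here mirrors only some fields of the crate's type with `String` addresses, the shuffle step is omitted because it needs a random number source, and `order_for_reconnect` and `record_result` are hypothetical names.

```rust
/// Simplified, hypothetical mirror of the pool entry described above.
#[derive(Debug, Clone, PartialEq)]
struct Server {
    addr: String,
    failed_attempts: usize,
    did_connect: bool,
    last_error: Option<String>,
}

impl Server {
    fn new(addr: &str) -> Self {
        Server {
            addr: addr.to_owned(),
            failed_attempts: 0,
            did_connect: false,
            last_error: None,
        }
    }
}

/// Orders the pool so servers with fewer failures are tried first.
/// The sort is stable, so a prior shuffle decides ties.
fn order_for_reconnect(pool: &mut [Server]) {
    pool.sort_by_key(|server| server.failed_attempts);
}

/// Applies the success and failure bookkeeping from the list above.
fn record_result(server: &mut Server, result: Result<(), String>) {
    match result {
        Ok(()) => {
            server.failed_attempts = 0;
            server.did_connect = true;
        }
        Err(error) => {
            server.failed_attempts += 1;
            server.last_error = Some(error);
        }
    }
}
```

A stable sort matters here: after shuffling, servers with equal failure counts keep their randomized relative order, which spreads load across healthy servers.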
### Dynamic Server Pool Updates
`Client::set_server_pool()` replaces the pool at runtime:
- Per-server state is preserved for servers that appear in both old and new pools
- The global reconnection attempt counter is reset
- Cannot mix WebSocket and non-WebSocket URLs
- Pool cannot be empty

@@ -1,373 +0,0 @@
# async-nats: JetStream
## Overview
JetStream is NATS' built-in persistence layer, providing stream-based messaging with at-least-once and exactly-once delivery semantics. The `async-nats` JetStream API is accessed through a `Context` object.
### Creating a Context
```rust
// Default context (prefix: $JS.API)
let jetstream = async_nats::jetstream::new(client);
// With domain (prefix: $JS.<domain>.API)
let jetstream = async_nats::jetstream::with_domain(client, "hub");
// With custom prefix
let jetstream = async_nats::jetstream::with_prefix(client, "JS.acc@hub.API");
// Builder with fine-grained control
let context = ContextBuilder::new()
    .timeout(Duration::from_secs(5))
    .api_prefix("MY.JS.API")
    .max_ack_inflight(1000)
    .backpressure_on_inflight(true)
    .ack_timeout(Duration::from_secs(30))
    .build(client);
```
## Context
```rust
#[derive(Debug, Clone)]
pub struct Context {
    pub(crate) client: Client,
    pub(crate) prefix: String,
    pub(crate) timeout: Duration,
    pub(crate) max_ack_semaphore: Arc<tokio::sync::Semaphore>,
    pub(crate) ack_sender: mpsc::Sender<(oneshot::Receiver<Message>, OwnedSemaphorePermit)>,
    pub(crate) backpressure_on_inflight: bool,
    pub(crate) semaphore_capacity: usize,
}
```
### Publish Backpressure
The context uses a semaphore to limit the number of pending publish acknowledgments:
- `max_ack_inflight(n)` — sets semaphore capacity (default 5000)
- `backpressure_on_inflight(true)` → `publish()` waits for a permit when limit is reached
- `backpressure_on_inflight(false)` → `publish()` returns `MaxAckPending` error immediately when limit is reached
A background **acker task** monitors pending acks with a timeout (`ack_timeout`, default 30s), releasing permits when acks arrive or time out.
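
A minimal sketch of the non-blocking mode (`backpressure_on_inflight(false)`), assuming a plain counter in place of the real Tokio semaphore. `AckLimiter` and its methods are hypothetical names, not part of `async-nats`.

```rust
#[derive(Debug, PartialEq)]
pub enum PublishError {
    MaxAckPending,
}

/// Hypothetical stand-in for the ack semaphore: counts pending publish acks.
pub struct AckLimiter {
    capacity: usize,
    in_flight: usize,
}

impl AckLimiter {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, in_flight: 0 }
    }

    /// Take a permit, or fail immediately if the limit is reached.
    pub fn try_acquire(&mut self) -> Result<(), PublishError> {
        if self.in_flight >= self.capacity {
            return Err(PublishError::MaxAckPending);
        }
        self.in_flight += 1;
        Ok(())
    }

    /// Called when an ack arrives or its timeout expires.
    pub fn release(&mut self) {
        self.in_flight = self.in_flight.saturating_sub(1);
    }
}
```

In the blocking mode the caller would wait for `release` instead of receiving the error; that part needs an async runtime and is not modelled here.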
### JetStream API Request Pattern
All JetStream API calls follow the same pattern:
1. Build a subject from the prefix: `format!("{}.STREAM.CREATE.<name>", self.prefix)`
2. Serialize the request payload as JSON
3. Send a request via `client.send_request()` with the API subject
4. Deserialize the response as `Response<T>` (which is `Ok(T)` or `Err(ErrorCode)`)
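
Step 1 of the pattern above can be illustrated with plain string formatting. The helper names `api_prefix` and `api_subject` are hypothetical; the prefix shapes are the ones shown in the "Creating a Context" comments.

```rust
/// Default prefix is `$JS.API`; with a domain it becomes `$JS.<domain>.API`.
pub fn api_prefix(domain: Option<&str>) -> String {
    match domain {
        Some(d) => format!("$JS.{d}.API"),
        None => "$JS.API".to_string(),
    }
}

/// Join prefix, operation (for example `STREAM.CREATE`) and resource name.
pub fn api_subject(prefix: &str, operation: &str, name: &str) -> String {
    format!("{prefix}.{operation}.{name}")
}
```

For example, `api_subject(&api_prefix(Some("hub")), "STREAM.CREATE", "events")` yields the subject a stream-create request would be sent to under the `hub` domain.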
## Streams
### Stream Handle
```rust
pub struct Stream<I = Info> {
context: Context,
info: I,
name: String,
}
```
`Stream<Info>` carries server-side info. `Stream<()>` is a lightweight handle that skips the INFO fetch. `Stream` (no generic) defaults to `Stream<Info>`.
### Stream Config
```rust
pub struct Config {
pub name: String,
pub description: Option<String>,
pub subjects: Vec<String>,
pub retention: RetentionPolicy,
pub max_consumers: i64,
pub max_messages: i64,
pub max_messages_per_subject: i64,
pub max_bytes: i64,
pub max_age: Duration,
pub max_messages_per_stream: i64,
pub max_msg_size: i32,
pub discard: DiscardPolicy,
pub discard_new_per_subject: bool,
pub storage: StorageType,
pub num_replicas: usize,
pub no_ack: bool,
pub duplicate_window: Duration,
pub placement: Option<Placement>,
pub mirror: Option<Source>,
pub sources: Option<Vec<Source>>,
pub sealed: bool,
pub allow_direct: bool,
pub allow_rollup_hdrs: bool,
// server_2_10 features:
pub compression: Option<Compression>,
pub first_sequence: Option<u64>,
pub subject_transform: Option<SubjectTransform>,
pub republish: Option<Republish>,
pub metadata: Option<HashMap<String, String>>,
}
```
### Stream Operations
| Method | Description |
|--------|-------------|
| `create_stream(config)` | Create a new stream |
| `get_stream(name)` | Get stream handle (with INFO) |
| `get_stream_no_info(name)` | Get lightweight handle (no server round-trip) |
| `get_or_create_stream(config)` | Get existing or create new |
| `delete_stream(name)` | Delete a stream |
| `update_stream(config)` | Update stream configuration |
| `create_or_update_stream(config)` | Update or create if not found |
| `stream_names()` | `Stream` of stream names (paginated) |
| `streams()` | `Stream` of stream info (paginated) |
| `stream_by_subject(subject)` | Find stream name containing subject |
### Stream Handle Methods
```rust
let stream: Stream = jetstream.get_stream("events").await?;
// Info
let info: Info = stream.info().await?; // Fresh info from server
let info: &Info = stream.cached_info(); // Cached info from last fetch
// Message operations
stream.get_raw_message(seq).await?; // Get raw message by sequence
stream.get_last_raw_message_by_subject(subj).await?; // Get last message for subject
stream.direct_get(seq).await?; // Direct get (if allow_direct)
stream.direct_get_last_for_subject(subj).await?; // Direct last by subject
stream.delete_message(seq).await?; // Delete a specific message
stream.purge().await?; // Purge all messages
stream.purge().filter(subj).await?; // Purge messages for subject
// Consumers
stream.create_consumer(config).await?; // Create consumer bound to stream
stream.get_consumer(name).await?; // Get existing consumer
stream.delete_consumer(name).await?; // Delete consumer
```
## Consumers
### Consumer Types
Two consumer types, each with distinct delivery models:
1. **Pull Consumer** (`pull::Config` / `PullConsumer`) — Client explicitly requests batches of messages
2. **Push Consumer** (`push::Config` / `PushConsumer`) — Server pushes messages to a deliver subject
### Pull Consumer
```rust
let consumer: PullConsumer = stream
.get_or_create_consumer("my-consumer", pull::Config {
durable_name: Some("my-consumer".to_string()),
..Default::default()
})
.await?;
```
**Key methods**:
- `consumer.batch(n).await?` — Fetch up to `n` messages (one-shot batch)
- `consumer.messages().await?` — Continuous `Stream` of messages
- `consumer.sequence(n).await?` — Continuous `Stream` of batches of `n` messages
- `consumer.fetch().max(n).expires(dur).await?` — Configurable fetch
Each message from a pull consumer is a `jetstream::Message` which has `ack()` methods.
### Push Consumer
Two push consumer variants:
1. **Standard** (`push::Config`) — messages delivered to a specific subject
2. **Ordered** (`push::OrderedConfig`) — auto-recreated on failure, with flow control
```rust
// Standard push
let consumer = stream.create_consumer(push::Config {
deliver_subject: "deliver.subject".to_string(),
durable_name: Some("push-consumer".to_string()),
..Default::default()
}).await?;
// Ordered push (no durable name, auto-recreates on failure)
let consumer = stream.create_consumer(push::OrderedConfig {
deliver_subject: client.new_inbox(),
filter_subject: "events.>".to_string(),
..Default::default()
}).await?;
```
### Consumer Config (Shared Fields)
```rust
pub struct Config {
// Pull fields
pub durable_name: Option<String>,
pub name: Option<String>,
// Push fields
pub deliver_subject: Option<String>,
pub deliver_group: Option<String>,
pub deliver_policy: DeliverPolicy,
pub opt_start_time: Option<DateTime>,
pub opt_start_sequence: Option<u64>,
pub ack_policy: AckPolicy,
pub ack_wait: Duration,
pub max_deliver: i64,
pub backoff: Vec<Duration>,
pub filter_subject: String,
pub filter_subjects: Vec<String>, // server_2_10+
pub replay_policy: ReplayPolicy,
pub rate_limit_bps: Option<u64>,
pub max_waiting: i64, // pull: max outstanding pull requests
pub max_ack_pending: i64,
pub flow_control: bool,
pub idle_heartbeat: Duration,
pub headers_only: bool,
pub num_replicas: usize,
pub mem_storage: bool,
pub description: Option<String>,
pub metadata: Option<HashMap<String, String>>,
pub inactive_threshold: Option<Duration>, // for ephemeral consumers
}
```
### Deliver Policy
```rust
pub enum DeliverPolicy {
All, // Deliver all messages
Last, // Deliver last message only
New, // Deliver only new messages
ByStartSequence { start_sequence: u64 },
ByStartTime { start_time: DateTime },
LastPerSubject, // Deliver last message per subject
}
```
### Ack Policy
```rust
pub enum AckPolicy {
None, // No acknowledgment needed
All, // Ack all messages up to this one
Explicit, // Ack each message individually
}
```
## JetStream Messages
### `jetstream::Message`
Wraps a core `Message` with JetStream-specific metadata:
```rust
pub struct Message {
pub message: crate::Message, // The underlying NATS message
pub ack_subject: Subject, // Subject for sending acks
pub stream: String, // Stream name
pub consumer: String, // Consumer name
pub stream_sequence: u64, // Sequence in stream
pub consumer_sequence: u64, // Sequence for this consumer
pub delivered: u64, // Delivery count
pub pending: u64, // Pending message count
pub published: DateTime, // Original publish time
}
```
### Ack Methods
```rust
// In-memory ack (non-persistent, fast)
message.ack().await?;
// Ack with specific type
message.ack_with(AckKind::Nak).await?;
message.ack_with(AckKind::Progress).await?;
message.ack_with(AckKind::Term).await?;
message.ack_with(AckKind::NakWithDelay(duration)).await?;
message.ack_with(AckKind::TermWithReason("reason")).await?;
```
### `AckKind`
```rust
pub enum AckKind {
Ack, // +ACK — message processed
Nak, // -NAK — re-deliver
Progress, // +WPI (work in progress): still working
Term, // +TERM — don't redeliver
NakWithDelay(Duration), // -NAK with re-delivery delay
TermWithReason(String), // +TERM with reason
}
```
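
As a rough illustration of how each variant could map to the payload sent on the ack subject, here is a std-only sketch. The enum is re-declared locally and `ack_payload` is a hypothetical helper. The tokens follow the JetStream ack protocol as I understand it (`+WPI` for progress, the delay encoded as nanoseconds in JSON), so verify the exact encoding against the crate source before relying on it.

```rust
use std::time::Duration;

#[derive(Debug, Clone, PartialEq)]
pub enum AckKind {
    Ack,
    Nak,
    Progress,
    Term,
    NakWithDelay(Duration),
    TermWithReason(String),
}

/// Render the ack payload for a given kind (assumed wire format).
pub fn ack_payload(kind: &AckKind) -> String {
    match kind {
        AckKind::Ack => "+ACK".to_string(),
        AckKind::Nak => "-NAK".to_string(),
        AckKind::Progress => "+WPI".to_string(),
        AckKind::Term => "+TERM".to_string(),
        // Assumption: delay is sent as a JSON object with nanoseconds.
        AckKind::NakWithDelay(d) => format!("-NAK {{\"delay\":{}}}", d.as_nanos()),
        AckKind::TermWithReason(reason) => format!("+TERM {reason}"),
    }
}
```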
## JetStream Publish
### `Context::publish()`
JetStream publish returns a `PublishAckFuture` — a future that resolves to a `PublishAck`:
```rust
let ack_future = jetstream.publish("events", "data".into()).await?;
let ack: PublishAck = ack_future.await?; // Wait for server acknowledgment
```
### `PublishAck`
```rust
pub struct PublishAck {
pub stream: String,
pub sequence: u64,
pub domain: String,
pub duplicate: bool,
}
```
### `PublishMessage` Builder
```rust
let ack = jetstream.send_publish(
"events",
PublishMessage::build()
.payload("data".into())
.message_id("uuid-123") // Deduplication ID
.expected_stream("events") // Fail if wrong stream
.expected_last_msg_id("prev-id")
.expected_last_sequence(42)
.headers(headers),
).await?;
```
## Pagination
Stream and consumer listing uses pagination internally:
```rust
pub struct StreamNames {
context: Context,
offset: usize,
page_request: Option<Request>,
streams: Vec<String>,
subject: Option<String>,
done: bool,
}
```
Implements `futures_util::Stream<Item = Result<String, Error>>`, lazily fetching pages as needed.
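
The lazy page fetching can be sketched with a synchronous iterator. This is a simplified model, assuming a `fetch(offset)` callback that returns one page plus the server-side total; `Pager` is a hypothetical type, and the real implementation is an async `Stream` that also carries errors.

```rust
use std::collections::VecDeque;

/// Hypothetical pager: pulls a new page only when the local buffer runs dry.
pub struct Pager<F> {
    fetch: F,
    offset: usize,
    buffer: VecDeque<String>,
    done: bool,
}

impl<F> Pager<F>
where
    F: FnMut(usize) -> (Vec<String>, usize),
{
    pub fn new(fetch: F) -> Self {
        Self { fetch, offset: 0, buffer: VecDeque::new(), done: false }
    }
}

impl<F> Iterator for Pager<F>
where
    F: FnMut(usize) -> (Vec<String>, usize),
{
    type Item = String;

    fn next(&mut self) -> Option<String> {
        if self.buffer.is_empty() && !self.done {
            // Request the next page starting at the current offset.
            let (page, total) = (self.fetch)(self.offset);
            self.offset += page.len();
            // Stop once the server has nothing more to return.
            self.done = page.is_empty() || self.offset >= total;
            self.buffer.extend(page);
        }
        self.buffer.pop_front()
    }
}
```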
## Error Handling
JetStream errors follow the `Response<T>` pattern:
```rust
pub enum Response<T> {
Ok(T),
Err { error: ErrorCode },
}
```
`ErrorCode` carries the server's error code and description. Most JetStream-specific errors map to typed error enums such as `CreateStreamError` and `ConsumerError`.


@@ -1,237 +0,0 @@
# async-nats: Key-Value Store
## Overview
The Key-Value (KV) store is an abstraction built on top of JetStream streams. Each KV bucket is backed by a JetStream stream with the naming convention `KV_<bucket_name>`. Keys are mapped to subjects under the `$KV.<bucket>.<key>` prefix.
KV support requires the `kv` Cargo feature (which implies `jetstream`).
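
The naming convention from the overview can be written down as three small helpers. The function names are hypothetical; only the `KV_<bucket>` and `$KV.<bucket>.<key>` shapes come from the text.

```rust
/// Name of the JetStream stream backing a bucket.
pub fn kv_stream_name(bucket: &str) -> String {
    format!("KV_{bucket}")
}

/// Subject a key is published to.
pub fn kv_subject(bucket: &str, key: &str) -> String {
    format!("$KV.{bucket}.{key}")
}

/// Recover the key from a subject, or `None` if it belongs to another bucket.
pub fn kv_key_from_subject<'a>(bucket: &str, subject: &'a str) -> Option<&'a str> {
    subject
        .strip_prefix("$KV.")?
        .strip_prefix(bucket)?
        .strip_prefix('.')
}
```

Because keys may contain dots, everything after the bucket segment is treated as the key.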
## Store Handle
```rust
#[derive(Debug, Clone)]
pub struct Store {
pub name: String,
pub stream_name: String,
pub prefix: String, // $KV.<bucket>.
pub put_prefix: Option<String>, // For mirrored buckets
pub use_jetstream_prefix: bool, // Whether to prepend JS API prefix
pub stream: Stream,
}
```
## Bucket Config
```rust
#[derive(Debug, Clone, Default)]
pub struct Config {
pub bucket: String,
pub description: String,
pub max_value_size: i32,
pub history: i64, // Max historical entries per key (1-64)
pub max_age: Duration, // Max age of any entry
pub max_bytes: i64, // Total bucket size limit
pub storage: StorageType, // File or Memory
pub num_replicas: usize,
pub republish: Option<Republish>,
pub mirror: Option<Source>, // Mirror another bucket
pub sources: Option<Vec<Source>>,
pub mirror_direct: bool,
pub compression: bool, // server_2_10+
pub placement: Option<Placement>,
pub limit_markers: Option<Duration>, // server_2_11+
}
```
## Creating/Accessing Buckets
```rust
// Create a new bucket
let kv = jetstream.create_key_value(kv::Config {
bucket: "my-bucket".to_string(),
history: 10,
max_age: Duration::from_secs(3600),
..Default::default()
}).await?;
// Get an existing bucket
let kv = jetstream.get_key_value("my-bucket").await?;
// Create or update
let kv = jetstream.create_or_update_key_value(kv::Config { ... }).await?;
// Delete a bucket
jetstream.delete_key_value("my-bucket").await?;
```
## KV Operations
### Put
```rust
let revision: u64 = kv.put("key", "value".into()).await?;
```
Publishes to `$KV.<bucket>.<key>` (or with JS prefix). The JetStream stream stores it, and the returned sequence number serves as the revision.
### Get
```rust
let value: Option<Bytes> = kv.get("key").await?;
```
Returns `None` if the key doesn't exist or was deleted/purged. Uses either direct get (if `allow_direct`) or the standard message API.
### Entry
```rust
let entry: Option<Entry> = kv.entry("key").await?;
let entry: Option<Entry> = kv.entry_for_revision("key", 2).await?;
```
Returns full entry metadata:
```rust
pub struct Entry {
pub bucket: String,
pub key: String,
pub value: Bytes,
pub revision: u64,
pub created: DateTime,
pub delta: u64,
pub operation: Operation,
pub seen_current: bool,
}
```
### Create (Put if not exists)
```rust
let revision: u64 = kv.create("key", "value".into()).await?;
```
Uses `update` with `expected_last_subject_sequence = 0` (create-only). If the key exists and is deleted/purged, it's re-created.
### Update (Conditional Put)
```rust
let revision: u64 = kv.update("key", "value".into(), last_revision).await?;
```
Uses the `Nats-Expected-Last-Subject-Sequence` header for optimistic concurrency control. Only succeeds if the key's current revision matches.
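
The compare-and-set behaviour of `create` and `update` can be modelled in memory. This sketch assumes a single sequence counter per bucket and does not model the re-create-after-delete case; `Bucket` and `UpdateError` are hypothetical names.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum UpdateError {
    WrongLastRevision { expected: u64, actual: u64 },
}

/// Hypothetical in-memory bucket: remembers the last revision per key.
#[derive(Default)]
pub struct Bucket {
    seq: u64,
    last: HashMap<String, (u64, Vec<u8>)>,
}

impl Bucket {
    /// Write only if the key's current revision equals `expected`.
    pub fn update(&mut self, key: &str, value: &[u8], expected: u64) -> Result<u64, UpdateError> {
        // A key that was never written has revision 0.
        let actual = self.last.get(key).map_or(0, |(rev, _)| *rev);
        if actual != expected {
            return Err(UpdateError::WrongLastRevision { expected, actual });
        }
        self.seq += 1;
        self.last.insert(key.to_string(), (self.seq, value.to_vec()));
        Ok(self.seq)
    }

    /// Create is an update that expects revision 0 (key absent).
    pub fn create(&mut self, key: &str, value: &[u8]) -> Result<u64, UpdateError> {
        self.update(key, value, 0)
    }
}
```

A writer holding a stale revision gets an error instead of silently overwriting a newer value, which is the point of the expected-sequence header.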
### Delete
```rust
kv.delete("key").await?;
kv.delete_expect_revision("key", Some(revision)).await?;
```
Non-destructive — publishes a `DEL` marker message. The key appears deleted to `get()`, but history is preserved (up to `history` limit).
### Purge
```rust
kv.purge("key").await?;
kv.purge_with_ttl("key", Duration::from_secs(10)).await?;
kv.purge_expect_revision("key", Some(revision)).await?;
```
Destructive — publishes a `PURGE` marker with rollup header, removing all previous revisions of the key. Leaves a single purge entry.
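
The difference between delete and purge is easiest to see on a per-key history. The following is a simplified model under the assumption that history is a plain list of entries; `KeyHistory` is a hypothetical type and the `history` limit is ignored.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Operation {
    Put,
    Delete,
    Purge,
}

/// Hypothetical history of one key, oldest entry first.
#[derive(Default)]
pub struct KeyHistory {
    entries: Vec<(Operation, Vec<u8>)>,
}

impl KeyHistory {
    pub fn put(&mut self, value: &[u8]) {
        self.entries.push((Operation::Put, value.to_vec()));
    }

    /// Delete appends a marker; earlier revisions stay readable via history.
    pub fn delete(&mut self) {
        self.entries.push((Operation::Delete, Vec::new()));
    }

    /// Purge rolls up: all earlier revisions are dropped, one marker remains.
    pub fn purge(&mut self) {
        self.entries.clear();
        self.entries.push((Operation::Purge, Vec::new()));
    }

    /// `get` only sees a value if the latest entry is a put.
    pub fn get(&self) -> Option<&[u8]> {
        match self.entries.last() {
            Some((Operation::Put, value)) => Some(value.as_slice()),
            _ => None,
        }
    }

    pub fn revisions(&self) -> usize {
        self.entries.len()
    }
}
```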
### Watch
```rust
// Watch for new changes
let mut watch = kv.watch("key").await?;
// Watch with initial value
let mut watch = kv.watch_with_history("key").await?;
// Watch from specific revision
let mut watch = kv.watch_from_revision("key", 5).await?;
// Watch all keys
let mut watch = kv.watch_all().await?;
// Watch multiple keys (server_2_10+)
let mut watch = kv.watch_many(["foo", "bar"]).await?;
```
`Watch` implements `futures_util::Stream<Item = Result<Entry, WatcherError>>`.
Under the hood, each watch creates an **ordered push consumer** on the KV stream with:
- `filter_subject` matching `$KV.<bucket>.<key>`
- `replay_policy: Instant`
- Appropriate `deliver_policy`
### History
```rust
let mut history = kv.history("key").await?;
```
Returns a `Stream` of all past `Entry` values for a key (including deletes/purges).
### Keys
```rust
let mut keys = kv.keys().await?;
```
Returns a `Stream<String>` of all current keys. Uses a headers-only consumer with `LastPerSubject` deliver policy to efficiently scan the bucket.
## Entry Operations
```rust
#[derive(Debug, Clone, Copy, Eq, PartialEq)]
pub enum Operation {
Put, // Value was put
Delete, // Value was deleted (DEL marker)
Purge, // Value was purged (PURGE marker with rollup)
}
```
The operation type is determined from the `KV-Operation` header (`PUT`, `DEL`, `PURGE`) or the `Nats-Marker-Reason` header (fallback for server-generated markers like `MaxAge`, `Purge`, `Remove`).
## Key and Bucket Name Validation
```rust
// Bucket: alphanumeric, dash, underscore only
VALID_BUCKET_RE: \A[a-zA-Z0-9_-]+\z
// Key: alphanumeric, dash, slash, underscore, equals, dot; no leading/trailing dots
VALID_KEY_RE: \A[-/_=\.a-zA-Z0-9]+\z
```
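
The two rules can also be expressed without a regex engine. This is a std-only sketch with hypothetical function names; it applies the character classes from the patterns above plus the "no leading/trailing dots" rule from the comment.

```rust
fn is_name_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '-' || c == '_'
}

/// Bucket names: one or more of `[a-zA-Z0-9_-]`.
pub fn is_valid_bucket_name(name: &str) -> bool {
    !name.is_empty() && name.chars().all(is_name_char)
}

/// Keys: bucket characters plus `/`, `=`, `.`, with no dot at either end.
pub fn is_valid_key(key: &str) -> bool {
    !key.is_empty()
        && !key.starts_with('.')
        && !key.ends_with('.')
        && key.chars().all(|c| is_name_char(c) || matches!(c, '/' | '=' | '.'))
}
```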
## Bucket Status
```rust
let status: Status = kv.status().await?;
```
Wraps stream info to provide bucket-level statistics (bucket name, message count, byte count, etc.).
## Mirrored Buckets
When a bucket is configured as a mirror of another (potentially in a different account/domain):
- `prefix` is set to `$KV.<mirror_bucket>.`
- `put_prefix` may be set to the source bucket's API prefix for cross-domain writes
- `use_jetstream_prefix` is adjusted based on whether the mirror is in the same domain
## KV → Stream Config Mapping
When creating a KV bucket, the `Config` is converted to a JetStream `stream::Config`:
| Stream Config field | Value (KV Config field or constant) |
|---------------------|-------------------------------------|
| `name` | `"KV_<bucket>"` |
| `subjects` | `["$KV.<bucket>.>"]` |
| `max_messages_per_subject` | `history` (max 64) |
| `max_age` | `max_age` |
| `max_bytes` | `max_bytes` |
| `storage` | `storage` |
| `num_replicas` | `num_replicas` |
| `republish` | `republish` |
| `mirror` | `mirror` |
| `discard` | `DiscardPolicy::New` |
| `allow_direct` | `true` |
| `allow_rollup_hdrs` | `true` |
| `max_msg_size` | `max_value_size` |


@@ -1,245 +0,0 @@
# async-nats: Object Store
## Overview
The Object Store provides large-object storage built on JetStream. Objects are chunked and stored as messages in a JetStream stream, with metadata stored separately. The stream is named `OBJ_<bucket_name>`.
Object store support requires the `object-store` Cargo feature (which implies `jetstream` + `crypto`).
## ObjectStore Handle
```rust
#[derive(Clone)]
pub struct ObjectStore {
pub(crate) name: String,
pub(crate) stream: Stream,
}
```
## Object Store Config
```rust
#[derive(Debug, Default, Clone, Serialize, Deserialize)]
pub struct Config {
pub bucket: String,
pub description: Option<String>,
pub max_age: Duration,
pub max_bytes: i64,
pub storage: StorageType,
pub num_replicas: usize,
pub compression: bool,
pub placement: Option<Placement>,
}
```
## Creating/Accessing Object Stores
```rust
// Create
let bucket = jetstream.create_object_store(object_store::Config {
bucket: "my-bucket".to_string(),
..Default::default()
}).await?;
// Get existing
let bucket = jetstream.get_object_store("my-bucket").await?;
// Delete
jetstream.delete_object_store("my-bucket").await?;
```
## Object Store Operations
### Put
```rust
let info: ObjectInfo = bucket.put("file.txt", &mut async_read).await?;
```
The put operation:
1. Reads data from any `AsyncRead + Unpin` source in chunks (default 128KB)
2. Each chunk is published to `$O.<bucket>.C.<nuid>` (chunk subject)
3. SHA-256 digest is computed incrementally
4. After all chunks, metadata is published to `$O.<bucket>.M.<encoded_name>` with a rollup header
5. If the object previously existed, old chunks are purged
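
Steps 1 and 2 of the put operation can be sketched on an in-memory buffer. The helpers below are hypothetical; the 128KB default and the chunk subject shape come from the list above, while the digest step is omitted because the standard library has no SHA-256.

```rust
/// Default chunk size named in the text: 128KB.
pub const DEFAULT_CHUNK_SIZE: usize = 128 * 1024;

/// Subject all chunks of one object are published to.
pub fn chunk_subject(bucket: &str, nuid: &str) -> String {
    format!("$O.{bucket}.C.{nuid}")
}

/// Split a payload into chunks; the last one may be shorter.
pub fn chunk_payload(data: &[u8], chunk_size: usize) -> Vec<&[u8]> {
    assert!(chunk_size > 0, "chunk size must be non-zero");
    data.chunks(chunk_size).collect()
}
```

The number of chunks and the total size produced here correspond to the `chunks` and `size` fields recorded in the object's metadata.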
### Get
```rust
let mut object: Object = bucket.get("file.txt").await?;
```
Returns an `Object` that implements `tokio::io::AsyncRead`:
```rust
let mut bytes = Vec::new();
object.read_to_end(&mut bytes).await?;
```
On read, the Object:
1. Creates an ordered push consumer on `$O.<bucket>.C.<nuid>`
2. Streams chunk messages, feeding bytes to the reader
3. Verifies SHA-256 digest after the last chunk
4. If digest doesn't match, returns `io::ErrorKind::InvalidData`
### Delete
```rust
bucket.delete("file.txt").await?;
```
Marks the object as deleted in metadata (sets `deleted = true`, `chunks = 0`, `size = 0`) with a rollup, then purges all chunk messages.
### Info
```rust
let info: ObjectInfo = bucket.info("file.txt").await?;
```
Fetches the last metadata message for the object (from `$O.<bucket>.M.<encoded_name>`).
### Watch
```rust
let mut watcher = bucket.watch().await?;
let mut watcher = bucket.watch_with_history().await?;
```
Returns a `Stream<Item = Result<ObjectInfo, WatcherError>>`. Uses an ordered push consumer on `$O.<bucket>.M.>`.
### List
```rust
let mut list = bucket.list().await?;
```
Returns a `Stream<Item = Result<ObjectInfo, ListerError>>`. Lists all non-deleted objects. Uses `DeliverPolicy::All` to replay all metadata.
### Seal
```rust
bucket.seal().await?;
```
Sets the underlying stream's `sealed = true`, preventing any further modifications.
### Links
```rust
// Link to another object (same or different bucket)
let info = bucket.add_link("link_name", &object).await?;
// Link to another bucket
let info = bucket.add_bucket_link("link_name", "other_bucket").await?;
```
Links are followed automatically when `get()` is called (one level deep). Cannot link to a deleted object or create a link to a link.
### Update Metadata
```rust
bucket.update_metadata("object", object_store::UpdateMetadata {
name: "new_name".to_string(),
description: Some("updated description".to_string()),
..Default::default()
}).await?;
```
If the name changes, old metadata is purged and new metadata is published.
## Object Types
### ObjectInfo
```rust
#[derive(Debug, Clone, Serialize, Deserialize, Eq, PartialEq)]
pub struct ObjectInfo {
pub name: String,
pub description: Option<String>,
pub metadata: HashMap<String, String>,
pub headers: Option<HeaderMap>,
pub options: Option<ObjectOptions>,
pub bucket: String,
pub nuid: String,
pub size: usize,
pub chunks: usize,
pub modified: Option<DateTime>,
pub digest: Option<String>, // Format: "SHA-256=<base64url-digest>"
pub deleted: bool,
}
```
### ObjectMetadata
```rust
#[derive(Debug, Default, Clone, Serialize, Deserialize, Eq, PartialEq)]
pub struct ObjectMetadata {
pub name: String,
pub description: Option<String>,
pub chunk_size: Option<usize>,
pub metadata: HashMap<String, String>,
pub headers: Option<HeaderMap>,
}
```
### ObjectLink
```rust
#[derive(Debug, Default, Clone, Serialize, Deserialize, Eq, PartialEq)]
pub struct ObjectLink {
pub name: Option<String>, // None = bucket link, Some = object link
pub bucket: String,
}
```
### Object
```rust
pub struct Object {
pub info: ObjectInfo,
remaining_bytes: VecDeque<u8>,
has_pending_messages: bool,
digest: Option<Sha256>,
subscription: Option<crate::jetstream::consumer::push::Ordered>,
subscription_future: Option<BoxFuture<'static, Result<Ordered, StreamError>>>,
stream: Stream,
}
```
Implements `tokio::io::AsyncRead`. Lazy-creates the consumer on first read.
## Subject Naming Convention
| Purpose | Subject Pattern |
|---------|----------------|
| Chunks | `$O.<bucket>.C.<nuid>` |
| Metadata | `$O.<bucket>.M.<base64url-encoded-name>` |
Object names are base64url-encoded in metadata subjects to allow arbitrary characters (the raw name might contain characters invalid in NATS subjects).
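
To make the metadata subject concrete, here is a hand-rolled base64url encoder using only the standard library. It is a sketch: `base64url` and `metadata_subject` are hypothetical helpers, and emitting `=` padding is an assumption about the encoding variant the crate uses.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// URL-safe base64 with padding (assumed variant).
pub fn base64url(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the rest.
        let b1 = *chunk.get(1).unwrap_or(&0);
        let b2 = *chunk.get(2).unwrap_or(&0);
        let n = (u32::from(chunk[0]) << 16) | (u32::from(b1) << 8) | u32::from(b2);
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Metadata subject for an object name.
pub fn metadata_subject(bucket: &str, name: &str) -> String {
    format!("$O.{bucket}.M.{}", base64url(name.as_bytes()))
}
```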
## Validation
```rust
// Bucket: alphanumeric, dash, underscore only
BUCKET_NAME_RE: \A[a-zA-Z0-9_-]+\z
// Object name: alphanumeric, dash, slash, underscore, equals, dot; no leading/trailing dots
OBJECT_NAME_RE: \A[-/_=\.a-zA-Z0-9]+\z
```
## Data Integrity
The object store uses SHA-256 hashing (from the `crypto` module) to verify data integrity:
1. On `put()`: SHA-256 is computed incrementally as chunks are read. The digest is stored in `ObjectInfo.digest` as `"SHA-256=<base64url>"`.
2. On `get()` (via `AsyncRead`): SHA-256 is verified after the last chunk is read. If the computed digest doesn't match the stored digest, `io::ErrorKind::InvalidData` is returned.
```rust
// crypto module
pub(crate) struct Sha256 { ... }
impl Sha256 {
pub fn new() -> Self;
pub fn update(&mut self, data: &[u8]);
pub fn finish(self) -> [u8; 32];
}
```


@@ -1,272 +0,0 @@
# async-nats: Service API
## Overview
The Service API provides a microservice request/reply pattern with built-in service discovery, health checking, and statistics. It follows the [NATS Micro v1 specification](https://github.com/nats-io/nats-architecture-design/blob/main/adr/ADR-33.md).
The `service` feature is required.
## Service
```rust
#[derive(Debug)]
pub struct Service {
endpoints_state: Arc<Mutex<Endpoints>>,
info: Info,
client: Client,
handle: JoinHandle<Result<(), Error>>,
shutdown_tx: Sender<()>,
subjects: Arc<Mutex<Vec<String>>>,
queue_group: String,
}
```
## Creating a Service
Via the `ServiceExt` trait on `Client`:
```rust
use async_nats::service::ServiceExt;
// Builder pattern
let mut service = client
.service_builder()
.description("product service")
.stats_handler(|endpoint, stats| serde_json::json!({ "endpoint": endpoint }))
.metadata(HashMap::from([("version".into(), "v2".into())]))
.queue_group("products-group")
.start("products", "1.0.0")
.await?;
// Direct config
let mut service = client
.add_service(service::Config {
name: "products".to_string(),
version: "1.0.0".to_string(),
description: Some("product service".to_string()),
stats_handler: None,
metadata: None,
queue_group: None,
})
.await?;
```
Service name must match `^[A-Za-z0-9\-_]+$`. Version must be valid SemVer.
## Service Verbs
Every service automatically subscribes to three verb subjects for discovery and monitoring:
| Verb | Subject Pattern | Purpose |
|------|----------------|---------|
| PING | `$SRV.PING`, `$SRV.PING.<name>`, `$SRV.PING.<name>.<id>` | Lightweight health check |
| INFO | `$SRV.INFO.<name>`, `$SRV.INFO.<name>.<id>` | Service metadata |
| STATS | `$SRV.STATS.<name>`, `$SRV.STATS.<name>.<id>` | Service + endpoint statistics |
A background task handles these verb requests and responds with JSON payloads.
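
The subject patterns in the table can be generated mechanically. `verb_subjects` is a hypothetical helper. Note that the table lists the unscoped form only for PING; the sketch produces all three scopes for any verb, which is how I read the Micro specification, so treat that as an assumption.

```rust
/// Subjects for one verb: all services, one service name, one instance.
pub fn verb_subjects(verb: &str, name: &str, id: &str) -> [String; 3] {
    [
        format!("$SRV.{verb}"),
        format!("$SRV.{verb}.{name}"),
        format!("$SRV.{verb}.{name}.{id}"),
    ]
}
```

A discovery client would publish a request to the broadest subject and collect one reply per running instance.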
## Service Config
```rust
#[derive(Serialize, Deserialize, Debug)]
pub struct Config {
pub name: String,
pub description: Option<String>,
pub version: String,
pub stats_handler: Option<StatsHandler>,
pub metadata: Option<HashMap<String, String>>,
pub queue_group: Option<String>,
}
```
## Adding Endpoints
```rust
// Simple endpoint
let mut endpoint = service.endpoint("get-products").await?;
// Endpoint with custom name and metadata
let endpoint = service
.endpoint_builder()
.name("api")
.metadata(HashMap::from([("auth".into(), "required".into())]))
.queue_group("custom-group")
.add("products")
.await?;
// Grouped endpoints
let v1 = service.group("v1");
let products = v1.endpoint("products").await?;
let orders = v1.endpoint("orders").await?;
// Nested groups
let v1_api = service.group("api").group("v1");
```
## Endpoint
```rust
pub struct Endpoint {
requests: Subscriber,
stats: Arc<Mutex<Endpoints>>,
client: Client,
endpoint: String,
shutdown: Option<ShutdownRx>,
shutdown_future: Option<ShutdownReceiverFuture>,
}
```
Implements `futures_util::Stream<Item = Request>`.
```rust
while let Some(request) = endpoint.next().await {
request.respond(Ok("response data".into())).await?;
}
```
## Service Request
```rust
#[derive(Debug)]
pub struct Request {
issued: Instant,
client: Client,
pub message: Message,
endpoint: String,
stats: Arc<Mutex<Endpoints>>,
}
```
### Responding
```rust
// Success
request.respond(Ok("result".into())).await?;
// Success with headers
request.respond_with_headers(Ok("result".into()), headers).await?;
// Error
request.respond(Err(service::error::Error {
code: 500,
status: "internal error".to_string(),
})).await?;
```
Error responses always include `Nats-Service-Error` and `Nats-Service-Error-Code` headers. If user-supplied headers contain these headers, they are overridden by the error values.
### Stats Tracking
Each response updates endpoint statistics:
- `requests` — total requests
- `processing_time` — cumulative processing time
- `average_processing_time` — average per request
- `errors` — error count
- `last_error` — last error details
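
How these counters evolve per response can be sketched as follows. `EndpointStats` and `record` are hypothetical names; the real crate stores a structured error rather than a string.

```rust
use std::time::Duration;

/// Hypothetical accumulator for one endpoint.
#[derive(Debug, Default)]
pub struct EndpointStats {
    pub requests: u64,
    pub errors: u64,
    pub processing_time: Duration,
    pub average_processing_time: Duration,
    pub last_error: Option<String>,
}

impl EndpointStats {
    /// Record one handled request and, optionally, the error it produced.
    pub fn record(&mut self, elapsed: Duration, error: Option<&str>) {
        self.requests += 1;
        self.processing_time += elapsed;
        // Average is cumulative time divided by the request count.
        let count = u32::try_from(self.requests).unwrap_or(u32::MAX);
        self.average_processing_time = self.processing_time / count;
        if let Some(message) = error {
            self.errors += 1;
            self.last_error = Some(message.to_string());
        }
    }
}
```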
## Service Info Types
### PingResponse
```rust
pub struct PingResponse {
pub kind: String, // "io.nats.micro.v1.ping_response"
pub name: String,
pub id: String,
pub version: String,
pub metadata: HashMap<String, String>,
}
```
### Info
```rust
pub struct Info {
pub kind: String, // "io.nats.micro.v1.info_response"
pub name: String,
pub id: String,
pub description: String,
pub version: String,
pub metadata: HashMap<String, String>,
pub endpoints: Vec<endpoint::Info>,
}
```
### Stats
```rust
pub struct Stats {
pub kind: String, // "io.nats.micro.v1.stats_response"
pub name: String,
pub id: String,
pub version: String,
pub started: DateTime,
pub endpoints: Vec<endpoint::Stats>,
}
```
### Endpoint Stats
```rust
// In module `endpoint` (referred to elsewhere as `endpoint::Stats`)
pub struct Stats {
pub name: String,
pub subject: String,
pub queue_group: String,
pub data: Option<serde_json::Value>, // Custom data from stats_handler
pub errors: u64,
pub processing_time: Duration,
pub average_processing_time: Duration,
pub requests: u64,
pub last_error: Option<error::Error>,
}
```
## Service Groups
Groups provide subject prefixing for endpoint organization:
```rust
let service = client.service_builder().start("api", "1.0.0").await?;
// Endpoints subscribe to "products" and "orders"
let products = service.endpoint("products").await?;
let orders = service.endpoint("orders").await?;
// Grouped: subscribe to "v1.products" and "v1.orders"
let v1 = service.group("v1");
let products = v1.endpoint("products").await?;
let orders = v1.endpoint("orders").await?;
// Nested: subscribe to "api.v1.products"
let api_v1 = service.group("api").group("v1");
let products = api_v1.endpoint("products").await?;
```
Each group can have its own queue group:
```rust
let v1 = service.group_with_queue_group("v1", "v1-workers");
```
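
The prefixing rule behind groups is plain string joining, as this std-only sketch shows. `Group` here is a hypothetical model that only computes subjects; it does not subscribe to anything.

```rust
/// Hypothetical group model: carries the accumulated subject prefix.
pub struct Group {
    prefix: String,
}

impl Group {
    /// The service root has no prefix.
    pub fn root() -> Self {
        Group { prefix: String::new() }
    }

    /// Nest a group: the child's prefix is the parent's prefix plus `name`.
    pub fn group(&self, name: &str) -> Group {
        Group { prefix: self.subject(name) }
    }

    /// Subject an endpoint added to this group would listen on.
    pub fn subject(&self, endpoint: &str) -> String {
        if self.prefix.is_empty() {
            endpoint.to_string()
        } else {
            format!("{}.{}", self.prefix, endpoint)
        }
    }
}
```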
## Stopping a Service
```rust
service.stop().await?;
```
Sends a shutdown signal and aborts the verb-handling task. Other service instances with the same name continue running.
## Resetting Stats
```rust
service.reset().await?;
```
Resets all endpoint statistics (errors, processing time, requests, average processing time) to zero.
## Querying Service State
```rust
let stats: HashMap<String, endpoint::Stats> = service.stats().await?;
let info: Info = service.info().await?;
```


@@ -1,312 +0,0 @@
# async-nats: Quick Reference
## Connection
```rust
// Basic connect
let client = async_nats::connect("demo.nats.io").await?;
// With options
let client = async_nats::ConnectOptions::new()
.require_tls(true)
.name("my-service")
.ping_interval(Duration::from_secs(10))
.request_timeout(Some(Duration::from_secs(5)))
.connect("demo.nats.io")
.await?;
// Multiple servers
let client = async_nats::connect(vec![
"nats://server1:4222".parse()?,
"nats://server2:4222".parse()?,
]).await?;
// Background connect
let client = async_nats::ConnectOptions::new()
.retry_on_initial_connect()
.connect("demo.nats.io")
.await?;
```
## Core NATS: Publish
```rust
// Simple publish
client.publish("subject", "payload".into()).await?;
// With reply-to
client.publish_with_reply("subject", "reply-to", "payload".into()).await?;
// With headers
let mut headers = HeaderMap::new();
headers.insert("X-Custom", "value");
client.publish_with_headers("subject", headers, "payload".into()).await?;
// Full control
client.publish_with_reply_and_headers("subject", "reply-to", headers, "payload".into()).await?;
// Flush (ensure all published messages are sent)
client.flush().await?;
```
## Core NATS: Subscribe
```rust
use futures_util::StreamExt;
// Basic subscribe
let mut subscriber = client.subscribe("subject").await?;
// Queue group
let mut subscriber = client.queue_subscribe("subject", "group".into()).await?;
// Receive messages (Subscriber implements Stream)
while let Some(message) = subscriber.next().await {
println!("subject: {}, payload: {:?}", message.subject, message.payload);
}
// Unsubscribe
subscriber.unsubscribe().await?;
// Unsubscribe after N messages
subscriber.unsubscribe_after(10).await?;
// Drain (wait for in-flight, then unsubscribe)
subscriber.drain().await?;
```
## Core NATS: Request/Reply
```rust
// Simple request (uses default timeout)
let response = client.request("subject", "data".into()).await?;
// With custom timeout and headers
let request = async_nats::Request::new()
.payload("data".into())
.timeout(Some(Duration::from_secs(5)))
.headers(headers);
let response = client.send_request("subject", request).await?;
// Custom inbox (bypasses multiplexer)
let request = async_nats::Request::new()
.payload("data".into())
.inbox("custom-inbox".into());
let response = client.send_request("subject", request).await?;
```
## Message Structure
```rust
pub struct Message {
pub subject: Subject,
pub reply: Option<Subject>,
pub payload: Bytes,
pub headers: Option<HeaderMap>,
pub status: Option<StatusCode>,
pub description: Option<String>,
pub length: usize,
}
```
## JetStream
```rust
let jetstream = async_nats::jetstream::new(client);
// Publish (returns ack future)
let ack = jetstream.publish("events", "data".into()).await?;
let publish_ack = ack.await?;
// Stream management
let stream = jetstream.create_stream(stream::Config {
name: "events".to_string(),
subjects: vec!["events.>".to_string()],
max_messages: 10_000,
..Default::default()
}).await?;
let stream = jetstream.get_stream("events").await?;
let stream = jetstream.get_or_create_stream(config).await?;
jetstream.delete_stream("events").await?;
jetstream.update_stream(config).await?;
// Consumer management
let consumer: PullConsumer = stream.create_consumer(pull::Config {
durable_name: Some("my-consumer".to_string()),
..Default::default()
}).await?;
// Pull consumer: fetch messages
let mut messages = consumer.messages().await?;
while let Some(message) = messages.next().await {
let message = message?;
message.ack().await?;
}
// Push consumer (ordered)
let consumer = stream.create_consumer(push::OrderedConfig {
deliver_subject: client.new_inbox(),
filter_subject: "events.>".to_string(),
..Default::default()
}).await?;
let mut messages = consumer.messages().await?;
```
## Key-Value Store
```rust
let kv = jetstream.create_key_value(kv::Config {
bucket: "my-bucket".to_string(),
history: 10,
..Default::default()
}).await?;
// CRUD
let revision = kv.put("key", "value".into()).await?;
let revision = kv.create("key", "value".into()).await?;
let value: Option<Bytes> = kv.get("key").await?;
let entry: Option<Entry> = kv.entry("key").await?;
let revision = kv.update("key", "new-value".into(), revision).await?;
kv.delete("key").await?;
kv.purge("key").await?;
// Watch
let mut watch = kv.watch("key").await?;
let mut watch_all = kv.watch_all().await?;
// History & Keys
let mut history = kv.history("key").await?;
let mut keys = kv.keys().await?;
```
## Object Store
```rust
let bucket = jetstream.create_object_store(object_store::Config {
bucket: "files".to_string(),
..Default::default()
}).await?;
// Put (from any AsyncRead)
let info = bucket.put("file.txt", &mut file).await?;
// Get (returns AsyncRead)
let mut object = bucket.get("file.txt").await?;
let mut bytes = Vec::new();
object.read_to_end(&mut bytes).await?;
// Info, delete, list, watch
let info = bucket.info("file.txt").await?;
bucket.delete("file.txt").await?;
let mut list = bucket.list().await?;
let mut watch = bucket.watch().await?;
```
## Service API
```rust
use async_nats::service::ServiceExt;
use futures_util::StreamExt;
let mut service = client
.service_builder()
.description("product service")
.start("products", "1.0.0")
.await?;
let mut endpoint = service.endpoint("get").await?;
while let Some(request) = endpoint.next().await {
request.respond(Ok("result".into())).await?;
}
```
## Client State & Events
```rust
// Check connection state
match client.connection_state() {
State::Connected => {},
State::Disconnected => {},
State::Pending => {},
}
// Get server info
let info: ServerInfo = client.server_info();
println!("max_payload: {}", info.max_payload);
println!("jetstream: {}", info.jetstream);
// Get statistics
let stats = client.statistics();
println!("in_messages: {}", stats.in_messages.load(Ordering::Relaxed));
// Force reconnect
client.force_reconnect().await?;
// Server pool management
client.set_server_pool(["nats://s1:4222".parse()?, "nats://s2:4222".parse()?].as_slice()).await?;
let pool = client.server_pool().await?;
// Drain
client.drain().await?;
```
## Error Handling Patterns
```rust
// Connect errors
match async_nats::connect("server").await {
Err(e) => match e.kind() {
ConnectErrorKind::TimedOut => {},
ConnectErrorKind::Authentication => {},
ConnectErrorKind::AuthorizationViolation => {},
_ => {},
},
Ok(client) => {},
}
// Publish errors
match client.publish("subject", "data".into()).await {
Err(e) => match e.kind() {
PublishErrorKind::MaxPayloadExceeded => {},
PublishErrorKind::InvalidSubject => {},
PublishErrorKind::Send => {},
_ => {},
},
_ => {},
}
// Request errors
match client.request("subject", "data".into()).await {
Err(e) => match e.kind() {
RequestErrorKind::TimedOut => {},
RequestErrorKind::NoResponders => {},
RequestErrorKind::InvalidSubject => {},
RequestErrorKind::MaxPayloadExceeded => {},
_ => {},
},
Ok(message) => {},
}
```
## Feature Flag Quick Reference
| Feature | Enables | Default |
|---------|---------|---------|
| `jetstream` | JetStream streams, consumers, publish | ✅ |
| `kv` | Key-Value store (implies `jetstream`) | ✅ |
| `object-store` | Object store (implies `jetstream` + `crypto`) | ✅ |
| `service` | Service API | ✅ |
| `nkeys` | NKey/JWT authentication | ✅ |
| `nuid` | NUID-based ID generation | ✅ |
| `crypto` | SHA-256 (for object store) | ✅ |
| `websockets` | WebSocket transport | ✅ |
| `ring` | `ring` TLS crypto backend | ✅ |
| `aws-lc-rs` | `aws-lc-rs` TLS crypto backend | ❌ |
| `fips` | FIPS mode via `aws-lc-rs` | ❌ |
| `chrono` | `chrono` datetime instead of `time` | ❌ |
| `server_2_10` | Server 2.10+ features | ✅ |
| `server_2_11` | Server 2.11+ features | ✅ |
| `server_2_12` | Server 2.12+ features | ✅ |
| `server_2_14` | Server 2.14+ features | ✅ |


@@ -1,23 +0,0 @@
# async-nats Reference Documentation
**Crate**: `async-nats` v0.49.1
**Source**: https://github.com/nats-io/nats.rs (`async-nats/` directory)
**License**: Apache-2.0
## Contents
| # | File | Topic |
|---|------|-------|
| 01 | [Overview & Architecture](01-overview-and-architecture.md) | Crate overview, feature flags, source structure, core connection model, dependency graph |
| 02 | [Key Types & Traits](02-key-types-and-traits.md) | `Client`, `Subscriber`, `Message`, `Request`, `ServerInfo`, `ConnectInfo`, `Statistics`, subject/header types, event/state types, error types, trait definitions |
| 03 | [Protocol & Wire Format](03-protocol-and-wire-format.md) | NATS wire protocol (PUB/HPUB/SUB/UNSUB/PING/PONG, MSG/HMSG/INFO/ERR), parser/serializer internals, vectored I/O, WebSocket transport, connection lifecycle, reconnection |
| 04 | [Connection Management](04-connection-management.md) | `ConnectOptions` builder, authentication methods, TLS configuration, reconnection callbacks, event callbacks, `ConnectionHandler` internals, multiplexer, server pool management |
| 05 | [JetStream](05-jetstream.md) | `Context` and `ContextBuilder`, streams, consumers (pull/push/ordered), JetStream messages and acks, publish with ack futures, pagination |
| 06 | [Key-Value Store](06-key-value-store.md) | KV `Store` handle, bucket CRUD, put/get/create/update/delete/purge, watch/history/keys, entry operations, mirrored buckets, KV-to-stream mapping |
| 07 | [Object Store](07-object-store.md) | `ObjectStore` handle, put/get/delete/watch/list/seal, links, `Object` (AsyncRead), chunking, SHA-256 integrity, subject naming |
| 08 | [Service API](08-service-api.md) | `Service` and `ServiceBuilder`, endpoints, groups, verb subscriptions (PING/INFO/STATS), request/respond with stats tracking |
| 09 | [Quick Reference](09-quick-reference.md) | Code examples for all major operations, feature flag reference |
## How This Documentation Was Produced
All information was derived by reading the source code of the `async-nats` crate at version 0.49.1 from the `nats.rs` repository. No external documentation was consulted — this is a ground-up reference based purely on the source.


@@ -1,200 +0,0 @@
# nats.rs: Overview and Architecture
**Version**: async-nats 0.49.1, nats-server 0.1.0
**Repository**: https://github.com/nats-io/nats.rs
**License**: Apache-2.0
**Rust Edition**: 2021
**MSRV**: 1.88.0
**Protocol**: NATS Client Protocol (INFO/CONNECT/PUB/SUB/UNSUB/PING/PONG)
## What It Is
The `nats.rs` repository contains the **official Rust client for NATS.io**, a high-performance messaging system. The active crate is **`async-nats`** — a fully async, Tokio-based NATS client. The deprecated `nats` crate (synchronous) receives security fixes only.
The `nats-server` crate is **not** an implementation of the NATS server. It is a **test harness** that spawns the Go-based `nats-server` binary for integration tests. The actual NATS server is a separate Go project at `github.com/nats-io/nats-server`.
Core design decisions:
- **Fully async** — all I/O is Tokio-based with async/await throughout
- **Cloneable Client handle** — `Client` is cheap to clone (Arc internals), all protocol work happens in a single `ConnectionHandler` task
- **Channel-based internal communication** — `Client` sends `Command` variants via `mpsc` channel to `ConnectionHandler`
- **Multiplexed request-reply** — one internal subscription handles all request-response patterns via inbox token routing
- **Automatic reconnection** — exponential backoff with configurable server pool rotation
- **Feature-gated subsystems** — JetStream, KV, Object Store, Service API, NKeys, WebSockets, and crypto backends are all optional
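
The cloneable-handle and command-channel decisions can be illustrated with a small standard-library sketch. This is not the crate's code: `Handle`, `Cmd`, and `spawn_handler` are hypothetical names, and a plain thread with `std::sync::mpsc` stands in for the Tokio task and its async channel.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical command type, standing in for the crate's internal `Command`.
enum Cmd {
    Publish { subject: String, payload: Vec<u8> },
    /// Asks the handler how many publishes it has processed so far.
    Flush { observer: Sender<usize> },
}

/// Hypothetical cloneable handle: cloning only clones the channel sender.
#[derive(Clone)]
struct Handle {
    sender: Sender<Cmd>,
}

impl Handle {
    fn publish(&self, subject: &str, payload: &[u8]) {
        let cmd = Cmd::Publish {
            subject: subject.to_string(),
            payload: payload.to_vec(),
        };
        // Sending fails only if the handler has already exited.
        let _ = self.sender.send(cmd);
    }

    fn flush(&self) -> Option<usize> {
        let (tx, rx) = mpsc::channel();
        self.sender.send(Cmd::Flush { observer: tx }).ok()?;
        rx.recv().ok()
    }
}

/// Spawns the single worker that owns all state; handles only send commands.
fn spawn_handler() -> (Handle, JoinHandle<()>) {
    let (sender, receiver) = mpsc::channel::<Cmd>();
    let worker = thread::spawn(move || {
        let mut published = 0usize;
        // The loop ends once every `Handle` clone has been dropped.
        for cmd in receiver {
            match cmd {
                Cmd::Publish { subject, payload } => {
                    // A real handler would serialize this to the socket.
                    let _ = (subject, payload);
                    published += 1;
                }
                Cmd::Flush { observer } => {
                    let _ = observer.send(published);
                }
            }
        }
    });
    (Handle { sender }, worker)
}
```

Because every clone feeds the same queue, commands from all handles are processed in one place and in arrival order, which is what lets the handler own the socket without locks.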
## Workspace Structure
```
nats.rs/
├── async-nats/                # Primary crate: async NATS client
│   ├── src/
│   │   ├── lib.rs             # Entry point: connect(), ServerOp, ClientOp, Command, ConnectionHandler, Subscriber
│   │   ├── client.rs          # Client handle: publish, subscribe, request, flush, drain
│   │   ├── connection.rs      # Low-level I/O: protocol parsing, read/write buffers
│   │   ├── connector.rs       # Connection establishment, reconnection, server pool
│   │   ├── options.rs         # ConnectOptions builder
│   │   ├── auth.rs            # Auth struct (credentials container)
│   │   ├── auth_utils.rs      # Credential file parsing (.creds files)
│   │   ├── error.rs           # Generic Error<Kind> type
│   │   ├── header.rs          # HeaderMap: NATS message headers
│   │   ├── subject.rs         # Subject type, ToSubject trait
│   │   ├── status.rs          # StatusCode (100-999 NATS protocol codes)
│   │   ├── message.rs         # Message and OutboundMessage types
│   │   ├── tls.rs             # TLS configuration helpers
│   │   ├── crypto.rs          # Crypto feature support
│   │   ├── id_generator.rs    # NUID/rand-based unique ID generation
│   │   ├── datetime.rs        # DateTime helpers for JetStream/Service
│   │   ├── jetstream/         # JetStream API (feature-gated)
│   │   │   ├── mod.rs         # Module root, jetstream::new(), with_domain()
│   │   │   ├── context.rs     # JetStream Context: streams, publishing, consumers
│   │   │   ├── stream.rs      # Stream management, Config, Info, Consumer creation
│   │   │   ├── consumer/      # Pull, Push, Ordered consumers
│   │   │   ├── message.rs     # JetStream Message with ack methods
│   │   │   ├── publish.rs     # PublishAck
│   │   │   ├── response.rs    # Response wrapper
│   │   │   ├── errors.rs      # JetStream error codes
│   │   │   ├── account.rs     # Account info
│   │   │   ├── kv/            # Key-Value store (feature: "kv")
│   │   │   └── object_store/  # Object store (feature: "object-store")
│   │   └── service/           # Service API (feature-gated)
│   │       ├── mod.rs         # Service, ServiceBuilder
│   │       ├── endpoint.rs    # Endpoint handling
│   │       └── error.rs       # Service errors
│   ├── tests/                 # Integration tests (require nats-server binary)
│   ├── examples/              # Runnable examples
│   └── benches/               # Criterion benchmarks
├── nats-server/               # Test harness: spawns Go nats-server for tests
│   ├── src/lib.rs             # Server struct, run_server(), run_cluster()
│   └── configs/               # Server config files for tests
│       └── jetstream.conf
└── nats/                      # DEPRECATED sync client, do not modify
```
## Architecture Diagram
```
┌──────────────────────────────────────────────────────────────────┐
│                        Application Layer                         │
│                                                                  │
│    ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐    │
│    │ JetStream │  │    KV     │  │  Object   │  │  Service  │    │
│    │  Context  │  │   Store   │  │   Store   │  │    API    │    │
│    └─────┬─────┘  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘    │
│          └──────────────┴──────┬───────┴──────────────┘          │
│                                │                                 │
│                        ┌───────┴───────┐                         │
│                        │    Client     │  Cloneable handle       │
│                        │ (mpsc::Sender)│                         │
│                        └───────┬───────┘                         │
│                                │  Command channel                │
└────────────────────────────────┼─────────────────────────────────┘
┌────────────────────────────────┼─────────────────────────────────┐
│                        ConnectionHandler                         │
│                       (single Tokio task)                        │
│                                │                                 │
│    ┌───────────────┐  ┌────────┴────────┐  ┌─────────────────┐   │
│    │ Subscriptions │  │   Multiplexer   │  │ Flush Observers │   │
│    │    HashMap    │  │ (request-reply) │  │                 │   │
│    └───────┬───────┘  └────────┬────────┘  └─────────────────┘   │
│            └───────────────────┤                                 │
│                        ┌───────┴───────┐                         │
│                        │   Connector   │  Server pool, reconnect │
│                        └───────┬───────┘                         │
│                                │                                 │
│                        ┌───────┴───────┐                         │
│                        │  Connection   │  Protocol I/O           │
│                        │ (read/write)  │  ServerOp / ClientOp    │
│                        └───────┬───────┘                         │
└────────────────────────────────┼─────────────────────────────────┘
                          ┌──────┴──────┐
                          │ NATS Server │  (Go binary, TCP/TLS/WS)
                          └─────────────┘
```
## Key Concepts
### Subject
NATS uses subject strings for message addressing. A `Subject` is a validated, immutable, UTF-8 string backed by `Bytes`. Subjects use dot-delimited tokens (e.g., `events.data.sensor1`). Wildcards `*` (single token) and `>` (multi-token suffix) are supported for subscriptions.
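
To make the wildcard rules concrete, here is a minimal matching sketch. The server performs the actual matching; `subject_matches` is a hypothetical helper written only to illustrate the token rules, treating `*` as exactly one token and `>` as one or more trailing tokens.

```rust
/// Returns true if `subject` matches `pattern`.
/// `*` matches exactly one token; `>` matches one or more trailing tokens
/// and is only honored as the last pattern token.
fn subject_matches(pattern: &str, subject: &str) -> bool {
    let mut pat = pattern.split('.');
    let mut sub = subject.split('.');
    loop {
        match (pat.next(), sub.next()) {
            // `>` needs at least one remaining subject token.
            (Some(">"), Some(_)) => return pat.next().is_none(),
            (Some("*"), Some(tok)) => {
                if tok.is_empty() {
                    return false;
                }
            }
            (Some(p), Some(s)) => {
                if p != s {
                    return false;
                }
            }
            // Both sides exhausted at the same time: full match.
            (None, None) => return true,
            // One side ran out before the other.
            _ => return false,
        }
    }
}
```

For instance, `events.*` does not match `events.data.sensor1` because `*` covers a single token, while `events.>` does.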
### ClientOp / ServerOp
The NATS client-server protocol is text-based with binary payloads. The client sends `ClientOp` variants (CONNECT, PUB/HPUB, SUB, UNSUB, PING, PONG) and receives `ServerOp` variants (INFO, MSG/HMSG, +OK, -ERR, PING, PONG).
### Command
Internal command type sent from `Client` to `ConnectionHandler` via `mpsc` channel. Includes Publish, Request, Subscribe, Unsubscribe, Flush, Drain, Reconnect, SetServerPool, ServerPool.
### Multiplexer
A single internal subscription (SID 0) that routes all request-reply responses. When a `Request` is made, a unique inbox token is registered in the multiplexer's sender map, and the response is dispatched to the corresponding `oneshot::Sender`.
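
The routing idea can be sketched with standard-library channels. This is an illustration under assumptions, not the crate's implementation: `Mux` is a hypothetical type, a counter stands in for the unique token generator, and `std::sync::mpsc` replaces `oneshot`.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Receiver, Sender};

/// Hypothetical stand-in for the multiplexer: maps inbox tokens to reply channels.
struct Mux {
    prefix: String, // e.g. "_INBOX.abc."
    senders: HashMap<String, Sender<Vec<u8>>>,
    next_token: u64, // assumption: a counter instead of a random token
}

impl Mux {
    fn new(prefix: &str) -> Self {
        Mux {
            prefix: prefix.to_string(),
            senders: HashMap::new(),
            next_token: 0,
        }
    }

    /// Registers a pending request; returns its reply subject and receiver.
    fn register(&mut self) -> (String, Receiver<Vec<u8>>) {
        let token = self.next_token.to_string();
        self.next_token += 1;
        let (tx, rx) = mpsc::channel();
        self.senders.insert(token.clone(), tx);
        (format!("{}{}", self.prefix, token), rx)
    }

    /// Routes a reply by subject; returns false if no request is waiting.
    fn dispatch(&mut self, subject: &str, payload: Vec<u8>) -> bool {
        let Some(token) = subject.strip_prefix(self.prefix.as_str()) else {
            return false;
        };
        // Removing the entry makes each token single-use, like a oneshot.
        match self.senders.remove(token) {
            Some(tx) => tx.send(payload).is_ok(),
            None => false,
        }
    }
}
```

One wildcard subscription on the prefix is enough for any number of in-flight requests, since replies are told apart only by the trailing token.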
### ConnectionHandler
A single Tokio task that drives all protocol I/O. It processes server operations from `Connection`, handles client commands from the `mpsc` channel, manages subscriptions, maintains ping/pong health, and orchestrates reconnection.
## nats-server Test Harness
The `nats-server` crate provides utilities for launching real NATS server instances in tests:
- `run_server(cfg)` — starts a single server with optional config
- `run_cluster(cfg)` — starts a 3-node cluster
- `Server` struct — holds the child process, cleans up on drop
- `Server::restart()` — kills and restarts the server process
- `Server::client_url()` — reads the INFO from the server to get the client URL
- `set_lame_duck_mode(server)` — sends LDM signal to the server process
The test harness spawns the Go `nats-server` binary via `std::process::Command`, using dynamic ports for parallel test execution. It auto-discovers the client URL by connecting to the server's TCP port and parsing the `INFO` JSON. On `Drop`, it kills the child process and cleans up JetStream storage directories.
## Feature Flags
```toml
# Default: everything enabled
default = ["server_2_10", "server_2_11", "server_2_12", "server_2_14",
"service", "ring", "jetstream", "nkeys", "crypto",
"object-store", "kv", "websockets", "nuid"]
# Subsystems
jetstream # JetStream API
kv # Key-Value store (requires jetstream)
object-store # Object store (requires jetstream + crypto)
service # Service API
# Crypto backends (pick one)
ring # Default crypto backend
aws-lc-rs # Alternative backend
fips # FIPS mode (requires aws-lc-rs)
# Auth
nkeys # NKey authentication
# Other
nuid # NUID-based ID generation (falls back to rand)
crypto # SHA-256 digests (used by object store)
websockets # WebSocket transport
experimental # Experimental features
# Server version markers (enable version-specific API fields)
server_2_10
server_2_11
server_2_12
server_2_14
```
## Dependencies (Key)
| Dependency | Purpose |
|-----------|---------|
| `tokio` | Async runtime (macros, rt, net, sync, time, io-util) |
| `bytes` | Zero-copy byte buffers for payloads |
| `tokio-rustls` | TLS via rustls |
| `rustls-native-certs` | Load native TLS root certificates |
| `serde` / `serde_json` | JSON serialization for protocol messages and JetStream API |
| `memchr` | Fast CRLF search for protocol parsing |
| `futures-util` | Stream trait, Sink trait, StreamExt |
| `tracing` | Structured logging |
| `thiserror` | Error type derivation |
| `url` | URL parsing for server addresses |
| `portable-atomic` | Portable atomic operations |
## References
- [NATS Protocol Specification](https://docs.nats.io/reference/reference-protocols/nats-protocol)
- [NATS JetStream Documentation](https://docs.nats.io/nats-concepts/jetstream)
- [async-nats on docs.rs](https://docs.rs/async-nats)


@@ -1,281 +0,0 @@
# NATS Client Protocol and Wire Format
**Protocol**: NATS Client Protocol v1 (with dynamic reconfiguration)
**Transport**: TCP (port 4222), TLS, WebSocket (ws/wss)
## Protocol Overview
The NATS client-server protocol is a simple, text-based protocol with binary payload support. All operations are terminated with `\r\n`. Messages carry their payload length, allowing efficient binary data transfer.
### Connection Lifecycle
```
Client                                         Server
│                                                   │
│◄───────────── INFO {json} ────────────────────────│  Server sends INFO first
│                                                   │
│────────────── CONNECT {json} ────────────────────►│  Client sends CONNECT
│────────────── PING ──────────────────────────────►│  Client sends PING
│◄───────────── PONG ───────────────────────────────│  Server confirms connection
│                                                   │
│────────────── SUB/UNSUB/PUB/HPUB ────────────────►│  Normal operation
│◄───────────── MSG/HMSG/+OK/-ERR/PING ─────────────│
│                                                   │
```
## Server Operations (ServerOp)
These are operations received from the server. The `Connection` module parses these from the read buffer.
### INFO
Sent by the server upon connection and asynchronously when cluster topology changes.
```
INFO {json}\r\n
```
JSON fields (see `ServerInfo` struct):
| Field | Type | Description |
|-------|------|-------------|
| `server_id` | String | Unique server identifier |
| `server_name` | String | Generated server name |
| `host` | String | Cluster host |
| `port` | u16 | Cluster port |
| `version` | String | Server version |
| `auth_required` | bool | Authentication required |
| `tls_required` | bool | TLS required |
| `max_payload` | usize | Maximum payload size |
| `proto` | i8 | Protocol version (0 or 1) |
| `client_id` | u64 | Server-assigned client ID |
| `go` | String | Go build version |
| `nonce` | String | Nonce for nkey auth |
| `connect_urls` | Vec<String> | Cluster server URLs |
| `client_ip` | String | Client IP as seen by server |
| `headers` | bool | Server supports headers |
| `ldm` | bool | Lame duck mode |
| `cluster` | Option<String> | Cluster name |
| `domain` | Option<String> | NATS domain |
| `jetstream` | bool | JetStream enabled |
### MSG
Delivers a message to a subscription (no headers):
```
MSG <subject> <sid> [reply-to] <#bytes>\r\n
<payload>\r\n
```
### HMSG
Delivers a message with headers:
```
HMSG <subject> <sid> [reply-to] <#header-bytes> <#total-bytes>\r\n
<NATS/1.0 [status] [description]>\r\n
<header-name>: <header-value>\r\n
\r\n
<payload>\r\n
```
Header format follows the NATS/1.0 header spec:
- First line: `NATS/1.0` optionally followed by status code and description
- Subsequent lines: `name: value` headers
- Empty line separates headers from payload
- Header values may span multiple lines (continuation lines start with whitespace)
### PING / PONG
```
PING\r\n → Client responds with PONG
PONG\r\n → Acknowledges client's PING
```
### +OK / -ERR
```
+OK\r\n → Success acknowledgment (verbose mode)
-ERR <description>\r\n → Error from server
```
Common server errors:
- `authorization violation` → parsed as `ServerError::AuthorizationViolation`
- Other strings → `ServerError::Other(String)`
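
As a rough illustration of that mapping, the sketch below classifies the text after `-ERR`. The enum mirrors the two variants named above, but `parse_server_error` is a hypothetical helper, and the quote trimming and case-insensitive comparison are assumptions about how the server formats its messages, not a copy of the crate's parser.

```rust
/// Mirrors the two variants named above; the real enum may differ.
#[derive(Debug, PartialEq)]
enum ServerError {
    AuthorizationViolation,
    Other(String),
}

/// Hypothetical helper: classifies one `-ERR ...` line (without the CRLF).
/// Returns `None` if the line is not an `-ERR` operation.
fn parse_server_error(line: &str) -> Option<ServerError> {
    let description = line.strip_prefix("-ERR")?.trim();
    // Assumption: the description may arrive wrapped in single quotes.
    let description = description.trim_matches('\'');
    if description.eq_ignore_ascii_case("authorization violation") {
        Some(ServerError::AuthorizationViolation)
    } else {
        Some(ServerError::Other(description.to_string()))
    }
}
```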
## Client Operations (ClientOp)
These are operations sent from the client to the server. The `Connection` module serializes these to the write buffer.
### CONNECT
Sent as the first client operation after receiving INFO. Contains authentication and capability information.
```
CONNECT {json}\r\n
```
JSON fields (see `ConnectInfo` struct):
| Field | Type | Description |
|-------|------|-------------|
| `verbose` | bool | Enable +OK acknowledgments (always false in this client) |
| `pedantic` | bool | Strict format checking (always false) |
| `jwt` | Option<String> | User JWT for auth |
| `nkey` | Option<String> | Public nkey for auth |
| `sig` | Option<String> | Signed nonce (Base64URL encoded) |
| `name` | Option<String> | Client name |
| `echo` | bool | Whether server should echo messages back |
| `lang` | String | Implementation language ("rust") |
| `version` | String | Client version |
| `protocol` | u8 | Protocol version (1 = dynamic) |
| `tls_required` | bool | TLS required |
| `user` | Option<String> | Username |
| `pass` | Option<String> | Password |
| `auth_token` | Option<String> | Auth token |
| `headers` | bool | Client supports headers (always true) |
| `no_responders` | bool | Client supports no-responders (always true) |
### PUB / HPUB
Publish a message:
```
PUB <subject> [reply-to] <#payload-bytes>\r\n
<payload>\r\n
```
Publish with headers:
```
HPUB <subject> [reply-to] <#header-bytes> <#total-bytes>\r\n
NATS/1.0\r\n
<header-name>: <header-value>\r\n
\r\n
<payload>\r\n
```
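
The two frame layouts can be checked with a short encoder sketch. The function names are hypothetical and the code is an illustration of the wire format only; it assumes the header block passed to `encode_hpub` is already serialized and ends with its blank line.

```rust
/// Builds `<op> <subject> [reply-to]` without the size fields.
fn control_prefix(op: &str, subject: &str, reply: Option<&str>) -> String {
    match reply {
        Some(reply) => format!("{op} {subject} {reply}"),
        None => format!("{op} {subject}"),
    }
}

/// Encodes a `PUB` frame: control line, payload, trailing CRLF.
fn encode_pub(subject: &str, reply: Option<&str>, payload: &[u8]) -> Vec<u8> {
    let line = format!(
        "{} {}\r\n",
        control_prefix("PUB", subject, reply),
        payload.len()
    );
    let mut frame = line.into_bytes();
    frame.extend_from_slice(payload);
    frame.extend_from_slice(b"\r\n");
    frame
}

/// Encodes an `HPUB` frame. `headers` must be a serialized NATS/1.0 block
/// that already ends with the empty line separating headers from payload.
fn encode_hpub(
    subject: &str,
    reply: Option<&str>,
    headers: &[u8],
    payload: &[u8],
) -> Vec<u8> {
    // The second size field counts header bytes plus payload bytes.
    let line = format!(
        "{} {} {}\r\n",
        control_prefix("HPUB", subject, reply),
        headers.len(),
        headers.len() + payload.len()
    );
    let mut frame = line.into_bytes();
    frame.extend_from_slice(headers);
    frame.extend_from_slice(payload);
    frame.extend_from_slice(b"\r\n");
    frame
}
```

With an 18-byte header block `NATS/1.0\r\nA: b\r\n\r\n` and the 2-byte payload `hi`, the control line becomes `HPUB events.a 18 20`.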
### SUB
Subscribe to a subject:
```
SUB <subject> [queue-group] <sid>\r\n
```
### UNSUB
Unsubscribe from a subscription:
```
UNSUB <sid> [max]\r\n
```
The optional `max` parameter tells the server to auto-unsubscribe after receiving the specified number of messages.
### PING / PONG
```
PING\r\n → Health check / keepalive
PONG\r\n → Response to server PING
```
## Protocol Version
The `Protocol` enum has two variants:
| Value | Name | Description |
|-------|------|-------------|
| 0 | Original | Basic protocol |
| 1 | Dynamic | Supports async INFO for cluster topology changes, lame duck mode |
This client always sends `protocol: 1` (Dynamic), enabling:
- Asynchronous INFO messages with updated server lists
- Lame duck mode notifications
- Dynamic reconfiguration of cluster topology
## Wire Format Details
### Message Length Calculation
For plain `MSG`:
```
length = subject.len() + reply.map_or(0, |r| r.len()) + payload.len()
```
For `HMSG`:
```
length = subject.len() + reply.map_or(0, |r| r.len()) + header_len + payload.len()
```
Here `header_len` is the size of the serialized header block in bytes, and the `#total-bytes` field on the wire (`total_len`) equals `header_len + payload.len()`.
### Write Buffer Architecture
The `Connection` uses a two-tier write buffer:
1. **`flattened_writes`** (`BytesMut`) — for small writes (< 4096 bytes). Protocol headers, short commands, and small messages are flattened into this buffer for efficient sequential writing.
2. **`write_buf`** (`VecDeque<Bytes>`) — for large writes (>= 4096 bytes). Large payloads are appended as separate `Bytes` chunks. Supports vectored writes (`write_vectored`) when the underlying stream supports it, writing up to 64 chunks at once.
The soft limit for the total write buffer is 65,535 bytes (`SOFT_WRITE_BUF_LIMIT`). When exceeded, the `ConnectionHandler` stops processing new commands until the buffer drains.
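
A simplified model of the two tiers might look like the following. It is a sketch under assumptions: `WriteBuf` is a hypothetical type, `Vec<u8>` stands in for `BytesMut` and `Bytes`, and the way pending small writes are queued ahead of a large chunk to keep wire order is this sketch's own choice, not a description of the crate's internals.

```rust
use std::collections::VecDeque;

const FLATTEN_THRESHOLD: usize = 4096;
const SOFT_WRITE_BUF_LIMIT: usize = 65_535;

/// Hypothetical two-tier write buffer.
#[derive(Default)]
struct WriteBuf {
    /// Small writes are copied back to back into one contiguous buffer.
    flattened: Vec<u8>,
    /// Large writes are queued as separate chunks without copying.
    chunks: VecDeque<Vec<u8>>,
    total: usize,
}

impl WriteBuf {
    fn push(&mut self, data: Vec<u8>) {
        self.total += data.len();
        if data.len() < FLATTEN_THRESHOLD {
            self.flattened.extend_from_slice(&data);
        } else {
            // Keep wire order: queue pending small writes before the chunk.
            if !self.flattened.is_empty() {
                self.chunks.push_back(std::mem::take(&mut self.flattened));
            }
            self.chunks.push_back(data);
        }
    }

    fn len(&self) -> usize {
        self.total
    }

    /// True once the caller should stop accepting new commands.
    fn over_soft_limit(&self) -> bool {
        self.total > SOFT_WRITE_BUF_LIMIT
    }
}
```

The limit is soft because a single push may overshoot it; it only signals that no further commands should be accepted until the buffer drains.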
### Read Buffer Architecture
The `Connection` uses a single `BytesMut` read buffer with configurable initial capacity (default 65,535 bytes). Protocol parsing uses `memchr::memmem::find` to locate CRLF delimiters efficiently. If a partial message is in the buffer, the parser returns `None` and waits for more data.
### Header Serialization
Headers are serialized in NATS/1.0 format:
```
NATS/1.0\r\n
Header-Name: Header-Value\r\n
Multi-Line-Header: value part 1\r\n
continuation of value\r\n
Another-Header: another value\r\n
\r\n
```
The `HeaderMap::to_bytes()` method handles this serialization, using `httparse`-compatible line folding for multi-line values.
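
For orientation, a stripped-down serializer for the simple case is sketched below. `serialize_headers` is a hypothetical function, not `HeaderMap::to_bytes()` itself; it takes name/value pairs in order and deliberately leaves out line folding and validation.

```rust
/// Serializes headers into a NATS/1.0 block, one line per (name, value) pair.
/// Assumption: values contain no line breaks, so no folding is needed.
fn serialize_headers(headers: &[(&str, &str)]) -> Vec<u8> {
    let mut out = b"NATS/1.0\r\n".to_vec();
    for (name, value) in headers {
        out.extend_from_slice(name.as_bytes());
        out.extend_from_slice(b": ");
        out.extend_from_slice(value.as_bytes());
        out.extend_from_slice(b"\r\n");
    }
    // The empty line terminates the header block.
    out.extend_from_slice(b"\r\n");
    out
}
```

The length of the returned buffer is the value that goes into the `#header-bytes` field of `HPUB`.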
### Status Codes in Headers
NATS status codes are embedded in the `HMSG` header version line:
```
NATS/1.0 404 No Messages\r\n
NATS/1.0 408 Request Timeout\r\n
NATS/1.0 503 No Responders\r\n
```
Common codes used by the client:
| Code | Constant | Meaning |
|------|----------|---------|
| 100 | `IDLE_HEARTBEAT` | JetStream idle heartbeat |
| 200 | `OK` | Success |
| 404 | `NOT_FOUND` | Message/stream not found |
| 408 | `TIMEOUT` | Request timeout |
| 409 | `REQUEST_TERMINATED` | Request terminated |
| 503 | `NO_RESPONDERS` | No responders available |
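
Extracting the status from the version line can be sketched as follows. `parse_status_line` is a hypothetical helper; the 100-999 range check is an assumption based on the range given for `StatusCode`, not on the crate's parsing code.

```rust
/// Parses the first header line, e.g. `NATS/1.0 503 No Responders`.
/// Returns `None` if the line is not a valid NATS/1.0 version line,
/// otherwise the optional status code and optional description.
fn parse_status_line(line: &str) -> Option<(Option<u16>, Option<String>)> {
    let rest = line.strip_prefix("NATS/1.0")?.trim();
    if rest.is_empty() {
        // Plain version line: headers without a status.
        return Some((None, None));
    }
    let (code, description) = match rest.split_once(' ') {
        Some((code, desc)) => (code, Some(desc.trim().to_string())),
        None => (rest, None),
    };
    let code: u16 = code.parse().ok()?;
    if !(100..=999).contains(&code) {
        return None;
    }
    Some((Some(code), description.filter(|d| !d.is_empty())))
}
```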
## Protocol Parsing Implementation
The `Connection::try_read_op()` method handles all protocol parsing:
1. Search for `\r\n` delimiter using `memchr::memmem::find`
2. Match the operation prefix:
- `+OK` → `ServerOp::Ok`
- `PING` → `ServerOp::Ping`
- `PONG` → `ServerOp::Pong`
- `-ERR` → parse error description → `ServerOp::Error`
- `INFO ` → parse JSON → `ServerOp::Info`
- `MSG ` → parse subject/sid/reply/length, read payload → `ServerOp::Message`
- `HMSG ` → parse headers + payload → `ServerOp::Message`
3. Unknown prefix → return `io::Error` with `InvalidInput`
For `MSG` and `HMSG`, if the complete payload isn't yet in the read buffer (checked via `len + payload_len + 4 > remaining`), the method returns `Ok(None)` and the buffer accumulates more data before retrying.
Non-UTF8 subjects in server messages are handled gracefully — the parser returns an `io::Error` rather than panicking, which is critical because the Go server does not enforce UTF-8 in subjects (regression fix for issue #1572).
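
The incremental-parsing behaviour described above can be modelled for plain `MSG` frames. This is a sketch, not `try_read_op()`: `Msg` and `try_parse_msg` are hypothetical, a sliding-window search replaces `memchr`, and errors are plain strings instead of `io::Error`.

```rust
/// A parsed `MSG` operation (headers are out of scope for this sketch).
#[derive(Debug, PartialEq)]
struct Msg {
    subject: String,
    sid: u64,
    reply: Option<String>,
    payload: Vec<u8>,
}

/// Tries to parse one `MSG` frame from the front of `buf`.
/// Returns `Ok(None)` when more bytes are needed; on success returns the
/// message and the number of bytes consumed.
fn try_parse_msg(buf: &[u8]) -> Result<Option<(Msg, usize)>, String> {
    // Step 1: find the CRLF that ends the control line.
    let Some(line_end) = buf.windows(2).position(|w| w == b"\r\n") else {
        return Ok(None);
    };
    // Non-UTF-8 input becomes an error instead of a panic.
    let line = std::str::from_utf8(&buf[..line_end]).map_err(|e| e.to_string())?;
    let mut parts = line.split_ascii_whitespace();
    if parts.next() != Some("MSG") {
        return Err(format!("unexpected operation: {line}"));
    }
    let args: Vec<&str> = parts.collect();
    let (subject, sid, reply, len) = match args.as_slice() {
        [subject, sid, len] => (*subject, *sid, None, *len),
        [subject, sid, reply, len] => (*subject, *sid, Some(*reply), *len),
        _ => return Err("malformed MSG control line".to_string()),
    };
    let sid: u64 = sid.parse().map_err(|_| "invalid sid".to_string())?;
    let len: usize = len.parse().map_err(|_| "invalid length".to_string())?;
    // Step 2: the payload and its trailing CRLF must be fully buffered.
    let payload_start = line_end + 2;
    let frame_end = payload_start + len + 2;
    if buf.len() < frame_end {
        return Ok(None);
    }
    let msg = Msg {
        subject: subject.to_string(),
        sid,
        reply: reply.map(str::to_string),
        payload: buf[payload_start..payload_start + len].to_vec(),
    };
    Ok(Some((msg, frame_end)))
}
```

Returning the consumed byte count lets the caller advance its read buffer and immediately try the next operation, which may already be buffered behind the message.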


@@ -1,443 +0,0 @@
# Key Types and Traits
This document covers the core data types in the `async-nats` crate that form the public API and internal plumbing.
## Public Types
### Client
**Location**: `client.rs`
`Client` is the primary user-facing type. It is a lightweight, cloneable handle to a NATS connection.
```rust
#[derive(Clone, Debug)]
pub struct Client {
info: tokio::sync::watch::Receiver<Option<ServerInfo>>,
state: tokio::sync::watch::Receiver<State>,
sender: mpsc::Sender<Command>,
poll_sender: PollSender<Command>,
next_subscription_id: Arc<AtomicU64>,
subscription_capacity: usize,
inbox_prefix: Arc<str>,
request_timeout: Option<Duration>,
max_payload: Arc<AtomicUsize>,
connection_stats: Arc<Statistics>,
skip_subject_validation: bool,
}
```
Key methods:
- `publish(subject, payload)` — fire-and-forget publish
- `publish_with_headers(subject, headers, payload)` — publish with NATS headers
- `publish_with_reply(subject, reply, payload)` — publish with reply subject
- `request(subject, payload)` — request-response (returns `Message`)
- `send_request(subject, request)` — request with `Request` builder
- `subscribe(subject)` — subscribe to a subject, returns `Subscriber`
- `queue_subscribe(subject, queue_group)` — subscribe as part of a queue group
- `flush()` — ensure all pending messages are written to the wire
- `drain()` — gracefully drain all subscriptions and close
- `force_reconnect()` — trigger immediate reconnection
- `new_inbox()` — generate a unique inbox subject for request-reply
- `server_info()` — get last received `ServerInfo`
- `max_payload()` — get server's maximum payload size
- `connection_state()` — get current connection `State`
- `statistics()` — get `Arc<Statistics>` for connection metrics
- `is_server_compatible(major, minor, patch)` — check server version compatibility
- `set_server_pool(addrs)` / `server_pool()` — manage server pool
`Client` also implements `Sink<OutboundMessage>` for backpressure-aware publishing.
### Subscriber
**Location**: `lib.rs`
A `Subscriber` receives messages from a single subscription. It implements `futures::Stream`.
```rust
#[derive(Debug)]
pub struct Subscriber {
sid: u64,
receiver: mpsc::Receiver<Message>,
sender: mpsc::Sender<Command>,
}
```
Key methods:
- `unsubscribe()` — unsubscribe and close the stream
- `unsubscribe_after(max)` — auto-unsubscribe after N messages
- `drain()` — gracefully drain remaining messages then close
On `Drop`, `Subscriber` automatically sends an `Unsubscribe` command and closes the receiver channel.
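
The drop behaviour follows a common pattern that can be shown in isolation. `Sub` and `Cmd` below are hypothetical stand-ins, and a synchronous `std::sync::mpsc` sender replaces the crate's async channel.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical command type with only the variant needed here.
#[derive(Debug, PartialEq)]
enum Cmd {
    Unsubscribe { sid: u64 },
}

/// Hypothetical subscription handle that cleans up after itself.
struct Sub {
    sid: u64,
    sender: Sender<Cmd>,
}

impl Drop for Sub {
    fn drop(&mut self) {
        // Best effort: the handler may already be gone, so ignore failures.
        let _ = self.sender.send(Cmd::Unsubscribe { sid: self.sid });
    }
}
```

Tying the unsubscribe to `Drop` means a subscription that simply goes out of scope does not leave interest registered on the server.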
### Message
**Location**: `message.rs`
Represents an inbound NATS message.
```rust
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct Message {
pub subject: Subject,
pub reply: Option<Subject>,
pub payload: Bytes,
pub headers: Option<HeaderMap>,
pub status: Option<StatusCode>,
pub description: Option<String>,
pub length: usize,
}
```
### OutboundMessage
**Location**: `message.rs`
Represents a message to be published. No status/description fields (those are inbound-only).
```rust
#[derive(Clone, Debug)]
pub struct OutboundMessage {
pub subject: Subject,
pub reply: Option<Subject>,
pub payload: Bytes,
pub headers: Option<HeaderMap>,
}
```
### Subject
**Location**: `subject.rs`
An immutable, validated UTF-8 string backed by `Bytes`. Used throughout the crate instead of raw `String` for subjects.
```rust
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Subject {
bytes: Bytes,
}
```
Implements `Deref<Target = str>`, `From<&str>`, `From<String>`, `TryFrom<Bytes>`, `Serialize`, `Deserialize`.
Validation methods:
- `is_valid()` — checks NATS subject rules (no leading/trailing dots, no consecutive dots, no whitespace)
- `validated(s)` — construct with validation, returns `Result<Subject, SubjectError>`
- `from_static_validated(s)` — const-time validation for static strings (compile-time panic on invalid)
### ToSubject Trait
**Location**: `subject.rs`
```rust
pub trait ToSubject {
fn to_subject(&self) -> Subject;
}
```
Implemented for `Subject`, `&'static str`, `String`. All methods accepting subjects are generic over `impl ToSubject`.
### HeaderMap
**Location**: `header.rs`
NATS message headers, modeled after the `http::header` crate.
```rust
#[derive(Clone, PartialEq, Eq, Debug, Default)]
pub struct HeaderMap {
inner: HashMap<HeaderName, Vec<HeaderValue>>,
}
```
Supports multiple values per header name (like HTTP). Key methods:
- `insert(name, value)` — replace all values for a name
- `append(name, value)` — add a value to a name
- `get(name)` — get the first value
- `get_all(name)` — get all values as an iterator
- `len()` / `is_empty()` — number of header entries
- `to_bytes()` — serialize to NATS/1.0 wire format
- `wire_len()` — size in wire format (for payload size checks)
### StatusCode
**Location**: `status.rs`
NATS status codes (100-999), structurally similar to HTTP status codes.
```rust
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Serialize, Deserialize)]
pub struct StatusCode(NonZeroU16);
```
Constants:
| Constant | Code | Meaning |
|----------|------|---------|
| `IDLE_HEARTBEAT` | 100 | JetStream idle heartbeat |
| `OK` | 200 | Success |
| `NOT_FOUND` | 404 | Not found |
| `TIMEOUT` | 408 | Timeout |
| `REQUEST_TERMINATED` | 409 | Request terminated |
| `NO_RESPONDERS` | 503 | No responders |
### ServerInfo
**Location**: `lib.rs`
Deserialized from the server's `INFO` JSON message. Contains server capabilities, connection details, and cluster information.
### ConnectInfo
**Location**: `lib.rs`
Serialized into the client's `CONNECT` JSON message. Contains authentication credentials, client capabilities, and protocol preferences.
### ServerAddr
**Location**: `lib.rs`
A validated NATS server URL, supporting schemes `nats://`, `tls://`, `ws://`, `wss://`.
```rust
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ServerAddr(Url);
```
Methods:
- `from_url(url)` — validate and create
- `tls_required()` — true for `tls://` scheme
- `is_websocket()` — true for `ws://` or `wss://`
- `host()` / `port()` / `scheme()` — URL component accessors
- `socket_addrs()` — async DNS resolution
- `username()` / `password()` — embedded credentials
### Auth
**Location**: `auth.rs`
Container for authentication credentials.
```rust
#[derive(Clone, Default)]
pub struct Auth {
pub jwt: Option<String>,
pub nkey: Option<String>,
pub signature_callback: Option<CallbackArg1<String, Result<String, AuthError>>>,
pub signature: Option<Vec<u8>>,
pub username: Option<String>,
pub password: Option<String>,
pub token: Option<String>,
}
```
### Request
**Location**: `client.rs`
Builder for customized request-response operations.
```rust
#[derive(Default)]
pub struct Request {
pub payload: Option<Bytes>,
pub headers: Option<HeaderMap>,
pub timeout: Option<Option<Duration>>,
pub inbox: Option<String>,
}
```
### Statistics
**Location**: `client.rs`
Atomic connection statistics shared between Client and ConnectionHandler.
```rust
#[derive(Default, Debug)]
pub struct Statistics {
pub in_bytes: AtomicU64,
pub out_bytes: AtomicU64,
pub in_messages: AtomicU64,
pub out_messages: AtomicU64,
pub connects: AtomicU64,
}
```
### Event
**Location**: `lib.rs`
Events emitted by the client for connection lifecycle monitoring.
```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
Connected,
Disconnected,
LameDuckMode,
Draining,
Closed,
SlowConsumer(u64),
ServerError(ServerError),
ClientError(ClientError),
}
```
## Internal Types
### Command
**Location**: `lib.rs`
Internal commands sent from `Client` to `ConnectionHandler` via `mpsc` channel.
```rust
pub(crate) enum Command {
Publish(OutboundMessage),
Request { subject, payload, respond, headers, sender: oneshot::Sender<Message> },
Subscribe { sid, subject, queue_group, sender: mpsc::Sender<Message> },
Unsubscribe { sid, max: Option<u64> },
Flush { observer: oneshot::Sender<()> },
Drain { sid: Option<u64> },
Reconnect,
SetServerPool { servers: Vec<ServerAddr>, result: oneshot::Sender<Result<(), String>> },
ServerPool { result: oneshot::Sender<Vec<connector::Server>> },
}
```
### ClientOp / ServerOp
**Location**: `lib.rs`
Protocol-level operation types used by `Connection` for wire format parsing and serialization.
### Subscription (Internal)
**Location**: `lib.rs`
```rust
struct Subscription {
subject: Subject,
sender: mpsc::Sender<Message>,
queue_group: Option<String>,
delivered: u64,
max: Option<u64>,
}
```
### Multiplexer (Internal)
**Location**: `lib.rs`
```rust
struct Multiplexer {
subject: Subject, // Wildcard subscription subject (e.g., "_INBOX.xxx.*")
prefix: Subject, // Prefix for routing (e.g., "_INBOX.xxx.")
senders: HashMap<String, oneshot::Sender<Message>>, // token → sender
}
```
### Connection State
**Location**: `connection.rs`
```rust
#[derive(Debug, Eq, PartialEq, Clone)]
pub enum State {
Pending,
Connected,
Disconnected,
}
```
### Protocol
**Location**: `lib.rs`
```rust
#[derive(Serialize_repr, Deserialize_repr, PartialEq, Eq, Debug, Clone, Copy)]
#[repr(u8)]
pub enum Protocol {
Original = 0,
Dynamic = 1,
}
```
## Error Type Pattern
The crate uses a generic `Error<Kind>` type throughout. Every subsystem defines its own `ErrorKind` enum and a type alias:
```rust
// Define the kind enum
#[derive(Clone, Debug, PartialEq)]
pub enum PublishErrorKind {
MaxPayloadExceeded,
InvalidSubject,
Send,
}
// Define the error type alias
pub type PublishError = Error<PublishErrorKind>;
// Construct errors
PublishError::new(PublishErrorKind::MaxPayloadExceeded)
PublishError::with_source(PublishErrorKind::Send, io_error)
// Match on errors
if err.kind() == PublishErrorKind::MaxPayloadExceeded { ... }
```
Error kinds in the crate:
| Error Type | Kind Enum | Context |
|-----------|-----------|---------|
| `ConnectError` | `ConnectErrorKind` | Initial connection failures |
| `PublishError` | `PublishErrorKind` | Publish validation failures |
| `RequestError` | `RequestErrorKind` | Request-response failures |
| `SubscribeError` | `SubscribeErrorKind` | Subscription failures |
| `FlushError` | `FlushErrorKind` | Flush failures |
| `ServerPoolError` | `ServerPoolErrorKind` | Server pool query failures |
| `SetServerPoolError` | `SetServerPoolErrorKind` | Server pool modification failures |
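The pattern above can be sketched with the standard library alone. This is a minimal illustration, not the crate's actual definition: the struct layout, trait bounds, and `Display` format are assumptions, and only the `new`, `with_source`, and `kind` names come from the usage shown above.

```rust
use std::error::Error as StdError;
use std::fmt;

// Hypothetical stand-in for the crate's generic error type.
// The real definition may use different bounds and helpers.
#[derive(Debug)]
pub struct Error<Kind> {
    kind: Kind,
    source: Option<Box<dyn StdError + Send + Sync>>,
}

impl<Kind: Clone + fmt::Debug + PartialEq> Error<Kind> {
    /// Builds an error that carries only a kind.
    pub fn new(kind: Kind) -> Self {
        Self { kind, source: None }
    }

    /// Builds an error that also wraps an underlying cause.
    pub fn with_source<S>(kind: Kind, source: S) -> Self
    where
        S: Into<Box<dyn StdError + Send + Sync>>,
    {
        Self { kind, source: Some(source.into()) }
    }

    /// Returns a copy of the kind so callers can compare it with `==`.
    pub fn kind(&self) -> Kind {
        self.kind.clone()
    }
}

impl<Kind: fmt::Debug> fmt::Display for Error<Kind> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.source {
            Some(source) => write!(f, "{:?}: {}", self.kind, source),
            None => write!(f, "{:?}", self.kind),
        }
    }
}

impl<Kind: fmt::Debug> StdError for Error<Kind> {
    fn source(&self) -> Option<&(dyn StdError + 'static)> {
        self.source
            .as_deref()
            .map(|s| s as &(dyn StdError + 'static))
    }
}

#[derive(Clone, Debug, PartialEq)]
pub enum PublishErrorKind {
    MaxPayloadExceeded,
    InvalidSubject,
    Send,
}

// One alias per subsystem keeps signatures short.
pub type PublishError = Error<PublishErrorKind>;
```

Keeping the kind separate from the source lets callers branch on a small, comparable enum while the original cause stays reachable through `std::error::Error::source`.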
## Trait Implementations
### Client Trait Interfaces
The `Client` implements several traits defined in `client::traits`:
```rust
// Publisher trait — publish with optional reply subject
trait Publisher {
fn publish_with_reply<S, R>(&self, subject: S, reply: R, payload: Bytes) -> impl Future<Output = Result<(), PublishError>>;
fn publish_message(&self, msg: OutboundMessage) -> impl Future<Output = Result<(), PublishError>>;
}
// Subscriber trait — subscribe to a subject
trait Subscriber {
fn subscribe<S>(&self, subject: S) -> impl Future<Output = Result<crate::Subscriber, SubscribeError>>;
}
// Requester trait — send request-response
trait Requester {
fn send_request<S>(&self, subject: S, request: Request) -> impl Future<Output = Result<Message, RequestError>>;
}
// TimeoutProvider trait — access request timeout
trait TimeoutProvider {
fn timeout(&self) -> Option<Duration>;
}
```
### ToServerAddrs Trait
**Location**: `lib.rs`
Converts various address types into server address iterators. Implemented for `ServerAddr`, `str`, `String`, `&[T]`, `Vec<T>`, `&[ServerAddr]`, and references.
### Sink<OutboundMessage>
`Client` implements `futures::Sink<OutboundMessage>` for backpressure-aware publishing through the `PollSender` adapter.
### Stream for Subscriber
`Subscriber` implements `futures::Stream` with `Item = Message`, delegating to the internal `mpsc::Receiver`.

@@ -1,338 +0,0 @@
# Connection Handler and Data Flow
This document covers the internal `ConnectionHandler` that drives all protocol I/O, and the data flow through the system.
## ConnectionHandler
**Location**: `lib.rs`
The `ConnectionHandler` is the heart of the client. It runs as a single Tokio task and manages all communication with the NATS server.
```rust
pub(crate) struct ConnectionHandler {
connection: Connection, // Low-level I/O
connector: Connector, // Server pool, reconnection
subscriptions: HashMap<u64, Subscription>, // Active subscriptions
multiplexer: Option<Multiplexer>, // Request-reply multiplexer
pending_pings: usize, // Unanswered PINGs
info_sender: tokio::sync::watch::Sender<Option<ServerInfo>>,
ping_interval: Interval, // Periodic PING timer
should_reconnect: bool, // Flag for forced reconnect
flush_observers: Vec<oneshot::Sender<()>>, // Pending flush callbacks
is_draining: bool, // Connection is draining
drain_pings: VecDeque<u64>, // SIDs being drained
}
```
## Data Flow: Publish
```
Application
│ client.publish("events.data", payload)
Client
│ validates subject & payload size
│ sends Command::Publish(OutboundMessage) via mpsc channel
ConnectionHandler::handle_command(Command::Publish)
│ increments out_messages, out_bytes statistics
│ calls connection.enqueue_write_op(&ClientOp::Publish { ... })
Connection::enqueue_write_op
│ serializes to wire format:
│ "PUB events.data 11\r\n" or "HPUB events.data 23 34\r\n"
│ appends to flattened_writes or write_buf
Connection::poll_write
│ uses vectored writes (64 chunks) if supported
│ or sequential writes otherwise
Connection::poll_flush
│ flushes the TCP/TLS/WS stream
│ notifies flush_observers
NATS Server (TCP/TLS/WebSocket)
```
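The serialization step in the flow above can be illustrated with a small encoder for the plain `PUB` frame. This is a sketch based on the NATS wire format (`PUB <subject> [reply-to] <#bytes>\r\n<payload>\r\n`); the function name `encode_pub` is hypothetical and the header-carrying `HPUB` variant is left out.

```rust
/// Hypothetical helper: encodes one PUB frame into a byte buffer.
/// The real client writes into its internal write buffers instead.
fn encode_pub(subject: &str, reply: Option<&str>, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(subject.len() + payload.len() + 32);
    out.extend_from_slice(b"PUB ");
    out.extend_from_slice(subject.as_bytes());
    if let Some(reply) = reply {
        // The optional reply subject sits between subject and length.
        out.push(b' ');
        out.extend_from_slice(reply.as_bytes());
    }
    // The length field counts payload bytes only.
    out.extend_from_slice(format!(" {}\r\n", payload.len()).as_bytes());
    out.extend_from_slice(payload);
    out.extend_from_slice(b"\r\n");
    out
}
```

With an 11-byte payload and no reply subject, the control line matches the `PUB events.data 11\r\n` example in the diagram.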
## Data Flow: Subscribe
```
Application
│ client.subscribe("events.>")
Client::subscribe
│ validates subject (always, regardless of skip_subject_validation)
│ allocates next sid via AtomicU64
│ creates mpsc channel for messages
│ sends Command::Subscribe { sid, subject, sender }
│ returns Subscriber { sid, receiver }
ConnectionHandler::handle_command(Command::Subscribe)
│ creates Subscription { subject, sender, delivered: 0, max: None }
│ inserts into subscriptions HashMap
│ calls connection.enqueue_write_op(&ClientOp::Subscribe { sid, subject, queue_group })
Connection::enqueue_write_op
│ serializes: "SUB events.> 42\r\n"
Server sends MSG for matching subjects:
ConnectionHandler::handle_server_op(ServerOp::Message { sid, subject, ... })
│ looks up sid in subscriptions HashMap
│ constructs Message { subject, reply, payload, headers, status, description }
│ tries subscription.sender.try_send(message)
├── Ok → increments subscription.delivered, checks max
├── Full → emits Event::SlowConsumer(sid)
└── Closed → removes subscription, sends ClientOp::Unsubscribe
Subscriber::poll_next (Stream impl)
│ receives from mpsc::Receiver
Application processes Message
```
## Data Flow: Request-Response
The request-response pattern uses the **multiplexer** — a single wildcard subscription that routes responses to their waiting requesters.
```
Application
│ client.request("service", payload)
Client::send_request
│ validates subject & payload size
│ creates oneshot channel for response
│ generates unique inbox: "_INBOX.<nuid>.<token>"
│ sends Command::Request { subject, payload, respond, sender }
ConnectionHandler::handle_command(Command::Request)
│ extracts token from respond subject (after last '.')
│ if no multiplexer exists:
│ creates Multiplexer with wildcard sub "_INBOX.<id>.*" (SID 0)
│ sends ClientOp::Subscribe { sid: 0, subject: "_INBOX.<id>.*" }
│ inserts token → oneshot::Sender in multiplexer.senders
│ sends ClientOp::Publish { subject, payload, respond: "<prefix><token>" }
Server routes request to service:
Service responds by publishing to the reply subject:
ConnectionHandler::handle_server_op(ServerOp::Message { sid: 0, ... })
│ sid == MULTIPLEXER_SID (0), so enters multiplexer path
│ extracts token by stripping prefix from subject
│ looks up token in multiplexer.senders
│ sends Message via oneshot::Sender
Client::send_request receives via oneshot::Receiver
│ applies timeout (default 10s)
│ checks for NO_RESPONDERS status (503)
Application receives Message
```
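The token routing in the multiplexer path can be sketched as follows. This is a simplified model, not the crate's code: it uses `std::sync::mpsc` in place of a Tokio `oneshot` channel, plain `String` payloads in place of `Message`, and hypothetical method names `register` and `route`.

```rust
use std::collections::HashMap;
use std::sync::mpsc;

// Simplified stand-in for the internal Multiplexer.
struct Multiplexer {
    prefix: String,                                 // e.g. "_INBOX.abc."
    senders: HashMap<String, mpsc::Sender<String>>, // token -> waiting requester
}

impl Multiplexer {
    fn new(prefix: &str) -> Self {
        Self { prefix: prefix.to_string(), senders: HashMap::new() }
    }

    /// Registers a waiting requester and returns the reply subject
    /// that the request should be published with.
    fn register(&mut self, token: &str, sender: mpsc::Sender<String>) -> String {
        self.senders.insert(token.to_string(), sender);
        format!("{}{}", self.prefix, token)
    }

    /// Routes an incoming reply. Returns true if a requester received it.
    fn route(&mut self, subject: &str, payload: String) -> bool {
        // Strip the shared prefix to recover the per-request token.
        let token = match subject.strip_prefix(self.prefix.as_str()) {
            Some(token) => token,
            None => return false,
        };
        // Each token is single-use, so the sender is removed on delivery.
        match self.senders.remove(token) {
            Some(sender) => sender.send(payload).is_ok(),
            None => false,
        }
    }
}
```

Because every request shares one wildcard subscription, the per-request cost is a map insert and a map remove rather than a `SUB`/`UNSUB` round trip.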
### Custom Inbox Request
If the `Request` builder specifies a custom `inbox`, the flow is different:
- The client subscribes to the inbox directly (not via multiplexer)
- Publishes with the inbox as the reply subject
- Waits for the message on that subscription
- No multiplexer involvement
## Data Flow: Flush
```
Application
│ client.flush()
Client::flush
│ creates oneshot channel
│ sends Command::Flush { observer }
ConnectionHandler::handle_command(Command::Flush)
│ pushes observer into flush_observers Vec
ProcessFut::poll (main loop)
│ after writing all pending data...
│ checks should_flush():
│ Yes (write buffers empty, not yet flushed) → poll_flush
│ May (write buffers not empty) → poll_flush
│ No (already flushed) → skip
│ on successful flush:
│ drains flush_observers, sending () to each
Client::flush receives via oneshot::Receiver
```
## Data Flow: Drain
```
Application
│ client.drain() or subscriber.drain()
Client::drain / Subscriber::drain
│ sends Command::Drain { sid: None } (whole client)
│ or Command::Drain { sid: Some(n) } (single subscription)
ConnectionHandler::handle_command(Command::Drain)
│ if sid is Some:
│ pushes sid to drain_pings
│ sends ClientOp::Unsubscribe { sid, max: None }
│ if sid is None (whole client):
│ sets is_draining = true
│ emits Event::Draining
│ for each subscription: drain_pings.push(sid), Unsubscribe
│ sends ClientOp::Ping (to flush the UNSUB messages)
ProcessFut::poll (main loop)
│ processes any remaining server messages
│ removes drained subscriptions from HashMap
│ if is_draining: returns ExitReason::Closed
ConnectionHandler exits, emits Event::Closed
```
## Main Processing Loop
The `ConnectionHandler::process` method implements the core event loop via a custom `Future` (`ProcessFut`):
```rust
impl Future for ProcessFut<'_> {
type Output = ExitReason;
fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
// 1. Check ping interval — send PING if due, disconnect if too many pending
while self.handler.ping_interval.poll_tick(cx).is_ready() {
if let Poll::Ready(exit) = self.ping() { return Poll::Ready(exit); }
}
// 2. Read all available server operations
loop {
match self.handler.connection.poll_read_op(cx) {
Poll::Pending => break,
Poll::Ready(Ok(Some(server_op))) => self.handler.handle_server_op(server_op),
Poll::Ready(Ok(None)) => return Poll::Ready(ExitReason::Disconnected(None)),
Poll::Ready(Err(err)) => return Poll::Ready(ExitReason::Disconnected(Some(err))),
}
}
// 3. Clean up drained subscriptions
while let Some(sid) = self.handler.drain_pings.pop_front() {
self.handler.subscriptions.remove(&sid);
}
// 4. If draining, exit
if self.handler.is_draining { return Poll::Ready(ExitReason::Closed); }
// 5. Process client commands (batch of up to 16)
// while write buffer not full
loop {
while !self.handler.connection.is_write_buf_full() {
match receiver.poll_recv_many(cx, recv_buf, 16) {
Poll::Pending => break,
Poll::Ready(1..) => { for cmd in recv_buf.drain(..) { handler.handle_command(cmd); } }
Poll::Ready(0) => return Poll::Ready(ExitReason::Closed),
}
}
// 6. Write pending data to stream
match self.handler.connection.poll_write(cx) {
Poll::Pending => break,
Poll::Ready(Ok(())) => continue, // write buffer empty, try more commands
Poll::Ready(Err(err)) => return Poll::Ready(ExitReason::Disconnected(Some(err))),
}
}
// 7. Flush stream and notify observers
match self.handler.connection.poll_flush(cx) { ... }
// 8. Check for forced reconnect
if mem::take(&mut self.handler.should_reconnect) {
return Poll::Ready(ExitReason::ReconnectRequested);
}
Poll::Pending
}
}
```
### Exit Reasons
The main loop exits for three reasons:
| Reason | Action |
|--------|--------|
| `Disconnected(Option<io::Error>)` | Attempt reconnection via `handle_disconnect()` |
| `ReconnectRequested` | Force reconnect (user-triggered) |
| `Closed` | Connection handler terminates, emit `Event::Closed` |
On disconnection, `handle_disconnect()` is called which:
1. Resets `pending_pings` to 0
2. Emits `Event::Disconnected`
3. Updates connection state to `Disconnected`
4. Calls `handle_reconnect()` which uses `Connector::connect()`
5. On successful reconnect, re-subscribes all active subscriptions
6. Re-subscribes the multiplexer wildcard if present
## Slow Consumer Handling
When a subscription's `mpsc::Sender` channel is full (the application isn't consuming messages fast enough):
1. `try_send` returns `TrySendError::Full`
2. The `ConnectionHandler` emits `Event::SlowConsumer(sid)`
3. The message is **dropped** (not queued)
4. The subscription remains active
When a subscription's receiver is dropped (application closed the stream):
1. `try_send` returns `TrySendError::Closed`
2. The subscription is removed from the HashMap
3. An `UNSUB` command is sent to the server
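Both cases can be reproduced with a bounded channel from the standard library. The sketch below is illustrative only: it uses `std::sync::mpsc::sync_channel`, whose closed-receiver variant is named `Disconnected` rather than Tokio's `Closed`, and the `Delivery` enum and `deliver` function are hypothetical names.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

// What the handler would do after attempting delivery.
#[derive(Debug, PartialEq)]
enum Delivery {
    Delivered,    // message queued for the application
    SlowConsumer, // channel full: message dropped, subscription kept
    Unsubscribe,  // receiver gone: remove subscription, send UNSUB
}

fn deliver(sender: &SyncSender<String>, message: String) -> Delivery {
    match sender.try_send(message) {
        Ok(()) => Delivery::Delivered,
        // The rejected message is returned inside the error and dropped here.
        Err(TrySendError::Full(_dropped)) => Delivery::SlowConsumer,
        Err(TrySendError::Disconnected(_)) => Delivery::Unsubscribe,
    }
}
```

Using a non-blocking send keeps one slow subscription from stalling the single task that serves every other subscription.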
## Ping/Pong Health Check
The `ConnectionHandler` maintains a periodic PING interval (default 60 seconds):
1. `ping_interval` fires every N seconds
2. A `ClientOp::Ping` is enqueued
3. `pending_pings` counter increments
4. If `pending_pings > MAX_PENDING_PINGS (2)`, the connection is considered dead
5. When `ServerOp::Pong` is received, `pending_pings` decrements
6. Any server operation resets the ping interval timer
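The counter logic in steps 2 to 5 can be modelled as a tiny state machine. This is a sketch under assumptions: `PingState`, `on_tick`, and `on_pong` are hypothetical names, and the interval reset from step 6 is not modelled.

```rust
const MAX_PENDING_PINGS: usize = 2;

#[derive(Default)]
struct PingState {
    pending_pings: usize,
}

impl PingState {
    /// Called when the ping interval fires and a PING is enqueued.
    /// Returns false once the connection should be treated as dead.
    fn on_tick(&mut self) -> bool {
        self.pending_pings += 1;
        self.pending_pings <= MAX_PENDING_PINGS
    }

    /// Called when a PONG arrives from the server.
    fn on_pong(&mut self) {
        self.pending_pings = self.pending_pings.saturating_sub(1);
    }
}
```

With a limit of 2, a silent server is detected on the third consecutive tick without a PONG, which is about three ping intervals after the last reply.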
## Batched Command Processing
Commands from the `Client` are received in batches of up to 16 (`RECV_CHUNK_SIZE`) using `poll_recv_many`. This amortizes the cost of waking the task and enables pipelining multiple operations (e.g., publishing many messages) in a single poll cycle.

@@ -1,277 +0,0 @@
# Connection and Reconnection
This document covers how connections are established, TLS handling, the server pool, and the reconnection mechanism.
## Connector
**Location**: `connector.rs`
The `Connector` manages the server pool and handles connection establishment and reconnection.
```rust
pub(crate) struct Connector {
servers: Vec<Server>, // Server pool with per-server metadata
options: ConnectorOptions, // Connection configuration
connect_stats: Arc<Statistics>, // Shared statistics
attempts: usize, // Global reconnection attempt counter
events_tx: mpsc::Sender<Event>, // Event channel
state_tx: watch::Sender<State>, // Connection state watcher
max_payload: Arc<AtomicUsize>, // Server's max payload
last_info: ServerInfo, // Last known server info
}
```
### Server Pool
Each server in the pool carries metadata:
```rust
#[derive(Debug, Clone)]
pub struct Server {
pub addr: ServerAddr,
pub failed_attempts: usize, // Consecutive failed attempts
pub did_connect: bool, // Ever successfully connected?
pub is_discovered: bool, // Discovered via INFO, not user-configured
pub last_error: Option<String>, // Last connection error
}
```
### ConnectorOptions
```rust
pub(crate) struct ConnectorOptions {
pub tls_required: bool,
pub certificates: Vec<PathBuf>,
pub client_cert: Option<PathBuf>,
pub client_key: Option<PathBuf>,
pub tls_client_config: Option<rustls::ClientConfig>,
pub tls_first: bool,
pub auth: Auth,
pub no_echo: bool,
pub connection_timeout: Duration, // Default: 5 seconds
pub name: Option<String>,
pub ignore_discovered_servers: bool,
pub retain_servers_order: bool,
pub read_buffer_capacity: u16, // Default: 65535
pub reconnect_delay_callback: Arc<dyn Fn(usize) -> Duration>,
pub auth_callback: Option<CallbackArg1<Vec<u8>, Result<Auth, AuthError>>>,
pub max_reconnects: Option<usize>,
pub local_address: Option<SocketAddr>,
pub reconnect_to_server_callback: Option<ReconnectToServerCallback>,
}
```
## Connection Establishment Flow
```
Connector::try_connect_to_server(addr)
├── 1. DNS resolution
│ server_addr.socket_addrs()
├── 2. For each resolved address:
│ │
│ ├── 2a. Connect with timeout
│ │ tokio::time::timeout(connection_timeout, try_connect_to(socket_addr, ...))
│ │
│ └── 2b. try_connect_to():
│ │
│ ├── Select transport:
│ │ ├── "ws" → WebSocket (tokio_websockets)
│ │ ├── "wss" → WebSocket over TLS
│ │ └── default → TCP (TcpStream)
│ │
│ ├── Optional: bind to local_address
│ ├── Set TCP_NODELAY
│ ├── Create Connection with read_buffer_capacity
│ │
│ ├── If tls_first: upgrade to TLS before INFO
│ │
│ ├── Read INFO from server
│ │
│ ├── If TLS required (by option, server, or URL scheme):
│ │ upgrade to TLS (rustls)
│ │
│ ├── Discover servers from INFO.connect_urls
│ │ (unless ignore_discovered_servers)
│ │
│ ├── Build ConnectInfo with auth:
│ │ ├── username/password (from Auth or URL)
│ │ ├── token (from Auth)
│ │ ├── nkey + signed nonce (feature: nkeys)
│ │ ├── JWT + signature callback (feature: nkeys)
│ │ └── auth_callback (custom async callback)
│ │
│ ├── Send CONNECT + PING
│ │
│ └── Wait for response:
│ ├── -ERR (authorization violation) → error
│ ├── PONG or +OK → success
│ └── EOF → error
└── 3. On success:
├── Reset attempt counter
├── Increment connects statistic
├── Emit Event::Connected
├── Update State::Connected
├── Store max_payload
├── Update per-server metadata (did_connect, failed_attempts)
└── Return (ServerInfo, Connection)
```
## TLS Handling
The client supports three TLS modes:
### 1. Standard TLS (INFO → TLS)
Default behavior. The client receives the `INFO` message in plaintext, then upgrades to TLS if:
- `tls_required` option is set
- Server's `INFO.tls_required` is true
- URL scheme is `tls://`
### 2. TLS First (TLS → INFO)
When `ConnectOptions::tls_first()` is enabled, the client establishes TLS before reading INFO. This requires the server to have `handshake_first` enabled. Useful for environments where plaintext INFO is not acceptable.
### 3. WebSocket TLS
For `wss://` URLs, TLS is handled by the WebSocket library (`tokio-websockets`) directly, not by the client's TLS layer.
### TLS Configuration
The client uses `rustls` via `tokio-rustls`. Configuration steps:
1. Load root certificates from system store (`rustls-native-certs`)
2. Optionally add custom root certificates from PEM files
3. Optionally configure client certificate and key for mTLS
4. Optionally pass a custom `rustls::ClientConfig`
Crypto backend is selectable via feature flags:
- `ring` (default)
- `aws-lc-rs`
- `fips` (requires aws-lc-rs)
## Reconnection
### Reconnection Trigger
Reconnection is triggered when:
1. I/O error during read or write (`ExitReason::Disconnected`)
2. Too many pending PINGs (no PONG received)
3. User calls `Client::force_reconnect()` (`ExitReason::ReconnectRequested`)
### Reconnection Flow
```
ConnectionHandler::handle_disconnect()
├── Reset pending_pings to 0
├── Emit Event::Disconnected
├── Update State::Disconnected
└── handle_reconnect()
└── Connector::connect()
└── Loop: try_connect()
├── If reconnect_to_server_callback is set:
│ │ Call callback with (server_pool, server_info)
│ │ If returns Some(ReconnectToServer):
│ │ Validate server is in pool
│ │ Use callback's delay or default backoff
│ │ Try connecting to selected server
│ └── If None or invalid: fall through to default
├── Default selection:
│ ├── Shuffle servers (unless retain_servers_order)
│ ├── Sort by failed_attempts (ascending)
│ └── Try each server in order
├── For each server:
│ ├── Increment attempts counter
│ ├── Check max_reconnects limit
│ ├── Apply reconnect delay (exponential backoff)
│ └── try_connect_to_server(addr)
├── On success:
│ ├── Reset attempts to 0
│ ├── Re-subscribe all active subscriptions
│ │ (filter out closed subscription channels)
│ ├── Re-subscribe multiplexer wildcard
│ └── Return (ServerInfo, Connection)
└── On failure:
├── Update per-server metadata (failed_attempts, last_error)
├── Auth errors → propagate immediately
└── Other errors → continue to next server
```
### Exponential Backoff
Default reconnect delay function:
```rust
fn reconnect_delay_callback_default(attempts: usize) -> Duration {
if attempts <= 1 {
Duration::from_millis(0)
} else {
let exp: u32 = (attempts - 1).try_into().unwrap_or(u32::MAX);
let max = Duration::from_secs(4);
cmp::min(Duration::from_millis(2_u64.saturating_pow(exp)), max)
}
}
```
| Attempt | Delay |
|---------|-------|
| 1 | 0ms |
| 2 | 2ms |
| 3 | 4ms |
| 4 | 8ms |
| 5 | 16ms |
| ... | ... |
| 12 | 2048ms |
| 13+ | 4000ms (capped) |
Custom delay functions can be provided via `ConnectOptions::reconnect_delay_callback()`.
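As an illustration of a custom delay function, the sketch below keeps the doubling rule of the default but takes the cap as a parameter. The name `capped_backoff` is hypothetical; how the closure is registered with `ConnectOptions` is not shown here.

```rust
use std::cmp;
use std::convert::TryFrom;
use std::time::Duration;

/// Hypothetical variant of the default delay with a caller-chosen cap.
/// The delay is 2^(attempts - 1) milliseconds, limited to `cap`.
fn capped_backoff(attempts: usize, cap: Duration) -> Duration {
    if attempts <= 1 {
        return Duration::from_millis(0);
    }
    // Saturate instead of overflowing for very large attempt counts.
    let exp = u32::try_from(attempts - 1).unwrap_or(u32::MAX);
    cmp::min(Duration::from_millis(2_u64.saturating_pow(exp)), cap)
}
```

With a cap of 4 seconds this reproduces the default: attempt 2 waits 2 ms, attempt 12 waits 2048 ms, and attempt 13 onward is held at the cap because 4096 ms exceeds it.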
### Server Pool Updates
The server pool is dynamic:
1. **Initial pool**: from `connect()` / `ConnectOptions::connect()` URL(s)
2. **Discovered servers**: added from `INFO.connect_urls` on each connection (unless `ignore_discovered_servers` is set)
3. **Runtime updates**: via `Client::set_server_pool()` — replaces the entire pool while preserving per-server state for servers that appear in both old and new pools
4. **Order**: servers are shuffled by default (random selection), unless `retain_servers_order` is set
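The "replace but preserve state" behaviour in item 3 and the failure-based ordering can be sketched like this. It is a simplified model: `Server` is reduced to three fields with a `String` address, the function names are hypothetical, and the random shuffle is omitted because it needs an external crate.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Server {
    addr: String,
    failed_attempts: usize,
    did_connect: bool,
}

/// Builds the new pool, carrying over metadata for addresses that were
/// already present and starting fresh for the rest.
fn replace_pool(old: &[Server], new_addrs: &[&str]) -> Vec<Server> {
    new_addrs
        .iter()
        .map(|addr| {
            old.iter()
                .find(|server| server.addr == *addr)
                .cloned()
                .unwrap_or(Server {
                    addr: (*addr).to_string(),
                    failed_attempts: 0,
                    did_connect: false,
                })
        })
        .collect()
}

/// Puts servers with the fewest consecutive failures first.
/// The sort is stable, so ties keep their existing order.
fn order_for_reconnect(pool: &mut [Server]) {
    pool.sort_by_key(|server| server.failed_attempts);
}
```

Preserving `failed_attempts` across a pool update means a server that was already failing does not jump back to the front of the queue just because the pool was rewritten.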
### Max Reconnects
The `max_reconnects` option limits total reconnection attempts:
- `None` or `0` → unlimited (default)
- `Some(n)` → give up after `n` total attempts
- Counter is reset on successful connection and when `set_server_pool()` is called
## ConnectOptions Defaults
| Option | Default |
|--------|---------|
| `connection_timeout` | 5 seconds |
| `ping_interval` | 60 seconds |
| `sender_capacity` | 2048 |
| `subscription_capacity` | 65536 |
| `inbox_prefix` | `"_INBOX"` |
| `request_timeout` | 10 seconds |
| `retry_on_initial_connect` | false |
| `ignore_discovered_servers` | false |
| `retain_servers_order` | false |
| `read_buffer_capacity` | 65535 |
| `skip_subject_validation` | false |
| `no_echo` | false |
| `tls_required` | false |
| `tls_first` | false |
| `max_reconnects` | None (unlimited) |
## Background Connection
When `ConnectOptions::retry_on_initial_connect()` is enabled, the `connect()` function returns a `Client` immediately, before the connection is established. The connection is established in a background Tokio task. This means:
- `client.server_info()` returns `ServerInfo::default()` until connected
- `client.connection_state()` returns `State::Pending`
- Operations like `publish()` will queue in the command channel
- The `Client` becomes usable once the background task connects

@@ -1,472 +0,0 @@
# JetStream Internals
This document covers the JetStream subsystem — how it provides stream-based messaging with persistence, consumer management, and higher-level APIs like KV and Object Store.
## JetStream Context
**Location**: `jetstream/context.rs`
The `Context` is the entry point to the JetStream API. It wraps a `Client` and provides stream management, publishing, and consumer operations.
```rust
#[derive(Debug, Clone)]
pub struct Context {
pub(crate) client: Client,
pub(crate) prefix: String, // API subject prefix (default: "$JS.API")
pub(crate) timeout: Duration, // Default request timeout
pub(crate) max_ack_semaphore: Arc<Semaphore>, // Limits in-flight ack waits
pub(crate) ack_sender: mpsc::Sender<(oneshot::Receiver<Message>, OwnedSemaphorePermit)>,
pub(crate) backpressure_on_inflight: bool,
}
```
### Context Creation
```rust
// Default context (prefix = "$JS.API")
let jetstream = async_nats::jetstream::new(client);
// With domain (prefix = "$JS.hub.API")
let jetstream = async_nats::jetstream::with_domain(client, "hub");
// With custom prefix
let jetstream = async_nats::jetstream::with_prefix(client, "JS.acc@hub.API");
// Builder pattern for more options
let jetstream = async_nats::jetstream::Context::builder(client)
.domain("hub")
.prefix("$JS.API")
.timeout(Duration::from_secs(30))
.max_ack_pending(256)
.backpressure_on_inflight(true)
.build();
```
### JetStream API Subject Convention
All JetStream API calls are request-response messages sent to subjects following the pattern:
```
$JS.API.<operation>.<stream-name>[.<consumer-name>]
```
Examples:
- `$JS.API.STREAM.CREATE.events` — create stream "events"
- `$JS.API.STREAM.INFO.events` — get stream info
- `$JS.API.CONSUMER.DURABLE.CREATE.events.myconsumer` — create durable consumer
- `$JS.API.CONSUMER.MSG.NEXT.events.myconsumer` — pull next message
With a domain, the prefix changes to `$JS.<domain>.API`.
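The subject convention can be expressed as two small string helpers. These are hypothetical functions written for illustration; the crate builds its subjects internally and may do so differently.

```rust
/// Returns the API prefix, with the domain inserted when one is set.
fn api_prefix(domain: Option<&str>) -> String {
    match domain {
        Some(domain) => format!("$JS.{}.API", domain),
        None => "$JS.API".to_string(),
    }
}

/// Joins prefix, operation, and the stream/consumer names with dots.
fn api_subject(prefix: &str, operation: &str, names: &[&str]) -> String {
    let mut subject = format!("{}.{}", prefix, operation);
    for name in names {
        subject.push('.');
        subject.push_str(name);
    }
    subject
}
```

For example, `api_subject(&api_prefix(Some("hub")), "STREAM.INFO", &["events"])` yields `$JS.hub.API.STREAM.INFO.events`.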
## Stream Management
**Location**: `jetstream/stream.rs`
### Stream Config
```rust
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]
pub struct Config {
pub name: String,
pub subjects: Vec<String>, // Subject filter
pub retention: RetentionPolicy, // Limits, Interest, WorkQueue
pub max_consumers: i32,
pub max_messages: i64, // Per-stream message limit
pub max_messages_per_subject: i64,
pub max_bytes: i64, // Per-stream byte limit
pub max_age: Duration, // Message TTL
pub max_message_size: Option<i32>, // Max individual message size
pub storage: StorageType, // File or Memory
pub num_replicas: usize,
pub no_ack: bool, // Don't require ack
pub discard: DiscardPolicy, // Old or New
pub duplicate_window: Duration,
pub allow_rollup_hdrs: bool,
pub allow_direct: bool,
pub mirror: Option<External>,
pub sources: Vec<External>,
pub sealed: bool,
pub compression: Option<Compression>, // server_2_10+
pub first_sequence: Option<u64>, // server_2_11+
pub subject_transform: Option<SubjectTransform>, // server_2_12+
pub metadata: Option<HashMap<String, String>>, // server_2_10+
pub placement: Option<Placement>,
pub republish: Option<RePublish>,
}
```
### Stream Operations
Via `Context`:
| Method | API Subject | Description |
|--------|------------|-------------|
| `create_stream(config)` | `STREAM.CREATE.<name>` | Create a new stream |
| `get_stream(name)` | `STREAM.INFO.<name>` | Get existing stream |
| `get_or_create_stream(config)` | `STREAM.INFO` → `STREAM.CREATE` | Get or create |
| `delete_stream(name)` | `STREAM.DELETE.<name>` | Delete a stream |
| `update_stream(name, config)` | `STREAM.UPDATE.<name>` | Update stream config |
| `purge_stream(name)` | `STREAM.PURGE.<name>` | Purge all messages |
| `streams()` | `STREAM.LIST` | List all streams (paged iterator) |
| `stream_names()` | `STREAM.NAMES` | List stream names (paged iterator) |
| `account_info()` | `ACCOUNT.INFO` | Get account info |
Via `Stream`:
| Method | API Subject | Description |
|--------|------------|-------------|
| `info()` | `STREAM.INFO.<name>` | Refresh stream info |
| `purge()` | `STREAM.PURGE.<name>` | Purge messages |
| `delete()` | `STREAM.DELETE.<name>` | Delete this stream |
| `update(config)` | `STREAM.UPDATE.<name>` | Update config |
| `get_raw_message(seq)` | `STREAM.MSG.GET.<name>` | Get message by sequence (stored mode) |
| `get_last_message(subject)` | `STREAM.MSG.GET.<name>` | Get last message for subject (stored mode) |
| `direct_get_last(subject)` | `DIRECT.GET.<name>` | Direct get last (may be served by any replica, bypassing the stream leader) |
| `direct_get(seq)` | `DIRECT.GET.<name>` | Direct get by sequence |
| `delete_message(seq)` | `STREAM.MSG.DELETE.<name>` | Delete a specific message |
| `create_consumer(config)` | `CONSUMER.CREATE.<stream>` | Create consumer |
| `get_or_create_consumer(name, config)` | `CONSUMER.DURABLE.CREATE.<stream>.<name>` | Get or create durable |
| `get_consumer(name)` | `CONSUMER.INFO.<stream>.<name>` | Get existing consumer |
### Stream Info
```rust
#[derive(Serialize, Deserialize, Debug, Clone)]
pub struct Info {
pub config: Config,
pub created: DateTime,
pub state: State, // Messages, bytes, first/last sequence, consumer count
pub cluster: Option<ClusterInfo>,
pub timestamp: DateTime,
pub leader: Option<String>,
pub subjects: Option<HashMap<String, u64>>, // Subject → message count
}
```
### Paged List Operations
Stream and consumer listing uses a paged iterator pattern:
```rust
// streams() returns an iterator that automatically pages
let mut streams = jetstream.streams();
while let Some(stream) = streams.next().await {
let stream = stream?;
// process stream
}
// stream_names() similarly pages
let mut names = jetstream.stream_names();
while let Some(name) = names.next().await {
println!("{}", name?);
}
```
The paged iterator sends an initial request with `offset: 0` and continues fetching pages until no more results are returned.
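The offset-based paging loop can be sketched synchronously. This is a simplified model: `Page`, its `total` field, and `list_all` are hypothetical names, the stop-at-total check is an assumption added alongside the empty-page check described above, and the real iterator is an async stream that issues a request per page.

```rust
// One page of results as a server might return it.
struct Page {
    total: usize,       // total number of items available
    items: Vec<String>, // items in this page
}

/// Fetches pages starting at offset 0 until a page is empty or the
/// accumulated offset reaches the reported total.
fn list_all<F>(mut fetch_page: F) -> Vec<String>
where
    F: FnMut(usize) -> Page,
{
    let mut all = Vec::new();
    let mut offset = 0;
    loop {
        let page = fetch_page(offset);
        if page.items.is_empty() {
            break;
        }
        let total = page.total;
        offset += page.items.len();
        all.extend(page.items);
        if offset >= total {
            break;
        }
    }
    all
}
```

Advancing the offset by the number of items actually received, rather than by a fixed page size, keeps the loop correct when the server returns short pages.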
## Publishing
**Location**: `jetstream/context.rs`, `jetstream/publish.rs`
### Publish
```rust
// Basic publish (fire-and-forget)
jetstream.publish("events.data", "payload".into()).await?;
// Publish with custom message builder
jetstream.publish_message(
jetstream::message::PublishMessage::build()
.payload("data".into())
.message_id("unique-id") // Nats-Msg-Id header for dedup
.expected_last_message_id("prev") // Nats-Expected-Last-Msg-Id
.expected_last_sequence(42) // Nats-Expected-Last-Sequence
.expected_last_subject_sequence("events", 10) // Per-subject sequence
.header("Custom", "Value")
).await?;
```
### PublishAck
When a message is published to a JetStream stream, the server responds with a `PublishAck`:
```rust
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]
pub struct PublishAck {
pub stream: String,
pub sequence: u64,
pub domain: Option<String>,
pub duplicate: bool,
}
```
### PublishAckFuture
Publishing returns a `PublishAckFuture` that resolves to a `PublishAck`. Each pending ack holds a permit from `max_ack_semaphore`, which caps the number of in-flight ack waits.
When `backpressure_on_inflight` is enabled, a publish waits for a free permit when too many acks are pending, so the command channel cannot fill up with an unbounded number of publish operations.
### Idempotent Publishing
Headers for exactly-once semantics:
| Header | Purpose |
|--------|---------|
| `Nats-Msg-Id` | Message ID for deduplication within the stream's duplicate window |
| `Nats-Expected-Last-Msg-Id` | Expected last message ID (conditional publish) |
| `Nats-Expected-Last-Sequence` | Expected last sequence number |
| `Nats-Expected-Last-Subject-Sequence` | Expected last sequence for a specific subject |
## Consumers
**Location**: `jetstream/consumer/`
### Consumer Types
| Type | Description |
|------|-------------|
| `PullConsumer` | Client pulls messages on demand |
| `PushConsumer` | Server pushes messages to a delivery subject |
| `OrderedConsumer` | Push consumer with automatic re-creation on failure |
### Consumer Config
```rust
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]
pub struct Config {
pub name: Option<String>,
pub durable_name: Option<String>,
pub description: Option<String>,
pub deliver_subject: Option<String>, // Push consumers only
pub ack_policy: AckPolicy,
pub ack_wait: Duration,
pub max_deliver: i64,
pub max_ack_pending: i32,
pub max_waiting: i32, // Pull consumers only
pub filter_subject: Option<String>,
pub replay_policy: ReplayPolicy,
pub sample_frequency: Option<i8>,
pub max_batch: i32, // Pull consumers
pub max_expires: Duration, // Pull consumers
pub inactive_threshold: Duration,
pub flow_control: bool, // Push consumers
pub heartbeat: Option<Duration>, // Push consumers
pub backoff: Vec<Duration>,
pub deliver_group: Option<String>,
pub num_replicas: usize,
pub mem_storage: bool,
pub metadata: Option<HashMap<String, String>>,
pub ack_markers: Option<Vec<String>>, // server_2_12+
}
```
### Pull Consumer
**Location**: `jetstream/consumer/pull.rs`
Pull consumers require explicit requests for messages:
```rust
// Batch request
let mut messages = consumer.messages().await?.take(100);
while let Some(message) = messages.next().await {
let message = message?;
message.ack().await?;
}
// Sequence-based batch
let mut batches = consumer.sequence(50)?.take(10);
while let Some(mut batch) = batches.try_next().await? {
while let Some(Ok(message)) = batch.next().await {
message.ack().await?;
}
}
// Single message fetch
let message = consumer.fetch().await?;
```
Pull requests are sent to: `$JS.API.CONSUMER.MSG.NEXT.<stream>.<consumer>`
The request payload is JSON:
```json
{"batch": 10, "expires": 5000, "no_wait": false}
```
### Push Consumer
**Location**: `jetstream/consumer/push.rs`
Push consumers receive messages automatically on a delivery subject. The client subscribes to the delivery subject and processes messages as they arrive.
Features:
- **Flow control** — server sends flow control messages, client responds to maintain delivery rate
- **Heartbeats** — idle heartbeats (status code 100) when no messages are available
- **Ordered consumers** — automatically recreated on delivery failures with correct sequence positioning
### Acknowledgment
**Location**: `jetstream/message.rs`
JetStream messages support multiple acknowledgment types:
```rust
pub enum AckKind {
Ack, // Ack (message processed)
Nack, // Nak (re-deliver)
Progress, // Progress (still working)
Next, // Next (ack + pull next)
Term, // Term (stop redelivery without marking the message as processed)
All, // Ack all messages up to this sequence
}
```
Methods on JetStream `Message`:
- `ack()` — simple acknowledgment
- `ack_with(kind)` — acknowledgment with specific type
- `double_ack()` — acknowledges and waits for the server to confirm that the ack was received
- `nack()` — negative acknowledgment (request redelivery)
- `in_progress()` — progress indicator
- `term()` — terminate message (no redelivery)
## JetStream Message
**Location**: `jetstream/message.rs`
JetStream messages wrap core `Message` with metadata extracted from headers:
```rust
#[derive(Debug)]
pub struct Message {
pub message: crate::Message, // The underlying NATS message
pub context: Context, // JetStream context for acking
pub ack_pending: Arc<AtomicU64>, // Pending ack counter
}
impl Message {
pub fn info(&self) -> Result<Info, MessageInfoError> // Parse message info from headers
pub async fn ack(&self) -> Result<(), AckError>
pub async fn ack_with(&self, kind: AckKind) -> Result<(), AckError>
pub async fn double_ack(&self) -> Result<(), AckError>
pub async fn nack(&self) -> Result<(), AckError>
pub async fn in_progress(&self) -> Result<(), AckError>
pub async fn term(&self) -> Result<(), AckError>
}
```
Message info is extracted from the `HMSG` headers:
- `Nats-Stream` — stream name
- `Nats-Consumer` — consumer name
- `Nats-Delivered` — delivery count
- `Nats-Sequence` — stream sequence
- `Nats-Time-Stamp` — timestamp
- `Nats-Subject` — original subject
- `Nats-Pending-Messages` / `Nats-Pending-Bytes` — pending counts
## Key-Value Store
**Location**: `jetstream/kv/`
The KV store is a JetStream-based key-value API. Each bucket maps to a JetStream stream with specific configuration:
```rust
// Create a KV store
let kv = jetstream
.create_key_value(async_nats::jetstream::kv::Config {
bucket: "my_bucket".to_string(),
history: 5, // Max history per key (1-64)
max_age: Duration::from_secs(3600), // Key TTL (maps to stream `max_age`)
max_bytes: 1024 * 1024, // Max bucket size
storage: StorageType::File,
num_replicas: 1,
..Default::default()
})
.await?;
```
Under the hood:
- Each key is stored as a message with subject `$KV.<bucket>.<key>`
- Watches can filter with subject wildcards (e.g. `$KV.bucket.prefix.*`); individual keys cannot contain wildcard characters
- History is managed via stream `max_messages_per_subject`
- TTL is managed via stream `max_age`
- `put(key, value)` publishes to the key subject
- `get(key)` reads the last message for the key subject
- `delete(key)` publishes an internal delete marker
- `purge(key)` uses stream purge API
- `watch()` subscribes to key changes and returns a `Watch` stream
- `keys()` / `history(key)` list keys and history
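The key-to-subject mapping listed above is simple enough to sketch with the standard library alone. The function names `kv_subject` and `key_from_subject` are assumptions made for this example; the crate's internal helpers may be named and structured differently.

```rust
/// Maps a bucket and key to the subject the value is published on.
/// (Hypothetical helper name.)
fn kv_subject(bucket: &str, key: &str) -> String {
    format!("$KV.{bucket}.{key}")
}

/// Recovers the key from a subject, or `None` if the subject does not
/// belong to the given bucket. (Hypothetical helper name.)
fn key_from_subject<'a>(bucket: &str, subject: &'a str) -> Option<&'a str> {
    subject
        .strip_prefix("$KV.")?
        .strip_prefix(bucket)?
        .strip_prefix('.')
}
```

Stripping the trailing dot as a separate step keeps a bucket named `my` from matching subjects that belong to `my_bucket`.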
## Object Store
**Location**: `jetstream/object_store/`
The Object Store provides large object storage built on JetStream. Objects are chunked and stored across multiple messages in a stream.
```rust
// Create an object store
let store = jetstream
.create_object_store(async_nats::jetstream::object_store::Config {
bucket: "my_objects".to_string(),
..Default::default()
})
.await?;
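// Put an object (`reader` is any `tokio::io::AsyncRead + Unpin` source)
let mut reader = tokio::fs::File::open("file.txt").await?;
let info = store.put("file.txt", &mut reader).await?;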
// Get an object
let mut object_stream = store.get("file.txt").await?;
```
Under the hood:
- Objects are chunked into ~128KB messages
- Metadata (object info) is stored as a separate message with subject `$O.<bucket>.M.<encoded-object-name>`
- Each chunk is a message with subject `$O.<bucket>.C.<object-nuid>`
- Metadata includes: name, description, headers, size, chunks, digest (SHA-256)
- `get()` returns a stream of chunks
- Links allow referencing one object from another (like symlinks)
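To make the chunking arithmetic concrete, here is a small sketch that splits a byte buffer into fixed-size pieces. It assumes a chunk size of exactly 128 KiB, whereas the text above only says "~128KB"; `CHUNK_SIZE`, `chunk_count`, and `split_into_chunks` are names chosen for this example.

```rust
/// Assumed chunk size for this sketch: exactly 128 KiB.
const CHUNK_SIZE: usize = 128 * 1024;

/// Number of chunk messages needed for an object of `object_len` bytes.
fn chunk_count(object_len: usize) -> usize {
    object_len.div_ceil(CHUNK_SIZE)
}

/// Splits the object into borrowed chunks; only the last may be shorter.
fn split_into_chunks(data: &[u8]) -> Vec<&[u8]> {
    data.chunks(CHUNK_SIZE).collect()
}
```

An object of 300,000 bytes, for instance, would need three chunk messages under this assumption: two full chunks and one shorter tail.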
## JetStream Error Codes
**Location**: `jetstream/errors.rs`
Standard JetStream error codes returned by the server:
| Code | Constant | Description |
|------|----------|-------------|
| 10001 | `NOT_FOUND` | Resource not found |
| 10002 | `STREAM_NOT_FOUND` | Stream not found |
| 10003 | `CONSUMER_NOT_FOUND` | Consumer not found |
| 10004 | `REQUEST_NOT_FOUND` | Request not found |
| 10005 | `STREAM_WRONG_LAST_SEQ` | Wrong last sequence |
| 10006 | `STREAM_NAME_EXISTS` | Stream already exists |
| 10007 | `CONSUMER_NAME_EXISTS` | Consumer already exists |
| 10008 | `INSUFFICIENT_RESOURCES` | Insufficient resources |
| 10009 | `NO_MESSAGE_FOUND` | No message found |
| 10013 | `CONSUMER_EXISTS` | Consumer already exists (duplicate) |
| 10014 | `STREAM_NOT_CONFIGURED` | Stream not configured |
| 10015 | `CLUSTER_NOT_ACTIVE` | Cluster not active |
| 10016 | `CLUSTER_NOT_LEADER` | Not the cluster leader |
| 10017 | `CLUSTER_NOT_ENOUGH_PEERS` | Not enough peers |
| 10018 | `CLUSTER_INCOMPLETE` | Cluster incomplete |
| 10019 | `CONSUMER_DELETED` | Consumer was deleted |
| 10020 | `CONSUMER_BAD_ACK` | Bad acknowledgment |
| 10021 | `CONSUMER_BAD_SUBJECT` | Bad consumer subject |
| 10022 | `CONSUMER_DELETED_DRIFT` | Consumer deleted due to drift |
| ... | ... | Additional codes |
## Account
**Location**: `jetstream/account.rs`
The `Account` struct provides information about the JetStream account:
```rust
pub struct Account {
pub memory: i64,
pub storage: i64,
pub streams: i64,
pub consumers: i64,
pub limits: AccountLimits,
}
```

@@ -1,292 +0,0 @@
# Authentication and Security
This document covers the authentication mechanisms, TLS configuration, and security-related features of the async-nats client.
## Authentication Methods
The NATS server supports multiple authentication methods. The client implements all of them.
### 1. Username/Password
The simplest authentication method.
```rust
// Via ConnectOptions
let client = ConnectOptions::with_user_and_password("user".into(), "pass".into())
.connect("nats://localhost")
.await?;
// Via URL
let client = connect("nats://user:pass@localhost:4222").await?;
```
These credentials are sent in the `CONNECT` message as `user` and `pass` fields.
### 2. Token Authentication
A single token used for authentication.
```rust
let client = ConnectOptions::with_token("my-token".into())
.connect("nats://localhost")
.await?;
```
Token is sent in the `CONNECT` message as `auth_token` field.
### 3. NKey Authentication
NKey-based authentication using Ed25519 key pairs. Requires the `nkeys` feature.
```rust
let seed = "SUANQDPB2RUOE4ETUA26CNX7FUKE5ZZKFCQIIW63OX225F2CO7UEXTM7ZY";
let client = ConnectOptions::with_nkey(seed.into())
.connect("nats://localhost")
.await?;
```
Flow:
1. Server sends `INFO` with a `nonce` field
2. Client creates a `KeyPair` from the seed
3. Client signs the nonce: `key_pair.sign(nonce.as_bytes())`
4. Client sends `CONNECT` with `nkey` (public key) and `sig` (Base64URL-encoded signature)
5. Server verifies the signature against the public key and nonce
### 4. JWT Authentication
User JWT with a signing callback. Requires the `nkeys` feature.
```rust
let key_pair = Arc::new(nkeys::KeyPair::from_seed(seed)?);
let jwt = load_jwt().await?;
let client = ConnectOptions::with_jwt(jwt, move |nonce| {
let key_pair = key_pair.clone();
async move { key_pair.sign(&nonce).map_err(AuthError::new) }
})
.connect("nats://localhost")
.await?;
```
Flow:
1. Server sends `INFO` with a `nonce` field
2. Client sends `CONNECT` with `jwt` (user JWT) and `sig` (Base64URL-encoded nonce signature)
3. The signing callback is async, allowing integration with external signing services (e.g., HSM)
### 5. Credentials File
Combines JWT and NKey from a `.creds` file. Requires the `nkeys` feature.
```rust
// From file
let client = ConnectOptions::with_credentials_file("path/to/my.creds")
.await?
.connect("nats://localhost")
.await?;
// From string
let client = ConnectOptions::with_credentials(creds_string)
.connect("nats://localhost")
.await?;
```
Credentials file format:
```
-----BEGIN NATS USER JWT-----
eyJ0eXAiOiJqd3QiLCJhbGciOiJlZDI1NTE5...
------END NATS USER JWT------
************************* IMPORTANT *************************
NKEY Seed printed below can be used sign and prove identity.
-----BEGIN USER NKEY SEED-----
SUAIO3FHUX5PNV2LQIIP7TZ3N4L7TX3W53MQGEIVYFIGA635OZCKEYHFLM
------END USER NKEY SEED------
```
**Location**: `auth_utils.rs` handles parsing:
- `load_creds(path)` — async file read + parse
- `parse_jwt_and_key_from_creds(creds)` — extracts JWT and KeyPair from the string
### 6. Auth Callback
A custom async callback that receives the server nonce and returns an `Auth` struct. This is the most flexible mechanism.
```rust
let client = ConnectOptions::with_auth_callback(move |nonce| {
async move {
let mut auth = Auth::new();
auth.username = Some("user".to_string());
auth.password = Some("pass".to_string());
// Can also set jwt, nkey, signature, token
Ok(auth)
}
})
.connect("nats://localhost")
.await?;
```
The callback is invoked on each connection/reconnection, allowing dynamic credential refresh (e.g., refreshing JWTs from an auth server).
### 7. URL-Embedded Credentials
```rust
// Username and password in URL
let client = connect("nats://user:pass@localhost:4222").await?;
// Token in URL (username field)
let client = connect("nats://token@localhost:4222").await?;
```
## Auth Struct
**Location**: `auth.rs`
The `Auth` struct is a container for all authentication methods. Multiple fields can be set simultaneously:
```rust
#[derive(Clone, Default)]
pub struct Auth {
pub jwt: Option<String>,
pub nkey: Option<String>,
pub signature_callback: Option<CallbackArg1<String, Result<String, AuthError>>>,
pub signature: Option<Vec<u8>>,
pub username: Option<String>,
pub password: Option<String>,
pub token: Option<String>,
}
```
Priority in `Connector::try_connect_to()`:
1. Auth callback overrides all other methods
2. NKey authentication (if `auth.nkey` is set)
3. JWT authentication (if `auth.jwt` is set)
4. Username/password/token from `Auth` struct
5. Username/password from URL
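The priority order can be read as a first-match chain. The sketch below models it with made-up types (`AuthInputs`, `Method`, `select_method`); it is a simplification that treats the list as strictly exclusive and does not reproduce the actual logic of `Connector::try_connect_to()`.

```rust
/// Which mechanism ends up being used. (Hypothetical type for this sketch.)
#[derive(Debug, PartialEq, Eq)]
enum Method {
    Callback,
    NKey,
    Jwt,
    UserPassOrToken,
    UrlCredentials,
    Unauthenticated,
}

/// The inputs that influence the choice. (Hypothetical type for this sketch.)
#[derive(Default)]
struct AuthInputs {
    has_callback: bool,
    nkey: Option<String>,
    jwt: Option<String>,
    username: Option<String>,
    password: Option<String>,
    token: Option<String>,
    url_has_credentials: bool,
}

/// Walks the priority list top to bottom and returns the first match.
fn select_method(a: &AuthInputs) -> Method {
    if a.has_callback {
        Method::Callback
    } else if a.nkey.is_some() {
        Method::NKey
    } else if a.jwt.is_some() {
        Method::Jwt
    } else if a.username.is_some() || a.password.is_some() || a.token.is_some() {
        Method::UserPassOrToken
    } else if a.url_has_credentials {
        Method::UrlCredentials
    } else {
        Method::Unauthenticated
    }
}
```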
## TLS Configuration
### TLS Modes
| Mode | When | Description |
|------|------|-------------|
| None | Default | Plaintext connection |
| Standard | `tls_required` or server requires | TLS after INFO |
| TLS First | `tls_first` option | TLS before INFO |
| WebSocket | `wss://` URL | TLS handled by WebSocket library |
### TLS Setup
**Location**: `tls.rs`
The `config_tls()` function builds a `rustls::ClientConfig`:
1. Create `RootCertStore` and load native system certificates
2. Add custom root certificates from configured PEM files
3. Build `ClientConfig` with the chosen crypto provider:
- `ring` (default)
- `aws-lc-rs`
- `fips` (aws-lc-rs in FIPS mode)
4. If client certificate + key are configured, add them for mTLS
5. If a custom `rustls::ClientConfig` was provided, use it directly
### TLS First
```rust
let client = ConnectOptions::new()
.tls_first()
.connect("nats://localhost")
.await?;
```
This sets both `tls_first = true` and `tls_required = true`. The client performs TLS handshake before reading the `INFO` message. The server must have `handshake_first: true` in its configuration.
### Custom TLS Configuration
```rust
let tls_client = rustls::ClientConfig::builder()
.with_root_certificates(root_store)
.with_no_client_auth();
let client = ConnectOptions::new()
.require_tls(true)
.tls_client_config(tls_client)
.connect("nats://localhost")
.await?;
```
### mTLS (Mutual TLS)
```rust
let client = ConnectOptions::new()
.add_root_certificates("ca.pem".into())
.add_client_certificate("cert.pem".into(), "key.pem".into())
.connect("tls://localhost")
.await?;
```
## WebSocket Transport
Requires the `websockets` feature. Supports `ws://` and `wss://` schemes.
```rust
let client = connect("ws://localhost:8080").await?;
let client = connect("wss://localhost:443").await?;
```
Implementation uses `tokio-websockets` with a `WebSocketAdapter` that wraps the WebSocket stream to implement `AsyncRead + AsyncWrite`:
```rust
// WebSocketAdapter bridges WebSocket messages to byte streams
pub(crate) struct WebSocketAdapter<T> {
pub(crate) inner: WebSocketStream<T>,
pub(crate) read_buf: BytesMut, // Buffered incoming WebSocket messages
}
```
For `wss://`, TLS is configured within the WebSocket connector, not via the client's TLS layer.
## Security Considerations
### Nonce Signing
The server's `nonce` in the `INFO` message prevents replay attacks:
- Each connection gets a unique nonce
- The nonce must be signed with the client's private key
- The signature is verified server-side against the public key
### Authorization Violations
When the server sends `-ERR 'authorization violation'`:
- The client parses this as `ServerError::AuthorizationViolation`
- The `Connector` immediately propagates this error (does not retry)
- The error is converted to `ConnectErrorKind::AuthorizationViolation`
### Subject Validation
By default, the client validates subjects for protocol safety:
- **Publish subjects**: checked for emptiness and whitespace (can be disabled with `skip_subject_validation`)
- **Subscribe subjects**: always checked for emptiness, whitespace, leading/trailing dots, consecutive dots
- **Queue group names**: checked for emptiness and whitespace
The server enforces its own validation, but client-side checks prevent protocol-framing errors.
### Max Payload Size
The client checks payload size against the server's `max_payload` before publishing:
- For plain messages: `payload.len() > max_payload`
- For messages with headers: `headers.wire_len() + payload.len() > max_payload`
- Returns `PublishErrorKind::MaxPayloadExceeded` if exceeded
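The two cases above reduce to one comparison if a missing header map counts as zero bytes. The function name `exceeds_max_payload` and its parameters are assumptions for this sketch; the crate performs the check inline when publishing.

```rust
/// Returns true when the message would exceed the server's `max_payload`.
/// `headers_wire_len` is `None` for plain messages without headers.
fn exceeds_max_payload(
    headers_wire_len: Option<usize>,
    payload_len: usize,
    max_payload: usize,
) -> bool {
    headers_wire_len.unwrap_or(0) + payload_len > max_payload
}
```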
### No Echo
When `no_echo` is set, the `CONNECT` message includes `echo: false`. The server will not deliver messages published by this connection back to its own subscriptions. This prevents feedback loops.
### Lame Duck Mode
When the server enters lame duck mode (draining for shutdown):
1. Server sends `INFO` with `ldm: true`
2. Client emits `Event::LameDuckMode`
3. Application should gracefully close or reconnect to another server
The `nats-server` test harness provides `set_lame_duck_mode(server)` for testing this behavior.

@@ -1,347 +0,0 @@
# nats-server Test Harness
This document covers the `nats-server` crate — a test harness for spawning real NATS server instances in integration tests.
**Location**: `nats-server/src/lib.rs`
**Version**: 0.1.0
**License**: Apache-2.0
**Dependencies**: `lazy_static`, `regex`, `url`, `serde_json`, `nuid`, `rand`, `tokio-retry`
## What It Is
The `nats-server` crate is **not** a NATS server implementation. It is a thin test harness that:
- Spawns the Go-based `nats-server` binary as a child process
- Configures it for test use (dynamic ports, temp storage, log files)
- Discovers the client URL from the server's `INFO` protocol message
- Cleans up resources (JetStream storage, logs, PID files) on `Drop`
- Supports single servers and 3-node clusters
The actual NATS server must be installed separately (Go binary from `github.com/nats-io/nats-server`).
## Server Struct
```rust
pub struct Server {
inner: Inner,
}
struct Inner {
cfg: String, // Config file path
id: String, // Unique server ID (NUID)
port: Option<String>, // Explicit port (None = dynamic)
child: Child, // Child process handle
logfile: PathBuf, // Log file path in temp dir
pidfile: PathBuf, // PID file path in temp dir
}
```
## Public API
### run_server
```rust
pub fn run_server(cfg: &str) -> Server
```
Starts a single NATS server with optional config file.
- Uses dynamic port (`-1` flag) for parallel test execution
- Stores JetStream data in temp directory
- Writes logs to temp file: `nats-server-<id>.log`
- Writes PID to temp file: `nats-server-<id>.pid`
- If `cfg` is non-empty, passes `-c <cfg>` to the server
Example:
```rust
let server = nats_server::run_server("tests/configs/jetstream.conf");
let client = async_nats::connect(server.client_url()).await.unwrap();
```
### run_basic_server
```rust
pub fn run_basic_server() -> Server
```
Starts a server with no config (bare minimum). Equivalent to `run_server("")`.
### run_server_with_port
```rust
pub fn run_server_with_port(cfg: &str, port: Option<&str>) -> Server
```
Starts a server with an explicit port. If `None`, uses dynamic port.
### run_cluster
```rust
pub fn run_cluster<'a, C: IntoConfig<'a>>(cfg: C) -> Cluster
```
Starts a 3-node cluster with the given config.
- Picks one random base port and derives three client ports from it (base, base+100, base+200)
- Configures cluster routes between nodes
- Each node gets: `--cluster nats://127.0.0.1:<cluster_port>`, `--routes <other_routes>`, `--cluster_name cluster`, `-n nodeN`
- Waits 2 seconds for cluster formation and leader election
The `IntoConfig` trait allows passing either a single config string (applied to all 3 nodes) or an array of 3 configs (one per node):
```rust
// Same config for all nodes
let cluster = run_cluster("configs/jetstream.conf");
// Different configs per node
let cluster = run_cluster(["node1.conf", "node2.conf", "node3.conf"]);
```
### Cluster Struct
```rust
pub struct Cluster {
pub servers: Vec<Server>,
}
impl Cluster {
pub fn client_url(&self) -> String {
self.servers[0].client_url()
}
}
```
### Server Methods
```rust
impl Server {
pub fn restart(&mut self)
pub fn client_url(&self) -> String
pub fn client_port(&self) -> u16
pub fn client_url_with(&self, user: &str, pass: &str) -> String
pub fn client_url_with_token(&self, token: &str) -> String
pub fn client_pid(&self) -> usize
}
```
#### restart()
Kills the current server process, waits for it to exit, then restarts with the same config, port, and ID. Used for testing reconnection behavior.
#### client_url()
Connects to the server's TCP port, reads the `INFO` line, parses the JSON, and constructs a URL:
- `nats://localhost:<port>` for non-TLS
- `tls://localhost:<port>` for TLS-required servers
Before connecting, it polls the log file (up to 50 seconds: 100 attempts, 500 ms apart) to discover the client address, since the port may be dynamically allocated.
#### client_pid()
Reads the PID file and returns the server process ID. Used for sending signals.
### set_lame_duck_mode
```rust
pub fn set_lame_duck_mode(s: &Server)
```
Sends the lame duck mode signal to the server:
```bash
nats-server --signal ldm=<pid>
```
### is_port_available
```rust
pub fn is_port_available(port: usize) -> bool
```
Tests if a TCP port is available by attempting to bind to it.
## Server Lifecycle
### Spawning
The `do_run` function constructs and spawns the server process:
```rust
fn do_run(cfg: &str, port: Option<&str>, id: Option<String>) -> Inner {
let id = id.unwrap_or_else(|| nuid::next().to_string());
let logfile = env::temp_dir().join(format!("nats-server-{id}.log"));
let pidfile = env::temp_dir().join(format!("nats-server-{id}.pid"));
let store_dir = env::temp_dir().join(format!("store-dir-{id}"));
let mut cmd = Command::new("nats-server");
cmd.arg("--store_dir").arg(store_dir.as_path())
.arg("-p");
match port {
Some(port) => cmd.arg(port),
None => cmd.arg("-1"), // Dynamic port
};
cmd.arg("-l").arg(logfile.as_os_str())
.arg("-P").arg(pidfile.as_os_str());
if !cfg.is_empty() {
cmd.arg("-c").arg(cfg);
}
let child = cmd.spawn().unwrap();
// ...
}
```
Key flags:
- `--store_dir` — JetStream storage directory in temp
- `-p -1` — Dynamic port allocation (or explicit port)
- `-l` — Log file path
- `-P` — PID file path
- `-c` — Config file path
### Cleanup (Drop)
```rust
impl Drop for Server {
fn drop(&mut self) {
self.inner.child.kill().unwrap();
self.inner.child.wait().unwrap();
if let Ok(log) = fs::read_to_string(self.inner.logfile.as_os_str()) {
// Clean up JetStream storage directory if found in log
if let Some(caps) = SD_RE.captures(&log) {
let sd = caps.get(1).map_or("", |m| m.as_str());
fs::remove_dir_all(sd).ok();
}
// Remove log file
fs::remove_file(self.inner.logfile.as_os_str()).ok();
}
}
}
```
The regex `SD_RE` matches the "Store Directory" line in the server log:
```
.+\sStore Directory:\s+"([^"]+)"
```
### Client URL Discovery
The `client_addr` method polls the log file to find the server's listen address:
```rust
fn client_addr(&self) -> String {
for _ in 0..100 { // 100 iterations × 500ms = 50s max
match fs::read_to_string(self.inner.logfile.as_os_str()) {
Ok(l) => {
if let Some(cre) = CLIENT_RE.captures(&l) {
return cre.get(1).unwrap().as_str()
.replace("0.0.0.0", "localhost");
} else {
thread::sleep(Duration::from_millis(500));
}
}
_ => thread::sleep(Duration::from_millis(500)),
}
}
panic!("no client addr info");
}
```
The regex `CLIENT_RE` matches:
```
.+\sclient connections on\s+(\S+)
```
After finding the address, `client_url()` connects to it and parses the `INFO` JSON to get the port and TLS requirements.
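The log-scanning step can be approximated without the `regex` crate. The sketch below looks for the same "client connections on" marker using standard string methods. It is an illustrative alternative, not the harness's code, and the name `parse_client_addr` is made up for this example.

```rust
/// Finds the first "client connections on <addr>" line in the log text and
/// returns the address, with 0.0.0.0 rewritten to localhost.
fn parse_client_addr(log: &str) -> Option<String> {
    const MARKER: &str = "client connections on ";
    log.lines().find_map(|line| {
        // Everything after the marker; the address is the first token.
        let (_, rest) = line.split_once(MARKER)?;
        let addr = rest.split_whitespace().next()?;
        Some(addr.replace("0.0.0.0", "localhost"))
    })
}
```

Returning `Option` lets the caller decide whether to sleep and retry, which is what the polling loop above does when the line has not been written yet.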
## Cluster Setup
The `run_cluster_node_with_port` function spawns a single cluster node:
```rust
fn run_cluster_node_with_port(
cfg: &str,
port: Option<&str>,
routes: Vec<usize>,
name: String,
cluster_name: String,
cluster: usize,
) -> Server
```
Additional flags for cluster nodes:
- `--routes nats://127.0.0.1:<port1>,nats://127.0.0.1:<port2>` — routes to other cluster members
- `--cluster nats://127.0.0.1:<cluster_port>` — cluster listen address
- `--cluster_name <name>` — cluster name for grouping
- `-n <name>` — server name
Port allocation for a cluster:
```
Base port: random in 3000..50000
Node 1: client_port=base, cluster_port=base+1
Node 2: client_port=base+100, cluster_port=base+101
Node 3: client_port=base+200, cluster_port=base+201
```
Each port is checked for availability with `is_port_available()`, including the +1 cluster port.
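The allocation scheme above can be written as a small pure function. `NodePorts` and `cluster_ports` are names invented for this sketch; the harness also draws the random base and checks availability, both omitted here.

```rust
/// Client and cluster listen ports for one node. (Hypothetical type.)
#[derive(Debug, PartialEq, Eq)]
struct NodePorts {
    client: u16,
    cluster: u16,
}

/// Derives the ports of all three nodes from one base port.
/// With the base drawn from 3000..50000, base + 201 stays within `u16`.
fn cluster_ports(base: u16) -> [NodePorts; 3] {
    [0u16, 100, 200].map(|offset| NodePorts {
        client: base + offset,
        cluster: base + offset + 1,
    })
}
```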
## JetStream Config
**Location**: `configs/jetstream.conf`
```conf
jetstream: {
strict: true,
max_mem_store: 8MiB,
max_file_store: 10GiB
}
```
This is the default test config for JetStream-enabled servers. It enables strict mode and sets memory/file storage limits suitable for testing.
## Test Usage Patterns
```rust
#[tokio::test]
async fn basic_test() {
let server = nats_server::run_server("configs/jetstream.conf");
let client = async_nats::connect(server.client_url()).await.unwrap();
// ... test logic ...
// Server cleaned up on drop
}
#[tokio::test]
async fn cluster_test() {
let cluster = nats_server::run_cluster("configs/jetstream.conf");
let client = async_nats::connect(cluster.client_url()).await.unwrap();
// ... test logic ...
}
#[tokio::test]
async fn reconnect_test() {
let mut server = nats_server::run_server("");
let client = async_nats::connect(server.client_url()).await.unwrap();
// Restart the server to test reconnection
server.restart();
// Client should reconnect automatically
client.publish("test", "data".into()).await.unwrap();
}
```
## Dependencies
| Dependency | Version | Purpose |
|-----------|---------|---------|
| `lazy_static` | 1.4.0 | Static regex initialization |
| `regex` | 1.7.1 | Log parsing (store directory, client address) |
| `url` | 2 | URL manipulation for client_url_with |
| `serde_json` | 1.0.104 | INFO JSON parsing |
| `nuid` | 0.5 | Unique server ID generation |
| `rand` | 0.10.1 | Random port selection |
| `tokio-retry` | 0.3.0 | Exponential backoff for cluster operations |
Note: `async-nats` is only a dev-dependency, used in the crate's own integration tests.

@@ -1,307 +0,0 @@
# Service API and Higher-Level Abstractions
This document covers the Service API and other higher-level abstractions built on top of the core NATS client.
## Service API
**Location**: `service/` (feature: `service`)
The Service API provides a framework for building NATS-based microservices with built-in monitoring, health checks, and statistics.
### Service
```rust
#[derive(Debug)]
pub struct Service {
client: Client,
info: Info,
endpoints: HashMap<String, Endpoint>,
started: DateTime,
stats_handler: Arc<dyn Fn(&str, &Stats) -> serde_json::Value + Send + Sync>,
stop_sender: mpsc::Sender<()>,
stop_receiver: Option<mpsc::Receiver<()>>,
}
```
### Creating a Service
```rust
use async_nats::service::ServiceExt;
let mut service = client
.service_builder()
.description("Product service")
.stats_handler(|endpoint, stats| {
serde_json::json!({
"endpoint": endpoint,
"requests": stats.num_requests,
"errors": stats.num_errors,
})
})
.start("products", "1.0.0")
.await?;
```
### ServiceBuilder
```rust
impl ServiceBuilder {
pub fn description(mut self, description: impl Into<String>) -> Self
pub fn stats_handler<F>(mut self, handler: F) -> Self
pub async fn start(self, name: impl Into<String>, version: impl Into<String>) -> Result<Service, ServiceError>
}
```
### Endpoints
A service exposes one or more endpoints, each handling requests on a specific subject:
```rust
// Add an endpoint
let mut endpoint = service
.endpoint("get_product")
.await?;
// Process requests
while let Some(request) = endpoint.next().await {
let request = request?;
// Handle the request
request.respond(serde_json::json!({ "id": 1, "name": "Widget" })).await?;
}
```
### Endpoint
**Location**: `service/endpoint.rs`
```rust
pub struct Endpoint {
subject: Subject,
queue_group: Option<String>,
info: EndpointInfo,
stats: Stats,
subscriber: Subscriber,
}
```
Implements `futures::Stream` yielding `ServiceRequest` objects.
### ServiceRequest
```rust
pub struct ServiceRequest {
pub subject: Subject,
pub payload: Bytes,
pub headers: Option<HeaderMap>,
pub reply: Option<Subject>,
pub client: Client,
}
```
Methods:
- `respond(payload)` — send a response to the requester
- `respond_with_headers(payload, headers)` — send a response with headers
### Monitoring Subjects
The Service API automatically creates monitoring endpoints:
| Subject | Description |
|---------|-------------|
| `$SRV.PING` | Ping all services (returns service info) |
| `$SRV.PING.<name>` | Ping specific service by name |
| `$SRV.PING.<name>.<id>` | Ping specific service instance |
| `$SRV.INFO` | Get service info |
| `$SRV.STATS` | Get service statistics |
### Service Info
```rust
pub struct Info {
pub name: String,
pub id: String,
pub version: String,
pub description: String,
pub endpoints: Vec<EndpointInfo>,
}
```
### Stats
```rust
pub struct Stats {
pub num_requests: u64,
pub num_errors: u64,
pub last_error: Option<String>,
pub processing_time: Duration,
pub average_processing_time: Duration,
}
```
## ID Generation
**Location**: `id_generator.rs`
The client needs unique IDs for inbox subjects and other purposes.
### With `nuid` Feature (Default)
Uses the NUID library for high-performance, cryptographically strong, collision-resistant IDs:
```rust
pub(crate) fn next() -> String {
nuid::next().to_string()
}
```
NUID generates 22-character alphanumeric strings using a combination of a random prefix and a sequential counter.
### Without `nuid` Feature
Falls back to `rand`-based generation:
```rust
pub(crate) fn next() -> String {
rng()
.sample_iter(Alphanumeric)
.take(22)
.map(char::from)
.collect()
}
```
Both approaches produce 22-character alphanumeric strings, but NUID is more performant and has better collision resistance.
## Inbox Generation
The `Client::new_inbox()` method generates globally unique inbox subjects for request-reply:
```rust
pub fn new_inbox(&self) -> String {
format!("{}.{}", self.inbox_prefix, crate::id_generator::next())
}
```
Default prefix is `_INBOX`, producing subjects like `_INBOX.UaBG3f3q5NxX3KdNcRmF2f`.
Custom prefix via `ConnectOptions::custom_inbox_prefix()`:
```rust
let client = ConnectOptions::new()
.custom_inbox_prefix("MYAPP")
.connect("demo.nats.io")
.await?;
// Inbox subjects: MYAPP.UaBG3f3q5NxX3KdNcRmF2f
```
## DateTime Helpers
**Location**: `datetime.rs` (feature: `jetstream` or `service` or `chrono`)
Provides date/time types for JetStream and Service API timestamps:
- Uses the `time` crate by default
- Optionally uses `chrono` via the `chrono` feature flag
- Supports RFC 3339 formatting and parsing
- `DateTime` type wraps either `time::OffsetDateTime` or `chrono::DateTime<Utc>`
## Crypto Module
**Location**: `crypto.rs` (feature: `crypto`)
Provides encryption/decryption support used by the Object Store for server-side encryption.
## Subject Validation
**Location**: `lib.rs`
The client provides two levels of subject validation:
### is_valid_publish_subject
```rust
pub(crate) fn is_valid_publish_subject<T: AsRef<str>>(subject: T) -> bool
```
Checks for protocol safety only:
- Not empty
- No whitespace (space, tab, CR, LF) which would break protocol framing
Used for publish operations. Can be disabled with `skip_subject_validation`.
### is_valid_subject
```rust
pub(crate) fn is_valid_subject<T: AsRef<str>>(subject: T) -> bool
```
Checks structural validity:
- Not empty
- No leading/trailing dots
- No consecutive dots (`..`)
- No whitespace
Used for subscribe operations (always runs, matching Go/Java behavior).
### is_valid_queue_group
```rust
pub(crate) fn is_valid_queue_group(queue_group: &str) -> bool
```
Checks:
- Not empty
- No whitespace
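The three validators only have their signatures shown above. A minimal sketch of the rules as described might look like the following. The bodies are reconstructed from the bullet lists rather than copied from the crate, and the function names are shortened to avoid implying they are the originals.

```rust
/// True if the string contains a space, tab, CR, or LF.
fn has_whitespace(s: &str) -> bool {
    s.bytes().any(|b| matches!(b, b' ' | b'\t' | b'\r' | b'\n'))
}

/// Protocol-safety check used for publishing: non-empty, no whitespace.
fn valid_publish_subject(s: &str) -> bool {
    !s.is_empty() && !has_whitespace(s)
}

/// Structural check used for subscribing: also rejects empty tokens.
fn valid_subject(s: &str) -> bool {
    !s.is_empty()
        && !has_whitespace(s)
        && !s.starts_with('.')
        && !s.ends_with('.')
        && !s.contains("..")
}

/// Queue group names: non-empty, no whitespace.
fn valid_queue_group(q: &str) -> bool {
    !q.is_empty() && !has_whitespace(q)
}
```

Under these rules a subject such as `a..b` passes the publish check but fails the subscribe check, which matches the split between protocol safety and structural validity.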
## JetStream Name Validation
**Location**: `jetstream/mod.rs`
```rust
pub(crate) fn is_valid_name(name: &str) -> bool {
!name.is_empty()
&& name.bytes().all(|c| !c.is_ascii_whitespace() && c != b'.' && c != b'*' && c != b'>')
}
```
JetStream names (stream names, consumer names) must not contain:
- Whitespace
- Dots (`.`) — would conflict with subject delimiters
- Wildcards (`*`, `>`) — would conflict with subject wildcards
## CallbackArg1
**Location**: `options.rs`
A type-erased async callback wrapper used throughout the crate:
```rust
pub(crate) type AsyncCallbackArg1<A, T> =
Arc<dyn Fn(A) -> Pin<Box<dyn Future<Output = T> + Send + Sync + 'static>> + Send + Sync>;
#[derive(Clone)]
pub(crate) struct CallbackArg1<A, T>(AsyncCallbackArg1<A, T>);
impl<A, T> CallbackArg1<A, T> {
pub(crate) async fn call(&self, arg: A) -> T {
(self.0.as_ref())(arg).await
}
}
```
Used for:
- `event_callback`: `CallbackArg1<Event, ()>`
- `auth_callback`: `CallbackArg1<Vec<u8>, Result<Auth, AuthError>>`
- `reconnect_to_server_callback`: `CallbackArg1<(Vec<Server>, ServerInfo), Option<ReconnectToServer>>`
- `signature_callback`: `CallbackArg1<String, Result<String, AuthError>>`
## Version Compatibility Checking
The `Client::is_server_compatible` method checks if the server version meets a minimum requirement:
```rust
pub fn is_server_compatible(&self, major: i64, minor: i64, patch: i64) -> bool
```
This parses the server version string from `ServerInfo::version` using a regex and compares major/minor/patch components. Note: this checks the directly-connected server, not necessarily the JetStream leader.
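The comparison itself is a lexicographic check on `(major, minor, patch)`. The sketch below parses the version with string splitting instead of a regex and treats an unparsable version as incompatible. Both choices are assumptions of this example, and `parse_version` and `is_compatible` are hypothetical names.

```rust
/// Parses "2.10.4" (optionally prefixed with 'v' or suffixed with
/// "-beta.1" / "+build") into its numeric components.
fn parse_version(v: &str) -> Option<(i64, i64, i64)> {
    let core = v.trim_start_matches('v');
    // Drop any pre-release or build suffix.
    let core = core.split(|c: char| c == '-' || c == '+').next()?;
    let mut parts = core.split('.').map(|p| p.parse::<i64>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    Some((major, minor, patch))
}

/// True if the server version is at least the required version.
fn is_compatible(server_version: &str, major: i64, minor: i64, patch: i64) -> bool {
    // Rust tuples compare lexicographically, which is exactly what is needed.
    parse_version(server_version).map_or(false, |v| v >= (major, minor, patch))
}
```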
The `server_2_10`, `server_2_11`, `server_2_12`, and `server_2_14` feature flags enable version-specific API fields and methods without runtime checks.

@@ -1,215 +0,0 @@
# Quick Reference
## Crate Summary
| | |
|---|---|
| **Crate** | `async-nats` |
| **Version** | 0.49.1 |
| **Edition** | 2021 |
| **MSRV** | 1.88.0 |
| **License** | Apache-2.0 |
| **Runtime** | Tokio |
| **Protocol** | NATS Client Protocol v1 (Dynamic) |
| **TLS** | rustls (ring / aws-lc-rs / fips) |
| **WebSocket** | tokio-websockets (feature-gated) |
## Quick Start
```rust
use async_nats::connect;
use futures_util::StreamExt;
#[tokio::main]
async fn main() -> Result<(), async_nats::Error> {
let client = connect("demo.nats.io").await?;
// Publish
client.publish("events.data", "hello".into()).await?;
// Subscribe
let mut sub = client.subscribe("events.>").await?;
while let Some(msg) = sub.next().await {
println!("{:?}", msg);
}
// Request-Response (only reached once the subscription stream above ends)
let response = client.request("service", "input".into()).await?;
Ok(())
}
```
## Architecture at a Glance
```
Client (cloneable handle, mpsc::Sender<Command>)
ConnectionHandler (single Tokio task)
├── Subscriptions HashMap<u64, Subscription>
├── Multiplexer (request-reply, SID 0)
├── Flush Observers
└── Ping/Pong health check
Connection (protocol I/O, read/write buffers)
Connector (server pool, reconnection)
NATS Server (Go binary, TCP/TLS/WebSocket)
```
## Key Types
| Type | Location | Purpose |
|------|----------|---------|
| `Client` | `client.rs` | Cloneable connection handle |
| `Subscriber` | `lib.rs` | Message stream (impl `futures::Stream`) |
| `Message` | `message.rs` | Inbound NATS message |
| `OutboundMessage` | `message.rs` | Outbound publish message |
| `Subject` | `subject.rs` | Validated subject string (backed by `Bytes`) |
| `HeaderMap` | `header.rs` | NATS message headers |
| `StatusCode` | `status.rs` | NATS protocol status codes |
| `ServerInfo` | `lib.rs` | Server INFO data |
| `ConnectInfo` | `lib.rs` | Client CONNECT data |
| `ServerAddr` | `lib.rs` | Validated server URL |
| `Auth` | `auth.rs` | Authentication credentials |
| `ConnectOptions` | `options.rs` | Connection configuration builder |
| `Event` | `lib.rs` | Connection lifecycle events |
| `State` | `connection.rs` | Connection state (Pending/Connected/Disconnected) |
| `Statistics` | `client.rs` | Atomic connection metrics |
| `Request` | `client.rs` | Request-response builder |
## JetStream Types
| Type | Location | Purpose |
|------|----------|---------|
| `jetstream::Context` | `jetstream/context.rs` | JetStream API entry point |
| `jetstream::stream::Stream` | `jetstream/stream.rs` | Stream management |
| `jetstream::stream::Config` | `jetstream/stream.rs` | Stream configuration |
| `jetstream::stream::Info` | `jetstream/stream.rs` | Stream info/state |
| `jetstream::consumer::PullConsumer` | `jetstream/consumer/pull.rs` | Pull-based consumer |
| `jetstream::consumer::PushConsumer` | `jetstream/consumer/push.rs` | Push-based consumer |
| `jetstream::consumer::Config` | `jetstream/consumer/mod.rs` | Consumer configuration |
| `jetstream::Message` | `jetstream/message.rs` | Message with ack methods |
| `jetstream::PublishAck` | `jetstream/publish.rs` | Publish acknowledgment |
| `jetstream::kv::Store` | `jetstream/kv/bucket.rs` | Key-Value store |
| `jetstream::object_store::ObjectStore` | `jetstream/object_store/mod.rs` | Object store |
| `jetstream::ErrorCode` | `jetstream/errors.rs` | JetStream error codes |
## Protocol Operations
### Client → Server (ClientOp)
| Op | Wire Format | Purpose |
|----|-----------|---------|
| `CONNECT` | `CONNECT {json}\r\n` | Authentication and capabilities |
| `PUB` | `PUB <subject> [reply] <len>\r\n<payload>\r\n` | Publish message |
| `HPUB` | `HPUB <subject> [reply] <hlen> <tlen>\r\n<hdrs><payload>\r\n` | Publish with headers |
| `SUB` | `SUB <subject> [queue] <sid>\r\n` | Subscribe |
| `UNSUB` | `UNSUB <sid> [max]\r\n` | Unsubscribe |
| `PING` | `PING\r\n` | Keepalive / health check |
| `PONG` | `PONG\r\n` | Response to server PING |
### Server → Client (ServerOp)
| Op | Wire Format | Purpose |
|----|-----------|---------|
| `INFO` | `INFO {json}\r\n` | Server capabilities, cluster info |
| `MSG` | `MSG <subj> <sid> [reply] <len>\r\n<payload>\r\n` | Deliver message |
| `HMSG` | `HMSG <subj> <sid> [reply] <hlen> <tlen>\r\n<hdrs><payload>\r\n` | Message with headers |
| `+OK` | `+OK\r\n` | Success (verbose mode) |
| `-ERR` | `-ERR <desc>\r\n` | Server error |
| `PING` | `PING\r\n` | Server health check |
| `PONG` | `PONG\r\n` | Ack client PING |
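The wire formats in the two tables above are plain text with CRLF framing, so they can be sketched in a few lines. The following is a minimal illustration only, not the client's actual parser; the function names `encode_pub` and `decode_msg` are hypothetical, and headers (`HPUB`/`HMSG`) are deliberately left out.

```python
from typing import Optional, Tuple


def encode_pub(subject: str, payload: bytes, reply: Optional[str] = None) -> bytes:
    # Frame layout: PUB <subject> [reply] <len> CRLF <payload> CRLF
    parts = ["PUB", subject]
    if reply is not None:
        parts.append(reply)
    parts.append(str(len(payload)))
    return " ".join(parts).encode("ascii") + b"\r\n" + payload + b"\r\n"


def decode_msg(frame: bytes) -> Tuple[str, str, Optional[str], bytes]:
    # Frame layout: MSG <subj> <sid> [reply] <len> CRLF <payload> CRLF
    header, _, rest = frame.partition(b"\r\n")
    fields = header.decode("ascii").split(" ")
    if fields[0] != "MSG" or len(fields) not in (4, 5):
        raise ValueError("not a MSG frame")
    subject, sid = fields[1], fields[2]
    reply = fields[3] if len(fields) == 5 else None
    length = int(fields[-1])
    payload = rest[:length]
    # The payload length comes from the header, so the payload may itself contain CRLF.
    if len(payload) != length or rest[length:length + 2] != b"\r\n":
        raise ValueError("truncated payload")
    return subject, sid, reply, payload
```

Reading the length from the header rather than scanning for the next CRLF is what allows binary payloads.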
## Internal Commands (Command → ConnectionHandler)
| Command | Purpose |
|---------|---------|
| `Publish(OutboundMessage)` | Queue message for sending |
| `Request { subject, payload, respond, headers, sender }` | Request-response via multiplexer |
| `Subscribe { sid, subject, queue_group, sender }` | Create subscription |
| `Unsubscribe { sid, max }` | Remove subscription |
| `Flush { observer }` | Wait for write buffer flush |
| `Drain { sid }` | Gracefully drain (sub or whole client) |
| `Reconnect` | Force reconnection |
| `SetServerPool { servers, result }` | Replace server pool |
| `ServerPool { result }` | Query server pool |
## Feature Flags
| Feature | Default | Enables |
|---------|---------|---------|
| `jetstream` | ✓ | JetStream API (streams, consumers, publish) |
| `kv` | ✓ | Key-Value store (requires jetstream) |
| `object-store` | ✓ | Object store (requires jetstream + crypto) |
| `service` | ✓ | Service API |
| `nkeys` | ✓ | NKey/JWT authentication |
| `crypto` | ✓ | Encryption support |
| `websockets` | ✓ | WebSocket transport |
| `nuid` | ✓ | NUID ID generation |
| `ring` | ✓ | Ring crypto backend |
| `aws-lc-rs` | ✗ | Alternative crypto backend |
| `fips` | ✗ | FIPS mode (requires aws-lc-rs) |
| `chrono` | ✗ | Use chrono instead of time |
| `experimental` | ✗ | Experimental features |
| `server_2_10` | ✓ | Server 2.10+ API fields |
| `server_2_11` | ✓ | Server 2.11+ API fields |
| `server_2_12` | ✓ | Server 2.12+ API fields |
| `server_2_14` | ✓ | Server 2.14+ API fields |
## Connection Defaults
| Parameter | Default |
|-----------|---------|
| Connection timeout | 5 seconds |
| Ping interval | 60 seconds |
| Max pending pings | 2 |
| Request timeout | 10 seconds |
| Command channel capacity | 2048 |
| Subscription capacity | 65536 |
| Read buffer capacity | 65535 |
| Inbox prefix | `_INBOX` |
| Reconnect delay | Exponential (0ms → 4s cap) |
| Max reconnects | Unlimited |
| TLS required | Auto (server-dependent) |
## Error Hierarchy
```
ConnectError (ConnectErrorKind::ServerParse | Dns | Authentication | AuthorizationViolation | TimedOut | Tls | Io | MaxReconnects)
PublishError (PublishErrorKind::MaxPayloadExceeded | InvalidSubject | Send)
RequestError (RequestErrorKind::TimedOut | NoResponders | InvalidSubject | MaxPayloadExceeded | Other)
SubscribeError (SubscribeErrorKind::InvalidSubject | InvalidQueueName | Other)
FlushError (FlushErrorKind::SendError | FlushError)
```
## nats-server Test Harness
| Function | Description |
|----------|-------------|
| `run_server(cfg)` | Start single server with config |
| `run_basic_server()` | Start bare server |
| `run_cluster(cfg)` | Start 3-node cluster |
| `set_lame_duck_mode(s)` | Send LDM signal |
## JetStream API Subjects
| Operation | Subject Pattern |
|-----------|---------------|
| Create stream | `$JS.API.STREAM.CREATE.<name>` |
| Stream info | `$JS.API.STREAM.INFO.<name>` |
| Update stream | `$JS.API.STREAM.UPDATE.<name>` |
| Delete stream | `$JS.API.STREAM.DELETE.<name>` |
| Purge stream | `$JS.API.STREAM.PURGE.<name>` |
| List streams | `$JS.API.STREAM.LIST` |
| Create consumer | `$JS.API.CONSUMER.CREATE.<stream>` |
| Create durable | `$JS.API.CONSUMER.DURABLE.CREATE.<stream>.<name>` |
| Consumer info | `$JS.API.CONSUMER.INFO.<stream>.<name>` |
| Pull next | `$JS.API.CONSUMER.MSG.NEXT.<stream>.<name>` |
| Account info | `$JS.API.ACCOUNT.INFO` |
| Direct get | `$JS.API.DIRECT.GET.<name>` |

@@ -1,963 +0,0 @@
# OpenStack Keystone Identity Service — Reference Document
> Status: Research reference
> Created: 2026-06-08
> Context: alknet auth/identity system design; rustfs S3-compatible store with Keystone auth
## 1. Overview
OpenStack Keystone is the identity service for the OpenStack cloud platform. It
provides authentication, authorization, and service discovery via a RESTful HTTP
API. Every other OpenStack service (Nova, Neutron, Cinder, Swift, etc.) depends
on Keystone for token validation and access control.
Key responsibilities:
| Responsibility | Description |
|---|---|
| **Authentication** | Verify identity via passwords, tokens, TOTP, SAML, OIDC, application credentials |
| **Authorization** | Role-based access control (RBAC) across projects, domains, and system scope |
| **Service Catalog** | Registry of available services and their endpoint URLs |
| **Token Management** | Issue, validate, and revoke bearer tokens with scoped authorization |
| **Federation** | Accept identity assertions from external IdPs (SAML, OIDC) |
| **Trust Delegation** | Allow users to delegate limited authority to other users |
---
## 2. Core Concepts
### 2.1 Domains
A **domain** is a top-level namespace that contains users, groups, and projects.
Domains provide administrative isolation: a domain administrator can manage
users and projects within their domain but not across domains.
- Domains were introduced in the Identity API v3 (the "v3" API).
- Before domains, OpenStack used "tenants" (v2 API) — projects are the v3
equivalent, but domains add a containment boundary.
- Every user, group, and project belongs to exactly one domain.
- The `Default` domain is created automatically and holds all v2-compatible
resources.
**Key property**: Domains are the unit of administrative delegation. A domain
admin can create/delete users, groups, and projects within their domain.
### 2.2 Projects
A **project** is a container for resources — compute instances, storage volumes,
networks, etc. Projects are the primary scope for authorization in OpenStack.
- Projects group resources: "who can see/use these VMs and volumes?"
- Projects belong to a domain.
- Projects are the primary unit for role assignment and token scoping.
- Projects can be hierarchical (parent/child) with inherited role assignments.
**Key property**: A project-scoped token lets you operate on resources within
that project. You cannot use a project-scoped token to access resources in a
different project.
### 2.3 Users
A **user** represents a digital identity — a person, system account, or service
account that can authenticate and be authorized.
- Users belong to a domain.
- Users can have multiple authentication methods (password, TOTP, application
credentials, federated identity).
- Users can be members of groups.
- Users receive role assignments on projects, domains, or system scope.
### 2.4 Groups
A **group** is a named collection of users. Groups simplify role management: you
assign a role to a group on a project, and every user in the group inherits that
role.
- Groups belong to a domain.
- Groups are used for role assignment: `group:X → role:member → project:Y`.
- Federation mappings often resolve external IdP groups to local Keystone groups.
### 2.5 Roles
A **role** is a named permission set. Roles by themselves don't define what
operations are allowed — they are labels that policy files map to API operations.
- Roles are assigned by binding an actor (user or group) to a target (project,
domain, or system) with a role.
- Assignment format: `{actor, role, target}` — e.g., `{user:alice, member,
project:engineering}`.
- OpenStack defines default roles: `admin`, `member`, `reader`.
- Custom roles can be created. Policy files (policy.yaml) map roles to API
operations.
- **Implied roles**: one role can imply another (e.g., `admin` implies `member`
implies `reader`).
- **Inherited roles**: a role assigned on a domain with `inherited_to_projects`
flag propagates to all projects within that domain.
### 2.6 Endpoints
An **endpoint** is a network-accessible URL for an OpenStack service. Each
service registers one or more endpoints in Keystone's service catalog.
- Endpoints have an **interface** type:
- `public` — for end users (public network)
- `internal` — for service-to-service communication (internal network)
- `admin` — for administrative operations (restricted network)
- Endpoints have a **region** attribute for multi-region deployments.
- Endpoint URLs can contain template variables like `$(project_id)s` that are
resolved at token time.
### 2.7 Service Catalog
The **service catalog** is a registry of all services available in the
deployment and their endpoints. It is included in token responses and is
available via `GET /v3/auth/catalog`.
- A service has a `type` (e.g., `identity`, `compute`, `object-store`) and a
`name` (e.g., `keystone`, `nova`, `swift`).
- The `type` follows the [service-types authority][] — it identifies the API
contract, not the implementation version.
- The service catalog in a token is filtered by scope: a project-scoped token
shows only endpoints relevant to that project.
- Endpoint filtering allows administrators to restrict which endpoints are
visible to specific projects via project-endpoint associations or endpoint
groups.
[service-types authority]: https://service-types.openstack.org/
**Example service catalog entry:**
```json
{
  "catalog": [
    {
      "name": "Keystone",
      "type": "identity",
      "endpoints": [
        {
          "interface": "public",
          "url": "https://identity.example.com:5000/"
        },
        {
          "interface": "internal",
          "url": "https://identity.internal:5000/"
        },
        {
          "interface": "admin",
          "url": "https://identity.admin:5000/"
        }
      ]
    }
  ]
}
```
---
## 3. Token Lifecycle
### 3.1 Token Types by Scope
| Token Type | Scope | Contains | Use Case |
|---|---|---|---|
| **Unscoped** | None | User identity only, no roles, no catalog | Prove identity for subsequent scoped auth |
| **Project-scoped** | Project | Roles, catalog, project info | Operate on project resources (VMs, volumes) |
| **Domain-scoped** | Domain | Roles, catalog, domain info | Manage users/projects within a domain |
| **System-scoped** | System | Roles, catalog, system info | Cloud-wide admin operations |
| **Trust-scoped** | Trust | Delegated roles, trust metadata | Act on behalf of another user |
### 3.2 Authentication Flow
```
1. Client → POST /v3/auth/tokens (with credentials)
2. Keystone validates credentials
3. Keystone issues token:
- Token ID returned in X-Subject-Token header
- Token body (JSON) returned in response body
4. Client uses token: X-Auth-Token: <token_id> on subsequent requests
5. Services validate token:
- Option A: Local validation (Fernet/JWS — self-contained)
- Option B: Call Keystone to validate (UUID tokens)
```
### 3.3 Token Providers
| Provider | Format | Persistence | Size | Security |
|---|---|---|---|---|
| **Fernet** (default) | AES-128-CBC ciphertext + SHA256 HMAC | None (self-contained) | ~200 bytes | Symmetric keys; only Keystone can decrypt |
| **JWS** | JSON Web Signature (ES256) | None (self-contained) | ~800 bytes | Asymmetric keys; anyone can verify signature, payload is readable |
| **UUID** (legacy) | Random UUID string | Database (must be stored) | ~32 bytes | Requires database lookup for validation |
**Fernet tokens** are the recommended default. They are:
- Self-contained: no database persistence needed.
- Encrypted: the token payload is opaque to clients.
- Compact: much smaller than JWS tokens.
- Key rotation: Fernet keys are rotated using `keystone-manage fernet_rotate`.
**JWS tokens** are appropriate when:
- You want asymmetric key verification (services can validate without sharing
symmetric keys).
- You're comfortable with the payload being readable by anyone who has the token.
### 3.4 Token Contents
A project-scoped token contains:
```json
{
  "token": {
    "methods": ["password"],
    "user": {
      "id": "aaa...",
      "name": "alice",
      "domain": { "id": "default", "name": "Default" }
    },
    "project": {
      "id": "bbb...",
      "name": "engineering",
      "domain": { "id": "default", "name": "Default" }
    },
    "roles": [
      { "id": "ccc...", "name": "member" },
      { "id": "ddd...", "name": "reader" }
    ],
    "catalog": [ ... ],
    "expires_at": "2026-06-08T12:00:00.000000Z",
    "issued_at": "2026-06-08T11:00:00.000000Z",
    "audit_ids": ["eeee..."],
    "is_domain": false
  }
}
```
Key fields:
- `methods`: Authentication methods used (e.g., `["password"]` or
`["password", "totp"]` for MFA).
- `user`: Who the token belongs to.
- `project` / `domain` / `system`: The authorization scope.
- `roles`: The roles assigned to the user within the scope.
- `catalog`: Service catalog (absent in unscoped tokens).
- `expires_at` / `issued_at`: Token validity window.
- `audit_ids`: Chain of audit IDs for tracking token derivation.
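A consumer of the token body typically needs only a few of these fields. The sketch below assumes the JSON shape shown above and a timestamp format matching the example; `token_summary` and `parse_keystone_time` are hypothetical helper names, and trust scope is not handled.

```python
from datetime import datetime, timezone


def parse_keystone_time(value: str) -> datetime:
    # Timestamps in the example look like 2026-06-08T12:00:00.000000Z (UTC).
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def token_summary(body: dict, now: datetime) -> dict:
    token = body["token"]
    # The scope is whichever of these keys is present; none means unscoped.
    scope = next((k for k in ("project", "domain", "system") if k in token), None)
    return {
        "user": token["user"]["name"],
        "scope": scope,
        "roles": [role["name"] for role in token.get("roles", [])],
        "mfa": len(token["methods"]) > 1,
        "expired": now >= parse_keystone_time(token["expires_at"]),
    }
```

Passing `now` in explicitly keeps the expiry check deterministic and easy to test.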
### 3.5 Token Validation
When a service receives a request with a token:
1. Extract `X-Auth-Token` header.
2. For Fernet tokens: decrypt with local Fernet key, parse payload, verify
expiration. Check revocation events.
3. For JWS tokens: verify signature with public key, parse payload, verify
expiration. Check revocation events.
4. For UUID tokens: call Keystone to validate. (Legacy: the UUID provider was deprecated and then removed in the Rocky release.)
Keystone middleware (`keystonemiddleware`) handles this automatically for
OpenStack services.
### 3.6 Token Revocation
Tokens can be revoked explicitly (`DELETE /v3/auth/tokens`) or implicitly via
revocation events triggered by:
- User account disabled
- Domain disabled
- Project disabled
- Password changed (invalidates all tokens for that user)
- Role assignment changed (invalidates tokens for the affected scope)
Revocation events use pattern matching for efficiency — a single event can
invalidate many tokens (e.g., all tokens for a user, or all tokens for a project).
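The pattern-matching idea can be sketched as follows. This is an illustration of the concept rather than Keystone's implementation: an event is modeled as a dict naming only the attributes that must match, with an optional `issued_before` cutoff, and `is_revoked` is a hypothetical name.

```python
def is_revoked(token: dict, events: list) -> bool:
    """Return True if any revocation event matches the token."""
    attrs = {
        "user_id": token["user"]["id"],
        "project_id": token.get("project", {}).get("id"),
        "audit_id": token["audit_ids"][0],
    }
    for event in events:
        cutoff = event.get("issued_before")
        # Same-format ISO 8601 UTC strings compare correctly as plain strings.
        if cutoff is not None and token["issued_at"] >= cutoff:
            continue  # token was issued after the event, so it is unaffected
        fields = {k: v for k, v in event.items() if k != "issued_before"}
        if all(attrs.get(k) == v for k, v in fields.items()):
            return True
    return False
```

One event such as `{"user_id": "u1", "issued_before": ...}` covers every token that user held at the time, which is why no per-token revocation list is needed.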
---
## 4. Scoping
### 4.1 Unscoped → Scoped Flow
The typical authentication flow is two-step:
1. **Authenticate** → receive an **unscoped token** (proves identity, no
authorization).
2. **Re-authenticate with scope** → receive a **scoped token** (proves identity
+ authorization).
```bash
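# KEYSTONE_URL is a placeholder for the identity endpoint,
# e.g. https://identity.example.com:5000
# Step 1: Get unscoped token (the token ID comes back in the X-Subject-Token header)
curl -i -X POST "$KEYSTONE_URL/v3/auth/tokens" \
  -H "Content-Type: application/json" \
  -d '{
    "auth": {
      "identity": {
        "methods": ["password"],
        "password": {
          "user": {
            "name": "alice",
            "domain": { "name": "Default" },
            "password": "..."
          }
        }
      }
    }
  }'
# Step 2: Get project-scoped token using unscoped token
curl -i -X POST "$KEYSTONE_URL/v3/auth/tokens" \
  -H "Content-Type: application/json" \
  -d '{
    "auth": {
      "identity": {
        "methods": ["token"],
        "token": { "id": "<unscoped_token>" }
      },
      "scope": {
        "project": { "name": "engineering", "domain": { "name": "Default" } }
      }
    }
  }'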
```
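The same two request bodies can be built programmatically. This is a minimal sketch with hypothetical helper names (`password_auth`, `rescope_to_project`); it only constructs the JSON payloads and sends nothing. Note that a user identified by name also needs a domain, which the helper defaults to `Default`.

```python
import json


def password_auth(name: str, password: str, domain: str = "Default") -> dict:
    # Step 1 body: no "scope" key, so this requests an unscoped token.
    user = {"name": name, "domain": {"name": domain}, "password": password}
    return {"auth": {"identity": {"methods": ["password"],
                                  "password": {"user": user}}}}


def rescope_to_project(token_id: str, project: str, domain: str = "Default") -> dict:
    # Step 2 body: authenticate with the existing token and add a project scope.
    return {
        "auth": {
            "identity": {"methods": ["token"], "token": {"id": token_id}},
            "scope": {"project": {"name": project, "domain": {"name": domain}}},
        }
    }


if __name__ == "__main__":
    print(json.dumps(rescope_to_project("<unscoped_token>", "engineering"), indent=2))
```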
### 4.2 Scope Types and Authorization
| Scope | Token Can Do | Token Cannot Do |
|---|---|---|
| **Project** | Operate on project resources (VMs, storage, networks) | Manage domain users, system-wide operations |
| **Domain** | Manage users/projects within that domain | Operate on project resources (without project scope) |
| **System** | Cloud-wide admin: manage endpoints, services, hypervisor info | Project-specific resource operations |
| **None (unscoped)** | Prove identity to Keystone | Access any service resources |
A project-scoped token **cannot** be reused in a different project. Each scope
is a separate token. This is a deliberate security design: token scope limits
the blast radius of a compromised token.
### 4.3 Design Rationale
The scoping model exists because:
1. **Principle of least privilege**: Users authenticate once (expensive), then
get narrowly scoped tokens (cheap) for each operation context.
2. **Multi-tenancy**: A cloud serves many organizations; project scoping
prevents cross-tenant access.
3. **Administrative separation**: Domain admins manage users; system admins
manage infrastructure. Different scopes for different jobs.
---
## 5. Role-Based Access Control (RBAC)
### 5.1 Role Assignments
A role assignment binds an **actor** (user or group) to a **role** on a
**target** (project, domain, or system).
The four assignment types:
| Assignment | Actor | Target | Example |
|---|---|---|---|
| User → Project | User | Project | Alice is `member` of `engineering` |
| Group → Project | Group | Project | `dev-team` group is `member` of `engineering` |
| User → Domain | User | Domain | Alice is `admin` of `acme-domain` |
| Group → Domain | Group | Domain | `ops-team` group is `admin` of `acme-domain` |
Plus **system** role assignments for cloud-wide operations.
### 5.2 Effective Role Assignments
When querying role assignments with `effective=True`, Keystone resolves:
1. **Direct assignments**: Roles explicitly granted.
2. **Group memberships**: Roles inherited from groups the user belongs to.
3. **Inherited roles**: Roles from parent projects or domains (via
`inherited_to_projects` flag).
4. **Implied roles**: Roles implied by other roles (e.g., `admin` → `member`
→ `reader`).
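The four resolution steps can be sketched as a small function. The data model here (tuples for actors and targets, an `inherited` flag, an `implies` dict) is an assumption made for illustration and is not Keystone's schema; project hierarchies are omitted.

```python
def expand_implied(roles, implies):
    # Transitive closure over the implied-role graph (admin -> member -> reader).
    result, stack = set(roles), list(roles)
    while stack:
        for implied in implies.get(stack.pop(), ()):
            if implied not in result:
                result.add(implied)
                stack.append(implied)
    return result


def effective_roles(user, project, domain, assignments, groups, implies):
    # Steps 1 and 2: the user acts as themselves and as every group they belong to.
    actors = {("user", user)}
    actors |= {("group", g) for g, members in groups.items() if user in members}
    direct = set()
    for a in assignments:
        if a["actor"] not in actors:
            continue
        on_project = a["target"] == ("project", project)
        # Step 3: a domain assignment only reaches projects when flagged as inherited.
        inherited = a["target"] == ("domain", domain) and a.get("inherited", False)
        if on_project or inherited:
            direct.add(a["role"])
    # Step 4: add everything the collected roles imply.
    return expand_implied(direct, implies)
```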
### 5.3 Policy Enforcement
Keystone uses `oslo.policy` for policy enforcement. Each OpenStack service
defines policy rules in `policy.yaml` files. A rule maps an API operation to a
check string:
```yaml
"identity:create_project": "role:admin and domain_id:%(target.domain.id)s"
"identity:list_projects": "role:reader"
"identity:update_project": "role:admin or project_id:%(target.project.id)s"
```
Policy rules can check:
- Role membership (`role:admin`)
- Scope type (`system_scope:all`, `domain_id:...`)
- Resource ownership (`user_id:%(target.user.id)s`)
- Arbitrary target attributes
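To make the check-string semantics concrete, here is a deliberately simplified evaluator. It is not `oslo.policy`: it supports only `kind:value` atoms joined by `and` / `or` (no parentheses, `not`, or rule references), and it assumes the target is passed as a flat dict keyed by the dotted names used in the templates.

```python
import re

_TEMPLATE = re.compile(r"%\(([^)]+)\)s")


def _check(atom: str, creds: dict, target: dict) -> bool:
    kind, _, value = atom.partition(":")
    # Replace %(target.x.y)s references with values from the flat target dict.
    value = _TEMPLATE.sub(lambda m: str(target.get(m.group(1), "")), value)
    if kind == "role":
        return value in creds.get("roles", [])
    # Any other kind is compared against the credential attribute of that name.
    return str(creds.get(kind)) == value


def evaluate(rule: str, creds: dict, target: dict) -> bool:
    # "and" binds tighter than "or": the rule is an OR of AND-clauses.
    return any(
        all(_check(atom.strip(), creds, target) for atom in clause.split(" and "))
        for clause in rule.split(" or ")
    )
```

For example, `evaluate("role:admin or project_id:%(target.project.id)s", creds, target)` passes for a non-admin whose token is scoped to the target's project.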
### 5.4 Scope Enforcement in Policy
Since the Rocky release, policies can require specific token scopes:
```yaml
# System-scoped token required
"identity:list_projects": "role:reader and system_scope:all"
# Project-scoped token required
"nova:create_server": "role:member and project_id:%(target.project.id)s"
```
This prevents:
- Using a project-scoped token for system operations.
- Using a system-scoped token for project operations (without a project context).
---
## 6. Trust Delegation (OS-TRUST)
### 6.1 Overview
Trusts allow one user (**trustor**) to delegate a subset of their authority to
another user (**trustee**) for a limited scope and duration, without sharing
credentials.
**Key properties of a trust:**
| Property | Description |
|---|---|
| `trustor_user_id` | User creating the trust (delegating authority) |
| `trustee_user_id` | User receiving the delegation |
| `project_id` | Project scope for the delegated authority |
| `roles` | Subset of trustor's roles being delegated |
| `impersonation` | If `true`, tokens appear to come from the trustor |
| `expires_at` | Optional expiration timestamp |
| `remaining_uses` | Optional limit on how many tokens can be created from this trust |
| `allow_redelegation` | Whether the trustee can create sub-trusts |
| `redelegation_count` | Maximum depth of redelegation chain |
### 6.2 Trust-Scoped Tokens
When a trustee authenticates using a trust:
1. The trustee authenticates with their own credentials.
2. They specify `trust_id` in the auth request.
3. Keystone issues a **trust-scoped token** with:
- Roles: the roles delegated in the trust, checked against the trustor's current
roles (if the trustor has lost a delegated role, the trust is invalidated).
- `OS-TRUST:trust` section in the token body containing trust metadata.
If `impersonation=true`, the token's `user` field shows the trustor — the
trustee acts as the trustor. If `impersonation=false`, the token's `user`
field shows the trustee.
### 6.3 Trust Delegation Chains
Trusts support **redelegation**: a trustee can create a new trust delegating to
a third party. This creates a trust chain:
```
Trustor → Trust(A) → Trustee1
Trustee1 → Trust(B) → Trustee2 (redelegation)
```
Delegation depth is controlled by:
- `allow_redelegation: true/false`
- `redelegation_count: N` (decremented on each redelegation; default max is 3)
**Security constraints:**
- The redelegated trust's roles must be a subset of the original trustor's
roles (not the intermediate trustee's).
- If `impersonation=false` in the source trust, the redelegated trust cannot
set `impersonation=true`.
- Application credentials cannot create or delete trusts (prevents automated
escalation chains).
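The redelegation constraints above can be expressed as a single validation function. This is a sketch under the rules as stated in this section, not Keystone code; the `Trust` dataclass and `redelegate` function are hypothetical, and the application-credential restriction is left out.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Trust:
    trustor: str
    trustee: str
    roles: FrozenSet[str]
    impersonation: bool
    allow_redelegation: bool
    redelegation_count: int


def redelegate(source: Trust, new_trustee: str, roles, impersonation: bool,
               original_trustor_roles) -> Trust:
    roles = frozenset(roles)
    if not source.allow_redelegation or source.redelegation_count <= 0:
        raise PermissionError("redelegation not allowed")
    # Roles are bounded by the original trustor, not the intermediate trustee.
    if not roles <= frozenset(original_trustor_roles):
        raise PermissionError("roles exceed the original trustor's roles")
    if impersonation and not source.impersonation:
        raise PermissionError("cannot enable impersonation on a redelegated trust")
    remaining = source.redelegation_count - 1
    return Trust(
        trustor=source.trustee,  # the intermediate trustee creates the new trust
        trustee=new_trustee,
        roles=roles,
        impersonation=impersonation,
        allow_redelegation=remaining > 0,
        redelegation_count=remaining,
    )
```

Decrementing the count on every hop is what bounds the length of the chain.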
### 6.4 Automatic Trust Revocation
Trusts are automatically revoked (soft-deleted) when:
- The trustor is deleted.
- The trustee is deleted.
- The project is deleted.
- The trust expires (`expires_at`).
- The remaining uses are exhausted (`remaining_uses` reaches 0).
- The trustor loses a role that was delegated in the trust.
---
## 7. Application Credentials
### 7.1 Overview
Application credentials allow users to create long-lived, restricted credentials
for applications without exposing their password. This is especially important
for users whose identity comes from LDAP or SSO — applications can't use their
password.
**Key properties:**
| Property | Description |
|---|---|
| `name` | Unique name within the user's application credentials |
| `secret` | Auto-generated or user-provided secret (hashed on storage, shown once) |
| `project_id` | Project scope (always the user's current project) |
| `roles` | Subset of the user's roles on the project (cannot exceed user's roles) |
| `expires_at` | Optional expiration timestamp |
| `unrestricted` | `false` by default — restricted from creating/deleting other app creds and trusts |
### 7.2 Authentication with Application Credentials
```bash
# Auth with application credential ID + secret
# (KEYSTONE_URL is a placeholder for the identity endpoint)
curl -i -X POST "$KEYSTONE_URL/v3/auth/tokens" \
  -H "Content-Type: application/json" \
  -d '{
    "auth": {
      "identity": {
        "methods": ["application_credential"],
        "application_credential": {
          "id": "aa809205ed614a0e854bac92c0768bb9",
          "secret": "oKce6DOC_WcZoE13l3eX..."
        }
      }
    }
  }'
```
Or by name + user:
```bash
"application_credential": {
"name": "monitoring",
"user": { "name": "glance", "domain": { "name": "Default" } },
"secret": "securesecret"
}
```
### 7.3 Restriction Model
By default (`unrestricted=false`), application credentials **cannot**:
- Create or delete other application credentials.
- Create or delete trusts.
- List other application credentials.
This prevents a compromised app credential from regenerating itself or escalating
privileges. Setting `unrestricted=true` removes these restrictions, but adds
risk.
### 7.4 Rotation
Application credentials support **zero-downtime rotation**:
1. Create a new application credential (names must be unique per user).
2. Update the application configuration with the new ID/secret.
3. Delete the old application credential.
Multiple application credentials can coexist for the same user+project,
enabling seamless transitions.
### 7.5 Invalidation
Application credentials are automatically invalidated when:
- The user is deleted or disabled.
- The user's role assignment on the project changes (roles are checked at
auth time against the user's current roles).
- The project is deleted or disabled.
- The credential expires (`expires_at`).
- The credential is explicitly deleted.
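These conditions amount to a check performed at authentication time. The sketch below is illustrative only: the dict shapes and the `rejection_reason` name are assumptions, a deleted user or project is modeled as one without `enabled` set, and an explicitly deleted credential is assumed to never reach this function.

```python
from datetime import datetime
from typing import Optional


def rejection_reason(cred: dict, user: dict, project: dict,
                     current_roles: set, now: datetime) -> Optional[str]:
    """Return why the credential is unusable, or None if it may authenticate."""
    if not user.get("enabled", False):
        return "user deleted or disabled"
    if not project.get("enabled", False):
        return "project deleted or disabled"
    expires_at = cred.get("expires_at")
    if expires_at is not None and now >= expires_at:
        return "credential expired"
    # Roles are re-checked against the user's current assignments on every auth.
    if not set(cred["roles"]) <= current_roles:
        return "user no longer holds a delegated role"
    return None
```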
---
## 8. Federation
### 8.1 Overview
Keystone's federation module allows external Identity Providers (IdPs) to
authenticate users, with Keystone acting as a Service Provider (SP). Keystone
maps the external identity to local users, groups, and roles.
**Supported protocols:**
| Protocol | Module | Use Case |
|---|---|---|
| **SAML 2.0** | mod_shib / mod_auth_mellon | Enterprise SSO |
| **OpenID Connect** | mod_auth_openidc | OAuth2/OIDC providers (Google, Keycloak, Okta) |
| **Mapped** | Custom auth module | Any HTTP auth module |
| **K2K** | Keystone-to-Keystone | Multi-cloud federation between OpenStack deployments |
### 8.2 Federation Architecture
```
                                   ┌──────────────────┐
                                   │   External IdP   │
                                   │ (SAML/OIDC/...)  │
                                   └────────┬─────────┘
                                            │ SAML assertion or
                                            │ OIDC claims
                                            ▼
┌──────────┐  HTTPD auth module    ┌──────────────────┐
│ Browser  │ ─────────────────────▶│  Apache/Nginx    │
│ or CLI   │  (mod_shib /          │  + auth module   │
└──────────┘   mod_auth_openidc)   └────────┬─────────┘
                                            │ REMOTE_USER header
                                            │ + other attributes
                                            ▼
                                   ┌──────────────────┐
                                   │     Keystone     │
                                   │       (SP)       │
                                   │                  │
                                   │ 1. Lookup IdP    │
                                   │ 2. Apply mapping │
                                   │    remote attrs  │
                                   │    → local user, │
                                   │      groups,     │
                                   │      roles       │
                                   │ 3. Issue token   │
                                   └──────────────────┘
```
### 8.3 Key Federation Components
1. **Identity Provider** object — represents the external IdP in Keystone.
Has `remote_ids` (entity IDs) that Keystone uses to match incoming
requests.
2. **Mapping** — a set of rules that transform attributes from the external IdP
into Keystone-local user properties and group memberships. Mappings can:
- Map remote users to local users (by name, email, or other attributes).
- Assign users to local groups (inherit group role assignments).
- Dynamically create projects based on remote attributes.
- Support complex condition logic.
3. **Protocol** — links an Identity Provider to a Mapping. Supported values:
`saml2`, `openid`, `mapped`, or custom.
4. **Mapping rule example:**
```json
[{
"local": [{
"user": { "name": "{0}" },
"group": { "domain": { "name": "Default" }, "name": "federated_users" }
}],
"remote": [{ "type": "REMOTE_USER" }]
}]
```
This maps all authenticated external users to a local user (named by the
`REMOTE_USER` attribute) and adds them to the `federated_users` group.
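How such a rule is applied can be sketched as follows. This is a much-reduced model of the mapping engine, assuming only positional `{0}`-style substitution and presence checks on remote attributes; `apply_mapping` is a hypothetical name and the condition logic mentioned above is omitted.

```python
def apply_mapping(rules: list, environ: dict) -> dict:
    for rule in rules:
        values = []
        for remote in rule["remote"]:
            if remote["type"] not in environ:
                break  # a required remote attribute is missing; try the next rule
            values.append(environ[remote["type"]])
        else:
            user, groups = None, []
            for local in rule["local"]:
                if "user" in local:
                    # {0}, {1}, ... refer to the remote attributes in order.
                    user = {"name": local["user"]["name"].format(*values)}
                if "group" in local:
                    groups.append(local["group"])
            return {"user": user, "groups": groups}
    raise LookupError("no mapping rule matched")
```

With the rule above and `{"REMOTE_USER": "alice@example.com"}`, the result names the user `alice@example.com` and places them in `federated_users`.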
### 8.4 Federation Token Flow
1. User authenticates with the external IdP.
2. The HTTPD auth module (Apache/Nginx) validates the assertion and sets
`REMOTE_USER` and other headers.
3. Keystone receives the request at `/v3/OS-FEDERATION/identity_providers/{idp}/protocols/{protocol}/auth`.
4. Keystone applies the mapping rules to produce a local user + groups + roles.
5. Keystone issues a **federated unscoped token**.
6. The user can then exchange it for a scoped token (project, domain, or
system) just like any other unscoped token.
### 8.5 Identity Provider (Keystone as IdP)
Keystone can also act as an **Identity Provider** (SAML IdP), allowing its own
users to authenticate to other OpenStack deployments (K2K federation) or other
SAML SPs.
---
## 9. Service Catalog Deep Dive
### 9.1 Service Registration
Services are registered with Keystone via the API:
```bash
openstack service create --name nova --description "Compute" compute
openstack endpoint create --region RegionOne compute public https://nova.example.com:8774/
openstack endpoint create --region RegionOne compute internal https://nova.internal:8774/
openstack endpoint create --region RegionOne compute admin https://nova.admin:8774/
```
### 9.2 Catalog Filtering
The catalog returned in a token is filtered by:
1. **Scope**: A project-scoped token includes endpoints filtered by
project-endpoint associations.
2. **Endpoint groups**: Admins can define endpoint groups (filtered by service
type, region, or interface) and associate them with projects.
3. **Enabled/disabled**: Disabled services and endpoints don't appear in the
catalog.
4. **Interface visibility**: `public`, `internal`, and `admin` endpoints serve
different audiences.
### 9.3 URL Templating
Endpoint URLs support template variables:
- `$(project_id)s` — replaced with the token's project ID
- `$(user_id)s` — replaced with the token's user ID
Example:
```
https://object-store.example.com/v1/KEY_$(project_id)s
```
When a project-scoped token is issued, the catalog resolves this to:
```
https://object-store.example.com/v1/KEY_d12af07f4e2c4390a21acc31517ebec9
```
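The substitution maps directly onto Python's `%` string formatting once `$(` is rewritten to `%(`. A minimal sketch, with `render_endpoint` as a hypothetical helper name; it assumes the URL contains no literal `%` characters.

```python
def render_endpoint(template: str, **values: str) -> str:
    # Catalog templates use $(name)s; Python's % operator expects %(name)s.
    # A variable missing from `values` raises KeyError rather than passing silently.
    return template.replace("$(", "%(") % values


if __name__ == "__main__":
    print(render_endpoint(
        "https://object-store.example.com/v1/KEY_$(project_id)s",
        project_id="d12af07f4e2c4390a21acc31517ebec9",
    ))
```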
### 9.4 Client Discovery
An OpenStack client authenticates with Keystone, receives a token (which
includes the service catalog), and then uses the catalog to discover the URL
for any service it needs:
```python
# After authentication, the catalog is in the token response body.
# `token` is the value of the "token" key in that JSON body.
nova_url = None
for service in token['catalog']:
    if service['type'] == 'compute':
        for endpoint in service['endpoints']:
            if endpoint['interface'] == 'public':
                nova_url = endpoint['url']
                break
```
This is how every OpenStack client discovers service endpoints — they never
hardcode URLs. They authenticate once, get the catalog, and dynamically route
to the correct endpoint.
---
## 10. Mapping to alknet Concepts
### 10.1 Concept Comparison Table
| Keystone Concept | alknet Concept | Notes |
|---|---|---|
| Domain | (Not directly mapped) | alknet is single-tenant/small-team focused; no need for domain-level admin boundaries yet |
| Project | `Identity.resources` | Projects scope resources; alknet's `resources: HashMap<String, Vec<String>>` serves a similar scoping purpose |
| User | `Identity.id` | Keystone users ↔ alknet identities (fingerprint or UUID) |
| Group | (Not directly mapped) | Could be added via `Identity.scopes` patterns or a groups concept in alknet-storage |
| Role | `Identity.scopes` | Keystone roles map to alknet scopes: `["relay:connect", "service:gitea:read"]` ≈ role assignments |
| Token (scoped) | `AuthToken` + scoped permissions | alknet's AuthToken proves identity + timestamp; scopes come from IdentityProvider lookup |
| Service Catalog | `OperationRegistry` + OpenAPI spec generation | Both solve service discovery; Keystone is runtime API catalog, alknet generates from OpenAPI |
| Trust Delegation | (Potential future model) | alknet doesn't have delegation yet; trust model could inspire future `DelegationToken` |
| Application Credentials | API keys in `api_keys` table | alknet's `api_keys` table parallels app creds: long-lived, scoped, user-bound |
| Federation (SAML/OIDC) | Phase D OIDC provider aspiration | alknet wants to *be* an OIDC provider; Keystone consumes external IdPs |
| Service Endpoint | (Implicit in OperationEnv) | alknet operations are discovered via registry, not external endpoint lookup |
| Policy (policy.yaml) | `ForwardingPolicy` + call protocol ACL | Both enforce "who can do what where"; alknet is code-based, not YAML-configured |
### 10.2 What to Adopt from Keystone
#### 10.2.1 Scoped Tokens (Strong Adopt)
**Keystone pattern**: Unscoped → project/domain/system scoped token flow.
**alknet application**: Currently, `AuthToken` proves identity with a timestamp.
`Identity.scopes` and `Identity.resources` are resolved *after* token
verification by `IdentityProvider`. This is analogous to Keystone's flow:
| Keystone | alknet |
|---|---|
| Unscoped token (identity only) | AuthToken (proves key possession + timestamp) |
| Scoped token (identity + roles + catalog) | Identity (resolved by IdentityProvider with scopes + resources) |
| Re-auth with scope | Not needed — alknet scopes come from the `IdentityProvider` lookup |
**Recommendation**: alknet's current model is already similar to Keystone's, but
more streamlined. alknet doesn't need a separate "re-auth with scope" step
because the `IdentityProvider` resolution *is* the scoping step. However,
consider adding explicit scope fields to the token in the future for
multi-tenant deployments.
#### 10.2.2 Service Catalog Pattern (Strong Adopt)
**Keystone pattern**: Services register endpoints; clients discover them from
the token/catalog.
**alknet application**: The `OperationRegistry` + `OpenAPIServiceRegistry`
serves a similar purpose:
- Keystone: `POST /v3/auth/tokens` → response includes catalog of services
and URLs.
- alknet: `OperationRegistry` knows all available operations; `FromOpenAPI`
generates them from specs.
**Key difference**: In Keystone, the catalog is returned *with the token* and
is dynamic (filtered by project scope). In alknet, the registry is built at
startup from configuration, and access control is enforced per-operation in the
call protocol.
**Recommendation**: Consider adding a "service discovery" operation to the
call protocol — a way for clients to ask "what operations are available to me?"
This would be analogous to Keystone's `GET /v3/auth/catalog`.
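A minimal sketch of what such a discovery operation could look like, assuming a flat registry in which each operation carries one required scope. `OperationInfo` and `list_operations` are hypothetical names for illustration, not part of the existing `OperationRegistry` API:

```rust
/// Hypothetical registry entry: an operation name plus the scope needed to call it.
pub struct OperationInfo {
    pub name: &'static str,
    pub required_scope: &'static str,
}

/// Return the names of the operations that the caller's scopes permit.
/// Matching is exact string equality; wildcard handling is left out on purpose.
pub fn list_operations<'a>(
    registry: &'a [OperationInfo],
    caller_scopes: &[&str],
) -> Vec<&'a str> {
    registry
        .iter()
        .filter(|op| caller_scopes.contains(&op.required_scope))
        .map(|op| op.name)
        .collect()
}
```

The filter runs against the already-resolved identity, so the result plays the role of Keystone's scope-filtered catalog without changing how the registry is built.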
#### 10.2.3 Role Hierarchies and Implied Roles (Moderate Adopt)
**Keystone pattern**: Roles can imply other roles (`admin` → `member` →
`reader`). Role assignments on domains propagate to projects via inheritance.
**alknet application**: Currently, alknet's scopes are flat strings. Consider:
```
admin:service:* → implies → member:service:* → implies → reader:service:*
```
This would simplify scope assignment in the `IdentityProvider`: grant `admin:service:*`
and automatically get `member` and `reader` permissions.
**Recommendation**: Implement implied scopes as a Phase 2+ feature when
alknet-storage adds the ACL graph. Don't over-engineer in Phase 1.
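To make the idea concrete, here is a rough sketch of implied-scope expansion. It assumes scopes have the shape `role:rest` and that the role chain is the fixed `admin` → `member` → `reader` order from the example above; `ROLE_CHAIN`, `expand_scope`, and `is_allowed` are illustrative names only:

```rust
use std::collections::BTreeSet;

/// Assumed role ordering: each role implies every role that follows it.
const ROLE_CHAIN: [&str; 3] = ["admin", "member", "reader"];

/// Expand a scope such as "admin:service:*" into itself plus all implied scopes.
/// Scopes whose first segment is not a known role are returned unchanged.
pub fn expand_scope(scope: &str) -> BTreeSet<String> {
    let mut out = BTreeSet::new();
    out.insert(scope.to_string());
    if let Some((role, rest)) = scope.split_once(':') {
        if let Some(pos) = ROLE_CHAIN.iter().position(|r| *r == role) {
            for implied in &ROLE_CHAIN[pos + 1..] {
                out.insert(format!("{implied}:{rest}"));
            }
        }
    }
    out
}

/// True if any granted scope, after expansion, equals the required scope.
pub fn is_allowed(granted: &[&str], required: &str) -> bool {
    granted.iter().any(|g| expand_scope(g).contains(required))
}
```

Expansion at check time keeps stored grants small; expanding once at `IdentityProvider` resolution would be an equally valid design.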
#### 10.2.4 Application Credentials (Strong Adopt; already parallels)
**Keystone pattern**: Password-less auth with restricted capabilities, tied to a
user and project, with expiration and rotation support.
**alknet application**: The `api_keys` table in alknet-storage is exactly this:
| Keystone App Credential | alknet API Key |
|---|---|
| `id` + `secret` | `key_prefix` + `key_hash` |
| `roles` (subset of user's roles) | `scopes` (subset of account's scopes) |
| `project_id` (scope) | Account-scoped |
| `expires_at` | `expires_at` |
| `unrestricted` | (not yet implemented) |
| Rotation via create-new-then-delete | (not yet implemented) |
**Recommendation**: Add the `unrestricted` concept to API keys — by default,
API keys should NOT be able to create or delete other API keys or modify
account settings. Also add rotation support (create new key, update config,
delete old key).
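A sketch of how the `unrestricted` flag could gate sensitive actions. The struct below is a hypothetical in-memory view of an `api_keys` row, and the `Action` variants are assumptions chosen to match the recommendation, not an existing alknet type:

```rust
/// Hypothetical in-memory view of a row in the `api_keys` table.
pub struct ApiKey {
    pub key_prefix: String,
    pub scopes: Vec<String>,
    /// Unix seconds; `None` means the key does not expire.
    pub expires_at: Option<u64>,
    /// Only unrestricted keys may manage other keys or account settings.
    pub unrestricted: bool,
}

/// Assumed action categories for this illustration.
pub enum Action {
    CallOperation,
    CreateApiKey,
    DeleteApiKey,
    ModifyAccount,
}

/// Reject expired keys, then allow management actions only for unrestricted keys.
pub fn may_perform(key: &ApiKey, action: &Action, now: u64) -> bool {
    if key.expires_at.map_or(false, |t| now >= t) {
        return false;
    }
    match action {
        Action::CallOperation => true,
        Action::CreateApiKey | Action::DeleteApiKey | Action::ModifyAccount => key.unrestricted,
    }
}
```

Scope checks for `CallOperation` would still happen in the call protocol; this function only covers the restricted/unrestricted split.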
#### 10.2.5 Trust Delegation (Future Consideration)
**Keystone pattern**: Trustor delegates limited authority to trustee with
impersonation, expiration, usage limits, and redelegation chains.
**alknet application**: alknet doesn't have this yet, but it could be useful
for:
- **Service-to-service auth**: An alknet node delegates limited authority to a
service wrapper (e.g., "let the rustfs wrapper access S3 on my behalf for 1
hour").
- **Temporary access grants**: "Give Alice access to the `engineering` scope
for 24 hours."
- **Impersonation for audit**: Trusted services acting on behalf of a user,
with the user's identity appearing in audit logs.
**Recommendation**: Design a `DelegationToken` or `Trust` model when
alknet-storage is built. The trust model — trustor, trustee, roles, expiration,
remaining_uses — is a good template.
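One possible shape for such a model, sketched under the assumption that delegated authority is expressed as a subset of scopes. The field names follow the Keystone trust template above; `DelegationToken`, `DelegationError`, and `consume` are hypothetical:

```rust
/// Hypothetical delegation record modeled on Keystone trusts.
pub struct DelegationToken {
    pub trustor: String,
    pub trustee: String,
    pub scopes: Vec<String>,
    /// Unix seconds after which the delegation is void.
    pub expires_at: u64,
    /// `None` means unlimited uses.
    pub remaining_uses: Option<u32>,
}

#[derive(Debug, PartialEq)]
pub enum DelegationError {
    Expired,
    Exhausted,
    ScopeNotDelegated,
}

impl DelegationToken {
    /// Check that `scope` may be exercised at time `now`, decrementing the use counter.
    pub fn consume(&mut self, scope: &str, now: u64) -> Result<(), DelegationError> {
        if now >= self.expires_at {
            return Err(DelegationError::Expired);
        }
        if !self.scopes.iter().any(|s| s == scope) {
            return Err(DelegationError::ScopeNotDelegated);
        }
        if let Some(uses) = self.remaining_uses.as_mut() {
            if *uses == 0 {
                return Err(DelegationError::Exhausted);
            }
            *uses -= 1;
        }
        Ok(())
    }
}
```

Keeping `trustor` on the record is what lets audit logs show the original user while the trustee performs the call.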
#### 10.2.6 Federation (Phase D Alignment)
**Keystone pattern**: External IdPs (SAML, OIDC) authenticate users; Keystone
maps them to local identities via mapping rules.
**alknet application**: Phase D of `credential-provider.md` envisions alknet
*as* an OIDC provider for self-hosted services. This is the **inverse** of
Keystone's federation model:
- Keystone: external IdP → Keystone (SP) → local identity
- alknet Phase D: alknet (IdP) → rustfs/gitea (SP) → local identity on self-hosted service
**Key learning from Keystone's federation model**:
1. **Mapping rules** are critical. Keystone's mapping engine (`local` ← `remote`)
is how IdP attributes become local roles. alknet will need the inverse:
`Identity.scopes` → OIDC claims → rustfs/gitea policies.
2. **Group membership from federation** is temporary by default (valid for
token lifetime). alknet should consider whether federated identities are
permanent or session-scoped.
3. **Multiple IdP support**: Keystone can consume from multiple external IdPs.
alknet Phase D should support multiple SPs (multiple self-hosted services)
consuming from one alknet IdP.
**Recommendation**: When building Phase D, study Keystone's mapping rule
format. alknet will need a similar concept: `alknet.scope → oidc.claim →
service.policy`. This could be part of the `CredentialProvider` or a new
`IdentityMappingProvider`.
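As a very rough sketch of the `scope → claim → policy` chain, assuming one rule per scope and exact string matching. `MappingRule`, `map_scopes`, and the example claim and policy strings are invented for illustration and do not reflect Keystone's actual mapping format:

```rust
/// Hypothetical mapping rule: an alknet scope, the OIDC claim it becomes,
/// and the policy the downstream service should attach to that claim.
pub struct MappingRule {
    pub scope: &'static str,
    pub claim: &'static str,
    pub policy: &'static str,
}

/// Resolve an identity's scopes into (claim, policy) pairs, in rule order.
pub fn map_scopes(rules: &[MappingRule], scopes: &[&str]) -> Vec<(String, String)> {
    rules
        .iter()
        .filter(|rule| scopes.contains(&rule.scope))
        .map(|rule| (rule.claim.to_string(), rule.policy.to_string()))
        .collect()
}
```

A real implementation would likely need pattern matching on scopes and per-service rule sets, which is where Keystone's mapping engine is worth studying.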
### 10.3 What NOT to Adopt from Keystone
#### 10.3.1 Domains (Not Needed)
Keystone's domain model is designed for multi-tenant cloud hosting where
different organizations share the same OpenStack deployment. alknet is designed
for self-hosted, single-organization or small-team deployments. The domain
concept adds complexity that doesn't justify itself in alknet's use case.
alknet's `Identity.resources` already provides a lightweight scoping mechanism
that covers the "which resources does this identity have access to" use case
without the overhead of a domain hierarchy.
#### 10.3.2 Separate Policy Engine (Over-Engineering)
Keystone's `oslo.policy` is a full YAML-based policy engine with complex rule
combinations (`role:admin AND domain_id:%(target.domain.id)s OR
project_id:%(target.project.id)s`). alknet's authorization model is
programmatic (Rust code in `ForwardingPolicy` and call protocol handlers), not
configured via YAML. This is appropriate for alknet's size and complexity.
**If** alknet needs configurable policies in the future (e.g., admin-editable
ACL rules stored in the database), a simple rule engine would suffice — not the
full oslo.policy model.
#### 10.3.3 Multiple Token/Scope Types (Unnecessary Complexity)
Keystone has separate token types for project/domain/system scope. alknet's
`AuthToken` is already simpler: it proves identity + timestamp, and the
`IdentityProvider` resolves scopes. There's no need for alknet to issue
different token types for different scopes.
If multi-tenancy is added in the future, the `Identity.resources` map can
encode project equivalents without needing a separate token type.
#### 10.3.4 Service Endpoint Registration (Unnecessary)
Keystone requires every service to register its endpoints in the catalog
before it can be discovered. alknet services are registered programmatically
(via `OperationRegistry::register()`) at startup, not via a central API. The
`OperationRegistry` is built from configuration and OpenAPI specs, not from a
catalog service.
This is appropriate for alknet's architecture: services are known at deploy
time, not dynamically registered. If dynamic service discovery is needed later,
a simple registry operation in the call protocol would suffice.
---
## 11. Summary of Recommendations
| Keystone Concept | Adoption Level | alknet Implementation |
|---|---|---|
| **Scoped tokens** | ✅ Strong Adopt | Already present in IdentityProvider resolution (AuthToken → Identity with scopes/resources) |
| **Service catalog** | ✅ Strong Adopt | `OperationRegistry` + `FromOpenAPI`; consider adding "list operations" discovery |
| **Application credentials** | ✅ Strong Adopt | `api_keys` table parallels exactly; add `unrestricted` flag and rotation support |
| **Role hierarchies / implied roles** | ⚡ Moderate | Implied scope hierarchies in Phase 2+ when ACL graph is built |
| **Trust delegation** | ⚡ Moderate | Design `DelegationToken` model for service-to-service and temporary access in Phase 2+ |
| **Federation mapping** | ⚡ Moderate | Phase D: adopt `scope → claim → policy` mapping pattern for OIDC provider |
| **Token revocation events** | ⚡ Moderate | Consider pattern-matching revocation for efficiency when alknet-storage supports it |
| **Domains** | ❌ Skip | alknet is self-hosted/small-team; `Identity.resources` provides lightweight scoping |
| **oslo.policy (YAML-based)** | ❌ Skip | alknet uses programmatic auth (Rust code); add simple rule engine only if needed |
| **Multiple token types** | ❌ Skip | One token type with scope resolution via `IdentityProvider` is sufficient |
| **Endpoint registration API** | ❌ Skip | `OperationRegistry` is configured at startup, not via a catalog API |
---
## 12. References
- [Keystone Architecture — OpenStack Docs](https://docs.openstack.org/keystone/2024.2/getting-started/architecture.html)
- [Keystone Tokens Overview](https://docs.openstack.org/keystone/latest/admin/tokens-overview.html)
- [Keystone Service Catalog Overview](https://docs.openstack.org/keystone/latest/contributor/service-catalog.html)
- [Keystone Trusts Documentation](https://docs.openstack.org/keystone/latest/user/trusts.html)
- [Keystone Application Credentials](https://docs.openstack.org/keystone/queens/user/application_credentials.html)
- [Keystone Federation Configuration](https://docs.openstack.org/keystone/latest/admin/federation/configure_federation.html)
- [Keystone RBAC and Authorization — DeepWiki](https://deepwiki.com/openstack/keystone/4-authorization-and-access-control)
- [Keystone Authentication and Token Management — DeepWiki](https://deepwiki.com/openstack/keystone/3-authentication-and-token-management)
- [Keystone Trust Delegation — DeepWiki](https://deepwiki.com/openstack/keystone/4.4-trust-delegation)
- [Keystone Service Catalog — DeepWiki](https://deepwiki.com/openstack/keystone/5.4-service-catalog)
- [Keystone Token Revocation — DeepWiki](https://deepwiki.com/openstack/keystone/3.4-token-revocation)
- [Understanding OpenStack Keystone: Scoped vs. Unscoped Tokens](https://osie.io/blog/understanding-openstack-keystone-scoped-vs-unscoped-tokens)
- [Trust Delegation in OpenStack Using Keystone Trusts](https://blog.zhaw.ch/icclab/trust-delegation-in-openstack-using-keystone-trusts/)
- [OpenStack Knowledge: Keystone Federation](https://github.com/stackers-network/openstack-knowledge/blob/main/core/identity/federation.md)
- [alknet identity.md](../../architecture/identity.md)
- [alknet auth.md](../../architecture/auth.md)
- [alknet credential-provider.md](../phase2/credential-provider.md)

View File

@@ -1,137 +0,0 @@
# Polyglot: Research Overview
**Library**: `polyglot-sql` (Rust crate) / `@polyglot-sql/sdk` (TypeScript/WASM) / `polyglot-sql` (Python)
**Repository**: <https://github.com/tobilg/polyglot>
**Current Version**: 0.4.4 (as of 2026-06-03)
**License**: MIT (+ sqlglot MIT for test fixtures)
**Author**: Tobias G. (tobilg)
**Inspiration**: Python [sqlglot](https://github.com/tobymao/sqlglot) by Toby Mao
---
## 1. What Is Polyglot?
Polyglot is a **SQL transpiler** — it parses SQL from one database dialect into an AST, and generates SQL for a different dialect. It is **not** a database driver, ORM, query executor, or connection pool. Its core purpose is **dialect-agnostic SQL manipulation**: parse, transform, validate, format, and transpile SQL across 32+ database dialects.
### Key Capabilities
| Capability | Description |
|---|---|
| **Parse** | Convert SQL string → typed AST with 200+ expression node types |
| **Generate** | Convert AST → SQL string for any supported dialect |
| **Transpile** | Convert SQL from dialect A → dialect B in one call |
| **Format** | Pretty-print SQL with configurable guard rails |
| **Build** | Construct SQL programmatically via a fluent builder API |
| **Validate** | Syntax + semantic validation with error positions |
| **Lineage** | Trace column lineage through queries; generate OpenLineage payloads |
| **Diff** | AST-aware diff between two SQL expressions |
| **Traverse** | DFS/BFS iterators, predicate queries, and transforms on the AST |
### Supported Dialects (32)
Athena, BigQuery, ClickHouse, CockroachDB, Databricks, Doris, Dremio, Drill, Druid, DuckDB, Dune, Exasol, Fabric, Hive, Materialize, MySQL, Oracle, PostgreSQL, Presto, Redshift, RisingWave, SingleStore, Snowflake, Solr, Spark, SQLite, StarRocks, Tableau, Teradata, TiDB, Trino, TSQL
Plus a `Generic` dialect for standard SQL.
### Language Bindings
| Binding | Package | Delivery |
|---|---|---|
| **Rust** | `polyglot-sql` on crates.io | Native Rust crate |
| **TypeScript/WASM** | `@polyglot-sql/sdk` on npm | WASM module + JS wrapper |
| **Python** | `polyglot-sql` on PyPI | PyO3 native extension |
| **Go** | `github.com/tobilg/polyglot/packages/go` | PureGo wrapper over C FFI |
| **C FFI** | Built from `polyglot-sql-ffi` | `.so` / `.dylib` / `.dll` + `.a` / `.lib` + header |
---
## 2. Core Philosophy & Design Principles
1. **Pipeline architecture**: SQL → Tokenize → Parse → AST → Transform → Generate → SQL string. Each stage is independently configurable per dialect.
2. **Ported from Python sqlglot**: The Rust implementation is a faithful port of the Python `sqlglot` library, maintaining compatibility with its test fixtures (10,220+ fixture cases at 100% pass rate). The architecture, expression types, transformation rules, and dialect behaviors mirror the Python original.
3. **No runtime database connection**: Polyglot never connects to a database. It operates purely on SQL strings and ASTs. This makes it safe for sandboxed environments (WASM, serverless) and suitable for build-time / CI-time SQL analysis.
4. **Feature-gated compilation**: Each dialect is behind a Cargo feature flag (`dialect-postgresql`, `dialect-mysql`, etc.), so users compiling for constrained targets (WASM) can include only what they need. The `default` feature set includes everything.
5. **Stack safety**: The `stacker` feature (default-on for native builds) grows the stack on deeply nested inputs, preventing stack overflow from pathological SQL. WASM builds opt out since `stacker` doesn't work there.
6. **Guard rails**: Format/guard options limit input size (16 MiB default), token count (1M), AST node count (1M), and set-operation chain depth (256) to prevent resource exhaustion.
7. **Performance-first**: Built in Rust for speed. Benchmarks show an 8–19× speedup over the Python `sqlglot` for transpilation, with generation at ~86× faster. The WASM build enables near-native performance in browsers.
---
## 3. How It Differs from Database Abstraction Layers
**Critical distinction**: Polyglot is a **SQL dialect transpiler**, not a database abstraction layer. It does not:
- Connect to databases
- Execute queries
- Manage connection pools
- Handle migrations (no `CREATE TABLE` schema evolution management)
- Map Rust types to database types
- Provide an ORM-like interface
- Handle async I/O
Instead, it focuses purely on **SQL text manipulation**: parsing, analyzing, transforming, and generating SQL strings. This makes it complementary to (not competing with) libraries like Diesel, SQLx, or SeaORM.
---
## 4. Performance Characteristics
From the project's benchmark suite (polyglot-sql v0.1.2 vs sqlglot v28.10.1):
| Operation | Speedup Range |
|---|---|
| Parse (SQL → AST) | 10–13× faster |
| Generate (AST → SQL) | 77–101× faster |
| Roundtrip (parse → generate → re-parse) | 13–15× faster |
| Transpile (full cross-dialect) | 1.6× (simple) to 19× (complex BigQuery→Snowflake) |
| Geometric mean | **8.70×** |
Parse benchmarks (v0.4.x, native Rust):
| Query | Mean |
|---|---|
| short (SELECT a, b, c) | 51.28 μs |
| medium (5 cols, JOIN, GROUP BY) | 259.61 μs |
| complex (3 CTEs, subquery) | 268.59 μs – 1.03 ms |
---
## 5. Project Maturity Indicators
| Indicator | Status |
|---|---|
| **Version** | 0.4.4 (pre-1.0, active development) |
| **Test coverage** | 18,745 test cases at 100% pass rate |
| **crates.io downloads** | ~4,738 total (as of mid-2026) |
| **Dependent crates** | 2 (via entdb) |
| **Release cadence** | Frequent patch releases (0.4.2, 0.4.3, 0.4.4 in quick succession) |
| **Source code size** | ~241K lines of Rust in core crate |
| **Fuzzing** | Supported via `cargo +nightly fuzz` |
| **CI** | Full test suite + FFI + Python + WASM |
| **Documentation** | Rust API docs (docs.rs), TypeScript docs, Python docs, playground |
| **Breaking changes** | Possible before 1.0; semver suggests API instability |
---
## 6. License
- **MIT License** for the Polyglot code itself
- **sqlglot MIT License** for the test fixtures derived from the Python project
- Both are permissive, suitable for commercial use
---
## References
- <https://github.com/tobilg/polyglot> — Main repository
- <https://crates.io/crates/polyglot-sql> — Rust crate on crates.io
- <https://www.npmjs.com/package/@polyglot-sql/sdk> — TypeScript SDK on npm
- <https://pypi.org/project/polyglot-sql/> — Python bindings on PyPI
- <https://docs.rs/polyglot-sql/latest/polyglot_sql/> — Rust API documentation
- <https://polyglot-playground.gh.tobilg.com/> — Interactive playground
- <https://github.com/tobymao/sqlglot> — Original Python inspiration

View File

@@ -1,720 +0,0 @@
# Polyglot: Architecture Deep Dive
---
## 1. Workspace Structure
The repository is organized as a Cargo workspace with 5 crates and supporting packages:
```
polyglot/
├── crates/
│   ├── polyglot-sql/                        # Core Rust library (~241K LOC)
│   │   └── src/
│   │       ├── lib.rs                       # Public API, top-level functions
│   │       ├── tokens.rs                    # Tokenizer (lexer)
│   │       ├── parser.rs                    # Recursive-descent parser (~62K LOC)
│   │       ├── expressions.rs               # AST node types (~15K LOC)
│   │       ├── generator.rs                 # SQL code generator (~39K LOC)
│   │       ├── dialects/                    # 33 dialect implementations
│   │       │   ├── mod.rs                   # Dialect trait, Dialect struct, CustomDialectBuilder
│   │       │   ├── generic.rs               # Base/standard SQL dialect
│   │       │   ├── postgres.rs              # PostgreSQL (~1.9K LOC)
│   │       │   ├── mysql.rs                 # MySQL
│   │       │   ├── sqlite.rs                # SQLite
│   │       │   ├── bigquery.rs              # BigQuery
│   │       │   ├── ... (32 total)
│   │       ├── builder.rs                   # Fluent query builder API
│   │       ├── transforms.rs                # Cross-dialect transform functions
│   │       ├── validation.rs                # Syntax + semantic validation
│   │       ├── schema.rs                    # Schema representation
│   │       ├── scope.rs                     # Scope analysis
│   │       ├── resolver.rs                  # Column resolution
│   │       ├── lineage.rs                   # Column lineage tracking
│   │       ├── openlineage.rs               # OpenLineage payload generation
│   │       ├── diff.rs                      # AST diff (ChangeDistiller algorithm)
│   │       ├── planner.rs                   # Logical query plan
│   │       ├── optimizer/                   # Query optimizer modules
│   │       │   ├── annotate_types.rs        # Type annotation
│   │       │   ├── qualify_columns.rs       # Column qualification
│   │       │   ├── qualify_tables.rs        # Table qualification
│   │       │   ├── pushdown_predicates.rs
│   │       │   ├── pushdown_projections.rs
│   │       │   ├── eliminate_joins.rs
│   │       │   ├── eliminate_ctes.rs
│   │       │   ├── simplify.rs
│   │       │   └── ...
│   │       ├── traversal.rs                 # DFS/BFS visitors, AST predicates
│   │       ├── ast_transforms.rs            # AST manipulation utilities
│   │       ├── error.rs                     # Error types
│   │       └── time.rs                      # Time format conversion
│   ├── polyglot-sql-function-catalogs/      # Optional dialect function catalogs
│   ├── polyglot-sql-wasm/                   # WASM bindings (wasm-pack)
│   ├── polyglot-sql-ffi/                    # C FFI bindings (cbindgen)
│   └── polyglot-sql-python/                 # Python bindings (PyO3 + maturin)
├── packages/
│   ├── sdk/                                 # TypeScript SDK (@polyglot-sql/sdk)
│   ├── go/                                  # Go SDK (PureGo wrapper over FFI)
│   ├── documentation/                       # TypeScript API docs site
│   ├── playground/                          # Browser playground (React 19, Vite)
│   └── python-docs/                         # Python API docs
├── examples/
│   ├── rust/                                # Rust usage example
│   ├── typescript/                          # TypeScript SDK example
│   └── c/                                   # C FFI usage example
└── tools/
    ├── sqlglot-compare/                     # Fixture extraction & comparison
    └── bench-compare/                       # Performance benchmarks
```
---
## 2. Data Flow Pipeline
```
┌──────────────────────────────────────────────────────────────────────┐
│ SQL String (source dialect) │
└──────────────────────────┬──────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────┐
│ Tokenizer (tokens.rs) │
│ • Dialect-specific lexing rules (quotes, comments, keywords) │
│ • Configurable via TokenizerConfig per dialect │
│ • Produces Vec<Token> with type, text, and Span (line/col/offset) │
└──────────────────────────┬──────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────┐
│ Parser (parser.rs, ~62K LOC) │
│ • Recursive-descent with precedence climbing │
│ • Dialect-aware parsing (custom keywords, syntax rules) │
│ • Produces Expression AST tree │
│ • Stack safety via `stacker` feature (default-on) │
└──────────────────────────┬──────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────┐
│ Expression AST (expressions.rs) │
│ • Single tagged enum with 150+ variants │
│ • Each variant has its own struct (Select, Insert, Function, etc.) │
│ • Box<Variant> keeps enum size to 2 words (tag + pointer) │
│ • Serializable via serde (derive Serialize/Deserialize) │
│ • Optional TypeScript type generation via `ts-rs` feature flag │
└──────────────────────────┬──────────────────────────────────────────┘
┌────┴────┐
│ │
┌─────────┘ └──────────┐
│ │
▼ ▼
┌────────────────────────┐ ┌────────────────────────────────────┐
│ Transform Pipeline │ │ Semantic / Analysis Modules │
│ (transpile path) │ │ • validation.rs → syntax checks │
│ │ │ • schema.rs → column/type lookup │
│ 1. preprocess() │ │ • scope.rs → scope analysis │
│ (whole-tree rewrites│ │ • resolver.rs → column resolution │
│ like eliminate_ │ │ • lineage.rs → column lineage │
│ qualify) │ │ • openlineage.rs → OL payloads │
│ │ │ • optimizer/ → query optimization │
│ 2. transform_expr() │ │ • diff.rs → AST diff │
│ (per-node rewrites │ │ • planner.rs → logical plan DAG │
│ per dialect) │ │ • traversal.rs → DFS/BFS visitors │
│ │ │
│ 3. Generator │ │
│ (AST → SQL string) │ │
└───────────┬────────────┘ └────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────┐
│ SQL String (target dialect) │
└──────────────────────────────────────────────────────────────────────┘
```
---
## 3. Core Abstractions
### 3.1 Expression AST
The central type is `Expression`, a large tagged enum with one variant per SQL construct:
```rust
pub enum Expression {
    // Literals
    Literal(Box<Literal>),
    Boolean(BooleanLiteral),
    Null(Null),
    // Identifiers
    Identifier(Identifier),
    Column(Box<Column>),
    Table(Box<TableRef>),
    Star(Star),
    // Queries
    Select(Box<Select>),
    Union(Box<Union>),
    Intersect(Box<Intersect>),
    Except(Box<Except>),
    Subquery(Box<Subquery>),
    // DML
    Insert(Box<Insert>),
    Update(Box<Update>),
    Delete(Box<Delete>),
    Copy(Box<CopyStmt>),
    // Binary/Unary operators
    And(Box<BinaryOp>),
    Or(Box<BinaryOp>),
    Add(Box<BinaryOp>),
    Eq(Box<BinaryOp>),
    // ... 30+ operator variants
    // Functions
    Function(Box<Function>),
    AggregateFunction(Box<AggregateFunction>),
    WindowFunction(Box<WindowFunction>),
    // Clauses
    From(Box<From>),
    Join(Box<Join>),
    Where(Box<Where>),
    OrderBy(Box<OrderBy>),
    // ...
    // ~150 total variants
}
```
Key design choices:
- **Boxed variants**: Most variants wrap their payload in `Box` to keep `size_of::<Expression>()` at 2 words (16 bytes on 64-bit).
- **Serde support**: `#[derive(Serialize, Deserialize)]` for JSON serialization across FFI/WASM boundaries.
- **TypeScript types**: Optional `ts-rs` feature generates TypeScript interfaces.
- **Convenience methods**: `Expression::column()`, `Expression::number()`, `Expression::sql()`, `Expression::sql_for()`.
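The effect of boxing on enum size can be checked with a toy example. The types below are stand-ins invented for this sketch (they are not polyglot's real node structs), and the exact byte counts depend on the target and compiler:

```rust
use std::mem::size_of;

/// Stand-in for a large AST node payload (136 bytes of fields).
#[allow(dead_code)]
pub struct BigSelect {
    columns: [u64; 16],
    limit: u64,
}

/// Stand-in for a second, smaller payload.
#[allow(dead_code)]
pub struct BigInsert {
    values: [u64; 8],
}

/// Payloads stored inline: the enum is at least as large as its biggest variant.
#[allow(dead_code)]
pub enum InlineExpr {
    Select(BigSelect),
    Insert(BigInsert),
    Null,
}

/// Payloads behind `Box`: the enum only needs a tag plus one pointer.
#[allow(dead_code)]
pub enum BoxedExpr {
    Select(Box<BigSelect>),
    Insert(Box<BigInsert>),
    Null,
}

/// Returns (inline size, boxed size) in bytes for the current target.
pub fn sizes() -> (usize, usize) {
    (size_of::<InlineExpr>(), size_of::<BoxedExpr>())
}
```

The trade-off is one heap allocation per boxed node in exchange for cheap moves and compact `Vec<Expression>` storage.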
### 3.2 DialectType Enum
```rust
pub enum DialectType {
    Generic, PostgreSQL, MySQL, BigQuery, Snowflake, DuckDB, SQLite,
    Hive, Spark, Trino, Presto, Redshift, TSQL, Oracle, ClickHouse,
    Databricks, Athena, Teradata, Doris, StarRocks, Materialize,
    RisingWave, SingleStore, CockroachDB, TiDB, Druid, Solr, Tableau,
    Dune, Fabric, Drill, Dremio, Exasol, DataFusion,
}
```
- Implements `FromStr` with aliases (e.g., `"mssql"` → `TSQL`, `"cockroach"` → `CockroachDB`)
- Each variant maps to a feature-gated dialect module
- Custom dialects can be registered at runtime via `CustomDialectBuilder`
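The alias behaviour can be illustrated with a reduced enum. This is a sketch of the pattern only: `ToyDialectType` is a made-up four-variant type, case-insensitive matching is an assumption, and only the two aliases named above are included:

```rust
use std::str::FromStr;

/// Reduced stand-in for `DialectType`, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ToyDialectType {
    Generic,
    PostgreSQL,
    TSQL,
    CockroachDB,
}

impl FromStr for ToyDialectType {
    type Err = String;

    /// Accept canonical names plus aliases, ignoring ASCII case.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "generic" => Ok(Self::Generic),
            "postgresql" => Ok(Self::PostgreSQL),
            "tsql" | "mssql" => Ok(Self::TSQL),
            "cockroachdb" | "cockroach" => Ok(Self::CockroachDB),
            other => Err(format!("unknown dialect: {other}")),
        }
    }
}
```

With this in place, `"mssql".parse::<ToyDialectType>()` resolves to the same variant as `"tsql"`.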
### 3.3 DialectImpl Trait
```rust
pub trait DialectImpl {
    fn dialect_type(&self) -> DialectType;
    fn tokenizer_config(&self) -> TokenizerConfig { /* default */ }
    fn generator_config(&self) -> GeneratorConfig { /* default */ }
    fn generator_config_for_expr(&self, _expr: &Expression) -> GeneratorConfig { /* default */ }
    fn transform_expr(&self, expr: Expression) -> Result<Expression> { Ok(expr) }
    fn preprocess(&self, expr: Expression) -> Result<Expression> { Ok(expr) }
}
```
Each dialect implements this trait to provide:
1. **Tokenizer config**: Identifier quoting characters, string delimiters, keyword overrides, comment styles, hex number support
2. **Generator config**: 30+ flags controlling SQL output (identifier quote style, function casing, `LIMIT` vs `TOP` vs `FETCH FIRST`, etc.)
3. **Per-node transform**: Dialect-specific expression rewrites (e.g., PostgreSQL transforms `IFNULL` → `COALESCE`, SQLite transforms `TRY_CAST` → `CAST`)
4. **Whole-tree preprocess**: Structural rewrites that need full-tree context (e.g., eliminating `QUALIFY` for dialects that don't support it)
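The default-method pattern behind this trait can be sketched with a two-variant toy AST. Everything here (`ToyExpr`, `ToyDialectImpl`, `ToyGeneric`, `ToyPostgres`) is invented for the illustration and is much simpler than the real trait, which returns `Result` and carries tokenizer and generator configs:

```rust
/// Tiny stand-in for the Expression AST.
#[derive(Debug, Clone, PartialEq)]
pub enum ToyExpr {
    Column(String),
    Function { name: String, args: Vec<ToyExpr> },
}

/// Reduced dialect trait: one required method, one overridable hook.
pub trait ToyDialectImpl {
    fn dialect_name(&self) -> &'static str;

    /// Default per-node transform: leave the node unchanged.
    fn transform_expr(&self, expr: ToyExpr) -> ToyExpr {
        expr
    }
}

/// Relies entirely on the defaults.
pub struct ToyGeneric;

impl ToyDialectImpl for ToyGeneric {
    fn dialect_name(&self) -> &'static str {
        "generic"
    }
}

/// Overrides the hook to rewrite IFNULL(...) into COALESCE(...).
pub struct ToyPostgres;

impl ToyDialectImpl for ToyPostgres {
    fn dialect_name(&self) -> &'static str {
        "postgresql"
    }

    fn transform_expr(&self, expr: ToyExpr) -> ToyExpr {
        match expr {
            ToyExpr::Function { name, args } if name.eq_ignore_ascii_case("IFNULL") => {
                ToyExpr::Function { name: "COALESCE".to_string(), args }
            }
            other => other,
        }
    }
}
```

A new dialect only has to override the hooks where it differs from the defaults, which is why most dialect modules can stay small.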
### 3.4 Dialect Struct (High-Level API)
```rust
pub struct Dialect {
    dialect_type: DialectType,
    tokenizer: Tokenizer,
    generator_config: Arc<GeneratorConfig>,
    transformer: Box<dyn Fn(Expression) -> Result<Expression> + Send + Sync>,
    generator_config_for_expr: Option<Box<dyn Fn(&Expression) -> GeneratorConfig + Send + Sync>>,
    custom_preprocess: Option<Box<dyn Fn(Expression) -> Result<Expression> + Send + Sync>>,
}
```
The `Dialect` struct bundles all dialect-specific state and provides the primary API:
```rust
// Parse SQL
let ast = dialect.parse("SELECT 1")?;
// Generate SQL from AST
let sql = dialect.generate(&ast[0])?;
// Transpile between dialects
let results = dialect.transpile("SELECT IFNULL(a,b) FROM t", DialectType::PostgreSQL)?;
// Tokenize
let tokens = dialect.tokenize("SELECT 1")?;
```
### 3.5 CustomDialectBuilder
For runtime-extensible dialect support:
```rust
use polyglot_sql::dialects::{CustomDialectBuilder, Dialect, DialectType};
use polyglot_sql::generator::NormalizeFunctions;
// Register a custom dialect inheriting from PostgreSQL
CustomDialectBuilder::new("my_postgres")
    .based_on(DialectType::PostgreSQL)
    .generator_config_modifier(|gc| {
        gc.normalize_functions = NormalizeFunctions::Lower;
    })
    .register()?;
let d = Dialect::get_by_name("my_postgres").unwrap();
// Use like any built-in dialect
```
---
## 4. Dialect Implementation Details
### 4.1 PostgreSQL (`postgres.rs`, ~1,879 LOC)
**Tokenizer:**
- `$$` string literals (dollar-quoting)
- Double-quote identifier quoting
- Nested block comments
- `EXEC` treated as generic command
**Generator config highlights:**
- `identifier_quote: '"'` (double quotes)
- `single_string_interval: true` (`INTERVAL '1 day'`)
- `parameter_token: "$"` (`$1`, `$2` placeholders)
- `supports_select_into: true`
- `supports_window_exclude: true`
- `can_implement_array_any: true`
**Transform examples:**
- `IFNULL(a, b)` → `COALESCE(a, b)`
- `RAND()` → `RANDOM()`
- `DATEDIFF(day, a, b)` → `CAST(b - a AS INT)` (date subtraction)
- `JSON_EXTRACT(a, '$.x')` → `a #> '{x}'` (arrow syntax)
- `JSON_EXTRACT_SCALAR(a, '$.x')` → `a #>> '{x}'`
- `DATE_ADD` / `DATE_SUB` → `+` / `-` interval arithmetic
- Type mappings: `TINYINT` → `SMALLINT`, `FLOAT` → `REAL`, `DOUBLE` → `DOUBLE PRECISION`
- `ILIKE` preserved (native PostgreSQL)
- `RegexpLike` → `~` operator, `RegexpILike` → `~*` operator
### 4.2 SQLite (`sqlite.rs`, ~750 LOC)
**Tokenizer:**
- Supports `"`, `[`, `` ` `` as identifier quote characters
- No nested comments
- Hex number literals (`0xCC`)
**Generator config:**
- `identifier_quote: '"'` (double quotes)
- `supports_table_alias_columns: false`
- `json_key_value_pair_sep: ","` (comma-style `JSON_OBJECT`)
**Transform examples:**
- `NVL(a, b)` → `IFNULL(a, b)`
- `TRY_CAST(x AS t)` → `CAST(x AS t)` (no try-cast)
- `RANDOM()` → function
- `ILIKE` → `LOWER(left) LIKE LOWER(right)` (no native ILIKE)
- `CountIf(cond)` → `SUM(IIF(cond, 1, 0))`
- `CEIL(x)` → function form
- `DATE_TRUNC(unit, col)` → various strftime patterns
- `DATE_DIFF` → `juliandiff` patterns
### 4.3 MySQL (`mysql.rs`)
**Tokenizer:** Backtick identifiers, `#` comments
**Generator:** Backtick quoting, `LIMIT` syntax, `CONCAT()` instead of `||`
**Transforms:** `COALESCE(a,b)` → `IFNULL(a,b)`, `||` → `CONCAT()` (string concat), etc.
### 4.4 BigQuery (`bigquery.rs`)
**Tokenizer:** Backtick identifiers, `QUALIFY` keyword
**Generator:** Backtick quoting, `STRUCT` types, `QUALIFY` clause, `DATE_DIFF` syntax
**Transforms:** Complex date/timestamp function mappings, `UNNEST` handling, `APPROX_COUNT_DISTINCT` → `APPROX_COUNT_DISTINCT`
### 4.5 How Transpilation Works
The full transpilation pipeline:
```
Input SQL (source dialect)
    ↓
Source Dialect Tokenizer
    ↓
Parser (dialect-aware)
    ↓
Expression AST
    ↓
Source Dialect::preprocess() ← whole-tree rewrites
    ↓
Source Dialect::transform_expr() ← per-node rewrites (recursive, bottom-up)
    ↓
Normalized AST
    ↓
Target Dialect Generator
    ↓
Output SQL (target dialect)
```
The transform pipeline uses an explicit task stack (not recursive calls) for the hot paths to avoid stack overflow. The `stacker` crate provides additional stack-growth protection.
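The explicit-task-stack idea can be sketched on a toy tree. This is an illustration of the general technique, not polyglot's actual implementation; `Node`, `Task`, and `rewrite_bottom_up` are hypothetical names:

```rust
/// Toy expression tree for the illustration.
#[derive(Debug, Clone, PartialEq)]
pub enum Node {
    Leaf(String),
    Call { name: String, args: Vec<Node> },
}

/// Work items: either descend into a node, or rebuild a call from finished children.
enum Task {
    Visit(Node),
    Build { name: String, argc: usize },
}

/// Apply `rewrite` to every node, children before parents, without recursion.
pub fn rewrite_bottom_up(root: Node, rewrite: &dyn Fn(Node) -> Node) -> Node {
    let mut tasks = vec![Task::Visit(root)];
    let mut done: Vec<Node> = Vec::new();
    while let Some(task) = tasks.pop() {
        match task {
            Task::Visit(Node::Leaf(text)) => done.push(rewrite(Node::Leaf(text))),
            Task::Visit(Node::Call { name, args }) => {
                // Schedule the rebuild first so it runs after all children.
                tasks.push(Task::Build { name, argc: args.len() });
                // Push children in reverse so the first child is processed first.
                for arg in args.into_iter().rev() {
                    tasks.push(Task::Visit(arg));
                }
            }
            Task::Build { name, argc } => {
                // The last `argc` finished nodes are this call's rewritten children.
                let args = done.split_off(done.len() - argc);
                done.push(rewrite(Node::Call { name, args }));
            }
        }
    }
    done.pop().expect("exactly one result remains")
}
```

Depth is bounded by heap memory instead of the call stack. Note that this toy `Node` is still dropped recursively, so a production version would also need an iterative drop or a stack-growth fallback.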
Key cross-dialect transforms include:
- Function renaming: `IFNULL` ↔ `COALESCE` ↔ `NVL`, `DATEDIFF` ↔ date arithmetic, `STRING_AGG` ↔ `GROUP_CONCAT`
- Type mapping: `TINYINT` ↔ `SMALLINT`, `FLOAT` ↔ `REAL`, `JSON` ↔ `JSONB`
- Syntax conversion: `LIMIT` ↔ `TOP` ↔ `FETCH FIRST`, `||` (concat) ↔ `CONCAT()`, `SELECT INTO` ↔ `CREATE TABLE AS`
- Boolean handling: `BOOL_AND`/`BOOL_OR` ↔ `MIN`/`MAX`-over-`CASE`
- JSON operators: `JSON_EXTRACT` ↔ `#>`/`#>>` ↔ `->`/`->>` (PostgreSQL arrow syntax)
---
## 5. Fluent Builder API
The builder module (`builder.rs`, ~3.3K LOC) provides a type-safe, ergonomic way to construct SQL expressions without string interpolation:
```rust
use polyglot_sql::builder::*;
// SELECT id, name FROM users WHERE age > 18 ORDER BY name LIMIT 10
let expr = select(["id", "name"])
    .from("users")
    .where_(col("age").gt(lit(18)))
    .order_by(["name"])
    .limit(10)
    .build();
// INSERT
let ins = insert_into("users")
    .columns(["id", "name"])
    .values([lit(1), lit("Alice")])
    .build();
// CASE expression
let expr = case()
    .when(col("x").gt(lit(0)), lit("positive"))
    .else_(lit("non-positive"))
    .build();
// Set operations
let expr = union_all(
    select(["id"]).from("a"),
    select(["id"]).from("b"),
).order_by(["id"]).limit(5).build();
```
Expression helpers:
- `col("users.id")` — column reference (splits on last `.`)
- `lit(42)`, `lit("hello")`, `lit(3.14)`, `lit(true)` — literals
- `func("COALESCE", [col("a"), col("b")])` — function calls
- Operator chain: `col("age").gte(lit(18)).and(col("status").eq(lit("active")))`
The builder generates an `Expression` AST that can then be serialized to any dialect via `generate()`.
---
## 6. Validation and Schema-Aware Analysis
### 6.1 Syntax Validation
```rust
use polyglot_sql::{validate, DialectType};
let result = validate("SELECT * FORM users", DialectType::Generic);
// result.valid == false
// result.errors contain line/column/message/error codes
```
Error codes:
- `E001` — Syntax error
- `E002` — Tokenization error
- `E003` — Parse error
- `E004` — Invalid expression (not a valid statement)
- `E005` — Trailing comma in strict mode
### 6.2 Schema-Aware Validation
```rust
use polyglot_sql::{
validate_with_schema, DialectType, SchemaColumn, SchemaTable,
SchemaValidationOptions, ValidationSchema,
};
let schema = ValidationSchema {
    strict: Some(true),
    tables: vec![
        SchemaTable {
            name: "users".into(),
            columns: vec![
                SchemaColumn { name: "id".into(), data_type: "integer".into(), nullable: Some(false), primary_key: true, unique: false, references: None },
                SchemaColumn { name: "email".into(), data_type: "varchar".into(), nullable: Some(false), primary_key: false, unique: true, references: None },
            ],
            // ...
        },
    ],
};
let opts = SchemaValidationOptions { check_types: true, check_references: true, strict: None, semantic: true };
let result = validate_with_schema("SELECT id FROM users WHERE email = 1", DialectType::Generic, &schema, &opts);
// result.valid == false (type mismatch: email is varchar, compared to integer)
```
Schema-aware error codes:
- `E200`/`E201` — Unknown table/column
- `E210`–`E217`, `W210`–`W216` — Type checks
- `E220`, `E221`, `W220`, `W221`, `W222` — Reference/FK checks
### 6.3 Function Catalogs
Optional feature-gated function catalogs (currently ClickHouse and DuckDB) provide known function signatures for semantic type checking:
```toml
polyglot-sql = { version = "0.4", features = ["function-catalog-clickhouse"] }
```
---
## 7. Column Lineage & OpenLineage
### 7.1 Column Lineage
Trace how columns flow through a query:
```rust
use polyglot_sql::{parse, DialectType};
use polyglot_sql::lineage::get_column_lineage;
let ast = parse("SELECT a + b AS total FROM t", DialectType::Generic).unwrap();
let lineage = get_column_lineage(&ast[0], /* schema */ None, DialectType::Generic);
// lineage tells you that "total" depends on columns "a" and "b" from table "t"
```
### 7.2 OpenLineage Payload Generation
```rust
use polyglot_sql::openlineage::{generate_run_event, OpenLineageOptions, OpenLineageDatasetId};
let opts = OpenLineageOptions {
dialect: DialectType::PostgreSQL,
producer: "my-app".into(),
dataset_namespace: Some("mydb".into()),
// ...
};
let event = generate_run_event("SELECT * FROM users", &opts)?;
// event is a JSON-serializable OpenLineage RunEvent with columnLineage facets
```
---
## 8. Error Handling
### 8.1 Error Types
```rust
pub enum Error {
Tokenize { message: String, line: usize, column: usize, start: usize, end: usize },
Parse { message: String, line: usize, column: usize, start: usize, end: usize },
Generate(String),
Unsupported { feature: String, dialect: String },
Syntax { message: String, line: usize, column: usize, start: usize, end: usize },
Internal(String),
}
```
All position-bearing errors include:
- `line` — 1-based line number
- `column` — 1-based column number
- `start` / `end` — byte offsets (0-based, end exclusive)
```rust
let err = Error::parse("Unexpected token", 3, 15, 42, 44);
assert_eq!(err.line(), Some(3));
assert_eq!(err.column(), Some(15));
assert_eq!(err.start(), Some(42));
```
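The position conventions above (1-based line and column, 0-based half-open byte range) can be related to each other with a small dependency-free sketch. The helper names `line_col` and `error_span` are hypothetical and not part of polyglot-sql; counting columns in characters rather than bytes is also an assumption here.

```rust
/// Convert a 0-based byte offset into a 1-based (line, column) pair.
/// Columns are counted in characters; whether the library counts
/// characters or bytes is an assumption of this sketch.
fn line_col(src: &str, offset: usize) -> (usize, usize) {
    let mut line = 1;
    let mut column = 1;
    for (idx, ch) in src.char_indices() {
        if idx >= offset {
            break;
        }
        if ch == '\n' {
            // A newline starts the next line and resets the column.
            line += 1;
            column = 1;
        } else {
            column += 1;
        }
    }
    (line, column)
}

/// Slice the offending text out of the source using the half-open
/// `start..end` byte range. Returns `None` if the range is out of bounds
/// or does not fall on character boundaries.
fn error_span(src: &str, start: usize, end: usize) -> Option<&str> {
    src.get(start..end)
}
```

With the source `"SELECT *\nFORM users"`, byte offset 9 maps to line 2, column 1, and the range `9..13` recovers the misspelled keyword `FORM`.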
### 8.2 Validation Errors
```rust
pub struct ValidationError {
pub message: String,
pub line: Option<usize>,
pub column: Option<usize>,
pub severity: ValidationSeverity, // Error or Warning
pub code: String, // e.g., "E001", "E200"
pub start: Option<usize>,
pub end: Option<usize>,
}
pub struct ValidationResult {
pub valid: bool,
pub errors: Vec<ValidationError>,
}
```
### 8.3 Guard Rail Errors
Format operations have configurable guard limits that return structured errors:
- `E_GUARD_INPUT_TOO_LARGE` — input exceeds `max_input_bytes`
- `E_GUARD_TOKEN_BUDGET_EXCEEDED` — token count exceeds `max_tokens`
- `E_GUARD_AST_BUDGET_EXCEEDED` — AST node count exceeds `max_ast_nodes`
- `E_GUARD_SET_OP_CHAIN_EXCEEDED` — UNION/INTERSECT/EXCEPT chain exceeds `max_set_op_chain`
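To illustrate how such limits behave, here is a minimal sketch of a pre-flight check. The `GuardLimits` struct and `check_guards` function are hypothetical; only the limit names and error codes come from the list above. Tokens are approximated by whitespace splitting, which is not how a real SQL tokenizer counts them.

```rust
/// Hypothetical guard configuration; field names mirror two of the limits above.
struct GuardLimits {
    max_input_bytes: usize,
    max_tokens: usize,
}

/// Returns the guard error code that fires first, or `Ok(())` when the
/// input is within budget. The byte check runs before the token check so
/// that oversized inputs are rejected without being scanned.
fn check_guards(sql: &str, limits: &GuardLimits) -> Result<(), &'static str> {
    if sql.len() > limits.max_input_bytes {
        return Err("E_GUARD_INPUT_TOO_LARGE");
    }
    // Crude stand-in for a token count: whitespace-separated words.
    if sql.split_whitespace().count() > limits.max_tokens {
        return Err("E_GUARD_TOKEN_BUDGET_EXCEEDED");
    }
    Ok(())
}
```

Checking the cheapest limit first is a design choice of this sketch; the order in which the library evaluates its guards is not stated in this document.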
---
## 9. AST Traversal & Analysis
### 9.1 Traversal
```rust
use polyglot_sql::{parse, DialectType};
use polyglot_sql::traversal::*;
let ast = parse("SELECT a, b FROM t WHERE x > 1", DialectType::Generic).unwrap();
let columns = get_columns(&ast[0]); // ["a", "b", "x"]
let tables = get_tables(&ast[0]); // ["t"]
```
Available predicates (70+):
- `is_select`, `is_insert`, `is_update`, `is_delete`, `is_ddl`
- `is_join`, `is_where`, `is_group_by`, `is_order_by`, `is_limit`
- `is_function`, `is_aggregate`, `is_subquery`, `is_cte`
- `is_comparison`, `is_logical`, `is_arithmetic`
- `contains_subquery`, `contains_aggregate`, `contains_window_function`
Iterators: `DfsIter`, `BfsIter` for depth-first and breadth-first traversal.
### 9.2 AST Transforms
```rust
use polyglot_sql::ast_transforms::*;
// Rename tables
let renamed = rename_tables(expr, &[("old_name", "new_name")]);
// Add WHERE condition
let filtered = add_where(expr, col("active").eq(lit(true)));
// Remove LIMIT/OFFSET
let unlimited = remove_limit_offset(expr);
```
### 9.3 AST Diff
```rust
use polyglot_sql::diff::{diff, diff_with_config, DiffConfig};
let edits = diff(&source_expr, &target_expr, true);
for edit in &edits {
if edit.is_change() {
println!("{:?}", edit);
}
}
```
Uses the ChangeDistiller algorithm with Dice coefficient matching for structural comparison.
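For readers unfamiliar with the Dice coefficient, the following sketch computes it over character bigrams: `2 * |X ∩ Y| / (|X| + |Y|)`. This illustrates the similarity measure only; which node attributes polyglot-sql feeds into it, and what threshold it applies, are not covered here, and the function names are hypothetical.

```rust
use std::collections::HashMap;

/// Collect the character bigrams of `s` together with their multiplicities.
fn bigrams(s: &str) -> HashMap<(char, char), usize> {
    let chars: Vec<char> = s.chars().collect();
    let mut counts = HashMap::new();
    for pair in chars.windows(2) {
        *counts.entry((pair[0], pair[1])).or_insert(0) += 1;
    }
    counts
}

/// Dice coefficient over bigram multisets, in the range 0.0..=1.0.
fn dice(a: &str, b: &str) -> f64 {
    let (x, y) = (bigrams(a), bigrams(b));
    let total: usize = x.values().sum::<usize>() + y.values().sum::<usize>();
    if total == 0 {
        // Neither string has a bigram; fall back to exact equality
        // (a choice made for this sketch).
        return if a == b { 1.0 } else { 0.0 };
    }
    // Size of the multiset intersection: the smaller count for each shared bigram.
    let shared: usize = x
        .iter()
        .map(|(gram, &n)| n.min(*y.get(gram).unwrap_or(&0)))
        .sum();
    2.0 * shared as f64 / total as f64
}
```

For instance, `night` and `nacht` each have four bigrams and share only `ht`, giving `2 * 1 / 8 = 0.25`.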
### 9.4 Logical Planner
```rust
use polyglot_sql::planner::Plan;
let plan = Plan::from_expression(&expr);
// plan.root is a Step DAG
// plan.leaves() returns leaf steps
// plan.dag() returns the dependency graph
```
Step kinds: Scan, Filter, Project, Aggregate, Join, Sort, Limit, etc.
---
## 10. Optimizer Modules
The optimizer is available behind the `semantic` feature flag:
| Module | Purpose |
|---|---|
| `qualify_columns.rs` | Resolve unqualified column references to table.column |
| `qualify_tables.rs` | Expand table names with schema/catalog |
| `annotate_types.rs` | Infer and annotate expression types |
| `pushdown_predicates.rs` | Push WHERE conditions into JOINs |
| `pushdown_projections.rs` | Reduce columns to only what's needed |
| `eliminate_joins.rs` | Remove unnecessary JOINs |
| `eliminate_ctes.rs` | Inline single-use CTEs |
| `simplify.rs` | Simplify boolean expressions, constant folding |
| `normalize.rs` | Expression normalization |
| `canonicalize.rs` | Query canonicalization |
| `subquery.rs` | Subquery analysis |
---
## 11. Async Support
**Polyglot does not use async I/O** — it is a pure computational library. All operations are synchronous and CPU-bound:
- `parse()` — synchronous
- `generate()` — synchronous
- `transpile()` — synchronous
- `validate()` — synchronous
- `format()` — synchronous
This is by design: Polyglot operates on SQL strings in memory, with no network or filesystem I/O. For use in async contexts (Tokio, async-std), callers should use `tokio::task::spawn_blocking()` or similar to offload CPU-heavy parsing/transpilation to a blocking thread pool.
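The offloading idea can be sketched without any async runtime by using a plain OS thread. `cpu_bound_work` is a hypothetical stand-in for a call such as `transpile()`; in a Tokio application, `tokio::task::spawn_blocking` plays the role of `thread::spawn` and the blocking `join()` becomes an `.await`.

```rust
use std::thread;

/// Stand-in for a CPU-bound call such as `transpile()`; it only uppercases
/// the input so the sketch stays dependency-free.
fn cpu_bound_work(sql: String) -> String {
    sql.to_uppercase()
}

/// Run the synchronous work on a separate thread and wait for the result.
/// A panic in the worker is surfaced as an `Err` instead of propagating.
fn offload(sql: String) -> Result<String, String> {
    thread::spawn(move || cpu_bound_work(sql))
        .join()
        .map_err(|_| "worker thread panicked".to_string())
}
```

The input is moved into the worker as an owned `String`, which mirrors the `'static` requirement that blocking-task APIs typically place on their closures.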
---
## 12. Feature Flags
| Flag | Description | Default |
|---|---|---|
| `all-dialects` | Enable all 32 dialect parsers | ✅ |
| `generate` | SQL generation from AST | ✅ |
| `transpile` | Cross-dialect transpilation (implies `generate`) | ✅ |
| `builder` | Fluent query builder API (implies `generate`) | ✅ |
| `ast-tools` | AST inspection & transform utilities | ✅ |
| `semantic` | Schema, resolver, lineage, optimizer, validation | ✅ |
| `openlineage` | OpenLineage payload generation (implies `semantic`) | ✅ |
| `diff` | AST diff support (implies `generate`) | ✅ |
| `planner` | Logical planning helpers | ✅ |
| `time` | Time-format conversion helpers | ✅ |
| `stacker` | Stack-growth protection for native builds | ✅ |
| `bindings` | TypeScript type generation via `ts-rs` | ❌ |
| `dialect-postgresql` | PostgreSQL dialect only | — |
| `dialect-mysql` | MySQL dialect only | — |
| ... (one per dialect) | Individual dialect selector | — |
| `function-catalog-clickhouse` | ClickHouse function catalog | ❌ |
| `function-catalog-duckdb` | DuckDB function catalog | ❌ |
| `function-catalog-all-dialects` | All function catalogs | ❌ |
Minimal WASM build (for constrained targets):
```toml
polyglot-sql = { version = "0.4", default-features = false, features = ["generate", "transpile", "dialect-postgresql", "dialect-mysql"] }
```
---
## References
- Source code examined: `/workspace/polyglot/crates/polyglot-sql/src/` (~241K LOC)
- Architecture documentation: `/workspace/polyglot/docs/sqlglot-architecture.md`
- Benchmark results: `/workspace/polyglot/docs/benchmark.md`
- README: `/workspace/polyglot/README.md`, `/workspace/polyglot/crates/polyglot-sql/README.md`
- CHANGELOG: `/workspace/polyglot/CHANGELOG.md`

@@ -1,294 +0,0 @@
# Polyglot: Suitability Analysis & Comparisons
---
## 1. What Polyglot Is NOT
Before evaluating suitability, it's essential to understand what Polyglot **does not** do:
| NOT a... | Because |
|---|---|
| **Database driver** | No connection management, no query execution, no result set handling |
| **ORM** | No object-relational mapping, no model definitions, no active record pattern |
| **Migration tool** | No `CREATE TABLE` evolution management, no up/down migrations framework |
| **Type mapper** | No Rust type → SQL type mapping, no `FromRow` derives |
| **Connection pool** | No async I/O, no TCP connections, no TLS |
| **Query executor** | Never connects to a database; operates purely on SQL text |
**Polyglot is a SQL dialect transpiler.** It converts SQL strings between database dialects. Period.
---
## 2. Suitability Assessment for Multi-Database Storage Layer
### 2.1 What Polyglot CAN Do for a Multi-DB Project
| Use Case | Polyglot Support | Maturity |
|---|---|---|
| **SQL dialect translation** | ✅ Core purpose; 32 dialects with 100% test pass rate | Mature |
| **SQL pretty-printing** | ✅ Built-in format with guard rails | Mature |
| **SQL syntax validation** | ✅ Line/column error positions, error codes | Mature |
| **Schema-aware validation** | ✅ Table/column/type checking with `ValidationSchema` | Moderate |
| **Column lineage tracing** | ✅ `get_column_lineage()` for data lineage | Moderate |
| **OpenLineage payloads** | ✅ `RunEvent` and `DatasetFacet` generation | Early but functional |
| **Query builder** | ✅ Fluent API for SELECT/INSERT/UPDATE/DELETE | Usable but not as rich as query-builder-first libraries |
| **AST diff** | ✅ ChangeDistiller-based structural diff | Functional |
| **Logical planning** | ✅ Basic DAG plan extraction | Early stage |
| **Query optimization** | ✅ Column qualification, predicate pushdown, join elimination | Moderate |
| **Custom dialect registration** | ✅ `CustomDialectBuilder` for runtime extension | Functional |
### 2.2 What Polyglot CANNOT Do for a Multi-DB Project
| Need | Polyglot Support | Alternative |
|---|---|---|
| **Execute queries** | ❌ No | Use sqlx, diesel, or sea-orm |
| **Connection pooling** | ❌ No | Use deadpool, bb8, or sqlx built-in |
| **Async I/O** | ❌ Synchronous only | Wrap in `spawn_blocking()` |
| **Type-safe query building** | ⚠️ Partial (builder API returns strings) | Use diesel or sea-orm for compile-time checks |
| **Schema migration management** | ❌ No | Use diesel migrations, sqlx migrations, or refinery |
| **Row mapping / deserialization** | ❌ No | Use sqlx `FromRow`, diesel `Queryable` |
| **Runtime type mapping** | ⚠️ Limited (DataType enum, no Rust type bridge) | Build your own layer |
| **Database-specific DDL generation** | ⚠️ Parses/generates DDL but no migration framework | Use as a building block |
| **Transaction management** | ❌ No | Use sqlx or diesel |
### 2.3 Integration Pattern: Polyglot as a SQL Dialect Layer
The most natural integration pattern for a multi-database storage layer:
```
┌──────────────────────────────────────────────┐
│ Application Logic │
├──────────────────────────────────────────────┤
│ Query Builder / ORM Layer │
│ (diesel / sea-orm / custom) │
├──────────────────────┬───────────────────────┤
│ │ │
│ Polyglot Layer │ Direct SQL │
│ (transpile, │ (no translation │
│ validate, │ needed) │
│ format) │ │
├──────────────────────┴───────────────────────┤
│ Database Driver Layer │
│ (sqlx / diesel) │
├──────────────────────────────────────────────┤
│ PostgreSQL │ MySQL │ SQLite │
└──────────────────────────────────────────────┘
```
In this pattern, Polyglot sits **above** the database drivers, translating SQL from a canonical dialect to the target database's dialect before execution. It does **not** replace the drivers.
---
## 3. Comparison with Other Rust SQL Libraries
### 3.1 Feature Comparison Matrix
| Feature | **Polyglot** | **Diesel** | **SQLx** | **SeaORM** | **sqlparser-rs** |
|---|---|---|---|---|---|
| **Primary Purpose** | SQL transpilation | ORM / query builder | Async DB driver | Async ORM | SQL parsing |
| **SQL Parsing** | ✅ Full AST (200+ node types) | ✅ DSL-based | ❌ No | ❌ No | ✅ Full AST |
| **SQL Generation** | ✅ Multi-dialect | ✅ Via DSL | ❌ No | ❌ No | ⚠️ Limited |
| **Cross-dialect Transpilation** | ✅ 32 dialects | ❌ No | ❌ No | ❌ No | ❌ No |
| **Query Builder** | ⚠️ Fluent, string-based | ✅ Type-safe DSL | ❌ No | ✅ Type-safe | ❌ No |
| **Async I/O** | ❌ No (sync only) | ❌ Sync (async via the separate `diesel-async` crate) | ✅ Native async | ✅ Native async | ❌ No |
| **Type-safe Queries** | ❌ No (runtime) | ✅ Compile-time | ⚠️ Compile-time checked `query!()` macros | ✅ Compile-time | ❌ No |
| **Connection Pool** | ❌ No | ❌ No (Diesel 2.x via r2d2) | ✅ Built-in | ✅ Built-in | ❌ No |
| **Migration Support** | ❌ No | ✅ Built-in | ✅ Built-in (`sqlx::migrate!`, `sqlx-cli`) | ✅ Built-in | ❌ No |
| **Database Execution** | ❌ No | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No |
| **Schema Validation** | ✅ Via ValidationSchema | ✅ Compile-time | ❌ No | ⚠️ Limited | ❌ No |
| **Column Lineage** | ✅ Built-in | ❌ No | ❌ No | ❌ No | ❌ No |
| **AST Diff** | ✅ Built-in | ❌ No | ❌ No | ❌ No | ❌ No |
| **Dialects Supported** | 32 | 3 (PostgreSQL, MySQL, SQLite) | N/A | N/A | Several (ANSI, PostgreSQL, MySQL, and others; parsing only) |
| **License** | MIT | MIT/Apache-2.0 | MIT/Apache-2.0 | MIT/Apache-2.0 | Apache-2.0 |
| **Maturity** | v0.4.4 (pre-1.0) | v2.2 (stable) | v0.8 (stable) | v1.1 (stable) | v0.49 (mature) |
### 3.2 Polyglot vs Diesel
| Aspect | Polyglot | Diesel |
|---|---|---|
| **Philosophy** | Parse any SQL → AST → generate any dialect | Type-safe DSL → SQL for specific databases |
| **Type Safety** | Runtime (string-based) | Compile-time (macro-based) |
| **Query Building** | `select(["col"]).from("t").where_(...)` → `Expression` AST | `schema::table::dsl::col.filter(...)` → SQL |
| **Dialect Breadth** | 32 dialects | 3 (PostgreSQL, MySQL, SQLite) |
| **Database Execution** | None (SQL text only) | Full CRUD with connection management |
| **Migrations** | None | Built-in migration framework |
| **When to use** | You need cross-dialect SQL translation, validation, lineage | You need type-safe queries with database execution |
**Verdict**: Polyglot and Diesel are **complementary**, not competing. Use Diesel for type-safe database interaction; use Polyglot when you need to translate SQL between dialects or analyze SQL without executing it.
### 3.3 Polyglot vs SQLx
| Aspect | Polyglot | SQLx |
|---|---|---|
| **Philosophy** | SQL manipulation without execution | Async database driver with compile-time query checking |
| **Async** | Synchronous only | Fully async |
| **Query Checking** | Runtime validation against schema | Compile-time `query!()` macro |
| **Database Support** | 32 dialects (parsing) | PostgreSQL, MySQL, SQLite (execution) |
| **When to use** | SQL transformation/analysis | Database interaction with async Rust |
**Verdict**: SQLx is for executing queries against databases. Polyglot is for transforming SQL text. They solve entirely different problems.
### 3.4 Polyglot vs SeaORM
| Aspect | Polyglot | SeaORM |
|---|---|---|
| **Philosophy** | SQL transpilation | Async ORM built on SQLx |
| **Async** | No | Yes |
| **Model Definition** | None | Entity models via macros |
| **Relationships** | None | Has-one, has-many, many-to-many |
| **When to use** | SQL dialect conversion | Database CRUD with relationships |
**Verdict**: Same as SQLx — complementary, not competing.
### 3.5 Polyglot vs sqlparser-rs
| Aspect | Polyglot | sqlparser-rs |
|---|---|---|
| **Parsing** | ✅ Full (200+ node types) | ✅ Full (ANSI SQL + some dialects) |
| **Generation** | ✅ Multi-dialect generation | ⚠️ Limited round-trip |
| **Transpilation** | ✅ Cross-dialect transforms | ❌ No |
| **Dialects** | 32 | Generic/ANSI plus several vendor dialects (parsing only) |
| **Validation** | ✅ With error positions | ❌ Parse errors only |
| **Builder** | ✅ Fluent API | ❌ No |
| **Lineage** | ✅ Built-in | ❌ No |
| **Diff** | ✅ Built-in | ❌ No |
| **Maturity** | v0.4.4 | v0.49 (more established) |
**Verdict**: sqlparser-rs is a mature SQL parser with parse-only support for several dialects. Polyglot offers significantly more: transpilation, 32 dialects, validation, lineage, diff, and a builder API. If you need dialect translation, Polyglot is the clear choice. If you only need parsing and don't need generation/transpilation, sqlparser-rs may suffice with less overhead.
### 3.6 Polyglot vs Python sqlglot
| Aspect | Polyglot (Rust) | sqlglot (Python) |
|---|---|---|
| **Performance** | 819× faster (transpile), ~86× faster (generate) | Baseline |
| **Language** | Rust | Python |
| **Feature Parity** | ~95% of sqlglot's transpilation | Full feature set |
| **Optimizer** | Column qualification, predicate pushdown (moderate) | Full optimizer (column pruning, join elimination, etc.) |
| **Execution** | ❌ No | ⚠️ Limited (can execute against some engines) |
| **Test Compatibility** | 10,220+ sqlglot fixture cases at 100% | Original test suite |
| **Deployment** | Native binary / WASM / Python / Go | Python package |
**Verdict**: Polyglot is the performance-oriented port of sqlglot. It covers the core transpilation use case at near-full feature parity. The Python sqlglot has a more mature optimizer and some execution capabilities, but Polyglot is catching up rapidly (0.4.x adds lineage, OpenLineage, schema validation, and more).
---
## 4. Limitations and Gotchas
### 4.1 Current Limitations
| Limitation | Impact | Mitigation |
|---|---|---|
| **Pre-1.0 API** | Breaking changes possible between minor versions | Pin exact version in Cargo.toml |
| **No query execution** | Cannot run SQL against databases | Use alongside sqlx/diesel |
| **No async** | Blocking in async contexts | Wrap in `spawn_blocking()` |
| **No migration framework** | Cannot manage schema evolution | Use diesel migrations or refinery |
| **No Rust type mapping** | `DataType` enum doesn't map to Rust types | Build your own type bridge |
| **Builder returns Expression** | Builder doesn't produce type-safe queries | Accept runtime nature; pair with runtime validation |
| **Optimizer is early** | Limited optimization passes vs Python sqlglot | Most useful passes exist (qualify_columns, pushdown_predicates) |
| **WASM lacks `stacker`** | Deeply nested SQL may overflow stack in browser | Set format guard limits; consider web workers |
| **Custom dialects are global** | `CustomDialectBuilder` uses a global `RwLock` registry | Fine for most apps; not ideal for per-request isolation |
| **No prepared statement support** | Cannot generate `?` placeholders for parameterized queries | Build queries as strings; use sqlx for parameterization |
### 4.2 Gotchas
1. **`Dialect::get()` creates a new instance each call**: The `Dialect` struct bundles tokenizer + generator config + transformer. For hot loops, cache the `Dialect` instance rather than calling `Dialect::get()` repeatedly. (The overhead is minimal but non-zero.)
2. **Transpilation is not always invertible**: Some dialects have features that don't exist in others (e.g., BigQuery's `QUALIFY`, PostgreSQL's `ILIKE`, TSQL's `TOP`). Transpiling `A → B` and then `B → A` may lose information.
3. **Function transformation depth**: The transform pipeline processes per-node bottom-up. Some transformations require multi-pass processing (handled by `preprocess()`), but edge cases may require manual intervention.
4. **AST is not a stable serialization format**: The `Expression` enum and its inner structs may change between versions. If you serialize ASTs to JSON, expect breaking changes across minor versions.
5. **Feature flags are cumulative**: `transpile` implies `generate`, `openlineage` implies `semantic`, etc. For minimal builds, use `default-features = false` and select only what you need.
6. **Global custom dialect registry**: Custom dialects registered via `CustomDialectBuilder::register()` are stored in a global `RwLock<HashMap>`. This means they persist for the lifetime of the process and are visible across threads. Call `unregister_custom_dialect()` to remove them.
7. **Parser is permissive**: The parser accepts many SQL constructs that some databases reject. Validation (via `validate()` or `validate_with_schema()`) can catch some issues, but it's not a substitute for database-level error checking.
8. **No `?` placeholder generation**: Polyglot doesn't generate parameterized query placeholders. For prepared statements, you'll need to handle parameter binding yourself with your database driver.
9. **Schema validation requires manual schema definition**: The `ValidationSchema` struct must be populated manually — there's no automatic schema introspection from a live database.
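The process-wide behavior described in gotcha 6 can be illustrated with a minimal sketch of a global `RwLock<HashMap>` registry. All names here (`DialectConfig`, `register`, `lookup`, `unregister`) are hypothetical stand-ins, not the polyglot-sql API; the point is only that entries stay visible to every thread until they are explicitly removed.

```rust
use std::collections::HashMap;
use std::sync::{OnceLock, RwLock};

/// Hypothetical stand-in for a registered dialect; it carries only a quote style.
#[derive(Clone, Debug, PartialEq)]
struct DialectConfig {
    identifier_quote: char,
}

/// Process-wide registry, created lazily on first access.
fn registry() -> &'static RwLock<HashMap<String, DialectConfig>> {
    static REGISTRY: OnceLock<RwLock<HashMap<String, DialectConfig>>> = OnceLock::new();
    REGISTRY.get_or_init(|| RwLock::new(HashMap::new()))
}

/// Insert or overwrite an entry; it stays until `unregister` is called.
fn register(name: &str, config: DialectConfig) {
    registry().write().unwrap().insert(name.to_string(), config);
}

/// Read an entry from any thread.
fn lookup(name: &str) -> Option<DialectConfig> {
    registry().read().unwrap().get(name).cloned()
}

/// Remove an entry; returns whether it existed.
fn unregister(name: &str) -> bool {
    registry().write().unwrap().remove(name).is_some()
}
```

Because the map is shared by the whole process, two requests that register different configurations under the same name would overwrite each other, which is why a global registry is a poor fit for per-request isolation.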
---
## 5. Production-Readiness Assessment
### 5.1 Strengths
| Area | Rating | Notes |
|---|---|---|
| **Transpilation accuracy** | ⭐⭐⭐⭐⭐ | 10,220+ fixture cases at 100% pass rate |
| **Performance** | ⭐⭐⭐⭐⭐ | 819× faster than Python sqlglot |
| **Dialect coverage** | ⭐⭐⭐⭐⭐ | 32 dialects covering all major databases |
| **API ergonomics** | ⭐⭐⭐⭐ | Clean public API; builder is pleasant |
| **Error reporting** | ⭐⭐⭐⭐ | Line/column/byte-offset positions |
| **WASM support** | ⭐⭐⭐⭐ | Full feature set in browser |
| **Multi-language bindings** | ⭐⭐⭐⭐⭐ | Rust, TypeScript, Python, Go, C FFI |
| **Documentation** | ⭐⭐⭐ | Rust API docs exist; could use more guides |
| **Test coverage** | ⭐⭐⭐⭐⭐ | 18,745 test cases |
| **Fuzzing** | ⭐⭐⭐⭐ | Supported via `cargo fuzz` |
### 5.2 Risks
| Risk | Severity | Mitigation |
|---|---|---|
| **Pre-1.0 breaking changes** | Medium | Pin version; monitor CHANGELOG |
| **Single maintainer** | Medium | Code is well-structured; community could fork |
| **Limited optimizer** | Low | Core passes exist; Python sqlglot is reference |
| **No query execution** | Low (by design) | Combine with sqlx/diesel |
| **WASM stack limits** | Low | Set guard rails; use web workers |
### 5.3 Overall Assessment
**Polyglot is production-viable for SQL transpilation and analysis tasks**, with caveats:
- ✅ **Use for**: SQL dialect translation, SQL linting/validation, column lineage, pretty-printing, AST analysis, cross-database query migration
- ⚠️ **Use with caution for**: Query building (no type safety), optimization (partial coverage)
- ❌ **Don't use for**: Database execution, connection management, migrations, type-safe queries
For a multi-database storage layer, the recommended pattern is:
```
Application → Polyglot (transpile SQL to target dialect) → sqlx/diesel (execute)
```
---
## 6. Recommendation
### When to Adopt Polyglot
1. **You need to support multiple database backends with different SQL dialects** and want to write queries once in a canonical dialect, then transpile to the target at runtime.
2. **You need SQL validation or analysis** (lineage, schema checking) without executing queries.
3. **You need SQL pretty-printing or formatting** with configurable guard rails.
4. **You need column lineage tracking** for data governance or OpenLineage integration.
5. **You need to parse and analyze SQL** in a Rust/WASM/Python/Go context without connecting to a database.
### When NOT to Adopt Polyglot
1. **You need type-safe query building** — use Diesel or SeaORM instead.
2. **You need async database execution** — use SQLx or SeaORM instead.
3. **You need schema migrations** — use Diesel migrations, sqlx migrations, or Refinery instead.
4. **You only need PostgreSQL** (or a single dialect) — a simpler parser may suffice.
5. **You need Rust type → SQL type mapping** — Polyglot doesn't provide this.
### Suggested Adoption Strategy
For a multi-database storage layer:
1. **Use Polyglot for SQL transpilation**: Write queries in a canonical dialect (e.g., PostgreSQL-compatible), transpile to the target dialect at runtime.
2. **Use SQLx for database execution**: Handle connections, pooling, and async I/O.
3. **Use Polyglot for validation**: Validate user-provided SQL before execution.
4. **Use Polyglot for lineage**: Trace column flow for data governance.
5. **Build a thin integration layer** that combines Polyglot's transpilation with SQLx's execution.
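One possible shape for such a thin integration layer is sketched below. Everything in it is hypothetical: the `Backend` enum, the `StorageLayer` struct, and the three closures, which stand in for Polyglot validation, Polyglot transpilation, and a driver call respectively. A real layer would call the actual libraries, and its execution step would be async.

```rust
/// Hypothetical target backends for the storage layer.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Backend {
    Postgres,
    Sqlite,
}

/// Thin layer: validate, transpile, then hand the SQL to an executor.
struct StorageLayer<V, T, E> {
    validate: V,
    transpile: T,
    execute: E,
}

impl<V, T, E> StorageLayer<V, T, E>
where
    V: Fn(&str) -> Result<(), String>,
    T: Fn(&str, Backend) -> Result<String, String>,
    E: Fn(&str, Backend) -> Result<u64, String>,
{
    /// Runs canonical SQL against `backend` and returns whatever the
    /// executor reports (for example, an affected-row count).
    /// The first failing stage short-circuits the pipeline.
    fn run(&self, canonical_sql: &str, backend: Backend) -> Result<u64, String> {
        (self.validate)(canonical_sql)?;
        let target_sql = (self.transpile)(canonical_sql, backend)?;
        (self.execute)(&target_sql, backend)
    }
}
```

Keeping the three stages behind plain function values makes each one replaceable in tests, and keeps the transpilation step independent of whichever driver ends up executing the query.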
---
## References
- <https://github.com/tobilg/polyglot> — Main repository
- <https://crates.io/crates/polyglot-sql> — Rust crate (v0.4.4)
- <https://docs.rs/polyglot-sql/latest/polyglot_sql/> — Rust API docs
- <https://github.com/tobymao/sqlglot> — Python inspiration
- <https://lib.rs/crates/polyglot-sql> — Package metadata
- Local source: `/workspace/polyglot/`

@@ -1,765 +0,0 @@
# RustFS Event Notification System & S3 Select Reference
> **Companion document**: This extends [rustfs-reference.md](./rustfs-reference.md) which covers auth, architecture, and credential mapping. This document focuses on the **event notification system** and **S3 Select** feature.
**Date**: 2026-06-08
**RustFS version**: Based on source at `/workspace/rustfs/` (commit-level snapshot)
**Purpose**: Evaluate rustfs event notification and S3 Select for alknet integration
---
## Table of Contents
1. [Event Notification System](#1-event-notification-system)
2. [Event Types & Structure](#2-event-types--structure)
3. [Notification Targets](#3-notification-targets)
4. [Configuration & Rule Engine](#4-configuration--rule-engine)
5. [Pipeline & Delivery](#5-pipeline--delivery)
6. [Live Event Stream](#6-live-event-stream)
7. [S3 Select](#7-s3-select)
8. [Mapping to alknet](#8-mapping-to-alknet)
9. [References](#9-references)
---
## 1. Event Notification System
### 1.1 Architecture Overview
RustFS implements a full S3-compatible bucket notification system. The architecture follows a layered pattern:
```
┌──────────────────────────────────────────────────────────┐
│ S3 API Layer │
│ (PutObject, DeleteObject, CopyObject, etc.) │
└─────────────┬────────────────────────────────────────────┘
│ emits EventArgs
┌──────────────────────────────────────────────────────────┐
│ ECStore (event_notification.rs) │
│ - send_event() hook (global OnceLock dispatch) │
│ - registers dispatch callback during init │
└─────────────┬────────────────────────────────────────────┘
│ converts EventArgs → NotifyEventArgs
┌──────────────────────────────────────────────────────────┐
│ rustfs_notify (NotificationSystem) │
│ ┌──────────────┐ ┌──────────────┐ ┌───────────────┐ │
│ │ NotifyPipeline│──▶│ NotifyRuleEngine│─▶│ EventNotifier │ │
│ │ (broadcast │ │ (match rules) │ │ (send to │ │
│ │ + history) │ │ │ │ targets) │ │
│ └──────────────┘ └──────────────┘ └──────┬────────┘ │
│ │ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────▼────────┐ │
│ │BucketConfigM │ │ NotifyConfigM │ │ TargetList │ │
│ │ anager │ │ anager │ │ (Webhook, │ │
│ └──────────────┘ └──────────────┘ │ Kafka, AMQP, │ │
│ │ NATS, Redis, │ │
│ │ MQTT, MySQL, │ │
│ │ Postgres, │ │
│ │ Pulsar) │ │
│ └───────────────┘ │
└──────────────────────────────────────────────────────────┘
```
### 1.2 Key Crates
| Crate | Purpose |
|-------|---------|
| `rustfs_notify` | Core notification orchestration: `Event`, `EventArgs`, `EventNotifier`, `NotifyPipeline`, `NotificationSystem`, rule engine, bucket config management |
| `rustfs_targets` | Target implementations (Webhook, Kafka, AMQP, NATS, Redis, MQTT, MySQL, PostgreSQL, Pulsar) + `Target` trait, `QueueStore`, TLS hot-reload |
| `rustfs_s3_types` | `EventName` enum with all S3 event type definitions, serialization, mask/bitfield support |
| `rustfs_ecstore` | Storage layer; `event_notification.rs` provides the dispatch hook that bridges ecstore events to the notify system |
| `rustfs_config` | Configuration for each target type (Env vars, KVS parsing, subsystem names) |
### 1.3 Initialization Flow
1. `rustfs/server/event.rs::init_event_notifier()` runs at startup
2. If notify module is enabled (`RUSTFS_NOTIFY_ENABLE=true`), it calls `rustfs_notify::initialize(config)` which:
- Creates a `NotificationSystem` with `EventNotifier`, `TargetRegistry`, and config
- Loads all target configurations from the config store
- Initializes each target (connects, health-checks, starts stream replay workers)
3. An ECStore dispatch hook is installed via `register_event_dispatch_hook()` which:
- Converts `ecstore::EventArgs` → `notify::EventArgs`
- Parses `EventName` from string
- Spawns an async task to call `notifier_global::notify(args)`
### 1.4 Module Toggle
The notification system respects a module enable/disable flag:
- Environment variable: `RUSTFS_NOTIFY_ENABLE` (default: `DEFAULT_NOTIFY_ENABLE`)
- When disabled, only the **live event stream** is initialized (no targets are loaded)
- This allows in-process event subscription without external delivery
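The toggle logic above can be sketched as two pure functions. The function names are hypothetical, the `default` parameter stands in for `DEFAULT_NOTIFY_ENABLE` (whose value is not given here), and the set of accepted spellings other than `true` is an assumption rather than RustFS's actual parsing rules.

```rust
/// Interpret the raw value of `RUSTFS_NOTIFY_ENABLE`.
/// Unset or unrecognized values fall back to `default`.
fn notify_enabled(raw: Option<&str>, default: bool) -> bool {
    match raw.map(|v| v.trim().to_ascii_lowercase()) {
        Some(v) if matches!(v.as_str(), "true" | "1" | "on" | "yes") => true,
        Some(v) if matches!(v.as_str(), "false" | "0" | "off" | "no") => false,
        _ => default,
    }
}

/// What gets initialized for a given toggle state: the live event stream
/// is always present, targets only when the module is enabled.
fn startup_plan(enabled: bool) -> &'static [&'static str] {
    if enabled {
        &["live event stream", "notification targets"]
    } else {
        &["live event stream"]
    }
}
```

In a running process the first argument would come from something like `std::env::var("RUSTFS_NOTIFY_ENABLE").ok().as_deref()`; it is passed in explicitly here so the logic stays testable.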
---
## 2. Event Types & Structure
### 2.1 EventName Enum
Defined in `rustfs_s3_types::EventName`. All S3-standard event types plus RustFS extensions:
| Category | Events |
|----------|--------|
| **ObjectAccessed** | `s3:ObjectAccessed:Get`, `s3:ObjectAccessed:Head`, `s3:ObjectAccessed:GetRetention`, `s3:ObjectAccessed:GetLegalHold`, `s3:ObjectAccessed:Attributes` |
| **ObjectCreated** | `s3:ObjectCreated:Put`, `s3:ObjectCreated:Post`, `s3:ObjectCreated:Copy`, `s3:ObjectCreated:CompleteMultipartUpload`, `s3:ObjectCreated:PutRetention`, `s3:ObjectCreated:PutLegalHold` |
| **ObjectRemoved** | `s3:ObjectRemoved:Delete`, `s3:ObjectRemoved:DeleteMarkerCreated`, `s3:ObjectRemoved:DeleteAllVersions`, `s3:ObjectRemoved:NoOP` |
| **ObjectTagging** | `s3:ObjectTagging:Put`, `s3:ObjectTagging:Delete` |
| **ObjectAcl** | `s3:ObjectAcl:Put` |
| **ObjectReplication** | `s3:Replication:OperationFailedReplication`, `s3:Replication:OperationCompletedReplication`, `s3:Replication:OperationMissedThreshold`, `s3:Replication:OperationReplicatedAfterThreshold`, `s3:Replication:OperationNotTracked` |
| **ObjectRestore** | `s3:ObjectRestore:Post`, `s3:ObjectRestore:Completed` |
| **ObjectTransition** | `s3:ObjectTransition:Failed`, `s3:ObjectTransition:Complete` |
| **Lifecycle** | `s3:LifecycleExpiration:Delete`, `s3:LifecycleExpiration:DeleteMarkerCreated`, `s3:LifecycleDelMarkerExpiration:Delete`, `s3:LifecycleTransition` |
| **Bucket** | `s3:BucketCreated:*`, `s3:BucketRemoved:*` |
| **Scanner** | `s3:Scanner:ManyVersions`, `s3:Scanner:LargeVersions`, `s3:Scanner:BigPrefix` |
| **IntelligentTiering** | `s3:IntelligentTiering` |
| **Compound (wildcard)** | `s3:ObjectAccessed:*`, `s3:ObjectCreated:*`, `s3:ObjectRemoved:*`, `s3:ObjectTagging:*`, `s3:Replication:*`, `s3:ObjectRestore:*`, `s3:LifecycleExpiration:*`, `s3:ObjectTransition:*`, `s3:Scanner:*`, `Everything` |
| **Internal** | `ObjectRemovedAbortMultipartUpload`, `ObjectCreatedCreateMultipartUpload`, `ObjectRemovedDeleteObjects` |
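The relationship between compound (wildcard) names and concrete events can be illustrated with a string-based sketch. `pattern_covers` is a hypothetical helper; RustFS itself is described as using mask/bitfield support on `EventName`, so this shows the matching semantics rather than the actual implementation.

```rust
/// Returns true if a rule's event pattern covers a concrete event name.
/// - "Everything" matches all events.
/// - A pattern ending in ":*" matches every event sharing that prefix.
/// - Any other pattern must match exactly.
fn pattern_covers(pattern: &str, event: &str) -> bool {
    if pattern == "Everything" {
        return true;
    }
    match pattern.strip_suffix('*') {
        Some(prefix) if prefix.ends_with(':') => event.starts_with(prefix),
        _ => pattern == event,
    }
}
```

Under this rule, `s3:ObjectCreated:*` covers `s3:ObjectCreated:Put` and `s3:ObjectCreated:Copy` but not `s3:ObjectRemoved:Delete`.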
### 2.2 Event Schema Versioning
The `event_schema_version` function returns different versions based on event type:
| Version | Events |
|---------|--------|
| `2.1` | ObjectCreated/Removed/Accessed base events |
| `2.2` | Replication events |
| `2.3` | Tagging, ACL, Restore, Lifecycle, IntelligentTiering events |
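A rough sketch of the version lookup, keyed on event-name prefixes, follows. The function name `schema_version_for` is hypothetical, and the prefix lists are derived from the table above; categories the table does not mention (for example Scanner or ObjectTransition events) return `None` here because their version is not stated in this document.

```rust
/// True if `name` starts with any of the given prefixes.
fn starts_with_any(name: &str, prefixes: &[&str]) -> bool {
    prefixes.iter().any(|p| name.starts_with(*p))
}

/// Map an S3 event name to the schema version listed in the table above.
fn schema_version_for(event_name: &str) -> Option<&'static str> {
    if starts_with_any(
        event_name,
        &["s3:ObjectCreated:", "s3:ObjectRemoved:", "s3:ObjectAccessed:"],
    ) {
        Some("2.1")
    } else if starts_with_any(event_name, &["s3:Replication:"]) {
        Some("2.2")
    } else if starts_with_any(
        event_name,
        &[
            "s3:ObjectTagging:",
            "s3:ObjectAcl:",
            "s3:ObjectRestore:",
            "s3:Lifecycle",
            "s3:IntelligentTiering",
        ],
    ) {
        Some("2.3")
    } else {
        // Not covered by the table; the real function's fallback is unknown.
        None
    }
}
```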
### 2.3 Event Record Structure (`rustfs_notify::Event`)
```rust
pub struct Event {
pub event_version: String, // e.g., "2.1", "2.2", "2.3"
pub event_source: String, // "rustfs:s3"
pub aws_region: String,
pub event_time: DateTime<Utc>,
pub event_name: EventName,
pub user_identity: Identity, // { principal_id: String }
pub request_parameters: HashMap<String, String>,
pub response_elements: HashMap<String, String>,
pub s3: Metadata, // See below
pub glacier_event_data: Option<GlacierEventData>,
pub source: Source, // { host, port, user_agent }
}
pub struct Metadata {
pub schema_version: String, // "1.0"
pub configuration_id: String,
pub bucket: Bucket, // { name, owner_identity, arn }
pub object: Object, // See below
}
pub struct Object {
pub key: String, // URL-encoded object key
pub size: Option<i64>,
pub e_tag: Option<String>,
pub content_type: Option<String>,
pub user_metadata: Option<HashMap<String, String>>,
pub version_id: Option<String>,
pub sequencer: String, // Monotonic event sequence ID
}
```
- The `key` field is URL-encoded (form-urlencoded)
- `sequencer` is derived from `ObjectInfo.mod_time` nanosecond timestamp, ensuring ordering
- `user_metadata` filters out keys starting with `x-amz-meta-internal-`
- For removed events, `size`, `e_tag`, `content_type`, and `user_metadata` are omitted
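Two of the rules above, key encoding and metadata filtering, can be sketched with the standard library alone. The helper names are hypothetical, the encoder approximates form-urlencoding (alphanumerics and `*-._` pass through, space becomes `+`, everything else is percent-encoded), and treating the internal prefix as case-sensitive is an assumption.

```rust
use std::collections::HashMap;

/// Form-urlencode an object key byte by byte.
fn encode_key(key: &str) -> String {
    let mut out = String::with_capacity(key.len());
    for byte in key.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'*' | b'-' | b'.' | b'_' => {
                out.push(byte as char)
            }
            b' ' => out.push('+'),
            // Everything else, including '/', is percent-encoded.
            other => out.push_str(&format!("%{:02X}", other)),
        }
    }
    out
}

/// Drop internal metadata keys before they reach the event payload.
fn public_metadata(meta: &HashMap<String, String>) -> HashMap<String, String> {
    meta.iter()
        .filter(|(k, _)| !k.starts_with("x-amz-meta-internal-"))
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}
```

Consumers of these events therefore need to decode `key` before using it as an object path: `photos/my cat.jpg` arrives as `photos%2Fmy+cat.jpg`.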
### 2.4 EventArgs Builder
Events are constructed via `EventArgsBuilder`:
```rust
let args = EventArgsBuilder::new(EventName::ObjectCreatedPut, "my-bucket", object_info)
.host("10.0.0.1")
.port(9000)
.user_agent("alknet-storage/1.0")
.req_param("principalId", "user-123")
.version_id("v2")
.build();
let event = Event::new(args);
```
The builder takes the required fields (event name, bucket, and object info) in `new()` and exposes the optional ones as chained setters.
---
## 3. Notification Targets
### 3.1 Target Trait
All targets implement `rustfs_targets::Target<E>`:
```rust
#[async_trait]
pub trait Target<E>: Send + Sync + 'static
where E: Send + Sync + 'static + Clone + Serialize + DeserializeOwned
{
fn id(&self) -> TargetID;
fn name(&self) -> String;
async fn is_active(&self) -> Result<bool, TargetError>;
async fn save(&self, event: Arc<EntityTarget<E>>) -> Result<(), TargetError>;
async fn send_raw_from_store(&self, key: Key, body: Vec<u8>, meta: QueuedPayloadMeta) -> Result<(), TargetError>;
async fn send_from_store(&self, key: Key) -> Result<(), TargetError>;
async fn close(&self) -> Result<(), TargetError>;
fn store(&self) -> Option<&(dyn Store<QueuedPayload, ...>)>;
fn clone_dyn(&self) -> Box<dyn Target<E> + Send + Sync>;
async fn init(&self) -> Result<(), TargetError>;
fn is_enabled(&self) -> bool;
fn delivery_snapshot(&self) -> TargetDeliverySnapshot;
fn record_final_failure(&self);
}
```
### 3.2 Supported Targets
| Target | Crate Module | Protocol | Queue Store | TLS/mTLS | SASL | Notes |
|--------|-------------|----------|-------------|----------|------|-------|
| **Webhook** | `targets::webhook` | HTTP POST | Yes (file) | Yes (CA, client cert, skip_verify) | Bearer token | Health check via HEAD to `/`; TLS hot-reload |
| **Kafka** | `targets::kafka` | Kafka Produce | Yes (file) | Yes (CA, client cert) | PLAIN, SCRAM-SHA-256, SCRAM-SHA-512 | Uses `rustfs_kafka_async`; acknowledgments configurable (-1, 0, 1) |
| **AMQP** | `targets::amqp` | AMQP 0-9-1 | Yes (file) | Yes (CA, client cert via amqps://) | Username/password (in URL or config) | Uses `lapin`; publisher confirms; persistent delivery mode |
| **NATS** | `targets::nats` | NATS Publish | Yes (file) | Yes (CA, client cert) | Token, username/password, credentials file | Subject-based routing |
| **Redis** | `targets::redis` | Redis Pub/Sub | Yes (file) | Yes (CA, client cert, insecure) | Password | Channel publish; connection pooling |
| **MQTT** | `targets::mqtt` | MQTT v5 | Yes (file) | Yes (CA, client cert) | Username/password | Uses `rumqttc`; QoS 0/1; WebSocket path allowlist |
| **MySQL** | `targets::mysql` | MySQL INSERT | Yes (file) | Yes (CA, client cert) | Username/password | Namespace or access format; connection pooling |
| **PostgreSQL** | `targets::postgres` | PostgreSQL INSERT/UPSERT | Yes (file) | Yes (CA, client cert) | Username/password (DSN) | Namespace (UPSERT) or access (append) format; `deadpool-postgres` pooling |
| **Pulsar** | `targets::pulsar` | Pulsar Produce | Yes (file) | Yes (CA, client cert) | Token, OAuth2 | Topic-based; persistent or non-persistent |
**Note**: Elasticsearch is listed as a subsystem constant (`notify_elasticsearch`) but marked `#[allow(dead_code)]`, indicating it's planned but not yet implemented.
### 3.3 Target Identification (ARN)
Each target has a `TargetID` (format: `ID:Name`, e.g., `1:webhook`) and an `ARN` (format: `arn:rustfs:sqs:{region}:{id}:{name}`, e.g., `arn:rustfs:sqs:us-east-1:1:webhook`).
Default partition: `rustfs`, default service: `sqs`.
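As a rough illustration of the two identifier formats, the sketch below parses a `TargetID` and renders or parses the matching ARN. The `TargetId` struct and `parse_arn` function here are hypothetical stand-ins written for this document, not the types from `crates/targets/src/arn.rs`.
```rust
/// Target identifier in the `ID:Name` form described above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TargetId {
    pub id: String,
    pub name: String,
}

impl TargetId {
    /// Parses `"1:webhook"` into its two parts.
    pub fn parse(s: &str) -> Option<Self> {
        let (id, name) = s.split_once(':')?;
        if id.is_empty() || name.is_empty() {
            return None;
        }
        Some(Self { id: id.to_string(), name: name.to_string() })
    }

    /// Renders the ARN with the default partition (`rustfs`) and service (`sqs`).
    pub fn to_arn(&self, region: &str) -> String {
        format!("arn:rustfs:sqs:{}:{}:{}", region, self.id, self.name)
    }
}

/// Parses an ARN back into `(region, TargetId)`; other partitions or services are rejected.
pub fn parse_arn(arn: &str) -> Option<(String, TargetId)> {
    let parts: Vec<&str> = arn.split(':').collect();
    match parts.as_slice() {
        ["arn", "rustfs", "sqs", region, id, name] if !id.is_empty() && !name.is_empty() => Some((
            region.to_string(),
            TargetId { id: id.to_string(), name: name.to_string() },
        )),
        _ => None,
    }
}
```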
### 3.4 Queue Store (Persistent Delivery)
Targets that have a `queue_dir` configured use a persistent store for at-least-once delivery:
- Events are first persisted to the queue store, then sent
- If the target is unreachable, events remain in the store and are replayed when connectivity recovers
- Queue store format: `RQP1` magic + metadata length (LE u32) + JSON metadata + raw body
- `QueuedPayload` structure includes: event_name, bucket_name, object_name, content_type, queued_at_unix_ms, payload_len
- Extension: `notify_store` (`.nqs`) for notification events, `audit_store` for audit logs
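The on-disk layout listed above (`RQP1` magic, little-endian `u32` metadata length, JSON metadata, raw body) can be sketched as a pair of framing functions. `encode_frame` and `decode_frame` are hypothetical names, and the sketch treats the metadata as opaque bytes rather than modelling `QueuedPayload`.
```rust
/// Magic bytes that open every queued payload, per the format note above.
const MAGIC: &[u8; 4] = b"RQP1";

/// Encodes `magic | metadata length (LE u32) | JSON metadata | raw body`.
/// Returns `None` if the metadata does not fit in a u32 length field.
pub fn encode_frame(meta_json: &[u8], body: &[u8]) -> Option<Vec<u8>> {
    if meta_json.len() > u32::MAX as usize {
        return None;
    }
    let meta_len = meta_json.len() as u32;
    let mut out = Vec::with_capacity(8 + meta_json.len() + body.len());
    out.extend_from_slice(MAGIC);
    out.extend_from_slice(&meta_len.to_le_bytes());
    out.extend_from_slice(meta_json);
    out.extend_from_slice(body);
    Some(out)
}

/// Splits a frame back into `(metadata, body)`; returns `None` on malformed input.
pub fn decode_frame(frame: &[u8]) -> Option<(&[u8], &[u8])> {
    if frame.len() < 8 || frame[..4] != MAGIC[..] {
        return None;
    }
    let meta_len = u32::from_le_bytes([frame[4], frame[5], frame[6], frame[7]]) as usize;
    let rest = &frame[8..];
    if rest.len() < meta_len {
        return None;
    }
    Some(rest.split_at(meta_len))
}
```
Keeping the length prefix separate from the JSON means a reader can skip to the body without parsing the metadata.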
### 3.5 Delivery Payload Format (`TargetLog`)
```rust
// Serialized as JSON when delivering to targets
struct TargetLog {
event_name: EventName,
key: String, // "{bucket}/{decoded_object_name}"
records: Vec<E>, // For AMQP/NATS: includes full EntityTarget records
// For others: includes serialized Event data
}
```
For AMQP and NATS targets, `build_queued_payload_with_records()` is used, which includes cloned `EntityTarget` records. For other targets, `build_queued_payload()` serializes just the event data.
### 3.6 Concurrency Controls
| Parameter | Default | Env Var |
|-----------|---------|---------|
| Target stream concurrency | 20 | `RUSTFS_NOTIFY_TARGET_STREAM_CONCURRENCY` |
| Send concurrency (inflight limit) | 64 | `RUSTFS_NOTIFY_SEND_CONCURRENCY` |
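A minimal sketch of how such limits might be resolved from the environment is shown below. The defaults come from the table; `parse_limit` and `limit_from_env` are hypothetical helpers, and treating `0` as "use the default" is an assumption.
```rust
/// Defaults taken from the table above.
pub const DEFAULT_TARGET_STREAM_CONCURRENCY: usize = 20;
pub const DEFAULT_SEND_CONCURRENCY: usize = 64;

/// Hypothetical helper: parses a raw setting, falling back to `default`
/// when the value is missing, malformed, or zero (the zero rule is an assumption).
pub fn parse_limit(raw: Option<&str>, default: usize) -> usize {
    raw.and_then(|v| v.trim().parse::<usize>().ok())
        .filter(|n| *n > 0)
        .unwrap_or(default)
}

/// Reads the limit from the process environment, e.g.
/// `limit_from_env("RUSTFS_NOTIFY_SEND_CONCURRENCY", DEFAULT_SEND_CONCURRENCY)`.
pub fn limit_from_env(var: &str, default: usize) -> usize {
    parse_limit(std::env::var(var).ok().as_deref(), default)
}
```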
### 3.7 TLS Hot-Reload
All targets that support TLS (webhook, Kafka, AMQP, NATS, MySQL, PostgreSQL, MQTT) implement `ReloadableTargetTls`:
- A background coordinator polls TLS files for changes
- When fingerprint changes are detected, new material (HTTP client, producer, connection) is built
- Applied via `apply_tls_material()` without requiring a restart
- Supports CA certificates, client certificates, and client keys
---
## 4. Configuration & Rule Engine
### 4.1 Bucket Notification Configuration (XML)
Configuration follows the S3 `NotificationConfiguration` XML schema:
```xml
<NotificationConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<QueueConfiguration>
<Id>my-notification</Id>
<Queue>arn:rustfs:sqs:us-east-1:1:webhook</Queue>
<Event>s3:ObjectCreated:*</Event>
<Event>s3:ObjectRemoved:Delete</Event>
<Filter>
<S3Key>
<FilterRule>
<Name>prefix</Name>
<Value>uploads/</Value>
</FilterRule>
<FilterRule>
<Name>suffix</Name>
<Value>.csv</Value>
</FilterRule>
</S3Key>
</Filter>
</QueueConfiguration>
</NotificationConfiguration>
```
The XML is parsed via `quick_xml` into `NotificationConfiguration` → `QueueConfig` → validated → converted to `BucketNotificationConfig` → `RulesMap`.
Key validation rules:
- Lambda and Topic configurations are **not supported** (return `UnsupportedConfiguration` error)
- Only `QueueConfiguration` is supported (maps to all target types, not just SQS)
- One prefix filter and one suffix filter maximum
- Filter values: ≤1024 chars, no `.` or `..` segments, no `\`, valid UTF-8
- No duplicate event names within a queue config
- ARN must exist in the configured target list
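The filter-value rules above translate into a short validator. This is a sketch under stated assumptions: `validate_filter_value` and `FilterError` are hypothetical names, the length limit is counted in characters, and segments are split on `/`. UTF-8 validity is already guaranteed by `&str`.
```rust
/// Reasons a prefix/suffix filter value is rejected in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum FilterError {
    TooLong,
    Backslash,
    DotSegment,
}

/// Checks a filter value against the rules listed above.
pub fn validate_filter_value(value: &str) -> Result<(), FilterError> {
    // Limit of 1024 characters (counting chars, not bytes, is an assumption).
    if value.chars().count() > 1024 {
        return Err(FilterError::TooLong);
    }
    if value.contains('\\') {
        return Err(FilterError::Backslash);
    }
    // Reject `.` and `..` path segments such as "a/../b".
    if value.split('/').any(|segment| segment == "." || segment == "..") {
        return Err(FilterError::DotSegment);
    }
    Ok(())
}
```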
### 4.2 RulesMap
`RulesMap` maps `EventName` → `PatternRules` → `TargetIdSet`:
- Compound events (like `ObjectCreatedAll`) are **expanded** into specific events on insertion
- Pattern matching: prefix/suffix wildcards (e.g., `uploads/*.csv`)
- URL-encoded keys are matched against both encoded and decoded patterns
- Bitmask-based fast path: `total_events_mask` enables O(1) `has_subscriber()` checks
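The bitmask fast path and prefix/suffix matching can be illustrated with a much smaller structure than the real `RulesMap`. Everything below (`EventKind`, `Rules`, `match_targets`) is a hypothetical simplification: it ignores compound-event expansion and URL-encoded key handling.
```rust
/// A few event kinds; each gets one bit in the subscription mask.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EventKind {
    ObjectCreatedPut = 0,
    ObjectCreatedCopy = 1,
    ObjectRemovedDelete = 2,
}

impl EventKind {
    fn mask(self) -> u64 {
        1u64 << (self as u32)
    }
}

struct Rule {
    prefix: String,
    suffix: String,
    events: u64,
    target: String,
}

#[derive(Default)]
pub struct Rules {
    rules: Vec<Rule>,
    /// Union of all rule masks, so "does anyone care?" is a single AND.
    total_events_mask: u64,
}

impl Rules {
    pub fn add(&mut self, prefix: &str, suffix: &str, events: &[EventKind], target: &str) {
        let mask = events.iter().fold(0u64, |acc, e| acc | e.mask());
        self.total_events_mask |= mask;
        self.rules.push(Rule {
            prefix: prefix.to_string(),
            suffix: suffix.to_string(),
            events: mask,
            target: target.to_string(),
        });
    }

    /// O(1) check used to skip event construction when nobody subscribes.
    pub fn has_subscriber(&self, event: EventKind) -> bool {
        self.total_events_mask & event.mask() != 0
    }

    /// Targets whose rule covers `event` and whose prefix/suffix match `key`.
    pub fn match_targets(&self, event: EventKind, key: &str) -> Vec<&str> {
        if !self.has_subscriber(event) {
            return Vec::new();
        }
        self.rules
            .iter()
            .filter(|r| r.events & event.mask() != 0)
            .filter(|r| key.starts_with(&r.prefix) && key.ends_with(&r.suffix))
            .map(|r| r.target.as_str())
            .collect()
    }
}
```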
### 4.3 Dynamically Reconfigurable
- `NotificationSystem::set_target_config()` — add/update a target
- `NotificationSystem::remove_target_config()` — remove a target
- `NotificationSystem::load_bucket_notification_config()` — load per-bucket rules
- `NotificationSystem::remove_bucket_notification_config()` — remove per-bucket rules
- `NotificationSystem::reload_config()` — reload from a new `Config` object
- All changes trigger automatic re-initialization of affected targets
---
## 5. Pipeline & Delivery
### 5.1 Event Flow
```
ECStore operation
    ↓
ecstore::event_notification::send_event(EventArgs)
    ↓ (OnceLock dispatch hook)
convert EventArgs → notify::EventArgs
    ↓ spawn
notifier_global::notify(EventArgs)
    ↓
NotificationSystem::send_event(Arc<Event>)
    ↓
NotifyPipeline::send_event()
    ├── LiveEventHistory::record() (in-memory, last 1024 events)
    ├── broadcast::send() (tokio broadcast channel, capacity 1024)
    └── EventNotifier::send() (async, rule-matched delivery)
        ├── RuleEngine::match_targets(bucket, event_name, object_key)
        └── For each matched target:
            ├── EntityTarget construction
            ├── If queue_store: persist then async send
            └── If no queue_store: immediate async send
```
### 5.2 Live Event Stream
The `NotifyPipeline` provides an in-process event stream via `tokio::sync::broadcast`:
```rust
// Subscribe to live events
let rx = system.subscribe_live_events();
// Check if there are live listeners
system.has_live_listeners();
// Get recent events since a sequence number
let batch = system.recent_live_events_since(after_sequence, limit).await; // → LiveEventBatch
```
- Broadcast channel capacity: 1024
- `LiveEventHistory` stores last 1024 events with monotonic sequence numbers
- `LiveEventBatch` includes `events: Vec<Arc<Event>>`, `next_sequence: u64`, `truncated: bool`
### 5.3 Metrics
`NotificationMetrics` tracks:
- Processing count (in-flight)
- Processed count (completed)
- Failed count
- Skipped count (no matching targets)
Per-target `TargetDeliverySnapshot`:
- `total_messages`
- `failed_messages`
- `queue_length`
---
## 6. Live Event Stream
### 6.1 In-Process Subscription
The live event stream is useful for alknet because it provides a **push-based** event feed without requiring external message brokers:
```rust
// This can be used from within the same process
let mut rx = notification_system.subscribe_live_events();
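loop {
    match rx.recv().await {
        // event: Arc<Event>, the full S3 event record
        Ok(event) => println!("Event: {} on {}/{}", event.event_name, event.s3.bucket.name, event.s3.object.key),
        // A slow consumer skipped some events; keep receiving instead of exiting.
        Err(tokio::sync::broadcast::error::RecvError::Lagged(_)) => continue,
        Err(tokio::sync::broadcast::error::RecvError::Closed) => break,
    }
}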
```
### 6.2 Event History Replay
The `LiveEventHistory` supports catch-up subscriptions:
```rust
// Get events since sequence number 42
let batch = system.recent_live_events_since(42, 100).await;
// batch.next_sequence → next sequence to request
// batch.truncated → whether there are more events
// batch.events → Vec<Arc<Event>>
```
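To make the catch-up semantics concrete, here is a simplified, single-threaded model of a bounded history with monotonic sequence numbers. `History` and `Batch` are hypothetical stand-ins for `LiveEventHistory` and `LiveEventBatch`; treating `after` as exclusive, starting sequences at 1, and setting `truncated` when the limit cut the batch short are all assumptions of this sketch.
```rust
use std::collections::VecDeque;

/// Bounded in-memory history; the oldest entries are evicted first.
pub struct History<T> {
    capacity: usize,
    next_sequence: u64,
    events: VecDeque<(u64, T)>,
}

/// Result of a catch-up query.
pub struct Batch<T> {
    pub events: Vec<T>,
    /// Pass this value as `after` on the next call.
    pub next_sequence: u64,
    /// True when more retained events exist beyond `limit`.
    pub truncated: bool,
}

impl<T: Clone> History<T> {
    pub fn new(capacity: usize) -> Self {
        Self { capacity: capacity.max(1), next_sequence: 1, events: VecDeque::new() }
    }

    /// Stores an event and returns the sequence number assigned to it.
    pub fn record(&mut self, event: T) -> u64 {
        let seq = self.next_sequence;
        self.next_sequence += 1;
        if self.events.len() == self.capacity {
            self.events.pop_front();
        }
        self.events.push_back((seq, event));
        seq
    }

    /// Returns up to `limit` events with a sequence number greater than `after`.
    pub fn recent_since(&self, after: u64, limit: usize) -> Batch<T> {
        let mut events = Vec::new();
        let mut next_sequence = after;
        let mut truncated = false;
        for (seq, event) in self.events.iter().filter(|(s, _)| *s > after) {
            if events.len() == limit {
                truncated = true;
                break;
            }
            events.push(event.clone());
            next_sequence = *seq;
        }
        Batch { events, next_sequence, truncated }
    }
}
```
A consumer that reconnects would call `recent_since` with its last seen sequence and repeat while `truncated` is true.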
---
## 7. S3 Select
### 7.1 Architecture Overview
RustFS implements S3 Select using **Apache DataFusion** as the SQL engine:
```
SelectObjectContentRequest
↓ validation (expression type, input/output format, scan range)
↓ preflight (get object info, validate SSE headers)
↓ create EcObjectStore (DataFusion ObjectStore adapter)
↓ get_global_db(input) → QueryDispatcher
↓ Query::new(Context, expression) → execute
↓ DataFusion SQL parser → logical plan → optimized → physical plan → RecordBatch stream
↓ SelectOutputEncoder → CSV or JSON → chunked (128KB) → event stream
```
### 7.2 Key Crates
| Crate | Purpose |
|-------|---------|
| `rustfs_s3select_api` | Query error types, `Context`, `Query`, `QueryResult`, `DatabaseManagerSystem` trait, object store |
| `rustfs_s3select_query` | SQL implementation: parser, analyzer, optimizer, function manager, execution, dispatcher |
### 7.3 SQL Engine
- **Parser**: Custom `RustFsDialect` + `ExtParser` extending DataFusion's SQL parser
- **Supports**: Single SELECT statements only (multi-statement is rejected)
- **Optimizer**: `CascadeOptimizerBuilder` (DataFusion's default rule set)
- **Scheduler**: `LocalScheduler` (single-node execution)
- **Functions**: All of DataFusion's built-in scalar, aggregate, and window functions
### 7.4 Input Formats
| Format | Support | Notes |
|--------|---------|-------|
| **CSV** | ✅ Full | `FileHeaderInfo` (NONE, USE, IGNORE), custom delimiters, quote chars, comment chars, record delimiters |
| **JSON (LINES)** | ✅ Full | NDJSON line-by-line streaming |
| **JSON (DOCUMENT)** | ✅ Limited | Max 128 MiB (OOM guard); no scan range support |
| **Parquet** | ✅ Full | Columnar format |
| **Compression** | ❌ Not supported | Only `NONE` compression currently accepted |
### 7.5 Output Formats
| Format | Options |
|--------|---------|
| **CSV** | Custom field delimiter, quote character, quote escape, record delimiter, quote fields (ALWAYS/ASNEEDED) |
| **JSON** | Line-delimited (NDJSON); custom record delimiter |
### 7.6 Expression Limitations
- Max expression size: 256 KiB (`MAX_SELECT_EXPRESSION_BYTES`)
- Expression type must be `SQL`
- No `AllowQuotedRecordDelimiter` support for CSV
- Scan ranges:
- CSV: supported
- JSON LINES: supported
- JSON DOCUMENT: **not supported**
- Parquet: supported
- Range must be valid (start < end, start < object size)
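The scan-range rules above amount to a small preflight check, sketched here. `InputFormat`, `ScanRangeError`, and `check_scan_range` are hypothetical names; the real validation may differ in error types and in how open-ended ranges are handled.
```rust
/// Input formats from the table in 7.4.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum InputFormat {
    Csv,
    JsonLines,
    JsonDocument,
    Parquet,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ScanRangeError {
    UnsupportedForFormat,
    EmptyOrInverted,
    StartBeyondObject,
}

/// Validates an optional `(start, end)` byte range against the rules above.
pub fn check_scan_range(
    format: InputFormat,
    range: Option<(u64, u64)>,
    object_size: u64,
) -> Result<(), ScanRangeError> {
    // No scan range requested: nothing to validate.
    let (start, end) = match range {
        None => return Ok(()),
        Some(r) => r,
    };
    // JSON DOCUMENT input is buffered whole, so ranges are rejected.
    if format == InputFormat::JsonDocument {
        return Err(ScanRangeError::UnsupportedForFormat);
    }
    if start >= end {
        return Err(ScanRangeError::EmptyOrInverted);
    }
    if start >= object_size {
        return Err(ScanRangeError::StartBeyondObject);
    }
    Ok(())
}
```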
### 7.7 Object Store Integration
`EcObjectStore` implements DataFusion's `ObjectStore` trait, adapting rustfs's ECStore for query execution:
- Handles `GET` with optional byte ranges (scan range)
- JSON DOCUMENT mode: entire file buffered for DOM parsing, then flattened to NDJSON
- JSON sub-path extraction: `FROM s3object.some.path` navigates to the key before flattening
- Respects SSE-C headers for encrypted objects
### 7.8 Streaming Response
Results are streamed as S3 event types:
1. `Cont` event (continuation marker)
2. `Records` events (128KB chunks)
3. `Progress` events (if `RequestProgress.Enabled=true`) — currently only `BytesReturned` populated
4. `Stats` event (final)
5. `End` event
### 7.9 Error Mapping
| QueryError | S3 Error |
|-----------|----------|
| `Parser` | `ParseSelectFailure` (400) |
| `MultiStatement` | `UnsupportedSqlStructure` |
| `NotImplemented` | `NotImplemented` |
| `Datafusion` (scan range) | `InvalidRequestParameter` |
| `Datafusion` (missing binding) | `EvaluatorBindingDoesNotExist` |
| `Datafusion` (other) | `UnsupportedSqlOperation` |
| `StoreError` (bucket not found) | `NoSuchBucket` |
| `StoreError` (object not found) | `NoSuchKey` |
| `StoreError` (other) | `InternalError` |
---
## 8. Mapping to alknet
### 8.1 rustfs Events → alknet Integration Events
rustfs events are **integration events from rustfs's perspective** and remain **integration events from alknet's perspective**. This is the correct cross-boundary classification per ADR-032.
#### Event Projection: `rustfs::BucketNotificationEvent` → `alknet::EventEnvelope`
Suggested namespace and operation mapping:
| rustfs EventName | alknet Namespace | alknet Operation |
|------------------|-----------------|-----------------|
| `s3:ObjectCreated:Put` | `storage.object` | `created.put` |
| `s3:ObjectCreated:Post` | `storage.object` | `created.post` |
| `s3:ObjectCreated:Copy` | `storage.object` | `created.copy` |
| `s3:ObjectCreated:CompleteMultipartUpload` | `storage.object` | `created.multipart-complete` |
| `s3:ObjectRemoved:Delete` | `storage.object` | `removed.delete` |
| `s3:ObjectRemoved:DeleteMarkerCreated` | `storage.object` | `removed.delete-marker-created` |
| `s3:ObjectAccessed:Get` | `storage.object` | `accessed.get` |
| `s3:ObjectAccessed:Head` | `storage.object` | `accessed.head` |
| `s3:BucketCreated:*` | `storage.bucket` | `created` |
| `s3:BucketRemoved:*` | `storage.bucket` | `removed` |
The full `Event` record from rustfs should be preserved in the `EventEnvelope.payload` field for traceability, while a normalized `metadata` extraction provides fast-path access:
```rust
// Pseudocode for mapping
fn project_rustfs_event(event: &rustfs_notify::Event) -> alknet::EventEnvelope {
let namespace = if event.event_name == EventName::BucketCreated || event.event_name == EventName::BucketRemoved {
"storage.bucket"
} else {
"storage.object"
};
let operation = event.event_name.as_str() // "s3:ObjectCreated:Put"
.strip_prefix("s3:") // "ObjectCreated:Put"
.unwrap_or("unknown")
.to_lowercase()
.replace(':', ".");
EventEnvelope {
id: uuid::Uuid::new_v4(),
namespace: namespace.into(),
operation: operation.into(), // e.g., "objectcreated.put"; the table's "created.put" form needs an explicit mapping
timestamp: event.event_time,
source: "rustfs".into(),
metadata: json!({
"bucket": event.s3.bucket.name,
"key": event.s3.object.key,
"size": event.s3.object.size,
"eTag": event.s3.object.e_tag,
"versionId": event.s3.object.version_id,
"sequencer": event.s3.object.sequencer,
"principalId": event.user_identity.principal_id,
}),
payload: serde_json::to_value(event).ok(),
}
}
```
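Note that a plain lowercase-and-replace transformation yields names like `objectcreated.put`, not the `created.put` form in the mapping table. One way to get the table's form is an explicit mapping on the event-name string, sketched below. `map_event_name` and `kebab_case` are hypothetical helpers, and the kebab-case rule for actions not listed in the table is an assumption.
```rust
/// Maps an S3 event name to `(namespace, operation)` following the table above.
/// Returns `None` for categories the table does not cover.
pub fn map_event_name(name: &str) -> Option<(&'static str, String)> {
    let rest = name.strip_prefix("s3:")?;
    let (category, action) = rest.split_once(':')?;
    let (namespace, verb) = match category {
        "ObjectCreated" => ("storage.object", "created"),
        "ObjectRemoved" => ("storage.object", "removed"),
        "ObjectAccessed" => ("storage.object", "accessed"),
        // Bucket events collapse to a single operation regardless of the action.
        "BucketCreated" => return Some(("storage.bucket", "created".to_string())),
        "BucketRemoved" => return Some(("storage.bucket", "removed".to_string())),
        _ => return None,
    };
    let action = match action {
        // The table uses a custom name for this one.
        "CompleteMultipartUpload" => "multipart-complete".to_string(),
        other => kebab_case(other),
    };
    Some((namespace, format!("{}.{}", verb, action)))
}

/// Converts "DeleteMarkerCreated" to "delete-marker-created".
fn kebab_case(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 4);
    for (i, ch) in s.chars().enumerate() {
        if ch.is_ascii_uppercase() && i > 0 {
            out.push('-');
        }
        out.push(ch.to_ascii_lowercase());
    }
    out
}
```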
### 8.2 Subscription Architecture
#### Option A: In-Process Live Event Stream (Recommended)
Since alknet and rustfs share the same process, alknet can subscribe to the live event stream directly:
```rust
// In alknet's initialization
let notification_system = rustfs_notify::notification_system().unwrap();
let mut event_rx = notification_system.subscribe_live_events();
// In alknet's event loop
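tokio::spawn(async move {
    loop {
        match event_rx.recv().await {
            Ok(event) => {
                let envelope = project_rustfs_event(&event);
                alknet::honker::publish(envelope).await;
            }
            // A slow consumer skipped some events; keep going rather than exit.
            Err(tokio::sync::broadcast::error::RecvError::Lagged(_)) => continue,
            Err(tokio::sync::broadcast::error::RecvError::Closed) => break,
        }
    }
});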
```
**Advantages**:
- Zero-latency, zero-serialization overhead
- No network hop
- Direct access to `Arc<Event>` in-process
- alknet's Honker streams get events immediately
**Considerations**:
- `has_live_listeners()` can be checked before performing expensive event construction
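- The broadcast channel capacity is 1024; a receiver that falls further behind loses the oldest events and gets `RecvError::Lagged` on its next `recv()` (acceptable for integration events, but the receive loop should treat `Lagged` as recoverable)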
- `recent_live_events_since()` allows catch-up after reconnection
#### Option B: External Target via Webhook/Kafka/etc.
If alknet runs as a separate process, configure a webhook or Kafka target pointing to alknet's event ingestion endpoint:
```json
{
"notify_webhook": {
"1": {
"enable": true,
"endpoint": "https://alknet.internal/events/rustfs",
"auth_token": "Bearer alknet-secret"
}
}
}
```
**Advantages**:
- Decoupled deployment
- RustFS's queue store provides at-least-once delivery
**Considerations**:
- Network latency and serialization overhead
- Need to handle deduplication (at-least-once means possible duplicates)
- Queue store provides durability if alknet is temporarily unavailable
#### Option C: Hybrid — Live Stream + Webhook Fallback
For maximum reliability:
1. In-process live stream for low-latency event propagation
2. Webhook/Kafka target as a fallback for events missed during restarts
3. Use `sequencer` ordering to detect gaps
### 8.3 S3 Select → alknet Operations
S3 Select can be exposed as an alknet operation:
| alknet Operation | Description |
|-----------------|-------------|
| `storage.select` | Run an S3 Select SQL query on an object |
| `storage.select-status` | Check Select availability (optional) |
```rust
// Example alknet call protocol operation
fn handle_storage_select(params: StorageSelectParams) -> Result<StorageSelectResult, Error> {
// 1. Construct SelectObjectContentInput
// 2. Call existing rustfs SelectObjectContent handler
// 3. Stream results back through alknet call protocol
}
```
#### Use Cases for alknet
1. **Metagraph Queries**: Query stored metagraph JSON/CSV objects without downloading them entirely
```sql
SELECT s.name, s.version FROM S3Object s WHERE s.type = 'service'
```
2. **Log Analytics**: Query structured log data stored in S3
```sql
SELECT COUNT(*) as cnt, s.level FROM S3Object s WHERE s.timestamp > '2026-01-01' GROUP BY s.level
```
3. **Ad-hoc Data Exploration**: Quick data inspection without full downloads
```sql
SELECT * FROM S3Object s LIMIT 100
```
4. **Aggregation Pipelines**: Pre-process data before moving to alknet's internal stores
### 8.4 ADR-032 Implications: Cross-Boundary Event Flow
Per ADR-032, rustfs events are **integration events** — they represent facts about state changes that have already happened in the storage system boundary. When alknet consumes them:
```
┌─────────────┐                      ┌─────────────┐
│   rustfs    │                      │   alknet    │
│  (bounded   │     integration      │  (bounded   │
│  context)   │───── event ─────────▶│  context)   │
│             │                      │             │
│  S3 Object  │    EventEnvelope     │  Honker     │
│  Created/   │    namespace:        │  Stream     │
│  Removed/   │    "storage.object"  │  Subscriber │
│  Accessed   │    operation:        │             │
│             │    "created.put"     │  Call       │
│             │                      │  Protocol   │
│  S3 Select  │    storage.select    │  Operation  │
│  Results    │◀──── call ───────────│             │
└─────────────┘                      └─────────────┘
```
Key points:
1. **Events flow inward**: rustfs → alknet (integration events entering alknet's boundary)
2. **Calls flow outward**: alknet → rustfs (alknet initiates S3 Select as a call)
3. **No shared domain model**: alknet shouldn't reference rustfs's `Event` struct directly in its domain; it projects into its own `EventEnvelope` format
4. **Eventual consistency**: rustfs notifications may arrive out of order; the `sequencer` field provides ordering within a bucket
5. **At-least-once delivery**: If using webhook/Kafka targets, duplicate events are possible; alknet must be idempotent
6. **No orchestration across boundaries**: alknet doesn't tell rustfs to emit events; it subscribes to events rustfs naturally produces
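The idempotency requirement in point 5 can be met with a deduplication step in front of the consumer. The sketch below keys on `(bucket, key, sequencer)`; that choice of key, the `Deduplicator` type, and the unbounded in-memory set are all assumptions made for illustration (a real consumer would bound or persist the set).
```rust
use std::collections::HashSet;

/// Remembers which events were already processed.
#[derive(Default)]
pub struct Deduplicator {
    seen: HashSet<(String, String, String)>,
}

impl Deduplicator {
    /// Returns `true` the first time a `(bucket, key, sequencer)` triple is seen,
    /// and `false` for every redelivery of the same event.
    pub fn first_delivery(&mut self, bucket: &str, key: &str, sequencer: &str) -> bool {
        self.seen
            .insert((bucket.to_string(), key.to_string(), sequencer.to_string()))
    }
}
```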
### 8.5 Implementation Recommendations
1. **Short-term**: Use the **in-process live event stream** to subscribe to rustfs events and re-emit them through alknet's Honker system. This gives immediate value with minimal integration work.
2. **Medium-term**: Add a **webhook notification target** pointing at an alknet HTTP endpoint for redundancy. Configure bucket notification rules via the S3 API (PutBucketNotificationConfiguration).
3. **Long-term**: Consider implementing an **alknet NATS target** that directly publishes events into alknet's NATS infrastructure, bypassing the HTTP layer entirely for lower latency.
4. **S3 Select**: Expose via alknet's call protocol as `storage.select`. The existing `execute_select_object_content` function can be called directly as a library function since alknet and rustfs share the same process.
5. **Event schema versioning**: Store the `event_version` field from rustfs events in alknet's `EventEnvelope.metadata` to handle future schema evolution.
---
## 9. References
### Source Code Locations
| Component | Path |
|-----------|------|
| Event structure | `/crates/notify/src/event.rs` |
| EventName enum | `/crates/s3-types/src/event_name.rs` |
| NotifyPipeline + LiveEventHistory | `/crates/notify/src/pipeline.rs` |
| EventNotifier + TargetList | `/crates/notify/src/notifier.rs` |
| NotificationSystem | `/crates/notify/src/integration.rs` |
| Rule engine | `/crates/notify/src/rule_engine.rs` |
| RulesMap | `/crates/notify/src/rules/rules_map.rs` |
| Bucket notification config | `/crates/notify/src/rules/config.rs` |
| XML notification config | `/crates/notify/src/rules/xml_config.rs` |
| Target trait + QueuedPayload | `/crates/targets/src/target/mod.rs` |
| Webhook target | `/crates/targets/src/target/webhook.rs` |
| Kafka target | `/crates/targets/src/target/kafka.rs` |
| AMQP target | `/crates/targets/src/target/amqp.rs` |
| NATS target | `/crates/targets/src/target/nats.rs` |
| Redis target | `/crates/targets/src/target/redis.rs` |
| MQTT target | `/crates/targets/src/target/mqtt.rs` |
| MySQL target | `/crates/targets/src/target/mysql.rs` |
| PostgreSQL target | `/crates/targets/src/target/postgres.rs` |
| Pulsar target | `/crates/targets/src/target/pulsar.rs` |
| ARN + TargetID | `/crates/targets/src/arn.rs` |
| ECStore event dispatch | `/crates/ecstore/src/event_notification.rs` |
| Server event init | `/rustfs/src/server/event.rs` |
| S3 Select handler | `/rustfs/src/app/select_object.rs` |
| S3 Select query engine | `/crates/s3select-query/src/` |
| S3 Select API | `/crates/s3select-api/src/` |
| S3 Select object store | `/crates/s3select-api/src/object_store.rs` |
| Config subsystem names | `/crates/config/src/notify/mod.rs` |
### AWS S3 Documentation
- [S3 Event Notification Configuration](https://docs.aws.amazon.com/AmazonS3/latest/userguide/EventNotifications.html)
- [S3 Select Documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/selecting-content-from-objects.html)
### Internal References
- `/workspace/@alkdev/alknet/docs/research/references/rustfs/rustfs-reference.md` — Companion document covering auth, architecture, and credential mapping


@@ -1,732 +0,0 @@
# RustFS Reference Document
> Status: Research Complete
> Last updated: 2026-06-08
> Source: /workspace/rustfs/ (cloned repository, v1.0.0-beta.7)
> Context: alknet internal service integration research
---
## 1. Architecture Overview
### What is RustFS?
RustFS is a high-performance, distributed, S3-compatible object storage system written in Rust. It is an Apache 2.0-licensed alternative to MinIO that combines S3 API compatibility with OpenStack Swift/Keystone support, designed for data lake, AI, and big data workloads.
**Key characteristics:**
- Language: Rust (edition 2024, MSRV 1.95.0)
- License: Apache 2.0 (no AGPL restrictions)
- Workspace: 57 crates in a flat `crates/` layout
- Main binary: `rustfs/` (75K lines); core engine: `crates/ecstore/` (87K lines)
- Version: 1.0.0-beta.7
### Ports and Endpoints
| Port | Purpose |
|------|---------|
| 9000 | S3 API (primary data path) + Admin API (`/minio/` prefix) |
| 9001 | Web Console UI |
### Request Flow
```
HTTP request
→ server (TLS, auth, routing, compression)
→ app/object_usecase (validation, policy, lifecycle)
→ storage/ecfs (erasure coding, encryption, checksums)
→ ecstore (disk pool selection, data distribution)
→ rio (reader pipeline: encrypt → compress → hash → write)
→ io-core (zero-copy I/O, buffer pool, direct I/O)
→ local disk / remote disk via RPC
```
### Key Crate Map (Security & Auth Focus)
| Crate | Lines | Purpose |
|-------|-------|---------|
| `credentials` | 713 | Credential types (access key / secret key), global credentials |
| `signer` | 1.4K | AWS Signature V4 request signing |
| `iam` | 9.0K | Identity and Access Management (users, groups, policies, OIDC) |
| `policy` | 8.8K | S3 bucket/IAM policy engine |
| `keystone` | 1.9K | OpenStack Keystone auth integration |
| `appauth` | 143 | Application-level auth tokens |
| `crypto` | 1.6K | Encryption primitives |
| `kms` | 8.1K | Key management service integration |
| `protocols` | 18K | FTP/FTPS, WebDAV, Swift API support |
| `s3-ops` | — | S3 operation definitions and mapping |
| `s3-types` | — | S3 event type definitions |
### Startup Sequence (Auth-Relevant Steps)
1. Environment variable compatibility (`MINIO_*` → `RUSTFS_*`)
2. Tokio runtime construction
3. CLI argument parsing
4. Config parsing, credentials/endpoints initialization
5. HTTP server start (S3 API + optional console)
6. ECStore initialization
7. **Steps 13: Bucket metadata, IAM, Keystone, OIDC** initialization
8. FullReady → serving requests
---
## 2. S3 API Compatibility
### Supported S3 Operations
RustFS implements a substantial subset of the S3 API via the `s3s` crate (a fork/custom build at `https://github.com/rustfs/s3s`). Based on the feature status table and crate structure:
| Category | Status | Details |
|----------|--------|---------|
| Core Object Ops (GET/PUT/DELETE/HEAD) | ✅ Available | Primary data path |
| Multipart Upload | ✅ Available | Upload, download, multipart |
| Versioning | ✅ Available | Object versioning |
| Bucket Operations | ✅ Available | Create, list, delete, metadata |
| Logging | ✅ Available | Access logging |
| Event Notifications | ✅ Available | Webhook, Kafka, AMQP, MQTT, NATS targets |
| Bitrot Protection | ✅ Available | Checksums at storage layer |
| Single Node Mode | ✅ Available | Single-node deployment |
| Bucket Replication | ✅ Available | Cross-region replication |
| KMS | 🚧 Under Testing | Key management service |
| Lifecycle Management | 🚧 Under Testing | Object lifecycle rules |
| Distributed Mode | 🚧 Under Testing | Multi-node erasure coding |
| Admin API | ✅ Available | `/minio/` prefix, 30+ handler modules |
| Console | ✅ Available | Web UI on port 9001 |
| S3 Select | ✅ Available | `s3select-api` + `s3select-query` crates |
| WebDAV | ✅ Available | `protocols` crate, `dav-server` |
| FTP/FTPS | ✅ Available | `libunftp`, `suppaftp` |
| SFTP | — | `russh` + `russh-sftp` crate deps |
### Authentication Methods
RustFS supports multiple authentication methods (derived from `auth.rs`):
| Auth Type | Constant | Detection |
|-----------|----------|-----------|
| AWS Signature V4 (header) | `Signed` | `Authorization: AWS4-HMAC-SHA256 ...` |
| AWS Signature V4 (query) | `Presigned` | `X-Amz-Credential` in query |
| AWS Signature V2 (header) | `SignedV2` | `Authorization: AWS ...` |
| AWS Signature V2 (query) | `PresignedV2` | `AWSAccessKeyId` in query |
| Streaming V4 | `StreamingSigned` | `x-amz-content-sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD` |
| Streaming V4 (trailer) | `StreamingSignedTrailer` | `STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER` |
| Unsigned payload (trailer) | `StreamingUnsignedTrailer` | `STREAMING-UNSIGNED-PAYLOAD-TRAILER` |
| POST policy | `PostPolicy` | `multipart/form-data` content type |
| Bearer JWT | `JWT` | `Authorization: Bearer ...` |
| STS | `STS` | `Action` header presence |
| Anonymous | `Anonymous` | No `Authorization` header |
| Keystone token | — | `X-Auth-Token` header (via middleware) |
### S3 Request Signing
The `rustfs-signer` crate implements AWS Signature V4. The general flow:
1. Client computes a canonical request (method + path + query + headers + payload hash)
2. Client creates a string to sign (algorithm + timestamp + credential scope + canonical request hash)
3. Client computes HMAC-SHA256 signature using the secret key
4. Client sends the `Authorization` header with the signature
---
## 3. OpenStack Swift and Keystone Integration
### Swift API
RustFS provides an **OpenStack Swift-compatible API** as an opt-in feature (behind the `swift` cargo feature flag). This is implemented in `crates/protocols/src/swift/`.
**Swift API endpoint pattern:** `/v1/AUTH_{project_id}/...`
**Supported Swift operations:**
- Container CRUD (create, list, delete, metadata)
- Object CRUD with streaming downloads
- Keystone token authentication
- Multi-tenant isolation with SHA256-based bucket prefixing
- Server-side object copy (COPY method)
- HTTP Range requests (206/416 responses)
- Custom metadata (X-Object-Meta-*, X-Container-Meta-*)
**Not yet implemented:** Account-level ops, large object support (>5GB), object versioning, container ACLs/CORS, TempURL, XML/plain-text response formats.
**Tenant isolation:** Swift containers are mapped to S3 buckets with a secure hash prefix:
```
Swift: /v1/AUTH_abc123/mycontainer
→ S3 Bucket: {sha256(abc123)[0:16]}-mycontainer
```
### Keystone Authentication — Complete Flow
This is the most auth-relevant subsystem for alknet integration.
#### Configuration (Environment Variables)
| Variable | Description | Default |
|----------|-------------|---------|
| `RUSTFS_KEYSTONE_ENABLE` | Enable Keystone auth | `false` |
| `RUSTFS_KEYSTONE_AUTH_URL` | Keystone endpoint URL | (required) |
| `RUSTFS_KEYSTONE_VERSION` | API version (`v3` or `v2.0`) | `v3` |
| `RUSTFS_KEYSTONE_ADMIN_USER` | Admin username | (optional) |
| `RUSTFS_KEYSTONE_ADMIN_PASSWORD` | Admin password | (optional) |
| `RUSTFS_KEYSTONE_ADMIN_PROJECT` | Admin project/tenant | (optional) |
| `RUSTFS_KEYSTONE_ADMIN_DOMAIN` | Admin domain | `Default` |
| `RUSTFS_KEYSTONE_VERIFY_SSL` | Verify TLS certificates | `true` |
| `RUSTFS_KEYSTONE_ENABLE_CACHE` | Enable token caching | `true` |
| `RUSTFS_KEYSTONE_CACHE_SIZE` | Token cache capacity | `10000` |
| `RUSTFS_KEYSTONE_CACHE_TTL` | Token cache TTL (seconds) | `300` |
| `RUSTFS_KEYSTONE_TENANT_PREFIX` | Enable tenant project prefixing | `true` |
| `RUSTFS_KEYSTONE_IMPLICIT_TENANTS` | Auto-create tenants | `true` |
| `RUSTFS_KEYSTONE_TIMEOUT` | Request timeout (seconds) | `30` |
#### Architecture: Component Stack
```
KeystoneClient (HTTP calls to Keystone v3 API)
    ↓
KeystoneAuthProvider (Authentication + Caching via moka::future::Cache)
    ↓
KeystoneAuthMiddleware (Tower layer, intercepts HTTP requests)
    ↓ (task-local: KEYSTONE_CREDENTIALS)
IAMAuth → check_key_valid (Authorization)
    ↓
RustFS Credentials (access_key starts with "keystone:")
```
#### Authentication Flow
**Request with `X-Auth-Token` header:**
1. **Middleware intercepts:** `KeystoneAuthMiddleware` extracts `X-Auth-Token` header
2. **Cache check:** Token cache hit → return cached credentials (~1-2ms)
3. **Token validation:** Cache miss → `KeystoneClient.validate_token()` → `GET /v3/auth/tokens` with `X-Auth-Token` and `X-Subject-Token` headers
4. **Token parsing:** Parse `KeystoneToken` (user_id, username, project_id, project_name, domain, roles, expires_at)
5. **Credential mapping:** Convert to `Credentials` struct:
- `access_key`: `keystone:<user_id>` (special prefix identifies Keystone users)
- `secret_key`: `""` (empty — bypasses AWS SigV4 verification)
- `session_token`: the Keystone token string
- `parent_user`: Keystone username
- `groups`: roles list
- `claims`: JSON map with `keystone_user_id`, `keystone_project_id`, `keystone_roles`, `auth_source: "keystone"`
6. **Task-local storage:** Store credentials in `KEYSTONE_CREDENTIALS` task-local (async-scoped to request)
7. **Auth bypass:** IAMAuth detects `keystone:` prefix → returns empty secret key, bypassing SigV4
8. **Authorization:** `check_key_valid()` retrieves credentials from task-local storage
9. **Role check:** `admin` or `reseller_admin` roles → `is_owner=true`; other roles → `is_owner=false`
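Steps 5 and 9 of the flow above can be sketched as a pure mapping function. `KeystoneToken`, `MappedCredentials`, and `map_token` below are simplified, hypothetical types written for this document; they omit claims, expiry, and caching, and are not the structs from the `keystone` crate.
```rust
/// Subset of the parsed Keystone token fields used by this sketch.
pub struct KeystoneToken {
    pub user_id: String,
    pub username: String,
    pub roles: Vec<String>,
    pub token: String,
}

/// Simplified view of the credentials produced in step 5.
pub struct MappedCredentials {
    pub access_key: String,
    pub secret_key: String,
    pub session_token: String,
    pub parent_user: String,
    pub groups: Vec<String>,
    pub is_owner: bool,
}

/// Prefix that marks an access key as Keystone-backed.
pub const KEYSTONE_PREFIX: &str = "keystone:";

pub fn map_token(token: &KeystoneToken) -> MappedCredentials {
    // Step 9: only these two roles grant owner status.
    let is_owner = token
        .roles
        .iter()
        .any(|role| role == "admin" || role == "reseller_admin");
    MappedCredentials {
        access_key: format!("{}{}", KEYSTONE_PREFIX, token.user_id),
        // Empty secret: SigV4 verification is bypassed for Keystone users.
        secret_key: String::new(),
        session_token: token.token.clone(),
        parent_user: token.username.clone(),
        groups: token.roles.clone(),
        is_owner,
    }
}

/// Step 7: detects Keystone users by access-key prefix.
pub fn is_keystone_key(access_key: &str) -> bool {
    access_key.starts_with(KEYSTONE_PREFIX)
}
```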
**Request without `X-Auth-Token`:**
1. Middleware passes through unchanged
2. Standard AWS SigV4 authentication proceeds
3. IAM validation as normal
**Invalid token:**
1. Middleware returns `401 Unauthorized` immediately with XML error body
2. **No fallback** to standard S3 auth
#### EC2 Credentials
RustFS also supports Keystone EC2 credentials for S3 API compatibility:
- `POST /v3/ec2tokens` with `{access, signature, data}` validates EC2-style credentials
- `GET /v3/users/{user_id}/credentials/OS-EC2` lists EC2 credentials for a user
- Access key format: `user_id:project_id` or `user_id`
#### Role Mapping (Keystone → RustFS)
| Keystone Role | RustFS Policy | Permissions |
|---------------|---------------|-------------|
| `admin` | AdminPolicy | Full access (`s3:*`) |
| `Admin` | AdminPolicy | Full access |
| `Member` | ReadWritePolicy | Read/write |
| `_member_` | ReadOnlyPolicy | Read-only |
| `ResellerAdmin` | AdminPolicy | Full access |
| `SwiftOperator` | ReadWritePolicy | Read/write |
| `objectstore:admin` | AdminPolicy | Full access |
| `objectstore:creator` | ReadWritePolicy | Read/write |
Custom role mappings can be added programmatically via `KeystoneIdentityMapper::add_role_mapping()`.
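As a rough illustration of the table above, a role mapper could look like the sketch below. The `RoleMapper` and `Policy` types are hypothetical and are not the `KeystoneIdentityMapper` API; in particular, resolving multiple roles to the strongest policy is an assumption made for this example, not documented RustFS behavior.

```rust
use std::collections::HashMap;

/// Hypothetical policy levels, ordered weakest to strongest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Policy {
    ReadOnly,
    ReadWrite,
    Admin,
}

/// Hypothetical mapper holding the role table.
pub struct RoleMapper {
    mappings: HashMap<String, Policy>,
}

impl RoleMapper {
    /// Builds the mapper with the default rows from the table.
    pub fn with_defaults() -> Self {
        let defaults = [
            ("admin", Policy::Admin),
            ("Admin", Policy::Admin),
            ("Member", Policy::ReadWrite),
            ("_member_", Policy::ReadOnly),
            ("ResellerAdmin", Policy::Admin),
            ("SwiftOperator", Policy::ReadWrite),
            ("objectstore:admin", Policy::Admin),
            ("objectstore:creator", Policy::ReadWrite),
        ];
        let mappings = defaults
            .iter()
            .map(|(role, policy)| (role.to_string(), *policy))
            .collect();
        Self { mappings }
    }

    /// Adds or overrides a custom mapping.
    pub fn add_role_mapping(&mut self, role: &str, policy: Policy) {
        self.mappings.insert(role.to_string(), policy);
    }

    /// Returns the strongest policy among the given roles (assumption),
    /// or `None` when no role is mapped.
    pub fn resolve(&self, roles: &[&str]) -> Option<Policy> {
        roles
            .iter()
            .filter_map(|role| self.mappings.get(*role).copied())
            .max()
    }
}
```

Note that lookups are case-sensitive, which is why the table lists `admin` and `Admin` as separate rows.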
#### Multi-Tenancy
When `RUSTFS_KEYSTONE_TENANT_PREFIX=true`:
- Bucket creation: `mybucket` → stored as `project_id:mybucket`
- Bucket listing: filtered by project_id
- Access control: users can only access their project's buckets
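The prefixing rule can be sketched as two small helpers. The function names are hypothetical and the sketch assumes the `project_id:bucket` format shown above is applied literally; how RustFS implements this internally is not confirmed here.

```rust
/// Returns the name under which a bucket is stored.
/// With tenant prefixing enabled, `mybucket` becomes `project_id:mybucket`.
pub fn stored_bucket_name(project_id: &str, bucket: &str, tenant_prefix: bool) -> String {
    if tenant_prefix {
        format!("{project_id}:{bucket}")
    } else {
        bucket.to_string()
    }
}

/// Filters stored bucket names down to those owned by `project_id`
/// and strips the prefix so callers see the original bucket names.
pub fn visible_buckets<'a>(stored: &'a [String], project_id: &str) -> Vec<&'a str> {
    let prefix = format!("{project_id}:");
    stored
        .iter()
        .filter_map(|name| name.strip_prefix(prefix.as_str()))
        .collect()
}
```

Matching on the prefix including the trailing `:` avoids treating project `proj1` as the owner of buckets that belong to `proj10`.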
---
## 4. Authentication Model — Complete Reference
### Credentials Struct
The core `Credentials` struct (in `rustfs-credentials`):
```rust
pub struct Credentials {
pub access_key: String, // S3 access key (or "keystone:<user_id>")
pub secret_key: String, // S3 secret key (empty for Keystone)
pub session_token: String, // STS session token / Keystone token
pub expiration: Option<OffsetDateTime>, // Token expiration
pub status: String, // "active" or "off"
pub parent_user: String, // Parent user for STS/service accounts
pub groups: Option<Vec<String>>, // Group membership
pub claims: Option<HashMap<String, Value>>, // JWT/Keystone claims
pub name: Option<String>, // Human-readable name
pub description: Option<String>,
}
```
Key methods:
- `is_expired()` — checks if the credential's expiration has passed
- `is_temp()` — true if `session_token` is non-empty and not expired
- `is_service_account()` — true if claims contain `sa-policy` key and `parent_user` is non-empty
- `is_valid()` — access_key >= 3 chars, secret_key >= 8 chars, not expired, status != "off"
- Default credentials: `rustfsadmin` / `rustfsadmin` (env vars: `RUSTFS_ACCESS_KEY` / `RUSTFS_SECRET_KEY`)
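The validity rules listed above can be restated as code. This is a minimal sketch under stated assumptions: `Creds` is a hypothetical reduced struct, `std::time::SystemTime` replaces `OffsetDateTime` to keep the example dependency-free, and the current time is passed in explicitly so the checks are deterministic.

```rust
use std::time::SystemTime;

/// Hypothetical reduced form of `Credentials` for illustrating the checks.
#[derive(Debug, Clone)]
pub struct Creds {
    pub access_key: String,
    pub secret_key: String,
    pub session_token: String,
    pub expiration: Option<SystemTime>,
    pub status: String,
}

impl Creds {
    /// Expired when an expiration is set and it is not in the future.
    pub fn is_expired(&self, now: SystemTime) -> bool {
        matches!(self.expiration, Some(exp) if exp <= now)
    }

    /// Temporary when a session token is present and not yet expired.
    pub fn is_temp(&self, now: SystemTime) -> bool {
        !self.session_token.is_empty() && !self.is_expired(now)
    }

    /// Valid when key lengths meet the minimums, the credential is not
    /// expired, and the status is not "off".
    pub fn is_valid(&self, now: SystemTime) -> bool {
        self.access_key.len() >= 3
            && self.secret_key.len() >= 8
            && !self.is_expired(now)
            && self.status != "off"
    }
}
```

Under these rules a credential with an empty `secret_key` fails `is_valid()`, which is consistent with Keystone-mapped credentials being authorized through a separate path.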
### IAM System
The IAM system (`rustfs-iam`) manages:
- **Users and groups** with RBAC
- **Service accounts** and API key authentication
- **Policy engine** with fine-grained S3-style permissions
- **LDAP/Active Directory** integration
- **Session management** and token validation
- **OIDC integration** (full OpenID Connect with PKCE)
The IAM system is initialized as a singleton (`IAM_SYS`) backed by an `ObjectStore` (persisted in the S3 storage itself). Lookups go through `IamSys::check_key(access_key)` which loads from cache or disk.
### OIDC Support
RustFS has comprehensive OIDC support (`rustfs-iam` → `oidc.rs`):
**Configuration (environment variables):**
- `RUSTFS_IDENTITY_OPENID_ENABLE=on`
- `RUSTFS_IDENTITY_OPENID_CONFIG_URL` — OIDC discovery URL
- `RUSTFS_IDENTITY_OPENID_CLIENT_ID` — OAuth2 client ID
- `RUSTFS_IDENTITY_OPENID_CLIENT_SECRET` — OAuth2 client secret
- `RUSTFS_IDENTITY_OPENID_SCOPES` — comma-separated scopes (default: `openid,profile,email`)
- `RUSTFS_IDENTITY_OPENID_GROUPS_CLAIM` — claim for group membership
- `RUSTFS_IDENTITY_OPENID_ROLES_CLAIM` — claim for role mapping (Microsoft Entra ID app roles)
- `RUSTFS_IDENTITY_OPENID_CLAIM_NAME` — primary claim for policy mapping
- `RUSTFS_IDENTITY_OPENID_CLAIM_PREFIX` — prefix for claim-to-policy mapping
- `RUSTFS_IDENTITY_OPENID_REDIRECT_URI` — callback URL
- `RUSTFS_IDENTITY_OPENID_REDIRECT_URI_DYNAMIC` — allow dynamic redirect URIs
**Features:**
- Authorization Code flow with PKCE
- OIDC discovery and JWKS auto-refresh
- Multiple OIDC providers (suffixed env vars like `_PRIMARY`, `_SECONDARY`)
- ID token verification (signature, issuer, audience, expiry)
- `AssumeRoleWithWebIdentity` flow (JWT directly, no browser)
- Roles and groups claim mapping to RustFS IAM policies
- Provider-specific configuration (Microsoft Entra ID roles claim support)
**OIDC Claims → RustFS Policy Mapping:**
```json
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": ["admin:*"],
"Resource": ["arn:aws:s3:::*"],
"Condition": {
"ForAnyValue:StringEquals": {
"jwt:roles": ["RustFS.ConsoleAdmin"]
}
}
}]
}
```
### RPC Authentication
RustFS uses a derived RPC secret for inter-node communication:
- Environment variable: `RUSTFS_RPC_SECRET` (explicit) or derived from `access_key + secret_key` via HMAC-SHA256
- Uses a `0xFFFFFFFFFFFFFFFF` mask for the signing context
- Base64url-encoded (no padding) output
---
## 5. Docker Deployment
### Simple Deployment
```yaml
# docker-compose-simple.yml
services:
rustfs:
image: rustfs/rustfs:latest
ports:
- "9000:9000" # S3 API
- "9001:9001" # Console
environment:
- RUSTFS_VOLUMES=/data/rustfs{0...3}
- RUSTFS_ADDRESS=0.0.0.0:9000
- RUSTFS_CONSOLE_ADDRESS=0.0.0.0:9001
- RUSTFS_ACCESS_KEY=rustfsadmin
- RUSTFS_SECRET_KEY=rustfsadmin
- RUSTFS_OBS_LOGGER_LEVEL=info
volumes:
- rustfs_data_0:/data/rustfs0
- rustfs_data_1:/data/rustfs1
- rustfs_data_2:/data/rustfs2
- rustfs_data_3:/data/rustfs3
```
### Full Deployment (with Observability)
```yaml
# docker-compose.yml (with --profile observability)
services:
rustfs:
# ... same as above, plus:
- RUSTFS_OBS_ENDPOINT=http://otel-collector:4318
otel-collector: # OpenTelemetry collector
tempo: # Distributed tracing
jaeger: # Jaeger UI
prometheus: # Metrics
loki: # Logs
grafana: # Dashboards
nginx: # Reverse proxy (optional, --profile proxy)
```
### Dockerfile
- Base: Alpine 3.23.4
- Runs as non-root user `rustfs` (UID/GID 10001:10001)
- Single binary: `/usr/bin/rustfs`
- Entrypoint: `/entrypoint.sh` (processes volumes, log dirs, default credential warnings)
- Health check: HTTP/HTTPS `/health` on port 9000, `/rustfs/console/health` on 9001
- Supports TLS via `RUSTFS_TLS_PATH=/opt/tls` with `rustfs_cert.pem` + `rustfs_key.pem` + optional `ca.crt`
### Keystone-Enabled Deployment
```bash
docker run -d \
-p 9000:9000 -p 9001:9001 \
-e RUSTFS_ACCESS_KEY=admin \
-e RUSTFS_SECRET_KEY=adminsecret \
-e RUSTFS_KEYSTONE_ENABLE=true \
-e RUSTFS_KEYSTONE_AUTH_URL=http://keystone:5000 \
-e RUSTFS_KEYSTONE_VERSION=v3 \
-e RUSTFS_KEYSTONE_ADMIN_USER=admin \
-e RUSTFS_KEYSTONE_ADMIN_PASSWORD=secret \
-e RUSTFS_KEYSTONE_ADMIN_PROJECT=admin \
-e RUSTFS_KEYSTONE_ADMIN_DOMAIN=Default \
-v /data:/data \
rustfs/rustfs:latest
```
### Webhook Notification
```bash
docker run -d --name rustfs -p 9000:9000 \
-e RUSTFS_NOTIFY_ENABLE=true \
-e RUSTFS_NOTIFY_WEBHOOK_ENABLE_PRIMARY=on \
-e RUSTFS_NOTIFY_WEBHOOK_ENDPOINT_PRIMARY=http://host:3020/webhook \
-e RUSTFS_NOTIFY_WEBHOOK_QUEUE_DIR_PRIMARY=/tmp/rustfs-events \
rustfs/rustfs:latest
```
---
## 6. SDK/Client Libraries — Rust S3 Clients
### aws-sdk-s3 (Official AWS SDK for Rust)
RustFS itself uses `aws-sdk-s3` (v1.135.0) as a dependency — this is the most mature Rust S3 client:
```toml
aws-sdk-s3 = { version = "1.135.0", default-features = false, features = ["sigv4a", "default-https-client", "rt-tokio"] }
aws-config = { version = "1.8.18" }
aws-credential-types = { version = "1.2.14" }
```
**Pros:** Full S3 API coverage, SigV4/SigV4a signing, async, production-tested
**Cons:** Heavy dependency (pulls in significant AWS SDK surface area), AWS-centric abstractions
### s3s (RustFS's own S3 framework)
RustFS uses a custom `s3s` crate (`https://github.com/rustfs/s3s`, with `minio` feature):
```toml
s3s = { git = "https://github.com/rustfs/s3s", rev = "507e1312b211c3ddc214b03875d6fabd15d22ed5", features = ["minio"] }
```
This provides S3 request/response types, routing, and the `S3Auth` trait used by RustFS's `IAMAuth`.
### rust-s3 (Community)
Not used by RustFS, but worth noting as an alternative:
- Crate: `rust-s3` / `s3`
- Simpler API than aws-sdk-s3
- Supports MinIO-compatible endpoints
- Less complete S3 operation coverage
### Recommendation for alknet
For alknet's S3 adapter:
- **Internal use**: aws-sdk-s3, configured with custom endpoint pointing to rustfs
- **Request signing**: If building a lightweight adapter, extract just the signing logic from `rustfs-signer` or use the `aws-sigv4` crate directly
- **The CredentialSet::S3AccessKey variant** (from alknet's credential-provider.md) maps directly to RustFS's `access_key + secret_key` pair; no additional transformation needed
---
## 7. Relevance to Alknet
### 7.1 RustFS as an Internal Object Store Behind Alknet's HTTP Interface
**Architecture:**
```
Client (any S3 SDK)
→ Alknet HTTP adapter (port 443/80 with HTTPS termination)
→ RustFS (port 9000, Docker network, not exposed externally)
→ Disk storage (/data volumes)
```
**Deployment pattern:** RustFS runs as a Docker container on the same Docker network as alknet, listening only on the internal network. Alknet's HTTP interface reverse-proxies S3 API calls to rustfs.
**Reverse proxy considerations:**
- Alknet would forward `Host`, `Authorization`, `X-Auth-Token`, `X-Amz-*` headers unchanged
- RustFS needs the real client IP for S3 policy `SourceIp` conditions; alknet should set `X-Forwarded-For` and configure `RUSTFS_TRUSTED_PROXIES` or use rustfs's `trusted-proxies` crate
- Health check: Alknet proxies `/health` → rustfs:9000
- RustFS supports `X-Forwarded-Proto` for TLS offloading via its `trusted-proxies` crate
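A sketch of how the forwarding headers might be computed before a request is passed to rustfs. The helper names are hypothetical, and appending to an existing `X-Forwarded-For` chain is an assumption; an edge proxy that does not trust upstream hops may prefer to overwrite the header instead.

```rust
use std::net::IpAddr;

/// Builds the `X-Forwarded-For` value: appends the directly connected
/// peer to any existing chain, or starts a new chain.
pub fn forwarded_for(existing: Option<&str>, peer: IpAddr) -> String {
    match existing.map(str::trim).filter(|value| !value.is_empty()) {
        Some(chain) => format!("{chain}, {peer}"),
        None => peer.to_string(),
    }
}

/// Builds the `X-Forwarded-Proto` value from whether TLS was terminated
/// at the proxy.
pub fn forwarded_proto(tls_terminated: bool) -> &'static str {
    if tls_terminated {
        "https"
    } else {
        "http"
    }
}
```

Whichever variant is used, rustfs must be configured to trust only the proxy's address, otherwise clients could spoof `SourceIp` conditions by sending their own header.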
**Why behind alknet rather than standalone:**
1. Unified TLS termination at alknet
2. alknet can inject auth headers (e.g., OIDC tokens) before forwarding
3. alknet can enforce rate limiting and access control
4. Network isolation — rustfs only accessible via alknet
**Webhook integration:** RustFS can POST events to alknet via its notification system:
```bash
RUSTFS_NOTIFY_WEBHOOK_ENDPOINT_PRIMARY=http://alknet:3020/webhook
```
### 7.2 Mapping S3 Auth to Alknet's CredentialProvider/CredentialSet
The alknet `CredentialSet` enum directly models the S3 auth pattern:
| RustFS Auth Method | Alknet CredentialSet Variant | Mapping |
|---|---|---|
| Access key + secret key (SigV4) | `S3AccessKey { access_key, secret_key, session_token }` | Direct 1:1 mapping; access_key and secret_key are the S3 credential pair |
| Keystone X-Auth-Token | `OidcToken { access_token, ... }` | Keystone token → OIDC access_token; expires_at maps to Keystone token expiration |
| STS AssumeRole session | `S3AccessKey { ..., session_token: Some(...) }` | STS temporary credentials with session token |
| OIDC (browser flow) | `OidcToken { access_token, refresh_token, expires_at }` | Direct mapping |
| Admin default credentials | `S3AccessKey { access_key: "rustfsadmin", secret_key: "rustfsadmin" }` | Service-level credential |
**S3 Request Signing (Phase C in credential-provider.md):**
The `S3AccessKey` variant contains the raw credential data. The signing computation itself is separate — it's a utility function `s3_sign(credential: &S3AccessKey, request: &HttpRequest) -> SignedRequest` that should live in a shared `alknet-s3` utility crate, not in `CredentialSet`. This matches OpenQ-04 in the credential-provider doc.
**For alknet's `S3CredentialManager`:**
```rust
// Sketch only: `CredentialManager`, `CredentialSet`, `Identity`,
// `check_sts_expiration`, and `now` are assumed to be defined elsewhere.
impl CredentialManager for S3CredentialManager {
    fn refresh(&self, current: &CredentialSet) -> Option<CredentialSet> {
        // If we have an STS session token, check expiration
        // and re-AssumeRole if needed
        todo!("re-AssumeRole when the STS session token has expired")
    }
    fn is_expired(&self, current: &CredentialSet) -> bool {
        match current {
            CredentialSet::S3AccessKey { session_token: Some(t), .. }
                if !t.is_empty() => check_sts_expiration(t),
            CredentialSet::OidcToken { expires_at: Some(ts), .. }
                => *ts < now(),
            _ => false, // Static keys don't expire
        }
    }
    fn provision(&self, identity: &Identity) -> Option<CredentialSet> {
        // Create a rustfs IAM access key for this alknet identity
        // via the rustfs admin API
        todo!("call the rustfs admin API to create an access key")
    }
}
```
### 7.3 Alknet as an OIDC Provider for RustFS (Phase D)
This is the most strategically important integration point. RustFS already has complete OIDC support — it just needs an OIDC provider to trust.
**How it would work:**
1. **alknet exposes OIDC endpoints** (via call protocol HTTP adapter or a dedicated `/oidc/` path):
- `GET /.well-known/openid-configuration` — discovery document
- `GET /oidc/authorize` — authorization endpoint
- `POST /oidc/token` — token exchange
- `GET /oidc/userinfo` — user info
- `GET /oidc/jwks` — JSON Web Key Set
- `GET /oidc/logout` — RP-initiated logout
2. **alknet's Identity maps to OIDC claims:**
- `sub` → `Identity.id` (SSH fingerprint or account UUID)
- `email` → from account metadata (if available)
- `username` → display name or `Identity.id`
- `groups` → `Identity.scopes` (e.g., `["s3:admin", "s3:readwrite"]`)
- `roles` → derived from scopes (e.g., `scope "s3:admin"` → role `"admin"`)
3. **RustFS configuration** (pointing at alknet):
```bash
RUSTFS_IDENTITY_OPENID_ENABLE=on
RUSTFS_IDENTITY_OPENID_CONFIG_URL=https://alknet:443/.well-known/openid-configuration
RUSTFS_IDENTITY_OPENID_CLIENT_ID=alknet-rustfs-client
RUSTFS_IDENTITY_OPENID_CLIENT_SECRET=<auto-generated>
RUSTFS_IDENTITY_OPENID_SCOPES=openid,profile,email,groups
RUSTFS_IDENTITY_OPENID_GROUPS_CLAIM=groups
RUSTFS_IDENTITY_OPENID_ROLES_CLAIM=roles
```
4. **Authentication flow:**
- User connects to alknet (via SSH/WebTransport/HTTP)
- alknet resolves identity → `Identity { id, scopes, resources }`
- User requests access to rustfs console
- Browser redirects to alknet's OIDC authorize endpoint
- alknet issues authorization code → token exchange → ID token
- RustFS verifies the ID token using alknet's JWKS endpoint
- RustFS maps `groups` and `roles` claims to IAM policies
5. **For `AssumeRoleWithWebIdentity` (programmatic access):**
- alknet issues a JWT directly to the client
- Client presents JWT to RustFS via `Action=AssumeRoleWithWebIdentity`
- RustFS calls `OidcSys::verify_web_identity_token()` which:
- Decodes JWT payload to get `iss` claim
- Finds matching OIDC provider (alknet)
- Verifies signature, issuer, audience, expiry
- Extracts claims → maps to RustFS policies
**This eliminates stored credentials entirely** — alknet identities authenticate directly to rustfs via OIDC, no `S3AccessKey` needed.
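The identity-to-claims mapping from step 2 could be sketched as below. `AlknetIdentity`, `OidcClaims`, and `claims_for` are hypothetical names, and deriving roles by stripping an `s3:` prefix from scopes is one possible reading of the `scope "s3:admin"` → role `"admin"` example, not a settled design.

```rust
/// Hypothetical reduced alknet identity.
pub struct AlknetIdentity {
    pub id: String,
    pub scopes: Vec<String>,
}

/// Hypothetical subset of the claims placed in the ID token.
#[derive(Debug, PartialEq)]
pub struct OidcClaims {
    pub sub: String,
    pub username: String,
    pub groups: Vec<String>,
    pub roles: Vec<String>,
}

/// Maps an identity to claims: `sub` is the identity id, `groups` mirrors
/// the scopes, and `roles` are derived from `s3:`-prefixed scopes.
pub fn claims_for(identity: &AlknetIdentity, display_name: Option<&str>) -> OidcClaims {
    let mut roles: Vec<String> = identity
        .scopes
        .iter()
        .filter_map(|scope| scope.strip_prefix("s3:"))
        .map(str::to_string)
        .collect();
    roles.sort();
    roles.dedup();
    OidcClaims {
        sub: identity.id.clone(),
        // Fall back to the identity id when no display name is known.
        username: display_name.unwrap_or(identity.id.as_str()).to_string(),
        groups: identity.scopes.clone(),
        roles,
    }
}
```

Scopes outside the `s3:` namespace still appear in `groups`, so a rustfs policy can match on either claim depending on which one `RUSTFS_IDENTITY_OPENID_GROUPS_CLAIM` and `RUSTFS_IDENTITY_OPENID_ROLES_CLAIM` point at.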
### 7.4 Alknet RustFS Adapter Architecture
An alknet HTTP/HTTPS adapter for the S3 API would look like:
```
alknet HTTP adapter
├── Route: /s3/* → reverse proxy to rustfs:9000
│ ├── Preserve all S3 headers (Authorization, X-Amz-*, X-Auth-Token, Content-*)
│ ├── Set X-Forwarded-For, X-Forwarded-Proto
│ ├── Optionally inject X-Auth-Token from alknet Identity
│ └── Response streaming (for large object downloads)
├── Route: /s3/health → rustfs:9000/health (health check)
└── Route: /s3/admin/* → rustfs:9000/minio/* (admin API)
```
**Key considerations:**
- S3 requests can be very large (multipart uploads, 5TB+ objects). The adapter must support streaming both request and response bodies without buffering.
- `X-Forwarded-For` must be set so rustfs can evaluate `SourceIp` condition keys in bucket policies.
- RustFS already handles `X-Forwarded-Proto` for HTTPS offloading via its `trusted-proxies` crate.
- For OIDC integration, the adapter doesn't need to modify auth headers — rustfs handles OIDC token validation itself when pointed at alknet's OIDC endpoint.
**Alknet's `OpenAPIServiceRegistry` integration:**
Since rustfs exposes an S3 API, alknet could auto-register S3 operations via an OpenAPI spec or hardcoded operation specs:
```rust
// In alknet's service registry:
let s3_ops = FromOpenAPI(s3_openapi_spec, config);
// Where config.auth = CredentialSet::S3AccessKey { access_key, secret_key, session_token: None }
// Or: config.auth = CredentialSet::OidcToken { access_token, refresh_token, expires_at }
```
---
## 8. Key RustFS Source Files for Reference
| File | Purpose |
|------|---------|
| `crates/credentials/src/credentials.rs` | `Credentials` struct, global credentials, key generation |
| `crates/credentials/src/constants.rs` | Default access/secret keys, IAM policy constants |
| `crates/signer/` | AWS Signature V4 implementation |
| `crates/keystone/src/config.rs` | Keystone configuration from env vars |
| `crates/keystone/src/client.rs` | Keystone v3 API client (token validation, EC2 creds, admin auth) |
| `crates/keystone/src/auth.rs` | `KeystoneAuthProvider` (token → `Credentials` mapping) |
| `crates/keystone/src/middleware.rs` | Tower middleware extracting `X-Auth-Token`, task-local storage |
| `crates/keystone/src/identity.rs` | `KeystoneIdentityMapper` (role → policy, tenant prefix) |
| `crates/iam/src/oidc.rs` | Complete OIDC system (discovery, PKCE, token exchange, JWT verification) |
| `crates/iam/src/sys.rs` | `IamSys` (IAM singleton, user/key management) |
| `crates/policy/` | S3 bucket/IAM policy evaluation engine |
| `rustfs/src/auth.rs` | `IAMAuth`, `check_key_valid`, auth type detection, condition values |
| `rustfs/src/server/` | HTTP server, TLS, routing, middleware stack |
| `crates/protocols/src/swift/` | OpenStack Swift API implementation |
| `Dockerfile` / `docker-compose-simple.yml` | Deployment configuration |
---
## 9. Configuration Quick Reference
### RustFS Docker Environment Variables (Auth-Relevant)
| Variable | Description | Default |
|----------|-------------|---------|
| `RUSTFS_ACCESS_KEY` | Root access key | `rustfsadmin` |
| `RUSTFS_SECRET_KEY` | Root secret key | `rustfsadmin` |
| `RUSTFS_ADDRESS` | S3 API listen address | `0.0.0.0:9000` |
| `RUSTFS_CONSOLE_ADDRESS` | Console listen address | `0.0.0.0:9001` |
| `RUSTFS_CONSOLE_ENABLE` | Enable web console | `true` |
| `RUSTFS_TLS_PATH` | TLS certificate directory | (none, HTTP) |
| `RUSTFS_KEYSTONE_ENABLE` | Enable Keystone auth | `false` |
| `RUSTFS_KEYSTONE_AUTH_URL` | Keystone v3 endpoint | (required if enabled) |
| `RUSTFS_KEYSTONE_VERSION` | Keystone API version | `v3` |
| `RUSTFS_KEYSTONE_ADMIN_USER` | Keystone admin user | (optional) |
| `RUSTFS_KEYSTONE_ADMIN_PASSWORD` | Keystone admin password | (optional) |
| `RUSTFS_KEYSTONE_ADMIN_PROJECT` | Keystone admin project | (optional) |
| `RUSTFS_KEYSTONE_ADMIN_DOMAIN` | Keystone admin domain | `Default` |
| `RUSTFS_KEYSTONE_VERIFY_SSL` | Verify Keystone TLS | `true` |
| `RUSTFS_KEYSTONE_CACHE_SIZE` | Token cache size | `10000` |
| `RUSTFS_KEYSTONE_CACHE_TTL` | Token cache TTL (sec) | `300` |
| `RUSTFS_KEYSTONE_TENANT_PREFIX` | Enable tenant prefixing | `true` |
| `RUSTFS_IDENTITY_OPENID_ENABLE` | Enable OIDC | `off` |
| `RUSTFS_IDENTITY_OPENID_CONFIG_URL` | OIDC discovery URL | (required) |
| `RUSTFS_IDENTITY_OPENID_CLIENT_ID` | OIDC client ID | (required) |
| `RUSTFS_IDENTITY_OPENID_CLIENT_SECRET` | OIDC client secret | (optional) |
| `RUSTFS_IDENTITY_OPENID_SCOPES` | OIDC scopes | `openid,profile,email` |
| `RUSTFS_IDENTITY_OPENID_GROUPS_CLAIM` | Groups claim name | `groups` |
| `RUSTFS_IDENTITY_OPENID_ROLES_CLAIM` | Roles claim name | (empty, opt-in) |
| `RUSTFS_RPC_SECRET` | Inter-node RPC auth secret | (derived from keys) |
| `RUSTFS_NOTIFY_WEBHOOK_ENABLE_PRIMARY` | Enable webhook notifications | `off` |
| `RUSTFS_NOTIFY_WEBHOOK_ENDPOINT_PRIMARY` | Webhook URL | (required) |
---
## 10. Summary of Integration Paths
### Phase A (Immediate): Static S3 Credentials
- Deploy rustfs as a Docker service next to alknet
- Configure `RUSTFS_ACCESS_KEY` and `RUSTFS_SECRET_KEY`
- alknet stores these as `CredentialSet::S3AccessKey`
- alknet's HTTP adapter reverse-proxies S3 calls to rustfs
- Use `aws-sdk-s3` or `rust-s3` as the client library
**Effort:** Low. No auth changes in either system.
### Phase B: OIDC via External Provider
- Configure rustfs `RUSTFS_IDENTITY_OPENID_*` to point at an external OIDC provider (e.g., Keycloak, Authentik, Microsoft Entra ID)
- alknet can still manage its own auth independently
- Both systems trust the same OIDC provider
**Effort:** Low. Configuration-only change in rustfs.
### Phase C: Managed Credentials
- alknet provisions rustfs access keys via admin API (`/minio/` endpoints)
- `S3CredentialManager` handles session token rotation
- Identity-bound credentials: alknet creates per-user access keys in rustfs IAM
**Effort:** Medium. Requires admin API client, credential lifecycle management.
### Phase D: Alknet as OIDC Provider (Target State)
- alknet exposes OIDC endpoints (`.well-known/openid-configuration`, `/oidc/authorize`, `/oidc/token`, `/oidc/jwks`)
- rustfs trusts alknet as its OIDC provider
- `Identity.scopes` maps to rustfs IAM policies (e.g., `s3:admin` → admin policy)
- No stored S3 credentials — users authenticate directly via alknet identity
- `AssumeRoleWithWebIdentity` for programmatic access
**Effort:** High. Requires building OIDC authorization server in alknet. This is the most elegant but most complex path.
---
## References
- [RustFS GitHub](https://github.com/rustfs/rustfs) — v1.0.0-beta.7
- [RustFS Documentation](https://docs.rustfs.com)
- [RustFS Keystone README](file:///workspace/rustfs/crates/keystone/README.md) — comprehensive Keystone integration docs
- [RustFS OIDC implementation](file:///workspace/rustfs/crates/iam/src/oidc.rs) — full OIDC client with PKCE, discovery, JWKS refresh
- [RustFS auth.rs](file:///workspace/rustfs/rustfs/src/auth.rs) — IAMAuth, check_key_valid, auth type detection
- [alknet credential-provider.md](file:///workspace/@alkdev/alknet/docs/research/phase2/credential-provider.md) — alknet's outbound auth design
- [alknet identity.md](file:///workspace/@alkdev/alknet/docs/architecture/identity.md) — alknet's inbound auth design


@@ -1,808 +0,0 @@
# Alknet Services: irpc Service Architecture
> Status: Research / Draft
> Last updated: 2026-06-06
## Overview
Alknet uses an **irpc-based service layer** to decompose core responsibilities into independently testable, deployable, and replaceable components. Services communicate via irpc protocol enums that work both as in-process async boundaries (tokio channels) and cross-process/cross-network (QUIC streams via noq).
This document defines the service protocols and their relationships, following the head/worker terminology established in [core.md](core.md).
## Design Principles
### 1. Services are protocol enums
An irpc service is defined as a Rust enum annotated with `#[rpc_requests]`. The macro generates two versions:
- **Serializable** (`Request`): safe to encode with postcard, for remote communication
- **With channels** (`RequestWithChannels`): includes `oneshot::Sender` and `mpsc` channels, for local communication
Both versions use the same `Client<S>` type — the local/remote distinction is transparent at the call site.
### 2. Services are the async boundary
Instead of one giant `mpsc` message enum (the common anti-pattern described in the irpc documentation), each service has its own focused protocol. This keeps responsibilities clear and prevents the "god enum" problem.
### 3. Local-first, remote-capable
Every service can run locally (mpsc channels, zero serialization overhead) or remotely (QUIC streams, postcard serialization). The deployment choice doesn't affect the call sites. A single-node setup runs everything locally. A distributed setup runs auth and secrets on dedicated nodes.
### 4. Event boundary discipline
Per [event_source_types.md](/workspace/research/event_sourcing/event_source_types.md):
- **Honker streams** = domain events (internal to the owning service, for state reconstruction)
- **irpc service calls** = request-response between services (synchronous boundary within a node)
- **Call protocol EventEnvelope** = integration events (cross-node asynchronous boundary)
Domain events are projected to integration events when crossing service or node boundaries. Never publish domain events directly to other services.
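A projection step of this kind might look like the following sketch. The event types and the `storage.node.added` name are hypothetical; the point is only that internal details are dropped and purely internal events are not forwarded at all.

```rust
/// Hypothetical domain events, internal to the storage service.
#[derive(Debug, Clone)]
pub enum DomainEvent {
    NodeAdded { graph_id: String, key: String, row_id: u64 },
    IndexRebuilt { graph_id: String },
}

/// Hypothetical integration event exposed to other services and nodes.
#[derive(Debug, Clone, PartialEq)]
pub struct IntegrationEvent {
    pub event_type: String,
    pub subject: String,
}

/// Projects a domain event to an integration event. Internal fields such
/// as `row_id` are dropped, and purely internal events yield `None`.
pub fn project(event: &DomainEvent) -> Option<IntegrationEvent> {
    match event {
        DomainEvent::NodeAdded { graph_id, key, .. } => Some(IntegrationEvent {
            event_type: "storage.node.added".to_string(),
            subject: format!("{graph_id}/{key}"),
        }),
        DomainEvent::IndexRebuilt { .. } => None,
    }
}
```

Keeping the projection in one function makes the public event surface explicit: anything not returned from `project` never leaves the owning service.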
## Service Definitions
### AuthService
Verifies identities without holding all keys in memory.
```rust
use irpc::{rpc_requests, channel::{mpsc, oneshot}};
use serde::{Serialize, Deserialize};
#[rpc_requests(message = AuthMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum AuthProtocol {
#[rpc(tx=oneshot::Sender<AuthResult>)]
#[wrap(VerifyPubkey)]
VerifyPubkey {
fingerprint: String,
key_data: Vec<u8>,
},
#[rpc(tx=oneshot::Sender<AuthResult>)]
#[wrap(VerifyToken)]
VerifyToken {
token_bytes: Vec<u8>,
timestamp: u64,
},
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(ReloadKeys)]
ReloadKeys,
#[rpc(tx=oneshot::Sender<bool>)]
#[wrap(CheckAccess)]
CheckAccess {
identity: Identity,
operation: String,
},
}
#[derive(Debug, Serialize, Deserialize)]
enum AuthResult {
Ok(Identity),
Denied(String),
}
#[derive(Debug, Serialize, Deserialize)]
struct Identity {
node_id: String,
fingerprint: String,
scopes: Vec<String>,
}
```
**Backends:**
| Mode | Backend | When to use |
|------|---------|-------------|
| Minimal | `ArcSwap<DynamicConfig>` with all keys in memory | CLI, single-node, few users |
| SQLite | Query `peer_credentials` / `api_keys` on demand | Production, multi-user head nodes |
| Remote | Forward to dedicated auth service | Multi-head clusters, auth federation |
**Why this solves the scaling problem:** Instead of loading all keys into memory and swapping them atomically, the auth service queries SQLite per request. An LRU cache on hot fingerprints avoids repeated DB hits. Key revocations are propagated via honker stream notifications.
### SecretService
Derives keys from a master seed, encrypts/decrypts external credentials. The **only** component that holds the master seed phrase.
```rust
#[rpc_requests(message = SecretMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum SecretProtocol {
#[rpc(tx=oneshot::Sender<DerivedKey>)]
#[wrap(DeriveEd25519)]
DeriveEd25519 {
path: String, // e.g. "m/74'/0'/0'/0'"
},
#[rpc(tx=oneshot::Sender<DerivedKey>)]
#[wrap(DeriveEncryptionKey)]
DeriveEncryptionKey {
path: String, // e.g. "m/74'/2'/0'/0'"
},
#[rpc(tx=oneshot::Sender<DerivedKey>)]
#[wrap(DeriveEthereumKey)]
DeriveEthereumKey {
path: String, // e.g. "m/44'/60'/0'/0/0"
},
#[rpc(tx=oneshot::Sender<Vec<u8>>)]
#[wrap(DerivePassword)]
DerivePassword {
path: String,
length: usize,
},
#[rpc(tx=oneshot::Sender<EncryptedData>)]
#[wrap(Encrypt)]
Encrypt {
plaintext: String,
key_version: u32,
},
#[rpc(tx=oneshot::Sender<String>)]
#[wrap(Decrypt)]
Decrypt {
encrypted: EncryptedData,
},
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(Lock)]
Lock,
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(Unlock)]
Unlock {
passphrase: String,
},
}
#[derive(Debug, Serialize, Deserialize)]
struct DerivedKey {
key_type: KeyType,
private_key: Vec<u8>,
public_key: Vec<u8>,
}
#[derive(Debug, Serialize, Deserialize)]
enum KeyType {
Ed25519,
Aes256Gcm,
Secp256k1,
}
#[derive(Debug, Serialize, Deserialize)]
struct EncryptedData {
key_version: u32,
salt: String, // Base64-encoded
iv: String, // Base64-encoded
data: String, // Base64-encoded
}
```
**Security model:**
| State | What's in memory | What's on disk |
|-------|-----------------|---------------|
| Locked | Nothing | Encrypted database, derivation path metadata |
| Unlocked | Master seed in RAM | Same (seed is never persisted) |
| After use | Derived keys cached in RAM | Derivation paths only |
The seed phrase is entered once (at node startup or via `Unlock` call), held in memory, and never written to disk. Derived keys are computed on demand. The `Lock` call purges the seed and all cached derived keys from memory.
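The locked/unlocked lifecycle can be illustrated with a small state holder. This is a sketch under assumptions: `SecretState` is a hypothetical type, the derivation itself is passed in as a closure so no cryptography is implied, and overwriting bytes by hand is only a gesture at zeroization (a production implementation would more likely rely on a dedicated crate for that).

```rust
use std::collections::HashMap;

/// Hypothetical in-memory state behind `Lock` / `Unlock`.
pub struct SecretState {
    seed: Option<Vec<u8>>,
    cache: HashMap<String, Vec<u8>>,
}

impl SecretState {
    /// Starts in the locked state: nothing sensitive in memory.
    pub fn new() -> Self {
        Self { seed: None, cache: HashMap::new() }
    }

    /// Holds the master seed in RAM; it is never written to disk here.
    pub fn unlock(&mut self, seed: Vec<u8>) {
        self.seed = Some(seed);
    }

    pub fn is_unlocked(&self) -> bool {
        self.seed.is_some()
    }

    /// Purges the seed and every cached derived key.
    pub fn lock(&mut self) {
        if let Some(seed) = self.seed.as_mut() {
            seed.iter_mut().for_each(|byte| *byte = 0);
        }
        self.seed = None;
        for key in self.cache.values_mut() {
            key.iter_mut().for_each(|byte| *byte = 0);
        }
        self.cache.clear();
    }

    /// Derives (or returns the cached) key for `path`. Returns `None`
    /// while locked. `derive` stands in for the real derivation function.
    pub fn derive_with<F>(&mut self, path: &str, derive: F) -> Option<Vec<u8>>
    where
        F: FnOnce(&[u8], &str) -> Vec<u8>,
    {
        let seed = self.seed.as_deref()?;
        if let Some(hit) = self.cache.get(path) {
            return Some(hit.clone());
        }
        let key = derive(seed, path);
        self.cache.insert(path.to_string(), key.clone());
        Some(key)
    }
}
```

Because only derivation paths are persisted, relocking and unlocking with the same seed reproduces the same keys without any key material ever touching disk.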
**Derived key patterns (see [storage.md](storage.md) for derivation path conventions):**
- Identity keys: SLIP-0010 `m/74'/0'/0'/0'` → Ed25519 keypair for alknet authentication
- Encryption keys: SLIP-0010 `m/74'/2'/0'/0'` → AES-256-GCM key for external credential encryption
- Ethereum keys: BIP32 `m/44'/60'/0'/0/0` → secp256k1 keypair for smart contract signing
- Site passwords: BIP32 `m/74'/1'/0'/{hash}'` → deterministic password derivation (orbit-db-wallet pattern)
### ConfigService
Dynamic configuration reload. Wraps `ArcSwap<DynamicConfig>` for minimal deployments, or delegates to SQLite-backed storage for production.
```rust
#[rpc_requests(message = ConfigMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum ConfigProtocol {
#[rpc(tx=oneshot::Sender<ForwardingPolicy>)]
#[wrap(GetForwardingPolicy)]
GetForwardingPolicy,
#[rpc(tx=oneshot::Sender<RateLimitConfig>)]
#[wrap(GetRateLimits)]
GetRateLimits,
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(ReloadForwarding)]
ReloadForwarding {
policy: ForwardingPolicy,
},
#[rpc(tx=oneshot::Sender<()>)]
#[wrap(ReloadRateLimits)]
ReloadRateLimits {
limits: RateLimitConfig,
},
}
```
### StorageService
Graph CRUD operations, metagraph management, and honker event bridge. Wraps the `alknet-storage` crate.
```rust
#[rpc_requests(message = StorageMessage)]
#[derive(Debug, Serialize, Deserialize)]
enum StorageProtocol {
#[rpc(tx=oneshot::Sender<Graph>)]
#[wrap(CreateGraph)]
CreateGraph {
graph_type_id: String,
name: String,
},
#[rpc(tx=oneshot::Sender<Node>)]
#[wrap(AddNode)]
AddNode {
graph_id: String,
key: String,
attributes: serde_json::Value,
},
#[rpc(tx=oneshot::Sender<Node>)]
#[wrap(GetNode)]
GetNode {
graph_id: String,
key: String,
},
#[rpc(tx=mpsc::Sender<StorageEvent>)]
#[wrap(Subscribe)]
Subscribe {
stream_name: String,
},
}
```
The `Subscribe` variant uses server-streaming irpc — the client sends one request and receives multiple `StorageEvent` messages via `mpsc::Sender`. These are honker stream events projected into integration events.
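The one-request, many-responses shape can be shown without irpc at all. The sketch below uses `std::sync::mpsc` and a thread purely as a stand-in for the irpc channel and the service task; `subscribe` and this reduced `StorageEvent` are hypothetical and do not reflect the irpc API.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical reduced event delivered to subscribers.
#[derive(Debug, Clone, PartialEq)]
pub struct StorageEvent {
    pub stream_name: String,
    pub seq: u64,
}

/// Handles a single subscribe request by returning a receiver on which
/// the producer side sends `count` events, then closes the stream.
pub fn subscribe(stream_name: &str, count: u64) -> mpsc::Receiver<StorageEvent> {
    let (tx, rx) = mpsc::channel();
    let stream_name = stream_name.to_string();
    thread::spawn(move || {
        for seq in 0..count {
            let event = StorageEvent { stream_name: stream_name.clone(), seq };
            // Stop producing once the subscriber has dropped the receiver.
            if tx.send(event).is_err() {
                break;
            }
        }
        // Dropping `tx` here signals end-of-stream to the receiver.
    });
    rx
}
```

The consumer simply iterates the receiver until the sender is dropped, which mirrors how a client would drain a server-streaming response.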
## Operation Context and Handler Environment
The call protocol's `OperationSpec` defines *what* an operation looks like (name, namespace, input/output schemas, access control). But the handler that actually processes the call needs more than just `input` — it needs **context**: who made the call, what other operations it can invoke, and what identity it runs as.
This is the pattern established in `@alkdev/operations` and needs to map cleanly to the Rust implementation.
### OperationContext
Every handler receives an `OperationContext` alongside its input:
```rust
pub struct OperationContext {
pub request_id: String,
pub parent_request_id: Option<String>,
pub identity: Option<Identity>,
pub metadata: HashMap<String, Value>,
pub env: OperationEnv,
pub trusted: bool, // set by buildEnv(), not by callers
}
pub struct Identity {
pub id: String,
pub scopes: Vec<String>,
pub resources: Option<HashMap<String, Vec<String>>>,
}
```
Key fields:
- **`request_id`** / **`parent_request_id`**: Call tracing. A mutation that triggers events carries `parent_request_id` so the call graph can link them.
- **`identity`**: The authenticated identity making the call. Populated by the auth service from the call protocol's `call.requested` event. ACL checks use `identity.scopes` and `identity.resources` via the operation's `accessControl`.
- **`metadata`**: Arbitrary key-value context. Used for things like trace IDs, correlation headers, or feature flags.
- **`env`**: The **operation environment** — namespaced access to call other operations. This is the composition mechanism.
- **`trusted`**: Internal flag set by `buildEnv()`. When a handler calls another operation through `env`, the nested call is `trusted` (skips ACL checks). This prevents handlers from having to manage auth scope escalation themselves.
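How `parent_request_id` and `trusted` interact for nested calls can be sketched as follows. `Ctx`, `nested`, and `may_execute` are hypothetical simplifications (scopes only, no resources or metadata) and are not the `@alkdev/operations` API.

```rust
/// Hypothetical reduced operation context.
#[derive(Debug, Clone)]
pub struct Ctx {
    pub request_id: String,
    pub parent_request_id: Option<String>,
    pub scopes: Vec<String>,
    pub trusted: bool,
}

impl Ctx {
    /// Context for a call that arrives from outside: never trusted.
    pub fn external(request_id: &str, scopes: Vec<String>) -> Ctx {
        Ctx {
            request_id: request_id.to_string(),
            parent_request_id: None,
            scopes,
            trusted: false,
        }
    }

    /// Context for a call made by a handler through `env`: linked to the
    /// parent request and marked trusted so ACL checks are skipped.
    pub fn nested(&self, request_id: &str) -> Ctx {
        Ctx {
            request_id: request_id.to_string(),
            parent_request_id: Some(self.request_id.clone()),
            scopes: self.scopes.clone(),
            trusted: true,
        }
    }
}

/// Access check: trusted (nested) calls pass, external calls need the scope.
pub fn may_execute(ctx: &Ctx, required_scope: &str) -> bool {
    ctx.trusted || ctx.scopes.iter().any(|scope| scope == required_scope)
}
```

Since only `nested` can set `trusted: true`, a caller cannot escalate by sending the flag in a request payload, provided contexts for inbound calls are always built through the external path.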
### OperationEnv (the composition mechanism)
`OperationEnv` provides namespaced access to the operation registry. A handler can call other operations without knowing their transport:
```rust
pub type OperationEnv = HashMap<String, HashMap<String, fn(Value, OperationContext) -> ResponseEnvelope>>;
// Usage inside a handler:
let result = context.env["secrets"]["deriveKey"](derive_input, nested_context)?;
```
In TypeScript, `buildEnv()` iterates all registered specs (excluding subscriptions), creates closure functions for each, and passes `trusted: true` in the nested context. The Rust equivalent uses irpc service calls:
```rust
// Local: direct function call through handler map
// Remote: irpc call to the service that owns that operation
```
This means a handler for `/head/docker/create` can internally call `/head/secrets/derive` to get a key for the container, and the nested call is routed through the same service layer — locally if the secret service is on the same node, remotely via irpc if it's on a different node.
### Mapping to irpc
The TypeScript `OperationEnv` pattern maps to irpc as follows:
| TypeScript | Rust (irpc) |
|-----------|-------------|
| `context.env.namespace.op(input)` | `client.rpc(ProtocolMessage::OpName { ... }).await?` |
| `buildEnv(registry, context)` | `irpc::Client::local(tx)` or `irpc::Client::remote(conn)` |
| `registry.execute(id, input, context)` | Service handler dispatch on the enum variant |
| `accessControl` check | `enforceAccess()` before handler dispatch |
| Subscription handlers (`async function*`) | `mpsc::Sender<T>` streaming response |
### Call Protocol Events and Context
The call protocol's `EventEnvelope` carries the context fields:
```json
{
  "type": "call.requested",
  "id": "uuid-123",
  "payload": {
    "operationId": "/head/docker/create",
    "input": { "image": "nginx", "name": "web" },
    "identity": { "id": "node-abc", "scopes": ["docker:read", "docker:write"] },
    "parentRequestId": "uuid-122",
    "deadline": 1712345678000
  }
}
```
The `CallHandler` in `call.ts` receives this event, constructs an `OperationContext` from the payload, validates access control, and dispatches to the registered handler. The same pattern applies in Rust — the `buildCallHandler` function creates the context from the event and calls `registry.execute()`.
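A minimal sketch of that dispatch step, assuming access control reduces to "the caller holds every required scope" and that `trusted` bypasses the check as described earlier. The `Registry`, `CallError`, and `create_container` names are hypothetical stand-ins, and inputs are plain strings rather than JSON values to keep the sketch dependency-free.

```rust
use std::collections::HashMap;

/// Simplified stand-ins for the context types described above (illustrative only).
struct Identity {
    scopes: Vec<String>,
}

struct OperationContext {
    identity: Identity,
    trusted: bool,
}

#[derive(Debug, PartialEq)]
enum CallError {
    UnknownOperation,
    AccessDenied,
}

type Handler = fn(&str, &OperationContext) -> String;

struct Registry {
    // operation id -> (scopes the caller must hold, handler)
    operations: HashMap<String, (Vec<String>, Handler)>,
}

impl Registry {
    fn new() -> Self {
        Registry { operations: HashMap::new() }
    }

    fn register(&mut self, id: &str, required_scopes: &[&str], handler: Handler) {
        let scopes = required_scopes.iter().map(|s| s.to_string()).collect();
        self.operations.insert(id.to_string(), (scopes, handler));
    }

    /// Look up the operation, enforce access unless the call is trusted, then dispatch.
    fn execute(&self, id: &str, input: &str, ctx: &OperationContext) -> Result<String, CallError> {
        let (required, handler) = self.operations.get(id).ok_or(CallError::UnknownOperation)?;
        let allowed = ctx.trusted || required.iter().all(|s| ctx.identity.scopes.contains(s));
        if !allowed {
            return Err(CallError::AccessDenied);
        }
        Ok(handler(input, ctx))
    }
}

/// Example handler: pretends to create a container and echoes the image name.
fn create_container(input: &str, _ctx: &OperationContext) -> String {
    format!("created:{input}")
}
```

A nested call made through `env` would pass a context with `trusted: true`, so it succeeds even when the inner operation requires scopes the original caller does not hold.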
### Mutations and Events
A mutation handler can trigger side effects after the main operation:
```
handler(input, context) {
    // 1. Perform mutation (e.g., create a node in storage)
    let result = storage.create_node(...);
    // 2. Trigger side effects (e.g., publish event)
    // This is an integration event, not a domain event
    pubsub.publish("call.responded", "", {
        requestId: context.request_id,
        output: result,
    });
    return result;
}
```
Following the event boundary discipline: the mutation itself uses honker's `stream_publish` for internal state management (domain event), and the call protocol `call.responded` is the integration event that other nodes/services react to. The handler doesn't publish honker events directly — that's the storage service's internal concern. The handler calls `context.env.storage.addNode()` and the storage service internally publishes to honker before returning.
### Adapters: MCP and OpenAPI
The `from_mcp` and `from_openapi` adapters in `@alkdev/operations` demonstrate how external protocols map to the operation model:
- **MCP**: Each MCP tool becomes a `MUTATION` operation. The handler calls `client.callTool()` and wraps the result in a `ResponseEnvelope` with `source: "mcp"`.
- **OpenAPI**: Each HTTP endpoint becomes a `QUERY`, `MUTATION`, or `SUBSCRIPTION` (detected from `text/event-stream` responses). The handler makes HTTP requests and wraps results with `source: "http"`.
These adapters will need to map to irpc in Rust. The `ResponseEnvelope` pattern (wrapping results with source metadata) carries over directly. The `OpenAPIServiceRegistry` and `MCPClientLoader` patterns become irpc service initializers that register their operations with the call protocol's `OperationRegistry`.
The key insight: **adapters are just like any other service** — they register operations in the registry and get an `OperationContext` with `env` access. An MCP adapter can call `/head/secrets/derive` just as easily as a local handler can.
## Service Composition
### Minimal Deployment (Single Node, CLI)
All services run locally as tokio actors:
```
┌──────────────────────────────────────────────┐
│ Single Process │
│ │
│ ┌─────────┐ ┌─────────┐ ┌──────────────┐ │
│ │ Auth │ │ Secret │ │ Config │ │
│ │ Service │ │ Service │ │ Service │ │
│ │ (mpsc) │ │ (mpsc) │ │ (mpsc) │ │
│ └────┬─────┘ └────┬────┘ └──────┬───────┘ │
│ │ │ │ │
│ ┌────▼─────────────▼───────────────▼───────┐ │
│ │ alknet-core Server │ │
│ │ (SSH auth, call protocol, forwarding) │ │
│ └──────────────────────────────────────────┘ │
└──────────────────────────────────────────────┘
```
- Auth service uses `ArcSwap<DynamicConfig>` (all keys in memory)
- Secret service runs unlocked (seed in memory, no external access)
- Config service uses `ArcSwap<DynamicConfig>` directly
### Production Deployment (Multi-Node)
Auth and secrets run on dedicated nodes; workers access them remotely:
```
┌────────────────────┐ ┌─────────────────────┐
│ Auth Node │ │ Secret Node │
│ │ │ │
│ AuthProtocol │ │ SecretProtocol │
│ (SQLite-backed) │ │ (seed in RAM) │
│ │ │ │
└────────┬───────────┘ └──────────┬──────────┘
│ QUIC (irpc) │ QUIC (irpc)
│ │
┌────────▼────────────────────────────▼─────────┐
│ Head Node │
│ │
│ ┌──────────┐ ┌──────────┐ ┌─────────────┐ │
│ │ Config │ │ Storage │ │ alknet-core │ │
│ │ Service │ │ Service │ │ Server │ │
│ │ (local) │ │ (local) │ │ │ │
│ └──────────┘ └──────────┘ └──────────────┘ │
└───────────────────────────────────────────────┘
│ SSH / iroh / TLS
┌────────▼──────────────────────────────────────┐
│ Worker Node │
│ │
│ ┌──────────┐ ┌──────────────┐ │
│ │ Storage │ │ alknet-core │ │
│ │ Client │ │ Client │ │
│ │ (remote) │ │ │ │
│ └──────────┘ └──────────────┘ │
└───────────────────────────────────────────────┘
```
Workers don't hold the seed or the auth database. They request derived keys and auth verification via irpc over QUIC.
## Service and Call Protocol Relationship
Services are **internal** — they run within a node or cluster. The call protocol is **external** — it's how nodes communicate with each other over SSH/QUIC/WebSocket/DNS transports.
A service can be exposed as a call protocol operation:
| Internal Service | Call Protocol Path | Direction |
|-----------------|-------------------|-----------|
| AuthProtocol::VerifyPubkey | `/head/auth/verify` | Worker → Head |
| SecretProtocol::DeriveEd25519 | `/head/secrets/derive` | Worker → Head (restricted) |
| StorageProtocol::Subscribe | `/{node}/storage/watch` | Any → Any |
| ConfigProtocol::GetForwardingPolicy | `/head/config/forwarding` | Worker → Head |
External workers call these through the call protocol, which routes to the service on the head node:
```
Worker Head
│ │
│ call.requested │
│ operation: /head/auth/verify │
│ payload: { fingerprint, key }│
│ ─────────────────────────────►│
│ │ ┌─ AuthProtocol::VerifyPubkey ─┐
│ │ │ (irpc, local mpsc channel) │
│ │ └─ Result: AuthResult ──────────┘
│ │
│ call.responded │
│ payload: { status: "ok" } │
│ ◄─────────────────────────────│
```
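The path-to-service mapping in the table above can be sketched as a plain lookup. This is illustrative only: `ServiceCall` and `resolve_service` are hypothetical names, payloads are omitted, and the real dispatch would construct the corresponding irpc protocol message.

```rust
/// Internal service requests reachable through the call protocol.
/// Variant names follow the table above; payloads are left out for brevity.
#[derive(Debug, PartialEq)]
enum ServiceCall {
    AuthVerifyPubkey,
    SecretDeriveEd25519,
    ConfigGetForwardingPolicy,
    StorageSubscribe { node: String },
}

/// Map a call protocol path to the internal service request it exposes.
fn resolve_service(path: &str) -> Option<ServiceCall> {
    match path {
        "/head/auth/verify" => Some(ServiceCall::AuthVerifyPubkey),
        "/head/secrets/derive" => Some(ServiceCall::SecretDeriveEd25519),
        "/head/config/forwarding" => Some(ServiceCall::ConfigGetForwardingPolicy),
        _ => {
            // `/{node}/storage/watch` is valid for any single-segment node name.
            let node = path.strip_prefix('/')?.strip_suffix("/storage/watch")?;
            if node.is_empty() || node.contains('/') {
                return None;
            }
            Some(ServiceCall::StorageSubscribe { node: node.to_string() })
        }
    }
}
```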
## Service Integration Example
A head/worker deployment demonstrates service integration end-to-end:
- **Head node**: runs Auth, Secret, and Config services locally
- **Worker node**: connects to head via alknet call protocol
The worker-to-head protocol maps to call protocol operations:
| Worker Message | Call Protocol Path | Service |
|----------------|-------------------|---------|
| Auth | `/head/auth/verify` | AuthProtocol |
| Heartbeat | `/worker/heartbeat` (subscription) | ConfigProtocol |
| Task result | `/worker/task/submit` | StorageProtocol (persistence) |
| Task assignment | `/head/task/template` (subscription) | StorageProtocol |
Worker keys are derived from the seed by the secret service. The head node's API credentials are stored encrypted and decrypted on demand by the secret service.
## Derived Key Conventions
Standardized SLIP-0010/BIP32 paths (see [storage.md](storage.md) for full table):
| Path | Purpose | Curve/Algorithm |
|------|---------|----------------|
| `m/74'/0'/0'/0'` | Primary identity keypair | Ed25519 (alknet auth) |
| `m/74'/0'/0'/{n}'` | Worker/device identity | Ed25519 |
| `m/74'/0'/1'/0'` | SSH host key | Ed25519 |
| `m/74'/1'/0'/{hash}'` | Site-specific password | Deterministic (like orbit-db-wallet) |
| `m/74'/2'/0'/0'` | Encryption key for external credentials | AES-256-GCM |
| `m/44'/60'/0'/0/0` | Ethereum signing key | secp256k1 (smart contract) |
The `74'` coin type is unallocated per SLIP-0044 and reserved for alknet.
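The paths in the table can be parsed and checked with a few lines of code. The sketch below relies on two facts from the specifications: BIP32 marks hardened indices by setting the top bit (`0x80000000`), and SLIP-0010 defines only hardened derivation for Ed25519. The names `parse_path`, `is_valid_for_ed25519`, and `PathError` are hypothetical.

```rust
/// BIP32 marks hardened child indices by setting the top bit.
const HARDENED: u32 = 0x8000_0000;

#[derive(Debug, PartialEq)]
enum PathError {
    MissingRoot,
    BadIndex(String),
}

/// Parse "m/74'/0'/0'/0'" into raw child indices (hardened ones have the top bit set).
fn parse_path(path: &str) -> Result<Vec<u32>, PathError> {
    let mut parts = path.split('/');
    if parts.next() != Some("m") {
        return Err(PathError::MissingRoot);
    }
    parts
        .map(|part| -> Result<u32, PathError> {
            // A trailing apostrophe denotes a hardened index.
            let (digits, hardened) = match part.strip_suffix('\'') {
                Some(d) => (d, true),
                None => (part, false),
            };
            let index: u32 = digits
                .parse()
                .map_err(|_| PathError::BadIndex(part.to_string()))?;
            if index >= HARDENED {
                return Err(PathError::BadIndex(part.to_string()));
            }
            Ok(if hardened { index | HARDENED } else { index })
        })
        .collect()
}

/// SLIP-0010 only defines hardened derivation for Ed25519.
fn is_valid_for_ed25519(indices: &[u32]) -> bool {
    indices.iter().all(|i| (i & HARDENED) != 0)
}
```

Under this check the Ethereum path `m/44'/60'/0'/0/0` is rejected for Ed25519, which matches the table: it is the only row that uses secp256k1.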
## Application Services
Core services (auth, secret, config, storage) are infrastructure that every node needs. Application services are domain-specific and pluggable — they expose operations via the call protocol and are registered dynamically by the node operator.
### Service Tiers
```
┌─────────────────────────────────────────────────────────┐
│ Application Layer │
│ DockerService · NodeService · WalletService · GitService│
│ ProxyService · ComputeService · AgentService · ... │
├─────────────────────────────────────────────────────────┤
│ Core Services │
│ AuthService · SecretService · ConfigService │
│ StorageService │
├─────────────────────────────────────────────────────────┤
│ alknet-core │
│ Transport · Call Protocol · SSH · irpc │
└─────────────────────────────────────────────────────────┘
```
### DockerService
Container lifecycle management on a node. Wraps the Docker Engine API (via `bollard` crate, already used in dispatch) and exposes it through the call protocol.
```rust
#[rpc_requests(message = DockerMessage)]
enum DockerProtocol {
    #[rpc(tx=oneshot::Sender<ContainerInfo>)]
    #[wrap(CreateContainer)]
    CreateContainer { image: String, name: Option<String>, env: Vec<(String, String)>, ports: Vec<(u16, u16)> },
    #[rpc(tx=oneshot::Sender<ContainerInfo>)]
    #[wrap(InspectContainer)]
    InspectContainer { id: String },
    #[rpc(tx=oneshot::Sender<Vec<ContainerInfo>>)]
    #[wrap(ListContainers)]
    ListContainers { all: bool },
    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(StopContainer)]
    StopContainer { id: String, timeout: u64 },
    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(RemoveContainer)]
    RemoveContainer { id: String, force: bool },
    #[rpc(tx=mpsc::Sender<ContainerEvent>)]
    #[wrap(StreamEvents)]
    StreamEvents { filters: Vec<String> },
}
```
This makes container management a first-class alknet operation that can be called from any connected node, not just SSH. The dispatch project's `InstanceProvider` trait pattern maps directly here.
**Self-hosting use case**: An operator deploys a "server in a box" by connecting a worker node with DockerService registered. A head node (or another authorized node) can then deploy containers remotely via call protocol: `/node/docker/create`, `/node/docker/list`, etc. This replaces manual SSH + docker-compose with automated, auditable, policy-governed deployment.
### NodeService
System health, metrics, and tiered observability. Exposes system metrics and supports tiered escalation from small models to larger models to humans.
```rust
#[rpc_requests(message = NodeMessage)]
enum NodeProtocol {
    #[rpc(tx=oneshot::Sender<SystemMetrics>)]
    #[wrap(GetMetrics)]
    GetMetrics { categories: Vec<MetricCategory> },
    #[rpc(tx=oneshot::Sender<HealthStatus>)]
    #[wrap(HealthCheck)]
    HealthCheck,
    #[rpc(tx=mpsc::Sender<SystemEvent>)]
    #[wrap(SubscribeMetrics)]
    SubscribeMetrics { interval_ms: u64, categories: Vec<MetricCategory> },
    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(Escalate)]
    Escalate { severity: Severity, message: String, context: serde_json::Value },
}

#[derive(Serialize, Deserialize)]
enum MetricCategory { Cpu, Memory, Disk, Network, Docker, Uptime }

#[derive(Serialize, Deserialize)]
enum Severity { Info, Warning, Critical }
```
**Tiered escalation pattern**: A small model (fast, cheap) subscribes to `/node/metrics/stream` and evaluates simple rules (disk > 90%, memory > 95%, container crashed). When a rule triggers, it calls `/node/alert/escalate` with context. The head node decides whether to notify a larger model or a human.
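The rule evaluation that a small model or a plain watcher would run can be sketched as below. The thresholds come from the paragraph above; the `Snapshot` fields, the severity assigned to each rule, and the `should_escalate` policy are assumptions of this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Severity {
    Info,
    Warning,
    Critical,
}

/// A trimmed-down metrics snapshot. Field names are illustrative.
struct Snapshot {
    disk_used_pct: f64,
    memory_used_pct: f64,
    crashed_containers: u32,
}

/// Evaluate the simple rules from the text and return one alert per triggered rule.
fn evaluate(snapshot: &Snapshot) -> Vec<(Severity, String)> {
    let mut alerts = Vec::new();
    if snapshot.disk_used_pct > 90.0 {
        alerts.push((Severity::Warning, format!("disk usage at {:.1}%", snapshot.disk_used_pct)));
    }
    if snapshot.memory_used_pct > 95.0 {
        alerts.push((Severity::Warning, format!("memory usage at {:.1}%", snapshot.memory_used_pct)));
    }
    if snapshot.crashed_containers > 0 {
        alerts.push((Severity::Critical, format!("{} container(s) crashed", snapshot.crashed_containers)));
    }
    alerts
}

/// Only alerts above Info are forwarded to the head via `Escalate`.
fn should_escalate(severity: Severity) -> bool {
    severity != Severity::Info
}
```

Each alert that passes `should_escalate` would become one `/node/alert/escalate` call, with the snapshot attached as `context`.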
### WalletService
Multichain wallet operations using an HD derivation library (e.g., wagyu). Derives keys from the same master seed via the secret service, signs transactions, and manages addresses.
```rust
#[rpc_requests(message = WalletMessage)]
enum WalletProtocol {
    #[rpc(tx=oneshot::Sender<AddressInfo>)]
    #[wrap(GetAddress)]
    GetAddress { chain: Chain, path: String },
    #[rpc(tx=oneshot::Sender<BalanceInfo>)]
    #[wrap(GetBalance)]
    GetBalance { chain: Chain, address: String },
    #[rpc(tx=oneshot::Sender<SignedTransaction>)]
    #[wrap(SignTransaction)]
    SignTransaction { chain: Chain, path: String, tx_params: serde_json::Value },
    #[rpc(tx=oneshot::Sender<String>)]
    #[wrap(VerifyAddress)]
    VerifyAddress { chain: Chain, address: String },
}

#[derive(Serialize, Deserialize)]
enum Chain { Bitcoin, Ethereum, Monero, Zcash }
```
The WalletService delegates key derivation to the SecretService via irpc. It never sees the master seed — only derived keypairs for specific paths. This means wallet operations are available to authorized nodes without exposing the full key hierarchy.
### ProxyService
Reverse proxy and TLS certificate management. Automates nginx/certbot configuration for services deployed via DockerService.
```rust
#[rpc_requests(message = ProxyMessage)]
enum ProxyProtocol {
    #[rpc(tx=oneshot::Sender<ProxyConfig>)]
    #[wrap(GetConfig)]
    GetConfig,
    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(AddRoute)]
    AddRoute { domain: String, upstream: String, tls: bool },
    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(RemoveRoute)]
    RemoveRoute { domain: String },
    #[rpc(tx=oneshot::Sender<CertificateInfo>)]
    #[wrap(ProvisionCert)]
    ProvisionCert { domain: String },
    #[rpc(tx=oneshot::Sender<Vec<CertificateInfo>>)]
    #[wrap(ListCerts)]
    ListCerts,
}
```
### ComputeService
Abstracts compute provider APIs (starting with dispatch's `InstanceProvider` pattern). Manages remote instances across providers.
```rust
#[rpc_requests(message = ComputeMessage)]
enum ComputeProtocol {
    #[rpc(tx=oneshot::Sender<InstanceInfo>)]
    #[wrap(CreateInstance)]
    CreateInstance { provider: String, spec: InstanceSpec },
    #[rpc(tx=oneshot::Sender<Vec<InstanceInfo>>)]
    #[wrap(ListInstances)]
    ListInstances { provider: Option<String> },
    #[rpc(tx=oneshot::Sender<()>)]
    #[wrap(DestroyInstance)]
    DestroyInstance { id: String },
    #[rpc(tx=oneshot::Sender<InstanceInfo>)]
    #[wrap(GetInstance)]
    GetInstance { id: String },
}
```
### Registration Pattern
Application services register with the call protocol's `OperationRegistry` at startup:
```rust
registry.register(
    OperationSpec { name: "/node/docker/create", namespace: "docker", ... },
    docker_service.create_container_handler,
);
registry.register(
    OperationSpec { name: "/node/metrics/stream", namespace: "node", ... },
    node_service.subscribe_metrics_handler,
);
```
A worker node that exposes Docker and Node services registers those operations when it connects to the head. The head can then route calls from any node to the appropriate worker via the call protocol.
### Self-Hosting Stack Example
A minimal self-hosted server with all services:
```
┌─────────────────────────────────────────────────────────┐
│ Head Node │
│ │
│ Core: Auth · Secret · Config · Storage │
│ App: Docker · Node · Proxy · Git · Wallet · Compute │
│ │
│ Call protocol paths: │
│ /head/auth/* │
│ /head/docker/* │
│ /head/proxy/* │
│ /head/wallet/* │
│ /head/compute/* │
│ /head/node/metrics/* │
└─────────────────────────────────────────────────────────┘
```
An operator deploys this by:
1. Running `alknet serve --config stack.toml`
2. Entering their seed phrase once (unlocks the secret service)
3. All services come online with keys derived from the seed
4. Docker containers for Gitea, Postgres, Redis, etc. are managed via DockerService
5. Reverse proxy and TLS are automated via ProxyService
6. Wallet keys are derived on demand via WalletService
No manual SSH, no hardcoded credentials, no separate secret management. The seed phrase is the single root of trust.
## Crate Structure
```
alknet-core/
├── transport/ — Transport trait, TCP, TLS, iroh, DNS
├── call/ — Call protocol, PendingRequestMap, OperationRegistry
├── auth/ — AuthService protocol, identity types
├── secrets/ — SecretService protocol, BIP39, SLIP-0010, AES-GCM
├── config/ — ConfigService protocol, StaticConfig, DynamicConfig
├── handler/ — ServerHandler, SSH authentication hooks
└── serve.rs — Server::run(), multi-transport listeners
alknet-storage/
├── metagraph/ — GraphType, NodeType, EdgeType persistence
├── identity/ — accounts, organizations, peer_credentials, api_keys
├── acl/ — PrincipalNode, DelegatesEdge, access control graph
├── secrets/ — Encrypted node type, encrypt/decrypt, key derivation bridge
├── honker/ — honker integration: notify, stream, queue
├── graph/ — GraphInstance, Node, Edge CRUD with schema validation
└── schema/ — JSON Schema definitions (serde + jsonschema)
```
## Security Considerations
1. **Seed phrase is never persisted** — it's entered at startup or via `Unlock` call and held only in RAM
2. **Derived keys are cached in memory** — cleared on `Lock`
3. **External credentials are encrypted at rest** — the encryption key is itself derived from the seed
4. **Auth service never sees the seed** — it only sees public key fingerprints and verification results
5. **irpc remote communication is over QUIC** — encrypted in transit; irpc doesn't add its own encryption layer (assumes the transport provides it)
6. **Lock wipes all secrets** — a locked secret service returns errors for all requests until unlocked
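Points 1, 2, and 6 describe a small state machine, sketched below. This is illustrative only: `SecretService` here is a hypothetical in-process struct, and `placeholder_derive` is **not** cryptography; it stands in for SLIP-0010 derivation so the locking behaviour can be shown without external crates.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum SecretError {
    Locked,
}

struct SecretService {
    seed: Option<Vec<u8>>,           // held only in RAM while unlocked
    cache: HashMap<String, Vec<u8>>, // derived keys, indexed by derivation path
}

/// NOT real key derivation: a stand-in so the sketch runs without crypto crates.
fn placeholder_derive(seed: &[u8], path: &str) -> Vec<u8> {
    seed.iter().zip(path.bytes().cycle()).map(|(s, p)| s ^ p).collect()
}

impl SecretService {
    fn new() -> Self {
        SecretService { seed: None, cache: HashMap::new() }
    }

    fn unlock(&mut self, seed: Vec<u8>) {
        self.seed = Some(seed);
    }

    /// Overwrite and drop the seed and every cached key.
    /// Plain writes can be optimized away by the compiler; a real implementation
    /// would likely use a crate such as `zeroize` for this step.
    fn lock(&mut self) {
        if let Some(seed) = self.seed.as_mut() {
            seed.iter_mut().for_each(|b| *b = 0);
        }
        self.seed = None;
        for key in self.cache.values_mut() {
            key.iter_mut().for_each(|b| *b = 0);
        }
        self.cache.clear();
    }

    /// Every request fails while locked; otherwise derive once and serve from the cache.
    fn derive(&mut self, path: &str) -> Result<Vec<u8>, SecretError> {
        let seed = self.seed.as_ref().ok_or(SecretError::Locked)?;
        if let Some(key) = self.cache.get(path) {
            return Ok(key.clone());
        }
        let key = placeholder_derive(seed, path);
        self.cache.insert(path.to_string(), key.clone());
        Ok(key)
    }
}
```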
## Open Questions
- **OQ-SVC-01**: Should the secret service support multiple seed phrases (one per tenant or identity)?
The simplest approach is one seed per node. Multi-seed support (e.g., one per tenant in a multi-tenant system) can be added later by indexing the `Unlock` call with a tenant ID. Defer for now.
- **OQ-SVC-02**: Should service protocols use postcard (binary) or JSON for remote calls?
irpc defaults to postcard for efficiency. However, the call protocol uses JSON `EventEnvelope` for cross-language compatibility. Service-to-service calls should use postcard (Rust-to-Rust), while node-to-node calls use JSON (call protocol). The irpc remote path naturally uses postcard.
- **OQ-SVC-03**: How does the secret service integrate with the existing `EncryptedDataSchema` from `@alkdev/storage`?
The TypeScript `encrypt()`/`decrypt()` functions use PBKDF2 with a password. In Rust, the secret service replaces the password with a derived AES-256-GCM key. The `EncryptedData` schema (key_version, salt, iv, data) stays the same, but key derivation changes from PBKDF2(password) to SLIP-0010(seed, path). This is a superset — the old format can be migrated by re-encrypting with the new key.
- **OQ-SVC-04**: Should workers cache derived keys locally?
Yes, with a TTL. A worker that holds a derived Ed25519 keypair for its session can re-authenticate without calling the secret service every time. The TTL should be configurable (default: 1 hour). The head can revoke by invalidating the session, not by expiring the key.
- **OQ-SVC-05**: How does the smart contract (NFT-based ACL) interact with the secret service?
The Ethereum signing key (`m/44'/60'/0'/0/0`) is derived from the same seed. The secret service can sign transactions on behalf of the node. The smart contract is a separate concern — it's the external source of truth for identity registration. The local ACL graph (in `alknet-storage`) is a cache that's synced from the contract, not the other way around.
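For OQ-SVC-04, a worker-side TTL cache might look like the sketch below. It assumes the one-hour default mentioned in that answer; `KeyCache` and its methods are hypothetical, and the current time is passed in explicitly so the expiry logic can be tested without sleeping.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Worker-side cache of derived keys with a fixed time-to-live.
struct KeyCache {
    ttl: Duration,
    entries: HashMap<String, (Vec<u8>, Instant)>, // path -> (key, time stored)
}

impl KeyCache {
    fn new(ttl: Duration) -> Self {
        KeyCache { ttl, entries: HashMap::new() }
    }

    fn insert(&mut self, path: &str, key: Vec<u8>, now: Instant) {
        self.entries.insert(path.to_string(), (key, now));
    }

    /// Return the cached key if it is younger than the TTL; evict it otherwise.
    fn get(&mut self, path: &str, now: Instant) -> Option<Vec<u8>> {
        let expired = match self.entries.get(path) {
            Some((_, stored_at)) => now.duration_since(*stored_at) >= self.ttl,
            None => return None,
        };
        if expired {
            self.entries.remove(path);
            return None;
        }
        self.entries.get(path).map(|(key, _)| key.clone())
    }

    /// Called when the head invalidates the session: drop everything.
    fn invalidate_all(&mut self) {
        self.entries.clear();
    }
}
```

On a cache miss the worker would fall back to a `/head/secrets/derive` call and re-insert the result.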
## References
- [core.md](core.md) — Core overview, transport, call protocol, head/worker model
- [configuration.md](configuration.md) — Config architecture, auth service, DynamicConfig
- [storage.md](storage.md) — Metagraph, identity, ACL, secrets, event boundaries
- [flow.md](flow.md) — Operation graph, call graph, petgraph mapping
- `/workspace/@alkdev/storage/docs/architecture/encrypted-data.md` — Original encrypted data design (TypeScript)
- `/workspace/research/event_sourcing/event_source_types.md` — Event-driven architecture patterns
- irpc crate — https://docs.rs/irpc — Service protocol definitions, local/remote abstraction
- SLIP-0010 — https://github.com/satoshilabs/slips/blob/master/slip-0010.md — HD key derivation for Ed25519
- BIP39 — https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki — Mnemonic code for generating deterministic keys (widely used beyond cryptocurrency)
- `ed25519-bip32` crate — https://docs.rs/ed25519-bip32 — BIP32-Ed25519 (Cardano/IOHK approach)
- `bip39` crate — https://docs.rs/bip39 — Mnemonic generation and seed derivation


@@ -1,460 +0,0 @@
# Alknet Storage: Metagraph, Identity, ACL, Secrets, and Honker Integration
> Status: Research / Draft
> Last updated: 2026-06-06
## Overview
`alknet-storage` is a Rust crate providing SQLite-backed graph storage, identity management, access control, secrets management, and reactivity via honker. It mirrors the TypeScript `@alkdev/storage` package's design (`sqlite-host.md`, `metagraph-module.md`, `acl.md`) while leveraging Rust's type system and petgraph's performance.
## Terminology
This document uses **head/worker** terminology instead of hub/spoke:
- **Head node**: Coordinating node that can also be a worker
- **Worker node**: Node that connects to a head and registers services
- **Node**: Any participant in the network
## Crate Decomposition
```
alknet-storage
├── metagraph/ — GraphType, NodeType, EdgeType definitions and persistence
├── identity/ — accounts, organizations, peer_credentials, api_keys, audit_logs
├── acl/ — PrincipalNode, DelegatesEdge, access control graph
├── secrets/ — HD key derivation (BIP39/SLIP-0010), encrypted data, secret service bridge
├── honker/ — honker integration: notify, stream, queue, event bridge
├── graph/ — GraphInstance, Node, Edge CRUD with schema validation
└── schema/ — JSON Schema definitions (serde + jsonschema for runtime validation)
```
## Metagraph Data Model
The metagraph is a three-level type system (mirrors `@alkdev/storage` exactly):
1. **GraphType** — A class of graphs (e.g., "call-graph", "acl", "task-dependencies"). Defines structural constraints (directed/undirected/mixed, allows self-loops, multi-edges).
2. **NodeType** — A category of node within a graph type (e.g., "call", "account", "task"). Each node type has a JSON Schema that validates the `attributes` of nodes belonging to that type.
3. **EdgeType** — A category of edge within a graph type (e.g., "triggered", "can_read", "depends_on"). Each edge type has a JSON Schema for its attributes. Optionally constrains which source/target node types are valid.
**Graph instances** belong to a graph type and contain **Nodes** and **Edges** conforming to those type definitions.
### Rust Types
```rust
pub struct GraphType {
    pub id: String,
    pub name: String,                 // "call-graph", "acl"
    pub description: String,
    pub config: GraphConfig,          // directed/undirected/mixed, multi, self-loops
    pub version: u32,
    pub scope: Scope,                 // System, Tenant, User
    pub metadata: serde_json::Value,
}

pub struct GraphConfig {
    pub graph_type: GraphDirection,   // Directed, Undirected, Mixed
    pub multi: bool,
    pub allow_self_loops: bool,
}

pub enum Scope {
    System,
    Tenant,
    User,
}

pub struct NodeType {
    pub id: String,
    pub graph_type_id: String,
    pub name: String,                 // "call", "account"
    pub description: String,
    pub schema: serde_json::Value,    // JSON Schema for node attributes
}

pub struct EdgeType {
    pub id: String,
    pub graph_type_id: String,
    pub name: String,                 // "triggered", "can_read"
    pub description: String,
    pub schema: serde_json::Value,    // JSON Schema for edge attributes
    pub allowed_source_types: Vec<String>, // [] = no restriction
    pub allowed_target_types: Vec<String>,
}

pub struct Graph {
    pub id: String,
    pub graph_type_id: String,
    pub name: String,
    pub description: String,
    pub status: GraphStatus,          // Active, Archived, Draft
    pub owner_id: Option<String>,
    pub project_id: Option<String>,
    pub metadata: serde_json::Value,
}

pub enum GraphStatus {
    Active,
    Archived,
    Draft,
}

pub struct Node {
    pub id: String,
    pub graph_id: String,
    pub key: String,                  // Consumer-defined identity within the graph
    pub attributes: serde_json::Value, // Validated by node type schema
    pub metadata: serde_json::Value,
}

pub struct Edge {
    pub id: String,
    pub graph_id: String,
    pub key: Option<String>,          // Null for anonymous edges
    pub source_node_key: String,
    pub target_node_key: String,
    pub attributes: serde_json::Value, // Validated by edge type schema
    pub undirected: bool,
    pub metadata: serde_json::Value,
}
```
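The structural constraints in `GraphConfig` can be enforced before an edge is written. The sketch below assumes a directed graph and checks only `multi` and `allow_self_loops`; `check_new_edge` and `EdgeError` are hypothetical names, and the direction field is left out to keep the example short.

```rust
/// The two structural flags from `GraphConfig` (direction omitted in this sketch).
struct GraphConfig {
    multi: bool,
    allow_self_loops: bool,
}

#[derive(Debug, PartialEq)]
enum EdgeError {
    SelfLoop,
    DuplicateEdge,
}

/// Check a candidate edge against the graph type's structural constraints.
/// `existing` lists the (source_node_key, target_node_key) pairs already in the graph.
fn check_new_edge(
    config: &GraphConfig,
    existing: &[(&str, &str)],
    source: &str,
    target: &str,
) -> Result<(), EdgeError> {
    if !config.allow_self_loops && source == target {
        return Err(EdgeError::SelfLoop);
    }
    // In a directed graph, a parallel edge has the same source and the same target.
    let duplicate = existing.iter().any(|(s, t)| *s == source && *t == target);
    if !config.multi && duplicate {
        return Err(EdgeError::DuplicateEdge);
    }
    Ok(())
}
```

Schema validation of `attributes` would run after this check, since it needs the edge type's JSON Schema rather than the graph config.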
### SQLite Tables (mirrors `sqlite-host.md`)
Common columns on all tables: `id TEXT PK`, `metadata TEXT JSON DEFAULT '{}'`, `created_at INTEGER TIMESTAMP DEFAULT (strftime('%s','now'))`, `updated_at INTEGER TIMESTAMP DEFAULT (strftime('%s','now'))`.
**graph_types**: `id`, `name TEXT UNIQUE`, `description TEXT DEFAULT ''`, `config TEXT JSON NOT NULL`, `version INTEGER NOT NULL DEFAULT 1`, `scope TEXT NOT NULL DEFAULT 'system'`
**node_types**: `id`, `graph_type_id TEXT FK → graph_types.id CASCADE`, `name TEXT NOT NULL`, `description TEXT DEFAULT ''`, `schema TEXT JSON NOT NULL`. Unique constraint: `(graph_type_id, name)`.
**edge_types**: `id`, `graph_type_id TEXT FK → graph_types.id CASCADE`, `name TEXT NOT NULL`, `description TEXT DEFAULT ''`, `schema TEXT JSON NOT NULL`, `allowed_source_types TEXT JSON DEFAULT '[]'`, `allowed_target_types TEXT JSON DEFAULT '[]'`. Unique constraint: `(graph_type_id, name)`.
**graphs**: `id`, `graph_type_id TEXT FK → graph_types.id SET NULL`, `name TEXT NOT NULL`, `description TEXT DEFAULT ''`, `status TEXT NOT NULL DEFAULT 'draft'`, `owner_id TEXT`, `project_id TEXT`. Indexes on `(owner_id)`, `(project_id)`, `(owner_id, project_id)`.
**nodes**: `id`, `graph_id TEXT FK → graphs.id CASCADE`, `key TEXT NOT NULL`, `attributes TEXT JSON NOT NULL DEFAULT '{}'`. Unique constraint: `(graph_id, key)`. No `node_type_id` column (ADR-020).
**edges**: `id`, `graph_id TEXT FK → graphs.id CASCADE`, `key TEXT`, `source_node_key TEXT NOT NULL`, `target_node_key TEXT NOT NULL`, `attributes TEXT JSON NOT NULL DEFAULT '{}'`, `undirected INTEGER DEFAULT 0`. Unique constraint: `(graph_id, key)`. FK: `source_node_key`, `target_node_key` reference `(nodes.graph_id, nodes.key)` with CASCADE delete (ADR-022).
### System DB vs Tenant DB (ADR-040)
- **System DB** (`system.db`): Identity tables (accounts, organizations, peer_credentials, api_keys, audit_logs) + system-scoped graph types.
- **Tenant DB** (`tenant-{orgId}.db`): Metagraph tables (graph_types, node_types, edge_types, graphs, nodes, edges) + tenant-scoped graph types.
No FK constraints across database files. Consumer enforces referential integrity at application layer.
## Identity Tables
Mirrors `sqlite-host.md` identity tables with the same column definitions and FK cascades:
**accounts**: `email TEXT UNIQUE NOT NULL`, `display_name TEXT`, `access_level TEXT NOT NULL DEFAULT 'user'` (admin/user/service), `status TEXT NOT NULL DEFAULT 'active'` (active/suspended/deactivated).
**organizations**: `name TEXT UNIQUE NOT NULL`, `slug TEXT UNIQUE NOT NULL`, `owner_id TEXT FK → accounts.id RESTRICT`.
**organization_members**: `org_id TEXT FK → organizations.id CASCADE`, `account_id TEXT FK → accounts.id CASCADE`, `membership_level TEXT NOT NULL` (owner/admin/member). Unique constraint: `(org_id, account_id)`.
**api_keys**: `owner_id TEXT FK → accounts.id CASCADE`, `key_hash TEXT UNIQUE NOT NULL`, `name TEXT`, `enabled INTEGER NOT NULL DEFAULT 1`, `expires_at INTEGER TIMESTAMP`, `revoked_at INTEGER TIMESTAMP`, `rotated_to_id TEXT`, `last_used_at INTEGER TIMESTAMP`.
**peer_credentials**: `owner_id TEXT FK → accounts.id CASCADE`, `credential_type TEXT NOT NULL` (ssh_key/cert_authority), `fingerprint TEXT UNIQUE NOT NULL`, `public_key_data TEXT NOT NULL`, `name TEXT`, `enabled INTEGER NOT NULL DEFAULT 1`, `expires_at INTEGER TIMESTAMP`, `revoked_at INTEGER TIMESTAMP`.
**audit_logs**: `action TEXT NOT NULL`, `owner_id TEXT FK → accounts.id RESTRICT`, `credential_id TEXT`, `credential_type TEXT`, `org_id TEXT FK → organizations.id SET NULL`, `details TEXT JSON`.
## Access Control (ACL) as Metagraph
Mirrors `@alkdev/storage acl.md`:
### AclGraph Module
```rust
// Graph config: directed, multi=false, allowSelfLoops=false
pub const ACL_GRAPH_CONFIG: GraphConfig = GraphConfig {
    graph_type: GraphDirection::Directed,
    multi: false,
    allow_self_loops: false,
};

// Node types
pub const PRINCIPAL_NODE: &str = "principal";
pub const RESOURCE_NODE: &str = "resource";

// Edge types
pub const CAN_READ_EDGE: &str = "can_read";
pub const CAN_WRITE_EDGE: &str = "can_write";
pub const CAN_EXECUTE_EDGE: &str = "can_execute";
pub const BELONGS_TO_EDGE: &str = "belongs_to";
pub const DELEGATES_EDGE: &str = "delegates";

// PrincipalNode attributes
pub struct PrincipalNodeAttrs {
    pub identity_type: IdentityType,  // Account, Org, Service, Role
    pub identity_id: String,          // FK to accounts.id or organizations.id
    pub scopes: Vec<String>,
    pub resources: Option<HashMap<String, Vec<String>>>,
}

pub enum IdentityType {
    Account,
    Org,
    Service,
    Role,
}

// DelegatesEdge attributes
pub struct DelegatesEdgeAttrs {
    pub narrowed_scopes: Vec<String>, // Subset of delegator's scopes
    pub narrowable: bool,             // Can the delegate further narrow?
}
```
### Principal-Agent Hierarchy
- **Account** nodes represent individual users
- **Org** nodes represent organizations
- **Service** nodes represent automated agents (LLM workers, node credentials)
- **Role** nodes represent named permission sets
Delegation edges (`delegates`) carry `narrowed_scopes` — the delegate can only exercise scopes that are a subset of the delegator's. Liability flows upward; permissions flow downward with narrowing.
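Scope narrowing along a delegation chain amounts to repeated set intersection. The sketch below shows the subset check for a single `delegates` edge and the effective scopes at the end of a chain; the function names are hypothetical and scopes are modelled as plain strings.

```rust
use std::collections::BTreeSet;

/// Build a scope set from string literals (helper for the example).
fn scopes(items: &[&str]) -> BTreeSet<String> {
    items.iter().map(|s| s.to_string()).collect()
}

/// A `delegates` edge is valid only if its narrowed scopes are a subset
/// of what the delegator holds.
fn is_valid_narrowing(delegator: &BTreeSet<String>, narrowed: &BTreeSet<String>) -> bool {
    narrowed.is_subset(delegator)
}

/// Effective scopes at the end of a delegation chain: intersect at every hop,
/// so a scope can be dropped along the way but never gained.
fn effective_scopes(root: &BTreeSet<String>, chain: &[BTreeSet<String>]) -> BTreeSet<String> {
    chain
        .iter()
        .fold(root.clone(), |held, narrowed| held.intersection(narrowed).cloned().collect())
}
```

Intersecting rather than trusting each edge's `narrowed_scopes` means a malformed edge that lists an extra scope cannot escalate privileges.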
### BelongsToEdge (Derived from org_members)
ADR-045: The `organization_members` SQL table is the authoritative source. When membership changes, the consumer writes the SQL row first, then creates or removes the ACL `belongs_to` edge. The edge is derived, not the source of truth.
### Operation-Level ACL
`OperationSpec.access_control` maps to ACL graph traversal at runtime:
```rust
pub fn check_access(
    acl_graph: &Graph,
    principal_key: &str,
    operation_spec: &OperationSpec,
) -> bool {
    // Traverse from PrincipalNode to ResourceNode
    // Check if any path satisfies required_scopes (AND) and required_scopes_any (OR)
    // Honor delegation chains with scope narrowing
    todo!("graph traversal is not specified in this draft")
}
```
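The scope test applied at the end of the traversal can be sketched on its own. It assumes `required_scopes` means "all of" and `required_scopes_any` means "at least one of", as the comment in `check_access` states, and it treats an empty `required_scopes_any` as "no constraint", which is an assumption of this sketch.

```rust
/// Scope requirements attached to an operation.
struct AccessControl {
    required_scopes: Vec<String>,     // caller must hold all of these (AND)
    required_scopes_any: Vec<String>, // caller must hold at least one, if non-empty (OR)
}

/// Check the scopes a principal effectively holds against an operation's requirements.
fn scopes_satisfy(held: &[String], access: &AccessControl) -> bool {
    let all_ok = access.required_scopes.iter().all(|s| held.contains(s));
    let any_ok = access.required_scopes_any.is_empty()
        || access.required_scopes_any.iter().any(|s| held.contains(s));
    all_ok && any_ok
}

/// Build an owned scope list from string literals (helper for the example).
fn owned(items: &[&str]) -> Vec<String> {
    items.iter().map(|s| s.to_string()).collect()
}
```

The `held` slice would be the effective scopes after delegation narrowing, not the raw scopes stored on the principal node.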
## Honker Integration
### Reactivity Pattern (ADR-047)
Every mutation is atomic with a notification:
```rust
// Insert a node and notify in one transaction
tx.execute(
    "INSERT INTO nodes (id, graph_id, key, attributes) VALUES (?, ?, ?, ?)",
    &[&node_id, &graph_id, &key, &attrs_json],
)?;
tx.stream_publish("nodes:created", &node_attrs_json)?;
```
This mirrors the TypeScript pattern from `sqlite-host.md` but in Rust, using honker's SQLite extension functions:
```rust
use honker::Database;

let db = Database::open("tenant.db")?;

// Transactional: business write + event stream publish commit together
let mut tx = db.transaction()?;
tx.execute("INSERT INTO nodes (id, graph_id, key, attributes) VALUES (?, ?, ?, ?)", ...)?;
tx.stream_publish("nodes:created", &attrs)?;
tx.commit()?;

// Subscribe to changes (assumes `subscribe` returns an async stream of events)
let stream = db.stream("nodes:created");
let mut events = stream.subscribe("alknet-node-watcher");
while let Some(event) = events.next().await {
    // event is a serde_json::Value
}
```
### Honker Features Used
| Feature | Use case |
|---------|----------|
| `stream_publish` / `subscribe` | Durable pub/sub for node/edge/membership changes with per-consumer offsets |
| `notify` / `listen` | Ephemeral pub/sub for real-time control channel events |
| `queue` / `claim` / `ack` | Task queue for async operations (key rotation, ACL evaluation) |
| `scheduler` | Periodic tasks (session cleanup, audit log pruning) |
### Database Concurrency
- WAL mode (default) for concurrent reads during writes
- Single writer per `.db` file
- `busy_timeout=5000` default
- `PRAGMA data_version` polling for cross-process wake (honker pattern)
- `max_readers=4` concurrent read connections in the reader pool
## JSON Schema Validation
TypeBox from TypeScript maps to `serde_json::Value` + `jsonschema` in Rust:
| TypeScript | Rust |
|-----------|------|
| `Type.Object({...})` | `serde_json::json!({...})` as JSON Schema |
| `Value.Check(schema, data)` | `jsonschema::validate(&schema, &data)` |
| `Type.Module({...})` | JSON Schema with `$defs` stored in DB |
| `Type.Composite([A, B])` | Merge + intersect via `serde_json` merge logic |
The `jsonschema` crate provides runtime validation analogous to TypeBox's `Value.Check()`. Schema definitions are stored as `serde_json::Value` in the `schema` column of `node_types` and `edge_types` tables.
## Crate Dependency Map
```toml
[dependencies]
honker = "0.x" # SQLite extension with pub/sub/queue
serde = { version = "1", features = ["derive"] }
serde_json = "1"
jsonschema = "0.x" # JSON Schema validation (runtime)
petgraph = "0.x" # Graph data structure (shared with alknet-flowgraph)
rusqlite = { version = "0.x", features = ["bundled"] } # SQLite access (via honker)
uuid = { version = "1", features = ["v4"] }
chrono = "0.x"
thiserror = "1"
tokio = { version = "1", features = ["full"] }
```
## Multi-Tenant Replication Path
For the private use case: single `.db` files, honker for reactivity, no cross-database FK constraints.
For the distributed use case (later):
1. **Smart contracts** (Base L2) own namespace identity → `ownerId` field on `graphs` table
2. **alknet-relay** gossips namespace availability via iroh-gossip or call protocol subscriptions
3. **ACL inference** — Contract `collaborators` → ACL graph `DelegatesEdge` entries
4. **Honker streams**: `stream_subscribe("nodes:modified")` carries mutations to relay subscribers
Replication mindset from the start: **every write is atomic with a notification**. The honker stream event is the replication unit. A future replicator reads `_honker_stream_*` tables and propagates changes to subscribed relays.
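
To make "every write is atomic with a notification" concrete, here is an in-memory sketch in which a row mutation and its stream event are staged together and only become visible in one commit step. `Store`, `Tx`, and `StreamEvent` are hypothetical names chosen for illustration and do not reflect honker's API; in the real system the SQLite transaction is what would provide the atomicity.

```rust
use std::collections::HashMap;

/// Hypothetical stream event: the unit a future replicator would forward.
#[derive(Debug, Clone, PartialEq)]
struct StreamEvent {
    topic: String,
    node_id: String,
}

/// Hypothetical in-memory stand-in for one tenant `.db` file.
#[derive(Default)]
struct Store {
    nodes: HashMap<String, String>, // node id -> JSON attributes
    stream: Vec<StreamEvent>,       // append-only event log
}

/// Staged changes; nothing is visible in `Store` until `commit`.
#[derive(Default)]
struct Tx {
    writes: Vec<(String, String)>,
    events: Vec<StreamEvent>,
}

impl Tx {
    /// Stage a node write together with its notification.
    fn put_node(&mut self, id: &str, attrs: &str) {
        self.writes.push((id.to_string(), attrs.to_string()));
        self.events.push(StreamEvent {
            topic: "nodes:modified".to_string(),
            node_id: id.to_string(),
        });
    }
}

impl Store {
    /// Apply rows and events together; dropping a `Tx` applies neither.
    fn commit(&mut self, tx: Tx) {
        for (id, attrs) in tx.writes {
            self.nodes.insert(id, attrs);
        }
        self.stream.extend(tx.events);
    }
}
```

Committing one staged `put_node` yields one row and one `nodes:modified` event, while dropping the `Tx` without committing yields neither. Because there is no API for writing a row without its event, a replicator that reads only the stream cannot miss a write.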
### Event Boundary Discipline
Following [event_source_types.md](/workspace/research/event_sourcing/event_source_types.md), honker streams serve different roles in different contexts. Preventing conflation is critical:
| Event Type | Source | Consumer | Boundary |
|-----------|--------|----------|----------|
| **Domain events** (Event Sourcing) | Service that owns the data | Same service, for state reconstruction | Internal — never published directly to other services |
| **Integration events** (State Transfer) | Projected from domain events | Other services/nodes, for cache updates | Cross-service — simple, versioned, stripped of internals |
| **Notifications** (Thin Events) | Service that owns the data | Any subscriber, for triggering workflows | Cross-node — just entity ID + action, consumer fetches details |
Conflation anti-patterns to avoid:
- **Leaky event store**: Don't let other services read honker stream events directly to drive business logic. Project domain events into integration events first.
- **Boomerang coupling**: If a consumer of an integration event must call back to the source service synchronously, the event payload is too thin. Upgrade to a fat event.
- **Fat notification trap**: If a notification event carries the full entity state, use state transfer instead.
The call protocol's `EventEnvelope` is the **integration boundary** between nodes. Domain events in honker streams stay within the service that owns them.
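
The three event roles in the table can be illustrated by projecting a single domain event into the two cross-boundary forms. This is a sketch only: `DomainEvent`, `NodeUpdatedV1`, `Notification`, and both projection functions are hypothetical, and how the results would be wrapped in the call protocol's `EventEnvelope` is not specified here.

```rust
/// Domain event: internal to the owning service, may carry internal detail.
#[derive(Debug, Clone)]
enum DomainEvent {
    NodeAttributesChanged {
        node_id: String,
        graph_id: String,
        new_attrs_json: String,
        internal_revision: u64, // internal bookkeeping, must not leak
    },
}

/// Integration event (state transfer): versioned, stripped of internals.
#[derive(Debug, Clone, PartialEq)]
struct NodeUpdatedV1 {
    schema_version: u32,
    node_id: String,
    graph_id: String,
    attrs_json: String,
}

/// Notification (thin event): entity ID plus action only.
#[derive(Debug, Clone, PartialEq)]
struct Notification {
    entity_id: String,
    action: &'static str,
}

/// Project a domain event into a fat, versioned integration event.
fn to_integration(event: &DomainEvent) -> NodeUpdatedV1 {
    match event {
        DomainEvent::NodeAttributesChanged {
            node_id,
            graph_id,
            new_attrs_json,
            .. // `internal_revision` is deliberately dropped
        } => NodeUpdatedV1 {
            schema_version: 1,
            node_id: node_id.clone(),
            graph_id: graph_id.clone(),
            attrs_json: new_attrs_json.clone(),
        },
    }
}

/// Project a domain event into a thin notification.
fn to_notification(event: &DomainEvent) -> Notification {
    match event {
        DomainEvent::NodeAttributesChanged { node_id, .. } => Notification {
            entity_id: node_id.clone(),
            action: "modified",
        },
    }
}
```

Keeping the projections as explicit functions gives the service one place to decide what leaves it, which is the guard against the leaky event store anti-pattern.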
## Secrets and HD Key Derivation
### Key Categories
Different categories of secrets require different storage and derivation strategies:
| Category | Example | Derived from seed? | Storage |
|-----------|---------|-------------------|---------|
| **Identity keys** | Ed25519 keypair for alknet auth | Yes (SLIP-0010 Ed25519, `m/74'/0'/0'/0'`) | Only derivation path in DB |
| **Encryption keys** | AES-256-GCM key for encrypted nodes | Yes (SLIP-0010 Ed25519, `m/74'/2'/0'/0'`) | Only derivation path in DB |
| **External credentials** | OpenAI API key, OAuth token | No (third-party issued) | Encrypted in DB with derived key |
| **On-chain identity** | Ethereum key for contract signing | Yes (BIP32 over secp256k1, `m/44'/60'/0'/0/0`) | Only derivation path in DB |
| **Service registration** | NFT token ID, replicator endpoint | No (on-chain data) | Plain in DB or on-chain |
### BIP39 Seed Phrase as Root of Trust
The master seed phrase (BIP39 mnemonic) is the single recovery mechanism for the entire system. From one seed phrase, all self-generated secrets can be derived on demand:
```rust
// Seed phrase → master seed (BIP39)
let mnemonic = Mnemonic::from_phrase(&phrase, Language::English)?;
let seed = mnemonic.to_seed(Some(&passphrase));
// Master seed → SLIP-0010 Ed25519 master key
let master_key = ExtendedPrivKey::new_master(Network::Alknet, &seed)?;
// Derive identity keypair
let identity_key = master_key.derive_path("m/74'/0'/0'/0'")?;
// Derive encryption key material (use first 32 bytes of derived key as AES-256 key)
let encryption_key = master_key.derive_path("m/74'/2'/0'/0'")?;
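// Derive Ethereum signing key (for smart contract interactions).
// Ethereum uses secp256k1, and this path has non-hardened components that
// SLIP-0010 Ed25519 cannot derive, so it needs a separate secp256k1 (BIP32)
// master built from the same seed. `Secp256k1Master` is a placeholder name.
let secp256k1_master = Secp256k1Master::new_master(&seed)?;
let eth_key = secp256k1_master.derive_path("m/44'/60'/0'/0/0")?;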
```
### External Credentials: Encryption with Derived Keys
For external credentials (API keys, OAuth tokens) that can't be derived, the existing `EncryptedDataSchema` pattern from `@alkdev/storage` applies — but the encryption key is itself derived from the seed:
1. The secret service derives an AES-256-GCM key via SLIP-0010 path `m/74'/2'/0'/0'`
2. External credentials are encrypted with this derived key using the existing encrypt/decrypt functions
3. The encrypted data is stored as a `SecretNode` in the metagraph
4. Only the derivation path and key version are stored in plain attributes
5. The seed phrase (or derived encryption key) is held only by the secret service — never in the database
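
The storage steps above can be sketched as a record type plus a cipher abstraction. Everything named here is an assumption made for illustration: the `SecretNode` fields (including `nonce`), the `CredentialCipher` trait, and the `seal` and `open` helpers are hypothetical and are not taken from `EncryptedDataSchema`. The trait stands in for AES-256-GCM keyed with the derived key, which this sketch does not implement.

```rust
/// Hypothetical shape of a stored secret: only the derivation path and key
/// version are readable; the credential itself exists only as ciphertext.
#[derive(Debug, Clone)]
struct SecretNode {
    derivation_path: String, // e.g. "m/74'/2'/0'/0'"
    key_version: u32,
    nonce: Vec<u8>, // assumed field: AES-GCM needs a unique nonce per encryption
    ciphertext: Vec<u8>,
}

/// Stand-in for AES-256-GCM keyed with the derived key. Implemented only
/// inside the secret service, so callers never hold key material.
trait CredentialCipher {
    /// Returns `(nonce, ciphertext)`.
    fn encrypt(&self, plaintext: &[u8]) -> (Vec<u8>, Vec<u8>);
    /// Returns `None` if authentication fails.
    fn decrypt(&self, nonce: &[u8], ciphertext: &[u8]) -> Option<Vec<u8>>;
}

/// Encrypt an external credential and wrap it with its plain attributes.
fn seal(
    cipher: &dyn CredentialCipher,
    derivation_path: &str,
    key_version: u32,
    credential: &[u8],
) -> SecretNode {
    let (nonce, ciphertext) = cipher.encrypt(credential);
    SecretNode {
        derivation_path: derivation_path.to_string(),
        key_version,
        nonce,
        ciphertext,
    }
}

/// Recover the credential; only the holder of the cipher can do this.
fn open(cipher: &dyn CredentialCipher, node: &SecretNode) -> Option<Vec<u8>> {
    cipher.decrypt(&node.nonce, &node.ciphertext)
}
```

Storing `key_version` next to the path is what would let a later rotation re-derive the right key for old records without the database ever holding key bytes.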
### Secret Service
The secret service is an irpc service (see [services.md](services.md)) that:
- Holds the master seed phrase in memory (never persisted to disk in plain text)
- Derives keys on demand via SLIP-0010/BIP39
- Encrypts/decrypts external credentials using derived keys
- Is the **only** component that ever sees the master seed
Workers request derived keys through the secret service's irpc protocol. They never see the seed or the encryption key.
### Derivation Path Conventions
| Path | Purpose |
|------|---------|
| `m/74'/0'/0'/0'` | Primary Ed25519 identity keypair (alknet auth) |
| `m/74'/0'/0'/1'` | Secondary identity keypair (device key) |
| `m/74'/0'/1'/0'` | SSH host key (for server identity) |
| `m/74'/1'/0'/{site_hash}'` | Site-specific password derivation |
| `m/74'/2'/0'/0'` | AES-256-GCM encryption key (for external credentials) |
| `m/44'/60'/0'/0/0` | Ethereum signing key (for smart contract interactions) |
The `74'` coin type is unallocated per SLIP-0044 and can be registered for alknet. The `0'`/`1'`/`2'` account levels divide identity, password, and encryption purposes.
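
The path table can be encoded so that callers name a purpose instead of assembling path strings by hand. This is a hedged sketch: `KeyPurpose` and `is_fully_hardened` are hypothetical names, and masking `site_hash` to 31 bits is an assumption about how a site hash would be reduced to a valid hardened index. The hardening check matters because SLIP-0010 Ed25519 supports only hardened child derivation, which the Ethereum path does not satisfy.

```rust
/// Purposes from the derivation path table (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq)]
enum KeyPurpose {
    PrimaryIdentity,
    DeviceIdentity,
    SshHostKey,
    SitePassword { site_hash: u32 },
    CredentialEncryption,
    EthereumSigning,
}

impl KeyPurpose {
    /// Returns the derivation path string for this purpose.
    fn path(&self) -> String {
        match self {
            KeyPurpose::PrimaryIdentity => "m/74'/0'/0'/0'".to_string(),
            KeyPurpose::DeviceIdentity => "m/74'/0'/0'/1'".to_string(),
            KeyPurpose::SshHostKey => "m/74'/0'/1'/0'".to_string(),
            // Assumption: the site hash is reduced to 31 bits so that it is
            // a valid hardened index.
            KeyPurpose::SitePassword { site_hash } => {
                format!("m/74'/1'/0'/{}'", site_hash & 0x7FFF_FFFF)
            }
            KeyPurpose::CredentialEncryption => "m/74'/2'/0'/0'".to_string(),
            KeyPurpose::EthereumSigning => "m/44'/60'/0'/0/0".to_string(),
        }
    }
}

/// True if every component after `m` is hardened (ends with `'`).
fn is_fully_hardened(path: &str) -> bool {
    path.split('/').skip(1).all(|component| component.ends_with('\''))
}
```

With this check, every `m/74'/...` path in the table passes, while `m/44'/60'/0'/0/0` does not and therefore has to be derived on a curve that allows non-hardened children.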
### Rust Crates Required
| Crate | Purpose |
|-------|---------|
| `bip39` | Mnemonic generation and seed derivation |
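| `ed25519-bip32` (IOHK) or `rust-bip32-ed25519` (BitBoxSwiss) | Ed25519 HD key derivation (both implement the BIP32-Ed25519 scheme, which is not identical to SLIP-0010, so the choice needs checking against the paths above) |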
| `aes-gcm` | AES-256-GCM encryption for external credentials |
| `sha2` | SHA-256 for key hashing |
| `irpc` | Service protocol definitions |
## Design Decisions (mapped from TypeScript ADRs)
| Original ADR | Decision | Rust adaptation |
|-------------|----------|-----------------|
| 002 | Metagraph over domain tables | Same 6-table schema, same graph type/node type/edge type model |
| 008 | Common columns pattern | `id`, `metadata`, `created_at`, `updated_at` on all tables |
| 019 | JSON text for schema columns | `serde_json::Value` stored as TEXT in SQLite |
| 020 | No nodeTypeId on nodes | Node type enforced at application layer |
| 022 | Composite FKs for node refs | `source_node_key` + `target_node_key` with cascade |
| 034 | ACL as metagraph | AclGraph is a metagraph instance |
| 038 | SQLite-first, PG removed | SQLite only via honker |
| 040 | System DB + tenant DB | Two `.db` files |
| 041 | Identity tables in storage | Same tables, same constraints |
| 045 | org_members authoritative | SQL table is source of truth, BelongsToEdge is derived |
| 047 | Honker event target | honker stream/notify as pub/sub mechanism |
| 049 | Identity schema restructuring | Separate credential tables, no Gitea columns |
| 050 | SHA-256 for API key hashing | Fast hash for high-entropy machine keys |
| 051 | BIP39/SLIP-0010 for HD key derivation | Seed phrase as root of trust for identity, encryption, and signing keys |
| 052 | Secrets as irpc service | Secret service holds seed, derives keys, encrypts/decrypts external creds |
| 053 | Event boundary discipline | Honker streams are domain events; call protocol is integration boundary |
## References
- `@alkdev/storage` — TypeScript metagraph, identity, ACL, encrypted data implementation
- `@alkdev/flowgraph` — TypeScript call-graph and operation-graph (maps to petgraph in Rust)
- `@alkdev/operations` — TypeScript OperationSpec, CallHandler, registry
- `/workspace/honker` — SQLite extension with pub/sub, streams, queues
- `/workspace/polyglot` — SQL transpiler (future: schema migration validation)
- `/workspace/petgraph` — Graph data structure library (used in alknet-flowgraph)
- `/workspace/jsonschema` — JSON Schema validation (Rust, replaces TypeBox at runtime)
- `/workspace/iroh/iroh-dns` — DNS resolver and endpoint info
- `/workspace/@alkdev/storage/docs/architecture/encrypted-data.md` — Original encrypted data design (TypeScript)
- `/workspace/research/event_sourcing/event_source_types.md` — Event-driven architecture patterns
- [services.md](services.md) — Service layer architecture (irpc protocols)
- [core.md](core.md) — Core overview, head/worker terminology