alknet/docs/architecture/overview.md
glm-5.1 d3633b7839 docs: complete Phase 0 architecture — spec updates, review fixes, and link portability
Update four existing specs (overview, server, napi-and-pubsub, call-protocol) to
reflect Phase 0 decisions: three-layer model, IdentityProvider, ForwardingPolicy,
OperationEnv, static/dynamic config split. Review all 9 Phase 0a ADRs (026-034)
for consistency. Fix 4 critical issues from architecture review: missing OQ-SVC-05
in open-questions.md, deprecated hub terminology, undefined AuthService and noq
terms. Replace inline OQ text with cross-references per format rules. Add
ConfigServiceImpl definition to configuration.md. Port absolute workspace paths
to project-relative links by copying referenced docs (feasibility, certbot,
fail2ban, event_source_types) into docs/research/.
2026-06-07 11:27:52 +00:00

status: reviewed
last_updated: 2026-06-07

Alknet Overview

Purpose

Alknet is a self-hostable SSH-based tunnel tool that provides VPN-like functionality without being a VPN protocol. It enables:

  • Private tunneling of services (Postgres, Redis, internal APIs) over SSH
  • Censorship circumvention — SSH over TLS on port 443 looks like HTTPS to DPI
  • NAT traversal — iroh transport allows peer-to-peer connections without public IPs or port forwarding
  • Service mesh connectivity — a lightweight transport layer for the pubsub/operations event system

The core insight: SSH tunnels work because SSH is fundamental infrastructure. Blocking it breaks the internet. Alknet makes SSH tunneling accessible through a simple CLI with pluggable transports.

Crate Structure

Alknet is decomposed into six crates with a strict acyclic dependency graph (ADR-027):

| Crate | Purpose | Exists Now? |
| --- | --- | --- |
| alknet-core | Transport, SSH, call protocol, config, auth types, OperationSpec, Interface trait | Yes |
| alknet-napi | Node.js native addon via napi-rs | Yes |
| alknet-secret | BIP39, SLIP-0010 HD key derivation, AES-256-GCM, SecretProtocol irpc service | Phase 2+ |
| alknet-storage | SQLite-backed metagraph, identity tables, ACL graph, honker, StorageProtocol | Phase 2+ |
| alknet-flowgraph | FlowGraph<N,E> over petgraph, operation graph, call graph | Phase 2+ |
| alknet (CLI) | Binary that assembles everything with feature flags | Yes |

The four library crates (core, secret, storage, flowgraph) are independent of each other. The CLI binary sits at the top of the dependency graph and wires concrete implementations together. alknet-storage supplies the data behind alknet-core's IdentityProvider trait without either crate depending on the other; the CLI binary provides the bridge between them.
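
The bridging arrangement can be sketched in a single file, with modules standing in for crates. Only the names Identity and IdentityProvider come from this document; the `resolve` method, `AccountRow`, `AccountStore`, and `StorageBridge` are hypothetical, and the real trait signature may differ.

```rust
// Modules stand in for crates; in the real workspace these are separate packages.
mod alknet_core {
    /// Reduced stand-in for the core Identity type (resources omitted for brevity).
    #[derive(Debug, Clone, PartialEq)]
    pub struct Identity {
        pub id: String,
        pub scopes: Vec<String>,
    }

    /// Assumed trait shape; the actual signature is not given in this document.
    pub trait IdentityProvider {
        fn resolve(&self, key_fingerprint: &str) -> Option<Identity>;
    }
}

mod alknet_storage {
    // Nothing in this module refers to alknet_core.
    use std::collections::HashMap;

    #[derive(Debug, Clone)]
    pub struct AccountRow {
        pub account_id: String,
        pub scopes: Vec<String>,
    }

    /// In-memory stand-in for the SQLite-backed identity tables.
    #[derive(Default)]
    pub struct AccountStore {
        by_fingerprint: HashMap<String, AccountRow>,
    }

    impl AccountStore {
        pub fn insert(&mut self, fingerprint: &str, row: AccountRow) {
            self.by_fingerprint.insert(fingerprint.to_string(), row);
        }

        pub fn lookup(&self, fingerprint: &str) -> Option<&AccountRow> {
            self.by_fingerprint.get(fingerprint)
        }
    }
}

mod alknet_cli {
    // The binary is the only place that sees both sides.
    use super::alknet_core::{Identity, IdentityProvider};
    use super::alknet_storage::AccountStore;

    /// Adapter owned by the binary: wraps the store and implements the core trait.
    pub struct StorageBridge {
        pub store: AccountStore,
    }

    impl IdentityProvider for StorageBridge {
        fn resolve(&self, key_fingerprint: &str) -> Option<Identity> {
            self.store.lookup(key_fingerprint).map(|row| Identity {
                id: row.account_id.clone(),
                scopes: row.scopes.clone(),
            })
        }
    }
}
```

Because the adapter type lives in the binary, neither library module needs to name the other, which is what keeps the dependency graph acyclic.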

irpc is behind a feature flag in alknet-core. Nodes that only do SSH tunneling don't need the service layer overhead.
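
As a rough illustration of what feature gating looks like in source, the sketch below compiles a service module only when a Cargo feature named `irpc` is enabled. The module name `service` and both function names are hypothetical; only the feature name is taken from this document.

```rust
/// Compiled only when the crate is built with the `irpc` feature.
#[cfg(feature = "irpc")]
pub mod service {
    /// Placeholder for a service-layer entry point (hypothetical).
    pub fn describe() -> &'static str {
        "irpc service layer enabled"
    }
}

/// Variant used when the service layer is compiled in.
#[cfg(feature = "irpc")]
pub fn service_layer() -> Option<&'static str> {
    Some(service::describe())
}

/// Variant used by tunnel-only builds: no service code is linked at all.
#[cfg(not(feature = "irpc"))]
pub fn service_layer() -> Option<&'static str> {
    None
}
```

Callers can branch on `service_layer()` at runtime, while the gated module itself contributes nothing to a build without the feature.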

Three-Layer Model

Alknet uses a three-layer model (ADR-026):

| Layer | Responsibility | Examples |
| --- | --- | --- |
| Layer 1: Transport | Produces byte streams (AsyncRead + AsyncWrite + Unpin + Send) | TCP, TLS, iroh, DNS (future), WebTransport (future) |
| Layer 2: Interface | Consumes a transport stream and produces call protocol sessions | SSH (handshake + auth + channel multiplexing), raw framing (length-prefix + JSON) |
| Layer 3: Protocol | Carries semantics: operation registry, service calls, events | Call protocol, OperationEnv, operation dispatch |

SSH is an interface, not a transport. The three-layer model enables DNS control channels (DNS transport + raw framing), local service mesh (TCP + raw framing), and browser direct call protocol (WebTransport + raw framing) without wrapping SSH inside those transports.

A connection is always a (Transport, Interface) pair. The protocol layer is agnostic to both.
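
A minimal sketch of the (Transport, Interface) pairing follows, using blocking `std::io` traits in place of AsyncRead + AsyncWrite so that it needs no dependencies. The trait shapes, `MemoryTransport`, and the framing details (4-byte big-endian length prefix) are assumptions made for illustration, not the project's actual definitions.

```rust
use std::io::{self, Cursor, Read, Write};

/// Layer 1 stand-in: anything that yields a duplex byte stream.
pub trait Transport {
    type Stream: Read + Write;
    fn connect(&self) -> io::Result<Self::Stream>;
}

/// Layer 2 stand-in: turns a byte stream into discrete messages.
pub trait Interface<S: Read + Write> {
    fn send(&self, stream: &mut S, payload: &str) -> io::Result<()>;
    fn recv(&self, stream: &mut S) -> io::Result<String>;
}

/// In-memory transport, used here instead of TCP, TLS, or iroh.
pub struct MemoryTransport;

impl Transport for MemoryTransport {
    type Stream = Cursor<Vec<u8>>;

    fn connect(&self) -> io::Result<Self::Stream> {
        Ok(Cursor::new(Vec::new()))
    }
}

/// Raw framing: a length prefix followed by a JSON text payload.
/// The prefix width and byte order are assumed, not specified by the source.
pub struct RawFraming;

impl<S: Read + Write> Interface<S> for RawFraming {
    fn send(&self, stream: &mut S, payload: &str) -> io::Result<()> {
        if payload.len() > u32::MAX as usize {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
        }
        stream.write_all(&(payload.len() as u32).to_be_bytes())?;
        stream.write_all(payload.as_bytes())
    }

    fn recv(&self, stream: &mut S) -> io::Result<String> {
        let mut len_buf = [0u8; 4];
        stream.read_exact(&mut len_buf)?;
        let mut body = vec![0u8; u32::from_be_bytes(len_buf) as usize];
        stream.read_exact(&mut body)?;
        String::from_utf8(body).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
    }
}
```

`RawFraming` is written against any `Read + Write` stream, so swapping `MemoryTransport` for another transport would not change the framing code. That is the property the layer split is meant to provide.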

Service Layer

The irpc service layer decomposes alknet's core responsibilities into independently testable, deployable, and replaceable components (ADR-033, services.md):

  • Auth (AuthProtocol) — verify identities, check credentials
  • Secret (SecretProtocol) — derive keys, encrypt/decrypt
  • Config (ConfigProtocol) — dynamic config reload
  • Storage (StorageProtocol) — graph CRUD, metagraph operations

OperationEnv is the universal composition mechanism. A handler receives context.env.invoke("secrets", "derive", input) and doesn't know whether the dispatch is local (direct function call), in-cluster (irpc service), or cross-node (call protocol EventEnvelope). Three dispatch paths, one handler-facing API.
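
The single handler-facing API can be sketched as a routing table keyed by (namespace, operation). This is an illustrative model only: the `Route` enum, the `Handler` alias, and the use of plain strings for JSON input and output are assumptions, and only the local path is executable here.

```rust
use std::collections::HashMap;

/// Hypothetical handler signature: JSON text in, JSON text or an error message out.
pub type Handler = Box<dyn Fn(&str) -> Result<String, String> + Send + Sync>;

/// Where a (namespace, op) pair is served. Only Local runs in this sketch.
pub enum Route {
    Local(Handler),
    InCluster { service: String },
    CrossNode { node: String },
}

#[derive(Default)]
pub struct OperationEnv {
    routes: HashMap<(String, String), Route>,
}

impl OperationEnv {
    pub fn register(&mut self, namespace: &str, op: &str, route: Route) {
        self.routes
            .insert((namespace.to_string(), op.to_string()), route);
    }

    /// One entry point for callers, regardless of which dispatch path serves the call.
    pub fn invoke(&self, namespace: &str, op: &str, input: &str) -> Result<String, String> {
        let key = (namespace.to_string(), op.to_string());
        match self.routes.get(&key) {
            Some(Route::Local(handler)) => handler(input),
            Some(Route::InCluster { service }) => Err(format!(
                "irpc call to '{}' is not wired up in this sketch",
                service
            )),
            Some(Route::CrossNode { node }) => Err(format!(
                "call protocol dispatch to '{}' is not wired up in this sketch",
                node
            )),
            None => Err(format!("unknown operation {}/{}", namespace, op)),
        }
    }
}
```

A caller writes `env.invoke("secrets", "derive", input)` in all three cases; moving an operation from local to remote changes only how it was registered.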

Phase boundary: Phase 1 ships ConfigIdentityProvider and ConfigServiceImpl, both ArcSwap-backed, as the only auth and config implementations. The irpc service protocols (AuthProtocol, SecretProtocol, etc.) and the production deployment topology (multi-node with StorageIdentityProvider) are specified as contracts now, but their implementation is deferred to Phase 2+. Application services (DockerService, NodeService, agent services) are downstream concerns that build on top of the call protocol and OperationEnv.

Identity

Identity struct and IdentityProvider trait are core types in alknet-core (ADR-029, identity.md):

use std::collections::HashMap;

pub struct Identity {
    pub id: String,          // Fingerprint (config auth) or account UUID (database auth)
    pub scopes: Vec<String>, // Authorization scope strings
    pub resources: HashMap<String, Vec<String>>, // Resource-level authorization
}

IdentityProvider decouples alknet-core from identity storage. Phase 1 ships ConfigIdentityProvider, which reads the auth section of the ArcSwap-held DynamicConfig. StorageIdentityProvider (Phase 2+, backed by SQLite) replaces it for production deployments. Both produce the same Identity result.
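
To make the contract concrete, here is a hedged sketch of a config-backed provider and two authorization helpers. The `resolve` method, the `has_scope` and `can_access` helpers, and the `authorized` map are hypothetical; a plain HashMap stands in for the ArcSwap-held auth config.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

impl Identity {
    /// Hypothetical helper: true if the identity carries the given scope string.
    pub fn has_scope(&self, scope: &str) -> bool {
        self.scopes.iter().any(|s| s == scope)
    }

    /// Hypothetical helper: true if `name` is listed under resource kind `kind`.
    pub fn can_access(&self, kind: &str, name: &str) -> bool {
        self.resources
            .get(kind)
            .map_or(false, |names| names.iter().any(|n| n == name))
    }
}

/// Assumed trait shape; the real signature is not given in this document.
pub trait IdentityProvider {
    fn resolve(&self, fingerprint: &str) -> Option<Identity>;
}

/// Config-backed provider: maps a key fingerprint to its scope strings.
pub struct ConfigIdentityProvider {
    pub authorized: HashMap<String, Vec<String>>,
}

impl IdentityProvider for ConfigIdentityProvider {
    fn resolve(&self, fingerprint: &str) -> Option<Identity> {
        self.authorized.get(fingerprint).map(|scopes| Identity {
            // With config auth, the identity id is the key fingerprint itself.
            id: fingerprint.to_string(),
            scopes: scopes.clone(),
            resources: HashMap::new(),
        })
    }
}
```

A storage-backed provider would implement the same trait and return an account UUID in `id`, leaving callers that only inspect the resulting Identity unchanged.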

Exports

Binary: alknet

A single binary with subcommands:

alknet serve     — Start the server (accepts SSH connections)
alknet connect  — Start the client (opens SSH session, exposes SOCKS5/port-forwards)

Library: alknet-core

The alknet-core crate exports the pluggable components for embedding or programmatic use:

  • Transport trait — produces a duplex stream for SSH to run over
  • TcpTransport — direct TCP connection
  • TlsTransport — TCP + tokio-rustls TLS
  • IrohTransport — iroh QUIC P2P connection
  • Interface trait — consumes transport stream, produces call protocol session
  • Socks5Server — local SOCKS5 proxy that forwards through SSH channels
  • PortForwarder — manages local/remote port forwards
  • ServerHandler — russh server handler with configurable auth and channel policies
  • Identity / IdentityProvider — core identity types (ADR-029)
  • OperationSpec — operation registration for call protocol (ADR-025)
  • ConnectOptions / ServeOptions — programmatic configuration structs
  • StaticConfig / DynamicConfig — static/immutable vs. hot-reloadable config (ADR-030)
  • ConfigReloadHandle — programmatic reload of dynamic config
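
The static/dynamic split and the reload handle listed above can be approximated with the standard library alone. In this sketch `RwLock<Arc<...>>` is a stand-in for ArcSwap, and every field name and method (`load`, `reload`) is an assumption rather than the crate's actual API.

```rust
use std::sync::{Arc, RwLock};

/// Fixed at startup (illustrative field).
#[derive(Debug, Clone)]
pub struct StaticConfig {
    pub listen_addr: String,
}

/// Replaceable at runtime (illustrative fields).
#[derive(Debug, Clone, PartialEq)]
pub struct DynamicConfig {
    pub authorized_keys: Vec<String>,
    pub max_auth_attempts: u32,
}

/// std-only stand-in for ArcSwap<DynamicConfig>: readers take an Arc snapshot,
/// writers replace the pointer. ArcSwap offers the same pattern without a lock.
#[derive(Clone)]
pub struct ConfigReloadHandle {
    current: Arc<RwLock<Arc<DynamicConfig>>>,
}

impl ConfigReloadHandle {
    pub fn new(initial: DynamicConfig) -> Self {
        ConfigReloadHandle {
            current: Arc::new(RwLock::new(Arc::new(initial))),
        }
    }

    /// Returns a snapshot that stays valid even if a reload happens afterwards.
    pub fn load(&self) -> Arc<DynamicConfig> {
        // A poisoned lock panics here; acceptable for a sketch.
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Swaps in a new config; sessions holding an old snapshot are unaffected.
    pub fn reload(&self, next: DynamicConfig) {
        *self.current.write().unwrap() = Arc::new(next);
    }
}
```

Handing out `Arc` snapshots means an in-flight authentication check sees one consistent config even if a reload lands halfway through it.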

Dependencies

| Dependency | Purpose | Crate | Feature-gated |
| --- | --- | --- | --- |
| russh | SSH client & server | core | No (core) |
| tokio | Async runtime | core | No (core) |
| tokio-rustls | TLS wrapping | core | Yes (tls) |
| rustls | TLS implementation | core | Yes (tls) |
| rustls-acme | ACME/Let's Encrypt auto-cert | core | Yes (acme) |
| iroh | P2P QUIC transport | core | Yes (iroh) |
| irpc | Streaming RPC service layer | core | Yes (irpc) |
| arc-swap | Lock-free dynamic config | core | No (core) |
| serde | Serialization | core | No (core) |
| clap | CLI argument parsing | CLI | No (CLI) |
| toml | TOML config file | CLI | No (CLI) |
| tracing | Structured logging | core | No (core) |
| anyhow / thiserror | Error handling | core | No (core) |
| bip39 | Mnemonic generation | secret | No (secret) |
| ed25519-bip32 | HD key derivation | secret | No (secret) |
| aes-gcm | AES-256-GCM encryption | secret | No (secret) |
| rusqlite | SQLite (via honker) | storage | No (storage) |
| honker | Event-sourced storage | storage | No (storage) |
| petgraph | Graph data structure | storage, flowgraph | No |
| jsonschema | JSON Schema validation | storage, flowgraph | No |

Note: tun-rs is no longer a dependency. TUN support is deferred in favor of the external tun2proxy tool (ADR-014).

Architecture Constraints

  1. SSH runs over transport, not alongside — The transport layer produces a single AsyncRead+AsyncWrite+Unpin+Send stream. SSH runs over that stream via russh::client::connect_stream() / russh::server::run_stream(). The SSH layer never knows what transport it's on. (ADR-001, ADR-004)

  2. Three-layer model: Transport, Interface, Protocol — SSH is an interface (Layer 2), not a transport (Layer 1). A connection is always a (Transport, Interface) pair. The call protocol (Layer 3) is agnostic to both. This enables DNS control channels, raw framing, and WebTransport direct call protocol without wrapping SSH inside those transports. (ADR-026)

  3. SOCKS5 is the primary client interface — Port forwarding is built on top of SOCKS5-like channel management. For VPN-like "route all traffic" behavior, users run tun2proxy alongside alknet's SOCKS5 proxy. TUN is not in the project scope. (ADR-005, ADR-014)

  4. No logging of tunnel destinations — The server logs auth attempts and connections (for fail2ban) but does not log channel_open_direct_tcpip destinations, DNS lookups, or bytes transferred. (ADR-006, ADR-013)

  5. Programmatic-first API — Configuration via CLI flags, library API structs (ConnectOptions, ServeOptions), and environment variables. No ~/.ssh/config parsing. Optional --config TOML file for reproducible deployments. (ADR-011, ADR-030)

  6. Feature flags control transport inclusion: tls, iroh, acme, and irpc are feature-gated so the base install is lean. Users opt in to heavier dependencies.

  7. Authentication is key-based and unified — Ed25519 public key (default) and OpenSSH certificate authority. Same key material for SSH and token auth. Identity resolves through IdentityProvider trait, decoupling core from identity storage. (ADR-012, ADR-023, ADR-029)

  8. NAPI exposes both connect() and serve() — The napi-rs wrapper provides client and server functionality, using napi-rs as the FFI bridge. The NAPI layer is transport-agnostic and not tied to pubsub. (ADR-015, ADR-016)

  9. Static/dynamic config split — Transport-level settings (listen address, TLS certs) are immutable after startup. Auth, forwarding policy, and rate limits are hot-reloadable via ArcSwap<DynamicConfig>. (ADR-030)

  10. Forwarding policy enforced before proxy spawn — Each channel_open_direct_tcpip is checked against ForwardingPolicy before a TCP connection is made. Default-allow preserves current behavior. (ADR-031)

  11. OperationEnv as universal composition mechanism — Handlers call context.env.invoke(namespace, op, input) regardless of dispatch path (local, irpc service, remote call protocol). (ADR-033)

  12. Event boundary discipline — Domain events (Honker streams) stay within the owning service. irpc calls are synchronous and in-cluster. Call protocol EventEnvelope is the only thing that crosses node boundaries. (ADR-032)

  13. Error handling follows a consistent layered pattern — Transport and auth errors cause reconnection (client, with exponential backoff) or connection rejection (server). Channel-level errors (target unreachable, proxy failure) close the individual channel without killing the session. Library API errors propagate via anyhow::Result / thiserror types. CLI reports errors to stderr with appropriate exit codes. NAPI errors are marshalled as JavaScript exceptions.
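
Constraint 10 above (checking each forwarding request against ForwardingPolicy before any TCP connection is made) can be sketched as a first-match rule list with a default-allow fallback. The `Rule` shape, the `check` signature, and the first-match semantics are assumptions; per-transport matching is left out to keep the example short.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Decision {
    Allow,
    Deny,
}

/// One rule; `None` in a field means "match anything". The shape is hypothetical.
#[derive(Debug, Clone)]
pub struct Rule {
    pub identity: Option<String>,
    pub host: Option<String>,
    pub port: Option<u16>,
    pub decision: Decision,
}

pub struct ForwardingPolicy {
    pub rules: Vec<Rule>,
    pub fallback: Decision,
}

impl ForwardingPolicy {
    /// No rules and an Allow fallback, mirroring the default-allow behavior.
    pub fn allow_all() -> Self {
        ForwardingPolicy {
            rules: Vec::new(),
            fallback: Decision::Allow,
        }
    }

    /// First matching rule wins; if none match, the fallback applies.
    /// Intended to run before the outbound TCP connection is attempted.
    pub fn check(&self, identity: &str, host: &str, port: u16) -> Decision {
        self.rules
            .iter()
            .find(|r| {
                r.identity.as_deref().map_or(true, |i| i == identity)
                    && r.host.as_deref().map_or(true, |h| h == host)
                    && r.port.map_or(true, |p| p == port)
            })
            .map_or(self.fallback, |r| r.decision)
    }
}
```

A server handler would call `check` when a direct-tcpip channel open arrives and reject the channel on `Deny`, so a denied destination never results in an outbound socket.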

Design Decisions

| ADR | Decision | Summary |
| --- | --- | --- |
| 001 | Pluggable transport | Transport trait produces AsyncRead+AsyncWrite+Unpin+Send, SSH consumes it |
| 002 | TUN shim separate | Superseded: TUN is deferred, use tun2proxy (ADR-014) |
| 003 | iroh stream join | tokio::io::join(recv, send) combines QUIC halves |
| 004 | SSH over transport | SSH never accesses TCP/iroh/TLS directly |
| 005 | SOCKS5 first | SOCKS5 is the primary interface; TUN is external (tun2proxy) |
| 006 | No logging of tunnel destinations | Server logs auth and connections, not destinations |
| 007 | NAPI single stream | NAPI exposes duplex streams, not SSH multiplexing |
| 008 | ACME/Let's Encrypt | Auto-provision TLS certs, domain and IP paths |
| 009 | Default iroh relay | n0 relay by default, --iroh-relay override |
| 010 | Transport chaining | --proxy works with all transports natively |
| 011 | Programmatic-first | No SSH config files; options are structs, env vars, CLI flags (amended by ADR-030 for optional TOML) |
| 012 | Key + cert-authority | Ed25519 keys + OpenSSH CA; no password auth |
| 013 | Fail2ban-friendly | Structured auth logs + built-in rate limiting |
| 014 | Defer TUN | Use tun2proxy for VPN-like behavior; no alknet-tun binary |
| 015 | napi-rs | Standard Node.js native addon tooling |
| 016 | connect + serve | NAPI exposes both client and server from the start |
| 017 | Stealth mode | Protocol multiplexing on port 443 |
| 018 | Control channel | Reserved alknet-control destination for pubsub |
| 019 | Proxy dual semantics | --proxy routes transport on client, data on server |
| 023 | Unified auth | Same key material for SSH and token auth |
| 024 | Bidirectional call protocol | Both sides can initiate calls |
| 025 | Handler/spec separation | Downstream registers operations without modifying core |
| 026 | Three-layer model | SSH is Layer 2, not Layer 1 |
| 027 | Crate decomposition | Six crates, acyclic deps, feature-gated irpc |
| 028 | Auth as irpc service | IdentityProvider is the contract, irpc is one backend |
| 029 | Identity as core type | Identity and IdentityProvider in alknet-core |
| 030 | Static/dynamic config | ArcSwap for hot-reloadable auth and forwarding |
| 031 | Forwarding policy | Per-identity, per-destination, per-transport rules |
| 032 | Event boundary | Domain events never cross service boundaries |
| 033 | OperationEnv | Universal composition, three dispatch paths |
| 034 | Head/worker | Replaces hub/spoke terminology |

Open Questions

See open-questions.md for all open and resolved questions. Key open questions: OQ-15 (QUIC coexistence), OQ-19 (WebTransport TLS), OQ-20 (worker registration), OQ-IF-01 (Interface session / EventEnvelope relationship).

References