---
status: draft
last_updated: 2026-06-22-17
---

# Endpoint

ALPN router, handler registry, connection accept loops, multi-connectivity, and graceful shutdown. See [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) for the full rationale.

## AlknetEndpoint

The central runtime type. Manages one or more QUIC connection sources, each feeding into the same ALPN router.

```rust
pub struct AlknetEndpoint {
    // QUIC connection sources: both optional, both can be active simultaneously
    quinn: Option<quinn::Endpoint>,  // Public QUIC+TLS
    iroh: Option<iroh::Endpoint>,    // P2P relay-assisted

    handlers: Arc<HandlerRegistry>,
    dynamic: Arc>,
    identity_provider: Arc<dyn IdentityProvider>,
    shutdown: watch::Receiver<bool>,
}
```

### Why multiple connection sources?

A node can be reachable through different paths depending on its network context:

| Source | Requires | Identity source | Use case |
|--------|----------|-----------------|----------|
| `quinn::Endpoint` | Public IP, TLS cert | TLS cert (network), SSH key (auth) | VPS, replicators, service hosts |
| `iroh::Endpoint` | Relay access | NodeId (Ed25519) | Home servers, NAT, IoT |

These are not interchangeable transports; they are **complementary connectivity modes**. A node behind NAT that also has a public IP can use both simultaneously. Both produce QUIC connections that dispatch through the same `HandlerRegistry` by ALPN string.

### TCP is NOT an endpoint concern

Bare TCP (SSH over port 22) does not use QUIC or ALPN. In the new model, TCP access is handled by individual handlers: the SSH handler can listen on a TCP socket independently. This is a handler-specific concern, not a core endpoint concern.

The reference implementation's TCP transport (`alknet-main/crates/alknet-core/src/transport/tcp.rs`) is SSH-specific. It doesn't generalize to the ALPN model.

## HandlerRegistry

Maps ALPN byte strings to `ProtocolHandler` instances.

```rust
pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
}

impl HandlerRegistry {
    pub fn new() -> Self;
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>);
    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>>;
    pub fn alpn_strings(&self) -> Vec<Vec<u8>>;
}
```

- `register()`: Insert a handler. Panics if the ALPN is already registered.
- `get()`: Look up a handler by ALPN string.
- `alpn_strings()`: Return all registered ALPN strings. Used to build the TLS `ServerConfig` (for quinn) and the ALPN list (for iroh).

Registration is static at startup (see [OQ-04](../../open-questions.md)). The CLI builds a `HandlerRegistry`, inserts all handlers, and passes it to `AlknetEndpoint::new()`.

### ALPN strings in TLS ServerConfig and iroh endpoint

The quinn endpoint's `rustls::ServerConfig` ALPN list is set from `registry.alpn_strings()` at construction time. The iroh endpoint's ALPN list is similarly derived. Both connection sources advertise the same set of ALPNs.

## Accept Loops

Each active connection source runs its own accept loop. All loops dispatch through the same `HandlerRegistry`:

### Quinn accept loop (public QUIC+TLS)

```
loop {
    tokio::select! {
        incoming = quinn_endpoint.accept() => {
            let connection = incoming.await;  // TLS handshake + ALPN negotiation
            match connection {
                Ok(conn) => dispatch(conn),
                Err(e) => { /* log TLS handshake failure, continue */ }
            }
        }
        _ = shutdown.changed() => break,
    }
}
```

### iroh accept loop (P2P relay-assisted)

iroh's `Endpoint` natively supports ALPN negotiation (step 4 of its connection establishment). The `iroh::Endpoint::set_alpns()` method configures which ALPNs the endpoint advertises, the same mechanism iroh's own `Router` uses internally with its `ProtocolMap`.

We use `iroh::Endpoint` directly (not iroh's `Router`) because our `HandlerRegistry` is shared between quinn and iroh connection sources, and our `AuthContext` construction differs per source.

Our accept loop replaces iroh's `Router` accept loop with our own dispatch:

```
loop {
    tokio::select! {
        incoming = iroh_endpoint.accept() => {
            // incoming is an iroh::endpoint::Incoming
            let accepting = incoming.accept();  // Accepting state
            let alpn = accepting.alpn().await;  // ALPN from TLS handshake
            match alpn {
                Ok(alpn) => dispatch(alpn, accepting),
                Err(e) => { /* log handshake failure, continue */ }
            }
        }
        _ = shutdown.changed() => break,
    }
}
```

See iroh's `protocol.rs` (`/workspace/iroh/iroh/src/protocol.rs`) for the reference implementation of this pattern: `handle_connection()` reads the ALPN, looks up the handler in `ProtocolMap`, and calls `handler.accept(connection)`. Our dispatch is the same pattern with our `HandlerRegistry`.

### Dispatch function (shared)

```
fn dispatch(connection) {
    let alpn = connection.alpn();
    match handlers.get(alpn) {
        Some(handler) => {
            let auth = AuthContext::from_connection(&connection);
            let conn = Connection::from_quinn(connection);  // or from_iroh
            tokio::spawn(async move {
                if let Err(e) = handler.handle(conn, &auth).await {
                    // log error, connection closes
                }
            });
        }
        None => connection.close(0u32, "no handler"),
    }
}
```

### What the accept loops do NOT do

- **No byte-peeking**: ALPN negotiation handles protocol detection. The old `stealth` module's `detect_protocol()` is unnecessary.
- **No per-handler accept loops**: The old `ListenerConfig` enum had Stream/Http/Dns variants with different accept paths. ALPN unifies this.
- **No SSH-specific logic**: The accept loop is ALPN-agnostic. It doesn't know or care what protocol the handler speaks.

## Stealth Mode as ALPN Dispatch

The reference implementation's "stealth mode" is SSH-over-TLS on port 443. The TLS cert is **camouflage**, not identity; it makes the port look like a web server to port scanners and DPI systems. Non-SSH traffic gets a fake nginx 404.

In the ALPN model, this maps to:

- The `alknet/http` handler is registered for standard HTTP ALPNs (`h2`, `http/1.1`)
- The HTTP handler can serve a decoy website or a fake 404
- Real services use `alknet/ssh`, `alknet/call`, etc.
- Clients that don't offer alknet ALPNs get the HTTP handler, just like port scanners in stealth mode

No byte-peeking, no `ProtocolDetection` enum. ALPN does the routing.

## Network Identity vs Auth Identity

A key distinction that the ALPN model makes explicit:

| Layer | Purpose | Mechanism |
|-------|---------|-----------|
| **Network identity** | How a client finds and verifies the node | X.509 cert (domain) or RFC 7250 raw key (Ed25519) or iroh NodeId |
| **Auth identity** | Who the peer is and what they can do | SSH key, API token, certificate (handlers) |

The TLS cert (or raw public key, or NodeId) is the node's network-facing identity. It's NOT the node's authentication identity. Auth happens inside the handler via `IdentityProvider`.

This matches the reference implementation: the TLS cert encrypts and camouflages, but SSH key exchange handles the actual authentication.

## RFC 7250: Raw Public Keys in TLS

RFC 7250 raw public keys are the **default TLS identity mode** for most alknet nodes. They eliminate the need for domain names, CAs, and certificate renewal: the Ed25519 public key IS the node's identity.

iroh uses this model with its `NodeId`. The implementation is ~100 lines (see `iroh/iroh/src/tls/resolver.rs`): take an Ed25519 key, wrap its SPKI public key as a `CertificateDer`, tell rustls `only_raw_public_keys() -> true`. No X.509, no CAs, no domain names, no cert renewal.

Key implications:

- **Default for alknet-native clients**: SSH, git, and alknet-native clients all work with raw Ed25519 keys out of the box. The same key type used for SSH auth can serve as the TLS identity. This is the most common deployment mode.
- **No domain required**: A node without a domain name uses raw public keys for the quinn path: key-based identity with direct QUIC over UDP.
- **Key = identity**: The Ed25519 public key IS the node's identity. No CA trust chain, no cert expiry. The key can be derived from alknet-vault.
- **X.509 is for domain-hosted services**: Domain-facing identity (replicators, public services, browsers) uses X.509 certs. This is a separate use case, not the default.
- **Browser limitation**: Browsers don't support RFC 7250. For browser/WebTransport clients, X.509 certs are needed. For all other clients, raw public keys work fine.

The quinn and iroh paths share the same key-based identity model via RFC 7250. They're distinguished by **connection establishment** (direct UDP vs relay-assisted), not by identity:

| Path | Connection establishment | Default identity | Alternative identity |
|------|------------------------|-----------------|---------------------|
| quinn | Direct UDP, public IP | RFC 7250 raw key (most nodes) | X.509 cert (domain-hosted, browsers) |
| iroh | Relay-assisted P2P | RFC 7250 raw key (NodeId) | N/A |

## TLS Identity

TLS identity in alknet has two distinct use cases, each with a different trust model and provisioning mechanism. See OQ-12 for the full rationale.

### Use case 1: P2P / Key-based identity (default)

Most alknet nodes use RFC 7250 raw Ed25519 public keys for TLS identity. No domain name, no CA, no certificate renewal. The Ed25519 public key IS the node's identity, the same key model as iroh's `NodeId`, but for direct QUIC connections.

`TlsIdentity::RawKey` in `StaticConfig` configures this mode. The endpoint builds a `rustls::ServerConfig` with `only_raw_public_keys() -> true` and a `ResolvesServerCert` that generates the certificate on-the-fly from the key, exactly as iroh does (see `iroh/iroh/src/tls/resolver.rs`).

This mode works natively with SSH auth (same key type) and git (SSH key-based auth).
It is the default for alknet-native clients. **Browser/WebTransport clients do not support RFC 7250**; they require X.509 certificates.

### Use case 2: Domain-hosted services

Nodes that serve browser/WebTransport clients, or nodes with public domain names, use X.509 certificates. This has two sub-cases:

- **Manual**: Provide cert/key file paths via `TlsIdentity::X509`. The endpoint loads them at startup and builds a standard `rustls::ServerConfig`.
- **ACME auto-provisioning**: Let's Encrypt via `rustls-acme`. The reverse-proxy project (`/workspace/@alkdev/reverse-proxy`) demonstrates the complete pattern: per-listener ACME state machine, `ResolvesServerCertAcme` rustls integration, TLS-ALPN-01 challenge handling, automatic renewal. This is a proven, solved implementation pattern. It will be adapted to alknet's `AlknetEndpoint` context as an additional `TlsIdentity` variant or `ResolvesServerCert` implementation.

`TlsIdentity::SelfSigned` is for development only: the endpoint generates a self-signed cert on startup. External clients will not trust it.

### iroh endpoint identity

The iroh endpoint does not need TLS certificate configuration; it uses `NodeId` (Ed25519) for identity, which is RFC 7250 raw key identity built into the iroh endpoint.

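
The identity modes described above can be summarized in a small sketch. The variant names `RawKey`, `X509`, and `SelfSigned` come from this document; the field names, the file paths, and the `supports_browsers()` and `dev_only()` helpers are hypothetical and only illustrate how the chosen mode determines client compatibility. This is not the actual `StaticConfig` layout.

```rust
use std::path::PathBuf;

/// Sketch of the TLS identity modes named in this document.
/// Field names are assumptions, not the real configuration schema.
#[derive(Debug, Clone, PartialEq)]
pub enum TlsIdentity {
    /// RFC 7250 raw Ed25519 public key (the default mode).
    RawKey { secret_key_path: PathBuf },
    /// X.509 certificate and private key loaded from disk.
    X509 { cert_path: PathBuf, key_path: PathBuf },
    /// Self-signed certificate generated at startup (development only).
    SelfSigned,
}

impl TlsIdentity {
    /// Browsers and WebTransport clients need a trusted X.509 certificate.
    pub fn supports_browsers(&self) -> bool {
        matches!(self, TlsIdentity::X509 { .. })
    }

    /// Only the self-signed mode is restricted to local development.
    pub fn dev_only(&self) -> bool {
        matches!(self, TlsIdentity::SelfSigned)
    }
}

fn main() {
    let modes = [
        TlsIdentity::RawKey { secret_key_path: PathBuf::from("node.key") },
        TlsIdentity::X509 {
            cert_path: PathBuf::from("cert.pem"),
            key_path: PathBuf::from("key.pem"),
        },
        TlsIdentity::SelfSigned,
    ];
    for mode in &modes {
        println!(
            "{:?}: browsers={} dev_only={}",
            mode,
            mode.supports_browsers(),
            mode.dev_only()
        );
    }
}
```

A CLI could use a check like `supports_browsers()` to warn at startup when a WebTransport-facing handler is registered on a node configured with a raw key.
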
### Identity model comparison

| Path | Identity model | Client compatibility | Use case |
|------|---------------|---------------------|----------|
| quinn + `TlsIdentity::RawKey` | RFC 7250 Ed25519 raw key | alknet-native, SSH, git | Personal nodes, P2P, most deployments |
| quinn + `TlsIdentity::X509` | X.509 domain certificate | All clients including browsers | Relays, public services, WebTransport |
| quinn + `TlsIdentity::SelfSigned` | X.509 self-signed cert | None (dev only) | Local development |
| iroh | NodeId (Ed25519, RFC 7250 built-in) | alknet-native, iroh clients | NAT traversal, home servers |

## Graceful Shutdown

```rust
impl AlknetEndpoint {
    pub fn shutdown_sender(&self) -> watch::Sender<bool>;
    pub async fn shutdown(&self) -> Result<(), EndpointError>;
}
```

- `shutdown_sender()` returns a clone of the shutdown channel sender. Call `send(true)` to signal shutdown.
- `shutdown()` signals all accept loops to stop, waits for in-flight connections with a drain timeout (default: 2 seconds), then forcefully closes remaining connections.
- SIGTERM/SIGINT are wired to the shutdown channel by the CLI binary.

The drain timeout is configurable via `StaticConfig::drain_timeout`.

## Error Handling

### EndpointError

Fatal errors that prevent the endpoint from starting or continuing.

```rust
pub enum EndpointError {
    BindFailed(io::Error),
    TlsConfig(io::Error),
    HandlerNotFound(Vec<u8>),  // ALPN string with no registered handler
}
```

### HandlerError

Non-fatal errors within a handler. See [core-types.md](core-types.md) for details.

### Accept loop errors

- **TLS handshake failure**: Log and continue. The client may have offered no compatible ALPN, or the cert may be untrusted.
- **Handler panic**: Caught by tokio's task isolation. The connection is dropped. Other connections continue.
- **Connection-level errors** (quinn/iroh `ConnectionError`): Log and continue. The accept loop keeps running.

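
To make the `HandlerNotFound` case concrete, the following std-only sketch pairs a registry lookup with that error variant. It rests on several assumptions: the `ProtocolHandler` trait is reduced to a single hypothetical `alpn()` method (the single-argument `register()` signature suggests handlers report their own ALPN, but the design does not say so), and `require()` and `EchoHandler` are illustrative names that do not appear in the design. The real trait also carries an async `handle` method, which is omitted here.

```rust
use std::collections::HashMap;
use std::fmt;
use std::sync::Arc;

/// Reduced stand-in for the real handler trait (assumption: a handler
/// reports its own ALPN, so `register` needs no separate key argument).
pub trait ProtocolHandler: Send + Sync {
    fn alpn(&self) -> &'static [u8];
}

#[derive(Debug)]
pub enum EndpointError {
    /// ALPN string with no registered handler.
    HandlerNotFound(Vec<u8>),
}

impl fmt::Display for EndpointError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EndpointError::HandlerNotFound(alpn) => {
                write!(f, "no handler for ALPN {}", String::from_utf8_lossy(alpn))
            }
        }
    }
}

#[derive(Default)]
pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
}

impl HandlerRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Panics if the ALPN is already registered, as the design specifies.
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>) {
        let alpn = handler.alpn();
        let previous = self.handlers.insert(alpn, handler);
        assert!(previous.is_none(), "ALPN registered twice");
    }

    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>> {
        self.handlers.get(alpn)
    }

    /// Hypothetical helper: turn a missing handler into an `EndpointError`.
    pub fn require(&self, alpn: &[u8]) -> Result<&Arc<dyn ProtocolHandler>, EndpointError> {
        self.get(alpn)
            .ok_or_else(|| EndpointError::HandlerNotFound(alpn.to_vec()))
    }
}

/// Illustrative handler; the ALPN string is made up for this example.
struct EchoHandler;

impl ProtocolHandler for EchoHandler {
    fn alpn(&self) -> &'static [u8] {
        b"alknet/echo"
    }
}

fn main() {
    let mut registry = HandlerRegistry::new();
    registry.register(Arc::new(EchoHandler));
    assert!(registry.require(b"alknet/echo").is_ok());
    if let Err(e) = registry.require(b"alknet/ssh") {
        println!("{}", e); // prints "no handler for ALPN alknet/ssh"
    }
}
```

Keeping the lookup fallible at this boundary lets startup validation surface a missing handler as a fatal `EndpointError`, while the accept loop can treat the same miss as a per-connection close.
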
## Key Differences from Reference Implementation

| Aspect | Reference (`alknet-main`) | New Model |
|--------|---------------------------|-----------|
| Transport | `TransportAcceptor` trait, `TransportKind` enum | `quinn::Endpoint` + `iroh::Endpoint`, ALPN dispatch |
| Listener config | `ListenerConfig` enum (Stream/Http/Dns) | Single `HandlerRegistry`, ALPN dispatch |
| Protocol detection | Byte-peeking (`stealth::detect_protocol`) | ALPN negotiation (TLS layer) |
| Stealth mode | SSH-over-TLS with byte-peek | HTTP handler on `h2`/`http/1.1` serves decoy |
| Accept loop | Per-transport, SSH-centric | Per-connection-source, ALPN-agnostic |
| Handler model | `ServerHandler` + `russh::server::Handler` | `ProtocolHandler::handle(Connection, &AuthContext)` |
| Config | `ServeOptions` builder | `StaticConfig` + `HandlerRegistry` + `AlknetEndpoint::new()` |
| iroh | Separate `IrohAcceptor` + `IrohTransport` | `Option<iroh::Endpoint>` on `AlknetEndpoint` |
| Network vs auth identity | Conflated (TLS cert + SSH key both "auth") | Explicitly separated (TLS/NodeId = network, SSH key/token = auth) |

## Design Decisions

| Decision | ADR | Summary |
|----------|-----|---------|
| Multi-connectivity endpoint (quinn + iroh) | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Both optional, both feed same ALPN router |
| Static handler registration | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Two-way door, start static, add ArcSwap later |
| TCP is not an endpoint concern | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | TCP SSH is a handler concern, not core |
| No byte-peeking, ALPN dispatch only | [ADR-001](../../decisions/001-alpn-protocol-dispatch.md) | TLS layer handles protocol detection |
| Stealth mode = HTTP handler on standard ALPNs | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Decoy via ALPN routing, not byte-peek |
| Network identity ≠ auth identity | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | TLS cert/NodeId = network, SSH key/token = auth |
| Handler panics isolated | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | tokio task isolation, connection closes |

## Open Questions

See [open-questions.md](../../open-questions.md) for full details.

- **OQ-04**: Resolved: HandlerRegistry is static at startup.
- **OQ-05**: Resolved: multi-connectivity endpoint with quinn + iroh, both feature-gated.
- **OQ-12**: Resolved: two distinct TLS identity use cases: RFC 7250 raw keys (default, P2P) and X.509 certs (domain-hosted, browsers). ACME is a proven pattern from the reverse-proxy project, not speculative future work.
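
As a closing illustration of the ALPN-only dispatch and stealth-mode decisions recorded above, this std-only sketch shows how a client that offers only standard HTTP ALPNs ends up at the decoy handler. The ALPN strings come from this document; `select_alpn()` and `route()` are hypothetical helpers, and the sketch assumes the TLS stack picks the first server-preferred protocol that the client also offered. In the real endpoint that selection happens inside the TLS handshake, not in application code.

```rust
/// Pick the first server-preferred protocol that the client also offered.
/// Returns `None` when the two lists do not overlap.
fn select_alpn<'a>(server_prefs: &[&'a [u8]], client_offers: &[&[u8]]) -> Option<&'a [u8]> {
    server_prefs
        .iter()
        .copied()
        .find(|proto| client_offers.iter().any(|offer| offer == proto))
}

/// Map a negotiated ALPN to a handler label (labels are illustrative).
fn route(alpn: &[u8]) -> &'static str {
    match alpn {
        b"alknet/ssh" => "ssh handler",
        b"alknet/call" => "call handler",
        b"h2" | b"http/1.1" => "http decoy handler",
        _ => "close: no handler",
    }
}

fn main() {
    // Server advertises real services first, then the standard HTTP ALPNs.
    let server: [&[u8]; 4] = [b"alknet/ssh", b"alknet/call", b"h2", b"http/1.1"];

    let native_client: [&[u8]; 1] = [b"alknet/ssh"];
    let scanner: [&[u8]; 2] = [b"h2", b"http/1.1"];

    for offers in [&native_client[..], &scanner[..]] {
        match select_alpn(&server, offers) {
            Some(alpn) => println!("{} -> {}", String::from_utf8_lossy(alpn), route(alpn)),
            None => println!("handshake fails: no shared ALPN"),
        }
    }
}
```

With these inputs the native client is routed to the SSH handler and the scanner to the decoy, with no inspection of application bytes.
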