docs(architecture): fix OQ-05 — multi-connectivity endpoint, not multi-transport

Correct the conflation of quinn/TLS/iroh as interchangeable transports.
They are complementary connectivity modes serving different deployment
contexts: quinn (public IP + TLS), iroh (NAT traversal via relay), TCP
(handler-specific, not core). Clarify that TLS cert = network identity,
not auth identity. Map stealth mode to HTTP handler on standard ALPNs
instead of byte-peeking. Resolve OQ-05 as one-way door. SendStream/
RecvStream now use internal enum dispatch for both quinn and iroh
streams.
2026-06-16 12:41:03 +00:00
parent 90d5f4eaf9
commit 5c8448ff86
6 changed files with 234 additions and 142 deletions

@@ -49,6 +49,7 @@ See [open-questions.md](open-questions.md) for the full tracker.
- **OQ-01**: BiStream type — trait with Connection parameter (ADR-007)
- **OQ-02**: AuthContext timing — hybrid model (ADR-004)
- **OQ-03**: ALPN naming — `alknet/` prefix, no version (ADR-006)
- **OQ-05**: Multi-connectivity endpoint — quinn + iroh, both feature-gated (ADR-010)
- **OQ-06**: ALPN per connection, not per stream (ADR-006)
- **OQ-08**: Vault integration — CLI-embedded via call protocol (ADR-008)
@@ -57,8 +58,6 @@ See [open-questions.md](open-questions.md) for the full tracker.
- **OQ-12**: TLS certificate provisioning — file paths in StaticConfig, ACME later
**Two-way doors (resolved or deferred to implementation):**
- **OQ-04**: Dynamic handler registration — resolved: static at startup (ADR-010)
- **OQ-05**: Multi-transport endpoint — start with quinn, add transport trait later
- **OQ-07**: Call protocol scope — start with one stream per operation
- **OQ-11**: Handler-level auth resolution observability — decide during implementation

@@ -35,7 +35,7 @@ Core library for ALPN-based protocol dispatch. Every handler crate depends on al
| OQ | Title | Status | Relevance |
|----|-------|--------|-----------|
| OQ-04 | Dynamic handler registration | resolved (start static) | HandlerRegistry is immutable at startup |
| OQ-05 | Multi-transport endpoint | open (start with quinn) | AlknetEndpoint uses quinn directly |
| OQ-05 | Multi-connectivity endpoint | resolved (quinn + iroh) | AlknetEndpoint supports both, both feature-gated |
| OQ-11 | AuthContext resolution completeness | open | How handlers signal auth completion |
## Key Design Principles

@@ -90,11 +90,11 @@ See [ADR-007](../../decisions/007-bistream-type-definition.md) for why BiStream
## SendStream and RecvStream
Concrete types wrapping QUIC stream halves.
Concrete types wrapping QUIC stream halves. Both quinn and iroh produce QUIC connections — `SendStream` and `RecvStream` need to wrap either source.
```rust
pub struct SendStream { /* wraps quinn::SendStream or test mock */ }
pub struct RecvStream { /* wraps quinn::RecvStream or test mock */ }
pub struct SendStream { /* wraps quinn::SendStream or iroh::SendStream or test mock */ }
pub struct RecvStream { /* wraps quinn::RecvStream or iroh::RecvStream or test mock */ }
impl AsyncWrite for SendStream { ... }
impl AsyncRead for RecvStream { ... }
@@ -102,9 +102,9 @@ impl AsyncRead for RecvStream { ... }
- `SendStream` implements `AsyncWrite`. Write bytes to the peer.
- `RecvStream` implements `AsyncRead`. Read bytes from the peer.
- These are not trait objects — they are concrete wrapper types that delegate to `quinn::SendStream` / `quinn::RecvStream` in production and to test mocks in tests.
- These are concrete wrapper types that use internal enum dispatch to delegate to the appropriate QUIC stream type (quinn or iroh) in production, and to test mocks in tests.
This is a two-way door decision. If future transports need different stream types, `SendStream` and `RecvStream` can become wrappers with enum dispatch. For v1, concrete wrappers over quinn types are simpler and zero-cost.
Since the endpoint supports both quinn and iroh connection sources (ADR-010), streams may come from either. `Connection::new()` wraps the appropriate stream source based on where the connection came from.
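The enum-dispatch wrapper described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the crate's implementation: `QuinnSend` and `IrohSend` are hypothetical stand-ins for the real quinn and iroh stream types, and the synchronous `std::io::Write` trait stands in for `AsyncWrite` so the example builds with the standard library alone.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for `quinn::SendStream` (buffers bytes in memory).
#[derive(Default)]
pub struct QuinnSend(pub Vec<u8>);

/// Hypothetical stand-in for iroh's send stream type (buffers bytes in memory).
#[derive(Default)]
pub struct IrohSend(pub Vec<u8>);

/// Private enum: callers never see which QUIC source produced the stream.
enum SendInner {
    Quinn(QuinnSend),
    Iroh(IrohSend),
}

/// Concrete wrapper type, mirroring the `SendStream` described in this document.
pub struct SendStream {
    inner: SendInner,
}

impl SendStream {
    pub fn from_quinn(s: QuinnSend) -> Self {
        Self { inner: SendInner::Quinn(s) }
    }

    pub fn from_iroh(s: IrohSend) -> Self {
        Self { inner: SendInner::Iroh(s) }
    }

    /// Which source the stream came from (for illustration only).
    pub fn source(&self) -> &'static str {
        match self.inner {
            SendInner::Quinn(_) => "quinn",
            SendInner::Iroh(_) => "iroh",
        }
    }

    /// Bytes written so far (possible only because the stand-ins buffer in memory).
    pub fn written(&self) -> &[u8] {
        match &self.inner {
            SendInner::Quinn(s) => &s.0,
            SendInner::Iroh(s) => &s.0,
        }
    }
}

impl Write for SendStream {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Each arm forwards to the wrapped stream; here that means appending to its buffer.
        match &mut self.inner {
            SendInner::Quinn(s) => s.0.extend_from_slice(buf),
            SendInner::Iroh(s) => s.0.extend_from_slice(buf),
        }
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

Because the enum is private, handlers only ever see `SendStream`. Adding a test-mock variant would be one more enum arm rather than a new public type.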
## StreamError
@@ -117,7 +117,7 @@ pub enum StreamError {
}
```
Returned by `accept_bi()`, `open_bi()`, and stream read/write operations. Maps from `quinn::ConnectionError` and `quinn::StreamError`.
Returned by `accept_bi()`, `open_bi()`, and stream read/write operations. Maps from `quinn::ConnectionError` / `quinn::StreamError` and their iroh equivalents.
## Design Decisions
@@ -126,8 +126,8 @@ Returned by `accept_bi()`, `open_bi()`, and stream read/write operations. Maps f
| ProtocolHandler receives Connection, not BiStream | [ADR-007](../../decisions/007-bistream-type-definition.md) | Handlers that need multiple streams (SSH, call) have direct access to the Connection |
| BiStream is a trait | [ADR-007](../../decisions/007-bistream-type-definition.md) | WASM door preserved, test mocks possible |
| HandlerError is non-fatal | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Handler errors close the connection, not the endpoint |
| SendStream/RecvStream are concrete wrappers | Two-way door | Can become enum dispatch later if multi-transport is needed |
| SendStream/RecvStream wrap quinn + iroh | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Internal enum dispatch for both QUIC sources |
## Open Questions
- **OQ-05**: See [open-questions.md](../../open-questions.md) — multi-transport. If quinn is the only transport in v1, SendStream/RecvStream can be concrete wrappers.
None active for this document.

@@ -1,21 +1,24 @@
---
status: draft
last_updated: 2026-06-16
last_updated: 2026-06-17
---
# Endpoint
ALPN router, handler registry, connection accept loop, and graceful shutdown.
ALPN router, handler registry, connection accept loops, multi-connectivity, and graceful shutdown.
See [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) for the full rationale.
## AlknetEndpoint
The central runtime type. Owns the QUIC endpoint, holds the handler registry, and runs the accept loop.
The central runtime type. Manages one or more QUIC connection sources, each feeding into the same ALPN router.
```rust
pub struct AlknetEndpoint {
endpoint: quinn::Endpoint,
// QUIC connection sources — both optional, both can be active simultaneously
quinn: Option<quinn::Endpoint>, // Public QUIC+TLS
iroh: Option<iroh::Endpoint>, // P2P relay-assisted
handlers: Arc<HandlerRegistry>,
dynamic: Arc<ArcSwap<DynamicConfig>>,
identity_provider: Arc<dyn IdentityProvider>,
@@ -23,62 +26,22 @@ pub struct AlknetEndpoint {
}
```
### Construction
### Why multiple connection sources?
The CLI binary constructs an `AlknetEndpoint` at startup:
A node can be reachable through different paths depending on its network context:
1. Build `HandlerRegistry` by inserting handlers for each ALPN.
2. Build `StaticConfig` from CLI arguments / config file.
3. Build `rustls::ServerConfig` from TLS cert/key and the registry's ALPN strings.
4. Bind `quinn::Endpoint` with the `ServerConfig`.
5. Create `ArcSwap<DynamicConfig>` and `ConfigIdentityProvider`.
6. Call `AlknetEndpoint::new(endpoint, handlers, dynamic, identity_provider)`.
| Source | Requires | Identity source | Use case |
|--------|----------|-----------------|----------|
| `quinn::Endpoint` | Public IP, TLS cert | TLS cert (network), SSH key (auth) | VPS, replicators, service hosts |
| `iroh::Endpoint` | Relay access | NodeId (Ed25519) | Home servers, NAT, IoT |
### Accept Loop
These are not interchangeable transports — they are **complementary connectivity modes**. A node behind NAT that also has a public IP can use both simultaneously. Both produce QUIC connections that dispatch through the same `HandlerRegistry` by ALPN string.
```
loop {
tokio::select! {
incoming = endpoint.accept() => {
let connection = incoming.await; // TLS handshake + ALPN negotiation
match connection {
Ok(conn) => {
let alpn = conn.alpn();
match handlers.get(alpn) {
Some(handler) => {
let auth = AuthContext::from_connection(&conn);
let conn = Connection::new(conn);
tokio::spawn(async move {
if let Err(e) = handler.handle(conn, &auth).await {
// log error, connection closes
}
});
}
None => {
// ALPN has no handler — should not happen
// (ServerConfig only advertises registered ALPNs)
conn.close(0u32, "no handler");
}
}
}
Err(e) => {
// TLS handshake or connection-level error
// log and continue accepting
}
}
}
_ = shutdown.changed() => {
break; // graceful shutdown
}
}
}
```
### TCP is NOT an endpoint concern
### What the accept loop does NOT do
Bare TCP (SSH over port 22) does not use QUIC or ALPN. In the new model, TCP access is handled by individual handlers — the SSH handler can listen on a TCP socket independently. This is a handler-specific concern, not a core endpoint concern.
- **No byte-peeking**: ALPN negotiation handles protocol detection. The old `stealth` module's `detect_protocol()` is unnecessary.
- **No per-handler accept loops**: The old model had `ListenerConfig::Stream`, `ListenerConfig::Http`, `ListenerConfig::Dns` with different accept paths. ALPN unifies this.
- **No SSH-specific logic**: The accept loop is ALPN-agnostic. It doesn't know or care what protocol the handler speaks.
The reference implementation's TCP transport (`alknet-main/crates/alknet-core/src/transport/tcp.rs`) is SSH-specific. It doesn't generalize to the ALPN model.
## HandlerRegistry
@@ -91,24 +54,124 @@ pub struct HandlerRegistry {
impl HandlerRegistry {
pub fn new() -> Self;
pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>);
pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>>;
pub fn alpn_strings(&self) -> Vec<Vec<u8>>;
}
```
- `register()`: Insert a handler. Panics if the ALPN is already registered (duplicate handlers are a bug).
- `get()`: Look up a handler by ALPN string. Returns `None` if no handler is registered.
- `alpn_strings()`: Return all registered ALPN strings. Used to build the TLS `ServerConfig`.
- `register()`: Insert a handler. Panics if the ALPN is already registered.
- `get()`: Look up a handler by ALPN string.
- `alpn_strings()`: Return all registered ALPN strings. Used to build the TLS `ServerConfig` (for quinn) and the ALPN list (for iroh).
Registration is static at startup (see [OQ-04](../../open-questions.md) and ADR-010). The CLI builds a `HandlerRegistry`, inserts all handlers, and passes it to `AlknetEndpoint`. The registry is immutable after construction.
Registration is static at startup (see [OQ-04](../../open-questions.md)). The CLI builds a `HandlerRegistry`, inserts all handlers, and passes it to `AlknetEndpoint::new()`.
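The registry contract described above (panic on duplicate registration, lookup by ALPN, ALPN list for the endpoints) can be sketched with a plain `HashMap`. This is a hedged illustration, not the actual `alknet-core` code: the `alpn()` method on the trait is an assumption made so that `register()` can derive the key from the handler, `StaticHandler` is a hypothetical test type, and sorting the output of `alpn_strings()` is a choice made here for deterministic results.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Minimal stand-in for the real `ProtocolHandler` trait (ADR-002).
/// ASSUMPTION: the handler reports its own ALPN string via `alpn()`.
pub trait ProtocolHandler: Send + Sync {
    fn alpn(&self) -> &[u8];
}

#[derive(Default)]
pub struct HandlerRegistry {
    handlers: HashMap<Vec<u8>, Arc<dyn ProtocolHandler>>,
}

impl HandlerRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Insert a handler. Panics if its ALPN is already registered.
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>) {
        let alpn = handler.alpn().to_vec();
        let previous = self.handlers.insert(alpn.clone(), handler);
        assert!(
            previous.is_none(),
            "duplicate ALPN registration: {}",
            String::from_utf8_lossy(&alpn)
        );
    }

    /// Look up a handler by the negotiated ALPN string.
    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>> {
        self.handlers.get(alpn)
    }

    /// All registered ALPN strings, sorted so the output is deterministic.
    pub fn alpn_strings(&self) -> Vec<Vec<u8>> {
        let mut alpns: Vec<Vec<u8>> = self.handlers.keys().cloned().collect();
        alpns.sort();
        alpns
    }
}

/// Hypothetical handler used only to exercise the registry.
pub struct StaticHandler(pub &'static [u8]);

impl ProtocolHandler for StaticHandler {
    fn alpn(&self) -> &[u8] {
        self.0
    }
}
```

In this sketch the same `alpn_strings()` result would be handed to both connection sources, which is how they end up advertising an identical ALPN set.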
### ALPN strings in the TLS ServerConfig
### ALPN strings in TLS ServerConfig and iroh endpoint
The `rustls::ServerConfig`'s ALPN protocol list is set from `registry.alpn_strings()` at construction time. This means:
- Only registered handlers' ALPNs are advertised during TLS negotiation.
- If a client offers an ALPN that's not in the list, the TLS handshake fails — correct behavior.
- Adding a handler at runtime requires rebuilding the `ServerConfig` (see OQ-04).
The quinn endpoint's `rustls::ServerConfig` ALPN list is set from `registry.alpn_strings()` at construction time. The iroh endpoint's ALPN list is similarly derived. Both connection sources advertise the same set of ALPNs.
## Accept Loops
Each active connection source runs its own accept loop. All loops dispatch through the same `HandlerRegistry`:
### Quinn accept loop (public QUIC+TLS)
```
loop {
tokio::select! {
incoming = quinn_endpoint.accept() => {
let connection = incoming.await; // TLS handshake + ALPN negotiation
match connection {
Ok(conn) => dispatch(conn),
Err(e) => { /* log TLS handshake failure, continue */ }
}
}
_ = shutdown.changed() => break,
}
}
```
### iroh accept loop (P2P relay-assisted)
```
loop {
tokio::select! {
incoming = iroh_endpoint.accept() => {
let connection = incoming.await; // iroh QUIC connection + ALPN
match connection {
Ok(conn) => dispatch(conn),
Err(e) => { /* log connection error, continue */ }
}
}
_ = shutdown.changed() => break,
}
}
```
### Dispatch function (shared)
```
fn dispatch(connection) {
let alpn = connection.alpn();
match handlers.get(alpn) {
Some(handler) => {
let auth = AuthContext::from_connection(&connection);
let conn = Connection::new(connection);
tokio::spawn(async move {
if let Err(e) = handler.handle(conn, &auth).await {
// log error, connection closes
}
});
}
None => connection.close(0u32, "no handler"),
}
}
```
### What the accept loops do NOT do
- **No byte-peeking**: ALPN negotiation handles protocol detection. The old `stealth` module's `detect_protocol()` is unnecessary.
- **No per-handler accept loops**: The old `ListenerConfig` enum had Stream/Http/Dns variants with different accept paths. ALPN unifies this.
- **No SSH-specific logic**: The accept loop is ALPN-agnostic. It doesn't know or care what protocol the handler speaks.
## Stealth Mode as ALPN Dispatch
The reference implementation's "stealth mode" is SSH-over-TLS on port 443. The TLS cert is **camouflage**, not identity — it makes the port look like a web server to port scanners and DPI systems. Non-SSH traffic gets a fake nginx 404.
In the ALPN model, this maps to:
- The `alknet/http` handler is registered for standard HTTP ALPNs (`h2`, `http/1.1`)
- The HTTP handler can serve a decoy website or a fake 404
- Real services use `alknet/ssh`, `alknet/call`, etc.
- Clients that offer only standard HTTP ALPNs (`h2`, `http/1.1`) get the HTTP handler, just like port scanners in stealth mode
No byte-peeking, no `ProtocolDetection` enum. ALPN does the routing.
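The routing described above can be modelled in a few lines. This is a simplified sketch, not the endpoint's code: in the real system the TLS library performs ALPN negotiation during the handshake, whereas here `negotiate()` approximates it as "first server-preferred protocol that the client also offered". The names `Route`, `SERVER_ALPNS`, `negotiate`, `route_for`, and `accept` are all hypothetical.

```rust
/// Which handler a negotiated ALPN maps to (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    Ssh,
    Call,
    HttpDecoy,
}

/// ALPNs the server advertises, in preference order (assumed ordering).
pub const SERVER_ALPNS: [&[u8]; 4] = [b"alknet/ssh", b"alknet/call", b"h2", b"http/1.1"];

/// Simplified ALPN selection: the first server protocol the client also offered.
/// `None` models a failed handshake (no protocol in common).
pub fn negotiate<'a>(server: &[&'a [u8]], client: &[&[u8]]) -> Option<&'a [u8]> {
    server
        .iter()
        .copied()
        .find(|s| client.iter().any(|c| c == s))
}

/// Map a negotiated ALPN to its handler. Standard HTTP ALPNs land on the decoy.
pub fn route_for(alpn: &[u8]) -> Option<Route> {
    match alpn {
        b"alknet/ssh" => Some(Route::Ssh),
        b"alknet/call" => Some(Route::Call),
        b"h2" | b"http/1.1" => Some(Route::HttpDecoy),
        _ => None,
    }
}

/// Negotiate, then dispatch. No application bytes are inspected at any point.
pub fn accept(client_offer: &[&[u8]]) -> Option<Route> {
    negotiate(&SERVER_ALPNS, client_offer).and_then(route_for)
}
```

Under these assumptions a scanner offering only `h2`/`http/1.1` reaches the decoy handler, an alknet client reaches its real handler, and a client with no ALPN in common never gets past the handshake.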
## Network Identity vs Auth Identity
A key distinction that the ALPN model makes explicit:
| Layer | Purpose | Mechanism |
|-------|---------|-----------|
| **Network identity** | How a client finds and verifies the node | TLS cert (quinn), NodeId (iroh) |
| **Auth identity** | Who the peer is and what they can do | SSH key, API token, certificate (handlers) |
The TLS cert is the node's network-facing identity: it is what clients verify when they connect to `alknet.example.com`. It is NOT the node's authentication identity. Auth happens inside the handler via `IdentityProvider`.
This matches the reference implementation: the TLS cert encrypts and camouflages, but SSH key exchange handles the actual authentication.
## TLS Certificate Provisioning
For the quinn endpoint, `StaticConfig` provides TLS configuration via file paths:
- **Manual**: `tls_cert` and `tls_key` file paths. Required for production use.
- **Self-signed**: For development. The endpoint can generate a self-signed cert on startup.
The `rustls::ServerConfig` is built from cert + key + ALPN list at startup.
ACME auto-provisioning (Let's Encrypt) is not in scope for v1. It will be added as a feature later (see OQ-12).
The iroh endpoint does not need TLS certs — it uses `NodeId` for identity.
## Graceful Shutdown
@@ -120,7 +183,7 @@ impl AlknetEndpoint {
```
- `shutdown_sender()` returns a clone of the shutdown channel sender. Call `send(true)` to signal shutdown.
- `shutdown()` waits for in-flight connections to complete, with a drain timeout (default: 2 seconds). After the timeout, remaining connections are forcefully closed.
- `shutdown()` signals all accept loops to stop, waits for in-flight connections with a drain timeout (default: 2 seconds), then forcefully closes remaining connections.
- SIGTERM/SIGINT are wired to the shutdown channel by the CLI binary.
The drain timeout is configurable via `StaticConfig::drain_timeout`.
@@ -145,39 +208,34 @@ Non-fatal errors within a handler. See [core-types.md](core-types.md) for detail
### Accept loop errors
- **TLS handshake failure**: Log and continue. The client may have offered no compatible ALPN, or the cert may be untrusted by the client.
- **TLS handshake failure**: Log and continue. The client may have offered no compatible ALPN, or the cert may be untrusted.
- **Handler panic**: Caught by tokio's task isolation. The connection is dropped. Other connections continue.
- **Connection-level errors** (quinn `ConnectionError`): Log and continue. The accept loop keeps running.
## TLS Certificate Provisioning
`StaticConfig` provides TLS configuration via file paths:
- **Manual**: `tls_cert` and `tls_key` file paths. Required for production use.
- **Self-signed**: For development. The endpoint can generate a self-signed cert on startup.
The `rustls::ServerConfig` is built from cert + key + ALPN list at startup.
ACME auto-provisioning (Let's Encrypt) is not in scope for v1. It will be added as a feature later (see OQ-12).
- **Connection-level errors** (quinn/iroh `ConnectionError`): Log and continue. The accept loop keeps running.
## Key Differences from Reference Implementation
| Aspect | Reference (`alknet-main`) | New Model |
|--------|---------------------------|-----------|
| Transport | `TransportAcceptor` trait, `TransportKind` enum | `quinn::Endpoint` directly |
| Listener config | `ListenerConfig` enum (Stream/Http/Dns) | Single endpoint, ALPN dispatch |
| Transport | `TransportAcceptor` trait, `TransportKind` enum | `quinn::Endpoint` + `iroh::Endpoint`, ALPN dispatch |
| Listener config | `ListenerConfig` enum (Stream/Http/Dns) | Single `HandlerRegistry`, ALPN dispatch |
| Protocol detection | Byte-peeking (`stealth::detect_protocol`) | ALPN negotiation (TLS layer) |
| Accept loop | Per-transport, SSH-centric | ALPN-agnostic, handler-dispatched |
| Stealth mode | SSH-over-TLS with byte-peek | HTTP handler on `h2`/`http/1.1` serves decoy |
| Accept loop | Per-transport, SSH-centric | Per-connection-source, ALPN-agnostic |
| Handler model | `ServerHandler` + `russh::server::Handler` | `ProtocolHandler::handle(Connection, &AuthContext)` |
| Config | `ServeOptions` builder | `StaticConfig` + `HandlerRegistry` + `AlknetEndpoint::new()` |
| iroh | Separate `IrohAcceptor` + `IrohTransport` | `Option<iroh::Endpoint>` on `AlknetEndpoint` |
| Network vs auth identity | Conflated (TLS cert + SSH key both "auth") | Explicitly separated (TLS/NodeId = network, SSH key/token = auth) |
## Design Decisions
| Decision | ADR | Summary |
|----------|-----|---------|
| Multi-connectivity endpoint (quinn + iroh) | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Both optional, both feed same ALPN router |
| Static handler registration | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Two-way door, start static, add ArcSwap later |
| quinn::Endpoint directly, no TransportAcceptor | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Start with quinn, abstract later if needed |
| TCP is not an endpoint concern | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | TCP SSH is a handler concern, not core |
| No byte-peeking, ALPN dispatch only | [ADR-001](../../decisions/001-alpn-protocol-dispatch.md) | TLS layer handles protocol detection |
| Stealth mode = HTTP handler on standard ALPNs | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Decoy via ALPN routing, not byte-peek |
| Network identity ≠ auth identity | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | TLS cert/NodeId = network, SSH key/token = auth |
| Handler panics isolated | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | tokio task isolation, connection closes |
## Open Questions
@@ -185,5 +243,5 @@ ACME auto-provisioning (Let's Encrypt) is not in scope for v1. It will be added
See [open-questions.md](../../open-questions.md) for full details.
- **OQ-04**: Resolved — HandlerRegistry is static at startup.
- **OQ-05**: Open — start with quinn, abstract later if needed.
- **OQ-05**: Resolved — multi-connectivity endpoint with quinn + iroh, both feature-gated.
- **OQ-12**: Resolved — start with file paths in StaticConfig, add ACME later.

@@ -6,36 +6,59 @@ Proposed
## Context
ADR-001 establishes ALPN-based protocol dispatch: a single QUIC+TLS endpoint accepts connections, and the ALPN negotiated during the TLS handshake routes each connection to the correct `ProtocolHandler`. ADR-002 defines the `ProtocolHandler` trait. ADR-006 establishes one ALPN per connection. ADR-007 defines `Connection` and `BiStream`.
ADR-001 establishes ALPN-based protocol dispatch: a single endpoint accepts connections, and the ALPN negotiated during the TLS handshake routes each connection to the correct `ProtocolHandler`. ADR-002 defines the `ProtocolHandler` trait. ADR-006 establishes one ALPN per connection. ADR-007 defines `Connection` and `BiStream`.
The question now is: **how does the endpoint work?** What accepts QUIC connections, negotiates ALPN, and hands connections to handlers? This is the central runtime piece of alknet-core — every handler depends on it.
The question now is: **how does the endpoint work?** What accepts connections, negotiates ALPN, and hands connections to handlers? This is the central runtime piece of alknet-core — every handler depends on it.
The reference implementation (`alknet-main`) uses a `Server` struct that binds a `TransportAcceptor`, runs an accept loop, and dispatches to a `ServerHandler` based on transport type and interface kind. This has three problems that the ALPN model solves:
### Multiple connectivity modes, not multiple transports
1. **Multiple listener types**: `ListenerConfig` has three variants (Stream, Http, Dns) with per-variant configuration and validation. ALPN eliminates this — one endpoint, one listener, ALPN does the routing.
2. **Protocol detection by byte-peeking**: The `stealth` module reads the first bytes to detect SSH vs HTTP. ALPN negotiation makes this unnecessary — the TLS handshake tells you the protocol before any application bytes are read.
3. **SSH-centric accept loop**: The current `handle_connection` immediately enters `russh::server::run_stream`. In the new model, the accept loop is ALPN-agnostic — it doesn't know or care what protocol the handler speaks.
The reference implementation supports three connectivity modes that serve **fundamentally different deployment contexts**:
### iroh's pattern
1. **QUIC+TLS (public)** — The node has a public IP and open ports. TLS provides protocol routing via ALPN negotiation. The TLS certificate is the node's **network-facing identity** — it's what clients verify when connecting to `alknet.example.com:4433`. This is the mode for replicators, VPS hosts, service providers. SSH key auth still handles **authentication** — the TLS cert is not the auth identity, it's the network identity.
iroh's `Router` registers `ProtocolHandler` instances with ALPN strings, then calls `endpoint.accept()` in a loop. For each incoming connection, it reads the negotiated ALPN, looks up the handler, and calls `handler.accept(connection)`. This is clean and proven.
2. **iroh P2P (NAT traversal)** — The node has no public IP or open ports. iroh's relay handles NAT traversal and connection brokering. Node identity comes from iroh's `NodeId` (Ed25519 key pair). The relay is primarily a signaling service: it helps peers establish direct QUIC connections and forwards encrypted traffic only as a fallback when no direct path can be established. This is the mode for home servers, IoT devices, anything behind NAT.
3. **TCP (local/dev)** — Bare SSH over TCP. Port 22. No TLS, no ALPN, no certs. SSH key exchange handles both identity and authentication. This is the mode for local network access and development.
These are not interchangeable "transports" to be abstracted behind a trait. They are **different ways a node can be reached**, each with different identity and authentication implications:
| Mode | Identity source | Auth mechanism | Requires public IP | Use case |
|------|-----------------|----------------|-------------------|----------|
| QUIC+TLS | TLS cert (network) + SSH key (auth) | SSH key, API key | Yes | VPS, replicators |
| iroh P2P | NodeId (Ed25519) | NodeId, SSH key | No | Home servers, NAT |
| TCP | SSH host key | SSH key | Yes (local) | Dev, LAN |
### What the old "stealth mode" actually was
The reference implementation's "stealth mode" is **SSH-over-TLS on port 443**. The TLS cert is NOT the node's identity — it's **camouflage**. The purpose is to make port 443 look like a web server to port scanners and DPI systems. Non-SSH traffic gets a fake nginx 404. SSH auth still happens via SSH key exchange *inside* the TLS tunnel.
In the new ALPN model, this concept maps to: the endpoint speaks QUIC+TLS with ALPN, and the `alknet/http` handler can serve a decoy website on `h2`/`http/1.1` while real services use `alknet/ssh`, `alknet/call`, etc. The ALPN router does the "stealth" job: clients that offer only standard HTTP ALPNs get the HTTP handler, which can serve whatever fronting content is desired. No byte-peeking needed.
### iroh produces QUIC connections
iroh's `Endpoint::accept()` produces incoming QUIC connections. These connections have ALPNs. They can feed directly into the same `HandlerRegistry` dispatch. The iroh endpoint and the quinn endpoint both produce QUIC connections — the difference is how they're established (relay-assisted vs direct), not how handlers consume them.
This means: **the ALPN router is transport-agnostic**. It dispatches by ALPN string. It doesn't care whether the connection came from a quinn endpoint or an iroh endpoint. Both produce connections that handlers can use via the same `Connection` type.
### Key design questions
1. **Handler registration**: Static (at startup) or dynamic (at runtime)?
2. **TLS certificate management**: How does the endpoint get TLS certs? Where does ACME fit?
3. **Connection lifecycle**: Who owns the `quinn::Endpoint`? How does graceful shutdown work?
1. **How many endpoints can a node have?** A node may need to listen on quinn (public QUIC+TLS) AND iroh (P2P relay) simultaneously. These are not alternatives — they're complementary connectivity modes.
2. **Handler registration**: Static (at startup) or dynamic (at runtime)?
3. **Connection lifecycle**: Who owns the endpoints? How does graceful shutdown work?
4. **Error handling**: What happens when a handler panics? When ALPN negotiation fails?
## Decision
### Endpoint owns the QUIC endpoint
### A node can have multiple endpoints
`alknet-core` owns the `quinn::Endpoint` directly. The endpoint binds to a single address, configures TLS with a `rustls::ServerConfig` that includes the ALPN strings from all registered handlers, and accepts connections in a loop.
`AlknetEndpoint` manages one or more QUIC connection sources. Each source produces connections that feed into the same `HandlerRegistry`:
```rust
pub struct AlknetEndpoint {
endpoint: quinn::Endpoint,
// One or more QUIC connection sources
quinn: Option<quinn::Endpoint>, // Public QUIC+TLS
iroh: Option<iroh::Endpoint>, // P2P relay-assisted
handlers: Arc<HandlerRegistry>,
dynamic: Arc<ArcSwap<DynamicConfig>>,
identity_provider: Arc<dyn IdentityProvider>,
@@ -43,7 +66,9 @@ pub struct AlknetEndpoint {
}
```
There is no `TransportAcceptor` trait, no `TransportKind` enum, no `ListenerConfig` enum. QUIC+TLS+ALPN replaces all of that.
A node that has a public IP runs with `quinn: Some(...)` — it listens on a public address with TLS+ALPN. A node behind NAT runs with `iroh: Some(...)` — it connects to a relay and accepts P2P connections. A node that has both runs with both — it's reachable via either path, and both feed into the same ALPN router.
**TCP mode is not an endpoint concern.** TCP mode in the reference implementation is SSH over raw TCP on port 22. This is not QUIC and doesn't have ALPN. In the new model, TCP access to SSH is handled by the SSH handler directly — it can listen on a TCP socket independently of the ALPN endpoint. This is a handler-specific concern, not a core endpoint concern.
### HandlerRegistry maps ALPN strings to ProtocolHandler instances
@@ -53,44 +78,52 @@ pub struct HandlerRegistry {
}
```
Registration is static at startup. The CLI binary constructs a `HandlerRegistry` by inserting handlers for each ALPN, then passes it to `AlknetEndpoint::new()`. The ALPN strings in the TLS `ServerConfig` are derived from the registry's keys.
Registration is static at startup (OQ-04). The CLI binary constructs a `HandlerRegistry`, inserts handlers, and passes it to `AlknetEndpoint::new()`.
This is a two-way door (OQ-04): starting static is simple. If dynamic registration is needed later, the registry can be wrapped in `ArcSwap<HandlerRegistry>` and the TLS `ServerConfig` can be regenerated. But ALPN negotiation happens during the TLS handshake, so adding a handler at runtime requires the next connection to use the new ALPN — which the client already has to know about. Dynamic registration has limited value for v1.
The ALPN strings for the quinn endpoint's TLS `ServerConfig` are derived from the registry's keys. The iroh endpoint's ALPN strings are also derived from the registry — both endpoints advertise the same set of ALPNs.
### Accept loop: connect, dispatch, spawn
### Accept loop: accept from all sources, dispatch by ALPN
The endpoint runs accept loops for each active connection source. All loops dispatch through the same `HandlerRegistry`:
```
// Quinn accept loop (if configured)
loop {
incoming = endpoint.accept().await
incoming = quinn_endpoint.accept().await
connection = incoming.await // TLS handshake + ALPN negotiation
dispatch(connection)
}
// iroh accept loop (if configured)
loop {
incoming = iroh_endpoint.accept().await
connection = incoming.await // iroh QUIC connection + ALPN
dispatch(connection)
}
fn dispatch(connection) {
alpn = connection.alpn()
handler = registry.get(alpn)
match handler {
Some(h) => {
auth = resolve_endpoint_auth(connection) // TLS client cert, etc.
tokio::spawn(h.handle(connection, &auth))
auth = AuthContext::from_connection(&connection)
conn = Connection::new(connection)
tokio::spawn(h.handle(conn, &auth))
}
None => connection.close()
}
}
```
Key behaviors:
- **ALPN mismatch**: The TLS handshake fails. This is correct — the client and server have no protocol in common.
- **Handler not found**: Should not happen — the `ServerConfig` only advertises ALPNs that have registered handlers. If somehow a connection negotiates an ALPN with no handler, the connection is closed with an error log.
- **Handler panic**: The handler runs in a spawned tokio task. If it panics, tokio catches the panic at the task boundary and reports it through the task's `JoinHandle`. The connection is dropped. Other connections are unaffected.
- **Graceful shutdown**: A `watch::Sender<bool>` signals the accept loop to stop accepting new connections. Existing connections are given a drain timeout (2 seconds default), then forcefully closed.
Both accept loops are `tokio::select!`-ed against the shutdown signal.
### TLS certificate configuration
### TLS certificate and the distinction between network identity and auth identity
TLS certs come from `StaticConfig`:
- File paths (`tls_cert`, `tls_key`) for manual provisioning
- Self-signed for development
For the quinn endpoint, the TLS cert serves as **network-facing identity** — it's what clients verify when connecting to a domain name. It is NOT the node's authentication identity. Authentication is handled by handlers (SSH key exchange, API tokens, etc.).
The `rustls::ServerConfig` is built from the cert + key + ALPN list at startup. The ALPN list is derived from `HandlerRegistry::alpn_strings()`.
This is the same model as the reference implementation's TLS mode: the cert makes the port look legitimate and encrypts traffic, but SSH key exchange handles the actual authentication. The ALPN model extends this: the cert + ALPN routing is the network layer, handler-specific auth is the application layer.
ACME auto-provisioning (Let's Encrypt) is not in scope for v1. It will be added as a feature later (see OQ-12).
For the iroh endpoint, the `NodeId` serves as network identity. No TLS cert is needed — iroh's QUIC uses the NodeId for connection verification.
### Error taxonomy
@@ -115,18 +148,18 @@ pub enum HandlerError {
## Consequences
**Positive:**
- Single accept loop replaces multiple listener types and byte-peeking
- ALPN negotiation happens at the TLS layer — no application-level protocol detection
- A node can be reachable via multiple paths simultaneously (public QUIC+TLS, iroh P2P)
- ALPN router is transport-agnostic — dispatches by ALPN string regardless of connection source
- Adding a handler is registering an ALPN string — no endpoint code changes
- Handler panics are isolated — one bad handler can't take down the endpoint
- `quinn::Endpoint` is the only transport — no TransportAcceptor trait needed for v1
- The endpoint is testable: give it mock handlers and a test ALPN, verify dispatch
- "Stealth mode" maps naturally to the HTTP handler serving decoy content on `h2`/`http/1.1`
- Both iroh and quinn produce QUIC connections — same `Connection` type works for both
**Negative:**
- Direct quinn dependency in alknet-core — WASM targets can't use quinn (mitigated: WASM clients don't run endpoints, they connect to them; the WASM door is for client-side handlers, not the endpoint itself)
- No runtime handler registration without regenerating the TLS config (mitigated: two-way door, start static, add ArcSwap later if needed)
- alknet-core depends on both quinn and iroh (mitigated: both are feature-gated; a node that only needs one doesn't compile the other)
- The endpoint is more complex than a single quinn listener — it manages multiple accept loops
- TLS cert provisioning is manual (file paths) for v1 — ACME auto-provisioning is a future feature (OQ-12)
- One address per endpoint — if you need to listen on multiple addresses, run multiple endpoints (acceptable for v1)
- No runtime handler registration without regenerating the TLS config (mitigated: two-way door, start static, add ArcSwap later if needed)
## References
@@ -136,6 +169,8 @@ pub enum HandlerError {
- ADR-007: BiStream type definition (Connection, SendStream, RecvStream)
- ADR-009: One-way door decision framework
- OQ-04: Dynamic handler registration (two-way door, start static)
- OQ-05: Multi-transport endpoint (two-way door, start with quinn)
- OQ-05: Multi-transport endpoint (now: multi-connectivity endpoint)
- iroh Router pattern: `docs/research/references/iroh/`
- Reference implementation: `alknet-main/crates/alknet-core/src/server/serve.rs`
- Reference stealth mode: `alknet-main/crates/alknet-core/src/server/stealth.rs`
- Reference iroh transport: `alknet-main/crates/alknet-core/src/transport/iroh_transport.rs`

@@ -53,14 +53,14 @@ Door type classifications follow ADR-009:
## Theme: Transport and Endpoint
### OQ-05: Multi-Transport Endpoint
### OQ-05: Multi-Connectivity Endpoint
- **Origin**: [overview.md](overview.md)
- **Status**: open
- **Door type**: Two-way
- **Priority**: low
- **Resolution**: Start with quinn (QUIC over UDP). `AlknetEndpoint` uses `quinn::Endpoint` directly. The endpoint can be made transport-agnostic later by abstracting the connection accept loop behind a trait. iroh connectivity produces QUIC connections that can feed into the same ALPN router. `SendStream`/`RecvStream` are concrete wrappers over quinn types — can become enum dispatch if multi-transport is needed. See ADR-010.
- **Cross-references**: ADR-001, ADR-010, [core-types.md](crates/core/core-types.md)
- **Status**: resolved
- **Door type**: One-way
- **Priority**: high
- **Resolution**: `AlknetEndpoint` supports both `quinn::Endpoint` (public QUIC+TLS) and `iroh::Endpoint` (P2P relay-assisted) simultaneously, both optional and feature-gated. Both produce QUIC connections that dispatch through the same `HandlerRegistry` by ALPN string. These are not interchangeable transports — they serve fundamentally different deployment contexts (public IP vs NAT traversal). TCP is not an endpoint concern — bare TCP SSH is handled by the SSH handler directly. See ADR-010.
- **Cross-references**: ADR-001, ADR-010, [endpoint.md](crates/core/endpoint.md)
### OQ-06: Server-Side ALPN vs Client-Side ALPN