docs(architecture): spec alknet-core with per-crate subdocs, ADR-010/011

Add alknet-core architecture specs in docs/architecture/crates/core/ with
focused subdocuments for core types, endpoint, auth, and config. Write
ADR-010 (ALPN Router and Endpoint) defining AlknetEndpoint, HandlerRegistry,
accept loop, and graceful shutdown. Write ADR-011 (AuthContext Structure)
defining AuthContext fields, immutability in handle(), and IdentityProvider
injection pattern. Resolve OQ-04 (static registration), OQ-12 (file paths
only for v1). Add OQ-11 (auth observability). Fix remaining alknet-secret
references to alknet-vault across ADRs 003/004/005/009.
2026-06-16 12:07:17 +00:00
parent 80128a56e5
commit 90d5f4eaf9
13 changed files with 1151 additions and 18 deletions


@@ -0,0 +1,47 @@
---
status: draft
last_updated: 2026-06-16
---
# alknet-core
Core library for ALPN-based protocol dispatch. Every handler crate depends on alknet-core.
## Documents
| Document | Status | Description |
|----------|--------|-------------|
| [core-types.md](core-types.md) | draft | ProtocolHandler trait, HandlerError, Connection, BiStream, StreamError |
| [endpoint.md](endpoint.md) | draft | ALPN router, HandlerRegistry, accept loop, graceful shutdown |
| [auth.md](auth.md) | draft | AuthContext, Identity, IdentityProvider, AuthToken, resolution flow |
| [config.md](config.md) | draft | StaticConfig, DynamicConfig, ArcSwap, ConfigReloadHandle |
## Applicable ADRs
| ADR | Title | Relevance |
|-----|-------|-----------|
| [001](../../decisions/001-alpn-protocol-dispatch.md) | ALPN-Based Protocol Dispatch | Core architectural model |
| [002](../../decisions/002-protocol-handler-trait.md) | ProtocolHandler Trait | The trait every handler implements |
| [003](../../decisions/003-crate-decomposition.md) | Crate Decomposition | alknet-core's position in the crate graph |
| [004](../../decisions/004-auth-as-shared-core.md) | Auth as Shared Core | IdentityProvider in core |
| [006](../../decisions/006-alpn-convention-and-connection-model.md) | ALPN String Convention | ALPN format, one-ALPN-per-connection |
| [007](../../decisions/007-bistream-type-definition.md) | BiStream Type Definition | Connection, BiStream trait, SendStream, RecvStream |
| [009](../../decisions/009-one-way-door-decision-framework.md) | One-Way Door Framework | Decision classification |
| [010](../../decisions/010-alpn-router-and-endpoint.md) | ALPN Router and Endpoint | Endpoint, HandlerRegistry, accept loop |
| [011](../../decisions/011-authcontext-structure.md) | AuthContext Structure | AuthContext fields and resolution flow |
## Relevant Open Questions
| OQ | Title | Status | Relevance |
|----|-------|--------|-----------|
| OQ-04 | Dynamic handler registration | resolved (start static) | HandlerRegistry is immutable at startup |
| OQ-05 | Multi-transport endpoint | open (start with quinn) | AlknetEndpoint uses quinn directly |
| OQ-11 | AuthContext resolution completeness | open | How handlers signal auth completion |
## Key Design Principles
1. **One trait, one dispatch point**: `ProtocolHandler` is the only abstraction handlers implement. No StreamInterface/MessageInterface split.
2. **ALPN does the routing**: The endpoint dispatches by ALPN string. No byte-peeking, no ListenerConfig enum.
3. **Handlers own their wire format**: Each handler manages its own protocol parsing. alknet-core provides the Connection, not the framing.
4. **Auth is hybrid**: The endpoint provides what it can (TLS-level auth). Handlers complete what they need. AuthContext may be partial.
5. **WASM door preserved**: BiStream is a trait, Connection is an opaque type. Core types don't assume tokio or quinn in public APIs.


@@ -0,0 +1,237 @@
---
status: draft
last_updated: 2026-06-16
---
# Authentication
AuthContext, Identity, IdentityProvider, AuthToken, and the resolution flow.
See [ADR-004](../../decisions/004-auth-as-shared-core.md) and [ADR-011](../../decisions/011-authcontext-structure.md) for rationale.
## AuthContext
Created by the endpoint for each incoming connection. Passed to `ProtocolHandler::handle()` as an immutable reference.
```rust
#[derive(Clone)]
pub struct AuthContext {
    /// The peer's authenticated identity, if resolved by the endpoint.
    /// None means the endpoint has no identity information for this connection.
    pub identity: Option<Identity>,
    /// The negotiated ALPN for this connection. Always present.
    pub alpn: Vec<u8>,
    /// The peer's remote address, if available. Informational (NAT/proxy).
    pub remote_addr: Option<SocketAddr>,
    /// SHA-256 fingerprint of the TLS client certificate, if presented.
    /// Set by the endpoint during TLS handshake. Handlers may use this for
    /// fingerprint-based auth even when IdentityProvider returns None.
    pub tls_client_fingerprint: Option<String>,
}
```
### Construction by the endpoint
The endpoint constructs `AuthContext` from the QUIC connection:
1. `alpn`: From `connection.alpn()` — always present after TLS handshake.
2. `remote_addr`: From `connection.remote_addr()` — may be `None` for iroh connections.
3. `tls_client_fingerprint`: Extracted from the TLS session's client certificate, if one was presented.
4. `identity`: If a TLS client fingerprint is available, the endpoint calls `IdentityProvider::resolve_from_fingerprint()`. If it resolves, `identity = Some(resolved)`. If not, `identity = None`.
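The four construction steps above can be sketched roughly as follows. This is a minimal, std-only illustration, not the endpoint's actual code: `build_auth_context` and `SingleKeyProvider` are hypothetical names, the function arguments stand in for values the endpoint would read from the QUIC connection and TLS session, and `Identity` and `IdentityProvider` are reduced to the parts the flow needs.
```rust
use std::net::SocketAddr;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

// Reduced stand-in for the IdentityProvider trait (fingerprint path only).
pub trait IdentityProvider: Send + Sync {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
}

#[derive(Clone)]
pub struct AuthContext {
    pub identity: Option<Identity>,
    pub alpn: Vec<u8>,
    pub remote_addr: Option<SocketAddr>,
    pub tls_client_fingerprint: Option<String>,
}

/// Hypothetical helper: the arguments stand in for values taken from the
/// connection (ALPN, remote address) and the TLS session (fingerprint).
pub fn build_auth_context(
    alpn: &[u8],
    remote_addr: Option<SocketAddr>,
    tls_client_fingerprint: Option<String>,
    provider: &dyn IdentityProvider,
) -> AuthContext {
    // Step 4: resolve identity only when a fingerprint was presented.
    let identity = tls_client_fingerprint
        .as_deref()
        .and_then(|fp| provider.resolve_from_fingerprint(fp));
    AuthContext {
        identity,
        alpn: alpn.to_vec(),
        remote_addr,
        tls_client_fingerprint,
    }
}

/// Toy provider that recognizes exactly one fingerprint (illustration only).
pub struct SingleKeyProvider(pub String);

impl IdentityProvider for SingleKeyProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        (fingerprint == self.0).then(|| Identity {
            id: fingerprint.to_string(),
            scopes: vec!["relay:connect".to_string()],
        })
    }
}
```
Note that an unrecognized fingerprint still ends up in `tls_client_fingerprint`, so a handler can attempt its own fingerprint-based resolution later.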
### Handler-level resolution
Handlers that require authentication extract protocol-specific credentials and call `IdentityProvider` inside `handle()`:
```rust
// Example: CallAdapter extracting an AuthToken from the first frame
async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError> {
    let identity = match &auth.identity {
        Some(id) => id.clone(), // Endpoint already resolved identity
        None => {
            let stream = connection.accept_bi().await?;
            let token = extract_auth_token(stream).await?;
            self.identity_provider
                .resolve_from_token(&token)
                .ok_or(HandlerError::AuthRequired)?
        }
    };
    // ... proceed with authenticated identity
}
```
Handlers that don't require authentication (e.g., DNS resolver, health check) can ignore `auth.identity` entirely.
### AuthContext is Clone and immutable
- `derive(Clone)` allows handlers to clone `AuthContext` for per-stream or per-channel contexts.
- `handle()` receives `&AuthContext`, which is immutable. Handlers that resolve identity keep the result in a local variable; they don't mutate the shared context. This prevents cross-contamination between streams on the same connection.
## Identity
The authenticated peer identity. Carries authorization information.
```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    /// Unique identifier string. Fingerprint, key prefix, or principal name.
    pub id: String,
    /// Authorization scopes. e.g., ["relay:connect", "secrets:derive"]
    pub scopes: Vec<String>,
    /// Named resource lists. e.g., {"service": ["gitea", "registry"]}
    pub resources: HashMap<String, Vec<String>>,
}
```
This is the same structure as the reference implementation (`alknet-main/crates/alknet-core/src/auth/identity.rs`), minus the russh dependency. The `id` field is ALPN-agnostic:
- SSH key auth: `"SHA256:abc123..."` (key fingerprint)
- API key auth: `"alk_test"` (key prefix)
- Certificate auth: `"username"` (principal name)
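To illustrate how a handler might consult `scopes` and `resources`, here is a small sketch. The `has_scope` and `can_access` helpers and the `example_identity` constructor are hypothetical and not part of the spec; exact string matching is an assumption, since the documents do not define wildcard or hierarchical scopes.
```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

impl Identity {
    /// Hypothetical helper: exact-match scope check.
    pub fn has_scope(&self, scope: &str) -> bool {
        self.scopes.iter().any(|s| s == scope)
    }

    /// Hypothetical helper: is `name` listed under the resource list `kind`?
    pub fn can_access(&self, kind: &str, name: &str) -> bool {
        self.resources
            .get(kind)
            .map_or(false, |names| names.iter().any(|n| n == name))
    }
}

/// Builds an identity shaped like the API-key example above.
pub fn example_identity() -> Identity {
    let mut resources = HashMap::new();
    resources.insert(
        "service".to_string(),
        vec!["gitea".to_string(), "registry".to_string()],
    );
    Identity {
        id: "alk_test".to_string(),
        scopes: vec!["relay:connect".to_string()],
        resources,
    }
}
```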
## AuthToken
Opaque authentication token carried in protocol frames.
```rust
#[derive(Debug, Clone)]
pub struct AuthToken {
    pub raw: Vec<u8>,
}
```
Unchanged from the reference implementation. The handler that extracted it knows its encoding (UTF-8 string, binary token, etc.).
## IdentityProvider
Trait for resolving credentials to identities. Implemented by `ConfigIdentityProvider`.
```rust
pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}
```
- `resolve_from_fingerprint()`: Used by the endpoint (TLS client cert) and by SSH (key fingerprint).
- `resolve_from_token()`: Used by call protocol (AuthToken in first frame) and HTTP (Bearer header).
Both methods return `Option<Identity>`; `None` means the credential is not recognized.
## ConfigIdentityProvider
The default implementation. Resolves identities from `DynamicConfig`:
```rust
pub struct ConfigIdentityProvider {
    dynamic: Arc<ArcSwap<DynamicConfig>>,
}
```
The "Config" prefix indicates that identities are resolved from configuration (as opposed to a database or external service). This reads from `ArcSwap<DynamicConfig>`, which is hot-reloadable — not from `StaticConfig`. An alternative name would be `DynamicConfigIdentityProvider` to make this clearer, but `ConfigIdentityProvider` is consistent with the reference implementation and the naming is unlikely to cause confusion in practice.
How it resolves:
- **Fingerprint**: Look up in `DynamicConfig::auth::authorized_fingerprints`. If found, return `Identity { id: fingerprint, scopes: ["relay:connect"], resources: {} }`.
- **Token**: Parse as UTF-8. If it starts with `alk_`, look up in `DynamicConfig::auth::api_keys` by prefix match + SHA-256 hash. If found and not expired, return `Identity { id: prefix, scopes: entry.scopes, resources: entry.resources }`.
Changes to `DynamicConfig` via `ConfigReloadHandle` are reflected immediately — `ConfigIdentityProvider` reads from `ArcSwap` on every call.
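The token path described above could look roughly like the sketch below. Several things here are assumptions made for illustration: the free function `resolve_from_token` takes the API key list and the current time directly instead of reading them from `DynamicConfig`; `hash_hex` is a caller-supplied stand-in for hex-encoded SHA-256 (std has no SHA-256, so a real implementation would use a hashing crate); a key counts as expired once `expires_at <= now`; and `Identity` is reduced to `id` and `scopes`.
```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

pub struct AuthToken {
    pub raw: Vec<u8>,
}

pub struct ApiKeyEntry {
    pub prefix: String,
    pub hash: String,
    pub scopes: Vec<String>,
    pub expires_at: Option<u64>,
}

/// Sketch of token resolution. `now` is the current Unix time in seconds;
/// `hash_hex` stands in for hex-encoded SHA-256 of the full key.
pub fn resolve_from_token(
    api_keys: &[ApiKeyEntry],
    token: &AuthToken,
    now: u64,
    hash_hex: impl Fn(&[u8]) -> String,
) -> Option<Identity> {
    // Parse as UTF-8; anything else is not an API key.
    let key = std::str::from_utf8(&token.raw).ok()?;
    if !key.starts_with("alk_") {
        return None;
    }
    // The prefix is the first 8 bytes of the key (assumes ASCII keys).
    let prefix = key.get(..8)?;
    let entry = api_keys.iter().find(|e| e.prefix == prefix)?;
    // Verify the full key against the stored hash.
    if entry.hash != hash_hex(key.as_bytes()) {
        return None;
    }
    // Assumption: a key is expired once `expires_at <= now`.
    if entry.expires_at.map_or(false, |t| t <= now) {
        return None;
    }
    Some(Identity {
        id: entry.prefix.clone(),
        scopes: entry.scopes.clone(),
    })
}
```
A production version would likely also want a constant-time hash comparison; the plain `!=` above is only for brevity.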
## Resolution Flow
### Endpoint-level (before `handle()`)
```
QUIC connection arrives
  → TLS handshake (ALPN negotiation)
  → Extract TLS client certificate fingerprint (if presented)
  → If fingerprint present: IdentityProvider::resolve_from_fingerprint()
      → Some(identity): auth.identity = Some(identity)
      → None: auth.identity = None
  → Construct AuthContext { identity, alpn, remote_addr, tls_client_fingerprint }
  → Look up handler by alpn
  → tokio::spawn(handler.handle(connection, &auth))
```
### Handler-level (inside `handle()`)
```
Handler receives &AuthContext
  → If auth.identity is Some: use it (endpoint already resolved)
  → If auth.identity is None and handler requires auth:
      → Extract protocol-specific credential (AuthToken, SSH key, etc.)
      → Call IdentityProvider::resolve_from_token() or resolve_from_fingerprint()
      → If resolved: use the Identity
      → If not resolved: return HandlerError::AuthRequired
  → If handler doesn't require auth: proceed without identity
```
## IdentityProvider Injection
Handlers need access to `IdentityProvider` to resolve credentials inside `handle()`. Since `ProtocolHandler::handle()` doesn't receive an `IdentityProvider` parameter, each handler must obtain it through **constructor injection**:
```rust
// Example: SshAdapter holds an Arc<dyn IdentityProvider>
pub struct SshAdapter {
    identity_provider: Arc<dyn IdentityProvider>,
    // ... other handler-specific state
}
#[async_trait]
impl ProtocolHandler for SshAdapter {
    fn alpn(&self) -> &'static [u8] { b"alknet/ssh" }
    async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError> {
        let identity = match &auth.identity {
            Some(id) => id.clone(),
            None => {
                // Extract SSH key fingerprint, resolve via identity_provider
                let fingerprint = extract_ssh_fingerprint(&connection).await?;
                self.identity_provider
                    .resolve_from_fingerprint(&fingerprint)
                    .ok_or(HandlerError::AuthRequired)?
            }
        };
        // ...
    }
}
```
The CLI binary constructs each handler with `Arc::clone(&identity_provider)` and passes it when building the `HandlerRegistry`. This is the **assembly pattern**: the CLI (the only crate that depends on all handlers) wires dependencies together.
The endpoint's `AlknetEndpoint` also holds `Arc<dyn IdentityProvider>` for endpoint-level auth resolution (TLS client certificate fingerprints), but handlers don't receive it from the endpoint — they receive it at construction time from the CLI.
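The assembly pattern can be sketched in a few lines. This is an illustration under assumptions, not the CLI's real wiring: `assemble`, `DenyAll`, the `new` constructors, and the `is_known` helper are hypothetical, and the provider trait is reduced to a single method returning the identity's `id` as a `String`.
```rust
use std::sync::Arc;

// Reduced stand-in for IdentityProvider: returns only the identity's id.
pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<String>;
}

pub struct SshAdapter {
    identity_provider: Arc<dyn IdentityProvider>,
}

pub struct CallAdapter {
    identity_provider: Arc<dyn IdentityProvider>,
}

impl SshAdapter {
    pub fn new(identity_provider: Arc<dyn IdentityProvider>) -> Self {
        Self { identity_provider }
    }
    /// Hypothetical helper showing the injected provider in use.
    pub fn is_known(&self, fingerprint: &str) -> bool {
        self.identity_provider.resolve_from_fingerprint(fingerprint).is_some()
    }
}

impl CallAdapter {
    pub fn new(identity_provider: Arc<dyn IdentityProvider>) -> Self {
        Self { identity_provider }
    }
    /// Hypothetical helper showing the injected provider in use.
    pub fn is_known(&self, fingerprint: &str) -> bool {
        self.identity_provider.resolve_from_fingerprint(fingerprint).is_some()
    }
}

/// Toy provider that rejects every credential (illustration only).
pub struct DenyAll;

impl IdentityProvider for DenyAll {
    fn resolve_from_fingerprint(&self, _fingerprint: &str) -> Option<String> {
        None
    }
}

/// Hypothetical assembly step: one provider, shared by every handler.
pub fn assemble(provider: Arc<dyn IdentityProvider>) -> (SshAdapter, CallAdapter) {
    (
        SshAdapter::new(Arc::clone(&provider)),
        CallAdapter::new(Arc::clone(&provider)),
    )
}
```
Because every handler holds a clone of the same `Arc`, they all observe the same provider state without the endpoint having to pass it into `handle()`.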
| Handler | Credential source | Resolution method |
|---------|------------------|-----------------|
| SshAdapter | SSH public key handshake | `resolve_from_fingerprint()` |
| CallAdapter | AuthToken in first frame | `resolve_from_token()` |
| HttpAdapter | `Authorization: Bearer` header | `resolve_from_token()` |
| DnsAdapter | AuthToken in query labels | `resolve_from_token()` |
| GitAdapter | Signed push certificate | `resolve_from_fingerprint()` |
| SftpAdapter | SSH key (shares with SshAdapter) | `resolve_from_fingerprint()` |
## Key Differences from Reference Implementation
| Aspect | Reference | New Model |
|--------|-----------|-----------|
| Auth resolution | Inside SSH handler, before `handle()` | Hybrid: endpoint resolves TLS-level, handler resolves protocol-level |
| AuthContext type | None (just `Arc<ArcSwap<DynamicConfig>>` + `IdentityProvider`) | Explicit struct with optional fields |
| `Identity.id` | Always a fingerprint or API key prefix | Same, but ALPN-agnostic documentation |
| `ConfigIdentityProvider` | Depends on russh for `PublicKey` types | No russh dependency; fingerprints stored as strings |
| Credential phases | AD phases in `CredentialProvider` | Two paths: fingerprint and token. No phases. |
## Design Decisions
| Decision | ADR | Summary |
|----------|-----|---------|
| Hybrid auth model | [ADR-004](../../decisions/004-auth-as-shared-core.md) | Endpoint resolves TLS-level, handler resolves protocol-level |
| AuthContext with optional Identity | [ADR-011](../../decisions/011-authcontext-structure.md) | Explicit None, not "partially authenticated" |
| AuthContext is immutable in handle() | [ADR-011](../../decisions/011-authcontext-structure.md) | Handlers create local variables for resolved identity |
| Two resolution paths | [ADR-004](../../decisions/004-auth-as-shared-core.md) | Fingerprint and token, not phased auth |
## Open Questions
- **OQ-11**: See [open-questions.md](../../open-questions.md) — handler-level auth resolution observability.


@@ -0,0 +1,198 @@
---
status: draft
last_updated: 2026-06-16
---
# Configuration
StaticConfig, DynamicConfig, ArcSwap, and ConfigReloadHandle.
## StaticConfig
Immutable configuration resolved at startup. Cannot be changed without restarting the endpoint.
```rust
pub struct StaticConfig {
    /// Bind address for the QUIC endpoint (e.g., "0.0.0.0:4433").
    pub listen_addr: SocketAddr,
    /// Path to TLS certificate file (PEM).
    /// Required for QUIC+TLS. The endpoint will not start without TLS configuration.
    pub tls_cert: Option<PathBuf>,
    /// Path to TLS private key file (PEM).
    /// Required alongside tls_cert.
    pub tls_key: Option<PathBuf>,
    /// Drain timeout for graceful shutdown (default: 2 seconds).
    pub drain_timeout: Duration,
}
```
### Key differences from reference implementation
The reference `StaticConfig` (in `alknet-main/crates/alknet-core/src/config/static_config.rs`) is SSH-centric: it holds `host_key`, `host_key_algorithm`, `proxy_config`, `stealth`, `transport_mode`, and `listeners`. The new model removes all of these:
- **No `host_key`/`host_key_algorithm`**: SSH host keys are managed by the SSH handler, not by core config. The endpoint uses TLS certs, not SSH host keys.
- **No `proxy_config`**: Outbound proxy is an SSH-specific concern (SOCKS5/HTTP CONNECT forwarding). Not in core config.
- **No `stealth`**: ALPN eliminates the need for stealth/byte-peeking. See [ADR-001](../../decisions/001-alpn-protocol-dispatch.md).
- **No `transport_mode`/`listeners`**: The old `ServeTransportMode` and `ListenerConfig` enum are replaced by a single `listen_addr`. QUIC+TLS+ALPN replaces multiple listener types. See [ADR-010](../../decisions/010-alpn-router-and-endpoint.md).
- **No `iroh_relay`**: iroh transport is deferred (OQ-05). The v1 endpoint uses quinn directly.
### Construction
`StaticConfig` is constructed by the CLI binary from CLI arguments or a config file. The exact shape of `StartupOptions` (or whatever the CLI uses) is a CLI concern, not a core concern. alknet-core provides `StaticConfig` as a data structure; the CLI is responsible for populating it.
```rust
// The CLI binary constructs StaticConfig from its own options/config.
// StartupOptions is NOT a core type — it belongs to the alknet CLI binary.
// alknet-core receives a fully populated StaticConfig.
let static_config = StaticConfig {
    listen_addr: "0.0.0.0:4433".parse()?,
    tls_cert: Some("/path/to/cert.pem".into()),
    tls_key: Some("/path/to/key.pem".into()),
    drain_timeout: Duration::from_secs(2),
};
```
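Since `tls_key` is documented as required alongside `tls_cert`, the CLI may want to check the pair before handing `StaticConfig` to core. A possible sketch follows; `check_tls_paths` is a hypothetical helper, using `ConfigError::IncompatibleOptions` for a mismatched pair is an assumption, and so is treating "neither path set" as the self-signed development case.
```rust
use std::net::SocketAddr;
use std::path::PathBuf;
use std::time::Duration;

pub struct StaticConfig {
    pub listen_addr: SocketAddr,
    pub tls_cert: Option<PathBuf>,
    pub tls_key: Option<PathBuf>,
    pub drain_timeout: Duration,
}

// Reduced to the one variant this sketch uses.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    IncompatibleOptions,
}

/// Hypothetical check: cert and key must be given together.
/// Assumption: both absent is allowed and means "generate a self-signed cert".
pub fn check_tls_paths(cfg: &StaticConfig) -> Result<(), ConfigError> {
    match (&cfg.tls_cert, &cfg.tls_key) {
        (Some(_), Some(_)) | (None, None) => Ok(()),
        _ => Err(ConfigError::IncompatibleOptions),
    }
}
```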
## DynamicConfig
Runtime-reloadable configuration. Hot-reloaded via `ArcSwap` without restarting the endpoint.
```rust
#[derive(Debug, Clone)]
pub struct DynamicConfig {
    pub auth: AuthPolicy,
    pub rate_limits: RateLimitConfig,
}
```
### AuthPolicy
Authorization policy derived from authorized keys, certificate authorities, and API keys.
```rust
pub struct AuthPolicy {
    /// SHA-256 fingerprints of authorized keys (SSH keys, TLS client certs).
    /// Stored as strings to avoid russh dependency in core.
    pub authorized_fingerprints: HashSet<String>,
    /// Certificate authorities for certificate-based auth.
    /// The exact structure is TBD — it will be defined when alknet-ssh
    /// is implemented. For now, this is a placeholder that reserves
    /// the field. alknet-ssh will define `CertAuthorityEntry` with
    /// the necessary fields (public key, principals, options).
    pub cert_authorities: Vec<CertAuthorityEntry>,
    /// API keys for token-based auth.
    pub api_keys: Vec<ApiKeyEntry>,
}
```
`CertAuthorityEntry` is a placeholder type. Its fields will be defined when alknet-ssh is implemented and the certificate authority validation requirements are clear. For v1, `cert_authorities` will be an empty vector.
This replaces the reference implementation's `AuthPolicy` which depended on `russh::keys::PublicKey`. The new version stores fingerprints as strings, not russh types. This removes the russh dependency from alknet-core.
### ApiKeyEntry
```rust
pub struct ApiKeyEntry {
    /// Key prefix (first 8 chars of the key). Used for O(1) lookup.
    pub prefix: String,
    /// SHA-256 hash of the full key. Used for verification.
    pub hash: String,
    /// Authorization scopes granted by this key.
    pub scopes: Vec<String>,
    /// Human-readable description.
    pub description: String,
    /// Unix timestamp when the key expires. None = never expires.
    pub expires_at: Option<u64>,
}
```
Carries forward from the reference implementation with no changes.
### RateLimitConfig
```rust
pub struct RateLimitConfig {
    pub max_connections_per_ip: usize,
    pub max_auth_attempts: usize,
}
```
Carries forward from the reference implementation. `max_connections_per_ip` and `max_auth_attempts` live only in `RateLimitConfig`, not in `StaticConfig`. The relationship is:
- `StaticConfig` does NOT contain rate limit fields. Rate limits are entirely dynamic.
- `RateLimitConfig` in `DynamicConfig` is the authoritative source at runtime.
- The CLI binary sets initial `RateLimitConfig` values when creating the initial `DynamicConfig`.
- Hot-reloading `DynamicConfig` via `ConfigReloadHandle` replaces rate limits immediately — no restart needed.
## ArcSwap Pattern
`DynamicConfig` is wrapped in `Arc<ArcSwap<DynamicConfig>>` for lock-free reads and atomic swaps.
```rust
let dynamic = Arc::new(ArcSwap::new(Arc::new(DynamicConfig::default())));
```
- **Reads**: `dynamic.load()` returns a cheap guard that dereferences to `Arc<DynamicConfig>` (use `load_full()` for an owned `Arc<DynamicConfig>`). Multiple readers can hold references simultaneously without blocking.
- **Writes**: `dynamic.store(Arc::new(new_config))` atomically replaces the config. All subsequent reads see the new config.
- **No locks**: `ArcSwap` uses atomic operations. No reader is ever blocked by a writer.
This pattern carries forward directly from the reference implementation (`alknet-main/crates/alknet-core/src/config/dynamic_config.rs`).
## ConfigReloadHandle
```rust
pub struct ConfigReloadHandle {
    dynamic: Arc<ArcSwap<DynamicConfig>>,
}
impl ConfigReloadHandle {
    pub fn reload(&self, new_config: DynamicConfig);
    pub fn dynamic(&self) -> Arc<DynamicConfig>;
}
```
- `reload()`: Atomically replaces the dynamic config. All subsequent reads see the new config; an `IdentityProvider` call that has already loaded a snapshot finishes with the old one.
- `dynamic()`: Returns the current config as `Arc<DynamicConfig>`.
The CLI binary creates a `ConfigReloadHandle` and passes it to a config watcher (file watcher, SIGHUP handler, or call protocol operation) that calls `reload()` when config changes are detected.
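The snapshot semantics of `reload()` and `dynamic()` can be demonstrated with a std-only sketch. To stay dependency-free it swaps in `RwLock<Arc<DynamicConfig>>` for `ArcSwap<DynamicConfig>`, so unlike the real design it takes a brief lock on each read; the `new` constructor and the single-field `DynamicConfig` are also simplifications made for this example.
```rust
use std::sync::{Arc, RwLock};

// Simplified stand-in for DynamicConfig (one field is enough to show a swap).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct DynamicConfig {
    pub max_connections_per_ip: usize,
}

#[derive(Clone)]
pub struct ConfigReloadHandle {
    // Stand-in for Arc<ArcSwap<DynamicConfig>>; RwLock is NOT lock-free.
    dynamic: Arc<RwLock<Arc<DynamicConfig>>>,
}

impl ConfigReloadHandle {
    /// Hypothetical constructor for the sketch.
    pub fn new(initial: DynamicConfig) -> Self {
        Self {
            dynamic: Arc::new(RwLock::new(Arc::new(initial))),
        }
    }

    /// Replace the whole config; later `dynamic()` calls see the new value.
    pub fn reload(&self, new_config: DynamicConfig) {
        *self.dynamic.write().unwrap() = Arc::new(new_config);
    }

    /// Return the current config as an owned snapshot.
    pub fn dynamic(&self) -> Arc<DynamicConfig> {
        let guard = self.dynamic.read().unwrap();
        Arc::clone(&*guard)
    }
}
```
A caller that took a snapshot before `reload()` keeps reading the old values until it calls `dynamic()` again, which is the same behavior an `ArcSwap`-backed handle would give.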
## ConfigError
```rust
pub enum ConfigError {
    InvalidFlag { name: String },
    KeyFileNotFound { path: String },
    BindFailed(io::Error),
    TlsConfig(io::Error),
    IncompatibleOptions,
}
```
Simplified from the reference implementation. Removes proxy-specific errors (now an SSH concern) and listener validation errors (no more `ListenerConfig` enum).
## Key Differences from Reference Implementation
| Aspect | Reference | New Model |
|--------|-----------|-----------|
| StaticConfig fields | SSH host key, stealth, transport_mode, listeners, proxy | listen_addr, TLS cert/key, drain_timeout |
| DynamicConfig.auth | `HashSet<PublicKey>` (russh types) | `HashSet<String>` (fingerprint strings) |
| ListenerConfig | Enum with Stream/Http/Dns variants | Eliminated — single endpoint, ALPN dispatch |
| TransportMode | Tcp/Tls/Iroh | Eliminated — always QUIC+TLS |
| Stealth mode | Byte-peeking HTTP/SSH detection | Eliminated — ALPN handles protocol detection |
| ForwardingPolicy | In DynamicConfig | Moved to handler-specific config (SSH) |
## Design Decisions
| Decision | ADR | Summary |
|----------|-----|---------|
| No russh dependency in core | [ADR-003](../../decisions/003-crate-decomposition.md) | Core is ALPN-agnostic; russh is an alknet-ssh dependency |
| ArcSwap for dynamic config | Carry-forward from reference | Lock-free reads, atomic swaps |
| No ListenerConfig | [ADR-001](../../decisions/001-alpn-protocol-dispatch.md) | Single endpoint, ALPN replaces multiple listener types |


@@ -0,0 +1,133 @@
---
status: draft
last_updated: 2026-06-16
---
# Core Types
ProtocolHandler, HandlerError, Connection, BiStream, SendStream, RecvStream, StreamError.
## ProtocolHandler
The central abstraction. Every handler implements one trait:
```rust
#[async_trait]
pub trait ProtocolHandler: Send + Sync + 'static {
    fn alpn(&self) -> &'static [u8];
    async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError>;
}
```
- `alpn()` returns the handler's ALPN identifier as a static byte string (e.g., `b"alknet/ssh"`, `b"alknet/call"`).
- `handle()` receives a `Connection` (not a single BiStream) and an `AuthContext`. Returns `HandlerError` on failure.
- Handlers that need a single stream call `connection.accept_bi()` once. Handlers that multiplex (SSH, call) open/accept streams as needed.
See [ADR-002](../../decisions/002-protocol-handler-trait.md) and [ADR-007](../../decisions/007-bistream-type-definition.md) for rationale.
## HandlerError
Non-fatal errors within a handler's `handle()` method. The endpoint catches these, logs them, and closes the connection. Other connections are unaffected.
```rust
pub enum HandlerError {
    ConnectionClosed,
    StreamError(io::Error),
    AuthRequired,
    Internal(Box<dyn std::error::Error + Send + Sync>),
}
```
- `ConnectionClosed`: The peer closed the connection. Clean exit.
- `StreamError`: An I/O error on a stream within the connection.
- `AuthRequired`: The handler requires authentication and couldn't resolve the peer's identity. The endpoint closes the connection with an appropriate error. Handlers that support multi-step auth (like SSH) should handle auth challenges within their protocol, not return `AuthRequired` until all attempts are exhausted.
- `Internal`: Handler-specific errors (protocol violations, upstream failures, etc.).
Handler panics are caught by tokio's task isolation. The connection is dropped, other connections continue.
## Connection
An opaque type wrapping a QUIC connection. Handlers receive a `Connection` in `handle()`.
```rust
pub struct Connection {
    // Private: wraps the underlying QUIC connection or test mock
}
impl Connection {
    pub async fn accept_bi(&self) -> Result<(SendStream, RecvStream), StreamError>;
    pub async fn open_bi(&self) -> Result<(SendStream, RecvStream), StreamError>;
    pub fn remote_alpn(&self) -> &[u8];
    pub fn remote_addr(&self) -> Option<SocketAddr>;
    pub fn close(&self, code: u32, reason: &str);
}
```
- `accept_bi()`: Wait for the peer to open a bidirectional stream. Returns `(SendStream, RecvStream)`.
- `open_bi()`: Open a bidirectional stream to the peer. Returns `(SendStream, RecvStream)`.
- `remote_alpn()`: The ALPN negotiated for this connection. Always present.
- `remote_addr()`: The peer's address, if available. Informational (NAT/proxy).
- `close()`: Close the connection with an error code and reason.
The `Connection` type does not expose quinn types in its public API. It wraps `quinn::Connection` internally, but the wrapper allows test implementations.
See [ADR-007](../../decisions/007-bistream-type-definition.md) for why handlers receive Connection instead of BiStream.
## BiStream
A trait for bidirectional byte streams. Used primarily for client-side and test scenarios.
```rust
pub trait BiStream: AsyncRead + AsyncWrite + Send + Unpin {}
```
Handlers that only need a single stream can obtain one via `connection.accept_bi()` and wrap the `(SendStream, RecvStream)` pair in a type that implements `BiStream` (the tuple itself does not implement `AsyncRead + AsyncWrite`). The `BiStream` trait is a convenience for:
- Client-side code that has a single bidirectional stream
- Test mocks that need to simulate a stream
- Future transport abstractions (WebTransport, raw TCP) that produce bidirectional byte streams
See [ADR-007](../../decisions/007-bistream-type-definition.md) for why BiStream is a trait.
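One way to present two stream halves as a single bidirectional stream is a small joining wrapper. The sketch below is an assumption-heavy illustration: it uses the synchronous `std::io::Read`/`Write` traits as stand-ins for `AsyncRead`/`AsyncWrite` so that it runs without an async runtime, and `Joined` and `echo_upper` are hypothetical names that do not appear in the spec.
```rust
use std::io::{self, Read, Write};

/// Synchronous stand-in for the async `BiStream` trait.
pub trait BiStream: Read + Write + Send {}

/// Joins a receive half and a send half into one bidirectional stream.
pub struct Joined<R, W> {
    recv: R,
    send: W,
}

impl<R, W> Joined<R, W> {
    pub fn new(send: W, recv: R) -> Self {
        Self { recv, send }
    }
    /// Split back into (send, recv), mirroring the accept_bi() ordering.
    pub fn into_parts(self) -> (W, R) {
        (self.send, self.recv)
    }
}

// Reads are delegated to the receive half.
impl<R: Read, W> Read for Joined<R, W> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.recv.read(buf)
    }
}

// Writes are delegated to the send half.
impl<R, W: Write> Write for Joined<R, W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.send.write(buf)
    }
    fn flush(&mut self) -> io::Result<()> {
        self.send.flush()
    }
}

impl<R: Read + Send, W: Write + Send> BiStream for Joined<R, W> {}

/// Example consumer that only knows about `BiStream`:
/// reads everything, uppercases it, and writes it back.
pub fn echo_upper(stream: &mut dyn BiStream) -> io::Result<()> {
    let mut buf = Vec::new();
    stream.read_to_end(&mut buf)?;
    buf.make_ascii_uppercase();
    stream.write_all(&buf)
}
```
The same wrapper shape works for test mocks: any in-memory reader and writer pair can be joined and handed to code written against the trait.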
## SendStream and RecvStream
Concrete types wrapping QUIC stream halves.
```rust
pub struct SendStream { /* wraps quinn::SendStream or test mock */ }
pub struct RecvStream { /* wraps quinn::RecvStream or test mock */ }
impl AsyncWrite for SendStream { ... }
impl AsyncRead for RecvStream { ... }
```
- `SendStream` implements `AsyncWrite`. Write bytes to the peer.
- `RecvStream` implements `AsyncRead`. Read bytes from the peer.
- These are not trait objects — they are concrete wrapper types that delegate to `quinn::SendStream` / `quinn::RecvStream` in production and to test mocks in tests.
This is a two-way door decision. If future transports need different stream types, `SendStream` and `RecvStream` can become wrappers with enum dispatch. For v1, concrete wrappers over quinn types are simpler and zero-cost.
## StreamError
```rust
pub enum StreamError {
    ConnectionClosed,
    StreamClosed,
    Timeout,
    Internal(io::Error),
}
```
Returned by `accept_bi()`, `open_bi()`, and stream read/write operations. Maps from `quinn::ConnectionError` and `quinn::StreamError`.
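Handlers that write `connection.accept_bi().await?` inside `handle()` need some conversion from `StreamError` to `HandlerError`. The spec does not define that mapping, so the following is only one plausible sketch; the choice of `io::ErrorKind` values is an assumption, and `accept_bi_stub` and `handle_stub` are hypothetical synchronous stand-ins used to show `?` in action.
```rust
use std::io;

#[derive(Debug)]
pub enum StreamError {
    ConnectionClosed,
    StreamClosed,
    Timeout,
    Internal(io::Error),
}

#[derive(Debug)]
pub enum HandlerError {
    ConnectionClosed,
    StreamError(io::Error),
    AuthRequired,
    Internal(Box<dyn std::error::Error + Send + Sync>),
}

// Assumed mapping: a closed connection stays a clean exit, everything
// else is reported as a stream-level I/O error.
impl From<StreamError> for HandlerError {
    fn from(e: StreamError) -> Self {
        match e {
            StreamError::ConnectionClosed => HandlerError::ConnectionClosed,
            StreamError::StreamClosed => HandlerError::StreamError(io::Error::new(
                io::ErrorKind::BrokenPipe,
                "stream closed",
            )),
            StreamError::Timeout => HandlerError::StreamError(io::Error::new(
                io::ErrorKind::TimedOut,
                "stream timed out",
            )),
            StreamError::Internal(err) => HandlerError::StreamError(err),
        }
    }
}

/// Stand-in for `connection.accept_bi()`: fails with the given error, if any.
pub fn accept_bi_stub(fail: Option<StreamError>) -> Result<(), StreamError> {
    match fail {
        Some(e) => Err(e),
        None => Ok(()),
    }
}

/// Stand-in for a handler body; `?` converts via the `From` impl above.
pub fn handle_stub(fail: Option<StreamError>) -> Result<(), HandlerError> {
    accept_bi_stub(fail)?;
    Ok(())
}
```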
## Design Decisions
| Decision | ADR | Summary |
|----------|-----|---------|
| ProtocolHandler receives Connection, not BiStream | [ADR-007](../../decisions/007-bistream-type-definition.md) | Handlers that need multiple streams (SSH, call) have direct access to the Connection |
| BiStream is a trait | [ADR-007](../../decisions/007-bistream-type-definition.md) | WASM door preserved, test mocks possible |
| HandlerError is non-fatal | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Handler errors close the connection, not the endpoint |
| SendStream/RecvStream are concrete wrappers | Two-way door | Can become enum dispatch later if multi-transport is needed |
## Open Questions
- **OQ-05**: See [open-questions.md](../../open-questions.md) — multi-transport. If quinn is the only transport in v1, SendStream/RecvStream can be concrete wrappers.


@@ -0,0 +1,189 @@
---
status: draft
last_updated: 2026-06-16
---
# Endpoint
ALPN router, handler registry, connection accept loop, and graceful shutdown.
See [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) for the full rationale.
## AlknetEndpoint
The central runtime type. Owns the QUIC endpoint, holds the handler registry, and runs the accept loop.
```rust
pub struct AlknetEndpoint {
    endpoint: quinn::Endpoint,
    handlers: Arc<HandlerRegistry>,
    dynamic: Arc<ArcSwap<DynamicConfig>>,
    identity_provider: Arc<dyn IdentityProvider>,
    shutdown: watch::Receiver<bool>,
}
```
### Construction
The CLI binary constructs an `AlknetEndpoint` at startup:
1. Build `HandlerRegistry` by inserting handlers for each ALPN.
2. Build `StaticConfig` from CLI arguments / config file.
3. Build `rustls::ServerConfig` from TLS cert/key and the registry's ALPN strings.
4. Bind `quinn::Endpoint` with the `ServerConfig`.
5. Create `ArcSwap<DynamicConfig>` and `ConfigIdentityProvider`.
6. Call `AlknetEndpoint::new(endpoint, handlers, dynamic, identity_provider)`.
### Accept Loop
```
loop {
    tokio::select! {
        incoming = endpoint.accept() => {
            let connection = incoming.await; // TLS handshake + ALPN negotiation
            match connection {
                Ok(conn) => {
                    let alpn = conn.alpn();
                    match handlers.get(alpn) {
                        Some(handler) => {
                            let auth = AuthContext::from_connection(&conn);
                            let conn = Connection::new(conn);
                            tokio::spawn(async move {
                                if let Err(e) = handler.handle(conn, &auth).await {
                                    // log error, connection closes
                                }
                            });
                        }
                        None => {
                            // ALPN has no handler; should not happen
                            // (ServerConfig only advertises registered ALPNs)
                            conn.close(0u32, "no handler");
                        }
                    }
                }
                Err(e) => {
                    // TLS handshake or connection-level error
                    // log and continue accepting
                }
            }
        }
        _ = shutdown.changed() => {
            break; // graceful shutdown
        }
    }
}
```
### What the accept loop does NOT do
- **No byte-peeking**: ALPN negotiation handles protocol detection. The old `stealth` module's `detect_protocol()` is unnecessary.
- **No per-handler accept loops**: The old model had `ListenerConfig::Stream`, `ListenerConfig::Http`, `ListenerConfig::Dns` with different accept paths. ALPN unifies this.
- **No SSH-specific logic**: The accept loop is ALPN-agnostic. It doesn't know or care what protocol the handler speaks.
## HandlerRegistry
Maps ALPN byte strings to `ProtocolHandler` instances.
```rust
pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
}
impl HandlerRegistry {
    pub fn new() -> Self;
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>);
    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>>;
    pub fn alpn_strings(&self) -> Vec<Vec<u8>>;
}
```
- `register()`: Insert a handler. Panics if the ALPN is already registered (duplicate handlers are a bug).
- `get()`: Look up a handler by ALPN string. Returns `None` if no handler is registered.
- `alpn_strings()`: Return all registered ALPN strings. Used to build the TLS `ServerConfig`.
Registration is static at startup (see [OQ-04](../../open-questions.md) and ADR-010). The CLI builds a `HandlerRegistry`, inserts all handlers, and passes it to `AlknetEndpoint`. The registry is immutable after construction.
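The registry API above is small enough to sketch in full. The version below is a minimal illustration under assumptions: `ProtocolHandler` is reduced to its `alpn()` method so the example needs no async runtime, `Fixed` is a hypothetical toy handler, and sorting the output of `alpn_strings()` is an added convenience for deterministic results rather than something the spec requires.
```rust
use std::collections::HashMap;
use std::sync::Arc;

// Reduced stand-in for ProtocolHandler: only `alpn()` is needed here.
pub trait ProtocolHandler: Send + Sync + 'static {
    fn alpn(&self) -> &'static [u8];
}

#[derive(Default)]
pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
}

impl HandlerRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Insert a handler under its own ALPN. Panics on a duplicate ALPN.
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>) {
        let alpn = handler.alpn();
        let previous = self.handlers.insert(alpn, handler);
        assert!(
            previous.is_none(),
            "duplicate ALPN: {}",
            String::from_utf8_lossy(alpn)
        );
    }

    /// Look up a handler by the negotiated ALPN.
    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>> {
        self.handlers.get(alpn)
    }

    /// All registered ALPNs, sorted (the sort is an assumption of this sketch).
    pub fn alpn_strings(&self) -> Vec<Vec<u8>> {
        let mut alpns: Vec<Vec<u8>> = self.handlers.keys().map(|k| k.to_vec()).collect();
        alpns.sort();
        alpns
    }
}

/// Toy handler with a fixed ALPN (illustration only).
pub struct Fixed(pub &'static [u8]);

impl ProtocolHandler for Fixed {
    fn alpn(&self) -> &'static [u8] {
        self.0
    }
}
```
The list returned by `alpn_strings()` has the `Vec<Vec<u8>>` shape that a TLS server configuration typically expects for its ALPN protocol list.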
### ALPN strings in the TLS ServerConfig
The `rustls::ServerConfig`'s ALPN protocol list is set from `registry.alpn_strings()` at construction time. This means:
- Only registered handlers' ALPNs are advertised during TLS negotiation.
- If none of the ALPNs a client offers is in the list, the TLS handshake fails, which is the correct behavior.
- Adding a handler at runtime requires rebuilding the `ServerConfig` (see OQ-04).
## Graceful Shutdown
```rust
impl AlknetEndpoint {
    pub fn shutdown_sender(&self) -> watch::Sender<bool>;
    pub async fn shutdown(&self) -> Result<(), EndpointError>;
}
```
- `shutdown_sender()` returns a clone of the shutdown channel sender. Call `send(true)` to signal shutdown.
- `shutdown()` waits for in-flight connections to complete, with a drain timeout (default: 2 seconds). After the timeout, remaining connections are forcefully closed.
- SIGTERM/SIGINT are wired to the shutdown channel by the CLI binary.
The drain timeout is configurable via `StaticConfig::drain_timeout`.
## Error Handling
### EndpointError
Fatal errors that prevent the endpoint from starting or continuing.
```rust
pub enum EndpointError {
    BindFailed(io::Error),
    TlsConfig(io::Error),
    HandlerNotFound(Vec<u8>), // ALPN string with no registered handler
}
```
### HandlerError
Non-fatal errors within a handler. See [core-types.md](core-types.md) for details.
### Accept loop errors
- **TLS handshake failure**: Log and continue. The client may have offered no compatible ALPN, or the cert may be untrusted by the client.
- **Handler panic**: Caught by tokio's task isolation. The connection is dropped. Other connections continue.
- **Connection-level errors** (quinn `ConnectionError`): Log and continue. The accept loop keeps running.
## TLS Certificate Provisioning
`StaticConfig` provides TLS configuration via file paths:
- **Manual**: `tls_cert` and `tls_key` file paths. Required for production use.
- **Self-signed**: For development. The endpoint can generate a self-signed cert on startup.
The `rustls::ServerConfig` is built from cert + key + ALPN list at startup.
ACME auto-provisioning (Let's Encrypt) is not in scope for v1. It will be added as a feature later (see OQ-12).
## Key Differences from Reference Implementation
| Aspect | Reference (`alknet-main`) | New Model |
|--------|---------------------------|-----------|
| Transport | `TransportAcceptor` trait, `TransportKind` enum | `quinn::Endpoint` directly |
| Listener config | `ListenerConfig` enum (Stream/Http/Dns) | Single endpoint, ALPN dispatch |
| Protocol detection | Byte-peeking (`stealth::detect_protocol`) | ALPN negotiation (TLS layer) |
| Accept loop | Per-transport, SSH-centric | ALPN-agnostic, handler-dispatched |
| Handler model | `ServerHandler` + `russh::server::Handler` | `ProtocolHandler::handle(Connection, &AuthContext)` |
| Config | `ServeOptions` builder | `StaticConfig` + `HandlerRegistry` + `AlknetEndpoint::new()` |
## Design Decisions
| Decision | ADR | Summary |
|----------|-----|---------|
| Static handler registration | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Two-way door, start static, add ArcSwap later |
| quinn::Endpoint directly, no TransportAcceptor | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Start with quinn, abstract later if needed |
| No byte-peeking, ALPN dispatch only | [ADR-001](../../decisions/001-alpn-protocol-dispatch.md) | TLS layer handles protocol detection |
| Handler panics isolated | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | tokio task isolation, connection closes |
## Open Questions
See [open-questions.md](../../open-questions.md) for full details.
- **OQ-04**: Resolved — HandlerRegistry is static at startup.
- **OQ-05**: Open — start with quinn, abstract later if needed.
- **OQ-12**: Resolved — start with file paths in StaticConfig, add ACME later.