docs(architecture): spec alknet-core with per-crate subdocs, ADR-010/011

Add alknet-core architecture specs in docs/architecture/crates/core/ with
focused subdocuments for core types, endpoint, auth, and config. Write
ADR-010 (ALPN Router and Endpoint) defining AlknetEndpoint, HandlerRegistry,
accept loop, and graceful shutdown. Write ADR-011 (AuthContext Structure)
defining AuthContext fields, immutability in handle(), and IdentityProvider
injection pattern. Resolve OQ-04 (static registration), OQ-12 (file paths
only for v1). Add OQ-11 (auth observability). Fix remaining alknet-secret
references to alknet-vault across ADRs 003/004/005/009.
2026-06-16 12:07:17 +00:00
parent 80128a56e5
commit 90d5f4eaf9
13 changed files with 1151 additions and 18 deletions


@@ -1,15 +1,15 @@
--- ---
status: draft status: draft
last_updated: 2026-06-16 last_updated: 2026-06-17
--- ---
# Alknet Architecture # Alknet Architecture
## Current State ## Current State
**Pre-implementation.** The project has completed a pivot from a three-layer model to an ALPN-as-service model. The greenfield workspace contains only `alknet-vault` (stable) and research/reference material. Foundational ADRs (001-009) are in place, including the BiStream type definition (ADR-007), vault integration (ADR-008), and the one-way door decision framework (ADR-009). Architecture specs are ready for Phase 1 implementation planning. **Pre-implementation.** The project has completed a pivot from a three-layer model to an ALPN-as-service model. The greenfield workspace contains only `alknet-vault` (stable) and research/reference material. Foundational ADRs (001-011) are in place, including the BiStream type definition (ADR-007), vault integration (ADR-008), ALPN router/endpoint (ADR-010), and AuthContext structure (ADR-011). The alknet-core crate spec is in draft.
**Next step**: Resolve remaining two-way-door questions during implementation. Start with alknet-core (ProtocolHandler trait, Connection, endpoint, router, auth types, config). **Next step**: Review alknet-core spec documents, then begin implementation. Two-way-door questions (OQ-05, OQ-07, OQ-11, OQ-12) will be resolved during implementation.
## Architecture Documents ## Architecture Documents
@@ -17,8 +17,13 @@ last_updated: 2026-06-16
|----------|--------|-------------| |----------|--------|-------------|
| [overview.md](overview.md) | draft | Workspace-level overview, crate graph, shared types, design principles | | [overview.md](overview.md) | draft | Workspace-level overview, crate graph, shared types, design principles |
| [open-questions.md](open-questions.md) | draft | Centralized OQ tracker with door-type classifications | | [open-questions.md](open-questions.md) | draft | Centralized OQ tracker with door-type classifications |
| [crates/core/README.md](crates/core/README.md) | draft | alknet-core crate index |
| [crates/core/core-types.md](crates/core/core-types.md) | draft | ProtocolHandler, HandlerError, Connection, BiStream, StreamError |
| [crates/core/endpoint.md](crates/core/endpoint.md) | draft | ALPN router, HandlerRegistry, accept loop, shutdown |
| [crates/core/auth.md](crates/core/auth.md) | draft | AuthContext, Identity, IdentityProvider, AuthToken, resolution flow |
| [crates/core/config.md](crates/core/config.md) | draft | StaticConfig, DynamicConfig, ArcSwap, ConfigReloadHandle |
Crate-specific specs will be created when each crate is ready for Phase 1 architecture work, not in advance. Crate-specific specs for alknet-call, alknet-ssh, etc. will be created when each crate is ready for Phase 1 architecture work.
## ADR Table ## ADR Table
@@ -33,6 +38,8 @@ Crate-specific specs will be created when each crate is ready for Phase 1 archit
| [007](decisions/007-bistream-type-definition.md) | BiStream Type Definition | Accepted | | [007](decisions/007-bistream-type-definition.md) | BiStream Type Definition | Accepted |
| [008](decisions/008-secret-service-integration.md) | Vault Integration Point | Accepted | | [008](decisions/008-secret-service-integration.md) | Vault Integration Point | Accepted |
| [009](decisions/009-one-way-door-decision-framework.md) | One-Way Door Decision Framework | Accepted | | [009](decisions/009-one-way-door-decision-framework.md) | One-Way Door Decision Framework | Accepted |
| [010](decisions/010-alpn-router-and-endpoint.md) | ALPN Router and Endpoint | Proposed |
| [011](decisions/011-authcontext-structure.md) | AuthContext Structure and Resolution Flow | Proposed |
## Open Questions ## Open Questions
@@ -45,10 +52,15 @@ See [open-questions.md](open-questions.md) for the full tracker.
- **OQ-06**: ALPN per connection, not per stream (ADR-006) - **OQ-06**: ALPN per connection, not per stream (ADR-006)
- **OQ-08**: Vault integration — CLI-embedded via call protocol (ADR-008) - **OQ-08**: Vault integration — CLI-embedded via call protocol (ADR-008)
**Two-way doors (deferred to implementation):** **Resolved two-way doors:**
- **OQ-04**: Dynamic handler registration — start static, add ArcSwap later - **OQ-04**: Dynamic handler registration — static at startup (ADR-010)
- **OQ-12**: TLS certificate provisioning — file paths in StaticConfig, ACME later
**Two-way doors (resolved or deferred to implementation):**
- **OQ-04**: Dynamic handler registration — resolved: static at startup (ADR-010)
- **OQ-05**: Multi-transport endpoint — start with quinn, add transport trait later - **OQ-05**: Multi-transport endpoint — start with quinn, add transport trait later
- **OQ-07**: Call protocol scope — start with one stream per operation - **OQ-07**: Call protocol scope — start with one stream per operation
- **OQ-11**: Handler-level auth resolution observability — decide during implementation
**Deferred (not active):** **Deferred (not active):**
- **OQ-09**: WASM target boundaries — design constraint, not deliverable - **OQ-09**: WASM target boundaries — design constraint, not deliverable


@@ -0,0 +1,47 @@
---
status: draft
last_updated: 2026-06-16
---
# alknet-core
Core library for ALPN-based protocol dispatch. Every handler crate depends on alknet-core.
## Documents
| Document | Status | Description |
|----------|--------|-------------|
| [core-types.md](core-types.md) | draft | ProtocolHandler trait, HandlerError, Connection, BiStream, StreamError |
| [endpoint.md](endpoint.md) | draft | ALPN router, HandlerRegistry, accept loop, graceful shutdown |
| [auth.md](auth.md) | draft | AuthContext, Identity, IdentityProvider, AuthToken, resolution flow |
| [config.md](config.md) | draft | StaticConfig, DynamicConfig, ArcSwap, ConfigReloadHandle |
## Applicable ADRs
| ADR | Title | Relevance |
|-----|-------|-----------|
| [001](../../decisions/001-alpn-protocol-dispatch.md) | ALPN-Based Protocol Dispatch | Core architectural model |
| [002](../../decisions/002-protocol-handler-trait.md) | ProtocolHandler Trait | The trait every handler implements |
| [003](../../decisions/003-crate-decomposition.md) | Crate Decomposition | alknet-core's position in the crate graph |
| [004](../../decisions/004-auth-as-shared-core.md) | Auth as Shared Core | IdentityProvider in core |
| [006](../../decisions/006-alpn-convention-and-connection-model.md) | ALPN String Convention | ALPN format, one-ALPN-per-connection |
| [007](../../decisions/007-bistream-type-definition.md) | BiStream Type Definition | Connection, BiStream trait, SendStream, RecvStream |
| [009](../../decisions/009-one-way-door-decision-framework.md) | One-Way Door Framework | Decision classification |
| [010](../../decisions/010-alpn-router-and-endpoint.md) | ALPN Router and Endpoint | Endpoint, HandlerRegistry, accept loop |
| [011](../../decisions/011-authcontext-structure.md) | AuthContext Structure | AuthContext fields and resolution flow |
## Relevant Open Questions
| OQ | Title | Status | Relevance |
|----|-------|--------|-----------|
| OQ-04 | Dynamic handler registration | resolved (start static) | HandlerRegistry is immutable at startup |
| OQ-05 | Multi-transport endpoint | open (start with quinn) | AlknetEndpoint uses quinn directly |
| OQ-11 | AuthContext resolution completeness | open | How handlers signal auth completion |
## Key Design Principles
1. **One trait, one dispatch point**: `ProtocolHandler` is the only abstraction handlers implement. No StreamInterface/MessageInterface split.
2. **ALPN does the routing**: The endpoint dispatches by ALPN string. No byte-peeking, no ListenerConfig enum.
3. **Handlers own their wire format**: Each handler manages its own protocol parsing. alknet-core provides the Connection, not the framing.
4. **Auth is hybrid**: The endpoint provides what it can (TLS-level auth). Handlers complete what they need. AuthContext may be partial.
5. **WASM door preserved**: BiStream is a trait, Connection is an opaque type. Core types don't assume tokio or quinn in public APIs.


@@ -0,0 +1,237 @@
---
status: draft
last_updated: 2026-06-16
---
# Authentication
AuthContext, Identity, IdentityProvider, AuthToken, and the resolution flow.
See [ADR-004](../../decisions/004-auth-as-shared-core.md) and [ADR-011](../../decisions/011-authcontext-structure.md) for rationale.
## AuthContext
Created by the endpoint for each incoming connection. Passed to `ProtocolHandler::handle()` as an immutable reference.
```rust
#[derive(Clone)]
pub struct AuthContext {
/// The peer's authenticated identity, if resolved by the endpoint.
/// None means the endpoint has no identity information for this connection.
pub identity: Option<Identity>,
/// The negotiated ALPN for this connection. Always present.
pub alpn: Vec<u8>,
/// The peer's remote address, if available. Informational (NAT/proxy).
pub remote_addr: Option<SocketAddr>,
/// SHA-256 fingerprint of the TLS client certificate, if presented.
/// Set by the endpoint during TLS handshake. Handlers may use this for
/// fingerprint-based auth even when IdentityProvider returns None.
pub tls_client_fingerprint: Option<String>,
}
```
### Construction by the endpoint
The endpoint constructs `AuthContext` from the QUIC connection:
1. `alpn`: From `connection.alpn()` — always present after TLS handshake.
2. `remote_addr`: From `connection.remote_addr()` — may be `None` for iroh connections.
3. `tls_client_fingerprint`: Extracted from the TLS session's client certificate, if one was presented.
4. `identity`: If a TLS client fingerprint is available, the endpoint calls `IdentityProvider::resolve_from_fingerprint()`. If it resolves, `identity = Some(resolved)`. If not, `identity = None`.
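The four construction steps above can be sketched with std-only stand-ins. This is a minimal illustration, not the crate's actual code: `from_parts` and `SingleKeyProvider` are hypothetical names, the `Identity` and `IdentityProvider` shown here are trimmed copies of the types in this document, and the ALPN, address, and fingerprint are assumed to have been extracted from the QUIC connection already.

```rust
use std::net::SocketAddr;

/// Trimmed stand-in for the Identity type described in this document.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

/// Trimmed stand-in: only the fingerprint path is needed at the endpoint.
pub trait IdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
}

#[derive(Debug, Clone)]
pub struct AuthContext {
    pub identity: Option<Identity>,
    pub alpn: Vec<u8>,
    pub remote_addr: Option<SocketAddr>,
    pub tls_client_fingerprint: Option<String>,
}

impl AuthContext {
    /// Hypothetical constructor: takes values the endpoint has already
    /// pulled out of the connection and TLS session (steps 1 to 3).
    pub fn from_parts(
        alpn: Vec<u8>,
        remote_addr: Option<SocketAddr>,
        tls_client_fingerprint: Option<String>,
        provider: &dyn IdentityProvider,
    ) -> Self {
        // Step 4: consult the provider only when a fingerprint exists.
        let identity = tls_client_fingerprint
            .as_deref()
            .and_then(|fp| provider.resolve_from_fingerprint(fp));
        AuthContext { identity, alpn, remote_addr, tls_client_fingerprint }
    }
}

/// Toy provider (assumption) that recognizes exactly one fingerprint.
pub struct SingleKeyProvider(pub String);

impl IdentityProvider for SingleKeyProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        (fingerprint == self.0.as_str()).then(|| Identity {
            id: fingerprint.to_string(),
            scopes: vec!["relay:connect".to_string()],
        })
    }
}
```

Note that an unrecognized fingerprint still ends up in `tls_client_fingerprint`, so a handler can apply its own fingerprint-based check even though `identity` is `None`.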
### Handler-level resolution
Handlers that require authentication extract protocol-specific credentials and call `IdentityProvider` inside `handle()`:
```rust
// Example: CallAdapter extracting an AuthToken from the first frame
async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError> {
    let identity = match &auth.identity {
        Some(id) => id.clone(), // Endpoint already resolved identity
        None => {
            let stream = connection.accept_bi().await?;
            let token = extract_auth_token(stream).await?;
            self.identity_provider
                .resolve_from_token(&token)
                .ok_or(HandlerError::AuthRequired)?
        }
    };
    // ... proceed with authenticated identity
    Ok(())
}
```
Handlers that don't require authentication (e.g., DNS resolver, health check) can ignore `auth.identity` entirely.
### AuthContext is Clone and immutable
- `derive(Clone)` allows handlers to clone `AuthContext` for per-stream or per-channel contexts.
- `handle()` receives `&AuthContext` — immutable. Handlers that resolve identity create local variables, they don't mutate the shared context. This prevents cross-contamination between streams on the same connection.
## Identity
The authenticated peer identity. Carries authorization information.
```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
/// Unique identifier string. Fingerprint, key prefix, or principal name.
pub id: String,
/// Authorization scopes. e.g., ["relay:connect", "secrets:derive"]
pub scopes: Vec<String>,
/// Named resource lists. e.g., {"service": ["gitea", "registry"]}
pub resources: HashMap<String, Vec<String>>,
}
```
This is the same structure as the reference implementation (`alknet-main/crates/alknet-core/src/auth/identity.rs`), minus the russh dependency. The `id` field is ALPN-agnostic:
- SSH key auth: `"SHA256:abc123..."` (key fingerprint)
- API key auth: `"alk_test"` (key prefix)
- Certificate auth: `"username"` (principal name)
## AuthToken
Opaque authentication token carried in protocol frames.
```rust
#[derive(Debug, Clone)]
pub struct AuthToken {
pub raw: Vec<u8>,
}
```
Unchanged from the reference implementation. The handler that extracted it knows its encoding (UTF-8 string, binary token, etc.).
## IdentityProvider
Trait for resolving credentials to identities. Implemented by `ConfigIdentityProvider`.
```rust
pub trait IdentityProvider: Send + Sync + 'static {
fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}
```
- `resolve_from_fingerprint()`: Used by the endpoint (TLS client cert) and by SSH (key fingerprint).
- `resolve_from_token()`: Used by call protocol (AuthToken in first frame) and HTTP (Bearer header).
Both methods return `Option<Identity>`; `None` means the credential is not recognized.
## ConfigIdentityProvider
The default implementation. Resolves identities from `DynamicConfig`:
```rust
pub struct ConfigIdentityProvider {
dynamic: Arc<ArcSwap<DynamicConfig>>,
}
```
The "Config" prefix indicates that identities are resolved from configuration (as opposed to a database or external service). This reads from `ArcSwap<DynamicConfig>`, which is hot-reloadable — not from `StaticConfig`. An alternative name would be `DynamicConfigIdentityProvider` to make this clearer, but `ConfigIdentityProvider` is consistent with the reference implementation and the naming is unlikely to cause confusion in practice.
How it resolves:
- **Fingerprint**: Look up in `DynamicConfig::auth::authorized_fingerprints`. If found, return `Identity { id: fingerprint, scopes: ["relay:connect"], resources: {} }`.
- **Token**: Parse as UTF-8. If it starts with `alk_`, look up in `DynamicConfig::auth::api_keys` by prefix match + SHA-256 hash. If found and not expired, return `Identity { id: prefix, scopes: entry.scopes, resources: entry.resources }`.
Changes to `DynamicConfig` via `ConfigReloadHandle` are reflected immediately — `ConfigIdentityProvider` reads from `ArcSwap` on every call.
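The two resolution paths described above can be sketched as plain functions over an `AuthPolicy`. This is a rough illustration under stated assumptions: the structs are trimmed copies of the ones in config.md, `digest_hex` stands in for a SHA-256 hex digest (injected so the sketch needs no external crate), `now` is the current Unix time, and treating a key as expired once `now` reaches `expires_at` is a guess at the boundary rule.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
}

pub struct ApiKeyEntry {
    pub prefix: String,
    pub hash: String,
    pub scopes: Vec<String>,
    pub expires_at: Option<u64>,
}

pub struct AuthPolicy {
    pub authorized_fingerprints: HashSet<String>,
    pub api_keys: Vec<ApiKeyEntry>,
}

/// Fingerprint path: set membership, fixed scope.
pub fn resolve_fingerprint(policy: &AuthPolicy, fingerprint: &str) -> Option<Identity> {
    policy.authorized_fingerprints.contains(fingerprint).then(|| Identity {
        id: fingerprint.to_string(),
        scopes: vec!["relay:connect".to_string()],
    })
}

/// Token path: UTF-8 parse, `alk_` check, prefix match, hash match, expiry.
/// `digest_hex` is a hypothetical stand-in for a SHA-256 hex digest.
pub fn resolve_token(
    policy: &AuthPolicy,
    raw: &[u8],
    now: u64,
    digest_hex: impl Fn(&[u8]) -> String,
) -> Option<Identity> {
    let token = std::str::from_utf8(raw).ok()?;
    if !token.starts_with("alk_") {
        return None;
    }
    let digest = digest_hex(raw);
    policy
        .api_keys
        .iter()
        .filter(|e| token.starts_with(e.prefix.as_str()))
        .filter(|e| e.hash == digest)
        // A key with no expiry never expires; otherwise it must be in the future.
        .find(|e| e.expires_at.map_or(true, |t| now < t))
        .map(|e| Identity { id: e.prefix.clone(), scopes: e.scopes.clone() })
}
```

A production version would likely compare digests in constant time and index `api_keys` by prefix rather than scanning the vector; both are left out here to keep the sketch short.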
## Resolution Flow
### Endpoint-level (before `handle()`)
```
QUIC connection arrives
→ TLS handshake (ALPN negotiation)
→ Extract TLS client certificate fingerprint (if presented)
→ If fingerprint present: IdentityProvider::resolve_from_fingerprint()
→ Some(identity): auth.identity = Some(identity)
→ None: auth.identity = None
→ Construct AuthContext { identity, alpn, remote_addr, tls_client_fingerprint }
→ Look up handler by alpn
→ tokio::spawn(handler.handle(connection, &auth))
```
### Handler-level (inside `handle()`)
```
Handler receives &AuthContext
→ If auth.identity is Some: use it (endpoint already resolved)
→ If auth.identity is None and handler requires auth:
→ Extract protocol-specific credential (AuthToken, SSH key, etc.)
→ Call IdentityProvider::resolve_from_token() or resolve_from_fingerprint()
→ If resolved: use the Identity
→ If not resolved: return HandlerError::AuthRequired
→ If handler doesn't require auth: proceed without identity
```
## IdentityProvider Injection
Handlers need access to `IdentityProvider` to resolve credentials inside `handle()`. Since `ProtocolHandler::handle()` doesn't receive an `IdentityProvider` parameter, each handler must obtain it through **constructor injection**:
```rust
// Example: SshAdapter holds an Arc<dyn IdentityProvider>
pub struct SshAdapter {
    identity_provider: Arc<dyn IdentityProvider>,
    // ... other handler-specific state
}
#[async_trait]
impl ProtocolHandler for SshAdapter {
    fn alpn(&self) -> &'static [u8] { b"alknet/ssh" }
    async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError> {
        let identity = match &auth.identity {
            Some(id) => id.clone(),
            None => {
                // Extract SSH key fingerprint, resolve via identity_provider
                let fingerprint = extract_ssh_fingerprint(&connection).await?;
                self.identity_provider
                    .resolve_from_fingerprint(&fingerprint)
                    .ok_or(HandlerError::AuthRequired)?
            }
        };
        // ...
        Ok(())
    }
}
```
The CLI binary constructs each handler with `Arc::clone(&identity_provider)` and passes it when building the `HandlerRegistry`. This is the **assembly pattern**: the CLI (the only crate that depends on all handlers) wires dependencies together.
The endpoint's `AlknetEndpoint` also holds `Arc<dyn IdentityProvider>` for endpoint-level auth resolution (TLS client certificate fingerprints), but handlers don't receive it from the endpoint — they receive it at construction time from the CLI.
| Handler | Credential source | Resolution method |
|---------|------------------|-----------------|
| SshAdapter | SSH public key handshake | `resolve_from_fingerprint()` |
| CallAdapter | AuthToken in first frame | `resolve_from_token()` |
| HttpAdapter | `Authorization: Bearer` header | `resolve_from_token()` |
| DnsAdapter | AuthToken in query labels | `resolve_from_token()` |
| GitAdapter | Signed push certificate | `resolve_from_fingerprint()` |
| SftpAdapter | SSH key (shares with SshAdapter) | `resolve_from_fingerprint()` |
## Key Differences from Reference Implementation
| Aspect | Reference | New Model |
|--------|-----------|-----------|
| Auth resolution | Inside SSH handler, before `handle()` | Hybrid: endpoint resolves TLS-level, handler resolves protocol-level |
| AuthContext type | None (just `Arc<ArcSwap<DynamicConfig>>` + `IdentityProvider`) | Explicit struct with optional fields |
| `Identity.id` | Always a fingerprint or API key prefix | Same, but ALPN-agnostic documentation |
| `ConfigIdentityProvider` | Depends on russh for `PublicKey` types | No russh dependency; fingerprints stored as strings |
| Credential phases | AD phases in `CredentialProvider` | Two paths: fingerprint and token. No phases. |
## Design Decisions
| Decision | ADR | Summary |
|----------|-----|---------|
| Hybrid auth model | [ADR-004](../../decisions/004-auth-as-shared-core.md) | Endpoint resolves TLS-level, handler resolves protocol-level |
| AuthContext with optional Identity | [ADR-011](../../decisions/011-authcontext-structure.md) | Explicit None, not "partially authenticated" |
| AuthContext is immutable in handle() | [ADR-011](../../decisions/011-authcontext-structure.md) | Handlers create local variables for resolved identity |
| Two resolution paths | [ADR-004](../../decisions/004-auth-as-shared-core.md) | Fingerprint and token, not phased auth |
## Open Questions
- **OQ-11**: See [open-questions.md](../../open-questions.md) — handler-level auth resolution observability.


@@ -0,0 +1,198 @@
---
status: draft
last_updated: 2026-06-16
---
# Configuration
StaticConfig, DynamicConfig, ArcSwap, and ConfigReloadHandle.
## StaticConfig
Immutable configuration resolved at startup. Cannot be changed without restarting the endpoint.
```rust
pub struct StaticConfig {
/// Bind address for the QUIC endpoint (e.g., "0.0.0.0:4433").
pub listen_addr: SocketAddr,
/// Path to TLS certificate file (PEM).
/// Required for QUIC+TLS. The endpoint will not start without TLS configuration.
pub tls_cert: Option<PathBuf>,
/// Path to TLS private key file (PEM).
/// Required alongside tls_cert.
pub tls_key: Option<PathBuf>,
/// Drain timeout for graceful shutdown (default: 2 seconds).
pub drain_timeout: Duration,
}
```
### Key differences from reference implementation
The reference `StaticConfig` (in `alknet-main/crates/alknet-core/src/config/static_config.rs`) is SSH-centric: it holds `host_key`, `host_key_algorithm`, `proxy_config`, `stealth`, `transport_mode`, and `listeners`. The new model removes all of these:
- **No `host_key`/`host_key_algorithm`**: SSH host keys are managed by the SSH handler, not by core config. The endpoint uses TLS certs, not SSH host keys.
- **No `proxy_config`**: Outbound proxy is an SSH-specific concern (SOCKS5/HTTP CONNECT forwarding). Not in core config.
- **No `stealth`**: ALPN eliminates the need for stealth/byte-peeking. See [ADR-001](../../decisions/001-alpn-protocol-dispatch.md).
- **No `transport_mode`/`listeners`**: The old `ServeTransportMode` and `ListenerConfig` enum are replaced by a single `listen_addr`. QUIC+TLS+ALPN replaces multiple listener types. See [ADR-010](../../decisions/010-alpn-router-and-endpoint.md).
- **No `iroh_relay`**: iroh transport is deferred (OQ-05). The v1 endpoint uses quinn directly.
### Construction
`StaticConfig` is constructed by the CLI binary from CLI arguments or a config file. The exact shape of `StartupOptions` (or whatever the CLI uses) is a CLI concern, not a core concern. alknet-core provides `StaticConfig` as a data structure; the CLI is responsible for populating it.
```rust
// The CLI binary constructs StaticConfig from its own options/config.
// StartupOptions is NOT a core type — it belongs to the alknet CLI binary.
// alknet-core receives a fully populated StaticConfig.
let static_config = StaticConfig {
listen_addr: "0.0.0.0:4433".parse()?,
tls_cert: Some("/path/to/cert.pem".into()),
tls_key: Some("/path/to/key.pem".into()),
drain_timeout: Duration::from_secs(2),
};
```
## DynamicConfig
Runtime-reloadable configuration. Hot-reloaded via `ArcSwap` without restarting the endpoint.
```rust
#[derive(Debug, Clone)]
pub struct DynamicConfig {
pub auth: AuthPolicy,
pub rate_limits: RateLimitConfig,
}
```
### AuthPolicy
Authorization policy derived from authorized keys, certificate authorities, and API keys.
```rust
pub struct AuthPolicy {
/// SHA-256 fingerprints of authorized keys (SSH keys, TLS client certs).
/// Stored as strings to avoid russh dependency in core.
pub authorized_fingerprints: HashSet<String>,
/// Certificate authorities for certificate-based auth.
/// The exact structure is TBD — it will be defined when alknet-ssh
/// is implemented. For now, this is a placeholder that reserves
/// the field. alknet-ssh will define `CertAuthorityEntry` with
/// the necessary fields (public key, principals, options).
pub cert_authorities: Vec<CertAuthorityEntry>,
/// API keys for token-based auth.
pub api_keys: Vec<ApiKeyEntry>,
}
```
`CertAuthorityEntry` is a placeholder type. Its fields will be defined when alknet-ssh is implemented and the certificate authority validation requirements are clear. For v1, `cert_authorities` will be an empty vector.
This replaces the reference implementation's `AuthPolicy` which depended on `russh::keys::PublicKey`. The new version stores fingerprints as strings, not russh types. This removes the russh dependency from alknet-core.
### ApiKeyEntry
```rust
pub struct ApiKeyEntry {
/// Key prefix (first 8 chars of the key). Used for O(1) lookup.
pub prefix: String,
/// SHA-256 hash of the full key. Used for verification.
pub hash: String,
/// Authorization scopes granted by this key.
pub scopes: Vec<String>,
/// Human-readable description.
pub description: String,
/// Unix timestamp when the key expires. None = never expires.
pub expires_at: Option<u64>,
}
```
Carries forward from the reference implementation with no changes.
### RateLimitConfig
```rust
pub struct RateLimitConfig {
pub max_connections_per_ip: usize,
pub max_auth_attempts: usize,
}
```
Carries forward from the reference implementation. Note: `max_connections_per_ip` and `max_auth_attempts` are not duplicated in the new `StaticConfig` (its definition above has no such fields); `RateLimitConfig` is their only home. The relationship is:
- `StaticConfig` does NOT contain rate limit fields. Rate limits are entirely dynamic.
- `RateLimitConfig` in `DynamicConfig` is the authoritative source at runtime.
- The CLI binary sets initial `RateLimitConfig` values when creating the initial `DynamicConfig`.
- Hot-reloading `DynamicConfig` via `ConfigReloadHandle` replaces rate limits immediately — no restart needed.
## ArcSwap Pattern
`DynamicConfig` is wrapped in `Arc<ArcSwap<DynamicConfig>>` for lock-free reads and atomic swaps.
```rust
let dynamic = Arc::new(ArcSwap::new(Arc::new(DynamicConfig::default())));
```
- **Reads**: `dynamic.load()` returns a guard that dereferences to `Arc<DynamicConfig>`; `dynamic.load_full()` returns an owned `Arc<DynamicConfig>`. Multiple readers can hold references simultaneously without blocking.
- **Writes**: `dynamic.store(Arc::new(new_config))` atomically replaces the config. All subsequent reads see the new config.
- **No locks**: `ArcSwap` uses atomic operations. No reader is ever blocked by a writer.
This pattern carries forward directly from the reference implementation (`alknet-main/crates/alknet-core/src/config/dynamic_config.rs`).
## ConfigReloadHandle
```rust
pub struct ConfigReloadHandle {
dynamic: Arc<ArcSwap<DynamicConfig>>,
}
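impl ConfigReloadHandle {
    /// Atomically replaces the dynamic config.
    pub fn reload(&self, new_config: DynamicConfig) {
        self.dynamic.store(Arc::new(new_config));
    }
    /// Returns a snapshot of the current config.
    pub fn dynamic(&self) -> Arc<DynamicConfig> {
        self.dynamic.load_full()
    }
}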
```
- `reload()`: Atomically replaces the dynamic config. All subsequent reads (including in-flight `IdentityProvider` calls) see the new config.
- `dynamic()`: Returns the current config as `Arc<DynamicConfig>`.
The CLI binary creates a `ConfigReloadHandle` and passes it to a config watcher (file watcher, SIGHUP handler, or call protocol operation) that calls `reload()` when config changes are detected.
## ConfigError
```rust
pub enum ConfigError {
InvalidFlag { name: String },
KeyFileNotFound { path: String },
BindFailed(io::Error),
TlsConfig(io::Error),
IncompatibleOptions,
}
```
Simplified from the reference implementation. Removes proxy-specific errors (now an SSH concern) and listener validation errors (no more `ListenerConfig` enum).
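As an illustration of how these variants might be used, the sketch below checks the TLS half of `StaticConfig` before the endpoint binds. It is a guess at the shape of such a check, not the specified behavior: `check_tls_paths` is a hypothetical helper, and the mapping of each failure to a particular `ConfigError` variant is an assumption.

```rust
use std::io;
use std::path::{Path, PathBuf};

#[derive(Debug)]
pub enum ConfigError {
    InvalidFlag { name: String },
    KeyFileNotFound { path: String },
    BindFailed(io::Error),
    TlsConfig(io::Error),
    IncompatibleOptions,
}

/// Hypothetical helper: validates the optional cert/key paths and returns
/// them as owned paths when both are present and point at regular files.
pub fn check_tls_paths(
    tls_cert: Option<&Path>,
    tls_key: Option<&Path>,
) -> Result<(PathBuf, PathBuf), ConfigError> {
    match (tls_cert, tls_key) {
        (Some(cert), Some(key)) => {
            for path in [cert, key] {
                if !path.is_file() {
                    return Err(ConfigError::KeyFileNotFound {
                        path: path.display().to_string(),
                    });
                }
            }
            Ok((cert.to_path_buf(), key.to_path_buf()))
        }
        // A certificate without its key (or the reverse) cannot work.
        (Some(_), None) | (None, Some(_)) => Err(ConfigError::IncompatibleOptions),
        // No TLS configuration at all: the endpoint cannot start.
        (None, None) => Err(ConfigError::InvalidFlag {
            name: "tls_cert".to_string(),
        }),
    }
}
```

Errors raised later, while parsing the PEM files or binding the socket, would presumably surface as `TlsConfig` and `BindFailed`.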
## Key Differences from Reference Implementation
| Aspect | Reference | New Model |
|--------|-----------|-----------|
| StaticConfig fields | SSH host key, stealth, transport_mode, listeners, proxy | listen_addr, TLS cert/key, drain_timeout |
| DynamicConfig.auth | `HashSet<PublicKey>` (russh types) | `HashSet<String>` (fingerprint strings) |
| ListenerConfig | Enum with Stream/Http/Dns variants | Eliminated — single endpoint, ALPN dispatch |
| TransportMode | Tcp/Tls/Iroh | Eliminated — always QUIC+TLS |
| Stealth mode | Byte-peeking HTTP/SSH detection | Eliminated — ALPN handles protocol detection |
| ForwardingPolicy | In DynamicConfig | Moved to handler-specific config (SSH) |
## Design Decisions
| Decision | ADR | Summary |
|----------|-----|---------|
| No russh dependency in core | [ADR-003](../../decisions/003-crate-decomposition.md) | Core is ALPN-agnostic; russh is an alknet-ssh dependency |
| ArcSwap for dynamic config | Carry-forward from reference | Lock-free reads, atomic swaps |
| No ListenerConfig | [ADR-001](../../decisions/001-alpn-protocol-dispatch.md) | Single endpoint, ALPN replaces multiple listener types |


@@ -0,0 +1,133 @@
---
status: draft
last_updated: 2026-06-16
---
# Core Types
ProtocolHandler, HandlerError, Connection, BiStream, SendStream, RecvStream, StreamError.
## ProtocolHandler
The central abstraction. Every handler implements one trait:
```rust
#[async_trait]
pub trait ProtocolHandler: Send + Sync + 'static {
fn alpn(&self) -> &'static [u8];
async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError>;
}
```
- `alpn()` returns the handler's ALPN identifier as a static byte string (e.g., `b"alknet/ssh"`, `b"alknet/call"`).
- `handle()` receives a `Connection` (not a single BiStream) and an `AuthContext`. Returns `HandlerError` on failure.
- Handlers that need a single stream call `connection.accept_bi()` once. Handlers that multiplex (SSH, call) open/accept streams as needed.
See [ADR-002](../../decisions/002-protocol-handler-trait.md) and [ADR-007](../../decisions/007-bistream-type-definition.md) for rationale.
## HandlerError
Non-fatal errors within a handler's `handle()` method. The endpoint catches these, logs them, and closes the connection. Other connections are unaffected.
```rust
pub enum HandlerError {
ConnectionClosed,
StreamError(io::Error),
AuthRequired,
Internal(Box<dyn std::error::Error + Send + Sync>),
}
```
- `ConnectionClosed`: The peer closed the connection. Clean exit.
- `StreamError`: An I/O error on a stream within the connection.
- `AuthRequired`: The handler requires authentication and couldn't resolve the peer's identity. The endpoint closes the connection with an appropriate error. Handlers that support multi-step auth (like SSH) should handle auth challenges within their protocol, not return `AuthRequired` until all attempts are exhausted.
- `Internal`: Handler-specific errors (protocol violations, upstream failures, etc.).
Handler panics are caught by tokio's task isolation. The connection is dropped, other connections continue.
## Connection
An opaque type wrapping a QUIC connection. Handlers receive a `Connection` in `handle()`.
```rust
pub struct Connection {
// Private: wraps the underlying QUIC connection or test mock
}
impl Connection {
pub async fn accept_bi(&self) -> Result<(SendStream, RecvStream), StreamError>;
pub async fn open_bi(&self) -> Result<(SendStream, RecvStream), StreamError>;
pub fn remote_alpn(&self) -> &[u8];
pub fn remote_addr(&self) -> Option<SocketAddr>;
pub fn close(&self, code: u32, reason: &str);
}
```
- `accept_bi()`: Wait for the peer to open a bidirectional stream. Returns `(SendStream, RecvStream)`.
- `open_bi()`: Open a bidirectional stream to the peer. Returns `(SendStream, RecvStream)`.
- `remote_alpn()`: The ALPN negotiated for this connection. Always present.
- `remote_addr()`: The peer's address, if available. Informational (NAT/proxy).
- `close()`: Close the connection with an error code and reason.
The `Connection` type does not expose quinn types in its public API. It wraps `quinn::Connection` internally, but the wrapper allows test implementations.
See [ADR-007](../../decisions/007-bistream-type-definition.md) for why handlers receive Connection instead of BiStream.
## BiStream
A trait for bidirectional byte streams. Used primarily for client-side and test scenarios.
```rust
pub trait BiStream: AsyncRead + AsyncWrite + Send + Unpin {}
```
Handlers that only need a single stream can obtain one via `connection.accept_bi()` and treat the `(SendStream, RecvStream)` pair as a BiStream. The `BiStream` trait is a convenience for:
- Client-side code that has a single bidirectional stream
- Test mocks that need to simulate a stream
- Future transport abstractions (WebTransport, raw TCP) that produce bidirectional byte streams
See [ADR-007](../../decisions/007-bistream-type-definition.md) for why BiStream is a trait.
## SendStream and RecvStream
Concrete types wrapping QUIC stream halves.
```rust
pub struct SendStream { /* wraps quinn::SendStream or test mock */ }
pub struct RecvStream { /* wraps quinn::RecvStream or test mock */ }
impl AsyncWrite for SendStream { ... }
impl AsyncRead for RecvStream { ... }
```
- `SendStream` implements `AsyncWrite`. Write bytes to the peer.
- `RecvStream` implements `AsyncRead`. Read bytes from the peer.
- These are not trait objects — they are concrete wrapper types that delegate to `quinn::SendStream` / `quinn::RecvStream` in production and to test mocks in tests.
This is a two-way door decision. If future transports need different stream types, `SendStream` and `RecvStream` can become wrappers with enum dispatch. For v1, concrete wrappers over quinn types are simpler and zero-cost.
## StreamError
```rust
pub enum StreamError {
ConnectionClosed,
StreamClosed,
Timeout,
Internal(io::Error),
}
```
Returned by `accept_bi()`, `open_bi()`, and stream read/write operations. Maps from `quinn::ConnectionError` and `quinn::StreamError`.
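The handler examples in auth.md apply `?` to `accept_bi()` inside a function returning `HandlerError`, which requires a conversion from `StreamError`. One possible conversion is sketched below; the spec does not define this mapping, so the choice of `io::ErrorKind` for each variant is an assumption, and `accept_stream` is a hypothetical stand-in for a real stream operation.

```rust
use std::error::Error;
use std::io;

#[derive(Debug)]
pub enum StreamError {
    ConnectionClosed,
    StreamClosed,
    Timeout,
    Internal(io::Error),
}

#[derive(Debug)]
pub enum HandlerError {
    ConnectionClosed,
    StreamError(io::Error),
    AuthRequired,
    Internal(Box<dyn Error + Send + Sync>),
}

impl From<StreamError> for HandlerError {
    fn from(err: StreamError) -> Self {
        match err {
            // A closed connection stays a clean exit.
            StreamError::ConnectionClosed => HandlerError::ConnectionClosed,
            StreamError::StreamClosed => HandlerError::StreamError(io::Error::new(
                io::ErrorKind::BrokenPipe,
                "stream closed",
            )),
            StreamError::Timeout => HandlerError::StreamError(io::Error::new(
                io::ErrorKind::TimedOut,
                "stream timed out",
            )),
            StreamError::Internal(e) => HandlerError::StreamError(e),
        }
    }
}

/// Hypothetical stand-in for a stream operation that may fail.
fn accept_stream(fail_with: Option<StreamError>) -> Result<(), StreamError> {
    match fail_with {
        Some(err) => Err(err),
        None => Ok(()),
    }
}

/// Shows `?` converting StreamError into HandlerError via `From`.
pub fn handle_once(fail_with: Option<StreamError>) -> Result<(), HandlerError> {
    accept_stream(fail_with)?;
    Ok(())
}
```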
## Design Decisions
| Decision | ADR | Summary |
|----------|-----|---------|
| ProtocolHandler receives Connection, not BiStream | [ADR-007](../../decisions/007-bistream-type-definition.md) | Handlers that need multiple streams (SSH, call) have direct access to the Connection |
| BiStream is a trait | [ADR-007](../../decisions/007-bistream-type-definition.md) | WASM door preserved, test mocks possible |
| HandlerError is non-fatal | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Handler errors close the connection, not the endpoint |
| SendStream/RecvStream are concrete wrappers | Two-way door | Can become enum dispatch later if multi-transport is needed |
## Open Questions
- **OQ-05**: See [open-questions.md](../../open-questions.md) — multi-transport. If quinn is the only transport in v1, SendStream/RecvStream can be concrete wrappers.


@@ -0,0 +1,189 @@
---
status: draft
last_updated: 2026-06-16
---
# Endpoint
ALPN router, handler registry, connection accept loop, and graceful shutdown.
See [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) for the full rationale.
## AlknetEndpoint
The central runtime type. Owns the QUIC endpoint, holds the handler registry, and runs the accept loop.
```rust
pub struct AlknetEndpoint {
    endpoint: quinn::Endpoint,
    handlers: Arc<HandlerRegistry>,
    dynamic: Arc<ArcSwap<DynamicConfig>>,
    identity_provider: Arc<dyn IdentityProvider>,
    shutdown: watch::Receiver<bool>,
}
```
### Construction
The CLI binary constructs an `AlknetEndpoint` at startup:
1. Build `HandlerRegistry` by inserting handlers for each ALPN.
2. Build `StaticConfig` from CLI arguments / config file.
3. Build `rustls::ServerConfig` from TLS cert/key and the registry's ALPN strings.
4. Bind `quinn::Endpoint` with the `ServerConfig`.
5. Create `ArcSwap<DynamicConfig>` and `ConfigIdentityProvider`.
6. Call `AlknetEndpoint::new(endpoint, handlers, dynamic, identity_provider)`.
### Accept Loop
```
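loop {
    tokio::select! {
        incoming = endpoint.accept() => {
            // `accept()` yields `None` once the endpoint itself has been closed.
            let Some(incoming) = incoming else { break };
            // TLS handshake + ALPN negotiation. Awaited inline for brevity; a slow
            // handshake delays the next accept, so an implementation may prefer to
            // move this await into the spawned task.
            match incoming.await {
                Ok(conn) => {
                    // Pseudocode accessor for the negotiated ALPN.
                    let alpn = conn.alpn();
                    match handlers.get(alpn) {
                        Some(handler) => {
                            // The spawned task needs an owned handle, not a borrow.
                            let handler = Arc::clone(handler);
                            let auth = AuthContext::from_connection(&conn);
                            let conn = Connection::new(conn);
                            tokio::spawn(async move {
                                if let Err(e) = handler.handle(conn, &auth).await {
                                    // log `e`; the connection closes, the endpoint keeps running
                                }
                            });
                        }
                        None => {
                            // ALPN has no handler: should not happen
                            // (ServerConfig only advertises registered ALPNs)
                            conn.close(0u32.into(), b"no handler");
                        }
                    }
                }
                Err(e) => {
                    // TLS handshake or connection-level error
                    // log `e` and continue accepting
                }
            }
        }
        _ = shutdown.changed() => {
            break; // graceful shutdown
        }
    }
}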
```
### What the accept loop does NOT do
- **No byte-peeking**: ALPN negotiation handles protocol detection. The old `stealth` module's `detect_protocol()` is unnecessary.
- **No per-handler accept loops**: The old model had `ListenerConfig::Stream`, `ListenerConfig::Http`, `ListenerConfig::Dns` with different accept paths. ALPN unifies this.
- **No SSH-specific logic**: The accept loop is ALPN-agnostic. It doesn't know or care what protocol the handler speaks.
## HandlerRegistry
Maps ALPN byte strings to `ProtocolHandler` instances.
```rust
pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
}
impl HandlerRegistry {
    pub fn new() -> Self;
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>);
    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>>;
    pub fn alpn_strings(&self) -> Vec<Vec<u8>>;
}
```
- `register()`: Insert a handler. Panics if the ALPN is already registered (duplicate handlers are a bug).
- `get()`: Look up a handler by ALPN string. Returns `None` if no handler is registered.
- `alpn_strings()`: Return all registered ALPN strings. Used to build the TLS `ServerConfig`.
Registration is static at startup (see [OQ-04](../../open-questions.md) and ADR-010). The CLI builds a `HandlerRegistry`, inserts all handlers, and passes it to `AlknetEndpoint`. The registry is immutable after construction.
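The registry behaviour described above (panic on duplicate, lookup by ALPN bytes, list of ALPN strings for the TLS config) can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: the `ProtocolHandler` trait is reduced to a hypothetical `alpn()` accessor, `StaticHandler` is invented for illustration, and the ALPN strings in the usage note are made-up names.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Reduced stand-in for the real trait: only the ALPN accessor is modelled.
/// The `alpn()` method name is an assumption of this sketch.
pub trait ProtocolHandler: Send + Sync {
    fn alpn(&self) -> &'static [u8];
}

#[derive(Default)]
pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
}

impl HandlerRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Inserts a handler under its own ALPN; a duplicate is treated as a bug.
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>) {
        let alpn = handler.alpn();
        let previous = self.handlers.insert(alpn, handler);
        assert!(
            previous.is_none(),
            "duplicate handler for ALPN {:?}",
            String::from_utf8_lossy(alpn)
        );
    }

    /// Looks up the handler for a negotiated ALPN.
    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>> {
        self.handlers.get(alpn)
    }

    /// Owned copies of every registered ALPN, sorted for a stable order.
    pub fn alpn_strings(&self) -> Vec<Vec<u8>> {
        let mut alpns: Vec<Vec<u8>> = self.handlers.keys().map(|k| k.to_vec()).collect();
        alpns.sort();
        alpns
    }
}

/// Hypothetical handler used only for illustration.
pub struct StaticHandler(pub &'static [u8]);

impl ProtocolHandler for StaticHandler {
    fn alpn(&self) -> &'static [u8] {
        self.0
    }
}
```

For example, after registering `StaticHandler(b"alknet/echo")`, `registry.get(b"alknet/echo")` returns the handler and `registry.alpn_strings()` contains that one entry. Sorting the output is a choice of this sketch; `HashMap` iteration order is otherwise unspecified.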
### ALPN strings in the TLS ServerConfig
The `rustls::ServerConfig`'s ALPN protocol list is set from `registry.alpn_strings()` at construction time. This means:
- Only registered handlers' ALPNs are advertised during TLS negotiation.
- If a client offers an ALPN that's not in the list, the TLS handshake fails — correct behavior.
- Adding a handler at runtime requires rebuilding the `ServerConfig` (see OQ-04).
## Graceful Shutdown
```rust
impl AlknetEndpoint {
    pub fn shutdown_sender(&self) -> watch::Sender<bool>;
    pub async fn shutdown(&self) -> Result<(), EndpointError>;
}
```
- `shutdown_sender()` returns a clone of the shutdown channel sender. Call `send(true)` to signal shutdown.
- `shutdown()` waits for in-flight connections to complete, with a drain timeout (default: 2 seconds). After the timeout, remaining connections are forcefully closed.
- SIGTERM/SIGINT are wired to the shutdown channel by the CLI binary.
The drain timeout is configurable via `StaticConfig::drain_timeout`.
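The drain policy above (wait for in-flight connections, give up after the timeout) can be illustrated with a blocking stand-in. This is only a sketch: the real endpoint is async and would use async timers rather than `thread::sleep`, and `DrainOutcome`, `drain`, and the in-flight counter are hypothetical names introduced here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;
use std::time::{Duration, Instant};

/// Outcome of a drain attempt.
#[derive(Debug, PartialEq, Eq)]
pub enum DrainOutcome {
    /// All in-flight connections finished before the deadline.
    Drained,
    /// The deadline passed; this many connections must be force-closed.
    TimedOut { remaining: usize },
}

/// Polls `in_flight` until it reaches zero or `drain_timeout` elapses.
pub fn drain(in_flight: &AtomicUsize, drain_timeout: Duration) -> DrainOutcome {
    let deadline = Instant::now() + drain_timeout;
    loop {
        let remaining = in_flight.load(Ordering::Acquire);
        if remaining == 0 {
            return DrainOutcome::Drained;
        }
        if Instant::now() >= deadline {
            return DrainOutcome::TimedOut { remaining };
        }
        // Short poll interval; an async version would await a notification instead.
        thread::sleep(Duration::from_millis(5));
    }
}
```

Returning the number of remaining connections lets the caller log how many were forcefully closed, which is useful when tuning `drain_timeout`.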
## Error Handling
### EndpointError
Fatal errors that prevent the endpoint from starting or continuing.
```rust
pub enum EndpointError {
    BindFailed(io::Error),
    TlsConfig(io::Error),
    HandlerNotFound(Vec<u8>), // ALPN string with no registered handler
}
```
### HandlerError
Non-fatal errors within a handler. See [core-types.md](core-types.md) for details.
### Accept loop errors
- **TLS handshake failure**: Log and continue. The client may have offered no compatible ALPN, or the cert may be untrusted by the client.
- **Handler panic**: Contained by the spawned tokio task; the runtime catches the panic at the task boundary. The connection is dropped. Other connections continue.
- **Connection-level errors** (quinn `ConnectionError`): Log and continue. The accept loop keeps running.
## TLS Certificate Provisioning
`StaticConfig` provides TLS configuration in one of two modes:
- **Manual**: `tls_cert` and `tls_key` file paths. Required for production use.
- **Self-signed**: For development only. The endpoint can generate a self-signed cert on startup.
The `rustls::ServerConfig` is built from cert + key + ALPN list at startup.
ACME auto-provisioning (Let's Encrypt) is not in scope for v1. It will be added as a feature later (see OQ-12).
## Key Differences from Reference Implementation
| Aspect | Reference (`alknet-main`) | New Model |
|--------|---------------------------|-----------|
| Transport | `TransportAcceptor` trait, `TransportKind` enum | `quinn::Endpoint` directly |
| Listener config | `ListenerConfig` enum (Stream/Http/Dns) | Single endpoint, ALPN dispatch |
| Protocol detection | Byte-peeking (`stealth::detect_protocol`) | ALPN negotiation (TLS layer) |
| Accept loop | Per-transport, SSH-centric | ALPN-agnostic, handler-dispatched |
| Handler model | `ServerHandler` + `russh::server::Handler` | `ProtocolHandler::handle(Connection, &AuthContext)` |
| Config | `ServeOptions` builder | `StaticConfig` + `HandlerRegistry` + `AlknetEndpoint::new()` |
## Design Decisions
| Decision | ADR | Summary |
|----------|-----|---------|
| Static handler registration | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Two-way door, start static, add ArcSwap later |
| quinn::Endpoint directly, no TransportAcceptor | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Start with quinn, abstract later if needed |
| No byte-peeking, ALPN dispatch only | [ADR-001](../../decisions/001-alpn-protocol-dispatch.md) | TLS layer handles protocol detection |
| Handler panics isolated | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | tokio task isolation, connection closes |
## Open Questions
See [open-questions.md](../../open-questions.md) for full details.
- **OQ-04**: Resolved — HandlerRegistry is static at startup.
- **OQ-05**: Open — start with quinn, abstract later if needed.
- **OQ-12**: Resolved — start with file paths in StaticConfig, add ACME later.


@@ -12,7 +12,7 @@ The new ALPN dispatch model eliminates the need for a shared interface layer. Ea
 Key constraints:
 - Protocol crates must depend on alknet-core for auth/identity/config — but not on each other
-- alknet-secret is already standalone (no alknet-core dependency) and must remain so (renamed to alknet-vault — see ADR-008)
+- alknet-vault is already standalone (no alknet-core dependency) and must remain so (see ADR-008)
 - The CLI binary assembles everything — it's the only crate that depends on all handler crates
 - Some handlers (SFTP, call protocol) need to compile to WASM for browser/client use
 - irpc is the foundation for the call protocol — it provides the operation registry, framing, and pub/sub patterns


@@ -42,7 +42,7 @@ The `AuthContext` passed to `handle()` may be partial — containing only transp
 The `CredentialProvider` concept from the previous architecture is simplified: there is no phase progression (AD). The `IdentityProvider` has two resolution paths — fingerprint and token — and a `ConfigIdentityProvider` implementation that draws from static and dynamic config.
-`alknet-secret` remains independent. It does not depend on `alknet-core` or `IdentityProvider`. The secret service provides derived keys on request; identity resolution is a separate concern.
+`alknet-vault` stays standalone. It does not depend on `alknet-core` or `IdentityProvider`. The vault provides derived keys on request; identity resolution is a separate concern.
 ## Consequences


@@ -30,7 +30,7 @@ This means:
 - The TypeScript "operations" and "pub/sub" patterns that can import OpenAPI schemas and expose MCP tools are supported at the protocol level
 - Future NAPI and WASM clients speak the same wire format
-The `SecretProtocol` in alknet-secret also uses irpc as its service protocol. This is consistent — alknet-secret's irpc service is an independent service that happens to use the same framing, not a dependency on alknet-call.
+The `VaultProtocol` in alknet-vault also uses irpc as its service protocol. This is consistent — alknet-vault's irpc service is an independent service that happens to use the same framing, not a dependency on alknet-call.
 ## Consequences
@@ -39,7 +39,7 @@ The `SecretProtocol` in alknet-secret also uses irpc as its service protocol. Th
 - JSON Schema compatible — OpenAPI import, MCP tool exposure, cross-language client generation
 - No need to design a custom RPC wire format — irpc's is already battle-tested
 - The call protocol inherits irpc's streaming and subscription patterns
-- Consistency with alknet-secret's service model — both use irpc
+- Consistency with alknet-vault's service model — both use irpc
 **Negative:**
 - alknet-call depends on irpc — if irpc has limitations or bugs, we're affected (mitigated: irpc is lightweight and we can fork if needed)


@@ -8,7 +8,7 @@ Accepted
 Not all architectural decisions carry the same reversal cost. Some decisions are easy to change later — if you pick the wrong data structure, you refactor. Other decisions are nearly impossible to reverse — if you build a type hierarchy that forecloses WASM compatibility, every handler written against that hierarchy must be rewritten.
-This distinction matters especially during Phase 0 (exploration) and early Phase 1 (architecture). The project is post-pivot with foundational ADRs in place but no implementation code yet (except alknet-secret). Decisions made now shape the API surface that every handler depends on.
+This distinction matters especially during Phase 0 (exploration) and early Phase 1 (architecture). The project is post-pivot with foundational ADRs in place but no implementation code yet (except alknet-vault). Decisions made now shape the API surface that every handler depends on.
 Without an explicit framework, one-way doors can be treated as casually as two-way doors, leading to costly rework. Or conversely, two-way doors can be over-analyzed, blocking progress on decisions that are cheap to reverse.


@@ -0,0 +1,141 @@
# ADR-010: ALPN Router and Endpoint
## Status
Proposed
## Context
ADR-001 establishes ALPN-based protocol dispatch: a single QUIC+TLS endpoint accepts connections, and the ALPN negotiated during the TLS handshake routes each connection to the correct `ProtocolHandler`. ADR-002 defines the `ProtocolHandler` trait. ADR-006 establishes one ALPN per connection. ADR-007 defines `Connection` and `BiStream`.
The question now is: **how does the endpoint work?** What accepts QUIC connections, negotiates ALPN, and hands connections to handlers? This is the central runtime piece of alknet-core — every handler depends on it.
The reference implementation (`alknet-main`) uses a `Server` struct that binds a `TransportAcceptor`, runs an accept loop, and dispatches to a `ServerHandler` based on transport type and interface kind. This has three problems that the ALPN model solves:
1. **Multiple listener types**: `ListenerConfig` has three variants (Stream, Http, Dns) with per-variant configuration and validation. ALPN eliminates this — one endpoint, one listener, ALPN does the routing.
2. **Protocol detection by byte-peeking**: The `stealth` module reads the first bytes to detect SSH vs HTTP. ALPN negotiation makes this unnecessary — the TLS handshake tells you the protocol before any application bytes are read.
3. **SSH-centric accept loop**: The current `handle_connection` immediately enters `russh::server::run_stream`. In the new model, the accept loop is ALPN-agnostic — it doesn't know or care what protocol the handler speaks.
### iroh's pattern
iroh's `Router` registers `ProtocolHandler` instances with ALPN strings, then calls `endpoint.accept()` in a loop. For each incoming connection, it reads the negotiated ALPN, looks up the handler, and calls `handler.accept(connection)`. This is clean and proven.
### Key design questions
1. **Handler registration**: Static (at startup) or dynamic (at runtime)?
2. **TLS certificate management**: How does the endpoint get TLS certs? Where does ACME fit?
3. **Connection lifecycle**: Who owns the `quinn::Endpoint`? How does graceful shutdown work?
4. **Error handling**: What happens when a handler panics? When ALPN negotiation fails?
## Decision
### Endpoint owns the QUIC endpoint
`alknet-core` owns the `quinn::Endpoint` directly. The endpoint binds to a single address, configures TLS with a `rustls::ServerConfig` that includes the ALPN strings from all registered handlers, and accepts connections in a loop.
```rust
pub struct AlknetEndpoint {
    endpoint: quinn::Endpoint,
    handlers: Arc<HandlerRegistry>,
    dynamic: Arc<ArcSwap<DynamicConfig>>,
    identity_provider: Arc<dyn IdentityProvider>,
    shutdown: watch::Receiver<bool>,
}
```
There is no `TransportAcceptor` trait, no `TransportKind` enum, no `ListenerConfig` enum. QUIC+TLS+ALPN replaces all of that.
### HandlerRegistry maps ALPN strings to ProtocolHandler instances
```rust
pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
}
```
Registration is static at startup. The CLI binary constructs a `HandlerRegistry` by inserting handlers for each ALPN, then passes it to `AlknetEndpoint::new()`. The ALPN strings in the TLS `ServerConfig` are derived from the registry's keys.
This is a two-way door (OQ-04): starting static is simple. If dynamic registration is needed later, the registry can be wrapped in `ArcSwap<HandlerRegistry>` and the TLS `ServerConfig` regenerated. However, because ALPN negotiation happens during the TLS handshake, a handler added at runtime is only reachable by connections established after the `ServerConfig` is rebuilt, and clients must already know the new ALPN string. Dynamic registration therefore has limited value for v1.
### Accept loop: connect, dispatch, spawn
```
loop {
    incoming = endpoint.accept().await
    connection = incoming.await // TLS handshake + ALPN negotiation
    alpn = connection.alpn()
    handler = registry.get(alpn)
    match handler {
        Some(h) => {
            auth = resolve_endpoint_auth(connection) // TLS client cert, etc.
            tokio::spawn(h.handle(connection, &auth))
        }
        None => connection.close()
    }
}
```
Key behaviors:
- **ALPN mismatch**: The TLS handshake fails. This is correct — the client and server have no protocol in common.
- **Handler not found**: Should not happen — the `ServerConfig` only advertises ALPNs that have registered handlers. If somehow a connection negotiates an ALPN with no handler, the connection is closed with an error log.
- **Handler panic**: The handler runs in a spawned tokio task. If it panics, the runtime catches the panic at the task boundary. The connection is dropped. Other connections are unaffected.
- **Graceful shutdown**: A `watch::Sender<bool>` signals the accept loop to stop accepting new connections. Existing connections are given a drain timeout (2 seconds default), then forcefully closed.
### TLS certificate configuration
TLS certs come from `StaticConfig`:
- File paths (`tls_cert`, `tls_key`) for manual provisioning
- Self-signed for development
The `rustls::ServerConfig` is built from the cert + key + ALPN list at startup. The ALPN list is derived from `HandlerRegistry::alpn_strings()`.
ACME auto-provisioning (Let's Encrypt) is not in scope for v1. It will be added as a feature later (see OQ-12).
### Error taxonomy
```rust
pub enum EndpointError {
    BindFailed(io::Error),
    TlsConfig(io::Error),
    HandlerNotFound(Vec<u8>), // ALPN string with no registered handler
}
pub enum HandlerError {
    ConnectionClosed,
    StreamError(io::Error),
    AuthRequired,
    Internal(Box<dyn std::error::Error + Send + Sync>),
}
```
- `EndpointError`: Problems starting or running the endpoint. Fatal — the endpoint cannot accept connections.
- `HandlerError`: Problems within a handler's `handle()` method. Non-fatal — the connection is closed, but the endpoint keeps running.
## Consequences
**Positive:**
- Single accept loop replaces multiple listener types and byte-peeking
- ALPN negotiation happens at the TLS layer — no application-level protocol detection
- Adding a handler is registering an ALPN string — no endpoint code changes
- Handler panics are isolated — one bad handler can't take down the endpoint
- `quinn::Endpoint` is the only transport — no TransportAcceptor trait needed for v1
- The endpoint is testable: give it mock handlers and a test ALPN, verify dispatch
**Negative:**
- Direct quinn dependency in alknet-core — WASM targets can't use quinn (mitigated: WASM clients don't run endpoints, they connect to them; the WASM door is for client-side handlers, not the endpoint itself)
- No runtime handler registration without regenerating the TLS config (mitigated: two-way door, start static, add ArcSwap later if needed)
- TLS cert provisioning is manual (file paths) for v1 — ACME auto-provisioning is a future feature (OQ-12)
- One address per endpoint — if you need to listen on multiple addresses, run multiple endpoints (acceptable for v1)
## References
- ADR-001: ALPN-based protocol dispatch
- ADR-002: ProtocolHandler trait
- ADR-006: ALPN string convention and connection model
- ADR-007: BiStream type definition (Connection, SendStream, RecvStream)
- ADR-009: One-way door decision framework
- OQ-04: Dynamic handler registration (two-way door, start static)
- OQ-05: Multi-transport endpoint (two-way door, start with quinn)
- iroh Router pattern: `docs/research/references/iroh/`
- Reference implementation: `alknet-main/crates/alknet-core/src/server/serve.rs`


@@ -0,0 +1,156 @@
# ADR-011: AuthContext Structure and Resolution Flow
## Status
Proposed
## Context
ADR-004 establishes the hybrid auth model: the endpoint resolves what it can (TLS client certificate fingerprint), handlers resolve what they must (AuthToken in the first frame, Bearer header, SSH key fingerprint). The `AuthContext` passed to `handle()` may be partial.
The reference implementation's `Identity` struct is:
```rust
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}
```
And `ConfigIdentityProvider` resolves fingerprints and API keys to `Identity`. This works well and carries forward.
But the reference implementation has no `AuthContext` type — auth resolution happens inside the SSH handler before calling `IdentityProvider`. The new model needs a type that represents "what the endpoint knows about this connection's identity before the handler starts," plus a way for handlers to enrich it.
This is a one-way door: once handlers depend on `AuthContext`'s structure, changing it affects every handler. The structure must be right.
### Design considerations
1. **Handlers need identity information to make authorization decisions.** A handler that requires authentication needs to know: is the peer authenticated? Who are they? What scopes do they have?
2. **The endpoint may have zero, partial, or complete identity information.** A plain QUIC connection with no TLS client cert gives the endpoint nothing. A TLS connection with a client cert gives the endpoint a fingerprint that may resolve to an Identity. A handler that extracts an AuthToken from the first frame can complete the resolution.
3. **AuthContext must not be SSH-specific.** The reference implementation's auth types are tangled with russh (SSH key fingerprints, certificate authorities). The new model needs to be ALPN-agnostic.
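4. **AuthContext is constructed by the endpoint and supplemented by handlers.** The endpoint creates it from TLS-level information. A handler that resolves further identity from protocol-level information keeps the result alongside the context rather than writing it back (see the Decision section).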
5. **AuthContext must be cheap to construct.** Every incoming connection gets one, even if authentication ultimately fails.
## Decision
### AuthContext is a struct with optional fields
```rust
pub struct AuthContext {
    /// The peer's authenticated identity, if resolved.
    /// None means the endpoint has no identity information for this connection.
    /// Some(Identity) means the endpoint resolved the peer's identity.
    pub identity: Option<Identity>,
    /// The negotiated ALPN for this connection.
    /// Always present — the endpoint sets this from the TLS handshake.
    pub alpn: Vec<u8>,
    /// The peer's remote address, if available.
    pub remote_addr: Option<SocketAddr>,
    /// TLS client certificate fingerprint, if the client presented a certificate.
    /// Set by the endpoint during TLS handshake. Handlers may use this for
    /// SSH host key verification or other fingerprint-based auth.
    pub tls_client_fingerprint: Option<String>,
}
```
Key design points:
- `identity: Option<Identity>` — not `Identity` with optional fields, not a separate `PartialAuthContext`. The endpoint sets it to `None` if it has no identity information, or `Some(identity)` if it resolved one. Handlers that need to complete auth call `IdentityProvider` themselves and store the resolved identity in a local variable — they do NOT mutate AuthContext (see immutability section below).
- `alpn` is always present — every connection has a negotiated ALPN.
- `remote_addr` is informational. It's available from the QUIC connection and useful for logging and rate limiting, but it's not authoritative (clients can be behind NATs/proxies).
- `tls_client_fingerprint` captures the TLS-level credential. If present, it's the SHA-256 fingerprint of the client's TLS certificate. This is separate from `identity` because a handler might need the fingerprint even when `IdentityProvider::resolve_from_fingerprint()` returns `None` (e.g., unknown cert, but the handler wants to log it).
### AuthContext is Clone
`AuthContext` derives `Clone`. Handlers can clone it for per-stream or per-channel contexts within a connection. The `Identity` inside is also `Clone`.
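As an illustration of the clone-per-stream idea, the sketch below derives a per-stream context from the shared one without touching the original. It is a sketch under assumptions: the structs are local copies of the definitions in this ADR, and `with_identity` is a hypothetical helper, not part of the specified API.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

#[derive(Clone, Debug, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

#[derive(Clone, Debug)]
pub struct AuthContext {
    pub identity: Option<Identity>,
    pub alpn: Vec<u8>,
    pub remote_addr: Option<SocketAddr>,
    pub tls_client_fingerprint: Option<String>,
}

/// Hypothetical helper: builds a per-stream copy carrying a handler-resolved
/// identity. The shared `base` context is only read, never modified.
pub fn with_identity(base: &AuthContext, identity: Identity) -> AuthContext {
    AuthContext {
        identity: Some(identity),
        ..base.clone()
    }
}
```

Because the helper takes `&AuthContext`, the connection-level context stays as the endpoint built it, and each stream can carry its own derived copy.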
### Handler-level auth enrichment pattern
Handlers that need to complete authentication do so inside `handle()`:
```rust
async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError> {
    let identity = if let Some(id) = &auth.identity {
        id.clone() // Endpoint already resolved identity
    } else {
        // Extract credentials from the protocol, resolve via IdentityProvider
        let token = self.extract_auth_token(&connection).await?;
        self.identity_provider.resolve_from_token(&token)
            .ok_or(HandlerError::AuthRequired)?
    };
    // ... proceed with authenticated identity
}
```
Handlers that don't need authentication (e.g., DNS resolver, health check) can ignore `auth.identity` entirely.
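The enrichment pattern above can be exercised without a network by pulling the decision into a plain function. This is a synchronous sketch under assumptions: `resolve_identity` and `SingleTokenProvider` are hypothetical names, `HandlerError` is reduced to the one variant used here, and token extraction from the wire is left out.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

pub struct AuthToken {
    pub raw: Vec<u8>,
}

pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}

/// Reduced to the single variant this sketch needs.
#[derive(Debug, PartialEq)]
pub enum HandlerError {
    AuthRequired,
}

/// Prefers the identity the endpoint already resolved; otherwise falls back
/// to a token the handler extracted from its own protocol.
pub fn resolve_identity(
    endpoint_identity: Option<&Identity>,
    token: Option<&AuthToken>,
    provider: &dyn IdentityProvider,
) -> Result<Identity, HandlerError> {
    if let Some(id) = endpoint_identity {
        return Ok(id.clone());
    }
    let token = token.ok_or(HandlerError::AuthRequired)?;
    provider
        .resolve_from_token(token)
        .ok_or(HandlerError::AuthRequired)
}

/// Hypothetical provider that knows exactly one token (illustration only).
pub struct SingleTokenProvider {
    pub token: Vec<u8>,
    pub identity: Identity,
}

impl IdentityProvider for SingleTokenProvider {
    fn resolve_from_fingerprint(&self, _fingerprint: &str) -> Option<Identity> {
        None
    }

    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity> {
        (token.raw == self.token).then(|| self.identity.clone())
    }
}
```

The resolved `Identity` is returned by value, so it lives in the handler's local scope and the shared `AuthContext` is never written to.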
### Identity carries over from reference implementation
```rust
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}
```
This is the same structure from the reference implementation, minus the russh dependency. It's ALPN-agnostic:
- `id`: A unique identifier string. For SSH key auth, this is the SHA-256 fingerprint. For API key auth, this is the key prefix. For certificate auth, this is the principal name.
- `scopes`: Authorization scopes. `["relay:connect", "secrets:derive"]` etc.
- `resources`: Named resource lists. `{"service": ["gitea", "registry"]}` etc.
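To show how a handler might consume these fields, here is a small sketch of authorization helpers. The method names `has_scope` and `can_access` are hypothetical, and exact string matching is an assumption of this sketch; the ADR does not specify matching semantics.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

impl Identity {
    /// True if the identity carries exactly this scope string.
    pub fn has_scope(&self, scope: &str) -> bool {
        self.scopes.iter().any(|s| s == scope)
    }

    /// True if `name` is listed under the resource kind `kind`.
    pub fn can_access(&self, kind: &str, name: &str) -> bool {
        self.resources
            .get(kind)
            .map_or(false, |names| names.iter().any(|n| n == name))
    }
}
```

With the example values from the list above, an identity holding `"relay:connect"` and `{"service": ["gitea", "registry"]}` would pass `has_scope("relay:connect")` and `can_access("service", "gitea")`, and fail both checks for anything unlisted.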
### AuthToken carries raw bytes
```rust
pub struct AuthToken {
    pub raw: Vec<u8>,
}
```
Unchanged from the reference implementation. Opaque bytes — the handler that extracted it knows its encoding.
### IdentityProvider carries over with minor adaptation
```rust
pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}
```
The implementation (`ConfigIdentityProvider`) changes from the reference: it no longer depends on russh types for key storage. Instead, it stores fingerprint strings and API key entries, drawing from `DynamicConfig` via `ArcSwap`.
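A provider backed by a swappable config snapshot might look like the sketch below. It is an illustration under assumptions, not the `ConfigIdentityProvider` itself: `AuthSnapshot` is a hypothetical slice of `DynamicConfig`, and `RwLock<Arc<...>>` from the standard library stands in for `ArcSwap` so the example has no external dependencies.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

#[derive(Clone, Debug, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

pub struct AuthToken {
    pub raw: Vec<u8>,
}

pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}

/// Hypothetical auth-related slice of the dynamic config.
#[derive(Default)]
pub struct AuthSnapshot {
    pub by_fingerprint: HashMap<String, Identity>,
    pub by_token: HashMap<Vec<u8>, Identity>,
}

/// Holds the current snapshot; `RwLock<Arc<_>>` is a std stand-in for ArcSwap.
pub struct SnapshotIdentityProvider {
    current: RwLock<Arc<AuthSnapshot>>,
}

impl SnapshotIdentityProvider {
    pub fn new(snapshot: AuthSnapshot) -> Self {
        Self { current: RwLock::new(Arc::new(snapshot)) }
    }

    /// Replaces the whole snapshot; lookups already in progress keep the old one.
    pub fn reload(&self, snapshot: AuthSnapshot) {
        *self.current.write().unwrap() = Arc::new(snapshot);
    }

    /// Cheap handle to the current snapshot; the lock is held only briefly.
    fn snapshot(&self) -> Arc<AuthSnapshot> {
        let guard = self.current.read().unwrap();
        Arc::clone(&guard)
    }
}

impl IdentityProvider for SnapshotIdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.snapshot().by_fingerprint.get(fingerprint).cloned()
    }

    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity> {
        self.snapshot().by_token.get(&token.raw).cloned()
    }
}
```

Swapping the whole snapshot at once means a lookup never observes a half-updated set of keys, which is the property the `ArcSwap` design is after.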
### AuthContext is NOT mutable inside handle()
The `handle()` signature passes `&AuthContext` (immutable reference). Handlers that resolve identity create a local variable with the resolved identity — they don't mutate the AuthContext. This prevents accidental cross-contamination between streams on the same connection.
## Consequences
**Positive:**
- `AuthContext` is a value type — cheap to construct, clone, and pass around
- Handlers that don't need auth can ignore it entirely
- The endpoint provides what it can for free (TLS client cert fingerprint), handlers complete what they need
- No russh dependency in AuthContext — it's ALPN-agnostic
- `Option<Identity>` is explicit — there's no "partially authenticated" state that handlers have to interpret
- Handlers that need to enrich auth create local variables, not mutation — clean data flow
**Negative:**
- Handlers that need auth must call `IdentityProvider` themselves — this is intentional (ADR-004 hybrid model) but means each handler has its own auth extraction logic
- `tls_client_fingerprint` is separate from `identity` — a handler might wonder "why do I have a fingerprint but no identity?" This happens when the client presents a cert that's not in the authorized keys. The handler can log the fingerprint for debugging.
- `AuthContext` doesn't carry protocol-specific auth state (e.g., SSH auth method, HTTP auth scheme). This is by design — protocol-specific details belong inside the handler, not in the shared auth context.
## References
- ADR-002: ProtocolHandler trait
- ADR-004: Auth as shared core (IdentityProvider, hybrid auth model)
- ADR-007: BiStream type definition (Connection parameter)
- ADR-010: ALPN router and endpoint (where AuthContext is created)
- Reference implementation: `alknet-main/crates/alknet-core/src/auth/identity.rs`


@@ -1,6 +1,6 @@
 ---
 status: draft
-last_updated: 2026-06-16
+last_updated: 2026-06-17
 ---
 # Open Questions
@@ -45,11 +45,11 @@ Door type classifications follow ADR-009:
 ### OQ-04: Dynamic Handler Registration at Runtime vs Static at Startup
 - **Origin**: [overview.md](overview.md)
-- **Status**: open
+- **Status**: resolved
 - **Door type**: Two-way
 - **Priority**: low
-- **Resolution**: (deferred to implementation) Start with static registration at startup. The `ArcSwap<DynamicConfig>` pattern from the previous implementation can be applied later if needed. ALPN advertisement requires endpoint restart anyway (TLS ALPN is negotiated during handshake), so dynamic registration has limited value in v1.
+- **Resolution**: Static registration at startup. `HandlerRegistry` is immutable after construction. ALPN strings in the TLS `ServerConfig` are derived from the registry at startup — adding a handler at runtime requires rebuilding the TLS config. The `ArcSwap<HandlerRegistry>` pattern can be applied later if needed (two-way door). See ADR-010.
-- **Cross-references**: ADR-001
+- **Cross-references**: ADR-001, ADR-010, [endpoint.md](crates/core/endpoint.md)
 ## Theme: Transport and Endpoint
@@ -59,8 +59,8 @@ Door type classifications follow ADR-009:
 - **Status**: open
 - **Door type**: Two-way
 - **Priority**: low
-- **Resolution**: (deferred to implementation) Start with quinn (QUIC over UDP). The endpoint can be made transport-agnostic later by abstracting the connection accept loop behind a trait. iroh connectivity produces QUIC connections that can feed into the same ALPN router.
+- **Resolution**: Start with quinn (QUIC over UDP). `AlknetEndpoint` uses `quinn::Endpoint` directly. The endpoint can be made transport-agnostic later by abstracting the connection accept loop behind a trait. iroh connectivity produces QUIC connections that can feed into the same ALPN router. `SendStream`/`RecvStream` are concrete wrappers over quinn types — can become enum dispatch if multi-transport is needed. See ADR-010.
-- **Cross-references**: ADR-001
+- **Cross-references**: ADR-001, ADR-010, [core-types.md](crates/core/core-types.md)
 ### OQ-06: Server-Side ALPN vs Client-Side ALPN
@@ -113,4 +113,24 @@ These questions are acknowledged but not active. They will be promoted to open w
 - **Door type**: Two-way
 - **Priority**: low
 - **Resolution**: Deferred per the cleanup plan. Start with git smart protocol over QUIC streams. ERC721 integration and full server capabilities are additive. Resolve when speccing alknet-git.
 - **Cross-references**: ADR-001
+## Theme: alknet-core
+### OQ-11: Handler-Level Auth Resolution Observability
+- **Origin**: [auth.md](crates/core/auth.md)
+- **Status**: open
+- **Door type**: Two-way
+- **Priority**: medium
+- **Resolution**: When a handler resolves identity inside `handle()`, should the resolved `Identity` be stored somewhere for observability (e.g., connection logging), or is the handler's local variable sufficient? Options: (A) handlers return the resolved identity from `handle()`, (B) handlers call a method on Connection to set identity, (C) handlers log locally and the resolved identity stays local. Two-way door — can be decided during implementation.
+- **Cross-references**: ADR-004, ADR-011
+### OQ-12: TLS Certificate Provisioning in AlknetEndpoint
+- **Origin**: [endpoint.md](crates/core/endpoint.md), [config.md](crates/core/config.md)
+- **Status**: resolved
+- **Door type**: Two-way
+- **Priority**: medium
+- **Resolution**: Start with file paths in StaticConfig (option A). The CLI binary provides `tls_cert` and `tls_key` paths at startup. ACME auto-provisioning (option B) and external cert managers (option C) are additive — they can be added as features without changing the core StaticConfig or endpoint lifecycle. `StaticConfig` does NOT include `acme_domain` in v1; ACME will be a separate feature when implemented.
+- **Cross-references**: ADR-010, [config.md](crates/core/config.md)