docs(architecture): spec alknet-core with per-crate subdocs, ADR-010/011
Add alknet-core architecture specs in docs/architecture/crates/core/ with focused subdocuments for core types, endpoint, auth, and config. Write ADR-010 (ALPN Router and Endpoint) defining AlknetEndpoint, HandlerRegistry, accept loop, and graceful shutdown. Write ADR-011 (AuthContext Structure) defining AuthContext fields, immutability in handle(), and IdentityProvider injection pattern. Resolve OQ-04 (static registration), OQ-12 (file paths only for v1). Add OQ-11 (auth observability). Fix remaining alknet-secret references to alknet-vault across ADRs 003/004/005/009.
47
docs/architecture/crates/core/README.md
Normal file
@@ -0,0 +1,47 @@
---
status: draft
last_updated: 2026-06-16
---

# alknet-core

Core library for ALPN-based protocol dispatch. Every handler crate depends on alknet-core.

## Documents

| Document | Status | Description |
|----------|--------|-------------|
| [core-types.md](core-types.md) | draft | ProtocolHandler trait, HandlerError, Connection, BiStream, StreamError |
| [endpoint.md](endpoint.md) | draft | ALPN router, HandlerRegistry, accept loop, graceful shutdown |
| [auth.md](auth.md) | draft | AuthContext, Identity, IdentityProvider, AuthToken, resolution flow |
| [config.md](config.md) | draft | StaticConfig, DynamicConfig, ArcSwap, ConfigReloadHandle |

## Applicable ADRs

| ADR | Title | Relevance |
|-----|-------|-----------|
| [001](../../decisions/001-alpn-protocol-dispatch.md) | ALPN-Based Protocol Dispatch | Core architectural model |
| [002](../../decisions/002-protocol-handler-trait.md) | ProtocolHandler Trait | The trait every handler implements |
| [003](../../decisions/003-crate-decomposition.md) | Crate Decomposition | alknet-core's position in the crate graph |
| [004](../../decisions/004-auth-as-shared-core.md) | Auth as Shared Core | IdentityProvider in core |
| [006](../../decisions/006-alpn-convention-and-connection-model.md) | ALPN String Convention | ALPN format, one-ALPN-per-connection |
| [007](../../decisions/007-bistream-type-definition.md) | BiStream Type Definition | Connection, BiStream trait, SendStream, RecvStream |
| [009](../../decisions/009-one-way-door-decision-framework.md) | One-Way Door Framework | Decision classification |
| [010](../../decisions/010-alpn-router-and-endpoint.md) | ALPN Router and Endpoint | Endpoint, HandlerRegistry, accept loop |
| [011](../../decisions/011-authcontext-structure.md) | AuthContext Structure | AuthContext fields and resolution flow |

## Relevant Open Questions

| OQ | Title | Status | Relevance |
|----|-------|--------|-----------|
| OQ-04 | Dynamic handler registration | resolved (start static) | HandlerRegistry is immutable at startup |
| OQ-05 | Multi-transport endpoint | open (start with quinn) | AlknetEndpoint uses quinn directly |
| OQ-11 | AuthContext resolution completeness | open | How handlers signal auth completion |

## Key Design Principles

1. **One trait, one dispatch point**: `ProtocolHandler` is the only abstraction handlers implement. No StreamInterface/MessageInterface split.
2. **ALPN does the routing**: The endpoint dispatches by ALPN string. No byte-peeking, no ListenerConfig enum.
3. **Handlers own their wire format**: Each handler manages its own protocol parsing. alknet-core provides the Connection, not the framing.
4. **Auth is hybrid**: The endpoint provides what it can (TLS-level auth). Handlers complete what they need. AuthContext may be partial.
5. **WASM door preserved**: BiStream is a trait, Connection is an opaque type. Core types don't assume tokio or quinn in public APIs.
237
docs/architecture/crates/core/auth.md
Normal file
@@ -0,0 +1,237 @@
---
status: draft
last_updated: 2026-06-16
---

# Authentication

AuthContext, Identity, IdentityProvider, AuthToken, and the resolution flow.

See [ADR-004](../../decisions/004-auth-as-shared-core.md) and [ADR-011](../../decisions/011-authcontext-structure.md) for rationale.

## AuthContext

Created by the endpoint for each incoming connection. Passed to `ProtocolHandler::handle()` as an immutable reference.

```rust
#[derive(Clone)]
pub struct AuthContext {
    /// The peer's authenticated identity, if resolved by the endpoint.
    /// None means the endpoint has no identity information for this connection.
    pub identity: Option<Identity>,

    /// The negotiated ALPN for this connection. Always present.
    pub alpn: Vec<u8>,

    /// The peer's remote address, if available. Informational (NAT/proxy).
    pub remote_addr: Option<SocketAddr>,

    /// SHA-256 fingerprint of the TLS client certificate, if presented.
    /// Set by the endpoint during TLS handshake. Handlers may use this for
    /// fingerprint-based auth even when IdentityProvider returns None.
    pub tls_client_fingerprint: Option<String>,
}
```

### Construction by the endpoint

The endpoint constructs `AuthContext` from the QUIC connection:

1. `alpn`: From `connection.alpn()`; always present after the TLS handshake.
2. `remote_addr`: From `connection.remote_addr()`; may be `None` for iroh connections.
3. `tls_client_fingerprint`: Extracted from the TLS session's client certificate, if one was presented.
4. `identity`: If a TLS client fingerprint is available, the endpoint calls `IdentityProvider::resolve_from_fingerprint()`. If it resolves, `identity = Some(resolved)`. If not, `identity = None`.

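The four steps above can be sketched as one function. This is a minimal, std-only illustration rather than the endpoint's actual code: `build_auth_context` and the `OneFingerprint` test double are hypothetical names, the trait is reduced to the one method this path needs, and the ALPN, address, and fingerprint are passed in as plain values instead of being read from a QUIC connection.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

#[derive(Clone)]
pub struct AuthContext {
    pub identity: Option<Identity>,
    pub alpn: Vec<u8>,
    pub remote_addr: Option<SocketAddr>,
    pub tls_client_fingerprint: Option<String>,
}

/// Reduced to the single method the endpoint-level path uses.
pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
}

/// Hypothetical helper: steps 1-4 with the connection details passed in.
pub fn build_auth_context(
    alpn: Vec<u8>,
    remote_addr: Option<SocketAddr>,
    tls_client_fingerprint: Option<String>,
    provider: &dyn IdentityProvider,
) -> AuthContext {
    // Step 4: the provider is consulted only when a fingerprint was presented.
    let identity = tls_client_fingerprint
        .as_deref()
        .and_then(|fp| provider.resolve_from_fingerprint(fp));
    AuthContext { identity, alpn, remote_addr, tls_client_fingerprint }
}

/// Test double (assumption): recognizes exactly one fingerprint.
pub struct OneFingerprint(pub String);

impl IdentityProvider for OneFingerprint {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        (fingerprint == self.0.as_str()).then(|| Identity {
            id: fingerprint.to_string(),
            scopes: vec!["relay:connect".to_string()],
            resources: HashMap::new(),
        })
    }
}
```

Note that an unrecognized fingerprint still ends up in `tls_client_fingerprint`, which is what lets a handler attempt its own fingerprint-based auth later.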
### Handler-level resolution

Handlers that require authentication extract protocol-specific credentials and call `IdentityProvider` inside `handle()`:

```rust
// Example: CallAdapter extracting an AuthToken from the first frame
async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError> {
    let identity = match &auth.identity {
        Some(id) => id.clone(), // Endpoint already resolved identity
        None => {
            let stream = connection.accept_bi().await?;
            let token = extract_auth_token(stream).await?;
            self.identity_provider
                .resolve_from_token(&token)
                .ok_or(HandlerError::AuthRequired)?
        }
    };
    // ... proceed with authenticated identity
}
```

Handlers that don't require authentication (e.g., DNS resolver, health check) can ignore `auth.identity` entirely.

### AuthContext is Clone and immutable

- `derive(Clone)` allows handlers to clone `AuthContext` for per-stream or per-channel contexts.
- `handle()` receives `&AuthContext`, which is immutable. Handlers that resolve identity store the result in local variables; they don't mutate the shared context. This prevents cross-contamination between streams on the same connection.

## Identity

The authenticated peer identity. Carries authorization information.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    /// Unique identifier string. Fingerprint, key prefix, or principal name.
    pub id: String,

    /// Authorization scopes. e.g., ["relay:connect", "secrets:derive"]
    pub scopes: Vec<String>,

    /// Named resource lists. e.g., {"service": ["gitea", "registry"]}
    pub resources: HashMap<String, Vec<String>>,
}
```

This is the same structure as the reference implementation (`alknet-main/crates/alknet-core/src/auth/identity.rs`), minus the russh dependency. The `id` field is ALPN-agnostic:
- SSH key auth: `"SHA256:abc123..."` (key fingerprint)
- API key auth: `"alk_test"` (key prefix)
- Certificate auth: `"username"` (principal name)

## AuthToken

Opaque authentication token carried in protocol frames.

```rust
#[derive(Debug, Clone)]
pub struct AuthToken {
    pub raw: Vec<u8>,
}
```

Unchanged from the reference implementation. The handler that extracted it knows its encoding (UTF-8 string, binary token, etc.).

## IdentityProvider

Trait for resolving credentials to identities. Implemented by `ConfigIdentityProvider`.

```rust
pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}
```

- `resolve_from_fingerprint()`: Used by the endpoint (TLS client cert) and by SSH (key fingerprint).
- `resolve_from_token()`: Used by call protocol (AuthToken in first frame) and HTTP (Bearer header).

Both methods return `Option<Identity>`; `None` means the credential is not recognized.

## ConfigIdentityProvider

The default implementation. Resolves identities from `DynamicConfig`:

```rust
pub struct ConfigIdentityProvider {
    dynamic: Arc<ArcSwap<DynamicConfig>>,
}
```

The "Config" prefix indicates that identities are resolved from configuration (as opposed to a database or external service). This reads from `ArcSwap<DynamicConfig>`, which is hot-reloadable, not from `StaticConfig`. An alternative name would be `DynamicConfigIdentityProvider` to make this clearer, but `ConfigIdentityProvider` is consistent with the reference implementation and the naming is unlikely to cause confusion in practice.

How it resolves:
- **Fingerprint**: Look up in `DynamicConfig::auth::authorized_fingerprints`. If found, return `Identity { id: fingerprint, scopes: ["relay:connect"], resources: {} }`.
- **Token**: Parse as UTF-8. If it starts with `alk_`, look up in `DynamicConfig::auth::api_keys` by prefix match + SHA-256 hash. If found and not expired, return `Identity { id: prefix, scopes: entry.scopes, resources: entry.resources }`.

Changes to `DynamicConfig` via `ConfigReloadHandle` are reflected immediately because `ConfigIdentityProvider` reads from `ArcSwap` on every call.

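The token path described above might look roughly like the following. This is a sketch under stated assumptions, not the reference implementation: `resolve_token` and `demo_digest` are hypothetical names, the SHA-256 step is injected as a function because std has no SHA-256 (`demo_digest` is a non-cryptographic placeholder for tests only), "not expired" is read as `now < expires_at`, and `resources` is returned empty because `ApiKeyEntry` as specified in config.md has no `resources` field.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

#[derive(Debug, Clone)]
pub struct ApiKeyEntry {
    pub prefix: String,
    pub hash: String,
    pub scopes: Vec<String>,
    pub description: String,
    pub expires_at: Option<u64>,
}

/// Placeholder digest for tests only. NOT SHA-256 and NOT secure.
pub fn demo_digest(key: &str) -> String {
    format!("demo:{}", key)
}

/// Hypothetical token resolution.
/// `now` is Unix seconds; `hash_hex` stands in for a SHA-256 hex digest.
pub fn resolve_token(
    raw: &[u8],
    api_keys: &[ApiKeyEntry],
    now: u64,
    hash_hex: impl Fn(&str) -> String,
) -> Option<Identity> {
    // Tokens that are not valid UTF-8 are not recognized.
    let key = std::str::from_utf8(raw).ok()?;
    if !key.starts_with("alk_") {
        return None;
    }
    let digest = hash_hex(key);
    api_keys
        .iter()
        // Prefix match narrows the candidates; the digest verifies the full key.
        .filter(|e| key.starts_with(e.prefix.as_str()))
        .filter(|e| e.hash == digest)
        // A key without `expires_at` never expires.
        .find(|e| e.expires_at.map_or(true, |t| now < t))
        .map(|e| Identity {
            id: e.prefix.clone(),
            scopes: e.scopes.clone(),
            resources: HashMap::new(),
        })
}
```

Passing `now` in rather than reading the clock inside the function keeps the expiry check deterministic in tests.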
## Resolution Flow

### Endpoint-level (before `handle()`)

```
QUIC connection arrives
  → TLS handshake (ALPN negotiation)
  → Extract TLS client certificate fingerprint (if presented)
  → If fingerprint present: IdentityProvider::resolve_from_fingerprint()
      → Some(identity): auth.identity = Some(identity)
      → None: auth.identity = None
  → Construct AuthContext { identity, alpn, remote_addr, tls_client_fingerprint }
  → Look up handler by alpn
  → tokio::spawn(handler.handle(connection, &auth))
```

### Handler-level (inside `handle()`)

```
Handler receives &AuthContext
  → If auth.identity is Some: use it (endpoint already resolved)
  → If auth.identity is None and handler requires auth:
      → Extract protocol-specific credential (AuthToken, SSH key, etc.)
      → Call IdentityProvider::resolve_from_token() or resolve_from_fingerprint()
      → If resolved: use the Identity
      → If not resolved: return HandlerError::AuthRequired
  → If handler doesn't require auth: proceed without identity
```

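The handler-level flow can be written as a pure decision function, which makes the three outcomes easy to test without a connection. The following is an illustrative sketch only: `Credential`, `resolve_in_handler`, and the `AcceptOne` test double are hypothetical names, and `HandlerError` is reduced to the one variant this flow uses.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

#[derive(Debug, Clone)]
pub struct AuthToken {
    pub raw: Vec<u8>,
}

/// Reduced to the variant this flow needs.
#[derive(Debug, PartialEq)]
pub enum HandlerError {
    AuthRequired,
}

pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}

/// Hypothetical: whatever credential the handler pulled from its own protocol.
pub enum Credential {
    Token(AuthToken),
    Fingerprint(String),
}

/// Returns the identity to use, `Ok(None)` for unauthenticated handlers,
/// or `AuthRequired` when auth is mandatory and nothing resolved.
pub fn resolve_in_handler(
    endpoint_identity: Option<&Identity>,
    requires_auth: bool,
    credential: Option<Credential>,
    provider: &dyn IdentityProvider,
) -> Result<Option<Identity>, HandlerError> {
    // The endpoint already resolved an identity: clone it into a local value.
    if let Some(id) = endpoint_identity {
        return Ok(Some(id.clone()));
    }
    if !requires_auth {
        return Ok(None);
    }
    let resolved = match credential {
        Some(Credential::Token(token)) => provider.resolve_from_token(&token),
        Some(Credential::Fingerprint(fp)) => provider.resolve_from_fingerprint(&fp),
        None => None,
    };
    resolved.map(Some).ok_or(HandlerError::AuthRequired)
}

/// Test double (assumption): accepts exactly one token, no fingerprints.
pub struct AcceptOne {
    pub token: Vec<u8>,
}

impl IdentityProvider for AcceptOne {
    fn resolve_from_fingerprint(&self, _fingerprint: &str) -> Option<Identity> {
        None
    }
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity> {
        (token.raw == self.token).then(|| Identity {
            id: String::from_utf8_lossy(&token.raw).into_owned(),
            scopes: Vec::new(),
            resources: HashMap::new(),
        })
    }
}
```

The function never touches the shared `AuthContext`; the resolved identity is returned by value, in line with the immutability rule for `handle()`.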
## IdentityProvider Injection

Handlers need access to `IdentityProvider` to resolve credentials inside `handle()`. Since `ProtocolHandler::handle()` doesn't receive an `IdentityProvider` parameter, each handler must obtain it through **constructor injection**:

```rust
// Example: SshAdapter holds an Arc<dyn IdentityProvider>
pub struct SshAdapter {
    identity_provider: Arc<dyn IdentityProvider>,
    // ... other handler-specific state
}

#[async_trait]
impl ProtocolHandler for SshAdapter {
    fn alpn(&self) -> &'static [u8] { b"alknet/ssh" }

    async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError> {
        let identity = match &auth.identity {
            Some(id) => id.clone(),
            None => {
                // Extract SSH key fingerprint, resolve via identity_provider
                let fingerprint = extract_ssh_fingerprint(&connection).await?;
                self.identity_provider
                    .resolve_from_fingerprint(&fingerprint)
                    .ok_or(HandlerError::AuthRequired)?
            }
        };
        // ...
    }
}
```

The CLI binary constructs each handler with `Arc::clone(&identity_provider)` and passes it when building the `HandlerRegistry`. This is the **assembly pattern**: the CLI (the only crate that depends on all handlers) wires dependencies together.

The endpoint's `AlknetEndpoint` also holds `Arc<dyn IdentityProvider>` for endpoint-level auth resolution (TLS client certificate fingerprints), but handlers don't receive it from the endpoint; they receive it at construction time from the CLI.

| Handler | Credential source | Resolution method |
|---------|------------------|-----------------|
| SshAdapter | SSH public key handshake | `resolve_from_fingerprint()` |
| CallAdapter | AuthToken in first frame | `resolve_from_token()` |
| HttpAdapter | `Authorization: Bearer` header | `resolve_from_token()` |
| DnsAdapter | AuthToken in query labels | `resolve_from_token()` |
| GitAdapter | Signed push certificate | `resolve_from_fingerprint()` |
| SftpAdapter | SSH key (shares with SshAdapter) | `resolve_from_fingerprint()` |

## Key Differences from Reference Implementation

| Aspect | Reference | New Model |
|--------|-----------|-----------|
| Auth resolution | Inside SSH handler, before `handle()` | Hybrid: endpoint resolves TLS-level, handler resolves protocol-level |
| AuthContext type | None (just `Arc<ArcSwap<DynamicConfig>>` + `IdentityProvider`) | Explicit struct with optional fields |
| `Identity.id` | Always a fingerprint or API key prefix | Same, but ALPN-agnostic documentation |
| `ConfigIdentityProvider` | Depends on russh for `PublicKey` types | No russh dependency; fingerprints stored as strings |
| Credential phases | A–D phases in `CredentialProvider` | Two paths: fingerprint and token. No phases. |

## Design Decisions

| Decision | ADR | Summary |
|----------|-----|---------|
| Hybrid auth model | [ADR-004](../../decisions/004-auth-as-shared-core.md) | Endpoint resolves TLS-level, handler resolves protocol-level |
| AuthContext with optional Identity | [ADR-011](../../decisions/011-authcontext-structure.md) | Explicit None, not "partially authenticated" |
| AuthContext is immutable in handle() | [ADR-011](../../decisions/011-authcontext-structure.md) | Handlers create local variables for resolved identity |
| Two resolution paths | [ADR-004](../../decisions/004-auth-as-shared-core.md) | Fingerprint and token, not phased auth |

## Open Questions

- **OQ-11**: See [open-questions.md](../../open-questions.md) for handler-level auth resolution observability.
198
docs/architecture/crates/core/config.md
Normal file
@@ -0,0 +1,198 @@
---
status: draft
last_updated: 2026-06-16
---

# Configuration

StaticConfig, DynamicConfig, ArcSwap, and ConfigReloadHandle.

## StaticConfig

Immutable configuration resolved at startup. Cannot be changed without restarting the endpoint.

```rust
pub struct StaticConfig {
    /// Bind address for the QUIC endpoint (e.g., "0.0.0.0:4433").
    pub listen_addr: SocketAddr,

    /// Path to TLS certificate file (PEM).
    /// Required for QUIC+TLS. The endpoint will not start without TLS configuration.
    pub tls_cert: Option<PathBuf>,

    /// Path to TLS private key file (PEM).
    /// Required alongside tls_cert.
    pub tls_key: Option<PathBuf>,

    /// Drain timeout for graceful shutdown (default: 2 seconds).
    pub drain_timeout: Duration,
}
```

### Key differences from reference implementation

The reference `StaticConfig` (in `alknet-main/crates/alknet-core/src/config/static_config.rs`) is SSH-centric: it holds `host_key`, `host_key_algorithm`, `proxy_config`, `stealth`, `transport_mode`, and `listeners`. The new model removes all of these:

- **No `host_key`/`host_key_algorithm`**: SSH host keys are managed by the SSH handler, not by core config. The endpoint uses TLS certs, not SSH host keys.
- **No `proxy_config`**: Outbound proxy is an SSH-specific concern (SOCKS5/HTTP CONNECT forwarding). Not in core config.
- **No `stealth`**: ALPN eliminates the need for stealth/byte-peeking. See [ADR-001](../../decisions/001-alpn-protocol-dispatch.md).
- **No `transport_mode`/`listeners`**: The old `ServeTransportMode` and `ListenerConfig` enum are replaced by a single `listen_addr`. QUIC+TLS+ALPN replaces multiple listener types. See [ADR-010](../../decisions/010-alpn-router-and-endpoint.md).
- **No `iroh_relay`**: iroh transport is deferred (OQ-05). The v1 endpoint uses quinn directly.

### Construction

`StaticConfig` is constructed by the CLI binary from CLI arguments or a config file. The exact shape of `StartupOptions` (or whatever the CLI uses) is a CLI concern, not a core concern. alknet-core provides `StaticConfig` as a data structure; the CLI is responsible for populating it.

```rust
// The CLI binary constructs StaticConfig from its own options/config.
// StartupOptions is NOT a core type; it belongs to the alknet CLI binary.
// alknet-core receives a fully populated StaticConfig.
let static_config = StaticConfig {
    listen_addr: "0.0.0.0:4433".parse()?,
    tls_cert: Some("/path/to/cert.pem".into()),
    tls_key: Some("/path/to/key.pem".into()),
    drain_timeout: Duration::from_secs(2),
};
```

## DynamicConfig

Runtime-reloadable configuration. Hot-reloaded via `ArcSwap` without restarting the endpoint.

```rust
#[derive(Debug, Clone)]
pub struct DynamicConfig {
    pub auth: AuthPolicy,
    pub rate_limits: RateLimitConfig,
}
```

### AuthPolicy

Authorization policy derived from authorized keys, certificate authorities, and API keys.

```rust
pub struct AuthPolicy {
    /// SHA-256 fingerprints of authorized keys (SSH keys, TLS client certs).
    /// Stored as strings to avoid russh dependency in core.
    pub authorized_fingerprints: HashSet<String>,

    /// Certificate authorities for certificate-based auth.
    /// The exact structure is TBD: it will be defined when alknet-ssh
    /// is implemented. For now, this is a placeholder that reserves
    /// the field. alknet-ssh will define `CertAuthorityEntry` with
    /// the necessary fields (public key, principals, options).
    pub cert_authorities: Vec<CertAuthorityEntry>,

    /// API keys for token-based auth.
    pub api_keys: Vec<ApiKeyEntry>,
}
```

`CertAuthorityEntry` is a placeholder type. Its fields will be defined when alknet-ssh is implemented and the certificate authority validation requirements are clear. For v1, `cert_authorities` will be an empty vector.

This replaces the reference implementation's `AuthPolicy`, which depended on `russh::keys::PublicKey`. The new version stores fingerprints as strings, not russh types. This removes the russh dependency from alknet-core.

### ApiKeyEntry

```rust
pub struct ApiKeyEntry {
    /// Key prefix (first 8 chars of the key). Used for O(1) lookup.
    pub prefix: String,

    /// SHA-256 hash of the full key. Used for verification.
    pub hash: String,

    /// Authorization scopes granted by this key.
    pub scopes: Vec<String>,

    /// Human-readable description.
    pub description: String,

    /// Unix timestamp when the key expires. None = never expires.
    pub expires_at: Option<u64>,
}
```

Carries forward from the reference implementation with no changes.

### RateLimitConfig

```rust
pub struct RateLimitConfig {
    pub max_connections_per_ip: usize,
    pub max_auth_attempts: usize,
}
```

Carries forward from the reference implementation. Note: in the reference implementation, `max_connections_per_ip` and `max_auth_attempts` appear in both `StaticConfig` and `RateLimitConfig`. In the new model:

- `StaticConfig` does NOT contain rate limit fields. Rate limits are entirely dynamic.
- `RateLimitConfig` in `DynamicConfig` is the authoritative source at runtime.
- The CLI binary sets initial `RateLimitConfig` values when creating the initial `DynamicConfig`.
- Hot-reloading `DynamicConfig` via `ConfigReloadHandle` replaces rate limits immediately; no restart is needed.

## ArcSwap Pattern

`DynamicConfig` is wrapped in `Arc<ArcSwap<DynamicConfig>>` for lock-free reads and atomic swaps.

```rust
let dynamic = Arc::new(ArcSwap::new(Arc::new(DynamicConfig::default())));
```

- **Reads**: `dynamic.load()` returns a guard that dereferences to `Arc<DynamicConfig>`; `dynamic.load_full()` returns an owned `Arc<DynamicConfig>`. Multiple readers can hold references simultaneously without blocking.
- **Writes**: `dynamic.store(Arc::new(new_config))` atomically replaces the config. All subsequent reads see the new config.
- **No locks**: `ArcSwap` uses atomic operations. No reader is ever blocked by a writer.

This pattern carries forward directly from the reference implementation (`alknet-main/crates/alknet-core/src/config/dynamic_config.rs`).

## ConfigReloadHandle

```rust
pub struct ConfigReloadHandle {
    dynamic: Arc<ArcSwap<DynamicConfig>>,
}

impl ConfigReloadHandle {
    pub fn reload(&self, new_config: DynamicConfig);
    pub fn dynamic(&self) -> Arc<DynamicConfig>;
}
```

- `reload()`: Atomically replaces the dynamic config. Every read that starts after the swap (including later `IdentityProvider` calls) sees the new config; a call that already loaded a snapshot finishes with the old one.
- `dynamic()`: Returns the current config as `Arc<DynamicConfig>`.

The CLI binary creates a `ConfigReloadHandle` and passes it to a config watcher (file watcher, SIGHUP handler, or call protocol operation) that calls `reload()` when config changes are detected.

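The snapshot behaviour of `reload()` can be demonstrated without the arc-swap crate. The sketch below deliberately swaps in a different technique: `Swappable` is a hypothetical std-only stand-in built on `RwLock<Arc<_>>`, so unlike `ArcSwap` it is not lock-free. It only illustrates the semantics: a reader that already holds an `Arc<DynamicConfig>` keeps its snapshot, and later loads see the new value. `DynamicConfig` is reduced to its rate limits, and the `Default` derives are assumptions.

```rust
use std::sync::{Arc, RwLock};

#[derive(Debug, Clone, Default, PartialEq)]
pub struct RateLimitConfig {
    pub max_connections_per_ip: usize,
    pub max_auth_attempts: usize,
}

/// Reduced to one field for the sketch.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct DynamicConfig {
    pub rate_limits: RateLimitConfig,
}

/// Std-only stand-in for `ArcSwap<DynamicConfig>`. Uses a lock, so it is
/// NOT equivalent in performance; only the snapshot semantics match.
pub struct Swappable(RwLock<Arc<DynamicConfig>>);

impl Swappable {
    pub fn new(config: DynamicConfig) -> Self {
        Self(RwLock::new(Arc::new(config)))
    }

    /// Plays the role of `ArcSwap::load_full()`: returns an owned snapshot.
    pub fn load_full(&self) -> Arc<DynamicConfig> {
        let guard = self.0.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Plays the role of `ArcSwap::store()`: replaces the shared pointer.
    pub fn store(&self, config: Arc<DynamicConfig>) {
        *self.0.write().unwrap() = config;
    }
}

pub struct ConfigReloadHandle {
    dynamic: Arc<Swappable>,
}

impl ConfigReloadHandle {
    pub fn new(dynamic: Arc<Swappable>) -> Self {
        Self { dynamic }
    }

    pub fn reload(&self, new_config: DynamicConfig) {
        self.dynamic.store(Arc::new(new_config));
    }

    pub fn dynamic(&self) -> Arc<DynamicConfig> {
        self.dynamic.load_full()
    }
}
```

A config watcher would hold the `ConfigReloadHandle`, while `ConfigIdentityProvider` and the endpoint hold clones of the same shared pointer and load a fresh snapshot per call.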
## ConfigError

```rust
pub enum ConfigError {
    InvalidFlag { name: String },
    KeyFileNotFound { path: String },
    BindFailed(io::Error),
    TlsConfig(io::Error),
    IncompatibleOptions,
}
```

Simplified from the reference implementation. Removes proxy-specific errors (now an SSH concern) and listener validation errors (no more `ListenerConfig` enum).

## Key Differences from Reference Implementation

| Aspect | Reference | New Model |
|--------|-----------|-----------|
| StaticConfig fields | SSH host key, stealth, transport_mode, listeners, proxy | listen_addr, TLS cert/key, drain_timeout |
| DynamicConfig.auth | `HashSet<PublicKey>` (russh types) | `HashSet<String>` (fingerprint strings) |
| ListenerConfig | Enum with Stream/Http/Dns variants | Eliminated: single endpoint, ALPN dispatch |
| TransportMode | Tcp/Tls/Iroh | Eliminated: always QUIC+TLS |
| Stealth mode | Byte-peeking HTTP/SSH detection | Eliminated: ALPN handles protocol detection |
| ForwardingPolicy | In DynamicConfig | Moved to handler-specific config (SSH) |

## Design Decisions

| Decision | ADR | Summary |
|----------|-----|---------|
| No russh dependency in core | [ADR-003](../../decisions/003-crate-decomposition.md) | Core is ALPN-agnostic; russh is an alknet-ssh dependency |
| ArcSwap for dynamic config | Carry-forward from reference | Lock-free reads, atomic swaps |
| No ListenerConfig | [ADR-001](../../decisions/001-alpn-protocol-dispatch.md) | Single endpoint, ALPN replaces multiple listener types |
133
docs/architecture/crates/core/core-types.md
Normal file
@@ -0,0 +1,133 @@
---
status: draft
last_updated: 2026-06-16
---

# Core Types

ProtocolHandler, HandlerError, Connection, BiStream, SendStream, RecvStream, StreamError.

## ProtocolHandler

The central abstraction. Every handler implements one trait:

```rust
#[async_trait]
pub trait ProtocolHandler: Send + Sync + 'static {
    fn alpn(&self) -> &'static [u8];
    async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError>;
}
```

- `alpn()` returns the handler's ALPN identifier as a static byte string (e.g., `b"alknet/ssh"`, `b"alknet/call"`).
- `handle()` receives a `Connection` (not a single BiStream) and an `AuthContext`. Returns `HandlerError` on failure.
- Handlers that need a single stream call `connection.accept_bi()` once. Handlers that multiplex (SSH, call) open/accept streams as needed.

See [ADR-002](../../decisions/002-protocol-handler-trait.md) and [ADR-007](../../decisions/007-bistream-type-definition.md) for rationale.

## HandlerError

Non-fatal errors within a handler's `handle()` method. The endpoint catches these, logs them, and closes the connection. Other connections are unaffected.

```rust
pub enum HandlerError {
    ConnectionClosed,
    StreamError(io::Error),
    AuthRequired,
    Internal(Box<dyn std::error::Error + Send + Sync>),
}
```

- `ConnectionClosed`: The peer closed the connection. Clean exit.
- `StreamError`: An I/O error on a stream within the connection.
- `AuthRequired`: The handler requires authentication and couldn't resolve the peer's identity. The endpoint closes the connection with an appropriate error. Handlers that support multi-step auth (like SSH) should handle auth challenges within their protocol and return `AuthRequired` only once all attempts are exhausted.
- `Internal`: Handler-specific errors (protocol violations, upstream failures, etc.).

Handler panics are caught by tokio's task isolation. The connection is dropped; other connections continue.

## Connection

An opaque type wrapping a QUIC connection. Handlers receive a `Connection` in `handle()`.

```rust
pub struct Connection {
    // Private: wraps the underlying QUIC connection or test mock
}

impl Connection {
    pub async fn accept_bi(&self) -> Result<(SendStream, RecvStream), StreamError>;
    pub async fn open_bi(&self) -> Result<(SendStream, RecvStream), StreamError>;
    pub fn remote_alpn(&self) -> &[u8];
    pub fn remote_addr(&self) -> Option<SocketAddr>;
    pub fn close(&self, code: u32, reason: &str);
}
```

- `accept_bi()`: Wait for the peer to open a bidirectional stream. Returns `(SendStream, RecvStream)`.
- `open_bi()`: Open a bidirectional stream to the peer. Returns `(SendStream, RecvStream)`.
- `remote_alpn()`: The ALPN negotiated for this connection. Always present.
- `remote_addr()`: The peer's address, if available. Informational (NAT/proxy).
- `close()`: Close the connection with an error code and reason.

The `Connection` type does not expose quinn types in its public API. It wraps `quinn::Connection` internally, but the wrapper allows test implementations.

See [ADR-007](../../decisions/007-bistream-type-definition.md) for why handlers receive Connection instead of BiStream.

## BiStream

A trait for bidirectional byte streams. Used primarily for client-side and test scenarios.

```rust
pub trait BiStream: AsyncRead + AsyncWrite + Send + Unpin {}
```

Handlers that only need a single stream can obtain one via `connection.accept_bi()` and treat the `(SendStream, RecvStream)` pair as a BiStream. The `BiStream` trait is a convenience for:
- Client-side code that has a single bidirectional stream
- Test mocks that need to simulate a stream
- Future transport abstractions (WebTransport, raw TCP) that produce bidirectional byte streams

See [ADR-007](../../decisions/007-bistream-type-definition.md) for why BiStream is a trait.

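One way to treat a send/receive pair as a single bidirectional stream is a small adapter that forwards reads to one half and writes to the other. The sketch below illustrates the idea with synchronous `std::io::Read`/`Write` as stand-ins for `AsyncRead`/`AsyncWrite`, so it stays std-only; `Joined`, `echo_once`, and the blanket `impl` are hypothetical and not part of the spec.

```rust
use std::io::{self, Read, Write};

/// Sync stand-in for the spec's `BiStream: AsyncRead + AsyncWrite + Send + Unpin`.
pub trait BiStream: Read + Write + Send {}

/// Assumption: a blanket impl, so any suitable type is a BiStream.
impl<T: Read + Write + Send> BiStream for T {}

/// Hypothetical adapter joining a send half and a receive half.
pub struct Joined<W, R> {
    send: W,
    recv: R,
}

impl<W, R> Joined<W, R> {
    pub fn new(send: W, recv: R) -> Self {
        Self { send, recv }
    }

    /// Splits the adapter back into its two halves.
    pub fn into_parts(self) -> (W, R) {
        (self.send, self.recv)
    }
}

// Reads come from the receive half only.
impl<W, R: Read> Read for Joined<W, R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.recv.read(buf)
    }
}

// Writes go to the send half only.
impl<W: Write, R> Write for Joined<W, R> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.send.write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.send.flush()
    }
}

/// Code written against the trait works with any joined pair or mock:
/// reads up to 64 bytes and writes them straight back.
pub fn echo_once(stream: &mut dyn BiStream) -> io::Result<usize> {
    let mut buf = [0u8; 64];
    let n = stream.read(&mut buf)?;
    stream.write_all(&buf[..n])?;
    Ok(n)
}
```

In a test, an in-memory `Cursor` can play the receive half and a `Vec<u8>` the send half, which is the kind of mock the trait is meant to allow.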
## SendStream and RecvStream

Concrete types wrapping QUIC stream halves.

```rust
pub struct SendStream { /* wraps quinn::SendStream or test mock */ }
pub struct RecvStream { /* wraps quinn::RecvStream or test mock */ }

impl AsyncWrite for SendStream { ... }
impl AsyncRead for RecvStream { ... }
```

- `SendStream` implements `AsyncWrite`. Write bytes to the peer.
- `RecvStream` implements `AsyncRead`. Read bytes from the peer.
- These are not trait objects; they are concrete wrapper types that delegate to `quinn::SendStream` / `quinn::RecvStream` in production and to test mocks in tests.

This is a two-way door decision. If future transports need different stream types, `SendStream` and `RecvStream` can become wrappers with enum dispatch. For v1, concrete wrappers over quinn types are simpler and zero-cost.

## StreamError

```rust
pub enum StreamError {
    ConnectionClosed,
    StreamClosed,
    Timeout,
    Internal(io::Error),
}
```

Returned by `accept_bi()`, `open_bi()`, and stream read/write operations. Maps from `quinn::ConnectionError` and `quinn::StreamError`.

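The handler examples in these docs apply `?` to calls that return `StreamError` inside functions that return `HandlerError`, which needs a conversion between the two. A minimal sketch of one possible mapping follows. The `From` impl and the `io::ErrorKind` choices are assumptions, not a decided part of the spec, and `first_step` is a hypothetical stand-in for a fallible stream call such as `accept_bi()`.

```rust
use std::io;

#[derive(Debug)]
pub enum StreamError {
    ConnectionClosed,
    StreamClosed,
    Timeout,
    Internal(io::Error),
}

#[derive(Debug)]
pub enum HandlerError {
    ConnectionClosed,
    StreamError(io::Error),
    AuthRequired,
    Internal(Box<dyn std::error::Error + Send + Sync>),
}

/// Assumed mapping: a closed connection stays a clean exit,
/// everything else becomes an I/O-style stream error.
impl From<StreamError> for HandlerError {
    fn from(err: StreamError) -> Self {
        match err {
            StreamError::ConnectionClosed => HandlerError::ConnectionClosed,
            StreamError::StreamClosed => HandlerError::StreamError(io::Error::new(
                io::ErrorKind::BrokenPipe,
                "stream closed",
            )),
            StreamError::Timeout => HandlerError::StreamError(io::Error::new(
                io::ErrorKind::TimedOut,
                "stream timed out",
            )),
            StreamError::Internal(e) => HandlerError::StreamError(e),
        }
    }
}

/// Stand-in for a handler step that performs one fallible stream operation.
pub fn first_step(result: Result<(), StreamError>) -> Result<(), HandlerError> {
    result?; // `?` converts StreamError into HandlerError via the impl above
    Ok(())
}
```

Keeping `ConnectionClosed` distinct preserves the "clean exit" meaning that `HandlerError::ConnectionClosed` has for the endpoint's logging.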
## Design Decisions

| Decision | ADR | Summary |
|----------|-----|---------|
| ProtocolHandler receives Connection, not BiStream | [ADR-007](../../decisions/007-bistream-type-definition.md) | Handlers that need multiple streams (SSH, call) have direct access to the Connection |
| BiStream is a trait | [ADR-007](../../decisions/007-bistream-type-definition.md) | WASM door preserved, test mocks possible |
| HandlerError is non-fatal | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Handler errors close the connection, not the endpoint |
| SendStream/RecvStream are concrete wrappers | Two-way door | Can become enum dispatch later if multi-transport is needed |

## Open Questions

- **OQ-05**: See [open-questions.md](../../open-questions.md) for multi-transport. If quinn is the only transport in v1, SendStream/RecvStream can be concrete wrappers.
189
docs/architecture/crates/core/endpoint.md
Normal file
@@ -0,0 +1,189 @@
---
status: draft
last_updated: 2026-06-16
---

# Endpoint

ALPN router, handler registry, connection accept loop, and graceful shutdown.

See [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) for the full rationale.

## AlknetEndpoint

The central runtime type. Owns the QUIC endpoint, holds the handler registry, and runs the accept loop.

```rust
pub struct AlknetEndpoint {
    endpoint: quinn::Endpoint,
    handlers: Arc<HandlerRegistry>,
    dynamic: Arc<ArcSwap<DynamicConfig>>,
    identity_provider: Arc<dyn IdentityProvider>,
    shutdown: watch::Receiver<bool>,
}
```

### Construction

The CLI binary constructs an `AlknetEndpoint` at startup:

1. Build `HandlerRegistry` by inserting handlers for each ALPN.
2. Build `StaticConfig` from CLI arguments / config file.
3. Build `rustls::ServerConfig` from TLS cert/key and the registry's ALPN strings.
4. Bind `quinn::Endpoint` with the `ServerConfig`.
5. Create `ArcSwap<DynamicConfig>` and `ConfigIdentityProvider`.
6. Call `AlknetEndpoint::new(endpoint, handlers, dynamic, identity_provider)`.

### Accept Loop

```
loop {
    tokio::select! {
        incoming = endpoint.accept() => {
            // accept() yields None once the endpoint has been closed
            let Some(incoming) = incoming else { break };
            let connection = incoming.await; // TLS handshake + ALPN negotiation
            match connection {
                Ok(conn) => {
                    // shorthand: the negotiated ALPN is read from the handshake data
                    let alpn = conn.alpn();
                    match handlers.get(alpn) {
                        Some(handler) => {
                            // clone the Arc so the spawned task owns its handler
                            let handler = Arc::clone(handler);
                            let auth = AuthContext::from_connection(&conn);
                            let conn = Connection::new(conn);
                            tokio::spawn(async move {
                                if let Err(e) = handler.handle(conn, &auth).await {
                                    // log error, connection closes
                                }
                            });
                        }
                        None => {
                            // ALPN has no handler; should not happen
                            // (ServerConfig only advertises registered ALPNs)
                            conn.close(0u32.into(), b"no handler");
                        }
                    }
                }
                Err(e) => {
                    // TLS handshake or connection-level error
                    // log and continue accepting
                }
            }
        }
        _ = shutdown.changed() => {
            break; // graceful shutdown
        }
    }
}
```

### What the accept loop does NOT do

- **No byte-peeking**: ALPN negotiation handles protocol detection. The old `stealth` module's `detect_protocol()` is unnecessary.
- **No per-handler accept loops**: The old model had `ListenerConfig::Stream`, `ListenerConfig::Http`, `ListenerConfig::Dns` with different accept paths. ALPN unifies this.
- **No SSH-specific logic**: The accept loop is ALPN-agnostic. It doesn't know or care what protocol the handler speaks.

## HandlerRegistry

Maps ALPN byte strings to `ProtocolHandler` instances.

```rust
pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
}

impl HandlerRegistry {
    pub fn new() -> Self;
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>);
    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>>;
    pub fn alpn_strings(&self) -> Vec<Vec<u8>>;
}
```

- `register()`: Insert a handler. Panics if the ALPN is already registered (duplicate handlers are a bug).
- `get()`: Look up a handler by ALPN string. Returns `None` if no handler is registered.
- `alpn_strings()`: Return all registered ALPN strings. Used to build the TLS `ServerConfig`.

Registration is static at startup (see [OQ-04](../../open-questions.md) and ADR-010). The CLI builds a `HandlerRegistry`, inserts all handlers, and passes it to `AlknetEndpoint`. The registry is immutable after construction.

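The three methods above can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: the `ProtocolHandler` trait here is a stand-in for the real one in core-types.md, and its `alpn()` accessor is an assumption (since `register()` takes no ALPN argument, the handler has to name its own ALPN somehow). `EchoHandler` and the `alknet/echo` ALPN string are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Stand-in for the real trait; `alpn()` is an assumed accessor.
pub trait ProtocolHandler: Send + Sync {
    fn alpn(&self) -> &'static [u8];
}

#[derive(Default)]
pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
}

impl HandlerRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Insert a handler under its own ALPN; a duplicate is treated as a bug.
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>) {
        let alpn = handler.alpn();
        let previous = self.handlers.insert(alpn, handler);
        assert!(
            previous.is_none(),
            "duplicate handler for ALPN {:?}",
            String::from_utf8_lossy(alpn)
        );
    }

    /// Look up the handler for a negotiated ALPN.
    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>> {
        self.handlers.get(alpn)
    }

    /// Owned copies of every registered ALPN, for the TLS ALPN list.
    pub fn alpn_strings(&self) -> Vec<Vec<u8>> {
        self.handlers.keys().map(|alpn| alpn.to_vec()).collect()
    }
}

// Hypothetical handler used only to exercise the registry.
pub struct EchoHandler;

impl ProtocolHandler for EchoHandler {
    fn alpn(&self) -> &'static [u8] {
        b"alknet/echo"
    }
}
```

Because the map is only mutated through `&mut self`, wrapping the finished registry in an `Arc` gives the immutability-after-construction property without any locking.
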
### ALPN strings in the TLS ServerConfig

The `rustls::ServerConfig`'s ALPN protocol list is set from `registry.alpn_strings()` at construction time. This means:
- Only registered handlers' ALPNs are advertised during TLS negotiation.
- If none of the ALPNs a client offers is in the list, the TLS handshake fails, which is the intended behavior.
- Adding a handler at runtime requires rebuilding the `ServerConfig` (see OQ-04).

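To make the negotiation outcome concrete, here is a small model of ALPN selection. It is only a sketch of the rule, assuming server-preference ordering; the real selection happens inside the TLS library, not in alknet-core. The function name `select_alpn` and the ALPN strings in the usage note are hypothetical.

```rust
/// Return the first ALPN in the server's list that the client also offered,
/// or `None` when there is no overlap (the handshake would then fail).
pub fn select_alpn<'a>(
    server: &'a [Vec<u8>],
    client_offered: &[Vec<u8>],
) -> Option<&'a [u8]> {
    server
        .iter()
        .find(|proto| client_offered.contains(*proto))
        .map(|proto| proto.as_slice())
}
```

For example, a server list of `alknet/ssh`, `alknet/call` against a client offering `alknet/call`, `alknet/ssh` selects `alknet/ssh` under this model, while a client offering only `h3` gets no match.
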
## Graceful Shutdown

```rust
impl AlknetEndpoint {
    pub fn shutdown_sender(&self) -> watch::Sender<bool>;
    pub async fn shutdown(&self) -> Result<(), EndpointError>;
}
```

- `shutdown_sender()` returns a clone of the shutdown channel sender. Call `send(true)` to signal shutdown.
- `shutdown()` waits for in-flight connections to complete, with a drain timeout (default: 2 seconds). After the timeout, remaining connections are forcefully closed.
- SIGTERM/SIGINT are wired to the shutdown channel by the CLI binary.

The drain timeout is configurable via `StaticConfig::drain_timeout`.

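The drain step can be pictured as "wait until the in-flight count reaches zero or the timeout expires, then report what is left". The sketch below models that with standard-library threads and a condition variable; it is an assumption-laden stand-in, since the real endpoint would presumably use tokio primitives. `InFlight`, `enter`, `leave`, and `drain` are hypothetical names.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Counts connections that are still being handled.
#[derive(Default)]
pub struct InFlight {
    count: Mutex<usize>,
    idle: Condvar,
}

impl InFlight {
    /// Called when a handler task starts.
    pub fn enter(&self) {
        *self.count.lock().unwrap() += 1;
    }

    /// Called when a handler task finishes; wakes the drainer at zero.
    pub fn leave(&self) {
        let mut count = self.count.lock().unwrap();
        *count -= 1;
        if *count == 0 {
            self.idle.notify_all();
        }
    }

    /// Block until no connections remain or `timeout` elapses.
    /// Returns how many connections are still open and would be force-closed.
    pub fn drain(&self, timeout: Duration) -> usize {
        let guard = self.count.lock().unwrap();
        let (guard, _timed_out) = self
            .idle
            .wait_timeout_while(guard, timeout, |count| *count > 0)
            .unwrap();
        *guard
    }
}
```

A return value of zero corresponds to a clean drain; anything else is the set of connections the endpoint would close forcefully after `drain_timeout`.
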
## Error Handling

### EndpointError

Fatal errors that prevent the endpoint from starting or continuing.

```rust
pub enum EndpointError {
    BindFailed(io::Error),
    TlsConfig(io::Error),
    HandlerNotFound(Vec<u8>), // ALPN string with no registered handler
}
```

### HandlerError

Non-fatal errors within a handler. See [core-types.md](core-types.md) for details.

### Accept loop errors

- **TLS handshake failure**: Log and continue. The client may have offered no compatible ALPN, or the cert may be untrusted by the client.
- **Handler panic**: Caught by tokio's task isolation. The connection is dropped. Other connections continue.
- **Connection-level errors** (quinn `ConnectionError`): Log and continue. The accept loop keeps running.

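Panic isolation works because a panic unwinds only the task it occurs in, and the spawner observes it as an error on the join handle (in tokio, a `JoinError`). The sketch below shows the same idea with `std::thread` as a stand-in for tokio tasks; `run_isolated` is a hypothetical helper, not part of alknet-core.

```rust
use std::thread;

/// Run `handler` on its own thread and turn a panic into an `Err` carrying
/// the panic message, leaving the caller (the "accept loop") unaffected.
pub fn run_isolated<F>(handler: F) -> Result<(), String>
where
    F: FnOnce() + Send + 'static,
{
    thread::spawn(handler).join().map_err(|payload| {
        // Panic payloads are usually a &str or a String.
        payload
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .or_else(|| payload.downcast_ref::<String>().cloned())
            .unwrap_or_else(|| "unknown panic".to_string())
    })
}
```

The caller decides what to do with the `Err`: in the model described above, it would log the message and let the connection drop while other connections keep running.
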
## TLS Certificate Provisioning

`StaticConfig` provides TLS configuration via file paths:

- **Manual**: `tls_cert` and `tls_key` file paths. Required for production use.
- **Self-signed**: For development. The endpoint can generate a self-signed cert on startup.

The `rustls::ServerConfig` is built from cert + key + ALPN list at startup.

ACME auto-provisioning (Let's Encrypt) is not in scope for v1. It will be added as a feature later (see OQ-12).

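One way to picture the two provisioning modes is as a small resolution step over the configured paths. This is a sketch under assumptions: `tls_cert` and `tls_key` come from the text above, while `TlsSource`, `resolve_tls`, and the `allow_self_signed` flag are hypothetical and not defined by the spec.

```rust
use std::path::PathBuf;

/// Where the endpoint's certificate comes from.
#[derive(Debug, PartialEq)]
pub enum TlsSource {
    /// Manual mode: cert and key loaded from disk.
    Files { cert: PathBuf, key: PathBuf },
    /// Development mode: generate a self-signed cert at startup.
    SelfSigned,
}

/// Decide the TLS source from the two optional paths.
/// `allow_self_signed` is an assumed switch for development builds.
pub fn resolve_tls(
    tls_cert: Option<PathBuf>,
    tls_key: Option<PathBuf>,
    allow_self_signed: bool,
) -> Result<TlsSource, String> {
    match (tls_cert, tls_key) {
        (Some(cert), Some(key)) => Ok(TlsSource::Files { cert, key }),
        (None, None) if allow_self_signed => Ok(TlsSource::SelfSigned),
        (None, None) => Err("tls_cert and tls_key are required".to_string()),
        _ => Err("tls_cert and tls_key must be set together".to_string()),
    }
}
```

Rejecting a half-configured pair up front keeps the failure at startup rather than at the first handshake.
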
## Key Differences from Reference Implementation

| Aspect | Reference (`alknet-main`) | New Model |
|--------|---------------------------|-----------|
| Transport | `TransportAcceptor` trait, `TransportKind` enum | `quinn::Endpoint` directly |
| Listener config | `ListenerConfig` enum (Stream/Http/Dns) | Single endpoint, ALPN dispatch |
| Protocol detection | Byte-peeking (`stealth::detect_protocol`) | ALPN negotiation (TLS layer) |
| Accept loop | Per-transport, SSH-centric | ALPN-agnostic, handler-dispatched |
| Handler model | `ServerHandler` + `russh::server::Handler` | `ProtocolHandler::handle(Connection, &AuthContext)` |
| Config | `ServeOptions` builder | `StaticConfig` + `HandlerRegistry` + `AlknetEndpoint::new()` |

## Design Decisions

| Decision | ADR | Summary |
|----------|-----|---------|
| Static handler registration | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Two-way door, start static, add ArcSwap later |
| quinn::Endpoint directly, no TransportAcceptor | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | Start with quinn, abstract later if needed |
| No byte-peeking, ALPN dispatch only | [ADR-001](../../decisions/001-alpn-protocol-dispatch.md) | TLS layer handles protocol detection |
| Handler panics isolated | [ADR-010](../../decisions/010-alpn-router-and-endpoint.md) | tokio task isolation, connection closes |

## Open Questions

See [open-questions.md](../../open-questions.md) for full details.

- **OQ-04**: Resolved. HandlerRegistry is static at startup.
- **OQ-05**: Open. Start with quinn, abstract later if needed.
- **OQ-12**: Resolved. Start with file paths in StaticConfig, add ACME later.