---
id: core/endpoint
name: Implement AlknetEndpoint, HandlerRegistry, accept loops (quinn + iroh), TLS identity, and graceful shutdown
status: completed
depends_on: [core/core-types, core/config, core/auth]
scope: broad
risk: high
impact: component
level: implementation
---

## Description

Implement the ALPN router and endpoint in `src/endpoint.rs`. This is the
integration point of alknet-core: it ties together the core types, config,
and auth into the central runtime that accepts connections and dispatches to
handlers by ALPN string.

### AlknetEndpoint

```rust
use std::sync::Arc;

use arc_swap::ArcSwap;
use tokio::sync::watch;

pub struct AlknetEndpoint {
    quinn: Option<quinn::Endpoint>,
    iroh: Option<iroh::Endpoint>,
    /// ALPN router shared by both accept loops.
    handlers: Arc<HandlerRegistry>,
    dynamic: Arc<ArcSwap<DynamicConfig>>,
    identity_provider: Arc<dyn IdentityProvider>,
    /// Observed by the accept loops to detect shutdown.
    shutdown: watch::Receiver<bool>,
}
```

Manages one or more QUIC connection sources, each feeding into the same ALPN
router. Both quinn and iroh are optional (feature-gated), and both can be active
simultaneously (ADR-010).

### HandlerRegistry

```rust
use std::collections::HashMap;
use std::sync::Arc;

pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
}

// Signatures only; bodies are omitted in this spec.
impl HandlerRegistry {
    pub fn new() -> Self;
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>);
    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>>;
    pub fn alpn_strings(&self) -> Vec<Vec<u8>>;
}
```

- `register()`: insert a handler. Panics if the ALPN is already registered.
- `get()`: look up a handler by ALPN string.
- `alpn_strings()`: all registered ALPNs, used to build the TLS ServerConfig
  (quinn) and the ALPN list (iroh).
- Registration is **static at startup** (OQ-04, ADR-010). The CLI builds the
  registry, inserts all handlers, and passes it to `AlknetEndpoint::new()`.

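The registry contract above can be sketched with the standard library alone. This is a minimal illustration, not the merged implementation: the `alpn()` accessor on `ProtocolHandler`, the `EchoHandler` type, and the `example/echo/1` ALPN are all assumptions made so the example is self-contained.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for the real trait: only an ALPN accessor is modeled.
pub trait ProtocolHandler: Send + Sync {
    fn alpn(&self) -> &'static [u8];
}

#[derive(Default)]
pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
}

impl HandlerRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Inserts a handler under its ALPN; panics if that ALPN is already taken.
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>) {
        let alpn = handler.alpn();
        assert!(
            !self.handlers.contains_key(alpn),
            "ALPN {:?} already registered",
            String::from_utf8_lossy(alpn)
        );
        self.handlers.insert(alpn, handler);
    }

    /// Looks up the handler registered for `alpn`, if any.
    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>> {
        self.handlers.get(alpn)
    }

    /// Returns owned copies of every registered ALPN (order is unspecified).
    pub fn alpn_strings(&self) -> Vec<Vec<u8>> {
        self.handlers.keys().map(|alpn| alpn.to_vec()).collect()
    }
}

/// Example handler used only for illustration.
pub struct EchoHandler;

impl ProtocolHandler for EchoHandler {
    fn alpn(&self) -> &'static [u8] {
        b"example/echo/1"
    }
}
```

Because the map is keyed by `&'static [u8]`, lookups with a borrowed `&[u8]` taken from a live connection need no allocation. The duplicate check runs before the insert, so a panic leaves the previously registered handler in place.
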
### Accept loops

Each active connection source runs its own accept loop. Both dispatch through
the same `HandlerRegistry`.

**Quinn accept loop** (public QUIC+TLS):
```
loop {
    tokio::select! {
        incoming = quinn_endpoint.accept() => {
            // `accept()` yields `None` once the endpoint is closed.
            let Some(incoming) = incoming else { break };
            let connection = incoming.await;
            match connection {
                Ok(conn) => dispatch(conn),
                Err(e) => { /* log TLS handshake failure, continue */ }
            }
        }
        _ = shutdown.changed() => break,
    }
}
```

**iroh accept loop** (P2P relay-assisted):
```
loop {
    tokio::select! {
        incoming = iroh_endpoint.accept() => {
            // `accept()` yields `None` once the endpoint is closed.
            let Some(incoming) = incoming else { break };
            let accepting = incoming.accept();
            let alpn = accepting.alpn().await;
            match alpn {
                Ok(alpn) => dispatch(alpn, accepting),
                Err(e) => { /* log handshake failure, continue */ }
            }
        }
        _ = shutdown.changed() => break,
    }
}
```

Use `iroh::Endpoint` directly (not iroh's `Router`) because our HandlerRegistry
is shared between quinn and iroh, and our AuthContext construction differs per
source. See iroh's `protocol.rs` for the reference pattern.

### Dispatch function (shared)

```
fn dispatch(connection) {
    let alpn = connection.alpn();
    match handlers.get(alpn) {
        Some(handler) => {
            // Clone the Arc so the spawned task owns its handler.
            let handler = Arc::clone(handler);
            let auth = AuthContext::from_connection(&connection);
            let conn = Connection::from_quinn(connection); // or from_iroh
            tokio::spawn(async move {
                if let Err(e) = handler.handle(conn, &auth).await {
                    // log error, connection closes
                }
            });
        }
        // Error code 0 with a human-readable reason.
        None => connection.close(0u32.into(), b"no handler"),
    }
}
```

### AuthContext construction

The endpoint constructs `AuthContext` from the QUIC connection:
1. `alpn`: from `connection.alpn()`, always present
2. `remote_addr`: from `connection.remote_addr()`, may be `None` for iroh
3. `tls_client_fingerprint`: extracted from the TLS session's client cert, if presented
4. `identity`: if a fingerprint is available, call `IdentityProvider::resolve_from_fingerprint()`.
   If it resolves, `identity = Some(resolved)`. If not, `identity = None`.

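The four steps above can be sketched as a plain constructor. The field types, the `Identity` newtype, the `build` function, and the signature of `resolve_from_fingerprint` are assumptions for illustration; the values are passed in already extracted, so no quinn or iroh types are needed.

```rust
use std::net::SocketAddr;

/// Hypothetical resolved identity; the real type belongs to the auth task.
#[derive(Debug, Clone, PartialEq)]
pub struct Identity(pub String);

/// Only the method named in the text is modeled; its signature is assumed.
pub trait IdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
}

#[derive(Debug)]
pub struct AuthContext {
    pub alpn: Vec<u8>,
    pub remote_addr: Option<SocketAddr>,
    pub tls_client_fingerprint: Option<String>,
    pub identity: Option<Identity>,
}

impl AuthContext {
    /// Builds the context from values already read off the connection.
    pub fn build(
        alpn: Vec<u8>,
        remote_addr: Option<SocketAddr>,
        tls_client_fingerprint: Option<String>,
        provider: &dyn IdentityProvider,
    ) -> Self {
        // Step 4: resolve only when a client cert fingerprint was presented.
        let identity = tls_client_fingerprint
            .as_deref()
            .and_then(|fp| provider.resolve_from_fingerprint(fp));
        Self { alpn, remote_addr, tls_client_fingerprint, identity }
    }
}

/// Toy provider that knows a single fingerprint.
pub struct SingleUser;

impl IdentityProvider for SingleUser {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        (fingerprint == "SHA256:known").then(|| Identity("alice".to_string()))
    }
}
```

An unknown fingerprint and a missing client cert both end with `identity = None`; the handler can still tell them apart by checking `tls_client_fingerprint`.
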
### TLS Identity

Three modes per `TlsIdentity` (OQ-12):

**RawKey (RFC 7250, default for P2P)**:
- Build `rustls::ServerConfig` with a cert resolver whose `only_raw_public_keys()` returns `true`
- `ResolvesServerCert` generates the cert on-the-fly from the Ed25519 key
- ~100 lines; see `iroh/iroh/src/tls/resolver.rs` for the reference pattern
- Works natively with SSH auth and git; browsers do NOT support RFC 7250

**X509 (domain-hosted)**:
- Load cert/key from file paths
- Standard `rustls::ServerConfig`
- For browser/WebTransport clients and public domain services

**SelfSigned (dev only)**:
- Generate a self-signed cert on startup
- External clients will not trust it

**ACME (future, not in this task)**:
- The reverse-proxy project demonstrates the complete ACME pattern. It will be
  adapted as an additional `TlsIdentity` variant or `ResolvesServerCert` impl.
  For now, X509 with manual certs is the domain path. Note this as a TODO.

The quinn endpoint's `rustls::ServerConfig` ALPN list is set from
`registry.alpn_strings()` at construction time. The iroh endpoint's ALPN list
is similarly derived. Both advertise the same set of ALPNs.

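As a rough sketch of how the three modes might be modeled, the enum below mirrors the list above. The variant payloads (`secret_key`, `cert_path`, `key_path`) and both helper methods are assumptions for illustration; the real `TlsIdentity` comes from the config task and may differ.

```rust
use std::path::PathBuf;

/// Assumed shape of `TlsIdentity`; variant payloads are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub enum TlsIdentity {
    /// RFC 7250 raw public key derived from an Ed25519 secret key.
    RawKey { secret_key: [u8; 32] },
    /// Certificate chain and private key loaded from disk.
    X509 { cert_path: PathBuf, key_path: PathBuf },
    /// Self-signed certificate generated at startup (development only).
    SelfSigned,
}

impl TlsIdentity {
    /// Short label suitable for startup logs.
    pub fn mode_name(&self) -> &'static str {
        match self {
            TlsIdentity::RawKey { .. } => "raw-key",
            TlsIdentity::X509 { .. } => "x509",
            TlsIdentity::SelfSigned => "self-signed",
        }
    }

    /// Only the X509 mode is expected to be accepted by browser clients:
    /// browsers do not support RFC 7250, and self-signed certs are untrusted.
    pub fn browser_compatible(&self) -> bool {
        matches!(self, TlsIdentity::X509 { .. })
    }
}
```

An ACME mode could later be added as a fourth variant without touching existing call sites, provided they match exhaustively and the compiler flags the new case.
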
### Graceful shutdown

```rust
impl AlknetEndpoint {
    pub fn shutdown_sender(&self) -> watch::Sender<bool>;
    pub async fn shutdown(&self) -> Result<(), EndpointError>;
}
```

- `shutdown_sender()`: clone of the shutdown channel sender. `send(true)` signals shutdown.
- `shutdown()`: signals all accept loops to stop, waits for in-flight connections
  with a drain timeout (default 2s, from StaticConfig), then forcefully closes any that remain.
- SIGTERM/SIGINT are wired to the shutdown channel by the CLI binary (not core's concern).

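The drain-then-force-close step can be illustrated with a standard-library analog. This is a sketch under stated assumptions, not the endpoint's code: the real implementation is async and driven by the `watch` channel, whereas here an `AtomicUsize` in-flight counter, OS threads, and the names `drain`, `DrainOutcome`, and `simulate_connection` are all hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Outcome of a drain attempt.
#[derive(Debug, PartialEq)]
pub enum DrainOutcome {
    /// Every in-flight connection finished before the deadline.
    Clean,
    /// The timeout expired; this many connections must be force-closed.
    Forced(usize),
}

/// Polls the in-flight counter until it reaches zero or `timeout` elapses.
pub fn drain(in_flight: &AtomicUsize, timeout: Duration) -> DrainOutcome {
    let deadline = Instant::now() + timeout;
    loop {
        let remaining = in_flight.load(Ordering::Acquire);
        if remaining == 0 {
            return DrainOutcome::Clean;
        }
        if Instant::now() >= deadline {
            return DrainOutcome::Forced(remaining);
        }
        thread::sleep(Duration::from_millis(5));
    }
}

/// Simulates one connection that works for `work`, then checks out.
pub fn simulate_connection(
    in_flight: Arc<AtomicUsize>,
    work: Duration,
) -> thread::JoinHandle<()> {
    // Count the connection before spawning so `drain` cannot miss it.
    in_flight.fetch_add(1, Ordering::AcqRel);
    thread::spawn(move || {
        thread::sleep(work);
        in_flight.fetch_sub(1, Ordering::AcqRel);
    })
}
```

A connection that finishes inside the timeout yields `DrainOutcome::Clean`; one that outlives it yields `DrainOutcome::Forced(n)`, the point at which the endpoint would close the remaining connections.
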
### EndpointError

```rust
use std::io;

pub enum EndpointError {
    BindFailed(io::Error),
    TlsConfig(io::Error),
    HandlerNotFound(Vec<u8>),
}
```

Fatal errors that prevent the endpoint from starting or continuing.

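For callers that want to log or propagate these errors, one possible approach is to implement `Display` and `std::error::Error` by hand. The message wording below is an assumption, and the merged code may derive these impls differently.

```rust
use std::error::Error;
use std::fmt;
use std::io;

#[derive(Debug)]
pub enum EndpointError {
    BindFailed(io::Error),
    TlsConfig(io::Error),
    HandlerNotFound(Vec<u8>),
}

impl fmt::Display for EndpointError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::BindFailed(e) => write!(f, "failed to bind endpoint: {e}"),
            Self::TlsConfig(e) => write!(f, "invalid TLS configuration: {e}"),
            // ALPNs are bytes; render them lossily so logs stay readable.
            Self::HandlerNotFound(alpn) => {
                write!(f, "no handler for ALPN {:?}", String::from_utf8_lossy(alpn))
            }
        }
    }
}

impl Error for EndpointError {
    /// Exposes the underlying I/O error so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::BindFailed(e) | Self::TlsConfig(e) => Some(e),
            Self::HandlerNotFound(_) => None,
        }
    }
}
```

Implementing `source()` keeps the original `io::Error` reachable, which helps when the CLI prints a full error chain on a failed startup.
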
### Accept loop error handling

- **TLS handshake failure**: log and continue. The client may have offered no
  compatible ALPN, or its cert may be untrusted.
- **Handler panic**: caught by tokio's task isolation. The connection is dropped;
  others continue.
- **Connection-level errors** (quinn/iroh ConnectionError): log and continue.
  The accept loop keeps running.

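The panic-isolation property can be demonstrated with a standard-library analog. With tokio, a panic inside a spawned task is reported as an `Err` from its `JoinHandle` rather than unwinding the spawner. The sketch below shows the same behaviour with OS threads; `run_isolated` is a hypothetical helper, not part of the endpoint.

```rust
use std::thread;

/// Runs each handler on its own thread and reports which ones completed.
/// A panic surfaces as `Err` from `join()` instead of unwinding the caller.
pub fn run_isolated(handlers: Vec<Box<dyn FnOnce() + Send + 'static>>) -> Vec<bool> {
    // Spawn everything first so handlers run concurrently.
    let joins: Vec<thread::JoinHandle<()>> = handlers
        .into_iter()
        .map(|handler| thread::spawn(handler))
        .collect();

    // `true` means the handler returned normally, `false` means it panicked.
    joins.into_iter().map(|join| join.join().is_ok()).collect()
}
```

The caller survives a panicking handler and the remaining handlers still run to completion, which is the property the accept loop relies on.
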
### What the accept loops do NOT do

- No byte-peeking (ALPN handles protocol detection)
- No per-handler accept loops (ALPN unifies)
- No SSH-specific logic (accept loop is ALPN-agnostic)

### TCP is NOT an endpoint concern

Bare TCP (SSH over port 22) does not use QUIC or ALPN. TCP access is handled by
individual handlers (the SSH handler can listen on TCP independently). This is
handler-specific, not core endpoint.

## Acceptance Criteria

- [ ] `AlknetEndpoint` struct with quinn/iroh (both Option, both feature-gated)
- [ ] `HandlerRegistry` with new/register/get/alpn_strings
- [ ] `register()` panics on duplicate ALPN
- [ ] Quinn accept loop runs, dispatches by ALPN, respects shutdown
- [ ] iroh accept loop runs, dispatches by ALPN, respects shutdown
- [ ] Dispatch function spawns handler task via `tokio::spawn`
- [ ] AuthContext constructed from connection (alpn, remote_addr, fingerprint, identity)
- [ ] TLS RawKey mode: rustls ServerConfig with `only_raw_public_keys()`, on-the-fly cert
- [ ] TLS X509 mode: load cert/key from files, standard ServerConfig
- [ ] TLS SelfSigned mode: generate self-signed cert on startup
- [ ] ALPN list in TLS ServerConfig set from `registry.alpn_strings()`
- [ ] Graceful shutdown: signal accept loops to stop, drain timeout, force close
- [ ] `EndpointError` enum with all variants
- [ ] Accept loop errors logged, loop continues (no crash on handshake failure)
- [ ] Handler panics caught by tokio task isolation (connection dropped, others continue)
- [ ] No byte-peeking, no per-handler accept loops, no SSH-specific logic
- [ ] Unit test: HandlerRegistry register/get/alpn_strings
- [ ] Unit test: HandlerRegistry register panics on duplicate ALPN
- [ ] Integration test: endpoint with mock handler, verify dispatch by ALPN
- [ ] `cargo test -p alknet-core` succeeds
- [ ] `cargo clippy -p alknet-core` succeeds with no warnings

## References

- docs/architecture/crates/core/endpoint.md: full endpoint spec
- docs/architecture/decisions/001-alpn-protocol-dispatch.md: ADR-001
- docs/architecture/decisions/010-alpn-router-and-endpoint.md: ADR-010
- docs/architecture/decisions/006-alpn-convention-and-connection-model.md: ADR-006
- docs/architecture/decisions/007-bistream-type-definition.md: ADR-007
- iroh reference: `/workspace/iroh/iroh/src/protocol.rs` (accept loop pattern)
- iroh reference: `/workspace/iroh/iroh/src/tls/resolver.rs` (RFC 7250 raw key)

## Notes

> This is the integration point of alknet-core: it ties together types, config,
> and auth. It is the highest-risk task in core because it involves QUIC connection
> handling, TLS identity (3 modes), and graceful shutdown. The RFC 7250 raw key
> path is ~100 lines (iroh has a reference implementation). ACME is deferred:
> note it as a TODO and use X509 manual certs for the domain path for now. TCP is NOT
> an endpoint concern; it is handler-specific.

## Summary

Implemented `AlknetEndpoint` with quinn+iroh accept loops (both feature-gated,
both `Option`), `HandlerRegistry` (new/register/get/alpn_strings with
panic-on-duplicate), dispatch via `tokio::spawn` by ALPN, `AuthContext`
construction from connection (alpn/remote_addr/fingerprint/identity), TLS
identity modes (RawKey RFC 7250 via on-the-fly cert resolver, X509 from files,
SelfSigned via rcgen), `EndpointError` enum, graceful shutdown with drain
timeout + force close. ACME deferred as TODO per task spec. 55 tests
(--all-features), 52 (default), 47 (no-default); clippy clean across all 3
feature combos. Merged to develop.

Note: The agent initially worked in the main worktree (incomplete/broken code
was stashed and dropped). The final work was properly pushed to the feat branch
and merged from there.