tasks: decompose vault, core, call crates into 28 atomic implementation tasks

Break down the three initial crates (alknet-vault, alknet-core, alknet-call)
into dependency-ordered task files for implementation agents.

Structure:
- tasks/vault/ (10 tasks) — drift fixes from ADR-025/026 refactor, review,
  spec sync. Vault is independent and can run fully in parallel with core/call.
- tasks/core/ (6 tasks) — crate init, core types, config, auth, endpoint,
  review. Core is foundational; call depends on it.
- tasks/call/ (12 tasks) — split into registry/ and protocol/ topic subdirs
  reflecting the two subsystems. CallAdapter is the merge point.

Key decisions:
- Drifts 3+9+10 grouped as one task (key-versioning-rotation) — the complete
  ADR-021 rotation feature that doesn't compile in pieces
- Reviews injected at end of each crate phase (vault, core, call)
- Vault spec-sync task removes the drift table and bumps doc status to stable
- ACME deferred in core/endpoint (noted as TODO; X509 manual certs for now)
- OperationEnv kept as a trait (load-bearing for ADR-024 layering)

Validated: 28 tasks, no cycles, 11 generations of parallel work.
Critical path runs through call (11 tasks). Vault completes by generation 4.
6 high-risk tasks identified (21%): irpc-removal, endpoint, operation-context,
operation-env, call-adapter, abort-cascade.
2026-06-23 12:41:47 +00:00
parent 2e34590522
commit 098fd8b9b9
28 changed files with 4271 additions and 0 deletions

tasks/core/auth.md (normal file, 162 lines added)

@@ -0,0 +1,162 @@
---
id: core/auth
name: Implement AuthContext, Identity, AuthToken, IdentityProvider trait, and ConfigIdentityProvider
status: pending
depends_on: [core/core-types]
scope: moderate
risk: medium
impact: component
level: implementation
---
## Description
Implement the authentication types in `src/auth.rs`. Auth is hybrid: the
endpoint resolves what it can (TLS-level), handlers resolve what they need
(protocol-level). AuthContext may be partial — handlers complete auth inside
`handle()`.
### AuthContext
```rust
#[derive(Clone)]
pub struct AuthContext {
    pub identity: Option<Identity>,
    pub alpn: Vec<u8>,
    pub remote_addr: Option<SocketAddr>,
    pub tls_client_fingerprint: Option<String>,
}
```
Created by the endpoint for each incoming connection. Passed to
`ProtocolHandler::handle()` as an immutable reference.
- `identity`: peer's authenticated identity, if resolved by the endpoint. None
means the endpoint has no identity info for this connection.
- `alpn`: negotiated ALPN — always present after TLS handshake.
- `remote_addr`: peer's address, if available (may be None for iroh).
- `tls_client_fingerprint`: SHA-256 fingerprint of TLS client cert, if presented.
`AuthContext` is `Clone` (handlers clone for per-stream contexts) and immutable
in `handle()` (handlers create local variables for resolved identity, they
don't mutate the shared context).
### Identity
```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}
```
The authenticated peer identity. `id` is ALPN-agnostic:
- SSH key auth: `"SHA256:abc123..."` (key fingerprint)
- API key auth: `"alk_test"` (key prefix)
- Certificate auth: `"username"` (principal name)
### AuthToken
```rust
#[derive(Debug, Clone)]
pub struct AuthToken {
    pub raw: Vec<u8>,
}
```
Opaque authentication token carried in protocol frames. The handler that
extracted it knows its encoding.
### IdentityProvider trait
```rust
pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}
```
- `resolve_from_fingerprint()`: used by endpoint (TLS client cert) and SSH (key fingerprint)
- `resolve_from_token()`: used by call protocol (AuthToken in first frame) and HTTP (Bearer header)
- Both return `Option<Identity>` — None means credential not recognized
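The trait contract above can be exercised with a small in-memory provider. The following is only a sketch for tests: `StaticIdentityProvider` and the `identity` helper are hypothetical names that do not appear in the spec, and the types are re-declared locally so the block stands alone.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

#[derive(Debug, Clone)]
pub struct AuthToken {
    pub raw: Vec<u8>,
}

pub trait IdentityProvider: Send + Sync + 'static {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity>;
}

/// Hypothetical test double: fixed lookup tables, no config and no hashing.
#[derive(Default)]
pub struct StaticIdentityProvider {
    by_fingerprint: HashMap<String, Identity>,
    by_token: HashMap<Vec<u8>, Identity>,
}

impl StaticIdentityProvider {
    pub fn with_fingerprint(mut self, fingerprint: &str, identity: Identity) -> Self {
        self.by_fingerprint.insert(fingerprint.to_string(), identity);
        self
    }

    pub fn with_token(mut self, raw: &[u8], identity: Identity) -> Self {
        self.by_token.insert(raw.to_vec(), identity);
        self
    }
}

impl IdentityProvider for StaticIdentityProvider {
    // None means "credential not recognized", matching the trait contract.
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        self.by_fingerprint.get(fingerprint).cloned()
    }

    fn resolve_from_token(&self, token: &AuthToken) -> Option<Identity> {
        self.by_token.get(&token.raw).cloned()
    }
}

/// Hypothetical helper for building identities with no resources.
pub fn identity(id: &str, scopes: &[&str]) -> Identity {
    Identity {
        id: id.to_string(),
        scopes: scopes.iter().map(|s| s.to_string()).collect(),
        resources: HashMap::new(),
    }
}
```

A handler under test can take `&dyn IdentityProvider` and receive this double instead of `ConfigIdentityProvider`.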
### ConfigIdentityProvider
```rust
pub struct ConfigIdentityProvider {
    dynamic: Arc<ArcSwap<DynamicConfig>>,
}
```
The default implementation. Resolves identities from `DynamicConfig` (reads
from ArcSwap on every call — hot-reloadable).
Resolution logic:
- **Fingerprint**: look up in `DynamicConfig::auth::authorized_fingerprints`.
If found, return `Identity { id: fingerprint, scopes: ["relay:connect"], resources: {} }`.
- **Token**: parse as UTF-8. If starts with `alk_`, look up in
`DynamicConfig::auth::api_keys` by prefix match + SHA-256 hash. If found and
not expired, return `Identity { id: prefix, scopes: entry.scopes, resources: entry.resources }`.
Changes to DynamicConfig via ConfigReloadHandle are reflected immediately.
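The token branch of the resolution logic could look roughly like the sketch below. Several things here are assumptions rather than spec: the prefix is taken to be the first 8 characters including `alk_`, `expires_at` is treated as Unix seconds, the hash function is passed in as a parameter (a real implementation would use SHA-256 from a crypto crate), and `toy_hash` is a deliberately non-cryptographic stand-in so the block needs only the standard library. `ApiKeyEntry` as listed in the config task has no `resources` field, so the sketch returns an empty map.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
    pub scopes: Vec<String>,
    pub resources: HashMap<String, Vec<String>>,
}

/// Simplified ApiKeyEntry (description omitted for brevity).
pub struct ApiKeyEntry {
    pub prefix: String,
    pub hash: String,
    pub scopes: Vec<String>,
    pub expires_at: Option<u64>, // assumed: Unix seconds
}

/// NOT cryptographic. Hex of the reversed bytes, used only so the sketch
/// runs without external crates. Production code would hash with SHA-256.
pub fn toy_hash(bytes: &[u8]) -> String {
    bytes.iter().rev().map(|b| format!("{:02x}", b)).collect()
}

pub fn resolve_token(
    raw: &[u8],
    api_keys: &[ApiKeyEntry],
    now_unix_secs: u64,
    hash_hex: impl Fn(&[u8]) -> String,
) -> Option<Identity> {
    // Step 1: the token must be valid UTF-8 and carry the alk_ marker.
    let token = std::str::from_utf8(raw).ok()?;
    if !token.starts_with("alk_") {
        return None;
    }
    // Step 2: prefix lookup. `get(..8)` is None for short tokens.
    // A linear scan is used here; a map keyed by prefix would give O(1).
    let prefix = token.get(..8)?;
    let entry = api_keys.iter().find(|e| e.prefix == prefix)?;
    // Step 3: verify the hash of the full token.
    if hash_hex(raw) != entry.hash {
        return None;
    }
    // Step 4: reject expired entries.
    if let Some(expires_at) = entry.expires_at {
        if now_unix_secs >= expires_at {
            return None;
        }
    }
    Some(Identity {
        id: entry.prefix.clone(),
        scopes: entry.scopes.clone(),
        resources: HashMap::new(),
    })
}
```

Passing the clock and the hash function as arguments keeps the expiry and mismatch cases easy to unit test, which lines up with the token-resolution acceptance criteria.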
### Two Identity Scopes
There are two distinct identity scopes that must not be conflated:
| Scope | Where set | Where stored | Represents | Used for |
|-------|-----------|--------------|------------|----------|
| Connection-level | Handler in `handle()` | `Connection` (via `set_identity`) | Who opened the QUIC connection | Observability, logging |
| Per-request | CallAdapter per `call.requested` | `OperationContext.identity` | Who makes this specific call | ACL (ADR-015) |
The connection-level identity is stable (set once). The per-request identity
is dynamic (resolved per call, potentially different across requests). The
per-request identity takes precedence for ACL.
### Security constraints
- **Token entropy**: generated `alk_` tokens must have ≥128 bits of entropy.
The prefix (first 8 chars) is for O(1) lookup and is not secret; it appears
in logs by design. SHA-256 of the full token allows offline verification, so
anyone holding the stored hash can test guesses without contacting the server;
this is safe only if the full token is high-entropy.
- **Config reload must be authenticated**: a reload that adds an authorized
fingerprint or API key grants access immediately. The reload trigger must be
local-only or admin-scoped.
- **Connection-level identity is for observability only**: per-request identity
takes precedence for ACL.
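The 128-bit requirement translates into a minimum token length once an alphabet is chosen. The arithmetic below is a sketch under the assumption that each symbol is drawn uniformly and independently; the function names are hypothetical. Whether the logged 8-character prefix should be excluded from the count is not stated in the spec, so a cautious implementation might apply the bound to the remaining symbols only.

```rust
/// Entropy in bits of `len` symbols drawn uniformly and independently
/// from an alphabet of `alphabet_size` symbols: len * log2(alphabet_size).
pub fn entropy_bits(len: usize, alphabet_size: u32) -> f64 {
    assert!(alphabet_size >= 2, "alphabet must have at least two symbols");
    len as f64 * f64::from(alphabet_size).log2()
}

/// Smallest symbol count that reaches `target_bits` for the given alphabet.
pub fn min_len_for_bits(target_bits: f64, alphabet_size: u32) -> usize {
    assert!(alphabet_size >= 2, "alphabet must have at least two symbols");
    (target_bits / f64::from(alphabet_size).log2()).ceil() as usize
}
```

For a base62 alphabet each symbol carries about 5.95 bits, so 21 symbols fall just short of 128 bits and 22 symbols clear it.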
## Acceptance Criteria
- [ ] `AuthContext` struct with all 4 fields, derives `Clone`
- [ ] `Identity` struct with `id`, `scopes`, `resources`, derives `Clone`, `PartialEq`
- [ ] `AuthToken` struct with `raw` field, derives `Clone`
- [ ] `IdentityProvider` trait with both methods
- [ ] `ConfigIdentityProvider` struct holding `Arc<ArcSwap<DynamicConfig>>`
- [ ] `ConfigIdentityProvider::resolve_from_fingerprint` looks up in authorized_fingerprints
- [ ] `ConfigIdentityProvider::resolve_from_token` parses `alk_` prefix, matches by hash, checks expiry
- [ ] ConfigIdentityProvider reads from ArcSwap on every call (hot-reloadable)
- [ ] Unit test: fingerprint resolution (known fingerprint → Some, unknown → None)
- [ ] Unit test: token resolution (valid non-expired → Some, expired → None, unknown → None)
- [ ] Unit test: config reload changes resolution results immediately
- [ ] `cargo test -p alknet-core` succeeds
- [ ] `cargo clippy -p alknet-core` succeeds with no warnings
## References
- docs/architecture/crates/core/auth.md — all type definitions, resolution flow
- docs/architecture/decisions/004-auth-as-shared-core.md — ADR-004
- docs/architecture/decisions/011-authcontext-structure.md — ADR-011
## Notes
> Auth is hybrid: endpoint resolves TLS-level, handler resolves protocol-level.
> AuthContext may be partial (identity = None). The two identity scopes
> (connection-level for observability, per-request for ACL) must not be
> conflated. ConfigIdentityProvider reads from ArcSwap on every call so config
> reloads take effect immediately.
## Summary
> To be filled on completion

tasks/core/config.md (normal file, 190 lines added)

@@ -0,0 +1,190 @@
---
id: core/config
name: Implement StaticConfig, DynamicConfig, AuthPolicy, ApiKeyEntry, ConfigReloadHandle, TlsIdentity
status: pending
depends_on: [core/core-types]
scope: moderate
risk: low
impact: component
level: implementation
---
## Description
Implement the configuration types in `src/config.rs`. These are the config
structures consumed by the endpoint and the CLI binary. StaticConfig is
immutable at startup; DynamicConfig is hot-reloadable via ArcSwap.
### StaticConfig
```rust
pub struct StaticConfig {
    pub listen_addr: Option<SocketAddr>,
    pub tls_identity: Option<TlsIdentity>,
    pub iroh_relay: Option<RelayUrl>,
    pub drain_timeout: Duration,
}
```
Immutable configuration resolved at startup. `listen_addr` is None for
iroh-only nodes. `tls_identity` is required if `listen_addr` is Some.
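The rule that `tls_identity` is required whenever `listen_addr` is set can be checked at startup. The sketch below uses simplified local types (no `RawKey` variant, no `iroh_relay` field) so it compiles with the standard library alone; `validate` and `ValidationError` are hypothetical names, and which `ConfigError` variant the real code would return is not specified.

```rust
use std::net::SocketAddr;
use std::path::PathBuf;
use std::time::Duration;

/// Simplified: the RawKey variant is omitted because it needs the iroh crate.
pub enum TlsIdentity {
    X509 { cert: PathBuf, key: PathBuf },
    SelfSigned,
}

/// Simplified StaticConfig without the iroh_relay field.
pub struct StaticConfig {
    pub listen_addr: Option<SocketAddr>,
    pub tls_identity: Option<TlsIdentity>,
    pub drain_timeout: Duration,
}

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
pub enum ValidationError {
    MissingTlsIdentity,
}

/// A listening node needs a TLS identity; an iroh-only node
/// (listen_addr = None) may leave it unset.
pub fn validate(cfg: &StaticConfig) -> Result<(), ValidationError> {
    if cfg.listen_addr.is_some() && cfg.tls_identity.is_none() {
        return Err(ValidationError::MissingTlsIdentity);
    }
    Ok(())
}
```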
### TlsIdentity
```rust
pub enum TlsIdentity {
    X509 { cert: PathBuf, key: PathBuf },
    RawKey(iroh::SecretKey),
    SelfSigned,
}
```
Three modes (OQ-12):
- `X509`: domain certificate for browser/WebTransport clients
- `RawKey`: RFC 7250 raw Ed25519 public key — default for P2P, no domain/CA
- `SelfSigned`: development only
`RawKey` uses `iroh::SecretKey` (Ed25519) — re-exported from iroh, which
alknet-core depends on (feature-gated). The key can be derived from
alknet-vault at the assembly layer or generated fresh.
### DynamicConfig
```rust
#[derive(Debug, Clone)]
pub struct DynamicConfig {
    pub auth: AuthPolicy,
    pub rate_limits: RateLimitConfig,
}
```
Runtime-reloadable via ArcSwap.
### AuthPolicy
```rust
pub struct AuthPolicy {
    pub authorized_fingerprints: HashSet<String>,
    pub api_keys: Vec<ApiKeyEntry>,
}
```
Fingerprints stored as strings (no russh dependency in core — ADR-003).
Certificate authority entries deferred to alknet-ssh (omitted from v1 to avoid
referencing an undefined type; adding back is additive).
### ApiKeyEntry
```rust
pub struct ApiKeyEntry {
    pub prefix: String,
    pub hash: String,
    pub scopes: Vec<String>,
    pub description: String,
    pub expires_at: Option<u64>,
}
```
Carries forward from reference implementation. Prefix (first 8 chars) for O(1)
lookup, SHA-256 hash for verification.
### RateLimitConfig
```rust
pub struct RateLimitConfig {
    pub max_connections_per_ip: usize,
    pub max_auth_attempts: usize,
}
```
### ArcSwap pattern
```rust
let dynamic = Arc::new(ArcSwap::new(Arc::new(DynamicConfig::default())));
```
- Reads: `dynamic.load()` returns a `Guard<Arc<DynamicConfig>>` (use `dynamic.load_full()` for an owned `Arc<DynamicConfig>`); both are lock-free
- Writes: `dynamic.store(Arc::new(new_config))` — atomic swap
- No locks: ArcSwap uses atomic operations
### ConfigReloadHandle
```rust
pub struct ConfigReloadHandle {
    dynamic: Arc<ArcSwap<DynamicConfig>>,
}
impl ConfigReloadHandle {
    pub fn reload(&self, new_config: DynamicConfig);
    pub fn dynamic(&self) -> Arc<DynamicConfig>;
}
```
- `reload()`: atomically replaces the dynamic config
- `dynamic()`: returns current config as `Arc<DynamicConfig>`
**Config reload is a privilege-escalation path.** A reload that adds an
authorized fingerprint or API key grants access immediately. The reload
trigger must be authenticated/local-only (SIGHUP, file watch, or admin call
protocol operation). The implementation must not ship a reload endpoint with
no auth "for convenience."
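The snapshot semantics of `reload()` and `dynamic()` can be illustrated without the `arc-swap` crate. The sketch below substitutes `RwLock<Arc<DynamicConfig>>` for `ArcSwap` purely so it runs on the standard library; it is not lock-free and is not the intended implementation. `DynamicConfig` is reduced to a single field, and `new` is a hypothetical constructor.

```rust
use std::sync::{Arc, RwLock};

/// Simplified stand-in for the real DynamicConfig.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct DynamicConfig {
    pub authorized_fingerprints: Vec<String>,
}

/// Stand-in: RwLock<Arc<T>> replaces ArcSwap<T> for illustration only.
pub struct ConfigReloadHandle {
    dynamic: Arc<RwLock<Arc<DynamicConfig>>>,
}

impl ConfigReloadHandle {
    pub fn new(initial: DynamicConfig) -> Self {
        Self { dynamic: Arc::new(RwLock::new(Arc::new(initial))) }
    }

    /// Replace the whole config in one step. Readers never see a
    /// half-updated value because they only ever hold a complete Arc.
    pub fn reload(&self, new_config: DynamicConfig) {
        let mut slot = self.dynamic.write().unwrap();
        *slot = Arc::new(new_config);
    }

    /// Return the current config as an owned snapshot.
    pub fn dynamic(&self) -> Arc<DynamicConfig> {
        let guard = self.dynamic.read().unwrap();
        Arc::clone(&*guard)
    }
}
```

A snapshot taken before a reload keeps pointing at the old config, while any call made afterwards sees the new one. That is the behaviour the "reload swaps config atomically" unit test would check.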
### ConfigError
```rust
pub enum ConfigError {
    InvalidFlag { name: String },
    KeyFileNotFound { path: String },
    BindFailed(io::Error),
    TlsConfig(io::Error),
    IncompatibleOptions,
}
```
### Defaults
- `drain_timeout`: 2 seconds
- `max_connections_per_ip`: implementation default (reference uses a reasonable value)
- `max_auth_attempts`: implementation default
- `DynamicConfig::default()`: empty auth policy, default rate limits
### What NOT to include
Per the spec, StaticConfig does NOT include: `host_key`, `host_key_algorithm`,
`proxy_config`, `stealth`, `transport_mode`, `listeners`. These are removed in
the new model (ALPN dispatch replaces them — see config.md Key Differences).
## Acceptance Criteria
- [ ] `StaticConfig` struct with all fields per config.md
- [ ] `TlsIdentity` enum with X509, RawKey, SelfSigned variants
- [ ] `DynamicConfig` struct with `auth` and `rate_limits` fields
- [ ] `AuthPolicy` struct with `authorized_fingerprints` and `api_keys`
- [ ] `ApiKeyEntry` struct with all 5 fields
- [ ] `RateLimitConfig` struct with both fields
- [ ] `ConfigReloadHandle` with `reload()` and `dynamic()` methods
- [ ] `ConfigError` enum with all variants
- [ ] `DynamicConfig` derives `Clone`, `Debug` (for ArcSwap)
- [ ] Default values match config.md (drain_timeout = 2s, etc.)
- [ ] No russh dependency (fingerprints as strings)
- [ ] Unit tests for Default impls
- [ ] Unit test: ConfigReloadHandle reload swaps config atomically
- [ ] `cargo test -p alknet-core` succeeds
- [ ] `cargo clippy -p alknet-core` succeeds with no warnings
## References
- docs/architecture/crates/core/config.md — all type definitions
- docs/architecture/decisions/003-crate-decomposition.md — ADR-003 (no russh in core)
- docs/architecture/decisions/010-alpn-router-and-endpoint.md — ADR-010 (no ListenerConfig)
## Notes
> Config reload is a privilege-escalation path — do not ship an unauthenticated
> reload endpoint. The ArcSwap pattern carries forward from the reference
> implementation. StaticConfig removes all SSH-centric fields (host_key,
> stealth, transport_mode, listeners) — those are handler concerns now.
## Summary
> To be filled on completion

tasks/core/core-types.md (normal file, 224 lines added)

@@ -0,0 +1,224 @@
---
id: core/core-types
name: "Implement core types: ProtocolHandler, Connection, BiStream, SendStream, RecvStream, StreamError, HandlerError, Capabilities"
status: pending
depends_on: [core/crate-init]
scope: broad
risk: medium
impact: component
level: implementation
---
## Description
Implement the core types in `src/types.rs`. These are the foundational
abstractions that every handler crate depends on. This is the most
cross-crate-boundary task in core — `Capabilities` in particular is used
heavily by alknet-call's operation registry and composition model.
### ProtocolHandler trait
```rust
#[async_trait]
pub trait ProtocolHandler: Send + Sync + 'static {
    fn alpn(&self) -> &'static [u8];
    async fn handle(&self, connection: Connection, auth: &AuthContext) -> Result<(), HandlerError>;
}
```
- `alpn()` returns the handler's ALPN identifier as a static byte string
- `handle()` receives a `Connection` (not a single BiStream) and an `AuthContext`
- Handlers that need a single stream call `connection.accept_bi()` once
- Handlers that multiplex (SSH, call) open/accept streams as needed
See ADR-002, ADR-007.
### HandlerError
```rust
pub enum HandlerError {
    ConnectionClosed,
    StreamError(io::Error),
    AuthRequired,
    Internal(Box<dyn std::error::Error + Send + Sync>),
}
```
Non-fatal errors within `handle()`. The endpoint catches these, logs them,
closes the connection. Other connections are unaffected. Handler panics are
caught by tokio's task isolation.
### Connection
```rust
pub struct Connection {
    // Private: wraps the underlying QUIC connection or test mock
    identity: OnceLock<Identity>,
}
impl Connection {
    #[cfg(feature = "quinn")]
    pub fn from_quinn(conn: quinn::Connection) -> Self;
    #[cfg(feature = "iroh")]
    pub fn from_iroh(conn: iroh::Connection) -> Self;
    pub async fn accept_bi(&self) -> Result<(SendStream, RecvStream), StreamError>;
    pub async fn open_bi(&self) -> Result<(SendStream, RecvStream), StreamError>;
    pub fn remote_alpn(&self) -> &[u8];
    pub fn remote_addr(&self) -> Option<SocketAddr>;
    pub fn close(&self, code: u32, reason: &str);
    pub fn set_identity(&self, identity: Identity) -> Result<(), IdentityAlreadySet>;
    pub fn identity(&self) -> Option<&Identity>;
}
```
- Opaque type wrapping a QUIC connection (quinn or iroh, feature-gated)
- `set_identity` is write-once-read-many via `OnceLock` (OQ-11) — handlers
store resolved identity for observability; the endpoint does NOT read it
after `handle()` returns (the Connection is moved into the spawned task)
- Internal enum dispatch for quinn vs iroh vs test mock
- `Connection` does not expose quinn types in its public API
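The write-once behaviour of `set_identity` maps directly onto `std::sync::OnceLock`. The sketch below strips `Connection` down to the identity slot (no QUIC wrapper) and uses a hypothetical `new` constructor; `Identity` is reduced to its `id` field.

```rust
use std::sync::OnceLock;

/// Simplified Identity (scopes and resources omitted).
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
}

#[derive(Debug, PartialEq)]
pub enum IdentityAlreadySet {
    AlreadySet,
}

/// Only the identity slot is modelled; the real type also wraps a connection.
pub struct Connection {
    identity: OnceLock<Identity>,
}

impl Connection {
    pub fn new() -> Self {
        Self { identity: OnceLock::new() }
    }

    /// OnceLock::set returns Err(value) when the cell is already filled,
    /// so a second call is rejected and the first identity is kept.
    pub fn set_identity(&self, identity: Identity) -> Result<(), IdentityAlreadySet> {
        self.identity
            .set(identity)
            .map_err(|_rejected| IdentityAlreadySet::AlreadySet)
    }

    pub fn identity(&self) -> Option<&Identity> {
        self.identity.get()
    }
}
```

Because `set` takes `&self`, handlers can record the identity through the shared reference they receive, without interior `Mutex` bookkeeping.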
### BiStream trait
```rust
pub trait BiStream: AsyncRead + AsyncWrite + Send + Unpin {}
```
A convenience trait for client-side code, test mocks, and future transport
abstractions (WebTransport, raw TCP). Handlers that need a single stream
obtain one via `connection.accept_bi()` and treat the pair as a BiStream.
### SendStream and RecvStream
```rust
pub struct SendStream { /* wraps quinn::SendStream or iroh::SendStream or test mock */ }
pub struct RecvStream { /* wraps quinn::RecvStream or iroh::RecvStream or test mock */ }
impl AsyncWrite for SendStream { ... }
impl AsyncRead for RecvStream { ... }
```
Concrete wrapper types using internal enum dispatch to delegate to the
appropriate QUIC stream type (quinn or iroh) in production, and to test mocks
in tests.
### StreamError
```rust
pub enum StreamError {
    ConnectionClosed,
    StreamClosed,
    Timeout,
    Internal(io::Error),
}
```
Returned by `accept_bi()`, `open_bi()`, and stream read/write operations.
Maps from `quinn::ConnectionError` / `quinn::StreamError` and iroh equivalents.
### From<StreamError> for HandlerError
```rust
impl From<StreamError> for HandlerError {
    fn from(e: StreamError) -> Self {
        match e {
            StreamError::ConnectionClosed => HandlerError::ConnectionClosed,
            StreamError::StreamClosed => HandlerError::StreamError(
                io::Error::new(io::ErrorKind::ConnectionReset, "stream closed")),
            StreamError::Timeout => HandlerError::StreamError(
                io::Error::new(io::ErrorKind::TimedOut, "stream timed out")),
            StreamError::Internal(e) => HandlerError::StreamError(e),
        }
    }
}
```
This `From` impl is the canonical conversion — handlers use `?` on
`accept_bi()` / `open_bi()`.
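To show what the `?` conversion buys a handler, the sketch below re-declares both error enums with the same mapping and routes a stream failure through a handler-shaped function. `accept_bi_stub` and `handle_once` are hypothetical stand-ins; the real `accept_bi()` is async and returns a stream pair.

```rust
use std::io;

#[derive(Debug)]
pub enum StreamError {
    ConnectionClosed,
    StreamClosed,
    Timeout,
    Internal(io::Error),
}

#[derive(Debug)]
pub enum HandlerError {
    ConnectionClosed,
    StreamError(io::Error),
    AuthRequired,
    Internal(Box<dyn std::error::Error + Send + Sync>),
}

impl From<StreamError> for HandlerError {
    fn from(e: StreamError) -> Self {
        match e {
            StreamError::ConnectionClosed => HandlerError::ConnectionClosed,
            StreamError::StreamClosed => HandlerError::StreamError(
                io::Error::new(io::ErrorKind::ConnectionReset, "stream closed")),
            StreamError::Timeout => HandlerError::StreamError(
                io::Error::new(io::ErrorKind::TimedOut, "stream timed out")),
            StreamError::Internal(e) => HandlerError::StreamError(e),
        }
    }
}

/// Hypothetical synchronous stand-in for `connection.accept_bi()`.
fn accept_bi_stub(outcome: Result<(), StreamError>) -> Result<(), StreamError> {
    outcome
}

/// Handler-shaped function: `?` applies the From impl automatically,
/// so no explicit map_err is needed at the call site.
pub fn handle_once(outcome: Result<(), StreamError>) -> Result<(), HandlerError> {
    accept_bi_stub(outcome)?;
    Ok(())
}
```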
### Capabilities
```rust
#[derive(Clone, Zeroize, ZeroizeOnDrop)]
pub struct Capabilities {
    entries: HashMap<String, Secret<String>>,
}
impl Capabilities {
    pub fn new() -> Self;
    pub fn with_api_key(mut self, service: &str, key: String) -> Self;
    pub fn with_http_token(mut self, service: &str, token: String) -> Self;
    pub fn get(&self, service: &str) -> Option<&Secret<String>>;
}
```
Critical constraints (ADR-014, ADR-022, review #002 W2):
- **Non-serializable**: does NOT derive `Serialize`. Cannot appear in
`EventEnvelope` payloads even by accident.
- **Zeroized**: derives `Zeroize` and `ZeroizeOnDrop`. Secret material does
not linger in freed heap memory.
- **Clone + Send + Sync**: required by the composition model —
`OperationEnv::invoke()` clones the parent's capabilities for each child.
- **Immutable after construction**: no `set`, no `insert`, no `mut` accessors.
This is the guard from review #002 W2 — makes clone semantics genuinely
two-way (Arc-based vs deep-copy are behaviorally identical when neither
supports mutation).
- **Private fields**: the builder API (`new`, `with_*`) is the only
construction path.
Use `secrecy::Secret<String>` (from the `secrecy` crate) or a similar wrapper
for the secret values. Add `secrecy` to dependencies if needed, or implement
a simple `Secret` wrapper that zeroizes on drop and redacts in Debug.
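The builder-only, read-only shape of `Capabilities` and the "redacts in Debug" requirement can be sketched with a hand-written wrapper. This is an illustration under stated assumptions: the local `Secret` type and its `expose` method are hypothetical, zeroization on drop is left out because doing it reliably needs the `zeroize` crate, and the `Debug` derive on `Capabilities` is added only to demonstrate redaction.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical minimal wrapper. It redacts Debug output but does NOT
/// zeroize on drop; the real type should rely on `zeroize` for that.
#[derive(Clone)]
pub struct Secret(String);

impl Secret {
    /// Explicit accessor so secret reads are easy to grep for.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for Secret {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Secret([REDACTED])")
    }
}

/// Private field plus consuming builder methods: once built, there is
/// no way to insert or mutate entries through the public API.
#[derive(Clone, Debug, Default)]
pub struct Capabilities {
    entries: HashMap<String, Secret>,
}

impl Capabilities {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn with_api_key(mut self, service: &str, key: String) -> Self {
        self.entries.insert(service.to_string(), Secret(key));
        self
    }

    pub fn get(&self, service: &str) -> Option<&Secret> {
        self.entries.get(service)
    }
}
```

Since no method takes `&mut self`, a clone handed to a child operation cannot diverge from the parent's view after construction.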
### IdentityAlreadySet error
```rust
#[derive(Debug, thiserror::Error)]
pub enum IdentityAlreadySet {
    #[error("connection identity already set")]
    AlreadySet,
}
```
Returned by `Connection::set_identity()` if called a second time.
## Acceptance Criteria
- [ ] `ProtocolHandler` trait defined with `alpn()` and `handle()` (async)
- [ ] `HandlerError` enum with all 4 variants
- [ ] `Connection` struct with all methods (from_quinn/from_iroh feature-gated)
- [ ] `Connection::set_identity` write-once via `OnceLock`, returns `IdentityAlreadySet` on second call
- [ ] `BiStream` trait defined (AsyncRead + AsyncWrite + Send + Unpin)
- [ ] `SendStream` implements `AsyncWrite`
- [ ] `RecvStream` implements `AsyncRead`
- [ ] `StreamError` enum with all 4 variants
- [ ] `From<StreamError> for HandlerError` impl
- [ ] `Capabilities` struct with `new()`, `with_api_key()`, `with_http_token()`, `get()`
- [ ] `Capabilities` derives `Clone`, `Zeroize`, `ZeroizeOnDrop` — NOT `Serialize`
- [ ] `Capabilities` fields are private (builder API only, no mut accessors)
- [ ] `IdentityAlreadySet` error type
- [ ] Unit tests for Capabilities (build, get, clone, zeroize)
- [ ] Unit test: `Connection::set_identity` once succeeds, twice returns error
- [ ] `cargo test -p alknet-core` succeeds
- [ ] `cargo clippy -p alknet-core` succeeds with no warnings
## References
- docs/architecture/crates/core/core-types.md — all type definitions
- docs/architecture/decisions/002-protocol-handler-trait.md — ADR-002
- docs/architecture/decisions/007-bistream-type-definition.md — ADR-007
- docs/architecture/decisions/014-secret-material-flow-and-capability-injection.md — ADR-014 (Capabilities)
- docs/architecture/decisions/022-handler-registration-provenance-and-composition-authority.md — ADR-022
## Notes
> This is the most cross-crate-boundary task in core. `Capabilities` is used
> heavily by alknet-call's operation registry and composition model — it must
> be right the first time. The immutability guard (no mut accessors) is the
> security control from review #002 W2 that makes clone semantics safe. The
> `Connection` type uses internal enum dispatch for quinn/iroh/test — do not
> expose quinn types in the public API.
## Summary
> To be filled on completion

tasks/core/crate-init.md (normal file, 116 lines added)

@@ -0,0 +1,116 @@
---
id: core/crate-init
name: Initialize alknet-core crate with Cargo.toml, dependencies, and module skeleton
status: pending
depends_on: []
scope: moderate
risk: low
impact: project
level: implementation
---
## Description
Initialize the `alknet-core` crate from scratch. The workspace currently has
only `alknet-vault`. This task creates the crate directory, `Cargo.toml`,
`lib.rs`, and the module skeleton that subsequent core tasks will fill in.
### Crate setup
Create `crates/alknet-core/` with:
- `Cargo.toml` — package metadata, dependencies, feature flags
- `src/lib.rs` — crate root with module declarations and re-exports
- Module skeleton files (empty or with `// TODO` markers) for:
  - `src/types.rs` — ProtocolHandler, HandlerError, Connection, BiStream, SendStream, RecvStream, StreamError, Capabilities
  - `src/auth.rs` — AuthContext, Identity, IdentityProvider, AuthToken, ConfigIdentityProvider
  - `src/config.rs` — StaticConfig, DynamicConfig, AuthPolicy, ApiKeyEntry, RateLimitConfig, ConfigReloadHandle, ConfigError, TlsIdentity
  - `src/endpoint.rs` — AlknetEndpoint, HandlerRegistry, EndpointError
### Dependencies
Per the architecture specs (overview.md, core/README.md, endpoint.md):
| Crate | Purpose |
|-------|---------|
| `tokio` 1 (full) | Async runtime, watch channel for shutdown |
| `quinn` | QUIC endpoint (feature-gated) |
| `iroh` | P2P relay-assisted endpoint (feature-gated) |
| `rustls` | TLS implementation |
| `rustls-pki-types` | TLS types (CertificateDer, PrivateKeyDer) |
| `serde` 1 | Serialization for config types |
| `serde_json` 1 | JSON for config, JSON Schema values |
| `toml` 0.8 | Config file format |
| `arc-swap` 1 | Atomic config swap for DynamicConfig |
| `async-trait` 0.1 | ProtocolHandler trait (async fn in trait) |
| `tracing` 0.1 | Structured logging |
| `thiserror` 2 | Error enums |
| `zeroize` 1 | Capabilities zeroization |
| `bytes` 1 | Byte buffer types for streams |
| `futures` | AsyncRead/AsyncWrite for BiStream trait |
### Feature flags
```toml
[features]
default = ["quinn"]
quinn = ["dep:quinn"]
iroh = ["dep:iroh"]
```
Both quinn and iroh are optional, and both can be active simultaneously (ADR-010).
`quinn` is default-on for the common case; `iroh` is opt-in.
### Workspace Cargo.toml
Add `crates/alknet-core` to the workspace `members` list in the root
`Cargo.toml`.
### Module skeleton
```rust
// src/lib.rs
//! alknet-core: Core library for ALPN-based protocol dispatch.
pub mod types;
pub mod auth;
pub mod config;
pub mod endpoint;
// Re-exports (filled in by subsequent tasks)
```
Each module file gets a doc comment and `// TODO: implement` marker. The
subsequent tasks (core-types, config, auth, endpoint) fill these in.
## Acceptance Criteria
- [ ] `crates/alknet-core/Cargo.toml` exists with all dependencies and feature flags
- [ ] `crates/alknet-core/src/lib.rs` exists with module declarations
- [ ] Module skeleton files exist: `types.rs`, `auth.rs`, `config.rs`, `endpoint.rs`
- [ ] Root `Cargo.toml` `members` list includes `crates/alknet-core`
- [ ] `cargo check -p alknet-core` succeeds
- [ ] `cargo clippy -p alknet-core` succeeds with no warnings
- [ ] Dual licensing: `MIT OR Apache-2.0` (workspace-inherited)
## References
- docs/architecture/overview.md — crate graph, shared types
- docs/architecture/crates/core/README.md — crate index
- docs/architecture/crates/core/core-types.md — types to implement
- docs/architecture/crates/core/endpoint.md — endpoint, features (quinn + iroh)
- docs/architecture/crates/core/config.md — config types
- docs/architecture/crates/core/auth.md — auth types
- docs/architecture/decisions/003-crate-decomposition.md — ADR-003
- docs/architecture/decisions/010-alpn-router-and-endpoint.md — ADR-010 (feature-gating)
## Notes
> This is the foundational setup task for alknet-core. All subsequent core
> tasks depend on this one. The crate has no alknet dependencies (vault is
> standalone; core doesn't depend on vault). The feature flags for quinn/iroh
> are important — both are optional and can be active simultaneously.
## Summary
> To be filled on completion

tasks/core/endpoint.md (normal file, 249 lines added)

@@ -0,0 +1,249 @@
---
id: core/endpoint
name: Implement AlknetEndpoint, HandlerRegistry, accept loops (quinn + iroh), TLS identity, and graceful shutdown
status: pending
depends_on: [core/core-types, core/config, core/auth]
scope: broad
risk: high
impact: component
level: implementation
---
## Description
Implement the ALPN router and endpoint in `src/endpoint.rs`. This is the
integration point of alknet-core — it ties together the core types, config,
and auth into the central runtime that accepts connections and dispatches to
handlers by ALPN string.
### AlknetEndpoint
```rust
pub struct AlknetEndpoint {
    #[cfg(feature = "quinn")]
    quinn: Option<quinn::Endpoint>,
    #[cfg(feature = "iroh")]
    iroh: Option<iroh::Endpoint>,
    handlers: Arc<HandlerRegistry>,
    dynamic: Arc<ArcSwap<DynamicConfig>>,
    identity_provider: Arc<dyn IdentityProvider>,
    shutdown: watch::Receiver<bool>,
}
```
Manages one or more QUIC connection sources, each feeding into the same ALPN
router. Both quinn and iroh are optional (feature-gated), both can be active
simultaneously (ADR-010).
### HandlerRegistry
```rust
pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
}
impl HandlerRegistry {
    pub fn new() -> Self;
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>);
    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>>;
    pub fn alpn_strings(&self) -> Vec<Vec<u8>>;
}
```
- `register()`: insert a handler. Panics if ALPN already registered.
- `get()`: look up by ALPN string.
- `alpn_strings()`: all registered ALPNs — used to build TLS ServerConfig
(quinn) and ALPN list (iroh).
- Registration is **static at startup** (OQ-04, ADR-010). The CLI builds the
registry, inserts all handlers, passes to `AlknetEndpoint::new()`.
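A registry with the behaviour listed above fits in a few lines. The sketch below trims `ProtocolHandler` to its `alpn()` method so it compiles without async machinery; `FixedAlpn` is a hypothetical test handler, and the ALPN strings used with it are placeholders rather than the project's real identifiers.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Trimmed trait: only the ALPN accessor is needed for registration.
pub trait ProtocolHandler: Send + Sync + 'static {
    fn alpn(&self) -> &'static [u8];
}

#[derive(Default)]
pub struct HandlerRegistry {
    handlers: HashMap<&'static [u8], Arc<dyn ProtocolHandler>>,
}

impl HandlerRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Panics on a duplicate ALPN. Registration happens once at startup,
    /// so a duplicate is a programming error rather than a runtime condition.
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>) {
        let alpn = handler.alpn();
        assert!(
            !self.handlers.contains_key(alpn),
            "ALPN {:?} already registered",
            String::from_utf8_lossy(alpn)
        );
        self.handlers.insert(alpn, handler);
    }

    pub fn get(&self, alpn: &[u8]) -> Option<&Arc<dyn ProtocolHandler>> {
        self.handlers.get(alpn)
    }

    /// Owned copies of every registered ALPN, in unspecified order.
    pub fn alpn_strings(&self) -> Vec<Vec<u8>> {
        self.handlers.keys().map(|k| k.to_vec()).collect()
    }
}

/// Hypothetical handler used only to exercise the registry.
pub struct FixedAlpn(pub &'static [u8]);

impl ProtocolHandler for FixedAlpn {
    fn alpn(&self) -> &'static [u8] {
        self.0
    }
}
```

Because `alpn_strings()` iterates a `HashMap`, callers that need a stable order for the TLS ALPN list would have to sort the result themselves.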
### Accept loops
Each active connection source runs its own accept loop. Both dispatch through
the same `HandlerRegistry`.
**Quinn accept loop** (public QUIC+TLS):
```
loop {
    tokio::select! {
        incoming = quinn_endpoint.accept() => {
            // accept() yields None once the endpoint has been closed
            let Some(incoming) = incoming else { break };
            match incoming.await {
                Ok(conn) => dispatch(conn),
                Err(e) => { /* log TLS handshake failure, continue */ }
            }
        }
        _ = shutdown.changed() => break,
    }
}
```
**iroh accept loop** (P2P relay-assisted):
```
loop {
    tokio::select! {
        incoming = iroh_endpoint.accept() => {
            let accepting = incoming.accept();
            let alpn = accepting.alpn().await;
            match alpn {
                Ok(alpn) => dispatch(alpn, accepting),
                Err(e) => { /* log handshake failure, continue */ }
            }
        }
        _ = shutdown.changed() => break,
    }
}
```
Use `iroh::Endpoint` directly (not iroh's `Router`) because our HandlerRegistry
is shared between quinn and iroh, and our AuthContext construction differs per
source. See iroh's `protocol.rs` for the reference pattern.
### Dispatch function (shared)
```
fn dispatch(connection) {
    let alpn = connection.alpn();
    match handlers.get(alpn) {
        Some(handler) => {
            let auth = AuthContext::from_connection(&connection);
            let conn = Connection::from_quinn(connection); // or from_iroh
            tokio::spawn(async move {
                if let Err(e) = handler.handle(conn, &auth).await {
                    // log error, connection closes
                }
            });
        }
        None => connection.close(0u32, "no handler"),
    }
}
```
### AuthContext construction
The endpoint constructs `AuthContext` from the QUIC connection:
1. `alpn`: from `connection.alpn()` — always present
2. `remote_addr`: from `connection.remote_addr()` — may be None for iroh
3. `tls_client_fingerprint`: extracted from TLS session's client cert, if presented
4. `identity`: if fingerprint available, call `IdentityProvider::resolve_from_fingerprint()`.
If resolves, `identity = Some(resolved)`. If not, `identity = None`.
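The four construction steps above can be sketched as a plain function. The connection is replaced by its already-extracted parts so the block needs no QUIC crate; `build_auth_context` and the `AllowOne` provider are hypothetical names, and `Identity` and `IdentityProvider` are trimmed to what the steps use.

```rust
use std::net::SocketAddr;

/// Simplified Identity (scopes and resources omitted).
#[derive(Debug, Clone, PartialEq)]
pub struct Identity {
    pub id: String,
}

#[derive(Debug, Clone)]
pub struct AuthContext {
    pub identity: Option<Identity>,
    pub alpn: Vec<u8>,
    pub remote_addr: Option<SocketAddr>,
    pub tls_client_fingerprint: Option<String>,
}

/// Trimmed trait: only the fingerprint path is used by the endpoint.
pub trait IdentityProvider {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity>;
}

pub fn build_auth_context(
    alpn: &[u8],
    remote_addr: Option<SocketAddr>,
    tls_client_fingerprint: Option<String>,
    provider: &dyn IdentityProvider,
) -> AuthContext {
    // Step 4: resolve only if a client cert fingerprint was presented.
    // An unknown fingerprint leaves identity as None (partial auth).
    let identity = tls_client_fingerprint
        .as_deref()
        .and_then(|fp| provider.resolve_from_fingerprint(fp));
    AuthContext {
        identity,
        alpn: alpn.to_vec(),    // step 1: always present
        remote_addr,            // step 2: may be None for iroh
        tls_client_fingerprint, // step 3: kept even when unresolved
    }
}

/// Hypothetical provider that recognizes exactly one fingerprint.
pub struct AllowOne(pub &'static str);

impl IdentityProvider for AllowOne {
    fn resolve_from_fingerprint(&self, fingerprint: &str) -> Option<Identity> {
        (fingerprint == self.0).then(|| Identity { id: fingerprint.to_string() })
    }
}
```

Keeping the fingerprint in the context even when resolution fails lets a handler retry against its own rules during protocol-level auth.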
### TLS Identity
Three modes per `TlsIdentity` (OQ-12):
**RawKey (RFC 7250, default for P2P)**:
- Build `rustls::ServerConfig` with a `ResolvesServerCert` whose `only_raw_public_keys()` returns `true`
- The `ResolvesServerCert` impl generates the cert on-the-fly from the Ed25519 key
- ~100 lines — see `iroh/iroh/src/tls/resolver.rs` for the reference pattern
- Works natively with SSH auth and git; browsers do NOT support RFC 7250
**X509 (domain-hosted)**:
- Load cert/key from file paths
- Standard `rustls::ServerConfig`
- For browser/WebTransport clients and public domain services
**SelfSigned (dev only)**:
- Generate self-signed cert on startup
- External clients will not trust it
**ACME (future, not in this task)**:
- The reverse-proxy project demonstrates the complete ACME pattern. It will be
adapted as an additional `TlsIdentity` variant or `ResolvesServerCert` impl.
For now, X509 with manual certs is the domain path. Note this as a TODO.
The quinn endpoint's `rustls::ServerConfig` ALPN list is set from
`registry.alpn_strings()` at construction time. The iroh endpoint's ALPN list
is similarly derived. Both advertise the same set of ALPNs.
### Graceful shutdown
```rust
impl AlknetEndpoint {
    pub fn shutdown_sender(&self) -> watch::Sender<bool>;
    pub async fn shutdown(&self) -> Result<(), EndpointError>;
}
```
- `shutdown_sender()`: clone of shutdown channel sender. `send(true)` signals shutdown.
- `shutdown()`: signals all accept loops to stop, waits for in-flight connections
with drain timeout (default 2s from StaticConfig), then forcefully closes remaining.
- SIGTERM/SIGINT wired to shutdown channel by the CLI binary (not core's concern).
### EndpointError
```rust
pub enum EndpointError {
    BindFailed(io::Error),
    TlsConfig(io::Error),
    HandlerNotFound(Vec<u8>),
}
```
Fatal errors that prevent the endpoint from starting or continuing.
### Accept loop error handling
- **TLS handshake failure**: log and continue. Client may have offered no
compatible ALPN, or cert may be untrusted.
- **Handler panic**: caught by tokio's task isolation. Connection dropped,
others continue.
- **Connection-level errors** (quinn/iroh ConnectionError): log and continue.
Accept loop keeps running.
### What the accept loops do NOT do
- No byte-peeking (ALPN handles protocol detection)
- No per-handler accept loops (ALPN unifies)
- No SSH-specific logic (accept loop is ALPN-agnostic)
### TCP is NOT an endpoint concern
Bare TCP (SSH over port 22) does not use QUIC or ALPN. TCP access is handled by
individual handlers (the SSH handler can listen on TCP independently). This is
handler-specific, not core endpoint.
## Acceptance Criteria
- [ ] `AlknetEndpoint` struct with quinn/iroh (both Option, both feature-gated)
- [ ] `HandlerRegistry` with new/register/get/alpn_strings
- [ ] `register()` panics on duplicate ALPN
- [ ] Quinn accept loop runs, dispatches by ALPN, respects shutdown
- [ ] iroh accept loop runs, dispatches by ALPN, respects shutdown
- [ ] Dispatch function spawns handler task via `tokio::spawn`
- [ ] AuthContext constructed from connection (alpn, remote_addr, fingerprint, identity)
- [ ] TLS RawKey mode: rustls ServerConfig with `only_raw_public_keys()`, on-the-fly cert
- [ ] TLS X509 mode: load cert/key from files, standard ServerConfig
- [ ] TLS SelfSigned mode: generate self-signed cert on startup
- [ ] ALPN list in TLS ServerConfig set from `registry.alpn_strings()`
- [ ] Graceful shutdown: signal accept loops to stop, drain timeout, force close
- [ ] `EndpointError` enum with all variants
- [ ] Accept loop errors logged, loop continues (no crash on handshake failure)
- [ ] Handler panics caught by tokio task isolation (connection dropped, others continue)
- [ ] No byte-peeking, no per-handler accept loops, no SSH-specific logic
- [ ] Unit test: HandlerRegistry register/get/alpn_strings
- [ ] Unit test: HandlerRegistry register panics on duplicate ALPN
- [ ] Integration test: endpoint with mock handler, verify dispatch by ALPN
- [ ] `cargo test -p alknet-core` succeeds
- [ ] `cargo clippy -p alknet-core` succeeds with no warnings
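
The `HandlerRegistry` criteria (new/register/get/alpn_strings, panic on duplicate ALPN) could look roughly like this. It is a sketch under assumptions: the trait here is a cut-down stand-in for the spec's `ProtocolHandler` (only `alpn` is modelled), `StaticHandler` is a hypothetical test double, and the method signatures and ALPN values in the test are illustrative rather than taken from endpoint.md.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Simplified stand-in for the spec's `ProtocolHandler` (the real trait also has `handle`).
pub trait ProtocolHandler: Send + Sync {
    fn alpn(&self) -> &[u8];
}

/// Hypothetical handler used only for illustration and tests.
pub struct StaticHandler(pub &'static [u8]);

impl ProtocolHandler for StaticHandler {
    fn alpn(&self) -> &[u8] {
        self.0
    }
}

#[derive(Default)]
pub struct HandlerRegistry {
    handlers: HashMap<Vec<u8>, Arc<dyn ProtocolHandler>>,
}

impl HandlerRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Registers a handler under its own ALPN. Panics on a duplicate, since two
    /// handlers claiming one ALPN is a wiring bug rather than a runtime condition.
    pub fn register(&mut self, handler: Arc<dyn ProtocolHandler>) {
        let alpn = handler.alpn().to_vec();
        if self.handlers.contains_key(&alpn) {
            panic!("duplicate ALPN registration: {}", String::from_utf8_lossy(&alpn));
        }
        self.handlers.insert(alpn, handler);
    }

    /// Looks up the handler for a negotiated ALPN.
    pub fn get(&self, alpn: &[u8]) -> Option<Arc<dyn ProtocolHandler>> {
        self.handlers.get(alpn).cloned()
    }

    /// All registered ALPNs, sorted so the list is deterministic.
    pub fn alpn_strings(&self) -> Vec<Vec<u8>> {
        let mut alpns: Vec<Vec<u8>> = self.handlers.keys().cloned().collect();
        alpns.sort();
        alpns
    }
}
```

The output of `alpn_strings()` is what would feed the ALPN list in the TLS ServerConfig, so the set of advertised protocols and the set of dispatchable handlers cannot drift apart.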
## References
- docs/architecture/crates/core/endpoint.md — full endpoint spec
- docs/architecture/decisions/001-alpn-protocol-dispatch.md — ADR-001
- docs/architecture/decisions/010-alpn-router-and-endpoint.md — ADR-010
- docs/architecture/decisions/006-alpn-convention-and-connection-model.md — ADR-006
- docs/architecture/decisions/007-bistream-type-definition.md — ADR-007
- iroh reference: `/workspace/iroh/iroh/src/protocol.rs` (accept loop pattern)
- iroh reference: `/workspace/iroh/iroh/src/tls/resolver.rs` (RFC 7250 raw key)
## Notes
> This is the integration point of alknet-core: it ties together types, config,
> and auth. It is the highest-risk task in core because it involves QUIC
> connection handling, TLS identity (3 modes), and graceful shutdown. The RFC 7250
> raw key path is ~100 lines (iroh has a reference implementation). ACME is
> deferred: note it as a TODO and use X509 manual certs for the domain path for
> now. TCP is NOT an endpoint concern; it is handler-specific.
## Summary
> To be filled on completion

122
tasks/core/review-core.md Normal file

@@ -0,0 +1,122 @@
---
id: core/review-core
name: Review alknet-core implementation for spec conformance and pattern consistency
status: pending
depends_on: [core/endpoint]
scope: moderate
risk: low
impact: phase
level: review
---
## Description
Review the alknet-core implementation for spec conformance, pattern
consistency, and correctness before alknet-call (which depends on core)
begins implementation. This is the quality checkpoint at the end of the core
phase.
### Review Checklist
1. **Core types conformance** (core-types.md):
- `ProtocolHandler` trait signature matches spec (alpn, handle)
- `HandlerError` has all 4 variants (ConnectionClosed, StreamError, AuthRequired, Internal)
- `Connection` has all methods, from_quinn/from_iroh feature-gated
- `Connection::set_identity` is write-once via OnceLock
- `BiStream` is a trait (AsyncRead + AsyncWrite + Send + Unpin)
- `SendStream` implements AsyncWrite, `RecvStream` implements AsyncRead
- `StreamError` has all 4 variants
- `From<StreamError> for HandlerError` impl matches spec mapping table
- `Capabilities` is non-serializable, zeroized, immutable, Clone+Send+Sync
- `Capabilities` has builder API (new, with_api_key, with_http_token, get), private fields
2. **Config conformance** (config.md):
- `StaticConfig` fields match (listen_addr, tls_identity, iroh_relay, drain_timeout)
- `TlsIdentity` has X509, RawKey, SelfSigned
- `DynamicConfig` has auth and rate_limits
- `AuthPolicy` has authorized_fingerprints (HashSet<String>), api_keys (Vec<ApiKeyEntry>)
- `ApiKeyEntry` has all 5 fields (prefix, hash, scopes, description, expires_at)
- `ConfigReloadHandle` has reload() and dynamic()
- No russh dependency (fingerprints as strings)
- No removed fields (host_key, stealth, transport_mode, listeners)
3. **Auth conformance** (auth.md):
- `AuthContext` has all 4 fields, derives Clone
- `Identity` has id, scopes, resources
- `AuthToken` has raw field
- `IdentityProvider` trait with both methods
- `ConfigIdentityProvider` reads from ArcSwap on every call
- Fingerprint resolution looks up in authorized_fingerprints
- Token resolution: alk_ prefix, hash match, expiry check
- Two identity scopes documented (connection-level vs per-request)
4. **Endpoint conformance** (endpoint.md):
- `AlknetEndpoint` has quinn/iroh (both Option, both feature-gated)
- `HandlerRegistry` register/get/alpn_strings, panics on duplicate
- Quinn accept loop: select on accept + shutdown, dispatch by ALPN
- iroh accept loop: select on accept + shutdown, dispatch by ALPN
- Dispatch spawns handler task via tokio::spawn
- AuthContext constructed from connection (alpn, remote_addr, fingerprint, identity)
- TLS RawKey: only_raw_public_keys(), on-the-fly cert from Ed25519
- TLS X509: load from files
- TLS SelfSigned: generate on startup
- ALPN list in ServerConfig from registry.alpn_strings()
- Graceful shutdown: drain timeout, force close
- EndpointError has all 3 variants
- No byte-peeking, no per-handler loops, no SSH-specific logic
5. **Pattern consistency**:
- ArcSwap used consistently for DynamicConfig
- Feature flags (quinn, iroh) gate transport code correctly
- Error handling patterns consistent (thiserror, Result propagation)
- No quinn/iroh types in public API (Connection wraps them)
6. **Security constraints**:
- Capabilities non-serializable (no Serialize derive)
- Capabilities zeroized (Zeroize, ZeroizeOnDrop)
- Capabilities immutable (no mut accessors)
- Config reload treated as a privilege-escalation path (no unauthenticated reload endpoint)
- Token entropy requirement documented
7. **Test coverage**:
- Unit tests for Capabilities (build, get, clone, zeroize)
- Unit tests for config types and reload
- Unit tests for auth resolution (fingerprint, token, expiry)
- Unit tests for HandlerRegistry
- Integration test: endpoint dispatch by ALPN
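
As one example of what the auth-resolution tests need to exercise, the token path (alk_ prefix, hash match, expiry check) can be sketched as a pure function. Everything here is an assumption for illustration: the field types of `ApiKeyEntry`, the `TokenError` enum, the `resolve_token` signature, and the reading of `prefix` as a leading slice of the token. The digest is injected as a parameter so the sketch does not pick a hash algorithm.

```rust
use std::time::SystemTime;

/// Field names follow the checklist; the types are assumptions.
pub struct ApiKeyEntry {
    pub prefix: String,
    pub hash: String,
    pub scopes: Vec<String>,
    pub description: String,
    pub expires_at: Option<SystemTime>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum TokenError {
    MissingPrefix,
    Unknown,
    Expired,
}

/// Resolves a raw token against the configured keys at time `now`.
/// `digest` is whatever hash the real implementation uses (caller-supplied here).
pub fn resolve_token<'a>(
    raw: &str,
    keys: &'a [ApiKeyEntry],
    now: SystemTime,
    digest: impl Fn(&str) -> String,
) -> Result<&'a ApiKeyEntry, TokenError> {
    // Step 1: reject anything that is not an alk_ token before hashing.
    if !raw.starts_with("alk_") {
        return Err(TokenError::MissingPrefix);
    }
    // Step 2: match on prefix and hash. A production check would compare
    // hashes in constant time; plain `==` keeps the sketch short.
    let hashed = digest(raw);
    let entry = keys
        .iter()
        .find(|k| raw.starts_with(k.prefix.as_str()) && k.hash == hashed)
        .ok_or(TokenError::Unknown)?;
    // Step 3: a key with no expiry never expires.
    match entry.expires_at {
        Some(deadline) if deadline <= now => Err(TokenError::Expired),
        _ => Ok(entry),
    }
}
```

Taking `now` as a parameter, instead of reading the clock inside the function, lets the expiry tests run deterministically.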
## Acceptance Criteria
- [ ] All core types match core-types.md
- [ ] All config types match config.md
- [ ] All auth types match auth.md
- [ ] Endpoint matches endpoint.md (accept loops, TLS modes, shutdown)
- [ ] Capabilities security constraints satisfied (non-serializable, zeroized, immutable)
- [ ] No russh dependency in core
- [ ] No quinn/iroh types in public API
- [ ] ArcSwap pattern consistent
- [ ] Feature flags gate transport code correctly
- [ ] Test coverage matches checklist item 7 (Capabilities, config and reload, auth resolution, HandlerRegistry, endpoint dispatch)
- [ ] `cargo fmt --check -p alknet-core` passes
- [ ] `cargo clippy -p alknet-core` passes with no warnings
- [ ] `cargo test -p alknet-core` passes
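
To make the Capabilities constraints concrete for reviewers, here is a std-only sketch of the shape being checked: private fields, a consuming builder, read-only access, no serialization derive, and wiping on drop. The method signatures (`name`/`secret` pairs, `get` returning bytes) are assumptions, not the spec's API, and the hand-written `Drop` is only a stand-in for the Zeroize/ZeroizeOnDrop derives named in the checklist.

```rust
use std::collections::HashMap;
use std::fmt;

/// No `Serialize` derive and no public fields: secrets leave only through `get`.
#[derive(Clone, Default)]
pub struct Capabilities {
    secrets: HashMap<String, Vec<u8>>,
}

impl Capabilities {
    pub fn new() -> Self {
        Self::default()
    }

    /// Consuming builder step (hypothetical signature).
    pub fn with_api_key(mut self, name: &str, key: &str) -> Self {
        self.secrets.insert(name.to_string(), key.as_bytes().to_vec());
        self
    }

    /// Consuming builder step (hypothetical signature).
    pub fn with_http_token(mut self, name: &str, token: &str) -> Self {
        self.secrets.insert(name.to_string(), token.as_bytes().to_vec());
        self
    }

    /// Read-only access; there is deliberately no `&mut` accessor.
    pub fn get(&self, name: &str) -> Option<&[u8]> {
        self.secrets.get(name).map(Vec::as_slice)
    }
}

impl fmt::Debug for Capabilities {
    // Never print secret material, even in debug logs.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Capabilities(<redacted>)")
    }
}

impl Drop for Capabilities {
    fn drop(&mut self) {
        // Best-effort wipe. A plain `fill` may be optimized away by the compiler;
        // a dedicated zeroizing implementation is needed for a real guarantee.
        for secret in self.secrets.values_mut() {
            secret.fill(0);
        }
    }
}
```

Because the builder consumes and returns `self`, a value is immutable once construction finishes, which is the property the "no mut accessors" check is looking for.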
## References
- docs/architecture/crates/core/README.md
- docs/architecture/crates/core/core-types.md
- docs/architecture/crates/core/config.md
- docs/architecture/crates/core/auth.md
- docs/architecture/crates/core/endpoint.md
- docs/architecture/decisions/ (relevant ADRs: 001-011, 014, 015, 022)
## Notes
> This review verifies core is spec-conformant before alknet-call begins.
> alknet-call depends heavily on core types (ProtocolHandler, Connection,
> AuthContext, Capabilities, IdentityProvider) — any issues here propagate to
> call. If deviations are found, document and fix before proceeding.
## Summary
> To be filled on completion