Decompose architecture into 35 atomic tasks across 10 generations for implementation

2026-06-02 09:02:55 +00:00
parent b5c59ef3bc
commit 14dbd81195
35 changed files with 1636 additions and 0 deletions
--- a/tasks/server/channel-proxy.md
+++ b/tasks/server/channel-proxy.md
@@ -0,0 +1,49 @@
+---
+id: server/channel-proxy
+name: Implement server channel proxy — direct TCP and outbound proxy connections
+status: pending
+depends_on:
+  - server/handler
+  - auth/error-types
+scope: moderate
+risk: medium
+impact: component
+level: implementation
+---
+
+## Description
+
+Implement the server's channel proxy logic that makes outbound TCP connections on behalf of SSH clients. When `channel_open_direct_tcpip(host, port)` is called for a non-reserved destination:
+
+1. Connect to `host:port`, either directly or via the configured outbound proxy
+2. Run `tokio::io::copy_bidirectional` between the SSH channel stream and the outbound TCP stream
+3. Clean up when either side disconnects
+
+Three outbound connection modes are supported per server.md: Direct, SOCKS5 proxy, and HTTP CONNECT proxy.
+
+## Acceptance Criteria
+
+- [ ] `crates/wraith-core/src/server/channel_proxy.rs` exports channel proxy functions
+- [ ] `ProxyConfig` enum: `Direct`, `Socks5 { addr: SocketAddr }`, `HttpConnect { addr: SocketAddr }`
+- [ ] `connect_outbound(target: SocketAddr, proxy: &ProxyConfig) -> Result<TcpStream>` — connects to target directly or via proxy
+- [ ] Direct mode: `TcpStream::connect(target)`
+- [ ] SOCKS5 proxy: performs the SOCKS5 handshake, then sends a CONNECT command for the target
+- [ ] HTTP CONNECT proxy: sends `CONNECT host:port HTTP/1.1` to proxy, reads 200 response
+- [ ] `proxy_channel(channel: ChannelStream, target: SocketAddr, proxy: &ProxyConfig)` — spawns bidirectional copy task
+- [ ] Channel errors (target unreachable, proxy failure) close that channel without affecting SSH session
+- [ ] No logging of tunnel destinations (ADR-006) — only transport/auth events are logged
+- [ ] Unit tests: direct connection proxy, SOCKS5 proxy handshake, HTTP CONNECT proxy handshake, target unreachable handling
+
+## References
+
+- docs/architecture/server.md — Channel Handling, Outbound Proxy Modes sections
+- docs/architecture/decisions/006-no-logging-of-tunnel-destinations.md — no destination logging
+- docs/architecture/decisions/019-proxy-dual-semantics.md — server `--proxy` meaning
+
+## Notes
+
+> To be filled by implementation agent
+
+## Summary
+
+> To be filled on completion
--- a/tasks/server/control-channel.md
+++ b/tasks/server/control-channel.md
@@ -0,0 +1,50 @@
+---
+id: server/control-channel
+name: Implement wraith-control reserved channel for pubsub event bus bridging (ADR-018)
+status: pending
+depends_on:
+  - server/handler
+  - auth/error-types
+scope: narrow
+risk: medium
+impact: component
+level: implementation
+---
+
+## Description
+
+Implement the control channel routing per ADR-018. When the server receives a `channel_open_direct_tcpip` request for `wraith-control:0`:
+
+1. The handler detects the reserved `wraith-` prefix destination
+2. Instead of making a TCP connection, it bridges the SSH channel to an internal event bus handle
+3. `EventEnvelope` JSON flows bidirectionally over the SSH channel
+
+The entire `wraith-` prefix is reserved: no TCP connections should be attempted for `wraith-*` destinations. The control channel is optional; servers without pubsub configured should handle the channel open request according to a configurable behavior (reject it or provide a loopback pipe).
+
+At this stage, implement the routing logic and a `ControlChannelHandler` trait that consumers can implement. The actual pubsub bridge implementation belongs in a separate crate or behind a feature flag.
+
+## Acceptance Criteria
+
+- [ ] `crates/wraith-core/src/server/control_channel.rs` exports `ControlChannelHandler` trait and routing logic
+- [ ] `WRAITH_CONTROL_DESTINATION` constant defined as `"wraith-control"` (ADR-018)
+- [ ] `WRAITH_PREFIX` constant defined as `"wraith-"` for namespace reservation
+- [ ] `ControlChannelHandler` trait: `async fn handle_channel(stream: Box<dyn AsyncRead + AsyncWrite + Unpin + Send>)`
+- [ ] Server handler detects `wraith-*` prefix and routes to `ControlChannelHandler` instead of TCP proxy
+- [ ] If no `ControlChannelHandler` configured, reject the channel open request (SSH channel open failure)
+- [ ] Non-reserved destinations continue through normal TCP proxy path
+- [ ] Server constraint enforced: no TCP connections to `wraith-*` destinations
+- [ ] Unit tests: reserved destination detected, non-reserved passes through, prefix matching works
+
+## References
+
+- docs/architecture/server.md — Channel Handling section (reserved destinations), Constraints section
+- docs/architecture/decisions/018-control-channel-for-pubsub.md — control channel rationale
+- docs/architecture/napi-and-pubsub.md — server-side control channel behavior
+
+## Notes
+
+> To be filled by implementation agent
+
+## Summary
+
+> To be filled on completion
--- a/tasks/server/handler.md
+++ b/tasks/server/handler.md
@@ -0,0 +1,46 @@
+---
+id: server/handler
+name: Implement ServerHandler — russh server handler with auth and channel dispatch
+status: pending
+depends_on:
+  - auth/server-auth-handler
+  - transport/trait-and-types
+scope: moderate
+risk: medium
+impact: component
+level: implementation
+---
+
+## Description
+
+Implement the core `ServerHandler` that implements `russh::server::Handler`. This is the heart of the server. Per server.md, it has two primary responsibilities:
+
+1. **`auth_publickey()`**: Delegated to `ServerAuthConfig` — checks key against authorized set or validates cert-authority
+2. **`channel_open_direct_tcpip()`**: Routes the channel — either to a TCP target (directly or via proxy) or internally for reserved `wraith-*` destinations (ADR-018)
+
+At this stage, implement the handler struct, auth delegation, and the channel dispatch skeleton (actual TCP connection and proxy logic in dependent tasks).
+
+## Acceptance Criteria
+
+- [ ] `crates/wraith-core/src/server/handler.rs` exports `ServerHandler`
+- [ ] `ServerHandler` implements `russh::server::Handler`
+- [ ] `ServerHandler` holds: `Arc<ServerAuthConfig>`, `outbound_proxy: Option<ProxyConfig>`, `remote_addr: Option<SocketAddr>`
+- [ ] `auth_publickey()` delegates to `ServerAuthConfig` and returns `Accept` or `Reject`
+- [ ] `channel_open_direct_tcpip()` dispatches: if `host.starts_with("wraith-")`, route to internal handler (stub for control channel); otherwise, spawn TCP proxy task (stub that logs and returns error for now)
+- [ ] One `ServerHandler` instance per connection; state is not shared between connections (unless explicitly Arc'd)
+- [ ] Structured auth logging via `tracing::info!` with `remote_addr`, `key_fingerprint`, `result` (ADR-013)
+- [ ] Unit tests: auth delegation works, reserved destination routing logic, unknown channel types rejected
+
+## References
+
+- docs/architecture/server.md — Server Handler Behavior section, channel handling
+- docs/architecture/decisions/018-control-channel-for-pubsub.md — reserved `wraith-*` destinations
+- docs/architecture/decisions/013-fail2ban-friendly-logging.md — structured auth logging
+
+## Notes
+
+> To be filled by implementation agent
+
+## Summary
+
+> To be filled on completion
--- a/tasks/server/rate-limiting-and-logging.md
+++ b/tasks/server/rate-limiting-and-logging.md
@@ -0,0 +1,50 @@
+---
+id: server/rate-limiting-and-logging
+name: Implement server rate limiting and fail2ban-friendly structured logging
+status: pending
+depends_on:
+  - server/handler
+scope: narrow
+risk: low
+impact: component
+level: implementation
+---
+
+## Description
+
+Implement the two-layer abuse protection per ADR-013:
+
+1. **Structured logging** at INFO level for fail2ban integration: auth attempts (remote_addr, user, key_fingerprint, accept/reject), connection opened/closed (remote_addr, transport, duration)
+2. **Built-in rate limiting**: `--max-connections-per-ip` (reject new connections from IPs with N active connections), `--max-auth-attempts` (disconnect after N failed auth attempts per connection)
+
+No logging of tunnel destinations, DNS resolutions, or bytes transferred (ADR-006).
+
+## Acceptance Criteria
+
+- [ ] `crates/wraith-core/src/server/rate_limit.rs` exports connection rate limiter
+- [ ] `ConnectionRateLimiter` tracks active connections per IP using `HashMap<IpAddr, usize>`
+- [ ] `ConnectionRateLimiter::check(ip) -> bool` — returns `true` if connection allowed, `false` if over limit
+- [ ] `ConnectionRateLimiter::on_connect(ip)` — increment counter
+- [ ] `ConnectionRateLimiter::on_disconnect(ip)` — decrement counter
+- [ ] `AuthAttemptLimiter` tracks failed auth attempts per connection
+- [ ] `AuthAttemptLimiter::check() -> bool` — returns `true` if under limit
+- [ ] `AuthAttemptLimiter::on_failure()` — increment failure counter
+- [ ] Structured `tracing::info!` logging on: auth attempt, connection opened, connection closed
+- [ ] Log format includes key-value pairs: `remote_addr`, `user`, `key_fingerprint`, `result`, `transport`, `duration`
+- [ ] No logging of: channel open targets, DNS resolutions, bytes transferred
+- [ ] Integration with `ServerHandler`: rate limiter checked before auth, auth attempt limiter checked during auth
+- [ ] Unit tests: connection limit enforced, auth attempt limit enforced, log format verification
+
+## References
+
+- docs/architecture/server.md — Logging and Rate Limiting section
+- docs/architecture/decisions/013-fail2ban-friendly-logging.md — logging format, rate limiting flags
+- docs/architecture/decisions/006-no-logging-of-tunnel-destinations.md — no destination logging
+
+## Notes
+
+> To be filled by implementation agent
+
+## Summary
+
+> To be filled on completion
--- a/tasks/server/serve-loop.md
+++ b/tasks/server/serve-loop.md
@@ -0,0 +1,54 @@
+---
+id: server/serve-loop
+name: Implement server accept loop, graceful shutdown, and ServeOptions config
+status: pending
+depends_on:
+  - server/handler
+  - server/channel-proxy
+  - server/rate-limiting-and-logging
+  - transport/trait-and-types
+scope: moderate
+risk: medium
+impact: component
+level: implementation
+---
+
+## Description
+
+Implement the server's main accept loop and configuration. This ties together the transport acceptor, server handler, rate limiting, and logging into a coherent server process.
+
+`ServeOptions` is the programmatic configuration struct (ADR-011) for the server. The accept loop:
+1. Binds a `TransportAcceptor` based on transport mode
+2. Accepts incoming connections (respecting rate limits)
+3. Creates a `ServerHandler` per connection
+4. Passes the stream to `russh::server::run_stream()`
+5. Handles graceful shutdown on SIGTERM/SIGINT
+
+## Acceptance Criteria
+
+- [ ] `crates/wraith-core/src/server/mod.rs` re-exports all server components
+- [ ] `ServeOptions` struct with fields matching server.md CLI interface: `key`, `authorized_keys`, `cert_authority`, `transport_mode`, `listen_addr`, `tls_cert`, `tls_key`, `acme_domain`, `stealth`, `proxy`, `iroh_relay`, `max_connections_per_ip`, `max_auth_attempts`
+- [ ] `Server::new(opts: ServeOptions) -> Result<Server>` — creates server with bound acceptor, auth config, rate limiter
+- [ ] `Server::run()` — enters accept loop, for each connection: check rate limit → create handler → `run_stream()`
+- [ ] Stealth mode integration: if enabled, protocol detection before `run_stream()`
+- [ ] Graceful shutdown: `Server::shutdown()` method and signal handler (SIGTERM/SIGINT)
+  - Stop accepting new connections
+  - Send SSH disconnect to active sessions
+  - Wait for drain timeout (~2 seconds per session)
+  - Forcibly terminate remaining connections
+- [ ] iroh mode: prints endpoint ID on startup
+- [ ] `ServeOptions::key` and `ServeOptions::authorized_keys` accept `KeySource` (file or in-memory)
+- [ ] Integration test: start server, client connects via mock transport, session works, shutdown completes
+
+## References
+
+- docs/architecture/server.md — full server spec including graceful shutdown
+- docs/architecture/decisions/011-no-ssh-config-programmatic-api.md — ServeOptions programmatic struct
+
+## Notes
+
+> To be filled by implementation agent
+
+## Summary
+
+> To be filled on completion
--- a/tasks/server/stealth-mode.md
+++ b/tasks/server/stealth-mode.md
@@ -0,0 +1,50 @@
+---
+id: server/stealth-mode
+name: Implement stealth mode — protocol multiplexing on port 443 (ADR-017)
+status: pending
+depends_on:
+  - transport/tls-transport
+  - server/handler
+scope: narrow
+risk: medium
+impact: component
+level: implementation
+---
+
+## Description
+
+Implement stealth mode per ADR-017. When `--stealth` is enabled alongside TLS transport on port 443:
+
+1. After completing the TLS handshake, peek at the first bytes of the connection
+2. If the connection starts with `SSH-2.0-`, proceed with `russh::server::run_stream()`
+3. If the connection starts with anything else (HTTP, random data), respond with `HTTP/1.1 404 Not Found\r\nServer: nginx\r\n\r\n` and close
+
+This makes the server appear as an nginx web server returning 404 errors to non-SSH connections, so that port scanners and DPI systems see what looks like a regular HTTPS site.
+
+Stealth mode requires TLS transport. The CLI should reject or warn if `--stealth` is used without `--transport tls`.
+
+## Acceptance Criteria
+
+- [ ] `crates/wraith-core/src/server/stealth.rs` exports stealth mode protocol detection
+- [ ] `detect_protocol(stream: TlsStream) -> ProtocolDetection` — peeks at first bytes to determine SSH vs HTTP
+- [ ] `ProtocolDetection` enum: `Ssh`, `Http` (or `Unknown`)
+- [ ] If SSH detected: pass stream to `russh::server::run_stream()`
+- [ ] If HTTP/unknown detected: write `HTTP/1.1 404 Not Found\r\nServer: nginx\r\n\r\n` then close
+- [ ] Peek uses `tokio::io::BufReader` or similar buffered read to avoid consuming the SSH banner bytes
+- [ ] Integration with `TlsAcceptor` flow: after accept + TLS handshake, optionally run protocol detection before passing to russh
+- [ ] Stealth mode flag validated: requires TLS transport, warn/reject otherwise
+- [ ] Unit tests: SSH banner detection, HTTP request detection, random data → fake nginx 404
+- [ ] Integration test: stealth server responds to HTTP scanner with 404, SSH client connects successfully
+
+## References
+
+- docs/architecture/server.md — Stealth Mode section
+- docs/architecture/decisions/017-stealth-mode-protocol-multiplexing.md — protocol multiplexing design
+
+## Notes
+
+> To be filled by implementation agent
+
+## Summary
+
+> To be filled on completion