Resolve all architecture open questions, add 13 ADRs, update specs
Resolved all 11 open questions based on project guidance:

Transport:
- OQ-01/OQ-07: ACME/Let's Encrypt with domain + IP paths (ADR-008)
- OQ-02: Default to n0 relay, --iroh-relay override (ADR-009)
- OQ-05: Transport chaining supported natively (ADR-010)

Client:
- OQ-06: Programmatic-first API, no ~/.ssh/config (ADR-011)

Server:
- OQ-04: Ed25519 + OpenSSH cert-authority, no password auth (ADR-012)
- OQ-08: fail2ban-friendly logging + built-in rate limiting (ADR-013)

TUN:
- OQ-03/OQ-09: Deferred entirely, recommend tun2proxy (ADR-014)
- tun-shim.md marked deprecated

NAPI:
- OQ-10: Expose both connect() and serve() (ADR-016)
- OQ-11: Use napi-rs for FFI bridge (ADR-015)

Additional ADRs created during review:
- ADR-006: No logging of tunnel destinations (was phantom reference)
- ADR-017: Stealth mode protocol multiplexing
- ADR-018: Control channel for pubsub over SSH

Fixed: ADR-002 status → Superseded, ADR-007 title typo, WRAUTH_SERVER typo,
ADR-005 stale wraith-tun refs, undefined ACL feature removed from server.md,
--proxy semantic difference documented.
@@ -9,85 +9,85 @@ last_updated: 2026-06-01
 
 ### OQ-01: TLS certificate management strategy
 - **Origin**: [server.md](server.md)
-- **Status**: open
-- **Priority**: medium
-- **Details**: Should the server support ACME/Let's Encrypt auto-provisioning (like https_proxy does), or is manual cert management sufficient? Auto-provisioning is more user-friendly but adds complexity and a dependency on the ACME protocol. Self-signed certs with `--insecure` flag on the client side covers the simple case.
-- **Cross-references**: Server spec, TlsTransport implementation
+- **Status**: ~~resolved~~
+- **Priority**: ~~medium~~ —
+- **Resolution**: ADR-008 — Support both domain-based and IP-based ACME/Let's Encrypt auto-provisioning, plus manual certs. Domain-based uses standard certbot-style flow with HTTP-01/TLS-ALPN-01 challenges. IP-based uses short-lived certs via TLS-ALPN-01 on port 443. Manual certs via `--tls-cert`/`--tls-key` always supported. Implementation uses `rustls-acme` or similar pure-Rust ACME client.
+- **Cross-references**: [ADR-008](decisions/008-acme-lets-encrypt.md), Server spec, TlsTransport implementation
 
 ### OQ-02: iroh relay configuration defaults
 - **Origin**: [transport.md](transport.md)
-- **Status**: open
-- **Priority**: low
-- **Details**: Should the default iroh relay be n0's free servers, or should users be required to specify one? n0's relay is convenient for testing and quick start but creates a dependency. Self-hosted relay is better for production. Consider: default to n0, allow `--iroh-relay` override, and document self-hosting.
-- **Cross-references**: Transport spec, iroh docs
+- **Status**: ~~resolved~~
+- **Priority**: ~~low~~ —
+- **Resolution**: ADR-009 — Default to n0's free relay servers. Allow override via `--iroh-relay <url>`. Document self-hosted relay setup. This matches iroh's own defaults and minimizes friction for testing/development.
+- **Cross-references**: [ADR-009](decisions/009-default-iroh-relay.md), Transport spec
 
 ### OQ-05: Transport chaining support in CLI
 - **Origin**: [transport.md](transport.md)
-- **Status**: open
-- **Priority**: low
-- **Details**: Should `--transport iroh --proxy socks5://...` be supported natively, or should chaining be a manual configuration thing? The iroh transport's `connect()` method would need to route its outbound through the proxy. This is possible (iroh's `Endpoint::builder` supports proxy configuration) but adds CLI complexity. Consider: defer to Phase 2.
-- **Cross-references**: Transport spec
+- **Status**: ~~resolved~~
+- **Priority**: ~~low~~ —
+- **Resolution**: ADR-010 — Support `--transport iroh --proxy socks5://...` natively in the CLI. iroh's endpoint builder accepts proxy configuration directly, so the implementation is minimal. Other transport combinations (TCP+TLS) are already implicit.
+- **Cross-references**: [ADR-010](decisions/010-transport-chaining-cli.md), Transport spec
 
 ## Client
 
 ### OQ-06: SSH config file parsing
 - **Origin**: [client.md](client.md)
-- **Status**: open
-- **Priority**: low
-- **Details**: Should the client read `~/.ssh/config` for default host/key/port settings? russh-config crate exists and can parse this. Would reduce CLI verbosity for frequent connections. Consider: `--config` flag to read from a wraith-specific config instead, avoiding OpenSSH config parsing complexity.
-- **Cross-references**: Client spec
+- **Status**: ~~resolved~~
+- **Priority**: ~~low~~ —
+- **Resolution**: ADR-011 — No `~/.ssh/config` parsing, no custom config file. Configuration is programmatic-first: CLI flags, library API structs (`ConnectOptions`, `ServeOptions`), and environment variables. Cross-platform path issues (`~` expansion) are avoided. The library API is the primary interface; if config files are needed later, they can be a separate layer.
+- **Cross-references**: [ADR-011](decisions/011-no-ssh-config-programmatic-api.md), Client spec
 
 ## Server
 
 ### OQ-07: ACME/Let's Encrypt support
 - **Origin**: [server.md](server.md)
-- **Status**: open
-- **Priority**: medium
-- **Details**: Auto-provisioning TLS certs from Let's Encrypt would make TLS mode much easier to set up. But it requires port 80 or port 443 + TLS-ALPN-01 challenge support, and a persistent cert store. Consider: defer to Phase 2, document manual cert setup for MVP.
-- **Cross-references**: Server spec, TlsTransport
+- **Status**: ~~resolved~~
+- **Priority**: ~~medium~~ —
+- **Resolution**: ADR-008 — Same resolution as OQ-01. Both domain-based (standard, domain-bound, auto-renewing) and IP-based (short-lived, no domain required) ACME flows are supported. The domain-based path requires port 80 or DNS access for challenges. The IP-based path uses TLS-ALPN-01 on port 443 and requires the ACME client to run continuously.
+- **Cross-references**: [ADR-008](decisions/008-acme-lets-encrypt.md), Server spec, TlsTransport
 
 ### OQ-08: Connection limits and rate limiting
 - **Origin**: [server.md](server.md)
-- **Status**: open
-- **Priority**: low
-- **Details**: Should the server support configurable connection limits, rate limiting, and max simultaneous channels? Useful for preventing abuse on public-facing servers. Consider: `--max-connections` and `--max-channels-per-connection` flags.
-- **Cross-references**: Server spec
+- **Status**: ~~resolved~~
+- **Priority**: ~~low~~ —
+- **Resolution**: ADR-013 — Two-layer approach: (1) Structured logging of auth attempts and connections at INFO level for fail2ban integration on Linux — matches our production fail2ban setup with nftables and systemd journal. (2) Built-in rate limiting: `--max-connections-per-ip` and `--max-auth-attempts` flags providing platform-independent abuse protection.
+- **Cross-references**: [ADR-013](decisions/013-fail2ban-friendly-logging.md), Server spec, Production fail2ban docs
 
 ### OQ-04: Authentication beyond Ed25519 keys
 - **Origin**: [client.md](client.md), [server.md](server.md)
-- **Status**: open
-- **Priority**: low
-- **Details**: Should password authentication be supported? Should SSH certificates (OpenSSH cert-authority) be supported? Password auth is convenient but less secure. Certificates are useful for large-scale deployments. Consider: password auth as optional flag (`--allow-password`), certificates as future feature.
-- **Cross-references**: Client spec, Server spec
+- **Status**: ~~resolved~~
+- **Priority**: ~~low~~ —
+- **Resolution**: ADR-012 — Ed25519 public key (default, unchanged) + OpenSSH certificate authority support (new, important for multi-user). No password authentication over SSH channels. If a local SOCKS5 proxy needs its own auth, that's a separate concern. Cert-authority makes multi-user management practical: one CA entry in `authorized_keys` instead of N individual keys. Certificates support expiry and restrictions.
+- **Cross-references**: [ADR-012](decisions/012-auth-ed25519-and-cert-authority.md), Client spec, Server spec
 
 ## TUN
 
 ### OQ-03: Windows TUN support scope
 - **Origin**: [tun-shim.md](tun-shim.md)
-- **Status**: open
-- **Priority**: low
-- **Details**: tun-rs supports Windows via wintun.dll but distributing a DLL adds complexity. Consider: Linux and macOS only for MVP, Windows as a follow-up.
-- **Cross-references**: tun-shim.md
+- **Status**: ~~resolved~~
+- **Priority**: ~~low~~ —
+- **Resolution**: ADR-014 — TUN is deferred entirely from the wraith project. For VPN-like behavior, users run `tun2proxy --proxy socks5://127.0.0.1:1080` alongside wraith. This eliminates all TUN-related scope questions (Windows, TCP reconstruction, etc.).
+- **Cross-references**: [ADR-014](decisions/014-defer-tun-recommend-socks5-proxy.md)
 
 ### OQ-09: TCP reconstruction approach for TUN
 - **Origin**: [tun-shim.md](tun-shim.md)
-- **Status**: open
-- **Priority**: medium
-- **Details**: Should the TUN shim use a userspace TCP stack (like smoltcp or tun2proxy's ip-stack) for reliable TCP reconstruction, or forward raw IP packets through SOCKS5? Raw packet forwarding requires handling segmentation, retransmission, and reordering. Userspace TCP solves this but is more code. Consider: start with SOCKS5 proxying (each TUN packet becomes a SOCKS5 connection) and add TCP reconstruction if needed.
-- **Cross-references**: tun-shim.md
+- **Status**: ~~resolved~~
+- **Priority**: ~~medium~~ —
+- **Resolution**: ADR-014 — TUN is deferred from wraith. tun2proxy (external tool) handles this if users need VPN-like behavior.
+- **Cross-references**: [ADR-014](decisions/014-defer-tun-recommend-socks5-proxy.md)
 
 ## NAPI / PubSub
 
 ### OQ-10: NAPI wrapper API surface
 - **Origin**: [napi-and-pubsub.md](napi-and-pubsub.md)
-- **Status**: open
-- **Priority**: medium
-- **Details**: Should the NAPI wrapper expose just a `connect()` function returning a `Duplex` stream, or also expose `serve()` for server-side use from Node.js? Server-side would enable running a wraith server from a Node.js process. Consider: `connect()` only for MVP, `serve()` as follow-up.
-- **Cross-references**: napi-and-pubsub.md
+- **Status**: ~~resolved~~
+- **Priority**: ~~medium~~ —
+- **Resolution**: ADR-016 — Expose both `connect()` and `serve()` from the start. Both are fundamental operations needed by the pubsub event target system (spokes use `connect()`, hubs could use `serve()`). The NAPI layer is transport-agnostic — it doesn't know about pubsub's `EventEnvelope`. The pubsub adapter wraps the `Duplex` stream. This ensures the NAPI wrapper is reusable for any stream-based protocol, not tied specifically to pubsub.
+- **Cross-references**: [ADR-016](decisions/016-napi-expose-connect-and-serve.md), napi-and-pubsub.md
 
 ### OQ-11: napi-rs vs uniffi for FFI bridge
 - **Origin**: [napi-and-pubsub.md](napi-and-pubsub.md)
-- **Status**: open
-- **Priority**: low
-- **Details**: napi-rs is the standard for Node.js native addons and has the best ecosystem. uniffi supports more targets (Python, Swift, Kotlin) but is less mature for Node.js. Since the primary consumer is TypeScript/Node.js (pubsub/operations ecosystem), napi-rs is the logical choice. But if future Python or mobile consumers are anticipated, uniffi could be worth the investment.
-- **Cross-references**: napi-and-pubsub.md
+- **Status**: ~~resolved~~
+- **Priority**: ~~low~~ —
+- **Resolution**: ADR-015 — Use napi-rs. It's the standard for Node.js native addons, matches our primary consumer (TypeScript/Node.js), and has the best ecosystem and documentation. If future Python or mobile consumers are needed, a separate uniffi layer can be added — the Rust core doesn't change.
+- **Cross-references**: [ADR-015](decisions/015-napi-rs-for-ffi-bridge.md), napi-and-pubsub.md
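
Review note on the OQ-08 resolution: the built-in rate limiting behind `--max-connections-per-ip` could be tracked roughly as in the sketch below. This is only an illustration under stated assumptions. `IpLimiter`, `try_acquire`, and `release` are hypothetical names that do not come from the wraith codebase or from ADR-013, the sketch covers only the per-IP connection cap (not `--max-auth-attempts`), and a real server would need to share the limiter across tasks behind a mutex or similar.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Hypothetical per-IP connection limiter (illustrative name and API,
/// not part of wraith). Tracks how many connections each IP holds open.
pub struct IpLimiter {
    max_per_ip: usize,
    active: HashMap<IpAddr, usize>,
}

impl IpLimiter {
    /// `max_per_ip` would be fed from a flag such as `--max-connections-per-ip`.
    pub fn new(max_per_ip: usize) -> Self {
        Self { max_per_ip, active: HashMap::new() }
    }

    /// Records a new connection for `ip` and returns true if the IP is
    /// still under its cap; returns false (and records nothing) otherwise.
    pub fn try_acquire(&mut self, ip: IpAddr) -> bool {
        let count = self.active.get(&ip).copied().unwrap_or(0);
        if count >= self.max_per_ip {
            return false;
        }
        self.active.insert(ip, count + 1);
        true
    }

    /// Frees one slot for `ip`; intended to be called when a connection closes.
    /// Entries are removed at zero so the map does not grow without bound.
    pub fn release(&mut self, ip: IpAddr) {
        if let Some(count) = self.active.get_mut(&ip) {
            *count -= 1;
            if *count == 0 {
                self.active.remove(&ip);
            }
        }
    }
}
```

In an accept loop, the server would call `try_acquire` with the peer address before starting the SSH handshake, drop the socket on `false`, and call `release` when the session ends. Rejections could be logged at INFO level so the fail2ban layer described in the same resolution sees them too.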