diff --git a/docs/architecture/README.md b/docs/architecture/README.md new file mode 100644 index 0000000..a3c1ac6 --- /dev/null +++ b/docs/architecture/README.md @@ -0,0 +1,61 @@ +--- +status: draft +last_updated: 2026-06-11 +--- + +# Reverse Proxy — Architecture + +## Current State + +**Phase 0 (Exploration) — Complete.** Phase 1 (Architecture) — In progress. + +This project replaces our vulnerable nginx 1.24.0 installation with a +memory-safe Rust/axum reverse proxy. The primary motivation is CVE-2026-42945 +(unauthenticated RCE in nginx's rewrite module) and the broader pattern of +memory corruption bugs in nginx's C codebase. + +## Architecture Documents + +| Document | Status | Description | +|----------|--------|-------------| +| [overview.md](overview.md) | Draft | Vision, scope, crate dependencies, exports | +| [proxy.md](proxy.md) | Draft | Reverse proxy handler, request flow, header injection | +| [tls.md](tls.md) | Draft | TLS termination, ACME, manual certs, SNI | +| [config.md](config.md) | Draft | TOML config format, static/dynamic split, ArcSwap reload | +| [operations.md](operations.md) | Draft | Rate limiting, logging, health check, systemd, shutdown | + +## ADR Table + +| ADR | Title | Status | +|-----|-------|--------| +| [001](decisions/001-rust-axum.md) | Rust with Axum | Accepted | +| [002](decisions/002-custom-proxy-handler.md) | Custom Proxy Handler | Accepted | +| [003](decisions/003-toml-config.md) | TOML Configuration Format | Accepted | +| [004](decisions/004-rustls-acme.md) | ACME-Primary Certificate Management | Accepted | +| [005](decisions/005-tokio-rustls-direct.md) | tokio-rustls Directly, Not axum-server | Accepted | +| [006](decisions/006-rate-limiting-approach.md) | Token Bucket Rate Limiting | Accepted | +| [007](decisions/007-custom-log-format.md) | Custom Structured Log Format | Accepted | +| [008](decisions/008-static-dynamic-config-split.md) | Static/Dynamic Config Split with ArcSwap | Accepted | +| 
[009](decisions/009-signal-handling.md) | Signal Handling Strategy | Accepted | + +## Open Questions + +See [open-questions.md](open-questions.md) for the full tracker. + +| OQ | Question | Priority | Status | +|----|----------|----------|--------| +| OQ-01 | Should cipher suites be restricted beyond rustls defaults? | medium | open | +| ~~OQ-02~~ | ~~What log format should fail2ban consume?~~ | ~~high~~ | **resolved** (ADR-007) | +| OQ-03 | Should the health check endpoint be on a separate port? | low | open | +| OQ-04 | Config reload: SIGHUP only or also Unix socket API? | low | open | +| OQ-05 | Should the proxy bind to multiple addresses? | low | open | +| OQ-06 | Should upstream timeouts be configurable per-site? | low | open | + +## Document Lifecycle + +| Status | Meaning | Transitions | +|--------|---------|-------------| +| `draft` | Under active development. May change significantly. | → `reviewed` when open questions are resolved | +| `reviewed` | Architecture is final. Implementation may begin. | → `stable` when implementation is complete | +| `stable` | Locked. Changes require review and may warrant an ADR. | → `deprecated` when superseded | +| `deprecated` | Superseded. Kept for reference. | Removed when no longer referenced | \ No newline at end of file diff --git a/docs/architecture/config.md b/docs/architecture/config.md new file mode 100644 index 0000000..ab84267 --- /dev/null +++ b/docs/architecture/config.md @@ -0,0 +1,206 @@ +--- +status: draft +last_updated: 2026-06-11 +--- + +# Configuration + +## What It Is + +The configuration system defines how the proxy is configured, how configuration +is loaded, and how dynamic configuration can be reloaded without restarting the +process. + +## Why It Exists + +The proxy needs to be configurable without hard-coding domains, upstream +addresses, or TLS settings. 
The configuration system separates immutable +startup parameters (bind addresses, TLS mode) from runtime-adjustable +parameters (site definitions, rate limits) using the `ArcSwap` pattern proven +in the alknet project. + +## Architecture + +``` +config.toml + │ + ▼ +┌──────────────────────┐ +│ serde::Deserialize │ +│ (TOML → Config) │ +└──────────┬───────────┘ + │ + ▼ +┌──────────────────────┐ ┌──────────────────────┐ +│ StaticConfig │ │ DynamicConfig │ +│ (immutable) │ │ (hot-reloadable) │ +│ │ │ │ +│ bind_addr │ │ sites[] │ +│ http_port │ │ rate_limit │ +│ https_port │ │ body_limit │ +│ tls.mode │ │ proxy_headers │ +│ tls.acme_domain │ │ │ +│ tls.cert_path │ │ ← ArcSwap → │ +│ tls.key_path │ │ ConfigReloadHandle │ +│ tls.cache_dir │ │ .reload(new_config) │ +│ log_level │ │ │ +│ log_format │ └───────────────────────┘ +└──────────────────────┘ +``` + +## Static vs Dynamic Configuration + +This split follows the pattern established in alknet (ADR-030) and adapted +for our simpler use case. + +### StaticConfig + +Immutable after startup. Changes require a process restart. 
+ +| Field | Type | Description | +|-------|------|-------------| +| `bind_addr` | `String` | IP address to bind to (e.g., `"15.235.125.95"`) | +| `http_port` | `u16` | Port for HTTP→HTTPS redirect (default: `80`; set to `0` to disable) | +| `https_port` | `u16` | Port for TLS listener (default: `443`) | +| `tls.mode` | `"acme"` or `"manual"` | Certificate provisioning mode | +| `tls.acme_domain` | `String` | Domain for ACME (ACME mode only) | +| `tls.acme_cache_dir` | `String` | ACME state cache directory | +| `tls.acme_directory` | `"production"` or `"staging"` | Let's Encrypt directory | +| `tls.cert_path` | `String` | Certificate file path (manual mode only) | +| `tls.key_path` | `String` | Private key file path (manual mode only) | +| `log_level` | `"trace"`, `"debug"`, `"info"`, `"warn"`, `"error"` | Logging verbosity | +| `log_format` | `"text"` or `"json"` | Log output format | + +**Why these are static:** Changing bind addresses, ports, or TLS mode requires +creating new listeners and TLS configurations — operations that fundamentally +require a restart. There's no safe way to change these at runtime. + +### DynamicConfig + +Hot-reloadable at runtime via `ArcSwap`. Changes take effect for new +connections immediately. 
+ +| Field | Type | Description | +|-------|------|-------------| +| `sites` | `Vec<SiteConfig>` | Site definitions (hostname → upstream mapping) | +| `rate_limit.requests_per_second` | `u32` | Rate limit per IP (global in Phase 1) | +| `rate_limit.burst` | `u32` | Burst capacity (global in Phase 1) | +| `body_limit_bytes` | `u64` | Max request body size in bytes (global in Phase 1) | + +**SiteConfig:** + +| Field | Type | Description | +|-------|------|-------------| +| `host` | `String` | Hostname to match (e.g., `"git.alk.dev"`) | +| `upstream` | `String` | Upstream address (e.g., `"127.0.0.1:3000"`) | +| `upstream_scheme` | `"http"` or `"https"` | Protocol for upstream connection (default: `"http"`) | + +**Why these are dynamic:** Site definitions and rate limits are per-request +concerns. Adding a site or changing a rate limit should not require restarting +the proxy and dropping active connections. Rate limits and body limits are +global settings in Phase 1; per-site configuration for these may be added in +Phase 2. + +## Config Reload + +### ArcSwap Pattern + +`DynamicConfig` is wrapped in `Arc<ArcSwap<DynamicConfig>>`. This provides: + +- **Lock-free reads**: Every handler reads the current config via a single + `Arc` dereference, with no lock contention on the request hot path. +- **Atomic writes**: `ConfigReloadHandle::reload(new_config)` swaps the entire + config atomically. All new requests see the new config immediately. +- **No partial updates**: The entire config is swapped at once. There's no risk + of reading a half-updated config. + +See [ADR-008](decisions/008-static-dynamic-config-split.md) for the rationale +behind this split. + +### Reload Trigger + +The initial implementation uses SIGHUP as the reload trigger. When the process +receives SIGHUP: + +1. Re-read the config file from disk +2. Deserialize into `DynamicConfig` +3. Validate (checking upstream reachability is optional) +4. 
Call `ConfigReloadHandle::reload(new_config)` + +Future implementations could add a Unix domain socket API or HTTP endpoint for +config reload, but SIGHUP is sufficient for Phase 1. + +## TOML Config Format + +```toml +# reverse-proxy config + +[server] +bind_addr = "15.235.125.95" +http_port = 80 +https_port = 443 + +[server.tls] +mode = "acme" # "acme" or "manual" +acme_domain = "git.alk.dev" +acme_cache_dir = "/var/lib/reverse-proxy/acme-cache" +acme_directory = "production" # "production" or "staging" + +# Manual mode (uncomment and comment out ACME settings) +# mode = "manual" +# cert_path = "/etc/letsencrypt/live/git.alk.dev/fullchain.pem" +# key_path = "/etc/letsencrypt/live/git.alk.dev/privkey.pem" + +[server.logging] +level = "info" +format = "text" # "text" or "json" + +[rate_limit] +requests_per_second = 10 +burst = 20 + +[body] +limit_bytes = 104857600 # 100 MB + +[[sites]] +host = "git.alk.dev" +upstream = "127.0.0.1:3000" +upstream_scheme = "http" +``` + +### Validation + +On startup, the config is validated: + +1. `bind_addr` is not `0.0.0.0` (must be explicit) +2. In ACME mode, `acme_domain` must be set +3. In manual mode, `cert_path` and `key_path` must both be set and the files + must be readable +4. Each site must have a `host` and `upstream` +5. `rate_limit.requests_per_second` must be > 0 +6. `body.limit_bytes` must be > 0 + +On SIGHUP reload, the same validation applies. If the new config fails +validation, the reload is rejected and the old config remains active. An error +is logged. + +**On startup**: If config validation fails, the process exits with a non-zero +code and logs the validation errors. The proxy will not start with an invalid +configuration. + +## Design Decisions + +All design decisions are documented as ADRs in [decisions/](decisions/). 
+ +| ADR | Decision | Summary | +|-----|----------|---------| +| [003](decisions/003-toml-config.md) | TOML configuration format | Rust-native, unambiguous, excellent serde support | +| [008](decisions/008-static-dynamic-config-split.md) | Static/dynamic config split | Immutable StaticConfig, hot-reloadable DynamicConfig via ArcSwap | + +## Open Questions + +Open questions are tracked in [open-questions.md](open-questions.md). Key +questions affecting this document: + +- **OQ-04**: Should config reload support a Unix domain socket API in addition + to SIGHUP? (open) \ No newline at end of file diff --git a/docs/architecture/decisions/001-rust-axum.md b/docs/architecture/decisions/001-rust-axum.md new file mode 100644 index 0000000..e8a2e36 --- /dev/null +++ b/docs/architecture/decisions/001-rust-axum.md @@ -0,0 +1,61 @@ +# ADR-001: Rust with Axum + +## Status + +Accepted + +## Context + +Our current nginx 1.24.0 installation is vulnerable to multiple actively-exploited +CVEs, most critically CVE-2026-42945 (CVSS 9.2, unauthenticated RCE via +`ngx_http_rewrite_module`). Six of seven recent nginx CVEs are memory corruption +bugs (buffer overflow, use-after-free, buffer overread) — the exact class of +vulnerabilities that Rust eliminates by construction. + +The threat landscape is worsening: LLM-assisted fuzzing is accelerating bug +discovery in nginx's C codebase, and security researchers report additional +undisclosed vulnerabilities. + +We need to replace nginx with a memory-safe alternative that can handle: +- TLS termination +- HTTP reverse proxying to backend services +- Rate limiting with fail2ban-compatible logging +- Operational simplicity (single binary, systemd integration) + +## Decision + +Use Rust with the axum web framework for the reverse proxy implementation. 
+ +**Rust** provides: +- Memory safety by construction (no buffer overflows, use-after-free, or + double-free at runtime) +- rustls (pure Rust TLS) avoids OpenSSL dependency and its CVE history +- Single static binary deployment with no runtime dependencies +- Excellent async I/O support via tokio + +**axum** provides: +- Ergonomic handler definitions with extractors +- Tower middleware ecosystem (Service trait, layers) +- Type-safe routing and state management +- Well-maintained, widely used, good documentation + +## Consequences + +**Positive:** +- Eliminates the entire class of memory corruption vulnerabilities affecting + nginx +- Single binary deployment simplifies operations +- Rust's type system catches many errors at compile time +- axum + tower provides composable middleware + +**Negative:** +- Smaller ecosystem than nginx for HTTP proxy features (but our use case is + simple) +- We maintain the code (vs. using a battle-tested C project) +- Less granular control over HTTP/2 and connection pooling compared to nginx +- Team needs Rust expertise (already available) + +## References + +- [threat-landscape.md](../../research/threat-landscape.md) +- [overview.md](../overview.md) \ No newline at end of file diff --git a/docs/architecture/decisions/002-custom-proxy-handler.md b/docs/architecture/decisions/002-custom-proxy-handler.md new file mode 100644 index 0000000..159a509 --- /dev/null +++ b/docs/architecture/decisions/002-custom-proxy-handler.md @@ -0,0 +1,56 @@ +# ADR-002: Custom Proxy Handler + +## Status + +Accepted + +## Context + +We need to implement HTTP reverse proxying — receiving requests and forwarding +them to an upstream service (Gitea on localhost:3000). Two approaches are +available: + +1. **`axum-reverse-proxy` crate**: Provides path-based routing, header + forwarding, round-robin load balancing, TLS support, retry mechanisms, and + RFC 9110 compliance. +2. 
**Custom handler** (Felix Knorr pattern): Build a handler using hyper's + `Client` to forward requests. ~50-100 lines of Rust for our needs. + +Our use case is minimal: single upstream per domain, single domain, no load +balancing, no retry, no HTTP/2 proxying. + +## Decision + +Implement a custom proxy handler using hyper's `Client` for request forwarding, +following the pattern demonstrated by Felix Knorr and used in the alknet +project's channel proxy. + +## Rationale + +- `axum-reverse-proxy` adds complexity we don't need (load balancing, retry, + path-based routing to multiple backends) +- Our proxy case is the simplest possible: match a Host header, forward the + entire request to a single upstream, stream the response back +- The Felix Knorr pattern is proven, idiomatic, and ~50-100 lines +- We maintain full control over header injection, error handling, and upstream + connection behavior +- If requirements grow, we can adopt `axum-reverse-proxy` later + +## Consequences + +**Positive:** +- Minimal dependencies +- Full control over proxy behavior +- Easy to understand and audit (~100 lines of proxy code) +- No unnecessary abstraction layers + +**Negative:** +- We implement and maintain proxy logic ourselves (but it's trivial for our + use case) +- If requirements grow to load balancing or retry, we'd need to add that + ourselves or switch to `axum-reverse-proxy` + +## References + +- [proxy.md](../proxy.md) +- Felix Knorr, "Replacing nginx with axum" (felix-knorr.net/posts/2024-10-13-replacing-nginx-with-axum.html) \ No newline at end of file diff --git a/docs/architecture/decisions/003-toml-config.md b/docs/architecture/decisions/003-toml-config.md new file mode 100644 index 0000000..e2181db --- /dev/null +++ b/docs/architecture/decisions/003-toml-config.md @@ -0,0 +1,44 @@ +# ADR-003: TOML Configuration Format + +## Status + +Accepted + +## Context + +The proxy needs a configuration file format for defining sites, TLS settings, +bind addresses, and rate 
limits. Options include TOML, YAML, JSON, and custom +binary formats. + +## Decision + +Use TOML as the configuration file format. + +## Rationale + +- **Rust-native**: TOML is the configuration format for Cargo (Rust's package + manager). The Rust ecosystem has first-class TOML support via `serde` + + `toml` crate. +- **Unambiguous**: TOML has a single canonical representation for any given + data structure, unlike YAML which has multiple equivalent representations and + surprising type coercion rules (e.g., `no` → boolean, `1.0` → float). +- **Human-friendly**: TOML is easy to read and write for simple configurations + like ours. It supports sections (tables), arrays, and inline tables. +- **Good error messages**: The `toml` crate provides clear deserialization + error messages pointing to the exact field that failed. + +## Consequences + +**Positive:** +- Familiar to Rust developers (Cargo.toml) +- Clear, unambiguous syntax +- Excellent serde integration with detailed error reporting +- No type coercion surprises + +**Negative:** +- Not as widely used for config outside Rust (but our audience is ourselves) +- No `#include` or file composition (each config file is self-contained) + +## References + +- [config.md](../config.md) \ No newline at end of file diff --git a/docs/architecture/decisions/004-rustls-acme.md b/docs/architecture/decisions/004-rustls-acme.md new file mode 100644 index 0000000..f5a5a90 --- /dev/null +++ b/docs/architecture/decisions/004-rustls-acme.md @@ -0,0 +1,67 @@ +# ADR-004: ACME-Primary Certificate Management + +## Status + +Accepted + +## Context + +The proxy needs TLS certificates for HTTPS. Two approaches are available: + +1. **certbot (external ACME client)**: Run certbot as a cron job or systemd + timer to obtain and renew certificates. The proxy loads certificates from + files on disk. Renewal requires either SIGHUP/restart or inotify file + watching to pick up new certs. + +2. 
**rustls-acme (built-in ACME client)**: The proxy handles ACME + certificate provisioning and renewal internally as a background task. No + external certbot dependency. The `ResolvesServerCertAcme` cert resolver + automatically serves the correct certificate and updates when renewed. + +The alknet project has successfully implemented the rustls-acme approach, and +its patterns are directly reusable. + +## Decision + +Use `rustls-acme` as the primary certificate management mode, with manual +certificate paths as a fallback mode for testing, self-signed certs, and +corporate CA environments. + +## Rationale + +- **Eliminates certbot dependency**: No external cron job, no deploy hooks, no + certbot package to install and maintain. The proxy is self-contained. +- **Automatic renewal**: `rustls-acme` runs as a background tokio task that + handles certificate provisioning and renewal automatically (~30 days before + expiry). +- **No restart needed**: When `rustls-acme` provisions a new certificate, the + `ResolvesServerCertAcme` resolver updates atomically. No SIGHUP, no restart, + no file watching. +- **Proven pattern**: alknet uses the same approach successfully. +- **Cache persistence**: `DirCache` persists ACME state between restarts, + avoiding re-provisioning. +- **Fallback mode**: Manual cert paths are still supported for environments + where ACME is not possible. 
+ +## Consequences + +**Positive:** +- Single binary deployment (no certbot dependency) +- Zero-downtime certificate renewal +- Simpler operational model (no certbot cron, no deploy hooks) +- Proven in alknet + +**Negative:** +- `rustls-acme` is an additional dependency +- ACME challenges require either port 80 (HTTP-01) or TLS-ALPN-01 on port 443, + which our proxy already listens on +- Less control over certificate issuance compared to certbot (e.g., no DNS-01 + challenge support, though rustls-acme supports TLS-ALPN-01 which is sufficient + for our use case) +- Manual mode requires restart for cert changes (acceptable for fallback) + +## References + +- [tls.md](../tls.md) +- alknet ADR-008: ACME/Let's Encrypt decision +- `rustls-acme` crate: https://github.com/FlorianUekermann/rustls-acme \ No newline at end of file diff --git a/docs/architecture/decisions/005-tokio-rustls-direct.md b/docs/architecture/decisions/005-tokio-rustls-direct.md new file mode 100644 index 0000000..45d1b7f --- /dev/null +++ b/docs/architecture/decisions/005-tokio-rustls-direct.md @@ -0,0 +1,65 @@ +# ADR-005: tokio-rustls Directly, Not axum-server + +## Status + +Accepted + +## Context + +We need to serve HTTPS (TLS) traffic through axum. Two approaches exist for +integrating TLS with axum: + +1. **`axum-server`**: A wrapper that provides TLS support for axum via + `tls_rustls` feature. Handles TCP binding, TLS accept, and passing TLS + streams to axum. Simple API but limited control over the TLS configuration. + +2. **`tokio-rustls` directly**: Bind TCP manually, perform TLS handshake with + `TlsAcceptor`, then serve the TLS stream to axum/hyper. More code but full + control over `ServerConfig`, cipher suites, ALPN protocols, and cert + resolvers. + +The alknet project uses tokio-rustls directly and has proven this pattern for +both manual and ACME certificate management. 
+ +## Decision + +Use `tokio-rustls` directly for TLS termination, with `hyper` serving the +resulting TLS streams to axum. Do not use `axum-server`. + +## Rationale + +- **ACME integration**: The `rustls-acme` `ResolvesServerCertAcme` resolver + needs to be set as the certificate resolver on `ServerConfig` via + `with_cert_resolver()`. `axum-server` does not expose this level of control + over the `ServerConfig`. +- **Cipher suite control**: We may need to configure cipher suites beyond the + defaults (see OQ-01). `axum-server` wraps the `ServerConfig` construction + and may not expose `CryptoProvider` configuration. Direct `tokio-rustls` + usage gives us full control. +- **ALPN configuration**: ACME TLS-ALPN-01 challenge requires adding + `acme-tls/1` to the ALPN protocol list. This is only possible with direct + `ServerConfig` access. +- **Proven pattern**: alknet uses exactly this approach (`TlsAcceptor` wrapping + `tokio-rustls`, with manual or ACME `ServerConfig` construction). +- **No abstraction cost**: The code to bind TCP, accept TLS, and serve to + axum/hyper is ~50 lines. `axum-server` saves little for our simple case. 
+ +## Consequences + +**Positive:** +- Full control over TLS configuration +- Direct `rustls-acme` integration +- Ability to add ALPN protocols for ACME challenges +- Proven pattern from alknet + +**Negative:** +- Slightly more code than `axum-server` (~50 lines for the TLS acceptor loop) +- Need to manage the TCP listener and TLS accept explicitly +- Must handle the `TlsStream` → `hyper::service_fn` → axum + integration manually (well-documented pattern from Felix Knorr's blog and + alknet) + +## References + +- [tls.md](../tls.md) +- alknet transport layer (`alknet-core/src/transport/tls.rs`, `alknet-core/src/transport/acme.rs`) \ No newline at end of file diff --git a/docs/architecture/decisions/006-rate-limiting-approach.md b/docs/architecture/decisions/006-rate-limiting-approach.md new file mode 100644 index 0000000..3f5ff5f --- /dev/null +++ b/docs/architecture/decisions/006-rate-limiting-approach.md @@ -0,0 +1,77 @@ +# ADR-006: Token Bucket Rate Limiting with In-Memory State + +## Status + +Accepted + +## Context + +The proxy must enforce request rate limits per client IP address, replacing +nginx's `limit_req_zone` directive. Rate limiting is critical for preventing +abuse and for fail2ban integration (rate-limited requests trigger fail2ban +actions). + +Several rate limiting approaches exist: +- **Token bucket**: Tokens accumulate at a fixed rate; each request consumes a + token. Allows short bursts up to the bucket capacity. +- **Leaky bucket**: Requests are processed at a fixed rate; excess requests + queue or are rejected. No burst allowance. +- **Fixed window**: Count requests in fixed time windows (e.g., per minute). + Allows burst at window boundaries. +- **Sliding window**: Count requests in a rolling time window. More accurate + than fixed window but more complex. + +The current nginx config uses `limit_req zone=gitea_limit burst=20 nodelay`, +which is a token bucket with burst allowance. 
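As a rough illustration of the token bucket option above, the following std-only Rust sketch refills each client's bucket at a fixed rate and caps it at the burst capacity. The names `RateLimiter`, `Bucket`, and `check` are hypothetical and not taken from the implementation; the caller passes `now` explicitly, which is an assumption made here so the refill arithmetic is easy to test.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::Instant;

/// Per-client bucket state (hypothetical type, for illustration only).
struct Bucket {
    tokens: f64,
    last_refill: Instant,
}

/// Token bucket limiter keyed by client IP (a sketch, not the real middleware).
pub struct RateLimiter {
    rate_per_sec: f64,
    burst: f64,
    buckets: HashMap<IpAddr, Bucket>,
}

impl RateLimiter {
    pub fn new(requests_per_second: u32, burst: u32) -> Self {
        Self {
            rate_per_sec: f64::from(requests_per_second),
            burst: f64::from(burst),
            buckets: HashMap::new(),
        }
    }

    /// Returns `true` if a request from `ip` arriving at `now` is allowed.
    pub fn check(&mut self, ip: IpAddr, now: Instant) -> bool {
        let (rate, burst) = (self.rate_per_sec, self.burst);
        // A client seen for the first time starts with a full bucket.
        let bucket = self.buckets.entry(ip).or_insert(Bucket {
            tokens: burst,
            last_refill: now,
        });
        // Refill in proportion to elapsed time, never above the burst capacity.
        let elapsed = now.saturating_duration_since(bucket.last_refill).as_secs_f64();
        bucket.tokens = (bucket.tokens + elapsed * rate).min(burst);
        bucket.last_refill = now;
        // Each allowed request consumes one token.
        if bucket.tokens >= 1.0 {
            bucket.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

With `RateLimiter::new(10, 20)`, a client can send 20 requests back to back; after that, roughly one request per 100 ms is admitted until the bucket has had time to refill.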
+ +For state storage: +- **In-memory HashMap**: Fast, no external dependencies, lost on restart. +- **External store (Redis, etc.)**: Shared across instances, persists across + restarts. Adds operational complexity. +- **tower-governor crate**: Pre-built rate limiting middleware. Uses the + generic cell rate algorithm (GCRA). Adds dependency. + +## Decision + +Use a token bucket algorithm with in-memory `HashMap` +state, protected by `tokio::sync::Mutex`. Rate limiting runs as axum middleware +before the proxy handler. + +Rate limits are global per-IP (not per-site) in Phase 1. Per-site rate limits +may be added in Phase 2 as the config model evolves. + +Stale entries in the HashMap are cleaned up periodically. A background task +scans the HashMap at a configurable interval (default: 60 seconds) and removes +entries that haven't been accessed within the cleanup interval. + +## Rationale + +- Token bucket matches nginx's `limit_req burst` semantics, ensuring + behavioral compatibility during migration. +- In-memory state is sufficient for a single-instance proxy (no shared state + needed). +- `tokio::sync::Mutex` (not `std::sync::Mutex`) waits for the lock without + blocking a runtime worker thread, and its guard can be held across await + points, so it integrates with the async runtime. +- Custom implementation gives full control over logging output for fail2ban + integration (ADR-007). +- State loss on restart is acceptable: rate limit state is inherently + ephemeral. + +## Consequences + +**Positive:** +- Behavioral compatibility with nginx rate limiting +- Full control over fail2ban log format +- No external dependencies (Redis, etc.) 
+- Simple implementation (~100 lines) + +**Negative:** +- Rate limit state is lost on restart (acceptable for single-instance deploy) +- Not suitable for multi-instance deployments without external state store + (Phase 1 is single-instance) +- HashMap grows over time without eviction (mitigated by periodic cleanup) + +## References + +- [operations.md](../operations.md) +- nginx `limit_req` documentation \ No newline at end of file diff --git a/docs/architecture/decisions/007-custom-log-format.md b/docs/architecture/decisions/007-custom-log-format.md new file mode 100644 index 0000000..1b34888 --- /dev/null +++ b/docs/architecture/decisions/007-custom-log-format.md @@ -0,0 +1,67 @@ +# ADR-007: Custom Structured Log Format for Fail2ban + +## Status + +Accepted + +## Context + +The proxy needs to produce log output that fail2ban can parse to detect and ban +abusive IP addresses. The current nginx setup uses nginx's default log format +with standard fail2ban filters. + +Options for fail2ban integration: +- **nginx-compatible format**: Replicate nginx's log format so existing + fail2ban filters work unchanged. Couples us to nginx's format. +- **Custom structured format**: Design a clean, parseable format with a + corresponding custom fail2ban filter. Gives us control and clarity. +- **JSON format**: Machine-readable but harder for fail2ban regex matching. + +## Decision + +Use a custom structured log format with a corresponding custom fail2ban filter. + +The format for rate-limited requests: + +``` +RATE_LIMIT client_ip= host= path= status=429 +``` + +The format for general access logs: + +``` +REQUEST client_ip= host= method= path= status= upstream= duration_ms= +``` + +A corresponding fail2ban filter (`/etc/fail2ban/filter.d/reverse-proxy.conf`) +uses regex matching on the `RATE_LIMIT` prefix and `client_ip=` field. 
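To make the `RATE_LIMIT` format concrete, here is a minimal std-only Rust sketch of one function that renders a rate-limit event and another that pulls the client IP back out, which is roughly what the fail2ban filter has to do with its regex. The function names `rate_limit_line` and `banned_ip` are hypothetical, and the sketch assumes the line carries no timestamp or logger prefix; a real deployment would need to account for whatever prefix the logger adds.

```rust
use std::net::IpAddr;

/// Render a rate-limit event in the `key=value` format described above.
/// (Hypothetical helper; the real logging call may look different.)
pub fn rate_limit_line(client_ip: IpAddr, host: &str, path: &str) -> String {
    format!("RATE_LIMIT client_ip={client_ip} host={host} path={path} status=429")
}

/// Extract the client IP from a line, matching on the `RATE_LIMIT` prefix and
/// the first `client_ip=` field. Any other line yields `None`.
pub fn banned_ip(line: &str) -> Option<IpAddr> {
    let rest = line.strip_prefix("RATE_LIMIT ")?;
    rest.split_whitespace()
        .find_map(|field| field.strip_prefix("client_ip="))?
        .parse()
        .ok()
}
```

Because `client_ip=` is the first field, the proxy-written value is found before anything that comes from the request itself, such as `path=`.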
+ +## Rationale + +- Custom format is clear, unambiguous, and self-documenting +- No coupling to nginx's format, which may change or include fields we don't + produce +- `key=value` pairs are easy to parse with regex and easy to extend +- The `RATE_LIMIT` prefix makes it trivial to distinguish rate-limit events + from other logs +- Writing a custom fail2ban filter is straightforward (5 lines of config) +- We control both sides (the proxy and the filter), so compatibility is + guaranteed + +## Consequences + +**Positive:** +- Clean, purpose-built format +- Easy to extend with new fields +- No dependency on nginx log format +- Custom fail2ban filter is simple to maintain + +**Negative:** +- Cannot reuse existing nginx fail2ban filters (trivial to write our own) +- Existing fail2ban configurations need updating (acceptable since we're + replacing nginx entirely) + +## References + +- [operations.md](../operations.md) +- [open-questions.md](../open-questions.md) OQ-02 (now resolved) \ No newline at end of file diff --git a/docs/architecture/decisions/008-static-dynamic-config-split.md b/docs/architecture/decisions/008-static-dynamic-config-split.md new file mode 100644 index 0000000..bc89bfc --- /dev/null +++ b/docs/architecture/decisions/008-static-dynamic-config-split.md @@ -0,0 +1,76 @@ +# ADR-008: Static/Dynamic Configuration Split with ArcSwap + +## Status + +Accepted + +## Context + +The proxy needs configuration that can be partially reloaded at runtime (site +definitions, rate limits) without restarting the process and dropping active +connections. However, some configuration (bind addresses, TLS mode) fundamentally +requires creating new listeners and cannot be changed at runtime. + +Two approaches: +- **Full restart for all config changes**: Simple, but requires dropping + active connections for every change, including trivial rate limit adjustments. 
+- **Static/dynamic split**: Immutable parameters (bind address, TLS mode) in a + `StaticConfig` that requires restart; runtime-adjustable parameters (sites, + rate limits) in a `DynamicConfig` that can be atomically swapped via + `Arc<ArcSwap<DynamicConfig>>` without dropping connections. + +This pattern is proven in the alknet project, which uses the same +`ArcSwap` approach for auth policy, forwarding rules, and rate +limits. + +## Decision + +Split configuration into `StaticConfig` (immutable after startup) and +`DynamicConfig` (hot-reloadable via `ArcSwap`). The split is: + +**StaticConfig** (restart required): +- Bind address, HTTP port, HTTPS port +- TLS mode (ACME vs. manual), cert paths, ACME settings +- Log level and format + +**DynamicConfig** (hot-reloadable via SIGHUP): +- Site definitions (hostname → upstream mappings) +- Rate limits (requests per second, burst) +- Body size limits + +`ConfigReloadHandle` provides a `reload(DynamicConfig)` method that atomically +swaps the entire config. All request handlers read `DynamicConfig` via +`ArcSwap::load()`, a lock-free operation. + +## Rationale + +- Rate limits and site definitions change more frequently than bind addresses + and TLS settings. Hot-reload avoids unnecessary downtime. +- `ArcSwap` provides lock-free reads and atomic writes: no partial updates, + no lock contention on the hot path. +- Proven pattern from alknet, where it's used for auth policy, forwarding + rules, and rate limits. +- SIGHUP trigger is simple, well-understood, and compatible with systemd and + process supervisors. +- The entire config is swapped at once, preventing inconsistent states where + some sites use the old config and others use the new one. 
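The validate-then-swap behavior described in this ADR can be sketched with the standard library alone. Note the substitution: this sketch uses `RwLock<Arc<DynamicConfig>>` as a stand-in for `ArcSwap`, so reads take a brief read lock and are not lock-free as they would be with the `arc-swap` crate. The field set, the `Result<(), String>` return type, and the `validate` helper are assumptions made for illustration.

```rust
use std::sync::{Arc, RwLock};

#[derive(Debug, Clone, PartialEq)]
pub struct SiteConfig {
    pub host: String,
    pub upstream: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct DynamicConfig {
    pub sites: Vec<SiteConfig>,
    pub requests_per_second: u32,
    pub burst: u32,
}

/// Std-only stand-in for the ArcSwap-backed handle (illustrative only).
pub struct ConfigReloadHandle {
    current: RwLock<Arc<DynamicConfig>>,
}

impl ConfigReloadHandle {
    pub fn new(initial: DynamicConfig) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    /// Take a snapshot; a request keeps using it even if a reload happens later.
    pub fn load(&self) -> Arc<DynamicConfig> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Validate first, then replace the whole config in one step.
    /// On error the previous config stays active.
    pub fn reload(&self, new_config: DynamicConfig) -> Result<(), String> {
        validate(&new_config)?;
        *self.current.write().unwrap() = Arc::new(new_config);
        Ok(())
    }
}

/// A subset of the checks listed in config.md (hypothetical helper).
fn validate(cfg: &DynamicConfig) -> Result<(), String> {
    if cfg.requests_per_second == 0 {
        return Err("rate_limit.requests_per_second must be > 0".to_string());
    }
    if cfg.sites.iter().any(|s| s.host.is_empty() || s.upstream.is_empty()) {
        return Err("each site must have a host and upstream".to_string());
    }
    Ok(())
}
```

A handler would call `load()` once per request and read every setting from that snapshot, which is what keeps a single request from seeing a mix of old and new values.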
+ +## Consequences + +**Positive:** +- Zero-downtime config reload for sites and rate limits +- Lock-free reads on the request hot path +- Atomic config updates with no partial states +- Proven pattern from alknet + +**Negative:** +- Two config types add conceptual complexity +- SIGHUP reload requires reading the config file from disk (need to handle + file read errors gracefully) +- Must validate DynamicConfig before swapping (invalid config must not replace + valid config) + +## References + +- [config.md](../config.md) +- alknet ADR-030 (static/dynamic config split) \ No newline at end of file diff --git a/docs/architecture/decisions/009-signal-handling.md b/docs/architecture/decisions/009-signal-handling.md new file mode 100644 index 0000000..ceb30e0 --- /dev/null +++ b/docs/architecture/decisions/009-signal-handling.md @@ -0,0 +1,62 @@ +# ADR-009: Signal Handling Strategy + +## Status + +Accepted + +## Context + +The proxy needs to handle Unix signals for: +- **Graceful shutdown**: SIGTERM and SIGINT should stop accepting new + connections, drain in-flight requests, then exit. +- **Config reload**: SIGHUP should trigger a DynamicConfig reload from disk. + +Two approaches for signal handling: +- **`tokio::signal`**: Built into tokio. `ctrl_c()` covers only SIGINT; + SIGTERM and SIGHUP are not handled by `ctrl_c()` and need the Unix-specific + `tokio::signal::unix` API. +- **`signal-hook`**: External crate. Handles all Unix signals including SIGHUP. + More flexible but adds a dependency. + +## Decision + +Use `signal-hook` for all signal handling. Specifically: +- `signal-hook::flag` to set termination flags on SIGTERM/SIGINT +- `signal-hook` to register a SIGHUP handler that triggers config reload + +`tokio::signal::ctrl_c()` is registered as a secondary shutdown trigger; both +mechanisms converge on the same shutdown path. 
This is a belt-and-suspenders +approach: `signal-hook` handles all signals including SIGHUP, while +`ctrl_c()` provides a fallback for environments where signal handling may not +be fully wired (e.g., container runtimes). + +Signal behavior: +1. On SIGTERM or SIGINT: stop accepting new connections, wait up to 30 seconds + for in-flight requests to complete, then exit with code 0. +2. On SIGHUP: re-read config file, validate, and swap DynamicConfig if valid. + Log the result. + +## Rationale + +- SIGHUP handling is required for config reload, and `tokio::signal::ctrl_c()` + does not cover SIGHUP. +- `signal-hook` is well-maintained, widely used, and handles all Unix signals. +- Routing every signal through `signal-hook` as the primary mechanism, with + `ctrl_c()` only as a fallback into the same shutdown path, is simpler than + splitting signals between the two and avoids edge cases. +- `signal-hook::flag` is a minimal, safe API for signal-triggered flags. + +## Consequences + +**Positive:** +- SIGHUP for config reload is simple and well-understood +- Single primary signal handling mechanism for all signals +- Compatible with systemd (SIGTERM for shutdown) and standard Unix conventions + +**Negative:** +- `signal-hook` is an additional dependency (but a well-established one) +- Signal handling requires careful coordination with the tokio runtime (async + signal receivers must be properly integrated) + +## References + +- [operations.md](../operations.md) +- [config.md](../config.md) \ No newline at end of file diff --git a/docs/architecture/open-questions.md b/docs/architecture/open-questions.md new file mode 100644 index 0000000..8226a2a --- /dev/null +++ b/docs/architecture/open-questions.md @@ -0,0 +1,86 @@ +--- +status: draft +last_updated: 2026-06-11 +--- + +# Open Questions + +## TLS + +### OQ-01: Should cipher suites be restricted beyond rustls defaults? + +- **Origin**: [tls.md](tls.md) +- **Status**: open +- **Priority**: medium +- **Context**: Our current nginx config explicitly restricts cipher suites to + four ECDHE-AES-GCM suites.
rustls 0.23 with `aws_lc_rs` defaults to a + conservative set that excludes all weak ciphers (no SHA-1, no 3DES, no RC4, + no CBC-mode suites, no RSA key exchange). The defaults include TLS 1.3 suites + which nginx also allows. Restricting further would reduce compatibility with + older clients; not restricting means accepting a wider (but still safe) set + than the current nginx config. +- **Cross-references**: ADR-005 + +## Logging and Monitoring + +### ~~OQ-02: What log format should fail2ban consume?~~ + +- **Origin**: [operations.md](operations.md), [proxy.md](proxy.md) +- **Status**: resolved +- **Priority**: high +- **Resolution**: Custom structured log format with `key=value` pairs and + `RATE_LIMIT` prefix. A corresponding custom fail2ban filter will be provided. + See ADR-007. +- **Cross-references**: ADR-007 + +### OQ-03: Should the health check endpoint be on a separate port? + +- **Origin**: [operations.md](operations.md) +- **Status**: open +- **Priority**: low +- **Context**: Currently the health check is on the main HTTPS listener at + `/health`. Alternatives: (a) separate unencrypted port for health checks + (simpler for load balancers but less secure), (b) admin port with its own + listener (more complex but isolates operational traffic), (c) on the main + listener (simplest, proposed approach). For a single-server deployment behind + no external load balancer, the main listener is fine. +- **Cross-references**: None + +## Configuration + +### OQ-04: Should config reload support a Unix domain socket API in addition to SIGHUP? + +- **Origin**: [config.md](config.md) +- **Status**: open +- **Priority**: low +- **Context**: Phase 1 uses SIGHUP for config reload, which is simple and proven. + A Unix domain socket API would allow programmatic reload (e.g., from an admin + tool or CI/CD pipeline) and could return success/failure status. This adds + complexity and is not needed for Phase 1. 
+- **Cross-references**: None + +## Deployment + +### OQ-05: Should the proxy bind to multiple addresses or just one? + +- **Origin**: [overview.md](overview.md) +- **Status**: open +- **Priority**: low +- **Context**: Current nginx config binds to a specific IP (`15.235.125.95`). + The proposed config uses `bind_addr` which could be any IP. For Phase 1, the + config will specify a single IP address. Multi-address binding (listening on + multiple IPs) is not needed but could be added as an array of addresses. +- **Cross-references**: None + +## Proxy + +### OQ-06: Should upstream timeouts be configurable per-site? + +- **Origin**: [proxy.md](proxy.md) +- **Status**: open +- **Priority**: low +- **Context**: Phase 1 uses global defaults (5s connect timeout, 60s request + timeout) for all upstream connections. Per-site timeout configuration would + allow tuning for different upstream services (e.g., a slow database-backed + API vs. a fast static site). Not needed for Phase 1 with a single upstream. +- **Cross-references**: None \ No newline at end of file diff --git a/docs/architecture/operations.md b/docs/architecture/operations.md new file mode 100644 index 0000000..630dcff --- /dev/null +++ b/docs/architecture/operations.md @@ -0,0 +1,250 @@ +--- +status: draft +last_updated: 2026-06-11 +--- + +# Operations + +## What It Is + +The operations component covers everything related to running the proxy in +production: rate limiting, logging (fail2ban integration), health checks, +systemd integration, and graceful shutdown. + +## Why It Exists + +A reverse proxy that can't be monitored, rate-limited, or gracefully restarted +is not production-ready. These concerns are cross-cutting — they affect the +proxy handler, the TLS layer, and the config system. 
+ +## Rate Limiting + +### Requirements + +- Limit requests per IP address (replacing nginx's `limit_req_zone`) +- Default: 10 requests/second with burst of 20 (matching current nginx config) +- Configurable via DynamicConfig (no restart needed) +- Must produce logs that fail2ban can consume + +### Design + +The rate limiter runs as axum middleware before the proxy handler. It uses a +token bucket algorithm per client IP, matching nginx's `limit_req burst` +semantics. + +Rate limits are global per-IP in Phase 1 (not per-site). A request from IP +address X counts against the same bucket regardless of which site it targets. +Per-site rate limits may be added in Phase 2. + +When a request exceeds the rate limit, the middleware returns `429 Too Many +Requests` and logs the event with structured fields. + +### State Eviction + +The per-IP token bucket state grows over time as new IPs are seen. A +background task runs at a configurable interval (default: 60 seconds) and +removes entries that haven't been accessed within the cleanup interval. This +prevents unbounded memory growth. + +### Fail2ban Integration + +Rate limit events are logged in a structured format that a custom fail2ban +filter can parse. See [ADR-007](decisions/007-custom-log-format.md) for the +format decision. + +The log format uses `key=value` pairs with a `RATE_LIMIT` prefix: + +``` +RATE_LIMIT client_ip=X.X.X.X host=Y.Z path=/W status=429 +``` + +A corresponding fail2ban filter and jail configuration are provided as part +of the deployment documentation. + +## Logging + +### Structure + +All logs use `tracing` with structured fields. The proxy outputs two types of +log entries: + +1. **Access logs**: Every proxied request is logged at `info` level with + structured fields. + + ``` + REQUEST client_ip=1.2.3.4 host=git.alk.dev method=GET path=/user/repo status=200 upstream=127.0.0.1:3000 duration_ms=45 + ``` + +2. **Event logs**: Rate limits, TLS errors, upstream failures, config reloads, + etc. 
+ + ``` + RATE_LIMIT client_ip=1.2.3.4 host=git.alk.dev path=/login status=429 + UPSTREAM_ERROR host=git.alk.dev upstream=127.0.0.1:3000 error="connection refused" + CONFIG_RELOAD status=success sites=1 + ``` + +### Output + +Logs are written to: +- **stdout/stderr**: For systemd/journald integration +- **File** (optional): For fail2ban consumption at + `/var/log/reverse-proxy/access.log` + +The `tracing-subscriber` layer configuration supports both simultaneously via +`Layer` composition. + +### Log Levels + +| Level | Use | +|-------|-----| +| `error` | Unrecoverable failures (TLS handshake failure, config validation) | +| `warn` | Rate limit exceeded, upstream unreachable, upstream timeout | +| `info` | Access logs, config reloads, ACME events, startup/shutdown | +| `debug` | Request/response headers, connection details | +| `trace` | Detailed protocol-level information | + +Configurable via `log_level` in StaticConfig. + +## Health Check + +### Endpoint + +``` +GET /health → 200 OK (empty body) +``` + +The health check endpoint is accessible on the main HTTPS listener. It returns +200 if the process is alive and serving requests. + +**Limitation**: Since `/health` is served over TLS, it cannot detect TLS +configuration errors that prevent the TLS handshake from completing. External +monitoring should also check TCP connectivity to port 443 independently. + +### What It Checks + +- Process is running and the tokio runtime is responsive +- TLS listener is accepting connections +- Config is loaded (StaticConfig and DynamicConfig are initialized) + +It does **not** check upstream reachability. The health check answers "is the +proxy process healthy?", not "is the upstream reachable?" — upstream health is +a separate concern that would produce 502/504 responses in the proxy handler. 
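The limitation noted for `/health` suggests an independent TCP probe of port 443 from external monitoring. A minimal sketch of such a probe, using only the standard library, might look like this; the function name `tcp_port_open` is hypothetical and not part of the proxy.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` completes within `timeout`.
/// This only shows that the port accepts connections; it says nothing
/// about whether a TLS handshake on that port would succeed.
pub fn tcp_port_open(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}
```

A monitor could call this against the proxy's bind address on port 443 and combine the result with an HTTPS request to `/health`, so that a listener that is up but failing handshakes can be told apart from a process that is down.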
+ +### Future Extensions + +- `/health/ready` — readiness check that includes upstream reachability +- Prometheus metrics at `/metrics` + +## Systemd Integration + +### Unit File + +```ini +[Unit] +Description=Reverse Proxy +After=network-online.target +Wants=network-online.target + +[Service] +Type=notify +NotifyAccess=all +ExecStart=/usr/local/bin/reverse-proxy --config /etc/reverse-proxy/config.toml +Restart=on-failure +RestartSec=5 + +# Security hardening +NoNewPrivileges=yes +ProtectSystem=strict +ProtectHome=yes +PrivateTmp=yes +ReadWritePaths=/var/lib/reverse-proxy /var/log/reverse-proxy + +# ACME challenge cache directory +StateDirectory=reverse-proxy + +[Install] +WantedBy=multi-user.target +``` + +The proxy signals readiness to systemd via `sd_notify` after binding listeners +and completing the initial configuration load. + +## Graceful Shutdown + +### Signal Handling + +The proxy handles three signals via `signal-hook` (see [ADR-009](decisions/009-signal-handling.md)): + +- **SIGTERM / SIGINT**: Graceful shutdown. Stop accepting new connections, wait + for in-flight requests to complete (up to a configurable timeout), then exit. +- **SIGHUP**: Config reload. Re-read the config file, validate, and swap + DynamicConfig if valid. + +### SIGHUP for Config Reload + +SIGHUP triggers config reload (see [config.md](config.md) for details). The +process does not exit on SIGHUP. + +### Timeout + +In-flight requests have a configurable shutdown timeout (default: 30 seconds). +After the timeout, remaining connections are forcefully closed and the process +exits. + +## Deployment + +### Binary + +Single static binary, no runtime dependencies: + +```bash +cargo build --release +# Produces: target/release/reverse-proxy +``` + +The binary is self-contained — no system libraries beyond libc for DNS +resolution. The `aws_lc_rs` crypto provider is statically linked.
+ +### Configuration + +```bash +# Config file +/etc/reverse-proxy/config.toml + +# ACME cache directory +/var/lib/reverse-proxy/acme-cache/ + +# Log directory (optional, for fail2ban) +/var/log/reverse-proxy/ +``` + +### CLI + +```bash +reverse-proxy [OPTIONS] + +Options: + --config Path to config file [default: /etc/reverse-proxy/config.toml] + --validate Validate config and exit + --help Show help + --version Show version +``` + +## Design Decisions + +All design decisions are documented as ADRs in [decisions/](decisions/). + +| ADR | Decision | Summary | +|-----|----------|---------| +| [001](decisions/001-rust-axum.md) | Rust with axum | Memory safety; single binary deployment | +| [006](decisions/006-rate-limiting-approach.md) | Token bucket rate limiting | In-memory per-IP token bucket matching nginx burst semantics | +| [007](decisions/007-custom-log-format.md) | Custom structured log format | key=value pairs with RATE_LIMIT prefix for fail2ban | +| [009](decisions/009-signal-handling.md) | Signal handling strategy | signal-hook for SIGTERM/SIGINT/SIGHUP | + +## Open Questions + +Open questions are tracked in [open-questions.md](open-questions.md). Key +questions affecting this document: + +- **OQ-03**: Should the health check endpoint be on a separate port? (open) \ No newline at end of file diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md new file mode 100644 index 0000000..26caf8d --- /dev/null +++ b/docs/architecture/overview.md @@ -0,0 +1,166 @@ +--- +status: draft +last_updated: 2026-06-11 +--- + +# Overview + +## Vision + +A memory-safe, minimal reverse proxy that replaces our vulnerable nginx instance +for reverse-proxying to backend services. The proxy terminates TLS, injects +standard proxy headers, enforces rate limits, and forwards requests to upstream +services — with operational feature parity for our current single-domain Gitea +setup.
+ +## Why This Exists + +Our nginx 1.24.0 installation is vulnerable to multiple actively-exploited +CVEs, including CVE-2026-42945 (unauthenticated RCE via `rewrite`/`set` +directives). The broader threat landscape is worsening: LLM-assisted fuzzing +is accelerating bug discovery in nginx's C codebase, and security researchers +report additional undisclosed vulnerabilities. Upgrading nginx patches known +CVEs but does not address the structural problem — memory corruption bugs are +endemic to C, and the discovery rate is accelerating. + +Rust's memory safety eliminates the entire class of buffer overflow, +use-after-free, and double-free bugs that constitute 6 of 7 recent nginx CVEs. +Combined with rustls (pure Rust TLS, no OpenSSL dependency), this provides a +fundamentally safer baseline. + +See [threat-landscape.md](../research/threat-landscape.md) for full vulnerability +details. + +## Scope + +### In Scope + +- **Phase 1**: Replace nginx for `git.alk.dev` with feature parity + - TLS termination with ACME (Let's Encrypt) certificate management + - Manual certificate paths as fallback mode + - HTTP → HTTPS redirect + - Reverse proxy to Gitea at `127.0.0.1:3000` + - Proxy header injection (Host, X-Real-IP, X-Forwarded-For, X-Forwarded-Proto) + - Request rate limiting with fail2ban-compatible logging (global per-IP; per-site in Phase 2) + - 100 MB body size limit (global; per-site in Phase 2) + - Configurable bind address (no `0.0.0.0` default) + - Health check endpoint + - Graceful shutdown (SIGTERM handling) + - Systemd unit file + +- **Phase 2**: Multi-site support + - SNI-based TLS routing for multiple domains + - Config file for site definitions + - Dynamic config reload (ArcSwap pattern) + +- **Phase 3**: Operational hardening + - Metrics endpoint (Prometheus-compatible) + - Connection limits and timeouts + - Log rotation + +### Out of Scope + +- HTTP/2 or HTTP/3 proxying (services that need these run their own native + Rust servers — e.g., `api.alk.dev`) 
+- Load balancing or round-robin upstream selection +- WebSocket proxying (can be added later if needed) +- Static file serving +- Access control beyond rate limiting (no auth, no IP allowlists in Phase 1) +- CGI, SCGI, uWSGI, FastCGI + +## Architecture + +``` + ┌────────────────────────────────────┐ + │ reverse-proxy (Rust/axum) │ +config.toml ──────► │ StaticConfig + DynamicConfig │ + │ (ArcSwap for hot-reload) │ + │ │ +bind_addr:80 ──► │ HTTP listener → 301 redirect │ + │ to HTTPS │ + │ │ +bind_addr:443 ──► │ TLS listener (tokio-rustls) │ + │ ├─ ACME mode: rustls-acme resolver │ + │ │ (auto cert provisioning/renewal) │ + │ └─ Manual mode: cert/key file paths │ + │ │ + │ axum router │ + │ ├─ Host-based routing │ + │ ├─ Rate limiting middleware │ + │ ├─ Proxy header injection │ + │ ├─ Body size limit (100MB) │ + │ └─ Reverse proxy handler │ + │ └─ hyper Client → upstream │ + │ │ + │ /health → 200 OK │ + └────────────────────────────────────┘ +``` + +## Crate Dependencies + +### Core + +| Crate | Version | Purpose | Notes | +|-------|---------|---------|-------| +| `axum` | 0.8 | HTTP framework | Routing, middleware, extractors | +| `tokio` | 1 (full) | Async runtime | Multi-threaded runtime | +| `hyper` | 1 | HTTP protocol | Used via axum, and directly for proxy `Client` | +| `tower` | 0.5 | Middleware ecosystem | Service trait, layers | +| `rustls` | 0.23 | TLS implementation | `aws_lc_rs` crypto provider | +| `tokio-rustls` | 0.26 | Async TLS I/O | Wraps TCP with TLS | +| `rustls-acme` | 0.12 | ACME client | Let's Encrypt auto-provisioning and renewal | + +### Supporting + +| Crate | Version | Purpose | Notes | +|-------|---------|---------|-------| +| `serde` | 1 | Serialization | TOML config deserialization | +| `toml` | 0.8 | Config format | Declarative site definitions | +| `arc-swap` | 1 | Atomic config swap | Lock-free DynamicConfig reload | +| `tracing` | 0.1 | Structured logging | fail2ban-compatible output | +| `tracing-subscriber` | 0.3 | Log output | 
File + journald support | +| `rustls-pemfile` | 2 | PEM parsing | Manual cert loading | +| `rustls-pki-types` | 1 | TLS types | CertificateDer, PrivateKeyDer | +| `clap` | 4 | CLI arguments | Server startup options | +| `signal-hook` | 0.3 | Signal handling | SIGTERM/SIGINT for shutdown, SIGHUP for config reload | + +Versions listed are minimum major versions. Implementation should pin exact +versions in `Cargo.toml` per standard Rust practice. + +## Exports + +This is a single-binary deployment. There are no library exports. The product +is the `reverse-proxy` binary plus a systemd unit file and a config file. + +## Dependencies on Other Projects + +- **alknet**: The `ArcSwap` pattern, `tokio-rustls` TLS acceptor + construction, `rustls-acme` integration, and `ServerConfig` builder patterns + are adapted from alknet's transport and config layers. These patterns are + referenced as validation that the approaches work in production; all code + in this project is written from scratch. + +## Design Decisions + +All design decisions are documented as ADRs in [decisions/](decisions/). 
+ +| ADR | Decision | Summary | +|-----|----------|---------| +| [001](decisions/001-rust-axum.md) | Rust with axum | Memory safety eliminates the bug class causing nginx CVEs; axum provides ergonomic tower integration | +| [002](decisions/002-custom-proxy-handler.md) | Custom proxy handler | Single upstream, single domain — axum-reverse-proxy adds unnecessary complexity | +| [003](decisions/003-toml-config.md) | TOML configuration format | Rust-native, unambiguous, excellent serde support | +| [004](decisions/004-rustls-acme.md) | ACME-primary certificate management | Eliminates certbot dependency; automatic provisioning and renewal | +| [005](decisions/005-tokio-rustls-direct.md) | tokio-rustls directly, not axum-server | Full control over TLS config, ACME resolver integration, cipher suite configuration | +| [006](decisions/006-rate-limiting-approach.md) | Token bucket rate limiting | In-memory per-IP token bucket matching nginx burst semantics | +| [007](decisions/007-custom-log-format.md) | Custom structured log format | key=value pairs with RATE_LIMIT prefix for fail2ban | +| [008](decisions/008-static-dynamic-config-split.md) | Static/dynamic config with ArcSwap | Immutable StaticConfig, hot-reloadable DynamicConfig via ArcSwap | +| [009](decisions/009-signal-handling.md) | Signal handling strategy | signal-hook for SIGTERM/SIGINT/SIGHUP | + +## Open Questions + +Open questions are tracked in [open-questions.md](open-questions.md). Key +questions affecting this document: + +- **OQ-01**: Should cipher suites be restricted beyond rustls defaults? (open) +- **OQ-03**: Should the health check endpoint be on a separate port? (open) +- **OQ-05**: Should the proxy bind to multiple addresses or just one? 
(open) \ No newline at end of file diff --git a/docs/architecture/proxy.md b/docs/architecture/proxy.md new file mode 100644 index 0000000..46e83ce --- /dev/null +++ b/docs/architecture/proxy.md @@ -0,0 +1,169 @@ +--- +status: draft +last_updated: 2026-06-11 +--- + +# Proxy Handler + +## What It Is + +The proxy handler is the core component that receives an incoming HTTP request +on the TLS-terminated connection, applies middleware (rate limiting, header +injection, body size limits), and forwards it to the upstream service. + +## Why It Exists + +This component replaces nginx's `proxy_pass` directive. For our use case — +single upstream per domain, no load balancing, no HTTP/2 proxying — a custom +handler is simpler and more maintainable than a general-purpose proxy library. + +## Architecture + +``` +Incoming HTTPS request + │ + ▼ +┌─────────────────┐ +│ axum Router │ +│ (Host-based) │─── /health → 200 OK +│ │ +│ match Host │ +│ header on │ +│ incoming req │ +└───────┬─────────┘ + │ + ▼ +┌─────────────────┐ +│ Rate Limiting │ ← tower middleware layer +│ Middleware │ +└───────┬─────────┘ + │ + ▼ +┌─────────────────┐ +│ Proxy Header │ ← custom middleware / handler +│ Injection │ +│ │ +│ X-Real-IP │ ← connect_info remote_addr +│ X-Forwarded-For │ ← append to existing or set +│ X-Forwarded-Proto │ ← "https" (or "http" on port 80) +│ Host │ ← original host header (already set) +└───────┬─────────┘ + │ + ▼ +┌─────────────────┐ +│ Body Size Limit │ ← DefaultBodyLimit(100 MB) +│ Middleware │ +└───────┬─────────┘ + │ + ▼ +┌─────────────────┐ +│ Reverse Proxy │ ← hyper Client request forwarding +│ Handler │ +│ │ +│ 1. Build upstream│ +│ URI from │ +│ original req │ +│ 2. Forward req │ +│ to upstream │ +│ 3. Stream │ +│ response back │ +└─────────────────┘ +``` + +## Request Flow + +### 1. Host-Based Routing + +The axum router uses a `Host` extractor to match incoming requests to site +definitions from `DynamicConfig`. 
Each site definition maps a hostname to an +upstream address. + +A `host_based_proxy` handler reads the `Host` header, looks up the site in +`DynamicConfig.sites`, and either proxies to the upstream or returns 404. + +### 2. Proxy Header Injection + +Headers are injected before forwarding. The handler reads connection metadata +from axum's `ConnectInfo` and the original request: + +| Header | Value Source | Notes | +|--------|-------------|-------| +| `Host` | Original request `Host` header | Already present; preserved as-is | +| `X-Real-IP` | `ConnectInfo` remote IP | Set to client's IP address | +| `X-Forwarded-For` | Client IP, appended if header exists | Comma-separated list of client and proxy addresses | +| `X-Forwarded-Proto` | Determined by listener | `https` on port 443, `http` on port 80 | + +The `X-Forwarded-For` handling must append the client IP to any existing value +(rather than replacing it), to support chained proxies correctly. + +### 3. Request Forwarding + +The proxy handler constructs a new request to the upstream: + +1. Build the upstream URI using the site's `upstream_scheme` and `upstream` + address, preserving the original path and query string +2. Copy the request method, headers, and body from the original +3. Inject proxy headers (X-Real-IP, X-Forwarded-For, X-Forwarded-Proto) +4. Send the request via a shared hyper Client instance +5. Stream the response back to the client + +The hyper Client is created once at startup and shared via axum's `State`. It +must be configured with: +- Connection pooling (hyper default behavior) +- Connect timeout: 5 seconds +- Request timeout: 60 seconds +- No redirect following (proxies should not follow redirects) + +### 4.
Error Handling + +| Upstream Condition | Response | Notes | +|-------------------|----------|-------| +| Upstream reachable | Stream response as-is | Headers, status, body all forwarded | +| Upstream unreachable | 502 Bad Gateway | Logged at `warn` level | +| Upstream timeout | 504 Gateway Timeout | Logged at `warn` level | +| Request body too large | 413 Payload Too Large | From `DefaultBodyLimit` middleware | +| Rate limit exceeded | 429 Too Many Requests | Logged at `warn` level (per the log level table in operations.md) | +| Unknown Host header | 404 Not Found | No matching site definition | + +### 5. HTTP → HTTPS Redirect + +A separate HTTP listener on port 80 handles the redirect. It reads the `Host` +header from the incoming request and returns a 301 Permanent Redirect to the +HTTPS equivalent URL (preserving the path and query string). + +This listener runs on the same bind address as the TLS listener but on port 80. + +## Upstream Connection + +The upstream connection scheme defaults to `http://` since the proxy and backend +services typically run on the same host (e.g., `127.0.0.1:3000`). The +`upstream_scheme` field in each site's configuration allows specifying `https://` +for upstreams that require TLS (e.g., separate hosts or secure internal services). + +For the initial deployment (`git.alk.dev` → `127.0.0.1:3000`), the upstream +connection uses plain HTTP, as TLS between the proxy and Gitea on loopback is +unnecessary. + +## Body Size Limit + +axum's `DefaultBodyLimit` layer sets the maximum request body size. For +compatibility with Gitea's push operations (large pack files), this defaults +to 100 MB. In Phase 1, the body limit is a global setting; Phase 2 may add +per-site body limits. + +## Design Decisions + +All design decisions are documented as ADRs in [decisions/](decisions/).
+ +| ADR | Decision | Summary | +|-----|----------|---------| +| [002](decisions/002-custom-proxy-handler.md) | Custom proxy handler | Single upstream, single domain — simpler than a general proxy library | +| [007](decisions/007-custom-log-format.md) | Custom structured log format | key=value pairs with RATE_LIMIT prefix for fail2ban | + +## Open Questions + +Open questions are tracked in [open-questions.md](open-questions.md). Key +questions affecting this document: + +- **OQ-06**: Should upstream timeouts be configurable per-site? (open — Phase 1 + uses global defaults of 5s connect, 60s request) \ No newline at end of file diff --git a/docs/architecture/tls.md b/docs/architecture/tls.md new file mode 100644 index 0000000..572c4e8 --- /dev/null +++ b/docs/architecture/tls.md @@ -0,0 +1,220 @@ +--- +status: draft +last_updated: 2026-06-11 +--- + +# TLS Termination + +## What It Is + +The TLS termination component handles all aspects of encrypted connections: +certificate provisioning (ACME and manual), TLS handshake, SNI-based certificate +selection, and connection wrapping for the axum router. + +## Why It Exists + +TLS termination is the security boundary between the public internet and our +upstream services. It replaces nginx's `ssl_certificate`, `ssl_protocols`, and +`ssl_ciphers` configuration with a memory-safe Rust implementation using rustls. 
+ +## Architecture + +``` + ┌──────────────────────────────────────────┐ + │ TLS Termination │ + │ │ + bind_addr:443 ──► │ TcpListener::bind(bind_addr) │ + │ │ │ + │ ▼ │ + │ tokio-rustls::TlsAcceptor │ + │ │ │ + │ ├─ ACME mode: │ + │ │ rustls-acme::ResolvesServerCertAcme │ + │ │ (auto-provisions & renews certs) │ + │ │ │ + │ └─ Manual mode: │ + │ rustls::ServerConfig │ + │ .with_single_cert(cert_chain, key) │ + │ │ + │ │ │ + │ ▼ │ + │ TlsStream │ + │ │ │ + │ ▼ │ + │ hyper::service_fn → axum router │ + └──────────────────────────────────────────┘ + + bind_addr:80 ──► HTTP listener (redirect to HTTPS, no TLS) +``` + +## Certificate Provisioning + +### ACME Mode (Primary) + +Uses `rustls-acme` for automatic certificate provisioning and renewal through +Let's Encrypt. This is the primary mode — no certbot dependency, no cron jobs, +no deploy hooks. + +**How it works:** + +1. `AcmeCertProvider` configures the ACME client with the domain, cache + directory, and Let's Encrypt directory (staging or production). +2. `AcmeConfig::new(vec![domain])` creates an ACME configuration for the + domain. +3. The ACME state machine runs as a background tokio task, handling: + - Account registration with Let's Encrypt + - Certificate ordering + - TLS-ALPN-01 challenge (or HTTP-01 challenge) + - Certificate issuance + - Certificate renewal (automatic, ~30 days before expiry) +4. `ResolvesServerCertAcme` is a rustls `ResolvesServerCert` implementation + that automatically serves the ACME-provisioned certificate. +5. When a new certificate is issued, the resolver updates atomically — no + restart or signal handling needed. + +**Configuration:** + +```toml +[tls] +mode = "acme" +acme_domain = "git.alk.dev" +acme_cache_dir = "/var/lib/reverse-proxy/acme-cache" +acme_directory = "production" # or "staging" for testing +``` + +**Cache directory:** The `DirCache` from rustls-acme persists ACME account data, +private keys, and certificates between restarts. 
This avoids re-provisioning on +every restart. + +### Manual Mode (Fallback) + +For environments where ACME is not desired (testing, self-signed certs, +corporate CAs, or BYO certificates), the proxy loads certificates from file +paths at startup. + +```toml +[tls] +mode = "manual" +cert_path = "/etc/letsencrypt/live/git.alk.dev/fullchain.pem" +key_path = "/etc/letsencrypt/live/git.alk.dev/privkey.pem" +``` + +Certificate files are loaded once at startup using `rustls_pemfile`. Manual +mode requires a restart to pick up new certificates. + +**Why not hot-reload manual certs?** ACME mode handles renewal automatically. +Manual mode is for cases where you control cert rotation externally (certbot, +manual renewal). In that case, a service restart is simpler and more +reliable than file watching (SIGHUP reloads only `DynamicConfig`, not TLS +settings). If zero-downtime cert rotation is needed, use ACME +mode. + +## TLS Configuration + +### Protocol Versions + +The proxy supports TLS 1.2 and TLS 1.3 only, matching the minimum security +level of the current nginx configuration. The `aws_lc_rs` crypto provider +defaults to these protocol versions; explicit configuration ensures no +regression if defaults change in future rustls releases. + +### Cipher Suites + +rustls 0.23 with the `aws_lc_rs` crypto provider defaults to a conservative +cipher suite selection that excludes all weak ciphers (no SHA-1, no 3DES, no +RC4, no CBC-mode suites, no RSA key exchange). + +The current nginx config explicitly restricts to: + +``` +ECDHE-ECDSA-AES128-GCM-SHA256 +ECDHE-RSA-AES128-GCM-SHA256 +ECDHE-ECDSA-AES256-GCM-SHA384 +ECDHE-RSA-AES256-GCM-SHA384 +``` + +rustls's defaults include these plus TLS 1.3 suites (which nginx's config +also allows via `TLSv1.3`). The default rustls cipher list is a strict subset +of what browsers accept. + +See [open-questions.md](open-questions.md) OQ-01 for whether to further +restrict cipher suites beyond rustls defaults.
+ +### ServerConfig Construction + +For manual mode, the `ServerConfig` is built with `with_no_client_auth()` and +`with_single_cert()`, loading the certificate chain and private key from disk. + +For ACME mode, the `ServerConfig` is built with `with_cert_resolver()`, passing +the `ResolvesServerCertAcme` resolver. The ACME TLS-ALPN-01 protocol identifier +(`acme-tls/1`) must be registered in the `alpn_protocols` list so the server +can respond to TLS-ALPN-01 challenges. + +Both modes use the `aws_lc_rs` crypto provider with safe default protocol +versions (TLS 1.2 and TLS 1.3). + +## SNI-Based Certificate Selection + +### Current (Single Domain) + +For single-domain setups, SNI selection is trivial: there's only one +certificate, so `with_single_cert()` or `ResolvesServerCertAcme` (which +handles the domain) is sufficient. + +### Future (Multi-Domain) + +When multiple domains are served, SNI selection works as follows: + +1. **TLS handshake**: The client sends the SNI extension indicating which + hostname it's connecting to. +2. **Certificate resolution**: In ACME mode, `ResolvesServerCertAcme` handles + this automatically — it stores certificates keyed by domain. In manual mode, + a custom `ResolvesServerCert` implementation maps SNI hostname to the + correct `CertifiedKey`. +3. **HTTP routing**: After the TLS handshake, axum's `Host` extractor routes + the request to the correct site handler based on the `Host` header. + +This is the same pattern nginx uses — SNI selects the cert during TLS, then +the `Host` header selects the server block. + +## HTTP Listener (Port 80) + +The HTTP listener on port 80 is a plain TCP listener with no TLS. It has one +job: redirect all requests to the HTTPS equivalent. + +The listener binds to the same IP address as the TLS listener, but on port 80.
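The redirect target for the port 80 listener can be derived from the `Host` header and the request's path and query. The following is a rough sketch under stated assumptions: the function name `https_redirect_location` and its validation rules are hypothetical, and the real handler would wrap the result in a 301 response.

```rust
/// Build the `Location` value for the HTTP -> HTTPS redirect.
/// Returns None when the Host header is missing or contains characters
/// that should never be echoed into a response header.
pub fn https_redirect_location(host_header: &str, path_and_query: &str) -> Option<String> {
    // Drop a trailing ":port" (for example ":80") so the redirect targets
    // the default HTTPS port. A bracketed IPv6 literal without a port is
    // left alone because its last segment is not purely numeric.
    let host = match host_header.rsplit_once(':') {
        Some((name, port)) if port.bytes().all(|b| b.is_ascii_digit()) => name,
        _ => host_header,
    };

    // Colons and brackets are only accepted inside a bracketed IPv6 literal.
    let bracketed = host.starts_with('[') && host.ends_with(']');
    let allowed = |c: char| {
        c.is_ascii_alphanumeric()
            || c == '.'
            || c == '-'
            || (bracketed && matches!(c, '[' | ']' | ':'))
    };
    if host.is_empty() || !host.chars().all(allowed) {
        return None;
    }

    // Fall back to "/" for request targets that are not origin-form paths.
    let path = if path_and_query.starts_with('/') { path_and_query } else { "/" };
    Some(format!("https://{host}{path}"))
}
```

Rejecting unexpected characters in the host keeps a malformed `Host` header from being reflected into the `Location` header; how the listener responds in that case (for example with a 400) is left open here.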
+ +### ACME Challenge Type + +The default ACME challenge type is **TLS-ALPN-01**, since the proxy already +listens on port 443. This avoids requiring a separate HTTP-01 challenge server. +HTTP-01 is available as a fallback for environments where TLS-ALPN-01 is not +suitable (e.g., behind a CDN that terminates TLS). When using HTTP-01, the +port 80 listener serves `/.well-known/acme-challenge/{token}` paths for +challenge verification. + +## Key Files and Crates + +| Component | Crate | Purpose | +|-----------|-------|---------| +| TLS acceptor | `tokio-rustls` 0.26 | Async TLS handshake over TCP streams | +| TLS config | `rustls` 0.23 | ServerConfig, CryptoProvider, cipher suites | +| ACME client | `rustls-acme` 0.12 | Automatic cert provisioning and renewal | +| PEM parsing | `rustls-pemfile` 2 | Load cert/key from PEM files (manual mode) | +| PKI types | `rustls-pki-types` 1 | CertificateDer, PrivateKeyDer | + +## Design Decisions + +All design decisions are documented as ADRs in [decisions/](decisions/). + +| ADR | Decision | Summary | +|-----|----------|---------| +| [004](decisions/004-rustls-acme.md) | ACME-primary cert management | Eliminates certbot; automatic provisioning and renewal | +| [005](decisions/005-tokio-rustls-direct.md) | tokio-rustls directly | Full control over TLS config and ACME resolver integration | + +## Open Questions + +Open questions are tracked in [open-questions.md](open-questions.md). Key +questions affecting this document: + +- **OQ-01**: Should cipher suites be restricted beyond rustls defaults? (open) \ No newline at end of file diff --git a/docs/research/threat-landscape.md b/docs/research/threat-landscape.md new file mode 100644 index 0000000..1e190a4 --- /dev/null +++ b/docs/research/threat-landscape.md @@ -0,0 +1,86 @@ +# Threat Landscape + +## Active Nginx Vulnerabilities (May 2026) + +All disclosed by DepthFirst's autonomous security analysis. 
Four related CVEs from a single audit, plus additional ones discovered separately. + +### Critical + +**CVE-2026-42945 (CVSS 9.2) — "NGINX Rift"** +- Heap buffer overflow in `ngx_http_rewrite_module`, present since 2008 (18 years) +- Unauthenticated RCE via `rewrite` + `set` directives +- Working PoC publicly released on GitHub +- **Actively exploited in the wild** within 3 days of disclosure +- Our config uses `rewrite`-equivalent logic (HTTP→HTTPS redirect) +- Affects 0.6.27–1.30.0, fixed in 1.31.0/1.30.1 +- **We are vulnerable** (running 1.24.0) + +### High + +**CVE-2026-42946 (CVSS 8.3)** +- Buffer overread in `ngx_http_scgi_module` and `ngx_http_uwsgi_module` +- Worker crash or memory disclosure +- Excessive memory allocation attack (can trigger ~1TB allocation) +- Affects 0.8.42–1.30.0, fixed in 1.31.0/1.30.1 +- **We are vulnerable** (running 1.24.0, though we don't use scgi/uwsgi) + +### Medium + +**CVE-2026-40701** +- Use-after-free in OCSP resolver +- Limited data modification or worker restart +- Affects 1.19.0–1.30.0, fixed in 1.31.0/1.30.1 +- **We are vulnerable** (running 1.24.0) + +**CVE-2026-9256** +- Buffer overflow in `ngx_http_rewrite_module` (separate from Rift) +- Affects 0.1.17–1.31.0, fixed in 1.31.1+ +- **We are vulnerable** (running 1.24.0) + +**CVE-2026-42926** +- HTTP/2 request injection in `ngx_http_proxy_module` +- Affects 1.29.4–1.30.0, fixed in 1.31.0/1.30.1 +- We are not directly vulnerable (1.24.0 is outside range) + +**CVE-2026-40460** +- HTTP/3 address spoofing +- Affects 1.25.0–1.30.0 +- We are not directly vulnerable (1.24.0 is outside range) + +### Low + +**CVE-2026-42934** +- Buffer overread in `ngx_http_charset_module` +- Affects 0.3.50–1.30.0, fixed in 1.31.0/1.30.1 +- **We are vulnerable** (running 1.24.0) + +## Unreleased Vulnerabilities + +Security researchers in relevant communities report at least 4 additional RCE vulnerabilities in nginx that have not yet been publicly disclosed. 
Researchers are expressing frustration with F5/nginx's slow response times and are considering public disclosure to force action. + +This means the known CVEs above are likely just the tip of the iceberg. + +## Risk Assessment + +| Factor | Level | Notes | +|--------|-------|-------| +| Current exposure | **Critical** | Actively exploited RCE in our nginx version | +| Patch availability | **Available** | 1.31.1+ fixes all CVEs listed above; 1.30.1/1.31.0 fix all except CVE-2026-9256. Requires manual upgrade from Ubuntu default | +| Future risk | **High** | More undisclosed vulnerabilities likely; C codebase with systemic memory safety issues | +| Mitigation urgency | **Immediate** | RCE with public PoC and active exploitation | + +## Why Rust Helps + +- Memory safety by construction eliminates: buffer overflows, use-after-free, double-free, out-of-bounds reads/writes +- This is the **exact class of bugs** affecting nginx right now (5 of the 7 CVEs listed above are memory corruption; the other two are HTTP/2 request injection and HTTP/3 address spoofing) +- rustls (TLS protocol logic written in Rust) avoids the OpenSSL dependency and its CVE history +- Does NOT eliminate logic bugs: rate limiting, header injection, and access control still need careful handling +- But provides a fundamentally safer baseline to build on + +## Short-term Mitigation (While Developing Replacement) + +1. Upgrade nginx to 1.31.1+ immediately (per the ranges above, 1.30.1 and 1.31.0 fix the Rift RCE but remain affected by CVE-2026-9256) +2. Consider removing rewrite directives if possible +3. Ensure fail2ban is actively monitoring +4. Firewall restrictions on port 80/443 if feasible +5. Prioritize the Rust proxy project \ No newline at end of file