From ceb59ad9b9a2b3c5ed2e25923ea66803ed139062 Mon Sep 17 00:00:00 2001 From: "glm-5.1" Date: Thu, 11 Jun 2026 10:56:40 +0000 Subject: [PATCH] Resolve all architecture review findings (7 critical, 14 warnings, 6 suggestions) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Critical findings resolved: - C1: Site routing is global (per-listener TOML, global runtime lookup) - C2: X-Forwarded-For replaces (not appends) — edge proxy model (ADR-021) - C3: Hop-by-hop header handling rules specified (proxy.md) - C4: ACME failure behavior defined (tls.md) - C5: Startup sequence with fail-fast semantics (operations.md) - C6: Per-listener Router instances with shared global state (overview.md) - C7: Rate limiter adopts new params on next request, no state clear (operations.md) Warnings resolved: - W1: Admin socket wire protocol specified - W2: Host header port stripped, hostnames only in config - W3: HTTP redirect URL construction with port handling - W4: /health on HTTPS matches regardless of Host header - W5: Static config changes logged as warning during reload - W6: Reload operations serialized via Mutex - W7: http_port validation rules added (9 new rules total) - W8: upstream format validation (host:port required, no scheme) - W9: TLS error handling table (SNI, version, cipher failures) - W10: IPv6 rate limited per /64 prefix - W11: Graceful shutdown sequence specified (6 steps) - W12: Error response bodies: minimal plain text, no version disclosure - W13: upstream_scheme HTTPS uses system CA store - W14: allow_wildcard_bind is OR between config and CLI - W15: ADR-010 Phase 2 list updated (timeouts moved to Phase 1) - W17: LoggingConfig static/restart note added Suggestions applied: - S2: ConnectInfo propagation note - S3: Case-insensitive host matching (RFC 7230) - S5: Response streaming behavior (chunk-by-chunk) - S6: Token bucket nodelay semantics - S7: File watching explicitly out of scope - S8: All paths forwarded without filtering - S9: shutdown_timeout_secs referenced in shutdown description - S11: Consolidated defaults table in config.md --- docs/architecture/README.md | 1 + docs/architecture/config.md | 93 ++++++++++++- .../decisions/010-multi-site-phase1.md | 10 +- .../021-x-forwarded-for-edge-proxy.md | 74 ++++++++++ docs/architecture/operations.md | 125 ++++++++++++++++- docs/architecture/overview.md | 62 +++++---- docs/architecture/proxy.md | 127 +++++++++++++++--- docs/architecture/tls.md | 36 +++++ 8 files changed, 467 insertions(+), 61 deletions(-) create mode 100644 docs/architecture/decisions/021-x-forwarded-for-edge-proxy.md diff --git a/docs/architecture/README.md b/docs/architecture/README.md index 9e88fb5..79c685e 100644 --- a/docs/architecture/README.md +++ b/docs/architecture/README.md @@ -52,6 +52,7 @@ certificate via ACME. | [018](decisions/018-body-size-limit.md) | Request Body Size Limit | Accepted | | [019](decisions/019-multi-config-listeners.md) | Multi-Config Listener Support | Accepted | | [020](decisions/020-container-deployment.md) | Container Deployment Model | Accepted | +| [021](decisions/021-x-forwarded-for-edge-proxy.md) | X-Forwarded-For Edge Proxy Model | Accepted | ## Open Questions diff --git a/docs/architecture/config.md b/docs/architecture/config.md index 2b1ea64..38f4b52 100644 --- a/docs/architecture/config.md +++ b/docs/architecture/config.md @@ -100,6 +100,11 @@ Immutable after startup. Changes require a process restart. | `format` | `"text"` or `"json"` | Log output format | | `log_file_path` | `String` | Path to log file. When set, structured logs are written to this file in addition to stdout/stderr. Strongly recommended for fail2ban integration in container deployments (see ADR-020). Default: not set (file logging disabled) | +**Note**: The entire `LoggingConfig` (including `log_file_path`) is static and +requires a process restart to change. Log file path changes require reopening +file handles, which is complex and low-value for Phase 1. Log rotation (Phase 2) +will be handled via signal-based or built-in rotation. + **ListenerConfig** (per-listener static config): | Field | Type | Description | @@ -141,10 +146,13 @@ connections immediately. | `upstream_connect_timeout_secs` | `u64` | TCP connect timeout in seconds (default: `5`; see ADR-015, ADR-017) | | `upstream_request_timeout_secs` | `u64` | Full request timeout in seconds (default: `60`; see ADR-015, ADR-017) | -Sites are defined per listener in the `[[listeners]]` entries. Each listener -routes its own sites independently. The `DynamicConfig` collects all sites -across all listeners for hot-reload via `ArcSwap`. When a config reload -occurs, all listener site mappings are updated atomically. +Sites are defined per listener in the `[[listeners]]` entries for organizational +purposes, but at runtime they are collected into a single global routing table +in `DynamicConfig`. The proxy looks up the `Host` header in this global table to +route requests. Hostnames must be unique across all listeners — a `Host` header +can only match one site definition, regardless of which listener received the +request. See ADR-019 for the rationale behind the `[[listeners]]` configuration +format. **Why these are dynamic:** See ADR-008 for the rationale. Site definitions and rate limits are per-request concerns that should not require restarting @@ -152,6 +160,29 @@ the proxy or dropping active connections. Rate limits and body limits are global settings in Phase 1; per-site configuration for these is deferred to Phase 2. +### Default Values + +| Field | Type | Default | Required | +|-------|------|---------|----------| +| `allow_wildcard_bind` | `bool` | `false` | No | +| `health_check_port` | `u16` | `9900` | No | +| `admin_socket_path` | `String` | `/run/reverse-proxy/admin.sock` | No | +| `shutdown_timeout_secs` | `u64` | `30` | No | +| `logging.level` | `String` | `"info"` | No | +| `logging.format` | `String` | `"text"` | No | +| `logging.log_file_path` | `String` | (not set) | No | +| `listeners[].http_port` | `u16` | `80` | No | +| `listeners[].https_port` | `u16` | `443` | No | +| `listeners[].tls.acme_directory` | `String` | `"production"` | No | +| `sites[].upstream_scheme` | `String` | `"http"` | No | +| `sites[].upstream_connect_timeout_secs` | `u64` | `5` | No | +| `sites[].upstream_request_timeout_secs` | `u64` | `60` | No | +| `rate_limit.requests_per_second` | `u32` | — | Yes | +| `rate_limit.burst` | `u32` | — | Yes | +| `body.limit_bytes` | `u64` | — | Yes | + +Fields without defaults are required and must be specified in the config file. + ## Config Reload ### ArcSwap Pattern @@ -187,6 +218,41 @@ Both mechanisms converge on the same code path: 3. Validate (check upstream reachability is optional) 4. Call `ConfigReloadHandle::reload(new_config)` +### Static Config Changes During Reload + +When the config file is reloaded (via SIGHUP or admin socket), the entire file +is read and validated — both static and dynamic portions. This provides early +error detection for misconfigurations that would prevent a restart from +succeeding. + +If the full config fails validation, the reload is rejected and the old +DynamicConfig remains active. + +If the full config passes validation but static fields have changed, the +DynamicConfig is swapped normally and a warning is logged listing the changed +static fields and noting that a restart is required for those changes to take +effect. This gives operators early feedback about config drift. + +Only the DynamicConfig portion is swapped via ArcSwap. StaticConfig changes +require a process restart to take effect. + +### Reload Serialization + +Reload operations are serialized using a `tokio::sync::Mutex` on the reload +code path. If a reload is in progress (triggered by SIGHUP or admin socket) and +a second reload is requested, the second request waits for the first to +complete, then re-reads the config file (getting the latest version) and +proceeds. This prevents race conditions where two concurrent reloads could apply +an older config over a newer one. + +### Out of Scope: File Watching + +Automatic file watching (inotify, fsnotify, etc.) is out of scope for Phase 1. +Config reload is triggered explicitly by SIGHUP or admin socket command. File +watching adds complexity (debouncing, handling atomic renames, handling editor +swap files) that is not justified for a single-instance proxy with infrequent +config changes. + ## TOML Config Format ### Multi-Config (Dedicated-IP Per Domain) @@ -296,7 +362,7 @@ upstream = "127.0.0.1:8080" On startup, the config is validated: 1. At least one `[[listeners]]` entry must exist -2. Each listener's `bind_addr` is not `0.0.0.0` unless `allow_wildcard_bind = true` (in config) or `--allow-wildcard-bind` (CLI flag) is set (see ADR-016, ADR-020) +2. Each listener's `bind_addr` is not `0.0.0.0` unless `allow_wildcard_bind` is enabled. This can be enabled via config (`allow_wildcard_bind = true`) or CLI flag (`--allow-wildcard-bind`). Either source enables it — it is an OR relationship, not AND. The CLI flag does not override the config value; if either is set, wildcard binding is allowed. 3. Each listener's `bind_addr` and `https_port` combination must be unique 4. In ACME mode, `acme_domains` must be non-empty 5. In manual mode, `cert_path` and `key_path` must both be set and the files @@ -308,6 +374,23 @@ On startup, the config is validated: route a request to when the `Host` header matches multiple sites. 8. `rate_limit.requests_per_second` must be > 0 9. `body.limit_bytes` must be > 0 +10. Each listener's `bind_addr` and `http_port` combination must be unique + (prevents bind-time errors, same as rule 3 for `https_port`) +11. Within a listener, `http_port` and `https_port` must differ +12. `https_port` must be 1–65535 (required — TLS needs a port) +13. `http_port` must be 0 (disabled) or 1–65535 +14. `health_check_port` must not conflict with any listener's `http_port` or + `https_port` on the same bind address +15. Site `host` values must not include a port number (e.g., `git.alk.dev`, + not `git.alk.dev:443`) +16. Site `host` values must be valid hostnames (not IP addresses, not + including ports). Hostnames are normalized to lowercase during validation. +17. `upstream` must be in `host:port` format where `port` is a required integer + 1–65535. Examples: `gitea:3000`, `127.0.0.1:3000`, `[::1]:3000`. Invalid + examples: `gitea` (missing port), `http://gitea:3000` (includes scheme), + `10.0.0.5` (missing port). The `upstream_scheme` field handles the protocol. +18. `upstream_scheme` values are case-sensitive: only `"http"` or `"https"` + (lowercase). Default is `"http"`. On SIGHUP reload, the same validation applies. If the new config fails validation, the reload is rejected and the old config remains active. An error diff --git a/docs/architecture/decisions/010-multi-site-phase1.md b/docs/architecture/decisions/010-multi-site-phase1.md index 7e022f4..76ffe2b 100644 --- a/docs/architecture/decisions/010-multi-site-phase1.md +++ b/docs/architecture/decisions/010-multi-site-phase1.md @@ -48,10 +48,12 @@ Phase 1 scope becomes: Phase 2 scope shifts to operational hardening: 1. Per-site rate limits and body limits -2. Per-site upstream timeouts -3. Metrics endpoint (Prometheus-compatible) -4. Connection limits and timeouts -5. Log rotation +2. Metrics endpoint (Prometheus-compatible) +3. Connection limits and timeouts +4. Log rotation + +Note: "Per-site upstream timeouts" was originally listed here but was moved to +Phase 1 via ADR-015. Phase 3 remains future enhancements. diff --git a/docs/architecture/decisions/021-x-forwarded-for-edge-proxy.md b/docs/architecture/decisions/021-x-forwarded-for-edge-proxy.md new file mode 100644 index 0000000..ca612a6 --- /dev/null +++ b/docs/architecture/decisions/021-x-forwarded-for-edge-proxy.md @@ -0,0 +1,74 @@ +# ADR-021: X-Forwarded-For Edge Proxy Model + +## Status + +Accepted + +## Context + +The reverse proxy terminates TLS and injects standard proxy headers (Host, +X-Real-IP, X-Forwarded-For, X-Forwarded-Proto) before forwarding requests to +upstream services. The question is how to handle the `X-Forwarded-For` header +when the request already contains one. + +Two approaches exist: + +1. **Append**: Add the client IP to any existing `X-Forwarded-For` value. + This preserves the proxy chain history but trusts that existing values are + legitimate. In a chained proxy scenario (e.g., CDN → reverse proxy → + backend), appending is correct because the CDN's `X-Forwarded-For` value + is trustworthy. + +2. **Replace**: Set `X-Forwarded-For` to the client's IP address from + `ConnectInfo`, discarding any existing value. This is correct + when the proxy is the edge proxy directly facing the internet, because + client-provided `X-Forwarded-For` headers are untrusted — any client can + inject arbitrary values. + +This proxy is deployed as the **edge proxy** — it sits directly in front of the +internet with no trusted proxies upstream. All client connections come directly +to this proxy. + +## Decision + +Set `X-Forwarded-For` to the client's IP address from `ConnectInfo`, +**replacing** any existing value rather than appending. + +The proxy is an edge proxy. There are no trusted proxies upstream, so existing +`X-Forwarded-For` headers from clients cannot be trusted. Replacing prevents +header spoofing attacks where a malicious client injects fake IP addresses to +confuse upstream services or bypass IP-based access controls. + +## Rationale + +- The proxy is the edge proxy — it directly faces the internet. No CDN or + other trusted proxy sits in front of it. +- Client-provided `X-Forwarded-For` headers are untrusted. Any client can send + `X-Forwarded-For: 1.2.3.4` to spoof their IP. +- Appending to an untrusted header creates a security vulnerability: upstream + services (like Gitea) may use `X-Forwarded-For` for IP-based rate limiting or + access control. Spoofed values would bypass these protections. +- `X-Real-IP` is also set to the client's IP from `ConnectInfo`, providing a + trustworthy header for upstream services that need the real client IP. +- If this proxy is ever placed behind a CDN or other trusted proxy, the header + handling model should be revisited. A "trusted proxies" configuration can be + added in Phase 2 to support chained proxy scenarios. + +## Consequences + +**Positive:** +- Prevents `X-Forwarded-For` spoofing attacks +- Upstream services receive the real client IP, not a spoofed one +- Simple, predictable header handling — no trust model to configure +- Consistent with the proxy's role as the edge proxy + +**Negative:** +- If the proxy is placed behind a CDN or other proxy in the future, the header + handling must be updated to support a "trusted proxies" model +- Does not preserve legitimate proxy chain history (not applicable in our + deployment — there are no proxies upstream) + +## References + +- [proxy.md](../proxy.md) +- Architecture Review C2 (X-Forwarded-For security model) \ No newline at end of file diff --git a/docs/architecture/operations.md b/docs/architecture/operations.md index 8d59ad4..2d265b7 100644 --- a/docs/architecture/operations.md +++ b/docs/architecture/operations.md @@ -36,6 +36,12 @@ Rate limits are global per-IP in Phase 1 (not per-site). A request from IP address X counts against the same bucket regardless of which site it targets. Per-site rate limits may be added in Phase 2. +The token bucket uses **nodelay** semantics matching nginx's `limit_req burst +nodelay`: when the bucket is empty, the request is immediately rejected with +429 — requests are not queued. Tokens are added at a rate of +`requests_per_second` (1 token every 1000ms / requests_per_second), and the +bucket capacity is the `burst` value. + When a request exceeds the rate limit, the middleware returns `429 Too Many Requests` and logs the event with structured fields. @@ -47,6 +53,37 @@ whose last access timestamp is older than a configurable eviction age (default: 300 seconds / 5 minutes). This prevents unbounded memory growth while preserving recent entries that may still receive traffic. +### Config Reload Behavior + +When rate limit parameters change (e.g., from 10 req/s burst 20 to 20 req/s +burst 40), the behavior is: + +1. New `DynamicConfig` is swapped in via ArcSwap. +2. On the next request from an existing IP, the rate limiter reads the current + `DynamicConfig` for rate/burst parameters. +3. The token bucket refills using the new rate, and its capacity is set to the + new burst maximum. +4. If the current token count exceeds the new burst maximum, it is capped to + the new burst maximum. + +The HashMap is **not** cleared — this avoids creating a rate-limiting gap. +Existing buckets adopt new parameters on their next request. The eviction task +continues removing stale entries independently. + +### IPv6 Rate Limiting + +IPv6 addresses have a vastly larger address space than IPv4. Rate limiting per +individual IPv6 address (`/128`) is ineffective against attackers who can +generate many addresses within a `/64` prefix. + +- **IPv4**: Rate limited per individual address (`/32`). +- **IPv6**: Rate limited per `/64` prefix. All addresses in the same `/64` share + the same token bucket. This matches RFC 4941 privacy extension boundaries and + common anti-abuse practice. + +The rate limiter normalizes IPv6 addresses to their `/64` prefix before +bucket lookup. + ### Fail2ban Integration Rate limit events are logged in a structured format that a custom fail2ban @@ -225,13 +262,46 @@ process does not exit on SIGHUP. The admin Unix domain socket provides programmatic config reload with feedback. This is useful for CI/CD pipelines and automation tools. See ADR-014 for the -command protocol. +rationale. -### Timeout +**Protocol:** -In-flight requests have a configurable shutdown timeout (default: 30 seconds). -After the timeout, remaining connections are forcefully closed and the process -exits. +- **Connection lifecycle**: One command per connection. Client connects, sends + one newline-terminated command, receives one newline-terminated JSON + response, then the server closes the connection. +- **Message framing**: Newline-delimited (`\n`). Responses end with `\n`. +- **Commands**: + - `reload` — Re-read config file, validate, and swap DynamicConfig. Returns + `{"status": "ok"}` or `{"status": "error", "message": "..."}`. + - `status` — Return basic process info. Returns + `{"status": "ok", "uptime_secs": 1234, "sites": 2}`. +- **Error responses**: Unrecognized commands return + `{"status": "error", "message": "unknown command: "}`. Invalid or empty + input returns `{"status": "error", "message": "invalid input"}`. +- **Concurrency**: Multiple clients can connect simultaneously, but reload + operations are serialized (see Config Reload section in config.md). +- **Socket cleanup**: The proxy removes any existing socket file at startup + before binding. If the file exists and another process is listening, a warning + is logged and the admin socket is disabled (but the proxy continues starting). + +### Shutdown Sequence + +On SIGTERM or SIGINT, the proxy performs a graceful shutdown: + +1. **Stop accepting new connections** — Close all TCP listening sockets. No new + connections are accepted. +2. **Close idle keep-alive connections** — Send `Connection: close` on any idle + connections in the keep-alive pool. +3. **Wait for in-flight requests** — Up to `shutdown_timeout_secs` (default: 30) + for active requests to complete. +4. **Force-close remaining connections** — After the timeout, any remaining + connections are forcefully closed via TCP RST. +5. **Cancel background tasks** — ACME renewal tasks, rate limiter eviction task, + and admin socket listener are all cancelled. +6. **Exit with code 0**. + +The `shutdown_timeout_secs` is configurable in StaticConfig (default: 30 +seconds). See config.md for details. ## Deployment @@ -443,6 +513,51 @@ continues to be routed directly to the Gitea container via Docker port publishing (e.g., `203.0.113.10:22:2222`), matching the current deployment pattern. +## Startup Sequence + +The proxy starts components in a specific order to ensure fail-fast behavior +and correct dependency initialization: + +1. **Parse and validate config** — Read the TOML config file, deserialize into + `StaticConfig` and `DynamicConfig`, and validate all rules. If validation + fails, exit with non-zero code and log errors. No ports are bound. + +2. **Initialize DynamicConfig** — Load sites, rate limits, and body limits into + `ArcSwap`. + +3. **Initialize shared state** — Create the rate limiter + `HashMap`, the shared `hyper::Client`, and the + `tracing-subscriber` with file and stdout layers. + +4. **Bind health check port** (if enabled) — Bind `127.0.0.1:{health_check_port}`. + Fail-fast if bind fails. + +5. **Bind admin socket** (if enabled) — Remove any stale socket file first, then + bind the Unix domain socket. If the socket file exists and another process is + listening, log a warning and fail the admin socket (but continue starting — + the admin socket is non-critical). + +6. **Bind all listener ports** — For each listener: bind HTTP port (if enabled) + and HTTPS port. If any bind fails, fail-fast and exit. All ports are bound + before proceeding. + +7. **Load TLS configuration** — For each listener: load manual certificates or + initialize ACME state machine. If manual certificate loading fails, fail-fast + and exit. For ACME: if no cached certificate exists and ACME provisioning + fails, fail-fast and exit. + +8. **Start TCP listeners** — Begin accepting connections on all bound ports. + +9. **Start background tasks** — ACME renewal tasks (per listener in ACME mode), + rate limiter eviction task, signal handler task, admin socket handler task. + +10. **Signal readiness** — Send `sd_notify("READY=1")` to systemd (if running + under systemd). + +**Failure semantics**: **Fail-fast**. If any step fails, the process exits with +a non-zero code. The proxy does not partially start. All ports are bound before +any connections are accepted. + ## Design Decisions All design decisions are documented as ADRs in [decisions/](decisions/). diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md index a62de37..70553fa 100644 --- a/docs/architecture/overview.md +++ b/docs/architecture/overview.md @@ -86,31 +86,35 @@ details. config.toml ───────► │ StaticConfig + DynamicConfig │ (volume mount) │ (ArcSwap for hot-reload) │ │ │ - │ ┌─ Listener 1 ─────────────────┐ │ - bind_addr:80 ────► │ │ HTTP → 301 redirect │ │ - (published) │ └────────────────────────────────┘ │ - │ │ - bind_addr:443 ────► │ │ TLS listener (tokio-rustls) │ │ - (published) │ │ ├─ ACME or Manual TLS config │ │ - │ │ └─ axum router │ │ - │ │ ├─ Host-based routing │ │ - │ │ ├─ git.alk.dev → gitea:3000 │ │ - │ │ └─ Rate limiting, headers │ │ - │ └────────────────────────────────┘ │ - │ │ - │ ┌─ Listener N ─────────────────┐ │ - bind_addr_N:80 ───► │ │ HTTP → 301 redirect │ │ - │ └────────────────────────────────┘ │ - │ │ - bind_addr_N:443 ───► │ │ TLS listener (tokio-rustls) │ │ - │ │ ├─ Manual TLS cert │ │ - │ │ └─ axum router │ │ - │ │ ├─ alk.dev → app:8080 │ │ - │ │ └─ Rate limiting, headers │ │ - │ └────────────────────────────────┘ │ - │ │ - │ /health → 200 OK (port 9900) │ - └────────────────────────────────────┘ + │ ┌─ Listener 1 ─────────────────┐ │ + bind_addr:80 ────► │ │ HTTP → 301 redirect │ │ + (published) │ └────────────────────────────────┘ │ + │ │ + bind_addr:443 ────► │ │ TLS listener (tokio-rustls) │ │ + (published) │ │ ├─ ACME or Manual TLS config │ │ + │ │ └─ axum router (per-listener) │ │ + │ │ ├─ /health → 200 OK (any) │ │ + │ │ ├─ Host → global site lookup │ │ + │ │ ├─ git.alk.dev → gitea:3000 │ │ + │ │ └─ Rate limiting, headers │ │ + │ └────────────────────────────────┘ │ + │ │ + │ ┌─ Listener N ─────────────────┐ │ + bind_addr_N:80 ───► │ │ HTTP → 301 redirect │ │ + │ └────────────────────────────────┘ │ + │ │ + bind_addr_N:443 ───► │ │ TLS listener (tokio-rustls) │ │ + │ │ ├─ Manual TLS cert │ │ + │ │ └─ axum router (per-listener) │ │ + │ │ ├─ /health → 200 OK (any) │ │ + │ │ ├─ Host → global site lookup │ │ + │ │ ├─ alk.dev → app:8080 │ │ + │ │ └─ Rate limiting, headers │ │ + │ └────────────────────────────────┘ │ + │ │ + │ /health → 200 OK (port 9900) │ + │ Admin socket (Unix domain) │ + └────────────────────────────────────┘ │ │ ┌──────┘ └──────┐ │ │ @@ -121,6 +125,13 @@ details. └─ admin socket (rw) ``` +Each listener has its own `axum::Router` instance with its own middleware stack, +but all routers share `Arc>` and +`Arc>>` via axum State. Site routing is +global: the `Host` header is matched against a single routing table collected +from all listeners' site definitions. Hostnames must be unique across all +listeners — see C1 resolution in the architecture review. + In container deployments (ADR-020), the proxy runs in a minimal container with `0.0.0.0` bind address and Docker port publishing. Upstream addresses use Docker DNS names for same-host containers (e.g., `gitea:3000`) but also support @@ -196,6 +207,7 @@ All design decisions are documented as ADRs in [decisions/](decisions/). | [018](decisions/018-body-size-limit.md) | Request body size limit | 100 MB default matching nginx, Gitea push compatibility | | [019](decisions/019-multi-config-listeners.md) | Multi-config listeners | `[[listeners]]` supporting both dedicated-IP and shared-IP deployment models | | [020](decisions/020-container-deployment.md) | Container deployment model | Defense-in-depth via container isolation; file-primary logging; flexible upstream addressing | +| [021](decisions/021-x-forwarded-for-edge-proxy.md) | X-Forwarded-For edge proxy model | Replace, don't append — proxy is the edge, no trusted upstream proxies | ## Open Questions diff --git a/docs/architecture/proxy.md b/docs/architecture/proxy.md index 58d213b..1f067ff 100644 --- a/docs/architecture/proxy.md +++ b/docs/architecture/proxy.md @@ -76,26 +76,43 @@ Incoming HTTPS request ### 1. Host-Based Routing The axum router uses a `Host` extractor to match incoming requests to site -definitions from `DynamicConfig`. Each site definition maps a hostname to an -upstream address. +definitions from `DynamicConfig`. Sites are defined per-listener in the TOML +configuration for organizational purposes, but at runtime they are collected +into a single global routing table. The proxy looks up the `Host` header in +this global table and either proxies to the upstream or returns 404. -Where `host_based_proxy` reads the `Host` header, looks up the site in -`DynamicConfig.sites`, and either proxies to the upstream or returns 404. +Host matching is **case-insensitive** per RFC 7230 §2.7.3. The `Host` header +is normalized to lowercase before matching. Site `host` values in +configuration are normalized to lowercase during validation. + +The `Host` header port component (e.g., `git.alk.dev:443`) is stripped before +matching. Site `host` values must not include ports. + +The proxy does not filter or restrict paths. All paths and query strings on a +known host are forwarded to the upstream without modification. + +The `/health` path is a special case: it matches regardless of the `Host` +header and is evaluated before host-based routing. A `GET /health` request on +any hostname returns `200 OK` with an empty body. ### 2. Proxy Header Injection -Headers are injected before forwarding. The handler reads connection metadata -from axum's `ConnectInfo` and the original request: +Headers are injected before forwarding. The proxy is an **edge proxy** — it +sits directly in front of the internet with no trusted proxies upstream. This +means the client IP from `ConnectInfo` is the real client IP, and +existing `X-Forwarded-For` headers from the client cannot be trusted. | Header | Value Source | Notes | |--------|-------------|-------| -| `Host` | Original request `Host` header | Already present; preserved as-is | +| `Host` | Original request `Host` header | Preserved as-is | | `X-Real-IP` | `ConnectInfo` remote IP | Set to client's IP address | -| `X-Forwarded-For` | Client IP, appended if header exists | Comma-separated list of proxies | +| `X-Forwarded-For` | `ConnectInfo` remote IP | **Replaced**, not appended. The proxy is the edge proxy — there are no trusted proxies upstream, so existing `X-Forwarded-For` values from the client cannot be trusted. | | `X-Forwarded-Proto` | Determined by which listener port received the request | `https` for requests on the listener's `https_port`, `http` for requests on the listener's `http_port` | -The `X-Forwarded-For` handling must append the client IP to any existing value -(rather than replacing it), to support chained proxies correctly. +**ConnectInfo propagation**: `ConnectInfo` is populated by +extracting `TcpStream::peer_addr()` before wrapping the connection in +`TlsStream`. Each listener provides this information to its axum Router via +`axum::ServiceExt::into_make_service_with_connect_info::()`. ### 3. Request Forwarding @@ -106,7 +123,12 @@ The proxy handler constructs a new request to the upstream: 2. Copy the request method, headers, and body from the original 3. Inject proxy headers (X-Real-IP, X-Forwarded-For, X-Forwarded-Proto) 4. Send the request via a shared hyper Client instance -5. Stream the response back to the client +5. Stream the response back to the client (chunk-by-chunk, not buffered) + + If the client disconnects while the upstream is still sending, the upstream + connection is closed and the event is logged at `debug` level. If the + upstream disconnects mid-stream, the client receives whatever data was + already sent and the connection is closed. The hyper Client is created once at startup and shared via axum's `State`. It must be configured with (see ADR-017 for rationale): @@ -118,22 +140,77 @@ Per-site timeout overrides are available via `upstream_connect_timeout_secs` and `upstream_request_timeout_secs` in `SiteConfig` (see ADR-015). When not specified, defaults of 5s connect and 60s request are used. -### 4. Error Handling +### 4. Header Handling -| Upstream Condition | Response | Notes | -|-------------------|----------|-------| -| Upstream reachable | Stream response as-is | Headers, status, body all forwarded | -| Upstream unreachable | 502 Bad Gateway | Logged at `warn` level | -| Upstream timeout | 504 Gateway Timeout | Logged at `warn` level | -| Request body too large | 413 Payload Too Large | From `DefaultBodyLimit` middleware | -| Rate limit exceeded | 429 Too Many Requests | Logged at `info` level | -| Unknown Host header | 404 Not Found | No matching site definition | +The proxy must handle request and response headers correctly to avoid security +issues and protocol violations. -### 5. HTTP → HTTPS Redirect +**Headers removed before forwarding (hop-by-hop headers per RFC 2616 §13.5.1):** + +- `Connection` +- `Keep-Alive` +- `Proxy-Authorization` +- `Proxy-Authenticate` +- `TE` +- `Trailers` +- `Transfer-Encoding` +- `Upgrade` + +These headers are connection-specific and must not be forwarded to the +upstream. Removing `Proxy-Authorization` and `Proxy-Authenticate` prevents +credential leakage. + +**Headers added or modified:** + +See the Proxy Header Injection section above for the full list of proxy headers +(X-Real-IP, X-Forwarded-For, X-Forwarded-Proto, Host). + +**Headers NOT added in Phase 1:** + +- `Via`: Not added. The proxy is an edge proxy and `Via` is primarily for + tracking proxy chains. Can be added in Phase 2 if needed. + +**Response headers:** + +Upstream response headers are forwarded as-is to the client, with the following +exceptions: +- Hop-by-hop headers listed above are removed +- The proxy does not add a `Server` header to responses + +### 5. Error Handling + +All error responses use plain text bodies with no proxy version or identity +information. No upstream error details are included. Response format: + +- Content-Type: `text/plain; charset=utf-8` +- Body: Brief status text matching the HTTP status (e.g., `Bad Gateway` for 502) + +| Upstream Condition | Response | Body | Notes | +|-------------------|----------|------|-------| +| Upstream reachable | Stream response as-is | (upstream body) | Headers, status, body all forwarded | +| Upstream unreachable | 502 Bad Gateway | `Bad Gateway` | Logged at `warn` level | +| Upstream timeout | 504 Gateway Timeout | `Gateway Timeout` | Logged at `warn` level | +| Request body too large | 413 Payload Too Large | `Payload Too Large` | From `DefaultBodyLimit` middleware | +| Rate limit exceeded | 429 Too Many Requests | `Too Many Requests` | Logged at `info` level | +| Unknown Host header | 404 Not Found | `Not Found` | No matching site definition | +| Missing Host header | 400 Bad Request | `Bad Request` | Required for routing | + +### 6. HTTP → HTTPS Redirect A separate HTTP listener on port 80 (per listener) handles redirect. It reads the `Host` header from the incoming request and returns a 301 Permanent Redirect -to the HTTPS equivalent URL (preserving the path and query string). +to the HTTPS equivalent URL. + +The redirect URL is constructed as: +`https://{host}:{https_port}/{path}?{query}` + +Where: +- `{host}` is the hostname portion of the `Host` header (port stripped) +- `{https_port}` is the listener's `https_port`, omitted if it's 443 +- `{path}` and `{query}` are preserved from the original request + +If the incoming request has no `Host` header, the proxy returns `400 Bad +Request`. Each listener has its own HTTP redirect on its own bind address. @@ -148,6 +225,11 @@ For the initial deployment, upstream connections use plain HTTP (e.g., `git.alk.dev` → `127.0.0.1:3000`, `alk.dev` → `127.0.0.1:8080`) since TLS between the proxy and backend services on loopback is unnecessary. +When `upstream_scheme` is `"https"`, the proxy validates the upstream's TLS +certificate using the system's native TLS root certificates (via `rustls` root +cert store). Certificate validation failures result in a 502 Bad Gateway +response. No certificate pinning or custom CA support is provided in Phase 1. + ## Body Size Limit axum's `DefaultBodyLimit` layer sets the maximum request body size. The default @@ -168,6 +250,7 @@ All design decisions are documented as ADRs in [decisions/](decisions/). | [015](decisions/015-per-site-timeouts.md) | Per-site upstream timeouts with defaults | 5s connect / 60s request defaults, per-site overrides | | [017](decisions/017-upstream-connection-defaults.md) | Upstream connection defaults | HTTP/1.1, no redirects, connection pooling | | [018](decisions/018-body-size-limit.md) | Request body size limit | 100 MB default matching nginx, Gitea push compatibility | +| [021](decisions/021-x-forwarded-for-edge-proxy.md) | X-Forwarded-For edge proxy model | Replace, don't append — proxy is the edge, no trusted upstream proxies | ## Open Questions diff --git a/docs/architecture/tls.md b/docs/architecture/tls.md index e14ace3..221f847 100644 --- a/docs/architecture/tls.md +++ b/docs/architecture/tls.md @@ -243,6 +243,42 @@ suitable (e.g., behind a CDN that terminates TLS). When using HTTP-01, the port 80 listener serves `/.well-known/acme-challenge/{token}` paths for challenge verification. +## Certificate Failure Behavior + +ACME certificate provisioning and renewal can fail for various reasons (network +outages, Let's Encrypt unavailability, DNS issues, rate limiting). The proxy's +behavior depends on the scenario: + +| Scenario | Behavior | +|----------|----------| +| First start, no cached cert, ACME unreachable | **Fail to start** with clear error message. The proxy cannot serve TLS without a certificate. | +| First start, no cached cert, ACME succeeds | Normal startup. Certificate is provisioned and cached. | +| Start with cached cert, ACME unreachable for renewal | **Start normally** with cached cert. Log error at `warn` level. `rustls-acme` retries per its built-in schedule. | +| Renewal failure after startup | **Continue serving existing cert**. Log error at `warn` level. `rustls-acme` retries per its built-in schedule. | +| Cached cert expired, renewal fails at startup | **Fail to start** if cert is expired at startup. An expired certificate cannot serve valid TLS. | +| Cached cert expires during runtime | **Continue serving expired cert**. Clients will receive certificate errors. Log at `error` level. This is the correct behavior — silently dropping TLS would be worse. | + +The key principle: **never start without a valid TLS certificate**, but **always +continue serving if a valid cert exists**, even if renewal fails. + +## TLS Error Handling + +TLS handshake failures are logged and the connection is closed. The proxy does +not serve a default certificate for unknown hostnames — connections that don't +match any configured certificate fail. + +| Scenario | Behavior | +|----------|----------| +| SNI hostname doesn't match any certificate (manual mode) | TLS handshake fails with `unrecognized_name` alert. Log at `warn` level with client IP and SNI hostname. | +| No SNI extension sent by client | TLS handshake fails with `handshake_failure` alert. Log at `warn` level with client IP. | +| Unsupported TLS version (1.0/1.1) | TLS handshake fails with `protocol_version` alert. Log at `info` level. | +| Cipher suite negotiation fails | TLS handshake fails with `handshake_failure` alert. Log at `info` level with client IP. | +| Certificate expired (manual mode) | Connection fails during TLS handshake. Log at `error` level. Other listeners/connections continue serving. | + +In ACME mode, the `ResolvesServerCertAcme` resolver handles certificate +selection automatically — there is no SNI mismatch scenario because the +resolver serves the ACME-provisioned certificate for all valid domains. + ## Key Files and Crates | Component | Crate | Purpose |