diff --git a/docs/architecture/README.md b/docs/architecture/README.md index c99a76c..7dfb0a2 100644 --- a/docs/architecture/README.md +++ b/docs/architecture/README.md @@ -50,6 +50,7 @@ certificate via ACME. | [016](decisions/016-explicit-bind-address.md) | Explicit Bind Address Requirement | Accepted | | [017](decisions/017-upstream-connection-defaults.md) | Upstream Connection Defaults | Accepted | | [018](decisions/018-body-size-limit.md) | Request Body Size Limit | Accepted | +| [019](decisions/019-multi-config-listeners.md) | Multi-Config Listener Support | Accepted | ## Open Questions @@ -63,7 +64,7 @@ See [open-questions.md](open-questions.md) for the full tracker. | ~~OQ-04~~ | ~~Config reload: SIGHUP only or also Unix socket API?~~ | ~~low~~ | **resolved** (ADR-014) | | ~~OQ-05~~ | ~~Should the proxy bind to multiple addresses?~~ | ~~low~~ | **resolved** (single bind_addr sufficient) | | ~~OQ-06~~ | ~~Should upstream timeouts be configurable per-site?~~ | ~~low~~ | **resolved** (ADR-015) | -| OQ-07 | Should per-site TLS overrides be supported for mixed ACME/manual domains? 
| low | open | +| ~~OQ-07~~ | ~~Should per-site TLS overrides be supported for mixed ACME/manual domains?~~ | ~~low~~ | **resolved** (ADR-019) | ## Document Lifecycle diff --git a/docs/architecture/config.md b/docs/architecture/config.md index 6d3d4c0..593b9ac 100644 --- a/docs/architecture/config.md +++ b/docs/architecture/config.md @@ -31,54 +31,84 @@ config.toml └──────────┬───────────┘ │ ▼ -┌──────────────────────┐ ┌──────────────────────┐ -│ StaticConfig │ │ DynamicConfig │ -│ (immutable) │ │ (hot-reloadable) │ -│ │ │ │ -│ bind_addr │ │ sites[] │ -│ http_port │ │ rate_limit │ -│ https_port │ │ body_limit │ -│ health_check_port │ │ proxy_headers │ -│ admin_socket_path │ │ │ -│ tls.mode │ │ ← ArcSwap → │ -│ tls.acme_domains │ │ │ -│ tls.cert_path │ │ ← ArcSwap → │ -│ tls.key_path │ │ ConfigReloadHandle │ -│ tls.cache_dir │ │ .reload(new_config) │ -│ log_level │ │ │ -│ log_format │ └───────────────────────┘ +┌──────────────────────┐ +│ StaticConfig │ +│ (immutable) │ +│ │ +│ health_check_port │ +│ admin_socket_path │ +│ log_level │ +│ log_format │ +│ │ +│ listeners[] │ +│ ┌────────────────┐ │ +│ │ Listener 1 │ │ +│ │ bind_addr │ │ +│ │ http_port │ │ +│ │ https_port │ │ +│ │ tls.mode │ │ +│ │ tls.acme_domains│ │ +│ │ tls.acme_cache_dir│ │ +│ │ tls.acme_directory│ │ +│ │ tls.cert_path │ │ +│ │ tls.key_path │ │ +│ └────────────────┘ │ +│ ┌────────────────┐ │ +│ │ Listener N │ │ +│ │ ... │ │ +│ └────────────────┘ │ └──────────────────────┘ + +┌──────────────────────┐ +│ DynamicConfig │ +│ (hot-reloadable) │ +│ │ +│ sites[] │ +│ rate_limit │ +│ body_limit │ +│ │ +│ ← ArcSwap → │ +│ ConfigReloadHandle │ +│ .reload(new_config) │ +└───────────────────────┘ ``` ## Static vs Dynamic Configuration This split follows the pattern established in alknet (ADR-030) and adapted -for our simpler use case. +for our simpler use case. See ADR-019 for the rationale behind the +`[[listeners]]` configuration format. ### StaticConfig Immutable after startup. 
Changes require a process restart. +| Field | Type | Description | +|-------|------|-------------| +| `listeners` | `Vec<ListenerConfig>` | Independent TLS endpoints, each with its own bind address and TLS config (see ADR-019) | +| `health_check_port` | `u16` | Port for local health check endpoint (default: `9900`; set to `0` to disable; see ADR-013) | +| `admin_socket_path` | `String` | Unix domain socket path for admin API (default: `/run/reverse-proxy/admin.sock`; empty string to disable; see ADR-014) | +| `log_level` | `"trace"`, `"debug"`, `"info"`, `"warn"`, `"error"` | Logging verbosity | +| `log_format` | `"text"` or `"json"` | Log output format | + +**ListenerConfig** (per-listener static config): + | Field | Type | Description | |-------|------|-------------| | `bind_addr` | `String` | IP address to bind to (must be explicit, no `0.0.0.0`; see ADR-016) | | `http_port` | `u16` | Port for HTTP→HTTPS redirect (default: `80`; set to `0` to disable) | | `https_port` | `u16` | Port for TLS listener (default: `443`) | -| `health_check_port` | `u16` | Port for local health check endpoint (default: `9900`; set to `0` to disable; see ADR-013) | -| `admin_socket_path` | `String` | Unix domain socket path for admin API (default: `/run/reverse-proxy/admin.sock`; empty string to disable; see ADR-014) | | `tls.mode` | `"acme"` or `"manual"` | Certificate provisioning mode | | `tls.acme_domains` | `Vec<String>` | Domains for ACME SAN certificate (ACME mode only) | | `tls.acme_cache_dir` | `String` | ACME state cache directory | | `tls.acme_directory` | `"production"` or `"staging"` | Let's Encrypt directory | | `tls.cert_path` | `String` | Certificate file path (manual mode only) | | `tls.key_path` | `String` | Private key file path (manual mode only) | -| `log_level` | `"trace"`, `"debug"`, `"info"`, `"warn"`, `"error"` | Logging verbosity | -| `log_format` | `"text"` or `"json"` | Log output format | -**Why these are static:** See ADR-008 for the rationale behind the -static/dynamic split. 
In summary: changing bind addresses, ports, or TLS mode -requires creating new listeners and TLS configurations — operations that -fundamentally require a restart. +**Why listeners are static:** Each listener requires binding a TCP socket and +constructing a TLS acceptor — operations that fundamentally require a restart. +Changing a listener's bind address, TLS mode, or certificate configuration +cannot be done without creating new listeners. See ADR-008 and ADR-019. ### DynamicConfig @@ -100,7 +130,12 @@ connections immediately. | `upstream` | `String` | Upstream address (e.g., `"127.0.0.1:3000"`) | | `upstream_scheme` | `"http"` or `"https"` | Protocol for upstream connection (default: `"http"`) | | `upstream_connect_timeout_secs` | `u64` | TCP connect timeout in seconds (default: `5`; see ADR-015, ADR-017) | -| `upstream_request_timeout_secs` | `u64` | Full request timeout in seconds (default: `60`; see ADR-015, ADR-017) | | +| `upstream_request_timeout_secs` | `u64` | Full request timeout in seconds (default: `60`; see ADR-015, ADR-017) | + +Sites are defined per listener in the `[[listeners]]` entries. Each listener +routes its own sites independently. The `DynamicConfig` collects all sites +across all listeners for hot-reload via `ArcSwap`. When a config reload +occurs, all listener site mappings are updated atomically. **Why these are dynamic:** See ADR-008 for the rationale. 
Site definitions and rate limits are per-request concerns that should not require restarting @@ -145,28 +180,19 @@ Both mechanisms converge on the same code path: ## TOML Config Format +### Multi-Config (Dedicated-IP Per Domain) + +The primary deployment model — each listener on its own IP with its own TLS +certificate: + ```toml # reverse-proxy config -[server] -bind_addr = "203.0.113.10" # Replace with actual bind address -http_port = 80 -https_port = 443 +# Global settings health_check_port = 9900 # Local health check (0 to disable) admin_socket_path = "/run/reverse-proxy/admin.sock" # Empty string to disable -[server.tls] -mode = "acme" # "acme" or "manual" -acme_domains = ["git.alk.dev", "alk.dev"] -acme_cache_dir = "/var/lib/reverse-proxy/acme-cache" -acme_directory = "production" # "production" or "staging" - -# Manual mode (uncomment and comment out ACME settings) -# mode = "manual" -# cert_path = "/etc/letsencrypt/live/git.alk.dev/fullchain.pem" -# key_path = "/etc/letsencrypt/live/git.alk.dev/privkey.pem" - -[server.logging] +[logging] level = "info" format = "text" # "text" or "json" @@ -177,31 +203,100 @@ burst = 20 [body] limit_bytes = 104857600 # 100 MB -[[sites]] +# Listener 1: git.alk.dev on its own IP +[[listeners]] +bind_addr = "203.0.113.10" +http_port = 80 +https_port = 443 + +[listeners.tls] +mode = "acme" +acme_domains = ["git.alk.dev"] +acme_cache_dir = "/var/lib/reverse-proxy/acme-cache-git" +acme_directory = "production" + +[[listeners.sites]] host = "git.alk.dev" upstream = "127.0.0.1:3000" upstream_scheme = "http" # upstream_connect_timeout_secs = 5 # Default: 5s -# upstream_request_timeout_secs = 60 # Default: 60s +# upstream_request_timeout_secs = 60 # Default: 60s -[[sites]] +# Listener 2: alk.dev on its own IP with a manual certificate +[[listeners]] +bind_addr = "203.0.113.11" +http_port = 80 +https_port = 443 + +[listeners.tls] +mode = "manual" +cert_path = "/etc/ssl/alk.dev/fullchain.pem" +key_path = "/etc/ssl/alk.dev/privkey.pem" 
+ +[[listeners.sites]] host = "alk.dev" upstream = "127.0.0.1:8080" upstream_scheme = "http" ``` +### Shared-IP Multi-Domain (SAN Certificate) + +A single listener serving multiple domains with one SAN certificate: + +```toml +# Global settings +health_check_port = 9900 +admin_socket_path = "/run/reverse-proxy/admin.sock" + +[logging] +level = "info" +format = "text" + +[rate_limit] +requests_per_second = 10 +burst = 20 + +[body] +limit_bytes = 104857600 + +# Single listener with multi-domain SAN certificate +[[listeners]] +bind_addr = "203.0.113.10" +http_port = 80 +https_port = 443 + +[listeners.tls] +mode = "acme" +acme_domains = ["git.alk.dev", "alk.dev"] +acme_cache_dir = "/var/lib/reverse-proxy/acme-cache" +acme_directory = "production" + +[[listeners.sites]] +host = "git.alk.dev" +upstream = "127.0.0.1:3000" + +[[listeners.sites]] +host = "alk.dev" +upstream = "127.0.0.1:8080" +``` + ### Validation On startup, the config is validated: -1. `bind_addr` is not `0.0.0.0` (must be explicit) -2. In ACME mode, `acme_domains` must be non-empty -3. In manual mode, `cert_path` and `key_path` must both be set and the files +1. At least one `[[listeners]]` entry must exist +2. Each listener's `bind_addr` is not `0.0.0.0` (must be explicit; see ADR-016) +3. Each listener's `bind_addr` and `https_port` combination must be unique +4. In ACME mode, `acme_domains` must be non-empty +5. In manual mode, `cert_path` and `key_path` must both be set and the files must be readable -4. Each site must have a `host` and `upstream` -5. Site `host` values must be unique (no duplicate hostnames) -6. `rate_limit.requests_per_second` must be > 0 -7. `body.limit_bytes` must be > 0 +6. Each site must have a `host` and `upstream` +7. Site `host` values must be unique across all listeners (no duplicate + hostnames, even across different listeners). 
Duplicate hostnames would create + ambiguous routing — the proxy would not know which listener's upstream to + route a request to when the `Host` header matches multiple sites. +8. `rate_limit.requests_per_second` must be > 0 +9. `body.limit_bytes` must be > 0 On SIGHUP reload, the same validation applies. If the new config fails validation, the reload is rejected and the old config remains active. An error @@ -225,6 +320,7 @@ All design decisions are documented as ADRs in [decisions/](decisions/). | [014](decisions/014-unix-socket-reload.md) | Unix domain socket config reload API | Programmatic reload with success/failure feedback | | [015](decisions/015-per-site-timeouts.md) | Per-site upstream timeouts with defaults | 5s connect / 60s request defaults, per-site overrides | | [016](decisions/016-explicit-bind-address.md) | Explicit bind address required | Rejects `0.0.0.0` to prevent accidental exposure | +| [019](decisions/019-multi-config-listeners.md) | Multi-config listeners | `[[listeners]]` supporting both dedicated-IP and shared-IP deployment models | ## Open Questions @@ -233,5 +329,5 @@ questions affecting this document: - ~~**OQ-04**: Should config reload support a Unix domain socket API in addition to SIGHUP?~~ (resolved — ADR-014: Unix domain socket admin API added) -- **OQ-07**: Should per-site TLS overrides be supported for mixed ACME/manual - domains? (open) \ No newline at end of file +- ~~**OQ-07**: Should per-site TLS overrides be supported for mixed ACME/manual + domains?~~ (resolved — ADR-019: `[[listeners]]` with per-listener TLS config) \ No newline at end of file diff --git a/docs/architecture/decisions/010-multi-site-phase1.md b/docs/architecture/decisions/010-multi-site-phase1.md index 32e17f2..7e022f4 100644 --- a/docs/architecture/decisions/010-multi-site-phase1.md +++ b/docs/architecture/decisions/010-multi-site-phase1.md @@ -31,7 +31,8 @@ and `rustls-acme` supports multi-domain certificates natively. 
Move multi-site support from Phase 2 into Phase 1. The proxy supports multiple sites from the initial release: -- `[[sites]]` array in config (already the planned format) +- `[[listeners.sites]]` array in each listener config (after ADR-019; was + `[[sites]]` at top level) - Host-based routing via axum's `Host` extractor (already the planned approach) - Multi-domain ACME certificate provisioning via `rustls-acme` - Each site maps a hostname to an upstream address @@ -78,8 +79,7 @@ Phase 3 remains future enhancements. - Slightly more testing surface (must verify correct routing with multiple sites) - Must test multi-domain ACME provisioning (not just single-domain) -- Wildcard or fallback site behavior needs to be defined (addressed in - OQ-07) +- Wildcard or fallback site behavior is defined by the listener's site routing ## References diff --git a/docs/architecture/decisions/011-multi-domain-tls.md b/docs/architecture/decisions/011-multi-domain-tls.md index 344dfc3..32525b6 100644 --- a/docs/architecture/decisions/011-multi-domain-tls.md +++ b/docs/architecture/decisions/011-multi-domain-tls.md @@ -30,23 +30,22 @@ certificate covering all proxied domains. Manual mode uses certificate file paths (single cert file with all domains, or one cert per domain resolved via SNI). -The config format changes from the previous single-domain format: +With ADR-019, TLS configuration lives inside `[[listeners]]` entries. Each +listener has its own TLS mode and domain list. The config format is: ```toml -# Previous (single-domain) format — no longer used -[tls] -mode = "acme" -acme_domain = "git.alk.dev" # single string -``` +# Current format (after ADR-019) +[[listeners]] +bind_addr = "203.0.113.10" -To the current multi-domain format: - -```toml -[tls] +[listeners.tls] mode = "acme" acme_domains = ["git.alk.dev", "alk.dev"] # array of strings ``` +The previous single-listener format (pre-ADR-019) used a `[server.tls]` section +which is no longer valid. 
+ In ACME mode, `rustls-acme` provisions a single certificate covering all listed domains via Subject Alternative Names (SAN). This is the standard Let's Encrypt approach for multi-domain certificates. @@ -82,11 +81,12 @@ certificate or separate certificates resolved via SNI). domains must be validated) — mitigated by Let's Encrypt's domain-level validation - Per-site TLS configuration (e.g., a domain with a manual cert) requires a - future config extension (OQ-07) + future config extension — addressed by ADR-019 (multi-config listeners) ## References - [tls.md](../tls.md) - [config.md](../config.md) - ADR-010 (multi-site in Phase 1) -- ADR-004 (ACME-primary certificate management) \ No newline at end of file +- ADR-004 (ACME-primary certificate management) +- ADR-019 (multi-config listener support) \ No newline at end of file diff --git a/docs/architecture/decisions/016-explicit-bind-address.md b/docs/architecture/decisions/016-explicit-bind-address.md index 68cd21f..a120ddb 100644 --- a/docs/architecture/decisions/016-explicit-bind-address.md +++ b/docs/architecture/decisions/016-explicit-bind-address.md @@ -18,8 +18,9 @@ deployment. ## Decision -The `bind_addr` field must be an explicit IP address. `0.0.0.0` is rejected -during config validation. The proxy will not start if `bind_addr` is `0.0.0.0`. +The `bind_addr` field on each `[[listeners]]` entry must be an explicit IP +address. `0.0.0.0` is rejected during config validation. The proxy will not +start if any listener's `bind_addr` is `0.0.0.0`. ## Rationale diff --git a/docs/architecture/decisions/019-multi-config-listeners.md b/docs/architecture/decisions/019-multi-config-listeners.md new file mode 100644 index 0000000..b120896 --- /dev/null +++ b/docs/architecture/decisions/019-multi-config-listeners.md @@ -0,0 +1,171 @@ +# ADR-019: Multi-Config Listener Support + +## Status + +Accepted + +## Context + +OQ-07 asked whether per-site TLS overrides should be supported for mixed +ACME/manual domains. 
The original framing assumed a single listener with one +TLS configuration, where the question was about mixing certificate sources +within that listener. + +However, there are two distinct deployment models for multi-domain TLS: + +1. **Shared-IP multi-domain**: One IP address, one TLS listener, one SAN + certificate covering multiple domains via SNI. All domains share the same + ACME configuration. This is what the current architecture documents describe. + +2. **Dedicated-IP single-domain (multi-config)**: Each IP address gets its own + SSL certificate for its own domain. In this model, 1 IP = 1 SSL = 1 domain. + Each listener is independently configured with its own bind address, TLS + certificate, and site mapping. No SNI multiplexing is needed because each + domain has its own IP. + +The actual deployment uses model 2 (dedicated-IP single-domain). The proxy +should support both models, and the choice between them should be a deployment +concern, not an architectural limitation. + +There are two approaches to supporting model 2: + +- **Multiple instances**: Run separate `reverse-proxy` processes, each with + their own config file binding to a different IP. Simple, isolated, no + code changes needed. +- **Multi-listener single process**: One process with a `[[listeners]]` + config that defines multiple independent listeners, each with their own + bind address, TLS config, and site mapping. More complex but shares + resources (process overhead, connection pool, logging). + +## Decision + +Support both deployment models by extending the configuration format with +`[[listeners]]`, where each listener is an independent TLS endpoint with its +own bind address, TLS configuration, and site routing. 
+ +The config format changes from a single implicit listener to explicit +`[[listeners]]` entries: + +```toml +# Multi-config: two listeners, each on a different IP +[[listeners]] +bind_addr = "203.0.113.10" +http_port = 80 +https_port = 443 + +[listeners.tls] +mode = "acme" +acme_domains = ["git.alk.dev"] +acme_cache_dir = "/var/lib/reverse-proxy/acme-cache" +acme_directory = "production" + +[[listeners.sites]] +host = "git.alk.dev" +upstream = "127.0.0.1:3000" + +[[listeners]] +bind_addr = "203.0.113.11" +http_port = 80 +https_port = 443 + +[listeners.tls] +mode = "manual" +cert_path = "/etc/ssl/alk.dev/fullchain.pem" +key_path = "/etc/ssl/alk.dev/privkey.pem" + +[[listeners.sites]] +host = "alk.dev" +upstream = "127.0.0.1:8080" +``` + +Each `[[listeners]]` entry is an independent TLS endpoint with: +- Its own `bind_addr` (required, no `0.0.0.0`; see ADR-016) +- Its own `http_port` and `https_port` +- Its own `tls` configuration (ACME or manual) +- Its own `[[listeners.sites]]` array for host-based routing + +The single-listener case is a natural subset of this format: a config with +one `[[listeners]]` entry behaves identically to the current single-listener +design. The multi-domain SAN case (one IP, one cert, multiple domains) is +also supported: a single listener with multiple domains in `acme_domains` +and multiple `[[listeners.sites]]` entries. + +Global settings (logging, rate limiting, body limits, health check, admin +socket) are top-level configuration. Listener-specific settings (bind address, +ports, TLS, sites) live inside each `[[listeners]]` entry. + +Example configuration: + +```toml +# Global settings +health_check_port = 9900 +admin_socket_path = "/run/reverse-proxy/admin.sock" + +[logging] +level = "info" +format = "text" + +[rate_limit] +requests_per_second = 10 +burst = 20 + +[body] +limit_bytes = 104857600 + +# Listener definitions +[[listeners]] +bind_addr = "203.0.113.10" +# ... listener config ... 
+``` + +## Rationale + +- The dedicated-IP model (1 IP = 1 SSL = 1 domain) is our actual deployment. + The architecture should natively support it. +- The shared-IP multi-domain model (SAN certificate) is also common and useful. + Supporting both is straightforward with the `[[listeners]]` format. +- Multiple instances would work but wastes resources (separate processes, + separate connection pools, separate logging pipelines) for what is + fundamentally the same binary doing the same job. +- A single process with multiple listeners is more efficient: one process, one + log pipeline, one config reload mechanism, one health check endpoint. +- The `[[listeners]]` format is a natural extension — each listener is + independent, but global concerns (logging, rate limiting, health checks) + are shared, which is correct since they're all the same proxy. +- The single-listener case is a degenerate case of `[[listeners]]` with one + entry, so existing configurations translate trivially. + +## Consequences + +**Positive:** +- Natively supports both deployment models (dedicated-IP and shared-IP) +- Single process for all listeners: shared logging, config reload, health + checks, rate limiting +- Each listener is fully independent — different IPs, different TLS configs, + different domains +- The config format is regular and predictable — `[[listeners]]` is a + natural TOML array of tables +- Mixed ACME/manual configurations are naturally supported: each listener + chooses its own TLS mode +- The single-listener, single-domain case is trivially supported + +**Negative:** +- Config format change from `[server]` with implicit single listener to + `[[listeners]]` with explicit listener entries +- The proxy must manage multiple TCP listeners and TLS acceptors, which + adds implementation complexity +- Each listener needs its own ACME state machine (in ACME mode), increasing + resource usage proportional to listener count +- Global rate limiting applies across all listeners (per-IP 
rate limiting + doesn't distinguish which listener received the request — this is correct + behavior since the IP is the same regardless of listener) + +## References + +- [config.md](../config.md) +- [tls.md](../tls.md) +- [overview.md](../overview.md) +- ADR-008 (static/dynamic config split) +- ADR-011 (multi-domain TLS config) +- ADR-016 (explicit bind address) +- OQ-05 (multiple bind addresses — now resolved via per-listener `bind_addr`) \ No newline at end of file diff --git a/docs/architecture/open-questions.md b/docs/architecture/open-questions.md index 90a7fa6..b55c844 100644 --- a/docs/architecture/open-questions.md +++ b/docs/architecture/open-questions.md @@ -27,19 +27,19 @@ last_updated: 2026-06-11 See ADR-007. - **Cross-references**: ADR-007 -### OQ-07: Should per-site TLS overrides be supported for mixed ACME/manual domains? +### ~~OQ-07: Should per-site TLS overrides be supported for mixed ACME/manual domains?~~ - **Origin**: [tls.md](tls.md), [config.md](config.md) -- **Status**: open +- **Status**: resolved - **Priority**: low -- **Context**: Phase 1 uses a single TLS configuration (ACME or manual) for all - domains. All domains share the same ACME config and certificate. If a future - domain needs a manual certificate (e.g., a corporate CA cert) while other - domains use ACME, a per-site TLS override would be needed. This would require - a custom `ResolvesServerCert` that combines ACME-provisioned certs with - manually loaded certs. For now, all proxied domains use the same ACME config, - so this is not needed. -- **Cross-references**: ADR-011 +- **Resolution**: Resolved by introducing `[[listeners]]` configuration. Each + listener is an independent TLS endpoint with its own bind address, TLS config, + and site routing. This supports both deployment models: (1) shared-IP + multi-domain (one listener, SAN certificate, SNI routing) and (2) dedicated-IP + single-domain (multiple listeners, each with its own IP/cert/domain). 
Mixed + ACME/manual configurations are naturally supported since each listener has its + own TLS mode. See ADR-019. +- **Cross-references**: ADR-011, ADR-019 ## Logging and Monitoring @@ -73,11 +73,12 @@ last_updated: 2026-06-11 - **Origin**: [overview.md](overview.md) - **Status**: resolved - **Priority**: low -- **Resolution**: A single `bind_addr` is sufficient. The proxy binds to one - explicit IP address (not `0.0.0.0`). Multi-address binding is not needed for - this single-server deployment. If needed in the future, `bind_addr` could be - extended to an array. See config.md for the `bind_addr` field. -- **Cross-references**: ADR-016 +- **Resolution**: A single `bind_addr` per listener entry is sufficient. ADR-019 + introduced `[[listeners]]`, where each listener has its own `bind_addr`. This + supports multiple bind addresses in a single process — one per listener — + without needing an array of addresses on a single listener. See ADR-016 and + ADR-019. +- **Cross-references**: ADR-016, ADR-019 ## Proxy diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md index 316fe6a..bc2c852 100644 --- a/docs/architecture/overview.md +++ b/docs/architecture/overview.md @@ -38,8 +38,11 @@ details. ### In Scope - **Phase 1**: Multi-site reverse proxy with TLS termination - - TLS termination with ACME (Let's Encrypt) multi-domain certificate management - - Manual certificate paths as fallback mode + - Multiple independent TLS listeners via `[[listeners]]` configuration + - Each listener has its own bind address, TLS config, and site routing + - Supports both dedicated-IP (1 IP = 1 cert = 1 domain) and shared-IP + (SAN certificate) deployment models (ADR-019) + - TLS termination with ACME (Let's Encrypt) and manual certificate management - Cipher suite restriction matching nginx scope (ECDHE-AES-GCM + TLS 1.3) - HTTP → HTTPS redirect - Host-based routing to multiple upstream services @@ -49,7 +52,7 @@ details. 
- Per-site upstream timeouts with sensible defaults (5s connect, 60s request) - Request rate limiting with fail2ban-compatible logging (global per-IP) - 100 MB body size limit (global) - - Configurable bind address (must be explicit, no `0.0.0.0`) + - Configurable bind addresses (must be explicit, no `0.0.0.0`) - Local health check endpoint on separate port (default: 9900, localhost only) - Unix domain socket admin API for config reload with feedback - Graceful shutdown (SIGTERM handling) @@ -64,7 +67,6 @@ details. - **Phase 3**: Future enhancements - Wildcard subdomain support - - Per-site TLS overrides (manual certs for specific domains) ### Out of Scope @@ -75,36 +77,37 @@ details. - Static file serving - Access control beyond rate limiting (no auth, no IP allowlists in Phase 1) - CGI, SCGI, uWSGI, FastCGI -- Per-site TLS configuration (all domains share one ACME config in Phase 1) ## Architecture ``` ┌────────────────────────────────────┐ │ reverse-proxy (Rust/axum) │ -config.toml ──────► │ StaticConfig + DynamicConfig │ +config.toml ───────► │ StaticConfig + DynamicConfig │ │ (ArcSwap for hot-reload) │ │ │ -bind_addr:80 ──► │ HTTP listener → 301 redirect │ - │ to HTTPS │ + │ ┌─ Listener 1 ─────────────────┐ │ +bind_addr_1:80 ───► │ │ HTTP → 301 redirect │ │ + │ └────────────────────────────────┘ │ +bind_addr_1:443 ───► │ │ TLS listener (tokio-rustls) │ │ + │ │ ├─ ACME or Manual TLS config │ │ + │ │ └─ axum router │ │ + │ │ ├─ Host-based routing │ │ + │ │ ├─ git.alk.dev → :3000 │ │ + │ │ └─ Rate limiting, headers │ │ + │ └────────────────────────────────┘ │ │ │ -bind_addr:443 ──► │ TLS listener (tokio-rustls) │ - │ ├─ ACME mode: rustls-acme resolver │ - │ │ (multi-domain SAN cert, │ - │ │ auto-provision & renew) │ - │ └─ Manual mode: cert/key file paths │ + │ ┌─ Listener N ─────────────────┐ │ +bind_addr_N:80 ───► │ │ HTTP → 301 redirect │ │ + │ └────────────────────────────────┘ │ +bind_addr_N:443 ───► │ │ TLS listener (tokio-rustls) │ │ + │ │ ├─ Manual TLS 
cert │ │ + │ │ └─ axum router │ │ + │ │ ├─ alk.dev → :8080 │ │ + │ │ └─ Rate limiting, headers │ │ + │ └────────────────────────────────┘ │ │ │ - │ axum router │ - │ ├─ Host-based routing │ - │ │ ├─ git.alk.dev → :3000 │ - │ │ └─ alk.dev → :8080 │ - │ ├─ Rate limiting middleware │ - │ ├─ Proxy header injection │ - │ ├─ Body size limit (100MB) │ - │ └─ Reverse proxy handler │ - │ └─ hyper Client → upstream │ - │ │ - │ /health → 200 OK │ + │ /health → 200 OK (port 9900) │ └────────────────────────────────────┘ ``` @@ -176,6 +179,7 @@ All design decisions are documented as ADRs in [decisions/](decisions/). | [016](decisions/016-explicit-bind-address.md) | Explicit bind address required | Rejects `0.0.0.0` to prevent accidental exposure | | [017](decisions/017-upstream-connection-defaults.md) | Upstream connection defaults | HTTP/1.1, no redirects, connection pooling | | [018](decisions/018-body-size-limit.md) | Request body size limit | 100 MB default matching nginx, Gitea push compatibility | +| [019](decisions/019-multi-config-listeners.md) | Multi-config listeners | `[[listeners]]` supporting both dedicated-IP and shared-IP deployment models | ## Open Questions @@ -184,4 +188,4 @@ questions affecting this document: - ~~**OQ-01**: Should cipher suites be restricted beyond rustls defaults?~~ (resolved — ADR-012) - ~~**OQ-03**: Should the health check endpoint be on a separate port?~~ (resolved — ADR-013) -- **OQ-07**: Should per-site TLS overrides be supported for mixed ACME/manual domains? (open) \ No newline at end of file +- ~~**OQ-07**: Should per-site TLS overrides be supported for mixed ACME/manual domains?~~ (resolved — ADR-019: `[[listeners]]` with per-listener TLS config) \ No newline at end of file diff --git a/docs/architecture/proxy.md b/docs/architecture/proxy.md index 4369bc0..8c10eaa 100644 --- a/docs/architecture/proxy.md +++ b/docs/architecture/proxy.md @@ -131,11 +131,11 @@ specified, defaults of 5s connect and 60s request are used. ### 5. 
HTTP → HTTPS Redirect -A separate HTTP listener on port 80 handles redirect. It reads the `Host` -header from the incoming request and returns a 301 Permanent Redirect to the -HTTPS equivalent URL (preserving the path and query string). +A separate HTTP listener on port 80 (per listener) handles redirect. It reads +the `Host` header from the incoming request and returns a 301 Permanent Redirect +to the HTTPS equivalent URL (preserving the path and query string). -This listener runs on the same bind address as the TLS listener but on port 80. +Each listener has its own HTTP redirect on its own bind address. ## Upstream Connection diff --git a/docs/architecture/tls.md b/docs/architecture/tls.md index b1fc3da..e14ace3 100644 --- a/docs/architecture/tls.md +++ b/docs/architecture/tls.md @@ -19,36 +19,50 @@ upstream services. It replaces nginx's `ssl_certificate`, `ssl_protocols`, and ## Architecture +The proxy supports multiple independent TLS listeners, each with its own bind +address, TLS configuration, and site routing. See ADR-019 for the rationale. 
+ ``` ┌──────────────────────────────────────────┐ │ TLS Termination │ │ │ - bind_addr:443 ──► │ TcpListener::bind(bind_addr) │ - │ │ │ - │ ▼ │ - │ tokio-rustls::TlsAcceptor │ - │ │ │ - │ ├─ ACME mode: │ - │ │ rustls-acme::ResolvesServerCertAcme │ - │ │ (auto-provisions & renews certs) │ - │ │ │ - │ └─ Manual mode: │ - │ rustls::ServerConfig │ - │ .with_single_cert(cert_chain, key) │ + │ ┌─ Listener 1 ─────────────────────────┐ │ + │ │ bind_addr_1:443 │ │ + │ │ TcpListener::bind(bind_addr_1) │ │ + │ │ │ │ │ + │ │ ▼ │ │ + │ │ tokio-rustls::TlsAcceptor │ │ + │ │ │ │ │ + │ │ ACME or Manual TLS config │ │ + │ │ (per-listener TLS mode) │ │ + │ │ │ │ │ + │ │ ▼ │ │ + │ │ TlsStream │ │ + │ │ │ │ │ + │ │ ▼ │ │ + │ │ axum router (per-listener sites) │ │ + │ └───────────────────────────────────────┘ │ │ │ - │ │ │ - │ ▼ │ - │ TlsStream │ - │ │ │ - │ ▼ │ - │ hyper::service_fn → axum router │ + │ ┌─ Listener N ─────────────────────────┐ │ + │ │ bind_addr_N:443 │ │ + │ │ ... (same structure) │ │ + │ └───────────────────────────────────────┘ │ └──────────────────────────────────────────┘ bind_addr:80 ──► HTTP listener (redirect to HTTPS, no TLS) ``` +Each listener is independently configured. This supports two deployment models: + +1. **Shared-IP multi-domain**: One listener with multiple domains in + `acme_domains`, using a single SAN certificate and SNI routing. +2. **Dedicated-IP single-domain**: Multiple listeners, each with its own IP, + its own TLS certificate, and its own site. No SNI needed. + ## Certificate Provisioning +Each listener has its own TLS mode (ACME or manual), configured independently. + ### ACME Mode (Primary) Uses `rustls-acme` for automatic certificate provisioning and renewal through @@ -57,12 +71,14 @@ no deploy hooks. **How it works:** -1. `AcmeCertProvider` configures the ACME client with the domain list, cache - directory, and Let's Encrypt directory (staging or production). -2. 
`AcmeConfig::new(domains)` creates an ACME configuration for all listed - domains. Let's Encrypt will issue a single SAN certificate covering all - domains. -3. The ACME state machine runs as a background tokio task, handling: +1. Each listener in ACME mode creates its own `AcmeCertProvider` with the + listener's domain list, cache directory, and Let's Encrypt directory. +2. `AcmeConfig::new(domains)` creates an ACME configuration for the domains + listed in that listener's `acme_domains`. Let's Encrypt will issue a + certificate covering those domains (a single SAN certificate or a + single-domain certificate, depending on how many domains are listed). +3. The ACME state machine runs as a background tokio task per listener, + handling: - Account registration with Let's Encrypt - Certificate ordering - TLS-ALPN-01 challenge (or HTTP-01 challenge) @@ -73,10 +89,13 @@ no deploy hooks. 5. When a new certificate is issued, the resolver updates atomically — no restart or signal handling needed. -**Configuration:** +**Configuration (within a `[[listeners]]` entry):** ```toml -[server.tls] +[[listeners]] +bind_addr = "203.0.113.10" + +[listeners.tls] mode = "acme" acme_domains = ["git.alk.dev", "alk.dev"] acme_cache_dir = "/var/lib/reverse-proxy/acme-cache" @@ -85,7 +104,8 @@ acme_directory = "production" # or "staging" for testing **Cache directory:** The `DirCache` from rustls-acme persists ACME account data, private keys, and certificates between restarts. This avoids re-provisioning on -every restart. +every restart. Each listener should use its own cache directory to avoid conflicts +between separate ACME state machines. ### Manual Mode (Fallback) @@ -94,10 +114,13 @@ corporate CAs, or BYO certificates), the proxy loads certificates from file paths at startup. 
```toml -[tls] +[[listeners]] +bind_addr = "203.0.113.11" + +[listeners.tls] mode = "manual" -cert_path = "/etc/letsencrypt/live/git.alk.dev/fullchain.pem" -key_path = "/etc/letsencrypt/live/git.alk.dev/privkey.pem" +cert_path = "/etc/ssl/alk.dev/fullchain.pem" +key_path = "/etc/ssl/alk.dev/privkey.pem" ``` Certificate files are loaded once at startup using `rustls_pemfile`. Manual @@ -138,27 +161,42 @@ parity during migration. ### ServerConfig Construction +Each listener constructs its own `ServerConfig` based on its TLS mode. + For manual mode, the `ServerConfig` is built with `with_no_client_auth()` and -a custom `ResolvesServerCert` implementation that maps SNI hostnames to -certificate/key pairs loaded from disk. +the loaded certificate chain and private key. If the listener serves multiple +domains from a single listener, a custom `ResolvesServerCert` implementation +maps SNI hostnames to certificate/key pairs loaded from disk. For ACME mode, the `ServerConfig` is built with `with_cert_resolver()`, passing -the `ResolvesServerCertAcme` resolver. The ACME configuration includes all -domains listed in `acme_domains`, and the resolver manages a single SAN -certificate covering all of them. The ACME TLS-ALPN-01 protocol identifier -(`acme-tls/1`) must be registered in the `alpn_protocols` list so the server -can respond to TLS-ALPN-01 challenges. +the `ResolvesServerCertAcme` resolver. The ACME configuration includes the +domains listed in that listener's `acme_domains`, and the resolver manages the +certificate. The ACME TLS-ALPN-01 protocol identifier (`acme-tls/1`) must be +registered in the `alpn_protocols` list so the server can respond to +TLS-ALPN-01 challenges. Both modes use the `aws_lc_rs` crypto provider with safe default protocol versions (TLS 1.2 and TLS 1.3). 
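To make the per-listener construction concrete, here is a minimal sketch of a single config file that mixes both TLS modes, assuming the `[[listeners]]` layout described above. The key names are taken from this document. The integer port values, the `example.org` domain, the third IP address, and the per-listener cache directory names are illustrative assumptions, not values the project prescribes.

```toml
# Listener 1: shared-IP, multi-domain, ACME.
# One SAN certificate covers both domains; SNI selects it at handshake time.
[[listeners]]
bind_addr = "203.0.113.10"
http_port = 80      # assumed to be an integer; serves the HTTP -> HTTPS redirect
https_port = 443    # assumed to be an integer

[listeners.tls]
mode = "acme"
acme_domains = ["git.alk.dev", "alk.dev"]
acme_cache_dir = "/var/lib/reverse-proxy/acme-cache/listener-1"  # hypothetical path
acme_directory = "production"

# Listener 2: dedicated-IP, single-domain, ACME.
# Uses its own cache directory so the two ACME state machines do not conflict.
[[listeners]]
bind_addr = "203.0.113.12"  # hypothetical address

[listeners.tls]
mode = "acme"
acme_domains = ["example.org"]  # placeholder domain
acme_cache_dir = "/var/lib/reverse-proxy/acme-cache/listener-2"  # hypothetical path
acme_directory = "staging"

# Listener 3: dedicated-IP, manual certificate loaded from disk at startup.
[[listeners]]
bind_addr = "203.0.113.11"

[listeners.tls]
mode = "manual"
cert_path = "/etc/ssl/alk.dev/fullchain.pem"
key_path = "/etc/ssl/alk.dev/privkey.pem"
```

In TOML, each `[listeners.tls]` table attaches to the most recent `[[listeners]]` entry, so the order of the blocks determines which listener a TLS section configures. Under this layout, each of the three entries would yield its own `ServerConfig`: the two ACME listeners through `with_cert_resolver()`, and the manual listener through the loaded certificate chain and key.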
## SNI-Based Certificate Selection -### ACME Mode (Multi-Domain) +### Dedicated-IP Single-Domain (Multi-Config) + +In the dedicated-IP model, each listener binds to its own IP address and serves +exactly one domain with one certificate. SNI is not required for certificate +selection — the listener's TLS config already has the correct certificate. + +This is the simplest case: one IP, one listener, one certificate, one domain. +No SNI resolution logic is needed. + +### Shared-IP Multi-Domain (SAN Certificate) + +In the shared-IP model, a single listener serves multiple domains using a SAN +certificate. SNI-based certificate selection is required. In ACME mode, `rustls-acme` manages a single SAN certificate covering all -configured domains. The `ResolvesServerCertAcme` resolver automatically serves -the correct certificate during the TLS handshake. +configured domains for that listener. The `ResolvesServerCertAcme` resolver +automatically serves the correct certificate during the TLS handshake. 1. **TLS handshake**: The client sends the SNI extension indicating which hostname it's connecting to. @@ -172,10 +210,11 @@ This is the same pattern nginx uses — SNI selects the cert during TLS, then `Host` header selects the server block. ACME mode handles this automatically through the cert resolver. -### Manual Mode (Multi-Domain) +### Manual Mode with Multiple Domains -In manual mode, a custom `ResolvesServerCert` implementation is required to -map SNI hostnames to the correct `CertifiedKey`. This implementation: +In manual mode on a shared-IP listener, a custom `ResolvesServerCert` +implementation maps SNI hostnames to the correct `CertifiedKey`. This +implementation: 1. Loads certificate files at startup (or on SIGHUP for reload) 2. Maps each domain name to its certificate chain and private key @@ -183,17 +222,17 @@ map SNI hostnames to the correct `CertifiedKey`. 
This implementation: matching `CertifiedKey` The custom resolver must handle the case where no matching certificate exists -for the SNI hostname — in this case, the handshake fails, which is the -correct behavior (we don't serve a default certificate for unknown domains). - -See [open-questions.md](open-questions.md) OQ-07 for per-site TLS overrides. +for the SNI hostname — in this case, the handshake fails, which is the correct +behavior (we don't serve a default certificate for unknown domains). ## HTTP Listener (Port 80) -The HTTP listener on port 80 is a plain TCP listener with no TLS. It has one -job: redirect all requests to the HTTPS equivalent. +Each listener has its own HTTP listener on port 80 (or the configured +`http_port`). It is a plain TCP listener with no TLS. It has one job: redirect +all requests to the HTTPS equivalent. -The listener binds to the same IP address as the TLS listener, but on port 80. +Each HTTP listener binds to the same IP address as its corresponding TLS +listener, but on port 80. ### ACME Challenge Type @@ -225,6 +264,7 @@ All design decisions are documented as ADRs in [decisions/](decisions/). | [010](decisions/010-multi-site-phase1.md) | Multi-site in Phase 1 | Multiple domains from initial release | | [011](decisions/011-multi-domain-tls.md) | Multi-domain TLS config | Single SAN certificate covering all domains via rustls-acme | | [012](decisions/012-cipher-suite-restriction.md) | Restrict cipher suites | Match nginx scope: four ECDHE-AES-GCM suites for TLS 1.2, all TLS 1.3 suites | +| [019](decisions/019-multi-config-listeners.md) | Multi-config listeners | `[[listeners]]` supporting both dedicated-IP and shared-IP deployment models | ## Open Questions @@ -232,5 +272,4 @@ Open questions are tracked in [open-questions.md](open-questions.md). 
Key questions affecting this document: - ~~**OQ-01**: Should cipher suites be restricted beyond rustls defaults?~~ (resolved — ADR-012: restrict to nginx scope) -- **OQ-07**: Should per-site TLS overrides be supported for mixed ACME/manual - domains? (open) \ No newline at end of file +- ~~**OQ-07**: Should per-site TLS overrides be supported for mixed ACME/manual domains?~~ (resolved — ADR-019: `[[listeners]]` with per-listener TLS config) \ No newline at end of file