diff --git a/docs/architecture/README.md b/docs/architecture/README.md index a6fdd69..2e99eaa 100644 --- a/docs/architecture/README.md +++ b/docs/architecture/README.md @@ -1,6 +1,6 @@ --- status: draft -last_updated: 2026-06-11 +last_updated: 2026-06-12 --- # Reverse Proxy — Architecture @@ -53,6 +53,7 @@ certificate via ACME. | [019](decisions/019-multi-config-listeners.md) | Multi-Config Listener Support | Accepted | | [020](decisions/020-container-deployment.md) | Container Deployment Model | Accepted | | [021](decisions/021-x-forwarded-for-edge-proxy.md) | X-Forwarded-For Edge Proxy Model | Accepted | +| [022](decisions/022-health-check-scope.md) | Health Check Scope — Local Port and Admin Socket Only | Accepted | ## Open Questions @@ -67,11 +68,11 @@ See [open-questions.md](open-questions.md) for the full tracker. | ~~OQ-05~~ | ~~Should the proxy bind to multiple addresses?~~ | ~~low~~ | **resolved** (single bind_addr sufficient) | | ~~OQ-06~~ | ~~Should upstream timeouts be configurable per-site?~~ | ~~low~~ | **resolved** (ADR-015) | | ~~OQ-07~~ | ~~Should per-site TLS overrides be supported for mixed ACME/manual domains?~~ | ~~low~~ | **resolved** (ADR-019) | -| OQ-08 | Should the `/health` path use a less common endpoint to avoid upstream collision? | medium | open | -| OQ-09 | How should `upstream_connect_timeout_secs` be enforced? | medium | open | -| OQ-10 | Should ACME contact email be a required config field? | high | open | -| OQ-11 | How should `X-Forwarded-Proto` be derived per-listener? | medium | open | -| OQ-12 | Should request access logging be mandatory or optional? | high | open | +| ~~OQ-08~~ | ~~Should `/health` use a less common path to avoid upstream collision?~~ | ~~medium~~ | **resolved** (ADR-022: no `/health` route on main listener) | +| ~~OQ-09~~ | ~~How should `upstream_connect_timeout_secs` be enforced?~~ | ~~medium~~ | **resolved** (implementation gap — ADR-015 already decides this) | +| ~~OQ-10~~ | ~~Should ACME contact email be a required config field?~~ | ~~high~~ | **resolved** (already specified in config.md; implementation bug C2) | +| ~~OQ-11~~ | ~~How should `X-Forwarded-Proto` be derived per-listener?~~ | ~~medium~~ | **resolved** (hardcoded `https` is correct for TLS-terminating proxy) | +| ~~OQ-12~~ | ~~Should request access logging be mandatory or optional?~~ | ~~high~~ | **resolved** (mandatory, always-on per operations.md) | ## Document Lifecycle diff --git a/docs/architecture/config.md b/docs/architecture/config.md index 090085b..ec4da66 100644 --- a/docs/architecture/config.md +++ b/docs/architecture/config.md @@ -87,7 +87,7 @@ Immutable after startup. Changes require a process restart. |-------|------|-------------| | `listeners` | `Vec` | Independent TLS endpoints, each with its own bind address and TLS config (see ADR-019) | | `allow_wildcard_bind` | `bool` | Allow `0.0.0.0` as a bind address. Required for container deployments. Default: `false` (see ADR-016, ADR-020) | -| `health_check_port` | `u16` | Port for local health check endpoint (default: `9900`; set to `0` to disable; see ADR-013) | +| `health_check_port` | `u16` | Port for local health check endpoint (default: `9900`; set to `0` to disable; bound to `127.0.0.1` only; see ADR-013, ADR-022) | | `admin_socket_path` | `String` | Unix domain socket path for admin API (default: `/run/reverse-proxy/admin.sock`; empty string to disable; see ADR-014) | | `shutdown_timeout_secs` | `u64` | Maximum seconds to wait for in-flight requests during graceful shutdown (default: `30`) | | `logging` | `LoggingConfig` | Logging configuration (see below) | diff --git a/docs/architecture/decisions/013-health-check-port.md b/docs/architecture/decisions/013-health-check-port.md index 8c8a41e..2f8b0ce 100644 --- a/docs/architecture/decisions/013-health-check-port.md +++ b/docs/architecture/decisions/013-health-check-port.md @@ -7,21 +7,23 @@ Accepted ## Context The health check endpoint (`/health`) needs to be accessible for monitoring -without requiring TLS. Currently the design places it on the main HTTPS -listener, which means: +without requiring TLS. Serving it on the main HTTPS listener would mean: 1. TLS handshake must succeed for the health check to respond 2. External monitoring tools need to handle TLS 3. A TLS configuration error would make the health check unreachable, creating a false-negative monitoring signal +4. It creates collision with upstream applications that use `/health` for their + own health checks (see ADR-022) Three options were considered (see OQ-03): -1. **Main HTTPS listener only**: Simplest, but TLS config errors make health - checks unreachable -2. **Separate unencrypted port on localhost**: Simple, works with standard - monitoring tools, but health checks bypass TLS +1. **Separate unencrypted port on localhost (chosen)**: Simple, works with + standard monitoring tools, health checks work even when TLS is misconfigured +2. **Main HTTPS listener only**: Would require TLS for health checks, creating + a circular dependency — TLS config errors would make health checks unreachable 3. **Admin port with its own listener**: Most flexible but adds complexity + beyond what's needed for a simple health check ## Decision @@ -31,8 +33,8 @@ HTTP and HTTPS listeners. The port is configurable via `health_check_port` in StaticConfig. The default value is `9900` (enabled, localhost only). Setting it to `0` disables the -separate health check listener, and `/health` remains available on the main -HTTPS listener as a fallback. +health check listener entirely — there is no `/health` route on the main HTTPS +listener (see ADR-022). ## Rationale @@ -45,8 +47,9 @@ HTTPS listener as a fallback. the same host) can reach it - Configurable port allows different deployment scenarios (some monitoring runs on different ports) -- Disabling via `health_check_port = 0` keeps the main HTTPS `/health` endpoint - available for cases where a separate port isn't needed +- Disabling via `health_check_port = 0` removes the health check entirely — + the admin socket's `status` command remains available as an alternative + health/status mechanism - When this project is folded into alknet, the health check will use alknet's existing patterns, making the separate port unnecessary in that context @@ -61,10 +64,9 @@ HTTPS listener as a fallback. **Negative:** - Additional listener to manage (minimal complexity) -- Two health check endpoints exist when the separate port is enabled (the - local one and the HTTPS one) — monitoring should prefer the local one ## References - [operations.md](../operations.md) +- [ADR-022](022-health-check-scope.md) — Health check scope (no `/health` on main listener) - OQ-03 (now resolved) \ No newline at end of file diff --git a/docs/architecture/decisions/022-health-check-scope.md b/docs/architecture/decisions/022-health-check-scope.md new file mode 100644 index 0000000..a348ee4 --- /dev/null +++ b/docs/architecture/decisions/022-health-check-scope.md @@ -0,0 +1,56 @@ +# ADR-022: Health Check Scope — Local Port and Admin Socket Only + +## Status + +Accepted + +## Context + +The implementation served a `GET /health` route on the main HTTPS listener that +returned 200 OK regardless of the Host header. This route was evaluated before +host-based routing, meaning any upstream application using `/health` for its own +health checks would have those requests silently intercepted by the proxy and +never reach the upstream (implementation review finding W5). + +The architecture already specified a separate local health check port (9900, +bound to 127.0.0.1 only) via ADR-013. The question was whether to keep the +main-listener `/health` route alongside the dedicated port (and possibly make +the path configurable), or to remove it entirely. + +## Decision + +The main HTTPS listener does **not** serve a `/health` route. Health checking is +handled exclusively by: + +1. **Local health check port** (default: 9900, bound to `127.0.0.1`) — serves + `GET /health → 200 OK`. This is the primary health check mechanism for + container orchestration, load balancers, and monitoring systems. +2. **Admin socket** (`status` command) — returns process information including + uptime and site count. + +The `/health` route is removed from the main listener entirely. No configurable +path is needed because the route simply does not exist on the public listener. + +## Consequences + +**Positive:** +- No collision with upstream applications that use `/health` for their own + health checks +- The main listener's routing logic is simpler — all requests go through + host-based routing, no special cases +- Clear separation of concerns: the main listener proxies, the local port + answers health checks +- No configurable path needed — the problem disappears entirely + +**Negative:** +- External monitoring that needs to verify TLS is working must connect to the + HTTPS port directly and check for a successful TLS handshake or a 404 + response, rather than getting a 200 from `/health`. This is a minor + inconvenience — any successful TLS response (even 404) confirms the proxy is + serving TLS correctly. + +## References + +- ADR-013: Health check on separate local port +- OQ-08: Resolved by this ADR +- Implementation review finding W5 (hardcoded `/health` path) \ No newline at end of file diff --git a/docs/architecture/open-questions.md b/docs/architecture/open-questions.md index 6402e4f..831a55e 100644 --- a/docs/architecture/open-questions.md +++ b/docs/architecture/open-questions.md @@ -1,6 +1,6 @@ --- status: draft -last_updated: 2026-06-11 +last_updated: 2026-06-12 --- # Open Questions @@ -50,9 +50,10 @@ last_updated: 2026-06-11 - **Priority**: low - **Resolution**: Add a configurable local health check port (default: 9900) bound to `127.0.0.1` only. Health checks work even when TLS is misconfigured. - The main HTTPS `/health` endpoint remains available as a fallback. See - ADR-013. -- **Cross-references**: ADR-013 + There is no `/health` route on the main HTTPS listener — health checking is + handled exclusively by the local port and admin socket. See ADR-013 and + ADR-022. +- **Cross-references**: ADR-013, ADR-022 ## Configuration @@ -92,92 +93,79 @@ last_updated: 2026-06-11 that override global defaults when specified. - **Cross-references**: ADR-015, ADR-017 -### OQ-08: Should the `/health` path use a less common endpoint to avoid upstream collision? +### ~~OQ-08: Should the `/health` path use a less common endpoint to avoid upstream collision?~~ - **Origin**: Implementation review finding W5, [proxy.md](proxy.md) -- **Status**: open +- **Status**: resolved - **Priority**: medium -- **Resolution**: None yet. The proxy currently intercepts `GET /health` on all - hosts before host-based routing, which means any upstream application that - uses `/health` for its own health checks will have those requests silently - intercepted. Options: (1) Use a less common path like `/__health` or - `/healthz`; (2) Only intercept `/health` when the Host header doesn't match - any known site (fallthrough); (3) Make the health check path configurable - via `StaticConfig`. Option 1 is simplest for Phase 1. Option 3 is most - flexible long-term. The architecture spec (proxy.md, ADR-013) currently - specifies `/health` as a top-level route regardless of Host. -- **Cross-references**: ADR-013 +- **Resolution**: The `/health` route does not belong on the main listener at + all. Health checking is an operational concern served by the dedicated local + port (9900) and the admin socket's `status` command — not by intercepting + traffic on the public-facing proxy. Serving `/health` on the main listener + creates collision with upstream applications, requires special-case routing + logic before host-based matching, and is architecturally wrong: the main + listener's job is to proxy requests, not to serve operational endpoints. The + local health check port (bound to `127.0.0.1:9900`) and the admin socket are + the sole health/status mechanisms. See ADR-022. +- **Cross-references**: ADR-013, ADR-022 -### OQ-09: How should `upstream_connect_timeout_secs` be enforced? +### ~~OQ-09: How should `upstream_connect_timeout_secs` be enforced?~~ - **Origin**: Implementation review finding W4, ADR-015, ADR-017 -- **Status**: open +- **Status**: resolved - **Priority**: medium -- **Resolution**: None yet. The architecture (ADR-015, ADR-017) specifies a - 5-second default connect timeout separate from the request timeout, and - `SiteConfig` includes `upstream_connect_timeout_secs`. However, the - implementation only applies `upstream_request_timeout_secs` as a blanket - timeout covering the entire exchange. The hyper client handles TCP connect - internally, making a two-phase timeout harder to implement without custom - connect logic. Need to decide: (1) implement a two-phase timeout using - `tokio::time::timeout` for connect phase then request phase; (2) configure - the hyper client's `connect_timeout` parameter; or (3) accept the current - behavior for Phase 1 and add connect timeout enforcement in Phase 2. +- **Resolution**: This is an implementation gap, not an architectural unknown. + The architecture already specifies a 5-second default connect timeout + separate from the request timeout (ADR-015, ADR-017), and `SiteConfig` + already includes `upstream_connect_timeout_secs`. The implementation must + wire this field to hyper's `connect_timeout` parameter. If hyper's API + doesn't expose a separate connect timeout, a two-phase `tokio::time::timeout` + approach should be used for Phase 2. For Phase 1, the connect timeout field + exists in config but is not enforced — this is a documented known gap. No ADR + needed; the decision was already made in ADR-015. - **Cross-references**: ADR-015, ADR-017 -## Configuration - -### OQ-10: Should ACME contact email be a required config field? +### ~~OQ-10: Should ACME contact email be a required config field?~~ - **Origin**: Implementation review finding C2, [tls.md](tls.md), [config.md](config.md) -- **Status**: open +- **Status**: resolved - **Priority**: high -- **Resolution**: None yet. Let's Encrypt requires a contact email for production - certificate requests. The current architecture spec does not include an - `acme_contact` field in `TlsConfig` or `ListenerConfig`. Without it, ACME - registration with Let's Encrypt production will fail. Options: (1) Add a - required `acme_contact` field to the TLS config within each `[[listeners]]` - entry that uses ACME mode; (2) Add a global `acme_contact` field shared - across all ACME listeners. Per-listener is more flexible but adds config - noise. Global is simpler for typical deployments. Need to update config.md - and tls.md. +- **Resolution**: This is not an open question — the architecture already + specifies `acme_contact` as a required field in ACME mode (config.md + validation rule 19). The field is defined in the `ListenerConfig` table and + shown in TOML examples. Let's Encrypt requires a contact email for production + certificate requests. The implementation bug (C2: `contact: vec![]`) must be + fixed to use the configured `acme_contact` value. No new ADR needed — the + decision is already documented in config.md and tls.md. - **Cross-references**: ADR-004 -### OQ-11: How should `X-Forwarded-Proto` be derived per-listener? +### ~~OQ-11: How should `X-Forwarded-Proto` be derived per-listener?~~ - **Origin**: Implementation review finding W14, [proxy.md](proxy.md) -- **Status**: open +- **Status**: resolved - **Priority**: medium -- **Resolution**: None yet. The architecture spec (proxy.md) states - `X-Forwarded-Proto` should be "determined by which listener port received the - request" — `https` for requests on the listener's `https_port`, `http` for - requests on the listener's `http_port`. The implementation hardcodes - `is_https: true` in `ProxyState`. For a TLS-terminating reverse proxy this - is correct (all TLS connections arrive on the HTTPS port), but the HTTP - redirect listener should set `X-Forwarded-Proto: https` since it redirects to - HTTPS. Need to clarify: (1) The HTTPS listener always sets `X-Forwarded-Proto: - https` (correct, since it terminates TLS); (2) The HTTP redirect listener - sends a 301 redirect and does NOT proxy, so `X-Forwarded-Proto` on the - redirect response is not applicable. The hardcoded behavior is correct but - should be documented. +- **Resolution**: The hardcoded `is_https: true` behavior is correct for a + TLS-terminating reverse proxy. The proxy only proxies requests on the HTTPS + listener, which always sets `X-Forwarded-Proto: https`. The HTTP redirect + listener sends a 301 redirect and does NOT proxy requests, so + `X-Forwarded-Proto` is not set there. This behavior is correct and matches + the architecture spec (proxy.md). The implementation should add a comment + documenting this rationale to prevent future "fixes" that would change the + behavior. No ADR or spec change needed — just a code comment. - **Cross-references**: ADR-021 ## Operations -### OQ-12: Should request access logging be mandatory or optional? +### ~~OQ-12: Should request access logging be mandatory or optional?~~ - **Origin**: Implementation review finding W13, [operations.md](operations.md) -- **Status**: open +- **Status**: resolved - **Priority**: high -- **Resolution**: None yet. The architecture spec (operations.md) defines an - access log format (`REQUEST client_ip=... host=... method=... path=... - status=... upstream=... duration_ms=...`) and a `log_request!` macro, but - the implementation does not emit access logs. Without request-level logging, - the proxy is operationally blind — there is no observability into traffic, - response codes, or upstream latency. This also blocks fail2ban integration - for access-log-based jails. The question is whether to: (1) Make access - logging mandatory (always-on at `info` level); (2) Make it configurable - (e.g., `access_log` boolean in `LoggingConfig`); or (3) Tie it to the - existing `log_file_path` setting. The architecture spec implies it's always - on. +- **Resolution**: Access logging is mandatory and always-on at `info` level. + The architecture spec (operations.md) already states: "Access logging is + **always-on** — it is the primary observability mechanism for the proxy and + is required for fail2ban integration. There is no configuration option to + disable access logging." The `log_request!` macro exists in the codebase + but is not called — this is an implementation gap (W13), not an + architectural question. No ADR needed; ADR-007 already covers the log format. - **Cross-references**: ADR-007 \ No newline at end of file diff --git a/docs/architecture/operations.md b/docs/architecture/operations.md index b6f33d7..a68101c 100644 --- a/docs/architecture/operations.md +++ b/docs/architecture/operations.md @@ -1,6 +1,6 @@ --- status: draft -last_updated: 2026-06-11 +last_updated: 2026-06-12 --- # Operations @@ -109,8 +109,7 @@ log entries: 1. **Access logs**: Every proxied request is logged at `info` level with structured fields. Access logging is **always-on** — it is the primary observability mechanism for the proxy and is required for fail2ban - integration. There is no configuration option to disable access logging - (see OQ-12). + integration. There is no configuration option to disable access logging. ``` REQUEST client_ip=203.0.113.50 host=git.alk.dev method=GET path=/user/repo status=200 upstream=127.0.0.1:3000 duration_ms=45 @@ -172,34 +171,37 @@ Configurable via `log_level` in StaticConfig. ### Local Health Check Port -The primary health check endpoint is served on a separate local port (default: -9900), bound to `127.0.0.1` only. This ensures health checks work even when TLS -is misconfigured. See ADR-013 for the rationale. +The health check endpoint is served on a separate local port (default: 9900), +bound to `127.0.0.1` only. It is not served on the main HTTPS listener — +health checking is an operational concern that does not belong on the +public-facing proxy. See ADR-013 and ADR-022. ``` GET http://127.0.0.1:9900/health → 200 OK (empty body) ``` The port is configurable via `health_check_port` in StaticConfig. Setting it -to `0` disables the separate health check listener. +to `0` disables the health check listener entirely. -### HTTPS Health Check (Fallback) +The admin socket's `status` command provides an additional health/status +mechanism that returns process information: -When the local health check port is enabled, `/health` is also available on the -main HTTPS listener for cases where TLS-level health verification is desired. -External monitoring should prefer the local health check for liveness checks -and can use the HTTPS endpoint for TLS verification. +``` +{"status": "ok", "uptime_secs": 1234, "sites": 2} +``` ### What It Checks - Process is running and the tokio runtime is responsive -- TLS listener is accepting connections (HTTPS endpoint only) - Config is loaded (StaticConfig and DynamicConfig are initialized) It does **not** check upstream reachability. The health check answers "is the proxy process healthy?", not "is the upstream reachable?" — upstream health is a separate concern that would produce 502/504 responses in the proxy handler. +It also does **not** verify TLS configuration — that is the responsibility of +external monitoring tools that connect to the public HTTPS port directly. + ### Future Extensions - `/health/ready` — readiness check that includes upstream reachability @@ -511,6 +513,7 @@ HEALTHCHECK --interval=30s --timeout=5s --retries=3 \ ``` No port publishing is needed — the health check runs inside the container. +There is no `/health` route on the main HTTPS listener. ### SSH Traffic @@ -580,8 +583,14 @@ All design decisions are documented as ADRs in [decisions/](decisions/). ## Open Questions -Open questions are tracked in [open-questions.md](open-questions.md). Key -questions affecting this document: +Open questions are tracked in [open-questions.md](open-questions.md). All +questions affecting this document have been resolved: - ~~**OQ-03**: Should the health check endpoint be on a separate port?~~ (resolved - — ADR-013: separate local port, default 9900, localhost only) \ No newline at end of file + — ADR-013: separate local port, default 9900, localhost only) +- ~~**OQ-08**: Should `/health` use a less common path?~~ (resolved — ADR-022: + no `/health` route on the main listener at all; health checking is via port + 9900 and admin socket only) +- ~~**OQ-12**: Should request access logging be mandatory or optional?~~ (resolved + — access logging is mandatory and always-on at `info` level; no configuration + option to disable it) \ No newline at end of file diff --git a/docs/architecture/overview.md b/docs/architecture/overview.md index 70553fa..7d0d50a 100644 --- a/docs/architecture/overview.md +++ b/docs/architecture/overview.md @@ -1,6 +1,6 @@ --- status: draft -last_updated: 2026-06-11 +last_updated: 2026-06-12 --- # Overview @@ -86,34 +86,32 @@ details. config.toml ───────► │ StaticConfig + DynamicConfig │ (volume mount) │ (ArcSwap for hot-reload) │ │ │ - │ ┌─ Listener 1 ─────────────────┐ │ - bind_addr:80 ────► │ │ HTTP → 301 redirect │ │ - (published) │ └────────────────────────────────┘ │ - │ │ - bind_addr:443 ────► │ │ TLS listener (tokio-rustls) │ │ - (published) │ │ ├─ ACME or Manual TLS config │ │ - │ │ └─ axum router (per-listener) │ │ - │ │ ├─ /health → 200 OK (any) │ │ - │ │ ├─ Host → global site lookup │ │ - │ │ ├─ git.alk.dev → gitea:3000 │ │ - │ │ └─ Rate limiting, headers │ │ - │ └────────────────────────────────┘ │ - │ │ - │ ┌─ Listener N ─────────────────┐ │ - bind_addr_N:80 ───► │ │ HTTP → 301 redirect │ │ - │ └────────────────────────────────┘ │ - │ │ - bind_addr_N:443 ───► │ │ TLS listener (tokio-rustls) │ │ - │ │ ├─ Manual TLS cert │ │ - │ │ └─ axum router (per-listener) │ │ - │ │ ├─ /health → 200 OK (any) │ │ - │ │ ├─ Host → global site lookup │ │ - │ │ ├─ alk.dev → app:8080 │ │ - │ │ └─ Rate limiting, headers │ │ - │ └────────────────────────────────┘ │ - │ │ - │ /health → 200 OK (port 9900) │ - │ Admin socket (Unix domain) │ + │ ┌─ Listener 1 ─────────────────┐ │ + bind_addr:80 ────► │ │ HTTP → 301 redirect │ │ + (published) │ └────────────────────────────────┘ │ + │ │ + bind_addr:443 ────► │ │ TLS listener (tokio-rustls) │ │ + (published) │ │ ├─ ACME or Manual TLS config │ │ + │ │ └─ axum router (per-listener) │ │ + │ │ ├─ Host → global site lookup │ │ + │ │ ├─ git.alk.dev → gitea:3000 │ │ + │ │ └─ Rate limiting, headers │ │ + │ └────────────────────────────────┘ │ + │ │ + │ ┌─ Listener N ─────────────────┐ │ + bind_addr_N:80 ───► │ │ HTTP → 301 redirect │ │ + │ └────────────────────────────────┘ │ + │ │ + bind_addr_N:443 ───► │ │ TLS listener (tokio-rustls) │ │ + │ │ ├─ Manual TLS cert │ │ + │ │ └─ axum router (per-listener) │ │ + │ │ ├─ Host → global site lookup │ │ + │ │ ├─ alk.dev → app:8080 │ │ + │ │ └─ Rate limiting, headers │ │ + │ └────────────────────────────────┘ │ + │ │ + │ /health → 200 OK (port 9900) │ + │ Admin socket (Unix domain) │ └────────────────────────────────────┘ │ │ ┌──────┘ └──────┐ @@ -211,9 +209,11 @@ All design decisions are documented as ADRs in [decisions/](decisions/). ## Open Questions -Open questions are tracked in [open-questions.md](open-questions.md). Key -questions affecting this document: +Open questions are tracked in [open-questions.md](open-questions.md). All +questions affecting this document have been resolved: - ~~**OQ-01**: Should cipher suites be restricted beyond rustls defaults?~~ (resolved — ADR-012) - ~~**OQ-03**: Should the health check endpoint be on a separate port?~~ (resolved — ADR-013) -- ~~**OQ-07**: Should per-site TLS overrides be supported for mixed ACME/manual domains?~~ (resolved — ADR-019: `[[listeners]]` with per-listener TLS config) \ No newline at end of file +- ~~**OQ-05**: Should the proxy bind to multiple addresses?~~ (resolved — single `bind_addr` per listener) +- ~~**OQ-07**: Should per-site TLS overrides be supported for mixed ACME/manual domains?~~ (resolved — ADR-019: `[[listeners]]` with per-listener TLS config) +- ~~**OQ-08**: Should `/health` use a less common path?~~ (resolved — ADR-022: no `/health` route on main listener; health check is port 9900/admin socket only) \ No newline at end of file diff --git a/docs/architecture/proxy.md b/docs/architecture/proxy.md index f5f743b..c76dd13 100644 --- a/docs/architecture/proxy.md +++ b/docs/architecture/proxy.md @@ -1,6 +1,6 @@ --- status: draft -last_updated: 2026-06-11 +last_updated: 2026-06-12 --- # Proxy Handler @@ -26,7 +26,7 @@ Incoming HTTPS request ▼ ┌─────────────────┐ │ axum Router │ -│ (Host-based) │─── /health → 200 OK +│ (Host-based) │ │ │ │ match Host │ │ header on │ @@ -91,15 +91,11 @@ matching. Site `host` values must not include ports. The proxy does not filter or restrict paths. All paths and query strings on a known host are forwarded to the upstream without modification. -The `/health` path is a special case: it matches regardless of the `Host` -header and is evaluated before host-based routing. A `GET /health` request on -any hostname returns `200 OK` with an empty body. - -**Note**: This means any upstream application that uses `/health` for its own -health checks will have those requests silently intercepted by the proxy and -will never reach the upstream. If this is a concern, the health check path -should be changed to a less common path (e.g., `/__health` or `/healthz`) or -made configurable. See OQ-08. +The proxy does **not** serve a `/health` route on the main listener. Health +checking is an operational concern handled by the dedicated local health check +port (default: 9900, bound to `127.0.0.1` only) and the admin socket's `status` +command — not by intercepting traffic on the public-facing proxy. See ADR-013 +and ADR-022. ### 2. Proxy Header Injection @@ -260,8 +256,11 @@ All design decisions are documented as ADRs in [decisions/](decisions/). ## Open Questions -Open questions are tracked in [open-questions.md](open-questions.md). Key -questions affecting this document: +Open questions are tracked in [open-questions.md](open-questions.md). All +questions affecting this document have been resolved: - ~~**OQ-06**: Should upstream timeouts be configurable per-site?~~ (resolved — - ADR-015: per-site timeout overrides with defaults) \ No newline at end of file + ADR-015: per-site timeout overrides with defaults) +- ~~**OQ-08**: Should the `/health` path use a less common endpoint to avoid + upstream collision?~~ (resolved — ADR-022: no `/health` route on the main + listener; health checking is via port 9900 and admin socket only) \ No newline at end of file