Resolve all open questions, remove /health from main listener (ADR-022)
Resolve OQ-08 through OQ-12 after reviewing implementation findings:

- OQ-08: Remove /health route from the main HTTPS listener entirely. Health checking belongs on port 9900 and admin socket only, not on the public-facing proxy. This eliminates upstream collision problems and special-case routing logic. (ADR-022)
- OQ-09: Not an architectural unknown: ADR-015 already decided on a separate connect timeout. The implementation gap is a known issue.
- OQ-10: Not an open question: acme_contact is already specified as required in config.md. The empty contact list is bug C2.
- OQ-11: Hardcoded is_https=true is correct for a TLS-terminating proxy. HTTP listener redirects, doesn't proxy. Just needs a comment.
- OQ-12: Access logging is already specified as mandatory/always-on in operations.md. Missing log_request! calls are bug W13.

Updated docs: proxy.md, operations.md, overview.md, config.md, open-questions.md, README.md, ADR-013. Created ADR-022.
@@ -1,6 +1,6 @@
|
|||||||
---
|
---
|
||||||
status: draft
|
status: draft
|
||||||
last_updated: 2026-06-11
|
last_updated: 2026-06-12
|
||||||
---
|
---
|
||||||
|
|
||||||
# Reverse Proxy — Architecture
|
# Reverse Proxy — Architecture
|
||||||
@@ -53,6 +53,7 @@ certificate via ACME.
|
|||||||
| [019](decisions/019-multi-config-listeners.md) | Multi-Config Listener Support | Accepted |
|
| [019](decisions/019-multi-config-listeners.md) | Multi-Config Listener Support | Accepted |
|
||||||
| [020](decisions/020-container-deployment.md) | Container Deployment Model | Accepted |
|
| [020](decisions/020-container-deployment.md) | Container Deployment Model | Accepted |
|
||||||
| [021](decisions/021-x-forwarded-for-edge-proxy.md) | X-Forwarded-For Edge Proxy Model | Accepted |
|
| [021](decisions/021-x-forwarded-for-edge-proxy.md) | X-Forwarded-For Edge Proxy Model | Accepted |
|
||||||
|
| [022](decisions/022-health-check-scope.md) | Health Check Scope — Local Port and Admin Socket Only | Accepted |
|
||||||
|
|
||||||
## Open Questions
|
## Open Questions
|
||||||
|
|
||||||
@@ -67,11 +68,11 @@ See [open-questions.md](open-questions.md) for the full tracker.
|
|||||||
| ~~OQ-05~~ | ~~Should the proxy bind to multiple addresses?~~ | ~~low~~ | **resolved** (single bind_addr sufficient) |
|
| ~~OQ-05~~ | ~~Should the proxy bind to multiple addresses?~~ | ~~low~~ | **resolved** (single bind_addr sufficient) |
|
||||||
| ~~OQ-06~~ | ~~Should upstream timeouts be configurable per-site?~~ | ~~low~~ | **resolved** (ADR-015) |
|
| ~~OQ-06~~ | ~~Should upstream timeouts be configurable per-site?~~ | ~~low~~ | **resolved** (ADR-015) |
|
||||||
| ~~OQ-07~~ | ~~Should per-site TLS overrides be supported for mixed ACME/manual domains?~~ | ~~low~~ | **resolved** (ADR-019) |
|
| ~~OQ-07~~ | ~~Should per-site TLS overrides be supported for mixed ACME/manual domains?~~ | ~~low~~ | **resolved** (ADR-019) |
|
||||||
| OQ-08 | Should the `/health` path use a less common endpoint to avoid upstream collision? | medium | open |
|
| ~~OQ-08~~ | ~~Should `/health` use a less common path to avoid upstream collision?~~ | ~~medium~~ | **resolved** (ADR-022: no `/health` route on main listener) |
|
||||||
| OQ-09 | How should `upstream_connect_timeout_secs` be enforced? | medium | open |
|
| ~~OQ-09~~ | ~~How should `upstream_connect_timeout_secs` be enforced?~~ | ~~medium~~ | **resolved** (implementation gap — ADR-015 already decides this) |
|
||||||
| OQ-10 | Should ACME contact email be a required config field? | high | open |
|
| ~~OQ-10~~ | ~~Should ACME contact email be a required config field?~~ | ~~high~~ | **resolved** (already specified in config.md; implementation bug C2) |
|
||||||
| OQ-11 | How should `X-Forwarded-Proto` be derived per-listener? | medium | open |
|
| ~~OQ-11~~ | ~~How should `X-Forwarded-Proto` be derived per-listener?~~ | ~~medium~~ | **resolved** (hardcoded `https` is correct for TLS-terminating proxy) |
|
||||||
| OQ-12 | Should request access logging be mandatory or optional? | high | open |
|
| ~~OQ-12~~ | ~~Should request access logging be mandatory or optional?~~ | ~~high~~ | **resolved** (mandatory, always-on per operations.md) |
|
||||||
|
|
||||||
## Document Lifecycle
|
## Document Lifecycle
|
||||||
|
|
||||||
|
|||||||
@@ -87,7 +87,7 @@ Immutable after startup. Changes require a process restart.
|
|||||||
|-------|------|-------------|
|
|-------|------|-------------|
|
||||||
| `listeners` | `Vec<ListenerConfig>` | Independent TLS endpoints, each with its own bind address and TLS config (see ADR-019) |
|
| `listeners` | `Vec<ListenerConfig>` | Independent TLS endpoints, each with its own bind address and TLS config (see ADR-019) |
|
||||||
| `allow_wildcard_bind` | `bool` | Allow `0.0.0.0` as a bind address. Required for container deployments. Default: `false` (see ADR-016, ADR-020) |
|
| `allow_wildcard_bind` | `bool` | Allow `0.0.0.0` as a bind address. Required for container deployments. Default: `false` (see ADR-016, ADR-020) |
|
||||||
| `health_check_port` | `u16` | Port for local health check endpoint (default: `9900`; set to `0` to disable; see ADR-013) |
|
| `health_check_port` | `u16` | Port for local health check endpoint (default: `9900`; set to `0` to disable; bound to `127.0.0.1` only; see ADR-013, ADR-022) |
|
||||||
| `admin_socket_path` | `String` | Unix domain socket path for admin API (default: `/run/reverse-proxy/admin.sock`; empty string to disable; see ADR-014) |
|
| `admin_socket_path` | `String` | Unix domain socket path for admin API (default: `/run/reverse-proxy/admin.sock`; empty string to disable; see ADR-014) |
|
||||||
| `shutdown_timeout_secs` | `u64` | Maximum seconds to wait for in-flight requests during graceful shutdown (default: `30`) |
|
| `shutdown_timeout_secs` | `u64` | Maximum seconds to wait for in-flight requests during graceful shutdown (default: `30`) |
|
||||||
| `logging` | `LoggingConfig` | Logging configuration (see below) |
|
| `logging` | `LoggingConfig` | Logging configuration (see below) |
|
||||||
|
|||||||
@@ -7,21 +7,23 @@ Accepted
|
|||||||
## Context
|
## Context
|
||||||
|
|
||||||
The health check endpoint (`/health`) needs to be accessible for monitoring
|
The health check endpoint (`/health`) needs to be accessible for monitoring
|
||||||
without requiring TLS. Currently the design places it on the main HTTPS
|
without requiring TLS. Serving it on the main HTTPS listener would mean:
|
||||||
listener, which means:
|
|
||||||
|
|
||||||
1. TLS handshake must succeed for the health check to respond
|
1. TLS handshake must succeed for the health check to respond
|
||||||
2. External monitoring tools need to handle TLS
|
2. External monitoring tools need to handle TLS
|
||||||
3. A TLS configuration error would make the health check unreachable, creating
|
3. A TLS configuration error would make the health check unreachable, creating
|
||||||
a false-negative monitoring signal
|
a false-negative monitoring signal
|
||||||
|
4. It creates collision with upstream applications that use `/health` for their
|
||||||
|
own health checks (see ADR-022)
|
||||||
|
|
||||||
Three options were considered (see OQ-03):
|
Three options were considered (see OQ-03):
|
||||||
|
|
||||||
1. **Main HTTPS listener only**: Simplest, but TLS config errors make health
|
1. **Separate unencrypted port on localhost (chosen)**: Simple, works with
|
||||||
checks unreachable
|
standard monitoring tools, health checks work even when TLS is misconfigured
|
||||||
2. **Separate unencrypted port on localhost**: Simple, works with standard
|
2. **Main HTTPS listener only**: Would require TLS for health checks, creating
|
||||||
monitoring tools, but health checks bypass TLS
|
a circular dependency — TLS config errors would make health checks unreachable
|
||||||
3. **Admin port with its own listener**: Most flexible but adds complexity
|
3. **Admin port with its own listener**: Most flexible but adds complexity
|
||||||
|
beyond what's needed for a simple health check
|
||||||
|
|
||||||
## Decision
|
## Decision
|
||||||
|
|
||||||
@@ -31,8 +33,8 @@ HTTP and HTTPS listeners.
|
|||||||
|
|
||||||
The port is configurable via `health_check_port` in StaticConfig. The default
|
The port is configurable via `health_check_port` in StaticConfig. The default
|
||||||
value is `9900` (enabled, localhost only). Setting it to `0` disables the
|
value is `9900` (enabled, localhost only). Setting it to `0` disables the
|
||||||
separate health check listener, and `/health` remains available on the main
|
health check listener entirely — there is no `/health` route on the main HTTPS
|
||||||
HTTPS listener as a fallback.
|
listener (see ADR-022).
|
||||||
|
|
||||||
## Rationale
|
## Rationale
|
||||||
|
|
||||||
@@ -45,8 +47,9 @@ HTTPS listener as a fallback.
|
|||||||
the same host) can reach it
|
the same host) can reach it
|
||||||
- Configurable port allows different deployment scenarios (some monitoring runs
|
- Configurable port allows different deployment scenarios (some monitoring runs
|
||||||
on different ports)
|
on different ports)
|
||||||
- Disabling via `health_check_port = 0` keeps the main HTTPS `/health` endpoint
|
- Disabling via `health_check_port = 0` removes the health check entirely —
|
||||||
available for cases where a separate port isn't needed
|
the admin socket's `status` command remains available as an alternative
|
||||||
|
health/status mechanism
|
||||||
- When this project is folded into alknet, the health check will use alknet's
|
- When this project is folded into alknet, the health check will use alknet's
|
||||||
existing patterns, making the separate port unnecessary in that context
|
existing patterns, making the separate port unnecessary in that context
|
||||||
|
|
||||||
@@ -61,10 +64,9 @@ HTTPS listener as a fallback.
|
|||||||
|
|
||||||
**Negative:**
|
**Negative:**
|
||||||
- Additional listener to manage (minimal complexity)
|
- Additional listener to manage (minimal complexity)
|
||||||
- Two health check endpoints exist when the separate port is enabled (the
|
|
||||||
local one and the HTTPS one) — monitoring should prefer the local one
|
|
||||||
|
|
||||||
## References
|
## References
|
||||||
|
|
||||||
- [operations.md](../operations.md)
|
- [operations.md](../operations.md)
|
||||||
|
- [ADR-022](022-health-check-scope.md) — Health check scope (no `/health` on main listener)
|
||||||
- OQ-03 (now resolved)
|
- OQ-03 (now resolved)
|
||||||
56
docs/architecture/decisions/022-health-check-scope.md
Normal file
@@ -0,0 +1,56 @@
|
|||||||
|
# ADR-022: Health Check Scope — Local Port and Admin Socket Only
|
||||||
|
|
||||||
|
## Status
|
||||||
|
|
||||||
|
Accepted
|
||||||
|
|
||||||
|
## Context
|
||||||
|
|
||||||
|
The implementation served a `GET /health` route on the main HTTPS listener that
|
||||||
|
returned 200 OK regardless of the Host header. This route was evaluated before
|
||||||
|
host-based routing, meaning any upstream application using `/health` for its own
|
||||||
|
health checks would have those requests silently intercepted by the proxy and
|
||||||
|
never reach the upstream (implementation review finding W5).
|
||||||
|
|
||||||
|
The architecture already specified a separate local health check port (9900,
|
||||||
|
bound to 127.0.0.1 only) via ADR-013. The question was whether to keep the
|
||||||
|
main-listener `/health` route alongside the dedicated port (and possibly make
|
||||||
|
the path configurable), or to remove it entirely.
|
||||||
|
|
||||||
|
## Decision
|
||||||
|
|
||||||
|
The main HTTPS listener does **not** serve a `/health` route. Health checking is
|
||||||
|
handled exclusively by:
|
||||||
|
|
||||||
|
1. **Local health check port** (default: 9900, bound to `127.0.0.1`) — serves
|
||||||
|
`GET /health → 200 OK`. This is the primary health check mechanism for
|
||||||
|
container orchestration, load balancers, and monitoring systems.
|
||||||
|
2. **Admin socket** (`status` command) — returns process information including
|
||||||
|
uptime and site count.
|
||||||
|
|
||||||
|
The `/health` route is removed from the main listener entirely. No configurable
|
||||||
|
path is needed because the route simply does not exist on the public listener.
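The decision above can be illustrated with a minimal routing sketch. It is not the project's axum router: the `Route` enum, the `route` function, and the `sites` map are hypothetical names, and the `NotFound` outcome for an unknown host is an assumption. The point it shows is that the path plays no part in the decision, so `/health` reaches the upstream like any other request.

```rust
use std::collections::HashMap;

/// Outcome of routing one request on the main HTTPS listener (hypothetical type).
#[derive(Debug, PartialEq)]
enum Route {
    /// Forward to the upstream registered for the request's Host.
    Upstream(String),
    /// No site matches the Host header (assumed to map to a 404).
    NotFound,
}

/// Route purely by Host. The path is deliberately ignored, so `/health`
/// gets no special treatment on the public listener.
fn route(sites: &HashMap<String, String>, host: &str, _path: &str) -> Route {
    match sites.get(host) {
        Some(upstream) => Route::Upstream(upstream.clone()),
        None => Route::NotFound,
    }
}
```

With `git.alk.dev` mapped to `127.0.0.1:3000`, a request for `/health` on that host is routed to the upstream rather than answered by the proxy.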
|
||||||
|
|
||||||
|
## Consequences
|
||||||
|
|
||||||
|
**Positive:**
|
||||||
|
- No collision with upstream applications that use `/health` for their own
|
||||||
|
health checks
|
||||||
|
- The main listener's routing logic is simpler — all requests go through
|
||||||
|
host-based routing, no special cases
|
||||||
|
- Clear separation of concerns: the main listener proxies, the local port
|
||||||
|
answers health checks
|
||||||
|
- No configurable path needed — the problem disappears entirely
|
||||||
|
|
||||||
|
**Negative:**
|
||||||
|
- External monitoring that needs to verify TLS is working must connect to the
|
||||||
|
HTTPS port directly and check for a successful TLS handshake or a 404
|
||||||
|
response, rather than getting a 200 from `/health`. This is a minor
|
||||||
|
inconvenience — any successful TLS response (even 404) confirms the proxy is
|
||||||
|
serving TLS correctly.
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
- ADR-013: Health check on separate local port
|
||||||
|
- OQ-08: Resolved by this ADR
|
||||||
|
- Implementation review finding W5 (hardcoded `/health` path)
|
||||||
@@ -1,6 +1,6 @@
|
|||||||
---
|
---
|
||||||
status: draft
|
status: draft
|
||||||
last_updated: 2026-06-11
|
last_updated: 2026-06-12
|
||||||
---
|
---
|
||||||
|
|
||||||
# Open Questions
|
# Open Questions
|
||||||
@@ -50,9 +50,10 @@ last_updated: 2026-06-11
|
|||||||
- **Priority**: low
|
- **Priority**: low
|
||||||
- **Resolution**: Add a configurable local health check port (default: 9900)
|
- **Resolution**: Add a configurable local health check port (default: 9900)
|
||||||
bound to `127.0.0.1` only. Health checks work even when TLS is misconfigured.
|
bound to `127.0.0.1` only. Health checks work even when TLS is misconfigured.
|
||||||
The main HTTPS `/health` endpoint remains available as a fallback. See
|
There is no `/health` route on the main HTTPS listener — health checking is
|
||||||
ADR-013.
|
handled exclusively by the local port and admin socket. See ADR-013 and
|
||||||
- **Cross-references**: ADR-013
|
ADR-022.
|
||||||
|
- **Cross-references**: ADR-013, ADR-022
|
||||||
|
|
||||||
## Configuration
|
## Configuration
|
||||||
|
|
||||||
@@ -92,92 +93,79 @@ last_updated: 2026-06-11
|
|||||||
that override global defaults when specified.
|
that override global defaults when specified.
|
||||||
- **Cross-references**: ADR-015, ADR-017
|
- **Cross-references**: ADR-015, ADR-017
|
||||||
|
|
||||||
### OQ-08: Should the `/health` path use a less common endpoint to avoid upstream collision?
|
### ~~OQ-08: Should the `/health` path use a less common endpoint to avoid upstream collision?~~
|
||||||
|
|
||||||
- **Origin**: Implementation review finding W5, [proxy.md](proxy.md)
|
- **Origin**: Implementation review finding W5, [proxy.md](proxy.md)
|
||||||
- **Status**: open
|
- **Status**: resolved
|
||||||
- **Priority**: medium
|
- **Priority**: medium
|
||||||
- **Resolution**: None yet. The proxy currently intercepts `GET /health` on all
|
- **Resolution**: The `/health` route does not belong on the main listener at
|
||||||
hosts before host-based routing, which means any upstream application that
|
all. Health checking is an operational concern served by the dedicated local
|
||||||
uses `/health` for its own health checks will have those requests silently
|
port (9900) and the admin socket's `status` command — not by intercepting
|
||||||
intercepted. Options: (1) Use a less common path like `/__health` or
|
traffic on the public-facing proxy. Serving `/health` on the main listener
|
||||||
`/healthz`; (2) Only intercept `/health` when the Host header doesn't match
|
creates collision with upstream applications, requires special-case routing
|
||||||
any known site (fallthrough); (3) Make the health check path configurable
|
logic before host-based matching, and is architecturally wrong: the main
|
||||||
via `StaticConfig`. Option 1 is simplest for Phase 1. Option 3 is most
|
listener's job is to proxy requests, not to serve operational endpoints. The
|
||||||
flexible long-term. The architecture spec (proxy.md, ADR-013) currently
|
local health check port (bound to `127.0.0.1:9900`) and the admin socket are
|
||||||
specifies `/health` as a top-level route regardless of Host.
|
the sole health/status mechanisms. See ADR-022.
|
||||||
- **Cross-references**: ADR-013
|
- **Cross-references**: ADR-013, ADR-022
|
||||||
|
|
||||||
### OQ-09: How should `upstream_connect_timeout_secs` be enforced?
|
### ~~OQ-09: How should `upstream_connect_timeout_secs` be enforced?~~
|
||||||
|
|
||||||
- **Origin**: Implementation review finding W4, ADR-015, ADR-017
|
- **Origin**: Implementation review finding W4, ADR-015, ADR-017
|
||||||
- **Status**: open
|
- **Status**: resolved
|
||||||
- **Priority**: medium
|
- **Priority**: medium
|
||||||
- **Resolution**: None yet. The architecture (ADR-015, ADR-017) specifies a
|
- **Resolution**: This is an implementation gap, not an architectural unknown.
|
||||||
5-second default connect timeout separate from the request timeout, and
|
The architecture already specifies a 5-second default connect timeout
|
||||||
`SiteConfig` includes `upstream_connect_timeout_secs`. However, the
|
separate from the request timeout (ADR-015, ADR-017), and `SiteConfig`
|
||||||
implementation only applies `upstream_request_timeout_secs` as a blanket
|
already includes `upstream_connect_timeout_secs`. The implementation must
|
||||||
timeout covering the entire exchange. The hyper client handles TCP connect
|
wire this field to hyper's `connect_timeout` parameter. If hyper's API
|
||||||
internally, making a two-phase timeout harder to implement without custom
|
doesn't expose a separate connect timeout, a two-phase `tokio::time::timeout`
|
||||||
connect logic. Need to decide: (1) implement a two-phase timeout using
|
approach should be used for Phase 2. For Phase 1, the connect timeout field
|
||||||
`tokio::time::timeout` for connect phase then request phase; (2) configure
|
exists in config but is not enforced — this is a documented known gap. No ADR
|
||||||
the hyper client's `connect_timeout` parameter; or (3) accept the current
|
needed; the decision was already made in ADR-015.
|
||||||
behavior for Phase 1 and add connect timeout enforcement in Phase 2.
|
|
||||||
- **Cross-references**: ADR-015, ADR-017
|
- **Cross-references**: ADR-015, ADR-017
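As a rough illustration of what "a connect timeout separate from the request timeout" means, here is a standard-library sketch. It does not use hyper or tokio and is not the project's implementation; `connect_upstream` and its parameters are hypothetical. It also assumes per-read and per-write socket timeouts for the second phase, which is weaker than a single deadline over the whole exchange.

```rust
use std::io;
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Open an upstream connection with a connect-phase budget that is
/// separate from the budget applied to later reads and writes.
fn connect_upstream(
    addr: SocketAddr,
    connect_timeout: Duration,
    request_timeout: Duration,
) -> io::Result<TcpStream> {
    // Phase 1: bound only the TCP connect.
    let stream = TcpStream::connect_timeout(&addr, connect_timeout)?;
    // Phase 2: bound each subsequent read and write on the established stream.
    stream.set_read_timeout(Some(request_timeout))?;
    stream.set_write_timeout(Some(request_timeout))?;
    Ok(stream)
}
```

A slow or unreachable upstream then fails within `connect_timeout` (5 seconds by default per the resolution above) instead of consuming the full request budget.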
|
||||||
|
|
||||||
## Configuration
|
### ~~OQ-10: Should ACME contact email be a required config field?~~
|
||||||
|
|
||||||
### OQ-10: Should ACME contact email be a required config field?
|
|
||||||
|
|
||||||
- **Origin**: Implementation review finding C2, [tls.md](tls.md), [config.md](config.md)
|
- **Origin**: Implementation review finding C2, [tls.md](tls.md), [config.md](config.md)
|
||||||
- **Status**: open
|
- **Status**: resolved
|
||||||
- **Priority**: high
|
- **Priority**: high
|
||||||
- **Resolution**: None yet. Let's Encrypt requires a contact email for production
|
- **Resolution**: This is not an open question — the architecture already
|
||||||
certificate requests. The current architecture spec does not include an
|
specifies `acme_contact` as a required field in ACME mode (config.md
|
||||||
`acme_contact` field in `TlsConfig` or `ListenerConfig`. Without it, ACME
|
validation rule 19). The field is defined in the `ListenerConfig` table and
|
||||||
registration with Let's Encrypt production will fail. Options: (1) Add a
|
shown in TOML examples. Let's Encrypt requires a contact email for production
|
||||||
required `acme_contact` field to the TLS config within each `[[listeners]]`
|
certificate requests. The implementation bug (C2: `contact: vec![]`) must be
|
||||||
entry that uses ACME mode; (2) Add a global `acme_contact` field shared
|
fixed to use the configured `acme_contact` value. No new ADR needed — the
|
||||||
across all ACME listeners. Per-listener is more flexible but adds config
|
decision is already documented in config.md and tls.md.
|
||||||
noise. Global is simpler for typical deployments. Need to update config.md
|
|
||||||
and tls.md.
|
|
||||||
- **Cross-references**: ADR-004
|
- **Cross-references**: ADR-004
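The required-field rule could be checked at config load time along these lines. This is a sketch under assumptions: `ListenerTls` is a hypothetical, trimmed-down stand-in for the real config types, the `acme` flag and the `Option<String>` representation of `acme_contact` are guesses, and the error text is illustrative only.

```rust
/// Hypothetical, trimmed-down view of a listener's TLS settings.
struct ListenerTls {
    /// True when the listener obtains its certificate via ACME.
    acme: bool,
    /// Contact address passed to the ACME account registration.
    acme_contact: Option<String>,
}

/// Reject ACME listeners that have no usable contact address.
fn validate_acme_contact(tls: &ListenerTls) -> Result<(), String> {
    if !tls.acme {
        // Manual TLS listeners do not need a contact.
        return Ok(());
    }
    match tls.acme_contact.as_deref().map(str::trim) {
        Some(contact) if !contact.is_empty() => Ok(()),
        _ => Err("acme_contact is required when the listener uses ACME".to_string()),
    }
}
```

Failing at validation time surfaces the problem before any registration attempt, which is the behavior bug C2 (`contact: vec![]`) bypasses.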
|
||||||
|
|
||||||
### OQ-11: How should `X-Forwarded-Proto` be derived per-listener?
|
### ~~OQ-11: How should `X-Forwarded-Proto` be derived per-listener?~~
|
||||||
|
|
||||||
- **Origin**: Implementation review finding W14, [proxy.md](proxy.md)
|
- **Origin**: Implementation review finding W14, [proxy.md](proxy.md)
|
||||||
- **Status**: open
|
- **Status**: resolved
|
||||||
- **Priority**: medium
|
- **Priority**: medium
|
||||||
- **Resolution**: None yet. The architecture spec (proxy.md) states
|
- **Resolution**: The hardcoded `is_https: true` behavior is correct for a
|
||||||
`X-Forwarded-Proto` should be "determined by which listener port received the
|
TLS-terminating reverse proxy. The proxy only proxies requests on the HTTPS
|
||||||
request" — `https` for requests on the listener's `https_port`, `http` for
|
listener, which always sets `X-Forwarded-Proto: https`. The HTTP redirect
|
||||||
requests on the listener's `http_port`. The implementation hardcodes
|
listener sends a 301 redirect and does NOT proxy requests, so
|
||||||
`is_https: true` in `ProxyState`. For a TLS-terminating reverse proxy this
|
`X-Forwarded-Proto` is not set there. This behavior is correct and matches
|
||||||
is correct (all TLS connections arrive on the HTTPS port), but the HTTP
|
the architecture spec (proxy.md). The implementation should add a comment
|
||||||
redirect listener should set `X-Forwarded-Proto: https` since it redirects to
|
documenting this rationale to prevent future "fixes" that would change the
|
||||||
HTTPS. Need to clarify: (1) The HTTPS listener always sets `X-Forwarded-Proto:
|
behavior. No ADR or spec change needed — just a code comment.
|
||||||
https` (correct, since it terminates TLS); (2) The HTTP redirect listener
|
|
||||||
sends a 301 redirect and does NOT proxy, so `X-Forwarded-Proto` on the
|
|
||||||
redirect response is not applicable. The hardcoded behavior is correct but
|
|
||||||
should be documented.
|
|
||||||
- **Cross-references**: ADR-021
|
- **Cross-references**: ADR-021
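The reasoning in this resolution can be sketched as a small decision function. The names `ListenerKind`, `Action`, and `handle` are hypothetical and do not come from the codebase; the sketch only shows why a constant `https` is sufficient when the plain-HTTP listener never proxies.

```rust
/// What a listener does with an incoming request (hypothetical type).
#[derive(Debug, PartialEq)]
enum Action {
    /// Plain-HTTP listener: answer with a 301 to the HTTPS URL, never proxy.
    Redirect { location: String },
    /// HTTPS listener: proxy upstream with this X-Forwarded-Proto value.
    Proxy { x_forwarded_proto: &'static str },
}

/// Which side of a listener pair received the request.
enum ListenerKind {
    Http,
    Https,
}

fn handle(kind: ListenerKind, host: &str, path: &str) -> Action {
    match kind {
        // No upstream request is made here, so no forwarding header is needed.
        ListenerKind::Http => Action::Redirect {
            location: format!("https://{host}{path}"),
        },
        // TLS is terminated on this listener, so the scheme is always https.
        ListenerKind::Https => Action::Proxy {
            x_forwarded_proto: "https",
        },
    }
}
```

Only the `Https` arm ever produces a proxied request, so there is no code path on which `X-Forwarded-Proto: http` would be the right value.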
|
||||||
|
|
||||||
## Operations
|
## Operations
|
||||||
|
|
||||||
### OQ-12: Should request access logging be mandatory or optional?
|
### ~~OQ-12: Should request access logging be mandatory or optional?~~
|
||||||
|
|
||||||
- **Origin**: Implementation review finding W13, [operations.md](operations.md)
|
- **Origin**: Implementation review finding W13, [operations.md](operations.md)
|
||||||
- **Status**: open
|
- **Status**: resolved
|
||||||
- **Priority**: high
|
- **Priority**: high
|
||||||
- **Resolution**: None yet. The architecture spec (operations.md) defines an
|
- **Resolution**: Access logging is mandatory and always-on at `info` level.
|
||||||
access log format (`REQUEST client_ip=... host=... method=... path=...
|
The architecture spec (operations.md) already states: "Access logging is
|
||||||
status=... upstream=... duration_ms=...`) and a `log_request!` macro, but
|
**always-on** — it is the primary observability mechanism for the proxy and
|
||||||
the implementation does not emit access logs. Without request-level logging,
|
is required for fail2ban integration. There is no configuration option to
|
||||||
the proxy is operationally blind — there is no observability into traffic,
|
disable access logging." The `log_request!` macro exists in the codebase
|
||||||
response codes, or upstream latency. This also blocks fail2ban integration
|
but is not called — this is an implementation gap (W13), not an
|
||||||
for access-log-based jails. The question is whether to: (1) Make access
|
architectural question. No ADR needed; ADR-007 already covers the log format.
|
||||||
logging mandatory (always-on at `info` level); (2) Make it configurable
|
|
||||||
(e.g., `access_log` boolean in `LoggingConfig`); or (3) Tie it to the
|
|
||||||
existing `log_file_path` setting. The architecture spec implies it's always
|
|
||||||
on.
|
|
||||||
- **Cross-references**: ADR-007
|
- **Cross-references**: ADR-007
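For reference, the access log line format quoted from operations.md can be produced by a plain formatting function. This is a sketch, not the `log_request!` macro itself: `AccessRecord` and `format_access_log` are hypothetical names, and the field types are assumptions.

```rust
/// Fields of one access log entry (hypothetical struct; types are assumed).
struct AccessRecord<'a> {
    client_ip: &'a str,
    host: &'a str,
    method: &'a str,
    path: &'a str,
    status: u16,
    upstream: &'a str,
    duration_ms: u64,
}

/// Render the record in the `REQUEST key=value ...` layout from operations.md.
fn format_access_log(r: &AccessRecord) -> String {
    format!(
        "REQUEST client_ip={} host={} method={} path={} status={} upstream={} duration_ms={}",
        r.client_ip, r.host, r.method, r.path, r.status, r.upstream, r.duration_ms
    )
}
```

Because the log is always-on, a call like this would sit on every proxied request path, with no config flag guarding it.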
|
||||||
@@ -1,6 +1,6 @@
|
|||||||
---
|
---
|
||||||
status: draft
|
status: draft
|
||||||
last_updated: 2026-06-11
|
last_updated: 2026-06-12
|
||||||
---
|
---
|
||||||
|
|
||||||
# Operations
|
# Operations
|
||||||
@@ -109,8 +109,7 @@ log entries:
|
|||||||
1. **Access logs**: Every proxied request is logged at `info` level with
|
1. **Access logs**: Every proxied request is logged at `info` level with
|
||||||
structured fields. Access logging is **always-on** — it is the primary
|
structured fields. Access logging is **always-on** — it is the primary
|
||||||
observability mechanism for the proxy and is required for fail2ban
|
observability mechanism for the proxy and is required for fail2ban
|
||||||
integration. There is no configuration option to disable access logging
|
integration. There is no configuration option to disable access logging.
|
||||||
(see OQ-12).
|
|
||||||
|
|
||||||
```
|
```
|
||||||
REQUEST client_ip=203.0.113.50 host=git.alk.dev method=GET path=/user/repo status=200 upstream=127.0.0.1:3000 duration_ms=45
|
REQUEST client_ip=203.0.113.50 host=git.alk.dev method=GET path=/user/repo status=200 upstream=127.0.0.1:3000 duration_ms=45
|
||||||
@@ -172,34 +171,37 @@ Configurable via `log_level` in StaticConfig.
|
|||||||
|
|
||||||
### Local Health Check Port
|
### Local Health Check Port
|
||||||
|
|
||||||
The primary health check endpoint is served on a separate local port (default:
|
The health check endpoint is served on a separate local port (default: 9900),
|
||||||
9900), bound to `127.0.0.1` only. This ensures health checks work even when TLS
|
bound to `127.0.0.1` only. It is not served on the main HTTPS listener —
|
||||||
is misconfigured. See ADR-013 for the rationale.
|
health checking is an operational concern that does not belong on the
|
||||||
|
public-facing proxy. See ADR-013 and ADR-022.
|
||||||
|
|
||||||
```
|
```
|
||||||
GET http://127.0.0.1:9900/health → 200 OK (empty body)
|
GET http://127.0.0.1:9900/health → 200 OK (empty body)
|
||||||
```
|
```
|
||||||
|
|
||||||
The port is configurable via `health_check_port` in StaticConfig. Setting it
|
The port is configurable via `health_check_port` in StaticConfig. Setting it
|
||||||
to `0` disables the separate health check listener.
|
to `0` disables the health check listener entirely.
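The port semantics described here (loopback only, `0` means disabled) can be captured in a small helper. This is an illustrative sketch; `health_check_addr` is a hypothetical function name, not one taken from the codebase.

```rust
use std::net::{Ipv4Addr, SocketAddr};

/// Resolve the bind address for the local health check listener.
/// Port 0 means "disabled"; any other port binds to 127.0.0.1 only.
fn health_check_addr(health_check_port: u16) -> Option<SocketAddr> {
    if health_check_port == 0 {
        None
    } else {
        Some(SocketAddr::from((Ipv4Addr::LOCALHOST, health_check_port)))
    }
}
```

A caller would start the listener only when this returns `Some`, and there is no fallback to the main HTTPS listener when it returns `None`.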
|
||||||
|
|
||||||
### HTTPS Health Check (Fallback)
|
The admin socket's `status` command provides an additional health/status
|
||||||
|
mechanism that returns process information:
|
||||||
|
|
||||||
When the local health check port is enabled, `/health` is also available on the
|
```
|
||||||
main HTTPS listener for cases where TLS-level health verification is desired.
|
{"status": "ok", "uptime_secs": 1234, "sites": 2}
|
||||||
External monitoring should prefer the local health check for liveness checks
|
```
|
||||||
and can use the HTTPS endpoint for TLS verification.
|
|
||||||
|
|
||||||
### What It Checks
|
### What It Checks
|
||||||
|
|
||||||
- Process is running and the tokio runtime is responsive
|
- Process is running and the tokio runtime is responsive
|
||||||
- TLS listener is accepting connections (HTTPS endpoint only)
|
|
||||||
- Config is loaded (StaticConfig and DynamicConfig are initialized)
|
- Config is loaded (StaticConfig and DynamicConfig are initialized)
|
||||||
|
|
||||||
It does **not** check upstream reachability. The health check answers "is the
|
It does **not** check upstream reachability. The health check answers "is the
|
||||||
proxy process healthy?", not "is the upstream reachable?" — upstream health is
|
proxy process healthy?", not "is the upstream reachable?" — upstream health is
|
||||||
a separate concern that would produce 502/504 responses in the proxy handler.
|
a separate concern that would produce 502/504 responses in the proxy handler.
|
||||||
|
|
||||||
|
It also does **not** verify TLS configuration — that is the responsibility of
|
||||||
|
external monitoring tools that connect to the public HTTPS port directly.
|
||||||
|
|
||||||
### Future Extensions
|
### Future Extensions
|
||||||
|
|
||||||
- `/health/ready` — readiness check that includes upstream reachability
|
- `/health/ready` — readiness check that includes upstream reachability
|
||||||
@@ -511,6 +513,7 @@ HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
|
|||||||
```
|
```
|
||||||
|
|
||||||
No port publishing is needed — the health check runs inside the container.
|
No port publishing is needed — the health check runs inside the container.
|
||||||
|
There is no `/health` route on the main HTTPS listener.
|
||||||
|
|
||||||
### SSH Traffic
|
### SSH Traffic
|
||||||
|
|
||||||
@@ -580,8 +583,14 @@ All design decisions are documented as ADRs in [decisions/](decisions/).
|
|||||||
|
|
||||||
## Open Questions
|
## Open Questions
|
||||||
|
|
||||||
Open questions are tracked in [open-questions.md](open-questions.md). Key
|
Open questions are tracked in [open-questions.md](open-questions.md). All
|
||||||
questions affecting this document:
|
questions affecting this document have been resolved:
|
||||||
|
|
||||||
- ~~**OQ-03**: Should the health check endpoint be on a separate port?~~ (resolved
|
- ~~**OQ-03**: Should the health check endpoint be on a separate port?~~ (resolved
|
||||||
— ADR-013: separate local port, default 9900, localhost only)
|
— ADR-013: separate local port, default 9900, localhost only)
|
||||||
|
- ~~**OQ-08**: Should `/health` use a less common path?~~ (resolved — ADR-022:
|
||||||
|
no `/health` route on the main listener at all; health checking is via port
|
||||||
|
9900 and admin socket only)
|
||||||
|
- ~~**OQ-12**: Should request access logging be mandatory or optional?~~ (resolved
|
||||||
|
— access logging is mandatory and always-on at `info` level; no configuration
|
||||||
|
option to disable it)
|
||||||
@@ -1,6 +1,6 @@
|
|||||||
---
|
---
|
||||||
status: draft
|
status: draft
|
||||||
last_updated: 2026-06-11
|
last_updated: 2026-06-12
|
||||||
---
|
---
|
||||||
|
|
||||||
# Overview
|
# Overview
|
||||||
@@ -86,34 +86,32 @@ details.
 config.toml ───────► │ StaticConfig + DynamicConfig │
 (volume mount) │ (ArcSwap for hot-reload) │
 │ │
 │ ┌─ Listener 1 ─────────────────┐ │
 bind_addr:80 ────► │ │ HTTP → 301 redirect │ │
 (published) │ └────────────────────────────────┘ │
 │ │
 bind_addr:443 ────► │ │ TLS listener (tokio-rustls) │ │
 (published) │ │ ├─ ACME or Manual TLS config │ │
 │ │ └─ axum router (per-listener) │ │
-│ │ ├─ /health → 200 OK (any) │ │
 │ │ ├─ Host → global site lookup │ │
 │ │ ├─ git.alk.dev → gitea:3000 │ │
 │ │ └─ Rate limiting, headers │ │
 │ └────────────────────────────────┘ │
 │ │
 │ ┌─ Listener N ─────────────────┐ │
 bind_addr_N:80 ───► │ │ HTTP → 301 redirect │ │
 │ └────────────────────────────────┘ │
 │ │
 bind_addr_N:443 ───► │ │ TLS listener (tokio-rustls) │ │
 │ │ ├─ Manual TLS cert │ │
 │ │ └─ axum router (per-listener) │ │
-│ │ ├─ /health → 200 OK (any) │ │
 │ │ ├─ Host → global site lookup │ │
 │ │ ├─ alk.dev → app:8080 │ │
 │ │ └─ Rate limiting, headers │ │
 │ └────────────────────────────────┘ │
 │ │
 │ /health → 200 OK (port 9900) │
 │ Admin socket (Unix domain) │
 └────────────────────────────────────┘
 │ │
 ┌──────┘ └──────┐
@@ -211,9 +209,11 @@ All design decisions are documented as ADRs in [decisions/](decisions/).
 
 ## Open Questions
 
-Open questions are tracked in [open-questions.md](open-questions.md). Key
-questions affecting this document:
+Open questions are tracked in [open-questions.md](open-questions.md). All
+questions affecting this document have been resolved:
 
 - ~~**OQ-01**: Should cipher suites be restricted beyond rustls defaults?~~ (resolved — ADR-012)
 - ~~**OQ-03**: Should the health check endpoint be on a separate port?~~ (resolved — ADR-013)
+- ~~**OQ-05**: Should the proxy bind to multiple addresses?~~ (resolved — single `bind_addr` per listener)
 - ~~**OQ-07**: Should per-site TLS overrides be supported for mixed ACME/manual domains?~~ (resolved — ADR-019: `[[listeners]]` with per-listener TLS config)
+- ~~**OQ-08**: Should `/health` use a less common path?~~ (resolved — ADR-022: no `/health` route on main listener; health check is port 9900/admin socket only)
@@ -1,6 +1,6 @@
 ---
 status: draft
-last_updated: 2026-06-11
+last_updated: 2026-06-12
 ---
 
 # Proxy Handler
@@ -26,7 +26,7 @@ Incoming HTTPS request
 ▼
 ┌─────────────────┐
 │ axum Router │
-│ (Host-based) │─── /health → 200 OK
+│ (Host-based) │
 │ │
 │ match Host │
 │ header on │
@@ -91,15 +91,11 @@ matching. Site `host` values must not include ports.
 The proxy does not filter or restrict paths. All paths and query strings on a
 known host are forwarded to the upstream without modification.
 
-The `/health` path is a special case: it matches regardless of the `Host`
-header and is evaluated before host-based routing. A `GET /health` request on
-any hostname returns `200 OK` with an empty body.
-
-**Note**: This means any upstream application that uses `/health` for its own
-health checks will have those requests silently intercepted by the proxy and
-will never reach the upstream. If this is a concern, the health check path
-should be changed to a less common path (e.g., `/__health` or `/healthz`) or
-made configurable. See OQ-08.
+The proxy does **not** serve a `/health` route on the main listener. Health
+checking is an operational concern handled by the dedicated local health check
+port (default: 9900, bound to `127.0.0.1` only) and the admin socket's `status`
+command — not by intercepting traffic on the public-facing proxy. See ADR-013
+and ADR-022.
 
 ### 2. Proxy Header Injection
 
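The routing rule this hunk documents can be sketched in a few lines. This is a minimal illustration, not the proxy's actual code: `Route`, `route`, and the exact-match `HashMap` lookup are hypothetical names and choices made for the example, and the `git.alk.dev` to `gitea:3000` mapping is borrowed from the overview diagram. It shows that once `/health` is no longer special-cased, the path plays no part in the routing decision, so an upstream's own `/health` endpoint is reached like any other path.

```rust
use std::collections::HashMap;

/// Where the main HTTPS listener sends a request (hypothetical type).
#[derive(Debug, PartialEq)]
enum Route<'a> {
    /// Forward to this upstream address with path and query untouched.
    Upstream(&'a str),
    /// The Host header matched no configured site.
    UnknownHost,
}

/// Hypothetical router: the decision depends on `host` alone.
/// `_path` is accepted only to make clear that it is ignored, so
/// `/health` receives no special treatment on the main listener.
fn route<'a>(sites: &'a HashMap<String, String>, host: &str, _path: &str) -> Route<'a> {
    match sites.get(host) {
        Some(upstream) => Route::Upstream(upstream.as_str()),
        None => Route::UnknownHost,
    }
}
```

With a site table containing `git.alk.dev`, a request for `/health` on that host resolves to the same upstream as a request for `/`.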
@@ -260,8 +256,11 @@ All design decisions are documented as ADRs in [decisions/](decisions/).
 
 ## Open Questions
 
-Open questions are tracked in [open-questions.md](open-questions.md). Key
-questions affecting this document:
+Open questions are tracked in [open-questions.md](open-questions.md). All
+questions affecting this document have been resolved:
 
 - ~~**OQ-06**: Should upstream timeouts be configurable per-site?~~ (resolved —
 ADR-015: per-site timeout overrides with defaults)
+- ~~**OQ-08**: Should the `/health` path use a less common endpoint to avoid
+upstream collision?~~ (resolved — ADR-022: no `/health` route on the main
+listener; health checking is via port 9900 and admin socket only)
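For contrast with the main listener, here is a rough sketch of a loopback-only health endpoint of the kind ADR-013 and ADR-022 describe. It uses only the Rust standard library to stay self-contained; the real service is built on axum and tokio, so `bind_health`, `handle_health`, and the `404` for other paths are assumptions made for illustration. Only the default port (9900), the `127.0.0.1` binding, and the `200 OK` response come from the documents in this commit.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::net::{TcpListener, TcpStream};

/// Bind the health listener to loopback only (hypothetical helper).
/// The documented default port is 9900; passing 0 picks a free port.
fn bind_health(port: u16) -> io::Result<TcpListener> {
    TcpListener::bind(("127.0.0.1", port))
}

/// Answer a single HTTP request on the health port (hypothetical helper).
fn handle_health(stream: TcpStream) -> io::Result<()> {
    let mut reader = BufReader::new(&stream);

    // Request line, e.g. "GET /health HTTP/1.1".
    let mut request_line = String::new();
    reader.read_line(&mut request_line)?;

    // Drain the headers up to the blank line before replying.
    let mut header = String::new();
    loop {
        header.clear();
        let n = reader.read_line(&mut header)?;
        if n == 0 || header == "\r\n" || header == "\n" {
            break;
        }
    }

    let mut parts = request_line.split_whitespace();
    let status = match (parts.next(), parts.next()) {
        (Some("GET"), Some("/health")) => "200 OK",
        // Assumption: anything else is simply not found.
        _ => "404 Not Found",
    };
    let response = format!(
        "HTTP/1.1 {}\r\nContent-Length: 0\r\nConnection: close\r\n\r\n",
        status
    );
    (&stream).write_all(response.as_bytes())
}
```

Because the listener is bound to `127.0.0.1`, it is reachable only from the same host or container, which is what keeps health checking off the public-facing proxy.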