Resolve OQ-08 through OQ-12 after reviewing implementation findings:
- OQ-08: Remove /health route from the main HTTPS listener entirely. Health checking belongs on port 9900 and admin socket only, not on the public-facing proxy. This eliminates upstream collision problems and special-case routing logic. (ADR-022)
- OQ-09: Not an architectural unknown; ADR-015 already decided on a separate connect timeout. The implementation gap is a known issue.
- OQ-10: Not an open question; acme_contact is already specified as required in config.md. The empty contact list is bug C2.
- OQ-11: Hardcoded is_https=true is correct for a TLS-terminating proxy. HTTP listener redirects, doesn't proxy. Just needs a comment.
- OQ-12: Access logging is already specified as mandatory/always-on in operations.md. Missing log_request! calls are bug W13.
Updated docs: proxy.md, operations.md, overview.md, config.md, open-questions.md, README.md, ADR-013. Created ADR-022.
ADR-013: Health Check on Separate Local Port
Status
Accepted
Context
The health check endpoint (/health) needs to be accessible for monitoring
without requiring TLS. Serving it on the main HTTPS listener would mean:
- TLS handshake must succeed for the health check to respond
- External monitoring tools need to handle TLS
- A TLS configuration error would make the health check unreachable, creating a false-negative monitoring signal
- The route collides with upstream applications that use /health for their own health checks (see ADR-022)
Three options were considered (see OQ-03):
- Separate unencrypted port on localhost (chosen): Simple, works with standard monitoring tools, health checks work even when TLS is misconfigured
- Main HTTPS listener only: Would require TLS for health checks, creating a circular dependency — TLS config errors would make health checks unreachable
- Admin port with its own listener: Most flexible but adds complexity beyond what's needed for a simple health check
Decision
Add a configurable health check port that binds to 127.0.0.1 only (localhost),
serving /health over plain HTTP. This is a separate listener from the main
HTTP and HTTPS listeners.
The port is configurable via health_check_port in StaticConfig. The default
value is 9900 (enabled, localhost only). Setting it to 0 disables the
health check listener entirely — there is no /health route on the main HTTPS
listener (see ADR-022).
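The decision above can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation: the function names (health_bind_addr, health_response, serve_health), the "OK" response body, and the 404 for other paths are all assumptions; only the 127.0.0.1 binding, the plain-HTTP /health route, and the "0 disables" rule come from this ADR.

```rust
use std::io::{Read, Write};
use std::net::{Ipv4Addr, SocketAddr, TcpListener, TcpStream};

/// Hypothetical helper: maps `health_check_port` to a bind address.
/// Port 0 means "disabled", so no address (and no listener) is produced.
pub fn health_bind_addr(health_check_port: u16) -> Option<SocketAddr> {
    if health_check_port == 0 {
        None
    } else {
        // Always loopback: the health port is never exposed externally.
        Some(SocketAddr::from((Ipv4Addr::LOCALHOST, health_check_port)))
    }
}

/// Builds a plain-HTTP response for a request line such as
/// "GET /health HTTP/1.1". The body and the 404 fallback are assumptions.
pub fn health_response(request_line: &str) -> &'static str {
    let mut parts = request_line.split_whitespace();
    match (parts.next(), parts.next()) {
        (Some("GET"), Some("/health")) => {
            "HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nOK"
        }
        _ => "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\nConnection: close\r\n\r\n",
    }
}

/// Handles one connection: read the request head, answer, close.
fn handle(mut stream: TcpStream) -> std::io::Result<()> {
    let mut buf = [0u8; 1024];
    let n = stream.read(&mut buf)?;
    let head = String::from_utf8_lossy(&buf[..n]);
    let request_line = head.lines().next().unwrap_or("");
    stream.write_all(health_response(request_line).as_bytes())
}

/// Serves /health until the process exits; returns at once when disabled.
pub fn serve_health(health_check_port: u16) -> std::io::Result<()> {
    let Some(addr) = health_bind_addr(health_check_port) else {
        return Ok(()); // disabled via health_check_port = 0
    };
    let listener = TcpListener::bind(addr)?;
    for stream in listener.incoming() {
        // A failed health connection should not take the listener down.
        let _ = stream.and_then(handle);
    }
    Ok(())
}
```

Keeping the address decision in its own function makes the "0 disables" rule testable without opening a socket. A real implementation would presumably run on the proxy's async runtime rather than a blocking loop.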
Rationale
- A local-only health check port is the standard pattern for reverse proxies and service meshes (Envoy, HAProxy, and Kubernetes health probes all use this pattern)
- Health checks should work even when TLS is misconfigured — that's the whole point of monitoring
- Binding to 127.0.0.1 only means the health check is not exposed to the internet; only local monitoring tools (systemd, scripts, load balancers on the same host) can reach it
- Configurable port allows different deployment scenarios (some monitoring runs on different ports)
- Disabling via health_check_port = 0 removes the health check entirely; the admin socket's status command remains available as an alternative health/status mechanism
- When this project is folded into alknet, the health check will use alknet's existing patterns, making the separate port unnecessary in that context
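From the monitoring side, a local tool needs nothing more than a plain TCP connection to the loopback port. The sketch below is illustrative only: probe_health and is_healthy_status_line are hypothetical names, and treating any 2xx status as healthy is an assumption, since this ADR does not specify the response format.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true when an HTTP status line carries a 2xx code,
/// e.g. "HTTP/1.1 200 OK". Treating all 2xx as healthy is an assumption.
pub fn is_healthy_status_line(status_line: &str) -> bool {
    status_line
        .split_whitespace()
        .nth(1)
        .and_then(|code| code.parse::<u16>().ok())
        .map_or(false, |code| (200..300).contains(&code))
}

/// Probes GET /health on 127.0.0.1:<port> over plain HTTP, without TLS.
/// A connection error (for example, the listener is disabled) is returned
/// as Err so the caller can tell "unreachable" apart from "unhealthy".
pub fn probe_health(port: u16, timeout: Duration) -> std::io::Result<bool> {
    let addr = SocketAddr::from(([127, 0, 0, 1], port));
    let mut stream = TcpStream::connect_timeout(&addr, timeout)?;
    stream.set_read_timeout(Some(timeout))?;
    stream.write_all(
        b"GET /health HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n",
    )?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(is_healthy_status_line(response.lines().next().unwrap_or("")))
}
```

A caller might run probe_health(9900, Duration::from_secs(1)) from a watchdog script. Because no TLS is involved, the probe still works when the certificate configuration is broken, which is the property this ADR is after.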
Consequences
Positive:
- Health checks work even when TLS is misconfigured
- Standard pattern that monitoring tools expect
- Not exposed to the internet (localhost only)
- Configurable — can be disabled if not needed
- systemd can use it for NotifyAccess readiness checks
Negative:
- Additional listener to manage (minimal complexity)
References
- operations.md
- ADR-022: Health check scope (no /health on main listener)
- OQ-03 (now resolved)