
glm-5.1 fe1ae6c05e Resolve all open questions, remove /health from main listener (ADR-022)

Resolve OQ-08 through OQ-12 after reviewing implementation findings:

- OQ-08: Remove /health route from the main HTTPS listener entirely.
  Health checking belongs on port 9900 and admin socket only, not on
  the public-facing proxy. This eliminates upstream collision problems
  and special-case routing logic. (ADR-022)

- OQ-09: Not an architectural unknown — ADR-015 already decided on a
  separate connect timeout. The implementation gap is a known issue.

- OQ-10: Not an open question — acme_contact is already specified as
  required in config.md. The empty contact list is bug C2.

- OQ-11: Hardcoded is_https=true is correct for a TLS-terminating
  proxy. HTTP listener redirects, doesn't proxy. Just needs a comment.

- OQ-12: Access logging is already specified as mandatory/always-on in
  operations.md. Missing log_request! calls are bug W13.

Updated docs: proxy.md, operations.md, overview.md, config.md,
open-questions.md, README.md, ADR-013. Created ADR-022.

2026-06-12 03:39:52 +00:00


ADR-013: Health Check on Separate Local Port

Status

Accepted

Context

The health check endpoint (/health) needs to be accessible for monitoring without requiring TLS. Serving it on the main HTTPS listener would mean:

TLS handshake must succeed for the health check to respond
External monitoring tools need to handle TLS
A TLS configuration error would make the health check unreachable, creating a false-negative monitoring signal
The proxy's own /health route collides with upstream applications that use /health for their own health checks (see ADR-022)

Three options were considered (see OQ-03):

Separate unencrypted port on localhost (chosen): Simple, works with standard monitoring tools, health checks work even when TLS is misconfigured
Main HTTPS listener only: Would require TLS for health checks, creating a circular dependency — TLS config errors would make health checks unreachable
Admin port with its own listener: Most flexible but adds complexity beyond what's needed for a simple health check

Decision

Add a configurable health check port that binds to 127.0.0.1 only (localhost), serving /health over plain HTTP. This is a separate listener from the main HTTP and HTTPS listeners.

The port is configurable via health_check_port in StaticConfig. The default value is 9900 (enabled, localhost only). Setting it to 0 disables the health check listener entirely. There is no fallback in that case, because the main HTTPS listener has no /health route (see ADR-022).
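The decision above can be sketched in Rust using only the standard library. This is an illustration under assumptions, not the project's implementation: the StaticConfig struct shown here is a hypothetical stand-in that carries only health_check_port, the function names are invented for the example, and the "ok" response body and the 404 for other paths are assumed because this ADR does not specify them.

```rust
use std::io::{Read, Write};
use std::net::{Ipv4Addr, SocketAddr, TcpListener};

/// Hypothetical stand-in for the relevant slice of StaticConfig.
struct StaticConfig {
    health_check_port: u16,
}

impl Default for StaticConfig {
    fn default() -> Self {
        // Default from this ADR: enabled on port 9900.
        Self { health_check_port: 9900 }
    }
}

/// Returns the loopback address to bind, or None when the port is 0 (disabled).
fn health_bind_addr(cfg: &StaticConfig) -> Option<SocketAddr> {
    match cfg.health_check_port {
        0 => None,
        port => Some(SocketAddr::from((Ipv4Addr::LOCALHOST, port))),
    }
}

/// Builds a plain-HTTP response for a request line.
/// The body and the 404 behaviour are assumptions made for this sketch.
fn health_response(request_line: &str) -> &'static str {
    let mut parts = request_line.split_whitespace();
    match (parts.next(), parts.next()) {
        (Some("GET"), Some("/health")) => {
            "HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok"
        }
        _ => "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\nConnection: close\r\n\r\n",
    }
}

/// Serves /health over plain HTTP on 127.0.0.1 only.
/// Returns immediately when the listener is disabled. A real listener would
/// handle per-connection errors without exiting; this sketch propagates them.
fn run_health_listener(cfg: &StaticConfig) -> std::io::Result<()> {
    let addr = match health_bind_addr(cfg) {
        Some(addr) => addr,
        None => return Ok(()), // health_check_port = 0: no listener at all
    };
    let listener = TcpListener::bind(addr)?;
    for stream in listener.incoming() {
        let mut stream = stream?;
        let mut buf = [0u8; 1024];
        let n = stream.read(&mut buf)?;
        let request = String::from_utf8_lossy(&buf[..n]);
        let first_line = request.lines().next().unwrap_or("");
        stream.write_all(health_response(first_line).as_bytes())?;
    }
    Ok(())
}
```

Keeping the address computation in its own function makes the "0 means disabled" rule and the loopback-only binding easy to test without opening a socket.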

Rationale

A local-only health check port is a standard pattern for reverse proxies and service meshes (Envoy, HAProxy, and Kubernetes health probes all follow it)
Health checks should work even when TLS is misconfigured — that's the whole point of monitoring
Binding to 127.0.0.1 only means the health check is not exposed to the internet — only local monitoring tools (systemd, scripts, load balancers on the same host) can reach it
Configurable port allows different deployment scenarios (some monitoring runs on different ports)
Disabling via health_check_port = 0 removes the health check entirely — the admin socket's status command remains available as an alternative health/status mechanism
When this project is folded into alknet, the health check will use alknet's existing patterns, making the separate port unnecessary in that context
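As an illustration of how a local monitoring tool might use the port, the following Rust sketch probes /health over plain HTTP with a timeout. It assumes only what this ADR states (a plain-HTTP /health route on a loopback port). The function names are hypothetical, and treating any 2xx status as healthy is an assumption of the example.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if the HTTP status line carries a 2xx status code.
fn status_is_healthy(status_line: &str) -> bool {
    status_line
        .split_whitespace()
        .nth(1)
        .map_or(false, |code| code.len() == 3 && code.starts_with('2'))
}

/// Connects to the health port without TLS, requests /health, and reports
/// whether the response status is 2xx. Connection errors and timeouts are
/// returned as Err so the caller can tell "unreachable" from "unhealthy".
fn probe_health(addr: SocketAddr, timeout: Duration) -> std::io::Result<bool> {
    let mut stream = TcpStream::connect_timeout(&addr, timeout)?;
    stream.set_read_timeout(Some(timeout))?;
    stream.set_write_timeout(Some(timeout))?;
    stream.write_all(b"GET /health HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")?;

    // Only the status line is needed to decide health.
    let mut status_line = String::new();
    BufReader::new(stream).read_line(&mut status_line)?;
    Ok(status_is_healthy(&status_line))
}
```

With the default configuration, a caller would pass 127.0.0.1:9900 as the address. Because the probe needs no TLS setup, it keeps working when the proxy's certificate configuration is broken.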

Consequences

Positive:

Health checks work even when TLS is misconfigured
Standard pattern that monitoring tools expect
Not exposed to the internet (localhost only)
Configurable — can be disabled if not needed
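Scripts launched by systemd (for example, from ExecStartPost=) can poll it as a readiness check; NotifyAccess= itself only governs who may send sd_notify messages, so systemd does not query this port on its own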

Negative:

Additional listener to manage (minimal complexity)

References

operations.md
ADR-022 — Health check scope (no /health on main listener)
OQ-03 (now resolved)
