
glm-5.1 9a2352e61c Resolve 5 open questions, add 7 ADRs for previously undocumented decisions

Resolve open questions:
- OQ-01: Restrict cipher suites to match nginx scope (4 ECDHE-AES-GCM
  suites for TLS 1.2 + all TLS 1.3 suites) — ADR-012
- OQ-03: Health check on separate local port (default 9900, localhost
  only) — ADR-013
- OQ-04: Add Unix domain socket admin API for config reload alongside
  SIGHUP, with structured success/failure responses — ADR-014
- OQ-06: Per-site upstream timeouts with defaults (5s connect, 60s
  request), overridable in SiteConfig — ADR-015

Document previously undocumented decisions flagged by architecture review:
- ADR-016: Explicit bind address requirement (reject 0.0.0.0)
- ADR-017: Upstream connection defaults (HTTP/1.1, no redirects, pooling)
- ADR-018: 100 MB body size limit (matches nginx, Gitea compatibility)

OQ-07 (per-site TLS overrides) remains open for future consideration.

Spec updates:
- config.md: add health_check_port, admin_socket_path, per-site timeout
  fields, update TOML example and validation rules
- proxy.md: reference ADR-015/017/018 for timeouts, connection defaults,
  and body limit decisions
- tls.md: replace OQ-01 cipher suite section with ADR-012 decision
- operations.md: add local health check port section, admin socket reload
- overview.md: update Phase 1 scope with new features, add ADR references
- open-questions.md: resolve OQ-01/03/04/06, keep OQ-07 open

2026-06-11 09:07:36 +00:00


ADR-013: Health Check on Separate Local Port

Status

Accepted

Context

The health check endpoint (/health) needs to be accessible for monitoring without requiring TLS. Currently the design places it on the main HTTPS listener, which means:

TLS handshake must succeed for the health check to respond
External monitoring tools need to handle TLS
A TLS configuration error would make the health check unreachable, so monitoring could not tell a TLS misconfiguration apart from a proxy that is down

Three options were considered (see OQ-03):

Main HTTPS listener only: Simplest, but TLS config errors make health checks unreachable
Separate unencrypted port on localhost: Simple, works with standard monitoring tools, but health checks bypass TLS
Admin port with its own listener: Most flexible but adds complexity

Decision

Add a configurable health check port that binds to 127.0.0.1 only (localhost), serving /health over plain HTTP. This is a separate listener from the main HTTP and HTTPS listeners.

The port is configurable via health_check_port in StaticConfig. Setting it to 0 (default) disables the separate health check listener, and /health remains available on the main HTTPS listener as a fallback.
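The behaviour described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the `StaticConfig` dataclass here is a hypothetical stand-in that models only `health_check_port`, and the `ok` response body, the handler class, and the `start_health_listener` helper are assumptions made for the example.

```python
from dataclasses import dataclass
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Optional
import threading


@dataclass
class StaticConfig:
    # Hypothetical stand-in for the real StaticConfig; only the field named
    # in this ADR is modelled. 0 means "no separate health check listener".
    health_check_port: int = 0


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path == "/health":
            body = b"ok\n"  # assumed body; the ADR does not specify one
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format: str, *args) -> None:
        pass  # keep the sketch quiet; a real listener would log


def start_health_listener(config: StaticConfig) -> Optional[ThreadingHTTPServer]:
    """Start the plain-HTTP health listener, or return None when disabled."""
    if config.health_check_port == 0:
        # Disabled: /health stays on the main HTTPS listener only.
        return None
    # Bind to loopback only so the endpoint is never reachable from outside.
    server = ThreadingHTTPServer(
        ("127.0.0.1", config.health_check_port), HealthHandler
    )
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With `health_check_port = 0` the helper returns `None` and no extra socket is opened; with any other value it serves `/health` on `127.0.0.1` only.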

Rationale

A local-only health check port is a common pattern for reverse proxies and service meshes (Envoy, HAProxy, and Kubernetes health probes follow similar approaches)
Health checks should work even when TLS is misconfigured, because detecting that kind of failure is exactly what monitoring is for
Binding to 127.0.0.1 only means the health check is not exposed to the internet — only local monitoring tools (systemd, scripts, load balancers on the same host) can reach it
A configurable port accommodates different deployment scenarios, for example hosts where monitoring expects a specific port or the chosen port is already in use
Disabling via health_check_port = 0 keeps the main HTTPS /health endpoint available for cases where a separate port isn't needed
When this project is folded into alknet, the health check will use alknet's existing patterns, making the separate port unnecessary in that context

Consequences

Positive:

Health checks work even when TLS is misconfigured
Standard pattern that monitoring tools expect
Not exposed to the internet (localhost only)
Configurable — can be disabled if not needed
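Scripts run from systemd units (for example an ExecStartPost= check) can probe it for readiness; systemd itself does not poll HTTP endpoints, and NotifyAccess= only controls which processes may send sd_notify messages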

Negative:

Additional listener to manage (minimal complexity)
Two health check endpoints exist when the separate port is enabled (the local one and the HTTPS one) — monitoring should prefer the local one
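A monitoring script can encode the "prefer the local endpoint" rule directly. The sketch below is illustrative only: `choose_health_target` and `probe_local_health` are hypothetical helper names, the host name in the usage note is a placeholder, and port 9900 is simply the example value mentioned in the commit message.

```python
import http.client


def choose_health_target(health_check_port: int, https_host: str) -> str:
    """Pick the URL a monitor should probe.

    Prefers the local plain-HTTP listener when it is enabled and falls
    back to /health on the main HTTPS listener when the port is 0.
    """
    if health_check_port != 0:
        return f"http://127.0.0.1:{health_check_port}/health"
    return f"https://{https_host}/health"


def probe_local_health(port: int, timeout: float = 2.0) -> bool:
    """Return True if GET /health on 127.0.0.1:port answers with HTTP 200."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=timeout)
    try:
        conn.request("GET", "/health")
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed reply all count as down.
        return False
    finally:
        conn.close()
```

For example, `choose_health_target(9900, "proxy.example.test")` yields the loopback URL, while passing `0` yields the HTTPS fallback on the given host.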

References

operations.md
OQ-03 (now resolved)
