Resolve OQ-08 through OQ-12 after reviewing implementation findings:

- OQ-08: Remove /health route from the main HTTPS listener entirely. Health checking belongs on port 9900 and admin socket only, not on the public-facing proxy. This eliminates upstream collision problems and special-case routing logic. (ADR-022)
- OQ-09: Not an architectural unknown: ADR-015 already decided on a separate connect timeout. The implementation gap is a known issue.
- OQ-10: Not an open question: acme_contact is already specified as required in config.md. The empty contact list is bug C2.
- OQ-11: Hardcoded is_https=true is correct for a TLS-terminating proxy. The HTTP listener redirects, doesn't proxy. Just needs a comment.
- OQ-12: Access logging is already specified as mandatory/always-on in operations.md. Missing log_request! calls are bug W13.

Updated docs: proxy.md, operations.md, overview.md, config.md, open-questions.md, README.md, ADR-013. Created ADR-022.

# ADR-022: Health Check Scope — Local Port and Admin Socket Only

## Status

Accepted

## Context

The implementation served a `GET /health` route on the main HTTPS listener that
returned 200 OK regardless of the Host header. This route was evaluated before
host-based routing, meaning any upstream application using `/health` for its own
health checks would have those requests silently intercepted by the proxy and
never reach the upstream (implementation review finding W5).

The architecture already specified a separate local health check port (9900,
bound to 127.0.0.1 only) via ADR-013. The question was whether to keep the
main-listener `/health` route alongside the dedicated port (and possibly make
the path configurable), or to remove it entirely.

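The interception described in finding W5 can be illustrated with a minimal Rust sketch. The `Route` enum, both function names, and the `sites` map (Host header to upstream address) are hypothetical and only model the ordering problem; they are not taken from the actual implementation.

```rust
use std::collections::HashMap;

/// Where a request ends up. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum Route {
    /// Answered by the proxy itself with 200 OK.
    ProxyHealth,
    /// Forwarded to the upstream registered for the Host header.
    Upstream(String),
    /// No configured site matches the Host header.
    NotFound,
}

/// Host-based routing only: the Host header alone decides the target.
fn route_by_host(sites: &HashMap<String, String>, host: &str) -> Route {
    match sites.get(host) {
        Some(upstream) => Route::Upstream(upstream.clone()),
        None => Route::NotFound,
    }
}

/// Models the behavior reported in W5: `GET /health` is matched before
/// the Host lookup, so the upstream never sees its own health checks.
fn route_with_health_shortcut(
    sites: &HashMap<String, String>,
    method: &str,
    host: &str,
    path: &str,
) -> Route {
    if method == "GET" && path == "/health" {
        return Route::ProxyHealth;
    }
    route_by_host(sites, host)
}
```

With a site registered for `app.example.com`, `route_with_health_shortcut` returns `Route::ProxyHealth` for `GET /health`, while `route_by_host` sends the same request to that site's upstream.
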
## Decision

The main HTTPS listener does **not** serve a `/health` route. Health checking is
handled exclusively by:

1. **Local health check port** (default: 9900, bound to `127.0.0.1`): serves
   `GET /health → 200 OK`. This is the primary health check mechanism for
   container orchestration, load balancers, and monitoring systems.
2. **Admin socket** (`status` command): returns process information including
   uptime and site count.

The `/health` route is removed from the main listener entirely. No configurable
path is needed because the route simply does not exist on the public listener.

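A loopback-only health listener of the kind described above might look roughly like the following sketch, which uses only the Rust standard library. The function names, the empty response body, and the 404 for every other request are assumptions made for illustration; the real listener may be built on a different HTTP stack.

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};

/// Pick the raw HTTP/1.1 response for a request line.
/// Only `GET /health` yields 200; everything else yields 404 (assumption).
fn health_response(request_line: &str) -> &'static str {
    let mut parts = request_line.split_whitespace();
    match (parts.next(), parts.next()) {
        (Some("GET"), Some("/health")) => {
            "HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n"
        }
        _ => "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\nConnection: close\r\n\r\n",
    }
}

/// Read the start of one request and answer it.
fn handle(mut stream: TcpStream) -> std::io::Result<()> {
    let mut buf = [0u8; 1024];
    let n = stream.read(&mut buf)?;
    let request = String::from_utf8_lossy(&buf[..n]);
    let request_line = request.lines().next().unwrap_or("");
    stream.write_all(health_response(request_line).as_bytes())
}

/// Bind to 127.0.0.1 only, so the port is not reachable from other hosts.
fn serve_health(port: u16) -> std::io::Result<()> {
    let listener = TcpListener::bind(("127.0.0.1", port))?;
    for stream in listener.incoming() {
        // A failed connection must not take the health listener down.
        let _ = stream.and_then(handle);
    }
    Ok(())
}
```

Calling `serve_health(9900)` on a dedicated thread would match the default port named above; a local check such as `curl http://127.0.0.1:9900/health` should then receive a 200.
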
## Consequences

**Positive:**

- No collision with upstream applications that use `/health` for their own
  health checks
- The main listener's routing logic is simpler: all requests go through
  host-based routing, no special cases
- Clear separation of concerns: the main listener proxies, the local port
  answers health checks
- No configurable path needed, since the problem disappears entirely

**Negative:**

- External monitoring that needs to verify TLS is working must connect to the
  HTTPS port directly and check for a successful TLS handshake or a 404
  response, rather than getting a 200 from `/health`. This is a minor
  inconvenience: any successful TLS response (even 404) confirms the proxy is
  serving TLS correctly.

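The rule an external monitor would apply under this decision can be written down as a small Rust sketch. `ProbeOutcome` and `tls_is_up` are hypothetical names; performing the actual TLS handshake requires a TLS library and is deliberately left out.

```rust
/// Outcome of one external probe against the public HTTPS port.
/// Hypothetical type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ProbeOutcome {
    /// TCP connect or TLS handshake failed.
    HandshakeFailed,
    /// Handshake completed; no HTTP request was sent.
    HandshakeOk,
    /// Handshake completed and the proxy answered with this status code.
    HttpStatus(u16),
}

/// Any response delivered over a completed handshake, including 404,
/// counts as evidence that the proxy is serving TLS.
fn tls_is_up(outcome: ProbeOutcome) -> bool {
    match outcome {
        ProbeOutcome::HandshakeFailed => false,
        ProbeOutcome::HandshakeOk | ProbeOutcome::HttpStatus(_) => true,
    }
}
```

A monitor configured this way treats a 404 from an unknown Host as healthy and alerts only when the handshake itself fails.
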
## References

- ADR-013: Health check on separate local port
- OQ-08: Resolved by this ADR
- Implementation review finding W5 (hardcoded `/health` path)