Resolve OQ-08 through OQ-12 after reviewing implementation findings:
- OQ-08: Remove /health route from the main HTTPS listener entirely.
Health checking belongs on port 9900 and admin socket only, not on
the public-facing proxy. This eliminates upstream collision problems
and special-case routing logic. (ADR-022)
- OQ-09: Not an architectural unknown — ADR-015 already decided on a
separate connect timeout. The implementation gap is a known issue.
- OQ-10: Not an open question — acme_contact is already specified as
required in config.md. The empty contact list is bug C2.
- OQ-11: Hardcoded is_https=true is correct for a TLS-terminating
proxy. HTTP listener redirects, doesn't proxy. Just needs a comment.
- OQ-12: Access logging is already specified as mandatory/always-on in
operations.md. Missing log_request! calls are bug W13.
Updated docs: proxy.md, operations.md, overview.md, config.md,
open-questions.md, README.md, ADR-013. Created ADR-022.
Analyzed 29 findings from the implementation review (002-implementation-review.md)
and identified 8 architecture-level concerns requiring spec changes:
Architecture gaps addressed:
- C2: Added acme_contact field to config.md, tls.md, and operations.md.
Let's Encrypt requires a contact email for production; the spec was missing
this required field.
- C4: Added StaticConfig drift tracking requirement to config.md reload
section. ConfigReloadHandle must update its stored StaticConfig after each
successful reload to prevent stale warnings.
- W1: Updated shutdown sequence in operations.md to specify that server tasks
should be joined (not aborted) during the drain window.
- W5: Added health check path collision note to proxy.md.
- W13: Clarified that access logging is always-on in operations.md.
- W14: Updated X-Forwarded-Proto description in proxy.md to clarify that it
is always 'https' since the HTTP listener redirects rather than proxies.
New open questions added:
- OQ-08: Should /health use a less common path to avoid upstream collision?
- OQ-09: How should upstream_connect_timeout_secs be enforced?
- OQ-10: Should ACME contact email be a required config field?
- OQ-11: How should X-Forwarded-Proto be derived per-listener?
- OQ-12: Should request access logging be mandatory or optional?
The remaining 21 findings are implementation-level bugs, code quality issues,
or Phase 2 improvements that don't require architecture spec changes.
W13: No request access logging - log_request! macro defined but never called
W14: is_https hardcoded to true on ProxyState - X-Forwarded-Proto always https
S9: Rate limiting silently bypassed when no client IP found
S10: Integration test TOML has [[listeners.listeners.sites]] typo
S11: No Server response header added by proxy (upstream's is stripped)
Comprehensive pre-implementation review of all architecture specs (overview,
proxy, tls, config, operations, 20 ADRs, open questions). Findings cover:
- Site routing model contradiction (per-listener vs global)
- X-Forwarded-For security model (edge proxy should replace, not append)
- Missing hop-by-hop header handling rules
- Undefined ACME failure behavior at startup/renewal
- Unspecified startup sequence and partial failure semantics
- Ambiguous per-listener vs shared router architecture
- Rate limiter state behavior on config reload
Plus warnings about admin socket protocol, Host header port handling,
port validation gaps, upstream format validation, TLS error handling,
shutdown draining, error response bodies, reload race conditions, and more.
- ADR-020: Document defense-in-depth rationale for running in a minimal
Docker container (memory-safe language + container isolation), flexible
upstream addressing (Docker DNS, loopback, LAN, tunnel endpoints),
file-primary logging for fail2ban, and volume mount strategy
- ADR-016: Add allow_wildcard_bind override for container deployments
where 0.0.0.0 is correct inside the container network namespace
- operations.md: Add container deployment section with Docker Compose
example, networking table, volume mounts, and health check integration;
flip logging to file-primary for fail2ban reliability; note systemd as
alternative to container deployment
- config.md: Restructure logging fields into nested LoggingConfig (matching
TOML [logging] section), add allow_wildcard_bind, shutdown_timeout_secs,
and log_file_path fields; clarify upstream addressing supports Docker
DNS and tunnel endpoints; update validation rule for 0.0.0.0 override
- overview.md: Update architecture diagram for container model with Docker
networking and volume mounts; add ADR-020 reference
- proxy.md: Clarify X-Forwarded-Proto is determined by listener port, not
hardcoded 80/443
- ADR-013: Fix health_check_port default contradiction (default is 9900,
not 0/disabled as previously stated)
Introduce [[listeners]] configuration to support both dedicated-IP
(1 IP = 1 cert = 1 domain) and shared-IP (SAN certificate) deployment
models. Each listener is an independent TLS endpoint with its own bind
address, TLS config, and site routing. OQ-07 is now resolved.
Changes:
- Add ADR-019 for multi-config listener support
- Update config format from [server] to [[listeners]] entries
- Update tls.md for per-listener TLS and certificate provisioning
- Update overview.md architecture diagram and scope
- Update proxy.md for per-listener HTTP redirect
- Fix stale references in ADR-010, ADR-011, ADR-016
- Update OQ-05 resolution (per-listener bind_addr supersedes)
- Add unique-host rationale to config validation rules
- Architecture review: fix all 3 critical and 6 warning issues
Promote multi-site support from Phase 2 to Phase 1 (ADR-010): the proxy
must support git.alk.dev and alk.dev from initial release. Add multi-domain
TLS configuration (ADR-011): acme_domains array replaces acme_domain string,
single SAN certificate via rustls-acme.
Key changes:
- ADR-010: Multi-site in Phase 1 — avoids config format migration later
- ADR-011: Multi-domain TLS — single SAN cert, acme_domains Vec<String>
- ADR-002: Updated rationale for multi-site (one upstream per domain)
- overview.md: Phase 1 now includes multi-site, alk.dev pass-through,
dual licensing (MIT OR Apache-2.0), real IP removed
- config.md: acme_domain → acme_domains, TOML example shows both sites,
validation adds unique host check, real IP replaced with 203.0.113.10
- tls.md: Multi-domain SNI section moved from Future to current, manual
mode uses ResolvesServerCert for SNI mapping, TOML header fixed
- proxy.md: Updated for multi-site, removed single-domain language
- operations.md: RFC 5737 documentation IPs, clarified rate limit eviction
semantics (distinct scan interval vs eviction age)
- open-questions.md: OQ-05 resolved (single bind_addr sufficient), new
OQ-07 (per-site TLS overrides)
Review fixes:
- acme_domains (plural) consistently used across all docs and diagram
- ADR-011 clearly scopes acme_domain as previous design
- Inline decision rationale extracted: tls.md hot-reload → ADR-004 ref,
config.md static/dynamic → ADR-008 ref
- TOML section headers consistent (server.tls)