Triage implementation review findings and update architecture specs

Analyzed 29 findings from the implementation review (002-implementation-review.md)
and identified 8 architecture-level concerns requiring spec changes:

Architecture gaps addressed:
- C2: Added acme_contact field to config.md, tls.md, and operations.md.
  Let's Encrypt requires a contact email for production; the spec was missing
  this required field.
- C4: Added StaticConfig drift tracking requirement to config.md reload
  section. ConfigReloadHandle must update its stored StaticConfig after each
  successful reload to prevent stale warnings.
- W1: Updated shutdown sequence in operations.md to specify that server tasks
  should be joined (not aborted) during the drain window.
- W5: Added health check path collision note to proxy.md.
- W13: Clarified that access logging is always-on in operations.md.
- W14: Updated X-Forwarded-Proto description in proxy.md to clarify that it
  is always 'https' since the HTTP listener redirects rather than proxies.
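
One reading of the C4 requirement, sketched with the standard library only. The field names in `StaticConfig`, the `baseline` member, and the `record_successful_reload` method are hypothetical and do not come from config.md; only the type names `StaticConfig` and `ConfigReloadHandle` appear in the spec. The sketch assumes that "stale warnings" means the same drift being reported again on every later reload.

```rust
use std::sync::Mutex;

/// Hypothetical subset of the restart-only settings (illustrative fields).
#[derive(Clone, Debug, PartialEq)]
pub struct StaticConfig {
    pub bind_addr: String,
    pub acme_cache_dir: String,
}

/// Remembers the static settings seen at the last successful reload.
pub struct ConfigReloadHandle {
    baseline: Mutex<StaticConfig>,
}

impl ConfigReloadHandle {
    pub fn new(initial: StaticConfig) -> Self {
        Self { baseline: Mutex::new(initial) }
    }

    /// Call once a reload has succeeded. Returns one warning per static
    /// field that differs from the stored baseline, then stores `new` as
    /// the baseline so the same drift is not reported on the next reload.
    pub fn record_successful_reload(&self, new: StaticConfig) -> Vec<String> {
        let mut baseline = self.baseline.lock().unwrap();
        let mut warnings = Vec::new();
        if baseline.bind_addr != new.bind_addr {
            warnings.push(format!(
                "bind_addr changed to {}; restart required",
                new.bind_addr
            ));
        }
        if baseline.acme_cache_dir != new.acme_cache_dir {
            warnings.push(format!(
                "acme_cache_dir changed to {}; restart required",
                new.acme_cache_dir
            ));
        }
        // The step C4 asks for: without this assignment every later reload
        // would compare against the startup values and warn again.
        *baseline = new;
        warnings
    }
}
```

Under these assumptions, a second reload of an unchanged file produces no warnings, because the comparison runs against the values stored by the previous reload.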

New open questions added:
- OQ-08: Should /health use a less common path to avoid upstream collision?
- OQ-09: How should upstream_connect_timeout_secs be enforced?
- OQ-10: Should ACME contact email be a required config field?
- OQ-11: How should X-Forwarded-Proto be derived per-listener?
- OQ-12: Should request access logging be mandatory or optional?

The remaining 21 findings are implementation-level bugs, code quality issues,
or Phase 2 improvements that don't require architecture spec changes.
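
The W1 change (join rather than abort during the drain window) can be illustrated with a std-thread analogy. This is a sketch only: the spec concerns tokio tasks, while the code below uses `std::thread` so that it stays self-contained, and `drain_workers` with its parameters is a hypothetical helper rather than anything in operations.md.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::{Duration, Instant};

/// Asks `workers` threads to stop, then waits up to `drain` for them to
/// finish their remaining work. Returns how many finished in the window.
pub fn drain_workers(workers: usize, work: Duration, drain: Duration) -> usize {
    let shutdown = Arc::new(AtomicBool::new(false));
    let (done_tx, done_rx) = mpsc::channel::<usize>();

    for id in 0..workers {
        let shutdown = Arc::clone(&shutdown);
        let done_tx = done_tx.clone();
        thread::spawn(move || {
            // Stand-in for a server loop: idle until shutdown is requested.
            while !shutdown.load(Ordering::Acquire) {
                thread::sleep(Duration::from_millis(1));
            }
            // Stand-in for completing in-flight requests.
            thread::sleep(work);
            // Report completion; ignore the error if nobody is listening.
            let _ = done_tx.send(id);
        });
    }
    // Drop the original sender so only the workers hold one.
    drop(done_tx);

    // Request shutdown, then wait for completions until the deadline.
    shutdown.store(true, Ordering::Release);
    let deadline = Instant::now() + drain;
    let mut finished = 0;
    while finished < workers {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match done_rx.recv_timeout(remaining) {
            Ok(_) => finished += 1,
            // Timed out (drain window over) or a worker exited abnormally.
            Err(_) => break,
        }
    }
    finished
}
```

The point of the sketch is the ordering: shutdown is signalled first, and the workers are given the full drain window to finish before the caller gives up on them, instead of being cancelled as soon as shutdown begins.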
2026-06-11 15:04:09 +00:00
parent 5478df7ab7
commit 68d27c4789
6 changed files with 135 additions and 10 deletions

@@ -77,16 +77,19 @@ no deploy hooks.
listed in that listener's `acme_domains`. Let's Encrypt will issue a
certificate covering those domains (a single SAN certificate or a
single-domain certificate, depending on how many domains are listed).
-3. The ACME state machine runs as a background tokio task per listener,
+3. The `acme_contact` field provides a contact email address (as a `mailto:`
+URI) required by Let's Encrypt for production certificate requests. Without
+a contact email, Let's Encrypt production API returns a 400-level error.
+4. The ACME state machine runs as a background tokio task per listener,
handling:
- Account registration with Let's Encrypt
- Certificate ordering
- TLS-ALPN-01 challenge (or HTTP-01 challenge)
- Certificate issuance
- Certificate renewal (automatic, ~30 days before expiry)
-4. `ResolvesServerCertAcme` is a rustls `ResolvesServerCert` implementation
+5. `ResolvesServerCertAcme` is a rustls `ResolvesServerCert` implementation
that automatically serves the ACME-provisioned certificate.
-5. When a new certificate is issued, the resolver updates atomically — no
+6. When a new certificate is issued, the resolver updates atomically — no
restart or signal handling needed.
**Configuration (within a `[[listeners]]` entry):**
@@ -100,6 +103,7 @@ mode = "acme"
acme_domains = ["git.alk.dev", "alk.dev"]
acme_cache_dir = "/var/lib/reverse-proxy/acme-cache"
acme_directory = "production" # or "staging" for testing
+acme_contact = "mailto:admin@alk.dev" # Required for Let's Encrypt production
```
**Cache directory:** The `DirCache` from rustls-acme persists ACME account data,